Channel: Postgres Weekly

How to make those EXPLAIN ANALYZE plans more readable

#359 — June 10, 2020

Postgres Weekly

10 Common Postgres Errors— Some quick fire common errors and warnings to watch out for, with symptoms and solutions, around things like memory, disk space, and permissions.

Ibrar Ahmed (Percona)

A Tool to Make EXPLAIN ANALYZE Plans More Readable— A long standing (over 11 years!) Web-based tool that lets you paste in the result of EXPLAIN ANALYZE queries and see a more easily understandable version. Why are we linking it again? It’s had a bunch of updates.

Hubert Lubaczewski, Łukasz Lewandowski

Whitepaper: Business Case for Professional Support— Learn the importance of Professional Support for your mission-critical PostgreSQL systems & how it can benefit your company. It increases database performance, helps scale, distributes data, reduces costs, saves you from known pitfalls, and more.

2ndQuadrant Services sponsor

PostgreSQL Load Balancing with HAProxy— I’m a huge fan of the haproxy TCP and HTTP proxy/load balancer, but admittedly haven’t used it for Postgres before. This post demonstrates using HAProxy with Severalnines’ ClusterControl to load balance Postgres in much the same way as you might an HTTP server.

Severalnines

Multi-Master Replication Solutions for Postgres— Horizontally scaling Postgres has been enough of a challenge over the years that entire companies (e.g. Citus Data) have been founded to make it easier. But there are various ways to do multi-master replication nonetheless, and this post links to several approaches, whether open or closed source, free or paid.

Ibrar Ahmed (Percona)

Deduplication in Postgres 13 B-Tree Indexes— PostgreSQL v13 introduces deduplication of entries for B-tree indexes. This article describes the new feature and demonstrates it with a simple test.

Laurenz Albe

A Step-by-Step Way to Backup a Heroku Postgres Database to an AWS S3 Bucket— Another variation on this that’s worth doing IMO is turning on S3 versioning so you can easily get rolling backups and keep a consistent filename.

Paweł Urbanek

Locating the Postgres Configuration File - SHOW config_file is the key here, but it might take a little more work if using Docker, say.

Luca Ferrari

What Are Failover Slots?

Craig Ringer

Visualize Postgres Performance In Real-Time With Datadog— Datadog’s Postgres OOTB dashboard visualizes data on latency, errors, read/write capacity, and throttled requests in a single pane of glass.

Datadog sponsor

pgsql-http: An HTTP Client Extension for Postgres— If you need to make HTTP requests direct from Postgres, this is one way to go and links against libcurl.

Paul Ramsey

PgHero 2.5: A Performance Dashboard for Postgres— Built in Ruby. And, yes, we have a newsletter for that. 😁

Andrew Kane

SQLancer: A Tool for Detecting Logic Bugs in Database Systems— Considered a ‘research prototype’ for now, SQLancer’s job is to stress a database system into returning inconsistent or illogical results. Written in Java, it supports several databases (including Postgres).

Manuel Rigger and Zhendong Su

💡 Tip of the Week

GROUPING SETS

Grouping sets let you perform grouping within a query that's more complex than a simple GROUP BY column can do.

Given a table like this:

  name  | dept | location | salary
--------+------+----------+--------
 Abbey  | IT   | London   |  85000
 Paul   | IT   | Madrid   |  74000
 Clancy | HR   | London   |  71000
 Imani  | HR   | Madrid   |  74000

We could easily get the total salaries for each 'department' with a query like:

SELECT dept, SUM(salary) FROM staff GROUP BY dept;

But what about if we want the total salaries in various ways within the same query? For example, let's say we want the total salaries for everyone, the total salaries for each department, and the total salaries for each location, all within the same query. Can it be done?

Enter 'grouping sets', which let you define multiple sets of grouping criteria, each of which will be performed separately and appended to the result set.

For example, let's group the results by department, then by location, and finally by the empty set (i.e. the total):

SELECT dept, location, SUM(salary) FROM staff
GROUP BY GROUPING SETS ((dept), (location), ());

This gives us a set of results like so:

 dept | location |  sum   
------+----------+--------
      |          | 304000
 IT   |          | 159000
 HR   |          | 145000
      | London   | 156000
      | Madrid   | 148000

Now we can easily see from a single set of results that the staff in our example are paid more in London, paid more in the IT department, and the overall wage bill is 304000.
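
One wrinkle with output like this is that the blank cells are NULLs, so a subtotal row can be hard to tell apart from a row whose column genuinely holds NULL. Here is a minimal sketch of one way to label the rows using the GROUPING() function. The table definition is an assumption reconstructed from the sample rows shown earlier, and the grouped_by label is purely illustrative.

```sql
-- Assumed definition of the staff table, matching the sample rows in the tip.
CREATE TABLE staff (name text, dept text, location text, salary int);

INSERT INTO staff VALUES
  ('Abbey',  'IT', 'London', 85000),
  ('Paul',   'IT', 'Madrid', 74000),
  ('Clancy', 'HR', 'London', 71000),
  ('Imani',  'HR', 'Madrid', 74000);

-- GROUPING(col) returns 0 when col is part of the current grouping set
-- and 1 when it has been aggregated away for that row.
SELECT
  CASE
    WHEN GROUPING(dept) = 0     THEN 'department'
    WHEN GROUPING(location) = 0 THEN 'location'
    ELSE 'total'
  END AS grouped_by,
  dept,
  location,
  SUM(salary)
FROM staff
GROUP BY GROUPING SETS ((dept), (location), ());
```

Each row now says which grouping set produced it, so the client does not have to infer that from which columns happen to be NULL.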

This week’s tip is sponsored by Retool. Build internal tools in minutes, not days. Retool provides powerful building blocks that connect to any DB so you can quickly build the tools your company needs.

🗓 Upcoming Online Events

  • Postgres Pulse - weekly at 11am ET each Monday. Weekly Zoom-based sessions with folks like Bruce Momjian, Vibhor Kumar, and other people at EnterpriseDB.
  • Postgres Vision 2020 on June 23-24. A full attempt at an online Postgres conference across multiple days with multiple tracks.

How 'RETURNING' yielded a 9x performance improvement

#360 — June 17, 2020

Jepsen Finds a Bug in Postgres 12.3— The name ‘Jepsen’ is sure to stir emotions in any database server creator as their analyses of distributed systems (such as databases!) often reveal all sorts of edge cases and flaws. And so it goes with Postgres with a bug in serializable isolation being found. This is technical stuff, but it’s great to see systems like Postgres placed under such rigorous analysis.

Kyle Kingsbury

How One Word in Postgres Unlocked a 9x Performance Improvement— The creator of a personal finance tool experienced a user whose data caused a flood of INSERTs large enough to cause a problem. Here’s the tale of how a simple RETURNING clause enabled a huge optimization in the process.

James Long

Webinar | Distributed SQL: A Modern, Cloud-Native PostgreSQL— Hitting challenges scaling PostgreSQL, deploying it in the cloud, or using it with microservice architecture? A Distributed SQL database might be a better fit for your workload. Tune in tomorrow, June 18, for an architecture review and live demo.

COCKROACH LABS sponsor

Citus 9.3 Released: The Postgres Horizontal Scaling Extension— Citus is the now Microsoft-owned Postgres extension that turns Postgres into a distributed database that can be more powerfully used across multiple servers. 9.3 improves the distributed SQL support, now with full support for window functions.

citusdata

XgeneCloud: Instant REST and GraphQL APIs on any Database— Where ‘any’ includes Postgres, naturally :-) There are quite a lot of pieces to this and they seem to be forming a company around the tech, but it’s open source nonetheless.

xgenecloud

Understanding User Management in PgBouncer - PgBouncer is a popular connection proxy and pooler for Postgres, but it doesn’t just pass things straight through to Postgres; it can handle authentication directly itself. This article clears up how that works.

Peter Eisentraut

Controlling Runaway Queries with Statement Timeouts— A well tuned database on good hardware is easily capable of many thousands of queries per second but long lived queries can quickly cause things to back up. Here’s one defense against queries that want to live forever.

Craig Kerstiens
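
As a rough sketch of what this can look like in practice, statement_timeout can be set for a single session or attached to a role. The role name app_user and the timeout values here are placeholders for illustration, not recommendations taken from the article.

```sql
-- Cancel any statement in this session that runs for longer than 5 seconds.
SET statement_timeout = '5s';

-- This would normally take 10 seconds, so it is cancelled with an error instead.
SELECT pg_sleep(10);

-- Apply a default to every new session of a (hypothetical) application role.
ALTER ROLE app_user SET statement_timeout = '30s';
```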

A Free, Multi-Node, Petabyte Scale, Time-Series Database on Postgres - Read more about Timescale’s multi-node time-series database and why they’re making it available for free.

Timescale sponsor

Controlling Server Variables at Connection Time— A brief tip, this. For example: psql 'options=-cwork_mem=100MB dbname=test'

Bruce Momjian

Composite Data Type Performance Issues in Postgres

Hans-Jürgen Schönig

Running pgbackrest on FreeBSD - pgbackrest is a backup and restore system for Postgres.

Luca Ferrari

pq: The Pure Go(lang) Postgres Driver for database/sql— Now supports GSS authentication.

Blake Mizerany and contributors

🗓 Upcoming Online Events

  • Postgres Vision 2020 on June 23-24. A full attempt at an online Postgres conference across multiple days with multiple tracks.
  • Postgres Pulse - Zoom-based sessions with folks like Bruce Momjian, Vibhor Kumar, and other people at EnterpriseDB. Now running every other week at 11am ET on Mondays. The next is on June 29.

Unifying JSON and JSONB into a new JSON type

#361 — June 24, 2020

'10' Things Postgres Could Improve— An in-progress four part series (part 2 here) covering topics like transaction ID issues and replication.

Shaun Thomas

📊 PDF: It's Time for a JSON/JSONB Great Unification— This dense, technical slidedeck is for you if you’re really into your JSON and you really care about Postgres’s JSON capabilities. It digs into JSON support in a forthcoming SQL standard and what future versions of Postgres can do to support this by unifying JSON and JSONB into one new data type.

Oleg Bartunov

We Help Customers Speed Up Postgres Queries By 1000x. Learn How— With pganalyze, companies like Atlassian are able to speed up their queries by orders of magnitude. In this ebook, we share our best practices for optimizing Postgres performance.

pganalyze sponsor

Postgres 13 Beta 1 Now in Amazon RDS Database Preview Environment— RDS lets you run a managed Postgres deployment on AWS and the Preview Environment lets you play with pre-release versions if you want to test your tooling, etc.

Amazon Web Services

How to Force a Table to Have Just One Row— Bruce demonstrates how to force a table to have at most one row by creating a unique expression index on a constant, with no column name references.

Bruce Momjian
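
A minimal sketch of the idea, using a hypothetical site_settings table: a unique index on a constant expression gives every row the same index key, so a second row cannot be inserted.

```sql
-- Hypothetical single-row settings table.
CREATE TABLE site_settings (maintenance_mode boolean NOT NULL);

-- The indexed expression is the constant 1, with no column references,
-- so all rows would share one key and only one row can exist.
CREATE UNIQUE INDEX site_settings_one_row ON site_settings ((1));

INSERT INTO site_settings VALUES (false);  -- succeeds
INSERT INTO site_settings VALUES (true);   -- fails with a unique violation
```

Updating the existing row still works as normal, since the index key does not change.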

EnterpriseDB Rebrands to EDB— EnterpriseDB are a popular company in the Postgres space so if you now see 'EDB' anywhere, well.. it's them :-)

EnterpriseDB / EDB

SQL Trickery: Hypothetical Aggregates— You can use this technique to determine the rank of a hypothetical value within an existing set.

Hans-Jürgen Schönig

tuned, Postgres, and You - tuned is a dynamic adaptive system tuning daemon (from Red Hat) that tunes system settings dynamically depending on usage. This post demonstrates creating a tuned profile for Postgres in particular.

Douglas J Hunley

ltree vs. WITH RECURSIVE— This is a follow up to Hans-Jürgen’s article on hierarchical queries.

Hans-Jürgen Schönig

Your Data Is Your Business— PGX is a full-service database consultancy focused on PostgreSQL data systems, on any platform or hosting environment.

PostgreSQL Experts, Inc. sponsor

Wrapping Db2 with Postgres— Db2 is a family of database products developed by IBM and focused on enterprise use. If you need to migrate Db2 data to Postgres, db2_fdw provides a way to do it.

Marcelo Diaz

Reasons to Migrate from Oracle to Postgres— This is truly a case of ‘preaching to the choir’ in this newsletter, but if you need any more reasons when chatting to others.. 😄

Kirk Roybal

pg_auto_failover: Automated Failover and High-Availability Extension— Monitors and manages automated failover for a Postgres cluster.

Citus Data

💡 Tip of the Week

Converting SQL between dialects

This is one of those 'tool' tips where you're unlikely to need it right now but at some point in the future you'll go: "Aha! I know what I need to solve this!"😂

Let's say you have some SQL written in a dialect that isn't compatible with Postgres, or perhaps the opposite where you need to take your Postgres query into the Oracle or SQL Server world — what can you do?

You either hit the documentation and work out how to convert between SQL dialects, or you first hit up the jOOQ SQL Translator.

It's far from a perfect tool, but if you've forgotten some quirk about Oracle, SQL Server (or one of several dialects the tool supports), it can provide a useful sense check and I've leaned on it a few times to double check if the SQL I'm suggesting to others is widely supported.
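
To give a flavour of the kind of difference involved, here is a hand-written illustration (not output from the translator) of how limiting a result to the top rows is spelled in different dialects. The staff table is a hypothetical example.

```sql
-- Postgres (also MySQL and SQLite): LIMIT
SELECT name, salary FROM staff ORDER BY salary DESC LIMIT 2;

-- SQL standard form, accepted by Postgres and by Oracle 12c and later
SELECT name, salary FROM staff ORDER BY salary DESC FETCH FIRST 2 ROWS ONLY;

-- SQL Server has no LIMIT; the usual spelling there is TOP:
--   SELECT TOP (2) name, salary FROM staff ORDER BY salary DESC;
```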

This week’s tip is sponsored by Retool. Build internal tools in days, not weeks. Retool provides UI building blocks that connect to any DB and API so you can quickly build the tools your company needs.

🗓 Upcoming Online Events

  • Postgres Pulse - Zoom-based sessions with folks like Bruce Momjian, Vibhor Kumar, and other people at EnterpriseDB. Now running every other week at 11am ET on Mondays. The next is on June 29 (next Monday).
  • Table Partitioning in Postgres - Join Amit Langote for this free session in which he looks at partitioning as a tool to solve certain problems. June 26/27 (depending upon timezone).

Testing Postgres extensions with Docker and GitHub Actions

#362 — July 1, 2020

Using the system_stats Extension to Monitor Postgres - system_stats is a new extension (out of EDB, formerly EnterpriseDB) made up of several stored procedures you can use to monitor Postgres (or, more accurately, the server it’s running on).

Dave Page

Testing Extensions with GitHub Actions - pgxn-tools provides a Docker image with scripts to install and run any version of Postgres, and this can be brought together with a CI/CD platform like GitHub Actions to test Postgres extensions on multiple Postgres versions.

David E. Wheeler

Whitepaper: BDR - Advanced Clustering & Scaling for PostgreSQL— Learn more about Advanced Clustering & Scaling for PostgreSQL with BDR including use cases, architectures, features & benefits. BDR provides multi-master replication with AlwaysOn availability, worldwide clusters, & Cloud Native deployments for PostgreSQL databases.

2ndQuadrant PostgreSQL Products sponsor

Calculating Differences Between Rows in SQL— If you have lots of rows that represent changing values (such as time series data, say), being able to calculate the differences between the values over time could be useful. Here are some approaches.

Hans-Jürgen Schönig

PDF: Running Postgres in Kubernetes— A slidedeck based rundown of the top options for running Postgres in Kubernetes and some of the features contained within each.

Lukas Fittl

13 Tips to Improve Insert Performance on Postgres and TimescaleDB— The first five tips are related to Postgres in general, with the remainder being specific to TimescaleDB, a time series data extension for Postgres.

Michael Freedman

On Join Strategies and Performance— It’s worth understanding the three forms of join strategy Postgres uses when joining relations and how you can influence them with indexes.

Laurenz Albe

Can Case Comparison Be Controlled? - “Let’s go over some Postgres case-precision behaviors like the handling of character strings, identifiers, and keywords.” A handy primer to case comparison issues here.

Bruce Momjian

Timescale Cloud: Hosted, High-Performance PostgreSQL— Now available in 75+ regions across 3 major clouds. Analytics, performance, and scale. 👉 Start free, no credit card required.

Timescale sponsor

Postgres Anonymization on Demand— If your database stores personally identifiable information – and it probably does! – anonymization can help protect that data while still keeping the database useful. This article digs into an approach for applying anonymization globally, covering all applications, with minimal code rewrites.

Achilleas Mantzios

Setting Up ora2pg for an Oracle to Postgres Migration— I suspect if you have access to an Oracle database, you might not be looking around for tutorials randomly, but if you are.. and if you need to migrate the data to Postgres.. ora2pg and this tutorial can help.

Yorvi Arias

Authenticating pgpool II with LDAP

Ahsan Hadi

A Story of Indexing JSONB Columns— A neat mix of story and tip when it comes to working with JSONB at scale.

Vsevolod Solovyov

💡 Tip of the Week

Use WITH to query a subset

There is a whole lot you can do with Common Table Expressions (CTEs), from traversing trees to using window functions in rankings. CTEs are written using the WITH clause. A basic use is to query a subset, similar to a subquery.

Let’s demonstrate this by setting up some sample data:

CREATE TABLE databases (
  name TEXT, model TEXT, type TEXT, pgcompat boolean, opensource boolean);
 
INSERT INTO databases VALUES
  ('YugabyteDB', 'rdbms' , 'distributed sql', TRUE, TRUE),
  ('CockroachDB', 'rdbms', 'distributed sql', FALSE, FALSE),
  ('MongoDB', 'document', 'nosql', FALSE, FALSE),
  ('Neo4j', 'graph', 'nosql', TRUE, FALSE),
  ('Cassandra', 'columnar', 'nosql', FALSE, TRUE);

If we wanted to run a query against databases that are open source licensed, support SQL, and are compatible with PostgreSQL:

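WITH dbs AS (
    SELECT
        name,
        model,
        -- Reinterpret the free-text type column as a boolean flag
        (CASE
            WHEN type = 'distributed sql' THEN TRUE
            ELSE FALSE
        END) sql_support,
        pgcompat,
        opensource
    FROM
        databases
)
SELECT
    name,
    model
FROM
    dbs
WHERE
    -- Conditions are combined with AND (commas are not valid here)
    opensource = TRUE
    AND pgcompat = TRUE
    AND sql_support = TRUE
ORDER BY
    name;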

We created a CTE called dbs and reinterpreted the “type” field to be sql_support as a boolean using a case statement. In the main select statement we query against this CTE and in our where clause, we are able to specify only booleans.

This gets us a result like so:

    name    | model
------------+-------
 YugabyteDB | rdbms
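
Since the tip describes this use of WITH as similar to a subquery, here is a sketch of the same query written as a derived table for comparison. It assumes the databases table and sample rows created above, and it swaps the CASE expression for a plain boolean comparison, which behaves the same for this data.

```sql
SELECT
    name,
    model
FROM (
    SELECT
        name,
        model,
        (type = 'distributed sql') AS sql_support,
        pgcompat,
        opensource
    FROM
        databases
) AS dbs
WHERE
    opensource
    AND pgcompat
    AND sql_support
ORDER BY
    name;
```

Both forms return the single YugabyteDB row. The CTE version tends to read better once the subset is referenced more than once or the query grows several steps.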

This week’s tip is sponsored by YugabyteDB, the high-performance distributed SQL database for internet-scale applications. Serve applications with SQL query flexibility and cloud native agility.

Amazon RDS now lets you take Postgres from outhouse to inhouse

#363 — July 8, 2020

Using the system_stats Extension to Monitor Postgres - system_stats is a new extension (out of EDB, formerly EnterpriseDB) made up of several stored procedures you can use to monitor Postgres (or, more accurately, the server it’s running on).

Dave Page

Generating Random Strings and Integers That Actually Aren’t— A look at creating “random-looking” coupon codes or similar strings in a PL/pgSQL function.

Josh Williams

We’ve Helped 100s of Customers Speed Up Postgres. Learn Our Best Practices— With pganalyze, companies like Robinhood are able to speed up their queries by orders of magnitude. In this ebook, we share our best practices for optimizing Postgres performance.

pganalyze sponsor

Bitnine Looks to Scale Postgres with AGE - AgensGraph is a transactional graph database built on Postgres but has now morphed into a new project called AGE and is being incubated by Apache.

Datanami

Amazon RDS Now Works on AWS Outposts— Let’s untangle this. RDS (Relational Database Service) is AWS’s service for running Postgres (and MySQL) in the cloud. AWS Outposts is a way to take the AWS experience into your own datacenter. So now you can use AWS’s cloud database service on AWS’s hardware but within your own physical environment. OK - got it!

AWS

Using CREATE DOMAIN to Create New Data Types— If you want to create more advanced server side data type constraints than the default data types can provide, you can create your own data types (domains).

Hans-Jürgen Schönig
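
A small sketch of the idea follows. The percentage domain and discounts table are hypothetical names chosen for illustration and are not taken from the linked article.

```sql
-- A domain is a base type plus constraints that travel with it.
CREATE DOMAIN percentage AS numeric
  CHECK (VALUE >= 0 AND VALUE <= 100);

-- Any column declared with the domain inherits its CHECK constraint.
CREATE TABLE discounts (code text PRIMARY KEY, amount percentage NOT NULL);

INSERT INTO discounts VALUES ('SUMMER', 15);   -- accepted
INSERT INTO discounts VALUES ('OOPS', 150);    -- rejected by the domain's CHECK
```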

Generated Columns vs. Triggers in Postgres 12— Postgres 12 introduced the idea of generated columns - columns which are generated from other columns’ values. This post compares the performance against using triggers to similar ends.

Emanuel Calvo and Anthony Sotolongo

More Postgres 13 Features: Deduplication in B-Tree Indexes— Postgres 13, due later this year, has a feature that can decrease the size of B-tree indexes by way of avoiding duplication. This post includes a contrived example to show off the difference.

Hamid Akhtar

Enterprise-Ready PostgreSQL | EDB Supercharges PostgreSQL— A portfolio of service offerings to help you on your PostgreSQL journey. Data migration. Hybrid cloud. High availability.

EDB sponsor

How Storj Migrated from Postgres to CockroachDB: Bulk Loading Performance

Jessica Grebenschikov

plpgsql_check: A PL/pgSQL Code Checker— A specialized tool for when you want to find any errors lurking in your PL/pgSQL functions. Support for functions with arguments of polymorphic types has just been added.

Pavel Stehule

Noisia: A 'Harmful Workload Generator' for Postgres— Creates things like deadlocks, transactions that do nothing and queries that produce on-disk temporary files. Why? For testing, stress-testing your setup, etc. Use carefully and with caution.

Lesovsky Alexey

A way to build Postgres extensions in Rust

#364 — July 15, 2020

PGX: Build Postgres Extensions with Rust— A framework for building Postgres extensions with Rust rather than the C or C++ you might be more used to. Why is this a good thing? Rust is becoming pretty popular in more cutting edge circles lately and the language’s safety features make it a compelling choice.

ZomboDB

JSONB: A Container of Types— Bruce reminds us that JSONB columns don’t just have to hold collections like arrays or maps/hashes, you can directly store strings, numbers and booleans too.

Bruce Momjian

Achieve High Availability with PostgreSQL [Whitepaper]— Critical business applications require availability of their backend database cluster. Discover how to setup & deploy enterprise-grade High Availability of up to four 9s. Learn the features and benefits in the Highly Available Postgres Clusters Whitepaper.

2ndQuadrant PostgreSQL Products sponsor

How to Find a Postgres Bug Born in 1997— The tale of discovering a bug hidden away inside Postgres’s documentation or circle type (depending on your point of view) for the past 23 years.

David Zhang

⚡️ Quick bytes:

How to SCRAM with pgBouncer— Not only can you see how pgBouncer (the connection pooler) uses SCRAM authentication but also what SCRAM authentication is and how it works.

Jonathan S. Katz

What Are Postgres Templates?— We’ve covered this in one of our ‘tips of the week’ before, but if you’re unaware of what template databases are and how they’re used, this is a quick primer.

Angelico de los Reyes

Foreign Data Wrappers: Postgres's Secret Weapon?— Foreign data wrappers (often called FDWs, for short) let you query remote databases directly from your PostgreSQL instance, and here’s how that works alongside Splitgraph, a toolkit for building and working with “data images”.

Artjoms Iškovs (Splitgraph)

Distributed SQL Beats Polyglot Persistence for Building Microservices— Polyglot persistence is the silent killer of release agility; distributed SQL is the answer for today’s multicloud era.

YugabyteDB sponsor

PgTyped: Typesafe SQL in TypeScript— Use raw SQL in TypeScript with guaranteed type-safety.

Adel Salakh

pgsodium: A Postgres Extension for Using libsodium - libsodium is a crypto library for encryption, decryption, signatures, password hashing and more. We first linked when it was in alpha but it’s now 1.0 :-)

Michel Pelletier

💡 Tip of the Week

RANK and DENSE_RANK for ranking orders

Let's say we've been taking votes of people's favorite database systems. Here's our schema:

CREATE TABLE databases (name TEXT, votes INT);
INSERT INTO databases VALUES
  ('Postgres', 10000),
  ('MySQL', 4522),
  ('SQLite', 9500),
  ('MongoDB', 4522),
  ('Oracle', 4580),
  ('Redis', 9500);

We want to get the results in order along with a rank of where each database stands:

SELECT row_number() OVER (ORDER by votes DESC), * FROM databases;

It doesn't seem very fair to me that Redis gets ranked #3 despite having the same number of votes as SQLite! Ranks in the real world often take note of equivalent/tied scores and give the same rank to similarly scored items.

Enter RANK and DENSE_RANK, two of several window functions:

SELECT DENSE_RANK() OVER (ORDER by votes DESC), * FROM databases;
SELECT RANK() OVER (ORDER by votes DESC), * FROM databases;

The difference is subtle, but RANK gives us a more traditional ranking where, if there are two second place entries, there's no longer a third place entry. DENSE_RANK, however, ensures there are contiguous rankings even if there are duplicates.
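
To see the three functions side by side, here is a sketch that assumes the databases table and votes from the schema above. The extra name sort key is an addition for illustration, used only so that tied rows come back in a predictable order.

```sql
SELECT
  name,
  votes,
  row_number() OVER (ORDER BY votes DESC, name) AS row_number,
  rank()       OVER (ORDER BY votes DESC)       AS rank,
  dense_rank() OVER (ORDER BY votes DESC)       AS dense_rank
FROM databases
ORDER BY votes DESC, name;

--    name   | votes | row_number | rank | dense_rank
-- ----------+-------+------------+------+------------
--  Postgres | 10000 |          1 |    1 |          1
--  Redis    |  9500 |          2 |    2 |          2
--  SQLite   |  9500 |          3 |    2 |          2
--  Oracle   |  4580 |          4 |    4 |          3
--  MongoDB  |  4522 |          5 |    5 |          4
--  MySQL    |  4522 |          6 |    5 |          4
```

RANK skips from 2 to 4 after the tie for second place, while DENSE_RANK carries straight on to 3.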

Try out this tip in PopSQL. PopSQL is a SQL editor built for teams. Think “Figma for SQL”. A free Premium trial awaits Postgres Weekly readers (no credit card required).

Recreating a location based social network with Postgres

#365 — July 22, 2020

Recreating YikYak with PostgreSQL— YikYak was an anonymous social network that used your location to show you posts 5km around you (we can’t see how that could possibly cause problems.. 😂) and you can recreate the underlying principle of its operation using Postgres’s geographical coordinate support.

Adam Fallon

pgwatch2 v1.8.0 Released— pgwatch2 is a popular Postgres monitoring tool and it now has Pgpool-II, Postgres 13 and TimescaleDB metrics storage support.

Kaarel Moppel

Real-Time Postgres Performance Monitoring— Collect out-of-the-box and custom Postgres metrics and correlate them with data from across your distributed infrastructure and applications. Try it free with Datadog for 14 days.

Datadog sponsor

Partitioning a Large Table Without a Long-Running Lock— You’ve got a huge table, you need to partition it, but you need that table to remain available to your app.. what do you do? Andrew Dunstan has a recipe to follow.

Andrew Dunstan

Getting More Performance for LIKE and ILIKE Statements— Our tip of the week (below) covers pattern matching and LIKE, but what if it’s not enough and you want to speed up performance? There are solutions!

Hans-Jürgen Schönig

GROUPING SETS and NULL Values— Our tip in issue 359 was about grouping sets, a way to perform grouping within a query that’s more complex than simple GROUP BY column can do. Bruce looks at the relationship between using GROUPING SETS and null values.

Bruce Momjian

Representing Dates, Times and Intervals in Postgres— Postgres comes with a bunch of built-in date and time related data types. But why should you use them over strings or integers? Here’s an overview looking at the why, plus advice on how to do so effectively.

RapidLoop

💻 Live Webinar: 5 Ways to Improve Postgres Insert Performance - Join us on Aug 19 to learn five simple, yet powerful, techniques to supercharge your PostgreSQL ingest performance - plus live demos and pro tips for each tactic.

Timescale sponsor

Examining the Postgres Catalog with Python— Mark Ryan looks at how to get the most out of database metadata - by writing a program to automatically extract the information held within.

Towards Data Science

Generating & Managing Postgres Schema Migrations with Spring Boot JPA

Muhammad Haroon

▶  Using PEM to Improve Performance in Postgres: The Postgres Tuning Wizard and Performance Diagnostics— An hour-long webinar running through how to use EDB’s Postgres Enterprise Manager (PEM) GUI. (Note: the audio is a bit muffled generally.)

EDB

💡 Tip of the Week

Pattern matching

If you need to filter query results beyond usual comparisons or equality without setting up full text search, what are your options? LIKE is perhaps the best known:

SELECT * FROM people WHERE name LIKE 'Sam%';
 
-- All 'people' whose names start with 'Sam' here

There's also ILIKE to make such a query case insensitive:

SELECT * FROM people WHERE name ILIKE 'sam%';
 
-- 'SAM', 'sAMantha', etc. would be picked up

But did you know there are other options?

SIMILAR TO is like LIKE but it uses the SQL standard's definition of a regular expression to do the match (which means you still get % as a sort of .* equivalent):

SELECT * FROM people WHERE name SIMILAR TO '(Pat|Sam)%';
 
-- Rows where name starts with Pat.. or Sam..

If you prefer POSIX style regular expressions (and I do!) you can also use operators like ~ (case sensitive) and ~* (case insensitive):

SELECT * FROM people WHERE name ~* '(Pat|Sam).*';

This tip is just to whet your appetite, but there's a lot more in the Pattern Matching documentation. You need to be careful about efficiency and only run such queries over subsets of tables if you're operating at scale (otherwise you might need to go with true full text indexes and searching), but for many situations, Postgres's regular expression and pattern matching support is just fine.
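
One difference worth knowing when moving between these operators: LIKE and SIMILAR TO must match the whole string, whereas the POSIX operators match anywhere in the string unless the pattern is anchored. A small sketch, using a hypothetical temporary people table with made-up names:

```sql
CREATE TEMP TABLE people (name text);
INSERT INTO people VALUES ('Sam'), ('samantha'), ('Patricia'), ('Rosamund');

-- Unanchored and case insensitive: matches 'sam' anywhere,
-- so Rosamund is returned along with the other three names.
SELECT name FROM people WHERE name ~* '(Pat|Sam)';

-- Anchored with ^: only names that start with Pat or Sam, in any case
-- (Sam, samantha, Patricia).
SELECT name FROM people WHERE name ~* '^(Pat|Sam)';

-- SIMILAR TO matches the whole string and is case sensitive
-- (Sam and Patricia only).
SELECT name FROM people WHERE name SIMILAR TO '(Pat|Sam)%';
```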

Try out this Tip in PopSQL. PopSQL is a SQL editor built for teams. Think “Google Docs meets a SQL editor”. A free Premium trial awaits Postgres Weekly readers (no credit card required).

Fill factors, SQL style, and figuring out prices in Postgres

#366 — July 29, 2020

What is 'Fill Factor' and How Does It Affect Performance?— The ‘fill factor’ defines just how tightly packed table storage is. For example, when a row is updated, the ‘new’ row might end up in a different page of storage due to size limitations, whereas a less tightly packed table would allow the update to happen almost in situ.

Kaarel Moppel

A SQL Style Guide— We linked to this a couple of years ago but Bruce Momjian has reminded us of this handy SQL style guide to ensure legible and maintainable queries.

Simon Holywell

[Case Study] Successful Migration to PostgreSQL by Global Gaming Company— Learn how 2ndQuadrant helped International Gaming Technology (IGT) successfully migrate to PostgreSQL from an expensive and proprietary DBMS to deliver high quality gaming experiences worldwide, as well as how they've experienced zero outages since the switch.

2ndQuadrant PostgreSQL Services sponsor

Unicode Normalization in Postgres 13— If you know nothing about what ‘fully composed’ or ‘fully decomposed’ means when it comes to Unicode or what Postgres does when considering the equality of Unicode strings, this is a short and sweet introduction to some ideas you might not need to know in detail but should probably know about.

Peter Eisentraut

Avoiding Passwords in Log Files - “Because Postgres uses SQL queries to manage user accounts, including password assignment, it is possible for passwords to appear in the server logs..”

Bruce Momjian

A Safer Price Type in Postgres— Is it worth modelling prices by creating a domain/type rather than just sticking to cents in a number column? Vados experiments.

Vados

Postgres 13's EXPLAIN Now Includes WAL Information— You’ll be able to see how many WAL records are created and similar info.

Luca Ferrari

Monitor and Visualize Postgres Database Metrics End-To-End— Monitor PostgreSQL performance end-to-end with OOTB, customizable, drag-and-drop dashboards in Datadog. Try it free.

Datadog sponsor

Installing TimescaleDB on macOS with Postgres.app - TimescaleDB is a popular time series data Postgres extension and Postgres.app is an equally popular way to run Postgres on the Mac. This tutorial brings the two together.

Prathamesh Sonpatki

Using Custom Types with Citus and Postgres, From Popular Hack to Transparent Feature— How custom types work with Citus and how user-defined PostgreSQL types are automatically propagated to all the nodes in a Citus cluster.

Nils Dijk

▶  Postgres Performance Tuning and Optimization— An hour-long talk that focuses on tuning configuration settings (both in Postgres and on Linux generally) and what things they each affect.

Ibrar Ahmed (Percona)

What It Was Like Speaking Online at EuroPython 2020— A speaker shares a little of the experience of giving a talk at an online conference.

Paolo Melchiorre

GoodJob: A New Postgres-Based, Multithreaded, Background Job System for Ruby on Rails— Ben calls GoodJob a “second-generation” backend because it focuses on compatibility with ActiveJob. It’s suited for use cases queuing fewer than 1 million jobs/day.

Ben Sheldon

💡 Tip of the Week

Adding constraints to JSON objects

I find it easy to think of JSONB columns as being open to anything – I'll often create a JSONB 'metadata' column and use it as a sort of generic bucket to throw data into that I may or may not need later. They perform this job well.

While you can use JSON in a very dynamic, 'seat of the pants' fashion, you can also bring it into the structured world with queries, functions, and constraints.

For example, you might have a books table and be storing data about books within it as JSON documents:

create table books(k serial primary key, doc jsonb not null);

insert into books(doc) values
  ('
    { "ISBN"    : 4582546494267,
      "title"   : "Macbeth",
      "author"  :
        {"given_name": "William",
         "family_name": "Shakespeare"},
      "year"    : 1623
    }
  ');

One type of constraint that might help in a table like this is to ensure that we even have a JSON document of some type. You can do this by checking the type of the document and ensuring that it's an object (i.e. that it has keys and values, rather than just being the number 10, say):

alter table books
add constraint books_doc_is_object
check(
  jsonb_typeof(doc) is not null and
  jsonb_typeof(doc) = 'object'
);

We can go a step further by validating the data within the JSON document. For example, an ISBN is a 13 digit integer that uniquely represents a book (or other forms of written material). We can ensure that the "ISBN" value given in a JSON document is a thirteen digit integer like so:

alter table books
add constraint books_doc_isbn_ok
check(
  doc->>'ISBN' is not null              and
  jsonb_typeof(doc->'ISBN') = 'number'  and
  (doc->>'ISBN')::bigint > 0            and
  length(doc->>'ISBN') = 13
);

This validation goes into some depth: it checks that the 'ISBN' value is present and that it's a number, casts it to an integer to check that it's positive, and then checks that its length is equal to 13.
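
As a rough sketch of what these two constraints buy you (assuming both have been added to the books table above), each of the following inserts should be rejected rather than stored. The sample documents are made up for illustration:

```sql
-- Not an object: a bare number violates books_doc_is_object
insert into books(doc) values ('10');

-- ISBN supplied as a string, not a number: violates books_doc_isbn_ok
insert into books(doc) values
  ('{"ISBN": "4582546494267", "title": "Macbeth"}');

-- ISBN is a number but only 12 digits long: violates books_doc_isbn_ok
insert into books(doc) values
  ('{"ISBN": 458254649426, "title": "Macbeth"}');

-- No ISBN key at all: doc->>'ISBN' is null, so books_doc_isbn_ok is violated
insert into books(doc) values ('{"title": "Macbeth"}');
```

Each statement fails on its own, so run them one at a time if you want to see the individual error messages.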

You can go a lot further than this, but the newsletter would be too long, so check out Bryn Llewellyn's document data modeling with JSON data types article for more. It's written about YugabyteDB but since its JSON support is built on top of Postgres, most of what's written applies to Postgres too.

This week’s tip is sponsored by YugabyteDB, the high-performance distributed SQL database for internet-scale applications. Serve applications with SQL query flexibility and cloud native agility.


An introduction to SCRAM authentication in Postgres 13

#367 — August 5, 2020

Postgres Weekly

Postgres 13 Incompatibilities To Be Aware Of— Postgres 13 will be in general release soon (it’s currently in beta) and while Postgres tends to be reasonably backward compatible over time, there are some changes worth being aware of before you make a migration, as seen here.

Ibrar Ahmed

How to Securely Authenticate with SCRAM in Postgres 13— A tutorial on setting up SCRAM-based password authentication. Interesting not just as a tutorial but also a primer on what SCRAM is and why channel binding improves security overall.

Jeff Davis

How to Tune PostgreSQL GUC Parameters— This article discusses GUC parameters that can be used for PostgreSQL tuning and how to configure them for improved performance. Take a look.

EDB sponsor

Measuring the Difference Between Dates— “What is the difference between two dates? You would think there was one answer, but there isn’t.” An interesting look at how there are multiple ways to get the difference between two dates or timestamps in Postgres and how they can return slightly different answers.

Bruce Momjian

effective_cache_size: A Practical Example— Some insights and a practical example of what the effective_cache_size setting does.

Hans-Jürgen Schönig

Announcing pgBackRest for Azure: Fast, Reliable Postgres Backups— Backups are a key staple of running any database. pgBackRest aims to be a fast, reliable, easy-to-use backup and restore solution with the ability to seamlessly scale to the largest databases, and now there’s official support for Azure.

Craig Kerstiens

Connection Pooling in Pgpool-II— Pgpool-II is a connection pooling tool for Postgres – this post looks at the basics of its operation.

B Peng

Best-Practices on How to Speed Up Your Postgres Queries. Free eBook— Companies like Robinhood and Atlassian are able to speed up their queries by orders of magnitude. This eBook shares our best practices for optimizing Postgres performance.

pganalyze sponsor

▶  How to use SSL in Postgres The Right Way: Encrypt Your Data in Transit

Kirill Shirinkin

Continuous Backups using WAL-G— WAL-G lets you manage the archiving of Postgres data and backups and can also let you restore a database to its state at a particular moment in time.

Angelico de los Reyes

Computing INTERVAL Values— Bruce seems to be on a date and time streak this week, and here he looks at some quirks of date/time intervals.

Bruce Momjian

▶  High Performance HTAP with Postgres & Hyperscale (Citus)

Marco Slot and Claire Giordano

💡 Tip of the Week

The FORMAT function

When a fresh Postgres tip doesn't immediately come to mind, one of my favorite techniques is to look at the (amazing) Postgres documentation and skim through until I find something I didn't know about before but that I think might be useful in future. So it goes today with the FORMAT function!

If you're familiar with format, printf or sprintf in various languages like Python, C, or Ruby, you'll know what a format string is – a string using a simple template language made up of special delimiters which can be replaced by supplied values. Postgres offers the same idea in the FORMAT function:

SELECT FORMAT('%s %s', 'hello', 'world');
/* => 'hello world' */

SELECT FORMAT('|%10s|', 'test');
/* => '|      test|' */

SELECT FORMAT('INSERT INTO %I VALUES(%L, %L)',
  'people', 'pat', null);
/* => 'INSERT INTO people VALUES('pat', NULL)' */

Sadly, FORMAT isn't anywhere near as powerful as string formatting languages in non-SQL languages (there's no %d or %f, for example) but you could find it useful for bringing together multiple columns into a preferred form of output (such as a column of cents into a formatted price):

/* 100.0 forces numeric division; cents / 100 would truncate an integer column */
SELECT FORMAT('$%s', ROUND(cents / 100.0, 2));
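
To try this without an existing table, here's a small self-contained sketch. The inline list of cents values is made up for illustration, and it divides by 100.0 rather than 100 so that integer cents aren't truncated by integer division:

```sql
SELECT cents,
       FORMAT('$%s', ROUND(cents / 100.0, 2)) AS price
FROM (VALUES (1999), (500), (5)) AS t(cents);
/* =>  1999 | $19.99
        500 | $5.00
          5 | $0.05  */
```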

Be sure to check out the official documentation for more if you plan to use FORMAT.

Try out this Tip in PopSQL. PopSQL is a SQL editor built for teams. Write queries, visualize data, and share your results. A free premium trial awaits Postgres Weekly readers (no credit card required).

Pagination, backtraces, and going beyond JSONB

#368 — August 12, 2020

Postgres Weekly

Postgres Pagination Approaches— Most webapp users don’t want to see a list of thousands of items at once, so pagination is often adopted as a way to ‘page’ through a larger group of items in more manageable chunks. There are several ways to achieve this with Postgres with various tradeoffs.

Bruce Momjian

A pg_stat_statements Troubleshooting 'Hack'— If you want to get a useful ‘trick up your sleeve’ for troubleshooting with pg_stat_statements, you might enjoy Kaarel’s story.

Kaarel Moppel

[Whitepaper] AlwaysOn Postgres— Discover how to achieve AlwaysOn availability in PostgreSQL using BDR - groundbreaking technology from 2ndQuadrant. AlwaysOn guarantees up to six 9s of availability for worldwide PostgreSQL database clusters running mission critical business applications.

2ndQuadrant PostgreSQL Products sponsor

Use NATURAL FULL JOIN to Compare Two Tables in SQL— “NATURAL JOIN is syntax sugar for joining using all the shared column names of the two tables, and the FULL JOIN makes sure we can retrieve also the columns that are not matched by the join predicate.”

Lukas Eder

Backtraces in Postgres 13— Postgres 13 introduces a simple but useful capability to log a stack trace into the server logs when an error is reported.

Amit Khandekar

Postgres High Availability: Considerations and Candidates— The first in a promised series of posts looking at various ways to introduce high availability concepts to Postgres. This time, RepMgr, Patroni, PAF, and PgPool-II are mentioned.

Hamid Akhtar

eBook: The Most Important Events to Monitor in Your Postgres Logs— In this eBook, we are looking at the Top 6 Postgres log events for monitoring query performance and preventing downtime.

pganalyze sponsor

On Partitioning Improvements in Postgres 13

Ahsan Hadi

Avoiding the Pitfalls of BRIN Indexes

John Porvaznik

A Crash Course on Postgres for R Users— R is a popular statistical computing environment and language, so you might find it useful to use Postgres from it.

Pachá

Working with a JSONB Array of Objects in Postgres

Rob Tomlin

Going Beyond jsonb? A Generalized, Unstructured Data Type for Postgres— Álvaro wonders if regular JSON support is enough and if there should be something beyond it in Postgres – a generalized unstructured data type, if you will.

Álvaro Hernández

plpgsql_check Now Supports Tracing— plpgsql_check is a linter for PL/pgSQL code.

Pavel Stěhule

pgagroal 0.8.0: A High Performance Postgres Connection Pool— 0.8 brings failover and systemd support.

Red Hat Inc.

💡 Tip of the Week

Getting NOW() in your preferred time zone

When you use the NOW() function to get the current time, you get the current date and time in the server's timezone.

Many servers run on UTC/GMT and if you normalize your use of time around UTC/GMT as well, everything is well, but as soon as there's a mismatch (whether between your app and your database server, or otherwise) things can go awry so it pays to be more specific.

NOW() returns a timestamp with time zone (a.k.a. timestamptz in Postgres or YugabyteDB) value but if you have a plain timestamp [without time zone] column and place NOW() into it, the timezone information is silently dropped:

=# create table test
      (a int primary key, b timestamp without time zone);
    
=# select now(), pg_typeof(now());
    
                now            |     pg_typeof    
-------------------------------+--------------------------
 2020-05-06 16:44:03.917735-07 | timestamp with time zone
(1 row)


=# insert into test values
    (1, '2020-05-06 16:44:03.917735-07');
    
=# select * from test;
    
 a |          b         
---+----------------------------
 1 | 2020-05-06 16:44:03.917735
(1 row)

=# show timezone;
  TimeZone
------------
US/Pacific

We can see that column “b” discards the time zone. If your convention is to treat plain timestamp values as UTC, the stored value is wrong: it holds the US/Pacific wall-clock time rather than the UTC time.

We can remedy this by using now() at time zone 'utc' like so:

=# insert into test values
  (2, (now() at time zone 'utc'));
 
=# select * from test;
 
 a |             b
---+----------------------------
 1 | 2020-05-06 16:44:03.917735
 2 | 2020-06-10 19:38:22.859175
(2 rows)
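
If you'd like to see the difference side by side, here's a small sketch. It assumes a session time zone other than UTC, as in the example above, and the test_utc table at the end is a hypothetical variant added only for illustration:

```sql
-- Assumption: the session is not running in UTC
set timezone = 'US/Pacific';

select
  now()::timestamp                    as local_wall_clock, -- what a plain timestamp column receives from now()
  now() at time zone 'utc'            as utc_wall_clock,   -- the same instant expressed in UTC
  pg_typeof(now() at time zone 'utc') as result_type;      -- timestamp without time zone

-- Hypothetical variant of the test table that applies the conversion by default
create table test_utc
  (a int primary key,
   b timestamp without time zone default (now() at time zone 'utc'));
```

With a default like this, an insert that omits "b" stores the UTC time regardless of the session's TimeZone setting.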

This week’s tip is sponsored by YugabyteDB, the high-performance distributed SQL database for internet-scale applications. Want to set up a virtual tech talk to learn more? Reach out now() or anytime.

New bug fix Postgres releases, and Envoy gets Postgres monitoring support

#369 — August 19, 2020

Postgres Weekly

Postgraphile: Quickly Get a GraphQL API for Your Postgres Database— Work on v5 of this popular Node.js tool is underway, but meanwhile we get a 4.8 release which supports ‘enum tables’ and all of Postgres’s built-in geometric types. If you want to offer a Postgres database up over a GraphQL-based API, it’s worth checking out.

Graphile

Postgres 12.4, 11.9, 10.14, 9.6.19, 9.5.23, and 13 Beta 3 Released— A flurry of Postgres releases to fix a variety of bugs but, more importantly, two security vulnerabilities (in versions 9.5-12 only). We’re also reminded that Postgres 9.5 will stop receiving fixes after February 2021.

PostgreSQL Global Development Group

Real-Time Postgres Performance Monitoring— Collect out-of-the-box and custom PostgreSQL metrics and correlate them with data from your distributed infrastructure, applications, and logs. Gain real-time performance insights and set intelligent alerts with Datadog. Start a free trial.

Datadog sponsor

Building a Recommendation Engine Inside Postgres with Python and Pandas— Learn how you can leverage Python and Pandas from directly inside Postgres to build your own recommendation engine.

Craig Kerstiens

An Introduction to Recursive Queries— Recursive queries (built using WITH) let you do interesting things with SQL and aren’t too difficult to understand, particularly with the simple practical examples given here.

Laurenz Albe

SQLite 3.33.0 Released (With a Postgres-Inspired Feature)— We don't often cover SQLite in this newsletter, but when the world’s most heavily used database engine gains support for UPDATE FROM and aims to be compatible with Postgres's implementation, why not? 😄

SQLite Team

Timescale Cloud Extends to 76 Regions Across 3 Clouds— TimescaleDB essentially extends Postgres with significant time-series data functionality and Timescale Cloud is the creator’s commercial ‘TimescaleDB in the cloud’ service.

Timescale

Envoy 1.15 Introduces a New Postgres Extension with Monitoring Support— Envoy is a popular service proxy for cloud applications, and it can now ‘speak’ Postgres’ wire protocol! This post digs into why this plugin was developed, what functionality it currently implements, and what the roadmap looks like for future releases.

CNCF

Running Multiple PgBouncer Instances with systemd— PgBouncer is a connection pooler that runs as a single process, but with a little work you can run multiple instances for different virtual hosts.

Peter Eisentraut

A Look at Keyset Pagination— Last week, Bruce covered pagination approaches in general and is now focusing on one technique, using LIMIT and OFFSET alongside WHERE. He’s also dug into how it works in practice while items are being inserted or deleted.

Bruce Momjian

CockroachDB: Scalable, Distributed PostgreSQL— Never shard a Postgres instance again. Meet CockroachDB, the Distributed SQL database that’s naturally resilient and provides out-of-the-box scalability.

Cockroach Labs sponsor

Iterators in Postgres with Lateral Joins— “When the keyword LATERAL is added to your join the output will now apply the right hand part of the join to every record in the left part of the join.”

Steve Pousty

How We Used Postgres Extended Statistics to Achieve a 3000x Speedup— Big numbers like that always make me put my cynical eyeglasses on, but in this case it’s a neat technical tale of encouraging Postgres to do the right thing.

Jared Rulison

PostGIS 3.0.2, 2.5.5, and 2.4.9 Released— Minor bug fix and performance enhancement releases for the popular geospatial extension.

PostGIS Developers

Why 13 will be a lucky number for Postgres

#370 — August 26, 2020

Postgres Weekly

Why Postgres 13 Will Be A 'Lucky' Release— See why the author thinks that 13, often maligned as an ‘unlucky’ number, will be lucky for Postgres: it adds features like incremental sort and parallel vacuum, and improves the performance of B-tree indexes.

Jonathan S. Katz

synchronous_commit Options and Synchronous Standby Replication— synchronous_commit lets you customize when a transaction commit can be acknowledged back to the client as successful, which has implications in replicated systems.

Jobin Augustine

[Whitepaper] Best Practices to Harden your Database— Learn the fundamental concepts of Security for PostgreSQL Databases & develop a deep understanding of industry best practices. Discover how to secure your Postgres cluster & keep your data safe. Have confidence your database is adequately protected from attacks.

2ndQuadrant Whitepapers sponsor

Improvements for Handling Large Number of Connections Coming to Postgres 14— Note that this is about Postgres 14 so you’re going to be waiting a while (as 13 isn’t even out yet!) but handling large numbers of connections is a perennial issue for many so these links to some discussion threads may be of interest.

Hubert depesz Lubaczewski

Query 40K+ Datasets Through a 'Data Delivery Network' With Any Postgres Client— Splitgraph offers what they call a ‘data delivery network’ that acts like a distributed SQL caching proxy compatible with the Postgres wire protocol. In this demo it lets you query over 40,000 datasets with SQL using the psql or other Postgres clients you know and love.

Splitgraph

Calculating Standard Deviation with SQL— As is often the case, Postgres offers a function for this task: stddev().

Bruce Momjian

A Look at TLS Related Updates Coming in Postgres 13— Supporting a minimum of TLS 1.2, Postgres 13 brings a few improvements when it comes to secure connections including support for channel binding.

Cary Huang

How to Setup Postgres on an IPv6 Enabled Network— If you’re confident with IPv6 already, this goes into more depth than you’ll need, but if not.. you may find some useful tips here.

David Zhang

Faster CI/CD for All Your Software Projects - Try Buildkite ✅— See how Shopify scaled from 300 to 1800 engineers while keeping their build times under 5 minutes.

Buildkite sponsor

pg-costop: Vector Arithmetic and Weighted, Variably Randomized Cosine Similarity Search— Mathematically this is a bit beyond me but it’s a set of PL/pgSQL functions for working with vectors to calculate cosine proximity / similarity rankings.

turbo

pg-shortkey: YouTube-like Short IDs as Postgres Primary Keys— “This installs a trigger and type which allow you to use YouTube-like short IDs (e.g. 1TNhBqYo-6Q) as Postgres Primary Keys. Just like YouTube IDs, ‘SHORTKEY’ IDs are fixed length and URL-safe.”

turbo

QuestDB: A Performance-Focused 'NewSQL' Time-Series Database System— A relational database (that isn’t built on Postgres but does support its wire protocol) focused on fast time-series data processing. Built in Java.

QuestDB Limited

Pgpool-II 4.1.3, 4.0.10, 3.7.15, 3.6.22 and 3.5.26 Released— Bug fixes all round for all maintained versions of the popular connection pooling system.

PgPool Development Group

💡 Tip of the Week

Checking for the existence of a row

You could check for the existence of a row on a Postgres table by just requesting that row and seeing if anything comes back, but depending on how you did this, the result may be ambiguous or inefficient.

The EXISTS subquery expression can be used to unambiguously determine if another query returns any rows or not, and it can therefore be used to detect if a particular row exists:

# CREATE TABLE test(id BIGSERIAL PRIMARY KEY);
# INSERT INTO test(id) VALUES (13);
# SELECT EXISTS(SELECT 1 FROM test WHERE id=11) AS "exists";
 exists
--------
 f
(1 row)

# SELECT EXISTS(SELECT 1 FROM test WHERE id=13) AS "exists";
 exists
--------
 t
(1 row)

EXISTS always returns a boolean – true or false.
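
To make the contrast with the alternatives concrete, here's a short sketch against the same test table. The conditional insert at the end is an illustrative addition rather than part of the original tip:

```sql
-- Ambiguous: a missing row produces zero rows rather than an explicit answer
SELECT id FROM test WHERE id = 11;

-- Potentially wasteful: count(*) counts every matching row before comparing
SELECT count(*) > 0 AS "exists" FROM test WHERE id = 13;

-- EXISTS only needs to know whether at least one row matches
SELECT EXISTS(SELECT 1 FROM test WHERE id = 13) AS "exists";

-- EXISTS can also guard a statement, e.g. insert a row only if it is absent
INSERT INTO test(id)
SELECT 11
WHERE NOT EXISTS (SELECT 1 FROM test WHERE id = 11);
```

On a primary key lookup like this one the cost difference is negligible, but it can matter when many rows match the condition.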

This week’s tip is sponsored by YugabyteDB, the high-performance, cloud native, open source distributed SQL database. YugabyteDB EXISTS to help power your business-critical apps at scale. Get started.

Parallelism, vacuuming, and trusted extensions

#371 — September 2, 2020

Postgres Weekly

An Overview of Trusted Extensions in PostgreSQL 13— Postgres supports extensions as a way to easily introduce new features and functions dynamically, but in Postgres 13 there’ll be the concept of a trusted extension which will allow any user with CREATE privileges on the current database to bring in an extension without superuser privileges.

Nidhi Bansal

Tuning Autovacuum for Postgres Databases— Most of the time you don’t need to tweak the settings for autovacuum as Postgres will generally do the right thing, but sometimes the default configuration just isn’t enough and this article will help you with some approaches for tuning things.

Laurenz Albe

Best-Practices on How to Speed Up Your Postgres Queries. Free eBook— Companies like Robinhood and Atlassian are able to speed up their queries by orders of magnitude. This eBook shares our best practices for optimizing Postgres performance.

pganalyze sponsor

Three Easy Things To Remember About Postgres Indexes— Basic ideas, but worth reviewing if you’re not an index expert. Indexes can speed things up beyond simple queries, indexes aren’t always used even if you think they should be, and indexes add maintenance and storage overhead you need to be prepared to manage.

Kat Batuigas

Parallelism Comes to VACUUM in Postgres 13— One of the noteworthy new features in Postgres 13 will be the ability to speed up the execution time of index vacuuming. Masahiko provides a quick example of the benefits.

Masahiko Sawada

How JSON Aggregate Functions Are Pretty Cool— Rather than extract data with SQL queries and then convert it into JSON, you could get Postgres to do the heavy lifting instead and here’s how.

Knut Hühne

The Need for External Compression Methods in Postgres— If you could specify different compression systems for different tables or columns, based upon the most appropriate one for the type of data stored, that would be pretty neat.. and it seems the Postgres team are beginning to discuss such a potential feature in Postgres.

Amit Khandekar

Using Envoy Proxy’s New PostgreSQL & TCP Filters to Collect SQL Statistics— The new PostgreSQL filter from Envoy Proxy makes it easy for developers and ops engineers to collect SQL statistics.

YugabyteDB sponsor

Benchmarking Checkpointing in Postgres— Evaluating checkpointing with varying log sizes from 1GB to 100GB.

Vadim Tkachenko

Full-Text Search Battle: Postgres vs Elasticsearch— Less a battle and more a basic comparison of techniques with a simple benchmark, but may be of interest.

Rocky Warren

Pitfalls and Quirks of Logical Replication in Postgres 12— “I’d like to share some thoughts after setting up a logical replication in a large-ish (one terabyte) production database.”

Elephant Tamer

The effect B-tree index deduplication has in Postgres 13

#372 — September 9, 2020

Postgres Weekly

A Look at B-Tree Index Deduplication in Postgres 13— B-tree indexes are the default type of index created in Postgres so any changes to their operation are likely to have a lot of knock-on effects. Deduplicating these indexes, as possible in the forthcoming Postgres 13, will help keep these indexes smaller and has performance implications (most likely lower I/O usage at a cost of minor CPU increase, but with higher overall performance in most cases).

Ryan Lambert

How to Get the Best Out of Postgres Logs— Postgres’s logging system is very tunable and there are lots of parameters to fiddle with. This post covers some basic practices for getting the most out of a Postgres server’s logs and what you can tweak.

Sadequl Hussain

[Whitepaper] Business Case for Professional Support— Learn the importance of Professional Support for your mission-critical PostgreSQL systems & how it can benefit your company. Discover how it increases database performance, helps scale, distributes data, reduces costs, saves you from known pitfalls, and more.

2ndQuadrant Services sponsor

What’s New in the Citus 9.4 Extension to Postgres— Citus transforms Postgres into a distributed database, distributing your data and your SQL queries across multiple nodes. v9.4 improves EXPLAIN ANALYZE, has some performance and safety improvements, and can now calculate percentiles at scale using the t-digest extension.

Marco Slot (Citus Data)

Generating a Normal Distribution in SQL— Postgres’s tablefunc extension provides a variety of functions that return tables, including sets of normally distributed random values.

Hans-Jürgen Schönig

PostGIS and the Geography Type— The PostGIS geography type is a geospatial type that understands coordinates as spherical coordinates (in latitude and longitude) and here’s a basic introduction to them (and one of the reasons to use PostGIS too, really).

Paul Ramsey

Best-Practices on How to Speed Up Your Postgres Queries. Free eBook— Companies like Robinhood and Atlassian are able to speed up their queries by orders of magnitude. This eBook shares our best practices for optimizing Postgres performance.

pganalyze sponsor

Tuning Postgres on ZFS— “The main reason to use ZFS instead of ext4/xfs is compression. With reasonable configuration you can achieve 3-5x compression ratio using LZ4. That means that LZ4 compresses 1 terabyte of data down to ~300 gigabytes.”

Uptrace

Building Microservices with Deno, Reno, and Postgres— Deno is a server-side JavaScript runtime built on top of V8 (a bit like Node, but not) and Reno is a routing library for Deno apps.

James Wright

Mining for Logic Bugs in the Citus Extension to Postgres with SQLancer— One of those things you’re unlikely to need to do, but it’s nice to know how such problems are approached. SQLancer is a tool we’ve linked to before that helps you detect logic-related bugs in database systems.

Nazli Ugur Koyluoglu

💡 Tip of the Week

Returning rows in the order specified in a list

If you have a table of data (let's say a books table of books stored with their name and publication_date) and you want to return rows as specified by a list, you can use IN to do this, like so:

SELECT * FROM books WHERE name IN ('The Adventures of Huckleberry Finn', 'Pride and Prejudice', 'The Great Gatsby');

                name                |  publication_date   
------------------------------------+---------------------
 Pride and Prejudice                | 1813-01-28 00:00:00
 The Adventures of Huckleberry Finn | 1884-12-10 00:00:00
 The Great Gatsby                   | 1925-04-10 00:00:00
(3 rows)

However, the rows are not guaranteed to be returned in the order in which they appear in the IN clause.

If you want to retrieve rows and order them based on their order in the IN clause, you can make use of the VALUES clause and do a join as shown below:

SELECT b.* FROM books b
    JOIN (
      VALUES ('The Adventures of Huckleberry Finn',1),
             ('Pride and Prejudice',2),
             ('The Great Gatsby',3)
    ) AS x (name, sortorder)
    ON b.name = x.name ORDER BY x.sortorder;

Alternatively, you can also use WITH ORDINALITY for a different approach:

SELECT b.* FROM books b
    JOIN
    unnest('{"The Adventures of Huckleberry Finn",
             "Pride and Prejudice",
             "The Great Gatsby"}'::text[])
    WITH ORDINALITY t(name, sortorder) USING (name)
    ORDER BY t.sortorder;
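
A third option, sketched here on the assumption that the name column is of type text, is to keep the list in a single array and sort by each name's position in it using array_position. The "wanted" CTE is just a hypothetical name for the list:

```sql
WITH wanted(names) AS (
  -- The list, in the order the rows should come back
  SELECT ARRAY['The Adventures of Huckleberry Finn',
               'Pride and Prejudice',
               'The Great Gatsby']
)
SELECT b.*
FROM books b, wanted w
WHERE b.name = ANY (w.names)               -- same filter as the IN clause
ORDER BY array_position(w.names, b.name);  -- 1-based position within the list
```

This keeps the list in one place, which can be handy when it arrives as a single array parameter from application code.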

This week’s tip is sponsored by YugabyteDB. Get more tips, database migration stories, and real-world distributed SQL database journeys at the (free) Distributed SQL Virtual Summit, Sept 15-17.

Best practices for bulk data loading in Postgres

#373 — September 16, 2020

Postgres Weekly

How (and Why) GitLab Upgraded from Postgres 9.6 to 11— Some of the GitLab team explain the precise maintenance process they took to execute a major version upgrade of Postgres on their systems. And, even better, they recorded the whole 2 hour process so you can see how it was actually conducted! It doesn’t make for exciting viewing, but it’s a very uncommon look behind the ops curtain at a major company.

GitLab

7 Best Practice Tips for Bulk Data Loading— Anyone who’s managed a database has needed to import a large quantity of data at some point, and anyone who’s tried this has rapidly discovered there are both fast and slow ways to do it 😉 This post looks at some best practices for bulk importing data into Postgres databases.

Sadequl Hussain

Quickly Identify Slow-Running Postgres Queries in Datadog— Improve PostgreSQL performance in real-time with actionable alerts on slow-running queries and bottlenecks. Start proactively monitoring your PostgreSQL databases with a Datadog free trial.

Datadog sponsor

A Few Hidden Gems of Postgres 13— With all the blog posts we’re seeing each week, it feels like Postgres 13 is the most anticipated release ever 😄 Here, Jonathan reflects on a handful of ‘hidden gems’ that’ll be turning up in the release.

Jonathan S. Katz

Picking Between Joins or Subqueries in Postgres: Lessons Learned— The author was advised “Try to use joins rather than subqueries.” Here, he works through an example of why and where the benefits are.

Steve Pousty

Index Improvements Coming to Postgres 13— We’ve linked to numerous articles over the past few months about how indexes will be improved in Postgres 13 but this rounds up six of the biggest benefits in one place complete with code examples.

Ibrar Ahmed and Jobin Augustine

Which Partition Contains a Specific Row?— If you’re using hash partitioning, can you determine which partition contains a given row? Yes, you can.

Gabriele Bartolini

Using Postgres to Offload Real-Time Reporting and Analytics from MongoDB— Weighs up the advantages and disadvantages of moving read-heavy analytics off a primary MongoDB database to PostgreSQL.

Shawn Adams

Faster CI/CD for All Your Software Projects - Try Buildkite ✅— See how Shopify scaled from 300 to 1800 engineers while keeping their build times under 5 minutes.

Buildkite sponsor

How To Install and Run Postgres using Docker

J Shree

pgtools: A Visual Way to Monitor Database Events in Real Time— A very new project built in Python and using Vue.js on the client side.

Lukas Loeffler

allballs? A Rather Odd Time Keyword in Postgres..— There's a ‘childish behavior ahead’ warning on this one.. 😆 but via this tweet I learnt that Postgres (still) supports allballs as a keyword to mean 00:00:00. We’ll let you guess why, or you can read this explanation on the mailing list. Amazingly, as of the 13 betas, it's still there.

Postgres GitHub Repo


First release candidate of Postgres 13 released

#374 — September 23, 2020

Postgres Weekly

A Battleship Game, Implemented with Postgres— See SQL taken to the next level with a working game running within Postgres, complete with a creative way of taking player input.

Firemoon777

PostgreSQL 13 Release Candidate 1 Released— We link to things about Postgres 13 all the time, if you hadn’t noticed, and it’s shaping up to be a huge release – so it’s great to see it close to completion with this first RC. The beta release notes cover the essentials, but we’ll do a full roundup of features at the final release.

PostgreSQL Global Development Group

Highway to Zero Downtime PostgreSQL Upgrades— Get a comprehensive walk-through of how to perform a "near" zero downtime upgrade using pglogical in this free webinar. Learn how logical decoding presents a whole new world of opportunities for upgrades that require a very small amount of downtime.

2ndQuadrant PostgreSQL Webinars sponsor

Crunchy Bridge: The Newest 'Postgres As A Service'— Crunchy Data is the latest company to get on the ‘Postgres as a managed service’ bandwagon with Crunchy Bridge which is available on AWS and Azure (and supports migration and replication between the two).

Craig Kerstiens (Crunchy Data)

Diary of an Engineer: Delivering 45x Faster Percentiles using Postgres, Citus, and t-digest— Nils had a problem to solve for a customer but couldn’t meet their SLA of 30 seconds and didn’t have the customer’s data to experiment with.. nonetheless, he found a creative way to estimate which types of percentile calculations would meet their SLA and used t-digest to do it.

Nils Dijk (Microsoft)

Postgres 13's LIMIT ... WITH TIES— WITH TIES is a new SQL standard feature being implemented in Postgres that causes LIMIT (or FETCH FIRST) clauses to not just cut off at a specified limit but to also include rows with values that tie with the final one(s).

Álvaro Herrera

Lessons Learned from Running Postgres 13: Better Performance, Monitoring & More— We took a look at smaller indexes with B-Tree Deduplication, Parallel VACUUM, improved WAL Usage Stats, and more.

Pganalyze sponsor

How 'HOT' Updates Yield Better Performance— An introduction to a feature first included in Postgres 8.3 but which, allegedly, is not properly covered in the docs. HOT updates (Heap Only Tuple) occur behind the scenes and improve performance in certain situations where lots of UPDATEs occur.

Laurenz Albe

AWS Aurora Postgres Versions 'Vanished' for Days, Customers Claim— Greg Clough, a software engineer who uses AWS, noticed that several Postgres versions on AWS Aurora ‘vanished’ last week (in the sense they couldn’t be deployed – existing databases didn’t disappear). Most now appear to be back, but it’s a curious story.

The Register

📄  Postgres and the Artificial Intelligence Landscape— It’s just slides for now (though a talk was given) but Bruce’s slides often provide value even on their own.

Bruce Momjian

Exploring PL/Python: Turning Postgres Table Data Into a NumPy Array

Kat Batuigas

Why RudderStack Used Postgres Over Apache Kafka for a Streaming Engine— Kafka was a natural fit for what RudderStack, a data platform, does, but they found enough negatives about it to build their own queueing system on top of Postgres instead.

RudderStack

💡 Tip of the Week

'Extracting' a Date from a TIMESTAMP

CREATE TABLE user_login(name TEXT, login_time TIMESTAMP);

The TIMESTAMP type stores a complete date and time with or without timezone, as opposed to DATE or TIME which respectively store only those particular elements. But what if you want to return only the date a TIMESTAMP refers to?

There are lots of date and time functions in Postgres, and you could extract the date elements piece by piece using EXTRACT:

SELECT EXTRACT(MONTH FROM TIMESTAMP '2020-09-21 12:21:13');
SELECT EXTRACT(DAY FROM TIMESTAMP '2020-09-21 12:21:13');
SELECT EXTRACT(YEAR FROM TIMESTAMP '2020-09-21 12:21:13');

But the easiest way is to cast the TIMESTAMP type into a DATE which automatically does the conversion needed:

SELECT name, login_time::date FROM user_login;
 name | login_time
------+------------
 john | 2019-11-11
 bill | 2020-10-22
 jane | 2020-04-01
(3 rows)

(Bill logged in from the future..?)

You could also use the DATE function to create a date in a similar way:

SELECT DATE('2020-09-21 12:21:13');
-- => 2020-09-21
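
The same cast is handy beyond simple SELECT lists. As a small sketch that reuses the user_login table defined above, you could count logins per calendar day:

```sql
-- Group on the date part of the TIMESTAMP to get one row per day.
SELECT login_time::date AS login_date,
       count(*)         AS logins
FROM user_login
GROUP BY login_time::date
ORDER BY login_date;
```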

This week’s tip is sponsored by YugabyteDB. If you would like to set up a free one hour deep-dive Distributed SQL tech talk in your future, please let us know.

Postgres 13 released

#375

Postgres 13 Released— Just shy of a year after Postgres 12 was released, 13 is here focused on evolutionary steps forward for our favorite database. The release notes provide the required laundry list of new and tweaked features, but we'll cover some of the bigger things here with links to relevant articles:

PostgreSQL Global Development Group

🏆 Top 5 PostgreSQL Extensions— We round up a few of our & TimescaleDB community members’ must-have PG extensions, complete with why we 💛 them, install instructions, sample queries & pro tips to get you started.

Timescale sponsor

A Quick Look at Postgres 13 RC1's Query Performance— These benchmarks were done before the final Postgres 13 release (and took three days to run!) but should continue to stand up. The results are broadly (but not unanimously) positive for the new release versus 12.4.

Kaarel Moppel

Operating Postgres at Scale: Saving Space (Basically) for Free— With over 100 terabytes of data in play on their Postgres clusters, any efficiency gains possible are a big win for Braintree. Here’s the story of how they saved around 10% of disk space with ‘very little effort’ by re-ordering table columns.

James Coleman (Braintree)

When to Deploy or Upgrade to a New Major Postgres Release— It’s rarely practical to upgrade to the latest and greatest version of something (cough, cough, Postgres 13) as soon as it comes out but how should you approach the inevitable future upgrade?

Andrew Dunstan

Simple Anomaly Detection Using Plain SQL— “Using some high school level statistics and a fair knowledge of SQL, I implemented a simple anomaly detection system that works.”

Haki Benita

Using Postgres and pgRouting To Explore The Smooth Waves of Yacht Rock— pgRouting is a powerful geospatial routing extension for Postgres and PostGIS usually used for pathfinding/mapping/direction applications. This post creatively attempts to use it to find the most influential Yacht rock artist.

John Porvaznik

We Look at What’s New in Postgres 13: Better Performance, Monitoring & More— Smaller indexes with B-Tree Deduplication, Extended Statistics Improvements, Parallel VACUUM, improved WAL Usage Stats, and more.

Pganalyze sponsor

Debugging PL/pgSQL with GET STACKED DIAGNOSTICS— GET STACKED DIAGNOSTICS makes debugging PL/pgSQL code a lot easier but isn’t super well known. This post will show what it does and how you can make use of it.

Hans-Jürgen Schönig
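
For a flavor of what this looks like in practice, here is a minimal sketch. It is not taken from the linked post, and the variable names are arbitrary. It forces a division-by-zero error and reports what the handler can see.

```sql
DO $$
DECLARE
    v_state   TEXT;
    v_msg     TEXT;
    v_context TEXT;
BEGIN
    -- Deliberately trigger an error so the handler below runs.
    PERFORM 1 / 0;
EXCEPTION WHEN OTHERS THEN
    -- GET STACKED DIAGNOSTICS is only valid inside an exception handler.
    GET STACKED DIAGNOSTICS
        v_state   = RETURNED_SQLSTATE,
        v_msg     = MESSAGE_TEXT,
        v_context = PG_EXCEPTION_CONTEXT;
    RAISE NOTICE 'state=%, message=%, context=%', v_state, v_msg, v_context;
END
$$;
```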

How to Configure SCRAM and MD5 Authentication in Pgpool-II— Pgpool-II is a popular Postgres connection pooler.

Bo Peng

DuckDB: An Embeddable SQL OLAP Database System— Built in C++, DuckDB bills itself (geddit?) as ‘SQLite for Analytics’ and has bindings for C/C++, Python, and R. We’ve mentioned this in our Database Weekly several times, but it’s just struck me that it may complement use cases you have for Postgres too.

CWI Database Architectures Group

Benchmarking queries, EDB buys 2ndQuadrant, and Azure's flexible Postgres service

#376

sqlbench: Measures and Compares The Execution Time of SQL Queries— "The main use case is benchmarking simple CPU-bound query variants against each other during local development." Postgres-only for now but pull requests for other databases are welcomed. Written in Go.

Felix Geisendörfer

Using CTEs to Do a Binary Search of Large Tables with Non-Indexed Correlated Data— An interesting situation here. A query had to be optimized but no changes to the underlying schema were allowed at all, including the generation of indexes. Some clever thinking was required!

David Christensen

Managed PostgreSQL Hosting on GCP Now Available at ScaleGrid— ScaleGrid is the easiest way to manage PostgreSQL on Google Cloud Platform. Features such as Superuser Access, Custom Extensions and Slow Query Analysis. Bring Your Own Cloud (BYOC) to reduce hosting costs, or use our dedicated hosting. Try for free.

ScaleGrid sponsor

EDB Acquires 2ndQuadrant— “Company buys other company” is rarely an exciting story but EDB (formerly EnterpriseDB) and 2ndQuadrant are big in the Postgres space and with this acquisition EDB has pledged to “push Postgres even further.”

Ed Boyajian (EDB)

What is Flexible Server in Azure Database for PostgreSQL?— Flexible Server is a new deployment option for Azure Database for PostgreSQL that aims to provide more fine-grained control over scaling and cost optimization (think things like stop/start, burstable compute, setting custom maintenance windows).

Sunil Agarwal (Microsoft)

zheap: Reinvented Postgres Storage for Better Bloat— Table ‘bloat’ is when a table or indexes grow in size without the actual underlying data reflecting this. zheap is a way to keep such bloat under control with a storage engine capable of running UPDATE-intense workloads more efficiently.

Hans-Jürgen Schönig

Some VACUUM and ANALYZE Best Practice Tips— A look at two important and commonly used features which can cause confusion.. which means a few best practices could come in very useful.

Sadequl Hussain

Why You Might Migrate Your Heroku Postgres to AWS RDS— If you’re using Heroku, their Postgres offering is pretty fantastic, but here are some arguments against using it. If you’re in Europe or worried about GDPR, staying on it may even have regulatory consequences.

Paweł Urbanek opinion

dropdb --force, a New Postgres 13 Feature— Super keen to drop a database ASAP even if clients are connected to it? Now you can 😂 You can also use FORCE with DROP DATABASE.

Ibrar Ahmed
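
The SQL form mentioned above looks roughly like this. The database name is hypothetical, and the statement needs Postgres 13 or later.

```sql
-- Terminates other sessions connected to scratch_db, then drops it.
-- You cannot drop the database your own session is connected to.
-- Command-line equivalent: dropdb --force scratch_db
DROP DATABASE IF EXISTS scratch_db WITH (FORCE);
```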

Best-Practices on How to Speed Up Your Postgres Queries. Free eBook— We share our learnings from helping companies like Atlassian, Robinhood, and others speed up their queries.

pganalyze sponsor

How to Fix Postgres Performance Issues with PG Extras— An introduction to a tool you can use to spot Postgres related issues in Node, Elixir and Ruby contexts. node-postgres-extras is the repo you want.

Paweł Urbanek

Postgres 13's Features Distilled— A quick, easily read roundup of Postgres 13’s main features if you didn’t catch up with our issue last week.

Kovid Rathee

dbcrossbar: Move Large Datasets Between Different Databases and Formats— Copy tabular data between databases, CSV files and cloud storage. Written in Rust.

Faraday, Inc.

Arctype: A Desktop SQL Client for Postgres and MySQL— When I saw someone tweet that this is the ‘most beautiful client I’ve used’, I had to take a look. It offers spreadsheet-style data editing. Beekeeper Studio is another worth checking out that we might write more about soon.

Arctype

Testing the limits of Postgres's connection scalability

#377 — October 14, 2020

Optimizing Storage with pg_squeeze— A look at an extension that offers periodic, automatic and transparent fixing of tables that exceed a ‘bloat threshold’. "No more need for VACUUM FULL – pg_squeeze has it all."

Hans-Jürgen Schönig

Analyzing the Limits of Connection Scalability— Postgres can handle thousands of connections but handling high numbers of concurrent clients isn’t one of its strong points. This post looks at how things could be improved long-term and where the problems creep in.

Andres Freund

Crunchy Bridge: More Postgres Power, Less Administration— A better option for your Postgres hosting. With a rich set of extensions like PL/Python + SciPy/NumPy/Pandas and more on demand, Crunchy Bridge is made for modern apps. Built by Postgres experts and the team who brought you Heroku Postgres.

Crunchy Data sponsor

▶  The State of (Full) Text Search in Postgres 12— The audio isn’t great but nonetheless this is a great primer to different text search techniques in Postgres starting from the absolute basics (e.g. LIKE and regexp_match) and moving on to Postgres’s full-fat FTS features.

Jimmy Angelakos

On the Community Impact of EDB's 2ndQuadrant Purchase— Last week we mentioned that EDB (formerly EnterpriseDB) acquired 2ndQuadrant. This makes it a bit of a heavyweight in the Postgres space, and Bruce ponders the risks, especially as over half the Postgres core team (including Bruce) is now from one parent company. (This then turned into a Hacker News discussion.)

Bruce Momjian

Making 2 Million Ancient Usenet Posts Available with Postgres and Python— Usenet was a fascinating distributed discussion system whose heyday was in the 80s and 90s and there are various archives of posts made to it (such as someone asking what Postgres is in 1989!).

Jozef Jarosciak

Scaling Row Level Security to Group Roles— Row level security lets you grant privileges to selected rows for selected users but what about groups of users or ‘roles’? Enter pg_has_role.

Elephant Tamer
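
To give a rough idea of how pg_has_role can slot into a policy, here is a sketch under assumptions. The documents table, its owner_role column and the policy name are invented for this example and are not taken from the linked post.

```sql
-- Hypothetical table: each row names the role allowed to see it.
-- Assumes owner_role always holds the name of an existing role.
CREATE TABLE documents(id SERIAL PRIMARY KEY, owner_role NAME NOT NULL, body TEXT);

ALTER TABLE documents ENABLE ROW LEVEL SECURITY;

-- A row is visible when the current user is a direct or indirect
-- member of the role stored in owner_role.
CREATE POLICY documents_by_role ON documents
    USING (pg_has_role(current_user, owner_role, 'MEMBER'));
```

Keep in mind that superusers and the table owner bypass row level security by default, so test policies like this as an ordinary role.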

Postgres Monitoring for Developers: The DBA Fundamentals— A look at several stats you might keep an eye on when managing a Postgres database. pgmonitor is the sort of tool you can use to track these things.

Jonathan S. Katz

Measuring the Memory Overhead of a Postgres Connection— This relates directly to the connection scalability post featured above but may be of separate interest.

Andres Freund

Some Query Caching and Load Balancing Tools— Summarizes some query caching and load balancing options available to use with Postgres, including pgpool-II, Apache Ignite, Heimdall Data, HAProxy, and Bucardo.

Viorel Tabara

What Is This Mythical Zero Downtime Database Migration to the Cloud?— Join us for a CIO-led panel discussion on the issues and risks and the technology available to streamline the experience.

ScaleArc sponsor

Multitenancy with Postgres Schemas: Key Concepts Explained— If you have an interest in this topic, their What Surprised Us with Postgres-Schema Multitenancy article may also be of use.

Tomasz Wróbel

What's New in pg_auto_failover 1.4— pg_auto_failover is an open source extension that manages automated failover for Postgres clusters. This release was sent to me with the note: “In this new 1.4 release Dimitri has added multiple standby support, which is a really big deal.”

Dimitri Fontaine

How medium-sized text columns can impact table performance

#378 — October 21, 2020

The Surprising Impact of Medium-Size Texts on Performance— You’ve got your small text (usernames, emails), large text (entire documents), and your ‘medium’ text (comments, descriptions).. While TOAST brings efficiencies to storing larger documents, medium-sized columns can make rows very wide and affect performance disproportionately. A rather neat deep dive complete with examples.

Haki Benita

30 Years of Continuous Postgres Development with Bruce Momjian— It’s a brief interview, but Bruce shares a few tidbits about the Postgres development process, quality, and how features make it in.

Scott Grant

[Whitepaper] Achieving High Availability with PostgreSQL— Critical business applications require availability of their backend database cluster. Discover how to setup & deploy enterprise-grade High Availability of up to four 9s. Learn the features and benefits in the Highly Available Postgres Clusters Whitepaper.

2ndQuadrant PostgreSQL Products sponsor

PostgresConf.CN and PGConf.Asia 2020 Taking Place Online (November 17-20)
China PostgreSQL Association

Tuning Your Postgres Database for High Write Loads— If you set up a database as a proof of concept or to fit an initially small workload, problems can occur once things scale up including warnings like “checkpoints are occurring too frequently”. What’s going on and what can be tweaked?

Tom Swartz
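
The hint that accompanies the "checkpoints are occurring too frequently" warning suggests raising max_wal_size. A sketch of checking and adjusting it follows. The 4GB figure is purely illustrative and is not a recommendation from the linked post.

```sql
-- Inspect the current checkpoint-related settings.
SHOW max_wal_size;
SHOW checkpoint_timeout;

-- Illustrative value only; size this for your own workload and disk.
ALTER SYSTEM SET max_wal_size = '4GB';

-- max_wal_size can be picked up with a reload, no restart needed.
SELECT pg_reload_conf();
```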

How Postgres Stores 'null' Values— Good news – you almost certainly don’t need to know this, but intrigue always gets the better of me..

Movead Li

Multicorn: Python Powered Foreign Data Wrappers— Multicorn is an FDW that comes with a variety of customizable subwrappers (for things like SQLAlchemy, RSS, working with the file system, or SQLite) that you can adjust.. or write your own.

Kozea

pgFormatter: A Postgres SQL Syntax Beautifier— ..that can work from the terminal or on a Web server via CGI. Demo here.

Gilles Darold

Prisma’s Data Guide: PostgreSQL— Introductory PostgreSQL tutorials: learn how to configure and use Postgres to take advantage of its best features.

Prisma sponsor

Logical Replication Upgrade in Postgres
James Chanco Jr.

Working with Data Consistency Issues in Logical Replication
Elephant Tamer

How to Upgrade PostgreSQL 11 to PostgreSQL 12 with Zero Downtime— using logical replication.

💡 Tip of the Week

Comparing the feature sets of different Postgres versions

While releases aren't at a Chrome or Node-esque pace, Postgres nonetheless has numerous versions that are still in production and their feature sets overlap in various interesting ways.

Other than reading Postgres Weekly every week and memorizing all of the articles about each version, how can you keep up with how the versions of Postgres differ?

Enter: the Feature Matrix!

Covering hundreds of subpoints across categories like performance and partitioning, the matrix lets you get a simple yes or no answer to what features each significant version of Postgres contains.

Thanks to Nikolay Samokhvalov for bringing this to our attention (and he has two other great resources on that Twitter thread too).
