
Vertically Scaling PostgreSQL

PostgreSQL can scale rather well vertically. The more resources (CPU, memory,
disk) that you can make available to your PostgreSQL server, the better it can
perform. However, while some parts of Postgres can automatically make use of
the increased resources, other parts need configuration changes before
improvements can be noticed.

Read on to learn more about how to ensure PostgreSQL makes full use of the system
you’re running it on.


CPU

What Scales Automatically

PostgreSQL has a traditional process architecture, consisting of a master
process (called the postmaster) that spawns a new process (called a
backend) for each new client connection. This means that if there are more
CPU cores available, more processes can run simultaneously, and therefore
backends do not have to contend as much for CPU availability. CPU-bound queries
will complete faster.

You may wish to adjust the maximum allowed simultaneous connections at a
system-wide, per-database or per-user level:

-- system level: takes effect after a server restart
ALTER SYSTEM SET max_connections = 200;

-- database level (illustrative database name)
ALTER DATABASE mydb CONNECTION LIMIT 100;

-- user level (illustrative role name)
ALTER ROLE myuser CONNECTION LIMIT 20;

so that rogue apps cannot end up hogging too many connections.
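
To see how close you are to these limits, a quick look at pg_stat_activity
helps. A minimal sketch (the grouping columns are standard, the thresholds are
yours to pick):

-- count current connections per database and user
SELECT datname, usename, count(*)
  FROM pg_stat_activity
 GROUP BY datname, usename
 ORDER BY count(*) DESC;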

What Needs Tweaking

The PostgreSQL server can spawn processes to take care of housekeeping tasks,
like vacuuming, replication, subscriptions (for logical replication) and so
on. The number of such workers is not determined dynamically; it is set via
configuration, and defaults to 8.

The top-level configuration setting for the number of worker processes is:

# typically specified in postgresql.conf
max_worker_processes = 16

Increasing this value can speed up maintenance jobs, parallel queries and
index creation.
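
Note that max_worker_processes takes effect only after a server restart. If
you prefer setting it via SQL rather than editing postgresql.conf directly:

-- takes effect only after a server restart
ALTER SYSTEM SET max_worker_processes = 16;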

Parallel Queries

Starting with version 9.6, Postgres can execute queries in parallel if the
query planner decides it will help. Parallel querying involves spawning
workers, distributing work amongst them and then collecting (gathering) the
results.
Subject to the overall limit of max_worker_processes set earlier, Postgres
will determine how many workers can be spawned for parallel query depending on
the value of two configuration settings:

# the maximum number of workers that the system can
# support for parallel operations
max_parallel_workers = 8

# the maximum number of workers that can be started
# by a single Gather or Gather Merge node
max_parallel_workers_per_gather = 8

If you have idle CPUs and parallelizable queries, increasing these values can
speed up such queries.
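
These settings can also be changed per session, which makes experimenting
easy. A sketch, assuming a hypothetical large table big_table:

-- allow more workers per gather for this session only
SET max_parallel_workers_per_gather = 4;

-- look for a Gather node and its planned worker count in the output
EXPLAIN (ANALYZE)
SELECT count(*) FROM big_table;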

Parallel Index Creation

Postgres 11 added support for parallel creation of B-Tree indexes. If you
regularly create or REINDEX B-Tree indexes, increasing this value can help:

# the maximum number of parallel workers that can be
# started by a single utility command
max_parallel_maintenance_workers = 8

This will allow Postgres to spawn up to this many workers (subject to the
overall limit of max_worker_processes) to speed up the creation of B-Tree
indexes.
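
max_parallel_maintenance_workers can be set per session too, so you can raise
it just for a big index build. A sketch, using a hypothetical orders table:

-- raise the limit just for this session's index build
SET max_parallel_maintenance_workers = 4;
CREATE INDEX idx_orders_created ON orders (created_at);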

Logical Replication

Logical replication (available in Postgres 10 and above) relies on worker
processes at the subscription side to fetch changes from the publisher. By
asking Postgres to spawn more logical replication workers, the changes can be
fetched and applied in parallel, especially if there are many tables. This
configuration setting increases the total number of replication workers:

# maximum number of logical replication workers
max_logical_replication_workers = 8

With streaming replication, you can start off a sync with a base backup. For
logical replication, however, changes have to be pulled in via the replication
protocol itself, over the network. This can be time consuming. Allowing for
more workers during the sync phase can speed up this process:

# basically the number of tables that are synced in
# parallel during initialization of a subscription
max_sync_workers_per_subscription = 8


Autovacuum

Periodically, based on a set of configuration settings, Postgres will spawn a
bunch of workers that will VACUUM the database tables. This is, of course,
called autovacuum, and the number of workers that the autovacuum launcher
spawns each time can be set via the configuration setting:

# the maximum number of autovacuum processes
autovacuum_max_workers = 8
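
Changing autovacuum_max_workers requires a server restart. To see how many
autovacuum workers are busy right now (on Postgres 10 and above, which report
backend_type in pg_stat_activity):

-- list currently running autovacuum workers
SELECT pid, query
  FROM pg_stat_activity
 WHERE backend_type = 'autovacuum worker';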

WAL Compression

If you have CPU to spare, you can trade CPU for disk bandwidth by compressing
the pages that are written into the WAL files. This reduces the amount of data
that needs to be written to disk, at the expense of more CPU cycles to compress
the data. It also reduces the size of data that needs to be sent across the wire
for streaming replication.

Practically, the benefits of WAL compression are well worth the very reasonable
overhead. To turn it on, use:

# compresses full page images written to WAL
wal_compression = on
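
wal_compression does not need a restart; a configuration reload is enough:

ALTER SYSTEM SET wal_compression = on;
SELECT pg_reload_conf();  -- reload, no restart needed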


Memory

What Scales Automatically

The OS automatically uses memory left unused by applications to cache data
recently read from and written to disk. This greatly speeds up disk-intensive
applications, and certainly PostgreSQL.

In Linux, the most popular host for Postgres, the size of the OS disk cache
cannot be set by the user; its management is internal to Linux. Under memory
pressure, Linux will yield disk cache memory to applications.

What Needs Tweaking

Query Planner

The query planner has to factor in the amount of disk cache provided by the
OS when making its estimates. If you've managed to increase the OS disk cache
significantly (by increasing the available memory), increasing this
configuration setting might help improve the planner's estimates:

# the planner's assumption about the effective size
# of the disk cache that is available to a single query.
effective_cache_size = 64GB

Shared Memory

PostgreSQL uses a set of buffers that is shared between all workers and backend
processes. These are called shared buffers, and the amount of memory allocated
for shared buffers is set using the configuration setting:

shared_buffers = 32GB
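
The PostgreSQL documentation suggests starting with around 25% of the
system's RAM for shared_buffers on a dedicated database server. Note that
changing it requires a restart:

-- requires a server restart to take effect
ALTER SYSTEM SET shared_buffers = '32GB';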

Temporary Buffers

When temporary tables are accessed by a query, buffers are allocated to cache
the contents that are read in. The size of this buffer is set using the
configuration setting:

# the maximum number of temporary buffers used
# by each database session
temp_buffers = 100MB

If you have memory to spare and queries that use temporary tables heavily,
increasing this value can speed up such queries.
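
temp_buffers can also be raised per session, but only before any temporary
tables are touched in that session. A sketch, with a hypothetical events
table:

-- must be set before the session first uses a temporary table
SET temp_buffers = '256MB';
CREATE TEMPORARY TABLE recent_events AS
  SELECT * FROM events
   WHERE created_at > now() - interval '1 day';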

Working Memory

Working memory is allocated locally and privately by backends. It is used to
satisfy sorts, hashes and joins without having to spill to temporary disk
files. Increasing this from the default of 4MB can let queries complete
faster by avoiding writes to temporary files:

# the amount of memory to be used by internal sort
# operations and hash tables before writing to temporary disk files
work_mem = 16MB
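
Rather than raising work_mem globally, it is often safer to raise it only for
a heavy query, since every sort or hash node in every backend can use up to
this much memory. A sketch using SET LOCAL, with a hypothetical payments
table:

BEGIN;
-- applies only to this transaction, reverts at COMMIT/ROLLBACK
SET LOCAL work_mem = '256MB';
SELECT customer_id, sum(amount)
  FROM payments
 GROUP BY customer_id;
COMMIT;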

Maintenance Operations

The memory used by VACUUM, index creation and other such maintenance commands
is controlled by the configuration setting maintenance_work_mem. Increasing
this amount can speed up these operations, especially on indexes or tables
that need to be recreated.

The memory used by the autovacuum workers can be taken from the maintenance work
memory (by setting autovacuum_work_mem = -1) or configured independently.

# the maximum amount of memory to be used by
# maintenance operations
maintenance_work_mem = 128MB

# maximum amount of memory to be used by each
# autovacuum worker process
autovacuum_work_mem = -1
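
Like work_mem, maintenance_work_mem can be raised per session, just for a
heavy maintenance command. For example (reusing the illustrative index name
from earlier):

-- raise the limit only for this session's REINDEX
SET maintenance_work_mem = '2GB';
REINDEX INDEX idx_orders_created;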


Disk

What Scales Automatically

Disks can be made bigger, faster or more concurrent. The size of the disk is the
only thing that PostgreSQL doesn’t have to be instructed about. By default,
PostgreSQL will not constrain itself from using any available disk space.
This is usually just fine.

You can place a limit on the total size of temporary files created though, to
provide some amount of protection against queries that try to sort a billion
rows and the like:

# the maximum amount of disk space that a process
# can use for temporary files
temp_file_limit = 500GB

What Needs Tweaking


Concurrency

RAID-ed disks and filesystems like ZFS can be set up to support more
concurrency. That is to say, such filesystems can service a few disk
reads/writes concurrently, because of the way they store or handle data.

You can let Postgres issue multiple concurrent disk I/O operations, using
this configuration setting:

# the number of concurrent disk I/O operations that
# PostgreSQL expects can be executed simultaneously
effective_io_concurrency = 4

This is currently used only by bitmap heap scans though.

Random Page Cost

The Postgres query planner assumes that sequential reads are faster than random
reads. Exactly how much faster is a value you can tweak. By default, it assumes
random reads are 4 times costlier.

Depending on your disk setup, workload and benchmarking, if you’re sure that
random reads are say, only twice as costly as sequential reads, you can tell
that to Postgres:

# the planner's estimate of the cost of a disk page
# fetch that is part of a series of sequential fetches
seq_page_cost = 1

# the planner's estimate of the cost of a
# non-sequentially-fetched disk page
random_page_cost = 2
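
These costs can be set system-wide, or per tablespace if some of your disks
are faster than others. A sketch, assuming a hypothetical tablespace fast_ssd
on SSD storage:

-- system-wide: a reload is enough, no restart needed
ALTER SYSTEM SET random_page_cost = 2;
SELECT pg_reload_conf();

-- per tablespace, for disks with cheaper random reads
ALTER TABLESPACE fast_ssd SET (random_page_cost = 1.1);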


Tablespaces

To take advantage of multiple disks that are not mounted as one big single
filesystem, you can use tablespaces. With tablespaces, you can place tables
or indexes on different filesystems. This can improve concurrency and
provides an easy way to handle table growth.

CREATE TABLESPACE disk2 LOCATION '/mnt/disk2/postgres';
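
Once created, tables and indexes can be placed on the tablespace, or moved
there later. A sketch (the table and index names are illustrative):

-- create a new table directly on the second disk
CREATE TABLE events_archive (id bigint, payload text) TABLESPACE disk2;

-- or move an existing index there (locks the index while moving)
ALTER INDEX idx_orders_created SET TABLESPACE disk2;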

Read more about tablespaces in the PostgreSQL documentation.


Network

Network is usually the least used resource on a PostgreSQL server, and is
rarely saturated. If you do need to scale, it is easy enough to add more
network interfaces, each with its own IP, and have PostgreSQL listen on them
all:

# listen on all available interfaces
listen_addresses = '*'

The clients will have to be load balanced across all the IPs that Postgres
listens on.


Other Settings

There are a few other configuration settings that can be tweaked, most of
which use up more CPU and memory.

Partitionwise Operations

Postgres 10 introduced table partitioning, which was improved upon in
Postgres 11. Some query optimizations on partitions are not turned on by
default, as they might result in higher CPU and memory consumption. These are:

# allow a join between partitioned tables to be
# performed by joining the matching partitions
enable_partitionwise_join = on

# allow grouping or aggregation on partitioned tables
# to be performed separately for each partition
enable_partitionwise_aggregate = on
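
Both settings can be turned on per session to compare plans before changing
them globally. A sketch, assuming a hypothetical partitioned measurements
table:

SET enable_partitionwise_join = on;
SET enable_partitionwise_aggregate = on;

-- the plan should now show per-partition joins/aggregates
EXPLAIN
SELECT date_trunc('day', created_at), count(*)
  FROM measurements
 GROUP BY 1;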

About pgDash

pgDash is a modern, in-depth monitoring solution designed
specifically for PostgreSQL deployments. pgDash shows you information and
metrics about every aspect of your PostgreSQL database server, collected using
the open-source tool pgmetrics.
pgDash provides core reporting and visualization
functionality, including collecting and displaying PostgreSQL information and
providing time-series graphs, detailed reports, alerting, teams and more.

pgDash Replication Monitoring

pgDash can monitor replication, locks, queries and more. It can also provide
quick diagnostics and extensive alerts. Learn more here or sign up
for a free trial.