Databases

PostgreSQL Performance: Indexing Strategies and Query Optimization

February 12, 202614 min read

Slow queries are the silent killer of scalable apps. Master PostgreSQL indexing strategies, EXPLAIN ANALYZE, partial indexes, and query rewrites to achieve 10x–100x speedups.

PostgreSQL Performance: Indexing Strategies and Query Optimization

Introduction

PostgreSQL is one of the most capable relational databases available — but raw capability doesn't help if your queries time out under load. This guide covers the indexing strategies, query patterns, and diagnostic tools that separate a database that struggles at 1M rows from one that handles 1B rows without breaking a sweat.

EXPLAIN ANALYZE: Your Most Important Tool

Before optimising anything, read the query plan:

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.created_at > NOW() - INTERVAL '30 days'
GROUP BY u.id, u.name
ORDER BY order_count DESC
LIMIT 20;

Key things to look for:

Seq Scan on large tables — usually bad, needs an index
Hash Join vs Nested Loop — hash is usually better for large sets
Rows — actual vs estimated; huge differences mean stale statistics
Buffers hit — cache hits vs disk reads

Index Types and When to Use Each

B-Tree (default) — equality and range queries

CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_created_at ON orders(created_at DESC);

Composite — multiple columns, order matters

-- Covers: WHERE status = 'active' ORDER BY created_at DESC
CREATE INDEX idx_orders_status_created 
ON orders(status, created_at DESC);

-- Rule: equality columns first, then range/sort columns

Partial — index only a subset of rows

-- Only index active orders — much smaller, faster
CREATE INDEX idx_orders_active 
ON orders(user_id, created_at) 
WHERE status = 'active';

-- Only index unprocessed jobs
CREATE INDEX idx_jobs_pending 
ON jobs(priority DESC, created_at) 
WHERE processed_at IS NULL;

GIN — JSONB and full-text search

CREATE INDEX idx_products_metadata 
ON products USING GIN(metadata);

-- Now this is fast:
SELECT * FROM products WHERE metadata @> '{"category": "electronics"}';

BRIN — time-series data with natural ordering

-- Tiny index for append-only tables (logs, events)
CREATE INDEX idx_events_occurred_at 
ON events USING BRIN(occurred_at);

Common Query Anti-Patterns

1. Function on indexed column (index skip)

-- Bad — index on email is skipped
SELECT * FROM users WHERE LOWER(email) = 'john@example.com';

-- Good — use a functional index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

2. LIKE with leading wildcard

-- Bad — can't use B-tree index
SELECT * FROM products WHERE name LIKE '%phone%';

-- Good — use full-text search
CREATE INDEX idx_products_fts ON products USING GIN(to_tsvector('english', name));
SELECT * FROM products WHERE to_tsvector('english', name) @@ to_tsquery('phone');

3. N+1 hidden in code

-- Instead of loading users then querying each user's orders — use a JOIN
SELECT u.id, u.name, COALESCE(order_counts.total, 0) AS total_orders
FROM users u
LEFT JOIN (
  SELECT user_id, COUNT(*) AS total
  FROM orders
  GROUP BY user_id
) order_counts ON order_counts.user_id = u.id;

Materialized Views for Expensive Aggregations

-- Create a materialized view for the dashboard stats
CREATE MATERIALIZED VIEW monthly_revenue AS
SELECT
  DATE_TRUNC('month', created_at) AS month,
  SUM(total_amount) AS revenue,
  COUNT(*) AS order_count,
  AVG(total_amount) AS avg_order_value
FROM orders
WHERE status = 'completed'
GROUP BY DATE_TRUNC('month', created_at);

CREATE INDEX ON monthly_revenue(month DESC);

-- Refresh on a schedule (or after bulk inserts)
REFRESH MATERIALIZED VIEW CONCURRENTLY monthly_revenue;

CONCURRENTLY allows reads during the refresh — essential for production.

Connection Pooling with PgBouncer

PostgreSQL creates a new process per connection — it does not scale to thousands of concurrent connections. Always use a connection pooler:

# pgbouncer.ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
pool_mode = transaction        ; best for web apps
max_client_conn = 1000
default_pool_size = 25
min_pool_size = 5
reserve_pool_size = 10

Partitioning for Massive Tables

-- Partition events table by month
CREATE TABLE events (
  id BIGSERIAL,
  user_id UUID NOT NULL,
  event_type TEXT NOT NULL,
  payload JSONB,
  occurred_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (occurred_at);

CREATE TABLE events_2026_01 PARTITION OF events
  FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');

CREATE TABLE events_2026_02 PARTITION OF events
  FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

-- Queries with WHERE occurred_at filter only scan the relevant partition

Conclusion

PostgreSQL performance is 80% about indexes and 20% about query rewrites. Use EXPLAIN ANALYZE as a habit, not an afterthought. Add partial indexes for filtered queries, use JSONB GIN indexes for flexible schemas, and always run behind a connection pooler.

Key takeaways:

Composite index column order = equality first, then range
Partial indexes are often 10x smaller and faster than full indexes
CONCURRENTLY is your friend for materialized view refreshes
Never connect your app directly to Postgres — always use PgBouncer or equivalent