Database VIEWs are powerful tools that often don't get the attention they deserve when building database-driven applications. They make our database work easier in several ways:
- They let us reuse common query patterns instead of writing them over and over
- They give us a place to define business rules once and use them everywhere
- They help us write cleaner, more organized queries
Let's see how this works with a practical example. Imagine we want to work with active users - users who have used the application within the last 7 days. Instead of writing this condition in every query, we can define a view:
CREATE VIEW active_users AS
SELECT
*
FROM users;
WHERE
last_login > current_date - INTERVAL '7 days';
Now we can easily use this view whenever we need active users. For example, if we want to find active users from Germany:
SELECT
user_id
FROM active_users
WHERE
country = 'Germany';
While this simple example shows how views can make developers' lives easier by organizing and reusing common logic, there's more to the story. In this article, we'll explore something even more interesting: how PostgreSQL can optimize these views through a process called "inlining" - making them not just convenient, but fast too.
What is VIEW inlining
When you use a view, it acts like a building block in SQL queries. Taking our previous example, PostgreSQL effectively transforms the query by replacing the view with its underlying subquery.
SELECT
user_id
FROM (
SELECT
*
FROM users
WHERE
last_login > current_date - INTERVAL '7 days'
) active users
WHERE
country = 'Germany';
While we write the query using active_users
VIEW, PostgreSQL query planner will see it as an opportunity to optimize it further. Instead of treating the sub-query as separate step (and retrieving large subset of the users), it effectively transforms the query and inlines the view into the query itself. Behind the scenes, PostgreSQL will execute the query similar to:
SELECT
user_id
FROM users
WHERE
last_login > current_date - INTERVAL '7 days'
AND country = 'Germany';
The inlining process allows the PostgreSQL query planner to optimize the entire query as a single unit. This allows it to select the best indexes, possible join strategies and other aspects together, rather than having to execute the individual parts separately and then filtering the data that won't be otherwise used. This is exactly what VIEWs are for - we get the best of both worlds - clean, modular code for developers and optimal performance for the database.
Inlining in action
The easiest way how to preview the VIEW inlining is to run EXPLAIN
Seq Scan on users (cost=0.00..19.20 rows=1 width=4)
Filter: ((country = 'Germany'::text) AND (last_login > (CURRENT_DATE - '7 days'::interval)))
The Filter part here is the one showing how the filtering conditions from both the query and the view got merged together by the query planner.
This works especially well with complex queries involving joins, aggregations and filters. PostgreSQL can optimize entire query chain as one unit instead of executing individual pieces separately.
CREATE VIEW order_analytics AS
SELECT
o.customer_id,
c.country,
DATE_TRUNC('month', o.created_at) as month,
COUNT(*) as orders,
SUM(o.total_amount) as revenue
FROM orders o
JOIN customers c ON c.id = o.customer_id
GROUP BY 1, 2, 3;
When you use this VIEW in the query
SELECT customer_id, revenue
FROM order_analytics
WHERE country = 'Germany'
AND month >= '2024-01-01'
AND revenue > 1000;
you will see how PostgreSQL effectively inlines the filter conditions where they belong.
HashAggregate (cost=205.97..207.97 rows=200 width=16)
Group Key: o.customer_id, c.country, (date_trunc('month'::text, o.created_at))
Filter: (sum(o.total_amount) > 1000)
-> Hash Join (cost=16.12..185.25 rows=1235 width=20)
Hash Cond: (o.customer_id = c.id)
-> Seq Scan on orders o
Filter: (created_at >= '2024-01-01'::date)
-> Hash
-> Seq Scan on customers c
Filter: (country = 'Germany'::text)
This works especially well with increasing query complexity. Particulary involving joins and further filters. PostgreSQL can optimize entire query chain as one unit instead of executing individual pieces separately.
CREATE VIEW completed_orders AS
SELECT
o.id,
o.customer_id,
o.total_amount,
o.created_at,
o.status,
p.name as product_name,
c.email,
c.country
FROM orders o
JOIN customers c ON c.id = o.customer_id
JOIN order_items oi ON oi.order_id = o.id
JOIN products p ON p.id = oi.product_id
WHERE
o.status = 'completed';
SELECT
*
FROM completed_orders
WHERE
country = 'Germany';
CREATE INDEX idx_orders_customer_id ON orders(customer_id); CREATE INDEX idx_order_items_order_id ON order_items(order_id); CREATE INDEX idx_order_items_product_id ON order_items(product_id);
as demonstrated in the query plan
Nested Loop (cost=0.30..19.70 rows=1 width=176)
-> Nested Loop (cost=0.15..13.52 rows=1 width=148)
Join Filter: (o.id = oi.order_id)
-> Nested Loop (cost=0.15..12.43 rows=1 width=144)
-> Seq Scan on orders o (cost=0.00..1.05 rows=1 width=80)
Filter: (status = 'completed'::text)
-> Index Scan using customers_pkey on customers c (cost=0.15..8.17 rows=1 width=68)
Index Cond: (id = o.customer_id)
Filter: (country = 'Germany'::text)
-> Seq Scan on order_items oi (cost=0.00..1.04 rows=4 width=8)
-> Index Scan using products_pkey on products p (cost=0.15..6.17 rows=1 width=36)
Index Cond: (id = oi.product_id)
Planner barriers
While PostgreSQL is very good at inlining views, certain operations create "planner barrier" that prevent the optimization. In case of views some of the conditions that will cause it are
- DISTINCT ON operations
- Window functions (OVER clauses)
- Set operations (UNION/INTERSECT/EXCEPT)
- More complex aggregations
- Common Table Expressions that are materialized (either with MATERIALIZED hint or by the decision of the query planner)
- Use of VOLATILE functions
- And in some cases complex subqueries
When planner is not able to inline VIEWs it might lead to the unnecessary performance and resource impact. I.e. each sub query might materialize results separately, leading to the already retrieved rows to be discarded by another part of the query as one of the most common examples.
When using EXPLAIN this is demostated by one of the following options:
Subquery scan on...
nodes- Materialization nodes
- Separate aggregation/sorting steps before the main execution
As mentioned above the window function acts as such a planner barrier.
CREATE VIEW order_insights AS
SELECT
o.customer_id,
o.total_amount,
o.created_at,
ROW_NUMBER() OVER (PARTITION BY o.customer_id ORDER BY o.created_at) as order_sequence,
SUM(o.total_amount) OVER (PARTITION BY o.customer_id ORDER BY o.created_at) as running_total
FROM orders o;
EXPLAIN
SELECT * FROM order_insights
WHERE customer_id = 1
AND total_amount > 500;
Giving us easy to spot Subquery scan
node.
Subquery Scan on order_insights (cost=1.06..1.10 rows=1 width=84)
Filter: (order_insights.total_amount > '500'::numeric)
-> WindowAgg (cost=1.06..1.08 rows=1 width=84)
-> Sort (cost=1.06..1.06 rows=1 width=44)
Sort Key: o.created_at
-> Seq Scan on orders o (cost=0.00..1.05 rows=1 width=44)
Filter: (customer_id = 1)
Runtime Materialization (Not MATERIALIZED VIEWs)
Let's clear up a possible confusion - we're not talking about CREATE MATERIALIZED VIEW here. This is about PostgreSQL's runtime decision to cache view results in memory during query execution. Query planner might use explicit materialization when it needs to reference view results multiple times or prevent repeated expensive computations. The query planner shows this through a Materialize node:
CREATE VIEW customer_summary AS
SELECT
customer_id,
COUNT(*) as orders,
SUM(total_amount) as total_spent
FROM orders
GROUP BY 1;
EXPLAIN SELECT
a.customer_id,
a.total_spent,
b.total_spent as other_customer_spent
FROM customer_summary a
JOIN customer_summary b ON b.total_spent > a.total_spent;
Giving the perfect example of materialization.
Nested Loop (cost=46.00..200.75 rows=3333 width=68)
Join Filter: (b.total_spent > (sum(orders.total_amount)))
-> HashAggregate (cost=23.00..24.25 rows=100 width=44)
Group Key: orders.customer_id
-> Seq Scan on orders (cost=0.00..18.00 rows=1000 width=15)
-> Materialize (cost=23.00..25.75 rows=100 width=32)
-> Subquery Scan on b (cost=23.00..25.25 rows=100 width=32)
-> HashAggregate (cost=23.00..24.25 rows=100 width=44)
Group Key: orders_1.customer_id
-> Seq Scan on orders orders_1 (cost=0.00..18.00 rows=1000 width=15)
Next to the materialization - the materialized VIEWs (CREATE MATERIALIZED VIEW
) are for practical purposes "just another table" they present an ultimate planner barrier. PostgreSQL must scan the materialized data directly, trading query flexibility for performance
Tips for writing inlining friendly VIEWs
Well, there's not a single "right" way how to write views. And don't forget - VIEWs should make your database easier to work with, not harder. When designing views, focus on single responsibility - each view should handle one aspect of business logic rather than trying to optimize everything at once. Simple views are more likely to get inlined by PostgreSQL's query planner.
There are several common patterns that prevent inlining
- Avoid window functions, distinct and set operations completely
- Split complex logic into multiple views if possible
- Minimize subquery usage
- Consider materialized views for aggregations
View dependencies are a critical consideration. Even minor schema changes can trigger cascading view modifications across your database. Instead of creating deep view hierarchies, split complex logic into smaller, composable views. For critical views that many applications depend on, consider versioning (v1, v2) rather than modifying existing ones.
When in doubt, write the plain SQL first, then extract common patterns into views. This helps avoid overengineering and keeps your database schema maintainable.