This month’s PGSQL Phriday topic is triggers, proposed by the mighty Lætitia Avrot. I’ll let others wade into the debate on whether/how much business logic should be implemented database-side (for my opinion, check out the Postgres FM episode on stored procedures). Instead, I’d like to cover the very practical (a.k.a. way more boring) aspect of how and when I most commonly see triggers causing performance issues.
But I don’t use triggers
Before you stop reading, you are very likely using triggers! Even if you’re not directly, Postgres uses triggers behind the scenes for some things, including foreign keys.
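If you're curious, you can see these internal triggers yourself by querying the pg_trigger catalog — the tgisinternal flag marks the triggers Postgres created behind the scenes, such as the referential integrity triggers backing foreign keys:

```sql
-- List internal triggers Postgres created behind the scenes,
-- e.g. the referential integrity triggers backing foreign keys.
select tgname,
       tgrelid::regclass as table_name
from pg_trigger
where tgisinternal;
```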
Break it down
If you run EXPLAIN with (at least) the ANALYZE parameter, you’ll see time spent executing triggers at the bottom of the query plan. If you include VERBOSE, you’ll also see their names.
Let's create an authors table and a books table, with 10,000 authors, 10 books per author, and a foreign key between them:
create table authors (
  id bigint generated always as identity primary key,
  name text not null
);

create table books (
  id bigint generated always as identity primary key,
  title text not null,
  author_id bigint,
  foreign key (author_id) references authors (id) on delete cascade
);

insert into authors (name)
select 'author' || n
from generate_series(1, 10000) as n;

insert into books (title, author_id)
select 'book' || n,
       ceil(n::decimal / 10)
from generate_series(1, 100000) as n;

vacuum analyze;
Great, so let’s see what happens when we delete an author:
explain (analyze, buffers, verbose)
delete from authors where id = 1;
                                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Delete on public.authors  (cost=0.29..8.30 rows=0 width=0) (actual time=0.121..0.122 rows=0 loops=1)
   Buffers: shared hit=5
   ->  Index Scan using authors_pkey on public.authors  (cost=0.29..8.30 rows=1 width=6) (actual time=0.086..0.089 rows=1 loops=1)
         Output: ctid
         Index Cond: (authors.id = 1)
         Buffers: shared hit=3
 Planning Time: 0.192 ms
 Trigger RI_ConstraintTrigger_a_16717 for constraint books_author_id_fkey: time=13.971 calls=1
 Execution Time: 14.149 ms
We can tell from that penultimate line that the RI (referential integrity) trigger behind our foreign key (fkey) was called once, taking 13.971 milliseconds, which accounts for almost all of the 14.149 ms total execution time.
When can trigger time dominate your query?
Triggers are somewhat notorious for causing performance issues, especially when misused. Having said that, by far the most common trigger-related issue I see while helping folks is the one in the example above.
Looking back at that, how did Postgres delete the corresponding rows in the books table?
Sadly, the query plan doesn't show us, but Postgres will have done a sequential scan of the books table, filtering for rows with author_id = 1. We know this because, unlike primary keys, Postgres does not automatically create indexes on foreign key columns, and we didn't create one either.
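You can see this scan for yourself by running the equivalent lookup directly (before any index exists on author_id):

```sql
-- With no index on author_id, the only access path is a
-- sequential scan of books, which the plan will show.
explain (analyze)
select * from books where author_id = 1;
```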
Let’s create that index and delete another author to see what impact it has:
create index author_id_idx on books (author_id);
explain (analyze, buffers, verbose)
delete from authors where id = 2;
                                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Delete on public.authors  (cost=0.29..8.30 rows=0 width=0) (actual time=0.178..0.179 rows=0 loops=1)
   Buffers: shared hit=5
   ->  Index Scan using authors_pkey on public.authors  (cost=0.29..8.30 rows=1 width=6) (actual time=0.089..0.092 rows=1 loops=1)
         Output: ctid
         Index Cond: (authors.id = 2)
         Buffers: shared hit=3
 Planning Time: 0.196 ms
 Trigger RI_ConstraintTrigger_a_16717 for constraint books_author_id_fkey: time=0.990 calls=1
 Execution Time: 1.218 ms
This time, Postgres is able to use our index behind the scenes to locate the rows in our books table more efficiently, hence the speed-up! Naturally, on larger tables, this effect is even more pronounced.
As a bonus bit of trivia, notice that even though I included BUFFERS, we don't get information about the extra blocks read by the trigger's scan of books, so you really do need to keep an eye on the timing info to spot this issue.
Looking out for excessive time spent in triggers is one of the main reasons I recommend checking the bottom of the query plan first (along with planning time and JIT compilation time issues).
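Finally, if you'd like to check a whole schema for this issue rather than one query at a time, here's a sketch of a catalog query that lists foreign keys with no supporting index. It is deliberately simplified: it only handles single-column foreign keys, and only checks the index's leading column, so treat any hits as candidates to investigate rather than definitive problems:

```sql
-- Sketch: single-column foreign keys with no index whose
-- leading column matches the referencing column.
-- Note: conkey is 1-indexed, indkey is 0-indexed.
select c.conrelid::regclass as table_name,
       c.conname            as constraint_name
from pg_constraint c
where c.contype = 'f'
  and array_length(c.conkey, 1) = 1
  and not exists (
    select 1
    from pg_index i
    where i.indrelid = c.conrelid
      and i.indkey[0] = c.conkey[1]
  );
```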
