Last time, we talked about the magic trick to make your full text searches go fast. This time, I’ll tell you about another performance issue I encountered, one that probably affects you as well, at least if you are using Ecto and PostgreSQL.
Here’s the story of the interplay between Ecto, PostgreSQL and the PostgreSQL query planner.
Hashtags
It is a truth universally acknowledged, that a social network in possession of a good amount of posts, must be in want of hashtags. Users love #hashtags, because they help them to discover new people or to view all the #nsfw pictures on the site for research purposes. The database needs to be able to return a list of posts tagged with a certain tag, sorted by descending post id. There are a few ways to implement this, and this is the way we are doing it for this example:
Name | Type |
---|---|
id | integer |
content | text |
tags | varchar[] |
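In SQL, a stripped-down version of this table (called activities in the queries below) could look like the following sketch; a real table will have more columns, but these are the ones that matter here:

CREATE TABLE activities (
    id      integer PRIMARY KEY,  -- backs "activities_pkey", the index we'll sort on later
    content text,
    tags    varchar[]             -- one entry per hashtag found in the post
);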
On insertion, we’ll just put all the hashtags that occur in the text into the tags array. Then we can use the array operators to find all posts with a certain tag:
pleroma_benchmark=# select count(*) from activities where ARRAY['cofe']::varchar[] && tags;
count
-------
28053
(1 row)
And this would be the query we are using to get the posts:
select * from activities where ARRAY['cofe']::varchar[] && tags order by id desc limit 10;
&& is the overlap operator, which is true if the two arrays share any elements. How fast is this? Well, if we sort by descending post id, the database will walk through the posts from newest to oldest, one by one, checking each against the condition until it has found 10 matches. The speed of that depends heavily on the hashtag you are searching for. Why is that?
In general, hashtags follow a power law distribution. This is the pretentious way of saying that there are a lot of hashtags that are very unpopular, and few hashtags that are very popular. There should be new posts every day that contain #cofe, but there aren’t too many that contain #phantasmagoric, for example. So, for the popular hashtag #cofe, this will be fast, because it can quickly find 10 posts that match after going through a few hundred posts or even less. Here’s an example of the performance for that:
pleroma_benchmark=# explain analyze select * from activities where ARRAY['cofe']::varchar[] && tags order by id desc limit 10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..21.71 rows=10 width=1002) (actual time=0.029..0.073 rows=10 loops=1)
-> Index Scan Backward using activities_pkey on activities (cost=0.42..60341.23 rows=28348 width=1002) (actual time=0.028..0.070 rows=10 loops=1)
Filter: ('{cofe}'::character varying[] && tags)
Rows Removed by Filter: 37
Planning Time: 0.162 ms
Execution Time: 0.089 ms
(6 rows)
For #phantasmagoric, this can take a lot longer. Maybe we have to go through a million posts until we have 10 posts that match. Even worse, if we don’t even have 10 posts, this will look through the whole database, absolutely ruining performance.
As always, the solution here is to use the index. We can index our array using our old friend, the GIN index, which will speed up the && operator. Now, when we look for #phantasmagoric, the index will be used.
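Creating such an index could look like this — a sketch, with activities_tags_index being the name that shows up in the query plan below (your actual migration may differ):

CREATE INDEX activities_tags_index ON activities USING gin (tags);

And here’s an example of the plan we get with it: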
pleroma_benchmark=# explain analyze select * from activities where ARRAY['phantasmagoric']::varchar[] && tags order by id desc limit 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1066.03..1066.05 rows=10 width=1002) (actual time=0.616..0.619 rows=10 loops=1)
-> Sort (cost=1066.03..1066.71 rows=273 width=1002) (actual time=0.615..0.616 rows=10 loops=1)
Sort Key: id DESC
Sort Method: top-N heapsort Memory: 57kB
-> Bitmap Heap Scan on activities (cost=26.12..1060.13 rows=273 width=1002) (actual time=0.094..0.509 rows=300 loops=1)
Recheck Cond: ('{phantasmagoric}'::character varying[] && tags)
Heap Blocks: exact=298
-> Bitmap Index Scan on activities_tags_index (cost=0.00..26.05 rows=273 width=0) (actual time=0.056..0.056 rows=300 loops=1)
Index Cond: (tags && '{phantasmagoric}'::character varying[])
Planning Time: 0.118 ms
Execution Time: 0.639 ms
(11 rows)
This is also really fast, even if it uses a very different plan. Will our query for #cofe now also use that index? The answer is no, and that’s a good thing. The trade-off with this index is that it will give you the exact matches, but they will be unsorted; note the extra sort step in the plan for #phantasmagoric. So if you have a lot of matches, it will actually be faster to just go through the table like in our #cofe example, because the results will already be sorted. PostgreSQL keeps statistics about the data distribution in the indexed column, so it can decide which plan is best for any given input.
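If you are curious, you can peek at those statistics yourself. A small sketch: once the table has been analyzed, the pg_stats view exposes, among other things, the most common array elements for the tags column:

ANALYZE activities;

SELECT most_common_elems, most_common_elem_freqs
  FROM pg_stats
 WHERE tablename = 'activities' AND attname = 'tags';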
Benchmarking
I wrote a benchmark with [Benchee](https://github.com/bencheeorg/benchee) that will test the hashtag timeline speed for 20 different hashtags. Here’s their distribution:
Post count | tag |
---|---|
67 | tag_0 |
300 | tag_1 |
617 | tag_2 |
1131 | tag_3 |
1788 | tag_4 |
2405 | tag_5 |
3453 | tag_6 |
4475 | tag_7 |
5581 | tag_8 |
6963 | tag_9 |
8338 | tag_10 |
9888 | tag_11 |
11871 | tag_12 |
13516 | tag_13 |
15762 | tag_14 |
17973 | tag_15 |
20025 | tag_16 |
22702 | tag_17 |
25092 | tag_18 |
28053 | tag_19 |
Now here’s the benchmark:
##### With input For #tag_0 #####
Name ips average deviation median 99th %
Hashtag fetching 48.96 20.42 ms ±8.27% 20.41 ms 25.63 ms
##### With input For #tag_1 #####
Name ips average deviation median 99th %
Hashtag fetching 46.36 21.57 ms ±7.89% 21.20 ms 25.74 ms
##### With input For #tag_2 #####
Name ips average deviation median 99th %
Hashtag fetching 44.37 22.54 ms ±10.21% 22.66 ms 27.69 ms
##### With input For #tag_3 #####
Name ips average deviation median 99th %
Hashtag fetching 42.40 23.58 ms ±10.36% 23.94 ms 27.98 ms
##### With input For #tag_4 #####
Name ips average deviation median 99th %
Hashtag fetching 39.86 25.09 ms ±7.49% 25.25 ms 29.05 ms
##### With input For #tag_5 #####
Name ips average deviation median 99th %
Hashtag fetching 37.68 26.54 ms ±7.69% 26.49 ms 31.62 ms
##### With input For #tag_6 #####
Name ips average deviation median 99th %
Hashtag fetching 35.70 28.01 ms ±7.11% 28.10 ms 35.64 ms
##### With input For #tag_7 #####
Name ips average deviation median 99th %
Hashtag fetching 32.67 30.61 ms ±6.05% 30.64 ms 35.79 ms
##### With input For #tag_8 #####
Name ips average deviation median 99th %
Hashtag fetching 30.22 33.10 ms ±4.26% 32.65 ms 37.99 ms
##### With input For #tag_9 #####
Name ips average deviation median 99th %
Hashtag fetching 27.82 35.94 ms ±5.30% 35.90 ms 41.05 ms
##### With input For #tag_10 #####
Name ips average deviation median 99th %
Hashtag fetching 25.64 38.99 ms ±4.35% 38.91 ms 46.33 ms
##### With input For #tag_11 #####
Name ips average deviation median 99th %
Hashtag fetching 23.57 42.43 ms ±3.48% 42.16 ms 45.55 ms
##### With input For #tag_12 #####
Name ips average deviation median 99th %
Hashtag fetching 21.91 45.64 ms ±2.90% 45.29 ms 49.81 ms
##### With input For #tag_13 #####
Name ips average deviation median 99th %
Hashtag fetching 20.85 47.96 ms ±4.00% 48.00 ms 52.15 ms
##### With input For #tag_14 #####
Name ips average deviation median 99th %
Hashtag fetching 19.05 52.49 ms ±3.07% 52.24 ms 56.69 ms
##### With input For #tag_15 #####
Name ips average deviation median 99th %
Hashtag fetching 17.33 57.72 ms ±3.05% 57.29 ms 61.32 ms
##### With input For #tag_16 #####
Name ips average deviation median 99th %
Hashtag fetching 16.55 60.43 ms ±3.17% 60.48 ms 66.44 ms
##### With input For #tag_17 #####
Name ips average deviation median 99th %
Hashtag fetching 15.36 65.12 ms ±3.48% 64.96 ms 74.16 ms
##### With input For #tag_18 #####
Name ips average deviation median 99th %
Hashtag fetching 14.38 69.54 ms ±2.52% 69.44 ms 75.13 ms
##### With input For #tag_19 #####
Name ips average deviation median 99th %
Hashtag fetching 13.52 73.99 ms ±3.78% 74.17 ms 86.89 ms
Well, that looks bad. Every tag has rather bad performance, and it only gets worse the more popular a tag is. It looks like the index is being used for every query, even the ones where it is less than optimal. Can this be true? Let me show you an interesting variation of this benchmark, run with the same code, on the same machine…
##### With input For #tag_19 #####
Name ips average deviation median 99th %
Hashtag fetching 614.88 1.63 ms ±14.80% 1.59 ms 2.23 ms
##### With input For #tag_18 #####
Name ips average deviation median 99th %
Hashtag fetching 665.19 1.50 ms ±10.79% 1.46 ms 1.97 ms
##### With input For #tag_17 #####
Name ips average deviation median 99th %
Hashtag fetching 661.95 1.51 ms ±8.86% 1.51 ms 1.91 ms
##### With input For #tag_16 #####
Name ips average deviation median 99th %
Hashtag fetching 617.43 1.62 ms ±11.37% 1.58 ms 2.16 ms
##### With input For #tag_15 #####
Name ips average deviation median 99th %
Hashtag fetching 593.35 1.69 ms ±13.22% 1.65 ms 2.29 ms
##### With input For #tag_14 #####
Name ips average deviation median 99th %
Hashtag fetching 591.22 1.69 ms ±13.00% 1.64 ms 2.27 ms
##### With input For #tag_13 #####
Name ips average deviation median 99th %
Hashtag fetching 657.09 1.52 ms ±9.41% 1.49 ms 1.99 ms
##### With input For #tag_12 #####
Name ips average deviation median 99th %
Hashtag fetching 595.46 1.68 ms ±13.80% 1.64 ms 2.29 ms
##### With input For #tag_11 #####
Name ips average deviation median 99th %
Hashtag fetching 628.92 1.59 ms ±9.89% 1.57 ms 2.10 ms
##### With input For #tag_10 #####
Name ips average deviation median 99th %
Hashtag fetching 593.27 1.69 ms ±46.42% 1.67 ms 2.03 ms
##### With input For #tag_9 #####
Name ips average deviation median 99th %
Hashtag fetching 90.78 11.02 ms ±130.93% 2.45 ms 38.98 ms
##### With input For #tag_8 #####
Name ips average deviation median 99th %
Hashtag fetching 145.24 6.88 ms ±160.21% 1.94 ms 34.36 ms
##### With input For #tag_7 #####
Name ips average deviation median 99th %
Hashtag fetching 69.16 14.46 ms ±91.25% 3.55 ms 33.33 ms
##### With input For #tag_6 #####
Name ips average deviation median 99th %
Hashtag fetching 36.76 27.20 ms ±15.37% 27.98 ms 31.92 ms
##### With input For #tag_5 #####
Name ips average deviation median 99th %
Hashtag fetching 38.48 25.99 ms ±7.76% 26.04 ms 31.80 ms
##### With input For #tag_4 #####
Name ips average deviation median 99th %
Hashtag fetching 40.26 24.84 ms ±6.17% 24.86 ms 29.13 ms
##### With input For #tag_3 #####
Name ips average deviation median 99th %
Hashtag fetching 43.20 23.15 ms ±11.70% 23.64 ms 28.61 ms
##### With input For #tag_2 #####
Name ips average deviation median 99th %
Hashtag fetching 44.63 22.40 ms ±8.99% 22.46 ms 26.62 ms
##### With input For #tag_1 #####
Name ips average deviation median 99th %
Hashtag fetching 49.27 20.30 ms ±12.78% 20.76 ms 25.40 ms
##### With input For #tag_0 #####
Name ips average deviation median 99th %
Hashtag fetching 50.05 19.98 ms ±10.95% 20.12 ms 25.05 ms
This is the exact same benchmark, just run in reverse order. What a difference! Our popular hashtags are super fast now! And the others are not that great, but still fine. So the solution seems to be easy: Just tell our users to only look up hashtags in that order!
Just kidding, of course, but why should the speed of our queries depend on their order? These are all reads, after all. Well, I will tell you, but first we have to talk about ~~parallel universes~~ prepared statements.
Prepared statements for the unprepared
You know these queries that we did above, the select ... type of things? There’s another way to run those, and that’s prepared statements. These are very similar to normal queries; they are essentially parameterized versions of them. So you prepare your ‘please find this tag’ query once, and then EXECUTE it with the tag itself as an argument. Why do this? Well, it prevents injection attacks, because the query and the input are strictly separated. It can also have advantages when planning would take a long time, because a prepared statement can use a generic plan that is planned once and reused for every execution. Our favorite not-ORM Ecto also uses prepared statements for every query it runs. The plot thiccens...
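In plain SQL, the prepared version of our hashtag query could look something like this (hashtag_timeline is just a name picked for this sketch):

PREPARE hashtag_timeline (varchar[]) AS
  SELECT * FROM activities WHERE $1 && tags ORDER BY id DESC LIMIT 10;

EXECUTE hashtag_timeline(ARRAY['cofe']::varchar[]);
EXECUTE hashtag_timeline(ARRAY['phantasmagoric']::varchar[]);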
So, will PostgreSQL just use the same generic plan for every query now? It isn’t quite like that. If you look at the documentation, you’ll read that PostgreSQL generates a custom plan for the first five executions of a prepared statement, then compares the cost of a generic plan against the average cost of those custom plans. If the generic plan isn’t more expensive, it will be used for all remaining executions and no custom plan will ever be generated again; otherwise, PostgreSQL keeps custom planning. That looks like what happened above: if we start with the unpopular hashtags, the query plan using the index ends up being used for EVERY query. If we start with the popular hashtags, it settles on the one without the index.
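You can watch this happen in psql with the hashtag_timeline statement prepared in the sketch above: execute it a handful of times for one kind of tag, then ask for the plan. Whether and when it flips to the generic plan depends on the cost comparison described above, so treat this as an experiment rather than a guarantee:

-- run this five times with an unpopular tag...
EXECUTE hashtag_timeline(ARRAY['phantasmagoric']::varchar[]);

-- ...then check which plan a popular tag gets now
EXPLAIN EXECUTE hashtag_timeline(ARRAY['cofe']::varchar[]);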
So can we only hope that we get an alright generic plan that will be used for everything? The whole point of the index statistics is to select a different plan for different inputs. Is there a way to tell PostgreSQL to custom plan even for prepared statements? The answer used to be no, but PostgreSQL 12 added an option named plan_cache_mode. If you set it to force_custom_plan, it will do just that.
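Setting it could look something like this — either per database or globally in postgresql.conf (pleroma_benchmark is just the database name from the psql prompts above):

ALTER DATABASE pleroma_benchmark SET plan_cache_mode = 'force_custom_plan';  -- takes effect for new connections

-- or, for the whole cluster, in postgresql.conf:
-- plan_cache_mode = force_custom_plan

With that in place, here’s the benchmark from above again: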
##### With input For #tag_0 #####
Name ips average deviation median 99th %
Hashtag fetching 451.91 2.21 ms ±11.56% 2.18 ms 2.88 ms
##### With input For #tag_1 #####
Name ips average deviation median 99th %
Hashtag fetching 325.84 3.07 ms ±7.90% 3.06 ms 3.73 ms
##### With input For #tag_2 #####
Name ips average deviation median 99th %
Hashtag fetching 250.63 3.99 ms ±7.24% 3.99 ms 4.67 ms
##### With input For #tag_3 #####
Name ips average deviation median 99th %
Hashtag fetching 206.90 4.83 ms ±10.10% 4.83 ms 5.99 ms
##### With input For #tag_4 #####
Name ips average deviation median 99th %
Hashtag fetching 240.58 4.16 ms ±9.33% 4.16 ms 5.16 ms
##### With input For #tag_5 #####
Name ips average deviation median 99th %
Hashtag fetching 274.36 3.64 ms ±9.53% 3.65 ms 4.33 ms
##### With input For #tag_6 #####
Name ips average deviation median 99th %
Hashtag fetching 320.01 3.12 ms ±9.45% 3.12 ms 3.85 ms
##### With input For #tag_7 #####
Name ips average deviation median 99th %
Hashtag fetching 324.70 3.08 ms ±9.19% 3.09 ms 3.75 ms
##### With input For #tag_8 #####
Name ips average deviation median 99th %
Hashtag fetching 430.82 2.32 ms ±12.27% 2.31 ms 3.06 ms
##### With input For #tag_9 #####
Name ips average deviation median 99th %
Hashtag fetching 413.51 2.42 ms ±8.43% 2.42 ms 2.95 ms
##### With input For #tag_10 #####
Name ips average deviation median 99th %
Hashtag fetching 391.69 2.55 ms ±8.21% 2.55 ms 3.14 ms
##### With input For #tag_11 #####
Name ips average deviation median 99th %
Hashtag fetching 401.20 2.49 ms ±7.19% 2.51 ms 2.91 ms
##### With input For #tag_12 #####
Name ips average deviation median 99th %
Hashtag fetching 417.51 2.40 ms ±8.79% 2.39 ms 2.95 ms
##### With input For #tag_13 #####
Name ips average deviation median 99th %
Hashtag fetching 454.05 2.20 ms ±9.24% 2.18 ms 2.82 ms
##### With input For #tag_14 #####
Name ips average deviation median 99th %
Hashtag fetching 423.63 2.36 ms ±9.85% 2.36 ms 3.09 ms
##### With input For #tag_15 #####
Name ips average deviation median 99th %
Hashtag fetching 446.85 2.24 ms ±10.48% 2.23 ms 2.91 ms
##### With input For #tag_16 #####
Name ips average deviation median 99th %
Hashtag fetching 525.40 1.90 ms ±10.19% 1.84 ms 2.55 ms
##### With input For #tag_17 #####
Name ips average deviation median 99th %
Hashtag fetching 517.15 1.93 ms ±6.60% 1.91 ms 2.46 ms
##### With input For #tag_18 #####
Name ips average deviation median 99th %
Hashtag fetching 534.47 1.87 ms ±9.54% 1.83 ms 2.54 ms
##### With input For #tag_19 #####
Name ips average deviation median 99th %
Hashtag fetching 533.57 1.87 ms ±10.53% 1.81 ms 2.55 ms
Now this is what we wanted. Our performance is great across all inputs now, and we can finally stop running puzzling benchmarks all day. The standard behavior can be especially devilish because you will see bad performance in your application, but when you run the same query in psql, a new plan will be generated and the performance will be fine! In general, if you are using Ecto, I’d always suggest using force_custom_plan unless you know what you are doing.
UPDATE
It came to my attention that you can pass the prepare: :unnamed option in your Ecto database configuration to achieve a similar effect on PostgreSQL versions before 12. This works universally, but it only gives you about 80% of the performance of the force_custom_plan option in my testing, presumably because we lose the boost that named prepared statements get from skipping the parsing step.
TL;DR:
- Use PostgreSQL 12, if possible
- If you use Ecto, set plan_cache_mode to force_custom_plan in the PostgreSQL config
- If you use PostgreSQL before version 12, set prepare: :unnamed in your Ecto database settings