How to speed up ORDER BY in Postgres?

Question

whats @whats

PostgreSQL

Hello. I would appreciate some advice on the following situation.
There is a catalog tree table.

There is a mapping table.

This query builds the tree in tabular form:

SELECT DISTINCT t0.groupname, t1.groupname,  t2.groupname
FROM grouptree as t0 
left join models as m on m.id = t0.idmodel 
left join grouptree as t1 on t1.parent = t0.groupno AND t1.idmodel = t0.idmodel
left join grouptree as t2 on t2.parent = t1.groupno AND t2.idmodel = t1.idmodel
where t0.parent = 0 AND m.typeauto = 0

The WHERE conditions select the first level of the tree and restrict the rows to one vehicle type.
Here is the query plan:

"HashAggregate  (cost=144714.35..145034.98 rows=32063 width=117) (actual time=1209.785..1211.691 rows=10716 loops=1)"
"  ->  Nested Loop Left Join  (cost=2379.07..144473.88 rows=32063 width=117) (actual time=28.075..840.266 rows=1391881 loops=1)"
"        ->  Nested Loop Left Join  (cost=2378.64..128016.41 rows=32063 width=86) (actual time=28.066..171.952 rows=264099 loops=1)"
"              ->  Hash Join  (cost=2378.21..25071.47 rows=32063 width=47) (actual time=28.057..69.569 rows=30423 loops=1)"
"                    Hash Cond: (t0.idmodel = m.id)"
"                    ->  Bitmap Heap Scan on grouptree t0  (cost=660.19..22638.32 rows=35066 width=47) (actual time=20.854..35.009 rows=34960 loops=1)"
"                          Recheck Cond: (parent = 0)"
"                          ->  Bitmap Index Scan on "2-parent-to-idmodel"  (cost=0.00..651.42 rows=35066 width=0) (actual time=15.255..15.255 rows=34960 loops=1)"
"                                Index Cond: (parent = 0)"
"                    ->  Hash  (cost=1346.41..1346.41 rows=29729 width=4) (actual time=7.171..7.171 rows=29715 loops=1)"
"                          Buckets: 4096  Batches: 1  Memory Usage: 1045kB"
"                          ->  Seq Scan on models m  (cost=0.00..1346.41 rows=29729 width=4) (actual time=0.004..4.488 rows=29715 loops=1)"
"                                Filter: (typeauto = 0)"
"                                Rows Removed by Filter: 2798"
"              ->  Index Scan using "2-parent-to-idmodel" on grouptree t1  (cost=0.43..3.20 rows=1 width=51) (actual time=0.001..0.002 rows=9 loops=30423)"
"                    Index Cond: ((parent = t0.groupno) AND (idmodel = t0.idmodel))"
"        ->  Index Scan using "2-parent-to-idmodel" on grouptree t2  (cost=0.43..0.50 rows=1 width=47) (actual time=0.001..0.002 rows=5 loops=264099)"
"              Index Cond: ((parent = t1.groupno) AND (idmodel = t1.idmodel))"
"Total runtime: 1212.145 ms"

Everything is fine and works as expected. But I would also like the data sorted.
I add just one line at the end:

...
order by t0.groupname ASC

The picture changes completely: the execution time grows roughly 18-fold, from about 1.2 s to about 21.9 s.

"Unique  (cost=146873.58..147194.21 rows=32063 width=117) (actual time=21456.087..21864.988 rows=10716 loops=1)"
"  Output: t0.groupname, t1.groupname, t2.groupname"
"  Buffers: shared hit=1240970"
"  ->  Sort  (cost=146873.58..146953.73 rows=32063 width=117) (actual time=21456.085..21594.183 rows=1391881 loops=1)"
"        Output: t0.groupname, t1.groupname, t2.groupname"
"        Sort Key: t0.groupname, t1.groupname, t2.groupname"
"        Sort Method: quicksort  Memory: 312756kB"
"        Buffers: shared hit=1240970"
"        ->  Nested Loop Left Join  (cost=2379.07..144473.88 rows=32063 width=117) (actual time=25.376..867.222 rows=1391881 loops=1)"
"              Output: t0.groupname, t1.groupname, t2.groupname"
"              Buffers: shared hit=1240970"
"              ->  Nested Loop Left Join  (cost=2378.64..128016.41 rows=32063 width=86) (actual time=25.366..172.208 rows=264099 loops=1)"
"                    Output: t0.groupname, t1.groupname, t1.idmodel, t1.groupno"
"                    Buffers: shared hit=162206"
"                    ->  Hash Join  (cost=2378.21..25071.47 rows=32063 width=47) (actual time=25.357..66.067 rows=30423 loops=1)"
"                          Output: t0.groupname, t0.idmodel, t0.groupno"
"                          Hash Cond: (t0.idmodel = m.id)"
"                          Buffers: shared hit=21731"
"                          ->  Bitmap Heap Scan on public.grouptree t0  (cost=660.19..22638.32 rows=35066 width=47) (actual time=17.457..31.343 rows=34960 loops=1)"
"                                Output: t0.idmodel, t0.groupno, t0.parent, t0.groupname, t0.groupnameen, t0.pictureindex, t0.mark, t0.sortorder"
"                                Recheck Cond: (t0.parent = 0)"
"                                Buffers: shared hit=20791"
"                                ->  Bitmap Index Scan on "2-parent-to-idmodel"  (cost=0.00..651.42 rows=35066 width=0) (actual time=14.128..14.128 rows=34960 loops=1)"
"                                      Index Cond: (t0.parent = 0)"
"                                      Buffers: shared hit=98"
"                          ->  Hash  (cost=1346.41..1346.41 rows=29729 width=4) (actual time=7.868..7.868 rows=29715 loops=1)"
"                                Output: m.id"
"                                Buckets: 4096  Batches: 1  Memory Usage: 1045kB"
"                                Buffers: shared hit=940"
"                                ->  Seq Scan on public.models m  (cost=0.00..1346.41 rows=29729 width=4) (actual time=0.003..5.048 rows=29715 loops=1)"
"                                      Output: m.id"
"                                      Filter: (m.typeauto = 0)"
"                                      Rows Removed by Filter: 2798"
"                                      Buffers: shared hit=940"
"                    ->  Index Scan using "2-parent-to-idmodel" on public.grouptree t1  (cost=0.43..3.20 rows=1 width=51) (actual time=0.001..0.002 rows=9 loops=30423)"
"                          Output: t1.idmodel, t1.groupno, t1.parent, t1.groupname, t1.groupnameen, t1.pictureindex, t1.mark, t1.sortorder"
"                          Index Cond: ((t1.parent = t0.groupno) AND (t1.idmodel = t0.idmodel))"
"                          Buffers: shared hit=140475"
"              ->  Index Scan using "2-parent-to-idmodel" on public.grouptree t2  (cost=0.43..0.50 rows=1 width=47) (actual time=0.001..0.002 rows=5 loops=264099)"
"                    Output: t2.idmodel, t2.groupno, t2.parent, t2.groupname, t2.groupnameen, t2.pictureindex, t2.mark, t2.sortorder"
"                    Index Cond: ((t2.parent = t1.groupno) AND (t2.idmodel = t1.idmodel))"
"                    Buffers: shared hit=1078764"
"Total runtime: 21879.380 ms"

The plan shows that the time is spent in the sort. Moreover, it sorts by all three columns, although only one is specified. What am I doing wrong?
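The Output and Buffers lines in the second plan are consistent with EXPLAIN (ANALYZE, VERBOSE, BUFFERS). A sketch of the full statement, assuming the elided part is exactly the query shown earlier:

```sql
-- Assumption: the "..." above stands for the original DISTINCT query, unchanged.
EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT DISTINCT t0.groupname, t1.groupname, t2.groupname
FROM grouptree AS t0
LEFT JOIN models    AS m  ON m.id = t0.idmodel
LEFT JOIN grouptree AS t1 ON t1.parent = t0.groupno AND t1.idmodel = t0.idmodel
LEFT JOIN grouptree AS t2 ON t2.parent = t1.groupno AND t2.idmodel = t1.idmodel
WHERE t0.parent = 0 AND m.typeauto = 0
ORDER BY t0.groupname ASC;
```

In this plan DISTINCT is carried out by Sort followed by Unique instead of HashAggregate. Unique needs its input sorted on every DISTINCT column, so the sort key is the ORDER BY column extended with the other two. The planner estimated 32063 rows at that point, while 1391881 rows actually reached the Sort node.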

Asked more than three years ago
3258 views

Answers: 2

Answer 1 · 2014-04-21 19:35:02

work_mem is 1GB, so the sort runs in memory. What other parameters should I list to make this more informative?
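A few values that are commonly posted alongside a plan can be read as follows. This is only a sketch: which of them matter for this query is an assumption, although sorting text is generally slower under a non-C collation than under C.

```sql
-- Server version: planner behaviour differs between releases.
SELECT version();

-- Memory available to each sort or hash operation in this session.
SHOW work_mem;

-- Collation and character classification of the current database;
-- these affect how text columns such as groupname are compared.
SELECT datcollate, datctype
FROM pg_database
WHERE datname = current_database();
```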

By the way, if I do it like this:

select * from (SELECT DISTINCT t0.groupname as s1, t1.groupname,  t2.groupname
FROM grouptree as t0 
left join models as m on m.id = t0.idmodel 
left join grouptree as t1 on t1.parent = t0.groupno AND t1.idmodel = t0.idmodel
left join grouptree as t2 on t2.parent = t1.groupno AND t2.idmodel = t1.idmodel
where t0.parent = 0 AND m.typeauto = 0) as t
order by t.s1 ASC

Although I think this approach is fundamentally wrong, the query then completes in about 1.2 seconds.

"Sort  (cost=147755.31..147835.47 rows=32063 width=117) (actual time=1241.491..1241.930 rows=10716 loops=1)"
"  Output: t0.groupname, t1.groupname, t2.groupname"
"  Sort Key: t0.groupname"
"  Sort Method: quicksort  Memory: 2612kB"
"  Buffers: shared hit=1240970"
"  ->  HashAggregate  (cost=144714.35..145034.98 rows=32063 width=117) (actual time=1228.137..1229.846 rows=10716 loops=1)"
"        Output: t0.groupname, t1.groupname, t2.groupname"
"        Buffers: shared hit=1240970"
"        ->  Nested Loop Left Join  (cost=2379.07..144473.88 rows=32063 width=117) (actual time=29.937..852.961 rows=1391881 loops=1)"
"              Output: t0.groupname, t1.groupname, t2.groupname"
"              Buffers: shared hit=1240970"
"              ->  Nested Loop Left Join  (cost=2378.64..128016.41 rows=32063 width=86) (actual time=29.927..172.365 rows=264099 loops=1)"
"                    Output: t0.groupname, t1.groupname, t1.idmodel, t1.groupno"
"                    Buffers: shared hit=162206"
"                    ->  Hash Join  (cost=2378.21..25071.47 rows=32063 width=47) (actual time=29.915..67.004 rows=30423 loops=1)"
"                          Output: t0.groupname, t0.idmodel, t0.groupno"
"                          Hash Cond: (t0.idmodel = m.id)"
"                          Buffers: shared hit=21731"
"                          ->  Bitmap Heap Scan on public.grouptree t0  (cost=660.19..22638.32 rows=35066 width=47) (actual time=22.059..35.273 rows=34960 loops=1)"
"                                Output: t0.idmodel, t0.groupno, t0.parent, t0.groupname, t0.groupnameen, t0.pictureindex, t0.mark, t0.sortorder"
"                                Recheck Cond: (t0.parent = 0)"
"                                Buffers: shared hit=20791"
"                                ->  Bitmap Index Scan on "2-parent-to-idmodel"  (cost=0.00..651.42 rows=35066 width=0) (actual time=14.175..14.175 rows=34960 loops=1)"
"                                      Index Cond: (t0.parent = 0)"
"                                      Buffers: shared hit=98"
"                          ->  Hash  (cost=1346.41..1346.41 rows=29729 width=4) (actual time=7.824..7.824 rows=29715 loops=1)"
"                                Output: m.id"
"                                Buckets: 4096  Batches: 1  Memory Usage: 1045kB"
"                                Buffers: shared hit=940"
"                                ->  Seq Scan on public.models m  (cost=0.00..1346.41 rows=29729 width=4) (actual time=0.003..5.017 rows=29715 loops=1)"
"                                      Output: m.id"
"                                      Filter: (m.typeauto = 0)"
"                                      Rows Removed by Filter: 2798"
"                                      Buffers: shared hit=940"
"                    ->  Index Scan using "2-parent-to-idmodel" on public.grouptree t1  (cost=0.43..3.20 rows=1 width=51) (actual time=0.001..0.002 rows=9 loops=30423)"
"                          Output: t1.idmodel, t1.groupno, t1.parent, t1.groupname, t1.groupnameen, t1.pictureindex, t1.mark, t1.sortorder"
"                          Index Cond: ((t1.parent = t0.groupno) AND (t1.idmodel = t0.idmodel))"
"                          Buffers: shared hit=140475"
"              ->  Index Scan using "2-parent-to-idmodel" on public.grouptree t2  (cost=0.43..0.50 rows=1 width=47) (actual time=0.001..0.002 rows=5 loops=264099)"
"                    Output: t2.idmodel, t2.groupno, t2.parent, t2.groupname, t2.groupnameen, t2.pictureindex, t2.mark, t2.sortorder"
"                    Index Cond: ((t2.parent = t1.groupno) AND (t2.idmodel = t1.idmodel))"
"                    Buffers: shared hit=1078764"
"Total runtime: 1242.392 ms"

What is the proper way to sort this?

Answer 2 · 2014-04-25 14:03:04

Here is an answer from more experienced colleagues (I am still learning myself):
That particular query cannot be sped up as written. The ORDER BY problem is worked around by wrapping the query in WITH, or by adding OFFSET 0 in a subquery, and only then applying the sort.
In other words, the task as posed, sorting 1.4M rows to return 10k rows, is not workable.

So that is how it is, not very encouraging.
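The two workarounds mentioned above can be sketched as follows. The column aliases level0, level1 and level2 are hypothetical names introduced here for illustration; everything else comes from the query in the question. Whether either variant actually produces the faster plan should be checked with EXPLAIN ANALYZE on the real data.

```sql
-- Variant 1: deduplicate inside a CTE, then sort the small result.
-- Before PostgreSQL 12 a CTE is always evaluated separately from the outer query;
-- on 12 and later, "WITH t AS MATERIALIZED (...)" requests that behaviour explicitly.
WITH t AS (
    SELECT DISTINCT
           t0.groupname AS level0,  -- hypothetical alias
           t1.groupname AS level1,  -- hypothetical alias
           t2.groupname AS level2   -- hypothetical alias
    FROM grouptree AS t0
    LEFT JOIN models    AS m  ON m.id = t0.idmodel
    LEFT JOIN grouptree AS t1 ON t1.parent = t0.groupno AND t1.idmodel = t0.idmodel
    LEFT JOIN grouptree AS t2 ON t2.parent = t1.groupno AND t2.idmodel = t1.idmodel
    WHERE t0.parent = 0 AND m.typeauto = 0
)
SELECT level0, level1, level2
FROM t
ORDER BY level0 ASC;

-- Variant 2: OFFSET 0 keeps the subquery from being merged into the outer query,
-- so the outer ORDER BY is applied to the already deduplicated rows.
SELECT level0, level1, level2
FROM (
    SELECT DISTINCT
           t0.groupname AS level0,
           t1.groupname AS level1,
           t2.groupname AS level2
    FROM grouptree AS t0
    LEFT JOIN models    AS m  ON m.id = t0.idmodel
    LEFT JOIN grouptree AS t1 ON t1.parent = t0.groupno AND t1.idmodel = t0.idmodel
    LEFT JOIN grouptree AS t2 ON t2.parent = t1.groupno AND t2.idmodel = t1.idmodel
    WHERE t0.parent = 0 AND m.typeauto = 0
    OFFSET 0
) AS t
ORDER BY level0 ASC;
```

Both variants aim for the plan shape seen in the subquery experiment from Answer 1: HashAggregate reduces 1391881 rows to 10716, and only those 10716 rows are sorted.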
