Why might PostgreSQL streaming replicas crash?

All 10 of them started failing together the day before yesterday, with no visible warning signs beforehand, and strictly at a fixed time: first at 00:00 UTC, and today at exactly 00:00 EST.

OS: Ubuntu 16.04, PostgreSQL version 9.5. Replica logs:

2019-04-13 00:00:03.370 EST [2267] LOG: 00000: invalid record length at 1BCA/DA0E0258
2019-04-13 00:00:03.370 EST [2267] LOCATION: ReadRecord, xlog.c:4012
2019-04-13 00:00:03.370 EST [14961] FATAL: 57P01: terminating walreceiver process due to administrator command
2019-04-13 00:00:03.370 EST [14961] LOCATION: ProcessWalRcvInterrupts, walreceiver.c:167
2019-04-13 00:00:03.370 EST [2267] LOG: 00000: invalid record length at 1BCA/DA0E0258
2019-04-13 00:00:03.370 EST [2267] LOCATION: ReadRecord, xlog.c:4012
2019-04-13 00:00:03.370 EST [2267] LOG: 00000: invalid record length at 1BCA/DA0E0258
2019-04-13 00:00:03.370 EST [2267] LOCATION: ReadRecord, xlog.c:4012
2019-04-13 00:00:03.370 EST [2267] LOG: 00000: invalid record length at 1BCA/DA0E0258
2019-04-13 00:00:03.370 EST [2267] LOCATION: ReadRecord, xlog.c:4012

Meanwhile, on the master there is nothing at that time, nothing at all!

2019-04-12 23:59:49.854 EST [49236] tracker@aaa LOG: 08006: could not receive data from client: Connection reset by peer
2019-04-12 23:59:49.854 EST [49236] tracker@aaa LOCATION: pq_recvbuf, pqcomm.c:919
2019-04-13 00:00:09.589 EST [49559] tracker@aaa LOG: 08006: could not receive data from client: Connection reset by peer
2019-04-13 00:00:09.589 EST [49559] tracker@aaa LOCATION: pq_recvbuf, pqcomm.c:919
2019-04-13 00:00:09.589 EST [49561] tracker@aaa LOG: 08006: could not receive data from client: Connection reset by peer
2019-04-13 00:00:09.589 EST [49561] tracker@aaa LOCATION: pq_recvbuf, pqcomm.c:919
2019-04-13 00:00:09.589 EST [49556] tracker@aaa LOG: 08006: could not receive data from client: Connection reset by peer

So far I have restored 2 of the 10 replicas and am now waiting for 00:00 EST.

postgres=# select * from pg_stat_replication;

pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state
-------+----------+----------+------------------+----------------+-----------------+-------------+-------------------------------+--------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
92078 | 10 | postgres | walreceiver | 18.72.2.14 | | 2184 | 2019-04-13 05:59:55.361197-05 | | streaming | 1BD6/63740D68 | 1BD6/63740D68 | 1BD6/63740D68 | 1BD6/6373DE78 | 0 | async
91982 | 10 | postgres | walreceiver | 31.22.2.8 | | 31620 | 2019-04-13 05:59:38.198516-05 | 486133549 | streaming | 1BD6/63740D68 | 1BD6/63740D68 | 1BD6/63740D68 | 1BD6/6373DE78 | 0 | async
(2 rows)
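
To put the pg_stat_replication output above into numbers, here is a rough sketch of how the gap between sent_location and replay_location can be turned into bytes. It assumes the usual LSN text format (two hexadecimal numbers separated by a slash, the high and low 32 bits of the WAL position); the helper names lsn_to_bytes and lsn_diff are made up for this illustration. On the server itself, pg_xlog_location_diff() should give the same figure.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN such as '1BD6/63740D68' to an absolute byte position.

    Assumption: the part before the slash is the high 32 bits and the part
    after it is the low 32 bits, both written in hexadecimal.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lsn_diff(newer: str, older: str) -> int:
    """Return how many bytes of WAL separate two LSNs (newer minus older)."""
    return lsn_to_bytes(newer) - lsn_to_bytes(older)


# Values copied from the two rows of pg_stat_replication above.
sent_location = "1BD6/63740D68"
replay_location = "1BD6/6373DE78"

print(lsn_diff(sent_location, replay_location))  # 12016
```

So both restored replicas are about 12 kB behind on replay at the moment, which looks like normal streaming rather than a stuck replica.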