Restart scheduler on error #7527
Compared to #6195, this adds code to the start of the launcher that handles any schedulers that existed prior to the restart by killing them. Reproducing the issue in a test is difficult because of how the schedulers are used in tests, but this short script demonstrates it:

create view tsdb_bgw as
select datname, pid, backend_type, application_name from pg_stat_activity
where backend_type like '%TimescaleDB%';
alter system set timescaledb.debug_bgw_scheduler_exit_status to 1;
alter system set timescaledb.bgw_scheduler_restart_time to 10;
select pg_reload_conf();
-- Kill the schedulers
select pid, pg_terminate_backend(pid) from pg_stat_activity where backend_type like '%Scheduler%';
select pg_sleep(1);
select datname, backend_type, count(*) from tsdb_bgw group by datname, backend_type;
-- Wait for the schedulers to restart.
select pg_sleep(10);
select datname, backend_type, count(*) from tsdb_bgw group by datname, backend_type;
-- Kill the launcher, it should restart immediately.
select pid, pg_terminate_backend(pid) from pg_stat_activity where backend_type like '%Launcher%';
select pg_sleep(1);
-- This should only show one scheduler for each database.
select datname, backend_type, count(*) from tsdb_bgw group by datname, backend_type;
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7527      +/-   ##
==========================================
+ Coverage   80.06%   82.30%   +2.24%
==========================================
  Files         190      238      +48
  Lines       37181    43746    +6565
  Branches     9450    10980    +1530
==========================================
+ Hits        29770    36007    +6237
- Misses       2997     3406     +409
+ Partials     4414     4333      -81
If the scheduler receives an error, it is never restarted because `bgw_restart_time` is set to `BGW_NEVER_RESTART`, which prevents all jobs from executing.

This commit adds the GUC `timescaledb.bgw_scheduler_restart_time`, which sets the restart time for the scheduler. It defaults to 60 seconds, which is the default restart interval for background workers defined by PostgreSQL.

It also adds `timescaledb.debug_bgw_scheduler_exit_status`, which makes it possible to shut down the scheduler with a non-zero exit status so that the restart functionality can be tested.

It also ensures that `backend_type` is explicitly set rather than copied from `application_name`, and adds some more information to `application_name`. The tests are updated to use `backend_type` where applicable.

To avoid exhausting background worker slots when the launcher restarts, the launcher kills all existing schedulers and starts new ones. Since background worker slots are easily exhausted in the `bgw_launcher` test, we do not run it repeatedly in the flakes workflow.
Disable-check: loader-change
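
As a rough illustration of how the new GUC might be used outside of testing, the sketch below changes the scheduler restart time and then lists the TimescaleDB workers by `backend_type`. The value 30 is an arbitrary example, and treating the setting as seconds follows from the 60-second default described above. Applying it with `alter system` and `pg_reload_conf()` mirrors the reproduction script and is assumed to be sufficient for the setting to take effect.

```sql
-- Assumed example value: restart a failed scheduler after 30 seconds
-- instead of the 60-second default.
alter system set timescaledb.bgw_scheduler_restart_time to 30;
select pg_reload_conf();

-- Check the value that is now in effect.
show timescaledb.bgw_scheduler_restart_time;

-- List TimescaleDB background workers by backend_type rather than
-- application_name, as the updated tests do.
select datname, pid, backend_type, application_name
from pg_stat_activity
where backend_type like '%TimescaleDB%';
```

The debug setting `timescaledb.debug_bgw_scheduler_exit_status` is left untouched here since it exists only to force a non-zero exit status for testing.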