Starting with the move to Sidekiq, the memory recycler is gone. Luckily, by splitting the executor into several systemd services, we can leverage the resource control features[1] provided by systemd and cgroups to fill the gap left by the memory recycler's removal.
Before we can get to the memory limiting, let's take a look at the default state. There are the orchestrator, worker and worker-hosts-queue processes running as instances of the dynflow-sidekiq@.service template service, and as we can see from the output, systemd doesn't track how much memory the instances are using, nor are there any limits placed on memory usage.
# systemctl status dynflow-sidekiq@* | grep -e Memory -e '^. dynflow-sidekiq@.*.service'
● dynflow-sidekiq@worker-hosts-queue.service - Foreman jobs daemon - worker-hosts-queue on sidekiq
● dynflow-sidekiq@worker.service - Foreman jobs daemon - worker on sidekiq
● dynflow-sidekiq@orchestrator.service - Foreman jobs daemon - orchestrator on sidekiq
For resource control to work, we first have to enable resource accounting for the unit. Note that this needs to be done only once.
# mkdir -p /etc/systemd/system/dynflow-sidekiq@.service.d
# cat <<EOF > /etc/systemd/system/dynflow-sidekiq@.service.d/memory-accounting.conf
[Service]
MemoryAccounting=yes
EOF
Make systemd reload the service definitions and restart all the dynflow-sidekiq
services.
# systemctl daemon-reload
# systemctl restart dynflow-sidekiq@*
If we take a look at the status of the services, we should see that systemd started tracking how much memory each of the services uses.
# systemctl status 'dynflow-sidekiq@*' | grep -e Memory -e '^. dynflow-sidekiq@.*.service'
● dynflow-sidekiq@worker-hosts-queue.service - Foreman jobs daemon - worker-hosts-queue on sidekiq
Memory: 263.1M
● dynflow-sidekiq@worker.service - Foreman jobs daemon - worker on sidekiq
Memory: 264.1M
● dynflow-sidekiq@orchestrator.service - Foreman jobs daemon - orchestrator on sidekiq
Memory: 264.5M
Now that memory accounting is set up, we can move on to the actual limiting.
Here we have two options: the limit can either be set individually for each instance, or it can be set at the template level, in which case it applies to all instances. If set at both levels, the per-instance setting takes precedence over the template one.
For example, to set a global limit of 2 gigabytes, the following snippet could be used.
# mkdir -p /etc/systemd/system/dynflow-sidekiq@.service.d
# cat <<EOF > /etc/systemd/system/dynflow-sidekiq@.service.d/memory-limit.conf
[Service]
MemoryLimit=2G
EOF
# systemctl daemon-reload
# systemctl restart dynflow-sidekiq@*
Now we can check the output of systemctl status to see the limit is applied.
# systemctl status 'dynflow-sidekiq@*' | grep -e Memory -e '^. dynflow-sidekiq@.*.service'
● dynflow-sidekiq@worker-hosts-queue.service - Foreman jobs daemon - worker-hosts-queue on sidekiq
Memory: 490.1M (limit: 2.0G)
● dynflow-sidekiq@worker.service - Foreman jobs daemon - worker on sidekiq
Memory: 490.1M (limit: 2.0G)
● dynflow-sidekiq@orchestrator.service - Foreman jobs daemon - orchestrator on sidekiq
Memory: 492.5M (limit: 2.0G)
To apply per-instance overrides, the approach is the same; only the path is slightly different. For example, to increase the limit for the worker to 4 gigabytes, the following snippet could be used.
# mkdir -p /etc/systemd/system/dynflow-sidekiq@worker.service.d
# cat <<EOF > /etc/systemd/system/dynflow-sidekiq@worker.service.d/memory-limit.conf
[Service]
MemoryLimit=4G
EOF
# systemctl daemon-reload
# systemctl restart dynflow-sidekiq@worker
We use the same command to check that the per-instance setting overrides the template one.
# systemctl status 'dynflow-sidekiq@*' | grep -e Memory -e '^. dynflow-sidekiq@.*.service'
● dynflow-sidekiq@worker-hosts-queue.service - Foreman jobs daemon - worker-hosts-queue on sidekiq
Memory: 490.9M (limit: 2.0G)
● dynflow-sidekiq@worker.service - Foreman jobs daemon - worker on sidekiq
Memory: 175.6M (limit: 4.0G)
● dynflow-sidekiq@orchestrator.service - Foreman jobs daemon - orchestrator on sidekiq
Memory: 494.2M (limit: 2.0G)
Now we have enabled memory accounting and set up the limits, but what happens when a limit is actually reached? Sadly, nothing too sophisticated: once the limit is reached, the service in question is killed with SIGKILL.
# journalctl -u dynflow-sidekiq@worker
Jul 27 08:57:34 foreman.example.com systemd[1]: Started Foreman jobs daemon - worker on sidekiq.
Jul 27 08:57:36 foreman.example.com dynflow-sidekiq@worker[10878]: 2020-07-27T12:57:36.683Z 10878 TID-43k2m INFO: GitLab reliable fetch activated!
Jul 27 08:57:36 foreman.example.com dynflow-sidekiq@worker[10878]: 2020-07-27T12:57:36.731Z 10878 TID-d3w7u INFO: Booting Sidekiq 5.2.7 with redis options {:id=>"Sidekiq-server-PID-10878", :url=>"redis://localhost:6379/0"}
----- 8< ----- SNIP ----- 8< -----
Jul 27 08:58:39 foreman.example.com systemd[1]: dynflow-sidekiq@worker.service: main process exited, code=killed, status=9/KILL
Jul 27 08:58:39 foreman.example.com systemd[1]: Unit dynflow-sidekiq@worker.service entered failed state.
Jul 27 08:58:39 foreman.example.com systemd[1]: dynflow-sidekiq@worker.service failed.
Jul 27 08:58:40 foreman.example.com systemd[1]: dynflow-sidekiq@worker.service holdoff time over, scheduling restart.
Jul 27 08:58:40 foreman.example.com systemd[1]: Stopped Foreman jobs daemon - worker on sidekiq.
Jul 27 08:58:40 foreman.example.com systemd[1]: Started Foreman jobs daemon - worker on sidekiq.
Since the service is set to restart on non-graceful shutdowns, systemd restarts the freshly killed service after the holdoff time is over.
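This restart and holdoff behavior is controlled by the unit's Restart= and RestartSec= directives. If the defaults shipped with the unit don't fit, they can be overridden with a drop-in just like the memory settings; the values below are purely illustrative.
# cat <<EOF > /etc/systemd/system/dynflow-sidekiq@.service.d/restart.conf
[Service]
Restart=on-failure
RestartSec=5s
EOF
# systemctl daemon-reload
# systemctl restart dynflow-sidekiq@*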
At the time of writing, the version of systemd on EL7 was 219. Newer versions of systemd promise better handling of memory management with MemoryMax, MemoryHigh and MemoryLow.
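On a host running a newer systemd with these directives available, the same drop-in approach could be used with a soft and a hard limit instead of the single MemoryLimit; a sketch, with illustrative values:
# cat <<EOF > /etc/systemd/system/dynflow-sidekiq@.service.d/memory-limit.conf
[Service]
MemoryHigh=1.5G
MemoryMax=2G
EOF
# systemctl daemon-reload
# systemctl restart dynflow-sidekiq@*
Above MemoryHigh, memory is aggressively reclaimed from the service and its allocations are throttled, while MemoryMax acts as a hard cap similar to MemoryLimit.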
When the worker is killed, it may have been processing one or more jobs. If we used Sidekiq as-is, those jobs would be lost. For this reason we use gitlab-sidekiq-fetcher, which implements the reliable fetch pattern. When a worker starts executing a job, it takes the job from its queue and puts it onto a working queue. When the job is finished, the worker removes it from the working queue. However, if the worker is killed while executing a job, the job stays on the working queue. Once per hour, the working queue is checked and any orphaned jobs found there are removed from it, requeued to the original queue and executed again.
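To make the mechanism concrete, here is a minimal, self-contained Python sketch of the reliable fetch pattern described above. It only simulates the bookkeeping with in-memory lists; the real gitlab-sidekiq-fetcher does the equivalent with Redis lists, and all names here are made up for illustration.

```python
from collections import deque

class ReliableQueue:
    def __init__(self):
        self.queue = deque()   # jobs waiting to be picked up
        self.working = []      # jobs currently being executed

    def fetch(self):
        """Move a job from the queue onto the working queue and return it."""
        job = self.queue.popleft()
        self.working.append(job)
        return job

    def ack(self, job):
        """Job finished: remove it from the working queue."""
        self.working.remove(job)

    def requeue_orphans(self):
        """Periodic cleanup: jobs still on the working queue belong to
        dead workers, so push them back onto the original queue."""
        while self.working:
            self.queue.appendleft(self.working.pop())

q = ReliableQueue()
q.queue.append("job-1")

job = q.fetch()       # worker picks the job up; it is now on the working queue
# ... the worker is SIGKILLed here, so q.ack(job) never runs ...
q.requeue_orphans()   # the hourly cleanup puts the orphan back on the queue
print(q.queue[0])     # prints "job-1" - the job is ready to be executed again
```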
When the job gets executed by a worker for the second time, Dynflow notices it has already tried to execute the job, moves the step into an error state and doesn't actually execute it again. From here on, rescue strategies can be applied to handle the situation further.
There was a bug in Dynflow <= 1.4.6 which made the job get stuck when attempting to run the step for the second time. It was fixed in Dynflow/dynflow#360. Until a version with the fix is released, the fix has to be applied manually.
[1] - https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#MemoryMax=bytes