We've had a number of requests for the ability to run scripts against a deployment. The challenge has been defining these scripts and their execution environment without reducing the predictable, descriptive, and deterministic nature of BOSH.
- Bind / Unbind services when deploying Runtime and Services.
- Run automated tests against a recent deployment.
- Add seed data such as buildpacks to a deployment.
- Push applications to a freshly deployed Runtime.
- Run backup processes.
- Triggered execution (e.g., run this errand after every deployment).
- Cron support (e.g., run this errand every 10 minutes).
- Running multiple instances of a particular errand at the same time. BOSH will error saying that the specified errand is already running.
- Running an errand on the same VM as another job.
Errands are defined in a release and configured in the deployment manifest. They are similar to jobs, but they are neither long-running nor deployed during a deployment.
When an errand is run, BOSH will provision a new VM using the specified stemcell, install that errand and all required packages and templates, and run the given command.
By making use of some of the same constructs as jobs, BOSH can ensure that errands are run in a completely predictable, descriptive, and deterministic environment each and every time.
Errands are defined in the release in a similar fashion to jobs. Errands make use of packages, and generate templates to configure those packages. This allows an errand that requires a specific runtime (such as Ruby) to include it explicitly.
The only differences are:
- Errands are first-class objects. They're defined as `errands` as opposed to `jobs`.
- Errands aren't run via `monit`, but via the `bosh run errand` command.
- Errands are installed into `/var/vcap/errands/foo`.
Deploy Manifest

    ---
    ...
    networks:
    - name: default
      ...
    resource_pools:
    - name: default
      network: default
      ...
    jobs:
    ...
    errands:
    - name: smoke_test
      release: release_name
      templates:
      - name: smoke_test
      resource_pool: default
      networks:
      - name: default
        default:
        - dns
        - gateway
      # instances: 1
      # persistent_disk: 20480

Once an errand is defined and configured, running it is as simple as `bosh run errand smoke_test`. BOSH will then provision a new VM, install the errand on it, and execute the `errands/smoke_test/execute` script.

Additional command-line arguments given to `bosh run errand` will be passed to the remote script as command-line arguments.
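
To make the argument pass-through concrete, here is a minimal sketch of what the body of an `errands/smoke_test/execute` script might look like. It assumes the script is a bash script and that any extra arguments arrive unchanged as `"$@"`. The function name `smoke_test_main` and the `--target` and `--verbose` flags are hypothetical and exist only for illustration; nothing in the proposal defines them.

```shell
#!/bin/bash
# Hypothetical body for errands/smoke_test/execute.
# Assumption: extra arguments given to `bosh run errand smoke_test ...`
# reach this script unchanged as "$@".

smoke_test_main() {
  local target="" verbose=0

  # Walk the argument list; --target and --verbose are made-up flags.
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --target)
        if [ "$#" -lt 2 ]; then
          echo "missing value for --target" >&2
          return 2
        fi
        target="$2"
        shift 2
        ;;
      --verbose)
        verbose=1
        shift
        ;;
      *)
        echo "unknown argument: $1" >&2
        return 2
        ;;
    esac
  done

  if [ -z "$target" ]; then
    echo "usage: execute --target URL [--verbose]" >&2
    return 2
  fi

  # A real errand would run its checks here; this sketch only reports
  # what it was asked to do.
  echo "running smoke test against ${target} (verbose=${verbose})"
}

# A real script would finish with:  smoke_test_main "$@"
```

Keeping the parsing in a function makes the script easy to exercise locally before it is packaged into a release, and a non-zero return for bad arguments gives the caller something to act on.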
- Will errands be run asynchronously (like tasks)? If so, that opens up a whole can of work around `bosh errands`, `bosh errand 123`, etc. If not, then we'll have to ensure we don't have timeout issues.
- How will we provide the return code of the script being run as opposed to the return code of the `bosh run errand` command? If errands are async, then the `bosh run errand foo` command could return before the script finishes executing. If synchronous, then `bosh run errand foo` could be killed locally, while the script continues to a successful exit.
- How do we deal with script output? Do we need to stream that back to the `bosh run errand` command?
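
For the synchronous case, the return-code and output questions can be pictured with a small local wrapper. This is only a sketch of one possible shape, not a description of how BOSH behaves: the function name `run_and_report` and the wording of the final status line are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical wrapper illustrating the synchronous case: output is passed
# through as the command produces it, and the command's own exit status is
# both printed and returned to the caller.

run_and_report() {
  local status=0

  # Run the given command; stderr is merged into stdout so the caller
  # sees a single stream. A non-zero exit is captured, not swallowed.
  "$@" 2>&1 || status=$?

  echo "errand exit code: ${status}"
  return "$status"
}
```

With this shape the caller can tell a failing script apart from a failing wrapper, since the script's status is reported explicitly. It does not address the case raised above where the local command is killed while the remote script keeps running.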
Looks cool! I had a go at answering the open questions below.
Could errands be a kind of task? They could show up in the `bosh tasks` list and be attached to with `bosh task <id>`. The user interface for this could be:

Choosing a default of `--attach` or `--detach` would be up for discussion. By piggybacking on top of the existing tasks semantics you get most of your open questions answered for free.

After this change BOSH will have "tasks", "errands", and about 5 different kinds of "jobs". All of these have very similar meanings outside of BOSH. Could there be a benefit in unifying or eliminating some of these? The Google paper on Omega shows that they have a similar overloading of "jobs" and "tasks" (but they use them to mean slightly different things than their equivalents in BOSH).