Beyond shell scripts

Background: The objective is to run Python everywhere, and my first step is to replace shell activity with pypyr. These notes relate to automating a common task: running a containerized application in a local virtual host. The host happens to be podman, but the structure is similar to Docker, Vagrant, kubectl, etc. The naive workflow of "copy/paste commands into YAML blocks" raised the question of how to handle common failure scenarios. So I wonder if there is an approach, or even some scaffolding, for the "typical launch script".

Starting a service requiring a Docker host

What things could go wrong?

The happy path is usually what is left over after everything else has failed. My intuition is that one should probe for the common failure conditions first. I didn't do that and so I ran into the failures organically, instead of seeking them out.

Is the service already running?

The app runs in a container, and the container requires a running podman virtual machine. This is structurally similar to Docker, Vagrant, etc.

In the shell, I would run:

 podman machine start
Starting machine "podman-machine-default"
Waiting for VM ...
Mounting volume... /Users/bpabon:/Users/bpabon

This machine is currently configured in rootless mode. If your containers
require root permissions (e.g. ports < 1024), or if you run into compatibility
issues with non-podman clients, you can switch using the following command:

	podman machine set --rootful

API forwarding listening on: /var/run/docker.sock
Docker API clients default to this address. You do not need to set DOCKER_HOST.

Machine "podman-machine-default" started successfully

So, I suppose I could tell pypyr to look for the string "started successfully" in the output, or I could rely on the POSIX exit code that podman returns.

This simple pipeline so far assumes that it must start the host VM and also the application container.

steps:
  - name: pypyr.steps.cmd
    description: starts the podman machine but doesn't run it in a shell.
    in:
      cmd:
        run: podman machine start
        # save the result to cmdOut so the later steps can inspect it
        save: True
  - name: pypyr.steps.cmd
    description: --> giving podman 10s to start the machine.
    in:
      cmd: sleep 10
  - name: pypyr.steps.debug
    description: check out the cmd output saved to cmdOut!
    in:
      debug:
        keys: cmdOut
  - name: pypyr.steps.echo
    run: !py cmdOut.returncode == 0
    in:
      echoMe: "Machine podman-machine-default started successfully: {cmdOut.stdout}"
  - name: pypyr.steps.cmd
    # note: cmd does not use a shell, so ~ reaches podman unexpanded;
    # pypyr.steps.shell would expand it.
    in:
      cmd: >
        podman run
        --name api-server
        --detach
        --tty
        --volume ~/.ara/server:/opt/ara
        -p 8000:8000
        docker.io/recordsansible/ara-api:latest

Is the launch not yet complete?

If the VM is still starting, the next steps in the pipeline will fail, so I should check the state of the VM first. If I run the start command again too quickly, I get:

Error: cannot start VM podman-machine-default: VM already running or starting
Error while running step pypyr.steps.cmd at pipeline yaml line: 2, col: 5
Something went wrong. Will now try to run on_failure.

CalledProcessError: Command '['podman', 'machine', 'start']' returned non-zero exit status 125.

Idempotency: Did we already launch the service?

The host service may already be running from a previous invocation, so we should proceed to the application stage.

Error: cannot start VM podman-machine-default: VM already running or starting

Solution: retry until ready

Use retry to keep retrying a command until it succeeds. For example:

- name: pypyr.steps.cmd
  description: --> installing just published release from pypi for smoke-test
  retry:
    max: 5
    sleep: 10
  in:
    cmd: pip install --upgrade --no-cache-dir {package_name}=={expected_version}
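
Applied to the podman machine from the earlier sections, the same decorator could stand in for a fixed sleep. This is only a sketch: it assumes that podman info exits non-zero until the client can reach the VM, which I have not checked across podman versions. The first step swallows the "already running or starting" error so that a repeat run carries on to the probe.

```yaml
steps:
  - name: pypyr.steps.cmd
    comment: start the VM. swallow the error podman raises when the VM is
             already running or starting, so a second run is harmless.
    swallow: True
    in:
      cmd: podman machine start

  - name: pypyr.steps.cmd
    comment: ASSUMPTION - podman info fails until the client can reach the VM.
             retry up to 5 times, 10s apart, then raise and stop.
    retry:
      max: 5
      sleep: 10
    in:
      cmd: podman info
```

The trade-off is that swallow hides every failure of podman machine start, not only the "already running" one. A genuine start failure then surfaces a little later, when the probe runs out of retries.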

A complex command with different types of arguments

From the shell prompt, I start the application by calling the podman CLI, followed by the run command, followed by several options. Some of these options take additional parameters:

 podman run --name api-server --detach --tty \
  --volume ~/.ara/server:/opt/ara -p 8000:8000 \
  docker.io/recordsansible/ara-api:latest

Name collisions?

The application container may collide with another if the names are the same. Most often, this is because I have run the command twice.

Error: error creating container storage: the container name "api-server" is already in use by a8673f400c138e96cd9e036bfa47bd8285d058953635b614511723ba0d8bed7d. You have to remove that container to be able to reuse that name: that name is already in use

Solution: pipeline-reserved names

Consider using a known name, say api-server__, and reserving it so that only your pipelines use that name.

A similar approach can work with network ports if another container is using the same number.
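
One way to enforce the reserved name is to remove any leftover container before starting a new one. The sketch below assumes that nothing else on the machine needs a container called api-server__. The --force and --ignore flags of podman rm stop a running container and skip the error when no such container exists. The second step uses pypyr.steps.shell rather than pypyr.steps.cmd so that the shell expands ~ in the volume path.

```yaml
steps:
  - name: pypyr.steps.cmd
    comment: remove a leftover container from a previous run, if there is one.
    in:
      cmd: podman rm --force --ignore api-server__

  - name: pypyr.steps.shell
    comment: the name is free now, so this no longer collides with an earlier run.
    in:
      cmd: podman run --name api-server__ --detach --tty --volume ~/.ara/server:/opt/ara -p 8000:8000 docker.io/recordsansible/ara-api:latest
```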

Solution: folded block scalars

YAML uses the > symbol to introduce a folded block scalar. The line breaks inside the block are folded into spaces, so a long command can be spread over several lines:

  - name: pypyr.steps.cmd
    in:
      cmd: >
        podman run
        --name api-server__
        --detach
        --tty
        --volume ~/.ara/server:/opt/ara
        -p 8000:8000
        docker.io/recordsansible/ara-api:latest

Anatomy of a pipeline for launching a containerized app

Make sure that, as part of your pipeline, you:
a) check that it's not already running. If it is, stop it and start a new instance.
b) do your work.
c) clean up after a failure (stop the instance).

steps:
  - name: pypyr.steps.call
    comment: Block of pre-flight tasks - comments don't get exposed at runtime.
    in:
      call: start-services

  - name: pypyr.steps.echo
    description: Running the steps specific to this pipeline
    comment: Block of tasks to meet the objective
    in:
      echoMe: do your work here and in subsequent steps once services started

  - name: pypyr.steps.call
    comment: stop the services when done. could also put this in `on_success` group.
    in:
      call: stop-services

# This block is called in the beginning.
start-services:
  - name: pypyr.steps.cmd
    comment: this cmd should fail (non-zero exit) if podman api-server is already running
    swallow: True
    in:
      cmd: echo some sort of cmd to check if api-server is running

  - name: pypyr.steps.call
    comment: if previous step errored, means podman already running.
             so stop it first, so we can start clean.
    run: !py "'runErrors' in locals()"
    in:
      call: stop-services

  # if stop is asynchronous/detached, you might have to have another retry-style
  # step here to wait for it to stop completely.
  
  - name: pypyr.steps.cmd
    comment: now we know api-server is deffo NOT running. so can just start it here.
    in:
      cmd: echo start your api-server 

  - name: pypyr.steps.cmd
    comment: this will retry max 5 times w 10s sleep in between to wait for api-server to start
             if api-server does not start in this time, will raise error and stop here.
    retry:
      max: 5
      sleep: 10
    in:
      cmd: echo some sort of cmd that returns 0 when api-server started and ready

# this block is called at the end.
stop-services:
  - name: pypyr.steps.cmd
    in:
      cmd: echo podman stop api-server__ etc.

# this is the global err handler. this on_failure group will run if any unhandled
# (i.e un-swallowed) error in the pipeline happens.
on_failure:
  - name: pypyr.steps.call
    comment: try to stop the services. this is best effort, so if this fails
             just swallow, not much more we can do.
    swallow: True
    in:
      call: stop-services

# The destination of the happy path
on_success:
  - name: pypyr.steps.echo
    description: Joyous, we embark on the happy path
    comment: announce helpful info, consider some variable substitution
    in:
      echoMe: You are ready to use at URL... reports at URL:// ...
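
The last placeholder in start-services, the command that returns 0 once api-server is ready, could be filled in along these lines. Treat it as a sketch, not a tested step. The group name wait-for-api is made up for illustration. The probe assumes that curl is installed and that the API answers HTTP requests on http://localhost:8000/ once it is up; the port comes from -p 8000:8000, and the path is a guess.

```yaml
# hypothetical group name; call it with pypyr.steps.call like the other groups
wait-for-api:
  - name: pypyr.steps.cmd
    comment: ASSUMPTION - the API answers on this URL once it is ready.
             curl exits non-zero while the connection is refused.
    retry:
      max: 5
      sleep: 10
    in:
      cmd: curl --silent --output /dev/null http://localhost:8000/
```

As written, any HTTP response counts as "ready", including an error page. Adding --fail to the curl command would make HTTP error statuses count as failures too, at the cost of needing a path that is known to return a success status.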

How to clean up after the pipeline.

It's good to anticipate these conditions, but it's even better to avoid creating them. Try to leave things in a good place for the next run.
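
For the podman case, leaving things in a good place could mean a stop-services group like the sketch below. It assumes that the container uses the reserved name api-server__. Stopping the VM is only appropriate when nothing else depends on it, so that step is marked optional.

```yaml
stop-services:
  - name: pypyr.steps.cmd
    comment: stop and remove the container so the reserved name is free next time.
             --ignore keeps this from failing when the container is already gone.
    in:
      cmd: podman rm --force --ignore api-server__

  - name: pypyr.steps.cmd
    comment: OPTIONAL - only if no other work needs the VM. swallow because
             this is best effort.
    swallow: True
    in:
      cmd: podman machine stop
```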

Colophon

Tech docs are often considered to have two dimensions, Understanding and Action, each with a spectrum from abstract to concrete.

The current pypyr docs are heavier on understanding (explanation and reference) and lighter on action (tutorials and guides). I would be curious about developing a process for selecting scenarios and describing their structure.

blaisep/podman_pypyr.md