A few comments on the Heavy Networking 442: The Source Of Truth Shall Set You Free (To Automate) podcast.
There's never enough time to discuss every detail that might be worth exploring, of course.
In the discussion of the three sources of truth, a few things weren't clear to me:
- whether the data in the Netbox data repository is versioned so that a bad config can be reverted
- what the repository is for the remaining cloud services
- how emerging new services like a service mesh would be handled.
"configure the devices reliably or idempotently"
This isn't really what "idempotent" means.
Performing an operation idempotently doesn't mean it will be done reliably, just that the operation can be done repeatedly and produce the same result.
For example, "set variable_value to 2" is idempotent, as opposed to "add 1 to A", which is not.
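The difference between the two operations can be shown with a minimal Python sketch. The function names and the dictionary used as "state" are made up for illustration and have nothing to do with the tooling discussed in the podcast.

```python
def set_value(state, key, value):
    """Idempotent: repeating the call leaves the state the same."""
    state[key] = value
    return state


def add_one(state, key):
    """Not idempotent: every call changes the state again."""
    state[key] = state.get(key, 0) + 1
    return state


# Apply each operation three times, as a retried automation run might.
idempotent_state = {"variable_value": 0}
for _ in range(3):
    set_value(idempotent_state, "variable_value", 2)

counter_state = {"A": 0}
for _ in range(3):
    add_one(counter_state, "A")

print(idempotent_state)  # {'variable_value': 2}
print(counter_state)     # {'A': 3}
```

Note that neither function says anything about reliability: an idempotent operation can still fail, it is just safe to run again.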
This could probably only be handled in a separate pod, but the networking issues associated with converting from cloud to on-site would be of a lot of interest.
Bursting capability to go into other data centers [from cloud]. I would have liked to hear more about this since I've never heard of anyone actually making it work in real life. It's fine to pass over, though, since it wasn't central to the topic.
It wasn't clear to me why, if you wanted a feature added to store an AS number per device, it couldn't be a feature request or a contribution. See Contributing to netbox: feature request
When CI/CD and tests came up, it would have been interesting to hear more about how the coding practices align with what the developers outside the networking group are doing.
Some specific points:
- Are code development practices and standards in the networking area the same as those used by the dev team?
- How are networking changes integrated into the overall dev to prod workflow?
- Python 2 or 3? If 2, what's the plan to go to 3 and how disruptive will it be?
- Is DB admin for the Postgres DB handled in the networking group or does the organization's DB group handle or assist?
Also, is there a "bus + 1" strategy to avoid being dependent on Damien (if he gets hit by a bus)?
Addendum: There are interesting comments in HN445 on using a Python-based automation framework to replace/augment many of the tasks discussed in the HN442 pod.
(Below here are just notes on what was mentioned at various timings.)
2:45 2 DCs, 12 POPs, worldwide backbone, 12 months
4:20 3 "sources of truth" with some integration
- legacy DB, applications & services, not infra
- Netbox
- Git repo
6:40 "Configure the devices reliably or idempotently"
14:55 bursting capability to go into other DCs
35:05 DevOps approach
39:35 Want to store an AS number per device. Contributing to netbox: feature request
48:30 CI/CD and code test: no, some unit tests
52:10 Use Git to manage changes