@maligree
Created April 27, 2018 11:48

We moved today’s KS because of low attendance, and because it will take 2-3 separate KS meetings to fully cover the topic and a long weekend between them would suck (there’s some context to it). So I’ve compiled my slides and other notes to give you something to check out if you’re extra bored or interested in the subject.

These are just bullet points; all of them (and more) will be expanded during the actual KS sessions after the weekend.

  • The notion of performance-critical code should come up as early as the first code review, but let’s not optimize prematurely
  • I spent a lot of time rewriting parts of Rye in C++ first, then dropped the experiment for a few weeks because of TrustedMail work
  • I did a golang implementation as well (actually, I initially did it out of boredom on a weekend and wasn’t even expecting to finish it, but it turned out so pleasant that I did).
  • Go has nice built-in benchmarking support that lets you turn a regular test into a benchmark with little effort, which is NICE.
  • I won’t go into discussing languages here and now… that’s going to be a long discussion.
  • This week and the next I’m working on a final benchmark of Python vs Cython vs Rye++ vs Ryego.
  • Whenever we cross the language boundary, we need to create an interface - and since we have Python calling C++/Go, the interface is either using ctypes directly or wrapping the C++/Go in a custom Python module. Both have their pros and cons.
  • Some of our data structures are tricky to pass to a different language!
  • One example is a correspondences dictionary in Rye, where the keys are 2-element arrays and the values are 3-element arrays. This is of course achievable in other languages, but it’s something that feels REALLY, REALLY wrong.
  • So looking at other languages could actually help us clear up our way of thinking about existing code.
  • Another example is from wheat, where the template descriptions are complex nested dicts. That’s really unwieldy inside a statically typed language. And here’s a fun fact if you thought about using a JSON library to serialize/deserialize and access those: JSON parsing is pretty fast relative to anything else we do in Python. In a piece of Go code, by contrast, parsing a moderately complex JSON structure can be a bottleneck.
  • The moral is: once we go to a lower level, we’ll discover stuff that we never thought was slow in the first place.
  • Since implementing the cross-language interface/boundary is a lot of work, I’ve decided to look into a few RPC frameworks.
  • RPC frameworks give us a standardized protocol for serialization when sending data between services (protobuf for grpc… something else for Thrift)
  • I’ve worked with Apache Thrift (originally created at Facebook) before, but Python 3 support still isn’t official. I decided to jump into the most marketed alternative, Google’s gRPC (grpc.io)
  • There’s also Cap’n Proto, which is worth checking out, especially since they promise good serialization performance
  • The serialization/deserialization process is the biggest performance cost, barring network latency, which I’m ignoring right now since I’m assuming all RPC “resource groups” will run on one machine.
  • Example experiment workflow for Rye:
    • Take a CPU-intensive part out of Rye
    • Re-write it in a different language
    • Wrap the rewrite in a thin RPC layer, turning it into a service
    • Replace the original calls with RPC calls
    • This gives us TWO services: one is the original Python service, the other is “internal” and at first would run alongside the original service, for minimal network latency between them.
  • Fun links
  • These notes are a raging mess!
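
To make the correspondences point above a bit more concrete: a dict with 2-element keys and 3-element values can be flattened into fixed-width rows before it crosses the language boundary. This is only a sketch, not Rye’s actual code. The names `flatten_correspondences` and `unflatten_correspondences` are hypothetical, the keys are assumed to be tuples of ints (Python lists can’t be dict keys), the values are assumed to be floats, and the sample data is made up.

```python
import ctypes

ROW_WIDTH = 5  # 2 key elements + 3 value elements per correspondence


def flatten_correspondences(correspondences):
    """Pack {(a, b): (x, y, z)} into one contiguous C array of doubles."""
    flat = []
    for key, value in correspondences.items():
        if len(key) != 2 or len(value) != 3:
            raise ValueError("expected 2-element keys and 3-element values")
        flat.extend(key)
        flat.extend(value)
    # A ctypes array like this can be handed to a C-compatible function
    # that takes a double* plus a length.
    return (ctypes.c_double * len(flat))(*flat)


def unflatten_correspondences(buffer):
    """Rebuild the dict from the flat buffer (assumption: keys are ints)."""
    values = list(buffer)
    result = {}
    for start in range(0, len(values), ROW_WIDTH):
        a, b, x, y, z = values[start:start + ROW_WIDTH]
        result[(int(a), int(b))] = (x, y, z)
    return result


# Made-up sample data, purely for illustration.
sample = {(0, 1): (0.5, 1.5, 2.0), (2, 3): (4.0, 5.0, 6.25)}
buffer = flatten_correspondences(sample)
restored = unflatten_correspondences(buffer)
```

On the C++/Go side this would show up as a plain array of doubles plus a row count, which is probably a lot less awkward to declare than a dict of arrays. Whether the extra copy is acceptable is something the benchmark would have to show.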