Last active
March 18, 2024 05:43
bad but cute implementation strategies for safe pointer publication
// aarch64 assumed, but reasonably general
// x0: address of the object to be initialised; x1, x2, x3: values to store in its first three slots
// we want to ensure no other thread ever sees the object at x0 uninitialised
// the dumb way: fence
str x1, [x0]
str x2, [x0, 8]
str x3, [x0, 16]
dmb ishst
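// a sketch of the publication step and the consumer side, for completeness
// (assumption, not in the original: x7 holds the address of a shared slot through which the pointer is published)
str x0, [x7]       // publish; dmb ishst orders the three stores above before this one
// a consumer on another thread needs no fence here, because its loads from the object are address-dependent on the pointer load
ldr x0, [x7]       // if this observes the published pointer ...
ldr x1, [x0]       // ... then this observes the initialised first slot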
// but fences are cringe. how can we do 'better'? data dependencies!
// swp (an LSE atomic) stores its first operand and returns the old contents of the slot in its second
swp x1, x4, [x0]
swp x2, x5, [x0, 8] // (for the pedants, this addressing mode doesn't exist; this is pseudo-code)
swp x3, x6, [x0, 16]
add x4, x4, x5 // this could be anything (xor, sub, or, etc.)
add x4, x4, x6
sub x4, x4, x4 // produce a constant zero that is dependency-ordered after all three writes
add x0, x0, x4 // and ensure further accesses to the object are similarly dependency-ordered
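// a sketch of the publication step that would follow
// (assumption, not in the original: x7 holds the address of the shared slot where the pointer is published)
str x0, [x7]       // the value stored here carries a data dependency on all three swp results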
// or ... control dependencies!
swp x1, x4, [x0]
swp x2, x5, [x0, 8]
swp x3, x6, [x0, 16]
add x4, x4, x5
add x4, x4, x6
cmp x4, x4
b.ne anywhere // never taken (x4 always equals itself), but still a control dependency on x4
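// a sketch of the publication step that would follow
// (assumption, not in the original: x7 holds the address of the shared slot where the pointer is published)
str x0, [x7]       // this store is control-dependent on the branch, and so on the three swp results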
// or ... forbidden memory control dependency hacks!
swp x1, x4, [x0]
swp x2, x5, [x0, 8]
swp x3, x6, [x0, 16]
add x4, x4, x5
add x4, x4, x6
sub x4, x4, x4
ldr x4, [x4, x0]
// the dummy load can actually go anywhere; its address just has to depend on x4, so e.g. this also works
// (sp can only be the base register, never the index, hence the operand order)
ldr x4, [sp, x4]
// and it doesn't even have to be a load; it could be a store too (assuming the slot at [sp] is scratch space)
str xzr, [sp, x4]
// I don't know how widely the memory control dependency hack is known, if it's known at all, but I find it incredibly cute
// swp is likely too expensive for any of this to be worth it, sadly (awaiting benchmarks to the contrary!)
// but it would be really nice to have an str variant that also produces a dependency token you can use to enforce ordering