@orangecms
Last active July 7, 2024 20:31
kexec on RISC-V on 6.10-rc6

kexec file patch series test

This test is based on the patches from https://github.com/torvalds/linux/compare/master...bjoto:linux:kexec-image-v2-fixup. I have successfully ported them to both 6.9.7 and 6.10-rc6.

My branch: https://github.com/orangecms/linux/tree/6.10-jh7110-cpu

Additional kexec-tools patches: https://github.com/sugarfillet/kexec-tools/commits/main_rv

NOTE: The SBI we are using does not handle unaligned accesses itself; it delegates them to Linux.

Assessment

I have run both the u-root kexec command (which works without changes) and the kexec-tools one (which requires the patches above). The latter needs -s -f to force use of the file load syscall and to skip the shutdown command, which does not exist on our system.

For convenience when performing the experiment, I run kexec over cpu.

Multiple consecutive kexecs work fine, each taking about 15s from invocation to the new kernel being up and ready to cpu into again.

Purgatory

The following is painful to debug. See the corresponding commit in my fork.

I had to hard-disable ARCH_SUPPORTS_KEXEC_PURGATORY in arch/riscv/Kconfig.

Otherwise we eventually run into an instruction fault. During kexec, Linux sets a new trap handler (stvec), which loads from an address stored in s0. That handler is designed only for the moment when the MMU is turned off. The s0 register is cleared before the jump to purgatory, so any exception raised there lands in the trap handler with s0 no longer holding that address. And lo and behold, there are instances of unaligned memory access in purgatory, which trap because our SBI delegates them (see the NOTE above), so the whole chain goes 💥.

Output with purgatory bypass:

[   28.860255][   T60] kexec_file(Image): Loaded kernel at 0x40200000 bufsz=0x12f13f8 memsz=0x0
[   28.870411][   T60] Copy segment 0
[   28.873871][   T60] Copy segment 1
[   29.078757][   T23] starfive-dwmac 16030000.ethernet eth0: Link is Down
[   29.087039][   T60] kexec_core: Starting new kernel
[   29.091983][   T60] Will call new kernel at 40200000 from hart id 1
[   29.098248][   T60] FDT image at c01f7000
[   29.102253][   T60] Bye...
[SBI] It's a trap!
        SCAUSE: Exception(InstructionFault)
        INSTRUCTION: 0xffffffd8077c704c
[SBI] dump hex: 0000000040200000
[SBI] dump 256 bytes @401fffa0
defcdeba9afe3f7fd2dbdadab3fbf2fed9dbc2fc7b4e5afe7278fa5efb7efbce
fefacaff5ef756be7efadb9abe7adefad25a92f258dfd37afe7bde3bcaf35bdb
fafed77a75d953dafbf67bdaf1ff1afa7edbf897e93b787b5a5bdbfefe5e72ca
6f00400c000000000000200000000000087e3501000000000000000000000000
0200000000000000000000000000000052495343560000005253430500000000
9785fd00938505f38c6517060000130666fb918dae9017060000130626032e96
731056101356c5009785fd00938585f48c614d8e17952f011305c5f831814d8d
730000127310051817050000130545037310551097012f019381c18d73100618
[SBI] mepc: 00000000479c70ac
[    0.000000][    T0] Linux version 6.10.0-rc6-cyrevolt-00007-g49a9818add6a (dama@orangelemp) (U4
[    0.000000][    T0] random: crng init done
[    0.000000][    T0] Machine model: StarFive VisionFive 2 v1.3B
[    0.000000][    T0] SBI specification v1.0 detected
[    0.000000][    T0] SBI implementation ID=0x4 Version=0x302
[    0.000000][    T0] SBI TIME extension detected
[    0.000000][    T0] SBI IPI extension detected
[    0.000000][    T0] SBI RFENCE extension detected
[    0.000000][    T0] SBI SRST extension detected
[    0.000000][    T0] earlycon: sbi0 at I/O port 0x0 (options '')
[    0.000000][    T0] printk: legacy bootconsole [sbi0] enabled
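
As a sanity check on the SBI dump above: the 256 bytes start at 0x401fffa0, so the fourth 32-byte row begins exactly at the load address 0x40200000. The following is a minimal Python sketch that decodes rows four and five as a RISC-V Linux boot image header. The field layout is my reading of the kernel's boot image header documentation, and the names `parse_image_header`, `DUMP_ROWS`, `HEADER` and `FIELDS` are made up for this illustration.

```python
import struct

# Rows 4 and 5 of the SBI dump above, i.e. the 64 bytes at
# 0x40200000..0x4020003f (0x401fffa0 + 3 * 32 = 0x40200000).
DUMP_ROWS = (
    "6f00400c000000000000200000000000087e3501000000000000000000000000"
    "0200000000000000000000000000000052495343560000005253430500000000"
)

# Assumed layout (little-endian, 64 bytes total):
# u32 code0, u32 code1, u64 text_offset, u64 image_size, u64 flags,
# u32 version, u32 res1, u64 res2, u64 magic, u32 magic2, u32 res3
HEADER = struct.Struct("<IIQQQIIQ8sII")
FIELDS = ("code0", "code1", "text_offset", "image_size", "flags",
          "version", "res1", "res2", "magic", "magic2", "res3")


def parse_image_header(hex_bytes):
    """Decode the first 64 bytes of a hex string as an image header."""
    raw = bytes.fromhex(hex_bytes)
    return dict(zip(FIELDS, HEADER.unpack(raw[:HEADER.size])))


header = parse_image_header(DUMP_ROWS)
print("magic:      ", header["magic"])
print("text_offset:", hex(header["text_offset"]))
print("image_size: ", hex(header["image_size"]))
print("version:    ", hex(header["version"]))
```

The magic field decodes to the ASCII string RISCV and text_offset to 0x200000, which suggests the new kernel image was indeed in place at 0x40200000 when the trap was reported.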

Conclusion

Overall, kexec file works, but there are still unaligned memory accesses in purgatory that need fixing.
