Assembling code with Jump instructions results in Segmentation Faults (on ARM64 and AMD64 hosts)
Problem: A RISC-V program assembled from source with GNU binutils segfaults at run time whenever a jump instruction (j, jal, jr, etc.) is executed.
Example:
poly.s:
.global _start
.text
_start:
la s0, poly # load address of poly[0] into s0
la s1, res # load address of res[0] into s1
mv a0, zero # clear a0 by copying from the zero register
loop:
beq s0, s1, end # branch to end if s0 equals s1
lw t0, 0(s0) # load the value at s0[0] into t0
add a0, a0, t0 # add the value of t0 and a0 into a0
addi s0, s0, 4 # advance s0 by 4 bytes (one word)
j loop # jump to loop
end:
sw a0, 0(s1) # store the value of a0 into s1[0]
li a7, 93 # load 93 into a7 (syscall number for exit)
ecall # call exit, with a0 as the exit status
.data
poly: .word 5, 11, 3, 13, 7, 17 # list of edge lengths
res: .word 0 # result
If the above code is assembled, linked, and executed as follows…
$ as -o poly.o poly.s
$ ld -o poly poly.o
$ ./poly
… the program segfaults.
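For reference, a run that completes normally should be distinguishable from the crash by its exit status alone: a0 holds the sum of the words in poly when the exit syscall is made, so the status should be 56. The following is a minimal sketch for checking that, assuming the linked binary is named poly and sits in the current directory; a status of 139 would typically indicate termination by SIGSEGV (128 + 11).

```shell
# Expected exit status: the sum of the six words in poly,
# which the program leaves in a0 before calling exit.
expected=$((5 + 11 + 3 + 13 + 7 + 17))
echo "expected exit status: $expected"

# Only run the binary if it exists (assumed name: ./poly).
if [ -x ./poly ]; then
    status=0
    ./poly || status=$?
    # 139 typically means the process was killed by SIGSEGV (128 + 11).
    echo "actual exit status: $status"
fi
```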
Things I've tried:
- Running the above in Quartz under LifI in UTM (as intended)
- Manually installing Quartz on an ARM Linux host and running this from there
- Configuring my own RISC-V64 Debian Sid VM to run directly in UTM (no Quartz, pictured above)
- Cross-compiling from my host machine (macOS, ARM64) and transferring the compiled programs to each of the above setups for testing
Diagnosis:
Given that the problem persists, with the same results, even when neither Quartz nor the binutils packaged with Debian Sid is involved, I would conclude that this is an upstream problem with QEMU's RISC-V emulation. (The suggested alternatives using other emulators, such as JSLinux, do not share this problem.) Unfortunately, I do not have access to an Intel host to test on, so I can't rule out the host architecture as part of the problem.
Workaround:
Assembling the code with the -fPIC flag enabled results in the expected behavior. According to the gcc documentation for this flag:
"If supported for the target machine, emit position-independent code, suitable for dynamic linking and avoiding any limit on the size of the global offset table. This option makes a difference on AArch64, m68k, PowerPC and SPARC."
So, no mention of RISC-V, but I wouldn't be surprised if QEMU's implementation of relative offsets were buggy in this regard.
Credit goes to Michael Schrauber for discovering this workaround while facing a similar problem with branch instructions.
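The workaround can be sketched as the following command sequence. It assumes poly.s is in the current directory and that as and ld target RISC-V. The AS and LD variables are an illustrative convenience, not part of the original report: on a cross-compiling host they would need to point at the cross tools (for example riscv64-linux-gnu-as and riscv64-linux-gnu-ld, if those are the installed names).

```shell
# Assumption: AS/LD default to the native tools; override them for a cross toolchain.
AS="${AS:-as}"
LD="${LD:-ld}"
ASFLAGS="-fPIC"   # the workaround flag described above

# Only attempt the build if the source file is present.
if [ -f poly.s ]; then
    "$AS" $ASFLAGS -o poly.o poly.s &&
    "$LD" -o poly poly.o &&
    { status=0; ./poly || status=$?; echo "exit status: $status"; }
fi
```

With the workaround applied, the reported exit status should be 56 (the sum of the words in poly) rather than a crash.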
