Assembling code with Jump instructions results in Segmentation Faults (on ARM64 and AMD64 hosts)

Problem: Assembling a RISC-V program with GNU binutils produces a binary that segfaults as soon as a jump instruction (j, jal, jr, etc.) is executed.

Example:

poly.s:

.global _start

.text
_start:
    la  s0, poly            # load address of poly[0] into s0
    la  s1, res             # load address of res[0] into s1
    mv  a0, zero            # set a0 to the value at zero (0)

loop:
    beq s0, s1, end         # branch to end if s0 equals s1
    lw  t0, 0(s0)           # load the value at s0[0] into t0
    add a0, a0, t0          # add the value of t0 and a0 into a0
    addi    s0, s0, 4       # add 4 to s0 into s0
    j   loop                # jump to loop

end:
    sw  a0, 0(s1)           # store the value of a0 into s1[0]

    li  a7, 93              # load 93 into a7 (exit syscall number)
    ecall                   # call exit

.data
poly:   .word   5, 11, 3, 13, 7, 17 # list of edge lengths
res:    .word   0                   # result

If the above code is compiled and executed as follows…

$ as -o poly.o poly.s
$ ld -o poly poly.o
$ ./poly

… the program segfaults:

segfault.png

Things I've tried:

  • Running the above in Quartz under LifI in UTM (as intended)
  • Manually installing Quartz on an ARM Linux host and running this from there
  • Configuring my own RISC-V64 Debian Sid VM to run directly in UTM (no Quartz, pictured above)
  • Cross-compiling from my host machine (macOS, ARM64) and transferring the compiled programs to each of the above setups for testing

Diagnosis:

Given that the problem persists, with the same results, even when neither Quartz nor the binutils packaged with Debian Sid is involved, I would conclude that this is an upstream problem in QEMU's RISC-V emulation. (The suggested alternative emulators, such as JSLinux, do not share this problem.) Unfortunately, I do not have access to an Intel host to test on, so I can't rule out the host architecture as a factor.
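One way to narrow this down further might be to compare what the toolchain emits with and without the -fPIC flag from the workaround below. This is a minimal sketch, assuming GNU binutils for RISC-V (as, ld, objdump) are on the PATH; the function name, the PREFIX variable and the _default/_pic file suffixes are made up for illustration.

```shell
# Build the same source twice (default and -fPIC) and diff the disassembly.
# $1: assembly source file, e.g. poly.s
compare_builds() {
    src="$1"
    stem="${src%.s}"
    pfx="${PREFIX:-}"    # hypothetical: cross-toolchain prefix, empty on a native host

    # Stop early if any assemble or link step fails.
    "${pfx}as" -o "${stem}_default.o" "$src" &&
    "${pfx}as" -fPIC -o "${stem}_pic.o" "$src" &&
    "${pfx}ld" -o "${stem}_default" "${stem}_default.o" &&
    "${pfx}ld" -o "${stem}_pic" "${stem}_pic.o" || return 1

    # The header lines always differ because they contain the file name.
    "${pfx}objdump" -d "${stem}_default" > "${stem}_default.dis"
    "${pfx}objdump" -d "${stem}_pic" > "${stem}_pic.dis"
    diff "${stem}_default.dis" "${stem}_pic.dis"
}
```

Calling `compare_builds poly.s` prints the instructions that differ between the two binaries. If `j loop` is encoded identically in both, the difference that matters presumably lies in some other instruction.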

Workaround:

Assembling the code with the -fPIC flag enabled results in the expected behavior. According to the GCC documentation,

If supported for the target machine, emit position-independent code, suitable for dynamic linking and avoiding any limit on the size of the global offset table. This option makes a difference on AArch64, m68k, PowerPC and SPARC.

So, no mention of RISC-V, but I wouldn't be surprised if QEMU's implementation of relative offsets were bugged in this regard.
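For reference, the workaround amounts to something like the following. It is a sketch under the same assumptions as the commands above (GNU as and ld targeting RISC-V); the function name and the PREFIX variable are hypothetical.

```shell
# Assemble with -fPIC, then link as before.
# $1: assembly source file, e.g. poly.s
build_pic() {
    src="$1"
    stem="${src%.s}"
    pfx="${PREFIX:-}"    # hypothetical: cross-toolchain prefix, empty on a native host

    "${pfx}as" -fPIC -o "${stem}.o" "$src" &&
    "${pfx}ld" -o "${stem}" "${stem}.o"
}
```

Running `build_pic poly.s && ./poly; echo $?` should print 56 if the program runs to completion, since a0 still holds the sum 5 + 11 + 3 + 13 + 7 + 17 = 56 when the exit syscall is made.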

Credit goes to Michael Schrauber for discovering this workaround while facing a similar problem with branching instructions.

Edited by Alberto R. González-Marín