Skip to content
Commit 8e256b33 authored by Nicolas Pitre's avatar Nicolas Pitre Committed by Christopher Friedt
Browse files

scripts: gen_syscalls: fix argument marshalling with 64-bit debug builds



Let's consider this (simplified) compilation result of a debug build
using -O0 for riscv64:

|__pinned_func
|static inline int k_sem_init(struct k_sem * sem,
|                             unsigned int initial_count,
|                             unsigned int limit)
|{
|    80000ad0:   6105                    addi    sp,sp,32
|    80000ad2:   ec06                    sd      ra,24(sp)
|    80000ad4:   e42a                    sd      a0,8(sp)
|    80000ad6:   c22e                    sw      a1,4(sp)
|    80000ad8:   c032                    sw      a2,0(sp)
|        ret = arch_is_user_context();
|    80000ada:   b39ff0ef                jal     ra,80000612
|        if (z_syscall_trap()) {
|    80000ade:   c911                    beqz    a0,80000af2
|                return (int) arch_syscall_invoke3(*(uintptr_t *)&sem,
|                                    *(uintptr_t *)&initial_count,
|                                    *(uintptr_t *)&limit,
|                                    K_SYSCALL_K_SEM_INIT);
|    80000ae0:   6522                    ld      a0,8(sp)
|    80000ae2:   00413583                ld      a1,4(sp)
|    80000ae6:   6602                    ld      a2,0(sp)
|    80000ae8:   0b700693                li      a3,183
|    [...]

We clearly see the 32-bit values `initial_count` (a1) and `limit` (a2)
being stored in memory with the `sw` (store word) instruction. Then,
according to the source code, the address of those values is casted
as a pointer to uintptr_t values, and that pointer is dereferenced to
get back those values with the `ld` (load double) instruction this time.

In other words, the assembly does exactly what the C code indicates.
This is wrong for 2 reasons:

- The top half of a1 and a2 will contain garbage due to the `ld` used
  to retrieve them. Whether or not the top bits will be cleared
  eventually depends on the architecture and compiler.
- Regardless of the above, a1 and a2 would be plain wrong on a big
  endian system.
- The load of a1 will cause a misaligned trap as it is 4-byte aligned
  while `ld` expects a 8-byte alignment.

The above code happens to work properly when compiling with
optimizations enabled as the compiler simplifies the cast and
dereference away, and register content is used as is in that case.
That doesn't make the code any more "correct" though.

The reason for taking the address of an argument and dereference it as an
uintptr_t pointer is most likely done to work around the fact that the
compiler refuses to cast an aggregate value to an integer, even if that
aggregate value is in fact a simple structure wrapping an integer.

So let's fix this code by:

- Removing the pointer dereference roundtrip and associated casts. This
  gets rid of all the issues listed above.
- Using a union to perform the type transition which deals with
  aggregates perfectly well. The compiler does optimize things to the
  same assembly output in the end.

This also makes the compiler happier as those pragmas to shut up warnings
are no longer needed. It should be the same about coverity.

Signed-off-by: default avatarNicolas Pitre <npitre@baylibre.com>
(cherry picked from commit 1db5c8b9)
parent 74f27607
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment