Resolve NIF addresses in Address Sanitizer backtrace

Debugging a NIF with ASan as specified in the docs produces logs with backtraces. Frames in statically linked libs are symbolized with the C source file and line number. However, frames in dynamically linked libs, including the NIF, only show the library path and a hexadecimal offset, as in the following log:

=================================================================
==428422==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7e1ccf3ca220 at pc 0x7f8cce4e39ea bp 0x7b8cb162a870 sp 0x7b8cb162a040
WRITE of size 262192 at 0x7e1ccf3ca220 thread T4
    #0 0x7f8cce4e39e9 in memset (/lib64/libasan.so.8+0xe39e9) (BuildId: 10b8ccd49f75c21babf1d7abe51bb63589d8471f)
    #1 0x7b8c9c71255c  (priv/lz4_nif.so+0x1f55c) (BuildId: 72ef0b282d9aedda10dff70cc27f64796b44914d)
    #2 0x7b8c9c7125b0  (priv/lz4_nif.so+0x1f5b0) (BuildId: 72ef0b282d9aedda10dff70cc27f64796b44914d)
    #3 0x7b8c9c6f39dc  (priv/lz4_nif.so+0x9dc) (BuildId: 72ef0b282d9aedda10dff70cc27f64796b44914d)
    #4 0x00000068f3d3 in beam_jit_call_nif(process*, void const*, unsigned long*, unsigned long (*)(enif_environment_t*, int, unsigned long*), erl_module_nif*) beam/jit/beam_jit_common.cpp:645
    #5 0x7b8cb58f3756  (/memfd:vmem (deleted)+0x756)

0x7e1ccf3ca220 is located 0 bytes after 16416-byte region [0x7e1ccf3c6200,0x7e1ccf3ca220)
allocated by thread T4 here:
    #0 0x7f8cce4e6f2b in malloc (/lib64/libasan.so.8+0xe6f2b) (BuildId: 10b8ccd49f75c21babf1d7abe51bb63589d8471f)
    #1 0x000000cbecfd in erts_sys_alloc sys/unix/sys.c:999
    #2 0x000000c1c65c in erts_alloc_fnf beam/erl_alloc.h:290
    #3 0x000000c1c65c in enif_alloc beam/erl_nif.c:616
    #4 0x7b8c9c6f39ba  (priv/lz4_nif.so+0x9ba) (BuildId: 72ef0b282d9aedda10dff70cc27f64796b44914d)
    #5 0x00000068f3d3 in beam_jit_call_nif(process*, void const*, unsigned long*, unsigned long (*)(enif_environment_t*, int, unsigned long*), erl_module_nif*) beam/jit/beam_jit_common.cpp:645
    #6 0x7b8cb58f3756  (/memfd:vmem (deleted)+0x756)

Thread T4 created by T0 here:
    #0 0x7f8cce4de492 in pthread_create (/lib64/libasan.so.8+0xde492) (BuildId: 10b8ccd49f75c21babf1d7abe51bb63589d8471f)
    #1 0x000000ee9933 in ethr_thr_create pthread/ethread.c:403

SUMMARY: AddressSanitizer: heap-buffer-overflow (priv/lz4_nif.so+0x1f55c) (BuildId: 72ef0b282d9aedda10dff70cc27f64796b44914d) 
Shadow bytes around the buggy address:
  0x7e1ccf3c9f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7e1ccf3ca000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7e1ccf3ca080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7e1ccf3ca100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7e1ccf3ca180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x7e1ccf3ca200: 00 00 00 00[fa]fa fa fa fa fa fa fa fa fa fa fa
  0x7e1ccf3ca280: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x7e1ccf3ca300: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x7e1ccf3ca380: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x7e1ccf3ca400: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x7e1ccf3ca480: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
=================================================================
==428422==ERROR: AddressSanitizer: SEGV on unknown address 0xffffffffffffffff (pc 0x7b8cb5fd3d8f bp 0x7dfccd2fd878 sp 0x7dfccd2fd848 T4)
==428422==The signal is caused by a READ memory access.
    #0 0x7b8cb5fd3d8f  (/memfd:vmem (deleted)+0x6e0d8f)

==428422==Register values:
rax = 0x0000000000000000  rbx = 0x00007b8cb162ab00  rcx = 0xf5f5f5f5f5f5f5f5  rdx = 0x00007b8cb5fd3d80  
rdi = 0x0000000000000080  rsi = 0x0000000000000000  rbp = 0x00007dfccd2fd878  rsp = 0x00007dfccd2fd848  
 r8 = 0x0000000000000000   r9 = 0x0000000000000014  r10 = 0x0000000000000000  r11 = 0x00007e1ccf3f3325  
r12 = 0x0000000000000001  r13 = 0x00007cfccd008b80  r14 = 0x0000000000000f3e  r15 = 0x00007dfccd2fbc48  
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/memfd:vmem (deleted)+0x6e0d8f) 
==428422==ABORTING

These offsets can be symbolized with addr2line or llvm-symbolizer:

$ addr2line -e priv/lz4_nif.so 0x1f55c
/home/bphilip/GitHub/lz4-erlang/c_src/lz4hc.c:1583
$ llvm-symbolizer -pse priv/lz4_nif.so 0x1f55c
LZ4_initStreamHC at lz4hc.c:1583:7
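
To see which frames still need resolving before running either tool, the library and offset pairs can be pulled out of the log first. This is only a minimal sketch, assuming the frame layout shown in the log above; the function name asan_unresolved_frames is my own and not part of any tool:

```shell
# Print the unique "library offset" pairs of frames that ASan left unsymbolized.
# Reads an ASan log on stdin. Assumes such frames look like
#   "    #1 0x7b8c9c71255c  (priv/lz4_nif.so+0x1f55c) (BuildId: ...)"
# and that library paths contain no parentheses.
asan_unresolved_frames() {
    sed -nE 's/^ *#[0-9]+ 0x[0-9a-f]+ +\(([^()]*\.so[.0-9]*)\+(0x[0-9a-f]+)\).*$/\1 \2/p' |
        sort -u
}
```

Each output line can then be handed to the symbolizer, for example with asan_unresolved_frames < LOGFILE | while read -r lib off; do llvm-symbolizer -pse "$lib" "$off"; done (this loop additionally assumes the paths contain no spaces). Frames that already carry a function name, such as the libasan ones, and the /memfd:vmem frames are skipped.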

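After hours of trial and error, I’ve come up with the following sed script to symbolize them after ASan has written the log. It only works with GNU sed, because it relies on the e flag of the s command to run the replacement as a shell command: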

sed -E 's/^([^(]*)\((.*\.so)\+(.{1,10})\)(.*)$/echo -n "\1"; llvm-symbolizer -pse \2 \3 | tr -d "\n"; echo -n "\4"/e' asan/beam.asan.smp-0-tc-0007-lz4_nif_SUITE-compress_hc_dest_size.428422

To produce the following output:

=================================================================
==428422==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7e1ccf3ca220 at pc 0x7f8cce4e39ea bp 0x7b8cb162a870 sp 0x7b8cb162a040
WRITE of size 262192 at 0x7e1ccf3ca220 thread T4
    #0 0x7f8cce4e39e9 in memset (/lib64/libasan.so.8+0xe39e9) (BuildId: 10b8ccd49f75c21babf1d7abe51bb63589d8471f)
    #1 0x7b8c9c71255c  LZ4_initStreamHC at lz4hc.c:1583:7 (BuildId: 72ef0b282d9aedda10dff70cc27f64796b44914d)
    #2 0x7b8c9c7125b0  LZ4_compress_HC_destSize at lz4hc.c:1540:33 (BuildId: 72ef0b282d9aedda10dff70cc27f64796b44914d)
    #3 0x7b8c9c6f39dc  compress_hc_dest_size at lz4_nif.c:210:24 (BuildId: 72ef0b282d9aedda10dff70cc27f64796b44914d)
    #4 0x00000068f3d3 in beam_jit_call_nif(process*, void const*, unsigned long*, unsigned long (*)(enif_environment_t*, int, unsigned long*), erl_module_nif*) beam/jit/beam_jit_common.cpp:645
    #5 0x7b8cb58f3756  (/memfd:vmem (deleted)+0x756)

0x7e1ccf3ca220 is located 0 bytes after 16416-byte region [0x7e1ccf3c6200,0x7e1ccf3ca220)
allocated by thread T4 here:
    #0 0x7f8cce4e6f2b in malloc (/lib64/libasan.so.8+0xe6f2b) (BuildId: 10b8ccd49f75c21babf1d7abe51bb63589d8471f)
    #1 0x000000cbecfd in erts_sys_alloc sys/unix/sys.c:999
    #2 0x000000c1c65c in erts_alloc_fnf beam/erl_alloc.h:290
    #3 0x000000c1c65c in enif_alloc beam/erl_nif.c:616
    #4 0x7b8c9c6f39ba  compress_hc_dest_size at lz4_nif.c:207:16 (BuildId: 72ef0b282d9aedda10dff70cc27f64796b44914d)
    #5 0x00000068f3d3 in beam_jit_call_nif(process*, void const*, unsigned long*, unsigned long (*)(enif_environment_t*, int, unsigned long*), erl_module_nif*) beam/jit/beam_jit_common.cpp:645
    #6 0x7b8cb58f3756  (/memfd:vmem (deleted)+0x756)

Thread T4 created by T0 here:
    #0 0x7f8cce4de492 in pthread_create (/lib64/libasan.so.8+0xde492) (BuildId: 10b8ccd49f75c21babf1d7abe51bb63589d8471f)
    #1 0x000000ee9933 in ethr_thr_create pthread/ethread.c:403

SUMMARY: AddressSanitizer: heap-buffer-overflow LZ4_initStreamHC at lz4hc.c:1583:7 (BuildId: 72ef0b282d9aedda10dff70cc27f64796b44914d) 
Shadow bytes around the buggy address:
  0x7e1ccf3c9f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7e1ccf3ca000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7e1ccf3ca080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7e1ccf3ca100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7e1ccf3ca180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x7e1ccf3ca200: 00 00 00 00[fa]fa fa fa fa fa fa fa fa fa fa fa
  0x7e1ccf3ca280: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x7e1ccf3ca300: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x7e1ccf3ca380: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x7e1ccf3ca400: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x7e1ccf3ca480: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
=================================================================
==428422==ERROR: AddressSanitizer: SEGV on unknown address 0xffffffffffffffff (pc 0x7b8cb5fd3d8f bp 0x7dfccd2fd878 sp 0x7dfccd2fd848 T4)
==428422==The signal is caused by a READ memory access.
    #0 0x7b8cb5fd3d8f  (/memfd:vmem (deleted)+0x6e0d8f)

==428422==Register values:
rax = 0x0000000000000000  rbx = 0x00007b8cb162ab00  rcx = 0xf5f5f5f5f5f5f5f5  rdx = 0x00007b8cb5fd3d80  
rdi = 0x0000000000000080  rsi = 0x0000000000000000  rbp = 0x00007dfccd2fd878  rsp = 0x00007dfccd2fd848  
 r8 = 0x0000000000000000   r9 = 0x0000000000000014  r10 = 0x0000000000000000  r11 = 0x00007e1ccf3f3325  
r12 = 0x0000000000000001  r13 = 0x00007cfccd008b80  r14 = 0x0000000000000f3e  r15 = 0x00007dfccd2fbc48  
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/memfd:vmem (deleted)+0x6e0d8f) 
==428422==ABORTING

This solution is less than ideal: it takes multiple steps, the output style differs between dynamic libs and static libs, and the regex doesn’t match every dynamic library.

How do you get ASan to symbolize the backtrace into the NIF (and other dynamic libs) by default?

Edit: I now use the following alias to edit in place:

alias symbolize="sed -i -E 's/^([^(]*) \((.*\.so)\+(.{1,10})\)(.*)$/echo -n \"\1in \"; llvm-symbolizer -pse \2 \3 | tr -d \"\\\n\"/e'"

This produces cleaner output than the previous version and edits the given file(s) in place.
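
Some of the limitations mentioned earlier (GNU-only sed, versioned names such as .so.8 not matching) could probably be worked around with a plain shell loop instead of the sed e flag. The following is only a sketch under assumptions: symbolize_asan_log and the SYMBOLIZER override are names I made up, it reads the log on stdin and writes to stdout instead of editing in place, it only rewrites stack frame lines (not the SUMMARY line), and like the alias it drops the trailing BuildId.

```shell
# Rewrite unsymbolized ASan frames "  #N ADDR  (LIB+OFFSET) ..." as
# "  #N ADDR in FUNC at FILE:LINE:COL"; every other line is passed through.
# SYMBOLIZER is a hypothetical override; the default is the command used above.
symbolize_asan_log() {
    frame='^ *#[0-9]+ 0x[0-9a-f]+ +\([^()]*\.so[.0-9]*\+0x[0-9a-f]+\)'
    while IFS= read -r line; do
        if printf '%s\n' "$line" | grep -qE "$frame"; then
            head=${line%%"("*}      # "    #1 0x7b8c9c71255c  "
            rest=${line#*"("}       # "priv/lz4_nif.so+0x1f55c) (BuildId: ...)"
            target=${rest%%")"*}    # "priv/lz4_nif.so+0x1f55c"
            lib=${target%+*}        # "priv/lz4_nif.so"
            off=${target##*+}       # "0x1f55c"
            # Left unquoted on purpose so the default splits into command and flags.
            loc=$(${SYMBOLIZER:-llvm-symbolizer -pse} "$lib" "$off" </dev/null | tr -d '\n')
            # Drop one of the two spaces so the result matches the static frames.
            printf '%sin %s\n' "${head% }" "$loc"
        else
            printf '%s\n' "$line"
        fi
    done
}
```

Usage would be something like symbolize_asan_log < LOGFILE > LOGFILE.sym, run from the directory in which the relative library paths (priv/lz4_nif.so) resolve. It spawns one grep per line, so it is slower than the sed version on large logs.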
