GRiSP Alloy Goes Open, But GCC 14 Crashed the Party!
Ever had one of those days when you decide to upgrade everything: new tools, shiny kernels, and the latest Ubuntu named after some groovy animal? Because, hey, why not? Sounds straightforward, right? Tell that to the gremlins lurking in our toolchain. We learned a thing or two on our wild ride releasing GRiSP Alloy!
Introducing GRiSP Alloy
GRiSP Alloy (formerly GRiSP Linux builder) began its life inspired by the Nerves project, aiming to streamline our work with the GRiSP 2 boards. As our roadmap grew bolder, a rebrand felt right. Now Alloy sits proudly beside GRiSP Metal (RTEMS) with GRiSP Foundry (our Yocto-based system) quietly brewing in the background. Cool names, cooler tech, and plenty to look forward to.
Spring-Cleaning the Stack
Our big goal: make GRiSP Alloy fully open. No sweat, right?
First things first: those ancient dependencies had to go. Ubuntu Bionic Beaver, seriously? We upgraded all the way to Ubuntu Noble Numbat (from dams to termites!). Crosstool-NG? From prehistoric to the latest stable release. Buildroot? Swung from a dusty 2020.08 to a shiny 2025.05. Our kernel saw a modest but useful upgrade from version 4.19 to 5.15, specifically chosen for compatibility with our hardware. OTP hit version 28, and a dozen other bits and pieces (erlinit, fwup, you name it) got a facelift too.
Good stuff, right? Well...
Crash Course in Core Dumps
After flashing the new firmware, OTP crashed spectacularly with a cryptic core dump, every developer's favorite nightmare.
Booting entry 'sdcard'
mmc0: detected SD card version 2.0
mmc0: registered mmc0
Booting entry 'Grisp2 A (/mnt/mmc0.0/loader/entries/boot.conf)'
blspec: booting Grisp2 A from mmc0
Loading ARM Linux zImage '/mnt/mmc0.0/zImage.a'
Loading devicetree from '/mnt/mmc0.0/oftree'
commandline: console=ttymxc0,115200n8 root=/dev/mmcblk0p2 rootfstype=squashfs rootwait
Starting kernel in secure mode
[ 0.250789] mdio_bus 2188000.ethernet-1: MDIO device at address 2 is missing.
[nbtty: terminating]
[ 2.264954] reboot: Restarting system
The kernel booted without issues, and we could launch it manually from the barebox CLI. With a few nifty erlinit tweaks, we got ourselves a shell and started sleuthing.
Enable core dumps in ramfs? Check. Get OTP to crash again? Easy. Core dump neatly stashed in /tmp? Naturally.
Hit m for menu or any to stop autoboot: 2
barebox@GRiSP2:/ global linux.bootargs.extra="loglevel=8 ignore_loglevel initcall_debug panic=-1 -v --run-on-exit /bin/sh --hang-on-exit"
barebox@GRiSP2:/ boot
...
[ 4.951665] erlinit: Launching erl...
[ 5.063493] erlinit: reaped pid 55
[nbtty: terminating]
[ 5.279872] erlinit: reaped pid 53
[ 5.311362] erlinit: Erlang VM exited
[ 5.315132] erlinit: run_cmd '/bin/sh'
# dmesg -n 8
# echo 0 > /proc/sys/kernel/printk_ratelimit
# ulimit -c unlimited
# echo '/tmp/core.%e.%p' > /proc/sys/kernel/core_pattern
# PATH=/usr/sbin:/usr/bin:/sbin:/bin ROOTDIR=/srv/erlang BINDIR=/srv/erlang/erts-16.0.1/bin EMU=beam PROGNAME=erlexec RELEASE_SYS_CONFIG=/srv/erlang/releases/0.0.1/sys RELEASE_ROOT=/srv/erlang RELEASE_TMP=/tmp LANG=en_US.UTF-8 LANGUAGE=en ERL_INETRC=/etc/erl_inetrc ERL_CRASH_DUMP=/tmp/erl_crash.dump /usr/bin/nbtty /srv/erlang/erts-16.0.1/bin/erlexec -config /srv/erlang/releases/0.0.1/sys.config -boot /srv/erlang/releases/0.0.1/no_dot_erlang -args_file /srv/erlang/releases/0.0.1/vm.args -boot_var RELEASE_LIB /srv/erlang/lib
[nbtty: terminating]
# ls /tmp
core.beam.smp.63
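The file name in the listing above falls out of the template written to /proc/sys/kernel/core_pattern. As a rough illustration, the Python sketch below mimics how the kernel expands the two specifiers used here, %e (the command name, capped at 15 characters) and %p (the PID), as described in the core(5) man page. The function name expand_core_pattern is made up for this example, and the other specifiers the kernel supports are deliberately left out.

```python
def expand_core_pattern(pattern, exe, pid):
    """Expand %e, %p and %% in a core_pattern template (illustrative only)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        # Look at the character after '%'; a lone trailing '%' is dropped.
        spec = pattern[i + 1:i + 2]
        if spec == "e":
            # The command name is limited to 15 characters.
            out.append(exe[:15])
        elif spec == "p":
            out.append(str(pid))
        elif spec == "%":
            out.append("%")
        # Any other specifier is not handled in this sketch and is dropped.
        i += 2
    return "".join(out)


print(expand_core_pattern("/tmp/core.%e.%p", "beam.smp", 63))
# prints: /tmp/core.beam.smp.63
```

Putting both the command name and the PID in the template keeps dumps from repeated crashes from overwriting each other, which is handy when you provoke the same crash several times in a row.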
Giddy with a mix of excitement and mild frustration, we hauled that elusive core file back to our dev machine with a quick-and-dirty SCP trick.
# udhcpc -i eth0 -p /tmp/udhcpc.pid
# scp -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null /tmp/core.* username@XX.XX.XX.XX:/tmp
Back in our trusty Vagrant VM, armed with GDB, we analyzed the core dump meticulously:
~/grisp_alloy$ cp /tmp/core.beam.smp.63 artefacts
~/grisp_alloy$ vagrant ssh
vagrant@vagrant:~$ CORE=/home/vagrant/artefacts/core.beam.smp.63
vagrant@vagrant:~$ EXE=/home/vagrant/_build/firmware/projects/hello_grisp/_build/default/rel/hello_grisp/erts-16.0.1/bin/beam.smp
vagrant@vagrant:~$ GDB=/opt/grisp_linux_sdk/0.2.0/grisp2/0.2.0/host/bin/armv7-unknown-linux-gnueabihf-gdb
vagrant@vagrant:~$ $GDB $EXE $CORE
GNU gdb (crosstool-NG UNKNOWN) 15.1
...
Core was generated by '/srv/erlang/erts-16.0.1/bin/beam.smp -- -root /srv/erlang -bindir /srv/erlang/e'.
Program terminated with signal SIGILL, Illegal instruction.
#0 0x007fdcb4 in ethr_dw_atomic_read_acqb ()
(gdb) bt
#0 0x007fdcb4 in ethr_dw_atomic_read_acqb ()
#1 0x00511114 in erts_alcu_start ()
#2 0x00691cb8 in erts_aoffalc_start ()
#3 0x004f564c in start_au_allocator ()
#4 0x004f8400 in erts_alloc_init ()
#5 0x00515c44 in erl_start ()
#6 0x00498df4 in main ()
(gdb) x/6i $pc-12
0x7fdca8 <ethr_dw_atomic_read_wb+12>: strd r2, [r1]
0x7fdcac <ethr_dw_atomic_read_wb+16>: bx lr
0x7fdcb0 <ethr_dw_atomic_read_acqb>: and r3, r0, #7
=> 0x7fdcb4 <ethr_dw_atomic_read_acqb+4>: ldrd r2, [r0, r3]
0x7fdcb8 <ethr_dw_atomic_read_acqb+8>: strd r2, [r1]
0x7fdcbc <ethr_dw_atomic_read_acqb+12>: dmb sy
(gdb) quit
Surprise! OTP was choking on an unpredictable ldrd instruction: it loads into the register pair r2/r3 while also using r3 as the offset register, a combination the ARM architecture labels UNPREDICTABLE… why, oh why, GCC? After some frantic googling (we have all been there), we found a GCC patch that looked suspiciously perfect: PR117675. Great! Except we were on GCC 14, not 15. Updating would push us onto unreleased Crosstool-NG and Buildroot masters, and nobody wants to mess with unreleased masters.
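To make the problem with that ldrd concrete, here is a small Python sketch that checks a disassembled A32 register-offset ldrd against the operand rules as we understand them from the ARMv7-A Architecture Reference Manual: the first destination must be even, the second must not be pc, and the offset register must be neither pc nor one of the two destinations. The helper name ldrd_register_problems and the regular expression are our own invention for illustration. It only understands plain rN register names, and it ignores write-back and post-indexed forms.

```python
import re

# Register-offset form only, e.g. "ldrd r2, [r0, r3]" as gdb prints it.
# An explicit second destination ("ldrd r2, r3, [r0, r3]") is also accepted.
LDRD_REG = re.compile(
    r"ldrd\s+r(\d+),\s*(?:r\d+,\s*)?\[r(\d+),\s*-?r(\d+)\]"
)


def ldrd_register_problems(insn):
    """Return a list of operand problems for an A32 'ldrd Rt, [Rn, Rm]'."""
    m = LDRD_REG.search(insn)
    if m is None:
        return []  # not a register-offset ldrd; nothing to check here
    rt, _rn, rm = (int(g) for g in m.groups())
    rt2 = rt + 1  # the second destination is implicitly Rt + 1
    problems = []
    if rt % 2:
        problems.append(f"first destination r{rt} is odd")
    if rt2 == 15:
        problems.append("second destination would be pc")
    if rm == 15:
        problems.append("offset register is pc")
    if rm in (rt, rt2):
        problems.append(
            f"offset register r{rm} overlaps destination pair r{rt}/r{rt2}"
        )
    return problems


print(ldrd_register_problems("ldrd r2, [r0, r3]"))
# prints: ['offset register r3 overlaps destination pair r2/r3']
```

Run over the faulting line from the backtrace, the check flags the overlap between the offset register and the destination pair, while a harmless variant such as "ldrd r2, [r0, r4]" comes back clean.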
Undaunted, we tried fixing GCC 14. Because Crosstool-NG manages its own set of GCC patches, we first had to patch Crosstool-NG to include our GCC fix. Essentially, a patch to add another patch. This meta-patch laughed at us. Hours of debugging later, a quick source code inspection confirmed the GCC patch was applied correctly, yet Crosstool-NG still stubbornly refused to build the toolchain. Cue existential despair and head-scratching; sometimes even the best-applied patches can't save you.
Meanwhile, the Nerves project was happily compiling with GCC 13. Maybe they knew something we didn't. One long sigh, two coffees with milk, a downgrade to GCC 13, a full rebuild… and guess what?
Booting entry 'sdcard'
mmc0: detected SD card version 2.0
mmc0: registered mmc0
Booting entry 'Grisp2 A (/mnt/mmc0.0/loader/entries/boot.conf)'
blspec: booting Grisp2 A from mmc0
Loading ARM Linux zImage '/mnt/mmc0.0/zImage.a'
Loading devicetree from '/mnt/mmc0.0/oftree'
commandline: console=ttymxc0,115200n8 root=/dev/mmcblk0p2 rootfstype=squashfs rootwait
Starting kernel in secure mode
[ 0.249061] mdio_bus 2188000.ethernet-1: MDIO device at address 2 is missing.
Erlang/OTP 28 [erts-16.0.1] [source] [32-bit] [smp:1:1] [ds:1:1:10] [async-threads:1]
Eshell V16.0.1 (press Ctrl+G to abort, type help(). for help)
(hello_grisp@127.0.0.1)1>
Lessons Learned
Being on the bleeding edge isn't always the smartest move. Sometimes you gotta slow your roll and stick with the known good.
That's how GRiSP Alloy made its grand entrance, bumps, bruises, and surprise core dumps included. Would we do it again? In a heartbeat, but next time with fewer midnight patches.