Looks pretty good.
At 100Mbit, what happens if you set the interface itself to 100Mbit (using ethtool, or a device on the other end at 100Mbit) and run just fq_codel or cake, with no shaping? (Does the erx have BQL? 40ms of inherent buffering at 100Mbit sucks.)
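For a sense of scale, here is a rough back-of-the-envelope for how much data has to be queued to produce that 40ms. The function name is mine, and the 1500-byte packet size is an assumption (full-size Ethernet payload, framing overhead ignored):

```python
def buffered_bytes(rate_bits_per_s: int, delay_ms: float) -> float:
    """Bytes that must sit in a queue to add delay_ms of latency at this line rate."""
    # bits/s * ms / 1000 gives bits; divide by 8 for bytes
    return rate_bits_per_s * delay_ms / 8000

rate = 100_000_000  # 100 Mbit/s link
delay_ms = 40       # standing queue delay in milliseconds
mtu = 1500          # assumed packet size in bytes

backlog = buffered_bytes(rate, delay_ms)
print(f"{backlog / 1000:.0f} kB, about {backlog / mtu:.0f} full-size packets")
```

So roughly half a megabyte is sitting somewhere below the qdisc, which is the kind of backlog BQL exists to keep small.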
It looks like the erx peaks out at 1Gbit/400Mbit in the default offloaded path; shaped with fq_codel it starts to struggle at 120Mbit symmetric, and cake is about the same.