Here is my temporary quick-and-dirty workaround that, surprisingly, seems to work quite reliably.
The script is tailored to my specific scenario, so you can't just copy/paste it, but it's easy to modify for other setups.
My scenario:
- fail-over only
- primary WAN on eth0
- secondary (fail-over-only) WAN on eth1
My normal routing table after boot, when both WANs are reachable (note the two default routes via eth0 and eth1 and their distances, 10 and 20):
# show ip route
Codes: K - kernel, C - connected, S - static, R - RIP, B - BGP
       O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       > - selected route, * - FIB route, p - stale info

IP Route Table for VRF "default"
S    *> 0.0.0.0/0 [10/0] via 135.23.39.1, eth0
S       0.0.0.0/0 [20/0] via 172.16.16.1 eth1
S    *> 10.10.8.0/22 [1/0] via 172.16.16.1, eth1
C    *> 127.0.0.0/8 is directly connected, lo
C    *> 135.23.39.0/24 is directly connected, eth0
C    *> 172.16.16.0/24 is directly connected, eth1
S    *> 172.17.0.0/16 [1/0] via 192.168.33.33, switch0
S    *> 192.168.1.0/24 [1/0] via 172.16.16.5, eth1
C    *> 192.168.33.0/24 is directly connected, switch0
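To check that the distance really changes after a transition, it can help to pull just the default routes out of that output. Here is a minimal sketch, assuming the line layout shown above; the function name default_route_distances is my own invention and not part of EdgeOS.

```shell
#!/bin/bash
# Hypothetical helper (not part of EdgeOS): read "show ip route" output on
# stdin and print "<interface> <distance>" for every 0.0.0.0/0 entry.
# Assumes lines shaped like:  S *> 0.0.0.0/0 [10/0] via 135.23.39.1, eth0
default_route_distances() {
    awk '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "0.0.0.0/0") {
                    d = $(i + 1)          # e.g. "[10/0]"
                    sub(/^\[/, "", d)     # drop the leading bracket
                    sub(/\/.*$/, "", d)   # drop "/metric]" and keep the distance
                    print $NF, d          # last field is the interface name
                    break
                }
            }
        }
    '
}
```

Inside the transition script this could be fed with something like `$run show ip route | default_route_distances`, using the same `$run` wrapper as the script itself.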
Now, here's my custom transition script, /config/scripts/wlb-transition.sh. It changes the distance of the eth1 default route to 5 when eth1 becomes active and back to 20 when it returns to the failover (standby) state. It calls a separate script to change the distance, because I was not able to find out how to do this with native Linux commands (any help here is appreciated; the way I have it is really slow).
#!/bin/bash
# Called by the load-balance group with: <group> <interface> <status>
GROUP=$1
INTF=$2
STATUS=$3
MYLOG="/var/log/wlb"
TS=$(date +"%Y%m%d-%T")
run=/opt/vyatta/bin/vyatta-op-cmd-wrapper
INTFDSCR=$($run show interfaces | grep "$INTF" | awk '{print $4}')

# Flush connection tracking and re-add connected routes on every transition
/usr/sbin/conntrack -F
/usr/sbin/ubnt-add-connected.pl

case "$STATUS" in
  active)
    msg="$TS: Internet connection $GROUP:$INTF:$INTFDSCR is active."
    # Change eth1 to shortest distance ... fix local routing
    if [ "$INTF" = "eth1" ]
    then
      /config/scripts/wlb-change-distance.sh 5
    fi
    # Email sysadmin when interface becomes active
    echo -e "$msg\n\n$(uptime)\n\nLast 10 fail-over events\n$(grep wlb: /var/log/messages|tail -n 10)" \
      | mailx -r "noreply@mydomain.com" \
        -s "Router $(hostname) WAN fail-over event" \
        -S smtp="smtp.gmail.com:587" \
        -S smtp-use-starttls \
        -S smtp-auth=login \
        -S smtp-auth-user="noreply@mydomain.com" \
        -S smtp-auth-password="***" \
        -S ssl-verify=ignore support@mydomain.com
    ;;
  inactive)
    msg="$TS: Internet connection $GROUP:$INTF:$INTFDSCR is inactive."
    ;;
  failover)
    msg="$TS: Internet connection $GROUP:$INTF:$INTFDSCR is failover."
    # Change eth1 to longest distance ... fix local routing
    if [ "$INTF" = "eth1" ]
    then
      /config/scripts/wlb-change-distance.sh 20
      /usr/sbin/conntrack -F
    fi
    ;;
  *)
    msg="$TS: Oh crap, $GROUP:$INTF:$INTFDSCR going [$STATUS]"
    ;;
esac

echo "$msg" >> "$MYLOG"
logger "$msg"
exit 0
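If you adapt this to another interface or other distances, the decision of which distance goes with which status can be pulled into one small function. This is only a sketch of the logic in the script above; the name distance_for_event is hypothetical, and the values 5 and 20 and the interface eth1 are simply the ones from my setup.

```shell
#!/bin/bash
# Hypothetical helper: print the distance the eth1 default route should get
# for a given interface/status pair, or print nothing when no change is needed.
distance_for_event() {
    local intf=$1 status=$2
    # Only the fail-over WAN (eth1 in this scenario) ever has its distance changed
    [ "$intf" = "eth1" ] || return 0
    case "$status" in
        active)   echo 5 ;;   # eth1 carries traffic: make its default route preferred
        failover) echo 20 ;;  # eth1 back on standby: restore the original distance
    esac
}
```

The transition script could then do `dist=$(distance_for_event "$INTF" "$STATUS")` and call /config/scripts/wlb-change-distance.sh only when `$dist` is non-empty.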
And here is /config/scripts/wlb-change-distance.sh:
#!/bin/vbash
DIST=$1
source /opt/vyatta/etc/functions/script-template
configure
set protocols static route 0.0.0.0/0 next-hop 172.16.16.1 distance $DIST
commit
# save
exit
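Since the distance script passes its argument straight into a configuration commit, a cheap sanity check before entering configure mode may be worthwhile. Below is a minimal sketch; the function name valid_distance is hypothetical, and the 1 to 255 range is my assumption about what the static route distance accepts, so verify it on your firmware.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if $1 is a plain integer in 1..255.
# The 1..255 range is an assumption, not taken from the scripts above.
valid_distance() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
    esac
    # Length check first, so absurdly long numbers never reach the comparison
    [ "${#1}" -le 3 ] && [ "$1" -ge 1 ] && [ "$1" -le 255 ]
}
```

In wlb-change-distance.sh this could be used as `valid_distance "$DIST" || exit 1` right after DIST is assigned.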
...as I said, quick and dirty, but it does the trick. Locally sourced traffic is now properly routed on fail-over.
And don't forget to apply this FIX too.