Not just another BGP Hijack
On 1 April 2020, many networks witnessed a massive BGP hijack by AS12389 (Rostelecom). In this post, we’ll dig into the details of this routing incident and explain why implementing strict network filtering practices and adhering to the MANRS principles is vital to a secure and resilient global Internet routing system.
BGP hijacks are sadly common, but most are very short-lived and don’t create service disruptions on a global level. Most (not all) routing incidents happen because of configuration mistakes, but as we have learned over and over again, implementing strict filtering drastically reduces the chance that these mistakes will propagate further into the network and cause additional disruption.
This week’s hijack unfortunately did create service disruptions for many around the globe.
A brief overview was covered by this QRATOR Labs blog post. QRATOR Labs provides Network Analytics and Monitoring services called Radar.
As per the QRATOR blog post, the first route mis-origination (where a network originates routes for prefixes that are not actually part of its network) was recorded at 19:28 UTC on 1 April 2020. They showed a screenshot from their monitoring system with a list of IP prefixes that were hijacked by AS12389 (Rostelecom):
When I looked up each of the prefixes above in Route Views, I didn’t find a hit for any of them. BGPMon also provided a list containing more than 8800 prefixes.
But the mass-scale hijack did happen: it was reported by many prominent members of the community, and it impacted at least Amazon and Akamai that we know of. The blog post also included a RIPE BGPlay link showing 220.127.116.11/24.
To get more information, I looked up all the announcements from AS12389 with the AS_PATH “20764 12389”. I used Isolario.it bgpdump, as it has more peers than route-views. Instantly there were several hits (4569 unique announcements) and none of them belonged to AS12389. Hurricane Electric’s BGP Toolkit also noted a similar number (4567).
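The lookup described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the pipe-delimited one-line output of `bgpdump -m` (type, timestamp, A/W flag, peer IP, peer AS, prefix, AS_PATH, ...) and assuming that "AS_PATH 20764 12389" means a path ending in that sequence. The function name and the sample lines are hypothetical; the samples use documentation prefixes and private-use ASNs, not data from the incident.

```python
from typing import Iterable, Set


def prefixes_with_path_suffix(lines: Iterable[str], suffix: str) -> Set[str]:
    """Return the unique prefixes announced with an AS_PATH ending in `suffix`."""
    wanted = suffix.split()
    found = set()
    for line in lines:
        fields = line.rstrip("\n").split("|")
        # Assumed layout: type|timestamp|A or W|peer_ip|peer_as|prefix|as_path|...
        if len(fields) < 7 or fields[2] != "A":
            continue  # skip withdrawals and malformed lines
        path = fields[6].split()
        if path[-len(wanted):] == wanted:
            found.add(fields[5])
    return found


# Hypothetical sample input, not taken from the real dumps.
SAMPLE = [
    "BGP4MP|1585769300|A|198.51.100.1|64500|192.0.2.0/24|64500 174 20764 12389|IGP|198.51.100.1|0|0||NAG||",
    "BGP4MP|1585769301|A|198.51.100.2|64501|192.0.2.0/24|64501 20764 12389|IGP|198.51.100.2|0|0||NAG||",
    "BGP4MP|1585769302|A|198.51.100.1|64500|203.0.113.0/24|64500 3356 12389|IGP|198.51.100.1|0|0||NAG||",
    "BGP4MP|1585769303|W|198.51.100.1|64500|192.0.2.0/24",
]

hits = prefixes_with_path_suffix(SAMPLE, "20764 12389")
print(sorted(hits))
```

Collecting prefixes in a set is what turns many per-peer announcements into a count of unique prefixes, which is the kind of number quoted above.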
Out of those 4569 prefixes, 4255 belong to Amazon (AS16509 and AS14618), 85 belong to Akamai (AS20940 and AS16625), and the rest belong to several different service providers including Level3, Alibaba, Digital Ocean, Linode, and others.
There are some interesting points about these prefixes; let’s take these five prefixes of Akamai (AS20940) as our examples.
- 18.104.22.168/24:20940 | 22.214.171.124 | AKAMAI-ASN1, US
- 126.96.36.199/24:20940 | 188.8.131.52 | AKAMAI-ASN1, US
- 184.108.40.206/24:20940 | 220.127.116.11 | AKAMAI-ASN1, US
- 18.104.22.168/24:20940 | 22.214.171.124 | AKAMAI-ASN1, US
- 126.96.36.199/24:20940 | 188.8.131.52 | AKAMAI-ASN1, US
As per RIPE RIS Routing Status, these prefixes at their exact prefix length were never visible in the global routing table before they were announced by AS12389 during the hijack. The status of 184.108.40.206/24 shows this, and the result is the same for the remaining four prefixes (the timestamps are interesting to note).
I checked more than 100 prefixes with the same outcome. This behaviour might look similar to other kinds of routing behaviour you may have seen in the past. That’s because this is similar to how BGP Optimizers behave. (In this case, it’s not clear if it was done by a BGP Optimizer or some internal traffic engineering.)
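The check behind this observation can be sketched as follows: given a baseline of prefixes seen before the incident, decide whether an announcement is an exact match, a new more-specific of a covering route, or previously unrouted. This is a rough sketch using Python's standard `ipaddress` module. The function name, the labels it returns, and the baseline data are all hypothetical, and the addresses and ASNs are private-use values rather than real incident data.

```python
import ipaddress
from typing import Dict


def classify_announcement(prefix: str, origin: int, baseline: Dict[str, int]) -> str:
    """Compare one announcement against a baseline of {prefix: origin ASN}."""
    target = ipaddress.ip_network(prefix)
    covering = []
    for known, known_origin in baseline.items():
        net = ipaddress.ip_network(known)
        if net.version != target.version:
            continue  # never compare IPv4 with IPv6
        if net == target:
            # The exact prefix was already in the table.
            return "same origin" if known_origin == origin else "exact-prefix origin change"
        if target.subnet_of(net):
            covering.append((net, known_origin))
    if not covering:
        return "previously unrouted"
    # Use the longest (most specific) covering route as the reference.
    _, covering_origin = max(covering, key=lambda item: item[0].prefixlen)
    if covering_origin == origin:
        return "more-specific, same origin"
    return "new more-specific, different origin"


# Hypothetical baseline: only aggregates are visible before the incident.
BASELINE = {"10.0.0.0/8": 64497, "10.20.0.0/16": 64496}

print(classify_announcement("10.20.30.0/24", 64511, BASELINE))
print(classify_announcement("10.20.0.0/16", 64496, BASELINE))
```

A /24 that was never seen at that exact length and that appears with a different origin than its covering aggregate is the pattern described above. Because routers prefer the longest matching prefix, such an announcement attracts traffic even while the legitimate aggregate is still present.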
Also, if you go back three years, there was a very similar incident from AS12389 which was very well documented by BGPMon in “BGPStream and The Curious Case of AS12389”. Fifty (50) prefixes were hijacked from 37 different ASNs by announcing more specific prefixes (/24s).
Why Filtering Matters
All of this would have been prevented if AS20764 (Rascom) had implemented strict filtering, ensuring that it did not announce routes for addresses that neither it nor its customers actually control. Another interesting point is that AS12389 also peers directly with AS3356 (Level3), AS1299 (Telia) and others, but none of them allowed these bad announcements to propagate. The exception was AS20764, which also peers with AS174 (Cogent Co); unfortunately AS174 didn’t filter these bad announcements and propagated them further. This shows directly how networks that strictly filter the routes they accept and announce help to stop the propagation of bogus route announcements, malicious or otherwise.
RPKI has made filtering of mis-originations (hijacks) much easier. Out of the 4569 hijacked prefixes, 4040 had VALID ROAs for their respective ASNs. For example, here is one of the hijacked prefixes, 220.127.116.11/24 of Amazon.
Job Snijders also did a really good analysis of this incident from the RPKI perspective, because RIPE NCC accidentally deleted around 4100 ROAs during the same time period, as highlighted here. His analysis can be found on the RIPE routing-wg mailing list.
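To make the RPKI point concrete, here is a small sketch of route origin validation in the spirit of RFC 6811: a route is "valid" if some covering ROA matches its origin ASN and its length does not exceed the ROA's maximum length, "invalid" if it is covered but nothing matches, and "not-found" if no ROA covers it. The `Roa` type, the function name, and the sample ROA are hypothetical; a real deployment would take this data from an RPKI validator and apply the result in router policy rather than in a script.

```python
import ipaddress
from typing import Iterable, NamedTuple


class Roa(NamedTuple):
    prefix: str      # prefix authorised by the ROA
    max_length: int  # longest prefix length the holder allows
    asn: int         # ASN authorised to originate it


def validate_origin(prefix: str, origin: int, roas: Iterable[Roa]) -> str:
    """Return 'valid', 'invalid' or 'not-found' for one announced route."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        if roa_net.version != route.version or not route.subnet_of(roa_net):
            continue  # this ROA does not cover the route
        covered = True
        if roa.asn == origin and route.prefixlen <= roa.max_length:
            return "valid"
    return "invalid" if covered else "not-found"


# Hypothetical ROA set using private-use addresses and ASNs.
ROAS = [Roa("10.20.0.0/16", 24, 64496)]

print(validate_origin("10.20.30.0/24", 64496, ROAS))  # legitimate origin
print(validate_origin("10.20.30.0/24", 64511, ROAS))  # mis-origination
```

A network that drops "invalid" routes at its borders would reject a mis-originated /24 like the second example, regardless of which peer or customer sent it.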
It is important to mention that AS174 (Cogent Co) and AS3356 (Level3) should have done a better job by filtering at all levels. We have witnessed in the past that they do prefix filtering (remember the Amazon Route53 hijack? They filtered it nicely). Filtering one peer while leaving another unfiltered defeats the purpose. Mistakes happen, but we all have to learn from them and put measures in place so they are not repeated.
I’m still looking for a BGP RIB dump that matches the 8800+ hijacked prefixes reported by QRATOR Labs and BGPMon, and for an explanation of why the hijack lasted almost an hour, as stated in QRATOR’s blog post. RIPE RIS and the bgpdump data from Isolario indicate that it lasted only about 10 minutes. This depends heavily on the vantage points, for sure.
Where are the vantage points that saw those 8800+ announcements? Were those extra announcements made only at a particular IX? Did this all happen because of a BGP Optimizer, or something else? Was it an attempt to hijack those prefixes, given the similar behaviour in 2017? That may remain a mystery.
Bottom Line – Network Operators Must Secure Global Routing
I can’t emphasise this enough: this can happen again at any time. Network operators have a responsibility to ensure a globally robust and secure routing infrastructure. Your network’s safety depends on a routing infrastructure that stops bad actors and mitigates accidental misconfigurations that wreak havoc on the Internet. The more network operators work together, the fewer incidents there will be, and the less damage they can do.
If you are a network operator, we encourage you to implement the MANRS Actions and to join the MANRS community. If you are already a MANRS member (as are some of the ASNs that forwarded these bad announcements), then learn from this and make sure it doesn’t happen again. You have an obligation to keep the global routing table clean.
Only together can we protect the core!
Learn more about MANRS and join us as a network operator, Internet Exchange Point, or Content Delivery Network/cloud provider.