One Does Not Simply “Deploy RPKI”

<Massimo Candela is a 2022 MANRS Ambassador and will speak on the topic below during RPKI Week today on Tuesday, 5 July, starting at 12:00 UTC. Register here.>

Deploying RPKI is usually considered the completion of two steps: creating Route Origin Authorizations (ROAs) and performing Route Origin Validation (ROV). A ROA is a cryptographically signed document, stored in a public repository, that declares which Autonomous System (AS) is authorized to announce a certain prefix in BGP routing. A router performing ROV accepts or rejects BGP routes based on the ROAs.

However, less documented is the effort needed to maintain a healthy RPKI deployment during daily routing operations. In particular, new routing configurations may require an update of the relevant ROAs. If this important step is underestimated, you and your customers may be subject to suboptimal routing or unreachability.

There are two errors I have come across repeatedly. The first is announcing a new prefix and forgetting (or delaying) the ROA update. This usually stems from the misconception that a newly announced prefix will simply be “unknown” during ROV. That is not always true: if a ROA already covers a less-specific prefix, the new, more-specific announcement may exceed that ROA’s maxLength and therefore be RPKI invalid. The second error is changing BGP and ROAs without taking timing into consideration. A newly created or updated ROA may take some time to be published. In addition, there is an intrinsic propagation delay, because other networks will only see the new ROA at the next run of their RPKI validators.
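The first error is easier to see with a small example. The following is a minimal sketch of the origin validation logic described in RFC 6811, not the code of any particular router or validator. The `VRP` tuple and the `validate_origin` function are hypothetical names introduced here, and the prefixes and AS numbers are documentation values.

```python
import ipaddress
from typing import Iterable, NamedTuple


class VRP(NamedTuple):
    """One validated ROA entry: prefix, maxLength and authorized origin AS."""
    prefix: str
    max_length: int
    asn: int


def validate_origin(prefix: str, origin_asn: int, vrps: Iterable[VRP]) -> str:
    """Return 'valid', 'invalid' or 'not-found' for a BGP announcement."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for vrp in vrps:
        roa_net = ipaddress.ip_network(vrp.prefix)
        # A VRP is relevant only if it covers the announced prefix.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # Valid needs both a matching origin and a length within maxLength.
        if vrp.asn == origin_asn and route.prefixlen <= vrp.max_length:
            return "valid"
    # Covered by at least one VRP but matched by none: invalid, not unknown.
    return "invalid" if covered else "not-found"


vrps = [VRP("192.0.2.0/24", 24, 64500)]
print(validate_origin("192.0.2.0/24", 64500, vrps))     # valid
print(validate_origin("192.0.2.128/25", 64500, vrps))   # invalid
print(validate_origin("198.51.100.0/24", 64500, vrps))  # not-found
```

The second call is the case described above: the /25 is a “new” prefix announced by the correct origin, but the existing ROA for the /24 covers it with a maxLength of 24, so the announcement is invalid rather than unknown.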

A one-year review of RPKI operations at NTT

NTT is a Tier 1 provider, operating the Global IP Network, one of the largest backbone networks in the world. At NTT, we started doing ROV in March 2020. However, we soon realized that we were not immune to the problems described above. To further improve public knowledge on the topic, we decided to share our experience. I reviewed one year of RPKI-related alerts generated by our BGPalerter instance during 2021. BGPalerter is an open-source monitoring solution used by hundreds of organizations to monitor BGP and RPKI. In addition to the alerts, I analyzed the tickets describing the actions taken by our NOC to resolve those alerts. With this data I was able to derive three common causes for most of the RPKI-invalid announcements:

  • We announced a new prefix, but an existing ROA covering a less-specific prefix caused a maxLength violation;
  • We announced a customer’s prefix that was already covered by a ROA (so not RPKI unknown), but the customer had no ROA authorizing our AS; and
  • We migrated prefixes from one AS to another (change of origin), but we didn’t update the ROAs properly.

Figure 1. Causes of RPKI invalid announcements originated by NTT’s AS2914.

Green (57.6%): wrong maxLength
Red (16.9%): we announced a customer’s prefix, but the customer did not have a ROA for our AS
Yellow (25.5%): we migrated prefixes from one AS to another, but we didn’t update the ROAs properly

Fig. 1 shows the alerts plotted by cause. We had 71 alerts, and wrong maxLength was the most common cause. The number of alerts is marginal compared to the number of network operations we perform daily; however, in mid-2021, we decided to step up our game by developing software and procedures to support our RPKI operations. This includes the development of more advanced RPKI monitoring features, released as open source in BGPalerter. This resulted in a reduction of 86.84% in RPKI-invalid announcements, as shown in Fig. 2. We use BGPalerter to make sure our BGP routing configurations are, or will be, in sync with our ROAs. For example, we monitor when a customer creates a ROA for our AS, and only then do we start announcing the customer’s prefix. We also monitor whether anything is affecting the validity of our ROAs. More info can be found in the following presentations: NTT’s RPKI Deployment Update and A One-Year Review of RPKI Operations.
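The “announce only once the customer’s ROA exists” rule can be sketched as a simple gate in front of a configuration change. This is an illustration only, not NTT’s tooling or BGPalerter’s code. It assumes a validator JSON export shaped like `{"roas": [{"asn": ..., "prefix": ..., "maxLength": ...}]}`, where `asn` may be an integer or a string such as `"AS64500"`. The function names `load_vrps` and `safe_to_announce` are hypothetical.

```python
import ipaddress
import json


def load_vrps(json_text):
    """Parse an assumed VRP export into (network, max_length, asn) tuples."""
    vrps = []
    for entry in json.loads(json_text)["roas"]:
        asn_text = str(entry["asn"]).upper()
        if asn_text.startswith("AS"):
            asn_text = asn_text[2:]
        network = ipaddress.ip_network(entry["prefix"])
        vrps.append((network, int(entry["maxLength"]), int(asn_text)))
    return vrps


def safe_to_announce(prefix, our_asn, vrps):
    """True only if some VRP explicitly authorizes our AS for this prefix."""
    route = ipaddress.ip_network(prefix)
    for network, max_length, asn in vrps:
        if (network.version == route.version
                and route.subnet_of(network)
                and asn == our_asn
                and route.prefixlen <= max_length):
            return True
    # No matching VRP (yet): hold the BGP change until the ROA is visible.
    return False


SAMPLE = '{"roas": [{"asn": "AS64500", "prefix": "203.0.113.0/24", "maxLength": 24}]}'
vrps = load_vrps(SAMPLE)
print(safe_to_announce("203.0.113.0/24", 64500, vrps))  # True
print(safe_to_announce("203.0.113.0/24", 64501, vrps))  # False
```

The gate is deliberately strict: it also returns False when no ROA covers the prefix at all, which matches the policy of waiting for the customer’s ROA rather than announcing an RPKI-unknown route.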

Figure 2. Reduction of RPKI-invalid announcements. On 26 March 2021 (blue dot on the x-axis), we introduced new software and procedures for our RPKI operations.

The importance of monitoring

As previously described, RPKI-invalid announcements can result from stale ROAs (ROAs that are not up to date with what you want to do at the BGP level) or from ROA publication and propagation delays. For this reason, RPKI monitoring is fundamental to mitigating invalid announcements, or correcting them promptly, before they impact the reachability of your services. Unfortunately, this is not yet a widespread practice. For example, RIPE RIS data shows that many RPKI-invalid routes remain visible on the Internet for days before being fixed. In 2021, around 60% of the RPKI-invalid announcements due to wrong maxLength (but correct origin) lasted more than 1 day, with more than 20% of them lasting more than 25 days. While some of these could be due to malicious intent or to data impurity, it is reasonable to think that many are due to a lack of monitoring during network operations.

NTT’s Global IP Network is monitored by BGPalerter, an open-source tool I developed keeping in mind not only our internal needs, but also the needs of a wider audience of network operators. In an Internet comprised of thousands of network players, with different levels of automation and expertise, providing free and easy-to-use tools for monitoring the correctness of BGP and RPKI is essential for improving the stability of the Internet. BGPalerter is a tool that self-configures and doesn’t require any data collection. Essentially, it is an application that you just run: the only necessary input is your AS number. It can send alerts by email or to most of the currently available messaging platforms. BGPalerter’s BGP monitoring informs you about prefixes losing visibility, hijacks, unexpected downstream/upstream ASes, and more.

Its RPKI monitor informs you if your AS is announcing RPKI invalids or prefixes not covered by ROAs, if there is an ongoing Trust Anchor malfunction, if your VRP file is corrupted, or if any of your ROAs are expiring. Additionally, BGPalerter will inform you if a ROA involving any of your prefixes or ASes is added, edited, or deleted. This is useful for monitoring the publication of RPKI changes and coordinating BGP changes with them.
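The idea behind watching for added, edited, or deleted ROAs can be illustrated by comparing two snapshots of validated ROA data. The sketch below is not how BGPalerter implements this feature. It assumes each snapshot is a set of `(prefix, max_length, asn)` tuples, and `diff_vrps` is a hypothetical helper name.

```python
import ipaddress


def diff_vrps(old, new, watched_prefixes, watched_asns):
    """Return (added, removed) entries that involve watched prefixes or ASes."""
    watched_nets = [ipaddress.ip_network(p) for p in watched_prefixes]

    def relevant(vrp):
        prefix, _max_length, asn = vrp
        if asn in watched_asns:
            return True
        net = ipaddress.ip_network(prefix)
        # Relevant if the ROA prefix overlaps one of our prefixes either way.
        return any(
            net.version == w.version and (net.subnet_of(w) or w.subnet_of(net))
            for w in watched_nets
        )

    added = sorted(v for v in new - old if relevant(v))
    removed = sorted(v for v in old - new if relevant(v))
    return added, removed


before = {("192.0.2.0/24", 24, 64500), ("198.51.100.0/24", 24, 64501)}
after = {("192.0.2.0/24", 25, 64500), ("198.51.100.0/24", 24, 64501),
         ("203.0.113.0/24", 24, 64510)}

added, removed = diff_vrps(before, after, ["192.0.2.0/24"], {64500})
print(added)    # [('192.0.2.0/24', 25, 64500)]
print(removed)  # [('192.0.2.0/24', 24, 64500)]
```

In this representation an edited ROA shows up as one removal plus one addition, as with the maxLength change from 24 to 25 in the example. Seeing the new entry appear in your own validator output is a reasonable signal that a dependent BGP change can go ahead, although other networks may pick it up later.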

The MANRS community is driving behavioral change toward more secure routing. We now have more than 800 participants who believe in implementing basic routing security measures and are committed to convincing others to do the same. Please join us as we make the Internet better.

Acknowledgments

I would like to thank some other amazing open projects:

  • RIPE RIS, a fundamental project maintained by RIPE NCC that collects and publishes BGP routing data from vantage points distributed around the world.
  • OpenBSD rpki-client, an open-source RPKI validator used by many Internet operators. This validator can export metadata that is fundamental for some advanced RPKI monitoring.
