me.cloudconnect.id


Did Google go down?

February 04, 2021 • ☕️ 5 min read

Today, unfortunately, Google's services were inaccessible.

[Image: news report. Source: kompas.com]

[Image: news tweet. Source: Twitter]

Netizens in Indonesia expressed their frustration, saying, “Google went down!” Was that accurate?

What occurred was simply a classic case of a route leak.

This time, I will explain it technically.

BGP (Border Gateway Protocol)

BGP is a protocol by which all junction points on the internet (Routers) communicate with one another to dynamically establish the correct paths that network packets should follow to reach their intended destination.

A route leak caused BGP to direct network packets onto incorrect paths. In this context, the paths are IP routes at layer 3 (the network layer).

A quick check using show ip bgp 8.8.8.8 in the OpenIXP (or National Inter Connection Exchange) Looking Glass, as shown at the bottom of the page, confirmed that Google's IP 8.8.8.8 was listed in the OpenIXP BGP route table, with AS 139409 announcing Google's IP prefixes to OpenIXP.

[Image: as_path output showing the leak]

The traceroute result then matched the as_path sequence above: traffic to 8.8.8.8 went through the edge router (hop 1) of one of the OpenIXP peering members, first via AS 139409, then through AS 55818, before reaching Google's AS 15169.

[Image: traceroute output showing the leak]
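To make the as_path reading concrete, here is a minimal Python sketch that splits the path seen in the looking glass into its parts. The function names and the dictionary keys are my own for illustration; the only facts taken from the incident are the three AS numbers.

```python
from itertools import groupby

EXPECTED_ORIGIN = 15169  # Google's AS, as named in the text


def parse_as_path(as_path: str) -> list[int]:
    """Turn '139409 55818 15169' into [139409, 55818, 15169].

    Consecutive repeats (AS path prepending) are collapsed to one entry.
    """
    asns = [int(token) for token in as_path.split()]
    if not asns:
        raise ValueError("empty as_path")
    return [asn for asn, _ in groupby(asns)]


def describe_path(as_path: str, expected_origin: int = EXPECTED_ORIGIN) -> dict:
    """Summarize who announced the route to us and who originated it."""
    hops = parse_as_path(as_path)
    return {
        "neighbor": hops[0],          # the AS that announced the route to us
        "origin": hops[-1],           # the AS that originated the prefix
        "origin_matches": hops[-1] == expected_origin,
        "transit": hops[:-1],         # every AS the traffic crosses before the origin
    }


report = describe_path("139409 55818 15169")
print(report)
```

The origin still matches Google, which is why this looks like a leak rather than an origin hijack: the prefix belongs to the right AS, but the traffic takes a detour through two networks before reaching it.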

Now imagine this: when the routers of all peering members at OpenIXP receive this routing information at the same time, every user (especially those in Indonesia) who accesses Google's services at that moment is misdirected onto a path that should never carry that traffic, because it is not Google's backbone.

[Image: mtr output showing the leak]

How to fix it?

The quickest action from the OpenIXP admin side was to temporarily shut down the leaking peer member. BGP relies on a degree of trust between networks, backed by rules that all peering members must follow (see OpenIXP-Technical.pdf).

A simple preventive practice for peering members running a multi-homed setup is to filter by as_path: discard any route whose as_path is not the one associated with that peering session. I've been using this simple preventive measure for the past 8 years, and it continues to work well.
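As a rough illustration of as_path filtering, the Python sketch below shows one possible reading of the practice, covering both directions of a peering session. It is not router configuration, and the function names, the customer sets, and AS 64500 (a documentation-range ASN) are assumptions made for the example.

```python
def accept_from_peer(as_path, peer_asn, peer_customers=frozenset()):
    """Import side: accept a route only if it starts at the peer's own AS
    and contains nothing but the peer and its known customers."""
    if not as_path or as_path[0] != peer_asn:
        return False
    allowed = {peer_asn} | set(peer_customers)
    return all(asn in allowed for asn in as_path)


def export_to_peer(as_path, my_customers=frozenset()):
    """Export side: announce only locally originated routes (empty as_path
    in our own table) or routes learned from our customers. Routes learned
    from an upstream or another peer are never re-announced."""
    if not as_path:
        return True
    return as_path[0] in my_customers


# The leaked path from the incident would be rejected on import ...
print(accept_from_peer([139409, 55818, 15169], peer_asn=139409))
# ... and a route learned from an upstream would never be exported.
print(export_to_peer([55818, 15169]))
```

Either filter on its own would have stopped this particular path: the leaking member would not have exported it, or the other members would not have accepted it.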

Another measure is setting the maximum-prefix limit correctly, so that the session stays protected if a peer inadvertently leaks a large number of routes. This is admittedly a rigid solution, since we cannot fully control how many new prefixes must legitimately be received.
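The behaviour of a maximum-prefix limit can be sketched in a few lines of Python. This is a toy model under my own assumptions (the PeerSession class, a limit of 2, and documentation prefixes), not how any particular router implements it, but it shows both the protection and the rigidity.

```python
class PeerSession:
    """Toy model of a BGP session guarded by a maximum-prefix limit."""

    def __init__(self, peer_asn: int, max_prefixes: int) -> None:
        self.peer_asn = peer_asn
        self.max_prefixes = max_prefixes
        self.prefixes: set[str] = set()
        self.established = True

    def receive(self, prefix: str) -> bool:
        """Return True if the prefix was accepted, False otherwise."""
        if not self.established:
            return False
        self.prefixes.add(prefix)
        if len(self.prefixes) > self.max_prefixes:
            # Limit exceeded: tear the session down and drop everything
            # learned from this peer, legitimate routes included.
            self.established = False
            self.prefixes.clear()
            return False
        return True


session = PeerSession(peer_asn=64500, max_prefixes=2)
announced = ("192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24", "8.8.8.0/24")
results = [session.receive(prefix) for prefix in announced]
print(results, session.established)
```

The third prefix trips the limit, so the session drops before the leaked route ever arrives. The cost is that the two legitimate routes are lost as well, and a limit set too low would trip on normal growth.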

Will it happen again?

[Image: didn't go down]

The Google service inaccessibility incident occurred again, and what transpired at noon was that Google's service was NOT down. Instead, the cause was the kind of occasional mistake that network operators make (human error), though the technician is not entirely to blame. Sometimes the systems within an organization are not supported by adequate devices.

In the case of a BGP leak, an error in the BGP configuration on a router can lead to the announcement of incorrect routes, which may redirect significant traffic in the wrong direction and cause various issues. Similarly, other networks might exploit this trust to hijack traffic for malicious purposes.

IRR and RPKI

BGP security methods have long been developed, including the IRR database recording method and the Resource Public Key Infrastructure [RPKI] cryptographic approach. Despite both methods being implemented by Google, why do leaks still occur?

Implementation on one side alone (e.g., by Google, AWS, and other top-tier networks) is not enough. There needs to be significant global participation.

Top-level providers (e.g., Tier 1s) use RPKI validators and reference IRR data to create inbound routing filters that apply to all peers within their AS. We, as stub networks, should adopt similar measures to mitigate the effects of route leaks.

Implementing RPKI is merely the first step toward enhancing BGP route security, as RPKI only secures the route origin and does not secure the path. (Unfortunately, the same limitation applies to IRR data).
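To see why origin validation alone does not catch a leak like this one, here is a small Python sketch of the route origin validation decision (valid, invalid, not-found) in the style of RFC 6811. The Roa class and the single ROA entry are illustrative assumptions, not data pulled from a live validator.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Roa:
    prefix: str      # prefix the holder authorized
    max_length: int  # longest prefix length allowed under it
    asn: int         # AS authorized to originate it


# Illustrative entry only, assumed for this example.
ROAS = [Roa("8.8.8.0/24", 24, 15169)]


def validate_origin(route_prefix: str, origin_asn: int, roas=ROAS) -> str:
    """Return 'valid', 'invalid', or 'not-found' for a route's origin."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue  # this ROA does not cover the route
        covered = True
        if roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    return "invalid" if covered else "not-found"


leaked_path = [139409, 55818, 15169]
# Only the last AS in the path (the origin) is checked.
print(validate_origin("8.8.8.0/24", leaked_path[-1]))
```

The leaked path still ends in AS 15169, so the check passes even though the route arrived through the wrong networks. Had AS 139409 claimed to originate the prefix itself, the same check would have returned invalid.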

Nevertheless, widespread adoption of BGP Security is essential for the seamless functioning of internet activities for all users.



A Rahman

Personal blog by A Rahman.
Writing in order to remember.