
On April 4, 2023, Virgin Media experienced an internet outage (as covered in my previous post), and while the company has not publicly disclosed the cause, there has been plenty of speculation. Some users suggested a DNS issue, because switching their DNS settings to Google’s or Cloudflare’s public DNS got around the problem, but this did not work for everyone.
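To test the DNS theory for yourself, you can ask a specific public resolver directly instead of relying on whatever your router hands out. The following is a minimal sketch using only the Python standard library; the function names `build_query` and `resolver_answers` are my own, and it only looks at the reply header rather than parsing the answer records.

```python
import socket
import struct


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet asking for an A record."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = IN (internet).
    return header + qname + struct.pack("!HH", 1, 1)


def resolver_answers(resolver_ip, hostname, timeout=2.0, query_id=0x1234):
    """Return True if the resolver replies with at least one answer record."""
    query = build_query(hostname, query_id)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (resolver_ip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:
            # Covers timeouts and unreachable-network errors alike.
            return False
    if len(reply) < 12:
        return False
    reply_id, flags, _, answer_count, _, _ = struct.unpack("!HHHHHH", reply[:12])
    rcode = flags & 0x000F  # 0 means "no error"
    return reply_id == query_id and rcode == 0 and answer_count > 0
```

Calling `resolver_answers("8.8.8.8", "virginmedia.net")` and `resolver_answers("1.1.1.1", "virginmedia.net")` compares Google’s and Cloudflare’s resolvers. If both fail for one domain but succeed for others, the resolver is probably not the real problem.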
However, the outage is suspected to have been caused by a BGP configuration issue rather than DNS, as Cisco BGPStream showed intermittent outages. BGP (Border Gateway Protocol) is the protocol used to route internet traffic between different networks, and it is possible that an engineer made changes to Virgin Media’s BGP peering routers, or that there were hardware issues at one of their data centres. I won’t go into how BGP works here, as Cloudflare have done a nice write-up covering both the Virgin Media outage and how BGP works in one blog post, but essentially Virgin Media failed to advertise its presence to other networks on the internet. If a network can’t advertise its presence, other networks can’t find it and it becomes effectively unreachable.
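The idea of "advertising your presence" can be shown with a toy model. This is only a sketch of the concept, not how a real BGP speaker works: the `RoutingTable` class is my own invention, ASN 64512 comes from the private-use range, and 192.0.2.0/24 is a documentation prefix, so neither belongs to Virgin Media.

```python
import ipaddress


class RoutingTable:
    """Toy view of the routes one network has learned from its peers."""

    def __init__(self):
        # Maps an advertised prefix to the ASN that originated it.
        self.routes = {}

    def advertise(self, prefix, origin_asn):
        """Record that origin_asn announced the given prefix."""
        self.routes[ipaddress.ip_network(prefix)] = origin_asn

    def withdraw(self, prefix):
        """Forget a prefix, as happens when an announcement disappears."""
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        """Return the origin ASN of the longest matching prefix, or None."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


table = RoutingTable()
table.advertise("192.0.2.0/24", 64512)   # hypothetical ISP announces its prefix
before = table.lookup("192.0.2.10")      # a route exists, traffic can be sent
table.withdraw("192.0.2.0/24")           # the announcement goes away
after = table.lookup("192.0.2.10")       # no route, the address is unreachable
```

Nothing about the servers behind the prefix has changed between `before` and `after`. They may be running perfectly well, but the rest of the internet no longer knows how to reach them.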
During the outage, Virgin Media’s authoritative nameservers were unavailable, which left home broadband routers unable to reach virginmedia.net. Additionally, because the support pages, call centres, SIP trunks, and authoritative nameservers all went through the same ASN (autonomous system number), Virgin Media’s support and internal infrastructure seemed to fall apart too.
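The shared-ASN problem is easy to see once you write the dependencies down. Here is a rough sketch: the ASNs are private-use placeholders, and the "status page" on a separate network is a hypothetical addition of mine to show what independence would look like.

```python
# Which ASN each service is reached through. All values are illustrative.
SERVICE_ASN = {
    "support pages": 64512,
    "call centres": 64512,
    "sip trunks": 64512,
    "authoritative nameservers": 64512,
    "status page": 64513,  # hypothetical: hosted on a separate network
}


def services_down(service_asn, unreachable_asns):
    """List the services that depend on any ASN that is currently unreachable."""
    return [name for name, asn in service_asn.items() if asn in unreachable_asns]


def services_up(service_asn, unreachable_asns):
    """List the services that survive the loss of the given ASNs."""
    return [name for name, asn in service_asn.items() if asn not in unreachable_asns]


down = services_down(SERVICE_ASN, {64512})
up = services_up(SERVICE_ASN, {64512})
```

With the main ASN gone, `down` contains every service except the one hosted elsewhere, which is the whole argument for keeping at least one communication channel off your own network.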
Given how widespread the outage was, many users were frustrated by the lack of transparency and updates from Virgin Media and took to Twitter. It’s hard to understand why someone in support with a mobile phone couldn’t have updated Twitter in a more timely fashion! If companies are transparent about their mistakes, everyone can learn from them and do IT better.