Incident On 2023-06-29
Incident Report for GitHub
Resolved
From 17:39 to 18:12 UTC, GitHub was down in parts of North America, particularly the US East Coast, and in South America.

GitHub takes measures to ensure that we have redundancy in our system for various disaster scenarios. We have been building a second Internet edge facility to remove an earlier single point of failure in our network architecture. This second facility was completed in January and has been actively routing production traffic since then. Today we were performing a live failover test to validate that we could in fact use the second facility if the primary were to fail. Unfortunately, during this failover we inadvertently caused a production outage.

The test exposed a network pathing configuration issue at the secondary site that prevented it from functioning properly as the primary facility. This caused issues with Internet connectivity to GitHub, ultimately resulting in an outage. Our monitoring and alerting notified us of the issue immediately. Within two minutes of being alerted, we reverted the change and brought the primary facility back online. Once it was online, it took time for traffic to be rebalanced and for our border routers to reconverge, which restored public connectivity to affected GitHub systems.

This failover test helped expose the configuration issue, and we are addressing the gaps in both our configuration and our failover testing, which will help make GitHub more resilient. We recognize the severity of this outage and apologize for the impact it had on our customers.
Posted Jun 29, 2023 - 18:36 UTC
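As an illustration only, and not a description of GitHub's actual tooling, a live failover test of the kind described in this report can be structured so that traffic moves to the secondary facility in stages and is reverted on the first failed health check. The sketch below assumes two hypothetical callables, `set_secondary_share` and `secondary_healthy`, and an arbitrary set of step sizes; none of these names or values come from the report.

```python
from typing import Callable, List, Sequence


def staged_failover(
    set_secondary_share: Callable[[float], None],  # hypothetical: routes this fraction of traffic to the secondary
    secondary_healthy: Callable[[], bool],         # hypothetical: probes connectivity through the secondary
    steps: Sequence[float] = (0.01, 0.10, 0.50, 1.00),  # assumed step sizes, chosen for illustration
) -> bool:
    """Shift traffic to the secondary in steps; revert fully on the first failed check."""
    for share in steps:
        set_secondary_share(share)
        if not secondary_healthy():
            # Roll back to the primary as soon as the secondary misbehaves.
            set_secondary_share(0.0)
            return False
    return True


# Usage with in-memory fakes: the fake secondary fails once it carries 50% of traffic.
applied: List[float] = []
ok = staged_failover(applied.append, lambda: applied[-1] < 0.5)
print(ok, applied)
```

With these fakes the test stops at the 50% step and the last recorded action is the rollback to 0.0. Note that a rollback of this kind only reverts the routing change; as the report describes, traffic rebalancing and router reconvergence can still take additional time.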
Update
We have recovered and are operating normally.
Posted Jun 29, 2023 - 18:35 UTC
Update
The root cause has been mitigated and most services have fully recovered. We are still monitoring for full recovery.
Posted Jun 29, 2023 - 18:33 UTC
Update
We are continuing to investigate this issue.
Posted Jun 29, 2023 - 18:22 UTC
Update
We are continuing to see recovery and will keep monitoring as it progresses.
Posted Jun 29, 2023 - 18:20 UTC
Update
We are starting to see recovery and are continuing to monitor as we mitigate.
Posted Jun 29, 2023 - 18:12 UTC
Update
We have identified the root cause of the outage and are working toward mitigation.
Posted Jun 29, 2023 - 18:02 UTC
Investigating
We are currently experiencing an outage of GitHub products and are investigating.
Posted Jun 29, 2023 - 17:52 UTC
This incident affected: Git Operations, Webhooks, API Requests, Issues, Pull Requests, Actions, Packages, Pages, Codespaces, and Copilot.
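The timestamps in this report can be turned into elapsed times with a few lines of arithmetic. The sketch below uses only the times stated above (all UTC, 2023-06-29); the short labels for each post are shorthand chosen for this example, not official status names.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"
DAY = "2023-06-29"

# Outage window as stated in the incident summary.
outage_start = datetime.strptime(f"{DAY} 17:39", FMT)
outage_end = datetime.strptime(f"{DAY} 18:12", FMT)

# Selected status posts; labels are shorthand for this example.
posts = [
    ("investigating", "17:52"),
    ("root cause identified", "18:02"),
    ("recovery starting", "18:12"),
    ("operating normally", "18:35"),
    ("resolved report", "18:36"),
]


def minutes_since_start(hhmm: str) -> int:
    """Whole minutes between the start of the outage and the given time."""
    t = datetime.strptime(f"{DAY} {hhmm}", FMT)
    return int((t - outage_start).total_seconds() // 60)


outage_minutes = int((outage_end - outage_start).total_seconds() // 60)
offsets = {label: minutes_since_start(hhmm) for label, hhmm in posts}

print(f"outage window: {outage_minutes} minutes")
for label, minutes in offsets.items():
    print(f"{label}: +{minutes} min")
```

By this calculation the outage window lasted 33 minutes, the first public status post appeared 13 minutes after the outage began, and the final report was posted 57 minutes after it began.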