From 16:24 to 16:43 UTC, multiple GitHub services were down or degraded because of an outage in one of our primary databases.
The primary host of a shared GitHub datastore experienced an underlying file system write error, which affected availability for the majority of public-facing GitHub services. SAML login was affected, as was access to Actions, Issues, Pull Requests, Pages, API, Webhooks, Codespaces, and Packages. Our automatic failover was unable to handle this partial file system failure mode. We mitigated by manually performing a forced failover, which we initiated 17 minutes after our first alert and completed 2 minutes later.
With the incident mitigated, we are assessing the detailed impact on each affected service and identifying resilience improvements, while also improving the automated failover mechanism to support this scenario.
Posted Sep 05, 2023 - 17:01 UTC
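For readers wondering how a partial file system failure can be detected at all, one common approach is a write probe: a health check that actually writes, syncs, and reads back a small file instead of only confirming that the host responds. The sketch below is illustrative only and is not a description of GitHub's failover tooling; the function names `probe_write` and `should_fail_over`, the probe file prefix, and the threshold of three consecutive failures are all assumptions made for the example.

```python
import os
import tempfile
from typing import Sequence


def probe_write(directory: str, payload: bytes = b"healthcheck") -> bool:
    """Return True if a small file can be written, fsynced, and read back.

    Hypothetical helper: a host whose file system rejects writes fails
    this probe even if it still accepts connections.
    """
    try:
        fd, path = tempfile.mkstemp(dir=directory, prefix=".probe-")
    except OSError:
        return False
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # force the write down to the device
        with open(path, "rb") as f:
            return f.read() == payload
    except OSError:
        return False
    finally:
        try:
            os.unlink(path)  # clean up the probe file if possible
        except OSError:
            pass


def should_fail_over(recent_probes: Sequence[bool],
                     max_consecutive_failures: int = 3) -> bool:
    """Return True when the most recent N probes all failed.

    The threshold is an assumed value; requiring several consecutive
    failures avoids failing over on a single transient error.
    """
    if len(recent_probes) < max_consecutive_failures:
        return False
    return not any(recent_probes[-max_consecutive_failures:])
```

In a setup like this, a monitor would call `probe_write` on the datastore's data directory at a fixed interval and pass the recent results to `should_fail_over`. How the probe is scheduled and how the failover itself is triggered depend entirely on the database and orchestration in use.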
Update
Pages is operating normally.
Posted Sep 05, 2023 - 17:01 UTC
Update
Actions is operating normally.
Posted Sep 05, 2023 - 17:00 UTC
Update
Issues is operating normally.
Posted Sep 05, 2023 - 16:51 UTC
Update
We have performed a mitigation affecting write traffic across services and are seeing recovery for affected customers.
Posted Sep 05, 2023 - 16:51 UTC
Update
Packages is operating normally.
Posted Sep 05, 2023 - 16:50 UTC
Update
Webhooks is operating normally.
Posted Sep 05, 2023 - 16:49 UTC
Update
API Requests is operating normally.
Posted Sep 05, 2023 - 16:49 UTC
Update
Codespaces is operating normally.
Posted Sep 05, 2023 - 16:48 UTC
Update
Packages is experiencing degraded performance. We are continuing to investigate.
Posted Sep 05, 2023 - 16:45 UTC
Update
Webhooks is experiencing degraded performance. We are continuing to investigate.
Posted Sep 05, 2023 - 16:44 UTC
Update
We are investigating an issue that is impacting a small percentage of requests across all services, including authentication, and are working to mitigate the impact.
Posted Sep 05, 2023 - 16:40 UTC
Update
Issues is experiencing degraded performance. We are continuing to investigate.
Posted Sep 05, 2023 - 16:39 UTC
Update
API Requests is experiencing degraded performance. We are continuing to investigate.
Posted Sep 05, 2023 - 16:35 UTC
Update
Codespaces is experiencing degraded availability. We are continuing to investigate.
Posted Sep 05, 2023 - 16:33 UTC
Update
Pull Requests is experiencing degraded performance. We are continuing to investigate.
Posted Sep 05, 2023 - 16:31 UTC
Update
Pages is experiencing degraded performance. We are continuing to investigate.
Posted Sep 05, 2023 - 16:31 UTC
Investigating
We are investigating reports of degraded performance for Actions.
Posted Sep 05, 2023 - 16:30 UTC
This incident affected: Webhooks, API Requests, Issues, Pull Requests, Actions, Packages, Pages, and Codespaces.