Between April 3rd, 2024 23:15 UTC and April 4th, 2024 01:10 UTC, GitHub Actions experienced a partial infrastructure outage that caused workflows to fail or start late. Additionally, 0.15% of Webhook deliveries were degraded due to an unrelated spike in database latency in a single availability zone. SLOs for Actions were at 90% during the incident, but the impact was not evenly distributed across customers. We set the status to green after SLOs had stayed recovered for an extended period, starting at April 4th, 2024 00:35 UTC. During this incident, we also had issues with our incident tooling, which failed to update the public status page (https://www.githubstatus.com/); the page also occasionally did not load.
The incident was resolved after the infrastructure issue was mitigated at April 4th, 2024 04:27 UTC.
We are working to improve monitoring and processes in response to this incident. We are investigating how we can improve resilience and our communication with our infrastructure provider, and how we can better handle ongoing incidents that are no longer impacting SLOs. We are also improving our incident tooling to ensure that the public status page is updated in a timely manner.
Posted Apr 04, 2024 - 01:10 UTC
Update
API Requests is operating normally.
Posted Apr 04, 2024 - 01:09 UTC
Update
Actions is operating normally.
Posted Apr 04, 2024 - 01:07 UTC
Update
We are seeing recovery in Actions workflow creation and in access to Actions statuses via the API.
Posted Apr 04, 2024 - 00:46 UTC
Update
Webhooks is experiencing degraded performance. We are continuing to investigate.
Posted Apr 04, 2024 - 00:25 UTC
Update
We are investigating Actions workflow failures and delays.
Posted Apr 04, 2024 - 00:12 UTC
Update
API Requests is experiencing degraded performance. We are continuing to investigate.
Posted Apr 04, 2024 - 00:06 UTC
Investigating
We are investigating reports of degraded performance for Actions.
Posted Apr 03, 2024 - 23:59 UTC
This incident affected: Webhooks, API Requests, and Actions.