Incident with Pull Requests, Actions and Issues
Incident Report for GitHub
On March 1, 2024, between 14:17 UTC and 15:54 UTC, the service that sends messages from our event stream into our background job processing service was degraded and delayed the transmission of jobs for processing. No data or jobs were lost. From 14:17 to 14:41 UTC, there was a partial degradation, during which customers experienced intermittent delays with PRs and Actions. From 14:41 to 15:24 UTC, 36% of PR users saw stale data, and 100% of in-progress Actions workflows did not see updates, even though the workflows were succeeding. At 15:24 UTC, we mitigated the incident by redeploying our service, and the job backlog began to burn down, with full job catch-up by 15:54 UTC. The root cause was under-provisioned memory and a lack of memory-based backpressure in the service, which overwhelmed consumers and led to OutOfMemory crashes.

We have adjusted memory configurations to prevent this problem, and are analyzing and adjusting our alert sensitivity to reduce our time to detection of issues like this one in the future.
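
For readers unfamiliar with the term, memory-based backpressure means that a component stops accepting new work once the data it is holding reaches a memory budget, so producers slow down instead of the consumer running out of memory. The report does not describe how the affected service is implemented, so the following is only a minimal, generic sketch. The class name `MemoryBoundedBuffer`, the `max_bytes` parameter, and the use of payload length as the memory estimate are all assumptions made for illustration.

```python
import threading
from collections import deque
from typing import Optional


class MemoryBoundedBuffer:
    """Hypothetical in-process buffer that applies backpressure by bytes held.

    Producers calling put() block (or time out) once the buffered payloads
    would exceed max_bytes; consumers calling get() free up budget.
    """

    def __init__(self, max_bytes: int) -> None:
        self._max_bytes = max_bytes
        self._used_bytes = 0
        self._items = deque()
        self._cond = threading.Condition()

    def put(self, payload: bytes, timeout: Optional[float] = None) -> bool:
        """Add a payload; return False if no room became available in time."""
        size = len(payload)
        with self._cond:
            # Wait until the payload fits in the budget. A payload larger than
            # the whole budget is admitted only when the buffer is empty, so
            # it cannot block forever.
            fits = self._cond.wait_for(
                lambda: self._used_bytes + size <= self._max_bytes
                or not self._items,
                timeout=timeout,
            )
            if not fits:
                return False  # backpressure signal: caller should slow down
            self._items.append(payload)
            self._used_bytes += size
            self._cond.notify_all()
            return True

    def get(self, timeout: Optional[float] = None) -> Optional[bytes]:
        """Remove and return the oldest payload, or None on timeout."""
        with self._cond:
            ready = self._cond.wait_for(lambda: bool(self._items), timeout=timeout)
            if not ready:
                return None
            payload = self._items.popleft()
            self._used_bytes -= len(payload)
            self._cond.notify_all()  # wake producers waiting for room
            return payload

    @property
    def used_bytes(self) -> int:
        with self._cond:
            return self._used_bytes
```

In a sketch like this, a `put` that returns `False` is the signal for the upstream reader to pause fetching from the event stream until the consumer has drained some of the buffer, rather than continuing to accumulate messages in memory.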
Posted Mar 01, 2024 - 16:12 UTC
Issues, Pull Requests and Actions are operating normally.
Posted Mar 01, 2024 - 16:12 UTC
We're seeing our background job queue sizes trend down, and expect full recovery in the next 15 minutes.
Posted Mar 01, 2024 - 15:48 UTC
Issues is experiencing degraded performance. We are continuing to investigate.
Posted Mar 01, 2024 - 15:39 UTC
We're continuing to investigate issues with background jobs that have impacted Actions and Pull Requests. We have a mitigation in place and are monitoring for recovery.
Posted Mar 01, 2024 - 15:27 UTC
We're investigating issues with background jobs that are causing sporadic delays in pull request synchronization and reduced Actions throughput.
Posted Mar 01, 2024 - 14:51 UTC
We are investigating reports of degraded performance for Pull Requests and Actions.
Posted Mar 01, 2024 - 14:39 UTC
This incident affected: Issues, Pull Requests, and Actions.