On October 5, 2023 at 13:40 UTC, our monitoring systems detected that Git pushes were taking longer than usual to appear as commits in Pull Requests. Under normal operating conditions, a series of asynchronous jobs runs in response to every push and, within a few seconds, applies a number of side effects in Pull Requests, such as requesting reviews, marking Pull Requests as merged, and showing new commits. During the incident, jobs were entering the queue faster than we could process them, resulting in average processing delays as high as 75 seconds and worst-case delays of up to 15 minutes. About 10% of all Pull Request page loads showed out-of-date data during this time.
We had recently created a dedicated worker pool for processing these side effects, with the goal of improving isolation between services and giving product teams more direct control over critical parts of the system. We mitigated the incident by increasing the capacity of the worker pool, which then worked through the backlog of delayed jobs, and service returned to normal by 16:07 UTC. In response to this incident, we have adjusted our monitoring thresholds and improved our procedures for scaling up worker pools as utilization increases.
Posted Oct 05, 2023 - 16:42 UTC
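The queueing behavior described in the summary above (jobs entering the queue faster than the worker pool can process them, then draining once capacity is added) can be illustrated with a minimal sketch. This is not our job system: the function names, the rates, and the worker counts are all hypothetical values chosen for illustration, and the model assumes constant arrival and processing rates.

```python
def simulate_backlog(arrival_rate, per_worker_rate, workers, seconds, backlog=0.0):
    """Return the queue depth at the end of each second.

    All parameters are hypothetical illustration values:
    arrival_rate     jobs enqueued per second
    per_worker_rate  jobs one worker completes per second
    workers          number of workers in the pool
    backlog          jobs already waiting at the start
    """
    capacity = per_worker_rate * workers
    depths = []
    for _ in range(seconds):
        # The backlog grows when arrivals exceed capacity and never goes below zero.
        backlog = max(0.0, backlog + arrival_rate - capacity)
        depths.append(backlog)
    return depths


def estimated_delay(backlog, per_worker_rate, workers):
    """Rough seconds a newly enqueued job waits behind the current backlog."""
    return backlog / (per_worker_rate * workers)


# Under-provisioned pool: 100 jobs/s arrive, 9 workers handle 90 jobs/s.
under = simulate_backlog(arrival_rate=100, per_worker_rate=10, workers=9, seconds=60)

# Scaled-up pool: 12 workers handle 120 jobs/s and drain the existing backlog.
drained = simulate_backlog(arrival_rate=100, per_worker_rate=10, workers=12,
                           seconds=60, backlog=under[-1])
```

In this simplified model the delay keeps growing for as long as arrivals exceed capacity, which is why adding workers, rather than waiting, is what clears the backlog.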
We've implemented a fix for Pull Request web UI delays and are seeing recovery. We are monitoring and will send another update in a few minutes.
Posted Oct 05, 2023 - 16:12 UTC
We are still investigating delays of up to 15 minutes in commits showing up on Pull Requests in the Web UI for all customers.
Posted Oct 05, 2023 - 16:01 UTC
We are continuing to investigate delays in commits showing up on Pull Requests in the Web UI. Pull Requests otherwise continue to function normally.
Posted Oct 05, 2023 - 15:20 UTC
We are investigating delays in commits showing up on Pull Request page loads in the Web UI. As a result, about 15% of Pull Requests are currently showing stale data. We are investigating contributing factors.
Posted Oct 05, 2023 - 14:34 UTC
We are investigating reports of degraded performance for Pull Requests.