Restless Glade
Incident Report for Rootly Demo
Investigating
A website outage affected nearly all customers. It followed a recent deployment that caused a significant spike in server errors. The team, led by GTOC Manager JP Cheung, analyzed logs and Datadog data, which indicated that the deployment was the likely cause. The team promptly rolled back the deployment, and the rollback restored website functionality. The customer support team was kept informed so it could handle incoming inquiries, and a customer update was drafted and sent. Several GTOC Site Engineers took part in the response, and monitoring continued after the rollback.
Posted Aug 22, 2024 - 11:13 PDT