Website access interrupted due to server errors following recent deploy
Incident Report for Rootly Demo
A recent deploy introduced server errors that caused a website outage affecting many customers. The team quickly traced a significant increase in 500 errors to the deploy, and JP Cheung initiated a rollback that restored functionality and reduced error rates. Andre kept customer support updated and drafted a customer message, while Sara prepared an update for internal leadership. After the rollback, the website stabilized and a customer update was sent to announce the resolution.

- Severity: SEV1
- Services: elasticsearch-prod
- Incident Types: Customer Facing, Cloud
Posted Sep 03, 2024 - 13:30 PDT
JP Cheung reported a website outage affecting a significant number of customers, triggered by a recent deploy that introduced server errors. James investigated system anomalies while JP pulled logs, which showed a substantial increase in 500 errors linked to the deploy. JP then initiated a rollback, which restored website functionality and reduced error rates. Andre communicated with customer support and drafted a customer update, while Sara prepared an internal leadership update. The rollback completed successfully, the system stabilized, and the customer update was sent to inform users of the resolution.
Posted Sep 03, 2024 - 13:23 PDT