Fashionable Dandelion
Incident Report for Rootly Demo
Investigating
The incident was a significant website outage affecting almost all customers, triggered by a recent deployment that caused a spike in server errors. The team, led by JP Cheung, identified the issue through log analysis and Datadog monitoring and rolled back the deploy as the primary resolution step. Error rates then decreased, and the website was restored to normal functionality once the rollback completed. A customer update was drafted and sent to inform users of the situation, and leadership was kept apprised of developments.
Posted Aug 22, 2024 - 15:23 PDT