Rootly
Investigating - A SEV2 incident has been identified involving the elasticsearch-prod service. It began on September 25, 2024, at 7:37 PM UTC, was triggered by an alert from Slack, and was reported by JP Cheung. The incident is customer-facing, but the specific customer impact is not detailed. JP Cheung is assigned as Commander and has completed tasks including updating the Rootly summary and ensuring roles are assigned; the task to check the access cluster health API remains unassigned.

- Severity: SEV2
- Services: elasticsearch-prod
- @Incident Types: Cloud, Customer Facing

Sep 25, 2024 - 12:40 PDT
Investigating - JP Cheung reported an incident with the `elasticsearch-prod` service, classified as SEV2, which is currently affecting customers. The incident was triggered by an alert in Slack at 3:39 PM UTC on September 24, 2024. Resolution tasks include checking the access cluster health API, assigned to Andre King; JP Cheung has completed updating the Rootly summary and ensuring roles are assigned.

- Severity: SEV2
- Services: elasticsearch-prod
- @Incident Types: Cloud, Customer Facing

Sep 24, 2024 - 08:44 PDT
Investigating - An incident was reported involving the `elasticsearch-prod` service, causing a SEV2 disruption. The incident was triggered by an alert from Slack shortly before it was officially reported by JP Cheung, who is currently serving as Commander. JP Cheung has completed the tasks to update the Rootly summary and assign roles, while Andre King is tasked with checking the cluster health API. No further follow-ups have been noted.

- Severity: SEV2
- Services: elasticsearch-prod
- @Incident Types: Cloud, Customer Facing

Sep 23, 2024 - 13:16 PDT
Investigating - An incident has been reported affecting the `elasticsearch-prod` service, categorized as a customer-facing issue with severity level SEV2, which began on September 20, 2024, at 1:48 PM UTC. The incident was triggered by an alert from Slack, and JP Cheung is acting as the Commander overseeing the response. Initial tasks have included updating the incident summary and ensuring roles are assigned, while Andre King has been tasked with checking the access cluster health API. Follow-up actions are pending as the incident progresses.

- Severity: SEV2
- Services: elasticsearch-prod
- @Incident Types: Customer Facing, Cloud

Sep 20, 2024 - 06:59 PDT
Investigating - A SEV2 incident with the elasticsearch-prod service was reported by JP Cheung due to a sagetap issue, which has affected customers. The incident was triggered by an alert from Slack, and JP Cheung has taken the lead as Commander, completing tasks such as updating the summary and assigning roles, while Andre King is tasked with checking the access cluster health API to further investigate the issue.

- Severity: SEV2
- Services: elasticsearch-prod
- @Incident Types: Cloud, Customer Facing

Sep 19, 2024 - 07:40 PDT
Investigating - JP Cheung reported a SEV2 incident with the `elasticsearch-prod` service, which has been impacting customers since it started on September 18, 2024. The incident was triggered by an alert from Slack. JP Cheung is currently acting as Commander, and no Scribe has been assigned. Completed action items include updating the summary and ensuring roles are assigned; checking the access cluster health API remains to be done.

- Severity: SEV2
- Services: elasticsearch-prod
- @Incident Types: Cloud, Customer Facing

Sep 18, 2024 - 15:43 PDT
Investigating - JP Cheung reported a SEV2 incident affecting the elasticsearch-prod service, triggered by an alert received via Slack. The incident started on September 18, 2024, at 8:14 PM UTC and has not yet been resolved. So far, JP Cheung has ensured that roles are assigned and updated the summary, while Jamie Baker is tasked with checking the access cluster health API. There are no follow-ups noted at this time.

- Severity: SEV2
- Services: elasticsearch-prod
- @Incident Types: Cloud, Customer Facing

Sep 18, 2024 - 13:20 PDT
Investigating - An incident categorized as severity SEV2 has occurred affecting the 'elasticsearch-prod' service, which is customer-facing, as reported by JP Cheung. The incident began on September 17, 2024, prompting immediate actions including testing assigned to Jamie Baker, health checks on the access cluster, and role assignments managed by JP Cheung, who is also leading the incident response.

- Severity: SEV2
- Services: elasticsearch-prod
- @Incident Types: Cloud, Customer Facing

Sep 17, 2024 - 06:24 PDT
Investigating - elasticsearch is down
Sep 10, 2024 - 12:17 PDT
Update - summary
Sep 06, 2024 - 12:09 PDT
Update - summary
Aug 30, 2024 - 13:56 PDT
Investigating - summary
Aug 30, 2024 - 13:39 PDT
Update - An incident was reported by JP Cheung at 4:20 PM UTC on September 6, 2024, involving an issue with the elasticsearch-prod service, categorized as SEV2 and impacting customers. As of now, the team is investigating the defined impact of the problem, with action items completed by JP Cheung, including updating the summary and assigning roles, while Nicole Bodzewski is tasked with checking the access cluster health API.

- Severity: SEV2
- Services: elasticsearch-prod
- @Incident Types: Cloud, Customer Facing

Sep 06, 2024 - 09:28 PDT
Investigating - Defined impact, investigating
Sep 06, 2024 - 09:24 PDT
Investigating - basic summary
Sep 04, 2024 - 14:37 PDT
Update - new issue found
show the change

Sep 04, 2024 - 06:25 PDT
Investigating - found an issue
Sep 04, 2024 - 06:24 PDT
Update - The incident involved a website outage affecting many customers, triggered by server errors caused by a recent deploy. The team quickly identified a significant increase in 500 errors linked to this deploy, leading JP Cheung to initiate a rollback that successfully restored functionality and reduced errors. Andre communicated updates to customer support and drafted a customer message, while Sara prepared communication for internal leadership. Following the rollback, the website stabilized and a customer update was sent to inform users of the resolution.

- Severity: SEV1
- Services: elasticsearch-prod
- @Incident Types: Customer Facing, Cloud

Sep 03, 2024 - 13:30 PDT
Investigating - An incident was reported by JP Cheung regarding a website outage affecting a significant number of customers, triggered by a recent deploy that introduced server errors. The team acted quickly, with James investigating system anomalies and JP pulling logs, which revealed a substantial increase in 500 errors linked to the deploy. In response, JP initiated a rollback, which successfully restored website functionality and reduced error rates. Andre communicated with customer support and drafted a customer update, while Sara prepared an internal leadership update. The rollback completed successfully, stabilizing the system, and the customer update was sent out to inform users of the resolution.
Sep 03, 2024 - 13:23 PDT
Update - An issue was found with the `elasticsearch-prod` service, classified as a SEV1 severity incident and impacting customers. The incident started on August 29, 2024, at 3:24 PM UTC, and was reported by JP Cheung, who is also the Commander managing the response. Action items include updating summaries and ensuring roles are assigned, but some tasks remain unassigned. Follow-up actions are pending as the team works to resolve the problem.

- Severity: SEV1
- Services: elasticsearch-prod
- @Incident Types: Cloud, Customer Facing

Aug 29, 2024 - 08:37 PDT
Investigating - found an issue
Aug 29, 2024 - 08:27 PDT
Investigating - Requesting 30k in equipment
Aug 27, 2024 - 10:27 PDT
Investigating - The incident involved a significant website outage affecting almost all customers, triggered by a recent deployment that caused a spike in server errors. The team, led by JP Cheung, quickly identified the issue through log analysis and Datadog monitoring, and decided to roll back the deploy as a primary resolution step. As a result of their prompt actions, they observed a decrease in error rates, and the website was restored to normal functionality after the rollback was completed. A customer update was drafted and sent out to inform users of the situation, while leadership was kept apprised of the developments.
Aug 22, 2024 - 15:23 PDT
Investigating - The incident involved a website outage affecting nearly all customers, triggered by issues following a recent deployment that resulted in a significant spike in server errors. In response, the team, led by GTOC Manager JP Cheung, analyzed logs and Datadog data, which indicated that the deployment was likely the cause of the errors. They promptly initiated a rollback of the deployment, which successfully restored website functionality. The customer support team was kept informed to manage incoming inquiries, and a customer update was drafted and sent to communicate the situation. The incident response included active participation from several GTOC Site Engineers, with continued monitoring ensured post-rollback.
Aug 22, 2024 - 11:13 PDT
Update - A significant website outage occurred on 8/22/2024, affecting almost all customers, which resulted in overwhelming calls to customer support. The incident was triggered by a recent deployment that led to a spike in server errors. The response team, including JP Cheung, Prasanthi, and JJ, acted swiftly, pulling logs and analyzing data to confirm the rollback of the faulty deploy was necessary. Following the rollback, error rates decreased, and the website functionality was restored, allowing for a customer update to be sent out regarding the resolution.
Aug 22, 2024 - 11:05 PDT
Update - A significant website outage occurred on 8/22/2024, affecting almost all customers, which resulted in overwhelming calls to customer support. The incident was triggered by a recent deployment that led to a spike in server errors. The response team, including JP Cheung, Prasanthi, and JJ, acted swiftly, pulling logs and analyzing data to confirm the rollback of the faulty deploy was necessary. Following the rollback, error rates decreased, and the website functionality was restored, allowing for a customer update to be sent out regarding the resolution.
Aug 22, 2024 - 10:45 PDT
Investigating - A significant website outage occurred on 8/22/2024, affecting almost all customers, which resulted in overwhelming calls to customer support. The incident was triggered by a recent deployment that led to a spike in server errors. The response team, including JP Cheung, Prasanthi, and JJ, acted swiftly, pulling logs and analyzing data to confirm the rollback of the faulty deploy was necessary. Following the rollback, error rates decreased, and the website functionality was restored, allowing for a customer update to be sent out regarding the resolution.
Aug 22, 2024 - 10:04 PDT
Update - test
Aug 22, 2024 - 08:51 PDT
Identified - Test
Aug 22, 2024 - 08:50 PDT
Update - An issue has been identified in the Elasticsearch production service, causing a cloud-based incident classified as SEV2 and potentially impacting customers. The incident was triggered by an alert received via Slack at 7:42 PM UTC. JP Cheung has taken command of the situation, ensuring that roles are assigned and updating the incident summary, while Jamie Baker is tasked with checking the access cluster health API.

- Severity: SEV2
- Services: elasticsearch-prod
- @Incident Types: Cloud, Customer Facing

Aug 21, 2024 - 12:55 PDT
Investigating - found an issue, investigating
Aug 21, 2024 - 12:46 PDT
Investigating - elasticsearch issue
Aug 20, 2024 - 13:19 PDT
Investigating - Elasticsearch is down
Aug 19, 2024 - 07:13 PDT
Update - The incident involved an outage of Elasticsearch (severity SEV2), which has affected customer access to the website and triggered numerous support calls. The issue was identified after an alert reported Elasticsearch being down, and it is suspected that the problem relates to a recent deploy that may have introduced breaking changes, evidenced by a spike in 5xx errors noticed in Datadog. The team, led by JP Cheung as Commander, is in the process of rolling back the last deploy and monitoring system metrics, with a customer update already drafted to inform users about the ongoing resolution efforts. The primary members involved in the response were Bach (Incident Commander), Stanley (SRE Lead), and JP (DevOps Engineer), who are working together to rectify the situation and ensure communication with customers and leadership.

- Severity: SEV2
- Services: elasticsearch-prod
- @Incident Types: Cloud, Customer Facing

Aug 16, 2024 - 11:18 PDT
Update - The incident involved an outage of Elasticsearch (severity SEV2), which has affected customer access to the website and triggered numerous support calls. The issue was identified after an alert reported Elasticsearch being down, and it is suspected that the problem relates to a recent deploy that may have introduced breaking changes, evidenced by a spike in 5xx errors noticed in Datadog. The team, led by JP Cheung as Commander, is in the process of rolling back the last deploy and monitoring system metrics, with a customer update already drafted to inform users about the ongoing resolution efforts. The primary members involved in the response were Bach (Incident Commander), Stanley (SRE Lead), and JP (DevOps Engineer), who are working together to rectify the situation and ensure communication with customers and leadership.

- Severity: SEV2
- Services: elasticsearch-prod
- @Incident Types: Cloud, Customer Facing

Aug 16, 2024 - 11:17 PDT
Investigating - Elasticsearch is down
Aug 16, 2024 - 11:13 PDT
Investigating - elasticsearch is down
Aug 15, 2024 - 09:54 PDT
Investigating - Elasticsearch is down
Aug 13, 2024 - 10:13 PDT
Update - The incident report details a SEV2 issue in which Elasticsearch is currently down, impacting customer services. The problem was triggered by an alert received by JP Cheung via Slack. Actions so far include checking the access cluster health API, assigned to Jamie Baker, and updating the incident summary and ensuring role assignments, both completed by JP Cheung, who is commanding the incident response. There are no follow-ups identified at this time.

- Severity: SEV2
- Services: elasticsearch-prod
- @Incident Types: Cloud, Customer Facing

Aug 13, 2024 - 08:42 PDT
Investigating - Elasticsearch is down
Aug 13, 2024 - 08:39 PDT
Update - A SEV2 incident has been initiated due to an issue found in the `elasticsearch-prod` service, which may affect customers. The problem was triggered by an alert received on Slack, reported by JP Cheung, who is also acting as the Commander for the incident. So far, JP Cheung has updated the incident summary and ensured that roles are assigned, while Jamie Baker is tasked with checking the access cluster health API. Further action items and follow-ups are still pending.

- Severity: SEV2
- Services: elasticsearch-prod
- @Incident Types: Cloud, Customer Facing

Aug 12, 2024 - 16:43 PDT
Investigating - issue found
Aug 12, 2024 - 16:38 PDT
Update - The incident involves an outage of the Elasticsearch service, categorized as a SEV2 event that began on August 12, 2024. Customers are experiencing significant disruption in accessing the SailPoint website due to this issue. The problem was believed to be related to a recent deployment, indicated by elevated error rates and a notable drop in response times observed shortly after the deployment. Immediate actions were taken by JP Cheung, who reported the incident and is acting as the Commander, while Andre King was assigned to check the cluster health API. Communication efforts were handled by Dan and Fabio, who coordinated troubleshooting, initiated a rollback of the last deployment, and updated both customer and internal teams on the service status.

- Severity: SEV2
- Services: elasticsearch-prod
- @Incident Types: Cloud, Customer Facing

Aug 12, 2024 - 12:38 PDT
Investigating - Elasticsearch is down
Aug 12, 2024 - 12:35 PDT
Investigating - something is wrong
Aug 12, 2024 - 10:11 PDT
Investigating - Service is down
Aug 09, 2024 - 15:10 PDT
Investigating - blah blah blah
Aug 08, 2024 - 21:30 PDT
Investigating - Canned message
Jul 30, 2024 - 14:11 PDT
Investigating - We are currently experiencing an issue impacting the elasticsearch-staging service due to the API being down. Our team is actively working on resolving this incident. We appreciate your patience and will provide updates as soon as possible. Thank you for your understanding.
Jul 23, 2024 - 13:23 PDT
Investigating - investigating
Jul 16, 2024 - 13:16 PDT
Investigating - investigating
Jul 16, 2024 - 13:16 PDT
Investigating - We are currently investigating a SEV2 incident affecting the API in the elasticsearch-staging service, impacting our customers. Our team is actively working on resolving the issue. We will provide further updates as soon as possible. Thank you for your patience.
Jul 08, 2024 - 13:25 PDT
Investigating - Found problem, investigating
Jun 28, 2024 - 12:43 PDT
Investigating - Found problem, investigating
Jun 28, 2024 - 12:43 PDT
Investigating - Due to a DDoS attack targeting our web infrastructure, the NTWRK site experienced an outage affecting user logins. Our team is actively working to restore services and improve response capabilities to prevent future incidents. Thank you for your patience and understanding.
Apr 19, 2024 - 09:37 PDT
Checkout: Operational (100.0% uptime over the past 90 days)
Service ABC: Degraded Performance (100.0% uptime over the past 90 days)
Functionality A: Degraded Performance (100.0% uptime over the past 90 days)
Functionality B: Operational (100.0% uptime over the past 90 days)
Login via SSO: Partial Outage (80.32% uptime over the past 90 days)
Submit Reviews: Operational (100.0% uptime over the past 90 days)
Slack Connections: Partial Outage
Slack Messaging: Operational
Slack Notifications: Operational
Past Incidents
Sep 27, 2024

No incidents reported today.

Sep 26, 2024

No incidents reported.

Sep 25, 2024

Unresolved incident: issue identified.

Sep 24, 2024

Unresolved incident: ryan found an issue.

Sep 23, 2024

Unresolved incident: found an issue.

Sep 22, 2024

No incidents reported.

Sep 21, 2024

No incidents reported.

Sep 20, 2024

Unresolved incident: found an issue.

Sep 19, 2024

Unresolved incident: sagetap issue.

Sep 18, 2024

Unresolved incidents: novi connect issue, found an issue v2.

Sep 17, 2024

Unresolved incident: Clever Embrace.

Sep 16, 2024

No incidents reported.

Sep 15, 2024

No incidents reported.

Sep 14, 2024

No incidents reported.

Sep 13, 2024

No incidents reported.