Rootly
Investigating - An incident has occurred where the Elasticsearch production database reached 100% capacity, impacting customer-facing services. The incident was reported by JP Cheung, who is also serving as the commander for the resolution efforts. Immediate actions taken include ensuring roles are assigned and updating the summary, while pending tasks include checking the cluster health API, which is assigned to Alex Conrad. The incident started on March 19, 2025, at 8:43 PM UTC, triggered by an alert from Slack.

- [Meeting](https://us06web.zoom.us/j/85320373859?pwd=AtBnjcQRmiuaPjKUR9KSU8RWf2EYzK.1)
- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Customer Facing, Cloud
- Root Cause Analysis: What caused this incident

Mar 19, 2025 - 13:55 PDT
Investigating - The incident report details that the elasticsearch-prod service is currently experiencing an overload issue, affecting customers. The incident was triggered by an alert received via Slack and was reported by JP Cheung, who is also serving as the Commander. So far, action items have been completed, including updating the rootly summary and ensuring roles are assigned, while Mark Grod is tasked with checking the access cluster health API to aid in resolution. The incident began on February 21, 2025, at 7:26 PM UTC.

- Bridge URL: https://us06web.zoom.us/j/83841543482?pwd=3fHAe3QOcpMHCnQbMmzLOzOa0soSIR.1
- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Customer Facing, Cloud
- Root Cause Analysis: What caused this incident

Feb 21, 2025 - 11:31 PST
Investigating - An incident has occurred involving the elasticsearch-prod service, where the database has reached 100% capacity, initially reported by JP Cheung, who is also serving as the incident commander. The situation is classified as SEV2 and customer-facing, with the potential to negatively impact users. Actions have been taken, including updating the summary and ensuring proper role assignments, while Andre King has been tasked with checking the access cluster health API. The incident was triggered by an alert from Slack, and further follow-ups are yet to be assigned.

- Bridge URL: https://us06web.zoom.us/j/85090020996?pwd=xYV3QjGc3l7gbqAbpKIW1Ol8ibVOvM.1
- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Customer Facing, Cloud
- Root Cause Analysis: What caused this incident

Feb 19, 2025 - 12:10 PST
Investigating - An incident was reported concerning a problem found in the Elasticsearch production database, which has been classified as SEV2 and is affecting customer-facing services. The incident was flagged by JP Cheung after receiving an alert via Slack. As steps to address the issue, JP completed the role assignments and updated the relevant summaries, while additional tasks, such as checking the cluster health API, remain to be assigned. The incident began on January 24, 2025, at 9:14 PM UTC.

- Bridge URL: https://us06web.zoom.us/j/89970746570?pwd=DIjHtiCQfvP3PiX7d7pbx08IJ0HYut.1
- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing
- Root Cause Analysis: What caused this incident

Jan 24, 2025 - 13:21 PST
Investigating - An issue was identified impacting the `elasticsearch-prod` service, categorized as a SEV2 incident and affecting customers. The incident was triggered by an alert received via Slack at 5:13 PM UTC on January 17, 2025. JP Cheung, who reported the incident, is currently serving as the Commander overseeing the situation, with tasks completed to update summaries and ensure roles are assigned. Andre King is assigned to check the access cluster health API as part of the ongoing resolution efforts.

- Bridge URL: https://us06web.zoom.us/j/84563414075?pwd=7G65R5b2vXfMEYEDEg1DI8S6fieKL1.1
- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing
- Root Cause Analysis: What caused this incident

Jan 17, 2025 - 09:23 PST
Investigating - An incident has been reported concerning a problem with the `elasticsearch-prod` service, classified as SEV2 and currently ongoing. JP Cheung is managing the incident, having discovered it through a Slack alert. Actions taken include updating the summary and ensuring role assignments, with tasks assigned to Andre King to check the access cluster health API. The incident started on January 16, 2025, at 1:18 PM UTC, and is customer-facing, potentially impacting service availability for users.

- Bridge URL: https://us06web.zoom.us/j/85371516923?pwd=r1BXaQCke3WudpzIb1LFJ3KXFj4UY7.1
- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing
- Root Cause Analysis: What caused this incident

Jan 16, 2025 - 05:27 PST
Investigating - An incident was reported involving an issue in the Elasticsearch production environment, classified as SEV2 and affecting customer-facing services. The incident was initially triggered by an alert from Slack, and action items have been outlined, including checking the access cluster health API, which is assigned to Andre King. JP Cheung is serving as the Commander of the incident response and has completed initial tasks such as updating the summary and ensuring role assignments. The status of the incident is currently "Started."

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing
- Root Cause Analysis: What caused this incident

Jan 14, 2025 - 06:30 PST
Investigating - sample message
Jan 10, 2025 - 00:18 PST
Investigating - A SEV1 incident has been reported concerning an issue with the `elasticsearch-prod` service, affecting customers and traced back to a problem with database access. The incident was triggered by an alert posted on Slack and was reported by JP Cheung, who completed tasks to ensure roles were assigned and update the summary, while Andre King was assigned to check the access cluster health API. Further follow-ups and tasks are still pending, and no scribe has been assigned to document the incident.

- Severity: SEV1
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing
- Root Cause Analysis: What caused this incident

Jan 09, 2025 - 23:36 PST
Identified - dskndm,
Jan 08, 2025 - 15:44 PST
Investigating - An issue was found in the Elasticsearch production database, classified as a SEV2 incident, potentially affecting customers. JP Cheung reported the incident after receiving an alert via Slack and is currently serving as the Commander overseeing the response efforts. Steps taken include ensuring role assignments and updating the Rootly summary, though no one has been assigned to check the access cluster health API yet.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing
- Root Cause Analysis: What caused this incident

Dec 26, 2024 - 07:48 PST
Update - Noticed something is wrong
Dec 23, 2024 - 11:14 PST
Update - Noticed something is wrong
Dec 23, 2024 - 11:11 PST
Investigating - Noticed something is wrong
Dec 23, 2024 - 11:06 PST
Update - we noticed something is wrong.
Dec 23, 2024 - 10:22 PST
Update - we noticed something is wrong.
Dec 23, 2024 - 10:19 PST
Investigating - we noticed something is wrong.
Dec 23, 2024 - 10:16 PST
Update - We have discovered an issue
Dec 23, 2024 - 06:51 PST
Investigating - We have discovered an issue
Dec 23, 2024 - 06:45 PST
Investigating - An incident with a SEV2 severity affecting the `elasticsearch-prod` service was reported by JP Cheung, who is also serving as the Commander. The incident, which started on December 17, 2024, at 9:14 PM UTC, remains unresolved, and actions have been taken to update the incident summary and assign roles, while checking the access cluster health API has been assigned to Andre Yang.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing
- Root Cause Analysis: What caused this incident

Dec 17, 2024 - 13:20 PST
Investigating - An incident was reported involving a SEV2 issue affecting the `elasticsearch-prod` service, which has implications for customer-facing operations. The incident was triggered by an alert in Slack, leading JP Cheung to assume the role of Commander to manage the situation. Actions taken so far include updating the summary and ensuring role assignments, while Andre King is tasked with checking the health of the access cluster API. The status of the incident remains ongoing as of now.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing
- Root Cause Analysis: What caused this incident

Dec 12, 2024 - 08:29 PST
Investigating - An issue was reported with the `elasticsearch-prod` service, classified as a SEV2 incident affecting customer-facing operations. The incident was triggered by an alert from Slack at 5:13 PM UTC on November 26, 2024, and is currently being managed by JP Cheung, who is the Commander for the incident. Action items include updating the summary and ensuring roles are assigned, both completed by JP Cheung, while a check on the access cluster health API is pending and assigned to Andre King.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Nov 26, 2024 - 09:18 PST
Investigating - An issue has been identified with the `elasticsearch-prod` service, categorized as a SEV2 incident that began on November 25, 2024. JP Cheung reported the incident, and though no customers have been mentioned as affected, it is classified as customer-facing. The incident was triggered by an alert from Slack. Actions taken so far include updating the /rootly summary and ensuring roles are assigned, both managed by JP Cheung, while Andre King has been tasked with checking the access cluster health API.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Nov 25, 2024 - 12:57 PST
Identified - testing SP
Nov 11, 2024 - 12:47 PST
Investigating - An issue has been identified in the Elasticsearch production database, categorized as a SEV2 incident, which is impacting customer-facing services. The incident was triggered by an alert received via Slack and was reported by JP Cheung, who is also serving as the incident commander. Steps taken so far include updating the summary and ensuring that roles are assigned, while a check on the cluster health API is still pending. No follow-ups have been documented at this stage.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Nov 11, 2024 - 08:14 PST
Investigating - Found an issue
Oct 09, 2024 - 14:08 PDT
Investigating - Post to status page
Oct 08, 2024 - 06:15 PDT
Investigating - An incident was reported involving the Elasticsearch production service, classified as SEV2, which started on October 4, 2024, and was triggered by an alert from Slack. JP Cheung, the commander, is currently overseeing the resolution process, while Andre King was responsible for checking the access cluster health API. JP Cheung has also completed updating the summary and ensuring roles are assigned, with no follow-up actions noted at this time.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Oct 04, 2024 - 07:12 PDT
Investigating - An issue was discovered affecting the elasticsearch-prod service, which is classified as SEV2 and customer-facing. The incident was triggered by an alert from Slack and was reported by JP Cheung, who is also acting as Commander, while Alexandra Chapin has been tasked with checking the access cluster health API. JP has completed the updates to the summary and ensured appropriate roles are assigned, with no follow-ups needed at this time.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Oct 01, 2024 - 12:48 PDT
Investigating - An incident was reported involving the `elasticsearch-prod` service, classified as SEV2, which began on October 1, 2024, at 5:51 PM UTC, with no immediate impact reported on customers. The incident was initially triggered by an undetermined issue that necessitated assessing the cluster health API. JP Cheung is leading the response as the Commander and has completed tasks including ensuring role assignments and updating the summary; however, there remains a task to check the cluster health API that currently lacks an assigned individual.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Oct 01, 2024 - 10:52 PDT
Investigating - A SEV2 incident involving the elasticsearch-prod service was reported by JP Cheung on October 1, 2024, at 1:40 PM UTC, affecting customer access. The incident is currently in progress, with JP Cheung serving as the Commander while updating the summary and assigning roles, and Purvai Nanda tasked with checking the access cluster health API to aid in resolution. There are no follow-up actions noted at this time.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Oct 01, 2024 - 06:50 PDT
Investigating - An incident with severity SEV2 has been identified involving the elasticsearch-prod service, which began on September 25, 2024, at 7:37 PM UTC due to an alert from Slack, reported by JP Cheung. The incident is customer-facing but the specific impact on customers is not detailed. JP Cheung is assigned as the Commander, and has completed tasks including updating the rootly summary and ensuring roles are assigned; however, there is an outstanding task to check the access cluster health API that remains unassigned.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Sep 25, 2024 - 12:40 PDT
Investigating - An incident was reported by JP Cheung regarding an issue with the `elasticsearch-prod` service, classified as a SEV2 severity level, which is currently affecting customers. The incident was triggered by an alert in Slack at 3:39 PM UTC on September 24, 2024. As part of the resolution efforts, tasks include checking the access cluster health API, assigned to Andre King, while JP Cheung has completed updating the rootly summary and ensuring roles are assigned.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Sep 24, 2024 - 08:44 PDT
Investigating - An incident was reported involving an issue with the `elasticsearch-prod` service, which has caused a SEV2 disruption. The incident was triggered by an alert from Slack shortly before it was officially reported by JP Cheung, who is currently serving as Commander. To address the problem, JP Cheung has completed tasks to update the rootly summary and assign roles, while Andre King is tasked with checking the cluster health API. No further follow-ups have been noted.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Sep 23, 2024 - 13:16 PDT
Investigating - An incident has been reported affecting the `elasticsearch-prod` service, categorized as a customer-facing issue with severity level SEV2, which began on September 20, 2024, at 1:48 PM UTC. The incident was triggered by an alert from Slack, and JP Cheung is acting as the Commander overseeing the response. Initial tasks have included updating the incident summary and ensuring roles are assigned, while Andre King has been tasked with checking the access cluster health API. Follow-up actions are pending as the incident progresses.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Customer Facing, Cloud

Sep 20, 2024 - 06:59 PDT
Investigating - A SEV2 incident with the elasticsearch-prod service was reported by JP Cheung due to a sagetap issue, which has affected customers. The incident was triggered by an alert from Slack, and JP Cheung has taken the lead as Commander, completing tasks such as updating the summary and assigning roles, while Andre King is tasked with checking the access cluster health API to further investigate the issue.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Sep 19, 2024 - 07:40 PDT
Investigating - A SEV2 incident was reported by JP Cheung regarding an issue with the `elasticsearch-prod` service, which has been impacting customers since it started on September 18, 2024. The incident was triggered by an alert from Slack, and JP Cheung is currently acting as Commander; no Scribe has been assigned. Completed action items include updating the summary and ensuring roles are assigned, while checking the access cluster health API remains to be done.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Sep 18, 2024 - 15:43 PDT
Investigating - An incident was reported by JP Cheung regarding an issue affecting the elasticsearch-prod service, classified as severity SEV2, following an alert received via Slack. The incident started on September 18, 2024, at 8:14 PM UTC, and has not yet been resolved. So far, JP Cheung has ensured that roles are assigned and updated the summary, while Jamie Baker is tasked with checking the access cluster health API. There are no follow-ups noted at this time.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Sep 18, 2024 - 13:20 PDT
Investigating - An incident categorized as severity SEV2 has occurred affecting the 'elasticsearch-prod' service, which is customer-facing, as reported by JP Cheung. The incident began on September 17, 2024, prompting immediate actions including testing assigned to Jamie Baker, health checks on the access cluster, and role assignments managed by JP Cheung, who is also leading the incident response.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Sep 17, 2024 - 06:24 PDT
Investigating - elasticsearch is down
Sep 10, 2024 - 12:17 PDT
Update - summary
Sep 06, 2024 - 12:09 PDT
Update - summary
Aug 30, 2024 - 13:56 PDT
Investigating - summary
Aug 30, 2024 - 13:39 PDT
Update - An incident was reported by JP Cheung at 4:20 PM UTC on September 6, 2024, involving an issue with the elasticsearch-prod service, categorized as SEV2 and impacting customers. As of now, the team is investigating the defined impact of the problem, with action items completed by JP Cheung, including updating the summary and assigning roles, while Nicole Bodzewski is tasked with checking the access cluster health API.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Sep 06, 2024 - 09:28 PDT
Investigating - Defined impact, investigating
Sep 06, 2024 - 09:24 PDT
Investigating - basic summary
Sep 04, 2024 - 14:37 PDT
Update - new issue found
show the change

Sep 04, 2024 - 06:25 PDT
Investigating - found an issue
Sep 04, 2024 - 06:24 PDT
Update - The incident involved a website outage affecting many customers, triggered by server errors caused by a recent deploy. The team quickly identified a significant increase in 500 errors linked to this deploy, leading JP Cheung to initiate a rollback that successfully restored functionality and reduced errors. Andre communicated updates to customer support and drafted a customer message, while Sara prepared communication for internal leadership. Following the rollback, the website stabilized and a customer update was sent to inform users of the resolution.

- Severity: SEV1
- Services: elasticsearch-prod
- Incident Types: Customer Facing, Cloud

Sep 03, 2024 - 13:30 PDT
Investigating - An incident was reported by JP Cheung regarding a website outage affecting a significant number of customers, triggered by a recent deploy that introduced server errors. The team acted quickly, with James investigating system anomalies and JP pulling logs, which revealed a substantial increase in 500 errors linked to the deploy. In response, JP initiated a rollback, which successfully restored website functionality and reduced error rates. Andre communicated with customer support and drafted a customer update, while Sara prepared an internal leadership update. The rollback completed successfully, stabilizing the system, and the customer update was sent out to inform users of the resolution.
Sep 03, 2024 - 13:23 PDT
Update - An issue was found with the `elasticsearch-prod` service, classified as a SEV1 severity incident and impacting customers. The incident started on August 29, 2024, at 3:24 PM UTC, and was reported by JP Cheung, who is also the Commander managing the response. Action items include updating summaries and ensuring roles are assigned, but some tasks remain unassigned. Follow-up actions are pending as the team works to resolve the problem.

- Severity: SEV1
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Aug 29, 2024 - 08:37 PDT
Investigating - found an issue
Aug 29, 2024 - 08:27 PDT
Investigating - Requesting 30k in equipment
Aug 27, 2024 - 10:27 PDT
Investigating - The incident involved a significant website outage affecting almost all customers, triggered by a recent deployment that caused a spike in server errors. The team, led by JP Cheung, quickly identified the issue through log analysis and Datadog monitoring, and decided to roll back the deploy as a primary resolution step. As a result of their prompt actions, they observed a decrease in error rates, and the website was restored to normal functionality after the rollback was completed. A customer update was drafted and sent out to inform users of the situation, while leadership was kept apprised of the developments.
Aug 22, 2024 - 15:23 PDT
Investigating - The incident involved a website outage affecting nearly all customers, triggered by issues following a recent deployment that resulted in a significant spike in server errors. In response, the team, led by GTOC Manager JP Cheung, analyzed logs and Datadog data, which indicated that the deployment was likely the cause of the errors. They promptly initiated a rollback of the deployment, which successfully restored website functionality. The customer support team was kept informed to manage incoming inquiries, and a customer update was drafted and sent to communicate the situation. The incident response included active participation from several GTOC Site Engineers, with continued monitoring ensured post-rollback.
Aug 22, 2024 - 11:13 PDT
Update - A significant website outage occurred on 8/22/2024, affecting almost all customers, which resulted in overwhelming calls to customer support. The incident was triggered by a recent deployment that led to a spike in server errors. The response team, including JP Cheung, Prasanthi, and JJ, acted swiftly, pulling logs and analyzing data to confirm the rollback of the faulty deploy was necessary. Following the rollback, error rates decreased, and the website functionality was restored, allowing for a customer update to be sent out regarding the resolution.
Aug 22, 2024 - 11:05 PDT
Update - A significant website outage occurred on 8/22/2024, affecting almost all customers, which resulted in overwhelming calls to customer support. The incident was triggered by a recent deployment that led to a spike in server errors. The response team, including JP Cheung, Prasanthi, and JJ, acted swiftly, pulling logs and analyzing data to confirm the rollback of the faulty deploy was necessary. Following the rollback, error rates decreased, and the website functionality was restored, allowing for a customer update to be sent out regarding the resolution.
Aug 22, 2024 - 10:45 PDT
Investigating - A significant website outage occurred on 8/22/2024, affecting almost all customers, which resulted in overwhelming calls to customer support. The incident was triggered by a recent deployment that led to a spike in server errors. The response team, including JP Cheung, Prasanthi, and JJ, acted swiftly, pulling logs and analyzing data to confirm the rollback of the faulty deploy was necessary. Following the rollback, error rates decreased, and the website functionality was restored, allowing for a customer update to be sent out regarding the resolution.
Aug 22, 2024 - 10:04 PDT
Update - test
Aug 22, 2024 - 08:51 PDT
Identified - Test
Aug 22, 2024 - 08:50 PDT
Update - An issue has been identified in the Elasticsearch production service, causing a cloud-based incident classified as SEV2 and potentially impacting customers. The incident was triggered by an alert received via Slack at 7:42 PM UTC. JP Cheung has taken command of the situation, ensuring that roles are assigned and updating the incident summary, while Jamie Baker is tasked with checking the access cluster health API.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Aug 21, 2024 - 12:55 PDT
Investigating - found an issue, investigating
Aug 21, 2024 - 12:46 PDT
Investigating - elasticsearch issue
Aug 20, 2024 - 13:19 PDT
Investigating - Elasticsearch is down
Aug 19, 2024 - 07:13 PDT
Update - The incident involves an outage of Elasticsearch (severity SEV2), which has affected customer access to the website and triggered numerous support calls. The issue was identified after an alert reported Elasticsearch being down, and it is suspected that the problem relates to a recent deploy that may have introduced breaking changes, evidenced by a spike in 5xx errors noticed in Datadog. The team, led by JP Cheung as Commander, is in the process of rolling back the last deploy and monitoring system metrics, with a customer update already drafted to inform users about the ongoing resolution efforts. The primary members involved in the response are Bach (Incident Commander), Stanley (SRE Lead), and JP (DevOps Engineer), who are working together to rectify the situation and ensure communication with customers and leadership.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Aug 16, 2024 - 11:18 PDT
Update - The incident involves an outage of Elasticsearch (severity SEV2), which has affected customer access to the website and triggered numerous support calls. The issue was identified after an alert reported Elasticsearch being down, and it is suspected that the problem relates to a recent deploy that may have introduced breaking changes, evidenced by a spike in 5xx errors noticed in Datadog. The team, led by JP Cheung as Commander, is in the process of rolling back the last deploy and monitoring system metrics, with a customer update already drafted to inform users about the ongoing resolution efforts. The primary members involved in the response are Bach (Incident Commander), Stanley (SRE Lead), and JP (DevOps Engineer), who are working together to rectify the situation and ensure communication with customers and leadership.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Aug 16, 2024 - 11:17 PDT
Investigating - Elasticsearch is down
Aug 16, 2024 - 11:13 PDT
Investigating - elasticsearch is down
Aug 15, 2024 - 09:54 PDT
Investigating - Elasticsearch is down
Aug 13, 2024 - 10:13 PDT
Update - The incident report details a SEV2 issue in which Elasticsearch is currently down, impacting customer services. The problem was triggered by an alert received by JP Cheung via Slack. Actions so far include a check of the access cluster health API, assigned to Jamie Baker, along with updating the incident summary and ensuring role assignments, both completed by JP Cheung, who is commanding the response to the incident. There are no follow-ups identified at this time.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Aug 13, 2024 - 08:42 PDT
Investigating - Elasticsearch is down
Aug 13, 2024 - 08:39 PDT
Update - A SEV2 incident has been initiated due to an issue found in the `elasticsearch-prod` service, which may affect customers. The problem was triggered by an alert received on Slack, reported by JP Cheung, who is also acting as the Commander for the incident. So far, JP Cheung has updated the incident summary and ensured that roles are assigned, while Jamie Baker is tasked with checking the access cluster health API. Further action items and follow-ups are still pending.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Aug 12, 2024 - 16:43 PDT
Investigating - issue found
Aug 12, 2024 - 16:38 PDT
Update - The incident involves an outage of the Elasticsearch service, categorized as a SEV2 event that began on August 12, 2024. Customers are experiencing significant disruption in accessing the SailPoint website due to this issue. The problem was believed to be related to a recent deployment, indicated by elevated error rates and a notable drop in response times observed shortly after the deployment. Immediate actions were taken by JP Cheung, who reported the incident and is acting as the Commander, while Andre King was assigned to check the cluster health API. Communication efforts were handled by Dan and Fabio, who coordinated troubleshooting, initiated a rollback of the last deployment, and updated both customer and internal teams on the service status.

- Severity: SEV2
- Services: elasticsearch-prod
- Incident Types: Cloud, Customer Facing

Aug 12, 2024 - 12:38 PDT
Investigating - Elasticsearch is down
Aug 12, 2024 - 12:35 PDT
Investigating - something is wrong
Aug 12, 2024 - 10:11 PDT
Investigating - Service is down
Aug 09, 2024 - 15:10 PDT
Investigating - blah blah blah
Aug 08, 2024 - 21:30 PDT
Investigating - Canned message
Jul 30, 2024 - 14:11 PDT
Investigating - We are currently experiencing an issue impacting the elasticsearch-staging service due to the API being down. Our team is actively working on resolving this incident. We appreciate your patience and will provide updates as soon as possible. Thank you for your understanding.
Jul 23, 2024 - 13:23 PDT
Investigating - investigating
Jul 16, 2024 - 13:16 PDT
Investigating - investigating
Jul 16, 2024 - 13:16 PDT
Investigating - We are currently investigating a SEV2 incident affecting the API in the elasticsearch-staging service, impacting our customers. Our team is actively working on resolving the issue. We will provide further updates as soon as possible. Thank you for your patience.
Jul 08, 2024 - 13:25 PDT
Investigating - Found problem, investigating
Jun 28, 2024 - 12:43 PDT
Investigating - Found problem, investigating
Jun 28, 2024 - 12:43 PDT
Investigating - Due to a DDoS attack targeting our web infrastructure, the NTWRK site experienced an outage affecting user logins. Our team is actively working to restore services and improve response capabilities to prevent future incidents. Thank you for your patience and understanding.
Apr 19, 2024 - 09:37 PDT

- Checkout: Operational (100.0% uptime over the past 90 days)
- Service ABC: Degraded Performance (100.0% uptime over the past 90 days)
- Functionality A: Degraded Performance (100.0% uptime over the past 90 days)
- Functionality B: Operational (100.0% uptime over the past 90 days)
- Login via SSO: Under Maintenance (97.66% uptime over the past 90 days)
- Submit Reviews: Operational (100.0% uptime over the past 90 days)
- Slack Connections: Major Outage
- Slack Messaging: Operational
- Slack Notifications: Partial Outage

Apr 2, 2025

No incidents reported today.

Apr 1, 2025

No incidents reported.

Mar 31, 2025

No incidents reported.

Mar 30, 2025

No incidents reported.

Mar 29, 2025

No incidents reported.

Mar 28, 2025

No incidents reported.

Mar 27, 2025

No incidents reported.

Mar 26, 2025

No incidents reported.

Mar 25, 2025

No incidents reported.

Mar 24, 2025

No incidents reported.

Mar 23, 2025

No incidents reported.

Mar 22, 2025

No incidents reported.

Mar 21, 2025

No incidents reported.

Mar 20, 2025

No incidents reported.

Mar 19, 2025

Unresolved incident: DB is at 100%.