Service Disruption - LHR SL3
Incident Report for Cornerstone
Postmortem

On November 30th, 2022, internal monitoring tools alerted Cornerstone engineers that maintenance pages were being served to customers hosted in LHR-SL3 (On-Prem).

Cornerstone engineers investigated and troubleshot the issue, which was related to storage IO contention on backend database cluster nodes. The nodes were reset to return the systems to a healthy state. During the incident, engineers also detected a cluster configuration issue, which delayed system recovery.

To reduce the risk of recurrence, Cornerstone has identified the cause of the incident and remediated the cluster configuration issue. Additional reliability checks have been performed and corrective action has been taken, including but not limited to updating processes to avoid or throttle background processes that may trigger IO contention.

Posted Dec 06, 2022 - 23:28 PST

Resolved
The CSOD Technology Team observed a service disruption on this swimlane. The problem began at 1:20 AM Pacific Time and service was restored at 3:12 AM Pacific Time. During this time, clients with portals on this swimlane were not able to access the application.
Posted Nov 30, 2022 - 03:32 PST
Update
The issue has been identified and we are working to resolve it with urgency. Another update will be provided within the next 60 minutes.
Posted Nov 30, 2022 - 02:37 PST
Identified
This swimlane (London SL3) is experiencing a service disruption. This is our top priority and we are working to resolve the problem as soon as possible. Please check back periodically for additional updates, which will be posted as they become available.
Posted Nov 30, 2022 - 01:43 PST
This incident affected: LHR-SL3 (Uptime, Response Time).