Degraded Performance - Delay in OCPP Log Processing
Executive Summary
Between 19th June 2024 12:39 UTC and 21st June 2024 14:00 UTC, during a routine historical log maintenance activity on our log storage database cluster, the system experienced degraded performance due to a slowdown in OCPP log processing. This delay in log processing affected the OCPP Log display on the dashboard.
Events Timeline
After the incident was closed, the Engineering team continued to review all logs from this period for any adverse effects.
Closed June 21 14:00 UTC
System performance was fully restored, and log processing resumed at its expected speed. There is an existing backlog to be processed, and we expect these logs to be available on the Admin dashboard within a few hours.
Update & Monitoring June 21 10:00 UTC
After pausing the maintenance activity, the log processing speed gradually returned to normal.
Update & Monitoring June 21 08:00 UTC
The team decided to temporarily pause the maintenance activity to prevent further degradation in the performance of the log database cluster.
Update & Monitoring June 21 07:30 UTC
Investigation revealed that the ongoing historical log maintenance activity was impacting the performance of the log database cluster.
Investigating June 20 09:54 UTC
The team observed a significant slowdown in log processing speed, which delayed the display of logs.
Identified June 19 20:58 UTC
It was observed that the logs for certain sessions or charge stations were not displaying on the dashboard.
Maintenance Activity Initiated June 17 07:00 UTC
The team initiated a historical log maintenance activity on the log database cluster.
Mitigation Actions
To prevent similar issues in the future, we have taken or are taking the following actions:
- Enhanced alerting and monitoring systems for our log processing and log storage infrastructure.