Description
On Sunday, February 12, around 18:20 MST, some Exchange and Office 365 users began reporting that EventBoard displays were no longer being updated with new or changed calendar events. The issue was escalated at 18:44, and internal teams began triage. At 21:30 a patch was released to address the issue. Following confirmation from clients and system monitoring, the incident was closed at 22:00.
Root Cause / Remediation
A new logging component was released in Teem’s internal EWS service handling components on Thursday, Feb 7. A previously unknown, undocumented defect in the RollingFileAppender of Apache log4net caused some requests to these internal services to fail with an invalid parameter message, progressively overloading one of the logging and metric subsystems. This overload had a downstream effect on the synchronization system on Feb 10, causing a backlog of requests to build up, which produced the delays reported by clients. The problem component was removed and replaced, and the backlog of events was quickly processed, bringing all affected devices up to date.