O365 SSO & Google Calendar Functionality Impaired
Incident Report for Teem
Postmortem

Earlier this month, we notified you of the unexpected technical challenges some customers experienced as a result of a major infrastructure upgrade for the Teem platform.

We take performance and security very seriously, which is why we initiated the update to align with SOC 2 Type 2 data security standards. We appreciate your patience and understanding while we worked to resolve the temporary outages.

Our first priority was to get you back up and running; our second is to give you some background on what happened.

Here’s what we determined through our root cause analysis and how we’re preventing this moving forward.

Issue: O365 Sign-in Error 500

Cause: During a routine deployment, one of the third-party software libraries that Teem SSO relies on was inadvertently upgraded to its latest major release, which included significant changes.

Remediation:

· Reverted the library to the previous version and pinned that version in package management to prevent unintentional upgrades
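As an illustration of the pinning step, here is a minimal sketch of a pre-deploy check that flags dependencies lacking an exact version pin. It assumes a pip-style requirements list, which this report does not confirm for the SSO library; the function name and the package names in the usage note are hypothetical.

```python
import re

# Matches "name==version", optionally with extras such as "name[extra]==1.0".
_PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s;]+")


def unpinned_requirements(lines):
    """Return the requirement lines that lack an exact '==' pin (hypothetical helper)."""
    loose = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options such as "-r"
            continue
        if not _PINNED.match(line):
            loose.append(line)
    return loose
```

A deploy script could refuse to continue when `unpinned_requirements(...)` returns a non-empty list, for example when it sees `example-sso-lib>=2.0` rather than `example-sso-lib==2.4.1`.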

Issue: Post-Upgrade Device Connectivity

Cause: Teem’s core service had an interruption starting on Jan. 9, 2021. During the interruption, API calls from the EventBoard device to the service could return any of several error codes, including 401 Unauthorized. Although the 401 status was returned in error, the device followed its designed security protocol and logged out. During a logout, EventBoard deletes all API tokens, downloaded themes, settings, and calendar data. It then reverts to a not-signed-in state and provides a 6-digit pin code for reactivation. Because the core service was still interrupted, however, devices that had logged out received an error instead of a pin code.

Some customers were stuck on an “Authenticating with Teem …” screen. In these cases, after logging out, EventBoard showed a message saying it could not communicate with Teem, and selecting “Retry” locked the app on that screen (a secondary symptom of the core issue).

Remediation:

· Deployed a hotfix allowing devices to automatically activate and log in at the pin code screen if they still exist in the Teem database and are connected to only one Teem customer instance

· Modified the EventBoard app to increase fault tolerance for false 401s and to no longer get stuck at the “Authenticating with Teem …” screen

· Modified core service (monolith) so it doesn’t return 401s incorrectly
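One way a client could tolerate false 401s is to log out only after several consecutive 401 responses, and to treat every other error as transient. The sketch below illustrates that idea only; it is not the actual EventBoard change, and the class name, threshold, and return values are assumptions.

```python
from dataclasses import dataclass


@dataclass
class AuthGuard:
    """Decide when a device should log out, tolerating isolated 401 responses."""

    max_consecutive_401s: int = 3  # hypothetical threshold, not from the report
    consecutive_401s: int = 0

    def record(self, status_code: int) -> str:
        """Return 'ok', 'retry', or 'logout' for an API response status."""
        if 200 <= status_code < 300:
            self.consecutive_401s = 0
            return "ok"
        if status_code == 401:
            self.consecutive_401s += 1
            if self.consecutive_401s >= self.max_consecutive_401s:
                return "logout"  # repeated 401s: credentials are probably invalid
            return "retry"
        # Other errors (for example 5xx) say nothing about credentials,
        # so the streak is reset and the stored tokens are kept.
        self.consecutive_401s = 0
        return "retry"
```

With this guard, a single 401 during a service interruption leads to a retry rather than the deletion of tokens, themes, settings, and calendar data.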

Issue: App.Teem.com Platform Outage

Cause: When deploying code on Jan. 14, 2021, an errant pip upgrade prevented servers from receiving the deployed code and caused services to stop, interrupting all aspects of Teem.

Remediation:

· Updated deploy script to pin pip version

· Cleared salt cache and confirmed correct deploy script on servers

· Changed canary process to better detect downed servers

· Ongoing: Updating underlying framework and all packages
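A canary check that detects both downed servers and servers that never received the new code could look roughly like the sketch below. It is an illustration under stated assumptions, not Teem's canary process: the function name, the version-reporting health data, and the threshold parameter are all hypothetical.

```python
def canary_passes(health, expected_version, min_healthy_fraction=1.0):
    """Return True if enough canary servers are up and running the expected code.

    `health` maps a server name to the version string it reports,
    or to None when the server did not respond at all.
    """
    if not health:
        return False  # no canary data is treated as a failure, not a pass
    good = sum(1 for version in health.values() if version == expected_version)
    return good / len(health) >= min_healthy_fraction
```

For example, `canary_passes({"web-1": "1.2", "web-2": None}, "1.2")` evaluates to `False`, so the rollout would stop when one canary server is down.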

We apologize for any inconvenience and are continually working toward providing a more reliable experience for you.

For additional information or to report an issue, please reach out to your Account Manager or visit help.teem.com to contact our Customer Support team.

Thank you

Posted Jan 29, 2021 - 15:17 MST

Resolved
We have verified that Google sync has returned to normal performance. As such, this incident is now marked as resolved.
Posted Jan 15, 2021 - 17:54 MST
Identified
The Engineering team has identified the source of the problem with Google Calendar functionality and is working to mitigate it. The next update will be provided at 8pm MST.
Posted Jan 15, 2021 - 16:10 MST
Update
At this time, SSO functionality has been restored and is operational. During the investigation, we identified that Google Calendar functionality is impaired; therefore, this incident will remain open. We will provide another update at 4pm MST regarding this component.
Posted Jan 15, 2021 - 14:00 MST
Investigating
We have received reports that a subset of customers using O365 SSO are not able to authenticate into Teem products. Our Engineering team is currently investigating, and an update will be provided at 2pm MST.
Posted Jan 15, 2021 - 11:52 MST
This incident affected: Google Apps Calendar and Authentication (SSO).