Earlier this month, we notified you of the unexpected technical challenges some customers experienced as a result of a major infrastructure upgrade for the Teem platform.
We take performance and security very seriously, which is why we initiated the update to align with SOC 2 Type 2 data security standards. We appreciate your patience and understanding while we worked to resolve the temporary outages.
Our first priority was to get you back up and running; our second is to give you some background on what happened.
Here’s what we determined through our root cause analysis and how we’re preventing these issues from recurring.
Issue: O365 Sign-in Error 500
Cause: During a routine deployment, one of the third-party software libraries that Teem SSO relies on was inadvertently upgraded to its latest major release, which introduced significant changes.
Remediation:
· Reverted the library and pinned it to the previous version in package management to prevent unintentional upgrades
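To illustrate what pinning guards against, here is a minimal sketch that flags dependencies not locked to one exact version. It assumes a pip-style requirements file, which this notice does not confirm for the SSO library; the function name `find_unpinned` and the package names are hypothetical.

```python
import re

# An exact pin looks like "name==1.2.3"; anything else can float to a newer release.
_PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9._,-]+\])?==[A-Za-z0-9._!+-]+$")


def find_unpinned(requirements_text):
    """Return requirement lines that are not pinned to one exact version."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blank lines and pip options
            continue
        if not _PINNED.match(line):
            unpinned.append(line)
    return unpinned


# Hypothetical package names, for illustration only.
example = """
sso-helper==2.4.1
other-lib>=1.0
"""
print(find_unpinned(example))  # ['other-lib>=1.0']
```

A check like this could run before each deployment so that a floating version specifier is caught before it pulls in a new major release.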
Issue: Post-Upgrade Device Connectivity
Cause: Teem’s core service had an interruption starting on Jan. 9, 2021. During the interruption, when an EventBoard device made API calls to the service, the response status could be one of several error codes, including 401 Unauthorized. Although this status code was returned in error, the device executed its designed security protocol and logged out. During a logout, EventBoard deletes all API tokens, downloaded themes, settings, and calendar data. It then reverts to a not-signed-in state and provides a 6-digit pin code for reactivation. Some customers were stuck on an “Authenticating with Teem …” screen. In these cases, after logging out, EventBoard showed a message saying it could not communicate with Teem, and selecting “Retry” locked the app on that screen (a secondary symptom of the core issue). This happened because, after a logout, the interrupted core service would return an error instead of a pin code.
Remediation:
· Deployed a hotfix that lets devices automatically activate and log in at the pin code screen if they still exist in the Teem database and are connected to only one Teem customer instance
· Modified the EventBoard app to better tolerate false 401s and to no longer get stuck at the “Authenticating with Teem …” screen
· Modified the core service (monolith) so it no longer returns 401s incorrectly
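One common way a client can tolerate spurious 401s is to require several in a row before treating them as a real sign-out. The sketch below shows that general idea only; it is not the EventBoard implementation, and the class name `UnauthorizedGuard` and the threshold value are assumptions made for illustration.

```python
class UnauthorizedGuard:
    """Decide when a run of 401 responses should trigger a device logout."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # hypothetical value, not taken from this notice
        self.consecutive_401s = 0

    def should_log_out(self, status_code):
        """Record one API response and report whether logout is warranted."""
        if status_code == 401:
            self.consecutive_401s += 1
        else:
            # Any other response, including a 5xx error, breaks the streak.
            self.consecutive_401s = 0
        return self.consecutive_401s >= self.threshold


guard = UnauthorizedGuard(threshold=3)
for status in (401, 500, 401, 401, 401):
    decision = guard.should_log_out(status)
print(decision)  # True, because the last three responses were all 401
```

With a guard like this, an isolated 401 mixed in with other errors during a service interruption would not wipe the device's tokens and settings.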
Issue: App.Teem.com Platform Outage
Cause: During a code deployment on Jan. 14, 2021, an errant pip upgrade prevented servers from receiving the new code and caused services to stop, interrupting all aspects of Teem.
Remediation:
· Updated deploy script to pin pip version
· Cleared salt cache and confirmed correct deploy script on servers
· Changed canary process to better detect downed servers
· Ongoing: Updating underlying framework and all packages
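As a rough illustration of a canary check that detects downed servers, the sketch below compares what each server reports after a deployment with the build that was expected. The function name `failed_canaries`, the host names, and the report format are hypothetical and do not describe Teem's actual tooling.

```python
def failed_canaries(expected_version, reports):
    """Return hosts that are down or not running the expected build.

    reports maps host name -> (is_up, running_version). A host that
    reports no version (None) is treated as failed.
    """
    failed = []
    for host, (is_up, version) in sorted(reports.items()):
        if not is_up or version != expected_version:
            failed.append(host)
    return failed


# Hypothetical post-deploy reports: one healthy host, one down, one stale.
reports = {
    "web-1": (True, "build-42"),
    "web-2": (False, None),
    "web-3": (True, "build-41"),
}
print(failed_canaries("build-42", reports))  # ['web-2', 'web-3']
```

Checking the running version as well as the up/down state would catch both failure modes described above: servers that never received the new code and services that stopped.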
We apologize for any inconvenience and are continually working toward providing a more reliable experience for you.
For additional information or to report an issue, please reach out to your Account Manager or visit help.teem.com to contact our Customer Support team.
Thank you.