We are investigating an issue with Teem
Incident Report for Teem
Postmortem

Teem Detailed Root Cause Analysis | 4.05.2024 

404 Error Site Wide Outage 

 

We are truly grateful for your continued support and loyalty. We value your feedback and appreciate your patience as we worked to resolve this incident.  

 

Description: 

On April 5th, 2024, internal teams and customer support noticed problems with logging in and with accessing sites. There were also reports of some Teem instances being unable to log in and forwarding users to a 404 error instead. This issue impacted a large number of Teem customers. 

The cause of this site outage was a migration to a new server, intended to allow easier rollbacks and better data integrity. Unfortunately, the existing server encountered a performance error, so the migration had to be carried out earlier than planned. 

Upon receiving notice of the site access issues, our dedicated team promptly took action to restart the migration process, ensuring its successful completion and resolving the issue. 

 

Type of Event: 

Site Access  

 

Remediation: 

As soon as the login issues were reported, our dedicated Teem team restarted the migration process and saw it through to successful completion, resolving the issue. 

 

Timeline: 

 April 5 

  • (4:14 AM) – Internal teams noticed issues with logging in as well as with site access 
  • (4:21 AM) – Customers also began reporting the issue; by this point the team had already been notified 
  • (4:24 AM) – Internal teams started the investigation 
  • (5:23 AM) – Internal teams notified us that the investigation was continuing and that they were close to finding the issue 
  • (6:43 AM) – The issue was found and a fix was implemented; the rollback of the server to the new DB still needed a few more hours to finish 
  • (12:45 PM) – The database was moved and restored 

 

Total Duration of Event: 

(0 days/8 hours/30 minutes)  
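
As a cross-check on the duration, the elapsed time can be derived from the first and last entries in the timeline. The snippet below is a minimal sketch, not part of the original report: the function name `format_duration` is illustrative, and it assumes both timeline timestamps are in the same time zone (the timeline does not state one).

```python
from datetime import datetime

# Timestamps taken from the timeline (April 5, 2024).
# Assumption: both are expressed in the same time zone.
first_report = datetime(2024, 4, 5, 4, 14)   # 4:14 AM, first internal report
restored = datetime(2024, 4, 5, 12, 45)      # 12:45 PM, database moved and restored


def format_duration(start, end):
    """Return the elapsed time between start and end as 'D day/H hours/M minutes'."""
    total_minutes = int((end - start).total_seconds() // 60)
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return f"{days} day/{hours} hours/{minutes} minutes"


print(format_duration(first_report, restored))  # prints "0 day/8 hours/31 minutes"
```

The exact difference between the two timeline entries is 8 hours 31 minutes, which is consistent with the rounded figure reported for this event.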

 

Root Cause Analysis: 

 

The production outage was caused by the database we were hosting on. We then needed to move the database to our standby server and set up a new one. We had planned the migration for a few weeks later and were preparing for it, but the server encountered errors before the planned maintenance could happen, forcing us to start the migration early. We started the migration server, backed up the DB, and re-established the connection with services. 

 

Preventative Action:  

The Teem team started the migration process, monitored it through to successful completion, and reviewed the result with internal testing to ensure the issue was resolved for customers. This incident has been closed, but our team remains dedicated to closely monitoring future updates as they are released to ensure the best customer experience.

Posted Apr 17, 2024 - 11:41 MDT

Resolved
Internal teams and customers have confirmed that the issue is resolved. We appreciate everyone's patience as our engineers worked to resolve this.
Posted Apr 05, 2024 - 12:45 MDT
Monitoring
A fix has been implemented by the Engineering team. We'll be going into Monitoring for the next hour; an update will be given at 14:30 CST.
Posted Apr 05, 2024 - 11:30 MDT
Update
We are continuing to work on the fix for this issue. We will post another update at 13:00 CST.
Posted Apr 05, 2024 - 09:57 MDT
Identified
The issue with Teem has been identified and a fix is being implemented. We will post another update at 11:00 CST.
Posted Apr 05, 2024 - 08:04 MDT
Investigating
We are currently investigating an issue with Teem. We will update you when we have more information.
Posted Apr 05, 2024 - 04:17 MDT