All chart metrics are now fully caught up. The root cause of the incident was an attempt to partition a table while a database vacuum was running, which locked a critical table and cascaded to the rest of the application. We'll be adjusting our vacuum and partitioning schedules to keep this lock from happening again.
Posted Jun 03, 2019 - 10:36 MDT
We've identified and fixed the database connection issue. We are currently loading the backlog of data that was held during the incident. Data will be appearing in the UI shortly.
Posted Jun 03, 2019 - 10:03 MDT
We appear to be using more than the expected number of database connections, which is causing failures in our Web UI. Ingestion is backed up, but incoming data is safe and is being collected.