Elevated API Errors

Incident Report for Adapty

Postmortem

This outage was caused by Cloudflare WAF Managed Rules migration. We use them to block unwanted requests to our API. Those are mostly bots. We also use rate-limiting rules. This update should’ve been invisible to the customers. But for some reason, the number of requests to our load balancer nodes increased almost 10x. This caused all of them to be 100% occupied and they started to drop most of the new connections. After introducing new load balancer nodes to the cluster, we were able to handle incoming requests. Over time the number of requests has gone back to normal.

We contacted Cloudflare for the details on what could be the reason for this behavior, and we’ll provide updates once we hear back from them.

To prevent this from happening in the future, we will introduce rate limiting not only on the Cloudflare level but also inside our infrastructure.

Posted May 04, 2023 - 20:09 UTC

Resolved

This incident has been resolved.

Posted May 03, 2023 - 11:50 UTC

Update

We are continuing to monitor for any further issues.

Posted May 03, 2023 - 11:31 UTC

Monitoring

A fix has been implemented and we are monitoring the results.

Posted May 03, 2023 - 11:31 UTC

Investigating

We're experiencing an elevated level of API errors and are currently looking into the issue.

Posted May 03, 2023 - 11:10 UTC

This incident affected: API and Dashboard.