Opalstack - Intermittent outages on opal4 – Incident details

Resolved
Operational
Started about 6 years ago
Lasted 3 months

Affected

Americas Hosting

Operational from 12:36 AM to 4:03 PM

Web Hosting - Shared

Operational from 12:36 AM to 4:03 PM

opal4.opalstack.com

Operational from 12:36 AM to 4:03 PM

Updates
  • Resolved

    During last week's maintenance, our upstream provider updated the server firmware on opal4. We've seen no further issues since then.

  • Monitoring

    The maintenance is complete and Opal4 is back online. We'll continue to monitor.

  • Identified

    Opal4 is down for the emergency maintenance scheduled earlier today. The total expected downtime is less than one hour.

  • Investigating

    opal4 will go down this evening at 9 PM US Central (2020-04-01 21:00 UTC-5) for emergency maintenance. Total downtime should be less than one hour.

    The ongoing issue on opal4 has been high system load caused by two separate types of attacks:

    1. High Apache RAM usage due to attacks against common targets like WordPress sites
    2. High CPU usage by the system firewall under a high volume of SYN packets

    We've resolved the first problem by tuning our application firewall rules and by working with a couple of specific site owners who were receiving the brunt of the attacks.

    The second problem is part of what we're troubleshooting this evening.

    At first glance, the SYN packet issue looks like a common SYN flood attack, but other shared servers in our infrastructure receive similar amounts of traffic and have no trouble mitigating it.
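
    For readers who want to compare half-open connection counts across their own Linux hosts, here is a minimal sketch. It is not the tooling we use. It assumes the standard /proc/net/tcp layout, where the fourth column is a hex state code and 03 means SYN_RECV. The function names are hypothetical, and hosts with SYN cookies enabled may report fewer SYN_RECV entries than the traffic would suggest.

```python
from collections import Counter

# Hex state codes from the "st" column of Linux /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED",
    "02": "SYN_SENT",
    "03": "SYN_RECV",
    "04": "FIN_WAIT1",
    "05": "FIN_WAIT2",
    "06": "TIME_WAIT",
    "07": "CLOSE",
    "08": "CLOSE_WAIT",
    "09": "LAST_ACK",
    "0A": "LISTEN",
    "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text: str) -> Counter:
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    # The first line is the column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        # fields: slot, local address, remote address, state, ...
        state = TCP_STATES.get(fields[3].upper(), "UNKNOWN")
        counts[state] += 1
    return counts


def syn_recv_ratio(counts: Counter) -> float:
    """Fraction of listed sockets that are half-open (SYN_RECV)."""
    total = sum(counts.values())
    return counts["SYN_RECV"] / total if total else 0.0
```

    Passing the contents of /proc/net/tcp from two servers to count_tcp_states gives a rough side-by-side view of how many half-open connections each one is holding.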

    In the past 24 hours, we've discovered key differences between opal4's hardware and that of the other servers, so we're working with our upstream provider to sort that out.

    Tonight's downtime is at their request to allow them to perform hardware diagnostics in support of that investigation.

  • Monitoring

    Opal4 just went through a brief spike in system load during which performance was degraded, but the system is back to normal at this time.

  • Identified

    We've identified two distinct attack patterns responsible for the intermittent outages on opal4 and are putting measures in place to mitigate them.

  • Monitoring

    The intermittent issue on opal4 has started up again. We will continue to monitor as we work to resolve it.

  • Resolved

    We've seen no further issues in the past several hours.

  • Monitoring

    The problem appears to have been caused by temporary high load due to high CPU usage by the system firewall.

    opal4 is stable at this time. We'll continue to monitor.

  • Investigating

    opal4.opalstack.com is experiencing intermittent outages. We're looking into it and will update this item when we have more information.