Update - Apr 12th 2018 - 14:30 CST We have fixed the network issue and site performance has returned to normal. We are still working with our hosting partners on a postmortem of the incident. We do not anticipate further disruption but will likely need to perform some site maintenance overnight (approx. 1:00 AM CST).
Incident - Apr 12th 2018 - 13:01 CST We are experiencing network issues that have reduced our frontend server availability, leading to sluggish performance on OAS sites. We are investigating the issue and will post an update when we know more.
We will be performing emergency maintenance on a network gateway, which will result in about 3 minutes of downtime at some point during the maintenance window.
We will be performing database server maintenance. During this window we expect a couple of small interruptions (5-10 second blips) to OAS websites. We will not be taking any additional steps to alert users, since the impact will be so minimal.
Image Tools on OAS will be unavailable between approximately 2017-05-01 21:00 CDT and 2017-05-01 22:00 CDT while maintenance is performed on the network. This means you will be unable to upload images to an auction, edit any images, or save/publish any images. Please plan accordingly.
All sites will be unavailable during this maintenance. It will take at most two hours to complete. The sites will display a 'down for maintenance' page while the updates are being carried out.
The problem was tied to segfaults in libfreepriv3.so caused by some components that use the nss libraries. The issue has been temporarily resolved by downgrading nss. In the long term, either Red Hat will provide a fix or our hosting provider will make adjustments to the OpenStack hypervisor that the server instance runs on.
All OAS sites will be impacted. There will be periods of downtime during the hour. This maintenance updates the controller firmware on servers in our database cluster, in response to the downtime experienced on May 10th. The update is needed to prevent a recurrence of that issue, in which "The Server May Stop Responding and Will Display Power-On Self-Test (POST) Error Messages on Reboot When Running Smart Array Controllers and/or Host Bus Adapters".
An incident with our database hardware caused a site outage. We are taking steps to upgrade the firmware so that this incident does not recur.