One backend server down for 10 mins, file storage for 30 mins

We’re going to try to quickly complete the work we failed to finish yesterday. This involves taking down one email server at short notice for about 10 minutes to remove a component, and taking file storage offline for about 30 minutes. Sorry for the inconvenience.

UPDATE: The affected backend email server is now back up and running. All users should be able to access their email.

File storage offline for an hour

We’re going to rearrange the RAID array to try to solve the SCSI problems we’ve been having. To do this, we’ll have to take file storage completely offline for about an hour, starting in about 30 minutes. Sorry for the short notice, but we feel this is better than the intermittent unplanned outages that the problem has been causing for the entire web interface.

Update: NYI took a bit longer to get ready. File storage is just about to go offline now, about 1:30 later than planned (10:40AM Eastern time).

Update: OK, file storage should be back up and running now. Unfortunately, due to some missing hardware at NYI, we weren’t able to complete the full change we wanted to make, so we may need another file storage outage on the weekend.

Outage

A repeat problem with the SCSI bus on one of the servers caused a web outage of about 10 minutes. IMAP, POP and SMTP services should have been unaffected. One good thing: we’ve worked out a way to stop a repeat of the problem from causing a significant web outage in the future while we try to get this issue worked out.

Various outages

Various teething problems with the new SATA array caused a series of outages.

No data was lost. However, websites were offline and email services were slowed considerably (sometimes to the point of timeouts or failure to load).

We’re hoping that we’ve solved all the problems now, but we will be watching very closely over the next week to make sure that everything keeps working reliably.

More technical details and discussion in this forum thread.

Short outage on file storage

File storage was unavailable for about 15 minutes due to an emergency reboot that was required on one of the servers. It should be working fine again now.

File storage going read-only for a couple of hours

We’re installing a new array for file storage in anticipation of the usage increase that will come with DAV and FTP access (and because the current one is over 90% full and growing quickly!).

To avoid corruption as we transfer data between devices, we’re switching all file storage to read-only mode. This means you can still access your files (and your webpages will still work!) but you won’t be able to add, rename or delete files.

We anticipate this will take 2-3 hours.

Bron.

Update: everything’s back in read/write mode again, and the new disk is installed!

Backend server restored

The backend server (server2) has been rebooted and services have been restored.

Immediate reboot of one backend server (server2)

One of our backend servers requires an immediate reboot to resolve a locking problem. This should only take a few minutes and affect only some users.

Services restored

All services should now be restored.

Servers down

Servers appear to be down. Investigating.
