System outage

Status (by robm at Sun May 31 15:44 UTC)
One of our primary database servers has died, causing an outage for all services. We’re failing over to our replica to restore all services.

Short web + IMAP outage

Status (by robm at Wed May 20 07:41 UTC)
An IMAP server froze up in an odd way that caused all the web processes accessing that server to get stuck, so we ran out of web processes. This lasted about 5-10 minutes.

We’ve failed people off the affected IMAP server, and restored the web servers. Everything should be functioning normally again.

One slot of one IMAP server was down for 3-4 hours

Status (by robm at Mon May 18 19:51 UTC)
One storage slot on one of our IMAP servers got itself into a state where it would allow logins, but then "freeze". Unfortunately our regular 2 minute checks weren’t picking this up, meaning we didn’t know about the problem for quite some time. This means that a few hundred users couldn’t log in to their accounts.

The affected slot has been fixed, and we’ll be updating our 2 minute checks to make sure this problem is detected promptly in the future.
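To illustrate the kind of check that would catch this failure mode, here is a minimal sketch, not our actual monitoring code. It assumes a dedicated test account and uses Python's standard imaplib; the function name deep_imap_check and its parameters are made up for this example. The idea is to log in and then do some real mailbox work under a timeout, so a slot that accepts logins but then freezes is reported as failing.

```python
import imaplib


def deep_imap_check(host, user, password, timeout=10.0, connect=imaplib.IMAP4_SSL):
    """Return (ok, detail) for one IMAP endpoint.

    Hypothetical helper: a successful login alone is not treated as healthy.
    The check also selects a mailbox and sends NOOP, all bounded by `timeout`.
    `connect` is injectable so the check can be exercised without a server.
    """
    conn = None
    try:
        # The timeout applies to the connection and to every later read,
        # so a server that stops responding raises instead of hanging.
        conn = connect(host, timeout=timeout)
        conn.login(user, password)

        # Do real work after login; this is the step a frozen slot fails.
        status, _ = conn.select("INBOX", readonly=True)
        if status != "OK":
            return False, "SELECT returned " + status

        status, _ = conn.noop()
        if status != "OK":
            return False, "NOOP returned " + status

        return True, "ok"
    except (imaplib.IMAP4.error, OSError) as exc:
        # Protocol errors, refused connections and timeouts all land here.
        return False, type(exc).__name__ + ": " + str(exc)
    finally:
        if conn is not None:
            try:
                # Close the socket directly; a polite LOGOUT could block
                # again on a frozen server.
                conn.shutdown()
            except Exception:
                pass
```

A monitor would call this every couple of minutes per slot and alert when ok is False. The timeout value and the choice of SELECT plus NOOP as the "real work" are assumptions for the sketch.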

One IMAP server down

Status (by robm at Sun May 17 09:47 UTC)
One of the IMAP servers is down, affecting email access for some users. We’re currently investigating.

Update (by robm at Sun May 17 10:47 UTC)
We’re having trouble getting the IMAP server back up, so we’ve failed over all services to replica servers. All services should be restored again.

Message read screen display problems

Status (by robm at Fri May 15 01:55 UTC)
For the last 12 hours or so, there may have been some display problems on the message read screen for users.

This would have resulted in a lot of buttons displaying in the action bar region at the top and bottom of each page (e.g. multiple Delete buttons, multiple action select menus, etc.).

This should now be fixed.

One of our servers is down

Status (by brong at Wed May 6 09:14 UTC)
One of our servers has just gone offline. Am looking into the cause now.

Update (by brong at Wed May 6 09:23 UTC)
Have switched all users to replicas. Everything should be working again now. Now to figure out what’s gone wrong!

Web outage

Status (by robm at Tue May 5 15:31 UTC)
We’re experiencing some web server problems. We’re investigating.

Update (by robm at Tue May 5 15:39 UTC)
Ok, we’ve worked out the problem and are fixing it. The web interface should be back in 5 minutes.

Update (by robm at Tue May 5 15:42 UTC)
Ok, web interface is restored.

One IMAP server down

Status (by robm at Fri May 1 16:58 UTC)
One IMAP server is down, affecting some users. We’re investigating.

Update (by robm at Fri May 1 17:20 UTC)
We’re rebooting the IMAP server, but it’s having some problems. If we can’t get it back up, we’ll fail all affected users to the replica in the next 10 minutes.

Update (by robm at Fri May 1 17:52 UTC)
We’ve now failed over all affected users to replica servers. All email should be working again.
