So an odd thing happened around the 29th/30th of August that turned our production system upside down for a short time: the GoGrid machine we had been running for a while with no problems suddenly remounted its root file system read-only and stopped accepting incoming ssh connections.
Naturally we tried to resolve the problem through their tech support, but all we could get were uninformative replies like "you must have upgraded your kernel" and "I can't get the machine to get an address through DHCP". Of course we hadn't upgraded the kernel or any such thing. At one point the first tech could connect, but said the kernel panicked during the boot process. Mmm great, I knew I should have backed up our config.
So we went into disaster recovery mode and tried to stand up another GoGrid instance using CentOS 32-bit. No dice: the machine would boot, but we couldn't ssh to it (another one trapped in a kernel panic?). We tried again with a RHEL 5 64-bit instance, and that one we could ssh to. Then a RHEL 4 32-bit instance: it booted, but no ssh. Finally, another RHEL 4 32-bit instance assigned from the bottom of the IP pool, and that one we could ssh to. It was very hit or miss, so it was too risky to proceed.
We ended up moving our Linux/Apache/PHP5 system to a Windows 2008/IIS7/PHP5 machine we had sitting spare (a hot spare of our production system, actually). We configured FastCGI and had things chugging along in about 4 hours.
Losing a production system is a tough problem to deal with. The day was spent sorting out problems and fixing bad data (a read-only file system combined with file-based caching can make some really, really bad data), and was essentially lost. This is the risk you take, and sometimes the price you pay, for hosting on a beta platform.
Too bad. We were planning on moving our development, demo, and test servers to GoGrid because it would be cheaper, minus these sorts of events of course.
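For what it's worth, the bad-data problem mentioned above (file-based caching on a file system that has silently gone read-only) is the kind of thing a small guard can catch. Here is a minimal sketch in Python, not our actual PHP code: the names `cache_writable`, `cache_read`, `cache_write`, and `get` are made up for illustration, and it assumes cache keys are already safe file names. The idea is that if the cache directory can't be written, entries on disk can't be refreshed or invalidated either, so the cache is bypassed entirely rather than trusted.

```python
import os
import tempfile


def cache_writable(cache_dir):
    """Probe the cache directory by actually creating a temporary file."""
    try:
        with tempfile.TemporaryFile(dir=cache_dir):
            return True
    except OSError:
        # Covers a read-only file system, a missing directory, bad permissions.
        return False


def cache_read(cache_dir, key):
    """Return the cached bytes for key, or None if there is no readable entry."""
    try:
        with open(os.path.join(cache_dir, key), "rb") as handle:
            return handle.read()
    except OSError:
        return None


def cache_write(cache_dir, key, data):
    """Write data atomically: temp file first, then rename over the entry."""
    try:
        fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
    except OSError:
        return False
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.replace(tmp_path, os.path.join(cache_dir, key))
        return True
    except OSError:
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        return False


def get(cache_dir, key, compute):
    """Serve from the cache only while the cache directory is writable."""
    if not cache_writable(cache_dir):
        # Entries on disk may be stale and cannot be fixed: skip the cache.
        return compute()
    cached = cache_read(cache_dir, key)
    if cached is not None:
        return cached
    value = compute()
    cache_write(cache_dir, key, value)
    return value
```

The probe costs one extra file create per lookup, which is wasteful on a busy site; a real setup would probably check once every so often and remember the answer. The point is only that a failed write should switch the cache off loudly instead of letting old files keep being served.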