DCSki Server Status?
8 posts
6 users
5k+ views
bousquet19
April 12, 2010
Member since 02/23/2006
789 posts
Scott,

Over the weekend of April 10-11, a message appeared when I tried to log on to DCSki that your server was down and that the site would be inaccessible for a period of time. Yet less than 24 hours later, the site was up and running!

Did you apply a short-term band-aid or a longer-term fix?

Woody
Scott - DCSki Editor
April 12, 2010
Member since 10/10/1999
1,276 posts
Originally Posted By: bousquet19
Scott,

Over the weekend of April 10-11, a message appeared when I tried to log on to DCSki that your server was down and that the site would be inaccessible for a period of time. Yet less than 24 hours later, the site was up and running!

Did you apply a short-term band-aid or a longer-term fix?

Woody


Yeah, there was a bit of excitement this weekend!

Late Friday night, just as I was about to go to bed, the Network Operations Center at the facility that houses my server called to let me know my server had just stopped responding. They sent a tech to the cage and reported the server's lights were off, and that it wouldn't boot.

I was up most of Friday night trying to diagnose the problem and implement a contingency plan, which involved setting up an alternate site (the source of the outage message you saw). The outage message was being delivered within 11 or 12 hours of the initial failure; it can take that long for a domain name change to propagate to a new location.

On Saturday, I drove down to Virginia where the server lives, and brought a spare parts kit I had (thankfully) purchased six years ago when I bought the server (an Apple Xserve). We pulled the server off the rack, and I swapped out its PMU battery and power supply. I hit the power button, and it roared back to life -- phew! Apple has made it very easy to swap out parts in the Xserve -- it took less than a minute to pop out the power supply and put in a new one. (Newer versions of the Xserve come with an option to have two live power supplies, so it switches over to a backup instantly if the primary fails -- I think I'll buy that option in the future if/when I upgrade the server.)

So after running non-stop for over 50,000 hours, the power supply had decided it was time to retire. I was thankful I had the spare, and that it was still working -- it had been stored in a very hot attic all this time. (I was crawling around my attic at 2 a.m. searching for it!) I hope to find another spare I can buy to have on hand in case this happens again, since power supplies are the most likely thing to fail.

It wasn't a fun weekend. It was pretty stressful, because depending on the nature of the failure there was the possibility of losing data, going weeks without primary service, and incurring a large expense to get things back up and running. (As it is, I'm expecting a large bill from the colocation facility for the emergency weekend/after-hours assistance, and I also had to purchase service for the alternate/backup site.) I've always worried about a hardware failure, because obtaining access to the server quickly is challenging; it's located in the equivalent of a digital Fort Knox two hours away in Virginia. Hopefully it's good for another 6 years, though! And it was nice to be able to have live support from the companies I contract with throughout the night and weekend. Even though the failure occurred late on a Friday, the server was 100% back with no data loss within 17 hours.
The Colonel
April 13, 2010
Member since 03/5/2004
3,110 posts
Scott,
You da man!
Thanks,
The Colonel :-)
JimK - DCSki Columnist
April 16, 2010
Member since 01/14/2004
3,013 posts
Originally Posted By: The Colonel
Scott,
You da man!
Thanks,
The Colonel :-)


+1


Had to make sure we got a post on the 16th :-)
Laurel Hill Crazie
April 12, 2011
Member since 08/16/2004
2,053 posts
Thanks, Scott, for your diligence. Life without DCSki would be less fun.
pagamony - DCSki Supporter 
April 12, 2011
Member since 02/23/2005
937 posts
Scott, you ever think about moving over to a hosted service instead of maintaining your own server?
Scott - DCSki Editor
April 17, 2011
Member since 10/10/1999
1,276 posts
This post was from a year ago, but was "revived" by a spammer (who has since been deleted). The server's been humming along great since the power supply failure a year ago. I always keep a spare power supply (and other parts) ready and waiting, but making any repairs requires a long drive to the data center. There's only been the one power supply failure, thankfully.

Currently, DCSki is running on a co-located server in a state-of-the-art data center. The data center is responsible for many aspects of the server (keeping it powered 24/7, cooled, bits flowing in and out, etc.). A dedicated physical server has provided a lot of performance and flexibility, but I will probably move to a dedicated virtual server this summer and give that a try. Hopefully that will maintain the same flexibility at a lower cost without giving up too much performance.
Laurel Hill Crazie
April 17, 2011
Member since 08/16/2004
2,053 posts
Ha, I thought this problem sounded familiar. I should have checked the date of the last post. Perhaps I did, but only the 4/13 registered? (blush)
