Sunday, July 31, 2011

Initial Test Points for Getting Your Environment Under Control

Starting a job with a running system and real users is a nice “problem” to have, but it presents some unique challenges as well, especially if server monitoring isn’t robust and there are no automated tests. Without these two critical components, you’re both operating and developing completely blind.

Without monitoring, you can’t analyze server changes to see whether you’ve really made things better (or worse). And without testing, every commit you make is a risk to the running site.

Monitoring made easy

Pingdom is perhaps the simplest monitoring tool: anyone with a browser can set it up. Even if you don’t want to (or can’t) spend a penny, they will track one URL on your site for free. Be smart and point this URL at a critical, complex page on your site to verify as many running pieces as possible. Once the URL is entered, Pingdom starts collecting data on the page’s general availability and even its response time (worldwide).
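To make concrete what a single URL check measures, here is a minimal Python sketch of the idea: fetch one page, note whether it answered, and time the response. It is not Pingdom's implementation or API. The function name `check_url`, the `opener` parameter, and the example URL are assumptions made for illustration, and a hosted service adds scheduling, multiple probe locations, and alerting on top of this.

```python
import time
import urllib.error
import urllib.request


def check_url(url, timeout=10.0, opener=urllib.request.urlopen):
    """Fetch url once and return (is_up, status_code, elapsed_seconds).

    `opener` is a hypothetical hook so the check can be exercised
    without network access; it defaults to urllib's urlopen.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            response.read()  # read the whole body so timing covers the full page
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        status = None  # no usable answer: DNS failure, refused connection, timeout
    elapsed = time.monotonic() - start
    is_up = status is not None and 200 <= status < 400
    return is_up, status, elapsed


# Example call (the URL is a placeholder, not one from the article):
#   is_up, status, seconds = check_url("https://example.com/some-critical-page")
```

Pointing a check like this at a page that touches the database, cache, and application server exercises more of the stack than a static page would, which is the reasoning behind choosing a complex URL.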

With the single free URL check from Pingdom, you have zero excuses for flying blind. As outages crop up, add the URLs that demonstrate these failures to Pingdom. Stop being the last guy to find out that the web service is down and start being the one reporting its outage to the team.

Getting SNMP configured correctly is the next step and will allow you to do real low-level monitoring of disks, CPU, network, etc. If you don’t have the time (or know-how) to set up a front end to report on all these data points, think about having an external service provider do it for you. LogicMonitor and Cloudkick are both excellent and reliable monitoring services.
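As a rough illustration of the kind of low-level data point involved, the sketch below reads disk usage on the local machine and classifies it against thresholds. It is a local stand-in that uses Python's standard library, not SNMP itself. The function names and the 80% and 90% thresholds are assumptions chosen for the example, and a real setup would poll such values remotely and chart them over time.

```python
import shutil


def classify_usage(used_fraction, warn=0.80, critical=0.90):
    """Map a used-space fraction (0.0 to 1.0) to a status label.

    The default thresholds are illustrative assumptions, not recommendations.
    """
    if used_fraction >= critical:
        return "critical"
    if used_fraction >= warn:
        return "warning"
    return "ok"


def disk_status(path="/"):
    """Return (used_fraction, label) for the filesystem containing path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    used_fraction = usage.used / usage.total
    return used_fraction, classify_usage(used_fraction)
```

Keeping the threshold logic in its own function means the same classification could be applied to values gathered any other way, including ones reported by an SNMP agent.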

Read more: DevOps zone