This is the blog of Ronald Bartels, which wanders on and off the subject of problem management (that is how it started), but it is best described by Ray, who says these are Daddy's thoughts! For example, the best music is from the Eighties and a wee dram helps in solving most inconveniences.
Ganglia Monitoring System
Ganglia is a scalable distributed monitoring system for
high-performance computing systems such as clusters and Grids. It is
based on a hierarchical design targeted at federations of clusters. It
leverages widely used technologies such as XML for data representation,
XDR for compact, portable data transport, and RRDtool for data storage
and visualization. It uses carefully engineered data structures and
algorithms to achieve very low per-node overheads and high concurrency.
The implementation is robust, has been ported to an extensive set of
operating systems and processor architectures, and is currently in use
on thousands of clusters around the world. It has been used to link
clusters across university campuses and around the world and can scale
to handle clusters with 2000 nodes.
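
To make the "XML for data representation" point a little more concrete, here is a minimal Python sketch that reads a Ganglia-style XML dump and groups metric values by host. The sample document is hypothetical: the cluster, host names and values are invented, and the element and attribute names (CLUSTER, HOST, METRIC, NAME, VAL) are assumed to follow the layout a gmond daemon usually reports, so check them against your own installation. On a real cluster the same text would typically be read from gmond's TCP port (8649 by default, as far as I know) instead of a string.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, shaped like the XML a gmond daemon reports.
# Cluster name, host names and values are made up for illustration.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
  <CLUSTER NAME="demo-cluster" OWNER="unspecified">
    <HOST NAME="node01" IP="10.0.0.1">
      <METRIC NAME="load_one" VAL="0.42" TYPE="float" UNITS=""/>
      <METRIC NAME="cpu_num" VAL="8" TYPE="uint16" UNITS="CPUs"/>
    </HOST>
    <HOST NAME="node02" IP="10.0.0.2">
      <METRIC NAME="load_one" VAL="1.30" TYPE="float" UNITS=""/>
      <METRIC NAME="cpu_num" VAL="16" TYPE="uint16" UNITS="CPUs"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""


def metrics_by_host(xml_text):
    """Return {cluster: {host: {metric: value_string}}} from a Ganglia-style XML dump."""
    root = ET.fromstring(xml_text)
    result = {}
    for cluster in root.iter("CLUSTER"):
        hosts = result.setdefault(cluster.get("NAME"), {})
        for host in cluster.iter("HOST"):
            # Values are kept as strings; the TYPE attribute says how to convert them.
            hosts[host.get("NAME")] = {
                m.get("NAME"): m.get("VAL") for m in host.iter("METRIC")
            }
    return result


if __name__ == "__main__":
    for cluster, hosts in metrics_by_host(SAMPLE_XML).items():
        for host, metrics in sorted(hosts.items()):
            print(cluster, host, "load_one =", metrics["load_one"])
```

The nesting of hosts inside clusters mirrors the hierarchical design mentioned above: a federation is simply more CLUSTER elements in the same tree.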
Find out more about Ganglia on its SourceForge project page.