Event management of large distributed system and network management environments

Publication Type:
Thesis
Issue Date:
2006
Co-ordinated event management across system, network and application environments is a challenging task. The wide diversity of industry and commercial standards, differing business and technical requirements and a huge variety of environments mean there are no simple solutions. This thesis proposes a highly scalable, flexible and resilient event management architecture that has been applied to the outsourcing activities of HP Services worldwide. Our solution is based on industry standards such as SNMP and on commercial products. It provides a framework for all aspects of event management, including event detection, logging, notification, and correlation. It was initially applied and refined in an IT outsourcing environment, then further developed in larger outsourcing environments. It was developed using a standard solution architecture methodology (known as ITSA) that enabled the partly developed architectures to be continually refined, improved and deployed. The technology aspects of the solution work closely with ITIL event management processes. To achieve a unified event display and a standardised event message format, all events from all sources are reduced to a standard format that includes the “raw” event information plus business intelligence, called the business string, added to the event for display and routing purposes. This extra information identifies the nature of the event and allows filtered displays of events. It is extracted from configuration management extensions added to the standard event management tools. The extended format is flexible enough to handle the different commercial tools. The first generation of the solution was based on Computer Associates’ Unicenter TNG and was called the Event Monitoring Utility (EMU). This was later extended significantly by switching to HP OpenView and by developing further central event management functions, especially event correlation, in a solution called DECADE.
Significant agent extensions were achieved by the creation and deployment of a solution called SMSPI, which included an extended configuration management and policy database, and further event automation. The extended solution is now deployed across HP Services’ entire global outsourced environment. The solution has proven very successful, winning two Computer Associates Software Achievement Awards, including the Grand Prize, and generating two US patents. It will be progressively deployed to several million servers and network devices globally over the next few years. The work described here is at once self-contained and a basis for ongoing development of event management in the face of ever more complex systems and increasing demands for more detailed event management.
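
The standard event format described in the abstract (the "raw" event plus a business string taken from configuration management data, used for filtered displays) might be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the names `StandardEvent`, `CONFIG_DB`, `normalise` and `filter_events`, the host names, and the colon-separated layout of the business string are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical configuration-management lookup: source host -> business string.
# The "customer:service:class" layout is an assumption made for illustration.
CONFIG_DB = {
    "web01": "acme:web-shop:critical",
    "db07": "acme:billing:major",
}


@dataclass(frozen=True)
class StandardEvent:
    source: str           # host or device that raised the event
    raw: str              # the unmodified "raw" event information
    business_string: str  # business context added for display and routing


def normalise(source: str, raw: str) -> StandardEvent:
    """Reduce an event from any source to the common format."""
    business = CONFIG_DB.get(source, "unknown:unknown:unknown")
    return StandardEvent(source, raw, business)


def filter_events(events, prefix: str):
    """Return the events whose business string starts with the given prefix."""
    return [e for e in events if e.business_string.startswith(prefix)]


events = [
    normalise("web01", "SNMP trap: linkDown ifIndex=2"),
    normalise("db07", "syslog: disk /var 95% full"),
    normalise("lab99", "syslog: test message"),
]
billing = filter_events(events, "acme:billing")
```

Keeping the raw text untouched and carrying the business context in a separate field means a display or routing rule can match on the business string alone, whatever tool produced the original event.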