Essentials of a smooth-running network; Application-layer monitoring needs to complement that of device monitoring - Special focus: enterprise network testing
Communications News, July, 2002 by Phil Hollows
A comprehensive service-level management (SLM) approach--integrating systems, people and processes--can mean better application and service delivery, measurable progress and heightened competitive advantage. SLM is a complete rethinking of how performance information can be used.
Identifying, attributing, documenting, submitting, tracking and--ultimately--resolving service-delivery problems can involve significant manual efforts by IT staffers, even when elements of each step are automated by in-house systems. Waiting for the help desk to collapse under a tidal wave of incoming calls and tickets is a poor way to monitor system performance. Most companies have some form of network and systems-management solution monitoring their network infrastructure. Why, then, do major outages still take place, and why have response times failed to improve significantly?
Most initial monitoring implementations are network oriented: they monitor individual devices and listen for traps, but users--not devices--interact with applications. Unless the role of the device is well known, a trap or monitoring alert does not tell operators which users are going to be affected.
Application-layer monitoring needs to be implemented to allow staffers to recognize the impact of device problems by application, enabling effective triage and prioritization of remedial activities. Apps should be treated as white boxes. Layer 7 tests should extend behind the user-facing servers to other server components in the stack. When implementing SLM, do not forget to check application, database and directory servers.
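One way to picture the device-to-application mapping described above is a simple lookup table. The sketch below is illustrative only: the application names, device names and the `affected_apps` helper are all hypothetical, and a real deployment would draw this data from its own inventory.

```python
# Hypothetical dependency map: the devices and servers each application relies on.
# Note that it reaches behind the user-facing servers to database and directory servers.
APP_DEPENDENCIES = {
    "order-entry": {"edge-router-1", "web-01", "db-01", "ldap-01"},
    "e-mail": {"edge-router-1", "mail-01", "ldap-01"},
    "intranet": {"web-02", "db-01"},
}


def affected_apps(device, dependencies=APP_DEPENDENCIES):
    """Return the sorted names of applications that depend on the given device."""
    return sorted(app for app, devices in dependencies.items() if device in devices)


if __name__ == "__main__":
    # A trap from the directory server can now be triaged by application impact.
    print(affected_apps("ldap-01"))
```

With a table like this, an alert on a single device can be turned into a list of affected applications before remedial work is prioritized.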
On the technical side, make sure the network itself is performing well. Ensure that the network topology is understood well enough to identify critical-path devices and network services, such as edge routers or VPN connectivity. Set up fault-management systems to monitor SNMP traps from these systems. Focusing on mission-critical devices helps ensure IT is working efficiently on the most important issues.
If WAN technologies are being used, traps from the network provider's gear should be listened to. As with all threshold-based warnings, monitor device performance for at least a week to establish the appropriate thresholds.
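As a rough illustration of deriving a threshold from a baseline period, the sketch below sets the alert level at the baseline mean plus a multiple of the standard deviation. The article does not prescribe a formula; the rule, the multiplier `k` and the function name `baseline_threshold` are assumptions made for this example.

```python
import statistics


def baseline_threshold(samples, k=3.0):
    """Return an alert threshold of mean + k standard deviations.

    samples: measurements collected during the baseline period
             (for example, a week of response times in milliseconds).
    k:       assumed multiplier; larger values give fewer, later alerts.
    """
    if len(samples) < 2:
        raise ValueError("need at least two baseline samples")
    return statistics.mean(samples) + k * statistics.stdev(samples)


if __name__ == "__main__":
    week_of_samples = [10, 12, 11, 13, 9]  # illustrative values only
    print(round(baseline_threshold(week_of_samples), 2))
```

A percentile of the baseline data would serve equally well; the point is that the threshold comes from observed behavior rather than a guess.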
Next, proactively measure performance related to end-user transactions. Place active monitoring agents at each core location. Depending on services offered, network quality tests can be set up to consistently measure latency, jitter and packet-loss rates independently of the application load. This is vital for streamed traffic, where network quality is typically the largest variable in perceived video- or voice-stream quality.
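The three network-quality figures mentioned above can be summarized from a series of probe round-trip times. This is a minimal sketch under stated assumptions: the probe results are supplied as a list (with `None` marking a lost probe), jitter is taken as the mean absolute difference between consecutive round-trip times, which is a simplification rather than a standard definition, and the name `summarize_probes` is hypothetical.

```python
import statistics


def summarize_probes(rtts_ms):
    """Summarize latency, jitter and packet loss for one probe run.

    rtts_ms: round-trip times in milliseconds, in send order;
             None marks a probe that received no reply.
    """
    received = [r for r in rtts_ms if r is not None]
    lost = len(rtts_ms) - len(received)
    loss_pct = 100.0 * lost / len(rtts_ms) if rtts_ms else 0.0
    latency = statistics.mean(received) if received else None
    # Simplified jitter: average change between consecutive received replies.
    diffs = [abs(b - a) for a, b in zip(received, received[1:])]
    jitter = statistics.mean(diffs) if diffs else None
    return {"latency_ms": latency, "jitter_ms": jitter, "loss_pct": loss_pct}


if __name__ == "__main__":
    print(summarize_probes([20.0, 22.0, None, 21.0, 25.0]))
```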
Set up monitors to measure the appropriate traffic for each core application. Web and e-mail transactions are obvious candidates, but subtler interactions should be tested as well. If a remote office does not have a local DNS server, for example, DNS look-up availability should be checked over the network with active tests.
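An active DNS look-up test can be quite small. The sketch below, run from a monitoring agent at the remote office, times a look-up through the host's configured resolver using Python's standard `socket.getaddrinfo`. The function name `check_dns`, the `max_seconds` limit and the host name in the usage line are assumptions for illustration; the test does not enforce a hard timeout, it only reports a slow answer as a failure.

```python
import socket
import time


def check_dns(hostname, resolver=socket.getaddrinfo, max_seconds=2.0):
    """Resolve hostname and report whether it succeeded within max_seconds.

    resolver is injectable so the check can be exercised without a network.
    """
    start = time.monotonic()
    try:
        resolver(hostname, None)
    except OSError:  # socket.gaierror is a subclass of OSError
        return {"ok": False, "elapsed_s": time.monotonic() - start}
    elapsed = time.monotonic() - start
    return {"ok": elapsed <= max_seconds, "elapsed_s": elapsed}


if __name__ == "__main__":
    # "intranet.example.com" is a placeholder; use a name the office depends on.
    print(check_dns("intranet.example.com"))
```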
Monitoring rates will vary depending on needs. Slower rates (such as once every 15 minutes) reduce artificial network traffic but may introduce unacceptable notification delays when performance problems occur. Active transactions every few minutes are acceptable on all but the slowest networks, simply because the amount of traffic such a transaction generates is typically immaterial compared to user-generated application data flows on the network.
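The claim that test traffic is immaterial can be checked with simple arithmetic. In the sketch below the transaction size, test interval and link speed are illustrative assumptions, not figures from any measured network, and both function names are hypothetical.

```python
def monitoring_load_bps(bytes_per_test, interval_seconds):
    """Average bandwidth, in bits per second, used by one recurring test."""
    return bytes_per_test * 8 / interval_seconds


def link_share_pct(bytes_per_test, interval_seconds, link_bps):
    """Percentage of a link's capacity consumed by one recurring test."""
    return 100.0 * monitoring_load_bps(bytes_per_test, interval_seconds) / link_bps


if __name__ == "__main__":
    # Assumed figures: a 50,000-byte transaction every 5 minutes over a 1.544 Mbps link.
    print(round(monitoring_load_bps(50_000, 300), 1), "bps")
    print(round(link_share_pct(50_000, 300, 1_544_000), 3), "% of the link")
```

Under these assumed figures the test averages about 1,333 bits per second, well under one-tenth of 1% of the link.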
Another consideration is transients. Sample repeatedly at each scheduled test time, and determine how many iterations must fail before an alert is sent. This reduces the chance that a transient issue, such as a burst of traffic from a nearby system, triggers a false alarm, without unduly delaying notification of genuine issues.
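Repeated sampling of this kind might look like the sketch below, which reports a failure only when every attempt at a scheduled test time fails. The name `failed_after_retries`, the default of three attempts and the probe callable are assumptions for illustration; the right number of iterations depends on the network being monitored.

```python
def failed_after_retries(probe, attempts=3):
    """Run probe up to `attempts` times; return True only if every attempt fails.

    probe: a callable returning True on success and False on failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for _ in range(attempts):
        if probe():
            return False  # one success: treat any earlier failures as transient
    return True


if __name__ == "__main__":
    outcomes = iter([False, True])  # first sample fails, second succeeds
    print(failed_after_retries(lambda: next(outcomes)))
```

Raising `attempts` suppresses more transients but lengthens the time before a genuine failure is reported.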
Finally, setting up dependencies between monitors can reduce monitoring overhead while yielding important contextual data about a problem. For example, if an ICMP ping test cannot reach a host, all application monitors against that server can be disabled.
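A monitor dependency of this kind can be sketched as a reachability check that gates the application checks. The function `run_with_dependency`, the check names and the stand-in callables are hypothetical; a real system would plug in its own ping and application tests.

```python
def run_with_dependency(ping, app_checks):
    """Run application checks only if the host answers the reachability test.

    ping:       callable returning True if the host is reachable.
    app_checks: dict mapping a check name to a callable returning True on success.
    Returns a dict of name -> "ok", "failed" or "skipped".
    """
    if not ping():
        # Host unreachable: skip the application checks and record why.
        results = {"ping": "failed"}
        results.update({name: "skipped" for name in app_checks})
        return results
    results = {"ping": "ok"}
    for name, check in app_checks.items():
        results[name] = "ok" if check() else "failed"
    return results


if __name__ == "__main__":
    checks = {"http": lambda: True, "smtp": lambda: True}
    print(run_with_dependency(lambda: False, checks))
```

Marking dependent checks as skipped rather than failed tells staff that the root problem is reachability, not the applications themselves.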
Building such an approach provides speedy guidance to staff when problems do arise. Adopting a complete SLM approach addresses not just systematic data collection and early warning systems, but the processes these systems are ultimately supporting.
For more information from Response Networks: www.rsleads.com/207cn-260
Hollows is with Response Networks, North Andover, MA.


