Increasing network performance using molecular sequence reduction technology - Internet

Computer Technology Review, Feb, 2002 by Amit P. Singh

This article is the first in a two-part series.

How fit is your network? Enterprise wide area networks (WANs) are a mission-critical resource and must be tuned for optimum performance. Network managers are often asked to evaluate the performance of their enterprise network. The network "health" metrics they most commonly use for this purpose are link utilization, round-trip delay, and packet loss.

Unfortunately, these traditional measures do not capture the inherent redundancy of the data being transported over today's networks. During my tenure at Stanford University and as the co-founder of Peribit Networks, my research team and I discovered that most networks contain numerous repetitive data patterns that span virtually all applications and user sessions, severely degrading network performance. This means that both transmission links and routers continuously process vast amounts of redundant data. In fact, measured results from over 100 networks show average repetition rates of 60 to 90 percent. Therefore, most WANs are not running anywhere near their true potential.
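One rough way to picture a "repetition rate" is to ask what fraction of a byte stream has already appeared earlier in that stream. The sketch below is only an illustration of that idea, not the measurement method behind the figures above and not Peribit's algorithm; the function name `repetition_rate` and the 8-byte `window` are assumptions chosen for the example.

```python
def repetition_rate(data: bytes, window: int = 8) -> float:
    """Return the fraction of sliding windows already seen earlier in the stream.

    Hypothetical helper for illustration only; the window size is an assumption.
    """
    if len(data) < window:
        return 0.0  # too short to contain even one full window

    seen = set()       # window contents observed so far
    repeated = 0       # windows whose content appeared earlier
    total = len(data) - window + 1

    for i in range(total):
        chunk = data[i:i + window]
        if chunk in seen:
            repeated += 1
        else:
            seen.add(chunk)

    return repeated / total


if __name__ == "__main__":
    sample = b"company confidential " * 20
    print(f"repetition rate: {repetition_rate(sample):.2f}")
```

A stream with no repeated windows scores 0.0, while a stream made of the same short pattern over and over approaches 1.0.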

Some network managers know they have a capacity problem because of congestion, packet loss, and end user complaints. Other network managers may find that their WAN links are within performance targets. However, even these WAN links likely transport repetitive data and therefore could be downsized to cut expenses while still delivering the same (or higher) network performance.
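The downsizing argument can be made concrete with a little arithmetic. The sketch below assumes an idealized case in which every repeated byte could be removed from the link at no cost, so it gives an upper bound rather than a prediction; the function names and the 1,544 kbps example link are assumptions for illustration.

```python
def required_bandwidth(link_kbps: float, repetition_rate: float) -> float:
    """Bandwidth still needed if all repeated data were removed (idealized)."""
    if not 0.0 <= repetition_rate < 1.0:
        raise ValueError("repetition_rate must be in [0, 1)")
    return link_kbps * (1.0 - repetition_rate)


def capacity_multiplier(repetition_rate: float) -> float:
    """How much more traffic the same link could carry under the same assumption."""
    if not 0.0 <= repetition_rate < 1.0:
        raise ValueError("repetition_rate must be in [0, 1)")
    return 1.0 / (1.0 - repetition_rate)


if __name__ == "__main__":
    # Hypothetical T1-class link, evaluated at both ends of the 60-90 percent range.
    for rate in (0.60, 0.90):
        print(
            f"rate={rate:.0%}: "
            f"needs {required_bandwidth(1544, rate):.0f} kbps, "
            f"or carries {capacity_multiplier(rate):.1f}x the traffic"
        )
```

Real savings would be lower than this bound, since any reduction scheme adds its own overhead and cannot catch every repetition.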

Let's compare network performance to health and fitness. You may believe you are healthy because you are not ill. However, this does not mean that you are fit and in optimal physical condition. By analogy, your network may have acceptable delays and appear to be healthy. However, the network could in fact be far from its maximum potential due to the many repetitions that are wasting network resources.

What causes repetitive network traffic? There are three common sources of repetitive data traversing corporate networks:

* Business process flows

* Application overhead

* Commonly used strings, phrases, or objects

Business Process Flows

Common business practices and workflows generate huge amounts of repetitive data in networks. Employees frequently copy or forward email messages with attachments, resulting in multiple repeated transmissions of the same or similar data. Organizations typically maintain centralized databases and servers that are frequently accessed by employees. Many queries to these databases retrieve the same information, e.g., when salespeople pull up contact, account, or status update information. Finally, frequently used information such as HR benefits details is often posted to an internal Web site to disseminate it efficiently to all employees. These files are frequently downloaded, causing repeated transmission of the same data within the enterprise.
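As a simple illustration of how forwarded attachments and repeated downloads add up, the sketch below hashes each transferred payload and counts the bytes that belong to payloads already sent once. It only detects exact duplicates, so it would miss the "similar" data mentioned above; the function name and the sample payloads are hypothetical.

```python
import hashlib


def duplicate_bytes(payloads):
    """Return (duplicate_bytes, total_bytes) for a sequence of byte payloads.

    A payload counts as a duplicate if an identical one was seen earlier.
    """
    seen = set()
    total = 0
    duplicate = 0
    for payload in payloads:
        total += len(payload)
        digest = hashlib.sha256(payload).digest()
        if digest in seen:
            duplicate += len(payload)  # the whole payload was sent before
        else:
            seen.add(digest)
    return duplicate, total


if __name__ == "__main__":
    # Hypothetical traffic: one attachment forwarded twice, plus one unique file.
    attachment = b"benefits-handbook" * 1000
    unique_file = b"status-update" * 500
    dup, total = duplicate_bytes([attachment, unique_file, attachment, attachment])
    print(f"{dup} of {total} bytes were exact repeats")
```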

Application Overhead

Distributed applications are designed to be easy to use and must guarantee the reliability and consistency of data. Thus, enterprise applications are often very "chatty" and frequently communicate with distributed end points to ensure that they are correctly synchronized and that data consistency is maintained. These update messages, along with full database replications, are typical of virtually all enterprise applications and generate a high degree of repetition across the WAN. In addition to the internal application traffic, user communications with distributed applications are also very redundant. All requests to an application server or database must follow a fixed format and protocol, and the response from the application must also be in a fixed, recognizable format. These application protocols are designed to be easy to use, fault-tolerant, portable, and very extensible, and thus often generate significant communication overhead that is highly redundant.

Commonly Used Strings, Phrases, or Objects

In English, the words "the," "and," "to," and "you" occur very often in normal conversation. Likewise, many common phrases contain these words, e.g., "Talk to you soon." Furthermore, inside companies and other communities there are even more common phrases that are used over and over again, like "quarterly financial results," "project management status update," and "company confidential." These commonly used patterns can range in size from a few words (such as the previous examples) to large paragraphs (e.g., a common company disclaimer or backgrounder).
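To see how quickly such phrases recur, one can count word sequences that appear more than once across a set of messages. The sketch below is a minimal illustration under simple assumptions (whitespace tokenization, lowercase matching, three-word phrases); the function name and sample messages are made up for the example.

```python
from collections import Counter


def repeated_phrases(messages, n=3):
    """Return {phrase: count} for n-word phrases that occur more than once."""
    counts = Counter()
    for message in messages:
        words = message.lower().split()
        # Slide an n-word window across the message.
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return {phrase: count for phrase, count in counts.items() if count > 1}


if __name__ == "__main__":
    sample_messages = [
        "Quarterly financial results attached",
        "See the quarterly financial results",
        "Talk to you soon",
    ]
    for phrase, count in repeated_phrases(sample_messages).items():
        print(f"{count}x {phrase}")
```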

In addition to the repetitions that appear in text, there is often a much greater degree of redundancy due to the many common objects that are accessed and transferred throughout an organization. These repeated objects could range from common images that are embedded in various documents to tables and slides that are used in multiple presentations. Users typically generate data through an "evolutionary" process where previous instances of the file or object are gradually modified and combined to create new versions. Hence, the combined data generated by all the applications and users on a network can be highly repetitive.


 

Content provided in partnership with Thomson Gale