Enterprise QoS Survival Guide: 1999 Edition

                           Terry Gray
                    University of Washington


Enterprise Quality of Service (QoS) concerns the transformation of today's best-effort, single-class-of-service internets into systems that can offer one or more levels of preferred service to certain classes of users or applications. The goal of preferred service levels is to increase the probability that critical or demanding applications will receive sufficient bandwidth (and sufficiently low delay) to succeed. Their importance depends on the amount of (instantaneous) congestion in the enterprise network. The term QoS admits many different definitions and a wide spectrum of implementation approaches; choosing the "best" one requires deciding on the relative costs of network capacity and of the machinery to manage it.

There are three kinds of people with respect to network QoS: optimists, pessimists, and fence-sitters. Optimists believe that the cost of bandwidth is, or soon will be, less than the cost of complex mechanisms for managing it. Pessimists believe that bandwidth will always be scarce and therefore complex end-to-end QoS mechanisms will be essential in order for advanced applications to succeed in a shared Internet infrastructure. The fence-sitters want to believe the optimists (and indeed do believe them with respect to campus/enterprise bandwidth) but aren't so sure about wide-area bandwidth (notwithstanding advances such as DWDM). In any case, the fence-sitters figure they had better have a contingency plan in case the pessimists are, at least partially, right.

This paper attempts to identify key issues in enterprise QoS, and then outlines a "fence-sitter" strategy that emphasizes operational simplicity but also tries to hedge bets. Key points include:

  1. The problem focus is providing support for different classes of service in a campus or enterprise network infrastructure. The principal concerns are recurring costs and network reliability.
  2. While attempting to provide a specific level of service for certain applications or users, real-world QoS solutions must also preserve some minimum amount of bandwidth for baseline or best-effort service. (Network managers can die at the hands of the few or the many :)
  3. Different QoS strategies are appropriate for different parts of a network, depending on probabilities of congestion (as well as non-technical issues). Three different "congestion zones" are identified: local subnet, enterprise backbone, and border/wide-area.
  4. Within a particular "congestion zone", the desirability of using admission control or other "heavyweight" QoS mechanisms depends on the answers to several key questions, in particular: a) Is the cost of bandwidth greater or less than the cost of managing it? b) Is the prevailing economic environment such that a revenue stream exists to allow adding capacity? c) If capacity is insufficient, do you prefer to disappoint users via poor session performance, or via denial of service (busy signals)?
  5. One conclusion: IF bandwidth costs more than managing it, AND there is inadequate revenue for adding capacity, AND busy signals are preferable to degraded sessions, THEN admission control is necessary and desirable (but probably not otherwise).
  6. If different portions of a network may have different (or no) packet prioritization mechanisms, what differentiation info should be carried in each packet? As a model for thinking about packet prioritization requirements, consider the following taxonomy: differentiation by a) application type/need, b) user/application desire, and c) user/application privilege. These three criteria can be mapped to distinct sets of bits in a frame (specifically: port number, TOS/DS byte, and 802.1p/Q priority/VLAN bits).
  7. Avoiding the use of heavyweight QoS mechanisms (e.g. per-session authentication/authorization/accounting/reservation and admission control) within the enterprise network is very appealing in order to avoid their impact on recurring operational costs and reliability.
  8. We worry about any scheme that makes an organization's most important strategic infrastructure asset (the ability to forward packets) dependent on authentication and policy servers that are not now needed for per-packet forwarding decisions.
  9. Perhaps the most important network design objective for the future will be to minimize "policy jitter", that is, the rate-of-change of QoS policies over time and their associated costs. There is evidence that the only way to accomplish this goal is to seriously limit the number of available policy options.
  10. In summary, UW's specific network QoS infrastructure goals are reflected in the strategy described below.

Said differently, our strategy is to build a network infrastructure that can support multiple classes of service without introducing the complexity of per-session admission control (via per-user authentication, or per-packet or per-flow lookups or reservations). It should be amenable to several different premium-service policy models, e.g. per-port subscription and/or usage fees, and/or differential queuing based on application need, especially delay-sensitivity. The UW model also allows end-systems to signal campus border routers acting as bandwidth brokers (e.g. via RSVP) if necessary to negotiate for wide-area premium service, and allows "very long term" reservations (i.e. segregated bandwidth) or MPLS virtual circuits among enterprise sites for IP telephony, IP videoconferencing, etc.
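The decision rule in point 5 above can be stated as a simple predicate. The sketch below is illustrative only; the function and parameter names are ours, not the paper's:

```python
def admission_control_warranted(bandwidth_costlier_than_management: bool,
                                revenue_allows_adding_capacity: bool,
                                busy_signals_preferred: bool) -> bool:
    """Point 5 as a predicate: admission control is necessary and
    desirable only when all three conditions hold at once."""
    return (bandwidth_costlier_than_management
            and not revenue_allows_adding_capacity
            and busy_signals_preferred)

# Optimist's campus: bandwidth is cheap, so no admission control.
print(admission_control_warranted(False, True, True))   # False

# Scarce wide-area bandwidth, no upgrade budget, and users who
# prefer busy signals to degraded sessions:
print(admission_control_warranted(True, False, True))   # True
```

Note that failing any one of the three tests argues against admission control, which is why the conclusion ends "but probably not otherwise".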
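To make the three-way taxonomy of point 6 concrete, the sketch below classifies a frame using the three sets of bits named there: transport port (application type/need), the IP TOS/DS byte (user/application desire), and the 802.1p priority bits (user/application privilege). The field names, port list, and queue-selection policy are all assumptions for illustration, not part of the paper:

```python
from dataclasses import dataclass

@dataclass
class Frame:
    """Hypothetical summary of the differentiation bits in one frame."""
    dst_port: int   # application type/need (well-known transport port)
    ds_byte: int    # IP TOS/DS byte: what the sender *wants*
    dot1p: int      # 802.1p priority, 0-7: what the port is *entitled* to

# Illustrative set of delay-sensitive signaling ports (SIP, H.323).
DELAY_SENSITIVE_PORTS = {5060, 1720}

def select_queue(frame: Frame) -> str:
    """Map the three criteria to one of three queues. Which field
    wins is itself a policy choice; this ordering is an example."""
    dscp = frame.ds_byte >> 2           # top six bits of the DS byte
    if frame.dot1p >= 5 and dscp > 0:   # privileged AND marked
        return "premium"
    if frame.dst_port in DELAY_SENSITIVE_PORTS:  # application need alone
        return "low-delay"
    return "best-effort"

print(select_queue(Frame(dst_port=5060, ds_byte=0, dot1p=0)))    # low-delay
print(select_queue(Frame(dst_port=80, ds_byte=184, dot1p=5)))    # premium
print(select_queue(Frame(dst_port=80, ds_byte=0, dot1p=0)))      # best-effort
```

Requiring both the privilege bits (802.1p) and the desire bits (DS byte) before granting premium treatment mirrors the distinction in point 6: what a user wants and what a user is entitled to are carried separately.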

For the complete 50-page paper, see Enterprise QoS Survival Guide: 1999 Edition.