Active Queue Management F. Baker
Internet-Draft R. Pan
Intended status: Informational Cisco Systems
Expires: May 4, 2016 November 1, 2015
On Queuing, Marking, and Dropping
draft-ietf-aqm-fq-implementation-05
Abstract
This note discusses queuing and marking/dropping algorithms. While
these algorithms may be implemented in a coupled manner, this note
argues that specifications, measurements, and comparisons should
decouple the different algorithms and their contributions to system
behavior.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 4, 2016.
Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Baker & Pan Expires May 4, 2016 [Page 1]
Internet-Draft November 2015
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Fair Queuing: Algorithms and History . . . . . . . . . . . . 3
2.1. Generalized Processor Sharing . . . . . . . . . . . . . . 3
2.1.1. GPS Comparisons: transmission quanta . . . . . . . . 4
2.1.2. GPS Comparisons: flow definition . . . . . . . . . . 4
2.1.3. GPS Comparisons: unit of measurement . . . . . . . . 5
2.2. GPS Approximations . . . . . . . . . . . . . . . . . . . 5
2.2.1. Definition of a queuing algorithm . . . . . . . . . . 5
2.2.2. Round Robin Models . . . . . . . . . . . . . . . . . 6
2.2.3. Calendar Queue Models . . . . . . . . . . . . . . . . 7
2.2.4. Work Conserving Models and Stochastic Fairness
Queuing . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.5. Non Work Conserving Models and Virtual Clock . . . . 9
3. Queuing, Marking, and Dropping . . . . . . . . . . . . . . . 10
3.1. Queuing with Tail Mark/Drop . . . . . . . . . . . . . . . 10
3.2. Queuing with CoDel Mark/Drop . . . . . . . . . . . . . . 11
3.3. Queuing with RED or PIE Mark/Drop . . . . . . . . . . . . 11
4. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . 12
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12
6. Security Considerations . . . . . . . . . . . . . . . . . . . 12
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 13
8. References . . . . . . . . . . . . . . . . . . . . . . . . . 13
8.1. Normative References . . . . . . . . . . . . . . . . . . 13
8.2. Informative References . . . . . . . . . . . . . . . . . 13
Appendix A. Change Log . . . . . . . . . . . . . . . . . . . . . 15
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 15
1. Introduction
In the discussion of Active Queue Management, there has been
discussion of the coupling of queue management algorithms such as
Stochastic Fairness Queuing [SFQ], Virtual Clock [VirtualClock], or
Deficit Round Robin [DRR] with mark/drop algorithms such as CoDel
[I-D.ietf-aqm-codel] or PIE [I-D.ietf-aqm-pie]. In the interest of
clarifying the discussion, we document possible implementation
approaches to that, and analyze the possible effects and side-
effects. The language and model derive from the Architecture for
Differentiated Services [RFC2475].
This note is informational, intended to describe reasonable
possibilities without constraining outcomes. This is not so much
about "right" or "wrong" as it is "what might be reasonable", and
discusses several possible implementation strategies. Also, while
queuing might be implemented in almost any layer, specifically the
note addresses queues that might be used in the Differentiated
Services Architecture, and are therefore at or below the IP layer.
Baker & Pan Expires May 4, 2016 [Page 2]
Internet-Draft November 2015
2. Fair Queuing: Algorithms and History
There is extensive history in the set of algorithms collectively
referred to as "Fair Queuing". The model was initially discussed in
[RFC0970], which proposed it hypothetically as a solution to the TCP
Silly Window Syndrome issue in BSD 4.1. The problem was that, due to
a TCP implementation bug, some senders would settle into sending a
long stream of very short segments, which unnecessarily consumed
bandwidth on TCP and IP headers and occupied short packet buffers,
thereby disrupting competing sessions. Nagle suggested that if
packet streams were sorted by their source address and the sources
treated in a round robin fashion, a sender's effect on end-to-end
latency and increased loss rate would primarily affect only itself.
This touched off perhaps a decade of work by various researchers on
what was and is termed "Fair Queuing," philosophical discussions of
the meaning of the word "fair," operational reasons that one might
want a "weighted" or "predictably unfair" queuing algorithm, and so
on.
2.1. Generalized Processor Sharing
Conceptually, any Fair Queuing algorithm attempts to implement some
approximation to the Generalized Processor Sharing [GPS] model.
The GPS model, in its essence, presumes that a set of identified data
streams, called "flows", pass through an interface. Each flow has a
rate when measured over a period of time; A voice session might, for
example, require 64 kbps plus whatever overhead is necessary to
deliver it, and a TCP session might have variable throughput
depending on where it is in its evolution. The premise of
Generalized Processor Sharing is that on all time scales, the flow
occupies a predictable bit rate, so that if there is enough bandwidth
for the flow in the long term, it also lacks nothing in the short
term. "All time scales" is obviously untenable in a packet network -
and even in a traditional TDM circuit switch network - because a
timescale shorter than the duration of a packet will only see one
packet at a time. But it provides an ideal for other models to be
compared against.
There are a number of attributes of approximations to the GPS model
that bear operational consideration, including at least the
transmission quanta, the definition of a "flow", and the unit of
measurement. Implementation approaches have different practical
impacts as well.
Baker & Pan Expires May 4, 2016 [Page 3]
Internet-Draft November 2015
2.1.1. GPS Comparisons: transmission quanta
The most obvious comparison between the GPS model and common
approximations to it is that real world data is not delivered
uniformly, but in some quantum. The smallest quantum, in a packet
network, is a packet. But quanta can be larger; for example, in
video applications it is common to describe data flow in frames per
second, where a frame describes a picture on a screen or the changes
made from a previous one. A single video frame is commonly on the
order of tens of packets. If a codec is delivering thirty frames per
second, it is conceivable that the packets comprising a frame might
be sent as thirty bursts per second, with each burst sent at the
interface rate of the camera or other sender. Similarly, TCP
exchanges have an initial window, common values of which include 1,
2, 3, 4 [RFC3390], and 10 [RFC6928], and there are also reports of
bursts of 64 kB at the relevant MSS, which is to say about 45 packets
in one burst, presumably coming from TCP Segment Offload (TSO, also
called TOE) engines (at least one implementation is known to be able
to send a burst of 256 kB). After that initial burst, TCP senders
commonly send pairs of packets, but may send either smaller or larger
bursts [RFC5690].
2.1.2. GPS Comparisons: flow definition
An important engineering trade-off relevant to GPS is the definition
of a "flow". A flow is, by definition, a defined data stream.
Common definitions include:
o Packets in a single transport layer session ("microflow"),
identified by a five-tuple [RFC2990],
o Packets between a single pair of addresses, identified by a source
and destination address or prefix,
o Packets from a single source address or prefix [RFC0970],
o Packets to a single destination address or prefix,
o Packets to or from a single subscriber, customer, or peer
[RFC6057]. In Service Provider operations, this might be a
neighboring Autonomous System; in broadband, a residential
customer.
The difference should be apparent. Consider a comparison between
sorting by source address or destination address, to pick two
examples, in the case that a given router interface has N application
sessions going through it between N/2 local destinations and N remote
sources. Sorting by source, or in this case by source/destination
Baker & Pan Expires May 4, 2016 [Page 4]
Internet-Draft November 2015
pair, would give each remote peer an upper bound guarantee of 1/N of
the available capacity, which might be distributed very unevenly
among the local destinations. Sorting by destination would give each
local destination an upper bound guarantee of 2/N of the available
capacity, which might be distributed very unevenly among the remote
systems and correlated sessions. Who is one fair to? In both cases,
they deliver equal service by their definition, but that might not be
someone else's definition.
Flow fairness, and the implications of TCP's congestion avoidance
algorithms, is discussed extensively in [NoFair].
2.1.3. GPS Comparisons: unit of measurement
And finally, there is the question of what is measured for rate. If
the only objective is to force packet streams to not dominate each
other, it is sufficient to count packets. However, if the issue is
the bit rate of an SLA, one must consider the sizes of the packets
(the aggregate throughput of a flow, measured in bits or bytes). And
if predictable unfairness is a consideration, the value must be
weighted accordingly.
[RFC7141] discusses measurement.
2.2. GPS Approximations
Carrying the matter further, a queuing algorithm may also be termed
"Work Conserving" or "Non Work Conserving". A queue in a "work
conserving" algorithm, by definition, is either empty, in which case
no attempt is being made to dequeue data from it, or contains
something, in which case the algorithm continuously tries to empty
the queue. A work conserving queue that contains queued data, at an
interface with a given rate, will deliver data at that rate until it
empties. A non-work-conserving queue might stop delivering even
though it still contains data. A common reason for doing this is to
impose an artificial upper bound on a class of traffic that is lower
than the rate of the underlying physical interface.