This draft describes extended performance statistics for TCP. They are designed to use TCP's ideal vantage point to diagnose performance problems in both the network and the application. If a network based application is performing poorly, TCP can determine if the bottleneck is in the sender, the receiver or the network itself. If the bottleneck is in the network, TCP can provide specific information about its nature.
Please get the most up-to-date TCP-ESTATS-MIB here:
[live draft]
[the already out of date IETF draft]
[live draft with change bars since the IETF draft]
[live draft with change bars since the prior IETF draft]
The SNMP objects defined in this draft should be merged into the existing TCP MIB. However, RFC2012, which describes the current TCP MIB, is already under revision [RFC2012bis] by the ipngwg to support IPv6 addresses. The IPv6 team took it upon themselves to make some other much-needed revisions to other portions of the TCP MIB.
As an interim measure to simplify version control, etc., we are presenting our extensions as a separate document. However, we fully expect to merge into the main TCP MIB document at some future date. To facilitate possible prototype implementations we have duplicated a minimal set of objects from RFC2012bis, such that this MIB is self-contained. These duplicated objects will be removed when the documents are merged.
There is a new complication in the plan to merge these MIBs. The IPv6 MIB team has suddenly developed cold feet with regard to making any updates to MIBs beyond minimally updating the IP addresses. If this becomes the policy, some very good improvements to the TCP MIB by the IPv6 team effectively become orphaned. In principle this work could be preserved and re-merged into the TCP MIB under the auspices of the transport working group (tsvwg); however, we are not currently prepared to do the merge within the web100 project.
We plan to have a clean draft of the TCP-ESTATS-MIB "done" by the Nov 4th IETF ID cutoff.
There is one major difference from the current (updated) TCP MIB. We have created a connection index (tcpEStatsConnectIndex), such that diagnostic tools can more quickly poll on one connection. The statistics (about 135 objects) are divided into 8 smaller tables, each indexed by tcpEStatsConnectIndex. This is to permit finer-grained control of overhead trade-offs. (In the -01 draft the index was called ...IdId, but that has been changed.)
A minimal merge would be to add a ConnectionIndex object to the ipv6mib TcpConnection table and drop in the 8 additional tables from ESTATS. It might even work to add the ConnectionIndex to the main TCP mib, merely tagged "to support future fast connection polling", and leave ESTATS as a separate document.
The web100 project has implemented TCP kernel instrumentation in an approximate implementation of this MIB. The web100 instruments are exported via the Linux proc interface (not SNMP), and differ slightly from this draft. We have a collaborator who is building a real SNMP agent to access the web100 kernel instruments.
Is the TCP-ESTATS-MIB complete?
Did we overlook something that you would like to observe about TCP in the field?
StartTime
We assume a real ToD clock. (embedded/tiny system argument omitted). (I expect to fix the name collision) This is actually a good reason to keep the mibs independent.
counter sizes
We list nine 64-bit counters without HC tags or 32-bit equivalents. In terms of regular computer system implementations these are pretty much a no-brainer (because you already hold an SMP lock on the per-connection protocol structure, so a 32-bit count plus an overflow count is good enough). How much political fire will this draw? (Interestingly enough, we only specified 32 bits for segment counts.)
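The "32-bit count plus overflow count" idea can be sketched as follows. This is only an illustration, not the web100 kernel code: the class name `SplitCounter64` and its methods are hypothetical, and the sketch assumes the caller already holds the per-connection lock, so no extra synchronization is shown.

```python
MASK32 = 0xFFFFFFFF


class SplitCounter64:
    """Hypothetical 64-bit counter stored as two 32-bit halves."""

    def __init__(self):
        self.low = 0        # 32-bit running count
        self.overflows = 0  # number of times `low` has wrapped

    def add(self, n):
        # Assumption: the caller holds the per-connection lock,
        # so the two halves are always updated together.
        total = self.low + n
        self.overflows = (self.overflows + (total >> 32)) & MASK32
        self.low = total & MASK32

    def value(self):
        # Reassemble the 64-bit value from the two halves.
        return (self.overflows << 32) | self.low
```

A reader of the counter only needs the two halves to be sampled under the same lock; the 64-bit value is then reconstructed without any 64-bit arithmetic on the update path.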
timer granularity
We assume microsecond precision for various objects (duration, RTT, etc.); however, today nobody has cheap clocks with this granularity. I would like to have this debated (or better yet, play back the same debate from some other already completed MIB). Can somebody cite a pointer?
octets v bytes
Back when there were many 36-bit computers in the world, the term byte was ambiguously either 6 or 8 bits, ergo the use of octet in the TCP spec. Byte is no longer ambiguous, so most of us have stopped using octet. I will try to find a more official reference on the deprecation of octet. RFC1122 even defines octets as 8-bit bytes.
handling retransmitted data
The historical and revised MIBs [RFC2012 and RFC2012bis] contain a false optimization re: excluding retransmitted data from some instruments such as tcpOutSegs. The problem is that there are many different code paths that cause retransmission (Tahoe, classic fast retransmit, NewReno, SACK, etc.) but the accounting is done much later (after the headers are built). Doing it correctly requires reconstructing why the segment is being sent (not too hard, but still not clean). A far better approach is to take the view that there are two classes of statistics: measures of IP resources consumed (packets and bytes in and out, etc., for all reasons) and measures of application performance (total advance of the ACK fields). Then DataBytesTransmitted-DataBytesAcked(-CurrentWindow) is an exact measure of data bytes retransmitted. I would keep the direct measures of retransmissions (tcpRetransSegs, etc.) but I would scratch all of the "(in/ex)cluding retransmitted segs/octets" wording. This applies to both the summary and per-connection stats.
Although this is a change from past definitions, I believe that there are zero applications that care.
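As a worked example of the retransmission formula above, here is a minimal sketch. The function and parameter names are hypothetical. It also assumes that the parenthesized CurrentWindow term stands for data that has been sent but not yet acknowledged at the moment of sampling, and that it drops out (is zero) once the connection is idle or closed.

```python
def retransmitted_bytes(data_bytes_transmitted, data_bytes_acked,
                        current_window=0):
    """Data bytes retransmitted, derived from two kinds of counters.

    data_bytes_transmitted: all data bytes sent, for any reason
    data_bytes_acked:       total advance of the ACK field
    current_window:         assumed to be bytes still outstanding
                            (0 when nothing is in flight)
    """
    return data_bytes_transmitted - data_bytes_acked - current_window


# Example: 10000 bytes sent in total, 8000 acknowledged,
# 1500 still in flight, so 500 bytes were sent more than once.
print(retransmitted_bytes(10000, 8000, 1500))
```

No per-code-path bookkeeping is needed: the sender only counts everything it transmits and everything that is acknowledged.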
Is the overlap between the instruments in the TcpConnection table [RFC2012bis] and the ESTATS MIB ok? Although many of the instruments have similar definitions, a number of the differences are useful. Specifically, the TcpConnection table can be more easily implemented in a tiny embedded system. The extended statistics require such things as a real Time-of-Day clock, which is easily provided in a workstation or larger system, but may be problematic in smaller environments.
Is the partition into 8 separate tables appropriate?
Since the ESTATS MIB is so large (~135 objects) we partitioned it into 8 smaller tables, each indexed by a "ConnectionIndex". This provides a mechanism to balance resource consumption against statistics detail and provides a fast mechanism for diagnostic tools to poll on a specific connection. However, it has never been clear to me if this partitioning is worth the added complexity and overhead. Is it? We may not know until we have a true standalone implementation.
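The partitioning can be pictured with a small sketch. It is not the MIB definition or an SNMP agent: the class `EStatsStore`, its methods, and the table names `"perf"` and `"path"` are all hypothetical, and the idea that a tool resolves the address 4-tuple to an index once and then polls by index is an assumption about how the index would be used.

```python
from dataclasses import dataclass, field


@dataclass
class EStatsStore:
    # address 4-tuple -> connection index (resolved once per connection)
    index_by_tuple: dict = field(default_factory=dict)
    # table name -> {connection index -> row of objects}
    tables: dict = field(default_factory=dict)
    next_index: int = 1

    def open_connection(self, four_tuple, table_names):
        """Assign an index and create a row only in the requested tables."""
        conn_index = self.next_index
        self.next_index += 1
        self.index_by_tuple[four_tuple] = conn_index
        for name in table_names:
            self.tables.setdefault(name, {})[conn_index] = {}
        return conn_index

    def poll(self, table_name, conn_index):
        """Fetch one connection's row from one table by its small index."""
        return self.tables[table_name][conn_index]


store = EStatsStore()
idx = store.open_connection(("10.0.0.1", 5001, "10.0.0.2", 80),
                            ["perf", "path"])
store.tables["perf"][idx]["DataBytesAcked"] = 8000
print(store.poll("perf", idx))
```

Enabling only some tables for a connection is what would let an implementation trade statistics detail against overhead; whether that is worth the extra complexity is exactly the open question here.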
error semantics are not sufficiently specified.
What error should be reported if a given connection index is no longer valid?
Is it appropriate to capture the IP TTL field? (NEW)
| [TCP-ESTATS-MIB] | Matt Mathis, John Heffner, Raghu Reddy, J. Saperia, "TCP Extended Statistics MIB", work in progress. |
| [RFC2012] | McCloghrie, K., "SNMPv2 Management Information Base for the Transmission Control Protocol using SMIv2", RFC 2012, November 1996. |
| [RFC2012bis] | Bill Fenner, et al, "Management Information Base for the Transmission Control Protocol (TCP)", Internet-Draft draft-ietf-ipv6-rfc2012-update-00.txt, expires Dec 2002. |
Please send comments and suggestions to mathis@psc.edu.
This document is a product of the web100 project.