\input texinfo @c -*-texinfo-*-
@c %**start of header
@setfilename netperf.info
@settitle Care and Feeding of Netperf 2.7.X
@c %**end of header
@copying
This is Rick Jones' feeble attempt at a Texinfo-based manual for the
netperf benchmark.
Copyright @copyright{} 2005-2015 Hewlett-Packard Company
@quotation
Permission is granted to copy, distribute and/or modify this document
per the terms of the netperf source license, a copy of which can be
found in the file @file{COPYING} of the basic netperf distribution.
@end quotation
@end copying
@titlepage
@title Care and Feeding of Netperf
@subtitle Versions 2.7.0 and Later
@author Rick Jones @email{rick.jones2@@hp.com}
@c this is here to start the copyright page
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage
@c begin with a table of contents
@contents
@ifnottex
@node Top, Introduction, (dir), (dir)
@top Netperf Manual
@insertcopying
@end ifnottex
@menu
* Introduction:: An introduction to netperf - what it
is and what it is not.
* Installing Netperf:: How to go about installing netperf.
* The Design of Netperf::
* Global Command-line Options::
* Using Netperf to Measure Bulk Data Transfer::
* Using Netperf to Measure Request/Response ::
* Using Netperf to Measure Aggregate Performance::
* Using Netperf to Measure Bidirectional Transfer::
* The Omni Tests::
* Other Netperf Tests::
* Address Resolution::
* Enhancing Netperf::
* Netperf4::
* Concept Index::
* Option Index::
@end menu
@node Introduction, Installing Netperf, Top, Top
@chapter Introduction
@cindex Introduction
Netperf is a benchmark that can be used to measure various aspects of
networking performance. The primary foci are bulk (aka
unidirectional) data transfer and request/response performance using
either TCP or UDP and the Berkeley Sockets interface. As of this
writing, the tests available either unconditionally or conditionally
include:
@itemize @bullet
@item
TCP and UDP unidirectional transfer and request/response over IPv4 and
IPv6 using the Sockets interface.
@item
TCP and UDP unidirectional transfer and request/response over IPv4
using the XTI interface.
@item
Link-level unidirectional transfer and request/response using the DLPI
interface.
@item
Unix domain sockets
@item
SCTP unidirectional transfer and request/response over IPv4 and IPv6
using the sockets interface.
@end itemize
While not every revision of netperf will work on every platform
listed, the intention is that at least some version of netperf will
work on the following platforms:
@itemize @bullet
@item
Unix - at least all the major variants.
@item
Linux
@item
Windows
@item
Others
@end itemize
Netperf is maintained and informally supported primarily by Rick
Jones, who can perhaps be best described as Netperf Contributing
Editor. Non-trivial and very appreciated assistance comes from others
in the network performance community, who are too numerous to mention
here. While it is often used by them, netperf is NOT supported via any
of the formal Hewlett-Packard support channels. You should feel free
to make enhancements and modifications to netperf to suit your
nefarious porpoises, so long as you stay within the guidelines of the
netperf copyright. If you feel so inclined, you can send your changes
to
@email{netperf-feedback@@netperf.org,netperf-feedback} for possible
inclusion into subsequent versions of netperf.
It is the Contributing Editor's belief that the netperf license walks
like open source and talks like open source. However, the license was
never submitted for ``certification'' as an open source license. If
you would prefer to make contributions to a networking benchmark using
a certified open source license, please consider netperf4, which is
distributed under the terms of the GPLv2.
The @email{netperf-talk@@netperf.org,netperf-talk} mailing list is
available to discuss the care and feeding of netperf with others who
share your interest in network performance benchmarking. The
netperf-talk mailing list is a closed list (to deal with spam) and you
must first subscribe by sending email to
@email{netperf-talk-request@@netperf.org,netperf-talk-request}.
@menu
* Conventions::
@end menu
@node Conventions, , Introduction, Introduction
@section Conventions
A @dfn{sizespec} is a one- or two-item, comma-separated list used as an
argument to a command-line option that can set one or two related
netperf parameters. If you wish to set both parameters to separate
values, items should be separated by a comma:
@example
parameter1,parameter2
@end example
If you wish to set the first parameter without altering the value of
the second from its default, you should follow the first item with a
comma:
@example
parameter1,
@end example
Likewise, precede the item with a comma if you wish to set only the
second parameter:
@example
,parameter2
@end example
An item with no commas:
@example
parameter1and2
@end example
will set both parameters to the same value. This last mode is one of
the most frequently used.
There is another variant of the comma-separated, two-item list called
an @dfn{optionspec}, which is like a sizespec except that a
single item with no comma:
@example
parameter1
@end example
will only set the value of the first parameter and will leave the
second parameter at its default value.
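
The two conventions can be summarized in a few lines of code. The
following Python sketch is purely illustrative and is not taken from
the netperf sources; the function names @code{parse_sizespec} and
@code{parse_optionspec} are hypothetical, and a value of @code{None}
stands for ``leave this parameter at its default'':

@example
def split_spec(spec):
    # Split "first,second" into its two items; a missing item becomes None.
    first, comma, second = spec.partition(",")
    return (first or None, second or None, comma == ",")


def parse_sizespec(spec):
    # A lone item with no comma sets BOTH parameters to the same value.
    first, second, has_comma = split_spec(spec)
    if not has_comma:
        return (first, first)
    return (first, second)


def parse_optionspec(spec):
    # A lone item with no comma sets only the FIRST parameter.
    first, second, has_comma = split_spec(spec)
    return (first, second)


print(parse_sizespec("32768"))     # both parameters set to 32768
print(parse_sizespec("32768,"))    # only the first parameter set
print(parse_optionspec("32768"))   # only the first parameter set
@end example
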
Netperf has two types of command-line options. The first are global
command-line options. They are essentially any option not tied to a
particular test or group of tests. An example of a global
command-line option is the one which sets the test type - @option{-t}.
The second type of options are test-specific options. These are
options which are only applicable to a particular test or set of
tests. An example of a test-specific option would be the send socket
buffer size for a TCP_STREAM test.
Global command-line options are specified first with test-specific
options following after a @code{--} as in:
@example
netperf <global> -- <test-specific>
@end example
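
To make the ordering concrete, the following Python sketch assembles a
netperf argument list with the global options first and the
test-specific options after the @code{--}. The helper name
@code{build_command} and the host name @code{remotehost} are
placeholders chosen for this illustration, and the test-specific
@option{-s} option is assumed here to take the socket buffer size as a
sizespec:

@example
def build_command(global_opts, test_opts):
    # Global options come first; "--" introduces the test-specific options.
    command = ["netperf"] + list(global_opts)
    if test_opts:
        command += ["--"] + list(test_opts)
    return command


cmd = build_command(["-t", "TCP_STREAM", "-H", "remotehost"],
                    ["-s", "32768"])
print(" ".join(cmd))
@end example

The resulting list can be passed unchanged to Python's
@code{subprocess.run()} if one wishes to launch netperf from a script.
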
@node Installing Netperf, The Design of Netperf, Introduction, Top
@chapter Installing Netperf
@cindex Installation
Netperf's primary form of distribution is source code. This allows
installation on systems other than those to which the authors have
ready access and thus the ability to create binaries. There are two
styles of netperf installation. The first runs the netperf server
program - netserver - as a child of inetd. This requires the
installer to have sufficient privileges to edit the files
@file{/etc/services} and @file{/etc/inetd.conf} or their
platform-specific equivalents.
The second style is to run netserver as a standalone daemon. This
second method does not require edit privileges on @file{/etc/services}
and @file{/etc/inetd.conf} but does mean you must remember to run the
netserver program explicitly after every system reboot.
This manual assumes that those wishing to measure networking
performance already know how to use anonymous FTP and/or a web
browser. It is also expected that you have at least a passing
familiarity with the networking protocols and interfaces involved. In
all honesty, if you do not have such familiarity, likely as not you
have some experience to gain before attempting network performance
measurements. The excellent texts by authors such as Stevens, Fenner
and Rudoff and/or Stallings would be good starting points. There are
likely other excellent sources out there as well.
@menu
* Getting Netperf Bits::
* Installing Netperf Bits::
* Verifying Installation::
@end menu
@node Getting Netperf Bits, Installing Netperf Bits, Installing Netperf, Installing Netperf
@section Getting Netperf Bits
Gzipped tar files of netperf sources can be retrieved via
@uref{ftp://ftp.netperf.org/netperf,anonymous FTP}
for ``released'' versions of the bits. Pre-release versions of the
bits can be retrieved via anonymous FTP from the
@uref{ftp://ftp.netperf.org/netperf/experimental,experimental} subdirectory.
For convenience and ease of remembering, a link to the download site
is provided via the
@uref{http://www.netperf.org/, NetperfPage}.
The bits corresponding to each discrete release of netperf are
@uref{http://www.netperf.org/svn/netperf2/tags,tagged} for retrieval
via subversion. For example, there is a tag for the first version
corresponding to this version of the manual -
@uref{http://www.netperf.org/svn/netperf2/tags/netperf-2.7.0,netperf
2.7.0}. Those wishing to be on the bleeding edge of netperf
development can use subversion to grab the
@uref{http://www.netperf.org/svn/netperf2/trunk,top of trunk}. When
fixing bugs or making enhancements, patches against the top-of-trunk
are preferred.
There are likely other places around the Internet from which one can
download netperf bits. These may be simple mirrors of the main
netperf site, or they may be local variants on netperf. As with
anything one downloads from the Internet, take care to make sure it is
what you really wanted and isn't some malicious Trojan or whatnot.
Caveat downloader.
As a general rule, binaries of netperf and netserver are not
distributed from ftp.netperf.org. From time to time a kind soul or
souls has packaged netperf as a Debian package available via the
apt-get mechanism or as an RPM. I would be most interested in
learning how to enhance the makefiles to make that easier for people.
@node Installing Netperf Bits, Verifying Installation, Getting Netperf Bits, Installing Netperf
@section Installing Netperf
Once you have downloaded the tar file of netperf sources onto your
system(s), it is necessary to unpack the tar file, cd to the netperf
directory, run configure and then make. Most of the time it should be
sufficient to just:
@example
gzcat netperf-<version>.tar.gz | tar xf -
cd netperf-<version>
./configure
make
make install
@end example
Most of the ``usual'' configure script options should be present
dealing with where to install binaries and whatnot.
@example
./configure --help
@end example
should list all of those and more. You may find the @code{--prefix}
option helpful in deciding where the binaries and such will be put
during the @code{make install}.
@vindex --enable-cpuutil, Configure
If the netperf configure script does not know how to automagically
detect which CPU utilization mechanism to use on your platform you may
want to add a @code{--enable-cpuutil=mumble} option to the configure
command. If you have knowledge and/or experience to contribute to
that area, feel free to contact @email{netperf-feedback@@netperf.org}.
@vindex --enable-xti, Configure
@vindex --enable-unixdomain, Configure
@vindex --enable-dlpi, Configure
@vindex --enable-sctp, Configure
Similarly, if you want tests using the XTI interface, Unix Domain
Sockets, DLPI or SCTP it will be necessary to add one or more
@code{--enable-[xti|unixdomain|dlpi|sctp]=yes} options to the configure
command. As of this writing, the configure script will not include
those tests automagically.
@vindex --enable-omni, Configure
Starting with version 2.5.0, netperf began migrating most of the
``classic'' netperf tests found in @file{src/nettest_bsd.c} to the
so-called ``omni'' tests (aka ``two routines to run them all'') found
in @file{src/nettest_omni.c}. This migration enables a number of new
features such as greater control over what output is included, and new
things to output. The ``omni'' test is enabled by default in 2.5.0
and a number of the classic tests are migrated - you can tell if a
test has been migrated
from the presence of @code{MIGRATED} in the test banner. If you
encounter problems with either the omni or migrated tests, please
first attempt to obtain resolution via
@email{netperf-talk@@netperf.org} or
@email{netperf-feedback@@netperf.org}. If that is unsuccessful, you
can add a @code{--enable-omni=no} to the configure command and the
omni tests will not be compiled-in and the classic tests will not be
migrated.
Starting with version 2.5.0, netperf includes the ``burst mode''
functionality in a default compilation of the bits. If you encounter
problems with this, please first attempt to obtain help via
@email{netperf-talk@@netperf.org} or
@email{netperf-feedback@@netperf.org}. If that is unsuccessful, you
can add a @code{--enable-burst=no} to the configure command and the
burst mode functionality will not be compiled-in.
On some platforms, it may be necessary to precede the configure
command with a CFLAGS and/or LIBS variable as the netperf configure
script is not yet smart enough to set them itself. Whenever possible,
these requirements will be found in @file{README.@var{platform}} files.
Expertise and assistance in making that more automagic in the
configure script would be most welcome.
@cindex Limiting Bandwidth
@cindex Bandwidth Limitation
@vindex --enable-intervals, Configure
@vindex --enable-histogram, Configure
Other optional configure-time settings include
@code{--enable-intervals=yes} to give netperf the ability to ``pace''
its _STREAM tests and @code{--enable-histogram=yes} to have netperf
keep a histogram of interesting times. Each of these will have some
effect on the measured result. If your system supports
@code{gethrtime()} the effect of the histogram measurement should be
minimized but probably still measurable. For example, the histogram
of a netperf TCP_RR test will be of the individual transaction times:
@example
netperf -t TCP_RR -H lag -v 2
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET : histogram
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       10.00    3538.82
32768  32768
Alignment      Offset
Local  Remote  Local  Remote
Send   Recv    Send   Recv
    8      0       0      0
Histogram of request/response times
UNIT_USEC     :    0:     0:    0:    0:    0:    0:    0:    0:    0:    0
TEN_USEC      :    0:     0:    0:    0:    0:    0:    0:    0:    0:    0
HUNDRED_USEC  :    0: 34480:  111:   13:   12:    6:    9:    3:    4:    7
UNIT_MSEC     :    0:    60:   50:   51:   44:   44:   72:  119:  100:  101
TEN_MSEC      :    0:   105:    0:    0:    0:    0:    0:    0:    0:    0
HUNDRED_MSEC  :    0:     0:    0:    0:    0:    0:    0:    0:    0:    0
UNIT_SEC      :    0:     0:    0:    0:    0:    0:    0:    0:    0:    0
TEN_SEC       :    0:     0:    0:    0:    0:    0:    0:    0:    0:    0
>100_SECS: 0
HIST_TOTAL:      35391
@end example
The histogram you see above is basically a base-10 log histogram. Each
row covers one order of magnitude and each column one leading digit,
so we can see that most of the transaction times were on the order of
one hundred to one hundred ninety-nine microseconds, but they were
occasionally as long as ten to nineteen milliseconds.
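
The row and column in which a given time lands can be worked out from
its order of magnitude and leading digit. The Python sketch below
illustrates that reading of the output; it is not netperf's own
histogram code, the function name @code{bucket_for} is hypothetical,
and times are assumed to be whole microseconds:

@example
ROWS = ["UNIT_USEC", "TEN_USEC", "HUNDRED_USEC", "UNIT_MSEC",
        "TEN_MSEC", "HUNDRED_MSEC", "UNIT_SEC", "TEN_SEC"]


def bucket_for(usec):
    # Return (row label, column index 0-9) for a time in whole microseconds.
    if usec >= 100 * 1000 * 1000:
        return (">100_SECS", 0)
    row = 0
    while usec >= 10:
        usec //= 10      # drop one decimal digit per row
        row += 1
    return (ROWS[row], usec)


print(bucket_for(150))      # ('HUNDRED_USEC', 1): 100 to 199 microseconds
print(bucket_for(12000))    # ('TEN_MSEC', 1): 10 to 19 milliseconds
@end example

As a cross-check on the example output, the counts in the three
non-empty rows sum to 34645 + 641 + 105 = 35391, which matches the
@code{HIST_TOTAL} line.
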
The @option{--enable-demo=yes} configure option will cause code to be
included to report interim results during a test run. The rate at
which interim results are reported can then be controlled via the
global @option{-D} option. Here is an example of @option{-D} output:
@example
$ src/netperf -D 1.35 -H tardy.hpl.hp.com -f M
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.hpl.hp.com (15.9.116.144) port 0 AF_INET : demo
Interim result: 5.41 MBytes/s over 1.35 seconds ending at 1308789765.848
Interim result: 11.07 MBytes/s over 1.36 seconds ending at 1308789767.206
Interim result: 16.00 MBytes/s over 1.36 seconds ending at 1308789768.566
Interim result: 20.66 MBytes/s over 1.36 seconds ending at 1308789769.922
Interim result: 22.74 MBytes/s over 1.36 seconds ending at 1308789771.285
Interim result: 23.07 MBytes/s over 1.36 seconds ending at 1308789772.647
Interim result: 23.77 MBytes/s over 1.37 seconds ending at 1308789774.016
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    MBytes/sec

 87380  16384  16384    10.06      17.81
@end example
Notice how the units of the interim result track that requested by the
@option{-f} option. Also notice that sometimes the interval will be
longer than the value specified in the @option{-D} option. This is
normal: demo mode is implemented not by relying on interval timers or
frequent calls to get the current time, but by calculating how many
units of work must be performed to take at least the desired interval.
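
The idea can be sketched as follows. This Python fragment is a
simplified illustration of the approach just described and is not
netperf's actual implementation; the function name
@code{units_for_interval} and its parameters are hypothetical:

@example
import math


def units_for_interval(units_done, elapsed_secs, target_secs):
    # Estimate the current rate, then scale it so that the next batch of
    # work should take at least target_secs to complete.
    rate = units_done / elapsed_secs
    return max(1, math.ceil(rate * target_secs))


# 1000 sends took 0.5 seconds, so about 3000 are needed to span 1.5 seconds.
print(units_for_interval(1000, 0.5, 1.5))
@end example

Because the elapsed time is only examined after a batch completes, any
slowdown during the batch stretches the reported interval beyond the
requested value.
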
Those familiar with this option in earlier versions of netperf will
note the addition of the ``ending at'' text. This is the time as
reported by a @code{gettimeofday()} call (or its emulation) with a
@code{NULL} timezone pointer. This addition is intended to make it
easier to insert interim results into an
@uref{http://oss.oetiker.ch/rrdtool/doc/rrdtool.en.html,rrdtool}
Round-Robin Database (RRD). A likely bug-riddled example of doing so
can be found in @file{doc/examples/netperf_interim_to_rrd.sh}. The
time is reported out to milliseconds rather than microseconds because
that is the most rrdtool understands as of the time of this writing.
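
As a complement to that shell script, a minimal Python sketch of
extracting the timestamp and value from interim result lines might
look like the following. It assumes the line format shown in the
example output above; the function name @code{parse_interim} is
hypothetical, and the shipped example script may well work
differently:

@example
import re

INTERIM = re.compile(
    r"Interim result:\s+(\S+)\s+(\S+)\s+over\s+(\S+)\s+seconds"
    r"\s+ending at\s+(\S+)")


def parse_interim(line):
    # Return (end_time, value, units, interval), or None for other lines.
    match = INTERIM.search(line)
    if match is None:
        return None
    value, units, interval, end_time = match.groups()
    return (float(end_time), float(value), units, float(interval))


sample = ("Interim result: 5.41 MBytes/s over 1.35 seconds "
          "ending at 1308789765.848")
print(parse_interim(sample))
@end example
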
As of this writing, a @code{make install} will not actually update the
files @file{/etc/services} and/or @file{/etc/inetd.conf} or their
platform-specific equivalents. It remains necessary to perform that
bit of installation magic by hand. Patches to the makefile sources to
effect an automagic editing of the necessary files to have netperf
installed as a child of inetd would be most welcome.
Starting the netserver as a standalone daemon should be as easy as:
@example
$ netserver
Starting netserver at port 12865
Starting netserver at hostname 0.0.0.0 port 12865 and family 0
@end example
Over time the specifics of the messages netserver prints to the screen
may change but the gist will remain the same.
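
One quick way to confirm that a standalone netserver is accepting
connections is to open a TCP connection to its port. The Python
sketch below is an illustration only; the function name
@code{port_is_open} is hypothetical, and port 12865 is simply the port
shown in the netserver output above:

@example
import socket


def port_is_open(host, port, timeout=2.0):
    # Try to open, then immediately close, a TCP connection to host:port.
    try:
        conn = socket.create_connection((host, port), timeout)
    except OSError:
        return False
    conn.close()
    return True


if __name__ == "__main__":
    print(port_is_open("localhost", 12865))
@end example

A successful connection only shows that something is listening on the
port; running an actual netperf test, as described in @ref{Verifying
Installation}, remains the real check.
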
If the compilation of netperf or netserver happens to fail, feel free
to contact @email{netperf-feedback@@netperf.org} or join and ask in
@email{netperf-talk@@netperf.org}. However, it is quite important
that you include the actual compilation errors and perhaps even the
configure log in your email. Otherwise, it will be that much more
difficult for someone to assist you.
@node Verifying Installation, , Installing Netperf Bits, Installing Netperf
@section Verifying Installation
Basically, once netperf is installed and netserver is configured as a
child of inetd, or launched as a standalone daemon, simply typing:
@example
netperf
@end example
should result in output similar to the following:
@example
$ netperf
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    2997.84
@end example
@node The Design of Netperf, Global Command-line Options, Installing Netperf, Top
@chapter The Design of Netperf
@cindex Design of Netperf
Netperf is designed around a basic client-server model. There are
two executables - netperf and netserver. Generally you will only
execute the netperf program, with the netserver program being invoked
by the remote system's inetd or having been previously started as its
own standalone daemon.
When you execute netperf it will establish a ``control connection'' to
the remote system. This connection will be used to pass test
configuration information and results to and from the remote system.
Regardless of the type of test to be run, the control connection will
be a TCP connection using BSD sockets. The control connection can use
either IPv4 or IPv6.
Once the control connection is up and the configuration information
has been passed, a separate ``data'' connection will be opened for the
measurement itself using the APIs and protocols appropriate for the
specified test. When the test is completed, the data connection will
be torn down and results from the netserver will be passed back via the
control connection and combined with netperf's result for display to
the user.
Netperf places no traffic on the control connection while a test is in
progress. Certain TCP options, such as SO_KEEPALIVE, if set as your
system's default, may put packets out on the control connection while
a test is in progress. Generally speaking this will have no effect on
the results.
@menu
* CPU Utilization::
@end menu
@node CPU Utilization, , The Design of Netperf, The Design of Netperf
@section CPU Utilization
@cindex CPU Utilization
CPU utilization is an important and, alas, all too infrequently
reported component of networking performance. Unfortunately, it can
be one of the most difficult metrics to measure accurately and
portably. Netperf will do its level best to report accurate
CPU utilization figures, but some combinations of processor, OS and
configuration may make that difficult.
CPU utilization in netperf is reported as a value between 0 and 100%
regardless of the number of CPUs involved. In addition to CPU
utilization, netperf will report a metric called a @dfn{service
demand}. The service demand normalizes CPU utilization by the work
performed. For a _STREAM test it is the microseconds of CPU
time consumed to transfer one KB (K == 1024) of data. For a _RR test
it is the microseconds of CPU time consumed processing a single
transaction. For both CPU utilization and service demand, lower is
better.
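
As a worked illustration of these definitions, the Python sketch below
computes a service demand from a utilization figure. It assumes the
reported utilization is an average across all CPUs, so that the CPU
time consumed is the utilization fraction multiplied by the number of
CPUs and the elapsed time. The function names are hypothetical and
this is not a transcription of netperf's own calculation:

@example
def cpu_usec(util_percent, num_cpus, elapsed_secs):
    # Total CPU time consumed during the test, in microseconds.
    return (util_percent / 100.0) * num_cpus * elapsed_secs * 1e6


def stream_service_demand(util_percent, num_cpus, elapsed_secs,
                          bytes_moved):
    # Microseconds of CPU time per KB (K == 1024) transferred.
    consumed = cpu_usec(util_percent, num_cpus, elapsed_secs)
    return consumed / (bytes_moved / 1024.0)


def rr_service_demand(util_percent, num_cpus, elapsed_secs,
                      transactions):
    # Microseconds of CPU time per transaction.
    consumed = cpu_usec(util_percent, num_cpus, elapsed_secs)
    return consumed / transactions


# 25% of 4 CPUs for 10 seconds while moving 1,000,000 KB: 10 usec/KB
print(stream_service_demand(25.0, 4, 10.0, 1000000 * 1024))
# 50% of 2 CPUs for 10 seconds over 100,000 transactions: 100 usec each
print(rr_service_demand(50.0, 2, 10.0, 100000))
@end example
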
Service demand can be particularly useful when trying to gauge the
effect of a performance change. It is essentially a measure of
efficiency, with smaller values being more efficient and thus
``better.''
Netperf is coded to be able to use one of several, generally
platform-specific CPU utilization measurement mechanisms. Single
letter codes will be included in the CPU portion of the test banner to
indicate which mechanism was used on each of the local (netperf) and
remote (netserver) systems.
As of this writing those codes are:
@table @code
@item U
The CPU utilization measurement mechanism was unknown to netperf or
netperf/netserver was not compiled to include CPU utilization
measurements. The code for the null CPU utilization mechanism can be
found in @file{src/netcpu_none.c}.
@item I
An HP-UX-specific CPU utilization mechanism whereby the kernel
incremented a per-CPU counter by one for each trip through the idle
loop. This mechanism was only available on specially-compiled HP-UX
kernels prior to HP-UX 10 and is mentioned here only for the sake of
historical completeness and perhaps as a suggestion to those who might
be altering other operating systems. While rather simple, perhaps even
simplistic, this mechanism was quite robust and was not affected by
the concerns of statistical methods, or methods attempting to track
time in each of user, kernel, interrupt and idle modes which require
quite careful accounting. It can be thought of as the in-kernel
version of the looper @code{L} mechanism without the context switch
overhead. This mechanism required calibration.
@item P
An HP-UX-specific CPU utilization mechanism whereby the kernel
keeps track of time (in the form of CPU cycles) spent in the kernel
idle loop (HP-UX 10.0 to 11.31 inclusive), or where the kernel keeps
track of time spent in idle, user, kernel and interrupt processing
(HP-UX 11.23 and later). The former requires calibration, the latter
does not. Values in either case are retrieved via one of the pstat(2)
family of calls, hence the use of the letter @code{P}. The code for
these mechanisms is found in @file{src/netcpu_pstat.c} and
@file{src/netcpu_pstatnew.c} respectively.
@item K
A Solaris-specific CPU utilization mechanism whereby the kernel keeps
track of ticks (e.g., HZ) spent in the idle loop. This method is
statistical and is known to be inaccurate when the interrupt rate is
above epsilon as time spent processing interrupts is not subtracted
from idle. The value is retrieved via a kstat() call - hence the use
of the letter @code{K}. Since this mechanism uses units of ticks (HZ)
the calibration value should invariably match HZ (e.g., 100). The code
for this mechanism is implemented in @file{src/netcpu_kstat.c}.
@item M
A Solaris-specific mechanism available on Solaris 10 and later which
uses the new microstate accounting mechanisms. There are two, alas,
overlapping, mechanisms. The first tracks nanoseconds spent in user,
kernel, and idle modes. The second mechanism tracks nanoseconds spent
in interrupt. Since the mechanisms overlap, netperf goes through some
hand-waving to try to ``fix'' the problem. Since the accuracy of the
handwaving cannot be completely determined, one must presume that
while better than the @code{K} mechanism, this mechanism too is not
without issues. The values are retrieved via kstat() calls, but the
letter code is set to @code{M} to distinguish this mechanism from the
even less accurate @code{K} mechanism. The code for this mechanism is
implemented in @file{src/netcpu_kstat10.c}.
@item L
A mechanism based on ``looper'' or ``soaker'' processes which sit in
tight loops counting as fast as they possibly can. This mechanism
starts a looper process for each known CPU on the system. The effect
of processor hyperthreading on the mechanism is not yet known. This
mechanism definitely requires calibration. The code for the
``looper'' mechanism can be found in @file{src/netcpu_looper.c}.
@item N
A Microsoft Windows-specific mechanism, the code for which can be
found in @file{src/netcpu_ntperf.c}. This mechanism too is based on
what appears to be a form of micro-state accounting and requires no
calibration. On laptops, or other systems which may dynamically alter
the CPU frequency to minimize power consumption, it has been suggested
that this mechanism may become slightly confused, in which case using
BIOS/uEFI settings to disable the power saving would be indicated.
@item S
This mechanism uses @file{/proc/stat} on Linux to retrieve time
(ticks) spent in idle mode. It is thought but not known to be
reasonably accurate. The code for this mechanism can be found in
@file{src/netcpu_procstat.c}.
@item C
A mechanism somewhat similar to @code{S} but using the sysctl() call
on BSD-like operating systems (*BSD and MacOS X). The code for this
mechanism can be found in @file{src/netcpu_sysctl.c}.
@item Others
Other mechanisms included in netperf in the past have included using
the times() and getrusage() calls. These calls are actually rather
poorly suited to the task of measuring CPU overhead for networking as
they tend to be process-specific, and much network-related processing
can happen outside the context of a process, in places where there is
no guarantee it will be charged to the correct process, or to any
process at all. They
are mentioned here as a warning to anyone seeing those mechanisms used
in other networking benchmarks. These mechanisms are not available in
netperf 2.4.0 and later.
@end table
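
To illustrate the kind of arithmetic behind an idle-time mechanism
such as @code{S}, the Python sketch below derives a utilization figure
from two samples of the aggregate @code{cpu} line of
@file{/proc/stat}. It assumes the usual Linux layout in which the
fourth numeric field is idle time and treats the sum of all fields as
the total. The function names are hypothetical, and the code in
@file{src/netcpu_procstat.c} may account for the fields differently:

@example
def parse_cpu_line(line):
    # "cpu  user nice system idle ..." -> (idle ticks, total ticks)
    fields = [int(value) for value in line.split()[1:]]
    return (fields[3], sum(fields))


def utilization(before, after):
    # Percentage of non-idle time between two samples of the "cpu" line.
    idle0, total0 = parse_cpu_line(before)
    idle1, total1 = parse_cpu_line(after)
    return 100.0 * (1.0 - (idle1 - idle0) / (total1 - total0))


first = "cpu  100 0 100 800 0 0 0"
later = "cpu  225 0 225 1550 0 0 0"
print(utilization(first, later))    # 750 of 1000 elapsed ticks were idle
@end example
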
For many platforms, the configure script will choose the best available
CPU utilization mechanism. However, some platforms have no
particularly good mechanisms. On those platforms, it is probably best
to use the ``LOOPER'' mechanism which is basically some number of
processes (as many as there are processors) sitting in tight little
loops counting as fast as they can. The rate at which the loopers
count when the system is believed to be idle is compared with the rate
when the system is running netperf and the ratio is used to compute
CPU utilization.
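
The ratio described above can be expressed directly. This Python
sketch is illustrative only; the function name
@code{looper_utilization} is hypothetical, and the calibration and
counting details in @file{src/netcpu_looper.c} are not reproduced
here:

@example
def looper_utilization(idle_rate, loaded_rate):
    # idle_rate:   looper counts per second measured on an idle system
    #              (the calibration value)
    # loaded_rate: looper counts per second while the test is running
    # Any cycles the test takes away from the loopers show up as a
    # lower count rate, so the shortfall is the utilization.
    return 100.0 * (1.0 - loaded_rate / idle_rate)


print(looper_utilization(1000000.0, 750000.0))    # 25.0
@end example
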
In the past, netperf included some mechanisms that only reported CPU
time charged to the calling process. Those mechanisms have been
removed from netperf versions 2.4.0 and later because they are
hopelessly inaccurate. Networking can, and often does, result in CPU
time being spent in places - such as interrupt contexts - that do not
get charged to the correct process, or to any process at all.
In fact, time spent in the processing of interrupts is a common issue
for many CPU utilization mechanisms. In particular, the ``PSTAT''
mechanism was eventually known to have problems accounting for certain
interrupt time prior to HP-UX 11.11 (11iv1). HP-UX 11iv2 and later
are known/presumed to be good. The ``KSTAT'' mechanism is known to
have problems on all versions of Solaris up to and including Solaris
10. Even the microstate accounting available via kstat in Solaris 10
has issues, though perhaps not as bad as those of prior versions.
The /proc/stat mechanism under Linux is in what the author would
consider an ``uncertain'' category as it appears to be statistical,
which may also have issues with time spent processing interrupts.
In summary, be sure to ``sanity-check'' the CPU utilization figures
with other mechanisms. Bear in mind, however, that platform tools such
as top, vmstat or mpstat are often based on the same mechanisms used by
netperf, so agreement with them is not independent confirmation.
@menu
* CPU Utilization in a Virtual Guest::
@end menu
@node CPU Utilization in a Virtual Guest, , CPU Utilization, CPU Utilization
@subsection CPU Utilization in a Virtual Guest
The CPU utilization mechanisms used by netperf are ``inline'' in that
they are run by the same netperf or netserver process as is running
the test itself. This works just fine for ``bare iron'' tests but
runs into a problem when using virtual machines.
The relationship between virtual guest and hypervisor can be thought
of as being similar to that between a process and kernel in a bare
iron system. As such, (m)any CPU utilization mechanisms used in the
virtual guest are similar to ``process-local'' mechanisms in a bare
iron situation. However, just as with bare iron and process-local
mechanisms, much networking processing happens outside the context of
the virtual guest. It takes place in the hypervisor, and is not
visible to mechanisms running in the guest(s). For this reason, one
should not really trust CPU utilization figures reported by netperf or
netserver when running in a virtual guest.
If one is looking to measure the added overhead of a virtualization
mechanism, rather than rely on CPU utilization, one can rely instead
on netperf _RR tests - path-lengths and overheads can be a significant
fraction of the latency, so increases in overhead should appear as
decreases in transaction rate. Whatever you do, @b{DO NOT} rely on
the throughput of a _STREAM test. Achieving link-rate can be done via
a multitude of options that mask overhead rather than eliminate it.
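As a hedged sketch of such a comparison, one might run the same
request/response test once on bare iron and once from within the
guest and compare the reported transaction rates. The host name
``remotehost'' is a placeholder, and @code{TCP_RR} is used here only
as a representative _RR test:
@example
# run on bare iron, then repeat from within the virtual guest
netperf -H remotehost -t TCP_RR -l 30
@end example
A lower transaction rate in the guest would suggest added
virtualization overhead in the path.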
@node Global Command-line Options, Using Netperf to Measure Bulk Data Transfer, The Design of Netperf, Top
@chapter Global Command-line Options
This section describes each of the global command-line options
available in the netperf and netserver binaries. Essentially, it is
an expanded version of the usage information displayed by netperf or
netserver when invoked with the @option{-h} global command-line
option.
@menu
* Command-line Options Syntax::
* Global Options::
@end menu
@node Command-line Options Syntax, Global Options, Global Command-line Options, Global Command-line Options
@comment node-name, next, previous, up
@section Command-line Options Syntax
Revision 1.8 of netperf introduced enough new functionality to overrun
the English alphabet for mnemonic command-line option names, and the
author was not and is not quite ready to switch to the contemporary
@option{--mumble} style of command-line options. (Call him a Luddite
if you wish :).
For this reason, the command-line options were split into two parts.
The first are the global command-line options, which affect nearly any
and every test type of netperf. The second are the test-specific
command-line options. Both are entered on the same
command line, but they must be separated from one another by a @code{--}
for correct parsing. Global command-line options come first, followed
by the @code{--} and then test-specific command-line options. If there
are no test-specific options to be set, the @code{--} may be omitted. If
there are no global command-line options to be set, test-specific
options must still be preceded by a @code{--}. For example:
@example
netperf <global> -- <test-specific>
@end example
sets both global and test-specific options:
@example
netperf <global>
@end example
sets just global options and:
@example
netperf -- <test-specific>
@end example
sets just test-specific options.
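To make this concrete, here is a sketch with actual options filled
in. The host name ``remotehost'' is a placeholder, and the
test-specific @option{-m} option (send size for a TCP_STREAM test) is
used purely as an illustration:
@example
# global options, then the separator, then test-specific options
netperf -H remotehost -t TCP_STREAM -l 30 -- -m 1024
# global options only
netperf -H remotehost -t TCP_STREAM -l 30
# test-specific options only; all global options take their defaults
netperf -- -m 1024
@end example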
@node Global Options, , Command-line Options Syntax, Global Command-line Options
@comment node-name, next, previous, up
@section Global Options
@table @code
@vindex -a, Global
@item -a <sizespec>
This option allows you to alter the alignment of the buffers used in
the sending and receiving calls on the local system. Changing the
alignment of the buffers can force the system to use different copy
schemes, which can have a measurable effect on performance. If the
page size for the system were 4096 bytes, and you want to pass
page-aligned buffers beginning on page boundaries, you could use
@samp{-a 4096}. By default the units are bytes, but a suffix of ``G,''
``M,'' or ``K'' will specify the units to be 2^30 (GB), 2^20 (MB) or
2^10 (KB) respectively. A suffix of ``g,'' ``m'' or ``k'' will specify
units of 10^9, 10^6 or 10^3 bytes respectively. [Default: 8 bytes]
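As a brief illustration of the suffix arithmetic (``remotehost'' is a
placeholder name):
@example
netperf -H remotehost -a 4K    # 4 * 2^10 = 4096-byte alignment
netperf -H remotehost -a 4k    # 4 * 10^3 = 4000-byte alignment
@end example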
@vindex -A, Global
@item -A <sizespec>
This option is identical to the @option{-a} option with the difference
being it affects alignments for the remote system.
@vindex -b, Global
@item -b <size>
This option is only present when netperf has been configured with
@option{--enable-intervals=yes} prior to compilation. It sets the size of the
burst of send calls in a _STREAM test. When used in conjunction with
the @option{-w} option it can cause the rate at which data is sent to
be ``paced.''
@vindex -B, Global
@item -B <string>
This option will cause @option{<string>} to be appended to the brief
(see @option{-P}) output of netperf.
@vindex -c, Global
@item -c [rate]
This option will ask that CPU utilization and service demand be
calculated for the local system. For those CPU utilization mechanisms
requiring calibration, the optional rate parameter may be specified to
preclude running another calibration step, saving 40 seconds of time.
For those CPU utilization mechanisms requiring no calibration, the
optional rate parameter will be utterly and completely ignored.
[Default: no CPU measurements]
@vindex -C, Global
@item -C [rate]
This option requests CPU utilization and service demand calculations
for the remote system. It is otherwise identical to the @option{-c}
option.
@vindex -d, Global
@item -d
Each instance of this option will increase the quantity of debugging
output displayed during a test. If the debugging output level is set
high enough, it may have a measurable effect on performance.
Debugging information for the local system is printed to stdout.
Debugging information for the remote system is sent by default to the
file @file{/tmp/netperf.debug}. [Default: no debugging output]
@vindex -D, Global
@item -D [interval,units]
This option is only available when netperf is configured with
@option{--enable-demo=yes}. When set, it will cause netperf to emit periodic
reports of performance during the run. [@var{interval},@var{units}]
follow the semantics of an optionspec. If specified,
@var{interval} gives the minimum interval in real seconds; it does not
have to be a whole number of seconds. The @var{units} value can be used for the
first guess as to how many units of work (bytes or transactions) must
be done to take at least @var{interval} seconds. If omitted,
@var{interval} defaults to one second and @var{units} to values
specific to each test type.
@vindex -f, Global
@item -f G|M|K|g|m|k|x
This option can be used to change the reporting units for _STREAM
tests. Arguments of ``G,'' ``M,'' or ``K'' will set the units to
2^30, 2^20 or 2^10 bytes/s respectively (that is, power-of-two GB, MB or
KB). Arguments of ``g,'' ``m,'' or ``k'' will set the units to 10^9,
10^6 or 10^3 bits/s respectively. An argument of ``x'' requests the
units be transactions per second and is only meaningful for a
request-response test. [Default: ``m'' or 10^6 bits/s]
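As a worked illustration of how the units relate, the arithmetic
below converts a throughput reported in the default units to what
@samp{-f M} would be expected to report. The 40.23 figure is simply
borrowed from the sample output shown for the @option{-I} option, and
the result is rounded:
@example
40.23 * 10^6 bits/s / 8 bits per byte  = 5,028,750 bytes/s
5,028,750 bytes/s / 2^20 bytes per MB  = approximately 4.80 MB/s
@end example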
@vindex -F, Global
@item -F <fillfile>
This option specifies the file from which the send buffers will be
pre-filled. While the buffers will contain data from the specified
file, the file is not fully transferred to the remote system as the
receiving end of the test will not write the contents of what it
receives to a file. This can be used to pre-fill the send buffers
with data having different compressibility and so is useful when
measuring performance over mechanisms which perform compression.
While previously required for a TCP_SENDFILE test, later versions of
netperf removed that restriction, creating a temporary file as
needed. While the author cannot recall exactly when that took place,
it is known to be unnecessary in version 2.5.0 and later.
@vindex -h, Global
@item -h
This option causes netperf to display its ``global'' usage string and
exit to the exclusion of all else.
@vindex -H, Global
@item -H <optionspec>
This option will set the name of the remote system and/or the address
family used for the control connection. For example:
@example
-H linger,4
@end example
will set the name of the remote system to ``linger'' and tell netperf to
use IPv4 addressing only.
@example
-H ,6
@end example
will leave the name of the remote system at its default, and request
that only IPv6 addresses be used for the control connection.
@example
-H lag
@end example
will set the name of the remote system to ``lag'' and leave the
address family as AF_UNSPEC, which means selection of IPv4 vs IPv6 is
left to the system's address resolution.
A value of ``inet'' can be used in place of ``4'' to request IPv4 only
addressing. Similarly, a value of ``inet6'' can be used in place of
``6'' to request IPv6 only addressing. A value of ``0'' can be used
to request either IPv4 or IPv6 addressing as name resolution dictates.
By default, the options set with the global @option{-H} option are
inherited by the test for its data connection, unless a test-specific
@option{-H} option is specified.
If a @option{-H} option follows either the @option{-4} or @option{-6}
options, the family setting specified with the -H option will override
the @option{-4} or @option{-6} options for the remote address
family. If no address family is specified, settings from a previous
@option{-4} or @option{-6} option will remain. In a nutshell, the
last explicit global command-line option wins.
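As a sketch of how these rules combine, using the remote system name
``lag'' from the example above:
@example
netperf -6 -H lag,4    # -H names a family, so IPv4 overrides the earlier -6
netperf -6 -H lag      # -H names no family, so the earlier -6 remains
@end example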
[Default: ``localhost'' for the remote name/IP address and ``0'' (that is,
AF_UNSPEC) for the remote address family.]
@vindex -I, Global
@item -I <optionspec>
This option enables the calculation of confidence intervals and sets
the confidence and width parameters with the first half of the
optionspec being either 99 or 95 for 99% or 95% confidence
respectively. The second value of the optionspec specifies the width
of the desired confidence interval. For example
@example
-I 99,5
@end example
asks netperf to be 99% confident that the measured mean values for
throughput and CPU utilization are within +/- 2.5% of the ``real''
mean values. If the @option{-i} option is specified and the
@option{-I} option is omitted, the confidence defaults to 99% and the
width to 5% (giving +/- 2.5%).
If a classic netperf test calculates that the desired confidence
intervals have not been met, it emits a noticeable warning that cannot
be suppressed with the @option{-P} or @option{-v} options:
@example
netperf -H tardy.cup -i 3 -I 99,5
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.cup.hp.com (15.244.44.58) port 0 AF_INET : +/-2.5% @@ 99% conf.
!!! WARNING
!!! Desired confidence was not achieved within the specified iterations.
!!! This implies that there was variability in the test environment that
!!! must be investigated before going further.
!!! Confidence intervals: Throughput      :  6.8%
!!!                       Local CPU util  :  0.0%
!!!                       Remote CPU util :  0.0%
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec
 32768  16384  16384    10.01      40.23
@end example
In the example above we see that netperf did not meet the desired
confidence intervals. Instead of being 99% confident it was within
+/- 2.5% of the real mean value of throughput, it is only confident it
was within +/- 3.4% (half of the 6.8% interval width reported in the
warning). In this example, increasing the @option{-i}
option (described below) and/or increasing the iteration length with
the @option{-l} option might resolve the situation.
In an explicit ``omni'' test, failure to meet the confidence intervals
will not result in netperf emitting a warning. To verify whether the
confidence intervals were met, one will need to include them as
part of an @ref{Omni Output Selection,output selection} in the
test-specific @option{-o}, @option{-O} or @option{-k} output selection
options. The warning about not hitting the confidence intervals will
remain in a ``migrated'' classic netperf test.
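A hedged sketch of such an output selection follows. The host name
``remotehost'' is a placeholder; @code{CONFIDENCE_ITERATION} is
described under the @option{-i} option, while @code{THROUGHPUT} and
@code{THROUGHPUT_CONFID} are assumed here to be valid selectors and
should be checked against @ref{Omni Output Selectors}:
@example
netperf -H remotehost -t omni -i 10,3 -I 99,5 -- \
    -k THROUGHPUT,THROUGHPUT_CONFID,CONFIDENCE_ITERATION
@end example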
@vindex -i, Global
@item -i <sizespec>
This option enables the calculation of confidence intervals and sets
the minimum and maximum number of iterations to run in attempting to
achieve the desired confidence interval. The first value sets the
maximum number of iterations to run, the second, the minimum. The
maximum number of iterations is silently capped at 30 and the minimum
is silently floored at 3. Netperf repeats the measurement the minimum
number of iterations and continues until it reaches either the
desired confidence interval, or the maximum number of iterations,
whichever comes first. A classic or migrated netperf test will not
display the actual number of iterations run. An @ref{The Omni
Tests,omni test} will emit the number of iterations run if the
@code{CONFIDENCE_ITERATION} output selector is included in the
@ref{Omni Output Selection,output selection}.
If the @option{-I} option is specified and the @option{-i} option
omitted, the maximum number of iterations is set to 10 and the minimum
to 3.
Output of a warning upon not hitting the desired confidence intervals
follows the description provided for the @option{-I} option.
The total test time will be somewhere between the minimum and maximum
number of iterations multiplied by the test length supplied by the
@option{-l} option.
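As a worked illustration of those bounds (``remotehost'' is a
placeholder name, and the figures ignore any setup time between
iterations):
@example
netperf -H remotehost -i 10,3 -I 99,5 -l 10
# minimum:  3 iterations * 10 seconds =  30 seconds
# maximum: 10 iterations * 10 seconds = 100 seconds
@end example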
@vindex -j, Global
@item -j
This option instructs netperf to keep additional timing statistics
when explicitly running an @ref{The Omni Tests,omni test}. These can
be output when the test-specific @option{-o}, @option{-O} or
@option{-k} @ref{Omni Output Selectors,output selectors} include one
or more of:
@itemize
@item MIN_LATENCY
@item MAX_LATENCY
@item P50_LATENCY
@item P90_LATENCY
@item P99_LATENCY
@item MEAN_LATENCY
@item STDDEV_LATENCY
@end itemize
These statistics will be based on an expanded (100 buckets per row
rather than 10) histogram of times rather than a terribly long list of
individual times. As such, there will be some slight error thanks to
the bucketing. However, the reduction in storage and processing
overheads is well worth it. When running a request/response test, one
might get some idea of the error by comparing the @ref{Omni Output
Selectors,@code{MEAN_LATENCY}} calculated from the histogram with the
@code{RT_LATENCY} calculated from the number of request/response
transactions and the test run time.
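A sketch of such a comparison, assuming a remote system named
``remotehost'' and assuming the omni test-specific @option{-d} option
accepts @samp{rr} to select a request/response test:
@example
netperf -H remotehost -t omni -j -- -d rr \
    -k MIN_LATENCY,MEAN_LATENCY,P99_LATENCY,MAX_LATENCY,RT_LATENCY
@end example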