Performance data for Paravirtops+VMI enabled kernels
Krishna Raj Raja
krraja at vmware.com
Fri Feb 9 02:50:08 GMT 2007
Hi,
I work for the performance group at VMware. I have been collecting some
performance numbers on the 2.6.20-rc6 kernel that I would like to share with
you.
These are numbers comparing the paravirtops+VMI-enabled kernel with the
non-paravirtops kernel running on native hardware.
There is a negligible performance difference between the
paravirtops+VMI-enabled kernel and the non-paravirtops kernel on a variety of
benchmarks, as shown below.
In many cases, the paravirtops+VMI-enabled kernel performs marginally better.
This is either experimental noise or favorable cache disturbances at the
micro-architectural level.
I had trouble installing Feisty herd-2 on my machines, so one set of
benchmark results was obtained on the Ubuntu Edgy i386 server distro. The
3D benchmark results below, however, were obtained on Feisty herd-3 (the
installer issue that I was having seems to have been fixed in this
release).
Please feel free to mail me if you have any questions.
Thanks
-Krishna
Ubuntu Feisty Performance
Test Setup
Hardware: AMD Dual Core Opteron 270 (4 cores), 2.0 GHz, 1M L2, 64K L1, 5G RAM
(only 4G used by the non-PAE kernel)
Distro: Ubuntu 6.10 Edgy Server i386
Kernel:
* PVOPS+VMI: 2.6.20-rc6-ubuntu (testing at 432106): CONFIG_PARAVIRT=y,
CONFIG_VMI=y
* non-paravirtops: 2.6.20-5.7-generic (clean at 432106):
config-2.6.20-5.7-generic
* ubuntu wireless and flash memory drivers failed to compile and were
disabled in the config
Native Performance: Pvops+VMI vs non-paravirtops Kernel
Macro Benchmarks
Each benchmark was run for 3 iterations; the score is the average of the last
2 iterations, with the 1st run discarded. Scores are in seconds (lower is
better). Scores within brackets are throughput numbers (higher is better).
Workload                  non-paravirtops      Pvops + VMI           non-paravirtops/VMI
SpecJBB                   730.62 (score:9606)  730.98 (score:10424)  0.99 (1.08)
reAIM (fserver)+          55.52 (score:2427)   55.80 (score:2415)    0.995 (0.995)
Kernel Compile (1P, -j1)  380.45               391.425               0.972
Kernel Compile (1P, -j2)  391.60               377.83                1.036
Kernel Compile (2P, -j2)  204.075              204.555               0.997
Kernel Compile (2P, -j8)  201.425              201.63                0.999
Ogg Encoding*             212.68               213.12                0.998
SpecJBB is a throughput-based benchmark (fixed run time), so the numbers
inside the brackets are the ones that should be considered.
* wave file of track length 48m 24s at 256kbps bitrate, 5 iterations, 1st
iteration discarded
+ fserver workload has been customized to run on tmpfs instead of a regular fs
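As a rough illustration of the scoring rule described for the macro
benchmarks (discard the 1st run, average the rest) and of how the run-time
ratio column can be reproduced, here is a minimal Python sketch. The function
names and the raw per-run times are hypothetical and are not part of the
original measurements; only the two Kernel Compile (1P, -j2) scores are taken
from the table.

```python
from statistics import mean


def score(run_times):
    """Average of every run after the first (discarded) run, in seconds."""
    if len(run_times) < 2:
        raise ValueError("need at least two runs to discard the first one")
    return mean(run_times[1:])


def time_ratio(non_paravirtops_secs, vmi_secs):
    """non-paravirtops / VMI for run times; above 1.0 means VMI was faster."""
    return non_paravirtops_secs / vmi_secs


# Hypothetical raw run times for one workload (not from the original post).
example_runs = [10.0, 2.0, 4.0]
print(score(example_runs))

# Scores reported in the table for Kernel Compile (1P, -j2).
print(round(time_ratio(391.60, 377.83), 3))
```

Note that for the bracketed throughput numbers the table appears to divide
the other way round (VMI over non-paravirtops), so that in both cases a value
above 1.0 favours the paravirtops+VMI kernel.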
CPU Microbenchmarks
maxcpus=1
SPECPU 2006, gcc 4.1.2. Scores are run time in secs (lower scores are
better); scores within brackets are the estimated base ratio (higher ratios
are better).
Workload    non-paravirtops  Pvops + VMI   non-paravirtops/VMI
perlbench   948 (10.3)       949 (10.3)    0.998
bzip2       1550 (6.22)      1560 (6.2)    0.993
gcc         913 (8.82)       913 (8.82)    1.000
mcf         1020 (8.91)      1030 (8.90)   0.990
gobmk       991 (10.6)       992 (10.6)    0.999
hmmer       1670 (5.58)      1670 (5.58)   1.000
sjeng       1320 (9.17)      *1320 (9.19)  1.000
libquantum  2200 (9.41)      2200 (9.41)   1.000
h264ref     1980 (11.2)      1980 (11.2)   1.000
omnetpp     859 (7.28)       860 (7.27)    0.998
astar       1140 (6.18)      1140 (6.18)   1.000
xalancbmk   1100 (6.26)      1100 (6.26)   1.000
I/O Benchmarks
Client machine: Dell 1600C, P4 Xeon 2 x 2.4 GHz, 1G RAM, Win2K Professional SP4
Setup: The client machine is connected to the server using a crossover cable
at Gigabit link speed.
maxcpus=1
Netperf
default: netperf -H $1 -l 60 -t TCP_STREAM -- -m 8192 -M 8192 -s 4096 -S 8192
tuned: netperf -H $1 -l 60 -t TCP_STREAM -- -m 8192 -M 8192 -s 32768 -S 65536
Scores are throughput in Mbps, average of 4 runs, higher scores are better
Workload       non-paravirtops  Pvops + VMI  VMI/non-paravirtops
send, default  418.10           416.16       0.995
send, tuned    932.42           926.33       0.993
recv, default  227.40           226.555      0.996
recv, tuned    945.11           944.78       0.999
IOMeter
IOMeter 2006_07_27-RC3, 1 worker thread
Scores are throughput numbers in IOPS, higher scores are better. Numbers
inside the brackets are throughput in MB/s
Workload              non-paravirtops   Pvops + VMI       VMI/non-paravirtops
4K sequential read    11823.98 (46.18)  12345.57 (48.22)  1.044
16K sequential read   4571.101 (71.43)  4576.95 (71.51)   1.001
32K sequential read   2331.338 (72.85)  2333.80 (72.93)   1.001
4K sequential write   11480.75 (44.84)  11842.80 (46.26)  1.031
16K sequential write  4008.865 (62.64)  4011.06 (62.67)   1.000
32K sequential write  1942.484 (60.70)  1946.50 (60.82)   1.002
4K random read        252.96 (0.98)     252.076 (0.98)    0.996
16K random read       242.577 (3.79)    242.098 (3.78)    0.998
32K random read       229.32 (7.16)     228.90 (7.15)     0.998
4K random write       841.58 (3.28)     842.54 (3.291)    1.001
16K random write      552.13 (8.62)     545.06 (8.51)     0.987
32K random write      626.61 (19.58)    626.41 (19.57)    0.999
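The bracketed throughput figures in the IOMeter results appear to follow
directly from the IOPS score and the block size. A minimal Python sketch of
that conversion is shown here, assuming 1K = 1024 bytes and 1M = 1024K; the
function name is hypothetical and the result may differ from the reported
figure in the last digit because of rounding.

```python
def iops_to_mb_per_sec(iops, block_size_k):
    """Convert an IOPS score to throughput in megabytes per second.

    Assumes block_size_k is in units of 1024 bytes and that one megabyte
    is 1024 of those units (an assumption, not stated in the post).
    """
    return iops * block_size_k / 1024.0


# 4K sequential read, non-paravirtops kernel: 11823.98 IOPS in the table,
# which is reported there with a bracketed throughput of 46.18.
print(round(iops_to_mb_per_sec(11823.98, 4), 2))
```

The same check can be applied to any other row to spot transcription slips
in the bracketed values.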
3d Benchmarks
System: Dell Precision 390, Intel Core 2 Duo 6400 @ 2.13 GHz, 2G RAM, Nvidia
Quadro NVS 285
Distro: Ubuntu Feisty Herd-3 Desktop i386, Nvidia Driver build 1.0-9746
Kernel:
* non-paravirtops: 2.6.20-rc6-ubuntu1 (clean at 432106): CONFIG_VMI=n,
CONFIG_PARAVIRT=n
* Ubuntu-generic: 2.6.20-6-generic: stock Feisty-herd-3 kernel
* VMI: 2.6.20-rc6-ubuntu (testing at 432106): CONFIG_PARAVIRT=y,
CONFIG_VMI=y
Test: SPECviewperf-9.0.3
Scores are in Frames Per Sec (FPS), higher scores are better
Workload    non-paravirtops  Ubuntu-generic  VMI     VMI/Ubuntu-generic  VMI/non-paravirtops
3dsmax-04   6.882            6.876           6.930   1.007               1.006
catia-02    8.860            8.779           8.843   1.007               0.998
ensight-03  4.648            4.637           4.638   1.000               0.997
light-08    9.047            8.849           9.052   1.022               1.000
maya-02     15.48            15.70           15.47   0.985               0.999
proe-04     7.139            7.286           7.122   0.977               0.997
sw-01       8.948            8.916           8.940   1.002               0.999
ugnx-01     1.085            1.083           1.083   1.000               0.998
tcvis-01    1.600            1.600           1.605   1.003               1.003
The Nvidia Quadro NVS is a business-desktop-class graphics card, so scores
are low.
The 2.6.20-6-generic stock herd-3 kernel has PVOPS/VMI enabled by default.
With Nvidia Geforce 7600GT:
Scores are in Frames Per Sec (FPS), Higher scores are better
Workload    non-paravirtops  PVOPS+VMI  VMI/non-paravirtops
3dsmax-04   10.77            10.67      0.990
catia-02    11.49            11.53      1.003
ensight-03  9.060            9.040      0.997
light-08    10.01            10.09      1.007
maya-02     28.64            28.64      1.000
proe-04     9.593            9.694      1.010
sw-01       17.29            17.34      1.002
ugnx-01     3.066            3.065      0.999
tcvis-01    3.918            3.918      1.000