AMCC suggests setting the PMU bit to 0 for best performance on the
PPC440 DDR controller. The common 440 DDR setup files (sdram.c and
spd_sdram.c) have been changed accordingly, so all 440 boards using
these setup routines automatically receive this performance
increase.
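The change amounts to a read-modify-write that clears a single bit of
SDRAM0_CFG0. The sketch below models that operation in Python purely
for illustration; the actual change lives in the C setup files. The
mask value and the helper name are hypothetical placeholders, not taken
from U-Boot or the PPC440 manual, so look up the real PMU bit position
before relying on it.

```python
# Placeholder only: the real PMU bit position must come from the PPC440
# documentation or the U-Boot headers. This value is an assumption.
PMU_MASK_PLACEHOLDER = 0x00040000


def clear_pmu(cfg0: int) -> int:
    """Return a 32-bit SDRAM0_CFG0 value with the (placeholder) PMU bit cleared."""
    return cfg0 & ~PMU_MASK_PLACEHOLDER & 0xFFFFFFFF


# Made-up register value that has the placeholder bit set.
old_cfg0 = 0x80040000
new_cfg0 = clear_pmu(old_cfg0)
print(f"SDRAM0_CFG0: {old_cfg0:#010x} -> {new_cfg0:#010x}")
```

All other bits are left untouched, and clearing a bit that is already
0 leaves the value unchanged.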

The following benchmarks, run by AMCC, demonstrate the performance
change:


----------------------------------------
SDRAM0_CFG0[PMU] = 1 (U-Boot default for Bamboo, Yosemite and Yellowstone)
----------------------------------------
Stream benchmark results
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 112345 microseconds.
   (= 112345 clock ticks)
Increase the size of the arrays if this shows that you are not getting
at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the precision of your system
timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:         256.7683       0.1248       0.1246       0.1250
Scale:        246.0157       0.1302       0.1301       0.1302
Add:          255.0316       0.1883       0.1882       0.1885
Triad:        253.1245       0.1897       0.1896       0.1899


TTCP Benchmark Results
ttcp-t: socket
ttcp-t: connect
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5000  tcp  ->
localhost
ttcp-t: 16777216 bytes in 0.28 real seconds = 454.29 Mbit/sec +++
ttcp-t: 2048 I/O calls, msec/call = 0.14, calls/sec = 7268.57
ttcp-t: 0.0user 0.1sys 0:00real 60% 0i+0d 0maxrss 0+2pf 3+1506csw

----------------------------------------
SDRAM0_CFG0[PMU] = 0 (Suggested modification)
Setting PMU = 0 provides a noticeable performance improvement:
* 2% to 5% improvement in memory performance.
* Improves the Mbit/sec for the TTCP benchmark by about 77%
  (454.29 to 804.06 Mbit/sec).
----------------------------------------
Stream benchmark results
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 120066 microseconds.
   (= 120066 clock ticks)
Increase the size of the arrays if this shows that you are not getting
at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the precision of your system
timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:         262.5167       0.1221       0.1219       0.1223
Scale:        258.4856       0.1238       0.1238       0.1240
Add:          262.5404       0.1829       0.1828       0.1831
Triad:        266.8594       0.1800       0.1799       0.1802

TTCP Benchmark Results
ttcp-t: socket
ttcp-t: connect
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5000  tcp  ->
localhost
ttcp-t: 16777216 bytes in 0.16 real seconds = 804.06 Mbit/sec +++
ttcp-t: 2048 I/O calls, msec/call = 0.08, calls/sec = 12864.89
ttcp-t: 0.0user 0.0sys 0:00real 46% 0i+0d 0maxrss 0+2pf 120+1csw
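
The improvement figures quoted for PMU = 0 can be re-derived from the
two result sets. The sketch below is not part of the AMCC measurements;
it only recomputes relative gains from the rates printed in this
document. The variable and function names are illustrative.

```python
# Rates copied from the Stream (MB/s) and TTCP (Mbit/sec) results above.
stream_pmu1 = {"Copy": 256.7683, "Scale": 246.0157, "Add": 255.0316, "Triad": 253.1245}
stream_pmu0 = {"Copy": 262.5167, "Scale": 258.4856, "Add": 262.5404, "Triad": 266.8594}
ttcp_pmu1 = 454.29
ttcp_pmu0 = 804.06


def gain_percent(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


stream_gains = {
    name: gain_percent(stream_pmu1[name], stream_pmu0[name]) for name in stream_pmu1
}
ttcp_gain = gain_percent(ttcp_pmu1, ttcp_pmu0)

for name, gain in stream_gains.items():
    print(f"{name:<6} {gain:5.1f} %")
print(f"TTCP   {ttcp_gain:5.1f} %")
```

With these inputs the Stream gains fall between roughly 2.2% (Copy) and
5.4% (Triad), and the TTCP gain is roughly 77%.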


2006-07-28, Stefan Roese <sr@denx.de>