PPC440 DDR setup: Set SDRAM0_CFG0[PMU]=0 for best performance
AMCC suggested setting the PMU bit to 0 for best performance on the PPC440 DDR controller. Please see doc/README.440-DDR-performance for details. Patch by Stefan Roese, 28 Jul 2006
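As a rough illustration of what "setting SDRAM0_CFG0[PMU] to 0" means at the register level, the sketch below clears a single bit in a 32-bit configuration value. It is not the code from this commit: the mask name `EXAMPLE_SDRAM_CFG0_PMU` and its value are assumptions made for illustration, and the helper names are hypothetical. The real bit position should be taken from the PPC440 user manual or the U-Boot headers, and real setup code would read and write the register through the controller's register accessors, which are not shown here.

```c
#include <stdint.h>

/*
 * ASSUMPTION: placeholder mask for the PMU bit of SDRAM0_CFG0.
 * The value is illustrative only; check the PPC440 user manual or the
 * U-Boot headers for the real bit position.
 */
#define EXAMPLE_SDRAM_CFG0_PMU	0x04000000u

/* Return the given CFG0 value with the (assumed) PMU bit cleared. */
uint32_t cfg0_clear_pmu(uint32_t cfg0)
{
	return cfg0 & ~EXAMPLE_SDRAM_CFG0_PMU;
}

/* Return nonzero if the (assumed) PMU bit is set in cfg0. */
int cfg0_pmu_is_set(uint32_t cfg0)
{
	return (cfg0 & EXAMPLE_SDRAM_CFG0_PMU) != 0;
}
```

Masking with the complement leaves every other configuration bit untouched, which matters because the same register also carries unrelated controller settings.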
Showing 6 changed files with 103 additions and 7 deletions.
@@ -0,0 +1,90 @@
AMCC suggested setting the PMU bit to 0 for best performance on the
PPC440 DDR controller. The common 440 DDR setup files (sdram.c &
spd_sdram.c) are changed accordingly, so all 440 boards using
these setup routines will automatically receive this performance
increase.

The following benchmarks, run by AMCC, demonstrate this performance
change:


----------------------------------------
SDRAM0_CFG0[PMU] = 1 (U-Boot default for Bamboo, Yosemite and Yellowstone)
----------------------------------------
Stream benchmark results
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 112345 microseconds.
(= 112345 clock ticks)
Increase the size of the arrays if this shows that you are not getting
at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the precision of your system
timer.
-------------------------------------------------------------
Function     Rate (MB/s)   RMS time   Min time   Max time
Copy:          256.7683     0.1248     0.1246     0.1250
Scale:         246.0157     0.1302     0.1301     0.1302
Add:           255.0316     0.1883     0.1882     0.1885
Triad:         253.1245     0.1897     0.1896     0.1899
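
The Stream figures above can be sanity-checked by hand. The sketch below assumes the usual STREAM accounting: Copy and Scale move two 8-byte words per array element, Add and Triad move three, rates are reported in units of 10^6 bytes per second, and the memory total covers three arrays in units of 2^20 bytes. The function names are hypothetical helpers, not part of the benchmark.

```c
#include <stddef.h>

/*
 * STREAM-style rate in MB/s (1 MB = 10^6 bytes):
 * bytes moved per array element, times element count, over elapsed seconds.
 */
double stream_rate_mbs(size_t bytes_per_element, size_t n, double seconds)
{
	return 1.0e-6 * (double)bytes_per_element * (double)n / seconds;
}

/*
 * Total memory for the three STREAM arrays, in MB (1 MB = 2^20 bytes),
 * assuming each array holds n words of bytes_per_word bytes.
 */
double stream_total_mb(size_t bytes_per_word, size_t n)
{
	return 3.0 * (double)bytes_per_word * ((double)n / 1048576.0);
}
```

For example, `stream_rate_mbs(16, 2000000, 0.1246)` gives about 256.8 MB/s for Copy and `stream_total_mb(8, 2000000)` gives about 45.8 MB. Small differences from the printed rates are expected because the printed minimum times are rounded to four decimal places.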


TTCP Benchmark Results
ttcp-t: socket
ttcp-t: connect
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5000 tcp -> localhost
ttcp-t: 16777216 bytes in 0.28 real seconds = 454.29 Mbit/sec +++
ttcp-t: 2048 I/O calls, msec/call = 0.14, calls/sec = 7268.57
ttcp-t: 0.0user 0.1sys 0:00real 60% 0i+0d 0maxrss 0+2pf 3+1506csw

----------------------------------------
SDRAM0_CFG0[PMU] = 0 (Suggested modification)
Setting PMU = 0 provides a noticeable performance improvement:
* 2% to 5% improvement in memory performance.
* Improves the Mbit/sec for the TTCP benchmark by almost 77%.
----------------------------------------
Stream benchmark results
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 120066 microseconds.
(= 120066 clock ticks)
Increase the size of the arrays if this shows that you are not getting
at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the precision of your system
timer.
-------------------------------------------------------------
Function     Rate (MB/s)   RMS time   Min time   Max time
Copy:          262.5167     0.1221     0.1219     0.1223
Scale:         258.4856     0.1238     0.1238     0.1240
Add:           262.5404     0.1829     0.1828     0.1831
Triad:         266.8594     0.1800     0.1799     0.1802

TTCP Benchmark Results
ttcp-t: socket
ttcp-t: connect
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5000 tcp -> localhost
ttcp-t: 16777216 bytes in 0.16 real seconds = 804.06 Mbit/sec +++
ttcp-t: 2048 I/O calls, msec/call = 0.08, calls/sec = 12864.89
ttcp-t: 0.0user 0.0sys 0:00real 46% 0i+0d 0maxrss 0+2pf 120+1csw


2006-07-28, Stefan Roese <sr@denx.de>
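
The improvement figures quoted in this README can be re-derived from the reported rates. The following is a minimal sketch in C. `percent_gain` and `print_gains` are hypothetical helper names, and the table simply copies the PMU = 1 and PMU = 0 figures from the two benchmark runs above.

```c
#include <stddef.h>
#include <stdio.h>

/* Relative improvement of `after` over `before`, in percent. */
double percent_gain(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Print the gain for each benchmark figure reported in this README. */
void print_gains(void)
{
	static const struct {
		const char *name;
		double pmu1;	/* rate with SDRAM0_CFG0[PMU] = 1 */
		double pmu0;	/* rate with SDRAM0_CFG0[PMU] = 0 */
	} rows[] = {
		{ "Stream Copy (MB/s)",  256.7683, 262.5167 },
		{ "Stream Scale (MB/s)", 246.0157, 258.4856 },
		{ "Stream Add (MB/s)",   255.0316, 262.5404 },
		{ "Stream Triad (MB/s)", 253.1245, 266.8594 },
		{ "TTCP (Mbit/sec)",     454.29,   804.06   },
	};
	size_t i;

	for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
		printf("%-20s %6.2f%%\n", rows[i].name,
		       percent_gain(rows[i].pmu1, rows[i].pmu0));
}
```

Calling `print_gains()` from a `main()` reports gains of roughly 2.2%, 5.1%, 2.9% and 5.4% for the four Stream kernels and about 77% for TTCP.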