The most frequent cause of problems when porting U-Boot to new
hardware, or when using a sloppy port on some board, is memory errors.
In most cases these are not caused by failing hardware, but by
incorrect initialization of the memory controller. So it is a good
idea to always test if the memory is working correctly before looking
for any other potential causes of problems.

U-Boot implements 3 different approaches to perform memory tests:

1. The get_ram_size() function (see "common/memsize.c").

   This function is supposed to be used in each and every U-Boot port
   to determine the presence and actual size of each of the potential
   memory banks on this piece of hardware. The code is supposed to be
   very fast, so running it for each reboot does not hurt. It is a
   little known and generally underrated fact that this code will also
   catch 99% of hardware-related (i.e. reliably reproducible) memory
   errors. It is strongly recommended to always use this function, in
   each and every port of U-Boot.

2. The "mtest" command.

   This is probably the best known memory test utility in U-Boot.
   Unfortunately, it is also the most problematic, and the most
   useless one.

   There are a number of serious problems with this command:

   - It is terribly slow. Running "mtest" on the whole system RAM
     takes a _long_ time before there is any significance in the fact
     that no errors have been found so far.

   - It is difficult to configure, and to use. And any errors here
     will reliably crash or hang your system. "mtest" is dumb and has
     no knowledge about memory ranges that may be in use for other
     purposes, like exception code, U-Boot code and data, stack,
     malloc arena, video buffer, log buffer, etc. If you let it, it
     will happily "test" all such areas, which of course will cause
     some problems.

   - Because it is not easy to configure and use, a large number of
     systems are seriously misconfigured. The original idea was to
     test basically the whole system RAM, exempting only the areas
     used by U-Boot itself - on most systems these are the areas
     used for the exception vectors (usually at the very lower end of
     system memory) and for U-Boot (code, data, etc. - see above;
     these are usually at the very upper end of system memory). But
     experience has shown that a very large number of ports use
     pretty much bogus settings of CONFIG_SYS_MEMTEST_START and
     CONFIG_SYS_MEMTEST_END; this results in useless tests (because
     the range is too small and/or badly located) or in critical
     failures (system crashes).

   Because of these issues, the "mtest" command is considered
   deprecated. It should not be enabled in most normal ports of
   U-Boot, especially not in production. If you really need a memory
   test, then see 1. above or 3. below.

3. The most thorough memory test facility is available as part of the
   POST (Power-On Self Test) sub-system, see "post/drivers/memory.c".

   If you really need to perform memory tests (for example, because
   it is a mandatory part of your requirement specification), then
   enable this test, which is generic and should work on all
   architectures.

WARNING:

It should be pointed out that _all_ these memory tests have one
fundamental, unfixable design flaw: they are based on the assumption
that memory errors can be found by writing to and reading from memory.
Unfortunately, this is only true for the relatively harmless, usually
static errors like shorts between data or address lines, unconnected
pins, etc. All the really nasty errors which will first turn your
hair gray, only to make you tear it out later, are dynamic errors,
which usually happen not with simple read or write cycles on the bus,
but when performing back-to-back data transfers in burst mode. Such
accesses usually happen only for certain DMA operations, or for heavy
cache use (instruction fetching, cache flushing). So far I am not
aware of any freely available code that implements a generic, and
efficient, memory test like that. The best known test case to stress
a system like that is to boot Linux with root file system mounted over
NFS, and then build some larger software package natively (say,
compile a Linux kernel on the system) - this will cause enough context
switches, network traffic (and thus DMA transfers from the network
controller), varying RAM use, etc. to trigger any weak spots in this
area.

Note: An attempt was made once to implement such a test to catch
memory problems on a specific board. The code is pretty much board
specific (for example, it includes setting specific GPIO signals to
provide triggers for an attached logic analyzer), but you can get an
idea of how it works: see "examples/standalone/test_burst*".

Note 2: Ironically enough, the "test_burst" did not catch any RAM
errors, not a single one ever. The problems this code was supposed
to catch did not happen when accessing the RAM, but when reading from
NOR flash.
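
Example: To illustrate why a very fast size probe as described in 1.
above can also catch static errors such as unconnected or stuck
address lines, here is a minimal sketch that runs on a host PC against
a simulated memory bank. It is NOT the code from "common/memsize.c";
all names in it (sim_configure, sim_write, sim_read, probe_words) are
hypothetical and exist only for this illustration. The sketch is
destructive; code running on real hardware would also want to preserve
the memory contents it overwrites.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIM_MAX_WORDS 1024u	/* largest bank the simulation supports */

/* Simulated bank (assumption for this sketch, not real hardware). */
static uint32_t sim_cells[SIM_MAX_WORDS];
static size_t sim_real_words = SIM_MAX_WORDS;	/* populated words, power of 2 */
static size_t sim_stuck_low;			/* address bits stuck at 0 */

/* Select how much memory is "really" there and which address bits are dead. */
static void sim_configure(size_t real_words, size_t stuck_low_mask)
{
	assert(real_words && real_words <= SIM_MAX_WORDS);
	assert((real_words & (real_words - 1)) == 0);
	sim_real_words = real_words;
	sim_stuck_low = stuck_low_mask;
	memset(sim_cells, 0, sizeof(sim_cells));
}

/* Map a bus address to a cell: missing upper lines alias, stuck lines read 0. */
static size_t sim_decode(size_t idx)
{
	return (idx & (sim_real_words - 1)) & ~sim_stuck_low;
}

static void sim_write(size_t idx, uint32_t val)
{
	sim_cells[sim_decode(idx)] = val;
}

static uint32_t sim_read(size_t idx)
{
	return sim_cells[sim_decode(idx)];
}

/*
 * Probe a bank of at most max_words words (max_words must be a power of 2).
 * Returns the number of words that respond without aliasing, or 0 if even
 * the first word cannot be written and read back.
 */
static size_t probe_words(size_t max_words)
{
	size_t cnt;

	assert(max_words && (max_words & (max_words - 1)) == 0);

	/* Write a distinct marker at each power-of-two offset, highest first. */
	for (cnt = max_words >> 1; cnt > 0; cnt >>= 1)
		sim_write(cnt, ~(uint32_t)cnt);

	/* Offset 0 is written last, so any alias of it loses its marker. */
	sim_write(0, 0);
	if (sim_read(0) != 0)
		return 0;

	/* The first offset whose marker is gone is where the bank ends. */
	for (cnt = 1; cnt < max_words; cnt <<= 1)
		if (sim_read(cnt) != ~(uint32_t)cnt)
			return cnt;

	return max_words;
}
```

With sim_configure(256, 0), probe_words(1024) reports 256 words,
because offset 256 aliases to offset 0. With sim_configure(1024, 8),
i.e. a fully populated bank where address bit 3 is stuck low, it
reports only 8 words; such an implausibly small size is the hint that
something is wrong with the address lines or the memory controller
setup. Note that this only needs a handful of accesses per bank, and
that it shares the limitation described in the WARNING above: it says
nothing about dynamic errors in burst mode.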