.. SPDX-License-Identifier: GPL-2.0

=======================
STM32 DMA-MDMA chaining
=======================


Introduction
------------

  This document describes the STM32 DMA-MDMA chaining feature. But before going
  further, let's introduce the peripherals involved.

  To offload data transfers from the CPU, STM32 microprocessors (MPUs) embed
  direct memory access controllers (DMA).

  STM32MP1 SoCs embed both STM32 DMA and STM32 MDMA controllers. STM32 DMA
  request routing capabilities are enhanced by a DMA request multiplexer
  (STM32 DMAMUX).

  **STM32 DMAMUX**

  STM32 DMAMUX routes any DMA request from a given peripheral to any channel
  of any STM32 DMA controller (STM32MP1 has two STM32 DMA controllers).

  **STM32 DMA**

  STM32 DMA is mainly used to implement central data buffer storage (usually in
  the system SRAM) for different peripherals. It can access external RAMs, but
  it cannot generate the burst transfers that would make the best use of the
  AXI bus.

  **STM32 MDMA**

  STM32 MDMA (Master DMA) is mainly used to manage direct data transfers between
  RAM data buffers without CPU intervention. It can also be used in a
  hierarchical structure that uses STM32 DMA as first level data buffer
  interfaces for AHB peripherals, while the STM32 MDMA acts as a second level
  DMA with better performance. As an AXI/AHB master, STM32 MDMA can take control
  of the AXI/AHB bus.


Principles
----------

  The STM32 DMA-MDMA chaining feature relies on the strengths of the STM32 DMA
  and STM32 MDMA controllers.

  STM32 DMA has a circular Double Buffer Mode (DBM). At each end of transaction
  (when the DMA data counter, DMA_SxNDTR, reaches 0), the memory pointers
  (configured with DMA_SxM0AR and DMA_SxM1AR) are swapped and the DMA data
  counter is automatically reloaded. This allows the SW or the STM32 MDMA to
  process one memory area while the second memory area is being filled/used by
  the STM32 DMA transfer.

  With STM32 MDMA linked-list mode, a single request initiates the data array
  (collection of nodes) to be transferred until the linked-list pointer for the
  channel is null.
  The channel transfer complete of the last node is the end of the transfer,
  unless the first and last nodes are linked to each other. In that case, the
  linked list loops to create a circular MDMA transfer.

  STM32 MDMA has direct connections with STM32 DMA. This enables autonomous
  communication and synchronization between peripherals, thus saving CPU
  resources and reducing bus congestion. The Transfer Complete signal of an
  STM32 DMA channel can trigger an STM32 MDMA transfer. STM32 MDMA can clear
  the request generated by the STM32 DMA by writing to its Interrupt Clear
  register (whose address is stored in MDMA_CxMAR, and bit mask in MDMA_CxMDR).

  .. table:: STM32 MDMA interconnect table with STM32 DMA

    +--------------+----------------+-----------+------------+
    | STM32 DMAMUX | STM32 DMA      | STM32 DMA | STM32 MDMA |
    | channels     | channels       | Transfer  | request    |
    |              |                | complete  |            |
    |              |                | signal    |            |
    +==============+================+===========+============+
    | Channel *0*  | DMA1 channel 0 | dma1_tcf0 | *0x00*     |
    +--------------+----------------+-----------+------------+
    | Channel *1*  | DMA1 channel 1 | dma1_tcf1 | *0x01*     |
    +--------------+----------------+-----------+------------+
    | Channel *2*  | DMA1 channel 2 | dma1_tcf2 | *0x02*     |
    +--------------+----------------+-----------+------------+
    | Channel *3*  | DMA1 channel 3 | dma1_tcf3 | *0x03*     |
    +--------------+----------------+-----------+------------+
    | Channel *4*  | DMA1 channel 4 | dma1_tcf4 | *0x04*     |
    +--------------+----------------+-----------+------------+
    | Channel *5*  | DMA1 channel 5 | dma1_tcf5 | *0x05*     |
    +--------------+----------------+-----------+------------+
    | Channel *6*  | DMA1 channel 6 | dma1_tcf6 | *0x06*     |
    +--------------+----------------+-----------+------------+
    | Channel *7*  | DMA1 channel 7 | dma1_tcf7 | *0x07*     |
    +--------------+----------------+-----------+------------+
    | Channel *8*  | DMA2 channel 0 | dma2_tcf0 | *0x08*     |
    +--------------+----------------+-----------+------------+
    | Channel *9*  | DMA2 channel 1 | dma2_tcf1 | *0x09*     |
    +--------------+----------------+-----------+------------+
    | Channel *10* | DMA2 channel 2 | dma2_tcf2 | *0x0A*     |
    +--------------+----------------+-----------+------------+
    | Channel *11* | DMA2 channel 3 | dma2_tcf3 | *0x0B*     |
    +--------------+----------------+-----------+------------+
    | Channel *12* | DMA2 channel 4 | dma2_tcf4 | *0x0C*     |
    +--------------+----------------+-----------+------------+
    | Channel *13* | DMA2 channel 5 | dma2_tcf5 | *0x0D*     |
    +--------------+----------------+-----------+------------+
    | Channel *14* | DMA2 channel 6 | dma2_tcf6 | *0x0E*     |
    +--------------+----------------+-----------+------------+
    | Channel *15* | DMA2 channel 7 | dma2_tcf7 | *0x0F*     |
    +--------------+----------------+-----------+------------+

  The STM32 DMA-MDMA chaining feature then uses an SRAM buffer. STM32MP1 SoCs
  embed three fast access static internal RAMs of various sizes, used for data
  storage. Due to the STM32 DMA legacy (within microcontrollers), STM32 DMA
  performance is poor with DDR, while it is optimal with SRAM. Hence the SRAM
  buffer used between STM32 DMA and STM32 MDMA. This buffer is split into two
  equal periods: STM32 DMA uses one period while STM32 MDMA uses the other
  period simultaneously.
  ::

                          dma[1:2]-tcf[0:7]
                         .----------------.
           ____________ '    _________     V____________
          | STM32 DMA  |    /  __|>_  \    | STM32 MDMA |
          |------------|   |  /     \  |   |------------|
          | DMA_SxM0AR |<=>| | SRAM  | |<=>| []-[]...[] |
          | DMA_SxM1AR |   |  \_____/  |   |            |
          |____________|    \___<|____/    |____________|

  STM32 DMA-MDMA chaining uses (struct dma_slave_config).peripheral_config to
  exchange the parameters needed to configure MDMA.
  These parameters are gathered into a u32 array with three values:

  * the STM32 MDMA request (which is actually the DMAMUX channel ID),
  * the address of the STM32 DMA register to clear the Transfer Complete
    interrupt flag,
  * the mask of the Transfer Complete interrupt flag of the STM32 DMA channel.

Device Tree updates for STM32 DMA-MDMA chaining support
-------------------------------------------------------

  **1. Allocate an SRAM buffer**

  The SRAM device tree node is defined in the SoC device tree. You can refer to
  it in your board device tree to define your SRAM pool.
  ::

          &sram {
                  my_foo_device_dma_pool: dma-sram@0 {
                          reg = <0x0 0x1000>;
                  };
          };

  Be careful with the start index, in case there are other SRAM consumers.
  Define your pool size strategically: to optimise chaining, the idea is that
  STM32 DMA and STM32 MDMA can work simultaneously, each on one buffer of the
  SRAM.
  If the SRAM period is greater than the expected DMA transfer, then STM32 DMA
  and STM32 MDMA will work sequentially instead of simultaneously. It is not a
  functional issue but it is not optimal.
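
  The sizing rule above can be illustrated with a little arithmetic. The
  following is only a userspace sketch, not driver code: the helper names
  (sram_period_len(), periods_needed(), chaining_overlaps()) are hypothetical
  and exist purely for this illustration. It assumes the pool is split in two
  equal periods, as described earlier, and that the two controllers can only
  overlap when a transfer spans more than one period.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: the SRAM pool is split in two equal periods. */
static size_t sram_period_len(size_t pool_size)
{
        return pool_size / 2;
}

/* Periods needed for a transfer (same rounding as DIV_ROUND_UP()). */
static size_t periods_needed(size_t xfer_len, size_t period)
{
        assert(period > 0);
        return (xfer_len + period - 1) / period;
}

/*
 * Assumption for this sketch: STM32 DMA and STM32 MDMA can work
 * simultaneously only if the transfer spans more than one period.
 * Otherwise they run one after the other.
 */
static bool chaining_overlaps(size_t xfer_len, size_t pool_size)
{
        return periods_needed(xfer_len, sram_period_len(pool_size)) > 1;
}
```

  With the 0x1000-byte pool of the device tree example, the period is 0x800
  bytes. Under these assumptions, a 0x2000-byte transfer spans four periods, so
  both controllers can overlap, whereas a 0x400-byte transfer fits in a single
  period and is handled sequentially.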

  Don't forget to refer to your SRAM pool in your device node. You need to
  define a new property.
  ::

          &my_foo_device {
                  ...
                  my_dma_pool = &my_foo_device_dma_pool;
          };

  Then get this SRAM pool in your foo driver and allocate your SRAM buffer.

  **2. Allocate an STM32 DMA channel and an STM32 MDMA channel**

  You need to define an extra channel in your device tree node, in addition to
  the one you should already have for "classic" DMA operation.

  This new channel must be taken from the STM32 MDMA channels, so the phandle of
  the DMA controller to use is that of the MDMA controller.
  ::

          &my_foo_device {
                  [...]
                  my_dma_pool = &my_foo_device_dma_pool;
                  dmas = <&dmamux1 ...>,                // STM32 DMA channel
                         <&mdma1 0 0x3 0x1200000a 0 0>; // + STM32 MDMA channel
          };

  Concerning STM32 MDMA bindings:

  1. The request line number: whatever the value here, it will be overwritten
     by the MDMA driver with the STM32 DMAMUX channel ID passed through
     (struct dma_slave_config).peripheral_config

  2. The priority level: choose Very High (0x3) so that your channel takes
     priority over the others during request arbitration

  3. A 32-bit mask specifying the DMA channel configuration: source and
     destination address increment, block transfer with 128 bytes per single
     transfer

  4. The 32-bit value specifying the register to be used to acknowledge the
     request: it will be overwritten by the MDMA driver with the DMA channel
     interrupt flag clear register address passed through
     (struct dma_slave_config).peripheral_config

  5. The 32-bit mask specifying the value to be written to acknowledge the
     request: it will be overwritten by the MDMA driver with the DMA channel
     Transfer Complete flag passed through
     (struct dma_slave_config).peripheral_config

Driver updates for STM32 DMA-MDMA chaining support in foo driver
----------------------------------------------------------------

  **0. (optional) Refactor the original sg_table if dmaengine_prep_slave_sg()**

  In case of dmaengine_prep_slave_sg(), the original sg_table can't be used as
  is. Two new sg_tables must be created from the original one.
  One for the STM32 DMA transfer (where the memory address now targets the SRAM
  buffer instead of the DDR buffer) and one for the STM32 MDMA transfer (where
  the memory address targets the DDR buffer).

  The new sg_list items must fit the SRAM period length. Here is an example for
  DMA_DEV_TO_MEM:
  ::

          /*
           * Assuming sgl and nents, respectively the initial scatterlist and its
           * length.
           * Assuming sram_dma_buf and sram_period, respectively the memory
           * allocated from the pool for DMA usage, and the length of the period,
           * which is half of the sram_dma_buf size.
           */
          struct sg_table new_dma_sgt, new_mdma_sgt;
          struct scatterlist *s, *_sgl;
          dma_addr_t ddr_dma_buf;
          u32 new_nents = 0, len;
          int i, ret;

          /* Count the number of entries needed */
          for_each_sg(sgl, s, nents, i)
                  if (sg_dma_len(s) > sram_period)
                          new_nents += DIV_ROUND_UP(sg_dma_len(s), sram_period);
                  else
                          new_nents++;

          /* Create sg table for STM32 DMA channel */
          ret = sg_alloc_table(&new_dma_sgt, new_nents, GFP_ATOMIC);
          if (ret) {
                  dev_err(dev, "DMA sg table alloc failed\n");
                  return ret;
          }

          /* Walk the original list so that each item gets its own chunk length */
          _sgl = sgl;
          len = sg_dma_len(_sgl);
          for_each_sg(new_dma_sgt.sgl, s, new_dma_sgt.nents, i) {
                  size_t bytes = min_t(size_t, len, sram_period);

                  sg_dma_len(s) = bytes;
                  /* Targets the beginning = first half of the sram_dma_buf */
                  s->dma_address = sram_dma_buf;
                  /*
                   * Targets the second half of the sram_dma_buf
                   * for odd indexes of the item of the sg_list
                   */
                  if (i & 1)
                          s->dma_address += sram_period;
                  len -= bytes;

                  if (!len && sg_next(_sgl)) {
                          _sgl = sg_next(_sgl);
                          len = sg_dma_len(_sgl);
                  }
          }

          /* Create sg table for STM32 MDMA channel */
          ret = sg_alloc_table(&new_mdma_sgt, new_nents, GFP_ATOMIC);
          if (ret) {
                  dev_err(dev, "MDMA sg_table alloc failed\n");
                  sg_free_table(&new_dma_sgt);
                  return ret;
          }

          _sgl = sgl;
          len = sg_dma_len(sgl);
          ddr_dma_buf = sg_dma_address(sgl);
          for_each_sg(new_mdma_sgt.sgl, s, new_mdma_sgt.nents, i) {
                  size_t bytes = min_t(size_t, len, sram_period);

                  sg_dma_len(s) = bytes;
                  sg_dma_address(s) = ddr_dma_buf;
                  len -= bytes;

                  if (!len && sg_next(_sgl)) {
                          /* Current DDR chunk is consumed, move to the next one */
                          _sgl = sg_next(_sgl);
                          len = sg_dma_len(_sgl);
                          ddr_dma_buf = sg_dma_address(_sgl);
                  } else {
                          ddr_dma_buf += bytes;
                  }
          }

  Don't forget to release these new
  sg_tables after getting the descriptors with dmaengine_prep_slave_sg().

  **1. Set controller specific parameters**

  First, use dmaengine_slave_config() with a struct dma_slave_config to
  configure the STM32 DMA channel. You just have to take care of the DMA
  addresses: the memory address (depending on the transfer direction) must
  point to your SRAM buffer. Also set
  (struct dma_slave_config).peripheral_size != 0.

  The STM32 DMA driver checks (struct dma_slave_config).peripheral_size to
  determine whether chaining is being used. If it is, the STM32 DMA driver
  fills (struct dma_slave_config).peripheral_config with an array of three u32
  values: the first one contains the STM32 DMAMUX channel ID, the second one
  the channel interrupt flag clear register address, and the third one the
  channel Transfer Complete flag mask.

  Then, use dmaengine_slave_config() with another struct dma_slave_config to
  configure the STM32 MDMA channel. Take care of the DMA addresses: the device
  address (depending on the transfer direction) must point to your SRAM buffer,
  and the memory address must point to the buffer originally used for "classic"
  DMA operation.
  Use the previous (struct dma_slave_config).peripheral_size
  and .peripheral_config that have been updated by the STM32 DMA driver to set
  (struct dma_slave_config).peripheral_size and .peripheral_config of the
  struct dma_slave_config used to configure the STM32 MDMA channel.
  ::

          struct dma_slave_config dma_conf;
          struct dma_slave_config mdma_conf;

          memset(&dma_conf, 0, sizeof(dma_conf));
          [...]
          dma_conf.direction = DMA_DEV_TO_MEM;
          dma_conf.dst_addr = sram_dma_buf;        // SRAM buffer
          dma_conf.peripheral_size = 1;            // peripheral_size != 0 => chaining

          dmaengine_slave_config(dma_chan, &dma_conf);

          memset(&mdma_conf, 0, sizeof(mdma_conf));
          mdma_conf.direction = DMA_DEV_TO_MEM;
          mdma_conf.src_addr = sram_dma_buf;       // SRAM buffer
          mdma_conf.dst_addr = rx_dma_buf;         // original memory buffer
          mdma_conf.peripheral_size = dma_conf.peripheral_size;     // <- dma_conf
          mdma_conf.peripheral_config = dma_conf.peripheral_config; // <- dma_conf

          dmaengine_slave_config(mdma_chan, &mdma_conf);

  **2. Get a descriptor for STM32 DMA channel transaction**

  Get your descriptor in the same way as for your "classic" DMA operation.
  You just have to replace the original sg_list (in case of
  dmaengine_prep_slave_sg()) with the new sg_list using the SRAM buffer, or
  replace the original buffer address, length and period (in case of
  dmaengine_prep_dma_cyclic()) with those of the new SRAM buffer.

  **3. Get a descriptor for STM32 MDMA channel transaction**

  If you previously got the descriptor (for STM32 DMA) with

  * dmaengine_prep_slave_sg(), then use dmaengine_prep_slave_sg() for
    STM32 MDMA;
  * dmaengine_prep_dma_cyclic(), then use dmaengine_prep_dma_cyclic() for
    STM32 MDMA.

  Use the new sg_list using SRAM buffer (in case of dmaengine_prep_slave_sg())
  or, depending on the transfer direction, either the original DDR buffer (in
  case of DMA_DEV_TO_MEM) or the SRAM buffer (in case of DMA_MEM_TO_DEV), the
  source address being previously set with dmaengine_slave_config().

  **4. Submit both transactions**

  Before submitting your transactions, you may need to define on which
  descriptor you want a callback to be called at the end of the transfer
  (dmaengine_prep_slave_sg()) or the period (dmaengine_prep_dma_cyclic()).

  Depending on the direction, set the callback on the descriptor that finishes
  the overall transfer:

  * DMA_DEV_TO_MEM: set the callback on the "MDMA" descriptor
  * DMA_MEM_TO_DEV: set the callback on the "DMA" descriptor

  Then, submit the descriptors in any order, with dmaengine_tx_submit().

  **5. Issue pending requests (and wait for callback notification)**

  As the STM32 MDMA channel transfer is triggered by STM32 DMA, you must issue
  the STM32 MDMA channel before the STM32 DMA channel.

  Your callback, if any, will be called to notify you of the end of the overall
  transfer or of the period completion.

  Don't forget to terminate both channels. The STM32 DMA channel is configured
  in cyclic Double-Buffer mode, so it won't be disabled by HW: you need to
  terminate it. The STM32 MDMA channel will be stopped by HW in case of sg
  transfer, but not in case of cyclic transfer. You can terminate it regardless
  of the kind of transfer.

  **STM32 DMA-MDMA chaining DMA_MEM_TO_DEV special case**

  STM32 DMA-MDMA chaining in DMA_MEM_TO_DEV is a special case. Indeed, the
  STM32 MDMA feeds the SRAM buffer with the DDR data, and the STM32 DMA reads
  data from SRAM buffer.
  So some data (the first period) must already have been copied into the SRAM
  buffer when the STM32 DMA starts to read.

  A trick could be pausing the STM32 DMA channel (that will raise a Transfer
  Complete signal, triggering the STM32 MDMA channel), but the first data read
  by the STM32 DMA could be "wrong". The proper way is to prepare the first SRAM
  period with dmaengine_prep_dma_memcpy(). Then this first period should be
  "removed" from the sg or the cyclic transfer.

  Due to this complexity, prefer the STM32 DMA-MDMA chaining for
  DMA_DEV_TO_MEM and keep the "classic" DMA usage for DMA_MEM_TO_DEV, unless
  you're not afraid.

Resources
---------

  Application notes, datasheets and reference manuals are available on the ST
  website (STM32MP1_).

  Three application notes (AN5224_, AN4031_ & AN5001_) focus specifically on
  STM32 DMAMUX, STM32 DMA and STM32 MDMA.

.. _STM32MP1: https://www.st.com/en/microcontrollers-microprocessors/stm32mp1-series.html
.. _AN5224: https://www.st.com/resource/en/application_note/an5224-stm32-dmamux-the-dma-request-router-stmicroelectronics.pdf
.. _AN4031: https://www.st.com/resource/en/application_note/dm00046011-using-the-stm32f2-stm32f4-and-stm32f7-series-dma-controller-stmicroelectronics.pdf
.. _AN5001: https://www.st.com/resource/en/application_note/an5001-stm32cube-expansion-package-for-stm32h7-series-mdma-stmicroelectronics.pdf

:Authors:

- Amelie Delaunay <amelie.delaunay@foss.st.com>