.. SPDX-License-Identifier: GPL-2.0

=======================
STM32 DMA-MDMA chaining
=======================


Introduction
------------

  This document describes the STM32 DMA-MDMA chaining feature. Before going
  further, let's introduce the peripherals involved.

  To offload data transfers from the CPU, STM32 microprocessors (MPUs) embed
  direct memory access (DMA) controllers.

  STM32MP1 SoCs embed both STM32 DMA and STM32 MDMA controllers. STM32 DMA
  request routing capabilities are enhanced by a DMA request multiplexer
  (STM32 DMAMUX).

  **STM32 DMAMUX**

  STM32 DMAMUX routes any DMA request from a given peripheral to any channel of
  any STM32 DMA controller (STM32MP1 has two STM32 DMA controllers).

  **STM32 DMA**

  STM32 DMA is mainly used to implement central data buffer storage (usually in
  the system SRAM) for different peripherals. It can access external RAMs, but
  it cannot generate the burst transfers needed to make the best use of the
  AXI bus.

  **STM32 MDMA**

  STM32 MDMA (Master DMA) is mainly used to manage direct data transfers between
  RAM data buffers without CPU intervention. It can also be used in a
  hierarchical structure that uses STM32 DMA as first level data buffer
  interfaces for AHB peripherals, while the STM32 MDMA acts as a second level
  DMA with better performance. As an AXI/AHB master, STM32 MDMA can take control
  of the AXI/AHB bus.


Principles
----------

  The STM32 DMA-MDMA chaining feature relies on the strengths of the STM32 DMA
  and STM32 MDMA controllers.

  STM32 DMA has a circular Double Buffer Mode (DBM). At each end of transaction
  (when the DMA data counter, DMA_SxNDTR, reaches 0), the memory pointers
  (configured with DMA_SxM0AR and DMA_SxM1AR) are swapped and the DMA data
  counter is automatically reloaded. This allows the SW or the STM32 MDMA to
  process one memory area while the second memory area is being filled/used by
  the STM32 DMA transfer.
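
  The double-buffer behaviour can be pictured with a small userspace model in
  plain C. This is only a sketch, not driver code: struct dbm_stream and the
  dbm_*() helpers are hypothetical names, and the model records which memory
  area the STM32 DMA currently targets instead of touching real registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of one STM32 DMA stream in Double Buffer Mode. */
struct dbm_stream {
	uint32_t m0ar;   /* models DMA_SxM0AR: first memory area */
	uint32_t m1ar;   /* models DMA_SxM1AR: second memory area */
	uint32_t ndtr;   /* models DMA_SxNDTR: data items left in the transaction */
	uint32_t reload; /* value put back into ndtr at each end of transaction */
	int current;     /* 0 if the DMA targets m0ar, 1 if it targets m1ar */
};

static void dbm_init(struct dbm_stream *s, uint32_t m0, uint32_t m1,
		     uint32_t items)
{
	assert(items > 0);
	s->m0ar = m0;
	s->m1ar = m1;
	s->ndtr = items;
	s->reload = items;
	s->current = 0;
}

/* Memory area the STM32 DMA is currently filling (or reading). */
static uint32_t dbm_dma_area(const struct dbm_stream *s)
{
	return s->current ? s->m1ar : s->m0ar;
}

/* Memory area left to the SW or to the STM32 MDMA in the meantime. */
static uint32_t dbm_other_area(const struct dbm_stream *s)
{
	return s->current ? s->m0ar : s->m1ar;
}

/* Move one data item; return 1 when the transaction ends, 0 otherwise. */
static int dbm_transfer_item(struct dbm_stream *s)
{
	if (--s->ndtr)
		return 0;
	s->current ^= 1;     /* the two memory areas exchange roles */
	s->ndtr = s->reload; /* the data counter is reloaded */
	return 1;
}
```

  Each time dbm_transfer_item() returns 1, the area previously reported by
  dbm_dma_area() becomes the one reported by dbm_other_area(), which is the
  moment the SW or the STM32 MDMA can start working on it.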

  With STM32 MDMA linked-list mode, a single request initiates the transfer of
  a data array (collection of nodes), which continues until the linked-list
  pointer for the channel is null. The channel transfer complete of the last
  node is the end of transfer, unless the first and last nodes are linked to
  each other; in that case, the linked list loops to create a circular MDMA
  transfer.
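
  The difference between a terminated and a circular list can be sketched in
  plain C. struct ll_node and ll_run() are hypothetical names used only for
  this illustration; a real node also carries addresses and control fields,
  and a real circular transfer runs until the channel is terminated, which the
  model replaces with a node budget.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one linked-list node: a length and a link. */
struct ll_node {
	size_t len;           /* bytes moved by this node */
	struct ll_node *next; /* NULL marks the last node */
};

/*
 * Walk the list as a single request would and return the bytes moved.
 * max_nodes bounds the walk so that a circular list terminates in this model.
 */
static size_t ll_run(const struct ll_node *first, unsigned int max_nodes)
{
	const struct ll_node *n;
	size_t total = 0;

	assert(first != NULL);
	for (n = first; n && max_nodes; n = n->next, max_nodes--)
		total += n->len;
	return total;
}
```

  With two nodes of 128 and 64 bytes, the walk stops after 192 bytes when the
  second node points to NULL; once the second node points back to the first
  one, only the node budget stops it.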

  STM32 MDMA has direct connections with STM32 DMA. This enables autonomous
  communication and synchronization between peripherals, thus saving CPU
  resources and reducing bus congestion. The Transfer Complete signal of an
  STM32 DMA channel can trigger an STM32 MDMA transfer. STM32 MDMA can clear the
  request generated by the STM32 DMA by writing to its Interrupt Clear register
  (whose address is stored in MDMA_CxMAR, and bit mask in MDMA_CxMDR).

  .. table:: STM32 MDMA interconnect table with STM32 DMA

    +--------------+----------------+-----------+------------+
    | STM32 DMAMUX | STM32 DMA      | STM32 DMA | STM32 MDMA |
    | channels     | channels       | Transfer  | request    |
    |              |                | complete  |            |
    |              |                | signal    |            |
    +==============+================+===========+============+
    | Channel *0*  | DMA1 channel 0 | dma1_tcf0 | *0x00*     |
    +--------------+----------------+-----------+------------+
    | Channel *1*  | DMA1 channel 1 | dma1_tcf1 | *0x01*     |
    +--------------+----------------+-----------+------------+
    | Channel *2*  | DMA1 channel 2 | dma1_tcf2 | *0x02*     |
    +--------------+----------------+-----------+------------+
    | Channel *3*  | DMA1 channel 3 | dma1_tcf3 | *0x03*     |
    +--------------+----------------+-----------+------------+
    | Channel *4*  | DMA1 channel 4 | dma1_tcf4 | *0x04*     |
    +--------------+----------------+-----------+------------+
    | Channel *5*  | DMA1 channel 5 | dma1_tcf5 | *0x05*     |
    +--------------+----------------+-----------+------------+
    | Channel *6*  | DMA1 channel 6 | dma1_tcf6 | *0x06*     |
    +--------------+----------------+-----------+------------+
    | Channel *7*  | DMA1 channel 7 | dma1_tcf7 | *0x07*     |
    +--------------+----------------+-----------+------------+
    | Channel *8*  | DMA2 channel 0 | dma2_tcf0 | *0x08*     |
    +--------------+----------------+-----------+------------+
    | Channel *9*  | DMA2 channel 1 | dma2_tcf1 | *0x09*     |
    +--------------+----------------+-----------+------------+
    | Channel *10* | DMA2 channel 2 | dma2_tcf2 | *0x0A*     |
    +--------------+----------------+-----------+------------+
    | Channel *11* | DMA2 channel 3 | dma2_tcf3 | *0x0B*     |
    +--------------+----------------+-----------+------------+
    | Channel *12* | DMA2 channel 4 | dma2_tcf4 | *0x0C*     |
    +--------------+----------------+-----------+------------+
    | Channel *13* | DMA2 channel 5 | dma2_tcf5 | *0x0D*     |
    +--------------+----------------+-----------+------------+
    | Channel *14* | DMA2 channel 6 | dma2_tcf6 | *0x0E*     |
    +--------------+----------------+-----------+------------+
    | Channel *15* | DMA2 channel 7 | dma2_tcf7 | *0x0F*     |
    +--------------+----------------+-----------+------------+
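
  The table follows a regular pattern: eight channels per controller, DMA1
  first. A minimal sketch of that mapping, with hypothetical names
  (dma_to_mdma_request() is not a kernel function), could be:

```c
#include <assert.h>

#define CHANNELS_PER_DMA 8u /* each STM32 DMA controller in the table has 8 channels */

/*
 * Return the STM32 MDMA request (also the STM32 DMAMUX channel) matching
 * "DMA<controller> channel <channel>" in the interconnect table.
 */
static unsigned int dma_to_mdma_request(unsigned int controller,
					unsigned int channel)
{
	assert(controller == 1 || controller == 2);
	assert(channel < CHANNELS_PER_DMA);
	return (controller - 1) * CHANNELS_PER_DMA + channel;
}
```

  For example, DMA2 channel 3 maps to request 0x0B, as in the table.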

  The STM32 DMA-MDMA chaining feature then uses an SRAM buffer. STM32MP1 SoCs
  embed three fast-access static internal RAMs of various sizes, used for data
  storage. Due to STM32 DMA legacy (within microcontrollers), STM32 DMA
  performance is poor with DDR, while it is optimal with SRAM. Hence the SRAM
  buffer used between STM32 DMA and STM32 MDMA. This buffer is split into two
  equal periods and STM32 DMA uses one period while STM32 MDMA uses the other
  period simultaneously.
  ::

                    dma[1:2]-tcf[0:7]
                   .----------------.
     ____________ '    _________     V____________
    | STM32 DMA  |    /  __|>_  \    | STM32 MDMA |
    |------------|   |  /     \  |   |------------|
    | DMA_SxM0AR |<=>| | SRAM  | |<=>| []-[]...[] |
    | DMA_SxM1AR |   |  \_____/  |   |            |
    |____________|    \___<|____/    |____________|

  STM32 DMA-MDMA chaining uses (struct dma_slave_config).peripheral_config to
  exchange the parameters needed to configure MDMA. These parameters are
  gathered into a u32 array with three values:

  * the STM32 MDMA request (which is actually the DMAMUX channel ID),
  * the address of the STM32 DMA register to clear the Transfer Complete
    interrupt flag,
  * the mask of the Transfer Complete interrupt flag of the STM32 DMA channel.
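
  To make the three values concrete, here is a userspace sketch that computes
  them for a given STM32 DMA channel. It is not the driver implementation:
  struct chain_params and chain_params_for() are hypothetical names, and the
  register offsets and flag positions are assumptions taken from the usual
  STM32 DMA register layout (LIFCR for channels 0 to 3, HIFCR for channels 4
  to 7), to be checked against the reference manual.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_LIFCR_OFFSET 0x08u /* assumed: low interrupt flag clear register */
#define DMA_HIFCR_OFFSET 0x0Cu /* assumed: high interrupt flag clear register */

/* Hypothetical holder for the three u32 values exchanged for chaining. */
struct chain_params {
	uint32_t mdma_request; /* STM32 MDMA request, i.e. DMAMUX channel ID */
	uint32_t ifcr_addr;    /* address of the flag clear register */
	uint32_t tcf_mask;     /* Transfer Complete flag mask in that register */
};

static struct chain_params chain_params_for(uint32_t dma_base,
					    unsigned int channel,
					    unsigned int dmamux_channel)
{
	/* Assumed bit position of the Transfer Complete flag, per channel % 4 */
	static const unsigned int tcif_bit[4] = { 5, 11, 21, 27 };
	struct chain_params p;

	assert(channel < 8);
	p.mdma_request = dmamux_channel;
	p.ifcr_addr = dma_base +
		      (channel < 4 ? DMA_LIFCR_OFFSET : DMA_HIFCR_OFFSET);
	p.tcf_mask = UINT32_C(1) << tcif_bit[channel % 4];
	return p;
}
```

  The controller base address is left as a parameter on purpose, since it
  comes from the SoC device tree rather than from this document.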

Device Tree updates for STM32 DMA-MDMA chaining support
-------------------------------------------------------

  **1. Allocate an SRAM buffer**

    The SRAM device tree node is defined in the SoC device tree. You can refer
    to it in your board device tree to define your SRAM pool.
    ::

          &sram {
                  my_foo_device_dma_pool: dma-sram@0 {
                          reg = <0x0 0x1000>;
                  };
          };

    Be careful with the start index, in case there are other SRAM consumers.
    Define your pool size strategically: to optimise chaining, the idea is that
    STM32 DMA and STM32 MDMA can work simultaneously, each on one period of the
    SRAM buffer.
    If the SRAM period is greater than the expected DMA transfer, then STM32 DMA
    and STM32 MDMA will work sequentially instead of simultaneously. It is not a
    functional issue but it is not optimal.
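
    The sizing rule can be checked with a few lines of arithmetic. The helpers
    below are hypothetical and only restate the rule above in plain C: the
    pool is split into two equal periods, and the two controllers can overlap
    only when the transfer spans more than one period.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Length of one period for a pool split into two equal periods. */
static size_t sram_period_len(size_t pool_size)
{
	assert(pool_size >= 2 && pool_size % 2 == 0);
	return pool_size / 2;
}

/* Number of periods a transfer of transfer_len bytes goes through. */
static size_t periods_needed(size_t transfer_len, size_t pool_size)
{
	size_t period = sram_period_len(pool_size);

	return (transfer_len + period - 1) / period;
}

/* True when STM32 DMA and STM32 MDMA can work simultaneously. */
static bool chaining_overlaps(size_t transfer_len, size_t pool_size)
{
	return periods_needed(transfer_len, pool_size) > 1;
}
```

    With the 0x1000-byte pool of the example above, the period is 0x800 bytes:
    a 0x2000-byte transfer goes through four periods and overlaps, while a
    0x400-byte transfer fits in one period and is handled sequentially.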

    Don't forget to refer to your SRAM pool in your device node. You need to
    define a new property.
    ::

          &my_foo_device {
                  ...
                  my_dma_pool = &my_foo_device_dma_pool;
          };

    Then get this SRAM pool in your foo driver and allocate your SRAM buffer.

  **2. Allocate an STM32 DMA channel and an STM32 MDMA channel**

    You need to define an extra channel in your device tree node, in addition to
    the one you should already have for "classic" DMA operation.

    This new channel must be taken from the STM32 MDMA channels, so the phandle
    of the DMA controller to use is that of the MDMA controller.
    ::

          &my_foo_device {
                  [...]
                  my_dma_pool = &my_foo_device_dma_pool;
                  dmas = <&dmamux1 ...>,                // STM32 DMA channel
                         <&mdma1 0 0x3 0x1200000a 0 0>; // + STM32 MDMA channel
          };

    Concerning STM32 MDMA bindings:

    1. The request line number: whatever the value here, it will be overwritten
       by the MDMA driver with the STM32 DMAMUX channel ID passed through
       (struct dma_slave_config).peripheral_config.

    2. The priority level: choose Very High (0x3) so that your channel takes
       priority over the others during request arbitration.

    3. A 32-bit mask specifying the DMA channel configuration: source and
       destination address increment, block transfer with 128 bytes per single
       transfer.

    4. The 32-bit value specifying the register to be used to acknowledge the
       request: it will be overwritten by the MDMA driver with the DMA channel
       interrupt flag clear register address passed through
       (struct dma_slave_config).peripheral_config.

    5. The 32-bit mask specifying the value to be written to acknowledge the
       request: it will be overwritten by the MDMA driver with the DMA channel
       Transfer Complete flag passed through
       (struct dma_slave_config).peripheral_config.

Driver updates for STM32 DMA-MDMA chaining support in foo driver
----------------------------------------------------------------

  **0. (optional) Refactor the original sg_table when using dmaengine_prep_slave_sg()**

    In case of dmaengine_prep_slave_sg(), the original sg_table can't be used as
    is. Two new sg_tables must be created from the original one: one for the
    STM32 DMA transfer (where the memory address now targets the SRAM buffer
    instead of the DDR buffer) and one for the STM32 MDMA transfer (where the
    memory address targets the DDR buffer).

    The new sg_list items must fit the SRAM period length. Here is an example
    for DMA_DEV_TO_MEM:
    ::

      /*
       * Assuming sgl and nents, respectively the initial scatterlist and its
       * length.
       * Assuming sram_dma_buf and sram_period, respectively the memory
       * allocated from the pool for DMA usage, and the length of the period,
       * which is half of the sram_dma_buf size.
       */
      struct sg_table new_dma_sgt, new_mdma_sgt;
      struct scatterlist *s, *_sgl;
      dma_addr_t ddr_dma_buf;
      u32 new_nents = 0, len;
      int i, ret;

      /* Count the number of entries needed */
      for_each_sg(sgl, s, nents, i)
              if (sg_dma_len(s) > sram_period)
                      new_nents += DIV_ROUND_UP(sg_dma_len(s), sram_period);
              else
                      new_nents++;

      /* Create sg table for STM32 DMA channel */
      ret = sg_alloc_table(&new_dma_sgt, new_nents, GFP_ATOMIC);
      if (ret) {
              dev_err(dev, "DMA sg table alloc failed\n");
              return ret;
      }

      _sgl = sgl;
      len = sg_dma_len(_sgl);
      for_each_sg(new_dma_sgt.sgl, s, new_dma_sgt.nents, i) {
              size_t bytes = min_t(size_t, len, sram_period);

              sg_dma_len(s) = bytes;
              /* Even items target the first half of sram_dma_buf */
              sg_dma_address(s) = sram_dma_buf;
              /* Odd items target the second half of sram_dma_buf */
              if (i & 1)
                      sg_dma_address(s) += sram_period;
              len -= bytes;

              /* Move to the next original item once this one is consumed */
              if (!len && sg_next(_sgl)) {
                      _sgl = sg_next(_sgl);
                      len = sg_dma_len(_sgl);
              }
      }

      /* Create sg table for STM32 MDMA channel */
      ret = sg_alloc_table(&new_mdma_sgt, new_nents, GFP_ATOMIC);
      if (ret) {
              dev_err(dev, "MDMA sg_table alloc failed\n");
              sg_free_table(&new_dma_sgt);
              return ret;
      }

      _sgl = sgl;
      len = sg_dma_len(sgl);
      ddr_dma_buf = sg_dma_address(sgl);
      for_each_sg(new_mdma_sgt.sgl, s, new_mdma_sgt.nents, i) {
              size_t bytes = min_t(size_t, len, sram_period);

              sg_dma_len(s) = bytes;
              sg_dma_address(s) = ddr_dma_buf;
              len -= bytes;

              if (!len && sg_next(_sgl)) {
                      _sgl = sg_next(_sgl);
                      len = sg_dma_len(_sgl);
                      ddr_dma_buf = sg_dma_address(_sgl);
              } else {
                      ddr_dma_buf += bytes;
              }
      }

    Don't forget to release these new sg_tables after getting the descriptors
    with dmaengine_prep_slave_sg().
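
    To see what the split produces, the same logic can be run on plain lengths
    in userspace. This is a sketch under stated assumptions: struct chunk and
    split_segments() are hypothetical stand-ins for the new sg_list items, and
    only lengths and offsets are modelled, not DMA addresses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one item of the new sg_lists. */
struct chunk {
	size_t len;      /* bytes moved, at most one SRAM period */
	size_t sram_off; /* offset in the SRAM buffer: 0 or sram_period */
	size_t seg;      /* index of the original item it comes from */
	size_t seg_off;  /* offset inside that original item (DDR side) */
};

/*
 * Split the original items into chunks that fit the SRAM period.
 * Returns the number of chunks written, or 0 if out[] is too small.
 */
static size_t split_segments(const size_t *seg_len, size_t nsegs,
			     size_t sram_period, struct chunk *out,
			     size_t max_out)
{
	size_t i, n = 0;

	assert(sram_period > 0);
	for (i = 0; i < nsegs; i++) {
		size_t off = 0;

		while (off < seg_len[i]) {
			size_t bytes = seg_len[i] - off;

			if (bytes > sram_period)
				bytes = sram_period;
			if (n == max_out)
				return 0;
			out[n].len = bytes;
			/* Odd chunks use the second half of the SRAM buffer */
			out[n].sram_off = (n & 1) ? sram_period : 0;
			out[n].seg = i;
			out[n].seg_off = off;
			off += bytes;
			n++;
		}
	}
	return n;
}
```

    With original items of 5000 and 3000 bytes and a 4096-byte period, the
    sketch yields three chunks of 4096, 904 and 3000 bytes, using SRAM offsets
    0, 4096 and 0 respectively.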

  **1. Set controller specific parameters**

    First, use dmaengine_slave_config() with a struct dma_slave_config to
    configure the STM32 DMA channel. You just have to take care of the DMA
    addresses: the memory address (depending on the transfer direction) must
    point to your SRAM buffer. Also set
    (struct dma_slave_config).peripheral_size != 0.

    The STM32 DMA driver checks (struct dma_slave_config).peripheral_size to
    determine whether chaining is being used. If it is, the STM32 DMA driver
    fills (struct dma_slave_config).peripheral_config with an array of three
    u32 values: the first one contains the STM32 DMAMUX channel ID, the second
    one the channel interrupt flag clear register address, and the third one the
    channel Transfer Complete flag mask.

    Then, use dmaengine_slave_config() with another struct dma_slave_config to
    configure the STM32 MDMA channel. Take care of the DMA addresses: the device
    address (depending on the transfer direction) must point to your SRAM
    buffer, and the memory address must point to the buffer originally used for
    "classic" DMA operation. Use the previous
    (struct dma_slave_config).peripheral_size and .peripheral_config, which
    have been updated by the STM32 DMA driver, to set
    (struct dma_slave_config).peripheral_size and .peripheral_config of the
    struct dma_slave_config used to configure the STM32 MDMA channel.
    ::

      struct dma_slave_config dma_conf;
      struct dma_slave_config mdma_conf;

      memset(&dma_conf, 0, sizeof(dma_conf));
      [...]
      dma_conf.direction = DMA_DEV_TO_MEM;
      dma_conf.dst_addr = sram_dma_buf;      // SRAM buffer
      dma_conf.peripheral_size = 1;          // peripheral_size != 0 => chaining

      dmaengine_slave_config(dma_chan, &dma_conf);

      memset(&mdma_conf, 0, sizeof(mdma_conf));
      mdma_conf.direction = DMA_DEV_TO_MEM;
      mdma_conf.src_addr = sram_dma_buf;     // SRAM buffer
      mdma_conf.dst_addr = rx_dma_buf;       // original memory buffer
      mdma_conf.peripheral_size = dma_conf.peripheral_size;     // <- dma_conf
      mdma_conf.peripheral_config = dma_conf.peripheral_config; // <- dma_conf

      dmaengine_slave_config(mdma_chan, &mdma_conf);

  **2. Get a descriptor for STM32 DMA channel transaction**

    In the same way you get your descriptor for your "classic" DMA operation,
    you just have to replace the original sg_list (in case of
    dmaengine_prep_slave_sg()) with the new sg_list using the SRAM buffer, or
    replace the original buffer address, length and period (in case of
    dmaengine_prep_dma_cyclic()) with those of the new SRAM buffer.

  **3. Get a descriptor for STM32 MDMA channel transaction**

    If you previously got the descriptor (for STM32 DMA) with

    * dmaengine_prep_slave_sg(), then use dmaengine_prep_slave_sg() for
      STM32 MDMA;
    * dmaengine_prep_dma_cyclic(), then use dmaengine_prep_dma_cyclic() for
      STM32 MDMA.

    Use the new sg_list created for STM32 MDMA (in case of
    dmaengine_prep_slave_sg()) or, in case of dmaengine_prep_dma_cyclic(), a
    buffer that depends on the transfer direction: the original DDR buffer for
    DMA_DEV_TO_MEM, or the SRAM buffer for DMA_MEM_TO_DEV. The source address
    has previously been set with dmaengine_slave_config().

  **4. Submit both transactions**

    Before submitting your transactions, you may need to define on which
    descriptor you want a callback to be called at the end of the transfer
    (dmaengine_prep_slave_sg()) or the period (dmaengine_prep_dma_cyclic()).
    Depending on the direction, set the callback on the descriptor that finishes
    the overall transfer:

    * DMA_DEV_TO_MEM: set the callback on the "MDMA" descriptor
    * DMA_MEM_TO_DEV: set the callback on the "DMA" descriptor

    Then, submit the descriptors in any order with dmaengine_submit().

  **5. Issue pending requests (and wait for callback notification)**

    As the STM32 MDMA channel transfer is triggered by STM32 DMA, you must issue
    pending requests on the STM32 MDMA channel before the STM32 DMA channel.

    If any, your callback will be called to notify you of the end of the overall
    transfer or of the period completion.

    Don't forget to terminate both channels. The STM32 DMA channel is configured
    in cyclic Double-Buffer mode so it won't be disabled by HW: you need to
    terminate it. The STM32 MDMA channel will be stopped by HW in case of sg
    transfer, but not in case of cyclic transfer. You can terminate it whatever
    the kind of transfer.
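
    The ordering rules of steps 4 and 5 can be summarised as a small decision
    table. The sketch below is plain userspace C with hypothetical names
    (enum xfer_dir, enum chained_chan and the three helpers are not kernel
    symbols); it only encodes the rules stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum xfer_dir { XFER_DEV_TO_MEM, XFER_MEM_TO_DEV }; /* stand-ins for DMA_DEV_TO_MEM / DMA_MEM_TO_DEV */
enum chained_chan { CHAN_DMA, CHAN_MDMA };

/* Channel whose descriptor finishes the overall transfer and gets the callback. */
static enum chained_chan callback_channel(enum xfer_dir dir)
{
	return dir == XFER_DEV_TO_MEM ? CHAN_MDMA : CHAN_DMA;
}

/* Order for issuing pending requests: STM32 MDMA first, as STM32 DMA triggers it. */
static void issue_order(enum chained_chan order[2])
{
	assert(order != NULL);
	order[0] = CHAN_MDMA;
	order[1] = CHAN_DMA;
}

/*
 * True if HW stops the channel by itself: only STM32 MDMA, and only for sg
 * transfers. Terminating the channel is harmless in every case.
 */
static bool hw_stops_channel(enum chained_chan chan, bool cyclic)
{
	return chan == CHAN_MDMA && !cyclic;
}
```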

  **STM32 DMA-MDMA chaining DMA_MEM_TO_DEV special case**

  STM32 DMA-MDMA chaining in DMA_MEM_TO_DEV is a special case. Indeed, the
  STM32 MDMA feeds the SRAM buffer with the DDR data, and the STM32 DMA reads
  data from the SRAM buffer. So some data (the first period) must already have
  been copied into the SRAM buffer when the STM32 DMA starts to read.

  A trick could be pausing the STM32 DMA channel (that will raise a Transfer
  Complete signal, triggering the STM32 MDMA channel), but the first data read
  by the STM32 DMA could be "wrong". The proper way is to prepare the first SRAM
  period with dmaengine_prep_dma_memcpy(). Then this first period should be
  "removed" from the sg or the cyclic transfer.
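
  For a single linear transfer, "removing" the first period amounts to simple
  bookkeeping. The following userspace sketch uses hypothetical names
  (struct m2d_plan, m2d_make_plan()) and only computes lengths and offsets; it
  does not call the dmaengine API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical split of a DMA_MEM_TO_DEV transfer into preload + chained part. */
struct m2d_plan {
	size_t preload_len; /* bytes copied to the first SRAM period by memcpy */
	size_t chained_off; /* offset in the DDR buffer where chaining starts */
	size_t chained_len; /* bytes left for the chained STM32 MDMA transfer */
};

static struct m2d_plan m2d_make_plan(size_t total_len, size_t sram_period)
{
	struct m2d_plan p;

	assert(sram_period > 0);
	/* The preload never exceeds one SRAM period */
	p.preload_len = total_len < sram_period ? total_len : sram_period;
	p.chained_off = p.preload_len;
	p.chained_len = total_len - p.preload_len;
	return p;
}
```

  For a 10000-byte transfer with a 4096-byte period, the memcpy covers the
  first 4096 bytes and the chained transfer handles the remaining 5904 bytes,
  starting at offset 4096 of the DDR buffer.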

  Due to this complexity, prefer STM32 DMA-MDMA chaining for DMA_DEV_TO_MEM and
  keep the "classic" DMA usage for DMA_MEM_TO_DEV, unless you're not afraid.

Resources
---------

  Application notes, datasheets and reference manuals are available on the ST
  website (STM32MP1_).

  Three application notes (AN5224_, AN4031_ & AN5001_) deal specifically with
  STM32 DMAMUX, STM32 DMA and STM32 MDMA.

.. _STM32MP1: https://www.st.com/en/microcontrollers-microprocessors/stm32mp1-series.html
.. _AN5224: https://www.st.com/resource/en/application_note/an5224-stm32-dmamux-the-dma-request-router-stmicroelectronics.pdf
.. _AN4031: https://www.st.com/resource/en/application_note/dm00046011-using-the-stm32f2-stm32f4-and-stm32f7-series-dma-controller-stmicroelectronics.pdf
.. _AN5001: https://www.st.com/resource/en/application_note/an5001-stm32cube-expansion-package-for-stm32h7-series-mdma-stmicroelectronics.pdf

:Authors:

- Amelie Delaunay <amelie.delaunay@foss.st.com>