1.. _dma_api:
2
3============================================
4Dynamic DMA mapping using the generic device
5============================================
6
7:Author: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
8
This document describes the DMA API.  For a gentler introduction
to the API (with actual examples), see Documentation/core-api/dma-api-howto.rst.
11
12This API is split into two pieces.  Part I describes the basic API.
13Part II describes extensions for supporting non-consistent memory
14machines.  Unless you know that your driver absolutely has to support
15non-consistent platforms (this is usually only legacy platforms) you
16should only use the API described in part I.
17
18Part I - dma_API
19----------------
20
21To get the dma_API, you must #include <linux/dma-mapping.h>.  This
22provides dma_addr_t and the interfaces described below.
23
24A dma_addr_t can hold any valid DMA address for the platform.  It can be
25given to a device to use as a DMA source or target.  A CPU cannot reference
26a dma_addr_t directly because there may be translation between its physical
27address space and the DMA address space.
28
29Part Ia - Using large DMA-coherent buffers
30------------------------------------------
31
32::
33
34	void *
35	dma_alloc_coherent(struct device *dev, size_t size,
36			   dma_addr_t *dma_handle, gfp_t flag)
37
38Consistent memory is memory for which a write by either the device or
39the processor can immediately be read by the processor or device
40without having to worry about caching effects.  (You may however need
41to make sure to flush the processor's write buffers before telling
42devices to read that memory.)
43
44This routine allocates a region of <size> bytes of consistent memory.
45
46It returns a pointer to the allocated region (in the processor's virtual
47address space) or NULL if the allocation failed.
48
49It also returns a <dma_handle> which may be cast to an unsigned integer the
50same width as the bus and given to the device as the DMA address base of
51the region.
52
53Note: consistent memory can be expensive on some platforms, and the
54minimum allocation length may be as big as a page, so you should
55consolidate your requests for consistent memory as much as possible.
56The simplest way to do that is to use the dma_pool calls (see below).
57
58The flag parameter (dma_alloc_coherent() only) allows the caller to
59specify the ``GFP_`` flags (see kmalloc()) for the allocation (the
60implementation may choose to ignore flags that affect the location of
61the returned memory, like GFP_DMA).
62
63::
64
65	void
66	dma_free_coherent(struct device *dev, size_t size, void *cpu_addr,
67			  dma_addr_t dma_handle)
68
69Free a region of consistent memory you previously allocated.  dev,
70size and dma_handle must all be the same as those passed into
71dma_alloc_coherent().  cpu_addr must be the virtual address returned by
72the dma_alloc_coherent().
73
Note that unlike its sibling allocation call, dma_free_coherent()
may only be called with IRQs enabled.
76
77
78Part Ib - Using small DMA-coherent buffers
79------------------------------------------
80
To get this part of the dma_API, you must #include <linux/dmapool.h>.
82
83Many drivers need lots of small DMA-coherent memory regions for DMA
84descriptors or I/O buffers.  Rather than allocating in units of a page
85or more using dma_alloc_coherent(), you can use DMA pools.  These work
86much like a struct kmem_cache, except that they use the DMA-coherent allocator,
87not __get_free_pages().  Also, they understand common hardware constraints
88for alignment, like queue heads needing to be aligned on N-byte boundaries.
89
90
91::
92
93	struct dma_pool *
94	dma_pool_create(const char *name, struct device *dev,
95			size_t size, size_t align, size_t alloc);
96
97dma_pool_create() initializes a pool of DMA-coherent buffers
98for use with a given device.  It must be called in a context which
99can sleep.
100
101The "name" is for diagnostics (like a struct kmem_cache name); dev and size
102are like what you'd pass to dma_alloc_coherent().  The device's hardware
103alignment requirement for this type of data is "align" (which is expressed
104in bytes, and must be a power of two).  If your device has no boundary
105crossing restrictions, pass 0 for alloc; passing 4096 says memory allocated
106from this pool must not cross 4KByte boundaries.
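
The boundary rule can be illustrated with a small user-space model. This is
only a sketch and not kernel code: ``is_pow2()`` and
``block_crosses_boundary()`` are hypothetical helpers, and the model assumes
the boundary is a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if x is a non-zero power of two. */
static bool is_pow2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * True if [addr, addr + size) crosses a multiple of "boundary".
 * A boundary of 0 means "no restriction", mirroring the text above.
 */
static bool block_crosses_boundary(uint64_t addr, size_t size, size_t boundary)
{
	if (boundary == 0 || size == 0)
		return false;
	assert(is_pow2(boundary));
	/* First and last byte must fall in the same boundary-sized window. */
	return (addr / boundary) != ((addr + size - 1) / boundary);
}
```

With a 4096-byte boundary, a 64-byte block at 0x0fc0 is acceptable because it
ends at 0x0fff, while the same block at 0x0fe0 would straddle 0x1000 and so
could not be handed out by such a pool.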
107
108::
109
110	void *
111	dma_pool_zalloc(struct dma_pool *pool, gfp_t mem_flags,
112		        dma_addr_t *handle)
113
114Wraps dma_pool_alloc() and also zeroes the returned memory if the
115allocation attempt succeeded.
116
117
118::
119
120	void *
121	dma_pool_alloc(struct dma_pool *pool, gfp_t gfp_flags,
122		       dma_addr_t *dma_handle);
123
124This allocates memory from the pool; the returned memory will meet the
125size and alignment requirements specified at creation time.  Pass
126GFP_ATOMIC to prevent blocking, or if it's permitted (not
127in_interrupt, not holding SMP locks), pass GFP_KERNEL to allow
128blocking.  Like dma_alloc_coherent(), this returns two values:  an
129address usable by the CPU, and the DMA address usable by the pool's
130device.
131
132::
133
134	void
135	dma_pool_free(struct dma_pool *pool, void *vaddr,
136		      dma_addr_t addr);
137
138This puts memory back into the pool.  The pool is what was passed to
139dma_pool_alloc(); the CPU (vaddr) and DMA addresses are what
140were returned when that routine allocated the memory being freed.
141
142::
143
144	void
145	dma_pool_destroy(struct dma_pool *pool);
146
147dma_pool_destroy() frees the resources of the pool.  It must be
148called in a context which can sleep.  Make sure you've freed all allocated
149memory back to the pool before you destroy it.
150
151
152Part Ic - DMA addressing limitations
153------------------------------------
154
155::
156
157	int
158	dma_set_mask_and_coherent(struct device *dev, u64 mask)
159
160Checks to see if the mask is possible and updates the device
161streaming and coherent DMA mask parameters if it is.
162
163Returns: 0 if successful and a negative error if not.
164
165::
166
167	int
168	dma_set_mask(struct device *dev, u64 mask)
169
170Checks to see if the mask is possible and updates the device
171parameters if it is.
172
173Returns: 0 if successful and a negative error if not.
174
175::
176
177	int
178	dma_set_coherent_mask(struct device *dev, u64 mask)
179
180Checks to see if the mask is possible and updates the device
181parameters if it is.
182
183Returns: 0 if successful and a negative error if not.
184
185::
186
187	u64
188	dma_get_required_mask(struct device *dev)
189
190This API returns the mask that the platform requires to
191operate efficiently.  Usually this means the returned mask
192is the minimum required to cover all of memory.  Examining the
193required mask gives drivers with variable descriptor sizes the
194opportunity to use smaller descriptors as necessary.
195
196Requesting the required mask does not alter the current mask.  If you
197wish to take advantage of it, you should issue a dma_set_mask()
198call to set the mask to the value returned.
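
As a rough illustration of how a driver might act on the required mask, the
following user-space sketch picks a descriptor format. ``bit_mask()`` is a
local stand-in that mirrors the kernel's DMA_BIT_MASK() macro, and
``need_64bit_descriptors()`` is a hypothetical policy function, not part of
the DMA API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's DMA_BIT_MASK(n): the low n bits set. */
static uint64_t bit_mask(unsigned int n)
{
	assert(n >= 1 && n <= 64);
	return (n == 64) ? ~0ULL : ((1ULL << n) - 1);
}

/*
 * Hypothetical policy: use the larger 64-bit descriptor format only when
 * the required mask has bits set above bit 31.
 */
static bool need_64bit_descriptors(uint64_t required_mask)
{
	return (required_mask & ~bit_mask(32)) != 0;
}
```

In a real driver the argument would be the value returned by
dma_get_required_mask(), and the chosen mask would then be applied with
dma_set_mask().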
199
200::
201
202	size_t
203	dma_max_mapping_size(struct device *dev);
204
205Returns the maximum size of a mapping for the device. The size parameter
206of the mapping functions like dma_map_single(), dma_map_page() and
207others should not be larger than the returned value.
208
209::
210
211	size_t
212	dma_opt_mapping_size(struct device *dev);
213
214Returns the maximum optimal size of a mapping for the device.
215
216Mapping larger buffers may take much longer in certain scenarios. In
217addition, for high-rate short-lived streaming mappings, the upfront time
218spent on the mapping may account for an appreciable part of the total
219request lifetime. As such, if splitting larger requests incurs no
220significant performance penalty, then device drivers are advised to
221limit total DMA streaming mappings length to the returned value.
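
The splitting advice can be sketched with plain arithmetic. The helpers below
are hypothetical user-space functions; in a driver, ``opt_size`` would be the
value returned by dma_opt_mapping_size().

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: length of the next chunk when a request of
 * "remaining" bytes is split so that no mapping exceeds "opt_size".
 */
static size_t next_chunk_len(size_t remaining, size_t opt_size)
{
	assert(opt_size > 0);
	return remaining < opt_size ? remaining : opt_size;
}

/* Number of mappings needed to cover "total" bytes (rounds up). */
static size_t chunk_count(size_t total, size_t opt_size)
{
	assert(opt_size > 0);
	return total / opt_size + (total % opt_size != 0);
}
```

For a 10000-byte request and an optimal size of 4096 bytes this gives three
mappings of 4096, 4096 and 1808 bytes.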
222
223::
224
225	bool
226	dma_need_sync(struct device *dev, dma_addr_t dma_addr);
227
228Returns %true if dma_sync_single_for_{device,cpu} calls are required to
229transfer memory ownership.  Returns %false if those calls can be skipped.
230
231::
232
233	unsigned long
234	dma_get_merge_boundary(struct device *dev);
235
Returns the DMA merge boundary. If the device cannot merge any DMA address
segments, the function returns 0.
238
239Part Id - Streaming DMA mappings
240--------------------------------
241
242::
243
244	dma_addr_t
245	dma_map_single(struct device *dev, void *cpu_addr, size_t size,
246		       enum dma_data_direction direction)
247
248Maps a piece of processor virtual memory so it can be accessed by the
249device and returns the DMA address of the memory.
250
251The direction for both APIs may be converted freely by casting.
252However the dma_API uses a strongly typed enumerator for its
253direction:
254
255======================= =============================================
256DMA_NONE		no direction (used for debugging)
257DMA_TO_DEVICE		data is going from the memory to the device
258DMA_FROM_DEVICE		data is coming from the device to the memory
259DMA_BIDIRECTIONAL	direction isn't known
260======================= =============================================
261
262.. note::
263
264	Not all memory regions in a machine can be mapped by this API.
265	Further, contiguous kernel virtual space may not be contiguous as
266	physical memory.  Since this API does not provide any scatter/gather
267	capability, it will fail if the user tries to map a non-physically
268	contiguous piece of memory.  For this reason, memory to be mapped by
269	this API should be obtained from sources which guarantee it to be
270	physically contiguous (like kmalloc).
271
272	Further, the DMA address of the memory must be within the
273	dma_mask of the device (the dma_mask is a bit mask of the
274	addressable region for the device, i.e., if the DMA address of
275	the memory ANDed with the dma_mask is still equal to the DMA
276	address, then the device can perform DMA to the memory).  To
277	ensure that the memory allocated by kmalloc is within the dma_mask,
278	the driver may specify various platform-dependent flags to restrict
279	the DMA address range of the allocation (e.g., on x86, GFP_DMA
280	guarantees to be within the first 16MB of available DMA addresses,
281	as required by ISA devices).
282
283	Note also that the above constraints on physical contiguity and
284	dma_mask may not apply if the platform has an IOMMU (a device which
285	maps an I/O DMA address to a physical memory address).  However, to be
286	portable, device driver writers may *not* assume that such an IOMMU
287	exists.
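
The dma_mask test described in the note can be written out as a small
user-space model. This is a sketch of the rule as stated there, not the
kernel's own implementation; both helper names are hypothetical, and the mask
is assumed to be a contiguous run of low bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if a single DMA address is representable under the mask. */
static bool addr_within_mask(uint64_t dma_addr, uint64_t mask)
{
	return (dma_addr & mask) == dma_addr;
}

/* A buffer is reachable only if its first and last bytes both are. */
static bool buffer_within_mask(uint64_t dma_addr, size_t size, uint64_t mask)
{
	assert(size > 0);
	return addr_within_mask(dma_addr, mask) &&
	       addr_within_mask(dma_addr + size - 1, mask);
}
```

With a 24-bit mask (0xffffff, the 16MB ISA case), a 4096-byte buffer at
0xfff000 is reachable, but a buffer one byte longer is not because its last
byte lies at 0x1000000.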
288
289.. warning::
290
291	Memory coherency operates at a granularity called the cache
292	line width.  In order for memory mapped by this API to operate
293	correctly, the mapped region must begin exactly on a cache line
294	boundary and end exactly on one (to prevent two separately mapped
295	regions from sharing a single cache line).  Since the cache line size
296	may not be known at compile time, the API will not enforce this
297	requirement.  Therefore, it is recommended that driver writers who
298	don't take special care to determine the cache line size at run time
299	only map virtual regions that begin and end on page boundaries (which
300	are guaranteed also to be cache line boundaries).
301
302	DMA_TO_DEVICE synchronisation must be done after the last modification
303	of the memory region by the software and before it is handed off to
304	the device.  Once this primitive is used, memory covered by this
305	primitive should be treated as read-only by the device.  If the device
306	may write to it at any point, it should be DMA_BIDIRECTIONAL (see
307	below).
308
309	DMA_FROM_DEVICE synchronisation must be done before the driver
310	accesses data that may be changed by the device.  This memory should
311	be treated as read-only by the driver.  If the driver needs to write
312	to it at any point, it should be DMA_BIDIRECTIONAL (see below).
313
314	DMA_BIDIRECTIONAL requires special handling: it means that the driver
315	isn't sure if the memory was modified before being handed off to the
316	device and also isn't sure if the device will also modify it.  Thus,
317	you must always sync bidirectional memory twice: once before the
318	memory is handed off to the device (to make sure all memory changes
319	are flushed from the processor) and once before the data may be
320	accessed after being used by the device (to make sure any processor
321	cache lines are updated with data that the device may have changed).
322
323::
324
325	void
326	dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
327			 enum dma_data_direction direction)
328
329Unmaps the region previously mapped.  All the parameters passed in
330must be identical to those passed in (and returned) by the mapping
331API.
332
333::
334
335	dma_addr_t
336	dma_map_page(struct device *dev, struct page *page,
337		     unsigned long offset, size_t size,
338		     enum dma_data_direction direction)
339
340	void
341	dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
342		       enum dma_data_direction direction)
343
API for mapping and unmapping pages.  All the notes and warnings
for the other mapping APIs apply here.  Also, although the <offset>
and <size> parameters are provided to do partial page mapping, it is
recommended that you never use these unless you really know what the
cache width is.
349
350::
351
352	dma_addr_t
353	dma_map_resource(struct device *dev, phys_addr_t phys_addr, size_t size,
354			 enum dma_data_direction dir, unsigned long attrs)
355
356	void
357	dma_unmap_resource(struct device *dev, dma_addr_t addr, size_t size,
358			   enum dma_data_direction dir, unsigned long attrs)
359
API for mapping and unmapping MMIO resources. All the notes and
warnings for the other mapping APIs apply here. The API should only be
used to map device MMIO resources; mapping of RAM is not permitted.
363
364::
365
366	int
367	dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
368
369In some circumstances dma_map_single(), dma_map_page() and dma_map_resource()
370will fail to create a mapping. A driver can check for these errors by testing
371the returned DMA address with dma_mapping_error(). A non-zero return value
372means the mapping could not be created and the driver should take appropriate
373action (e.g. reduce current DMA mapping usage or delay and try again later).
374
375::
376
377	int
378	dma_map_sg(struct device *dev, struct scatterlist *sg,
379		   int nents, enum dma_data_direction direction)
380
381Returns: the number of DMA address segments mapped (this may be shorter
382than <nents> passed in if some elements of the scatter/gather list are
383physically or virtually adjacent and an IOMMU maps them with a single
384entry).
385
386Please note that the sg cannot be mapped again if it has been mapped once.
387The mapping process is allowed to destroy information in the sg.
388
As with the other mapping interfaces, dma_map_sg() can fail. When it
does, 0 is returned and a driver must take appropriate action. It is
critical that the driver do something; in the case of a block driver,
aborting the request or even oopsing is better than doing nothing and
corrupting the filesystem.
394
395With scatterlists, you use the resulting mapping like this::
396
397	int i, count = dma_map_sg(dev, sglist, nents, direction);
398	struct scatterlist *sg;
399
400	for_each_sg(sglist, sg, count, i) {
401		hw_address[i] = sg_dma_address(sg);
402		hw_len[i] = sg_dma_len(sg);
403	}
404
405where nents is the number of entries in the sglist.
406
The implementation is free to merge several consecutive sglist entries
into one (e.g. with an IOMMU, or if several pages just happen to be
physically contiguous) and returns the actual number of sg entries it
mapped them to. On failure, 0 is returned.
411
412Then you should loop count times (note: this can be less than nents times)
413and use sg_dma_address() and sg_dma_len() macros where you previously
414accessed sg->address and sg->length as shown above.
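
Why the returned count can be smaller than nents is easiest to see in a
simplified model. The sketch below is user-space code: ``struct seg`` is a
stand-in and not the kernel's struct scatterlist, and ``merge_adjacent()`` is
a hypothetical function that only merges entries whose address ranges are
back to back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for one scatterlist entry; not the kernel's struct. */
struct seg {
	uint64_t addr;
	size_t len;
};

/*
 * Merge entries whose address ranges are back to back, in place.
 * Returns the number of segments left, which may be less than nents.
 */
static size_t merge_adjacent(struct seg *s, size_t nents)
{
	size_t out = 0;

	if (nents == 0)
		return 0;
	for (size_t i = 1; i < nents; i++) {
		if (s[out].addr + s[out].len == s[i].addr)
			s[out].len += s[i].len;	/* extend the current segment */
		else
			s[++out] = s[i];	/* start a new segment */
	}
	return out + 1;
}
```

Three input entries where the first two are adjacent collapse into two
segments, which is the situation where a driver must loop over the returned
count rather than over nents.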
415
416::
417
418	void
419	dma_unmap_sg(struct device *dev, struct scatterlist *sg,
420		     int nents, enum dma_data_direction direction)
421
Unmap the previously mapped scatter/gather list.  All the parameters
must be the same as those passed in to the scatter/gather mapping
API.
425
426Note: <nents> must be the number you passed in, *not* the number of
427DMA address entries returned.
428
429::
430
431	void
432	dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle,
433				size_t size,
434				enum dma_data_direction direction)
435
436	void
437	dma_sync_single_for_device(struct device *dev, dma_addr_t dma_handle,
438				   size_t size,
439				   enum dma_data_direction direction)
440
441	void
442	dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
443			    int nents,
444			    enum dma_data_direction direction)
445
446	void
447	dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
448			       int nents,
449			       enum dma_data_direction direction)
450
Synchronise a single contiguous or scatter/gather mapping for the CPU
and device. With the sync_sg API, all the parameters must be the same
as those passed into the sg mapping API. With the sync_single API,
you can use dma_handle and size parameters that aren't identical to
those passed into the single mapping API to do a partial sync.
456
457
458.. note::
459
460   You must do this:
461
   - Before reading values that have been written by DMA from the device
     (use the DMA_FROM_DEVICE direction)
   - After writing values that will be written to the device using DMA
     (use the DMA_TO_DEVICE direction)
   - Before *and* after handing memory to the device if the memory is
     DMA_BIDIRECTIONAL
468
469See also dma_map_single().
470
471::
472
473	dma_addr_t
474	dma_map_single_attrs(struct device *dev, void *cpu_addr, size_t size,
475			     enum dma_data_direction dir,
476			     unsigned long attrs)
477
478	void
479	dma_unmap_single_attrs(struct device *dev, dma_addr_t dma_addr,
480			       size_t size, enum dma_data_direction dir,
481			       unsigned long attrs)
482
483	int
484	dma_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
485			 int nents, enum dma_data_direction dir,
486			 unsigned long attrs)
487
488	void
489	dma_unmap_sg_attrs(struct device *dev, struct scatterlist *sgl,
490			   int nents, enum dma_data_direction dir,
491			   unsigned long attrs)
492
493The four functions above are just like the counterpart functions
494without the _attrs suffixes, except that they pass an optional
495dma_attrs.
496
497The interpretation of DMA attributes is architecture-specific, and
498each attribute should be documented in
499Documentation/core-api/dma-attributes.rst.
500
501If dma_attrs are 0, the semantics of each of these functions
502is identical to those of the corresponding function
503without the _attrs suffix. As a result dma_map_single_attrs()
504can generally replace dma_map_single(), etc.
505
506As an example of the use of the ``*_attrs`` functions, here's how
507you could pass an attribute DMA_ATTR_FOO when mapping memory
508for DMA::
509
	#include <linux/dma-mapping.h>
	/*
	 * DMA_ATTR_FOO should be defined in linux/dma-mapping.h and
	 * documented in Documentation/core-api/dma-attributes.rst
	 */
	...

		unsigned long attr = 0;	/* start with no attributes set */

		attr |= DMA_ATTR_FOO;
		....
		n = dma_map_sg_attrs(dev, sg, nents, DMA_TO_DEVICE, attr);
		....
520
Architectures that care about DMA_ATTR_FOO would check for its
presence in their implementations of the mapping and unmapping
routines, e.g.::
524
525	void whizco_dma_map_sg_attrs(struct device *dev, dma_addr_t dma_addr,
526				     size_t size, enum dma_data_direction dir,
527				     unsigned long attrs)
528	{
529		....
530		if (attrs & DMA_ATTR_FOO)
531			/* twizzle the frobnozzle */
532		....
533	}
534
535
536Part II - Non-coherent DMA allocations
537--------------------------------------
538
These APIs allow allocating pages that are guaranteed to be DMA addressable
by the passed in device, but which need explicit management of memory ownership
between the kernel and the device.
542
543If you don't understand how cache line coherency works between a processor and
544an I/O device, you should not be using this part of the API.
545
546::
547
548	struct page *
549	dma_alloc_pages(struct device *dev, size_t size, dma_addr_t *dma_handle,
550			enum dma_data_direction dir, gfp_t gfp)
551
This routine allocates a region of <size> bytes of non-coherent memory.  It
returns a pointer to the first struct page for the region, or NULL if the
allocation failed. The resulting struct page can be used for everything a
struct page is suitable for.
556
557It also returns a <dma_handle> which may be cast to an unsigned integer the
558same width as the bus and given to the device as the DMA address base of
559the region.
560
The dir parameter specifies whether data is read and/or written by the device,
see dma_map_single() for details.
563
564The gfp parameter allows the caller to specify the ``GFP_`` flags (see
565kmalloc()) for the allocation, but rejects flags used to specify a memory
566zone such as GFP_DMA or GFP_HIGHMEM.
567
568Before giving the memory to the device, dma_sync_single_for_device() needs
569to be called, and before reading memory written by the device,
570dma_sync_single_for_cpu(), just like for streaming DMA mappings that are
571reused.
572
573::
574
575	void
576	dma_free_pages(struct device *dev, size_t size, struct page *page,
577			dma_addr_t dma_handle, enum dma_data_direction dir)
578
579Free a region of memory previously allocated using dma_alloc_pages().
580dev, size, dma_handle and dir must all be the same as those passed into
581dma_alloc_pages().  page must be the pointer returned by dma_alloc_pages().
582
583::
584
585	int
586	dma_mmap_pages(struct device *dev, struct vm_area_struct *vma,
587		       size_t size, struct page *page)
588
589Map an allocation returned from dma_alloc_pages() into a user address space.
590dev and size must be the same as those passed into dma_alloc_pages().
591page must be the pointer returned by dma_alloc_pages().
592
593::
594
595	void *
596	dma_alloc_noncoherent(struct device *dev, size_t size,
597			dma_addr_t *dma_handle, enum dma_data_direction dir,
598			gfp_t gfp)
599
600This routine is a convenient wrapper around dma_alloc_pages that returns the
601kernel virtual address for the allocated memory instead of the page structure.
602
603::
604
605	void
606	dma_free_noncoherent(struct device *dev, size_t size, void *cpu_addr,
607			dma_addr_t dma_handle, enum dma_data_direction dir)
608
609Free a region of memory previously allocated using dma_alloc_noncoherent().
610dev, size, dma_handle and dir must all be the same as those passed into
611dma_alloc_noncoherent().  cpu_addr must be the virtual address returned by
612dma_alloc_noncoherent().
613
614::
615
616	struct sg_table *
617	dma_alloc_noncontiguous(struct device *dev, size_t size,
618				enum dma_data_direction dir, gfp_t gfp,
619				unsigned long attrs);
620
This routine allocates <size> bytes of non-coherent and possibly non-contiguous
memory.  It returns a pointer to a struct sg_table that describes the allocated
and DMA mapped memory, or NULL if the allocation failed. The resulting memory
can be used for everything that struct pages mapped into a scatterlist are
suitable for.
625
The returned sg_table is guaranteed to have a single DMA mapped segment as
indicated by sgt->nents, but it might have multiple CPU side segments as
indicated by sgt->orig_nents.
629
The dir parameter specifies whether data is read and/or written by the device,
see dma_map_single() for details.
632
633The gfp parameter allows the caller to specify the ``GFP_`` flags (see
634kmalloc()) for the allocation, but rejects flags used to specify a memory
635zone such as GFP_DMA or GFP_HIGHMEM.
636
637The attrs argument must be either 0 or DMA_ATTR_ALLOC_SINGLE_PAGES.
638
639Before giving the memory to the device, dma_sync_sgtable_for_device() needs
640to be called, and before reading memory written by the device,
641dma_sync_sgtable_for_cpu(), just like for streaming DMA mappings that are
642reused.
643
644::
645
646	void
647	dma_free_noncontiguous(struct device *dev, size_t size,
648			       struct sg_table *sgt,
649			       enum dma_data_direction dir)
650
651Free memory previously allocated using dma_alloc_noncontiguous().  dev, size,
652and dir must all be the same as those passed into dma_alloc_noncontiguous().
653sgt must be the pointer returned by dma_alloc_noncontiguous().
654
655::
656
657	void *
658	dma_vmap_noncontiguous(struct device *dev, size_t size,
659		struct sg_table *sgt)
660
661Return a contiguous kernel mapping for an allocation returned from
662dma_alloc_noncontiguous().  dev and size must be the same as those passed into
663dma_alloc_noncontiguous().  sgt must be the pointer returned by
664dma_alloc_noncontiguous().
665
666Once a non-contiguous allocation is mapped using this function, the
667flush_kernel_vmap_range() and invalidate_kernel_vmap_range() APIs must be used
668to manage the coherency between the kernel mapping, the device and user space
669mappings (if any).
670
671::
672
673	void
674	dma_vunmap_noncontiguous(struct device *dev, void *vaddr)
675
Unmap a kernel mapping returned by dma_vmap_noncontiguous().  dev must be the
same as the one passed into dma_alloc_noncontiguous().  vaddr must be the
pointer returned by dma_vmap_noncontiguous().
679
680
681::
682
683	int
684	dma_mmap_noncontiguous(struct device *dev, struct vm_area_struct *vma,
685			       size_t size, struct sg_table *sgt)
686
687Map an allocation returned from dma_alloc_noncontiguous() into a user address
688space.  dev and size must be the same as those passed into
689dma_alloc_noncontiguous().  sgt must be the pointer returned by
690dma_alloc_noncontiguous().
691
692::
693
694	int
695	dma_get_cache_alignment(void)
696
697Returns the processor cache alignment.  This is the absolute minimum
698alignment *and* width that you must observe when either mapping
699memory or doing partial flushes.
700
701.. note::
702
703	This API may return a number *larger* than the actual cache
704	line, but it will guarantee that one or more cache lines fit exactly
705	into the width returned by this call.  It will also always be a power
706	of two for easy alignment.
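
Because the returned value is always a power of two, rounding and checking a
region against it reduces to bit masking. The following user-space sketch
shows the arithmetic; ``align_up()`` and ``region_is_aligned()`` are
hypothetical helpers, and in a driver ``align`` would come from
dma_get_cache_alignment().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Round len up to a multiple of align (align must be a power of two). */
static size_t align_up(size_t len, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (len + align - 1) & ~(align - 1);
}

/* True if [addr, addr + len) starts and ends on align-byte boundaries. */
static bool region_is_aligned(uintptr_t addr, size_t len, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((addr | len) & (align - 1)) == 0;
}
```

With an alignment of 64, a 66-byte buffer would have to be padded to 128
bytes before it begins and ends on alignment boundaries.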
707
708
Part III - Debugging drivers' use of the DMA-API
------------------------------------------------
711
The DMA-API as described above has some constraints. For example, DMA addresses
must be released with the corresponding function and with the same size. With
the advent of hardware IOMMUs it becomes more and more important that drivers
do not violate those constraints. In the worst case such a violation can
result in data corruption, up to and including destroyed filesystems.
717
To debug drivers and find bugs in their usage of the DMA-API, checking code can
be compiled into the kernel which will tell the developer about those
violations. If your architecture supports it, you can select the "Enable
debugging of DMA-API usage" option in your kernel configuration. Enabling this
option has a performance impact. Do not enable it in production kernels.
723
When booted, the resulting kernel will contain code which does some bookkeeping
about what DMA memory was allocated for which device. If this code detects an
error, it prints a warning message with some details into your kernel log. An
example warning message may look like this::
728
729	WARNING: at /data2/repos/linux-2.6-iommu/lib/dma-debug.c:448
730		check_unmap+0x203/0x490()
731	Hardware name:
732	forcedeth 0000:00:08.0: DMA-API: device driver frees DMA memory with wrong
733		function [device address=0x00000000640444be] [size=66 bytes] [mapped as
734	single] [unmapped as page]
735	Modules linked in: nfsd exportfs bridge stp llc r8169
736	Pid: 0, comm: swapper Tainted: G        W  2.6.28-dmatest-09289-g8bb99c0 #1
737	Call Trace:
738	<IRQ>  [<ffffffff80240b22>] warn_slowpath+0xf2/0x130
739	[<ffffffff80647b70>] _spin_unlock+0x10/0x30
740	[<ffffffff80537e75>] usb_hcd_link_urb_to_ep+0x75/0xc0
741	[<ffffffff80647c22>] _spin_unlock_irqrestore+0x12/0x40
742	[<ffffffff8055347f>] ohci_urb_enqueue+0x19f/0x7c0
743	[<ffffffff80252f96>] queue_work+0x56/0x60
744	[<ffffffff80237e10>] enqueue_task_fair+0x20/0x50
745	[<ffffffff80539279>] usb_hcd_submit_urb+0x379/0xbc0
746	[<ffffffff803b78c3>] cpumask_next_and+0x23/0x40
747	[<ffffffff80235177>] find_busiest_group+0x207/0x8a0
748	[<ffffffff8064784f>] _spin_lock_irqsave+0x1f/0x50
749	[<ffffffff803c7ea3>] check_unmap+0x203/0x490
750	[<ffffffff803c8259>] debug_dma_unmap_page+0x49/0x50
751	[<ffffffff80485f26>] nv_tx_done_optimized+0xc6/0x2c0
752	[<ffffffff80486c13>] nv_nic_irq_optimized+0x73/0x2b0
753	[<ffffffff8026df84>] handle_IRQ_event+0x34/0x70
754	[<ffffffff8026ffe9>] handle_edge_irq+0xc9/0x150
755	[<ffffffff8020e3ab>] do_IRQ+0xcb/0x1c0
756	[<ffffffff8020c093>] ret_from_intr+0x0/0xa
757	<EOI> <4>---[ end trace f6435a98e2a38c0e ]---
758
From this message the driver developer can identify the driver and the device,
and the stack trace shows the DMA-API call which caused this warning.
761
By default only the first error will result in a warning message. All other
errors will only be silently counted. This limitation exists to prevent the code
from flooding your kernel log. To support debugging a device driver this can
be disabled via debugfs. See the debugfs interface documentation below for
details.
767
768The debugfs directory for the DMA-API debugging code is called dma-api/. In
769this directory the following files can currently be found:
770
=============================== ===============================================
dma-api/all_errors		This file contains a numeric value. If this
				value is not equal to zero the debugging code
				will print a warning for every error it finds
				into the kernel log. Be careful with this
				option, as it can easily flood your logs.

dma-api/disabled		This read-only file contains the character 'Y'
				if the debugging code is disabled. This can
				happen when it runs out of memory or if it was
				disabled at boot time.

dma-api/dump			This read-only file contains current DMA
				mappings.

dma-api/error_count		This file is read-only and shows the total
				number of errors found.

dma-api/num_errors		The number in this file shows how many
				warnings will be printed to the kernel log
				before it stops. This number is initialized to
				one at system boot and can be set by writing
				into this file.

dma-api/min_free_entries	This read-only file can be read to get the
				minimum number of free dma_debug_entries the
				allocator has ever seen. If this value goes
				down to zero the code will attempt to increase
				nr_total_entries to compensate.

dma-api/num_free_entries	The current number of free dma_debug_entries
				in the allocator.

dma-api/nr_total_entries	The total number of dma_debug_entries in the
				allocator, both free and used.

dma-api/driver_filter		You can write a name of a driver into this file
				to limit the debug output to requests from that
				particular driver. Write an empty string to
				that file to disable the filter and see
				all errors again.
=============================== ===============================================
813
If you have this code compiled into your kernel, it will be enabled by default.
If you want to boot without the bookkeeping anyway, you can provide
'dma_debug=off' as a boot parameter. This will disable DMA-API debugging.
Note that you cannot enable it again at runtime. You have to reboot to do
so.
819
If you want to see debug messages only for a specific device driver, you can
specify the dma_debug_driver=<drivername> parameter. This will enable the
driver filter at boot time. The debug code will only print errors for that
driver afterwards. This filter can be disabled or changed later using debugfs.
824
When the code disables itself at runtime, this is most likely because it ran
out of dma_debug_entries and was unable to allocate more on demand. 65536
entries are preallocated at boot. If this is too low for you, boot with
'dma_debug_entries=<your_desired_number>' to override the default. Note
that the code allocates entries in batches, so the exact number of
preallocated entries may be greater than the actual number requested. The
code will print to the kernel log each time it has dynamically allocated
as many entries as were initially preallocated. This is to indicate that a
larger preallocation size may be appropriate or, if it happens continually,
that a driver may be leaking mappings.
835
836::
837
838	void
839	debug_dma_mapping_error(struct device *dev, dma_addr_t dma_addr);
840
The dma-debug interface debug_dma_mapping_error() helps to debug drivers that
fail to check DMA mapping errors on addresses returned by the dma_map_single()
and dma_map_page() interfaces. This interface clears a flag set by
debug_dma_map_page() to indicate that dma_mapping_error() has been called by
the driver. When the driver unmaps, debug_dma_unmap() checks the flag and, if
this flag is still set, prints a warning message that includes the call trace
that leads up to the unmap. This interface can be called from
dma_mapping_error() routines to enable DMA mapping error check debugging.
849