===============
Persistent data
===============

Introduction
============

The more sophisticated device-mapper targets require complex metadata
that is managed in the kernel.  In late 2010 we saw that various
targets were rolling their own data structures, for example:

- Mikulas Patocka's multisnap implementation
- Heinz Mauelshagen's thin provisioning target
- Another btree-based caching target posted to dm-devel
- Another multi-snapshot target based on a design of Daniel Phillips

Maintaining these data structures takes a lot of work, so if possible
we'd like to reduce the number.

The persistent-data library is an attempt to provide a re-usable
framework for people who want to store metadata in device-mapper
targets.  It's currently used by the thin-provisioning target and an
upcoming hierarchical storage target.

Overview
========

The main documentation is in the header files which can all be found
under drivers/md/persistent-data.

The block manager
-----------------

dm-block-manager.[hc]

This provides access to the data on disk in fixed-size blocks.  A
read/write locking interface prevents conflicting concurrent accesses
and keeps data that is in use in the cache.

Clients of persistent-data are unlikely to use this directly.

The transaction manager
-----------------------

dm-transaction-manager.[hc]

This restricts access to blocks and enforces copy-on-write semantics.
The only way to get hold of a writable block through the
transaction manager is by shadowing an existing block (i.e. doing
copy-on-write) or allocating a fresh one.  Shadowing is elided within
the same transaction, so performance is reasonable.  The commit method
ensures that all data is flushed before it writes the superblock.
On power failure your metadata will be as it was when last committed.
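
The shadowing rule can be illustrated with a small userspace model.
This is only a sketch under stated assumptions: the ``toy_*`` names, the
in-memory block array and the ``committed_root`` field are hypothetical
and are not the kernel interface, which is documented in
dm-transaction-manager.h.  Reference counting and freeing of the old
block are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_NR_BLOCKS  8
#define TOY_BLOCK_SIZE 16

/* Hypothetical in-memory stand-in for a metadata device. */
struct toy_tm {
	uint8_t data[TOY_NR_BLOCKS][TOY_BLOCK_SIZE];
	bool in_use[TOY_NR_BLOCKS];
	/* true if the block was allocated or shadowed in this transaction */
	bool shadowed[TOY_NR_BLOCKS];
	/* plays the role of the root recorded in the superblock */
	int committed_root;
};

/* Allocate a fresh, zeroed block.  Returns its index, or -1 if full. */
static int toy_alloc(struct toy_tm *tm)
{
	for (int b = 0; b < TOY_NR_BLOCKS; b++) {
		if (!tm->in_use[b]) {
			tm->in_use[b] = true;
			tm->shadowed[b] = true;
			memset(tm->data[b], 0, TOY_BLOCK_SIZE);
			return b;
		}
	}
	return -1;
}

/*
 * Return a writable block holding the contents of 'orig'.  If 'orig'
 * was already allocated or shadowed in this transaction the copy is
 * elided and 'orig' itself is returned.
 */
static int toy_shadow(struct toy_tm *tm, int orig)
{
	int copy;

	assert(orig >= 0 && orig < TOY_NR_BLOCKS);
	if (tm->shadowed[orig])
		return orig;

	copy = toy_alloc(tm);
	if (copy >= 0)
		memcpy(tm->data[copy], tm->data[orig], TOY_BLOCK_SIZE);
	return copy;
}

/*
 * Commit: a real implementation would flush every data block before
 * writing the superblock.  Here we only switch the root and start a
 * new transaction.
 */
static void toy_commit(struct toy_tm *tm, int new_root)
{
	tm->committed_root = new_root;
	memset(tm->shadowed, 0, sizeof(tm->shadowed));
}
```

In this model, writes go to the shadow while ``committed_root`` keeps
pointing at the untouched original until ``toy_commit()`` runs, which is
why an interrupted transaction leaves the last committed state intact.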

The Space Maps
--------------

dm-space-map.h
dm-space-map-metadata.[hc]
dm-space-map-disk.[hc]

On-disk data structures that keep track of the reference counts of blocks.
They also act as the allocator of new blocks.  There are currently two
implementations: a simpler one for managing blocks on a different
device (e.g. thinly-provisioned data blocks), and one for managing
the metadata space.  The latter is complicated by the need to store
its own data within the space it's managing.
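
To make the dual role concrete (reference counter and allocator), here
is a minimal userspace sketch.  It assumes an in-memory array of
counts; the ``toy_sm_*`` names are hypothetical and the real interface
lives in dm-space-map.h.  A block whose count is zero is free, so
allocation is just a search for a zero count.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SM_BLOCKS 4

/* Hypothetical space map: one reference count per block. */
struct toy_sm {
	uint32_t ref_count[TOY_SM_BLOCKS];
};

/*
 * Allocate a block: find one with a zero count and set its count to
 * one.  Returns 0 and stores the block number in *b, or -1 if there
 * is no free block.
 */
static int toy_sm_new_block(struct toy_sm *sm, uint64_t *b)
{
	for (uint64_t i = 0; i < TOY_SM_BLOCKS; i++) {
		if (sm->ref_count[i] == 0) {
			sm->ref_count[i] = 1;
			*b = i;
			return 0;
		}
	}
	return -1;
}

/* Take another reference, e.g. when a block becomes shared. */
static void toy_sm_inc(struct toy_sm *sm, uint64_t b)
{
	assert(b < TOY_SM_BLOCKS);
	sm->ref_count[b]++;
}

/* Drop a reference; the block is free again once the count is zero. */
static void toy_sm_dec(struct toy_sm *sm, uint64_t b)
{
	assert(b < TOY_SM_BLOCKS && sm->ref_count[b] > 0);
	sm->ref_count[b]--;
}
```

A shared block has to be released by every holder before
``toy_sm_new_block()`` will hand it out again.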

The data structures
-------------------

dm-btree.[hc]
dm-btree-remove.c
dm-btree-spine.c
dm-btree-internal.h

Currently there is only one data structure, a hierarchical btree.
There are plans to add more.  For example, something with an
array-like interface would see a lot of use.

The btree is 'hierarchical' in that you can define it to be composed
of nested btrees, and to take multiple keys.  For example, the
thin-provisioning target uses a btree with two levels of nesting.
The first maps a device id to a mapping tree, and that in turn maps a
virtual block to a physical block.

Values stored in the btrees can have arbitrary size.  Keys are always
64 bits, although nesting allows you to use multiple keys.
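
The two-level lookup described for the thin-provisioning target can be
sketched in userspace as follows.  This is an illustration only: plain
arrays stand in for the btrees, and the ``toy_*`` types and function are
hypothetical rather than taken from dm-btree.h.  The point it shows is
that one lookup takes an array of 64-bit keys, one key per level of
nesting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bottom level: maps a virtual block to a physical block. */
struct toy_mapping {
	uint64_t virt_block;
	uint64_t phys_block;
};

/* Top level: maps a device id to that device's mapping "tree". */
struct toy_device {
	uint64_t dev_id;
	const struct toy_mapping *mappings;
	size_t nr_mappings;
};

/*
 * keys[0] selects the device, keys[1] selects the virtual block
 * within it.  Returns 0 and stores the physical block in *phys, or
 * -1 if either key is missing.
 */
static int toy_lookup(const struct toy_device *devs, size_t nr_devs,
		      const uint64_t keys[2], uint64_t *phys)
{
	assert(keys != NULL && phys != NULL);

	for (size_t d = 0; d < nr_devs; d++) {
		if (devs[d].dev_id != keys[0])
			continue;
		/* Found the nested structure; search it with the next key. */
		for (size_t m = 0; m < devs[d].nr_mappings; m++) {
			if (devs[d].mappings[m].virt_block == keys[1]) {
				*phys = devs[d].mappings[m].phys_block;
				return 0;
			}
		}
		return -1;
	}
	return -1;
}
```

A real btree would replace the linear scans with ordered node
traversal, but the key handling follows the same pattern: each level
consumes one key and yields the root of the next level, and the last
level yields the value.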