=======================================================
Configfs - Userspace-driven Kernel Object Configuration
=======================================================

Joel Becker <joel.becker@oracle.com>

Updated: 31 March 2005

Copyright (c) 2005 Oracle Corporation,
	Joel Becker <joel.becker@oracle.com>


What is configfs?
=================

configfs is a ram-based filesystem that provides the converse of
sysfs's functionality.  Where sysfs is a filesystem-based view of
kernel objects, configfs is a filesystem-based manager of kernel
objects, or config_items.

With sysfs, an object is created in kernel (for example, when a device
is discovered) and it is registered with sysfs.  Its attributes then
appear in sysfs, allowing userspace to read the attributes via
readdir(3)/read(2).  It may allow some attributes to be modified via
write(2).  The important point is that the object is created and
destroyed in kernel, the kernel controls the lifecycle of the sysfs
representation, and sysfs is merely a window on all this.

A configfs config_item is created via an explicit userspace operation:
mkdir(2).  It is destroyed via rmdir(2).  The attributes appear at
mkdir(2) time, and can be read or modified via read(2) and write(2).
As with sysfs, readdir(3) queries the list of items and/or attributes.
symlink(2) can be used to group items together.  Unlike sysfs, the
lifetime of the representation is completely driven by userspace.  The
kernel modules backing the items must respond to this.
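
The userspace half of that lifecycle can be sketched in C.  This is only
an illustration under stated assumptions: item_path(), item_create(),
dir_count_entries() and item_destroy() are hypothetical helper names, not
part of any configfs API, and the group directory passed in is whatever a
client subsystem happens to expose.  On a regular filesystem the helpers
just manage an empty directory; on configfs the same calls create an item,
list its attributes, and drop it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_PATH_MAX 4096	/* arbitrary buffer size for this sketch */

/* Build "<group>/<name>" into buf; returns 0, or -ENAMETOOLONG if it won't fit. */
int item_path(char *buf, size_t len, const char *group, const char *name)
{
	int n;

	assert(buf && group && name);
	n = snprintf(buf, len, "%s/%s", group, name);
	return (n < 0 || (size_t)n >= len) ? -ENAMETOOLONG : 0;
}

/* mkdir(2): on configfs, the item and its attribute files appear together. */
int item_create(const char *group, const char *name)
{
	char path[SKETCH_PATH_MAX];
	int ret = item_path(path, sizeof(path), group, name);

	if (ret)
		return ret;
	return mkdir(path, 0755) == 0 ? 0 : -errno;
}

/* readdir(3): count the entries in a directory, skipping "." and "..". */
int dir_count_entries(const char *path)
{
	DIR *dir = opendir(path);
	struct dirent *de;
	int count = 0;

	if (!dir)
		return -errno;
	while ((de = readdir(dir)) != NULL) {
		if (strcmp(de->d_name, ".") != 0 && strcmp(de->d_name, "..") != 0)
			count++;
	}
	closedir(dir);
	return count;
}

/* rmdir(2): on configfs, this is what tells the subsystem to drop the item. */
int item_destroy(const char *group, const char *name)
{
	char path[SKETCH_PATH_MAX];
	int ret = item_path(path, sizeof(path), group, name);

	if (ret)
		return ret;
	return rmdir(path) == 0 ? 0 : -errno;
}
```

Each helper returns a negative errno value on failure, so a caller can tell
"the item already exists" (-EEXIST) apart from "the group is not there"
(-ENOENT).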

Both sysfs and configfs can and should exist together on the same
system.  One is not a replacement for the other.

Using configfs
==============

configfs can be compiled as a module or into the kernel.  You can access
it by doing::

	mount -t configfs none /config

The configfs tree will be empty unless client modules are also loaded.
These are modules that register their item types with configfs as
subsystems.  Once a client subsystem is loaded, it will appear as a
subdirectory (or more than one) under /config.  Like sysfs, the
configfs tree is always there, whether mounted on /config or not.

An item is created via mkdir(2).  The item's attributes will also
appear at this time.  readdir(3) can determine what the attributes are,
read(2) can query their default values, and write(2) can store new
values.  Don't mix more than one attribute in one attribute file.

There are two types of configfs attributes:

* Normal attributes, which, similar to sysfs attributes, are small ASCII text
  files, with a maximum size of one page (PAGE_SIZE, 4096 on i386).  Preferably
  only one value per file should be used, and the same caveats from sysfs apply.
  Configfs expects write(2) to store the entire buffer at once.  When writing to
  normal configfs attributes, userspace processes should first read the entire
  file, modify the portions they wish to change, and then write the entire
  buffer back.

* Binary attributes, which are somewhat similar to sysfs binary attributes,
  but with a few slight changes to semantics.  The PAGE_SIZE limitation does not
  apply, but the whole binary item must fit in a single kernel vmalloc'ed buffer.
  The write(2) calls from user space are buffered, and the attribute's
  write_bin_attribute method will be invoked on the final close.  Therefore, it
  is imperative for user-space to check the return code of close(2) in order to
  verify that the operation finished successfully.
  To avoid a malicious user OOMing the kernel, there's a per-binary attribute
  maximum buffer value.
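
The two write disciplines above can be sketched from the userspace side as
follows.  The helper names attr_store() and bin_attr_store() are
hypothetical and exist only for this illustration.  The sketch assumes the
caller has already built the complete new value (for a normal attribute,
after reading and modifying the old one), and it reports failures as
negative errno values.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Normal attribute: hand the whole value to a single write(2) call.
 * A short write is reported as an error instead of being retried,
 * because the attribute expects the entire buffer at once.
 */
int attr_store(const char *path, const char *value)
{
	size_t len;
	ssize_t written;
	int fd, ret = 0;

	assert(path && value);
	len = strlen(value);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;
	written = write(fd, value, len);
	if (written < 0)
		ret = -errno;
	else if ((size_t)written != len)
		ret = -EIO;
	if (close(fd) < 0 && ret == 0)
		ret = -errno;
	return ret;
}

/*
 * Binary attribute: write(2) calls are buffered, so the data may be sent
 * in pieces, but success is only known once close(2) has returned.
 */
int bin_attr_store(const char *path, const void *data, size_t len)
{
	const char *p = data;
	int fd, ret = 0;

	assert(path && (data || len == 0));
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0 && errno == EINTR)
			continue;
		if (n <= 0) {
			ret = n < 0 ? -errno : -EIO;
			break;
		}
		p += n;
		len -= (size_t)n;
	}
	/* The final close is where a rejected blob would be reported. */
	if (close(fd) < 0 && ret == 0)
		ret = -errno;
	return ret;
}
```

The important design point is the same in both helpers: the return value of
close(2) is never discarded, so an error raised while the kernel consumes
the data is still visible to the caller.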

When an item needs to be destroyed, remove it with rmdir(2).  An
item cannot be destroyed if any other item has a link to it (via
symlink(2)).  Links can be removed via unlink(2).

Configuring FakeNBD: an Example
===============================

Imagine there's a Network Block Device (NBD) driver that allows you to
access remote block devices.  Call it FakeNBD.  FakeNBD uses configfs
for its configuration.  Obviously, there will be a nice program that
sysadmins use to configure FakeNBD, but somehow that program has to tell
the driver about it.  Here's where configfs comes in.

When the FakeNBD driver is loaded, it registers itself with configfs.
readdir(3) sees this just fine::

	# ls /config
	fakenbd

A fakenbd connection can be created with mkdir(2).  The name is
arbitrary, but likely the tool will make some use of the name.  Perhaps
it is a uuid or a disk name::

	# mkdir /config/fakenbd/disk1
	# ls /config/fakenbd/disk1
	target device rw

The target attribute contains the IP address of the server FakeNBD will
connect to.  The device attribute is the device on the server.
Predictably, the rw attribute determines whether the connection is
read-only or read-write::

	# echo 10.0.0.1 > /config/fakenbd/disk1/target
	# echo /dev/sda1 > /config/fakenbd/disk1/device
	# echo 1 > /config/fakenbd/disk1/rw

That's it.  That's all there is.  Now the device is configured, via the
shell no less.

Coding With configfs
====================

Every object in configfs is a config_item.  A config_item reflects an
object in the subsystem.  It has attributes that match values on that
object.  configfs handles the filesystem representation of that object
and its attributes, allowing the subsystem to ignore all but the
basic show/store interaction.

Items are created and destroyed inside a config_group.  A group is a
collection of items that share the same attributes and operations.
Items are created by mkdir(2) and removed by rmdir(2), but configfs
handles that.  The group has a set of operations to perform these tasks.

A subsystem is the top level of a client module.  During initialization,
the client module registers the subsystem with configfs, and the subsystem
appears as a directory at the top of the configfs filesystem.  A
subsystem is also a config_group, and can do everything a config_group
can.

struct config_item
==================

::

	struct config_item {
		char                    *ci_name;
		char                    ci_namebuf[UOBJ_NAME_LEN];
		struct kref             ci_kref;
		struct list_head        ci_entry;
		struct config_item      *ci_parent;
		struct config_group     *ci_group;
		struct config_item_type *ci_type;
		struct dentry           *ci_dentry;
	};

	void config_item_init(struct config_item *);
	void config_item_init_type_name(struct config_item *,
					const char *name,
					struct config_item_type *type);
	struct config_item *config_item_get(struct config_item *);
	void config_item_put(struct config_item *);

Generally, struct config_item is embedded in a container structure, a
structure that actually represents what the subsystem is doing.  The
config_item portion of that structure is how the object interacts with
configfs.

Whether statically defined in a source file or created by a parent
config_group, a config_item must have one of the _init() functions
called on it.  This initializes the reference count and sets up the
appropriate fields.

All users of a config_item should have a reference on it via
config_item_get(), and drop the reference when they are done via
config_item_put().

By itself, a config_item cannot do much more than appear in configfs.
Usually a subsystem wants the item to display and/or store attributes,
among other things.  For that, it needs a type.

struct config_item_type
=======================

::

	struct configfs_item_operations {
		void (*release)(struct config_item *);
		int (*allow_link)(struct config_item *src,
				  struct config_item *target);
		void (*drop_link)(struct config_item *src,
				 struct config_item *target);
	};

	struct config_item_type {
		struct module                           *ct_owner;
		struct configfs_item_operations         *ct_item_ops;
		struct configfs_group_operations        *ct_group_ops;
		struct configfs_attribute               **ct_attrs;
		struct configfs_bin_attribute		**ct_bin_attrs;
	};

The most basic function of a config_item_type is to define what
operations can be performed on a config_item.  All items that have been
allocated dynamically will need to provide the ct_item_ops->release()
method.  This method is called when the config_item's reference count
reaches zero.
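
The interplay of an embedded item, the _init() step, get/put, and
release() can be modeled in plain userspace C.  This is a sketch under
loud assumptions, not kernel code: struct model_item is a stand-in for
struct config_item that keeps only a counter and a release callback, all
model_* names are hypothetical, and the plain int counter is not
thread-safe the way the kernel's kref is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for struct config_item (hypothetical, heavily simplified). */
struct model_item {
	int refcount;
	void (*release)(struct model_item *item);
};

/* Container structure: the object the subsystem actually cares about. */
struct model_disk {
	int read_write;
	struct model_item item;	/* embedded, like a config_item */
};

/* Recover the container from the embedded item, as container_of() would. */
struct model_disk *to_model_disk(struct model_item *item)
{
	return (struct model_disk *)((char *)item -
				     offsetof(struct model_disk, item));
}

/* Models the _init() step: the creator holds the first reference. */
void model_item_init(struct model_item *item,
		     void (*release)(struct model_item *))
{
	item->refcount = 1;
	item->release = release;
}

/* Models config_item_get(): every user takes its own reference. */
struct model_item *model_item_get(struct model_item *item)
{
	item->refcount++;
	return item;
}

/* Models config_item_put(): the last put triggers release(). */
void model_item_put(struct model_item *item)
{
	assert(item->refcount > 0);
	if (--item->refcount == 0 && item->release)
		item->release(item);
}

int model_disks_released;	/* counts release() calls, for illustration */

/* Models ct_item_ops->release(): free the container, not just the item. */
void model_disk_release(struct model_item *item)
{
	model_disks_released++;
	free(to_model_disk(item));
}

/* Allocate a container and initialize its embedded item. */
struct model_disk *model_disk_alloc(void)
{
	struct model_disk *disk = calloc(1, sizeof(*disk));

	if (disk)
		model_item_init(&disk->item, model_disk_release);
	return disk;
}
```

The point of the model is that release() receives only the embedded item
and must map it back to the container before freeing, which is why a
dynamically allocated item cannot do without it.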

struct configfs_attribute
=========================

::

	struct configfs_attribute {
		char                    *ca_name;
		struct module           *ca_owner;
		umode_t                  ca_mode;
		ssize_t (*show)(struct config_item *, char *);
		ssize_t (*store)(struct config_item *, const char *, size_t);
	};

When a config_item wants an attribute to appear as a file in the item's
configfs directory, it must define a configfs_attribute describing it.
It then adds the attribute to the NULL-terminated array
config_item_type->ct_attrs.  When the item appears in configfs, the
attribute file will appear with the configfs_attribute->ca_name
filename.  configfs_attribute->ca_mode specifies the file permissions.

If an attribute is readable and provides a ->show method, that method will
be called whenever userspace asks for a read(2) on the attribute.  If an
attribute is writable and provides a ->store method, that method will be
called whenever userspace asks for a write(2) on the attribute.
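
What a ->show/->store pair typically does for a single integer value can
be modeled in userspace C.  This is a sketch only: struct model_child, the
storeme field, MODEL_PAGE_SIZE and the model_storeme_*() functions are
hypothetical stand-ins, and the functions take the object directly instead
of a struct config_item.  Kernel code would use the kernel's own parsing
and formatting helpers rather than strtol() and snprintf().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

#define MODEL_PAGE_SIZE 4096	/* stand-in for PAGE_SIZE */

/* Stand-in for the object behind an item (hypothetical). */
struct model_child {
	int storeme;
};

/* Models ->show(): format the value as text into a page-sized buffer. */
ssize_t model_storeme_show(const struct model_child *child, char *page)
{
	assert(child && page);
	return snprintf(page, MODEL_PAGE_SIZE, "%d\n", child->storeme);
}

/*
 * Models ->store(): parse the whole buffer written by userspace.
 * Returns the number of bytes consumed, or a negative errno value.
 */
ssize_t model_storeme_store(struct model_child *child, const char *page,
			    size_t count)
{
	char buf[32];
	char *end;
	long val;

	assert(child && page);
	if (count == 0 || count >= sizeof(buf))
		return -EINVAL;
	memcpy(buf, page, count);
	buf[count] = '\0';

	errno = 0;
	val = strtol(buf, &end, 10);
	if (end == buf || errno == ERANGE || val < INT_MIN || val > INT_MAX)
		return -EINVAL;
	if (*end == '\n')
		end++;		/* tolerate the newline that echo appends */
	if (*end != '\0')
		return -EINVAL;

	child->storeme = (int)val;
	return (ssize_t)count;
}
```

A rejected write leaves the stored value untouched, so a later read still
shows the last value that was accepted.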

struct configfs_bin_attribute
=============================

::

	struct configfs_bin_attribute {
		struct configfs_attribute	cb_attr;
		void				*cb_private;
		size_t				cb_max_size;
	};

A binary attribute is used when a binary blob needs to appear as the
contents of a file in the item's configfs directory.
To do so, add the binary attribute to the NULL-terminated array
config_item_type->ct_bin_attrs.  When the item appears in configfs, the
attribute file will appear with the configfs_bin_attribute->cb_attr.ca_name
filename.  configfs_bin_attribute->cb_attr.ca_mode specifies the file
permissions.
The cb_private member is provided for use by the driver, while the
cb_max_size member specifies the maximum amount of vmalloc buffer
to be used.

If a binary attribute is readable and the config_item provides a
ct_item_ops->read_bin_attribute() method, that method will be called
whenever userspace asks for a read(2) on the attribute.  The converse
will happen for write(2).  The reads/writes are buffered so only a
single read/write will occur; the attribute's methods need not concern
themselves with it.

struct config_group
===================

A config_item cannot live in a vacuum.  The only way one can be created
is via mkdir(2) on a config_group.  This will trigger creation of a
child item::

	struct config_group {
		struct config_item		cg_item;
		struct list_head		cg_children;
		struct configfs_subsystem	*cg_subsys;
		struct list_head		default_groups;
		struct list_head		group_entry;
	};

	void config_group_init(struct config_group *group);
	void config_group_init_type_name(struct config_group *group,
					 const char *name,
					 struct config_item_type *type);


The config_group structure contains a config_item.  Properly configuring
that item means that a group can behave as an item in its own right.
However, it can do more: it can create child items or groups.  This is
accomplished via the group operations specified on the group's
config_item_type::

	struct configfs_group_operations {
		struct config_item *(*make_item)(struct config_group *group,
						 const char *name);
		struct config_group *(*make_group)(struct config_group *group,
						   const char *name);
		void (*disconnect_notify)(struct config_group *group,
					  struct config_item *item);
		void (*drop_item)(struct config_group *group,
				  struct config_item *item);
	};

A group creates child items by providing the
ct_group_ops->make_item() method.  If provided, this method is called from
mkdir(2) in the group's directory.  The subsystem allocates a new
config_item (or more likely, its container structure), initializes it,
and returns it to configfs.  Configfs will then populate the filesystem
tree to reflect the new item.

If the subsystem wants the child to be a group itself, the subsystem
provides ct_group_ops->make_group().  Everything else behaves the same,
using the group _init() functions on the group.

Finally, when userspace calls rmdir(2) on the item or group,
ct_group_ops->drop_item() is called.  As a config_group is also a
config_item, a separate drop_group() method is not necessary.
The subsystem must config_item_put() the reference that was initialized
upon item allocation.  If a subsystem has no work to do, it may omit
the ct_group_ops->drop_item() method, and configfs will call
config_item_put() on the item on behalf of the subsystem.
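
The division of labour between configfs and the subsystem during mkdir(2)
and rmdir(2) can be modeled in userspace C.  Everything here is a
hypothetical stand-in: struct toy_item and struct toy_group_ops only
imitate config_item and configfs_group_operations, and toy_mkdir() and
toy_rmdir() are a rough model of the behaviour described above, not the
actual configfs implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for struct config_item (hypothetical). */
struct toy_item {
	char name[32];
	int refcount;
};

/* Stand-in for the make_item/drop_item part of the group operations. */
struct toy_group_ops {
	struct toy_item *(*make_item)(const char *name);
	void (*drop_item)(struct toy_item *item);
};

int toy_items_live;	/* allocated but not yet freed, for illustration */

/* Models config_item_put(); the last put frees the item. */
void toy_item_put(struct toy_item *item)
{
	assert(item->refcount > 0);
	if (--item->refcount == 0) {
		toy_items_live--;
		free(item);
	}
}

/* Subsystem side: allocate, initialize, and hand the item back. */
struct toy_item *toy_make_item(const char *name)
{
	struct toy_item *item;

	if (strlen(name) >= sizeof(item->name))
		return NULL;
	item = calloc(1, sizeof(*item));
	if (!item)
		return NULL;
	strcpy(item->name, name);
	item->refcount = 1;	/* the reference dropped in drop_item() */
	toy_items_live++;
	return item;
}

/* Subsystem side: drop the reference taken at allocation. */
void toy_drop_item(struct toy_item *item)
{
	toy_item_put(item);
}

/* Filesystem side of mkdir(2): fails if the group cannot make items. */
struct toy_item *toy_mkdir(const struct toy_group_ops *ops, const char *name)
{
	return ops->make_item ? ops->make_item(name) : NULL;
}

/* Filesystem side of rmdir(2): without drop_item(), do the put directly. */
void toy_rmdir(const struct toy_group_ops *ops, struct toy_item *item)
{
	if (ops->drop_item)
		ops->drop_item(item);
	else
		toy_item_put(item);
}
```

Either way, exactly one put balances the reference created at allocation,
which is why a drop_item() method that forgets its put would leak the item.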

Important:
   drop_item() is void, and as such cannot fail.  When rmdir(2)
   is called, configfs WILL remove the item from the filesystem tree
   (assuming that it has no children to keep it busy).  The subsystem is
   responsible for responding to this.  If the subsystem has references to
   the item in other threads, the memory is safe.  It may take some time
   for the item to actually disappear from the subsystem's usage.  But it
   is gone from configfs.

When drop_item() is called, the item's linkage has already been torn
down.  It no longer has a reference on its parent and has no place in
the item hierarchy.  If a client needs to do some cleanup before this
teardown happens, the subsystem can implement the
ct_group_ops->disconnect_notify() method.  The method is called after
configfs has removed the item from the filesystem view but before the
item is removed from its parent group.  Like drop_item(),
disconnect_notify() is void and cannot fail.  Client subsystems should
not drop any references here, as they still must do it in drop_item().

A config_group cannot be removed while it still has child items.  This
is implemented in the configfs rmdir(2) code.  ->drop_item() will not be
called, as the item has not been dropped.  rmdir(2) will fail, as the
directory is not empty.

struct configfs_subsystem
=========================

A subsystem must register itself, usually at module_init time.  This
tells configfs to make the subsystem appear in the file tree::

	struct configfs_subsystem {
		struct config_group	su_group;
		struct mutex		su_mutex;
	};

	int configfs_register_subsystem(struct configfs_subsystem *subsys);
	void configfs_unregister_subsystem(struct configfs_subsystem *subsys);

A subsystem consists of a toplevel config_group and a mutex.
The group is where child config_items are created.  For a subsystem,
this group is usually defined statically.  Before calling
configfs_register_subsystem(), the subsystem must have initialized the
group via the usual group _init() functions, and it must also have
initialized the mutex.

When the register call returns, the subsystem is live, and it
will be visible via configfs.  At that point, mkdir(2) can be called and
the subsystem must be ready for it.

An Example
==========

The best example of these basic concepts is the simple_children
subsystem/group and the simple_child item in
samples/configfs/configfs_sample.c. It shows a trivial object displaying
and storing an attribute, and a simple group creating and destroying
these children.

Hierarchy Navigation and the Subsystem Mutex
============================================

There is an extra bonus that configfs provides.  The config_groups and
config_items are arranged in a hierarchy due to the fact that they
appear in a filesystem.  A subsystem is NEVER to touch the filesystem
parts, but the subsystem might be interested in this hierarchy.  For
this reason, the hierarchy is mirrored via the config_group->cg_children
and config_item->ci_parent structure members.

A subsystem can navigate the cg_children list and the ci_parent pointer
to see the tree created by the subsystem.  This can race with configfs'
management of the hierarchy, so configfs uses the subsystem mutex to
protect modifications.  Whenever a subsystem wants to navigate the
hierarchy, it must do so under the protection of the subsystem
mutex.

A subsystem will be prevented from acquiring the mutex while a newly
allocated item has not been linked into this hierarchy.  Similarly, it
will not be able to acquire the mutex while a dropping item has not
yet been unlinked.  This means that an item's ci_parent pointer will
never be NULL while the item is in configfs, and that an item will only
be in its parent's cg_children list for the same duration.  This allows
a subsystem to trust ci_parent and cg_children while it holds the
mutex.

Item Aggregation Via symlink(2)
===============================

configfs provides a simple group via the group->item parent/child
relationship.  Often, however, a larger environment requires aggregation
outside of the parent/child connection.  This is implemented via
symlink(2).

A config_item may provide the ct_item_ops->allow_link() and
ct_item_ops->drop_link() methods.  If the ->allow_link() method exists,
symlink(2) may be called with the config_item as the source of the link.
These links are only allowed between configfs config_items.  Any
symlink(2) attempt outside the configfs filesystem will be denied.

When symlink(2) is called, the source config_item's ->allow_link()
method is called with itself and a target item.  If the source item
allows linking to target item, it returns 0.  A source item may wish to
reject a link if it only wants links to a certain type of object (say,
in its own subsystem).

When unlink(2) is called on the symbolic link, the source item is
notified via the ->drop_link() method.  Like the ->drop_item() method,
this is a void function and cannot return failure.  The subsystem is
responsible for responding to the change.

A config_item cannot be removed while it links to any other item, nor
can it be removed while an item links to it.  Dangling symlinks are not
allowed in configfs.
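
From userspace, aggregation is just symlink(2) and unlink(2) inside the
source item's directory.  The following C sketch shows that side only; the
helper names item_link(), item_is_link() and item_unlink() are hypothetical,
and the directories passed in are assumed to be item directories chosen by
the caller.  On configfs, the symlink(2) call is the point at which the
source item's ->allow_link() method gets to accept or reject the link, so
item_link() may fail there even though it succeeds on a regular filesystem.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

#define LINK_PATH_MAX 4096	/* arbitrary buffer size for this sketch */

/* Build "<dir>/<name>"; returns 0, or -ENAMETOOLONG if it won't fit. */
int link_path(char *buf, size_t len, const char *dir, const char *name)
{
	int n;

	assert(buf && dir && name);
	n = snprintf(buf, len, "%s/%s", dir, name);
	return (n < 0 || (size_t)n >= len) ? -ENAMETOOLONG : 0;
}

/* Create src_dir/link_name pointing at target_dir. */
int item_link(const char *src_dir, const char *link_name,
	      const char *target_dir)
{
	char path[LINK_PATH_MAX];
	int ret = link_path(path, sizeof(path), src_dir, link_name);

	if (ret)
		return ret;
	return symlink(target_dir, path) == 0 ? 0 : -errno;
}

/* Returns 1 if src_dir/link_name is a symlink, 0 if not, or -errno. */
int item_is_link(const char *src_dir, const char *link_name)
{
	char path[LINK_PATH_MAX];
	struct stat st;
	int ret = link_path(path, sizeof(path), src_dir, link_name);

	if (ret)
		return ret;
	if (lstat(path, &st) < 0)
		return -errno;
	return S_ISLNK(st.st_mode) ? 1 : 0;
}

/* Remove the link; on configfs this is what triggers ->drop_link(). */
int item_unlink(const char *src_dir, const char *link_name)
{
	char path[LINK_PATH_MAX];
	int ret = link_path(path, sizeof(path), src_dir, link_name);

	if (ret)
		return ret;
	return unlink(path) == 0 ? 0 : -errno;
}
```

Because an item cannot be removed while links exist, a tool built on these
helpers would call item_unlink() for every link before attempting rmdir(2)
on either end.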

Automatically Created Subgroups
===============================

A new config_group may want to have two types of child config_items.
While this could be codified by magic names in ->make_item(), it is much
more explicit to have a method whereby userspace sees this divergence.

Rather than have a group where some items behave differently than
others, configfs provides a method whereby one or many subgroups are
automatically created inside the parent at its creation.  Thus,
mkdir("parent") results in "parent", "parent/subgroup1", up through
"parent/subgroupN".  Items of type 1 can now be created in
"parent/subgroup1", and items of type N can be created in
"parent/subgroupN".

These automatic subgroups, or default groups, do not preclude other
children of the parent group.  If ct_group_ops->make_group() exists,
other child groups can be created on the parent group directly.

A configfs subsystem specifies default groups by adding them to the
parent config_group structure with the configfs_add_default_group()
function.  Each added group is populated in the configfs tree at the
same time as the parent group.  Similarly, they are removed at the same
time as the parent.  No extra notification is provided.  When a
->drop_item() method call notifies the subsystem that the parent group
is going away, it also means that every default group child associated
with that parent group is going away.

As a consequence of this, default groups cannot be removed directly via
rmdir(2).  They are also not considered when rmdir(2) on the parent
group is checking for children.
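The rmdir(2) rules for default groups can be illustrated with a small
userspace model.  Everything here is hypothetical: struct model_group
and the model_*() functions are invented for the sketch, the error
codes are illustrative, and the choice to look inside default groups
for user-created children is an assumption of this model rather than
something stated above.  Only the argument order of
model_add_default_group() is borrowed from configfs_add_default_group(),
which takes the new group first and the parent second.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_CHILDREN 8

/* Userspace model only: NOT the kernel's struct config_group. */
struct model_group {
	const char *name;
	struct model_group *parent;
	int is_default;            /* created together with its parent */
	struct model_group *children[MODEL_MAX_CHILDREN];
	int nr_children;
};

static int model_attach(struct model_group *child,
			struct model_group *parent, int is_default)
{
	if (parent->nr_children >= MODEL_MAX_CHILDREN)
		return -ENOSPC;
	child->parent = parent;
	child->is_default = is_default;
	parent->children[parent->nr_children++] = child;
	return 0;
}

/* Stand-in for registering a default group on its parent. */
int model_add_default_group(struct model_group *new_group,
			    struct model_group *parent)
{
	return model_attach(new_group, parent, 1);
}

/* Stand-in for a user mkdir(2) handled by ->make_group(). */
int model_mkdir_child(struct model_group *child, struct model_group *parent)
{
	return model_attach(child, parent, 0);
}

/* Default groups are skipped, but (assumption) their contents still count. */
static int model_has_user_children(const struct model_group *g)
{
	int i;

	for (i = 0; i < g->nr_children; i++) {
		const struct model_group *c = g->children[i];

		if (!c->is_default || model_has_user_children(c))
			return 1;
	}
	return 0;
}

/* Models rmdir(2) on a group. */
int model_group_rmdir(const struct model_group *g)
{
	if (g->is_default)
		return -EPERM;     /* goes away only with its parent */
	if (model_has_user_children(g))
		return -ENOTEMPTY; /* user-created children block removal */
	return 0;
}
```

With one default group attached, model_group_rmdir() on the parent
still succeeds, while the same call on the default group itself is
refused.  Once a child is added through model_mkdir_child(), removal of
the parent is refused until that child is gone.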

Dependent Subsystems
====================

Sometimes other drivers depend on particular configfs items.  For
example, ocfs2 mounts depend on a heartbeat region item.  If that
region item is removed with rmdir(2), the ocfs2 mount must BUG or go
read-only.  Not happy.

configfs provides two additional API calls: configfs_depend_item() and
configfs_undepend_item().  A client driver can call
configfs_depend_item() on an existing item to tell configfs that it is
depended on.  configfs will then return -EBUSY from rmdir(2) for that
item.  When the item is no longer depended on, the client driver calls
configfs_undepend_item() on it.

These APIs cannot be called from underneath any configfs callbacks, as
they will conflict.  They can block and allocate.  A client driver
probably shouldn't be calling them of its own gumption.  Rather, it
should provide an API that external subsystems call.

How does this work?  Imagine the ocfs2 mount process.  When it mounts,
it asks for a heartbeat region item.  This is done via a call into the
heartbeat code.  Inside the heartbeat code, the region item is looked
up.  Here, the heartbeat code calls configfs_depend_item().  If it
succeeds, then heartbeat knows the region is safe to give to ocfs2.
If it fails, the region was being torn down anyway, and heartbeat can
gracefully pass up an error.
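The mount scenario above can be walked through with a small userspace
model of the dependency count.  This is a sketch under stated
assumptions, not the kernel implementation: struct model_region and all
model_*() functions are hypothetical, the model_hb_*() pair stands in
for whatever API the heartbeat code would export, and -ENOENT is an
illustrative choice for the failure case.  Only the -EBUSY result of
rmdir(2) on a depended-on item comes from the text.

```c
#include <errno.h>

/* Userspace model only: NOT a kernel structure. */
struct model_region {
	int being_removed;   /* set once teardown has started */
	int dependent_count; /* outstanding depend calls */
};

/* Stand-in for configfs_depend_item(): pin the item or report failure. */
int model_depend_item(struct model_region *r)
{
	if (r->being_removed)
		return -ENOENT;      /* illustrative: item is going away */
	r->dependent_count++;
	return 0;
}

/* Stand-in for configfs_undepend_item(): drop one pin. */
void model_undepend_item(struct model_region *r)
{
	r->dependent_count--;
}

/* Models rmdir(2) on the region item. */
int model_region_rmdir(struct model_region *r)
{
	if (r->dependent_count > 0)
		return -EBUSY;       /* still depended on */
	r->being_removed = 1;
	return 0;
}

/*
 * Hypothetical heartbeat-side API.  The mounting filesystem calls this
 * instead of touching the dependency calls itself.
 */
int model_hb_get_region(struct model_region *r)
{
	return model_depend_item(r);
}

void model_hb_put_region(struct model_region *r)
{
	model_undepend_item(r);
}
```

A mount in this model calls model_hb_get_region() and proceeds only on
success.  While it holds the region, model_region_rmdir() is refused.
After model_hb_put_region() the removal goes through, and a later
model_hb_get_region() fails so the caller can pass the error up.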