=======================================================
Configfs - Userspace-driven Kernel Object Configuration
=======================================================

Joel Becker <joel.becker@oracle.com>

Updated: 31 March 2005

Copyright (c) 2005 Oracle Corporation,
	Joel Becker <joel.becker@oracle.com>


What is configfs?
=================

configfs is a ram-based filesystem that provides the converse of
sysfs's functionality. Where sysfs is a filesystem-based view of
kernel objects, configfs is a filesystem-based manager of kernel
objects, or config_items.

With sysfs, an object is created in kernel (for example, when a device
is discovered) and it is registered with sysfs. Its attributes then
appear in sysfs, allowing userspace to read the attributes via
readdir(3)/read(2). It may allow some attributes to be modified via
write(2). The important point is that the object is created and
destroyed in kernel, the kernel controls the lifecycle of the sysfs
representation, and sysfs is merely a window on all this.

A configfs config_item is created via an explicit userspace operation:
mkdir(2). It is destroyed via rmdir(2). The attributes appear at
mkdir(2) time, and can be read or modified via read(2) and write(2).
As with sysfs, readdir(3) queries the list of items and/or attributes.
symlink(2) can be used to group items together. Unlike sysfs, the
lifetime of the representation is completely driven by userspace. The
kernel modules backing the items must respond to this.

Both sysfs and configfs can and should exist together on the same
system. One is not a replacement for the other.

Using configfs
==============

configfs can be compiled as a module or into the kernel. You can access
it by doing::

	mount -t configfs none /config

The configfs tree will be empty unless client modules are also loaded.
These are modules that register their item types with configfs as
subsystems. Once a client subsystem is loaded, it will appear as a
subdirectory (or more than one) under /config. Like sysfs, the
configfs tree is always there, whether mounted on /config or not.

An item is created via mkdir(2).
The item's attributes will also
appear at this time. readdir(3) can determine what the attributes are,
read(2) can query their default values, and write(2) can store new
values. Don't mix more than one attribute in one attribute file.

There are two types of configfs attributes:

* Normal attributes, which, similar to sysfs attributes, are small ASCII text
  files with a maximum size of one page (PAGE_SIZE, 4096 on i386). Preferably
  only one value per file should be used, and the same caveats from sysfs apply.
  Configfs expects write(2) to store the entire buffer at once. When writing to
  normal configfs attributes, userspace processes should first read the entire
  file, modify the portions they wish to change, and then write the entire
  buffer back.

* Binary attributes, which are somewhat similar to sysfs binary attributes,
  but with a few slight changes to semantics. The PAGE_SIZE limitation does not
  apply, but the whole binary item must fit in a single kernel vmalloc'ed buffer.
  The write(2) calls from user space are buffered, and the attribute's
  write_bin_attribute method will be invoked on the final close. It is therefore
  imperative for user space to check the return code of close(2) in order to
  verify that the operation finished successfully.
  To avoid a malicious user OOMing the kernel, there's a per-binary attribute
  maximum buffer value.

When an item needs to be destroyed, remove it with rmdir(2). An
item cannot be destroyed if any other item has a link to it (via
symlink(2)). Links can be removed via unlink(2).

Configuring FakeNBD: an Example
===============================

Imagine there's a Network Block Device (NBD) driver that allows you to
access remote block devices. Call it FakeNBD. FakeNBD uses configfs
for its configuration. Obviously, there will be a nice program that
sysadmins use to configure FakeNBD, but somehow that program has to tell
the driver about it. Here's where configfs comes in.

When the FakeNBD driver is loaded, it registers itself with configfs.
readdir(3) sees this just fine::

	# ls /config
	fakenbd

A fakenbd connection can be created with mkdir(2). The name is
arbitrary, but likely the tool will make some use of the name.
Perhaps
it is a uuid or a disk name::

	# mkdir /config/fakenbd/disk1
	# ls /config/fakenbd/disk1
	target device rw

The target attribute contains the IP address of the server FakeNBD will
connect to. The device attribute is the device on the server.
Predictably, the rw attribute determines whether the connection is
read-only or read-write::

	# echo 10.0.0.1 > /config/fakenbd/disk1/target
	# echo /dev/sda1 > /config/fakenbd/disk1/device
	# echo 1 > /config/fakenbd/disk1/rw

That's it. That's all there is. Now the device is configured, via the
shell no less.

Coding With configfs
====================

Every object in configfs is a config_item. A config_item reflects an
object in the subsystem. It has attributes that match values on that
object. configfs handles the filesystem representation of that object
and its attributes, allowing the subsystem to ignore all but the
basic show/store interaction.

Items are created and destroyed inside a config_group.
A group is a
collection of items that share the same attributes and operations.
Items are created by mkdir(2) and removed by rmdir(2), but configfs
handles that. The group has a set of operations to perform these tasks.

A subsystem is the top level of a client module. During initialization,
the client module registers the subsystem with configfs; the subsystem
then appears as a directory at the top of the configfs filesystem. A
subsystem is also a config_group, and can do everything a config_group
can.

struct config_item
==================

::

	struct config_item {
		char *ci_name;
		char ci_namebuf[UOBJ_NAME_LEN];
		struct kref ci_kref;
		struct list_head ci_entry;
		struct config_item *ci_parent;
		struct config_group *ci_group;
		struct config_item_type *ci_type;
		struct dentry *ci_dentry;
	};

	void config_item_init(struct config_item *);
	void config_item_init_type_name(struct config_item *,
					const char *name,
					struct config_item_type *type);
	struct config_item *config_item_get(struct config_item *);
	void config_item_put(struct config_item *);

Generally, struct config_item is embedded in a container structure, a
structure that actually represents what the subsystem is doing. The
config_item portion of that structure is how the object interacts with
configfs.

Whether statically defined in a source file or created by a parent
config_group, a config_item must have one of the _init() functions
called on it. This initializes the reference count and sets up the
appropriate fields.

All users of a config_item should have a reference on it via
config_item_get(), and drop the reference when they are done via
config_item_put().

By itself, a config_item cannot do much more than appear in configfs.
Usually a subsystem wants the item to display and/or store attributes,
among other things. For that, it needs a type.
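
The container pattern described in this section can be sketched in
plain userspace C. This is only an illustration under stated
assumptions: struct config_item here is a cut-down stand-in rather than
the kernel definition, container_of() is redefined with offsetof() so
the example builds without kernel headers, and struct fakenbd_conn and
to_fakenbd_conn() are hypothetical names invented for the example.

```c
#include <stddef.h>

/* Stand-in for the kernel's struct config_item (only the name is modeled). */
struct config_item {
	const char *ci_name;
};

/* Userspace equivalent of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical container: the object the subsystem really cares about. */
struct fakenbd_conn {
	int rw;                  /* subsystem-private state */
	struct config_item item; /* how configfs sees this object */
};

/* Recover the container from the config_item that configfs hands back. */
static struct fakenbd_conn *to_fakenbd_conn(struct config_item *item)
{
	return container_of(item, struct fakenbd_conn, item);
}
```

configfs callbacks only ever receive the embedded config_item, so a
helper of this shape is typically the first call in each callback.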

struct config_item_type
=======================

::

	struct configfs_item_operations {
		void (*release)(struct config_item *);
		int (*allow_link)(struct config_item *src,
				  struct config_item *target);
		void (*drop_link)(struct config_item *src,
				  struct config_item *target);
	};

	struct config_item_type {
		struct module *ct_owner;
		struct configfs_item_operations *ct_item_ops;
		struct configfs_group_operations *ct_group_ops;
		struct configfs_attribute **ct_attrs;
		struct configfs_bin_attribute **ct_bin_attrs;
	};

The most basic function of a config_item_type is to define what
operations can be performed on a config_item. All items that have been
allocated dynamically will need to provide the ct_item_ops->release()
method. This method is called when the config_item's reference count
reaches zero.
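
The relationship between the reference count and release() can be
modeled in userspace as follows. This is a sketch, not kernel code: a
plain int stands in for struct kref, the ops pointer is stored directly
on the item instead of being reached through ci_type->ct_item_ops, and
item_alloc(), item_get(), item_put() and demo_release() are hypothetical
names that only mimic what config_item_get() and config_item_put() are
described as doing above.

```c
#include <stdlib.h>

struct config_item;

/* Stand-in for the release member of configfs_item_operations. */
struct configfs_item_operations {
	void (*release)(struct config_item *);
};

struct config_item {
	int refcount; /* the kernel uses struct kref ci_kref */
	const struct configfs_item_operations *ops;
};

static int released; /* counts release() calls, for demonstration only */

/* Hypothetical release method: free the dynamically allocated item. */
static void demo_release(struct config_item *item)
{
	released++;
	free(item);
}

static const struct configfs_item_operations demo_ops = {
	.release = demo_release,
};

/* Model of config_item_get(): take one more reference. */
static struct config_item *item_get(struct config_item *item)
{
	item->refcount++;
	return item;
}

/* Model of config_item_put(): drop a reference, release at zero. */
static void item_put(struct config_item *item)
{
	if (--item->refcount == 0 && item->ops && item->ops->release)
		item->ops->release(item);
}

/* Allocate an item that starts out holding its initial reference. */
static struct config_item *item_alloc(void)
{
	struct config_item *item = calloc(1, sizeof(*item));

	if (!item)
		return NULL;
	item->refcount = 1;
	item->ops = &demo_ops;
	return item;
}
```

The item is freed only by the final put, whichever user performs it,
which is why a dynamically allocated item needs a release() method.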

struct configfs_attribute
=========================

::

	struct configfs_attribute {
		char *ca_name;
		struct module *ca_owner;
		umode_t ca_mode;
		ssize_t (*show)(struct config_item *, char *);
		ssize_t (*store)(struct config_item *, const char *, size_t);
	};

When a config_item wants an attribute to appear as a file in the item's
configfs directory, it must define a configfs_attribute describing it.
It then adds the attribute to the NULL-terminated array
config_item_type->ct_attrs. When the item appears in configfs, the
attribute file will appear with the configfs_attribute->ca_name
filename. configfs_attribute->ca_mode specifies the file permissions.

If an attribute is readable and provides a ->show method, that method will
be called whenever userspace asks for a read(2) on the attribute. If an
attribute is writable and provides a ->store method, that method will be
called whenever userspace asks for a write(2) on the attribute.
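
A show/store pair for a single integer attribute might look roughly like
the userspace model below. Several things here are assumptions made for
the sketch: the functions take a hypothetical struct demo_item directly
rather than a struct config_item, DEMO_PAGE_SIZE is an assumed buffer
size, and the store buffer is assumed to be NUL-terminated.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

#define DEMO_PAGE_SIZE 4096 /* assumed page size for this sketch */

/* Hypothetical item with a single integer attribute. */
struct demo_item {
	long storeme;
};

/* show: format the current value into the page-sized buffer. */
static ssize_t demo_storeme_show(struct demo_item *item, char *page)
{
	return snprintf(page, DEMO_PAGE_SIZE, "%ld\n", item->storeme);
}

/* store: parse the whole buffer as one value; return bytes consumed. */
static ssize_t demo_storeme_store(struct demo_item *item,
				  const char *page, size_t count)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(page, &end, 10);
	/* Reject overflow, empty input, and trailing junk. */
	if (errno || end == page || (*end != '\0' && *end != '\n'))
		return -EINVAL;
	item->storeme = val;
	return (ssize_t)count;
}
```

Returning the full count from store matches the expectation stated
earlier that write(2) stores the entire buffer at once.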

struct configfs_bin_attribute
=============================

::

	struct configfs_bin_attribute {
		struct configfs_attribute cb_attr;
		void *cb_private;
		size_t cb_max_size;
	};

A binary attribute is used when a binary blob needs to appear as the
contents of a file in the item's configfs directory.
To do so, add the binary attribute to the NULL-terminated array
config_item_type->ct_bin_attrs. When the item appears in configfs, the
attribute file will appear with the configfs_bin_attribute->cb_attr.ca_name
filename. configfs_bin_attribute->cb_attr.ca_mode specifies the file
permissions.
The cb_private member is provided for use by the driver, while the
cb_max_size member specifies the maximum amount of vmalloc buffer
to be used.

If a binary attribute is readable and the config_item provides a
ct_item_ops->read_bin_attribute() method, that method will be called
whenever userspace asks for a read(2) on the attribute. The converse
will happen for write(2).
The reads/writes are buffered so only a
single read/write will occur; the attribute need not concern itself
with it.

struct config_group
===================

A config_item cannot live in a vacuum. The only way one can be created
is via mkdir(2) on a config_group. This will trigger creation of a
child item::

	struct config_group {
		struct config_item cg_item;
		struct list_head cg_children;
		struct configfs_subsystem *cg_subsys;
		struct list_head default_groups;
		struct list_head group_entry;
	};

	void config_group_init(struct config_group *group);
	void config_group_init_type_name(struct config_group *group,
					 const char *name,
					 struct config_item_type *type);


The config_group structure contains a config_item. Properly configuring
that item means that a group can behave as an item in its own right.
However, it can do more: it can create child items or groups.
This is
accomplished via the group operations specified on the group's
config_item_type::

	struct configfs_group_operations {
		struct config_item *(*make_item)(struct config_group *group,
						 const char *name);
		struct config_group *(*make_group)(struct config_group *group,
						   const char *name);
		void (*disconnect_notify)(struct config_group *group,
					  struct config_item *item);
		void (*drop_item)(struct config_group *group,
				  struct config_item *item);
	};

A group creates child items by providing the
ct_group_ops->make_item() method. If provided, this method is called from
mkdir(2) in the group's directory. The subsystem allocates a new
config_item (or more likely, its container structure), initializes it,
and returns it to configfs. Configfs will then populate the filesystem
tree to reflect the new item.

If the subsystem wants the child to be a group itself, the subsystem
provides ct_group_ops->make_group(). Everything else behaves the same,
using the group _init() functions on the group.

Finally, when userspace calls rmdir(2) on the item or group,
ct_group_ops->drop_item() is called.
As a config_group is also a
config_item, a separate drop_group() method is not necessary.
The subsystem must config_item_put() the reference that was initialized
upon item allocation. If a subsystem has no work to do, it may omit
the ct_group_ops->drop_item() method, and configfs will call
config_item_put() on the item on behalf of the subsystem.

Important:
   drop_item() is void, and as such cannot fail. When rmdir(2)
   is called, configfs WILL remove the item from the filesystem tree
   (assuming that it has no children to keep it busy). The subsystem is
   responsible for responding to this. If the subsystem has references to
   the item in other threads, the memory is safe. It may take some time
   for the item to actually disappear from the subsystem's usage. But it
   is gone from configfs.

When drop_item() is called, the item's linkage has already been torn
down. It no longer has a reference on its parent and has no place in
the item hierarchy. If a client needs to do some cleanup before this
teardown happens, the subsystem can implement the
ct_group_ops->disconnect_notify() method. The method is called after
configfs has removed the item from the filesystem view but before the
item is removed from its parent group.
Like drop_item(),
disconnect_notify() is void and cannot fail. Client subsystems should
not drop any references here, as they still must do it in drop_item().

A config_group cannot be removed while it still has child items. This
is implemented in the configfs rmdir(2) code. ->drop_item() will not be
called, as the item has not been dropped. rmdir(2) will fail, as the
directory is not empty.

struct configfs_subsystem
=========================

A subsystem must register itself, usually at module_init time. This
tells configfs to make the subsystem appear in the file tree::

	struct configfs_subsystem {
		struct config_group su_group;
		struct mutex su_mutex;
	};

	int configfs_register_subsystem(struct configfs_subsystem *subsys);
	void configfs_unregister_subsystem(struct configfs_subsystem *subsys);

A subsystem consists of a toplevel config_group and a mutex.
The group is where child config_items are created. For a subsystem,
this group is usually defined statically.
Before calling
configfs_register_subsystem(), the subsystem must have initialized the
group via the usual group _init() functions, and it must also have
initialized the mutex.

When the register call returns, the subsystem is live, and it
will be visible via configfs. At that point, mkdir(2) can be called and
the subsystem must be ready for it.

An Example
==========

The best example of these basic concepts is the simple_children
subsystem/group and the simple_child item in
samples/configfs/configfs_sample.c. It shows a trivial object displaying
and storing an attribute, and a simple group creating and destroying
these children.

Hierarchy Navigation and the Subsystem Mutex
============================================

There is an extra bonus that configfs provides. The config_groups and
config_items are arranged in a hierarchy due to the fact that they
appear in a filesystem. A subsystem is NEVER to touch the filesystem
parts, but the subsystem might be interested in this hierarchy. For
this reason, the hierarchy is mirrored via the config_group->cg_children
and config_item->ci_parent structure members.
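
As a rough illustration of what navigating the mirrored hierarchy means,
the userspace sketch below follows ci_parent pointers to count an item's
ancestors. The struct is a cut-down stand-in, demo_item_depth() is a
hypothetical helper, and treating a NULL ci_parent as the top of the
tree is an assumption of this model only. In kernel code such a walk
would have to be done while holding the subsystem mutex.

```c
#include <stddef.h>

/* Stand-in for struct config_item; only the fields the walk needs. */
struct config_item {
	const char *ci_name;
	struct config_item *ci_parent; /* NULL at the top, in this sketch */
};

/* Count how many ancestors an item has by following ci_parent. */
static int demo_item_depth(const struct config_item *item)
{
	int depth = 0;

	/* In kernel code, hold the subsystem mutex around this walk. */
	for (item = item->ci_parent; item; item = item->ci_parent)
		depth++;
	return depth;
}
```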

A subsystem can navigate the cg_children list and the ci_parent pointer
to see the tree created by the subsystem. This can race with configfs'
management of the hierarchy, so configfs uses the subsystem mutex to
protect modifications. Whenever a subsystem wants to navigate the
hierarchy, it must do so under the protection of the subsystem
mutex.

A subsystem will be prevented from acquiring the mutex while a newly
allocated item has not been linked into this hierarchy. Similarly, it
will not be able to acquire the mutex while a dropping item has not
yet been unlinked. This means that an item's ci_parent pointer will
never be NULL while the item is in configfs, and that an item will only
be in its parent's cg_children list for the same duration. This allows
a subsystem to trust ci_parent and cg_children while it holds the
mutex.

Item Aggregation Via symlink(2)
===============================

configfs provides a simple group via the group->item parent/child
relationship. Often, however, a larger environment requires aggregation
outside of the parent/child connection. This is implemented via
symlink(2).

A config_item may provide the ct_item_ops->allow_link() and
ct_item_ops->drop_link() methods. If the ->allow_link() method exists,
symlink(2) may be called with the config_item as the source of the link.
These links are only allowed between configfs config_items. Any
symlink(2) attempt outside the configfs filesystem will be denied.

When symlink(2) is called, the source config_item's ->allow_link()
method is called with itself and a target item. If the source item
allows linking to the target item, it returns 0. A source item may wish to
reject a link if it only wants links to a certain type of object (say,
in its own subsystem).

When unlink(2) is called on the symbolic link, the source item is
notified via the ->drop_link() method. Like the ->drop_item() method,
this is a void function and cannot return failure. The subsystem is
responsible for responding to the change.

A config_item cannot be removed while it links to any other item, nor
can it be removed while an item links to it. Dangling symlinks are not
allowed in configfs.
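
An allow_link() method that only accepts targets of one type could be
modeled as below. This is a userspace sketch with stand-in structures:
the name field, the two demo types and demo_allow_link() are
hypothetical, and returning -EPERM on rejection is an assumption, since
the text above only specifies that 0 means the link is allowed.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-ins for the kernel types; only the type pointer is modeled. */
struct config_item_type {
	const char *name; /* hypothetical field, for illustration only */
};

struct config_item {
	const struct config_item_type *ci_type;
};

static const struct config_item_type demo_node_type = { .name = "node" };
static const struct config_item_type demo_other_type = { .name = "other" };

/*
 * Hypothetical allow_link(): accept a symlink only when the target is
 * of the type this source expects. Returning 0 allows the link.
 */
static int demo_allow_link(struct config_item *src,
			   struct config_item *target)
{
	(void)src; /* the source item is not inspected in this sketch */
	if (target->ci_type != &demo_node_type)
		return -EPERM;
	return 0;
}
```

Comparing type pointers is one simple way to express "only links to a
certain type of object"; a real subsystem may use a different check.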

Automatically Created Subgroups
===============================

A new config_group may want to have two types of child config_items.
While this could be codified by magic names in ->make_item(), it is much
more explicit to have a method whereby userspace sees this divergence.

Rather than have a group where some items behave differently than
others, configfs provides a method whereby one or many subgroups are
automatically created inside the parent at its creation. Thus,
mkdir("parent") results in "parent", "parent/subgroup1", up through
"parent/subgroupN". Items of type 1 can now be created in
"parent/subgroup1", and items of type N can be created in
"parent/subgroupN".

These automatic subgroups, or default groups, do not preclude other
children of the parent group. If ct_group_ops->make_group() exists,
other child groups can be created on the parent group directly.

A configfs subsystem specifies default groups by adding them using the
configfs_add_default_group() function to the parent config_group
structure. Each added group is populated in the configfs tree at the same
time as the parent group.
Similarly, they are removed at the same time
as the parent. No extra notification is provided. When a ->drop_item()
method call notifies the subsystem the parent group is going away, it
also means every default group child associated with that parent group.

As a consequence of this, default groups cannot be removed directly via
rmdir(2). They also are not considered when rmdir(2) on the parent
group is checking for children.

Dependent Subsystems
====================

Sometimes other drivers depend on particular configfs items. For
example, ocfs2 mounts depend on a heartbeat region item. If that
region item is removed with rmdir(2), the ocfs2 mount must BUG or go
readonly. Not happy.

configfs provides two additional API calls: configfs_depend_item() and
configfs_undepend_item(). A client driver can call
configfs_depend_item() on an existing item to tell configfs that it is
depended on. configfs will then return -EBUSY from rmdir(2) for that
item. When the item is no longer depended on, the client driver calls
configfs_undepend_item() on it.

These APIs cannot be called from within any configfs callback, as
they will conflict. They can block and allocate.
A client driver
probably shouldn't be calling them of its own gumption. Rather it should
be providing an API that external subsystems call.

How does this work? Imagine the ocfs2 mount process. When it mounts,
it asks for a heartbeat region item. This is done via a call into the
heartbeat code. Inside the heartbeat code, the region item is looked
up. Here, the heartbeat code calls configfs_depend_item(). If it
succeeds, then heartbeat knows the region is safe to give to ocfs2.
If it fails, it was being torn down anyway, and heartbeat can gracefully
pass up an error.
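
The heartbeat walk-through can be sketched as a small userspace model.
This is not kernel code and does not call the real
configfs_depend_item(): struct mock_region and every mock_*() function
are hypothetical names made up for this example, and the -ENOENT
returned for a region that is being torn down is an assumption. The
point is only the ordering: the dependency is taken inside the
heartbeat-style wrapper, rmdir(2) is refused with -EBUSY while it is
held, and a region that is already going away cannot be pinned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a heartbeat region item; NOT a kernel type. */
struct mock_region {
	int depend_count;	/* outstanding dependencies on the item */
	bool dropping;		/* set once removal has succeeded */
};

/* Models configfs_depend_item(): fails if the item is being torn down. */
static int mock_depend_item(struct mock_region *r)
{
	assert(r != NULL);
	if (r->dropping)
		return -ENOENT;	/* illustrative error value (assumption) */
	r->depend_count++;
	return 0;
}

/* Models configfs_undepend_item(): releases one dependency. */
static void mock_undepend_item(struct mock_region *r)
{
	assert(r != NULL && r->depend_count > 0);
	r->depend_count--;
}

/* Models rmdir(2) on the item: refused while anyone depends on it. */
static int mock_rmdir(struct mock_region *r)
{
	if (r->depend_count > 0)
		return -EBUSY;
	r->dropping = true;
	return 0;
}

/*
 * The API the client driver (heartbeat) exposes to external users
 * (ocfs2), so that they never touch the dependency calls directly.
 */
static int mock_heartbeat_get_region(struct mock_region *r)
{
	return mock_depend_item(r);
}

static void mock_heartbeat_put_region(struct mock_region *r)
{
	mock_undepend_item(r);
}
```

A mount would call mock_heartbeat_get_region() and either proceed or
pass the error up. While it holds the region, mock_rmdir() returns
-EBUSY. After mock_heartbeat_put_region(), removal succeeds, and any
later attempt to pin the region fails.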