.. SPDX-License-Identifier: GPL-2.0

==================================
relay interface (formerly relayfs)
==================================

The relay interface provides a means for kernel applications to
efficiently log and transfer large quantities of data from the kernel
to userspace via user-defined 'relay channels'.

A 'relay channel' is a kernel->user data relay mechanism implemented
as a set of per-cpu kernel buffers ('channel buffers'), each
represented as a regular file ('relay file') in user space.  Kernel
clients write into the channel buffers using efficient write
functions; these automatically log into the current cpu's channel
buffer.  User space applications mmap() or read() from the relay files
and retrieve the data as it becomes available.  The relay files
themselves are files created in a host filesystem, e.g. debugfs, and
are associated with the channel buffers using the API described below.

The format of the data logged into the channel buffers is completely
up to the kernel client; the relay interface does however provide
hooks which allow kernel clients to impose some structure on the
buffer data.
The relay interface doesn't implement any form of data
filtering - this also is left to the kernel client.  The purpose is to
keep things as simple as possible.

This document provides an overview of the relay interface API.  The
details of the function parameters are documented along with the
functions in the relay interface code - please see that for details.

Semantics
=========

Each relay channel has one buffer per CPU; each buffer has one or more
sub-buffers. Messages are written to the first sub-buffer until it is
too full to contain a new message, in which case the message is written
to the next sub-buffer (if available). Messages are never split across
sub-buffers. At this point, userspace can be notified so it empties the
first sub-buffer, while the kernel continues writing to the next.

When notified that a sub-buffer is full, the kernel knows how many
bytes of it are padding, i.e. unused space occurring because a complete
message couldn't fit into a sub-buffer. Userspace can use this
knowledge to copy only valid data.

After copying it, userspace can notify the kernel that a sub-buffer
has been consumed.
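
The sub-buffer behaviour described above can be modelled in plain
user-space C.  The sketch below is only an illustration under stated
assumptions, not the kernel implementation: ``struct model_buf``,
``model_write()`` and the sizes ``SUBBUF_SIZE`` and ``N_SUBBUFS`` are
hypothetical names chosen for this example.  It shows a message that does
not fit being moved whole to the next sub-buffer, with the unused tail of
the previous sub-buffer recorded as padding.

```c
#include <stddef.h>
#include <string.h>

#define SUBBUF_SIZE 16   /* hypothetical sub-buffer size in bytes */
#define N_SUBBUFS   4    /* hypothetical number of sub-buffers */

/* User-space model of one channel buffer; not a kernel structure. */
struct model_buf {
	char   data[N_SUBBUFS][SUBBUF_SIZE];
	size_t padding[N_SUBBUFS]; /* unused bytes at the end of each full sub-buffer */
	size_t cur;                /* sub-buffer currently being written */
	size_t offset;             /* next free byte in the current sub-buffer */
};

/*
 * Append one message without splitting it across sub-buffers.
 * Returns 0 on success, -1 if the message cannot be stored.
 */
static int model_write(struct model_buf *b, const void *msg, size_t len)
{
	if (len > SUBBUF_SIZE)
		return -1;              /* can never fit in a single sub-buffer */

	if (b->offset + len > SUBBUF_SIZE) {
		if (b->cur + 1 == N_SUBBUFS)
			return -1;      /* model only: no next sub-buffer available */
		/* The tail of the current sub-buffer becomes padding. */
		b->padding[b->cur] = SUBBUF_SIZE - b->offset;
		b->cur++;
		b->offset = 0;
	}

	memcpy(&b->data[b->cur][b->offset], msg, len);
	b->offset += len;
	return 0;
}
```

With 16-byte sub-buffers, two 6-byte messages occupy 12 bytes of the first
sub-buffer; a third 6-byte message is placed at the start of the second
sub-buffer and the first one is left with 4 bytes of padding.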

A relay channel can operate in a mode where it will overwrite data not
yet collected by userspace, and not wait for it to be consumed.

The relay channel itself does not provide for communication of such
data between userspace and kernel, allowing the kernel side to remain
simple and not impose a single interface on userspace. It does
provide a set of examples and a separate helper though, described
below.

The read() interface both removes padding and internally consumes the
read sub-buffers; thus in cases where read(2) is being used to drain
the channel buffers, special-purpose communication between kernel and
user isn't necessary for basic operation.

One of the major goals of the relay interface is to provide a low
overhead mechanism for conveying kernel data to userspace.  While the
read() interface is easy to use, it's not as efficient as the mmap()
approach; the example code attempts to make the tradeoff between the
two approaches as small as possible.
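
Draining a relay file with read() needs no knowledge of sub-buffers or
padding.  The following is a minimal user-space sketch, not code taken
from the relay sources: the helper name ``drain_fd()`` is hypothetical,
and ``in_fd`` is assumed to be a descriptor the caller opened on one of
the per-cpu relay files created by the kernel client.  The loop stops as
soon as read() returns 0.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Copy everything currently readable from in_fd to out_fd using read(2).
 * Returns the number of bytes copied, or -1 on error.
 */
static ssize_t drain_fd(int in_fd, int out_fd)
{
	char chunk[4096];
	ssize_t total = 0;

	for (;;) {
		ssize_t n = read(in_fd, chunk, sizeof(chunk));

		if (n == 0)
			break;                  /* nothing more to read for now */
		if (n < 0) {
			if (errno == EINTR)
				continue;       /* interrupted, retry the read */
			return -1;
		}

		/* write(2) may be partial, so loop until the chunk is out. */
		for (ssize_t done = 0; done < n; ) {
			ssize_t w = write(out_fd, chunk + done, (size_t)(n - done));

			if (w < 0) {
				if (errno == EINTR)
					continue;
				return -1;
			}
			done += w;
		}
		total += n;
	}
	return total;
}
```

A consumer would typically call such a helper once per relay file each
time poll() reports the file as readable.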

klog and relay-apps example code
================================

The relay interface itself is ready to use, but to make things easier,
a couple of simple utility functions and a set of examples are provided.

The relay-apps example tarball, available on the relay sourceforge
site, contains a set of self-contained examples, each consisting of a
pair of .c files containing boilerplate code for each of the user and
kernel sides of a relay application.  When combined, these two sets of
boilerplate code provide glue to easily stream data to disk, without
having to bother with mundane housekeeping chores.

The 'klog debugging functions' patch (klog.patch in the relay-apps
tarball) provides a couple of high-level logging functions to the
kernel which allow writing formatted text or raw data to a channel,
regardless of whether a channel to write into exists or not, or even
whether the relay interface is compiled into the kernel or not.  These
functions allow you to put unconditional 'trace' statements anywhere
in the kernel or kernel modules; only when there is a 'klog handler'
registered will data actually be logged (see the klog and kleak
examples for details).

It is of course possible to use the relay interface from scratch,
i.e. without using any of the relay-apps example code or klog, but
you'll have to implement communication between userspace and kernel,
allowing both to convey the state of buffers (full, empty, amount of
padding).  The read() interface both removes padding and internally
consumes the read sub-buffers; thus in cases where read(2) is being
used to drain the channel buffers, special-purpose communication
between kernel and user isn't necessary for basic operation.  Things
such as buffer-full conditions would still need to be communicated via
some channel though.

klog and the relay-apps examples can be found in the relay-apps
tarball on http://relayfs.sourceforge.net

The relay interface user space API
==================================

The relay interface implements basic file operations for user space
access to relay channel buffer data.  Here are the file operations
that are available and some comments regarding their behavior:

=========== ============================================================
open()      enables user to open an _existing_ channel buffer.

mmap()      results in channel buffer being mapped into the caller's
            memory space. Note that you can't do a partial mmap - you
            must map the entire file, which is NRBUF * SUBBUFSIZE.

read()      read the contents of a channel buffer.  The bytes read are
            'consumed' by the reader, i.e. they won't be available
            again to subsequent reads.  If the channel is being used
            in no-overwrite mode (the default), it can be read at any
            time even if there's an active kernel writer.  If the
            channel is being used in overwrite mode and there are
            active channel writers, results may be unpredictable -
            users should make sure that all logging to the channel has
            ended before using read() with overwrite mode.  Sub-buffer
            padding is automatically removed and will not be seen by
            the reader.

sendfile()  transfer data from a channel buffer to an output file
            descriptor. Sub-buffer padding is automatically removed
            and will not be seen by the reader.

poll()      POLLIN/POLLRDNORM/POLLERR supported.  User applications are
            notified when sub-buffer boundaries are crossed.

close()     decrements the channel buffer's refcount.  When the refcount
            reaches 0, i.e.
            when no process or kernel client has the
            buffer open, the channel buffer is freed.
=========== ============================================================

In order for a user application to make use of relay files, the
host filesystem must be mounted.  For example::

    mount -t debugfs debugfs /sys/kernel/debug

.. Note::

    the host filesystem doesn't need to be mounted for kernel
    clients to create or use channels - it only needs to be
    mounted when user space applications need access to the buffer
    data.


The relay interface kernel API
==============================

Here's a summary of the API the relay interface provides to in-kernel clients:

TBD(curr.
line MT:/API/)

  channel management functions::

    relay_open(base_filename, parent, subbuf_size, n_subbufs,
               callbacks, private_data)
    relay_close(chan)
    relay_flush(chan)
    relay_reset(chan)

  channel management typically called on instigation of userspace::

    relay_subbufs_consumed(chan, cpu, subbufs_consumed)

  write functions::

    relay_write(chan, data, length)
    __relay_write(chan, data, length)
    relay_reserve(chan, length)

  callbacks::

    subbuf_start(buf, subbuf, prev_subbuf, prev_padding)
    buf_mapped(buf, filp)
    buf_unmapped(buf, filp)
    create_buf_file(filename, parent, mode, buf, is_global)
    remove_buf_file(dentry)

  helper functions::

    relay_buf_full(buf)
    subbuf_start_reserve(buf, length)


Creating a channel
------------------

relay_open() is used to create a channel, along with its
per-cpu
channel buffers.  Each channel buffer will have an associated file
created for it in the host filesystem, which can be mmapped or
read from in user space.  The files are named basename0...basenameN-1
where N is the number of online cpus, and by default will be created
in the root of the filesystem (if the parent param is NULL).  If you
want a directory structure to contain your relay files, you should
create it using the host filesystem's directory creation function,
e.g. debugfs_create_dir(), and pass the parent directory to
relay_open().  Users are responsible for cleaning up any directory
structure they create when the channel is closed - again the host
filesystem's directory removal functions should be used for that,
e.g. debugfs_remove().

In order for a channel to be created and the host filesystem's files
associated with its channel buffers, the user must provide definitions
for two callback functions, create_buf_file() and remove_buf_file().
create_buf_file() is called once for each per-cpu buffer from
relay_open() and allows the user to create the file which will be used
to represent the corresponding channel buffer.  The callback should
return the dentry of the file created to represent the channel buffer.
remove_buf_file() must also be defined; it's responsible for deleting
the file(s) created in create_buf_file() and is called during
relay_close().

Here are some typical definitions for these callbacks, in this case
using debugfs::

    /*
     * create_buf_file() callback.  Creates relay file in debugfs.
     */
    static struct dentry *create_buf_file_handler(const char *filename,
                                                  struct dentry *parent,
                                                  umode_t mode,
                                                  struct rchan_buf *buf,
                                                  int *is_global)
    {
            return debugfs_create_file(filename, mode, parent, buf,
                                       &relay_file_operations);
    }

    /*
     * remove_buf_file() callback.  Removes relay file from debugfs.
     */
    static int remove_buf_file_handler(struct dentry *dentry)
    {
            debugfs_remove(dentry);

            return 0;
    }

    /*
     * relay interface callbacks
     */
    static struct rchan_callbacks relay_callbacks =
    {
            .create_buf_file = create_buf_file_handler,
            .remove_buf_file = remove_buf_file_handler,
    };

And an example relay_open() invocation using them::

    chan = relay_open("cpu", NULL, SUBBUF_SIZE, N_SUBBUFS, &relay_callbacks, NULL);

If the create_buf_file() callback fails, or isn't defined, channel
creation and thus relay_open() will fail.

The total size of each per-cpu buffer is calculated by multiplying the
number of sub-buffers by the sub-buffer size passed into relay_open().
The idea behind sub-buffers is that they're basically an extension of
double-buffering to N buffers, and they also allow applications to
easily implement random-access-on-buffer-boundary schemes, which can
be important for some high-volume applications.
The number and size
of sub-buffers are completely dependent on the application, and even for
the same application, different conditions will warrant different
values for these parameters at different times.  Typically, the right
values to use are best decided after some experimentation; in general,
though, it's safe to assume that having only 1 sub-buffer is a bad
idea - you're guaranteed to either overwrite data or lose events
depending on the channel mode being used.

The create_buf_file() implementation can also be defined in such a way
as to allow the creation of a single 'global' buffer instead of the
default per-cpu set.  This can be useful for applications interested
mainly in seeing the relative ordering of system-wide events without
the need to bother with saving explicit timestamps for the purpose of
merging/sorting per-cpu files in a postprocessing step.

To have relay_open() create a global buffer, the create_buf_file()
implementation should set the value of the is_global outparam to a
non-zero value in addition to creating the file that will be used to
represent the single buffer.  In the case of a global buffer,
create_buf_file() and remove_buf_file() will be called only once.  The
normal channel-writing functions, e.g.
relay_write(), can still be
used - writes from any cpu will transparently end up in the global
buffer - but since it is a global buffer, callers should make sure
they use the proper locking for such a buffer, either by wrapping
writes in a spinlock, or by copying a write function from relay.h and
creating a local version that internally does the proper locking.

The private_data passed into relay_open() allows clients to associate
user-defined data with a channel, and is immediately available
(including in create_buf_file()) via chan->private_data or
buf->chan->private_data.

Buffer-only channels
--------------------

These channels have no files associated and can be created with
relay_open(NULL, NULL, ...).  Such channels are useful in scenarios such
as early tracing in the kernel, before the VFS is up. In these
cases, one may open a buffer-only channel and then call
relay_late_setup_files() when the kernel is ready to handle files,
to expose the buffered data to userspace.

Channel 'modes'
---------------

relay channels can be used in either of two modes - 'overwrite' or
'no-overwrite'.
The mode is entirely determined by the implementation
of the subbuf_start() callback, as described below.  The default if no
subbuf_start() callback is defined is 'no-overwrite' mode.  If the
default mode suits your needs, and you plan to use the read()
interface to retrieve channel data, you can ignore the details of this
section, as it pertains mainly to mmap() implementations.

In 'overwrite' mode, also known as 'flight recorder' mode, writes
continuously cycle around the buffer and will never fail, but will
unconditionally overwrite old data regardless of whether it's actually
been consumed.  In no-overwrite mode, writes will fail, i.e. data will
be lost, if the number of unconsumed sub-buffers equals the total
number of sub-buffers in the channel.  It should be clear that if
there is no consumer or if the consumer can't consume sub-buffers fast
enough, data will be lost in either case; the only difference is
whether data is lost from the beginning or the end of a buffer.

As explained above, a relay channel is made up of one or more
per-cpu channel buffers, each implemented as a circular buffer
subdivided into one or more sub-buffers.  Messages are written into
the current sub-buffer of the channel's current per-cpu buffer via the
write functions described below.
Whenever a message can't fit into
the current sub-buffer, because there's no room left for it, the
client is notified via the subbuf_start() callback that a switch to a
new sub-buffer is about to occur.  The client uses this callback to 1)
initialize the next sub-buffer if appropriate, 2) finalize the previous
sub-buffer if appropriate, and 3) return a boolean value indicating
whether or not to actually move on to the next sub-buffer.

To implement 'no-overwrite' mode, the kernel client would provide
an implementation of the subbuf_start() callback something like the
following::

    static int subbuf_start(struct rchan_buf *buf,
                            void *subbuf,
                            void *prev_subbuf,
                            size_t prev_padding)
    {
            if (prev_subbuf)
                    *((unsigned *)prev_subbuf) = prev_padding;

            if (relay_buf_full(buf))
                    return 0;

            subbuf_start_reserve(buf, sizeof(unsigned int));

            return 1;
    }

If the current buffer is full, i.e.
all sub-buffers remain unconsumed,
the callback returns 0 to indicate that the buffer switch should not
occur yet, i.e. until the consumer has had a chance to read the
current set of ready sub-buffers.  For the relay_buf_full() function
to make sense, the consumer is responsible for notifying the relay
interface when sub-buffers have been consumed via
relay_subbufs_consumed().  Any subsequent attempts to write into the
buffer will again invoke the subbuf_start() callback with the same
parameters; only when the consumer has consumed one or more of the
ready sub-buffers will relay_buf_full() return 0, in which case the
buffer switch can continue.

The implementation of the subbuf_start() callback for 'overwrite' mode
would be very similar::

    static int subbuf_start(struct rchan_buf *buf,
                            void *subbuf,
                            void *prev_subbuf,
                            size_t prev_padding)
    {
            if (prev_subbuf)
                    *((unsigned *)prev_subbuf) = prev_padding;

            subbuf_start_reserve(buf, sizeof(unsigned int));

            return 1;
    }

In this case, the relay_buf_full() check is meaningless and the
callback always returns 1, causing the buffer switch to occur
unconditionally.  It's also meaningless for the client to use the
relay_subbufs_consumed() function in this mode, as it's never
consulted.

The default subbuf_start() implementation, used if the client doesn't
define any callbacks, or doesn't define the subbuf_start() callback,
implements the simplest possible 'no-overwrite' mode, i.e. it does
nothing but return 0.

Header information can be reserved at the beginning of each sub-buffer
by calling the subbuf_start_reserve() helper function from within the
subbuf_start() callback.  This reserved area can be used to store
whatever information the client wants.  In the example above, room is
reserved in each sub-buffer to store the padding count for that
sub-buffer.  This is filled in for the previous sub-buffer in the
subbuf_start() implementation; the padding value for the previous
sub-buffer is passed into the subbuf_start() callback along with a
pointer to the previous sub-buffer, since the padding value isn't
known until a sub-buffer is filled.  The subbuf_start() callback is
also called for the first sub-buffer when the channel is opened, to
give the client a chance to reserve space in it.
In this case the
previous sub-buffer pointer passed into the callback will be NULL, so
the client should check the value of the prev_subbuf pointer before
writing into the previous sub-buffer.

Writing to a channel
--------------------

Kernel clients write data into the current cpu's channel buffer using
relay_write() or __relay_write().  relay_write() is the main logging
function - it uses local_irq_save() to protect the buffer and should be
used if you might be logging from interrupt context.  If you know
you'll never be logging from interrupt context, you can use
__relay_write(), which only disables preemption.  These functions
don't return a value, so you can't determine whether or not they
failed - the assumption is that you wouldn't want to check a return
value in the fast logging path anyway, and that they'll always succeed
unless the buffer is full and no-overwrite mode is being used, in
which case you can detect a failed write in the subbuf_start()
callback by calling the relay_buf_full() helper function.

relay_reserve() is used to reserve a slot in a channel buffer which
can be written to later.
This would typically be used in applications
that need to write directly into a channel buffer without having to
stage data in a temporary buffer beforehand. Because the actual write
may not happen immediately after the slot is reserved, applications
using relay_reserve() can keep a count of the number of bytes actually
written, either in space reserved in the sub-buffers themselves or as
a separate array. See the 'reserve' example in the relay-apps tarball
at http://relayfs.sourceforge.net for an example of how this can be
done. Because the write is under control of the client and is
separated from the reserve, relay_reserve() doesn't protect the buffer
at all - it's up to the client to provide the appropriate
synchronization when using relay_reserve().

Closing a channel
-----------------

The client calls relay_close() when it's finished using the channel.
The channel and its associated buffers are destroyed when there are no
longer any references to any of the channel buffers. relay_flush()
forces a sub-buffer switch on all the channel buffers, and can be used
to finalize and process the last sub-buffers before the channel is
closed.
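
The write, flush and consume behaviour described above can be
illustrated with a small userspace model. This is only a sketch and not
the relay implementation: every name in it (toy_buf, toy_write(),
toy_flush(), toy_subbuf_consumed() and the sizes) is hypothetical, it
models a single buffer in no-overwrite mode, and as a simplification
its flush skips the switch when the current sub-buffer is empty.

```c
#include <stddef.h>
#include <string.h>

/* Userspace toy model only: none of these names exist in the kernel. */
#define TOY_SUBBUF_SIZE	16
#define TOY_N_SUBBUFS	2

struct toy_buf {
	char data[TOY_N_SUBBUFS][TOY_SUBBUF_SIZE];
	size_t padding[TOY_N_SUBBUFS];	/* unused bytes per finished sub-buffer */
	size_t offset;			/* write position in current sub-buffer */
	size_t produced;		/* sub-buffers finished by the writer */
	size_t consumed;		/* sub-buffers released by the reader */
	size_t dropped;			/* writes rejected because buffer was full */
};

/* No-overwrite check: full when every sub-buffer is finished but unread. */
static int toy_buf_full(const struct toy_buf *b)
{
	return b->produced - b->consumed >= TOY_N_SUBBUFS;
}

/* Finish the current sub-buffer and record how much of it is padding. */
static void toy_switch_subbuf(struct toy_buf *b)
{
	b->padding[b->produced % TOY_N_SUBBUFS] = TOY_SUBBUF_SIZE - b->offset;
	b->produced++;
	b->offset = 0;
}

/* Write one message; no return value, failures are only counted. */
static void toy_write(struct toy_buf *b, const void *msg, size_t len)
{
	if (len > TOY_SUBBUF_SIZE || toy_buf_full(b)) {
		b->dropped++;
		return;
	}
	/* Messages are never split: switch if this one does not fit. */
	if (b->offset + len > TOY_SUBBUF_SIZE) {
		toy_switch_subbuf(b);
		if (toy_buf_full(b)) {
			b->dropped++;
			return;
		}
	}
	memcpy(&b->data[b->produced % TOY_N_SUBBUFS][b->offset], msg, len);
	b->offset += len;
}

/* Model of a flush: force a switch so a partial sub-buffer is finalized. */
static void toy_flush(struct toy_buf *b)
{
	if (b->offset > 0 && !toy_buf_full(b))
		toy_switch_subbuf(b);
}

/* Reader side: release one finished sub-buffer back to the writer. */
static void toy_subbuf_consumed(struct toy_buf *b)
{
	if (b->consumed < b->produced)
		b->consumed++;
}
```

With two 16-byte sub-buffers, writing three 10-byte messages in a row
leaves 6 bytes of padding in each finished sub-buffer and drops the
third message, because nothing has been consumed yet. Calling
toy_flush() before teardown finalizes a partially filled sub-buffer so
that a reader can still pick it up.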

Misc
----

Some applications may want to keep a channel around and re-use it
rather than open and close a new channel for each use. relay_reset()
can be used for this purpose - it resets a channel to its initial
state without reallocating channel buffer memory or destroying
existing mappings. It should however only be called when it's safe to
do so, i.e. when the channel isn't currently being written to.

Finally, there are a couple of utility callbacks that can be used for
different purposes. buf_mapped() is called whenever a channel buffer
is mmapped from user space and buf_unmapped() is called when it's
unmapped. The client can use this notification to trigger actions
within the kernel application, such as enabling/disabling logging to
the channel.


Resources
=========

For news, example code, mailing list, etc.
see the relay interface homepage:

    http://relayfs.sourceforge.net


Credits
=======

The ideas and specs for the relay interface came about as a result of
discussions on tracing involving the following:

Michel Dagenais <michel.dagenais@polymtl.ca>
Richard Moore <richardj_moore@uk.ibm.com>
Bob Wisniewski <bob@watson.ibm.com>
Karim Yaghmour <karim@opersys.com>
Tom Zanussi <zanussi@us.ibm.com>

Also thanks to Hubertus Franke for a lot of useful suggestions and bug
reports.