.. SPDX-License-Identifier: GPL-2.0

==================================
relay interface (formerly relayfs)
==================================

The relay interface provides a means for kernel applications to
efficiently log and transfer large quantities of data from the kernel
to userspace via user-defined 'relay channels'.

A 'relay channel' is a kernel->user data relay mechanism implemented
as a set of per-cpu kernel buffers ('channel buffers'), each
represented as a regular file ('relay file') in user space.  Kernel
clients write into the channel buffers using efficient write
functions; these automatically log into the current cpu's channel
buffer.  User space applications mmap() or read() from the relay files
and retrieve the data as it becomes available.  The relay files
themselves are files created in a host filesystem, e.g. debugfs, and
are associated with the channel buffers using the API described below.

The format of the data logged into the channel buffers is completely
up to the kernel client; the relay interface does however provide
hooks which allow kernel clients to impose some structure on the
buffer data.  The relay interface doesn't implement any form of data
filtering - this also is left to the kernel client.  The purpose is to
keep things as simple as possible.

This document provides an overview of the relay interface API.  The
details of the function parameters are documented along with the
functions in the relay interface code - please see that for details.

Semantics
=========

Each relay channel has one buffer per CPU, and each buffer has one or
more sub-buffers.  Messages are written to the first sub-buffer until
it is too full to contain a new message, in which case the message is
written to the next sub-buffer (if available).  Messages are never
split across sub-buffers.  At this point, userspace can be notified so
that it empties the first sub-buffer while the kernel continues writing
to the next.

When notified that a sub-buffer is full, the kernel knows how many
bytes of it are padding, i.e. unused space occurring because a complete
message couldn't fit into a sub-buffer.  Userspace can use this
knowledge to copy only valid data.

After copying it, userspace can notify the kernel that a sub-buffer
has been consumed.
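
The sub-buffer and padding rules above can be illustrated with a small
userspace model.  This is only a sketch: ``struct model_buf`` and
``model_write()`` are hypothetical names invented for the illustration,
not part of the relay API, and the model ignores whether the next
sub-buffer has been consumed.

```c
#include <stddef.h>

/* Hypothetical model of one channel buffer; not the kernel's rchan_buf. */
struct model_buf {
    size_t subbuf_size;   /* bytes per sub-buffer */
    size_t n_subbufs;     /* number of sub-buffers */
    size_t cur;           /* index of the sub-buffer being written */
    size_t offset;        /* bytes used in the current sub-buffer */
    size_t last_padding;  /* padding left in the last closed sub-buffer */
};

/*
 * Account for a message of len bytes.  Returns 1 if the message forced
 * a switch to the next sub-buffer, 0 if it fit in the current one, and
 * -1 if it can never fit in a single sub-buffer.
 */
static int model_write(struct model_buf *b, size_t len)
{
    int switched = 0;

    if (len > b->subbuf_size)
        return -1;

    if (b->offset + len > b->subbuf_size) {
        /* Messages are never split: the unused tail becomes padding. */
        b->last_padding = b->subbuf_size - b->offset;
        b->cur = (b->cur + 1) % b->n_subbufs;
        b->offset = 0;
        switched = 1;
    }

    b->offset += len;
    return switched;
}
```

With 100-byte sub-buffers, two 60-byte messages leave 40 bytes of
padding at the end of the first sub-buffer, and the second message
starts the next one.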

A relay channel can operate in a mode where it will overwrite data not
yet collected by userspace, and not wait for it to be consumed.

The relay channel itself does not provide for communication of such
data between userspace and kernel, allowing the kernel side to remain
simple and not impose a single interface on userspace.  It does
provide a set of examples and a separate helper though, described
below.

The read() interface both removes padding and internally consumes the
read sub-buffers; thus in cases where read(2) is being used to drain
the channel buffers, special-purpose communication between kernel and
user isn't necessary for basic operation.
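
A read(2)-based consumer can therefore be little more than a copy loop.
The following is a minimal sketch, not code from relay-apps:
``drain_fd()`` is a hypothetical helper, and it assumes that read()
returns 0 (or fails with EAGAIN on a non-blocking descriptor) when no
data is currently available.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Copy everything currently readable from in_fd (e.g. an opened relay
 * file) to out_fd.  Returns the number of bytes copied, or -1 on error.
 */
static long drain_fd(int in_fd, int out_fd)
{
    char chunk[4096];
    long total = 0;

    for (;;) {
        ssize_t n = read(in_fd, chunk, sizeof(chunk));

        if (n == 0)
            break;                  /* nothing more to read for now */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry */
            if (errno == EAGAIN)
                break;              /* non-blocking fd with no data */
            return -1;
        }

        /* write() may be partial, so loop until the chunk is out. */
        for (ssize_t done = 0; done < n; ) {
            ssize_t w = write(out_fd, chunk + done, (size_t)(n - done));

            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            done += w;
        }
        total += n;
    }
    return total;
}
```

Because read() strips padding and consumes the sub-buffers it reads,
the loop needs no knowledge of the sub-buffer layout.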

One of the major goals of the relay interface is to provide a
low-overhead mechanism for conveying kernel data to userspace.  While
the read() interface is easy to use, it's not as efficient as the
mmap() approach; the example code attempts to make the tradeoff between
the two approaches as small as possible.

klog and relay-apps example code
================================

The relay interface itself is ready to use, but to make things easier,
a couple of simple utility functions and a set of examples are provided.

The relay-apps example tarball, available on the relay sourceforge
site, contains a set of self-contained examples, each consisting of a
pair of .c files containing boilerplate code for each of the user and
kernel sides of a relay application.  When combined these two sets of
boilerplate code provide glue to easily stream data to disk, without
having to bother with mundane housekeeping chores.

The 'klog debugging functions' patch (klog.patch in the relay-apps
tarball) provides a couple of high-level logging functions to the
kernel which allow writing formatted text or raw data to a channel,
regardless of whether a channel to write into exists or not, or even
whether the relay interface is compiled into the kernel or not.  These
functions allow you to put unconditional 'trace' statements anywhere
in the kernel or kernel modules; only when there is a 'klog handler'
registered will data actually be logged (see the klog and kleak
examples for details).

It is of course possible to use the relay interface from scratch,
i.e. without using any of the relay-apps example code or klog, but
you'll have to implement communication between userspace and kernel,
allowing both to convey the state of buffers (full, empty, amount of
padding).  As noted under Semantics, this is not needed for basic
operation when read(2) is used to drain the channel buffers, because
read() removes padding and consumes the sub-buffers it reads.  Things
such as buffer-full conditions would still need to be communicated via
some channel though.

klog and the relay-apps examples can be found in the relay-apps
tarball on http://relayfs.sourceforge.net

The relay interface user space API
==================================

The relay interface implements basic file operations for user space
access to relay channel buffer data.  Here are the file operations
that are available and some comments regarding their behavior:

=========== ============================================================
open()      enables the user to open an _existing_ channel buffer.

mmap()      results in channel buffer being mapped into the caller's
            memory space. Note that you can't do a partial mmap - you
            must map the entire file, i.e. n_subbufs * subbuf_size
            bytes.

read()      read the contents of a channel buffer.  The bytes read are
            'consumed' by the reader, i.e. they won't be available
            again to subsequent reads.  If the channel is being used
            in no-overwrite mode (the default), it can be read at any
            time even if there's an active kernel writer.  If the
            channel is being used in overwrite mode and there are
            active channel writers, results may be unpredictable -
            users should make sure that all logging to the channel has
            ended before using read() with overwrite mode.  Sub-buffer
            padding is automatically removed and will not be seen by
            the reader.

sendfile()  transfer data from a channel buffer to an output file
            descriptor. Sub-buffer padding is automatically removed
            and will not be seen by the reader.

poll()      POLLIN/POLLRDNORM/POLLERR supported.  User applications are
            notified when sub-buffer boundaries are crossed.

close()     decrements the channel buffer's refcount.  When the refcount
            reaches 0, i.e. when no process or kernel client has the
            buffer open, the channel buffer is freed.
=========== ============================================================
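
As an illustration of combining poll() and read() from the table above,
a consumer might wait for a sub-buffer boundary and then read whatever
is available.  This is a sketch under assumptions: ``read_when_ready()``
is a hypothetical helper name, and the timeout handling is one possible
choice rather than anything the relay interface requires.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for fd to become readable, then read up to len
 * bytes into dst.  Returns the number of bytes read, 0 on timeout (or
 * if no data was available), and -1 on error, including POLLERR.
 */
static ssize_t read_when_ready(int fd, void *dst, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;           /* timed out, nothing signalled yet */
    if (pfd.revents & POLLERR)
        return -1;

    return read(fd, dst, len);
}
```

The descriptor would come from open() on a relay file; the helper itself
works on any pollable descriptor.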

In order for a user application to make use of relay files, the
host filesystem must be mounted.  For example::

    mount -t debugfs debugfs /sys/kernel/debug

.. Note::

    The host filesystem doesn't need to be mounted for kernel
    clients to create or use channels - it only needs to be
    mounted when user space applications need access to the buffer
    data.


The relay interface kernel API
==============================

Here's a summary of the API the relay interface provides to in-kernel clients:

channel management functions::

    relay_open(base_filename, parent, subbuf_size, n_subbufs,
               callbacks, private_data)
    relay_close(chan)
    relay_flush(chan)
    relay_reset(chan)

channel management typically called on instigation of userspace::

    relay_subbufs_consumed(chan, cpu, subbufs_consumed)

write functions::

    relay_write(chan, data, length)
    __relay_write(chan, data, length)
    relay_reserve(chan, length)

callbacks::

    subbuf_start(buf, subbuf, prev_subbuf, prev_padding)
    buf_mapped(buf, filp)
    buf_unmapped(buf, filp)
    create_buf_file(filename, parent, mode, buf, is_global)
    remove_buf_file(dentry)

helper functions::

    relay_buf_full(buf)
    subbuf_start_reserve(buf, length)


Creating a channel
------------------

relay_open() is used to create a channel, along with its per-cpu
channel buffers.  Each channel buffer will have an associated file
created for it in the host filesystem, which can be mmapped or
read from in user space.  The files are named basename0...basenameN-1
where N is the number of online cpus, and by default will be created
in the root of the filesystem (if the parent param is NULL).  If you
want a directory structure to contain your relay files, you should
create it using the host filesystem's directory creation function,
e.g. debugfs_create_dir(), and pass the parent directory to
relay_open().  Users are responsible for cleaning up any directory
structure they create when the channel is closed - again the host
filesystem's directory removal functions should be used for that,
e.g. debugfs_remove().

In order for a channel to be created and the host filesystem's files
associated with its channel buffers, the user must provide definitions
for two callback functions, create_buf_file() and remove_buf_file().
create_buf_file() is called once for each per-cpu buffer from
relay_open() and allows the user to create the file which will be used
to represent the corresponding channel buffer.  The callback should
return the dentry of the file created to represent the channel buffer.
remove_buf_file() must also be defined; it's responsible for deleting
the file(s) created in create_buf_file() and is called during
relay_close().

Here are some typical definitions for these callbacks, in this case
using debugfs::

    /*
     * create_buf_file() callback.  Creates relay file in debugfs.
     */
    static struct dentry *create_buf_file_handler(const char *filename,
                                                  struct dentry *parent,
                                                  umode_t mode,
                                                  struct rchan_buf *buf,
                                                  int *is_global)
    {
            return debugfs_create_file(filename, mode, parent, buf,
                                       &relay_file_operations);
    }

    /*
     * remove_buf_file() callback.  Removes relay file from debugfs.
     */
    static int remove_buf_file_handler(struct dentry *dentry)
    {
            debugfs_remove(dentry);

            return 0;
    }

    /*
     * relay interface callbacks
     */
    static struct rchan_callbacks relay_callbacks =
    {
            .create_buf_file = create_buf_file_handler,
            .remove_buf_file = remove_buf_file_handler,
    };

And an example relay_open() invocation using them::

    chan = relay_open("cpu", NULL, SUBBUF_SIZE, N_SUBBUFS, &relay_callbacks, NULL);

If the create_buf_file() callback fails, or isn't defined, channel
creation and thus relay_open() will fail.
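
On the user side, the per-cpu files created by a call like the one
above can be located by appending the cpu number to the base name.  A
minimal sketch follows; ``relay_file_path()`` is a hypothetical helper,
and the debugfs mount point passed to it is an assumption about the
local system.

```c
#include <stdio.h>

/*
 * Build "<dir>/<base><cpu>" into out, e.g. "/sys/kernel/debug/cpu0".
 * Returns 0 on success, -1 if the result did not fit in out_len bytes.
 */
static int relay_file_path(char *out, size_t out_len, const char *dir,
                           const char *base, unsigned int cpu)
{
    int n = snprintf(out, out_len, "%s/%s%u", dir, base, cpu);

    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

A consumer would typically call this once per online cpu and open()
each resulting path.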

The total size of each per-cpu buffer is calculated by multiplying the
number of sub-buffers by the sub-buffer size passed into relay_open().
The idea behind sub-buffers is that they're basically an extension of
double-buffering to N buffers, and they also allow applications to
easily implement random-access-on-buffer-boundary schemes, which can
be important for some high-volume applications.  The number and size
of sub-buffers is completely dependent on the application and even for
the same application, different conditions will warrant different
values for these parameters at different times.  Typically, the right
values to use are best decided after some experimentation; in general,
though, it's safe to assume that having only 1 sub-buffer is a bad
idea - you're guaranteed to either overwrite data or lose events
depending on the channel mode being used.
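
Since mmap() must cover the entire buffer, a user space consumer needs
the same two values to compute the mapping length.  The sketch below
shows one way to do that; ``relay_buf_bytes()`` and ``map_relay_buf()``
are hypothetical helpers, the choice of MAP_SHARED is an assumption, and
the total is assumed to be a multiple of the page size.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Total bytes in one per-cpu buffer, or 0 if the product overflows. */
static size_t relay_buf_bytes(size_t subbuf_size, size_t n_subbufs)
{
    if (n_subbufs != 0 && subbuf_size > SIZE_MAX / n_subbufs)
        return 0;
    return subbuf_size * n_subbufs;
}

/* Map the whole relay file read-only; returns NULL on failure. */
static void *map_relay_buf(int fd, size_t subbuf_size, size_t n_subbufs)
{
    size_t len = relay_buf_bytes(subbuf_size, n_subbufs);
    void *p;

    if (len == 0)
        return NULL;

    p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

The values passed in must match those the kernel client gave to
relay_open(); how userspace learns them is up to the application.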

The create_buf_file() implementation can also be defined in such a way
as to allow the creation of a single 'global' buffer instead of the
default per-cpu set.  This can be useful for applications interested
mainly in seeing the relative ordering of system-wide events without
the need to bother with saving explicit timestamps for the purpose of
merging/sorting per-cpu files in a postprocessing step.

To have relay_open() create a global buffer, the create_buf_file()
implementation should set the value of the is_global outparam to a
non-zero value in addition to creating the file that will be used to
represent the single buffer.  In the case of a global buffer,
create_buf_file() and remove_buf_file() will be called only once.  The
normal channel-writing functions, e.g. relay_write(), can still be
used - writes from any cpu will transparently end up in the global
buffer - but since it is a global buffer, callers should make sure
they use the proper locking for such a buffer, either by wrapping
writes in a spinlock, or by copying a write function from relay.h and
creating a local version that internally does the proper locking.

The private_data passed into relay_open() allows clients to associate
user-defined data with a channel, and is immediately available
(including in create_buf_file()) via chan->private_data or
buf->chan->private_data.

Buffer-only channels
--------------------

These channels have no files associated and can be created with
relay_open(NULL, NULL, ...). Such channels are useful in scenarios such
as early tracing in the kernel, before the VFS is up. In these
cases, one may open a buffer-only channel and then call
relay_late_setup_files() when the kernel is ready to handle files,
to expose the buffered data to userspace.

Channel 'modes'
---------------

Relay channels can be used in either of two modes - 'overwrite' or
'no-overwrite'.  The mode is entirely determined by the implementation
of the subbuf_start() callback, as described below.  The default if no
subbuf_start() callback is defined is 'no-overwrite' mode.  If the
default mode suits your needs, and you plan to use the read()
interface to retrieve channel data, you can ignore the details of this
section, as it pertains mainly to mmap() implementations.

In 'overwrite' mode, also known as 'flight recorder' mode, writes
continuously cycle around the buffer and will never fail, but will
unconditionally overwrite old data regardless of whether it's actually
been consumed.  In no-overwrite mode, writes will fail, i.e. data will
be lost, if the number of unconsumed sub-buffers equals the total
number of sub-buffers in the channel.  It should be clear that if
there is no consumer or if the consumer can't consume sub-buffers fast
enough, data will be lost in either case; the only difference is
whether data is lost from the beginning or the end of a buffer.
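
The difference between the two modes can be made concrete with a small
userspace model that tracks only the sub-buffer counters.  This is an
illustrative sketch, not the kernel implementation: ``struct
model_chan`` and ``model_fill()`` are hypothetical names, and each call
stands for the writer filling one whole sub-buffer.

```c
#include <stddef.h>

/* Hypothetical model: only the counters that decide fullness. */
struct model_chan {
    size_t n_subbufs;
    size_t produced;   /* sub-buffers filled by the writer */
    size_t consumed;   /* sub-buffers no longer awaiting the reader */
    size_t lost_new;   /* fills refused in no-overwrite mode */
    size_t lost_old;   /* unconsumed sub-buffers overwritten */
    int overwrite;     /* non-zero selects overwrite mode */
};

/* Try to fill one more sub-buffer.  Returns 1 if written, 0 if dropped. */
static int model_fill(struct model_chan *c)
{
    size_t unconsumed = c->produced - c->consumed;

    if (unconsumed >= c->n_subbufs) {
        if (!c->overwrite) {
            c->lost_new++;      /* the newest data is dropped */
            return 0;
        }
        c->lost_old++;          /* the oldest data is overwritten ... */
        c->consumed++;          /* ... so the reader can no longer get it */
    }
    c->produced++;
    return 1;
}
```

With two sub-buffers and no consumer, a third fill is dropped in
no-overwrite mode (the newest data is lost), whereas in overwrite mode
it succeeds at the cost of the oldest sub-buffer.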

As explained above, a relay channel is made up of one or more
per-cpu channel buffers, each implemented as a circular buffer
subdivided into one or more sub-buffers.  Messages are written into
the current sub-buffer of the channel's current per-cpu buffer via the
write functions described below.  Whenever a message can't fit into
the current sub-buffer, because there's no room left for it, the
client is notified via the subbuf_start() callback that a switch to a
new sub-buffer is about to occur.  The client uses this callback to 1)
initialize the next sub-buffer if appropriate 2) finalize the previous
sub-buffer if appropriate and 3) return a boolean value indicating
whether or not to actually move on to the next sub-buffer.

To implement 'no-overwrite' mode, the kernel client would provide
an implementation of the subbuf_start() callback something like the
following::

    static int subbuf_start(struct rchan_buf *buf,
                            void *subbuf,
                            void *prev_subbuf,
                            size_t prev_padding)
    {
            /* Record the previous sub-buffer's padding in its header. */
            if (prev_subbuf)
                    *((unsigned int *)prev_subbuf) = prev_padding;

            /* All sub-buffers are unconsumed: refuse the switch. */
            if (relay_buf_full(buf))
                    return 0;

            /* Reserve header space in the new sub-buffer. */
            subbuf_start_reserve(buf, sizeof(unsigned int));

            return 1;
    }

If the current buffer is full, i.e. all sub-buffers remain unconsumed,
the callback returns 0 to indicate that the buffer switch should not
occur yet, i.e. until the consumer has had a chance to read the
current set of ready sub-buffers.  For the relay_buf_full() function
to make sense, the consumer is responsible for notifying the relay
interface when sub-buffers have been consumed via
relay_subbufs_consumed().  Any subsequent attempts to write into the
buffer will again invoke the subbuf_start() callback with the same
parameters; only when the consumer has consumed one or more of the
ready sub-buffers will relay_buf_full() return 0, in which case the
buffer switch can continue.

The implementation of the subbuf_start() callback for 'overwrite' mode
would be very similar::

    static int subbuf_start(struct rchan_buf *buf,
                            void *subbuf,
                            void *prev_subbuf,
                            size_t prev_padding)
    {
            /* Record the previous sub-buffer's padding in its header. */
            if (prev_subbuf)
                    *((unsigned int *)prev_subbuf) = prev_padding;

            /* Reserve header space in the new sub-buffer. */
            subbuf_start_reserve(buf, sizeof(unsigned int));

            /* Always switch, even over unconsumed data. */
            return 1;
    }

In this case, the relay_buf_full() check is meaningless and the
callback always returns 1, causing the buffer switch to occur
unconditionally.  It's also meaningless for the client to use the
relay_subbufs_consumed() function in this mode, as it's never
consulted.

The default subbuf_start() implementation, used if the client doesn't
define any callbacks, or doesn't define the subbuf_start() callback,
implements the simplest possible 'no-overwrite' mode, i.e. it does
nothing but return 0.
40556e6d5c0SMauro Carvalho Chehab
40656e6d5c0SMauro Carvalho ChehabHeader information can be reserved at the beginning of each sub-buffer
40756e6d5c0SMauro Carvalho Chehabby calling the subbuf_start_reserve() helper function from within the
40856e6d5c0SMauro Carvalho Chehabsubbuf_start() callback.  This reserved area can be used to store
40956e6d5c0SMauro Carvalho Chehabwhatever information the client wants.  In the example above, room is
41056e6d5c0SMauro Carvalho Chehabreserved in each sub-buffer to store the padding count for that
41156e6d5c0SMauro Carvalho Chehabsub-buffer.  This is filled in for the previous sub-buffer in the
41256e6d5c0SMauro Carvalho Chehabsubbuf_start() implementation; the padding value for the previous
41356e6d5c0SMauro Carvalho Chehabsub-buffer is passed into the subbuf_start() callback along with a
41456e6d5c0SMauro Carvalho Chehabpointer to the previous sub-buffer, since the padding value isn't
41556e6d5c0SMauro Carvalho Chehabknown until a sub-buffer is filled.  The subbuf_start() callback is
41656e6d5c0SMauro Carvalho Chehabalso called for the first sub-buffer when the channel is opened, to
41756e6d5c0SMauro Carvalho Chehabgive the client a chance to reserve space in it.  In this case the
41856e6d5c0SMauro Carvalho Chehabprevious sub-buffer pointer passed into the callback will be NULL, so
41956e6d5c0SMauro Carvalho Chehabthe client should check the value of the prev_subbuf pointer before
42056e6d5c0SMauro Carvalho Chehabwriting into the previous sub-buffer.
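
As a rough illustration of what a reserved header buys the consumer,
the user-space sketch below computes how many bytes of real data each
sub-buffer holds.  It assumes the client reserved sizeof(unsigned int)
at the start of every sub-buffer and stored that sub-buffer's padding
count there; this layout and the names SUBBUF_HDR_SIZE,
subbuf_payload_len() and total_payload() are hypothetical and not part
of the relay API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Assumed layout, as set up by a client's subbuf_start() callback:
 *   [unsigned int padding][payload ...][padding bytes]
 * The header type and size are assumptions made for this sketch.
 */
#define SUBBUF_HDR_SIZE sizeof(unsigned int)

/*
 * Return the number of payload bytes in one sub-buffer, or 0 if the
 * stored padding count cannot be valid for this sub-buffer size.
 */
static size_t subbuf_payload_len(const unsigned char *subbuf,
				 size_t subbuf_size)
{
	unsigned int padding;

	assert(subbuf != NULL);

	if (subbuf_size < SUBBUF_HDR_SIZE)
		return 0;

	/* memcpy avoids making alignment assumptions about the mapping */
	memcpy(&padding, subbuf, sizeof(padding));

	if (padding > subbuf_size - SUBBUF_HDR_SIZE)
		return 0;

	return subbuf_size - SUBBUF_HDR_SIZE - padding;
}

/* Sum the payload bytes over n_subbufs consecutive sub-buffers. */
static size_t total_payload(const unsigned char *buf, size_t subbuf_size,
			    size_t n_subbufs)
{
	size_t i, total = 0;

	assert(buf != NULL);

	for (i = 0; i < n_subbufs; i++)
		total += subbuf_payload_len(buf + i * subbuf_size,
					    subbuf_size);

	return total;
}
```

A reader that has mmap()ed or read() a relay file could call
total_payload() on the data to skip the unused tail of each
sub-buffer, provided the kernel client really uses this header layout.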

Writing to a channel
--------------------

Kernel clients write data into the current cpu's channel buffer using
relay_write() or __relay_write().  relay_write() is the main logging
function - it uses local_irq_save() to protect the buffer and should be
used if you might be logging from interrupt context.  If you know
you'll never be logging from interrupt context, you can use
__relay_write(), which only disables preemption.  These functions
don't return a value, so you can't determine whether or not they
failed - the assumption is that you wouldn't want to check a return
value in the fast logging path anyway, and that they'll always succeed
unless the buffer is full and no-overwrite mode is being used, in
which case you can detect a failed write in the subbuf_start()
callback by calling the relay_buf_full() helper function.
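
The decision a no-overwrite subbuf_start() callback makes can be
modeled in plain C as shown below.  This is a simplified user-space
sketch, not kernel code: struct model_buf, model_buf_full() and
model_subbuf_start() are hypothetical stand-ins, and the fullness test
(sub-buffers produced minus sub-buffers consumed, compared against the
number of sub-buffers) is an assumption about what "full" means here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for per-cpu buffer state; not struct rchan_buf. */
struct model_buf {
	size_t n_subbufs;		/* sub-buffers in this channel buffer */
	size_t subbufs_produced;	/* sub-buffers filled by the writer */
	size_t subbufs_consumed;	/* sub-buffers released by the reader */
	unsigned long dropped;		/* writes lost because the buffer was full */
};

/* Assumed meaning of "full": every sub-buffer is still unconsumed. */
static int model_buf_full(const struct model_buf *buf)
{
	assert(buf->subbufs_produced >= buf->subbufs_consumed);

	return buf->subbufs_produced - buf->subbufs_consumed >=
	       buf->n_subbufs;
}

/*
 * Models a no-overwrite subbuf_start(): return 0 to refuse the new
 * sub-buffer (the write is lost and counted), 1 to let logging continue.
 */
static int model_subbuf_start(struct model_buf *buf)
{
	if (model_buf_full(buf)) {
		buf->dropped++;
		return 0;
	}

	return 1;
}
```

Counting the refused sub-buffer switches is one way for a client to
report lost events, since the write functions themselves give no
indication of failure.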

relay_reserve() is used to reserve a slot in a channel buffer which
can be written to later.  This would typically be used in applications
that need to write directly into a channel buffer without having to
stage data in a temporary buffer beforehand.  Because the actual write
may not happen immediately after the slot is reserved, applications
using relay_reserve() can keep a count of the number of bytes actually
written, either in space reserved in the sub-buffers themselves or as
a separate array.  See the 'reserve' example in the relay-apps tarball
at http://relayfs.sourceforge.net for an example of how this can be
done.  Because the write is under control of the client and is
separated from the reserve, relay_reserve() doesn't protect the buffer
at all - it's up to the client to provide the appropriate
synchronization when using relay_reserve().
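
The reserve-then-write pattern and the accompanying byte count can be
sketched in user-space C as follows.  Everything here is a hypothetical
model rather than the relay API: struct model_subbuf, model_reserve()
and model_commit() are invented names, the model covers a single
sub-buffer and simply returns NULL when a slot does not fit, and it
does no locking at all, so the caller is assumed to serialize access.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_SUBBUF_SIZE 64

/* Hypothetical single sub-buffer plus client-side bookkeeping. */
struct model_subbuf {
	unsigned char data[MODEL_SUBBUF_SIZE];
	size_t offset;		/* next free byte; advanced by reserve */
	size_t committed;	/* bytes the client has actually written */
};

/* Hand out a slot of len bytes, or NULL if it does not fit. */
static void *model_reserve(struct model_subbuf *sb, size_t len)
{
	void *slot;

	if (len > MODEL_SUBBUF_SIZE - sb->offset)
		return NULL;

	slot = sb->data + sb->offset;
	sb->offset += len;

	return slot;
}

/*
 * Fill a previously reserved slot and record that its bytes are now
 * valid.  Returns the running total of committed bytes.
 */
static size_t model_commit(struct model_subbuf *sb, void *slot,
			   const void *src, size_t len)
{
	assert(slot != NULL);

	memcpy(slot, src, len);
	sb->committed += len;

	return sb->committed;
}
```

In this model a consumer can compare committed against offset to tell
whether every reserved slot has been filled in yet.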

Closing a channel
-----------------

The client calls relay_close() when it's finished using the channel.
The channel and its associated buffers are destroyed when there are no
longer any references to any of the channel buffers.  relay_flush()
forces a sub-buffer switch on all the channel buffers, and can be used
to finalize and process the last sub-buffers before the channel is
closed.

Misc
----

Some applications may want to keep a channel around and re-use it
rather than open and close a new channel for each use.  relay_reset()
can be used for this purpose - it resets a channel to its initial
state without reallocating channel buffer memory or destroying
existing mappings.  It should however only be called when it's safe to
do so, i.e. when the channel isn't currently being written to.

Finally, there are a couple of utility callbacks that can be used for
different purposes.  buf_mapped() is called whenever a channel buffer
is mmapped from user space and buf_unmapped() is called when it's
unmapped.  The client can use this notification to trigger actions
within the kernel application, such as enabling/disabling logging to
the channel.


Resources
=========

For news, example code, mailing list, etc. see the relay interface homepage:

    http://relayfs.sourceforge.net


Credits
=======

The ideas and specs for the relay interface came about as a result of
discussions on tracing involving the following:

Michel Dagenais		<michel.dagenais@polymtl.ca>
Richard Moore		<richardj_moore@uk.ibm.com>
Bob Wisniewski		<bob@watson.ibm.com>
Karim Yaghmour		<karim@opersys.com>
Tom Zanussi		<zanussi@us.ibm.com>

Also thanks to Hubertus Franke for a lot of useful suggestions and bug
reports.