.. SPDX-License-Identifier: GPL-2.0

=======================================
The padata parallel execution mechanism
=======================================

:Date: May 2020

Padata is a mechanism by which the kernel can farm jobs out to be done in
parallel on multiple CPUs while optionally retaining their ordering.

It was originally developed for IPsec, which needs to perform encryption and
decryption on large numbers of packets without reordering those packets.  This
is currently the sole consumer of padata's serialized job support.

Padata also supports multithreaded jobs, splitting up the job evenly while load
balancing and coordinating between threads.

Running Serialized Jobs
=======================

Initializing
------------

The first step in using padata to run serialized jobs is to set up a
padata_instance structure for overall control of how jobs are to be run::

    #include <linux/padata.h>

    struct padata_instance *padata_alloc(const char *name);

'name' simply identifies the instance.

Then, complete padata initialization by allocating a padata_shell::

    struct padata_shell *padata_alloc_shell(struct padata_instance *pinst);

A padata_shell is used to submit a job to padata and allows a series of such
jobs to be serialized independently.  A padata_instance may have one or more
padata_shells associated with it, each allowing a separate series of jobs.

Modifying cpumasks
------------------

The CPUs used to run jobs can be changed in two ways, programmatically with
padata_set_cpumask() or via sysfs.  The former is defined::

    int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type,
                           cpumask_var_t cpumask);

Here cpumask_type is one of PADATA_CPU_PARALLEL or PADATA_CPU_SERIAL, where a
parallel cpumask describes which processors will be used to execute jobs
submitted to this instance in parallel and a serial cpumask defines which
processors are allowed to be used as the serialization callback processor.
cpumask specifies the new cpumask to use.

There may be sysfs files for an instance's cpumasks.  For example, pcrypt's
live in /sys/kernel/pcrypt/<instance-name>.  Within an instance's directory
there are two files, parallel_cpumask and serial_cpumask, and either cpumask
may be changed by echoing a bitmask into the file, for example::

    echo f > /sys/kernel/pcrypt/pencrypt/parallel_cpumask

Reading one of these files shows the user-supplied cpumask, which may be
different from the 'usable' cpumask.

Padata maintains two pairs of cpumasks internally, the user-supplied cpumasks
and the 'usable' cpumasks.  (Each pair consists of a parallel and a serial
cpumask.)  The user-supplied cpumasks default to all possible CPUs on instance
allocation and may be changed as above.  The usable cpumasks are always a
subset of the user-supplied cpumasks and contain only the online CPUs in the
user-supplied masks; these are the cpumasks padata actually uses.  So it is
legal to supply a cpumask to padata that contains offline CPUs.  Once an
offline CPU in the user-supplied cpumask comes online, padata is going to use
it.
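
The relationship between the user-supplied and usable masks can be illustrated
with a small userspace sketch.  It is not padata's implementation: the 64-bit
integer masks and the names ``toy_masks``, ``toy_usable()`` and
``toy_cpu_online()`` are hypothetical stand-ins for the kernel's cpumask
handling, chosen only to show that a usable mask is the user-supplied mask
restricted to the CPUs that are currently online.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one pair of padata cpumasks (bit N = CPU N). */
struct toy_masks {
    uint64_t parallel;
    uint64_t serial;
};

/* A usable mask keeps only the CPUs of the user-supplied mask that are online. */
static struct toy_masks toy_usable(struct toy_masks user, uint64_t online)
{
    struct toy_masks usable = {
        .parallel = user.parallel & online,
        .serial = user.serial & online,
    };

    return usable;
}

/*
 * A CPU comes online: the user-supplied masks are left untouched and only
 * the usable masks are recomputed, so a previously offline CPU that was
 * already in a user-supplied mask becomes usable.
 */
static struct toy_masks toy_cpu_online(struct toy_masks user, uint64_t *online,
                                       unsigned int cpu)
{
    assert(cpu < 64);
    *online |= UINT64_C(1) << cpu;
    return toy_usable(user, *online);
}
```

With a user-supplied parallel mask of ``0xf`` and only CPUs 0 and 2 online,
``toy_usable()`` yields ``0x5``; once CPU 1 comes online through
``toy_cpu_online()``, the usable parallel mask grows to ``0x7`` without the
user-supplied mask being written again.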

Changing the CPU masks is an expensive operation, so it should not be done
frequently.

Running A Job
-------------

Actually submitting work to the padata instance requires the creation of a
padata_priv structure, which represents one job::

    struct padata_priv {
        /* Other stuff here... */
        void                    (*parallel)(struct padata_priv *padata);
        void                    (*serial)(struct padata_priv *padata);
    };

This structure will almost certainly be embedded within some larger
structure specific to the work to be done.  Most of its fields are private to
padata, but the structure should be zeroed at initialisation time, and the
parallel() and serial() functions should be provided.  Those functions will
be called in the process of getting the work done as we will see
momentarily.

The submission of the job is done with::

    int padata_do_parallel(struct padata_shell *ps,
                           struct padata_priv *padata, int *cb_cpu);

The ps and padata structures must be set up as described above; cb_cpu
points to the preferred CPU to be used for the final callback when the job is
done; it must be in the current instance's CPU mask (if it is not, the value
that cb_cpu points to is updated to the CPU actually chosen).  The return
value from padata_do_parallel() is zero on success, indicating that the job is
in progress.  -EBUSY means that somebody, somewhere else is messing with the
instance's CPU mask, while -EINVAL is a complaint about cb_cpu not being in the
serial cpumask, no online CPUs in the parallel or serial cpumasks, or a stopped
instance.

Each job submitted to padata_do_parallel() will, in turn, be passed to
exactly one call to the above-mentioned parallel() function, on one CPU, so
true parallelism is achieved by submitting multiple jobs.  parallel() runs with
software interrupts disabled and thus cannot sleep.  The parallel()
function gets the padata_priv structure pointer as its lone parameter;
information about the actual work to be done is probably obtained by using
container_of() to find the enclosing structure.
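
The embedding pattern can be sketched in userspace as follows.  This is an
illustration under stated assumptions, not kernel code: ``struct padata_priv``
is reduced to its two callbacks, ``container_of()`` is redefined with
``offsetof()``, and ``struct sum_job`` and ``sum_parallel()`` are hypothetical
names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in: the real structure also has fields private to padata. */
struct padata_priv {
    void (*parallel)(struct padata_priv *padata);
    void (*serial)(struct padata_priv *padata);
};

/* Userspace equivalent of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical job: padata_priv is embedded in a job-specific structure. */
struct sum_job {
    const int *values;
    size_t count;
    long result;
    struct padata_priv padata;
};

/* parallel() callback: recover the enclosing job from the padata_priv pointer. */
static void sum_parallel(struct padata_priv *padata)
{
    struct sum_job *job = container_of(padata, struct sum_job, padata);
    size_t i;

    assert(job->values != NULL || job->count == 0);
    job->result = 0;
    for (i = 0; i < job->count; i++)
        job->result += job->values[i];
}
```

In this sketch the callback is invoked directly through
``job.padata.parallel(&job.padata)``; in the kernel it is padata that makes the
call after the job has been submitted.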

Note that parallel() has no return value; the padata subsystem assumes that
parallel() will take responsibility for the job from this point.  The job
need not be completed during this call, but, if parallel() leaves work
outstanding, it should be prepared to be called again with a new job before
the previous one completes.

Serializing Jobs
----------------

When a job does complete, parallel() (or whatever function actually finishes
the work) should inform padata of the fact with a call to::

    void padata_do_serial(struct padata_priv *padata);

At some point in the future, padata_do_serial() will trigger a call to the
serial() function in the padata_priv structure.  That call will happen on
the CPU requested in the initial call to padata_do_parallel(); it, too, is
run with local software interrupts disabled.
Note that this call may be deferred for a while since the padata code takes
pains to ensure that jobs are completed in the order in which they were
submitted.
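
Why the serial() call may be deferred can be seen in a toy model of in-order
completion.  The sketch below is an assumption-laden userspace illustration
and does not reproduce padata's actual bookkeeping: ``struct toy_reorder``,
``toy_submit()`` and ``toy_complete()`` are hypothetical names, and the serial
step is modeled by recording a sequence number instead of calling a callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_JOBS 8

/* Hypothetical reorder state for one series of jobs. */
struct toy_reorder {
    bool done[TOY_MAX_JOBS];    /* done[seq] is set once job seq has finished */
    size_t submitted;           /* next sequence number to hand out */
    size_t next_serial;         /* next job whose serial step may run */
    size_t order[TOY_MAX_JOBS]; /* order in which serial steps ran */
    size_t nr_serialized;
};

/* Submission: the job receives its position in the series. */
static size_t toy_submit(struct toy_reorder *q)
{
    assert(q->submitted < TOY_MAX_JOBS);
    return q->submitted++;
}

/*
 * Completion: mark the job done, then run the serial step for every
 * finished job at the head of the series.  A job that finishes early
 * waits here until all of its predecessors are done.
 */
static void toy_complete(struct toy_reorder *q, size_t seq)
{
    assert(seq < q->submitted && !q->done[seq]);
    q->done[seq] = true;
    while (q->next_serial < q->submitted && q->done[q->next_serial])
        q->order[q->nr_serialized++] = q->next_serial++;
}
```

If three jobs are submitted and the third finishes first, nothing is
serialized until the first job completes; the serial steps then run in
submission order regardless of the order of completion.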

Destroying
----------

Cleaning up a padata instance predictably involves calling the two free
functions that correspond to the allocation in reverse::

    void padata_free_shell(struct padata_shell *ps);
    void padata_free(struct padata_instance *pinst);

It is the user's responsibility to ensure all outstanding jobs are complete
before any of the above are called.
155bfcdcef8SDaniel Jordan
156ec3b39c7SDaniel JordanRunning Multithreaded Jobs
157ec3b39c7SDaniel Jordan==========================
158ec3b39c7SDaniel Jordan
159ec3b39c7SDaniel JordanA multithreaded job has a main thread and zero or more helper threads, with the
160ec3b39c7SDaniel Jordanmain thread participating in the job and then waiting until all helpers have
161ec3b39c7SDaniel Jordanfinished.  padata splits the job into units called chunks, where a chunk is a
162ec3b39c7SDaniel Jordanpiece of the job that one thread completes in one call to the thread function.
163ec3b39c7SDaniel Jordan
164ec3b39c7SDaniel JordanA user has to do three things to run a multithreaded job.  First, describe the
165ec3b39c7SDaniel Jordanjob by defining a padata_mt_job structure, which is explained in the Interface
166ec3b39c7SDaniel Jordansection.  This includes a pointer to the thread function, which padata will
167ec3b39c7SDaniel Jordancall each time it assigns a job chunk to a thread.  Then, define the thread
168ec3b39c7SDaniel Jordanfunction, which accepts three arguments, ``start``, ``end``, and ``arg``, where
169ec3b39c7SDaniel Jordanthe first two delimit the range that the thread operates on and the last is a
170ec3b39c7SDaniel Jordanpointer to the job's shared state, if any.  Prepare the shared state, which is
171ec3b39c7SDaniel Jordantypically allocated on the main thread's stack.  Last, call
172ec3b39c7SDaniel Jordanpadata_do_multithreaded(), which will return once the job is finished.
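
The shape of a thread function and the way a job is cut into chunks can be
sketched in userspace.  The example makes several assumptions: the range is
treated as half-open, ``[start, end)``; the chunks are walked sequentially by a
hypothetical ``toy_run_chunks()`` helper rather than handed to threads by
padata_do_multithreaded(); and ``toy_sum_state`` and ``toy_sum_chunk()`` are
names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Thread function shape described above: a range plus shared state. */
typedef void (*toy_thread_fn)(unsigned long start, unsigned long end,
                              void *arg);

/* Hypothetical shared state, kept on the caller's stack in this sketch. */
struct toy_sum_state {
    const int *values;
    long total;
};

/* Processes one chunk, [start, end), of the job. */
static void toy_sum_chunk(unsigned long start, unsigned long end, void *arg)
{
    struct toy_sum_state *state = arg;
    unsigned long i;

    for (i = start; i < end; i++)
        state->total += state->values[i];
}

/*
 * Sequential stand-in for the dispatcher: walks [start, start + size) in
 * chunks of at most chunk_size and calls fn once per chunk.  Returns the
 * number of chunks processed.
 */
static unsigned long toy_run_chunks(toy_thread_fn fn, void *arg,
                                    unsigned long start, unsigned long size,
                                    unsigned long chunk_size)
{
    unsigned long end = start + size;
    unsigned long nr_chunks = 0;

    assert(chunk_size > 0);
    while (start < end) {
        unsigned long chunk_end =
            end - start < chunk_size ? end : start + chunk_size;

        fn(start, chunk_end, arg);
        start = chunk_end;
        nr_chunks++;
    }
    return nr_chunks;
}
```

Because the sketch runs every chunk on one thread, updating ``total`` without
synchronization is safe here; a thread function that runs concurrently on
several threads has to protect any shared state it modifies.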

Interface
=========

.. kernel-doc:: include/linux/padata.h
.. kernel-doc:: kernel/padata.c