.. SPDX-License-Identifier: GPL-2.0

=================================
Network Filesystem Helper Library
=================================

.. Contents:

 - Overview.
 - Per-inode context.
   - Inode context helper functions.
 - Buffered read helpers.
   - Read helper functions.
   - Read helper structures.
   - Read helper operations.
   - Read helper procedure.
   - Read helper cache API.


Overview
========

The network filesystem helper library is a set of functions designed to aid a
network filesystem in implementing VM/VFS operations.  For the moment, that
just includes turning various VM buffered read operations into requests to read
from the server.  The helper library, however, can also interpose other
services, such as local caching or local data encryption.

Note that the library module doesn't link against local caching directly, so
access must be provided by the netfs.


Per-Inode Context
=================

The network filesystem helper library needs a place to store a bit of state for
its use on each netfs inode it is helping to manage.  To this end, a context
structure is defined::

	struct netfs_inode {
		struct inode inode;
		const struct netfs_request_ops *ops;
		struct fscache_cookie *cache;
	};

A network filesystem that wants to use netfslib must place one of these in its
inode wrapper struct instead of the VFS ``struct inode``.  This can be done in
a way similar to the following::

	struct my_inode {
		struct netfs_inode netfs; /* Netfslib context and vfs inode */
		...
	};

This allows netfslib to find its state by using ``container_of()`` from the
inode pointer, thereby allowing the netfslib helper functions to be pointed to
directly by the VFS/VM operation tables.
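
The pointer arithmetic behind this can be sketched in userspace.  The following
is only an illustration under stated assumptions: the structures are cut-down
stand-ins for the kernel types, ``container_of()`` is a simplified version of
the kernel macro, and ``inode_to_netfs()`` and ``inode_to_my()`` are
hypothetical names that are not part of netfslib.

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the layout matters here. */
struct inode { unsigned long i_ino; };
struct netfs_request_ops;		/* opaque in this sketch */

struct netfs_inode {
	struct inode inode;		/* embedded VFS inode */
	const struct netfs_request_ops *ops;
};

/* Hypothetical filesystem inode wrapper, as in the example above. */
struct my_inode {
	struct netfs_inode netfs;	/* netfslib context and VFS inode */
	int my_flags;			/* filesystem-private state */
};

/* Simplified container_of(); the kernel macro adds type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical helpers: walk from the VFS inode back to the wrappers. */
static struct netfs_inode *inode_to_netfs(struct inode *inode)
{
	return container_of(inode, struct netfs_inode, inode);
}

static struct my_inode *inode_to_my(struct inode *inode)
{
	return container_of(inode_to_netfs(inode), struct my_inode, netfs);
}
```

Because each structure embeds the previous one rather than pointing to it, no
extra allocation or lookup is needed to get from the VFS inode to either the
netfslib context or the filesystem's own state.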

The structure contains the following fields:

 * ``inode``

   The VFS inode structure.

 * ``ops``

   The set of operations provided by the network filesystem to netfslib.

 * ``cache``

   Local caching cookie, or NULL if no caching is enabled.  This field does not
   exist if fscache is disabled.


Inode Context Helper Functions
------------------------------

To help deal with the per-inode context, a number of helper functions are
provided.  Firstly, a function to perform basic initialisation on a context and
set the operations table pointer::

	void netfs_inode_init(struct netfs_inode *ctx,
			      const struct netfs_request_ops *ops);

then a function to cast from the VFS inode structure to the netfs context::

	struct netfs_inode *netfs_node(struct inode *inode);

and finally, a function to get the cache cookie pointer from the context
attached to an inode (or NULL if fscache is disabled)::

	struct fscache_cookie *netfs_i_cookie(struct netfs_inode *ctx);


Buffered Read Helpers
=====================

The library provides a set of read helpers that handle the ->read_folio(),
->readahead() and much of the ->write_begin() VM operations and translate them
into a common call framework.

The following services are provided:

 * Handle folios that span multiple pages.

 * Insulate the netfs from VM interface changes.

 * Allow the netfs to arbitrarily split reads up into pieces, even ones that
   don't match folio sizes or folio alignments and that may cross folios.

 * Allow the netfs to expand a readahead request in both directions to meet its
   needs.

 * Allow the netfs to partially fulfil a read, which will then be resubmitted.

 * Handle local caching, allowing cached data and server-read data to be
   interleaved for a single request.

 * Handle clearing of bufferage that isn't on the server.

 * Handle retrying of reads that failed, switching reads from the cache to the
   server as necessary.

 * In the future, this is a place where other services can be performed, such
   as local encryption of data to be stored remotely or in the cache.

From the network filesystem, the helpers require a table of operations.  This
includes a mandatory method to issue a read operation along with a number of
optional methods.


Read Helper Functions
---------------------

Three read helpers are provided::

	void netfs_readahead(struct readahead_control *ractl);
	int netfs_read_folio(struct file *file,
			     struct folio *folio);
	int netfs_write_begin(struct netfs_inode *ctx,
			      struct file *file,
			      struct address_space *mapping,
			      loff_t pos,
			      unsigned int len,
			      struct folio **_folio,
			      void **_fsdata);

Each corresponds to a VM address space operation.  These operations use the
state in the per-inode context.

For ->readahead() and ->read_folio(), the network filesystem just points
directly at the corresponding read helper; whereas for ->write_begin(), it may
be a little more complicated as the network filesystem might want to flush
conflicting writes or track dirty data and needs to put the acquired folio if
an error occurs after calling the helper.

The helpers manage the read request, calling back into the network filesystem
through the supplied table of operations.  Waits will be performed as
necessary before returning for helpers that are meant to be synchronous.

If an error occurs, ->free_request() will be called to clean up the
netfs_io_request struct allocated.  If some parts of the request are in
progress when an error occurs, the request will get partially completed if
sufficient data is read.

Additionally, there is::

	void netfs_subreq_terminated(struct netfs_io_subrequest *subreq,
				     ssize_t transferred_or_error,
				     bool was_async);

which should be called to complete a read subrequest.  This is given the number
of bytes transferred or a negative error code, plus a flag indicating whether
the operation was asynchronous (ie. whether the follow-on processing can be
done in the current context, given this may involve sleeping).
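
The ``transferred_or_error`` convention can be illustrated with a userspace
mock.  This is a sketch under stated assumptions, not the library's
implementation: the structure is a cut-down stand-in, the ``error`` field and
``myfs_read_done()`` are hypothetical, and the mock of
``netfs_subreq_terminated()`` only records the value it is handed, whereas the
real function goes on to drive the rest of the request.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Userspace stand-in for the kernel structure (subset of fields). */
struct netfs_io_subrequest {
	long long	start;
	size_t		len;
	size_t		transferred;
	int		error;	/* hypothetical: records the result in this mock */
};

/* Mock of the library entry point: record what the filesystem reported. */
static void netfs_subreq_terminated(struct netfs_io_subrequest *subreq,
				    ssize_t transferred_or_error,
				    bool was_async)
{
	(void)was_async;
	if (transferred_or_error < 0)
		subreq->error = (int)transferred_or_error;
	else
		subreq->transferred += (size_t)transferred_or_error;
}

/*
 * Hypothetical completion path in a filesystem: server_result is the
 * byte count returned by the transport, or a negative errno.
 */
static void myfs_read_done(struct netfs_io_subrequest *subreq,
			   ssize_t server_result)
{
	/* Completed in the caller's context here, so was_async is false. */
	netfs_subreq_terminated(subreq, server_result, false);
}
```

A single signed argument carries both outcomes, so the filesystem's completion
path needs no separate success and failure branches.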


Read Helper Structures
----------------------

The read helpers make use of a couple of structures to maintain the state of
the read.  The first is a structure that manages a read request as a whole::

	struct netfs_io_request {
		struct inode		*inode;
		struct address_space	*mapping;
		struct netfs_cache_resources cache_resources;
		void			*netfs_priv;
		loff_t			start;
		size_t			len;
		loff_t			i_size;
		const struct netfs_request_ops *netfs_ops;
		unsigned int		debug_id;
		...
	};

The above fields are the ones the netfs can use.  They are:

 * ``inode``
 * ``mapping``

   The inode and the address space of the file being read from.  The mapping
   may or may not point to inode->i_data.

 * ``cache_resources``

   Resources for the local cache to use, if present.

 * ``netfs_priv``

   The network filesystem's private data.  The value for this can be passed in
   to the helper functions or set during the request.

 * ``start``
 * ``len``

   The file position of the start of the read request and the length.  These
   may be altered by the ->expand_readahead() op.

 * ``i_size``

   The size of the file at the start of the request.

 * ``netfs_ops``

   A pointer to the operation table.  The value for this is passed into the
   helper functions.

 * ``debug_id``

   A number allocated to this operation that can be displayed in trace lines
   for reference.


The second structure is used to manage individual slices of the overall read
request::

	struct netfs_io_subrequest {
		struct netfs_io_request *rreq;
		loff_t			start;
		size_t			len;
		size_t			transferred;
		unsigned long		flags;
		unsigned short		debug_index;
		...
	};

Each subrequest is expected to access a single source, though the helpers will
handle falling back from one source type to another.  The members are:

 * ``rreq``

   A pointer to the read request.

 * ``start``
 * ``len``

   The file position of the start of this slice of the read request and the
   length.

 * ``transferred``

   The amount of data transferred so far of the length of this slice.  The
   network filesystem or cache should start the operation this far into the
   slice.  If a short read occurs, the helpers will call again, having updated
   this to reflect the amount read so far.

 * ``flags``

   Flags pertaining to the read.  There are two of interest to the filesystem
   or cache:

   * ``NETFS_SREQ_CLEAR_TAIL``

     This can be set to indicate that the remainder of the slice, from
     transferred to len, should be cleared.

   * ``NETFS_SREQ_SEEK_DATA_READ``

     This is a hint to the cache that it might want to try skipping ahead to
     the next data (ie. using SEEK_DATA).

 * ``debug_index``

   A number allocated to this slice that can be displayed in trace lines for
   reference.
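
How ``start``, ``len`` and ``transferred`` combine on a short read can be shown
with a small userspace model.  This is a sketch under stated assumptions: the
structure is a cut-down stand-in, the helper names are hypothetical, and the
flag is modelled as a plain bit mask with an arbitrary value, which may differ
from how the kernel stores and tests it.

```c
#include <stddef.h>
#include <string.h>

#define NETFS_SREQ_CLEAR_TAIL	0x1UL	/* bit value is arbitrary in this sketch */

struct netfs_io_subrequest {		/* userspace stand-in, subset of fields */
	long long	start;
	size_t		len;
	size_t		transferred;
	unsigned long	flags;
};

/* Where the next (re)issued read should begin. */
static long long subreq_resume_pos(const struct netfs_io_subrequest *s)
{
	return s->start + (long long)s->transferred;
}

/* How much of the slice is still outstanding. */
static size_t subreq_remaining(const struct netfs_io_subrequest *s)
{
	return s->len - s->transferred;
}

/*
 * Hypothetical helper: zero the unread tail of the slice's buffer if
 * CLEAR_TAIL is set.  Returns the number of bytes cleared.
 */
static size_t subreq_clear_tail(struct netfs_io_subrequest *s, char *buf)
{
	size_t n = subreq_remaining(s);

	if (!(s->flags & NETFS_SREQ_CLEAR_TAIL))
		return 0;
	memset(buf + s->transferred, 0, n);
	s->transferred = s->len;
	return n;
}
```

For a slice starting at file position 100 with ``len`` 8 and ``transferred``
5, a reissued read would resume at position 105 for 3 bytes; with the flag set,
those 3 bytes are zeroed instead.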


Read Helper Operations
----------------------

The network filesystem must provide the read helpers with a table of operations
through which it can issue requests and negotiate::

	struct netfs_request_ops {
		void (*init_request)(struct netfs_io_request *rreq, struct file *file);
		void (*free_request)(struct netfs_io_request *rreq);
		int (*begin_cache_operation)(struct netfs_io_request *rreq);
		void (*expand_readahead)(struct netfs_io_request *rreq);
		bool (*clamp_length)(struct netfs_io_subrequest *subreq);
		void (*issue_read)(struct netfs_io_subrequest *subreq);
		bool (*is_still_valid)(struct netfs_io_request *rreq);
		int (*check_write_begin)(struct file *file, loff_t pos, unsigned len,
					 struct folio **foliop, void **_fsdata);
		void (*done)(struct netfs_io_request *rreq);
	};

The operations are as follows:

 * ``init_request()``

   [Optional] This is called to initialise the request structure.  It is given
   the file for reference.

 * ``free_request()``

   [Optional] This is called as the request is being deallocated so that the
   filesystem can clean up any state it has attached there.

 * ``begin_cache_operation()``

   [Optional] This is called to ask the network filesystem to call into the
   cache (if present) to initialise the caching state for this read.  The netfs
   library module cannot access the cache directly, so the filesystem should
   call something like fscache_begin_read_operation() to do this.

   The cache gets to store its state in ->cache_resources and must set a table
   of operations of its own there (though of a different type).

   This should return 0 on success and an error code otherwise.  If an error is
   reported, the operation may proceed anyway, just without local caching (only
   out of memory and interruption errors cause failure here).

 * ``expand_readahead()``

   [Optional] This is called to allow the filesystem to expand the size of a
   readahead read request.  The filesystem gets to expand the request in both
   directions, though it's not permitted to reduce it as the numbers may
   represent an allocation already made.  If local caching is enabled, it gets
   to expand the request first.

   Expansion is communicated by changing ->start and ->len in the request
   structure.  Note that if any change is made, ->len must be increased by at
   least as much as ->start is reduced.

 * ``clamp_length()``

   [Optional] This is called to allow the filesystem to reduce the size of a
   subrequest.  The filesystem can use this, for example, to chop up a request
   that has to be split across multiple servers or to put multiple reads in
   flight.

   This should return true on success and false on failure.

 * ``issue_read()``

   [Required] The helpers use this to dispatch a subrequest to the server for
   reading.  In the subrequest, ->start, ->len and ->transferred indicate what
   data should be read from the server.

   There is no return value; the netfs_subreq_terminated() function should be
   called to indicate whether or not the operation succeeded and how much data
   it transferred.  The filesystem also should not deal with setting folios
   uptodate, unlocking them or dropping their refs - the helpers need to deal
   with this as they have to coordinate with copying to the local cache.

   Note that the helpers have the folios locked, but not pinned.  It is
   possible to use the ITER_XARRAY iov iterator to refer to the range of the
   inode that is being operated upon without the need to allocate large bvec
   tables.

 * ``is_still_valid()``

   [Optional] This is called to find out if the data just read from the local
   cache is still valid.  It should return true if it is still valid and false
   if not.  If it's not still valid, it will be reread from the server.

 * ``check_write_begin()``

   [Optional] This is called from the netfs_write_begin() helper once it has
   allocated/grabbed the folio to be modified to allow the filesystem to flush
   conflicting state before allowing it to be modified.

   It may unlock and discard the folio it was given and set the caller's folio
   pointer to NULL.  It should return 0 if everything is now fine (``*foliop``
   left set) or if the op should be retried (``*foliop`` cleared), and any
   other error code to abort the operation.

 * ``done()``

   [Optional] This is called after the folios in the request have all been
   unlocked (and marked uptodate if applicable).
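
The ``expand_readahead()`` and ``clamp_length()`` descriptions above can be
sketched for a hypothetical filesystem that stores data in fixed-size server
objects.  This is an illustration under stated assumptions: the structures are
userspace stand-ins with only the fields used here, and ``MYFS_CHUNK``,
``MYFS_RSIZE`` and the ``myfs_*`` functions are invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>

#define MYFS_CHUNK	(256 * 1024LL)		/* hypothetical server object size */
#define MYFS_RSIZE	((size_t)64 * 1024)	/* hypothetical maximum read size */

struct netfs_io_request {		/* userspace stand-in, subset of fields */
	long long	start;
	size_t		len;
};

struct netfs_io_subrequest {		/* userspace stand-in, subset of fields */
	struct netfs_io_request *rreq;
	long long	start;
	size_t		len;
};

/* Round the request outwards to chunk boundaries; never shrink it. */
static void myfs_expand_readahead(struct netfs_io_request *rreq)
{
	long long end = rreq->start + (long long)rreq->len;
	long long new_start = rreq->start - rreq->start % MYFS_CHUNK;
	long long new_end = (end + MYFS_CHUNK - 1) / MYFS_CHUNK * MYFS_CHUNK;

	rreq->start = new_start;
	rreq->len = (size_t)(new_end - new_start);
}

/* Limit each subrequest to the hypothetical rsize. */
static bool myfs_clamp_length(struct netfs_io_subrequest *subreq)
{
	if (subreq->len > MYFS_RSIZE)
		subreq->len = MYFS_RSIZE;
	return true;
}
```

Rounding the start down and the end up means the new end is never before the
old one, so ->len always grows by at least as much as ->start is reduced.  The
clamp only ever shortens a subrequest, leaving the helpers to issue the
remainder as further slices.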



Read Helper Procedure
---------------------

The read helpers work by the following general procedure:

 * Set up the request.

 * For readahead, allow the local cache and then the network filesystem to
   propose expansions to the read request.  This is then proposed to the VM.
   If the VM cannot fully perform the expansion, a partially expanded read will
   be performed, though this may not get written to the cache in its entirety.

 * Loop around slicing chunks off of the request to form subrequests:

   * If a local cache is present, it gets to do the slicing, otherwise the
     helpers just try to generate maximal slices.

   * The network filesystem gets to clamp the size of each slice if it is to be
     the source.  This allows rsize and chunking to be implemented.

   * The helpers issue a read from the cache or a read from the server or just
     clear the slice as appropriate.

   * The next slice begins at the end of the last one.

   * As slices finish being read, they terminate.

 * When all the subrequests have terminated, the subrequests are assessed and
   any that are short or have failed are reissued:

   * Failed cache requests are issued against the server instead.

   * Failed server requests just fail.

   * Short reads against either source will be reissued against that source
     provided they have transferred some more data:

     * The cache may need to skip holes that it can't do DIO from.

     * If NETFS_SREQ_CLEAR_TAIL was set, a short read will be cleared to the
       end of the slice instead of reissuing.

 * Once the data is read, the folios that have been fully read/cleared:

   * Will be marked uptodate.

   * If a cache is present, will be marked with PG_fscache.

   * Will be unlocked.

 * Any folios that need writing to the cache will then have DIO writes issued.

 * Synchronous operations will wait for reading to be complete.

 * Writes to the cache will proceed asynchronously and the folios will have the
   PG_fscache mark removed when that completes.

 * The request structures will be cleaned up when everything has completed.
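
The slicing loop in the procedure above can be modelled in a few lines of
userspace C.  This is a simplified sketch, not the helpers' actual code: it
assumes no cache is present and a single fixed ``rsize`` clamp, and
``struct slice`` and ``slice_request()`` are hypothetical names.

```c
#include <stddef.h>

#define MAX_SLICES 16

struct slice {			/* hypothetical stand-in for a subrequest */
	long long	start;
	size_t		len;
};

/*
 * Cut [start, start + len) into consecutive slices of at most rsize
 * bytes.  Returns the number of slices written to out[], or -1 if the
 * request needs more than max slices.
 */
static int slice_request(long long start, size_t len, size_t rsize,
			 struct slice *out, int max)
{
	int n = 0;

	while (len > 0) {
		size_t part = len < rsize ? len : rsize;	/* clamp */

		if (n == max)
			return -1;
		out[n].start = start;
		out[n].len = part;
		n++;
		start += (long long)part;	/* next slice begins here */
		len -= part;
	}
	return n;
}
```

A 10000-byte request at position 4096 with an ``rsize`` of 4096 yields three
slices: two of 4096 bytes and a final one of 1808 bytes, each starting where
the previous one ended.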


Read Helper Cache API
---------------------

When implementing a local cache to be used by the read helpers, two things are
required: some way for the network filesystem to initialise the caching for a
read request and a table of operations for the helpers to call.

The network filesystem's ->begin_cache_operation() method is called to set up a
cache and this must call into the cache to do the work.  If using fscache, for
example, the filesystem would call::

	int fscache_begin_read_operation(struct netfs_io_request *rreq,
					 struct fscache_cookie *cookie);

passing in the request pointer and the cookie corresponding to the file.

The netfs_io_request object contains a place for the cache to hang its
state::

	struct netfs_cache_resources {
		const struct netfs_cache_ops	*ops;
		void				*cache_priv;
		void				*cache_priv2;
	};

This contains an operations table pointer and two private pointers.  The
operation table looks like the following::

	struct netfs_cache_ops {
		void (*end_operation)(struct netfs_cache_resources *cres);

		void (*expand_readahead)(struct netfs_cache_resources *cres,
					 loff_t *_start, size_t *_len, loff_t i_size);

		enum netfs_io_source (*prepare_read)(struct netfs_io_subrequest *subreq,
						       loff_t i_size);

		int (*read)(struct netfs_cache_resources *cres,
			    loff_t start_pos,
			    struct iov_iter *iter,
			    bool seek_data,
			    netfs_io_terminated_t term_func,
			    void *term_func_priv);

		int (*prepare_write)(struct netfs_cache_resources *cres,
				     loff_t *_start, size_t *_len, loff_t i_size,
				     bool no_space_allocated_yet);

		int (*write)(struct netfs_cache_resources *cres,
			     loff_t start_pos,
			     struct iov_iter *iter,
			     netfs_io_terminated_t term_func,
			     void *term_func_priv);

		int (*query_occupancy)(struct netfs_cache_resources *cres,
				       loff_t start, size_t len, size_t granularity,
				       loff_t *_data_start, size_t *_data_len);
	};

With a termination handler function pointer::

	typedef void (*netfs_io_terminated_t)(void *priv,
					      ssize_t transferred_or_error,
					      bool was_async);
520fb28afccSDavid Howells
521fb28afccSDavid HowellsThe methods defined in the table are:
522fb28afccSDavid Howells
523fb28afccSDavid Howells * ``end_operation()``
524fb28afccSDavid Howells
525fb28afccSDavid Howells   [Required] Called to clean up the resources at the end of the read request.
526fb28afccSDavid Howells
 * ``expand_readahead()``

   [Optional] Called at the beginning of a netfs_readahead() operation to allow
   the cache to expand a request in either direction.  This allows the cache to
   size the request appropriately for the cache granularity.

   The function is passed pointers to the start and length in its parameters,
   plus the size of the file for reference, and adjusts the start and length
   appropriately.

 * ``prepare_read()``

   [Required] Called to configure the next slice of a request.  ->start and
   ->len in the subrequest indicate where and how big the next slice can be;
   the cache gets to reduce the length to match its granularity requirements.
   It should return one of:

   * ``NETFS_FILL_WITH_ZEROES``
   * ``NETFS_DOWNLOAD_FROM_SERVER``
   * ``NETFS_READ_FROM_CACHE``
   * ``NETFS_INVALID_READ``

   to indicate whether the slice should just be cleared, downloaded from the
   server or read from the cache, or whether slicing should be given up at the
   current point.

 * ``read()``

   [Required] Called to read from the cache.  The start file offset is given
   along with an iterator to read into, which also gives the length.  It can be
   given a hint requesting that it seek forward from that start position for
   data.

   Also provided is a pointer to a termination handler function and private
   data to pass to that function.  The termination function should be called
   with the number of bytes transferred or an error code, plus a flag
   indicating whether the termination is definitely happening in the caller's
   context.

 * ``prepare_write()``

   [Required] Called to prepare for a write to the cache.  This involves
   checking whether the cache has sufficient space to honour the write.
   ``*_start`` and ``*_len`` indicate the region to be written; the region can
   be shrunk, or it can be expanded to a page boundary in either direction as
   necessary to align for direct I/O.  i_size holds the size of the object and
   is provided for reference.  no_space_allocated_yet is set to true if the
   caller is certain that no data has been written to that region, for example
   if it has already tried to do a read from there.

 * ``write()``

   [Required] Called to write to the cache.  The start file offset is given
   along with an iterator to write from, which gives the length also.

   Also provided is a pointer to a termination handler function and private
   data to pass to that function.  The termination function should be called
   with the number of bytes transferred or an error code, plus a flag
   indicating whether the termination is definitely happening in the caller's
   context.

 * ``query_occupancy()``

   [Required] Called to find out where the next piece of data is within a
   particular region of the cache.  The start and length of the region to be
   queried are passed in, along with the granularity to which the answer needs
   to be aligned.  The function passes back the start and length of the data,
   if any, available within that region.  Note that there may be a hole at the
   front.

   It returns 0 if some data was found, -ENODATA if there was no usable data
   within the region or -ENOBUFS if there is no caching on this file.

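The expansion described for ``expand_readahead()`` can be pictured as rounding
the request outwards to the cache granule size.  The userspace sketch below is
only an illustration: ``demo_expand_readahead()`` is a hypothetical name, the
granule size is passed in as an assumed parameter (a real cache determines its
own granularity and is handed the cache resource structure), and stopping at
the granule that contains the end of the file is a design choice of this
example rather than documented behaviour::

	#include <stddef.h>
	#include <stdint.h>

	/*
	 * Hypothetical helper: widen [*_start, *_start + *_len) to granule
	 * boundaries without going past the granule that holds EOF.
	 * granule must be non-zero.
	 */
	static void demo_expand_readahead(uint64_t *_start, size_t *_len,
					  uint64_t i_size, uint64_t granule)
	{
		uint64_t start = *_start - (*_start % granule); /* Round down */
		uint64_t end = *_start + *_len + granule - 1;
		uint64_t eof = i_size + granule - 1;

		end -= end % granule;	/* Request end, rounded up */
		eof -= eof % granule;	/* File size, rounded up */
		if (end > eof && eof > start)
			end = eof;

		*_start = start;
		*_len = (size_t)(end - start);
	}

With a 4096-byte granule, this sketch would widen a request for 3000 bytes at
offset 5000 into a request for 4096 bytes at offset 4096.
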
Note that these methods are passed a pointer to the cache resource structure
rather than the read request structure, as they may also be used in situations
where there is no read request structure, such as writing dirty data to the
cache.

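The ``query_occupancy()`` contract described above can be illustrated with a
small userspace model.  Everything named here is hypothetical: the cache
contents are represented by a sorted array of ``struct demo_extent``, and
``demo_query_occupancy()`` is an invented function, not a netfs or cache
backend API.  Alignment is handled by shrinking the data found inwards to the
granularity, which is one plausible reading of the requirement; the -ENOBUFS
case is modelled by passing no extent table at all::

	#include <errno.h>
	#include <stddef.h>
	#include <stdint.h>

	/* Hypothetical cached extent covering [start, start + len). */
	struct demo_extent {
		uint64_t start;
		uint64_t len;
	};

	/*
	 * Hypothetical model: report the first run of cached data inside
	 * [start, start + len), trimmed to granule-aligned boundaries.
	 * Extents are assumed to be sorted and non-overlapping.
	 */
	static int demo_query_occupancy(const struct demo_extent *ext,
					size_t nr, uint64_t start,
					uint64_t len, uint64_t granule,
					uint64_t *_data_start,
					uint64_t *_data_len)
	{
		uint64_t end = start + len;
		size_t i;

		if (!ext)
			return -ENOBUFS;	/* No caching on this file */

		for (i = 0; i < nr; i++) {
			uint64_t s = ext[i].start;
			uint64_t e = ext[i].start + ext[i].len;

			/* Clip the extent to the queried region. */
			if (s < start)
				s = start;
			if (e > end)
				e = end;
			if (s >= e)
				continue;

			/* Shrink inwards to granule alignment. */
			s += granule - 1;
			s -= s % granule;
			e -= e % granule;
			if (s >= e)
				continue;	/* Nothing usable here */

			*_data_start = s;
			*_data_len = e - s;
			return 0;
		}
		return -ENODATA;
	}

In this model, a query over the first 32768 bytes of a file whose only cached
extent covers bytes 8192 to 16383 reports data starting at 8192, leaving a
hole at the front of the queried region.
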

API Function Reference
======================

.. kernel-doc:: include/linux/netfs.h
.. kernel-doc:: fs/netfs/buffered_read.c
.. kernel-doc:: fs/netfs/io.c