History log of /openbmc/linux/fs/fuse/dax.c (Results 1 – 25 of 40)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2
# ce5a6df2 16-Nov-2023 Hangyu Hua <hbh25y@gmail.com>

fuse: dax: set fc->dax to NULL in fuse_dax_conn_free()

commit 7f8ed28d1401320bcb02dda81b3c23ab2dc5a6d8 upstream.

fuse_dax_conn_free() will be called when fuse_fill_super_common() fails
after fuse_d

fuse: dax: set fc->dax to NULL in fuse_dax_conn_free()

commit 7f8ed28d1401320bcb02dda81b3c23ab2dc5a6d8 upstream.

fuse_dax_conn_free() will be called when fuse_fill_super_common() fails
after fuse_dax_conn_alloc(). Then deactivate_locked_super() in
virtio_fs_get_tree() will call virtio_kill_sb() to release the discarded
superblock. This will call fuse_dax_conn_free() again in fuse_conn_put(),
resulting in a possible double free.

Fixes: 1dd539577c42 ("virtiofs: add a mount option to enable dax")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Cc: <stable@vger.kernel.org> # v5.10
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


Revision tags: v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48
# 1d024e7a 18-Aug-2023 Matthew Wilcox (Oracle) <willy@infradead.org>

mm: remove enum page_entry_size

Remove the unnecessary encoding of page order into an enum and pass the
page order directly. That lets us get rid of pe_order().

The switch constructs have to be ch

mm: remove enum page_entry_size

Remove the unnecessary encoding of page order into an enum and pass the
page order directly. That lets us get rid of pe_order().

The switch constructs have to be changed to if/else constructs to prevent
GCC from warning on builds with 3-level page tables where PMD_ORDER and
PUD_ORDER have the same value.

If you are looking at this commit because your driver stopped compiling,
look at the previous commit as well and audit your driver to be sure it
doesn't depend on mmap_lock being held in its ->huge_fault method.

[willy@infradead.org: use "order %u" to match the (non dev_t) style]
Link: https://lkml.kernel.org/r/ZOUYekbtTv+n8hYf@casper.infradead.org
Link: https://lkml.kernel.org/r/20230818202335.2739663-4-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


Revision tags: v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9
# 1c71222e 26-Jan-2023 Suren Baghdasaryan <surenb@google.com>

mm: replace vma->vm_flags direct modifications with modifier calls

Replace direct modifications to vma->vm_flags with calls to modifier
functions to be able to track flag changes and to keep vma loc

mm: replace vma->vm_flags direct modifications with modifier calls

Replace direct modifications to vma->vm_flags with calls to modifier
functions to be able to track flag changes and to keep vma locking
correctness.

[akpm@linux-foundation.org: fix drivers/misc/open-dice.c, per Hyeonggon Yoo]
Link: https://lkml.kernel.org/r/20230126193752.297968-5-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjun Roy <arjunroy@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Minchan Kim <minchan@google.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Oskolkov <posk@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Punit Agrawal <punit.agrawal@bytedance.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


Revision tags: v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33
# 47e30149 02-Apr-2022 Jeffle Xu <jefflexu@linux.alibaba.com>

fuse: avoid unnecessary spinlock bump

Move dmap free worker kicker inside the critical region, so that extra
spinlock lock/unlock could be avoided.

Suggested-by: Liu Jiang <gerry@linux.alibaba.com>

fuse: avoid unnecessary spinlock bump

Move dmap free worker kicker inside the critical region, so that extra
spinlock lock/unlock could be avoided.

Suggested-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

show more ...


# e511c4a3 13-May-2022 Jane Chu <jane.chu@oracle.com>

dax: introduce DAX_RECOVERY_WRITE dax access mode

Up till now, dax_direct_access() is used implicitly for normal
access, but for the purpose of recovery write, dax range with
poison is requested. T

dax: introduce DAX_RECOVERY_WRITE dax access mode

Up till now, dax_direct_access() is used implicitly for normal
access, but for the purpose of recovery write, dax range with
poison is requested. To make the interface clear, introduce
enum dax_access_mode {
DAX_ACCESS,
DAX_RECOVERY_WRITE,
}
where DAX_ACCESS is used for normal dax access, and
DAX_RECOVERY_WRITE is used for dax recovery write.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://lore.kernel.org/r/165247982851.52965.11024212198889762949.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

show more ...


Revision tags: v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23
# 46de8b97 09-Feb-2022 Matthew Wilcox (Oracle) <willy@infradead.org>

fs: Convert __set_page_dirty_no_writeback to noop_dirty_folio

This is a mechanical change.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Damien Le Moal <damien.lemoal@open

fs: Convert __set_page_dirty_no_writeback to noop_dirty_folio

This is a mechanical change.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Tested-by: Mike Marshall <hubcap@omnibond.com> # orangefs
Tested-by: David Howells <dhowells@redhat.com> # afs

show more ...


# 5660a863 09-Feb-2022 Matthew Wilcox (Oracle) <willy@infradead.org>

fs: Remove noop_invalidatepage()

We used to have to use noop_invalidatepage() to prevent
block_invalidatepage() from being called, but that behaviour is now gone.

Signed-off-by: Matthew Wilcox (Ora

fs: Remove noop_invalidatepage()

We used to have to use noop_invalidatepage() to prevent
block_invalidatepage() from being called, but that behaviour is now gone.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Tested-by: Mike Marshall <hubcap@omnibond.com> # orangefs
Tested-by: David Howells <dhowells@redhat.com> # afs

show more ...


Revision tags: v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5
# c3cb6f93 25-Nov-2021 Jeffle Xu <jefflexu@linux.alibaba.com>

fuse: mark inode DONT_CACHE when per inode DAX hint changes

When the per inode DAX hint changes while the file is still *opened*, it
is quite complicated and maybe fragile to dynamically change the

fuse: mark inode DONT_CACHE when per inode DAX hint changes

When the per inode DAX hint changes while the file is still *opened*, it
is quite complicated and maybe fragile to dynamically change the DAX
state.

Hence mark the inode and corresponding dentries as DONE_CACHE once the
per inode DAX hint changes, so that the inode instance will be evicted
and freed as soon as possible once the file is closed and the last
reference to the inode is put. And then when the file gets reopened next
time, the new instantiated inode will reflect the new DAX state.

In summary, when the per inode DAX hint changes for an *opened* file, the
DAX state of the file won't be updated until this file is closed and
reopened later.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

show more ...


# 2ee019fa 25-Nov-2021 Jeffle Xu <jefflexu@linux.alibaba.com>

fuse: negotiate per inode DAX in FUSE_INIT

Among the FUSE_INIT phase, client shall advertise per inode DAX if it's
mounted with "dax=inode". Then server is aware that client is in per
inode DAX mode

fuse: negotiate per inode DAX in FUSE_INIT

Among the FUSE_INIT phase, client shall advertise per inode DAX if it's
mounted with "dax=inode". Then server is aware that client is in per
inode DAX mode, and will construct per-inode DAX attribute accordingly.

Server shall also advertise support for per inode DAX. If server doesn't
support it while client is mounted with "dax=inode", client will
silently fallback to "dax=never" since "dax=inode" is advisory only.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

show more ...


# 93a497b9 25-Nov-2021 Jeffle Xu <jefflexu@linux.alibaba.com>

fuse: enable per inode DAX

DAX may be limited in some specific situation. When the number of usable
DAX windows is under watermark, the recalim routine will be triggered to
reclaim some DAX windows.

fuse: enable per inode DAX

DAX may be limited in some specific situation. When the number of usable
DAX windows is under watermark, the recalim routine will be triggered to
reclaim some DAX windows. It may have a negative impact on the
performance, since some processes may need to wait for DAX windows to be
recalimed and reused then. To mitigate the performance degradation, the
overall DAX window need to be expanded larger.

However, simply expanding the DAX window may not be a good deal in some
scenario. To maintain one DAX window chunk (i.e., 2MB in size), 32KB
(512 * 64 bytes) memory footprint will be consumed for page descriptors
inside guest, which is greater than the memory footprint if it uses
guest page cache when DAX disabled. Thus it'd better disable DAX for
those files smaller than 32KB, to reduce the demand for DAX window and
thus avoid the unworthy memory overhead.

Per inode DAX feature is introduced to address this issue, by offering a
finer grained control for dax to users, trying to achieve a balance
between performance and memory overhead.

The FUSE_ATTR_DAX flag in FUSE_LOOKUP reply is used to indicate whether
DAX should be enabled or not for corresponding file. Currently the state
whether DAX is enabled or not for the file is initialized only when
inode is instantiated.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

show more ...


# 780b1b95 25-Nov-2021 Jeffle Xu <jefflexu@linux.alibaba.com>

fuse: make DAX mount option a tri-state

We add 'always', 'never', and 'inode' (default). '-o dax' continues to
operate the same which is equivalent to 'always'.

The following behavior is consistent

fuse: make DAX mount option a tri-state

We add 'always', 'never', and 'inode' (default). '-o dax' continues to
operate the same which is equivalent to 'always'.

The following behavior is consistent with that on ext4/xfs:

- The default behavior (when neither '-o dax' nor
'-o dax=always|never|inode' option is specified) is equal to 'inode'
mode, while 'dax=inode' won't be printed among the mount option list.

- The 'inode' mode is only advisory. It will silently fallback to 'never'
mode if fuse server doesn't support that.

Also noted that by the time of this commit, 'inode' mode is actually equal
to 'always' mode, before the per inode DAX flag is introduced in the
following patch.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

show more ...


# cecd4916 25-Nov-2021 Jeffle Xu <jefflexu@linux.alibaba.com>

fuse: add fuse_should_enable_dax() helper

This is in prep for following per inode DAX checking.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Si

fuse: add fuse_should_enable_dax() helper

This is in prep for following per inode DAX checking.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

show more ...


Revision tags: v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15
# d347739a 22-Oct-2021 Miklos Szeredi <mszeredi@redhat.com>

fuse: always invalidate attributes after writes

Extend the fuse_write_update_attr() helper to invalidate cached attributes
after a write.

This has already been done in all cases except in fuse_noti

fuse: always invalidate attributes after writes

Extend the fuse_write_update_attr() helper to invalidate cached attributes
after a write.

This has already been done in all cases except in fuse_notify_store(), so
this is mostly a cleanup.

fuse_direct_write_iter() calls fuse_direct_IO() which already calls
fuse_write_update_attr(), so don't repeat that again in the former.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

show more ...


# 27ae449b 22-Oct-2021 Miklos Szeredi <mszeredi@redhat.com>

fuse: rename fuse_write_update_size()

This function already updates the attr_version in fuse_inode, regardless of
whether the size was changed or not.

Rename the helper to fuse_write_update_attr()

fuse: rename fuse_write_update_size()

This function already updates the attr_version in fuse_inode, regardless of
whether the size was changed or not.

Rename the helper to fuse_write_update_attr() to reflect the more generic
nature.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

show more ...


# fa5eee57 22-Oct-2021 Miklos Szeredi <mszeredi@redhat.com>

fuse: selective attribute invalidation

Only invalidate attributes that the operation might have changed.

Introduce two constants for common combinations of changed attributes:

FUSE_STATX_MODIFY:

fuse: selective attribute invalidation

Only invalidate attributes that the operation might have changed.

Introduce two constants for common combinations of changed attributes:

FUSE_STATX_MODIFY: file contents are modified but not size

FUSE_STATX_MODSIZE: size and/or file contents modified

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

show more ...


Revision tags: v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60, v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12
# 8bcbbe9c 21-Apr-2021 Jan Kara <jack@suse.cz>

fuse: Convert to using invalidate_lock

Use invalidate_lock instead of fuse's private i_mmap_sem. The intended
purpose is exactly the same. By this conversion we fix a long standing
race between hole

fuse: Convert to using invalidate_lock

Use invalidate_lock instead of fuse's private i_mmap_sem. The intended
purpose is exactly the same. By this conversion we fix a long standing
race between hole punching and read(2) / readahead(2) paths that can
lead to stale page cache contents.

CC: Miklos Szeredi <miklos@szeredi.hu>
Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>

show more ...


# 2e29be2e 25-May-2021 Ira Weiny <ira.weiny@intel.com>

fs/fuse: Remove unneeded kaddr parameter

fuse_dax_mem_range_init() does not need the address or the pfn of the
memory requested in dax_direct_access(). It is only calling direct
access to get the n

fs/fuse: Remove unneeded kaddr parameter

fuse_dax_mem_range_init() does not need the address or the pfn of the
memory requested in dax_direct_access(). It is only calling direct
access to get the number of pages.

Remove the unused variables and stop requesting the kaddr and pfn from
dax_direct_access().

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://lore.kernel.org/r/20210525172428.3634316-2-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

show more ...


# 3a6b2162 28-Jun-2021 Matthew Wilcox (Oracle) <willy@infradead.org>

mm: move page dirtying prototypes from mm.h

These functions implement the address_space ->set_page_dirty operation and
should live in pagemap.h, not mm.h so that the rest of the kernel doesn't
get f

mm: move page dirtying prototypes from mm.h

These functions implement the address_space ->set_page_dirty operation and
should live in pagemap.h, not mm.h so that the rest of the kernel doesn't
get funny ideas about calling them directly.

Link: https://lkml.kernel.org/r/20210615162342.1669332-7-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# b82a96c9 28-Jun-2021 Matthew Wilcox (Oracle) <willy@infradead.org>

fs: remove noop_set_page_dirty()

Use __set_page_dirty_no_writeback() instead. This will set the dirty bit
on the page, which will be used to avoid calling set_page_dirty() in the
future. It will h

fs: remove noop_set_page_dirty()

Use __set_page_dirty_no_writeback() instead. This will set the dirty bit
on the page, which will be used to avoid calling set_page_dirty() in the
future. It will have no effect on actually writing the page back, as the
pages are not on any LRU lists.

[akpm@linux-foundation.org: export __set_page_dirty_no_writeback() to modules]

Link: https://lkml.kernel.org/r/20210615162342.1669332-6-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# c4e0cd4e 03-Jun-2021 Zheng Yongjun <zhengyongjun3@huawei.com>

virtiofs: Fix spelling mistakes

Fix some spelling mistakes in comments:
refernce ==> reference
happnes ==> happens
threhold ==> threshold
splitted ==> split
mached ==> matched

Signed-off-by: Z

virtiofs: Fix spelling mistakes

Fix some spelling mistakes in comments:
refernce ==> reference
happnes ==> happens
threhold ==> threshold
splitted ==> split
mached ==> matched

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

show more ...


Revision tags: v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14, v5.10, v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40
# fcee216b 06-May-2020 Max Reitz <mreitz@redhat.com>

fuse: split fuse_mount off of fuse_conn

We want to allow submounts for the same fuse_conn, but with different
superblocks so that each of the submounts has its own device ID. To do
so, we need to s

fuse: split fuse_mount off of fuse_conn

We want to allow submounts for the same fuse_conn, but with different
superblocks so that each of the submounts has its own device ID. To do
so, we need to split all mount-specific information off of fuse_conn
into a new fuse_mount structure, so that multiple mounts can share a
single fuse_conn.

We need to take care only to perform connection-level actions once (i.e.
when the fuse_conn and thus the first fuse_mount are established, or
when the last fuse_mount and thus the fuse_conn are destroyed). For
example, fuse_sb_destroy() must invoke fuse_send_destroy() until the
last superblock is released.

To do so, we keep track of which fuse_mount is the root mount and
perform all fuse_conn-level actions only when this fuse_mount is
involved.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

show more ...


# 9a752d18 19-Aug-2020 Vivek Goyal <vgoyal@redhat.com>

virtiofs: add logic to free up a memory range

Add logic to free up a busy memory range. Freed memory range will be
returned to free pool. Add a worker which can be started to select
and free some bu

virtiofs: add logic to free up a memory range

Add logic to free up a busy memory range. Freed memory range will be
returned to free pool. Add a worker which can be started to select
and free some busy memory ranges.

Process can also steal one of its busy dax ranges if free range is not
available. I will refer it to as direct reclaim.

If free range is not available and nothing can't be stolen from same
inode, caller waits on a waitq for free range to become available.

For reclaiming a range, as of now we need to hold following locks in
specified order.

down_write(&fi->i_mmap_sem);
down_write(&fi->dax->sem);

We look for a free range in following order.

A. Try to get a free range.
B. If not, try direct reclaim.
C. If not, wait for a memory range to become free

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

show more ...


# d0cfb9dc 19-Aug-2020 Vivek Goyal <vgoyal@redhat.com>

virtiofs: maintain a list of busy elements

This list will be used selecting fuse_dax_mapping to free when number of
free mappings drops below a threshold.

Signed-off-by: Vivek Goyal <vgoyal@redhat.

virtiofs: maintain a list of busy elements

This list will be used selecting fuse_dax_mapping to free when number of
free mappings drops below a threshold.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

show more ...


# 6ae330ca 19-Aug-2020 Vivek Goyal <vgoyal@redhat.com>

virtiofs: serialize truncate/punch_hole and dax fault path

Currently in fuse we don't seem have any lock which can serialize fault
path with truncate/punch_hole path. With dax support I need one for

virtiofs: serialize truncate/punch_hole and dax fault path

Currently in fuse we don't seem have any lock which can serialize fault
path with truncate/punch_hole path. With dax support I need one for
following reasons.

1. Dax requirement

DAX fault code relies on inode size being stable for the duration of
fault and want to serialize with truncate/punch_hole and they explicitly
mention it.

static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
const struct iomap_ops *ops)
/*
* Check whether offset isn't beyond end of file now. Caller is
* supposed to hold locks serializing us with truncate / punch hole so
* this is a reliable test.
*/
max_pgoff = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);

2. Make sure there are no users of pages being truncated/punch_hole

get_user_pages() might take references to page and then do some DMA
to said pages. Filesystem might truncate those pages without knowing
that a DMA is in progress or some I/O is in progress. So use
dax_layout_busy_page() to make sure there are no such references
and I/O is not in progress on said pages before moving ahead with
truncation.

3. Limitation of kvm page fault error reporting

If we are truncating file on host first and then removing mappings in
guest lateter (truncate page cache etc), then this could lead to a
problem with KVM. Say a mapping is in place in guest and truncation
happens on host. Now if guest accesses that mapping, then host will
take a fault and kvm will either exit to qemu or spin infinitely.

IOW, before we do truncation on host, we need to make sure that guest
inode does not have any mapping in that region or whole file.

4. virtiofs memory range reclaim

Soon I will introduce the notion of being able to reclaim dax memory
ranges from a fuse dax inode. There also I need to make sure that
no I/O or fault is going on in the reclaimed range and nobody is using
it so that range can be reclaimed without issues.

Currently if we take inode lock, that serializes read/write. But it does
not do anything for faults. So I add another semaphore fuse_inode->i_mmap_sem
for this purpose. It can be used to serialize with faults.

As of now, I am adding taking this semaphore only in dax fault path and
not regular fault path because existing code does not have one. May
be existing code can benefit from it as well to take care of some
races, but that we can fix later if need be. For now, I am just focussing
only on DAX path which is new path.

Also added logic to take fuse_inode->i_mmap_sem in
truncate/punch_hole/open(O_TRUNC) path to make sure file truncation and
fuse dax fault are mutually exlusive and avoid all the above problems.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

show more ...


# 9483e7d5 19-Aug-2020 Vivek Goyal <vgoyal@redhat.com>

virtiofs: define dax address space operations

This is done along the lines of ext4 and xfs. I primarily wanted
->writepages hook at this time so that I could call into
dax_writeback_mapping_range()

virtiofs: define dax address space operations

This is done along the lines of ext4 and xfs. I primarily wanted
->writepages hook at this time so that I could call into
dax_writeback_mapping_range(). This in turn will decide which pfns need to
be written back.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

show more ...


12