History log of /openbmc/linux/fs/ocfs2/super.c (Results 1 – 25 of 555)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.6.36, v6.6.35, v6.6.34, v6.6.33
# 67bcecd7 30-May-2024 Joseph Qi <joseph.qi@linux.alibaba.com>

ocfs2: fix NULL pointer dereference in ocfs2_abort_trigger()

commit 685d03c3795378fca6a1b3d43581f7f1a3fc095f upstream.

bdev->bd_super has been removed and commit 8887b94d9322 change the usage
from

ocfs2: fix NULL pointer dereference in ocfs2_abort_trigger()

commit 685d03c3795378fca6a1b3d43581f7f1a3fc095f upstream.

bdev->bd_super has been removed and commit 8887b94d9322 change the usage
from bdev->bd_super to b_assoc_map->host->i_sb. Since ocfs2 hasn't set
bh->b_assoc_map, it will trigger NULL pointer dereference when calling
into ocfs2_abort_trigger().

Actually this was pointed out in history, see commit 74e364ad1b13. But
I've made a mistake when reviewing commit 8887b94d9322 and then
re-introduce this regression.

Since we cannot revive bdev in buffer head, so fix this issue by
initializing all types of ocfs2 triggers when fill super, and then get the
specific ocfs2 trigger from ocfs2_caching_info when access journal.

[joseph.qi@linux.alibaba.com: v2]
Link: https://lkml.kernel.org/r/20240602112045.1112708-1-joseph.qi@linux.alibaba.com
Link: https://lkml.kernel.org/r/20240530110630.3933832-2-joseph.qi@linux.alibaba.com
Fixes: 8887b94d9322 ("ocfs2: stop using bdev->bd_super for journal error logging")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org> [6.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


Revision tags: v6.6.32, v6.6.31, v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24, v6.6.23
# 42954c37 06-Feb-2024 Jan Kara <jack@suse.cz>

quota: Properly annotate i_dquot arrays with __rcu

[ Upstream commit ccb49011bb2ebfd66164dbf68c5bff48917bb5ef ]

Dquots pointed to from i_dquot arrays in inodes are protected by
dquot_srcu. Annotate

quota: Properly annotate i_dquot arrays with __rcu

[ Upstream commit ccb49011bb2ebfd66164dbf68c5bff48917bb5ef ]

Dquots pointed to from i_dquot arrays in inodes are protected by
dquot_srcu. Annotate them as such and change .get_dquots callback to
return properly annotated pointer to make sparse happy.

Fixes: b9ba6f94b238 ("quota: remove dqptr_sem")
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


Revision tags: v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42
# a53fb69b 26-Jul-2023 Kees Cook <keescook@chromium.org>

ocfs2: use regular seq_show_option for osb_cluster_stack

While cleaning up seq_show_option_n()'s use of strncpy, it was noticed
that the osb_cluster_stack member is always NUL-terminated, so there i

ocfs2: use regular seq_show_option for osb_cluster_stack

While cleaning up seq_show_option_n()'s use of strncpy, it was noticed
that the osb_cluster_stack member is always NUL-terminated, so there is no
need to use the special seq_show_option_n() routine. Replace it with the
standard seq_show_option() routine.

Link: https://lkml.kernel.org/r/20230726215919.never.127-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


Revision tags: v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30
# 50d92788 22-May-2023 Luís Henriques <ocfs2-devel@oss.oracle.com>

ocfs2: fix use-after-free when unmounting read-only filesystem

It's trivial to trigger a use-after-free bug in the ocfs2 quotas code using
fstest generic/452. After a read-only remount, quotas are

ocfs2: fix use-after-free when unmounting read-only filesystem

It's trivial to trigger a use-after-free bug in the ocfs2 quotas code using
fstest generic/452. After a read-only remount, quotas are suspended and
ocfs2_mem_dqinfo is freed through ->ocfs2_local_free_info(). When unmounting
the filesystem, an UAF access to the oinfo will eventually cause a crash.

BUG: KASAN: slab-use-after-free in timer_delete+0x54/0xc0
Read of size 8 at addr ffff8880389a8208 by task umount/669
...
Call Trace:
<TASK>
...
timer_delete+0x54/0xc0
try_to_grab_pending+0x31/0x230
__cancel_work_timer+0x6c/0x270
ocfs2_disable_quotas.isra.0+0x3e/0xf0 [ocfs2]
ocfs2_dismount_volume+0xdd/0x450 [ocfs2]
generic_shutdown_super+0xaa/0x280
kill_block_super+0x46/0x70
deactivate_locked_super+0x4d/0xb0
cleanup_mnt+0x135/0x1f0
...
</TASK>

Allocated by task 632:
kasan_save_stack+0x1c/0x40
kasan_set_track+0x21/0x30
__kasan_kmalloc+0x8b/0x90
ocfs2_local_read_info+0xe3/0x9a0 [ocfs2]
dquot_load_quota_sb+0x34b/0x680
dquot_load_quota_inode+0xfe/0x1a0
ocfs2_enable_quotas+0x190/0x2f0 [ocfs2]
ocfs2_fill_super+0x14ef/0x2120 [ocfs2]
mount_bdev+0x1be/0x200
legacy_get_tree+0x6c/0xb0
vfs_get_tree+0x3e/0x110
path_mount+0xa90/0xe10
__x64_sys_mount+0x16f/0x1a0
do_syscall_64+0x43/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc

Freed by task 650:
kasan_save_stack+0x1c/0x40
kasan_set_track+0x21/0x30
kasan_save_free_info+0x2a/0x50
__kasan_slab_free+0xf9/0x150
__kmem_cache_free+0x89/0x180
ocfs2_local_free_info+0x2ba/0x3f0 [ocfs2]
dquot_disable+0x35f/0xa70
ocfs2_susp_quotas.isra.0+0x159/0x1a0 [ocfs2]
ocfs2_remount+0x150/0x580 [ocfs2]
reconfigure_super+0x1a5/0x3a0
path_mount+0xc8a/0xe10
__x64_sys_mount+0x16f/0x1a0
do_syscall_64+0x43/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc

Link: https://lkml.kernel.org/r/20230522102112.9031-1-lhenriques@suse.de
Signed-off-by: Luís Henriques <lhenriques@suse.de>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


Revision tags: v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78
# ce2fcf15 09-Nov-2022 Li Zetao <ocfs2-devel@oss.oracle.com>

ocfs2: fix memory leak in ocfs2_mount_volume()

There is a memory leak reported by kmemleak:

unreferenced object 0xffff88810cc65e60 (size 32):
comm "mount.ocfs2", pid 23753, jiffies 4302528942

ocfs2: fix memory leak in ocfs2_mount_volume()

There is a memory leak reported by kmemleak:

unreferenced object 0xffff88810cc65e60 (size 32):
comm "mount.ocfs2", pid 23753, jiffies 4302528942 (age 34735.105s)
hex dump (first 32 bytes):
10 00 00 00 00 00 00 00 00 01 01 01 01 01 01 01 ................
01 01 01 01 01 01 01 01 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff8170f73d>] __kmalloc+0x4d/0x150
[<ffffffffa0ac3f51>] ocfs2_compute_replay_slots+0x121/0x330 [ocfs2]
[<ffffffffa0b65165>] ocfs2_check_volume+0x485/0x900 [ocfs2]
[<ffffffffa0b68129>] ocfs2_mount_volume.isra.0+0x1e9/0x650 [ocfs2]
[<ffffffffa0b7160b>] ocfs2_fill_super+0xe0b/0x1740 [ocfs2]
[<ffffffff818e1fe2>] mount_bdev+0x312/0x400
[<ffffffff819a086d>] legacy_get_tree+0xed/0x1d0
[<ffffffff818de82d>] vfs_get_tree+0x7d/0x230
[<ffffffff81957f92>] path_mount+0xd62/0x1760
[<ffffffff81958a5a>] do_mount+0xca/0xe0
[<ffffffff81958d3c>] __x64_sys_mount+0x12c/0x1a0
[<ffffffff82f26f15>] do_syscall_64+0x35/0x80
[<ffffffff8300006a>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

This call stack is related to two problems. Firstly, the ocfs2 super uses
"replay_map" to trace online/offline slots, in order to recover offline
slots during recovery and mount. But when ocfs2_truncate_log_init()
returns an error in ocfs2_mount_volume(), the memory of "replay_map" will
not be freed in error handling path. Secondly, the memory of "replay_map"
will not be freed if d_make_root() returns an error in ocfs2_fill_super().
But the memory of "replay_map" will be freed normally when completing
recovery and mount in ocfs2_complete_mount_recovery().

Fix the first problem by adding error handling path to free "replay_map"
when ocfs2_truncate_log_init() fails. And fix the second problem by
calling ocfs2_free_replay_slots(osb) in the error handling path
"out_dismount". In addition, since ocfs2_free_replay_slots() is static,
it is necessary to remove its static attribute and declare it in header
file.

Link: https://lkml.kernel.org/r/20221109074627.2303950-1-lizetao1@huawei.com
Fixes: 9140db04ef18 ("ocfs2: recover orphans in offline slots during recovery and mount")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


Revision tags: v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62
# c97e21fe 18-Aug-2022 Wolfram Sang <wsa+renesas@sang-engineering.com>

ocfs2: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated

ocfs2: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Link: https://lkml.kernel.org/r/20220818210123.7637-4-wsa+renesas@sang-engineering.com
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# 54d9171d 01-Sep-2022 Zhang Yi <yi.zhang@huawei.com>

ocfs2: replace ll_rw_block()

ll_rw_block() is not safe for the sync read path because it cannot
guarantee that submitting read IO if the buffer has been locked. We
could get false positive EIO after

ocfs2: replace ll_rw_block()

ll_rw_block() is not safe for the sync read path because it cannot
guarantee that submitting read IO if the buffer has been locked. We
could get false positive EIO after wait_on_buffer() if the buffer has
been locked by others. So stop using ll_rw_block() in ocfs2.

Link: https://lkml.kernel.org/r/20220901133505.2510834-9-yi.zhang@huawei.com
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


Revision tags: v5.15.61
# 550842cc 15-Aug-2022 Heming Zhao <ocfs2-devel@oss.oracle.com>

ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown

After commit 0737e01de9c4 ("ocfs2: ocfs2_mount_volume does cleanup job
before return error"), any procedure after ocfs2_dlm_init() fai

ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown

After commit 0737e01de9c4 ("ocfs2: ocfs2_mount_volume does cleanup job
before return error"), any procedure after ocfs2_dlm_init() fails will
trigger crash when calling ocfs2_dlm_shutdown().

ie: On local mount mode, no dlm resource is initialized. If
ocfs2_mount_volume() fails in ocfs2_find_slot(), error handling will call
ocfs2_dlm_shutdown(), then does dlm resource cleanup job, which will
trigger kernel crash.

This solution should bypass uninitialized resources in
ocfs2_dlm_shutdown().

Link: https://lkml.kernel.org/r/20220815085754.20417-1-heming.zhao@suse.com
Fixes: 0737e01de9c4 ("ocfs2: ocfs2_mount_volume does cleanup job before return error")
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


Revision tags: v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45
# c80af0c2 03-Jun-2022 Junxiao Bi <ocfs2-devel@oss.oracle.com>

Revert "ocfs2: mount shared volume without ha stack"

This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.

This commit introduced a regression that can cause mount hung. The
changes in __o

Revert "ocfs2: mount shared volume without ha stack"

This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.

This commit introduced a regression that can cause mount hung. The
changes in __ocfs2_find_empty_slot causes that any node with none-zero
node number can grab the slot that was already taken by node 0, so node 1
will access the same journal with node 0, when it try to grab journal
cluster lock, it will hung because it was already acquired by node 0.
It's very easy to reproduce this, in one cluster, mount node 0 first, then
node 1, you will see the following call trace from node 1.

[13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
[13148.739691] Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
[13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[13148.745846] task:mount.ocfs2 state:D stack: 0 pid:53045 ppid: 53044 flags:0x00004000
[13148.749354] Call Trace:
[13148.750718] <TASK>
[13148.752019] ? usleep_range+0x90/0x89
[13148.753882] __schedule+0x210/0x567
[13148.755684] schedule+0x44/0xa8
[13148.757270] schedule_timeout+0x106/0x13c
[13148.759273] ? __prepare_to_swait+0x53/0x78
[13148.761218] __wait_for_common+0xae/0x163
[13148.763144] __ocfs2_cluster_lock.constprop.0+0x1d6/0x870 [ocfs2]
[13148.765780] ? ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.768312] ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.770968] ocfs2_journal_init+0x91/0x340 [ocfs2]
[13148.773202] ocfs2_check_volume+0x39/0x461 [ocfs2]
[13148.775401] ? iput+0x69/0xba
[13148.777047] ocfs2_mount_volume.isra.0.cold+0x40/0x1f5 [ocfs2]
[13148.779646] ocfs2_fill_super+0x54b/0x853 [ocfs2]
[13148.781756] mount_bdev+0x190/0x1b7
[13148.783443] ? ocfs2_remount+0x440/0x440 [ocfs2]
[13148.785634] legacy_get_tree+0x27/0x48
[13148.787466] vfs_get_tree+0x25/0xd0
[13148.789270] do_new_mount+0x18c/0x2d9
[13148.791046] __x64_sys_mount+0x10e/0x142
[13148.792911] do_syscall_64+0x3b/0x89
[13148.794667] entry_SYSCALL_64_after_hwframe+0x170/0x0
[13148.797051] RIP: 0033:0x7f2309f6e26e
[13148.798784] RSP: 002b:00007ffdcee7d408 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[13148.801974] RAX: ffffffffffffffda RBX: 00007ffdcee7d4a0 RCX: 00007f2309f6e26e
[13148.804815] RDX: 0000559aa762a8ae RSI: 0000559aa939d340 RDI: 0000559aa93a22b0
[13148.807719] RBP: 00007ffdcee7d5b0 R08: 0000559aa93a2290 R09: 00007f230a0b4820
[13148.810659] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcee7d420
[13148.813609] R13: 0000000000000000 R14: 0000559aa939f000 R15: 0000000000000000
[13148.816564] </TASK>

To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
introduced the feature to mount ocfs2 locally even it is cluster based,
that is a very dangerous, it can easily cause serious data corruption,
there is no way to stop other nodes mounting the fs and corrupting it.
Setup ha or other cluster-aware stack is just the cost that we have to
take for avoiding corruption, otherwise we have to do it in kernel.

Link: https://lkml.kernel.org/r/20220603222801.42488-1-junxiao.bi@oracle.com
Fixes: 912f655d78c5("ocfs2: mount shared volume without ha stack")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <heming.zhao@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# 1420c4a5 14-Jul-2022 Bart Van Assche <bvanassche@acm.org>

fs/buffer: Combine two submit_bh() and ll_rw_block() arguments

Both submit_bh() and ll_rw_block() accept a request operation type and
request flags as their first two arguments. Micro-optimize these

fs/buffer: Combine two submit_bh() and ll_rw_block() arguments

Both submit_bh() and ll_rw_block() accept a request operation type and
request flags as their first two arguments. Micro-optimize these two
functions by combining these first two arguments into a single argument.
This patch does not change the behavior of any of the modified code.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jan Kara <jack@suse.cz>
Acked-by: Song Liu <song@kernel.org> (for the md changes)
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-48-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


Revision tags: v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37
# f1e75d12 29-Apr-2022 Heming Zhao via Ocfs2-devel <ocfs2-devel@oss.oracle.com>

ocfs2: rewrite error handling of ocfs2_fill_super

Current ocfs2_fill_super() uses one goto label "read_super_error" to
handle all error cases. And with previous serial patches, the error
handling s

ocfs2: rewrite error handling of ocfs2_fill_super

Current ocfs2_fill_super() uses one goto label "read_super_error" to
handle all error cases. And with previous serial patches, the error
handling should fork more branches to handle different error cases. This
patch rewrite the error handling of ocfs2_fill_super.

Link: https://lkml.kernel.org/r/20220424130952.2436-6-heming.zhao@suse.com
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# 0737e01d 29-Apr-2022 Heming Zhao via Ocfs2-devel <ocfs2-devel@oss.oracle.com>

ocfs2: ocfs2_mount_volume does cleanup job before return error

After this patch, when error, ocfs2_fill_super doesn't take care to
release resources which are allocated in ocfs2_mount_volume.

Link:

ocfs2: ocfs2_mount_volume does cleanup job before return error

After this patch, when error, ocfs2_fill_super doesn't take care to
release resources which are allocated in ocfs2_mount_volume.

Link: https://lkml.kernel.org/r/20220424130952.2436-5-heming.zhao@suse.com
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# a8a986db 29-Apr-2022 Heming Zhao via Ocfs2-devel <ocfs2-devel@oss.oracle.com>

ocfs2: ocfs2_initialize_super does cleanup job before return error

After this patch, when error, ocfs2_fill_super doesn't take care to
release resources which are allocated in ocfs2_initialize_super

ocfs2: ocfs2_initialize_super does cleanup job before return error

After this patch, when error, ocfs2_fill_super doesn't take care to
release resources which are allocated in ocfs2_initialize_super.

Link: https://lkml.kernel.org/r/20220424130952.2436-4-heming.zhao@suse.com
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# 54bd3f7c 29-Apr-2022 Heming Zhao via Ocfs2-devel <ocfs2-devel@oss.oracle.com>

ocfs2: change return type of ocfs2_resmap_init

Since ocfs2_resmap_init() always return 0, change it to void.

Link: https://lkml.kernel.org/r/20220424130952.2436-3-heming.zhao@suse.com
Signed-off-by

ocfs2: change return type of ocfs2_resmap_init

Since ocfs2_resmap_init() always return 0, change it to void.

Link: https://lkml.kernel.org/r/20220424130952.2436-3-heming.zhao@suse.com
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# bb20b31d 29-Apr-2022 Heming Zhao via Ocfs2-devel <ocfs2-devel@oss.oracle.com>

ocfs2: fix mounting crash if journal is not alloced

Patch series "rewrite error handling during mounting stage".


This patch (of 5):

After commit da5e7c87827e8 ("ocfs2: cleanup journal init and sh

ocfs2: fix mounting crash if journal is not alloced

Patch series "rewrite error handling during mounting stage".


This patch (of 5):

After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"),
journal init later than before, it makes NULL pointer access in free
routine.

Crash flow:

ocfs2_fill_super
+ ocfs2_mount_volume
| + ocfs2_dlm_init //fail & return, osb->journal is NULL.
| + ...
| + ocfs2_check_volume //no chance to init osb->journal
|
+ ...
+ ocfs2_dismount_volume
ocfs2_release_system_inodes
...
evict
...
ocfs2_clear_inode
ocfs2_checkpoint_inode
ocfs2_ci_fully_checkpointed
time_after(journal->j_trans_id, ci->ci_last_trans)
+ journal is empty, crash!

For fixing, there are three solutions:

1> Partly revert commit da5e7c87827e8

For avoiding kernel crash, this make sense for us. We only
concerned whether there has any non-system inode access before dlm
init. The answer is NO. And all journal replay/recovery handling
happen after dlm & journal init done. So this method is not graceful
but workable.

2> Add osb->journal check in free inode routine (eg ocfs2_clear_inode)

The fix code is special for mounting phase, but it will continue
working after mounting stage. In another word, this method adds
useless code in normal inode free flow.

3> Do directly free inode in mounting phase

This method is brutal/complex and may introduce unsafe code,
currently maintainer didn't like.

At last, we chose method <1> and did partly reverted job. We reverted
journal init codes, and kept cleanup codes flow.

Link: https://lkml.kernel.org/r/20220424130952.2436-1-heming.zhao@suse.com
Link: https://lkml.kernel.org/r/20220424130952.2436-2-heming.zhao@suse.com
Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown")
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


Revision tags: v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31
# fd60b288 22-Mar-2022 Muchun Song <songmuchun@bytedance.com>

fs: allocate inode by using alloc_inode_sb()

The inode allocation is supposed to use alloc_inode_sb(), so convert
kmem_cache_alloc() of all filesystems to alloc_inode_sb().

Link: https://lkml.kerne

fs: allocate inode by using alloc_inode_sb()

The inode allocation is supposed to use alloc_inode_sb(), so convert
kmem_cache_alloc() of all filesystems to alloc_inode_sb().

Link: https://lkml.kernel.org/r/20220228122126.37293-5-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Theodore Ts'o <tytso@mit.edu> [ext4]
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Alex Shi <alexs@kernel.org>
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Fam Zheng <fam.zheng@bytedance.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kari Argillander <kari.argillander@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


Revision tags: v5.17, v5.15.30
# 7b0b1332 16-Mar-2022 Joseph Qi <joseph.qi@linux.alibaba.com>

ocfs2: fix crash when initialize filecheck kobj fails

Once s_root is set, genric_shutdown_super() will be called if
fill_super() fails. That means, we will call ocfs2_dismount_volume()
twice in suc

ocfs2: fix crash when initialize filecheck kobj fails

Once s_root is set, genric_shutdown_super() will be called if
fill_super() fails. That means, we will call ocfs2_dismount_volume()
twice in such case, which can lead to kernel crash.

Fix this issue by initializing filecheck kobj before setting s_root.

Link: https://lkml.kernel.org/r/20220310081930.86305-1-joseph.qi@linux.alibaba.com
Fixes: 5f483c4abb50 ("ocfs2: add kobject for online file check")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


Revision tags: v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17
# 0a4ee518 22-Jan-2022 Christoph Hellwig <hch@lst.de>

mm: remove cleancache

Patch series "remove Xen tmem leftovers".

Since the removal of the Xen tmem driver in 2019, the cleancache hooks
are entirely unused, as are large parts of frontswap. This se

mm: remove cleancache

Patch series "remove Xen tmem leftovers".

Since the removal of the Xen tmem driver in 2019, the cleancache hooks
are entirely unused, as are large parts of frontswap. This series
against linux-next (with the folio changes included) removes
cleancaches, and cuts down frontswap to the bits actually used by zswap.

This patch (of 13):

The cleancache subsystem is unused since the removal of Xen tmem driver
in commit 814bbf49dcd0 ("xen: remove tmem driver").

[akpm@linux-foundation.org: remove now-unreachable code]

Link: https://lkml.kernel.org/r/20211224062246.1258487-1-hch@lst.de
Link: https://lkml.kernel.org/r/20211224062246.1258487-2-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


Revision tags: v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1
# da5e7c87 05-Nov-2021 Valentin Vidic <vvidic@valentin-vidic.from.hr>

ocfs2: cleanup journal init and shutdown

Allocate and free struct ocfs2_journal in ocfs2_journal_init and
ocfs2_journal_shutdown. Init and release of system inodes references
the journal so reorder

ocfs2: cleanup journal init and shutdown

Allocate and free struct ocfs2_journal in ocfs2_journal_init and
ocfs2_journal_shutdown. Init and release of system inodes references
the journal so reorder calls to make sure they work correctly.

Link: https://lkml.kernel.org/r/20211009145006.3478-1-vvidic@valentin-vidic.from.hr
Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# 46f6301f 03-Jun-2022 Junxiao Bi <ocfs2-devel@oss.oracle.com>

Revert "ocfs2: mount shared volume without ha stack"

commit c80af0c250c8f8a3c978aa5aafbe9c39b336b813 upstream.

This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.

This commit introduced

Revert "ocfs2: mount shared volume without ha stack"

commit c80af0c250c8f8a3c978aa5aafbe9c39b336b813 upstream.

This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.

This commit introduced a regression that can cause mount hung. The
changes in __ocfs2_find_empty_slot causes that any node with none-zero
node number can grab the slot that was already taken by node 0, so node 1
will access the same journal with node 0, when it try to grab journal
cluster lock, it will hung because it was already acquired by node 0.
It's very easy to reproduce this, in one cluster, mount node 0 first, then
node 1, you will see the following call trace from node 1.

[13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
[13148.739691] Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
[13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[13148.745846] task:mount.ocfs2 state:D stack: 0 pid:53045 ppid: 53044 flags:0x00004000
[13148.749354] Call Trace:
[13148.750718] <TASK>
[13148.752019] ? usleep_range+0x90/0x89
[13148.753882] __schedule+0x210/0x567
[13148.755684] schedule+0x44/0xa8
[13148.757270] schedule_timeout+0x106/0x13c
[13148.759273] ? __prepare_to_swait+0x53/0x78
[13148.761218] __wait_for_common+0xae/0x163
[13148.763144] __ocfs2_cluster_lock.constprop.0+0x1d6/0x870 [ocfs2]
[13148.765780] ? ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.768312] ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.770968] ocfs2_journal_init+0x91/0x340 [ocfs2]
[13148.773202] ocfs2_check_volume+0x39/0x461 [ocfs2]
[13148.775401] ? iput+0x69/0xba
[13148.777047] ocfs2_mount_volume.isra.0.cold+0x40/0x1f5 [ocfs2]
[13148.779646] ocfs2_fill_super+0x54b/0x853 [ocfs2]
[13148.781756] mount_bdev+0x190/0x1b7
[13148.783443] ? ocfs2_remount+0x440/0x440 [ocfs2]
[13148.785634] legacy_get_tree+0x27/0x48
[13148.787466] vfs_get_tree+0x25/0xd0
[13148.789270] do_new_mount+0x18c/0x2d9
[13148.791046] __x64_sys_mount+0x10e/0x142
[13148.792911] do_syscall_64+0x3b/0x89
[13148.794667] entry_SYSCALL_64_after_hwframe+0x170/0x0
[13148.797051] RIP: 0033:0x7f2309f6e26e
[13148.798784] RSP: 002b:00007ffdcee7d408 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[13148.801974] RAX: ffffffffffffffda RBX: 00007ffdcee7d4a0 RCX: 00007f2309f6e26e
[13148.804815] RDX: 0000559aa762a8ae RSI: 0000559aa939d340 RDI: 0000559aa93a22b0
[13148.807719] RBP: 00007ffdcee7d5b0 R08: 0000559aa93a2290 R09: 00007f230a0b4820
[13148.810659] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcee7d420
[13148.813609] R13: 0000000000000000 R14: 0000559aa939f000 R15: 0000000000000000
[13148.816564] </TASK>

To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
introduced the feature to mount ocfs2 locally even it is cluster based,
that is a very dangerous, it can easily cause serious data corruption,
there is no way to stop other nodes mounting the fs and corrupting it.
Setup ha or other cluster-aware stack is just the cost that we have to
take for avoiding corruption, otherwise we have to do it in kernel.

Link: https://lkml.kernel.org/r/20220603222801.42488-1-junxiao.bi@oracle.com
Fixes: 912f655d78c5("ocfs2: mount shared volume without ha stack")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <heming.zhao@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


# 46f6301f 03-Jun-2022 Junxiao Bi <ocfs2-devel@oss.oracle.com>

Revert "ocfs2: mount shared volume without ha stack"

commit c80af0c250c8f8a3c978aa5aafbe9c39b336b813 upstream.

This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.

This commit introduced

Revert "ocfs2: mount shared volume without ha stack"

commit c80af0c250c8f8a3c978aa5aafbe9c39b336b813 upstream.

This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.

This commit introduced a regression that can cause mount hung. The
changes in __ocfs2_find_empty_slot causes that any node with none-zero
node number can grab the slot that was already taken by node 0, so node 1
will access the same journal with node 0, when it try to grab journal
cluster lock, it will hung because it was already acquired by node 0.
It's very easy to reproduce this, in one cluster, mount node 0 first, then
node 1, you will see the following call trace from node 1.

[13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
[13148.739691] Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
[13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[13148.745846] task:mount.ocfs2 state:D stack: 0 pid:53045 ppid: 53044 flags:0x00004000
[13148.749354] Call Trace:
[13148.750718] <TASK>
[13148.752019] ? usleep_range+0x90/0x89
[13148.753882] __schedule+0x210/0x567
[13148.755684] schedule+0x44/0xa8
[13148.757270] schedule_timeout+0x106/0x13c
[13148.759273] ? __prepare_to_swait+0x53/0x78
[13148.761218] __wait_for_common+0xae/0x163
[13148.763144] __ocfs2_cluster_lock.constprop.0+0x1d6/0x870 [ocfs2]
[13148.765780] ? ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.768312] ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.770968] ocfs2_journal_init+0x91/0x340 [ocfs2]
[13148.773202] ocfs2_check_volume+0x39/0x461 [ocfs2]
[13148.775401] ? iput+0x69/0xba
[13148.777047] ocfs2_mount_volume.isra.0.cold+0x40/0x1f5 [ocfs2]
[13148.779646] ocfs2_fill_super+0x54b/0x853 [ocfs2]
[13148.781756] mount_bdev+0x190/0x1b7
[13148.783443] ? ocfs2_remount+0x440/0x440 [ocfs2]
[13148.785634] legacy_get_tree+0x27/0x48
[13148.787466] vfs_get_tree+0x25/0xd0
[13148.789270] do_new_mount+0x18c/0x2d9
[13148.791046] __x64_sys_mount+0x10e/0x142
[13148.792911] do_syscall_64+0x3b/0x89
[13148.794667] entry_SYSCALL_64_after_hwframe+0x170/0x0
[13148.797051] RIP: 0033:0x7f2309f6e26e
[13148.798784] RSP: 002b:00007ffdcee7d408 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[13148.801974] RAX: ffffffffffffffda RBX: 00007ffdcee7d4a0 RCX: 00007f2309f6e26e
[13148.804815] RDX: 0000559aa762a8ae RSI: 0000559aa939d340 RDI: 0000559aa93a22b0
[13148.807719] RBP: 00007ffdcee7d5b0 R08: 0000559aa93a2290 R09: 00007f230a0b4820
[13148.810659] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcee7d420
[13148.813609] R13: 0000000000000000 R14: 0000559aa939f000 R15: 0000000000000000
[13148.816564] </TASK>

To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
introduced the feature to mount ocfs2 locally even it is cluster based,
that is a very dangerous, it can easily cause serious data corruption,
there is no way to stop other nodes mounting the fs and corrupting it.
Setup ha or other cluster-aware stack is just the cost that we have to
take for avoiding corruption, otherwise we have to do it in kernel.

Link: https://lkml.kernel.org/r/20220603222801.42488-1-junxiao.bi@oracle.com
Fixes: 912f655d78c5("ocfs2: mount shared volume without ha stack")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <heming.zhao@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


# 46f6301f 03-Jun-2022 Junxiao Bi <ocfs2-devel@oss.oracle.com>

Revert "ocfs2: mount shared volume without ha stack"

commit c80af0c250c8f8a3c978aa5aafbe9c39b336b813 upstream.

This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.

This commit introduced

Revert "ocfs2: mount shared volume without ha stack"

commit c80af0c250c8f8a3c978aa5aafbe9c39b336b813 upstream.

This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.

This commit introduced a regression that can cause mount hung. The
changes in __ocfs2_find_empty_slot causes that any node with none-zero
node number can grab the slot that was already taken by node 0, so node 1
will access the same journal with node 0, when it try to grab journal
cluster lock, it will hung because it was already acquired by node 0.
It's very easy to reproduce this, in one cluster, mount node 0 first, then
node 1, you will see the following call trace from node 1.

[13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
[13148.739691] Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
[13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[13148.745846] task:mount.ocfs2 state:D stack: 0 pid:53045 ppid: 53044 flags:0x00004000
[13148.749354] Call Trace:
[13148.750718] <TASK>
[13148.752019] ? usleep_range+0x90/0x89
[13148.753882] __schedule+0x210/0x567
[13148.755684] schedule+0x44/0xa8
[13148.757270] schedule_timeout+0x106/0x13c
[13148.759273] ? __prepare_to_swait+0x53/0x78
[13148.761218] __wait_for_common+0xae/0x163
[13148.763144] __ocfs2_cluster_lock.constprop.0+0x1d6/0x870 [ocfs2]
[13148.765780] ? ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.768312] ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.770968] ocfs2_journal_init+0x91/0x340 [ocfs2]
[13148.773202] ocfs2_check_volume+0x39/0x461 [ocfs2]
[13148.775401] ? iput+0x69/0xba
[13148.777047] ocfs2_mount_volume.isra.0.cold+0x40/0x1f5 [ocfs2]
[13148.779646] ocfs2_fill_super+0x54b/0x853 [ocfs2]
[13148.781756] mount_bdev+0x190/0x1b7
[13148.783443] ? ocfs2_remount+0x440/0x440 [ocfs2]
[13148.785634] legacy_get_tree+0x27/0x48
[13148.787466] vfs_get_tree+0x25/0xd0
[13148.789270] do_new_mount+0x18c/0x2d9
[13148.791046] __x64_sys_mount+0x10e/0x142
[13148.792911] do_syscall_64+0x3b/0x89
[13148.794667] entry_SYSCALL_64_after_hwframe+0x170/0x0
[13148.797051] RIP: 0033:0x7f2309f6e26e
[13148.798784] RSP: 002b:00007ffdcee7d408 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[13148.801974] RAX: ffffffffffffffda RBX: 00007ffdcee7d4a0 RCX: 00007f2309f6e26e
[13148.804815] RDX: 0000559aa762a8ae RSI: 0000559aa939d340 RDI: 0000559aa93a22b0
[13148.807719] RBP: 00007ffdcee7d5b0 R08: 0000559aa93a2290 R09: 00007f230a0b4820
[13148.810659] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcee7d420
[13148.813609] R13: 0000000000000000 R14: 0000559aa939f000 R15: 0000000000000000
[13148.816564] </TASK>

To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
introduced the feature to mount ocfs2 locally even it is cluster based,
that is a very dangerous, it can easily cause serious data corruption,
there is no way to stop other nodes mounting the fs and corrupting it.
Setup ha or other cluster-aware stack is just the cost that we have to
take for avoiding corruption, otherwise we have to do it in kernel.

Link: https://lkml.kernel.org/r/20220603222801.42488-1-junxiao.bi@oracle.com
Fixes: 912f655d78c5("ocfs2: mount shared volume without ha stack")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <heming.zhao@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


# 46f6301f 03-Jun-2022 Junxiao Bi <ocfs2-devel@oss.oracle.com>

Revert "ocfs2: mount shared volume without ha stack"

commit c80af0c250c8f8a3c978aa5aafbe9c39b336b813 upstream.

This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.

This commit introduced

Revert "ocfs2: mount shared volume without ha stack"

commit c80af0c250c8f8a3c978aa5aafbe9c39b336b813 upstream.

This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.

This commit introduced a regression that can cause mount hung. The
changes in __ocfs2_find_empty_slot causes that any node with none-zero
node number can grab the slot that was already taken by node 0, so node 1
will access the same journal with node 0, when it try to grab journal
cluster lock, it will hung because it was already acquired by node 0.
It's very easy to reproduce this, in one cluster, mount node 0 first, then
node 1, you will see the following call trace from node 1.

[13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
[13148.739691] Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
[13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[13148.745846] task:mount.ocfs2 state:D stack: 0 pid:53045 ppid: 53044 flags:0x00004000
[13148.749354] Call Trace:
[13148.750718] <TASK>
[13148.752019] ? usleep_range+0x90/0x89
[13148.753882] __schedule+0x210/0x567
[13148.755684] schedule+0x44/0xa8
[13148.757270] schedule_timeout+0x106/0x13c
[13148.759273] ? __prepare_to_swait+0x53/0x78
[13148.761218] __wait_for_common+0xae/0x163
[13148.763144] __ocfs2_cluster_lock.constprop.0+0x1d6/0x870 [ocfs2]
[13148.765780] ? ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.768312] ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.770968] ocfs2_journal_init+0x91/0x340 [ocfs2]
[13148.773202] ocfs2_check_volume+0x39/0x461 [ocfs2]
[13148.775401] ? iput+0x69/0xba
[13148.777047] ocfs2_mount_volume.isra.0.cold+0x40/0x1f5 [ocfs2]
[13148.779646] ocfs2_fill_super+0x54b/0x853 [ocfs2]
[13148.781756] mount_bdev+0x190/0x1b7
[13148.783443] ? ocfs2_remount+0x440/0x440 [ocfs2]
[13148.785634] legacy_get_tree+0x27/0x48
[13148.787466] vfs_get_tree+0x25/0xd0
[13148.789270] do_new_mount+0x18c/0x2d9
[13148.791046] __x64_sys_mount+0x10e/0x142
[13148.792911] do_syscall_64+0x3b/0x89
[13148.794667] entry_SYSCALL_64_after_hwframe+0x170/0x0
[13148.797051] RIP: 0033:0x7f2309f6e26e
[13148.798784] RSP: 002b:00007ffdcee7d408 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[13148.801974] RAX: ffffffffffffffda RBX: 00007ffdcee7d4a0 RCX: 00007f2309f6e26e
[13148.804815] RDX: 0000559aa762a8ae RSI: 0000559aa939d340 RDI: 0000559aa93a22b0
[13148.807719] RBP: 00007ffdcee7d5b0 R08: 0000559aa93a2290 R09: 00007f230a0b4820
[13148.810659] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcee7d420
[13148.813609] R13: 0000000000000000 R14: 0000559aa939f000 R15: 0000000000000000
[13148.816564] </TASK>

To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
introduced the feature to mount ocfs2 locally even it is cluster based,
that is a very dangerous, it can easily cause serious data corruption,
there is no way to stop other nodes mounting the fs and corrupting it.
Setup ha or other cluster-aware stack is just the cost that we have to
take for avoiding corruption, otherwise we have to do it in kernel.

Link: https://lkml.kernel.org/r/20220603222801.42488-1-junxiao.bi@oracle.com
Fixes: 912f655d78c5("ocfs2: mount shared volume without ha stack")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <heming.zhao@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


# 46f6301f 03-Jun-2022 Junxiao Bi <ocfs2-devel@oss.oracle.com>

Revert "ocfs2: mount shared volume without ha stack"

commit c80af0c250c8f8a3c978aa5aafbe9c39b336b813 upstream.

This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.

This commit introduced

Revert "ocfs2: mount shared volume without ha stack"

commit c80af0c250c8f8a3c978aa5aafbe9c39b336b813 upstream.

This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.

This commit introduced a regression that can cause mount hung. The
changes in __ocfs2_find_empty_slot causes that any node with none-zero
node number can grab the slot that was already taken by node 0, so node 1
will access the same journal with node 0, when it try to grab journal
cluster lock, it will hung because it was already acquired by node 0.
It's very easy to reproduce this, in one cluster, mount node 0 first, then
node 1, you will see the following call trace from node 1.

[13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
[13148.739691] Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
[13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[13148.745846] task:mount.ocfs2 state:D stack: 0 pid:53045 ppid: 53044 flags:0x00004000
[13148.749354] Call Trace:
[13148.750718] <TASK>
[13148.752019] ? usleep_range+0x90/0x89
[13148.753882] __schedule+0x210/0x567
[13148.755684] schedule+0x44/0xa8
[13148.757270] schedule_timeout+0x106/0x13c
[13148.759273] ? __prepare_to_swait+0x53/0x78
[13148.761218] __wait_for_common+0xae/0x163
[13148.763144] __ocfs2_cluster_lock.constprop.0+0x1d6/0x870 [ocfs2]
[13148.765780] ? ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.768312] ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.770968] ocfs2_journal_init+0x91/0x340 [ocfs2]
[13148.773202] ocfs2_check_volume+0x39/0x461 [ocfs2]
[13148.775401] ? iput+0x69/0xba
[13148.777047] ocfs2_mount_volume.isra.0.cold+0x40/0x1f5 [ocfs2]
[13148.779646] ocfs2_fill_super+0x54b/0x853 [ocfs2]
[13148.781756] mount_bdev+0x190/0x1b7
[13148.783443] ? ocfs2_remount+0x440/0x440 [ocfs2]
[13148.785634] legacy_get_tree+0x27/0x48
[13148.787466] vfs_get_tree+0x25/0xd0
[13148.789270] do_new_mount+0x18c/0x2d9
[13148.791046] __x64_sys_mount+0x10e/0x142
[13148.792911] do_syscall_64+0x3b/0x89
[13148.794667] entry_SYSCALL_64_after_hwframe+0x170/0x0
[13148.797051] RIP: 0033:0x7f2309f6e26e
[13148.798784] RSP: 002b:00007ffdcee7d408 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[13148.801974] RAX: ffffffffffffffda RBX: 00007ffdcee7d4a0 RCX: 00007f2309f6e26e
[13148.804815] RDX: 0000559aa762a8ae RSI: 0000559aa939d340 RDI: 0000559aa93a22b0
[13148.807719] RBP: 00007ffdcee7d5b0 R08: 0000559aa93a2290 R09: 00007f230a0b4820
[13148.810659] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcee7d420
[13148.813609] R13: 0000000000000000 R14: 0000559aa939f000 R15: 0000000000000000
[13148.816564] </TASK>

To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
introduced the feature to mount ocfs2 locally even it is cluster based,
that is a very dangerous, it can easily cause serious data corruption,
there is no way to stop other nodes mounting the fs and corrupting it.
Setup ha or other cluster-aware stack is just the cost that we have to
take for avoiding corruption, otherwise we have to do it in kernel.

Link: https://lkml.kernel.org/r/20220603222801.42488-1-junxiao.bi@oracle.com
Fixes: 912f655d78c5("ocfs2: mount shared volume without ha stack")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <heming.zhao@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


# 46f6301f 03-Jun-2022 Junxiao Bi <ocfs2-devel@oss.oracle.com>

Revert "ocfs2: mount shared volume without ha stack"

commit c80af0c250c8f8a3c978aa5aafbe9c39b336b813 upstream.

This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.

This commit introduced

Revert "ocfs2: mount shared volume without ha stack"

commit c80af0c250c8f8a3c978aa5aafbe9c39b336b813 upstream.

This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.

This commit introduced a regression that can cause mount hung. The
changes in __ocfs2_find_empty_slot causes that any node with none-zero
node number can grab the slot that was already taken by node 0, so node 1
will access the same journal with node 0, when it try to grab journal
cluster lock, it will hung because it was already acquired by node 0.
It's very easy to reproduce this, in one cluster, mount node 0 first, then
node 1, you will see the following call trace from node 1.

[13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
[13148.739691] Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
[13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[13148.745846] task:mount.ocfs2 state:D stack: 0 pid:53045 ppid: 53044 flags:0x00004000
[13148.749354] Call Trace:
[13148.750718] <TASK>
[13148.752019] ? usleep_range+0x90/0x89
[13148.753882] __schedule+0x210/0x567
[13148.755684] schedule+0x44/0xa8
[13148.757270] schedule_timeout+0x106/0x13c
[13148.759273] ? __prepare_to_swait+0x53/0x78
[13148.761218] __wait_for_common+0xae/0x163
[13148.763144] __ocfs2_cluster_lock.constprop.0+0x1d6/0x870 [ocfs2]
[13148.765780] ? ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.768312] ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.770968] ocfs2_journal_init+0x91/0x340 [ocfs2]
[13148.773202] ocfs2_check_volume+0x39/0x461 [ocfs2]
[13148.775401] ? iput+0x69/0xba
[13148.777047] ocfs2_mount_volume.isra.0.cold+0x40/0x1f5 [ocfs2]
[13148.779646] ocfs2_fill_super+0x54b/0x853 [ocfs2]
[13148.781756] mount_bdev+0x190/0x1b7
[13148.783443] ? ocfs2_remount+0x440/0x440 [ocfs2]
[13148.785634] legacy_get_tree+0x27/0x48
[13148.787466] vfs_get_tree+0x25/0xd0
[13148.789270] do_new_mount+0x18c/0x2d9
[13148.791046] __x64_sys_mount+0x10e/0x142
[13148.792911] do_syscall_64+0x3b/0x89
[13148.794667] entry_SYSCALL_64_after_hwframe+0x170/0x0
[13148.797051] RIP: 0033:0x7f2309f6e26e
[13148.798784] RSP: 002b:00007ffdcee7d408 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[13148.801974] RAX: ffffffffffffffda RBX: 00007ffdcee7d4a0 RCX: 00007f2309f6e26e
[13148.804815] RDX: 0000559aa762a8ae RSI: 0000559aa939d340 RDI: 0000559aa93a22b0
[13148.807719] RBP: 00007ffdcee7d5b0 R08: 0000559aa93a2290 R09: 00007f230a0b4820
[13148.810659] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcee7d420
[13148.813609] R13: 0000000000000000 R14: 0000559aa939f000 R15: 0000000000000000
[13148.816564] </TASK>

To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
introduced the feature to mount ocfs2 locally even it is cluster based,
that is a very dangerous, it can easily cause serious data corruption,
there is no way to stop other nodes mounting the fs and corrupting it.
Setup ha or other cluster-aware stack is just the cost that we have to
take for avoiding corruption, otherwise we have to do it in kernel.

Link: https://lkml.kernel.org/r/20220603222801.42488-1-junxiao.bi@oracle.com
Fixes: 912f655d78c5("ocfs2: mount shared volume without ha stack")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <heming.zhao@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


12345678910>>...23