95411515 | 10-Oct-2023 |
Alexander Aring <aahringo@redhat.com> |
dlm: fix no ack after final message
[ Upstream commit 6212e4528b248a4bc9b4fe68e029a84689c67461 ]
After a final DLM message we must not send out an ack. This patch moves the ack so that it is sent before the messages are transmitted. Otherwise, if it is the final message and the receiving node has turned into the DLM_CLOSED state, another ack message would be received and turn the receiving node into DLM_ESTABLISHED again.
Fixes: 1696c75f1864 ("fs: dlm: add send ack threshold and append acks to msgs")
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
fd508e08 | 10-Oct-2023 |
Alexander Aring <aahringo@redhat.com> |
dlm: be sure we reset all nodes at forced shutdown
[ Upstream commit e759eb3e27e5b624930548f1c0eda90da6e26ee9 ]
When we run a forced shutdown in either the midcomms or the lowcomms implementation, we make sure to reset all per-node midcomms information.
Fixes: 63e711b08160 ("fs: dlm: create midcomms nodes when configure")
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
e4393e9e | 10-Oct-2023 |
Alexander Aring <aahringo@redhat.com> |
dlm: fix remove member after close call
[ Upstream commit 2776635edc7fcd62e03cb2efb93c31f685887460 ]
The idea of commit 63e711b08160 ("fs: dlm: create midcomms nodes when configure") is to set the midcomms node lifetime when a node joins or leaves the cluster. Currently we can hit the following warning:
  [10844.611495] ------------[ cut here ]------------
  [10844.615913] WARNING: CPU: 4 PID: 84304 at fs/dlm/midcomms.c:1263 dlm_midcomms_remove_member+0x13f/0x180 [dlm]
or run into a state where the midcomms node usage count becomes negative:
  [ 260.830782] node 2 users dec count -1
The first warning happens when a specific node does not exist; it was probably already removed by dlm_midcomms_close(), which is called when a node leaves the cluster. The second kernel log message probably occurs when dlm_midcomms_addr() is called as a node joins the cluster, but the node had left the cluster due to fencing without being removed from the lockspace. If the node joins the cluster after it was removed from the cluster due to fencing, the first call, triggered by user space, is to remove the node from the lockspaces. In both cases, if the node wasn't found or the user count is zero, we should ignore any additional midcomms handling in dlm_midcomms_remove_member().
Fixes: 63e711b08160 ("fs: dlm: create midcomms nodes when configure")
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
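The guard described in this commit message can be illustrated with a small userspace sketch. This is not the kernel code: `struct demo_node`, `demo_lookup()` and `demo_remove_member()` are hypothetical names, and the node table is a plain array rather than the SRCU-protected structure used in fs/dlm. It only shows the idea of ignoring a removal when the node is unknown or its user count is already zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a midcomms node (not the kernel struct). */
struct demo_node {
	int nodeid;
	int users;	/* number of lockspaces currently using this node */
};

/* Linear lookup in a small table; returns NULL if the node is unknown. */
static struct demo_node *demo_lookup(struct demo_node *nodes, size_t n, int nodeid)
{
	assert(nodes != NULL);

	for (size_t i = 0; i < n; i++) {
		if (nodes[i].nodeid == nodeid)
			return &nodes[i];
	}
	return NULL;
}

/*
 * Returns 0 if the user count was decremented, or -1 if the call was
 * ignored because the node is unknown or has no users left.
 */
static int demo_remove_member(struct demo_node *nodes, size_t n, int nodeid)
{
	struct demo_node *node = demo_lookup(nodes, n, nodeid);

	if (!node)
		return -1;	/* node already gone, e.g. after a close call */
	if (node->users == 0)
		return -1;	/* never let the count drop below zero */

	node->users--;
	return 0;
}
```

With this shape, a second removal of the same node is a no-op instead of producing a negative count.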
|
c0c2b346 | 10-Oct-2023 |
Alexander Aring <aahringo@redhat.com> |
dlm: fix creating multiple node structures
[ Upstream commit fe9b619e6e94acf0b068fb1a8f658f5a96b8fad7 ]
This patch looks up existing nodes instead of always creating them when dlm_midcomms_addr() is called. The idea is to create midcomms nodes when user space is informed that nodes join the cluster. This is the case when dlm_midcomms_addr() is called; however, user space can call it multiple times to add several address configurations to one node, e.g. when using SCTP. Those repeated calls need to be filtered out, which we do by checking whether the node already exists. Because of the configfs entry, it is safe to assume that this function is only called once at a time.
Fixes: 63e711b08160 ("fs: dlm: create midcomms nodes when configure")
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
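A minimal userspace sketch of the lookup-before-create pattern described above. All names (`struct demo_table`, `demo_find()`, `demo_add_addr()`) and the `addr_count` field are hypothetical and exist only for illustration; the real code lives in fs/dlm/midcomms.c and uses different data structures. As in the commit message, the sketch assumes callers are serialized, so no locking is shown.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_NODES 8

/* Hypothetical per-node record; addr_count is illustrative only. */
struct demo_node {
	int nodeid;
	int addr_count;
};

struct demo_table {
	struct demo_node nodes[DEMO_MAX_NODES];
	size_t count;
};

/* Returns the existing node for nodeid, or NULL if there is none yet. */
static struct demo_node *demo_find(struct demo_table *t, int nodeid)
{
	assert(t != NULL);

	for (size_t i = 0; i < t->count; i++) {
		if (t->nodes[i].nodeid == nodeid)
			return &t->nodes[i];
	}
	return NULL;
}

/*
 * Called once per configured address. A node is created only on the
 * first call for a given nodeid; later calls reuse the existing node.
 * Returns NULL if the table is full.
 */
static struct demo_node *demo_add_addr(struct demo_table *t, int nodeid)
{
	struct demo_node *node = demo_find(t, nodeid);

	if (!node) {
		if (t->count == DEMO_MAX_NODES)
			return NULL;
		node = &t->nodes[t->count++];
		node->nodeid = nodeid;
		node->addr_count = 0;
	}
	node->addr_count++;
	return node;
}
```

Calling `demo_add_addr()` twice for the same node ID, as a multi-homed SCTP setup would, yields one node structure rather than two.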
|
7c53e847 | 24-Aug-2023 |
Alexander Aring <aahringo@redhat.com> |
dlm: fix plock lookup when using multiple lockspaces
All posix lock ops, for all lockspaces (gfs2 file systems) are sent to userspace (dlm_controld) through a single misc device. The dlm_controld daemon reads the ops from the misc device and sends them to other cluster nodes using separate, per-lockspace cluster api communication channels. The ops for a single lockspace are ordered at this level, so that the results are received in the same sequence that the requests were sent. When the results are sent back to the kernel via the misc device, they are again funneled through the single misc device for all lockspaces. When the dlm code in the kernel processes the results from the misc device, these results will be returned in the same sequence that the requests were sent, on a per-lockspace basis. A recent change in this request/reply matching code missed the "per-lockspace" check (fsid comparison) when matching request and reply, so replies could be incorrectly matched to requests from other lockspaces.
Cc: stable@vger.kernel.org
Reported-by: Barry Marson <bmarson@redhat.com>
Fixes: 57e2c2f2d94c ("fs: dlm: fix mismatch of plock results from userspace")
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
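The missing per-lockspace check can be sketched as follows. This is a simplified userspace illustration, not the kernel's plock code: `struct demo_op` and `demo_match_reply()` are hypothetical, and the pending requests are kept in an array ordered by send time instead of a kernel list. It relies on the ordering property stated above: replies arrive in request order only within one lockspace, so a reply must be matched to the oldest pending request with the same fsid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced plock operation record. */
struct demo_op {
	uint32_t fsid;		/* identifies the lockspace (file system) */
	uint64_t number;	/* illustrative payload */
	int done;		/* nonzero once a reply has been matched */
};

/*
 * pending[] is ordered oldest first. Returns the oldest unanswered
 * request that belongs to the same lockspace as the reply, or NULL.
 */
static struct demo_op *demo_match_reply(struct demo_op *pending, size_t n,
					const struct demo_op *reply)
{
	assert(pending != NULL && reply != NULL);

	for (size_t i = 0; i < n; i++) {
		if (pending[i].done)
			continue;
		/* the per-lockspace check: skip requests of other lockspaces */
		if (pending[i].fsid != reply->fsid)
			continue;
		pending[i].done = 1;
		return &pending[i];
	}
	return NULL;
}
```

Without the fsid comparison, the loop would return the first unanswered request regardless of which lockspace it came from, which is the mismatch the commit describes.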
|
a3d85fcf | 01-Aug-2023 |
Alexander Aring <aahringo@redhat.com> |
fs: dlm: don't use RCOM_NAMES for version detection
Currently RCOM_STATUS and RCOM_NAMES, including their replies, are used to determine the DLM version. The RCOM_NAMES messages are triggered only in DLM recovery, when calling dlm_recover_directory(). At that point the DLM version needs to be determined. I ran some tests and did not experience any issues. When the DLM version detection was developed, I probably once ran into a case where RCOM_NAMES arrived and the version was not detected yet. However, it does not seem to be necessary.
For backwards compatibility we still need to accept RCOM_NAMES messages that are not protected by the DLM message reliability layer, i.e. stateless messages. With this patch, the RCOM_NAMES messages we send out are no longer stateless.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
63e711b0 | 01-Aug-2023 |
Alexander Aring <aahringo@redhat.com> |
fs: dlm: create midcomms nodes when configure
This patch gives a midcomms node the same lifetime as a lowcomms connection. The lowcomms connection lifetime was changed by commit 6f0b0b5d7ae7 ("fs: dlm: remove dlm_node_addrs lookup list"). In the future the midcomms node instances can be merged with the lowcomms connection structure, as the lifetime is the same and states can be controlled via values or flags.
Previously, midcomms nodes were created during version detection. This is no longer necessary now that the nodes are created when the cluster manager configures DLM via configfs. When a midcomms node is created via configfs, it will set DLM_VERSION_NOT_SET as its version. This indicates that the version of the midcomms node is still unknown and needs to be probed via certain rcom messages.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
b9d2f6ad | 01-Aug-2023 |
Alexander Aring <aahringo@redhat.com> |
fs: dlm: drop rxbuf manipulation in dlm_recover_master_copy
Currently dlm_recover_master_copy() manipulates the receive buffer of an rcom lock message and modifies it on the fly, so that a later memcpy() into a new rcom message carries those new values. This patch avoids manipulating the received rcom message by storing the values for the new rcom message in parameters passed by reference. Later, when dlm_send_rcom_lock() constructs a new message and memcpy()s the receive buffer, those values are set on the newly constructed message.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
561c67d8 | 01-Aug-2023 |
Alexander Aring <aahringo@redhat.com> |
fs: dlm: drop rxbuf manipulation in dlm_copy_master_names
This patch removes the manipulation of the receive buffer in the error case, which was done to be sure the buffer is null-terminated before an error message is printed out. Instead of manipulating the receive buffer, we specify inside the format string the maximum length that is read from the string buffer.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
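The technique of bounding a string read in the format string can be shown with standard C `snprintf()` and a `%.*s` precision. This is a userspace sketch; `demo_format_name()` and the message text are made up for illustration, and the kernel patch itself uses the kernel's logging helpers rather than `snprintf()`.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Formats a name that may not be null-terminated. The precision given
 * via "%.*s" limits how many bytes are read from name, so the buffer
 * itself never has to be modified.
 */
static int demo_format_name(char *out, size_t outlen,
			    const char *name, int namelen)
{
	assert(out != NULL && name != NULL && namelen >= 0);

	return snprintf(out, outlen, "bad name \"%.*s\"", namelen, name);
}
```

Because the length travels with the format arguments, the receive buffer can stay read-only on the error path.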
|
c4f4e135 | 01-Aug-2023 |
Alexander Aring <aahringo@redhat.com> |
fs: dlm: get recovery sequence number as parameter
This patch removes a read of the ls->ls_recover_seq uint64_t number in _create_rcom(). When ls->ls_recover_seq is read, the ls_recover_lock needs to be held. However, this number has always already been read when an rcom message is received, and it is not necessary to read it again from the per-lockspace variable to use it for the reply message. This patch passes the sequence number as a parameter, so that another read of ls->ls_recover_seq while holding ls->ls_recover_lock is not required.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
643f5cfa | 01-Aug-2023 |
Alexander Aring <aahringo@redhat.com> |
fs: dlm: cleanup lock order
This patch cleans up the lock order to first take the close_lock and then take the nodes_srcu read lock. It will probably never be a problem, as nodes_srcu is only a read lock that prevents the node pointer from being freed.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
c84c4733 | 01-Aug-2023 |
Alexander Aring <aahringo@redhat.com> |
fs: dlm: remove clear_members_cb
This patch is just a small cleanup to call remove_remote_member() directly instead of going through clear_members_cb(), which just calls remove_remote_member().
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
8c95006d | 01-Aug-2023 |
Alexander Aring <aahringo@redhat.com> |
fs: dlm: add plock dev tracepoints
I am currently debugging NFS plock handling and introduce these two tracepoints to get more information about what happens when user space reads plock operations from the kernel and writes the results back.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
67b5da9a | 01-Aug-2023 |
Alexander Aring <aahringo@redhat.com> |
fs: dlm: check on plock ops when exit dlm
To be sure that no plock ops are left over in either send_list or recv_list, we simply check that each of the lists is empty when we exit the dlm subsystem.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
541adb0d | 01-Aug-2023 |
Alexander Aring <aahringo@redhat.com> |
fs: dlm: debugfs for queued callbacks
While debugging an issue with the callback queue, it was useful to check whether any callbacks in any lkb were for some reason not processed by the callback workqueue. The mentioned issue was fixed by commit a034c1370ded ("fs: dlm: fix DLM_IFL_CB_PENDING gets overwritten"). If a similar issue appears that looks like an ast callback was not processed, we can now confirm that it is not still waiting to be processed by the callback workqueue.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
4b056db8 | 01-Aug-2023 |
Alexander Aring <aahringo@redhat.com> |
fs: dlm: remove unused processed_nodes
The variable processed_nodes was left unused by commit 1696c75f1864 ("fs: dlm: add send ack threshold and append acks to msgs"). This patch removes this leftover of that commit.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|