Home
last modified time | relevance | path

Searched hist:"176774 cf" (Results 1 – 1 of 1) sorted by relevance

/openbmc/ipmitool/src/plugins/lanplus/
H A Dlanplus.c176774cf Sun Mar 23 02:15:51 CDT 2014 Zdenek Styblik <stybla@turnovfree.net> ID: 298 - fix LANplus retry

``I had submitted a patch back on Nov 19, 2013 regarding a fix to lanplus retry.
This had resolved a problem whereby a retry of a payload type of
IPMI_PAYLOAD_TYPE_IPMI first removed the request from the queue before going
back for a retry of the message. I have been able to determine why this fix
works correctly. More importantly I have been able to resolve other retry
problems in lanplus where assertion panics were hitting on certain retry
operations. A new, replacement patch for resolving both of these types of retry
bugs follows.

The first bug,where the ipmi_lanplus_send_payload() is sending a payload type of
IPMI_PAYLOAD_TYPE_IPMI is retryable, however I found in testing that it did not
remove the previous request entry from the list of requests chain. If the
original message had timed out, a second message sent, the second reply would
not match up to the right entry on the list as the req command and sequence
numbers are the same. By first removing the first request from the chain this
resolves it. The consequence of not removing the stale entry was random errors.

The second bug is when waiting for a message response times out during the
ipmi_lanplus_send_payload types IPMI_PAYLOAD_TYPE(s) RCMP_OPEN_REQUEST, RAKP1,
RAKP_3. In various testing where the message timed out on either of these three
payload types, ipmitool would assertion panic upon retry as the session_state
was wrong. The timeout could be due to the message never getting to the BMC, the
BMC never acting/responding to the message, or the reply message packet dropped
(it is UDP after all). If the BMC had acted on the message but the reply was not
received, the BMC state would had advanced, and a retry of any of these three
commands would error. It is not knowable at retry time if the BMC had acted on
the message or not. The solution is upon message timeout failure, retry all
three commands in the sequence. This has shown to be reliable and does not
result in assertions or any unexpected BMC behaviors. Should the original
message response eventually arrive very late, it is just discarded.

The testing for these problems was elusive until we found a moderately slow BMC
and had separate sessions direct a fusillade of nmap operations on the BMC, then
run simple ipmitool commands. This caused sufficient loading of the network and
BMC to cause lengthy delays and outright packet drops. The general approach on
the second fix is to return a timeout error code back through ipmi_lanplus_open
where the sequence can be retried.''

Patch-by: Pat Donlin