Revision tags: v6.6.67, v6.6.66, v6.6.65, v6.6.64, v6.6.63, v6.6.62, v6.6.61, v6.6.60 |
|
#
e50e86db |
| 03-Nov-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.59' into for/openbmc/dev-6.6
This is the 6.6.59 stable release
|
Revision tags: v6.6.59 |
|
#
24f63812 |
| 23-Oct-2024 |
Niklas Cassel <cassel@kernel.org> |
ata: libata: Set DID_TIME_OUT for commands that actually timed out
commit 8e59a2a5459fd9840dbe2cbde85fe154b11e1727 upstream.
When ata_qc_complete() schedules a command for EH using ata_qc_schedule_
ata: libata: Set DID_TIME_OUT for commands that actually timed out
commit 8e59a2a5459fd9840dbe2cbde85fe154b11e1727 upstream.
When ata_qc_complete() schedules a command for EH using ata_qc_schedule_eh(), blk_abort_request() will be called, which leads to req->q->mq_ops->timeout() / scsi_timeout() being called.
scsi_timeout(), if the LLDD has no abort handler (libata has no abort handler), will set host byte to DID_TIME_OUT, and then call scsi_eh_scmd_add() to add the command to EH.
Thus, when commands first enter libata's EH strategy_handler, all the commands that have been added to EH will have DID_TIME_OUT set.
Commit e5dd410acb34 ("ata: libata: Clear DID_TIME_OUT for ATA PT commands with sense data") clears this bogus DID_TIME_OUT flag for all commands that reached libata's EH strategy_handler.
libata has its own flag (AC_ERR_TIMEOUT), that it sets for commands that have not received a completion at the time of entering EH.
ata_eh_worth_retry() has no special handling for AC_ERR_TIMEOUT, so by default timed out commands will get flag ATA_QCFLAG_RETRY set, and will be retried after the port has been reset (ata_eh_link_autopsy() always triggers a port reset if any command has AC_ERR_TIMEOUT set).
For a command that has ATA_QCFLAG_RETRY set, while also having an error flag set (e.g. AC_ERR_TIMEOUT), ata_eh_finish() will not increment scmd->allowed, so the command will at most be retried scmd->allowed number of times (which by default is set to 3).
However, scsi_eh_flush_done_q() will only retry commands for which scsi_noretry_cmd() returns false.
For a command that has DID_TIME_OUT set, while also having either the FAILFAST flag set, or the command being a passthrough command, scsi_noretry_cmd() will return true. Thus, such a command will never be retried.
Thus, make sure that libata sets SCSI's DID_TIME_OUT flag for commands that actually timed out (libata's AC_ERR_TIMEOUT flag), such that timed out commands will once again not be retried if they are also a FAILFAST or passthrough command.
Cc: stable@vger.kernel.org Fixes: e5dd410acb34 ("ata: libata: Clear DID_TIME_OUT for ATA PT commands with sense data") Reported-by: Lai, Yi <yi1.lai@linux.intel.com> Closes: https://lore.kernel.org/linux-ide/ZxYz871I3Blsi30F@ly-workstation/ Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20241023105540.1070012-2-cassel@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v6.6.58 |
|
#
7b7fd0ac |
| 17-Oct-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.57' into for/openbmc/dev-6.6
This is the 6.6.57 stable release
|
Revision tags: v6.6.57, v6.6.56, v6.6.55 |
|
#
31c62224 |
| 08-Oct-2024 |
Niklas Cassel <cassel@kernel.org> |
ata: libata: avoid superfluous disk spin down + spin up during hibernation
commit a38719e3157118428e34fbd45b0d0707a5877784 upstream.
A user reported that commit aa3998dbeb3a ("ata: libata-scsi: Dis
ata: libata: avoid superfluous disk spin down + spin up during hibernation
commit a38719e3157118428e34fbd45b0d0707a5877784 upstream.
A user reported that commit aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop") introduced a spin down + immediate spin up of the disk both when entering and when resuming from hibernation. This behavior was not there before, and causes an increased latency both when entering and when resuming from hibernation.
Hibernation is done by three consecutive PM events, in the following order: 1) PM_EVENT_FREEZE 2) PM_EVENT_THAW 3) PM_EVENT_HIBERNATE
Commit aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop") modified ata_eh_handle_port_suspend() to call ata_dev_power_set_standby() (which spins down the disk), for both event PM_EVENT_FREEZE and event PM_EVENT_HIBERNATE.
Documentation/driver-api/pm/devices.rst, section "Entering Hibernation", explicitly mentions that PM_EVENT_FREEZE does not have to be put the device in a low-power state, and actually recommends not doing so. Thus, let's not spin down the disk on PM_EVENT_FREEZE. (The disk will instead be spun down during the subsequent PM_EVENT_HIBERNATE event.)
This way, PM_EVENT_FREEZE will behave as it did before commit aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop"), while PM_EVENT_HIBERNATE will continue to spin down the disk.
This will avoid the superfluous spin down + spin up when entering and resuming from hibernation, while still making sure that the disk is spun down before actually entering hibernation.
Cc: stable@vger.kernel.org # v6.6+ Fixes: aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop") Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20241008135843.1266244-2-cassel@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
34d6f206 |
| 07-Oct-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.54' into for/openbmc/dev-6.6
This is the 6.6.54 stable release
|
Revision tags: v6.6.54, v6.6.53, v6.6.52, v6.6.51 |
|
#
27f113dc |
| 09-Sep-2024 |
Niklas Cassel <cassel@kernel.org> |
ata: libata: Clear DID_TIME_OUT for ATA PT commands with sense data
[ Upstream commit e5dd410acb34c7341a0a93b429dcf3dabf9e3323 ]
When ata_qc_complete() schedules a command for EH using ata_qc_sched
ata: libata: Clear DID_TIME_OUT for ATA PT commands with sense data
[ Upstream commit e5dd410acb34c7341a0a93b429dcf3dabf9e3323 ]
When ata_qc_complete() schedules a command for EH using ata_qc_schedule_eh(), blk_abort_request() will be called, which leads to req->q->mq_ops->timeout() / scsi_timeout() being called.
scsi_timeout(), if the LLDD has no abort handler (libata has no abort handler), will set host byte to DID_TIME_OUT, and then call scsi_eh_scmd_add() to add the command to EH.
Thus, when commands first enter libata's EH strategy_handler, all the commands that have been added to EH will have DID_TIME_OUT set.
libata has its own flag (AC_ERR_TIMEOUT), that it sets for commands that have not received a completion at the time of entering EH.
Thus, libata doesn't really care about DID_TIME_OUT at all, and currently clears the host byte at the end of EH, in ata_scsi_qc_complete(), before scsi_eh_finish_cmd() is called.
However, this clearing in ata_scsi_qc_complete() is currently only done for commands that are not ATA passthrough commands.
Since the host byte is visible in the completion that we return to user space for ATA passthrough commands, for ATA passthrough commands that got completed via EH (commands with sense data), the user will incorrectly see: ATA pass-through(16): transport error: Host_status=0x03 [DID_TIME_OUT]
Fix this by moving the clearing of the host byte (which is currently only done for commands that are not ATA passthrough commands) from ata_scsi_qc_complete() to the start of EH (regardless if the command is ATA passthrough or not).
While at it, use the proper helper function to clear the host byte, rather than open coding the clearing.
This will make sure that we: -Correctly clear DID_TIME_OUT for both ATA passthrough commands and commands that are not ATA passthrough commands. -Do not needlessly clear the host byte for commands that did not go via EH. ata_scsi_qc_complete() is called both for commands that are completed normally (without going via EH), and for commands that went via EH, however, only commands that went via EH will have DID_TIME_OUT set.
Fixes: 24aeebbf8ea9 ("scsi: ata: libata: Change ata_eh_request_sense() to not set CHECK_CONDITION") Reported-by: Igor Pylypiv <ipylypiv@google.com> Closes: https://lore.kernel.org/linux-ide/ZttIN8He8TOZ7Lct@google.com/ Signed-off-by: Niklas Cassel <cassel@kernel.org> Tested-by: Igor Pylypiv <ipylypiv@google.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.6.50, v6.6.49, v6.6.48, v6.6.47, v6.6.46, v6.6.45, v6.6.44, v6.6.43, v6.6.42, v6.6.41, v6.6.40, v6.6.39, v6.6.38, v6.6.37, v6.6.36, v6.6.35, v6.6.34, v6.6.33, v6.6.32, v6.6.31, v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25 |
|
#
46eeaa11 |
| 03-Apr-2024 |
Andrew Jeffery <andrew@codeconstruct.com.au> |
Merge tag 'v6.6.24' into dev-6.6
This is the 6.6.24 stable release
|
Revision tags: v6.6.24, v6.6.23 |
|
#
a1f506af |
| 19-Mar-2024 |
Damien Le Moal <dlemoal@kernel.org> |
scsi: sd: Fix TCG OPAL unlock on system resume
commit 0c76106cb97548810214def8ee22700bbbb90543 upstream.
Commit 3cc2ffe5c16d ("scsi: sd: Differentiate system and runtime start/stop management") int
scsi: sd: Fix TCG OPAL unlock on system resume
commit 0c76106cb97548810214def8ee22700bbbb90543 upstream.
Commit 3cc2ffe5c16d ("scsi: sd: Differentiate system and runtime start/stop management") introduced the manage_system_start_stop scsi_device flag to allow libata to indicate to the SCSI disk driver that nothing should be done when resuming a disk on system resume. This change turned the execution of sd_resume() into a no-op for ATA devices on system resume. While this solved deadlock issues during device resume, this change also wrongly removed the execution of opal_unlock_from_suspend(). As a result, devices with TCG OPAL locking enabled remain locked and inaccessible after a system resume from sleep.
To fix this issue, introduce the SCSI driver resume method and implement it with the sd_resume() function calling opal_unlock_from_suspend(). The former sd_resume() function is renamed to sd_resume_common() and modified to call the new sd_resume() function. For non-ATA devices, this result in no functional changes.
In order for libata to explicitly execute sd_resume() when a device is resumed during system restart, the function scsi_resume_device() is introduced. libata calls this function from the revalidation work executed on devie resume, a state that is indicated with the new device flag ATA_DFLAG_RESUMING. Doing so, locked TCG OPAL enabled devices are unlocked on resume, allowing normal operation.
Fixes: 3cc2ffe5c16d ("scsi: sd: Differentiate system and runtime start/stop management") Link: https://bugzilla.kernel.org/show_bug.cgi?id=218538 Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20240319071209.1179257-1-dlemoal@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6 |
|
#
95289e49 |
| 29-Sep-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'ata-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ATA fixes from Damien Le Moal: "A larger than usual set of fixes for 6.6-rc4 due to the unexpected num
Merge tag 'ata-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ATA fixes from Damien Le Moal: "A larger than usual set of fixes for 6.6-rc4 due to the unexpected number of fixes needed to address ATA disks suspend/resume issues.
In more detail:
- Add missing additionalProperties on child nodes to the pata-common DT bindings (Rob)
- Fix handling of the REPORT SUPPORTED OPERATION CODES command to ignore reserved bits (Niklas)
- Increase port multiplier soft reset timeout to accomodate slow devices and avoid issues on wakeup (Matthias)
- A couple of minor code fixes to avoid compilation warnings in libata-core and libata-eh (me)
- Many patches from me to address suspend/resume issues, and in particular a potential deadlock on resume due to the SCSI disk driver resume operation not being synchronized with libata EH port resume handling.
This is addressed by changing the scsi disk driver disk start/stop control to allow libata to execute disk suspend (spin down) and resume (spin up) on its own during system suspend/resume. Runtime suspend/resume control remains with the SCSI disk driver.
Other fixes include: - Fix libata power management request issuing to avoid races - Establish a link between ATA ports and SCSI devices to order PM operations - Fix device removal to avoid issues with driver rmmod removal - Fix synchronization of libata device rescan and SCSI disk resume operation - Remove libsas PM operations as suspend/resume is handled directly by the sas controller resume - Fix the SCSI disk driver to not issue commands to suspended disks, thus avoiding potential system lock-up on resume"
* tag 'ata-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: libata-eh: Fix compilation warning in ata_eh_link_report() ata: libata-core: Fix compilation warning in ata_dev_config_ncq() scsi: sd: Do not issue commands to suspended disks on shutdown ata: libata-core: Do not register PM operations for SAS ports ata: libata-scsi: Fix delayed scsi_rescan_device() execution scsi: Do not attempt to rescan suspended devices ata: libata-scsi: Disable scsi device manage_system_start_stop scsi: sd: Differentiate system and runtime start/stop management ata: libata-scsi: link ata port and scsi device ata: libata-core: Fix port and device removal ata: libata-core: Fix ata_port_request_pm() locking ata: libata-sata: increase PMP SRST timeout to 10s ata: libata-scsi: ignore reserved bits for REPORT SUPPORTED OPERATION CODES dt-bindings: ata: pata-common: Add missing additionalProperties on child nodes
show more ...
|
Revision tags: v6.5.5, v6.5.4, v6.5.3 |
|
#
49728bdc |
| 11-Sep-2023 |
Damien Le Moal <dlemoal@kernel.org> |
ata: libata-eh: Fix compilation warning in ata_eh_link_report()
The 6 bytes length of the tries_buf string in ata_eh_link_report() is too short and results in a gcc compilation warning with W-!:
dr
ata: libata-eh: Fix compilation warning in ata_eh_link_report()
The 6 bytes length of the tries_buf string in ata_eh_link_report() is too short and results in a gcc compilation warning with W-!:
drivers/ata/libata-eh.c: In function ‘ata_eh_link_report’: drivers/ata/libata-eh.c:2371:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 4 [-Wformat-truncation=] 2371 | snprintf(tries_buf, sizeof(tries_buf), " t%d", | ^~ drivers/ata/libata-eh.c:2371:56: note: directive argument in the range [-2147483648, 4] 2371 | snprintf(tries_buf, sizeof(tries_buf), " t%d", | ^~~~~~ drivers/ata/libata-eh.c:2371:17: note: ‘snprintf’ output between 4 and 14 bytes into a destination of size 6 2371 | snprintf(tries_buf, sizeof(tries_buf), " t%d", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2372 | ap->eh_tries); | ~~~~~~~~~~~~~
Avoid this warning by increasing the string size to 16B.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
show more ...
|
Revision tags: v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48 |
|
#
aa3998db |
| 25-Aug-2023 |
Damien Le Moal <dlemoal@kernel.org> |
ata: libata-scsi: Disable scsi device manage_system_start_stop
The introduction of a device link to create a consumer/supplier relationship between the scsi device of an ATA device and the ATA port
ata: libata-scsi: Disable scsi device manage_system_start_stop
The introduction of a device link to create a consumer/supplier relationship between the scsi device of an ATA device and the ATA port of that ATA device fixes the ordering of system suspend and resume operations. For suspend, the scsi device is suspended first and the ata port after it. This is fine as this allows the synchronize cache and START STOP UNIT commands issued by the scsi disk driver to be executed before the ata port is disabled.
For resume operations, the ata port is resumed first, followed by the scsi device. This allows having the request queue of the scsi device to be unfrozen after the ata port resume is scheduled in EH, thus avoiding to see new requests prematurely issued to the ATA device. Since libata sets manage_system_start_stop to 1, the scsi disk resume operation also results in issuing a START STOP UNIT command to the device being resumed so that the device exits standby power mode.
However, restoring the ATA device to the active power mode must be synchronized with libata EH processing of the port resume operation to avoid either 1) seeing the start stop unit command being received too early when the port is not yet resumed and ready to accept commands, or after the port resume process issues commands such as IDENTIFY to revalidate the device. In this last case, the risk is that the device revalidation fails with timeout errors as the drive is still spun down.
Commit 0a8589055936 ("ata,scsi: do not issue START STOP UNIT on resume") disabled issuing the START STOP UNIT command to avoid issues with it. But this is incorrect as transitioning a device to the active power mode from the standby power mode set on suspend requires a media access command. The IDENTIFY, READ LOG and SET FEATURES commands executed in libata EH context triggered by the ata port resume operation may thus fail.
Fix these synchronization issues is by handling a device power mode transitions for system suspend and resume directly in libata EH context, without relying on the scsi disk driver management triggered with the manage_system_start_stop flag.
To do this, the following libata helper functions are introduced:
1) ata_dev_power_set_standby():
This function issues a STANDBY IMMEDIATE command to transitiom a device to the standby power mode. For HDDs, this spins down the disks. This function applies only to ATA and ZAC devices and does nothing otherwise. This function also does nothing for devices that have the ATA_FLAG_NO_POWEROFF_SPINDOWN or ATA_FLAG_NO_HIBERNATE_SPINDOWN flag set.
For suspend, call ata_dev_power_set_standby() in ata_eh_handle_port_suspend() before the port is disabled and frozen. ata_eh_unload() is also modified to transition all enabled devices to the standby power mode when the system is shutdown or devices removed.
2) ata_dev_power_set_active() and
This function applies to ATA or ZAC devices and issues a VERIFY command for 1 sector at LBA 0 to transition the device to the active power mode. For HDDs, since this function will complete only once the disk spin up. Its execution uses the same timeouts as for reset, to give the drive enough time to complete spinup without triggering a command timeout.
For resume, call ata_dev_power_set_active() in ata_eh_revalidate_and_attach() after the port has been enabled and before any other command is issued to the device.
With these changes, the manage_system_start_stop and no_start_on_resume scsi device flags do not need to be set in ata_scsi_dev_config(). The flag manage_runtime_start_stop is still set to allow the sd driver to spinup/spindown a disk through the sd runtime operations.
Fixes: 0a8589055936 ("ata,scsi: do not issue START STOP UNIT on resume") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
show more ...
|
#
5804c19b |
| 23-Sep-2023 |
Paolo Bonzini <pbonzini@redhat.com> |
Merge tag 'kvm-riscv-fixes-6.6-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv fixes for 6.6, take #1
- Fix KVM_GET_REG_LIST API for ISA_EXT registers - Fix reading ISA_EXT register of
Merge tag 'kvm-riscv-fixes-6.6-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv fixes for 6.6, take #1
- Fix KVM_GET_REG_LIST API for ISA_EXT registers - Fix reading ISA_EXT register of a missing extension - Fix ISA_EXT register handling in get-reg-list test - Fix filtering of AIA registers in get-reg-list test
show more ...
|
#
cc3e5afc |
| 16-Sep-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'ata-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ata fixes from Damien Le Moal:
- Fix link power management transitions to disallow unsupported states
Merge tag 'ata-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ata fixes from Damien Le Moal:
- Fix link power management transitions to disallow unsupported states (Niklas)
- A small string handling fix for the sata_mv driver (Christophe)
- Clear port pending interrupts before reset, as per AHCI specifications (Szuying).
Followup fixes for this one are to not clear ATA_PFLAG_EH_PENDING in ata_eh_reset() to allow EH to continue on with other actions recorded with error interrupts triggered before EH completes. And an additional fix to avoid thawing a port twice in EH (Niklas)
- Small code style fixes in the pata_parport driver to silence the build bot as it keeps complaining about bad indentation (me)
- A fix for the recent CDL code to avoid fetching sense data for successful commands when not necessary for correct operation (Niklas)
* tag 'ata-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: libata-core: fetch sense data for successful commands iff CDL enabled ata: libata-eh: do not thaw the port twice in ata_eh_reset() ata: libata-eh: do not clear ATA_PFLAG_EH_PENDING in ata_eh_reset() ata: pata_parport: Fix code style issues ata: libahci: clear pending interrupt status ata: sata_mv: Fix incorrect string length computation in mv_dump_mem() ata: libata: disallow dev-initiated LPM transitions to unsupported states
show more ...
|
#
7a3bc2b3 |
| 13-Sep-2023 |
Niklas Cassel <niklas.cassel@wdc.com> |
ata: libata-eh: do not thaw the port twice in ata_eh_reset()
commit 1e641060c4b5 ("libata: clear eh_info on reset completion") added a workaround that broke the retry mechanism in ATA EH.
Tejun him
ata: libata-eh: do not thaw the port twice in ata_eh_reset()
commit 1e641060c4b5 ("libata: clear eh_info on reset completion") added a workaround that broke the retry mechanism in ATA EH.
Tejun himself suggested to remove this workaround when it was identified to cause additional problems: https://lore.kernel.org/linux-ide/20110426135027.GI878@htj.dyndns.org/
He even said: "Hmm... it seems I wasn't thinking straight when I added that work around." https://lore.kernel.org/linux-ide/20110426155229.GM878@htj.dyndns.org/
While removing the workaround solved the issue, however, the workaround was kept to avoid "spurious hotplug events during reset", and instead another workaround was added on top of the existing workaround in commit 8c56cacc724c ("libata: fix unexpectedly frozen port after ata_eh_reset()").
Because these IRQs happened when the port was frozen, we know that they were actually a side effect of PxIS and IS.IPS(x) not being cleared before the COMRESET. This is now done in commit 94152042eaa9 ("ata: libahci: clear pending interrupt status"), so these workarounds can now be removed.
Since commit 1e641060c4b5 ("libata: clear eh_info on reset completion") has now been reverted, the ATA EH retry mechanism is functional again, so there is once again no need to thaw the port more than once in ata_eh_reset().
This reverts "the workaround on top of the workaround" introduced in commit 8c56cacc724c ("libata: fix unexpectedly frozen port after ata_eh_reset()").
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
show more ...
|
#
80cc944e |
| 13-Sep-2023 |
Niklas Cassel <niklas.cassel@wdc.com> |
ata: libata-eh: do not clear ATA_PFLAG_EH_PENDING in ata_eh_reset()
ata_scsi_port_error_handler() starts off by clearing ATA_PFLAG_EH_PENDING, before calling ap->ops->error_handler() (without holdin
ata: libata-eh: do not clear ATA_PFLAG_EH_PENDING in ata_eh_reset()
ata_scsi_port_error_handler() starts off by clearing ATA_PFLAG_EH_PENDING, before calling ap->ops->error_handler() (without holding the ap->lock).
If an error IRQ is received while ap->ops->error_handler() is running, the irq handler will set ATA_PFLAG_EH_PENDING.
Once ap->ops->error_handler() returns, ata_scsi_port_error_handler() checks if ATA_PFLAG_EH_PENDING is set, and if it is, another iteration of ATA EH is performed.
The problem is that ATA_PFLAG_EH_PENDING is not only cleared by ata_scsi_port_error_handler(), it is also cleared by ata_eh_reset().
ata_eh_reset() is called by ap->ops->error_handler(). This additional clearing done by ata_eh_reset() breaks the whole retry logic in ata_scsi_port_error_handler(). Thus, if an error IRQ is received while ap->ops->error_handler() is running, the port will currently remain frozen and will never get re-enabled.
The additional clearing in ata_eh_reset() was introduced in commit 1e641060c4b5 ("libata: clear eh_info on reset completion").
Looking at the original error report: https://marc.info/?l=linux-ide&m=124765325828495&w=2
We can see the following happening: [ 1.074659] ata3: XXX port freeze [ 1.074700] ata3: XXX hardresetting link, stopping engine [ 1.074746] ata3: XXX flipping SControl
[ 1.411471] ata3: XXX irq_stat=400040 CONN|PHY [ 1.411475] ata3: XXX port freeze
[ 1.420049] ata3: XXX starting engine [ 1.420096] ata3: XXX rc=0, class=1 [ 1.420142] ata3: XXX clearing IRQs for thawing [ 1.420188] ata3: XXX port thawed [ 1.420234] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
We are not supposed to be able to receive an error IRQ while the port is frozen (PxIE is set to 0, i.e. all IRQs for the port are disabled).
AHCI 1.3.1 section 10.7.1.1 First Tier (IS Register) states: "Each bit location can be thought of as reporting a '1' if the virtual "interrupt line" for that port is indicating it wishes to generate an interrupt. That is, if a port has one or more interrupt status bit set, and the enables for those status bits are set, then this bit shall be set."
Additionally, AHCI state P:ComInit clearly shows that the state machine will only jump to P:ComInitSetIS (which sets IS.IPS(x) to '1'), if PxIE.PCE is set to '1'. In our case, PxIE is set to 0, so IS.IPS(x) won't get set.
So IS.IPS(x) only gets set if PxIS and PxIE is set.
AHCI 1.3.1 section 10.7.1.1 First Tier (IS Register) also states: "The bits in this register are read/write clear. It is set by the level of the virtual interrupt line being a set, and cleared by a write of '1' from the software."
So if IS.IPS(x) is set, you need to explicitly clear it by writing a 1 to IS.IPS(x) for that port.
Since PxIE is cleared, the only way to get an interrupt while the port is frozen, is if IS.IPS(x) is set, and the only way IS.IPS(x) can be set when the port is frozen, is if it was set before the port was frozen.
However, since commit 737dd811a3db ("ata: libahci: clear pending interrupt status"), we clear both PxIS and IS.IPS(x) after freezing the port, but before the COMRESET, so the problem that commit 1e641060c4b5 ("libata: clear eh_info on reset completion") fixed can no longer happen.
Thus, revert commit 1e641060c4b5 ("libata: clear eh_info on reset completion"), so that the retry logic in ata_scsi_port_error_handler() works once again. (The retry logic is still needed, since we can still get an error IRQ _after_ the port has been thawed, but before ata_scsi_port_error_handler() takes the ap->lock in order to check if ATA_PFLAG_EH_PENDING is set.)
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
show more ...
|
#
c900529f |
| 12-Sep-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-fixes into drm-misc-fixes
Forwarding to v6.6-rc1.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
#
4b3d6e0c |
| 05-Sep-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'ata-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ata updates from Damien Le Moal:
- Fix OF include file for ata platform drivers (Rob)
- Simplify vari
Merge tag 'ata-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ata updates from Damien Le Moal:
- Fix OF include file for ata platform drivers (Rob)
- Simplify various ahci, sata and pata platform drivers using the function devm_platform_ioremap_resource() (Yangtao)
- Cleanup libata time related argument types (e.g. timeouts values) (Sergey)
- Cleanup libata code around error handling as all ata drivers now define a error_handler operation (Hannes and Niklas)
- Remove functions intended for libsas that are in fact unused (Niklas)
- Change the remove device callback of platform drivers to a null function (Uwe)
- Simplify the pata_imx driver using devm_clk_get_enabled() (Li)
- Remove old and uinused remnants of the ide code in arm, parisc, powerpc, sparc and m68k architectures and associated drivers (pata_buddha, pata_falcon and pata_gayle) (Geert)
- Add missing MODULE_DESCRIPTION() in the sata_gemini and pata_ftide010 drivers (me)
- Several fixes for the pata_ep93xx and pata_falcon drivers (Nikita, Michael)
- Add Elkhart Lake AHCI controller support to the ahci driver (Werner)
- Disable NCQ trim on Micron 1100 drives (Pawel)
* tag 'ata-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: (60 commits) ata: libata-core: Disable NCQ_TRIM on Micron 1100 drives ata: ahci: Add Elkhart Lake AHCI controller ata: pata_falcon: add data_swab option to byte-swap disk data ata: pata_falcon: fix IO base selection for Q40 ata: pata_ep93xx: use soc_device_match for UDMA modes ata: pata_ep93xx: fix error return code in probe ata: sata_gemini: Add missing MODULE_DESCRIPTION ata: pata_ftide010: Add missing MODULE_DESCRIPTION m68k: Remove <asm/ide.h> ata: pata_gayle: Remove #include <asm/ide.h> ata: pata_falcon: Remove #include <asm/ide.h> ata: pata_buddha: Remove #include <asm/ide.h> asm-generic: Remove ide_iops.h sparc: Remove <asm/ide.h> powerpc: Remove <asm/ide.h> parisc: Remove <asm/ide.h> ARM: Remove <asm/ide.h> ata: pata_imx: Use helper function devm_clk_get_enabled() ata: sata_rcar: Convert to platform remove callback returning void ata: sata_mv: Convert to platform remove callback returning void ...
show more ...
|
#
1ac731c5 |
| 30-Aug-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.6 merge window.
|
Revision tags: v6.1.46, v6.1.45, v6.1.44 |
|
#
2612e3bb |
| 07-Aug-2023 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-next
Catching-up with drm-next and drm-intel-gt-next. It will unblock a code refactor around the platform definitions (names vs acronyms).
Signed-off-by: Rodrigo V
Merge drm/drm-next into drm-intel-next
Catching-up with drm-next and drm-intel-gt-next. It will unblock a code refactor around the platform definitions (names vs acronyms).
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
show more ...
|
#
9f771739 |
| 07-Aug-2023 |
Joonas Lahtinen <joonas.lahtinen@linux.intel.com> |
Merge drm/drm-next into drm-intel-gt-next
Need to pull in b3e4aae612ec ("drm/i915/hdcp: Modify hdcp_gsc_message msg sending mechanism") as a dependency for https://patchwork.freedesktop.org/series/1
Merge drm/drm-next into drm-intel-gt-next
Need to pull in b3e4aae612ec ("drm/i915/hdcp: Modify hdcp_gsc_message msg sending mechanism") as a dependency for https://patchwork.freedesktop.org/series/121735/
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
show more ...
|
Revision tags: v6.1.43 |
|
#
ff8072d5 |
| 31-Jul-2023 |
Hannes Reinecke <hare@suse.de> |
ata: libata: remove references to non-existing error_handler()
With commit 65a15d6560df ("scsi: ipr: Remove SATA support") all libata drivers now have the error_handler() callback provided, so we ca
ata: libata: remove references to non-existing error_handler()
With commit 65a15d6560df ("scsi: ipr: Remove SATA support") all libata drivers now have the error_handler() callback provided, so we can stop checking for non-existing error_handler callback.
Signed-off-by: Hannes Reinecke <hare@suse.de> [niklas: fixed review comments, rebased, solved conflicts during rebase, fixed bug that unconditionally dumped all QCs, removed the now unused function ata_dump_status(), removed the now unreachable failure paths in atapi_qc_complete(), removed the non-EH function to request ATAPI sense] Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
show more ...
|
#
ca02f225 |
| 29-Jul-2023 |
Sergey Shtylyov <s.shtylyov@omp.ru> |
ata: libata-eh: fix reset timeout type
ata_eh_reset_timeouts[] stores 'unsigned long' timeouts in ms, while ata_eh_reset() passes these values to ata_deadline() that takes just 'unsigned int timeout
ata: libata-eh: fix reset timeout type
ata_eh_reset_timeouts[] stores 'unsigned long' timeouts in ms, while ata_eh_reset() passes these values to ata_deadline() that takes just 'unsigned int timeout_msecs' parameter. Change the reset timeout table element's type to 'unsigned int' -- all timeouts fit into 'unsigned int' but we have to change ULONG_MAX to UINT_MAX...
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
show more ...
|
Revision tags: v6.1.42, v6.1.41 |
|
#
61b73694 |
| 24-Jul-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next
Backmerging to get v6.5-rc2.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
Revision tags: v6.1.40, v6.1.39 |
|
#
50501936 |
| 17-Jul-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v6.4' into next
Sync up with mainline to bring in updates to shared infrastructure.
|
#
0791faeb |
| 17-Jul-2023 |
Mark Brown <broonie@kernel.org> |
ASoC: Merge v6.5-rc2
Get a similar baseline to my other branches, and fixes for people using the branch.
|