========================
libATA Developer's Guide
========================

:Author: Jeff Garzik

Introduction
============

libATA is a library used inside the Linux kernel to support ATA host
controllers and devices. libATA provides an ATA driver API, class
transports for ATA and ATAPI devices, and SCSI<->ATA translation for ATA
devices according to the T10 SAT specification.

This Guide documents the libATA driver API, library functions, library
internals, and a couple of sample ATA low-level drivers.

libata Driver API
=================

:c:type:`struct ata_port_operations <ata_port_operations>`
is defined for every low-level libata
hardware driver, and it controls how the low-level driver interfaces
with the ATA and SCSI layers.

FIS-based drivers will hook into the system with ``->qc_prep()`` and
``->qc_issue()`` high-level hooks. Hardware which behaves in a manner
similar to PCI IDE hardware may utilize several generic helpers,
defining at a bare minimum the bus I/O addresses of the ATA shadow
register blocks.

:c:type:`struct ata_port_operations <ata_port_operations>`
----------------------------------------------------------

Disable ATA port
~~~~~~~~~~~~~~~~

::

    void (*port_disable) (struct ata_port *);


Called from :c:func:`ata_bus_probe` error path, as well as when unregistering
from the SCSI module (rmmod, hot unplug). This function should do
whatever needs to be done to take the port out of use. In most cases,
:c:func:`ata_port_disable` can be used as this hook.

Called from :c:func:`ata_bus_probe` on a failed probe. Called from
:c:func:`ata_scsi_release`.

Post-IDENTIFY device configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    void (*dev_config) (struct ata_port *, struct ata_device *);


Called after IDENTIFY [PACKET] DEVICE is issued to each device found.
Typically used to apply device-specific fixups prior to issue of SET
FEATURES - XFER MODE, and prior to operation.

This entry may be specified as NULL in ata_port_operations.

Set PIO/DMA mode
~~~~~~~~~~~~~~~~

::

    void (*set_piomode) (struct ata_port *, struct ata_device *);
    void (*set_dmamode) (struct ata_port *, struct ata_device *);
    void (*post_set_mode) (struct ata_port *);
    unsigned int (*mode_filter) (struct ata_port *, struct ata_device *, unsigned int);


Hooks called prior to the issue of SET FEATURES - XFER MODE command. The
optional ``->mode_filter()`` hook is called when libata has built a mask of
the possible modes. This is passed to the ``->mode_filter()`` function
which should return a mask of valid modes after filtering those
unsuitable due to hardware limits. It is not valid to use this interface
to add modes.

``dev->pio_mode`` and ``dev->dma_mode`` are guaranteed to be valid when
``->set_piomode()`` and ``->set_dmamode()`` are called. The timings for
any other drive sharing the cable will also be valid at this point. That
is, the library records the decisions for the modes of each drive on a
channel before it attempts to set any of them.

``->post_set_mode()`` is called unconditionally, after the SET FEATURES -
XFER MODE command completes successfully.

``->set_piomode()`` is always called (if present), but ``->set_dmamode()``
is only called if DMA is possible.

Taskfile read/write
~~~~~~~~~~~~~~~~~~~

::

    void (*sff_tf_load) (struct ata_port *ap, struct ata_taskfile *tf);
    void (*sff_tf_read) (struct ata_port *ap, struct ata_taskfile *tf);


``->sff_tf_load()`` is called to load the given taskfile into hardware
registers / DMA buffers. ``->sff_tf_read()`` is called to read the hardware
registers / DMA buffers, to obtain the current set of taskfile register
values.
Most drivers for taskfile-based hardware (PIO or MMIO) use
:c:func:`ata_sff_tf_load` and :c:func:`ata_sff_tf_read` for these hooks.

PIO data read/write
~~~~~~~~~~~~~~~~~~~

::

    void (*sff_data_xfer) (struct ata_device *, unsigned char *, unsigned int, int);


All bmdma-style drivers must implement this hook. This is the low-level
operation that actually copies the data bytes during a PIO data
transfer. Typically the driver will choose one of
:c:func:`ata_sff_data_xfer_noirq`, :c:func:`ata_sff_data_xfer`, or
:c:func:`ata_sff_data_xfer32`.

ATA command execute
~~~~~~~~~~~~~~~~~~~

::

    void (*sff_exec_command)(struct ata_port *ap, struct ata_taskfile *tf);


Causes an ATA command, previously loaded with ``->sff_tf_load()``, to be
initiated in hardware. Most drivers for taskfile-based hardware use
:c:func:`ata_sff_exec_command` for this hook.

Per-cmd ATAPI DMA capabilities filter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    int (*check_atapi_dma) (struct ata_queued_cmd *qc);


Allow low-level driver to filter ATA PACKET commands, returning a status
indicating whether or not it is OK to use DMA for the supplied PACKET
command.

This hook may be specified as NULL, in which case libata will assume
that ATAPI DMA can be supported.

Read specific ATA shadow registers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    u8 (*sff_check_status)(struct ata_port *ap);
    u8 (*sff_check_altstatus)(struct ata_port *ap);


Reads the Status/AltStatus ATA shadow register from hardware. On some
hardware, reading the Status register has the side effect of clearing
the interrupt condition. Most drivers for taskfile-based hardware use
:c:func:`ata_sff_check_status` for this hook.
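
The difference between the two status hooks can be illustrated with a small
user-space sketch. Everything here is a mock: ``struct demo_port`` and the
``demo_*`` functions are hypothetical stand-ins, not libata symbols, and the
"interrupt pending" flag only models the side effect described above. The
bit values are the standard ATA Status register bits.

```c
#include <assert.h>
#include <stdint.h>

/* ATA Status register bits (values from the ATA standard). */
#define DEMO_ATA_BUSY 0x80
#define DEMO_ATA_DRDY 0x40
#define DEMO_ATA_DRQ  0x08
#define DEMO_ATA_ERR  0x01

/* Hypothetical stand-in for struct ata_port; not the kernel structure. */
struct demo_port {
    uint8_t status;      /* value the mock hardware currently reports */
    int     irq_pending; /* models the hardware interrupt condition */
};

/* Mock ->sff_check_status(): reading Status also clears the interrupt. */
static uint8_t demo_check_status(struct demo_port *ap)
{
    ap->irq_pending = 0;
    return ap->status;
}

/* Mock ->sff_check_altstatus(): same value, no side effect. */
static uint8_t demo_check_altstatus(struct demo_port *ap)
{
    return ap->status;
}

/* Returns 1 when the device is idle and ready: !BSY && DRDY && !DRQ.
 * Uses AltStatus so that a pending interrupt is left untouched. */
static int demo_port_ready(struct demo_port *ap)
{
    uint8_t st = demo_check_altstatus(ap);

    return (st & (DEMO_ATA_BUSY | DEMO_ATA_DRDY | DEMO_ATA_DRQ)) ==
           DEMO_ATA_DRDY;
}
```

In this sketch a polling helper reads AltStatus, so only an explicit call to
the Status hook acknowledges the interrupt.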

Write specific ATA shadow register
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    void (*sff_set_devctl)(struct ata_port *ap, u8 ctl);


Write the device control ATA shadow register to the hardware. Most
drivers don't need to define this.

Select ATA device on bus
~~~~~~~~~~~~~~~~~~~~~~~~

::

    void (*sff_dev_select)(struct ata_port *ap, unsigned int device);


Issues the low-level hardware command(s) that cause one of N hardware
devices to be considered 'selected' (active and available for use) on
the ATA bus. This generally has no meaning on FIS-based devices.

Most drivers for taskfile-based hardware use :c:func:`ata_sff_dev_select` for
this hook.

Private tuning method
~~~~~~~~~~~~~~~~~~~~~

::

    void (*set_mode) (struct ata_port *ap);


By default libata performs drive and controller tuning in accordance
with the ATA timing rules and also applies blacklists and cable limits.
Some controllers need special handling and have custom tuning rules,
typically RAID controllers that use ATA commands but do not actually do
drive timing.

    **Warning**

    This hook should not be used to replace the standard controller
    tuning logic when a controller has quirks. Replacing the default
    tuning logic in that case would bypass handling for drive and bridge
    quirks that may be important to data reliability. If a controller
    needs to filter the mode selection it should use the mode_filter
    hook instead.
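
As a rough illustration of the ``mode_filter`` approach recommended in the
warning, the following user-space sketch caps the UDMA modes a controller
accepts. The mask layout, the ``DEMO_*`` constants, and the ``demo_*``
structures are assumptions made for this example only; they are not the
kernel's definitions. The point is that the filter only ever clears bits in
the mask it receives.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative mask layout, assumed for this sketch only:
 * bits 0-6 = PIO0-6, bits 7-11 = MWDMA0-4, bits 12-19 = UDMA0-7. */
#define DEMO_SHIFT_UDMA 12
#define DEMO_MASK_UDMA  (0xffu << DEMO_SHIFT_UDMA)

/* Hypothetical stand-ins for struct ata_port / struct ata_device. */
struct demo_port   { int max_udma; /* highest UDMA mode the controller handles, 0-7 */ };
struct demo_device { int unused; };

/* Mock ->mode_filter(): drop UDMA modes above the controller limit.
 * The result is always a subset of the incoming mask. */
static unsigned int demo_mode_filter(struct demo_port *ap,
                                     struct demo_device *dev,
                                     unsigned int mask)
{
    unsigned int allowed_udma;

    (void)dev; /* no per-device restriction in this sketch */

    /* Bits for UDMA0..max_udma. */
    allowed_udma = ((1u << (ap->max_udma + 1)) - 1) << DEMO_SHIFT_UDMA;

    /* Keep every non-UDMA bit, and only the allowed UDMA bits. */
    return mask & (~DEMO_MASK_UDMA | allowed_udma);
}
```

Because the function returns ``mask & something``, it cannot introduce a mode
that libata did not already consider possible.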

Control PCI IDE BMDMA engine
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    void (*bmdma_setup) (struct ata_queued_cmd *qc);
    void (*bmdma_start) (struct ata_queued_cmd *qc);
    void (*bmdma_stop) (struct ata_port *ap);
    u8   (*bmdma_status) (struct ata_port *ap);


When setting up an IDE BMDMA transaction, these hooks arm
(``->bmdma_setup``), fire (``->bmdma_start``), and halt (``->bmdma_stop``) the
hardware's DMA engine. ``->bmdma_status`` is used to read the standard PCI
IDE DMA Status register.

These hooks are typically either no-ops, or simply not implemented, in
FIS-based drivers.

Most legacy IDE drivers use :c:func:`ata_bmdma_setup` for the
:c:func:`bmdma_setup` hook. :c:func:`ata_bmdma_setup` will write the pointer
to the PRD table to the IDE PRD Table Address register, enable DMA in the DMA
Command register, and call :c:func:`exec_command` to begin the transfer.

Most legacy IDE drivers use :c:func:`ata_bmdma_start` for the
:c:func:`bmdma_start` hook. :c:func:`ata_bmdma_start` will write the
ATA_DMA_START flag to the DMA Command register.

Many legacy IDE drivers use :c:func:`ata_bmdma_stop` for the
:c:func:`bmdma_stop` hook. :c:func:`ata_bmdma_stop` clears the ATA_DMA_START
flag in the DMA command register.

Many legacy IDE drivers use :c:func:`ata_bmdma_status` as the
:c:func:`bmdma_status` hook.

High-level taskfile hooks
~~~~~~~~~~~~~~~~~~~~~~~~~

::

    void (*qc_prep) (struct ata_queued_cmd *qc);
    int (*qc_issue) (struct ata_queued_cmd *qc);


Higher-level hooks, these two hooks can potentially supersede several of
the above taskfile/DMA engine hooks. ``->qc_prep`` is called after the
buffers have been DMA-mapped, and is typically used to populate the
hardware's DMA scatter-gather table. Most drivers use the standard
:c:func:`ata_qc_prep` helper function, but more advanced drivers roll their
own.
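
To make "populate the hardware's DMA scatter-gather table" more concrete,
here is a minimal user-space sketch of filling a PCI IDE style PRD table
from a list of DMA segments. It is not the kernel helper: ``struct
demo_prd``, ``struct demo_sg``, and ``demo_fill_prd()`` are hypothetical
names, and a real driver would also convert each field to little-endian. The
sketch follows the usual BMDMA rules that a PRD entry must not cross a
64 KiB boundary, that a 64 KiB length is encoded as 0, and that the last
entry carries the end-of-table flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PRD_EOT (1u << 31) /* end-of-table flag in the last entry */

/* Hypothetical stand-ins for a PRD entry and a DMA-mapped segment. */
struct demo_prd { uint32_t addr; uint32_t flags_len; };
struct demo_sg  { uint32_t dma_addr; uint32_t len; };

/* Fill prd[] from sg[]. Returns the number of entries written, or 0 if
 * there is no data or the table is too small. */
static size_t demo_fill_prd(const struct demo_sg *sg, size_t n_sg,
                            struct demo_prd *prd, size_t max_prd)
{
    size_t pi = 0;

    for (size_t i = 0; i < n_sg; i++) {
        uint32_t addr = sg[i].dma_addr;
        uint32_t left = sg[i].len;

        while (left) {
            uint32_t offset = addr & 0xffff;
            uint32_t room = 0x10000 - offset; /* bytes up to the boundary */
            uint32_t len = left > room ? room : left;

            if (pi == max_prd)
                return 0; /* table overflow */

            prd[pi].addr = addr;
            prd[pi].flags_len = len & 0xffff; /* 64 KiB is stored as 0 */
            pi++;

            left -= len;
            addr += len;
        }
    }

    if (pi)
        prd[pi - 1].flags_len |= DEMO_PRD_EOT;

    return pi;
}
```

A segment that straddles a 64 KiB boundary is therefore split into two
entries, which is why the table may need more entries than there are
segments.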

``->qc_issue`` is used to make a command active, once the hardware and S/G
tables have been prepared. IDE BMDMA drivers use the helper function
:c:func:`ata_qc_issue_prot` for taskfile protocol-based dispatch. More
advanced drivers implement their own ``->qc_issue``.

:c:func:`ata_qc_issue_prot` calls ``->tf_load()``, ``->bmdma_setup()``, and
``->bmdma_start()`` as necessary to initiate a transfer.

Exception and probe handling (EH)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    void (*eng_timeout) (struct ata_port *ap);
    void (*phy_reset) (struct ata_port *ap);


Deprecated. Use ``->error_handler()`` instead.

::

    void (*freeze) (struct ata_port *ap);
    void (*thaw) (struct ata_port *ap);


:c:func:`ata_port_freeze` is called when HSM violations or some other
condition disrupts normal operation of the port. A frozen port is not
allowed to perform any operation until the port is thawed, which usually
follows a successful reset.

The optional ``->freeze()`` callback can be used for freezing the port
hardware-wise (e.g. mask interrupt and stop DMA engine). If a port
cannot be frozen hardware-wise, the interrupt handler must ack and clear
interrupts unconditionally while the port is frozen.

The optional ``->thaw()`` callback is called to perform the opposite of
``->freeze()``: prepare the port for normal operation once again. Unmask
interrupts, start DMA engine, etc.

::

    void (*error_handler) (struct ata_port *ap);


``->error_handler()`` is a driver's hook into probe, hotplug, recovery,
and other exceptional conditions. The primary responsibility of an
implementation is to call :c:func:`ata_do_eh` or :c:func:`ata_bmdma_drive_eh`
with a set of EH hooks as arguments:

'prereset' hook (may be NULL) is called during an EH reset, before any
other actions are taken.

'postreset' hook (may be NULL) is called after the EH reset is
performed.

Based on existing conditions, severity of the problem, and
hardware capabilities, either 'softreset' (may be NULL) or 'hardreset'
(may be NULL) will be called to perform the low-level EH reset.

::

    void (*post_internal_cmd) (struct ata_queued_cmd *qc);


Perform any hardware-specific actions necessary to finish processing
after executing a probe-time or EH-time command via
:c:func:`ata_exec_internal`.

Hardware interrupt handling
~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    irqreturn_t (*irq_handler)(int, void *, struct pt_regs *);
    void (*irq_clear) (struct ata_port *);


``->irq_handler`` is the interrupt handling routine registered with the
system, by libata. ``->irq_clear`` is called during probe just before the
interrupt handler is registered, to be sure hardware is quiet.

The second argument, dev_instance, should be cast to a pointer to
:c:type:`struct ata_host_set <ata_host_set>`.

Most legacy IDE drivers use :c:func:`ata_sff_interrupt` for the irq_handler
hook, which scans all ports in the host_set, determines which queued
command was active (if any), and calls ata_sff_host_intr(ap,qc).

Most legacy IDE drivers use :c:func:`ata_sff_irq_clear` for the
:c:func:`irq_clear` hook, which simply clears the interrupt and error flags
in the DMA status register.

SATA phy read/write
~~~~~~~~~~~~~~~~~~~

::

    int (*scr_read) (struct ata_port *ap, unsigned int sc_reg,
                     u32 *val);
    int (*scr_write) (struct ata_port *ap, unsigned int sc_reg,
                      u32 val);


Read and write standard SATA phy registers. Currently only used if
``->phy_reset`` hook called the :c:func:`sata_phy_reset` helper function.
sc_reg is one of SCR_STATUS, SCR_CONTROL, SCR_ERROR, or SCR_ACTIVE.
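
The shape of these two hooks can be sketched in user space with a mock
register file. All names below (``struct demo_port``, ``demo_scr_read()``,
``demo_scr_write()``, ``demo_link_online()``, and the ``DEMO_SCR_*`` indices)
are hypothetical and chosen for this example; in particular the numeric
register indices are an assumption. The link check relies on the SATA
definition of the SStatus DET field (bits 3:0), where the value 3 means a
device is present and phy communication is established.

```c
#include <assert.h>
#include <stdint.h>

/* Register indices for the mock; the numeric values are assumptions. */
enum {
    DEMO_SCR_STATUS  = 0,
    DEMO_SCR_ERROR   = 1,
    DEMO_SCR_CONTROL = 2,
    DEMO_SCR_ACTIVE  = 3,
    DEMO_SCR_COUNT
};

/* Hypothetical stand-in for struct ata_port with a fake register file. */
struct demo_port { uint32_t scr[DEMO_SCR_COUNT]; };

/* Mock ->scr_read(): 0 on success, -1 for an unknown register. */
static int demo_scr_read(struct demo_port *ap, unsigned int sc_reg,
                         uint32_t *val)
{
    if (sc_reg >= DEMO_SCR_COUNT)
        return -1;
    *val = ap->scr[sc_reg];
    return 0;
}

/* Mock ->scr_write(): 0 on success, -1 for an unknown register. */
static int demo_scr_write(struct demo_port *ap, unsigned int sc_reg,
                          uint32_t val)
{
    if (sc_reg >= DEMO_SCR_COUNT)
        return -1;
    ap->scr[sc_reg] = val;
    return 0;
}

/* Returns 1 if SStatus.DET reports an established link, else 0. */
static int demo_link_online(struct demo_port *ap)
{
    uint32_t sstatus;

    if (demo_scr_read(ap, DEMO_SCR_STATUS, &sstatus))
        return 0;
    return (sstatus & 0xf) == 0x3;
}
```

Returning an error code for an unsupported register, rather than a dummy
value, lets callers distinguish "register reads as zero" from "register not
available".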

Init and shutdown
~~~~~~~~~~~~~~~~~

::

    int (*port_start) (struct ata_port *ap);
    void (*port_stop) (struct ata_port *ap);
    void (*host_stop) (struct ata_host_set *host_set);


``->port_start()`` is called just after the data structures for each port
are initialized. Typically this is used to alloc per-port DMA buffers /
tables / rings, enable DMA engines, and similar tasks. Some drivers also
use this entry point as a chance to allocate driver-private memory for
``ap->private_data``.

Many drivers use :c:func:`ata_port_start` as this hook or call it from their
own :c:func:`port_start` hooks. :c:func:`ata_port_start` allocates space for
a legacy IDE PRD table and returns.

``->port_stop()`` is called before ``->host_stop()``. Its sole function is to
release DMA/memory resources, now that they are no longer actively being
used. Many drivers also free driver-private data from the port at this time.

``->host_stop()`` is called after all ``->port_stop()`` calls have completed.
The hook must finalize hardware shutdown, release DMA and other
resources, etc. This hook may be specified as NULL, in which case it is
not called.

Error handling
==============

This chapter describes how errors are handled under libata. Readers are
advised to read SCSI EH (Documentation/scsi/scsi_eh.txt) and ATA
exceptions doc first.

Origins of commands
-------------------

In libata, a command is represented with
:c:type:`struct ata_queued_cmd <ata_queued_cmd>` or qc.
qc's are preallocated during port initialization and reused
for command executions. Currently only one qc is allocated per port but
the yet-to-be-merged NCQ branch allocates one for each tag and maps each qc
to an NCQ tag 1-to-1.

libata commands can originate from two sources - libata itself and SCSI
midlayer.
libata internal commands are used for initialization and error
handling. All normal blk requests and commands for SCSI emulation are
passed as SCSI commands through queuecommand callback of SCSI host
template.

How commands are issued
-----------------------

Internal commands
    First, qc is allocated and initialized using :c:func:`ata_qc_new_init`.
    Although :c:func:`ata_qc_new_init` doesn't implement any wait or retry
    mechanism when qc is not available, internal commands are currently
    issued only during initialization and error recovery, so no other
    command is active and allocation is guaranteed to succeed.

    Once allocated, the qc's taskfile is initialized for the command to be
    executed. qc currently has two mechanisms to notify completion. One
    is via ``qc->complete_fn()`` callback and the other is completion
    ``qc->waiting``. ``qc->complete_fn()`` callback is the asynchronous path
    used by normal SCSI translated commands and ``qc->waiting`` is the
    synchronous (issuer sleeps in process context) path used by internal
    commands.

    Once initialization is complete, host_set lock is acquired and the
    qc is issued.

SCSI commands
    All libata drivers use :c:func:`ata_scsi_queuecmd` as
    ``hostt->queuecommand`` callback. scmds can either be simulated or
    translated. No qc is involved in processing a simulated scmd. The
    result is computed right away and the scmd is completed.

    For a translated scmd, :c:func:`ata_qc_new_init` is invoked to allocate a
    qc and the scmd is translated into the qc. SCSI midlayer's
    completion notification function pointer is stored into
    ``qc->scsidone``.

    ``qc->complete_fn()`` callback is used for completion notification. ATA
    commands use :c:func:`ata_scsi_qc_complete` while ATAPI commands use
    :c:func:`atapi_qc_complete`. Both functions end up calling ``qc->scsidone``
    to notify upper layer when the qc is finished.
    After translation is
    completed, the qc is issued with :c:func:`ata_qc_issue`.

    Note that SCSI midlayer invokes hostt->queuecommand while holding
    host_set lock, so all of the above occurs while holding host_set lock.

How commands are processed
--------------------------

Depending on which protocol and which controller are used, commands are
processed differently. For the purpose of discussion, a controller which
uses taskfile interface and all standard callbacks is assumed.

Currently 6 ATA command protocols are used. They can be sorted into the
following four categories according to how they are processed.

ATA NO DATA or DMA
    ATA_PROT_NODATA and ATA_PROT_DMA fall into this category. These
    types of commands don't require any software intervention once
    issued. Device will raise interrupt on completion.

ATA PIO
    ATA_PROT_PIO is in this category. libata currently implements PIO
    with polling. ATA_NIEN bit is set to turn off interrupt and
    pio_task on ata_wq performs polling and IO.

ATAPI NODATA or DMA
    ATA_PROT_ATAPI_NODATA and ATA_PROT_ATAPI_DMA are in this
    category. packet_task is used to poll BSY bit after issuing PACKET
    command. Once BSY is turned off by the device, packet_task
    transfers CDB and hands off processing to interrupt handler.

ATAPI PIO
    ATA_PROT_ATAPI is in this category. ATA_NIEN bit is set and, as
    in ATAPI NODATA or DMA, packet_task submits cdb. However, after
    submitting cdb, further processing (data transfer) is handed off to
    pio_task.

How commands are completed
--------------------------

Once issued, all qc's are either completed with :c:func:`ata_qc_complete` or
time out. For commands which are handled by interrupts,
:c:func:`ata_host_intr` invokes :c:func:`ata_qc_complete`, and, for PIO tasks,
pio_task invokes :c:func:`ata_qc_complete`. In error cases, packet_task may
also complete commands.
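
The four processing categories and who completes each command in the normal
case can be summarized with a small user-space sketch. The ``demo_*`` enums,
struct, and function are hypothetical and only mirror the prose above; they
are not libata definitions, and the sketch ignores error paths.

```c
#include <assert.h>

/* Hypothetical mirror of the six protocols discussed in the text. */
enum demo_prot {
    DEMO_PROT_NODATA,
    DEMO_PROT_DMA,
    DEMO_PROT_PIO,
    DEMO_PROT_ATAPI_NODATA,
    DEMO_PROT_ATAPI_DMA,
    DEMO_PROT_ATAPI
};

/* Which context calls the completion function in the normal case. */
enum demo_completer { DEMO_BY_IRQ, DEMO_BY_PIO_TASK };

struct demo_plan {
    int needs_packet_task;         /* packet_task polls BSY and sends the CDB */
    int set_nien;                  /* interrupts turned off via ATA_NIEN */
    enum demo_completer completer; /* who completes the qc */
};

/* Map a protocol to its processing plan, following the four categories. */
static struct demo_plan demo_plan_for(enum demo_prot prot)
{
    /* Default: ATA NO DATA or DMA, no software intervention once issued. */
    struct demo_plan plan = { 0, 0, DEMO_BY_IRQ };

    switch (prot) {
    case DEMO_PROT_NODATA:
    case DEMO_PROT_DMA:
        break;
    case DEMO_PROT_PIO:            /* ATA PIO: polled by pio_task */
        plan.set_nien = 1;
        plan.completer = DEMO_BY_PIO_TASK;
        break;
    case DEMO_PROT_ATAPI_NODATA:   /* ATAPI NODATA or DMA */
    case DEMO_PROT_ATAPI_DMA:
        plan.needs_packet_task = 1;
        break;
    case DEMO_PROT_ATAPI:          /* ATAPI PIO: packet_task, then pio_task */
        plan.needs_packet_task = 1;
        plan.set_nien = 1;
        plan.completer = DEMO_BY_PIO_TASK;
        break;
    }
    return plan;
}
```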

:c:func:`ata_qc_complete` does the following.

1. DMA memory is unmapped.

2. ATA_QCFLAG_ACTIVE is cleared from qc->flags.

3. :c:func:`qc->complete_fn` callback is invoked. If the return value of the
   callback is not zero, completion is short-circuited and
   :c:func:`ata_qc_complete` returns.

4. :c:func:`__ata_qc_complete` is called, which does the following.

   1. ``qc->flags`` is cleared to zero.

   2. ``ap->active_tag`` and ``qc->tag`` are poisoned.

   3. ``qc->waiting`` is cleared & completed (in that order).

   4. qc is deallocated by clearing appropriate bit in ``ap->qactive``.

So, it basically notifies upper layer and deallocates qc. One exception
is the short-circuit path in #3, which is used by :c:func:`atapi_qc_complete`.

For all non-ATAPI commands, whether a command fails or not, almost the same
code path is taken and very little error handling takes place. A qc is
completed with success status if it succeeded, with failed status
otherwise.

However, failed ATAPI commands require more handling as REQUEST SENSE is
needed to acquire sense data. If an ATAPI command fails,
:c:func:`ata_qc_complete` is invoked with error status, which in turn invokes
:c:func:`atapi_qc_complete` via ``qc->complete_fn()`` callback.

This makes :c:func:`atapi_qc_complete` set ``scmd->result`` to
SAM_STAT_CHECK_CONDITION, complete the scmd and return 1. As the
sense data is empty but ``scmd->result`` is CHECK CONDITION, SCSI midlayer
will invoke EH for the scmd, and returning 1 makes :c:func:`ata_qc_complete`
return without deallocating the qc. This leads us to
:c:func:`ata_scsi_error` with a partially completed qc.

:c:func:`ata_scsi_error`
------------------------

:c:func:`ata_scsi_error` is the current ``transportt->eh_strategy_handler()``
for libata. As discussed above, this will be entered in two cases -
timeout and ATAPI error completion.
This function calls low level libata
driver's :c:func:`eng_timeout` callback, the standard callback for which is
:c:func:`ata_eng_timeout`. It checks if a qc is active and calls
:c:func:`ata_qc_timeout` on the qc if so. Actual error handling occurs in
:c:func:`ata_qc_timeout`.

If EH is invoked for timeout, :c:func:`ata_qc_timeout` stops BMDMA and
completes the qc. Note that as we're currently in EH, we cannot call
scsi_done. As described in SCSI EH doc, a recovered scmd should be
either retried with :c:func:`scsi_queue_insert` or finished with
:c:func:`scsi_finish_command`. Here, we override ``qc->scsidone`` with
:c:func:`scsi_finish_command` and call :c:func:`ata_qc_complete`.

If EH is invoked due to a failed ATAPI qc, the qc here is completed but
not deallocated. The purpose of this half-completion is to use the qc as
a placeholder to make EH code reach this place. This is a bit hackish,
but it works.

Once control reaches here, the qc is deallocated by invoking
:c:func:`__ata_qc_complete` explicitly. Then, internal qc for REQUEST SENSE
is issued. Once sense data is acquired, scmd is finished by directly
invoking :c:func:`scsi_finish_command` on the scmd. Note that as we already
have completed and deallocated the qc which was associated with the
scmd, we don't need to/cannot call :c:func:`ata_qc_complete` again.

Problems with the current EH
----------------------------

- Error representation is too crude. Currently any and all error
  conditions are represented with ATA STATUS and ERROR registers.
  Errors which aren't ATA device errors are treated as ATA device
  errors by setting ATA_ERR bit. Better error descriptor which can
  properly represent ATA and other errors/exceptions is needed.

- When handling timeouts, no action is taken to make device forget
  about the timed out command and ready for new commands.

- EH handling via :c:func:`ata_scsi_error` is not properly protected from
  usual command processing. On EH entrance, the device is not in
  quiescent state. Timed out commands may succeed or fail any time.
  pio_task and atapi_task may still be running.

- Error recovery is too weak. Devices / controllers causing HSM mismatch
  errors and other errors quite often require reset to return to known
  state. Also, advanced error handling is necessary to support features
  like NCQ and hotplug.

- ATA errors are directly handled in the interrupt handler and PIO
  errors in pio_task. This is problematic for advanced error handling
  for the following reasons.

  First, advanced error handling often requires context and internal qc
  execution.

  Second, even a simple failure (say, CRC error) needs information
  gathering and could trigger complex error handling (say, resetting &
  reconfiguring). Having multiple code paths to gather information,
  enter EH and trigger actions makes life painful.

  Third, scattered EH code makes implementing low level drivers
  difficult. Low level drivers override libata callbacks. If EH is
  scattered over several places, each affected callback should perform
  its part of error handling. This can be error prone and painful.

libata Library
==============

.. kernel-doc:: drivers/ata/libata-core.c
   :export:

libata Core Internals
=====================

.. kernel-doc:: drivers/ata/libata-core.c
   :internal:

.. kernel-doc:: drivers/ata/libata-eh.c

libata SCSI translation/emulation
=================================

.. kernel-doc:: drivers/ata/libata-scsi.c
   :export:

.. kernel-doc:: drivers/ata/libata-scsi.c
   :internal:

ATA errors and exceptions
=========================

This chapter tries to identify what error/exception conditions exist for
ATA/ATAPI devices and describe how they should be handled in an
implementation-neutral way.

The term 'error' is used to describe conditions where either an explicit
error condition is reported from device or a command has timed out.

The term 'exception' is either used to describe exceptional conditions
which are not errors (say, power or hotplug events), or to describe both
errors and non-error exceptional conditions. Where explicit distinction
between error and exception is necessary, the term 'non-error exception'
is used.

Exception categories
--------------------

Exceptions are described primarily with respect to legacy taskfile + bus
master IDE interface. If a controller provides other better mechanism
for error reporting, mapping those into categories described below
shouldn't be difficult.

In the following sections, two recovery actions - reset and
reconfiguring transport - are mentioned. These are described further in
`EH recovery actions <#exrec>`__.

HSM violation
~~~~~~~~~~~~~

This error is indicated when STATUS value doesn't match HSM requirement
while issuing or executing any ATA/ATAPI command.

- ATA_STATUS doesn't contain !BSY && DRDY && !DRQ while trying to
  issue a command.

- !BSY && !DRQ during PIO data transfer.

- DRQ on command completion.

- !BSY && ERR after CDB transfer starts but before the last byte of CDB
  is transferred. ATA/ATAPI standard states that "The device shall not
  terminate the PACKET command with an error before the last byte of
  the command packet has been written" in the error outputs description
  of PACKET command and the state diagram doesn't include such
  transitions.
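
The first three conditions in the list above translate directly into
checks on the Status register. The following user-space sketch shows one
way to express them; ``enum demo_hsm_phase`` and ``demo_hsm_violation()`` are
hypothetical names used only here, and the CDB-transfer case is left out
because it also depends on how much of the CDB has been written.

```c
#include <assert.h>
#include <stdint.h>

/* ATA Status register bits (values from the ATA standard). */
#define DEMO_ATA_BUSY 0x80
#define DEMO_ATA_DRDY 0x40
#define DEMO_ATA_DRQ  0x08

/* Hypothetical phases at which the status value is checked. */
enum demo_hsm_phase {
    DEMO_HSM_ISSUE,    /* about to issue a command */
    DEMO_HSM_PIO_DATA, /* in the middle of a PIO data transfer */
    DEMO_HSM_COMPLETE  /* command reported as complete */
};

/* Returns 1 if the status value violates the HSM in the given phase. */
static int demo_hsm_violation(enum demo_hsm_phase phase, uint8_t status)
{
    int bsy  = !!(status & DEMO_ATA_BUSY);
    int drdy = !!(status & DEMO_ATA_DRDY);
    int drq  = !!(status & DEMO_ATA_DRQ);

    switch (phase) {
    case DEMO_HSM_ISSUE:
        /* Must see !BSY && DRDY && !DRQ before issuing. */
        return !(!bsy && drdy && !drq);
    case DEMO_HSM_PIO_DATA:
        /* Device neither busy nor requesting data mid-transfer. */
        return !bsy && !drq;
    case DEMO_HSM_COMPLETE:
        /* DRQ must not be set on completion. */
        return drq;
    }
    return 0;
}
```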

In these cases, HSM is violated and not much information regarding the
error can be acquired from STATUS or ERROR register. IOW, this error can
be anything - driver bug, faulty device, controller and/or cable.

As HSM is violated, reset is necessary to restore known state.
Reconfiguring transport for lower speed might be helpful too as
transmission errors sometimes cause this kind of error.

ATA/ATAPI device error (non-NCQ / non-CHECK CONDITION)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

These are errors detected and reported by ATA/ATAPI devices indicating
device problems. For this type of error, STATUS and ERROR register
values are valid and describe the error condition. Note that some ATA bus
errors are detected by ATA/ATAPI devices and reported using the same
mechanism as device errors. Those cases are described later in this
section.

For ATA commands, this type of error is indicated by !BSY && ERR
during command execution and on completion.

For ATAPI commands,

- !BSY && ERR && ABRT right after issuing PACKET indicates that PACKET
  command is not supported and falls in this category.

- !BSY && ERR(==CHK) && !ABRT after the last byte of CDB is transferred
  indicates CHECK CONDITION and doesn't fall in this category.

- !BSY && ERR(==CHK) && ABRT after the last byte of CDB is transferred
  \*probably\* indicates CHECK CONDITION and doesn't fall in this
  category.

Of errors detected as above, the following are not ATA/ATAPI device
errors but ATA bus errors and should be handled according to
`ATA bus error <#excatATAbusErr>`__.

CRC error during data transfer
    This is indicated by ICRC bit in the ERROR register and means that
    corruption occurred during data transfer.
    Up to ATA/ATAPI-7, the
    standard specifies that this bit is only applicable to UDMA
    transfers but ATA/ATAPI-8 draft revision 1f says that the bit may be
    applicable to multiword DMA and PIO.

ABRT error during data transfer or on completion
    Up to ATA/ATAPI-7, the standard specifies that ABRT could be set on
    ICRC errors and on cases where a device is not able to complete a
    command. Combined with the fact that MWDMA and PIO transfer errors
    aren't allowed to use ICRC bit up to ATA/ATAPI-7, it seems to imply
    that ABRT bit alone could indicate transfer errors.

    However, ATA/ATAPI-8 draft revision 1f removes the part that ICRC
    errors can turn on ABRT. So, this is kind of a gray area. Some
    heuristics are needed here.

ATA/ATAPI device errors can be further categorized as follows.

Media errors
    This is indicated by UNC bit in the ERROR register. ATA devices
    report a UNC error only after a certain number of retries cannot
    recover the data, so there's nothing much else to do other than
    notifying upper layer.

    READ and WRITE commands report CHS or LBA of the first failed sector
    but ATA/ATAPI standard specifies that the amount of transferred data
    on error completion is indeterminate, so we cannot assume that
    sectors preceding the failed sector have been transferred and thus
    cannot complete those sectors successfully as SCSI does.

Media changed / media change requested error
    <<TODO: fill here>>

Address error
    This is indicated by IDNF bit in the ERROR register. Report to upper
    layer.

Other errors
    This can be invalid command or parameter indicated by ABRT ERROR bit
    or some other error condition. Note that ABRT bit can indicate a lot
    of things including ICRC and Address errors. Heuristics needed.

Depending on commands, not all STATUS/ERROR bits are applicable.
These
non-applicable bits are marked with "na" in the output descriptions but
up to ATA/ATAPI-7 no definition of "na" can be found. However,
ATA/ATAPI-8 draft revision 1f describes "N/A" as follows.

    3.2.3.3a N/A
        A keyword the indicates a field has no defined value in this
        standard and should not be checked by the host or device. N/A
        fields should be cleared to zero.

So, it seems reasonable to assume that "na" bits are cleared to zero by
devices and thus need no explicit masking.

ATAPI device CHECK CONDITION
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

ATAPI device CHECK CONDITION error is indicated by set CHK bit (ERR bit)
in the STATUS register after the last byte of CDB is transferred for a
PACKET command. For this kind of error, sense data should be acquired
to gather information regarding the error. REQUEST SENSE packet command
should be used to acquire sense data.

Once sense data is acquired, this type of error can be handled
similarly to other SCSI errors. Note that sense data may indicate ATA
bus error (e.g. Sense Key 04h HARDWARE ERROR && ASC/ASCQ 47h/00h SCSI
PARITY ERROR). In such cases, the error should be considered as an ATA
bus error and handled according to `ATA bus error <#excatATAbusErr>`__.

ATA device error (NCQ)
~~~~~~~~~~~~~~~~~~~~~~

NCQ command error is indicated by cleared BSY and set ERR bit during NCQ
command phase (one or more NCQ commands outstanding). Although STATUS
and ERROR registers will contain valid values describing the error, READ
LOG EXT is required to clear the error condition, determine which
command has failed and acquire more information.

READ LOG EXT Log Page 10h reports which tag has failed and taskfile
register values describing the error.
With this information the failed
command can be handled as a normal ATA command error as in
`ATA/ATAPI device error (non-NCQ / non-CHECK CONDITION) <#excatDevErr>`__
and all other in-flight commands must be retried. Note that this retry
should not be counted, since it's likely that commands retried this way
would have completed normally if it were not for the failed command.

Note that ATA bus errors can be reported as ATA device NCQ errors. This
should be handled as described in `ATA bus error <#excatATAbusErr>`__.

If READ LOG EXT Log Page 10h fails or reports NQ, we're thoroughly
screwed. This condition should be treated according to
`HSM violation <#excatHSMviolation>`__.

ATA bus error
~~~~~~~~~~~~~

ATA bus error means that data corruption occurred during transmission
over the ATA bus (SATA or PATA). This type of error can be indicated by

- ICRC or ABRT error as described in
  `ATA/ATAPI device error (non-NCQ / non-CHECK CONDITION) <#excatDevErr>`__.

- Controller-specific error completion with error information
  indicating transmission error.

- On some controllers, command timeout. In this case, there may be a
  mechanism to determine that the timeout is due to transmission error.

- Unknown/random errors, timeouts and all sorts of weirdities.

As described above, transmission errors can cause a wide variety of
symptoms ranging from device ICRC errors to random device lockups, and,
in many cases, there is no way to tell whether an error condition is due
to a transmission error or not; therefore, it's necessary to employ some
kind of heuristic when dealing with errors and timeouts. For example,
encountering repeated ABRT errors for a command known to be supported is
likely to indicate an ATA bus error.

Once it's determined that ATA bus errors have possibly occurred,
lowering the ATA bus transmission speed is one of the actions which may
alleviate the problem.
See `Reconfigure transport <#exrecReconf>`__ for
more information.

PCI bus error
~~~~~~~~~~~~~

Data corruption or other failures during transmission over PCI (or
another system bus). For standard BMDMA, this is indicated by the Error
bit in the BMDMA Status register. This type of error must be logged as
it indicates something is very wrong with the system. Resetting the host
controller is recommended.

Late completion
~~~~~~~~~~~~~~~

This occurs when a timeout occurs and the timeout handler finds out that
the timed out command has completed successfully or with error. This is
usually caused by lost interrupts. This type of error must be logged.
Resetting the host controller is recommended.

Unknown error (timeout)
~~~~~~~~~~~~~~~~~~~~~~~

This is when a timeout occurs and the command is still processing or the
host and device are in an unknown state. When this occurs, the HSM could
be in any valid or invalid state. To bring the device to a known state
and make it forget about the timed out command, resetting is necessary.
The timed out command may be retried.

Timeouts can also be caused by transmission errors. Refer to
`ATA bus error <#excatATAbusErr>`__ for more details.

Hotplug and power management exceptions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<TODO: fill here>>

EH recovery actions
-------------------

This section discusses several important recovery actions.

Clearing error condition
~~~~~~~~~~~~~~~~~~~~~~~~

Many controllers require their error registers to be cleared by the
error handler. Different controllers may have different requirements.

For SATA, it's strongly recommended to clear at least the SError
register during error handling.

Reset
~~~~~

During EH, resetting is necessary in the following cases.

- HSM is in unknown or invalid state

- HBA is in unknown or invalid state

- EH needs to make HBA/device forget about in-flight commands

- HBA/device behaves weirdly

Resetting during EH might be a good idea regardless of the error
condition to improve EH robustness. Whether to reset both or only one of
HBA and device depends on the situation, but the following scheme is
recommended.

- When it's known that the HBA is in ready state but the ATA/ATAPI
  device is in unknown state, reset only the device.

- If the HBA is in unknown state, reset both HBA and device.

HBA resetting is implementation specific. For a controller complying
with taskfile/BMDMA PCI IDE, stopping the active DMA transaction may be
sufficient iff BMDMA state is the only HBA context. But even controllers
that mostly comply with taskfile/BMDMA PCI IDE may have implementation
specific requirements and mechanisms to reset themselves. This must be
addressed by specific drivers.

On the other hand, the ATA/ATAPI standard describes in detail ways to
reset ATA/ATAPI devices.

PATA hardware reset
    This is a hardware initiated device reset signalled by asserting the
    PATA RESET- signal. There is no standard way to initiate a hardware
    reset from software, although some hardware provides registers that
    allow the driver to directly tweak the RESET- signal.

Software reset
    This is achieved by turning the CONTROL SRST bit on for at least
    5us. Both PATA and SATA support it but, in the case of SATA, this
    may require controller-specific support as the second Register FIS
    to clear SRST should be transmitted while the BSY bit is still set.
    Note that on PATA, this resets both master and slave devices on a
    channel.

EXECUTE DEVICE DIAGNOSTIC command
    Although the ATA/ATAPI standard doesn't describe it exactly, EDD
    implies some level of resetting, possibly similar to that of a
    software reset.
    Host-side EDD protocol can be handled with normal command processing
    and most SATA controllers should be able to handle EDDs just like
    other commands. As with software reset, EDD affects both devices on
    a PATA bus.

    Although EDD does reset devices, it doesn't suit error handling as
    EDD cannot be issued while BSY is set and it's unclear how it will
    act when the device is in an unknown/weird state.

ATAPI DEVICE RESET command
    This is very similar to software reset except that the reset can be
    restricted to the selected device without affecting the other device
    sharing the cable.

SATA phy reset
    This is the preferred way of resetting a SATA device. In effect,
    it's identical to PATA hardware reset. Note that this can be done
    with the standard SCR Control register. As such, it's usually easier
    to implement than software reset.

One more thing to consider when resetting devices is that resetting
clears certain configuration parameters, and they need to be set to
their previous or newly adjusted values after reset.

The affected parameters are as follows.

- CHS set up with INITIALIZE DEVICE PARAMETERS (seldom used)

- Parameters set with SET FEATURES including transfer mode setting

- Block count set with SET MULTIPLE MODE

- Other parameters (SET MAX, MEDIA LOCK...)

The ATA/ATAPI standard specifies that some parameters must be
maintained across hardware or software reset, but doesn't strictly
specify all of them. Always reconfiguring the needed parameters after
reset is required for robustness. Note that this also applies when
resuming from deep sleep (power-off).

Also, the ATA/ATAPI standard requires that IDENTIFY DEVICE / IDENTIFY
PACKET DEVICE be issued after any configuration parameter is updated or
after a hardware reset, and that the result be used for further
operation. The OS driver is required to implement a revalidation
mechanism to support this.
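
The recommended scheme for choosing what to reset can be sketched in C
as follows. This is a minimal illustration only: ``struct eh_state``,
``enum eh_reset_scope``, ``eh_pick_reset_scope()`` and
``eh_needs_reconfig()`` are hypothetical names invented for this example
and are not libata identifiers. Returning "no reset" when both HBA and
device are in a known state is also an assumption of the sketch; as
noted above, resetting regardless of the error condition may still be a
good idea.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not part of libata. */
enum eh_reset_scope {
	EH_RESET_NONE,		/* no reset strictly required */
	EH_RESET_DEVICE,	/* reset only the ATA/ATAPI device */
	EH_RESET_BOTH,		/* reset both HBA and device */
};

struct eh_state {
	bool hba_known_ready;	/* HBA is known to be in ready state */
	bool dev_state_known;	/* device/HSM is in a known, valid state */
};

/* Pick the reset scope following the scheme recommended in the text. */
enum eh_reset_scope eh_pick_reset_scope(const struct eh_state *st)
{
	/* HBA in unknown state: reset both HBA and device. */
	if (!st->hba_known_ready)
		return EH_RESET_BOTH;

	/* HBA is ready but the device is in unknown state: device only. */
	if (!st->dev_state_known)
		return EH_RESET_DEVICE;

	/* Assumption of this sketch: nothing forces a reset here. */
	return EH_RESET_NONE;
}

/*
 * Any reset that touches the device may clear configuration parameters,
 * so the caller must reprogram them and reissue IDENTIFY afterwards.
 */
bool eh_needs_reconfig(enum eh_reset_scope scope)
{
	return scope != EH_RESET_NONE;
}
```

A caller would fill in ``struct eh_state`` from whatever the driver knows
about the controller and device, pick the scope, perform the
controller-specific reset, and then reconfigure and revalidate the
device whenever ``eh_needs_reconfig()`` returns true.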

Reconfigure transport
~~~~~~~~~~~~~~~~~~~~~

For both PATA and SATA, a lot of corners are cut for cheap connectors,
cables or controllers and it's quite common to see a high transmission
error rate. This can be mitigated by lowering the transmission speed.

The following is a possible scheme Jeff Garzik suggested.

    If more than $N (3?) transmission errors happen in 15 minutes,

    - if SATA, decrease SATA PHY speed. if speed cannot be decreased,

    - decrease UDMA xfer speed. if at UDMA0, switch to PIO4,

    - decrease PIO xfer speed. if at PIO3, complain, but continue

ata_piix Internals
===================

.. kernel-doc:: drivers/ata/ata_piix.c
   :internal:

sata_sil Internals
===================

.. kernel-doc:: drivers/ata/sata_sil.c
   :internal:

Thanks
======

The bulk of the ATA knowledge comes thanks to long conversations with
Andre Hedrick (www.linux-ide.org), and long hours pondering the ATA and
SCSI specifications.

Thanks to Alan Cox for pointing out similarities between SATA and SCSI,
and in general for motivation to hack on libata.

libata's device detection method, ata_pio_devchk, and in general all
the early probing was based on extensive study of Hale Landis's
probe/reset code in his ATADRVR driver (www.ata-atapi.com).