.. SPDX-License-Identifier: GPL-2.0

.. include:: <isonum.txt>

===============================
Bus lock detection and handling
===============================

:Copyright: |copy| 2021 Intel Corporation
:Authors: - Fenghua Yu <fenghua.yu@intel.com>
          - Tony Luck <tony.luck@intel.com>

Problem
=======

A split lock is any atomic operation whose operand crosses two cache lines.
Since the operand spans two cache lines and the operation must be atomic,
the system locks the bus while the CPU accesses the two cache lines.

A bus lock is acquired through either a split locked access to writeback (WB)
memory or any locked access to non-WB memory. This is typically thousands of
cycles slower than an atomic operation within a cache line. It also disrupts
performance on other cores and can severely degrade the whole system.

Detection
=========

Intel processors may support either or both of the following hardware
mechanisms to detect split locks and bus locks.

#AC exception for split lock detection
--------------------------------------

Beginning with the Tremont Atom CPU, a split lock operation may raise an
Alignment Check (#AC) exception when it is attempted.

#DB exception for bus lock detection
------------------------------------

Some CPUs have the ability to notify the kernel by an #DB trap after a user
instruction acquires a bus lock and is executed. This allows the kernel to
terminate the application or to enforce throttling.

Software handling
=================

The kernel #AC and #DB handlers handle bus locks based on the kernel
parameter "split_lock_detect". Here is a summary of the different options:

+------------------+----------------------------+-----------------------+
|split_lock_detect=|#AC for split lock          |#DB for bus lock       |
+------------------+----------------------------+-----------------------+
|off               |Do nothing                  |Do nothing             |
+------------------+----------------------------+-----------------------+
|warn              |Kernel OOPs.                |Warn once per task and |
|(default)         |Warn once per task, add a   |continue to run.       |
|                  |delay, add synchronization  |                       |
|                  |to prevent more than one    |                       |
|                  |core from executing a       |                       |
|                  |split lock in parallel.     |                       |
|                  |sysctl split_lock_mitigate  |                       |
|                  |can be used to avoid the    |                       |
|                  |delay and synchronization.  |                       |
|                  |When both features are      |                       |
|                  |supported, warn in #AC.     |                       |
+------------------+----------------------------+-----------------------+
|fatal             |Kernel OOPs.                |Send SIGBUS to user.   |
|                  |Send SIGBUS to user.        |                       |
|                  |When both features are      |                       |
|                  |supported, fatal in #AC.    |                       |
+------------------+----------------------------+-----------------------+
|ratelimit:N       |Do nothing                  |Limit bus lock rate to |
|(0 < N <= 1000)   |                            |N bus locks per second |
|                  |                            |system wide and warn on|
|                  |                            |bus locks.             |
+------------------+----------------------------+-----------------------+

Usages
======

Bus lock detection and handling is useful in several areas:

It is critical for real time system designers who build consolidated real
time systems. These systems run hard real time code on some cores and run
"untrusted" user processes on other cores. The hard real time code cannot
afford to have any bus lock from the untrusted processes hurt real time
performance.
To date the designers have been unable to deploy these
solutions as they have no way to prevent the "untrusted" user code from
generating split locks and bus locks, which block the hard real time code
from accessing memory while the bus is locked.

It's also useful for general computing to prevent guests or user
applications from slowing down the overall system by executing instructions
that take a bus lock.


Guidance
========

off
---

Disable checking for split lock and bus lock. This option can be useful if
there are legacy applications that trigger these events at a low rate so
that mitigation is not needed.

warn
----

A warning is emitted when a bus lock is detected, which makes it possible to
identify the offending application. This is the default behavior.

fatal
-----

In this case, the bus lock is not tolerated and the process is killed.

ratelimit
---------

A system wide bus lock rate limit N is specified where 0 < N <= 1000. This
allows a bus lock rate up to N bus locks per second.
When the bus lock rate
is exceeded, any task which is caught via the bus lock #DB exception is
throttled by enforced sleeps until the rate goes under the limit again.

This is an effective mitigation in cases where a minimal impact can be
tolerated, but a potential Denial of Service attack has to be prevented. It
makes it possible to identify the offending processes and analyze whether
they are malicious or just badly written.

Selecting a rate limit of 1000 allows the bus to be locked for up to about
seven million cycles each second (assuming 7000 cycles for each bus
lock). On a 2 GHz processor that would be about 0.35% system slowdown.
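
The slowdown estimate above can be reproduced with a short calculation. The
sketch below is illustrative only: the function name ``bus_lock_slowdown`` and
its parameter names are hypothetical, and the 7000 cycles per bus lock and
2 GHz clock are the assumptions stated in the text, not measured values.

```python
def bus_lock_slowdown(rate_limit, cycles_per_lock=7000, cpu_hz=2_000_000_000):
    """Estimate the fraction of each second spent with the bus locked.

    rate_limit      -- N from split_lock_detect=ratelimit:N (0 < N <= 1000)
    cycles_per_lock -- assumed cost of one bus lock, in CPU cycles
    cpu_hz          -- assumed CPU clock frequency, in cycles per second
    """
    if not 0 < rate_limit <= 1000:
        raise ValueError("ratelimit:N requires 0 < N <= 1000")
    # Worst case: every permitted bus lock in one second is actually taken.
    locked_cycles = rate_limit * cycles_per_lock
    return locked_cycles / cpu_hz


if __name__ == "__main__":
    # 1000 locks/s * 7000 cycles = 7,000,000 locked cycles per second.
    print(f"{bus_lock_slowdown(1000):.2%}")  # prints 0.35%
```

Lower limits scale linearly under the same assumptions, so for example
``ratelimit:100`` would bound the estimated slowdown at roughly a tenth of
that figure.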