580253b5 | 08-Sep-2023 |
Palmer Dabbelt <palmer@rivosinc.com> |
Merge patch series "RISC-V: Probe for misaligned access speed"
Evan Green <evan@rivosinc.com> says:
The current setting for the hwprobe bit indicating misaligned access speed is controlled by a vendor-specific feature probe function. This is essentially a per-SoC table we have to maintain on behalf of each vendor going forward. Let's convert that instead to something we detect at runtime.
We have two assembly routines at the heart of our probe: one that does a bunch of word-sized accesses (without aligning its input buffer), and the other that does byte accesses. If we can move a larger number of bytes using misaligned word accesses than we can with the same amount of time doing byte accesses, then we can declare misaligned accesses as "fast".
The tradeoff of reducing this maintenance burden is boot time. We spend 4-6 jiffies per core doing this measurement (0-2 on jiffy edge alignment, and 4 on measurement). The timing loop was based on raid6_choose_gen(), which uses (16+1)*N jiffies (where N is the number of algorithms). By taking only the fastest iteration out of all attempts for use in the comparison, variance between runs is very low. On my THead C906, it looks like this:
[ 0.047563] cpu0: Ratio of byte access time to unaligned word access is 4.34, unaligned accesses are fast
Several others have chimed in with results on slow machines with the older algorithm, which took all runs into account, including noise like interrupts. Even with this variation, results indicate that in all cases (fast, slow, and emulated) the measured numbers are nowhere near each other (always multiple factors away).
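For illustration, here is a minimal userspace sketch of the comparison described above. It is not the kernel's probe code (the series times hand-written assembly loops against the jiffies counter), and the buffer size, run count, and use of clock_gettime() here are arbitrary choices, but the shape is the same: copy a buffer byte-by-byte and with deliberately misaligned word accesses, keep the fastest run of each, and compare.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define BUF_SIZE (64 * 1024)
#define RUNS     16

static uint8_t src[BUF_SIZE + sizeof(unsigned long)];
static uint8_t dst[BUF_SIZE + sizeof(unsigned long)];

static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Pure byte accesses; volatile keeps the compiler from widening them. */
static void byte_copy(volatile uint8_t *d, const volatile uint8_t *s, size_t n)
{
    while (n--)
        *d++ = *s++;
}

/*
 * Word accesses through pointers offset by one byte, so every access is
 * misaligned. Strictly speaking this is undefined behaviour in portable C,
 * which is one reason the kernel probe uses assembly; it is good enough for
 * a sketch on hardware that supports misaligned loads and stores.
 */
static void misaligned_word_copy(uint8_t *d, const uint8_t *s, size_t n)
{
    volatile unsigned long *wd = (volatile unsigned long *)(d + 1);
    const volatile unsigned long *ws = (const volatile unsigned long *)(s + 1);
    size_t words = n / sizeof(unsigned long);

    while (words--)
        *wd++ = *ws++;
}

int main(void)
{
    uint64_t best_byte = UINT64_MAX, best_word = UINT64_MAX;
    int i;

    for (i = 0; i < RUNS; i++) {
        uint64_t t0 = now_ns();
        byte_copy(dst, src, BUF_SIZE);
        uint64_t t1 = now_ns();
        misaligned_word_copy(dst, src, BUF_SIZE);
        uint64_t t2 = now_ns();

        /* Keep only the fastest iteration of each, as the series does. */
        if (t1 - t0 < best_byte)
            best_byte = t1 - t0;
        if (t2 - t1 < best_word)
            best_word = t2 - t1;
    }

    printf("byte: %llu ns, misaligned word: %llu ns -> misaligned access is %s\n",
           (unsigned long long)best_byte, (unsigned long long)best_word,
           best_word < best_byte ? "fast" : "slow");
    return 0;
}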
* b4-shazam-merge:
  RISC-V: alternative: Remove feature_probe_func
  RISC-V: Probe for unaligned access speed

Link: https://lore.kernel.org/r/20230818194136.4084400-1-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
180bb41d | 17-Aug-2023 |
Alexandre Ghiti <alexghiti@rivosinc.com> |
Documentation: riscv: Update boot image header since EFI stub is supported
The EFI stub is supported on RISC-V so update the documentation that explains how the boot image header was reused to support it.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230817130734.10387-3-alexghiti@rivosinc.com
bcc87900 | 19-Jun-2023 |
Palmer Dabbelt <palmer@rivosinc.com> |
RISC-V: Document that V registers are clobbered on syscalls
This is included in the ISA manual, but it's pretty common for bits of the ISA manual that are actually ABI to change. So let's document it explicitly.
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20230619190142.26498-1-palmer@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
16252e01 | 19-Jun-2023 |
Palmer Dabbelt <palmer@rivosinc.com> |
Merge patch series "RISC-V: Export Zba, Zbb to usermode via hwprobe"
Evan Green <evan@rivosinc.com> says:
This change detects the presence of Zba, Zbb, and Zbs extensions and exports them per-hart to userspace via the hwprobe mechanism. Glibc can then use these in setting up hwcaps-based library search paths.
There's a little bit of extra housekeeping here: the first change adds Zba and Zbs to the set of extensions the kernel recognizes, and the second change starts tracking ISA features per-hart (in addition to the ANDed mask of features across all harts which the kernel uses to make decisions). Now that we track the ISA information per-hart, we could even fix up /proc/cpuinfo to accurately report extensions per hart, though I've left that out of this series for now.
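As a usage sketch (not part of the series), a userspace consumer could read these bits through the hwprobe syscall, using the key and extension bit names from the uapi header <asm/hwprobe.h>:

/* Sketch: query the Zba/Zbb/Zbs bits exported by this series via the
 * raw riscv_hwprobe syscall. */
#include <asm/hwprobe.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
    struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 };

    /* cpusetsize == 0 and cpus == NULL ask about all online CPUs. */
    if (syscall(__NR_riscv_hwprobe, &pair, 1, 0, NULL, 0) != 0) {
        perror("riscv_hwprobe");
        return 1;
    }

    printf("Zba: %s\n", (pair.value & RISCV_HWPROBE_EXT_ZBA) ? "yes" : "no");
    printf("Zbb: %s\n", (pair.value & RISCV_HWPROBE_EXT_ZBB) ? "yes" : "no");
    printf("Zbs: %s\n", (pair.value & RISCV_HWPROBE_EXT_ZBS) ? "yes" : "no");
    return 0;
}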
* b4-shazam-merge:
  RISC-V: hwprobe: Expose Zba, Zbb, and Zbs
  RISC-V: Track ISA extensions per hart
  RISC-V: Add Zba, Zbs extension probing

Link: https://lore.kernel.org/r/20230509182504.2997252-1-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
62a31d6e | 07-Apr-2023 |
Evan Green <evan@rivosinc.com> |
RISC-V: hwprobe: Support probing of misaligned access performance
This allows userspace to select various routines to use based on the performance of misaligned access on the target hardware.
Rather than adding DT bindings, this change taps into the alternatives mechanism used to probe CPU errata. Add a new function pointer alongside the vendor-specific errata_patch_func() that probes for desirable errata (otherwise known as "features"). Unlike the errata_patch_func(), this function is called on each CPU as it comes up, so it can save feature information per-CPU.
The T-Head C906 has fast unaligned access, both as defined by GCC [1] and as measured by a basic benchmark, which determined that byte copies are >50% slower than a misaligned word copy of the same data size (source for this test at [2]):
bytecopy size f000 count 50000 offset 0 took 31664899 us
wordcopy size f000 count 50000 offset 0 took 5180919 us
wordcopy size f000 count 50000 offset 1 took 13416949 us

[1] https://github.com/gcc-mirror/gcc/blob/master/gcc/config/riscv/riscv.cc#L353
[2] https://pastebin.com/EPXvDHSW
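For context, the key added by this change is what userspace would consult to make that routine selection. A minimal sketch, assuming the key and value names from the uapi header <asm/hwprobe.h>:

/* Sketch: decide whether misaligned accesses are fast on all online CPUs,
 * e.g. to pick between a byte-wise and a word-wise copy routine. */
#include <asm/hwprobe.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

static bool misaligned_access_is_fast(void)
{
    struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };

    if (syscall(__NR_riscv_hwprobe, &pair, 1, 0, NULL, 0) != 0)
        return false;

    return (pair.value & RISCV_HWPROBE_MISALIGNED_MASK) ==
           RISCV_HWPROBE_MISALIGNED_FAST;
}

int main(void)
{
    printf("misaligned accesses are %s\n",
           misaligned_access_is_fast() ? "fast" : "not known to be fast");
    return 0;
}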
Co-developed-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Tested-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Paul Walmsley <paul.walmsley@sifive.com>
Link: https://lore.kernel.org/r/20230407231103.2622178-5-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
00e76e2c | 07-Apr-2023 |
Evan Green <evan@rivosinc.com> |
RISC-V: hwprobe: Add support for RISCV_HWPROBE_BASE_BEHAVIOR_IMA
We have an implicit set of base behaviors that userspace depends on, which are mostly defined in various ISA specifications.
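A userspace check for this base behavior might look like the following sketch, assuming the key and bit names from the uapi header <asm/hwprobe.h>:

/* Sketch: confirm the kernel reports the IMA base behavior for all
 * online CPUs. */
#include <asm/hwprobe.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
    struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_BASE_BEHAVIOR };

    if (syscall(__NR_riscv_hwprobe, &pair, 1, 0, NULL, 0) != 0) {
        perror("riscv_hwprobe");
        return 1;
    }

    printf("IMA base behavior: %s\n",
           (pair.value & RISCV_HWPROBE_BASE_BEHAVIOR_IMA) ? "yes" : "no");
    return 0;
}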
Co-developed-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Tested-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Reviewed-by: Paul Walmsley <paul.walmsley@sifive.com>
Link: https://lore.kernel.org/r/20230407231103.2622178-4-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>