#
e9cb93b9 |
| 24-May-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: don't call btrfs_record_physical_zoned for failed append
When a zoned append command fails there is no written address reported, so don't try to record it.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
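The rule described in this commit can be modelled in user space roughly as follows. This is only a sketch, not the kernel code: `enum demo_op`, `struct demo_bio`, and `demo_should_record_physical()` are hypothetical names invented for illustration. Only the rule itself (record the written location only when a zone append actually succeeded) comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are not kernel types. */
enum demo_op { DEMO_OP_WRITE, DEMO_OP_ZONE_APPEND };

struct demo_bio {
    enum demo_op op;
    int status;       /* 0 on success, non-zero on failure */
    uint64_t sector;  /* only meaningful after a successful append */
};

/*
 * A failed zone append reports no written address, so the completion
 * path should skip recording the physical location in that case.
 */
static bool demo_should_record_physical(const struct demo_bio *bio)
{
    return bio->op == DEMO_OP_ZONE_APPEND && bio->status == 0;
}
```

A completion handler in this model would consult `demo_should_record_physical()` before reading `sector`, so a failed append never feeds an undefined address into later bookkeeping.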
|
Revision tags: v6.1.29, v6.1.28 |
|
#
8bfec2e4 |
| 03-May-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: remove hipri_workers workqueue
Now that btrfs_wq_submit_bio is never called for synchronous I/O, the hipri_workers workqueue is not used anymore and can be removed.
Reviewed-by: Chris Mason <clm@fb.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
e917ff56 |
| 03-May-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: determine synchronous writers from bio or writeback control
The writeback_control structure already passes down the information about a writeback being synchronous from the core VM code, and this information is propagated into the bio REQ_SYNC flag through the wbc_to_write_flags helper.
Use that information to decide whether checksum calculation is offloaded to a workqueue, instead of the btrfs_inode::sync_writers field, which not only bloats the inode but also has too wide a scope, being inode-wide instead of limited to the actual writeback request.
The sync writes were set in:
- btrfs_do_write_iter - regular IO, sync status is set
- start_ordered_ops - ordered write start, writeback with WB_SYNC_ALL mode
- btrfs_write_marked_extents - write marked extents, writeback with WB_SYNC_ALL mode
Reviewed-by: Chris Mason <clm@fb.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
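The idea of deriving "synchronous" from the request itself rather than from inode-wide state can be sketched in user space as below. Everything here is a hypothetical model: `DEMO_REQ_SYNC`, `enum demo_sync_mode`, `demo_wbc_to_write_flags()`, and `demo_offload_csum()` are illustrative names and values, not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical flag bit standing in for REQ_SYNC; the value is illustrative. */
#define DEMO_REQ_SYNC (1u << 0)

/* Hypothetical model of the writeback sync mode chosen by the caller. */
enum demo_sync_mode { DEMO_WB_SYNC_NONE, DEMO_WB_SYNC_ALL };

/* Translate the per-writeback mode into per-request flags. */
static unsigned int demo_wbc_to_write_flags(enum demo_sync_mode mode)
{
    return mode == DEMO_WB_SYNC_ALL ? DEMO_REQ_SYNC : 0u;
}

/*
 * Decide per request: only writes that are not marked synchronous
 * have their checksum work handed to a worker.
 */
static bool demo_offload_csum(unsigned int opf)
{
    return (opf & DEMO_REQ_SYNC) == 0;
}
```

Because the decision depends only on the flags of the request being submitted, two concurrent writebacks on the same inode can be treated differently, which an inode-wide counter cannot express.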
|
#
da023618 |
| 03-May-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: submit IO synchronously for fast checksum implementations
Most modern hardware supports very fast accelerated crc32c calculation. If that is supported the CPU overhead of the checksum calculation is very limited, and offloading the calculation to special worker threads has a lot of overhead for no gain.
E.g. on an Intel Optane device, the offloading actually slows down even 1M buffered writes with fio considerably:
Unpatched:
write: IOPS=3316, BW=3316MiB/s (3477MB/s)(200GiB/61757msec); 0 zone resets
With synchronous CRCs:
write: IOPS=4882, BW=4882MiB/s (5119MB/s)(200GiB/41948msec); 0 zone resets
With a lot of variation during the unpatched run going down as low as 1100MB/s, while the synchronous CRC version has about the same peak write speed but much lower dips, and fewer kworkers churning around. Both tests had fio saturated at 100% CPU.
(thanks to Jens Axboe via Chris Mason for the benchmarking)
Reviewed-by: Chris Mason <clm@fb.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
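The policy this commit describes amounts to a short-circuit in front of whatever other conditions govern offloading. A minimal sketch, assuming a boolean that says whether an accelerated checksum implementation is in use; `demo_use_csum_worker()` and both parameter names are hypothetical and do not reflect the kernel's actual function or flags.

```c
#include <stdbool.h>

/*
 * Hypothetical decision helper: with a fast (hardware-accelerated)
 * checksum implementation, computing the checksum inline is cheap, so
 * the worker thread is never used. Otherwise defer to the remaining
 * conditions, summarised here as a single boolean.
 */
static bool demo_use_csum_worker(bool csum_impl_fast, bool other_conditions_allow)
{
    if (csum_impl_fast)
        return false;
    return other_conditions_allow;
}
```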
|
Revision tags: v6.1.27, v6.1.26, v6.3, v6.1.25 |
|
#
adbe7e38 |
| 15-Apr-2023 |
Anand Jain <anand.jain@oracle.com> |
btrfs: use SECTOR_SHIFT to convert LBA to physical offset
Using SECTOR_SHIFT to convert LBA to physical address makes it more readable.
Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
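For readers unfamiliar with the idiom, the conversion looks like this in a standalone form. The sketch assumes the conventional 512-byte sector (a shift of 9); `DEMO_SECTOR_SHIFT` and `demo_sector_to_bytes()` are hypothetical names used so the example compiles outside the kernel.

```c
#include <stdint.h>

/* Assumed value: 512-byte sectors, i.e. a shift of 9. */
#define DEMO_SECTOR_SHIFT 9

/* Convert a sector number into a byte offset. */
static inline uint64_t demo_sector_to_bytes(uint64_t sector)
{
    return sector << DEMO_SECTOR_SHIFT;
}
```

Writing the shift with a named constant documents the unit conversion at the call site, which a bare `<< 9` or `* 512` does not.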
|
#
cce3b573 |
| 19-Jun-2023 |
Dave Airlie <airlied@redhat.com> |
Backmerge tag 'v6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next
Linux 6.4-rc7
Need this to pull in the msm work.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
#
db6da59c |
| 15-Jun-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next-fixes
Backmerging to sync drm-misc-next-fixes with drm-misc-next.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
#
03c60192 |
| 12-Jun-2023 |
Dmitry Baryshkov <dmitry.baryshkov@linaro.org> |
Merge branch 'drm-next' of git://anongit.freedesktop.org/drm/drm into msm-next-lumag-base
Merge the drm-next tree to pick up the DRM DSC helpers (merged via drm-intel-next tree). MSM DSC v1.2 patches depend on these helpers.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
|
#
b7c26863 |
| 12-Jun-2023 |
Mark Brown <broonie@kernel.org> |
regmap: Merge up v6.4-rc6
The fix for maple tree RCU locking on sync is a dependency for the block sync code for the maple tree.
|
#
be9aac18 |
| 09-Jun-2023 |
Mauro Carvalho Chehab <mchehab@kernel.org> |
Merge tag 'v6.4-rc5' into media_stage
Linux 6.4-rc5
* tag 'v6.4-rc5': (919 commits)
  Linux 6.4-rc5
  leds: qcom-lpg: Fix PWM period limits
  selftests/ftrace: Choose target function for filter test from samples
  KVM: selftests: Add test for race in kvm_recalculate_apic_map()
  KVM: x86: Bail from kvm_recalculate_phys_map() if x2APIC ID is out-of-bounds
  KVM: x86: Account fastpath-only VM-Exits in vCPU stats
  KVM: SVM: vNMI pending bit is V_NMI_PENDING_MASK not V_NMI_BLOCKING_MASK
  KVM: x86/mmu: Grab memslot for correct address space in NX recovery worker
  tpm, tpm_tis: correct tpm_tis_flags enumeration values
  Revert "ext4: remove ac->ac_found > sbi->s_mb_min_to_scan dead check in ext4_mb_check_limits"
  media: uvcvideo: Don't expose unsupported formats to userspace
  media: v4l2-subdev: Fix missing kerneldoc for client_caps
  media: staging: media: imx: initialize hs_settle to avoid warning
  media: v4l2-mc: Drop subdev check in v4l2_create_fwnode_links_to_pad()
  riscv: Implement missing huge_ptep_get
  riscv: Fix huge_ptep_set_wrprotect when PTE is a NAPOT
  module/decompress: Fix error checking on zstd decompression
  fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
  dt-bindings: serial: 8250_omap: add rs485-rts-active-high
  selinux: don't use make's grouped targets feature yet
  ...
|
#
4b0a5014 |
| 09-Jun-2023 |
Mauro Carvalho Chehab <mchehab@kernel.org> |
Merge tag 'v6.4-rc5' into v4l_for_linus
Linux 6.4-rc5
* tag 'v6.4-rc5': (303 commits)
  Linux 6.4-rc5
  leds: qcom-lpg: Fix PWM period limits
  selftests/ftrace: Choose target function for filter test from samples
  KVM: selftests: Add test for race in kvm_recalculate_apic_map()
  KVM: x86: Bail from kvm_recalculate_phys_map() if x2APIC ID is out-of-bounds
  KVM: x86: Account fastpath-only VM-Exits in vCPU stats
  KVM: SVM: vNMI pending bit is V_NMI_PENDING_MASK not V_NMI_BLOCKING_MASK
  KVM: x86/mmu: Grab memslot for correct address space in NX recovery worker
  tpm, tpm_tis: correct tpm_tis_flags enumeration values
  Revert "ext4: remove ac->ac_found > sbi->s_mb_min_to_scan dead check in ext4_mb_check_limits"
  riscv: Implement missing huge_ptep_get
  riscv: Fix huge_ptep_set_wrprotect when PTE is a NAPOT
  module/decompress: Fix error checking on zstd decompression
  fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
  dt-bindings: serial: 8250_omap: add rs485-rts-active-high
  selinux: don't use make's grouped targets feature yet
  riscv: perf: Fix callchain parse error with kernel tracepoint events
  mptcp: fix active subflow finalization
  mptcp: add annotations around sk->sk_shutdown accesses
  mptcp: fix data race around msk->first access
  ...
|
#
449f6bc1 |
| 08-Jun-2023 |
Jakub Kicinski <kuba@kernel.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.
Conflicts:
net/sched/sch_taprio.c
  d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping")
  dced11ef84fb ("net/sched: taprio: don't overwrite "sch" variable in taprio_dump_class_stats()")
net/ipv4/sysctl_net_ipv4.c
  e209fee4118f ("net/ipv4: ping_group_range: allow GID from 2147483648 to 4294967294")
  ccce324dabfe ("tcp: make the first N SYN RTO backoffs linear")
  https://lore.kernel.org/all/20230605100816.08d41a7b@canb.auug.org.au/
No adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
30e15cb0 |
| 07-Jun-2023 |
Mark Brown <broonie@kernel.org> |
spi-geni-qcom: Add new interfaces and utilise them
Merge series from Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com>:
A "known issue" during implementation of SE DMA for spi geni driver was that it does DMA map/unmap internally instead of in spi framework. Current patches remove this hiccup and also clean up code a bit.
Testing revealed no regressions and results with 1000 iterations of reading from EC showed no loss of performance.
Results
=======
Before - Iteration 999, min=5.10, max=5.17, avg=5.14, ints=25129
After - Iteration 999, min=5.10, max=5.20, avg=5.15, ints=25153
|
#
5c680050 |
| 06-Jun-2023 |
Miquel Raynal <miquel.raynal@bootlin.com> |
Merge tag 'v6.4-rc4' into wpan-next/staging
Linux 6.4-rc4
|
#
9ff17e6b |
| 05-Jun-2023 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
Merge drm/drm-next into drm-intel-gt-next
For conflict avoidance we need the following commit:
c9a9f18d3ad8 drm/i915/huc: use const struct bus_type pointers
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
#
2076b2a4 |
| 05-Jun-2023 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
Merge 6.4-rc5 into tty-next
We need the tty fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
d37537a1 |
| 05-Jun-2023 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
Merge 6.4-rc5 into usb-next
We need the USB fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
16b58423 |
| 05-Jun-2023 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
Merge 6.4-rc5 into driver-core-next
We need the driver core fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
5959fe74 |
| 05-Jun-2023 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
Merge 6.4-rc5 into char-misc-next
We need the char/misc fixes in here as well for merging and testing.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
e0178b54 |
| 02-Jun-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'for-6.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba: "One regression fix.
The rewrite of scrub code in 6.4 broke device replace in zoned mode, some of the writes could happen out of order so this had to be adjusted for all cases"
* tag 'for-6.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: zoned: fix dev-replace after the scrub rework
|
#
a03a91bd |
| 01-Jun-2023 |
Jakub Kicinski <kuba@kernel.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.
No conflicts.
Adjacent changes:
drivers/net/ethernet/sfc/tc.c
  622ab656344a ("sfc: fix error unwinds in TC offload")
  b6583d5e9e94 ("sfc: support TC decap rules matching on enc_src_port")
net/mptcp/protocol.c
  5b825727d087 ("mptcp: add annotations around msk->subflow accesses")
  e76c8ef5cc5b ("mptcp: refactor mptcp_stream_accept()")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
b675df02 |
| 01-Jun-2023 |
Qu Wenruo <wqu@suse.com> |
btrfs: zoned: fix dev-replace after the scrub rework
[BUG] After commit e02ee89baa66 ("btrfs: scrub: switch scrub_simple_mirror() to scrub_stripe infrastructure"), scrub no longer works for zoned device at all.
Even an empty zoned btrfs cannot be replaced:
  # mkfs.btrfs -f /dev/nvme0n1
  # mount /dev/nvme0n1 /mnt/btrfs
  # btrfs replace start -Bf 1 /dev/nvme0n2 /mnt/btrfs
  Resetting device zones /dev/nvme1n1 (160 zones) ...
  ERROR: ioctl(DEV_REPLACE_START) failed on "/mnt/btrfs/": Input/output error
And we can hit kernel crash related to that:
  BTRFS info (device nvme1n1): host-managed zoned block device /dev/nvme3n1, 160 zones of 134217728 bytes
  BTRFS info (device nvme1n1): dev_replace from /dev/nvme2n1 (devid 2) to /dev/nvme3n1 started
  nvme3n1: Zone Management Append(0x7d) @ LBA 65536, 4 blocks, Zone Is Full (sct 0x1 / sc 0xb9) DNR
  I/O error, dev nvme3n1, sector 786432 op 0xd:(ZONE_APPEND) flags 0x4000 phys_seg 3 prio class 2
  BTRFS error (device nvme1n1): bdev /dev/nvme3n1 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
  BUG: kernel NULL pointer dereference, address: 00000000000000a8
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
  RIP: 0010:_raw_spin_lock_irqsave+0x1e/0x40
  Call Trace:
   <IRQ>
   btrfs_lookup_ordered_extent+0x31/0x190
   btrfs_record_physical_zoned+0x18/0x40
   btrfs_simple_end_io+0xaf/0xc0
   blk_update_request+0x153/0x4c0
   blk_mq_end_request+0x15/0xd0
   nvme_poll_cq+0x1d3/0x360
   nvme_irq+0x39/0x80
   __handle_irq_event_percpu+0x3b/0x190
   handle_irq_event+0x2f/0x70
   handle_edge_irq+0x7c/0x210
   __common_interrupt+0x34/0xa0
   common_interrupt+0x7d/0xa0
   </IRQ>
   <TASK>
   asm_common_interrupt+0x22/0x40
[CAUSE] Dev-replace reuses scrub code to iterate all extents and write the existing content back to the new device.
And for zoned devices, we call fill_writer_pointer_gap() to make sure all the writes into the zoned device are sequential, even if there may be some gaps between the writes.
However we have several different bugs all related to zoned dev-replace:
- We are using the ZONE_APPEND operation for metadata-style writeback

  For zoned devices, btrfs has two ways to write data:

  * ZONE_APPEND for data
    This allows a higher queue depth, but the submitter cannot know where the write will land, so the real on-disk physical location has to be grabbed in its endio.

  * WRITE for metadata
    This requires a queue depth of one (new writes can only be submitted after the previous one has finished), and all writes must be sequential.

  For scrub, we go with a queue depth of one but still use ZONE_APPEND, which requires btrfs_bio::inode to be populated. This is the cause of that crash.

- No correct tracking of write_pointer

  After a write has finished, we should forward sctx->write_pointer, or fill_writer_pointer_gap() would not work properly: it would zero out more than necessary and fill the whole zone prematurely.

- Incorrect physical bytenr passed to fill_writer_pointer_gap()

  In scrub_write_sectors(), one call site passes the logical address, which is completely wrong.

  The other call site passes the physical address of the current sector, but we should pass the physical address of the btrfs_bio we're submitting.

  This is the cause of the -EIO errors.
[FIX] - Do not use ZONE_APPEND for btrfs_submit_repair_write().
- Manually forward sctx->write_pointer after successful writeback
- Use the physical address of the to-be-submitted btrfs_bio for fill_writer_pointer_gap()
Now zoned device replace would work as expected.
Reported-by: Christoph Hellwig <hch@lst.de> Fixes: e02ee89baa66 ("btrfs: scrub: switch scrub_simple_mirror() to scrub_stripe infrastructure") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
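The write-pointer bookkeeping described in the [CAUSE] and [FIX] sections can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the scrub code: `struct demo_zone_ctx`, `demo_fill_gap()`, and `demo_write_done()` are hypothetical names, and treating a target at or behind the write pointer as a no-op is an assumption of the model.

```c
#include <stdint.h>

/* Hypothetical model of the replace target's sequential-write state. */
struct demo_zone_ctx {
    uint64_t write_pointer;  /* next physical byte that may be written */
    uint64_t zeroed_bytes;   /* total filler written to close gaps */
};

/*
 * Before submitting a write at `physical` (the start of the bio being
 * submitted): zero-fill any gap so the write stays sequential.
 * Assumption: nothing is done when there is no gap.
 */
static void demo_fill_gap(struct demo_zone_ctx *ctx, uint64_t physical)
{
    if (physical > ctx->write_pointer) {
        ctx->zeroed_bytes += physical - ctx->write_pointer;
        ctx->write_pointer = physical;
    }
}

/* After a successful write: forward the write pointer past the data. */
static void demo_write_done(struct demo_zone_ctx *ctx, uint64_t physical,
                            uint64_t len)
{
    ctx->write_pointer = physical + len;
}
```

In this model, skipping `demo_write_done()` leaves the pointer at the start of the data just written, so the next `demo_fill_gap()` call would count already-written bytes as a gap. That mirrors the "more than necessary zero out" symptom the commit message describes.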
|
#
48b1320a |
| 30-May-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'for-6.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba: "One bug fix and two build warning fixes:
- call proper end bio callback for metadata RAID0 in a rare case of an unaligned block
- fix uninitialized variable (reported by gcc 10.2)
- fix warning about potential access beyond array bounds on mips64 with 64k pages (runtime check would not allow that)"
* tag 'for-6.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix csum_tree_block page iteration to avoid tripping on -Werror=array-bounds
  btrfs: fix an uninitialized variable warning in btrfs_log_inode
  btrfs: call btrfs_orig_bbio_end_io in btrfs_end_bio_work
|
#
45c2f368 |
| 15-May-2023 |
Christoph Hellwig <hch@lst.de> |
btrfs: call btrfs_orig_bbio_end_io in btrfs_end_bio_work
When I implemented the storage layer bio splitting, I was under the assumption that we'll never split metadata bios. But Qu reminded me that this can actually happen with very old file systems with unaligned metadata chunks and RAID0.
I still haven't seen such a case in practice, but we had better handle this case, especially as it is fairly easy to do by not calling the ->end_io method directly in btrfs_end_bio_work, and using the proper btrfs_orig_bbio_end_io helper instead.
In addition to the old file system with unaligned metadata chunks case documented in the commit log, the combination of the new scrub code with Johannes' pending raid-stripe-tree also triggers this case. We spent some time debugging it and found that this patch solves the problem.
Fixes: 103c19723c80 ("btrfs: split the bio submission path into a separate file") CC: stable@vger.kernel.org # 6.3+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Sterba <dsterba@suse.com>
|
#
9c3a985f |
| 17-May-2023 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-next
Backmerge to get some hwmon dependencies.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|