History log of /openbmc/linux/block/blk-iocost.c (Results 576 – 596 of 596)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 41591a51 31-Oct-2019 Dan Carpenter <dan.carpenter@oracle.com>

iocost: don't nest spin_lock_irq in ioc_weight_write()

This code causes a static analysis warning:

block/blk-iocost.c:2113 ioc_weight_write() error: double lock 'irq'

We disable IRQs in blkg_c

iocost: don't nest spin_lock_irq in ioc_weight_write()

This code causes a static analysis warning:

block/blk-iocost.c:2113 ioc_weight_write() error: double lock 'irq'

We disable IRQs in blkg_conf_prep() and re-enable them in
blkg_conf_finish(). IRQ disable/enable should not be nested because
that means the IRQs will be enabled at the first unlock instead of the
second one.

Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost")
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


# 728d90bd 27-Oct-2019 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'v5.4-rc5' into next

Sync up with mainline.


# 112d6212 18-Oct-2019 Mark Brown <broonie@kernel.org>

Merge tag 'v5.4-rc3' into spi-5.4

Linux 5.4-rc3


# fa41d6ee 15-Oct-2019 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Merge drm/drm-next into drm-intel-next-queued

Backmerging to pull in HDR DP code:

https://lists.freedesktop.org/archives/dri-devel/2019-September/236453.html

Signed-off-by: Joonas Lahtinen <joonas

Merge drm/drm-next into drm-intel-next-queued

Backmerging to pull in HDR DP code:

https://lists.freedesktop.org/archives/dri-devel/2019-September/236453.html

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

show more ...


Revision tags: v5.3.6, v5.3.5, v5.3.4, v5.3.3
# 1913c7f3 04-Oct-2019 Tony Lindgren <tony@atomide.com>

Merge tag 'fix-missing-panels' into fixes


# 4092de1b 03-Oct-2019 Maxime Ripard <mripard@kernel.org>

Merge drm/drm-next into drm-misc-next

We haven't done any backmerge for a while due to the merge window, and it
starts to become an issue for komeda. Let's bring 5.4-rc1 in.

Signed-off-by: Maxime R

Merge drm/drm-next into drm-misc-next

We haven't done any backmerge for a while due to the merge window, and it
starts to become an issue for komeda. Let's bring 5.4-rc1 in.

Signed-off-by: Maxime Ripard <mripard@kernel.org>

show more ...


# 77fdaa09 03-Oct-2019 Maxime Ripard <mripard@kernel.org>

Merge drm/drm-fixes into drm-misc-fixes

We haven't backmerged for a while, let's start the -rc period by pulling
rc1.

Signed-off-by: Maxime Ripard <mripard@kernel.org>


Revision tags: v5.3.2
# 47db9b9a 27-Sep-2019 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'for-linus-2019-09-27' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"A few fixes/changes to round off this merge window. This contains:

- Small series making s

Merge tag 'for-linus-2019-09-27' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"A few fixes/changes to round off this merge window. This contains:

- Small series making some functional tweaks to blk-iocost (Tejun)

- Elevator switch locking fix (Ming)

- Kill redundant call in blk-wbt (Yufen)

- Fix flush timeout handling (Yufen)"

* tag 'for-linus-2019-09-27' of git://git.kernel.dk/linux-block:
block: fix null pointer dereference in blk_mq_rq_timed_out()
rq-qos: get rid of redundant wbt_update_limits()
iocost: bump up default latency targets for hard disks
iocost: improve nr_lagging handling
iocost: better trace vrate changes
block: don't release queue's sysfs lock during switching elevator
blk-mq: move lockdep_assert_held() into elevator_exit

show more ...


# 7afcccaf 25-Sep-2019 Tejun Heo <tj@kernel.org>

iocost: bump up default latency targets for hard disks

The default hard disk param sets latency targets at 50ms. As the
default target percentiles are zero, these don't directly regulate
vrate; how

iocost: bump up default latency targets for hard disks

The default hard disk param sets latency targets at 50ms. As the
default target percentiles are zero, these don't directly regulate
vrate; however, they're still used to calculate the period length -
100ms in this case.

This is excessively low. A SATA drive with QD32 saturated with random
IOs can easily reach avg completion latency of several hundred msecs.
A period duration which is substantially lower than avg completion
latency can lead to wildly fluctuating vrate.

Let's bump up the default latency targets to 250ms so that the period
duration is sufficiently long.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


# 7cd806a9 25-Sep-2019 Tejun Heo <tj@kernel.org>

iocost: improve nr_lagging handling

Some IOs may span multiple periods. As latencies are collected on
completion, the inbetween periods won't register them and may
incorrectly decide to increase vr

iocost: improve nr_lagging handling

Some IOs may span multiple periods. As latencies are collected on
completion, the inbetween periods won't register them and may
incorrectly decide to increase vrate. nr_lagging tracks these IOs to
avoid those situations. Currently, whenever there are IOs which are
spanning from the previous period, busy_level is reset to 0 if
negative thus suppressing vrate increase.

This has the following two problems.

* When latency target percentiles aren't set, vrate adjustment should
only be governed by queue depth depletion; however, the current code
keeps nr_lagging active which pulls in latency results and can keep
down vrate unexpectedly.

* When lagging condition is detected, it resets the entire negative
busy_level. This turned out to be way too aggressive on some
devices which sometimes experience extended latencies on a small
subset of commands. In addition, a lagging IO will be accounted as
latency target miss on completion anyway and resetting busy_level
amplifies its impact unnecessarily.

This patch fixes the above two problems by disabling nr_lagging
counting when latency target percentiles aren't set and blocking vrate
increases when there are lagging IOs while leaving busy_level as-is.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


# 25d41e4a 25-Sep-2019 Tejun Heo <tj@kernel.org>

iocost: better trace vrate changes

vrate_adj tracepoint traces vrate changes; however, it does so only
when busy_level is non-zero. busy_level turning to zero can sometimes
be as interesting an eve

iocost: better trace vrate changes

vrate_adj tracepoint traces vrate changes; however, it does so only
when busy_level is non-zero. busy_level turning to zero can sometimes
be as interesting an event. This patch also enables vrate_adj
tracepoint on other vrate related events - busy_level changes and
non-zero nr_lagging.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


Revision tags: v5.3.1
# 7ad67ca5 17-Sep-2019 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'for-5.4/block-2019-09-16' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:

- Two NVMe pull requests:
- ana log parse fix from Anton
- nvme quirks support

Merge tag 'for-5.4/block-2019-09-16' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:

- Two NVMe pull requests:
- ana log parse fix from Anton
- nvme quirks support for Apple devices from Ben
- fix missing bio completion tracing for multipath stack devices
from Hannes and Mikhail
- IP TOS settings for nvme rdma and tcp transports from Israel
- rq_dma_dir cleanups from Israel
- tracing for Get LBA Status command from Minwoo
- Some nvme-tcp cleanups from Minwoo, Potnuri and Myself
- Some consolidation between the fabrics transports for handling
the CAP register
- reset race with ns scanning fix for fabrics (move fabrics
commands to a dedicated request queue with a different lifetime
from the admin request queue)."
- controller reset and namespace scan races fixes
- nvme discovery log change uevent support
- naming improvements from Keith
- multiple discovery controllers reject fix from James
- some regular cleanups from various people

- Series fixing (and re-fixing) null_blk debug printing and nr_devices
checks (André)

- A few pull requests from Song, with fixes from Andy, Guoqing,
Guilherme, Neil, Nigel, and Yufen.

- REQ_OP_ZONE_RESET_ALL support (Chaitanya)

- Bio merge handling unification (Christoph)

- Pick default elevator correctly for devices with special needs
(Damien)

- Block stats fixes (Hou)

- Timeout and support devices nbd fixes (Mike)

- Series fixing races around elevator switching and device add/remove
(Ming)

- sed-opal cleanups (Revanth)

- Per device weight support for BFQ (Fam)

- Support for blk-iocost, a new model that can properly account cost of
IO workloads. (Tejun)

- blk-cgroup writeback fixes (Tejun)

- paride queue init fixes (zhengbin)

- blk_set_runtime_active() cleanup (Stanley)

- Block segment mapping optimizations (Bart)

- lightnvm fixes (Hans/Minwoo/YueHaibing)

- Various little fixes and cleanups

* tag 'for-5.4/block-2019-09-16' of git://git.kernel.dk/linux-block: (186 commits)
null_blk: format pr_* logs with pr_fmt
null_blk: match the type of parameter nr_devices
null_blk: do not fail the module load with zero devices
block: also check RQF_STATS in blk_mq_need_time_stamp()
block: make rq sector size accessible for block stats
bfq: Fix bfq linkage error
raid5: use bio_end_sector in r5_next_bio
raid5: remove STRIPE_OPS_REQ_PENDING
md: add feature flag MD_FEATURE_RAID0_LAYOUT
md/raid0: avoid RAID0 data corruption due to layout confusion.
raid5: don't set STRIPE_HANDLE to stripe which is in batch list
raid5: don't increment read_errors on EILSEQ return
nvmet: fix a wrong error status returned in error log page
nvme: send discovery log page change events to userspace
nvme: add uevent variables for controller devices
nvme: enable aen regardless of the presence of I/O queues
nvme-fabrics: allow discovery subsystems accept a kato
nvmet: Use PTR_ERR_OR_ZERO() in nvmet_init_discovery()
nvme: Remove redundant assignment of cq vector
nvme: Assign subsys instance from first ctrl
...

show more ...


Revision tags: v5.3, v5.2.14, v5.3-rc8, v5.2.13, v5.2.12
# 7c1ee704 04-Sep-2019 Tejun Heo <tj@kernel.org>

iocost_monitor: Report debt

Report debt and rename del_ms row to delay for consistency.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# e1518f63 04-Sep-2019 Tejun Heo <tj@kernel.org>

blk-iocost: Don't let merges push vtime into the future

Merges have the same problem that forced-bios had which is fixed by
the previous patch. The cost of a merge is calculated at the time of
issu

blk-iocost: Don't let merges push vtime into the future

Merges have the same problem that forced-bios had which is fixed by
the previous patch. The cost of a merge is calculated at the time of
issue and force-advances vtime into the future. Until global vtime
catches up, how the cgroup's hweight changes in the meantime doesn't
matter and it often leads to situations where the cost is calculated
at one hweight and paid at a very different one. See the previous
patch for more details.

Fix it by never advancing vtime into the future for merges. If budget
is available, vtime is advanced. Otherwise, the cost is charged as
debt.

This brings merge cost handling in line with issue cost handling in
ioc_rqos_throttle().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


# 36a52481 04-Sep-2019 Tejun Heo <tj@kernel.org>

blk-iocost: Account force-charged overage in absolute vtime

Currently, when a bio needs to be force-charged and there isn't enough
budget, vtime is simply pushed into the future. This means that th

blk-iocost: Account force-charged overage in absolute vtime

Currently, when a bio needs to be force-charged and there isn't enough
budget, vtime is simply pushed into the future. This means that the
cost of the whole bio is scaled using the current hweight and then
charged immediately. Until the global vtime advances beyond this
future vtime, the cgroup won't be allowed to issue normal IOs.

This is incorrect and can lead to, for example, exploding vrate or
extended stalls if vrate range is constrained. Consider the following
scenario.

1. A cgroup with a very low hweight runs out of budget.

2. A storm of swap-out happens on it. All of them are scaled
according to the current low hweight and charged to vtime pushing
it to a far future.

3. All other cgroups go idle and now the above cgroup has access to
the whole device. However, because vtime is already wound using
the past low hweight, what its current hweight is doesn't matter
until global vtime catches up to the local vtime.

4. As a result, either vrate gets ramped up extremely or the IOs stall
while the underlying device is idle.

This is because the hweight the overage is calculated at is different
from the hweight that it's being paid at.

Fix it by remembering the overage in absoulte vtime and continuously
paying with the actual budget according to the current hweight at each
period.

Note that non-forced bios which wait already remembers the cost in
absolute vtime. This brings forced-bio accounting in line.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


# e036c4ca 10-Sep-2019 Tejun Heo <tj@kernel.org>

blk-iocost: Fix incorrect operation order during iocg free

ioc_pd_free() first cancels the hrtimers and then deactivates the
iocg. However, the iocg timer can run inbetween and reschedule the
hrtim

blk-iocost: Fix incorrect operation order during iocg free

ioc_pd_free() first cancels the hrtimers and then deactivates the
iocg. However, the iocg timer can run inbetween and reschedule the
hrtimers which will end up running after the iocg is freed leading to
crashes like the following.

general protection fault: 0000 [#1] SMP
...
RIP: 0010:iocg_kick_delay+0xbe/0x1b0
RSP: 0018:ffffc90003598ea0 EFLAGS: 00010046
RAX: 1cee00fd69512b54 RBX: ffff8881bba48400 RCX: 00000000000003e8
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8881bba48400
RBP: 0000000000004e20 R08: 0000000000000002 R09: 00000000000003e8
R10: 0000000000000000 R11: 0000000000000000 R12: ffffc90003598ef0
R13: 00979f3810ad461f R14: ffff8881bba4b400 R15: 25439f950d26e1d1
FS: 0000000000000000(0000) GS:ffff88885f800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f64328c7e40 CR3: 0000000002409005 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
iocg_delay_timer_fn+0x3d/0x60
__hrtimer_run_queues+0xfe/0x270
hrtimer_interrupt+0xf4/0x210
smp_apic_timer_interrupt+0x5e/0x120
apic_timer_interrupt+0xf/0x20
</IRQ>

Fix it by canceling hrtimers after deactivating the iocg.

Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


# e916ad29 30-Aug-2019 Tejun Heo <tj@kernel.org>

blkcg: add missing NULL check in ioc_cpd_alloc()

ioc_cpd_alloc() forgot to check NULL return from kzalloc(). Add it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: kbuild test robot <lkp@in

blkcg: add missing NULL check in ioc_cpd_alloc()

ioc_cpd_alloc() forgot to check NULL return from kzalloc(). Add it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


# 3532e722 29-Aug-2019 Tejun Heo <tj@kernel.org>

blkcg: fix missing free on error path of blk_iocost_init()

blk_iocost_init() forgot to free its percpu stat on the error path.
Fix it.

Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost")
Reported-b

blkcg: fix missing free on error path of blk_iocost_init()

blk_iocost_init() forgot to free its percpu stat on the error path.
Fix it.

Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost")
Reported-by: Hillf Danton <hdanton@sina.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


Revision tags: v5.2.11
# 8504dea7 28-Aug-2019 Tejun Heo <tj@kernel.org>

blkcg: add tools/cgroup/iocost_coef_gen.py

Add a script which can be used to generate device-specific iocost
linear model coefficients.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens

blkcg: add tools/cgroup/iocost_coef_gen.py

Add a script which can be used to generate device-specific iocost
linear model coefficients.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


# 6954ff18 28-Aug-2019 Tejun Heo <tj@kernel.org>

blkcg: add tools/cgroup/iocost_monitor.py

Instead of mucking with debugfs and ->pd_stat(), add drgn based
monitoring script.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Omar Sandoval <osandov@fb.c

blkcg: add tools/cgroup/iocost_monitor.py

Instead of mucking with debugfs and ->pd_stat(), add drgn based
monitoring script.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


# 7caa4715 28-Aug-2019 Tejun Heo <tj@kernel.org>

blkcg: implement blk-iocost

This patchset implements IO cost model based work-conserving
proportional controller.

While io.latency provides the capability to comprehensively prioritize
and protect

blkcg: implement blk-iocost

This patchset implements IO cost model based work-conserving
proportional controller.

While io.latency provides the capability to comprehensively prioritize
and protect IOs depending on the cgroups, its protection is binary -
the lowest latency target cgroup which is suffering is protected at
the cost of all others. In many use cases including stacking multiple
workload containers in a single system, it's necessary to distribute
IO capacity with better granularity.

One challenge of controlling IO resources is the lack of trivially
observable cost metric. The most common metrics - bandwidth and iops
- can be off by orders of magnitude depending on the device type and
IO pattern. However, the cost isn't a complete mystery. Given
several key attributes, we can make fairly reliable predictions on how
expensive a given stream of IOs would be, at least compared to other
IO patterns.

The function which determines the cost of a given IO is the IO cost
model for the device. This controller distributes IO capacity based
on the costs estimated by such model. The more accurate the cost
model the better but the controller adapts based on IO completion
latency and as long as the relative costs across differents IO
patterns are consistent and sensible, it'll adapt to the actual
performance of the device.

Currently, the only implemented cost model is a simple linear one with
a few sets of default parameters for different classes of device.
This covers most common devices reasonably well. All the
infrastructure to tune and add different cost models is already in
place and a later patch will also allow using bpf progs for cost
models.

Please see the top comment in blk-iocost.c and documentation for
more details.

v2: Rebased on top of RQ_ALLOC_TIME changes and folded in Rik's fix
for a divide-by-zero bug in current_hweight() triggered by zero
inuse_sum.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Andy Newell <newella@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


1...<<21222324