Revision tags: v4.6.1, v4.4.12, openbmc-20160521-1, v4.4.11, openbmc-20160518-1, v4.6, v4.4.10, openbmc-20160511-1, openbmc-20160505-1, v4.4.9, v4.4.8, v4.4.7 |
|
#
c8b8e32d |
| 07-Apr-2016 |
Christoph Hellwig <hch@lst.de> |
direct-io: eliminate the offset argument to ->direct_IO
Including blkdev_direct_IO and dax_do_io. It has to be ki_pos to actually work, so eliminate the superflous argument.
Signed-off-by: Christo
direct-io: eliminate the offset argument to ->direct_IO
Including blkdev_direct_IO and dax_do_io. It has to be ki_pos to actually work, so eliminate the superflous argument.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
show more ...
|
#
b06bad17 |
| 28-Apr-2016 |
Minchan Kim <minchan@kernel.org> |
mm: call swap_slot_free_notify() with page lock held
Kyeongdon reported below error which is BUG_ON(!PageSwapCache(page)) in page_swap_info. The reason is that page_endio in rw_page unlocks the pag
mm: call swap_slot_free_notify() with page lock held
Kyeongdon reported below error which is BUG_ON(!PageSwapCache(page)) in page_swap_info. The reason is that page_endio in rw_page unlocks the page if read I/O is completed so we need to hold a PG_lock again to check PageSwapCache. Otherwise, the page can be removed from swapcache.
Kernel BUG at c00f9040 [verbose debug info unavailable] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM Modules linked in: CPU: 4 PID: 13446 Comm: RenderThread Tainted: G W 3.10.84-g9f14aec-dirty #73 task: c3b73200 ti: dd192000 task.ti: dd192000 PC is at page_swap_info+0x10/0x2c LR is at swap_slot_free_notify+0x18/0x6c pc : [<c00f9040>] lr : [<c00f5560>] psr: 400f0113 sp : dd193d78 ip : c2deb1e4 fp : da015180 r10: 00000000 r9 : 000200da r8 : c120fe08 r7 : 00000000 r6 : 00000000 r5 : c249a6c0 r4 : = c249a6c0 r3 : 00000000 r2 : 40080009 r1 : 200f0113 r0 : = c249a6c0 ..<snip> .. Call Trace: page_swap_info+0x10/0x2c swap_slot_free_notify+0x18/0x6c swap_readpage+0x90/0x11c read_swap_cache_async+0x134/0x1ac swapin_readahead+0x70/0xb0 handle_pte_fault+0x320/0x6fc handle_mm_fault+0xc0/0xf0 do_page_fault+0x11c/0x36c do_DataAbort+0x34/0x118
Fixes: 3f2b1a04f44933f2 ("zram: revive swap_slot_free_notify") Signed-off-by: Minchan Kim <minchan@kernel.org> Tested-by: Kyeongdon Kim <kyeongdon.kim@lge.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
09cbfeaf |
| 01-Apr-2016 |
Kirill A. Shutemov <kirill.shutemov@linux.intel.com> |
mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to impleme
mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE.
This promise never materialized. And unlikely will.
We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable.
Let's stop pretending that pages in page cache are special. They are not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually.
The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later.
There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch.
virtual patch
@@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E
@@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E
@@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT
@@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE
@@ @@ - PAGE_CACHE_MASK + PAGE_MASK
@@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E)
@@ expression E; @@ - page_cache_get(E) + get_page(E)
@@ expression E; @@ - page_cache_release(E) + put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: openbmc-20160329-2, openbmc-20160329-1 |
|
#
3f2b1a04 |
| 22-Mar-2016 |
Minchan Kim <minchan@kernel.org> |
zram: revive swap_slot_free_notify
Commit b430e9d1c6d4 ("remove compressed copy from zram in-memory") applied swap_slot_free_notify call in *end_swap_bio_read* to remove duplicated memory between zr
zram: revive swap_slot_free_notify
Commit b430e9d1c6d4 ("remove compressed copy from zram in-memory") applied swap_slot_free_notify call in *end_swap_bio_read* to remove duplicated memory between zram and memory.
However, with the introduction of rw_page in zram: 8c7f01025f7b ("zram: implement rw_page operation of zram"), it became void because rw_page doesn't need bio.
Memory footprint is really important in embedded platforms which have small memory, for example, 512M) recently because it could start to kill processes if memory footprint exceeds some threshold by LMK or some similar memory management modules.
This patch restores the function for rw_page, thereby eliminating this duplication.
Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: karam.lee <karam.lee@lge.com> Cc: <sangseok.lee@lge.com> Cc: Chan Jeong <chan.jeong@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: openbmc-20160321-1 |
|
#
1170532b |
| 17-Mar-2016 |
Joe Perches <joe@perches.com> |
mm: convert printk(KERN_<LEVEL> to pr_<level>
Most of the mm subsystem uses pr_<level> so make it consistent.
Miscellanea:
- Realign arguments - Add missing newline to format - kmemleak-test.c
mm: convert printk(KERN_<LEVEL> to pr_<level>
Most of the mm subsystem uses pr_<level> so make it consistent.
Miscellanea:
- Realign arguments - Add missing newline to format - kmemleak-test.c has a "kmemleak: " prefix added to the "Kmemleak testing" logging message via pr_fmt
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Tejun Heo <tj@kernel.org> [percpu] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v4.4.6, v4.5, v4.4.5, v4.4.4, v4.4.3, openbmc-20160222-1, v4.4.2, openbmc-20160212-1, openbmc-20160210-1, openbmc-20160202-2, openbmc-20160202-1, v4.4.1, openbmc-20160127-1, openbmc-20160120-1, v4.4, openbmc-20151217-1, openbmc-20151210-1, openbmc-20151202-1, openbmc-20151123-1, openbmc-20151118-1, openbmc-20151104-1, v4.3, openbmc-20151102-1, openbmc-20151028-1, v4.3-rc1, v4.2, v4.2-rc8, v4.2-rc7, v4.2-rc6, v4.2-rc5, v4.2-rc4, v4.2-rc3, v4.2-rc2, v4.2-rc1, v4.1, v4.1-rc8, v4.1-rc7, v4.1-rc6, v4.1-rc5, v4.1-rc4, v4.1-rc3, v4.1-rc2, v4.1-rc1, v4.0, v4.0-rc7, v4.0-rc6, v4.0-rc5, v4.0-rc4, v4.0-rc3, v4.0-rc2, v4.0-rc1, v3.19, v3.19-rc7, v3.19-rc6, v3.19-rc5, v3.19-rc4, v3.19-rc3, v3.19-rc2 |
|
#
6cf66b4c |
| 22-Dec-2014 |
Kent Overstreet <kent.overstreet@gmail.com> |
fs: use helper bio_add_page() instead of open coding on bi_io_vec
Call pre-defined helper bio_add_page() instead of open coding for iterating through bi_io_vec[]. Doing that, it's possible to make s
fs: use helper bio_add_page() instead of open coding on bi_io_vec
Call pre-defined helper bio_add_page() instead of open coding for iterating through bi_io_vec[]. Doing that, it's possible to make some parts in filesystems and mm/page_io.c simpler than before.
Acked-by: Dave Kleikamp <shaggy@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> [dpark: add more description in commit message] Signed-off-by: Dongsu Park <dpark@posteo.net> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Jens Axboe <axboe@fb.com>
show more ...
|
#
4246a0b6 |
| 20-Jul-2015 |
Christoph Hellwig <hch@lst.de> |
block: add a bi_error field to struct bio
Currently we have two different ways to signal an I/O error on a BIO:
(1) by clearing the BIO_UPTODATE flag (2) by returning a Linux errno value to the b
block: add a bi_error field to struct bio
Currently we have two different ways to signal an I/O error on a BIO:
(1) by clearing the BIO_UPTODATE flag (2) by returning a Linux errno value to the bi_end_io callback
The first one has the drawback of only communicating a single possible error (-EIO), and the second one has the drawback of not beeing persistent when bios are queued up, and are not passed along from child to parent bio in the ever more popular chaining scenario. Having both mechanisms available has the additional drawback of utterly confusing driver authors and introducing bugs where various I/O submitters only deal with one of them, and the others have to add boilerplate code to deal with both kinds of error returns.
So add a new bi_error field to store an errno value directly in struct bio and remove the existing mechanisms to clean all this up.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
show more ...
|
#
343df3c7 |
| 19-May-2015 |
Christoph Hellwig <hch@lst.de> |
suspend: simplify block I/O handling
Stop abusing struct page functionality and the swap end_io handler, and instead add a modified version of the blk-lib.c bio_batch helpers.
Also move the block I
suspend: simplify block I/O handling
Stop abusing struct page functionality and the swap end_io handler, and instead add a modified version of the blk-lib.c bio_batch helpers.
Also move the block I/O code into swap.c as they are directly tied into each other.
Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Pavel Machek <pavel@ucw.cz> Tested-by: Ming Lin <mlin@kernel.org> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net> Signed-off-by: Jens Axboe <axboe@fb.com>
show more ...
|
#
22c6186e |
| 16-Mar-2015 |
Omar Sandoval <osandov@osandov.com> |
direct_IO: remove rw from a_ops->direct_IO()
Now that no one is using rw, remove it completely.
Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
e2e40f2c |
| 22-Feb-2015 |
Christoph Hellwig <hch@lst.de> |
fs: move struct kiocb to fs.h
struct kiocb now is a generic I/O container, so move it to fs.h. Also do a #include diet for aio.h while we're at it.
Signed-off-by: Christoph Hellwig <hch@lst.de> Sig
fs: move struct kiocb to fs.h
struct kiocb now is a generic I/O container, so move it to fs.h. Also do a #include diet for aio.h while we're at it.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
show more ...
|
#
66ee59af |
| 11-Feb-2015 |
Christoph Hellwig <hch@lst.de> |
fs: remove ki_nbytes
There is no need to pass the total request length in the kiocb, as we already get passed in through the iov_iter argument.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-
fs: remove ki_nbytes
There is no need to pass the total request length in the kiocb, as we already get passed in through the iov_iter argument.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
show more ...
|
#
05afcb77 |
| 23-Jan-2015 |
Al Viro <viro@zeniv.linux.org.uk> |
new helper: iov_iter_bvec()
similar to iov_iter_kvec(), for ITER_BVEC ones
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
Revision tags: v3.19-rc1, v3.18, v3.18-rc7, v3.18-rc6, v3.18-rc5, v3.18-rc4, v3.18-rc3, v3.18-rc2, v3.18-rc1, v3.17, v3.17-rc7, v3.17-rc6, v3.17-rc5, v3.17-rc4, v3.17-rc3, v3.17-rc2, v3.17-rc1, v3.16, v3.16-rc7, v3.16-rc6, v3.16-rc5, v3.16-rc4, v3.16-rc3, v3.16-rc2, v3.16-rc1 |
|
#
05064084 |
| 14-Jun-2014 |
Al Viro <viro@ZenIV.linux.org.uk> |
fix __swap_writepage() compile failure on old gcc versions
Tetsuo Handa wrote: "Commit 62a8067a7f35 ("bio_vec-backed iov_iter") introduced an unnamed union inside a struct which gcc-4.4.7 cannot
fix __swap_writepage() compile failure on old gcc versions
Tetsuo Handa wrote: "Commit 62a8067a7f35 ("bio_vec-backed iov_iter") introduced an unnamed union inside a struct which gcc-4.4.7 cannot handle. Name the unnamed union as u in order to fix build failure"
Let's do this instead: there is only one place in the entire tree that steps into this breakage. Anon structs and unions work in older gcc versions; as the matter of fact, we have those in the tree - see e.g. struct ieee80211_tx_info in include/net/mac80211.h
What doesn't work is handling their initializers:
struct { int a; union { int b; char c; }; } x[2] = {{.a = 1, .c = 'a'}, {.a = 0, .b = 1}};
is the obvious syntax for initializer, perfectly fine for C11 and handled correctly by gcc-4.7 or later.
Earlier versions, though, break on it - declaration is fine and so's access to fields (i.e. x[0].c = 'a'; would produce the right code), but members of the anon structs and unions are not inserted into the right namespace. Tellingly, those older versions will not barf on struct {int a; struct {int a;};}; - looks like they just have it hacked up somewhere around the handling of . and -> instead of doing the right thing.
The easiest way to deal with that crap is to turn initialization of those fields (in the only place where we have such initializer of iov_iter) into plain assignment.
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v3.15 |
|
#
dd6bd0d9 |
| 04-Jun-2014 |
Matthew Wilcox <matthew.r.wilcox@intel.com> |
swap: use bdev_read_page() / bdev_write_page()
By calling the device driver to write the page directly, we avoid allocating a BIO, which allows us to free memory without allocating memory.
[akpm@li
swap: use bdev_read_page() / bdev_write_page()
By calling the device driver to write the page directly, we avoid allocating a BIO, which allows us to free memory without allocating memory.
[akpm@linux-foundation.org: fix used-uninitialized bug] Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Dheeraj Reddy <dheeraj.reddy@intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v3.15-rc8, v3.15-rc7, v3.15-rc6, v3.15-rc5, v3.15-rc4, v3.15-rc3, v3.15-rc2, v3.15-rc1 |
|
#
62a8067a |
| 04-Apr-2014 |
Al Viro <viro@zeniv.linux.org.uk> |
bio_vec-backed iov_iter
New variant of iov_iter - ITER_BVEC in iter->type, backed with bio_vec array instead of iovec one. Primitives taught to deal with such beasts, __swap_write() switched to usi
bio_vec-backed iov_iter
New variant of iov_iter - ITER_BVEC in iter->type, backed with bio_vec array instead of iovec one. Primitives taught to deal with such beasts, __swap_write() switched to using that kind of iov_iter.
Note that bio_vec is just a <page, offset, length> triple - there's nothing block-specific about it. I've left the definition where it was, but took it from under ifdef CONFIG_BLOCK.
Next target: ->splice_write()...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
show more ...
|
Revision tags: v3.14, v3.14-rc8, v3.14-rc7, v3.14-rc6 |
|
#
71d8e532 |
| 05-Mar-2014 |
Al Viro <viro@zeniv.linux.org.uk> |
start adding the tag to iov_iter
For now, just use the same thing we pass to ->direct_IO() - it's all iovec-based at the moment. Pass it explicitly to iov_iter_init() and account for kvec vs. iovec
start adding the tag to iov_iter
For now, just use the same thing we pass to ->direct_IO() - it's all iovec-based at the moment. Pass it explicitly to iov_iter_init() and account for kvec vs. iovec in there, by the same kludge NFS ->direct_IO() uses.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
show more ...
|
#
d8d3d94b |
| 04-Mar-2014 |
Al Viro <viro@zeniv.linux.org.uk> |
pass iov_iter to ->direct_IO()
unmodified, for now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
Revision tags: v3.14-rc5, v3.14-rc4, v3.14-rc3, v3.14-rc2, v3.14-rc1 |
|
#
309381fe |
| 23-Jan-2014 |
Sasha Levin <sasha.levin@oracle.com> |
mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE
Most of the VM_BUG_ON assertions are performed on a page. Usually, when one of these assertions fails we'll get a BUG_ON with a call stac
mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE
Most of the VM_BUG_ON assertions are performed on a page. Usually, when one of these assertions fails we'll get a BUG_ON with a call stack and the registers.
I've recently noticed based on the requests to add a small piece of code that dumps the page to various VM_BUG_ON sites that the page dump is quite useful to people debugging issues in mm.
This patch adds a VM_BUG_ON_PAGE(cond, page) which beyond doing what VM_BUG_ON() does, also dumps the page before executing the actual BUG_ON.
[akpm@linux-foundation.org: fix up includes] Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
Revision tags: v3.13, v3.13-rc8, v3.13-rc7, v3.13-rc6, v3.13-rc5, v3.13-rc4, v3.13-rc3, v3.13-rc2, v3.13-rc1, v3.12, v3.12-rc7, v3.12-rc6, v3.12-rc5 |
|
#
4f024f37 |
| 11-Oct-2013 |
Kent Overstreet <kmo@daterainc.com> |
block: Abstract out bvec iterator
Immutable biovecs are going to require an explicit iterator. To implement immutable bvecs, a later patch is going to add a bi_bvec_done member to this struct; for n
block: Abstract out bvec iterator
Immutable biovecs are going to require an explicit iterator. To implement immutable bvecs, a later patch is going to add a bi_bvec_done member to this struct; for now, this patch effectively just renames things.
Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Ed L. Cashin" <ecashin@coraid.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Yehuda Sadeh <yehuda@inktank.com> Cc: Sage Weil <sage@inktank.com> Cc: Alex Elder <elder@inktank.com> Cc: ceph-devel@vger.kernel.org Cc: Joshua Morris <josh.h.morris@us.ibm.com> Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Neil Brown <neilb@suse.de> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: Benny Halevy <bhalevy@tonian.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <chris.mason@fusionio.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Dave Kleikamp <shaggy@kernel.org> Cc: Joern Engel <joern@logfs.org> Cc: Prasad Joshi <prasadjoshi.linux@gmail.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Ben Myers <bpm@sgi.com> Cc: xfs@oss.sgi.com Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Guo Chao <yan@linux.vnet.ibm.com> Cc: Tejun Heo <tj@kernel.org> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: "Roger Pau Monné" <roger.pau@citrix.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchand@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Peng Tao <tao.peng@emc.com> Cc: Andy Adamson <andros@netapp.com> Cc: fanchaoting <fanchaoting@cn.fujitsu.com> Cc: Jie Liu <jeff.liu@oracle.com> Cc: Sunil Mushran <sunil.mushran@gmail.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Namjae Jeon <namjae.jeon@samsung.com> Cc: Pankaj Kumar <pankaj.km@samsung.com> Cc: Dan Magenheimer <dan.magenheimer@oracle.com> Cc: Mel Gorman <mgorman@suse.de>6
show more ...
|
Revision tags: v3.12-rc4, v3.12-rc3, v3.12-rc2, v3.12-rc1, v3.11, v3.11-rc7, v3.11-rc6, v3.11-rc5, v3.11-rc4, v3.11-rc3, v3.11-rc2, v3.11-rc1, v3.10, v3.10-rc7, v3.10-rc6, v3.10-rc5, v3.10-rc4, v3.10-rc3, v3.10-rc2, v3.10-rc1 |
|
#
73a7075e |
| 09-May-2013 |
Kent Overstreet <koverstreet@google.com> |
aio: Kill aio_rw_vect_retry()
This code doesn't serve any purpose anymore, since the aio retry infrastructure has been removed.
This change should be safe because aio_read/write are also used for s
aio: Kill aio_rw_vect_retry()
This code doesn't serve any purpose anymore, since the aio retry infrastructure has been removed.
This change should be safe because aio_read/write are also used for synchronous IO, and called from do_sync_read()/do_sync_write() - and there's no looping done in the sync case (the read and write syscalls).
Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
show more ...
|
#
b430e9d1 |
| 03-Jul-2013 |
Minchan Kim <minchan@kernel.org> |
mm: remove compressed copy from zram in-memory
Swap subsystem does lazy swap slot free with expecting the page would be swapped out again so we can avoid unnecessary write.
But the problem in in-me
mm: remove compressed copy from zram in-memory
Swap subsystem does lazy swap slot free with expecting the page would be swapped out again so we can avoid unnecessary write.
But the problem in in-memory swap(ex, zram) is that it consumes memory space until vm_swap_full(ie, used half of all of swap device) condition meet. It could be bad if we use multiple swap device, small in-memory swap and big storage swap or in-memory swap alone.
This patch makes swap subsystem free swap slot as soon as swap-read is completed and make the swapcache page dirty so the page should be written out the swap device to reclaim it. It means we never lose it.
I tested this patch with kernel compile workload.
1. before
compile time : 9882.42 zram max wasted space by fragmentation: 13471881 byte memory space consumed by zram: 174227456 byte the number of slot free notify: 206684
2. after
compile time : 9653.90 zram max wasted space by fragmentation: 11805932 byte memory space consumed by zram: 154001408 byte the number of slot free notify: 426972
[akpm@linux-foundation.org: tweak comment text] [artem.savkov@gmail.com: fix BUG due to non-swapcache pages in end_swap_bio_read()] [akpm@linux-foundation.org: invert unlikely() test, augment comment, 80-col cleanup] Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com> Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Artem Savkov <artem.savkov@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Konrad Rzeszutek Wilk <konrad@darnok.org> Cc: Shaohua Li <shli@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
a27bb332 |
| 07-May-2013 |
Kent Overstreet <koverstreet@google.com> |
aio: don't include aio.h in sched.h
Faster kernel compiles by way of fewer unnecessary includes.
[akpm@linux-foundation.org: fix fallout] [akpm@linux-foundation.org: fix build] Signed-off-by: Kent
aio: don't include aio.h in sched.h
Faster kernel compiles by way of fewer unnecessary includes.
[akpm@linux-foundation.org: fix fallout] [akpm@linux-foundation.org: fix build] Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Reviewed-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
0cdc444a |
| 29-Apr-2013 |
Mel Gorman <mgorman@suse.de> |
mm: swap: mark swap pages writeback before queueing for direct IO
As pointed out by Andrew Morton, the swap-over-NFS writeback is not setting PageWriteback before it is queued for direct IO. While
mm: swap: mark swap pages writeback before queueing for direct IO
As pointed out by Andrew Morton, the swap-over-NFS writeback is not setting PageWriteback before it is queued for direct IO. While swap pages do not participate in BDI or process dirty accounting and the IO is synchronous, the writeback bit is still required and not setting it in this case was an oversight. swapoff depends on the page writeback to synchronoise all pending writes on a swap page before it is reused. Swapcache freeing and reuse depend on checking the PageWriteback under lock to ensure the page is safe to reuse.
Direct IO handlers and the direct IO handler for NFS do not deal with PageWriteback as they are synchronous writes. In the case of NFS, it schedules pages (or a page in the case of swap) for IO and then waits synchronously for IO to complete in nfs_direct_write(). It is recognised that this is a slowdown from normal swap handling which is asynchronous and uses a completion handler. Shoving PageWriteback handling down into direct IO handlers looks like a bad fit to handle the swap case although it may have to be dealt with some day if swap is converted to use direct IO in general and bmap is finally done away with. At that point it will be necessary to refit asynchronous direct IO with completion handlers onto the swap subsystem.
As swapcache currently depends on PageWriteback to protect against races, this patch sets PageWriteback under the page lock before queueing it for direct IO. It is cleared when the direct IO handler returns. IO errors are treated similarly to the direct-to-bio case except PageError is not set as in the case of swap-over-NFS, it is likely to be a transient error.
It was asked what prevents such a page being reclaimed in parallel. With this patch applied, such a page will now be skipped (most of the time) or blocked until the writeback completes. Reclaim checks PageWriteback under the page lock before calling try_to_free_swap and the page lock should prevent the page being requeued for IO before it is freed.
This and Jerome's related patch should considered for -stable as far back as 3.6 when swap-over-NFS was introduced.
[akpm@linux-foundation.org: use pr_err_ratelimited()] [akpm@linux-foundation.org: remove hopefully-unneeded cast in printk] Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> [3.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
2d30d31e |
| 29-Apr-2013 |
Jerome Marchand <jmarchan@redhat.com> |
swap: redirty page if page write fails on swap file
Since commit 62c230bc1790 ("mm: add support for a filesystem to activate swap files and use direct_IO for writing swap pages"), swap_writepage() c
swap: redirty page if page write fails on swap file
Since commit 62c230bc1790 ("mm: add support for a filesystem to activate swap files and use direct_IO for writing swap pages"), swap_writepage() calls direct_IO on swap files. However, in that case the page isn't redirtied if I/O fails, and is therefore handled afterwards as if it has been successfully written to the swap file, leading to memory corruption when the page is eventually swapped back in.
This patch sets the page dirty when direct_IO() fails. It fixes a memory corruption that happened while using swap-over-NFS.
Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> [3.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
1eec6702 |
| 29-Apr-2013 |
Seth Jennings <sjenning@linux.vnet.ibm.com> |
mm: allow for outstanding swap writeback accounting
To prevent flooding the swap device with writebacks, frontswap backends need to count and limit the number of outstanding writebacks. The increme
mm: allow for outstanding swap writeback accounting
To prevent flooding the swap device with writebacks, frontswap backends need to count and limit the number of outstanding writebacks. The incrementing of the counter can be done before the call to __swap_writepage(). However, the caller must receive a notification when the writeback completes in order to decrement the counter.
To achieve this functionality, this patch modifies __swap_writepage() to take the bio completion callback function as an argument.
end_swap_bio_write(), the normal bio completion function, is also made non-static so that code doing the accounting can call it after the accounting is done.
There should be no behavioural change to existing code.
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Acked-by: Minchan Kim <minchan@kernel.org> Reviewed-by: Dan Magenheimer <dan.magenheimer@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|