Message ID | 20210323162203.30942-1-mwilck@suse.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | target: pscsi: avoid OOM in pscsi_map_sg() | expand |
On Tue, Mar 23, 2021 at 05:22:03PM +0100, mwilck@suse.com wrote: > Avoid this by simply not resetting nr_pages to 0 after allocating the > bio. This way, the client receives an IO error when it tries to send > requests exceeding the devices max_sectors_kb, and eventually gets > it right. The client must still limit max_sectors_kb e.g. by an udev > rule if (like in my case) the driver doesn't report valid block > limits, otherwise it encounters I/O errors. FYI, I think the what you did here is correct, but not enough. When pscsi_get_bio (that is bio_kmalloc) fails, this function needs to unwind and return an error insted of blindly retrying the allocation, else we can't recover from a memory shortage.
Martin, On 3/23/21 09:23, mwilck@suse.com wrote: > From: Martin Wilck <mwilck@suse.com> > > pscsi_map_sg() uses the variable nr_pages as a hint for bio_kmalloc() > how many vector elements to allocate. If nr_pages is < BIO_MAX_PAGES, > it will be reset to 0 after successful allocation of the bio. I think BIO_MAX_PAGES is replaced by BIO_MAX_VECS with commit a8affc03a9b3 ("block: rename BIO_MAX_PAGES to BIO_MAX_VECS").
On Tue, 2021-03-23 at 16:29 +0000, Christoph Hellwig wrote: > On Tue, Mar 23, 2021 at 05:22:03PM +0100, mwilck@suse.com wrote: > > Avoid this by simply not resetting nr_pages to 0 after allocating the > > bio. This way, the client receives an IO error when it tries to send > > requests exceeding the devices max_sectors_kb, and eventually gets > > it right. The client must still limit max_sectors_kb e.g. by an udev > > rule if (like in my case) the driver doesn't report valid block > > limits, otherwise it encounters I/O errors. > > FYI, I think the what you did here is correct, but not enough. > When pscsi_get_bio (that is bio_kmalloc) fails, this function needs > to unwind and return an error insted of blindly retrying the > allocation, > else we can't recover from a memory shortage. I can try to do that, but in the tests I ran, I never observed bio_kmalloc() failing. I saw it eat all memory, and various processes being killed by the OOM killer, before the system eventually panicked with OOM. It takes only fractions of a second until this happens: [ 57.356238] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=lightdm-gtk-gre,pid=4586,uid=481 [ 57.369635] Out of memory: Killed process 4586 (lightdm-gtk-gre) total-vm:1035752kB, anon-rss:0kB, file-rss:2936kB, shmem-rss:0kB ... [ 57.698656] Node 0 Normal free:54828kB min:55008kB low:68760kB high:82512kB active_anon:4kB inactive_anon:0kB active_file:936kB inactive_file:1164kB unevictable:58036kB writepending:720kB present:13631488kB managed:13310252kB mlocked:58036kB kernel_stack:5808kB pagetables:8972kB bounce:0kB free_pcp:564kB local_pcp:0kB free_cma:0kB ... [ 57.818978] Unreclaimable slab info: [ 58.254160] kmalloc-192 15683546KB 15683546KB (system has 16GiB memory). Martin
On Tue, 2021-03-23 at 18:07 +0000, Chaitanya Kulkarni wrote: > Martin, > > On 3/23/21 09:23, mwilck@suse.com wrote: > > From: Martin Wilck <mwilck@suse.com> > > > > pscsi_map_sg() uses the variable nr_pages as a hint for > > bio_kmalloc() > > how many vector elements to allocate. If nr_pages is < > > BIO_MAX_PAGES, > > it will be reset to 0 after successful allocation of the bio. > > I think BIO_MAX_PAGES is replaced by BIO_MAX_VECS with > commit a8affc03a9b3 ("block: rename BIO_MAX_PAGES to BIO_MAX_VECS"). Right. I made my patch against mkp/queue, which doesn't include this commit yet. As this is just in the description, I don't think it matters much, does it? Martin
On 3/23/21 14:21, Martin Wilck wrote: >> I think BIO_MAX_PAGES is replaced by BIO_MAX_VECS with >> commit a8affc03a9b3 ("block: rename BIO_MAX_PAGES to BIO_MAX_VECS"). > Right. I made my patch against mkp/queue, which doesn't include this > commit yet. As this is just in the description, I don't think it > matters much, does it? > > Martin > > > I don't think it does, I'll let you decide that.
diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c index 7b1035e..977362d 100644 --- a/drivers/target/target_core_pscsi.c +++ b/drivers/target/target_core_pscsi.c @@ -881,7 +881,6 @@ pscsi_map_sg(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents, if (!bio) { new_bio: nr_vecs = bio_max_segs(nr_pages); - nr_pages -= nr_vecs; /* * Calls bio_kmalloc() and sets bio->bi_end_io() */