From patchwork Fri Jul 6 19:32:34 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 10512441
From: Omar Sandoval
To: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 Alexey Dobriyan
Cc: Eric Biederman, kernel-team@fb.com
Subject: [PATCH 3/7] proc/kcore: fix memory hotplug vs multiple opens race
Date: Fri, 6 Jul 2018 12:32:34 -0700
Message-Id: <904e8cabeddf594c30ba5fd314e2d5b887ceb3c8.1530904769.git.osandov@fb.com>
X-Mailer: git-send-email 2.18.0
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Omar Sandoval

There's a theoretical race condition that will cause /proc/kcore to
miss a memory hotplug event:

CPU0                              CPU1
// hotplug event 1
kcore_need_update = 1

open_kcore()                      open_kcore()
    kcore_update_ram()                kcore_update_ram()
        // Walk RAM                       // Walk RAM
        __kcore_update_ram()              __kcore_update_ram()
            kcore_need_update = 0

// hotplug event 2
kcore_need_update = 1
                                              kcore_need_update = 0

Note that CPU1 set up the RAM kcore entries with the state after hotplug
event 1 but cleared the flag for hotplug event 2. The RAM entries will
therefore be stale until there is another hotplug event.

This is an extremely unlikely sequence of events, but the fix makes the
synchronization saner, anyways: we serialize the entire update sequence,
which means that whoever clears the flag will always succeed in
replacing the kcore list.

Signed-off-by: Omar Sandoval
---
 fs/proc/kcore.c | 93 +++++++++++++++++++++++--------------------------
 1 file changed, 44 insertions(+), 49 deletions(-)

diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index eb1be07bdb3d..f335400300d3 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -98,53 +98,15 @@ static size_t get_kcore_size(int *nphdr, size_t *elf_buflen)
 	return size + *elf_buflen;
 }
 
-static void free_kclist_ents(struct list_head *head)
-{
-	struct kcore_list *tmp, *pos;
-
-	list_for_each_entry_safe(pos, tmp, head, list) {
-		list_del(&pos->list);
-		kfree(pos);
-	}
-}
-/*
- * Replace all KCORE_RAM/KCORE_VMEMMAP information with passed list.
- */
-static void __kcore_update_ram(struct list_head *list)
-{
-	int nphdr;
-	size_t size;
-	struct kcore_list *tmp, *pos;
-	LIST_HEAD(garbage);
-
-	down_write(&kclist_lock);
-	if (atomic_cmpxchg(&kcore_need_update, 1, 0)) {
-		list_for_each_entry_safe(pos, tmp, &kclist_head, list) {
-			if (pos->type == KCORE_RAM
-				|| pos->type == KCORE_VMEMMAP)
-				list_move(&pos->list, &garbage);
-		}
-		list_splice_tail(list, &kclist_head);
-	} else
-		list_splice(list, &garbage);
-	proc_root_kcore->size = get_kcore_size(&nphdr, &size);
-	up_write(&kclist_lock);
-
-	free_kclist_ents(&garbage);
-}
-
-
 #ifdef CONFIG_HIGHMEM
 /*
  * If no highmem, we can assume [0...max_low_pfn) continuous range of memory
  * because memory hole is not as big as !HIGHMEM case.
  * (HIGHMEM is special because part of memory is _invisible_ from the kernel.)
  */
-static int kcore_update_ram(void)
+static int kcore_ram_list(struct list_head *head)
 {
-	LIST_HEAD(head);
 	struct kcore_list *ent;
-	int ret = 0;
 
 	ent = kmalloc(sizeof(*ent), GFP_KERNEL);
 	if (!ent)
@@ -152,9 +114,8 @@ static int kcore_update_ram(void)
 	ent->addr = (unsigned long)__va(0);
 	ent->size = max_low_pfn << PAGE_SHIFT;
 	ent->type = KCORE_RAM;
-	list_add(&ent->list, &head);
-	__kcore_update_ram(&head);
-	return ret;
+	list_add(&ent->list, head);
+	return 0;
 }
 
 #else /* !CONFIG_HIGHMEM */
@@ -253,11 +214,10 @@ kclist_add_private(unsigned long pfn, unsigned long nr_pages, void *arg)
 	return 1;
 }
 
-static int kcore_update_ram(void)
+static int kcore_ram_list(struct list_head *list)
 {
 	int nid, ret;
 	unsigned long end_pfn;
-	LIST_HEAD(head);
 
 	/* Not inialized....update now */
 	/* find out "max pfn" */
@@ -269,15 +229,50 @@ static int kcore_update_ram(void)
 			end_pfn = node_end;
 	}
 	/* scan 0 to max_pfn */
-	ret = walk_system_ram_range(0, end_pfn, &head, kclist_add_private);
-	if (ret) {
-		free_kclist_ents(&head);
+	ret = walk_system_ram_range(0, end_pfn, list, kclist_add_private);
+	if (ret)
 		return -ENOMEM;
+	return 0;
+}
+#endif /* CONFIG_HIGHMEM */
+
+static int kcore_update_ram(void)
+{
+	LIST_HEAD(list);
+	LIST_HEAD(garbage);
+	int nphdr;
+	size_t size;
+	struct kcore_list *tmp, *pos;
+	int ret = 0;
+
+	down_write(&kclist_lock);
+	if (!atomic_cmpxchg(&kcore_need_update, 1, 0))
+		goto out;
+
+	ret = kcore_ram_list(&list);
+	if (ret) {
+		/* Couldn't get the RAM list, try again next time. */
+		atomic_set(&kcore_need_update, 1);
+		list_splice_tail(&list, &garbage);
+		goto out;
+	}
+
+	list_for_each_entry_safe(pos, tmp, &kclist_head, list) {
+		if (pos->type == KCORE_RAM || pos->type == KCORE_VMEMMAP)
+			list_move(&pos->list, &garbage);
+	}
+	list_splice_tail(&list, &kclist_head);
+
+	proc_root_kcore->size = get_kcore_size(&nphdr, &size);
+
+out:
+	up_write(&kclist_lock);
+	list_for_each_entry_safe(pos, tmp, &garbage, list) {
+		list_del(&pos->list);
+		kfree(pos);
 	}
-	__kcore_update_ram(&head);
 	return ret;
 }
-#endif /* CONFIG_HIGHMEM */
 
 /*****************************************************************************/
 /*