From patchwork Thu Jan 16 23:59:53 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Minchan Kim
X-Patchwork-Id: 11338039
From: Minchan Kim
To: Andrew Morton
Cc: LKML, linux-mm, linux-api@vger.kernel.org, oleksandr@redhat.com,
 Suren Baghdasaryan, Tim Murray, Daniel Colascione, Sandeep Patil,
 Sonny Rao, Brian Geffon, Michal Hocko, Johannes Weiner, Shakeel Butt,
 John Dias, ktkhai@virtuozzo.com, christian.brauner@ubuntu.com,
 sjpark@amazon.de, Minchan Kim
Subject: [PATCH v2 5/5] mm: support both pid and pidfd for process_madvise
Date: Thu, 16 Jan 2020 15:59:53 -0800
Message-Id: <20200116235953.163318-6-minchan@kernel.org>
X-Mailer: git-send-email 2.25.0.rc1.283.g88dfdc4193-goog
In-Reply-To: <20200116235953.163318-1-minchan@kernel.org>
References: <20200116235953.163318-1-minchan@kernel.org>
Sender: owner-linux-mm@kvack.org

There is a
demand [1] to support pid as well as pidfd for process_madvise, to
avoid an unnecessary syscall to obtain a pidfd when the user has control
of the target process (i.e., they can guarantee that the process has not
exited and the pid has not been reused, or it is acceptable to give a
hint to the wrong process).

This patch aims to support both options, like waitid(2). So the syscall
is currently:

	int process_madvise(int which, pid_t pid, void *addr,
		size_t length, int advise, unsigned long flag);

@which is actually idtype_t for the userspace library, and it currently
supports P_PID and P_PIDFD.

[1] https://lore.kernel.org/linux-mm/9d849087-3359-c4ab-fbec-859e8186c509@virtuozzo.com/

Signed-off-by: Minchan Kim
---
 include/linux/pid.h      |  1 +
 include/linux/syscalls.h |  3 ++-
 kernel/exit.c            | 17 -----------------
 kernel/pid.c             | 17 +++++++++++++++++
 mm/madvise.c             | 34 ++++++++++++++++++++++------------
 5 files changed, 42 insertions(+), 30 deletions(-)

diff --git a/include/linux/pid.h b/include/linux/pid.h
index 998ae7d24450..023d9c3a8edc 100644
--- a/include/linux/pid.h
+++ b/include/linux/pid.h
@@ -75,6 +75,7 @@ extern const struct file_operations pidfd_fops;
 struct file;
 
 extern struct pid *pidfd_pid(const struct file *file);
+extern struct pid *pidfd_get_pid(unsigned int fd);
 
 static inline struct pid *get_pid(struct pid *pid)
 {
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 1b58a11ff49f..27060e59db37 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -877,7 +877,8 @@ asmlinkage long sys_munlockall(void);
 asmlinkage long sys_mincore(unsigned long start, size_t len,
 				unsigned char __user * vec);
 asmlinkage long sys_madvise(unsigned long start, size_t len, int behavior);
-asmlinkage long sys_process_madvise(int pidfd, unsigned long start,
+
+asmlinkage long sys_process_madvise(int which, pid_t pid, unsigned long start,
 				size_t len, int behavior, unsigned long flags);
 asmlinkage long sys_remap_file_pages(unsigned long start, unsigned long size,
 			unsigned long prot, unsigned long pgoff,
diff --git a/kernel/exit.c b/kernel/exit.c
index bcbd59888e67..7698843b1411 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1466,23 +1466,6 @@ static long do_wait(struct wait_opts *wo)
 	return retval;
 }
 
-static struct pid *pidfd_get_pid(unsigned int fd)
-{
-	struct fd f;
-	struct pid *pid;
-
-	f = fdget(fd);
-	if (!f.file)
-		return ERR_PTR(-EBADF);
-
-	pid = pidfd_pid(f.file);
-	if (!IS_ERR(pid))
-		get_pid(pid);
-
-	fdput(f);
-	return pid;
-}
-
 static long kernel_waitid(int which, pid_t upid, struct waitid_info *infop,
 			  int options, struct rusage *ru)
 {
diff --git a/kernel/pid.c b/kernel/pid.c
index 2278e249141d..a41a89d5dad2 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -496,6 +496,23 @@ struct pid *find_ge_pid(int nr, struct pid_namespace *ns)
 	return idr_get_next(&ns->idr, &nr);
 }
 
+struct pid *pidfd_get_pid(unsigned int fd)
+{
+	struct fd f;
+	struct pid *pid;
+
+	f = fdget(fd);
+	if (!f.file)
+		return ERR_PTR(-EBADF);
+
+	pid = pidfd_pid(f.file);
+	if (!IS_ERR(pid))
+		get_pid(pid);
+
+	fdput(f);
+	return pid;
+}
+
 /**
  * pidfd_create() - Create a new pid file descriptor.
  *
diff --git a/mm/madvise.c b/mm/madvise.c
index 89557998d287..2ac62716e5b8 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1192,11 +1192,10 @@ SYSCALL_DEFINE3(madvise, unsigned long, start, size_t, len_in, int, behavior)
 	return madvise_common(current, current->mm, start, len_in, behavior);
 }
 
-SYSCALL_DEFINE5(process_madvise, int, pidfd, unsigned long, start,
+SYSCALL_DEFINE6(process_madvise, int, which, pid_t, upid, unsigned long, start,
 		size_t, len_in, int, behavior, unsigned long, flags)
 {
 	int ret;
-	struct fd f;
 	struct pid *pid;
 	struct task_struct *task;
 	struct mm_struct *mm;
@@ -1207,20 +1206,31 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, unsigned long, start,
 	if (!process_madvise_behavior_valid(behavior))
 		return -EINVAL;
 
-	f = fdget(pidfd);
-	if (!f.file)
-		return -EBADF;
+	switch (which) {
+	case P_PID:
+		if (upid <= 0)
+			return -EINVAL;
+
+		pid = find_get_pid(upid);
+		if (!pid)
+			return -ESRCH;
+		break;
+	case P_PIDFD:
+		if (upid < 0)
+			return -EINVAL;
 
-	pid = pidfd_pid(f.file);
-	if (IS_ERR(pid)) {
-		ret = PTR_ERR(pid);
-		goto fdput;
+		pid = pidfd_get_pid(upid);
+		if (IS_ERR(pid))
+			return PTR_ERR(pid);
+		break;
+	default:
+		return -EINVAL;
 	}
 
 	task = get_pid_task(pid, PIDTYPE_PID);
 	if (!task) {
 		ret = -ESRCH;
-		goto fdput;
+		goto put_pid;
 	}
 
 	mm = mm_access(task, PTRACE_MODE_ATTACH_FSCREDS);
@@ -1233,7 +1243,7 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, unsigned long, start,
 	mmput(mm);
 release_task:
 	put_task_struct(task);
-fdput:
-	fdput(f);
+put_pid:
+	put_pid(pid);
 	return ret;
 }
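
Note for reviewers: the argument checks performed by the switch on @which in the mm/madvise.c hunk can be exercised without a patched kernel using a small userspace model. The following is only a sketch under stated assumptions, not part of the patch. The names model_check_target(), MODEL_P_PID, MODEL_P_PIDFD and example_usage() are hypothetical stand-ins introduced here, and the actual lookups (find_get_pid(), pidfd_get_pid()) are deliberately not modeled.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/* Hypothetical stand-ins for P_PID / P_PIDFD; values are illustrative only. */
enum model_which { MODEL_P_PID = 1, MODEL_P_PIDFD = 3 };

/*
 * Userspace model of the argument checks in the switch on @which.
 * Returns 0 if the (which, upid) pair would proceed to the lookup step,
 * or a negative errno otherwise. The lookups themselves are not modeled.
 */
int model_check_target(int which, pid_t upid)
{
	switch (which) {
	case MODEL_P_PID:
		/* A pid must be strictly positive. */
		return upid <= 0 ? -EINVAL : 0;
	case MODEL_P_PIDFD:
		/* fd 0 is a valid descriptor, so only negative values fail. */
		return upid < 0 ? -EINVAL : 0;
	default:
		/* Any other idtype is rejected. */
		return -EINVAL;
	}
}

/* Usage example: the boundary cases for each idtype. */
void example_usage(void)
{
	assert(model_check_target(MODEL_P_PID, 1234) == 0);
	assert(model_check_target(MODEL_P_PID, 0) == -EINVAL);
	assert(model_check_target(MODEL_P_PIDFD, 0) == 0);
	assert(model_check_target(MODEL_P_PIDFD, -1) == -EINVAL);
	assert(model_check_target(42, 1234) == -EINVAL);
}
```

The asymmetry between the two cases (`<= 0` for a pid, `< 0` for a pidfd) follows the hunk: 0 is never a valid pid here, while 0 can be a valid file descriptor.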