Message ID | 20211123051658.3195589-2-pcc@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | kernel: introduce uaccess logging | expand |
On Tue, 23 Nov 2021 at 06:17, Peter Collingbourne <pcc@google.com> wrote: > > With uaccess logging the contract is that the kernel must not report > accessing more data than necessary, as this can lead to false positive > reports in downstream consumers. This generally works out of the box > when instrumenting copy_{from,to}_user(), but with the data argument > to mount() we use copy_from_user() to copy PAGE_SIZE bytes (or as > much as we can, if the PAGE_SIZE sized access failed) and figure out > later how much we actually need. > > To prevent this from leading to a false positive report, use > raw_copy_from_user(), which will prevent the access from being logged. > Recall that it is valid for the kernel to report accessing less > data than it actually accessed, as uaccess logging is a best-effort > mechanism for reporting uaccesses. > > Link: https://linux-review.googlesource.com/id/I5629b92a725c817acd9a861288338dd605cafee6 > Signed-off-by: Peter Collingbourne <pcc@google.com> > --- > fs/namespace.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/fs/namespace.c b/fs/namespace.c > index 659a8f39c61a..695b30e391f0 100644 > --- a/fs/namespace.c > +++ b/fs/namespace.c > @@ -3197,7 +3197,12 @@ static void *copy_mount_options(const void __user * data) > if (!copy) > return ERR_PTR(-ENOMEM); > > - left = copy_from_user(copy, data, PAGE_SIZE); > + /* > + * Use raw_copy_from_user to avoid reporting overly large accesses in > + * the uaccess buffer, as this can lead to false positive reports in > + * downstream consumers. > + */ > + left = raw_copy_from_user(copy, data, PAGE_SIZE); This will skip KASAN/etc checks as well, right? I guess it is fine b/c this affects just this place and the code looks safe (famous last words :)) and we can refine it in future. But I wonder about false positives under KMSAN. However, we probably can add an explicit KMSAN annotation to mark it as initialised. Alex? > /* > * Not all architectures have an exact copy_from_user(). Resort to > -- > 2.34.0.rc2.393.gf8c9666880-goog >
On Tue, Nov 23, 2021 at 8:51 AM Dmitry Vyukov <dvyukov@google.com> wrote: > > On Tue, 23 Nov 2021 at 06:17, Peter Collingbourne <pcc@google.com> wrote: > > > > With uaccess logging the contract is that the kernel must not report > > accessing more data than necessary, as this can lead to false positive > > reports in downstream consumers. This generally works out of the box > > when instrumenting copy_{from,to}_user(), but with the data argument > > to mount() we use copy_from_user() to copy PAGE_SIZE bytes (or as > > much as we can, if the PAGE_SIZE sized access failed) and figure out > > later how much we actually need. > > > > To prevent this from leading to a false positive report, use > > raw_copy_from_user(), which will prevent the access from being logged. > > Recall that it is valid for the kernel to report accessing less > > data than it actually accessed, as uaccess logging is a best-effort > > mechanism for reporting uaccesses. > > > > Link: https://linux-review.googlesource.com/id/I5629b92a725c817acd9a861288338dd605cafee6 > > Signed-off-by: Peter Collingbourne <pcc@google.com> > > --- > > fs/namespace.c | 7 ++++++- > > 1 file changed, 6 insertions(+), 1 deletion(-) > > > > diff --git a/fs/namespace.c b/fs/namespace.c > > index 659a8f39c61a..695b30e391f0 100644 > > --- a/fs/namespace.c > > +++ b/fs/namespace.c > > @@ -3197,7 +3197,12 @@ static void *copy_mount_options(const void __user * data) > > if (!copy) > > return ERR_PTR(-ENOMEM); > > > > - left = copy_from_user(copy, data, PAGE_SIZE); > > + /* > > + * Use raw_copy_from_user to avoid reporting overly large accesses in > > + * the uaccess buffer, as this can lead to false positive reports in > > + * downstream consumers. > > + */ > > + left = raw_copy_from_user(copy, data, PAGE_SIZE); I don't really like the idea of using raw_copy_from_user() anywhere. Right now users of instrumented.h can more or less assume they see all usercopy events, and removing the copy_from_user() call from here looks like a regression. Cannot the usercopy logger decide whether it wants to log the access based on the size (e.g. skip accesses >= PAGE_SIZE)? Will it help if we can instrument both sides of copy_from_user() (see the code here: https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/14103/4)? If not, maybe we can disable/enable uaccess logging for the current task around these accesses? > This will skip KASAN/etc checks as well, right? I guess it is fine b/c > this affects just this place and the code looks safe (famous last > words :)) and we can refine it in future. > But I wonder about false positives under KMSAN. However, we probably > can add an explicit KMSAN annotation to mark it as initialised. > Alex? > > > /* > > * Not all architectures have an exact copy_from_user(). Resort to > > -- > > 2.34.0.rc2.393.gf8c9666880-goog > >
On Tue, Nov 23, 2021 at 2:09 AM Alexander Potapenko <glider@google.com> wrote: > > On Tue, Nov 23, 2021 at 8:51 AM Dmitry Vyukov <dvyukov@google.com> wrote: > > > > On Tue, 23 Nov 2021 at 06:17, Peter Collingbourne <pcc@google.com> wrote: > > > > > > With uaccess logging the contract is that the kernel must not report > > > accessing more data than necessary, as this can lead to false positive > > > reports in downstream consumers. This generally works out of the box > > > when instrumenting copy_{from,to}_user(), but with the data argument > > > to mount() we use copy_from_user() to copy PAGE_SIZE bytes (or as > > > much as we can, if the PAGE_SIZE sized access failed) and figure out > > > later how much we actually need. > > > > > > To prevent this from leading to a false positive report, use > > > raw_copy_from_user(), which will prevent the access from being logged. > > > Recall that it is valid for the kernel to report accessing less > > > data than it actually accessed, as uaccess logging is a best-effort > > > mechanism for reporting uaccesses. > > > > > > Link: https://linux-review.googlesource.com/id/I5629b92a725c817acd9a861288338dd605cafee6 > > > Signed-off-by: Peter Collingbourne <pcc@google.com> > > > --- > > > fs/namespace.c | 7 ++++++- > > > 1 file changed, 6 insertions(+), 1 deletion(-) > > > > > > diff --git a/fs/namespace.c b/fs/namespace.c > > > index 659a8f39c61a..695b30e391f0 100644 > > > --- a/fs/namespace.c > > > +++ b/fs/namespace.c > > > @@ -3197,7 +3197,12 @@ static void *copy_mount_options(const void __user * data) > > > if (!copy) > > > return ERR_PTR(-ENOMEM); > > > > > > - left = copy_from_user(copy, data, PAGE_SIZE); > > > + /* > > > + * Use raw_copy_from_user to avoid reporting overly large accesses in > > > + * the uaccess buffer, as this can lead to false positive reports in > > > + * downstream consumers. > > > + */ > > > + left = raw_copy_from_user(copy, data, PAGE_SIZE); > > I don't really like the idea of using raw_copy_from_user() anywhere. > Right now users of instrumented.h can more or less assume they see all > usercopy events, and removing the copy_from_user() call from here > looks like a regression. > > Cannot the usercopy logger decide whether it wants to log the access > based on the size (e.g. skip accesses >= PAGE_SIZE)? > Will it help if we can instrument both sides of copy_from_user() (see > the code here: https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/14103/4)? > > If not, maybe we can disable/enable uaccess logging for the current > task around these accesses? This seems reasonable, done in v3. Peter
diff --git a/fs/namespace.c b/fs/namespace.c index 659a8f39c61a..695b30e391f0 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -3197,7 +3197,12 @@ static void *copy_mount_options(const void __user * data) if (!copy) return ERR_PTR(-ENOMEM); - left = copy_from_user(copy, data, PAGE_SIZE); + /* + * Use raw_copy_from_user to avoid reporting overly large accesses in + * the uaccess buffer, as this can lead to false positive reports in + * downstream consumers. + */ + left = raw_copy_from_user(copy, data, PAGE_SIZE); /* * Not all architectures have an exact copy_from_user(). Resort to
With uaccess logging the contract is that the kernel must not report accessing more data than necessary, as this can lead to false positive reports in downstream consumers. This generally works out of the box when instrumenting copy_{from,to}_user(), but with the data argument to mount() we use copy_from_user() to copy PAGE_SIZE bytes (or as much as we can, if the PAGE_SIZE sized access failed) and figure out later how much we actually need. To prevent this from leading to a false positive report, use raw_copy_from_user(), which will prevent the access from being logged. Recall that it is valid for the kernel to report accessing less data than it actually accessed, as uaccess logging is a best-effort mechanism for reporting uaccesses. Link: https://linux-review.googlesource.com/id/I5629b92a725c817acd9a861288338dd605cafee6 Signed-off-by: Peter Collingbourne <pcc@google.com> --- fs/namespace.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)