Message ID | 20220908170133.1159747-1-abrestic@rivosinc.com (mailing list archive) |
---|---|
State | New, archived |
Series | riscv: Allow PROT_WRITE-only mmap() |
> is unnecessary since RISC-V defines its protection_map such that PROT_WRITE
> maps to the same PTE permissions as PROT_WRITE|PROT_READ, and it is
> inconsistent with other architectures that don't support write-only PTEs,
> creating a potential software portability issue.

I don't believe that the check is unnecessary. The missing check was
discovered in a real-world scenario, while we were fixing libaio's test
failure on RISC-V [1]. A minimal reproducible example is uploaded to
https://fars.ee/1sPb, showing *inconsistent* read results on -r- pages
before/after a write attempt performed by the kernel.

[1]: https://pagure.io/libaio/blob/1b18bfafc6a2f7b9fa2c6be77a95afed8b7be448/f/harness/cases/5.t

> -	if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ)))
> -		return -EINVAL;
> -

Just to mention, this revert patch also removes the check for exec
without read (--x).
> https://fars.ee/1sPb, showing *inconsistent* read results on -r- pages
> before/after a write attempt performed by the kernel.

That said, maybe prohibiting mmap-ing -w- pages is not the best fix for
this issue. If -w- pages are irreplaceable for some use cases (and hence
need to be allowed), I'd suggest that we at least fix the read result
inconsistency issue somewhere else, rather than simply reverting the
patch.

Yours,
Pan Ruizhe
On Thu, Sep 8, 2022 at 1:28 PM SS JieJi <c141028@gmail.com> wrote:
>
> > https://fars.ee/1sPb, showing *inconsistent* read results on -r- pages
> > before/after a write attempt performed by the kernel.
>
> That said, maybe prohibit mmap-ing -w- pages is not the best fix for
> this issue. If -w- pages are irreplaceable for some use cases (and
> hence need to be allowed), I'd suggest at least we need to re-fix the
> read result inconsistency issue somewhere else despite simply
> reverting the patch.

Ah, this is because do_page_fault() also needs to be made aware of
write-implying-read. Will send a v2 shortly.

-Andrew

>
> Yours, Pan Ruizhe
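
To illustrate what "write-implying-read" could mean for a fault-time permission check, here is a minimal userspace sketch. It is not the kernel's do_page_fault() and not the v2 patch: the `SK_VM_*` flags, `enum sk_access`, and `sk_access_error()` are hypothetical names invented for this example, and the only rule modelled is that a load is allowed on a mapping that is readable or writable.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's VM_READ/VM_WRITE/VM_EXEC bits. */
enum { SK_VM_READ = 1u << 0, SK_VM_WRITE = 1u << 1, SK_VM_EXEC = 1u << 2 };

/* Kind of access that triggered the (modelled) fault. */
enum sk_access { SK_ACCESS_READ, SK_ACCESS_WRITE, SK_ACCESS_EXEC };

/*
 * Return true if the access must be rejected for a mapping with the
 * given flags. Assumption for this sketch: write permission implies
 * read permission, mirroring PROT_WRITE and PROT_WRITE|PROT_READ
 * sharing the same PTE permissions.
 */
static bool sk_access_error(enum sk_access access, unsigned int vm_flags)
{
	switch (access) {
	case SK_ACCESS_READ:
		/* Both readable and writable mappings allow loads. */
		return !(vm_flags & (SK_VM_READ | SK_VM_WRITE));
	case SK_ACCESS_WRITE:
		return !(vm_flags & SK_VM_WRITE);
	case SK_ACCESS_EXEC:
		return !(vm_flags & SK_VM_EXEC);
	}
	return true; /* unknown access kind: reject */
}
```

With this rule, a read fault on a write-only mapping is treated like a read fault on a read-write mapping, so the read result no longer depends on whether a write has already populated the page.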
diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
index 571556bb9261..5d3f2fbeb33c 100644
--- a/arch/riscv/kernel/sys_riscv.c
+++ b/arch/riscv/kernel/sys_riscv.c
@@ -18,9 +18,6 @@ static long riscv_sys_mmap(unsigned long addr, unsigned long len,
 	if (unlikely(offset & (~PAGE_MASK >> page_shift_offset)))
 		return -EINVAL;
 
-	if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ)))
-		return -EINVAL;
-
 	return ksys_mmap_pgoff(addr, len, prot, flags, fd,
 			       offset >> (PAGE_SHIFT - page_shift_offset));
 }
Commit 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is
invalid") made mmap() return EINVAL if PROT_WRITE was set without
PROT_READ with the justification that a write-only PTE is considered a
reserved PTE permission bit pattern in the privileged spec. This check
is unnecessary since RISC-V defines its protection_map such that PROT_WRITE
maps to the same PTE permissions as PROT_WRITE|PROT_READ, and it is
inconsistent with other architectures that don't support write-only PTEs,
creating a potential software portability issue. Just remove the check
altogether and let PROT_WRITE imply PROT_READ as is the case on other
architectures.

Fixes: 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is invalid")
Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
---
 arch/riscv/kernel/sys_riscv.c | 3 ---
 1 file changed, 3 deletions(-)