mbox series

[v4,00/10] riscv: Allow userspace to directly access perf counters

Message ID 20230703124647.215952-1-alexghiti@rivosinc.com (mailing list archive)
Headers show
Series riscv: Allow userspace to directly access perf counters | expand

Message

Alexandre Ghiti July 3, 2023, 12:46 p.m. UTC
riscv used to allow direct access to cycle/time/instret counters,
bypassing the perf framework, this patchset intends to allow the user to
mmap any counter when accessed through perf. But we can't break the
existing behaviour so we introduce a sysctl perf_user_access like arm64
does, which defaults to the legacy mode described above.

This version needs openSBI v1.3 *and* a kernel fix that went upstream lately
(https://lore.kernel.org/lkml/20230616114831.3186980-1-maz@kernel.org/T/).

**Important**: In this version, the default mode is now user access, not
the legacy so some applications will break.

base-commit-tag: v6.4-rc6

Changes in v4:
- Fixed some nits in riscv_pmu_sbi.c thanks to Andrew
- Fixed the documentation thanks to Andrew
- Added RB from Andrew \o/

Changes in v3:
- patch 1 now contains the ref to the faulty commit (no Fixes tag as it is only a comment), as Andrew suggested
- Removed RISCV_PMU_LEGACY_TIME from patch 3, as Andrew suggested
- Rename RISCV_PMU_PDEV_NAME to "riscv-pmu-sbi", patch4 is just cosmetic now, as Andrew suggested
- Removed a few useless (and wrong) comments, as Andrew suggested
- Simplify arch_perf_update_userpage code, as Andrew suggested
- Documentation now mentions that time CSR is *always* accessible, whatever the mode, as suggested by Andrew
- Removed CYCLEH reference and add TODO for rv32 support, as suggested by Atish
- Do not rename the pmu instance as Atish suggested
- Set pmc_width only if rdpmc is enabled and CONFIG_RISCV_PMU is set and the event is a hw event
- Move arch_perf_update_userpage https://lore.kernel.org/lkml/20230616114831.3186980-1-maz@kernel.org/T/
- **Switch to user mode access by default**

Changes in v2:
- Split into smaller patches, way better!
- Add RB from Conor
- Simplify the way we checked riscv architecture
- Fix race mmap and other thread running on other cpus
- Use hwc when available
- Set all userspace access flags in event_init, too cumbersome to handle sysctl changes
- Fix arch_perf_update_userpage for pmu other than riscv-pmu by renaming pmu driver
- Fixed kernel test robot build error
- Fixed documentation (Andrew and Bagas)
- perf testsuite passes mmap tests in all 3 modes

Alexandre Ghiti (10):
  perf: Fix wrong comment about default event_idx
  include: riscv: Fix wrong include guard in riscv_pmu.h
  riscv: Make legacy counter enum match the HW numbering
  drivers: perf: Rename riscv pmu sbi driver
  riscv: Prepare for user-space perf event mmap support
  drivers: perf: Implement perf event mmap support in the legacy backend
  drivers: perf: Implement perf event mmap support in the SBI backend
  Documentation: admin-guide: Add riscv sysctl_perf_user_access
  tools: lib: perf: Implement riscv mmap support
  perf: tests: Adapt mmap-basic.c for riscv

 Documentation/admin-guide/sysctl/kernel.rst |  27 ++-
 drivers/perf/riscv_pmu.c                    | 113 +++++++++++
 drivers/perf/riscv_pmu_legacy.c             |  28 ++-
 drivers/perf/riscv_pmu_sbi.c                | 196 +++++++++++++++++++-
 include/linux/perf/riscv_pmu.h              |  12 +-
 include/linux/perf_event.h                  |   3 +-
 tools/lib/perf/mmap.c                       |  65 +++++++
 tools/perf/tests/mmap-basic.c               |   4 +-
 8 files changed, 428 insertions(+), 20 deletions(-)

Comments

Atish Patra July 14, 2023, 9:07 a.m. UTC | #1
On Mon, Jul 3, 2023 at 5:46 AM Alexandre Ghiti <alexghiti@rivosinc.com> wrote:
>
> riscv used to allow direct access to cycle/time/instret counters,
> bypassing the perf framework, this patchset intends to allow the user to
> mmap any counter when accessed through perf. But we can't break the
> existing behaviour so we introduce a sysctl perf_user_access like arm64
> does, which defaults to the legacy mode described above.
>
> This version needs openSBI v1.3 *and* a kernel fix that went upstream lately
> (https://lore.kernel.org/lkml/20230616114831.3186980-1-maz@kernel.org/T/).
>
> **Important**: In this version, the default mode is now user access, not
> the legacy so some applications will break.
>
Maybe providing a little more context to the issue will help
understanding why breaking those legacy
applications is necessary due to security reasons.

Here was the previous discussion around this[1].
Most of the legacy user applications using rdcycle should use rdtime
instead as they just want to record the time
elapsed. Allowing rdcycle/rdinstret to be read from user space can
lead to very deterministic attacks[2].

[1] https://groups.google.com/a/groups.riscv.org/g/sw-dev/c/REWcwYnzsKE?pli=1
[2] https://www.youtube.com/watch?v=3-c4C_L2PRQ&ab_channel=IEEESymposiumonSecurityandPrivacy

Any user application that really requires to read rdcycle directly can
use this new perf interface to do that without any latency.

> base-commit-tag: v6.4-rc6
>
> Changes in v4:
> - Fixed some nits in riscv_pmu_sbi.c thanks to Andrew
> - Fixed the documentation thanks to Andrew
> - Added RB from Andrew \o/
>
> Changes in v3:
> - patch 1 now contains the ref to the faulty commit (no Fixes tag as it is only a comment), as Andrew suggested
> - Removed RISCV_PMU_LEGACY_TIME from patch 3, as Andrew suggested
> - Rename RISCV_PMU_PDEV_NAME to "riscv-pmu-sbi", patch4 is just cosmetic now, as Andrew suggested
> - Removed a few useless (and wrong) comments, as Andrew suggested
> - Simplify arch_perf_update_userpage code, as Andrew suggested
> - Documentation now mentions that time CSR is *always* accessible, whatever the mode, as suggested by Andrew
> - Removed CYCLEH reference and add TODO for rv32 support, as suggested by Atish
> - Do not rename the pmu instance as Atish suggested
> - Set pmc_width only if rdpmc is enabled and CONFIG_RISCV_PMU is set and the event is a hw event
> - Move arch_perf_update_userpage https://lore.kernel.org/lkml/20230616114831.3186980-1-maz@kernel.org/T/
> - **Switch to user mode access by default**
>
> Changes in v2:
> - Split into smaller patches, way better!
> - Add RB from Conor
> - Simplify the way we checked riscv architecture
> - Fix race mmap and other thread running on other cpus
> - Use hwc when available
> - Set all userspace access flags in event_init, too cumbersome to handle sysctl changes
> - Fix arch_perf_update_userpage for pmu other than riscv-pmu by renaming pmu driver
> - Fixed kernel test robot build error
> - Fixed documentation (Andrew and Bagas)
> - perf testsuite passes mmap tests in all 3 modes
>
> Alexandre Ghiti (10):
>   perf: Fix wrong comment about default event_idx
>   include: riscv: Fix wrong include guard in riscv_pmu.h
>   riscv: Make legacy counter enum match the HW numbering
>   drivers: perf: Rename riscv pmu sbi driver
>   riscv: Prepare for user-space perf event mmap support
>   drivers: perf: Implement perf event mmap support in the legacy backend
>   drivers: perf: Implement perf event mmap support in the SBI backend
>   Documentation: admin-guide: Add riscv sysctl_perf_user_access
>   tools: lib: perf: Implement riscv mmap support
>   perf: tests: Adapt mmap-basic.c for riscv
>
>  Documentation/admin-guide/sysctl/kernel.rst |  27 ++-
>  drivers/perf/riscv_pmu.c                    | 113 +++++++++++
>  drivers/perf/riscv_pmu_legacy.c             |  28 ++-
>  drivers/perf/riscv_pmu_sbi.c                | 196 +++++++++++++++++++-
>  include/linux/perf/riscv_pmu.h              |  12 +-
>  include/linux/perf_event.h                  |   3 +-
>  tools/lib/perf/mmap.c                       |  65 +++++++
>  tools/perf/tests/mmap-basic.c               |   4 +-
>  8 files changed, 428 insertions(+), 20 deletions(-)
>
> --
> 2.39.2
>
Rémi Denis-Courmont July 17, 2023, 1:35 p.m. UTC | #2
Hi,

Le 3 juillet 2023 15:46:37 GMT+03:00, Alexandre Ghiti <alexghiti@rivosinc.com> a écrit :
>riscv used to allow direct access to cycle/time/instret counters,
>bypassing the perf framework, this patchset intends to allow the user to
>mmap any counter when accessed through perf. But we can't break the
>existing behaviour so we introduce a sysctl perf_user_access like arm64
>does, which defaults to the legacy mode described above.

AFAIK, if the default settings breaks user space, the patchset is considered to break user space. That being the case, either this case is deemed special enough that breaking user space is OK, or it is not.

If it is not OK, then the only way out that I can think of, consists of trapping and emulating the counters, returning the same sanitised values that Linux perf would return. Then you can add a kernel config option to disable that trap-and-emulation code in the future.

Either way, I don't suppose that there should be an option to be insecurely backward compatible.

>
>This version needs openSBI v1.3 *and* a kernel fix that went upstream lately
>(https://lore.kernel.org/lkml/20230616114831.3186980-1-maz@kernel.org/T/).
>
>**Important**: In this version, the default mode is now user access, not
>the legacy so some applications will break.
>
>base-commit-tag: v6.4-rc6
>
>Changes in v4:
>- Fixed some nits in riscv_pmu_sbi.c thanks to Andrew
>- Fixed the documentation thanks to Andrew
>- Added RB from Andrew \o/
>
>Changes in v3:
>- patch 1 now contains the ref to the faulty commit (no Fixes tag as it is only a comment), as Andrew suggested
>- Removed RISCV_PMU_LEGACY_TIME from patch 3, as Andrew suggested
>- Rename RISCV_PMU_PDEV_NAME to "riscv-pmu-sbi", patch4 is just cosmetic now, as Andrew suggested
>- Removed a few useless (and wrong) comments, as Andrew suggested
>- Simplify arch_perf_update_userpage code, as Andrew suggested
>- Documentation now mentions that time CSR is *always* accessible, whatever the mode, as suggested by Andrew
>- Removed CYCLEH reference and add TODO for rv32 support, as suggested by Atish
>- Do not rename the pmu instance as Atish suggested
>- Set pmc_width only if rdpmc is enabled and CONFIG_RISCV_PMU is set and the event is a hw event
>- Move arch_perf_update_userpage https://lore.kernel.org/lkml/20230616114831.3186980-1-maz@kernel.org/T/
>- **Switch to user mode access by default**
>
>Changes in v2:
>- Split into smaller patches, way better!
>- Add RB from Conor
>- Simplify the way we checked riscv architecture
>- Fix race mmap and other thread running on other cpus
>- Use hwc when available
>- Set all userspace access flags in event_init, too cumbersome to handle sysctl changes
>- Fix arch_perf_update_userpage for pmu other than riscv-pmu by renaming pmu driver
>- Fixed kernel test robot build error
>- Fixed documentation (Andrew and Bagas)
>- perf testsuite passes mmap tests in all 3 modes
>
>Alexandre Ghiti (10):
>  perf: Fix wrong comment about default event_idx
>  include: riscv: Fix wrong include guard in riscv_pmu.h
>  riscv: Make legacy counter enum match the HW numbering
>  drivers: perf: Rename riscv pmu sbi driver
>  riscv: Prepare for user-space perf event mmap support
>  drivers: perf: Implement perf event mmap support in the legacy backend
>  drivers: perf: Implement perf event mmap support in the SBI backend
>  Documentation: admin-guide: Add riscv sysctl_perf_user_access
>  tools: lib: perf: Implement riscv mmap support
>  perf: tests: Adapt mmap-basic.c for riscv
>
> Documentation/admin-guide/sysctl/kernel.rst |  27 ++-
> drivers/perf/riscv_pmu.c                    | 113 +++++++++++
> drivers/perf/riscv_pmu_legacy.c             |  28 ++-
> drivers/perf/riscv_pmu_sbi.c                | 196 +++++++++++++++++++-
> include/linux/perf/riscv_pmu.h              |  12 +-
> include/linux/perf_event.h                  |   3 +-
> tools/lib/perf/mmap.c                       |  65 +++++++
> tools/perf/tests/mmap-basic.c               |   4 +-
> 8 files changed, 428 insertions(+), 20 deletions(-)
>
Atish Patra July 17, 2023, 11:22 p.m. UTC | #3
On Mon, Jul 17, 2023 at 6:38 AM Rémi Denis-Courmont <remi@remlab.net> wrote:
>
> Hi,
>
> Le 3 juillet 2023 15:46:37 GMT+03:00, Alexandre Ghiti <alexghiti@rivosinc.com> a écrit :
> >riscv used to allow direct access to cycle/time/instret counters,
> >bypassing the perf framework, this patchset intends to allow the user to
> >mmap any counter when accessed through perf. But we can't break the
> >existing behaviour so we introduce a sysctl perf_user_access like arm64
> >does, which defaults to the legacy mode described above.
>
> AFAIK, if the default settings breaks user space, the patchset is considered to break user space. That being the case, either this case is deemed special enough that breaking user space is OK, or it is not.
>

This case is a special case as the usage was incorrect in the first
place. Any application that genuinely requires rdcycle can always get
it now via the perf interface.
If the insecure and incorrect behavior is allowed, we suspect the user
space behavior will never be fixed as it is hard to put a future flag
date in these cases.
Given that the RISC-V eco-system is still young, it is better to
disallow this behavior to avoid further proliferation of the same
incorrect usage.

> If it is not OK, then the only way out that I can think of, consists of trapping and emulating the counters, returning the same sanitised values that Linux perf would return. Then you can add a kernel config option to disable that trap-and-emulation code in the future.
>

What do you mean by "sanitised" value ? If you mean the correct value,
it doesn't know when to stop providing that value (without reinventing
the entire context switch which the perf framework already does for
us).
If we just provide a dummy value, that doesn't help the current users
as well. They will probably get unexpected results instead of sigill.

I have cc'd a few folks who were involved in the initial discussions
when this issue was reported to collect the feedback.

> Either way, I don't suppose that there should be an option to be insecurely backward compatible.
>
> >
> >This version needs openSBI v1.3 *and* a kernel fix that went upstream lately
> >(https://lore.kernel.org/lkml/20230616114831.3186980-1-maz@kernel.org/T/).
> >
> >**Important**: In this version, the default mode is now user access, not
> >the legacy so some applications will break.
> >
> >base-commit-tag: v6.4-rc6
> >
> >Changes in v4:
> >- Fixed some nits in riscv_pmu_sbi.c thanks to Andrew
> >- Fixed the documentation thanks to Andrew
> >- Added RB from Andrew \o/
> >
> >Changes in v3:
> >- patch 1 now contains the ref to the faulty commit (no Fixes tag as it is only a comment), as Andrew suggested
> >- Removed RISCV_PMU_LEGACY_TIME from patch 3, as Andrew suggested
> >- Rename RISCV_PMU_PDEV_NAME to "riscv-pmu-sbi", patch4 is just cosmetic now, as Andrew suggested
> >- Removed a few useless (and wrong) comments, as Andrew suggested
> >- Simplify arch_perf_update_userpage code, as Andrew suggested
> >- Documentation now mentions that time CSR is *always* accessible, whatever the mode, as suggested by Andrew
> >- Removed CYCLEH reference and add TODO for rv32 support, as suggested by Atish
> >- Do not rename the pmu instance as Atish suggested
> >- Set pmc_width only if rdpmc is enabled and CONFIG_RISCV_PMU is set and the event is a hw event
> >- Move arch_perf_update_userpage https://lore.kernel.org/lkml/20230616114831.3186980-1-maz@kernel.org/T/
> >- **Switch to user mode access by default**
> >
> >Changes in v2:
> >- Split into smaller patches, way better!
> >- Add RB from Conor
> >- Simplify the way we checked riscv architecture
> >- Fix race mmap and other thread running on other cpus
> >- Use hwc when available
> >- Set all userspace access flags in event_init, too cumbersome to handle sysctl changes
> >- Fix arch_perf_update_userpage for pmu other than riscv-pmu by renaming pmu driver
> >- Fixed kernel test robot build error
> >- Fixed documentation (Andrew and Bagas)
> >- perf testsuite passes mmap tests in all 3 modes
> >
> >Alexandre Ghiti (10):
> >  perf: Fix wrong comment about default event_idx
> >  include: riscv: Fix wrong include guard in riscv_pmu.h
> >  riscv: Make legacy counter enum match the HW numbering
> >  drivers: perf: Rename riscv pmu sbi driver
> >  riscv: Prepare for user-space perf event mmap support
> >  drivers: perf: Implement perf event mmap support in the legacy backend
> >  drivers: perf: Implement perf event mmap support in the SBI backend
> >  Documentation: admin-guide: Add riscv sysctl_perf_user_access
> >  tools: lib: perf: Implement riscv mmap support
> >  perf: tests: Adapt mmap-basic.c for riscv
> >
> > Documentation/admin-guide/sysctl/kernel.rst |  27 ++-
> > drivers/perf/riscv_pmu.c                    | 113 +++++++++++
> > drivers/perf/riscv_pmu_legacy.c             |  28 ++-
> > drivers/perf/riscv_pmu_sbi.c                | 196 +++++++++++++++++++-
> > include/linux/perf/riscv_pmu.h              |  12 +-
> > include/linux/perf_event.h                  |   3 +-
> > tools/lib/perf/mmap.c                       |  65 +++++++
> > tools/perf/tests/mmap-basic.c               |   4 +-
> > 8 files changed, 428 insertions(+), 20 deletions(-)
> >
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
Rémi Denis-Courmont July 18, 2023, 5:06 p.m. UTC | #4
Hi,

Le tiistaina 18. heinäkuuta 2023, 2.22.54 EEST Atish Patra a écrit :
> > AFAIK, if the default settings breaks user space, the patchset is
> > considered to break user space. That being the case, either this case is
> > deemed special enough that breaking user space is OK, or it is not.

> This case is a special case as the usage was incorrect in the first
> place.

I agree that it's not only insecure but also incorrect. However it mostly 
works. In fact I don't disagree with the change as such, but I think that the 
commit messages are misleading and confusing. For a start, in one place it 
says that it is not breaking user space and in another it says basically the 
opposite.

(Unfortunately, not everybody agrees with the change. I can't seem to get 
FFmpeg's checkasm tool fixed:
http://ffmpeg.org/pipermail/ffmpeg-devel/2023-July/312245.html )

Also this is not the first time somebody argues that an API should be removed 
because it's broken. That's not special.

> Any application that genuinely requires rdcycle can always get
> it now via the perf interface.

Sure. But the question is whether it breaks user space and if so, whether 
that's acceptable. Existing apps that call RDCYCLE will now fail, presumbly 
receive SIGILL(?).

> If the insecure and incorrect behavior is allowed, we suspect the user
> space behavior will never be fixed as it is hard to put a future flag
> date in these cases.

For better or worse, I can only agree there. But then adding an option to 
preserve the broken behaviour is begging for people to (ab)use it, and indeed 
never fix the problem, and never be able to remove the option.

> > If it is not OK, then the only way out that I can think of, consists of
> > trapping and emulating the counters, returning the same sanitised values
> > that Linux perf would return. Then you can add a kernel config option to
> > disable that trap-and-emulation code in the future.
> What do you mean by "sanitised" value ?

I mean whatever avoids creating a security issue. Presumably report the number 
of cycles spent in the calling thread and in user mode since the first time 
that the process called RDCYCLE?

Maybe it's not reasonable for complexity or performance reasons, but then IMO, 
it deserves a little bit better explaining in the commit message.
Atish Patra July 18, 2023, 6:45 p.m. UTC | #5
On Tue, Jul 18, 2023 at 10:06 AM Rémi Denis-Courmont <remi@remlab.net> wrote:
>
>         Hi,
>
> Le tiistaina 18. heinäkuuta 2023, 2.22.54 EEST Atish Patra a écrit :
> > > AFAIK, if the default settings breaks user space, the patchset is
> > > considered to break user space. That being the case, either this case is
> > > deemed special enough that breaking user space is OK, or it is not.
>
> > This case is a special case as the usage was incorrect in the first
> > place.
>
> I agree that it's not only insecure but also incorrect. However it mostly
> works. In fact I don't disagree with the change as such, but I think that the
> commit messages are misleading and confusing. For a start, in one place it
> says that it is not breaking user space and in another it says basically the
> opposite.
>

Agreed. We will improve the commit message to clarify that. That's also the
reason I started this whole thread :)

> (Unfortunately, not everybody agrees with the change. I can't seem to get
> FFmpeg's checkasm tool fixed:
> http://ffmpeg.org/pipermail/ffmpeg-devel/2023-July/312245.html )
>

Why can't rdtime(equivalent of rdtsc) be used instead of rdcycle ?
What does it use in x86 ? It also doesn't allow reading cycle counter
by default.

The perf syscall overhead is just one time setup thing during the
start of the application.
For counting the cycles before/after a loop, it still provides a
direct CSR access in user mode.

> Also this is not the first time somebody argues that an API should be removed
> because it's broken. That's not special.
>
> > Any application that genuinely requires rdcycle can always get
> > it now via the perf interface.
>
> Sure. But the question is whether it breaks user space and if so, whether
> that's acceptable. Existing apps that call RDCYCLE will now fail, presumbly
> receive SIGILL(?).
>

Yes. With this changes it will receive SIGILL if the default is NO ACCESS.
You can change the sysctl parameter to enable the direct access though
and make it work though.

> > If the insecure and incorrect behavior is allowed, we suspect the user
> > space behavior will never be fixed as it is hard to put a future flag
> > date in these cases.
>
> For better or worse, I can only agree there. But then adding an option to
> preserve the broken behaviour is begging for people to (ab)use it, and indeed
> never fix the problem, and never be able to remove the option.
>

x86 still carries that option. So I don't think once get down path, it
will very difficult to remove it.

> > > If it is not OK, then the only way out that I can think of, consists of
> > > trapping and emulating the counters, returning the same sanitised values
> > > that Linux perf would return. Then you can add a kernel config option to
> > > disable that trap-and-emulation code in the future.
> > What do you mean by "sanitised" value ?
>
> I mean whatever avoids creating a security issue. Presumably report the number
> of cycles spent in the calling thread and in user mode since the first time
> that the process called RDCYCLE?
>
> Maybe it's not reasonable for complexity or performance reasons, but then IMO,
> it deserves a little bit better explaining in the commit message.
>

Yes. I believe the complexities and throwaway code (assuming we should
stop doing that in the long run)
is not worth it given that we have a perfectly valid interface via
perf without any performance sacrifice.
RISC-V is not the first one to do it. It is disabled by default for
ARM64/x86 as well.

If the application usage was legal and we have years of software
development relying on that, it might have
made sense (e.g. x86 legacy usage). However, RISC-V is still young to
avoid those pitfalls.

> --
> 雷米‧德尼-库尔蒙
> http://www.remlab.net/
>
>
>
Rémi Denis-Courmont July 18, 2023, 7:04 p.m. UTC | #6
Le tiistaina 18. heinäkuuta 2023, 21.45.15 EEST Atish Patra a écrit :
> > I agree that it's not only insecure but also incorrect. However it mostly
> > works. In fact I don't disagree with the change as such, but I think that
> > the commit messages are misleading and confusing. For a start, in one
> > place it says that it is not breaking user space and in another it says
> > basically the opposite.
> 
> Agreed. We will improve the commit message to clarify that. That's also the
> reason I started this whole thread :)
> 
> > (Unfortunately, not everybody agrees with the change. I can't seem to get
> > FFmpeg's checkasm tool fixed:
> > http://ffmpeg.org/pipermail/ffmpeg-devel/2023-July/312245.html )
> 
> Why can't rdtime(equivalent of rdtsc) be used instead of rdcycle ?

Isn't RDTIM susceptible to interference from power management and CPU 
frequency scaling? I suppose that RDCYCLE may behave differently depending on 
PM in *some* designs, but that would still be way better than RDTIME for the 
purpose.

As far as benchmarking is concerned (_excluding_ system security), RDTIME 
seems to have all the problems of RDCYCLE, and then some more, no?

> The perf syscall overhead is just one time setup thing during the
> start of the application.
> For counting the cycles before/after a loop, it still provides a
> direct CSR access in user mode.

I suppose that you allude to mmap() here? The (dumb) FFmpeg code is using the 
ioctl() interface though, but that's just laziness.
Atish Patra July 18, 2023, 10:48 p.m. UTC | #7
On Tue, Jul 18, 2023 at 12:04 PM Rémi Denis-Courmont <remi@remlab.net> wrote:
>
> Le tiistaina 18. heinäkuuta 2023, 21.45.15 EEST Atish Patra a écrit :
> > > I agree that it's not only insecure but also incorrect. However it mostly
> > > works. In fact I don't disagree with the change as such, but I think that
> > > the commit messages are misleading and confusing. For a start, in one
> > > place it says that it is not breaking user space and in another it says
> > > basically the opposite.
> >
> > Agreed. We will improve the commit message to clarify that. That's also the
> > reason I started this whole thread :)
> >
> > > (Unfortunately, not everybody agrees with the change. I can't seem to get
> > > FFmpeg's checkasm tool fixed:
> > > http://ffmpeg.org/pipermail/ffmpeg-devel/2023-July/312245.html )
> >
> > Why can't rdtime(equivalent of rdtsc) be used instead of rdcycle ?
>
> Isn't RDTIM susceptible to interference from power management and CPU
> frequency scaling? I suppose that RDCYCLE may behave differently depending on
> PM in *some* designs, but that would still be way better than RDTIME for the
> purpose.
>

Yes. But that's what it is probably using for other ISAs ? My point
was it should
just do whatever it does for other ISA. RISC-V is no special in that regard.

> As far as benchmarking is concerned (_excluding_ system security), RDTIME
> seems to have all the problems of RDCYCLE, and then some more, no?
>

Correct. If the application can deal with noisy data, it can use
rdtime similar to
other architectures. Otherwise, it should just use the perf mmap
interface in the beginning
and read the counters whenever required.

> > The perf syscall overhead is just one time setup thing during the
> > start of the application.
> > For counting the cycles before/after a loop, it still provides a
> > direct CSR access in user mode.
>
> I suppose that you allude to mmap() here? The (dumb) FFmpeg code is using the
> ioctl() interface though, but that's just laziness.
>

Yeah. I agree it is more work but just one time effort to get more
reliable data without
compromising the security.

> --
> レミ・デニ-クールモン
> http://www.remlab.net/
>
>
>
Rémi Denis-Courmont July 19, 2023, 2:45 p.m. UTC | #8
Le keskiviikkona 19. heinäkuuta 2023, 1.48.49 EEST Atish Patra a écrit :
> > Isn't RDTIM susceptible to interference from power management and CPU
> > frequency scaling? I suppose that RDCYCLE may behave differently depending
> > on PM in *some* designs, but that would still be way better than RDTIME
> > for the purpose.
> 
> Yes. But that's what it is probably using for other ISAs ?

At least on AArch64, it is using either Linux perf cycle counter, or if that 
is disabled at build time, the raw PMU cycle counter - which obviously leads 
to SIGILL on Linux, just like this MR would do with RDCYCLE.

Again, I do not *personally* have objections to disabling RDCYCLE for 
userspace (somebody else does, but that's neither my nor your problem). I do 
have objections to the wording of some of the commit messages though.

> My point was it should just do whatever it does for other ISA. RISC-V is no
> special in that regard.

Sure. My point is that RDTIME may be great for, so to say, system-level 
benchmarks. For FFmpeg that could something like how long it takes to 
transcode a video. But it doesn't seem to make much sense for 
microbenchmarking of single threaded tightly optimised loops, as opposed to 
RDCYCLE (or a wrapper for RDCYCLE).
Atish Patra July 19, 2023, 5:14 p.m. UTC | #9
On Wed, Jul 19, 2023 at 7:46 AM Rémi Denis-Courmont <remi@remlab.net> wrote:
>
> Le keskiviikkona 19. heinäkuuta 2023, 1.48.49 EEST Atish Patra a écrit :
> > > Isn't RDTIM susceptible to interference from power management and CPU
> > > frequency scaling? I suppose that RDCYCLE may behave differently depending
> > > on PM in *some* designs, but that would still be way better than RDTIME
> > > for the purpose.
> >
> > Yes. But that's what it is probably using for other ISAs ?
>
> At least on AArch64, it is using either Linux perf cycle counter, or if that
> is disabled at build time, the raw PMU cycle counter - which obviously leads
> to SIGILL on Linux, just like this MR would do with RDCYCLE.
>

Good to know. Thanks for the clarification.

> Again, I do not *personally* have objections to disabling RDCYCLE for
> userspace (somebody else does, but that's neither my nor your problem). I do
> have objections to the wording of some of the commit messages though.
>

Completely agreed. We will update the commit text with more clarification in v5.

> > My point was it should just do whatever it does for other ISA. RISC-V is no
> > special in that regard.
>
> Sure. My point is that RDTIME may be great for, so to say, system-level
> benchmarks. For FFmpeg that could something like how long it takes to
> transcode a video. But it doesn't seem to make much sense for
> microbenchmarking of single threaded tightly optimised loops, as opposed to
> RDCYCLE (or a wrapper for RDCYCLE).
>
> --
> Rémi Denis-Courmont
> http://www.remlab.net/
>
>
>
Alexandre Ghiti July 20, 2023, 9:13 a.m. UTC | #10
Hi Atish, Rémi,


On 19/07/2023 19:14, Atish Patra wrote:
> On Wed, Jul 19, 2023 at 7:46 AM Rémi Denis-Courmont <remi@remlab.net> wrote:
>> Le keskiviikkona 19. heinäkuuta 2023, 1.48.49 EEST Atish Patra a écrit :
>>>> Isn't RDTIM susceptible to interference from power management and CPU
>>>> frequency scaling? I suppose that RDCYCLE may behave differently depending
>>>> on PM in *some* designs, but that would still be way better than RDTIME
>>>> for the purpose.
>>> Yes. But that's what it is probably using for other ISAs ?
>> At least on AArch64, it is using either Linux perf cycle counter, or if that
>> is disabled at build time, the raw PMU cycle counter - which obviously leads
>> to SIGILL on Linux, just like this MR would do with RDCYCLE.
>>
> Good to know. Thanks for the clarification.
>
>> Again, I do not *personally* have objections to disabling RDCYCLE for
>> userspace (somebody else does, but that's neither my nor your problem). I do
>> have objections to the wording of some of the commit messages though.
>>
> Completely agreed. We will update the commit text with more clarification in v5.


Thanks for all your comments and sorry for being slow here. I will 
improve the commit logs in the next version, that's an oversight, sorry 
about that.

Thanks again,

Alex


>
>>> My point was it should just do whatever it does for other ISA. RISC-V is no
>>> special in that regard.
>> Sure. My point is that RDTIME may be great for, so to say, system-level
>> benchmarks. For FFmpeg that could something like how long it takes to
>> transcode a video. But it doesn't seem to make much sense for
>> microbenchmarking of single threaded tightly optimised loops, as opposed to
>> RDCYCLE (or a wrapper for RDCYCLE).
>>
>> --
>> Rémi Denis-Courmont
>> http://www.remlab.net/
>>
>>
>>
>