diff mbox series

[RFC,v3,6/7] gpu: nova-core: add basic timer device

Message ID 20250320-nova_timer-v3-6-79aa2ad25a79@nvidia.com (mailing list archive)
State New
Headers show
Series gpu: nova-core: register definitions and basic timer and falcon devices | expand

Commit Message

Alexandre Courbot March 20, 2025, 1:39 p.m. UTC
Add a basic timer device and exercise it during device probing. This
first draft is probably very questionable.

One point in particular which should IMHO receive attention: the generic
wait_on() method aims at providing similar functionality to Nouveau's
nvkm_[num]sec() macros. Since this method will be heavily used with
different conditions to test, I'd like to avoid monomorphizing it
entirely with each instance ; that's something that is achieved in
nvkm_xsec() using functions that the macros invoke.

I have tried achieving the same result in Rust using closures (kept
as-is in the current code), but they seem to be monomorphized as well.
Calling extra functions could work better, but looks also less elegant
to me, so I am really open to suggestions here.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
 drivers/gpu/nova-core/driver.rs    |   4 +-
 drivers/gpu/nova-core/gpu.rs       |  55 +++++++++++++++-
 drivers/gpu/nova-core/nova_core.rs |   1 +
 drivers/gpu/nova-core/regs.rs      |  11 ++++
 drivers/gpu/nova-core/timer.rs     | 132 +++++++++++++++++++++++++++++++++++++
 5 files changed, 201 insertions(+), 2 deletions(-)

Comments

Daniel Brooks March 20, 2025, 3:54 p.m. UTC | #1
Alexandre Courbot <acourbot@nvidia.com> writes:

> +impl Add<Duration> for Timestamp {
> +    type Output = Self;
> +
> +    fn add(mut self, rhs: Duration) -> Self::Output {
> +        let mut nanos = rhs.as_nanos();
> +        while nanos > u64::MAX as u128 {
> +            self.0 = self.0.wrapping_add(nanos as u64);
> +            nanos -= u64::MAX as u128;
> +        }
> +
> +        Timestamp(self.0.wrapping_add(nanos as u64))
> +    }
> +}

Perhaps I missed something, but couldn’t you simplify this method like
this:

    fn add(mut self, rhs: Duration) -> Self::Output {
        let stamp = self.0 as u128;
        Timestamp(stamp.wrapping_add(rhs.as_nanos()) as u64)
    }

db48x
Boqun Feng March 20, 2025, 6:17 p.m. UTC | #2
Hi Alexandre,

On Thu, Mar 20, 2025 at 10:39:14PM +0900, Alexandre Courbot wrote:
> Add a basic timer device and exercise it during device probing. This
> first draft is probably very questionable.
> 
> One point in particular which should IMHO receive attention: the generic
> wait_on() method aims at providing similar functionality to Nouveau's
> nvkm_[num]sec() macros. Since this method will be heavily used with
> different conditions to test, I'd like to avoid monomorphizing it
> entirely with each instance ; that's something that is achieved in
> nvkm_xsec() using functions that the macros invoke.
> 
> I have tried achieving the same result in Rust using closures (kept
> as-is in the current code), but they seem to be monomorphized as well.
> Calling extra functions could work better, but looks also less elegant
> to me, so I am really open to suggestions here.
> 
> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
> ---
>  drivers/gpu/nova-core/driver.rs    |   4 +-
>  drivers/gpu/nova-core/gpu.rs       |  55 +++++++++++++++-
>  drivers/gpu/nova-core/nova_core.rs |   1 +
>  drivers/gpu/nova-core/regs.rs      |  11 ++++
>  drivers/gpu/nova-core/timer.rs     | 132 +++++++++++++++++++++++++++++++++++++
>  5 files changed, 201 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
> index 63c19f140fbdd65d8fccf81669ac590807cc120f..0cd23aa306e4082405f480afc0530a41131485e7 100644
> --- a/drivers/gpu/nova-core/driver.rs
> +++ b/drivers/gpu/nova-core/driver.rs
> @@ -10,7 +10,7 @@ pub(crate) struct NovaCore {
>      pub(crate) gpu: Gpu,
>  }
>  
> -const BAR0_SIZE: usize = 8;
> +const BAR0_SIZE: usize = 0x9500;
>  pub(crate) type Bar0 = pci::Bar<BAR0_SIZE>;
>  
>  kernel::pci_device_table!(
> @@ -42,6 +42,8 @@ fn probe(pdev: &mut pci::Device, _info: &Self::IdInfo) -> Result<Pin<KBox<Self>>
>              GFP_KERNEL,
>          )?;
>  
> +        let _ = this.gpu.test_timer();
> +
>          Ok(this)
>      }
>  }
> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> index d96901e5c8eace1e7c57c77da7def209e8149cd3..f010d3152530b1cec032ca620e59de51a2fc1a13 100644
> --- a/drivers/gpu/nova-core/gpu.rs
> +++ b/drivers/gpu/nova-core/gpu.rs
> @@ -6,8 +6,10 @@
>  
>  use crate::driver::Bar0;
>  use crate::regs;
> +use crate::timer::Timer;
>  use crate::util;
>  use core::fmt;
> +use core::time::Duration;
>  

Since there is already a Delta type proposed to represent the timestamp
difference in kernel:

	https://lore.kernel.org/rust-for-linux/20250220070611.214262-4-fujita.tomonori@gmail.com/

, could you please make your work based on that and avoid using
core::time::Duration. Thanks!

>  macro_rules! define_chipset {
>      ({ $($variant:ident = $value:expr),* $(,)* }) =>
> @@ -179,6 +181,7 @@ pub(crate) struct Gpu {
>      /// MMIO mapping of PCI BAR 0
>      bar: Devres<Bar0>,
>      fw: Firmware,
> +    timer: Timer,
>  }
>  
[...]
> +
> +/// A timestamp with nanosecond granularity obtained from the GPU timer.
> +///
> +/// A timestamp can also be substracted to another in order to obtain a [`Duration`].
> +///
> +/// TODO: add Kunit tests!
> +#[derive(Debug, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)]
> +pub(crate) struct Timestamp(u64);
> +

Also an Instant type has been proposed and reviewed for a while:

	https://lore.kernel.org/rust-for-linux/20250220070611.214262-5-fujita.tomonori@gmail.com/

we should use that type instead of re-inventing the wheel here. Of
course, it's currently not quite working because Instant is only for
CLOCK_MONOTONIC. But there was a proposal to make `Instant` generic over
clock:

	https://lore.kernel.org/rust-for-linux/20230714-rust-time-v2-1-f5aed84218c4@asahilina.net/

if you follow that design, you can implement a `Instant<NovaGpu>`, where

    ipml Now for NovaGpu {
        fn now() -> Instant<Self> {
	    // your Timer::read() implementation.
	}
    }

Thanks!

Regards,
Boqun

> +impl Display for Timestamp {
> +    fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
> +        write!(f, "{}", self.0)
> +    }
> +}
> +
> +impl Add<Duration> for Timestamp {
> +    type Output = Self;
> +
> +    fn add(mut self, rhs: Duration) -> Self::Output {
> +        let mut nanos = rhs.as_nanos();
> +        while nanos > u64::MAX as u128 {
> +            self.0 = self.0.wrapping_add(nanos as u64);
> +            nanos -= u64::MAX as u128;
> +        }
> +
> +        Timestamp(self.0.wrapping_add(nanos as u64))
> +    }
> +}
> +
> +impl Sub for Timestamp {
> +    type Output = Duration;
> +
> +    fn sub(self, rhs: Self) -> Self::Output {
> +        Duration::from_nanos(self.0.wrapping_sub(rhs.0))
> +    }
> +}
> +
> +pub(crate) struct Timer {}
> +
> +impl Timer {
> +    pub(crate) fn new() -> Self {
> +        Self {}
> +    }
> +
> +    /// Read the current timer timestamp.
> +    pub(crate) fn read(&self, bar: &Bar0) -> Timestamp {
> +        loop {
> +            let hi = regs::PtimerTime1::read(bar);
> +            let lo = regs::PtimerTime0::read(bar);
> +
> +            if hi.hi() == regs::PtimerTime1::read(bar).hi() {
> +                return Timestamp(u64::from_u32s(hi.hi(), lo.lo()));
> +            }
> +        }
> +    }
> +
[...]
Alexandre Courbot March 21, 2025, 3:09 a.m. UTC | #3
On Fri Mar 21, 2025 at 12:54 AM JST, Daniel Brooks wrote:
> Alexandre Courbot <acourbot@nvidia.com> writes:
>
>> +impl Add<Duration> for Timestamp {
>> +    type Output = Self;
>> +
>> +    fn add(mut self, rhs: Duration) -> Self::Output {
>> +        let mut nanos = rhs.as_nanos();
>> +        while nanos > u64::MAX as u128 {
>> +            self.0 = self.0.wrapping_add(nanos as u64);
>> +            nanos -= u64::MAX as u128;
>> +        }
>> +
>> +        Timestamp(self.0.wrapping_add(nanos as u64))
>> +    }
>> +}
>
> Perhaps I missed something, but couldn’t you simplify this method like
> this:
>
>     fn add(mut self, rhs: Duration) -> Self::Output {
>         let stamp = self.0 as u128;
>         Timestamp(stamp.wrapping_add(rhs.as_nanos()) as u64)
>     }

You are absolutely right, this loop will just wrap self.0 to the same value
after the initial pass.

... or at least it would if there weren't two bugs in it. >_<

- It adds the lower part of nanos while substracting u64::MAX from rhs,
  which is completely wrong;
- Substracting u64::MAX is also wrong, it should be u64::MAX as u128 +
  1.

So thanks a lot for pointing this out!
Alexandre Courbot March 21, 2025, 5:41 a.m. UTC | #4
Hi Boqun,

On Fri Mar 21, 2025 at 3:17 AM JST, Boqun Feng wrote:
> Hi Alexandre,
>
> On Thu, Mar 20, 2025 at 10:39:14PM +0900, Alexandre Courbot wrote:
>> Add a basic timer device and exercise it during device probing. This
>> first draft is probably very questionable.
>> 
>> One point in particular which should IMHO receive attention: the generic
>> wait_on() method aims at providing similar functionality to Nouveau's
>> nvkm_[num]sec() macros. Since this method will be heavily used with
>> different conditions to test, I'd like to avoid monomorphizing it
>> entirely with each instance ; that's something that is achieved in
>> nvkm_xsec() using functions that the macros invoke.
>> 
>> I have tried achieving the same result in Rust using closures (kept
>> as-is in the current code), but they seem to be monomorphized as well.
>> Calling extra functions could work better, but looks also less elegant
>> to me, so I am really open to suggestions here.
>> 
>> Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
>> ---
>>  drivers/gpu/nova-core/driver.rs    |   4 +-
>>  drivers/gpu/nova-core/gpu.rs       |  55 +++++++++++++++-
>>  drivers/gpu/nova-core/nova_core.rs |   1 +
>>  drivers/gpu/nova-core/regs.rs      |  11 ++++
>>  drivers/gpu/nova-core/timer.rs     | 132 +++++++++++++++++++++++++++++++++++++
>>  5 files changed, 201 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
>> index 63c19f140fbdd65d8fccf81669ac590807cc120f..0cd23aa306e4082405f480afc0530a41131485e7 100644
>> --- a/drivers/gpu/nova-core/driver.rs
>> +++ b/drivers/gpu/nova-core/driver.rs
>> @@ -10,7 +10,7 @@ pub(crate) struct NovaCore {
>>      pub(crate) gpu: Gpu,
>>  }
>>  
>> -const BAR0_SIZE: usize = 8;
>> +const BAR0_SIZE: usize = 0x9500;
>>  pub(crate) type Bar0 = pci::Bar<BAR0_SIZE>;
>>  
>>  kernel::pci_device_table!(
>> @@ -42,6 +42,8 @@ fn probe(pdev: &mut pci::Device, _info: &Self::IdInfo) -> Result<Pin<KBox<Self>>
>>              GFP_KERNEL,
>>          )?;
>>  
>> +        let _ = this.gpu.test_timer();
>> +
>>          Ok(this)
>>      }
>>  }
>> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
>> index d96901e5c8eace1e7c57c77da7def209e8149cd3..f010d3152530b1cec032ca620e59de51a2fc1a13 100644
>> --- a/drivers/gpu/nova-core/gpu.rs
>> +++ b/drivers/gpu/nova-core/gpu.rs
>> @@ -6,8 +6,10 @@
>>  
>>  use crate::driver::Bar0;
>>  use crate::regs;
>> +use crate::timer::Timer;
>>  use crate::util;
>>  use core::fmt;
>> +use core::time::Duration;
>>  
>
> Since there is already a Delta type proposed to represent the timestamp
> difference in kernel:
>
> 	https://lore.kernel.org/rust-for-linux/20250220070611.214262-4-fujita.tomonori@gmail.com/
>
> , could you please make your work based on that and avoid using
> core::time::Duration. Thanks!
>
>>  macro_rules! define_chipset {
>>      ({ $($variant:ident = $value:expr),* $(,)* }) =>
>> @@ -179,6 +181,7 @@ pub(crate) struct Gpu {
>>      /// MMIO mapping of PCI BAR 0
>>      bar: Devres<Bar0>,
>>      fw: Firmware,
>> +    timer: Timer,
>>  }
>>  
> [...]
>> +
>> +/// A timestamp with nanosecond granularity obtained from the GPU timer.
>> +///
>> +/// A timestamp can also be substracted to another in order to obtain a [`Duration`].
>> +///
>> +/// TODO: add Kunit tests!
>> +#[derive(Debug, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)]
>> +pub(crate) struct Timestamp(u64);
>> +
>
> Also an Instant type has been proposed and reviewed for a while:
>
> 	https://lore.kernel.org/rust-for-linux/20250220070611.214262-5-fujita.tomonori@gmail.com/
>
> we should use that type instead of re-inventing the wheel here. Of
> course, it's currently not quite working because Instant is only for
> CLOCK_MONOTONIC. But there was a proposal to make `Instant` generic over
> clock:
>
> 	https://lore.kernel.org/rust-for-linux/20230714-rust-time-v2-1-f5aed84218c4@asahilina.net/
>
> if you follow that design, you can implement a `Instant<NovaGpu>`, where
>
>     ipml Now for NovaGpu {
>         fn now() -> Instant<Self> {
> 	    // your Timer::read() implementation.
> 	}
>     }

Ah, thanks for pointing this out. I'll keep track of these patches,
hopefully they get merged soon!
diff mbox series

Patch

diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index 63c19f140fbdd65d8fccf81669ac590807cc120f..0cd23aa306e4082405f480afc0530a41131485e7 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -10,7 +10,7 @@  pub(crate) struct NovaCore {
     pub(crate) gpu: Gpu,
 }
 
-const BAR0_SIZE: usize = 8;
+const BAR0_SIZE: usize = 0x9500;
 pub(crate) type Bar0 = pci::Bar<BAR0_SIZE>;
 
 kernel::pci_device_table!(
@@ -42,6 +42,8 @@  fn probe(pdev: &mut pci::Device, _info: &Self::IdInfo) -> Result<Pin<KBox<Self>>
             GFP_KERNEL,
         )?;
 
+        let _ = this.gpu.test_timer();
+
         Ok(this)
     }
 }
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index d96901e5c8eace1e7c57c77da7def209e8149cd3..f010d3152530b1cec032ca620e59de51a2fc1a13 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -6,8 +6,10 @@ 
 
 use crate::driver::Bar0;
 use crate::regs;
+use crate::timer::Timer;
 use crate::util;
 use core::fmt;
+use core::time::Duration;
 
 macro_rules! define_chipset {
     ({ $($variant:ident = $value:expr),* $(,)* }) =>
@@ -179,6 +181,7 @@  pub(crate) struct Gpu {
     /// MMIO mapping of PCI BAR 0
     bar: Devres<Bar0>,
     fw: Firmware,
+    timer: Timer,
 }
 
 impl Gpu {
@@ -194,6 +197,56 @@  pub(crate) fn new(pdev: &pci::Device, bar: Devres<Bar0>) -> Result<impl PinInit<
             spec.revision
         );
 
-        Ok(pin_init!(Self { spec, bar, fw }))
+        let timer = Timer::new();
+
+        Ok(pin_init!(Self {
+            spec,
+            bar,
+            fw,
+            timer,
+        }))
+    }
+
+    pub(crate) fn test_timer(&self) -> Result<()> {
+        pr_info!("testing timer subdev\n");
+        with_bar!(self.bar, |b| {
+            pr_info!("current timestamp: {}\n", self.timer.read(b))
+        })?;
+
+        if !matches!(
+            self.timer
+                .wait_on(&self.bar, Duration::from_millis(10), || Some(())),
+            Ok(())
+        ) {
+            pr_crit!("timer test failure\n");
+        }
+
+        let t1 = with_bar!(self.bar, |b| {
+            pr_info!("timestamp after immediate exit: {}\n", self.timer.read(b));
+            self.timer.read(b)
+        })?;
+
+        if self
+            .timer
+            .wait_on(&self.bar, Duration::from_millis(10), || Option::<()>::None)
+            != Err(ETIMEDOUT)
+        {
+            pr_crit!("timer test 2 failure\n");
+        }
+
+        let t2 = with_bar!(self.bar, |b| self.timer.read(b))?;
+        if t2 - t1 < Duration::from_millis(10) {
+            pr_crit!("timer test 3 failure\n");
+        }
+
+        with_bar!(self.bar, |b| {
+            pr_info!(
+                "timestamp after timeout: {} ({:?})\n",
+                self.timer.read(b),
+                t2 - t1
+            );
+        })?;
+
+        Ok(())
     }
 }
diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
index 94f4778c16f6a4d046c2f799129ed0cc68df6fd4..f54dcfc66490cb6b10090ef944ac14feca9f6972 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -18,6 +18,7 @@  macro_rules! with_bar {
 mod firmware;
 mod gpu;
 mod regs;
+mod timer;
 mod util;
 
 kernel::module_pci_driver! {
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 7bfd2b575fe2184565d495012e55cd0829b0b1ad..0d06e09b1ba62d55688c633500f37d3fe1aeb30e 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -11,3 +11,14 @@ 
     7:4     major_rev => as u8, "major revision of the chip";
     28:20   chipset => try_into Chipset, "chipset model"
 );
+
+/* PTIMER */
+
+register!(PtimerTime0@0x00009400;
+    31:0    lo => as u32, "low 32-bits of the timer"
+);
+
+register!(PtimerTime1@0x00009410;
+    31:0    hi => as u32, "high 32 bits of the timer"
+);
+
diff --git a/drivers/gpu/nova-core/timer.rs b/drivers/gpu/nova-core/timer.rs
new file mode 100644
index 0000000000000000000000000000000000000000..1361e4ad10d923ce114972889cdfcfa6e58a691b
--- /dev/null
+++ b/drivers/gpu/nova-core/timer.rs
@@ -0,0 +1,132 @@ 
+// SPDX-License-Identifier: GPL-2.0
+
+//! Nova Core Timer subdevice
+
+use core::fmt::Display;
+use core::ops::{Add, Sub};
+use core::time::Duration;
+
+use kernel::devres::Devres;
+use kernel::num::U64Ext;
+use kernel::prelude::*;
+
+use crate::driver::Bar0;
+use crate::regs;
+
+/// A timestamp with nanosecond granularity obtained from the GPU timer.
+///
+/// A timestamp can also be substracted to another in order to obtain a [`Duration`].
+///
+/// TODO: add Kunit tests!
+#[derive(Debug, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)]
+pub(crate) struct Timestamp(u64);
+
+impl Display for Timestamp {
+    fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
+        write!(f, "{}", self.0)
+    }
+}
+
+impl Add<Duration> for Timestamp {
+    type Output = Self;
+
+    fn add(mut self, rhs: Duration) -> Self::Output {
+        let mut nanos = rhs.as_nanos();
+        while nanos > u64::MAX as u128 {
+            self.0 = self.0.wrapping_add(nanos as u64);
+            nanos -= u64::MAX as u128;
+        }
+
+        Timestamp(self.0.wrapping_add(nanos as u64))
+    }
+}
+
+impl Sub for Timestamp {
+    type Output = Duration;
+
+    fn sub(self, rhs: Self) -> Self::Output {
+        Duration::from_nanos(self.0.wrapping_sub(rhs.0))
+    }
+}
+
+pub(crate) struct Timer {}
+
+impl Timer {
+    pub(crate) fn new() -> Self {
+        Self {}
+    }
+
+    /// Read the current timer timestamp.
+    pub(crate) fn read(&self, bar: &Bar0) -> Timestamp {
+        loop {
+            let hi = regs::PtimerTime1::read(bar);
+            let lo = regs::PtimerTime0::read(bar);
+
+            if hi.hi() == regs::PtimerTime1::read(bar).hi() {
+                return Timestamp(u64::from_u32s(hi.hi(), lo.lo()));
+            }
+        }
+    }
+
+    #[allow(dead_code)]
+    pub(crate) fn time(bar: &Bar0, time: u64) {
+        regs::PtimerTime1::default()
+            .set_hi(time.upper_32_bits())
+            .write(bar);
+        regs::PtimerTime0::default()
+            .set_lo(time.lower_32_bits())
+            .write(bar);
+    }
+
+    /// Wait until `cond` is true or `timeout` elapsed, based on GPU time.
+    ///
+    /// When `cond` evaluates to `Some`, its return value is returned.
+    ///
+    /// `Err(ETIMEDOUT)` is returned if `timeout` has been reached without `cond` evaluating to
+    /// `Some`, or if the timer device is stuck for some reason.
+    pub(crate) fn wait_on<R, F: Fn() -> Option<R>>(
+        &self,
+        bar: &Devres<Bar0>,
+        timeout: Duration,
+        cond: F,
+    ) -> Result<R> {
+        // Number of consecutive time reads after which we consider the timer frozen if it hasn't
+        // moved forward.
+        const MAX_STALLED_READS: usize = 16;
+
+        let (mut cur_time, mut prev_time, deadline) = {
+            let cur_time = with_bar!(bar, |b| self.read(b))?;
+            let deadline = cur_time + timeout;
+
+            (cur_time, cur_time, deadline)
+        };
+        let mut num_reads = 0;
+
+        loop {
+            if let Some(ret) = cond() {
+                return Ok(ret);
+            }
+
+            (|| {
+                cur_time = with_bar!(bar, |b| self.read(b))?;
+
+                /* Check if the timer is frozen for some reason. */
+                if cur_time == prev_time {
+                    if num_reads >= MAX_STALLED_READS {
+                        return Err(ETIMEDOUT);
+                    }
+                    num_reads += 1;
+                } else {
+                    if cur_time >= deadline {
+                        return Err(ETIMEDOUT);
+                    }
+
+                    num_reads = 0;
+                    prev_time = cur_time;
+                }
+
+                Ok(())
+            })()?;
+        }
+    }
+}