Message ID | 20240311063018.1886757-1-dqfext@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | perf: RISC-V: fix IRQ detection on T-Head C908 | expand |
On Mon, Mar 11, 2024 at 02:30:18PM +0800, Qingfang Deng wrote: > T-Head C908 has the same IRQ num and CSR as previous C9xx cores, but > reports non-zero marchid and mimpid. Remove the ID checks. > Hi, Qingfang, IIRC, the existed C908 SoC (such as K230) have an early version of C908 core. But C908 core itself may support Sscofpmf. So I do not think removing the ID checks is a good idea. Instead, I suggest adding CPUID of your SoC to this check. Regards, Inochi > Fixes: 65e9fb081877 ("drivers/perf: riscv_pmu_sbi: add support for PMU variant on T-Head C9xx cores") > Signed-off-by: Qingfang Deng <dqfext@gmail.com> > --- > arch/riscv/errata/thead/errata.c | 4 ---- > drivers/perf/riscv_pmu_sbi.c | 4 +--- > 2 files changed, 1 insertion(+), 7 deletions(-) > > diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c > index b1c410bbc1ae..49ccad5b21bb 100644 > --- a/arch/riscv/errata/thead/errata.c > +++ b/arch/riscv/errata/thead/errata.c > @@ -125,10 +125,6 @@ static bool errata_probe_pmu(unsigned int stage, > if (!IS_ENABLED(CONFIG_ERRATA_THEAD_PMU)) > return false; > > - /* target-c9xx cores report arch_id and impid as 0 */ > - if (arch_id != 0 || impid != 0) > - return false; > - > if (stage == RISCV_ALTERNATIVES_EARLY_BOOT) > return false; > > diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c > index 452aab49db1e..87b83184383a 100644 > --- a/drivers/perf/riscv_pmu_sbi.c > +++ b/drivers/perf/riscv_pmu_sbi.c > @@ -811,9 +811,7 @@ static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pde > riscv_pmu_irq_num = RV_IRQ_PMU; > riscv_pmu_use_irq = true; > } else if (IS_ENABLED(CONFIG_ERRATA_THEAD_PMU) && > - riscv_cached_mvendorid(0) == THEAD_VENDOR_ID && > - riscv_cached_marchid(0) == 0 && > - riscv_cached_mimpid(0) == 0) { > + riscv_cached_mvendorid(0) == THEAD_VENDOR_ID) { > riscv_pmu_irq_num = THEAD_C9XX_RV_IRQ_PMU; > riscv_pmu_use_irq = true; > } > -- > 2.34.1 > > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv
Hi Inochi, On Mon, Mar 11, 2024 at 3:13 PM Inochi Amaoto <inochiama@outlook.com> wrote: > > On Mon, Mar 11, 2024 at 02:30:18PM +0800, Qingfang Deng wrote: > > T-Head C908 has the same IRQ num and CSR as previous C9xx cores, but > > reports non-zero marchid and mimpid. Remove the ID checks. > > > > Hi, Qingfang, > > IIRC, the existed C908 SoC (such as K230) have an early version > of C908 core. But C908 core itself may support Sscofpmf. > So I do not think removing the ID checks is a good idea. Instead, > I suggest adding CPUID of your SoC to this check. As of Feb 2024, the latest C908 revision does not support Sscofpmf. You may Google "C908R1S0" to see its user manual. But I think you're right. Even though C908 does not have Sscofpmf, T-Head may release new SoCs which do have Sscofpmf, and the check will break. I will submit a new patch with your suggested changes. Regards, Qingfang > > Regards, > Inochi >
On Mon, Mar 11, 2024 at 03:56:29PM +0800, Qingfang Deng wrote: > Hi Inochi, > > On Mon, Mar 11, 2024 at 3:13 PM Inochi Amaoto <inochiama@outlook.com> wrote: > > > > On Mon, Mar 11, 2024 at 02:30:18PM +0800, Qingfang Deng wrote: > > > T-Head C908 has the same IRQ num and CSR as previous C9xx cores, but > > > reports non-zero marchid and mimpid. Remove the ID checks. > > > > > > > Hi, Qingfang, > > > > IIRC, the existed C908 SoC (such as K230) have an early version > > of C908 core. But C908 core itself may support Sscofpmf. > > So I do not think removing the ID checks is a good idea. Instead, > > I suggest adding CPUID of your SoC to this check. > > As of Feb 2024, the latest C908 revision does not support Sscofpmf. > You may Google "C908R1S0" to see its user manual. > But I think you're right. Even though C908 does not have Sscofpmf, > T-Head may release new SoCs which do have Sscofpmf, and the check will > break. I will submit a new patch with your suggested changes. If on an SoC where they have updated vector to 1.0 and implemented both Zicbom and Svpbmt instead of their custom stuff they did not implement Sscofpmf I think we can expect they won't move away from their custom implementation soon. I do agree that we should not remove the ID checks entirely, but I also do not want to be adding an ID for every SoC that needs this. I think we should be getting this information from DT going forward. The DT parsing is done prior to the application of boot time alternatives, so I think we could apply the "erratum" based on the DT. I'm also pretty sure that we can also modify the existing code for the archid == impid == 0x0 case to set a pseudo isa extension so that the perf driver could do call riscv_isa_eextension_available() and not worry about the specfic conditions in which that is true. It'd be something like this patch: https://lore.kernel.org/linux-riscv/20240110073917.2398826-8-peterlin@andestech.com/ Just without removing the archid == impid == 0x0 case from the errata code. If you're lost after reading that, I can probably throw together some untested code for it. Thanks, Conor.
On Tue, Mar 12, 2024 at 02:07:31PM +0000, Conor Dooley wrote: > On Mon, Mar 11, 2024 at 03:56:29PM +0800, Qingfang Deng wrote: > > Hi Inochi, > > > > On Mon, Mar 11, 2024 at 3:13 PM Inochi Amaoto <inochiama@outlook.com> wrote: > > > > > > On Mon, Mar 11, 2024 at 02:30:18PM +0800, Qingfang Deng wrote: > > > > T-Head C908 has the same IRQ num and CSR as previous C9xx cores, but > > > > reports non-zero marchid and mimpid. Remove the ID checks. > > > > > > > > > > Hi, Qingfang, > > > > > > IIRC, the existed C908 SoC (such as K230) have an early version > > > of C908 core. But C908 core itself may support Sscofpmf. > > > So I do not think removing the ID checks is a good idea. Instead, > > > I suggest adding CPUID of your SoC to this check. > > > > As of Feb 2024, the latest C908 revision does not support Sscofpmf. > > You may Google "C908R1S0" to see its user manual. > > But I think you're right. Even though C908 does not have Sscofpmf, > > T-Head may release new SoCs which do have Sscofpmf, and the check will > > break. I will submit a new patch with your suggested changes. > > If on an SoC where they have updated vector to 1.0 and implemented both > Zicbom and Svpbmt instead of their custom stuff they did not implement > Sscofpmf I think we can expect they won't move away from their custom > implementation soon. > I do agree that we should not remove the ID checks entirely, but I also > do not want to be adding an ID for every SoC that needs this. I think we > should be getting this information from DT going forward. > The DT parsing is done prior to the application of boot time > alternatives, so I think we could apply the "erratum" based on the DT. > > I'm also pretty sure that we can also modify the existing code for the > archid == impid == 0x0 case to set a pseudo isa extension so that the > perf driver could do call riscv_isa_eextension_available() and not worry > about the specfic conditions in which that is true. It'd be something > like this patch: > https://lore.kernel.org/linux-riscv/20240110073917.2398826-8-peterlin@andestech.com/ > Just without removing the archid == impid == 0x0 case from the errata > code. If you're lost after reading that, I can probably throw together > some untested code for it. > > Thanks, > Conor. I agree to use something to replace the existing check, but using a pseudo isa extension is not a good idea. There are two reasons: 1. Pseudo isa is misleading. As it is not the real isa, setting this in isa list may make userspace think errata a feather. 2. Using pseudo isa is more like an abuse of reserved isa bits, which means kernel may need infinite bits to handle the errata. IMHO, it may be better to use a new DT property like "riscv,cpu-errata" or "<vendor>,cpu-errata". It can achieve almost everything like using pseudo isa. And the only cost I think is a small amount code to parse this. Regards, Inochi.
On Wed, Mar 13, 2024 at 09:31:26AM +0800, Inochi Amaoto wrote: > On Tue, Mar 12, 2024 at 02:07:31PM +0000, Conor Dooley wrote: > > On Mon, Mar 11, 2024 at 03:56:29PM +0800, Qingfang Deng wrote: > > > Hi Inochi, > > > > > > On Mon, Mar 11, 2024 at 3:13 PM Inochi Amaoto <inochiama@outlook.com> wrote: > > > > > > > > On Mon, Mar 11, 2024 at 02:30:18PM +0800, Qingfang Deng wrote: > > > > > T-Head C908 has the same IRQ num and CSR as previous C9xx cores, but > > > > > reports non-zero marchid and mimpid. Remove the ID checks. > > > > > > > > > > > > > Hi, Qingfang, > > > > > > > > IIRC, the existed C908 SoC (such as K230) have an early version > > > > of C908 core. But C908 core itself may support Sscofpmf. > > > > So I do not think removing the ID checks is a good idea. Instead, > > > > I suggest adding CPUID of your SoC to this check. > > > > > > As of Feb 2024, the latest C908 revision does not support Sscofpmf. > > > You may Google "C908R1S0" to see its user manual. > > > But I think you're right. Even though C908 does not have Sscofpmf, > > > T-Head may release new SoCs which do have Sscofpmf, and the check will > > > break. I will submit a new patch with your suggested changes. > > > > If on an SoC where they have updated vector to 1.0 and implemented both > > Zicbom and Svpbmt instead of their custom stuff they did not implement > > Sscofpmf I think we can expect they won't move away from their custom > > implementation soon. > > I do agree that we should not remove the ID checks entirely, but I also > > do not want to be adding an ID for every SoC that needs this. I think we > > should be getting this information from DT going forward. > > The DT parsing is done prior to the application of boot time > > alternatives, so I think we could apply the "erratum" based on the DT. > > > > I'm also pretty sure that we can also modify the existing code for the > > archid == impid == 0x0 case to set a pseudo isa extension so that the > > perf driver could do call riscv_isa_eextension_available() and not worry > > about the specfic conditions in which that is true. It'd be something > > like this patch: > > https://lore.kernel.org/linux-riscv/20240110073917.2398826-8-peterlin@andestech.com/ > > Just without removing the archid == impid == 0x0 case from the errata > > code. If you're lost after reading that, I can probably throw together > > some untested code for it. > > I agree to use something to replace the existing check, but using a pseudo > isa extension is not a good idea. There are two reasons: > > 1. Pseudo isa is misleading. As it is not the real isa, setting this in isa > list may make userspace think errata a feather. If we wanted to suppress sharing this information to userspace we could, but I don't even see what the harm is in userspace knowing. I also disagree with this even really being an erratum in the first place as they implemented something before a spec was created. It's not an implementation of Sscofpmf with a bug. I don't think that calling it a "real" vendor extension is problematic either, just I am the one calling it that, not T-Head themselves. > 2. Using pseudo isa is more like an abuse of reserved isa bits, which means > kernel may need infinite bits to handle the errata. What makes it an abuse of a "reserved" bit? There's no rule on disallowing something vendor-specific there. Given the amount of standard ISA extensions that are being created compared to CPU errata, I don't think that I'm particularly worried about vendor specific stuff being the reason for infinite bits being needed either. If there do come to be issues with the number of extensions we care about, we could split things per vendor I suppose. > IMHO, it may be better to use a new DT property like "riscv,cpu-errata" or > "<vendor>,cpu-errata". It can achieve almost everything like using pseudo > isa. And the only cost I think is a small amount code to parse this. I suppose we could do that, but accounting for vendor specifics was one of the goals for the property I only just added and that I am suggesting to use here.
On Thu, Mar 14, 2024 at 08:41:15PM +0000, Conor Dooley wrote: > On Wed, Mar 13, 2024 at 09:31:26AM +0800, Inochi Amaoto wrote: > > On Tue, Mar 12, 2024 at 02:07:31PM +0000, Conor Dooley wrote: > > > On Mon, Mar 11, 2024 at 03:56:29PM +0800, Qingfang Deng wrote: > > > > Hi Inochi, > > > > > > > > On Mon, Mar 11, 2024 at 3:13 PM Inochi Amaoto <inochiama@outlook.com> wrote: > > > > > > > > > > On Mon, Mar 11, 2024 at 02:30:18PM +0800, Qingfang Deng wrote: > > > > > > T-Head C908 has the same IRQ num and CSR as previous C9xx cores, but > > > > > > reports non-zero marchid and mimpid. Remove the ID checks. > > > > > > > > > > > > > > > > Hi, Qingfang, > > > > > > > > > > IIRC, the existed C908 SoC (such as K230) have an early version > > > > > of C908 core. But C908 core itself may support Sscofpmf. > > > > > So I do not think removing the ID checks is a good idea. Instead, > > > > > I suggest adding CPUID of your SoC to this check. > > > > > > > > As of Feb 2024, the latest C908 revision does not support Sscofpmf. > > > > You may Google "C908R1S0" to see its user manual. > > > > But I think you're right. Even though C908 does not have Sscofpmf, > > > > T-Head may release new SoCs which do have Sscofpmf, and the check will > > > > break. I will submit a new patch with your suggested changes. > > > > > > If on an SoC where they have updated vector to 1.0 and implemented both > > > Zicbom and Svpbmt instead of their custom stuff they did not implement > > > Sscofpmf I think we can expect they won't move away from their custom > > > implementation soon. > > > I do agree that we should not remove the ID checks entirely, but I also > > > do not want to be adding an ID for every SoC that needs this. I think we > > > should be getting this information from DT going forward. > > > The DT parsing is done prior to the application of boot time > > > alternatives, so I think we could apply the "erratum" based on the DT. > > > > > > I'm also pretty sure that we can also modify the existing code for the > > > archid == impid == 0x0 case to set a pseudo isa extension so that the > > > perf driver could do call riscv_isa_eextension_available() and not worry > > > about the specfic conditions in which that is true. It'd be something > > > like this patch: > > > https://lore.kernel.org/linux-riscv/20240110073917.2398826-8-peterlin@andestech.com/ > > > Just without removing the archid == impid == 0x0 case from the errata > > > code. If you're lost after reading that, I can probably throw together > > > some untested code for it. > > > > I agree to use something to replace the existing check, but using a pseudo > > isa extension is not a good idea. There are two reasons: > > > > 1. Pseudo isa is misleading. As it is not the real isa, setting this in isa > > list may make userspace think errata a feather. > > If we wanted to suppress sharing this information to userspace we could, > but I don't even see what the harm is in userspace knowing. > I also disagree with this even really being an erratum in the first place > as they implemented something before a spec was created. It's not an > implementation of Sscofpmf with a bug. I don't think that calling it a > "real" vendor extension is problematic either, just I am the one calling > it that, not T-Head themselves. > I agree with that not implement standard extension is not a bug. And it is necessary to let userspace know if the core have behavior that is different from the standard. In fact, I think it may need a different name to describe this difference, "errata" is kind of unsuitable. > > 2. Using pseudo isa is more like an abuse of reserved isa bits, which means > > kernel may need infinite bits to handle the errata. > > What makes it an abuse of a "reserved" bit? There's no rule on > disallowing something vendor-specific there. Given the amount of standard > ISA extensions that are being created compared to CPU errata, I don't > think that I'm particularly worried about vendor specific stuff being > the reason for infinite bits being needed either. As the cpus evolve, it may have so many different bug. This is why I say it may need infinite bits. In fact, it is just a extreme case. > > If there do come to be issues with the number of extensions we care > about, we could split things per vendor I suppose. > Instead of reusing, it may be better to store them separately. But if we store them separately, why not to use a different DT property to describe them? This is what I want to say. > > IMHO, it may be better to use a new DT property like "riscv,cpu-errata" or > > "<vendor>,cpu-errata". It can achieve almost everything like using pseudo > > isa. And the only cost I think is a small amount code to parse this. > > I suppose we could do that, but accounting for vendor specifics was one > of the goals for the property I only just added and that I am suggesting > to use here.
On Wed, Mar 13, 2024 at 09:31:26AM +0800, Inochi Amaoto wrote: ... > IMHO, it may be better to use a new DT property like "riscv,cpu-errata" or > "<vendor>,cpu-errata". It can achieve almost everything like using pseudo > isa. And the only cost I think is a small amount code to parse this. > What's the ACPI equivalent for this new DT property? If there isn't one, then the cost is also to introduce something to the ACPI spec and add the ACPI parsing code. I'd much rather we call specified behaviors "extensions", whether they are vendor-specific or RVI standard, and then treat all extensions the same way in hardware descriptions and Linux. It'd also be best if errata in extension implementations were handled by replacing the extension in the hardware description with a new name which is specifically for the behavior Linux should expect. (Just because two extensions are almost the same doesn't mean we should say we have one and then have some second mechanism to say, "well, not really, instead of that, it's this". It's cleaner to just remove the extension it doesn't properly implement from its hardware description and create a name for the behavior it does have.) Errata in behaviors which don't have extension names (are hopefully few) and are where mvendorid and friends would need to be checked, but then why not create a pseudo extension name, as Conor suggests, so the rest of Linux code can manage errata the same way it manages every other behavior? The growth rate of the ISA bitmap is worth thinking about, though, since we have several copies of it (at least one "all harts" bitmap, one bitmap for each hart, another one for each vcpu, and then there's nested virt...) We don't have enough extensions to worry about it now, but we can eventually try partitioning, using common maps for common bits, not storing bits which can be inferred from other bits, etc. Thanks, drew
On Fri, Mar 15, 2024 at 09:11:35AM +0100, Andrew Jones wrote: > On Wed, Mar 13, 2024 at 09:31:26AM +0800, Inochi Amaoto wrote: > ... > > IMHO, it may be better to use a new DT property like "riscv,cpu-errata" or > > "<vendor>,cpu-errata". It can achieve almost everything like using pseudo > > isa. And the only cost I think is a small amount code to parse this. > > > > What's the ACPI equivalent for this new DT property? If there isn't one, > then the cost is also to introduce something to the ACPI spec and add the > ACPI parsing code. > > I'd much rather we call specified behaviors "extensions", whether they > are vendor-specific or RVI standard, and then treat all extensions the > same way in hardware descriptions and Linux. It'd also be best if errata > in extension implementations were handled by replacing the extension in > the hardware description with a new name which is specifically for the > behavior Linux should expect. (Just because two extensions are almost the > same doesn't mean we should say we have one and then have some second > mechanism to say, "well, not really, instead of that, it's this". It's > cleaner to just remove the extension it doesn't properly implement from > its hardware description and create a name for the behavior it does have.) > > Errata in behaviors which don't have extension names (are hopefully few) > and are where mvendorid and friends would need to be checked, but then why > not create a pseudo extension name, as Conor suggests, so the rest of > Linux code can manage errata the same way it manages every other behavior? > You are right, the errata have no difference from extension in real: they both change the behaviors of processor. Meanwhile, the ACPI is the problem, it is better to reuse existing property. At least for now, using pseudo extension is OK for me. > The growth rate of the ISA bitmap is worth thinking about, though, since > we have several copies of it (at least one "all harts" bitmap, one bitmap > for each hart, another one for each vcpu, and then there's nested virt...) > We don't have enough extensions to worry about it now, but we can > eventually try partitioning, using common maps for common bits, not > storing bits which can be inferred from other bits, etc. > > Thanks, > drew
On 3/15/24 01:11, Andrew Jones wrote: > On Wed, Mar 13, 2024 at 09:31:26AM +0800, Inochi Amaoto wrote: > ... >> IMHO, it may be better to use a new DT property like "riscv,cpu-errata" or >> "<vendor>,cpu-errata". It can achieve almost everything like using pseudo >> isa. And the only cost I think is a small amount code to parse this. >> > > What's the ACPI equivalent for this new DT property? If there isn't one, > then the cost is also to introduce something to the ACPI spec and add the > ACPI parsing code. > > I'd much rather we call specified behaviors "extensions", whether they > are vendor-specific or RVI standard, and then treat all extensions the > same way in hardware descriptions and Linux. It'd also be best if errata > in extension implementations were handled by replacing the extension in > the hardware description with a new name which is specifically for the > behavior Linux should expect. (Just because two extensions are almost the > same doesn't mean we should say we have one and then have some second > mechanism to say, "well, not really, instead of that, it's this". It's > cleaner to just remove the extension it doesn't properly implement from > its hardware description and create a name for the behavior it does have.) > > Errata in behaviors which don't have extension names (are hopefully few) > and are where mvendorid and friends would need to be checked, but then why > not create a pseudo extension name, as Conor suggests, so the rest of > Linux code can manage errata the same way it manages every other behavior? > > The growth rate of the ISA bitmap is worth thinking about, though, since > we have several copies of it (at least one "all harts" bitmap, one bitmap > for each hart, another one for each vcpu, and then there's nested virt...) > We don't have enough extensions to worry about it now, but we can > eventually try partitioning, using common maps for common bits, not > storing bits which can be inferred from other bits, etc. This is my biggest worry going forward. We already have a ever growing standard RVI extension list. On top of that we have genuine vendor extensions. IMHO, errata are bit different than extensions as there will be few vendor extensions in the future but many hardware erratas :) If we start calling every hardware errata as an pseudo ISA extensions, we will much bigger problem maintaining it in the future. We discussed this earlier during the Andes PMU extension series[1] as well. We have three types of extensions in discussions now. 1. standard RVI extensions 2. Vendor extensions a. Genuine vendor extension b. Vendor erratas which can be described as pseudo-extensions now Keeping all these within a single ISA bitmap space seems very odd to me. I think the feasible approach would be to partition the standard and vendor ISA extension space as you suggested. For 2.b, either we can start defining pseudo extensions or adding vendor/arch/impid checks. @Conor: You seems to prefer the earlier approach instead of adding the checks. Care to elaborate why do you think that's a better method compared to a simple check ? I agree that don't have the crystal ball and may be proven wrong in the future (I will be definitely happy about that!). But given the diversity of RISC-V ecosystem, I feel that may be our sad reality. [1] https://lore.kernel.org/linux-riscv/20240110073917.2398826-8-peterlin@andestech.com/ > > Thanks, > drew > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv
On Mon, Mar 18, 2024 at 03:46:54PM -0700, Atish Patra wrote: > On 3/15/24 01:11, Andrew Jones wrote: > > On Wed, Mar 13, 2024 at 09:31:26AM +0800, Inochi Amaoto wrote: > > ... > > > IMHO, it may be better to use a new DT property like "riscv,cpu-errata" or > > > "<vendor>,cpu-errata". It can achieve almost everything like using pseudo > > > isa. And the only cost I think is a small amount code to parse this. > > > > > > > What's the ACPI equivalent for this new DT property? If there isn't one, > > then the cost is also to introduce something to the ACPI spec and add the > > ACPI parsing code. > > > > I'd much rather we call specified behaviors "extensions", whether they > > are vendor-specific or RVI standard, and then treat all extensions the > > same way in hardware descriptions and Linux. It'd also be best if errata > > in extension implementations were handled by replacing the extension in > > the hardware description with a new name which is specifically for the > > behavior Linux should expect. (Just because two extensions are almost the > > same doesn't mean we should say we have one and then have some second > > mechanism to say, "well, not really, instead of that, it's this". It's > > cleaner to just remove the extension it doesn't properly implement from > > its hardware description and create a name for the behavior it does have.) > > > > Errata in behaviors which don't have extension names (are hopefully few) > > and are where mvendorid and friends would need to be checked, but then why > > not create a pseudo extension name, as Conor suggests, so the rest of > > Linux code can manage errata the same way it manages every other behavior? > > > > The growth rate of the ISA bitmap is worth thinking about, though, since > > we have several copies of it (at least one "all harts" bitmap, one bitmap > > for each hart, another one for each vcpu, and then there's nested virt...) > > We don't have enough extensions to worry about it now, but we can > > eventually try partitioning, using common maps for common bits, not > > storing bits which can be inferred from other bits, etc. > > This is my biggest worry going forward. We already have a ever growing > standard RVI extension list. On top of that we have genuine vendor > extensions. IMHO, errata are bit different than extensions as there will be > few vendor extensions in the future but many hardware erratas :) I dunno, I think there's going to be plenty of both. We may not see (or use) a lot of vendor extensions in mainline Linux, but they will exist. > If we start calling every hardware errata as an pseudo ISA extensions, we > will much bigger problem maintaining it in the future. I've explained to you at least once already that this is not my goal. Where there are genuine issues with the implementation of an extension creating a "pseudo" extension is not what I am suggesting we do. I have no problem with with the approach taken for the SiFive errata, for example. > We discussed this earlier during the Andes PMU extension series[1] as well. > We have three types of extensions in discussions now. > > 1. standard RVI extensions > 2. Vendor extensions > a. Genuine vendor extension > b. Vendor erratas which can be described as pseudo-extensions now > Keeping all these within a single ISA bitmap space seems very odd to me. > I think the feasible approach would be to partition the standard and vendor > ISA extension space as you suggested. Let's be clear - partitioning the space is unrelated to the detection method. We can go ahead and partition the space and use "pseudo" extensions or we can have a unified space but use archid/impid for detection. Having a unified space is the simpler thing to implement right now, but it totally does not stop us breaking them out in the future. We could even gate these custom implementations behind config options if bloat is a concern - but multiplatform kernels are likely to enable all the options anyway. > For 2.b, either we can start defining pseudo extensions or adding > vendor/arch/impid checks. > > @Conor: You seems to prefer the earlier approach instead of adding the > checks. Care to elaborate why do you think that's a better method compared > to a simple check ? Because I don't think that describing these as "errata" in the first place is even accurate. This is not a case of a vendor claiming they have Sscofpmf support but the implementation is flawed. As far as I understand, this is a vendor creating a useful feature prior to the creation of a standard extension. A bit of a test for this could be "If the standard extension never existed, would this be considered a new feature or an implementation issue". I think this is pretty clearly in the former camp. I do not think we should be using m*id detection implementations of a feature prior to creation of a standard extension for the same purpose. To me the main difference between a case like this and VentanaCondOps/Zicond is that we are the ones calling this an extension (hence my use of pseudo) and not the vendor of the IP. If T-Head were to publish a document tomorrow on the T-Head github repo for official vendor extensions, that difference would not even exist any longer. I also do not believe that it is a "simple" check. The number of implementations that could end up using this PMU could just balloon if T-Head has no intention of switching to Sscofpmf. If they don't balloon in this case, there's nothing stopping them ballooning in a similar case in the future. We should let the platform firmware tell us explicitly, be that via DT or ACPI, what features are supported rather than try to reverse engineer it ourselves via m*id. That leads into another general issue I have with using m*id detection, which I think I have mentioned several times on the list - it prevents the platform (hypervisor, emulator or firmware) from disabling that feature. If I had a time machine back to when the T-Head perf or cmo stuff was submitted, I was try to avoid any of it being merged with the m*id detection method. > I agree that don't have the crystal ball and may be proven wrong in the > future (I will be definitely happy about that!). But given the diversity of > RISC-V ecosystem, I feel that may be our sad reality. I don't understand what this comment is referring to, it lacks context as to what the sad reality actually is. I hope that all made sense and explained why I am against this method for detecting what I believe to be features rather than errata, Conor.
On 3/18/24 16:48, Conor Dooley wrote: > On Mon, Mar 18, 2024 at 03:46:54PM -0700, Atish Patra wrote: >> On 3/15/24 01:11, Andrew Jones wrote: >>> On Wed, Mar 13, 2024 at 09:31:26AM +0800, Inochi Amaoto wrote: >>> ... >>>> IMHO, it may be better to use a new DT property like "riscv,cpu-errata" or >>>> "<vendor>,cpu-errata". It can achieve almost everything like using pseudo >>>> isa. And the only cost I think is a small amount code to parse this. >>>> >>> >>> What's the ACPI equivalent for this new DT property? If there isn't one, >>> then the cost is also to introduce something to the ACPI spec and add the >>> ACPI parsing code. >>> >>> I'd much rather we call specified behaviors "extensions", whether they >>> are vendor-specific or RVI standard, and then treat all extensions the >>> same way in hardware descriptions and Linux. It'd also be best if errata >>> in extension implementations were handled by replacing the extension in >>> the hardware description with a new name which is specifically for the >>> behavior Linux should expect. (Just because two extensions are almost the >>> same doesn't mean we should say we have one and then have some second >>> mechanism to say, "well, not really, instead of that, it's this". It's >>> cleaner to just remove the extension it doesn't properly implement from >>> its hardware description and create a name for the behavior it does have.) >>> >>> Errata in behaviors which don't have extension names (are hopefully few) >>> and are where mvendorid and friends would need to be checked, but then why >>> not create a pseudo extension name, as Conor suggests, so the rest of >>> Linux code can manage errata the same way it manages every other behavior? >>> >>> The growth rate of the ISA bitmap is worth thinking about, though, since >>> we have several copies of it (at least one "all harts" bitmap, one bitmap >>> for each hart, another one for each vcpu, and then there's nested virt...) >>> We don't have enough extensions to worry about it now, but we can >>> eventually try partitioning, using common maps for common bits, not >>> storing bits which can be inferred from other bits, etc. >> >> This is my biggest worry going forward. We already have a ever growing >> standard RVI extension list. On top of that we have genuine vendor >> extensions. IMHO, errata are bit different than extensions as there will be >> few vendor extensions in the future but many hardware erratas :) > > I dunno, I think there's going to be plenty of both. We may not see (or > use) a lot of vendor extensions in mainline Linux, but they will exist. > I hope that will happen. But I fear we will have lot of vendor extensions in mainline Linux. That is the "sad reality" I was talking about at the end of the thread. >> If we start calling every hardware errata as an pseudo ISA extensions, we >> will much bigger problem maintaining it in the future. > > I've explained to you at least once already that this is not my goal. > Where there are genuine issues with the implementation of an extension > creating a "pseudo" extension is not what I am suggesting we do. > I have no problem with with the approach taken for the SiFive errata, > for example. > Thanks for clarifying. But we have to define the rules what gets in as pseudo extension very clearly to avoid any kind of abuse in the future. >> We discussed this earlier during the Andes PMU extension series[1] as well. >> We have three types of extensions in discussions now. >> >> 1. standard RVI extensions >> 2. Vendor extensions >> a. Genuine vendor extension >> b. Vendor erratas which can be described as pseudo-extensions now > >> Keeping all these within a single ISA bitmap space seems very odd to me. >> I think the feasible approach would be to partition the standard and vendor >> ISA extension space as you suggested. > > Let's be clear - partitioning the space is unrelated to the detection > method. We can go ahead and partition the space and use "pseudo" > extensions or we can have a unified space but use archid/impid for > detection. Having a unified space is the simpler thing to implement > right now, but it totally does not stop us breaking them out in the > future. We could even gate these custom implementations behind config > options if bloat is a concern - but multiplatform kernels are likely to > enable all the options anyway. > Agreed. >> For 2.b, either we can start defining pseudo extensions or adding >> vendor/arch/impid checks. >> >> @Conor: You seems to prefer the earlier approach instead of adding the >> checks. Care to elaborate why do you think that's a better method compared >> to a simple check ? > > Because I don't think that describing these as "errata" in the first > place is even accurate. This is not a case of a vendor claiming they > have Sscofpmf support but the implementation is flawed. As far as I > understand, this is a vendor creating a useful feature prior to the > creation of a standard extension. > A bit of a test for this could be "If the standard extension never > existed, would this be considered a new feature or an implementation > issue". I think this is pretty clearly in the former camp. > So we have 3 cases. 1. Pseudo extension: An vendor extension designed and/or implemented before the standard RVI extension was ratified but do not violate any standard encoding space. 2. Erratas: An genuine bug/design issue in the expected behavior from a standard RVI extension (including violating standard encoding space) 3. Vendor extension: A new or a variant of standard RVI extension which is different enough from standard extension. IMO, the line between #2 and #1 may get blurry as we going forward because of the sheer number of small extensions RVI is comping up with (which is a problem as well). Just to clarify: I am not too worried about this particular case as we know that T-head's implementation predates the Sscofpmf extension. But once we define a standard mechanism for this kind of situation, vendor may start to abuse it. > I do not think we should be using m*id detection implementations of a > feature prior to creation of a standard extension for the same purpose. > To me the main difference between a case like this and VentanaCondOps/Zicond > is that we are the ones calling this an extension (hence my use of pseudo) > and not the vendor of the IP. If T-Head were to publish a document tomorrow > on the T-Head github repo for official vendor extensions, that difference > would not even exist any longer. > Exactly! If vendor publishes these as an extension or an errata, that's a binding agreement to call it in a specific way. > I also do not believe that it is a "simple" check. The number of > implementations that could end up using this PMU could just balloon > if T-Head has no intention of switching to Sscofpmf. If they don't > balloon in this case, there's nothing stopping them ballooning in a Ideally, they shouldn't as it a simple case of CSR number & IRQ number. If they care to implement AIA, then they must change it to standard sscofpmf as the current IRQ violates the AIA spec. But who knows if they care to implement AIA or not. > similar case in the future. We should let the platform firmware tell > explicitly, be that via DT or ACPI, what features are supported rather > than try to reverse engineer it ourselves via m*id. > Fair enough. > That leads into another general issue I have with using m*id detection, > which I think I have mentioned several times on the list - it prevents the > platform (hypervisor, emulator or firmware) from disabling that feature. > If that is the only concern, platform can just disable the actual extension(i.e. sscofpmf in this case) to disable that feature for that particular vendor. > If I had a time machine back to when the T-Head perf or cmo stuff was > submitted, I was try to avoid any of it being merged with the m*id > detection method. > >> I agree that don't have the crystal ball and may be proven wrong in the >> future (I will be definitely happy about that!). But given the diversity of >> RISC-V ecosystem, I feel that may be our sad reality. > > I don't understand what this comment is referring to, it lacks context > as to what the sad reality actually is. > > I hope that all made sense and explained why I am against this method > for detecting what I believe to be features rather than errata, > Conor. > Yes.Thanks again for the clarification. Again, I am not opposed to the idea. I just wanted to understand if this is the best option we have right now. > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv
On Mon, Mar 18, 2024 at 05:48:13PM -0700, Atish Patra wrote: > On 3/18/24 16:48, Conor Dooley wrote: > > On Mon, Mar 18, 2024 at 03:46:54PM -0700, Atish Patra wrote: > > > For 2.b, either we can start defining pseudo extensions or adding > > > vendor/arch/impid checks. > > > > > > @Conor: You seems to prefer the earlier approach instead of adding the > > > checks. Care to elaborate why do you think that's a better method compared > > > to a simple check ? > > > > Because I don't think that describing these as "errata" in the first > > place is even accurate. This is not a case of a vendor claiming they > > have Sscofpmf support but the implementation is flawed. As far as I > > understand, this is a vendor creating a useful feature prior to the > > creation of a standard extension. > > A bit of a test for this could be "If the standard extension never > > existed, would this be considered a new feature or an implementation > > issue". I think this is pretty clearly in the former camp. > > > > So we have 3 cases. > > 1. Pseudo extension: An vendor extension designed and/or implemented before > the standard RVI extension was ratified but do not violate any standard > encoding space. > > 2. Erratas: An genuine bug/design issue in the expected behavior from a > standard RVI extension (including violating standard encoding space) > > 3. Vendor extension: A new or a variant of standard RVI extension which is > different enough from standard extension. > > IMO, the line between #2 and #1 may get blurry as we going forward because > of the sheer number of small extensions RVI is comping up with (which is a > problem as well). Aye, I think some of that is verging on ridiculous. > Just to clarify: I am not too worried about this particular case as we know > that T-head's implementation predates the Sscofpmf extension. > But once we define a standard mechanism for this kind of situation, vendor > may start to abuse it. How do you envisage it being abused by a vendor? Pre-dating the standard extension does make this one fairly clear-cut, but are you worried about people coming along and claiming to implement XConorSscofpmf instead of Sscofpmf rather than suffer the "shame" of a broken implementation? All this stuff is going to be pretty case-by-case (to begin with at least) so I'm not too worried about that sort of abuse. > > I do not think we should be using m*id detection implementations of a > > feature prior to creation of a standard extension for the same purpose. > > To me the main difference between a case like this and VentanaCondOps/Zicond > > is that we are the ones calling this an extension (hence my use of pseudo) > > and not the vendor of the IP. If T-Head were to publish a document tomorrow > > on the T-Head github repo for official vendor extensions, that difference > > would not even exist any longer. > > > > Exactly! If vendor publishes these as an extension or an errata, that's a > binding agreement to call it in a specific way. I don't agree that we are bound to call it the way that the vendor does. We should just review these sorts of things on a case-by-case basis, committing to doing what the vendor says is abusable. > > I also do not believe that it is a "simple" check. The number of > > implementations that could end up using this PMU could just balloon > > if T-Head has no intention of switching to Sscofpmf. If they don't > > balloon in this case, there's nothing stopping them ballooning in a > > Ideally, they shouldn't as it a simple case of CSR number & IRQ number. > If they care to implement AIA, then they must change it to standard sscofpmf > as the current IRQ violates the AIA spec. But who knows if they care to > implement AIA or not. What kinda "worried" me here is that the c908 implements /both/ Zicbom and the T-Head CMO instructions and /both/ Svpbmt and their original misuse of the reserved bits but they do not support Sscofpmf. Maybe it just was not feasible to migrate entirely (but they did for vector) or to support both interrupt numbers and to alias the CSR, but it seemed like the opportunity to standardise a bunch of other stuff was taken, but this particular extension was not. That's why I was worried that we'd see some ballooning in these specific checks. > > similar case in the future. We should let the platform firmware tell > > explicitly, be that via DT or ACPI, what features are supported rather > > than try to reverse engineer it ourselves via m*id. > > > Fair enough. > > > > That leads into another general issue I have with using m*id detection, > > which I think I have mentioned several times on the list - it prevents the > > platform (hypervisor, emulator or firmware) from disabling that feature. > > > > If that is the only concern, platform can just disable the actual > extension(i.e. sscofpmf in this case) to disable that feature for that > particular vendor. Right. Maybe I wasn't clear that this is a problem with using m*id for /detection/ of extensions and not with using m*id to work around implementation issues with the extension. In the latter case, you're applying a fixup only when the actual extension is communicated to be present, which leaves that control in the hands of the platform. > > If I had a time machine back to when the T-Head perf or cmo stuff was > > submitted, I was try to avoid any of it being merged with the m*id > > detection method. > > > > > I agree that don't have the crystal ball and may be proven wrong in the > > > future (I will be definitely happy about that!). But given the diversity of > > > RISC-V ecosystem, I feel that may be our sad reality. > > > > I don't understand what this comment is referring to, it lacks context > > as to what the sad reality actually is. > > > > I hope that all made sense and explained why I am against this method > > for detecting what I believe to be features rather than errata, > > Conor. > > > > Yes.Thanks again for the clarification. Again, I am not opposed to the idea. > I just wanted to understand if this is the best option we have right now.
On Tue, Mar 19, 2024 at 09:06:34AM +0000, Conor Dooley wrote: > On Mon, Mar 18, 2024 at 05:48:13PM -0700, Atish Patra wrote: > > On 3/18/24 16:48, Conor Dooley wrote: > > > On Mon, Mar 18, 2024 at 03:46:54PM -0700, Atish Patra wrote: > > > > > For 2.b, either we can start defining pseudo extensions or adding > > > > vendor/arch/impid checks. > > > > > > > > @Conor: You seems to prefer the earlier approach instead of adding the > > > > checks. Care to elaborate why do you think that's a better method compared > > > > to a simple check ? > > > > > > Because I don't think that describing these as "errata" in the first > > > place is even accurate. This is not a case of a vendor claiming they > > > have Sscofpmf support but the implementation is flawed. As far as I > > > understand, this is a vendor creating a useful feature prior to the > > > creation of a standard extension. > > > A bit of a test for this could be "If the standard extension never > > > existed, would this be considered a new feature or an implementation > > > issue". I think this is pretty clearly in the former camp. > > > > > > > So we have 3 cases. > > > > 1. Pseudo extension: An vendor extension designed and/or implemented before > > the standard RVI extension was ratified but do not violate any standard > > encoding space. The vendor should name these extensions themselves. > > > > 2. Erratas: An genuine bug/design issue in the expected behavior from a > > standard RVI extension (including violating standard encoding space) More on this below, but I think vendors should name these too. > > > > 3. Vendor extension: A new or a variant of standard RVI extension which is > > different enough from standard extension. > > > > IMO, the line between #2 and #1 may get blurry as we going forward because > > of the sheer number of small extensions RVI is comping up with (which is a > > problem as well). The line between #1 and #2 is blurry because the only difference is the original intentions. The end result is that a vendor has implemented something that resembles a standard extension, but is not the same as the standard extension. > > Aye, I think some of that is verging on ridiculous. > > > Just to clarify: I am not too worried about this particular case as we know > > that T-head's implementation predates the Sscofpmf extension. > > But once we define a standard mechanism for this kind of situation, vendor > > may start to abuse it. > > How do you envisage it being abused by a vendor? > Pre-dating the standard extension does make this one fairly clear-cut, > but are you worried about people coming along and claiming to implement > XConorSscofpmf instead of Sscofpmf rather than suffer the "shame" of a > broken implementation? Other than the concern of the ballooning bitmap, I'd prefer this approach. If a vendor has implemented some extension which happens to be "almost Sscofpmf", then whether it was implemented before the Sscofpmf spec existed, or after, isn't really important. What's important is that it's only "almost Sscofpmf" and not _exactly_ Sscofpmf, which means it should not use the Sscofpmf extension name. Since vendors are allowed to create their own XVendor names, then that shouldn't be a problem. Indeed, the abuse concern seems to be in the opposite direction, that vendors will try to pass off almost-standard extensions as standard extensions by trying to get workarounds into Linux. Maybe Linux policy should be to simply reject workarounds to extensions, requiring vendors to create new names instead. > All this stuff is going to be pretty case-by-case (to begin with at > least) so I'm not too worried about that sort of abuse. Case-by-case is reasonable, since it's probably too strict to always require new names. We can consider each proposed workaround as they come, but it's a slippery slope once workarounds are accepted. Thanks, drew
On Tue, Mar 19, 2024 at 02:39:21PM +0100, Andrew Jones wrote: > On Tue, Mar 19, 2024 at 09:06:34AM +0000, Conor Dooley wrote: > > On Mon, Mar 18, 2024 at 05:48:13PM -0700, Atish Patra wrote: > > > On 3/18/24 16:48, Conor Dooley wrote: > > > > On Mon, Mar 18, 2024 at 03:46:54PM -0700, Atish Patra wrote: > > > > > > > For 2.b, either we can start defining pseudo extensions or adding > > > > > vendor/arch/impid checks. > > > > > > > > > > @Conor: You seems to prefer the earlier approach instead of adding the > > > > > checks. Care to elaborate why do you think that's a better method compared > > > > > to a simple check ? > > > > > > > > Because I don't think that describing these as "errata" in the first > > > > place is even accurate. This is not a case of a vendor claiming they > > > > have Sscofpmf support but the implementation is flawed. As far as I > > > > understand, this is a vendor creating a useful feature prior to the > > > > creation of a standard extension. > > > > A bit of a test for this could be "If the standard extension never > > > > existed, would this be considered a new feature or an implementation > > > > issue". I think this is pretty clearly in the former camp. > > > > > > > > > > So we have 3 cases. > > > > > > 1. Pseudo extension: An vendor extension designed and/or implemented before > > > the standard RVI extension was ratified but do not violate any standard > > > encoding space. > > The vendor should name these extensions themselves. > > > > > > > 2. Erratas: An genuine bug/design issue in the expected behavior from a > > > standard RVI extension (including violating standard encoding space) > > More on this below, but I think vendors should name these too. Yah, both of these the vendor /should/ name it themselves but I don't want some set in stone /must/ that locks someone who is not the vendor from upstreaming. > > > 3. Vendor extension: A new or a variant of standard RVI extension which is > > > different enough from standard extension. > > > > > > IMO, the line between #2 and #1 may get blurry as we going forward because > > > of the sheer number of small extensions RVI is comping up with (which is a > > > problem as well). > > The line between #1 and #2 is blurry because the only difference is the > original intentions. The end result is that a vendor has implemented > something that resembles a standard extension, but is not the same as the > standard extension. Aye, a large part of this is definitely based on intent. Or maybe marketing rather than intent, but the two aren't all that different as the public may not be privy to which it actually is. I think you're missing a factor though - when the difference is discovered. Equating #1 and #2 is fine when that difference is known when the platform is originally supported, but if the divergence between what's implemented and the spec is only discovered down the line... > > All this stuff is going to be pretty case-by-case (to begin with at > > least) so I'm not too worried about that sort of abuse. > > Case-by-case is reasonable, since it's probably too strict to always > require new names. We can consider each proposed workaround as they > come, but it's a slippery slope once workarounds are accepted. ... I think that means that having some workarounds are inescapable really. Some sort of workaround could then be only way fix the problem without a firmware update. That workaround might be triggered by the m*id CSRs or it could be based on the firmware's SBI implemenation or version IDs. When it's something that never worked at all or was discovered early, then ye equate the two. > > > Just to clarify: I am not too worried about this particular case as we know > > > that T-head's implementation predates the Sscofpmf extension. > > > But once we define a standard mechanism for this kind of situation, vendor > > > may start to abuse it. > > > > How do you envisage it being abused by a vendor? > > Pre-dating the standard extension does make this one fairly clear-cut, > > but are you worried about people coming along and claiming to implement > > XConorSscofpmf instead of Sscofpmf rather than suffer the "shame" of a > > broken implementation? > > Other than the concern of the ballooning bitmap, I'd prefer this approach. > If a vendor has implemented some extension which happens to be "almost > Sscofpmf", then whether it was implemented before the Sscofpmf spec > existed, or after, isn't really important. What's important is that it's > only "almost Sscofpmf" and not _exactly_ Sscofpmf, which means it should > not use the Sscofpmf extension name. One of the reasons I keep bringing up when things were created prior to the creation of Sscofpmf (and I guess the fact that the vendor never claims to support Sscofpmf) is to highlight that we are not looking at at one of these edge cases where we're only discovering that there's an implementation issue on these CPUs that we need to work around silently. > Since vendors are allowed to create > their own XVendor names, then that shouldn't be a problem. Indeed, the > abuse concern seems to be in the opposite direction, that vendors will > try to pass off almost-standard extensions as standard extensions by > trying to get workarounds into Linux. Maybe Linux policy should be to > simply reject workarounds to extensions, requiring vendors to create new > names instead. I'd be inclined to agree (although there's gonna be some exceptions as I mention above) - but it's easy for me to say that given I want people to use riscv,isa-extensions on the DT side which cites specific versions of extensions that a device is compliant with. If people have issues with riscv,isa in DT, I'm can take a bit of a "oh it doesn't work with riscv,isa? Tough shit, that's why we made the new properties." approach and push people into new names. With ACPI, I have absolutely no idea how you police any of that. The way the code works rn is that both DT properties and ACPI share the exact same extension "names". Obviously it doesn't need to be that way, but it is handy. I'm kinda derailing here, but how would we map names to exact features in ACPI land? When I last read the ACPI stuff it was very non-specific and the kernel documentation (Documentation/riscv/acpi.rst) cites a specific version of the ISA manual that provides no information (and I guess never will?) about vendor extensions.
On 3/19/24 02:06, Conor Dooley wrote: > On Mon, Mar 18, 2024 at 05:48:13PM -0700, Atish Patra wrote: >> On 3/18/24 16:48, Conor Dooley wrote: >>> On Mon, Mar 18, 2024 at 03:46:54PM -0700, Atish Patra wrote: > >>>> For 2.b, either we can start defining pseudo extensions or adding >>>> vendor/arch/impid checks. >>>> >>>> @Conor: You seems to prefer the earlier approach instead of adding the >>>> checks. Care to elaborate why do you think that's a better method compared >>>> to a simple check ? >>> >>> Because I don't think that describing these as "errata" in the first >>> place is even accurate. This is not a case of a vendor claiming they >>> have Sscofpmf support but the implementation is flawed. As far as I >>> understand, this is a vendor creating a useful feature prior to the >>> creation of a standard extension. >>> A bit of a test for this could be "If the standard extension never >>> existed, would this be considered a new feature or an implementation >>> issue". I think this is pretty clearly in the former camp. >>> >> >> So we have 3 cases. >> >> 1. Pseudo extension: An vendor extension designed and/or implemented before >> the standard RVI extension was ratified but do not violate any standard >> encoding space. >> >> 2. Erratas: An genuine bug/design issue in the expected behavior from a >> standard RVI extension (including violating standard encoding space) >> >> 3. Vendor extension: A new or a variant of standard RVI extension which is >> different enough from standard extension. >> >> IMO, the line between #2 and #1 may get blurry as we going forward because >> of the sheer number of small extensions RVI is comping up with (which is a >> problem as well). > > Aye, I think some of that is verging on ridiculous. > >> Just to clarify: I am not too worried about this particular case as we know >> that T-head's implementation predates the Sscofpmf extension. >> But once we define a standard mechanism for this kind of situation, vendor >> may start to abuse it. > > How do you envisage it being abused by a vendor? > Pre-dating the standard extension does make this one fairly clear-cut, > but are you worried about people coming along and claiming to implement > XConorSscofpmf instead of Sscofpmf rather than suffer the "shame" of a > broken implementation? Yes or just use this excuse continue their old implementation in a newer chip which was designed after the standard extension is ratified. > All this stuff is going to be pretty case-by-case (to begin with at > least) so I'm not too worried about that sort of abuse. > >>> I do not think we should be using m*id detection implementations of a >>> feature prior to creation of a standard extension for the same purpose. >>> To me the main difference between a case like this and VentanaCondOps/Zicond >>> is that we are the ones calling this an extension (hence my use of pseudo) >>> and not the vendor of the IP. If T-Head were to publish a document tomorrow >>> on the T-Head github repo for official vendor extensions, that difference >>> would not even exist any longer. >>> >> >> Exactly! If vendor publishes these as an extension or an errata, that's a >> binding agreement to call it in a specific way. > > I don't agree that we are bound to call it the way that the vendor does. > We should just review these sorts of things on a case-by-case basis, > committing to doing what the vendor says is abusable. > >>> I also do not believe that it is a "simple" check. The number of >>> implementations that could end up using this PMU could just balloon >>> if T-Head has no intention of switching to Sscofpmf. If they don't >>> balloon in this case, there's nothing stopping them ballooning in a >> >> Ideally, they shouldn't as it a simple case of CSR number & IRQ number. >> If they care to implement AIA, then they must change it to standard sscofpmf >> as the current IRQ violates the AIA spec. But who knows if they care to >> implement AIA or not. > > What kinda "worried" me here is that the c908 implements /both/ Zicbom > and the T-Head CMO instructions and /both/ Svpbmt and their original > misuse of the reserved bits but they do not support Sscofpmf. Maybe it > just was not feasible to migrate entirely (but they did for vector) or > to support both interrupt numbers and to alias the CSR, but it seemed > like the opportunity to standardise a bunch of other stuff was taken, > but this particular extension was not. That's why I was worried that > we'd see some ballooning in these specific checks. > This is the exact abuse I was talking about. A vendor can keep churning new chips with incompatible extensions because Linux just allowed them get away with it by calling it a Vendor Extension. +Guo: In case he has any insight behind this decision from Thead. >>> similar case in the future. We should let the platform firmware tell >>> explicitly, be that via DT or ACPI, what features are supported rather >>> than try to reverse engineer it ourselves via m*id. >>> >> Fair enough. >> >> >>> That leads into another general issue I have with using m*id detection, >>> which I think I have mentioned several times on the list - it prevents the >>> platform (hypervisor, emulator or firmware) from disabling that feature. >>> >> >> If that is the only concern, platform can just disable the actual >> extension(i.e. sscofpmf in this case) to disable that feature for that >> particular vendor. > > Right. Maybe I wasn't clear that this is a problem with using m*id for > /detection/ of extensions and not with using m*id to work around > implementation issues with the extension. In the latter case, you're > applying a fixup only when the actual extension is communicated to be > present, which leaves that control in the hands of the platform. > Hmm. I guess there is no way predict how vendors will include standard extensions. Let's give them a benefit of doubt hoping they realize abusing the system won't help anybody. They will try to adopt the standard extensions if it is possible and creates erratas for genuine problems. >>> If I had a time machine back to when the T-Head perf or cmo stuff was >>> submitted, I was try to avoid any of it being merged with the m*id >>> detection method. >>> >>>> I agree that don't have the crystal ball and may be proven wrong in the >>>> future (I will be definitely happy about that!). But given the diversity of >>>> RISC-V ecosystem, I feel that may be our sad reality. >>> >>> I don't understand what this comment is referring to, it lacks context >>> as to what the sad reality actually is. >>> >>> I hope that all made sense and explained why I am against this method >>> for detecting what I believe to be features rather than errata, >>> Conor. >>> >> >> Yes.Thanks again for the clarification. Again, I am not opposed to the idea. >> I just wanted to understand if this is the best option we have right now. >> >> _______________________________________________ >> linux-riscv mailing list >> linux-riscv@lists.infradead.org >> http://lists.infradead.org/mailman/listinfo/linux-riscv
On 3/19/24 08:36, Conor Dooley wrote: > On Tue, Mar 19, 2024 at 02:39:21PM +0100, Andrew Jones wrote: >> On Tue, Mar 19, 2024 at 09:06:34AM +0000, Conor Dooley wrote: >>> On Mon, Mar 18, 2024 at 05:48:13PM -0700, Atish Patra wrote: >>>> On 3/18/24 16:48, Conor Dooley wrote: >>>>> On Mon, Mar 18, 2024 at 03:46:54PM -0700, Atish Patra wrote: >>> >>>>>> For 2.b, either we can start defining pseudo extensions or adding >>>>>> vendor/arch/impid checks. >>>>>> >>>>>> @Conor: You seems to prefer the earlier approach instead of adding the >>>>>> checks. Care to elaborate why do you think that's a better method compared >>>>>> to a simple check ? >>>>> >>>>> Because I don't think that describing these as "errata" in the first >>>>> place is even accurate. This is not a case of a vendor claiming they >>>>> have Sscofpmf support but the implementation is flawed. As far as I >>>>> understand, this is a vendor creating a useful feature prior to the >>>>> creation of a standard extension. >>>>> A bit of a test for this could be "If the standard extension never >>>>> existed, would this be considered a new feature or an implementation >>>>> issue". I think this is pretty clearly in the former camp. >>>>> >>>> >>>> So we have 3 cases. >>>> >>>> 1. Pseudo extension: An vendor extension designed and/or implemented before >>>> the standard RVI extension was ratified but do not violate any standard >>>> encoding space. >> >> The vendor should name these extensions themselves. >> >>>> >>>> 2. Erratas: An genuine bug/design issue in the expected behavior from a >>>> standard RVI extension (including violating standard encoding space) >> >> More on this below, but I think vendors should name these too. > > Yah, both of these the vendor /should/ name it themselves but I don't > want some set in stone /must/ that locks someone who is not the vendor > from upstreaming. > >>>> 3. Vendor extension: A new or a variant of standard RVI extension which is >>>> different enough from standard extension. >>>> >>>> IMO, the line between #2 and #1 may get blurry as we going forward because >>>> of the sheer number of small extensions RVI is comping up with (which is a >>>> problem as well). >> >> The line between #1 and #2 is blurry because the only difference is the >> original intentions. The end result is that a vendor has implemented >> something that resembles a standard extension, but is not the same as the >> standard extension. > > Aye, a large part of this is definitely based on intent. Or maybe > marketing rather than intent, but the two aren't all that different > as the public may not be privy to which it actually is. > > I think you're missing a factor though - when the difference is > discovered. Equating #1 and #2 is fine when that difference is known > when the platform is originally supported, but if the divergence between > what's implemented and the spec is only discovered down the line... > >>> All this stuff is going to be pretty case-by-case (to begin with at >>> least) so I'm not too worried about that sort of abuse. >> >> Case-by-case is reasonable, since it's probably too strict to always >> require new names. We can consider each proposed workaround as they >> come, but it's a slippery slope once workarounds are accepted. > May be we treat #1 and #3 are same. In this case, we just call it XTheadSscofpmf or something similar. The patches would look similar to what was proposed in Andes PMU v7 series in that case. > ... I think that means that having some workarounds are inescapable > really. Some sort of workaround could then be only way fix the problem > without a firmware update. That workaround might be triggered by the > m*id CSRs or it could be based on the firmware's SBI implemenation > or version IDs. > When it's something that never worked at all or was discovered early, > then ye equate the two. > >>>> Just to clarify: I am not too worried about this particular case as we know >>>> that T-head's implementation predates the Sscofpmf extension. >>>> But once we define a standard mechanism for this kind of situation, vendor >>>> may start to abuse it. >>> >>> How do you envisage it being abused by a vendor? >>> Pre-dating the standard extension does make this one fairly clear-cut, >>> but are you worried about people coming along and claiming to implement >>> XConorSscofpmf instead of Sscofpmf rather than suffer the "shame" of a >>> broken implementation? >> >> Other than the concern of the ballooning bitmap, I'd prefer this approach. >> If a vendor has implemented some extension which happens to be "almost >> Sscofpmf", then whether it was implemented before the Sscofpmf spec >> existed, or after, isn't really important. What's important is that it's >> only "almost Sscofpmf" and not _exactly_ Sscofpmf, which means it should >> not use the Sscofpmf extension name. > > One of the reasons I keep bringing up when things were created prior to > the creation of Sscofpmf (and I guess the fact that the vendor never > claims to support Sscofpmf) is to highlight that we are not looking at > at one of these edge cases where we're only discovering that there's an > implementation issue on these CPUs that we need to work around silently. > >> Since vendors are allowed to create >> their own XVendor names, then that shouldn't be a problem. Indeed, the >> abuse concern seems to be in the opposite direction, that vendors will >> try to pass off almost-standard extensions as standard extensions by >> trying to get workarounds into Linux. Maybe Linux policy should be to >> simply reject workarounds to extensions, requiring vendors to create new >> names instead. > > I'd be inclined to agree (although there's gonna be some exceptions as I > mention above) - but it's easy for me to say that given I want people to > use riscv,isa-extensions on the DT side which cites specific versions of > extensions that a device is compliant with. > If people have issues with riscv,isa in DT, I'm can take a bit of a "oh > it doesn't work with riscv,isa? Tough shit, that's why we made the new > properties." approach and push people into new names. > > With ACPI, I have absolutely no idea how you police any of that. The way > the code works rn is that both DT properties and ACPI share the exact > same extension "names". Obviously it doesn't need to be that way, but it > is handy. I'm kinda derailing here, but how would we map names to exact > features in ACPI land? When I last read the ACPI stuff it was very > non-specific and the kernel documentation > (Documentation/riscv/acpi.rst) cites a specific version of the ISA > manual that provides no information (and I guess never will?) about > vendor extensions. > > > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv
On 2024/3/11 14:30, Qingfang Deng wrote: > T-Head C908 has the same IRQ num and CSR as previous C9xx cores, but > reports non-zero marchid and mimpid. Remove the ID checks. > > Fixes: 65e9fb081877 ("drivers/perf: riscv_pmu_sbi: add support for PMU variant on T-Head C9xx cores") > Signed-off-by: Qingfang Deng<dqfext@gmail.com> > --- > arch/riscv/errata/thead/errata.c | 4 ---- > drivers/perf/riscv_pmu_sbi.c | 4 +--- > 2 files changed, 1 insertion(+), 7 deletions(-) > > diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c > index b1c410bbc1ae..49ccad5b21bb 100644 > --- a/arch/riscv/errata/thead/errata.c > +++ b/arch/riscv/errata/thead/errata.c > @@ -125,10 +125,6 @@ static bool errata_probe_pmu(unsigned int stage, > if (!IS_ENABLED(CONFIG_ERRATA_THEAD_PMU)) > return false; > > - /* target-c9xx cores report arch_id and impid as 0 */ > - if (arch_id != 0 || impid != 0) > - return false; > - > if (stage == RISCV_ALTERNATIVES_EARLY_BOOT) > return false; > > diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c > index 452aab49db1e..87b83184383a 100644 > --- a/drivers/perf/riscv_pmu_sbi.c > +++ b/drivers/perf/riscv_pmu_sbi.c > @@ -811,9 +811,7 @@ static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pde > riscv_pmu_irq_num = RV_IRQ_PMU; > riscv_pmu_use_irq = true; > } else if (IS_ENABLED(CONFIG_ERRATA_THEAD_PMU) && > - riscv_cached_mvendorid(0) == THEAD_VENDOR_ID && > - riscv_cached_marchid(0) == 0 && > - riscv_cached_mimpid(0) == 0) { > + riscv_cached_mvendorid(0) == THEAD_VENDOR_ID) { > riscv_pmu_irq_num = THEAD_C9XX_RV_IRQ_PMU; > riscv_pmu_use_irq = true; > } > -- 2.34.1 Tested-by: Yangyu Chen <cyy@cyyself.name> With this patch and T-Head C908 PMU being probed by OpenSBI, I can now use the perf record to profile RVV 1.0 software on Canaan Kendryte K230. This will speed up many RVV 1.0 software developments now and even for better performance. However, as Inochi said, the newer version, C908 may support Sccofpmf. We should ask Guo Ren to clarify this so we can have the cleanest way to probe what to use between THEAD_PMU and Sscofpmu. I added CC to Guo Ren. Please clarify about this. Some off-topic things: I need this feature recently since I am implementing a pure RVV chacha20 algorithm. I have already sent PR to openssl to speed up the crypto performance on RVV without Zvkb support and maybe ported to kernel crypto sometimes. To speed up TLS or other applications for many chips that may come this year with RVV 1.0 but without Zvkb. Link: https://github.com/openssl/openssl/pull/24069 However, the performance evaluation on K230 is not well compared to pure C implementation. I will need this PMU driver to do some profiling. Thanks, Yangyu Chen
>> IMHO, it may be better to use a new DT property like "riscv,cpu-errata" or >> "<vendor>,cpu-errata". It can achieve almost everything like using pseudo >> isa. And the only cost I think is a small amount code to parse this. > > I suppose we could do that, but accounting for vendor specifics was one > of the goals for the property I only just added and that I am suggesting > to use here. I think there is a simpler way to do that. We use T-Head PMU by default for All T-Head CPUs (from mvendor id). Then, to test there is sscofpmf in the ISA string being probed by the kernel. If yes, then use scofpmf. Otherwise, use T-Head PMU. I will check if this can also be switched in any vendor CSR like Svpbmt and T-Head MAE we discussed before. Thanks, Yangyu Chen
On Fri, Apr 12, 2024 at 02:27:28PM +0800, Yangyu Chen wrote: > >> IMHO, it may be better to use a new DT property like "riscv,cpu-errata" or > >> "<vendor>,cpu-errata". It can achieve almost everything like using pseudo > >> isa. And the only cost I think is a small amount code to parse this. > > > > I suppose we could do that, but accounting for vendor specifics was one > > of the goals for the property I only just added and that I am suggesting > > to use here. > > I think there is a simpler way to do that. We use T-Head PMU by default > for All T-Head CPUs (from mvendor id). Then, to test there is sscofpmf in > the ISA string being probed by the kernel. If yes, then use scofpmf. > Otherwise, use T-Head PMU. I am strongly opposed to doing something like this. Firstly, making it unconditional is a time-bomb as if T-Head ever ship something without support then we'll be broken on that platform and have to return to conditional behaviour. Secondly, we are taking agency away from hypervisors etc that may not want a guest to use the PMU. Cheers, Conor.
On Fri, Apr 12, 2024 at 02:09:32PM +0800, Yangyu Chen wrote: > On 2024/3/11 14:30, Qingfang Deng wrote: > > T-Head C908 has the same IRQ num and CSR as previous C9xx cores, but > > reports non-zero marchid and mimpid. Remove the ID checks. > > > > Fixes: 65e9fb081877 ("drivers/perf: riscv_pmu_sbi: add support for PMU variant on T-Head C9xx cores") > > Signed-off-by: Qingfang Deng<dqfext@gmail.com> > > --- > > arch/riscv/errata/thead/errata.c | 4 ---- > > drivers/perf/riscv_pmu_sbi.c | 4 +--- > > 2 files changed, 1 insertion(+), 7 deletions(-) > > > > diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c > > index b1c410bbc1ae..49ccad5b21bb 100644 > > --- a/arch/riscv/errata/thead/errata.c > > +++ b/arch/riscv/errata/thead/errata.c > > @@ -125,10 +125,6 @@ static bool errata_probe_pmu(unsigned int stage, > > if (!IS_ENABLED(CONFIG_ERRATA_THEAD_PMU)) > > return false; > > > > - /* target-c9xx cores report arch_id and impid as 0 */ > > - if (arch_id != 0 || impid != 0) > > - return false; > > - > > if (stage == RISCV_ALTERNATIVES_EARLY_BOOT) > > return false; > > > > diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c > > index 452aab49db1e..87b83184383a 100644 > > --- a/drivers/perf/riscv_pmu_sbi.c > > +++ b/drivers/perf/riscv_pmu_sbi.c > > @@ -811,9 +811,7 @@ static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pde > > riscv_pmu_irq_num = RV_IRQ_PMU; > > riscv_pmu_use_irq = true; > > } else if (IS_ENABLED(CONFIG_ERRATA_THEAD_PMU) && > > - riscv_cached_mvendorid(0) == THEAD_VENDOR_ID && > > - riscv_cached_marchid(0) == 0 && > > - riscv_cached_mimpid(0) == 0) { > > + riscv_cached_mvendorid(0) == THEAD_VENDOR_ID) { > > riscv_pmu_irq_num = THEAD_C9XX_RV_IRQ_PMU; > > riscv_pmu_use_irq = true; > > } > > -- 2.34.1 > > Tested-by: Yangyu Chen <cyy@cyyself.name> > > With this patch and T-Head C908 PMU being probed by OpenSBI, I can now use > the perf record to profile RVV 1.0 software on Canaan Kendryte K230. This > will speed up many RVV 1.0 software developments now and even for better > performance. > > However, as Inochi said, the newer version, C908 may support Sccofpmf. We > should ask Guo Ren to clarify this so we can have the cleanest way to > probe what to use between THEAD_PMU and Sscofpmu. > > I added CC to Guo Ren. Please clarify about this. diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c index c6ef09c4548c..ee6fa5b65b53 100644 --- a/drivers/perf/riscv_pmu_sbi.c +++ b/drivers/perf/riscv_pmu_sbi.c @@ -812,8 +812,10 @@ static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pde riscv_pmu_use_irq = true; } else if (IS_ENABLED(CONFIG_ERRATA_THEAD_PMU) && riscv_cached_mvendorid(0) == THEAD_VENDOR_ID && - riscv_cached_marchid(0) == 0 && - riscv_cached_mimpid(0) == 0) { + (riscv_cached_marchid(0) == 0 || + riscv_cached_marchid(0) == 0x8000000009140d00) && + (riscv_cached_mimpid(0) == 0 || + riscv_cached_mimpid(0) == 0x50000)) { riscv_pmu_irq_num = THEAD_C9XX_RV_IRQ_PMU; riscv_pmu_use_irq = true; } Only k230's c908 has the problem, not all XuanTie Processors. We just need to pick it out. Could the above satisfy you? > > Some off-topic things: > > I need this feature recently since I am implementing a pure RVV chacha20 > algorithm. I have already sent PR to openssl to speed up the crypto > performance on RVV without Zvkb support and maybe ported to kernel crypto > sometimes. To speed up TLS or other applications for many chips that may > come this year with RVV 1.0 but without Zvkb. > > Link: https://github.com/openssl/openssl/pull/24069 > > However, the performance evaluation on K230 is not well compared to pure C > implementation. I will need this PMU driver to do some profiling. > > Thanks, > Yangyu Chen > > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv >
diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c index b1c410bbc1ae..49ccad5b21bb 100644 --- a/arch/riscv/errata/thead/errata.c +++ b/arch/riscv/errata/thead/errata.c @@ -125,10 +125,6 @@ static bool errata_probe_pmu(unsigned int stage, if (!IS_ENABLED(CONFIG_ERRATA_THEAD_PMU)) return false; - /* target-c9xx cores report arch_id and impid as 0 */ - if (arch_id != 0 || impid != 0) - return false; - if (stage == RISCV_ALTERNATIVES_EARLY_BOOT) return false; diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c index 452aab49db1e..87b83184383a 100644 --- a/drivers/perf/riscv_pmu_sbi.c +++ b/drivers/perf/riscv_pmu_sbi.c @@ -811,9 +811,7 @@ static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pde riscv_pmu_irq_num = RV_IRQ_PMU; riscv_pmu_use_irq = true; } else if (IS_ENABLED(CONFIG_ERRATA_THEAD_PMU) && - riscv_cached_mvendorid(0) == THEAD_VENDOR_ID && - riscv_cached_marchid(0) == 0 && - riscv_cached_mimpid(0) == 0) { + riscv_cached_mvendorid(0) == THEAD_VENDOR_ID) { riscv_pmu_irq_num = THEAD_C9XX_RV_IRQ_PMU; riscv_pmu_use_irq = true; }
T-Head C908 has the same IRQ num and CSR as previous C9xx cores, but reports non-zero marchid and mimpid. Remove the ID checks. Fixes: 65e9fb081877 ("drivers/perf: riscv_pmu_sbi: add support for PMU variant on T-Head C9xx cores") Signed-off-by: Qingfang Deng <dqfext@gmail.com> --- arch/riscv/errata/thead/errata.c | 4 ---- drivers/perf/riscv_pmu_sbi.c | 4 +--- 2 files changed, 1 insertion(+), 7 deletions(-)