Message ID | 20221202071900.1143950-1-francesco@dolcini.it (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v1] mtd: parsers: ofpart: Fix parsing when size-cells is 0 | expand |
Hi Francesco, francesco@dolcini.it wrote on Fri, 2 Dec 2022 08:19:00 +0100: > From: Francesco Dolcini <francesco.dolcini@toradex.com> > > Add a fallback mechanism to handle the case in which #size-cells is set > to <0>. According to the DT binding the nand controller node should have > set it to 0 and this is not compatible with the legacy way of > specifying partitions directly as child nodes of the nand-controller node. I understand the problem, I understand the fix, but I have to say, I strongly dislike it :) Touching an mtd core driver to fix a single broken use case like that is... problematic, for the least. I am sorry but if a 6.0 kernel breaks because: - a legacy scheme is used in the description - u-boot still does not conform to the DT standard - a patch tries to make a tool happy Then the solution is clearly not an 'mtd core' mainline patch. If you really want to workaround U-Boot, either you revert that patch or you just fix the DT description instead. The parent/child/partitions scheme has been enforced for maybe 5 years now and for a good reason: a NAND controller with partitions does not make _any_ sense. There are plenty of examples out there, imx7-colibri.dtsi has received many updates since its introduction (for the best), so why not this one? Cheers, Miquèl > This fixes a boot failure on colibri-imx7 and potentially other boards. > > Cc: stable@vger.kernel.org > Fixes: 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") > Link: https://lore.kernel.org/all/Y4dgBTGNWpM6SQXI@francesco-nb.int.toradex.com/ > Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> > --- > drivers/mtd/parsers/ofpart_core.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/mtd/parsers/ofpart_core.c b/drivers/mtd/parsers/ofpart_core.c > index 192190c42fc8..aa3b7fa61e50 100644 > --- a/drivers/mtd/parsers/ofpart_core.c > +++ b/drivers/mtd/parsers/ofpart_core.c > @@ -122,6 +122,17 @@ static int parse_fixed_partitions(struct mtd_info *master, > > a_cells = of_n_addr_cells(pp); > s_cells = of_n_size_cells(pp); > + if (s_cells == 0) { > + /* > + * Use #size-cells = <1> for backward compatibility > + * in case #size-cells is set to <0> and firmware adds > + * OF partitions without setting it. > + */ > + pr_warn_once("%s: ofpart partition %pOF (%pOF) #size-cells is <0>, using <1> for backward compatibility.\n", > + master->name, pp, > + mtd_node); > + s_cells = 1; > + } > if (len / 4 != a_cells + s_cells) { > pr_debug("%s: ofpart partition %pOF (%pOF) error parsing reg property.\n", > master->name, pp,
Hello Miquel, On Fri, Dec 02, 2022 at 10:14:18AM +0100, Miquel Raynal wrote: > francesco@dolcini.it wrote on Fri, 2 Dec 2022 08:19:00 +0100: > > From: Francesco Dolcini <francesco.dolcini@toradex.com> > > > > Add a fallback mechanism to handle the case in which #size-cells is set > > to <0>. According to the DT binding the nand controller node should have > > set it to 0 and this is not compatible with the legacy way of > > specifying partitions directly as child nodes of the nand-controller node. > > I understand the problem, I understand the fix, but I have to say, I > strongly dislike it :) Touching an mtd core driver to fix a single > broken use case like that is... problematic, for the least. I just noticed it 2 days after this patch was backported to a stable kernel, I am just the first one to notice, we are not talking about a single use case. > I am sorry but if a 6.0 kernel breaks because: Not only kernel 6.0 is currently broken. This patch is going to be backported to any stable kernel given the fixes tag it has. > If you really want to workaround U-Boot, either you revert that patch > or you just fix the DT description instead. The parent/child/partitions > scheme has been enforced for maybe 5 years now and for a good reason: a > NAND controller with partitions does not make _any_ sense. There are > plenty of examples out there, imx7-colibri.dtsi has received many > updates since its introduction (for the best), so why not this one? I can and I will update imx7-colibri.dtsi (patch coming), but is this good enough given the kind of boot failure regression this introduce? We are going to have old u-boot around that will not work with it, and the reality is that there are tons of reasons why people do update the linux kernel and dts everyday, but never ever update the bootloader. We cannot tell "All users of the XXX kernel series must upgrade." and at the same time introduce a regression that break the boot and ignore it. Francesco
On Fri, Dec 02, 2022 at 11:12:43AM +0100, Francesco Dolcini wrote: > Hello Miquel, > > On Fri, Dec 02, 2022 at 10:14:18AM +0100, Miquel Raynal wrote: > > francesco@dolcini.it wrote on Fri, 2 Dec 2022 08:19:00 +0100: > > > From: Francesco Dolcini <francesco.dolcini@toradex.com> > > > > > > Add a fallback mechanism to handle the case in which #size-cells is set > > > to <0>. According to the DT binding the nand controller node should have > > > set it to 0 and this is not compatible with the legacy way of > > > specifying partitions directly as child nodes of the nand-controller node. > > > > I understand the problem, I understand the fix, but I have to say, I > > strongly dislike it :) Touching an mtd core driver to fix a single > > broken use case like that is... problematic, for the least. > I just noticed it 2 days after this patch was backported to a stable > kernel, I am just the first one to notice, we are not talking about a single > use case. > > > I am sorry but if a 6.0 kernel breaks because: > Not only kernel 6.0 is currently broken. This patch is going to be > backported to any stable kernel given the fixes tag it has. > > > If you really want to workaround U-Boot, either you revert that patch > > or you just fix the DT description instead. The parent/child/partitions > > scheme has been enforced for maybe 5 years now and for a good reason: a > > NAND controller with partitions does not make _any_ sense. There are > > plenty of examples out there, imx7-colibri.dtsi has received many > > updates since its introduction (for the best), so why not this one? > > I can and I will update imx7-colibri.dtsi (patch coming), but is this > good enough given the kind of boot failure regression this introduce? We > are going to have old u-boot around that will not work with it, and the Just another piece of information, support for the partitions node in U-Boot was added in version v2022.04 [1], we are not talking about ancient old legacy stuff. If I add the partitions node as a child of my nand controller, as I was planning to do and I wrote 10 lines above, I will create a new flavor of non-booting system with U-Boot older than v2022.04 :-/ U-Boot older than v2022.04 will update the nand controller node never the less, the partition node will still be there and Linux will use it, but it will be empty since nobody populate it. Francesco [1] commit 36fee2f7621e ("common: fdt_support: add support for "partitions" subnode to fdt_fixup_mtdparts()")
Hi Francesco, francesco@dolcini.it wrote on Fri, 2 Dec 2022 11:24:29 +0100: > On Fri, Dec 02, 2022 at 11:12:43AM +0100, Francesco Dolcini wrote: > > Hello Miquel, > > > > On Fri, Dec 02, 2022 at 10:14:18AM +0100, Miquel Raynal wrote: > > > francesco@dolcini.it wrote on Fri, 2 Dec 2022 08:19:00 +0100: > > > > From: Francesco Dolcini <francesco.dolcini@toradex.com> > > > > > > > > Add a fallback mechanism to handle the case in which #size-cells is set > > > > to <0>. According to the DT binding the nand controller node should have > > > > set it to 0 and this is not compatible with the legacy way of > > > > specifying partitions directly as child nodes of the nand-controller node. > > > > > > I understand the problem, I understand the fix, but I have to say, I > > > strongly dislike it :) Touching an mtd core driver to fix a single > > > broken use case like that is... problematic, for the least. > > I just noticed it 2 days after this patch was backported to a stable > > kernel, I am just the first one to notice, we are not talking about a single > > use case. > > > > > I am sorry but if a 6.0 kernel breaks because: > > Not only kernel 6.0 is currently broken. This patch is going to be > > backported to any stable kernel given the fixes tag it has. > > > > > If you really want to workaround U-Boot, either you revert that patch > > > or you just fix the DT description instead. The parent/child/partitions > > > scheme has been enforced for maybe 5 years now and for a good reason: a > > > NAND controller with partitions does not make _any_ sense. There are > > > plenty of examples out there, imx7-colibri.dtsi has received many > > > updates since its introduction (for the best), so why not this one? > > > > I can and I will update imx7-colibri.dtsi (patch coming), :thumb_up: > > but is this > > good enough given the kind of boot failure regression this introduce? We > > are going to have old u-boot around that will not work with it, and the > > Just another piece of information, support for the partitions node in > U-Boot was added in version v2022.04 [1], we are not talking about ancient > old legacy stuff. If it is so recent, then this is what needs to be fixed, and it should not bother "many" people because 2022.04 is not so old. So I am a bit lost, IIUC what is currently broken is: - U-Boot > 2022.04 and any version of Linux with the backport? > If I add the partitions node as a child of my nand controller, as I was > planning to do and I wrote 10 lines above, I will create a new flavor of > non-booting system with U-Boot older than v2022.04 :-/ I think there is a little confusion here. You are referring to the NAND controller node, the commit refers to the NAND chip node. What this commit does looks fine because it just tries to use the partitions {} node rather than the NAND chip node and if the partitions {} node already exist, I expect #address-cells and #size-cells to be defined and be != 0 already. Here is a proper description: nand-controller { #address-cells = <1>; #size-cells = <0>; nand-chip { partitions { #address-cells = <1 or 2>; #size-cells = <1 or 2>; partition@x { }; partition@y { }; }; }; /* Here you can very well have another nand-chip node with * another reg property which represents its own CS and another * set of partitions. */ }; > U-Boot older than v2022.04 will update the nand controller node never > the less, the partition node will still be there and Linux will use it, > but it will be empty since nobody populate it. > > Francesco > > [1] commit 36fee2f7621e ("common: fdt_support: add support for "partitions" subnode to fdt_fixup_mtdparts()") Thanks, Miquèl
+ u-boot list On Fri, Dec 02, 2022 at 11:53:27AM +0100, Miquel Raynal wrote: > francesco@dolcini.it wrote on Fri, 2 Dec 2022 11:24:29 +0100: > > On Fri, Dec 02, 2022 at 11:12:43AM +0100, Francesco Dolcini wrote: > > > On Fri, Dec 02, 2022 at 10:14:18AM +0100, Miquel Raynal wrote: > > > > francesco@dolcini.it wrote on Fri, 2 Dec 2022 08:19:00 +0100: > > > > > From: Francesco Dolcini <francesco.dolcini@toradex.com> > > > > > > > > > > Add a fallback mechanism to handle the case in which #size-cells is set > > > > > to <0>. According to the DT binding the nand controller node should have > > > > > set it to 0 and this is not compatible with the legacy way of > > > > > specifying partitions directly as child nodes of the nand-controller node. > > > > > > > > I understand the problem, I understand the fix, but I have to say, I > > > > strongly dislike it :) Touching an mtd core driver to fix a single > > > > broken use case like that is... problematic, for the least. > > > I just noticed it 2 days after this patch was backported to a stable > > > kernel, I am just the first one to notice, we are not talking about a single > > > use case. > > > > > > > I am sorry but if a 6.0 kernel breaks because: > > > Not only kernel 6.0 is currently broken. This patch is going to be > > > backported to any stable kernel given the fixes tag it has. > > > > > > > If you really want to workaround U-Boot, either you revert that patch > > > > or you just fix the DT description instead. The parent/child/partitions > > > > scheme has been enforced for maybe 5 years now and for a good reason: a > > > > NAND controller with partitions does not make _any_ sense. There are > > > > plenty of examples out there, imx7-colibri.dtsi has received many > > > > updates since its introduction (for the best), so why not this one? > > > > > > I can and I will update imx7-colibri.dtsi (patch coming), > > :thumb_up: > > > > but is this > > > good enough given the kind of boot failure regression this introduce? We > > > are going to have old u-boot around that will not work with it, and the > > > > Just another piece of information, support for the partitions node in > > U-Boot was added in version v2022.04 [1], we are not talking about ancient > > old legacy stuff. > > If it is so recent, then this is what needs to be fixed, and it should > not bother "many" people because 2022.04 is not so old. > > So I am a bit lost, IIUC what is currently broken is: > - U-Boot > 2022.04 and any version of Linux with the backport? > > > If I add the partitions node as a child of my nand controller, as I was > > planning to do and I wrote 10 lines above, I will create a new flavor of > > non-booting system with U-Boot older than v2022.04 :-/ > > I think there is a little confusion here. You are referring to the NAND I guess I have not explained myself well enough :-) U-Boot is creating the partitions in the dtb, they are not defined in the source dts file (this is common practice with multiple boards). Before v2022.04 it was always updating the nand-controller node, starting from v2022.04 if there is a dedicated `partitions` node it uses it. This is just the reverse of what ofpart_core.c is doing (if the partitions node is there it assumes the partitions should go into it, otherwise it proceeds with the legacy way). Let's have a concrete example with colibri-imx7. Current status: - The nand-controller node does not include any partitions child, any U-Boot version will just add the partition directly as child of the nand controller. This is where I am hitting this boot regression now. Potential change I envisioned here: - I add the partitions node to the nand-controller, e.g. --- a/arch/arm/boot/dts/imx7-colibri.dtsi +++ b/arch/arm/boot/dts/imx7-colibri.dtsi @@ -380,6 +380,12 @@ &gpmi { nand-on-flash-bbt; pinctrl-names = "default"; pinctrl-0 = <&pinctrl_gpmi_nand>; + + partitions { + compatible = "fixed-partitions"; + #address-cells = <1>; + #size-cells = <1>; + }; }; - U-Boot >= v2022.04 will just work fine creating the partitions as currently described in the bindings. - U-Boot < v2022.04 will still create the partitions as child of the nand-controller node. Linux will see that a `partitions` node exists but it will be empty, leading to a boot failure in case mtd is used as boot device. > controller node, the commit refers to the NAND chip node. What this > commit does looks fine because it just tries to use the partitions {} > node rather than the NAND chip node and if the partitions {} node > already exist, I expect #address-cells and #size-cells to be defined > and be != 0 already. yes, this commit is perfectly fine I agree. The reality is that people is using newer kernel with older U-Boot, and I do not think that deliberately breaking this use case is what the Linux kernel should do. I do not think that I can push a change in the DTS that will break booting any board using an older U-Boot. Francesco
On Fri, Dec 02, 2022 at 08:19:00AM +0100, Francesco Dolcini wrote: > From: Francesco Dolcini <francesco.dolcini@toradex.com> > > Add a fallback mechanism to handle the case in which #size-cells is set > to <0>. According to the DT binding the nand controller node should have > set it to 0 and this is not compatible with the legacy way of > specifying partitions directly as child nodes of the nand-controller node. > > This fixes a boot failure on colibri-imx7 and potentially other boards. > > Cc: stable@vger.kernel.org > Fixes: 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") > Link: https://lore.kernel.org/all/Y4dgBTGNWpM6SQXI@francesco-nb.int.toradex.com/ > Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> > --- > drivers/mtd/parsers/ofpart_core.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hi Francesco, francesco@dolcini.it wrote on Fri, 2 Dec 2022 12:23:37 +0100: > + u-boot list > > On Fri, Dec 02, 2022 at 11:53:27AM +0100, Miquel Raynal wrote: > > francesco@dolcini.it wrote on Fri, 2 Dec 2022 11:24:29 +0100: > > > On Fri, Dec 02, 2022 at 11:12:43AM +0100, Francesco Dolcini wrote: > > > > On Fri, Dec 02, 2022 at 10:14:18AM +0100, Miquel Raynal wrote: > > > > > francesco@dolcini.it wrote on Fri, 2 Dec 2022 08:19:00 +0100: > > > > > > From: Francesco Dolcini <francesco.dolcini@toradex.com> > > > > > > > > > > > > Add a fallback mechanism to handle the case in which #size-cells is set > > > > > > to <0>. According to the DT binding the nand controller node should have > > > > > > set it to 0 and this is not compatible with the legacy way of > > > > > > specifying partitions directly as child nodes of the nand-controller node. > > > > > > > > > > I understand the problem, I understand the fix, but I have to say, I > > > > > strongly dislike it :) Touching an mtd core driver to fix a single > > > > > broken use case like that is... problematic, for the least. > > > > I just noticed it 2 days after this patch was backported to a stable > > > > kernel, I am just the first one to notice, we are not talking about a single > > > > use case. > > > > > > > > > I am sorry but if a 6.0 kernel breaks because: > > > > Not only kernel 6.0 is currently broken. This patch is going to be > > > > backported to any stable kernel given the fixes tag it has. > > > > > > > > > If you really want to workaround U-Boot, either you revert that patch > > > > > or you just fix the DT description instead. The parent/child/partitions > > > > > scheme has been enforced for maybe 5 years now and for a good reason: a > > > > > NAND controller with partitions does not make _any_ sense. There are > > > > > plenty of examples out there, imx7-colibri.dtsi has received many > > > > > updates since its introduction (for the best), so why not this one? > > > > > > > > I can and I will update imx7-colibri.dtsi (patch coming), > > > > :thumb_up: > > > > > > but is this > > > > good enough given the kind of boot failure regression this introduce? We > > > > are going to have old u-boot around that will not work with it, and the > > > > > > Just another piece of information, support for the partitions node in > > > U-Boot was added in version v2022.04 [1], we are not talking about ancient > > > old legacy stuff. > > > > If it is so recent, then this is what needs to be fixed, and it should > > not bother "many" people because 2022.04 is not so old. > > > > So I am a bit lost, IIUC what is currently broken is: > > - U-Boot > 2022.04 and any version of Linux with the backport? > > > > > If I add the partitions node as a child of my nand controller, as I was > > > planning to do and I wrote 10 lines above, I will create a new flavor of > > > non-booting system with U-Boot older than v2022.04 :-/ > > > > I think there is a little confusion here. You are referring to the NAND > I guess I have not explained myself well enough :-) Ok, there is still a confusion. Even though I think your logic still applies, I want to emphasis on how wrong it is to define partitions in the NAND _controller_ node rather than the NAND _chip_ node. And I think this might have an impact on our final choice. > U-Boot is creating the partitions in the dtb, they are not defined in > the source dts file (this is common practice with multiple boards). That fdt_fixup_mtdparts() thing is a mistake. The original idea is: 1. Define wrong nodes in your DT 2. Fix your DT at run time in U-Boot 3. Provide the "fixed" DT to Linux Now step #2 now produces wrong FDT. So what, we should darken even more the of partition driver in Linux to workaround it? At most what we can do is warn the user so that people don't loose time understanding what happens, but I am against supporting this, ever. > Before v2022.04 it was always updating the nand-controller node, > starting from v2022.04 if there is a dedicated `partitions` node it uses > it. Sounds reasonable. > This is just the reverse of what ofpart_core.c is doing (if the > partitions node is there it assumes the partitions should go into it, > otherwise it proceeds with the legacy way). Yes, that's how we handle legacy bindings. > Let's have a concrete example with colibri-imx7. > > Current status: > - The nand-controller node does not include any partitions child, any > U-Boot version will just add the partition directly as child of the > nand controller. This is where I am hitting this boot regression now. Not exactly. It worked until now because your original DT already included #size-cells = <1> I believe. It does not do that anymore and that is why you get your boot regression: because the DT was modified. The reason why the DT got modified however is interesting. The commit log says the goal is to comply with modern bindings, which is great. But if you break how your board boots, then you should probably not do that. And if we really care about complying with the bindings, there is something much more interesting than fixing a single property: distinguishing the NAND controller vs. the NAND chip(s), which has been enforced since 2016 (which probably predates the imx7-colibri.dtsi, but whatever): 2d472aba15ff ("mtd: nand: document the NAND controller/NAND chip DT representation") > Potential change I envisioned here: > - I add the partitions node to the nand-controller, e.g. > > --- a/arch/arm/boot/dts/imx7-colibri.dtsi > +++ b/arch/arm/boot/dts/imx7-colibri.dtsi > @@ -380,6 +380,12 @@ &gpmi { > nand-on-flash-bbt; > pinctrl-names = "default"; > pinctrl-0 = <&pinctrl_gpmi_nand>; > + > + partitions { > + compatible = "fixed-partitions"; > + #address-cells = <1>; > + #size-cells = <1>; > + }; > }; > > - U-Boot >= v2022.04 will just work fine creating the partitions as > currently described in the bindings. > - U-Boot < v2022.04 will still create the partitions as child of the > nand-controller node. Linux will see that a `partitions` node exists > but it will be empty, leading to a boot failure in case mtd is used > as boot device. > > > > controller node, the commit refers to the NAND chip node. What this > > commit does looks fine because it just tries to use the partitions {} > > node rather than the NAND chip node and if the partitions {} node > > already exist, I expect #address-cells and #size-cells to be defined > > and be != 0 already. > yes, this commit is perfectly fine I agree. > > The reality is that people is using newer kernel with older U-Boot, and > I do not think that deliberately breaking this use case is what the > Linux kernel should do. Agreed. > I do not think that I can push a change in the DTS that will break > booting any board using an older U-Boot. That's however the initial cause of this discovery. A DT change broke your boot flow. I'm saying "your" boot flow because I am not sure it affects "any" board. For now it only affects the imx7 colibri boards because of: 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") But all these boards could be affected in the same way because of some machine code playing with fdt_fixup_mtdparts(): * arch/arm/mach-uniphier/fdt-fixup.c * board/compulab/cm_fx6/cm_fx6.c * board/gateworks/gw_ventana/gw_ventana.c * board/isee/igep003x/board.c * board/isee/igep00x0/igep00x0.c * board/phytec/phycore_am335x_r2/board.c * board/st/stm32mp1/stm32mp1.c * board/toradex/colibri-imx6ull/colibri-imx6ull.c * board/toradex/colibri_imx7/colibri_imx7.c * board/toradex/colibri_vf/colibri_vf.c That's of course way too much possible failures. I still strongly disagree with the initial proposal but what I think we can do is: 1. To prevent future breakages: Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any kernel should work. 2. To help tracking down situations like that: Keep the warning in ofpart.c but continue to fail. 3. To fix the current situation: Immediately revert commit (and prevent it from being backported): 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") This way your own boot flow is fixed in the short term. 4. There is no reason to partially fix a DT like what the above did besides trying to avoid warnings emitted by the DT check tools. If complying with modern bindings is a goal (and I think it should be), then we can modernize this DT without breaking the boot flow: Instead of only setting #size-cell = <0>, you can as well define in your DT a subnode to define the NAND chip. NAND chips are not supposed to have #size-cells properties, but in the past they did, which means #address-cells and #size-cells are allowed (and marked deprecated in the schema). So in practice, the dt-schema will not warn you if they are there, which means you can still set #size-cell = <1>. Please mind, the tools have been updated very recently to match what I am describing above, so they will likely still report errors until v6.2-rc1, see: https://lore.kernel.org/linux-mtd/20221114090315.848208-1-miquel.raynal@bootlin.com/ Does this sound reasonable? Thanks, Miquèl
On 12/2/22 15:05, Miquel Raynal wrote: > Hi Francesco, Hi, [...] > I still strongly disagree with the initial proposal but what I think we > can do is: > > 1. To prevent future breakages: > Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any > kernel should work. > > 2. To help tracking down situations like that: > Keep the warning in ofpart.c but continue to fail. > > 3. To fix the current situation: > Immediately revert commit (and prevent it from being backported): > 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") > This way your own boot flow is fixed in the short term. Here I disagree, the fix is correct and I think we shouldn't proliferate incorrect DTs which don't match the binding document. Rather, if a bootloader generates incorrect (new) DT entries, I believe the driver should implement a fixup and warn user about this. PC does that as well with broken ACPI tables as far as I can tell. I'm not convinced making a DT non-compliant with bindings again, only to work around a problem induced by bootloader, is the right approach here. This would be setting a dangerous example, where anyone could request a DT fix to be reverted because their random bootloader does the wrong thing and with valid DT clean up, something broke. > 4. There is no reason to partially fix a DT like what the above did > besides trying to avoid warnings emitted by the DT check tools. Note that the 3. does not partially fix the DT, it fixes the node fully. > If > complying with modern bindings is a goal (and I think it should > be), then we can modernize this DT without breaking the boot flow: > Instead of only setting #size-cell = <0>, you can as well define > in your DT a subnode to define the NAND chip. NAND chips are not > supposed to have #size-cells properties, but in the past they did, > which means #address-cells and #size-cells are allowed (and marked > deprecated in the schema). So in practice, the dt-schema will not > warn you if they are there, which means you can still set > #size-cell = <1>. I am really not convinced we should hack around this on the DT end and try to push some sort of convoluted workaround there, instead of fixing it on the driver side (and bootloader side, in the long run). > Please mind, the tools have been updated very recently to match > what I am describing above, so they will likely still report > errors until v6.2-rc1, see: > https://lore.kernel.org/linux-mtd/20221114090315.848208-1-miquel.raynal@bootlin.com/ > > Does this sound reasonable? [...]
Hi Marek, marex@denx.de wrote on Fri, 2 Dec 2022 15:31:40 +0100: > On 12/2/22 15:05, Miquel Raynal wrote: > > Hi Francesco, > > Hi, > > [...] > > > I still strongly disagree with the initial proposal but what I think we > > can do is: > > > > 1. To prevent future breakages: > > Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any > > kernel should work. > > > > 2. To help tracking down situations like that: > > Keep the warning in ofpart.c but continue to fail. > > > > 3. To fix the current situation: > > Immediately revert commit (and prevent it from being backported): > > 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") > > This way your own boot flow is fixed in the short term. > > Here I disagree, the fix is correct and I think we shouldn't > proliferate incorrect DTs which don't match the binding document. I agree we should not proliferate incorrect DTs, so let's use a modern description then, with a controller and a child node which defines the chip. > Rather, if a bootloader generates incorrect (new) DT entries, I > believe the driver should implement a fixup and warn user about this. > PC does that as well with broken ACPI tables as far as I can tell. > > I'm not convinced making a DT non-compliant with bindings again, I am sorry to say so, but while warnings reported by the tools should be fixed, it's not because the tool does not scream at you that the description is valid. We are actively working on enhancing the schema so that "all" improper descriptions get warnings (see the series pointed earlier), but in no way this change makes the node compliant with modern bindings. I'm not saying the fix is wrong, but let's be pragmatic, it currently leads to boot failures. > only to work around a problem induced by bootloader, is the right approach > here. When a patch breaks a board and there is no straight fix, you revert it, then you think harder. That's what I am saying. This is a temporary solution. > This would be setting a dangerous example, where anyone could request a DT fix to be reverted because their random bootloader does the wrong thing and with valid DT clean up, something broke. Please, you know this is not valid DT clean up. We've been decoupling controller and chip description since 2016. What I am proposing is a valid DT cleanup, not to the latest standard, but way closer than the current solution. > > 4. There is no reason to partially fix a DT like what the above did > > besides trying to avoid warnings emitted by the DT check tools. > > Note that the 3. does not partially fix the DT, it fixes the node fully. > > > If > > complying with modern bindings is a goal (and I think it should > > be), then we can modernize this DT without breaking the boot flow: > > Instead of only setting #size-cell = <0>, you can as well define > > in your DT a subnode to define the NAND chip. NAND chips are not > > supposed to have #size-cells properties, but in the past they did, > > which means #address-cells and #size-cells are allowed (and marked > > deprecated in the schema). So in practice, the dt-schema will not > > warn you if they are there, which means you can still set > > #size-cell = <1>. > > I am really not convinced we should hack around this on the DT end and try to push some sort of convoluted workaround there, "convoluted workaround" :-) > instead of fixing it on the driver side (and bootloader side, in the > long run). > > > Please mind, the tools have been updated very recently to match s/tools/schema/ > > what I am describing above, so they will likely still report > > errors until v6.2-rc1, see: > > https://lore.kernel.org/linux-mtd/20221114090315.848208-1-miquel.raynal@bootlin.com/ > > > > Does this sound reasonable? > > [...] Thanks, Miquèl
On 12/2/22 16:00, Miquel Raynal wrote: > Hi Marek, Hi, > marex@denx.de wrote on Fri, 2 Dec 2022 15:31:40 +0100: > >> On 12/2/22 15:05, Miquel Raynal wrote: >>> Hi Francesco, >> >> Hi, >> >> [...] >> >>> I still strongly disagree with the initial proposal but what I think we >>> can do is: >>> >>> 1. To prevent future breakages: >>> Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any >>> kernel should work. >>> >>> 2. To help tracking down situations like that: >>> Keep the warning in ofpart.c but continue to fail. >>> >>> 3. To fix the current situation: >>> Immediately revert commit (and prevent it from being backported): >>> 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") >>> This way your own boot flow is fixed in the short term. >> >> Here I disagree, the fix is correct and I think we shouldn't >> proliferate incorrect DTs which don't match the binding document. > > I agree we should not proliferate incorrect DTs, so let's use a modern > description then Yes please ! > , with a controller and a child node which defines the > chip. But what if there is no chip connected to the controller node ? If I understand the proposal here right (please correct me if I'm wrong), then: 1) This is the original, old, wrong binding: &gpmi { #size-cells = <1>; ... partition@N { ... }; }; 2) This is the newer, but still wrong binding: &gpmi { #size-cells = <0>; ... partitions { partition@N { ... }; }; }; 3) This is the newest binding, what we want: &gpmi { #size-cells = <0>; ... nand-chip { partitions { partition@N { ... }; }; }; }; But if there is no physical nand chip connected to the controller, would we end up with empty nand-chip node in DT, like this? &gpmi { #size-cells = <X>; ... nand-chip { /* empty */ }; }; What would be the gpmi controller size cells (X) in that case, still 0, right ? So how does that help solve this problem, wouldn't U-Boot still populate the partitions directly under the gpmi node or into partitions sub-node ? >> Rather, if a bootloader generates incorrect (new) DT entries, I >> believe the driver should implement a fixup and warn user about this. >> PC does that as well with broken ACPI tables as far as I can tell. >> >> I'm not convinced making a DT non-compliant with bindings again, > > I am sorry to say so, but while warnings reported by the tools > should be fixed, it's not because the tool does not scream at you that > the description is valid. We are actively working on enhancing the > schema so that "all" improper descriptions get warnings (see the series > pointed earlier), but in no way this change makes the node compliant > with modern bindings. > > I'm not saying the fix is wrong, but let's be pragmatic, it currently > leads to boot failures. I fully agree that we do have a problem, and that it trickled into stable makes it even worse. Maybe I don't fully understand the thing with nand-chip proposal, see my question above, esp. the last part. >> only to work around a problem induced by bootloader, is the right approach >> here. > > When a patch breaks a board and there is no straight fix, you revert > it, then you think harder. That's what I am saying. This is a temporary > solution. Isn't this patch the straight fix, at least until the bootloader can be updated to generate the nand-chip node correctly ? >> This would be setting a dangerous example, where anyone could request a DT fix to be reverted because their random bootloader does the wrong thing and with valid DT clean up, something broke. > > Please, you know this is not valid DT clean up. We've been decoupling > controller and chip description since 2016. What I am proposing is a > valid DT cleanup, not to the latest standard, but way closer than the > current solution. I think I really need one more explanation of the nand-chip part above. [...]
Hi Marek, marex@denx.de wrote on Fri, 2 Dec 2022 16:23:29 +0100: > On 12/2/22 16:00, Miquel Raynal wrote: > > Hi Marek, > > Hi, > > > marex@denx.de wrote on Fri, 2 Dec 2022 15:31:40 +0100: > > > >> On 12/2/22 15:05, Miquel Raynal wrote: > >>> Hi Francesco, > >> > >> Hi, > >> > >> [...] > >> > >>> I still strongly disagree with the initial proposal but what I think we > >>> can do is: > >>> > >>> 1. To prevent future breakages: > >>> Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any > >>> kernel should work. > >>> > >>> 2. To help tracking down situations like that: > >>> Keep the warning in ofpart.c but continue to fail. > >>> > >>> 3. To fix the current situation: > >>> Immediately revert commit (and prevent it from being backported): > >>> 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") > >>> This way your own boot flow is fixed in the short term. > >> > >> Here I disagree, the fix is correct and I think we shouldn't > >> proliferate incorrect DTs which don't match the binding document. > > > > I agree we should not proliferate incorrect DTs, so let's use a modern > > description then > > Yes please ! > > > , with a controller and a child node which defines the > > chip. > > But what if there is no chip connected to the controller node ? > > If I understand the proposal here right (please correct me if I'm wrong), then: Good idea to summarize. > > 1) This is the original, old, wrong binding: > &gpmi { > #size-cells = <1>; > ... > partition@N { ... }; > }; Yes. > > > 2) This is the newer, but still wrong binding: > &gpmi { > #size-cells = <0>; > ... > partitions { > partition@N { ... }; > }; > }; Well, this is wrong description, but it would work (for compat reasons, even though I don't think this is considered valid DT by the schemas). > > 3) This is the newest binding, what we want: > &gpmi { > #size-cells = <0>; > ... > nand-chip { > partitions { > partition@N { ... }; > }; > }; > }; Yes > > But if there is no physical nand chip connected to the controller, would we end up with empty nand-chip node in DT, like this? > &gpmi { > #size-cells = <X>; > ... > nand-chip { /* empty */ }; > }; Is this really a concern? If there is no NAND chip, the controller should be disabled, no? I guess technically you could even use the status property in the nand-chip node... However, it should not be empty, at the very least a reg property should indicate on which CS it is wired, as expected there: https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/nand-chip.yaml?h=mtd/next But, as nand-chip.yaml references mtd.yaml, you can as well use whatever is described here: https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/mtd.yaml?h=mtd/next > What would be the gpmi controller size cells (X) in that case, still 0, right ? So how does that help solve this problem, wouldn't U-Boot still populate the partitions directly under the gpmi node or into partitions sub-node ? The commit that was pointed in the original fix clearly stated that the NAND chip node was targeted, not the NAND controller node. I hope this is correctly supported in U-Boot though. So if there is a NAND chip subnode, I suppose U-Boot would try to create the partitions that are inside, or even in the sub "partitions" container. > >> Rather, if a bootloader generates incorrect (new) DT entries, I > >> believe the driver should implement a fixup and warn user about this. > >> PC does that as well with broken ACPI tables as far as I can tell. > >> > >> I'm not convinced making a DT non-compliant with bindings again, > > > > I am sorry to say so, but while warnings reported by the tools > > should be fixed, it's not because the tool does not scream at you that > > the description is valid. We are actively working on enhancing the > > schema so that "all" improper descriptions get warnings (see the series > > pointed earlier), but in no way this change makes the node compliant > > with modern bindings. > > > > I'm not saying the fix is wrong, but let's be pragmatic, it currently > > leads to boot failures. > > I fully agree that we do have a problem, and that it trickled into stable makes it even worse. Maybe I don't fully understand the thing with nand-chip proposal, see my question above, esp. the last part. > > >> only to work around a problem induced by bootloader, is the right approach > >> here. > > > > When a patch breaks a board and there is no straight fix, you revert > > it, then you think harder. That's what I am saying. This is a temporary > > solution. > > Isn't this patch the straight fix, at least until the bootloader can be updated to generate the nand-chip node correctly ? > > >> This would be setting a dangerous example, where anyone could request a DT fix to be reverted because their random bootloader does the wrong thing and with valid DT clean up, something broke. > > > > Please, you know this is not valid DT clean up. We've been decoupling > > controller and chip description since 2016. What I am proposing is a > > valid DT cleanup, not to the latest standard, but way closer than the > > current solution. > > I think I really need one more explanation of the nand-chip part above. I hope things are clearer now. Thanks, Miquèl
On 02.12.22 15:31, Marek Vasut wrote: > On 12/2/22 15:05, Miquel Raynal wrote: > [...] >> 3. To fix the current situation: >> Immediately revert commit (and prevent it from being backported): >> 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") >> This way your own boot flow is fixed in the short term. > > Here I disagree, the fix is correct and I think we shouldn't proliferate > incorrect DTs which don't match the binding document. Rather, if a > bootloader generates incorrect (new) DT entries, I believe the driver > should implement a fixup and warn user about this. PC does that as well > with broken ACPI tables as far as I can tell. Well, that might be the right solution in the long run, that's up for others to decide, but we need to fix this *quickly*. For two reasons actually: the 6.1 release is near and the change was backported to stable already. For details wrt to the "quickly", see "Prioritize work on fixing regressions" here: https://docs.kernel.org/process/handling-regressions.html IOW: Ideally it should be fixed by Sunday. I'll hence likely soon will point Linus to this and suggest to revert this, unless there are strong reasons against that or some sort of agreement on a better solution. Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) P.S.: As the Linux kernel's regression tracker I deal with a lot of reports and sometimes miss something important when writing mails like this. If that's the case here, don't hesitate to tell me in a public reply, it's in everyone's interest to set the public record straight.
miquel.raynal@bootlin.com wrote on Fri, 2 Dec 2022 16:49:04 +0100: > Hi Marek, > > marex@denx.de wrote on Fri, 2 Dec 2022 16:23:29 +0100: > > > On 12/2/22 16:00, Miquel Raynal wrote: > > > Hi Marek, > > > > Hi, > > > > > marex@denx.de wrote on Fri, 2 Dec 2022 15:31:40 +0100: > > > > > >> On 12/2/22 15:05, Miquel Raynal wrote: > > >>> Hi Francesco, > > >> > > >> Hi, > > >> > > >> [...] > > >> > > >>> I still strongly disagree with the initial proposal but what I think we > > >>> can do is: > > >>> > > >>> 1. To prevent future breakages: > > >>> Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any > > >>> kernel should work. > > >>> > > >>> 2. To help tracking down situations like that: > > >>> Keep the warning in ofpart.c but continue to fail. > > >>> > > >>> 3. To fix the current situation: > > >>> Immediately revert commit (and prevent it from being backported): > > >>> 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") > > >>> This way your own boot flow is fixed in the short term. > > >> > > >> Here I disagree, the fix is correct and I think we shouldn't > > >> proliferate incorrect DTs which don't match the binding document. > > > > > > I agree we should not proliferate incorrect DTs, so let's use a modern > > > description then > > > > Yes please ! > > > > > , with a controller and a child node which defines the > > > chip. > > > > But what if there is no chip connected to the controller node ? > > > > If I understand the proposal here right (please correct me if I'm wrong), then: > > Good idea to summarize. > > > > > 1) This is the original, old, wrong binding: > > &gpmi { > > #size-cells = <1>; > > ... > > partition@N { ... }; > > }; > > Yes. > > > > > > > 2) This is the newer, but still wrong binding: > > &gpmi { > > #size-cells = <0>; > > ... > > partitions { > > partition@N { ... }; > > }; > > }; > > Well, this is wrong description, but it would work (for compat reasons, > even though I don't think this is considered valid DT by the schemas). > > > > > 3) This is the newest binding, what we want: > > &gpmi { > > #size-cells = <0>; > > ... > > nand-chip { > > partitions { > > partition@N { ... }; > > }; > > }; > > }; > > Yes Perhaps I should also mention that #size-cells expected to be 0 has nothing to do with the "partitions" container (otherwise #address-cells would be 0 as well). This value is however asking for an address-only reg property describing which NAND chip should be addressed and how, basically the NAND controller CS because you can wire your NAND to any CS. > > But if there is no physical nand chip connected to the controller, would we end up with empty nand-chip node in DT, like this? > > &gpmi { > > #size-cells = <X>; > > ... > > nand-chip { /* empty */ }; > > }; > > Is this really a concern? If there is no NAND chip, the controller > should be disabled, no? I guess technically you could even use the > status property in the nand-chip node... > > However, it should not be empty, at the very least a reg property > should indicate on which CS it is wired, as expected there: > https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/nand-chip.yaml?h=mtd/next > > But, as nand-chip.yaml references mtd.yaml, you can as well use > whatever is described here: > https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/mtd.yaml?h=mtd/next > > > What would be the gpmi controller size cells (X) in that case, still 0, right ? So how does that help solve this problem, wouldn't U-Boot still populate the partitions directly under the gpmi node or into partitions sub-node ? > > The commit that was pointed in the original fix clearly stated that the > NAND chip node was targeted, not the NAND controller node. I hope this > is correctly supported in U-Boot though. So if there is a NAND chip > subnode, I suppose U-Boot would try to create the partitions that are > inside, or even in the sub "partitions" container. > > > >> Rather, if a bootloader generates incorrect (new) DT entries, I > > >> believe the driver should implement a fixup and warn user about this. > > >> PC does that as well with broken ACPI tables as far as I can tell. > > >> > > >> I'm not convinced making a DT non-compliant with bindings again, > > > > > > I am sorry to say so, but while warnings reported by the tools > > > should be fixed, it's not because the tool does not scream at you that > > > the description is valid. We are actively working on enhancing the > > > schema so that "all" improper descriptions get warnings (see the series > > > pointed earlier), but in no way this change makes the node compliant > > > with modern bindings. > > > > > > I'm not saying the fix is wrong, but let's be pragmatic, it currently > > > leads to boot failures. > > > > I fully agree that we do have a problem, and that it trickled into stable makes it even worse. Maybe I don't fully understand the thing with nand-chip proposal, see my question above, esp. the last part. > > > > >> only to work around a problem induced by bootloader, is the right approach > > >> here. > > > > > > When a patch breaks a board and there is no straight fix, you revert > > > it, then you think harder. That's what I am saying. This is a temporary > > > solution. > > > > Isn't this patch the straight fix, at least until the bootloader can be updated to generate the nand-chip node correctly ? > > > > >> This would be setting a dangerous example, where anyone could request a DT fix to be reverted because their random bootloader does the wrong thing and with valid DT clean up, something broke. > > > > > > Please, you know this is not valid DT clean up. We've been decoupling > > > controller and chip description since 2016. What I am proposing is a > > > valid DT cleanup, not to the latest standard, but way closer than the > > > current solution. > > > > I think I really need one more explanation of the nand-chip part above. > > I hope things are clearer now. > > Thanks, > Miquèl
On 12/2/22 16:49, Miquel Raynal wrote: > Hi Marek, Hi, >> On 12/2/22 16:00, Miquel Raynal wrote: >>> Hi Marek, >> >> Hi, >> >>> marex@denx.de wrote on Fri, 2 Dec 2022 15:31:40 +0100: >>> >>>> On 12/2/22 15:05, Miquel Raynal wrote: >>>>> Hi Francesco, >>>> >>>> Hi, >>>> >>>> [...] >>>> >>>>> I still strongly disagree with the initial proposal but what I think we >>>>> can do is: >>>>> >>>>> 1. To prevent future breakages: >>>>> Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any >>>>> kernel should work. >>>>> >>>>> 2. To help tracking down situations like that: >>>>> Keep the warning in ofpart.c but continue to fail. >>>>> >>>>> 3. To fix the current situation: >>>>> Immediately revert commit (and prevent it from being backported): >>>>> 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") >>>>> This way your own boot flow is fixed in the short term. >>>> >>>> Here I disagree, the fix is correct and I think we shouldn't >>>> proliferate incorrect DTs which don't match the binding document. >>> >>> I agree we should not proliferate incorrect DTs, so let's use a modern >>> description then >> >> Yes please ! >> >>> , with a controller and a child node which defines the >>> chip. >> >> But what if there is no chip connected to the controller node ? >> >> If I understand the proposal here right (please correct me if I'm wrong), then: > > Good idea to summarize. > >> >> 1) This is the original, old, wrong binding: >> &gpmi { >> #size-cells = <1>; >> ... >> partition@N { ... }; >> }; > > Yes. > >> >> >> 2) This is the newer, but still wrong binding: >> &gpmi { >> #size-cells = <0>; >> ... >> partitions { >> partition@N { ... }; >> }; >> }; > > Well, this is wrong description, but it would work (for compat reasons, > even though I don't think this is considered valid DT by the schemas). > >> >> 3) This is the newest binding, what we want: >> &gpmi { >> #size-cells = <0>; >> ... >> nand-chip { >> partitions { >> partition@N { ... }; >> }; >> }; >> }; > > Yes > >> >> But if there is no physical nand chip connected to the controller, would we end up with empty nand-chip node in DT, like this? >> &gpmi { >> #size-cells = <X>; >> ... >> nand-chip { /* empty */ }; >> }; > > Is this really a concern? I don't know, maybe it is not. > If there is no NAND chip, the controller > should be disabled, no? I guess technically you could even use the > status property in the nand-chip node... Sure. > However, it should not be empty, at the very least a reg property > should indicate on which CS it is wired, as expected there: > https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/nand-chip.yaml?h=mtd/next OK, I see your point. So basically this? &gpmi { #size-cells = <1>; ... nand-chip@0 { reg = <0>; }; }; btw. the GPMI NAND controller supports only one chipselect, so the reg in nand-chip node makes little sense. > But, as nand-chip.yaml references mtd.yaml, you can as well use > whatever is described here: > https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/mtd.yaml?h=mtd/next > >> What would be the gpmi controller size cells (X) in that case, still 0, right ? So how does that help solve this problem, wouldn't U-Boot still populate the partitions directly under the gpmi node or into partitions sub-node ? > > The commit that was pointed in the original fix clearly stated that the > NAND chip node was targeted I think this is another miscommunication here. The commit 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") modifies the size-cells of the NAND controller. The nand-chip is not involved in this at all . In the examples above, it's the "&gpmi" node size-cells that is modified. > , not the NAND controller node. I hope this > is correctly supported in U-Boot though. So if there is a NAND chip > subnode, I suppose U-Boot would try to create the partitions that are > inside, or even in the sub "partitions" container. My understanding is that U-Boot checks the nand-controller node size-cells, not the nand-chip{} or partitions{} subnode size-cells . Francesco, can you please share the DT, including the U-Boot generated partitions, which is passed to Linux on Colibri MX7 ? I think that should make all confusion go away. (or am I the only one who's still confused here?)
Hi Marek, marex@denx.de wrote on Fri, 2 Dec 2022 17:17:59 +0100: > On 12/2/22 16:49, Miquel Raynal wrote: > > Hi Marek, > > Hi, > > >> On 12/2/22 16:00, Miquel Raynal wrote: > >>> Hi Marek, > >> > >> Hi, > >> > >>> marex@denx.de wrote on Fri, 2 Dec 2022 15:31:40 +0100: > >>> >>>> On 12/2/22 15:05, Miquel Raynal wrote: > >>>>> Hi Francesco, > >>>> > >>>> Hi, > >>>> > >>>> [...] > >>>> >>>>> I still strongly disagree with the initial proposal but what I think we > >>>>> can do is: > >>>>> > >>>>> 1. To prevent future breakages: > >>>>> Fix fdt_fixup_mtdparts() in u-boot. This way newer U-Boot + any > >>>>> kernel should work. > >>>>> > >>>>> 2. To help tracking down situations like that: > >>>>> Keep the warning in ofpart.c but continue to fail. > >>>>> > >>>>> 3. To fix the current situation: > >>>>> Immediately revert commit (and prevent it from being backported): > >>>>> 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") > >>>>> This way your own boot flow is fixed in the short term. > >>>> > >>>> Here I disagree, the fix is correct and I think we shouldn't > >>>> proliferate incorrect DTs which don't match the binding document. > >>> > >>> I agree we should not proliferate incorrect DTs, so let's use a modern > >>> description then > >> > >> Yes please ! > >> > >>> , with a controller and a child node which defines the > >>> chip. > >> > >> But what if there is no chip connected to the controller node ? > >> > >> If I understand the proposal here right (please correct me if I'm wrong), then: > > > > Good idea to summarize. > > > >> > >> 1) This is the original, old, wrong binding: > >> &gpmi { > >> #size-cells = <1>; > >> ... > >> partition@N { ... }; > >> }; > > > > Yes. > > > >> > >> > >> 2) This is the newer, but still wrong binding: > >> &gpmi { > >> #size-cells = <0>; > >> ... > >> partitions { > >> partition@N { ... }; > >> }; > >> }; > > > > Well, this is wrong description, but it would work (for compat reasons, > > even though I don't think this is considered valid DT by the schemas). > > > >> > >> 3) This is the newest binding, what we want: > >> &gpmi { > >> #size-cells = <0>; > >> ... > >> nand-chip { > >> partitions { > >> partition@N { ... }; > >> }; > >> }; > >> }; > > > > Yes > > > >> > >> But if there is no physical nand chip connected to the controller, would we end up with empty nand-chip node in DT, like this? > >> &gpmi { > >> #size-cells = <X>; > >> ... > >> nand-chip { /* empty */ }; > >> }; > > > > Is this really a concern? > > I don't know, maybe it is not. > > > If there is no NAND chip, the controller > > should be disabled, no? I guess technically you could even use the > > status property in the nand-chip node... > > Sure. > > > However, it should not be empty, at the very least a reg property > > should indicate on which CS it is wired, as expected there: > > https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/nand-chip.yaml?h=mtd/next > > OK, I see your point. So basically this? > > &gpmi { > #size-cells = <1>; > ... > nand-chip@0 { > reg = <0>; > }; > }; > > btw. the GPMI NAND controller supports only one chipselect, so the reg in nand-chip node makes little sense. I randomly opened a reference manual (IMX6DQL.pdf), they say: "Up to four NAND devices, supported by four chip-selects and one ganged ready/ busy." Anyway, the NAND controller generic bindings which require this reg property, what the controller or the driver actually supports, or even how it is used on current designs is not relevant here. > > But, as nand-chip.yaml references mtd.yaml, you can as well use > > whatever is described here: > > https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/mtd.yaml?h=mtd/next > > > >> What would be the gpmi controller size cells (X) in that case, still 0, right ? So how does that help solve this problem, wouldn't U-Boot still populate the partitions directly under the gpmi node or into partitions sub-node ? > > > > The commit that was pointed in the original fix clearly stated that the > > NAND chip node was targeted > > I think this is another miscommunication here. The commit > > 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") > > modifies the size-cells of the NAND controller. The nand-chip is not involved in this at all . In the examples above, it's the "&gpmi" node size-cells that is modified. Yes I know. I was referring to this commit, sorry: 36fee2f7621e ("common: fdt_support: add support for "partitions" subnode to fdt_fixup_mtdparts()") The log says: Listing MTD partitions directly in the flash mode has been deprecated for a while for kernel Device Trees. Look for a node "partitions" in the found flash nodes and use it instead of the flash node itself for the partition list when it exists, so Device Trees following the current best practices can be fixed up. Which (I hope) means U-boot will equivalently try to play with the partitions container, either in the controller node or in the chip node. > > , not the NAND controller node. I hope this > > is correctly supported in U-Boot though. So if there is a NAND chip > > subnode, I suppose U-Boot would try to create the partitions that are > > inside, or even in the sub "partitions" container. > > My understanding is that U-Boot checks the nand-controller node size-cells, not the nand-chip{} or partitions{} subnode size-cells . I don't think U-Boot cares. > Francesco, can you please share the DT, including the U-Boot generated partitions, which is passed to Linux on Colibri MX7 ? I think that should make all confusion go away. Please also do it with the NAND chip described. If, when the NAND chip is described U-Boot tries to create partitions in the controller node, then the situation is even worse than I thought. But I believe describing the node like a suggest in the DT should prevent the boot failure while still allowing a rather good description of the hardware. BTW I still think the relevant action right now is to revert the DT patch. Thanks, Miquèl
On Fri, Dec 02, 2022 at 05:17:59PM +0100, Marek Vasut wrote: > On 12/2/22 16:49, Miquel Raynal wrote: > > , not the NAND controller node. I hope this > > is correctly supported in U-Boot though. So if there is a NAND chip > > subnode, I suppose U-Boot would try to create the partitions that are > > inside, or even in the sub "partitions" container. > > My understanding is that U-Boot checks the nand-controller node size-cells, > not the nand-chip{} or partitions{} subnode size-cells . Not 100% correct. - U-Boot before v2022.04 updates the nand-controller{} node, no matter what. - U-Boot starting from v2022.04 looks for `partitions{}` into the nand-controller{} node, and creates the partition into it if found. If not found it behaves the same way as the previous versions. See commit 36fee2f7621e ("common: fdt_support: add support for "partitions" subnode to fdt_fixup_mtdparts()") I'd like to stress once more the fact that we cannot expect old U-Boot to be updated in the field, and they will keep generating the partitions as child of the nand-controller node whatever we do with the dts file. I think that this should be treated the same way as any other fixup we might have for broken firmware, especially considering that this used to "work" (yes, I can agree that it horrible, but I cannot change the past) without even a warning since the imx7 support was first introduced in the linux kernel years ago. > Francesco, can you please share the DT, including the U-Boot generated > partitions, which is passed to Linux on Colibri MX7 ? I think that should > make all confusion go away. The device tree part is easy, just arch/arm/boot/dts/imx7d-colibri-eval-v3.dts. and the nand-controller node is coming from #include "imx7d.dtsi" plus &gpmi { fsl,use-minimum-ecc; nand-ecc-mode = "hw"; nand-on-flash-bbt; pinctrl-names = "default"; pinctrl-0 = <&pinctrl_gpmi_nand>; }; The partitions nodes are generated 100% by U-Boot, nothing is present in the dts source files. With this DTS file as input, whatever U-Boot version is used I have the following generated: root@colibri-imx7-02844233:/# ls /proc/device-tree/soc/nand-controller@33002000/ #address-cells dma-names nand-on-flash-bbt pinctrl-0 #size-cells dmas partition@0 pinctrl-names assigned-clock-parents fsl,use-minimum-ecc partition@200000 reg assigned-clocks interrupt-names partition@380000 reg-names clock-names interrupts partition@400000 status clocks name partition@80000 compatible nand-ecc-mode phandle root@colibri-imx7-02844233:/# ls /proc/device-tree/soc/nand-controller@33002000/partition@* /proc/device-tree/soc/nand-controller@33002000/partition@0: label name reg /proc/device-tree/soc/nand-controller@33002000/partition@200000: label name read_only reg /proc/device-tree/soc/nand-controller@33002000/partition@380000: label name reg /proc/device-tree/soc/nand-controller@33002000/partition@400000: label name reg /proc/device-tree/soc/nand-controller@33002000/partition@80000: label name read_only reg
On 12/2/22 17:42, Miquel Raynal wrote: > Hi Marek, Hi, [...] >>> However, it should not be empty, at the very least a reg property >>> should indicate on which CS it is wired, as expected there: >>> https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/nand-chip.yaml?h=mtd/next >> >> OK, I see your point. So basically this? >> >> &gpmi { >> #size-cells = <1>; >> ... >> nand-chip@0 { >> reg = <0>; >> }; >> }; >> >> btw. the GPMI NAND controller supports only one chipselect, so the reg in nand-chip node makes little sense. > > I randomly opened a reference manual (IMX6DQL.pdf), they say: > > "Up to four NAND devices, supported by four chip-selects and one > ganged ready/ busy." Doh, and MX7D has the same controller, so size-cells = <1>; makes sense with nand-chip@N {} . > Anyway, the NAND controller generic bindings which require this reg > property, what the controller or the driver actually supports, or even > how it is used on current designs is not relevant here. > >>> But, as nand-chip.yaml references mtd.yaml, you can as well use >>> whatever is described here: >>> https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/mtd.yaml?h=mtd/next >>> >>>> What would be the gpmi controller size cells (X) in that case, still 0, right ? So how does that help solve this problem, wouldn't U-Boot still populate the partitions directly under the gpmi node or into partitions sub-node ? >>> >>> The commit that was pointed in the original fix clearly stated that the >>> NAND chip node was targeted >> >> I think this is another miscommunication here. The commit >> >> 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") >> >> modifies the size-cells of the NAND controller. The nand-chip is not involved in this at all . In the examples above, it's the "&gpmi" node size-cells that is modified. > > Yes I know. I was referring to this commit, sorry: > 36fee2f7621e ("common: fdt_support: add support for "partitions" subnode to fdt_fixup_mtdparts()") > > The log says: > > Listing MTD partitions directly in the flash mode has been > deprecated for a while for kernel Device Trees. Look for a node "partitions" in the > found flash nodes and use it instead of the flash node itself for the > partition list when it exists, so Device Trees following the current > best practices can be fixed up. > > Which (I hope) means U-boot will equivalently try to play with the > partitions container, either in the controller node or in the chip node. > >>> , not the NAND controller node. I hope this >>> is correctly supported in U-Boot though. So if there is a NAND chip >>> subnode, I suppose U-Boot would try to create the partitions that are >>> inside, or even in the sub "partitions" container. >> >> My understanding is that U-Boot checks the nand-controller node size-cells, not the nand-chip{} or partitions{} subnode size-cells . > > I don't think U-Boot cares. > >> Francesco, can you please share the DT, including the U-Boot generated partitions, which is passed to Linux on Colibri MX7 ? I think that should make all confusion go away. > > Please also do it with the NAND chip described. If, when the NAND chip > is described U-Boot tries to create partitions in the controller node, > then the situation is even worse than I thought. But I believe > describing the node like a suggest in the DT should prevent the boot > failure while still allowing a rather good description of the hardware. > > BTW I still think the relevant action right now is to revert the DT > patch. I am starting to bank toward that variant as well (thanks for clarifying the rationale in the discussion, that helped a lot). But then, the follow up fix would be what exactly, update the binding document to require #size-cells = <1>; ?
Hi Marek, marex@denx.de wrote on Fri, 2 Dec 2022 17:52:05 +0100: > On 12/2/22 17:42, Miquel Raynal wrote: > > Hi Marek, > > Hi, > > [...] > > >>> However, it should not be empty, at the very least a reg property > >>> should indicate on which CS it is wired, as expected there: > >>> https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/nand-chip.yaml?h=mtd/next > >> > >> OK, I see your point. So basically this? > >> > >> &gpmi { > >> #size-cells = <1>; > >> ... > >> nand-chip@0 { > >> reg = <0>; > >> }; > >> }; > >> > >> btw. the GPMI NAND controller supports only one chipselect, so the reg in nand-chip node makes little sense. > > > > I randomly opened a reference manual (IMX6DQL.pdf), they say: > > > > "Up to four NAND devices, supported by four chip-selects and one > > ganged ready/ busy." > > Doh, and MX7D has the same controller, so size-cells = <1>; makes sense with nand-chip@N {} . Actually #address-cells is here for that. You need to point at one CS, so in most cases this is: controller { #address-cells = <1>; #size-cells = <0>; chip@N { reg = <N>; }; }; > > > Anyway, the NAND controller generic bindings which require this reg > > property, what the controller or the driver actually supports, or even > > how it is used on current designs is not relevant here. > > > >>> But, as nand-chip.yaml references mtd.yaml, you can as well use > >>> whatever is described here: > >>> https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/mtd.yaml?h=mtd/next > >>> >>>> What would be the gpmi controller size cells (X) in that case, still 0, right ? So how does that help solve this problem, wouldn't U-Boot still populate the partitions directly under the gpmi node or into partitions sub-node ? > >>> > >>> The commit that was pointed in the original fix clearly stated that the > >>> NAND chip node was targeted > >> > >> I think this is another miscommunication here. The commit > >> > >> 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") > >> > >> modifies the size-cells of the NAND controller. The nand-chip is not involved in this at all . In the examples above, it's the "&gpmi" node size-cells that is modified. > > > > Yes I know. I was referring to this commit, sorry: > > 36fee2f7621e ("common: fdt_support: add support for "partitions" subnode to fdt_fixup_mtdparts()") > > > > The log says: > > > > Listing MTD partitions directly in the flash mode has been > > deprecated for a while for kernel Device Trees. Look for a node "partitions" in the > > found flash nodes and use it instead of the flash node itself for the > > partition list when it exists, so Device Trees following the current > > best practices can be fixed up. > > > > Which (I hope) means U-boot will equivalently try to play with the > > partitions container, either in the controller node or in the chip node. > > > >>> , not the NAND controller node. I hope this > >>> is correctly supported in U-Boot though. So if there is a NAND chip > >>> subnode, I suppose U-Boot would try to create the partitions that are > >>> inside, or even in the sub "partitions" container. > >> > >> My understanding is that U-Boot checks the nand-controller node size-cells, not the nand-chip{} or partitions{} subnode size-cells . > > > > I don't think U-Boot cares. > > > >> Francesco, can you please share the DT, including the U-Boot generated partitions, which is passed to Linux on Colibri MX7 ? I think that should make all confusion go away. > > > > Please also do it with the NAND chip described. If, when the NAND chip > > is described U-Boot tries to create partitions in the controller node, > > then the situation is even worse than I thought. But I believe > > describing the node like a suggest in the DT should prevent the boot > > failure while still allowing a rather good description of the hardware. > > > > BTW I still think the relevant action right now is to revert the DT > > patch. > > I am starting to bank toward that variant as well (thanks for clarifying the rationale in the discussion, that helped a lot). > > But then, the follow up fix would be what exactly, update the binding document to require #size-cells = <1>; ? Thanks, Miquèl
Hi Francesco, francesco@dolcini.it wrote on Fri, 2 Dec 2022 17:45:37 +0100: > On Fri, Dec 02, 2022 at 05:17:59PM +0100, Marek Vasut wrote: > > On 12/2/22 16:49, Miquel Raynal wrote: > > > , not the NAND controller node. I hope this > > > is correctly supported in U-Boot though. So if there is a NAND chip > > > subnode, I suppose U-Boot would try to create the partitions that are > > > inside, or even in the sub "partitions" container. > > > > My understanding is that U-Boot checks the nand-controller node size-cells, > > not the nand-chip{} or partitions{} subnode size-cells . > Not 100% correct. > > - U-Boot before v2022.04 updates the nand-controller{} node, no matter what. > - U-Boot starting from v2022.04 looks for `partitions{}` into the > nand-controller{} node, and creates the partition into it if found. > If not found it behaves the same way as the previous versions. > See commit 36fee2f7621e ("common: fdt_support: add support for "partitions" subnode to fdt_fixup_mtdparts()") > > I'd like to stress once more the fact that we cannot expect old U-Boot > to be updated in the field, and they will keep generating the partitions > as child of the nand-controller node whatever we do with the dts file. > > I think that this should be treated the same way as any other fixup we > might have for broken firmware, especially considering that this used to > "work" (yes, I can agree that it horrible, but I cannot change the past) > without even a warning since the imx7 support was first introduced in > the linux kernel years ago. > > > Francesco, can you please share the DT, including the U-Boot generated > > partitions, which is passed to Linux on Colibri MX7 ? I think that should > > make all confusion go away. > > The device tree part is easy, just > arch/arm/boot/dts/imx7d-colibri-eval-v3.dts. > > and the nand-controller node is coming from > > #include "imx7d.dtsi" > > plus > > &gpmi { > fsl,use-minimum-ecc; > nand-ecc-mode = "hw"; > nand-on-flash-bbt; > pinctrl-names = "default"; > pinctrl-0 = <&pinctrl_gpmi_nand>; > }; > > The partitions nodes are generated 100% by U-Boot, nothing is present in > the dts source files. I hope if you provide a NAND chip child node, the partitions are created at the right location, otherwise this is so, so wrong... > > With this DTS file as input, whatever U-Boot version is used I have the > following generated: > > root@colibri-imx7-02844233:/# ls /proc/device-tree/soc/nand-controller@33002000/ > #address-cells dma-names nand-on-flash-bbt pinctrl-0 > #size-cells dmas partition@0 pinctrl-names > assigned-clock-parents fsl,use-minimum-ecc partition@200000 reg > assigned-clocks interrupt-names partition@380000 reg-names > clock-names interrupts partition@400000 status > clocks name partition@80000 > compatible nand-ecc-mode phandle > > root@colibri-imx7-02844233:/# ls /proc/device-tree/soc/nand-controller@33002000/partition@* > /proc/device-tree/soc/nand-controller@33002000/partition@0: > label name reg > > /proc/device-tree/soc/nand-controller@33002000/partition@200000: > label name read_only reg > > /proc/device-tree/soc/nand-controller@33002000/partition@380000: > label name reg > > /proc/device-tree/soc/nand-controller@33002000/partition@400000: > label name reg > > /proc/device-tree/soc/nand-controller@33002000/partition@80000: > label name read_only reg > > Thanks, Miquèl
On 12/2/22 17:57, Miquel Raynal wrote: > Hi Marek, Hi, >> On 12/2/22 17:42, Miquel Raynal wrote: >>> Hi Marek, >> >> Hi, >> >> [...] >> >>>>> However, it should not be empty, at the very least a reg property >>>>> should indicate on which CS it is wired, as expected there: >>>>> https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/tree/Documentation/devicetree/bindings/mtd/nand-chip.yaml?h=mtd/next >>>> >>>> OK, I see your point. So basically this? >>>> >>>> &gpmi { >>>> #size-cells = <1>; >>>> ... >>>> nand-chip@0 { >>>> reg = <0>; >>>> }; >>>> }; >>>> >>>> btw. the GPMI NAND controller supports only one chipselect, so the reg in nand-chip node makes little sense. >>> >>> I randomly opened a reference manual (IMX6DQL.pdf), they say: >>> >>> "Up to four NAND devices, supported by four chip-selects and one >>> ganged ready/ busy." >> >> Doh, and MX7D has the same controller, so size-cells = <1>; makes sense with nand-chip@N {} . > > Actually #address-cells is here for that. You need to point at one CS, > so in most cases this is: > > controller { > #address-cells = <1>; > #size-cells = <0>; > chip@N { > reg = <N>; > }; > }; Right ... thanks for spotting this. But then the size-cells in the controller node should be zero. And 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") is therefore correct ? But that correction in 753395ea1e45 breaks setup where old U-Boot injects partition@N directly into the nand-controller node, without updating the nand-controller node size-cells, i.e. this: nand-controller { #address-cells = <1>; #size-cells = <0>; + partition@0 { ... }; }; Because the size-cells is 0 in that case, and U-Boot does not update the size-cells . It used to work before because Linux DT contained size-cells=<1> in the nand-controller node before 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells"). But here I would say this is a firmware bug and it might have to be handled like a firmware bug, i.e. with fixup in the partition parser. I seem to be changing my opinion here again.
On Fri, Dec 02, 2022 at 05:42:55PM +0100, Miquel Raynal wrote: > Please also do it with the NAND chip described. If, when the NAND chip > is described U-Boot tries to create partitions in the controller node, > then the situation is even worse than I thought. But I believe It's like that for U-Boot older than v2022.04 ... and IMO we cannot ignore it. Said that from the code U-Boot looks into a `partition{}` node only as a direct child of the nand-controller, if there is a nand-chip in between the nand-controller{} and the partitions{} it will just ignore it. I could try to see what it is doing exactly, but I would need a little bit more time, I just tried changing the DTS as wrote I got a non bootable system. Francesco
On 12/2/22 16:56, Thorsten Leemhuis wrote: > On 02.12.22 15:31, Marek Vasut wrote: >> On 12/2/22 15:05, Miquel Raynal wrote: >> [...] >>> 3. To fix the current situation: >>> Immediately revert commit (and prevent it from being backported): >>> 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") >>> This way your own boot flow is fixed in the short term. >> >> Here I disagree, the fix is correct and I think we shouldn't proliferate >> incorrect DTs which don't match the binding document. Rather, if a >> bootloader generates incorrect (new) DT entries, I believe the driver >> should implement a fixup and warn user about this. PC does that as well >> with broken ACPI tables as far as I can tell. > > Well, that might be the right solution in the long run, that's up for > others to decide, but we need to fix this *quickly*. For two reasons > actually: the 6.1 release is near and the change was backported to > stable already. > > For details wrt to the "quickly", see "Prioritize work on fixing > regressions" here: > https://docs.kernel.org/process/handling-regressions.html > > IOW: Ideally it should be fixed by Sunday. > > I'll hence likely soon will point Linus to this and suggest to revert > this, unless there are strong reasons against that or some sort of > agreement on a better solution. You might want to wait until everyone is back on Monday, the discussion is still ongoing, but it seems to be getting to a conclusion.
On 04.12.22 13:50, Marek Vasut wrote: > On 12/2/22 16:56, Thorsten Leemhuis wrote: >> On 02.12.22 15:31, Marek Vasut wrote: >>> On 12/2/22 15:05, Miquel Raynal wrote: >>> [...] >>>> 3. To fix the current situation: >>>> Immediately revert commit (and prevent it from being backported): >>>> 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") >>>> This way your own boot flow is fixed in the short term. >>> >>> Here I disagree, the fix is correct and I think we shouldn't proliferate >>> incorrect DTs which don't match the binding document. Rather, if a >>> bootloader generates incorrect (new) DT entries, I believe the driver >>> should implement a fixup and warn user about this. PC does that as well >>> with broken ACPI tables as far as I can tell. >> >> Well, that might be the right solution in the long run, that's up for >> others to decide, but we need to fix this *quickly*. For two reasons >> actually: the 6.1 release is near and the change was backported to >> stable already. >> >> For details wrt to the "quickly", see "Prioritize work on fixing >> regressions" here: >> https://docs.kernel.org/process/handling-regressions.html >> >> IOW: Ideally it should be fixed by Sunday. >> >> I'll hence likely soon will point Linus to this and suggest to revert >> this, unless there are strong reasons against that or some sort of >> agreement on a better solution. > > You might want to wait until everyone is back on Monday, the discussion > is still ongoing, but it seems to be getting to a conclusion. Yeah, came to a similar conclusion, but want to mentioned it nevertheless and already have this prepared (together will appropriate links to the discussion): ``` A regression causing boot failures on iMX7 (due to a backport this is also affecting 6.0.y) could be fixed with a quick revert as well. But looks like there is no need for it, after some back and forth the developers that care are close to come to an agreement how to fix the problem properly soonish: ``` Ciao, Thorsten
On 12/4/22 13:59, Thorsten Leemhuis wrote: > On 04.12.22 13:50, Marek Vasut wrote: >> On 12/2/22 16:56, Thorsten Leemhuis wrote: >>> On 02.12.22 15:31, Marek Vasut wrote: >>>> On 12/2/22 15:05, Miquel Raynal wrote: >>>> [...] >>>>> 3. To fix the current situation: >>>>> Immediately revert commit (and prevent it from being backported): >>>>> 753395ea1e45 ("ARM: dts: imx7: Fix NAND controller size-cells") >>>>> This way your own boot flow is fixed in the short term. >>>> >>>> Here I disagree, the fix is correct and I think we shouldn't proliferate >>>> incorrect DTs which don't match the binding document. Rather, if a >>>> bootloader generates incorrect (new) DT entries, I believe the driver >>>> should implement a fixup and warn user about this. PC does that as well >>>> with broken ACPI tables as far as I can tell. >>> >>> Well, that might be the right solution in the long run, that's up for >>> others to decide, but we need to fix this *quickly*. For two reasons >>> actually: the 6.1 release is near and the change was backported to >>> stable already. >>> >>> For details wrt to the "quickly", see "Prioritize work on fixing >>> regressions" here: >>> https://docs.kernel.org/process/handling-regressions.html >>> >>> IOW: Ideally it should be fixed by Sunday. >>> >>> I'll hence likely soon will point Linus to this and suggest to revert >>> this, unless there are strong reasons against that or some sort of >>> agreement on a better solution. >> >> You might want to wait until everyone is back on Monday, the discussion >> is still ongoing, but it seems to be getting to a conclusion. > > Yeah, came to a similar conclusion, but want to mentioned it > nevertheless and already have this prepared (together will appropriate > links to the discussion): > > ``` > A regression causing boot failures on iMX7 (due to a backport this is > also affecting 6.0.y) could be fixed with a quick revert as well. But > looks like there is no need for it, after some back and forth the > developers that care are close to come to an agreement how to fix the > problem properly soonish: > ``` ACK, thanks
On Fri, Dec 02, 2022 at 06:08:22PM +0100, Marek Vasut wrote: > But here I would say this is a firmware bug and it might have to be handled > like a firmware bug, i.e. with fixup in the partition parser. I seem to be > changing my opinion here again. I was thinking at this over the weekend, and I came to the following ideas: - we need some improvement on the fixup we already have in the partition parser. We cannot ignore the fdt produced by U-Boot - as bad as it is. - the proposed fixup is fine for the immediate need, but it is not going to be enough to cover the general issue with the U-Boot generated partitions. U-Boot might keep generating partitions as direct child of the nand controller even when a partitions{} node is available. In this case the current parser just fails since it looks only into it and it will find it empty. - the current U-Boot only handle partitions{} as a direct child of the nand-controller, the nand-chip is ignored. This is not the way it is supposed to work. U-Boot code would need to be improved. With all of that said I think that Miquel is right > When a patch breaks a board and there is no straight fix, you revert > it, then you think harder. That's what I am saying. This is a temporary > solution. ? Francesco
Hello Miquel On Fri, Dec 02, 2022 at 06:20:02PM +0100, Francesco Dolcini wrote: > On Fri, Dec 02, 2022 at 05:42:55PM +0100, Miquel Raynal wrote: > > Please also do it with the NAND chip described. If, when the NAND chip > > is described U-Boot tries to create partitions in the controller node, > > then the situation is even worse than I thought. But I believe > > It's like that for U-Boot older than v2022.04 ... and IMO we cannot > ignore it. > > Said that from the code U-Boot looks into a `partition{}` node only as a > direct child of the nand-controller, if there is a nand-chip in between > the nand-controller{} and the partitions{} it will just ignore it. > > I could try to see what it is doing exactly, but I would need a little > bit more time, I just tried changing the DTS as wrote I got a non > bootable system. If I have a nand-chip { partitions {} } described in the dts U-Boot (even the latest one) ignores it and generates the partition as child of the nand controller, the linux parser however see that partitions{} exists, even if empty, and ignore the partition directly defined as child of the nand controller. TL;DR: parser fails and boot fails according to that. Francesco
Hi Francesco, francesco@dolcini.it wrote on Mon, 5 Dec 2022 12:26:44 +0100: > On Fri, Dec 02, 2022 at 06:08:22PM +0100, Marek Vasut wrote: > > But here I would say this is a firmware bug and it might have to be handled > > like a firmware bug, i.e. with fixup in the partition parser. I seem to be > > changing my opinion here again. > > I was thinking at this over the weekend, and I came to the following > ideas: > > - we need some improvement on the fixup we already have in the > partition parser. We cannot ignore the fdt produced by U-Boot - as > bad as it is. > - the proposed fixup is fine for the immediate need, but it is > not going to be enough to cover the general issue with the U-Boot > generated partitions. U-Boot might keep generating partitions as direct > child of the nand controller even when a partitions{} node is > available. In this case the current parser just fails since it looks > only into it and it will find it empty. > - the current U-Boot only handle partitions{} as a direct child of the > nand-controller, the nand-chip is ignored. This is not the way it is > supposed to work. U-Boot code would need to be improved. I've been thinking about it this weekend as well and the current fix which "just set" s_cell to 1 seems risky for me, it is typically the type of quick & dirty fix that might even break other board (nobody knew that U-Boot current logic expected #size-cells to be set in the DT, what if another "broken" DT expects the opposite...), not mentioning potential issues with big storages (> 4GiB). All in all, I really think we should revert the DT change now, reverting as little to no drawbacks besides a dt_binding_check warning and gives us time to deal with it properly (both in U-Boot and Linux). > With all of that said I think that Miquel is right > > > When a patch breaks a board and there is no straight fix, you revert > > it, then you think harder. That's what I am saying. This is a temporary > > solution. > > ? > > Francesco > > Thanks, Miquèl
Hi Francesco, francesco@dolcini.it wrote on Mon, 5 Dec 2022 12:30:16 +0100: > Hello Miquel > > On Fri, Dec 02, 2022 at 06:20:02PM +0100, Francesco Dolcini wrote: > > On Fri, Dec 02, 2022 at 05:42:55PM +0100, Miquel Raynal wrote: > > > Please also do it with the NAND chip described. If, when the NAND chip > > > is described U-Boot tries to create partitions in the controller node, > > > then the situation is even worse than I thought. But I believe > > > > It's like that for U-Boot older than v2022.04 ... and IMO we cannot > > ignore it. > > > > Said that from the code U-Boot looks into a `partition{}` node only as a > > direct child of the nand-controller, if there is a nand-chip in between > > the nand-controller{} and the partitions{} it will just ignore it. > > > > I could try to see what it is doing exactly, but I would need a little > > bit more time, I just tried changing the DTS as wrote I got a non > > bootable system. > > If I have a nand-chip { partitions {} } described in the dts U-Boot > (even the latest one) ignores it and generates the partition as child of > the nand controller, the linux parser however see that partitions{} > exists, even if empty, and ignore the partition directly defined as > child of the nand controller. > > TL;DR: parser fails and boot fails according to that. Yeah I get that. For me the longterm goal should be to just kill that function. We have proper DT support today, Linux knows how to read the mtdparts cmdline variable, so there is no need for anything else. I guess in U-Boot we should just: - warn users of this function that the function is deprecated and they should update their machine support - just migrate to another solution on the colibri board What do you think? Thanks, Miquèl
On 12/5/22 14:49, Miquel Raynal wrote: > Hi Francesco, Hi, > francesco@dolcini.it wrote on Mon, 5 Dec 2022 12:26:44 +0100: > >> On Fri, Dec 02, 2022 at 06:08:22PM +0100, Marek Vasut wrote: >>> But here I would say this is a firmware bug and it might have to be handled >>> like a firmware bug, i.e. with fixup in the partition parser. I seem to be >>> changing my opinion here again. >> >> I was thinking at this over the weekend, and I came to the following >> ideas: >> >> - we need some improvement on the fixup we already have in the >> partition parser. We cannot ignore the fdt produced by U-Boot - as >> bad as it is. >> - the proposed fixup is fine for the immediate need, but it is >> not going to be enough to cover the general issue with the U-Boot >> generated partitions. U-Boot might keep generating partitions as direct >> child of the nand controller even when a partitions{} node is >> available. In this case the current parser just fails since it looks >> only into it and it will find it empty. >> - the current U-Boot only handle partitions{} as a direct child of the >> nand-controller, the nand-chip is ignored. This is not the way it is >> supposed to work. U-Boot code would need to be improved. > > I've been thinking about it this weekend as well and the current fix > which "just set" s_cell to 1 seems risky for me, it is typically the > type of quick & dirty fix that might even break other board (nobody > knew that U-Boot current logic expected #size-cells to be set in the > DT, what if another "broken" DT expects the opposite...) Then with the current configuration, such broken DT would not work, since current DT does set #size-cells=<1> (wrongly). > , not > mentioning potential issues with big storages (> 4GiB). > > All in all, I really think we should revert the DT change now, reverting > as little to no drawbacks besides a dt_binding_check warning and gives > us time to deal with it properly (both in U-Boot and Linux). I am really not happy with this, but if that's marked as intermediate fix, go for it. How do we deal with this in the long run however? Parser-side fix like this one, maybe with better heuristics ?
Hi Marek & Francesco, marex@denx.de wrote on Mon, 5 Dec 2022 17:25:11 +0100: > On 12/5/22 14:49, Miquel Raynal wrote: > > Hi Francesco, > > Hi, > > > francesco@dolcini.it wrote on Mon, 5 Dec 2022 12:26:44 +0100: > > > >> On Fri, Dec 02, 2022 at 06:08:22PM +0100, Marek Vasut wrote: > >>> But here I would say this is a firmware bug and it might have to be handled > >>> like a firmware bug, i.e. with fixup in the partition parser. I seem to be > >>> changing my opinion here again. > >> > >> I was thinking at this over the weekend, and I came to the following > >> ideas: > >> > >> - we need some improvement on the fixup we already have in the > >> partition parser. We cannot ignore the fdt produced by U-Boot - as > >> bad as it is. > >> - the proposed fixup is fine for the immediate need, but it is > >> not going to be enough to cover the general issue with the U-Boot > >> generated partitions. U-Boot might keep generating partitions as direct > >> child of the nand controller even when a partitions{} node is > >> available. In this case the current parser just fails since it looks > >> only into it and it will find it empty. > >> - the current U-Boot only handle partitions{} as a direct child of the > >> nand-controller, the nand-chip is ignored. This is not the way it is > >> supposed to work. U-Boot code would need to be improved. > > > > I've been thinking about it this weekend as well and the current fix > > which "just set" s_cell to 1 seems risky for me, it is typically the > > type of quick & dirty fix that might even break other board (nobody > > knew that U-Boot current logic expected #size-cells to be set in the > > DT, what if another "broken" DT expects the opposite...) > > Then with the current configuration, such broken DT would not work, since current DT does set #size-cells=<1> (wrongly). > > > , not > > mentioning potential issues with big storages (> 4GiB). > > > > All in all, I really think we should revert the DT change now, reverting > > as little to no drawbacks besides a dt_binding_check warning and gives > > us time to deal with it properly (both in U-Boot and Linux). > > I am really not happy with this, but if that's marked as intermediate fix, go for it. > > How do we deal with this in the long run however? Parser-side fix like this one, maybe with better heuristics ? Yesterday while talking about an ACPI mis-description which needed fixing, I realized fixing up what the firmware provides to Linux should preferably be handled as early as possible. So my first first idea was to avoid using the broken "fixup mtdparts" function in U-Boot and I am still convinced this is what we should do in priority. However, as rightly pointed in this thread, we need to take care about the case where someone would use a newer DT (let's say, with the reverted changed reverted again) with an old U-Boot. I am still against piggy hacks in the generic ofpart.c driver, but what we could do however is a DT fixup in the init_machine (or the dt_fixup) hook for imx7 Colibri, very much like this: https://elixir.bootlin.com/linux/latest/source/arch/arm/mach-mvebu/board-v7.c#L111 Plus a warning there saying "your dt is broken, update your firmware". So next time someone stumbles upon this issue, we can tell them "fix your bootloader", and apply the same hack in their board family (there are three or four IIRC which might be concerned some day). That would fix all cases and only have an impact on the affected boards. Thanks, Miquèl
On 12/15/22 08:16, Miquel Raynal wrote: > Hi Marek & Francesco, Hi, > marex@denx.de wrote on Mon, 5 Dec 2022 17:25:11 +0100: > >> On 12/5/22 14:49, Miquel Raynal wrote: >>> Hi Francesco, >> >> Hi, >> >>> francesco@dolcini.it wrote on Mon, 5 Dec 2022 12:26:44 +0100: >>> >>>> On Fri, Dec 02, 2022 at 06:08:22PM +0100, Marek Vasut wrote: >>>>> But here I would say this is a firmware bug and it might have to be handled >>>>> like a firmware bug, i.e. with fixup in the partition parser. I seem to be >>>>> changing my opinion here again. >>>> >>>> I was thinking at this over the weekend, and I came to the following >>>> ideas: >>>> >>>> - we need some improvement on the fixup we already have in the >>>> partition parser. We cannot ignore the fdt produced by U-Boot - as >>>> bad as it is. >>>> - the proposed fixup is fine for the immediate need, but it is >>>> not going to be enough to cover the general issue with the U-Boot >>>> generated partitions. U-Boot might keep generating partitions as direct >>>> child of the nand controller even when a partitions{} node is >>>> available. In this case the current parser just fails since it looks >>>> only into it and it will find it empty. >>>> - the current U-Boot only handle partitions{} as a direct child of the >>>> nand-controller, the nand-chip is ignored. This is not the way it is >>>> supposed to work. U-Boot code would need to be improved. >>> >>> I've been thinking about it this weekend as well and the current fix >>> which "just set" s_cell to 1 seems risky for me, it is typically the >>> type of quick & dirty fix that might even break other board (nobody >>> knew that U-Boot current logic expected #size-cells to be set in the >>> DT, what if another "broken" DT expects the opposite...) >> >> Then with the current configuration, such broken DT would not work, since current DT does set #size-cells=<1> (wrongly). >> >>> , not >>> mentioning potential issues with big storages (> 4GiB). >>> >>> All in all, I really think we should revert the DT change now, reverting >>> as little to no drawbacks besides a dt_binding_check warning and gives >>> us time to deal with it properly (both in U-Boot and Linux). >> >> I am really not happy with this, but if that's marked as intermediate fix, go for it. >> >> How do we deal with this in the long run however? Parser-side fix like this one, maybe with better heuristics ? > > Yesterday while talking about an ACPI mis-description which needed > fixing, I realized fixing up what the firmware provides to Linux should > preferably be handled as early as possible. So my first first idea was > to avoid using the broken "fixup mtdparts" function in U-Boot and I am > still convinced this is what we should do in priority. However, as > rightly pointed in this thread, we need to take care about the case > where someone would use a newer DT (let's say, with the reverted changed > reverted again) with an old U-Boot. I am still against piggy hacks in > the generic ofpart.c driver, but what we could do however is a DT > fixup in the init_machine (or the dt_fixup) hook for imx7 Colibri, very > much like this: > https://elixir.bootlin.com/linux/latest/source/arch/arm/mach-mvebu/board-v7.c#L111 > Plus a warning there saying "your dt is broken, update your firmware". This does not work, because the old U-Boot fixup_mtdparts() may be applied on any machine, it is not colibri mx7 specific. Also, new arch-side workaround are really not welcome by the architecture maintainers as far as I can tell. > So next time someone stumbles upon this issue, we can tell them "fix > your bootloader", and apply the same hack in their board family (there > are three or four IIRC which might be concerned some day). There are also those machines we do not even know about which might be generating bogus DT using old U-Boot and fixup_mtdparts(), so, unless there is some all-arch fixup implementation, we wouldn't be able to fix them all on arch side. I think the all-arch fixup implementation would be the driver one, i.e. this patch as it is (or maybe with some improvement). > That would fix all cases and only have an impact on the affected boards. Sadly, it does only fix the known cases, not the unknown cases like downstream forks which never get any bootloader updates ever, and which you can't find in upstream U-Boot, and which you therefore cannot easily catch in the arch side fixup. [...]
Hi Marek, marex@denx.de wrote on Thu, 15 Dec 2022 08:45:33 +0100: > On 12/15/22 08:16, Miquel Raynal wrote: > > Hi Marek & Francesco, > > Hi, > > > marex@denx.de wrote on Mon, 5 Dec 2022 17:25:11 +0100: > > > >> On 12/5/22 14:49, Miquel Raynal wrote: > >>> Hi Francesco, > >> > >> Hi, > >> > >>> francesco@dolcini.it wrote on Mon, 5 Dec 2022 12:26:44 +0100: > >>> >>>> On Fri, Dec 02, 2022 at 06:08:22PM +0100, Marek Vasut wrote: > >>>>> But here I would say this is a firmware bug and it might have to be handled > >>>>> like a firmware bug, i.e. with fixup in the partition parser. I seem to be > >>>>> changing my opinion here again. > >>>> > >>>> I was thinking at this over the weekend, and I came to the following > >>>> ideas: > >>>> > >>>> - we need some improvement on the fixup we already have in the > >>>> partition parser. We cannot ignore the fdt produced by U-Boot - as > >>>> bad as it is. > >>>> - the proposed fixup is fine for the immediate need, but it is > >>>> not going to be enough to cover the general issue with the U-Boot > >>>> generated partitions. U-Boot might keep generating partitions as direct > >>>> child of the nand controller even when a partitions{} node is > >>>> available. In this case the current parser just fails since it looks > >>>> only into it and it will find it empty. > >>>> - the current U-Boot only handle partitions{} as a direct child of the > >>>> nand-controller, the nand-chip is ignored. This is not the way it is > >>>> supposed to work. U-Boot code would need to be improved. > >>> > >>> I've been thinking about it this weekend as well and the current fix > >>> which "just set" s_cell to 1 seems risky for me, it is typically the > >>> type of quick & dirty fix that might even break other board (nobody > >>> knew that U-Boot current logic expected #size-cells to be set in the > >>> DT, what if another "broken" DT expects the opposite...) > >> > >> Then with the current configuration, such broken DT would not work, since current DT does set #size-cells=<1> (wrongly). > >> > >>> , not > >>> mentioning potential issues with big storages (> 4GiB). > >>> > >>> All in all, I really think we should revert the DT change now, reverting > >>> as little to no drawbacks besides a dt_binding_check warning and gives > >>> us time to deal with it properly (both in U-Boot and Linux). > >> > >> I am really not happy with this, but if that's marked as intermediate fix, go for it. > >> > >> How do we deal with this in the long run however? Parser-side fix like this one, maybe with better heuristics ? > > > > Yesterday while talking about an ACPI mis-description which needed > > fixing, I realized fixing up what the firmware provides to Linux should > > preferably be handled as early as possible. So my first first idea was > > to avoid using the broken "fixup mtdparts" function in U-Boot and I am > > still convinced this is what we should do in priority. However, as > > rightly pointed in this thread, we need to take care about the case > > where someone would use a newer DT (let's say, with the reverted changed > > reverted again) with an old U-Boot. I am still against piggy hacks in > > the generic ofpart.c driver, but what we could do however is a DT > > fixup in the init_machine (or the dt_fixup) hook for imx7 Colibri, very > > much like this: > > https://elixir.bootlin.com/linux/latest/source/arch/arm/mach-mvebu/board-v7.c#L111 > > Plus a warning there saying "your dt is broken, update your firmware". > > This does not work, because the old U-Boot fixup_mtdparts() may be applied on any machine, No: https://elixir.bootlin.com/u-boot/latest/A/ident/fdt_fixup_mtdparts And we should make our best so its use does not proliferate. It's not like there is half a dozen of good ways to describe and forward partitions today. > it is not colibri mx7 specific. Also, new arch-side workaround are > really not welcome by the architecture maintainers as far as I can > tell. So what? Let's propose the change and see what the maintainers have to say. I am open to discussion. As I said, it is not colibri mx7 specific, there are a few boards which might be affected, they are all clearly identifiable with a compatible. It's not the entire planet either. > > So next time someone stumbles upon this issue, we can tell them "fix > > your bootloader", and apply the same hack in their board family (there > > are three or four IIRC which might be concerned some day). > > There are also those machines we do not even know about which might be generating bogus DT using old U-Boot and fixup_mtdparts(), so, unless there is some all-arch fixup implementation, we wouldn't be able to fix them all on arch side. I think the all-arch fixup implementation would be the driver one, i.e. this patch as it is (or maybe with some improvement). If we don't know about them, as you say, I don't feel concerned. If something is buggy, people will report it, we will point them in the right direction so they can fix their firmware and propose a similar fix in their case which will involve adding a new machine compatible to the list of boards that should tweak the #size-cell property. > > That would fix all cases and only have an impact on the affected boards. > > Sadly, it does only fix the known cases, not the unknown cases like downstream forks which never get any bootloader updates ever, and which you can't find in upstream U-Boot, and which you therefore cannot easily catch in the arch side fixup. And ? Thanks, Miquèl
On 12/15/22 09:04, Miquel Raynal wrote: > Hi Marek, Hi, [...] >>> Yesterday while talking about an ACPI mis-description which needed >>> fixing, I realized fixing up what the firmware provides to Linux should >>> preferably be handled as early as possible. So my first first idea was >>> to avoid using the broken "fixup mtdparts" function in U-Boot and I am >>> still convinced this is what we should do in priority. However, as >>> rightly pointed in this thread, we need to take care about the case >>> where someone would use a newer DT (let's say, with the reverted changed >>> reverted again) with an old U-Boot. I am still against piggy hacks in >>> the generic ofpart.c driver, but what we could do however is a DT >>> fixup in the init_machine (or the dt_fixup) hook for imx7 Colibri, very >>> much like this: >>> https://elixir.bootlin.com/linux/latest/source/arch/arm/mach-mvebu/board-v7.c#L111 >>> Plus a warning there saying "your dt is broken, update your firmware". >> >> This does not work, because the old U-Boot fixup_mtdparts() may be applied on any machine, > > No: https://elixir.bootlin.com/u-boot/latest/A/ident/fdt_fixup_mtdparts These are the boards from vendors who upstreamed their properly. This does NOT take into account either boards which are using downstream U-Boot, or older U-Boot e.g. because they can not easily update. > And we should make our best so its use does not proliferate. I am not disputing that, but that's a separate U-Boot side fix which I hope Francesco would submit soon, AND, more importantly, the code is already in at least two U-Boot releases, so it's done, it's not going away any time soon. > It's not like there is half a dozen of good ways to describe and forward > partitions today. That's really not what I am disputing here, the approach to describing partitions is crystal clear as far as I can tell. >> it is not colibri mx7 specific. Also, new arch-side workaround are >> really not welcome by the architecture maintainers as far as I can >> tell. > > So what? Let's propose the change and see what the maintainers have to > say. I am open to discussion. Why is there such strong opposition toward generic fix in the OF partition parser ? > As I said, it is not colibri mx7 specific, there are a few boards which > might be affected, ... that you know about ... > they are all clearly identifiable with a compatible. > It's not the entire planet either. Neither of us can make this statement with certainty, because neither of us knows what hardware is running the affected version of U-Boot. >>> So next time someone stumbles upon this issue, we can tell them "fix >>> your bootloader", and apply the same hack in their board family (there >>> are three or four IIRC which might be concerned some day). >> >> There are also those machines we do not even know about which might be generating bogus DT using old U-Boot and fixup_mtdparts(), so, unless there is some all-arch fixup implementation, we wouldn't be able to fix them all on arch side. I think the all-arch fixup implementation would be the driver one, i.e. this patch as it is (or maybe with some improvement). > > If we don't know about them, as you say, I don't feel concerned. > > If something is buggy, people will report it, we will point them in the > right direction so they can fix their firmware and propose a similar > fix in their case which will involve adding a new machine compatible to > the list of boards that should tweak the #size-cell property. Why is a potentially lengthy list of compatible strings in arch code, which every single user has to find _after_ their system completely fails to boot and forces them to perform potentially difficult recovery, potentially after an update to new linux-stable release no less -- over -- 4 liner generic fix in OF partition parser, which covers all the systems, does not cause systems to fail to boot completely, does not force users to suffer through recovery, does not require a list of compatibles in arch code, and rather only gracefully prints a warning ? I very much prefer the second solution over the first. And one more thing, the list of compatibles in arch code does not really work anyway, since once user updates their bootloader, the compatible won't change and the arch-side workaround would still be applied, which is not desired at that point anymore. >>> That would fix all cases and only have an impact on the affected boards. >> >> Sadly, it does only fix the known cases, not the unknown cases like downstream forks which never get any bootloader updates ever, and which you can't find in upstream U-Boot, and which you therefore cannot easily catch in the arch side fixup. > > And ? I was under the impression Linux was supposed to deliver the best possible experience to its users even on not-perfect hardware, and if there are any quirks, the kernel should try to fix them up or work around them as best as it can, not dismiss them as broken hardware and fail to boot outright.
On Fri, Dec 16, 2022 at 01:36:03AM +0100, Marek Vasut wrote: > On 12/15/22 09:04, Miquel Raynal wrote: > > > > That would fix all cases and only have an impact on the affected > > > > boards. > > > > > > Sadly, it does only fix the known cases, not the unknown cases > > > like downstream forks which never get any bootloader updates ever, > > > and which you can't find in upstream U-Boot, and which you > > > therefore cannot easily catch in the arch side fixup. > > > > And ? > > I was under the impression Linux was supposed to deliver the best possible > experience to its users even on not-perfect hardware, and if there are any > quirks, the kernel should try to fix them up or work around them as best as > it can, not dismiss them as broken hardware and fail to boot outright. I would say something more on this. We are not talking about Linux not working well on some hardware, we are talking about breaking hardware that was working fine since ever. I believe that the Linux has a quite strong point of view on such kind of regression. Quoting Linus > If the kernel used to work for you, the rule is that it continues to work for you. Francesco
diff --git a/drivers/mtd/parsers/ofpart_core.c b/drivers/mtd/parsers/ofpart_core.c index 192190c42fc8..aa3b7fa61e50 100644 --- a/drivers/mtd/parsers/ofpart_core.c +++ b/drivers/mtd/parsers/ofpart_core.c @@ -122,6 +122,17 @@ static int parse_fixed_partitions(struct mtd_info *master, a_cells = of_n_addr_cells(pp); s_cells = of_n_size_cells(pp); + if (s_cells == 0) { + /* + * Use #size-cells = <1> for backward compatibility + * in case #size-cells is set to <0> and firmware adds + * OF partitions without setting it. + */ + pr_warn_once("%s: ofpart partition %pOF (%pOF) #size-cells is <0>, using <1> for backward compatibility.\n", + master->name, pp, + mtd_node); + s_cells = 1; + } if (len / 4 != a_cells + s_cells) { pr_debug("%s: ofpart partition %pOF (%pOF) error parsing reg property.\n", master->name, pp,