Message ID | 20220311080147.453483-1-tudor.ambarus@microchip.com (mailing list archive) |
---|---|
Headers | show |
Series | spi-mem: Allow specifying the byte order in DTR mode | expand |
Hi, On 11/03/22 1:31 pm, Tudor Ambarus wrote: > There are NOR flashes (Macronix) that swap the bytes on a 16-bit boundary > when configured in Octal DTR mode. The byte order of 16-bit words is > swapped when read or written in Octal Double Transfer Rate (DTR) mode > compared to Single Transfer Rate (STR) modes. If one writes D0 D1 D2 D3 > bytes using 1-1-1 mode, and uses 8D-8D-8D SPI mode for reading, it will > read back D1 D0 D3 D2. Swapping the bytes is a bad design decision because > it may introduce some endianness problems. It can affect the boot sequence > if the entire boot sequence is not handled in either 8D-8D-8D mode or 1-1-1 > mode. So we must swap the bytes back to have the same byte order as in STR > modes. Fortunately there are controllers that can swap the bytes back at > runtime, addressing the flash's endiannesses requirements. > If the controllers are not capable of swapping the bytes, the protocol is > downgraded via spi_nor_spimem_adjust_hwcaps(). When available, the swapping > of the bytes is always done regardless if it's a data or register access, > so that we comply with the JESD216 requirements: "Byte order of 16-bit > words is swapped when read in 8D-8D-8D mode compared to 1-1-1". > Sorry, bit late to the thread. But, dropping 8D-8D-8D mode support is quite restrictive IMO. AFAIK, SFDP standard does not dictate how data should be stored in flash or how SW should interpret after reading data from flash. It merely indicates endian-ness compared to 1-1-1 mode. So, its up to various system SWs like bootloader/Linux to work according to pre-aligned layout as there is no rule that data needs to be stored in byte order. We have two types of controllers: 1. SPI controllers supporting swapping endian-ness on the fly: -> For such flashes, better choice is to have SWAP option always enabled. So that data written in 8D-8D-8D mode can be read correctly in 1-1-1 mode and vice-versa. ( I am assuming SWAP option of controller is only effective in 8D-8D-8D mode and is NOP in 1-1-1 or other modes) But, its possible that "ROM" or other non-upgradable SWs may choose not make to use of this SWAP option of HW to keep things simple in which case, they cannot boot from 8D-8D-8D mode with above setting. Such SW don't always have knowledge of flash and cannot be forced to have a constraint to enable byte swap on read. So, IMO, its best left to system integrators to specify whether or not SWAP option needs to be enabled (perhaps via DT as its per flash specific property?) 2. SPI controllers don't support endian-ness SWAP on the fly: It is still possible to reliably read and write data as long as its written and read back in same mode. De-rating speeds because of absence of this support would mean reduction of speed by **16 times** (maybe even higher as 8D mode tends to support higher bus freqs). Swapping bytes in Linux before writing or after reading is not an option either as it negatively impacts performance. Asking ROM/bootloaders to swap bytes based on SFDP indication is restrictive too as it involves boot time penalty and most systems with OSPI flashes are using them to achieve super fast boot times. One more case to consider is flashes that dont have SFDP table to indicate byte order but follow Macronix's convention. In such cases, its better for SPI NOR layer to be as dumb as possible and not really do any byte swapping, leaving it up to user space to handle/interpret data appropriately. Also, Macronix is probably in violation of xSPI spec (JESD251A 6.9.5.2 8D-8D-8D Profile 1.0) where diagrams clearly show data output should be D0 D1 D2 D3... So ROMs following xSPI spec (which is the only spec providing flash agnostic way of switching to 8D mode and reading data in 8D mode) would not care about SFDP bit indicating byteorder and its up to flasher programs to take care of the same IMO, kernel device drivers should just provide access to underlying HW and not have too much intelligence to interpret data/take decisions So, simpler constraint to put is: Flasher programs should program data in the same mode in which ROM/bootloder/Linux is expected to read the data on that system. For Macronix like flashes, if one has a ROM/bootloader that only supports 1-1-1 mode and flashing data in 8D-8D-8D mode with Linux, then please generate a byte-swapped image offline and flash it. Don't impose penalty on systems that do best to handle this messy situation. I see this as the only option with least performance penalty. Regards Vignesh [...]
On 3/15/22 08:08, Vignesh Raghavendra wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe > > Hi, Hi, > > On 11/03/22 1:31 pm, Tudor Ambarus wrote: >> There are NOR flashes (Macronix) that swap the bytes on a 16-bit boundary >> when configured in Octal DTR mode. The byte order of 16-bit words is >> swapped when read or written in Octal Double Transfer Rate (DTR) mode >> compared to Single Transfer Rate (STR) modes. If one writes D0 D1 D2 D3 >> bytes using 1-1-1 mode, and uses 8D-8D-8D SPI mode for reading, it will >> read back D1 D0 D3 D2. Swapping the bytes is a bad design decision because >> it may introduce some endianness problems. It can affect the boot sequence >> if the entire boot sequence is not handled in either 8D-8D-8D mode or 1-1-1 >> mode. So we must swap the bytes back to have the same byte order as in STR >> modes. Fortunately there are controllers that can swap the bytes back at >> runtime, addressing the flash's endiannesses requirements. >> If the controllers are not capable of swapping the bytes, the protocol is >> downgraded via spi_nor_spimem_adjust_hwcaps(). When available, the swapping >> of the bytes is always done regardless if it's a data or register access, >> so that we comply with the JESD216 requirements: "Byte order of 16-bit >> words is swapped when read in 8D-8D-8D mode compared to 1-1-1". >> > > Sorry, bit late to the thread. But, dropping 8D-8D-8D mode support is no worries > quite restrictive IMO. > > AFAIK, SFDP standard does not dictate how data should be stored in flash > or how SW should interpret after reading data from flash. It merely > indicates endian-ness compared to 1-1-1 mode. > > So, its up to various system SWs like bootloader/Linux to work according > to pre-aligned layout as there is no rule that data needs to be stored > in byte order. > > We have two types of controllers: > > 1. SPI controllers supporting swapping endian-ness on the fly: > -> For such flashes, better choice is to have SWAP option always > enabled. So that data written in 8D-8D-8D mode can be read correctly in > 1-1-1 mode and vice-versa. > ( I am assuming SWAP option of controller is only effective in 8D-8D-8D > mode and is NOP in 1-1-1 or other modes) > > But, its possible that "ROM" or other non-upgradable SWs may choose not > make to use of this SWAP option of HW to keep things simple in which > case, they cannot boot from 8D-8D-8D mode with above setting. Such SW > don't always have knowledge of flash and cannot be forced to have a > constraint to enable byte swap on read. > > So, IMO, its best left to system integrators to specify whether or not > SWAP option needs to be enabled (perhaps via DT as its per flash > specific property?) we can't use DT for configuration, maybe a Kconfig instead. Are there any other options? > > 2. SPI controllers don't support endian-ness SWAP on the fly: > It is still possible to reliably read and write data as long as its > written and read back in same mode. > > De-rating speeds because of absence of this support would mean reduction > of speed by **16 times** (maybe even higher as 8D mode tends to support > higher bus freqs). Swapping bytes in Linux before writing or after > reading is not an option either as it negatively impacts performance. > > Asking ROM/bootloaders to swap bytes based on SFDP indication is > restrictive too as it involves boot time penalty and most systems with > OSPI flashes are using them to achieve super fast boot times. > > One more case to consider is flashes that dont have SFDP table to > indicate byte order but follow Macronix's convention. In such cases, its > better for SPI NOR layer to be as dumb as possible and not really do any > byte swapping, leaving it up to user space to handle/interpret data > appropriately. > > Also, Macronix is probably in violation of xSPI spec (JESD251A 6.9.5.2 > 8D-8D-8D Profile 1.0) where diagrams clearly show data output should be > D0 D1 D2 D3... So ROMs following xSPI spec (which is the only spec > providing flash agnostic way of switching to 8D mode and reading data in > 8D mode) would not care about SFDP bit indicating byteorder and its up > to flasher programs to take care of the same This is a bit in contradiction, because if the ROMs follow xSPI, thus little endian byte order, they should swap the bytes. > > IMO, kernel device drivers should just provide access to underlying HW > and not have too much intelligence to interpret data/take decisions > > So, simpler constraint to put is: > Flasher programs should program data in the same mode in which > ROM/bootloder/Linux is expected to read the data on that system. No, this constraint doesn't cover all possible cases: take a 1-1-1 ROMcode, 8D-8D-8D for other bootloaders and kernel. You need to dynamically change modes in the flasher program in order to address this use case, which is a no go. > > For Macronix like flashes, if one has a ROM/bootloader that only > supports 1-1-1 mode and flashing data in 8D-8D-8D mode with Linux, then > please generate a byte-swapped image offline and flash it. Don't impose we can't do that, see the example from above. > penalty on systems that do best to handle this messy situation. > I see this as the only option with least performance penalty. > I take from this that we should let the byte swap be user-configurable, thus a Kconfig. Which I'm not against it, but it will give users headaches to sync all the software components. Making such a decision implies that users know SPI NOR internal details, which also is a bit stretched. Let's sync so that we move forward with this. Opinions? Cheers, ta
Hi, Am 2022-03-15 07:08, schrieb Vignesh Raghavendra: > On 11/03/22 1:31 pm, Tudor Ambarus wrote: >> There are NOR flashes (Macronix) that swap the bytes on a 16-bit >> boundary >> when configured in Octal DTR mode. The byte order of 16-bit words is >> swapped when read or written in Octal Double Transfer Rate (DTR) mode >> compared to Single Transfer Rate (STR) modes. If one writes D0 D1 D2 >> D3 >> bytes using 1-1-1 mode, and uses 8D-8D-8D SPI mode for reading, it >> will >> read back D1 D0 D3 D2. Swapping the bytes is a bad design decision >> because >> it may introduce some endianness problems. It can affect the boot >> sequence >> if the entire boot sequence is not handled in either 8D-8D-8D mode or >> 1-1-1 >> mode. So we must swap the bytes back to have the same byte order as in >> STR >> modes. Fortunately there are controllers that can swap the bytes back >> at >> runtime, addressing the flash's endiannesses requirements. >> If the controllers are not capable of swapping the bytes, the protocol >> is >> downgraded via spi_nor_spimem_adjust_hwcaps(). When available, the >> swapping >> of the bytes is always done regardless if it's a data or register >> access, >> so that we comply with the JESD216 requirements: "Byte order of 16-bit >> words is swapped when read in 8D-8D-8D mode compared to 1-1-1". >> > > Sorry, bit late to the thread. But, dropping 8D-8D-8D mode support is > quite restrictive IMO. > > AFAIK, SFDP standard does not dictate how data should be stored in > flash > or how SW should interpret after reading data from flash. It merely > indicates endian-ness compared to 1-1-1 mode. Mh, but below you are saying that Micronix is violating the standard and is swapping the bytes. So, the standard is actually specifying the byte order. > So, its up to various system SWs like bootloader/Linux to work > according > to pre-aligned layout as there is no rule that data needs to be stored > in byte order. > > We have two types of controllers: > > 1. SPI controllers supporting swapping endian-ness on the fly: > -> For such flashes, better choice is to have SWAP option always > enabled. So that data written in 8D-8D-8D mode can be read correctly in > 1-1-1 mode and vice-versa. > ( I am assuming SWAP option of controller is only effective in 8D-8D-8D > mode and is NOP in 1-1-1 or other modes) Why should it be always enabled? You can also say it should always be disabled. It doesn't matter if the byte order isn't important. But I say the byte order *is* important. We need one reference. > But, its possible that "ROM" or other non-upgradable SWs may choose not > make to use of this SWAP option of HW to keep things simple in which > case, they cannot boot from 8D-8D-8D mode with above setting. Such SW > don't always have knowledge of flash and cannot be forced to have a > constraint to enable byte swap on read. > > So, IMO, its best left to system integrators to specify whether or not > SWAP option needs to be enabled (perhaps via DT as its per flash > specific property?) Agreed, but I don't think we have to do it now. If someone cares, he can make a patch. Also we have to consider non-DT platforms. > 2. SPI controllers don't support endian-ness SWAP on the fly: > It is still possible to reliably read and write data as long as its > written and read back in same mode. > > De-rating speeds because of absence of this support would mean > reduction > of speed by **16 times** (maybe even higher as 8D mode tends to support > higher bus freqs). You can also fall back to a quad mode. Again, can be implemented if someone cares. > Swapping bytes in Linux before writing or after > reading is not an option either as it negatively impacts performance. > > Asking ROM/bootloaders to swap bytes based on SFDP indication is > restrictive too as it involves boot time penalty and most systems with > OSPI flashes are using them to achieve super fast boot times. No we are talking about what? ms? to read the sfdp? If that is really a use case, the the bootloader can hardcode it there. > One more case to consider is flashes that dont have SFDP table to > indicate byte order but follow Macronix's convention. In such cases, > its > better for SPI NOR layer to be as dumb as possible and not really do > any > byte swapping, leaving it up to user space to handle/interpret data > appropriately. > > Also, Macronix is probably in violation of xSPI spec (JESD251A 6.9.5.2 > 8D-8D-8D Profile 1.0) where diagrams clearly show data output should be > D0 D1 D2 D3... So ROMs following xSPI spec (which is the only spec > providing flash agnostic way of switching to 8D mode and reading data > in > 8D mode) would not care about SFDP bit indicating byteorder and its up > to flasher programs to take care of the same > > IMO, kernel device drivers should just provide access to underlying HW > and not have too much intelligence to interpret data/take decisions I strongly disagree here. The kernel should provide a consistent view of the flash content, regardless if its read in 1-1-1 or 8d-8d-8d. Imagine you are forced to switch away from 8d-8d-8d mode (for whatever reason) after some time. How would you know how to read the contents? JESD216 is a standard, too. There is a reason this bit ended up in there. If I had to guess, someone messed up the byte order in 8d-8d-8d but it was too late. And now the standard is giving you a hint that you are using a flash with that messed up byte ordering. I want to avoid having flash contents where the byte are swapped. The sooner we care about that, the better. You cannot undo that later on. > So, simpler constraint to put is: > Flasher programs should program data in the same mode in which > ROM/bootloder/Linux is expected to read the data on that system. So what if your flasher only has 1 pin? With this you are just shifting the problem elsewhere. > For Macronix like flashes, if one has a ROM/bootloader that only > supports 1-1-1 mode and flashing data in 8D-8D-8D mode with Linux, then > please generate a byte-swapped image offline and flash it. Don't impose > penalty on systems that do best to handle this messy situation. > I see this as the only option with least performance penalty. I certainly have nothing against an option to turn this all off to improve speed. But the swapping (if asked to do so) and the degradation should be an *opt-out*. Not an opt-in. Nobody will do the opt-in and we end up with 'corrupted' flash contents. -michael
Hi Michael, On 15/03/22 12:49 pm, Michael Walle wrote: > Hi, > > Am 2022-03-15 07:08, schrieb Vignesh Raghavendra: >> On 11/03/22 1:31 pm, Tudor Ambarus wrote: >>> There are NOR flashes (Macronix) that swap the bytes on a 16-bit >>> boundary >>> when configured in Octal DTR mode. The byte order of 16-bit words is >>> swapped when read or written in Octal Double Transfer Rate (DTR) mode >>> compared to Single Transfer Rate (STR) modes. If one writes D0 D1 D2 D3 >>> bytes using 1-1-1 mode, and uses 8D-8D-8D SPI mode for reading, it will >>> read back D1 D0 D3 D2. Swapping the bytes is a bad design decision >>> because >>> it may introduce some endianness problems. It can affect the boot >>> sequence >>> if the entire boot sequence is not handled in either 8D-8D-8D mode or >>> 1-1-1 >>> mode. So we must swap the bytes back to have the same byte order as >>> in STR >>> modes. Fortunately there are controllers that can swap the bytes back at >>> runtime, addressing the flash's endiannesses requirements. >>> If the controllers are not capable of swapping the bytes, the >>> protocol is >>> downgraded via spi_nor_spimem_adjust_hwcaps(). When available, the >>> swapping >>> of the bytes is always done regardless if it's a data or register >>> access, >>> so that we comply with the JESD216 requirements: "Byte order of 16-bit >>> words is swapped when read in 8D-8D-8D mode compared to 1-1-1". >>> >> >> Sorry, bit late to the thread. But, dropping 8D-8D-8D mode support is >> quite restrictive IMO. >> >> AFAIK, SFDP standard does not dictate how data should be stored in flash >> or how SW should interpret after reading data from flash. It merely >> indicates endian-ness compared to 1-1-1 mode. > > Mh, but below you are saying that Micronix is violating the standard > and is swapping the bytes. So, the standard is actually specifying the > byte order. > As I understand, SFDP spec(JESD216) is way of describing flash details, it does not enforce any rules on how flash should "behave". It just provides info for drivers to discover flash parameters at runtime. OTOH, xSPI spec (JESD251) lays down guidelines for SW interface, electrical, and mechanical interfaces. Macronix flash deviates from xSPI spec but no SFDP per say (unless it claims xSPI compliance elsewhere in SFDP tables). >> So, its up to various system SWs like bootloader/Linux to work according >> to pre-aligned layout as there is no rule that data needs to be stored >> in byte order. >> >> We have two types of controllers: >> >> 1. SPI controllers supporting swapping endian-ness on the fly: >> -> For such flashes, better choice is to have SWAP option always >> enabled. So that data written in 8D-8D-8D mode can be read correctly in >> 1-1-1 mode and vice-versa. >> ( I am assuming SWAP option of controller is only effective in 8D-8D-8D >> mode and is NOP in 1-1-1 or other modes) > > Why should it be always enabled? You can also say it should always > be disabled. It doesn't matter if the byte order isn't important. > > But I say the byte order *is* important. We need one reference. > >> But, its possible that "ROM" or other non-upgradable SWs may choose not >> make to use of this SWAP option of HW to keep things simple in which >> case, they cannot boot from 8D-8D-8D mode with above setting. Such SW >> don't always have knowledge of flash and cannot be forced to have a >> constraint to enable byte swap on read. >> >> So, IMO, its best left to system integrators to specify whether or not >> SWAP option needs to be enabled (perhaps via DT as its per flash >> specific property?) > > Agreed, but I don't think we have to do it now. If someone cares, I am fine with that, but I want to make sure we are not saying controllers w/o ability to swap byte order can never support Macronix like flash in 8D-8D-8D mode. > he can make a patch. Also we have to consider non-DT platforms. > non DT platforms also have a way to pass HW related data >> 2. SPI controllers don't support endian-ness SWAP on the fly: >> It is still possible to reliably read and write data as long as its >> written and read back in same mode. >> >> De-rating speeds because of absence of this support would mean reduction >> of speed by **16 times** (maybe even higher as 8D mode tends to support >> higher bus freqs). > > You can also fall back to a quad mode. Again, can be implemented if > someone cares. Actually I have not come across OSPI flash that also supports quad mode. So, fallback is 1-1-1 for majority of these flashes > >> Swapping bytes in Linux before writing or after >> reading is not an option either as it negatively impacts performance. >> >> Asking ROM/bootloaders to swap bytes based on SFDP indication is >> restrictive too as it involves boot time penalty and most systems with >> OSPI flashes are using them to achieve super fast boot times. > > No we are talking about what? ms? to read the sfdp? If that is really > a use case, the the bootloader can hardcode it there. > Reading SFDP is not the issue but swapping entire image is the problem. Takes several tens of ms depending on boot image size which is not acceptable to systems looking at < 100 - 200msms start up times for quick bootup usecases involving early Display / CAN output etc >> One more case to consider is flashes that dont have SFDP table to >> indicate byte order but follow Macronix's convention. In such cases, its >> better for SPI NOR layer to be as dumb as possible and not really do any >> byte swapping, leaving it up to user space to handle/interpret data >> appropriately. >> >> Also, Macronix is probably in violation of xSPI spec (JESD251A 6.9.5.2 >> 8D-8D-8D Profile 1.0) where diagrams clearly show data output should be >> D0 D1 D2 D3... So ROMs following xSPI spec (which is the only spec >> providing flash agnostic way of switching to 8D mode and reading data in >> 8D mode) would not care about SFDP bit indicating byteorder and its up >> to flasher programs to take care of the same >> >> IMO, kernel device drivers should just provide access to underlying HW >> and not have too much intelligence to interpret data/take decisions > > I strongly disagree here. The kernel should provide a consistent > view of the flash content, regardless if its read in 1-1-1 or 8d-8d-8d. > Imagine you are forced to switch away from 8d-8d-8d mode (for whatever > reason) after some time. How would you know how to read the contents? > Agree with need for consistent view, although if we need to fall back to 1-1-1 mode for a while and switch back to 8D-8D-8D mode then there is something fundamentally broken. > JESD216 is a standard, too. There is a reason this bit ended up in > there. If I had to guess, someone messed up the byte order in 8d-8d-8d > but it was too late. And now the standard is giving you a hint that > you are using a flash with that messed up byte ordering. > Yes, but note JESD216 is all about giving hint to softwares on how flash behaves, it does not dictate what softwares need to do with it. So another OS is free to leave these Macronix flashes using native byte ordering and not swapping on read / write > I want to avoid having flash contents where the byte are swapped. > The sooner we care about that, the better. You cannot undo that > later on. > >> So, simpler constraint to put is: >> Flasher programs should program data in the same mode in which >> ROM/bootloder/Linux is expected to read the data on that system. > > So what if your flasher only has 1 pin? With this you are just > shifting the problem elsewhere. > flasher program would have to use byte swapped image if read is expected to be in 8D-8D mode >> For Macronix like flashes, if one has a ROM/bootloader that only >> supports 1-1-1 mode and flashing data in 8D-8D-8D mode with Linux, then >> please generate a byte-swapped image offline and flash it. Don't impose >> penalty on systems that do best to handle this messy situation. >> I see this as the only option with least performance penalty. > > I certainly have nothing against an option to turn this all off > to improve speed. But the swapping (if asked to do so) and the > degradation should be an *opt-out*. Not an opt-in. Nobody will > do the opt-in and we end up with 'corrupted' flash contents. > Sounds good to me. We should note this in commit message disabling 8D-8D-8D mode for these flashes. Regards Vignesh
On 15/03/22 12:28 pm, Tudor.Ambarus@microchip.com wrote: > On 3/15/22 08:08, Vignesh Raghavendra wrote: >> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe >> >> Hi, > > Hi, > >> >> On 11/03/22 1:31 pm, Tudor Ambarus wrote: >>> There are NOR flashes (Macronix) that swap the bytes on a 16-bit boundary >>> when configured in Octal DTR mode. The byte order of 16-bit words is >>> swapped when read or written in Octal Double Transfer Rate (DTR) mode >>> compared to Single Transfer Rate (STR) modes. If one writes D0 D1 D2 D3 >>> bytes using 1-1-1 mode, and uses 8D-8D-8D SPI mode for reading, it will >>> read back D1 D0 D3 D2. Swapping the bytes is a bad design decision because >>> it may introduce some endianness problems. It can affect the boot sequence >>> if the entire boot sequence is not handled in either 8D-8D-8D mode or 1-1-1 >>> mode. So we must swap the bytes back to have the same byte order as in STR >>> modes. Fortunately there are controllers that can swap the bytes back at >>> runtime, addressing the flash's endiannesses requirements. >>> If the controllers are not capable of swapping the bytes, the protocol is >>> downgraded via spi_nor_spimem_adjust_hwcaps(). When available, the swapping >>> of the bytes is always done regardless if it's a data or register access, >>> so that we comply with the JESD216 requirements: "Byte order of 16-bit >>> words is swapped when read in 8D-8D-8D mode compared to 1-1-1". >>> >> >> Sorry, bit late to the thread. But, dropping 8D-8D-8D mode support is > > no worries > >> quite restrictive IMO. >> >> AFAIK, SFDP standard does not dictate how data should be stored in flash >> or how SW should interpret after reading data from flash. It merely >> indicates endian-ness compared to 1-1-1 mode. >> >> So, its up to various system SWs like bootloader/Linux to work according >> to pre-aligned layout as there is no rule that data needs to be stored >> in byte order. >> >> We have two types of controllers: >> >> 1. SPI controllers supporting swapping endian-ness on the fly: >> -> For such flashes, better choice is to have SWAP option always >> enabled. So that data written in 8D-8D-8D mode can be read correctly in >> 1-1-1 mode and vice-versa. >> ( I am assuming SWAP option of controller is only effective in 8D-8D-8D >> mode and is NOP in 1-1-1 or other modes) >> >> But, its possible that "ROM" or other non-upgradable SWs may choose not >> make to use of this SWAP option of HW to keep things simple in which >> case, they cannot boot from 8D-8D-8D mode with above setting. Such SW >> don't always have knowledge of flash and cannot be forced to have a >> constraint to enable byte swap on read. >> >> So, IMO, its best left to system integrators to specify whether or not >> SWAP option needs to be enabled (perhaps via DT as its per flash >> specific property?) > > we can't use DT for configuration, maybe a Kconfig instead. Are there any > other options? > Problem with Kconfig is that it cannot be used when there are multiple flash instances on the board and both behave differently. This is similar to big-endian vs little-endian property used by CFI physmap driver and other places in kernel[1] >> >> 2. SPI controllers don't support endian-ness SWAP on the fly: >> It is still possible to reliably read and write data as long as its >> written and read back in same mode. >> >> De-rating speeds because of absence of this support would mean reduction >> of speed by **16 times** (maybe even higher as 8D mode tends to support >> higher bus freqs). Swapping bytes in Linux before writing or after >> reading is not an option either as it negatively impacts performance. >> >> Asking ROM/bootloaders to swap bytes based on SFDP indication is >> restrictive too as it involves boot time penalty and most systems with >> OSPI flashes are using them to achieve super fast boot times. >> >> One more case to consider is flashes that dont have SFDP table to >> indicate byte order but follow Macronix's convention. In such cases, its >> better for SPI NOR layer to be as dumb as possible and not really do any >> byte swapping, leaving it up to user space to handle/interpret data >> appropriately. >> >> Also, Macronix is probably in violation of xSPI spec (JESD251A 6.9.5.2 >> 8D-8D-8D Profile 1.0) where diagrams clearly show data output should be >> D0 D1 D2 D3... So ROMs following xSPI spec (which is the only spec >> providing flash agnostic way of switching to 8D mode and reading data in >> 8D mode) would not care about SFDP bit indicating byteorder and its up >> to flasher programs to take care of the same > > This is a bit in contradiction, because if the ROMs follow xSPI, thus little > endian byte order, they should swap the bytes. > As I indicated below xSPI (JESD251) spec seems to align on little endian order. So, don't really need to swap the bytes on read. Anyways impact on performance prohibits such support and thus most ROM designs will expect flashing program to take care of it (especially in case where controller does not support on the fly swapping). >> >> IMO, kernel device drivers should just provide access to underlying HW >> and not have too much intelligence to interpret data/take decisions >> >> So, simpler constraint to put is: >> Flasher programs should program data in the same mode in which >> ROM/bootloder/Linux is expected to read the data on that system. > > No, this constraint doesn't cover all possible cases: take a 1-1-1 ROMcode, > 8D-8D-8D for other bootloaders and kernel. You need to dynamically change modes > in the flasher program in order to address this use case, which is a no go. > Flash programmer need not change the mode, but needs to generate a byte swapped image and write it knowing that image is being flashed in 8D-8D-8D mode and read in 1-1-1 mode or vice versa. >> >> For Macronix like flashes, if one has a ROM/bootloader that only >> supports 1-1-1 mode and flashing data in 8D-8D-8D mode with Linux, then >> please generate a byte-swapped image offline and flash it. Don't impose > > we can't do that, see the example from above. > See above, no need to change mode dynamically. >> penalty on systems that do best to handle this messy situation. >> I see this as the only option with least performance penalty. >> > > I take from this that we should let the byte swap be user-configurable, > thus a Kconfig. Which I'm not against it, but it will give users > headaches to sync all the software components. Making such a decision > implies that users know SPI NOR internal details, which also is a bit > stretched. Let's sync so that we move forward with this. Opinions? > I am fine with things being configurable Only issues with Kconfig: it does not help systems with multiple OSPIs with different endian-ness. [1] Documentation/devicetree/bindings/mtd/mtd-physmap.yaml Regards Vignesh
On 3/16/22 09:05, Vignesh Raghavendra wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe > > Hi Michael, > > On 15/03/22 12:49 pm, Michael Walle wrote: >> Hi, >> >> Am 2022-03-15 07:08, schrieb Vignesh Raghavendra: >>> On 11/03/22 1:31 pm, Tudor Ambarus wrote: >>>> There are NOR flashes (Macronix) that swap the bytes on a 16-bit >>>> boundary >>>> when configured in Octal DTR mode. The byte order of 16-bit words is >>>> swapped when read or written in Octal Double Transfer Rate (DTR) mode >>>> compared to Single Transfer Rate (STR) modes. If one writes D0 D1 D2 D3 >>>> bytes using 1-1-1 mode, and uses 8D-8D-8D SPI mode for reading, it will >>>> read back D1 D0 D3 D2. Swapping the bytes is a bad design decision >>>> because >>>> it may introduce some endianness problems. It can affect the boot >>>> sequence >>>> if the entire boot sequence is not handled in either 8D-8D-8D mode or >>>> 1-1-1 >>>> mode. So we must swap the bytes back to have the same byte order as >>>> in STR >>>> modes. Fortunately there are controllers that can swap the bytes back at >>>> runtime, addressing the flash's endiannesses requirements. >>>> If the controllers are not capable of swapping the bytes, the >>>> protocol is >>>> downgraded via spi_nor_spimem_adjust_hwcaps(). When available, the >>>> swapping >>>> of the bytes is always done regardless if it's a data or register >>>> access, >>>> so that we comply with the JESD216 requirements: "Byte order of 16-bit >>>> words is swapped when read in 8D-8D-8D mode compared to 1-1-1". >>>> >>> >>> Sorry, bit late to the thread. But, dropping 8D-8D-8D mode support is >>> quite restrictive IMO. >>> >>> AFAIK, SFDP standard does not dictate how data should be stored in flash >>> or how SW should interpret after reading data from flash. It merely >>> indicates endian-ness compared to 1-1-1 mode. >> >> Mh, but below you are saying that Micronix is violating the standard >> and is swapping the bytes. So, the standard is actually specifying the >> byte order. >> > > As I understand, SFDP spec(JESD216) is way of describing flash details, > it does not enforce any rules on how flash should "behave". It just > provides info for drivers to discover flash parameters at runtime. > OTOH, xSPI spec (JESD251) lays down guidelines for SW interface, > electrical, and mechanical interfaces. Macronix flash deviates from xSPI > spec but no SFDP per say (unless it claims xSPI compliance elsewhere in > SFDP tables). > >>> So, its up to various system SWs like bootloader/Linux to work according >>> to pre-aligned layout as there is no rule that data needs to be stored >>> in byte order. >>> >>> We have two types of controllers: >>> >>> 1. SPI controllers supporting swapping endian-ness on the fly: >>> -> For such flashes, better choice is to have SWAP option always >>> enabled. So that data written in 8D-8D-8D mode can be read correctly in >>> 1-1-1 mode and vice-versa. >>> ( I am assuming SWAP option of controller is only effective in 8D-8D-8D >>> mode and is NOP in 1-1-1 or other modes) >> >> Why should it be always enabled? You can also say it should always >> be disabled. It doesn't matter if the byte order isn't important. >> >> But I say the byte order *is* important. We need one reference. >> >>> But, its possible that "ROM" or other non-upgradable SWs may choose not >>> make to use of this SWAP option of HW to keep things simple in which >>> case, they cannot boot from 8D-8D-8D mode with above setting. Such SW >>> don't always have knowledge of flash and cannot be forced to have a >>> constraint to enable byte swap on read. >>> >>> So, IMO, its best left to system integrators to specify whether or not >>> SWAP option needs to be enabled (perhaps via DT as its per flash >>> specific property?) >> >> Agreed, but I don't think we have to do it now. If someone cares, > > I am fine with that, but I want to make sure we are not saying > controllers w/o ability to swap byte order can never support Macronix > like flash in 8D-8D-8D mode. > >> he can make a patch. Also we have to consider non-DT platforms. >> > > non DT platforms also have a way to pass HW related data > >>> 2. SPI controllers don't support endian-ness SWAP on the fly: >>> It is still possible to reliably read and write data as long as its >>> written and read back in same mode. >>> >>> De-rating speeds because of absence of this support would mean reduction >>> of speed by **16 times** (maybe even higher as 8D mode tends to support >>> higher bus freqs). >> >> You can also fall back to a quad mode. Again, can be implemented if >> someone cares. > > Actually I have not come across OSPI flash that also supports quad mode. > So, fallback is 1-1-1 for majority of these flashes 8-8-8 is typically supported where 8d-8d-8d is supported. But downgrading to 8-8-8 still comes with a performance penalty. > >> >>> Swapping bytes in Linux before writing or after >>> reading is not an option either as it negatively impacts performance. >>> >>> Asking ROM/bootloaders to swap bytes based on SFDP indication is >>> restrictive too as it involves boot time penalty and most systems with >>> OSPI flashes are using them to achieve super fast boot times. >> >> No we are talking about what? ms? to read the sfdp? If that is really >> a use case, the the bootloader can hardcode it there. >> > > Reading SFDP is not the issue but swapping entire image is the problem. > Takes several tens of ms depending on boot image size which is not > acceptable to systems looking at < 100 - 200msms start up times for > quick bootup usecases involving early Display / CAN output etc > >>> One more case to consider is flashes that dont have SFDP table to >>> indicate byte order but follow Macronix's convention. In such cases, its >>> better for SPI NOR layer to be as dumb as possible and not really do any >>> byte swapping, leaving it up to user space to handle/interpret data >>> appropriately. >>> >>> Also, Macronix is probably in violation of xSPI spec (JESD251A 6.9.5.2 >>> 8D-8D-8D Profile 1.0) where diagrams clearly show data output should be >>> D0 D1 D2 D3... So ROMs following xSPI spec (which is the only spec >>> providing flash agnostic way of switching to 8D mode and reading data in >>> 8D mode) would not care about SFDP bit indicating byteorder and its up >>> to flasher programs to take care of the same >>> >>> IMO, kernel device drivers should just provide access to underlying HW >>> and not have too much intelligence to interpret data/take decisions >> >> I strongly disagree here. The kernel should provide a consistent >> view of the flash content, regardless if its read in 1-1-1 or 8d-8d-8d. >> Imagine you are forced to switch away from 8d-8d-8d mode (for whatever >> reason) after some time. How would you know how to read the contents? >> > > Agree with need for consistent view, although if we need to fall back to > 1-1-1 mode for a while and switch back to 8D-8D-8D mode then there is > something fundamentally broken. > >> JESD216 is a standard, too. There is a reason this bit ended up in >> there. If I had to guess, someone messed up the byte order in 8d-8d-8d >> but it was too late. And now the standard is giving you a hint that >> you are using a flash with that messed up byte ordering. >> > > Yes, but note JESD216 is all about giving hint to softwares on how > flash behaves, it does not dictate what softwares need to do with it. > So another OS is free to leave these Macronix flashes using native byte > ordering and not swapping on read / write > >> I want to avoid having flash contents where the byte are swapped. >> The sooner we care about that, the better. You cannot undo that >> later on. >> >>> So, simpler constraint to put is: >>> Flasher programs should program data in the same mode in which >>> ROM/bootloder/Linux is expected to read the data on that system. >> >> So what if your flasher only has 1 pin? With this you are just >> shifting the problem elsewhere. >> > > flasher program would have to use byte swapped image if read is expected > to be in 8D-8D mode :) > >>> For Macronix like flashes, if one has a ROM/bootloader that only >>> supports 1-1-1 mode and flashing data in 8D-8D-8D mode with Linux, then >>> please generate a byte-swapped image offline and flash it. Don't impose >>> penalty on systems that do best to handle this messy situation. >>> I see this as the only option with least performance penalty. >> >> I certainly have nothing against an option to turn this all off >> to improve speed. But the swapping (if asked to do so) and the >> degradation should be an *opt-out*. Not an opt-in. Nobody will >> do the opt-in and we end up with 'corrupted' flash contents. >> > > Sounds good to me. We should note this in commit message disabling > 8D-8D-8D mode for these flashes. > I like Michael's *opt-out* idea too. But I'm not convinced with a dt property used to configure when to swap the bytes and when not. DT should be used to describe the hardware, not to configure it. So we can add a "dtr-swab16" property but IMO it should specify just that the bytes are swapped in 8D-8D-8D mode, just as SFDP does, and not to configure the logic to swab16 or not. And if it just describes the hardware and not configuring it, then the entire dt idea diminishes because it is equivalent with setting a SNOR_F flag either by a flash-info flag or at the SFDP parsing time.
Am 2022-03-16 08:08, schrieb Vignesh Raghavendra: > On 15/03/22 12:28 pm, Tudor.Ambarus@microchip.com wrote: >> On 3/15/22 08:08, Vignesh Raghavendra wrote: >>> EXTERNAL EMAIL: Do not click links or open attachments unless you >>> know the content is safe >>> >>> Hi, >> >> Hi, >> >>> >>> On 11/03/22 1:31 pm, Tudor Ambarus wrote: >>>> There are NOR flashes (Macronix) that swap the bytes on a 16-bit >>>> boundary >>>> when configured in Octal DTR mode. The byte order of 16-bit words is >>>> swapped when read or written in Octal Double Transfer Rate (DTR) >>>> mode >>>> compared to Single Transfer Rate (STR) modes. If one writes D0 D1 D2 >>>> D3 >>>> bytes using 1-1-1 mode, and uses 8D-8D-8D SPI mode for reading, it >>>> will >>>> read back D1 D0 D3 D2. Swapping the bytes is a bad design decision >>>> because >>>> it may introduce some endianness problems. It can affect the boot >>>> sequence >>>> if the entire boot sequence is not handled in either 8D-8D-8D mode >>>> or 1-1-1 >>>> mode. So we must swap the bytes back to have the same byte order as >>>> in STR >>>> modes. Fortunately there are controllers that can swap the bytes >>>> back at >>>> runtime, addressing the flash's endiannesses requirements. >>>> If the controllers are not capable of swapping the bytes, the >>>> protocol is >>>> downgraded via spi_nor_spimem_adjust_hwcaps(). When available, the >>>> swapping >>>> of the bytes is always done regardless if it's a data or register >>>> access, >>>> so that we comply with the JESD216 requirements: "Byte order of >>>> 16-bit >>>> words is swapped when read in 8D-8D-8D mode compared to 1-1-1". >>>> >>> >>> Sorry, bit late to the thread. But, dropping 8D-8D-8D mode support is >> >> no worries >> >>> quite restrictive IMO. >>> >>> AFAIK, SFDP standard does not dictate how data should be stored in >>> flash >>> or how SW should interpret after reading data from flash. It merely >>> indicates endian-ness compared to 1-1-1 mode. >>> >>> So, its up to various system SWs like bootloader/Linux to work >>> according >>> to pre-aligned layout as there is no rule that data needs to be >>> stored >>> in byte order. >>> >>> We have two types of controllers: >>> >>> 1. SPI controllers supporting swapping endian-ness on the fly: >>> -> For such flashes, better choice is to have SWAP option always >>> enabled. So that data written in 8D-8D-8D mode can be read correctly >>> in >>> 1-1-1 mode and vice-versa. >>> ( I am assuming SWAP option of controller is only effective in >>> 8D-8D-8D >>> mode and is NOP in 1-1-1 or other modes) >>> >>> But, its possible that "ROM" or other non-upgradable SWs may choose >>> not >>> make to use of this SWAP option of HW to keep things simple in which >>> case, they cannot boot from 8D-8D-8D mode with above setting. Such SW >>> don't always have knowledge of flash and cannot be forced to have a >>> constraint to enable byte swap on read. >>> >>> So, IMO, its best left to system integrators to specify whether or >>> not >>> SWAP option needs to be enabled (perhaps via DT as its per flash >>> specific property?) >> >> we can't use DT for configuration, maybe a Kconfig instead. Are there >> any >> other options? >> > > Problem with Kconfig is that it cannot be used when there are multiple > flash instances on the board and both behave differently. > > This is similar to big-endian vs little-endian property used by CFI > physmap driver and other places in kernel[1] On more thing to consider is that if we are using Kconfig things cannot be changed during runtime. While this might be ok when you are building your typical embedded distribution, there are efforts to be able to boot standard distributions on embedded platforms (see ARM SystemReady). A Kconfig option won't work there. -michael >>> >>> 2. SPI controllers don't support endian-ness SWAP on the fly: >>> It is still possible to reliably read and write data as long as its >>> written and read back in same mode. >>> >>> De-rating speeds because of absence of this support would mean >>> reduction >>> of speed by **16 times** (maybe even higher as 8D mode tends to >>> support >>> higher bus freqs). Swapping bytes in Linux before writing or after >>> reading is not an option either as it negatively impacts performance. >>> >>> Asking ROM/bootloaders to swap bytes based on SFDP indication is >>> restrictive too as it involves boot time penalty and most systems >>> with >>> OSPI flashes are using them to achieve super fast boot times. >>> >>> One more case to consider is flashes that dont have SFDP table to >>> indicate byte order but follow Macronix's convention. In such cases, >>> its >>> better for SPI NOR layer to be as dumb as possible and not really do >>> any >>> byte swapping, leaving it up to user space to handle/interpret data >>> appropriately. >>> >>> Also, Macronix is probably in violation of xSPI spec (JESD251A >>> 6.9.5.2 >>> 8D-8D-8D Profile 1.0) where diagrams clearly show data output should >>> be >>> D0 D1 D2 D3... So ROMs following xSPI spec (which is the only spec >>> providing flash agnostic way of switching to 8D mode and reading data >>> in >>> 8D mode) would not care about SFDP bit indicating byteorder and its >>> up >>> to flasher programs to take care of the same >> >> This is a bit in contradiction, because if the ROMs follow xSPI, thus >> little >> endian byte order, they should swap the bytes. >> > > As I indicated below xSPI (JESD251) spec seems to align on little > endian > order. So, don't really need to swap the bytes on read. Anyways impact > on performance prohibits such support and thus most ROM designs will > expect flashing program to take care of it (especially in case where > controller does not support on the fly swapping). > >>> >>> IMO, kernel device drivers should just provide access to underlying >>> HW >>> and not have too much intelligence to interpret data/take decisions >>> >>> So, simpler constraint to put is: >>> Flasher programs should program data in the same mode in which >>> ROM/bootloder/Linux is expected to read the data on that system. >> >> No, this constraint doesn't cover all possible cases: take a 1-1-1 >> ROMcode, >> 8D-8D-8D for other bootloaders and kernel. You need to dynamically >> change modes >> in the flasher program in order to address this use case, which is a >> no go. >> > > Flash programmer need not change the mode, but needs to generate a byte > swapped image and write it knowing that image is being flashed in > 8D-8D-8D mode and read in 1-1-1 mode or vice versa. > >>> >>> For Macronix like flashes, if one has a ROM/bootloader that only >>> supports 1-1-1 mode and flashing data in 8D-8D-8D mode with Linux, >>> then >>> please generate a byte-swapped image offline and flash it. Don't >>> impose >> >> we can't do that, see the example from above. >> > > See above, no need to change mode dynamically. > >>> penalty on systems that do best to handle this messy situation. >>> I see this as the only option with least performance penalty. >>> >> >> I take from this that we should let the byte swap be >> user-configurable, >> thus a Kconfig. Which I'm not against it, but it will give users >> headaches to sync all the software components. Making such a decision >> implies that users know SPI NOR internal details, which also is a bit >> stretched. Let's sync so that we move forward with this. Opinions? >> > > I am fine with things being configurable > > Only issues with Kconfig: it does not help systems with multiple OSPIs > with different endian-ness. > > > [1] Documentation/devicetree/bindings/mtd/mtd-physmap.yaml > > > Regards > Vignesh
Thought... Can you read the device in STR mode until you get a suitable non-palindromic value, then read it in DTR mode and dynamically determine the byte order? Clearly this won't work if the device is erased to all 0xff. But a check could be done on/after the first write. I suspect write times are actually dominated by the time spent waiting for the write to complete? (Never mind the earlier block erase time.) So always writing in STR mode probably makes little difference? Writes really ought to be uncommon as well. Speeding up reads is a different matter - and probably useful. Of course, if you've got hardware reading the spi memory in DTR mode for config data you might need to byteswap it (compared to the STR writes) - but that is probably a 2nd order problem. I've got some bespoke logic on an PCIe fpga for accessing spi memory. Uses address bits for the control signals and converts a 32bit read/write into 8 nibble transfers to the chip. (uses byte enables - don't an odd number of clocks.) mmapp()ed to userspace for updating the 6MB fpga image. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
Hi, Am 2022-03-16 14:55, schrieb David Laight: > Thought... Thank you for your proposal. > Can you read the device in STR mode until you get a suitable > non-palindromic value, then read it in DTR mode and dynamically > determine the byte order? > > Clearly this won't work if the device is erased to all 0xff. > But a check could be done on/after the first write. > > I suspect write times are actually dominated by the time spent > waiting for the write to complete? > (Never mind the earlier block erase time.) > So always writing in STR mode probably makes little difference? > Writes really ought to be uncommon as well. > > Speeding up reads is a different matter - and probably useful. > > Of course, if you've got hardware reading the spi memory in DTR > mode for config data you might need to byteswap it (compared > to the STR writes) - but that is probably a 2nd order problem. > > I've got some bespoke logic on an PCIe fpga for accessing spi memory. > Uses address bits for the control signals and converts a 32bit > read/write into 8 nibble transfers to the chip. > (uses byte enables - don't an odd number of clocks.) > mmapp()ed to userspace for updating the 6MB fpga image. Our problem is not how to detect that we have to swap it, but rather what we do when we have to do it. If we have a controller which can swap the bytes for us on the fly, we are lucky and can enable swapping if we need it. We are also lucky when we don't have to swap the flash contents, obviously. But what do we do when we need to swap it and the controller doesn't support it. We could do it in software which will slow things down. So depending on the use case this might or might not work. We can degrade it to a speed which doesn't have this issue; which might be 1-1-1 in the worst case. We could also do just nothing special; but this will lead to inconsistencies between reading in 1-1-1 and 8d-8d-8d. -michael
From: Michael Walle > Sent: 17 March 2022 09:40 > > Am 2022-03-16 14:55, schrieb David Laight: > > Thought... > > Thank you for your proposal. > > > Can you read the device in STR mode until you get a suitable > > non-palindromic value, then read it in DTR mode and dynamically > > determine the byte order? > > > > Clearly this won't work if the device is erased to all 0xff. > > But a check could be done on/after the first write. > > > > I suspect write times are actually dominated by the time spent > > waiting for the write to complete? > > (Never mind the earlier block erase time.) > > So always writing in STR mode probably makes little difference? > > Writes really ought to be uncommon as well. > > > > Speeding up reads is a different matter - and probably useful. > > > > Of course, if you've got hardware reading the spi memory in DTR > > mode for config data you might need to byteswap it (compared > > to the STR writes) - but that is probably a 2nd order problem. > > > > I've got some bespoke logic on an PCIe fpga for accessing spi memory. > > Uses address bits for the control signals and converts a 32bit > > read/write into 8 nibble transfers to the chip. > > (uses byte enables - don't an odd number of clocks.) > > mmapp()ed to userspace for updating the 6MB fpga image. > > Our problem is not how to detect that we have to swap it, but > rather what we do when we have to do it. > > If we have a controller which can swap the bytes for us on the > fly, we are lucky and can enable swapping if we need it. We are > also lucky when we don't have to swap the flash contents, obviously. > > But what do we do when we need to swap it and the controller > doesn't support it. We could do it in software which will slow > things down. So depending on the use case this might or might not > work. We can degrade it to a speed which doesn't have this issue; > which might be 1-1-1 in the worst case. We could also do just > nothing special; but this will lead to inconsistencies between > reading in 1-1-1 and 8d-8d-8d. I really doubt you'll notice the effects of a software byteswap compared to the actual time taken to do an spi read. What's the maximum clock rate for spi memory? Something like 50MHz ? If the spi controller isn't doing dma then the cpu pio reads to get the data are very likely to be even slower than that. (Especially if they are PCIe reads.) David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On 17/03/22 3:44 pm, David Laight wrote: > From: Michael Walle >> Sent: 17 March 2022 09:40 >> >> Am 2022-03-16 14:55, schrieb David Laight: >>> Thought... >> >> Thank you for your proposal. >> >>> Can you read the device in STR mode until you get a suitable >>> non-palindromic value, then read it in DTR mode and dynamically >>> determine the byte order? >>> >>> Clearly this won't work if the device is erased to all 0xff. >>> But a check could be done on/after the first write. >>> >>> I suspect write times are actually dominated by the time spent >>> waiting for the write to complete? >>> (Never mind the earlier block erase time.) >>> So always writing in STR mode probably makes little difference? >>> Writes really ought to be uncommon as well. >>> >>> Speeding up reads is a different matter - and probably useful. >>> >>> Of course, if you've got hardware reading the spi memory in DTR >>> mode for config data you might need to byteswap it (compared >>> to the STR writes) - but that is probably a 2nd order problem. >>> >>> I've got some bespoke logic on an PCIe fpga for accessing spi memory. >>> Uses address bits for the control signals and converts a 32bit >>> read/write into 8 nibble transfers to the chip. >>> (uses byte enables - don't an odd number of clocks.) >>> mmapp()ed to userspace for updating the 6MB fpga image. >> >> Our problem is not how to detect that we have to swap it, but >> rather what we do when we have to do it. >> >> If we have a controller which can swap the bytes for us on the >> fly, we are lucky and can enable swapping if we need it. We are >> also lucky when we don't have to swap the flash contents, obviously. >> >> But what do we do when we need to swap it and the controller >> doesn't support it. We could do it in software which will slow >> things down. So depending on the use case this might or might not >> work. We can degrade it to a speed which doesn't have this issue; >> which might be 1-1-1 in the worst case. We could also do just >> nothing special; but this will lead to inconsistencies between >> reading in 1-1-1 and 8d-8d-8d. > > I really doubt you'll notice the effects of a software byteswap > compared to the actual time taken to do an spi read. > > What's the maximum clock rate for spi memory? > Something like 50MHz ? We have Octal SPI flashes running at upwards of 200MHz clock (400MB/s) so SW byteswap will add significant overhead. > If the spi controller isn't doing dma then the cpu pio reads > to get the data are very likely to be even slower than that. > (Especially if they are PCIe reads.) > Modern OSPI/QSPI flash controllers provide MMIO interface to read from flash where DMA can pull data as if though you are reading from On chip RAM Regards Vignesh
From: Vignesh Raghavendra > Sent: 17 March 2022 10:24 ... > Modern OSPI/QSPI flash controllers provide MMIO interface to read from > flash where DMA can pull data as if though you are reading from On chip RAM So the cpu does an MMIO read cycle to the controller which doesn't complete until (for the nibble-mode spi device I have): 1) Chipselect is asserted. 2) The 8-bit command has been clocked out. 3) The 32bit address have been clocked out (8 clocks in nibbles). 4) A few (probably 4) extra delay clocks are added. 5) The data is read - 8 clocks for 32bits in nibble mode. 6) Chipselect is removed. Now you can do long sequential reads without all the red tape. But a random read in nibble mode is about 30 clocks. 16 bit mode saves 6 clocks for the data and maybe 6 for the address? The controller could do 'clever stuff' for sequential reads. At a cost of slowing down random reads. So even at 400MHz it isn't that fast. If the MMIO interface to the flash controller is PCIe you can add in a load of extra latency for the cpu read itself. While PCIe allows multiple read requests to be outstanding, the Intel cpu I've looked at serialise the reads from each cpu core (each cpu always uses the same TLP tag). Now longer read TLP help a lot (IIRC max is 256 bytes). But the x86 cpu will only generate read TLP for register reads. You need to use AVX512 registers (or cache line fetches) to get better throughput! The alternative is getting the flash controller to issue the read/write TLP for memory transfers. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On 17/03/22 4:40 pm, David Laight wrote: > From: Vignesh Raghavendra >> Sent: 17 March 2022 10:24 > ... >> Modern OSPI/QSPI flash controllers provide MMIO interface to read from >> flash where DMA can pull data as if though you are reading from On chip RAM > > So the cpu does an MMIO read cycle to the controller which doesn't > complete until (for the nibble-mode spi device I have): > 1) Chipselect is asserted. > 2) The 8-bit command has been clocked out. > 3) The 32bit address have been clocked out (8 clocks in nibbles). > 4) A few (probably 4) extra delay clocks are added. > 5) The data is read - 8 clocks for 32bits in nibble mode. > 6) Chipselect is removed. > > Now you can do long sequential reads without all the red tape. > But a random read in nibble mode is about 30 clocks. > 16 bit mode saves 6 clocks for the data and maybe 6 for the address? > > The controller could do 'clever stuff' for sequential reads. > At a cost of slowing down random reads. > > So even at 400MHz it isn't that fast. Random CPU reads would be inherently slow, its just how HW is. But, there are cases like image load from flash and Filesystem over flash which would use DMA to maximize performance, such cases would be greatly affected if we do SW byte swap > > If the MMIO interface to the flash controller is PCIe you can > add in a load of extra latency for the cpu read itself. > > While PCIe allows multiple read requests to be outstanding, > the Intel cpu I've looked at serialise the reads from each > cpu core (each cpu always uses the same TLP tag). > > Now longer read TLP help a lot (IIRC max is 256 bytes). > But the x86 cpu will only generate read TLP for register reads. > You need to use AVX512 registers (or cache line fetches) to > get better throughput! > Direct CPU fetch from SPI would not be able to make use of full Bandwidth for high speed flashes and its not the only usecase. Regards Vignesh