Message ID | 20180816113417.3641-1-mb@lightnvm.io (mailing list archive) |
---|---|
Series | lightnvm: move bad block and chunk state logic to core |
> On 16 Aug 2018, at 13.34, Matias Bjørling <mb@lightnvm.io> wrote:
>
> This patch moves the 1.2 and 2.0 block/chunk metadata retrieval to
> core.
>
> Hi Javier, I did not end up using your patch. I had misunderstood what
> was implemented. Instead I implemented the detection of each chunk by
> first sensing the first page, then the last page, and if the chunk
> is sensed as open, a per-page scan will be executed to update the write
> pointer appropriately.
>

I see why you want to do it this way to maintain the chunk
abstraction, but this is potentially very inefficient, as blocks not used
by any target will be recovered unnecessarily. Note that in 1.2 it is
expected that targets recover the write pointer themselves. What is
more, in the normal path this will be part of the metadata being
stored, so no write-pointer recovery is needed. Still, this approach
forces recovery on each 1.2 instance creation (also on factory reset).
In this context, you are right: the patch I proposed only addresses the
double-erase issue, which was the original motivation, and leaves the
actual write-pointer recovery to the normal pblk recovery process.

Besides this, in order to consider this as a real possibility, we need
to measure the impact on startup time. For this, could you implement
nvm_bb_scan_chunk() and nvm_bb_chunk_sense() more efficiently by
recovering (i) asynchronously and (ii) concurrently across luns, so that
we can establish the recovery cost more fairly? We can look at specific
penalty ranges afterwards.

Also, the recovery scheme in pblk will change significantly by doing
this, so I assume you will send a follow-up patchset reimplementing
recovery for the 1.2 path? I am rebasing write-pointer recovery for 2.0
now and expect to post it in the next couple of days. This logic can be
reused, but it requires some work and testing. A preliminary version of
this patch can be found here [1].

> Note that one needs a real drive to test the implementation. The 1.2
> qemu implementation is lacking. I did update it a bit, such that
> it defaults to all blocks being free. It can be picked up in the ocssd
> qemu repository.

I added patches to fix storing/recovering chunk metadata in qemu. This
should help you generate an arbitrary chunk state and sanity-test these
patches.

Can you share the tests that you have run to verify this patch? I can
run them on a 1.2 device next week (preferably on a V3 that includes
the comments above).

[1] https://github.com/OpenChannelSSD/linux/commits/for-4.20/pblk: 3c9c548a83ce

Javier
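For readers following along, the detection scheme described above (sense the first page, sense the last page, and only fall back to a page-by-page scan when the chunk turns out to be open) can be sketched roughly as below. This is an illustrative sketch only: the helper page_is_written() and the state/return conventions are assumptions made here for clarity, not the actual nvm_bb_chunk_sense()/nvm_bb_scan_chunk() code from the patch.

```c
#include <stdio.h>

enum chunk_state { CHUNK_FREE, CHUNK_OPEN, CHUNK_CLOSED };

/*
 * Placeholder standing in for a real media read plus an "is this page
 * written?" check (e.g. ECC/all-0xFF detection). Hypothetical helper,
 * not part of the lightnvm API.
 */
static int page_is_written(int chunk, int page)
{
	/* Pretend the first 300 pages of chunk 0 are written. */
	return chunk == 0 && page < 300;
}

static enum chunk_state detect_chunk_state(int chunk, int pages_per_chunk,
					   int *write_pointer)
{
	/* First page empty: the whole chunk is free. */
	if (!page_is_written(chunk, 0)) {
		*write_pointer = 0;
		return CHUNK_FREE;
	}

	/* Last page written: the chunk is fully written (closed). */
	if (page_is_written(chunk, pages_per_chunk - 1)) {
		*write_pointer = pages_per_chunk;
		return CHUNK_CLOSED;
	}

	/* Open chunk: scan page by page until the first unwritten page. */
	*write_pointer = 1;
	while (*write_pointer < pages_per_chunk &&
	       page_is_written(chunk, *write_pointer))
		(*write_pointer)++;

	return CHUNK_OPEN;
}

int main(void)
{
	int wp;
	enum chunk_state state = detect_chunk_state(0, 1024, &wp);

	printf("chunk 0: state=%d write_pointer=%d\n", state, wp);
	return 0;
}
```

The key property is that free and closed chunks cost one or two reads each; only open chunks pay for the full scan.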
On 08/16/2018 05:53 PM, Javier Gonzalez wrote:
>> On 16 Aug 2018, at 13.34, Matias Bjørling <mb@lightnvm.io> wrote:
>>
>> This patch moves the 1.2 and 2.0 block/chunk metadata retrieval to
>> core.
>> [...]
>
> I see why you want to do it this way to maintain the chunk
> abstraction, but this is potentially very inefficient, as blocks not used
> by any target will be recovered unnecessarily.

True. It will be up to the target not to ask for more metadata than
necessary (similarly for 2.0).

> Note that in 1.2 it is
> expected that targets recover the write pointer themselves. What is
> more, in the normal path this will be part of the metadata being
> stored, so no write-pointer recovery is needed. Still, this approach
> forces recovery on each 1.2 instance creation (also on factory reset).
> In this context, you are right: the patch I proposed only addresses the
> double-erase issue, which was the original motivation, and leaves the
> actual write-pointer recovery to the normal pblk recovery process.
>
> Besides this, in order to consider this as a real possibility, we need
> to measure the impact on startup time. For this, could you implement
> nvm_bb_scan_chunk() and nvm_bb_chunk_sense() more efficiently by
> recovering (i) asynchronously and (ii) concurrently across luns, so that
> we can establish the recovery cost more fairly? We can look at specific
> penalty ranges afterwards.

Honestly, 1.2 is deprecated. I don't care about the performance; I care
about this being easy to maintain, so it doesn't bog me down in the
future.

A back-of-the-envelope calculation for a 64-die SSD with 1024 blocks per
die and a 60us read time gives about 4 seconds to scan if all chunks are
free, and a worst case of something like ~10 seconds. -> Not a problem
for me.

> Also, the recovery scheme in pblk will change significantly by doing
> this, so I assume you will send a follow-up patchset reimplementing
> recovery for the 1.2 path?

The 1.2 path shouldn't be necessary after this. That is the idea of
this work. Obviously, the set bad block interface will have to be
preserved and called.
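For what it's worth, the ~4 s figure checks out if each free chunk costs a single 60 µs first-page read and the reads are issued strictly serially:

```latex
64 \ \text{dies} \times 1024 \ \tfrac{\text{blocks}}{\text{die}}
  \times 60\,\mu\text{s} \approx 3.9\ \text{s}
```

If every chunk instead needs two sensing reads (first and last page), the serial time roughly doubles to about 7.9 s, which is consistent with the quoted ~10 s worst case only as long as very few chunks need a full page-by-page scan.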
> On 17 Aug 2018, at 10.21, Matias Bjørling <mb@lightnvm.io> wrote:
>
> On 08/16/2018 05:53 PM, Javier Gonzalez wrote:
>>> On 16 Aug 2018, at 13.34, Matias Bjørling <mb@lightnvm.io> wrote:
>>> [...]
>> I see why you want to do it this way to maintain the chunk
>> abstraction, but this is potentially very inefficient, as blocks not used
>> by any target will be recovered unnecessarily.
>
> True. It will be up to the target not to ask for more metadata than
> necessary (similarly for 2.0).
>
>> [...]
>> Besides this, in order to consider this as a real possibility, we need
>> to measure the impact on startup time. For this, could you implement
>> nvm_bb_scan_chunk() and nvm_bb_chunk_sense() more efficiently by
>> recovering (i) asynchronously and (ii) concurrently across luns, so that
>> we can establish the recovery cost more fairly? We can look at specific
>> penalty ranges afterwards.
>
> Honestly, 1.2 is deprecated.

For some...

> I don't care about the performance; I care
> about this being easy to maintain, so it doesn't bog me down in the
> future.

This should be stated clearly in the commit message.

> A back-of-the-envelope calculation for a 64-die SSD with 1024 blocks per
> die and a 60us read time gives about 4 seconds to scan if all chunks are
> free, and a worst case of something like ~10 seconds. -> Not a problem
> for me.

Worst case is _much_ worse than 10s if you need to scan the block to
find the write pointer. We are talking minutes.

At least make the recovery reads asynchronous. It is low-hanging fruit
and will help the average case significantly.

>> Also, the recovery scheme in pblk will change significantly by doing
>> this, so I assume you will send a follow-up patchset reimplementing
>> recovery for the 1.2 path?
>
> The 1.2 path shouldn't be necessary after this. That is the idea of
> this work. Obviously, the set bad block interface will have to be
> preserved and called.

If we base this patch on top of my 2.0 recovery, we will still need to
make changes to support all 1.2 corner cases.

How do you want to do it? We get this patch in shape and I rebase on
top, or the other way around?

Javier
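The asynchronous/concurrent suggestion amounts to keeping a sensing read in flight per LUN instead of walking the whole device serially. Below is a minimal user-space sketch of the idea, assuming one worker thread per LUN; this is not the kernel implementation, and the constants and the sense_chunk() helper are made up for illustration only.

```c
/*
 * Illustrative sketch: scan chunks concurrently, one worker per LUN,
 * so per-LUN sensing reads overlap instead of being issued serially
 * across the whole device.
 */
#include <pthread.h>
#include <stdio.h>

#define NR_LUNS		64
#define BLKS_PER_LUN	1024

/* Placeholder for an actual first/last-page sense of one chunk. */
static void sense_chunk(int lun, int blk)
{
	(void)lun;
	(void)blk;
	/* media read(s) would go here */
}

static void *scan_lun(void *arg)
{
	int lun = (int)(long)arg;
	int blk;

	for (blk = 0; blk < BLKS_PER_LUN; blk++)
		sense_chunk(lun, blk);

	return NULL;
}

int main(void)
{
	pthread_t workers[NR_LUNS];
	int lun;

	/* One worker per LUN: total scan time becomes roughly the time
	 * to scan a single LUN rather than the whole device serially. */
	for (lun = 0; lun < NR_LUNS; lun++)
		pthread_create(&workers[lun], NULL, scan_lun,
			       (void *)(long)lun);

	for (lun = 0; lun < NR_LUNS; lun++)
		pthread_join(workers[lun], NULL);

	printf("scanned %d luns x %d blocks\n", NR_LUNS, BLKS_PER_LUN);
	return 0;
}
```

In the ideal case where the device can service all LUNs in parallel, the serial 64 × 1024 × 60 µs scan shrinks to roughly the time for a single LUN (about 1024 × 60 µs ≈ 61 ms) plus command overhead, which is why this is described as low-hanging fruit.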
On 08/17/2018 10:44 AM, Javier Gonzalez wrote:
>> On 17 Aug 2018, at 10.21, Matias Bjørling <mb@lightnvm.io> wrote:
>> [...]
>>
>> Honestly, 1.2 is deprecated.
>
> For some...

No. OCSSD 1.2 is deprecated. Others that have a derivative of 1.2 have
their own storage stack and spec that they will continue development
on, which cannot be expected to be compatible with the OCSSD 1.2 that
is implemented in the lightnvm subsystem.

>> I don't care about the performance; I care
>> about this being easy to maintain, so it doesn't bog me down in the
>> future.
>
> This should be stated clearly in the commit message.
>
>> A back-of-the-envelope calculation for a 64-die SSD with 1024 blocks per
>> die and a 60us read time gives about 4 seconds to scan if all chunks are
>> free, and a worst case of something like ~10 seconds. -> Not a problem
>> for me.
>
> Worst case is _much_ worse than 10s if you need to scan the block to
> find the write pointer. We are talking minutes.

I think you may be assuming that all blocks are open. My assumption is
that this is very rare (given the NAND characteristics). At most a
couple of blocks may be open per die. That leads me to the time quoted.

> At least make the recovery reads asynchronous. It is low-hanging fruit
> and will help the average case significantly.
>
>> The 1.2 path shouldn't be necessary after this. That is the idea of
>> this work. Obviously, the set bad block interface will have to be
>> preserved and called.
>
> If we base this patch on top of my 2.0 recovery, we will still need to
> make changes to support all 1.2 corner cases.
>
> How do you want to do it? We get this patch in shape and I rebase on
> top, or the other way around?

I'll pull this in when you've tested it with your 1.2 implementation.
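To put rough numbers on the disagreement (purely illustrative; the thread does not state a page count, so 512 pages per block, a 60 µs page read, and an average scan of half a block are assumed here): under the assumption of a couple of open blocks per die, the page-by-page scanning adds little on top of the sensing reads, whereas with every block open the scan dominates.

```latex
% ~2 open blocks per die, each scanned ~halfway on average:
64 \times 2 \times 256 \times 60\,\mu\text{s} \approx 2\ \text{s}
\quad\Rightarrow\quad \text{sensing} + \text{scans} \approx 10\ \text{s}

% every block open, each scanned ~halfway on average:
64 \times 1024 \times 256 \times 60\,\mu\text{s}
  \approx 1000\ \text{s} \approx 17\ \text{min}
```

Which scenario is realistic is exactly the point of contention: the ~10 s estimate rests on open blocks being rare, the "minutes" estimate on not assuming anything about the state of the media.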
> On 17 Aug 2018, at 11.29, Matias Bjørling <mb@lightnvm.io> wrote:
>
> On 08/17/2018 10:44 AM, Javier Gonzalez wrote:
>> [...]
>>> Honestly, 1.2 is deprecated.
>> For some...
>
> No. OCSSD 1.2 is deprecated. Others that have a derivative of 1.2 have
> their own storage stack and spec that they will continue development
> on, which cannot be expected to be compatible with the OCSSD 1.2 that
> is implemented in the lightnvm subsystem.

There are 1.2 devices out there using the current stack with no changes.

>>> A back-of-the-envelope calculation for a 64-die SSD with 1024 blocks per
>>> die and a 60us read time gives about 4 seconds to scan if all chunks are
>>> free, and a worst case of something like ~10 seconds. -> Not a problem
>>> for me.
>> Worst case is _much_ worse than 10s if you need to scan the block to
>> find the write pointer. We are talking minutes.
>
> I think you may be assuming that all blocks are open. My assumption is
> that this is very rare (given the NAND characteristics). At most a
> couple of blocks may be open per die. That leads me to the time quoted.

Worst case is worst case, no assumptions.

>> At least make the recovery reads asynchronous. It is low-hanging fruit
>> and will help the average case significantly.
>> [...]
>> If we base this patch on top of my 2.0 recovery, we will still need to
>> make changes to support all 1.2 corner cases.
>>
>> How do you want to do it? We get this patch in shape and I rebase on
>> top, or the other way around?
>
> I'll pull this in when you've tested it with your 1.2 implementation.

Please address the asynchronous read comment before considering pulling
this patch. There is really no reason not to improve this.

Javier
On 08/17/2018 11:34 AM, Javier Gonzalez wrote:
>> On 17 Aug 2018, at 11.29, Matias Bjørling <mb@lightnvm.io> wrote:
>> [...]
>> No. OCSSD 1.2 is deprecated. Others that have a derivative of 1.2 have
>> their own storage stack and spec that they will continue development
>> on, which cannot be expected to be compatible with the OCSSD 1.2 that
>> is implemented in the lightnvm subsystem.
>
> There are 1.2 devices out there using the current stack with no changes.

Yes, obviously, and they should continue to work. Which this patch
doesn't change.

>> I think you may be assuming that all blocks are open. My assumption is
>> that this is very rare (given the NAND characteristics). At most a
>> couple of blocks may be open per die. That leads me to the time quoted.
>
> Worst case is worst case, no assumptions.
>
> Please address the asynchronous read comment before considering pulling
> this patch. There is really no reason not to improve this.

I'll accept patches, but I won't spend time on it. Please let me know
if you have other comments.
> On 17 Aug 2018, at 11.42, Matias Bjørling <mb@lightnvm.io> wrote:
>
> On 08/17/2018 11:34 AM, Javier Gonzalez wrote:
>> [...]
>> There are 1.2 devices out there using the current stack with no changes.
>
> Yes, obviously, and they should continue to work. Which this patch
> doesn't change.
>
>> Worst case is worst case, no assumptions.
>>
>> Please address the asynchronous read comment before considering pulling
>> this patch. There is really no reason not to improve this.
>
> I'll accept patches, but I won't spend time on it. Please let me know
> if you have other comments.

Your choice to ignore my comment on an RFC at this early stage of the
4.20 window.

In any case, I tested on our 1.2 devices and the patch passes functional
testing.

Please do not add Reviewed-by or Tested-by tags with my name, as I do
not back the performance implications of the current implementation (on
an otherwise good cleanup patch).

Javier
On 08/17/2018 12:49 PM, Javier Gonzalez wrote:
>> On 17 Aug 2018, at 11.42, Matias Bjørling <mb@lightnvm.io> wrote:
>> [...]
>> I'll accept patches, but I won't spend time on it. Please let me know
>> if you have other comments.
>
> Your choice to ignore my comment on an RFC at this early stage of the
> 4.20 window.
>
> In any case, I tested on our 1.2 devices and the patch passes functional
> testing.
>
> Please do not add Reviewed-by or Tested-by tags with my name, as I do
> not back the performance implications of the current implementation (on
> an otherwise good cleanup patch).

Thanks for testing. I appreciate it.