Message ID | 1523431471-204238-1-git-send-email-shawn.lin@rock-chips.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 11/04/18 10:24, Shawn Lin wrote: > A simply continueous background I/O could 100% make the system stuck for > a long time in my system when a unbind for the controller driver happens > simultaneously. See: > > dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 && > echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind > > The reason is all pending requests wait for timeout one by one, but never > propagates BLK_STS_IOERR in the first place when kicked from the queue. > Set the card as removed immediately in mmc_remove_card() to solve it. > > Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> > --- > > drivers/mmc/core/bus.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c > index fc92c6c..c6c5dfe 100644 > --- a/drivers/mmc/core/bus.c > +++ b/drivers/mmc/core/bus.c > @@ -389,6 +389,7 @@ void mmc_remove_card(struct mmc_card *card) > pr_info("%s: card %04x removed\n", > mmc_hostname(card->host), card->rca); > } > + mmc_card_set_removed(card); Pedantically we should not call mmc_card_set_removed() if we have not claimed the host. Of course we can't claim the host because the block driver already has it, but I am not sure this is the right place to do this. My first question is how come the I/O times out if the card is still present i.e. you are only unbinding the host controller, so you should remove the card while the host controller and card are still functional? > device_del(&card->dev); > of_node_put(card->dev.of_node); > } > -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2018/4/11 16:46, Adrian Hunter wrote: > On 11/04/18 10:24, Shawn Lin wrote: >> A simply continueous background I/O could 100% make the system stuck for >> a long time in my system when a unbind for the controller driver happens >> simultaneously. See: >> >> dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 && >> echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind >> >> The reason is all pending requests wait for timeout one by one, but never >> propagates BLK_STS_IOERR in the first place when kicked from the queue. >> Set the card as removed immediately in mmc_remove_card() to solve it. >> >> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> >> --- >> >> drivers/mmc/core/bus.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c >> index fc92c6c..c6c5dfe 100644 >> --- a/drivers/mmc/core/bus.c >> +++ b/drivers/mmc/core/bus.c >> @@ -389,6 +389,7 @@ void mmc_remove_card(struct mmc_card *card) >> pr_info("%s: card %04x removed\n", >> mmc_hostname(card->host), card->rca); >> } >> + mmc_card_set_removed(card); > > Pedantically we should not call mmc_card_set_removed() if we have not > claimed the host. Of course we can't claim the host because the block yep. > driver already has it, but I am not sure this is the right place to do this. > My first question is how come the I/O times out if the card is still > present i.e. you are only unbinding the host controller, so you should > remove the card while the host controller and card are still functional? The card is still functional for my host(but maybe not if ->remove() touchs the vmmc or vqmmc parts), but the host controller isn't, as the ->remove() calls. > >> device_del(&card->dev); >> of_node_put(card->dev.of_node); >> } >> > > > > -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11/04/18 12:13, Shawn Lin wrote: > On 2018/4/11 16:46, Adrian Hunter wrote: >> On 11/04/18 10:24, Shawn Lin wrote: >>> A simply continueous background I/O could 100% make the system stuck for >>> a long time in my system when a unbind for the controller driver happens >>> simultaneously. See: >>> >>> dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 && >>> echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind >>> >>> The reason is all pending requests wait for timeout one by one, but never >>> propagates BLK_STS_IOERR in the first place when kicked from the queue. >>> Set the card as removed immediately in mmc_remove_card() to solve it. >>> >>> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> >>> --- >>> >>> drivers/mmc/core/bus.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c >>> index fc92c6c..c6c5dfe 100644 >>> --- a/drivers/mmc/core/bus.c >>> +++ b/drivers/mmc/core/bus.c >>> @@ -389,6 +389,7 @@ void mmc_remove_card(struct mmc_card *card) >>> pr_info("%s: card %04x removed\n", >>> mmc_hostname(card->host), card->rca); >>> } >>> + mmc_card_set_removed(card); >> >> Pedantically we should not call mmc_card_set_removed() if we have not >> claimed the host. Of course we can't claim the host because the block > > yep. > >> driver already has it, but I am not sure this is the right place to do this. >> My first question is how come the I/O times out if the card is still >> present i.e. you are only unbinding the host controller, so you should >> remove the card while the host controller and card are still functional? > > The card is still functional for my host(but maybe not if ->remove() > touchs the vmmc or vqmmc parts), but the host controller isn't, as the > ->remove() calls. I don't understand why the host controller isn't functional. I tried it with sdhci and it just waits for dd to finish but there are no I/O errors. Is there a use-case for unbinding quickly? > >> >>> device_del(&card->dev); >>> of_node_put(card->dev.of_node); >>> } >>> >> >> >> >> > > -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2018/4/11 18:26, Adrian Hunter wrote: > On 11/04/18 12:13, Shawn Lin wrote: >> On 2018/4/11 16:46, Adrian Hunter wrote: >>> On 11/04/18 10:24, Shawn Lin wrote: >>>> A simply continueous background I/O could 100% make the system stuck for >>>> a long time in my system when a unbind for the controller driver happens >>>> simultaneously. See: >>>> >>>> dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 && >>>> echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind >>>> >>>> The reason is all pending requests wait for timeout one by one, but never >>>> propagates BLK_STS_IOERR in the first place when kicked from the queue. >>>> Set the card as removed immediately in mmc_remove_card() to solve it. >>>> >>>> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> >>>> --- >>>> >>>> drivers/mmc/core/bus.c | 1 + >>>> 1 file changed, 1 insertion(+) >>>> >>>> diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c >>>> index fc92c6c..c6c5dfe 100644 >>>> --- a/drivers/mmc/core/bus.c >>>> +++ b/drivers/mmc/core/bus.c >>>> @@ -389,6 +389,7 @@ void mmc_remove_card(struct mmc_card *card) >>>> pr_info("%s: card %04x removed\n", >>>> mmc_hostname(card->host), card->rca); >>>> } >>>> + mmc_card_set_removed(card); >>> >>> Pedantically we should not call mmc_card_set_removed() if we have not >>> claimed the host. Of course we can't claim the host because the block >> >> yep. >> >>> driver already has it, but I am not sure this is the right place to do this. >>> My first question is how come the I/O times out if the card is still >>> present i.e. you are only unbinding the host controller, so you should >>> remove the card while the host controller and card are still functional? >> >> The card is still functional for my host(but maybe not if ->remove() >> touchs the vmmc or vqmmc parts), but the host controller isn't, as the >> ->remove() calls. > > I don't understand why the host controller isn't functional. I tried it I guess we have misunderstanding about what the "functional" means. What I want to say is when unbinding the host driver, mmc_remove_host() is called, so the target host isn't present from the view of core layer, right? > with sdhci and it just waits for dd to finish but there are no I/O errors. > > Is there a use-case for unbinding quickly? Oh, I think I made a mistake in commit msg, the repro steps should be (1) run dd in background: dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 & (2) unbind you driver, for instance: echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind Then yes, there is no I/O errors and it waits for dd to finish. But when waiting for dd to finish, console got stuck, and can't respond to any ctrl+{c,d,z}, even sysrq. The time depends how much I/O it submits, the above setp(1) make it stuck for 2 miniutes for my system. > >> >>> >>>> device_del(&card->dev); >>>> of_node_put(card->dev.of_node); >>>> } >>>> >>> >>> >>> >>> >> >> > > -- > To unsubscribe from this list: send the line "unsubscribe linux-mmc" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/04/18 03:43, Shawn Lin wrote: > On 2018/4/11 18:26, Adrian Hunter wrote: >> On 11/04/18 12:13, Shawn Lin wrote: >>> On 2018/4/11 16:46, Adrian Hunter wrote: >>>> On 11/04/18 10:24, Shawn Lin wrote: >>>>> A simply continueous background I/O could 100% make the system stuck for >>>>> a long time in my system when a unbind for the controller driver happens >>>>> simultaneously. See: >>>>> >>>>> dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 && >>>>> echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind >>>>> >>>>> The reason is all pending requests wait for timeout one by one, but never >>>>> propagates BLK_STS_IOERR in the first place when kicked from the queue. >>>>> Set the card as removed immediately in mmc_remove_card() to solve it. >>>>> >>>>> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> >>>>> --- >>>>> >>>>> drivers/mmc/core/bus.c | 1 + >>>>> 1 file changed, 1 insertion(+) >>>>> >>>>> diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c >>>>> index fc92c6c..c6c5dfe 100644 >>>>> --- a/drivers/mmc/core/bus.c >>>>> +++ b/drivers/mmc/core/bus.c >>>>> @@ -389,6 +389,7 @@ void mmc_remove_card(struct mmc_card *card) >>>>> pr_info("%s: card %04x removed\n", >>>>> mmc_hostname(card->host), card->rca); >>>>> } >>>>> + mmc_card_set_removed(card); >>>> >>>> Pedantically we should not call mmc_card_set_removed() if we have not >>>> claimed the host. Of course we can't claim the host because the block >>> >>> yep. >>> >>>> driver already has it, but I am not sure this is the right place to do >>>> this. >>>> My first question is how come the I/O times out if the card is still >>>> present i.e. you are only unbinding the host controller, so you should >>>> remove the card while the host controller and card are still functional? >>> >>> The card is still functional for my host(but maybe not if ->remove() >>> touchs the vmmc or vqmmc parts), but the host controller isn't, as the >>> ->remove() calls. >> >> I don't understand why the host controller isn't functional. I tried it > > I guess we have misunderstanding about what the "functional" means. What > I want to say is when unbinding the host driver, mmc_remove_host() is > called, so the target host isn't present from the view of core layer, > right? > >> with sdhci and it just waits for dd to finish but there are no I/O errors. >> >> Is there a use-case for unbinding quickly? > > Oh, I think I made a mistake in commit msg, the repro steps should be > (1) run dd in background: > dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 & > (2) unbind you driver, for instance: > echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind > > Then yes, there is no I/O errors and it waits for dd to finish. But > when waiting for dd to finish, console got stuck, and can't respond to > any ctrl+{c,d,z}, even sysrq. The time depends how much I/O it submits, > the above setp(1) make it stuck for 2 miniutes for my system. Job control on a console might not always work, especially with busybox. I guess if you run the unbind in the background, then there is not a problem. Given that unbinding the host controller while it is still in use is a bad thing, why do we care? -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2018/4/12 15:03, Adrian Hunter wrote: > On 12/04/18 03:43, Shawn Lin wrote: >> On 2018/4/11 18:26, Adrian Hunter wrote: >>> On 11/04/18 12:13, Shawn Lin wrote: >>>> On 2018/4/11 16:46, Adrian Hunter wrote: >>>>> On 11/04/18 10:24, Shawn Lin wrote: >>>>>> A simply continueous background I/O could 100% make the system stuck for >>>>>> a long time in my system when a unbind for the controller driver happens >>>>>> simultaneously. See: >>>>>> >>>>>> dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 && >>>>>> echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind >>>>>> >>>>>> The reason is all pending requests wait for timeout one by one, but never >>>>>> propagates BLK_STS_IOERR in the first place when kicked from the queue. >>>>>> Set the card as removed immediately in mmc_remove_card() to solve it. >>>>>> >>>>>> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> >>>>>> --- >>>>>> >>>>>> drivers/mmc/core/bus.c | 1 + >>>>>> 1 file changed, 1 insertion(+) >>>>>> >>>>>> diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c >>>>>> index fc92c6c..c6c5dfe 100644 >>>>>> --- a/drivers/mmc/core/bus.c >>>>>> +++ b/drivers/mmc/core/bus.c >>>>>> @@ -389,6 +389,7 @@ void mmc_remove_card(struct mmc_card *card) >>>>>> pr_info("%s: card %04x removed\n", >>>>>> mmc_hostname(card->host), card->rca); >>>>>> } >>>>>> + mmc_card_set_removed(card); >>>>> >>>>> Pedantically we should not call mmc_card_set_removed() if we have not >>>>> claimed the host. Of course we can't claim the host because the block >>>> >>>> yep. >>>> >>>>> driver already has it, but I am not sure this is the right place to do >>>>> this. >>>>> My first question is how come the I/O times out if the card is still >>>>> present i.e. you are only unbinding the host controller, so you should >>>>> remove the card while the host controller and card are still functional? >>>> >>>> The card is still functional for my host(but maybe not if ->remove() >>>> touchs the vmmc or vqmmc parts), but the host controller isn't, as the >>>> ->remove() calls. >>> >>> I don't understand why the host controller isn't functional. I tried it >> >> I guess we have misunderstanding about what the "functional" means. What >> I want to say is when unbinding the host driver, mmc_remove_host() is >> called, so the target host isn't present from the view of core layer, >> right? >> >>> with sdhci and it just waits for dd to finish but there are no I/O errors. >>> >>> Is there a use-case for unbinding quickly? >> >> Oh, I think I made a mistake in commit msg, the repro steps should be >> (1) run dd in background: >> dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 & >> (2) unbind you driver, for instance: >> echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind >> >> Then yes, there is no I/O errors and it waits for dd to finish. But >> when waiting for dd to finish, console got stuck, and can't respond to >> any ctrl+{c,d,z}, even sysrq. The time depends how much I/O it submits, >> the above setp(1) make it stuck for 2 miniutes for my system. > > Job control on a console might not always work, especially with busybox. > I guess if you run the unbind in the background, then there is not a problem. The dd case is just a quick reproduce but the real case is the Android system has many vold(similar to udev) driven partition scan tools/applications that fire I/Os once sd card mounted. The applications is busy waiting there if we check background job by "top", which make the user experience very bad in this way. > > Given that unbinding the host controller while it is still in use is a bad > thing, why do we care? I just checked the log and wonder why mmc_mq_queue_rq still allows request to dispatch? Maybe it's a bad thing to unbind the controller when it's used, but that's not forbade, at least it should behave the same with real physically hot-plug. Another reason I do care about this is because I usually can't access to my test devices, but only remotely do job on their console. So if I need to reset/hot-plug the card, I usually do unbind/bind, but the partition scan applications + unbind make the console stuck for a longer time than expected which is very painful for me to wait for finishment. :( > > > -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/04/18 11:53, Shawn Lin wrote: > On 2018/4/12 15:03, Adrian Hunter wrote: >> On 12/04/18 03:43, Shawn Lin wrote: >>> On 2018/4/11 18:26, Adrian Hunter wrote: >>>> On 11/04/18 12:13, Shawn Lin wrote: >>>>> On 2018/4/11 16:46, Adrian Hunter wrote: >>>>>> On 11/04/18 10:24, Shawn Lin wrote: >>>>>>> A simply continueous background I/O could 100% make the system stuck for >>>>>>> a long time in my system when a unbind for the controller driver happens >>>>>>> simultaneously. See: >>>>>>> >>>>>>> dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 && >>>>>>> echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind >>>>>>> >>>>>>> The reason is all pending requests wait for timeout one by one, but >>>>>>> never >>>>>>> propagates BLK_STS_IOERR in the first place when kicked from the queue. >>>>>>> Set the card as removed immediately in mmc_remove_card() to solve it. >>>>>>> >>>>>>> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> >>>>>>> --- >>>>>>> >>>>>>> drivers/mmc/core/bus.c | 1 + >>>>>>> 1 file changed, 1 insertion(+) >>>>>>> >>>>>>> diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c >>>>>>> index fc92c6c..c6c5dfe 100644 >>>>>>> --- a/drivers/mmc/core/bus.c >>>>>>> +++ b/drivers/mmc/core/bus.c >>>>>>> @@ -389,6 +389,7 @@ void mmc_remove_card(struct mmc_card *card) >>>>>>> pr_info("%s: card %04x removed\n", >>>>>>> mmc_hostname(card->host), card->rca); >>>>>>> } >>>>>>> + mmc_card_set_removed(card); >>>>>> >>>>>> Pedantically we should not call mmc_card_set_removed() if we have not >>>>>> claimed the host. Of course we can't claim the host because the block >>>>> >>>>> yep. >>>>> >>>>>> driver already has it, but I am not sure this is the right place to do >>>>>> this. >>>>>> My first question is how come the I/O times out if the card is still >>>>>> present i.e. you are only unbinding the host controller, so you should >>>>>> remove the card while the host controller and card are still functional? >>>>> >>>>> The card is still functional for my host(but maybe not if ->remove() >>>>> touchs the vmmc or vqmmc parts), but the host controller isn't, as the >>>>> ->remove() calls. >>>> >>>> I don't understand why the host controller isn't functional. I tried it >>> >>> I guess we have misunderstanding about what the "functional" means. What >>> I want to say is when unbinding the host driver, mmc_remove_host() is >>> called, so the target host isn't present from the view of core layer, >>> right? >>> >>>> with sdhci and it just waits for dd to finish but there are no I/O errors. >>>> >>>> Is there a use-case for unbinding quickly? >>> >>> Oh, I think I made a mistake in commit msg, the repro steps should be >>> (1) run dd in background: >>> dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 & >>> (2) unbind you driver, for instance: >>> echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind >>> >>> Then yes, there is no I/O errors and it waits for dd to finish. But >>> when waiting for dd to finish, console got stuck, and can't respond to >>> any ctrl+{c,d,z}, even sysrq. The time depends how much I/O it submits, >>> the above setp(1) make it stuck for 2 miniutes for my system. >> >> Job control on a console might not always work, especially with busybox. >> I guess if you run the unbind in the background, then there is not a problem. > > The dd case is just a quick reproduce but the real case is the > Android system has many vold(similar to udev) driven partition scan > tools/applications that fire I/Os once sd card mounted. The applications > is busy waiting there if we check background job by "top", which make > the user experience very bad in this way. Is this related to unbinding the host controller? > >> >> Given that unbinding the host controller while it is still in use is a bad >> thing, why do we care? > > I just checked the log and wonder why mmc_mq_queue_rq still allows > request to dispatch? Maybe it's a bad thing to unbind the controller > when it's used, but that's not forbade, at least it should behave > the same with real physically hot-plug. Should it? From memory I vaguely recall seeing EXT4 doing a sync before its partition gets yanked from beneath it. If that is right it would tend to indicate that there is not an expectation that I/O should fail immediately. > Another reason I do care about this is because I usually can't access to > my test devices, but only remotely do job on their console. So if I need > to reset/hot-plug the card, I usually do unbind/bind, but the partition > scan applications + unbind make the console stuck for a longer time > than expected which is very painful for me to wait for finishment. :( Why not do the unbind in the background and then kill the partition scan? -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2018/4/12 17:19, Adrian Hunter wrote: > On 12/04/18 11:53, Shawn Lin wrote: >> On 2018/4/12 15:03, Adrian Hunter wrote: >>> On 12/04/18 03:43, Shawn Lin wrote: >>>> On 2018/4/11 18:26, Adrian Hunter wrote: >>>>> On 11/04/18 12:13, Shawn Lin wrote: >>>>>> On 2018/4/11 16:46, Adrian Hunter wrote: >>>>>>> On 11/04/18 10:24, Shawn Lin wrote: >>>>>>>> A simply continueous background I/O could 100% make the system stuck for >>>>>>>> a long time in my system when a unbind for the controller driver happens >>>>>>>> simultaneously. See: >>>>>>>> >>>>>>>> dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 && >>>>>>>> echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind >>>>>>>> >>>>>>>> The reason is all pending requests wait for timeout one by one, but >>>>>>>> never >>>>>>>> propagates BLK_STS_IOERR in the first place when kicked from the queue. >>>>>>>> Set the card as removed immediately in mmc_remove_card() to solve it. >>>>>>>> >>>>>>>> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> >>>>>>>> --- >>>>>>>> >>>>>>>> drivers/mmc/core/bus.c | 1 + >>>>>>>> 1 file changed, 1 insertion(+) >>>>>>>> >>>>>>>> diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c >>>>>>>> index fc92c6c..c6c5dfe 100644 >>>>>>>> --- a/drivers/mmc/core/bus.c >>>>>>>> +++ b/drivers/mmc/core/bus.c >>>>>>>> @@ -389,6 +389,7 @@ void mmc_remove_card(struct mmc_card *card) >>>>>>>> pr_info("%s: card %04x removed\n", >>>>>>>> mmc_hostname(card->host), card->rca); >>>>>>>> } >>>>>>>> + mmc_card_set_removed(card); >>>>>>> >>>>>>> Pedantically we should not call mmc_card_set_removed() if we have not >>>>>>> claimed the host. Of course we can't claim the host because the block >>>>>> >>>>>> yep. >>>>>> >>>>>>> driver already has it, but I am not sure this is the right place to do >>>>>>> this. >>>>>>> My first question is how come the I/O times out if the card is still >>>>>>> present i.e. you are only unbinding the host controller, so you should >>>>>>> remove the card while the host controller and card are still functional? >>>>>> >>>>>> The card is still functional for my host(but maybe not if ->remove() >>>>>> touchs the vmmc or vqmmc parts), but the host controller isn't, as the >>>>>> ->remove() calls. >>>>> >>>>> I don't understand why the host controller isn't functional. I tried it >>>> >>>> I guess we have misunderstanding about what the "functional" means. What >>>> I want to say is when unbinding the host driver, mmc_remove_host() is >>>> called, so the target host isn't present from the view of core layer, >>>> right? >>>> >>>>> with sdhci and it just waits for dd to finish but there are no I/O errors. >>>>> >>>>> Is there a use-case for unbinding quickly? >>>> >>>> Oh, I think I made a mistake in commit msg, the repro steps should be >>>> (1) run dd in background: >>>> dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 & >>>> (2) unbind you driver, for instance: >>>> echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind >>>> >>>> Then yes, there is no I/O errors and it waits for dd to finish. But >>>> when waiting for dd to finish, console got stuck, and can't respond to >>>> any ctrl+{c,d,z}, even sysrq. The time depends how much I/O it submits, >>>> the above setp(1) make it stuck for 2 miniutes for my system. >>> >>> Job control on a console might not always work, especially with busybox. >>> I guess if you run the unbind in the background, then there is not a problem. >> >> The dd case is just a quick reproduce but the real case is the >> Android system has many vold(similar to udev) driven partition scan >> tools/applications that fire I/Os once sd card mounted. The applications >> is busy waiting there if we check background job by "top", which make >> the user experience very bad in this way. > > Is this related to unbinding the host controller? > >> >>> >>> Given that unbinding the host controller while it is still in use is a bad >>> thing, why do we care? >> >> I just checked the log and wonder why mmc_mq_queue_rq still allows >> request to dispatch? Maybe it's a bad thing to unbind the controller >> when it's used, but that's not forbade, at least it should behave >> the same with real physically hot-plug. > > Should it? From memory I vaguely recall seeing EXT4 doing a sync before its > partition gets yanked from beneath it. If that is right it would tend to > indicate that there is not an expectation that I/O should fail immediately. I have little knowledge about EXT4 :),but that happend for cards with different fs format. > >> Another reason I do care about this is because I usually can't access to >> my test devices, but only remotely do job on their console. So if I need >> to reset/hot-plug the card, I usually do unbind/bind, but the partition >> scan applications + unbind make the console stuck for a longer time >> than expected which is very painful for me to wait for finishment. :( > > Why not do the unbind in the background and then kill the partition scan? Yes, I could :). But the conversation is deviating from core issue that why the MMC_CARD_REMOVED flag is left apart for the card but the card is removed. That being said, don't you think "mmc1: card 1234 removed" means mmc_card_removed(card) should be true? > > > -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 13/04/18 03:53, Shawn Lin wrote: > On 2018/4/12 17:19, Adrian Hunter wrote: >> On 12/04/18 11:53, Shawn Lin wrote: >>> On 2018/4/12 15:03, Adrian Hunter wrote: >>>> On 12/04/18 03:43, Shawn Lin wrote: >>>>> On 2018/4/11 18:26, Adrian Hunter wrote: >>>>>> On 11/04/18 12:13, Shawn Lin wrote: >>>>>>> On 2018/4/11 16:46, Adrian Hunter wrote: >>>>>>>> On 11/04/18 10:24, Shawn Lin wrote: >>>>>>>>> A simply continueous background I/O could 100% make the system >>>>>>>>> stuck for >>>>>>>>> a long time in my system when a unbind for the controller driver >>>>>>>>> happens >>>>>>>>> simultaneously. See: >>>>>>>>> >>>>>>>>> dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 && >>>>>>>>> echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind >>>>>>>>> >>>>>>>>> The reason is all pending requests wait for timeout one by one, but >>>>>>>>> never >>>>>>>>> propagates BLK_STS_IOERR in the first place when kicked from the >>>>>>>>> queue. >>>>>>>>> Set the card as removed immediately in mmc_remove_card() to solve it. >>>>>>>>> >>>>>>>>> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> drivers/mmc/core/bus.c | 1 + >>>>>>>>> 1 file changed, 1 insertion(+) >>>>>>>>> >>>>>>>>> diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c >>>>>>>>> index fc92c6c..c6c5dfe 100644 >>>>>>>>> --- a/drivers/mmc/core/bus.c >>>>>>>>> +++ b/drivers/mmc/core/bus.c >>>>>>>>> @@ -389,6 +389,7 @@ void mmc_remove_card(struct mmc_card *card) >>>>>>>>> pr_info("%s: card %04x removed\n", >>>>>>>>> mmc_hostname(card->host), card->rca); >>>>>>>>> } >>>>>>>>> + mmc_card_set_removed(card); >>>>>>>> >>>>>>>> Pedantically we should not call mmc_card_set_removed() if we have not >>>>>>>> claimed the host. Of course we can't claim the host because the block >>>>>>> >>>>>>> yep. >>>>>>> >>>>>>>> driver already has it, but I am not sure this is the right place to do >>>>>>>> this. >>>>>>>> My first question is how come the I/O times out if the card is >>>>>>>> still >>>>>>>> present i.e. you are only unbinding the host controller, so you should >>>>>>>> remove the card while the host controller and card are still >>>>>>>> functional? >>>>>>> >>>>>>> The card is still functional for my host(but maybe not if ->remove() >>>>>>> touchs the vmmc or vqmmc parts), but the host controller isn't, as the >>>>>>> ->remove() calls. >>>>>> >>>>>> I don't understand why the host controller isn't functional. I tried it >>>>> >>>>> I guess we have misunderstanding about what the "functional" means. What >>>>> I want to say is when unbinding the host driver, mmc_remove_host() is >>>>> called, so the target host isn't present from the view of core layer, >>>>> right? >>>>> >>>>>> with sdhci and it just waits for dd to finish but there are no I/O >>>>>> errors. >>>>>> >>>>>> Is there a use-case for unbinding quickly? >>>>> >>>>> Oh, I think I made a mistake in commit msg, the repro steps should be >>>>> (1) run dd in background: >>>>> dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 & >>>>> (2) unbind you driver, for instance: >>>>> echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind >>>>> >>>>> Then yes, there is no I/O errors and it waits for dd to finish. But >>>>> when waiting for dd to finish, console got stuck, and can't respond to >>>>> any ctrl+{c,d,z}, even sysrq. The time depends how much I/O it submits, >>>>> the above setp(1) make it stuck for 2 miniutes for my system. >>>> >>>> Job control on a console might not always work, especially with busybox. >>>> I guess if you run the unbind in the background, then there is not a >>>> problem. >>> >>> The dd case is just a quick reproduce but the real case is the >>> Android system has many vold(similar to udev) driven partition scan >>> tools/applications that fire I/Os once sd card mounted. The applications >>> is busy waiting there if we check background job by "top", which make >>> the user experience very bad in this way. >> >> Is this related to unbinding the host controller? >> >>> >>>> >>>> Given that unbinding the host controller while it is still in use is a bad >>>> thing, why do we care? >>> >>> I just checked the log and wonder why mmc_mq_queue_rq still allows >>> request to dispatch? Maybe it's a bad thing to unbind the controller >>> when it's used, but that's not forbade, at least it should behave >>> the same with real physically hot-plug. >> >> Should it? From memory I vaguely recall seeing EXT4 doing a sync before its >> partition gets yanked from beneath it. If that is right it would tend to >> indicate that there is not an expectation that I/O should fail immediately. > > I have little knowledge about EXT4 :),but that happend for cards with > different fs format. > >> >>> Another reason I do care about this is because I usually can't access to >>> my test devices, but only remotely do job on their console. So if I need >>> to reset/hot-plug the card, I usually do unbind/bind, but the partition >>> scan applications + unbind make the console stuck for a longer time >>> than expected which is very painful for me to wait for finishment. :( >> >> Why not do the unbind in the background and then kill the partition scan? > > Yes, I could :). But the conversation is deviating from core issue that > why the MMC_CARD_REMOVED flag is left apart for the card but the card > is removed. That being said, don't you think "mmc1: card 1234 removed" > means mmc_card_removed(card) should be true? No. It means the message is misleading for the unbind case. If you want a way to mark the card as removed, couldn't you add something to debugfs. -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2018/4/13 15:03, Adrian Hunter wrote: > On 13/04/18 03:53, Shawn Lin wrote: >> On 2018/4/12 17:19, Adrian Hunter wrote: >>> On 12/04/18 11:53, Shawn Lin wrote: >>>> On 2018/4/12 15:03, Adrian Hunter wrote: >>>>> On 12/04/18 03:43, Shawn Lin wrote: >>>>>> On 2018/4/11 18:26, Adrian Hunter wrote: >>>>>>> On 11/04/18 12:13, Shawn Lin wrote: >>>>>>>> On 2018/4/11 16:46, Adrian Hunter wrote: >>>>>>>>> On 11/04/18 10:24, Shawn Lin wrote: >>>>>>>>>> A simply continueous background I/O could 100% make the system >>>>>>>>>> stuck for >>>>>>>>>> a long time in my system when a unbind for the controller driver >>>>>>>>>> happens >>>>>>>>>> simultaneously. See: >>>>>>>>>> >>>>>>>>>> dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 && >>>>>>>>>> echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind >>>>>>>>>> >>>>>>>>>> The reason is all pending requests wait for timeout one by one, but >>>>>>>>>> never >>>>>>>>>> propagates BLK_STS_IOERR in the first place when kicked from the >>>>>>>>>> queue. >>>>>>>>>> Set the card as removed immediately in mmc_remove_card() to solve it. >>>>>>>>>> >>>>>>>>>> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> drivers/mmc/core/bus.c | 1 + >>>>>>>>>> 1 file changed, 1 insertion(+) >>>>>>>>>> >>>>>>>>>> diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c >>>>>>>>>> index fc92c6c..c6c5dfe 100644 >>>>>>>>>> --- a/drivers/mmc/core/bus.c >>>>>>>>>> +++ b/drivers/mmc/core/bus.c >>>>>>>>>> @@ -389,6 +389,7 @@ void mmc_remove_card(struct mmc_card *card) >>>>>>>>>> pr_info("%s: card %04x removed\n", >>>>>>>>>> mmc_hostname(card->host), card->rca); >>>>>>>>>> } >>>>>>>>>> + mmc_card_set_removed(card); >>>>>>>>> >>>>>>>>> Pedantically we should not call mmc_card_set_removed() if we have not >>>>>>>>> claimed the host. Of course we can't claim the host because the block >>>>>>>> >>>>>>>> yep. >>>>>>>> >>>>>>>>> driver already has it, but I am not sure this is the right place to do >>>>>>>>> this. >>>>>>>>> My first question is how come the I/O times out if the card is >>>>>>>>> still >>>>>>>>> present i.e. you are only unbinding the host controller, so you should >>>>>>>>> remove the card while the host controller and card are still >>>>>>>>> functional? >>>>>>>> >>>>>>>> The card is still functional for my host(but maybe not if ->remove() >>>>>>>> touchs the vmmc or vqmmc parts), but the host controller isn't, as the >>>>>>>> ->remove() calls. >>>>>>> >>>>>>> I don't understand why the host controller isn't functional. I tried it >>>>>> >>>>>> I guess we have misunderstanding about what the "functional" means. What >>>>>> I want to say is when unbinding the host driver, mmc_remove_host() is >>>>>> called, so the target host isn't present from the view of core layer, >>>>>> right? >>>>>> >>>>>>> with sdhci and it just waits for dd to finish but there are no I/O >>>>>>> errors. >>>>>>> >>>>>>> Is there a use-case for unbinding quickly? >>>>>> >>>>>> Oh, I think I made a mistake in commit msg, the repro steps should be >>>>>> (1) run dd in background: >>>>>> dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 & >>>>>> (2) unbind you driver, for instance: >>>>>> echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind >>>>>> >>>>>> Then yes, there is no I/O errors and it waits for dd to finish. But >>>>>> when waiting for dd to finish, console got stuck, and can't respond to >>>>>> any ctrl+{c,d,z}, even sysrq. The time depends how much I/O it submits, >>>>>> the above setp(1) make it stuck for 2 miniutes for my system. >>>>> >>>>> Job control on a console might not always work, especially with busybox. >>>>> I guess if you run the unbind in the background, then there is not a >>>>> problem. >>>> >>>> The dd case is just a quick reproduce but the real case is the >>>> Android system has many vold(similar to udev) driven partition scan >>>> tools/applications that fire I/Os once sd card mounted. The applications >>>> is busy waiting there if we check background job by "top", which make >>>> the user experience very bad in this way. >>> >>> Is this related to unbinding the host controller? >>> >>>> >>>>> >>>>> Given that unbinding the host controller while it is still in use is a bad >>>>> thing, why do we care? >>>> >>>> I just checked the log and wonder why mmc_mq_queue_rq still allows >>>> request to dispatch? Maybe it's a bad thing to unbind the controller >>>> when it's used, but that's not forbade, at least it should behave >>>> the same with real physically hot-plug. >>> >>> Should it? From memory I vaguely recall seeing EXT4 doing a sync before its >>> partition gets yanked from beneath it. If that is right it would tend to >>> indicate that there is not an expectation that I/O should fail immediately. >> >> I have little knowledge about EXT4 :),but that happend for cards with >> different fs format. >> >>> >>>> Another reason I do care about this is because I usually can't access to >>>> my test devices, but only remotely do job on their console. So if I need >>>> to reset/hot-plug the card, I usually do unbind/bind, but the partition >>>> scan applications + unbind make the console stuck for a longer time >>>> than expected which is very painful for me to wait for finishment. :( >>> >>> Why not do the unbind in the background and then kill the partition scan? >> >> Yes, I could :). But the conversation is deviating from core issue that >> why the MMC_CARD_REMOVED flag is left apart for the card but the card >> is removed. That being said, don't you think "mmc1: card 1234 removed" >> means mmc_card_removed(card) should be true? > > No. It means the message is misleading for the unbind case. Hmm.. I don't think so. The unbind case means the controller/slot has gone, so system can never access to the card any more. I admit it's physcially present, but not logically. So it looks same to me that I have a card in slot but the kernel lacks driver to support the card. > > If you want a way to mark the card as removed, couldn't you add something to > debugfs. It's irrelevant, as no in-kernel scenario check the debugfs. They only care about status propagated from MMC block layer. If the card could never be access any more, you shouldn't still allow requests pass through the first check of mmc_card_removed in mmc_mq_queue_rq. Do you agree? :) > > > -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 13/04/18 10:21, Shawn Lin wrote: > On 2018/4/13 15:03, Adrian Hunter wrote: >> On 13/04/18 03:53, Shawn Lin wrote: >>> On 2018/4/12 17:19, Adrian Hunter wrote: >>>> On 12/04/18 11:53, Shawn Lin wrote: >>>>> On 2018/4/12 15:03, Adrian Hunter wrote: >>>>>> On 12/04/18 03:43, Shawn Lin wrote: >>>>>>> On 2018/4/11 18:26, Adrian Hunter wrote: >>>>>>>> On 11/04/18 12:13, Shawn Lin wrote: >>>>>>>>> On 2018/4/11 16:46, Adrian Hunter wrote: >>>>>>>>>> On 11/04/18 10:24, Shawn Lin wrote: >>>>>>>>>>> A simply continueous background I/O could 100% make the system >>>>>>>>>>> stuck for >>>>>>>>>>> a long time in my system when a unbind for the controller driver >>>>>>>>>>> happens >>>>>>>>>>> simultaneously. See: >>>>>>>>>>> >>>>>>>>>>> dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 && >>>>>>>>>>> echo fe320000.dwmmc > >>>>>>>>>>> /sys/bus/platform/drivers/dwmmc_rockchip/unbind >>>>>>>>>>> >>>>>>>>>>> The reason is all pending requests wait for timeout one by one, but >>>>>>>>>>> never >>>>>>>>>>> propagates BLK_STS_IOERR in the first place when kicked from the >>>>>>>>>>> queue. >>>>>>>>>>> Set the card as removed immediately in mmc_remove_card() to solve >>>>>>>>>>> it. >>>>>>>>>>> >>>>>>>>>>> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> >>>>>>>>>>> --- >>>>>>>>>>> >>>>>>>>>>> drivers/mmc/core/bus.c | 1 + >>>>>>>>>>> 1 file changed, 1 insertion(+) >>>>>>>>>>> >>>>>>>>>>> diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c >>>>>>>>>>> index fc92c6c..c6c5dfe 100644 >>>>>>>>>>> --- a/drivers/mmc/core/bus.c >>>>>>>>>>> +++ b/drivers/mmc/core/bus.c >>>>>>>>>>> @@ -389,6 +389,7 @@ void mmc_remove_card(struct mmc_card *card) >>>>>>>>>>> pr_info("%s: card %04x removed\n", >>>>>>>>>>> mmc_hostname(card->host), card->rca); >>>>>>>>>>> } >>>>>>>>>>> + mmc_card_set_removed(card); >>>>>>>>>> >>>>>>>>>> Pedantically we should not call mmc_card_set_removed() if we have not >>>>>>>>>> claimed the host. Of course we can't claim the host because the >>>>>>>>>> block >>>>>>>>> >>>>>>>>> yep. >>>>>>>>> >>>>>>>>>> driver already has it, but I am not sure this is the right place >>>>>>>>>> to do >>>>>>>>>> this. >>>>>>>>>> My first question is how come the I/O times out if the card is >>>>>>>>>> still >>>>>>>>>> present i.e. you are only unbinding the host controller, so you >>>>>>>>>> should >>>>>>>>>> remove the card while the host controller and card are still >>>>>>>>>> functional? >>>>>>>>> >>>>>>>>> The card is still functional for my host(but maybe not if ->remove() >>>>>>>>> touchs the vmmc or vqmmc parts), but the host controller isn't, as the >>>>>>>>> ->remove() calls. >>>>>>>> >>>>>>>> I don't understand why the host controller isn't functional. I >>>>>>>> tried it >>>>>>> >>>>>>> I guess we have misunderstanding about what the "functional" means. What >>>>>>> I want to say is when unbinding the host driver, mmc_remove_host() is >>>>>>> called, so the target host isn't present from the view of core layer, >>>>>>> right? >>>>>>> >>>>>>>> with sdhci and it just waits for dd to finish but there are no I/O >>>>>>>> errors. >>>>>>>> >>>>>>>> Is there a use-case for unbinding quickly? >>>>>>> >>>>>>> Oh, I think I made a mistake in commit msg, the repro steps should be >>>>>>> (1) run dd in background: >>>>>>> dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 & >>>>>>> (2) unbind you driver, for instance: >>>>>>> echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind >>>>>>> >>>>>>> Then yes, there is no I/O errors and it waits for dd to finish. But >>>>>>> when waiting for dd to finish, console got stuck, and can't respond to >>>>>>> any ctrl+{c,d,z}, even sysrq. The time depends how much I/O it submits, >>>>>>> the above setp(1) make it stuck for 2 miniutes for my system. >>>>>> >>>>>> Job control on a console might not always work, especially with busybox. >>>>>> I guess if you run the unbind in the background, then there is not a >>>>>> problem. >>>>> >>>>> The dd case is just a quick reproduce but the real case is the >>>>> Android system has many vold(similar to udev) driven partition scan >>>>> tools/applications that fire I/Os once sd card mounted. The applications >>>>> is busy waiting there if we check background job by "top", which make >>>>> the user experience very bad in this way. >>>> >>>> Is this related to unbinding the host controller? >>>> >>>>> >>>>>> >>>>>> Given that unbinding the host controller while it is still in use is a >>>>>> bad >>>>>> thing, why do we care? >>>>> >>>>> I just checked the log and wonder why mmc_mq_queue_rq still allows >>>>> request to dispatch? Maybe it's a bad thing to unbind the controller >>>>> when it's used, but that's not forbade, at least it should behave >>>>> the same with real physically hot-plug. >>>> >>>> Should it? From memory I vaguely recall seeing EXT4 doing a sync before >>>> its >>>> partition gets yanked from beneath it. If that is right it would tend to >>>> indicate that there is not an expectation that I/O should fail immediately. >>> >>> I have little knowledge about EXT4 :),but that happend for cards with >>> different fs format. >>> >>>> >>>>> Another reason I do care about this is because I usually can't access to >>>>> my test devices, but only remotely do job on their console. So if I need >>>>> to reset/hot-plug the card, I usually do unbind/bind, but the partition >>>>> scan applications + unbind make the console stuck for a longer time >>>>> than expected which is very painful for me to wait for finishment. :( >>>> >>>> Why not do the unbind in the background and then kill the partition scan? >>> >>> Yes, I could :). But the conversation is deviating from core issue that >>> why the MMC_CARD_REMOVED flag is left apart for the card but the card >>> is removed. That being said, don't you think "mmc1: card 1234 removed" >>> means mmc_card_removed(card) should be true? >> >> No. It means the message is misleading for the unbind case. > > Hmm.. I don't think so. The unbind case means the controller/slot has > gone, so system can never access to the card any more. I admit it's > physcially present, but not logically. So it looks same to me that > I have a card in slot but the kernel lacks driver to support the card. No we always remove children (card) before the parent (host controller). > >> >> If you want a way to mark the card as removed, couldn't you add something to >> debugfs. > > It's irrelevant, as no in-kernel scenario check the debugfs. They only I am not sure that is true. For a removable card, setting the card as removed and scheduling a rescan would probably remove it. > care about status propagated from MMC block layer. If the card could > never be access any more, you shouldn't still allow requests pass > through the first check of mmc_card_removed in mmc_mq_queue_rq. Do you > agree? :) No, I think we have established that the card is fully functional. -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2018/4/13 15:55, Adrian Hunter wrote: > On 13/04/18 10:21, Shawn Lin wrote: > > No we always remove children (card) before the parent (host controller). > Ok, I agree with this. One thing now come to my mind is could we allow mmc_remove_host() to help remove children in advance or is it okay to add a new helper for host drivers to remove children in their ->remove()? -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 24/04/18 12:13, Shawn Lin wrote: > > On 2018/4/13 15:55, Adrian Hunter wrote: >> On 13/04/18 10:21, Shawn Lin wrote: > >> >> No we always remove children (card) before the parent (host controller). >> > > Ok, I agree with this. One thing now come to my mind is could we allow > mmc_remove_host() to help remove children in advance or is it okay to > add a new helper for host drivers to remove children in their > ->remove()? Lets' step back a bit. At what point does the remove end up waiting? Does it get to mmc_blk_remove()? -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2018/4/26 18:09, Adrian Hunter wrote: > On 24/04/18 12:13, Shawn Lin wrote: >> >> On 2018/4/13 15:55, Adrian Hunter wrote: >>> On 13/04/18 10:21, Shawn Lin wrote: >> >>> >>> No we always remove children (card) before the parent (host controller). >>> >> >> Ok, I agree with this. One thing now come to my mind is could we allow >> mmc_remove_host() to help remove children in advance or is it okay to >> add a new helper for host drivers to remove children in their >> ->remove()? > > Lets' step back a bit. At what point does the remove end up waiting? Does > it get to mmc_blk_remove()? I guess the remove might get to mmc_blk_remove for sd/(e)MMC cards. > > > > > -- To unsubscribe from this list: send the line "unsubscribe linux-mmc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c index fc92c6c..c6c5dfe 100644 --- a/drivers/mmc/core/bus.c +++ b/drivers/mmc/core/bus.c @@ -389,6 +389,7 @@ void mmc_remove_card(struct mmc_card *card) pr_info("%s: card %04x removed\n", mmc_hostname(card->host), card->rca); } + mmc_card_set_removed(card); device_del(&card->dev); of_node_put(card->dev.of_node); }
A simply continueous background I/O could 100% make the system stuck for a long time in my system when a unbind for the controller driver happens simultaneously. See: dd if=/dev/mmcblk0 of=/dev/null bs=512k count=100000 && echo fe320000.dwmmc > /sys/bus/platform/drivers/dwmmc_rockchip/unbind The reason is all pending requests wait for timeout one by one, but never propagates BLK_STS_IOERR in the first place when kicked from the queue. Set the card as removed immediately in mmc_remove_card() to solve it. Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> --- drivers/mmc/core/bus.c | 1 + 1 file changed, 1 insertion(+)