Message ID | 46dc0cbdcb8a414d70b7807fceb1cca6229408d5.1561055076.git.alifm@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC,v1,1/1] vfio-ccw: Don't call cp_free if we are processing a channel program | expand |
On 6/20/19 3:40 PM, Farhan Ali wrote: > There is a small window where it's possible that an interrupt can > arrive and can call cp_free, while we are still processing a channel > program (i.e allocating memory, pinnging pages, translating s/pinnging/pinning/ > addresses etc). This can lead to allocating and freeing at the same > time and can cause memory corruption. > > Let's not call cp_free if we are currently processing a channel program. The check around this cp_free() call is for a solicited interrupt, so it's presumably in response to a SSCH we issued. But if we're still processing a CP, then we hadn't issued the SSCH to the hardware yet. So what is this interrupt for? Do the contents of irb.cpa provide any clues, perhaps if it's in the current cp or for someone else? > > Signed-off-by: Farhan Ali <alifm@linux.ibm.com> > --- > > I have been running my test overnight with this patch and I haven't > seen the stack traces that I mentioned about earlier. I would like > to get some reviews on this and also if this is the right thing to > do? > > Thanks > Farhan > > drivers/s390/cio/vfio_ccw_drv.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c > index 66a66ac..61ece3f 100644 > --- a/drivers/s390/cio/vfio_ccw_drv.c > +++ b/drivers/s390/cio/vfio_ccw_drv.c > @@ -88,7 +88,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work) > (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT)); > if (scsw_is_solicited(&irb->scsw)) { > cp_update_scsw(&private->cp, &irb->scsw); As I alluded earlier, do we know this irb is for this cp? If no, what does this function end up putting in the scsw? > - if (is_final) > + if (is_final && private->state != VFIO_CCW_STATE_CP_PROCESSING) In looking at how we set this state, and how we exit it, I see we do: if SSCH got CC0, CP_PROCESSING -> CP_PENDING if SSCH got !CC0, CP_PROCESSING -> IDLE While the first scenario happens immediately after the SSCH instruction, I guess it could be just tiny enough, like the io_trigger FSM patch I sent a few weeks ago. Meanwhile, the latter happens way after we return from the jump table. So that scenario leaves considerable time for such an interrupt to occur, though I don't understand why it would if we got a CC(1-3) on the SSCH. And anyway, the return from fsm_io_helper() in that case will also call cp_free(). So why does the cp->initialized check provide protection from a double-free in that direction, but not here? I'm confused. > cp_free(&private->cp); > } > mutex_lock(&private->io_mutex); >
On Thu, 20 Jun 2019 17:07:09 -0400 Farhan Ali <alifm@linux.ibm.com> wrote: > There is a small window where it's possible that an interrupt can > arrive and can call cp_free, while we are still processing a channel > program (i.e allocating memory, pinnging pages, translating > addresses etc). This can lead to allocating and freeing at the same > time and can cause memory corruption. > > Let's not call cp_free if we are currently processing a channel program. > > Signed-off-by: Farhan Ali <alifm@linux.ibm.com> > --- > > I have been running my test overnight with this patch and I haven't > seen the stack traces that I mentioned about earlier. I would like > to get some reviews on this and also if this is the right thing to > do? > > Thanks > Farhan > > drivers/s390/cio/vfio_ccw_drv.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c > index 66a66ac..61ece3f 100644 > --- a/drivers/s390/cio/vfio_ccw_drv.c > +++ b/drivers/s390/cio/vfio_ccw_drv.c > @@ -88,7 +88,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work) > (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT)); > if (scsw_is_solicited(&irb->scsw)) { > cp_update_scsw(&private->cp, &irb->scsw); > - if (is_final) > + if (is_final && private->state != VFIO_CCW_STATE_CP_PROCESSING) How is access to private->state correctly synchronized? And don't we expect private->state == VFIO_CCW_STATE_CP_PENDING in case the cp was submitted successfully with a ssch() and is done now (one way or the other)? Does this have something to do with 71189f2 "vfio-ccw: make it safe to access channel programs" (Cornelia Huck, 2019-01-21)? Regards, Halil > cp_free(&private->cp); > } > mutex_lock(&private->io_mutex);
On 06/20/2019 04:27 PM, Eric Farman wrote: > > > On 6/20/19 3:40 PM, Farhan Ali wrote: >> There is a small window where it's possible that an interrupt can >> arrive and can call cp_free, while we are still processing a channel >> program (i.e allocating memory, pinnging pages, translating > > s/pinnging/pinning/ > >> addresses etc). This can lead to allocating and freeing at the same >> time and can cause memory corruption. >> >> Let's not call cp_free if we are currently processing a channel program. > > The check around this cp_free() call is for a solicited interrupt, so > it's presumably in response to a SSCH we issued. But if we're still > processing a CP, then we hadn't issued the SSCH to the hardware yet. So > what is this interrupt for? Do the contents of irb.cpa provide any > clues, perhaps if it's in the current cp or for someone else? > I don't think the interrupt is in response to an ssch but rather due to an csch/hsch. >> >> Signed-off-by: Farhan Ali <alifm@linux.ibm.com> >> --- >> >> I have been running my test overnight with this patch and I haven't >> seen the stack traces that I mentioned about earlier. I would like >> to get some reviews on this and also if this is the right thing to >> do? >> >> Thanks >> Farhan >> >> drivers/s390/cio/vfio_ccw_drv.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c >> index 66a66ac..61ece3f 100644 >> --- a/drivers/s390/cio/vfio_ccw_drv.c >> +++ b/drivers/s390/cio/vfio_ccw_drv.c >> @@ -88,7 +88,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work) >> (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT)); >> if (scsw_is_solicited(&irb->scsw)) { >> cp_update_scsw(&private->cp, &irb->scsw); > > As I alluded earlier, do we know this irb is for this cp? If no, what > does this function end up putting in the scsw? > >> - if (is_final) >> + if (is_final && private->state != VFIO_CCW_STATE_CP_PROCESSING) > > In looking at how we set this state, and how we exit it, I see we do: > > if SSCH got CC0, CP_PROCESSING -> CP_PENDING > if SSCH got !CC0, CP_PROCESSING -> IDLE > > While the first scenario happens immediately after the SSCH instruction, > I guess it could be just tiny enough, like the io_trigger FSM patch I > sent a few weeks ago. > > Meanwhile, the latter happens way after we return from the jump table. > So that scenario leaves considerable time for such an interrupt to > occur, though I don't understand why it would if we got a CC(1-3) on the > SSCH. > > And anyway, the return from fsm_io_helper() in that case will also call > cp_free(). So why does the cp->initialized check provide protection > from a double-free in that direction, but not here? I'm confused. I have a theory where I think it's possible to have 2 different threads executing cp_free If we start with private->state == IDLE and the guest issues a clear/halt and then an ssch - clear/halt will be issued to hardware, and if succeeds we will return cc=0 to guest - the guest can then issue ssch - we get an interrupt for csch/hsch and we queue the interrupt in the workqueue - we start processing the ssch and then at the same time another cpu could be working on the interrupt Thread 1 Thread 2 -------- -------- fsm_io_request vfio_ccw_sch_io_todo cp_init cp_free cp_prefetch fsm_io_helper cp_free The test that I am trying is with a guest running an fio workload, while at the same time stressing the error recovery path in the guest. So there is a lot of ssch and lot of csch. Of course I don't think my patch completely solves the problem, I think it just makes the window narrower. I just wanted to get a discussion started :) Now that I am thinking more about it, I think we might have to protect cp with it's own mutex. Thanks Farhan > >> cp_free(&private->cp); >> } >> mutex_lock(&private->io_mutex); >> >
Hey Halil, On 06/21/2019 10:00 AM, Halil Pasic wrote: > On Thu, 20 Jun 2019 17:07:09 -0400 > Farhan Ali <alifm@linux.ibm.com> wrote: > >> There is a small window where it's possible that an interrupt can >> arrive and can call cp_free, while we are still processing a channel >> program (i.e allocating memory, pinnging pages, translating >> addresses etc). This can lead to allocating and freeing at the same >> time and can cause memory corruption. >> >> Let's not call cp_free if we are currently processing a channel program. >> >> Signed-off-by: Farhan Ali <alifm@linux.ibm.com> >> --- >> >> I have been running my test overnight with this patch and I haven't >> seen the stack traces that I mentioned about earlier. I would like >> to get some reviews on this and also if this is the right thing to >> do? >> >> Thanks >> Farhan >> >> drivers/s390/cio/vfio_ccw_drv.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c >> index 66a66ac..61ece3f 100644 >> --- a/drivers/s390/cio/vfio_ccw_drv.c >> +++ b/drivers/s390/cio/vfio_ccw_drv.c >> @@ -88,7 +88,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work) >> (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT)); >> if (scsw_is_solicited(&irb->scsw)) { >> cp_update_scsw(&private->cp, &irb->scsw); >> - if (is_final) >> + if (is_final && private->state != VFIO_CCW_STATE_CP_PROCESSING) > > How is access to private->state correctly synchronized? And don't we > expect private->state == VFIO_CCW_STATE_CP_PENDING in case the cp was > submitted successfully with a ssch() and is done now (one way or the > other)? I think the interrupt that we are processing could be for a csch/hsch and not an ssch. So we could have one thread handling an ssch and another thread handling a csch/hsch interrupt. > > Does this have something to do with 71189f2 "vfio-ccw: make it safe to > access channel programs" (Cornelia Huck, 2019-01-21)? It's related. Though that patch handles freeing cp when we have successfully issued a ssch and then issue a csch/hsch. But it leaves a tiny window where if the reverse happens where we get an csch/hsch first, and then ssch there could be a scenario where we can end up calling cp_free in 2 different threads. I have responded to Eric's email explaining further, so if you could kindly refer to that and continue the discussion there because I think he had similar questions and concern as you :) Thanks Farhan > > Regards, > Halil > >> cp_free(&private->cp); >> } >> mutex_lock(&private->io_mutex); > >
On 6/21/19 10:17 AM, Farhan Ali wrote: > > > On 06/20/2019 04:27 PM, Eric Farman wrote: >> >> >> On 6/20/19 3:40 PM, Farhan Ali wrote: >>> There is a small window where it's possible that an interrupt can >>> arrive and can call cp_free, while we are still processing a channel >>> program (i.e allocating memory, pinnging pages, translating >> >> s/pinnging/pinning/ >> >>> addresses etc). This can lead to allocating and freeing at the same >>> time and can cause memory corruption. >>> >>> Let's not call cp_free if we are currently processing a channel program. >> >> The check around this cp_free() call is for a solicited interrupt, so >> it's presumably in response to a SSCH we issued. But if we're still >> processing a CP, then we hadn't issued the SSCH to the hardware yet. So >> what is this interrupt for? Do the contents of irb.cpa provide any >> clues, perhaps if it's in the current cp or for someone else? >> > > I don't think the interrupt is in response to an ssch but rather due to > an csch/hsch. > >>> >>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com> >>> --- >>> >>> I have been running my test overnight with this patch and I haven't >>> seen the stack traces that I mentioned about earlier. I would like >>> to get some reviews on this and also if this is the right thing to >>> do? >>> >>> Thanks >>> Farhan >>> >>> drivers/s390/cio/vfio_ccw_drv.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/drivers/s390/cio/vfio_ccw_drv.c >>> b/drivers/s390/cio/vfio_ccw_drv.c >>> index 66a66ac..61ece3f 100644 >>> --- a/drivers/s390/cio/vfio_ccw_drv.c >>> +++ b/drivers/s390/cio/vfio_ccw_drv.c >>> @@ -88,7 +88,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct >>> *work) >>> (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT)); >>> if (scsw_is_solicited(&irb->scsw)) { >>> cp_update_scsw(&private->cp, &irb->scsw); >> >> As I alluded earlier, do we know this irb is for this cp? If no, what >> does this function end up putting in the scsw? >> >>> - if (is_final) >>> + if (is_final && private->state != VFIO_CCW_STATE_CP_PROCESSING) >> >> In looking at how we set this state, and how we exit it, I see we do: >> >> if SSCH got CC0, CP_PROCESSING -> CP_PENDING >> if SSCH got !CC0, CP_PROCESSING -> IDLE >> >> While the first scenario happens immediately after the SSCH instruction, >> I guess it could be just tiny enough, like the io_trigger FSM patch I >> sent a few weeks ago. >> >> Meanwhile, the latter happens way after we return from the jump table. >> So that scenario leaves considerable time for such an interrupt to >> occur, though I don't understand why it would if we got a CC(1-3) on the >> SSCH. >> >> And anyway, the return from fsm_io_helper() in that case will also call >> cp_free(). So why does the cp->initialized check provide protection >> from a double-free in that direction, but not here? I'm confused. > > I have a theory where I think it's possible to have 2 different threads > executing cp_free > > If we start with private->state == IDLE and the guest issues a > clear/halt and then an ssch > > - clear/halt will be issued to hardware, and if succeeds we will return > cc=0 to guest > > - the guest can then issue ssch It can issue whatever it wants, but shouldn't the SSCH get a CC2 until the halt/clear pending state is cleared? Hrm, fsm_io_request() doesn't look. Rather, it (fsm_io_helper()) relies on the CC2 to be signalled from the SSCH issued to the device. There's a lot of stuff that happens before we get to that point. I'm wondering if there's a way we could/should return the SSCH here before we do any processing. After all, until the interrupt on the workqueue is processed, we are busy. > > - we get an interrupt for csch/hsch and we queue the interrupt in the > workqueue > > - we start processing the ssch and then at the same time another cpu > could be working on the > interrupt> > > Thread 1 Thread 2 > -------- -------- > > fsm_io_request vfio_ccw_sch_io_todo > cp_init cp_free > cp_prefetch > fsm_io_helper > cp_free > > > > The test that I am trying is with a guest running an fio workload, while > at the same time stressing the error recovery path in the guest. So > there is a lot of ssch and lot of csch. > > Of course I don't think my patch completely solves the problem, I think > it just makes the window narrower. I just wanted to get a discussion > started :) > > > Now that I am thinking more about it, I think we might have to protect > cp with it's own mutex. That seems like a big hammer, and I wonder if the existing SCHIB/FSM/CP state data doesn't provide that information to us. But I gotta wander around some code before I say. > > Thanks > Farhan > > >> >>> cp_free(&private->cp); >>> } >>> mutex_lock(&private->io_mutex); >>> >> >
On 06/21/2019 01:40 PM, Eric Farman wrote: > > > On 6/21/19 10:17 AM, Farhan Ali wrote: >> >> >> On 06/20/2019 04:27 PM, Eric Farman wrote: >>> >>> >>> On 6/20/19 3:40 PM, Farhan Ali wrote: >>>> There is a small window where it's possible that an interrupt can >>>> arrive and can call cp_free, while we are still processing a channel >>>> program (i.e allocating memory, pinnging pages, translating >>> >>> s/pinnging/pinning/ >>> >>>> addresses etc). This can lead to allocating and freeing at the same >>>> time and can cause memory corruption. >>>> >>>> Let's not call cp_free if we are currently processing a channel program. >>> >>> The check around this cp_free() call is for a solicited interrupt, so >>> it's presumably in response to a SSCH we issued. But if we're still >>> processing a CP, then we hadn't issued the SSCH to the hardware yet. So >>> what is this interrupt for? Do the contents of irb.cpa provide any >>> clues, perhaps if it's in the current cp or for someone else? >>> >> >> I don't think the interrupt is in response to an ssch but rather due to >> an csch/hsch. >> >>>> >>>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com> >>>> --- >>>> >>>> I have been running my test overnight with this patch and I haven't >>>> seen the stack traces that I mentioned about earlier. I would like >>>> to get some reviews on this and also if this is the right thing to >>>> do? >>>> >>>> Thanks >>>> Farhan >>>> >>>> drivers/s390/cio/vfio_ccw_drv.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/drivers/s390/cio/vfio_ccw_drv.c >>>> b/drivers/s390/cio/vfio_ccw_drv.c >>>> index 66a66ac..61ece3f 100644 >>>> --- a/drivers/s390/cio/vfio_ccw_drv.c >>>> +++ b/drivers/s390/cio/vfio_ccw_drv.c >>>> @@ -88,7 +88,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct >>>> *work) >>>> (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT)); >>>> if (scsw_is_solicited(&irb->scsw)) { >>>> cp_update_scsw(&private->cp, &irb->scsw); >>> >>> As I alluded earlier, do we know this irb is for this cp? If no, what >>> does this function end up putting in the scsw? >>> >>>> - if (is_final) >>>> + if (is_final && private->state != VFIO_CCW_STATE_CP_PROCESSING) >>> >>> In looking at how we set this state, and how we exit it, I see we do: >>> >>> if SSCH got CC0, CP_PROCESSING -> CP_PENDING >>> if SSCH got !CC0, CP_PROCESSING -> IDLE >>> >>> While the first scenario happens immediately after the SSCH instruction, >>> I guess it could be just tiny enough, like the io_trigger FSM patch I >>> sent a few weeks ago. >>> >>> Meanwhile, the latter happens way after we return from the jump table. >>> So that scenario leaves considerable time for such an interrupt to >>> occur, though I don't understand why it would if we got a CC(1-3) on the >>> SSCH. >>> >>> And anyway, the return from fsm_io_helper() in that case will also call >>> cp_free(). So why does the cp->initialized check provide protection >>> from a double-free in that direction, but not here? I'm confused. >> >> I have a theory where I think it's possible to have 2 different threads >> executing cp_free >> >> If we start with private->state == IDLE and the guest issues a >> clear/halt and then an ssch >> >> - clear/halt will be issued to hardware, and if succeeds we will return >> cc=0 to guest >> >> - the guest can then issue ssch > > It can issue whatever it wants, but shouldn't the SSCH get a CC2 until > the halt/clear pending state is cleared? Hrm, fsm_io_request() doesn't > look. Rather, it (fsm_io_helper()) relies on the CC2 to be signalled > from the SSCH issued to the device. There's a lot of stuff that happens > before we get to that point. Yes, and stuff that happens is memory allocation, pinning and other things which can take time. > > I'm wondering if there's a way we could/should return the SSCH here > before we do any processing. After all, until the interrupt on the > workqueue is processed, we are busy. > you mean return the csch/hsch before processing the ssch? Maybe we could extend CP_PENDING state, to just PENDING and use that to reject any ssch if we have a pending csch/hsch? >> >> - we get an interrupt for csch/hsch and we queue the interrupt in the >> workqueue >> >> - we start processing the ssch and then at the same time another cpu >> could be working on the >> interrupt> >> >> Thread 1 Thread 2 >> -------- -------- >> >> fsm_io_request vfio_ccw_sch_io_todo >> cp_init cp_free >> cp_prefetch >> fsm_io_helper >> cp_free >> >> >> >> The test that I am trying is with a guest running an fio workload, while >> at the same time stressing the error recovery path in the guest. So >> there is a lot of ssch and lot of csch. >> >> Of course I don't think my patch completely solves the problem, I think >> it just makes the window narrower. I just wanted to get a discussion >> started :) >> >> >> Now that I am thinking more about it, I think we might have to protect >> cp with it's own mutex. > > That seems like a big hammer, and I wonder if the existing SCHIB/FSM/CP > state data doesn't provide that information to us. But I gotta wander > around some code before I say. Any ideas are welcome :) > >> >> Thanks >> Farhan >> >> >>> >>>> cp_free(&private->cp); >>>> } >>>> mutex_lock(&private->io_mutex); >>>> >>> >> >
On Fri, 21 Jun 2019 14:34:10 -0400 Farhan Ali <alifm@linux.ibm.com> wrote: > On 06/21/2019 01:40 PM, Eric Farman wrote: > > > > > > On 6/21/19 10:17 AM, Farhan Ali wrote: > >> > >> > >> On 06/20/2019 04:27 PM, Eric Farman wrote: > >>> > >>> > >>> On 6/20/19 3:40 PM, Farhan Ali wrote: > >>>> There is a small window where it's possible that an interrupt can > >>>> arrive and can call cp_free, while we are still processing a channel > >>>> program (i.e allocating memory, pinnging pages, translating > >>> > >>> s/pinnging/pinning/ > >>> > >>>> addresses etc). This can lead to allocating and freeing at the same > >>>> time and can cause memory corruption. > >>>> > >>>> Let's not call cp_free if we are currently processing a channel program. > >>> > >>> The check around this cp_free() call is for a solicited interrupt, so > >>> it's presumably in response to a SSCH we issued. But if we're still > >>> processing a CP, then we hadn't issued the SSCH to the hardware yet. So > >>> what is this interrupt for? Do the contents of irb.cpa provide any > >>> clues, perhaps if it's in the current cp or for someone else? > >>> > >> > >> I don't think the interrupt is in response to an ssch but rather due to > >> an csch/hsch. The solicited check only checks if it is solicited. It can be for any channel I/O instruction that causes an interrupt... we probably should adapt the check. > >> > >>>> > >>>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com> > >>>> --- > >>>> > >>>> I have been running my test overnight with this patch and I haven't > >>>> seen the stack traces that I mentioned about earlier. I would like > >>>> to get some reviews on this and also if this is the right thing to > >>>> do? > >>>> > >>>> Thanks > >>>> Farhan > >>>> > >>>> drivers/s390/cio/vfio_ccw_drv.c | 2 +- > >>>> 1 file changed, 1 insertion(+), 1 deletion(-) > >>>> > >>>> diff --git a/drivers/s390/cio/vfio_ccw_drv.c > >>>> b/drivers/s390/cio/vfio_ccw_drv.c > >>>> index 66a66ac..61ece3f 100644 > >>>> --- a/drivers/s390/cio/vfio_ccw_drv.c > >>>> +++ b/drivers/s390/cio/vfio_ccw_drv.c > >>>> @@ -88,7 +88,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct > >>>> *work) > >>>> (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT)); > >>>> if (scsw_is_solicited(&irb->scsw)) { > >>>> cp_update_scsw(&private->cp, &irb->scsw); > >>> > >>> As I alluded earlier, do we know this irb is for this cp? If no, what > >>> does this function end up putting in the scsw? Yes, I think this also needs to check whether we have at least a prior start function around. (We use the orb provided by the guest; maybe we should check if that intparm is set in the irb?) > >>> > >>>> - if (is_final) > >>>> + if (is_final && private->state != VFIO_CCW_STATE_CP_PROCESSING) > >>> > >>> In looking at how we set this state, and how we exit it, I see we do: > >>> > >>> if SSCH got CC0, CP_PROCESSING -> CP_PENDING > >>> if SSCH got !CC0, CP_PROCESSING -> IDLE > >>> > >>> While the first scenario happens immediately after the SSCH instruction, > >>> I guess it could be just tiny enough, like the io_trigger FSM patch I > >>> sent a few weeks ago. > >>> > >>> Meanwhile, the latter happens way after we return from the jump table. > >>> So that scenario leaves considerable time for such an interrupt to > >>> occur, though I don't understand why it would if we got a CC(1-3) on the > >>> SSCH. > >>> > >>> And anyway, the return from fsm_io_helper() in that case will also call > >>> cp_free(). So why does the cp->initialized check provide protection > >>> from a double-free in that direction, but not here? I'm confused. > >> > >> I have a theory where I think it's possible to have 2 different threads > >> executing cp_free > >> > >> If we start with private->state == IDLE and the guest issues a > >> clear/halt and then an ssch > >> > >> - clear/halt will be issued to hardware, and if succeeds we will return > >> cc=0 to guest > >> > >> - the guest can then issue ssch > > > > It can issue whatever it wants, but shouldn't the SSCH get a CC2 until > > the halt/clear pending state is cleared? Hrm, fsm_io_request() doesn't > > look. Rather, it (fsm_io_helper()) relies on the CC2 to be signalled > > from the SSCH issued to the device. There's a lot of stuff that happens > > before we get to that point. > > Yes, and stuff that happens is memory allocation, pinning and other > things which can take time. > > > > > I'm wondering if there's a way we could/should return the SSCH here > > before we do any processing. After all, until the interrupt on the > > workqueue is processed, we are busy. > > > > you mean return the csch/hsch before processing the ssch? Maybe we could > extend CP_PENDING state, to just PENDING and use that to reject any ssch > if we have a pending csch/hsch? I'd probably not rely on the state for this. Maybe we could look at the work queue? But it might be that the check for the intparm I suggested above is already enough. > > >> > >> - we get an interrupt for csch/hsch and we queue the interrupt in the > >> workqueue > >> > >> - we start processing the ssch and then at the same time another cpu > >> could be working on the > >> interrupt> > >> > >> Thread 1 Thread 2 > >> -------- -------- > >> > >> fsm_io_request vfio_ccw_sch_io_todo > >> cp_init cp_free > >> cp_prefetch > >> fsm_io_helper > >> cp_free > >> > >> > >> > >> The test that I am trying is with a guest running an fio workload, while > >> at the same time stressing the error recovery path in the guest. So > >> there is a lot of ssch and lot of csch. > >> > >> Of course I don't think my patch completely solves the problem, I think > >> it just makes the window narrower. I just wanted to get a discussion > >> started :) > >> > >> > >> Now that I am thinking more about it, I think we might have to protect > >> cp with it's own mutex. > > > > That seems like a big hammer, and I wonder if the existing SCHIB/FSM/CP > > state data doesn't provide that information to us. But I gotta wander > > around some code before I say. > > Any ideas are welcome :) See above :) That certainly looks like a much smaller hammer. > > > > >> > >> Thanks > >> Farhan > >> > >> > >>> > >>>> cp_free(&private->cp); > >>>> } > >>>> mutex_lock(&private->io_mutex); > >>>> > >>> > >> > >
On Mon, 24 Jun 2019 11:42:31 +0200 Cornelia Huck <cohuck@redhat.com> wrote: > On Fri, 21 Jun 2019 14:34:10 -0400 > Farhan Ali <alifm@linux.ibm.com> wrote: > > > On 06/21/2019 01:40 PM, Eric Farman wrote: > > > > > > > > > On 6/21/19 10:17 AM, Farhan Ali wrote: > > >> > > >> > > >> On 06/20/2019 04:27 PM, Eric Farman wrote: > > >>> > > >>> > > >>> On 6/20/19 3:40 PM, Farhan Ali wrote: > > >>>> There is a small window where it's possible that an interrupt can > > >>>> arrive and can call cp_free, while we are still processing a channel > > >>>> program (i.e allocating memory, pinnging pages, translating > > >>> > > >>> s/pinnging/pinning/ > > >>> > > >>>> addresses etc). This can lead to allocating and freeing at the same > > >>>> time and can cause memory corruption. > > >>>> > > >>>> Let's not call cp_free if we are currently processing a channel program. > > >>> > > >>> The check around this cp_free() call is for a solicited interrupt, so > > >>> it's presumably in response to a SSCH we issued. But if we're still > > >>> processing a CP, then we hadn't issued the SSCH to the hardware yet. So > > >>> what is this interrupt for? Do the contents of irb.cpa provide any > > >>> clues, perhaps if it's in the current cp or for someone else? > > >>> > > >> > > >> I don't think the interrupt is in response to an ssch but rather due to > > >> an csch/hsch. > > The solicited check only checks if it is solicited. It can be for any > channel I/O instruction that causes an interrupt... we probably should > adapt the check. > > > >> > > >>>> > > >>>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com> > > >>>> --- > > >>>> > > >>>> I have been running my test overnight with this patch and I haven't > > >>>> seen the stack traces that I mentioned about earlier. I would like > > >>>> to get some reviews on this and also if this is the right thing to > > >>>> do? > > >>>> > > >>>> Thanks > > >>>> Farhan > > >>>> > > >>>> drivers/s390/cio/vfio_ccw_drv.c | 2 +- > > >>>> 1 file changed, 1 insertion(+), 1 deletion(-) > > >>>> > > >>>> diff --git a/drivers/s390/cio/vfio_ccw_drv.c > > >>>> b/drivers/s390/cio/vfio_ccw_drv.c > > >>>> index 66a66ac..61ece3f 100644 > > >>>> --- a/drivers/s390/cio/vfio_ccw_drv.c > > >>>> +++ b/drivers/s390/cio/vfio_ccw_drv.c > > >>>> @@ -88,7 +88,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct > > >>>> *work) > > >>>> (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT)); > > >>>> if (scsw_is_solicited(&irb->scsw)) { > > >>>> cp_update_scsw(&private->cp, &irb->scsw); > > >>> > > >>> As I alluded earlier, do we know this irb is for this cp? If no, what > > >>> does this function end up putting in the scsw? > > Yes, I think this also needs to check whether we have at least a prior > start function around. (We use the orb provided by the guest; maybe we > should check if that intparm is set in the irb?) Hrm; not so easy as we always set the intparm to the address of the subchannel structure... Maybe check if we have have one of the conditions of the large table 16-6 and correlate to the ccw address? Or is it enough to check the function control? (Don't remember when the hardware resets it.) > > > >>> > > >>>> - if (is_final) > > >>>> + if (is_final && private->state != VFIO_CCW_STATE_CP_PROCESSING) > > >>> > > >>> In looking at how we set this state, and how we exit it, I see we do: > > >>> > > >>> if SSCH got CC0, CP_PROCESSING -> CP_PENDING > > >>> if SSCH got !CC0, CP_PROCESSING -> IDLE > > >>> > > >>> While the first scenario happens immediately after the SSCH instruction, > > >>> I guess it could be just tiny enough, like the io_trigger FSM patch I > > >>> sent a few weeks ago. > > >>> > > >>> Meanwhile, the latter happens way after we return from the jump table. > > >>> So that scenario leaves considerable time for such an interrupt to > > >>> occur, though I don't understand why it would if we got a CC(1-3) on the > > >>> SSCH. > > >>> > > >>> And anyway, the return from fsm_io_helper() in that case will also call > > >>> cp_free(). So why does the cp->initialized check provide protection > > >>> from a double-free in that direction, but not here? I'm confused. > > >> > > >> I have a theory where I think it's possible to have 2 different threads > > >> executing cp_free > > >> > > >> If we start with private->state == IDLE and the guest issues a > > >> clear/halt and then an ssch > > >> > > >> - clear/halt will be issued to hardware, and if succeeds we will return > > >> cc=0 to guest > > >> > > >> - the guest can then issue ssch > > > > > > It can issue whatever it wants, but shouldn't the SSCH get a CC2 until > > > the halt/clear pending state is cleared? Hrm, fsm_io_request() doesn't > > > look. Rather, it (fsm_io_helper()) relies on the CC2 to be signalled > > > from the SSCH issued to the device. There's a lot of stuff that happens > > > before we get to that point. > > > > Yes, and stuff that happens is memory allocation, pinning and other > > things which can take time. > > > > > > > > I'm wondering if there's a way we could/should return the SSCH here > > > before we do any processing. After all, until the interrupt on the > > > workqueue is processed, we are busy. > > > > > > > you mean return the csch/hsch before processing the ssch? Maybe we could > > extend CP_PENDING state, to just PENDING and use that to reject any ssch > > if we have a pending csch/hsch? > > I'd probably not rely on the state for this. Maybe we could look at the > work queue? But it might be that the check for the intparm I suggested > above is already enough. > > > > > >> > > >> - we get an interrupt for csch/hsch and we queue the interrupt in the > > >> workqueue > > >> > > >> - we start processing the ssch and then at the same time another cpu > > >> could be working on the > > >> interrupt> > > >> > > >> Thread 1 Thread 2 > > >> -------- -------- > > >> > > >> fsm_io_request vfio_ccw_sch_io_todo > > >> cp_init cp_free > > >> cp_prefetch > > >> fsm_io_helper > > >> cp_free > > >> > > >> > > >> > > >> The test that I am trying is with a guest running an fio workload, while > > >> at the same time stressing the error recovery path in the guest. So > > >> there is a lot of ssch and lot of csch. > > >> > > >> Of course I don't think my patch completely solves the problem, I think > > >> it just makes the window narrower. I just wanted to get a discussion > > >> started :) > > >> > > >> > > >> Now that I am thinking more about it, I think we might have to protect > > >> cp with it's own mutex. > > > > > > That seems like a big hammer, and I wonder if the existing SCHIB/FSM/CP > > > state data doesn't provide that information to us. But I gotta wander > > > around some code before I say. > > > > Any ideas are welcome :) > > See above :) That certainly looks like a much smaller hammer. > > > > > > > > >> > > >> Thanks > > >> Farhan > > >> > > >> > > >>> > > >>>> cp_free(&private->cp); > > >>>> } > > >>>> mutex_lock(&private->io_mutex); > > >>>> > > >>> > > >> > > > >
On Mon, 24 Jun 2019 11:42:31 +0200 Cornelia Huck <cohuck@redhat.com> wrote: > > > It can issue whatever it wants, but shouldn't the SSCH get a CC2 until > > > the halt/clear pending state is cleared? Hrm, fsm_io_request() doesn't > > > look. Rather, it (fsm_io_helper()) relies on the CC2 to be signalled > > > from the SSCH issued to the device. There's a lot of stuff that happens > > > before we get to that point. Yes CC2 would be the correct thing to do in this situation. Doesn't QEMU do some sort of logic of this kind already. AFAIR QEMU should reject the SSCH because it knows that the halt/clear function is in progress or pending. Or am I worng? Even if QEMU does it, the kernel must not rely on QEMU or whatever userspace doing it correctly. What I'm trying to say, if QEMU can do it vfio_ccw should do it as well. I'm almost always in favor of sticking close to what PoP says. > > > > Yes, and stuff that happens is memory allocation, pinning and other > > things which can take time. > > > > > > > > I'm wondering if there's a way we could/should return the SSCH here > > > before we do any processing. After all, until the interrupt on the > > > workqueue is processed, we are busy. > > > > > > > you mean return the csch/hsch before processing the ssch? Maybe we could > > extend CP_PENDING state, to just PENDING and use that to reject any ssch > > if we have a pending csch/hsch? > > I'd probably not rely on the state for this. Maybe we could look at the > work queue? But it might be that the check for the intparm I suggested > above is already enough. PoP says function control bits are what matter here: """ Condition code 2 is set, and no other action is taken, when a start, halt, or clear function is currently in progress at the subchannel (see “Function Control (FC)” on page 13). """ Regards, Halil
On Mon, 24 Jun 2019 12:05:14 +0200 Cornelia Huck <cohuck@redhat.com> wrote: > On Mon, 24 Jun 2019 11:42:31 +0200 > Cornelia Huck <cohuck@redhat.com> wrote: > > > On Fri, 21 Jun 2019 14:34:10 -0400 > > Farhan Ali <alifm@linux.ibm.com> wrote: > > > > > On 06/21/2019 01:40 PM, Eric Farman wrote: > > > > > > > > > > > > On 6/21/19 10:17 AM, Farhan Ali wrote: > > > >> > > > >> > > > >> On 06/20/2019 04:27 PM, Eric Farman wrote: > > > >>> > > > >>> > > > >>> On 6/20/19 3:40 PM, Farhan Ali wrote: > > > >>>> diff --git a/drivers/s390/cio/vfio_ccw_drv.c > > > >>>> b/drivers/s390/cio/vfio_ccw_drv.c > > > >>>> index 66a66ac..61ece3f 100644 > > > >>>> --- a/drivers/s390/cio/vfio_ccw_drv.c > > > >>>> +++ b/drivers/s390/cio/vfio_ccw_drv.c > > > >>>> @@ -88,7 +88,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct > > > >>>> *work) > > > >>>> (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT)); > > > >>>> if (scsw_is_solicited(&irb->scsw)) { > > > >>>> cp_update_scsw(&private->cp, &irb->scsw); > > > >>> > > > >>> As I alluded earlier, do we know this irb is for this cp? If no, what > > > >>> does this function end up putting in the scsw? > > > > Yes, I think this also needs to check whether we have at least a prior > > start function around. (We use the orb provided by the guest; maybe we > > should check if that intparm is set in the irb?) > > Hrm; not so easy as we always set the intparm to the address of the > subchannel structure... > > Maybe check if we have have one of the conditions of the large table > 16-6 and correlate to the ccw address? Or is it enough to check the > function control? (Don't remember when the hardware resets it.) Nope, we cannot look at the function control, as csch clears any set start function bit :( (see "Function Control", pg 16-13) I think this problem mostly boils down to "csch clears pending status; therefore, we may only get one interrupt, even though there had been a start function going on". If we only go with what the hardware gives us, I don't see a way to distinguish "clear with a prior start" from "clear only". Maybe we want to track an "issued" status in the cp?
On Mon, 24 Jun 2019 13:46:22 +0200 Cornelia Huck <cohuck@redhat.com> wrote: > On Mon, 24 Jun 2019 12:05:14 +0200 > Cornelia Huck <cohuck@redhat.com> wrote: > > > On Mon, 24 Jun 2019 11:42:31 +0200 > > Cornelia Huck <cohuck@redhat.com> wrote: > > > > > On Fri, 21 Jun 2019 14:34:10 -0400 > > > Farhan Ali <alifm@linux.ibm.com> wrote: > > > > > > > On 06/21/2019 01:40 PM, Eric Farman wrote: > > > > > > > > > > > > > > > On 6/21/19 10:17 AM, Farhan Ali wrote: > > > > >> > > > > >> > > > > >> On 06/20/2019 04:27 PM, Eric Farman wrote: > > > > >>> > > > > >>> > > > > >>> On 6/20/19 3:40 PM, Farhan Ali wrote: > > > > > >>>> diff --git a/drivers/s390/cio/vfio_ccw_drv.c > > > > >>>> b/drivers/s390/cio/vfio_ccw_drv.c > > > > >>>> index 66a66ac..61ece3f 100644 > > > > >>>> --- a/drivers/s390/cio/vfio_ccw_drv.c > > > > >>>> +++ b/drivers/s390/cio/vfio_ccw_drv.c > > > > >>>> @@ -88,7 +88,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct > > > > >>>> *work) > > > > >>>> (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT)); > > > > >>>> if (scsw_is_solicited(&irb->scsw)) { > > > > >>>> cp_update_scsw(&private->cp, &irb->scsw); > > > > >>> > > > > >>> As I alluded earlier, do we know this irb is for this cp? If no, what > > > > >>> does this function end up putting in the scsw? > > > > > > Yes, I think this also needs to check whether we have at least a prior > > > start function around. (We use the orb provided by the guest; maybe we > > > should check if that intparm is set in the irb?) > > > > Hrm; not so easy as we always set the intparm to the address of the > > subchannel structure... > > > > Maybe check if we have have one of the conditions of the large table > > 16-6 and correlate to the ccw address? Or is it enough to check the > > function control? (Don't remember when the hardware resets it.) > > Nope, we cannot look at the function control, as csch clears any set > start function bit :( (see "Function Control", pg 16-13) > > I think this problem mostly boils down to "csch clears pending status; > therefore, we may only get one interrupt, even though there had been a > start function going on". If we only go with what the hardware gives > us, I don't see a way to distinguish "clear with a prior start" from > "clear only". Maybe we want to track an "issued" status in the cp? Sorry for replying to myself again :), but maybe we should simply call cp_free() if we got cc 0 from a csch? Any start function has been terminated at the subchannel during successful execution of csch, and cp_free does nothing if !cp->initialized, so we should hopefully be safe there as well. We can then add a check for the start function in the function control in the check above and should be fine, I think.
On 06/24/2019 08:07 AM, Cornelia Huck wrote: > On Mon, 24 Jun 2019 13:46:22 +0200 > Cornelia Huck <cohuck@redhat.com> wrote: > >> On Mon, 24 Jun 2019 12:05:14 +0200 >> Cornelia Huck <cohuck@redhat.com> wrote: >> >>> On Mon, 24 Jun 2019 11:42:31 +0200 >>> Cornelia Huck <cohuck@redhat.com> wrote: >>> >>>> On Fri, 21 Jun 2019 14:34:10 -0400 >>>> Farhan Ali <alifm@linux.ibm.com> wrote: >>>> >>>>> On 06/21/2019 01:40 PM, Eric Farman wrote: >>>>>> >>>>>> >>>>>> On 6/21/19 10:17 AM, Farhan Ali wrote: >>>>>>> >>>>>>> >>>>>>> On 06/20/2019 04:27 PM, Eric Farman wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 6/20/19 3:40 PM, Farhan Ali wrote: >> >>>>>>>>> diff --git a/drivers/s390/cio/vfio_ccw_drv.c >>>>>>>>> b/drivers/s390/cio/vfio_ccw_drv.c >>>>>>>>> index 66a66ac..61ece3f 100644 >>>>>>>>> --- a/drivers/s390/cio/vfio_ccw_drv.c >>>>>>>>> +++ b/drivers/s390/cio/vfio_ccw_drv.c >>>>>>>>> @@ -88,7 +88,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct >>>>>>>>> *work) >>>>>>>>> (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT)); >>>>>>>>> if (scsw_is_solicited(&irb->scsw)) { >>>>>>>>> cp_update_scsw(&private->cp, &irb->scsw); >>>>>>>> >>>>>>>> As I alluded earlier, do we know this irb is for this cp? If no, what >>>>>>>> does this function end up putting in the scsw? >>>> >>>> Yes, I think this also needs to check whether we have at least a prior >>>> start function around. (We use the orb provided by the guest; maybe we >>>> should check if that intparm is set in the irb?) >>> >>> Hrm; not so easy as we always set the intparm to the address of the >>> subchannel structure... >>> >>> Maybe check if we have have one of the conditions of the large table >>> 16-6 and correlate to the ccw address? Or is it enough to check the >>> function control? (Don't remember when the hardware resets it.) >> >> Nope, we cannot look at the function control, as csch clears any set >> start function bit :( (see "Function Control", pg 16-13) >> >> I think this problem mostly boils down to "csch clears pending status; >> therefore, we may only get one interrupt, even though there had been a >> start function going on". If we only go with what the hardware gives >> us, I don't see a way to distinguish "clear with a prior start" from >> "clear only". Maybe we want to track an "issued" status in the cp? > > Sorry for replying to myself again :), but maybe we should simply call > cp_free() if we got cc 0 from a csch? Any start function has been > terminated at the subchannel during successful execution of csch, and > cp_free does nothing if !cp->initialized, so we should hopefully be > safe there as well. We can then add a check for the start function in > the function control in the check above and should be fine, I think. > > So you mean not call cp_free in vfio_ccw_sch_io_todo, and instead call cp_free for a cc=0 for csch (and hsch) ? Won't we end up with memory leak for a successful for ssch then? But even if we don't remove the cp_free from vfio_ccw_sch_io_todo, I am not sure if your suggestion will fix the problem. The problem here is that we can call vfio_ccw_sch_io_todo (for a clear or halt interrupt) at the same time we are handling an ssch request. So depending on the order of the operations we could still end up calling cp_free from both from threads (i refer to the threads I mentioned in response to Eric's earlier email). Another thing that concerns me is that vfio-ccw can also issue csch/hsch in the quiesce path, independently of what the guest issues. So in that case we could have a similar scenario to processing an ssch request and issuing halt/clear in parallel. But maybe I am being paranoid :) Thanks Farhan
On Mon, 24 Jun 2019 10:44:17 -0400 Farhan Ali <alifm@linux.ibm.com> wrote: > On 06/24/2019 08:07 AM, Cornelia Huck wrote: > > On Mon, 24 Jun 2019 13:46:22 +0200 > > Cornelia Huck <cohuck@redhat.com> wrote: > > > >> On Mon, 24 Jun 2019 12:05:14 +0200 > >> Cornelia Huck <cohuck@redhat.com> wrote: > >> > >>> On Mon, 24 Jun 2019 11:42:31 +0200 > >>> Cornelia Huck <cohuck@redhat.com> wrote: > >>> > >>>> On Fri, 21 Jun 2019 14:34:10 -0400 > >>>> Farhan Ali <alifm@linux.ibm.com> wrote: > >>>> > >>>>> On 06/21/2019 01:40 PM, Eric Farman wrote: > >>>>>> > >>>>>> > >>>>>> On 6/21/19 10:17 AM, Farhan Ali wrote: > >>>>>>> > >>>>>>> > >>>>>>> On 06/20/2019 04:27 PM, Eric Farman wrote: > >>>>>>>> > >>>>>>>> > >>>>>>>> On 6/20/19 3:40 PM, Farhan Ali wrote: > >> > >>>>>>>>> diff --git a/drivers/s390/cio/vfio_ccw_drv.c > >>>>>>>>> b/drivers/s390/cio/vfio_ccw_drv.c > >>>>>>>>> index 66a66ac..61ece3f 100644 > >>>>>>>>> --- a/drivers/s390/cio/vfio_ccw_drv.c > >>>>>>>>> +++ b/drivers/s390/cio/vfio_ccw_drv.c > >>>>>>>>> @@ -88,7 +88,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct > >>>>>>>>> *work) > >>>>>>>>> (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT)); > >>>>>>>>> if (scsw_is_solicited(&irb->scsw)) { > >>>>>>>>> cp_update_scsw(&private->cp, &irb->scsw); > >>>>>>>> > >>>>>>>> As I alluded earlier, do we know this irb is for this cp? If no, what > >>>>>>>> does this function end up putting in the scsw? > >>>> > >>>> Yes, I think this also needs to check whether we have at least a prior > >>>> start function around. (We use the orb provided by the guest; maybe we > >>>> should check if that intparm is set in the irb?) > >>> > >>> Hrm; not so easy as we always set the intparm to the address of the > >>> subchannel structure... > >>> > >>> Maybe check if we have have one of the conditions of the large table > >>> 16-6 and correlate to the ccw address? Or is it enough to check the > >>> function control? (Don't remember when the hardware resets it.) > >> > >> Nope, we cannot look at the function control, as csch clears any set > >> start function bit :( (see "Function Control", pg 16-13) > >> > >> I think this problem mostly boils down to "csch clears pending status; > >> therefore, we may only get one interrupt, even though there had been a > >> start function going on". If we only go with what the hardware gives > >> us, I don't see a way to distinguish "clear with a prior start" from > >> "clear only". Maybe we want to track an "issued" status in the cp? > > > > Sorry for replying to myself again :), but maybe we should simply call > > cp_free() if we got cc 0 from a csch? Any start function has been > > terminated at the subchannel during successful execution of csch, and > > cp_free does nothing if !cp->initialized, so we should hopefully be > > safe there as well. We can then add a check for the start function in > > the function control in the check above and should be fine, I think. > > > > > > So you mean not call cp_free in vfio_ccw_sch_io_todo, and instead call > cp_free for a cc=0 for csch (and hsch) ? > > Won't we end up with memory leak for a successful for ssch then? No; both: - free if cc=0 for csch (as this clears the status; hsch doesn't) - free in _todo if the start function is set in the irb and the status is final > > But even if we don't remove the cp_free from vfio_ccw_sch_io_todo, I am > not sure if your suggestion will fix the problem. The problem here is > that we can call vfio_ccw_sch_io_todo (for a clear or halt interrupt) at > the same time we are handling an ssch request. So depending on the order > of the operations we could still end up calling cp_free from both from > threads (i refer to the threads I mentioned in response to Eric's > earlier email). What I don't see is why this is a problem with ->initialized; wasn't the problem that we misinterpreted an interrupt for csch as one for a not-yet-issued ssch? > > Another thing that concerns me is that vfio-ccw can also issue csch/hsch > in the quiesce path, independently of what the guest issues. So in that > case we could have a similar scenario to processing an ssch request and > issuing halt/clear in parallel. But maybe I am being paranoid :) I think the root problem is really trying to clear a cp while another thread is trying to set it up. Should we maybe use something like rcu?
On 06/24/2019 11:09 AM, Cornelia Huck wrote: > On Mon, 24 Jun 2019 10:44:17 -0400 > Farhan Ali <alifm@linux.ibm.com> wrote: > >> On 06/24/2019 08:07 AM, Cornelia Huck wrote: >>> On Mon, 24 Jun 2019 13:46:22 +0200 >>> Cornelia Huck <cohuck@redhat.com> wrote: >>> >>>> On Mon, 24 Jun 2019 12:05:14 +0200 >>>> Cornelia Huck <cohuck@redhat.com> wrote: >>>> >>>>> On Mon, 24 Jun 2019 11:42:31 +0200 >>>>> Cornelia Huck <cohuck@redhat.com> wrote: >>>>> >>>>>> On Fri, 21 Jun 2019 14:34:10 -0400 >>>>>> Farhan Ali <alifm@linux.ibm.com> wrote: >>>>>> >>>>>>> On 06/21/2019 01:40 PM, Eric Farman wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 6/21/19 10:17 AM, Farhan Ali wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> On 06/20/2019 04:27 PM, Eric Farman wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 6/20/19 3:40 PM, Farhan Ali wrote: >>>> >>>>>>>>>>> diff --git a/drivers/s390/cio/vfio_ccw_drv.c >>>>>>>>>>> b/drivers/s390/cio/vfio_ccw_drv.c >>>>>>>>>>> index 66a66ac..61ece3f 100644 >>>>>>>>>>> --- a/drivers/s390/cio/vfio_ccw_drv.c >>>>>>>>>>> +++ b/drivers/s390/cio/vfio_ccw_drv.c >>>>>>>>>>> @@ -88,7 +88,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct >>>>>>>>>>> *work) >>>>>>>>>>> (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT)); >>>>>>>>>>> if (scsw_is_solicited(&irb->scsw)) { >>>>>>>>>>> cp_update_scsw(&private->cp, &irb->scsw); >>>>>>>>>> >>>>>>>>>> As I alluded earlier, do we know this irb is for this cp? If no, what >>>>>>>>>> does this function end up putting in the scsw? >>>>>> >>>>>> Yes, I think this also needs to check whether we have at least a prior >>>>>> start function around. (We use the orb provided by the guest; maybe we >>>>>> should check if that intparm is set in the irb?) >>>>> >>>>> Hrm; not so easy as we always set the intparm to the address of the >>>>> subchannel structure... >>>>> >>>>> Maybe check if we have have one of the conditions of the large table >>>>> 16-6 and correlate to the ccw address? Or is it enough to check the >>>>> function control? (Don't remember when the hardware resets it.) >>>> >>>> Nope, we cannot look at the function control, as csch clears any set >>>> start function bit :( (see "Function Control", pg 16-13) >>>> >>>> I think this problem mostly boils down to "csch clears pending status; >>>> therefore, we may only get one interrupt, even though there had been a >>>> start function going on". If we only go with what the hardware gives >>>> us, I don't see a way to distinguish "clear with a prior start" from >>>> "clear only". Maybe we want to track an "issued" status in the cp? >>> >>> Sorry for replying to myself again :), but maybe we should simply call >>> cp_free() if we got cc 0 from a csch? Any start function has been >>> terminated at the subchannel during successful execution of csch, and >>> cp_free does nothing if !cp->initialized, so we should hopefully be >>> safe there as well. We can then add a check for the start function in >>> the function control in the check above and should be fine, I think. >>> >>> >> >> So you mean not call cp_free in vfio_ccw_sch_io_todo, and instead call >> cp_free for a cc=0 for csch (and hsch) ? >> >> Won't we end up with memory leak for a successful for ssch then? > > No; both: > > - free if cc=0 for csch (as this clears the status; hsch doesn't) > - free in _todo if the start function is set in the irb and the status > is final > >> >> But even if we don't remove the cp_free from vfio_ccw_sch_io_todo, I am >> not sure if your suggestion will fix the problem. The problem here is >> that we can call vfio_ccw_sch_io_todo (for a clear or halt interrupt) at >> the same time we are handling an ssch request. So depending on the order >> of the operations we could still end up calling cp_free from both from >> threads (i refer to the threads I mentioned in response to Eric's >> earlier email). > > What I don't see is why this is a problem with ->initialized; wasn't > the problem that we misinterpreted an interrupt for csch as one for a > not-yet-issued ssch? > It's the order in which we do things, which could cause the problem. Since we queue interrupt handling in the workqueue, we could delay processing the csch interrupt. During this delay if ssch comes through, we might have already set ->initialized to true. So when we get around to handling the interrupt in io_todo, we would go ahead and call cp_free. This would cause the problem of freeing the ccwchain list while we might be adding to it. >> >> Another thing that concerns me is that vfio-ccw can also issue csch/hsch >> in the quiesce path, independently of what the guest issues. So in that >> case we could have a similar scenario to processing an ssch request and >> issuing halt/clear in parallel. But maybe I am being paranoid :) > > I think the root problem is really trying to clear a cp while another > thread is trying to set it up. Should we maybe use something like rcu? > > Yes, this is the root problem. I am not too familiar with rcu locking, but what would be the benefit over a traditional mutex? Thanks Farhan
On Mon, 24 Jun 2019 11:24:16 -0400 Farhan Ali <alifm@linux.ibm.com> wrote: > On 06/24/2019 11:09 AM, Cornelia Huck wrote: > > On Mon, 24 Jun 2019 10:44:17 -0400 > > Farhan Ali <alifm@linux.ibm.com> wrote: > >> But even if we don't remove the cp_free from vfio_ccw_sch_io_todo, I am > >> not sure if your suggestion will fix the problem. The problem here is > >> that we can call vfio_ccw_sch_io_todo (for a clear or halt interrupt) at > >> the same time we are handling an ssch request. So depending on the order > >> of the operations we could still end up calling cp_free from both from > >> threads (i refer to the threads I mentioned in response to Eric's > >> earlier email). > > > > What I don't see is why this is a problem with ->initialized; wasn't > > the problem that we misinterpreted an interrupt for csch as one for a > > not-yet-issued ssch? > > > > It's the order in which we do things, which could cause the problem. > Since we queue interrupt handling in the workqueue, we could delay > processing the csch interrupt. During this delay if ssch comes through, > we might have already set ->initialized to true. > > So when we get around to handling the interrupt in io_todo, we would go > ahead and call cp_free. This would cause the problem of freeing the > ccwchain list while we might be adding to it. > > >> > >> Another thing that concerns me is that vfio-ccw can also issue csch/hsch > >> in the quiesce path, independently of what the guest issues. So in that > >> case we could have a similar scenario to processing an ssch request and > >> issuing halt/clear in parallel. But maybe I am being paranoid :) > > > > I think the root problem is really trying to clear a cp while another > > thread is trying to set it up. Should we maybe use something like rcu? > > > > > > Yes, this is the root problem. I am not too familiar with rcu locking, > but what would be the benefit over a traditional mutex? I don't quite remember what I had been envisioning at the time (sorry, the heat seems to make my brain a bit slushy :/), but I think we might have two copies of the cp and use an rcu-ed pointer in the private structure to point to one of the copies. If we make sure we've synchronized on the pointer at interrupt time, we should be able to free the old one in _todo and act on the new on when doing ssch. And yes, I realize that this is awfully vague :)
On 06/27/2019 05:14 AM, Cornelia Huck wrote: > On Mon, 24 Jun 2019 11:24:16 -0400 > Farhan Ali <alifm@linux.ibm.com> wrote: > >> On 06/24/2019 11:09 AM, Cornelia Huck wrote: >>> On Mon, 24 Jun 2019 10:44:17 -0400 >>> Farhan Ali <alifm@linux.ibm.com> wrote: > >>>> But even if we don't remove the cp_free from vfio_ccw_sch_io_todo, I am >>>> not sure if your suggestion will fix the problem. The problem here is >>>> that we can call vfio_ccw_sch_io_todo (for a clear or halt interrupt) at >>>> the same time we are handling an ssch request. So depending on the order >>>> of the operations we could still end up calling cp_free from both from >>>> threads (i refer to the threads I mentioned in response to Eric's >>>> earlier email). >>> >>> What I don't see is why this is a problem with ->initialized; wasn't >>> the problem that we misinterpreted an interrupt for csch as one for a >>> not-yet-issued ssch? >>> >> >> It's the order in which we do things, which could cause the problem. >> Since we queue interrupt handling in the workqueue, we could delay >> processing the csch interrupt. During this delay if ssch comes through, >> we might have already set ->initialized to true. >> >> So when we get around to handling the interrupt in io_todo, we would go >> ahead and call cp_free. This would cause the problem of freeing the >> ccwchain list while we might be adding to it. >> >>>> >>>> Another thing that concerns me is that vfio-ccw can also issue csch/hsch >>>> in the quiesce path, independently of what the guest issues. So in that >>>> case we could have a similar scenario to processing an ssch request and >>>> issuing halt/clear in parallel. But maybe I am being paranoid :) >>> >>> I think the root problem is really trying to clear a cp while another >>> thread is trying to set it up. Should we maybe use something like rcu? >>> >>> >> >> Yes, this is the root problem. I am not too familiar with rcu locking, >> but what would be the benefit over a traditional mutex? > > I don't quite remember what I had been envisioning at the time (sorry, > the heat seems to make my brain a bit slushy :/), but I think we might > have two copies of the cp and use an rcu-ed pointer in the private > structure to point to one of the copies. If we make sure we've > synchronized on the pointer at interrupt time, we should be able to > free the old one in _todo and act on the new on when doing ssch. And > yes, I realize that this is awfully vague :) > > Sorry for the delayed response. I was trying out few ideas, and I think the simplest one for me that worked and that makes sense is to explicitly add the check to see if the state == CP_PENDING when trying to free the cp (as mentioned by Halil in a separate thread). When we are in the CP_PENDING state then we know for sure that we have a currently allocated cp and no other thread is working on it. So in the interrupt context, it should be okay to free cp. I have prototyped with the mutex, but the code becomes too hairy. I looked into the rcu api and from what I understand about rcu it would provide advantage if we more readers than updaters. But in our case we really have 2 updaters, updating the cp at the same time. In the meantime I also have some minor fixes while going over the code again :). I will post a v2 soon for review. Thanks Farhan
diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c index 66a66ac..61ece3f 100644 --- a/drivers/s390/cio/vfio_ccw_drv.c +++ b/drivers/s390/cio/vfio_ccw_drv.c @@ -88,7 +88,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work) (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT)); if (scsw_is_solicited(&irb->scsw)) { cp_update_scsw(&private->cp, &irb->scsw); - if (is_final) + if (is_final && private->state != VFIO_CCW_STATE_CP_PROCESSING) cp_free(&private->cp); } mutex_lock(&private->io_mutex);
There is a small window where it's possible that an interrupt can arrive and can call cp_free, while we are still processing a channel program (i.e allocating memory, pinnging pages, translating addresses etc). This can lead to allocating and freeing at the same time and can cause memory corruption. Let's not call cp_free if we are currently processing a channel program. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> --- I have been running my test overnight with this patch and I haven't seen the stack traces that I mentioned about earlier. I would like to get some reviews on this and also if this is the right thing to do? Thanks Farhan drivers/s390/cio/vfio_ccw_drv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)