Message ID | 20250206142307.921070-5-eric.auger@redhat.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Fix vIOMMU reset order | expand |
On Thu, 6 Feb 2025 at 14:23, Eric Auger <eric.auger@redhat.com> wrote: > > Currently the iommu may be reset before the devices > it protects. For example this happens with virtio-scsi-pci. > when system_reset is issued from qmp monitor, spurious > "virtio: zero sized buffers are not allowed" warnings can > be observed. > > Signed-off-by: Eric Auger <eric.auger@redhat.com> > --- > hw/arm/smmuv3.c | 9 +++++---- > hw/arm/trace-events | 1 + > 2 files changed, 6 insertions(+), 4 deletions(-) > > diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c > index c0cf5df0f6..7522c32b24 100644 > --- a/hw/arm/smmuv3.c > +++ b/hw/arm/smmuv3.c > @@ -1870,13 +1870,14 @@ static void smmu_init_irq(SMMUv3State *s, SysBusDevice *dev) > } > } > > -static void smmu_reset_hold(Object *obj, ResetType type) > +static void smmu_reset_exit(Object *obj, ResetType type) > { > SMMUv3State *s = ARM_SMMUV3(obj); > SMMUv3Class *c = ARM_SMMUV3_GET_CLASS(s); > > - if (c->parent_phases.hold) { > - c->parent_phases.hold(obj, type); > + trace_smmu_reset_exit(); > + if (c->parent_phases.exit) { > + c->parent_phases.exit(obj, type); > } If we need to do something unexpected like reset register values in the exit phase rather than the hold phase, it's a good idea to have a comment explaining why, to avoid somebody coming along afterwards and tidying it up into the more usual arrangement. If I understand correctly we need to keep the whole IOMMU config intact until the exit phase? What's the thing the device behind the IOMMU is trying to do during its reset that triggers the warning? thanks -- PMM
On 2/7/25 5:37 PM, Peter Maydell wrote: > On Thu, 6 Feb 2025 at 14:23, Eric Auger <eric.auger@redhat.com> wrote: >> Currently the iommu may be reset before the devices >> it protects. For example this happens with virtio-scsi-pci. >> when system_reset is issued from qmp monitor, spurious >> "virtio: zero sized buffers are not allowed" warnings can >> be observed. >> >> Signed-off-by: Eric Auger <eric.auger@redhat.com> >> --- >> hw/arm/smmuv3.c | 9 +++++---- >> hw/arm/trace-events | 1 + >> 2 files changed, 6 insertions(+), 4 deletions(-) >> >> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c >> index c0cf5df0f6..7522c32b24 100644 >> --- a/hw/arm/smmuv3.c >> +++ b/hw/arm/smmuv3.c >> @@ -1870,13 +1870,14 @@ static void smmu_init_irq(SMMUv3State *s, SysBusDevice *dev) >> } >> } >> >> -static void smmu_reset_hold(Object *obj, ResetType type) >> +static void smmu_reset_exit(Object *obj, ResetType type) >> { >> SMMUv3State *s = ARM_SMMUV3(obj); >> SMMUv3Class *c = ARM_SMMUV3_GET_CLASS(s); >> >> - if (c->parent_phases.hold) { >> - c->parent_phases.hold(obj, type); >> + trace_smmu_reset_exit(); >> + if (c->parent_phases.exit) { >> + c->parent_phases.exit(obj, type); >> } > If we need to do something unexpected like reset > register values in the exit phase rather than the > hold phase, it's a good idea to have a comment explaining > why, to avoid somebody coming along afterwards and tidying > it up into the more usual arrangement. sure > > If I understand correctly we need to keep the whole IOMMU > config intact until the exit phase? What's the thing the > device behind the IOMMU is trying to do during its reset > that triggers the warning? The virtio-pci-net continues to perform DMA requests and this causes some weird messages such as: "virtio: bogus descriptor or out of resources" Also VFIO devices may continue issuing DMAs causing translation faults Thanks Eric > > thanks > -- PMM >
On Fri, 7 Feb 2025 at 16:50, Eric Auger <eric.auger@redhat.com> wrote: > > > > > On 2/7/25 5:37 PM, Peter Maydell wrote: > > On Thu, 6 Feb 2025 at 14:23, Eric Auger <eric.auger@redhat.com> wrote: > >> Currently the iommu may be reset before the devices > >> it protects. For example this happens with virtio-scsi-pci. > >> when system_reset is issued from qmp monitor, spurious > >> "virtio: zero sized buffers are not allowed" warnings can > >> be observed. > >> > >> Signed-off-by: Eric Auger <eric.auger@redhat.com> > >> --- > >> hw/arm/smmuv3.c | 9 +++++---- > >> hw/arm/trace-events | 1 + > >> 2 files changed, 6 insertions(+), 4 deletions(-) > >> > >> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c > >> index c0cf5df0f6..7522c32b24 100644 > >> --- a/hw/arm/smmuv3.c > >> +++ b/hw/arm/smmuv3.c > >> @@ -1870,13 +1870,14 @@ static void smmu_init_irq(SMMUv3State *s, SysBusDevice *dev) > >> } > >> } > >> > >> -static void smmu_reset_hold(Object *obj, ResetType type) > >> +static void smmu_reset_exit(Object *obj, ResetType type) > >> { > >> SMMUv3State *s = ARM_SMMUV3(obj); > >> SMMUv3Class *c = ARM_SMMUV3_GET_CLASS(s); > >> > >> - if (c->parent_phases.hold) { > >> - c->parent_phases.hold(obj, type); > >> + trace_smmu_reset_exit(); > >> + if (c->parent_phases.exit) { > >> + c->parent_phases.exit(obj, type); > >> } > > If we need to do something unexpected like reset > > register values in the exit phase rather than the > > hold phase, it's a good idea to have a comment explaining > > why, to avoid somebody coming along afterwards and tidying > > it up into the more usual arrangement. > sure > > > > If I understand correctly we need to keep the whole IOMMU > > config intact until the exit phase? What's the thing the > > device behind the IOMMU is trying to do during its reset > > that triggers the warning? > The virtio-pci-net continues to perform DMA requests and this causes > some weird messages such as: > "virtio: bogus descriptor or out of resources" > > Also VFIO devices may continue issuing DMAs causing translation faults Hmm, right. I guess this only happens with KVM, or can you trigger it in a TCG setup too? Anyway, presumably we can rely on the devices quiescing all their outstanding DMA by the time the hold phase comes along. (I wonder if we ought to suggest quiescing outstanding DMA in the enter phase? But it's probably easier to fix the iommus like this series does than try to get every dma-capable pci device to do something different.) thanks -- PMM
On Fri, Feb 07, 2025 at 04:58:39PM +0000, Peter Maydell wrote: > (I wonder if we ought to suggest quiescing outstanding > DMA in the enter phase? But it's probably easier to fix > the iommus like this series does than try to get every > dma-capable pci device to do something different.) I wonder if we should provide some generic helper to register vIOMMU reset callbacks, so that we'll be sure any vIOMMU model impl that will register at exit() phase only, and do nothing during the initial two phases. Then we can put some rich comment on that helper on why. Looks like it means the qemu reset model in the future can be a combination of device tree (which resets depth-first) and the three phases model. We will start to use different approach to solve different problems. Maybe after we settle our mind, we should update the reset document, e.g. for device emulation developers, we need to be clear on where to quiesce the DMAs, and it must not happen at exit(). Both all devices and all iommu impls need to follow the rules to make it work like the plan. Thanks,
On Fri, 7 Feb 2025 at 17:48, Peter Xu <peterx@redhat.com> wrote: > > On Fri, Feb 07, 2025 at 04:58:39PM +0000, Peter Maydell wrote: > > (I wonder if we ought to suggest quiescing outstanding > > DMA in the enter phase? But it's probably easier to fix > > the iommus like this series does than try to get every > > dma-capable pci device to do something different.) > > I wonder if we should provide some generic helper to register vIOMMU reset > callbacks, so that we'll be sure any vIOMMU model impl that will register > at exit() phase only, and do nothing during the initial two phases. Then > we can put some rich comment on that helper on why. > > Looks like it means the qemu reset model in the future can be a combination > of device tree (which resets depth-first) and the three phases model. We > will start to use different approach to solve different problems. The tree of QOM devices (i.e. the one based on the qbus buses and rooted at the sysbus) resets depth-first, but it does so in three phases: first we traverse everything doing 'enter'; then we traverse everything doing 'hold'; then we traverse everything doing 'exit'. There *used* to be an awkward mix of some things being three-phase and some not, but we have now got rid of all of those so a system reset does a single three-phase reset run which resets everything. -- PMM
Hi Peter, On 2/7/25 6:47 PM, Peter Xu wrote: > On Fri, Feb 07, 2025 at 04:58:39PM +0000, Peter Maydell wrote: >> (I wonder if we ought to suggest quiescing outstanding >> DMA in the enter phase? But it's probably easier to fix >> the iommus like this series does than try to get every >> dma-capable pci device to do something different.) > I wonder if we should provide some generic helper to register vIOMMU reset > callbacks, so that we'll be sure any vIOMMU model impl that will register > at exit() phase only, and do nothing during the initial two phases. Then > we can put some rich comment on that helper on why. As discussed with Cédric, I think it shall think about having eventually a base class for vIOMMU. Maybe this is something we can handle afterwards though. > > Looks like it means the qemu reset model in the future can be a combination > of device tree (which resets depth-first) and the three phases model. We > will start to use different approach to solve different problems. > > Maybe after we settle our mind, we should update the reset document, > e.g. for device emulation developers, we need to be clear on where to > quiesce the DMAs, and it must not happen at exit(). Both all devices and > all iommu impls need to follow the rules to make it work like the plan. The 3 phase documentation already states that you shouldn't do anything in enter phase that can have side-effect on other devices. However I agree we can add another example besides the qemu_irq line one. Eric > > Thanks, >
Hi Peter, On 2/7/25 5:58 PM, Peter Maydell wrote: > On Fri, 7 Feb 2025 at 16:50, Eric Auger <eric.auger@redhat.com> wrote: >> >> >> >> On 2/7/25 5:37 PM, Peter Maydell wrote: >>> On Thu, 6 Feb 2025 at 14:23, Eric Auger <eric.auger@redhat.com> wrote: >>>> Currently the iommu may be reset before the devices >>>> it protects. For example this happens with virtio-scsi-pci. >>>> when system_reset is issued from qmp monitor, spurious >>>> "virtio: zero sized buffers are not allowed" warnings can >>>> be observed. >>>> >>>> Signed-off-by: Eric Auger <eric.auger@redhat.com> >>>> --- >>>> hw/arm/smmuv3.c | 9 +++++---- >>>> hw/arm/trace-events | 1 + >>>> 2 files changed, 6 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c >>>> index c0cf5df0f6..7522c32b24 100644 >>>> --- a/hw/arm/smmuv3.c >>>> +++ b/hw/arm/smmuv3.c >>>> @@ -1870,13 +1870,14 @@ static void smmu_init_irq(SMMUv3State *s, SysBusDevice *dev) >>>> } >>>> } >>>> >>>> -static void smmu_reset_hold(Object *obj, ResetType type) >>>> +static void smmu_reset_exit(Object *obj, ResetType type) >>>> { >>>> SMMUv3State *s = ARM_SMMUV3(obj); >>>> SMMUv3Class *c = ARM_SMMUV3_GET_CLASS(s); >>>> >>>> - if (c->parent_phases.hold) { >>>> - c->parent_phases.hold(obj, type); >>>> + trace_smmu_reset_exit(); >>>> + if (c->parent_phases.exit) { >>>> + c->parent_phases.exit(obj, type); >>>> } >>> If we need to do something unexpected like reset >>> register values in the exit phase rather than the >>> hold phase, it's a good idea to have a comment explaining >>> why, to avoid somebody coming along afterwards and tidying >>> it up into the more usual arrangement. >> sure >>> If I understand correctly we need to keep the whole IOMMU >>> config intact until the exit phase? What's the thing the >>> device behind the IOMMU is trying to do during its reset >>> that triggers the warning? >> The virtio-pci-net continues to perform DMA requests and this causes >> some weird messages such as: >> "virtio: bogus descriptor or out of resources" >> >> Also VFIO devices may continue issuing DMAs causing translation faults > Hmm, right. I guess this only happens with KVM, or can you > trigger it in a TCG setup too? Anyway, presumably we can I have only tested with KVM acceleration for now. > rely on the devices quiescing all their outstanding DMA > by the time the hold phase comes along. > > (I wonder if we ought to suggest quiescing outstanding > DMA in the enter phase? But it's probably easier to fix > the iommus like this series does than try to get every > dma-capable pci device to do something different.) at least we can document this, as Peter later suggests. Indeed fixing it at vIOMMU level looked much easier for me, hoping that no other DMA capable device would stop DMAs at exit phase Eric > > thanks > -- PMM >
Hi Peter, On 2/7/25 7:18 PM, Peter Maydell wrote: > On Fri, 7 Feb 2025 at 17:48, Peter Xu <peterx@redhat.com> wrote: >> On Fri, Feb 07, 2025 at 04:58:39PM +0000, Peter Maydell wrote: >>> (I wonder if we ought to suggest quiescing outstanding >>> DMA in the enter phase? But it's probably easier to fix >>> the iommus like this series does than try to get every >>> dma-capable pci device to do something different.) >> I wonder if we should provide some generic helper to register vIOMMU reset >> callbacks, so that we'll be sure any vIOMMU model impl that will register >> at exit() phase only, and do nothing during the initial two phases. Then >> we can put some rich comment on that helper on why. >> >> Looks like it means the qemu reset model in the future can be a combination >> of device tree (which resets depth-first) and the three phases model. We >> will start to use different approach to solve different problems. > The tree of QOM devices (i.e. the one based on the qbus buses > and rooted at the sysbus) resets depth-first, but it does so in > three phases: first we traverse everything doing 'enter'; then > we traverse everything doing 'hold'; then we traverse everything > doing 'exit'. There *used* to be an awkward mix of some things > being three-phase and some not, but we have now got rid of all > of those so a system reset does a single three-phase reset run > which resets everything. Thank you Peter. This is reassuring. I will add such kind of description in the commit msg/cover letter. Eric > > -- PMM >
On Fri, Feb 07, 2025 at 06:18:50PM +0000, Peter Maydell wrote: > On Fri, 7 Feb 2025 at 17:48, Peter Xu <peterx@redhat.com> wrote: > > > > On Fri, Feb 07, 2025 at 04:58:39PM +0000, Peter Maydell wrote: > > > (I wonder if we ought to suggest quiescing outstanding > > > DMA in the enter phase? But it's probably easier to fix > > > the iommus like this series does than try to get every > > > dma-capable pci device to do something different.) > > > > I wonder if we should provide some generic helper to register vIOMMU reset > > callbacks, so that we'll be sure any vIOMMU model impl that will register > > at exit() phase only, and do nothing during the initial two phases. Then > > we can put some rich comment on that helper on why. > > > > Looks like it means the qemu reset model in the future can be a combination > > of device tree (which resets depth-first) and the three phases model. We > > will start to use different approach to solve different problems. > > The tree of QOM devices (i.e. the one based on the qbus buses > and rooted at the sysbus) resets depth-first, but it does so in > three phases: first we traverse everything doing 'enter'; then > we traverse everything doing 'hold'; then we traverse everything > doing 'exit'. There *used* to be an awkward mix of some things > being three-phase and some not, but we have now got rid of all > of those so a system reset does a single three-phase reset run > which resets everything. Right. Sorry I wasn't very clear before indeed on what I wanted to express. My understanding is the 3 phases reset, even if existed, was not designed to order things like vIOMMU and devices that is already described by system topology. That's, IMHO, exactly what QOM topology wanted to achieve right now on ordering device resets and the whole depth-first reset method would make sense with it. So from that specific POV, it's a mixture use of both methods on ordering of devices to reset now (rather than the order of reset process within a same device, provided into 3 phases). It may not be very intuitive when someone reads about the two reset mechanisms, as one would naturally take vIOMMU as a root object of any other PCIe devices under root complex, and thinking the order should be guaranteed by QOM on reset already. In reality it's not. So that's the part I wonder if we want to document. So we must make sure both: - All vIOMMUs across all archs must only tear down its mapping at its exit() phase, providing the mapping available for all devices during the initial 2 phases (probably we could even assert the initial 2 phase functions to be NULL when there's a base class). Meanwhile, - All PCIe devices must quiesce their DMA in the initial 2 phases, guaranteeing that there's no on-the-fly DMAs possible in the complete 3rd exit() phase, because any vIOMMU implementation can start to tear down its device mappings even as the first entry in 3rd phase (IOW, there's also no order constraint for 3rd phase that vIOMMU exit() will be invoked before devices' exit()). I'm not sure if it would be important to document this, but only thought about it if we want crystal clearance on the choice of this design. Thanks,
On Mon, Feb 10, 2025 at 09:35:53AM +0100, Eric Auger wrote: > Hi Peter, Eric, > > > On 2/7/25 6:47 PM, Peter Xu wrote: > > On Fri, Feb 07, 2025 at 04:58:39PM +0000, Peter Maydell wrote: > >> (I wonder if we ought to suggest quiescing outstanding > >> DMA in the enter phase? But it's probably easier to fix > >> the iommus like this series does than try to get every > >> dma-capable pci device to do something different.) > > I wonder if we should provide some generic helper to register vIOMMU reset > > callbacks, so that we'll be sure any vIOMMU model impl that will register > > at exit() phase only, and do nothing during the initial two phases. Then > > we can put some rich comment on that helper on why. > As discussed with Cédric, I think it shall think about having eventually > a base class for vIOMMU. Maybe this is something we can handle > afterwards though. Yes agreed. > > > > Looks like it means the qemu reset model in the future can be a combination > > of device tree (which resets depth-first) and the three phases model. We > > will start to use different approach to solve different problems. > > > > Maybe after we settle our mind, we should update the reset document, > > e.g. for device emulation developers, we need to be clear on where to > > quiesce the DMAs, and it must not happen at exit(). Both all devices and > > all iommu impls need to follow the rules to make it work like the plan. > The 3 phase documentation already states that you shouldn't do anything > in enter phase that can have side-effect on other devices. However I > agree we can add another example besides the qemu_irq line one. The document will be relevant to two sides of things (so far not relevant to enter() phase, but more on what we should put into the last two phases either for vIOMMU impl or a PCIe device impl), that I commented in the other reply to Peter. I am not sure whether we need such fine granule document, but in all cases I agree this can definitely be done later. :) Thanks!
On Mon, 10 Feb 2025 at 14:14, Peter Xu <peterx@redhat.com> wrote: > > On Fri, Feb 07, 2025 at 06:18:50PM +0000, Peter Maydell wrote: > > On Fri, 7 Feb 2025 at 17:48, Peter Xu <peterx@redhat.com> wrote: > > > > > > On Fri, Feb 07, 2025 at 04:58:39PM +0000, Peter Maydell wrote: > > > > (I wonder if we ought to suggest quiescing outstanding > > > > DMA in the enter phase? But it's probably easier to fix > > > > the iommus like this series does than try to get every > > > > dma-capable pci device to do something different.) > > > > > > I wonder if we should provide some generic helper to register vIOMMU reset > > > callbacks, so that we'll be sure any vIOMMU model impl that will register > > > at exit() phase only, and do nothing during the initial two phases. Then > > > we can put some rich comment on that helper on why. > > > > > > Looks like it means the qemu reset model in the future can be a combination > > > of device tree (which resets depth-first) and the three phases model. We > > > will start to use different approach to solve different problems. > > > > The tree of QOM devices (i.e. the one based on the qbus buses > > and rooted at the sysbus) resets depth-first, but it does so in > > three phases: first we traverse everything doing 'enter'; then > > we traverse everything doing 'hold'; then we traverse everything > > doing 'exit'. There *used* to be an awkward mix of some things > > being three-phase and some not, but we have now got rid of all > > of those so a system reset does a single three-phase reset run > > which resets everything. > > Right. Sorry I wasn't very clear before indeed on what I wanted to > express. > > My understanding is the 3 phases reset, even if existed, was not designed > to order things like vIOMMU and devices that is already described by system > topology. That's, IMHO, exactly what QOM topology wanted to achieve right > now on ordering device resets and the whole depth-first reset method would > make sense with it. > > So from that specific POV, it's a mixture use of both methods on ordering > of devices to reset now (rather than the order of reset process within a > same device, provided into 3 phases). It may not be very intuitive when > someone reads about the two reset mechanisms, as one would naturally take > vIOMMU as a root object of any other PCIe devices under root complex, and > thinking the order should be guaranteed by QOM on reset already. In > reality it's not. So that's the part I wonder if we want to document. Yeah, I see what you mean. The issue here is that the iommu isn't actually a parent of the devices that access through it. This is true both in the "qbus/qdev bus tree" sense (where it is the PCI controller device that owns the PCI bus that the devices are on) and also in the QOM tree sense[*] (where usually the iommu and the PCI controller are both created by the machine or the SoC, I guess?). Instead iommus are separate devices that control data flow but aren't in a parent-child relationship with the devices on either side of that flow. There is a guarantee about reset ordering between bus parent/child (so the PCI controller resets before it resets the PCI bus that resets the PCI devices on the bus), but that doesn't help the iommu. [*] I have a vague idea that ideally we might want reset to be QOM-tree based rather than qbus-tree based. But getting there from here is non-trivial. And maybe what we really want is "objects, especially SoCs, that create children can define what their reset tree is, with the default being to reset all the QOM children". Lots of non-thought-through complexity here ;-) thanks -- PMM
On 2/10/25 15:22, Peter Maydell wrote: > On Mon, 10 Feb 2025 at 14:14, Peter Xu <peterx@redhat.com> wrote: >> >> On Fri, Feb 07, 2025 at 06:18:50PM +0000, Peter Maydell wrote: >>> On Fri, 7 Feb 2025 at 17:48, Peter Xu <peterx@redhat.com> wrote: >>>> >>>> On Fri, Feb 07, 2025 at 04:58:39PM +0000, Peter Maydell wrote: >>>>> (I wonder if we ought to suggest quiescing outstanding >>>>> DMA in the enter phase? But it's probably easier to fix >>>>> the iommus like this series does than try to get every >>>>> dma-capable pci device to do something different.) >>>> >>>> I wonder if we should provide some generic helper to register vIOMMU reset >>>> callbacks, so that we'll be sure any vIOMMU model impl that will register >>>> at exit() phase only, and do nothing during the initial two phases. Then >>>> we can put some rich comment on that helper on why. >>>> >>>> Looks like it means the qemu reset model in the future can be a combination >>>> of device tree (which resets depth-first) and the three phases model. We >>>> will start to use different approach to solve different problems. >>> >>> The tree of QOM devices (i.e. the one based on the qbus buses >>> and rooted at the sysbus) resets depth-first, but it does so in >>> three phases: first we traverse everything doing 'enter'; then >>> we traverse everything doing 'hold'; then we traverse everything >>> doing 'exit'. There *used* to be an awkward mix of some things >>> being three-phase and some not, but we have now got rid of all >>> of those so a system reset does a single three-phase reset run >>> which resets everything. >> >> Right. Sorry I wasn't very clear before indeed on what I wanted to >> express. >> >> My understanding is the 3 phases reset, even if existed, was not designed >> to order things like vIOMMU and devices that is already described by system >> topology. That's, IMHO, exactly what QOM topology wanted to achieve right >> now on ordering device resets and the whole depth-first reset method would >> make sense with it. >> >> So from that specific POV, it's a mixture use of both methods on ordering >> of devices to reset now (rather than the order of reset process within a >> same device, provided into 3 phases). It may not be very intuitive when >> someone reads about the two reset mechanisms, as one would naturally take >> vIOMMU as a root object of any other PCIe devices under root complex, and >> thinking the order should be guaranteed by QOM on reset already. In >> reality it's not. So that's the part I wonder if we want to document. > > Yeah, I see what you mean. The issue here is that the iommu isn't > actually a parent of the devices that access through it. This is > true both in the "qbus/qdev bus tree" sense (where it is the PCI > controller device that owns the PCI bus that the devices are on) > and also in the QOM tree sense[*] (where usually the iommu and the > PCI controller are both created by the machine or the SoC, I guess?). > Instead iommus are separate devices that control data flow but > aren't in a parent-child relationship with the devices on either > side of that flow. There is a guarantee about reset ordering between > bus parent/child (so the PCI controller resets before it resets > the PCI bus that resets the PCI devices on the bus), but that doesn't > help the iommu. > > [*] I have a vague idea that ideally we might want reset to be > QOM-tree based rather than qbus-tree based. But getting there from > here is non-trivial. And maybe what we really want is "objects, > especially SoCs, that create children can define what their reset > tree is, with the default being to reset all the QOM children". yes. When I quickly thought about it, I had the idea that we could reparent the vIOMMU device(s) to a default or a specific PHB and order the reset calls from the PHB reset routine. > Lots of non-thought-through complexity here ;-) same ... C.
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index c0cf5df0f6..7522c32b24 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -1870,13 +1870,14 @@ static void smmu_init_irq(SMMUv3State *s, SysBusDevice *dev) } } -static void smmu_reset_hold(Object *obj, ResetType type) +static void smmu_reset_exit(Object *obj, ResetType type) { SMMUv3State *s = ARM_SMMUV3(obj); SMMUv3Class *c = ARM_SMMUV3_GET_CLASS(s); - if (c->parent_phases.hold) { - c->parent_phases.hold(obj, type); + trace_smmu_reset_exit(); + if (c->parent_phases.exit) { + c->parent_phases.exit(obj, type); } smmuv3_init_regs(s); @@ -1999,7 +2000,7 @@ static void smmuv3_class_init(ObjectClass *klass, void *data) SMMUv3Class *c = ARM_SMMUV3_CLASS(klass); dc->vmsd = &vmstate_smmuv3; - resettable_class_set_parent_phases(rc, NULL, smmu_reset_hold, NULL, + resettable_class_set_parent_phases(rc, NULL, NULL, smmu_reset_exit, &c->parent_phases); device_class_set_parent_realize(dc, smmu_realize, &c->parent_realize); diff --git a/hw/arm/trace-events b/hw/arm/trace-events index c64ad344bd..7790db780e 100644 --- a/hw/arm/trace-events +++ b/hw/arm/trace-events @@ -56,6 +56,7 @@ smmuv3_config_cache_inv(uint32_t sid) "Config cache INV for sid=0x%x" smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s" smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s" smmuv3_inv_notifiers_iova(const char *name, int asid, int vmid, uint64_t iova, uint8_t tg, uint64_t num_pages, int stage) "iommu mr=%s asid=%d vmid=%d iova=0x%"PRIx64" tg=%d num_pages=0x%"PRIx64" stage=%d" +smmu_reset_exit(void) "" # strongarm.c strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d"
Currently the iommu may be reset before the devices it protects. For example this happens with virtio-scsi-pci. when system_reset is issued from qmp monitor, spurious "virtio: zero sized buffers are not allowed" warnings can be observed. Signed-off-by: Eric Auger <eric.auger@redhat.com> --- hw/arm/smmuv3.c | 9 +++++---- hw/arm/trace-events | 1 + 2 files changed, 6 insertions(+), 4 deletions(-)