Message ID | 1346168638-32724-5-git-send-email-jiang.liu@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
On Tue, Aug 28, 2012 at 8:43 AM, Jiang Liu <liuj97@gmail.com> wrote: > Following code has a race window between pci_find_bus() and pci_get_slot() > if PCI hotplug operation happens between them which removes the pci_bus. > So use PCI hotplug safe interface pci_get_domain_bus_and_slot() instead, > which also reduces code complexity. > > struct pci_bus *pci_bus = pci_find_bus(domain, busno); > struct pci_dev *pci_dev = pci_get_slot(pci_bus, devfn); > > Signed-off-by: Jiang Liu <jiang.liu@huawei.com> > --- > drivers/pci/iov.c | 8 ++------ > 1 file changed, 2 insertions(+), 6 deletions(-) > > diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c > index aeccc91..b0fe771 100644 > --- a/drivers/pci/iov.c > +++ b/drivers/pci/iov.c > @@ -152,15 +152,11 @@ failed1: > static void virtfn_remove(struct pci_dev *dev, int id, int reset) > { > char buf[VIRTFN_ID_LEN]; > - struct pci_bus *bus; > struct pci_dev *virtfn; > struct pci_sriov *iov = dev->sriov; > > - bus = pci_find_bus(pci_domain_nr(dev->bus), virtfn_bus(dev, id)); > - if (!bus) > - return; > - > - virtfn = pci_get_slot(bus, virtfn_devfn(dev, id)); > + virtfn = pci_get_domain_bus_and_slot(pci_domain_nr(dev->bus), > + virtfn_bus(dev, id), virtfn_devfn(dev, id)); > if (!virtfn) > return; > Hi, This one cause IOV regression, when remove bridge with pci devices under that. in that case, VFs are stopped before PF, so they are not in device tree anymore. so pci_get_domain_bus_and_slot will not find those VFs. So the reference to PF is not released. Also vit_bus may not be released too. So you have to rework pci_get_domain_bus_and_slot to make it work on pci devices get stopped only. or just drop this from the tree. BTW, Bjorn, for the similar reason, you need to apply http://git.kernel.org/?p=linux/kernel/git/yinghai/linux-yinghai.git;a=commitdiff;h=e5a50aa3dfca1331c7c783412b1524bea06d2752 PCI: Split out stop_bus_device and remove_bus_dev again. So later could use them for pci root bus hotplug support. Also restore old behavoir: stop all at first then remove all. -v2: only split the functions. the reason is: We stop all VFs at first , stop PF before we stop PF, we can not remove VFs, otherwise virtfn_remove does not work properly to remove reference and virtfn bus while can not find vf. I would like to have PCI: Split out stop_bus_device and remove_bus_dev again. fold into http://git.kernel.org/?p=linux/kernel/git/helgaas/pci.git;a=commitdiff;h=282e1d655fe7c7c2e6b0dd8166c4c6b7c2a1219b -Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 20, 2012 at 2:38 PM, Yinghai Lu <yinghai@kernel.org> wrote: > On Tue, Aug 28, 2012 at 8:43 AM, Jiang Liu <liuj97@gmail.com> wrote: >> Following code has a race window between pci_find_bus() and pci_get_slot() >> if PCI hotplug operation happens between them which removes the pci_bus. >> So use PCI hotplug safe interface pci_get_domain_bus_and_slot() instead, >> which also reduces code complexity. >> >> struct pci_bus *pci_bus = pci_find_bus(domain, busno); >> struct pci_dev *pci_dev = pci_get_slot(pci_bus, devfn); >> >> Signed-off-by: Jiang Liu <jiang.liu@huawei.com> >> --- >> drivers/pci/iov.c | 8 ++------ >> 1 file changed, 2 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c >> index aeccc91..b0fe771 100644 >> --- a/drivers/pci/iov.c >> +++ b/drivers/pci/iov.c >> @@ -152,15 +152,11 @@ failed1: >> static void virtfn_remove(struct pci_dev *dev, int id, int reset) >> { >> char buf[VIRTFN_ID_LEN]; >> - struct pci_bus *bus; >> struct pci_dev *virtfn; >> struct pci_sriov *iov = dev->sriov; >> >> - bus = pci_find_bus(pci_domain_nr(dev->bus), virtfn_bus(dev, id)); >> - if (!bus) >> - return; >> - >> - virtfn = pci_get_slot(bus, virtfn_devfn(dev, id)); >> + virtfn = pci_get_domain_bus_and_slot(pci_domain_nr(dev->bus), >> + virtfn_bus(dev, id), virtfn_devfn(dev, id)); >> if (!virtfn) >> return; >> > > Hi, > > This one cause IOV regression, when remove bridge with pci devices under that. > > in that case, VFs are stopped before PF, so they are not in device > tree anymore. > so pci_get_domain_bus_and_slot will not find those VFs. > > So the reference to PF is not released. Also vit_bus may not be released too. > > So you have to rework > pci_get_domain_bus_and_slot to make it work on pci devices get stopped only. > > or just drop this from the tree. pci_find_bus() is a broken interface (because there's no reference counting or safety with respect to hot-plug), and if the design depends on it, that means the design is broken, too. I don't think reworking pci_get_domain_bus_and_slot() is the right answer. It's not clear to me why we need the split between stopping and removing devices. That split leads to these zombie devices that have been stopped and are no longer findable by bus_find_device() (which is used by pci_get_domain_bus_and_slot()), but still "exist" in some fashion until they're removed. It's unreasonable for random PCI and driver code to have to worry about that zombie state. I'll revert this patch for now to fix the regression. Of course, that means we will still have the old problem of using the unsafe pci_find_bus(). > BTW, Bjorn, > for the similar reason, you need to apply > http://git.kernel.org/?p=linux/kernel/git/yinghai/linux-yinghai.git;a=commitdiff;h=e5a50aa3dfca1331c7c783412b1524bea06d2752 > ... > the reason is: > > We stop all VFs at first , stop PF > before we stop PF, we can not remove VFs, > > otherwise virtfn_remove does not work properly to remove reference and > virtfn bus while can not find vf. I'll apply this one, too, for now. I put them both on the pci/yinghai-revert-pci_find_bus-and-remove-cleanup branch. Let me know if that's not what you expected. I'm not happy about either reverting Jiang's patch or splitting stop/remove again. It complicates the design and the code. I'll apply them because they're regressions, and we don't have time for redesign before 3.7. But I encourage you to think about how to do this more cleanly. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/21/2012 07:59 AM, Bjorn Helgaas wrote: > On Thu, Sep 20, 2012 at 2:38 PM, Yinghai Lu <yinghai@kernel.org> wrote: >> On Tue, Aug 28, 2012 at 8:43 AM, Jiang Liu <liuj97@gmail.com> wrote: >>> Following code has a race window between pci_find_bus() and pci_get_slot() >>> if PCI hotplug operation happens between them which removes the pci_bus. >>> So use PCI hotplug safe interface pci_get_domain_bus_and_slot() instead, >>> which also reduces code complexity. >>> >>> struct pci_bus *pci_bus = pci_find_bus(domain, busno); >>> struct pci_dev *pci_dev = pci_get_slot(pci_bus, devfn); >>> >>> Signed-off-by: Jiang Liu <jiang.liu@huawei.com> >>> --- >>> drivers/pci/iov.c | 8 ++------ >>> 1 file changed, 2 insertions(+), 6 deletions(-) >>> >>> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c >>> index aeccc91..b0fe771 100644 >>> --- a/drivers/pci/iov.c >>> +++ b/drivers/pci/iov.c >>> @@ -152,15 +152,11 @@ failed1: >>> static void virtfn_remove(struct pci_dev *dev, int id, int reset) >>> { >>> char buf[VIRTFN_ID_LEN]; >>> - struct pci_bus *bus; >>> struct pci_dev *virtfn; >>> struct pci_sriov *iov = dev->sriov; >>> >>> - bus = pci_find_bus(pci_domain_nr(dev->bus), virtfn_bus(dev, id)); >>> - if (!bus) >>> - return; >>> - >>> - virtfn = pci_get_slot(bus, virtfn_devfn(dev, id)); >>> + virtfn = pci_get_domain_bus_and_slot(pci_domain_nr(dev->bus), >>> + virtfn_bus(dev, id), virtfn_devfn(dev, id)); >>> if (!virtfn) >>> return; >>> >> >> Hi, >> >> This one cause IOV regression, when remove bridge with pci devices under that. >> >> in that case, VFs are stopped before PF, so they are not in device >> tree anymore. >> so pci_get_domain_bus_and_slot will not find those VFs. >> >> So the reference to PF is not released. Also vit_bus may not be released too. >> >> So you have to rework >> pci_get_domain_bus_and_slot to make it work on pci devices get stopped only. >> >> or just drop this from the tree. > > pci_find_bus() is a broken interface (because there's no reference > counting or safety with respect to hot-plug), and if the design > depends on it, that means the design is broken, too. I don't think > reworking pci_get_domain_bus_and_slot() is the right answer. > > It's not clear to me why we need the split between stopping and > removing devices. That split leads to these zombie devices that have > been stopped and are no longer findable by bus_find_device() (which is > used by pci_get_domain_bus_and_slot()), but still "exist" in some > fashion until they're removed. It's unreasonable for random PCI and > driver code to have to worry about that zombie state. > > I'll revert this patch for now to fix the regression. Of course, that > means we will still have the old problem of using the unsafe > pci_find_bus(). Hi Bjorn, I'm working on to enhance unsafe calling of pci_find_bus(). --Gerry -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 20, 2012 at 4:59 PM, Bjorn Helgaas <bhelgaas@google.com> wrote: > On Thu, Sep 20, 2012 at 2:38 PM, Yinghai Lu <yinghai@kernel.org> wrote: >> in that case, VFs are stopped before PF, so they are not in device >> tree anymore. >> so pci_get_domain_bus_and_slot will not find those VFs. >> >> So the reference to PF is not released. Also vit_bus may not be released too. >> >> So you have to rework >> pci_get_domain_bus_and_slot to make it work on pci devices get stopped only. >> >> or just drop this from the tree. > > pci_find_bus() is a broken interface (because there's no reference > counting or safety with respect to hot-plug), and if the design > depends on it, that means the design is broken, too. I don't think > reworking pci_get_domain_bus_and_slot() is the right answer. > > It's not clear to me why we need the split between stopping and > removing devices. That split leads to these zombie devices that have > been stopped and are no longer findable by bus_find_device() (which is > used by pci_get_domain_bus_and_slot()), but still "exist" in some > fashion until they're removed. It's unreasonable for random PCI and > driver code to have to worry about that zombie state. That is not zombie state. that is driver unloaded, and not in /sys, /proc. that pci device only can be found under bus->devices. just like we have pci_device_add and pci_bus_add_device or acpi add and acpi start. Splitting stop and remove should be more clean than mixing stop and remove. > > I'll revert this patch for now to fix the regression. Of course, that > means we will still have the old problem of using the unsafe > pci_find_bus(). > >> BTW, Bjorn, >> for the similar reason, you need to apply >> http://git.kernel.org/?p=linux/kernel/git/yinghai/linux-yinghai.git;a=commitdiff;h=e5a50aa3dfca1331c7c783412b1524bea06d2752 >> ... >> the reason is: >> >> We stop all VFs at first , stop PF >> before we stop PF, we can not remove VFs, >> >> otherwise virtfn_remove does not work properly to remove reference and >> virtfn bus while can not find vf. > > I'll apply this one, too, for now. I put them both on the > pci/yinghai-revert-pci_find_bus-and-remove-cleanup branch. Let me > know if that's not what you expected. Yes, they are good. > > I'm not happy about either reverting Jiang's patch or splitting > stop/remove again. It complicates the design and the code. I'll > apply them because they're regressions, and we don't have time for > redesign before 3.7. But I encourage you to think about how to do > this more cleanly. That will need to redesign sriov implementation. Also that pci root bus add/start, stop/remove will need special sequence to make ioapic and dmar to be started early before normal pci device drivers and stopped after normal pci device drivers. -Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 20, 2012 at 7:51 PM, Yinghai Lu <yinghai@kernel.org> wrote: > On Thu, Sep 20, 2012 at 4:59 PM, Bjorn Helgaas <bhelgaas@google.com> wrote: >> On Thu, Sep 20, 2012 at 2:38 PM, Yinghai Lu <yinghai@kernel.org> wrote: >>> in that case, VFs are stopped before PF, so they are not in device >>> tree anymore. >>> so pci_get_domain_bus_and_slot will not find those VFs. >>> >>> So the reference to PF is not released. Also vit_bus may not be released too. >>> >>> So you have to rework >>> pci_get_domain_bus_and_slot to make it work on pci devices get stopped only. >>> >>> or just drop this from the tree. >> >> pci_find_bus() is a broken interface (because there's no reference >> counting or safety with respect to hot-plug), and if the design >> depends on it, that means the design is broken, too. I don't think >> reworking pci_get_domain_bus_and_slot() is the right answer. >> >> It's not clear to me why we need the split between stopping and >> removing devices. That split leads to these zombie devices that have >> been stopped and are no longer findable by bus_find_device() (which is >> used by pci_get_domain_bus_and_slot()), but still "exist" in some >> fashion until they're removed. It's unreasonable for random PCI and >> driver code to have to worry about that zombie state. > > That is not zombie state. that is driver unloaded, and not in /sys, /proc. > that pci device only can be found under bus->devices. It doesn't matter whether we call this a "zombie state" or just refer to it as "devices not in /sys & /proc but still in bus->devices." The point is that this state is not very useful, and code outside the PCI core should not have to know that it exists. > just like we have pci_device_add and pci_bus_add_device > or acpi add and acpi start. The fact that ACPI drivers have both .add() and .start() methods is another artifact of poor design, in my opinion. No other subsystem has that split, as far as I know. The ACPI split exists because of a messed-up ACPI hotplug implementation. That doesn't mean we should copy it. >> I'm not happy about either reverting Jiang's patch or splitting >> stop/remove again. It complicates the design and the code. I'll >> apply them because they're regressions, and we don't have time for >> redesign before 3.7. But I encourage you to think about how to do >> this more cleanly. > > That will need to redesign sriov implementation. That's right. If we can improve the situation by redesigning, that's what we should do. The present situation, where we keep adding special cases because "that's the way the rest of the system works" is not sustainable in the long term. > Also that pci root bus add/start, stop/remove will need special > sequence to make ioapic > and dmar to be started early before normal pci device drivers and > stopped after normal pci device drivers. This is another thing I'm curious about. How do you handle this situation today (before host bridge hot-add)? The DMAR I'm not so worried about because as far as I know, there's no such thing as a DMAR that's discovered by PCI enumeration. We should discover it via ACPI, and that can happen before we enumerate anything behind a host bridge, so I don't really see any ordering problem between the DMAR and the PCI devices that would use it. However, I know there *are* IOAPICs that are enumerated as PCI devices, and I don't know whether we can deduce a relationship between the IOAPIC and the devices that use it. Don't we have this problem already? I assume that even without hot-adding a host bridge, we might discover a PCI IOAPIC that was present at boot, and we'd have to make sure to bind a driver to it before we use any of the PCI devices connected to it. How does that work? Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 20, 2012 at 7:56 PM, Bjorn Helgaas <bhelgaas@google.com> wrote: > This is another thing I'm curious about. How do you handle this > situation today (before host bridge hot-add)? > > The DMAR I'm not so worried about because as far as I know, there's no > such thing as a DMAR that's discovered by PCI enumeration. We should > discover it via ACPI, and that can happen before we enumerate anything > behind a host bridge, so I don't really see any ordering problem > between the DMAR and the PCI devices that would use it. only need to have pci devices on that root bus scanned, and current intel iommu maintain one device scope to drhd with pointer to pci device... that need to be fixed too. > > However, I know there *are* IOAPICs that are enumerated as PCI > devices, and I don't know whether we can deduce a relationship between > the IOAPIC and the devices that use it. Don't we have this problem > already? I assume that even without hot-adding a host bridge, we > might discover a PCI IOAPIC that was present at boot, and we'd have to > make sure to bind a driver to it before we use any of the PCI devices > connected to it. How does that work? I converted it to acpi way to discover it, and it could handle that case. will search _GSB and try to get pci device, if there is pci device will try to get BAR as ioapic base. http://git.kernel.org/?p=linux/kernel/git/yinghai/linux-yinghai.git;a=blob;f=drivers/pci/ioapic.c;h=504ca93ac692646a7754fff83a04e3d07d98f648;hb=refs/heads/for-x86-irq something like: static void handle_ioapic_add(acpi_handle handle, struct pci_dev **pdev, u32 *pgsi_base) { acpi_status status; unsigned long long gsb; struct pci_dev *dev; u32 gsi_base; int ret; char *type; struct resource r; struct resource *res = &r; char objname[64]; struct acpi_buffer buffer = {sizeof(objname), objname}; *pdev = NULL; *pgsi_base = 0; status = acpi_evaluate_integer(handle, "_GSB", NULL, &gsb); if (ACPI_FAILURE(status) || !gsb) return; dev = acpi_get_pci_dev(handle); if (!dev) { struct acpi_device_info *info; char *hid = NULL; status = acpi_get_object_info(handle, &info); if (ACPI_FAILURE(status)) return; if (info->valid & ACPI_VALID_HID) hid = info->hardware_id.string; if (!hid || strcmp(hid, "ACPI0009")) { kfree(info); return; } kfree(info); memset(res, 0, sizeof(*res)); acpi_walk_resources(handle, METHOD_NAME__CRS, setup_res, res); if (!res->flags) return; } acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer); gsi_base = gsb; type = "IOxAPIC"; if (dev) { ret = pci_enable_device(dev); if (ret < 0) goto exit_put; pci_set_master(dev); if (dev->class == PCI_CLASS_SYSTEM_PIC_IOAPIC) type = "IOAPIC"; if (pci_request_region(dev, 0, type)) goto exit_disable; res = &dev->resource[0]; } if (acpi_register_ioapic(handle, res->start, gsi_base)) { if (dev) goto exit_release; return; } printk(KERN_INFO "%s %s %s at %pR, GSI %u\n", dev ? dev_name(&dev->dev) : "", objname, type, res, gsi_base); *pdev = dev; *pgsi_base = gsi_base; return; exit_release: pci_release_region(dev, 0); exit_disable: pci_disable_device(dev); exit_put: pci_dev_put(dev); } -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/21/2012 02:22 AM, Yinghai Lu wrote: > On Thu, Sep 20, 2012 at 7:56 PM, Bjorn Helgaas<bhelgaas@google.com> wrote: >> This is another thing I'm curious about. How do you handle this >> situation today (before host bridge hot-add)? >> >> The DMAR I'm not so worried about because as far as I know, there's no >> such thing as a DMAR that's discovered by PCI enumeration. We should >> discover it via ACPI, and that can happen before we enumerate anything >> behind a host bridge, so I don't really see any ordering problem >> between the DMAR and the PCI devices that would use it. > > only need to have pci devices on that root bus scanned, and current intel iommu > maintain one device scope to drhd with pointer to pci device... that > need to be fixed > too. > translation: you have an ACPI-DMAR setup bug? a drhd can have multiple device scopes, one of which can be "all devices under bus X uses this IOMMU". If (dynamic) DMARs are scanned at root hot-plug time in ACPI hot-plug, the proper dmar-init should be completed before any PCI devs are scanned (and put into the proper iommu domain). >> >> However, I know there *are* IOAPICs that are enumerated as PCI >> devices, and I don't know whether we can deduce a relationship between >> the IOAPIC and the devices that use it. Don't we have this problem >> already? I assume that even without hot-adding a host bridge, we >> might discover a PCI IOAPIC that was present at boot, and we'd have to >> make sure to bind a driver to it before we use any of the PCI devices >> connected to it. How does that work? > > I converted it to acpi way to discover it, and it could handle that case. > > will search _GSB and try to get pci device, if there is pci device > will try to get BAR as ioapic base. > > http://git.kernel.org/?p=linux/kernel/git/yinghai/linux-yinghai.git;a=blob;f=drivers/pci/ioapic.c;h=504ca93ac692646a7754fff83a04e3d07d98f648;hb=refs/heads/for-x86-irq > > something like: > > static void handle_ioapic_add(acpi_handle handle, struct pci_dev **pdev, > u32 *pgsi_base) > { > acpi_status status; > unsigned long long gsb; > struct pci_dev *dev; > u32 gsi_base; > int ret; > char *type; > struct resource r; > struct resource *res =&r; > char objname[64]; > struct acpi_buffer buffer = {sizeof(objname), objname}; > > *pdev = NULL; > *pgsi_base = 0; > > status = acpi_evaluate_integer(handle, "_GSB", NULL,&gsb); > if (ACPI_FAILURE(status) || !gsb) > return; > > dev = acpi_get_pci_dev(handle); > if (!dev) { > struct acpi_device_info *info; > char *hid = NULL; > > status = acpi_get_object_info(handle,&info); > if (ACPI_FAILURE(status)) > return; > if (info->valid& ACPI_VALID_HID) > hid = info->hardware_id.string; > if (!hid || strcmp(hid, "ACPI0009")) { > kfree(info); > return; > } > kfree(info); > memset(res, 0, sizeof(*res)); > acpi_walk_resources(handle, METHOD_NAME__CRS, setup_res, res); > if (!res->flags) > return; > } > > acpi_get_name(handle, ACPI_FULL_PATHNAME,&buffer); > > gsi_base = gsb; > type = "IOxAPIC"; > if (dev) { > ret = pci_enable_device(dev); > if (ret< 0) > goto exit_put; > > pci_set_master(dev); > > if (dev->class == PCI_CLASS_SYSTEM_PIC_IOAPIC) > type = "IOAPIC"; > > if (pci_request_region(dev, 0, type)) > goto exit_disable; > > res =&dev->resource[0]; > } > > if (acpi_register_ioapic(handle, res->start, gsi_base)) { > if (dev) > goto exit_release; > return; > } > > printk(KERN_INFO "%s %s %s at %pR, GSI %u\n", > dev ? dev_name(&dev->dev) : "", objname, type, > res, gsi_base); > > *pdev = dev; > *pgsi_base = gsi_base; > return; > > exit_release: > pci_release_region(dev, 0); > exit_disable: > pci_disable_device(dev); > exit_put: > pci_dev_put(dev); > } -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Sep 21, 2012 at 7:15 AM, Don Dutile <ddutile@redhat.com> wrote: > On 09/21/2012 02:22 AM, Yinghai Lu wrote: >> >> On Thu, Sep 20, 2012 at 7:56 PM, Bjorn Helgaas<bhelgaas@google.com> >> wrote: >>> >>> This is another thing I'm curious about. How do you handle this >>> situation today (before host bridge hot-add)? >>> >>> The DMAR I'm not so worried about because as far as I know, there's no >>> such thing as a DMAR that's discovered by PCI enumeration. We should >>> discover it via ACPI, and that can happen before we enumerate anything >>> behind a host bridge, so I don't really see any ordering problem >>> between the DMAR and the PCI devices that would use it. >> >> >> only need to have pci devices on that root bus scanned, and current intel >> iommu >> maintain one device scope to drhd with pointer to pci device... that >> need to be fixed >> too. >> > translation: you have an ACPI-DMAR setup bug? a drhd can have multiple > device > scopes, one of which can be "all devices under bus X uses this IOMMU". > If (dynamic) DMARs are scanned at root hot-plug time in ACPI hot-plug, > the proper dmar-init should be completed before any PCI devs are scanned > (and put into the proper iommu domain). if you can check for-iommu branch, http://git.kernel.org/?p=linux/kernel/git/yinghai/linux-yinghai.git;a=shortlog;h=refs/heads/for-iommu current implementation: after root bus plugged in, drivers/acpi/pci_root_hp.c::handle_root_bridge_insert() ==> acpi_bus_add ==> drivers/acpi/pci_root.c::acpi_pci_root_add ==> arch/x86/pci/acpi.c::pci_acpi_scan_root so all pci devices get scanned. later handl_root_bridge_insert ==>acpi_bus_add ==> drivers/acpi/pci_root.c::acpi_pci_root_start==> pcibios_reserouce_survey_bus/pci_assign_unassigned_bus_resource and call iommu acpi_pci_driver .add aka drivers/iommu/dmar.c::acpi_pci_iommu_add ==> register_iommu ==> handle_iommu_add ==> dmar_parse_dev ==> dmar_parse_dev_scope ==> dmar_parse_one_dev_scope that dmar_parse_one_dev_scope will find pci device pointer and put it back into dmaru device pointer array. in the boot path, it the same, dev_scope is parsed later after pci devices get scanned. We really should remove that cache array for device pointer... -Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c index aeccc91..b0fe771 100644 --- a/drivers/pci/iov.c +++ b/drivers/pci/iov.c @@ -152,15 +152,11 @@ failed1: static void virtfn_remove(struct pci_dev *dev, int id, int reset) { char buf[VIRTFN_ID_LEN]; - struct pci_bus *bus; struct pci_dev *virtfn; struct pci_sriov *iov = dev->sriov; - bus = pci_find_bus(pci_domain_nr(dev->bus), virtfn_bus(dev, id)); - if (!bus) - return; - - virtfn = pci_get_slot(bus, virtfn_devfn(dev, id)); + virtfn = pci_get_domain_bus_and_slot(pci_domain_nr(dev->bus), + virtfn_bus(dev, id), virtfn_devfn(dev, id)); if (!virtfn) return;
Following code has a race window between pci_find_bus() and pci_get_slot() if PCI hotplug operation happens between them which removes the pci_bus. So use PCI hotplug safe interface pci_get_domain_bus_and_slot() instead, which also reduces code complexity. struct pci_bus *pci_bus = pci_find_bus(domain, busno); struct pci_dev *pci_dev = pci_get_slot(pci_bus, devfn); Signed-off-by: Jiang Liu <jiang.liu@huawei.com> --- drivers/pci/iov.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-)