Message ID: 20190620051333.2235-3-drake@endlessm.com (mailing list archive)
State: New, archived
Delegated to: Bjorn Helgaas
Series: Support Intel AHCI remapped NVMe devices
Please give up on this route.  We will not accept messing up the NVMe
driver for Intel's fucked up chipsets, which are so bad that even they
are not allowed to talk about them anymore.

The Linux NVMe driver will deal with NVMe as specified plus whatever
minor tweaks we'll need for small bugs. Hiding it behind an AHCI
device is completely out of scope and will not be accepted.
On Thu, Jun 20, 2019 at 2:11 PM Christoph Hellwig <hch@lst.de> wrote:
> The Linux NVMe driver will deal with NVMe as specified plus whatever
> minor tweaks we'll need for small bugs. Hiding it behind an AHCI
> device is completely out of scope and will not be accepted.

Do you have any new suggestions for alternative ways we can implement
support for this storage configuration?

I tried to follow your earlier suggestion of faking a PCI bus here
(or let me know if you had something different in mind...):
https://marc.info/?l=linux-pci&m=156015271021614&w=2
but it looks like that is not going to fly either :(

Daniel
On Thu, Jun 20, 2019 at 04:11:26PM +0800, Daniel Drake wrote:
> On Thu, Jun 20, 2019 at 2:11 PM Christoph Hellwig <hch@lst.de> wrote:
> > The Linux NVMe driver will deal with NVMe as specified plus whatever
> > minor tweaks we'll need for small bugs. Hiding it behind an AHCI
> > device is completely out of scope and will not be accepted.
>
> Do you have any new suggestions for alternative ways we can implement
> support for this storage configuration?

IFF we want to support it, it has to be done at the PCIe layer.  But
even that will require actual documentation and support from Intel.

If Intel still believes this scheme is their magic secret to control
the NVMe market and give themselves an unfair advantage over their
competitors, there is not much we can do.
On Mon, 2019-06-24 at 08:16 +0200, Christoph Hellwig wrote:
> On Thu, Jun 20, 2019 at 04:11:26PM +0800, Daniel Drake wrote:
> > On Thu, Jun 20, 2019 at 2:11 PM Christoph Hellwig <hch@lst.de> wrote:
> > > The Linux NVMe driver will deal with NVMe as specified plus whatever
> > > minor tweaks we'll need for small bugs. Hiding it behind an AHCI
> > > device is completely out of scope and will not be accepted.
> >
> > Do you have any new suggestions for alternative ways we can implement
> > support for this storage configuration?
>
> IFF we want to support it, it has to be done at the PCIe layer.  But
> even that will require actual documentation and support from Intel.
>
> If Intel still believes this scheme is their magic secret to control
> the NVMe market and give themselves an unfair advantage over their
> competitors, there is not much we can do.

At the very least, the switch to make it normal again shouldn't be in
the BIOS settings where it requires manual interaction, but should be
changeable at run time by the operating system.

Intel are consistently failing to learn the "firmware exists to boot
the OS and get out of the way" lesson.  There are cases like thermal
management which sometimes make for valid exceptions, of course.  This
isn't one of them.
On Mon, Jun 24, 2019 at 2:16 PM Christoph Hellwig <hch@lst.de> wrote:
> IFF we want to support it, it has to be done at the PCIe layer.  But
> even that will require actual documentation and support from Intel.
>
> If Intel still believes this scheme is their magic secret to control
> the NVMe market and give themselves an unfair advantage over their
> competitors, there is not much we can do.

Since the 2016 discussion, more documentation has been published:
https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/300-series-chipset-pch-datasheet-vol-2.pdf

Chapter 15 is entirely new, and section 15.2 considerably clarifies the
magic registers in the AHCI BAR.  I have used it in these patches to
clean up the code and add documentation in the header (see patch 1 in
this series, ahci-remap.h).

I believe there's room for further improvement in the docs here, but it
would be nice to know what you see as the blocking questions or
documentation gaps that would prevent us from continuing to develop the
fake PCI bridge approach (https://marc.info/?l=linux-pci&m=156015271021614&w=2).
We are going to try to push Intel on this via other channels to see if
we can get a contact to help us, so it would be useful if I could
include a concrete list of what we need.

Bearing in mind that we've already been told that the NVMe device
config space is inaccessible, and the new docs show exactly how the
BIOS enforces such inaccessibility during early boot, the remaining
points you mentioned recently were:

b) reset handling, including PCI device removal as the last escalation step
c) SR-IOV VFs and their management
d) power management

Are there other blocking questions you would require answers to?

Thanks,
Daniel
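For context, a rough, self-contained sketch of the discovery that those
AHCI-BAR remap registers allow is shown below.  The offsets mirror the
section 15.2 layout as captured by the ahci-remap.h header referenced
above; the function name and error handling are invented for
illustration and this is not code from the series.

#include <linux/io.h>
#include <linux/pci.h>
#include <linux/pci_ids.h>
#include <linux/sizes.h>

#define AHCI_REMAP_CAP     0x800    /* bitmap of populated remap slots */
#define AHCI_REMAP_N_DCC   0x880    /* per-slot device class code, 0x80 stride */
#define AHCI_REMAP_N_OFF   SZ_16K   /* first remapped register block */
#define AHCI_REMAP_N_SIZE  SZ_16K   /* stride between remapped blocks */
#define AHCI_MAX_REMAP     3

/* Count NVMe controllers hidden behind an AHCI controller's ABAR. */
static int sketch_count_remapped_nvme(struct pci_dev *ahci_pdev)
{
        void __iomem *mmio;
        u32 cap;
        int i, count = 0;

        /* The remap registers live in the AHCI controller's BAR 5 (ABAR). */
        mmio = pci_iomap(ahci_pdev, 5, 0);
        if (!mmio)
                return -ENOMEM;

        cap = readl(mmio + AHCI_REMAP_CAP);
        for (i = 0; i < AHCI_MAX_REMAP; i++) {
                if (!(cap & (1u << i)))
                        continue;
                /* Only count slots whose class code says NVMe (0x010802). */
                if (readl(mmio + AHCI_REMAP_N_DCC + i * 0x80) !=
                    PCI_CLASS_STORAGE_EXPRESS)
                        continue;
                pci_info(ahci_pdev, "remapped NVMe registers at ABAR + %#x\n",
                         AHCI_REMAP_N_OFF + i * AHCI_REMAP_N_SIZE);
                count++;
        }

        pci_iounmap(ahci_pdev, mmio);
        return count;
}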
On Tue, Jun 25, 2019 at 11:51:28AM +0800, Daniel Drake wrote:
> Bearing in mind that we've already been told that the NVMe device
> config space is inaccessible, and the new docs show exactly how the
> BIOS enforces such inaccessibility during early boot, the remaining
> points you mentioned recently were:

If we can't access the config space, we unfortunately can't support
this scheme at all, as it invalidates all our quirks handling.
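To make the dependence concrete: the quirk machinery keys
platform-specific workarounds off identifiers that only exist in config
space.  The sketch below is hypothetical; the helper name and ID values
are invented, but the pattern mirrors check_vendor_combination_bug() in
drivers/nvme/host/pci.c, and none of it can work if the controller's
config space is hidden.

#include <linux/pci.h>

static unsigned long sketch_nvme_board_quirks(struct pci_dev *pdev)
{
        unsigned long quirks = 0;
        u16 subvendor, subdevice;

        /* Subsystem IDs sit at config-space offsets 0x2c and 0x2e. */
        pci_read_config_word(pdev, PCI_SUBSYSTEM_VENDOR_ID, &subvendor);
        pci_read_config_word(pdev, PCI_SUBSYSTEM_ID, &subdevice);

        /* A board-specific workaround keyed off those IDs (invented values). */
        if (pdev->vendor == 0x144d && subvendor == 0x1028 && subdevice == 0x07b7)
                quirks |= 1UL << 0;     /* e.g. avoid the deepest power state */

        return quirks;
}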
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 524d6bd6d095..42990b93349d 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1108,7 +1108,7 @@ static int nvme_poll(struct blk_mq_hw_ctx *hctx)
 	return found;
 }
 
-static void nvme_pci_submit_async_event(struct nvme_ctrl *ctrl)
+static void nvme_mmio_submit_async_event(struct nvme_ctrl *ctrl)
 {
 	struct nvme_dev *dev = to_nvme_dev(ctrl);
 	struct nvme_queue *nvmeq = &dev->queues[0];
@@ -2448,7 +2448,7 @@ static void nvme_release_prp_pools(struct nvme_dev *dev)
 	dma_pool_destroy(dev->prp_small_pool);
 }
 
-static void nvme_pci_free_ctrl(struct nvme_ctrl *ctrl)
+static void nvme_mmio_free_ctrl(struct nvme_ctrl *ctrl)
 {
 	struct nvme_dev *dev = to_nvme_dev(ctrl);
 
@@ -2610,42 +2610,42 @@ static void nvme_remove_dead_ctrl_work(struct work_struct *work)
 	nvme_put_ctrl(&dev->ctrl);
 }
 
-static int nvme_pci_reg_read32(struct nvme_ctrl *ctrl, u32 off, u32 *val)
+static int nvme_mmio_reg_read32(struct nvme_ctrl *ctrl, u32 off, u32 *val)
 {
 	*val = readl(to_nvme_dev(ctrl)->bar + off);
 	return 0;
 }
 
-static int nvme_pci_reg_write32(struct nvme_ctrl *ctrl, u32 off, u32 val)
+static int nvme_mmio_reg_write32(struct nvme_ctrl *ctrl, u32 off, u32 val)
 {
 	writel(val, to_nvme_dev(ctrl)->bar + off);
 	return 0;
 }
 
-static int nvme_pci_reg_read64(struct nvme_ctrl *ctrl, u32 off, u64 *val)
+static int nvme_mmio_reg_read64(struct nvme_ctrl *ctrl, u32 off, u64 *val)
 {
 	*val = readq(to_nvme_dev(ctrl)->bar + off);
 	return 0;
 }
 
-static int nvme_pci_get_address(struct nvme_ctrl *ctrl, char *buf, int size)
+static int nvme_mmio_get_address(struct nvme_ctrl *ctrl, char *buf, int size)
 {
 	struct pci_dev *pdev = to_pci_dev(to_nvme_dev(ctrl)->dev);
 
 	return snprintf(buf, size, "%s", dev_name(&pdev->dev));
 }
 
-static const struct nvme_ctrl_ops nvme_pci_ctrl_ops = {
+static const struct nvme_ctrl_ops nvme_mmio_ctrl_ops = {
 	.name			= "pcie",
 	.module			= THIS_MODULE,
 	.flags			= NVME_F_METADATA_SUPPORTED |
 				  NVME_F_PCI_P2PDMA,
-	.reg_read32		= nvme_pci_reg_read32,
-	.reg_write32		= nvme_pci_reg_write32,
-	.reg_read64		= nvme_pci_reg_read64,
-	.free_ctrl		= nvme_pci_free_ctrl,
-	.submit_async_event	= nvme_pci_submit_async_event,
-	.get_address		= nvme_pci_get_address,
+	.reg_read32		= nvme_mmio_reg_read32,
+	.reg_write32		= nvme_mmio_reg_write32,
+	.reg_read64		= nvme_mmio_reg_read64,
+	.free_ctrl		= nvme_mmio_free_ctrl,
+	.submit_async_event	= nvme_mmio_submit_async_event,
+	.get_address		= nvme_mmio_get_address,
 };
 
 static int nvme_dev_map(struct nvme_dev *dev)
@@ -2758,7 +2758,7 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		goto release_pools;
 	}
 
-	result = nvme_init_ctrl(&dev->ctrl, &pdev->dev, &nvme_pci_ctrl_ops,
+	result = nvme_init_ctrl(&dev->ctrl, &pdev->dev, &nvme_mmio_ctrl_ops,
 			quirks);
 	if (result)
 		goto release_mempool;
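The rename above is mechanical, but it is what would let a non-PCI probe
path reuse the MMIO register accessors.  The sketch below is a
hypothetical illustration of that reuse: nvme_mmio_probe_platform() and
its wiring are invented here, and PCI-specific callbacks such as
.get_address would still need non-PCI variants.

/*
 * Invented illustration only: a probe path that gets its registers from
 * somewhere other than a PCI BAR (e.g. a window carved out of the AHCI
 * BAR) handing the renamed ops to the NVMe core.
 */
static int nvme_mmio_probe_platform(struct device *dev, void __iomem *regs,
				    struct nvme_dev *ndev)
{
	ndev->dev = dev;
	ndev->bar = regs;	/* NVMe registers from the remapped window */

	/* Reuse the same reg_read/reg_write accessors as the PCI path. */
	return nvme_init_ctrl(&ndev->ctrl, dev, &nvme_mmio_ctrl_ops, 0);
}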