diff mbox series

[v9,03/13] PCI: Add partial-BAR devres support

Message ID 20240613115032.29098-4-pstanner@redhat.com (mailing list archive)
State Accepted
Commit bb4f3f2f3c37a32a55f2686ef1f41d9ee09a2121
Delegated to: Bjorn Helgaas
Headers show
Series Make PCI's devres API more consistent | expand

Commit Message

Philipp Stanner June 13, 2024, 11:50 a.m. UTC
With the current PCI devres API implementing a managed version of
pci_iomap_range() is impossible.

Furthermore, the PCI devres API currently is inconsistent and
complicated. This is in large part due to the fact that there are hybrid
functions which are only sometimes managed via devres, and functions
IO-mapping and requesting several BARs at once and returning mappings
through a separately administrated table.

This table's indexing mechanism does not support partial-BAR mappings.

Another notable problem is that there are no separate managed
counterparts for region-request functions such as pci_request_region(),
as they exist for other PCI functions (e.g., pci_iomap() <->
pcim_iomap()). Instead, functions based on __pci_request_region() change
their internal behavior and suddenly become managed functions when
pcim_enable_device() instead of pci_enable_device() is used.

This API is hard to understand and potentially bug-provoking. Hence, it
should be made more consistent.

This patch adds the necessary infrastructure for partial-BAR mappings
managed with devres. That infrastructure also serves as a ground layer
for significantly simplifying the PCI devres API in subsequent patches
which can then cleanly separate managed and unmanaged API.

When having the long term goal of providing always managed functions
prefixed with "pcim_" and never managed functions prefixed with "pci_"
and, thus, separating managed and unmanaged APIs cleanly, new PCI devres
infrastructure cannot use __pci_request_region() and its wrappers since
those would then again interact with PCI devres and, consequently,
prevent the managed nature from being removed from the pci_* functions
in the first place. Thus, it's necessary to provide an alternative to
__pci_request_region().

This patch addresses the following problems of the PCI devres API:

  a) There is no PCI devres infrastructure on which a managed counter
     part to pci_iomap_range() could be based on.

  b) The vast majority of the users of plural functions such as
     pcim_iomap_regions() only ever sets a single bit in the bit mask,
     consequently making them singular functions anyways.

  c) region-request functions being sometimes managed and sometimes not
     is bug-provoking. pcim_* functions should always be managed, pci_*
     functions never.

Add a new PCI device resource, pcim_addr_devres, that serves to
encapsulate all device resource types related to region requests and
IO-mappings since those are very frequently created together.

Add a set of alternatives cleanly separated from the hybrid mechanism in
__pci_request_region() and its respective wrappers:
  - __pcim_request_region_range()
  - __pcim_release_region_range()
  - __pcim_request_region()
  - __pcim_release_region()

Add the following PCI-internal devres functions based on the above:
  - pcim_iomap_region()
  - pcim_iounmap_region()
  - _pcim_request_region()
  - pcim_request_region()
  - pcim_release_region()
  - pcim_request_all_regions()
  - pcim_release_all_regions()

Add new needed helper pcim_remove_bar_from_legacy_table().

Rework the following public interfaces using the new infrastructure
listed above:
  - pcim_iomap_release()
  - pcim_iomap()
  - pcim_iounmap()
  - pcim_iomap_regions()
  - pcim_iomap_regions_request_all()
  - pcim_iounmap_regions()

Update API documentation.

Link: https://lore.kernel.org/r/20240605081605.18769-5-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/devres.c | 608 ++++++++++++++++++++++++++++++++++++++-----
 drivers/pci/pci.c    |  22 ++
 drivers/pci/pci.h    |   5 +
 3 files changed, 568 insertions(+), 67 deletions(-)

Comments

Bjorn Helgaas June 13, 2024, 9:28 p.m. UTC | #1
On Thu, Jun 13, 2024 at 01:50:16PM +0200, Philipp Stanner wrote:
> With the current PCI devres API implementing a managed version of
> pci_iomap_range() is impossible.
> 
> Furthermore, the PCI devres API currently is inconsistent and
> complicated. This is in large part due to the fact that there are hybrid
> functions which are only sometimes managed via devres, and functions
> IO-mapping and requesting several BARs at once and returning mappings
> through a separately administrated table.
> 
> This table's indexing mechanism does not support partial-BAR mappings.
> 
> Another notable problem is that there are no separate managed
> counterparts for region-request functions such as pci_request_region(),
> as they exist for other PCI functions (e.g., pci_iomap() <->
> pcim_iomap()). Instead, functions based on __pci_request_region() change
> their internal behavior and suddenly become managed functions when
> pcim_enable_device() instead of pci_enable_device() is used.

The hybrid thing is certainly a problem, but does this patch address
it?  I don't see that it does (other than adding comments in
__pci_request_region() and pci_release_region()), but maybe I missed
it.

Correct me if I'm wrong, but I don't think this patch makes any
user-visible changes.

I'm proposing this:

  PCI: Add managed partial-BAR request and map infrastructure

  The pcim_iomap_devres table tracks entire-BAR mappings, so we can't use it
  to build a managed version of pci_iomap_range(), which maps partial BARs.

  Add struct pcim_addr_devres, which can track request and mapping of both
  entire BARs and partial BARs.

  Add the following internal devres functions based on struct
  pcim_addr_devres:

    pcim_iomap_region()               # request & map entire BAR
    pcim_iounmap_region()             # unmap & release entire BAR
    pcim_request_region()             # request entire BAR
    pcim_release_region()             # release entire BAR
    pcim_request_all_regions()        # request all entire BARs
    pcim_release_all_regions()        # release all entire BARs

  Rework the following public interfaces using the new infrastructure
  listed above:

    pcim_iomap()                      # map partial BAR
    pcim_iounmap()                    # unmap partial BAR
    pcim_iomap_regions()              # request & map specified BARs
    pcim_iomap_regions_request_all()  # request all BARs, map specified BARs
    pcim_iounmap_regions()            # unmap & release specified BARs


> This API is hard to understand and potentially bug-provoking. Hence, it
> should be made more consistent.
> 
> This patch adds the necessary infrastructure for partial-BAR mappings
> managed with devres. That infrastructure also serves as a ground layer
> for significantly simplifying the PCI devres API in subsequent patches
> which can then cleanly separate managed and unmanaged API.
> 
> When having the long term goal of providing always managed functions
> prefixed with "pcim_" and never managed functions prefixed with "pci_"
> and, thus, separating managed and unmanaged APIs cleanly, new PCI devres
> infrastructure cannot use __pci_request_region() and its wrappers since
> those would then again interact with PCI devres and, consequently,
> prevent the managed nature from being removed from the pci_* functions
> in the first place. Thus, it's necessary to provide an alternative to
> __pci_request_region().
> 
> This patch addresses the following problems of the PCI devres API:
> 
>   a) There is no PCI devres infrastructure on which a managed counter
>      part to pci_iomap_range() could be based on.
> 
>   b) The vast majority of the users of plural functions such as
>      pcim_iomap_regions() only ever sets a single bit in the bit mask,
>      consequently making them singular functions anyways.
> 
>   c) region-request functions being sometimes managed and sometimes not
>      is bug-provoking. pcim_* functions should always be managed, pci_*
>      functions never.
> 
> Add a new PCI device resource, pcim_addr_devres, that serves to
> encapsulate all device resource types related to region requests and
> IO-mappings since those are very frequently created together.
> 
> Add a set of alternatives cleanly separated from the hybrid mechanism in
> __pci_request_region() and its respective wrappers:
>   - __pcim_request_region_range()
>   - __pcim_release_region_range()
>   - __pcim_request_region()
>   - __pcim_release_region()
> 
> Add the following PCI-internal devres functions based on the above:
>   - pcim_iomap_region()
>   - pcim_iounmap_region()
>   - _pcim_request_region()
>   - pcim_request_region()
>   - pcim_release_region()
>   - pcim_request_all_regions()
>   - pcim_release_all_regions()
> 
> Add new needed helper pcim_remove_bar_from_legacy_table().
> 
> Rework the following public interfaces using the new infrastructure
> listed above:
>   - pcim_iomap_release()
>   - pcim_iomap()
>   - pcim_iounmap()
>   - pcim_iomap_regions()
>   - pcim_iomap_regions_request_all()
>   - pcim_iounmap_regions()
> 
> Update API documentation.
> 
> Link: https://lore.kernel.org/r/20240605081605.18769-5-pstanner@redhat.com
> Signed-off-by: Philipp Stanner <pstanner@redhat.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  drivers/pci/devres.c | 608 ++++++++++++++++++++++++++++++++++++++-----
>  drivers/pci/pci.c    |  22 ++
>  drivers/pci/pci.h    |   5 +
>  3 files changed, 568 insertions(+), 67 deletions(-)
> 
> diff --git a/drivers/pci/devres.c b/drivers/pci/devres.c
> index 845d6fab0ce7..cf2c11b54ca6 100644
> --- a/drivers/pci/devres.c
> +++ b/drivers/pci/devres.c
> @@ -4,14 +4,243 @@
>  #include "pci.h"
>  
>  /*
> - * PCI iomap devres
> + * On the state of PCI's devres implementation:
> + *
> + * The older devres API for PCI has two significant problems:
> + *
> + * 1. It is very strongly tied to the statically allocated mapping table in
> + *    struct pcim_iomap_devres below. This is mostly solved in the sense of the
> + *    pcim_ functions in this file providing things like ranged mapping by
> + *    bypassing this table, wheras the functions that were present in the old
> + *    API still enter the mapping addresses into the table for users of the old
> + *    API.
> + *
> + * 2. The region-request-functions in pci.c do become managed IF the device has
> + *    been enabled with pcim_enable_device() instead of pci_enable_device().
> + *    This resulted in the API becoming inconsistent: Some functions have an
> + *    obviously managed counter-part (e.g., pci_iomap() <-> pcim_iomap()),
> + *    whereas some don't and are never managed, while others don't and are
> + *    _sometimes_ managed (e.g. pci_request_region()).
> + *
> + *    Consequently, in the new API, region requests performed by the pcim_
> + *    functions are automatically cleaned up through the devres callback
> + *    pcim_addr_resource_release(), while requests performed by
> + *    pcim_enable_device() + pci_*region*() are automatically cleaned up
> + *    through the for-loop in pcim_release().
> + *
> + * TODO 1:
> + * Remove the legacy table entirely once all calls to pcim_iomap_table() in
> + * the kernel have been removed.
> + *
> + * TODO 2:
> + * Port everyone calling pcim_enable_device() + pci_*region*() to using the
> + * pcim_ functions. Then, remove all devres functionality from pci_*region*()
> + * functions and remove the associated cleanups described above in point #2.
>   */
> -#define PCIM_IOMAP_MAX	PCI_STD_NUM_BARS
>  
> +/*
> + * Legacy struct storing addresses to whole mapped BARs.
> + */
>  struct pcim_iomap_devres {
> -	void __iomem *table[PCIM_IOMAP_MAX];
> +	void __iomem *table[PCI_STD_NUM_BARS];
> +};
> +
> +enum pcim_addr_devres_type {
> +	/* Default initializer. */
> +	PCIM_ADDR_DEVRES_TYPE_INVALID,
> +
> +	/* A requested region spanning an entire BAR. */
> +	PCIM_ADDR_DEVRES_TYPE_REGION,
> +
> +	/*
> +	 * A requested region spanning an entire BAR, and a mapping for
> +	 * the entire BAR.
> +	 */
> +	PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING,
> +
> +	/*
> +	 * A mapping within a BAR, either spanning the whole BAR or just a
> +	 * range.  Without a requested region.
> +	 */
> +	PCIM_ADDR_DEVRES_TYPE_MAPPING,
>  };
>  
> +/*
> + * This struct envelops IO or MEM addresses, i.e., mappings and region
> + * requests, because those are very frequently requested and released
> + * together.
> + */
> +struct pcim_addr_devres {
> +	enum pcim_addr_devres_type type;
> +	void __iomem *baseaddr;
> +	unsigned long offset;
> +	unsigned long len;
> +	short bar;
> +};
> +
> +static inline void pcim_addr_devres_clear(struct pcim_addr_devres *res)
> +{
> +	memset(res, 0, sizeof(*res));
> +	res->bar = -1;
> +}
> +
> +/*
> + * The following functions, __pcim_*_region*, exist as counterparts to the
> + * versions from pci.c - which, unfortunately, can be in "hybrid mode", i.e.,
> + * sometimes managed, sometimes not.
> + *
> + * To separate the APIs cleanly, we define our own, simplified versions here.
> + */
> +
> +/**
> + * __pcim_request_region_range - Request a ranged region
> + * @pdev: PCI device the region belongs to
> + * @bar: BAR the range is within
> + * @offset: offset from the BAR's start address
> + * @maxlen: length in bytes, beginning at @offset
> + * @name: name associated with the request
> + * @req_flags: flags for the request, e.g., for kernel-exclusive requests
> + *
> + * Returns: 0 on success, a negative error code on failure.
> + *
> + * Request a range within a device's PCI BAR.  Sanity check the input.
> + */
> +static int __pcim_request_region_range(struct pci_dev *pdev, int bar,
> +		unsigned long offset, unsigned long maxlen,
> +		const char *name, int req_flags)
> +{
> +	resource_size_t start = pci_resource_start(pdev, bar);
> +	resource_size_t len = pci_resource_len(pdev, bar);
> +	unsigned long dev_flags = pci_resource_flags(pdev, bar);
> +
> +	if (start == 0 || len == 0) /* Unused BAR. */
> +		return 0;
> +	if (len <= offset)
> +		return  -EINVAL;
> +
> +	start += offset;
> +	len -= offset;
> +
> +	if (len > maxlen && maxlen != 0)
> +		len = maxlen;
> +
> +	if (dev_flags & IORESOURCE_IO) {
> +		if (!request_region(start, len, name))
> +			return -EBUSY;
> +	} else if (dev_flags & IORESOURCE_MEM) {
> +		if (!__request_mem_region(start, len, name, req_flags))
> +			return -EBUSY;
> +	} else {
> +		/* That's not a device we can request anything on. */
> +		return -ENODEV;
> +	}
> +
> +	return 0;
> +}
> +
> +static void __pcim_release_region_range(struct pci_dev *pdev, int bar,
> +		unsigned long offset, unsigned long maxlen)
> +{
> +	resource_size_t start = pci_resource_start(pdev, bar);
> +	resource_size_t len = pci_resource_len(pdev, bar);
> +	unsigned long flags = pci_resource_flags(pdev, bar);
> +
> +	if (len <= offset || start == 0)
> +		return;
> +
> +	if (len == 0 || maxlen == 0) /* This an unused BAR. Do nothing. */
> +		return;
> +
> +	start += offset;
> +	len -= offset;
> +
> +	if (len > maxlen)
> +		len = maxlen;
> +
> +	if (flags & IORESOURCE_IO)
> +		release_region(start, len);
> +	else if (flags & IORESOURCE_MEM)
> +		release_mem_region(start, len);
> +}
> +
> +static int __pcim_request_region(struct pci_dev *pdev, int bar,
> +		const char *name, int flags)
> +{
> +	unsigned long offset = 0;
> +	unsigned long len = pci_resource_len(pdev, bar);
> +
> +	return __pcim_request_region_range(pdev, bar, offset, len, name, flags);
> +}
> +
> +static void __pcim_release_region(struct pci_dev *pdev, int bar)
> +{
> +	unsigned long offset = 0;
> +	unsigned long len = pci_resource_len(pdev, bar);
> +
> +	__pcim_release_region_range(pdev, bar, offset, len);
> +}
> +
> +static void pcim_addr_resource_release(struct device *dev, void *resource_raw)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct pcim_addr_devres *res = resource_raw;
> +
> +	switch (res->type) {
> +	case PCIM_ADDR_DEVRES_TYPE_REGION:
> +		__pcim_release_region(pdev, res->bar);
> +		break;
> +	case PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING:
> +		pci_iounmap(pdev, res->baseaddr);
> +		__pcim_release_region(pdev, res->bar);
> +		break;
> +	case PCIM_ADDR_DEVRES_TYPE_MAPPING:
> +		pci_iounmap(pdev, res->baseaddr);
> +		break;
> +	default:
> +		break;
> +	}
> +}
> +
> +static struct pcim_addr_devres *pcim_addr_devres_alloc(struct pci_dev *pdev)
> +{
> +	struct pcim_addr_devres *res;
> +
> +	res = devres_alloc_node(pcim_addr_resource_release, sizeof(*res),
> +			GFP_KERNEL, dev_to_node(&pdev->dev));
> +	if (res)
> +		pcim_addr_devres_clear(res);
> +	return res;
> +}
> +
> +/* Just for consistency and readability. */
> +static inline void pcim_addr_devres_free(struct pcim_addr_devres *res)
> +{
> +	devres_free(res);
> +}
> +
> +/*
> + * Used by devres to identify a pcim_addr_devres.
> + */
> +static int pcim_addr_resources_match(struct device *dev, void *a_raw, void *b_raw)
> +{
> +	struct pcim_addr_devres *a, *b;
> +
> +	a = a_raw;
> +	b = b_raw;
> +
> +	if (a->type != b->type)
> +		return 0;
> +
> +	switch (a->type) {
> +	case PCIM_ADDR_DEVRES_TYPE_REGION:
> +	case PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING:
> +		return a->bar == b->bar;
> +	case PCIM_ADDR_DEVRES_TYPE_MAPPING:
> +		return a->baseaddr == b->baseaddr;
> +	default:
> +		return 0;
> +	}
> +}
>  
>  static void devm_pci_unmap_iospace(struct device *dev, void *ptr)
>  {
> @@ -92,8 +321,8 @@ EXPORT_SYMBOL(devm_pci_remap_cfgspace);
>   *
>   * All operations are managed and will be undone on driver detach.
>   *
> - * Returns a pointer to the remapped memory or an ERR_PTR() encoded error code
> - * on failure. Usage example::
> + * Returns a pointer to the remapped memory or an IOMEM_ERR_PTR() encoded error
> + * code on failure. Usage example::
>   *
>   *	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>   *	base = devm_pci_remap_cfg_resource(&pdev->dev, res);
> @@ -172,6 +401,17 @@ static void pcim_release(struct device *gendev, void *res)
>  	struct pci_devres *this = res;
>  	int i;
>  
> +	/*
> +	 * This is legacy code.
> +	 *
> +	 * All regions requested by a pcim_ function do get released through
> +	 * pcim_addr_resource_release(). Thanks to the hybrid nature of the pci_
> +	 * region-request functions, this for-loop has to release the regions
> +	 * if they have been requested by such a function.
> +	 *
> +	 * TODO: Remove this once all users of pcim_enable_device() PLUS
> +	 * pci-region-request-functions have been ported to pcim_ functions.
> +	 */
>  	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
>  		if (mask_contains_bar(this->region_mask, i))
>  			pci_release_region(dev, i);
> @@ -258,19 +498,21 @@ EXPORT_SYMBOL(pcim_pin_device);
>  
>  static void pcim_iomap_release(struct device *gendev, void *res)
>  {
> -	struct pci_dev *dev = to_pci_dev(gendev);
> -	struct pcim_iomap_devres *this = res;
> -	int i;
> -
> -	for (i = 0; i < PCIM_IOMAP_MAX; i++)
> -		if (this->table[i])
> -			pci_iounmap(dev, this->table[i]);
> +	/*
> +	 * Do nothing. This is legacy code.
> +	 *
> +	 * Cleanup of the mappings is now done directly through the callbacks
> +	 * registered when creating them.
> +	 */
>  }
>  
>  /**
>   * pcim_iomap_table - access iomap allocation table
>   * @pdev: PCI device to access iomap table for
>   *
> + * Returns:
> + * Const pointer to array of __iomem pointers on success, NULL on failure.
> + *
>   * Access iomap allocation table for @dev.  If iomap table doesn't
>   * exist and @pdev is managed, it will be allocated.  All iomaps
>   * recorded in the iomap table are automatically unmapped on driver
> @@ -343,30 +585,67 @@ static void pcim_remove_mapping_from_legacy_table(struct pci_dev *pdev,
>  	}
>  }
>  
> +/*
> + * The same as pcim_remove_mapping_from_legacy_table(), but identifies the
> + * mapping by its BAR index.
> + */
> +static void pcim_remove_bar_from_legacy_table(struct pci_dev *pdev, short bar)
> +{
> +	void __iomem **legacy_iomap_table;
> +
> +	if (bar >= PCI_STD_NUM_BARS)
> +		return;
> +
> +	legacy_iomap_table = (void __iomem **)pcim_iomap_table(pdev);
> +	if (!legacy_iomap_table)
> +		return;
> +
> +	legacy_iomap_table[bar] = NULL;
> +}
> +
>  /**
>   * pcim_iomap - Managed pcim_iomap()
>   * @pdev: PCI device to iomap for
>   * @bar: BAR to iomap
>   * @maxlen: Maximum length of iomap
>   *
> - * Managed pci_iomap().  Map is automatically unmapped on driver
> - * detach.
> + * Returns: __iomem pointer on success, NULL on failure.
> + *
> + * Managed pci_iomap(). Map is automatically unmapped on driver detach. If
> + * desired, unmap manually only with pcim_iounmap().
> + *
> + * This SHOULD only be used once per BAR.
> + *
> + * NOTE:
> + * Contrary to the other pcim_* functions, this function does not return an
> + * IOMEM_ERR_PTR() on failure, but a simple NULL. This is done for backwards
> + * compatibility.
>   */
>  void __iomem *pcim_iomap(struct pci_dev *pdev, int bar, unsigned long maxlen)
>  {
>  	void __iomem *mapping;
> +	struct pcim_addr_devres *res;
> +
> +	res = pcim_addr_devres_alloc(pdev);
> +	if (!res)
> +		return NULL;
> +	res->type = PCIM_ADDR_DEVRES_TYPE_MAPPING;
>  
>  	mapping = pci_iomap(pdev, bar, maxlen);
>  	if (!mapping)
> -		return NULL;
> +		goto err_iomap;
> +	res->baseaddr = mapping;
>  
>  	if (pcim_add_mapping_to_legacy_table(pdev, mapping, bar) != 0)
>  		goto err_table;
>  
> +	devres_add(&pdev->dev, res);
>  	return mapping;
>  
>  err_table:
>  	pci_iounmap(pdev, mapping);
> +err_iomap:
> +	pcim_addr_devres_free(res);
>  	return NULL;
>  }
>  EXPORT_SYMBOL(pcim_iomap);
> @@ -376,91 +655,291 @@ EXPORT_SYMBOL(pcim_iomap);
>   * @pdev: PCI device to iounmap for
>   * @addr: Address to unmap
>   *
> - * Managed pci_iounmap().  @addr must have been mapped using pcim_iomap().
> + * Managed pci_iounmap(). @addr must have been mapped using a pcim_* mapping
> + * function.
>   */
>  void pcim_iounmap(struct pci_dev *pdev, void __iomem *addr)
>  {
> -	pci_iounmap(pdev, addr);
> +	struct pcim_addr_devres res_searched;
> +
> +	pcim_addr_devres_clear(&res_searched);
> +	res_searched.type = PCIM_ADDR_DEVRES_TYPE_MAPPING;
> +	res_searched.baseaddr = addr;
> +
> +	if (devres_release(&pdev->dev, pcim_addr_resource_release,
> +			pcim_addr_resources_match, &res_searched) != 0) {
> +		/* Doesn't exist. User passed nonsense. */
> +		return;
> +	}
>  
>  	pcim_remove_mapping_from_legacy_table(pdev, addr);
>  }
>  EXPORT_SYMBOL(pcim_iounmap);
>  
> +/**
> + * pcim_iomap_region - Request and iomap a PCI BAR
> + * @pdev: PCI device to map IO resources for
> + * @bar: Index of a BAR to map
> + * @name: Name associated with the request
> + *
> + * Returns: __iomem pointer on success, an IOMEM_ERR_PTR on failure.
> + *
> + * Mapping and region will get automatically released on driver detach. If
> + * desired, release manually only with pcim_iounmap_region().
> + */
> +static void __iomem *pcim_iomap_region(struct pci_dev *pdev, int bar,
> +				       const char *name)
> +{
> +	int ret;
> +	struct pcim_addr_devres *res;
> +
> +	res = pcim_addr_devres_alloc(pdev);
> +	if (!res)
> +		return IOMEM_ERR_PTR(-ENOMEM);
> +
> +	res->type = PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING;
> +	res->bar = bar;
> +
> +	ret = __pcim_request_region(pdev, bar, name, 0);
> +	if (ret != 0)
> +		goto err_region;
> +
> +	res->baseaddr = pci_iomap(pdev, bar, 0);
> +	if (!res->baseaddr) {
> +		ret = -EINVAL;
> +		goto err_iomap;
> +	}
> +
> +	devres_add(&pdev->dev, res);
> +	return res->baseaddr;
> +
> +err_iomap:
> +	__pcim_release_region(pdev, bar);
> +err_region:
> +	pcim_addr_devres_free(res);
> +
> +	return IOMEM_ERR_PTR(ret);
> +}
> +
> +/**
> + * pcim_iounmap_region - Unmap and release a PCI BAR
> + * @pdev: PCI device to operate on
> + * @bar: Index of BAR to unmap and release
> + *
> + * Unmap a BAR and release its region manually. Only pass BARs that were
> + * previously mapped by pcim_iomap_region().
> + */
> +static void pcim_iounmap_region(struct pci_dev *pdev, int bar)
> +{
> +	struct pcim_addr_devres res_searched;
> +
> +	pcim_addr_devres_clear(&res_searched);
> +	res_searched.type = PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING;
> +	res_searched.bar = bar;
> +
> +	devres_release(&pdev->dev, pcim_addr_resource_release,
> +			pcim_addr_resources_match, &res_searched);
> +}
> +
>  /**
>   * pcim_iomap_regions - Request and iomap PCI BARs
>   * @pdev: PCI device to map IO resources for
>   * @mask: Mask of BARs to request and iomap
> - * @name: Name used when requesting regions
> + * @name: Name associated with the requests
> + *
> + * Returns: 0 on success, negative error code on failure.
>   *
>   * Request and iomap regions specified by @mask.
>   */
>  int pcim_iomap_regions(struct pci_dev *pdev, int mask, const char *name)
>  {
> -	void __iomem * const *iomap;
> -	int i, rc;
> +	int ret;
> +	short bar;
> +	void __iomem *mapping;
>  
> -	iomap = pcim_iomap_table(pdev);
> -	if (!iomap)
> -		return -ENOMEM;
> +	for (bar = 0; bar < DEVICE_COUNT_RESOURCE; bar++) {
> +		if (!mask_contains_bar(mask, bar))
> +			continue;
>  
> -	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
> -		unsigned long len;
> +		mapping = pcim_iomap_region(pdev, bar, name);
> +		if (IS_ERR(mapping)) {
> +			ret = PTR_ERR(mapping);
> +			goto err;
> +		}
> +		ret = pcim_add_mapping_to_legacy_table(pdev, mapping, bar);
> +		if (ret != 0)
> +			goto err;
> +	}
>  
> -		if (!mask_contains_bar(mask, i))
> -			continue;
> +	return 0;
>  
> -		rc = -EINVAL;
> -		len = pci_resource_len(pdev, i);
> -		if (!len)
> -			goto err_inval;
> +err:
> +	while (--bar >= 0) {
> +		pcim_iounmap_region(pdev, bar);
> +		pcim_remove_bar_from_legacy_table(pdev, bar);
> +	}
>  
> -		rc = pci_request_region(pdev, i, name);
> -		if (rc)
> -			goto err_inval;
> +	return ret;
> +}
> +EXPORT_SYMBOL(pcim_iomap_regions);
>  
> -		rc = -ENOMEM;
> -		if (!pcim_iomap(pdev, i, 0))
> -			goto err_region;
> +static int _pcim_request_region(struct pci_dev *pdev, int bar, const char *name,
> +		int request_flags)
> +{
> +	int ret;
> +	struct pcim_addr_devres *res;
> +
> +	res = pcim_addr_devres_alloc(pdev);
> +	if (!res)
> +		return -ENOMEM;
> +	res->type = PCIM_ADDR_DEVRES_TYPE_REGION;
> +	res->bar = bar;
> +
> +	ret = __pcim_request_region(pdev, bar, name, request_flags);
> +	if (ret != 0) {
> +		pcim_addr_devres_free(res);
> +		return ret;
>  	}
>  
> +	devres_add(&pdev->dev, res);
>  	return 0;
> +}
>  
> - err_region:
> -	pci_release_region(pdev, i);
> - err_inval:
> -	while (--i >= 0) {
> -		if (!mask_contains_bar(mask, i))
> -			continue;
> -		pcim_iounmap(pdev, iomap[i]);
> -		pci_release_region(pdev, i);
> +/**
> + * pcim_request_region - Request a PCI BAR
> + * @pdev: PCI device to requestion region for
> + * @bar: Index of BAR to request
> + * @name: Name associated with the request
> + *
> + * Returns: 0 on success, a negative error code on failure.
> + *
> + * Request region specified by @bar.
> + *
> + * The region will automatically be released on driver detach. If desired,
> + * release manually only with pcim_release_region().
> + */
> +static int pcim_request_region(struct pci_dev *pdev, int bar, const char *name)
> +{
> +	return _pcim_request_region(pdev, bar, name, 0);
> +}
> +
> +/**
> + * pcim_release_region - Release a PCI BAR
> + * @pdev: PCI device to operate on
> + * @bar: Index of BAR to release
> + *
> + * Release a region manually that was previously requested by
> + * pcim_request_region().
> + */
> +static void pcim_release_region(struct pci_dev *pdev, int bar)
> +{
> +	struct pcim_addr_devres res_searched;
> +
> +	pcim_addr_devres_clear(&res_searched);
> +	res_searched.type = PCIM_ADDR_DEVRES_TYPE_REGION;
> +	res_searched.bar = bar;
> +
> +	devres_release(&pdev->dev, pcim_addr_resource_release,
> +			pcim_addr_resources_match, &res_searched);
> +}
> +
> +
> +/**
> + * pcim_release_all_regions - Release all regions of a PCI-device
> + * @pdev: the PCI device
> + *
> + * Release all regions previously requested through pcim_request_region()
> + * or pcim_request_all_regions().
> + *
> + * Can be called from any context, i.e., not necessarily as a counterpart to
> + * pcim_request_all_regions().
> + */
> +static void pcim_release_all_regions(struct pci_dev *pdev)
> +{
> +	short bar;
> +
> +	for (bar = 0; bar < PCI_STD_NUM_BARS; bar++)
> +		pcim_release_region(pdev, bar);
> +}
> +
> +/**
> + * pcim_request_all_regions - Request all regions
> + * @pdev: PCI device to map IO resources for
> + * @name: name associated with the request
> + *
> + * Returns: 0 on success, negative error code on failure.
> + *
> + * Requested regions will automatically be released at driver detach. If
> + * desired, release individual regions with pcim_release_region() or all of
> + * them at once with pcim_release_all_regions().
> + */
> +static int pcim_request_all_regions(struct pci_dev *pdev, const char *name)
> +{
> +	int ret;
> +	short bar;
> +
> +	for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
> +		ret = pcim_request_region(pdev, bar, name);
> +		if (ret != 0)
> +			goto err;
>  	}
>  
> -	return rc;
> +	return 0;
> +
> +err:
> +	pcim_release_all_regions(pdev);
> +
> +	return ret;
>  }
> -EXPORT_SYMBOL(pcim_iomap_regions);
>  
>  /**
>   * pcim_iomap_regions_request_all - Request all BARs and iomap specified ones
>   * @pdev: PCI device to map IO resources for
>   * @mask: Mask of BARs to iomap
> - * @name: Name used when requesting regions
> + * @name: Name associated with the requests
> + *
> + * Returns: 0 on success, negative error code on failure.
>   *
>   * Request all PCI BARs and iomap regions specified by @mask.
> + *
> + * To release these resources manually, call pcim_release_region() for the
> + * regions and pcim_iounmap() for the mappings.
>   */
>  int pcim_iomap_regions_request_all(struct pci_dev *pdev, int mask,
>  				   const char *name)
>  {
> -	int request_mask = ((1 << 6) - 1) & ~mask;
> -	int rc;
> +	short bar;
> +	int ret;
> +	void __iomem **legacy_iomap_table;
>  
> -	rc = pci_request_selected_regions(pdev, request_mask, name);
> -	if (rc)
> -		return rc;
> +	ret = pcim_request_all_regions(pdev, name);
> +	if (ret != 0)
> +		return ret;
>  
> -	rc = pcim_iomap_regions(pdev, mask, name);
> -	if (rc)
> -		pci_release_selected_regions(pdev, request_mask);
> -	return rc;
> +	for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
> +		if (!mask_contains_bar(mask, bar))
> +			continue;
> +		if (!pcim_iomap(pdev, bar, 0))
> +			goto err;
> +	}
> +
> +	return 0;
> +
> +err:
> +	/*
> +	 * If bar is larger than 0, then pcim_iomap() above has most likely
> +	 * failed because of -EINVAL. If it is equal 0, most likely the table
> +	 * couldn't be created, indicating -ENOMEM.
> +	 */
> +	ret = bar > 0 ? -EINVAL : -ENOMEM;
> +	legacy_iomap_table = (void __iomem **)pcim_iomap_table(pdev);
> +
> +	while (--bar >= 0)
> +		pcim_iounmap(pdev, legacy_iomap_table[bar]);
> +
> +	pcim_release_all_regions(pdev);
> +
> +	return ret;
>  }
>  EXPORT_SYMBOL(pcim_iomap_regions_request_all);
>  
> @@ -473,19 +952,14 @@ EXPORT_SYMBOL(pcim_iomap_regions_request_all);
>   */
>  void pcim_iounmap_regions(struct pci_dev *pdev, int mask)
>  {
> -	void __iomem * const *iomap;
> -	int i;
> +	short bar;
>  
> -	iomap = pcim_iomap_table(pdev);
> -	if (!iomap)
> -		return;
> -
> -	for (i = 0; i < PCIM_IOMAP_MAX; i++) {
> -		if (!mask_contains_bar(mask, i))
> +	for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
> +		if (!mask_contains_bar(mask, bar))
>  			continue;
>  
> -		pcim_iounmap(pdev, iomap[i]);
> -		pci_release_region(pdev, i);
> +		pcim_iounmap_region(pdev, bar);
> +		pcim_remove_bar_from_legacy_table(pdev, bar);
>  	}
>  }
>  EXPORT_SYMBOL(pcim_iounmap_regions);
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 59e0949fb079..d94445f5f882 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3883,6 +3883,17 @@ void pci_release_region(struct pci_dev *pdev, int bar)
>  		release_mem_region(pci_resource_start(pdev, bar),
>  				pci_resource_len(pdev, bar));
>  
> +	/*
> +	 * This devres utility makes this function sometimes managed
> +	 * (when pcim_enable_device() has been called before).
> +	 *
> +	 * This is bad because it conflicts with the pcim_ functions being
> +	 * exclusively responsible for managed PCI. Its "sometimes yes,
> +	 * sometimes no" nature can cause bugs.
> +	 *
> +	 * TODO: Remove this once all users that use pcim_enable_device() PLUS
> +	 * a region request function have been ported to using pcim_ functions.
> +	 */
>  	dr = find_pci_dr(pdev);
>  	if (dr)
>  		dr->region_mask &= ~(1 << bar);
> @@ -3927,6 +3938,17 @@ static int __pci_request_region(struct pci_dev *pdev, int bar,
>  			goto err_out;
>  	}
>  
> +	/*
> +	 * This devres utility makes this function sometimes managed
> +	 * (when pcim_enable_device() has been called before).
> +	 *
> +	 * This is bad because it conflicts with the pcim_ functions being
> +	 * exclusively responsible for managed pci. Its "sometimes yes,
> +	 * sometimes no" nature can cause bugs.
> +	 *
> +	 * TODO: Remove this once all users that use pcim_enable_device() PLUS
> +	 * a region request function have been ported to using pcim_ functions.
> +	 */
>  	dr = find_pci_dr(pdev);
>  	if (dr)
>  		dr->region_mask |= 1 << bar;
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index fd44565c4756..c09487f5550c 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -826,6 +826,11 @@ struct pci_devres {
>  	unsigned int orig_intx:1;
>  	unsigned int restore_intx:1;
>  	unsigned int mwi:1;
> +
> +	/*
> +	 * TODO: remove the region_mask once everyone calling
> +	 * pcim_enable_device() + pci_*region*() is ported to pcim_ functions.
> +	 */
>  	u32 region_mask;
>  };
>  
> -- 
> 2.45.0
>
Philipp Stanner June 14, 2024, 8:01 a.m. UTC | #2
On Thu, 2024-06-13 at 16:28 -0500, Bjorn Helgaas wrote:
> On Thu, Jun 13, 2024 at 01:50:16PM +0200, Philipp Stanner wrote:
> > With the current PCI devres API implementing a managed version of
> > pci_iomap_range() is impossible.
> > 
> > Furthermore, the PCI devres API currently is inconsistent and
> > complicated. This is in large part due to the fact that there are
> > hybrid
> > functions which are only sometimes managed via devres, and
> > functions
> > IO-mapping and requesting several BARs at once and returning
> > mappings
> > through a separately administrated table.
> > 
> > This table's indexing mechanism does not support partial-BAR
> > mappings.
> > 
> > Another notable problem is that there are no separate managed
> > counterparts for region-request functions such as
> > pci_request_region(),
> > as they exist for other PCI functions (e.g., pci_iomap() <->
> > pcim_iomap()). Instead, functions based on __pci_request_region()
> > change
> > their internal behavior and suddenly become managed functions when
> > pcim_enable_device() instead of pci_enable_device() is used.
> 
> The hybrid thing is certainly a problem, but does this patch address
> it?  I don't see that it does (other than adding comments in
> __pci_request_region() and pci_release_region()), but maybe I missed
> it.

This is just justification for why __pcim_request_region() etc. are
implemented. They bypass the hybrid nature  of __pci_request_region().
If the latter wouldn't have that behavior

> 
> Correct me if I'm wrong, but I don't think this patch makes any
> user-visible changes.

Except for deprecating that two functions and adding a new public one,
the entire series shouldn't make user-visible changes. That's the
point.

P.

> 
> I'm proposing this:
> 
>   PCI: Add managed partial-BAR request and map infrastructure
> 
>   The pcim_iomap_devres table tracks entire-BAR mappings, so we can't
> use it
>   to build a managed version of pci_iomap_range(), which maps partial
> BARs.
> 
>   Add struct pcim_addr_devres, which can track request and mapping of
> both
>   entire BARs and partial BARs.
> 
>   Add the following internal devres functions based on struct
>   pcim_addr_devres:
> 
>     pcim_iomap_region()               # request & map entire BAR
>     pcim_iounmap_region()             # unmap & release entire BAR
>     pcim_request_region()             # request entire BAR
>     pcim_release_region()             # release entire BAR
>     pcim_request_all_regions()        # request all entire BARs
>     pcim_release_all_regions()        # release all entire BARs
> 
>   Rework the following public interfaces using the new infrastructure
>   listed above:
> 
>     pcim_iomap()                      # map partial BAR
>     pcim_iounmap()                    # unmap partial BAR
>     pcim_iomap_regions()              # request & map specified BARs
>     pcim_iomap_regions_request_all()  # request all BARs, map
> specified BARs
>     pcim_iounmap_regions()            # unmap & release specified
> BARs
> 
> 
> > This API is hard to understand and potentially bug-provoking.
> > Hence, it
> > should be made more consistent.
> > 
> > This patch adds the necessary infrastructure for partial-BAR
> > mappings
> > managed with devres. That infrastructure also serves as a ground
> > layer
> > for significantly simplifying the PCI devres API in subsequent
> > patches
> > which can then cleanly separate managed and unmanaged API.
> > 
> > When having the long term goal of providing always managed
> > functions
> > prefixed with "pcim_" and never managed functions prefixed with
> > "pci_"
> > and, thus, separating managed and unmanaged APIs cleanly, new PCI
> > devres
> > infrastructure cannot use __pci_request_region() and its wrappers
> > since
> > those would then again interact with PCI devres and, consequently,
> > prevent the managed nature from being removed from the pci_*
> > functions
> > in the first place. Thus, it's necessary to provide an alternative
> > to
> > __pci_request_region().
> > 
> > This patch addresses the following problems of the PCI devres API:
> > 
> >   a) There is no PCI devres infrastructure on which a managed
> > counter
> >      part to pci_iomap_range() could be based on.
> > 
> >   b) The vast majority of the users of plural functions such as
> >      pcim_iomap_regions() only ever sets a single bit in the bit
> > mask,
> >      consequently making them singular functions anyways.
> > 
> >   c) region-request functions being sometimes managed and sometimes
> > not
> >      is bug-provoking. pcim_* functions should always be managed,
> > pci_*
> >      functions never.
> > 
> > Add a new PCI device resource, pcim_addr_devres, that serves to
> > encapsulate all device resource types related to region requests
> > and
> > IO-mappings since those are very frequently created together.
> > 
> > Add a set of alternatives cleanly separated from the hybrid
> > mechanism in
> > __pci_request_region() and its respective wrappers:
> >   - __pcim_request_region_range()
> >   - __pcim_release_region_range()
> >   - __pcim_request_region()
> >   - __pcim_release_region()
> > 
> > Add the following PCI-internal devres functions based on the above:
> >   - pcim_iomap_region()
> >   - pcim_iounmap_region()
> >   - _pcim_request_region()
> >   - pcim_request_region()
> >   - pcim_release_region()
> >   - pcim_request_all_regions()
> >   - pcim_release_all_regions()
> > 
> > Add new needed helper pcim_remove_bar_from_legacy_table().
> > 
> > Rework the following public interfaces using the new infrastructure
> > listed above:
> >   - pcim_iomap_release()
> >   - pcim_iomap()
> >   - pcim_iounmap()
> >   - pcim_iomap_regions()
> >   - pcim_iomap_regions_request_all()
> >   - pcim_iounmap_regions()
> > 
> > Update API documentation.
> > 
> > Link:
> > https://lore.kernel.org/r/20240605081605.18769-5-pstanner@redhat.com
> > Signed-off-by: Philipp Stanner <pstanner@redhat.com>
> > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > ---
> >  drivers/pci/devres.c | 608 ++++++++++++++++++++++++++++++++++++++-
> > ----
> >  drivers/pci/pci.c    |  22 ++
> >  drivers/pci/pci.h    |   5 +
> >  3 files changed, 568 insertions(+), 67 deletions(-)
> > 
> > diff --git a/drivers/pci/devres.c b/drivers/pci/devres.c
> > index 845d6fab0ce7..cf2c11b54ca6 100644
> > --- a/drivers/pci/devres.c
> > +++ b/drivers/pci/devres.c
> > @@ -4,14 +4,243 @@
> >  #include "pci.h"
> >  
> >  /*
> > - * PCI iomap devres
> > + * On the state of PCI's devres implementation:
> > + *
> > + * The older devres API for PCI has two significant problems:
> > + *
> > + * 1. It is very strongly tied to the statically allocated mapping
> > table in
> > + *    struct pcim_iomap_devres below. This is mostly solved in the
> > sense of the
> > + *    pcim_ functions in this file providing things like ranged
> > mapping by
> > + *    bypassing this table, wheras the functions that were present
> > in the old
> > + *    API still enter the mapping addresses into the table for
> > users of the old
> > + *    API.
> > + *
> > + * 2. The region-request-functions in pci.c do become managed IF
> > the device has
> > + *    been enabled with pcim_enable_device() instead of
> > pci_enable_device().
> > + *    This resulted in the API becoming inconsistent: Some
> > functions have an
> > + *    obviously managed counter-part (e.g., pci_iomap() <->
> > pcim_iomap()),
> > + *    whereas some don't and are never managed, while others don't
> > and are
> > + *    _sometimes_ managed (e.g. pci_request_region()).
> > + *
> > + *    Consequently, in the new API, region requests performed by
> > the pcim_
> > + *    functions are automatically cleaned up through the devres
> > callback
> > + *    pcim_addr_resource_release(), while requests performed by
> > + *    pcim_enable_device() + pci_*region*() are automatically
> > cleaned up
> > + *    through the for-loop in pcim_release().
> > + *
> > + * TODO 1:
> > + * Remove the legacy table entirely once all calls to
> > pcim_iomap_table() in
> > + * the kernel have been removed.
> > + *
> > + * TODO 2:
> > + * Port everyone calling pcim_enable_device() + pci_*region*() to
> > using the
> > + * pcim_ functions. Then, remove all devres functionality from
> > pci_*region*()
> > + * functions and remove the associated cleanups described above in
> > point #2.
> >   */
> > -#define PCIM_IOMAP_MAX PCI_STD_NUM_BARS
> >  
> > +/*
> > + * Legacy struct storing addresses to whole mapped BARs.
> > + */
> >  struct pcim_iomap_devres {
> > -       void __iomem *table[PCIM_IOMAP_MAX];
> > +       void __iomem *table[PCI_STD_NUM_BARS];
> > +};
> > +
> > +enum pcim_addr_devres_type {
> > +       /* Default initializer. */
> > +       PCIM_ADDR_DEVRES_TYPE_INVALID,
> > +
> > +       /* A requested region spanning an entire BAR. */
> > +       PCIM_ADDR_DEVRES_TYPE_REGION,
> > +
> > +       /*
> > +        * A requested region spanning an entire BAR, and a mapping
> > for
> > +        * the entire BAR.
> > +        */
> > +       PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING,
> > +
> > +       /*
> > +        * A mapping within a BAR, either spanning the whole BAR or
> > just a
> > +        * range.  Without a requested region.
> > +        */
> > +       PCIM_ADDR_DEVRES_TYPE_MAPPING,
> >  };
> >  
> > +/*
> > + * This struct envelops IO or MEM addresses, i.e., mappings and
> > region
> > + * requests, because those are very frequently requested and
> > released
> > + * together.
> > + */
> > +struct pcim_addr_devres {
> > +       enum pcim_addr_devres_type type;
> > +       void __iomem *baseaddr;
> > +       unsigned long offset;
> > +       unsigned long len;
> > +       short bar;
> > +};
> > +
> > +static inline void pcim_addr_devres_clear(struct pcim_addr_devres
> > *res)
> > +{
> > +       memset(res, 0, sizeof(*res));
> > +       res->bar = -1;
> > +}
> > +
> > +/*
> > + * The following functions, __pcim_*_region*, exist as
> > counterparts to the
> > + * versions from pci.c - which, unfortunately, can be in "hybrid
> > mode", i.e.,
> > + * sometimes managed, sometimes not.
> > + *
> > + * To separate the APIs cleanly, we define our own, simplified
> > versions here.
> > + */
> > +
> > +/**
> > + * __pcim_request_region_range - Request a ranged region
> > + * @pdev: PCI device the region belongs to
> > + * @bar: BAR the range is within
> > + * @offset: offset from the BAR's start address
> > + * @maxlen: length in bytes, beginning at @offset
> > + * @name: name associated with the request
> > + * @req_flags: flags for the request, e.g., for kernel-exclusive
> > requests
> > + *
> > + * Returns: 0 on success, a negative error code on failure.
> > + *
> > + * Request a range within a device's PCI BAR.  Sanity check the
> > input.
> > + */
> > +static int __pcim_request_region_range(struct pci_dev *pdev, int
> > bar,
> > +               unsigned long offset, unsigned long maxlen,
> > +               const char *name, int req_flags)
> > +{
> > +       resource_size_t start = pci_resource_start(pdev, bar);
> > +       resource_size_t len = pci_resource_len(pdev, bar);
> > +       unsigned long dev_flags = pci_resource_flags(pdev, bar);
> > +
> > +       if (start == 0 || len == 0) /* Unused BAR. */
> > +               return 0;
> > +       if (len <= offset)
> > +               return  -EINVAL;
> > +
> > +       start += offset;
> > +       len -= offset;
> > +
> > +       if (len > maxlen && maxlen != 0)
> > +               len = maxlen;
> > +
> > +       if (dev_flags & IORESOURCE_IO) {
> > +               if (!request_region(start, len, name))
> > +                       return -EBUSY;
> > +       } else if (dev_flags & IORESOURCE_MEM) {
> > +               if (!__request_mem_region(start, len, name,
> > req_flags))
> > +                       return -EBUSY;
> > +       } else {
> > +               /* That's not a device we can request anything on.
> > */
> > +               return -ENODEV;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static void __pcim_release_region_range(struct pci_dev *pdev, int
> > bar,
> > +               unsigned long offset, unsigned long maxlen)
> > +{
> > +       resource_size_t start = pci_resource_start(pdev, bar);
> > +       resource_size_t len = pci_resource_len(pdev, bar);
> > +       unsigned long flags = pci_resource_flags(pdev, bar);
> > +
> > +       if (len <= offset || start == 0)
> > +               return;
> > +
> > +       if (len == 0 || maxlen == 0) /* This an unused BAR. Do
> > nothing. */
> > +               return;
> > +
> > +       start += offset;
> > +       len -= offset;
> > +
> > +       if (len > maxlen)
> > +               len = maxlen;
> > +
> > +       if (flags & IORESOURCE_IO)
> > +               release_region(start, len);
> > +       else if (flags & IORESOURCE_MEM)
> > +               release_mem_region(start, len);
> > +}
> > +
> > +static int __pcim_request_region(struct pci_dev *pdev, int bar,
> > +               const char *name, int flags)
> > +{
> > +       unsigned long offset = 0;
> > +       unsigned long len = pci_resource_len(pdev, bar);
> > +
> > +       return __pcim_request_region_range(pdev, bar, offset, len,
> > name, flags);
> > +}
> > +
> > +static void __pcim_release_region(struct pci_dev *pdev, int bar)
> > +{
> > +       unsigned long offset = 0;
> > +       unsigned long len = pci_resource_len(pdev, bar);
> > +
> > +       __pcim_release_region_range(pdev, bar, offset, len);
> > +}
> > +
> > +static void pcim_addr_resource_release(struct device *dev, void
> > *resource_raw)
> > +{
> > +       struct pci_dev *pdev = to_pci_dev(dev);
> > +       struct pcim_addr_devres *res = resource_raw;
> > +
> > +       switch (res->type) {
> > +       case PCIM_ADDR_DEVRES_TYPE_REGION:
> > +               __pcim_release_region(pdev, res->bar);
> > +               break;
> > +       case PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING:
> > +               pci_iounmap(pdev, res->baseaddr);
> > +               __pcim_release_region(pdev, res->bar);
> > +               break;
> > +       case PCIM_ADDR_DEVRES_TYPE_MAPPING:
> > +               pci_iounmap(pdev, res->baseaddr);
> > +               break;
> > +       default:
> > +               break;
> > +       }
> > +}
> > +
> > +static struct pcim_addr_devres *pcim_addr_devres_alloc(struct
> > pci_dev *pdev)
> > +{
> > +       struct pcim_addr_devres *res;
> > +
> > +       res = devres_alloc_node(pcim_addr_resource_release,
> > sizeof(*res),
> > +                       GFP_KERNEL, dev_to_node(&pdev->dev));
> > +       if (res)
> > +               pcim_addr_devres_clear(res);
> > +       return res;
> > +}
> > +
> > +/* Just for consistency and readability. */
> > +static inline void pcim_addr_devres_free(struct pcim_addr_devres
> > *res)
> > +{
> > +       devres_free(res);
> > +}
> > +
> > +/*
> > + * Used by devres to identify a pcim_addr_devres.
> > + */
> > +static int pcim_addr_resources_match(struct device *dev, void
> > *a_raw, void *b_raw)
> > +{
> > +       struct pcim_addr_devres *a, *b;
> > +
> > +       a = a_raw;
> > +       b = b_raw;
> > +
> > +       if (a->type != b->type)
> > +               return 0;
> > +
> > +       switch (a->type) {
> > +       case PCIM_ADDR_DEVRES_TYPE_REGION:
> > +       case PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING:
> > +               return a->bar == b->bar;
> > +       case PCIM_ADDR_DEVRES_TYPE_MAPPING:
> > +               return a->baseaddr == b->baseaddr;
> > +       default:
> > +               return 0;
> > +       }
> > +}
> >  
> >  static void devm_pci_unmap_iospace(struct device *dev, void *ptr)
> >  {
> > @@ -92,8 +321,8 @@ EXPORT_SYMBOL(devm_pci_remap_cfgspace);
> >   *
> >   * All operations are managed and will be undone on driver detach.
> >   *
> > - * Returns a pointer to the remapped memory or an ERR_PTR()
> > encoded error code
> > - * on failure. Usage example::
> > + * Returns a pointer to the remapped memory or an IOMEM_ERR_PTR()
> > encoded error
> > + * code on failure. Usage example::
> >   *
> >   *     res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> >   *     base = devm_pci_remap_cfg_resource(&pdev->dev, res);
> > @@ -172,6 +401,17 @@ static void pcim_release(struct device
> > *gendev, void *res)
> >         struct pci_devres *this = res;
> >         int i;
> >  
> > +       /*
> > +        * This is legacy code.
> > +        *
> > +        * All regions requested by a pcim_ function do get
> > released through
> > +        * pcim_addr_resource_release(). Thanks to the hybrid
> > nature of the pci_
> > +        * region-request functions, this for-loop has to release
> > the regions
> > +        * if they have been requested by such a function.
> > +        *
> > +        * TODO: Remove this once all users of pcim_enable_device()
> > PLUS
> > +        * pci-region-request-functions have been ported to pcim_
> > functions.
> > +        */
> >         for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
> >                 if (mask_contains_bar(this->region_mask, i))
> >                         pci_release_region(dev, i);
> > @@ -258,19 +498,21 @@ EXPORT_SYMBOL(pcim_pin_device);
> >  
> >  static void pcim_iomap_release(struct device *gendev, void *res)
> >  {
> > -       struct pci_dev *dev = to_pci_dev(gendev);
> > -       struct pcim_iomap_devres *this = res;
> > -       int i;
> > -
> > -       for (i = 0; i < PCIM_IOMAP_MAX; i++)
> > -               if (this->table[i])
> > -                       pci_iounmap(dev, this->table[i]);
> > +       /*
> > +        * Do nothing. This is legacy code.
> > +        *
> > +        * Cleanup of the mappings is now done directly through the
> > callbacks
> > +        * registered when creating them.
> > +        */
> >  }
> >  
> >  /**
> >   * pcim_iomap_table - access iomap allocation table
> >   * @pdev: PCI device to access iomap table for
> >   *
> > + * Returns:
> > + * Const pointer to array of __iomem pointers on success, NULL on
> > failure.
> > + *
> >   * Access iomap allocation table for @dev.  If iomap table doesn't
> >   * exist and @pdev is managed, it will be allocated.  All iomaps
> >   * recorded in the iomap table are automatically unmapped on
> > driver
> > @@ -343,30 +585,67 @@ static void
> > pcim_remove_mapping_from_legacy_table(struct pci_dev *pdev,
> >         }
> >  }
> >  
> > +/*
> > + * The same as pcim_remove_mapping_from_legacy_table(), but
> > identifies the
> > + * mapping by its BAR index.
> > + */
> > +static void pcim_remove_bar_from_legacy_table(struct pci_dev
> > *pdev, short bar)
> > +{
> > +       void __iomem **legacy_iomap_table;
> > +
> > +       if (bar >= PCI_STD_NUM_BARS)
> > +               return;
> > +
> > +       legacy_iomap_table = (void __iomem
> > **)pcim_iomap_table(pdev);
> > +       if (!legacy_iomap_table)
> > +               return;
> > +
> > +       legacy_iomap_table[bar] = NULL;
> > +}
> > +
> >  /**
> >   * pcim_iomap - Managed pcim_iomap()
> >   * @pdev: PCI device to iomap for
> >   * @bar: BAR to iomap
> >   * @maxlen: Maximum length of iomap
> >   *
> > - * Managed pci_iomap().  Map is automatically unmapped on driver
> > - * detach.
> > + * Returns: __iomem pointer on success, NULL on failure.
> > + *
> > + * Managed pci_iomap(). Map is automatically unmapped on driver
> > detach. If
> > + * desired, unmap manually only with pcim_iounmap().
> > + *
> > + * This SHOULD only be used once per BAR.
> > + *
> > + * NOTE:
> > + * Contrary to the other pcim_* functions, this function does not
> > return an
> > + * IOMEM_ERR_PTR() on failure, but a simple NULL. This is done for
> > backwards
> > + * compatibility.
> >   */
> >  void __iomem *pcim_iomap(struct pci_dev *pdev, int bar, unsigned
> > long maxlen)
> >  {
> >         void __iomem *mapping;
> > +       struct pcim_addr_devres *res;
> > +
> > +       res = pcim_addr_devres_alloc(pdev);
> > +       if (!res)
> > +               return NULL;
> > +       res->type = PCIM_ADDR_DEVRES_TYPE_MAPPING;
> >  
> >         mapping = pci_iomap(pdev, bar, maxlen);
> >         if (!mapping)
> > -               return NULL;
> > +               goto err_iomap;
> > +       res->baseaddr = mapping;
> >  
> >         if (pcim_add_mapping_to_legacy_table(pdev, mapping, bar) !=
> > 0)
> >                 goto err_table;
> >  
> > +       devres_add(&pdev->dev, res);
> >         return mapping;
> >  
> >  err_table:
> >         pci_iounmap(pdev, mapping);
> > +err_iomap:
> > +       pcim_addr_devres_free(res);
> >         return NULL;
> >  }
> >  EXPORT_SYMBOL(pcim_iomap);
> > @@ -376,91 +655,291 @@ EXPORT_SYMBOL(pcim_iomap);
> >   * @pdev: PCI device to iounmap for
> >   * @addr: Address to unmap
> >   *
> > - * Managed pci_iounmap().  @addr must have been mapped using
> > pcim_iomap().
> > + * Managed pci_iounmap(). @addr must have been mapped using a
> > pcim_* mapping
> > + * function.
> >   */
> >  void pcim_iounmap(struct pci_dev *pdev, void __iomem *addr)
> >  {
> > -       pci_iounmap(pdev, addr);
> > +       struct pcim_addr_devres res_searched;
> > +
> > +       pcim_addr_devres_clear(&res_searched);
> > +       res_searched.type = PCIM_ADDR_DEVRES_TYPE_MAPPING;
> > +       res_searched.baseaddr = addr;
> > +
> > +       if (devres_release(&pdev->dev, pcim_addr_resource_release,
> > +                       pcim_addr_resources_match, &res_searched)
> > != 0) {
> > +               /* Doesn't exist. User passed nonsense. */
> > +               return;
> > +       }
> >  
> >         pcim_remove_mapping_from_legacy_table(pdev, addr);
> >  }
> >  EXPORT_SYMBOL(pcim_iounmap);
> >  
> > +/**
> > + * pcim_iomap_region - Request and iomap a PCI BAR
> > + * @pdev: PCI device to map IO resources for
> > + * @bar: Index of a BAR to map
> > + * @name: Name associated with the request
> > + *
> > + * Returns: __iomem pointer on success, an IOMEM_ERR_PTR on
> > failure.
> > + *
> > + * Mapping and region will get automatically released on driver
> > detach. If
> > + * desired, release manually only with pcim_iounmap_region().
> > + */
> > +static void __iomem *pcim_iomap_region(struct pci_dev *pdev, int
> > bar,
> > +                                      const char *name)
> > +{
> > +       int ret;
> > +       struct pcim_addr_devres *res;
> > +
> > +       res = pcim_addr_devres_alloc(pdev);
> > +       if (!res)
> > +               return IOMEM_ERR_PTR(-ENOMEM);
> > +
> > +       res->type = PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING;
> > +       res->bar = bar;
> > +
> > +       ret = __pcim_request_region(pdev, bar, name, 0);
> > +       if (ret != 0)
> > +               goto err_region;
> > +
> > +       res->baseaddr = pci_iomap(pdev, bar, 0);
> > +       if (!res->baseaddr) {
> > +               ret = -EINVAL;
> > +               goto err_iomap;
> > +       }
> > +
> > +       devres_add(&pdev->dev, res);
> > +       return res->baseaddr;
> > +
> > +err_iomap:
> > +       __pcim_release_region(pdev, bar);
> > +err_region:
> > +       pcim_addr_devres_free(res);
> > +
> > +       return IOMEM_ERR_PTR(ret);
> > +}
> > +
> > +/**
> > + * pcim_iounmap_region - Unmap and release a PCI BAR
> > + * @pdev: PCI device to operate on
> > + * @bar: Index of BAR to unmap and release
> > + *
> > + * Unmap a BAR and release its region manually. Only pass BARs
> > that were
> > + * previously mapped by pcim_iomap_region().
> > + */
> > +static void pcim_iounmap_region(struct pci_dev *pdev, int bar)
> > +{
> > +       struct pcim_addr_devres res_searched;
> > +
> > +       pcim_addr_devres_clear(&res_searched);
> > +       res_searched.type = PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING;
> > +       res_searched.bar = bar;
> > +
> > +       devres_release(&pdev->dev, pcim_addr_resource_release,
> > +                       pcim_addr_resources_match, &res_searched);
> > +}
> > +
> >  /**
> >   * pcim_iomap_regions - Request and iomap PCI BARs
> >   * @pdev: PCI device to map IO resources for
> >   * @mask: Mask of BARs to request and iomap
> > - * @name: Name used when requesting regions
> > + * @name: Name associated with the requests
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> >   *
> >   * Request and iomap regions specified by @mask.
> >   */
> >  int pcim_iomap_regions(struct pci_dev *pdev, int mask, const char
> > *name)
> >  {
> > -       void __iomem * const *iomap;
> > -       int i, rc;
> > +       int ret;
> > +       short bar;
> > +       void __iomem *mapping;
> >  
> > -       iomap = pcim_iomap_table(pdev);
> > -       if (!iomap)
> > -               return -ENOMEM;
> > +       for (bar = 0; bar < DEVICE_COUNT_RESOURCE; bar++) {
> > +               if (!mask_contains_bar(mask, bar))
> > +                       continue;
> >  
> > -       for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
> > -               unsigned long len;
> > +               mapping = pcim_iomap_region(pdev, bar, name);
> > +               if (IS_ERR(mapping)) {
> > +                       ret = PTR_ERR(mapping);
> > +                       goto err;
> > +               }
> > +               ret = pcim_add_mapping_to_legacy_table(pdev,
> > mapping, bar);
> > +               if (ret != 0)
> > +                       goto err;
> > +       }
> >  
> > -               if (!mask_contains_bar(mask, i))
> > -                       continue;
> > +       return 0;
> >  
> > -               rc = -EINVAL;
> > -               len = pci_resource_len(pdev, i);
> > -               if (!len)
> > -                       goto err_inval;
> > +err:
> > +       while (--bar >= 0) {
> > +               pcim_iounmap_region(pdev, bar);
> > +               pcim_remove_bar_from_legacy_table(pdev, bar);
> > +       }
> >  
> > -               rc = pci_request_region(pdev, i, name);
> > -               if (rc)
> > -                       goto err_inval;
> > +       return ret;
> > +}
> > +EXPORT_SYMBOL(pcim_iomap_regions);
> >  
> > -               rc = -ENOMEM;
> > -               if (!pcim_iomap(pdev, i, 0))
> > -                       goto err_region;
> > +static int _pcim_request_region(struct pci_dev *pdev, int bar,
> > const char *name,
> > +               int request_flags)
> > +{
> > +       int ret;
> > +       struct pcim_addr_devres *res;
> > +
> > +       res = pcim_addr_devres_alloc(pdev);
> > +       if (!res)
> > +               return -ENOMEM;
> > +       res->type = PCIM_ADDR_DEVRES_TYPE_REGION;
> > +       res->bar = bar;
> > +
> > +       ret = __pcim_request_region(pdev, bar, name,
> > request_flags);
> > +       if (ret != 0) {
> > +               pcim_addr_devres_free(res);
> > +               return ret;
> >         }
> >  
> > +       devres_add(&pdev->dev, res);
> >         return 0;
> > +}
> >  
> > - err_region:
> > -       pci_release_region(pdev, i);
> > - err_inval:
> > -       while (--i >= 0) {
> > -               if (!mask_contains_bar(mask, i))
> > -                       continue;
> > -               pcim_iounmap(pdev, iomap[i]);
> > -               pci_release_region(pdev, i);
> > +/**
> > + * pcim_request_region - Request a PCI BAR
> > + * @pdev: PCI device to requestion region for
> > + * @bar: Index of BAR to request
> > + * @name: Name associated with the request
> > + *
> > + * Returns: 0 on success, a negative error code on failure.
> > + *
> > + * Request region specified by @bar.
> > + *
> > + * The region will automatically be released on driver detach. If
> > desired,
> > + * release manually only with pcim_release_region().
> > + */
> > +static int pcim_request_region(struct pci_dev *pdev, int bar,
> > const char *name)
> > +{
> > +       return _pcim_request_region(pdev, bar, name, 0);
> > +}
> > +
> > +/**
> > + * pcim_release_region - Release a PCI BAR
> > + * @pdev: PCI device to operate on
> > + * @bar: Index of BAR to release
> > + *
> > + * Release a region manually that was previously requested by
> > + * pcim_request_region().
> > + */
> > +static void pcim_release_region(struct pci_dev *pdev, int bar)
> > +{
> > +       struct pcim_addr_devres res_searched;
> > +
> > +       pcim_addr_devres_clear(&res_searched);
> > +       res_searched.type = PCIM_ADDR_DEVRES_TYPE_REGION;
> > +       res_searched.bar = bar;
> > +
> > +       devres_release(&pdev->dev, pcim_addr_resource_release,
> > +                       pcim_addr_resources_match, &res_searched);
> > +}
> > +
> > +
> > +/**
> > + * pcim_release_all_regions - Release all regions of a PCI-device
> > + * @pdev: the PCI device
> > + *
> > + * Release all regions previously requested through
> > pcim_request_region()
> > + * or pcim_request_all_regions().
> > + *
> > + * Can be called from any context, i.e., not necessarily as a
> > counterpart to
> > + * pcim_request_all_regions().
> > + */
> > +static void pcim_release_all_regions(struct pci_dev *pdev)
> > +{
> > +       short bar;
> > +
> > +       for (bar = 0; bar < PCI_STD_NUM_BARS; bar++)
> > +               pcim_release_region(pdev, bar);
> > +}
> > +
> > +/**
> > + * pcim_request_all_regions - Request all regions
> > + * @pdev: PCI device to map IO resources for
> > + * @name: name associated with the request
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> > + *
> > + * Requested regions will automatically be released at driver
> > detach. If
> > + * desired, release individual regions with pcim_release_region()
> > or all of
> > + * them at once with pcim_release_all_regions().
> > + */
> > +static int pcim_request_all_regions(struct pci_dev *pdev, const
> > char *name)
> > +{
> > +       int ret;
> > +       short bar;
> > +
> > +       for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
> > +               ret = pcim_request_region(pdev, bar, name);
> > +               if (ret != 0)
> > +                       goto err;
> >         }
> >  
> > -       return rc;
> > +       return 0;
> > +
> > +err:
> > +       pcim_release_all_regions(pdev);
> > +
> > +       return ret;
> >  }
> > -EXPORT_SYMBOL(pcim_iomap_regions);
> >  
> >  /**
> >   * pcim_iomap_regions_request_all - Request all BARs and iomap
> > specified ones
> >   * @pdev: PCI device to map IO resources for
> >   * @mask: Mask of BARs to iomap
> > - * @name: Name used when requesting regions
> > + * @name: Name associated with the requests
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> >   *
> >   * Request all PCI BARs and iomap regions specified by @mask.
> > + *
> > + * To release these resources manually, call pcim_release_region()
> > for the
> > + * regions and pcim_iounmap() for the mappings.
> >   */
> >  int pcim_iomap_regions_request_all(struct pci_dev *pdev, int mask,
> >                                    const char *name)
> >  {
> > -       int request_mask = ((1 << 6) - 1) & ~mask;
> > -       int rc;
> > +       short bar;
> > +       int ret;
> > +       void __iomem **legacy_iomap_table;
> >  
> > -       rc = pci_request_selected_regions(pdev, request_mask,
> > name);
> > -       if (rc)
> > -               return rc;
> > +       ret = pcim_request_all_regions(pdev, name);
> > +       if (ret != 0)
> > +               return ret;
> >  
> > -       rc = pcim_iomap_regions(pdev, mask, name);
> > -       if (rc)
> > -               pci_release_selected_regions(pdev, request_mask);
> > -       return rc;
> > +       for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
> > +               if (!mask_contains_bar(mask, bar))
> > +                       continue;
> > +               if (!pcim_iomap(pdev, bar, 0))
> > +                       goto err;
> > +       }
> > +
> > +       return 0;
> > +
> > +err:
> > +       /*
> > +        * If bar is larger than 0, then pcim_iomap() above has
> > most likely
> > +        * failed because of -EINVAL. If it is equal 0, most likely
> > the table
> > +        * couldn't be created, indicating -ENOMEM.
> > +        */
> > +       ret = bar > 0 ? -EINVAL : -ENOMEM;
> > +       legacy_iomap_table = (void __iomem
> > **)pcim_iomap_table(pdev);
> > +
> > +       while (--bar >= 0)
> > +               pcim_iounmap(pdev, legacy_iomap_table[bar]);
> > +
> > +       pcim_release_all_regions(pdev);
> > +
> > +       return ret;
> >  }
> >  EXPORT_SYMBOL(pcim_iomap_regions_request_all);
> >  
> > @@ -473,19 +952,14 @@
> > EXPORT_SYMBOL(pcim_iomap_regions_request_all);
> >   */
> >  void pcim_iounmap_regions(struct pci_dev *pdev, int mask)
> >  {
> > -       void __iomem * const *iomap;
> > -       int i;
> > +       short bar;
> >  
> > -       iomap = pcim_iomap_table(pdev);
> > -       if (!iomap)
> > -               return;
> > -
> > -       for (i = 0; i < PCIM_IOMAP_MAX; i++) {
> > -               if (!mask_contains_bar(mask, i))
> > +       for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
> > +               if (!mask_contains_bar(mask, bar))
> >                         continue;
> >  
> > -               pcim_iounmap(pdev, iomap[i]);
> > -               pci_release_region(pdev, i);
> > +               pcim_iounmap_region(pdev, bar);
> > +               pcim_remove_bar_from_legacy_table(pdev, bar);
> >         }
> >  }
> >  EXPORT_SYMBOL(pcim_iounmap_regions);
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 59e0949fb079..d94445f5f882 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -3883,6 +3883,17 @@ void pci_release_region(struct pci_dev
> > *pdev, int bar)
> >                 release_mem_region(pci_resource_start(pdev, bar),
> >                                 pci_resource_len(pdev, bar));
> >  
> > +       /*
> > +        * This devres utility makes this function sometimes
> > managed
> > +        * (when pcim_enable_device() has been called before).
> > +        *
> > +        * This is bad because it conflicts with the pcim_
> > functions being
> > +        * exclusively responsible for managed PCI. Its "sometimes
> > yes,
> > +        * sometimes no" nature can cause bugs.
> > +        *
> > +        * TODO: Remove this once all users that use
> > pcim_enable_device() PLUS
> > +        * a region request function have been ported to using
> > pcim_ functions.
> > +        */
> >         dr = find_pci_dr(pdev);
> >         if (dr)
> >                 dr->region_mask &= ~(1 << bar);
> > @@ -3927,6 +3938,17 @@ static int __pci_request_region(struct
> > pci_dev *pdev, int bar,
> >                         goto err_out;
> >         }
> >  
> > +       /*
> > +        * This devres utility makes this function sometimes
> > managed
> > +        * (when pcim_enable_device() has been called before).
> > +        *
> > +        * This is bad because it conflicts with the pcim_
> > functions being
> > +        * exclusively responsible for managed pci. Its "sometimes
> > yes,
> > +        * sometimes no" nature can cause bugs.
> > +        *
> > +        * TODO: Remove this once all users that use
> > pcim_enable_device() PLUS
> > +        * a region request function have been ported to using
> > pcim_ functions.
> > +        */
> >         dr = find_pci_dr(pdev);
> >         if (dr)
> >                 dr->region_mask |= 1 << bar;
> > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > index fd44565c4756..c09487f5550c 100644
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -826,6 +826,11 @@ struct pci_devres {
> >         unsigned int orig_intx:1;
> >         unsigned int restore_intx:1;
> >         unsigned int mwi:1;
> > +
> > +       /*
> > +        * TODO: remove the region_mask once everyone calling
> > +        * pcim_enable_device() + pci_*region*() is ported to pcim_
> > functions.
> > +        */
> >         u32 region_mask;
> >  };
> >  
> > -- 
> > 2.45.0
> > 
>
diff mbox series

Patch

diff --git a/drivers/pci/devres.c b/drivers/pci/devres.c
index 845d6fab0ce7..cf2c11b54ca6 100644
--- a/drivers/pci/devres.c
+++ b/drivers/pci/devres.c
@@ -4,14 +4,243 @@ 
 #include "pci.h"
 
 /*
- * PCI iomap devres
+ * On the state of PCI's devres implementation:
+ *
+ * The older devres API for PCI has two significant problems:
+ *
+ * 1. It is very strongly tied to the statically allocated mapping table in
+ *    struct pcim_iomap_devres below. This is mostly solved in the sense of the
+ *    pcim_ functions in this file providing things like ranged mapping by
+ *    bypassing this table, wheras the functions that were present in the old
+ *    API still enter the mapping addresses into the table for users of the old
+ *    API.
+ *
+ * 2. The region-request-functions in pci.c do become managed IF the device has
+ *    been enabled with pcim_enable_device() instead of pci_enable_device().
+ *    This resulted in the API becoming inconsistent: Some functions have an
+ *    obviously managed counter-part (e.g., pci_iomap() <-> pcim_iomap()),
+ *    whereas some don't and are never managed, while others don't and are
+ *    _sometimes_ managed (e.g. pci_request_region()).
+ *
+ *    Consequently, in the new API, region requests performed by the pcim_
+ *    functions are automatically cleaned up through the devres callback
+ *    pcim_addr_resource_release(), while requests performed by
+ *    pcim_enable_device() + pci_*region*() are automatically cleaned up
+ *    through the for-loop in pcim_release().
+ *
+ * TODO 1:
+ * Remove the legacy table entirely once all calls to pcim_iomap_table() in
+ * the kernel have been removed.
+ *
+ * TODO 2:
+ * Port everyone calling pcim_enable_device() + pci_*region*() to using the
+ * pcim_ functions. Then, remove all devres functionality from pci_*region*()
+ * functions and remove the associated cleanups described above in point #2.
  */
-#define PCIM_IOMAP_MAX	PCI_STD_NUM_BARS
 
+/*
+ * Legacy struct storing addresses to whole mapped BARs.
+ */
 struct pcim_iomap_devres {
-	void __iomem *table[PCIM_IOMAP_MAX];
+	void __iomem *table[PCI_STD_NUM_BARS];
+};
+
+enum pcim_addr_devres_type {
+	/* Default initializer. */
+	PCIM_ADDR_DEVRES_TYPE_INVALID,
+
+	/* A requested region spanning an entire BAR. */
+	PCIM_ADDR_DEVRES_TYPE_REGION,
+
+	/*
+	 * A requested region spanning an entire BAR, and a mapping for
+	 * the entire BAR.
+	 */
+	PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING,
+
+	/*
+	 * A mapping within a BAR, either spanning the whole BAR or just a
+	 * range.  Without a requested region.
+	 */
+	PCIM_ADDR_DEVRES_TYPE_MAPPING,
 };
 
+/*
+ * This struct envelops IO or MEM addresses, i.e., mappings and region
+ * requests, because those are very frequently requested and released
+ * together.
+ */
+struct pcim_addr_devres {
+	enum pcim_addr_devres_type type;
+	void __iomem *baseaddr;
+	unsigned long offset;
+	unsigned long len;
+	short bar;
+};
+
+static inline void pcim_addr_devres_clear(struct pcim_addr_devres *res)
+{
+	memset(res, 0, sizeof(*res));
+	res->bar = -1;
+}
+
+/*
+ * The following functions, __pcim_*_region*, exist as counterparts to the
+ * versions from pci.c - which, unfortunately, can be in "hybrid mode", i.e.,
+ * sometimes managed, sometimes not.
+ *
+ * To separate the APIs cleanly, we define our own, simplified versions here.
+ */
+
+/**
+ * __pcim_request_region_range - Request a ranged region
+ * @pdev: PCI device the region belongs to
+ * @bar: BAR the range is within
+ * @offset: offset from the BAR's start address
+ * @maxlen: length in bytes, beginning at @offset
+ * @name: name associated with the request
+ * @req_flags: flags for the request, e.g., for kernel-exclusive requests
+ *
+ * Returns: 0 on success, a negative error code on failure.
+ *
+ * Request a range within a device's PCI BAR.  Sanity check the input.
+ */
+static int __pcim_request_region_range(struct pci_dev *pdev, int bar,
+		unsigned long offset, unsigned long maxlen,
+		const char *name, int req_flags)
+{
+	resource_size_t start = pci_resource_start(pdev, bar);
+	resource_size_t len = pci_resource_len(pdev, bar);
+	unsigned long dev_flags = pci_resource_flags(pdev, bar);
+
+	if (start == 0 || len == 0) /* Unused BAR. */
+		return 0;
+	if (len <= offset)
+		return  -EINVAL;
+
+	start += offset;
+	len -= offset;
+
+	if (len > maxlen && maxlen != 0)
+		len = maxlen;
+
+	if (dev_flags & IORESOURCE_IO) {
+		if (!request_region(start, len, name))
+			return -EBUSY;
+	} else if (dev_flags & IORESOURCE_MEM) {
+		if (!__request_mem_region(start, len, name, req_flags))
+			return -EBUSY;
+	} else {
+		/* That's not a device we can request anything on. */
+		return -ENODEV;
+	}
+
+	return 0;
+}
+
+static void __pcim_release_region_range(struct pci_dev *pdev, int bar,
+		unsigned long offset, unsigned long maxlen)
+{
+	resource_size_t start = pci_resource_start(pdev, bar);
+	resource_size_t len = pci_resource_len(pdev, bar);
+	unsigned long flags = pci_resource_flags(pdev, bar);
+
+	if (len <= offset || start == 0)
+		return;
+
+	if (len == 0 || maxlen == 0) /* This an unused BAR. Do nothing. */
+		return;
+
+	start += offset;
+	len -= offset;
+
+	if (len > maxlen)
+		len = maxlen;
+
+	if (flags & IORESOURCE_IO)
+		release_region(start, len);
+	else if (flags & IORESOURCE_MEM)
+		release_mem_region(start, len);
+}
+
+static int __pcim_request_region(struct pci_dev *pdev, int bar,
+		const char *name, int flags)
+{
+	unsigned long offset = 0;
+	unsigned long len = pci_resource_len(pdev, bar);
+
+	return __pcim_request_region_range(pdev, bar, offset, len, name, flags);
+}
+
+static void __pcim_release_region(struct pci_dev *pdev, int bar)
+{
+	unsigned long offset = 0;
+	unsigned long len = pci_resource_len(pdev, bar);
+
+	__pcim_release_region_range(pdev, bar, offset, len);
+}
+
+static void pcim_addr_resource_release(struct device *dev, void *resource_raw)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct pcim_addr_devres *res = resource_raw;
+
+	switch (res->type) {
+	case PCIM_ADDR_DEVRES_TYPE_REGION:
+		__pcim_release_region(pdev, res->bar);
+		break;
+	case PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING:
+		pci_iounmap(pdev, res->baseaddr);
+		__pcim_release_region(pdev, res->bar);
+		break;
+	case PCIM_ADDR_DEVRES_TYPE_MAPPING:
+		pci_iounmap(pdev, res->baseaddr);
+		break;
+	default:
+		break;
+	}
+}
+
+static struct pcim_addr_devres *pcim_addr_devres_alloc(struct pci_dev *pdev)
+{
+	struct pcim_addr_devres *res;
+
+	res = devres_alloc_node(pcim_addr_resource_release, sizeof(*res),
+			GFP_KERNEL, dev_to_node(&pdev->dev));
+	if (res)
+		pcim_addr_devres_clear(res);
+	return res;
+}
+
+/* Just for consistency and readability. */
+static inline void pcim_addr_devres_free(struct pcim_addr_devres *res)
+{
+	devres_free(res);
+}
+
+/*
+ * Used by devres to identify a pcim_addr_devres.
+ */
+static int pcim_addr_resources_match(struct device *dev, void *a_raw, void *b_raw)
+{
+	struct pcim_addr_devres *a, *b;
+
+	a = a_raw;
+	b = b_raw;
+
+	if (a->type != b->type)
+		return 0;
+
+	switch (a->type) {
+	case PCIM_ADDR_DEVRES_TYPE_REGION:
+	case PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING:
+		return a->bar == b->bar;
+	case PCIM_ADDR_DEVRES_TYPE_MAPPING:
+		return a->baseaddr == b->baseaddr;
+	default:
+		return 0;
+	}
+}
 
 static void devm_pci_unmap_iospace(struct device *dev, void *ptr)
 {
@@ -92,8 +321,8 @@  EXPORT_SYMBOL(devm_pci_remap_cfgspace);
  *
  * All operations are managed and will be undone on driver detach.
  *
- * Returns a pointer to the remapped memory or an ERR_PTR() encoded error code
- * on failure. Usage example::
+ * Returns a pointer to the remapped memory or an IOMEM_ERR_PTR() encoded error
+ * code on failure. Usage example::
  *
  *	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
  *	base = devm_pci_remap_cfg_resource(&pdev->dev, res);
@@ -172,6 +401,17 @@  static void pcim_release(struct device *gendev, void *res)
 	struct pci_devres *this = res;
 	int i;
 
+	/*
+	 * This is legacy code.
+	 *
+	 * All regions requested by a pcim_ function do get released through
+	 * pcim_addr_resource_release(). Thanks to the hybrid nature of the pci_
+	 * region-request functions, this for-loop has to release the regions
+	 * if they have been requested by such a function.
+	 *
+	 * TODO: Remove this once all users of pcim_enable_device() PLUS
+	 * pci-region-request-functions have been ported to pcim_ functions.
+	 */
 	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
 		if (mask_contains_bar(this->region_mask, i))
 			pci_release_region(dev, i);
@@ -258,19 +498,21 @@  EXPORT_SYMBOL(pcim_pin_device);
 
 static void pcim_iomap_release(struct device *gendev, void *res)
 {
-	struct pci_dev *dev = to_pci_dev(gendev);
-	struct pcim_iomap_devres *this = res;
-	int i;
-
-	for (i = 0; i < PCIM_IOMAP_MAX; i++)
-		if (this->table[i])
-			pci_iounmap(dev, this->table[i]);
+	/*
+	 * Do nothing. This is legacy code.
+	 *
+	 * Cleanup of the mappings is now done directly through the callbacks
+	 * registered when creating them.
+	 */
 }
 
 /**
  * pcim_iomap_table - access iomap allocation table
  * @pdev: PCI device to access iomap table for
  *
+ * Returns:
+ * Const pointer to array of __iomem pointers on success, NULL on failure.
+ *
  * Access iomap allocation table for @dev.  If iomap table doesn't
  * exist and @pdev is managed, it will be allocated.  All iomaps
  * recorded in the iomap table are automatically unmapped on driver
@@ -343,30 +585,67 @@  static void pcim_remove_mapping_from_legacy_table(struct pci_dev *pdev,
 	}
 }
 
+/*
+ * The same as pcim_remove_mapping_from_legacy_table(), but identifies the
+ * mapping by its BAR index.
+ */
+static void pcim_remove_bar_from_legacy_table(struct pci_dev *pdev, short bar)
+{
+	void __iomem **legacy_iomap_table;
+
+	if (bar >= PCI_STD_NUM_BARS)
+		return;
+
+	legacy_iomap_table = (void __iomem **)pcim_iomap_table(pdev);
+	if (!legacy_iomap_table)
+		return;
+
+	legacy_iomap_table[bar] = NULL;
+}
+
 /**
  * pcim_iomap - Managed pcim_iomap()
  * @pdev: PCI device to iomap for
  * @bar: BAR to iomap
  * @maxlen: Maximum length of iomap
  *
- * Managed pci_iomap().  Map is automatically unmapped on driver
- * detach.
+ * Returns: __iomem pointer on success, NULL on failure.
+ *
+ * Managed pci_iomap(). Map is automatically unmapped on driver detach. If
+ * desired, unmap manually only with pcim_iounmap().
+ *
+ * This SHOULD only be used once per BAR.
+ *
+ * NOTE:
+ * Contrary to the other pcim_* functions, this function does not return an
+ * IOMEM_ERR_PTR() on failure, but a simple NULL. This is done for backwards
+ * compatibility.
  */
 void __iomem *pcim_iomap(struct pci_dev *pdev, int bar, unsigned long maxlen)
 {
 	void __iomem *mapping;
+	struct pcim_addr_devres *res;
+
+	res = pcim_addr_devres_alloc(pdev);
+	if (!res)
+		return NULL;
+	res->type = PCIM_ADDR_DEVRES_TYPE_MAPPING;
 
 	mapping = pci_iomap(pdev, bar, maxlen);
 	if (!mapping)
-		return NULL;
+		goto err_iomap;
+	res->baseaddr = mapping;
 
 	if (pcim_add_mapping_to_legacy_table(pdev, mapping, bar) != 0)
 		goto err_table;
 
+	devres_add(&pdev->dev, res);
 	return mapping;
 
 err_table:
 	pci_iounmap(pdev, mapping);
+err_iomap:
+	pcim_addr_devres_free(res);
 	return NULL;
 }
 EXPORT_SYMBOL(pcim_iomap);
@@ -376,91 +655,291 @@  EXPORT_SYMBOL(pcim_iomap);
  * @pdev: PCI device to iounmap for
  * @addr: Address to unmap
  *
- * Managed pci_iounmap().  @addr must have been mapped using pcim_iomap().
+ * Managed pci_iounmap(). @addr must have been mapped using a pcim_* mapping
+ * function.
  */
 void pcim_iounmap(struct pci_dev *pdev, void __iomem *addr)
 {
-	pci_iounmap(pdev, addr);
+	struct pcim_addr_devres res_searched;
+
+	pcim_addr_devres_clear(&res_searched);
+	res_searched.type = PCIM_ADDR_DEVRES_TYPE_MAPPING;
+	res_searched.baseaddr = addr;
+
+	if (devres_release(&pdev->dev, pcim_addr_resource_release,
+			pcim_addr_resources_match, &res_searched) != 0) {
+		/* Doesn't exist. User passed nonsense. */
+		return;
+	}
 
 	pcim_remove_mapping_from_legacy_table(pdev, addr);
 }
 EXPORT_SYMBOL(pcim_iounmap);
 
+/**
+ * pcim_iomap_region - Request and iomap a PCI BAR
+ * @pdev: PCI device to map IO resources for
+ * @bar: Index of a BAR to map
+ * @name: Name associated with the request
+ *
+ * Returns: __iomem pointer on success, an IOMEM_ERR_PTR on failure.
+ *
+ * Mapping and region will get automatically released on driver detach. If
+ * desired, release manually only with pcim_iounmap_region().
+ */
+static void __iomem *pcim_iomap_region(struct pci_dev *pdev, int bar,
+				       const char *name)
+{
+	int ret;
+	struct pcim_addr_devres *res;
+
+	res = pcim_addr_devres_alloc(pdev);
+	if (!res)
+		return IOMEM_ERR_PTR(-ENOMEM);
+
+	res->type = PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING;
+	res->bar = bar;
+
+	ret = __pcim_request_region(pdev, bar, name, 0);
+	if (ret != 0)
+		goto err_region;
+
+	res->baseaddr = pci_iomap(pdev, bar, 0);
+	if (!res->baseaddr) {
+		ret = -EINVAL;
+		goto err_iomap;
+	}
+
+	devres_add(&pdev->dev, res);
+	return res->baseaddr;
+
+err_iomap:
+	__pcim_release_region(pdev, bar);
+err_region:
+	pcim_addr_devres_free(res);
+
+	return IOMEM_ERR_PTR(ret);
+}
+
+/**
+ * pcim_iounmap_region - Unmap and release a PCI BAR
+ * @pdev: PCI device to operate on
+ * @bar: Index of BAR to unmap and release
+ *
+ * Unmap a BAR and release its region manually. Only pass BARs that were
+ * previously mapped by pcim_iomap_region().
+ */
+static void pcim_iounmap_region(struct pci_dev *pdev, int bar)
+{
+	struct pcim_addr_devres res_searched;
+
+	pcim_addr_devres_clear(&res_searched);
+	res_searched.type = PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING;
+	res_searched.bar = bar;
+
+	devres_release(&pdev->dev, pcim_addr_resource_release,
+			pcim_addr_resources_match, &res_searched);
+}
+
 /**
  * pcim_iomap_regions - Request and iomap PCI BARs
  * @pdev: PCI device to map IO resources for
  * @mask: Mask of BARs to request and iomap
- * @name: Name used when requesting regions
+ * @name: Name associated with the requests
+ *
+ * Returns: 0 on success, negative error code on failure.
  *
  * Request and iomap regions specified by @mask.
  */
 int pcim_iomap_regions(struct pci_dev *pdev, int mask, const char *name)
 {
-	void __iomem * const *iomap;
-	int i, rc;
+	int ret;
+	short bar;
+	void __iomem *mapping;
 
-	iomap = pcim_iomap_table(pdev);
-	if (!iomap)
-		return -ENOMEM;
+	for (bar = 0; bar < DEVICE_COUNT_RESOURCE; bar++) {
+		if (!mask_contains_bar(mask, bar))
+			continue;
 
-	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
-		unsigned long len;
+		mapping = pcim_iomap_region(pdev, bar, name);
+		if (IS_ERR(mapping)) {
+			ret = PTR_ERR(mapping);
+			goto err;
+		}
+		ret = pcim_add_mapping_to_legacy_table(pdev, mapping, bar);
+		if (ret != 0)
+			goto err;
+	}
 
-		if (!mask_contains_bar(mask, i))
-			continue;
+	return 0;
 
-		rc = -EINVAL;
-		len = pci_resource_len(pdev, i);
-		if (!len)
-			goto err_inval;
+err:
+	while (--bar >= 0) {
+		pcim_iounmap_region(pdev, bar);
+		pcim_remove_bar_from_legacy_table(pdev, bar);
+	}
 
-		rc = pci_request_region(pdev, i, name);
-		if (rc)
-			goto err_inval;
+	return ret;
+}
+EXPORT_SYMBOL(pcim_iomap_regions);
 
-		rc = -ENOMEM;
-		if (!pcim_iomap(pdev, i, 0))
-			goto err_region;
+static int _pcim_request_region(struct pci_dev *pdev, int bar, const char *name,
+		int request_flags)
+{
+	int ret;
+	struct pcim_addr_devres *res;
+
+	res = pcim_addr_devres_alloc(pdev);
+	if (!res)
+		return -ENOMEM;
+	res->type = PCIM_ADDR_DEVRES_TYPE_REGION;
+	res->bar = bar;
+
+	ret = __pcim_request_region(pdev, bar, name, request_flags);
+	if (ret != 0) {
+		pcim_addr_devres_free(res);
+		return ret;
 	}
 
+	devres_add(&pdev->dev, res);
 	return 0;
+}
 
- err_region:
-	pci_release_region(pdev, i);
- err_inval:
-	while (--i >= 0) {
-		if (!mask_contains_bar(mask, i))
-			continue;
-		pcim_iounmap(pdev, iomap[i]);
-		pci_release_region(pdev, i);
+/**
+ * pcim_request_region - Request a PCI BAR
+ * @pdev: PCI device to requestion region for
+ * @bar: Index of BAR to request
+ * @name: Name associated with the request
+ *
+ * Returns: 0 on success, a negative error code on failure.
+ *
+ * Request region specified by @bar.
+ *
+ * The region will automatically be released on driver detach. If desired,
+ * release manually only with pcim_release_region().
+ */
+static int pcim_request_region(struct pci_dev *pdev, int bar, const char *name)
+{
+	return _pcim_request_region(pdev, bar, name, 0);
+}
+
+/**
+ * pcim_release_region - Release a PCI BAR
+ * @pdev: PCI device to operate on
+ * @bar: Index of BAR to release
+ *
+ * Release a region manually that was previously requested by
+ * pcim_request_region().
+ */
+static void pcim_release_region(struct pci_dev *pdev, int bar)
+{
+	struct pcim_addr_devres res_searched;
+
+	pcim_addr_devres_clear(&res_searched);
+	res_searched.type = PCIM_ADDR_DEVRES_TYPE_REGION;
+	res_searched.bar = bar;
+
+	devres_release(&pdev->dev, pcim_addr_resource_release,
+			pcim_addr_resources_match, &res_searched);
+}
+
+
+/**
+ * pcim_release_all_regions - Release all regions of a PCI-device
+ * @pdev: the PCI device
+ *
+ * Release all regions previously requested through pcim_request_region()
+ * or pcim_request_all_regions().
+ *
+ * Can be called from any context, i.e., not necessarily as a counterpart to
+ * pcim_request_all_regions().
+ */
+static void pcim_release_all_regions(struct pci_dev *pdev)
+{
+	short bar;
+
+	for (bar = 0; bar < PCI_STD_NUM_BARS; bar++)
+		pcim_release_region(pdev, bar);
+}
+
+/**
+ * pcim_request_all_regions - Request all regions
+ * @pdev: PCI device to map IO resources for
+ * @name: name associated with the request
+ *
+ * Returns: 0 on success, negative error code on failure.
+ *
+ * Requested regions will automatically be released at driver detach. If
+ * desired, release individual regions with pcim_release_region() or all of
+ * them at once with pcim_release_all_regions().
+ */
+static int pcim_request_all_regions(struct pci_dev *pdev, const char *name)
+{
+	int ret;
+	short bar;
+
+	for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
+		ret = pcim_request_region(pdev, bar, name);
+		if (ret != 0)
+			goto err;
 	}
 
-	return rc;
+	return 0;
+
+err:
+	pcim_release_all_regions(pdev);
+
+	return ret;
 }
-EXPORT_SYMBOL(pcim_iomap_regions);
 
 /**
  * pcim_iomap_regions_request_all - Request all BARs and iomap specified ones
  * @pdev: PCI device to map IO resources for
  * @mask: Mask of BARs to iomap
- * @name: Name used when requesting regions
+ * @name: Name associated with the requests
+ *
+ * Returns: 0 on success, negative error code on failure.
  *
  * Request all PCI BARs and iomap regions specified by @mask.
+ *
+ * To release these resources manually, call pcim_release_region() for the
+ * regions and pcim_iounmap() for the mappings.
  */
 int pcim_iomap_regions_request_all(struct pci_dev *pdev, int mask,
 				   const char *name)
 {
-	int request_mask = ((1 << 6) - 1) & ~mask;
-	int rc;
+	short bar;
+	int ret;
+	void __iomem **legacy_iomap_table;
 
-	rc = pci_request_selected_regions(pdev, request_mask, name);
-	if (rc)
-		return rc;
+	ret = pcim_request_all_regions(pdev, name);
+	if (ret != 0)
+		return ret;
 
-	rc = pcim_iomap_regions(pdev, mask, name);
-	if (rc)
-		pci_release_selected_regions(pdev, request_mask);
-	return rc;
+	for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
+		if (!mask_contains_bar(mask, bar))
+			continue;
+		if (!pcim_iomap(pdev, bar, 0))
+			goto err;
+	}
+
+	return 0;
+
+err:
+	/*
+	 * If bar is larger than 0, then pcim_iomap() above has most likely
+	 * failed because of -EINVAL. If it is equal 0, most likely the table
+	 * couldn't be created, indicating -ENOMEM.
+	 */
+	ret = bar > 0 ? -EINVAL : -ENOMEM;
+	legacy_iomap_table = (void __iomem **)pcim_iomap_table(pdev);
+
+	while (--bar >= 0)
+		pcim_iounmap(pdev, legacy_iomap_table[bar]);
+
+	pcim_release_all_regions(pdev);
+
+	return ret;
 }
 EXPORT_SYMBOL(pcim_iomap_regions_request_all);
 
@@ -473,19 +952,14 @@  EXPORT_SYMBOL(pcim_iomap_regions_request_all);
  */
 void pcim_iounmap_regions(struct pci_dev *pdev, int mask)
 {
-	void __iomem * const *iomap;
-	int i;
+	short bar;
 
-	iomap = pcim_iomap_table(pdev);
-	if (!iomap)
-		return;
-
-	for (i = 0; i < PCIM_IOMAP_MAX; i++) {
-		if (!mask_contains_bar(mask, i))
+	for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
+		if (!mask_contains_bar(mask, bar))
 			continue;
 
-		pcim_iounmap(pdev, iomap[i]);
-		pci_release_region(pdev, i);
+		pcim_iounmap_region(pdev, bar);
+		pcim_remove_bar_from_legacy_table(pdev, bar);
 	}
 }
 EXPORT_SYMBOL(pcim_iounmap_regions);
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 59e0949fb079..d94445f5f882 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3883,6 +3883,17 @@  void pci_release_region(struct pci_dev *pdev, int bar)
 		release_mem_region(pci_resource_start(pdev, bar),
 				pci_resource_len(pdev, bar));
 
+	/*
+	 * This devres utility makes this function sometimes managed
+	 * (when pcim_enable_device() has been called before).
+	 *
+	 * This is bad because it conflicts with the pcim_ functions being
+	 * exclusively responsible for managed PCI. Its "sometimes yes,
+	 * sometimes no" nature can cause bugs.
+	 *
+	 * TODO: Remove this once all users that use pcim_enable_device() PLUS
+	 * a region request function have been ported to using pcim_ functions.
+	 */
 	dr = find_pci_dr(pdev);
 	if (dr)
 		dr->region_mask &= ~(1 << bar);
@@ -3927,6 +3938,17 @@  static int __pci_request_region(struct pci_dev *pdev, int bar,
 			goto err_out;
 	}
 
+	/*
+	 * This devres utility makes this function sometimes managed
+	 * (when pcim_enable_device() has been called before).
+	 *
+	 * This is bad because it conflicts with the pcim_ functions being
+	 * exclusively responsible for managed pci. Its "sometimes yes,
+	 * sometimes no" nature can cause bugs.
+	 *
+	 * TODO: Remove this once all users that use pcim_enable_device() PLUS
+	 * a region request function have been ported to using pcim_ functions.
+	 */
 	dr = find_pci_dr(pdev);
 	if (dr)
 		dr->region_mask |= 1 << bar;
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index fd44565c4756..c09487f5550c 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -826,6 +826,11 @@  struct pci_devres {
 	unsigned int orig_intx:1;
 	unsigned int restore_intx:1;
 	unsigned int mwi:1;
+
+	/*
+	 * TODO: remove the region_mask once everyone calling
+	 * pcim_enable_device() + pci_*region*() is ported to pcim_ functions.
+	 */
 	u32 region_mask;
 };