diff mbox series

[V1,vfio,7/7] vfio/virtio: Enable live migration once VIRTIO_PCI was configured

Message ID 20241104102131.184193-8-yishaih@nvidia.com (mailing list archive)
State New
Headers show
Series Enhance the vfio-virtio driver to support live migration | expand

Commit Message

Yishai Hadas Nov. 4, 2024, 10:21 a.m. UTC
Now that the driver supports live migration, only the legacy IO
functionality depends on config VIRTIO_PCI_ADMIN_LEGACY.

Move the legacy IO into a separate file to be compiled only once
VIRTIO_PCI_ADMIN_LEGACY was configured and let the live migration
depends only on VIRTIO_PCI.

As part of this, modify the default driver operations (i.e.,
vfio_device_ops) to use the live migration set, and extend it to include
legacy I/O operations if they are compiled and supported.

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/vfio/pci/virtio/Kconfig     |   4 +-
 drivers/vfio/pci/virtio/Makefile    |   1 +
 drivers/vfio/pci/virtio/common.h    |  19 ++
 drivers/vfio/pci/virtio/legacy_io.c | 420 ++++++++++++++++++++++++++++
 drivers/vfio/pci/virtio/main.c      | 416 ++-------------------------
 5 files changed, 469 insertions(+), 391 deletions(-)
 create mode 100644 drivers/vfio/pci/virtio/legacy_io.c

Comments

Alex Williamson Nov. 5, 2024, 11:29 p.m. UTC | #1
On Mon, 4 Nov 2024 12:21:31 +0200
Yishai Hadas <yishaih@nvidia.com> wrote:

> Now that the driver supports live migration, only the legacy IO
> functionality depends on config VIRTIO_PCI_ADMIN_LEGACY.
> 
> Move the legacy IO into a separate file to be compiled only once
> VIRTIO_PCI_ADMIN_LEGACY was configured and let the live migration
> depends only on VIRTIO_PCI.
> 
> As part of this, modify the default driver operations (i.e.,
> vfio_device_ops) to use the live migration set, and extend it to include
> legacy I/O operations if they are compiled and supported.
> 
> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
> ---
>  drivers/vfio/pci/virtio/Kconfig     |   4 +-
>  drivers/vfio/pci/virtio/Makefile    |   1 +
>  drivers/vfio/pci/virtio/common.h    |  19 ++
>  drivers/vfio/pci/virtio/legacy_io.c | 420 ++++++++++++++++++++++++++++
>  drivers/vfio/pci/virtio/main.c      | 416 ++-------------------------
>  5 files changed, 469 insertions(+), 391 deletions(-)
>  create mode 100644 drivers/vfio/pci/virtio/legacy_io.c
> 
> diff --git a/drivers/vfio/pci/virtio/Kconfig b/drivers/vfio/pci/virtio/Kconfig
> index bd80eca4a196..af1dd9e84a5c 100644
> --- a/drivers/vfio/pci/virtio/Kconfig
> +++ b/drivers/vfio/pci/virtio/Kconfig
> @@ -1,7 +1,7 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  config VIRTIO_VFIO_PCI
>          tristate "VFIO support for VIRTIO NET PCI devices"
> -        depends on VIRTIO_PCI && VIRTIO_PCI_ADMIN_LEGACY
> +        depends on VIRTIO_PCI
>          select VFIO_PCI_CORE
>          help
>            This provides support for exposing VIRTIO NET VF devices which support
> @@ -11,5 +11,7 @@ config VIRTIO_VFIO_PCI
>            As of that this driver emulates I/O BAR in software to let a VF be
>            seen as a transitional device by its users and let it work with
>            a legacy driver.
> +          In addition, it provides migration support for VIRTIO NET VF devices
> +          using the VFIO framework.

The first half of this now describes something that may or may not be
enabled by this config option and the additional help text for
migration is vague enough relative to PF requirements to get user
reports that the driver doesn't work as intended.

For the former, maybe we still want a separate config item that's
optionally enabled if VIRTIO_VFIO_PCI && VFIO_PCI_ADMIN_LEGACY.

Thanks,
Alex
Jason Gunthorpe Nov. 6, 2024, 1:59 p.m. UTC | #2
On Tue, Nov 05, 2024 at 04:29:04PM -0700, Alex Williamson wrote:
> > @@ -1,7 +1,7 @@
> >  # SPDX-License-Identifier: GPL-2.0-only
> >  config VIRTIO_VFIO_PCI
> >          tristate "VFIO support for VIRTIO NET PCI devices"
> > -        depends on VIRTIO_PCI && VIRTIO_PCI_ADMIN_LEGACY
> > +        depends on VIRTIO_PCI
> >          select VFIO_PCI_CORE
> >          help
> >            This provides support for exposing VIRTIO NET VF devices which support
> > @@ -11,5 +11,7 @@ config VIRTIO_VFIO_PCI
> >            As of that this driver emulates I/O BAR in software to let a VF be
> >            seen as a transitional device by its users and let it work with
> >            a legacy driver.
> > +          In addition, it provides migration support for VIRTIO NET VF devices
> > +          using the VFIO framework.
> 
> The first half of this now describes something that may or may not be
> enabled by this config option and the additional help text for
> migration is vague enough relative to PF requirements to get user
> reports that the driver doesn't work as intended.

Yes, I think the help should be clearer

> For the former, maybe we still want a separate config item that's
> optionally enabled if VIRTIO_VFIO_PCI && VFIO_PCI_ADMIN_LEGACY.

If we are going to add a bunch of  #ifdefs/etc for ADMIN_LEGACY we
may as well just use VIRTIO_PCI_ADMIN_LEGACY directly and not
introduce another kconfig for it?

Is there any reason to compile out the migration support for virtio?
No other drivers were doing this?

kconfig combinations are painful, it woudl be nice to not make too
many..

Jason
Alex Williamson Nov. 6, 2024, 10:27 p.m. UTC | #3
On Wed, 6 Nov 2024 09:59:09 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Nov 05, 2024 at 04:29:04PM -0700, Alex Williamson wrote:
> > > @@ -1,7 +1,7 @@
> > >  # SPDX-License-Identifier: GPL-2.0-only
> > >  config VIRTIO_VFIO_PCI
> > >          tristate "VFIO support for VIRTIO NET PCI devices"
> > > -        depends on VIRTIO_PCI && VIRTIO_PCI_ADMIN_LEGACY
> > > +        depends on VIRTIO_PCI
> > >          select VFIO_PCI_CORE
> > >          help
> > >            This provides support for exposing VIRTIO NET VF devices which support
> > > @@ -11,5 +11,7 @@ config VIRTIO_VFIO_PCI
> > >            As of that this driver emulates I/O BAR in software to let a VF be
> > >            seen as a transitional device by its users and let it work with
> > >            a legacy driver.
> > > +          In addition, it provides migration support for VIRTIO NET VF devices
> > > +          using the VFIO framework.  
> > 
> > The first half of this now describes something that may or may not be
> > enabled by this config option and the additional help text for
> > migration is vague enough relative to PF requirements to get user
> > reports that the driver doesn't work as intended.  
> 
> Yes, I think the help should be clearer
> 
> > For the former, maybe we still want a separate config item that's
> > optionally enabled if VIRTIO_VFIO_PCI && VFIO_PCI_ADMIN_LEGACY.  
> 
> If we are going to add a bunch of  #ifdefs/etc for ADMIN_LEGACY we
> may as well just use VIRTIO_PCI_ADMIN_LEGACY directly and not
> introduce another kconfig for it?

I think that's what Yishai is proposing, but as we're adding a whole
new feature to the driver I'm concerned how the person configuring the
kernel knows which features from the description might be available in
the resulting driver.

We could maybe solve that with a completely re-written help text that
describes the legacy feature as X86-only and migration as a separate
architecture independent feature, but people aren't great at reading
and part of the audience is going to see "X86" in their peripheral
vision and disable it, and maybe even complain that the text was
presented to them.

OR, we can just add an optional sub-config bool that makes it easier to
describe the (new) main feature of the driver as supporting live
migration (on supported hardware) and the sub-config option as
providing legacy support (on supported hardware), and that sub-config
is only presented on X86, ie. ADMIN_LEGACY.

Ultimately the code already needs to support #ifdefs for the latter and
I think it's more user friendly and versatile to have the separate
config option.

NB. The sub-config should be default on for upgrade compatibility.

> Is there any reason to compile out the migration support for virtio?
> No other drivers were doing this?

No other vfio-pci variant driver provides multiple, independent
features, so for instance to compile out migration support from the
vfio-pci-mlx5 driver is to simply disable the driver altogether.

> kconfig combinations are painful, it woudl be nice to not make too
> many..

I'm not arguing for a legacy-only, non-migration version (please speak
up if someone wants that).  The code already needs to support the
#ifdefs and I think reflecting that in a sub-config option helps
clarify what the driver is providing and conveniently makes it possible
to have a driver with exactly the same feature set across archs, if
desired.  Thanks,

Alex
Yishai Hadas Nov. 7, 2024, 12:57 p.m. UTC | #4
On 07/11/2024 0:27, Alex Williamson wrote:
> On Wed, 6 Nov 2024 09:59:09 -0400
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
>> On Tue, Nov 05, 2024 at 04:29:04PM -0700, Alex Williamson wrote:
>>>> @@ -1,7 +1,7 @@
>>>>   # SPDX-License-Identifier: GPL-2.0-only
>>>>   config VIRTIO_VFIO_PCI
>>>>           tristate "VFIO support for VIRTIO NET PCI devices"
>>>> -        depends on VIRTIO_PCI && VIRTIO_PCI_ADMIN_LEGACY
>>>> +        depends on VIRTIO_PCI
>>>>           select VFIO_PCI_CORE
>>>>           help
>>>>             This provides support for exposing VIRTIO NET VF devices which support
>>>> @@ -11,5 +11,7 @@ config VIRTIO_VFIO_PCI
>>>>             As of that this driver emulates I/O BAR in software to let a VF be
>>>>             seen as a transitional device by its users and let it work with
>>>>             a legacy driver.
>>>> +          In addition, it provides migration support for VIRTIO NET VF devices
>>>> +          using the VFIO framework.
>>>
>>> The first half of this now describes something that may or may not be
>>> enabled by this config option and the additional help text for
>>> migration is vague enough relative to PF requirements to get user
>>> reports that the driver doesn't work as intended.
>>
>> Yes, I think the help should be clearer
>>
>>> For the former, maybe we still want a separate config item that's
>>> optionally enabled if VIRTIO_VFIO_PCI && VFIO_PCI_ADMIN_LEGACY.
>>
>> If we are going to add a bunch of  #ifdefs/etc for ADMIN_LEGACY we
>> may as well just use VIRTIO_PCI_ADMIN_LEGACY directly and not
>> introduce another kconfig for it?
> 
> I think that's what Yishai is proposing, but as we're adding a whole
> new feature to the driver I'm concerned how the person configuring the
> kernel knows which features from the description might be available in
> the resulting driver.
> 
> We could maybe solve that with a completely re-written help text that
> describes the legacy feature as X86-only and migration as a separate
> architecture independent feature, but people aren't great at reading
> and part of the audience is going to see "X86" in their peripheral
> vision and disable it, and maybe even complain that the text was
> presented to them.
> 
> OR, we can just add an optional sub-config bool that makes it easier to
> describe the (new) main feature of the driver as supporting live
> migration (on supported hardware) and the sub-config option as
> providing legacy support (on supported hardware), and that sub-config
> is only presented on X86, ie. ADMIN_LEGACY.
> 
> Ultimately the code already needs to support #ifdefs for the latter and
> I think it's more user friendly and versatile to have the separate
> config option.
> 
> NB. The sub-config should be default on for upgrade compatibility.
> 
>> Is there any reason to compile out the migration support for virtio?
>> No other drivers were doing this?
> 
> No other vfio-pci variant driver provides multiple, independent
> features, so for instance to compile out migration support from the
> vfio-pci-mlx5 driver is to simply disable the driver altogether.
> 
>> kconfig combinations are painful, it woudl be nice to not make too
>> many..
> 
> I'm not arguing for a legacy-only, non-migration version (please speak
> up if someone wants that).  The code already needs to support the
> #ifdefs and I think reflecting that in a sub-config option helps
> clarify what the driver is providing and conveniently makes it possible
> to have a driver with exactly the same feature set across archs, if
> desired.  Thanks,
> 

Since the live migration functionality is not architecture-dependent 
(unlike legacy access, which requires X86) and is likely to be the 
primary use of the driver, I would suggest keeping it outside of any 
#ifdef directives, as initially introduced in V1.

To address the description issue and provide control for customers who 
may need the legacy access functionality, we could introduce a bool 
configuration option as a submenu under the driver’s main live migration 
feature.

This approach will keep things simple and align with the typical use 
case of the driver.

Something like the below [1] can do the job for that.

Alex,
Can that work for you ?

By the way, you have suggested calling the config entry 
VFIO_PCI_ADMIN_LEGACY, don't we need to add here also the VIRTIO as a 
prefix ? (i.e. VIRTIO_VFIO_PCI_ADMIN_LEGACY)

[1]
# SPDX-License-Identifier: GPL-2.0-only

config VIRTIO_VFIO_PCI
         tristate "VFIO support for live migration over VIRTIO NET PCI
                   devices"
         depends on VIRTIO_PCI
         select VFIO_PCI_CORE
         select IOMMUFD_DRIVER
         help
           This provides migration support for VIRTIO NET PCI VF devices
           using the VFIO framework.

           If you don't know what to do here, say N.

config VFIO_PCI_ADMIN_LEGACY
         bool "VFIO support for legacy I/O access for VIRTIO NET PCI
               devices"
         depends on VIRTIO_VFIO_PCI && VIRTIO_PCI_ADMIN_LEGACY
         default y
         help
           This provides support for exposing VIRTIO NET VF devices which
           support legacy IO access, using the VFIO framework that can
           work with a legacy virtio driver in the guest.
           Based on PCIe spec, VFs do not support I/O Space.
           As of that this driver emulates I/O BAR in software to let a
           VF be seen as a transitional device by its users and let it
           work with a legacy driver.

           If you don't know what to do here, say N.

Yishai
Alex Williamson Nov. 7, 2024, 9:25 p.m. UTC | #5
On Thu, 7 Nov 2024 14:57:39 +0200
Yishai Hadas <yishaih@nvidia.com> wrote:

> On 07/11/2024 0:27, Alex Williamson wrote:
> > On Wed, 6 Nov 2024 09:59:09 -0400
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> >> On Tue, Nov 05, 2024 at 04:29:04PM -0700, Alex Williamson wrote:  
> >>>> @@ -1,7 +1,7 @@
> >>>>   # SPDX-License-Identifier: GPL-2.0-only
> >>>>   config VIRTIO_VFIO_PCI
> >>>>           tristate "VFIO support for VIRTIO NET PCI devices"
> >>>> -        depends on VIRTIO_PCI && VIRTIO_PCI_ADMIN_LEGACY
> >>>> +        depends on VIRTIO_PCI
> >>>>           select VFIO_PCI_CORE
> >>>>           help
> >>>>             This provides support for exposing VIRTIO NET VF devices which support
> >>>> @@ -11,5 +11,7 @@ config VIRTIO_VFIO_PCI
> >>>>             As of that this driver emulates I/O BAR in software to let a VF be
> >>>>             seen as a transitional device by its users and let it work with
> >>>>             a legacy driver.
> >>>> +          In addition, it provides migration support for VIRTIO NET VF devices
> >>>> +          using the VFIO framework.  
> >>>
> >>> The first half of this now describes something that may or may not be
> >>> enabled by this config option and the additional help text for
> >>> migration is vague enough relative to PF requirements to get user
> >>> reports that the driver doesn't work as intended.  
> >>
> >> Yes, I think the help should be clearer
> >>  
> >>> For the former, maybe we still want a separate config item that's
> >>> optionally enabled if VIRTIO_VFIO_PCI && VFIO_PCI_ADMIN_LEGACY.  
> >>
> >> If we are going to add a bunch of  #ifdefs/etc for ADMIN_LEGACY we
> >> may as well just use VIRTIO_PCI_ADMIN_LEGACY directly and not
> >> introduce another kconfig for it?  
> > 
> > I think that's what Yishai is proposing, but as we're adding a whole
> > new feature to the driver I'm concerned how the person configuring the
> > kernel knows which features from the description might be available in
> > the resulting driver.
> > 
> > We could maybe solve that with a completely re-written help text that
> > describes the legacy feature as X86-only and migration as a separate
> > architecture independent feature, but people aren't great at reading
> > and part of the audience is going to see "X86" in their peripheral
> > vision and disable it, and maybe even complain that the text was
> > presented to them.
> > 
> > OR, we can just add an optional sub-config bool that makes it easier to
> > describe the (new) main feature of the driver as supporting live
> > migration (on supported hardware) and the sub-config option as
> > providing legacy support (on supported hardware), and that sub-config
> > is only presented on X86, ie. ADMIN_LEGACY.
> > 
> > Ultimately the code already needs to support #ifdefs for the latter and
> > I think it's more user friendly and versatile to have the separate
> > config option.
> > 
> > NB. The sub-config should be default on for upgrade compatibility.
> >   
> >> Is there any reason to compile out the migration support for virtio?
> >> No other drivers were doing this?  
> > 
> > No other vfio-pci variant driver provides multiple, independent
> > features, so for instance to compile out migration support from the
> > vfio-pci-mlx5 driver is to simply disable the driver altogether.
> >   
> >> kconfig combinations are painful, it woudl be nice to not make too
> >> many..  
> > 
> > I'm not arguing for a legacy-only, non-migration version (please speak
> > up if someone wants that).  The code already needs to support the
> > #ifdefs and I think reflecting that in a sub-config option helps
> > clarify what the driver is providing and conveniently makes it possible
> > to have a driver with exactly the same feature set across archs, if
> > desired.  Thanks,
> >   
> 
> Since the live migration functionality is not architecture-dependent 
> (unlike legacy access, which requires X86) and is likely to be the 
> primary use of the driver, I would suggest keeping it outside of any 
> #ifdef directives, as initially introduced in V1.
> 
> To address the description issue and provide control for customers who 
> may need the legacy access functionality, we could introduce a bool 
> configuration option as a submenu under the driver’s main live migration 
> feature.
> 
> This approach will keep things simple and align with the typical use 
> case of the driver.
> 
> Something like the below [1] can do the job for that.
> 
> Alex,
> Can that work for you ?
> 
> By the way, you have suggested calling the config entry 
> VFIO_PCI_ADMIN_LEGACY, don't we need to add here also the VIRTIO as a 
> prefix ? (i.e. VIRTIO_VFIO_PCI_ADMIN_LEGACY)

I think that was just a typo referring to VIRTIO_PCI_ADMIN_LEGACY.
Yes, appending _ADMIN_LEGACY to the main config option is fine.

> [1]
> # SPDX-License-Identifier: GPL-2.0-only
> 
> config VIRTIO_VFIO_PCI
>          tristate "VFIO support for live migration over VIRTIO NET PCI
>                    devices"

Looking at other variant drivers, I think this should just be:

	"VFIO support for VIRTIO NET PCI VF devices"

>          depends on VIRTIO_PCI
>          select VFIO_PCI_CORE
>          select IOMMUFD_DRIVER

IIUC, this is not a dependency, the device will just lack dirty page
tracking with either the type1 backend or when using iommufd when the
IOMMU hardware doesn't have dirty page tracking, therefore all VM
memory is perpetually dirty.  Do I have that right?

>          help
>            This provides migration support for VIRTIO NET PCI VF devices
>            using the VFIO framework.

This is still too open ended for me, there is specific PF support
required in the device to make this work.  Maybe...

	This provides migration support for VIRTIO NET PCI VF devices
	using the VFIO framework.  Migration support requires the
	SR-IOV PF device to support specific VIRTIO extensions,
	otherwise this driver provides no additional functionality
	beyond vfio-pci.

	Migration support in this driver relies on dirty page tracking
	provided by the IOMMU hardware and exposed through IOMMUFD, any
	other use cases are dis-recommended.

>            If you don't know what to do here, say N.
> 
> config VFIO_PCI_ADMIN_LEGACY

VIRTIO_VFIO_PCI_ADMIN_LEGACY

>          bool "VFIO support for legacy I/O access for VIRTIO NET PCI
>                devices"

Maybe:

	"Legacy I/O support for VIRTIO NET PCI VF devices"

>          depends on VIRTIO_VFIO_PCI && VIRTIO_PCI_ADMIN_LEGACY
>          default y
>          help
>            This provides support for exposing VIRTIO NET VF devices which
>            support legacy IO access, using the VFIO framework that can
>            work with a legacy virtio driver in the guest.
>            Based on PCIe spec, VFs do not support I/O Space.
>            As of that this driver emulates I/O BAR in software to let a
>            VF be seen as a transitional device by its users and let it
>            work with a legacy driver.

Maybe:

	This extends the virtio-vfio-pci driver to support legacy I/O
	access, allowing use of legacy virtio drivers with VIRTIO NET
	PCI VF devices.  Legacy I/O support requires the SR-IOV PF
	device to support and enable specific VIRTIO extensions,
	otherwise this driver provides no additional functionality
	beyond vfio-pci.

IMO, noting the PF requirement in each is important (feel free to
elaborate on specific VIRTIO extension requirements).  It doesn't seem
necessary to explain how the legacy compatibility works, only that this
driver makes the VF compatible with the legacy driver.

Are both of these options configurable at the PF in either firmware or
software?  I used "support and enable" in the legacy section assuming
that there is such a knob, but for migration it seems less necessary
that there's an enable step.  Please correct based on the actual
device behavior.  Thanks,

Alex
Yishai Hadas Nov. 11, 2024, 8:22 a.m. UTC | #6
On 07/11/2024 23:25, Alex Williamson wrote:
> On Thu, 7 Nov 2024 14:57:39 +0200
> Yishai Hadas <yishaih@nvidia.com> wrote:
> 
>> On 07/11/2024 0:27, Alex Williamson wrote:
>>> On Wed, 6 Nov 2024 09:59:09 -0400
>>> Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>    
>>>> On Tue, Nov 05, 2024 at 04:29:04PM -0700, Alex Williamson wrote:
>>>>>> @@ -1,7 +1,7 @@
>>>>>>    # SPDX-License-Identifier: GPL-2.0-only
>>>>>>    config VIRTIO_VFIO_PCI
>>>>>>            tristate "VFIO support for VIRTIO NET PCI devices"
>>>>>> -        depends on VIRTIO_PCI && VIRTIO_PCI_ADMIN_LEGACY
>>>>>> +        depends on VIRTIO_PCI
>>>>>>            select VFIO_PCI_CORE
>>>>>>            help
>>>>>>              This provides support for exposing VIRTIO NET VF devices which support
>>>>>> @@ -11,5 +11,7 @@ config VIRTIO_VFIO_PCI
>>>>>>              As of that this driver emulates I/O BAR in software to let a VF be
>>>>>>              seen as a transitional device by its users and let it work with
>>>>>>              a legacy driver.
>>>>>> +          In addition, it provides migration support for VIRTIO NET VF devices
>>>>>> +          using the VFIO framework.
>>>>>
>>>>> The first half of this now describes something that may or may not be
>>>>> enabled by this config option and the additional help text for
>>>>> migration is vague enough relative to PF requirements to get user
>>>>> reports that the driver doesn't work as intended.
>>>>
>>>> Yes, I think the help should be clearer
>>>>   
>>>>> For the former, maybe we still want a separate config item that's
>>>>> optionally enabled if VIRTIO_VFIO_PCI && VFIO_PCI_ADMIN_LEGACY.
>>>>
>>>> If we are going to add a bunch of  #ifdefs/etc for ADMIN_LEGACY we
>>>> may as well just use VIRTIO_PCI_ADMIN_LEGACY directly and not
>>>> introduce another kconfig for it?
>>>
>>> I think that's what Yishai is proposing, but as we're adding a whole
>>> new feature to the driver I'm concerned how the person configuring the
>>> kernel knows which features from the description might be available in
>>> the resulting driver.
>>>
>>> We could maybe solve that with a completely re-written help text that
>>> describes the legacy feature as X86-only and migration as a separate
>>> architecture independent feature, but people aren't great at reading
>>> and part of the audience is going to see "X86" in their peripheral
>>> vision and disable it, and maybe even complain that the text was
>>> presented to them.
>>>
>>> OR, we can just add an optional sub-config bool that makes it easier to
>>> describe the (new) main feature of the driver as supporting live
>>> migration (on supported hardware) and the sub-config option as
>>> providing legacy support (on supported hardware), and that sub-config
>>> is only presented on X86, ie. ADMIN_LEGACY.
>>>
>>> Ultimately the code already needs to support #ifdefs for the latter and
>>> I think it's more user friendly and versatile to have the separate
>>> config option.
>>>
>>> NB. The sub-config should be default on for upgrade compatibility.
>>>    
>>>> Is there any reason to compile out the migration support for virtio?
>>>> No other drivers were doing this?
>>>
>>> No other vfio-pci variant driver provides multiple, independent
>>> features, so for instance to compile out migration support from the
>>> vfio-pci-mlx5 driver is to simply disable the driver altogether.
>>>    
>>>> kconfig combinations are painful, it woudl be nice to not make too
>>>> many..
>>>
>>> I'm not arguing for a legacy-only, non-migration version (please speak
>>> up if someone wants that).  The code already needs to support the
>>> #ifdefs and I think reflecting that in a sub-config option helps
>>> clarify what the driver is providing and conveniently makes it possible
>>> to have a driver with exactly the same feature set across archs, if
>>> desired.  Thanks,
>>>    
>>
>> Since the live migration functionality is not architecture-dependent
>> (unlike legacy access, which requires X86) and is likely to be the
>> primary use of the driver, I would suggest keeping it outside of any
>> #ifdef directives, as initially introduced in V1.
>>
>> To address the description issue and provide control for customers who
>> may need the legacy access functionality, we could introduce a bool
>> configuration option as a submenu under the driver’s main live migration
>> feature.
>>
>> This approach will keep things simple and align with the typical use
>> case of the driver.
>>
>> Something like the below [1] can do the job for that.
>>
>> Alex,
>> Can that work for you ?
>>
>> By the way, you have suggested calling the config entry
>> VFIO_PCI_ADMIN_LEGACY, don't we need to add here also the VIRTIO as a
>> prefix ? (i.e. VIRTIO_VFIO_PCI_ADMIN_LEGACY)
> 
> I think that was just a typo referring to VIRTIO_PCI_ADMIN_LEGACY.
> Yes, appending _ADMIN_LEGACY to the main config option is fine.
> 
>> [1]
>> # SPDX-License-Identifier: GPL-2.0-only
>>
>> config VIRTIO_VFIO_PCI
>>           tristate "VFIO support for live migration over VIRTIO NET PCI
>>                     devices"
> 
> Looking at other variant drivers, I think this should just be:
> 
> 	"VFIO support for VIRTIO NET PCI VF devices"
> 
>>           depends on VIRTIO_PCI
>>           select VFIO_PCI_CORE
>>           select IOMMUFD_DRIVER
> 
> IIUC, this is not a dependency, the device will just lack dirty page
> tracking with either the type1 backend or when using iommufd when the
> IOMMU hardware doesn't have dirty page tracking, therefore all VM
> memory is perpetually dirty.  Do I have that right?

IOMMUFD_DRIVER is selected to utilize the dirty tracking functionality 
of IOMMU. Therefore, this is a select option rather than a dependency, 
similar to how the pds and mlx5 VFIO drivers handle it in their Kconfig 
files.

> 
>>           help
>>             This provides migration support for VIRTIO NET PCI VF devices
>>             using the VFIO framework.
> 
> This is still too open ended for me, there is specific PF support
> required in the device to make this work.  Maybe...
> 
> 	This provides migration support for VIRTIO NET PCI VF devices
> 	using the VFIO framework.  Migration support requires the
> 	SR-IOV PF device to support specific VIRTIO extensions,
> 	otherwise this driver provides no additional functionality
> 	beyond vfio-pci.
> 
> 	Migration support in this driver relies on dirty page tracking
> 	provided by the IOMMU hardware and exposed through IOMMUFD, any
> 	other use cases are dis-recommended.
> 
>>             If you don't know what to do here, say N.

Looks good.

>>
>> config VFIO_PCI_ADMIN_LEGACY
> 
> VIRTIO_VFIO_PCI_ADMIN_LEGACY
> 
>>           bool "VFIO support for legacy I/O access for VIRTIO NET PCI
>>                 devices"
> 
> Maybe:
> 
> 	"Legacy I/O support for VIRTIO NET PCI VF devices"
> 
>>           depends on VIRTIO_VFIO_PCI && VIRTIO_PCI_ADMIN_LEGACY
>>           default y
>>           help
>>             This provides support for exposing VIRTIO NET VF devices which
>>             support legacy IO access, using the VFIO framework that can
>>             work with a legacy virtio driver in the guest.
>>             Based on PCIe spec, VFs do not support I/O Space.
>>             As of that this driver emulates I/O BAR in software to let a
>>             VF be seen as a transitional device by its users and let it
>>             work with a legacy driver.
> 
> Maybe:
> 
> 	This extends the virtio-vfio-pci driver to support legacy I/O
> 	access, allowing use of legacy virtio drivers with VIRTIO NET
> 	PCI VF devices.  Legacy I/O support requires the SR-IOV PF
> 	device to support and enable specific VIRTIO extensions,
> 	otherwise this driver provides no additional functionality
> 	beyond vfio-pci.
> 

Looks good.

> IMO, noting the PF requirement in each is important (feel free to
> elaborate on specific VIRTIO extension requirements).  It doesn't seem
> necessary to explain how the legacy compatibility works, only that this
> driver makes the VF compatible with the legacy driver.
> 
> Are both of these options configurable at the PF in either firmware or
> software? 

These options are configured in the firmware.

  I used "support and enable" in the legacy section assuming
> that there is such a knob, but for migration it seems less necessary
> that there's an enable step.  Please correct based on the actual
> device behavior.  Thanks,
> 

Migration is such a basic functionality that we may expect it to be 
enabled by default, so your suggestion seems reasonable. Let’s proceed 
with it.


Thanks,
Yishai
Joao Martins Nov. 11, 2024, 10:32 a.m. UTC | #7
>>>           depends on VIRTIO_PCI
>>>           select VFIO_PCI_CORE
>>>           select IOMMUFD_DRIVER
>>
>> IIUC, this is not a dependency, the device will just lack dirty page
>> tracking with either the type1 backend or when using iommufd when the
>> IOMMU hardware doesn't have dirty page tracking, therefore all VM
>> memory is perpetually dirty.  Do I have that right?
> 
> IOMMUFD_DRIVER is selected to utilize the dirty tracking functionality of IOMMU.
> Therefore, this is a select option rather than a dependency, similar to how the
> pds and mlx5 VFIO drivers handle it in their Kconfig files.
> 

Yishai, I think Alex is right here.

'select IOMMUFD_DRIVER' is more for VF dirty trackers where it uses the same
helpers as IOMMUFD does for dirty tracking. But it's definitely not signaling
intent for 'IOMMU dirty tracking' but rather 'VF dirty tracking'

If you want to tie in to IOMMU dirty tracking you probably want to do:

	'select IOMMUFD'

But that is a big hammer, as you also need the VFIO_DEVICE_CDEV kconfig selected
as well and probably more.

Perhaps best to do like qat/hisilicon drivers and letting the user optionally
pick it. Migration is anyways disabled when using type1 (unless you force it,
and it then it does the perpectual dirty trick).
Yishai Hadas Nov. 11, 2024, 2:17 p.m. UTC | #8
On 11/11/2024 12:32, Joao Martins wrote:
>>>>            depends on VIRTIO_PCI
>>>>            select VFIO_PCI_CORE
>>>>            select IOMMUFD_DRIVER
>>>
>>> IIUC, this is not a dependency, the device will just lack dirty page
>>> tracking with either the type1 backend or when using iommufd when the
>>> IOMMU hardware doesn't have dirty page tracking, therefore all VM
>>> memory is perpetually dirty.  Do I have that right?
>>
>> IOMMUFD_DRIVER is selected to utilize the dirty tracking functionality of IOMMU.
>> Therefore, this is a select option rather than a dependency, similar to how the
>> pds and mlx5 VFIO drivers handle it in their Kconfig files.
>>
> 
> Yishai, I think Alex is right here.
> 
> 'select IOMMUFD_DRIVER' is more for VF dirty trackers where it uses the same
> helpers as IOMMUFD does for dirty tracking. But it's definitely not signaling
> intent for 'IOMMU dirty tracking' but rather 'VF dirty tracking'

I see, please see below.

> 
> If you want to tie in to IOMMU dirty tracking you probably want to do:
> 
> 	'select IOMMUFD'
> 

Looking at the below Kconfig(s) for AMD/INTEL_IOMMU [1], we can see that 
if IOMMUFD is set IOMMFD_DRIVER is selected.

 From that we can assume that to have 'IOMMU dirty tracking' the 
IOMMFD_DRIVER is finally needed/selected, right ?

So you are saying that it's redundant in the vfio/virtio driver as it 
will be selected down the road once needed ?

[1]
https://elixir.bootlin.com/linux/v6.12-rc6/source/drivers/iommu/intel/Kconfig#L17
https://elixir.bootlin.com/linux/v6.12-rc6/source/drivers/iommu/amd/Kconfig#L16

> But that is a big hammer, as you also need the VFIO_DEVICE_CDEV kconfig selected
> as well and probably more.

I agree, we didn't plan to add those dependencies.

> 
> Perhaps best to do like qat/hisilicon drivers and letting the user optionally
> pick it. Migration is anyways disabled when using type1 (unless you force it,
> and it then it does the perpectual dirty trick).

I can drop it in V3 if we all agree that it's not really needed here.

Thanks,
Yishai
Joao Martins Nov. 11, 2024, 3:30 p.m. UTC | #9
On 11/11/2024 14:17, Yishai Hadas wrote:
> On 11/11/2024 12:32, Joao Martins wrote:
>>>>>            depends on VIRTIO_PCI
>>>>>            select VFIO_PCI_CORE
>>>>>            select IOMMUFD_DRIVER
>>>>
>>>> IIUC, this is not a dependency, the device will just lack dirty page
>>>> tracking with either the type1 backend or when using iommufd when the
>>>> IOMMU hardware doesn't have dirty page tracking, therefore all VM
>>>> memory is perpetually dirty.  Do I have that right?
>>>
>>> IOMMUFD_DRIVER is selected to utilize the dirty tracking functionality of IOMMU.
>>> Therefore, this is a select option rather than a dependency, similar to how the
>>> pds and mlx5 VFIO drivers handle it in their Kconfig files.
>>>
>>
>> Yishai, I think Alex is right here.
>>
>> 'select IOMMUFD_DRIVER' is more for VF dirty trackers where it uses the same
>> helpers as IOMMUFD does for dirty tracking. But it's definitely not signaling
>> intent for 'IOMMU dirty tracking' but rather 'VF dirty tracking'
> 
> I see, please see below.
> 
I should have said that in the context of VFIO drivers selecting it, not broadly
across all drivers that select it.

>>
>> If you want to tie in to IOMMU dirty tracking you probably want to do:
>>
>>     'select IOMMUFD'
>>
> 
> Looking at the below Kconfig(s) for AMD/INTEL_IOMMU [1], we can see that if
> IOMMUFD is set IOMMFD_DRIVER is selected.
> 
Correct.

> From that we can assume that to have 'IOMMU dirty tracking' the IOMMFD_DRIVER is
> finally needed/selected, right ?
> 

Right, if you have CONFIG_IOMMUFD then the IOMMU will in the end auto-select
IOMMU_DRIVER. But standalone at best you can assume that 'something does dirty
tracking'. The context (i.e. who selects it) is what tells you if it's VF or IOMMU.

In your example above, that option is meant to be selected by *a* dirty tracker,
and it's because AMD/Intel was selecting that you would have IOMMU dirty
tracking. The option essentially selects IOVA bitmaps helpers (zerocopy scheme
to set bits in a bitmap) which is both used by VF dirty trackers and IOMMU dirty
trackers. Originally this started in VFIO and later got moved into IOMMUFD which
is why we have this kconfig to allow independent use.

> So you are saying that it's redundant in the vfio/virtio driver as it will be
> selected down the road once needed ?
> 
Right.

Of course, it will always depend on user enabling the right kconfigs and such.
But that would be no different than the other drivers than don't support VF
dirty tracking.

> [1]
> https://elixir.bootlin.com/linux/v6.12-rc6/source/drivers/iommu/intel/Kconfig#L17
> https://elixir.bootlin.com/linux/v6.12-rc6/source/drivers/iommu/amd/Kconfig#L16
> 
>> But that is a big hammer, as you also need the VFIO_DEVICE_CDEV kconfig selected
>> as well and probably more.
> 
> I agree, we didn't plan to add those dependencies.
> 
>>
>> Perhaps best to do like qat/hisilicon drivers and letting the user optionally
>> pick it. Migration is anyways disabled when using type1 (unless you force it,
>> and it then it does the perpectual dirty trick).
> 
> I can drop it in V3 if we all agree that it's not really needed here.
> 

Sure
Alex Williamson Nov. 11, 2024, 9:27 p.m. UTC | #10
On Mon, 11 Nov 2024 15:30:47 +0000
Joao Martins <joao.m.martins@oracle.com> wrote:

> On 11/11/2024 14:17, Yishai Hadas wrote:
> > On 11/11/2024 12:32, Joao Martins wrote:  
> >>>>>            depends on VIRTIO_PCI
> >>>>>            select VFIO_PCI_CORE
> >>>>>            select IOMMUFD_DRIVER  
> >>>>
> >>>> IIUC, this is not a dependency, the device will just lack dirty page
> >>>> tracking with either the type1 backend or when using iommufd when the
> >>>> IOMMU hardware doesn't have dirty page tracking, therefore all VM
> >>>> memory is perpetually dirty.  Do I have that right?  
> >>>
> >>> IOMMUFD_DRIVER is selected to utilize the dirty tracking functionality of IOMMU.
> >>> Therefore, this is a select option rather than a dependency, similar to how the
> >>> pds and mlx5 VFIO drivers handle it in their Kconfig files.
> >>>  
> >>
> >> Yishai, I think Alex is right here.
> >>
> >> 'select IOMMUFD_DRIVER' is more for VF dirty trackers where it uses the same
> >> helpers as IOMMUFD does for dirty tracking. But it's definitely not signaling
> >> intent for 'IOMMU dirty tracking' but rather 'VF dirty tracking'  
> > 
> > I see, please see below.
> >   
> I should have said that in the context of VFIO drivers selecting it, not broadly
> across all drivers that select it.
> 
> >>
> >> If you want to tie in to IOMMU dirty tracking you probably want to do:
> >>
> >>     'select IOMMUFD'
> >>  
> > 
> > Looking at the below Kconfig(s) for AMD/INTEL_IOMMU [1], we can see that if
> > IOMMUFD is set IOMMFD_DRIVER is selected.
> >   
> Correct.
> 
> > From that we can assume that to have 'IOMMU dirty tracking' the IOMMFD_DRIVER is
> > finally needed/selected, right ?
> >   
> 
> Right, if you have CONFIG_IOMMUFD then the IOMMU will in the end auto-select
> IOMMU_DRIVER. But standalone at best you can assume that 'something does dirty
> tracking'. The context (i.e. who selects it) is what tells you if it's VF or IOMMU.
> 
> In your example above, that option is meant to be selected by *a* dirty tracker,
> and it's because AMD/Intel was selecting that you would have IOMMU dirty
> tracking. The option essentially selects IOVA bitmaps helpers (zerocopy scheme
> to set bits in a bitmap) which is both used by VF dirty trackers and IOMMU dirty
> trackers. Originally this started in VFIO and later got moved into IOMMUFD which
> is why we have this kconfig to allow independent use.

Yeah, I agree.  IOMMUFD_DRIVER is only configuring in iova_bitmap
support independent of IOMMUFD, which mlx5 requires, but this does not.

> > So you are saying that it's redundant in the vfio/virtio driver as it will be
> > selected down the road once needed ?
> >   
> Right.
> 
> Of course, it will always depend on user enabling the right kconfigs and such.
> But that would be no different than the other drivers than don't support VF
> dirty tracking.
> 
> > [1]
> > https://elixir.bootlin.com/linux/v6.12-rc6/source/drivers/iommu/intel/Kconfig#L17
> > https://elixir.bootlin.com/linux/v6.12-rc6/source/drivers/iommu/amd/Kconfig#L16
> >   
> >> But that is a big hammer, as you also need the VFIO_DEVICE_CDEV kconfig selected
> >> as well and probably more.  
> > 
> > I agree, we didn't plan to add those dependencies.
> >   
> >>
> >> Perhaps best to do like qat/hisilicon drivers and letting the user optionally
> >> pick it. Migration is anyways disabled when using type1 (unless you force it,
> >> and it then it does the perpectual dirty trick).  

Yes, at least for QEMU, unless the user forces the device to support
migration we'll add a migration blocker rather than use the perpetually
dirty trick.  Ultimately the support depends on the underlying IOMMU
capabilities where we're running anyway, so it makes sense to me to
leave this to the user rather than trying to force a kernel config that
can support IOMMU dirty page tracking.  Thanks,

Alex
diff mbox series

Patch

diff --git a/drivers/vfio/pci/virtio/Kconfig b/drivers/vfio/pci/virtio/Kconfig
index bd80eca4a196..af1dd9e84a5c 100644
--- a/drivers/vfio/pci/virtio/Kconfig
+++ b/drivers/vfio/pci/virtio/Kconfig
@@ -1,7 +1,7 @@ 
 # SPDX-License-Identifier: GPL-2.0-only
 config VIRTIO_VFIO_PCI
         tristate "VFIO support for VIRTIO NET PCI devices"
-        depends on VIRTIO_PCI && VIRTIO_PCI_ADMIN_LEGACY
+        depends on VIRTIO_PCI
         select VFIO_PCI_CORE
         help
           This provides support for exposing VIRTIO NET VF devices which support
@@ -11,5 +11,7 @@  config VIRTIO_VFIO_PCI
           As of that this driver emulates I/O BAR in software to let a VF be
           seen as a transitional device by its users and let it work with
           a legacy driver.
+          In addition, it provides migration support for VIRTIO NET VF devices
+          using the VFIO framework.
 
           If you don't know what to do here, say N.
diff --git a/drivers/vfio/pci/virtio/Makefile b/drivers/vfio/pci/virtio/Makefile
index bf0ccde6a91a..0032e6db4636 100644
--- a/drivers/vfio/pci/virtio/Makefile
+++ b/drivers/vfio/pci/virtio/Makefile
@@ -1,3 +1,4 @@ 
 # SPDX-License-Identifier: GPL-2.0-only
 obj-$(CONFIG_VIRTIO_VFIO_PCI) += virtio-vfio-pci.o
 virtio-vfio-pci-y := main.o migrate.o
+virtio-vfio-pci-$(CONFIG_VIRTIO_PCI_ADMIN_LEGACY) += legacy_io.o
diff --git a/drivers/vfio/pci/virtio/common.h b/drivers/vfio/pci/virtio/common.h
index 5704603f0f9d..bca797b82f01 100644
--- a/drivers/vfio/pci/virtio/common.h
+++ b/drivers/vfio/pci/virtio/common.h
@@ -78,6 +78,7 @@  struct virtiovf_migration_file {
 
 struct virtiovf_pci_core_device {
 	struct vfio_pci_core_device core_device;
+#ifdef CONFIG_VIRTIO_PCI_ADMIN_LEGACY
 	u8 *bar0_virtual_buf;
 	/* synchronize access to the virtual buf */
 	struct mutex bar_mutex;
@@ -87,6 +88,7 @@  struct virtiovf_pci_core_device {
 	__le16 pci_cmd;
 	u8 bar0_virtual_buf_size;
 	u8 notify_bar;
+#endif
 
 	/* LM related */
 	u8 migrate_cap:1;
@@ -105,4 +107,21 @@  void virtiovf_open_migration(struct virtiovf_pci_core_device *virtvdev);
 void virtiovf_close_migration(struct virtiovf_pci_core_device *virtvdev);
 void virtiovf_migration_reset_done(struct pci_dev *pdev);
 
+#ifdef CONFIG_VIRTIO_PCI_ADMIN_LEGACY
+int virtiovf_open_legacy_io(struct virtiovf_pci_core_device *virtvdev);
+long virtiovf_vfio_pci_core_ioctl(struct vfio_device *core_vdev,
+				  unsigned int cmd, unsigned long arg);
+int virtiovf_pci_ioctl_get_region_info(struct vfio_device *core_vdev,
+				       unsigned int cmd, unsigned long arg);
+ssize_t virtiovf_pci_core_write(struct vfio_device *core_vdev,
+				const char __user *buf, size_t count,
+				loff_t *ppos);
+ssize_t virtiovf_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
+			       size_t count, loff_t *ppos);
+int virtiovf_init_legacy_io(struct virtiovf_pci_core_device *virtvdev,
+			    bool *sup_legacy_io);
+void virtiovf_release_legacy_io(struct virtiovf_pci_core_device *virtvdev);
+void virtiovf_legacy_io_reset_done(struct pci_dev *pdev);
+#endif
+
 #endif /* VIRTIO_VFIO_COMMON_H */
diff --git a/drivers/vfio/pci/virtio/legacy_io.c b/drivers/vfio/pci/virtio/legacy_io.c
new file mode 100644
index 000000000000..52c7515ff020
--- /dev/null
+++ b/drivers/vfio/pci/virtio/legacy_io.c
@@ -0,0 +1,420 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ */
+
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/pci.h>
+#include <linux/pm_runtime.h>
+#include <linux/types.h>
+#include <linux/uaccess.h>
+#include <linux/vfio.h>
+#include <linux/vfio_pci_core.h>
+#include <linux/virtio_pci.h>
+#include <linux/virtio_net.h>
+#include <linux/virtio_pci_admin.h>
+
+#include "common.h"
+
+static int
+virtiovf_issue_legacy_rw_cmd(struct virtiovf_pci_core_device *virtvdev,
+			     loff_t pos, char __user *buf,
+			     size_t count, bool read)
+{
+	bool msix_enabled =
+		(virtvdev->core_device.irq_type == VFIO_PCI_MSIX_IRQ_INDEX);
+	struct pci_dev *pdev = virtvdev->core_device.pdev;
+	u8 *bar0_buf = virtvdev->bar0_virtual_buf;
+	bool common;
+	u8 offset;
+	int ret;
+
+	common = pos < VIRTIO_PCI_CONFIG_OFF(msix_enabled);
+	/* offset within the relevant configuration area */
+	offset = common ? pos : pos - VIRTIO_PCI_CONFIG_OFF(msix_enabled);
+	mutex_lock(&virtvdev->bar_mutex);
+	if (read) {
+		if (common)
+			ret = virtio_pci_admin_legacy_common_io_read(pdev, offset,
+					count, bar0_buf + pos);
+		else
+			ret = virtio_pci_admin_legacy_device_io_read(pdev, offset,
+					count, bar0_buf + pos);
+		if (ret)
+			goto out;
+		if (copy_to_user(buf, bar0_buf + pos, count))
+			ret = -EFAULT;
+	} else {
+		if (copy_from_user(bar0_buf + pos, buf, count)) {
+			ret = -EFAULT;
+			goto out;
+		}
+
+		if (common)
+			ret = virtio_pci_admin_legacy_common_io_write(pdev, offset,
+					count, bar0_buf + pos);
+		else
+			ret = virtio_pci_admin_legacy_device_io_write(pdev, offset,
+					count, bar0_buf + pos);
+	}
+out:
+	mutex_unlock(&virtvdev->bar_mutex);
+	return ret;
+}
+
+static int
+virtiovf_pci_bar0_rw(struct virtiovf_pci_core_device *virtvdev,
+		     loff_t pos, char __user *buf,
+		     size_t count, bool read)
+{
+	struct vfio_pci_core_device *core_device = &virtvdev->core_device;
+	struct pci_dev *pdev = core_device->pdev;
+	u16 queue_notify;
+	int ret;
+
+	if (!(le16_to_cpu(virtvdev->pci_cmd) & PCI_COMMAND_IO))
+		return -EIO;
+
+	if (pos + count > virtvdev->bar0_virtual_buf_size)
+		return -EINVAL;
+
+	ret = pm_runtime_resume_and_get(&pdev->dev);
+	if (ret) {
+		pci_info_ratelimited(pdev, "runtime resume failed %d\n", ret);
+		return -EIO;
+	}
+
+	switch (pos) {
+	case VIRTIO_PCI_QUEUE_NOTIFY:
+		if (count != sizeof(queue_notify)) {
+			ret = -EINVAL;
+			goto end;
+		}
+		if (read) {
+			ret = vfio_pci_core_ioread16(core_device, true, &queue_notify,
+						     virtvdev->notify_addr);
+			if (ret)
+				goto end;
+			if (copy_to_user(buf, &queue_notify,
+					 sizeof(queue_notify))) {
+				ret = -EFAULT;
+				goto end;
+			}
+		} else {
+			if (copy_from_user(&queue_notify, buf, count)) {
+				ret = -EFAULT;
+				goto end;
+			}
+			ret = vfio_pci_core_iowrite16(core_device, true, queue_notify,
+						      virtvdev->notify_addr);
+		}
+		break;
+	default:
+		ret = virtiovf_issue_legacy_rw_cmd(virtvdev, pos, buf, count,
+						   read);
+	}
+
+end:
+	pm_runtime_put(&pdev->dev);
+	return ret ? ret : count;
+}
+
+static ssize_t virtiovf_pci_read_config(struct vfio_device *core_vdev,
+					char __user *buf, size_t count,
+					loff_t *ppos)
+{
+	struct virtiovf_pci_core_device *virtvdev = container_of(
+		core_vdev, struct virtiovf_pci_core_device, core_device.vdev);
+	loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
+	size_t register_offset;
+	loff_t copy_offset;
+	size_t copy_count;
+	__le32 val32;
+	__le16 val16;
+	u8 val8;
+	int ret;
+
+	ret = vfio_pci_core_read(core_vdev, buf, count, ppos);
+	if (ret < 0)
+		return ret;
+
+	if (vfio_pci_core_range_intersect_range(pos, count, PCI_DEVICE_ID,
+						sizeof(val16), &copy_offset,
+						&copy_count, &register_offset)) {
+		val16 = cpu_to_le16(VIRTIO_TRANS_ID_NET);
+		if (copy_to_user(buf + copy_offset, (void *)&val16 + register_offset, copy_count))
+			return -EFAULT;
+	}
+
+	if ((le16_to_cpu(virtvdev->pci_cmd) & PCI_COMMAND_IO) &&
+	    vfio_pci_core_range_intersect_range(pos, count, PCI_COMMAND,
+						sizeof(val16), &copy_offset,
+						&copy_count, &register_offset)) {
+		if (copy_from_user((void *)&val16 + register_offset, buf + copy_offset,
+				   copy_count))
+			return -EFAULT;
+		val16 |= cpu_to_le16(PCI_COMMAND_IO);
+		if (copy_to_user(buf + copy_offset, (void *)&val16 + register_offset,
+				 copy_count))
+			return -EFAULT;
+	}
+
+	if (vfio_pci_core_range_intersect_range(pos, count, PCI_REVISION_ID,
+						sizeof(val8), &copy_offset,
+						&copy_count, &register_offset)) {
+		/* Transional needs to have revision 0 */
+		val8 = 0;
+		if (copy_to_user(buf + copy_offset, &val8, copy_count))
+			return -EFAULT;
+	}
+
+	if (vfio_pci_core_range_intersect_range(pos, count, PCI_BASE_ADDRESS_0,
+						sizeof(val32), &copy_offset,
+						&copy_count, &register_offset)) {
+		u32 bar_mask = ~(virtvdev->bar0_virtual_buf_size - 1);
+		u32 pci_base_addr_0 = le32_to_cpu(virtvdev->pci_base_addr_0);
+
+		val32 = cpu_to_le32((pci_base_addr_0 & bar_mask) | PCI_BASE_ADDRESS_SPACE_IO);
+		if (copy_to_user(buf + copy_offset, (void *)&val32 + register_offset, copy_count))
+			return -EFAULT;
+	}
+
+	if (vfio_pci_core_range_intersect_range(pos, count, PCI_SUBSYSTEM_ID,
+						sizeof(val16), &copy_offset,
+						&copy_count, &register_offset)) {
+		/*
+		 * Transitional devices use the PCI subsystem device id as
+		 * virtio device id, same as legacy driver always did.
+		 */
+		val16 = cpu_to_le16(VIRTIO_ID_NET);
+		if (copy_to_user(buf + copy_offset, (void *)&val16 + register_offset,
+				 copy_count))
+			return -EFAULT;
+	}
+
+	if (vfio_pci_core_range_intersect_range(pos, count, PCI_SUBSYSTEM_VENDOR_ID,
+						sizeof(val16), &copy_offset,
+						&copy_count, &register_offset)) {
+		val16 = cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET);
+		if (copy_to_user(buf + copy_offset, (void *)&val16 + register_offset,
+				 copy_count))
+			return -EFAULT;
+	}
+
+	return count;
+}
+
+ssize_t virtiovf_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
+			       size_t count, loff_t *ppos)
+{
+	struct virtiovf_pci_core_device *virtvdev = container_of(
+		core_vdev, struct virtiovf_pci_core_device, core_device.vdev);
+	unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos);
+	loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
+
+	if (!count)
+		return 0;
+
+	if (index == VFIO_PCI_CONFIG_REGION_INDEX)
+		return virtiovf_pci_read_config(core_vdev, buf, count, ppos);
+
+	if (index == VFIO_PCI_BAR0_REGION_INDEX)
+		return virtiovf_pci_bar0_rw(virtvdev, pos, buf, count, true);
+
+	return vfio_pci_core_read(core_vdev, buf, count, ppos);
+}
+
+static ssize_t virtiovf_pci_write_config(struct vfio_device *core_vdev,
+					 const char __user *buf, size_t count,
+					 loff_t *ppos)
+{
+	struct virtiovf_pci_core_device *virtvdev = container_of(
+		core_vdev, struct virtiovf_pci_core_device, core_device.vdev);
+	loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
+	size_t register_offset;
+	loff_t copy_offset;
+	size_t copy_count;
+
+	if (vfio_pci_core_range_intersect_range(pos, count, PCI_COMMAND,
+						sizeof(virtvdev->pci_cmd),
+						&copy_offset, &copy_count,
+						&register_offset)) {
+		if (copy_from_user((void *)&virtvdev->pci_cmd + register_offset,
+				   buf + copy_offset,
+				   copy_count))
+			return -EFAULT;
+	}
+
+	if (vfio_pci_core_range_intersect_range(pos, count, PCI_BASE_ADDRESS_0,
+						sizeof(virtvdev->pci_base_addr_0),
+						&copy_offset, &copy_count,
+						&register_offset)) {
+		if (copy_from_user((void *)&virtvdev->pci_base_addr_0 + register_offset,
+				   buf + copy_offset,
+				   copy_count))
+			return -EFAULT;
+	}
+
+	return vfio_pci_core_write(core_vdev, buf, count, ppos);
+}
+
+ssize_t virtiovf_pci_core_write(struct vfio_device *core_vdev, const char __user *buf,
+				size_t count, loff_t *ppos)
+{
+	struct virtiovf_pci_core_device *virtvdev = container_of(
+		core_vdev, struct virtiovf_pci_core_device, core_device.vdev);
+	unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos);
+	loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
+
+	if (!count)
+		return 0;
+
+	if (index == VFIO_PCI_CONFIG_REGION_INDEX)
+		return virtiovf_pci_write_config(core_vdev, buf, count, ppos);
+
+	if (index == VFIO_PCI_BAR0_REGION_INDEX)
+		return virtiovf_pci_bar0_rw(virtvdev, pos, (char __user *)buf, count, false);
+
+	return vfio_pci_core_write(core_vdev, buf, count, ppos);
+}
+
+int virtiovf_pci_ioctl_get_region_info(struct vfio_device *core_vdev,
+				       unsigned int cmd, unsigned long arg)
+{
+	struct virtiovf_pci_core_device *virtvdev = container_of(
+		core_vdev, struct virtiovf_pci_core_device, core_device.vdev);
+	unsigned long minsz = offsetofend(struct vfio_region_info, offset);
+	void __user *uarg = (void __user *)arg;
+	struct vfio_region_info info = {};
+
+	if (copy_from_user(&info, uarg, minsz))
+		return -EFAULT;
+
+	if (info.argsz < minsz)
+		return -EINVAL;
+
+	switch (info.index) {
+	case VFIO_PCI_BAR0_REGION_INDEX:
+		info.offset = VFIO_PCI_INDEX_TO_OFFSET(info.index);
+		info.size = virtvdev->bar0_virtual_buf_size;
+		info.flags = VFIO_REGION_INFO_FLAG_READ |
+			     VFIO_REGION_INFO_FLAG_WRITE;
+		return copy_to_user(uarg, &info, minsz) ? -EFAULT : 0;
+	default:
+		return vfio_pci_core_ioctl(core_vdev, cmd, arg);
+	}
+}
+
+long virtiovf_vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
+				  unsigned long arg)
+{
+	switch (cmd) {
+	case VFIO_DEVICE_GET_REGION_INFO:
+		return virtiovf_pci_ioctl_get_region_info(core_vdev, cmd, arg);
+	default:
+		return vfio_pci_core_ioctl(core_vdev, cmd, arg);
+	}
+}
+
+static int virtiovf_set_notify_addr(struct virtiovf_pci_core_device *virtvdev)
+{
+	struct vfio_pci_core_device *core_device = &virtvdev->core_device;
+	int ret;
+
+	/*
+	 * Setup the BAR where the 'notify' exists to be used by vfio as well
+	 * This will let us mmap it only once and use it when needed.
+	 */
+	ret = vfio_pci_core_setup_barmap(core_device,
+					 virtvdev->notify_bar);
+	if (ret)
+		return ret;
+
+	virtvdev->notify_addr = core_device->barmap[virtvdev->notify_bar] +
+			virtvdev->notify_offset;
+	return 0;
+}
+
+int virtiovf_open_legacy_io(struct virtiovf_pci_core_device *virtvdev)
+{
+	if (!virtvdev->bar0_virtual_buf)
+		return 0;
+
+	/*
+	 * Upon close_device() the vfio_pci_core_disable() is called
+	 * and will close all the previous mmaps, so it seems that the
+	 * valid life cycle for the 'notify' addr is per open/close.
+	 */
+	return virtiovf_set_notify_addr(virtvdev);
+}
+
+static int virtiovf_get_device_config_size(unsigned short device)
+{
+	/* Network card */
+	return offsetofend(struct virtio_net_config, status);
+}
+
+static int virtiovf_read_notify_info(struct virtiovf_pci_core_device *virtvdev)
+{
+	u64 offset;
+	int ret;
+	u8 bar;
+
+	ret = virtio_pci_admin_legacy_io_notify_info(virtvdev->core_device.pdev,
+				VIRTIO_ADMIN_CMD_NOTIFY_INFO_FLAGS_OWNER_MEM,
+				&bar, &offset);
+	if (ret)
+		return ret;
+
+	virtvdev->notify_bar = bar;
+	virtvdev->notify_offset = offset;
+	return 0;
+}
+
+static bool virtiovf_bar0_exists(struct pci_dev *pdev)
+{
+	struct resource *res = pdev->resource;
+
+	return res->flags;
+}
+
+int virtiovf_init_legacy_io(struct virtiovf_pci_core_device *virtvdev,
+			    bool *sup_legacy_io)
+{
+	struct pci_dev *pdev = virtvdev->core_device.pdev;
+	int ret;
+
+	*sup_legacy_io = virtio_pci_admin_has_legacy_io(pdev) &&
+			!virtiovf_bar0_exists(pdev);
+
+	if (!*sup_legacy_io)
+		return 0;
+
+	ret = virtiovf_read_notify_info(virtvdev);
+	if (ret)
+		return ret;
+
+	virtvdev->bar0_virtual_buf_size = VIRTIO_PCI_CONFIG_OFF(true) +
+				virtiovf_get_device_config_size(pdev->device);
+	BUILD_BUG_ON(!is_power_of_2(virtvdev->bar0_virtual_buf_size));
+	virtvdev->bar0_virtual_buf = kzalloc(virtvdev->bar0_virtual_buf_size,
+					     GFP_KERNEL);
+	if (!virtvdev->bar0_virtual_buf)
+		return -ENOMEM;
+	mutex_init(&virtvdev->bar_mutex);
+	return 0;
+}
+
+void virtiovf_release_legacy_io(struct virtiovf_pci_core_device *virtvdev)
+{
+	kfree(virtvdev->bar0_virtual_buf);
+}
+
+void virtiovf_legacy_io_reset_done(struct pci_dev *pdev)
+{
+	struct virtiovf_pci_core_device *virtvdev = dev_get_drvdata(&pdev->dev);
+
+	virtvdev->pci_cmd = 0;
+}
diff --git a/drivers/vfio/pci/virtio/main.c b/drivers/vfio/pci/virtio/main.c
index e2cdf2d48200..01b237908e7a 100644
--- a/drivers/vfio/pci/virtio/main.c
+++ b/drivers/vfio/pci/virtio/main.c
@@ -20,330 +20,6 @@ 
 
 static int virtiovf_pci_init_device(struct vfio_device *core_vdev);
 
-static int
-virtiovf_issue_legacy_rw_cmd(struct virtiovf_pci_core_device *virtvdev,
-			     loff_t pos, char __user *buf,
-			     size_t count, bool read)
-{
-	bool msix_enabled =
-		(virtvdev->core_device.irq_type == VFIO_PCI_MSIX_IRQ_INDEX);
-	struct pci_dev *pdev = virtvdev->core_device.pdev;
-	u8 *bar0_buf = virtvdev->bar0_virtual_buf;
-	bool common;
-	u8 offset;
-	int ret;
-
-	common = pos < VIRTIO_PCI_CONFIG_OFF(msix_enabled);
-	/* offset within the relevant configuration area */
-	offset = common ? pos : pos - VIRTIO_PCI_CONFIG_OFF(msix_enabled);
-	mutex_lock(&virtvdev->bar_mutex);
-	if (read) {
-		if (common)
-			ret = virtio_pci_admin_legacy_common_io_read(pdev, offset,
-					count, bar0_buf + pos);
-		else
-			ret = virtio_pci_admin_legacy_device_io_read(pdev, offset,
-					count, bar0_buf + pos);
-		if (ret)
-			goto out;
-		if (copy_to_user(buf, bar0_buf + pos, count))
-			ret = -EFAULT;
-	} else {
-		if (copy_from_user(bar0_buf + pos, buf, count)) {
-			ret = -EFAULT;
-			goto out;
-		}
-
-		if (common)
-			ret = virtio_pci_admin_legacy_common_io_write(pdev, offset,
-					count, bar0_buf + pos);
-		else
-			ret = virtio_pci_admin_legacy_device_io_write(pdev, offset,
-					count, bar0_buf + pos);
-	}
-out:
-	mutex_unlock(&virtvdev->bar_mutex);
-	return ret;
-}
-
-static int
-virtiovf_pci_bar0_rw(struct virtiovf_pci_core_device *virtvdev,
-		     loff_t pos, char __user *buf,
-		     size_t count, bool read)
-{
-	struct vfio_pci_core_device *core_device = &virtvdev->core_device;
-	struct pci_dev *pdev = core_device->pdev;
-	u16 queue_notify;
-	int ret;
-
-	if (!(le16_to_cpu(virtvdev->pci_cmd) & PCI_COMMAND_IO))
-		return -EIO;
-
-	if (pos + count > virtvdev->bar0_virtual_buf_size)
-		return -EINVAL;
-
-	ret = pm_runtime_resume_and_get(&pdev->dev);
-	if (ret) {
-		pci_info_ratelimited(pdev, "runtime resume failed %d\n", ret);
-		return -EIO;
-	}
-
-	switch (pos) {
-	case VIRTIO_PCI_QUEUE_NOTIFY:
-		if (count != sizeof(queue_notify)) {
-			ret = -EINVAL;
-			goto end;
-		}
-		if (read) {
-			ret = vfio_pci_core_ioread16(core_device, true, &queue_notify,
-						     virtvdev->notify_addr);
-			if (ret)
-				goto end;
-			if (copy_to_user(buf, &queue_notify,
-					 sizeof(queue_notify))) {
-				ret = -EFAULT;
-				goto end;
-			}
-		} else {
-			if (copy_from_user(&queue_notify, buf, count)) {
-				ret = -EFAULT;
-				goto end;
-			}
-			ret = vfio_pci_core_iowrite16(core_device, true, queue_notify,
-						      virtvdev->notify_addr);
-		}
-		break;
-	default:
-		ret = virtiovf_issue_legacy_rw_cmd(virtvdev, pos, buf, count,
-						   read);
-	}
-
-end:
-	pm_runtime_put(&pdev->dev);
-	return ret ? ret : count;
-}
-
-static ssize_t virtiovf_pci_read_config(struct vfio_device *core_vdev,
-					char __user *buf, size_t count,
-					loff_t *ppos)
-{
-	struct virtiovf_pci_core_device *virtvdev = container_of(
-		core_vdev, struct virtiovf_pci_core_device, core_device.vdev);
-	loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
-	size_t register_offset;
-	loff_t copy_offset;
-	size_t copy_count;
-	__le32 val32;
-	__le16 val16;
-	u8 val8;
-	int ret;
-
-	ret = vfio_pci_core_read(core_vdev, buf, count, ppos);
-	if (ret < 0)
-		return ret;
-
-	if (vfio_pci_core_range_intersect_range(pos, count, PCI_DEVICE_ID,
-						sizeof(val16), &copy_offset,
-						&copy_count, &register_offset)) {
-		val16 = cpu_to_le16(VIRTIO_TRANS_ID_NET);
-		if (copy_to_user(buf + copy_offset, (void *)&val16 + register_offset, copy_count))
-			return -EFAULT;
-	}
-
-	if ((le16_to_cpu(virtvdev->pci_cmd) & PCI_COMMAND_IO) &&
-	    vfio_pci_core_range_intersect_range(pos, count, PCI_COMMAND,
-						sizeof(val16), &copy_offset,
-						&copy_count, &register_offset)) {
-		if (copy_from_user((void *)&val16 + register_offset, buf + copy_offset,
-				   copy_count))
-			return -EFAULT;
-		val16 |= cpu_to_le16(PCI_COMMAND_IO);
-		if (copy_to_user(buf + copy_offset, (void *)&val16 + register_offset,
-				 copy_count))
-			return -EFAULT;
-	}
-
-	if (vfio_pci_core_range_intersect_range(pos, count, PCI_REVISION_ID,
-						sizeof(val8), &copy_offset,
-						&copy_count, &register_offset)) {
-		/* Transional needs to have revision 0 */
-		val8 = 0;
-		if (copy_to_user(buf + copy_offset, &val8, copy_count))
-			return -EFAULT;
-	}
-
-	if (vfio_pci_core_range_intersect_range(pos, count, PCI_BASE_ADDRESS_0,
-						sizeof(val32), &copy_offset,
-						&copy_count, &register_offset)) {
-		u32 bar_mask = ~(virtvdev->bar0_virtual_buf_size - 1);
-		u32 pci_base_addr_0 = le32_to_cpu(virtvdev->pci_base_addr_0);
-
-		val32 = cpu_to_le32((pci_base_addr_0 & bar_mask) | PCI_BASE_ADDRESS_SPACE_IO);
-		if (copy_to_user(buf + copy_offset, (void *)&val32 + register_offset, copy_count))
-			return -EFAULT;
-	}
-
-	if (vfio_pci_core_range_intersect_range(pos, count, PCI_SUBSYSTEM_ID,
-						sizeof(val16), &copy_offset,
-						&copy_count, &register_offset)) {
-		/*
-		 * Transitional devices use the PCI subsystem device id as
-		 * virtio device id, same as legacy driver always did.
-		 */
-		val16 = cpu_to_le16(VIRTIO_ID_NET);
-		if (copy_to_user(buf + copy_offset, (void *)&val16 + register_offset,
-				 copy_count))
-			return -EFAULT;
-	}
-
-	if (vfio_pci_core_range_intersect_range(pos, count, PCI_SUBSYSTEM_VENDOR_ID,
-						sizeof(val16), &copy_offset,
-						&copy_count, &register_offset)) {
-		val16 = cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET);
-		if (copy_to_user(buf + copy_offset, (void *)&val16 + register_offset,
-				 copy_count))
-			return -EFAULT;
-	}
-
-	return count;
-}
-
-static ssize_t
-virtiovf_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
-		       size_t count, loff_t *ppos)
-{
-	struct virtiovf_pci_core_device *virtvdev = container_of(
-		core_vdev, struct virtiovf_pci_core_device, core_device.vdev);
-	unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos);
-	loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
-
-	if (!count)
-		return 0;
-
-	if (index == VFIO_PCI_CONFIG_REGION_INDEX)
-		return virtiovf_pci_read_config(core_vdev, buf, count, ppos);
-
-	if (index == VFIO_PCI_BAR0_REGION_INDEX)
-		return virtiovf_pci_bar0_rw(virtvdev, pos, buf, count, true);
-
-	return vfio_pci_core_read(core_vdev, buf, count, ppos);
-}
-
-static ssize_t virtiovf_pci_write_config(struct vfio_device *core_vdev,
-					 const char __user *buf, size_t count,
-					 loff_t *ppos)
-{
-	struct virtiovf_pci_core_device *virtvdev = container_of(
-		core_vdev, struct virtiovf_pci_core_device, core_device.vdev);
-	loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
-	size_t register_offset;
-	loff_t copy_offset;
-	size_t copy_count;
-
-	if (vfio_pci_core_range_intersect_range(pos, count, PCI_COMMAND,
-						sizeof(virtvdev->pci_cmd),
-						&copy_offset, &copy_count,
-						&register_offset)) {
-		if (copy_from_user((void *)&virtvdev->pci_cmd + register_offset,
-				   buf + copy_offset,
-				   copy_count))
-			return -EFAULT;
-	}
-
-	if (vfio_pci_core_range_intersect_range(pos, count, PCI_BASE_ADDRESS_0,
-						sizeof(virtvdev->pci_base_addr_0),
-						&copy_offset, &copy_count,
-						&register_offset)) {
-		if (copy_from_user((void *)&virtvdev->pci_base_addr_0 + register_offset,
-				   buf + copy_offset,
-				   copy_count))
-			return -EFAULT;
-	}
-
-	return vfio_pci_core_write(core_vdev, buf, count, ppos);
-}
-
-static ssize_t
-virtiovf_pci_core_write(struct vfio_device *core_vdev, const char __user *buf,
-			size_t count, loff_t *ppos)
-{
-	struct virtiovf_pci_core_device *virtvdev = container_of(
-		core_vdev, struct virtiovf_pci_core_device, core_device.vdev);
-	unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos);
-	loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
-
-	if (!count)
-		return 0;
-
-	if (index == VFIO_PCI_CONFIG_REGION_INDEX)
-		return virtiovf_pci_write_config(core_vdev, buf, count, ppos);
-
-	if (index == VFIO_PCI_BAR0_REGION_INDEX)
-		return virtiovf_pci_bar0_rw(virtvdev, pos, (char __user *)buf, count, false);
-
-	return vfio_pci_core_write(core_vdev, buf, count, ppos);
-}
-
-static int
-virtiovf_pci_ioctl_get_region_info(struct vfio_device *core_vdev,
-				   unsigned int cmd, unsigned long arg)
-{
-	struct virtiovf_pci_core_device *virtvdev = container_of(
-		core_vdev, struct virtiovf_pci_core_device, core_device.vdev);
-	unsigned long minsz = offsetofend(struct vfio_region_info, offset);
-	void __user *uarg = (void __user *)arg;
-	struct vfio_region_info info = {};
-
-	if (copy_from_user(&info, uarg, minsz))
-		return -EFAULT;
-
-	if (info.argsz < minsz)
-		return -EINVAL;
-
-	switch (info.index) {
-	case VFIO_PCI_BAR0_REGION_INDEX:
-		info.offset = VFIO_PCI_INDEX_TO_OFFSET(info.index);
-		info.size = virtvdev->bar0_virtual_buf_size;
-		info.flags = VFIO_REGION_INFO_FLAG_READ |
-			     VFIO_REGION_INFO_FLAG_WRITE;
-		return copy_to_user(uarg, &info, minsz) ? -EFAULT : 0;
-	default:
-		return vfio_pci_core_ioctl(core_vdev, cmd, arg);
-	}
-}
-
-static long
-virtiovf_vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
-			     unsigned long arg)
-{
-	switch (cmd) {
-	case VFIO_DEVICE_GET_REGION_INFO:
-		return virtiovf_pci_ioctl_get_region_info(core_vdev, cmd, arg);
-	default:
-		return vfio_pci_core_ioctl(core_vdev, cmd, arg);
-	}
-}
-
-static int
-virtiovf_set_notify_addr(struct virtiovf_pci_core_device *virtvdev)
-{
-	struct vfio_pci_core_device *core_device = &virtvdev->core_device;
-	int ret;
-
-	/*
-	 * Setup the BAR where the 'notify' exists to be used by vfio as well
-	 * This will let us mmap it only once and use it when needed.
-	 */
-	ret = vfio_pci_core_setup_barmap(core_device,
-					 virtvdev->notify_bar);
-	if (ret)
-		return ret;
-
-	virtvdev->notify_addr = core_device->barmap[virtvdev->notify_bar] +
-			virtvdev->notify_offset;
-	return 0;
-}
-
 static int virtiovf_pci_open_device(struct vfio_device *core_vdev)
 {
 	struct virtiovf_pci_core_device *virtvdev = container_of(core_vdev,
@@ -355,18 +31,13 @@  static int virtiovf_pci_open_device(struct vfio_device *core_vdev)
 	if (ret)
 		return ret;
 
-	if (virtvdev->bar0_virtual_buf) {
-		/*
-		 * Upon close_device() the vfio_pci_core_disable() is called
-		 * and will close all the previous mmaps, so it seems that the
-		 * valid life cycle for the 'notify' addr is per open/close.
-		 */
-		ret = virtiovf_set_notify_addr(virtvdev);
-		if (ret) {
-			vfio_pci_core_disable(vdev);
-			return ret;
-		}
+#ifdef CONFIG_VIRTIO_PCI_ADMIN_LEGACY
+	ret = virtiovf_open_legacy_io(virtvdev);
+	if (ret) {
+		vfio_pci_core_disable(vdev);
+		return ret;
 	}
+#endif
 
 	virtiovf_open_migration(virtvdev);
 	vfio_pci_core_finish_enable(vdev);
@@ -382,35 +53,14 @@  static void virtiovf_pci_close_device(struct vfio_device *core_vdev)
 	vfio_pci_core_close_device(core_vdev);
 }
 
-static int virtiovf_get_device_config_size(unsigned short device)
-{
-	/* Network card */
-	return offsetofend(struct virtio_net_config, status);
-}
-
-static int virtiovf_read_notify_info(struct virtiovf_pci_core_device *virtvdev)
-{
-	u64 offset;
-	int ret;
-	u8 bar;
-
-	ret = virtio_pci_admin_legacy_io_notify_info(virtvdev->core_device.pdev,
-				VIRTIO_ADMIN_CMD_NOTIFY_INFO_FLAGS_OWNER_MEM,
-				&bar, &offset);
-	if (ret)
-		return ret;
-
-	virtvdev->notify_bar = bar;
-	virtvdev->notify_offset = offset;
-	return 0;
-}
-
 static void virtiovf_pci_core_release_dev(struct vfio_device *core_vdev)
 {
+#ifdef CONFIG_VIRTIO_PCI_ADMIN_LEGACY
 	struct virtiovf_pci_core_device *virtvdev = container_of(core_vdev,
 			struct virtiovf_pci_core_device, core_device.vdev);
 
-	kfree(virtvdev->bar0_virtual_buf);
+	virtiovf_release_legacy_io(virtvdev);
+#endif
 	vfio_pci_core_release_dev(core_vdev);
 }
 
@@ -433,6 +83,7 @@  static const struct vfio_device_ops virtiovf_vfio_pci_lm_ops = {
 	.detach_ioas = vfio_iommufd_physical_detach_ioas,
 };
 
+#ifdef CONFIG_VIRTIO_PCI_ADMIN_LEGACY
 static const struct vfio_device_ops virtiovf_vfio_pci_tran_lm_ops = {
 	.name = "virtio-vfio-pci-trans-lm",
 	.init = virtiovf_pci_init_device,
@@ -451,6 +102,7 @@  static const struct vfio_device_ops virtiovf_vfio_pci_tran_lm_ops = {
 	.attach_ioas = vfio_iommufd_physical_attach_ioas,
 	.detach_ioas = vfio_iommufd_physical_detach_ioas,
 };
+#endif
 
 static const struct vfio_device_ops virtiovf_vfio_pci_ops = {
 	.name = "virtio-vfio-pci",
@@ -471,19 +123,12 @@  static const struct vfio_device_ops virtiovf_vfio_pci_ops = {
 	.detach_ioas = vfio_iommufd_physical_detach_ioas,
 };
 
-static bool virtiovf_bar0_exists(struct pci_dev *pdev)
-{
-	struct resource *res = pdev->resource;
-
-	return res->flags;
-}
-
 static int virtiovf_pci_init_device(struct vfio_device *core_vdev)
 {
 	struct virtiovf_pci_core_device *virtvdev = container_of(core_vdev,
 			struct virtiovf_pci_core_device, core_device.vdev);
 	struct pci_dev *pdev;
-	bool sup_legacy_io;
+	bool sup_legacy_io = false;
 	bool sup_lm;
 	int ret;
 
@@ -492,8 +137,12 @@  static int virtiovf_pci_init_device(struct vfio_device *core_vdev)
 		return ret;
 
 	pdev = virtvdev->core_device.pdev;
-	sup_legacy_io = virtio_pci_admin_has_legacy_io(pdev) &&
-				!virtiovf_bar0_exists(pdev);
+#ifdef CONFIG_VIRTIO_PCI_ADMIN_LEGACY
+	ret = virtiovf_init_legacy_io(virtvdev, &sup_legacy_io);
+	if (ret)
+		return ret;
+#endif
+
 	sup_lm = virtio_pci_admin_has_dev_parts(pdev);
 
 	/*
@@ -505,26 +154,13 @@  static int virtiovf_pci_init_device(struct vfio_device *core_vdev)
 		return 0;
 	}
 
-	if (sup_legacy_io) {
-		ret = virtiovf_read_notify_info(virtvdev);
-		if (ret)
-			return ret;
-
-		virtvdev->bar0_virtual_buf_size = VIRTIO_PCI_CONFIG_OFF(true) +
-					virtiovf_get_device_config_size(pdev->device);
-		BUILD_BUG_ON(!is_power_of_2(virtvdev->bar0_virtual_buf_size));
-		virtvdev->bar0_virtual_buf = kzalloc(virtvdev->bar0_virtual_buf_size,
-						     GFP_KERNEL);
-		if (!virtvdev->bar0_virtual_buf)
-			return -ENOMEM;
-		mutex_init(&virtvdev->bar_mutex);
-	}
-
 	if (sup_lm)
 		virtiovf_set_migratable(virtvdev);
 
-	if (sup_lm && !sup_legacy_io)
-		core_vdev->ops = &virtiovf_vfio_pci_lm_ops;
+#ifdef CONFIG_VIRTIO_PCI_ADMIN_LEGACY
+	if (sup_legacy_io)
+		core_vdev->ops = &virtiovf_vfio_pci_tran_lm_ops;
+#endif
 
 	return 0;
 }
@@ -536,7 +172,7 @@  static int virtiovf_pci_probe(struct pci_dev *pdev,
 	const struct vfio_device_ops *ops;
 	int ret;
 
-	ops = (pdev->is_virtfn) ? &virtiovf_vfio_pci_tran_lm_ops :
+	ops = (pdev->is_virtfn) ? &virtiovf_vfio_pci_lm_ops :
 				  &virtiovf_vfio_pci_ops;
 
 	virtvdev = vfio_alloc_device(virtiovf_pci_core_device, core_device.vdev,
@@ -572,9 +208,9 @@  MODULE_DEVICE_TABLE(pci, virtiovf_pci_table);
 
 static void virtiovf_pci_aer_reset_done(struct pci_dev *pdev)
 {
-	struct virtiovf_pci_core_device *virtvdev = dev_get_drvdata(&pdev->dev);
-
-	virtvdev->pci_cmd = 0;
+#ifdef CONFIG_VIRTIO_PCI_ADMIN_LEGACY
+	virtiovf_legacy_io_reset_done(pdev);
+#endif
 	virtiovf_migration_reset_done(pdev);
 }