Message ID | 20230911093856.81910-1-yishaih@nvidia.com (mailing list archive) |
---|---|
Series | Add chunk mode support for mlx5 driver |
On Mon, Sep 11, 2023 at 12:38:47PM +0300, Yishai Hadas wrote:
> This series adds 'chunk mode' support for mlx5 driver upon the migration
> flow.
>
> Before this series, we were limited to 4GB state size, as of the 4 bytes
> max value based on the device specification for the query/save/load
> commands.
>
> Once the device supports 'chunk mode' the driver can support state size
> which is larger than 4GB.
>
> In that case, the device has the capability to split a single image to
> multiple chunks as long as the software provides a buffer in the minimum
> size reported by the device.
>
> The driver should query for the minimum buffer size required using
> QUERY_VHCA_MIGRATION_STATE command with the 'chunk' bit set in its
> input, in that case, the output will include both the minimum buffer
> size and also the remaining total size to be reported/used where it will
> be applicable.
>
> Upon chunk mode, there may be multiple images that will be read from the
> device upon STOP_COPY. The driver will read ahead from the firmware the
> full state in small/optimized chunks while letting QEMU/user space read
> in parallel the available data.
>
> The chunk buffer size is picked up based on the minimum size that
> firmware requires, the total full size and some max value in the driver
> code which was set to 8MB to achieve some optimized downtime in the
> general case.
>
> With that series in place, we could migrate successfully a device state
> with a larger size than 4GB, while even improving the downtime in some
> scenarios.
>
> Note:
> As the first patch should go to net/mlx5 we may need to send it as a
> pull request format to VFIO to avoid conflicts before acceptance.
>
> Yishai
>
> Yishai Hadas (9):
>   net/mlx5: Introduce ifc bits for migration in a chunk mode
>   vfio/mlx5: Wake up the reader post of disabling the SAVING migration
>     file
>   vfio/mlx5: Refactor the SAVE callback to activate a work only upon an
>     error
>   vfio/mlx5: Enable querying state size which is > 4GB
>   vfio/mlx5: Rename some stuff to match chunk mode
>   vfio/mlx5: Pre-allocate chunks for the STOP_COPY phase
>   vfio/mlx5: Add support for SAVING in chunk mode
>   vfio/mlx5: Add support for READING in chunk mode
>   vfio/mlx5: Activate the chunk mode functionality

I didn't check in great depth but this looks OK to me

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

I think this is a good design to start motivating more QEMU
improvements, eg using io_uring as we could go further in the driver
to optimize with that kind of support.

Jason

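For readers following the thread, the sizing rule described in the cover letter can be sketched roughly as below. This is only an illustration in Python, not the driver's code: the helper names `pick_chunk_size` and `chunks_needed` are made up here, and the exact way the driver combines the firmware minimum, the total state size and the 8MB cap is an assumption based on the prose above. How the device actually splits an image is up to the firmware.

```python
# Illustrative sketch only; names and the exact combination rule are
# assumptions, not taken from the mlx5 driver source.

MAX_CHUNK_SIZE = 8 * 1024 * 1024  # the 8MB driver cap mentioned in the cover letter
U32_MAX = 2**32 - 1               # largest size a 4-byte field can express (just under 4GB)


def pick_chunk_size(fw_min_size: int, total_size: int,
                    max_chunk: int = MAX_CHUNK_SIZE) -> int:
    """Hypothetical helper: cap the buffer at max_chunk (or the total size,
    if smaller), but never go below the minimum the firmware reports."""
    if fw_min_size <= 0 or total_size <= 0:
        raise ValueError("sizes must be positive")
    capped = min(total_size, max_chunk)
    return max(fw_min_size, capped)


def chunks_needed(total_size: int, chunk_size: int) -> int:
    """Ceiling division: how many buffers of chunk_size cover total_size."""
    return -(-total_size // chunk_size)


if __name__ == "__main__":
    total = 6 << 30    # a 6GB state, too large for a 4-byte size field
    fw_min = 1 << 20   # assume the firmware asks for at least 1MB
    chunk = pick_chunk_size(fw_min, total)
    print(total > U32_MAX, chunk, chunks_needed(total, chunk))
```

Under these assumptions a state larger than 4GB no longer has to fit in one 4-byte length, because each individual chunk stays small, and user space can start reading the first chunks while the driver fetches the rest.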
On 20/09/2023 21:31, Jason Gunthorpe wrote:
> On Mon, Sep 11, 2023 at 12:38:47PM +0300, Yishai Hadas wrote:
>> This series adds 'chunk mode' support for mlx5 driver upon the migration
>> flow.
>>
>> [...]
> I didn't check in great depth but this looks OK to me
>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Thanks Jason

> I think this is a good design to start motivating more QEMU
> improvements, eg using io_uring as we could go further in the driver
> to optimize with that kind of support.
>
> Jason

Alex,

Can we move forward with the series and send a PR for the first patch
that needs to go also to net/mlx5 ?

Thanks,
Yishai

On Wed, 27 Sep 2023 13:59:06 +0300
Yishai Hadas <yishaih@nvidia.com> wrote:

> [...]
> Alex,
>
> Can we move forward with the series and send a PR for the first patch
> that needs to go also to net/mlx5 ?

Yeah, I don't spot any issues with it either. Thanks,

Alex

On Wed, Sep 27, 2023 at 04:10:23PM -0600, Alex Williamson wrote:
> On Wed, 27 Sep 2023 13:59:06 +0300
> Yishai Hadas <yishaih@nvidia.com> wrote:
> > [...]
> > Can we move forward with the series and send a PR for the first patch
> > that needs to go also to net/mlx5 ?
>
> Yeah, I don't spot any issues with it either. Thanks,

Hi Alex,

I uploaded the first patch to shared branch, can you please pull it?
https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/log/?h=mlx5-vfio

Thanks

On Thu, 28 Sep 2023 14:08:08 +0300
Leon Romanovsky <leon@kernel.org> wrote:

> [...]
> Hi Alex,
>
> I uploaded the first patch to shared branch, can you please pull it?
> https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/log/?h=mlx5-vfio

Yep, got it. Thanks.

Yishai, were you planning to resend the remainder or do you just want
me to pull 2-9 from this series? Thanks,

Alex

On Thu, Sep 28, 2023 at 12:29:52PM -0600, Alex Williamson wrote:
> [...]
> Yep, got it. Thanks.
>
> Yishai, were you planning to resend the remainder or do you just want
> me to pull 2-9 from this series? Thanks,

Just pull, like I did with b4 :)

~/src/b4/b4.sh shazam -l -s https://lore.kernel.org/kvm/20230911093856.81910-1-yishaih@nvidia.com/ -P 2-9 -t

Thanks

On Thu, 28 Sep 2023 21:42:22 +0300
Leon Romanovsky <leon@kernel.org> wrote:

> [...]
> Just pull, like I did with b4 :)
>
> ~/src/b4/b4.sh shazam -l -s https://lore.kernel.org/kvm/20230911093856.81910-1-yishaih@nvidia.com/ -P 2-9 -t

Yep, the mechanics were really not the question, I'm just double
checking to avoid any conflicts with a re-post. Thanks,

Alex

On Thu, Sep 28, 2023 at 12:47:03PM -0600, Alex Williamson wrote:
> [...]
> Yep, the mechanics were really not the question, I'm just double
> checking to avoid any conflicts with a re-post. Thanks,

It is pretty safe to say that he won't re-post.
He had no plans to resend the series.

Thanks

On Thu, 28 Sep 2023 21:51:02 +0300
Leon Romanovsky <leon@kernel.org> wrote:

> [...]
> It is pretty safe to say that he won't re-post.
> He had no plans to resend the series.

Ok, applied the remainder of the series to the vfio next branch for
v6.7. Thanks,

Alex

On Mon, 11 Sep 2023 12:38:47 +0300, Yishai Hadas wrote:
> This series adds 'chunk mode' support for mlx5 driver upon the migration
> flow.
>
> Before this series, we were limited to 4GB state size, as of the 4 bytes
> max value based on the device specification for the query/save/load
> commands.
>
> [...]

Applied, thanks!

[1/9] net/mlx5: Introduce ifc bits for migration in a chunk mode
      https://git.kernel.org/rdma/rdma/c/5aa4c9608d2d5f

Best regards,