[v17,0/5] FPGA Image Load (previously Security Manager)

Message ID	20210929230025.68961-1-russell.h.weight@intel.com (mailing list archive)
Headers	show Return-Path: <linux-fpga-owner@kernel.org> From: Russ Weight <russell.h.weight@intel.com> To: mdf@kernel.org, linux-fpga@vger.kernel.org, linux-kernel@vger.kernel.org Cc: trix@redhat.com, lgoncalv@redhat.com, yilun.xu@intel.com, hao.wu@intel.com, matthew.gerlach@intel.com, Russ Weight <russell.h.weight@intel.com> Subject: [PATCH v17 0/5] FPGA Image Load (previously Security Manager) Date: Wed, 29 Sep 2021 16:00:20 -0700 Message-Id: <20210929230025.68961-1-russell.h.weight@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	FPGA Image Load (previously Security Manager) \| expand [v17,0/5] FPGA Image Load (previously Security Manager) [v17,1/5] fpga: image-load: fpga image load framework [v17,2/5] fpga: image-load: enable image uploads [v17,3/5] fpga: image-load: signal eventfd when complete [v17,4/5] fpga: image-load: add status ioctl [v17,5/5] fpga: image-load: enable cancel of image upload

Russ Weight Sept. 29, 2021, 11 p.m. UTC

The FPGA Image Load framework provides an API to upload image
files to an FPGA device. Image files are self-describing. They could
contain FPGA images, BMC images, Root Entry Hashes, or other device
specific files. It is up to the lower-level device driver and the
target device to authenticate and disposition the file data.

The n3000bmc-sec-update driver is the first driver to use the FPGA Image
Load Framework. This driver was previously submitted in the same patch
set, but was split into a separate patch set starting with v2.

Changelog v16 -> v17:
 - Documentation cleanup with clarification for changes to how cancellation
   is handled.
 - Removed the FPGA_IMAGE_ERR_NONE macro in favor of returning zero. The
   constant name made sense when we were using enum data types, but is no
   longer necessary.
 - Changed the syntax of the write_blk() function and renamed it to
   write().
   It now returns a positive size or a negative error number.
 - Removed the default block size for write_block(). The lower-level driver
   must choose a block size.
 - Changed the prepare() and write() ops to pass in elements of the
   fpga_image_load data structure instead of having the op functions access
   them directly via imgld.
 - Changed work queue from system_unbound_wq to system_long_wq.
 - Removed calls to cond_resched(). The lower-level driver must manage
   any scheduler issues.
 - Removed the request_cancel flag and handling from the class driver
   including the fpga_image_prog_transition() function.
 - The cancel system call now directly calls the cancel() op of the
   lower-level driver. 
 - Changed the cancel() op to return void.

Changelog v15 -> v16:
 - Remove previous patch #5: fpga: image-load: create status sysfs node
 - Change device name from "fpga_image%d" to "fpga_image_load%d"
 - Shift from "Driver" terminology to "Framework" in comments and
   documentation
 - Some cleanup of documentation for the FPGA_IMAGE_LOAD_WRITE IOCTL.
 - Rename lops to ops for structure member and local variables
 - Change the write_blk() definition to pass in *blk_size (a pointer to
   a default block size of WRITE_BLOCK_SIZE=0x4000) and max_size (the
   the maximum block size to stay within the limit of the data buffer).
   The write_blk() op may use the default *blk_size or modify it to a
   more optimal number for the given device, subject to the max_size limit.
 - All enum values for progress and errors are changed to macros, because
   they are included in the uapi header. This is done to maintain consistency
   amongst the DFL related IOCTL header files. All references to the enum
   types are changed to u32.
 - Bail out early in fpga_image_do_load() if imgld->driver_unload is true.
 - Add a call to cond_resched() in the write_blk() loop to ensure that
   we yield to higher priority tasks during long data transfers.
 - Switch to the system_unbound_wq to enable calls to cond_resched().
 - Switch from test_and_set_bit() to atomic_cmpxchg() to manage
   imgld->opened.
 - Change fpga_image_load_release() cancel an ongoing image upload when
   possible and to block when cancellation is not possible.
 - Remove the completion object, imgld->update_done, in favor of calling
   flush_work(&imgld->work);

Changelog v14 -> v15:
 - Changed the driver name from FPGA Security Manager to FPGA Image Load
 - Changed file, symbol, and config names to reflect the new driver name
 - Rewrote documentation.
 - Removed all sysfs files except for "status", which has moved to the
   parent directory.
 - Implemented FPGA_IMAGE_LOAD_WRITE, FPGA_IMAGE_LOAD_STATUS, and
   FPGA_IMAGE_LOAD_CANCEL IOCTLs to initiate, monitor, and cancel image
   uploads.
 - Fixed some error return values in fpga_image_load_register()
 - Added an eventfd which is signalled upon completion of an image load.
 - Minor changes to locking to accommodate the FPGA_IMAGE_LOAD_STATUS IOCTL
 - Removed signed-off/reviewed-by tags. Please resend as appropriate.

Changelog v13 -> v14:
 - Dropped the patch: fpga: sec-mgr: expose hardware error info
 - Updated copyrights to 2021
 - Updated ABI documentation date and kernel version
 - Removed the name sysfs entry from the class driver
 - Use xa_alloc() instead of ida_simple_get()
 - Rename dev to parent for parent devices
 - Remove fpga_sec_mgr_create(), devm_fpga_sec_mgr_create(), and
   fpga_sec_mgr_free() functions and update the fpga_sec_mgr_register()
   function to both create and register a new security manager.
 - Populate the fpga_sec_mgr_dev_release() function.
 - Added MAINTAINERS reference for
   Documentation/ABI/testing/sysfs-class-fpga-sec-mgr

Changelog v12 -> v13:
  - Change "if (count == 0 || " to "if (!count || "
  - Improve error message: "Attempt to register without all required ops\n"
  - Change set_error() to fpga_sec_set_error()
  - Change set_hw_errinfo() to fpga_sec_set_hw_errinfo()

Changelog v11 -> v12:
  - Updated Date and KernelVersion fields in ABI documentation
  - Removed size parameter from write_blk() op - it is now up to
    the lower-level driver to determine the appropriate size and
    to update smgr->remaining_size accordingly.
  - Changed syntax of sec_mgr_prog_str[] and sec_mgr_err_str array definitions
    from:
	"idle",			/* FPGA_SEC_PROG_IDLE */
    to:
	[FPGA_SEC_PROG_IDLE]	    = "idle",

Changelog v10 -> v11:
  - Fixed a spelling error in a comment
  - Initialize smgr->err_code and smgr->progress explicitly in
    fpga_sec_mgr_create() instead of accepting the default 0 value.

Changelog v9 -> v10:
  - Rebased to 5.12-rc2 next
  - Updated Date and KernelVersion in ABI documentation

Changelog v8 -> v9:
  - Rebased patches for 5.11-rc2
  - Updated Date and KernelVersion in ABI documentation

Changelog v7 -> v8:
  - Fixed grammatical error in Documentation/fpga/fpga-sec-mgr.rst

Changelog v6 -> v7:
  - Changed dates in documentation file to December 2020
  - Changed filename_store() to use kmemdup_nul() instead of
    kstrndup() and changed the count to not assume a line-return.

Changelog v5 -> v6:
  - Removed sysfs support and documentation for the display of the
    flash count, root entry hashes, and code-signing-key cancellation
    vectors from the class driver. This information can vary by device
    and will instead be displayed by the device-specific parent driver.

Changelog v4 -> v5:
  - Added the devm_fpga_sec_mgr_unregister() function, following recent
    changes to the fpga_manager() implementation.
  - Changed most of the *_show() functions to use sysfs_emit()
    instead of sprintf(
  - When checking the return values for functions of type enum
    fpga_sec_err err_code, test for FPGA_SEC_ERR_NONE instead of 0

Changelog v3 -> v4:
  - This driver is generic enough that it could be used for non Intel
    FPGA devices. Changed from "Intel FPGA Security Manager" to FPGA
    Security Manager" and removed unnecessary references to "Intel".
  - Changed: iops -> sops, imgr -> smgr, IFPGA_ -> FPGA_, ifpga_ to fpga_
    Note that this also affects some filenames.

Changelog v2 -> v3:
  - Use dev_err() to report invalid progress in sec_progress()
  - Use dev_err() to report invalid error code in sec_error()
  - Modified sysfs handler check in check_sysfs_handler() to make
    it more readable.
  - Removed unnecessary "goto done"
  - Added a comment to explain imgr->driver_unload in
    ifpga_sec_mgr_unregister()

Changelog v1 -> v2:
  - Separated out the MAX10 BMC Security Engine to be submitted in
    a separate patch-set.
  - Bumped documentation dates and versions
  - Split ifpga_sec_mgr_register() into create() and register() functions
  - Added devm_ifpga_sec_mgr_create()
  - Added Documentation/fpga/ifpga-sec-mgr.rst 
  - Changed progress state "read_file" to "reading"
  - Added sec_error() function (similar to sec_progress())
  - Removed references to bmc_flash_count & smbus_flash_count (not supported)
  - Removed typedefs for imgr ops
  - Removed explicit value assignments in enums
  - Other minor code cleanup per review comments 

Russ Weight (5):
  fpga: image-load: fpga image load framework
  fpga: image-load: enable image uploads
  fpga: image-load: signal eventfd when complete
  fpga: image-load: add status ioctl
  fpga: image-load: enable cancel of image upload

 Documentation/fpga/fpga-image-load.rst |  50 +++
 Documentation/fpga/index.rst           |   1 +
 MAINTAINERS                            |   9 +
 drivers/fpga/Kconfig                   |  10 +
 drivers/fpga/Makefile                  |   3 +
 drivers/fpga/fpga-image-load.c         | 416 +++++++++++++++++++++++++
 include/linux/fpga/fpga-image-load.h   |  69 ++++
 include/uapi/linux/fpga-image-load.h   |  74 +++++
 8 files changed, 632 insertions(+)
 create mode 100644 Documentation/fpga/fpga-image-load.rst
 create mode 100644 drivers/fpga/fpga-image-load.c
 create mode 100644 include/linux/fpga/fpga-image-load.h
 create mode 100644 include/uapi/linux/fpga-image-load.h

Xu Yilun Oct. 9, 2021, 8:08 a.m. UTC | #1

On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
> The FPGA Image Load framework provides an API to upload image
> files to an FPGA device. Image files are self-describing. They could
> contain FPGA images, BMC images, Root Entry Hashes, or other device
> specific files. It is up to the lower-level device driver and the
> target device to authenticate and disposition the file data.

I've reconsider the FPGA persistent image update again, and think we
may include it in FPGA manager framework.

Sorry I raised this topic again when it is already at patch v17, but now
I need to consider more seriously than before.

We have consensus the FPGA persistent image update is just like a normal
firmware update which finally writes the nvmem like flash or eeprom,
while the current FPGA manager deals with the active FPGA region update
and re-activation. Could we just expand the FPGA manager and let it handle
the nvmem update as well? Many FPGA cards have nvmem and downloaders
supports updating both FPGA region and nvmem.

According to the patchset, the basic workflow of the 2 update types are
quite similar, get the data, prepare for the HW, write and complete.
They are already implemented in FPGA manager. We've discussed some
differences like threading or canceling the update, which are
not provided by FPGA manager but they may also nice to have for FPGA
region update. An FPGA region update may also last for a long time??
So I think having 2 sets of similar frameworks in FPGA is unnecessary.

My quick mind is that we add some flags in struct fpga_mgr & struct
fpga_image_info to indicate the HW capability (support FPGA region
update or nvmem update or both) of the download engine and the provided
image type. Then the low-level driver knows how to download if it
supports both image types.

An char device could be added for each fpga manager dev, providing the
user APIs for nvmem update. We may not use the char dev for FPGA region
update cause it changes the system HW devices and needs device hotplug
in FPGA region. We'd better leave it to FPGA region class, this is
another topic.

More discussion is appreciated.

Thanks,
Yilun

Tom Rix Oct. 9, 2021, 12:11 p.m. UTC | #2

On 10/9/21 1:08 AM, Xu Yilun wrote:
> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
>> The FPGA Image Load framework provides an API to upload image
>> files to an FPGA device. Image files are self-describing. They could
>> contain FPGA images, BMC images, Root Entry Hashes, or other device
>> specific files. It is up to the lower-level device driver and the
>> target device to authenticate and disposition the file data.
> I've reconsider the FPGA persistent image update again, and think we
> may include it in FPGA manager framework.
>
> Sorry I raised this topic again when it is already at patch v17, but now
> I need to consider more seriously than before.
>
> We have consensus the FPGA persistent image update is just like a normal
> firmware update which finally writes the nvmem like flash or eeprom,
> while the current FPGA manager deals with the active FPGA region update
> and re-activation. Could we just expand the FPGA manager and let it handle
> the nvmem update as well? Many FPGA cards have nvmem and downloaders
> supports updating both FPGA region and nvmem.
>
> According to the patchset, the basic workflow of the 2 update types are
> quite similar, get the data, prepare for the HW, write and complete.
> They are already implemented in FPGA manager. We've discussed some
> differences like threading or canceling the update, which are
> not provided by FPGA manager but they may also nice to have for FPGA
> region update. An FPGA region update may also last for a long time??
> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
>
> My quick mind is that we add some flags in struct fpga_mgr & struct
> fpga_image_info to indicate the HW capability (support FPGA region
> update or nvmem update or both) of the download engine and the provided
> image type. Then the low-level driver knows how to download if it
> supports both image types.
>
> An char device could be added for each fpga manager dev, providing the
> user APIs for nvmem update. We may not use the char dev for FPGA region
> update cause it changes the system HW devices and needs device hotplug
> in FPGA region. We'd better leave it to FPGA region class, this is
> another topic.
>
> More discussion is appreciated.

I also think fpga_mgr could be extended.

In this patchset,

https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/

A second, similar set of write ops was added to fpga_manger_ops,

new bit/flag was added to fpga_image_info

The intent was for dfl to add their specific ops to cover what is done 
in this patchset.

Any other driver would do similar.

Is this close to what you are thinking ?

Tom

>
> Thanks,
> Yilun
>

Xu Yilun Oct. 11, 2021, 1:41 a.m. UTC | #3

On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
> 
> On 10/9/21 1:08 AM, Xu Yilun wrote:
> > On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
> > > The FPGA Image Load framework provides an API to upload image
> > > files to an FPGA device. Image files are self-describing. They could
> > > contain FPGA images, BMC images, Root Entry Hashes, or other device
> > > specific files. It is up to the lower-level device driver and the
> > > target device to authenticate and disposition the file data.
> > I've reconsider the FPGA persistent image update again, and think we
> > may include it in FPGA manager framework.
> > 
> > Sorry I raised this topic again when it is already at patch v17, but now
> > I need to consider more seriously than before.
> > 
> > We have consensus the FPGA persistent image update is just like a normal
> > firmware update which finally writes the nvmem like flash or eeprom,
> > while the current FPGA manager deals with the active FPGA region update
> > and re-activation. Could we just expand the FPGA manager and let it handle
> > the nvmem update as well? Many FPGA cards have nvmem and downloaders
> > supports updating both FPGA region and nvmem.
> > 
> > According to the patchset, the basic workflow of the 2 update types are
> > quite similar, get the data, prepare for the HW, write and complete.
> > They are already implemented in FPGA manager. We've discussed some
> > differences like threading or canceling the update, which are
> > not provided by FPGA manager but they may also nice to have for FPGA
> > region update. An FPGA region update may also last for a long time??
> > So I think having 2 sets of similar frameworks in FPGA is unnecessary.
> > 
> > My quick mind is that we add some flags in struct fpga_mgr & struct
> > fpga_image_info to indicate the HW capability (support FPGA region
> > update or nvmem update or both) of the download engine and the provided
> > image type. Then the low-level driver knows how to download if it
> > supports both image types.
> > 
> > An char device could be added for each fpga manager dev, providing the
> > user APIs for nvmem update. We may not use the char dev for FPGA region
> > update cause it changes the system HW devices and needs device hotplug
> > in FPGA region. We'd better leave it to FPGA region class, this is
> > another topic.
> > 
> > More discussion is appreciated.
> 
> I also think fpga_mgr could be extended.
> 
> In this patchset,
> 
> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
> 
> A second, similar set of write ops was added to fpga_manger_ops,
> 
> new bit/flag was added to fpga_image_info
> 
> The intent was for dfl to add their specific ops to cover what is done in
> this patchset.

I think we don't have to add 2 ops for reconfig & reimage in framework,
the 2 processes are almost the same.

Just add the _REIMAGE (or something else, NVMEM?) flag for
fpga_image_info, and low level drivers handle it as they do for other
flags.

How do you think?

Thanks,
Yilun

> 
> Any other driver would do similar.
> 
> Is this close to what you are thinking ?
> 
> Tom
> 
> > 
> > Thanks,
> > Yilun
> >

Tom Rix Oct. 11, 2021, 12:35 p.m. UTC | #4

On 10/10/21 6:41 PM, Xu Yilun wrote:
> On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
>> On 10/9/21 1:08 AM, Xu Yilun wrote:
>>> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
>>>> The FPGA Image Load framework provides an API to upload image
>>>> files to an FPGA device. Image files are self-describing. They could
>>>> contain FPGA images, BMC images, Root Entry Hashes, or other device
>>>> specific files. It is up to the lower-level device driver and the
>>>> target device to authenticate and disposition the file data.
>>> I've reconsider the FPGA persistent image update again, and think we
>>> may include it in FPGA manager framework.
>>>
>>> Sorry I raised this topic again when it is already at patch v17, but now
>>> I need to consider more seriously than before.
>>>
>>> We have consensus the FPGA persistent image update is just like a normal
>>> firmware update which finally writes the nvmem like flash or eeprom,
>>> while the current FPGA manager deals with the active FPGA region update
>>> and re-activation. Could we just expand the FPGA manager and let it handle
>>> the nvmem update as well? Many FPGA cards have nvmem and downloaders
>>> supports updating both FPGA region and nvmem.
>>>
>>> According to the patchset, the basic workflow of the 2 update types are
>>> quite similar, get the data, prepare for the HW, write and complete.
>>> They are already implemented in FPGA manager. We've discussed some
>>> differences like threading or canceling the update, which are
>>> not provided by FPGA manager but they may also nice to have for FPGA
>>> region update. An FPGA region update may also last for a long time??
>>> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
>>>
>>> My quick mind is that we add some flags in struct fpga_mgr & struct
>>> fpga_image_info to indicate the HW capability (support FPGA region
>>> update or nvmem update or both) of the download engine and the provided
>>> image type. Then the low-level driver knows how to download if it
>>> supports both image types.
>>>
>>> An char device could be added for each fpga manager dev, providing the
>>> user APIs for nvmem update. We may not use the char dev for FPGA region
>>> update cause it changes the system HW devices and needs device hotplug
>>> in FPGA region. We'd better leave it to FPGA region class, this is
>>> another topic.
>>>
>>> More discussion is appreciated.
>> I also think fpga_mgr could be extended.
>>
>> In this patchset,
>>
>> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
>>
>> A second, similar set of write ops was added to fpga_manger_ops,
>>
>> new bit/flag was added to fpga_image_info
>>
>> The intent was for dfl to add their specific ops to cover what is done in
>> this patchset.
> I think we don't have to add 2 ops for reconfig & reimage in framework,
> the 2 processes are almost the same.
>
> Just add the _REIMAGE (or something else, NVMEM?) flag for
> fpga_image_info, and low level drivers handle it as they do for other
> flags.
>
> How do you think?

A single set is fine.

A difficult part of is the length of  time to do the write. The existing 
write should be improved to use a worker thread.

Tom

>
> Thanks,
> Yilun
>
>> Any other driver would do similar.
>>
>> Is this close to what you are thinking ?
>>
>> Tom
>>
>>> Thanks,
>>> Yilun
>>>

Russ Weight Oct. 12, 2021, 1 a.m. UTC | #5

On 10/11/21 5:35 AM, Tom Rix wrote:
>
> On 10/10/21 6:41 PM, Xu Yilun wrote:
>> On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
>>> On 10/9/21 1:08 AM, Xu Yilun wrote:
>>>> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
>>>>> The FPGA Image Load framework provides an API to upload image
>>>>> files to an FPGA device. Image files are self-describing. They could
>>>>> contain FPGA images, BMC images, Root Entry Hashes, or other device
>>>>> specific files. It is up to the lower-level device driver and the
>>>>> target device to authenticate and disposition the file data.
>>>> I've reconsider the FPGA persistent image update again, and think we
>>>> may include it in FPGA manager framework.
>>>>
>>>> Sorry I raised this topic again when it is already at patch v17, but now
>>>> I need to consider more seriously than before.
>>>>
>>>> We have consensus the FPGA persistent image update is just like a normal
>>>> firmware update which finally writes the nvmem like flash or eeprom,
>>>> while the current FPGA manager deals with the active FPGA region update
>>>> and re-activation. Could we just expand the FPGA manager and let it handle
>>>> the nvmem update as well? Many FPGA cards have nvmem and downloaders
>>>> supports updating both FPGA region and nvmem.
The fpga-image-load driver is actually just a data transfer. The class
driver has no knowledge about what the data is or where/if the data will
be stored.

This functionality could, of course, be merged into the fpga-mgr. I did
a proof of concept of this a while back and we discussed the pros and cons.
See this email for a recap:

https://marc.info/?l=linux-fpga&m=161998085507374&w=2

Things have changed some with the evolution of the driver. The IOCTL
approach probably fits better than the sysfs implementation. At the time
it seemed that a merge would add unnecessary complexity without adding value.

>>>>
>>>> According to the patchset, the basic workflow of the 2 update types are
>>>> quite similar, get the data, prepare for the HW, write and complete.
>>>> They are already implemented in FPGA manager. We've discussed some
>>>> differences like threading or canceling the update, which are
>>>> not provided by FPGA manager but they may also nice to have for FPGA
>>>> region update. An FPGA region update may also last for a long time??
>>>> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
>>>>
>>>> My quick mind is that we add some flags in struct fpga_mgr & struct
>>>> fpga_image_info to indicate the HW capability (support FPGA region
>>>> update or nvmem update or both) of the download engine and the provided
>>>> image type. Then the low-level driver knows how to download if it
>>>> supports both image types.An char device could be added for each fpga manager dev, providing the
>>>> user APIs for nvmem update. We may not use the char dev for FPGA region
>>>> update cause it changes the system HW devices and needs device hotplug
>>>> in FPGA region. We'd better leave it to FPGA region class, this is
>>>> another topic.
I'll give this some more thought and see if I can come up with some RFC
patches.

- Russ
>>>>
>>>> More discussion is appreciated.
>>> I also think fpga_mgr could be extended.
>>>
>>> In this patchset,
>>>
>>> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
>>>
>>> A second, similar set of write ops was added to fpga_manger_ops,
>>>
>>> new bit/flag was added to fpga_image_info
>>>
>>> The intent was for dfl to add their specific ops to cover what is done in
>>> this patchset.
>> I think we don't have to add 2 ops for reconfig & reimage in framework,
>> the 2 processes are almost the same.
>>
>> Just add the _REIMAGE (or something else, NVMEM?) flag for
>> fpga_image_info, and low level drivers handle it as they do for other
>> flags.
>>
>> How do you think?
>
> A single set is fine.
>
> A difficult part of is the length of  time to do the write. The existing write should be improved to use a worker thread.
>
> Tom
>
>>
>> Thanks,
>> Yilun
>>
>>> Any other driver would do similar.
>>>
>>> Is this close to what you are thinking ?
>>>
>>> Tom
>>>
>>>> Thanks,
>>>> Yilun
>>>>
>

Xu Yilun Oct. 12, 2021, 7:47 a.m. UTC | #6

On Mon, Oct 11, 2021 at 06:00:16PM -0700, Russ Weight wrote:
> 
> 
> On 10/11/21 5:35 AM, Tom Rix wrote:
> >
> > On 10/10/21 6:41 PM, Xu Yilun wrote:
> >> On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
> >>> On 10/9/21 1:08 AM, Xu Yilun wrote:
> >>>> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
> >>>>> The FPGA Image Load framework provides an API to upload image
> >>>>> files to an FPGA device. Image files are self-describing. They could
> >>>>> contain FPGA images, BMC images, Root Entry Hashes, or other device
> >>>>> specific files. It is up to the lower-level device driver and the
> >>>>> target device to authenticate and disposition the file data.
> >>>> I've reconsider the FPGA persistent image update again, and think we
> >>>> may include it in FPGA manager framework.
> >>>>
> >>>> Sorry I raised this topic again when it is already at patch v17, but now
> >>>> I need to consider more seriously than before.
> >>>>
> >>>> We have consensus the FPGA persistent image update is just like a normal
> >>>> firmware update which finally writes the nvmem like flash or eeprom,
> >>>> while the current FPGA manager deals with the active FPGA region update
> >>>> and re-activation. Could we just expand the FPGA manager and let it handle
> >>>> the nvmem update as well? Many FPGA cards have nvmem and downloaders
> >>>> supports updating both FPGA region and nvmem.
> The fpga-image-load driver is actually just a data transfer. The class

IMHO, The fpga-mgr dev is also a data transfer. The fpga-region dev is
acting as the FPGA region admin responsible for gating, transfering and
re-enumerating.

So my opinion is to add a new data transfer type and keep a unified process.

> driver has no knowledge about what the data is or where/if the data will
> be stored.

The fpga-image-load driver knows the data will be stored in nvmem, while
the fpga-mgr knows the data will be stored in FPGA cells. They may need
to know the exact physical position to store, may not, depends on what the
HW engines are.

> 
> This functionality could, of course, be merged into the fpga-mgr. I did
> a proof of concept of this a while back and we discussed the pros and cons.
> See this email for a recap:
> 
> https://marc.info/?l=linux-fpga&m=161998085507374&w=2
> 
> Things have changed some with the evolution of the driver. The IOCTL
> approach probably fits better than the sysfs implementation. At the time
> it seemed that a merge would add unnecessary complexity without adding value.

I think at least developers don't have to go through 2 sets of software
stacks which are of the same concept. And adding some new features like
optionally threading or canceling data transfer are also good to FPGA
region update. And the nvmem update could also be benifit from exsiting
implementations like scatter-gather buffers, in-kernel firmware loading.

I try to explain myself according to each of your concern from that mail
thread:

Purpose of the 2 updates
========================

  As I said before, I think they are both data transfer devices. FPGA
region update gets extra support from fpga-region & fpga-bridge, and
FPGA nvmem update could be a standalone fpga-mgr.

Extra APIs that are unique to nvmem update
==========================================

  cdev APIs for nvmem update:
    Yes we need to expand the functionality so we need them.

  available_images, image_load APIs for loading nvmem content to FPGA
  region:
    These are features in later patchsets which are not submitted, but we
    could talk about it in advance. I think this is actually a FPGA region
    update from nvmem, it also requires gating, data loading (no SW transfer)
    and re-enumeration, or a single command to image_load HW may result system
    disordered. The FPGA framework now only supports update from in-kernel
    user data, maybe we add support for update from nvmem later.

  fpga-mgr state extend:
    I think it could be extended, The current states are not perfect,
    adding something like IDLE or READY is just fine.

  fpga-mgr status extend:
    Add general error definitions as needed. If there is some status
    that is quite vendor specific, expose it in low-level driver.

  remaining-size:
    Nice to have for all.

Threading the update
====================

  Also a good option for FPGA region update, maybe we also have a slow FPGA
  reprogrammer?

Cancelling the update
====================

  Also a good option for FPGA region update if HW engine supports.

Thanks,
Yilun

> 
> >>>>
> >>>> According to the patchset, the basic workflow of the 2 update types are
> >>>> quite similar, get the data, prepare for the HW, write and complete.
> >>>> They are already implemented in FPGA manager. We've discussed some
> >>>> differences like threading or canceling the update, which are
> >>>> not provided by FPGA manager but they may also nice to have for FPGA
> >>>> region update. An FPGA region update may also last for a long time??
> >>>> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
> >>>>
> >>>> My quick mind is that we add some flags in struct fpga_mgr & struct
> >>>> fpga_image_info to indicate the HW capability (support FPGA region
> >>>> update or nvmem update or both) of the download engine and the provided
> >>>> image type. Then the low-level driver knows how to download if it
> >>>> supports both image types.An char device could be added for each fpga manager dev, providing the
> >>>> user APIs for nvmem update. We may not use the char dev for FPGA region
> >>>> update cause it changes the system HW devices and needs device hotplug
> >>>> in FPGA region. We'd better leave it to FPGA region class, this is
> >>>> another topic.
> I'll give this some more thought and see if I can come up with some RFC
> patches.
> 
> - Russ
> >>>>
> >>>> More discussion is appreciated.
> >>> I also think fpga_mgr could be extended.
> >>>
> >>> In this patchset,
> >>>
> >>> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
> >>>
> >>> A second, similar set of write ops was added to fpga_manger_ops,
> >>>
> >>> new bit/flag was added to fpga_image_info
> >>>
> >>> The intent was for dfl to add their specific ops to cover what is done in
> >>> this patchset.
> >> I think we don't have to add 2 ops for reconfig & reimage in framework,
> >> the 2 processes are almost the same.
> >>
> >> Just add the _REIMAGE (or something else, NVMEM?) flag for
> >> fpga_image_info, and low level drivers handle it as they do for other
> >> flags.
> >>
> >> How do you think?
> >
> > A single set is fine.
> >
> > A difficult part of is the length of  time to do the write. The existing write should be improved to use a worker thread.
> >
> > Tom
> >
> >>
> >> Thanks,
> >> Yilun
> >>
> >>> Any other driver would do similar.
> >>>
> >>> Is this close to what you are thinking ?
> >>>
> >>> Tom
> >>>
> >>>> Thanks,
> >>>> Yilun
> >>>>
> >

Xu Yilun Oct. 12, 2021, 7:49 a.m. UTC | #7

On Mon, Oct 11, 2021 at 05:35:03AM -0700, Tom Rix wrote:
> 
> On 10/10/21 6:41 PM, Xu Yilun wrote:
> > On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
> > > On 10/9/21 1:08 AM, Xu Yilun wrote:
> > > > On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
> > > > > The FPGA Image Load framework provides an API to upload image
> > > > > files to an FPGA device. Image files are self-describing. They could
> > > > > contain FPGA images, BMC images, Root Entry Hashes, or other device
> > > > > specific files. It is up to the lower-level device driver and the
> > > > > target device to authenticate and disposition the file data.
> > > > I've reconsider the FPGA persistent image update again, and think we
> > > > may include it in FPGA manager framework.
> > > > 
> > > > Sorry I raised this topic again when it is already at patch v17, but now
> > > > I need to consider more seriously than before.
> > > > 
> > > > We have consensus the FPGA persistent image update is just like a normal
> > > > firmware update which finally writes the nvmem like flash or eeprom,
> > > > while the current FPGA manager deals with the active FPGA region update
> > > > and re-activation. Could we just expand the FPGA manager and let it handle
> > > > the nvmem update as well? Many FPGA cards have nvmem and downloaders
> > > > supports updating both FPGA region and nvmem.
> > > > 
> > > > According to the patchset, the basic workflow of the 2 update types are
> > > > quite similar, get the data, prepare for the HW, write and complete.
> > > > They are already implemented in FPGA manager. We've discussed some
> > > > differences like threading or canceling the update, which are
> > > > not provided by FPGA manager but they may also nice to have for FPGA
> > > > region update. An FPGA region update may also last for a long time??
> > > > So I think having 2 sets of similar frameworks in FPGA is unnecessary.
> > > > 
> > > > My quick mind is that we add some flags in struct fpga_mgr & struct
> > > > fpga_image_info to indicate the HW capability (support FPGA region
> > > > update or nvmem update or both) of the download engine and the provided
> > > > image type. Then the low-level driver knows how to download if it
> > > > supports both image types.
> > > > 
> > > > An char device could be added for each fpga manager dev, providing the
> > > > user APIs for nvmem update. We may not use the char dev for FPGA region
> > > > update cause it changes the system HW devices and needs device hotplug
> > > > in FPGA region. We'd better leave it to FPGA region class, this is
> > > > another topic.
> > > > 
> > > > More discussion is appreciated.
> > > I also think fpga_mgr could be extended.
> > > 
> > > In this patchset,
> > > 
> > > https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
> > > 
> > > A second, similar set of write ops was added to fpga_manger_ops,
> > > 
> > > new bit/flag was added to fpga_image_info
> > > 
> > > The intent was for dfl to add their specific ops to cover what is done in
> > > this patchset.
> > I think we don't have to add 2 ops for reconfig & reimage in framework,
> > the 2 processes are almost the same.
> > 
> > Just add the _REIMAGE (or something else, NVMEM?) flag for
> > fpga_image_info, and low level drivers handle it as they do for other
> > flags.
> > 
> > How do you think?
> 
> A single set is fine.
> 
> A difficult part of is the length of  time to do the write. The existing
> write should be improved to use a worker thread.

Yes, we could have a flag and optionally threading the writing.

Thanks,
Yilun

> 
> Tom
> 
> > 
> > Thanks,
> > Yilun
> > 
> > > Any other driver would do similar.
> > > 
> > > Is this close to what you are thinking ?
> > > 
> > > Tom
> > > 
> > > > Thanks,
> > > > Yilun
> > > >

Xu Yilun Oct. 12, 2021, 7:56 a.m. UTC | #8

On Tue, Oct 12, 2021 at 03:47:52PM +0800, Xu Yilun wrote:
> On Mon, Oct 11, 2021 at 06:00:16PM -0700, Russ Weight wrote:
> > 
> > 
> > On 10/11/21 5:35 AM, Tom Rix wrote:
> > >
> > > On 10/10/21 6:41 PM, Xu Yilun wrote:
> > >> On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
> > >>> On 10/9/21 1:08 AM, Xu Yilun wrote:
> > >>>> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
> > >>>>> The FPGA Image Load framework provides an API to upload image
> > >>>>> files to an FPGA device. Image files are self-describing. They could
> > >>>>> contain FPGA images, BMC images, Root Entry Hashes, or other device
> > >>>>> specific files. It is up to the lower-level device driver and the
> > >>>>> target device to authenticate and disposition the file data.
> > >>>> I've reconsider the FPGA persistent image update again, and think we
> > >>>> may include it in FPGA manager framework.
> > >>>>
> > >>>> Sorry I raised this topic again when it is already at patch v17, but now
> > >>>> I need to consider more seriously than before.
> > >>>>
> > >>>> We have consensus the FPGA persistent image update is just like a normal
> > >>>> firmware update which finally writes the nvmem like flash or eeprom,
> > >>>> while the current FPGA manager deals with the active FPGA region update
> > >>>> and re-activation. Could we just expand the FPGA manager and let it handle
> > >>>> the nvmem update as well? Many FPGA cards have nvmem and downloaders
> > >>>> supports updating both FPGA region and nvmem.
> > The fpga-image-load driver is actually just a data transfer. The class
> 
> IMHO, The fpga-mgr dev is also a data transfer. The fpga-region dev is
> acting as the FPGA region admin responsible for gating, transfering and
> re-enumerating.
> 
> So my opinion is to add a new data transfer type and keep a unified process.
> 
> > driver has no knowledge about what the data is or where/if the data will
> > be stored.
> 
> The fpga-image-load driver knows the data will be stored in nvmem, while
> the fpga-mgr knows the data will be stored in FPGA cells. They may need
> to know the exact physical position to store, may not, depends on what the
> HW engines are.
> 
> > 
> > This functionality could, of course, be merged into the fpga-mgr. I did
> > a proof of concept of this a while back and we discussed the pros and cons.
> > See this email for a recap:
> > 
> > https://marc.info/?l=linux-fpga&m=161998085507374&w=2
> > 
> > Things have changed some with the evolution of the driver. The IOCTL
> > approach probably fits better than the sysfs implementation. At the time
> > it seemed that a merge would add unnecessary complexity without adding value.
> 
> I think at least developers don't have to go through 2 sets of software
> stacks which are of the same concept. And adding some new features like
> optionally threading or canceling data transfer are also good to FPGA
> region update. And the nvmem update could also be benifit from exsiting
> implementations like scatter-gather buffers, in-kernel firmware loading.
> 
> I try to explain myself according to each of your concern from that mail
> thread:
> 
> Purpose of the 2 updates
> ========================
> 
>   As I said before, I think they are both data transfer devices. FPGA
> region update gets extra support from fpga-region & fpga-bridge, and
> FPGA nvmem update could be a standalone fpga-mgr.
> 
> Extra APIs that are unique to nvmem update
> ==========================================
> 
>   cdev APIs for nvmem update:
>     Yes we need to expand the functionality so we need them.
> 
>   available_images, image_load APIs for loading nvmem content to FPGA
>   region:
>     These are features in later patchsets which are not submitted, but we
>     could talk about it in advance. I think this is actually a FPGA region
>     update from nvmem, it also requires gating, data loading (no SW transfer)
>     and re-enumeration, or a single command to image_load HW may result system
>     disordered. The FPGA framework now only supports update from in-kernel
>     user data, maybe we add support for update from nvmem later.
> 
>   fpga-mgr state extend:
>     I think it could be extended, The current states are not perfect,
>     adding something like IDLE or READY is just fine.
> 
>   fpga-mgr status extend:
>     Add general error definitions as needed. If there is some status
>     that is quite vendor specific, expose it in low-level driver.
> 
>   remaining-size:
>     Nice to have for all.
> 
> Threading the update
> ====================
> 
>   Also a good option for FPGA region update, maybe we also have a slow FPGA
>   reprogrammer?

Another thought is, could we implement the non-threading version
firstly, so there would be less change and faster to have the basic
functionality. But either is OK for me.

Thanks,
Yilun

> 
> Cancelling the update
> ====================
> 
>   Also a good option for FPGA region update if HW engine supports.
> 
> Thanks,
> Yilun
> 
> > 
> > >>>>
> > >>>> According to the patchset, the basic workflow of the 2 update types are
> > >>>> quite similar, get the data, prepare for the HW, write and complete.
> > >>>> They are already implemented in FPGA manager. We've discussed some
> > >>>> differences like threading or canceling the update, which are
> > >>>> not provided by FPGA manager but they may also nice to have for FPGA
> > >>>> region update. An FPGA region update may also last for a long time??
> > >>>> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
> > >>>>
> > >>>> My quick mind is that we add some flags in struct fpga_mgr & struct
> > >>>> fpga_image_info to indicate the HW capability (support FPGA region
> > >>>> update or nvmem update or both) of the download engine and the provided
> > >>>> image type. Then the low-level driver knows how to download if it
> > >>>> supports both image types.An char device could be added for each fpga manager dev, providing the
> > >>>> user APIs for nvmem update. We may not use the char dev for FPGA region
> > >>>> update cause it changes the system HW devices and needs device hotplug
> > >>>> in FPGA region. We'd better leave it to FPGA region class, this is
> > >>>> another topic.
> > I'll give this some more thought and see if I can come up with some RFC
> > patches.
> > 
> > - Russ
> > >>>>
> > >>>> More discussion is appreciated.
> > >>> I also think fpga_mgr could be extended.
> > >>>
> > >>> In this patchset,
> > >>>
> > >>> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
> > >>>
> > >>> A second, similar set of write ops was added to fpga_manger_ops,
> > >>>
> > >>> new bit/flag was added to fpga_image_info
> > >>>
> > >>> The intent was for dfl to add their specific ops to cover what is done in
> > >>> this patchset.
> > >> I think we don't have to add 2 ops for reconfig & reimage in framework,
> > >> the 2 processes are almost the same.
> > >>
> > >> Just add the _REIMAGE (or something else, NVMEM?) flag for
> > >> fpga_image_info, and low level drivers handle it as they do for other
> > >> flags.
> > >>
> > >> How do you think?
> > >
> > > A single set is fine.
> > >
> > > A difficult part of is the length of  time to do the write. The existing write should be improved to use a worker thread.
> > >
> > > Tom
> > >
> > >>
> > >> Thanks,
> > >> Yilun
> > >>
> > >>> Any other driver would do similar.
> > >>>
> > >>> Is this close to what you are thinking ?
> > >>>
> > >>> Tom
> > >>>
> > >>>> Thanks,
> > >>>> Yilun
> > >>>>
> > >

Russ Weight Oct. 12, 2021, 5:20 p.m. UTC | #9

On 10/12/21 12:47 AM, Xu Yilun wrote:
> On Mon, Oct 11, 2021 at 06:00:16PM -0700, Russ Weight wrote:
>>
>> On 10/11/21 5:35 AM, Tom Rix wrote:
>>> On 10/10/21 6:41 PM, Xu Yilun wrote:
>>>> On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
>>>>> On 10/9/21 1:08 AM, Xu Yilun wrote:
>>>>>> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
>>>>>>> The FPGA Image Load framework provides an API to upload image
>>>>>>> files to an FPGA device. Image files are self-describing. They could
>>>>>>> contain FPGA images, BMC images, Root Entry Hashes, or other device
>>>>>>> specific files. It is up to the lower-level device driver and the
>>>>>>> target device to authenticate and disposition the file data.
>>>>>> I've reconsider the FPGA persistent image update again, and think we
>>>>>> may include it in FPGA manager framework.
>>>>>>
>>>>>> Sorry I raised this topic again when it is already at patch v17, but now
>>>>>> I need to consider more seriously than before.
>>>>>>
>>>>>> We have consensus the FPGA persistent image update is just like a normal
>>>>>> firmware update which finally writes the nvmem like flash or eeprom,
>>>>>> while the current FPGA manager deals with the active FPGA region update
>>>>>> and re-activation. Could we just expand the FPGA manager and let it handle
>>>>>> the nvmem update as well? Many FPGA cards have nvmem and downloaders
>>>>>> supports updating both FPGA region and nvmem.
>> The fpga-image-load driver is actually just a data transfer. The class
> IMHO, The fpga-mgr dev is also a data transfer. The fpga-region dev is
> acting as the FPGA region admin responsible for gating, transfering and
> re-enumerating.
>
> So my opinion is to add a new data transfer type and keep a unified process.
>
>> driver has no knowledge about what the data is or where/if the data will
>> be stored.
> The fpga-image-load driver knows the data will be stored in nvmem,
FYI: This is not strictly correct. In a coming product there is a
case where the data will be stored in RAM. Richard Gong was also
looking to use this driver to validate an image without programming
or storing it. The fpga-image-load driver has no expectation that
the data will be stored in nvmem, or even that it will be stored
at all.

> while
> the fpga-mgr knows the data will be stored in FPGA cells. They may need
> to know the exact physical position to store, may not, depends on what the
> HW engines are.
>
>> This functionality could, of course, be merged into the fpga-mgr. I did
>> a proof of concept of this a while back and we discussed the pros and cons.
>> See this email for a recap:
>>
>> https://marc.info/?l=linux-fpga&m=161998085507374&w=2
>>
>> Things have changed some with the evolution of the driver. The IOCTL
>> approach probably fits better than the sysfs implementation. At the time
>> it seemed that a merge would add unnecessary complexity without adding value.
> I think at least developers don't have to go through 2 sets of software
> stacks which are of the same concept. And adding some new features like
> optionally threading or canceling data transfer are also good to FPGA
> region update. And the nvmem update could also be benifit from exsiting
> implementations like scatter-gather buffers, in-kernel firmware loading.
>
> I try to explain myself according to each of your concern from that mail
> thread:
>
> Purpose of the 2 updates
> ========================
>
>   As I said before, I think they are both data transfer devices. FPGA
> region update gets extra support from fpga-region & fpga-bridge, and
> FPGA nvmem update could be a standalone fpga-mgr.
>
> Extra APIs that are unique to nvmem update
> ==========================================
>
>   cdev APIs for nvmem update:
>     Yes we need to expand the functionality so we need them.
>
>   available_images, image_load APIs for loading nvmem content to FPGA
>   region:
>     These are features in later patchsets which are not submitted, but we
>     could talk about it in advance. I think this is actually a FPGA region
>     update from nvmem, it also requires gating, data loading (no SW transfer)
>     and re-enumeration, or a single command to image_load HW may result system
>     disordered. The FPGA framework now only supports update from in-kernel
>     user data, maybe we add support for update from nvmem later.
>
>   fpga-mgr state extend:
>     I think it could be extended, The current states are not perfect,
>     adding something like IDLE or READY is just fine.
>
>   fpga-mgr status extend:
>     Add general error definitions as needed. If there is some status
>     that is quite vendor specific, expose it in low-level driver.
>
>   remaining-size:
>     Nice to have for all.
>
> Threading the update
> ====================
>
>   Also a good option for FPGA region update, maybe we also have a slow FPGA
>   reprogrammer?
>
> Cancelling the update
> ====================
>
>   Also a good option for FPGA region update if HW engine supports.
These are all good points.

Thanks,
- Russ
>
> Thanks,
> Yilun
>
>>>>>> According to the patchset, the basic workflow of the 2 update types are
>>>>>> quite similar, get the data, prepare for the HW, write and complete.
>>>>>> They are already implemented in FPGA manager. We've discussed some
>>>>>> differences like threading or canceling the update, which are
>>>>>> not provided by FPGA manager but they may also nice to have for FPGA
>>>>>> region update. An FPGA region update may also last for a long time??
>>>>>> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
>>>>>>
>>>>>> My quick mind is that we add some flags in struct fpga_mgr & struct
>>>>>> fpga_image_info to indicate the HW capability (support FPGA region
>>>>>> update or nvmem update or both) of the download engine and the provided
>>>>>> image type. Then the low-level driver knows how to download if it
>>>>>> supports both image types.An char device could be added for each fpga manager dev, providing the
>>>>>> user APIs for nvmem update. We may not use the char dev for FPGA region
>>>>>> update cause it changes the system HW devices and needs device hotplug
>>>>>> in FPGA region. We'd better leave it to FPGA region class, this is
>>>>>> another topic.
>> I'll give this some more thought and see if I can come up with some RFC
>> patches.
>>
>> - Russ
>>>>>> More discussion is appreciated.
>>>>> I also think fpga_mgr could be extended.
>>>>>
>>>>> In this patchset,
>>>>>
>>>>> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
>>>>>
>>>>> A second, similar set of write ops was added to fpga_manger_ops,
>>>>>
>>>>> new bit/flag was added to fpga_image_info
>>>>>
>>>>> The intent was for dfl to add their specific ops to cover what is done in
>>>>> this patchset.
>>>> I think we don't have to add 2 ops for reconfig & reimage in framework,
>>>> the 2 processes are almost the same.
>>>>
>>>> Just add the _REIMAGE (or something else, NVMEM?) flag for
>>>> fpga_image_info, and low level drivers handle it as they do for other
>>>> flags.
>>>>
>>>> How do you think?
>>> A single set is fine.
>>>
>>> A difficult part of is the length of  time to do the write. The existing write should be improved to use a worker thread.
>>>
>>> Tom
>>>
>>>> Thanks,
>>>> Yilun
>>>>
>>>>> Any other driver would do similar.
>>>>>
>>>>> Is this close to what you are thinking ?
>>>>>
>>>>> Tom
>>>>>
>>>>>> Thanks,
>>>>>> Yilun
>>>>>>

Xu Yilun Oct. 13, 2021, 1:06 a.m. UTC | #10

On Tue, Oct 12, 2021 at 10:20:15AM -0700, Russ Weight wrote:
> 
> 
> On 10/12/21 12:47 AM, Xu Yilun wrote:
> > On Mon, Oct 11, 2021 at 06:00:16PM -0700, Russ Weight wrote:
> >>
> >> On 10/11/21 5:35 AM, Tom Rix wrote:
> >>> On 10/10/21 6:41 PM, Xu Yilun wrote:
> >>>> On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
> >>>>> On 10/9/21 1:08 AM, Xu Yilun wrote:
> >>>>>> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
> >>>>>>> The FPGA Image Load framework provides an API to upload image
> >>>>>>> files to an FPGA device. Image files are self-describing. They could
> >>>>>>> contain FPGA images, BMC images, Root Entry Hashes, or other device
> >>>>>>> specific files. It is up to the lower-level device driver and the
> >>>>>>> target device to authenticate and disposition the file data.
> >>>>>> I've reconsider the FPGA persistent image update again, and think we
> >>>>>> may include it in FPGA manager framework.
> >>>>>>
> >>>>>> Sorry I raised this topic again when it is already at patch v17, but now
> >>>>>> I need to consider more seriously than before.
> >>>>>>
> >>>>>> We have consensus the FPGA persistent image update is just like a normal
> >>>>>> firmware update which finally writes the nvmem like flash or eeprom,
> >>>>>> while the current FPGA manager deals with the active FPGA region update
> >>>>>> and re-activation. Could we just expand the FPGA manager and let it handle
> >>>>>> the nvmem update as well? Many FPGA cards have nvmem and downloaders
> >>>>>> supports updating both FPGA region and nvmem.
> >> The fpga-image-load driver is actually just a data transfer. The class
> > IMHO, The fpga-mgr dev is also a data transfer. The fpga-region dev is
> > acting as the FPGA region admin responsible for gating, transfering and
> > re-enumerating.
> >
> > So my opinion is to add a new data transfer type and keep a unified process.
> >
> >> driver has no knowledge about what the data is or where/if the data will
> >> be stored.
> > The fpga-image-load driver knows the data will be stored in nvmem,
> FYI: This is not strictly correct. In a coming product there is a
> case where the data will be stored in RAM. Richard Gong was also
> looking to use this driver to validate an image without programming
> or storing it. The fpga-image-load driver has no expectation that
> the data will be stored in nvmem, or even that it will be stored
> at all.

OK, we can discuss that use case then. But fundamentally a driver should
be clear what it is doing. You may try to extend the FPGA framework to
support nvmem storage, or image validation, but cannot say we feed the
data to any engine undefined by the framework.

Thanks,
Yilun

> 
> > while
> > the fpga-mgr knows the data will be stored in FPGA cells. They may need
> > to know the exact physical position to store, may not, depends on what the
> > HW engines are.
> >
> >> This functionality could, of course, be merged into the fpga-mgr. I did
> >> a proof of concept of this a while back and we discussed the pros and cons.
> >> See this email for a recap:
> >>
> >> https://marc.info/?l=linux-fpga&m=161998085507374&w=2
> >>
> >> Things have changed some with the evolution of the driver. The IOCTL
> >> approach probably fits better than the sysfs implementation. At the time
> >> it seemed that a merge would add unnecessary complexity without adding value.
> > I think at least developers don't have to go through 2 sets of software
> > stacks which are of the same concept. And adding some new features like
> > optionally threading or canceling data transfer are also good to FPGA
> > region update. And the nvmem update could also be benifit from exsiting
> > implementations like scatter-gather buffers, in-kernel firmware loading.
> >
> > I try to explain myself according to each of your concern from that mail
> > thread:
> >
> > Purpose of the 2 updates
> > ========================
> >
> >   As I said before, I think they are both data transfer devices. FPGA
> > region update gets extra support from fpga-region & fpga-bridge, and
> > FPGA nvmem update could be a standalone fpga-mgr.
> >
> > Extra APIs that are unique to nvmem update
> > ==========================================
> >
> >   cdev APIs for nvmem update:
> >     Yes we need to expand the functionality so we need them.
> >
> >   available_images, image_load APIs for loading nvmem content to FPGA
> >   region:
> >     These are features in later patchsets which are not submitted, but we
> >     could talk about it in advance. I think this is actually a FPGA region
> >     update from nvmem, it also requires gating, data loading (no SW transfer)
> >     and re-enumeration, or a single command to image_load HW may result system
> >     disordered. The FPGA framework now only supports update from in-kernel
> >     user data, maybe we add support for update from nvmem later.
> >
> >   fpga-mgr state extend:
> >     I think it could be extended, The current states are not perfect,
> >     adding something like IDLE or READY is just fine.
> >
> >   fpga-mgr status extend:
> >     Add general error definitions as needed. If there is some status
> >     that is quite vendor specific, expose it in low-level driver.
> >
> >   remaining-size:
> >     Nice to have for all.
> >
> > Threading the update
> > ====================
> >
> >   Also a good option for FPGA region update, maybe we also have a slow FPGA
> >   reprogrammer?
> >
> > Cancelling the update
> > ====================
> >
> >   Also a good option for FPGA region update if HW engine supports.
> These are all good points.
> 
> Thanks,
> - Russ
> >
> > Thanks,
> > Yilun
> >
> >>>>>> According to the patchset, the basic workflow of the 2 update types are
> >>>>>> quite similar, get the data, prepare for the HW, write and complete.
> >>>>>> They are already implemented in FPGA manager. We've discussed some
> >>>>>> differences like threading or canceling the update, which are
> >>>>>> not provided by FPGA manager but they may also nice to have for FPGA
> >>>>>> region update. An FPGA region update may also last for a long time??
> >>>>>> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
> >>>>>>
> >>>>>> My quick mind is that we add some flags in struct fpga_mgr & struct
> >>>>>> fpga_image_info to indicate the HW capability (support FPGA region
> >>>>>> update or nvmem update or both) of the download engine and the provided
> >>>>>> image type. Then the low-level driver knows how to download if it
> >>>>>> supports both image types.An char device could be added for each fpga manager dev, providing the
> >>>>>> user APIs for nvmem update. We may not use the char dev for FPGA region
> >>>>>> update cause it changes the system HW devices and needs device hotplug
> >>>>>> in FPGA region. We'd better leave it to FPGA region class, this is
> >>>>>> another topic.
> >> I'll give this some more thought and see if I can come up with some RFC
> >> patches.
> >>
> >> - Russ
> >>>>>> More discussion is appreciated.
> >>>>> I also think fpga_mgr could be extended.
> >>>>>
> >>>>> In this patchset,
> >>>>>
> >>>>> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
> >>>>>
> >>>>> A second, similar set of write ops was added to fpga_manger_ops,
> >>>>>
> >>>>> new bit/flag was added to fpga_image_info
> >>>>>
> >>>>> The intent was for dfl to add their specific ops to cover what is done in
> >>>>> this patchset.
> >>>> I think we don't have to add 2 ops for reconfig & reimage in framework,
> >>>> the 2 processes are almost the same.
> >>>>
> >>>> Just add the _REIMAGE (or something else, NVMEM?) flag for
> >>>> fpga_image_info, and low level drivers handle it as they do for other
> >>>> flags.
> >>>>
> >>>> How do you think?
> >>> A single set is fine.
> >>>
> >>> A difficult part of is the length of  time to do the write. The existing write should be improved to use a worker thread.
> >>>
> >>> Tom
> >>>
> >>>> Thanks,
> >>>> Yilun
> >>>>
> >>>>> Any other driver would do similar.
> >>>>>
> >>>>> Is this close to what you are thinking ?
> >>>>>
> >>>>> Tom
> >>>>>
> >>>>>> Thanks,
> >>>>>> Yilun
> >>>>>>

Russ Weight Oct. 13, 2021, 6:09 p.m. UTC | #11

On 10/12/21 6:06 PM, Xu Yilun wrote:
> On Tue, Oct 12, 2021 at 10:20:15AM -0700, Russ Weight wrote:
>>
>> On 10/12/21 12:47 AM, Xu Yilun wrote:
>>> On Mon, Oct 11, 2021 at 06:00:16PM -0700, Russ Weight wrote:
>>>> On 10/11/21 5:35 AM, Tom Rix wrote:
>>>>> On 10/10/21 6:41 PM, Xu Yilun wrote:
>>>>>> On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
>>>>>>> On 10/9/21 1:08 AM, Xu Yilun wrote:
>>>>>>>> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
>>>>>>>>> The FPGA Image Load framework provides an API to upload image
>>>>>>>>> files to an FPGA device. Image files are self-describing. They could
>>>>>>>>> contain FPGA images, BMC images, Root Entry Hashes, or other device
>>>>>>>>> specific files. It is up to the lower-level device driver and the
>>>>>>>>> target device to authenticate and disposition the file data.
>>>>>>>> I've reconsider the FPGA persistent image update again, and think we
>>>>>>>> may include it in FPGA manager framework.
>>>>>>>>
>>>>>>>> Sorry I raised this topic again when it is already at patch v17, but now
>>>>>>>> I need to consider more seriously than before.
>>>>>>>>
>>>>>>>> We have consensus the FPGA persistent image update is just like a normal
>>>>>>>> firmware update which finally writes the nvmem like flash or eeprom,
>>>>>>>> while the current FPGA manager deals with the active FPGA region update
>>>>>>>> and re-activation. Could we just expand the FPGA manager and let it handle
>>>>>>>> the nvmem update as well? Many FPGA cards have nvmem and downloaders
>>>>>>>> supports updating both FPGA region and nvmem.
>>>> The fpga-image-load driver is actually just a data transfer. The class
>>> IMHO, The fpga-mgr dev is also a data transfer. The fpga-region dev is
>>> acting as the FPGA region admin responsible for gating, transfering and
>>> re-enumerating.
>>>
>>> So my opinion is to add a new data transfer type and keep a unified process.
>>>
>>>> driver has no knowledge about what the data is or where/if the data will
>>>> be stored.
>>> The fpga-image-load driver knows the data will be stored in nvmem,
>> FYI: This is not strictly correct. In a coming product there is a
>> case where the data will be stored in RAM. Richard Gong was also
>> looking to use this driver to validate an image without programming
>> or storing it. The fpga-image-load driver has no expectation that
>> the data will be stored in nvmem, or even that it will be stored
>> at all.
> OK, we can discuss that use case then. But fundamentally a driver should
> be clear what it is doing.

The lower-level driver is, of course, clear what it is doing. And the
FPGA Image Load Framework simply provides a consistent API and manages
a potentially long-running data transfer in the context of a kernel
worker thread.

It sounds like you are saying that that is not "clear enough" in the
context of the FPGA Manager?

The files that are used with Intel PAC devices are self-describing. The
user-space tools, the class driver and the lower-level driver just pass
the data through to the card BMC without any knowledge of the content,
purpose or final destination of the data.

The card BMC will receive signed data, validate it, and process it as a
BMC image, an FPGA image, a Root Entry Hash, or a key cancellation. In
the n6000, it could also be part of a multi-step process for programming
SDM keys and the data may not be stored permanently.

> You may try to extend the FPGA framework to
> support nvmem storage, or image validation, but cannot say we feed the
> data to any engine undefined by the framework.

I'm not sure what you mean by "feed the data to any engine undefined by the
framework". I think the "engine" is the lower level driver/device that invokes
the fpga_mgr. The lower level driver, of course, is clear what it is doing.
The fpga_mgr cannot control what driver invokes it.

Are saying that when invoking the fpga-mgr, that it _must_ also pass descriptive
data. Meaning that a self-describing file alone is not acceptable?

Thanks,
- Russ

> Thanks,
> Yilun
>
>>> while
>>> the fpga-mgr knows the data will be stored in FPGA cells. They may need
>>> to know the exact physical position to store, may not, depends on what the
>>> HW engines are.
>>>
>>>> This functionality could, of course, be merged into the fpga-mgr. I did
>>>> a proof of concept of this a while back and we discussed the pros and cons.
>>>> See this email for a recap:
>>>>
>>>> https://marc.info/?l=linux-fpga&m=161998085507374&w=2
>>>>
>>>> Things have changed some with the evolution of the driver. The IOCTL
>>>> approach probably fits better than the sysfs implementation. At the time
>>>> it seemed that a merge would add unnecessary complexity without adding value.
>>> I think at least developers don't have to go through 2 sets of software
>>> stacks which are of the same concept. And adding some new features like
>>> optionally threading or canceling data transfer are also good to FPGA
>>> region update. And the nvmem update could also be benifit from exsiting
>>> implementations like scatter-gather buffers, in-kernel firmware loading.
>>>
>>> I try to explain myself according to each of your concern from that mail
>>> thread:
>>>
>>> Purpose of the 2 updates
>>> ========================
>>>
>>>   As I said before, I think they are both data transfer devices. FPGA
>>> region update gets extra support from fpga-region & fpga-bridge, and
>>> FPGA nvmem update could be a standalone fpga-mgr.
>>>
>>> Extra APIs that are unique to nvmem update
>>> ==========================================
>>>
>>>   cdev APIs for nvmem update:
>>>     Yes we need to expand the functionality so we need them.
>>>
>>>   available_images, image_load APIs for loading nvmem content to FPGA
>>>   region:
>>>     These are features in later patchsets which are not submitted, but we
>>>     could talk about it in advance. I think this is actually a FPGA region
>>>     update from nvmem, it also requires gating, data loading (no SW transfer)
>>>     and re-enumeration, or a single command to image_load HW may result system
>>>     disordered. The FPGA framework now only supports update from in-kernel
>>>     user data, maybe we add support for update from nvmem later.
>>>
>>>   fpga-mgr state extend:
>>>     I think it could be extended, The current states are not perfect,
>>>     adding something like IDLE or READY is just fine.
>>>
>>>   fpga-mgr status extend:
>>>     Add general error definitions as needed. If there is some status
>>>     that is quite vendor specific, expose it in low-level driver.
>>>
>>>   remaining-size:
>>>     Nice to have for all.
>>>
>>> Threading the update
>>> ====================
>>>
>>>   Also a good option for FPGA region update, maybe we also have a slow FPGA
>>>   reprogrammer?
>>>
>>> Cancelling the update
>>> ====================
>>>
>>>   Also a good option for FPGA region update if HW engine supports.
>> These are all good points.
>>
>> Thanks,
>> - Russ
>>> Thanks,
>>> Yilun
>>>
>>>>>>>> According to the patchset, the basic workflow of the 2 update types are
>>>>>>>> quite similar, get the data, prepare for the HW, write and complete.
>>>>>>>> They are already implemented in FPGA manager. We've discussed some
>>>>>>>> differences like threading or canceling the update, which are
>>>>>>>> not provided by FPGA manager but they may also nice to have for FPGA
>>>>>>>> region update. An FPGA region update may also last for a long time??
>>>>>>>> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
>>>>>>>>
>>>>>>>> My quick mind is that we add some flags in struct fpga_mgr & struct
>>>>>>>> fpga_image_info to indicate the HW capability (support FPGA region
>>>>>>>> update or nvmem update or both) of the download engine and the provided
>>>>>>>> image type. Then the low-level driver knows how to download if it
>>>>>>>> supports both image types.An char device could be added for each fpga manager dev, providing the
>>>>>>>> user APIs for nvmem update. We may not use the char dev for FPGA region
>>>>>>>> update cause it changes the system HW devices and needs device hotplug
>>>>>>>> in FPGA region. We'd better leave it to FPGA region class, this is
>>>>>>>> another topic.
>>>> I'll give this some more thought and see if I can come up with some RFC
>>>> patches.
>>>>
>>>> - Russ
>>>>>>>> More discussion is appreciated.
>>>>>>> I also think fpga_mgr could be extended.
>>>>>>>
>>>>>>> In this patchset,
>>>>>>>
>>>>>>> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
>>>>>>>
>>>>>>> A second, similar set of write ops was added to fpga_manger_ops,
>>>>>>>
>>>>>>> new bit/flag was added to fpga_image_info
>>>>>>>
>>>>>>> The intent was for dfl to add their specific ops to cover what is done in
>>>>>>> this patchset.
>>>>>> I think we don't have to add 2 ops for reconfig & reimage in framework,
>>>>>> the 2 processes are almost the same.
>>>>>>
>>>>>> Just add the _REIMAGE (or something else, NVMEM?) flag for
>>>>>> fpga_image_info, and low level drivers handle it as they do for other
>>>>>> flags.
>>>>>>
>>>>>> How do you think?
>>>>> A single set is fine.
>>>>>
>>>>> A difficult part of is the length of  time to do the write. The existing write should be improved to use a worker thread.
>>>>>
>>>>> Tom
>>>>>
>>>>>> Thanks,
>>>>>> Yilun
>>>>>>
>>>>>>> Any other driver would do similar.
>>>>>>>
>>>>>>> Is this close to what you are thinking ?
>>>>>>>
>>>>>>> Tom
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Yilun
>>>>>>>>

Xu Yilun Oct. 14, 2021, 1:49 a.m. UTC | #12

On Wed, Oct 13, 2021 at 11:09:08AM -0700, Russ Weight wrote:
> 
> 
> On 10/12/21 6:06 PM, Xu Yilun wrote:
> > On Tue, Oct 12, 2021 at 10:20:15AM -0700, Russ Weight wrote:
> >>
> >> On 10/12/21 12:47 AM, Xu Yilun wrote:
> >>> On Mon, Oct 11, 2021 at 06:00:16PM -0700, Russ Weight wrote:
> >>>> On 10/11/21 5:35 AM, Tom Rix wrote:
> >>>>> On 10/10/21 6:41 PM, Xu Yilun wrote:
> >>>>>> On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
> >>>>>>> On 10/9/21 1:08 AM, Xu Yilun wrote:
> >>>>>>>> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
> >>>>>>>>> The FPGA Image Load framework provides an API to upload image
> >>>>>>>>> files to an FPGA device. Image files are self-describing. They could
> >>>>>>>>> contain FPGA images, BMC images, Root Entry Hashes, or other device
> >>>>>>>>> specific files. It is up to the lower-level device driver and the
> >>>>>>>>> target device to authenticate and disposition the file data.
> >>>>>>>> I've reconsider the FPGA persistent image update again, and think we
> >>>>>>>> may include it in FPGA manager framework.
> >>>>>>>>
> >>>>>>>> Sorry I raised this topic again when it is already at patch v17, but now
> >>>>>>>> I need to consider more seriously than before.
> >>>>>>>>
> >>>>>>>> We have consensus the FPGA persistent image update is just like a normal
> >>>>>>>> firmware update which finally writes the nvmem like flash or eeprom,
> >>>>>>>> while the current FPGA manager deals with the active FPGA region update
> >>>>>>>> and re-activation. Could we just expand the FPGA manager and let it handle
> >>>>>>>> the nvmem update as well? Many FPGA cards have nvmem and downloaders
> >>>>>>>> supports updating both FPGA region and nvmem.
> >>>> The fpga-image-load driver is actually just a data transfer. The class
> >>> IMHO, The fpga-mgr dev is also a data transfer. The fpga-region dev is
> >>> acting as the FPGA region admin responsible for gating, transfering and
> >>> re-enumerating.
> >>>
> >>> So my opinion is to add a new data transfer type and keep a unified process.
> >>>
> >>>> driver has no knowledge about what the data is or where/if the data will
> >>>> be stored.
> >>> The fpga-image-load driver knows the data will be stored in nvmem,
> >> FYI: This is not strictly correct. In a coming product there is a
> >> case where the data will be stored in RAM. Richard Gong was also
> >> looking to use this driver to validate an image without programming
> >> or storing it. The fpga-image-load driver has no expectation that
> >> the data will be stored in nvmem, or even that it will be stored
> >> at all.
> > OK, we can discuss that use case then. But fundamentally a driver should
> > be clear what it is doing.
> 
> The lower-level driver is, of course, clear what it is doing. And the
> FPGA Image Load Framework simply provides a consistent API and manages
> a potentially long-running data transfer in the context of a kernel
> worker thread.
> 
> It sounds like you are saying that that is not "clear enough" in the
> context of the FPGA Manager?
> 
> The files that are used with Intel PAC devices are self-describing. The
> user-space tools, the class driver and the lower-level driver just pass
> the data through to the card BMC without any knowledge of the content,
> purpose or final destination of the data.
> 
> The card BMC will receive signed data, validate it, and process it as a
> BMC image, an FPGA image, a Root Entry Hash, or a key cancellation. In

I category all these actions as firmware update fully or partially on
persistent storage. The FPGA Manager don't have to know the meaning of
every byte on flash, but it should be aware the firmware is updated and
the card may acts differently with a new firmware. This is the common
working model for most of the FPGA cards so that we implement it in FPGA
manager class. 

> the n6000, it could also be part of a multi-step process for programming
> SDM keys and the data may not be stored permanently.

This is new to me, but seems to be different from firmware update, so lets
think about it again.

> 
> > You may try to extend the FPGA framework to
> > support nvmem storage, or image validation, but cannot say we feed the
> > data to any engine undefined by the framework.
> 
> I'm not sure what you mean by "feed the data to any engine undefined by the
> framework". I think the "engine" is the lower level driver/device that invokes
> the fpga_mgr. The lower level driver, of course, is clear what it is doing.
> The fpga_mgr cannot control what driver invokes it.
> 
> Are saying that when invoking the fpga-mgr, that it _must_ also pass descriptive
> data. Meaning that a self-describing file alone is not acceptable?

The class driver should define a reasonable working model and APIs.
Updating the FPGA backup storage is good to me. But receiving a mystery
box and do whatever it requires is not.

Self-describing file is OK, encryption is OK, but either the class
driver itself, or with the help of the low level driver, should make
sure it works within its scope.

Thanks,
Yilun

> 
> Thanks,
> - Russ
> 
> > Thanks,
> > Yilun
> >
> >>> while
> >>> the fpga-mgr knows the data will be stored in FPGA cells. They may need
> >>> to know the exact physical position to store, may not, depends on what the
> >>> HW engines are.
> >>>
> >>>> This functionality could, of course, be merged into the fpga-mgr. I did
> >>>> a proof of concept of this a while back and we discussed the pros and cons.
> >>>> See this email for a recap:
> >>>>
> >>>> https://marc.info/?l=linux-fpga&m=161998085507374&w=2
> >>>>
> >>>> Things have changed some with the evolution of the driver. The IOCTL
> >>>> approach probably fits better than the sysfs implementation. At the time
> >>>> it seemed that a merge would add unnecessary complexity without adding value.
> >>> I think at least developers don't have to go through 2 sets of software
> >>> stacks which are of the same concept. And adding some new features like
> >>> optionally threading or canceling data transfer are also good to FPGA
> >>> region update. And the nvmem update could also be benifit from exsiting
> >>> implementations like scatter-gather buffers, in-kernel firmware loading.
> >>>
> >>> I try to explain myself according to each of your concern from that mail
> >>> thread:
> >>>
> >>> Purpose of the 2 updates
> >>> ========================
> >>>
> >>>   As I said before, I think they are both data transfer devices. FPGA
> >>> region update gets extra support from fpga-region & fpga-bridge, and
> >>> FPGA nvmem update could be a standalone fpga-mgr.
> >>>
> >>> Extra APIs that are unique to nvmem update
> >>> ==========================================
> >>>
> >>>   cdev APIs for nvmem update:
> >>>     Yes we need to expand the functionality so we need them.
> >>>
> >>>   available_images, image_load APIs for loading nvmem content to FPGA
> >>>   region:
> >>>     These are features in later patchsets which are not submitted, but we
> >>>     could talk about it in advance. I think this is actually a FPGA region
> >>>     update from nvmem, it also requires gating, data loading (no SW transfer)
> >>>     and re-enumeration, or a single command to image_load HW may result system
> >>>     disordered. The FPGA framework now only supports update from in-kernel
> >>>     user data, maybe we add support for update from nvmem later.
> >>>
> >>>   fpga-mgr state extend:
> >>>     I think it could be extended, The current states are not perfect,
> >>>     adding something like IDLE or READY is just fine.
> >>>
> >>>   fpga-mgr status extend:
> >>>     Add general error definitions as needed. If there is some status
> >>>     that is quite vendor specific, expose it in low-level driver.
> >>>
> >>>   remaining-size:
> >>>     Nice to have for all.
> >>>
> >>> Threading the update
> >>> ====================
> >>>
> >>>   Also a good option for FPGA region update, maybe we also have a slow FPGA
> >>>   reprogrammer?
> >>>
> >>> Cancelling the update
> >>> ====================
> >>>
> >>>   Also a good option for FPGA region update if HW engine supports.
> >> These are all good points.
> >>
> >> Thanks,
> >> - Russ
> >>> Thanks,
> >>> Yilun
> >>>
> >>>>>>>> According to the patchset, the basic workflow of the 2 update types are
> >>>>>>>> quite similar, get the data, prepare for the HW, write and complete.
> >>>>>>>> They are already implemented in FPGA manager. We've discussed some
> >>>>>>>> differences like threading or canceling the update, which are
> >>>>>>>> not provided by FPGA manager but they may also nice to have for FPGA
> >>>>>>>> region update. An FPGA region update may also last for a long time??
> >>>>>>>> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
> >>>>>>>>
> >>>>>>>> My quick mind is that we add some flags in struct fpga_mgr & struct
> >>>>>>>> fpga_image_info to indicate the HW capability (support FPGA region
> >>>>>>>> update or nvmem update or both) of the download engine and the provided
> >>>>>>>> image type. Then the low-level driver knows how to download if it
> >>>>>>>> supports both image types.An char device could be added for each fpga manager dev, providing the
> >>>>>>>> user APIs for nvmem update. We may not use the char dev for FPGA region
> >>>>>>>> update cause it changes the system HW devices and needs device hotplug
> >>>>>>>> in FPGA region. We'd better leave it to FPGA region class, this is
> >>>>>>>> another topic.
> >>>> I'll give this some more thought and see if I can come up with some RFC
> >>>> patches.
> >>>>
> >>>> - Russ
> >>>>>>>> More discussion is appreciated.
> >>>>>>> I also think fpga_mgr could be extended.
> >>>>>>>
> >>>>>>> In this patchset,
> >>>>>>>
> >>>>>>> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
> >>>>>>>
> >>>>>>> A second, similar set of write ops was added to fpga_manger_ops,
> >>>>>>>
> >>>>>>> new bit/flag was added to fpga_image_info
> >>>>>>>
> >>>>>>> The intent was for dfl to add their specific ops to cover what is done in
> >>>>>>> this patchset.
> >>>>>> I think we don't have to add 2 ops for reconfig & reimage in framework,
> >>>>>> the 2 processes are almost the same.
> >>>>>>
> >>>>>> Just add the _REIMAGE (or something else, NVMEM?) flag for
> >>>>>> fpga_image_info, and low level drivers handle it as they do for other
> >>>>>> flags.
> >>>>>>
> >>>>>> How do you think?
> >>>>> A single set is fine.
> >>>>>
> >>>>> A difficult part of is the length of  time to do the write. The existing write should be improved to use a worker thread.
> >>>>>
> >>>>> Tom
> >>>>>
> >>>>>> Thanks,
> >>>>>> Yilun
> >>>>>>
> >>>>>>> Any other driver would do similar.
> >>>>>>>
> >>>>>>> Is this close to what you are thinking ?
> >>>>>>>
> >>>>>>> Tom
> >>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Yilun
> >>>>>>>>

Xu Yilun Oct. 15, 2021, 2:51 a.m. UTC | #13

On Thu, Oct 14, 2021 at 09:32:53AM -0700, Russ Weight wrote:
> 
> 
> On 10/13/21 6:49 PM, Xu Yilun wrote:
> > On Wed, Oct 13, 2021 at 11:09:08AM -0700, Russ Weight wrote:
> >>
> >> On 10/12/21 6:06 PM, Xu Yilun wrote:
> >>> On Tue, Oct 12, 2021 at 10:20:15AM -0700, Russ Weight wrote:
> >>>> On 10/12/21 12:47 AM, Xu Yilun wrote:
> >>>>> On Mon, Oct 11, 2021 at 06:00:16PM -0700, Russ Weight wrote:
> >>>>>> On 10/11/21 5:35 AM, Tom Rix wrote:
> >>>>>>> On 10/10/21 6:41 PM, Xu Yilun wrote:
> >>>>>>>> On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
> >>>>>>>>> On 10/9/21 1:08 AM, Xu Yilun wrote:
> >>>>>>>>>> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
> >>>>>>>>>>> The FPGA Image Load framework provides an API to upload image
> >>>>>>>>>>> files to an FPGA device. Image files are self-describing. They could
> >>>>>>>>>>> contain FPGA images, BMC images, Root Entry Hashes, or other device
> >>>>>>>>>>> specific files. It is up to the lower-level device driver and the
> >>>>>>>>>>> target device to authenticate and disposition the file data.
> >>>>>>>>>> I've reconsider the FPGA persistent image update again, and think we
> >>>>>>>>>> may include it in FPGA manager framework.
> >>>>>>>>>>
> >>>>>>>>>> Sorry I raised this topic again when it is already at patch v17, but now
> >>>>>>>>>> I need to consider more seriously than before.
> >>>>>>>>>>
> >>>>>>>>>> We have consensus the FPGA persistent image update is just like a normal
> >>>>>>>>>> firmware update which finally writes the nvmem like flash or eeprom,
> >>>>>>>>>> while the current FPGA manager deals with the active FPGA region update
> >>>>>>>>>> and re-activation. Could we just expand the FPGA manager and let it handle
> >>>>>>>>>> the nvmem update as well? Many FPGA cards have nvmem and downloaders
> >>>>>>>>>> supports updating both FPGA region and nvmem.
> >>>>>> The fpga-image-load driver is actually just a data transfer. The class
> >>>>> IMHO, The fpga-mgr dev is also a data transfer. The fpga-region dev is
> >>>>> acting as the FPGA region admin responsible for gating, transfering and
> >>>>> re-enumerating.
> >>>>>
> >>>>> So my opinion is to add a new data transfer type and keep a unified process.
> >>>>>
> >>>>>> driver has no knowledge about what the data is or where/if the data will
> >>>>>> be stored.
> >>>>> The fpga-image-load driver knows the data will be stored in nvmem,
> >>>> FYI: This is not strictly correct. In a coming product there is a
> >>>> case where the data will be stored in RAM. Richard Gong was also
> >>>> looking to use this driver to validate an image without programming
> >>>> or storing it. The fpga-image-load driver has no expectation that
> >>>> the data will be stored in nvmem, or even that it will be stored
> >>>> at all.
> >>> OK, we can discuss that use case then. But fundamentally a driver should
> >>> be clear what it is doing.
> >> The lower-level driver is, of course, clear what it is doing. And the
> >> FPGA Image Load Framework simply provides a consistent API and manages
> >> a potentially long-running data transfer in the context of a kernel
> >> worker thread.
> >>
> >> It sounds like you are saying that that is not "clear enough" in the
> >> context of the FPGA Manager?
> >>
> >> The files that are used with Intel PAC devices are self-describing. The
> >> user-space tools, the class driver and the lower-level driver just pass
> >> the data through to the card BMC without any knowledge of the content,
> >> purpose or final destination of the data.
> >>
> >> The card BMC will receive signed data, validate it, and process it as a
> >> BMC image, an FPGA image, a Root Entry Hash, or a key cancellation. In
> > I category all these actions as firmware update fully or partially on
> > persistent storage. The FPGA Manager don't have to know the meaning of
> > every byte on flash, but it should be aware the firmware is updated and
> > the card may acts differently with a new firmware. This is the common
> > working model for most of the FPGA cards so that we implement it in FPGA
> > manager class. 
> >
> >> the n6000, it could also be part of a multi-step process for programming
> >> SDM keys and the data may not be stored permanently.
> > This is new to me, but seems to be different from firmware update, so lets
> > think about it again.
> >
> >>> You may try to extend the FPGA framework to
> >>> support nvmem storage, or image validation, but cannot say we feed the
> >>> data to any engine undefined by the framework.
> >> I'm not sure what you mean by "feed the data to any engine undefined by the
> >> framework". I think the "engine" is the lower level driver/device that invokes
> >> the fpga_mgr. The lower level driver, of course, is clear what it is doing.
> >> The fpga_mgr cannot control what driver invokes it.
> >>
> >> Are saying that when invoking the fpga-mgr, that it _must_ also pass descriptive
> >> data. Meaning that a self-describing file alone is not acceptable?
> > The class driver should define a reasonable working model and APIs.
> > Updating the FPGA backup storage is good to me. But receiving a mystery
> > box and do whatever it requires is not.
> >
> > Self-describing file is OK, encryption is OK, but either the class
> > driver itself, or with the help of the low level driver, should make
> > sure it works within its scope.
> In our secure update process, the card BMC firmware authenticates
> the data using the root entry hashes and will either reject the
> data or perform some function based on the contents. Neither the
> user-space, the class driver, nor the lower level driver know
> what the contents are. It _is_ a "mystery box" to them. How do we
> verify scope in this model?

I think we need to find out how. One case is, the HW is designed to have
one single function, such as firmware update, then any image input
through firmware update API is within expectation, and the driver
should only serve the firmware update API. I think this is how the
N3000 is working now. If the HW is for another function, register itself
to serve another API, or another class driver.

Another case is, the HW could do multiple types of tasks depending on
the content of the image, such as firmware update, image verification,
or assumably power off the card ... There should be some mechanism for
the driver to only accept the right image according to what API is called.
Or the user may input an image named update_the_card.img through
firmware update API and finally get the card off. Having some headers
readable by host for the operation type? Or, some HW interface for host
to apply the operation type as well as the image, let the HW verify?
Let's think about it.

> 
> As you have noted, most current cases result in a change to the
> card, and I suspect that it will remain that way. But that can't be
> guaranteed, and I'm not convinced that a host driver needs to be
> concerned about it.

A host driver should know what is done, in some abstraction level.
I think updating the persistent storage is an acceptable abstraction
in FPGA domain, so I'd like to extend it in FPGA manager. But doing
anything according to the image is not.

Thanks,
Yilun

> 
> - Russ
> 
> >
> > Thanks,
> > Yilun
> >
> >> Thanks,
> >> - Russ
> >>
> >>> Thanks,
> >>> Yilun
> >>>
> >>>>> while
> >>>>> the fpga-mgr knows the data will be stored in FPGA cells. They may need
> >>>>> to know the exact physical position to store, may not, depends on what the
> >>>>> HW engines are.
> >>>>>
> >>>>>> This functionality could, of course, be merged into the fpga-mgr. I did
> >>>>>> a proof of concept of this a while back and we discussed the pros and cons.
> >>>>>> See this email for a recap:
> >>>>>>
> >>>>>> https://marc.info/?l=linux-fpga&m=161998085507374&w=2
> >>>>>>
> >>>>>> Things have changed some with the evolution of the driver. The IOCTL
> >>>>>> approach probably fits better than the sysfs implementation. At the time
> >>>>>> it seemed that a merge would add unnecessary complexity without adding value.
> >>>>> I think at least developers don't have to go through 2 sets of software
> >>>>> stacks which are of the same concept. And adding some new features like
> >>>>> optionally threading or canceling data transfer are also good to FPGA
> >>>>> region update. And the nvmem update could also be benifit from exsiting
> >>>>> implementations like scatter-gather buffers, in-kernel firmware loading.
> >>>>>
> >>>>> I try to explain myself according to each of your concern from that mail
> >>>>> thread:
> >>>>>
> >>>>> Purpose of the 2 updates
> >>>>> ========================
> >>>>>
> >>>>>   As I said before, I think they are both data transfer devices. FPGA
> >>>>> region update gets extra support from fpga-region & fpga-bridge, and
> >>>>> FPGA nvmem update could be a standalone fpga-mgr.
> >>>>>
> >>>>> Extra APIs that are unique to nvmem update
> >>>>> ==========================================
> >>>>>
> >>>>>   cdev APIs for nvmem update:
> >>>>>     Yes we need to expand the functionality so we need them.
> >>>>>
> >>>>>   available_images, image_load APIs for loading nvmem content to FPGA
> >>>>>   region:
> >>>>>     These are features in later patchsets which are not submitted, but we
> >>>>>     could talk about it in advance. I think this is actually a FPGA region
> >>>>>     update from nvmem, it also requires gating, data loading (no SW transfer)
> >>>>>     and re-enumeration, or a single command to image_load HW may result system
> >>>>>     disordered. The FPGA framework now only supports update from in-kernel
> >>>>>     user data, maybe we add support for update from nvmem later.
> >>>>>
> >>>>>   fpga-mgr state extend:
> >>>>>     I think it could be extended, The current states are not perfect,
> >>>>>     adding something like IDLE or READY is just fine.
> >>>>>
> >>>>>   fpga-mgr status extend:
> >>>>>     Add general error definitions as needed. If there is some status
> >>>>>     that is quite vendor specific, expose it in low-level driver.
> >>>>>
> >>>>>   remaining-size:
> >>>>>     Nice to have for all.
> >>>>>
> >>>>> Threading the update
> >>>>> ====================
> >>>>>
> >>>>>   Also a good option for FPGA region update, maybe we also have a slow FPGA
> >>>>>   reprogrammer?
> >>>>>
> >>>>> Cancelling the update
> >>>>> ====================
> >>>>>
> >>>>>   Also a good option for FPGA region update if HW engine supports.
> >>>> These are all good points.
> >>>>
> >>>> Thanks,
> >>>> - Russ
> >>>>> Thanks,
> >>>>> Yilun
> >>>>>
> >>>>>>>>>> According to the patchset, the basic workflow of the 2 update types are
> >>>>>>>>>> quite similar, get the data, prepare for the HW, write and complete.
> >>>>>>>>>> They are already implemented in FPGA manager. We've discussed some
> >>>>>>>>>> differences like threading or canceling the update, which are
> >>>>>>>>>> not provided by FPGA manager but they may also nice to have for FPGA
> >>>>>>>>>> region update. An FPGA region update may also last for a long time??
> >>>>>>>>>> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
> >>>>>>>>>>
> >>>>>>>>>> My quick mind is that we add some flags in struct fpga_mgr & struct
> >>>>>>>>>> fpga_image_info to indicate the HW capability (support FPGA region
> >>>>>>>>>> update or nvmem update or both) of the download engine and the provided
> >>>>>>>>>> image type. Then the low-level driver knows how to download if it
> >>>>>>>>>> supports both image types.An char device could be added for each fpga manager dev, providing the
> >>>>>>>>>> user APIs for nvmem update. We may not use the char dev for FPGA region
> >>>>>>>>>> update cause it changes the system HW devices and needs device hotplug
> >>>>>>>>>> in FPGA region. We'd better leave it to FPGA region class, this is
> >>>>>>>>>> another topic.
> >>>>>> I'll give this some more thought and see if I can come up with some RFC
> >>>>>> patches.
> >>>>>>
> >>>>>> - Russ
> >>>>>>>>>> More discussion is appreciated.
> >>>>>>>>> I also think fpga_mgr could be extended.
> >>>>>>>>>
> >>>>>>>>> In this patchset,
> >>>>>>>>>
> >>>>>>>>> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
> >>>>>>>>>
> >>>>>>>>> A second, similar set of write ops was added to fpga_manger_ops,
> >>>>>>>>>
> >>>>>>>>> new bit/flag was added to fpga_image_info
> >>>>>>>>>
> >>>>>>>>> The intent was for dfl to add their specific ops to cover what is done in
> >>>>>>>>> this patchset.
> >>>>>>>> I think we don't have to add 2 ops for reconfig & reimage in framework,
> >>>>>>>> the 2 processes are almost the same.
> >>>>>>>>
> >>>>>>>> Just add the _REIMAGE (or something else, NVMEM?) flag for
> >>>>>>>> fpga_image_info, and low level drivers handle it as they do for other
> >>>>>>>> flags.
> >>>>>>>>
> >>>>>>>> How do you think?
> >>>>>>> A single set is fine.
> >>>>>>>
> >>>>>>> A difficult part of is the length of  time to do the write. The existing write should be improved to use a worker thread.
> >>>>>>>
> >>>>>>> Tom
> >>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Yilun
> >>>>>>>>
> >>>>>>>>> Any other driver would do similar.
> >>>>>>>>>
> >>>>>>>>> Is this close to what you are thinking ?
> >>>>>>>>>
> >>>>>>>>> Tom
> >>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Yilun
> >>>>>>>>>>

Russ Weight Oct. 15, 2021, 5:34 p.m. UTC | #14

On 10/14/21 7:51 PM, Xu Yilun wrote:
> On Thu, Oct 14, 2021 at 09:32:53AM -0700, Russ Weight wrote:
>>
>> On 10/13/21 6:49 PM, Xu Yilun wrote:
>>> On Wed, Oct 13, 2021 at 11:09:08AM -0700, Russ Weight wrote:
>>>> On 10/12/21 6:06 PM, Xu Yilun wrote:
>>>>> On Tue, Oct 12, 2021 at 10:20:15AM -0700, Russ Weight wrote:
>>>>>> On 10/12/21 12:47 AM, Xu Yilun wrote:
>>>>>>> On Mon, Oct 11, 2021 at 06:00:16PM -0700, Russ Weight wrote:
>>>>>>>> On 10/11/21 5:35 AM, Tom Rix wrote:
>>>>>>>>> On 10/10/21 6:41 PM, Xu Yilun wrote:
>>>>>>>>>> On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
>>>>>>>>>>> On 10/9/21 1:08 AM, Xu Yilun wrote:
>>>>>>>>>>>> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
>>>>>>>>>>>>> The FPGA Image Load framework provides an API to upload image
>>>>>>>>>>>>> files to an FPGA device. Image files are self-describing. They could
>>>>>>>>>>>>> contain FPGA images, BMC images, Root Entry Hashes, or other device
>>>>>>>>>>>>> specific files. It is up to the lower-level device driver and the
>>>>>>>>>>>>> target device to authenticate and disposition the file data.
>>>>>>>>>>>> I've reconsider the FPGA persistent image update again, and think we
>>>>>>>>>>>> may include it in FPGA manager framework.
>>>>>>>>>>>>
>>>>>>>>>>>> Sorry I raised this topic again when it is already at patch v17, but now
>>>>>>>>>>>> I need to consider more seriously than before.
>>>>>>>>>>>>
>>>>>>>>>>>> We have consensus the FPGA persistent image update is just like a normal
>>>>>>>>>>>> firmware update which finally writes the nvmem like flash or eeprom,
>>>>>>>>>>>> while the current FPGA manager deals with the active FPGA region update
>>>>>>>>>>>> and re-activation. Could we just expand the FPGA manager and let it handle
>>>>>>>>>>>> the nvmem update as well? Many FPGA cards have nvmem and downloaders
>>>>>>>>>>>> supports updating both FPGA region and nvmem.
>>>>>>>> The fpga-image-load driver is actually just a data transfer. The class
>>>>>>> IMHO, The fpga-mgr dev is also a data transfer. The fpga-region dev is
>>>>>>> acting as the FPGA region admin responsible for gating, transfering and
>>>>>>> re-enumerating.
>>>>>>>
>>>>>>> So my opinion is to add a new data transfer type and keep a unified process.
>>>>>>>
>>>>>>>> driver has no knowledge about what the data is or where/if the data will
>>>>>>>> be stored.
>>>>>>> The fpga-image-load driver knows the data will be stored in nvmem,
>>>>>> FYI: This is not strictly correct. In a coming product there is a
>>>>>> case where the data will be stored in RAM. Richard Gong was also
>>>>>> looking to use this driver to validate an image without programming
>>>>>> or storing it. The fpga-image-load driver has no expectation that
>>>>>> the data will be stored in nvmem, or even that it will be stored
>>>>>> at all.
>>>>> OK, we can discuss that use case then. But fundamentally a driver should
>>>>> be clear what it is doing.
>>>> The lower-level driver is, of course, clear what it is doing. And the
>>>> FPGA Image Load Framework simply provides a consistent API and manages
>>>> a potentially long-running data transfer in the context of a kernel
>>>> worker thread.
>>>>
>>>> It sounds like you are saying that that is not "clear enough" in the
>>>> context of the FPGA Manager?
>>>>
>>>> The files that are used with Intel PAC devices are self-describing. The
>>>> user-space tools, the class driver and the lower-level driver just pass
>>>> the data through to the card BMC without any knowledge of the content,
>>>> purpose or final destination of the data.
>>>>
>>>> The card BMC will receive signed data, validate it, and process it as a
>>>> BMC image, an FPGA image, a Root Entry Hash, or a key cancellation. In
>>> I category all these actions as firmware update fully or partially on
>>> persistent storage. The FPGA Manager don't have to know the meaning of
>>> every byte on flash, but it should be aware the firmware is updated and
>>> the card may acts differently with a new firmware. This is the common
>>> working model for most of the FPGA cards so that we implement it in FPGA
>>> manager class. 
>>>
>>>> the n6000, it could also be part of a multi-step process for programming
>>>> SDM keys and the data may not be stored permanently.
>>> This is new to me, but seems to be different from firmware update, so lets
>>> think about it again.
>>>
>>>>> You may try to extend the FPGA framework to
>>>>> support nvmem storage, or image validation, but cannot say we feed the
>>>>> data to any engine undefined by the framework.
>>>> I'm not sure what you mean by "feed the data to any engine undefined by the
>>>> framework". I think the "engine" is the lower level driver/device that invokes
>>>> the fpga_mgr. The lower level driver, of course, is clear what it is doing.
>>>> The fpga_mgr cannot control what driver invokes it.
>>>>
>>>> Are saying that when invoking the fpga-mgr, that it _must_ also pass descriptive
>>>> data. Meaning that a self-describing file alone is not acceptable?
>>> The class driver should define a reasonable working model and APIs.
>>> Updating the FPGA backup storage is good to me. But receiving a mystery
>>> box and do whatever it requires is not.
>>>
>>> Self-describing file is OK, encryption is OK, but either the class
>>> driver itself, or with the help of the low level driver, should make
>>> sure it works within its scope.
>> In our secure update process, the card BMC firmware authenticates
>> the data using the root entry hashes and will either reject the
>> data or perform some function based on the contents. Neither the
>> user-space, the class driver, nor the lower level driver know
>> what the contents are. It _is_ a "mystery box" to them. How do we
>> verify scope in this model?
> I think we need to find out how. One case is, the HW is designed to have
> one single function, such as firmware update, then any image input
> through firmware update API is within expectation, and the driver
> should only serve the firmware update API. I think this is how the
> N3000 is working now. If the HW is for another function, register itself
> to serve another API, or another class driver.
>
> Another case is, the HW could do multiple types of tasks depending on
> the content of the image, such as firmware update, image verification,
> or assumably power off the card ... There should be some mechanism for
> the driver to only accept the right image according to what API is called.
> Or the user may input an image named update_the_card.img through
> firmware update API and finally get the card off. Having some headers
> readable by host for the operation type? Or, some HW interface for host
> to apply the operation type as well as the image, let the HW verify?
> Let's think about it.
I'm not sure if I am following your thinking here. The HW, of course,
verifies authentication of the image and acts according to the image
type. I don't think it is reasonable to require the class driver to
interpret the data to determine what it is. That implies that the
class driver would have to know the header format and possible values.
It also means that changes to the header format would require patches
to the class driver.

The FPGA card is trusted by virtue of the fact that the customer
purchased it and physically placed it in the machine. If the FPGA card
(or the lower level driver) validates the image, then why does the
Class driver need to be concerned about the file type? I think the
purpose of the class driver is primarily to provide a common API and
perform common functions so that they don't have to be replicated
among similar low-level drivers. It is up to the low-level driver
or the device itself to ensure that the data received is acceptable.

If the card receives data through the fpga-mgr upload facility that
isn't strictly a firmware update, and if the lower-level driver or
the card handles it and returns appropriate status - is that really
a problem?
>> As you have noted, most current cases result in a change to the
>> card, and I suspect that it will remain that way. But that can't be
>> guaranteed, and I'm not convinced that a host driver needs to be
>> concerned about it.
> A host driver should know what is done, in some abstraction level.
> I think updating the persistent storage is an acceptable abstraction
> in FPGA domain, so I'd like to extend it in FPGA manager. But doing
> anything according to the image is not.
By host driver, do you mean the class driver? Or the lower-level device
driver?

It seems to me that you are saying that self-describing images are not
acceptable? Or at least they are not acceptable payload for an FPGA
Manager firmware-update API?

The FPGA Image Load Framework was designed with the concept of
transferring data to a device without imposing a purpose on the data.
The expectation is that the lower-level driver or the device will
validate the data. Is there something fundamentally wrong with that
approach? And if not, why couldn't we incorporate a similar image_load
API into the FPGA Manager?

- Russ

>
> Thanks,
> Yilun
>
>> - Russ
>>
>>> Thanks,
>>> Yilun
>>>
>>>> Thanks,
>>>> - Russ
>>>>
>>>>> Thanks,
>>>>> Yilun
>>>>>
>>>>>>> while
>>>>>>> the fpga-mgr knows the data will be stored in FPGA cells. They may need
>>>>>>> to know the exact physical position to store, may not, depends on what the
>>>>>>> HW engines are.
>>>>>>>
>>>>>>>> This functionality could, of course, be merged into the fpga-mgr. I did
>>>>>>>> a proof of concept of this a while back and we discussed the pros and cons.
>>>>>>>> See this email for a recap:
>>>>>>>>
>>>>>>>> https://marc.info/?l=linux-fpga&m=161998085507374&w=2
>>>>>>>>
>>>>>>>> Things have changed some with the evolution of the driver. The IOCTL
>>>>>>>> approach probably fits better than the sysfs implementation. At the time
>>>>>>>> it seemed that a merge would add unnecessary complexity without adding value.
>>>>>>> I think at least developers don't have to go through 2 sets of software
>>>>>>> stacks which are of the same concept. And adding some new features like
>>>>>>> optionally threading or canceling data transfer are also good to FPGA
>>>>>>> region update. And the nvmem update could also be benifit from exsiting
>>>>>>> implementations like scatter-gather buffers, in-kernel firmware loading.
>>>>>>>
>>>>>>> I try to explain myself according to each of your concern from that mail
>>>>>>> thread:
>>>>>>>
>>>>>>> Purpose of the 2 updates
>>>>>>> ========================
>>>>>>>
>>>>>>>   As I said before, I think they are both data transfer devices. FPGA
>>>>>>> region update gets extra support from fpga-region & fpga-bridge, and
>>>>>>> FPGA nvmem update could be a standalone fpga-mgr.
>>>>>>>
>>>>>>> Extra APIs that are unique to nvmem update
>>>>>>> ==========================================
>>>>>>>
>>>>>>>   cdev APIs for nvmem update:
>>>>>>>     Yes we need to expand the functionality so we need them.
>>>>>>>
>>>>>>>   available_images, image_load APIs for loading nvmem content to FPGA
>>>>>>>   region:
>>>>>>>     These are features in later patchsets which are not submitted, but we
>>>>>>>     could talk about it in advance. I think this is actually a FPGA region
>>>>>>>     update from nvmem, it also requires gating, data loading (no SW transfer)
>>>>>>>     and re-enumeration, or a single command to image_load HW may result system
>>>>>>>     disordered. The FPGA framework now only supports update from in-kernel
>>>>>>>     user data, maybe we add support for update from nvmem later.
>>>>>>>
>>>>>>>   fpga-mgr state extend:
>>>>>>>     I think it could be extended, The current states are not perfect,
>>>>>>>     adding something like IDLE or READY is just fine.
>>>>>>>
>>>>>>>   fpga-mgr status extend:
>>>>>>>     Add general error definitions as needed. If there is some status
>>>>>>>     that is quite vendor specific, expose it in low-level driver.
>>>>>>>
>>>>>>>   remaining-size:
>>>>>>>     Nice to have for all.
>>>>>>>
>>>>>>> Threading the update
>>>>>>> ====================
>>>>>>>
>>>>>>>   Also a good option for FPGA region update, maybe we also have a slow FPGA
>>>>>>>   reprogrammer?
>>>>>>>
>>>>>>> Cancelling the update
>>>>>>> ====================
>>>>>>>
>>>>>>>   Also a good option for FPGA region update if HW engine supports.
>>>>>> These are all good points.
>>>>>>
>>>>>> Thanks,
>>>>>> - Russ
>>>>>>> Thanks,
>>>>>>> Yilun
>>>>>>>
>>>>>>>>>>>> According to the patchset, the basic workflow of the 2 update types are
>>>>>>>>>>>> quite similar, get the data, prepare for the HW, write and complete.
>>>>>>>>>>>> They are already implemented in FPGA manager. We've discussed some
>>>>>>>>>>>> differences like threading or canceling the update, which are
>>>>>>>>>>>> not provided by FPGA manager but they may also nice to have for FPGA
>>>>>>>>>>>> region update. An FPGA region update may also last for a long time??
>>>>>>>>>>>> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
>>>>>>>>>>>>
>>>>>>>>>>>> My quick mind is that we add some flags in struct fpga_mgr & struct
>>>>>>>>>>>> fpga_image_info to indicate the HW capability (support FPGA region
>>>>>>>>>>>> update or nvmem update or both) of the download engine and the provided
>>>>>>>>>>>> image type. Then the low-level driver knows how to download if it
>>>>>>>>>>>> supports both image types.An char device could be added for each fpga manager dev, providing the
>>>>>>>>>>>> user APIs for nvmem update. We may not use the char dev for FPGA region
>>>>>>>>>>>> update cause it changes the system HW devices and needs device hotplug
>>>>>>>>>>>> in FPGA region. We'd better leave it to FPGA region class, this is
>>>>>>>>>>>> another topic.
>>>>>>>> I'll give this some more thought and see if I can come up with some RFC
>>>>>>>> patches.
>>>>>>>>
>>>>>>>> - Russ
>>>>>>>>>>>> More discussion is appreciated.
>>>>>>>>>>> I also think fpga_mgr could be extended.
>>>>>>>>>>>
>>>>>>>>>>> In this patchset,
>>>>>>>>>>>
>>>>>>>>>>> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
>>>>>>>>>>>
>>>>>>>>>>> A second, similar set of write ops was added to fpga_manger_ops,
>>>>>>>>>>>
>>>>>>>>>>> new bit/flag was added to fpga_image_info
>>>>>>>>>>>
>>>>>>>>>>> The intent was for dfl to add their specific ops to cover what is done in
>>>>>>>>>>> this patchset.
>>>>>>>>>> I think we don't have to add 2 ops for reconfig & reimage in framework,
>>>>>>>>>> the 2 processes are almost the same.
>>>>>>>>>>
>>>>>>>>>> Just add the _REIMAGE (or something else, NVMEM?) flag for
>>>>>>>>>> fpga_image_info, and low level drivers handle it as they do for other
>>>>>>>>>> flags.
>>>>>>>>>>
>>>>>>>>>> How do you think?
>>>>>>>>> A single set is fine.
>>>>>>>>>
>>>>>>>>> A difficult part of is the length of  time to do the write. The existing write should be improved to use a worker thread.
>>>>>>>>>
>>>>>>>>> Tom
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Yilun
>>>>>>>>>>
>>>>>>>>>>> Any other driver would do similar.
>>>>>>>>>>>
>>>>>>>>>>> Is this close to what you are thinking ?
>>>>>>>>>>>
>>>>>>>>>>> Tom
>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Yilun
>>>>>>>>>>>>

Xu Yilun Oct. 18, 2021, 8:13 a.m. UTC | #15

On Fri, Oct 15, 2021 at 10:34:23AM -0700, Russ Weight wrote:
> 
> 
> On 10/14/21 7:51 PM, Xu Yilun wrote:
> > On Thu, Oct 14, 2021 at 09:32:53AM -0700, Russ Weight wrote:
> >>
> >> On 10/13/21 6:49 PM, Xu Yilun wrote:
> >>> On Wed, Oct 13, 2021 at 11:09:08AM -0700, Russ Weight wrote:
> >>>> On 10/12/21 6:06 PM, Xu Yilun wrote:
> >>>>> On Tue, Oct 12, 2021 at 10:20:15AM -0700, Russ Weight wrote:
> >>>>>> On 10/12/21 12:47 AM, Xu Yilun wrote:
> >>>>>>> On Mon, Oct 11, 2021 at 06:00:16PM -0700, Russ Weight wrote:
> >>>>>>>> On 10/11/21 5:35 AM, Tom Rix wrote:
> >>>>>>>>> On 10/10/21 6:41 PM, Xu Yilun wrote:
> >>>>>>>>>> On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
> >>>>>>>>>>> On 10/9/21 1:08 AM, Xu Yilun wrote:
> >>>>>>>>>>>> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
> >>>>>>>>>>>>> The FPGA Image Load framework provides an API to upload image
> >>>>>>>>>>>>> files to an FPGA device. Image files are self-describing. They could
> >>>>>>>>>>>>> contain FPGA images, BMC images, Root Entry Hashes, or other device
> >>>>>>>>>>>>> specific files. It is up to the lower-level device driver and the
> >>>>>>>>>>>>> target device to authenticate and disposition the file data.
> >>>>>>>>>>>> I've reconsider the FPGA persistent image update again, and think we
> >>>>>>>>>>>> may include it in FPGA manager framework.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Sorry I raised this topic again when it is already at patch v17, but now
> >>>>>>>>>>>> I need to consider more seriously than before.
> >>>>>>>>>>>>
> >>>>>>>>>>>> We have consensus the FPGA persistent image update is just like a normal
> >>>>>>>>>>>> firmware update which finally writes the nvmem like flash or eeprom,
> >>>>>>>>>>>> while the current FPGA manager deals with the active FPGA region update
> >>>>>>>>>>>> and re-activation. Could we just expand the FPGA manager and let it handle
> >>>>>>>>>>>> the nvmem update as well? Many FPGA cards have nvmem and downloaders
> >>>>>>>>>>>> supports updating both FPGA region and nvmem.
> >>>>>>>> The fpga-image-load driver is actually just a data transfer. The class
> >>>>>>> IMHO, The fpga-mgr dev is also a data transfer. The fpga-region dev is
> >>>>>>> acting as the FPGA region admin responsible for gating, transfering and
> >>>>>>> re-enumerating.
> >>>>>>>
> >>>>>>> So my opinion is to add a new data transfer type and keep a unified process.
> >>>>>>>
> >>>>>>>> driver has no knowledge about what the data is or where/if the data will
> >>>>>>>> be stored.
> >>>>>>> The fpga-image-load driver knows the data will be stored in nvmem,
> >>>>>> FYI: This is not strictly correct. In a coming product there is a
> >>>>>> case where the data will be stored in RAM. Richard Gong was also
> >>>>>> looking to use this driver to validate an image without programming
> >>>>>> or storing it. The fpga-image-load driver has no expectation that
> >>>>>> the data will be stored in nvmem, or even that it will be stored
> >>>>>> at all.
> >>>>> OK, we can discuss that use case then. But fundamentally a driver should
> >>>>> be clear what it is doing.
> >>>> The lower-level driver is, of course, clear what it is doing. And the
> >>>> FPGA Image Load Framework simply provides a consistent API and manages
> >>>> a potentially long-running data transfer in the context of a kernel
> >>>> worker thread.
> >>>>
> >>>> It sounds like you are saying that that is not "clear enough" in the
> >>>> context of the FPGA Manager?
> >>>>
> >>>> The files that are used with Intel PAC devices are self-describing. The
> >>>> user-space tools, the class driver and the lower-level driver just pass
> >>>> the data through to the card BMC without any knowledge of the content,
> >>>> purpose or final destination of the data.
> >>>>
> >>>> The card BMC will receive signed data, validate it, and process it as a
> >>>> BMC image, an FPGA image, a Root Entry Hash, or a key cancellation. In
> >>> I category all these actions as firmware update fully or partially on
> >>> persistent storage. The FPGA Manager don't have to know the meaning of
> >>> every byte on flash, but it should be aware the firmware is updated and
> >>> the card may acts differently with a new firmware. This is the common
> >>> working model for most of the FPGA cards so that we implement it in FPGA
> >>> manager class. 
> >>>
> >>>> the n6000, it could also be part of a multi-step process for programming
> >>>> SDM keys and the data may not be stored permanently.
> >>> This is new to me, but seems to be different from firmware update, so lets
> >>> think about it again.
> >>>
> >>>>> You may try to extend the FPGA framework to
> >>>>> support nvmem storage, or image validation, but cannot say we feed the
> >>>>> data to any engine undefined by the framework.
> >>>> I'm not sure what you mean by "feed the data to any engine undefined by the
> >>>> framework". I think the "engine" is the lower level driver/device that invokes
> >>>> the fpga_mgr. The lower level driver, of course, is clear what it is doing.
> >>>> The fpga_mgr cannot control what driver invokes it.
> >>>>
> >>>> Are saying that when invoking the fpga-mgr, that it _must_ also pass descriptive
> >>>> data. Meaning that a self-describing file alone is not acceptable?
> >>> The class driver should define a reasonable working model and APIs.
> >>> Updating the FPGA backup storage is good to me. But receiving a mystery
> >>> box and do whatever it requires is not.
> >>>
> >>> Self-describing file is OK, encryption is OK, but either the class
> >>> driver itself, or with the help of the low level driver, should make
> >>> sure it works within its scope.
> >> In our secure update process, the card BMC firmware authenticates
> >> the data using the root entry hashes and will either reject the
> >> data or perform some function based on the contents. Neither the
> >> user-space, the class driver, nor the lower level driver know
> >> what the contents are. It _is_ a "mystery box" to them. How do we
> >> verify scope in this model?
> > I think we need to find out how. One case is, the HW is designed to have
> > one single function, such as firmware update, then any image input
> > through firmware update API is within expectation, and the driver
> > should only serve the firmware update API. I think this is how the
> > N3000 is working now. If the HW is for another function, register itself
> > to serve another API, or another class driver.
> >
> > Another case is, the HW could do multiple types of tasks depending on
> > the content of the image, such as firmware update, image verification,
> > or assumably power off the card ... There should be some mechanism for
> > the driver to only accept the right image according to what API is called.
> > Or the user may input an image named update_the_card.img through
> > firmware update API and finally get the card off. Having some headers
> > readable by host for the operation type? Or, some HW interface for host
> > to apply the operation type as well as the image, let the HW verify?
> > Let's think about it.
> I'm not sure if I am following your thinking here. The HW, of course,
> verifies authentication of the image and acts according to the image
> type. I don't think it is reasonable to require the class driver to
> interpret the data to determine what it is. That implies that the
> class driver would have to know the header format and possible values.
> It also means that changes to the header format would require patches
> to the class driver.
> 
> The FPGA card is trusted by virtue of the fact that the customer
> purchased it and physically placed it in the machine. If the FPGA card
> (or the lower level driver) validates the image, then why does the
> Class driver need to be concerned about the file type? I think the
> purpose of the class driver is primarily to provide a common API and
> perform common functions so that they don't have to be replicated
> among similar low-level drivers. It is up to the low-level driver
> or the device itself to ensure that the data received is acceptable.
> 
> If the card receives data through the fpga-mgr upload facility that
> isn't strictly a firmware update, and if the lower-level driver or
> the card handles it and returns appropriate status - is that really
> a problem?
> >> As you have noted, most current cases result in a change to the
> >> card, and I suspect that it will remain that way. But that can't be
> >> guaranteed, and I'm not convinced that a host driver needs to be
> >> concerned about it.
> > A host driver should know what is done, in some abstraction level.
> > I think updating the persistent storage is an acceptable abstraction
> > in FPGA domain, so I'd like to extend it in FPGA manager. But doing
> > anything according to the image is not.
> By host driver, do you mean the class driver? Or the lower-level device
> driver?

The class driver.

> 
> It seems to me that you are saying that self-describing images are not
> acceptable? Or at least they are not acceptable payload for an FPGA
> Manager firmware-update API?

For N3000, we are working on the persistent storage update APIs, which is
a much simpler working model, no runtime device change, and needs no
device removal & re-enumeration.

But if you need to extend something more that would potentially changes
the behavior of the running devices on FPGA, device removal &
re-enumeration are needed so that the system knows what devices are
changed.

> 
> The FPGA Image Load Framework was designed with the concept of
> transferring data to a device without imposing a purpose on the data.
> The expectation is that the lower-level driver or the device will
> validate the data. Is there something fundamentally wrong with that

I think there is something wrong here. As I said before, persistent
storage updating has different software process from some runtime
updating, so the class driver should be aware of what the HW engine
is doing.

Thanks,
Yilun

> approach? And if not, why couldn't we incorporate a similar image_load
> API into the FPGA Manager?
> 
> - Russ
> 
> >
> > Thanks,
> > Yilun
> >
> >> - Russ
> >>
> >>> Thanks,
> >>> Yilun
> >>>
> >>>> Thanks,
> >>>> - Russ
> >>>>
> >>>>> Thanks,
> >>>>> Yilun
> >>>>>
> >>>>>>> while
> >>>>>>> the fpga-mgr knows the data will be stored in FPGA cells. They may need
> >>>>>>> to know the exact physical position to store, may not, depends on what the
> >>>>>>> HW engines are.
> >>>>>>>
> >>>>>>>> This functionality could, of course, be merged into the fpga-mgr. I did
> >>>>>>>> a proof of concept of this a while back and we discussed the pros and cons.
> >>>>>>>> See this email for a recap:
> >>>>>>>>
> >>>>>>>> https://marc.info/?l=linux-fpga&m=161998085507374&w=2
> >>>>>>>>
> >>>>>>>> Things have changed some with the evolution of the driver. The IOCTL
> >>>>>>>> approach probably fits better than the sysfs implementation. At the time
> >>>>>>>> it seemed that a merge would add unnecessary complexity without adding value.
> >>>>>>> I think at least developers don't have to go through 2 sets of software
> >>>>>>> stacks which are of the same concept. And adding some new features like
> >>>>>>> optionally threading or canceling data transfer are also good to FPGA
> >>>>>>> region update. And the nvmem update could also be benifit from exsiting
> >>>>>>> implementations like scatter-gather buffers, in-kernel firmware loading.
> >>>>>>>
> >>>>>>> I try to explain myself according to each of your concern from that mail
> >>>>>>> thread:
> >>>>>>>
> >>>>>>> Purpose of the 2 updates
> >>>>>>> ========================
> >>>>>>>
> >>>>>>>   As I said before, I think they are both data transfer devices. FPGA
> >>>>>>> region update gets extra support from fpga-region & fpga-bridge, and
> >>>>>>> FPGA nvmem update could be a standalone fpga-mgr.
> >>>>>>>
> >>>>>>> Extra APIs that are unique to nvmem update
> >>>>>>> ==========================================
> >>>>>>>
> >>>>>>>   cdev APIs for nvmem update:
> >>>>>>>     Yes we need to expand the functionality so we need them.
> >>>>>>>
> >>>>>>>   available_images, image_load APIs for loading nvmem content to FPGA
> >>>>>>>   region:
> >>>>>>>     These are features in later patchsets which are not submitted, but we
> >>>>>>>     could talk about it in advance. I think this is actually a FPGA region
> >>>>>>>     update from nvmem, it also requires gating, data loading (no SW transfer)
> >>>>>>>     and re-enumeration, or a single command to image_load HW may result system
> >>>>>>>     disordered. The FPGA framework now only supports update from in-kernel
> >>>>>>>     user data, maybe we add support for update from nvmem later.
> >>>>>>>
> >>>>>>>   fpga-mgr state extend:
> >>>>>>>     I think it could be extended, The current states are not perfect,
> >>>>>>>     adding something like IDLE or READY is just fine.
> >>>>>>>
> >>>>>>>   fpga-mgr status extend:
> >>>>>>>     Add general error definitions as needed. If there is some status
> >>>>>>>     that is quite vendor specific, expose it in low-level driver.
> >>>>>>>
> >>>>>>>   remaining-size:
> >>>>>>>     Nice to have for all.
> >>>>>>>
> >>>>>>> Threading the update
> >>>>>>> ====================
> >>>>>>>
> >>>>>>>   Also a good option for FPGA region update, maybe we also have a slow FPGA
> >>>>>>>   reprogrammer?
> >>>>>>>
> >>>>>>> Cancelling the update
> >>>>>>> ====================
> >>>>>>>
> >>>>>>>   Also a good option for FPGA region update if HW engine supports.
> >>>>>> These are all good points.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> - Russ
> >>>>>>> Thanks,
> >>>>>>> Yilun
> >>>>>>>
> >>>>>>>>>>>> According to the patchset, the basic workflow of the 2 update types are
> >>>>>>>>>>>> quite similar, get the data, prepare for the HW, write and complete.
> >>>>>>>>>>>> They are already implemented in FPGA manager. We've discussed some
> >>>>>>>>>>>> differences like threading or canceling the update, which are
> >>>>>>>>>>>> not provided by FPGA manager but they may also nice to have for FPGA
> >>>>>>>>>>>> region update. An FPGA region update may also last for a long time??
> >>>>>>>>>>>> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
> >>>>>>>>>>>>
> >>>>>>>>>>>> My quick mind is that we add some flags in struct fpga_mgr & struct
> >>>>>>>>>>>> fpga_image_info to indicate the HW capability (support FPGA region
> >>>>>>>>>>>> update or nvmem update or both) of the download engine and the provided
> >>>>>>>>>>>> image type. Then the low-level driver knows how to download if it
> >>>>>>>>>>>> supports both image types.An char device could be added for each fpga manager dev, providing the
> >>>>>>>>>>>> user APIs for nvmem update. We may not use the char dev for FPGA region
> >>>>>>>>>>>> update cause it changes the system HW devices and needs device hotplug
> >>>>>>>>>>>> in FPGA region. We'd better leave it to FPGA region class, this is
> >>>>>>>>>>>> another topic.
> >>>>>>>> I'll give this some more thought and see if I can come up with some RFC
> >>>>>>>> patches.
> >>>>>>>>
> >>>>>>>> - Russ
> >>>>>>>>>>>> More discussion is appreciated.
> >>>>>>>>>>> I also think fpga_mgr could be extended.
> >>>>>>>>>>>
> >>>>>>>>>>> In this patchset,
> >>>>>>>>>>>
> >>>>>>>>>>> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
> >>>>>>>>>>>
> >>>>>>>>>>> A second, similar set of write ops was added to fpga_manger_ops,
> >>>>>>>>>>>
> >>>>>>>>>>> new bit/flag was added to fpga_image_info
> >>>>>>>>>>>
> >>>>>>>>>>> The intent was for dfl to add their specific ops to cover what is done in
> >>>>>>>>>>> this patchset.
> >>>>>>>>>> I think we don't have to add 2 ops for reconfig & reimage in framework,
> >>>>>>>>>> the 2 processes are almost the same.
> >>>>>>>>>>
> >>>>>>>>>> Just add the _REIMAGE (or something else, NVMEM?) flag for
> >>>>>>>>>> fpga_image_info, and low level drivers handle it as they do for other
> >>>>>>>>>> flags.
> >>>>>>>>>>
> >>>>>>>>>> How do you think?
> >>>>>>>>> A single set is fine.
> >>>>>>>>>
> >>>>>>>>> A difficult part of is the length of  time to do the write. The existing write should be improved to use a worker thread.
> >>>>>>>>>
> >>>>>>>>> Tom
> >>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Yilun
> >>>>>>>>>>
> >>>>>>>>>>> Any other driver would do similar.
> >>>>>>>>>>>
> >>>>>>>>>>> Is this close to what you are thinking ?
> >>>>>>>>>>>
> >>>>>>>>>>> Tom
> >>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> Yilun
> >>>>>>>>>>>>

Russ Weight Oct. 18, 2021, 4:24 p.m. UTC | #16

On 10/18/21 1:13 AM, Xu Yilun wrote:
> On Fri, Oct 15, 2021 at 10:34:23AM -0700, Russ Weight wrote:
>>
>> On 10/14/21 7:51 PM, Xu Yilun wrote:
>>> On Thu, Oct 14, 2021 at 09:32:53AM -0700, Russ Weight wrote:
>>>> On 10/13/21 6:49 PM, Xu Yilun wrote:
>>>>> On Wed, Oct 13, 2021 at 11:09:08AM -0700, Russ Weight wrote:
>>>>>> On 10/12/21 6:06 PM, Xu Yilun wrote:
>>>>>>> On Tue, Oct 12, 2021 at 10:20:15AM -0700, Russ Weight wrote:
>>>>>>>> On 10/12/21 12:47 AM, Xu Yilun wrote:
>>>>>>>>> On Mon, Oct 11, 2021 at 06:00:16PM -0700, Russ Weight wrote:
>>>>>>>>>> On 10/11/21 5:35 AM, Tom Rix wrote:
>>>>>>>>>>> On 10/10/21 6:41 PM, Xu Yilun wrote:
>>>>>>>>>>>> On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
>>>>>>>>>>>>> On 10/9/21 1:08 AM, Xu Yilun wrote:
>>>>>>>>>>>>>> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
>>>>>>>>>>>>>>> The FPGA Image Load framework provides an API to upload image
>>>>>>>>>>>>>>> files to an FPGA device. Image files are self-describing. They could
>>>>>>>>>>>>>>> contain FPGA images, BMC images, Root Entry Hashes, or other device
>>>>>>>>>>>>>>> specific files. It is up to the lower-level device driver and the
>>>>>>>>>>>>>>> target device to authenticate and disposition the file data.
>>>>>>>>>>>>>> I've reconsider the FPGA persistent image update again, and think we
>>>>>>>>>>>>>> may include it in FPGA manager framework.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sorry I raised this topic again when it is already at patch v17, but now
>>>>>>>>>>>>>> I need to consider more seriously than before.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We have consensus the FPGA persistent image update is just like a normal
>>>>>>>>>>>>>> firmware update which finally writes the nvmem like flash or eeprom,
>>>>>>>>>>>>>> while the current FPGA manager deals with the active FPGA region update
>>>>>>>>>>>>>> and re-activation. Could we just expand the FPGA manager and let it handle
>>>>>>>>>>>>>> the nvmem update as well? Many FPGA cards have nvmem and downloaders
>>>>>>>>>>>>>> supports updating both FPGA region and nvmem.
>>>>>>>>>> The fpga-image-load driver is actually just a data transfer. The class
>>>>>>>>> IMHO, The fpga-mgr dev is also a data transfer. The fpga-region dev is
>>>>>>>>> acting as the FPGA region admin responsible for gating, transfering and
>>>>>>>>> re-enumerating.
>>>>>>>>>
>>>>>>>>> So my opinion is to add a new data transfer type and keep a unified process.
>>>>>>>>>
>>>>>>>>>> driver has no knowledge about what the data is or where/if the data will
>>>>>>>>>> be stored.
>>>>>>>>> The fpga-image-load driver knows the data will be stored in nvmem,
>>>>>>>> FYI: This is not strictly correct. In a coming product there is a
>>>>>>>> case where the data will be stored in RAM. Richard Gong was also
>>>>>>>> looking to use this driver to validate an image without programming
>>>>>>>> or storing it. The fpga-image-load driver has no expectation that
>>>>>>>> the data will be stored in nvmem, or even that it will be stored
>>>>>>>> at all.
>>>>>>> OK, we can discuss that use case then. But fundamentally a driver should
>>>>>>> be clear what it is doing.
>>>>>> The lower-level driver is, of course, clear what it is doing. And the
>>>>>> FPGA Image Load Framework simply provides a consistent API and manages
>>>>>> a potentially long-running data transfer in the context of a kernel
>>>>>> worker thread.
>>>>>>
>>>>>> It sounds like you are saying that that is not "clear enough" in the
>>>>>> context of the FPGA Manager?
>>>>>>
>>>>>> The files that are used with Intel PAC devices are self-describing. The
>>>>>> user-space tools, the class driver and the lower-level driver just pass
>>>>>> the data through to the card BMC without any knowledge of the content,
>>>>>> purpose or final destination of the data.
>>>>>>
>>>>>> The card BMC will receive signed data, validate it, and process it as a
>>>>>> BMC image, an FPGA image, a Root Entry Hash, or a key cancellation. In
>>>>> I category all these actions as firmware update fully or partially on
>>>>> persistent storage. The FPGA Manager don't have to know the meaning of
>>>>> every byte on flash, but it should be aware the firmware is updated and
>>>>> the card may acts differently with a new firmware. This is the common
>>>>> working model for most of the FPGA cards so that we implement it in FPGA
>>>>> manager class. 
>>>>>
>>>>>> the n6000, it could also be part of a multi-step process for programming
>>>>>> SDM keys and the data may not be stored permanently.
>>>>> This is new to me, but seems to be different from firmware update, so lets
>>>>> think about it again.
>>>>>
>>>>>>> You may try to extend the FPGA framework to
>>>>>>> support nvmem storage, or image validation, but cannot say we feed the
>>>>>>> data to any engine undefined by the framework.
>>>>>> I'm not sure what you mean by "feed the data to any engine undefined by the
>>>>>> framework". I think the "engine" is the lower level driver/device that invokes
>>>>>> the fpga_mgr. The lower level driver, of course, is clear what it is doing.
>>>>>> The fpga_mgr cannot control what driver invokes it.
>>>>>>
>>>>>> Are saying that when invoking the fpga-mgr, that it _must_ also pass descriptive
>>>>>> data. Meaning that a self-describing file alone is not acceptable?
>>>>> The class driver should define a reasonable working model and APIs.
>>>>> Updating the FPGA backup storage is good to me. But receiving a mystery
>>>>> box and do whatever it requires is not.
>>>>>
>>>>> Self-describing file is OK, encryption is OK, but either the class
>>>>> driver itself, or with the help of the low level driver, should make
>>>>> sure it works within its scope.
>>>> In our secure update process, the card BMC firmware authenticates
>>>> the data using the root entry hashes and will either reject the
>>>> data or perform some function based on the contents. Neither the
>>>> user-space, the class driver, nor the lower level driver know
>>>> what the contents are. It _is_ a "mystery box" to them. How do we
>>>> verify scope in this model?
>>> I think we need to find out how. One case is, the HW is designed to have
>>> one single function, such as firmware update, then any image input
>>> through firmware update API is within expectation, and the driver
>>> should only serve the firmware update API. I think this is how the
>>> N3000 is working now. If the HW is for another function, register itself
>>> to serve another API, or another class driver.
>>>
>>> Another case is, the HW could do multiple types of tasks depending on
>>> the content of the image, such as firmware update, image verification,
>>> or assumably power off the card ... There should be some mechanism for
>>> the driver to only accept the right image according to what API is called.
>>> Or the user may input an image named update_the_card.img through
>>> firmware update API and finally get the card off. Having some headers
>>> readable by host for the operation type? Or, some HW interface for host
>>> to apply the operation type as well as the image, let the HW verify?
>>> Let's think about it.
>> I'm not sure if I am following your thinking here. The HW, of course,
>> verifies authentication of the image and acts according to the image
>> type. I don't think it is reasonable to require the class driver to
>> interpret the data to determine what it is. That implies that the
>> class driver would have to know the header format and possible values.
>> It also means that changes to the header format would require patches
>> to the class driver.
>>
>> The FPGA card is trusted by virtue of the fact that the customer
>> purchased it and physically placed it in the machine. If the FPGA card
>> (or the lower level driver) validates the image, then why does the
>> Class driver need to be concerned about the file type? I think the
>> purpose of the class driver is primarily to provide a common API and
>> perform common functions so that they don't have to be replicated
>> among similar low-level drivers. It is up to the low-level driver
>> or the device itself to ensure that the data received is acceptable.
>>
>> If the card receives data through the fpga-mgr upload facility that
>> isn't strictly a firmware update, and if the lower-level driver or
>> the card handles it and returns appropriate status - is that really
>> a problem?
>>>> As you have noted, most current cases result in a change to the
>>>> card, and I suspect that it will remain that way. But that can't be
>>>> guaranteed, and I'm not convinced that a host driver needs to be
>>>> concerned about it.
>>> A host driver should know what is done, in some abstraction level.
>>> I think updating the persistent storage is an acceptable abstraction
>>> in FPGA domain, so I'd like to extend it in FPGA manager. But doing
>>> anything according to the image is not.
>> By host driver, do you mean the class driver? Or the lower-level device
>> driver?
> The class driver.
>
>> It seems to me that you are saying that self-describing images are not
>> acceptable? Or at least they are not acceptable payload for an FPGA
>> Manager firmware-update API?
> For N3000, we are working on the persistent storage update APIs, which is
> a much simpler working model, no runtime device change, and needs no
> device removal & re-enumeration.
>
> But if you need to extend something more that would potentially changes
> the behavior of the running devices on FPGA, device removal &
> re-enumeration are needed so that the system knows what devices are
> changed.
>
>> The FPGA Image Load Framework was designed with the concept of
>> transferring data to a device without imposing a purpose on the data.
>> The expectation is that the lower-level driver or the device will
>> validate the data. Is there something fundamentally wrong with that
> I think there is something wrong here. As I said before, persistent
> storage updating has different software process from some runtime
> updating, so the class driver should be aware of what the HW engine
> is doing.
So far, there are no self-describing images that cause a
change in run-time behavior, and I don't think that will
happen for the very reason that the class-driver would
need to know about it.

When I asserted that not all self-describing images are
changing firmware, I did not mean to imply that they change
run-time behavior; they do not. They are part of a multi-
step update of firmware. However, looking at each part of
the sequence independently, there is at least one instance
where a certificate is stored in RAM for temporary use.
When the driver exits from this call, there is no firmware
change. There is also no change in run-time behavior.

I'm thinking we could have different IOCTLs:

(1) firmware  update (address, size, purpose provided
    with the image)
(2) image upload (self-describing files)
(3) image validation

These would all use most of the same code, but the FPGA
Manager flags and structure fields would be set differently.

- Russ
>
> Thanks,
> Yilun
>
>> approach? And if not, why couldn't we incorporate a similar image_load
>> API into the FPGA Manager?
>>
>> - Russ
>>
>>> Thanks,
>>> Yilun
>>>
>>>> - Russ
>>>>
>>>>> Thanks,
>>>>> Yilun
>>>>>
>>>>>> Thanks,
>>>>>> - Russ
>>>>>>
>>>>>>> Thanks,
>>>>>>> Yilun
>>>>>>>
>>>>>>>>> while
>>>>>>>>> the fpga-mgr knows the data will be stored in FPGA cells. They may need
>>>>>>>>> to know the exact physical position to store, may not, depends on what the
>>>>>>>>> HW engines are.
>>>>>>>>>
>>>>>>>>>> This functionality could, of course, be merged into the fpga-mgr. I did
>>>>>>>>>> a proof of concept of this a while back and we discussed the pros and cons.
>>>>>>>>>> See this email for a recap:
>>>>>>>>>>
>>>>>>>>>> https://marc.info/?l=linux-fpga&m=161998085507374&w=2
>>>>>>>>>>
>>>>>>>>>> Things have changed some with the evolution of the driver. The IOCTL
>>>>>>>>>> approach probably fits better than the sysfs implementation. At the time
>>>>>>>>>> it seemed that a merge would add unnecessary complexity without adding value.
>>>>>>>>> I think at least developers don't have to go through 2 sets of software
>>>>>>>>> stacks which are of the same concept. And adding some new features like
>>>>>>>>> optionally threading or canceling data transfer are also good to FPGA
>>>>>>>>> region update. And the nvmem update could also be benifit from exsiting
>>>>>>>>> implementations like scatter-gather buffers, in-kernel firmware loading.
>>>>>>>>>
>>>>>>>>> I try to explain myself according to each of your concern from that mail
>>>>>>>>> thread:
>>>>>>>>>
>>>>>>>>> Purpose of the 2 updates
>>>>>>>>> ========================
>>>>>>>>>
>>>>>>>>>   As I said before, I think they are both data transfer devices. FPGA
>>>>>>>>> region update gets extra support from fpga-region & fpga-bridge, and
>>>>>>>>> FPGA nvmem update could be a standalone fpga-mgr.
>>>>>>>>>
>>>>>>>>> Extra APIs that are unique to nvmem update
>>>>>>>>> ==========================================
>>>>>>>>>
>>>>>>>>>   cdev APIs for nvmem update:
>>>>>>>>>     Yes we need to expand the functionality so we need them.
>>>>>>>>>
>>>>>>>>>   available_images, image_load APIs for loading nvmem content to FPGA
>>>>>>>>>   region:
>>>>>>>>>     These are features in later patchsets which are not submitted, but we
>>>>>>>>>     could talk about it in advance. I think this is actually a FPGA region
>>>>>>>>>     update from nvmem, it also requires gating, data loading (no SW transfer)
>>>>>>>>>     and re-enumeration, or a single command to image_load HW may result system
>>>>>>>>>     disordered. The FPGA framework now only supports update from in-kernel
>>>>>>>>>     user data, maybe we add support for update from nvmem later.
>>>>>>>>>
>>>>>>>>>   fpga-mgr state extend:
>>>>>>>>>     I think it could be extended, The current states are not perfect,
>>>>>>>>>     adding something like IDLE or READY is just fine.
>>>>>>>>>
>>>>>>>>>   fpga-mgr status extend:
>>>>>>>>>     Add general error definitions as needed. If there is some status
>>>>>>>>>     that is quite vendor specific, expose it in low-level driver.
>>>>>>>>>
>>>>>>>>>   remaining-size:
>>>>>>>>>     Nice to have for all.
>>>>>>>>>
>>>>>>>>> Threading the update
>>>>>>>>> ====================
>>>>>>>>>
>>>>>>>>>   Also a good option for FPGA region update, maybe we also have a slow FPGA
>>>>>>>>>   reprogrammer?
>>>>>>>>>
>>>>>>>>> Cancelling the update
>>>>>>>>> ====================
>>>>>>>>>
>>>>>>>>>   Also a good option for FPGA region update if HW engine supports.
>>>>>>>> These are all good points.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> - Russ
>>>>>>>>> Thanks,
>>>>>>>>> Yilun
>>>>>>>>>
>>>>>>>>>>>>>> According to the patchset, the basic workflow of the 2 update types are
>>>>>>>>>>>>>> quite similar, get the data, prepare for the HW, write and complete.
>>>>>>>>>>>>>> They are already implemented in FPGA manager. We've discussed some
>>>>>>>>>>>>>> differences like threading or canceling the update, which are
>>>>>>>>>>>>>> not provided by FPGA manager but they may also nice to have for FPGA
>>>>>>>>>>>>>> region update. An FPGA region update may also last for a long time??
>>>>>>>>>>>>>> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> My quick mind is that we add some flags in struct fpga_mgr & struct
>>>>>>>>>>>>>> fpga_image_info to indicate the HW capability (support FPGA region
>>>>>>>>>>>>>> update or nvmem update or both) of the download engine and the provided
>>>>>>>>>>>>>> image type. Then the low-level driver knows how to download if it
>>>>>>>>>>>>>> supports both image types.An char device could be added for each fpga manager dev, providing the
>>>>>>>>>>>>>> user APIs for nvmem update. We may not use the char dev for FPGA region
>>>>>>>>>>>>>> update cause it changes the system HW devices and needs device hotplug
>>>>>>>>>>>>>> in FPGA region. We'd better leave it to FPGA region class, this is
>>>>>>>>>>>>>> another topic.
>>>>>>>>>> I'll give this some more thought and see if I can come up with some RFC
>>>>>>>>>> patches.
>>>>>>>>>>
>>>>>>>>>> - Russ
>>>>>>>>>>>>>> More discussion is appreciated.
>>>>>>>>>>>>> I also think fpga_mgr could be extended.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In this patchset,
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
>>>>>>>>>>>>>
>>>>>>>>>>>>> A second, similar set of write ops was added to fpga_manger_ops,
>>>>>>>>>>>>>
>>>>>>>>>>>>> new bit/flag was added to fpga_image_info
>>>>>>>>>>>>>
>>>>>>>>>>>>> The intent was for dfl to add their specific ops to cover what is done in
>>>>>>>>>>>>> this patchset.
>>>>>>>>>>>> I think we don't have to add 2 ops for reconfig & reimage in framework,
>>>>>>>>>>>> the 2 processes are almost the same.
>>>>>>>>>>>>
>>>>>>>>>>>> Just add the _REIMAGE (or something else, NVMEM?) flag for
>>>>>>>>>>>> fpga_image_info, and low level drivers handle it as they do for other
>>>>>>>>>>>> flags.
>>>>>>>>>>>>
>>>>>>>>>>>> How do you think?
>>>>>>>>>>> A single set is fine.
>>>>>>>>>>>
>>>>>>>>>>> A difficult part of is the length of  time to do the write. The existing write should be improved to use a worker thread.
>>>>>>>>>>>
>>>>>>>>>>> Tom
>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Yilun
>>>>>>>>>>>>
>>>>>>>>>>>>> Any other driver would do similar.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is this close to what you are thinking ?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Tom
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Yilun
>>>>>>>>>>>>>>

Xu Yilun Oct. 19, 2021, 2:53 a.m. UTC | #17

On Mon, Oct 18, 2021 at 09:24:08AM -0700, Russ Weight wrote:
> 
> 
> On 10/18/21 1:13 AM, Xu Yilun wrote:
> > On Fri, Oct 15, 2021 at 10:34:23AM -0700, Russ Weight wrote:
> >>
> >> On 10/14/21 7:51 PM, Xu Yilun wrote:
> >>> On Thu, Oct 14, 2021 at 09:32:53AM -0700, Russ Weight wrote:
> >>>> On 10/13/21 6:49 PM, Xu Yilun wrote:
> >>>>> On Wed, Oct 13, 2021 at 11:09:08AM -0700, Russ Weight wrote:
> >>>>>> On 10/12/21 6:06 PM, Xu Yilun wrote:
> >>>>>>> On Tue, Oct 12, 2021 at 10:20:15AM -0700, Russ Weight wrote:
> >>>>>>>> On 10/12/21 12:47 AM, Xu Yilun wrote:
> >>>>>>>>> On Mon, Oct 11, 2021 at 06:00:16PM -0700, Russ Weight wrote:
> >>>>>>>>>> On 10/11/21 5:35 AM, Tom Rix wrote:
> >>>>>>>>>>> On 10/10/21 6:41 PM, Xu Yilun wrote:
> >>>>>>>>>>>> On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
> >>>>>>>>>>>>> On 10/9/21 1:08 AM, Xu Yilun wrote:
> >>>>>>>>>>>>>> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
> >>>>>>>>>>>>>>> The FPGA Image Load framework provides an API to upload image
> >>>>>>>>>>>>>>> files to an FPGA device. Image files are self-describing. They could
> >>>>>>>>>>>>>>> contain FPGA images, BMC images, Root Entry Hashes, or other device
> >>>>>>>>>>>>>>> specific files. It is up to the lower-level device driver and the
> >>>>>>>>>>>>>>> target device to authenticate and disposition the file data.
> >>>>>>>>>>>>>> I've reconsider the FPGA persistent image update again, and think we
> >>>>>>>>>>>>>> may include it in FPGA manager framework.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Sorry I raised this topic again when it is already at patch v17, but now
> >>>>>>>>>>>>>> I need to consider more seriously than before.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> We have consensus the FPGA persistent image update is just like a normal
> >>>>>>>>>>>>>> firmware update which finally writes the nvmem like flash or eeprom,
> >>>>>>>>>>>>>> while the current FPGA manager deals with the active FPGA region update
> >>>>>>>>>>>>>> and re-activation. Could we just expand the FPGA manager and let it handle
> >>>>>>>>>>>>>> the nvmem update as well? Many FPGA cards have nvmem and downloaders
> >>>>>>>>>>>>>> supports updating both FPGA region and nvmem.
> >>>>>>>>>> The fpga-image-load driver is actually just a data transfer. The class
> >>>>>>>>> IMHO, The fpga-mgr dev is also a data transfer. The fpga-region dev is
> >>>>>>>>> acting as the FPGA region admin responsible for gating, transfering and
> >>>>>>>>> re-enumerating.
> >>>>>>>>>
> >>>>>>>>> So my opinion is to add a new data transfer type and keep a unified process.
> >>>>>>>>>
> >>>>>>>>>> driver has no knowledge about what the data is or where/if the data will
> >>>>>>>>>> be stored.
> >>>>>>>>> The fpga-image-load driver knows the data will be stored in nvmem,
> >>>>>>>> FYI: This is not strictly correct. In a coming product there is a
> >>>>>>>> case where the data will be stored in RAM. Richard Gong was also
> >>>>>>>> looking to use this driver to validate an image without programming
> >>>>>>>> or storing it. The fpga-image-load driver has no expectation that
> >>>>>>>> the data will be stored in nvmem, or even that it will be stored
> >>>>>>>> at all.
> >>>>>>> OK, we can discuss that use case then. But fundamentally a driver should
> >>>>>>> be clear what it is doing.
> >>>>>> The lower-level driver is, of course, clear what it is doing. And the
> >>>>>> FPGA Image Load Framework simply provides a consistent API and manages
> >>>>>> a potentially long-running data transfer in the context of a kernel
> >>>>>> worker thread.
> >>>>>>
> >>>>>> It sounds like you are saying that that is not "clear enough" in the
> >>>>>> context of the FPGA Manager?
> >>>>>>
> >>>>>> The files that are used with Intel PAC devices are self-describing. The
> >>>>>> user-space tools, the class driver and the lower-level driver just pass
> >>>>>> the data through to the card BMC without any knowledge of the content,
> >>>>>> purpose or final destination of the data.
> >>>>>>
> >>>>>> The card BMC will receive signed data, validate it, and process it as a
> >>>>>> BMC image, an FPGA image, a Root Entry Hash, or a key cancellation. In
> >>>>> I category all these actions as firmware update fully or partially on
> >>>>> persistent storage. The FPGA Manager don't have to know the meaning of
> >>>>> every byte on flash, but it should be aware the firmware is updated and
> >>>>> the card may acts differently with a new firmware. This is the common
> >>>>> working model for most of the FPGA cards so that we implement it in FPGA
> >>>>> manager class. 
> >>>>>
> >>>>>> the n6000, it could also be part of a multi-step process for programming
> >>>>>> SDM keys and the data may not be stored permanently.
> >>>>> This is new to me, but seems to be different from firmware update, so lets
> >>>>> think about it again.
> >>>>>
> >>>>>>> You may try to extend the FPGA framework to
> >>>>>>> support nvmem storage, or image validation, but cannot say we feed the
> >>>>>>> data to any engine undefined by the framework.
> >>>>>> I'm not sure what you mean by "feed the data to any engine undefined by the
> >>>>>> framework". I think the "engine" is the lower level driver/device that invokes
> >>>>>> the fpga_mgr. The lower level driver, of course, is clear what it is doing.
> >>>>>> The fpga_mgr cannot control what driver invokes it.
> >>>>>>
> >>>>>> Are saying that when invoking the fpga-mgr, that it _must_ also pass descriptive
> >>>>>> data. Meaning that a self-describing file alone is not acceptable?
> >>>>> The class driver should define a reasonable working model and APIs.
> >>>>> Updating the FPGA backup storage is good to me. But receiving a mystery
> >>>>> box and do whatever it requires is not.
> >>>>>
> >>>>> Self-describing file is OK, encryption is OK, but either the class
> >>>>> driver itself, or with the help of the low level driver, should make
> >>>>> sure it works within its scope.
> >>>> In our secure update process, the card BMC firmware authenticates
> >>>> the data using the root entry hashes and will either reject the
> >>>> data or perform some function based on the contents. Neither the
> >>>> user-space, the class driver, nor the lower level driver know
> >>>> what the contents are. It _is_ a "mystery box" to them. How do we
> >>>> verify scope in this model?
> >>> I think we need to find out how. One case is, the HW is designed to have
> >>> one single function, such as firmware update, then any image input
> >>> through firmware update API is within expectation, and the driver
> >>> should only serve the firmware update API. I think this is how the
> >>> N3000 is working now. If the HW is for another function, register itself
> >>> to serve another API, or another class driver.
> >>>
> >>> Another case is, the HW could do multiple types of tasks depending on
> >>> the content of the image, such as firmware update, image verification,
> >>> or assumably power off the card ... There should be some mechanism for
> >>> the driver to only accept the right image according to what API is called.
> >>> Or the user may input an image named update_the_card.img through
> >>> firmware update API and finally get the card off. Having some headers
> >>> readable by host for the operation type? Or, some HW interface for host
> >>> to apply the operation type as well as the image, let the HW verify?
> >>> Let's think about it.
> >> I'm not sure if I am following your thinking here. The HW, of course,
> >> verifies authentication of the image and acts according to the image
> >> type. I don't think it is reasonable to require the class driver to
> >> interpret the data to determine what it is. That implies that the
> >> class driver would have to know the header format and possible values.
> >> It also means that changes to the header format would require patches
> >> to the class driver.
> >>
> >> The FPGA card is trusted by virtue of the fact that the customer
> >> purchased it and physically placed it in the machine. If the FPGA card
> >> (or the lower level driver) validates the image, then why does the
> >> Class driver need to be concerned about the file type? I think the
> >> purpose of the class driver is primarily to provide a common API and
> >> perform common functions so that they don't have to be replicated
> >> among similar low-level drivers. It is up to the low-level driver
> >> or the device itself to ensure that the data received is acceptable.
> >>
> >> If the card receives data through the fpga-mgr upload facility that
> >> isn't strictly a firmware update, and if the lower-level driver or
> >> the card handles it and returns appropriate status - is that really
> >> a problem?
> >>>> As you have noted, most current cases result in a change to the
> >>>> card, and I suspect that it will remain that way. But that can't be
> >>>> guaranteed, and I'm not convinced that a host driver needs to be
> >>>> concerned about it.
> >>> A host driver should know what is done, in some abstraction level.
> >>> I think updating the persistent storage is an acceptable abstraction
> >>> in FPGA domain, so I'd like to extend it in FPGA manager. But doing
> >>> anything according to the image is not.
> >> By host driver, do you mean the class driver? Or the lower-level device
> >> driver?
> > The class driver.
> >
> >> It seems to me that you are saying that self-describing images are not
> >> acceptable? Or at least they are not acceptable payload for an FPGA
> >> Manager firmware-update API?
> > For N3000, we are working on the persistent storage update APIs, which is
> > a much simpler working model, no runtime device change, and needs no
> > device removal & re-enumeration.
> >
> > But if you need to extend something more that would potentially changes
> > the behavior of the running devices on FPGA, device removal &
> > re-enumeration are needed so that the system knows what devices are
> > changed.
> >
> >> The FPGA Image Load Framework was designed with the concept of
> >> transferring data to a device without imposing a purpose on the data.
> >> The expectation is that the lower-level driver or the device will
> >> validate the data. Is there something fundamentally wrong with that
> > I think there is something wrong here. As I said before, persistent
> > storage updating has different software process from some runtime
> > updating, so the class driver should be aware of what the HW engine
> > is doing.
> So far, there are no self-describing images that cause a
> change in run-time behavior, and I don't think that will
> happen for the very reason that the class-driver would
> need to know about it.

Again, the class driver needs to know what is happening, at some
abstraction level, to ensure the system is aligned with the HW state.

If the class driver cannot tell the detail, it has to assume the
whole FPGA region will be changed, and removal & re-enumeration is
needed.

> 
> When I asserted that not all self-describing images are
> changing firmware, I did not mean to imply that they change
> run-time behavior; they do not. They are part of a multi-
> step update of firmware. However, looking at each part of
> the sequence independently, there is at least one instance
> where a certificate is stored in RAM for temporary use.
> When the driver exits from this call, there is no firmware
> change. There is also no change in run-time behavior.
> 
> I'm thinking we could have different IOCTLs:
> 
> (1) firmware  update (address, size, purpose provided
>     with the image)

Will the firmware update use the self-describing files?

> (2) image upload (self-describing files)

If both 1 & 2 use self-describing files, how the class driver verifies
the type of the file without looking into the file?

For example, if a user calls a firmware update API but inputs an image
upload file, will the class driver block the call? How?

> (3) image validation
> 
> These would all use most of the same code, but the FPGA
> Manager flags and structure fields would be set differently.

This is good to me.

Thanks,
Yilun

> 
> - Russ
> >
> > Thanks,
> > Yilun
> >
> >> approach? And if not, why couldn't we incorporate a similar image_load
> >> API into the FPGA Manager?
> >>
> >> - Russ
> >>
> >>> Thanks,
> >>> Yilun
> >>>
> >>>> - Russ
> >>>>
> >>>>> Thanks,
> >>>>> Yilun
> >>>>>
> >>>>>> Thanks,
> >>>>>> - Russ
> >>>>>>
> >>>>>>> Thanks,
> >>>>>>> Yilun
> >>>>>>>
> >>>>>>>>> while
> >>>>>>>>> the fpga-mgr knows the data will be stored in FPGA cells. They may need
> >>>>>>>>> to know the exact physical position to store, may not, depends on what the
> >>>>>>>>> HW engines are.
> >>>>>>>>>
> >>>>>>>>>> This functionality could, of course, be merged into the fpga-mgr. I did
> >>>>>>>>>> a proof of concept of this a while back and we discussed the pros and cons.
> >>>>>>>>>> See this email for a recap:
> >>>>>>>>>>
> >>>>>>>>>> https://marc.info/?l=linux-fpga&m=161998085507374&w=2
> >>>>>>>>>>
> >>>>>>>>>> Things have changed some with the evolution of the driver. The IOCTL
> >>>>>>>>>> approach probably fits better than the sysfs implementation. At the time
> >>>>>>>>>> it seemed that a merge would add unnecessary complexity without adding value.
> >>>>>>>>> I think at least developers don't have to go through 2 sets of software
> >>>>>>>>> stacks which are of the same concept. And adding some new features like
> >>>>>>>>> optionally threading or canceling data transfer are also good to FPGA
> >>>>>>>>> region update. And the nvmem update could also be benifit from exsiting
> >>>>>>>>> implementations like scatter-gather buffers, in-kernel firmware loading.
> >>>>>>>>>
> >>>>>>>>> I try to explain myself according to each of your concern from that mail
> >>>>>>>>> thread:
> >>>>>>>>>
> >>>>>>>>> Purpose of the 2 updates
> >>>>>>>>> ========================
> >>>>>>>>>
> >>>>>>>>>   As I said before, I think they are both data transfer devices. FPGA
> >>>>>>>>> region update gets extra support from fpga-region & fpga-bridge, and
> >>>>>>>>> FPGA nvmem update could be a standalone fpga-mgr.
> >>>>>>>>>
> >>>>>>>>> Extra APIs that are unique to nvmem update
> >>>>>>>>> ==========================================
> >>>>>>>>>
> >>>>>>>>>   cdev APIs for nvmem update:
> >>>>>>>>>     Yes we need to expand the functionality so we need them.
> >>>>>>>>>
> >>>>>>>>>   available_images, image_load APIs for loading nvmem content to FPGA
> >>>>>>>>>   region:
> >>>>>>>>>     These are features in later patchsets which are not submitted, but we
> >>>>>>>>>     could talk about it in advance. I think this is actually a FPGA region
> >>>>>>>>>     update from nvmem, it also requires gating, data loading (no SW transfer)
> >>>>>>>>>     and re-enumeration, or a single command to image_load HW may result system
> >>>>>>>>>     disordered. The FPGA framework now only supports update from in-kernel
> >>>>>>>>>     user data, maybe we add support for update from nvmem later.
> >>>>>>>>>
> >>>>>>>>>   fpga-mgr state extend:
> >>>>>>>>>     I think it could be extended, The current states are not perfect,
> >>>>>>>>>     adding something like IDLE or READY is just fine.
> >>>>>>>>>
> >>>>>>>>>   fpga-mgr status extend:
> >>>>>>>>>     Add general error definitions as needed. If there is some status
> >>>>>>>>>     that is quite vendor specific, expose it in low-level driver.
> >>>>>>>>>
> >>>>>>>>>   remaining-size:
> >>>>>>>>>     Nice to have for all.
> >>>>>>>>>
> >>>>>>>>> Threading the update
> >>>>>>>>> ====================
> >>>>>>>>>
> >>>>>>>>>   Also a good option for FPGA region update, maybe we also have a slow FPGA
> >>>>>>>>>   reprogrammer?
> >>>>>>>>>
> >>>>>>>>> Cancelling the update
> >>>>>>>>> ====================
> >>>>>>>>>
> >>>>>>>>>   Also a good option for FPGA region update if HW engine supports.
> >>>>>>>> These are all good points.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> - Russ
> >>>>>>>>> Thanks,
> >>>>>>>>> Yilun
> >>>>>>>>>
> >>>>>>>>>>>>>> According to the patchset, the basic workflow of the 2 update types are
> >>>>>>>>>>>>>> quite similar, get the data, prepare for the HW, write and complete.
> >>>>>>>>>>>>>> They are already implemented in FPGA manager. We've discussed some
> >>>>>>>>>>>>>> differences like threading or canceling the update, which are
> >>>>>>>>>>>>>> not provided by FPGA manager but they may also nice to have for FPGA
> >>>>>>>>>>>>>> region update. An FPGA region update may also last for a long time??
> >>>>>>>>>>>>>> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> My quick mind is that we add some flags in struct fpga_mgr & struct
> >>>>>>>>>>>>>> fpga_image_info to indicate the HW capability (support FPGA region
> >>>>>>>>>>>>>> update or nvmem update or both) of the download engine and the provided
> >>>>>>>>>>>>>> image type. Then the low-level driver knows how to download if it
> >>>>>>>>>>>>>> supports both image types.An char device could be added for each fpga manager dev, providing the
> >>>>>>>>>>>>>> user APIs for nvmem update. We may not use the char dev for FPGA region
> >>>>>>>>>>>>>> update cause it changes the system HW devices and needs device hotplug
> >>>>>>>>>>>>>> in FPGA region. We'd better leave it to FPGA region class, this is
> >>>>>>>>>>>>>> another topic.
> >>>>>>>>>> I'll give this some more thought and see if I can come up with some RFC
> >>>>>>>>>> patches.
> >>>>>>>>>>
> >>>>>>>>>> - Russ
> >>>>>>>>>>>>>> More discussion is appreciated.
> >>>>>>>>>>>>> I also think fpga_mgr could be extended.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> In this patchset,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> A second, similar set of write ops was added to fpga_manger_ops,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> new bit/flag was added to fpga_image_info
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The intent was for dfl to add their specific ops to cover what is done in
> >>>>>>>>>>>>> this patchset.
> >>>>>>>>>>>> I think we don't have to add 2 ops for reconfig & reimage in framework,
> >>>>>>>>>>>> the 2 processes are almost the same.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Just add the _REIMAGE (or something else, NVMEM?) flag for
> >>>>>>>>>>>> fpga_image_info, and low level drivers handle it as they do for other
> >>>>>>>>>>>> flags.
> >>>>>>>>>>>>
> >>>>>>>>>>>> How do you think?
> >>>>>>>>>>> A single set is fine.
> >>>>>>>>>>>
> >>>>>>>>>>> A difficult part of is the length of  time to do the write. The existing write should be improved to use a worker thread.
> >>>>>>>>>>>
> >>>>>>>>>>> Tom
> >>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> Yilun
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Any other driver would do similar.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Is this close to what you are thinking ?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Tom
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>> Yilun
> >>>>>>>>>>>>>>

Russ Weight Oct. 19, 2021, 3:09 p.m. UTC | #18

On 10/18/21 7:53 PM, Xu Yilun wrote:
> On Mon, Oct 18, 2021 at 09:24:08AM -0700, Russ Weight wrote:
>>
>> On 10/18/21 1:13 AM, Xu Yilun wrote:
>>> On Fri, Oct 15, 2021 at 10:34:23AM -0700, Russ Weight wrote:
>>>> On 10/14/21 7:51 PM, Xu Yilun wrote:
>>>>> On Thu, Oct 14, 2021 at 09:32:53AM -0700, Russ Weight wrote:
>>>>>> On 10/13/21 6:49 PM, Xu Yilun wrote:
>>>>>>> On Wed, Oct 13, 2021 at 11:09:08AM -0700, Russ Weight wrote:
>>>>>>>> On 10/12/21 6:06 PM, Xu Yilun wrote:
>>>>>>>>> On Tue, Oct 12, 2021 at 10:20:15AM -0700, Russ Weight wrote:
>>>>>>>>>> On 10/12/21 12:47 AM, Xu Yilun wrote:
>>>>>>>>>>> On Mon, Oct 11, 2021 at 06:00:16PM -0700, Russ Weight wrote:
>>>>>>>>>>>> On 10/11/21 5:35 AM, Tom Rix wrote:
>>>>>>>>>>>>> On 10/10/21 6:41 PM, Xu Yilun wrote:
>>>>>>>>>>>>>> On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
>>>>>>>>>>>>>>> On 10/9/21 1:08 AM, Xu Yilun wrote:
>>>>>>>>>>>>>>>> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
>>>>>>>>>>>>>>>>> The FPGA Image Load framework provides an API to upload image
>>>>>>>>>>>>>>>>> files to an FPGA device. Image files are self-describing. They could
>>>>>>>>>>>>>>>>> contain FPGA images, BMC images, Root Entry Hashes, or other device
>>>>>>>>>>>>>>>>> specific files. It is up to the lower-level device driver and the
>>>>>>>>>>>>>>>>> target device to authenticate and disposition the file data.
>>>>>>>>>>>>>>>> I've reconsider the FPGA persistent image update again, and think we
>>>>>>>>>>>>>>>> may include it in FPGA manager framework.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sorry I raised this topic again when it is already at patch v17, but now
>>>>>>>>>>>>>>>> I need to consider more seriously than before.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We have consensus the FPGA persistent image update is just like a normal
>>>>>>>>>>>>>>>> firmware update which finally writes the nvmem like flash or eeprom,
>>>>>>>>>>>>>>>> while the current FPGA manager deals with the active FPGA region update
>>>>>>>>>>>>>>>> and re-activation. Could we just expand the FPGA manager and let it handle
>>>>>>>>>>>>>>>> the nvmem update as well? Many FPGA cards have nvmem and downloaders
>>>>>>>>>>>>>>>> supports updating both FPGA region and nvmem.
>>>>>>>>>>>> The fpga-image-load driver is actually just a data transfer. The class
>>>>>>>>>>> IMHO, The fpga-mgr dev is also a data transfer. The fpga-region dev is
>>>>>>>>>>> acting as the FPGA region admin responsible for gating, transfering and
>>>>>>>>>>> re-enumerating.
>>>>>>>>>>>
>>>>>>>>>>> So my opinion is to add a new data transfer type and keep a unified process.
>>>>>>>>>>>
>>>>>>>>>>>> driver has no knowledge about what the data is or where/if the data will
>>>>>>>>>>>> be stored.
>>>>>>>>>>> The fpga-image-load driver knows the data will be stored in nvmem,
>>>>>>>>>> FYI: This is not strictly correct. In a coming product there is a
>>>>>>>>>> case where the data will be stored in RAM. Richard Gong was also
>>>>>>>>>> looking to use this driver to validate an image without programming
>>>>>>>>>> or storing it. The fpga-image-load driver has no expectation that
>>>>>>>>>> the data will be stored in nvmem, or even that it will be stored
>>>>>>>>>> at all.
>>>>>>>>> OK, we can discuss that use case then. But fundamentally a driver should
>>>>>>>>> be clear what it is doing.
>>>>>>>> The lower-level driver is, of course, clear what it is doing. And the
>>>>>>>> FPGA Image Load Framework simply provides a consistent API and manages
>>>>>>>> a potentially long-running data transfer in the context of a kernel
>>>>>>>> worker thread.
>>>>>>>>
>>>>>>>> It sounds like you are saying that that is not "clear enough" in the
>>>>>>>> context of the FPGA Manager?
>>>>>>>>
>>>>>>>> The files that are used with Intel PAC devices are self-describing. The
>>>>>>>> user-space tools, the class driver and the lower-level driver just pass
>>>>>>>> the data through to the card BMC without any knowledge of the content,
>>>>>>>> purpose or final destination of the data.
>>>>>>>>
>>>>>>>> The card BMC will receive signed data, validate it, and process it as a
>>>>>>>> BMC image, an FPGA image, a Root Entry Hash, or a key cancellation. In
>>>>>>> I category all these actions as firmware update fully or partially on
>>>>>>> persistent storage. The FPGA Manager don't have to know the meaning of
>>>>>>> every byte on flash, but it should be aware the firmware is updated and
>>>>>>> the card may acts differently with a new firmware. This is the common
>>>>>>> working model for most of the FPGA cards so that we implement it in FPGA
>>>>>>> manager class. 
>>>>>>>
>>>>>>>> the n6000, it could also be part of a multi-step process for programming
>>>>>>>> SDM keys and the data may not be stored permanently.
>>>>>>> This is new to me, but seems to be different from firmware update, so lets
>>>>>>> think about it again.
>>>>>>>
>>>>>>>>> You may try to extend the FPGA framework to
>>>>>>>>> support nvmem storage, or image validation, but cannot say we feed the
>>>>>>>>> data to any engine undefined by the framework.
>>>>>>>> I'm not sure what you mean by "feed the data to any engine undefined by the
>>>>>>>> framework". I think the "engine" is the lower level driver/device that invokes
>>>>>>>> the fpga_mgr. The lower level driver, of course, is clear what it is doing.
>>>>>>>> The fpga_mgr cannot control what driver invokes it.
>>>>>>>>
>>>>>>>> Are saying that when invoking the fpga-mgr, that it _must_ also pass descriptive
>>>>>>>> data. Meaning that a self-describing file alone is not acceptable?
>>>>>>> The class driver should define a reasonable working model and APIs.
>>>>>>> Updating the FPGA backup storage is good to me. But receiving a mystery
>>>>>>> box and do whatever it requires is not.
>>>>>>>
>>>>>>> Self-describing file is OK, encryption is OK, but either the class
>>>>>>> driver itself, or with the help of the low level driver, should make
>>>>>>> sure it works within its scope.
>>>>>> In our secure update process, the card BMC firmware authenticates
>>>>>> the data using the root entry hashes and will either reject the
>>>>>> data or perform some function based on the contents. Neither the
>>>>>> user-space, the class driver, nor the lower level driver know
>>>>>> what the contents are. It _is_ a "mystery box" to them. How do we
>>>>>> verify scope in this model?
>>>>> I think we need to find out how. One case is, the HW is designed to have
>>>>> one single function, such as firmware update, then any image input
>>>>> through firmware update API is within expectation, and the driver
>>>>> should only serve the firmware update API. I think this is how the
>>>>> N3000 is working now. If the HW is for another function, register itself
>>>>> to serve another API, or another class driver.
>>>>>
>>>>> Another case is, the HW could do multiple types of tasks depending on
>>>>> the content of the image, such as firmware update, image verification,
>>>>> or assumably power off the card ... There should be some mechanism for
>>>>> the driver to only accept the right image according to what API is called.
>>>>> Or the user may input an image named update_the_card.img through
>>>>> firmware update API and finally get the card off. Having some headers
>>>>> readable by host for the operation type? Or, some HW interface for host
>>>>> to apply the operation type as well as the image, let the HW verify?
>>>>> Let's think about it.
>>>> I'm not sure if I am following your thinking here. The HW, of course,
>>>> verifies authentication of the image and acts according to the image
>>>> type. I don't think it is reasonable to require the class driver to
>>>> interpret the data to determine what it is. That implies that the
>>>> class driver would have to know the header format and possible values.
>>>> It also means that changes to the header format would require patches
>>>> to the class driver.
>>>>
>>>> The FPGA card is trusted by virtue of the fact that the customer
>>>> purchased it and physically placed it in the machine. If the FPGA card
>>>> (or the lower level driver) validates the image, then why does the
>>>> Class driver need to be concerned about the file type? I think the
>>>> purpose of the class driver is primarily to provide a common API and
>>>> perform common functions so that they don't have to be replicated
>>>> among similar low-level drivers. It is up to the low-level driver
>>>> or the device itself to ensure that the data received is acceptable.
>>>>
>>>> If the card receives data through the fpga-mgr upload facility that
>>>> isn't strictly a firmware update, and if the lower-level driver or
>>>> the card handles it and returns appropriate status - is that really
>>>> a problem?
>>>>>> As you have noted, most current cases result in a change to the
>>>>>> card, and I suspect that it will remain that way. But that can't be
>>>>>> guaranteed, and I'm not convinced that a host driver needs to be
>>>>>> concerned about it.
>>>>> A host driver should know what is done, in some abstraction level.
>>>>> I think updating the persistent storage is an acceptable abstraction
>>>>> in FPGA domain, so I'd like to extend it in FPGA manager. But doing
>>>>> anything according to the image is not.
>>>> By host driver, do you mean the class driver? Or the lower-level device
>>>> driver?
>>> The class driver.
>>>
>>>> It seems to me that you are saying that self-describing images are not
>>>> acceptable? Or at least they are not acceptable payload for an FPGA
>>>> Manager firmware-update API?
>>> For N3000, we are working on the persistent storage update APIs, which is
>>> a much simpler working model, no runtime device change, and needs no
>>> device removal & re-enumeration.
>>>
>>> But if you need to extend something more that would potentially changes
>>> the behavior of the running devices on FPGA, device removal &
>>> re-enumeration are needed so that the system knows what devices are
>>> changed.
>>>
>>>> The FPGA Image Load Framework was designed with the concept of
>>>> transferring data to a device without imposing a purpose on the data.
>>>> The expectation is that the lower-level driver or the device will
>>>> validate the data. Is there something fundamentally wrong with that
>>> I think there is something wrong here. As I said before, persistent
>>> storage updating has different software process from some runtime
>>> updating, so the class driver should be aware of what the HW engine
>>> is doing.
>> So far, there are no self-describing images that cause a
>> change in run-time behavior, and I don't think that will
>> happen for the very reason that the class-driver would
>> need to know about it.
> Again, the class driver needs to know what is happening, at some
> abstraction level, to ensure the system is aligned with the HW state.
>
> If the class driver cannot tell the detail, it has to assume the
> whole FPGA region will be changed, and removal & re-enumeration is
> needed.
So we make it a requirement that the self-describing files
cannot make changes that require the class driver to manage
state.
>
>> When I asserted that not all self-describing images are
>> changing firmware, I did not mean to imply that they change
>> run-time behavior; they do not. They are part of a multi-
>> step update of firmware. However, looking at each part of
>> the sequence independently, there is at least one instance
>> where a certificate is stored in RAM for temporary use.
>> When the driver exits from this call, there is no firmware
>> change. There is also no change in run-time behavior.
>>
>> I'm thinking we could have different IOCTLs:
>>
>> (1) firmware  update (address, size, purpose provided
>>     with the image)
> Will the firmware update use the self-describing files?
The firmware update option would be for files that
are not self-describing.
>
>> (2) image upload (self-describing files)
> If both 1 & 2 use self-describing files, how the class driver verifies
> the type of the file without looking into the file?
Only 2 would use self-describing files.

- Russ
>
> For example, if a user calls a firmware update API but inputs an image
> upload file, will the class driver block the call? How?
>
>> (3) image validation
>>
>> These would all use most of the same code, but the FPGA
>> Manager flags and structure fields would be set differently.
> This is good to me.
>
> Thanks,
> Yilun
>
>> - Russ
>>> Thanks,
>>> Yilun
>>>
>>>> approach? And if not, why couldn't we incorporate a similar image_load
>>>> API into the FPGA Manager?
>>>>
>>>> - Russ
>>>>
>>>>> Thanks,
>>>>> Yilun
>>>>>
>>>>>> - Russ
>>>>>>
>>>>>>> Thanks,
>>>>>>> Yilun
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> - Russ
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Yilun
>>>>>>>>>
>>>>>>>>>>> while
>>>>>>>>>>> the fpga-mgr knows the data will be stored in FPGA cells. They may need
>>>>>>>>>>> to know the exact physical position to store, may not, depends on what the
>>>>>>>>>>> HW engines are.
>>>>>>>>>>>
>>>>>>>>>>>> This functionality could, of course, be merged into the fpga-mgr. I did
>>>>>>>>>>>> a proof of concept of this a while back and we discussed the pros and cons.
>>>>>>>>>>>> See this email for a recap:
>>>>>>>>>>>>
>>>>>>>>>>>> https://marc.info/?l=linux-fpga&m=161998085507374&w=2
>>>>>>>>>>>>
>>>>>>>>>>>> Things have changed some with the evolution of the driver. The IOCTL
>>>>>>>>>>>> approach probably fits better than the sysfs implementation. At the time
>>>>>>>>>>>> it seemed that a merge would add unnecessary complexity without adding value.
>>>>>>>>>>> I think at least developers don't have to go through 2 sets of software
>>>>>>>>>>> stacks which are of the same concept. And adding some new features like
>>>>>>>>>>> optionally threading or canceling data transfer are also good to FPGA
>>>>>>>>>>> region update. And the nvmem update could also be benifit from exsiting
>>>>>>>>>>> implementations like scatter-gather buffers, in-kernel firmware loading.
>>>>>>>>>>>
>>>>>>>>>>> I try to explain myself according to each of your concern from that mail
>>>>>>>>>>> thread:
>>>>>>>>>>>
>>>>>>>>>>> Purpose of the 2 updates
>>>>>>>>>>> ========================
>>>>>>>>>>>
>>>>>>>>>>>   As I said before, I think they are both data transfer devices. FPGA
>>>>>>>>>>> region update gets extra support from fpga-region & fpga-bridge, and
>>>>>>>>>>> FPGA nvmem update could be a standalone fpga-mgr.
>>>>>>>>>>>
>>>>>>>>>>> Extra APIs that are unique to nvmem update
>>>>>>>>>>> ==========================================
>>>>>>>>>>>
>>>>>>>>>>>   cdev APIs for nvmem update:
>>>>>>>>>>>     Yes we need to expand the functionality so we need them.
>>>>>>>>>>>
>>>>>>>>>>>   available_images, image_load APIs for loading nvmem content to FPGA
>>>>>>>>>>>   region:
>>>>>>>>>>>     These are features in later patchsets which are not submitted, but we
>>>>>>>>>>>     could talk about it in advance. I think this is actually a FPGA region
>>>>>>>>>>>     update from nvmem, it also requires gating, data loading (no SW transfer)
>>>>>>>>>>>     and re-enumeration, or a single command to image_load HW may result system
>>>>>>>>>>>     disordered. The FPGA framework now only supports update from in-kernel
>>>>>>>>>>>     user data, maybe we add support for update from nvmem later.
>>>>>>>>>>>
>>>>>>>>>>>   fpga-mgr state extend:
>>>>>>>>>>>     I think it could be extended, The current states are not perfect,
>>>>>>>>>>>     adding something like IDLE or READY is just fine.
>>>>>>>>>>>
>>>>>>>>>>>   fpga-mgr status extend:
>>>>>>>>>>>     Add general error definitions as needed. If there is some status
>>>>>>>>>>>     that is quite vendor specific, expose it in low-level driver.
>>>>>>>>>>>
>>>>>>>>>>>   remaining-size:
>>>>>>>>>>>     Nice to have for all.
>>>>>>>>>>>
>>>>>>>>>>> Threading the update
>>>>>>>>>>> ====================
>>>>>>>>>>>
>>>>>>>>>>>   Also a good option for FPGA region update, maybe we also have a slow FPGA
>>>>>>>>>>>   reprogrammer?
>>>>>>>>>>>
>>>>>>>>>>> Cancelling the update
>>>>>>>>>>> ====================
>>>>>>>>>>>
>>>>>>>>>>>   Also a good option for FPGA region update if HW engine supports.
>>>>>>>>>> These are all good points.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> - Russ
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Yilun
>>>>>>>>>>>
>>>>>>>>>>>>>>>> According to the patchset, the basic workflow of the 2 update types are
>>>>>>>>>>>>>>>> quite similar, get the data, prepare for the HW, write and complete.
>>>>>>>>>>>>>>>> They are already implemented in FPGA manager. We've discussed some
>>>>>>>>>>>>>>>> differences like threading or canceling the update, which are
>>>>>>>>>>>>>>>> not provided by FPGA manager but they may also nice to have for FPGA
>>>>>>>>>>>>>>>> region update. An FPGA region update may also last for a long time??
>>>>>>>>>>>>>>>> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> My quick mind is that we add some flags in struct fpga_mgr & struct
>>>>>>>>>>>>>>>> fpga_image_info to indicate the HW capability (support FPGA region
>>>>>>>>>>>>>>>> update or nvmem update or both) of the download engine and the provided
>>>>>>>>>>>>>>>> image type. Then the low-level driver knows how to download if it
>>>>>>>>>>>>>>>> supports both image types.An char device could be added for each fpga manager dev, providing the
>>>>>>>>>>>>>>>> user APIs for nvmem update. We may not use the char dev for FPGA region
>>>>>>>>>>>>>>>> update cause it changes the system HW devices and needs device hotplug
>>>>>>>>>>>>>>>> in FPGA region. We'd better leave it to FPGA region class, this is
>>>>>>>>>>>>>>>> another topic.
>>>>>>>>>>>> I'll give this some more thought and see if I can come up with some RFC
>>>>>>>>>>>> patches.
>>>>>>>>>>>>
>>>>>>>>>>>> - Russ
>>>>>>>>>>>>>>>> More discussion is appreciated.
>>>>>>>>>>>>>>> I also think fpga_mgr could be extended.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In this patchset,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> A second, similar set of write ops was added to fpga_manger_ops,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> new bit/flag was added to fpga_image_info
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The intent was for dfl to add their specific ops to cover what is done in
>>>>>>>>>>>>>>> this patchset.
>>>>>>>>>>>>>> I think we don't have to add 2 ops for reconfig & reimage in framework,
>>>>>>>>>>>>>> the 2 processes are almost the same.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Just add the _REIMAGE (or something else, NVMEM?) flag for
>>>>>>>>>>>>>> fpga_image_info, and low level drivers handle it as they do for other
>>>>>>>>>>>>>> flags.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> How do you think?
>>>>>>>>>>>>> A single set is fine.
>>>>>>>>>>>>>
>>>>>>>>>>>>> A difficult part of is the length of  time to do the write. The existing write should be improved to use a worker thread.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Tom
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Yilun
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Any other driver would do similar.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is this close to what you are thinking ?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Tom
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Yilun
>>>>>>>>>>>>>>>>

Xu Yilun Oct. 20, 2021, 1:16 a.m. UTC | #19

On Tue, Oct 19, 2021 at 08:09:46AM -0700, Russ Weight wrote:
> 
> 
> On 10/18/21 7:53 PM, Xu Yilun wrote:
> > On Mon, Oct 18, 2021 at 09:24:08AM -0700, Russ Weight wrote:
> >>
> >> On 10/18/21 1:13 AM, Xu Yilun wrote:
> >>> On Fri, Oct 15, 2021 at 10:34:23AM -0700, Russ Weight wrote:
> >>>> On 10/14/21 7:51 PM, Xu Yilun wrote:
> >>>>> On Thu, Oct 14, 2021 at 09:32:53AM -0700, Russ Weight wrote:
> >>>>>> On 10/13/21 6:49 PM, Xu Yilun wrote:
> >>>>>>> On Wed, Oct 13, 2021 at 11:09:08AM -0700, Russ Weight wrote:
> >>>>>>>> On 10/12/21 6:06 PM, Xu Yilun wrote:
> >>>>>>>>> On Tue, Oct 12, 2021 at 10:20:15AM -0700, Russ Weight wrote:
> >>>>>>>>>> On 10/12/21 12:47 AM, Xu Yilun wrote:
> >>>>>>>>>>> On Mon, Oct 11, 2021 at 06:00:16PM -0700, Russ Weight wrote:
> >>>>>>>>>>>> On 10/11/21 5:35 AM, Tom Rix wrote:
> >>>>>>>>>>>>> On 10/10/21 6:41 PM, Xu Yilun wrote:
> >>>>>>>>>>>>>> On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
> >>>>>>>>>>>>>>> On 10/9/21 1:08 AM, Xu Yilun wrote:
> >>>>>>>>>>>>>>>> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
> >>>>>>>>>>>>>>>>> The FPGA Image Load framework provides an API to upload image
> >>>>>>>>>>>>>>>>> files to an FPGA device. Image files are self-describing. They could
> >>>>>>>>>>>>>>>>> contain FPGA images, BMC images, Root Entry Hashes, or other device
> >>>>>>>>>>>>>>>>> specific files. It is up to the lower-level device driver and the
> >>>>>>>>>>>>>>>>> target device to authenticate and disposition the file data.
> >>>>>>>>>>>>>>>> I've reconsider the FPGA persistent image update again, and think we
> >>>>>>>>>>>>>>>> may include it in FPGA manager framework.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Sorry I raised this topic again when it is already at patch v17, but now
> >>>>>>>>>>>>>>>> I need to consider more seriously than before.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> We have consensus the FPGA persistent image update is just like a normal
> >>>>>>>>>>>>>>>> firmware update which finally writes the nvmem like flash or eeprom,
> >>>>>>>>>>>>>>>> while the current FPGA manager deals with the active FPGA region update
> >>>>>>>>>>>>>>>> and re-activation. Could we just expand the FPGA manager and let it handle
> >>>>>>>>>>>>>>>> the nvmem update as well? Many FPGA cards have nvmem and downloaders
> >>>>>>>>>>>>>>>> supports updating both FPGA region and nvmem.
> >>>>>>>>>>>> The fpga-image-load driver is actually just a data transfer. The class
> >>>>>>>>>>> IMHO, The fpga-mgr dev is also a data transfer. The fpga-region dev is
> >>>>>>>>>>> acting as the FPGA region admin responsible for gating, transfering and
> >>>>>>>>>>> re-enumerating.
> >>>>>>>>>>>
> >>>>>>>>>>> So my opinion is to add a new data transfer type and keep a unified process.
> >>>>>>>>>>>
> >>>>>>>>>>>> driver has no knowledge about what the data is or where/if the data will
> >>>>>>>>>>>> be stored.
> >>>>>>>>>>> The fpga-image-load driver knows the data will be stored in nvmem,
> >>>>>>>>>> FYI: This is not strictly correct. In a coming product there is a
> >>>>>>>>>> case where the data will be stored in RAM. Richard Gong was also
> >>>>>>>>>> looking to use this driver to validate an image without programming
> >>>>>>>>>> or storing it. The fpga-image-load driver has no expectation that
> >>>>>>>>>> the data will be stored in nvmem, or even that it will be stored
> >>>>>>>>>> at all.
> >>>>>>>>> OK, we can discuss that use case then. But fundamentally a driver should
> >>>>>>>>> be clear what it is doing.
> >>>>>>>> The lower-level driver is, of course, clear what it is doing. And the
> >>>>>>>> FPGA Image Load Framework simply provides a consistent API and manages
> >>>>>>>> a potentially long-running data transfer in the context of a kernel
> >>>>>>>> worker thread.
> >>>>>>>>
> >>>>>>>> It sounds like you are saying that that is not "clear enough" in the
> >>>>>>>> context of the FPGA Manager?
> >>>>>>>>
> >>>>>>>> The files that are used with Intel PAC devices are self-describing. The
> >>>>>>>> user-space tools, the class driver and the lower-level driver just pass
> >>>>>>>> the data through to the card BMC without any knowledge of the content,
> >>>>>>>> purpose or final destination of the data.
> >>>>>>>>
> >>>>>>>> The card BMC will receive signed data, validate it, and process it as a
> >>>>>>>> BMC image, an FPGA image, a Root Entry Hash, or a key cancellation. In
> >>>>>>> I category all these actions as firmware update fully or partially on
> >>>>>>> persistent storage. The FPGA Manager don't have to know the meaning of
> >>>>>>> every byte on flash, but it should be aware the firmware is updated and
> >>>>>>> the card may acts differently with a new firmware. This is the common
> >>>>>>> working model for most of the FPGA cards so that we implement it in FPGA
> >>>>>>> manager class. 
> >>>>>>>
> >>>>>>>> the n6000, it could also be part of a multi-step process for programming
> >>>>>>>> SDM keys and the data may not be stored permanently.
> >>>>>>> This is new to me, but seems to be different from firmware update, so lets
> >>>>>>> think about it again.
> >>>>>>>
> >>>>>>>>> You may try to extend the FPGA framework to
> >>>>>>>>> support nvmem storage, or image validation, but cannot say we feed the
> >>>>>>>>> data to any engine undefined by the framework.
> >>>>>>>> I'm not sure what you mean by "feed the data to any engine undefined by the
> >>>>>>>> framework". I think the "engine" is the lower level driver/device that invokes
> >>>>>>>> the fpga_mgr. The lower level driver, of course, is clear what it is doing.
> >>>>>>>> The fpga_mgr cannot control what driver invokes it.
> >>>>>>>>
> >>>>>>>> Are saying that when invoking the fpga-mgr, that it _must_ also pass descriptive
> >>>>>>>> data. Meaning that a self-describing file alone is not acceptable?
> >>>>>>> The class driver should define a reasonable working model and APIs.
> >>>>>>> Updating the FPGA backup storage is good to me. But receiving a mystery
> >>>>>>> box and do whatever it requires is not.
> >>>>>>>
> >>>>>>> Self-describing file is OK, encryption is OK, but either the class
> >>>>>>> driver itself, or with the help of the low level driver, should make
> >>>>>>> sure it works within its scope.
> >>>>>> In our secure update process, the card BMC firmware authenticates
> >>>>>> the data using the root entry hashes and will either reject the
> >>>>>> data or perform some function based on the contents. Neither the
> >>>>>> user-space, the class driver, nor the lower level driver know
> >>>>>> what the contents are. It _is_ a "mystery box" to them. How do we
> >>>>>> verify scope in this model?
> >>>>> I think we need to find out how. One case is, the HW is designed to have
> >>>>> one single function, such as firmware update, then any image input
> >>>>> through firmware update API is within expectation, and the driver
> >>>>> should only serve the firmware update API. I think this is how the
> >>>>> N3000 is working now. If the HW is for another function, register itself
> >>>>> to serve another API, or another class driver.
> >>>>>
> >>>>> Another case is, the HW could do multiple types of tasks depending on
> >>>>> the content of the image, such as firmware update, image verification,
> >>>>> or assumably power off the card ... There should be some mechanism for
> >>>>> the driver to only accept the right image according to what API is called.
> >>>>> Or the user may input an image named update_the_card.img through
> >>>>> firmware update API and finally get the card off. Having some headers
> >>>>> readable by host for the operation type? Or, some HW interface for host
> >>>>> to apply the operation type as well as the image, let the HW verify?
> >>>>> Let's think about it.
> >>>> I'm not sure if I am following your thinking here. The HW, of course,
> >>>> verifies authentication of the image and acts according to the image
> >>>> type. I don't think it is reasonable to require the class driver to
> >>>> interpret the data to determine what it is. That implies that the
> >>>> class driver would have to know the header format and possible values.
> >>>> It also means that changes to the header format would require patches
> >>>> to the class driver.
> >>>>
> >>>> The FPGA card is trusted by virtue of the fact that the customer
> >>>> purchased it and physically placed it in the machine. If the FPGA card
> >>>> (or the lower level driver) validates the image, then why does the
> >>>> Class driver need to be concerned about the file type? I think the
> >>>> purpose of the class driver is primarily to provide a common API and
> >>>> perform common functions so that they don't have to be replicated
> >>>> among similar low-level drivers. It is up to the low-level driver
> >>>> or the device itself to ensure that the data received is acceptable.
> >>>>
> >>>> If the card receives data through the fpga-mgr upload facility that
> >>>> isn't strictly a firmware update, and if the lower-level driver or
> >>>> the card handles it and returns appropriate status - is that really
> >>>> a problem?
> >>>>>> As you have noted, most current cases result in a change to the
> >>>>>> card, and I suspect that it will remain that way. But that can't be
> >>>>>> guaranteed, and I'm not convinced that a host driver needs to be
> >>>>>> concerned about it.
> >>>>> A host driver should know what is done, in some abstraction level.
> >>>>> I think updating the persistent storage is an acceptable abstraction
> >>>>> in FPGA domain, so I'd like to extend it in FPGA manager. But doing
> >>>>> anything according to the image is not.
> >>>> By host driver, do you mean the class driver? Or the lower-level device
> >>>> driver?
> >>> The class driver.
> >>>
> >>>> It seems to me that you are saying that self-describing images are not
> >>>> acceptable? Or at least they are not acceptable payload for an FPGA
> >>>> Manager firmware-update API?
> >>> For N3000, we are working on the persistent storage update APIs, which is
> >>> a much simpler working model, no runtime device change, and needs no
> >>> device removal & re-enumeration.
> >>>
> >>> But if you need to extend something more that would potentially changes
> >>> the behavior of the running devices on FPGA, device removal &
> >>> re-enumeration are needed so that the system knows what devices are
> >>> changed.
> >>>
> >>>> The FPGA Image Load Framework was designed with the concept of
> >>>> transferring data to a device without imposing a purpose on the data.
> >>>> The expectation is that the lower-level driver or the device will
> >>>> validate the data. Is there something fundamentally wrong with that
> >>> I think there is something wrong here. As I said before, persistent
> >>> storage updating has different software process from some runtime
> >>> updating, so the class driver should be aware of what the HW engine
> >>> is doing.
> >> So far, there are no self-describing images that cause a
> >> change in run-time behavior, and I don't think that will
> >> happen for the very reason that the class-driver would
> >> need to know about it.
> > Again, the class driver needs to know what is happening, at some
> > abstraction level, to ensure the system is aligned with the HW state.
> >
> > If the class driver cannot tell the detail, it has to assume the
> > whole FPGA region will be changed, and removal & re-enumeration is
> > needed.
> So we make it a requirement that the self-describing files
> cannot make changes that require the class driver to manage
> state.

The API should not only define what it won't do, but also define what
it will do. But the "image load" just specifies the top half of the
process. So I don't think this API would be accepted.

> >
> >> When I asserted that not all self-describing images are
> >> changing firmware, I did not mean to imply that they change
> >> run-time behavior; they do not. They are part of a multi-
> >> step update of firmware. However, looking at each part of
> >> the sequence independently, there is at least one instance
> >> where a certificate is stored in RAM for temporary use.
> >> When the driver exits from this call, there is no firmware
> >> change. There is also no change in run-time behavior.
> >>
> >> I'm thinking we could have different IOCTLs:
> >>
> >> (1) firmware  update (address, size, purpose provided
> >>     with the image)
> > Will the firmware update use the self-describing files?
> The firmware update option would be for files that
> are not self-describing.
> >
> >> (2) image upload (self-describing files)
> > If both 1 & 2 use self-describing files, how the class driver verifies
> > the type of the file without looking into the file?
> Only 2 would use self-describing files.

Which IOCTL the N3000 flash secure update will use?

Thanks,
Yilun

> 
> - Russ
> >
> > For example, if a user calls a firmware update API but inputs an image
> > upload file, will the class driver block the call? How?
> >
> >> (3) image validation
> >>
> >> These would all use most of the same code, but the FPGA
> >> Manager flags and structure fields would be set differently.
> > This is good to me.
> >
> > Thanks,
> > Yilun
> >
> >> - Russ
> >>> Thanks,
> >>> Yilun
> >>>
> >>>> approach? And if not, why couldn't we incorporate a similar image_load
> >>>> API into the FPGA Manager?
> >>>>
> >>>> - Russ
> >>>>
> >>>>> Thanks,
> >>>>> Yilun
> >>>>>
> >>>>>> - Russ
> >>>>>>
> >>>>>>> Thanks,
> >>>>>>> Yilun
> >>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> - Russ
> >>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Yilun
> >>>>>>>>>
> >>>>>>>>>>> while
> >>>>>>>>>>> the fpga-mgr knows the data will be stored in FPGA cells. They may need
> >>>>>>>>>>> to know the exact physical position to store, may not, depends on what the
> >>>>>>>>>>> HW engines are.
> >>>>>>>>>>>
> >>>>>>>>>>>> This functionality could, of course, be merged into the fpga-mgr. I did
> >>>>>>>>>>>> a proof of concept of this a while back and we discussed the pros and cons.
> >>>>>>>>>>>> See this email for a recap:
> >>>>>>>>>>>>
> >>>>>>>>>>>> https://marc.info/?l=linux-fpga&m=161998085507374&w=2
> >>>>>>>>>>>>
> >>>>>>>>>>>> Things have changed some with the evolution of the driver. The IOCTL
> >>>>>>>>>>>> approach probably fits better than the sysfs implementation. At the time
> >>>>>>>>>>>> it seemed that a merge would add unnecessary complexity without adding value.
> >>>>>>>>>>> I think at least developers don't have to go through 2 sets of software
> >>>>>>>>>>> stacks which are of the same concept. And adding some new features like
> >>>>>>>>>>> optionally threading or canceling data transfer are also good to FPGA
> >>>>>>>>>>> region update. And the nvmem update could also be benifit from exsiting
> >>>>>>>>>>> implementations like scatter-gather buffers, in-kernel firmware loading.
> >>>>>>>>>>>
> >>>>>>>>>>> I try to explain myself according to each of your concern from that mail
> >>>>>>>>>>> thread:
> >>>>>>>>>>>
> >>>>>>>>>>> Purpose of the 2 updates
> >>>>>>>>>>> ========================
> >>>>>>>>>>>
> >>>>>>>>>>>   As I said before, I think they are both data transfer devices. FPGA
> >>>>>>>>>>> region update gets extra support from fpga-region & fpga-bridge, and
> >>>>>>>>>>> FPGA nvmem update could be a standalone fpga-mgr.
> >>>>>>>>>>>
> >>>>>>>>>>> Extra APIs that are unique to nvmem update
> >>>>>>>>>>> ==========================================
> >>>>>>>>>>>
> >>>>>>>>>>>   cdev APIs for nvmem update:
> >>>>>>>>>>>     Yes we need to expand the functionality so we need them.
> >>>>>>>>>>>
> >>>>>>>>>>>   available_images, image_load APIs for loading nvmem content to FPGA
> >>>>>>>>>>>   region:
> >>>>>>>>>>>     These are features in later patchsets which are not submitted, but we
> >>>>>>>>>>>     could talk about it in advance. I think this is actually a FPGA region
> >>>>>>>>>>>     update from nvmem, it also requires gating, data loading (no SW transfer)
> >>>>>>>>>>>     and re-enumeration, or a single command to image_load HW may result system
> >>>>>>>>>>>     disordered. The FPGA framework now only supports update from in-kernel
> >>>>>>>>>>>     user data, maybe we add support for update from nvmem later.
> >>>>>>>>>>>
> >>>>>>>>>>>   fpga-mgr state extend:
> >>>>>>>>>>>     I think it could be extended, The current states are not perfect,
> >>>>>>>>>>>     adding something like IDLE or READY is just fine.
> >>>>>>>>>>>
> >>>>>>>>>>>   fpga-mgr status extend:
> >>>>>>>>>>>     Add general error definitions as needed. If there is some status
> >>>>>>>>>>>     that is quite vendor specific, expose it in low-level driver.
> >>>>>>>>>>>
> >>>>>>>>>>>   remaining-size:
> >>>>>>>>>>>     Nice to have for all.
> >>>>>>>>>>>
> >>>>>>>>>>> Threading the update
> >>>>>>>>>>> ====================
> >>>>>>>>>>>
> >>>>>>>>>>>   Also a good option for FPGA region update, maybe we also have a slow FPGA
> >>>>>>>>>>>   reprogrammer?
> >>>>>>>>>>>
> >>>>>>>>>>> Cancelling the update
> >>>>>>>>>>> ====================
> >>>>>>>>>>>
> >>>>>>>>>>>   Also a good option for FPGA region update if HW engine supports.
> >>>>>>>>>> These are all good points.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> - Russ
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Yilun
> >>>>>>>>>>>
> >>>>>>>>>>>>>>>> According to the patchset, the basic workflow of the 2 update types are
> >>>>>>>>>>>>>>>> quite similar, get the data, prepare for the HW, write and complete.
> >>>>>>>>>>>>>>>> They are already implemented in FPGA manager. We've discussed some
> >>>>>>>>>>>>>>>> differences like threading or canceling the update, which are
> >>>>>>>>>>>>>>>> not provided by FPGA manager but they may also nice to have for FPGA
> >>>>>>>>>>>>>>>> region update. An FPGA region update may also last for a long time??
> >>>>>>>>>>>>>>>> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> My quick mind is that we add some flags in struct fpga_mgr & struct
> >>>>>>>>>>>>>>>> fpga_image_info to indicate the HW capability (support FPGA region
> >>>>>>>>>>>>>>>> update or nvmem update or both) of the download engine and the provided
> >>>>>>>>>>>>>>>> image type. Then the low-level driver knows how to download if it
> >>>>>>>>>>>>>>>> supports both image types.An char device could be added for each fpga manager dev, providing the
> >>>>>>>>>>>>>>>> user APIs for nvmem update. We may not use the char dev for FPGA region
> >>>>>>>>>>>>>>>> update cause it changes the system HW devices and needs device hotplug
> >>>>>>>>>>>>>>>> in FPGA region. We'd better leave it to FPGA region class, this is
> >>>>>>>>>>>>>>>> another topic.
> >>>>>>>>>>>> I'll give this some more thought and see if I can come up with some RFC
> >>>>>>>>>>>> patches.
> >>>>>>>>>>>>
> >>>>>>>>>>>> - Russ
> >>>>>>>>>>>>>>>> More discussion is appreciated.
> >>>>>>>>>>>>>>> I also think fpga_mgr could be extended.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> In this patchset,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> A second, similar set of write ops was added to fpga_manger_ops,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> new bit/flag was added to fpga_image_info
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The intent was for dfl to add their specific ops to cover what is done in
> >>>>>>>>>>>>>>> this patchset.
> >>>>>>>>>>>>>> I think we don't have to add 2 ops for reconfig & reimage in framework,
> >>>>>>>>>>>>>> the 2 processes are almost the same.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Just add the _REIMAGE (or something else, NVMEM?) flag for
> >>>>>>>>>>>>>> fpga_image_info, and low level drivers handle it as they do for other
> >>>>>>>>>>>>>> flags.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> How do you think?
> >>>>>>>>>>>>> A single set is fine.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> A difficult part of is the length of  time to do the write. The existing write should be improved to use a worker thread.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Tom
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>> Yilun
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Any other driver would do similar.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Is this close to what you are thinking ?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Tom
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>> Yilun
> >>>>>>>>>>>>>>>>

Russ Weight Oct. 20, 2021, 4:27 p.m. UTC | #20

On 10/19/21 6:16 PM, Xu Yilun wrote:
> On Tue, Oct 19, 2021 at 08:09:46AM -0700, Russ Weight wrote:
>>
>> On 10/18/21 7:53 PM, Xu Yilun wrote:
>>> On Mon, Oct 18, 2021 at 09:24:08AM -0700, Russ Weight wrote:
>>>> On 10/18/21 1:13 AM, Xu Yilun wrote:
>>>>> On Fri, Oct 15, 2021 at 10:34:23AM -0700, Russ Weight wrote:
>>>>>> On 10/14/21 7:51 PM, Xu Yilun wrote:
>>>>>>> On Thu, Oct 14, 2021 at 09:32:53AM -0700, Russ Weight wrote:
>>>>>>>> On 10/13/21 6:49 PM, Xu Yilun wrote:
>>>>>>>>> On Wed, Oct 13, 2021 at 11:09:08AM -0700, Russ Weight wrote:
>>>>>>>>>> On 10/12/21 6:06 PM, Xu Yilun wrote:
>>>>>>>>>>> On Tue, Oct 12, 2021 at 10:20:15AM -0700, Russ Weight wrote:
>>>>>>>>>>>> On 10/12/21 12:47 AM, Xu Yilun wrote:
>>>>>>>>>>>>> On Mon, Oct 11, 2021 at 06:00:16PM -0700, Russ Weight wrote:
>>>>>>>>>>>>>> On 10/11/21 5:35 AM, Tom Rix wrote:
>>>>>>>>>>>>>>> On 10/10/21 6:41 PM, Xu Yilun wrote:
>>>>>>>>>>>>>>>> On Sat, Oct 09, 2021 at 05:11:20AM -0700, Tom Rix wrote:
>>>>>>>>>>>>>>>>> On 10/9/21 1:08 AM, Xu Yilun wrote:
>>>>>>>>>>>>>>>>>> On Wed, Sep 29, 2021 at 04:00:20PM -0700, Russ Weight wrote:
>>>>>>>>>>>>>>>>>>> The FPGA Image Load framework provides an API to upload image
>>>>>>>>>>>>>>>>>>> files to an FPGA device. Image files are self-describing. They could
>>>>>>>>>>>>>>>>>>> contain FPGA images, BMC images, Root Entry Hashes, or other device
>>>>>>>>>>>>>>>>>>> specific files. It is up to the lower-level device driver and the
>>>>>>>>>>>>>>>>>>> target device to authenticate and disposition the file data.
>>>>>>>>>>>>>>>>>> I've reconsider the FPGA persistent image update again, and think we
>>>>>>>>>>>>>>>>>> may include it in FPGA manager framework.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Sorry I raised this topic again when it is already at patch v17, but now
>>>>>>>>>>>>>>>>>> I need to consider more seriously than before.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> We have consensus the FPGA persistent image update is just like a normal
>>>>>>>>>>>>>>>>>> firmware update which finally writes the nvmem like flash or eeprom,
>>>>>>>>>>>>>>>>>> while the current FPGA manager deals with the active FPGA region update
>>>>>>>>>>>>>>>>>> and re-activation. Could we just expand the FPGA manager and let it handle
>>>>>>>>>>>>>>>>>> the nvmem update as well? Many FPGA cards have nvmem and downloaders
>>>>>>>>>>>>>>>>>> supports updating both FPGA region and nvmem.
>>>>>>>>>>>>>> The fpga-image-load driver is actually just a data transfer. The class
>>>>>>>>>>>>> IMHO, The fpga-mgr dev is also a data transfer. The fpga-region dev is
>>>>>>>>>>>>> acting as the FPGA region admin responsible for gating, transfering and
>>>>>>>>>>>>> re-enumerating.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So my opinion is to add a new data transfer type and keep a unified process.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> driver has no knowledge about what the data is or where/if the data will
>>>>>>>>>>>>>> be stored.
>>>>>>>>>>>>> The fpga-image-load driver knows the data will be stored in nvmem,
>>>>>>>>>>>> FYI: This is not strictly correct. In a coming product there is a
>>>>>>>>>>>> case where the data will be stored in RAM. Richard Gong was also
>>>>>>>>>>>> looking to use this driver to validate an image without programming
>>>>>>>>>>>> or storing it. The fpga-image-load driver has no expectation that
>>>>>>>>>>>> the data will be stored in nvmem, or even that it will be stored
>>>>>>>>>>>> at all.
>>>>>>>>>>> OK, we can discuss that use case then. But fundamentally a driver should
>>>>>>>>>>> be clear what it is doing.
>>>>>>>>>> The lower-level driver is, of course, clear what it is doing. And the
>>>>>>>>>> FPGA Image Load Framework simply provides a consistent API and manages
>>>>>>>>>> a potentially long-running data transfer in the context of a kernel
>>>>>>>>>> worker thread.
>>>>>>>>>>
>>>>>>>>>> It sounds like you are saying that that is not "clear enough" in the
>>>>>>>>>> context of the FPGA Manager?
>>>>>>>>>>
>>>>>>>>>> The files that are used with Intel PAC devices are self-describing. The
>>>>>>>>>> user-space tools, the class driver and the lower-level driver just pass
>>>>>>>>>> the data through to the card BMC without any knowledge of the content,
>>>>>>>>>> purpose or final destination of the data.
>>>>>>>>>>
>>>>>>>>>> The card BMC will receive signed data, validate it, and process it as a
>>>>>>>>>> BMC image, an FPGA image, a Root Entry Hash, or a key cancellation. In
>>>>>>>>> I category all these actions as firmware update fully or partially on
>>>>>>>>> persistent storage. The FPGA Manager don't have to know the meaning of
>>>>>>>>> every byte on flash, but it should be aware the firmware is updated and
>>>>>>>>> the card may acts differently with a new firmware. This is the common
>>>>>>>>> working model for most of the FPGA cards so that we implement it in FPGA
>>>>>>>>> manager class. 
>>>>>>>>>
>>>>>>>>>> the n6000, it could also be part of a multi-step process for programming
>>>>>>>>>> SDM keys and the data may not be stored permanently.
>>>>>>>>> This is new to me, but seems to be different from firmware update, so lets
>>>>>>>>> think about it again.
>>>>>>>>>
>>>>>>>>>>> You may try to extend the FPGA framework to
>>>>>>>>>>> support nvmem storage, or image validation, but cannot say we feed the
>>>>>>>>>>> data to any engine undefined by the framework.
>>>>>>>>>> I'm not sure what you mean by "feed the data to any engine undefined by the
>>>>>>>>>> framework". I think the "engine" is the lower level driver/device that invokes
>>>>>>>>>> the fpga_mgr. The lower level driver, of course, is clear what it is doing.
>>>>>>>>>> The fpga_mgr cannot control what driver invokes it.
>>>>>>>>>>
>>>>>>>>>> Are saying that when invoking the fpga-mgr, that it _must_ also pass descriptive
>>>>>>>>>> data. Meaning that a self-describing file alone is not acceptable?
>>>>>>>>> The class driver should define a reasonable working model and APIs.
>>>>>>>>> Updating the FPGA backup storage is good to me. But receiving a mystery
>>>>>>>>> box and do whatever it requires is not.
>>>>>>>>>
>>>>>>>>> Self-describing file is OK, encryption is OK, but either the class
>>>>>>>>> driver itself, or with the help of the low level driver, should make
>>>>>>>>> sure it works within its scope.
>>>>>>>> In our secure update process, the card BMC firmware authenticates
>>>>>>>> the data using the root entry hashes and will either reject the
>>>>>>>> data or perform some function based on the contents. Neither the
>>>>>>>> user-space, the class driver, nor the lower level driver know
>>>>>>>> what the contents are. It _is_ a "mystery box" to them. How do we
>>>>>>>> verify scope in this model?
>>>>>>> I think we need to find out how. One case is, the HW is designed to have
>>>>>>> one single function, such as firmware update, then any image input
>>>>>>> through firmware update API is within expectation, and the driver
>>>>>>> should only serve the firmware update API. I think this is how the
>>>>>>> N3000 is working now. If the HW is for another function, register itself
>>>>>>> to serve another API, or another class driver.
>>>>>>>
>>>>>>> Another case is, the HW could do multiple types of tasks depending on
>>>>>>> the content of the image, such as firmware update, image verification,
>>>>>>> or assumably power off the card ... There should be some mechanism for
>>>>>>> the driver to only accept the right image according to what API is called.
>>>>>>> Or the user may input an image named update_the_card.img through
>>>>>>> firmware update API and finally get the card off. Having some headers
>>>>>>> readable by host for the operation type? Or, some HW interface for host
>>>>>>> to apply the operation type as well as the image, let the HW verify?
>>>>>>> Let's think about it.
>>>>>> I'm not sure if I am following your thinking here. The HW, of course,
>>>>>> verifies authentication of the image and acts according to the image
>>>>>> type. I don't think it is reasonable to require the class driver to
>>>>>> interpret the data to determine what it is. That implies that the
>>>>>> class driver would have to know the header format and possible values.
>>>>>> It also means that changes to the header format would require patches
>>>>>> to the class driver.
>>>>>>
>>>>>> The FPGA card is trusted by virtue of the fact that the customer
>>>>>> purchased it and physically placed it in the machine. If the FPGA card
>>>>>> (or the lower level driver) validates the image, then why does the
>>>>>> Class driver need to be concerned about the file type? I think the
>>>>>> purpose of the class driver is primarily to provide a common API and
>>>>>> perform common functions so that they don't have to be replicated
>>>>>> among similar low-level drivers. It is up to the low-level driver
>>>>>> or the device itself to ensure that the data received is acceptable.
>>>>>>
>>>>>> If the card receives data through the fpga-mgr upload facility that
>>>>>> isn't strictly a firmware update, and if the lower-level driver or
>>>>>> the card handles it and returns appropriate status - is that really
>>>>>> a problem?
>>>>>>>> As you have noted, most current cases result in a change to the
>>>>>>>> card, and I suspect that it will remain that way. But that can't be
>>>>>>>> guaranteed, and I'm not convinced that a host driver needs to be
>>>>>>>> concerned about it.
>>>>>>> A host driver should know what is done, in some abstraction level.
>>>>>>> I think updating the persistent storage is an acceptable abstraction
>>>>>>> in FPGA domain, so I'd like to extend it in FPGA manager. But doing
>>>>>>> anything according to the image is not.
>>>>>> By host driver, do you mean the class driver? Or the lower-level device
>>>>>> driver?
>>>>> The class driver.
>>>>>
>>>>>> It seems to me that you are saying that self-describing images are not
>>>>>> acceptable? Or at least they are not acceptable payload for an FPGA
>>>>>> Manager firmware-update API?
>>>>> For N3000, we are working on the persistent storage update APIs, which is
>>>>> a much simpler working model, no runtime device change, and needs no
>>>>> device removal & re-enumeration.
>>>>>
>>>>> But if you need to extend something more that would potentially changes
>>>>> the behavior of the running devices on FPGA, device removal &
>>>>> re-enumeration are needed so that the system knows what devices are
>>>>> changed.
>>>>>
>>>>>> The FPGA Image Load Framework was designed with the concept of
>>>>>> transferring data to a device without imposing a purpose on the data.
>>>>>> The expectation is that the lower-level driver or the device will
>>>>>> validate the data. Is there something fundamentally wrong with that
>>>>> I think there is something wrong here. As I said before, persistent
>>>>> storage updating has different software process from some runtime
>>>>> updating, so the class driver should be aware of what the HW engine
>>>>> is doing.
>>>> So far, there are no self-describing images that cause a
>>>> change in run-time behavior, and I don't think that will
>>>> happen for the very reason that the class-driver would
>>>> need to know about it.
>>> Again, the class driver needs to know what is happening, at some
>>> abstraction level, to ensure the system is aligned with the HW state.
>>>
>>> If the class driver cannot tell the detail, it has to assume the
>>> whole FPGA region will be changed, and removal & re-enumeration is
>>> needed.
>> So we make it a requirement that the self-describing files
>> cannot make changes that require the class driver to manage
>> state.
> The API should not only define what it won't do, but also define what
> it will do. But the "image load" just specifies the top half of the
> process. So I don't think this API would be accepted.
So what is the path forward. It seems like you are saying
that the self-describing files do not fit in the fpga-mgr.
Can we reconsider the FPGA Image Load Framework, which does
not make any assumptions about the contents of the image
files?

>
>>>> When I asserted that not all self-describing images are
>>>> changing firmware, I did not mean to imply that they change
>>>> run-time behavior; they do not. They are part of a multi-
>>>> step update of firmware. However, looking at each part of
>>>> the sequence independently, there is at least one instance
>>>> where a certificate is stored in RAM for temporary use.
>>>> When the driver exits from this call, there is no firmware
>>>> change. There is also no change in run-time behavior.
>>>>
>>>> I'm thinking we could have different IOCTLs:
>>>>
>>>> (1) firmware  update (address, size, purpose provided
>>>>     with the image)
>>> Will the firmware update use the self-describing files?
>> The firmware update option would be for files that
>> are not self-describing.
>>>> (2) image upload (self-describing files)
>>> If both 1 & 2 use self-describing files, how the class driver verifies
>>> the type of the file without looking into the file?
>> Only 2 would use self-describing files.
> Which IOCTL the N3000 flash secure update will use?
It would use 2. All of our image update files are self-describing.

- Russ
>
> Thanks,
> Yilun
>
>> - Russ
>>> For example, if a user calls a firmware update API but inputs an image
>>> upload file, will the class driver block the call? How?
>>>
>>>> (3) image validation
>>>>
>>>> These would all use most of the same code, but the FPGA
>>>> Manager flags and structure fields would be set differently.
>>> This is good to me.
>>>
>>> Thanks,
>>> Yilun
>>>
>>>> - Russ
>>>>> Thanks,
>>>>> Yilun
>>>>>
>>>>>> approach? And if not, why couldn't we incorporate a similar image_load
>>>>>> API into the FPGA Manager?
>>>>>>
>>>>>> - Russ
>>>>>>
>>>>>>> Thanks,
>>>>>>> Yilun
>>>>>>>
>>>>>>>> - Russ
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Yilun
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> - Russ
>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Yilun
>>>>>>>>>>>
>>>>>>>>>>>>> while
>>>>>>>>>>>>> the fpga-mgr knows the data will be stored in FPGA cells. They may need
>>>>>>>>>>>>> to know the exact physical position to store, may not, depends on what the
>>>>>>>>>>>>> HW engines are.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> This functionality could, of course, be merged into the fpga-mgr. I did
>>>>>>>>>>>>>> a proof of concept of this a while back and we discussed the pros and cons.
>>>>>>>>>>>>>> See this email for a recap:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://marc.info/?l=linux-fpga&m=161998085507374&w=2
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Things have changed some with the evolution of the driver. The IOCTL
>>>>>>>>>>>>>> approach probably fits better than the sysfs implementation. At the time
>>>>>>>>>>>>>> it seemed that a merge would add unnecessary complexity without adding value.
>>>>>>>>>>>>> I think at least developers don't have to go through 2 sets of software
>>>>>>>>>>>>> stacks which are of the same concept. And adding some new features like
>>>>>>>>>>>>> optionally threading or canceling data transfer are also good to FPGA
>>>>>>>>>>>>> region update. And the nvmem update could also be benifit from exsiting
>>>>>>>>>>>>> implementations like scatter-gather buffers, in-kernel firmware loading.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I try to explain myself according to each of your concern from that mail
>>>>>>>>>>>>> thread:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Purpose of the 2 updates
>>>>>>>>>>>>> ========================
>>>>>>>>>>>>>
>>>>>>>>>>>>>   As I said before, I think they are both data transfer devices. FPGA
>>>>>>>>>>>>> region update gets extra support from fpga-region & fpga-bridge, and
>>>>>>>>>>>>> FPGA nvmem update could be a standalone fpga-mgr.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Extra APIs that are unique to nvmem update
>>>>>>>>>>>>> ==========================================
>>>>>>>>>>>>>
>>>>>>>>>>>>>   cdev APIs for nvmem update:
>>>>>>>>>>>>>     Yes we need to expand the functionality so we need them.
>>>>>>>>>>>>>
>>>>>>>>>>>>>   available_images, image_load APIs for loading nvmem content to FPGA
>>>>>>>>>>>>>   region:
>>>>>>>>>>>>>     These are features in later patchsets which are not submitted, but we
>>>>>>>>>>>>>     could talk about it in advance. I think this is actually a FPGA region
>>>>>>>>>>>>>     update from nvmem, it also requires gating, data loading (no SW transfer)
>>>>>>>>>>>>>     and re-enumeration, or a single command to image_load HW may result system
>>>>>>>>>>>>>     disordered. The FPGA framework now only supports update from in-kernel
>>>>>>>>>>>>>     user data, maybe we add support for update from nvmem later.
>>>>>>>>>>>>>
>>>>>>>>>>>>>   fpga-mgr state extend:
>>>>>>>>>>>>>     I think it could be extended, The current states are not perfect,
>>>>>>>>>>>>>     adding something like IDLE or READY is just fine.
>>>>>>>>>>>>>
>>>>>>>>>>>>>   fpga-mgr status extend:
>>>>>>>>>>>>>     Add general error definitions as needed. If there is some status
>>>>>>>>>>>>>     that is quite vendor specific, expose it in low-level driver.
>>>>>>>>>>>>>
>>>>>>>>>>>>>   remaining-size:
>>>>>>>>>>>>>     Nice to have for all.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Threading the update
>>>>>>>>>>>>> ====================
>>>>>>>>>>>>>
>>>>>>>>>>>>>   Also a good option for FPGA region update, maybe we also have a slow FPGA
>>>>>>>>>>>>>   reprogrammer?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cancelling the update
>>>>>>>>>>>>> ====================
>>>>>>>>>>>>>
>>>>>>>>>>>>>   Also a good option for FPGA region update if HW engine supports.
>>>>>>>>>>>> These are all good points.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> - Russ
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Yilun
>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> According to the patchset, the basic workflow of the 2 update types are
>>>>>>>>>>>>>>>>>> quite similar, get the data, prepare for the HW, write and complete.
>>>>>>>>>>>>>>>>>> They are already implemented in FPGA manager. We've discussed some
>>>>>>>>>>>>>>>>>> differences like threading or canceling the update, which are
>>>>>>>>>>>>>>>>>> not provided by FPGA manager but they may also nice to have for FPGA
>>>>>>>>>>>>>>>>>> region update. An FPGA region update may also last for a long time??
>>>>>>>>>>>>>>>>>> So I think having 2 sets of similar frameworks in FPGA is unnecessary.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> My quick mind is that we add some flags in struct fpga_mgr & struct
>>>>>>>>>>>>>>>>>> fpga_image_info to indicate the HW capability (support FPGA region
>>>>>>>>>>>>>>>>>> update or nvmem update or both) of the download engine and the provided
>>>>>>>>>>>>>>>>>> image type. Then the low-level driver knows how to download if it
>>>>>>>>>>>>>>>>>> supports both image types.An char device could be added for each fpga manager dev, providing the
>>>>>>>>>>>>>>>>>> user APIs for nvmem update. We may not use the char dev for FPGA region
>>>>>>>>>>>>>>>>>> update cause it changes the system HW devices and needs device hotplug
>>>>>>>>>>>>>>>>>> in FPGA region. We'd better leave it to FPGA region class, this is
>>>>>>>>>>>>>>>>>> another topic.
>>>>>>>>>>>>>> I'll give this some more thought and see if I can come up with some RFC
>>>>>>>>>>>>>> patches.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - Russ
>>>>>>>>>>>>>>>>>> More discussion is appreciated.
>>>>>>>>>>>>>>>>> I also think fpga_mgr could be extended.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In this patchset,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://lore.kernel.org/linux-fpga/20210625195849.837976-1-trix@redhat.com/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> A second, similar set of write ops was added to fpga_manger_ops,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> new bit/flag was added to fpga_image_info
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The intent was for dfl to add their specific ops to cover what is done in
>>>>>>>>>>>>>>>>> this patchset.
>>>>>>>>>>>>>>>> I think we don't have to add 2 ops for reconfig & reimage in framework,
>>>>>>>>>>>>>>>> the 2 processes are almost the same.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Just add the _REIMAGE (or something else, NVMEM?) flag for
>>>>>>>>>>>>>>>> fpga_image_info, and low level drivers handle it as they do for other
>>>>>>>>>>>>>>>> flags.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> How do you think?
>>>>>>>>>>>>>>> A single set is fine.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> A difficult part of is the length of  time to do the write. The existing write should be improved to use a worker thread.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Tom
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Yilun
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Any other driver would do similar.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Is this close to what you are thinking ?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Tom
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Yilun
>>>>>>>>>>>>>>>>>>

Wu, Hao Oct. 26, 2021, 6:45 a.m. UTC | #21

> >>>>>> The FPGA Image Load Framework was designed with the concept of
> >>>>>> transferring data to a device without imposing a purpose on the data.
> >>>>>> The expectation is that the lower-level driver or the device will
> >>>>>> validate the data. Is there something fundamentally wrong with that
> >>>>> I think there is something wrong here. As I said before, persistent
> >>>>> storage updating has different software process from some runtime
> >>>>> updating, so the class driver should be aware of what the HW engine
> >>>>> is doing.
> >>>> So far, there are no self-describing images that cause a
> >>>> change in run-time behavior, and I don't think that will
> >>>> happen for the very reason that the class-driver would
> >>>> need to know about it.
> >>> Again, the class driver needs to know what is happening, at some
> >>> abstraction level, to ensure the system is aligned with the HW state.
> >>>
> >>> If the class driver cannot tell the detail, it has to assume the
> >>> whole FPGA region will be changed, and removal & re-enumeration is
> >>> needed.
> >> So we make it a requirement that the self-describing files
> >> cannot make changes that require the class driver to manage
> >> state.
> > The API should not only define what it won't do, but also define what
> > it will do. But the "image load" just specifies the top half of the
> > process. So I don't think this API would be accepted.
> So what is the path forward. It seems like you are saying
> that the self-describing files do not fit in the fpga-mgr.
> Can we reconsider the FPGA Image Load Framework, which does
> not make any assumptions about the contents of the image
> files?

Why we need such "generic data transfer" interface in FPGA
framework? we need to handle the common need for FPGA
devices only, not all devices, like programming FPGA images.
So far we even don't know, what's the hardware response on
these self-describing files, how we define it as a common need
interface in the framework? If you just want to reuse the
fpga-mgr/framework code for your own purpose, Yes, it seems
saving some code for you, but finally it loses flexibility, as it's
not possible to extend common framework for your own
purpose in the future.
 
Thanks
Hao

Russ Weight Oct. 26, 2021, 5:41 p.m. UTC | #22

On 10/25/21 11:45 PM, Wu, Hao wrote:
>>>>>>>> The FPGA Image Load Framework was designed with the concept of
>>>>>>>> transferring data to a device without imposing a purpose on the data.
>>>>>>>> The expectation is that the lower-level driver or the device will
>>>>>>>> validate the data. Is there something fundamentally wrong with that
>>>>>>> I think there is something wrong here. As I said before, persistent
>>>>>>> storage updating has different software process from some runtime
>>>>>>> updating, so the class driver should be aware of what the HW engine
>>>>>>> is doing.
>>>>>> So far, there are no self-describing images that cause a
>>>>>> change in run-time behavior, and I don't think that will
>>>>>> happen for the very reason that the class-driver would
>>>>>> need to know about it.
>>>>> Again, the class driver needs to know what is happening, at some
>>>>> abstraction level, to ensure the system is aligned with the HW state.
>>>>>
>>>>> If the class driver cannot tell the detail, it has to assume the
>>>>> whole FPGA region will be changed, and removal & re-enumeration is
>>>>> needed.
>>>> So we make it a requirement that the self-describing files
>>>> cannot make changes that require the class driver to manage
>>>> state.
>>> The API should not only define what it won't do, but also define what
>>> it will do. But the "image load" just specifies the top half of the
>>> process. So I don't think this API would be accepted.
>> So what is the path forward. It seems like you are saying
>> that the self-describing files do not fit in the fpga-mgr.
>> Can we reconsider the FPGA Image Load Framework, which does
>> not make any assumptions about the contents of the image
>> files?
> Why we need such "generic data transfer" interface in FPGA
> framework? 
Are you referring to the use of self-describing files?
or the generic nature of this class driver?
> we need to handle the common need for FPGA
> devices only, not all devices, like programming FPGA images.
> So far we even don't know, what's the hardware response on
> these self-describing files, how we define it as a common need
> interface in the framework?
The class driver does not _need_ to reside in the FPGA
framework. I sent an inquiry to the maintainer of the
Firmware update subsystem (and cc'd the kernel mailing list)
and received no responses. I placed it under the FPGA
framework only because the first user of the class driver
is an FPGA driver.

> If you just want to reuse the
> fpga-mgr/framework code for your own purpose, Yes, it seems
> saving some code for you, but finally it loses flexibility, as it's
> not possible to extend common framework for your own
> purpose in the future.
If I understand correctly, you are saying that it doesn't
fit well in the FPGA manager, because not all file types
fit the definition of a firmware update? And future file
types may not fit in fpga-mgr context?

- Russ
>  
> Thanks
> Hao

Wu, Hao Oct. 27, 2021, 3:29 a.m. UTC | #23

> >>> The API should not only define what it won't do, but also define what
> >>> it will do. But the "image load" just specifies the top half of the
> >>> process. So I don't think this API would be accepted.
> >> So what is the path forward. It seems like you are saying
> >> that the self-describing files do not fit in the fpga-mgr.
> >> Can we reconsider the FPGA Image Load Framework, which does
> >> not make any assumptions about the contents of the image
> >> files?
> > Why we need such "generic data transfer" interface in FPGA
> > framework?
> Are you referring to the use of self-describing files?
> or the generic nature of this class driver?

Yes, why this is under FPGA framework? Per your description that
it can be used to transfer any data, e.g. BMC images, some device
specific data (self-describing image?). Let's take this as example,
if FPGA device is replaced with ASIC on N3000, do you still want
to use FPGA image load framework to transfer your device specific
data, e.g. BMC images? I really hope that FPGA framework code only
focus on common usage of FPGA. 

> > we need to handle the common need for FPGA
> > devices only, not all devices, like programming FPGA images.
> > So far we even don't know, what's the hardware response on
> > these self-describing files, how we define it as a common need
> > interface in the framework?
> The class driver does not _need_ to reside in the FPGA
> framework. I sent an inquiry to the maintainer of the
> Firmware update subsystem (and cc'd the kernel mailing list)
> and received no responses. I placed it under the FPGA
> framework only because the first user of the class driver
> is an FPGA driver.
You must have enough justifications why this needs to be included
for everybody not for our own case.

> 
> > If you just want to reuse the
> > fpga-mgr/framework code for your own purpose, Yes, it seems
> > saving some code for you, but finally it loses flexibility, as it's
> > not possible to extend common framework for your own
> > purpose in the future.
> If I understand correctly, you are saying that it doesn't
> fit well in the FPGA manager, because not all file types
> fit the definition of a firmware update? And future file
> types may not fit in fpga-mgr context?

Let's split the use cases, I think the use case that update a persistent
storage for FPGA image, and later use hardware logic (FPGA loader)
to load it into FPGA. This sounds like a common usage for FPGA
devices, so I think this is why Yilun propose to have this part to be
covered by fpga-mgr. But for other cases in your description, e.g.
BMC images, device specific data, self-describing image and etc,
they are out of scope of FPGA.

Actually I don't fully understand why we need to introduce the
"self-describing image" as a common data transfer interface, if
I remember correctly, for N3000, different sub drivers will own
different hardware sub function blocks, why expose such a new
shared communication channel? If "self-describing image" is a
request to one of the sub function block, why not just expose
new interface in such hardware block per modularization? I
have some concern that this new requirement may break
current driver architecture for N3000.

Hao

> 
> - Russ
> >
> > Thanks
> > Hao

Russ Weight Oct. 27, 2021, 3:11 p.m. UTC | #24

On 10/26/21 8:29 PM, Wu, Hao wrote:
>>>>> The API should not only define what it won't do, but also define what
>>>>> it will do. But the "image load" just specifies the top half of the
>>>>> process. So I don't think this API would be accepted.
>>>> So what is the path forward. It seems like you are saying
>>>> that the self-describing files do not fit in the fpga-mgr.
>>>> Can we reconsider the FPGA Image Load Framework, which does
>>>> not make any assumptions about the contents of the image
>>>> files?
>>> Why we need such "generic data transfer" interface in FPGA
>>> framework?
>> Are you referring to the use of self-describing files?
>> or the generic nature of this class driver?
> Yes, why this is under FPGA framework? Per your description that
> it can be used to transfer any data, e.g. BMC images, some device
> specific data (self-describing image?). Let's take this as example,
> if FPGA device is replaced with ASIC on N3000, do you still want
> to use FPGA image load framework to transfer your device specific
> data, e.g. BMC images? I really hope that FPGA framework code only
> focus on common usage of FPGA. 
>
>>> we need to handle the common need for FPGA
>>> devices only, not all devices, like programming FPGA images.
>>> So far we even don't know, what's the hardware response on
>>> these self-describing files, how we define it as a common need
>>> interface in the framework?
>> The class driver does not _need_ to reside in the FPGA
>> framework. I sent an inquiry to the maintainer of the
>> Firmware update subsystem (and cc'd the kernel mailing list)
>> and received no responses. I placed it under the FPGA
>> framework only because the first user of the class driver
>> is an FPGA driver.
> You must have enough justifications why this needs to be included
> for everybody not for our own case.

How do we justify it when there are currently no other known
users? I can go ahead and work up some patches for the firmware
subsystem, if we can resolve the other concerns below.

>>> If you just want to reuse the
>>> fpga-mgr/framework code for your own purpose, Yes, it seems
>>> saving some code for you, but finally it loses flexibility, as it's
>>> not possible to extend common framework for your own
>>> purpose in the future.
>> If I understand correctly, you are saying that it doesn't
>> fit well in the FPGA manager, because not all file types
>> fit the definition of a firmware update? And future file
>> types may not fit in fpga-mgr context?
> Let's split the use cases, I think the use case that update a persistent
> storage for FPGA image, and later use hardware logic (FPGA loader)
> to load it into FPGA. This sounds like a common usage for FPGA
> devices, so I think this is why Yilun propose to have this part to be
> covered by fpga-mgr. But for other cases in your description, e.g.
> BMC images, device specific data, self-describing image and etc,
> they are out of scope of FPGA.

Self-describing files are not something new to us; _ALL_ of the image
files that we send to our FPGA cards, including the N3000 FPGA and BMC
images, root-entry hashes, key cancellations, etc. are self-describing
files. They always have been.


>
> Actually I don't fully understand why we need to introduce the
> "self-describing image" as a common data transfer interface, if
> I remember correctly, for N3000, different sub drivers will own
> different hardware sub function blocks, why expose such a new
> shared communication channel?

There is no change here. The N3000 files are self describing. The
secure-update sub-driver of the MAX10 BMC invokes the class driver,
funnels image data to the BMC, performs handshakes with the BMC,
and ultimately returns status through the class driver. All images
that are sent to the FPGA card follow this same path - and it works
fine.

To try to split out the purposes of each self-describing file to
use different kernel APIs means interfacing multiple class drivers
to the same MAX10 sub-driver. I think it also means replicating
code.

- Russ
> If "self-describing image" is a
> request to one of the sub function block, why not just expose
> new interface in such hardware block per modularization? I
> have some concern that this new requirement may break
> current driver architecture for N3000.
>
> Hao
>
>> - Russ
>>> Thanks
>>> Hao

Tom Rix Oct. 27, 2021, 3:34 p.m. UTC | #25

On 10/27/21 8:11 AM, Russ Weight wrote:
>
> On 10/26/21 8:29 PM, Wu, Hao wrote:
>>>>>> The API should not only define what it won't do, but also define what
>>>>>> it will do. But the "image load" just specifies the top half of the
>>>>>> process. So I don't think this API would be accepted.
>>>>> So what is the path forward. It seems like you are saying
>>>>> that the self-describing files do not fit in the fpga-mgr.
>>>>> Can we reconsider the FPGA Image Load Framework, which does
>>>>> not make any assumptions about the contents of the image
>>>>> files?
>>>> Why we need such "generic data transfer" interface in FPGA
>>>> framework?
>>> Are you referring to the use of self-describing files?
>>> or the generic nature of this class driver?
>> Yes, why this is under FPGA framework? Per your description that
>> it can be used to transfer any data, e.g. BMC images, some device
>> specific data (self-describing image?). Let's take this as example,
>> if FPGA device is replaced with ASIC on N3000, do you still want
>> to use FPGA image load framework to transfer your device specific
>> data, e.g. BMC images? I really hope that FPGA framework code only
>> focus on common usage of FPGA.
>>
>>>> we need to handle the common need for FPGA
>>>> devices only, not all devices, like programming FPGA images.
>>>> So far we even don't know, what's the hardware response on
>>>> these self-describing files, how we define it as a common need
>>>> interface in the framework?
>>> The class driver does not _need_ to reside in the FPGA
>>> framework. I sent an inquiry to the maintainer of the
>>> Firmware update subsystem (and cc'd the kernel mailing list)
>>> and received no responses. I placed it under the FPGA
>>> framework only because the first user of the class driver
>>> is an FPGA driver.
>> You must have enough justifications why this needs to be included
>> for everybody not for our own case.
> How do we justify it when there are currently no other known
> users? I can go ahead and work up some patches for the firmware
> subsystem, if we can resolve the other concerns below.
>
>>>> If you just want to reuse the
>>>> fpga-mgr/framework code for your own purpose, Yes, it seems
>>>> saving some code for you, but finally it loses flexibility, as it's
>>>> not possible to extend common framework for your own
>>>> purpose in the future.
>>> If I understand correctly, you are saying that it doesn't
>>> fit well in the FPGA manager, because not all file types
>>> fit the definition of a firmware update? And future file
>>> types may not fit in fpga-mgr context?
>> Let's split the use cases, I think the use case that update a persistent
>> storage for FPGA image, and later use hardware logic (FPGA loader)
>> to load it into FPGA. This sounds like a common usage for FPGA
>> devices, so I think this is why Yilun propose to have this part to be
>> covered by fpga-mgr. But for other cases in your description, e.g.
>> BMC images, device specific data, self-describing image and etc,
>> they are out of scope of FPGA.
> Self-describing files are not something new to us; _ALL_ of the image
> files that we send to our FPGA cards, including the N3000 FPGA and BMC
> images, root-entry hashes, key cancellations, etc. are self-describing
> files. They always have been.
>
>    
>> Actually I don't fully understand why we need to introduce the
>> "self-describing image" as a common data transfer interface, if
>> I remember correctly, for N3000, different sub drivers will own
>> different hardware sub function blocks, why expose such a new
>> shared communication channel?
> There is no change here. The N3000 files are self describing. The
> secure-update sub-driver of the MAX10 BMC invokes the class driver,
> funnels image data to the BMC, performs handshakes with the BMC,
> and ultimately returns status through the class driver. All images
> that are sent to the FPGA card follow this same path - and it works
> fine.
>
> To try to split out the purposes of each self-describing file to
> use different kernel APIs means interfacing multiple class drivers
> to the same MAX10 sub-driver. I think it also means replicating
> code.

Could the split be ?

add max10 bits mfd/

move image updating out of the kernel and into an uio driver

Tom

>
> - Russ
>> If "self-describing image" is a
>> request to one of the sub function block, why not just expose
>> new interface in such hardware block per modularization? I
>> have some concern that this new requirement may break
>> current driver architecture for N3000.
>>
>> Hao
>>
>>> - Russ
>>>> Thanks
>>>> Hao

Xu Yilun Oct. 28, 2021, 3:09 p.m. UTC | #26

On Wed, Oct 27, 2021 at 08:34:16AM -0700, Tom Rix wrote:
> 
> On 10/27/21 8:11 AM, Russ Weight wrote:
> > 
> > On 10/26/21 8:29 PM, Wu, Hao wrote:
> > > > > > > The API should not only define what it won't do, but also define what
> > > > > > > it will do. But the "image load" just specifies the top half of the
> > > > > > > process. So I don't think this API would be accepted.
> > > > > > So what is the path forward. It seems like you are saying
> > > > > > that the self-describing files do not fit in the fpga-mgr.
> > > > > > Can we reconsider the FPGA Image Load Framework, which does
> > > > > > not make any assumptions about the contents of the image
> > > > > > files?
> > > > > Why we need such "generic data transfer" interface in FPGA
> > > > > framework?
> > > > Are you referring to the use of self-describing files?
> > > > or the generic nature of this class driver?
> > > Yes, why this is under FPGA framework? Per your description that
> > > it can be used to transfer any data, e.g. BMC images, some device
> > > specific data (self-describing image?). Let's take this as example,
> > > if FPGA device is replaced with ASIC on N3000, do you still want
> > > to use FPGA image load framework to transfer your device specific
> > > data, e.g. BMC images? I really hope that FPGA framework code only
> > > focus on common usage of FPGA.
> > > 
> > > > > we need to handle the common need for FPGA
> > > > > devices only, not all devices, like programming FPGA images.
> > > > > So far we even don't know, what's the hardware response on
> > > > > these self-describing files, how we define it as a common need
> > > > > interface in the framework?
> > > > The class driver does not _need_ to reside in the FPGA
> > > > framework. I sent an inquiry to the maintainer of the
> > > > Firmware update subsystem (and cc'd the kernel mailing list)
> > > > and received no responses. I placed it under the FPGA
> > > > framework only because the first user of the class driver
> > > > is an FPGA driver.
> > > You must have enough justifications why this needs to be included
> > > for everybody not for our own case.
> > How do we justify it when there are currently no other known
> > users? I can go ahead and work up some patches for the firmware
> > subsystem, if we can resolve the other concerns below.
> > 
> > > > > If you just want to reuse the
> > > > > fpga-mgr/framework code for your own purpose, Yes, it seems
> > > > > saving some code for you, but finally it loses flexibility, as it's
> > > > > not possible to extend common framework for your own
> > > > > purpose in the future.
> > > > If I understand correctly, you are saying that it doesn't
> > > > fit well in the FPGA manager, because not all file types
> > > > fit the definition of a firmware update? And future file
> > > > types may not fit in fpga-mgr context?
> > > Let's split the use cases, I think the use case that update a persistent
> > > storage for FPGA image, and later use hardware logic (FPGA loader)
> > > to load it into FPGA. This sounds like a common usage for FPGA
> > > devices, so I think this is why Yilun propose to have this part to be
> > > covered by fpga-mgr. But for other cases in your description, e.g.
> > > BMC images, device specific data, self-describing image and etc,
> > > they are out of scope of FPGA.
> > Self-describing files are not something new to us; _ALL_ of the image
> > files that we send to our FPGA cards, including the N3000 FPGA and BMC
> > images, root-entry hashes, key cancellations, etc. are self-describing
> > files. They always have been.
> > 
> > > Actually I don't fully understand why we need to introduce the
> > > "self-describing image" as a common data transfer interface, if
> > > I remember correctly, for N3000, different sub drivers will own
> > > different hardware sub function blocks, why expose such a new
> > > shared communication channel?
> > There is no change here. The N3000 files are self describing. The
> > secure-update sub-driver of the MAX10 BMC invokes the class driver,
> > funnels image data to the BMC, performs handshakes with the BMC,
> > and ultimately returns status through the class driver. All images
> > that are sent to the FPGA card follow this same path - and it works
> > fine.
> > 
> > To try to split out the purposes of each self-describing file to
> > use different kernel APIs means interfacing multiple class drivers
> > to the same MAX10 sub-driver. I think it also means replicating
> > code.
> 
> Could the split be ?
> 
> add max10 bits mfd/
> 
> move image updating out of the kernel and into an uio driver

I'm afraid an uio driver doesn't help in this case. The image updating
is not an independent device, it may dynamically change other hardwares.
So it is better the image updating driver works as an low level driver
which provides services to other feature drivers.

Thanks,
Yilun

> 
> Tom
> 
> > 
> > - Russ
> > > If "self-describing image" is a
> > > request to one of the sub function block, why not just expose
> > > new interface in such hardware block per modularization? I
> > > have some concern that this new requirement may break
> > > current driver architecture for N3000.
> > > 
> > > Hao
> > > 
> > > > - Russ
> > > > > Thanks
> > > > > Hao

Tom Rix Oct. 28, 2021, 4:08 p.m. UTC | #27

On 10/28/21 8:09 AM, Xu Yilun wrote:
> On Wed, Oct 27, 2021 at 08:34:16AM -0700, Tom Rix wrote:
>> On 10/27/21 8:11 AM, Russ Weight wrote:
>>> On 10/26/21 8:29 PM, Wu, Hao wrote:
>>>>>>>> The API should not only define what it won't do, but also define what
>>>>>>>> it will do. But the "image load" just specifies the top half of the
>>>>>>>> process. So I don't think this API would be accepted.
>>>>>>> So what is the path forward. It seems like you are saying
>>>>>>> that the self-describing files do not fit in the fpga-mgr.
>>>>>>> Can we reconsider the FPGA Image Load Framework, which does
>>>>>>> not make any assumptions about the contents of the image
>>>>>>> files?
>>>>>> Why we need such "generic data transfer" interface in FPGA
>>>>>> framework?
>>>>> Are you referring to the use of self-describing files?
>>>>> or the generic nature of this class driver?
>>>> Yes, why this is under FPGA framework? Per your description that
>>>> it can be used to transfer any data, e.g. BMC images, some device
>>>> specific data (self-describing image?). Let's take this as example,
>>>> if FPGA device is replaced with ASIC on N3000, do you still want
>>>> to use FPGA image load framework to transfer your device specific
>>>> data, e.g. BMC images? I really hope that FPGA framework code only
>>>> focus on common usage of FPGA.
>>>>
>>>>>> we need to handle the common need for FPGA
>>>>>> devices only, not all devices, like programming FPGA images.
>>>>>> So far we even don't know, what's the hardware response on
>>>>>> these self-describing files, how we define it as a common need
>>>>>> interface in the framework?
>>>>> The class driver does not _need_ to reside in the FPGA
>>>>> framework. I sent an inquiry to the maintainer of the
>>>>> Firmware update subsystem (and cc'd the kernel mailing list)
>>>>> and received no responses. I placed it under the FPGA
>>>>> framework only because the first user of the class driver
>>>>> is an FPGA driver.
>>>> You must have enough justifications why this needs to be included
>>>> for everybody not for our own case.
>>> How do we justify it when there are currently no other known
>>> users? I can go ahead and work up some patches for the firmware
>>> subsystem, if we can resolve the other concerns below.
>>>
>>>>>> If you just want to reuse the
>>>>>> fpga-mgr/framework code for your own purpose, Yes, it seems
>>>>>> saving some code for you, but finally it loses flexibility, as it's
>>>>>> not possible to extend common framework for your own
>>>>>> purpose in the future.
>>>>> If I understand correctly, you are saying that it doesn't
>>>>> fit well in the FPGA manager, because not all file types
>>>>> fit the definition of a firmware update? And future file
>>>>> types may not fit in fpga-mgr context?
>>>> Let's split the use cases, I think the use case that update a persistent
>>>> storage for FPGA image, and later use hardware logic (FPGA loader)
>>>> to load it into FPGA. This sounds like a common usage for FPGA
>>>> devices, so I think this is why Yilun propose to have this part to be
>>>> covered by fpga-mgr. But for other cases in your description, e.g.
>>>> BMC images, device specific data, self-describing image and etc,
>>>> they are out of scope of FPGA.
>>> Self-describing files are not something new to us; _ALL_ of the image
>>> files that we send to our FPGA cards, including the N3000 FPGA and BMC
>>> images, root-entry hashes, key cancellations, etc. are self-describing
>>> files. They always have been.
>>>
>>>> Actually I don't fully understand why we need to introduce the
>>>> "self-describing image" as a common data transfer interface, if
>>>> I remember correctly, for N3000, different sub drivers will own
>>>> different hardware sub function blocks, why expose such a new
>>>> shared communication channel?
>>> There is no change here. The N3000 files are self describing. The
>>> secure-update sub-driver of the MAX10 BMC invokes the class driver,
>>> funnels image data to the BMC, performs handshakes with the BMC,
>>> and ultimately returns status through the class driver. All images
>>> that are sent to the FPGA card follow this same path - and it works
>>> fine.
>>>
>>> To try to split out the purposes of each self-describing file to
>>> use different kernel APIs means interfacing multiple class drivers
>>> to the same MAX10 sub-driver. I think it also means replicating
>>> code.
>> Could the split be ?
>>
>> add max10 bits mfd/
>>
>> move image updating out of the kernel and into an uio driver
> I'm afraid an uio driver doesn't help in this case. The image updating
> is not an independent device, it may dynamically change other hardwares.
> So it is better the image updating driver works as an low level driver
> which provides services to other feature drivers.

Ok.

Since this is dfl specific could a 'write' op be added to fme_fops ?

Tom

>
> Thanks,
> Yilun
>
>> Tom
>>
>>> - Russ
>>>> If "self-describing image" is a
>>>> request to one of the sub function block, why not just expose
>>>> new interface in such hardware block per modularization? I
>>>> have some concern that this new requirement may break
>>>> current driver architecture for N3000.
>>>>
>>>> Hao
>>>>
>>>>> - Russ
>>>>>> Thanks
>>>>>> Hao

Xu Yilun Oct. 29, 2021, 2:05 a.m. UTC | #28

On Thu, Oct 28, 2021 at 09:08:03AM -0700, Tom Rix wrote:
> 
> On 10/28/21 8:09 AM, Xu Yilun wrote:
> > On Wed, Oct 27, 2021 at 08:34:16AM -0700, Tom Rix wrote:
> > > On 10/27/21 8:11 AM, Russ Weight wrote:
> > > > On 10/26/21 8:29 PM, Wu, Hao wrote:
> > > > > > > > > The API should not only define what it won't do, but also define what
> > > > > > > > > it will do. But the "image load" just specifies the top half of the
> > > > > > > > > process. So I don't think this API would be accepted.
> > > > > > > > So what is the path forward. It seems like you are saying
> > > > > > > > that the self-describing files do not fit in the fpga-mgr.
> > > > > > > > Can we reconsider the FPGA Image Load Framework, which does
> > > > > > > > not make any assumptions about the contents of the image
> > > > > > > > files?
> > > > > > > Why we need such "generic data transfer" interface in FPGA
> > > > > > > framework?
> > > > > > Are you referring to the use of self-describing files?
> > > > > > or the generic nature of this class driver?
> > > > > Yes, why this is under FPGA framework? Per your description that
> > > > > it can be used to transfer any data, e.g. BMC images, some device
> > > > > specific data (self-describing image?). Let's take this as example,
> > > > > if FPGA device is replaced with ASIC on N3000, do you still want
> > > > > to use FPGA image load framework to transfer your device specific
> > > > > data, e.g. BMC images? I really hope that FPGA framework code only
> > > > > focus on common usage of FPGA.
> > > > > 
> > > > > > > we need to handle the common need for FPGA
> > > > > > > devices only, not all devices, like programming FPGA images.
> > > > > > > So far we even don't know, what's the hardware response on
> > > > > > > these self-describing files, how we define it as a common need
> > > > > > > interface in the framework?
> > > > > > The class driver does not _need_ to reside in the FPGA
> > > > > > framework. I sent an inquiry to the maintainer of the
> > > > > > Firmware update subsystem (and cc'd the kernel mailing list)
> > > > > > and received no responses. I placed it under the FPGA
> > > > > > framework only because the first user of the class driver
> > > > > > is an FPGA driver.
> > > > > You must have enough justifications why this needs to be included
> > > > > for everybody not for our own case.
> > > > How do we justify it when there are currently no other known
> > > > users? I can go ahead and work up some patches for the firmware
> > > > subsystem, if we can resolve the other concerns below.
> > > > 
> > > > > > > If you just want to reuse the
> > > > > > > fpga-mgr/framework code for your own purpose, Yes, it seems
> > > > > > > saving some code for you, but finally it loses flexibility, as it's
> > > > > > > not possible to extend common framework for your own
> > > > > > > purpose in the future.
> > > > > > If I understand correctly, you are saying that it doesn't
> > > > > > fit well in the FPGA manager, because not all file types
> > > > > > fit the definition of a firmware update? And future file
> > > > > > types may not fit in fpga-mgr context?
> > > > > Let's split the use cases, I think the use case that update a persistent
> > > > > storage for FPGA image, and later use hardware logic (FPGA loader)
> > > > > to load it into FPGA. This sounds like a common usage for FPGA
> > > > > devices, so I think this is why Yilun propose to have this part to be
> > > > > covered by fpga-mgr. But for other cases in your description, e.g.
> > > > > BMC images, device specific data, self-describing image and etc,
> > > > > they are out of scope of FPGA.
> > > > Self-describing files are not something new to us; _ALL_ of the image
> > > > files that we send to our FPGA cards, including the N3000 FPGA and BMC
> > > > images, root-entry hashes, key cancellations, etc. are self-describing
> > > > files. They always have been.
> > > > 
> > > > > Actually I don't fully understand why we need to introduce the
> > > > > "self-describing image" as a common data transfer interface, if
> > > > > I remember correctly, for N3000, different sub drivers will own
> > > > > different hardware sub function blocks, why expose such a new
> > > > > shared communication channel?
> > > > There is no change here. The N3000 files are self describing. The
> > > > secure-update sub-driver of the MAX10 BMC invokes the class driver,
> > > > funnels image data to the BMC, performs handshakes with the BMC,
> > > > and ultimately returns status through the class driver. All images
> > > > that are sent to the FPGA card follow this same path - and it works
> > > > fine.
> > > > 
> > > > To try to split out the purposes of each self-describing file to
> > > > use different kernel APIs means interfacing multiple class drivers
> > > > to the same MAX10 sub-driver. I think it also means replicating
> > > > code.
> > > Could the split be ?
> > > 
> > > add max10 bits mfd/
> > > 
> > > move image updating out of the kernel and into an uio driver
> > I'm afraid an uio driver doesn't help in this case. The image updating
> > is not an independent device, it may dynamically change other hardwares.
> > So it is better the image updating driver works as an low level driver
> > which provides services to other feature drivers.
> 
> Ok.
> 
> Since this is dfl specific could a 'write' op be added to fme_fops ?

We still need to have a clear definition or classification of what the
changes could be so that we discuss which operations could be added to
which driver. Or we have to assume the FME is changed and has to be
removed before loading and re-enumerated afterward. That seems not what
everyone want.

Thanks,
Yilun

> 
> Tom
> 
> > 
> > Thanks,
> > Yilun
> > 
> > > Tom
> > > 
> > > > - Russ
> > > > > If "self-describing image" is a
> > > > > request to one of the sub function block, why not just expose
> > > > > new interface in such hardware block per modularization? I
> > > > > have some concern that this new requirement may break
> > > > > current driver architecture for N3000.
> > > > > 
> > > > > Hao
> > > > > 
> > > > > > - Russ
> > > > > > > Thanks
> > > > > > > Hao

[v17,0/5] FPGA Image Load (previously Security Manager)

Message

Comments