[0/2] cxl/mailbox: Replace racy error checking with timeouts

Message ID 163855973642.1338601.12855868083437118567.stgit@dwillia2-desk3.amr.corp.intel.com

Dan Williams Dec. 3, 2021, 7:28 p.m. UTC
Quoting patch 2:

    Device status can change without warning at any point in time. This
    effectively means that no amount of status checking before a command is
    submitted can guarantee that the device is not in an error condition
    when the command is later submitted. The clearest signal that a device
    is not able to process commands is if it fails to process commands.

So while "cxl/pci: Don't poll doorbell for mailbox access" [1] trimmed
cxl_pci_mbox_get() a bit, it still checks status in a racy manner. Just
drop it altogether, add a doorbell sanity check to the initial
mailbox setup, and rely on timeouts to report errors.

Also, in anticipation of 60s not being enough for device-bringup
scenarios, add a module parameter as an override.
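
As a rough illustration of the approach above (not the code in these
patches), here is a minimal userspace sketch of a setup-time doorbell
check that relies on a timeout instead of up-front status checks. The
names fake_dev, doorbell_busy(), wait_for_doorbell(), mbox_setup_check()
and ready_timeout_s are all hypothetical, the device is simulated, and
the variable only stands in for the module parameter override.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the module parameter override (seconds). */
static unsigned int ready_timeout_s = 60;

/* Hypothetical simulated device: doorbell stays busy for busy_s seconds. */
struct fake_dev {
	unsigned int busy_s;	/* seconds until the doorbell clears */
	unsigned int now_s;	/* simulated clock, advanced by each poll */
};

/* Read the simulated doorbell; true means the device is still busy. */
static bool doorbell_busy(const struct fake_dev *dev)
{
	return dev->now_s < dev->busy_s;
}

/*
 * Poll once per simulated second until the doorbell clears or the
 * timeout expires. Returns 0 on success, -1 on timeout.
 */
static int wait_for_doorbell(struct fake_dev *dev, unsigned int timeout_s)
{
	for (unsigned int waited = 0; waited <= timeout_s; waited++) {
		if (!doorbell_busy(dev))
			return 0;
		dev->now_s++;	/* stand-in for sleeping one second */
	}
	return -1;
}

/* Setup-time sanity check, bounded by the overridable timeout. */
static int mbox_setup_check(struct fake_dev *dev)
{
	return wait_for_doorbell(dev, ready_timeout_s);
}
```

In this sketch a simulated device that needs 90s fails the check with
the default of 60, and passes once ready_timeout_s is raised to 120,
which is the kind of bring-up case the override is meant for.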

This drops patch 3 [2], replaces patch 5 [3], and adds "cxl/pci: Defer
mailbox status checks to command timeouts" to the "CXL port prep work"
series [4]. Jonathan, given the policy tweaks and addition of a module
parameter I did not carry forward your reviewed-by for "cxl/pci:
Implement Interface Ready Timeout", please have another look when you
get a chance.

[1]: https://lore.kernel.org/r/20211129214721.1668325-6-ben.widawsky@intel.com
[2]: https://lore.kernel.org/r/20211129214721.1668325-4-ben.widawsky@intel.com
[3]: https://lore.kernel.org/r/20211129214721.1668325-6-ben.widawsky@intel.com
[4]: https://lore.kernel.org/r/20211129214721.1668325-1-ben.widawsky@intel.com

---

Ben Widawsky (1):
      cxl/pci: Implement Interface Ready Timeout

Dan Williams (1):
      cxl/pci: Defer mailbox status checks to command timeouts


 drivers/cxl/pci.c |  160 +++++++++++++++++++++--------------------------------
 1 file changed, 64 insertions(+), 96 deletions(-)