mbox series

[0/2] vmxnet3: Fix inconsistent DMA accesses

Message ID 20241113200001.3567479-1-bjohannesmeyer@gmail.com (mailing list archive)
Headers show
Series vmxnet3: Fix inconsistent DMA accesses | expand

Message

Brian Johannesmeyer Nov. 13, 2024, 7:59 p.m. UTC
Hello,

We found hundreds of inconsistent DMA accesses in the VMXNET3 driver. This
patch series aims to fix them. (For a nice summary of the rules around
accessing streaming DMA --- which, if violated, result in inconsistent
accesses --- see Figure 4a of this paper [0]).

The inconsistent accesses occur because the `adapter` object is mapped into
streaming DMA. However, when it is mapped into streaming DMA, it is then
"owned" by the device. Hence, any access to `adapter` thereafter, if not
preceded by a CPU-synchronization operation (e.g.,
`dma_sync_single_for_cpu()`), may cause unexpected hardware behaviors.

This patch series consists of two patches:
- Patch 1 adds synchronization operations into `vmxnet3_probe_device()`, to
  mitigate the inconsistent accesses when `adapter` is initialized.
However, this unfortunately does not mitigate all inconsistent accesses to
it, because `adapter` is accessed elsewhere in the driver without proper
synchronization.
- Patch 2 removes `adapter` from streaming DMA, which entirely mitigates
  the inconsistent accesses to it. It is not clear to me why `adapter` was
mapped into DMA in the first place (in [1]), because it seems that before
[1], it was not mapped into DMA. (However, I am not very familiar with the
VMXNET3 internals, so someone is welcome to correct me here). Alternatively
--- if `adapter` should indeed remain mapped in DMA --- then
synchronization operations should be added throughout the driver code (as
Patch 1 begins to do).

[0] Link: https://www.usenix.org/system/files/sec21-bai.pdf
[1] commit b0eb57cb97e7837ebb746404c2c58c6f536f23fa ("VMXNET3: Add support
for virtual IOMMU")

Brian Johannesmeyer (2):
  vmxnet3: Fix inconsistent DMA accesses in vmxnet3_probe_device()
  vmxnet3: Remove adapter from DMA region

 drivers/net/vmxnet3/vmxnet3_drv.c | 17 ++---------------
 drivers/net/vmxnet3/vmxnet3_int.h |  1 -
 2 files changed, 2 insertions(+), 16 deletions(-)

Comments

Jakub Kicinski Nov. 15, 2024, 3:38 a.m. UTC | #1
On Wed, 13 Nov 2024 20:59:59 +0100 Brian Johannesmeyer wrote:
> We found hundreds of inconsistent DMA accesses in the VMXNET3 driver. This
> patch series aims to fix them. (For a nice summary of the rules around
> accessing streaming DMA --- which, if violated, result in inconsistent
> accesses --- see Figure 4a of this paper [0]).
> 
> The inconsistent accesses occur because the `adapter` object is mapped into
> streaming DMA. However, when it is mapped into streaming DMA, it is then
> "owned" by the device. Hence, any access to `adapter` thereafter, if not
> preceded by a CPU-synchronization operation (e.g.,
> `dma_sync_single_for_cpu()`), may cause unexpected hardware behaviors.
> 
> This patch series consists of two patches:
> - Patch 1 adds synchronization operations into `vmxnet3_probe_device()`, to
>   mitigate the inconsistent accesses when `adapter` is initialized.
> However, this unfortunately does not mitigate all inconsistent accesses to
> it, because `adapter` is accessed elsewhere in the driver without proper
> synchronization.
> - Patch 2 removes `adapter` from streaming DMA, which entirely mitigates
>   the inconsistent accesses to it. It is not clear to me why `adapter` was
> mapped into DMA in the first place (in [1]), because it seems that before
> [1], it was not mapped into DMA. (However, I am not very familiar with the
> VMXNET3 internals, so someone is welcome to correct me here). Alternatively
> --- if `adapter` should indeed remain mapped in DMA --- then
> synchronization operations should be added throughout the driver code (as
> Patch 1 begins to do).

I guess we need to hear from vmxnet3 maintainers to know whether DMA
mapping is necessary for this virt device. But committing patch 1 just
to completely revert it in patch 2 seems a little odd.

Also trivial note, please checkpatch with --strict --max-line-length=80
Brian Johannesmeyer Nov. 18, 2024, 3:31 p.m. UTC | #2
> But committing patch 1 just
> to completely revert it in patch 2 seems a little odd.

Indeed, this was a poor choice on my part. I suppose the correct way
to do this would be to submit them separately (as opposed to as a
series)? I.e.: (i) one patch to start adding the synchronization
operations (in case `adapter` should indeed be in a DMA region), and
(ii) a second patch to remove `adapter` from a DMA region? Based on
the feedback, I can submit a V2 patch for either (i) or (ii).

> Also trivial note, please checkpatch with --strict --max-line-length=80

Thanks for the feedback.

-Brian

On Thu, Nov 14, 2024 at 8:38 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 13 Nov 2024 20:59:59 +0100 Brian Johannesmeyer wrote:
> > We found hundreds of inconsistent DMA accesses in the VMXNET3 driver. This
> > patch series aims to fix them. (For a nice summary of the rules around
> > accessing streaming DMA --- which, if violated, result in inconsistent
> > accesses --- see Figure 4a of this paper [0]).
> >
> > The inconsistent accesses occur because the `adapter` object is mapped into
> > streaming DMA. However, when it is mapped into streaming DMA, it is then
> > "owned" by the device. Hence, any access to `adapter` thereafter, if not
> > preceded by a CPU-synchronization operation (e.g.,
> > `dma_sync_single_for_cpu()`), may cause unexpected hardware behaviors.
> >
> > This patch series consists of two patches:
> > - Patch 1 adds synchronization operations into `vmxnet3_probe_device()`, to
> >   mitigate the inconsistent accesses when `adapter` is initialized.
> > However, this unfortunately does not mitigate all inconsistent accesses to
> > it, because `adapter` is accessed elsewhere in the driver without proper
> > synchronization.
> > - Patch 2 removes `adapter` from streaming DMA, which entirely mitigates
> >   the inconsistent accesses to it. It is not clear to me why `adapter` was
> > mapped into DMA in the first place (in [1]), because it seems that before
> > [1], it was not mapped into DMA. (However, I am not very familiar with the
> > VMXNET3 internals, so someone is welcome to correct me here). Alternatively
> > --- if `adapter` should indeed remain mapped in DMA --- then
> > synchronization operations should be added throughout the driver code (as
> > Patch 1 begins to do).
>
> I guess we need to hear from vmxnet3 maintainers to know whether DMA
> mapping is necessary for this virt device. But committing patch 1 just
> to completely revert it in patch 2 seems a little odd.
>
> Also trivial note, please checkpatch with --strict --max-line-length=80
> --
> pw-bot: cr
Jakub Kicinski Nov. 19, 2024, 12:16 a.m. UTC | #3
On Mon, 18 Nov 2024 08:31:35 -0700 Brian Johannesmeyer wrote:
> > But committing patch 1 just
> > to completely revert it in patch 2 seems a little odd.  
> 
> Indeed, this was a poor choice on my part. I suppose the correct way
> to do this would be to submit them separately (as opposed to as a
> series)? I.e.: (i) one patch to start adding the synchronization
> operations (in case `adapter` should indeed be in a DMA region), and
> (ii) a second patch to remove `adapter` from a DMA region? Based on
> the feedback, I can submit a V2 patch for either (i) or (ii).

What is the purpose of the first patch? Is it sufficient to make 
the device work correctly?

If yes, why do we need patch 2.
If no, why do we have patch 1, instead of a revert / patch 2...
Brian Johannesmeyer Nov. 19, 2024, 5:10 p.m. UTC | #4
> What is the purpose of the first patch? Is it sufficient to make
> the device work correctly?
The purpose of the first patch is to fix the inconsistent accesses in
`vmxnet3_probe_device()`. This only partially fixes the issue,
however, because there are inconsistent accesses elsewhere in the
driver. So, no, it does not make the device work *entirely* correctly.

> If yes, why do we need patch 2.
> If no, why do we have patch 1, instead of a revert / patch 2...
The answer is that the way I submitted this patch series was a
mistake. Specifically, I submitted it as: (i) patch 1 is applied on
master, *and* (ii) patch 2 is applied on patch 1.

Instead, the way I should have submitted it was: (i) patch 1 is
applied on master, *or* (ii) a corrected version of patch 2 is applied
on master. (By "a corrected version of patch 2", I mean not
pointlessly reverting patch 1.)

The difference being:
- If `adapter` *should* be mapped to DMA, then patch 1 is the way to go. Or,
- If `adapter` *should not* be mapped to DMA, then a corrected version
of patch 2 is the way to go.

I hope this clears things up. I'm sorry for the confusion I caused. I
will submit a V2 patch series that makes this clear.

Thanks,

Brian