Message ID | 1675135986-254490-1-git-send-email-mikelley@microsoft.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 99f1c46011cc0feb47d4f4f7bee70a0341442d14 |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,1/1] hv_netvsc: Fix missed pagebuf entries in netvsc_dma_map/unmap() | expand |
> -----Original Message----- > From: Michael Kelley (LINUX) <mikelley@microsoft.com> > Sent: Monday, January 30, 2023 10:33 PM > To: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang > <haiyangz@microsoft.com>; wei.liu@kernel.org; Dexuan Cui > <decui@microsoft.com>; davem@davemloft.net; edumazet@google.com; > kuba@kernel.org; pabeni@redhat.com; netdev@vger.kernel.org; linux- > hyperv@vger.kernel.org; linux-kernel@vger.kernel.org > Cc: Michael Kelley (LINUX) <mikelley@microsoft.com>; stable@vger.kernel.org > Subject: [PATCH net 1/1] hv_netvsc: Fix missed pagebuf entries in > netvsc_dma_map/unmap() > > netvsc_dma_map() and netvsc_dma_unmap() currently check the cp_partial > flag and adjust the page_count so that pagebuf entries for the RNDIS > portion of the message are skipped when it has already been copied into > a send buffer. But this adjustment has already been made by code in > netvsc_send(). The duplicate adjustment causes some pagebuf entries to > not be mapped. In a normal VM, this doesn't break anything because the > mapping doesn’t change the PFN. But in a Confidential VM, > dma_map_single() does bounce buffering and provides a different PFN. > Failing to do the mapping causes the wrong PFN to be passed to Hyper-V, > and various errors ensue. > > Fix this by removing the duplicate adjustment in netvsc_dma_map() and > netvsc_dma_unmap(). > > Fixes: 846da38de0e8 ("net: netvsc: Add Isolation VM support for netvsc > driver") > Cc: stable@vger.kernel.org > Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Thanks!
On Mon, 30 Jan 2023 19:33:06 -0800 Michael Kelley wrote: > @@ -990,9 +987,7 @@ static int netvsc_dma_map(struct hv_device *hv_dev, > struct hv_netvsc_packet *packet, > struct hv_page_buffer *pb) > { > - u32 page_count = packet->cp_partial ? > - packet->page_buf_cnt - packet->rmsg_pgcnt : > - packet->page_buf_cnt; > + u32 page_count = packet->page_buf_cnt; > dma_addr_t dma; > int i; Suspiciously, the caller still does: if (packet->cp_partial) pb += packet->rmsg_pgcnt; ret = netvsc_dma_map(ndev_ctx->device_ctx, packet, pb); Shouldn't that if () pb +=... also go away?
From: Jakub Kicinski <kuba@kernel.org> Sent: Wednesday, February 1, 2023 9:01 PM > > On Mon, 30 Jan 2023 19:33:06 -0800 Michael Kelley wrote: > > @@ -990,9 +987,7 @@ static int netvsc_dma_map(struct hv_device *hv_dev, > > struct hv_netvsc_packet *packet, > > struct hv_page_buffer *pb) > > { > > - u32 page_count = packet->cp_partial ? > > - packet->page_buf_cnt - packet->rmsg_pgcnt : > > - packet->page_buf_cnt; > > + u32 page_count = packet->page_buf_cnt; > > dma_addr_t dma; > > int i; > > Suspiciously, the caller still does: > > if (packet->cp_partial) > pb += packet->rmsg_pgcnt; > > ret = netvsc_dma_map(ndev_ctx->device_ctx, packet, pb); > > Shouldn't that if () pb +=... also go away? No -- it's correct. In netvsc_send(), cp_partial is tested and packet->page_buf_cnt is adjusted. But the pointer into the pagebuf array is not adjusted in netvsc_send(). Instead it is adjusted here in netvsc_send_pkt(), which brings it back in sync with packet->page_buf_cnt. I don't know if there's a good reason for the adjustment being split across two different functions. It doesn't seem like the most straightforward approach. From a quick glance at the code it looks like this adjustment to 'pb' could move to netvsc_send() to be together with the adjustment to packet->page_buf_cnt, but maybe there's a reason for the split that I'm not familiar with. Haiyang -- any insight? Michael
On Thu, 2023-02-02 at 05:20 +0000, Michael Kelley (LINUX) wrote: > From: Jakub Kicinski <kuba@kernel.org> Sent: Wednesday, February 1, 2023 9:01 PM > > > > On Mon, 30 Jan 2023 19:33:06 -0800 Michael Kelley wrote: > > > @@ -990,9 +987,7 @@ static int netvsc_dma_map(struct hv_device *hv_dev, > > > struct hv_netvsc_packet *packet, > > > struct hv_page_buffer *pb) > > > { > > > - u32 page_count = packet->cp_partial ? > > > - packet->page_buf_cnt - packet->rmsg_pgcnt : > > > - packet->page_buf_cnt; > > > + u32 page_count = packet->page_buf_cnt; > > > dma_addr_t dma; > > > int i; > > > > Suspiciously, the caller still does: > > > > if (packet->cp_partial) > > pb += packet->rmsg_pgcnt; > > > > ret = netvsc_dma_map(ndev_ctx->device_ctx, packet, pb); > > > > Shouldn't that if () pb +=... also go away? > > No -- it's correct. > > In netvsc_send(), cp_partial is tested and packet->page_buf_cnt is > adjusted. But the pointer into the pagebuf array is not adjusted in > netvsc_send(). Instead it is adjusted here in netvsc_send_pkt(), which > brings it back in sync with packet->page_buf_cnt. Ok > I don't know if there's a good reason for the adjustment being split > across two different functions. It doesn't seem like the most > straightforward approach. From a quick glance at the code it looks > like this adjustment to 'pb' could move to netvsc_send() to be > together with the adjustment to packet->page_buf_cnt, but maybe > there's a reason for the split that I'm not familiar with. > > Haiyang -- any insight? While at that, please also have a look at the following allocation in netvsc_dma_map(): packet->dma_range = kcalloc(page_count, sizeof(*packet->dma_range), GFP_KERNEL); which looks wrong - netvsc_dma_map() should be in atomic context. Anyway it's a topic unrelated from this patch. I just stumbled upon it while reviewing. Cheers, Paolo
Hello: This patch was applied to netdev/net.git (master) by Paolo Abeni <pabeni@redhat.com>: On Mon, 30 Jan 2023 19:33:06 -0800 you wrote: > netvsc_dma_map() and netvsc_dma_unmap() currently check the cp_partial > flag and adjust the page_count so that pagebuf entries for the RNDIS > portion of the message are skipped when it has already been copied into > a send buffer. But this adjustment has already been made by code in > netvsc_send(). The duplicate adjustment causes some pagebuf entries to > not be mapped. In a normal VM, this doesn't break anything because the > mapping doesn’t change the PFN. But in a Confidential VM, > dma_map_single() does bounce buffering and provides a different PFN. > Failing to do the mapping causes the wrong PFN to be passed to Hyper-V, > and various errors ensue. > > [...] Here is the summary with links: - [net,1/1] hv_netvsc: Fix missed pagebuf entries in netvsc_dma_map/unmap() https://git.kernel.org/netdev/net/c/99f1c46011cc You are awesome, thank you!
From: Paolo Abeni <pabeni@redhat.com> Sent: Thursday, February 2, 2023 12:31 AM > > On Thu, 2023-02-02 at 05:20 +0000, Michael Kelley (LINUX) wrote: > > From: Jakub Kicinski <kuba@kernel.org> Sent: Wednesday, February 1, 2023 9:01 PM > > > > > > On Mon, 30 Jan 2023 19:33:06 -0800 Michael Kelley wrote: > > > > @@ -990,9 +987,7 @@ static int netvsc_dma_map(struct hv_device *hv_dev, > > > > struct hv_netvsc_packet *packet, > > > > struct hv_page_buffer *pb) > > > > { > > > > - u32 page_count = packet->cp_partial ? > > > > - packet->page_buf_cnt - packet->rmsg_pgcnt : > > > > - packet->page_buf_cnt; > > > > + u32 page_count = packet->page_buf_cnt; > > > > dma_addr_t dma; > > > > int i; > > > > > > Suspiciously, the caller still does: > > > > > > if (packet->cp_partial) > > > pb += packet->rmsg_pgcnt; > > > > > > ret = netvsc_dma_map(ndev_ctx->device_ctx, packet, pb); > > > > > > Shouldn't that if () pb +=... also go away? > > > > No -- it's correct. > > > > In netvsc_send(), cp_partial is tested and packet->page_buf_cnt is > > adjusted. But the pointer into the pagebuf array is not adjusted in > > netvsc_send(). Instead it is adjusted here in netvsc_send_pkt(), which > > brings it back in sync with packet->page_buf_cnt. > > Ok > > > I don't know if there's a good reason for the adjustment being split > > across two different functions. It doesn't seem like the most > > straightforward approach. From a quick glance at the code it looks > > like this adjustment to 'pb' could move to netvsc_send() to be > > together with the adjustment to packet->page_buf_cnt, but maybe > > there's a reason for the split that I'm not familiar with. > > > > Haiyang -- any insight? > > While at that, please also have a look at the following allocation in > netvsc_dma_map(): > > packet->dma_range = kcalloc(page_count, > sizeof(*packet->dma_range), > GFP_KERNEL); > > which looks wrong - netvsc_dma_map() should be in atomic context. > > Anyway it's a topic unrelated from this patch. I just stumbled upon it > while reviewing. > Thanks for pointing this out. I've made a note to do a fix. Michael
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index 661bbe6..8acfa40 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -949,9 +949,6 @@ static void netvsc_copy_to_send_buf(struct netvsc_device *net_device, void netvsc_dma_unmap(struct hv_device *hv_dev, struct hv_netvsc_packet *packet) { - u32 page_count = packet->cp_partial ? - packet->page_buf_cnt - packet->rmsg_pgcnt : - packet->page_buf_cnt; int i; if (!hv_is_isolation_supported()) @@ -960,7 +957,7 @@ void netvsc_dma_unmap(struct hv_device *hv_dev, if (!packet->dma_range) return; - for (i = 0; i < page_count; i++) + for (i = 0; i < packet->page_buf_cnt; i++) dma_unmap_single(&hv_dev->device, packet->dma_range[i].dma, packet->dma_range[i].mapping_size, DMA_TO_DEVICE); @@ -990,9 +987,7 @@ static int netvsc_dma_map(struct hv_device *hv_dev, struct hv_netvsc_packet *packet, struct hv_page_buffer *pb) { - u32 page_count = packet->cp_partial ? - packet->page_buf_cnt - packet->rmsg_pgcnt : - packet->page_buf_cnt; + u32 page_count = packet->page_buf_cnt; dma_addr_t dma; int i;
netvsc_dma_map() and netvsc_dma_unmap() currently check the cp_partial flag and adjust the page_count so that pagebuf entries for the RNDIS portion of the message are skipped when it has already been copied into a send buffer. But this adjustment has already been made by code in netvsc_send(). The duplicate adjustment causes some pagebuf entries to not be mapped. In a normal VM, this doesn't break anything because the mapping doesn’t change the PFN. But in a Confidential VM, dma_map_single() does bounce buffering and provides a different PFN. Failing to do the mapping causes the wrong PFN to be passed to Hyper-V, and various errors ensue. Fix this by removing the duplicate adjustment in netvsc_dma_map() and netvsc_dma_unmap(). Fixes: 846da38de0e8 ("net: netvsc: Add Isolation VM support for netvsc driver") Cc: stable@vger.kernel.org Signed-off-by: Michael Kelley <mikelley@microsoft.com> --- drivers/net/hyperv/netvsc.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)