Message ID | 1718054553-6588-1-git-send-email-haiyangz@microsoft.com (mailing list archive)
---|---
State | Not Applicable
Series | [net-next] net: mana: Add support for variable page sizes of ARM64
From: LKML haiyangz <lkmlhyz@microsoft.com> On Behalf Of Haiyang Zhang Sent: Monday, June 10, 2024 2:23 PM > > As defined by the MANA Hardware spec, the queue size for DMA is 4KB > minimal, and power of 2. You say the hardware requires 4K "minimal". But the definitions in this patch hardcode to 4K, as if that's the only choice. Is the hardcoding to 4K a design decision made to simplify the MANA driver? > To support variable page sizes (4KB, 16KB, 64KB) of ARM64, define A minor nit, but "variable" page size doesn't seem like quite the right description -- both here and in the Subject line. On ARM64, the page size is a choice among a few fixed options. Perhaps call it support for "page sizes other than 4K"? > the minimal queue size as a macro separate from the PAGE_SIZE, which > we always assumed it to be 4KB before supporting ARM64. > Also, update the relevant code related to size alignment, DMA region > calculations, etc. > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> > --- > drivers/net/ethernet/microsoft/Kconfig | 2 +- > .../net/ethernet/microsoft/mana/gdma_main.c | 8 +++---- > .../net/ethernet/microsoft/mana/hw_channel.c | 22 +++++++++---------- > drivers/net/ethernet/microsoft/mana/mana_en.c | 8 +++---- > .../net/ethernet/microsoft/mana/shm_channel.c | 9 ++++---- > include/net/mana/gdma.h | 7 +++++- > include/net/mana/mana.h | 3 ++- > 7 files changed, 33 insertions(+), 26 deletions(-) > > diff --git a/drivers/net/ethernet/microsoft/Kconfig > b/drivers/net/ethernet/microsoft/Kconfig > index 286f0d5697a1..901fbffbf718 100644 > --- a/drivers/net/ethernet/microsoft/Kconfig > +++ b/drivers/net/ethernet/microsoft/Kconfig > @@ -18,7 +18,7 @@ if NET_VENDOR_MICROSOFT > config MICROSOFT_MANA > tristate "Microsoft Azure Network Adapter (MANA) support" > depends on PCI_MSI > - depends on X86_64 || (ARM64 && !CPU_BIG_ENDIAN && ARM64_4K_PAGES) > + depends on X86_64 || (ARM64 && !CPU_BIG_ENDIAN) > depends on PCI_HYPERV > select AUXILIARY_BUS > select PAGE_POOL > 
diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c > b/drivers/net/ethernet/microsoft/mana/gdma_main.c > index 1332db9a08eb..c9df942d0d02 100644 > --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c > +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c > @@ -182,7 +182,7 @@ int mana_gd_alloc_memory(struct gdma_context *gc, > unsigned int length, > dma_addr_t dma_handle; > void *buf; > > - if (length < PAGE_SIZE || !is_power_of_2(length)) > + if (length < MANA_MIN_QSIZE || !is_power_of_2(length)) > return -EINVAL; > > gmi->dev = gc->dev; > @@ -717,7 +717,7 @@ EXPORT_SYMBOL_NS(mana_gd_destroy_dma_region, > NET_MANA); > static int mana_gd_create_dma_region(struct gdma_dev *gd, > struct gdma_mem_info *gmi) > { > - unsigned int num_page = gmi->length / PAGE_SIZE; > + unsigned int num_page = gmi->length / MANA_MIN_QSIZE; This calculation seems a bit weird when using MANA_MIN_QSIZE. The number of pages, and the construction of the page_addr_list array a few lines later, seem unrelated to the concept of a minimum queue size. Is the right concept really a "mapping chunk", and num_page would conceptually be "num_chunks", or something like that? Then a queue must be at least one chunk in size, but that's derived from the chunk size, and is not the core concept. Another approach might be to just call it "MANA_PAGE_SIZE", like has been done with HV_HYP_PAGE_SIZE. HV_HYP_PAGE_SIZE exists to handle exactly the same issue of the guest PAGE_SIZE potentially being different from the fixed 4K size that must be used in host-guest communication on Hyper-V. Same thing here with MANA. 
> struct gdma_create_dma_region_req *req = NULL; > struct gdma_create_dma_region_resp resp = {}; > struct gdma_context *gc = gd->gdma_context; > @@ -727,7 +727,7 @@ static int mana_gd_create_dma_region(struct gdma_dev *gd, > int err; > int i; > > - if (length < PAGE_SIZE || !is_power_of_2(length)) > + if (length < MANA_MIN_QSIZE || !is_power_of_2(length)) > return -EINVAL; > > if (offset_in_page(gmi->virt_addr) != 0) > @@ -751,7 +751,7 @@ static int mana_gd_create_dma_region(struct gdma_dev *gd, > req->page_addr_list_len = num_page; > > for (i = 0; i < num_page; i++) > - req->page_addr_list[i] = gmi->dma_handle + i * PAGE_SIZE; > + req->page_addr_list[i] = gmi->dma_handle + i * MANA_MIN_QSIZE; > > err = mana_gd_send_request(gc, req_msg_size, req, sizeof(resp), &resp); > if (err) > diff --git a/drivers/net/ethernet/microsoft/mana/hw_channel.c > b/drivers/net/ethernet/microsoft/mana/hw_channel.c > index bbc4f9e16c98..038dc31e09cd 100644 > --- a/drivers/net/ethernet/microsoft/mana/hw_channel.c > +++ b/drivers/net/ethernet/microsoft/mana/hw_channel.c > @@ -362,12 +362,12 @@ static int mana_hwc_create_cq(struct hw_channel_context > *hwc, u16 q_depth, > int err; > > eq_size = roundup_pow_of_two(GDMA_EQE_SIZE * q_depth); > - if (eq_size < MINIMUM_SUPPORTED_PAGE_SIZE) > - eq_size = MINIMUM_SUPPORTED_PAGE_SIZE; > + if (eq_size < MANA_MIN_QSIZE) > + eq_size = MANA_MIN_QSIZE; > > cq_size = roundup_pow_of_two(GDMA_CQE_SIZE * q_depth); > - if (cq_size < MINIMUM_SUPPORTED_PAGE_SIZE) > - cq_size = MINIMUM_SUPPORTED_PAGE_SIZE; > + if (cq_size < MANA_MIN_QSIZE) > + cq_size = MANA_MIN_QSIZE; > > hwc_cq = kzalloc(sizeof(*hwc_cq), GFP_KERNEL); > if (!hwc_cq) > @@ -429,7 +429,7 @@ static int mana_hwc_alloc_dma_buf(struct > hw_channel_context *hwc, u16 q_depth, > > dma_buf->num_reqs = q_depth; > > - buf_size = PAGE_ALIGN(q_depth * max_msg_size); > + buf_size = MANA_MIN_QALIGN(q_depth * max_msg_size); > > gmi = &dma_buf->mem_info; > err = mana_gd_alloc_memory(gc, buf_size, gmi); > @@ 
-497,8 +497,8 @@ static int mana_hwc_create_wq(struct hw_channel_context > *hwc, > else > queue_size = roundup_pow_of_two(GDMA_MAX_SQE_SIZE * > q_depth); > > - if (queue_size < MINIMUM_SUPPORTED_PAGE_SIZE) > - queue_size = MINIMUM_SUPPORTED_PAGE_SIZE; > + if (queue_size < MANA_MIN_QSIZE) > + queue_size = MANA_MIN_QSIZE; > > hwc_wq = kzalloc(sizeof(*hwc_wq), GFP_KERNEL); > if (!hwc_wq) > @@ -628,10 +628,10 @@ static int mana_hwc_establish_channel(struct > gdma_context *gc, u16 *q_depth, > init_completion(&hwc->hwc_init_eqe_comp); > > err = mana_smc_setup_hwc(&gc->shm_channel, false, > - eq->mem_info.dma_handle, > - cq->mem_info.dma_handle, > - rq->mem_info.dma_handle, > - sq->mem_info.dma_handle, > + virt_to_phys(eq->mem_info.virt_addr), > + virt_to_phys(cq->mem_info.virt_addr), > + virt_to_phys(rq->mem_info.virt_addr), > + virt_to_phys(sq->mem_info.virt_addr), This change seems unrelated to handling guest PAGE_SIZE values other than 4K. Does it belong in a separate patch? Or maybe it just needs an explanation in the commit message of this patch? > eq->eq.msix_index); > if (err) > return err; > diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c > b/drivers/net/ethernet/microsoft/mana/mana_en.c > index d087cf954f75..6a891dbce686 100644 > --- a/drivers/net/ethernet/microsoft/mana/mana_en.c > +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c > @@ -1889,10 +1889,10 @@ static int mana_create_txq(struct mana_port_context > *apc, > * to prevent overflow. 
> */ > txq_size = MAX_SEND_BUFFERS_PER_QUEUE * 32; > - BUILD_BUG_ON(!PAGE_ALIGNED(txq_size)); > + BUILD_BUG_ON(!MANA_MIN_QALIGNED(txq_size)); > > cq_size = MAX_SEND_BUFFERS_PER_QUEUE * COMP_ENTRY_SIZE; > - cq_size = PAGE_ALIGN(cq_size); > + cq_size = MANA_MIN_QALIGN(cq_size); > > gc = gd->gdma_context; > > @@ -2189,8 +2189,8 @@ static struct mana_rxq *mana_create_rxq(struct > mana_port_context *apc, > if (err) > goto out; > > - rq_size = PAGE_ALIGN(rq_size); > - cq_size = PAGE_ALIGN(cq_size); > + rq_size = MANA_MIN_QALIGN(rq_size); > + cq_size = MANA_MIN_QALIGN(cq_size); > > /* Create RQ */ > memset(&spec, 0, sizeof(spec)); > diff --git a/drivers/net/ethernet/microsoft/mana/shm_channel.c > b/drivers/net/ethernet/microsoft/mana/shm_channel.c > index 5553af9c8085..9a54a163d8d1 100644 > --- a/drivers/net/ethernet/microsoft/mana/shm_channel.c > +++ b/drivers/net/ethernet/microsoft/mana/shm_channel.c > @@ -6,6 +6,7 @@ > #include <linux/io.h> > #include <linux/mm.h> > > +#include <net/mana/gdma.h> > #include <net/mana/shm_channel.h> > > #define PAGE_FRAME_L48_WIDTH_BYTES 6 > @@ -183,7 +184,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool > reset_vf, u64 eq_addr, > > /* EQ addr: low 48 bits of frame address */ > shmem = (u64 *)ptr; > - frame_addr = PHYS_PFN(eq_addr); > + frame_addr = MANA_PFN(eq_addr); > *shmem = frame_addr & PAGE_FRAME_L48_MASK; > all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << > (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); In mana_smc_setup_hwc() a few lines above this change, code using PAGE_ALIGNED() is unchanged. Is it correct that the eq/cq/rq/sq addresses must be aligned to 64K if PAGE_SIZE is 64K? Related, I wonder about how MANA_PFN() is defined. If PAGE_SIZE is 64K, MANA_PFN() will first right-shift 16, then left shift 4. The net is right-shift 12, corresponding to the 4K chunks that MANA expects. But that approach guarantees that the rightmost 4 bits of the MANA PFN will always be zero. 
That's consistent with requiring the addresses to be PAGE_ALIGNED() to 64K, but I'm unclear whether that is really the requirement. You might compare with the definition of HVPFN_DOWN(), which has a similar goal for Linux guests communicating with Hyper-V. > @@ -191,7 +192,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool > reset_vf, u64 eq_addr, > > /* CQ addr: low 48 bits of frame address */ > shmem = (u64 *)ptr; > - frame_addr = PHYS_PFN(cq_addr); > + frame_addr = MANA_PFN(cq_addr); > *shmem = frame_addr & PAGE_FRAME_L48_MASK; > all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << > (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); > @@ -199,7 +200,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool > reset_vf, u64 eq_addr, > > /* RQ addr: low 48 bits of frame address */ > shmem = (u64 *)ptr; > - frame_addr = PHYS_PFN(rq_addr); > + frame_addr = MANA_PFN(rq_addr); > *shmem = frame_addr & PAGE_FRAME_L48_MASK; > all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << > (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); > @@ -207,7 +208,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool > reset_vf, u64 eq_addr, > > /* SQ addr: low 48 bits of frame address */ > shmem = (u64 *)ptr; > - frame_addr = PHYS_PFN(sq_addr); > + frame_addr = MANA_PFN(sq_addr); > *shmem = frame_addr & PAGE_FRAME_L48_MASK; > all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << > (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); > diff --git a/include/net/mana/gdma.h b/include/net/mana/gdma.h > index 27684135bb4d..b392559c33e9 100644 > --- a/include/net/mana/gdma.h > +++ b/include/net/mana/gdma.h > @@ -224,7 +224,12 @@ struct gdma_dev { > struct auxiliary_device *adev; > }; > > -#define MINIMUM_SUPPORTED_PAGE_SIZE PAGE_SIZE > +/* These are defined by HW */ > +#define MANA_MIN_QSHIFT 12 > +#define MANA_MIN_QSIZE (1 << MANA_MIN_QSHIFT) > +#define MANA_MIN_QALIGN(x) ALIGN((x), MANA_MIN_QSIZE) > +#define MANA_MIN_QALIGNED(addr) IS_ALIGNED((unsigned long)(addr), 
MANA_MIN_QSIZE) > +#define MANA_PFN(a) (PHYS_PFN(a) << (PAGE_SHIFT - MANA_MIN_QSHIFT)) See comments above about how this is defined. Michael > > #define GDMA_CQE_SIZE 64 > #define GDMA_EQE_SIZE 16 > diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h > index 561f6719fb4e..43e8fc574354 100644 > --- a/include/net/mana/mana.h > +++ b/include/net/mana/mana.h > @@ -42,7 +42,8 @@ enum TRI_STATE { > > #define MAX_SEND_BUFFERS_PER_QUEUE 256 > > -#define EQ_SIZE (8 * PAGE_SIZE) > +#define EQ_SIZE (8 * MANA_MIN_QSIZE) > + > #define LOG2_EQ_THROTTLE 3 > > #define MAX_PORTS_IN_MANA_DEV 256 > -- > 2.34.1 >
(resending in plain text) > -----Original Message----- > From: Michael Kelley <mhklinux@outlook.com> > Sent: Tuesday, June 11, 2024 12:35 PM > To: Haiyang Zhang <haiyangz@microsoft.com>; linux-hyperv@vger.kernel.org; > netdev@vger.kernel.org > Cc: Dexuan Cui <decui@microsoft.com>; stephen@networkplumber.org; KY > Srinivasan <kys@microsoft.com>; Paul Rosswurm <paulros@microsoft.com>; > olaf@aepfle.de; vkuznets <vkuznets@redhat.com>; davem@davemloft.net; > wei.liu@kernel.org; edumazet@google.com; kuba@kernel.org; > pabeni@redhat.com; leon@kernel.org; Long Li <longli@microsoft.com>; > ssengar@linux.microsoft.com; linux-rdma@vger.kernel.org; > daniel@iogearbox.net; john.fastabend@gmail.com; bpf@vger.kernel.org; > ast@kernel.org; hawk@kernel.org; tglx@linutronix.de; > shradhagupta@linux.microsoft.com; linux-kernel@vger.kernel.org > Subject: RE: [PATCH net-next] net: mana: Add support for variable page > sizes of ARM64 > > From: LKML haiyangz <lkmlhyz@microsoft.com> On Behalf Of Haiyang Zhang > Sent: Monday, June 10, 2024 2:23 PM > > > > As defined by the MANA Hardware spec, the queue size for DMA is 4KB > > minimal, and power of 2. > > You say the hardware requires 4K "minimal". But the definitions in this > patch hardcode to 4K, as if that's the only choice. Is the hardcoding to > 4K a design decision made to simplify the MANA driver? The HWC q size has to be exactly 4k, which is by HW design. Other "regular" queues can be 2^n >= 4k. > > > To support variable page sizes (4KB, 16KB, 64KB) of ARM64, define > > A minor nit, but "variable" page size doesn't seem like quite the right > description -- both here and in the Subject line. On ARM64, the page > size > is a choice among a few fixed options. Perhaps call it support for "page > sizes > other than 4K"? 
"page sizes other than 4K" sounds good. > > > the minimal queue size as a macro separate from the PAGE_SIZE, which > > we always assumed it to be 4KB before supporting ARM64. > > Also, update the relevant code related to size alignment, DMA region > > calculations, etc. > > > > Signed-off-by: Haiyang Zhang <mailto:haiyangz@microsoft.com> > > --- > > drivers/net/ethernet/microsoft/Kconfig | 2 +- > > .../net/ethernet/microsoft/mana/gdma_main.c | 8 +++---- > > .../net/ethernet/microsoft/mana/hw_channel.c | 22 +++++++++---------- > > drivers/net/ethernet/microsoft/mana/mana_en.c | 8 +++---- > > .../net/ethernet/microsoft/mana/shm_channel.c | 9 ++++---- > > include/net/mana/gdma.h | 7 +++++- > > include/net/mana/mana.h | 3 ++- > > 7 files changed, 33 insertions(+), 26 deletions(-) > > > > diff --git a/drivers/net/ethernet/microsoft/Kconfig > > b/drivers/net/ethernet/microsoft/Kconfig > > index 286f0d5697a1..901fbffbf718 100644 > > --- a/drivers/net/ethernet/microsoft/Kconfig > > +++ b/drivers/net/ethernet/microsoft/Kconfig > > @@ -18,7 +18,7 @@ if NET_VENDOR_MICROSOFT > > config MICROSOFT_MANA > > tristate "Microsoft Azure Network Adapter (MANA) support" > > depends on PCI_MSI > > - depends on X86_64 || (ARM64 && !CPU_BIG_ENDIAN && ARM64_4K_PAGES) > > + depends on X86_64 || (ARM64 && !CPU_BIG_ENDIAN) > > depends on PCI_HYPERV > > select AUXILIARY_BUS > > select PAGE_POOL > > diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c > > b/drivers/net/ethernet/microsoft/mana/gdma_main.c > > index 1332db9a08eb..c9df942d0d02 100644 > > --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c > > +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c > > @@ -182,7 +182,7 @@ int mana_gd_alloc_memory(struct gdma_context *gc, > > unsigned int length, > > dma_addr_t dma_handle; > > void *buf; > > > > - if (length < PAGE_SIZE || !is_power_of_2(length)) > > + if (length < MANA_MIN_QSIZE || !is_power_of_2(length)) > > return -EINVAL; > > > > gmi->dev = gc->dev; > > @@ -717,7 +717,7 
@@ EXPORT_SYMBOL_NS(mana_gd_destroy_dma_region, > > NET_MANA); > > static int mana_gd_create_dma_region(struct gdma_dev *gd, > > struct gdma_mem_info *gmi) > > { > > - unsigned int num_page = gmi->length / PAGE_SIZE; > > + unsigned int num_page = gmi->length / MANA_MIN_QSIZE; > > This calculation seems a bit weird when using MANA_MIN_QSIZE. The > number of pages, and the construction of the page_addr_list array > a few lines later, seem unrelated to the concept of a minimum queue > size. Is the right concept really a "mapping chunk", and num_page > would conceptually be "num_chunks", or something like that? Then > a queue must be at least one chunk in size, but that's derived from the > chunk size, and is not the core concept. I think calling it "num_chunks" is fine. May I use "num_chunks" in next version? > > Another approach might be to just call it "MANA_PAGE_SIZE", like > has been done with HV_HYP_PAGE_SIZE. HV_HYP_PAGE_SIZE exists to > handle exactly the same issue of the guest PAGE_SIZE potentially > being different from the fixed 4K size that must be used in host-guest > communication on Hyper-V. Same thing here with MANA. I actually called it "MANA_PAGE_SIZE" in my previous internal patch. But Paul from Hostnet team opposed using that name, because 4kB is the min q size. MANA doesn't have "page" at HW level. 
> > struct gdma_create_dma_region_req *req = NULL; > > struct gdma_create_dma_region_resp resp = {}; > > struct gdma_context *gc = gd->gdma_context; > > @@ -727,7 +727,7 @@ static int mana_gd_create_dma_region(struct > gdma_dev *gd, > > int err; > > int i; > > > > - if (length < PAGE_SIZE || !is_power_of_2(length)) > > + if (length < MANA_MIN_QSIZE || !is_power_of_2(length)) > > return -EINVAL; > > > > if (offset_in_page(gmi->virt_addr) != 0) > > @@ -751,7 +751,7 @@ static int mana_gd_create_dma_region(struct > gdma_dev *gd, > > req->page_addr_list_len = num_page; > > > > for (i = 0; i < num_page; i++) > > - req->page_addr_list[i] = gmi->dma_handle + i * PAGE_SIZE; > > + req->page_addr_list[i] = gmi->dma_handle + i * > MANA_MIN_QSIZE; > > > > err = mana_gd_send_request(gc, req_msg_size, req, sizeof(resp), > &resp); > > if (err) > > diff --git a/drivers/net/ethernet/microsoft/mana/hw_channel.c > > b/drivers/net/ethernet/microsoft/mana/hw_channel.c > > index bbc4f9e16c98..038dc31e09cd 100644 > > --- a/drivers/net/ethernet/microsoft/mana/hw_channel.c > > +++ b/drivers/net/ethernet/microsoft/mana/hw_channel.c > > @@ -362,12 +362,12 @@ static int mana_hwc_create_cq(struct > hw_channel_context > > *hwc, u16 q_depth, > > int err; > > > > eq_size = roundup_pow_of_two(GDMA_EQE_SIZE * q_depth); > > - if (eq_size < MINIMUM_SUPPORTED_PAGE_SIZE) > > - eq_size = MINIMUM_SUPPORTED_PAGE_SIZE; > > + if (eq_size < MANA_MIN_QSIZE) > > + eq_size = MANA_MIN_QSIZE; > > > > cq_size = roundup_pow_of_two(GDMA_CQE_SIZE * q_depth); > > - if (cq_size < MINIMUM_SUPPORTED_PAGE_SIZE) > > - cq_size = MINIMUM_SUPPORTED_PAGE_SIZE; > > + if (cq_size < MANA_MIN_QSIZE) > > + cq_size = MANA_MIN_QSIZE; > > > > hwc_cq = kzalloc(sizeof(*hwc_cq), GFP_KERNEL); > > if (!hwc_cq) > > @@ -429,7 +429,7 @@ static int mana_hwc_alloc_dma_buf(struct > > hw_channel_context *hwc, u16 q_depth, > > > > dma_buf->num_reqs = q_depth; > > > > - buf_size = PAGE_ALIGN(q_depth * max_msg_size); > > + buf_size = 
MANA_MIN_QALIGN(q_depth * max_msg_size); > > > > gmi = &dma_buf->mem_info; > > err = mana_gd_alloc_memory(gc, buf_size, gmi); > > @@ -497,8 +497,8 @@ static int mana_hwc_create_wq(struct > hw_channel_context > > *hwc, > > else > > queue_size = roundup_pow_of_two(GDMA_MAX_SQE_SIZE * > > q_depth); > > > > - if (queue_size < MINIMUM_SUPPORTED_PAGE_SIZE) > > - queue_size = MINIMUM_SUPPORTED_PAGE_SIZE; > > + if (queue_size < MANA_MIN_QSIZE) > > + queue_size = MANA_MIN_QSIZE; > > > > hwc_wq = kzalloc(sizeof(*hwc_wq), GFP_KERNEL); > > if (!hwc_wq) > > @@ -628,10 +628,10 @@ static int mana_hwc_establish_channel(struct > > gdma_context *gc, u16 *q_depth, > > init_completion(&hwc->hwc_init_eqe_comp); > > > > err = mana_smc_setup_hwc(&gc->shm_channel, false, > > - eq->mem_info.dma_handle, > > - cq->mem_info.dma_handle, > > - rq->mem_info.dma_handle, > > - sq->mem_info.dma_handle, > > + virt_to_phys(eq->mem_info.virt_addr), > > + virt_to_phys(cq->mem_info.virt_addr), > > + virt_to_phys(rq->mem_info.virt_addr), > > + virt_to_phys(sq->mem_info.virt_addr), > > This change seems unrelated to handling guest PAGE_SIZE values > other than 4K. Does it belong in a separate patch? Or maybe it just > needs an explanation in the commit message of this patch? I know dma_handle is usually just the phys adr. But this is not always True if IOMMU is used... I have no problem to put it to a separate patch if desired. > > > eq->eq.msix_index); > > if (err) > > return err; > > diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c > > b/drivers/net/ethernet/microsoft/mana/mana_en.c > > index d087cf954f75..6a891dbce686 100644 > > --- a/drivers/net/ethernet/microsoft/mana/mana_en.c > > +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c > > @@ -1889,10 +1889,10 @@ static int mana_create_txq(struct > mana_port_context > > *apc, > > * to prevent overflow. 
> > */ > > txq_size = MAX_SEND_BUFFERS_PER_QUEUE * 32; > > - BUILD_BUG_ON(!PAGE_ALIGNED(txq_size)); > > + BUILD_BUG_ON(!MANA_MIN_QALIGNED(txq_size)); > > > > cq_size = MAX_SEND_BUFFERS_PER_QUEUE * COMP_ENTRY_SIZE; > > - cq_size = PAGE_ALIGN(cq_size); > > + cq_size = MANA_MIN_QALIGN(cq_size); > > > > gc = gd->gdma_context; > > > > @@ -2189,8 +2189,8 @@ static struct mana_rxq *mana_create_rxq(struct > > mana_port_context *apc, > > if (err) > > goto out; > > > > - rq_size = PAGE_ALIGN(rq_size); > > - cq_size = PAGE_ALIGN(cq_size); > > + rq_size = MANA_MIN_QALIGN(rq_size); > > + cq_size = MANA_MIN_QALIGN(cq_size); > > > > /* Create RQ */ > > memset(&spec, 0, sizeof(spec)); > > diff --git a/drivers/net/ethernet/microsoft/mana/shm_channel.c > > b/drivers/net/ethernet/microsoft/mana/shm_channel.c > > index 5553af9c8085..9a54a163d8d1 100644 > > --- a/drivers/net/ethernet/microsoft/mana/shm_channel.c > > +++ b/drivers/net/ethernet/microsoft/mana/shm_channel.c > > @@ -6,6 +6,7 @@ > > #include <linux/io.h> > > #include <linux/mm.h> > > > > +#include <net/mana/gdma.h> > > #include <net/mana/shm_channel.h> > > > > #define PAGE_FRAME_L48_WIDTH_BYTES 6 > > @@ -183,7 +184,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool > > reset_vf, u64 eq_addr, > > > > /* EQ addr: low 48 bits of frame address */ > > shmem = (u64 *)ptr; > > - frame_addr = PHYS_PFN(eq_addr); > > + frame_addr = MANA_PFN(eq_addr); > > *shmem = frame_addr & PAGE_FRAME_L48_MASK; > > all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << > > (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); > > In mana_smc_setup_hwc() a few lines above this change, code using > PAGE_ALIGNED() is unchanged. Is it correct that the eq/cq/rq/sq > addresses > must be aligned to 64K if PAGE_SIZE is 64K? Since we still using PHYS_PFN on them, if not aligned to PAGE_SIZE, the lower bits may be lost. (You said the same below.) > > Related, I wonder about how MANA_PFN() is defined. 
If PAGE_SIZE is 64K, > MANA_PFN() will first right-shift 16, then left shift 4. The net is > right-shift 12, > corresponding to the 4K chunks that MANA expects. But that approach > guarantees > that the rightmost 4 bits of the MANA PFN will always be zero. That's > consistent > with requiring the addresses to be PAGE_ALIGNED() to 64K, but I'm unclear > whether > that is really the requirement. You might compare with the definition of > HVPFN_DOWN(), which has a similar goal for Linux guests communicating > with > Hyper-V. @Paul Rosswurm You said MANA HW has "no page concept". So the "frame_addr" In the mana_smc_setup_hwc() is NOT related to physical page number, correct? Can we just use phys_adr >> 12 like below? #define MANA_MIN_QSHIFT 12 #define MANA_PFN(a) ((a) >> MANA_MIN_QSHIFT) /* EQ addr: low 48 bits of frame address */ shmem = (u64 *)ptr; - frame_addr = PHYS_PFN(eq_addr); + frame_addr = MANA_PFN(eq_addr); > > > @@ -191,7 +192,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool > > reset_vf, u64 eq_addr, > > > > /* CQ addr: low 48 bits of frame address */ > > shmem = (u64 *)ptr; > > - frame_addr = PHYS_PFN(cq_addr); > > + frame_addr = MANA_PFN(cq_addr); > > *shmem = frame_addr & PAGE_FRAME_L48_MASK; > > all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << > > (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); > > @@ -199,7 +200,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool > > reset_vf, u64 eq_addr, > > > > /* RQ addr: low 48 bits of frame address */ > > shmem = (u64 *)ptr; > > - frame_addr = PHYS_PFN(rq_addr); > > + frame_addr = MANA_PFN(rq_addr); > > *shmem = frame_addr & PAGE_FRAME_L48_MASK; > > all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << > > (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); > > @@ -207,7 +208,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool > > reset_vf, u64 eq_addr, > > > > /* SQ addr: low 48 bits of frame address */ > > shmem = (u64 *)ptr; > > - frame_addr = PHYS_PFN(sq_addr); > > + 
frame_addr = MANA_PFN(sq_addr); > > *shmem = frame_addr & PAGE_FRAME_L48_MASK; > > all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << > > (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); > > diff --git a/include/net/mana/gdma.h b/include/net/mana/gdma.h > > index 27684135bb4d..b392559c33e9 100644 > > --- a/include/net/mana/gdma.h > > +++ b/include/net/mana/gdma.h > > @@ -224,7 +224,12 @@ struct gdma_dev { > > struct auxiliary_device *adev; > > }; > > > > -#define MINIMUM_SUPPORTED_PAGE_SIZE PAGE_SIZE > > +/* These are defined by HW */ > > +#define MANA_MIN_QSHIFT 12 > > +#define MANA_MIN_QSIZE (1 << MANA_MIN_QSHIFT) > > +#define MANA_MIN_QALIGN(x) ALIGN((x), MANA_MIN_QSIZE) > > +#define MANA_MIN_QALIGNED(addr) IS_ALIGNED((unsigned long)(addr), > MANA_MIN_QSIZE) > > +#define MANA_PFN(a) (PHYS_PFN(a) << (PAGE_SHIFT - MANA_MIN_QSHIFT)) > > See comments above about how this is defined. Replied above. Thank you for all the detailed comments! - Haiyang
> -----Original Message----- > From: Haiyang Zhang <haiyangz@microsoft.com> > Sent: Tuesday, June 11, 2024 1:44 PM > To: Michael Kelley <mhklinux@outlook.com>; linux-hyperv@vger.kernel.org; > netdev@vger.kernel.org; Paul Rosswurm <paulros@microsoft.com> > Cc: Dexuan Cui <decui@microsoft.com>; stephen@networkplumber.org; KY > Srinivasan <kys@microsoft.com>; olaf@aepfle.de; vkuznets > <vkuznets@redhat.com>; davem@davemloft.net; wei.liu@kernel.org; > edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; leon@kernel.org; > Long Li <longli@microsoft.com>; ssengar@linux.microsoft.com; linux- > rdma@vger.kernel.org; daniel@iogearbox.net; john.fastabend@gmail.com; > bpf@vger.kernel.org; ast@kernel.org; hawk@kernel.org; tglx@linutronix.de; > shradhagupta@linux.microsoft.com; linux-kernel@vger.kernel.org > Subject: RE: [PATCH net-next] net: mana: Add support for variable page > sizes of ARM64 > > > @@ -183,7 +184,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, > bool > > > reset_vf, u64 eq_addr, > > > > > > /* EQ addr: low 48 bits of frame address */ > > > shmem = (u64 *)ptr; > > > - frame_addr = PHYS_PFN(eq_addr); > > > + frame_addr = MANA_PFN(eq_addr); > > > *shmem = frame_addr & PAGE_FRAME_L48_MASK; > > > all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << > > > (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); > > > > In mana_smc_setup_hwc() a few lines above this change, code using > > PAGE_ALIGNED() is unchanged. Is it correct that the eq/cq/rq/sq > > addresses > > must be aligned to 64K if PAGE_SIZE is 64K? > > Since we still using PHYS_PFN on them, if not aligned to PAGE_SIZE, > the lower bits may be lost. (You said the same below.) > > > > > Related, I wonder about how MANA_PFN() is defined. If PAGE_SIZE is 64K, > > MANA_PFN() will first right-shift 16, then left shift 4. The net is > > right-shift 12, > > corresponding to the 4K chunks that MANA expects. 
But that approach > > guarantees > > that the rightmost 4 bits of the MANA PFN will always be zero. That's > > consistent > > with requiring the addresses to be PAGE_ALIGNED() to 64K, but I'm > unclear > > whether > > that is really the requirement. You might compare with the definition > of > > HVPFN_DOWN(), which has a similar goal for Linux guests communicating > > with > > Hyper-V. > > @Paul Rosswurm You said MANA HW has "no page concept". So the > "frame_addr" > In the mana_smc_setup_hwc() is NOT related to physical page number, > correct? > Can we just use phys_adr >> 12 like below? > > #define MANA_MIN_QSHIFT 12 > #define MANA_PFN(a) ((a) >> MANA_MIN_QSHIFT) > > /* EQ addr: low 48 bits of frame address */ > shmem = (u64 *)ptr; > - frame_addr = PHYS_PFN(eq_addr); > + frame_addr = MANA_PFN(eq_addr); > I just confirmed with Paul, we can use phys_adr >> 12. And I will change the alignment requirements to be 4k. Thanks, - Haiyang
From: Haiyang Zhang <haiyangz@microsoft.com> Sent: Tuesday, June 11, 2024 10:44 AM > > > -----Original Message----- > > From: Michael Kelley <mhklinux@outlook.com> > > Sent: Tuesday, June 11, 2024 12:35 PM > > To: Haiyang Zhang <haiyangz@microsoft.com>; linux-hyperv@vger.kernel.org; > > netdev@vger.kernel.org > > Cc: Dexuan Cui <decui@microsoft.com>; stephen@networkplumber.org; KY > > Srinivasan <kys@microsoft.com>; Paul Rosswurm <paulros@microsoft.com>; > > olaf@aepfle.de; vkuznets <vkuznets@redhat.com>; davem@davemloft.net; > > wei.liu@kernel.org; edumazet@google.com; kuba@kernel.org; > > pabeni@redhat.com; leon@kernel.org; Long Li <longli@microsoft.com>; > > ssengar@linux.microsoft.com; linux-rdma@vger.kernel.org; > > daniel@iogearbox.net; john.fastabend@gmail.com; bpf@vger.kernel.org; > > ast@kernel.org; hawk@kernel.org; tglx@linutronix.de; > > shradhagupta@linux.microsoft.com; linux-kernel@vger.kernel.org > > Subject: RE: [PATCH net-next] net: mana: Add support for variable page > > sizes of ARM64 > > > > From: LKML haiyangz <lkmlhyz@microsoft.com> On Behalf Of Haiyang Zhang > > Sent: Monday, June 10, 2024 2:23 PM > > > > > > As defined by the MANA Hardware spec, the queue size for DMA is 4KB > > > minimal, and power of 2. > > > > You say the hardware requires 4K "minimal". But the definitions in this > > patch hardcode to 4K, as if that's the only choice. Is the hardcoding to > > 4K a design decision made to simplify the MANA driver? > > The HWC q size has to be exactly 4k, which is by HW design. > Other "regular" queues can be 2^n >= 4k. > > > > > > To support variable page sizes (4KB, 16KB, 64KB) of ARM64, define > > > > A minor nit, but "variable" page size doesn't seem like quite the right > > description -- both here and in the Subject line. 
On ARM64, the page size > > is a choice among a few fixed options. Perhaps call it support for "page sizes > > other than 4K"? > > "page sizes other than 4K" sounds good. > > > > > > the minimal queue size as a macro separate from the PAGE_SIZE, which > > > we always assumed it to be 4KB before supporting ARM64. > > > Also, update the relevant code related to size alignment, DMA region > > > calculations, etc. > > > > > > Signed-off-by: Haiyang Zhang <mailto:haiyangz@microsoft.com> > > > --- > > > drivers/net/ethernet/microsoft/Kconfig | 2 +- > > > .../net/ethernet/microsoft/mana/gdma_main.c | 8 +++---- > > > .../net/ethernet/microsoft/mana/hw_channel.c | 22 +++++++++---------- > > > drivers/net/ethernet/microsoft/mana/mana_en.c | 8 +++---- > > > .../net/ethernet/microsoft/mana/shm_channel.c | 9 ++++---- > > > include/net/mana/gdma.h | 7 +++++- > > > include/net/mana/mana.h | 3 ++- > > > 7 files changed, 33 insertions(+), 26 deletions(-) > > > > > > diff --git a/drivers/net/ethernet/microsoft/Kconfig > > > b/drivers/net/ethernet/microsoft/Kconfig > > > index 286f0d5697a1..901fbffbf718 100644 > > > --- a/drivers/net/ethernet/microsoft/Kconfig > > > +++ b/drivers/net/ethernet/microsoft/Kconfig > > > @@ -18,7 +18,7 @@ if NET_VENDOR_MICROSOFT > > > config MICROSOFT_MANA > > > tristate "Microsoft Azure Network Adapter (MANA) support" > > > depends on PCI_MSI > > > - depends on X86_64 || (ARM64 && !CPU_BIG_ENDIAN && ARM64_4K_PAGES) > > > + depends on X86_64 || (ARM64 && !CPU_BIG_ENDIAN) > > > depends on PCI_HYPERV > > > select AUXILIARY_BUS > > > select PAGE_POOL > > > diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c > > > b/drivers/net/ethernet/microsoft/mana/gdma_main.c > > > index 1332db9a08eb..c9df942d0d02 100644 > > > --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c > > > +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c > > > @@ -182,7 +182,7 @@ int mana_gd_alloc_memory(struct gdma_context *gc, > > > unsigned int length, > > > dma_addr_t 
dma_handle; > > > void *buf; > > > > > > - if (length < PAGE_SIZE || !is_power_of_2(length)) > > > + if (length < MANA_MIN_QSIZE || !is_power_of_2(length)) > > > return -EINVAL; > > > > > > gmi->dev = gc->dev; > > > @@ -717,7 +717,7 @@ EXPORT_SYMBOL_NS(mana_gd_destroy_dma_region, NET_MANA); > > > static int mana_gd_create_dma_region(struct gdma_dev *gd, > > > struct gdma_mem_info *gmi) > > > { > > > - unsigned int num_page = gmi->length / PAGE_SIZE; > > > + unsigned int num_page = gmi->length / MANA_MIN_QSIZE; > > > > This calculation seems a bit weird when using MANA_MIN_QSIZE. The > > number of pages, and the construction of the page_addr_list array > > a few lines later, seem unrelated to the concept of a minimum queue > > size. Is the right concept really a "mapping chunk", and num_page > > would conceptually be "num_chunks", or something like that? Then > > a queue must be at least one chunk in size, but that's derived from the > > chunk size, and is not the core concept. > > I think calling it "num_chunks" is fine. > May I use "num_chunks" in next version? > I think first is the decision on what to use for MANA_MIN_QSIZE. I'm admittedly not familiar with mana and gdma, but the function mana_gd_create_dma_region() seems fairly typical in defining a logical region that spans multiple 4K chunks that may or may not be physically contiguous. So you set up an array of physical addresses that identify the physical memory location of each chunk. It seems very similar to a Hyper-V GPADL. I typically *do* see such chunks called "pages", but a "mapping chunk" or "mapping unit" is probably OK too. So mana_gd_create_dma_region() would use MANA_CHUNK_SIZE instead of MANA_MIN_QSIZE. Then you could also define MANA_MIN_QSIZE to be MANA_CHUNK_SIZE, for use specifically when checking the size of a queue. Then if you are using MANA_CHUNK_SIZE, the local variable would be "num_chunks". 
The use of "page_count", "page_addr_list", and "offset_in_page" field names in struct gdma_create_dma_region_req should then be changed as well. Looking further at the function mana_gd_create_dma_region(), there's also the use of offset_in_page(), which is based on the guest PAGE_SIZE. Wouldn't this be problematic if PAGE_SIZE is 64K? But perhaps Paul would weigh in further with his thoughts. > > > > Another approach might be to just call it "MANA_PAGE_SIZE", like > > has been done with HV_HYP_PAGE_SIZE. HV_HYP_PAGE_SIZE exists to > > handle exactly the same issue of the guest PAGE_SIZE potentially > > being different from the fixed 4K size that must be used in host-guest > > communication on Hyper-V. Same thing here with MANA. > > I actually called it "MANA_PAGE_SIZE" in my previous internal patch. > But Paul from the Hostnet team opposed using that name, because > 4kB is the min q size. MANA doesn't have a "page" concept at the HW level. > > > > struct gdma_create_dma_region_req *req = NULL; > > > struct gdma_create_dma_region_resp resp = {}; > > > struct gdma_context *gc = gd->gdma_context; > > > @@ -727,7 +727,7 @@ static int mana_gd_create_dma_region(struct gdma_dev *gd, > > > int err; > > > int i; > > > > > > - if (length < PAGE_SIZE || !is_power_of_2(length)) > > > + if (length < MANA_MIN_QSIZE || !is_power_of_2(length)) > > > return -EINVAL; > > > > > > if (offset_in_page(gmi->virt_addr) != 0) > > > @@ -751,7 +751,7 @@ static int mana_gd_create_dma_region(struct gdma_dev *gd, > > > req->page_addr_list_len = num_page; > > > > > > for (i = 0; i < num_page; i++) > > > - req->page_addr_list[i] = gmi->dma_handle + i * PAGE_SIZE; > > > + req->page_addr_list[i] = gmi->dma_handle + i * MANA_MIN_QSIZE; > > > > > > err = mana_gd_send_request(gc, req_msg_size, req, sizeof(resp), &resp); > > > if (err) > > > diff --git a/drivers/net/ethernet/microsoft/mana/hw_channel.c > > > b/drivers/net/ethernet/microsoft/mana/hw_channel.c > > > index bbc4f9e16c98..038dc31e09cd 100644 > > > ---
a/drivers/net/ethernet/microsoft/mana/hw_channel.c > > > +++ b/drivers/net/ethernet/microsoft/mana/hw_channel.c > > > @@ -362,12 +362,12 @@ static int mana_hwc_create_cq(struct hw_channel_context > > > *hwc, u16 q_depth, > > > int err; > > > > > > eq_size = roundup_pow_of_two(GDMA_EQE_SIZE * q_depth); > > > - if (eq_size < MINIMUM_SUPPORTED_PAGE_SIZE) > > > - eq_size = MINIMUM_SUPPORTED_PAGE_SIZE; > > > + if (eq_size < MANA_MIN_QSIZE) > > > + eq_size = MANA_MIN_QSIZE; > > > > > > cq_size = roundup_pow_of_two(GDMA_CQE_SIZE * q_depth); > > > - if (cq_size < MINIMUM_SUPPORTED_PAGE_SIZE) > > > - cq_size = MINIMUM_SUPPORTED_PAGE_SIZE; > > > + if (cq_size < MANA_MIN_QSIZE) > > > + cq_size = MANA_MIN_QSIZE; > > > > > > hwc_cq = kzalloc(sizeof(*hwc_cq), GFP_KERNEL); > > > if (!hwc_cq) > > > @@ -429,7 +429,7 @@ static int mana_hwc_alloc_dma_buf(struct hw_channel_context *hwc, u16 q_depth, > > > > > > dma_buf->num_reqs = q_depth; > > > > > > - buf_size = PAGE_ALIGN(q_depth * max_msg_size); > > > + buf_size = MANA_MIN_QALIGN(q_depth * max_msg_size); > > > > > > gmi = &dma_buf->mem_info; > > > err = mana_gd_alloc_memory(gc, buf_size, gmi); > > > @@ -497,8 +497,8 @@ static int mana_hwc_create_wq(struct hw_channel_context > > > *hwc, > > > else > > > queue_size = roundup_pow_of_two(GDMA_MAX_SQE_SIZE * q_depth); > > > > > > - if (queue_size < MINIMUM_SUPPORTED_PAGE_SIZE) > > > - queue_size = MINIMUM_SUPPORTED_PAGE_SIZE; > > > + if (queue_size < MANA_MIN_QSIZE) > > > + queue_size = MANA_MIN_QSIZE; > > > > > > hwc_wq = kzalloc(sizeof(*hwc_wq), GFP_KERNEL); > > > if (!hwc_wq) > > > @@ -628,10 +628,10 @@ static int mana_hwc_establish_channel(struct > > > gdma_context *gc, u16 *q_depth, > > > init_completion(&hwc->hwc_init_eqe_comp); > > > > > > err = mana_smc_setup_hwc(&gc->shm_channel, false, > > > - eq->mem_info.dma_handle, > > > - cq->mem_info.dma_handle, > > > - rq->mem_info.dma_handle, > > > - sq->mem_info.dma_handle, > > > + virt_to_phys(eq->mem_info.virt_addr), > > > + 
virt_to_phys(cq->mem_info.virt_addr), > > > + virt_to_phys(rq->mem_info.virt_addr), > > > + virt_to_phys(sq->mem_info.virt_addr), > > > > This change seems unrelated to handling guest PAGE_SIZE values > > other than 4K. Does it belong in a separate patch? Or maybe it just > > needs an explanation in the commit message of this patch? > > I know dma_handle is usually just the phys addr. But this is not always > true if IOMMU is used... > I have no problem putting it in a separate patch if desired. Yes, that would seem like a separate patch. It's not related to handling page size other than 4K. > > > > > > eq->eq.msix_index); > > > if (err) > > > return err; > > > diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c > > > b/drivers/net/ethernet/microsoft/mana/mana_en.c > > > index d087cf954f75..6a891dbce686 100644 > > > --- a/drivers/net/ethernet/microsoft/mana/mana_en.c > > > +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c > > > @@ -1889,10 +1889,10 @@ static int mana_create_txq(struct mana_port_context > > > *apc, > > > * to prevent overflow.
> > > */ > > > txq_size = MAX_SEND_BUFFERS_PER_QUEUE * 32; > > > - BUILD_BUG_ON(!PAGE_ALIGNED(txq_size)); > > > + BUILD_BUG_ON(!MANA_MIN_QALIGNED(txq_size)); > > > > > > cq_size = MAX_SEND_BUFFERS_PER_QUEUE * COMP_ENTRY_SIZE; > > > - cq_size = PAGE_ALIGN(cq_size); > > > + cq_size = MANA_MIN_QALIGN(cq_size); > > > > > > gc = gd->gdma_context; > > > > > > @@ -2189,8 +2189,8 @@ static struct mana_rxq *mana_create_rxq(struct mana_port_context *apc, > > > if (err) > > > goto out; > > > > > > - rq_size = PAGE_ALIGN(rq_size); > > > - cq_size = PAGE_ALIGN(cq_size); > > > + rq_size = MANA_MIN_QALIGN(rq_size); > > > + cq_size = MANA_MIN_QALIGN(cq_size); > > > > > > /* Create RQ */ > > > memset(&spec, 0, sizeof(spec)); > > > diff --git a/drivers/net/ethernet/microsoft/mana/shm_channel.c > > > b/drivers/net/ethernet/microsoft/mana/shm_channel.c > > > index 5553af9c8085..9a54a163d8d1 100644 > > > --- a/drivers/net/ethernet/microsoft/mana/shm_channel.c > > > +++ b/drivers/net/ethernet/microsoft/mana/shm_channel.c > > > @@ -6,6 +6,7 @@ > > > #include <linux/io.h> > > > #include <linux/mm.h> > > > > > > +#include <net/mana/gdma.h> > > > #include <net/mana/shm_channel.h> > > > > > > #define PAGE_FRAME_L48_WIDTH_BYTES 6 > > > @@ -183,7 +184,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool reset_vf, u64 eq_addr, > > > > > > /* EQ addr: low 48 bits of frame address */ > > > shmem = (u64 *)ptr; > > > - frame_addr = PHYS_PFN(eq_addr); > > > + frame_addr = MANA_PFN(eq_addr); > > > *shmem = frame_addr & PAGE_FRAME_L48_MASK; > > > all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << > > > (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); > > > > In mana_smc_setup_hwc() a few lines above this change, code using > > PAGE_ALIGNED() is unchanged. Is it correct that the eq/cq/rq/sq > > addresses > > must be aligned to 64K if PAGE_SIZE is 64K? > > Since we are still using PHYS_PFN on them, if not aligned to PAGE_SIZE, > the lower bits may be lost. (You said the same below.)
> > > > > Related, I wonder about how MANA_PFN() is defined. If PAGE_SIZE is 64K, > > MANA_PFN() will first right-shift 16, then left shift 4. The net is > > right-shift 12, > > corresponding to the 4K chunks that MANA expects. But that approach > > guarantees > > that the rightmost 4 bits of the MANA PFN will always be zero. That's > > consistent > > with requiring the addresses to be PAGE_ALIGNED() to 64K, but I'm unclear > > whether > > that is really the requirement. You might compare with the definition of > > HVPFN_DOWN(), which has a similar goal for Linux guests communicating > > with > > Hyper-V. > > @Paul Rosswurm You said MANA HW has "no page concept". So the "frame_addr" > in mana_smc_setup_hwc() is NOT related to physical page number, correct? > Can we just use phys_addr >> 12 like below? > > #define MANA_MIN_QSHIFT 12 > #define MANA_PFN(a) ((a) >> MANA_MIN_QSHIFT) > > /* EQ addr: low 48 bits of frame address */ > shmem = (u64 *)ptr; > - frame_addr = PHYS_PFN(eq_addr); > + frame_addr = MANA_PFN(eq_addr); > > > > > > @@ -191,7 +192,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool > > > reset_vf, u64 eq_addr, > > > > > > /* CQ addr: low 48 bits of frame address */ > > > shmem = (u64 *)ptr; > > > - frame_addr = PHYS_PFN(cq_addr); > > > + frame_addr = MANA_PFN(cq_addr); > > > *shmem = frame_addr & PAGE_FRAME_L48_MASK; > > > all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << > > > (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); > > > @@ -199,7 +200,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool > > > reset_vf, u64 eq_addr, > > > > > > /* RQ addr: low 48 bits of frame address */ > > > shmem = (u64 *)ptr; > > > - frame_addr = PHYS_PFN(rq_addr); > > > + frame_addr = MANA_PFN(rq_addr); > > > *shmem = frame_addr & PAGE_FRAME_L48_MASK; > > > all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << > > > (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); > > > @@ -207,7 +208,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc,
bool > > > reset_vf, u64 eq_addr, > > > > > > /* SQ addr: low 48 bits of frame address */ > > > shmem = (u64 *)ptr; > > > - frame_addr = PHYS_PFN(sq_addr); > > > + frame_addr = MANA_PFN(sq_addr); > > > *shmem = frame_addr & PAGE_FRAME_L48_MASK; > > > all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << > > > (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); > > > diff --git a/include/net/mana/gdma.h b/include/net/mana/gdma.h > > > index 27684135bb4d..b392559c33e9 100644 > > > --- a/include/net/mana/gdma.h > > > +++ b/include/net/mana/gdma.h > > > @@ -224,7 +224,12 @@ struct gdma_dev { > > > struct auxiliary_device *adev; > > > }; > > > > > > -#define MINIMUM_SUPPORTED_PAGE_SIZE PAGE_SIZE > > > +/* These are defined by HW */ > > > +#define MANA_MIN_QSHIFT 12 > > > +#define MANA_MIN_QSIZE (1 << MANA_MIN_QSHIFT) > > > +#define MANA_MIN_QALIGN(x) ALIGN((x), MANA_MIN_QSIZE) > > > +#define MANA_MIN_QALIGNED(addr) IS_ALIGNED((unsigned long)(addr), > > MANA_MIN_QSIZE) > > > +#define MANA_PFN(a) (PHYS_PFN(a) << (PAGE_SHIFT - MANA_MIN_QSHIFT)) > > > > See comments above about how this is defined. > > Replied above. > Thank you for all the detailed comments! > > - Haiyang
> -----Original Message----- > From: Michael Kelley <mhklinux@outlook.com> > Sent: Tuesday, June 11, 2024 10:42 PM > To: Haiyang Zhang <haiyangz@microsoft.com>; linux-hyperv@vger.kernel.org; > netdev@vger.kernel.org; Paul Rosswurm <paulros@microsoft.com> > Cc: Dexuan Cui <decui@microsoft.com>; stephen@networkplumber.org; KY > Srinivasan <kys@microsoft.com>; olaf@aepfle.de; vkuznets > <vkuznets@redhat.com>; davem@davemloft.net; wei.liu@kernel.org; > edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; leon@kernel.org; > Long Li <longli@microsoft.com>; ssengar@linux.microsoft.com; linux- > rdma@vger.kernel.org; daniel@iogearbox.net; john.fastabend@gmail.com; > bpf@vger.kernel.org; ast@kernel.org; hawk@kernel.org; tglx@linutronix.de; > shradhagupta@linux.microsoft.com; linux-kernel@vger.kernel.org > Subject: RE: [PATCH net-next] net: mana: Add support for variable page > sizes of ARM64 > > > > diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c > > > > b/drivers/net/ethernet/microsoft/mana/gdma_main.c > > > > index 1332db9a08eb..c9df942d0d02 100644 > > > > --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c > > > > +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c > > > > @@ -182,7 +182,7 @@ int mana_gd_alloc_memory(struct gdma_context > *gc, > > > > unsigned int length, > > > > dma_addr_t dma_handle; > > > > void *buf; > > > > > > > > - if (length < PAGE_SIZE || !is_power_of_2(length)) > > > > + if (length < MANA_MIN_QSIZE || !is_power_of_2(length)) > > > > return -EINVAL; > > > > > > > > gmi->dev = gc->dev; > > > > @@ -717,7 +717,7 @@ EXPORT_SYMBOL_NS(mana_gd_destroy_dma_region, > NET_MANA); > > > > static int mana_gd_create_dma_region(struct gdma_dev *gd, > > > > struct gdma_mem_info *gmi) > > > > { > > > > - unsigned int num_page = gmi->length / PAGE_SIZE; > > > > + unsigned int num_page = gmi->length / MANA_MIN_QSIZE; > > > > > > This calculation seems a bit weird when using MANA_MIN_QSIZE. 
The > > > number of pages, and the construction of the page_addr_list array > > > a few lines later, seem unrelated to the concept of a minimum queue > > > size. Is the right concept really a "mapping chunk", and num_page > > > would conceptually be "num_chunks", or something like that? Then > > > a queue must be at least one chunk in size, but that's derived from > the > > > chunk size, and is not the core concept. > > > > I think calling it "num_chunks" is fine. > > May I use "num_chunks" in next version? > > > > I think first is the decision on what to use for MANA_MIN_QSIZE. I'm > admittedly not familiar with mana and gdma, but the function > mana_gd_create_dma_region() seems fairly typical in defining a > logical region that spans multiple 4K chunks that may or may not > be physically contiguous. So you set up an array of physical > addresses that identify the physical memory location of each chunk. > It seems very similar to a Hyper-V GPADL. I typically *do* see such > chunks called "pages", but a "mapping chunk" or "mapping unit" > is probably OK too. So mana_gd_create_dma_region() would use > MANA_CHUNK_SIZE instead of MANA_MIN_QSIZE. Then you could > also define MANA_MIN_QSIZE to be MANA_CHUNK_SIZE, for use > specifically when checking the size of a queue. > > Then if you are using MANA_CHUNK_SIZE, the local variable > would be "num_chunks". The use of "page_count", "page_addr_list", > and "offset_in_page" field names in struct > gdma_create_dma_region_req should then be changed as well. I'm fine with these names. I will check with Paul too. I'd define just one macro, with a code comment like this. It will be used for chunk calculation and q len checking: /* Chunk size of MANA DMA, which is also the min queue size */ #define MANA_CHUNK_SIZE 4096 > Looking further at the function mana_gd_create_dma_region(), > there's also the use of offset_in_page(), which is based on the > guest PAGE_SIZE. Wouldn't this be problematic if PAGE_SIZE > is 64K? 
As in my other email - I confirmed with the Hostnet team that the alignment requirement is just 4k. So I will relax the following check to 4k alignment too: if (offset_in_page(gmi->virt_addr) != 0) return -EINVAL; Thanks, - Haiyang
> -----Original Message----- > From: Haiyang Zhang <haiyangz@microsoft.com> > Sent: Wednesday, June 12, 2024 10:22 AM > To: Michael Kelley <mhklinux@outlook.com>; linux-hyperv@vger.kernel.org; > netdev@vger.kernel.org; Paul Rosswurm <paulros@microsoft.com> > Cc: Dexuan Cui <decui@microsoft.com>; stephen@networkplumber.org; KY > Srinivasan <kys@microsoft.com>; olaf@aepfle.de; vkuznets > <vkuznets@redhat.com>; davem@davemloft.net; wei.liu@kernel.org; > edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; leon@kernel.org; > Long Li <longli@microsoft.com>; ssengar@linux.microsoft.com; linux- > rdma@vger.kernel.org; daniel@iogearbox.net; john.fastabend@gmail.com; > bpf@vger.kernel.org; ast@kernel.org; hawk@kernel.org; tglx@linutronix.de; > shradhagupta@linux.microsoft.com; linux-kernel@vger.kernel.org > Subject: RE: [PATCH net-next] net: mana: Add support for variable page > sizes of ARM64 > > > > > -----Original Message----- > > From: Michael Kelley <mhklinux@outlook.com> > > Sent: Tuesday, June 11, 2024 10:42 PM > > Subject: RE: [PATCH net-next] net: mana: Add support for variable page > > sizes of ARM64 > > > > > > diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c > > > > > b/drivers/net/ethernet/microsoft/mana/gdma_main.c > > > > > index
1332db9a08eb..c9df942d0d02 100644 > > > > > --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c > > > > > +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c > > > > > @@ -182,7 +182,7 @@ int mana_gd_alloc_memory(struct gdma_context > > *gc, > > > > > unsigned int length, > > > > > dma_addr_t dma_handle; > > > > > void *buf; > > > > > > > > > > - if (length < PAGE_SIZE || !is_power_of_2(length)) > > > > > + if (length < MANA_MIN_QSIZE || !is_power_of_2(length)) > > > > > return -EINVAL; > > > > > > > > > > gmi->dev = gc->dev; > > > > > @@ -717,7 +717,7 @@ EXPORT_SYMBOL_NS(mana_gd_destroy_dma_region, > > NET_MANA); > > > > > static int mana_gd_create_dma_region(struct gdma_dev *gd, > > > > > struct gdma_mem_info *gmi) > > > > > { > > > > > - unsigned int num_page = gmi->length / PAGE_SIZE; > > > > > + unsigned int num_page = gmi->length / MANA_MIN_QSIZE; > > > > > > > > This calculation seems a bit weird when using MANA_MIN_QSIZE. The > > > > number of pages, and the construction of the page_addr_list array > > > > a few lines later, seem unrelated to the concept of a minimum queue > > > > size. Is the right concept really a "mapping chunk", and num_page > > > > would conceptually be "num_chunks", or something like that? Then > > > > a queue must be at least one chunk in size, but that's derived from > > the > > > > chunk size, and is not the core concept. > > > > > > I think calling it "num_chunks" is fine. > > > May I use "num_chunks" in next version? > > > > > > > I think first is the decision on what to use for MANA_MIN_QSIZE. I'm > > admittedly not familiar with mana and gdma, but the function > > mana_gd_create_dma_region() seems fairly typical in defining a > > logical region that spans multiple 4K chunks that may or may not > > be physically contiguous. So you set up an array of physical > > addresses that identify the physical memory location of each chunk. > > It seems very similar to a Hyper-V GPADL. 
I typically *do* see such > > chunks called "pages", but a "mapping chunk" or "mapping unit" > > is probably OK too. So mana_gd_create_dma_region() would use > > MANA_CHUNK_SIZE instead of MANA_MIN_QSIZE. Then you could > > also define MANA_MIN_QSIZE to be MANA_CHUNK_SIZE, for use > > specifically when checking the size of a queue. > > > > Then if you are using MANA_CHUNK_SIZE, the local variable > > would be "num_chunks". The use of "page_count", "page_addr_list", > > and "offset_in_page" field names in struct > > gdma_create_dma_region_req should then be changed as well. > > I'm fine with these names. I will check with Paul too. > > I'd define just one macro, with a code comment like this. It > will be used for chunk calculation and q len checking: > /* Chunk size of MANA DMA, which is also the min queue size */ > #define MANA_CHUNK_SIZE 4096 > > After further discussion with Paul, and reading documents, we decided to use MANA_PAGE_SIZE for DMA unit calculations etc. And use another macro MANA_MIN_QSIZE for queue length checking. This is also what Michael initially suggested. Thanks, - Haiyang
diff --git a/drivers/net/ethernet/microsoft/Kconfig b/drivers/net/ethernet/microsoft/Kconfig index 286f0d5697a1..901fbffbf718 100644 --- a/drivers/net/ethernet/microsoft/Kconfig +++ b/drivers/net/ethernet/microsoft/Kconfig @@ -18,7 +18,7 @@ if NET_VENDOR_MICROSOFT config MICROSOFT_MANA tristate "Microsoft Azure Network Adapter (MANA) support" depends on PCI_MSI - depends on X86_64 || (ARM64 && !CPU_BIG_ENDIAN && ARM64_4K_PAGES) + depends on X86_64 || (ARM64 && !CPU_BIG_ENDIAN) depends on PCI_HYPERV select AUXILIARY_BUS select PAGE_POOL diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c index 1332db9a08eb..c9df942d0d02 100644 --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c @@ -182,7 +182,7 @@ int mana_gd_alloc_memory(struct gdma_context *gc, unsigned int length, dma_addr_t dma_handle; void *buf; - if (length < PAGE_SIZE || !is_power_of_2(length)) + if (length < MANA_MIN_QSIZE || !is_power_of_2(length)) return -EINVAL; gmi->dev = gc->dev; @@ -717,7 +717,7 @@ EXPORT_SYMBOL_NS(mana_gd_destroy_dma_region, NET_MANA); static int mana_gd_create_dma_region(struct gdma_dev *gd, struct gdma_mem_info *gmi) { - unsigned int num_page = gmi->length / PAGE_SIZE; + unsigned int num_page = gmi->length / MANA_MIN_QSIZE; struct gdma_create_dma_region_req *req = NULL; struct gdma_create_dma_region_resp resp = {}; struct gdma_context *gc = gd->gdma_context; @@ -727,7 +727,7 @@ static int mana_gd_create_dma_region(struct gdma_dev *gd, int err; int i; - if (length < PAGE_SIZE || !is_power_of_2(length)) + if (length < MANA_MIN_QSIZE || !is_power_of_2(length)) return -EINVAL; if (offset_in_page(gmi->virt_addr) != 0) @@ -751,7 +751,7 @@ static int mana_gd_create_dma_region(struct gdma_dev *gd, req->page_addr_list_len = num_page; for (i = 0; i < num_page; i++) - req->page_addr_list[i] = gmi->dma_handle + i * PAGE_SIZE; + req->page_addr_list[i] = gmi->dma_handle + i * 
MANA_MIN_QSIZE; err = mana_gd_send_request(gc, req_msg_size, req, sizeof(resp), &resp); if (err) diff --git a/drivers/net/ethernet/microsoft/mana/hw_channel.c b/drivers/net/ethernet/microsoft/mana/hw_channel.c index bbc4f9e16c98..038dc31e09cd 100644 --- a/drivers/net/ethernet/microsoft/mana/hw_channel.c +++ b/drivers/net/ethernet/microsoft/mana/hw_channel.c @@ -362,12 +362,12 @@ static int mana_hwc_create_cq(struct hw_channel_context *hwc, u16 q_depth, int err; eq_size = roundup_pow_of_two(GDMA_EQE_SIZE * q_depth); - if (eq_size < MINIMUM_SUPPORTED_PAGE_SIZE) - eq_size = MINIMUM_SUPPORTED_PAGE_SIZE; + if (eq_size < MANA_MIN_QSIZE) + eq_size = MANA_MIN_QSIZE; cq_size = roundup_pow_of_two(GDMA_CQE_SIZE * q_depth); - if (cq_size < MINIMUM_SUPPORTED_PAGE_SIZE) - cq_size = MINIMUM_SUPPORTED_PAGE_SIZE; + if (cq_size < MANA_MIN_QSIZE) + cq_size = MANA_MIN_QSIZE; hwc_cq = kzalloc(sizeof(*hwc_cq), GFP_KERNEL); if (!hwc_cq) @@ -429,7 +429,7 @@ static int mana_hwc_alloc_dma_buf(struct hw_channel_context *hwc, u16 q_depth, dma_buf->num_reqs = q_depth; - buf_size = PAGE_ALIGN(q_depth * max_msg_size); + buf_size = MANA_MIN_QALIGN(q_depth * max_msg_size); gmi = &dma_buf->mem_info; err = mana_gd_alloc_memory(gc, buf_size, gmi); @@ -497,8 +497,8 @@ static int mana_hwc_create_wq(struct hw_channel_context *hwc, else queue_size = roundup_pow_of_two(GDMA_MAX_SQE_SIZE * q_depth); - if (queue_size < MINIMUM_SUPPORTED_PAGE_SIZE) - queue_size = MINIMUM_SUPPORTED_PAGE_SIZE; + if (queue_size < MANA_MIN_QSIZE) + queue_size = MANA_MIN_QSIZE; hwc_wq = kzalloc(sizeof(*hwc_wq), GFP_KERNEL); if (!hwc_wq) @@ -628,10 +628,10 @@ static int mana_hwc_establish_channel(struct gdma_context *gc, u16 *q_depth, init_completion(&hwc->hwc_init_eqe_comp); err = mana_smc_setup_hwc(&gc->shm_channel, false, - eq->mem_info.dma_handle, - cq->mem_info.dma_handle, - rq->mem_info.dma_handle, - sq->mem_info.dma_handle, + virt_to_phys(eq->mem_info.virt_addr), + virt_to_phys(cq->mem_info.virt_addr), + 
virt_to_phys(rq->mem_info.virt_addr), + virt_to_phys(sq->mem_info.virt_addr), eq->eq.msix_index); if (err) return err; diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c index d087cf954f75..6a891dbce686 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -1889,10 +1889,10 @@ static int mana_create_txq(struct mana_port_context *apc, * to prevent overflow. */ txq_size = MAX_SEND_BUFFERS_PER_QUEUE * 32; - BUILD_BUG_ON(!PAGE_ALIGNED(txq_size)); + BUILD_BUG_ON(!MANA_MIN_QALIGNED(txq_size)); cq_size = MAX_SEND_BUFFERS_PER_QUEUE * COMP_ENTRY_SIZE; - cq_size = PAGE_ALIGN(cq_size); + cq_size = MANA_MIN_QALIGN(cq_size); gc = gd->gdma_context; @@ -2189,8 +2189,8 @@ static struct mana_rxq *mana_create_rxq(struct mana_port_context *apc, if (err) goto out; - rq_size = PAGE_ALIGN(rq_size); - cq_size = PAGE_ALIGN(cq_size); + rq_size = MANA_MIN_QALIGN(rq_size); + cq_size = MANA_MIN_QALIGN(cq_size); /* Create RQ */ memset(&spec, 0, sizeof(spec)); diff --git a/drivers/net/ethernet/microsoft/mana/shm_channel.c b/drivers/net/ethernet/microsoft/mana/shm_channel.c index 5553af9c8085..9a54a163d8d1 100644 --- a/drivers/net/ethernet/microsoft/mana/shm_channel.c +++ b/drivers/net/ethernet/microsoft/mana/shm_channel.c @@ -6,6 +6,7 @@ #include <linux/io.h> #include <linux/mm.h> +#include <net/mana/gdma.h> #include <net/mana/shm_channel.h> #define PAGE_FRAME_L48_WIDTH_BYTES 6 @@ -183,7 +184,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool reset_vf, u64 eq_addr, /* EQ addr: low 48 bits of frame address */ shmem = (u64 *)ptr; - frame_addr = PHYS_PFN(eq_addr); + frame_addr = MANA_PFN(eq_addr); *shmem = frame_addr & PAGE_FRAME_L48_MASK; all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); @@ -191,7 +192,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool reset_vf, u64 eq_addr, /* CQ addr: low 48 bits of frame 
address */ shmem = (u64 *)ptr; - frame_addr = PHYS_PFN(cq_addr); + frame_addr = MANA_PFN(cq_addr); *shmem = frame_addr & PAGE_FRAME_L48_MASK; all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); @@ -199,7 +200,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool reset_vf, u64 eq_addr, /* RQ addr: low 48 bits of frame address */ shmem = (u64 *)ptr; - frame_addr = PHYS_PFN(rq_addr); + frame_addr = MANA_PFN(rq_addr); *shmem = frame_addr & PAGE_FRAME_L48_MASK; all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); @@ -207,7 +208,7 @@ int mana_smc_setup_hwc(struct shm_channel *sc, bool reset_vf, u64 eq_addr, /* SQ addr: low 48 bits of frame address */ shmem = (u64 *)ptr; - frame_addr = PHYS_PFN(sq_addr); + frame_addr = MANA_PFN(sq_addr); *shmem = frame_addr & PAGE_FRAME_L48_MASK; all_addr_h4bits |= (frame_addr >> PAGE_FRAME_L48_WIDTH_BITS) << (frame_addr_seq++ * PAGE_FRAME_H4_WIDTH_BITS); diff --git a/include/net/mana/gdma.h b/include/net/mana/gdma.h index 27684135bb4d..b392559c33e9 100644 --- a/include/net/mana/gdma.h +++ b/include/net/mana/gdma.h @@ -224,7 +224,12 @@ struct gdma_dev { struct auxiliary_device *adev; }; -#define MINIMUM_SUPPORTED_PAGE_SIZE PAGE_SIZE +/* These are defined by HW */ +#define MANA_MIN_QSHIFT 12 +#define MANA_MIN_QSIZE (1 << MANA_MIN_QSHIFT) +#define MANA_MIN_QALIGN(x) ALIGN((x), MANA_MIN_QSIZE) +#define MANA_MIN_QALIGNED(addr) IS_ALIGNED((unsigned long)(addr), MANA_MIN_QSIZE) +#define MANA_PFN(a) (PHYS_PFN(a) << (PAGE_SHIFT - MANA_MIN_QSHIFT)) #define GDMA_CQE_SIZE 64 #define GDMA_EQE_SIZE 16 diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h index 561f6719fb4e..43e8fc574354 100644 --- a/include/net/mana/mana.h +++ b/include/net/mana/mana.h @@ -42,7 +42,8 @@ enum TRI_STATE { #define MAX_SEND_BUFFERS_PER_QUEUE 256 -#define EQ_SIZE (8 * PAGE_SIZE) +#define EQ_SIZE (8 * MANA_MIN_QSIZE) + #define LOG2_EQ_THROTTLE 3 
#define MAX_PORTS_IN_MANA_DEV 256
As defined by the MANA Hardware spec, the queue size for DMA is 4KB minimal, and a power of 2. To support variable page sizes (4KB, 16KB, 64KB) of ARM64, define the minimal queue size as a macro separate from the PAGE_SIZE, which we always assumed to be 4KB before supporting ARM64. Also, update the relevant code related to size alignment, DMA region calculations, etc. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> --- drivers/net/ethernet/microsoft/Kconfig | 2 +- .../net/ethernet/microsoft/mana/gdma_main.c | 8 +++---- .../net/ethernet/microsoft/mana/hw_channel.c | 22 +++++++++---------- drivers/net/ethernet/microsoft/mana/mana_en.c | 8 +++---- .../net/ethernet/microsoft/mana/shm_channel.c | 9 ++++---- include/net/mana/gdma.h | 7 +++++- include/net/mana/mana.h | 3 ++- 7 files changed, 33 insertions(+), 26 deletions(-)