Message ID | 20240827042512.216634-2-schalla@marvell.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | octeontx2-af: Update CPT block for CN10KB | expand |
On Tue, Aug 27, 2024 at 09:55:11AM +0530, Srujana Challa wrote: > On CN10KB, number of flt interrupt vectors are reduced to > 2. So, modify the code accordingly for cn10k. I think it would be nice to state that the patch moves away from a hard-coded to dynamic number of vectors. And that this is in order to accommodate the CN10KB, which has 2 vectors, as opposed to chips supported by the driver to date, which have 3. > > Signed-off-by: Srujana Challa <schalla@marvell.com> ... > diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c > index 3e09d2285814..e56d27018828 100644 > --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c > +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c > @@ -37,6 +37,44 @@ > (_rsp)->free_sts_##etype = free_sts; \ > }) > > +#define MAX_AE GENMASK_ULL(47, 32) > +#define MAX_IE GENMASK_ULL(31, 16) > +#define MAX_SE GENMASK_ULL(15, 0) nit: Maybe a blank line here. > +static u32 cpt_max_engines_get(struct rvu *rvu) > +{ > + u16 max_ses, max_ies, max_aes; > + u64 reg; > + > + reg = rvu_read64(rvu, BLKADDR_CPT0, CPT_AF_CONSTANTS1); > + max_ses = FIELD_GET(MAX_SE, reg); > + max_ies = FIELD_GET(MAX_IE, reg); > + max_aes = FIELD_GET(MAX_AE, reg); > + > + return max_ses + max_ies + max_aes; Maybe I am wrong, but it looks like this will perform u16 addition. Can that overflow? I ask because the return type is u32, implying values larger than 64k are expected. > +} > + > +/* Number of flt interrupt vectors are depends on number of engines that the > + * chip has. Each flt vector represents 64 engines. > + */ > +static int cpt_10k_flt_nvecs_get(struct rvu *rvu) > +{ > + u32 max_engs; > + int flt_vecs; > + > + max_engs = cpt_max_engines_get(rvu); > + > + flt_vecs = (max_engs / 64); > + flt_vecs += (max_engs % 64) ? 1 : 0; I don't think the parentheses are needed on the lines above. And likewise elsewhere in this patch. At any rate, here, I think you can use DIV_ROUND_UP. > + > + if (flt_vecs > CPT_10K_AF_INT_VEC_FLT_MAX) { > + dev_warn(rvu->dev, "flt_vecs(%d) exceeds the max vectors(%d)\n", > + flt_vecs, CPT_10K_AF_INT_VEC_FLT_MAX); > + flt_vecs = CPT_10K_AF_INT_VEC_FLT_MAX; > + } cpt_max_engines_get seems to get called quite a bit. I think dev_warn_once() is probably appropriate here. > + > + return flt_vecs; > +} > + > static irqreturn_t cpt_af_flt_intr_handler(int vec, void *ptr) > { > struct rvu_block *block = ptr; > @@ -150,17 +188,25 @@ static void cpt_10k_unregister_interrupts(struct rvu_block *block, int off) > { > struct rvu *rvu = block->rvu; > int blkaddr = block->addr; > + u32 max_engs; > + u8 nr; > int i; > > + max_engs = cpt_max_engines_get(rvu); > + > /* Disable all CPT AF interrupts */ > - rvu_write64(rvu, blkaddr, CPT_AF_FLTX_INT_ENA_W1C(0), ~0ULL); > - rvu_write64(rvu, blkaddr, CPT_AF_FLTX_INT_ENA_W1C(1), ~0ULL); > - rvu_write64(rvu, blkaddr, CPT_AF_FLTX_INT_ENA_W1C(2), 0xFFFF); > + for (i = CPT_10K_AF_INT_VEC_FLT0; i < cpt_10k_flt_nvecs_get(rvu); i++) { I think it would be best to cache the value of cpt_10k_flt_nvecs_get() rather than run it for each iteration of the loop. I'm assuming it has a non-zero cost as it reads a register, the value of which which I assume does not change. Also, the same register is already read by the call to cpt_max_engines_get(). It would be nice to read it just once in this scope. Likewise elsewhere. Also, there is an inconsistency between the type of i and the return type of cpt_10k_flt_nvecs_get(). Probably harmless, but IMHO it would be nice to fix. > + nr = (max_engs > 64) ? 64 : max_engs; > + max_engs -= nr; > + rvu_write64(rvu, blkaddr, CPT_AF_FLTX_INT_ENA_W1C(i), > + INTR_MASK(nr)); > + } > > rvu_write64(rvu, blkaddr, CPT_AF_RVU_INT_ENA_W1C, 0x1); > rvu_write64(rvu, blkaddr, CPT_AF_RAS_INT_ENA_W1C, 0x1); > > - for (i = 0; i < CPT_10K_AF_INT_VEC_CNT; i++) > + /* CPT AF interrupt vectors are flt_int, rvu_int and ras_int. */ > + for (i = 0; i < cpt_10k_flt_nvecs_get(rvu) + 2; i++) It would be nice to avoid the naked '2' here. Perhaps a macro is appropriate. > if (rvu->irq_allocated[off + i]) { > free_irq(pci_irq_vector(rvu->pdev, off + i), block); > rvu->irq_allocated[off + i] = false; ...
> Subject: [EXTERNAL] Re: [PATCH net-next,1/2] octeontx2-af: reduce cpt flt > interrupt vectors for CN10KB > > On Tue, Aug 27, 2024 at 09: 55: 11AM +0530, Srujana Challa wrote: > On > CN10KB, number of flt interrupt vectors are reduced to > 2. So, modify the > code accordingly for cn10k. I think it would be nice to state that the patch > moves away from > On Tue, Aug 27, 2024 at 09:55:11AM +0530, Srujana Challa wrote: > > On CN10KB, number of flt interrupt vectors are reduced to 2. So, > > modify the code accordingly for cn10k. > > I think it would be nice to state that the patch moves away from a hard-coded > to dynamic number of vectors. > And that this is in order to accommodate the CN10KB, which has 2 vectors, as > opposed to chips supported by the driver to date, which have 3. Sure, I will make the change. > > > > > Signed-off-by: Srujana Challa <schalla@marvell.com> > > ... > > > diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c > > b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c > > index 3e09d2285814..e56d27018828 100644 > > --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c > > +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c > > @@ -37,6 +37,44 @@ > > (_rsp)->free_sts_##etype = free_sts; \ > > }) > > > > +#define MAX_AE GENMASK_ULL(47, 32) > > +#define MAX_IE GENMASK_ULL(31, 16) > > +#define MAX_SE GENMASK_ULL(15, 0) > > nit: Maybe a blank line here. > > > +static u32 cpt_max_engines_get(struct rvu *rvu) { > > + u16 max_ses, max_ies, max_aes; > > + u64 reg; > > + > > + reg = rvu_read64(rvu, BLKADDR_CPT0, CPT_AF_CONSTANTS1); > > + max_ses = FIELD_GET(MAX_SE, reg); > > + max_ies = FIELD_GET(MAX_IE, reg); > > + max_aes = FIELD_GET(MAX_AE, reg); > > + > > + return max_ses + max_ies + max_aes; > > Maybe I am wrong, but it looks like this will perform u16 addition. > Can that overflow? I ask because the return type is u32, implying values larger > than 64k are expected. No, it couldn't overflow. I will change the return type to u16. > > > +} > > + > > +/* Number of flt interrupt vectors are depends on number of engines > > +that the > > + * chip has. Each flt vector represents 64 engines. > > + */ > > +static int cpt_10k_flt_nvecs_get(struct rvu *rvu) { > > + u32 max_engs; > > + int flt_vecs; > > + > > + max_engs = cpt_max_engines_get(rvu); > > + > > + flt_vecs = (max_engs / 64); > > + flt_vecs += (max_engs % 64) ? 1 : 0; > > I don't think the parentheses are needed on the lines above. > And likewise elsewhere in this patch. > > At any rate, here, I think you can use DIV_ROUND_UP. Ack. > > > + > > + if (flt_vecs > CPT_10K_AF_INT_VEC_FLT_MAX) { > > + dev_warn(rvu->dev, "flt_vecs(%d) exceeds the max > vectors(%d)\n", > > + flt_vecs, CPT_10K_AF_INT_VEC_FLT_MAX); > > + flt_vecs = CPT_10K_AF_INT_VEC_FLT_MAX; > > + } > > cpt_max_engines_get seems to get called quite a bit. > I think dev_warn_once() is probably appropriate here. Ack. > > > + > > + return flt_vecs; > > +} > > + > > static irqreturn_t cpt_af_flt_intr_handler(int vec, void *ptr) { > > struct rvu_block *block = ptr; > > @@ -150,17 +188,25 @@ static void cpt_10k_unregister_interrupts(struct > > rvu_block *block, int off) { > > struct rvu *rvu = block->rvu; > > int blkaddr = block->addr; > > + u32 max_engs; > > + u8 nr; > > int i; > > > > + max_engs = cpt_max_engines_get(rvu); > > + > > /* Disable all CPT AF interrupts */ > > - rvu_write64(rvu, blkaddr, CPT_AF_FLTX_INT_ENA_W1C(0), ~0ULL); > > - rvu_write64(rvu, blkaddr, CPT_AF_FLTX_INT_ENA_W1C(1), ~0ULL); > > - rvu_write64(rvu, blkaddr, CPT_AF_FLTX_INT_ENA_W1C(2), 0xFFFF); > > + for (i = CPT_10K_AF_INT_VEC_FLT0; i < cpt_10k_flt_nvecs_get(rvu); > > +i++) { > > I think it would be best to cache the value of cpt_10k_flt_nvecs_get() rather > than run it for each iteration of the loop. I'm assuming it has a non-zero cost as > it reads a register, the value of which which I assume does not change. > > Also, the same register is already read by the call to cpt_max_engines_get(). It > would be nice to read it just once in this scope. Ack. > > Likewise elsewhere. > > Also, there is an inconsistency between the type of i and the return type of > cpt_10k_flt_nvecs_get(). Probably harmless, but IMHO it would be nice to fix. Both are int only. > > > + nr = (max_engs > 64) ? 64 : max_engs; > > + max_engs -= nr; > > + rvu_write64(rvu, blkaddr, CPT_AF_FLTX_INT_ENA_W1C(i), > > + INTR_MASK(nr)); > > + } > > > > rvu_write64(rvu, blkaddr, CPT_AF_RVU_INT_ENA_W1C, 0x1); > > rvu_write64(rvu, blkaddr, CPT_AF_RAS_INT_ENA_W1C, 0x1); > > > > - for (i = 0; i < CPT_10K_AF_INT_VEC_CNT; i++) > > + /* CPT AF interrupt vectors are flt_int, rvu_int and ras_int. */ > > + for (i = 0; i < cpt_10k_flt_nvecs_get(rvu) + 2; i++) > > It would be nice to avoid the naked '2' here. > Perhaps a macro is appropriate. Ack. > > > if (rvu->irq_allocated[off + i]) { > > free_irq(pci_irq_vector(rvu->pdev, off + i), block); > > rvu->irq_allocated[off + i] = false; > > ... > > -- > pw-bot: cr Thanks for reviewing the patch.
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/mbox.h b/drivers/net/ethernet/marvell/octeontx2/af/mbox.h index ed2160cc5acb..6ea2f3071fe8 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/mbox.h +++ b/drivers/net/ethernet/marvell/octeontx2/af/mbox.h @@ -1856,8 +1856,9 @@ struct cpt_flt_eng_info_req { struct cpt_flt_eng_info_rsp { struct mbox_msghdr hdr; - u64 flt_eng_map[CPT_10K_AF_INT_VEC_RVU]; - u64 rcvrd_eng_map[CPT_10K_AF_INT_VEC_RVU]; +#define CPT_AF_MAX_FLT_INT_VECS 3 + u64 flt_eng_map[CPT_AF_MAX_FLT_INT_VECS]; + u64 rcvrd_eng_map[CPT_AF_MAX_FLT_INT_VECS]; u64 rsvd; }; diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c index 3e09d2285814..e56d27018828 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c @@ -37,6 +37,44 @@ (_rsp)->free_sts_##etype = free_sts; \ }) +#define MAX_AE GENMASK_ULL(47, 32) +#define MAX_IE GENMASK_ULL(31, 16) +#define MAX_SE GENMASK_ULL(15, 0) +static u32 cpt_max_engines_get(struct rvu *rvu) +{ + u16 max_ses, max_ies, max_aes; + u64 reg; + + reg = rvu_read64(rvu, BLKADDR_CPT0, CPT_AF_CONSTANTS1); + max_ses = FIELD_GET(MAX_SE, reg); + max_ies = FIELD_GET(MAX_IE, reg); + max_aes = FIELD_GET(MAX_AE, reg); + + return max_ses + max_ies + max_aes; +} + +/* Number of flt interrupt vectors are depends on number of engines that the + * chip has. Each flt vector represents 64 engines. + */ +static int cpt_10k_flt_nvecs_get(struct rvu *rvu) +{ + u32 max_engs; + int flt_vecs; + + max_engs = cpt_max_engines_get(rvu); + + flt_vecs = (max_engs / 64); + flt_vecs += (max_engs % 64) ? 1 : 0; + + if (flt_vecs > CPT_10K_AF_INT_VEC_FLT_MAX) { + dev_warn(rvu->dev, "flt_vecs(%d) exceeds the max vectors(%d)\n", + flt_vecs, CPT_10K_AF_INT_VEC_FLT_MAX); + flt_vecs = CPT_10K_AF_INT_VEC_FLT_MAX; + } + + return flt_vecs; +} + static irqreturn_t cpt_af_flt_intr_handler(int vec, void *ptr) { struct rvu_block *block = ptr; @@ -150,17 +188,25 @@ static void cpt_10k_unregister_interrupts(struct rvu_block *block, int off) { struct rvu *rvu = block->rvu; int blkaddr = block->addr; + u32 max_engs; + u8 nr; int i; + max_engs = cpt_max_engines_get(rvu); + /* Disable all CPT AF interrupts */ - rvu_write64(rvu, blkaddr, CPT_AF_FLTX_INT_ENA_W1C(0), ~0ULL); - rvu_write64(rvu, blkaddr, CPT_AF_FLTX_INT_ENA_W1C(1), ~0ULL); - rvu_write64(rvu, blkaddr, CPT_AF_FLTX_INT_ENA_W1C(2), 0xFFFF); + for (i = CPT_10K_AF_INT_VEC_FLT0; i < cpt_10k_flt_nvecs_get(rvu); i++) { + nr = (max_engs > 64) ? 64 : max_engs; + max_engs -= nr; + rvu_write64(rvu, blkaddr, CPT_AF_FLTX_INT_ENA_W1C(i), + INTR_MASK(nr)); + } rvu_write64(rvu, blkaddr, CPT_AF_RVU_INT_ENA_W1C, 0x1); rvu_write64(rvu, blkaddr, CPT_AF_RAS_INT_ENA_W1C, 0x1); - for (i = 0; i < CPT_10K_AF_INT_VEC_CNT; i++) + /* CPT AF interrupt vectors are flt_int, rvu_int and ras_int. */ + for (i = 0; i < cpt_10k_flt_nvecs_get(rvu) + 2; i++) if (rvu->irq_allocated[off + i]) { free_irq(pci_irq_vector(rvu->pdev, off + i), block); rvu->irq_allocated[off + i] = false; @@ -206,12 +252,17 @@ void rvu_cpt_unregister_interrupts(struct rvu *rvu) static int cpt_10k_register_interrupts(struct rvu_block *block, int off) { + int rvu_intr_vec, ras_intr_vec; struct rvu *rvu = block->rvu; int blkaddr = block->addr; irq_handler_t flt_fn; + u32 max_engs; int i, ret; + u8 nr; + + max_engs = cpt_max_engines_get(rvu); - for (i = CPT_10K_AF_INT_VEC_FLT0; i < CPT_10K_AF_INT_VEC_RVU; i++) { + for (i = CPT_10K_AF_INT_VEC_FLT0; i < cpt_10k_flt_nvecs_get(rvu); i++) { sprintf(&rvu->irq_name[(off + i) * NAME_SIZE], "CPTAF FLT%d", i); switch (i) { @@ -229,20 +280,24 @@ static int cpt_10k_register_interrupts(struct rvu_block *block, int off) flt_fn, &rvu->irq_name[(off + i) * NAME_SIZE]); if (ret) goto err; - if (i == CPT_10K_AF_INT_VEC_FLT2) - rvu_write64(rvu, blkaddr, CPT_AF_FLTX_INT_ENA_W1S(i), 0xFFFF); - else - rvu_write64(rvu, blkaddr, CPT_AF_FLTX_INT_ENA_W1S(i), ~0ULL); + + nr = (max_engs > 64) ? 64 : max_engs; + max_engs -= nr; + rvu_write64(rvu, blkaddr, CPT_AF_FLTX_INT_ENA_W1S(i), + INTR_MASK(nr)); } - ret = rvu_cpt_do_register_interrupt(block, off + CPT_10K_AF_INT_VEC_RVU, + rvu_intr_vec = cpt_10k_flt_nvecs_get(rvu); + ras_intr_vec = rvu_intr_vec + 1; + + ret = rvu_cpt_do_register_interrupt(block, off + rvu_intr_vec, rvu_cpt_af_rvu_intr_handler, "CPTAF RVU"); if (ret) goto err; rvu_write64(rvu, blkaddr, CPT_AF_RVU_INT_ENA_W1S, 0x1); - ret = rvu_cpt_do_register_interrupt(block, off + CPT_10K_AF_INT_VEC_RAS, + ret = rvu_cpt_do_register_interrupt(block, off + ras_intr_vec, rvu_cpt_af_ras_intr_handler, "CPTAF RAS"); if (ret) @@ -928,7 +983,7 @@ int rvu_mbox_handler_cpt_flt_eng_info(struct rvu *rvu, struct cpt_flt_eng_info_r return blkaddr; block = &rvu->hw->block[blkaddr]; - for (vec = 0; vec < CPT_10K_AF_INT_VEC_RVU; vec++) { + for (vec = 0; vec < cpt_10k_flt_nvecs_get(block->rvu); vec++) { spin_lock_irqsave(&rvu->cpt_intr_lock, flags); rsp->flt_eng_map[vec] = block->cpt_flt_eng_map[vec]; rsp->rcvrd_eng_map[vec] = block->cpt_rcvrd_eng_map[vec]; diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_struct.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu_struct.h index 5ef406c7e8a4..fc8da2090657 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_struct.h +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_struct.h @@ -71,13 +71,11 @@ enum cpt_af_int_vec_e { CPT_AF_INT_VEC_CNT = 0x4, }; -enum cpt_10k_af_int_vec_e { +enum cpt_cn10k_flt_int_vec_e { CPT_10K_AF_INT_VEC_FLT0 = 0x0, CPT_10K_AF_INT_VEC_FLT1 = 0x1, CPT_10K_AF_INT_VEC_FLT2 = 0x2, - CPT_10K_AF_INT_VEC_RVU = 0x3, - CPT_10K_AF_INT_VEC_RAS = 0x4, - CPT_10K_AF_INT_VEC_CNT = 0x5, + CPT_10K_AF_INT_VEC_FLT_MAX = 0x3, }; /* NPA Admin function Interrupt Vector Enumeration */
On CN10KB, number of flt interrupt vectors are reduced to 2. So, modify the code accordingly for cn10k. Signed-off-by: Srujana Challa <schalla@marvell.com> --- .../net/ethernet/marvell/octeontx2/af/mbox.h | 5 +- .../ethernet/marvell/octeontx2/af/rvu_cpt.c | 79 ++++++++++++++++--- .../marvell/octeontx2/af/rvu_struct.h | 6 +- 3 files changed, 72 insertions(+), 18 deletions(-)