Message ID | 1715887976-1288-1-git-send-email-quic_mojha@quicinc.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | firmware: qcom_scm: Give page_aligned size for dma api's | expand |
On Fri, May 17, 2024 at 01:02:56AM +0530, Mukesh Ojha wrote: > If we disable CONFIG_ZONE_DMA32 to make the selection of DMA > memory from higher 4GB range. dma_alloc_coherant() api usage dma_alloc_coherent() > inside qcom_scm_pas_init_image() which usage scm 32bit device > will fail for size of data passed less than PAGE_SIZE and > it will fallback to buddy pool to allocate memory from which > will fail. I interpreted this as: When CONFIG_ZONE_DMA32 is disabled, dma_alloc_coherent() fails when size is < PAGE_SIZE. qcom_scm_pas_init_image() will fail to allocate using dma_alloc_coherent() and incorrectly fall back to buddy pool. This justification seems incorrect to me. None of the other dma_alloc_coherent() users are page-aligning their requests in scm driver. Is something else going on? > > Convert the size to aligned to PAGE_SIZE before it gets pass > to dma_alloc_coherant(), so that it gets coherant memory in dma_alloc_coherent coherent > lower 4GB from linux cma region. > > Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> > --- > drivers/firmware/qcom/qcom_scm.c | 8 +++++--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c > index 029ee5edbea6..6616048f1c33 100644 > --- a/drivers/firmware/qcom/qcom_scm.c > +++ b/drivers/firmware/qcom/qcom_scm.c > @@ -562,6 +562,7 @@ static void qcom_scm_set_download_mode(u32 dload_mode) > int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, > struct qcom_scm_pas_metadata *ctx) > { > + size_t page_aligned_size; > dma_addr_t mdata_phys; > void *mdata_buf; > int ret; > @@ -579,7 +580,8 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, > * data blob, so make sure it's physically contiguous, 4K aligned and > * non-cachable to avoid XPU violations. > */ > - mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys, > + page_aligned_size = PAGE_ALIGN(size + PAGE_SIZE); Isn't PAGE_ALIGN(size) good enough? Why do you need to round up to the 2nd page? Maybe you thought PAGE_ALIGN was PAGE_ALIGN_DOWN ? > + mdata_buf = dma_alloc_coherent(__scm->dev, page_aligned_size, &mdata_phys, > GFP_KERNEL); > if (!mdata_buf) > return -ENOMEM; > @@ -604,11 +606,11 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, > > out: > if (ret < 0 || !ctx) { > - dma_free_coherent(__scm->dev, size, mdata_buf, mdata_phys); > + dma_free_coherent(__scm->dev, page_aligned_size, mdata_buf, mdata_phys); > } else if (ctx) { > ctx->ptr = mdata_buf; > ctx->phys = mdata_phys; > - ctx->size = size; > + ctx->size = page_aligned_size; > } > > return ret ? : res.result[0]; > -- > 2.7.4 > >
Thanks for the review.. On 5/17/2024 1:58 AM, Elliot Berman wrote: > On Fri, May 17, 2024 at 01:02:56AM +0530, Mukesh Ojha wrote: >> If we disable CONFIG_ZONE_DMA32 to make the selection of DMA >> memory from higher 4GB range. dma_alloc_coherant() api usage > dma_alloc_coherent() >> inside qcom_scm_pas_init_image() which usage scm 32bit device >> will fail for size of data passed less than PAGE_SIZE and >> it will fallback to buddy pool to allocate memory from which >> will fail. > > I interpreted this as: > > When CONFIG_ZONE_DMA32 is disabled, dma_alloc_coherent() fails when size > is < PAGE_SIZE. qcom_scm_pas_init_image() will fail to allocate using > dma_alloc_coherent() and incorrectly fall back to buddy pool. > > This justification seems incorrect to me. None of the other > dma_alloc_coherent() users are page-aligning their requests in scm > driver. Is something else going on? For SCM protection, memory allocation should be physically contiguous, 4K aligned and non-cacheable to avoid XPU violations as that is the granularity of protection to be applied from secure world also what if, there is a 32-bit secure peripheral is going to access this memory which for some SoCs and some not. So, we wanted to keep this common and align across multiple SoCs to do the allocation from CMA and add a pad to the memory passed to secure world Also, this also enables us to keep CONFIG_ZONE_{DMA|DMA32} disabled which is a significant overhead. > >> >> Convert the size to aligned to PAGE_SIZE before it gets pass >> to dma_alloc_coherant(), so that it gets coherant memory in > dma_alloc_coherent coherent >> lower 4GB from linux cma region. >> >> Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> >> --- >> drivers/firmware/qcom/qcom_scm.c | 8 +++++--- >> 1 file changed, 5 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c >> index 029ee5edbea6..6616048f1c33 100644 >> --- a/drivers/firmware/qcom/qcom_scm.c >> +++ b/drivers/firmware/qcom/qcom_scm.c >> @@ -562,6 +562,7 @@ static void qcom_scm_set_download_mode(u32 dload_mode) >> int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, >> struct qcom_scm_pas_metadata *ctx) >> { >> + size_t page_aligned_size; >> dma_addr_t mdata_phys; >> void *mdata_buf; >> int ret; >> @@ -579,7 +580,8 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, >> * data blob, so make sure it's physically contiguous, 4K aligned and >> * non-cachable to avoid XPU violations. >> */ >> - mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys, >> + page_aligned_size = PAGE_ALIGN(size + PAGE_SIZE); > > Isn't PAGE_ALIGN(size) good enough? Why do you need to round up to the > 2nd page? Maybe you thought PAGE_ALIGN was PAGE_ALIGN_DOWN ? No, this was intentional as there is a check inside dma_alloc_contiguous() call for a size <= PAGE_SIZE . -Mukesh
On Mon, May 20, 2024 at 05:55:49PM +0530, Mukesh Ojha wrote: > Thanks for the review.. > > On 5/17/2024 1:58 AM, Elliot Berman wrote: > > On Fri, May 17, 2024 at 01:02:56AM +0530, Mukesh Ojha wrote: > > > If we disable CONFIG_ZONE_DMA32 to make the selection of DMA > > > memory from higher 4GB range. dma_alloc_coherant() api usage > > dma_alloc_coherent() > > > inside qcom_scm_pas_init_image() which usage scm 32bit device > > > will fail for size of data passed less than PAGE_SIZE and > > > it will fallback to buddy pool to allocate memory from which > > > will fail. > > > > I interpreted this as: > > > > When CONFIG_ZONE_DMA32 is disabled, dma_alloc_coherent() fails when size > > is < PAGE_SIZE. qcom_scm_pas_init_image() will fail to allocate using > dma_alloc_coherent() and incorrectly fall back to buddy pool. > > > > This justification seems incorrect to me. None of the other > > dma_alloc_coherent() users are page-aligning their requests in scm > > driver. Is something else going on? > > For SCM protection, memory allocation should be physically contiguous, 4K > aligned and non-cacheable to avoid XPU violations as that is the > granularity of protection to be applied from secure world also what if, > there is a 32-bit secure peripheral is going to access this memory which > for some SoCs and some not. > > So, we wanted to keep this common and align across multiple SoCs to do > the allocation from CMA and add a pad to the memory passed to secure world > Also, this also enables us to keep CONFIG_ZONE_{DMA|DMA32} disabled which is > a significant overhead. > > > > > > > > > Convert the size to aligned to PAGE_SIZE before it gets pass > > > to dma_alloc_coherant(), so that it gets coherant memory in > > dma_alloc_coherent coherent > > > lower 4GB from linux cma region. > > > > > > Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> > > > --- > > > drivers/firmware/qcom/qcom_scm.c | 8 +++++--- > > > 1 file changed, 5 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c > > > index 029ee5edbea6..6616048f1c33 100644 > > > --- a/drivers/firmware/qcom/qcom_scm.c > > > +++ b/drivers/firmware/qcom/qcom_scm.c > > > @@ -562,6 +562,7 @@ static void qcom_scm_set_download_mode(u32 dload_mode) > > > int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, > > > struct qcom_scm_pas_metadata *ctx) > > > { > > > + size_t page_aligned_size; > > > dma_addr_t mdata_phys; > > > void *mdata_buf; > > > int ret; > > > @@ -579,7 +580,8 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, > > > * data blob, so make sure it's physically contiguous, 4K aligned and > > > * non-cachable to avoid XPU violations. > > > */ > > > - mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys, > > > + page_aligned_size = PAGE_ALIGN(size + PAGE_SIZE); > > > > Isn't PAGE_ALIGN(size) good enough? Why do you need to round up to the > > 2nd page? Maybe you thought PAGE_ALIGN was PAGE_ALIGN_DOWN ? > > No, this was intentional as there is a check inside > dma_alloc_contiguous() call for a size <= PAGE_SIZE . > Sure, but as Elliot points out PAGE_ALIGN(x) for X <= PAGE_SIZE is PAGE_SIZE, so you wouldn't pass a value < PAGE_SIZE to the function. That said, here you say <= PAGE_SIZE and in commit message you say "less than PAGE_SIZE"... If you need this to be 2 pages, it needs to be vary clearly documented, to avoid having future readers fixing what looks like a bug. Regards, Bjorn
diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c index 029ee5edbea6..6616048f1c33 100644 --- a/drivers/firmware/qcom/qcom_scm.c +++ b/drivers/firmware/qcom/qcom_scm.c @@ -562,6 +562,7 @@ static void qcom_scm_set_download_mode(u32 dload_mode) int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, struct qcom_scm_pas_metadata *ctx) { + size_t page_aligned_size; dma_addr_t mdata_phys; void *mdata_buf; int ret; @@ -579,7 +580,8 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, * data blob, so make sure it's physically contiguous, 4K aligned and * non-cachable to avoid XPU violations. */ - mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys, + page_aligned_size = PAGE_ALIGN(size + PAGE_SIZE); + mdata_buf = dma_alloc_coherent(__scm->dev, page_aligned_size, &mdata_phys, GFP_KERNEL); if (!mdata_buf) return -ENOMEM; @@ -604,11 +606,11 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size, out: if (ret < 0 || !ctx) { - dma_free_coherent(__scm->dev, size, mdata_buf, mdata_phys); + dma_free_coherent(__scm->dev, page_aligned_size, mdata_buf, mdata_phys); } else if (ctx) { ctx->ptr = mdata_buf; ctx->phys = mdata_phys; - ctx->size = size; + ctx->size = page_aligned_size; } return ret ? : res.result[0];
If we disable CONFIG_ZONE_DMA32 to make the selection of DMA memory from higher 4GB range. dma_alloc_coherant() api usage inside qcom_scm_pas_init_image() which usage scm 32bit device will fail for size of data passed less than PAGE_SIZE and it will fallback to buddy pool to allocate memory from which will fail. Convert the size to aligned to PAGE_SIZE before it gets pass to dma_alloc_coherant(), so that it gets coherant memory in lower 4GB from linux cma region. Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> --- drivers/firmware/qcom/qcom_scm.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)