Message ID | 20200821113355.6140-1-song.bao.hua@hisilicon.com (mailing list archive) |
---|---|
Series | make dma_alloc_coherent NUMA-aware by per-NUMA CMA |

Hi Barry,
Sorry for jumping in so late.

On 8/21/20 4:33 AM, Barry Song wrote:
>
> with per-numa CMA, smmu will get memory from local numa node to save command
> queues and page tables. that means dma_unmap latency will be shrunk much.

Since per-node CMA areas for hugetlb was introduced, I have been thinking
about the limited number of CMA areas. In most configurations, I believe
it is limited to 7. And, IIRC it is not something that can be changed at
runtime, you need to reconfig and rebuild to increase the number. In contrast
some configs have NODES_SHIFT set to 10. I wasn't too worried because of
the limited hugetlb use case. However, this series is adding another user
of per-node CMA areas.

With more users, should try to sync up number of CMA areas and number of
nodes? Or, perhaps I am worrying about nothing?

> -----Original Message-----
> From: Mike Kravetz [mailto:mike.kravetz@oracle.com]
> Sent: Saturday, August 22, 2020 5:53 AM
> To: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>; hch@lst.de;
> m.szyprowski@samsung.com; robin.murphy@arm.com; will@kernel.org;
> ganapatrao.kulkarni@cavium.com; catalin.marinas@arm.com;
> akpm@linux-foundation.org
> Cc: iommu@lists.linux-foundation.org; linux-arm-kernel@lists.infradead.org;
> linux-kernel@vger.kernel.org; Zengtao (B) <prime.zeng@hisilicon.com>;
> huangdaode <huangdaode@huawei.com>; Linuxarm <linuxarm@huawei.com>
> Subject: Re: [PATCH v7 0/3] make dma_alloc_coherent NUMA-aware by
> per-NUMA CMA
>
> Hi Barry,
> Sorry for jumping in so late.
>
> On 8/21/20 4:33 AM, Barry Song wrote:
> >
> > with per-numa CMA, smmu will get memory from local numa node to save
> > command queues and page tables. that means dma_unmap latency will be
> > shrunk much.
>
> Since per-node CMA areas for hugetlb was introduced, I have been thinking
> about the limited number of CMA areas. In most configurations, I believe
> it is limited to 7. And, IIRC it is not something that can be changed at
> runtime, you need to reconfig and rebuild to increase the number. In contrast
> some configs have NODES_SHIFT set to 10. I wasn't too worried because of
> the limited hugetlb use case. However, this series is adding another user
> of per-node CMA areas.
>
> With more users, should try to sync up number of CMA areas and number of
> nodes? Or, perhaps I am worrying about nothing?

Hi Mike,
The current limitation is 8. If the server has 4 nodes and we enable both
pernuma CMA and hugetlb, the last node will fail to get one cma area as the
default global cma area will take 1 of 8. So users need to change menuconfig.
If the server has 8 nodes and we enable only one of pernuma cma and hugetlb,
one node will still fail to get cma.

We may set the default number of CMA areas as 8+MAX_NODES(if hugetlb
enabled) + MAX_NODES(if pernuma cma enabled) if we don't expect users to
change config, but right now hugetlb does not have an option in Kconfig to
enable or disable it like pernuma cma has DMA_PERNUMA_CMA.

> --
> Mike Kravetz

Thanks
Barry
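
The slot counting in this reply can be sketched in C. This is a minimal illustration, not kernel code: the helper names are hypothetical, and it assumes, as the reply describes, that the system can hold 1 + CONFIG_CMA_AREAS areas, that the default global area takes one of them, and that each enabled per-node user (hugetlb CMA, per-NUMA DMA CMA) requests one area per node.

```c
#include <assert.h>
#include <stdbool.h>

/* Capacity as described in the thread: 1 + CONFIG_CMA_AREAS (8 when the
 * Kconfig default of 7 is used). */
static int max_cma_areas(int config_cma_areas)
{
    assert(config_cma_areas >= 0);
    return 1 + config_cma_areas;
}

/* Assumption: one default global area, plus one area per node for each
 * enabled per-node user. */
static int cma_areas_requested(int nr_nodes, bool hugetlb_cma, bool pernuma_cma)
{
    int users = (hugetlb_cma ? 1 : 0) + (pernuma_cma ? 1 : 0);

    assert(nr_nodes > 0);
    return 1 + users * nr_nodes;
}

/* Number of requested areas that do not fit (0 if everything fits). */
static int cma_areas_failed(int config_cma_areas, int nr_nodes,
                            bool hugetlb_cma, bool pernuma_cma)
{
    int shortfall = cma_areas_requested(nr_nodes, hugetlb_cma, pernuma_cma)
                    - max_cma_areas(config_cma_areas);

    return shortfall > 0 ? shortfall : 0;
}
```

Under these assumptions, `cma_areas_failed(7, 4, true, true)` and `cma_areas_failed(7, 8, true, false)` both come out as 1, matching the two cases in the reply where a single node is left without an area.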

> -----Original Message-----
> From: Song Bao Hua (Barry Song)
> Sent: Saturday, August 22, 2020 7:27 AM
> To: 'Mike Kravetz' <mike.kravetz@oracle.com>; hch@lst.de;
> m.szyprowski@samsung.com; robin.murphy@arm.com; will@kernel.org;
> ganapatrao.kulkarni@cavium.com; catalin.marinas@arm.com;
> akpm@linux-foundation.org
> Cc: iommu@lists.linux-foundation.org; linux-arm-kernel@lists.infradead.org;
> linux-kernel@vger.kernel.org; Zengtao (B) <prime.zeng@hisilicon.com>;
> huangdaode <huangdaode@huawei.com>; Linuxarm <linuxarm@huawei.com>
> Subject: RE: [PATCH v7 0/3] make dma_alloc_coherent NUMA-aware by
> per-NUMA CMA
>
> Hi Mike,
> The current limitation is 8. If the server has 4 nodes and we enable both
> pernuma CMA and hugetlb, the last node will fail to get one cma area as the
> default global cma area will take 1 of 8. So users need to change menuconfig.
> If the server has 8 nodes and we enable only one of pernuma cma and hugetlb,
> one node will still fail to get cma.
>
> We may set the default number of CMA areas as 8+MAX_NODES(if hugetlb
> enabled) + MAX_NODES(if pernuma cma enabled) if we don't expect users to
> change config, but right now hugetlb does not have an option in Kconfig to
> enable or disable it like pernuma cma has DMA_PERNUMA_CMA.

I would prefer we make some changes like:

     config CMA_AREAS
         int "Maximum count of the CMA areas"
         depends on CMA
    +    default 19 if NUMA
         default 7
         help
           CMA allows to create CMA areas for particular purpose, mainly,
           used as device private area. This parameter sets the maximum
           number of CMA area in the system.

    -      If unsure, leave the default value "7".
    +      If unsure, leave the default value "7" or "19" if NUMA is used.

1+ CONFIG_CMA_AREAS should be quite enough for almost all servers in the market.

If 2 numa nodes, and both hugetlb cma and pernuma cma are enabled, we need 2*2 + 1 = 5
If 4 numa nodes, and both hugetlb cma and pernuma cma are enabled, we need 2*4 + 1 = 9 -> default ARM64 config.
If 8 numa nodes, and both hugetlb cma and pernuma cma are enabled, we need 2*8 + 1 = 17

The default value supports the most common case and is not going to support
those servers with NODES_SHIFT=10; they can make their own config, just like
users need to increase CMA_AREAS if they add many cma areas in device tree in
a system even without NUMA.

What do you think, Mike?

Thanks
Barry
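
The sizing rule behind the proposed default can be worked through in a few lines of C. This is only a sketch with hypothetical helper names; it takes the 2*N + 1 formula and the 1 + CONFIG_CMA_AREAS capacity from the message above as given and does not reflect any actual kernel interface.

```c
#include <assert.h>

/* Areas needed when both hugetlb CMA and per-NUMA CMA are enabled:
 * two per node plus the default global area (2*N + 1 in the message). */
static int areas_needed_both_users(int nr_nodes)
{
    assert(nr_nodes > 0);
    return 2 * nr_nodes + 1;
}

/* Smallest CONFIG_CMA_AREAS that fits, given capacity 1 + CONFIG_CMA_AREAS. */
static int min_config_cma_areas(int nr_nodes)
{
    return areas_needed_both_users(nr_nodes) - 1;
}

/* Largest node count a given CONFIG_CMA_AREAS covers with both users on:
 * 2*N + 1 <= 1 + config, so N <= config / 2 (integer division). */
static int max_nodes_covered(int config_cma_areas)
{
    assert(config_cma_areas >= 0);
    return config_cma_areas / 2;
}
```

With these assumptions, a default of 19 covers up to 9 nodes with both users enabled and the old default of 7 covers 3, while a machine that actually populated all `1 << 10` nodes permitted by NODES_SHIFT=10 would need CONFIG_CMA_AREAS=2048, which is why such systems are left to supply their own config.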

On 8/21/20 1:47 PM, Song Bao Hua (Barry Song) wrote:
>
> I would prefer we make some changes like:
>
>  config CMA_AREAS
>      int "Maximum count of the CMA areas"
>      depends on CMA
> +    default 19 if NUMA
>      default 7
>      help
>        CMA allows to create CMA areas for particular purpose, mainly,
>        used as device private area. This parameter sets the maximum
>        number of CMA area in the system.
>
> -      If unsure, leave the default value "7".
> +      If unsure, leave the default value "7" or "19" if NUMA is used.
>
> 1+ CONFIG_CMA_AREAS should be quite enough for almost all servers in the market.
>
> If 2 numa nodes, and both hugetlb cma and pernuma cma are enabled, we need 2*2 + 1 = 5
> If 4 numa nodes, and both hugetlb cma and pernuma cma are enabled, we need 2*4 + 1 = 9 -> default ARM64 config.
> If 8 numa nodes, and both hugetlb cma and pernuma cma are enabled, we need 2*8 + 1 = 17
>
> The default value supports the most common case and is not going to support
> those servers with NODES_SHIFT=10; they can make their own config, just like
> users need to increase CMA_AREAS if they add many cma areas in device tree in
> a system even without NUMA.
>
> What do you think, Mike?

I'm OK with that. I really did not want to sidetrack this series. It is
just something I thought about when looking at the hugetlb code. My 'to do'
list includes looking at a way to make the number of CMA areas dynamic.