| Message ID | 20151104123640.GK7637@e104818-lin.cambridge.arm.com (mailing list archive) |
|---|---|
| State | Not Applicable |
| Delegated to | Simon Horman |
On Wed, 4 Nov 2015, Catalin Marinas wrote:

> The simplest option would be to make sure that off slab isn't allowed
> for caches of KMALLOC_MIN_SIZE or smaller, with the drawback that not
> only "kmalloc-128" but any other such caches will be on slab.

The reason for an off slab configuration is denser object packing.

> I think a better option would be to first check that there is a
> kmalloc_caches[] entry for freelist_size before deciding to go off-slab.

Hmmm.. Yes, that seems to be an option.

Maybe we simply revert commit 8fc9cf420b36 instead? That does not seem to make too much sense to me and the goal of the commit cannot be accomplished on ARM. Your patch essentially reverts the effect anyways.

Smaller slabs really do not need off slab management anyways since they will only lose a few objects per slab page.
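As an illustration of the packing argument, here is a small user-space sketch, not kernel code: it assumes a 4KB page and SLAB's one-byte freelist index per object, and ignores alignment. With on-slab management the freelist indexes share the page with the objects; off-slab management leaves the whole page for objects at the cost of a separate kmalloc allocation for the freelist.

```c
/*
 * User-space sketch of the on-slab vs off-slab packing trade-off.
 * Assumptions: 4KB page, one freelist index byte per object (SLAB's
 * byte-index mode), no alignment or colouring. Illustrative only.
 */
#include <stdio.h>

#define PAGE_SIZE 4096

static unsigned int objs_on_slab(unsigned int obj_size)
{
	/* Each object also needs one freelist index byte in the page. */
	return PAGE_SIZE / (obj_size + 1);
}

static unsigned int objs_off_slab(unsigned int obj_size)
{
	/* The freelist lives in a separate allocation; the page holds only objects. */
	return PAGE_SIZE / obj_size;
}

int main(void)
{
	static const unsigned int sizes[] = { 128, 256, 512, 1024 };

	for (unsigned int i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
		printf("size %4u: on-slab %2u objects, off-slab %2u objects\n",
		       sizes[i], objs_on_slab(sizes[i]), objs_off_slab(sizes[i]));
	return 0;
}
```

For a 128-byte cache this comes out to 31 vs 32 objects per page: keeping the freelist on-slab only loses a single object, which is the point about small slabs above.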
On Wed, Nov 04, 2015 at 07:53:50AM -0600, Christoph Lameter wrote:
> On Wed, 4 Nov 2015, Catalin Marinas wrote:
> > The simplest option would be to make sure that off slab isn't allowed
> > for caches of KMALLOC_MIN_SIZE or smaller, with the drawback that not
> > only "kmalloc-128" but any other such caches will be on slab.
>
> The reason for an off slab configuration is denser object packing.
>
> > I think a better option would be to first check that there is a
> > kmalloc_caches[] entry for freelist_size before deciding to go off-slab.
>
> Hmmm.. Yes, that seems to be an option.
>
> Maybe we simply revert commit 8fc9cf420b36 instead?

I'm fine with this. Also note that the arm64 commit changing L1_CACHE_BYTES to 128 hasn't been pushed yet (it's queued for 4.4).

> That does not seem to make too much sense to me and the goal of the
> commit cannot be accomplished on ARM. Your patch essentially reverts
> the effect anyways.

In theory it only reverts the effect for the first kmalloc cache ("kmalloc-128" in the arm64 case). Any other bigger cache which would not be mergeable with an existing one still has the potential of off-slab management.

> Smaller slabs really do not need off slab management anyways since they
> will only lose a few objects per slab page.

IIUC, starting with a 128-byte slab size and a 4KB page, you have 32 objects per page. The freelist takes 32 bytes (or 31), so you waste a single slab object. However, only 1/4 of that object is used for the freelist, and the waste gets bigger with a 256-byte slab size, hence the original commit.

BTW, assuming L1_CACHE_BYTES is 512 (I don't ever see this happening but just in theory), we potentially have the same issue. What would save us is that INDEX_NODE would match the first "kmalloc-512" cache, so we have it pre-populated.
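The arithmetic above can be checked with a quick sketch under the same assumptions (4KB page, one freelist index byte per object); the helper is illustrative, not mm/slab.c code:

```c
/*
 * Rough check of the freelist-waste arithmetic: a 4KB page of
 * 'size'-byte objects needs one index byte per object, and keeping
 * that freelist on-slab sacrifices one object-sized slot.
 */
#include <stdio.h>

#define PAGE_SIZE 4096

int main(void)
{
	for (unsigned int size = 128; size <= 1024; size <<= 1) {
		unsigned int objs = PAGE_SIZE / size;	/* objects with off-slab packing */
		unsigned int freelist = objs;		/* one index byte per object */

		/* Only 'freelist' bytes of the sacrificed slot are actually used. */
		printf("size %4u: %2u objects, freelist %2u bytes, %3u%% of the lost slot used\n",
		       size, objs, freelist, 100 * freelist / size);
	}
	return 0;
}
```

At 128 bytes the freelist uses 1/4 of the sacrificed slot; at 256 bytes only 1/16, so the wasted fraction grows with object size, which is what motivated commit 8fc9cf420b36.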
On Wed, 4 Nov 2015, Catalin Marinas wrote:

> BTW, assuming L1_CACHE_BYTES is 512 (I don't ever see this happening but
> just in theory), we potentially have the same issue. What would save us
> is that INDEX_NODE would match the first "kmalloc-512" cache, so we have
> it pre-populated.

Ok, maybe add some BUILD_BUG_ONs to ensure that builds fail until we have addressed that.
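For illustration, a minimal sketch of the kind of compile-time guard suggested here. The helper name, its placement, and the OFF_SLAB_MIN_SIZE definition (the PAGE_SIZE >> 5 off-slab cut-off used by slab.c in this era) are assumptions for the sketch, not code from this thread:

```c
/*
 * Illustrative sketch only: fail the build whenever the smallest
 * kmalloc cache is itself eligible for off-slab freelist management,
 * since creating it would then look up a kmalloc_caches[] entry that
 * is not populated yet (the arm64 KMALLOC_MIN_SIZE == 128 case above).
 * Hypothetically called early, e.g. from kmem_cache_init().
 */
#include <linux/bug.h>
#include <linux/init.h>
#include <linux/mm.h>
#include <linux/slab.h>

#define OFF_SLAB_MIN_SIZE	(PAGE_SIZE >> 5)	/* assumed off-slab cut-off */

static void __init check_off_slab_freelists(void)
{
	BUILD_BUG_ON(KMALLOC_MIN_SIZE >= OFF_SLAB_MIN_SIZE);
}
```

With arm64's L1_CACHE_BYTES raised to 128 and 4KB pages, the condition becomes 128 >= 128, so the build would fail in exactly the configuration that triggered the original BUG_ON.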
```diff
diff --git a/mm/slab.c b/mm/slab.c
index 4fcc5dd8d5a6..d4a21736eb5d 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2246,16 +2246,33 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
 
 	if (flags & CFLGS_OFF_SLAB) {
 		/* really off slab. No need for manual alignment */
-		freelist_size = calculate_freelist_size(cachep->num, 0);
+		size_t off_freelist_size = calculate_freelist_size(cachep->num, 0);
+
+		cachep->freelist_cache = kmalloc_slab(off_freelist_size, 0u);
+		if (ZERO_OR_NULL_PTR(cachep->freelist_cache)) {
+			/*
+			 * We don't have kmalloc_caches[] populated for
+			 * off_freelist_size yet. This can happen during
+			 * create_kmalloc_caches() when KMALLOC_MIN_SIZE >=
+			 * (PAGE_SHIFT >> 5) and CFLGS_OFF_SLAB is set. Move
+			 * the cache on-slab.
+			 */
+			flags &= ~CFLGS_OFF_SLAB;
+			left_over = calculate_slab_order(cachep, size, cachep->align, flags);
+		} else {
+			freelist_size = off_freelist_size;
 
 #ifdef CONFIG_PAGE_POISONING
-	/* If we're going to use the generic kernel_map_pages()
-	 * poisoning, then it's going to smash the contents of
-	 * the redzone and userword anyhow, so switch them off.
-	 */
-	if (size % PAGE_SIZE == 0 && flags & SLAB_POISON)
-		flags &= ~(SLAB_RED_ZONE | SLAB_STORE_USER);
+			/*
+			 * If we're going to use the generic kernel_map_pages()
+			 * poisoning, then it's going to smash the contents of
+			 * the redzone and userword anyhow, so switch them off.
+			 */
+			if (size % PAGE_SIZE == 0 && flags & SLAB_POISON)
+				flags &= ~(SLAB_RED_ZONE | SLAB_STORE_USER);
 #endif
+		}
+	}
 
 	cachep->colour_off = cache_line_size();
@@ -2271,18 +2288,6 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
 	cachep->size = size;
 	cachep->reciprocal_buffer_size = reciprocal_value(size);
 
-	if (flags & CFLGS_OFF_SLAB) {
-		cachep->freelist_cache = kmalloc_slab(freelist_size, 0u);
-		/*
-		 * This is a possibility for one of the kmalloc_{dma,}_caches.
-		 * But since we go off slab only for object size greater than
-		 * PAGE_SIZE/8, and kmalloc_{dma,}_caches get created
-		 * in ascending order,this should not happen at all.
-		 * But leave a BUG_ON for some lucky dude.
-		 */
-		BUG_ON(ZERO_OR_NULL_PTR(cachep->freelist_cache));
-	}
-
 	err = setup_cpu_cache(cachep, gfp);
 	if (err) {
 		__kmem_cache_shutdown(cachep);
```