
[RFC,04/14] mm: vmstat: convert slab vmstat counter to bytes

Message ID 20190905214553.1643060-5-guro@fb.com (mailing list archive)
State New, archived
Series The new slab memory controller

Commit Message

Roman Gushchin Sept. 5, 2019, 9:45 p.m. UTC
In order to prepare for per-object slab memory accounting,
convert NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE vmstat
items to bytes.

To make it explicit that these vmstats are in bytes, rename them
to NR_SLAB_RECLAIMABLE_B and NR_SLAB_UNRECLAIMABLE_B (similar to
NR_KERNEL_STACK_KB).

The size of slab memory shouldn't exceed 4GB on 32-bit machines,
so it will fit into the atomic_long_t we use for vmstats.
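For example (an illustration distilled from the mm/slab.h changes
below), a typical callsite goes from accounting pages:

    mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE,
                        1 << order);

to accounting bytes:

    mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE_B,
                        PAGE_SIZE << order);

Readers that still want pages use the new node_page_state_pages()
helper, which shifts the byte count right by PAGE_SHIFT.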

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 drivers/base/node.c     | 11 ++++++++---
 fs/proc/meminfo.c       |  4 ++--
 include/linux/mmzone.h  | 10 ++++++++--
 include/linux/vmstat.h  |  8 ++++++++
 kernel/power/snapshot.c |  2 +-
 mm/memcontrol.c         | 29 ++++++++++++++++-------------
 mm/oom_kill.c           |  2 +-
 mm/page_alloc.c         |  8 ++++----
 mm/slab.h               | 15 ++++++++-------
 mm/slab_common.c        |  4 ++--
 mm/slob.c               | 12 ++++++------
 mm/slub.c               |  8 ++++----
 mm/vmscan.c             |  3 ++-
 mm/vmstat.c             | 22 +++++++++++++++++++---
 mm/workingset.c         |  6 ++++--
 15 files changed, 93 insertions(+), 51 deletions(-)

Comments

Johannes Weiner Sept. 16, 2019, 12:38 p.m. UTC | #1
On Thu, Sep 05, 2019 at 02:45:48PM -0700, Roman Gushchin wrote:
> In order to prepare for per-object slab memory accounting,
> convert NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE vmstat
> items to bytes.
> 
> To make it explicit that these vmstats are in bytes, rename them
> to NR_SLAB_RECLAIMABLE_B and NR_SLAB_UNRECLAIMABLE_B (similar to
> NR_KERNEL_STACK_KB).
> 
> The size of slab memory shouldn't exceed 4GB on 32-bit machines,
> so it will fit into the atomic_long_t we use for vmstats.
> 
> Signed-off-by: Roman Gushchin <guro@fb.com>

Maybe a crazy idea, but instead of mixing bytes and pages, would it be
difficult to account all vmstat items in bytes internally? And provide
two general APIs, byte and page based, to update and query the counts,
instead of tying the unit to individual items?

The vmstat_item_in_bytes() conditional shifting is pretty awkward in
code that has a recent history littered with subtle breakages.

The translation helper node_page_state_pages() will yield garbage if
used with the page-based counters, which is another easy-to-misuse
interface.

We already have many places that multiply with PAGE_SIZE to get the
stats in bytes or kb units.

And the _B/_KB suffixes are kinda clunky.

The stats use atomic_long_t, so switching to atomic64_t doesn't make a
difference on 64-bit and is backward compatible with 32-bit.

You have to raise the per-cpu batch size from s8 either way.

It seems to me that would make the code and API a lot simpler and
easier to use / harder to misuse.
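
Something like this, as a rough sketch (all names hypothetical):

	/* One byte-granular counter per item, two ways in and out. */
	void mod_node_state_bytes(struct pglist_data *pgdat,
				  enum node_stat_item item, long bytes);

	static inline void mod_node_state_pages(struct pglist_data *pgdat,
						enum node_stat_item item,
						long pages)
	{
		mod_node_state_bytes(pgdat, item, pages << PAGE_SHIFT);
	}

	unsigned long node_state_bytes(struct pglist_data *pgdat,
				       enum node_stat_item item);

	static inline unsigned long node_state_pages(struct pglist_data *pgdat,
						     enum node_stat_item item)
	{
		return node_state_bytes(pgdat, item) >> PAGE_SHIFT;
	}

That way the unit is a property of the accessor, not of the item.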
Roman Gushchin Sept. 17, 2019, 2:08 a.m. UTC | #2
On Mon, Sep 16, 2019 at 02:38:40PM +0200, Johannes Weiner wrote:
> On Thu, Sep 05, 2019 at 02:45:48PM -0700, Roman Gushchin wrote:
> > In order to prepare for per-object slab memory accounting,
> > convert NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE vmstat
> > items to bytes.
> > 
> > To make it explicit that these vmstats are in bytes, rename them
> > to NR_SLAB_RECLAIMABLE_B and NR_SLAB_UNRECLAIMABLE_B (similar to
> > NR_KERNEL_STACK_KB).
> > 
> > The size of slab memory shouldn't exceed 4GB on 32-bit machines,
> > so it will fit into the atomic_long_t we use for vmstats.
> > 
> > Signed-off-by: Roman Gushchin <guro@fb.com>

Hello Johannes!

Thank you for looking into the patchset!

> 
> Maybe a crazy idea, but instead of mixing bytes and pages, would it be
> difficult to account all vmstat items in bytes internally? And provide
> two general APIs, byte and page based, to update and query the counts,
> instead of tying the unit to individual items?
> 
> The vmstat_item_in_bytes() conditional shifting is pretty awkward in
> code that has a recent history littered with subtle breakages.
> 
> The translation helper node_page_state_pages() will yield garbage if
> used with the page-based counters, which is another easy-to-misuse
> interface.
> 
> We already have many places that multiply with PAGE_SIZE to get the
> stats in bytes or kb units.
> 
> And the _B/_KB suffixes are kinda clunky.
> 
> The stats use atomic_long_t, so switching to atomic64_t doesn't make a
> difference on 64-bit and is backward compatible with 32-bit.

I fully agree that having different stats in different units adds
a lot of mess to the code. But I always thought that 64-bit atomics
are slow on 32-bit machines, so switching to them might cause a
noticeable performance regression. Don't you think so?

I'm happy to prepare such a patch(set), only I'd prefer to keep it
separate from this one. It can precede or follow the slab controller
rework; either way will work. The slab controller rework is already
not small, so adding more code (and potential issues) here will only
make the review more complex.

> 
> You have to raise the per-cpu batch size from s8 either way.

Yeah, tbh I don't know why those aren't just unsigned long by default.
The space savings are negligible here, and I don't see any other
reason. It could even be slightly faster to use a larger type.

I tried to keep the patchset as small as possible (at least for the
RFC version), so I avoided any unnecessary changes. But overall,
using s8 or s16 here doesn't make much sense to me.
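
For reference, this is the per-cpu structure in question, as currently
defined in include/linux/mmzone.h:

	struct per_cpu_nodestat {
		s8 stat_threshold;
		s8 vm_node_stat_diff[NR_VM_NODE_STAT_ITEMS];
	};

Since PAGE_SIZE is larger than S8_MAX, a single byte-sized delta
already overflows an s8 diff, so the widening would look something
like this (s32 picked arbitrarily for illustration):

-	s8 vm_node_stat_diff[NR_VM_NODE_STAT_ITEMS];
+	s32 vm_node_stat_diff[NR_VM_NODE_STAT_ITEMS];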

> 
> It seems to me that would make the code and API a lot simpler and
> easier to use / harder to misuse.

Patch

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 296546ffed6c..56664222f3fd 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -368,8 +368,8 @@  static ssize_t node_read_meminfo(struct device *dev,
 	unsigned long sreclaimable, sunreclaimable;
 
 	si_meminfo_node(&i, nid);
-	sreclaimable = node_page_state(pgdat, NR_SLAB_RECLAIMABLE);
-	sunreclaimable = node_page_state(pgdat, NR_SLAB_UNRECLAIMABLE);
+	sreclaimable = node_page_state_pages(pgdat, NR_SLAB_RECLAIMABLE_B);
+	sunreclaimable = node_page_state_pages(pgdat, NR_SLAB_UNRECLAIMABLE_B);
 	n = sprintf(buf,
 		       "Node %d MemTotal:       %8lu kB\n"
 		       "Node %d MemFree:        %8lu kB\n"
@@ -495,9 +495,14 @@  static ssize_t node_read_vmstat(struct device *dev,
 	int i;
 	int n = 0;
 
-	for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
+	for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++) {
+		unsigned long x = sum_zone_node_page_state(nid, i);
+
+		if (vmstat_item_in_bytes(i))
+			x >>= PAGE_SHIFT;
 		n += sprintf(buf+n, "%s %lu\n", vmstat_text[i],
-			     sum_zone_node_page_state(nid, i));
+			     x);
+	}
 
 #ifdef CONFIG_NUMA
 	for (i = 0; i < NR_VM_NUMA_STAT_ITEMS; i++)
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index ac9247371871..87afa8683c1b 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -53,8 +53,8 @@  static int meminfo_proc_show(struct seq_file *m, void *v)
 		pages[lru] = global_node_page_state(NR_LRU_BASE + lru);
 
 	available = si_mem_available();
-	sreclaimable = global_node_page_state(NR_SLAB_RECLAIMABLE);
-	sunreclaim = global_node_page_state(NR_SLAB_UNRECLAIMABLE);
+	sreclaimable = global_node_page_state_pages(NR_SLAB_RECLAIMABLE_B);
+	sunreclaim = global_node_page_state_pages(NR_SLAB_UNRECLAIMABLE_B);
 
 	show_val_kb(m, "MemTotal:       ", i.totalram);
 	show_val_kb(m, "MemFree:        ", i.freeram);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 6e8233d52971..2dbc2d042ef6 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -215,8 +215,8 @@  enum node_stat_item {
 	NR_INACTIVE_FILE,	/*  "     "     "   "       "         */
 	NR_ACTIVE_FILE,		/*  "     "     "   "       "         */
 	NR_UNEVICTABLE,		/*  "     "     "   "       "         */
-	NR_SLAB_RECLAIMABLE,	/* Please do not reorder this item */
-	NR_SLAB_UNRECLAIMABLE,	/* and this one without looking at
+	NR_SLAB_RECLAIMABLE_B,	/* Please do not reorder this item */
+	NR_SLAB_UNRECLAIMABLE_B,/* and this one without looking at
 				 * memcg_flush_percpu_vmstats() first. */
 	NR_ISOLATED_ANON,	/* Temporary isolated pages from anon lru */
 	NR_ISOLATED_FILE,	/* Temporary isolated pages from file lru */
@@ -247,6 +247,12 @@  enum node_stat_item {
 	NR_VM_NODE_STAT_ITEMS
 };
 
+static __always_inline bool vmstat_item_in_bytes(enum node_stat_item item)
+{
+	return (item == NR_SLAB_RECLAIMABLE_B ||
+		item == NR_SLAB_UNRECLAIMABLE_B);
+}
+
 /*
  * We do arithmetic on the LRU lists in various places in the code,
  * so it is important to keep the active lists LRU_ACTIVE higher in
diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index bdeda4b079fe..e5460e597e0c 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -194,6 +194,12 @@  static inline unsigned long global_node_page_state(enum node_stat_item item)
 	return x;
 }
 
+static inline
+unsigned long global_node_page_state_pages(enum node_stat_item item)
+{
+	return global_node_page_state(item) >> PAGE_SHIFT;
+}
+
 static inline unsigned long zone_page_state(struct zone *zone,
 					enum zone_stat_item item)
 {
@@ -234,6 +240,8 @@  extern unsigned long sum_zone_node_page_state(int node,
 extern unsigned long sum_zone_numa_state(int node, enum numa_stat_item item);
 extern unsigned long node_page_state(struct pglist_data *pgdat,
 						enum node_stat_item item);
+extern unsigned long node_page_state_pages(struct pglist_data *pgdat,
+					   enum node_stat_item item);
 #else
 #define sum_zone_node_page_state(node, item) global_zone_page_state(item)
 #define node_page_state(node, item) global_node_page_state(item)
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index 83105874f255..ce9e5686e745 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -1659,7 +1659,7 @@  static unsigned long minimum_image_size(unsigned long saveable)
 {
 	unsigned long size;
 
-	size = global_node_page_state(NR_SLAB_RECLAIMABLE)
+	size = global_node_page_state_pages(NR_SLAB_RECLAIMABLE_B)
 		+ global_node_page_state(NR_ACTIVE_ANON)
 		+ global_node_page_state(NR_INACTIVE_ANON)
 		+ global_node_page_state(NR_ACTIVE_FILE)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index cb9adb31360e..e4af9810b59e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -759,13 +759,16 @@  mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
  */
 void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val)
 {
-	long x;
+	long x, threshold = MEMCG_CHARGE_BATCH;
 
 	if (mem_cgroup_disabled())
 		return;
 
+	if (vmstat_item_in_bytes(idx))
+		threshold <<= PAGE_SHIFT;
+
 	x = val + __this_cpu_read(memcg->vmstats_percpu->stat[idx]);
-	if (unlikely(abs(x) > MEMCG_CHARGE_BATCH)) {
+	if (unlikely(abs(x) > threshold)) {
 		struct mem_cgroup *mi;
 
 		/*
@@ -807,7 +810,7 @@  void __mod_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 	pg_data_t *pgdat = lruvec_pgdat(lruvec);
 	struct mem_cgroup_per_node *pn;
 	struct mem_cgroup *memcg;
-	long x;
+	long x, threshold = MEMCG_CHARGE_BATCH;
 
 	/* Update node */
 	__mod_node_page_state(pgdat, idx, val);
@@ -824,8 +827,11 @@  void __mod_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 	/* Update lruvec */
 	__this_cpu_add(pn->lruvec_stat_local->count[idx], val);
 
+	if (vmstat_item_in_bytes(idx))
+		threshold <<= PAGE_SHIFT;
+
 	x = val + __this_cpu_read(pn->lruvec_stat_cpu->count[idx]);
-	if (unlikely(abs(x) > MEMCG_CHARGE_BATCH)) {
+	if (unlikely(abs(x) > threshold)) {
 		struct mem_cgroup_per_node *pi;
 
 		for (pi = pn; pi; pi = parent_nodeinfo(pi, pgdat->node_id))
@@ -1478,9 +1484,8 @@  static char *memory_stat_format(struct mem_cgroup *memcg)
 		       (u64)memcg_page_state(memcg, MEMCG_KERNEL_STACK_KB) *
 		       1024);
 	seq_buf_printf(&s, "slab %llu\n",
-		       (u64)(memcg_page_state(memcg, NR_SLAB_RECLAIMABLE) +
-			     memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE)) *
-		       PAGE_SIZE);
+		       (u64)(memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) +
+			     memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B)));
 	seq_buf_printf(&s, "sock %llu\n",
 		       (u64)memcg_page_state(memcg, MEMCG_SOCK) *
 		       PAGE_SIZE);
@@ -1514,11 +1519,9 @@  static char *memory_stat_format(struct mem_cgroup *memcg)
 			       PAGE_SIZE);
 
 	seq_buf_printf(&s, "slab_reclaimable %llu\n",
-		       (u64)memcg_page_state(memcg, NR_SLAB_RECLAIMABLE) *
-		       PAGE_SIZE);
+		       (u64)memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B));
 	seq_buf_printf(&s, "slab_unreclaimable %llu\n",
-		       (u64)memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE) *
-		       PAGE_SIZE);
+		       (u64)memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B));
 
 	/* Accumulated memory events */
 
@@ -3564,8 +3567,8 @@  static void memcg_flush_percpu_vmstats(struct mem_cgroup *memcg, bool slab_only)
 	int min_idx, max_idx;
 
 	if (slab_only) {
-		min_idx = NR_SLAB_RECLAIMABLE;
-		max_idx = NR_SLAB_UNRECLAIMABLE;
+		min_idx = NR_SLAB_RECLAIMABLE_B;
+		max_idx = NR_SLAB_UNRECLAIMABLE_B;
 	} else {
 		min_idx = 0;
 		max_idx = MEMCG_NR_STAT;
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 314ce1a3cf25..61476bf0f147 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -183,7 +183,7 @@  static bool is_dump_unreclaim_slabs(void)
 		 global_node_page_state(NR_ISOLATED_FILE) +
 		 global_node_page_state(NR_UNEVICTABLE);
 
-	return (global_node_page_state(NR_SLAB_UNRECLAIMABLE) > nr_lru);
+	return (global_node_page_state_pages(NR_SLAB_UNRECLAIMABLE_B) > nr_lru);
 }
 
 /**
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c5d62f1c2851..9da8ee92c226 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5100,8 +5100,8 @@  long si_mem_available(void)
 	 * items that are in use, and cannot be freed. Cap this estimate at the
 	 * low watermark.
 	 */
-	reclaimable = global_node_page_state(NR_SLAB_RECLAIMABLE) +
-			global_node_page_state(NR_KERNEL_MISC_RECLAIMABLE);
+	reclaimable = global_node_page_state_pages(NR_SLAB_RECLAIMABLE_B) +
+		global_node_page_state(NR_KERNEL_MISC_RECLAIMABLE);
 	available += reclaimable - min(reclaimable / 2, wmark_low);
 
 	if (available < 0)
@@ -5245,8 +5245,8 @@  void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 		global_node_page_state(NR_FILE_DIRTY),
 		global_node_page_state(NR_WRITEBACK),
 		global_node_page_state(NR_UNSTABLE_NFS),
-		global_node_page_state(NR_SLAB_RECLAIMABLE),
-		global_node_page_state(NR_SLAB_UNRECLAIMABLE),
+		global_node_page_state_pages(NR_SLAB_RECLAIMABLE_B),
+		global_node_page_state_pages(NR_SLAB_UNRECLAIMABLE_B),
 		global_node_page_state(NR_FILE_MAPPED),
 		global_node_page_state(NR_SHMEM),
 		global_zone_page_state(NR_PAGETABLE),
diff --git a/mm/slab.h b/mm/slab.h
index 68e455f2b698..7c5577c2b9ea 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -272,7 +272,7 @@  int __kmem_cache_alloc_bulk(struct kmem_cache *, gfp_t, size_t, void **);
 static inline int cache_vmstat_idx(struct kmem_cache *s)
 {
 	return (s->flags & SLAB_RECLAIM_ACCOUNT) ?
-		NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE;
+		NR_SLAB_RECLAIMABLE_B : NR_SLAB_UNRECLAIMABLE_B;
 }
 
 #ifdef CONFIG_MEMCG_KMEM
@@ -360,7 +360,7 @@  static __always_inline int memcg_charge_slab(struct page *page,
 
 	if (unlikely(!memcg || mem_cgroup_is_root(memcg))) {
 		mod_node_page_state(page_pgdat(page), cache_vmstat_idx(s),
-				    (1 << order));
+				    (PAGE_SIZE << order));
 		percpu_ref_get_many(&s->memcg_params.refcnt, 1 << order);
 		return 0;
 	}
@@ -370,7 +370,7 @@  static __always_inline int memcg_charge_slab(struct page *page,
 		goto out;
 
 	lruvec = mem_cgroup_lruvec(page_pgdat(page), memcg);
-	mod_lruvec_state(lruvec, cache_vmstat_idx(s), 1 << order);
+	mod_lruvec_state(lruvec, cache_vmstat_idx(s), PAGE_SIZE << order);
 
 	/* transfer try_charge() page references to kmem_cache */
 	percpu_ref_get_many(&s->memcg_params.refcnt, 1 << order);
@@ -394,11 +394,12 @@  static __always_inline void memcg_uncharge_slab(struct page *page, int order,
 	memcg = READ_ONCE(s->memcg_params.memcg);
 	if (likely(!mem_cgroup_is_root(memcg))) {
 		lruvec = mem_cgroup_lruvec(page_pgdat(page), memcg);
-		mod_lruvec_state(lruvec, cache_vmstat_idx(s), -(1 << order));
+		mod_lruvec_state(lruvec, cache_vmstat_idx(s),
+				 -(PAGE_SIZE << order));
 		memcg_kmem_uncharge_memcg(page, order, memcg);
 	} else {
 		mod_node_page_state(page_pgdat(page), cache_vmstat_idx(s),
-				    -(1 << order));
+				    -(PAGE_SIZE << order));
 	}
 	rcu_read_unlock();
 
@@ -482,7 +483,7 @@  static __always_inline int charge_slab_page(struct page *page,
 {
 	if (is_root_cache(s)) {
 		mod_node_page_state(page_pgdat(page), cache_vmstat_idx(s),
-				    1 << order);
+				    PAGE_SIZE << order);
 		return 0;
 	}
 
@@ -494,7 +495,7 @@  static __always_inline void uncharge_slab_page(struct page *page, int order,
 {
 	if (is_root_cache(s)) {
 		mod_node_page_state(page_pgdat(page), cache_vmstat_idx(s),
-				    -(1 << order));
+				    -(PAGE_SIZE << order));
 		return;
 	}
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index c29f03adca91..79695d9c34f3 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1303,8 +1303,8 @@  void *kmalloc_order(size_t size, gfp_t flags, unsigned int order)
 	page = alloc_pages(flags, order);
 	if (likely(page)) {
 		ret = page_address(page);
-		mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE,
-				    1 << order);
+		mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE_B,
+				    PAGE_SIZE << order);
 	}
 	ret = kasan_kmalloc_large(ret, size, flags);
 	/* As ret might get tagged, call kmemleak hook after KASAN. */
diff --git a/mm/slob.c b/mm/slob.c
index fa53e9f73893..8b7b56235438 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -202,8 +202,8 @@  static void *slob_new_pages(gfp_t gfp, int order, int node)
 	if (!page)
 		return NULL;
 
-	mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE,
-			    1 << order);
+	mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE_B,
+			    PAGE_SIZE << order);
 	return page_address(page);
 }
 
@@ -214,8 +214,8 @@  static void slob_free_pages(void *b, int order)
 	if (current->reclaim_state)
 		current->reclaim_state->reclaimed_slab += 1 << order;
 
-	mod_node_page_state(page_pgdat(sp), NR_SLAB_UNRECLAIMABLE,
-			    -(1 << order));
+	mod_node_page_state(page_pgdat(sp), NR_SLAB_UNRECLAIMABLE_B,
+			    -(PAGE_SIZE << order));
 	__free_pages(sp, order);
 }
 
@@ -550,8 +550,8 @@  void kfree(const void *block)
 		slob_free(m, *m + align);
 	} else {
 		unsigned int order = compound_order(sp);
-		mod_node_page_state(page_pgdat(sp), NR_SLAB_UNRECLAIMABLE,
-				    -(1 << order));
+		mod_node_page_state(page_pgdat(sp), NR_SLAB_UNRECLAIMABLE_B,
+				    -(PAGE_SIZE << order));
 		__free_pages(sp, order);
 
 	}
diff --git a/mm/slub.c b/mm/slub.c
index c9856a9807f1..0873b77727bf 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3825,8 +3825,8 @@  static void *kmalloc_large_node(size_t size, gfp_t flags, int node)
 	page = alloc_pages_node(node, flags, order);
 	if (page) {
 		ptr = page_address(page);
-		mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE,
-				    1 << order);
+		mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE_B,
+				    PAGE_SIZE << order);
 	}
 
 	return kmalloc_large_node_hook(ptr, size, flags);
@@ -3957,8 +3957,8 @@  void kfree(const void *x)
 
 		BUG_ON(!PageCompound(page));
 		kfree_hook(object);
-		mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE,
-				    -(1 << order));
+		mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE_B,
+				    -(PAGE_SIZE << order));
 		__free_pages(page, order);
 		return;
 	}
diff --git a/mm/vmscan.c b/mm/vmscan.c
index ee47bbcb99b5..283da9a39de2 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4272,7 +4272,8 @@  int node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned int order)
 	 * unmapped file backed pages.
 	 */
 	if (node_pagecache_reclaimable(pgdat) <= pgdat->min_unmapped_pages &&
-	    node_page_state(pgdat, NR_SLAB_RECLAIMABLE) <= pgdat->min_slab_pages)
+	    node_page_state_pages(pgdat, NR_SLAB_RECLAIMABLE_B) <=
+	    pgdat->min_slab_pages)
 		return NODE_RECLAIM_FULL;
 
 	/*
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 98f43725d910..d04f53997fd2 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -344,6 +344,8 @@  void __mod_node_page_state(struct pglist_data *pgdat, enum node_stat_item item,
 	x = delta + __this_cpu_read(*p);
 
 	t = __this_cpu_read(pcp->stat_threshold);
+	if (vmstat_item_in_bytes(item))
+		t <<= PAGE_SHIFT;
 
 	if (unlikely(x > t || x < -t)) {
 		node_page_state_add(x, pgdat, item);
@@ -555,6 +557,8 @@  static inline void mod_node_state(struct pglist_data *pgdat,
 		 * for all cpus in a node.
 		 */
 		t = this_cpu_read(pcp->stat_threshold);
+		if (vmstat_item_in_bytes(item))
+			t <<= PAGE_SHIFT;
 
 		o = this_cpu_read(*p);
 		n = delta + o;
@@ -999,6 +1003,12 @@  unsigned long node_page_state(struct pglist_data *pgdat,
 #endif
 	return x;
 }
+
+unsigned long node_page_state_pages(struct pglist_data *pgdat,
+				    enum node_stat_item item)
+{
+	return node_page_state(pgdat, item) >> PAGE_SHIFT;
+}
 #endif
 
 #ifdef CONFIG_COMPACTION
@@ -1547,10 +1557,13 @@  static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
 	if (is_zone_first_populated(pgdat, zone)) {
 		seq_printf(m, "\n  per-node stats");
 		for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
+			unsigned long x = node_page_state(pgdat, i);
+
+			if (vmstat_item_in_bytes(i))
+				x >>= PAGE_SHIFT;
 			seq_printf(m, "\n      %-12s %lu",
 				vmstat_text[i + NR_VM_ZONE_STAT_ITEMS +
-				NR_VM_NUMA_STAT_ITEMS],
-				node_page_state(pgdat, i));
+				NR_VM_NUMA_STAT_ITEMS], x);
 		}
 	}
 	seq_printf(m,
@@ -1679,8 +1692,11 @@  static void *vmstat_start(struct seq_file *m, loff_t *pos)
 	v += NR_VM_NUMA_STAT_ITEMS;
 #endif
 
-	for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++)
+	for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
 		v[i] = global_node_page_state(i);
+		if (vmstat_item_in_bytes(i))
+			v[i] >>= PAGE_SHIFT;
+	}
 	v += NR_VM_NODE_STAT_ITEMS;
 
 	global_dirty_limits(v + NR_DIRTY_BG_THRESHOLD,
diff --git a/mm/workingset.c b/mm/workingset.c
index c963831d354f..675792387e58 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -430,8 +430,10 @@  static unsigned long count_shadow_nodes(struct shrinker *shrinker,
 		for (pages = 0, i = 0; i < NR_LRU_LISTS; i++)
 			pages += lruvec_page_state_local(lruvec,
 							 NR_LRU_BASE + i);
-		pages += lruvec_page_state_local(lruvec, NR_SLAB_RECLAIMABLE);
-		pages += lruvec_page_state_local(lruvec, NR_SLAB_UNRECLAIMABLE);
+		pages += lruvec_page_state_local(
+			lruvec, NR_SLAB_RECLAIMABLE_B) >> PAGE_SHIFT;
+		pages += lruvec_page_state_local(
+			lruvec, NR_SLAB_UNRECLAIMABLE_B) >> PAGE_SHIFT;
 	} else
 #endif
 		pages = node_present_pages(sc->nid);