Message ID | 20210924033547.939554938@goodmis.org (mailing list archive) |
---|---|
Headers | show |
Series | tracing: Have trace_pid_list be a sparse array | expand |
On Thu, 23 Sep 2021 23:35:47 -0400 Steven Rostedt <rostedt@goodmis.org> wrote: > The pid_mask will start out with 1024 entries for the first 10 MSB bits. > This will cost 4K for 32 bit architectures and 8K for 64 bit. Each of > these will have a 1024 array to store the next 10 bits of the pid (another > 4 or 8K). These will hold an 512 byte bitmask (which will cover the LSB 12 > bits or 4096 bits). Thinking about this more, I should adjust this split. Currently, this is a 10,10,12 split, but since the upper chunks are arrays of pointers, and the lower chunk is a bitmask, and that pids tend to be close together, I should make the lower split bigger. As a 4K page is 32768 bits (2^15), the lower split should be 15 bits, not 12. This will probably allocate better. Of course 32 - 15 is 17. So maybe to keep it simple, by having the two upper chunks still the same in size, I could have it be 14 bits for the lower (2048 bytes). And since pid_max can only be 2^30, we don't even need to cover the full 32 bits. 30 - 14 = 16 = 8 * 2. Then I can make the upper chunks cover 8 bits (arrays of 256 pointers) and the lower chunks cove 14 bits. This would coincidentally make all chunks 2K in size (on 64 bit machines). Hmm, in that case, I can make the lower and upper chunks use the same memory and not separate them. They are unions after all. But that may be unfair for 32 bit machines, as the upper chunks are only going to be 1K in size for them (256 * 4). -- Steve
On Fri, Sep 24, 2021 at 6:07 AM Steven Rostedt <rostedt@goodmis.org> wrote: > > On Thu, 23 Sep 2021 23:35:47 -0400 > Steven Rostedt <rostedt@goodmis.org> wrote: > > > The pid_mask will start out with 1024 entries for the first 10 MSB bits. > > This will cost 4K for 32 bit architectures and 8K for 64 bit. Each of > > these will have a 1024 array to store the next 10 bits of the pid (another > > 4 or 8K). These will hold an 512 byte bitmask (which will cover the LSB 12 > > bits or 4096 bits). > > Thinking about this more, I should adjust this split. > > Currently, this is a 10,10,12 split, but since the upper chunks are > arrays of pointers, and the lower chunk is a bitmask, and that pids > tend to be close together, I should make the lower split bigger. > > As a 4K page is 32768 bits (2^15), the lower split should be 15 bits, > not 12. This will probably allocate better. > > Of course 32 - 15 is 17. So maybe to keep it simple, by having the two > upper chunks still the same in size, I could have it be 14 bits for the > lower (2048 bytes). And since pid_max can only be 2^30, we don't even > need to cover the full 32 bits. > > 30 - 14 = 16 = 8 * 2. > > Then I can make the upper chunks cover 8 bits (arrays of 256 pointers) > and the lower chunks cove 14 bits. This would coincidentally make all > chunks 2K in size (on 64 bit machines). > > Hmm, in that case, I can make the lower and upper chunks use the same > memory and not separate them. They are unions after all. But that may > be unfair for 32 bit machines, as the upper chunks are only going to be > 1K in size for them (256 * 4). Note that there is only one top-level chunk (so its size doesn't matter much), (usually) about one middle-tier chunk (except for some heavy cases, since pids are allocated linearly), and quite some lower-tier bitset leaves. So I'd optimise towards smaller leaves at the expense of middle-tier (and especially top-tier) chunk size (especially considering the fact that in the kernel, buddy allocator is used), like 12-8-12 or something like that, but I have no factual basis for arguing about specific split. Also, I cannot resist from noticing that this reminds me an awful lot of XArray and [1]. Maybe, some wrapper around XArray would do? [1] https://gitlab.com/strace/strace/-/raw/master/src/trie.h > > -- Steve > -- Eugene Syromyatnikov mailto:evgsyr@gmail.com xmpp:esyr@jabber.{ru|org}
On Fri, 24 Sep 2021 12:51:07 +0200 Eugene Syromyatnikov <evgsyr@gmail.com> wrote: Hi Eugene, > Note that there is only one top-level chunk (so its size doesn't > matter much), (usually) about one middle-tier chunk (except for some > heavy cases, since pids are allocated linearly), and quite some > lower-tier bitset leaves. So I'd optimise towards smaller leaves at > the expense of middle-tier (and especially top-tier) chunk size > (especially considering the fact that in the kernel, buddy allocator > is used), like 12-8-12 or something like that, but I have no factual What I really like about my 8 8 14 split I have, it makes everything 2K in size on 64 bit machines (1K 1K 2K for 32 bit, but who cares ;-) 1 << 8 * 8 bytes = 2K // top tiers are pointers to lower tiers 1 << 14 bits = 2K // lower tier only cares about bits This means they will likely all be allocated in the same slab. I'm optimizing the top tiers for size, because they are likely to be empty. Why add memory for something that will never be used, and can't be removed. Note, the middle and lower tiers can be reused when they go empty, which is a likely use case (at least when I test this using hackbench). > basis for arguing about specific split. Also, I cannot resist from > noticing that this reminds me an awful lot of XArray and [1]. Maybe, > some wrapper around XArray would do? > > [1] https://gitlab.com/strace/strace/-/raw/master/src/trie.h > I looked into xarray and it appears to be optimized for storing something, where as I'm just interested in a sparse bitmask. Thanks for the review. Note, I'll be posting a v3 soon because I found if I echo 1<<30 into set_event_pid, it adds 0 (because it only looks at the bottom 30 bits). It should really return -EINVAL. -- Steve
On Fri, 24 Sep 2021 09:16:27 -0400 Steven Rostedt <rostedt@goodmis.org> wrote: > On Fri, 24 Sep 2021 12:51:07 +0200 > Eugene Syromyatnikov <evgsyr@gmail.com> wrote: > > Hi Eugene, > > > Note that there is only one top-level chunk (so its size doesn't > > matter much), (usually) about one middle-tier chunk (except for some > > heavy cases, since pids are allocated linearly), and quite some > > lower-tier bitset leaves. So I'd optimise towards smaller leaves at > > the expense of middle-tier (and especially top-tier) chunk size > > (especially considering the fact that in the kernel, buddy allocator > > is used), like 12-8-12 or something like that, but I have no factual > > What I really like about my 8 8 14 split I have, it makes everything 2K in > size on 64 bit machines (1K 1K 2K for 32 bit, but who cares ;-) > > 1 << 8 * 8 bytes = 2K // top tiers are pointers to lower tiers > 1 << 14 bits = 2K // lower tier only cares about bits > > This means they will likely all be allocated in the same slab. > > I'm optimizing the top tiers for size, because they are likely to be empty. > Why add memory for something that will never be used, and can't be removed. > Note, the middle and lower tiers can be reused when they go empty, which is > a likely use case (at least when I test this using hackbench). > > > basis for arguing about specific split. Also, I cannot resist from > > noticing that this reminds me an awful lot of XArray and [1]. Maybe, > > some wrapper around XArray would do? > > > > [1] https://gitlab.com/strace/strace/-/raw/master/src/trie.h > > > > I looked into xarray and it appears to be optimized for storing something, > where as I'm just interested in a sparse bitmask. I guess he suggested that store the bitmask in xarray. Anyway, both are OK to me. This is needed for reducing the memory. Thank you, > > Thanks for the review. > > Note, I'll be posting a v3 soon because I found if I echo 1<<30 into > set_event_pid, it adds 0 (because it only looks at the bottom 30 bits). > It should really return -EINVAL. > > -- Steve
On Fri, Sep 24, 2021 at 3:35 PM Masami Hiramatsu <mhiramat@kernel.org> wrote: > > On Fri, 24 Sep 2021 09:16:27 -0400 > Steven Rostedt <rostedt@goodmis.org> wrote: > > I'm optimizing the top tiers for size, because they are likely to be empty. > > Why add memory for something that will never be used, and can't be removed. > > Note, the middle and lower tiers can be reused when they go empty, which is > > a likely use case (at least when I test this using hackbench). Makes sense; I was thinking about worse case scenarios—tracing thousands+ processes, but those probably not as common and important. > > I looked into xarray and it appears to be optimized for storing something, > > where as I'm just interested in a sparse bitmask. > > I guess he suggested that store the bitmask in xarray. Anyway, both are > OK to me. This is needed for reducing the memory. Yes, the idea was to store pointers to bitset leaves in XArray and leverage its radix tree implementation, at cost of somewhat lesser efficiency (since XArray indices are longs and thus it employs more intermediate levels on 64-bit architectures).