Message ID | 20240118124109.37324-1-lizhe.67@bytedance.com (mailing list archive) |
---|---|
Series | kasan: introduce mem track feature |
On Thu, 18 Jan 2024 at 13:41, lizhe.67 via kasan-dev <kasan-dev@googlegroups.com> wrote:
>
> From: Li Zhe <lizhe.67@bytedance.com>
>
> 1. Problem
> ==========
> KASAN is a tool for detecting memory bugs like out-of-bounds and
> use-after-free. In Generic KASAN mode, it uses shadow memory to record
> the accessible information of the memory. After we allocate memory
> from the kernel, the shadow memory corresponding to this memory will be
> marked as accessible.
> In our daily development, memory problems often occur. If a task
> accidentally modifies memory that does not belong to itself but has
> been allocated, some strange phenomena may occur. This kind of problem
> brings a lot of trouble to our development, and unluckily, this kind of
> problem cannot be captured by KASAN. This is because as long as the
> accessible information in shadow memory shows that the corresponding
> memory can be accessed, KASAN considers the memory access to be legal.
>
> 2. Solution
> ===========
> We solve this problem by introducing a mem track feature based on KASAN
> with Generic KASAN mode. In the current kernel implementation, we use
> bits 0-2 of each shadow memory byte to store how many bytes in the
> 8-byte memory corresponding to the shadow memory byte can be accessed.
> When an 8-byte memory is inaccessible, the highest bit of its
> corresponding shadow memory value is 1. Therefore, the key idea is that
> we can use the currently unused four bits 3-6 in the shadow memory to
> record relevant track information. This means we can use one bit to
> track 2 bytes of memory. If the track bit of the shadow mem corresponding
> to a certain memory is 1, it means that the corresponding 2-byte memory
> is tracked. By adding this check logic to KASAN's callback function, we
> can use KASAN's ability to capture allocated memory corruption.

Note: "track" is already an overloaded word with KASAN, meaning some
allocation/free stack trace info + CPU id, task etc.

> 3. Simple usage
> ===============
> The first step is to mark the memory as tracked after the allocation is
> completed.
> The second step is to remove the tracked mark of the memory before the
> legal access process and re-mark the memory as tracked after finishing
> the legal access process.

It took me several readings to understand what problem you're actually
trying to solve. AFAIK, you're trying to add custom poison/unpoison
functions.

From what I can tell this is duplicating functionality: it is
perfectly legal to poison and unpoison memory while it is already
allocated. I think it used to be the case that kasan_poison/unpoison()
were API functions, but since tag-based KASAN modes this was changed
to hide the complexity here.

But you could simply expose a simpler variant of kasan_{un,}poison,
e.g. kasan_poison/unpoison_custom(). You'd have to introduce another
type (see where KASAN_PAGE_FREE, KASAN_SLAB_FREE are defined) to
distinguish this custom type from other poisoned memory.

Obviously it would be invalid to kasan_poison_custom() memory that is
already poisoned, because that would discard the pre-existing poison
type.

With that design, I believe it would also work for the inline version
of KASAN and not just the outline version.
On Thu, 18 Jan 2024 14:28:00, elver@google.com wrote:

>> 1. Problem
>> ==========
>> KASAN is a tool for detecting memory bugs like out-of-bounds and
>> use-after-free. In Generic KASAN mode, it uses shadow memory to record
>> the accessible information of the memory. After we allocate memory
>> from the kernel, the shadow memory corresponding to this memory will be
>> marked as accessible.
>> In our daily development, memory problems often occur. If a task
>> accidentally modifies memory that does not belong to itself but has
>> been allocated, some strange phenomena may occur. This kind of problem
>> brings a lot of trouble to our development, and unluckily, this kind of
>> problem cannot be captured by KASAN. This is because as long as the
>> accessible information in shadow memory shows that the corresponding
>> memory can be accessed, KASAN considers the memory access to be legal.
>>
>> 2. Solution
>> ===========
>> We solve this problem by introducing a mem track feature based on KASAN
>> with Generic KASAN mode. In the current kernel implementation, we use
>> bits 0-2 of each shadow memory byte to store how many bytes in the
>> 8-byte memory corresponding to the shadow memory byte can be accessed.
>> When an 8-byte memory is inaccessible, the highest bit of its
>> corresponding shadow memory value is 1. Therefore, the key idea is that
>> we can use the currently unused four bits 3-6 in the shadow memory to
>> record relevant track information. This means we can use one bit to
>> track 2 bytes of memory. If the track bit of the shadow mem corresponding
>> to a certain memory is 1, it means that the corresponding 2-byte memory
>> is tracked. By adding this check logic to KASAN's callback function, we
>> can use KASAN's ability to capture allocated memory corruption.
>
>Note: "track" is already an overloaded word with KASAN, meaning some
>allocation/free stack trace info + CPU id, task etc.

Thanks for the reminder, I will change it to another name in the v2 patch.

>> 3. Simple usage
>> ===============
>> The first step is to mark the memory as tracked after the allocation is
>> completed.
>> The second step is to remove the tracked mark of the memory before the
>> legal access process and re-mark the memory as tracked after finishing
>> the legal access process.
>
>It took me several readings to understand what problem you're actually
>trying to solve. AFAIK, you're trying to add custom poison/unpoison
>functions.
>
>From what I can tell this is duplicating functionality: it is
>perfectly legal to poison and unpoison memory while it is already
>allocated. I think it used to be the case that kasan_poison/unpoison()
>were API functions, but since tag-based KASAN modes this was changed
>to hide the complexity here.
>
>But you could simply expose a simpler variant of kasan_{un,}poison,
>e.g. kasan_poison/unpoison_custom(). You'd have to introduce another
>type (see where KASAN_PAGE_FREE, KASAN_SLAB_FREE are defined) to
>distinguish this custom type from other poisoned memory.
>
>Obviously it would be invalid to kasan_poison_custom() memory that is
>already poisoned, because that would discard the pre-existing poison
>type.
>
>With that design, I believe it would also work for the inline version
>of KASAN and not just the outline version.

Thank you for your review!

Yes, I am trying to add custom poison/unpoison functions which can monitor
memory in a fine-grained manner without affecting the original functionality
of KASAN. For example, for a 100-byte variable, I may only want to monitor
two specific bytes (bytes 3 and 4) in it. According to my understanding,
kasan_poison/unpoison() cannot detect the middle bytes individually. So I
don't think kasan_poison/unpoison() can do what I want.
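The granularity limit raised in this reply can be illustrated with a small user-space model of the Generic KASAN shadow encoding as the kernel documentation describes it: 0 means all 8 bytes of the granule are accessible, N in 1..7 means only the first N bytes are, and a negative value means the whole granule is inaccessible. This is a sketch only; granule_access_ok() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE 8u	/* bytes of memory described by one shadow byte */

/*
 * Hypothetical helper: decide whether an access of `size` bytes at
 * `offset` inside one granule is allowed by its shadow byte.
 */
static bool granule_access_ok(int8_t shadow, size_t offset, size_t size)
{
	assert(size > 0 && offset + size <= GRANULE_SIZE);

	if (shadow == 0)	/* all 8 bytes accessible */
		return true;
	if (shadow < 0)		/* whole granule poisoned */
		return false;
	/* 1..7: only the first `shadow` bytes are accessible */
	return offset + size <= (size_t)shadow;
}
```

With this encoding, making byte 3 of a granule inaccessible means storing the value 3, which also makes bytes 4 to 7 inaccessible. There is no value that poisons a byte in the middle of a granule while leaving the bytes after it accessible.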
On Thu, Jan 18, 2024 at 3:30 PM <lizhe.67@bytedance.com> wrote:
>
> Yes, I am trying to add custom poison/unpoison functions which can monitor
> memory in a fine-grained manner without affecting the original functionality
> of KASAN. For example, for a 100-byte variable, I may only want to monitor
> two specific bytes (bytes 3 and 4) in it. According to my understanding,
> kasan_poison/unpoison() cannot detect the middle bytes individually. So I
> don't think kasan_poison/unpoison() can do what I want.

I'm not sure this type of tracking belongs within KASAN.

If there are only a few locations you want to monitor, perhaps a
separate tool based on watchpoints would make more sense?

Another alternative is to base this functionality on KMSAN: it already
allows for bit-level precision. Plus, it would allow to only report
when the marked memory is actually being used, not when it's just
being copied. Perhaps Alexander can comment on whether this makes
sense.

If we decide to add this to KASAN or KMSAN, we need to at least also add
some in-tree users to demonstrate the functionality. And it would be
great to find some bugs with it, but perhaps syzbot will be able to
take care of that.

Thank you!
On Thu, 18 Jan 2024 14:28:00, andreyknvl@gmail.com wrote:
>
>> Yes, I am trying to add custom poison/unpoison functions which can monitor
>> memory in a fine-grained manner without affecting the original functionality
>> of KASAN. For example, for a 100-byte variable, I may only want to monitor
>> two specific bytes (bytes 3 and 4) in it. According to my understanding,
>> kasan_poison/unpoison() cannot detect the middle bytes individually. So I
>> don't think kasan_poison/unpoison() can do what I want.
>
>I'm not sure this type of tracking belongs within KASAN.
>
>If there are only a few locations you want to monitor, perhaps a
>separate tool based on watchpoints would make more sense?

Thank you for your review!

Yes, a hardware breakpoint is one way to monitor a few locations. However,
it depends on the hardware implementation, and the number of hardware
watchpoints is limited. A software solution does not have these problems.

>
>Another alternative is to base this functionality on KMSAN: it already
>allows for bit-level precision. Plus, it would allow to only report
>when the marked memory is actually being used, not when it's just
>being copied. Perhaps Alexander can comment on whether this makes
>sense.
>
>If we decide to add this to KASAN or KMSAN, we need to at least also add
>some in-tree users to demonstrate the functionality. And it would be
>great to find some bugs with it, but perhaps syzbot will be able to
>take care of that.
>
>Thank you!

In my opinion, this feature will currently only be used in our daily
debugging process. Maybe it can be used in perf later. Or do you have
any suggestions for in-tree users?
On Thu, 18 Jan 2024 at 13:41, <lizhe.67@bytedance.com> wrote:
>
> From: Li Zhe <lizhe.67@bytedance.com>
>
> 1. Problem
> ==========
> KASAN is a tool for detecting memory bugs like out-of-bounds and
> use-after-free. In Generic KASAN mode, it uses shadow memory to record
> the accessible information of the memory. After we allocate memory
> from the kernel, the shadow memory corresponding to this memory will be
> marked as accessible.
> In our daily development, memory problems often occur. If a task
> accidentally modifies memory that does not belong to itself but has
> been allocated, some strange phenomena may occur. This kind of problem
> brings a lot of trouble to our development, and unluckily, this kind of
> problem cannot be captured by KASAN. This is because as long as the
> accessible information in shadow memory shows that the corresponding
> memory can be accessed, KASAN considers the memory access to be legal.
>
> 2. Solution
> ===========
> We solve this problem by introducing a mem track feature based on KASAN
> with Generic KASAN mode. In the current kernel implementation, we use
> bits 0-2 of each shadow memory byte to store how many bytes in the
> 8-byte memory corresponding to the shadow memory byte can be accessed.
> When an 8-byte memory is inaccessible, the highest bit of its
> corresponding shadow memory value is 1. Therefore, the key idea is that
> we can use the currently unused four bits 3-6 in the shadow memory to
> record relevant track information. This means we can use one bit to
> track 2 bytes of memory. If the track bit of the shadow mem corresponding
> to a certain memory is 1, it means that the corresponding 2-byte memory
> is tracked. By adding this check logic to KASAN's callback function, we
> can use KASAN's ability to capture allocated memory corruption.
>
> 3. Simple usage
> ===============
> The first step is to mark the memory as tracked after the allocation is
> completed.
> The second step is to remove the tracked mark of the memory before the
> legal access process and re-mark the memory as tracked after finishing
> the legal access process.

KASAN already has a notion of memory poisoning/unpoisoning.
See the kasan_unpoison_range function. We don't export kasan_poison_range,
but if you do local debugging, you can export it locally.

> The first patch completes the implementation of the mem track, and the
> second patch provides an interface for using this facility, as well as
> a test case for the interface.
>
> Li Zhe (2):
>   kasan: introduce mem track feature base on kasan
>   kasan: add mem track interface and its test cases
>
>  include/linux/kasan.h        |   5 +
>  lib/Kconfig.kasan            |   9 +
>  mm/kasan/generic.c           | 437 +++++++++++++++++++++++++++++++++--
>  mm/kasan/kasan_test_module.c |  26 +++
>  mm/kasan/report_generic.c    |   6 +
>  5 files changed, 467 insertions(+), 16 deletions(-)
>
> --
> 2.20.1
>
On Mon, 22 Jan 2024 05:49:29 dvyukov@google.com wrote:
>>
>> From: Li Zhe <lizhe.67@bytedance.com>
>>
>> 1. Problem
>> ==========
>> KASAN is a tool for detecting memory bugs like out-of-bounds and
>> use-after-free. In Generic KASAN mode, it uses shadow memory to record
>> the accessible information of the memory. After we allocate memory
>> from the kernel, the shadow memory corresponding to this memory will be
>> marked as accessible.
>> In our daily development, memory problems often occur. If a task
>> accidentally modifies memory that does not belong to itself but has
>> been allocated, some strange phenomena may occur. This kind of problem
>> brings a lot of trouble to our development, and unluckily, this kind of
>> problem cannot be captured by KASAN. This is because as long as the
>> accessible information in shadow memory shows that the corresponding
>> memory can be accessed, KASAN considers the memory access to be legal.
>>
>> 2. Solution
>> ===========
>> We solve this problem by introducing a mem track feature based on KASAN
>> with Generic KASAN mode. In the current kernel implementation, we use
>> bits 0-2 of each shadow memory byte to store how many bytes in the
>> 8-byte memory corresponding to the shadow memory byte can be accessed.
>> When an 8-byte memory is inaccessible, the highest bit of its
>> corresponding shadow memory value is 1. Therefore, the key idea is that
>> we can use the currently unused four bits 3-6 in the shadow memory to
>> record relevant track information. This means we can use one bit to
>> track 2 bytes of memory. If the track bit of the shadow mem corresponding
>> to a certain memory is 1, it means that the corresponding 2-byte memory
>> is tracked. By adding this check logic to KASAN's callback function, we
>> can use KASAN's ability to capture allocated memory corruption.
>>
>> 3. Simple usage
>> ===============
>> The first step is to mark the memory as tracked after the allocation is
>> completed.
>> The second step is to remove the tracked mark of the memory before the
>> legal access process and re-mark the memory as tracked after finishing
>> the legal access process.
>
>KASAN already has a notion of memory poisoning/unpoisoning.
>See the kasan_unpoison_range function. We don't export kasan_poison_range,
>but if you do local debugging, you can export it locally.

Thank you for your review!

For example, for a 100-byte variable, I may only want to monitor
two specific bytes (bytes 3 and 4) in it. According to my understanding,
kasan_poison/unpoison() cannot detect the middle bytes individually. So I
don't think kasan_poison_range() can do what I want.

>
>> The first patch completes the implementation of the mem track, and the
>> second patch provides an interface for using this facility, as well as
>> a test case for the interface.
>>
>> Li Zhe (2):
>>   kasan: introduce mem track feature base on kasan
>>   kasan: add mem track interface and its test cases
>>
>>  include/linux/kasan.h        |   5 +
>>  lib/Kconfig.kasan            |   9 +
>>  mm/kasan/generic.c           | 437 +++++++++++++++++++++++++++++++++--
>>  mm/kasan/kasan_test_module.c |  26 +++
>>  mm/kasan/report_generic.c    |   6 +
>>  5 files changed, 467 insertions(+), 16 deletions(-)
On Mon, 22 Jan 2024 at 07:26, <lizhe.67@bytedance.com> wrote:
> >> From: Li Zhe <lizhe.67@bytedance.com>
> >>
> >> 1. Problem
> >> ==========
> >> KASAN is a tool for detecting memory bugs like out-of-bounds and
> >> use-after-free. In Generic KASAN mode, it uses shadow memory to record
> >> the accessible information of the memory. After we allocate memory
> >> from the kernel, the shadow memory corresponding to this memory will be
> >> marked as accessible.
> >> In our daily development, memory problems often occur. If a task
> >> accidentally modifies memory that does not belong to itself but has
> >> been allocated, some strange phenomena may occur. This kind of problem
> >> brings a lot of trouble to our development, and unluckily, this kind of
> >> problem cannot be captured by KASAN. This is because as long as the
> >> accessible information in shadow memory shows that the corresponding
> >> memory can be accessed, KASAN considers the memory access to be legal.
> >>
> >> 2. Solution
> >> ===========
> >> We solve this problem by introducing a mem track feature based on KASAN
> >> with Generic KASAN mode. In the current kernel implementation, we use
> >> bits 0-2 of each shadow memory byte to store how many bytes in the
> >> 8-byte memory corresponding to the shadow memory byte can be accessed.
> >> When an 8-byte memory is inaccessible, the highest bit of its
> >> corresponding shadow memory value is 1. Therefore, the key idea is that
> >> we can use the currently unused four bits 3-6 in the shadow memory to
> >> record relevant track information. This means we can use one bit to
> >> track 2 bytes of memory. If the track bit of the shadow mem corresponding
> >> to a certain memory is 1, it means that the corresponding 2-byte memory
> >> is tracked. By adding this check logic to KASAN's callback function, we
> >> can use KASAN's ability to capture allocated memory corruption.
> >>
> >> 3. Simple usage
> >> ===============
> >> The first step is to mark the memory as tracked after the allocation is
> >> completed.
> >> The second step is to remove the tracked mark of the memory before the
> >> legal access process and re-mark the memory as tracked after finishing
> >> the legal access process.
> >
> >KASAN already has a notion of memory poisoning/unpoisoning.
> >See the kasan_unpoison_range function. We don't export kasan_poison_range,
> >but if you do local debugging, you can export it locally.
>
> Thank you for your review!
>
> For example, for a 100-byte variable, I may only want to monitor
> two specific bytes (bytes 3 and 4) in it. According to my understanding,
> kasan_poison/unpoison() cannot detect the middle bytes individually. So I
> don't think kasan_poison_range() can do what I want.

That's something to note in the description/comments.

How many ranges do you intend to protect this way?
If that's not too many, then a better option would be to poison these
ranges normally and store the ranges that a thread can currently access
on the side.
This will give 1-byte precision, separate filtering for reads and writes,
and better diagnostics.

> >> The first patch completes the implementation of the mem track, and the
> >> second patch provides an interface for using this facility, as well as
> >> a test case for the interface.
> >>
> >> Li Zhe (2):
> >>   kasan: introduce mem track feature base on kasan
> >>   kasan: add mem track interface and its test cases
> >>
> >>  include/linux/kasan.h        |   5 +
> >>  lib/Kconfig.kasan            |   9 +
> >>  mm/kasan/generic.c           | 437 +++++++++++++++++++++++++++++++++--
> >>  mm/kasan/kasan_test_module.c |  26 +++
> >>  mm/kasan/report_generic.c    |   6 +
> >>  5 files changed, 467 insertions(+), 16 deletions(-)
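One way to picture the side-table idea from this reply is sketched below as a user-space model. It assumes the protected memory is poisoned in the normal way and that, when an access lands in poisoned memory, a small table of currently permitted ranges is consulted before reporting. All names here (allow_table, allow_range(), access_allowed(), MAX_ALLOWED) are hypothetical and not part of KASAN; in a real design the table would presumably be kept per thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Access kinds, so reads and writes can be filtered separately. */
enum access_kind { ACCESS_READ = 1, ACCESS_WRITE = 2 };

struct allowed_range {
	uintptr_t start;	/* first permitted byte */
	size_t len;		/* number of permitted bytes */
	unsigned int kinds;	/* bitmask of enum access_kind */
};

#define MAX_ALLOWED 16		/* assumed small: "not too many" ranges */

struct allow_table {
	struct allowed_range r[MAX_ALLOWED];
	size_t n;
};

/* Record that [start, start + len) may currently be accessed. */
static bool allow_range(struct allow_table *t, uintptr_t start, size_t len,
			unsigned int kinds)
{
	if (t->n == MAX_ALLOWED || len == 0)
		return false;
	t->r[t->n++] = (struct allowed_range){ start, len, kinds };
	return true;
}

/*
 * Would be called for an access that hit poisoned memory: true if the
 * access lies entirely inside one permitted range of the right kind.
 */
static bool access_allowed(const struct allow_table *t, uintptr_t addr,
			   size_t size, enum access_kind kind)
{
	for (size_t i = 0; i < t->n; i++) {
		const struct allowed_range *r = &t->r[i];

		if ((r->kinds & kind) && addr >= r->start &&
		    size <= r->len && addr - r->start <= r->len - size)
			return true;
	}
	return false;
}
```

Because the ranges are stored as byte addresses and lengths, the check is not tied to the 8-byte shadow granule, and a range can be permitted for reads only while writes are still reported.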
On Mon, 22 Jan 2024 08:03:17, dvyukov@google.com wrote:
>> >> From: Li Zhe <lizhe.67@bytedance.com>
>> >>
>> >> 1. Problem
>> >> ==========
>> >> KASAN is a tool for detecting memory bugs like out-of-bounds and
>> >> use-after-free. In Generic KASAN mode, it uses shadow memory to record
>> >> the accessible information of the memory. After we allocate memory
>> >> from the kernel, the shadow memory corresponding to this memory will be
>> >> marked as accessible.
>> >> In our daily development, memory problems often occur. If a task
>> >> accidentally modifies memory that does not belong to itself but has
>> >> been allocated, some strange phenomena may occur. This kind of problem
>> >> brings a lot of trouble to our development, and unluckily, this kind of
>> >> problem cannot be captured by KASAN. This is because as long as the
>> >> accessible information in shadow memory shows that the corresponding
>> >> memory can be accessed, KASAN considers the memory access to be legal.
>> >>
>> >> 2. Solution
>> >> ===========
>> >> We solve this problem by introducing a mem track feature based on KASAN
>> >> with Generic KASAN mode. In the current kernel implementation, we use
>> >> bits 0-2 of each shadow memory byte to store how many bytes in the
>> >> 8-byte memory corresponding to the shadow memory byte can be accessed.
>> >> When an 8-byte memory is inaccessible, the highest bit of its
>> >> corresponding shadow memory value is 1. Therefore, the key idea is that
>> >> we can use the currently unused four bits 3-6 in the shadow memory to
>> >> record relevant track information. This means we can use one bit to
>> >> track 2 bytes of memory. If the track bit of the shadow mem corresponding
>> >> to a certain memory is 1, it means that the corresponding 2-byte memory
>> >> is tracked. By adding this check logic to KASAN's callback function, we
>> >> can use KASAN's ability to capture allocated memory corruption.
>> >>
>> >> 3. Simple usage
>> >> ===============
>> >> The first step is to mark the memory as tracked after the allocation is
>> >> completed.
>> >> The second step is to remove the tracked mark of the memory before the
>> >> legal access process and re-mark the memory as tracked after finishing
>> >> the legal access process.
>> >
>> >KASAN already has a notion of memory poisoning/unpoisoning.
>> >See the kasan_unpoison_range function. We don't export kasan_poison_range,
>> >but if you do local debugging, you can export it locally.
>>
>> Thank you for your review!
>>
>> For example, for a 100-byte variable, I may only want to monitor
>> two specific bytes (bytes 3 and 4) in it. According to my understanding,
>> kasan_poison/unpoison() cannot detect the middle bytes individually. So I
>> don't think kasan_poison_range() can do what I want.
>
>That's something to note in the description/comments.
>
>How many ranges do you intend to protect this way?
>If that's not too many, then a better option would be to poison these
>ranges normally and store the ranges that a thread can currently access
>on the side.
>This will give 1-byte precision, separate filtering for reads and writes,
>and better diagnostics.

OK, I will find a better method to solve this problem. Thank you!

>
>> >> The first patch completes the implementation of the mem track, and the
>> >> second patch provides an interface for using this facility, as well as
>> >> a test case for the interface.
>> >>
>> >> Li Zhe (2):
>> >>   kasan: introduce mem track feature base on kasan
>> >>   kasan: add mem track interface and its test cases
>> >>
>> >>  include/linux/kasan.h        |   5 +
>> >>  lib/Kconfig.kasan            |   9 +
>> >>  mm/kasan/generic.c           | 437 +++++++++++++++++++++++++++++++++--
>> >>  mm/kasan/kasan_test_module.c |  26 +++
>> >>  mm/kasan/report_generic.c    |   6 +
>> >>  5 files changed, 467 insertions(+), 16 deletions(-)
From: Li Zhe <lizhe.67@bytedance.com>

1. Problem
==========
KASAN is a tool for detecting memory bugs like out-of-bounds and
use-after-free. In Generic KASAN mode, it uses shadow memory to record
the accessible information of the memory. After we allocate memory
from the kernel, the shadow memory corresponding to this memory will be
marked as accessible.
In our daily development, memory problems often occur. If a task
accidentally modifies memory that does not belong to itself but has
been allocated, some strange phenomena may occur. This kind of problem
brings a lot of trouble to our development, and unluckily, this kind of
problem cannot be captured by KASAN. This is because as long as the
accessible information in shadow memory shows that the corresponding
memory can be accessed, KASAN considers the memory access to be legal.

2. Solution
===========
We solve this problem by introducing a mem track feature based on KASAN
with Generic KASAN mode. In the current kernel implementation, we use
bits 0-2 of each shadow memory byte to store how many bytes in the
8-byte memory corresponding to the shadow memory byte can be accessed.
When an 8-byte memory is inaccessible, the highest bit of its
corresponding shadow memory value is 1. Therefore, the key idea is that
we can use the currently unused four bits 3-6 in the shadow memory to
record relevant track information. This means we can use one bit to
track 2 bytes of memory. If the track bit of the shadow mem corresponding
to a certain memory is 1, it means that the corresponding 2-byte memory
is tracked. By adding this check logic to KASAN's callback function, we
can use KASAN's ability to capture allocated memory corruption.

3. Simple usage
===============
The first step is to mark the memory as tracked after the allocation is
completed.
The second step is to remove the tracked mark of the memory before the
legal access process and re-mark the memory as tracked after finishing
the legal access process.

The first patch completes the implementation of the mem track, and the
second patch provides an interface for using this facility, as well as
a test case for the interface.

Li Zhe (2):
  kasan: introduce mem track feature base on kasan
  kasan: add mem track interface and its test cases

 include/linux/kasan.h        |   5 +
 lib/Kconfig.kasan            |   9 +
 mm/kasan/generic.c           | 437 +++++++++++++++++++++++++++++++++--
 mm/kasan/kasan_test_module.c |  26 +++
 mm/kasan/report_generic.c    |   6 +
 5 files changed, 467 insertions(+), 16 deletions(-)
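The shadow-byte layout described in section 2 of the cover letter can be modelled in user space roughly as follows. This is only a sketch of the idea, not the code from the patches: the macro and function names are hypothetical, and the mapping of bit 3 to bytes 0-1, bit 4 to bytes 2-3, and so on is an assumption, since the cover letter only says that each of the four bits tracks 2 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE	8u	/* bytes of memory per shadow byte */
#define ACCESSIBLE_MASK	0x07u	/* bits 0-2: accessible byte count */
#define TRACK_SHIFT	3u	/* bits 3-6: proposed track bits */
#define TRACK_MASK	0x78u
#define POISON_BIT	0x80u	/* bit 7: granule is inaccessible */

/* Assumed mapping: byte offsets 0-1 use bit 3, 2-3 bit 4, 4-5 bit 5, 6-7 bit 6. */
static uint8_t track_bit_for_offset(size_t offset)
{
	assert(offset < GRANULE_SIZE);
	return (uint8_t)(1u << (TRACK_SHIFT + offset / 2));
}

/* Mark the 2-byte pair containing `offset` as tracked. */
static void mark_tracked(uint8_t *shadow, size_t offset)
{
	*shadow |= track_bit_for_offset(offset);
}

/* Remove the tracked mark, e.g. before a legitimate access. */
static void clear_tracked(uint8_t *shadow, size_t offset)
{
	*shadow &= (uint8_t)~track_bit_for_offset(offset);
}

/* True if an access inside one granule touches any tracked 2-byte pair. */
static bool access_hits_tracked(uint8_t shadow, size_t offset, size_t size)
{
	assert(size > 0 && offset + size <= GRANULE_SIZE);

	for (size_t i = offset; i < offset + size; i++)
		if (shadow & track_bit_for_offset(i))
			return true;
	return false;
}
```

Under these assumptions, the usage in section 3 corresponds to calling mark_tracked() after allocation, clear_tracked() before a legitimate access, and mark_tracked() again afterwards, with access_hits_tracked() standing in for the extra check in the KASAN callback. Note that the granularity is 2 bytes, so tracking a single byte also tracks its neighbour in the same pair.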