Message ID | CANeU7Qk8VCsj_EDBrj9H1bjR5a-_bLPQJiza_-eC90j=qiw7xw@mail.gmail.com (mailing list archive)
---|---
State | Rejected, archived
On Wed, Jul 12, 2017 at 10:27:25PM -0700, Christopher Li wrote:
> From 7264a822d9d9e545152ac675fbc5b5e970ce254b Mon Sep 17 00:00:00 2001
> From: Christopher Li <sparse@chrisli.org>
> Date: Wed, 12 Jul 2017 14:46:50 -0700
> Subject: [PATCH] Let pseudo->users loop on duplicate version of list
>
> The pseudo->users list will change during find dominator.
> That causes a bug in the ptrlist because the outer loop iterator
> is not aware of the deletion of the entry.
>
> Let the outer loop use a duplicate version of the list
> to avoid this problem for now.

When I see this description, the first thing I think is:
"you have a problem with an object, you duplicate the object,
and now you have (at least) two problems".
In short, how will you keep coherency between the two?

So yes, without any doubt, this fixes the outer loop's iteration not
correctly taking into account the deletions done in the inner loop.
But is the code working correctly now? I don't think so.
In fact, since the outer loop is using a copy of the initial list,
*any* change to the list will now simply be ignored by the outer loop:
- if a user is removed from the users list by the inner loop (for
  example at a position much further along than the outer loop's
  current position), previously the outer loop would of course not
  have processed this user, while now it will process it even though
  it is no longer a user of this pseudo. Is there any explanation of
  why this behaviour would be correct?
- likewise, if the inner loop adds a new user, the outer loop will
  now never process it.

To be clear, the issue I'm raising here is not about 'maybe missing an
optimization' but about applying an optimization while the real
conditions are in fact not met, and thus producing incorrect code.

-- Luc
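Luc's objection can be illustrated with a small, self-contained sketch. This is not sparse code; the toy list type, the values, and remove_user() are all invented for illustration. The outer loop walks a snapshot of the user list, so a removal performed by the "inner" work is invisible to it and the stale entry still gets processed:

    #include <stdio.h>
    #include <string.h>

    /* Invented toy "user list": a fixed array plus a count. */
    struct users { int nr; int user[8]; };

    /* Stand-in for the inner-loop work that may remove a user. */
    static void remove_user(struct users *u, int val)
    {
            for (int i = 0; i < u->nr; i++) {
                    if (u->user[i] == val) {
                            memmove(&u->user[i], &u->user[i + 1],
                                    (u->nr - i - 1) * sizeof(u->user[0]));
                            u->nr--;
                            return;
                    }
            }
    }

    int main(void)
    {
            struct users real = { 4, { 1, 2, 3, 4 } };
            struct users copy = real;               /* the duplicated list */

            for (int i = 0; i < copy.nr; i++) {     /* outer loop on the copy */
                    int user = copy.user[i];
                    if (user == 1)
                            remove_user(&real, 3);  /* "inner" work changes the real list */
                    printf("processing user %d\n", user);   /* still processes 3 */
            }
            return 0;
    }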
On Wed, Jul 19, 2017 at 2:14 PM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> When I see this description, the first thing I think is:
> "you have a problem with an object, you duplicate the object,
> and now you have (at least) two problems".
> In short, how will you keep coherency between the two?

We don't :-). Both the remove case and the insert case have been
discussed. First of all, the ptrlist-refcount patch checks for the
insert case, and so far there is none. If you believe there is one,
I would like to know.

About looping on the deleted entry: that is the very first thing I
called out in the V1 version of the patch. The thing is, the
pseudo_user has a pointer to the instruction. If the instruction is
deleted, then that instruction will have insn->bb == NULL.

Not a perfect fix, but better than the previous situation, because the
for loop is actually looping in reverse order: any delete causes a
major alignment problem, since __nr counts from the tail.

Anyway, my V1 and the email asking for suggestions on how to fix it
went out about 10 days ago. I haven't heard anything better. If you
have another way to fix it properly, I am all for it. It is not too
late, but time is running out soon.

Chris
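The "__nr counts from the tail" problem Chris mentions can be shown with a simplified model. This is not sparse's real FOR_EACH_PTR_REVERSE implementation, just a toy array-backed list walked by a slot index from the tail; deleting an entry on the head side shifts the remaining slots, so the reverse walker no longer lines up with the list:

    #include <stdio.h>

    /* Toy stand-in for one ptrlist node: an array plus a count. */
    struct list { int nr; int entry[8]; };

    static void delete_at(struct list *l, int idx)
    {
            /* compact the array, the way packing a ptrlist node would */
            for (int i = idx; i < l->nr - 1; i++)
                    l->entry[i] = l->entry[i + 1];
            l->nr--;
    }

    int main(void)
    {
            struct list l = { 5, { 10, 11, 12, 13, 14 } };

            /* reverse walk: the cursor is a slot index, like __nr */
            for (int nr = l.nr - 1; nr >= 0; nr--) {
                    int cur = l.entry[nr];
                    printf("visiting %d\n", cur);
                    if (cur == 14)
                            delete_at(&l, 0);       /* "inner" work removes entry 10 */
                    /* the deletion shifted every later slot down by one, so
                     * the next iteration (nr == 3) lands on 14 again: one
                     * entry is visited twice and the cursor no longer
                     * matches the list */
            }
            return 0;
    }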
On Wed, Jul 19, 2017 at 11:42 PM, Christopher Li <sparse@chrisli.org> wrote:
> On Wed, Jul 19, 2017 at 2:14 PM, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
>> When I see this description, the first thing I think is:
>> "you have a problem with an object, you duplicate the object,
>> and now you have (at least) two problems".
>> In short, how will you keep coherency between the two?
>
> We don't :-). Both the remove case and the insert case have been
> discussed. First of all, the ptrlist-refcount patch checks for the
> insert case, and so far there is none. If you believe there is one,
> I would like to know.

Some add_user() can be called in this loop, so it's just a question of
some input code complex enough to have an add_user() done on the same
pseudo as the one concerned by the outer loop.

> About looping on the deleted entry: that is the very first thing I
> called out in the V1 version of the patch. The thing is, the
> pseudo_user has a pointer to the instruction. If the instruction is
> deleted, then that instruction will have insn->bb == NULL.
>
> Not a perfect fix, but better than the previous situation, because the
> for loop is actually looping in reverse order: any delete causes a
> major alignment problem, since __nr counts from the tail.

Yes, those list deletes inside iteration loops are definitively to be
avoided.

> Anyway, my V1 and the email asking for suggestions on how to fix it
> went out about 10 days ago. I haven't heard anything better. If you
> have another way to fix it properly, I am all for it. It is not too
> late, but time is running out soon.

I have stated several times that the only real solution is to mark the
elements as being deleted (and only effectively delete them in some
safe situations), exactly like it is done for instructions (which are
probably the type most often deleted from lists; trying to effectively
remove them from lists, as is currently done for the other types, would
most probably fail in the most horrible ways).
It's a simple & clean solution, provably correct in *all* situations.
But I don't remember having seen any reply from you on this subject.

-- Luc
On Wed, Jul 19, 2017 at 3:51 PM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> Some add_user() can be called in this loop, so it's just a question of
> some input code complex enough to have an add_user() done on the same
> pseudo as the one concerned by the outer loop.

Can you point me to a call stack that triggers the add_user()?

I have run the full kernel compile and it did not trigger it. The other
two nested-loop deletes were caught within the first 15 files or so.

> I have stated several times that the only real solution is to mark the
> elements as being deleted (and only effectively delete them in some
> safe situations), exactly like it is done for instructions (which are
> probably the type most often deleted from lists; trying to effectively
> remove them from lists, as is currently done for the other types, would
> most probably fail in the most horrible ways).
> It's a simple & clean solution, provably correct in *all* situations.
> But I don't remember having seen any reply from you on this subject.

That is because you were away and you did not read through the email.
I did comment on it. It isn't correct in *all* situations; we need a
bigger hammer.

I located it for you:

http://marc.info/?l=linux-sparse&m=149987902920449&w=3
==========quote================
Even Luc's suggestion for "mark and sweep" to delete the ptrlist
is not going to help with the nested loop split. I will add that check.
I don't expect new offenders but let's make sure about it.
==========end quote=============

http://marc.info/?l=linux-sparse&m=149969288517115&w=3
=============quote===============
> I earlier suggested to instead change the way elements are 'deleted'
> (like instructions are already never removed from their list but
> just marked as deleted).

That looks simple but it has hidden complications as well. The issue is
that we need to find out which lists need this kind of special
treatment, and who the outer loop is. If the same function can be
called both as the outer loop and as the inner loop, then it is
complicated to decide when it should do the finalize. There is also the
price to pay for walking the list twice, which does not exist if the
nested loop can be avoided.
=============end quote==============

Chris
On Thu, Jul 20, 2017 at 4:34 AM, Christopher Li <sparse@chrisli.org> wrote:
> On Wed, Jul 19, 2017 at 3:51 PM, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
>>
>> Some add_user() can be called in this loop, so it's just a question of
>> some input code complex enough to have an add_user() done on the same
>> pseudo as the one concerned by the outer loop.
>
> Can you point me to a call stack that triggers the add_user()?
>
> I have run the full kernel compile and it did not trigger it. The other
> two nested-loop deletes were caught within the first 15 files or so.
>
>> I have stated several times that the only real solution is to mark the
>> elements as being deleted (and only effectively delete them in some
>> safe situations), exactly like it is done for instructions (which are
>> probably the type most often deleted from lists; trying to effectively
>> remove them from lists, as is currently done for the other types, would
>> most probably fail in the most horrible ways).
>> It's a simple & clean solution, provably correct in *all* situations.
>> But I don't remember having seen any reply from you on this subject.
>
> That is because you were away and you did not read through the email.
> I did comment on it. It isn't correct in *all* situations; we need a
> bigger hammer.
>
> I located it for you:
>
> http://marc.info/?l=linux-sparse&m=149987902920449&w=3
> ==========quote================
> Even Luc's suggestion for "mark and sweep" to delete the ptrlist
> is not going to help with the nested loop split. I will add that check.
> I don't expect new offenders but let's make sure about it.
> ==========end quote=============

I don't think you understood what I meant.

> http://marc.info/?l=linux-sparse&m=149969288517115&w=3
> =============quote===============
>> I earlier suggested to instead change the way elements are 'deleted'
>> (like instructions are already never removed from their list but
>> just marked as deleted).
>
> That looks simple but it has hidden complications as well. The issue is
> that we need to find out which lists need this kind of special
> treatment.

It's very simple: *all* lists need this 'treatment'.

> Who the outer loop is: if the same function can be called both as the
> outer loop and as the inner loop, then it is complicated to decide when
> it should do the finalize. There is also the price to pay for walking
> the list twice, which does not exist if the nested loop can be avoided.

Walking twice?

In general, we can't avoid nested loops.

Look at what is done currently with instructions: when we want to
'delete' an instruction we simply set its ->bb to NULL, which basically
means: "for now, please just ignore this instruction". In other words,
instructions are *never* removed from their lists, they are simply
marked as being deleted. You don't need to know whether you're in a
nested loop or not. You will never have the problem of deletion during
nested list walking, because elements are never removed from the lists.

Why do you think it has been done that way?
Do you think it's bad, and that it would be better to effectively remove
instructions from their lists like the other types?
Do you think it would be good, or even possible, to avoid having nested
loops of instructions?
I don't think so.

-- Luc
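To make the convention Luc refers to concrete, here is a minimal sketch of the "mark as deleted" idea, using a simplified, invented instruction/list model rather than sparse's real types and macros: deletion only clears ->bb, walkers skip such entries, and nothing is ever unlinked while a walk is in progress. A separate pass, run at a known-safe point, could later compact the list to really reclaim the slots.

    struct basic_block;

    struct instruction {
            struct basic_block *bb; /* NULL means "deleted, please ignore me" */
            int opcode;
    };

    static void kill_insn(struct instruction *insn)
    {
            insn->bb = NULL;        /* mark only; never unlink from any list */
    }

    static void walk(struct instruction **list, int nr)
    {
            for (int i = 0; i < nr; i++) {
                    struct instruction *insn = list[i];
                    if (!insn->bb)
                            continue;       /* skip marked-deleted instructions */
                    /* ... process insn; nested walks may call kill_insn()
                     * on any entry without disturbing this loop ... */
            }
    }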
On Wed, Aug 2, 2017 at 7:44 PM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>> http://marc.info/?l=linux-sparse&m=149987902920449&w=3
>> ==========quote================
>> Even Luc's suggestion for "mark and sweep" to delete the ptrlist
>> is not going to help with the nested loop split. I will add that check.
>> I don't expect new offenders but let's make sure about it.
>> ==========end quote=============
>
> I don't think you understood what I meant.

I don't think so. Can you please explain?

Let's take an example of the insert case. The outer loop is doing:

    FOR_EACH_PTR_REVERSE(bbs, bb) {
        /*
         * The outer loop macro is holding __nr = list->nr and
         * __list = list. It uses __nr as the slot number pointing
         * to the current slot.
         */
        ...
        add_bb(bbs, new_bb);
        /*
         * Now the list->list[] array has been split and list->nr has
         * fewer entries. __nr does not know about that, so it is not
         * going to point to the right element any more.
         */
        ...
    } END_FOR_EACH_PTR_REVERSE(bb);

The key insight here is that we need to keep the list->list[] array
unchanged, and list->nr unchanged as well, for the reverse ptr loop to
work.

For the delete case, setting bb->ep = NULL to mark the bb dead works.
It does not touch list->list[] nor list->nr. As long as you skip any
bb whose bb->ep == NULL, you are fine.

Now look at the insert case, regardless of how you set bb->ep: when
you insert the new bb into the list, the node split will modify
list->list[] and list->nr. There is no way to defer that, and it
*will* mess up the outer loop's __nr slot pointer.

How does your proposed method handle this case? Please explain. I am
listening.

>> http://marc.info/?l=linux-sparse&m=149969288517115&w=3
>> =============quote===============
>>> I earlier suggested to instead change the way elements are 'deleted'
>>> (like instructions are already never removed from their list but
>>> just marked as deleted).
>>
>> That looks simple but it has hidden complications as well. The issue is
>> that we need to find out which lists need this kind of special
>> treatment.
>
> It's very simple: *all* lists need this 'treatment'.

I mean that you need to identify which caller is the outer loop; the
outer-loop caller needs to do the list packing. That is what I mean by
special treatment. Packing inside the inner loop is the problem we are
trying to avoid.

That assumes we still pack lists. If we never pack lists, then it is
not a problem here per se, but we still need to deal with the insert
problem.

> Walking twice?

I am more worried about the incorrect behavior on insert and packing
than about walking twice.

> In general, we can't avoid nested loops.

I am thinking of breaking the recursive calling into a work queue.
Then there is no nested loop. If you know why that will not work,
please point it out. I do assume I can convert the recursive calls to
non-recursive ones using a work queue. Again, I haven't dug very deep
yet.

> Look at what is done currently with instructions: when we want to
> 'delete' an instruction we simply set its ->bb to NULL, which basically
> means: "for now, please just ignore this instruction". In other words,
> instructions are *never* removed from their lists, they are simply
> marked as being deleted. You don't need to know whether you're in a
> nested loop or not. You will never have the problem of deletion during
> nested list walking, because elements are never removed from the lists.

Right, but that does not help the insert case.

> Why do you think it has been done that way?
> Do you think it's bad, and that it would be better to effectively remove
> instructions from their lists like the other types?
> Do you think it would be good, or even possible, to avoid having nested
> loops of instructions?
> I don't think so.

Please, I *do* understand why it is done that way. I am trying to point
out that it does not solve all the problems if an inner function wants
to modify the list, a.k.a. insert. That seems to be the point that
hasn't come across to you.

If you know how to solve the insert case, please explain.

Chris
On Wed, Aug 02, 2017 at 08:50:14PM -0400, Christopher Li wrote:
> On Wed, Aug 2, 2017 at 7:44 PM, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> >> http://marc.info/?l=linux-sparse&m=149987902920449&w=3
> >> ==========quote================
> >> Even Luc's suggestion for "mark and sweep" to delete the ptrlist
> >> is not going to help with the nested loop split. I will add that check.
> >> I don't expect new offenders but let's make sure about it.
> >> ==========end quote=============
> >
> > I don't think you understood what I meant.
>
> I don't think so. Can you please explain?
>
> Let's take an example of the insert case.

Yes, yes.
I already said on the 7th of July that reverse list walking and list
splitting will be another beast, but the subject of this patch is the
*deletion* of elements from lists.

> > It's very simple: *all* lists need this 'treatment'.
>
> I mean that you need to identify which caller is the outer loop; the
> outer-loop caller needs to do the list packing. That is what I mean by
> special treatment. Packing inside the inner loop is the problem we are
> trying to avoid.
>
> That assumes we still pack lists. If we never pack lists, then it is
> not a problem here per se.

We don't have to pack all lists.
The ones that we pack can be packed at some specific place/moment.

> > Walking twice?
>
> I am more worried about the incorrect behavior on insert and packing
> than about walking twice.
>
> > In general, we can't avoid nested loops.
>
> I am thinking of breaking the recursive calling into a work queue.
> Then there is no nested loop. If you know why that will not work,
> please point it out. I do assume I can convert the recursive calls to
> non-recursive ones using a work queue. Again, I haven't dug very deep
> yet.

Oh, I'm sure that using a work queue will work in some cases.
I'm even sure that it's a good idea for some cases. But I don't
think it will solve all problems because:
- some nested loops will remain;
- it will be hard to avoid nested loops because we generally don't
  know whether we're nested or not. The main ep - bbs - insns loop
  is obvious, but what about the other loops?
- using a work list is not without problems either. For example,
  we'll have little or no control over when things effectively get
  done. Sometimes it will be fine, sometimes it won't be acceptable.
  If during some operation we must delete *this* element or add a new
  element *here in the list*, just pushing something onto a work list
  and saying "OK, it will be done some time later" won't solve the
  problem and won't keep things coherent, ...

> > Look at what is done currently with instructions: when we want to
> > 'delete' an instruction we simply set its ->bb to NULL, which basically
> > means: "for now, please just ignore this instruction". In other words,
> > instructions are *never* removed from their lists, they are simply
> > marked as being deleted. You don't need to know whether you're in a
> > nested loop or not. You will never have the problem of deletion during
> > nested list walking, because elements are never removed from the lists.
>
> Right, but that does not help the insert case.

Sure, it's another problem.

> > Why do you think it has been done that way?
> > Do you think it's bad, and that it would be better to effectively remove
> > instructions from their lists like the other types?
> > Do you think it would be good, or even possible, to avoid having nested
> > loops of instructions?
> > I don't think so.
>
> Please, I *do* understand why it is done that way. I am trying to point
> out that it does not solve all the problems if an inner function wants
> to modify the list, a.k.a. insert. That seems to be the point that
> hasn't come across to you.
>
> If you know how to solve the insert case, please explain.

Well, it's not as if saying "let's do everything with work queues"
solves anything in itself. It's an idea that would need a lot of
patches and big changes in a lot of code to show whether it's up to its
promise (and, as I explained shortly here above, I have reasons to
believe that it won't be possible to solve all the list problems with
it).

For the reverse walking and insert, I haven't thought much about it
yet. I won't be surprised if it needs some change in the ptrlist
struct & macros. Maybe it's an operation that is not common enough
that trying to ban it wouldn't be a big constraint on existing & future
code. If all the ones we have go away once everything has been
converted to using work queues, then fine, but what if they don't?
What if we can't convert everything to using work queues?

On the other hand, avoiding the problem of list deletion (which seems
to be an operation much more common than reverse & insert) by marking
deleted elements is something that can be done easily, with minimal
code impact, and that *will* solve *all* problems with list deletion
(but yes, only list deletion).

-- Luc
On Thu, Aug 3, 2017 at 6:18 AM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> On Wed, Aug 02, 2017 at 08:50:14PM -0400, Christopher Li wrote:
>> I don't think so. Can you please explain?
>>
>> Let's take an example of the insert case.
>
> Yes, yes.
> I already said on the 7th of July that reverse list walking and list
> splitting will be another beast, but the subject of this patch is the
> *deletion* of elements from lists.

The text you quote from me is clearly discussing the node split here:
"nested loop *split*". Even though the title of the email was
originally about delete, the discussion expanded to split because I
hadn't realized earlier that it was a problem. I consider it the same
category as delete: it is basically looping over and modifying the
list at the same time. I want to find a solution that covers both
situations.

> We don't have to pack all lists.
> The ones that we pack can be packed at some specific place/moment.

Right, we need to make sure the packing is *never* called from the body
of a loop over the same list. There is no easy way to verify that this
does not happen.

>> I am thinking of breaking the recursive calling into a work queue.
>> Then there is no nested loop. If you know why that will not work,
>> please point it out. I do assume I can convert the recursive calls to
>> non-recursive ones using a work queue. Again, I haven't dug very deep
>> yet.
>
> Oh, I'm sure that using a work queue will work in some cases.
> I'm even sure that it's a good idea for some cases. But I don't
> think it will solve all problems because:
> - some nested loops will remain

Can you give an example of which nested loops would have to remain?

Converting a recursive call into a work queue without recursion is a
pretty standard algorithm. In theory, as long as you defer the
inner-loop call into the work queue, you can always avoid the nested
loop. In the extreme case you can think of the work queue as holding a
copy of the list and moving the inner-loop body out to the outer work
queue.

In reality, we don't want to move everything into the work queue. I
think the loop should just observe; deferring the modifications into
the work queue is conceptually relatively clean.

> - it will be hard to avoid nested loops because we generally don't
>   know whether we're nested or not. The main ep - bbs - insns loop

We don't mind nested looping as long as we don't modify the lists. We
always know when we are about to modify the bb or the insns; that is a
good place to move things to the work queue.

My mental model is to have a detect phase and a modify phase. The
detect phase does all the looping as needed: look, but don't touch.
The modify requests are pushed onto the work queue and executed after
the loop is done.

>   is obvious, but what about the other loops?
> - using a work list is not without problems either. For example,
>   we'll have little or no control over when things effectively get
>   done. Sometimes it will be fine, sometimes it won't be acceptable.
>   If during some operation we must delete *this* element or add a new
>   element *here in the list*, just pushing something onto a work list
>   and saying "OK, it will be done some time later" won't solve the
>   problem and won't keep things coherent, ...

That usually means you have some other things that need to be deferred
as well. You don't have to do it on the spot inside the loop. Having a
work queue actually makes it easier to reason about and enforce such
ordering.

You can consider our current CSE loop as a poor form of work queue: it
just has one flag bit for each stage and does not point to which insn
needs to be changed.

You can also think of it this way: the delete mark-and-sweep is just
another way to defer the delete. In the extreme case, we defer it to
never deleting.

>> If you know how to solve the insert case, please explain.
>
> Well, it's not as if saying "let's do everything with work queues"
> solves anything in itself. It's an idea that would need a lot of
> patches and big changes in a lot of code to show whether it's up to
> its promise (and, as I explained shortly here above, I have reasons
> to believe that it won't be possible to solve all the list problems
> with it).

Right. It needs a lot of discussion and patch experiments, definitely
after the release.

As I said in the other email, if you have a better patch to solve the
nested-loop delete on the pseudo list for RC5, feel free to replace the
current one in sparse. I am aware that the current solution is not
perfect: it can iterate over deleted ptr entries.

> For the reverse walking and insert, I haven't thought much about it
> yet. I won't be surprised if it needs some change in the ptrlist
> struct & macros.
> Maybe it's an operation that is not common enough that trying to ban
> it wouldn't be a big constraint on existing & future code.

I think that is letting the tail wag the dog. Plus, I can demonstrate
that a node split inside FOR_EACH_PTR() can have incorrect behavior as
well: basically, the node split will change list->list[] and list->nr,
which can impact forward iteration too. The reverse one is just more
obvious.

> If all the ones we have go away once everything has been converted
> to using work queues, then fine, but what if they don't? What if we
> can't convert everything to using work queues?

We will have to use a bigger hammer. In short, it is going to be ugly,
e.g. give the list node a refcount and a singly-linked list of pending
changes.

> On the other hand, avoiding the problem of list deletion (which seems
> to be an operation much more common than reverse & insert) by marking
> deleted elements is something that can be done easily, with minimal
> code impact, and that *will* solve *all* problems with list deletion
> (but yes, only list deletion).

Yes, I consider that a short-term thing until we find a better way to
handle more general modifications, including insert.

Chris
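A minimal sketch of the "detect, then modify" model Chris describes: the walk only records what it wants to change, and the queued requests are applied after the iteration, so no list is modified while it is being walked. The types and helpers here are invented for illustration, not sparse API:

    #include <stdlib.h>

    struct instruction;

    /* One deferred modification request (invented for illustration). */
    struct work_item {
            struct work_item *next;
            void (*apply)(struct instruction *insn);
            struct instruction *insn;
    };

    static struct work_item *work_queue;

    static void defer(void (*apply)(struct instruction *), struct instruction *insn)
    {
            struct work_item *w = malloc(sizeof(*w));
            w->apply = apply;
            w->insn = insn;
            w->next = work_queue;
            work_queue = w;
    }

    /* Detect phase: walk the list, look but don't touch. */
    static void detect(struct instruction **list, int nr,
                       int (*wants_change)(struct instruction *),
                       void (*apply)(struct instruction *))
    {
            for (int i = 0; i < nr; i++) {
                    if (wants_change(list[i]))
                            defer(apply, list[i]);  /* record only, no list change */
            }
    }

    /* Modify phase: run after all walks are done, so no iterator is live. */
    static void run_work_queue(void)
    {
            while (work_queue) {
                    struct work_item *w = work_queue;
                    work_queue = w->next;
                    w->apply(w->insn);      /* now it is safe to modify lists */
                    free(w);
            }
    }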
On Thu, Aug 03, 2017 at 07:48:59PM -0400, Christopher Li wrote:
> On Thu, Aug 3, 2017 at 6:18 AM, Luc Van Oostenryck wrote:
> > We don't have to pack all lists.
> > The ones that we pack can be packed at some specific place/moment.
>
> Right, we need to make sure the packing is *never* called from the body
> of a loop over the same list. There is no easy way to verify that this
> does not happen.
>
> >> I am thinking of breaking the recursive calling into a work queue.
> >> Then there is no nested loop. If you know why that will not work,
> >> please point it out. I do assume I can convert the recursive calls to
> >> non-recursive ones using a work queue. Again, I haven't dug very deep
> >> yet.
> >
> > Oh, I'm sure that using a work queue will work in some cases.
> > I'm even sure that it's a good idea for some cases. But I don't
> > think it will solve all problems because:
> > - some nested loops will remain
>
> Can you give an example of which nested loops would have to remain?

Can you give the assurance that none will remain?
I said "I think that ..." and of course I have reasons to think so.
They were explained here under.

> Converting a recursive call into a work queue without recursion is a
> pretty standard algorithm. In theory, as long as you defer the
> inner-loop call into the work queue, you can always avoid the nested
> loop. In the extreme case you can think of the work queue as holding a
> copy of the list and moving the inner-loop body out to the outer work
> queue.

*standard algorithm*, *in theory*, ...

> In reality, we don't want to move everything into the work queue. I
> think the loop should just observe; deferring the modifications into
> the work queue is conceptually relatively clean.
>
> > - it will be hard to avoid nested loops because we generally don't
> >   know whether we're nested or not. The main ep - bbs - insns loop
>
> We don't mind nested looping as long as we don't modify the lists. We
> always know when we are about to modify the bb or the insns; that is a
> good place to move things to the work queue.
>
> My mental model is to have a detect phase and a modify phase. The
> detect phase does all the looping as needed: look, but don't touch.
> The modify requests are pushed onto the work queue and executed after
> the loop is done.
>
> >   is obvious, but what about the other loops?
> > - using a work list is not without problems either. For example,
> >   we'll have little or no control over when things effectively get
> >   done. Sometimes it will be fine, sometimes it won't be acceptable.
> >   If during some operation we must delete *this* element or add a new
> >   element *here in the list*, just pushing something onto a work list
> >   and saying "OK, it will be done some time later" won't solve the
> >   problem and won't keep things coherent, ...
>
> That usually means you have some other things that need to be deferred
> as well. You don't have to do it on the spot inside the loop. Having a
> work queue actually makes it easier to reason about and enforce such
> ordering.

You get all the advantages of decoupling things, and you get all the
disadvantages of having decoupled things. It's not a black & white
thing.

> > Well, it's not as if saying "let's do everything with work queues"
> > solves anything in itself. It's an idea that would need a lot of
> > patches and big changes in a lot of code to show whether it's up to
> > its promise (and, as I explained shortly here above, I have reasons
> > to believe that it won't be possible to solve all the list problems
> > with it).
>
> Right. It needs a lot of discussion and patch experiments, definitely
> after the release.

I think you very seriously underestimate the amount of work that will
need to be done.

> As I said in the other email, if you have a better patch to solve the
> nested-loop delete on the pseudo list for RC5, feel free to replace the
> current one in sparse. I am aware that the current solution is not
> perfect: it can iterate over deleted ptr entries.

I just sent a patch.
It's not very pretty and is only lightly tested, but IMO it's vastly
superior to the duplication of the list.

> > For the reverse walking and insert, I haven't thought much about it
> > yet. I won't be surprised if it needs some change in the ptrlist
> > struct & macros.
> > Maybe it's an operation that is not common enough that trying to ban
> > it wouldn't be a big constraint on existing & future code.
>
> I think that is letting the tail wag the dog. Plus, I can demonstrate
> that a node split inside FOR_EACH_PTR() can have incorrect behavior as
> well.

Yes, yes, we have known that since the beginning.

> > If all the ones we have go away once everything has been converted
> > to using work queues, then fine, but what if they don't? What if we
> > can't convert everything to using work queues?
>
> We will have to use a bigger hammer. In short, it is going to be ugly,
> e.g. give the list node a refcount and a singly-linked list of pending
> changes.

Other solutions exist :)

> > On the other hand, avoiding the problem of list deletion (which seems
> > to be an operation much more common than reverse & insert) by marking
> > deleted elements is something that can be done easily, with minimal
> > code impact, and that *will* solve *all* problems with list deletion
> > (but yes, only list deletion).
>
> Yes, I consider that a short-term thing until we find a better way to
> handle more general modifications, including insert.

Good to hear that.

-- Luc
On Thu, Aug 3, 2017 at 8:41 PM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>>
>> Can you give an example of which nested loops would have to remain?
>
> Can you give the assurance that none will remain?

I can have the ptr-refcount patch check whether such a thing happens.
But I am sure that is not what you are asking :-)

If you follow the observe-and-modify model, put the modify stage in the
outer loop, and modify only one thing at a time, then it is easy to
reason about whether a "nested loop modify" exists.

What you are asking in the more general setting is not going to have a
trivial answer anyway; it is equivalent to the Turing halting problem.
On the other hand, if you want to disprove this, you only need one
counterexample.

>> Converting a recursive call into a work queue without recursion is a
>> pretty standard algorithm. In theory, as long as you defer the
>> inner-loop call into the work queue, you can always avoid the nested
>> loop. In the extreme case you can think of the work queue as holding a
>> copy of the list and moving the inner-loop body out to the outer work
>> queue.
>
> *standard algorithm*, *in theory*, ...

I actually don't worry about whether it can be done or not. I am sure
you don't need me to google converting recursion to iteration for you.
The question is whether it can be done cleanly in sparse. It is hard to
say without trying.

> You get all the advantages of decoupling things, and you get all the
> disadvantages of having decoupled things. It's not a black & white
> thing.

It is a white-gray-black thing, if you know what I mean :-)
Black is the termination condition.

>> > Well, it's not as if saying "let's do everything with work queues"
>> > solves anything in itself. It's an idea that would need a lot of
>> > patches and big changes in a lot of code to show whether it's up to
>> > its promise (and, as I explained shortly here above, I have reasons
>> > to believe that it won't be possible to solve all the list problems
>> > with it).
>>
>> Right. It needs a lot of discussion and patch experiments, definitely
>> after the release.
>
> I think you very seriously underestimate the amount of work that will
> need to be done.

I realize there will be a lot of change. But sparse will actually need
an SSA version of dead code removal (for correctness) and an SSA
version of constant propagation anyway (for performance; currently we
scan the whole insn list to find optimization opportunities). A lot of
similar algorithms already use a work queue. Some rewrites are
expected, as far as I can tell.

> I just sent a patch.
> It's not very pretty and is only lightly tested, but IMO it's vastly
> superior to the duplication of the list.

I haven't seen it. I saw there is a constant-expression patch on the
mailing list.

>> I think that is letting the tail wag the dog. Plus, I can demonstrate
>> that a node split inside FOR_EACH_PTR() can have incorrect behavior as
>> well.
>
> Yes, yes, we have known that since the beginning.

Then why ban the reverse ptr loop? It is not going to help avoid the
bad situation.

>> We will have to use a bigger hammer. In short, it is going to be ugly,
>> e.g. give the list node a refcount and a singly-linked list of pending
>> changes.
>
> Other solutions exist :)

Yes, ideas and patches welcome.

Chris
On Thu, Aug 03, 2017 at 10:22:34PM -0400, Christopher Li wrote:
> On Thu, Aug 3, 2017 at 8:41 PM, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> >>
> >> Can you give an example of which nested loops would have to remain?
> >
> > Can you give the assurance that none will remain?
>
> I can have the ptr-refcount patch check whether such a thing happens.
> But I am sure that is not what you are asking :-)

It was a rhetorical question in response to your own question, because
asking me to show something while we're talking about design and
potentiality doesn't help at all.

> >> Converting a recursive call into a work queue without recursion is a
> >> pretty standard algorithm. In theory, as long as you defer the
> >> inner-loop call into the work queue, you can always avoid the nested
> >> loop. In the extreme case you can think of the work queue as holding
> >> a copy of the list and moving the inner-loop body out to the outer
> >> work queue.
> >
> > *standard algorithm*, *in theory*, ...
>
> I actually don't worry about whether it can be done or not. I am sure
> you don't need me to google converting recursion to iteration for you.

1) Converting a recursive algorithm to an iterative one is not always
   a good solution; for example, the iterative one can be much more
   complex/less elegant. You may also need more memory (because you
   often need to store additional info).
   A typical example is searching in a tree: it can be done with
   iteration but it's generally not done that way, because the
   recursive version is so much simpler & more elegant (and probably
   efficient too).
2) Converting something to a work queue is not the same as converting
   recursion to iteration anyway.
   Recursion vs. iteration is about how you walk through a data
   structure. Using a work queue is about how/when you do an operation
   on an object.

An example, among others, of the complications you can have when
converting to a work queue is something like this:
imagine you're iterating through a list of instructions, checking/
searching for something. When found, you want to do an operation on
the instruction just found.
Sounds like something for a work queue, right? Just put the object you
found on the work queue and do the operation later.
Now, what if the operation is about inserting another instruction just
before the one just found (like is done when adding death notes)?
- if you do it immediately, you have the context of the instruction
  and inserting the death note is easy/cheap;
- if you do the insertion later, when processing the work queue, you
  won't have the context of the instruction anymore, and to insert the
  death note you will first have to search for it again (to get the
  context back) before you are able to do the insertion, and this
  search will cost you.

It's just one example, but there is a whole class of similar
situations. Delaying things by moving them to a work queue will also
have a negative impact on the cache (but in a way, that's also a
problem of having lost the context).

And again, I'm not saying that using a work queue is never a good
solution; I'm saying that most probably it won't always be a good
solution.

-- Luc
On Fri, Aug 4, 2017 at 6:38 AM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> It was a rhetorical question in response to your own question, because
> asking me to show something while we're talking about design and
> potentiality doesn't help at all.

That is exactly the feeling I am getting from you. You said something
might not work; I want to find out exactly what you have in mind that
might not work. Otherwise the conversation is based on doubt and is not
very useful. I want details.

This goes both ways: you can also ask me to clarify more details about
certain things in my mental model.

> 1) Converting a recursive algorithm to an iterative one is not always
>    a good solution; for example, the iterative one can be much more
>    complex/less elegant. You may also need more memory (because you
>    often need to store additional info).
>    A typical example is searching in a tree: it can be done with
>    iteration but it's generally not done that way, because the
>    recursive version is so much simpler & more elegant (and probably
>    efficient too).

In my observe-and-modify model, the traversal of a tree can be
recursive, even on the same ptrlist. It is the modification that gets
deferred. I think I already mentioned observe-and-modify in a previous
email, so what you are describing above is not a good example against
it.

> 2) Converting something to a work queue is not the same as converting
>    recursion to iteration anyway.
>    Recursion vs. iteration is about how you walk through a data
>    structure. Using a work queue is about how/when you do an operation
>    on an object.

You walk the graph to do something; that is a given. Nobody walks the
graph for the sake of just walking the graph. The optimizer can be
modeled as an algorithm that walks the graph and makes modifications to
the graph. Is that a fair statement?

> An example, among others, of the complications you can have when
> converting to a work queue is something like this:
> imagine you're iterating through a list of instructions, checking/
> searching for something. When found, you want to do an operation on
> the instruction just found.

Now we are talking.

> Sounds like something for a work queue, right? Just put the object you
> found on the work queue and do the operation later.
> Now, what if the operation is about inserting another instruction just
> before the one just found (like is done when adding death notes)?
> - if you do it immediately, you have the context of the instruction
>   and inserting the death note is easy/cheap;

Here you make the assumption that you can't save the context of the
instruction in the work queue. That is not necessarily true. In my
mental model, the work queue will contain the context of what needs to
be done, e.g. the pointer to the instruction, plus other context if
needed. I want to know more about what context you can't save.

BTW, most of the context can be found on the graph and does not need to
be saved in the work queue.

Take simplify_branch_branch() for example: when you know the
instruction should be simplified, there is very little information that
needs to be saved in the work queue. Basically we can start with the
call arguments to rewrite_branch() and go from there.

> - if you do the insertion later, when processing the work queue, you
>   won't have the context of the instruction anymore, and to insert the
>   death note you will first have to search for it again (to get the
>   context back) before you are able to do the insertion, and this
>   search will cost you.

Again, I don't have a clear picture of what context you are talking
about; please clarify. If it is the instruction before this
instruction, I can't imagine why I would need it. If I really need it,
I can still cache it or search for it in the bb->insns list.

BTW, my impression is that the "constant propagation using SSA"
algorithm is exactly the kind of example you are describing here. In
constant propagation, you first find an instruction that uses a
constant, then you find out whether that instruction can be simplified
to a constant. The next thing is to put the instructions that use the
result of this simplified instruction into the queue, to see if they
can be simplified as well.

It is actually a lot better than our current approach of rescanning the
whole instruction list just to find out that, oh, we now have some new
instructions with constant input (due to the last round of constant
propagation).

> It's just one example, but there is a whole class of similar
> situations. Delaying things by moving them to a work queue will also
> have a negative impact on the cache (but in a way, that's also a
> problem of having lost the context).

Caching, maybe. But I don't think caching is a strong point against it.
I am just a systems guy playing a catch-up game on compiler design.
The more I read, the more I find that using a work queue is a very
common thing among those classic optimization algorithms.

> And again, I'm not saying that using a work queue is never a good
> solution; I'm saying that most probably it won't always be a good
> solution.

My reading of that is: "I have doubts, but I can't articulate precisely
why that is the case." Fill in more detail and we can have more
discussion. Granted, no algorithm is *always* a good solution to all
problems.

As I expressed in my previous email, I have no doubt this can be done;
the question is whether it can be done *cleanly* enough for sparse.
When we introduce "dead code elimination using SSA", a work-queue-like
framework will be needed anyway.

Chris
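A rough sketch of the worklist-style constant propagation Chris describes, using invented, simplified types rather than sparse's real IR; try_fold_to_constant() is a hypothetical helper assumed to exist for this sketch. When an instruction folds to a constant, only the users of its result are queued for re-examination, instead of rescanning the whole instruction list:

    #include <stdlib.h>

    /* Invented, simplified IR: each instruction knows the instructions
     * that use its result (its "users"). */
    struct instruction {
            int opcode;
            int is_constant;                /* already folded to a constant? */
            long value;
            struct instruction **users;     /* users of this result */
            int nr_users;
    };

    /* Hypothetical helper: returns 1 if insn can be folded to a constant
     * given its (now constant) operands. */
    extern int try_fold_to_constant(struct instruction *insn);

    struct worklist { struct instruction **item; int nr, max; };

    static void push(struct worklist *wl, struct instruction *insn)
    {
            if (wl->nr == wl->max) {
                    wl->max = wl->max ? 2 * wl->max : 16;
                    wl->item = realloc(wl->item, wl->max * sizeof(*wl->item));
            }
            wl->item[wl->nr++] = insn;
    }

    static void propagate_constants(struct instruction **all, int nr)
    {
            struct worklist wl = { NULL, 0, 0 };

            /* seed the worklist with the users of every constant result */
            for (int i = 0; i < nr; i++)
                    if (all[i]->is_constant)
                            for (int u = 0; u < all[i]->nr_users; u++)
                                    push(&wl, all[i]->users[u]);

            while (wl.nr) {
                    struct instruction *insn = wl.item[--wl.nr];

                    if (insn->is_constant)
                            continue;               /* already handled */
                    if (!try_fold_to_constant(insn))
                            continue;               /* not foldable (yet) */
                    insn->is_constant = 1;
                    /* this result just became constant: only its users can
                     * possibly be simplified next, so queue just those */
                    for (int u = 0; u < insn->nr_users; u++)
                            push(&wl, insn->users[u]);
            }
            free(wl.item);
    }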
On Fri, Aug 04, 2017 at 10:48:35AM -0400, Christopher Li wrote:
> On Fri, Aug 4, 2017 at 6:38 AM, Luc Van Oostenryck wrote:
> > It was a rhetorical question in response to your own question, because
> > asking me to show something while we're talking about design and
> > potentiality doesn't help at all.
>
> That is exactly the feeling I am getting from you. You said something
> might not work; I want to find out exactly what you have in mind that
> might not work. Otherwise the conversation is based on doubt and is not
> very useful.

Replace 'doubt' by 'warning' and it already gives it another sense.
My whole message has always been the same: "Warning, it's not so easy
or obvious, and I have reasons to think that it won't always be
possible or desirable (so it may be good to also think about
alternatives, or to somehow accept the idea that maybe not all cases
will be solved the same way)".

> I want details. This goes both ways: you can also ask me to clarify
> more details about certain things in my mental model.

You can ask for details about an existing implementation, or about some
idea that is already well elaborated. You can't ask for details about
something that by nature doesn't have details yet, but I have already
tried to give you all sorts of hints that would give you some matter to
think about.

> > 1) Converting a recursive algorithm to an iterative one is not always
> >    a good solution; for example, the iterative one can be much more
> >    complex/less elegant. You may also need more memory (because you
> >    often need to store additional info).
> >    A typical example is searching in a tree: it can be done with
> >    iteration but it's generally not done that way, because the
> >    recursive version is so much simpler & more elegant (and probably
> >    efficient too).
>
> In my observe-and-modify model, the traversal of a tree can be
> recursive, even on the same ptrlist.

When you're searching in a (binary) tree, at some point you will need
to backtrack, to return to some previous branching point. The usual way
to do that is to use recursion, and the branching point is implicitly
stored on the stack. You can do the same without recursion, without
using the call stack, but then you need to store the info about the
branching point yourself. It's an example showing that this sort of
change is not for free.

> > 2) Converting something to a work queue is not the same as converting
> >    recursion to iteration anyway.
> >    Recursion vs. iteration is about how you walk through a data
> >    structure. Using a work queue is about how/when you do an operation
> >    on an object.
>
> You walk the graph to do something; that is a given. Nobody walks the
> graph for the sake of just walking the graph. The optimizer can be
> modeled as an algorithm that walks the graph and makes modifications to
> the graph. Is that a fair statement?

Sorry, I don't understand what you mean.

> > An example, among others, of the complications you can have when
> > converting to a work queue is something like this:
> > imagine you're iterating through a list of instructions, checking/
> > searching for something. When found, you want to do an operation on
> > the instruction just found.
>
> Now we are talking.
>
> > Sounds like something for a work queue, right? Just put the object you
> > found on the work queue and do the operation later.
> > Now, what if the operation is about inserting another instruction just
> > before the one just found (like is done when adding death notes)?
> > - if you do it immediately, you have the context of the instruction
> >   and inserting the death note is easy/cheap;
>
> Here you make the assumption that you can't save the context of the
> instruction in the work queue.

No, I don't make such an assumption; it's an example showing that not
doing things right there may have a cost. Saving the context is such a
cost.

> That is not necessarily true. In my mental model, the work queue will
> contain the context of what needs to be done, e.g. the pointer to the
> instruction, plus other context if needed. I want to know more about
> what context you can't save.

I never said you can't save the context. On the contrary, I've said
that you'll most probably have to store some context to be able to do
things in a deferred way, and that this will have a cost (and I use the
word 'context' here in a very general, abstract meaning).

> BTW, most of the context can be found on the graph and does not need to
> be saved in the work queue.
>
> Take simplify_branch_branch() for example: when you know the
> instruction should be simplified, there is very little information that
> needs to be saved in the work queue.

Absolutely, in this example, yes. In some other examples, it is less
so.

> > - if you do the insertion later, when processing the work queue, you
> >   won't have the context of the instruction anymore, and to insert the
> >   death note you will first have to search for it again (to get the
> >   context back) before you are able to do the insertion, and this
> >   search will cost you.
>
> Again, I don't have a clear picture of what context you are talking
> about.

You have an instruction and you need to insert another instruction in
front of it. If you do that while you were searching for this
instruction, it's very easy because you have the head-list-nr trio
implicitly available via the ptrlist macros. This trio is the context
you need in order to efficiently insert an instruction in front of it.

Now, for some reason, you want to defer this insertion. What will you
put in the work queue?
- the pointer to the instruction? Then to do the insert you will need
  to search the list again until you find the instruction. At this
  point you will have back the context needed to do the insertion, but
  you had to do a search.
- a pointer to where the instruction is stored in the list? Then you
  can easily exchange this instruction with another one, but how will
  this be useful for the insertion? Well sure, with the pointer you can
  get back the original head-list-nr, but is this useful here? If you
  store an instruction pointer-pointer somewhere, in practice that
  means that the list containing it becomes immutable (presumably you
  also hold pointers to the neighbouring instructions). The only thing
  that can work at this point is a normal linked list, not the
  block-based ptrlist. Is that a problem? Maybe, maybe not much.

> Please clarify. If it is the instruction before this instruction, I
> can't imagine why I would need it. If I really need it, I can still
> cache it or search for it in the bb->insns list.
>
> BTW, my impression is that the "constant propagation using SSA"
> algorithm is exactly the kind of example you are describing here. In
> constant propagation, you first find an instruction that uses a
> constant, then you find out whether that instruction can be simplified
> to a constant. The next thing is to put the instructions that use the
> result of this simplified instruction into the queue, to see if they
> can be simplified as well.

No, because in this case each instruction is still independent; you
don't need to know anything about the surrounding instructions. So it's
a case where it should be easy to convert. The example of the deathnote
insertion is not like this.

> > It's just one example, but there is a whole class of similar
> > situations. Delaying things by moving them to a work queue will also
> > have a negative impact on the cache (but in a way, that's also a
> > problem of having lost the context).
>
> Caching, maybe. But I don't think caching is a strong point against it.
> I am just a systems guy playing a catch-up game on compiler design.
> The more I read, the more I find that using a work queue is a very
> common thing among those classic optimization algorithms.
>
> > And again, I'm not saying that using a work queue is never a good
> > solution; I'm saying that most probably it won't always be a good
> > solution.
>
> My reading of that is: "I have doubts, but I can't articulate precisely
> why that is the case." Fill in more detail and we can have more
> discussion.

If by "I have doubts" you understand "I have doubts about using it
systematically to avoid problems with nested lists", you would be quite
close. If you mean instead "I have doubts about using it preferentially
(as a clean and effective solution) to avoid problems with nested
lists", you would be very wrong.

> Granted, no algorithm is *always* a good solution to all problems.
> As I expressed in my previous email, I have no doubt this can be done.

Do you have no doubts that it can be done everywhere in sparse to avoid
problems with nested lists, or no doubts that it can be done at least
in some or most cases?

As I have already said since the beginning, I have doubts about doing
it everywhere, and I have already explained why (or at least tried to).

-- Luc
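To illustrate the trade-off Luc describes, here is a small invented sketch (not sparse code, and using a plain array rather than the block-based ptrlist): a deferred "insert before this instruction" work item records only the instruction pointer, so applying it later has to re-scan the containing block's list to recover the insertion point that was freely available during the original walk.

    /* Invented, simplified model: a basic block owning a plain array of
     * instruction pointers. */
    struct instruction;
    struct basic_block {
            struct instruction **insns;
            int nr;
    };

    /* Deferred request: "insert 'new_insn' just before 'before' in 'bb'". */
    struct insert_request {
            struct basic_block *bb;
            struct instruction *before;
            struct instruction *new_insn;
    };

    /* Applying the request later: the position is no longer known, so we
     * must search for 'before' again -- this re-scan is the cost Luc is
     * pointing at. Returns 0 if 'before' was not found (e.g. deleted in
     * the meantime, another coherency issue with deferral). */
    static int apply_insert(struct insert_request *req)
    {
            struct basic_block *bb = req->bb;

            for (int i = 0; i < bb->nr; i++) {
                    if (bb->insns[i] != req->before)
                            continue;
                    /* shift the tail up by one; this toy model assumes
                     * spare capacity is available */
                    for (int j = bb->nr; j > i; j--)
                            bb->insns[j] = bb->insns[j - 1];
                    bb->insns[i] = req->new_insn;
                    bb->nr++;
                    return 1;
            }
            return 0;
    }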
diff --git a/flow.c b/flow.c
index fce8bde..bfe54f7 100644
--- a/flow.c
+++ b/flow.c
@@ -729,12 +729,21 @@ static void simplify_one_symbol(struct entrypoint *ep, struct symbol *sym)
 multi_def:
 complex_def:
 external_visibility:
-        all = 1;
-        FOR_EACH_PTR_REVERSE(pseudo->users, pu) {
-                struct instruction *insn = pu->insn;
-                if (insn->opcode == OP_LOAD)
-                        all &= find_dominating_stores(pseudo, insn, ++bb_generation, !mod);
-        } END_FOR_EACH_PTR_REVERSE(pu);
+        {
+                /*
+                 * find_dominating_stores() will modify the pseudo->users list.
+                 * Make a duplicate copy before using it.
+                 */
+                struct pseudo_user_list *users = NULL;
+                all = 1;
+                concat_user_list(pseudo->users, &users);
+                FOR_EACH_PTR_REVERSE(users, pu) {
+                        struct instruction *insn = pu->insn;
+                        if (insn->opcode == OP_LOAD)
+                                all &= find_dominating_stores(pseudo, insn, ++bb_generation, !mod);
+                } END_FOR_EACH_PTR_REVERSE(pu);
+                free_ptr_list(&users);
+        }
         /* If we converted all the loads, remove the stores. They are dead */