| Message ID | 20181002200710.15721-1-jacob.e.keller@intel.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | [v3] coccicheck: process every source file at once |
On Tue, Oct 2, 2018 at 1:07 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
>
> From: Jacob Keller <jacob.keller@gmail.com>
>
> make coccicheck is used in order to apply coccinelle semantic patches,
> and see if any of the transformations found within contrib/coccinelle/
> can be applied to the current code base.
>
> Pass every file to a single invocation of spatch, instead of running
> spatch once per source file.
>
> This reduces the time required to run make coccicheck by a significant
> amount of time:
>
> Prior timing of make coccicheck
>     real    6m14.090s
>     user    25m2.606s
>     sys     1m22.919s
>
> New timing of make coccicheck
>     real    1m36.580s
>     user    7m55.933s
>     sys     0m18.219s
>
> This is nearly a 4x decrease in the time required to run make
> coccicheck. This is due to the overhead of restarting spatch for every
> file. By processing all files at once, we can amortize this startup cost
> across the total number of files, rather than paying it once per file.
>
> Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
> ---

Forgot to add what changed. I dropped the subshell and "||" bit around
invoking spatch.

Thanks,
Jake

>  Makefile | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index df1df9db78da..da692ece9e12 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2715,10 +2715,8 @@ endif
>  %.cocci.patch: %.cocci $(COCCI_SOURCES)
>  	@echo '    ' SPATCH $<; \
>  	ret=0; \
> -	for f in $(COCCI_SOURCES); do \
> -		$(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
> -			{ ret=$$?; break; }; \
> -	done >$@+ 2>$@.log; \
> +	$(SPATCH) --sp-file $< $(COCCI_SOURCES) $(SPATCH_FLAGS) >$@+ 2>$@.log; \
> +	ret=$$?; \
>  	if test $$ret != 0; \
>  	then \
>  		cat $@.log; \
> --
> 2.18.0.219.gaf81d287a9da
>
On Tue, Oct 2, 2018 at 1:18 PM Jacob Keller <jacob.keller@gmail.com> wrote:
>
> On Tue, Oct 2, 2018 at 1:07 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
> >
> > From: Jacob Keller <jacob.keller@gmail.com>
> >
> > make coccicheck is used in order to apply coccinelle semantic patches,
> > and see if any of the transformations found within contrib/coccinelle/
> > can be applied to the current code base.
> >
> > Pass every file to a single invocation of spatch, instead of running
> > spatch once per source file.
> >
> > This reduces the time required to run make coccicheck by a significant
> > amount of time:
> >
> > Prior timing of make coccicheck
> >     real    6m14.090s
> >     user    25m2.606s
> >     sys     1m22.919s
> >
> > New timing of make coccicheck
> >     real    1m36.580s
> >     user    7m55.933s
> >     sys     0m18.219s
> >
> > This is nearly a 4x decrease in the time required to run make
> > coccicheck. This is due to the overhead of restarting spatch for every
> > file. By processing all files at once, we can amortize this startup cost
> > across the total number of files, rather than paying it once per file.
> >
> > Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
> > ---
>
> Forgot to add what changed. I dropped the subshell and "||" bit around
> invoking spatch.
>
> Thanks,
> Jake
>

Junio, do you want me to update the commit message on my side with the
memory concerns? Or could you update it to mention memory as a noted
trade off.

Thanks,
Jake

> >  Makefile | 6 ++----
> >  1 file changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/Makefile b/Makefile
> > index df1df9db78da..da692ece9e12 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -2715,10 +2715,8 @@ endif
> >  %.cocci.patch: %.cocci $(COCCI_SOURCES)
> >  	@echo '    ' SPATCH $<; \
> >  	ret=0; \
> > -	for f in $(COCCI_SOURCES); do \
> > -		$(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
> > -			{ ret=$$?; break; }; \
> > -	done >$@+ 2>$@.log; \
> > +	$(SPATCH) --sp-file $< $(COCCI_SOURCES) $(SPATCH_FLAGS) >$@+ 2>$@.log; \
> > +	ret=$$?; \
> >  	if test $$ret != 0; \
> >  	then \
> >  		cat $@.log; \
> > --
> > 2.18.0.219.gaf81d287a9da
> >
On Thu, Oct 04, 2018 at 07:17:47PM -0700, Jacob Keller wrote:
> Junio, do you want me to update the commit message on my side with the
> memory concerns? Or could you update it to mention memory as a noted
> trade off.

We have been running 'make -j2 coccicheck' in the static analysis
build job on Travis CI, which worked just fine so far. The Travis CI
build environments have 3GB of memory available [1], but, as shown in
[2], with this patch the memory consumption jumps up to about
1.3-1.8GB for each of those jobs. So with two parallel jobs we will
very likely bump into this limit.

So this patch should definitely change that build script to run only a
single job.

1 - https://docs.travis-ci.com/user/common-build-problems/#my-build-script-is-killed-without-any-error
2 - https://public-inbox.org/git/20181003101658.GM23446@localhost/

> > >  Makefile | 6 ++----
> > >  1 file changed, 2 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/Makefile b/Makefile
> > > index df1df9db78da..da692ece9e12 100644
> > > --- a/Makefile
> > > +++ b/Makefile
> > > @@ -2715,10 +2715,8 @@ endif
> > >  %.cocci.patch: %.cocci $(COCCI_SOURCES)
> > >  	@echo '    ' SPATCH $<; \
> > >  	ret=0; \
> > > -	for f in $(COCCI_SOURCES); do \
> > > -		$(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
> > > -			{ ret=$$?; break; }; \
> > > -	done >$@+ 2>$@.log; \
> > > +	$(SPATCH) --sp-file $< $(COCCI_SOURCES) $(SPATCH_FLAGS) >$@+ 2>$@.log; \
> > > +	ret=$$?; \
> > >  	if test $$ret != 0; \
> > >  	then \
> > >  		cat $@.log; \
> > > --
> > > 2.18.0.219.gaf81d287a9da
> > >
On Fri, Oct 05, 2018 at 02:40:48PM +0200, SZEDER Gábor wrote:

> On Thu, Oct 04, 2018 at 07:17:47PM -0700, Jacob Keller wrote:
> > Junio, do you want me to update the commit message on my side with the
> > memory concerns? Or could you update it to mention memory as a noted
> > trade off.
>
> We have been running 'make -j2 coccicheck' in the static analysis
> build job on Travis CI, which worked just fine so far. The Travis CI
> build environments have 3GB of memory available [1], but, as shown in
> [2], with this patch the memory consumption jumps up to about
> 1.3-1.8GB for each of those jobs. So with two parallel jobs we will
> very likely bump into this limit.
>
> So this patch should definitely change that build script to run only a
> single job.

It should still be a net win, since the total CPU seems to drop by a
factor of 3-4.

Are we OK with saying 1.3-1.8GB is necessary to run coccicheck? That
doesn't feel like an exorbitant request for a developer-only tool these
days, but I have noticed some people on the list tend to have lousier
machines than I do. ;)

-Peff
> -----Original Message-----
> From: Jeff King [mailto:peff@peff.net]
> Sent: Friday, October 05, 2018 9:25 AM
> To: SZEDER Gábor <szeder.dev@gmail.com>
> Cc: Jacob Keller <jacob.keller@gmail.com>; Keller, Jacob E
> <jacob.e.keller@intel.com>; Git mailing list <git@vger.kernel.org>
> Subject: Re: [PATCH v3] coccicheck: process every source file at once
>
> On Fri, Oct 05, 2018 at 02:40:48PM +0200, SZEDER Gábor wrote:
>
> > On Thu, Oct 04, 2018 at 07:17:47PM -0700, Jacob Keller wrote:
> > > Junio, do you want me to update the commit message on my side with the
> > > memory concerns? Or could you update it to mention memory as a noted
> > > trade off.
> >
> > We have been running 'make -j2 coccicheck' in the static analysis
> > build job on Travis CI, which worked just fine so far. The Travis CI
> > build environments have 3GB of memory available [1], but, as shown in
> > [2], with this patch the memory consumption jumps up to about
> > 1.3-1.8GB for each of those jobs. So with two parallel jobs we will
> > very likely bump into this limit.
> >
> > So this patch should definitely change that build script to run only a
> > single job.
>
> It should still be a net win, since the total CPU seems to drop by a
> factor of 3-4.
>
> Are we OK with saying 1.3-1.8GB is necessary to run coccicheck? That
> doesn't feel like an exorbitant request for a developer-only tool these
> days, but I have noticed some people on the list tend to have lousier
> machines than I do. ;)
>
> -Peff

It's probably not worth trying to make this more complicated and scale
up how many files we do at once based on the amount of available memory
on the system...

Thanks,
Jake
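For illustration, the idea Jacob waves off above could be sketched in a few lines of shell — a rough fragment assuming Linux's /proc/meminfo, with a made-up per-file memory cost rather than anything measured in this thread:

```sh
# Sketch only: derive a spatch batch size from available memory.
# The ~4MB-per-file figure is an illustrative guess, not a measurement.
mem_kb=$(awk '/^MemAvailable:/ { print $2 }' /proc/meminfo)
batch=$(( mem_kb / 4096 ))       # assume ~4MB of spatch footprint per file
test "$batch" -ge 1 || batch=1   # never drop below one file per invocation
echo "would run spatch on batches of $batch files"
```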
On Fri, Oct 05, 2018 at 04:53:35PM +0000, Keller, Jacob E wrote:

> > Are we OK with saying 1.3-1.8GB is necessary to run coccicheck? That
> > doesn't feel like an exorbitant request for a developer-only tool these
> > days, but I have noticed some people on the list tend to have lousier
> > machines than I do. ;)
> >
> > -Peff
>
> It's probably not worth trying to make this more complicated and scale
> up how many files we do at once based on the amount of available
> memory on the system...

Yeah, that sounds too complicated. At most I'd give a Makefile knob to
say "spatch in batches of $(N)". But I'd prefer to avoid even that
complexity if we can.

-Peff
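A minimal sketch of such a knob, not part of any patch in this thread: `xargs -n` splits $(COCCI_SOURCES) into batches of a hypothetical $(SPATCH_BATCH_SIZE) files and runs spatch once per batch; xargs exits non-zero if any invocation fails, so the error handling stays the same (recipe lines must be indented with tabs):

```make
SPATCH_BATCH_SIZE = 1

%.cocci.patch: %.cocci $(COCCI_SOURCES)
	@echo '    ' SPATCH $<; \
	if ! echo $(COCCI_SOURCES) | xargs -n $(SPATCH_BATCH_SIZE) \
		$(SPATCH) --sp-file $< $(SPATCH_FLAGS) >$@+ 2>$@.log; \
	then \
		cat $@.log; \
		exit 1; \
	fi
```

A batch size of 1 would match the old one-file-at-a-time behavior; a large value would approximate the all-at-once behavior of this patch.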
On Fri, Oct 05, 2018 at 12:25:17PM -0400, Jeff King wrote:
> On Fri, Oct 05, 2018 at 02:40:48PM +0200, SZEDER Gábor wrote:
>
> > On Thu, Oct 04, 2018 at 07:17:47PM -0700, Jacob Keller wrote:
> > > Junio, do you want me to update the commit message on my side with the
> > > memory concerns? Or could you update it to mention memory as a noted
> > > trade off.
> >
> > We have been running 'make -j2 coccicheck' in the static analysis
> > build job on Travis CI, which worked just fine so far. The Travis CI
> > build environments have 3GB of memory available [1], but, as shown in
> > [2], with this patch the memory consumption jumps up to about
> > 1.3-1.8GB for each of those jobs. So with two parallel jobs we will
> > very likely bump into this limit.
> >
> > So this patch should definitely change that build script to run only a
> > single job.
>
> It should still be a net win, since the total CPU seems to drop by a
> factor of 3-4.

Well, that's true when you have unlimited resources... :) or it's
true even then, when I have just enough resources, but not much
contention. After all, Coccinelle doesn't have to parse the same
header files over and over again. However, on Travis CI, where who
knows how many other build jobs are running next to our static
analysis, it doesn't seem to be the case.

On current master with an additional 'time' in front:

  time make --jobs=2 coccicheck
  <...>
  695.70user 50.27system 6:27.88elapsed 192%CPU (0avgtext+0avgdata 91448maxresident)k
  5976inputs+2536outputs (42major+18411888minor)pagefaults 0swaps

https://travis-ci.org/szeder/git/jobs/437733874#L574

With this patch, but without -j2 to fit into 3GB:

  960.50user 22.59system 16:23.74elapsed 99%CPU (0avgtext+0avgdata 1606156maxresident)k
  5976inputs+1320outputs (26major+4548440minor)pagefaults 0swaps

https://travis-ci.org/szeder/git/jobs/437734003#L575

Note that both the runtime and the CPU time increased. (and RSS, of
course)

> Are we OK with saying 1.3-1.8GB is necessary to run coccicheck? That
> doesn't feel like an exorbitant request for a developer-only tool these
> days, but I have noticed some people on the list tend to have lousier
> machines than I do. ;)
>
> -Peff
On Fri, Oct 05, 2018 at 12:59:01PM -0400, Jeff King wrote:
> On Fri, Oct 05, 2018 at 04:53:35PM +0000, Keller, Jacob E wrote:
>
> > > Are we OK with saying 1.3-1.8GB is necessary to run coccicheck? That
> > > doesn't feel like an exorbitant request for a developer-only tool these
> > > days, but I have noticed some people on the list tend to have lousier
> > > machines than I do. ;)
> > >
> > > -Peff
> >
> > It's probably not worth trying to make this more complicated and scale
> > up how many files we do at once based on the amount of available
> > memory on the system...
>
> Yeah, that sounds too complicated. At most I'd give a Makefile knob to
> say "spatch in batches of $(N)". But I'd prefer to avoid even that
> complexity if we can.

But perhaps one more if-else, e.g.:

  if test -n "$(COCCICHECK_ALL_AT_ONCE)"; then \
    <all at once from Jacob>
  else
    <old for loop>
  fi

would be an acceptable compromise? Dunno.
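Spelled out against the rule in the patch, that compromise might look like the following untested sketch; COCCICHECK_ALL_AT_ONCE is the hypothetical knob from the message above, and both branches are taken verbatim from the diffs earlier in this thread:

```make
%.cocci.patch: %.cocci $(COCCI_SOURCES)
	@echo '    ' SPATCH $<; \
	ret=0; \
	if test -n "$(COCCICHECK_ALL_AT_ONCE)"; then \
		$(SPATCH) --sp-file $< $(COCCI_SOURCES) $(SPATCH_FLAGS) \
			>$@+ 2>$@.log; \
		ret=$$?; \
	else \
		for f in $(COCCI_SOURCES); do \
			$(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
				{ ret=$$?; break; }; \
		done >$@+ 2>$@.log; \
	fi; \
	if test $$ret != 0; \
	then \
		cat $@.log; \
		exit 1; \
	fi
```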
On Fri, Oct 05, 2018 at 08:50:50PM +0200, SZEDER Gábor wrote:
> On Fri, Oct 05, 2018 at 12:59:01PM -0400, Jeff King wrote:
> > On Fri, Oct 05, 2018 at 04:53:35PM +0000, Keller, Jacob E wrote:
> >
> > > > Are we OK with saying 1.3-1.8GB is necessary to run coccicheck? That
> > > > doesn't feel like an exorbitant request for a developer-only tool these
> > > > days, but I have noticed some people on the list tend to have lousier
> > > > machines than I do. ;)
> > > >
> > > > -Peff
> > >
> > > It's probably not worth trying to make this more complicated and scale
> > > up how many files we do at once based on the amount of available
> > > memory on the system...
> >
> > Yeah, that sounds too complicated. At most I'd give a Makefile knob to
> > say "spatch in batches of $(N)". But I'd prefer to avoid even that
> > complexity if we can.
>
> But perhaps one more if-else, e.g.:
>
>   if test -n "$(COCCICHECK_ALL_AT_ONCE)"; then \
>     <all at once from Jacob>
>   else
>     <old for loop>
>   fi
>
> would be an acceptable compromise? Dunno.

That's OK, too, assuming people would actually want to use it. I'm also
OK shipping this (with the "make -j" fix you suggested) and seeing if
anybody actually complains. I assume there are only a handful of people
running coccicheck in the first place.

-Peff
On Fri, Oct 05, 2018 at 08:39:04PM +0200, SZEDER Gábor wrote:

> > It should still be a net win, since the total CPU seems to drop by a
> > factor of 3-4.
>
> Well, that's true when you have unlimited resources... :) or it's
> true even then, when I have just enough resources, but not much
> contention. After all, Coccinelle doesn't have to parse the same
> header files over and over again. However, on Travis CI, where who
> knows how many other build jobs are running next to our static
> analysis, it doesn't seem to be the case.
>
> On current master with an additional 'time' in front:
>
>   time make --jobs=2 coccicheck
>   <...>
>   695.70user 50.27system 6:27.88elapsed 192%CPU (0avgtext+0avgdata 91448maxresident)k
>   5976inputs+2536outputs (42major+18411888minor)pagefaults 0swaps
>
> https://travis-ci.org/szeder/git/jobs/437733874#L574
>
> With this patch, but without -j2 to fit into 3GB:
>
>   960.50user 22.59system 16:23.74elapsed 99%CPU (0avgtext+0avgdata 1606156maxresident)k
>   5976inputs+1320outputs (26major+4548440minor)pagefaults 0swaps
>
> https://travis-ci.org/szeder/git/jobs/437734003#L575
>
> Note that both the runtime and the CPU time increased. (and RSS, of
> course)

I'm not sure what to make of those results. Was the jump in CPU _caused_
by the patch, or does it independently fluctuate based on other things
happening on the Travis servers?

I.e., in the second run, do we know that the time would not have
actually been worse with the first patch?

-Peff
On Fri, Oct 05, 2018 at 03:02:16PM -0400, Jeff King wrote:
> On Fri, Oct 05, 2018 at 08:39:04PM +0200, SZEDER Gábor wrote:
>
> > > It should still be a net win, since the total CPU seems to drop by a
> > > factor of 3-4.
> >
> > Well, that's true when you have unlimited resources... :) or it's
> > true even then, when I have just enough resources, but not much
> > contention. After all, Coccinelle doesn't have to parse the same
> > header files over and over again. However, on Travis CI, where who
> > knows how many other build jobs are running next to our static
> > analysis, it doesn't seem to be the case.
> >
> > On current master with an additional 'time' in front:
> >
> >   time make --jobs=2 coccicheck
> >   <...>
> >   695.70user 50.27system 6:27.88elapsed 192%CPU (0avgtext+0avgdata 91448maxresident)k
> >   5976inputs+2536outputs (42major+18411888minor)pagefaults 0swaps
> >
> > https://travis-ci.org/szeder/git/jobs/437733874#L574
> >
> > With this patch, but without -j2 to fit into 3GB:
> >
> >   960.50user 22.59system 16:23.74elapsed 99%CPU (0avgtext+0avgdata 1606156maxresident)k
> >   5976inputs+1320outputs (26major+4548440minor)pagefaults 0swaps
> >
> > https://travis-ci.org/szeder/git/jobs/437734003#L575
> >
> > Note that both the runtime and the CPU time increased. (and RSS, of
> > course)
>
> I'm not sure what to make of those results. Was the jump in CPU _caused_
> by the patch, or does it independently fluctuate based on other things
> happening on the Travis servers?
>
> I.e., in the second run, do we know that the time would not have
> actually been worse with the first patch?

Runtimes tend to fluctuate quite a bit more on Travis CI compared to
my machine, but not this much, and it seems to be consistent so far.

After scripting/querying the Travis CI API a bit, I found that of
the last 100 static analysis build jobs, 78 did actually run 'make
coccicheck' [1], averaging 470s for the whole build job, with only 4
build jobs exceeding the 10min mark.

I had maybe 6-8 build jobs running this patch over the last 2-3 days,
I think all of them were over 15min. (I restarted some of them, so I
don't have separate logs for all of them, hence the uncertainty.)

1 - There are a couple of canceled build jobs, and we skip the build
job of branches when they happen to match a tag.
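Roughly how such numbers can be gathered — a sketch against what the Travis CI v3 API is assumed to look like (the endpoint shape and the jq filter are illustrative, and selecting only the static analysis job within each build is omitted for brevity):

```sh
# Average duration (in seconds) of the last 100 builds of szeder/git.
curl -s -H 'Travis-API-Version: 3' \
	'https://api.travis-ci.org/repo/szeder%2Fgit/builds?limit=100' |
jq '[.builds[] | select(.state == "passed") | .duration] | add / length'
```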
On Fri, Oct 5, 2018 at 12:00 PM Jeff King <peff@peff.net> wrote:
> That's OK, too, assuming people would actually want to use it. I'm also
> OK shipping this (with the "make -j" fix you suggested) and seeing if
> anybody actually complains. I assume there are only a handful of people
> running coccicheck in the first place.
>
> -Peff

Ok. I can go this route if we have consensus on the "break it and see
if someone complains" approach.

Regards,
Jake
On 05.10.2018 21:00, Jeff King wrote:
> On Fri, Oct 05, 2018 at 08:50:50PM +0200, SZEDER Gábor wrote:
>
>> On Fri, Oct 05, 2018 at 12:59:01PM -0400, Jeff King wrote:
>>> On Fri, Oct 05, 2018 at 04:53:35PM +0000, Keller, Jacob E wrote:
>>>
>>>>> Are we OK with saying 1.3-1.8GB is necessary to run coccicheck? That
>>>>> doesn't feel like an exorbitant request for a developer-only tool these
>>>>> days, but I have noticed some people on the list tend to have lousier
>>>>> machines than I do. ;)
>>>>>
>>>>> -Peff
>>>>
>>>> It's probably not worth trying to make this more complicated and scale
>>>> up how many files we do at once based on the amount of available
>>>> memory on the system...
>>>
>>> Yeah, that sounds too complicated. At most I'd give a Makefile knob to
>>> say "spatch in batches of $(N)". But I'd prefer to avoid even that
>>> complexity if we can.
>>
>> But perhaps one more if-else, e.g.:
>>
>>   if test -n "$(COCCICHECK_ALL_AT_ONCE)"; then \
>>     <all at once from Jacob>
>>   else
>>     <old for loop>
>>   fi
>>
>> would be an acceptable compromise? Dunno.
>
> That's OK, too, assuming people would actually want to use it. I'm also
> OK shipping this (with the "make -j" fix you suggested) and seeing if
> anybody actually complains. I assume there are only a handful of people
> running coccicheck in the first place.

FWIW, my development environment is a virtual machine with 1200MB RAM
and 900MB swap space. coccicheck takes almost eight minutes
sequentially, and four and a half minutes with -j4.

Unsurprisingly, it fails after almost three minutes with the patch,
reporting that it ran out of memory. With 2900MB it fails after almost
two minutes, with 3000MB it succeeds after a good two minutes.

time(1) says (for -j1):

  433.30user 36.17system 7:49.84elapsed 99%CPU (0avgtext+0avgdata 108212maxresident)k
  192inputs+1512outputs (0major+16409056minor)pagefaults 0swaps

  129.74user 2.06system 2:13.27elapsed 98%CPU (0avgtext+0avgdata 1884568maxresident)k
  236896inputs+1096outputs (795major+462129minor)pagefaults 0swaps

So with the patch it's more than three times faster, but needs more
than seventeen times more memory. And I need a bigger VM. :-/

René
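The two summary lines above match GNU time(1)'s default output format; measurements like these can be reproduced with the standalone binary rather than the shell's `time` builtin (which prints the real/user/sys triplet quoted earlier in the thread):

```sh
/usr/bin/time make -j1 coccicheck
```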
On 02.10.18 22:18, Jacob Keller wrote:
> On Tue, Oct 2, 2018 at 1:07 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
>>
>> From: Jacob Keller <jacob.keller@gmail.com>
>>
>> make coccicheck is used in order to apply coccinelle semantic patches,
>> and see if any of the transformations found within contrib/coccinelle/
>> can be applied to the current code base.
>>
>> Pass every file to a single invocation of spatch, instead of running
>> spatch once per source file.
>>
>> This reduces the time required to run make coccicheck by a significant
>> amount of time:
>>
>> Prior timing of make coccicheck
>>     real    6m14.090s
>>     user    25m2.606s
>>     sys     1m22.919s
>>
>> New timing of make coccicheck
>>     real    1m36.580s
>>     user    7m55.933s
>>     sys     0m18.219s
>>
>> This is nearly a 4x decrease in the time required to run make
>> coccicheck. This is due to the overhead of restarting spatch for every
>> file. By processing all files at once, we can amortize this startup cost
>> across the total number of files, rather than paying it once per file.
>>
>> Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
>> ---
>
> Forgot to add what changed. I dropped the subshell and "||" bit around
> invoking spatch.
>
> Thanks,
> Jake
>
>
>>  Makefile | 6 ++----
>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/Makefile b/Makefile
>> index df1df9db78da..da692ece9e12 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -2715,10 +2715,8 @@ endif
>>  %.cocci.patch: %.cocci $(COCCI_SOURCES)
>>  	@echo '    ' SPATCH $<; \
>>  	ret=0; \
>> -	for f in $(COCCI_SOURCES); do \
>> -		$(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
>> -			{ ret=$$?; break; }; \
>> -	done >$@+ 2>$@.log; \
>> +	$(SPATCH) --sp-file $< $(COCCI_SOURCES) $(SPATCH_FLAGS) >$@+ 2>$@.log; \
>> +	ret=$$?; \
>>  	if test $$ret != 0; \
>>  	then \
>>  		cat $@.log; \
>> --
>> 2.18.0.219.gaf81d287a9da
>>

Wouldn't the following be even simpler?

diff --git a/Makefile b/Makefile
index 5c8307b7c479..a37b2724d526 100644
--- a/Makefile
+++ b/Makefile
@@ -2701,12 +2701,7 @@ endif

 %.cocci.patch: %.cocci $(COCCI_SOURCES)
 	@echo '    ' SPATCH $<; \
-	ret=0; \
-	for f in $(COCCI_SOURCES); do \
-		$(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
-			{ ret=$$?; break; }; \
-	done >$@+ 2>$@.log; \
-	if test $$ret != 0; \
+	if ! $(SPATCH) --sp-file $< $(COCCI_SOURCES) $(SPATCH_FLAGS)
>$@+ 2>$@.log; \
 	then \
 		cat $@.log; \
 		exit 1; \

Cheers,
Beat
On 07.10.18 13:36, Beat Bolli wrote:
> On 02.10.18 22:18, Jacob Keller wrote:
>> On Tue, Oct 2, 2018 at 1:07 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
>>>
>>> From: Jacob Keller <jacob.keller@gmail.com>
>>>
>>> make coccicheck is used in order to apply coccinelle semantic patches,
>>> and see if any of the transformations found within contrib/coccinelle/
>>> can be applied to the current code base.
>>>
>>> Pass every file to a single invocation of spatch, instead of running
>>> spatch once per source file.
>>>
>>> This reduces the time required to run make coccicheck by a significant
>>> amount of time:
>>>
>>> Prior timing of make coccicheck
>>>     real    6m14.090s
>>>     user    25m2.606s
>>>     sys     1m22.919s
>>>
>>> New timing of make coccicheck
>>>     real    1m36.580s
>>>     user    7m55.933s
>>>     sys     0m18.219s
>>>
>>> This is nearly a 4x decrease in the time required to run make
>>> coccicheck. This is due to the overhead of restarting spatch for every
>>> file. By processing all files at once, we can amortize this startup cost
>>> across the total number of files, rather than paying it once per file.
>>>
>>> Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
>>> ---
>>
>> Forgot to add what changed. I dropped the subshell and "||" bit around
>> invoking spatch.
>>
>> Thanks,
>> Jake
>>
>>
>>>  Makefile | 6 ++----
>>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/Makefile b/Makefile
>>> index df1df9db78da..da692ece9e12 100644
>>> --- a/Makefile
>>> +++ b/Makefile
>>> @@ -2715,10 +2715,8 @@ endif
>>>  %.cocci.patch: %.cocci $(COCCI_SOURCES)
>>>  	@echo '    ' SPATCH $<; \
>>>  	ret=0; \
>>> -	for f in $(COCCI_SOURCES); do \
>>> -		$(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
>>> -			{ ret=$$?; break; }; \
>>> -	done >$@+ 2>$@.log; \
>>> +	$(SPATCH) --sp-file $< $(COCCI_SOURCES) $(SPATCH_FLAGS) >$@+ 2>$@.log; \
>>> +	ret=$$?; \
>>>  	if test $$ret != 0; \
>>>  	then \
>>>  		cat $@.log; \
>>> --
>>> 2.18.0.219.gaf81d287a9da
>>>
>
> Wouldn't the following be even simpler?
>
> diff --git a/Makefile b/Makefile
> index 5c8307b7c479..a37b2724d526 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2701,12 +2701,7 @@ endif
>
>  %.cocci.patch: %.cocci $(COCCI_SOURCES)
>  	@echo '    ' SPATCH $<; \
> -	ret=0; \
> -	for f in $(COCCI_SOURCES); do \
> -		$(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
> -			{ ret=$$?; break; }; \
> -	done >$@+ 2>$@.log; \
> -	if test $$ret != 0; \
> +	if ! $(SPATCH) --sp-file $< $(COCCI_SOURCES) $(SPATCH_FLAGS)
> >$@+ 2>$@.log; \

The "if !" and the output redirection should be on one line,
obviously... Sorry about this.

Beat
On Sat, Oct 06, 2018 at 10:42:57AM +0200, René Scharfe wrote:

> > That's OK, too, assuming people would actually want to use it. I'm also
> > OK shipping this (with the "make -j" fix you suggested) and seeing if
> > anybody actually complains. I assume there are only a handful of people
> > running coccicheck in the first place.
>
> FWIW, my development environment is a virtual machine with 1200MB RAM
> and 900MB swap space. coccicheck takes almost eight minutes
> sequentially, and four and a half minutes with -j4.
>
> Unsurprisingly, it fails after almost three minutes with the patch,
> reporting that it ran out of memory. With 2900MB it fails after almost
> two minutes, with 3000MB it succeeds after a good two minutes.
>
> time(1) says (for -j1):
>
>   433.30user 36.17system 7:49.84elapsed 99%CPU (0avgtext+0avgdata 108212maxresident)k
>   192inputs+1512outputs (0major+16409056minor)pagefaults 0swaps
>
>   129.74user 2.06system 2:13.27elapsed 98%CPU (0avgtext+0avgdata 1884568maxresident)k
>   236896inputs+1096outputs (795major+462129minor)pagefaults 0swaps
>
> So with the patch it's more than three times faster, but needs more
> than seventeen times more memory. And I need a bigger VM. :-/

Yuck. :)

So if we want to take this as a complaint, then I guess we can jump
straight to implementing the fallback to the existing behavior (though
it may be worth it for you to expand your VM to get the decreased CPU
time).

I'm still puzzled by Gábor's counter-intuitive CI numbers, though.

-Peff
On Fri, Oct 05, 2018 at 09:54:13PM +0200, SZEDER Gábor wrote:

> Runtimes tend to fluctuate quite a bit more on Travis CI compared to
> my machine, but not this much, and it seems to be consistent so far.
>
> After scripting/querying the Travis CI API a bit, I found that of
> the last 100 static analysis build jobs, 78 did actually run 'make
> coccicheck' [1], averaging 470s for the whole build job, with only 4
> build jobs exceeding the 10min mark.
>
> I had maybe 6-8 build jobs running this patch over the last 2-3 days,
> I think all of them were over 15min. (I restarted some of them, so I
> don't have separate logs for all of them, hence the uncertainty.)

So that's really weird and counter-intuitive, since we should be doing
strictly less work. I know that spatch tries to parallelize itself,
though from my tests, 1.0.4 does not. I wonder if the version in Travis
differs in that respect and starts too many threads, and the extra time
is going to contention and context switches.

Have you tried passing "-j1" to spatch? My 1.0.4 does not even recognize
it. That seems like a pretty unlikely explanation to me, but I am having
trouble coming up with another one.

I guess the other plausible thing is that the extra memory is forcing us
into some slower path. E.g., a hypervisor may even be swapping,
unbeknownst to the child OS, and it gets accounted in the child OS as
"boy, that memory load was really slow", which becomes used CPU.

That actually sounds more credible to me.

-Peff
On Mon, Oct 08, 2018 at 11:15:42PM -0400, Jeff King wrote:
> On Fri, Oct 05, 2018 at 09:54:13PM +0200, SZEDER Gábor wrote:
>
> > Runtimes tend to fluctuate quite a bit more on Travis CI compared to
> > my machine, but not this much, and it seems to be consistent so far.
> >
> > After scripting/querying the Travis CI API a bit, I found that of
> > the last 100 static analysis build jobs, 78 did actually run 'make
> > coccicheck' [1], averaging 470s for the whole build job, with only 4
> > build jobs exceeding the 10min mark.
> >
> > I had maybe 6-8 build jobs running this patch over the last 2-3 days,
> > I think all of them were over 15min. (I restarted some of them, so I
> > don't have separate logs for all of them, hence the uncertainty.)
>
> So that's really weird and counter-intuitive, since we should be doing
> strictly less work. I know that spatch tries to parallelize itself,
> though from my tests, 1.0.4 does not. I wonder if the version in Travis
> differs in that respect and starts too many threads, and the extra time
> is going to contention and context switches.

I don't think it does any parallel work.

Here is the timing again from my previous email:

  960.50user 22.59system 16:23.74elapsed 99%CPU (0avgtext+0avgdata 1606156maxresident)k

Notice that 16:23 is 983s, and that it matches the sum of the user and
system times. I usually saw this kind of timing with CPU-intensive
single-threaded programs, and if there were any parallelization, then I
would expect the elapsed time to be at least somewhat smaller than the
other two.

> Have you tried passing "-j1" to spatch? My 1.0.4 does not even recognize
> it.

I have just given it a try, but the v1.0.0 on Travis CI errored out
with "unknown option `-j'".

  https://travis-ci.org/szeder/git/jobs/439532822#L566

> That seems like a pretty unlikely explanation to me, but I am having
> trouble coming up with another one.
>
> I guess the other plausible thing is that the extra memory is forcing us
> into some slower path. E.g., a hypervisor may even be swapping,
> unbeknownst to the child OS, and it gets accounted in the child OS as
> "boy, that memory load was really slow", which becomes used CPU.
>
> That actually sounds more credible to me.
>
> -Peff
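The arithmetic above checks out: 960.50s user + 22.59s system = 983.09s of CPU time, and 16:23.74 elapsed = 983.74s, i.e. the process kept a single core ~99% busy for essentially the whole run, which is why parallelism within spatch can be ruled out.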
On Wed, Oct 10, 2018 at 01:44:41PM +0200, SZEDER Gábor wrote:

> > So that's really weird and counter-intuitive, since we should be doing
> > strictly less work. I know that spatch tries to parallelize itself,
> > though from my tests, 1.0.4 does not. I wonder if the version in Travis
> > differs in that respect and starts too many threads, and the extra time
> > is going to contention and context switches.
>
> I don't think it does any parallel work.
>
> Here is the timing again from my previous email:
>
>   960.50user 22.59system 16:23.74elapsed 99%CPU (0avgtext+0avgdata 1606156maxresident)k
>
> Notice that 16:23 is 983s, and that it matches the sum of the user and
> system times. I usually saw this kind of timing with CPU-intensive
> single-threaded programs, and if there were any parallelization, then I
> would expect the elapsed time to be at least somewhat smaller than the
> other two.

Ah, right, I should have been able to figure that out myself. So
scratch that theory. My "hypervisor stalling our memory reads" theory
is still plausible, but I don't know how we would test it.

I guess in some sense it doesn't matter. If it's slower, we're not
likely to be able to fix that. So I guess we just need the fallback to
the current behavior.

-Peff
diff --git a/Makefile b/Makefile
index df1df9db78da..da692ece9e12 100644
--- a/Makefile
+++ b/Makefile
@@ -2715,10 +2715,8 @@ endif
 %.cocci.patch: %.cocci $(COCCI_SOURCES)
 	@echo '    ' SPATCH $<; \
 	ret=0; \
-	for f in $(COCCI_SOURCES); do \
-		$(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
-			{ ret=$$?; break; }; \
-	done >$@+ 2>$@.log; \
+	$(SPATCH) --sp-file $< $(COCCI_SOURCES) $(SPATCH_FLAGS) >$@+ 2>$@.log; \
+	ret=$$?; \
 	if test $$ret != 0; \
 	then \
 		cat $@.log; \
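With the patched rule, usage is unchanged, but given the memory figures discussed in the thread, limiting make's parallelism is the safe default on smaller machines, since each .cocci target now runs one large spatch invocation:

```sh
make coccicheck        # one spatch invocation per .cocci file
make -j1 coccicheck    # avoid multiplying the ~1.3-1.8GB peak across parallel targets
```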