Message ID | cover.1564009259.git.steadmon@google.com (mailing list archive) |
---|---|
Headers | show |
Series | Add a JSON Schema for trace2 events | expand |
On Wed, Jul 24, 2019 at 04:06:50PM -0700, Josh Steadmon wrote: > This is a proof of concept series that formalizes the structure of trace2 event > output using JSON-Schema [1]. > > It provides a validator (written in Go) that verifies the events in a given > trace2 event output file match the schema. I am happy to rewrite this validator > in some other language, provided that the language has a JSON-Schema library > supporting at least draft-04. > > It runs the validator as part of the CI suite (it increase the runtime > by about 15 minutes). It tests that the trace output of "make test" > conforms to the schema. I don't like this approach. 15 minutes is a lot, and spending that much time on JSON schema validation in every CI build is overkill and (IMO unacceptably) wasteful. I mean, the test suite involves e.g. thousands of 'git (add|commit|checkout|reset|...)' executions to set up the test cases, but surely it's not necessary to validate the trace output of every single one of them. > I would appreciate any feedback on better ways to integrate the > validator into the CI suite. How about adding a test script dedicated to JSON schema validation, which runs only as many git commands as needed to cover all trace2 events.
SZEDER Gábor <szeder.dev@gmail.com> writes: >> I would appreciate any feedback on better ways to integrate the >> validator into the CI suite. > > How about adding a test script dedicated to JSON schema validation, > which runs only as many git commands as needed to cover all trace2 > events. Sensible, but might be hard to arrange, as the trace2 annotations on codepaths are not frozen. I wonder if these schema definitions can somehow be read in reverse by machine to automatically generate helper functions that would by definition produce schema-valid json objects, and the current callers of trace2_*() that logs stuff can be updated to call them? That way, we'll automatically and always be producing valid output, or am I dreaming?
On Wed, Jul 24, 2019 at 04:06:50PM -0700, Josh Steadmon wrote: > Changes since V2 of this series: > * corrected commit message regarding the different schema variations > * cleaned up the Makefile > * added comment noting that the validator expects JSON-Lines input > * added a --progress flag to the validator > * improved validation error output * replaced underscores with dashes in the command line options to be consistent with the rest of Git: > 2: 3fa4e9eef8 ! 2: 97cb6a3eb4 trace2: add a schema validator for trace2 events > +// Traces can then be verified like so: > +// trace_schema_validator \ > -+// --trace2_event_file /path/to/trace/output \ > -+// --schema_file /path/to/schema > ++// --trace2-event-file /path/to/trace/output \ > ++// --schema-file /path/to/schema > +package main > + > +import ( > @@ t/trace_schema_validator/trace_schema_validator.go (new) > +) > + > +// Required flags > -+var schemaFile = flag.String("schema_file", "", "JSON-Schema filename") > -+var trace2EventFile = flag.String("trace2_event_file", "", "trace2 event filename") > ++var schemaFile = flag.String("schema-file", "", "JSON-Schema filename") > ++var trace2EventFile = flag.String("trace2-event-file", "", "trace2 event filename") > ++var progress = flag.Int("progress", 0, "Print progress message each time we have validated this many lines. --progress=0 means no messages are printed") > 3: acf3aebcaa ! 3: a07458b2e4 ci: run trace2 schema validation in the CI suite > @@ ci/run-build-and-tests.sh: then > make test > + t/trace_schema_validator/trace_schema_validator \ > + --trace2_event_file=${GIT_TRACE2_EVENT} \ > -+ --schema_file=t/trace_schema_validator/strict_schema.json > ++ --schema_file=t/trace_schema_validator/strict_schema.json \ However, the options used in the CI script remained unchanged, and, consequently, the build breaks with the message "flag provided but not defined: -trace2_event_file" (nit: which doesn't begin with the prefix "error:"): https://travis-ci.org/git/git/jobs/563776273#L2238 > ++ --progress=10000 > fi > > check_unignored_build_artifacts > -- > 2.22.0.709.g102302147b-goog >
Hi, On Fri, 26 Jul 2019, SZEDER Gábor wrote: > On Wed, Jul 24, 2019 at 04:06:50PM -0700, Josh Steadmon wrote: > > > 3: acf3aebcaa ! 3: a07458b2e4 ci: run trace2 schema validation in the CI suite > > @@ ci/run-build-and-tests.sh: then > > make test > > + t/trace_schema_validator/trace_schema_validator \ > > + --trace2_event_file=${GIT_TRACE2_EVENT} \ > > -+ --schema_file=t/trace_schema_validator/strict_schema.json > > ++ --schema_file=t/trace_schema_validator/strict_schema.json \ > > However, the options used in the CI script remained unchanged, and, > consequently, the build breaks with the message "flag provided but not > defined: -trace2_event_file" (nit: which doesn't begin with the prefix > "error:"): > > https://travis-ci.org/git/git/jobs/563776273#L2238 Not only on Travis, but obviously also on Azure Pipelines. I was wary about this patch series ever since I got aware that it refuses to use an already-available JSON schema validator (such as `ajv`, a seemingly well-established all-purpose validator that even claims to be the fastest validator) and instead tries to push down our throat the first Go-implemented thing in git.git. That somehow does not sit well with me. And I still think we can do a lot better than implement a JSON schema validator in Go (which has nothing to do with Git), leverage other, well-maintained projects instead, and gain even more power by not having to maintain it in git.git, and by avoiding to add yet another Language You Gotta Know To Do Git. (We already have too many of those, and quite honestly, I'd rather add Javascript that we _already_ kinda require contributors to know because of GitWeb, rather than any other language, especially languages that seem to be controlled by one company, that's why I still stay away from .NET and even Java, to be quite honest, even if I was really super knowledgeable about Java because of my previous job). Besides, I have to admit that I am quite disheartened to see these lines in the log: -- snip -- 2019/07/26 11:39:28 Validated items: 0 2019/07/26 11:39:32 Validated items: 10000 2019/07/26 11:39:36 Validated items: 20000 2019/07/26 11:39:40 Validated items: 30000 2019/07/26 11:39:44 Validated items: 40000 2019/07/26 11:39:49 Validated items: 50000 2019/07/26 11:39:53 Validated items: 60000 2019/07/26 11:39:57 Validated items: 70000 2019/07/26 11:40:02 Validated items: 80000 2019/07/26 11:40:06 Validated items: 90000 2019/07/26 11:40:11 Validated items: 100000 2019/07/26 11:40:15 Validated items: 110000 2019/07/26 11:40:19 Validated items: 120000 2019/07/26 11:40:23 Validated items: 130000 2019/07/26 11:40:28 Validated items: 140000 2019/07/26 11:40:32 Validated items: 150000 2019/07/26 11:40:36 Validated items: 160000 2019/07/26 11:40:41 Validated items: 170000 2019/07/26 11:40:45 Validated items: 180000 2019/07/26 11:40:50 Validated items: 190000 2019/07/26 11:40:54 Validated items: 200000 2019/07/26 11:40:59 Validated items: 210000 2019/07/26 11:41:04 Validated items: 220000 2019/07/26 11:41:08 Validated items: 230000 2019/07/26 11:41:13 Validated items: 240000 2019/07/26 11:41:17 Validated items: 250000 2019/07/26 11:41:22 Validated items: 260000 2019/07/26 11:41:27 Validated items: 270000 2019/07/26 11:41:31 Validated items: 280000 2019/07/26 11:41:36 Validated items: 290000 2019/07/26 11:41:40 Validated items: 300000 2019/07/26 11:41:45 Validated items: 310000 2019/07/26 11:41:49 Validated items: 320000 2019/07/26 11:41:54 Validated items: 330000 2019/07/26 11:41:58 Validated items: 340000 2019/07/26 11:42:03 Validated items: 350000 2019/07/26 11:42:08 Validated items: 360000 2019/07/26 11:42:12 Validated items: 370000 2019/07/26 11:42:17 Validated items: 380000 2019/07/26 11:42:21 Validated items: 390000 2019/07/26 11:42:25 Validated items: 400000 2019/07/26 11:42:30 Validated items: 410000 2019/07/26 11:42:35 Validated items: 420000 2019/07/26 11:42:39 Validated items: 430000 2019/07/26 11:42:44 Validated items: 440000 2019/07/26 11:42:48 Validated items: 450000 2019/07/26 11:42:53 Validated items: 460000 2019/07/26 11:42:57 Validated items: 470000 2019/07/26 11:43:02 Validated items: 480000 2019/07/26 11:43:06 Validated items: 490000 2019/07/26 11:43:11 Validated items: 500000 2019/07/26 11:43:16 Validated items: 510000 2019/07/26 11:43:20 Validated items: 520000 2019/07/26 11:43:25 Validated items: 530000 2019/07/26 11:43:29 Validated items: 540000 2019/07/26 11:43:34 Validated items: 550000 2019/07/26 11:43:38 Validated items: 560000 2019/07/26 11:43:43 Validated items: 570000 2019/07/26 11:43:47 Validated items: 580000 2019/07/26 11:43:51 Validated items: 590000 2019/07/26 11:43:56 Validated items: 600000 2019/07/26 11:44:00 Validated items: 610000 2019/07/26 11:44:04 Validated items: 620000 2019/07/26 11:44:09 Validated items: 630000 2019/07/26 11:44:13 Validated items: 640000 2019/07/26 11:44:18 Validated items: 650000 2019/07/26 11:44:22 Validated items: 660000 2019/07/26 11:44:27 Validated items: 670000 2019/07/26 11:44:31 Validated items: 680000 2019/07/26 11:44:35 Validated items: 690000 2019/07/26 11:44:40 Validated items: 700000 2019/07/26 11:44:44 Validated items: 710000 2019/07/26 11:44:48 Validated items: 720000 2019/07/26 11:44:52 Validated items: 730000 2019/07/26 11:44:57 Validated items: 740000 2019/07/26 11:45:01 Validated items: 750000 2019/07/26 11:45:05 Validated items: 760000 2019/07/26 11:45:10 Validated items: 770000 2019/07/26 11:45:14 Validated items: 780000 2019/07/26 11:45:19 Validated items: 790000 2019/07/26 11:45:23 Validated items: 800000 2019/07/26 11:45:27 Validated items: 810000 2019/07/26 11:45:31 Validated items: 820000 2019/07/26 11:45:36 Validated items: 830000 2019/07/26 11:45:40 Validated items: 840000 2019/07/26 11:45:44 Validated items: 850000 2019/07/26 11:45:49 Validated items: 860000 2019/07/26 11:45:53 Validated items: 870000 2019/07/26 11:45:57 Validated items: 880000 2019/07/26 11:46:01 Validated items: 890000 2019/07/26 11:46:06 Validated items: 900000 2019/07/26 11:46:10 Validated items: 910000 2019/07/26 11:46:14 Validated items: 920000 2019/07/26 11:46:19 Validated items: 930000 2019/07/26 11:46:23 Validated items: 940000 2019/07/26 11:46:27 Validated items: 950000 2019/07/26 11:46:31 Validated items: 960000 2019/07/26 11:46:36 Validated items: 970000 2019/07/26 11:46:40 Validated items: 980000 2019/07/26 11:46:44 Validated items: 990000 2019/07/26 11:46:49 Validated items: 1000000 2019/07/26 11:46:53 Validated items: 1010000 2019/07/26 11:46:57 Validated items: 1020000 2019/07/26 11:47:01 Validated items: 1030000 2019/07/26 11:47:05 Validated items: 1040000 2019/07/26 11:47:10 Validated items: 1050000 2019/07/26 11:47:14 Validated items: 1060000 2019/07/26 11:47:18 Validated items: 1070000 2019/07/26 11:47:22 Validated items: 1080000 2019/07/26 11:47:27 Validated items: 1090000 2019/07/26 11:47:31 Validated items: 1100000 2019/07/26 11:47:35 Validated items: 1110000 2019/07/26 11:47:40 Validated items: 1120000 2019/07/26 11:47:44 Validated items: 1130000 2019/07/26 11:47:49 Validated items: 1140000 2019/07/26 11:47:54 Validated items: 1150000 2019/07/26 11:47:58 Validated items: 1160000 2019/07/26 11:48:02 Validated items: 1170000 2019/07/26 11:48:07 Validated items: 1180000 2019/07/26 11:48:11 Validated items: 1190000 2019/07/26 11:48:15 Validated items: 1200000 2019/07/26 11:48:19 Validated items: 1210000 2019/07/26 11:48:24 Validated items: 1220000 2019/07/26 11:48:28 Validated items: 1230000 2019/07/26 11:48:33 Validated items: 1240000 2019/07/26 11:48:37 Validated items: 1250000 2019/07/26 11:48:42 Validated items: 1260000 2019/07/26 11:48:47 Validated items: 1270000 2019/07/26 11:48:52 Validated items: 1280000 2019/07/26 11:48:56 Validated items: 1290000 2019/07/26 11:49:01 Validated items: 1300000 2019/07/26 11:49:05 Validated items: 1310000 2019/07/26 11:49:09 Validated items: 1320000 2019/07/26 11:49:14 Validated items: 1330000 2019/07/26 11:49:18 Validated items: 1340000 2019/07/26 11:49:23 Validated items: 1350000 2019/07/26 11:49:28 Validated items: 1360000 2019/07/26 11:49:32 Validated items: 1370000 2019/07/26 11:49:37 Validated items: 1380000 2019/07/26 11:49:41 Validated items: 1390000 2019/07/26 11:49:46 Validated items: 1400000 2019/07/26 11:49:50 Validated items: 1410000 2019/07/26 11:49:55 Validated items: 1420000 2019/07/26 11:50:00 Validated items: 1430000 2019/07/26 11:50:04 Validated items: 1440000 2019/07/26 11:50:08 Validated items: 1450000 2019/07/26 11:50:12 Validated items: 1460000 2019/07/26 11:50:17 Validated items: 1470000 2019/07/26 11:50:21 Validated items: 1480000 2019/07/26 11:50:26 Validated items: 1490000 2019/07/26 11:50:30 Validated items: 1500000 2019/07/26 11:50:35 Validated items: 1510000 2019/07/26 11:50:39 Validated items: 1520000 2019/07/26 11:50:44 Validated items: 1530000 2019/07/26 11:50:49 Validated items: 1540000 2019/07/26 11:50:53 Validated items: 1550000 2019/07/26 11:50:57 Validated items: 1560000 2019/07/26 11:51:02 Validated items: 1570000 2019/07/26 11:51:06 Validated items: 1580000 2019/07/26 11:51:11 Validated items: 1590000 2019/07/26 11:51:15 Validated items: 1600000 2019/07/26 11:51:20 Validated items: 1610000 2019/07/26 11:51:24 Validated items: 1620000 2019/07/26 11:51:29 Validated items: 1630000 2019/07/26 11:51:33 Validated items: 1640000 2019/07/26 11:51:38 Validated items: 1650000 2019/07/26 11:51:42 Validated items: 1660000 2019/07/26 11:51:47 Validated items: 1670000 2019/07/26 11:51:51 Validated items: 1680000 2019/07/26 11:51:55 Validated items: 1690000 2019/07/26 11:52:00 Validated items: 1700000 2019/07/26 11:52:04 Validated items: 1710000 2019/07/26 11:52:09 Validated items: 1720000 2019/07/26 11:52:13 Validated items: 1730000 2019/07/26 11:52:17 Validated items: 1740000 2019/07/26 11:52:22 Validated items: 1750000 2019/07/26 11:52:26 Validated items: 1760000 2019/07/26 11:52:30 Validated items: 1770000 2019/07/26 11:52:34 Validated items: 1780000 2019/07/26 11:52:39 Validated items: 1790000 2019/07/26 11:52:43 Validated items: 1800000 2019/07/26 11:52:47 Validated items: 1810000 2019/07/26 11:52:51 Validated items: 1820000 2019/07/26 11:52:55 Validated items: 1830000 2019/07/26 11:52:59 Validated items: 1840000 2019/07/26 11:53:04 Validated items: 1850000 2019/07/26 11:53:08 Validated items: 1860000 2019/07/26 11:53:12 Validated items: 1870000 2019/07/26 11:53:16 Validated items: 1880000 2019/07/26 11:53:20 Validated items: 1890000 2019/07/26 11:53:24 Validated items: 1900000 2019/07/26 11:53:28 Validated items: 1910000 2019/07/26 11:53:32 Validated items: 1920000 2019/07/26 11:53:36 Validated items: 1930000 2019/07/26 11:53:41 Validated items: 1940000 2019/07/26 11:53:45 Validated items: 1950000 2019/07/26 11:53:49 Validated items: 1960000 2019/07/26 11:53:53 Validated items: 1970000 2019/07/26 11:53:58 Validated items: 1980000 2019/07/26 11:54:02 Validated items: 1990000 2019/07/26 11:54:06 Validated items: 2000000 2019/07/26 11:54:10 Validated items: 2010000 2019/07/26 11:54:14 Validated items: 2020000 2019/07/26 11:54:18 Validated items: 2030000 2019/07/26 11:54:23 Validated items: 2040000 2019/07/26 11:54:27 Validated items: 2050000 2019/07/26 11:54:31 Validated items: 2060000 2019/07/26 11:54:35 Validated items: 2070000 2019/07/26 11:54:39 Validated items: 2080000 2019/07/26 11:54:43 Validated items: 2090000 2019/07/26 11:54:48 Validated items: 2100000 2019/07/26 11:54:52 Validated items: 2110000 2019/07/26 11:54:56 Validated items: 2120000 2019/07/26 11:55:00 Validated items: 2130000 2019/07/26 11:55:04 Validated items: 2140000 2019/07/26 11:55:08 Validated items: 2150000 2019/07/26 11:55:12 Validated items: 2160000 2019/07/26 11:55:16 Validated items: 2170000 2019/07/26 11:55:21 Validated items: 2180000 2019/07/26 11:55:25 Validated items: 2190000 2019/07/26 11:55:29 Validated items: 2200000 2019/07/26 11:55:33 Validated items: 2210000 2019/07/26 11:55:37 Validated items: 2220000 2019/07/26 11:55:42 Validated items: 2230000 2019/07/26 11:55:46 Validated items: 2240000 2019/07/26 11:55:50 Validated items: 2250000 2019/07/26 11:55:55 Validated items: 2260000 2019/07/26 11:55:59 Validated items: 2270000 2019/07/26 11:56:03 Validated items: 2280000 2019/07/26 11:56:08 Validated items: 2290000 2019/07/26 11:56:12 Validated items: 2300000 2019/07/26 11:56:14 Validated events: 2303564 -- snap -- Let me read this out loud to you. We validate about 2.3 million items, and we spend a whopping sixteen minutes and fourty-six seconds on validating them. That's crazy. It's not only the time, that's bad enough. It is the combined amount of electricity and CO2 that is spent -- quite pointlessly -- on this task. I know that at least for Azure Pipelines, that's CO2-neutral, but that is only a trick and a lot of money. I'd rather avoid that in the first place. What the heck are we thinking, to add such a lot of wasted time to every CI run? Why do we have to validate *that* many items? And why do we have to use a validator that is obviously pretty slow at doing this? In the short run, let's fix the build. But even before the next iteraion, let's try to find a way to avoid having to maintain a JSON validator in Git's source code, _and_ let's avoid wasting so much time and electricty. Junio, I need this patch to fix the build, would you terribly mind applying it on top of `js/trace2-json-schema`? -- snipsnap -- From 2deb3ac117768133551716c9788266671aafb445 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Fri, 26 Jul 2019 13:18:26 +0200 Subject: [PATCH] fixup??? ci: run trace2 schema validation in the CI suite That seems like an oversight when iterating the patch series. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- ci/run-build-and-tests.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh index 8103125849..696b6248ed 100755 --- a/ci/run-build-and-tests.sh +++ b/ci/run-build-and-tests.sh @@ -24,8 +24,8 @@ linux-gcc) export GIT_TEST_ADD_I_USE_BUILTIN=1 make test t/trace_schema_validator/trace_schema_validator \ - --trace2_event_file=${GIT_TRACE2_EVENT} \ - --schema_file=t/trace_schema_validator/strict_schema.json \ + --trace2-event-file=${GIT_TRACE2_EVENT} \ + --schema-file=t/trace_schema_validator/strict_schema.json \ --progress=10000 ;; linux-gcc-4.8) -- 2.22.0.windows.1.6.g271c090e89
On Fri, Jul 26, 2019 at 02:12:39PM +0200, Johannes Schindelin wrote: > I was wary about this patch series ever since I got aware that it > refuses to use an already-available JSON schema validator (such as > `ajv`, a seemingly well-established all-purpose validator that even > claims to be the fastest validator) and instead tries to push down our > throat the first Go-implemented thing in git.git. Well, to be fair, the cover letters of all versions of this patch series included the following bit: I am happy to rewrite this validator in some other language, provided that the language has a JSON-Schema library supporting at least draft-04. Having said that, using an existing validator would make me happier, too. At the very least the commit message of the patch adding the validator should argue convincingly why rolling our own is worth it. > Besides, I have to admit that I am quite disheartened to see these lines > in the log: > > -- snip -- > 2019/07/26 11:39:28 Validated items: 0 > 2019/07/26 11:39:32 Validated items: 10000 These are necessary for Travis CI, which aborts the build if it doesn't receive output for 10 minutes, thinking that the tests hang somewhere. Yeah, less lines like these would be sufficient to avoid that timeout, but, of course, the real solution would be to not validate all of 'make test'.
On 2019.07.25 09:14, Junio C Hamano wrote: > SZEDER Gábor <szeder.dev@gmail.com> writes: > > >> I would appreciate any feedback on better ways to integrate the > >> validator into the CI suite. > > > > How about adding a test script dedicated to JSON schema validation, > > which runs only as many git commands as needed to cover all trace2 > > events. > > Sensible, but might be hard to arrange, as the trace2 annotations on > codepaths are not frozen. > > I wonder if these schema definitions can somehow be read in reverse > by machine to automatically generate helper functions that would by > definition produce schema-valid json objects, and the current > callers of trace2_*() that logs stuff can be updated to call them? > That way, we'll automatically and always be producing valid output, > or am I dreaming? In that case, how would you feel about having a protobuf writer, perhaps using Nano-PB[1]? It would handle code generation, and we could add a new GIT_TRACE2_PB trace target to use the generated code. [1]: https://jpa.kapsi.fi/nanopb/
On 2019.07.26 14:12, Johannes Schindelin wrote: > Hi, > > On Fri, 26 Jul 2019, SZEDER Gábor wrote: > > > On Wed, Jul 24, 2019 at 04:06:50PM -0700, Josh Steadmon wrote: > > > > > 3: acf3aebcaa ! 3: a07458b2e4 ci: run trace2 schema validation in the CI suite > > > @@ ci/run-build-and-tests.sh: then > > > make test > > > + t/trace_schema_validator/trace_schema_validator \ > > > + --trace2_event_file=${GIT_TRACE2_EVENT} \ > > > -+ --schema_file=t/trace_schema_validator/strict_schema.json > > > ++ --schema_file=t/trace_schema_validator/strict_schema.json \ > > > > However, the options used in the CI script remained unchanged, and, > > consequently, the build breaks with the message "flag provided but not > > defined: -trace2_event_file" (nit: which doesn't begin with the prefix > > "error:"): > > > > https://travis-ci.org/git/git/jobs/563776273#L2238 > > Not only on Travis, but obviously also on Azure Pipelines. > > I was wary about this patch series ever since I got aware that it > refuses to use an already-available JSON schema validator (such as > `ajv`, a seemingly well-established all-purpose validator that even > claims to be the fastest validator) and instead tries to push down our > throat the first Go-implemented thing in git.git. As I mentioned in my cover letter, all the standalone validators that I had previously tested took multiple hours to validate the full "make test" output. Additionally, they tended to have unhelpful output in the case of non-validating messages, that did not describe how or where the validation failed. I had previously attempted to evaluate ajv-cli, but I could not get npm to install it correctly on my machine. I took another look today, and I am pleasantly surprised to see that it can validate the full 1.7M line trace output in just over a minute. Moreover, it has helpful output when validation fails. So I would be happy to re-implement this using ajv-cli. This will also require using `jq` to massage the trace output from JSON-Lines into a single JSON list object; I presume that adding jq as a dependency to the CI suite would not be a burden? If so I suppose `sed` would also suffice. > That somehow does not sit well with me. > > And I still think we can do a lot better than implement a JSON schema > validator in Go (which has nothing to do with Git), leverage other, > well-maintained projects instead, and gain even more power by not having > to maintain it in git.git, and by avoiding to add yet another Language > You Gotta Know To Do Git. (We already have too many of those, and quite > honestly, I'd rather add Javascript that we _already_ kinda require > contributors to know because of GitWeb, rather than any other > language, especially languages that seem to be controlled by one > company, that's why I still stay away from .NET and even Java, to be > quite honest, even if I was really super knowledgeable about Java > because of my previous job). > > Besides, I have to admit that I am quite disheartened to see these lines > in the log: > > -- snip -- > 2019/07/26 11:39:28 Validated items: 0 > 2019/07/26 11:39:32 Validated items: 10000 > 2019/07/26 11:39:36 Validated items: 20000 > 2019/07/26 11:39:40 Validated items: 30000 > 2019/07/26 11:39:44 Validated items: 40000 > 2019/07/26 11:39:49 Validated items: 50000 > 2019/07/26 11:39:53 Validated items: 60000 > 2019/07/26 11:39:57 Validated items: 70000 > 2019/07/26 11:40:02 Validated items: 80000 > 2019/07/26 11:40:06 Validated items: 90000 > 2019/07/26 11:40:11 Validated items: 100000 > 2019/07/26 11:40:15 Validated items: 110000 > 2019/07/26 11:40:19 Validated items: 120000 > 2019/07/26 11:40:23 Validated items: 130000 > 2019/07/26 11:40:28 Validated items: 140000 > 2019/07/26 11:40:32 Validated items: 150000 > 2019/07/26 11:40:36 Validated items: 160000 > 2019/07/26 11:40:41 Validated items: 170000 > 2019/07/26 11:40:45 Validated items: 180000 > 2019/07/26 11:40:50 Validated items: 190000 > 2019/07/26 11:40:54 Validated items: 200000 > 2019/07/26 11:40:59 Validated items: 210000 > 2019/07/26 11:41:04 Validated items: 220000 > 2019/07/26 11:41:08 Validated items: 230000 > 2019/07/26 11:41:13 Validated items: 240000 > 2019/07/26 11:41:17 Validated items: 250000 > 2019/07/26 11:41:22 Validated items: 260000 > 2019/07/26 11:41:27 Validated items: 270000 > 2019/07/26 11:41:31 Validated items: 280000 > 2019/07/26 11:41:36 Validated items: 290000 > 2019/07/26 11:41:40 Validated items: 300000 > 2019/07/26 11:41:45 Validated items: 310000 > 2019/07/26 11:41:49 Validated items: 320000 > 2019/07/26 11:41:54 Validated items: 330000 > 2019/07/26 11:41:58 Validated items: 340000 > 2019/07/26 11:42:03 Validated items: 350000 > 2019/07/26 11:42:08 Validated items: 360000 > 2019/07/26 11:42:12 Validated items: 370000 > 2019/07/26 11:42:17 Validated items: 380000 > 2019/07/26 11:42:21 Validated items: 390000 > 2019/07/26 11:42:25 Validated items: 400000 > 2019/07/26 11:42:30 Validated items: 410000 > 2019/07/26 11:42:35 Validated items: 420000 > 2019/07/26 11:42:39 Validated items: 430000 > 2019/07/26 11:42:44 Validated items: 440000 > 2019/07/26 11:42:48 Validated items: 450000 > 2019/07/26 11:42:53 Validated items: 460000 > 2019/07/26 11:42:57 Validated items: 470000 > 2019/07/26 11:43:02 Validated items: 480000 > 2019/07/26 11:43:06 Validated items: 490000 > 2019/07/26 11:43:11 Validated items: 500000 > 2019/07/26 11:43:16 Validated items: 510000 > 2019/07/26 11:43:20 Validated items: 520000 > 2019/07/26 11:43:25 Validated items: 530000 > 2019/07/26 11:43:29 Validated items: 540000 > 2019/07/26 11:43:34 Validated items: 550000 > 2019/07/26 11:43:38 Validated items: 560000 > 2019/07/26 11:43:43 Validated items: 570000 > 2019/07/26 11:43:47 Validated items: 580000 > 2019/07/26 11:43:51 Validated items: 590000 > 2019/07/26 11:43:56 Validated items: 600000 > 2019/07/26 11:44:00 Validated items: 610000 > 2019/07/26 11:44:04 Validated items: 620000 > 2019/07/26 11:44:09 Validated items: 630000 > 2019/07/26 11:44:13 Validated items: 640000 > 2019/07/26 11:44:18 Validated items: 650000 > 2019/07/26 11:44:22 Validated items: 660000 > 2019/07/26 11:44:27 Validated items: 670000 > 2019/07/26 11:44:31 Validated items: 680000 > 2019/07/26 11:44:35 Validated items: 690000 > 2019/07/26 11:44:40 Validated items: 700000 > 2019/07/26 11:44:44 Validated items: 710000 > 2019/07/26 11:44:48 Validated items: 720000 > 2019/07/26 11:44:52 Validated items: 730000 > 2019/07/26 11:44:57 Validated items: 740000 > 2019/07/26 11:45:01 Validated items: 750000 > 2019/07/26 11:45:05 Validated items: 760000 > 2019/07/26 11:45:10 Validated items: 770000 > 2019/07/26 11:45:14 Validated items: 780000 > 2019/07/26 11:45:19 Validated items: 790000 > 2019/07/26 11:45:23 Validated items: 800000 > 2019/07/26 11:45:27 Validated items: 810000 > 2019/07/26 11:45:31 Validated items: 820000 > 2019/07/26 11:45:36 Validated items: 830000 > 2019/07/26 11:45:40 Validated items: 840000 > 2019/07/26 11:45:44 Validated items: 850000 > 2019/07/26 11:45:49 Validated items: 860000 > 2019/07/26 11:45:53 Validated items: 870000 > 2019/07/26 11:45:57 Validated items: 880000 > 2019/07/26 11:46:01 Validated items: 890000 > 2019/07/26 11:46:06 Validated items: 900000 > 2019/07/26 11:46:10 Validated items: 910000 > 2019/07/26 11:46:14 Validated items: 920000 > 2019/07/26 11:46:19 Validated items: 930000 > 2019/07/26 11:46:23 Validated items: 940000 > 2019/07/26 11:46:27 Validated items: 950000 > 2019/07/26 11:46:31 Validated items: 960000 > 2019/07/26 11:46:36 Validated items: 970000 > 2019/07/26 11:46:40 Validated items: 980000 > 2019/07/26 11:46:44 Validated items: 990000 > 2019/07/26 11:46:49 Validated items: 1000000 > 2019/07/26 11:46:53 Validated items: 1010000 > 2019/07/26 11:46:57 Validated items: 1020000 > 2019/07/26 11:47:01 Validated items: 1030000 > 2019/07/26 11:47:05 Validated items: 1040000 > 2019/07/26 11:47:10 Validated items: 1050000 > 2019/07/26 11:47:14 Validated items: 1060000 > 2019/07/26 11:47:18 Validated items: 1070000 > 2019/07/26 11:47:22 Validated items: 1080000 > 2019/07/26 11:47:27 Validated items: 1090000 > 2019/07/26 11:47:31 Validated items: 1100000 > 2019/07/26 11:47:35 Validated items: 1110000 > 2019/07/26 11:47:40 Validated items: 1120000 > 2019/07/26 11:47:44 Validated items: 1130000 > 2019/07/26 11:47:49 Validated items: 1140000 > 2019/07/26 11:47:54 Validated items: 1150000 > 2019/07/26 11:47:58 Validated items: 1160000 > 2019/07/26 11:48:02 Validated items: 1170000 > 2019/07/26 11:48:07 Validated items: 1180000 > 2019/07/26 11:48:11 Validated items: 1190000 > 2019/07/26 11:48:15 Validated items: 1200000 > 2019/07/26 11:48:19 Validated items: 1210000 > 2019/07/26 11:48:24 Validated items: 1220000 > 2019/07/26 11:48:28 Validated items: 1230000 > 2019/07/26 11:48:33 Validated items: 1240000 > 2019/07/26 11:48:37 Validated items: 1250000 > 2019/07/26 11:48:42 Validated items: 1260000 > 2019/07/26 11:48:47 Validated items: 1270000 > 2019/07/26 11:48:52 Validated items: 1280000 > 2019/07/26 11:48:56 Validated items: 1290000 > 2019/07/26 11:49:01 Validated items: 1300000 > 2019/07/26 11:49:05 Validated items: 1310000 > 2019/07/26 11:49:09 Validated items: 1320000 > 2019/07/26 11:49:14 Validated items: 1330000 > 2019/07/26 11:49:18 Validated items: 1340000 > 2019/07/26 11:49:23 Validated items: 1350000 > 2019/07/26 11:49:28 Validated items: 1360000 > 2019/07/26 11:49:32 Validated items: 1370000 > 2019/07/26 11:49:37 Validated items: 1380000 > 2019/07/26 11:49:41 Validated items: 1390000 > 2019/07/26 11:49:46 Validated items: 1400000 > 2019/07/26 11:49:50 Validated items: 1410000 > 2019/07/26 11:49:55 Validated items: 1420000 > 2019/07/26 11:50:00 Validated items: 1430000 > 2019/07/26 11:50:04 Validated items: 1440000 > 2019/07/26 11:50:08 Validated items: 1450000 > 2019/07/26 11:50:12 Validated items: 1460000 > 2019/07/26 11:50:17 Validated items: 1470000 > 2019/07/26 11:50:21 Validated items: 1480000 > 2019/07/26 11:50:26 Validated items: 1490000 > 2019/07/26 11:50:30 Validated items: 1500000 > 2019/07/26 11:50:35 Validated items: 1510000 > 2019/07/26 11:50:39 Validated items: 1520000 > 2019/07/26 11:50:44 Validated items: 1530000 > 2019/07/26 11:50:49 Validated items: 1540000 > 2019/07/26 11:50:53 Validated items: 1550000 > 2019/07/26 11:50:57 Validated items: 1560000 > 2019/07/26 11:51:02 Validated items: 1570000 > 2019/07/26 11:51:06 Validated items: 1580000 > 2019/07/26 11:51:11 Validated items: 1590000 > 2019/07/26 11:51:15 Validated items: 1600000 > 2019/07/26 11:51:20 Validated items: 1610000 > 2019/07/26 11:51:24 Validated items: 1620000 > 2019/07/26 11:51:29 Validated items: 1630000 > 2019/07/26 11:51:33 Validated items: 1640000 > 2019/07/26 11:51:38 Validated items: 1650000 > 2019/07/26 11:51:42 Validated items: 1660000 > 2019/07/26 11:51:47 Validated items: 1670000 > 2019/07/26 11:51:51 Validated items: 1680000 > 2019/07/26 11:51:55 Validated items: 1690000 > 2019/07/26 11:52:00 Validated items: 1700000 > 2019/07/26 11:52:04 Validated items: 1710000 > 2019/07/26 11:52:09 Validated items: 1720000 > 2019/07/26 11:52:13 Validated items: 1730000 > 2019/07/26 11:52:17 Validated items: 1740000 > 2019/07/26 11:52:22 Validated items: 1750000 > 2019/07/26 11:52:26 Validated items: 1760000 > 2019/07/26 11:52:30 Validated items: 1770000 > 2019/07/26 11:52:34 Validated items: 1780000 > 2019/07/26 11:52:39 Validated items: 1790000 > 2019/07/26 11:52:43 Validated items: 1800000 > 2019/07/26 11:52:47 Validated items: 1810000 > 2019/07/26 11:52:51 Validated items: 1820000 > 2019/07/26 11:52:55 Validated items: 1830000 > 2019/07/26 11:52:59 Validated items: 1840000 > 2019/07/26 11:53:04 Validated items: 1850000 > 2019/07/26 11:53:08 Validated items: 1860000 > 2019/07/26 11:53:12 Validated items: 1870000 > 2019/07/26 11:53:16 Validated items: 1880000 > 2019/07/26 11:53:20 Validated items: 1890000 > 2019/07/26 11:53:24 Validated items: 1900000 > 2019/07/26 11:53:28 Validated items: 1910000 > 2019/07/26 11:53:32 Validated items: 1920000 > 2019/07/26 11:53:36 Validated items: 1930000 > 2019/07/26 11:53:41 Validated items: 1940000 > 2019/07/26 11:53:45 Validated items: 1950000 > 2019/07/26 11:53:49 Validated items: 1960000 > 2019/07/26 11:53:53 Validated items: 1970000 > 2019/07/26 11:53:58 Validated items: 1980000 > 2019/07/26 11:54:02 Validated items: 1990000 > 2019/07/26 11:54:06 Validated items: 2000000 > 2019/07/26 11:54:10 Validated items: 2010000 > 2019/07/26 11:54:14 Validated items: 2020000 > 2019/07/26 11:54:18 Validated items: 2030000 > 2019/07/26 11:54:23 Validated items: 2040000 > 2019/07/26 11:54:27 Validated items: 2050000 > 2019/07/26 11:54:31 Validated items: 2060000 > 2019/07/26 11:54:35 Validated items: 2070000 > 2019/07/26 11:54:39 Validated items: 2080000 > 2019/07/26 11:54:43 Validated items: 2090000 > 2019/07/26 11:54:48 Validated items: 2100000 > 2019/07/26 11:54:52 Validated items: 2110000 > 2019/07/26 11:54:56 Validated items: 2120000 > 2019/07/26 11:55:00 Validated items: 2130000 > 2019/07/26 11:55:04 Validated items: 2140000 > 2019/07/26 11:55:08 Validated items: 2150000 > 2019/07/26 11:55:12 Validated items: 2160000 > 2019/07/26 11:55:16 Validated items: 2170000 > 2019/07/26 11:55:21 Validated items: 2180000 > 2019/07/26 11:55:25 Validated items: 2190000 > 2019/07/26 11:55:29 Validated items: 2200000 > 2019/07/26 11:55:33 Validated items: 2210000 > 2019/07/26 11:55:37 Validated items: 2220000 > 2019/07/26 11:55:42 Validated items: 2230000 > 2019/07/26 11:55:46 Validated items: 2240000 > 2019/07/26 11:55:50 Validated items: 2250000 > 2019/07/26 11:55:55 Validated items: 2260000 > 2019/07/26 11:55:59 Validated items: 2270000 > 2019/07/26 11:56:03 Validated items: 2280000 > 2019/07/26 11:56:08 Validated items: 2290000 > 2019/07/26 11:56:12 Validated items: 2300000 > 2019/07/26 11:56:14 Validated events: 2303564 > -- snap -- > > Let me read this out loud to you. We validate about 2.3 million items, > and we spend a whopping sixteen minutes and fourty-six seconds on > validating them. > > That's crazy. > > It's not only the time, that's bad enough. It is the combined amount of > electricity and CO2 that is spent -- quite pointlessly -- on this task. > > I know that at least for Azure Pipelines, that's CO2-neutral, but that > is only a trick and a lot of money. I'd rather avoid that in the first > place. > > What the heck are we thinking, to add such a lot of wasted time to every > CI run? Why do we have to validate *that* many items? And why do we have > to use a validator that is obviously pretty slow at doing this? > > In the short run, let's fix the build. But even before the next > iteraion, let's try to find a way to avoid having to maintain a JSON > validator in Git's source code, _and_ let's avoid wasting so much time > and electricty. > > Junio, I need this patch to fix the build, would you terribly mind > applying it on top of `js/trace2-json-schema`? > > -- snipsnap -- > From 2deb3ac117768133551716c9788266671aafb445 Mon Sep 17 00:00:00 2001 > From: Johannes Schindelin <johannes.schindelin@gmx.de> > Date: Fri, 26 Jul 2019 13:18:26 +0200 > Subject: [PATCH] fixup??? ci: run trace2 schema validation in the CI suite > > That seems like an oversight when iterating the patch series. > > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> > --- > ci/run-build-and-tests.sh | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh > index 8103125849..696b6248ed 100755 > --- a/ci/run-build-and-tests.sh > +++ b/ci/run-build-and-tests.sh > @@ -24,8 +24,8 @@ linux-gcc) > export GIT_TEST_ADD_I_USE_BUILTIN=1 > make test > t/trace_schema_validator/trace_schema_validator \ > - --trace2_event_file=${GIT_TRACE2_EVENT} \ > - --schema_file=t/trace_schema_validator/strict_schema.json \ > + --trace2-event-file=${GIT_TRACE2_EVENT} \ > + --schema-file=t/trace_schema_validator/strict_schema.json \ > --progress=10000 > ;; > linux-gcc-4.8) > -- > 2.22.0.windows.1.6.g271c090e89
Hi Gábor, On Fri, 26 Jul 2019, SZEDER Gábor wrote: > On Fri, Jul 26, 2019 at 02:12:39PM +0200, Johannes Schindelin wrote: > > > Besides, I have to admit that I am quite disheartened to see these lines > > in the log: > > > > -- snip -- > > 2019/07/26 11:39:28 Validated items: 0 > > 2019/07/26 11:39:32 Validated items: 10000 > > These are necessary for Travis CI, which aborts the build if it > doesn't receive output for 10 minutes, thinking that the tests hang > somewhere. Yeah, less lines like these would be sufficient to avoid > that timeout, but, of course, the real solution would be to not > validate all of 'make test'. To be crystal clear: I was not disheartened by seeing the *amount* of lines, but by the sheer amount of time burned up just to verify a feature that only very, very few users will ever need. The entire validation's runtime rivals the one of a full run of the entire test suite. (Which _already_ takes too long to be effective. Just look how even at least one long-time contributor often simply skips running it before submitting patches, most likely because they don't want to wait.) Ciao, Dscho
On 2019.07.26 15:03, Josh Steadmon wrote: [snip] > [ajv-cli] can validate the full 1.7M line trace output in just over a > minute. Moreover, it has helpful output when validation fails. So I > would be happy to re-implement this using ajv-cli. Unfortunately, ajv on Travis is much slower than on my work machine. It still takes over 10 minutes to complete, and is killed by Travis since it doesn't provide any progress indicator while it's running. How would people feel about validating a sample of the "make test" output? In the short term we could just use command-line tools to sample the trace file; for the long-term, we could add a sampling config option for trace2 (I've been considering adding this for other reasons anyway). Ideally we would want the sample to be deterministic for any given commit, so that we don't end up with flaky tests if changes are made to trace2 while neglecting to update the schema. Since there have been some suggestions to build a standalone test and verify its trace output, let me reiterate why I feel it's useful to use "make test" instead: I do not feel that I can create a standalone test that exercises a wide enough selection of code paths to get sufficient coverage of all potential trace2 output. Trying to make a standalone test that also anticipates future development is practically impossible. Using "make test" means that I can rely on the whole community to identify important code paths, both now and in the future. As always, I am open to other approaches to make sure the schema stays up-to-date.
Josh Steadmon wrote: > On 2019.07.26 15:03, Josh Steadmon wrote: >> [ajv-cli] can validate the full 1.7M line trace output in just over a >> minute. Moreover, it has helpful output when validation fails. So I >> would be happy to re-implement this using ajv-cli. > > Unfortunately, ajv on Travis is much slower than on my work machine. It > still takes over 10 minutes to complete, and is killed by Travis since > it doesn't provide any progress indicator while it's running. Alas. What do you think of making the validation disabled by default and using a parameter (see "Running tests with special setups" in t/README) to turn it on? That way, it should be okay for it to take 10 minutes because this would only affect such specialized workers as choose to set that parameter, instead of all of them. Gábor, if we introduce such a parameter, do you think it would make sense for us to set up a worker that passes it? > How would people feel about validating a sample of the "make test" > output? In the short term we could just use command-line tools to sample > the trace file; for the long-term, we could add a sampling config option > for trace2 (I've been considering adding this for other reasons anyway). I kind of like the idea of a "make smoke" that runs some fast probabilistic tests for interactive use, but for CI runs I prefer the thoroughness of running the full testsuite. So for the problem at hand, I prefer making the test optional and reliable instead of fast and nondeterministic. > Ideally we would want the sample to be deterministic for any given > commit, so that we don't end up with flaky tests if changes are made to > trace2 while neglecting to update the schema. That would make it hard to bisect to track down bugs, so I don't like it as much. > Since there have been some suggestions to build a standalone test and > verify its trace output, let me reiterate why I feel it's useful to use > "make test" instead: I do not feel that I can create a standalone test > that exercises a wide enough selection of code paths to get sufficient > coverage of all potential trace2 output. Trying to make a standalone > test that also anticipates future development is practically impossible. > Using "make test" means that I can rely on the whole community to > identify important code paths, both now and in the future. I agree. > As always, I am open to other approaches to make sure the schema stays > up-to-date. It sounds like the codegen approach would be a nice way to do that, but that doesn't need to block this step in the right direction. Thanks for your thoughtfulness, Jonathan
Hi Jonathan & Josh, On Thu, 1 Aug 2019, Jonathan Nieder wrote: > Josh Steadmon wrote: > > On 2019.07.26 15:03, Josh Steadmon wrote: > > >> [ajv-cli] can validate the full 1.7M line trace output in just over a > >> minute. Moreover, it has helpful output when validation fails. So I > >> would be happy to re-implement this using ajv-cli. > > > > Unfortunately, ajv on Travis is much slower than on my work machine. It > > still takes over 10 minutes to complete, and is killed by Travis since > > it doesn't provide any progress indicator while it's running. > > Alas. > > What do you think of making the validation disabled by default and > using a parameter (see "Running tests with special setups" in > t/README) to turn it on? That way, it should be okay for it to take > 10 minutes because this would only affect such specialized workers as > choose to set that parameter, instead of all of them. > > Gábor, if we introduce such a parameter, do you think it would make > sense for us to set up a worker that passes it? Oh my. What do we want? To validate the JSON output. Do we want to validate it in a smart way? Yes! So how about adding a separate test script that activates Trace2, then runs a _few_ selected, very specific commands, starting with a set that Josh thinks representative, validate the generated JSON, and then only add new invocations if we *ever* see violations of the schema, to verify that we don't regress on that front again? Such a script can't take more than a couple of seconds to run, I am sure. And it would totally strike a sensible balance between benefit and time spent. Generating JSON output that respects the given schema is *such* a niche problem that I am really against punishing every person who wants to run the test suite by having to sit through 10 minutes, at the same time I do not want this feature to be totally untested by default, either. We do _not_ have to validate the 10,001st instance of a `git commit`'s corresponding JSON. The 10k invocations before that validated fine, there is little doubt that this one, as well as the following 59,724 invocations won't violate the schema, either. It's not hard to test smartly. Use a really small, representative sample. I am usually against trying to be too smart, but in this case using a really small sample (my gut feeling says: at most a dozen) is Just Smart Enough. Ciao, Johannes
Hi, Johannes Schindelin wrote: > On Thu, 1 Aug 2019, Jonathan Nieder wrote: >> What do you think of making the validation disabled by default and >> using a parameter (see "Running tests with special setups" in >> t/README) to turn it on? That way, it should be okay for it to take >> 10 minutes because this would only affect such specialized workers as >> choose to set that parameter, instead of all of them. [...] > So how about adding a separate test script that activates Trace2, then > runs a _few_ selected, very specific commands, starting with a set that > Josh thinks representative, validate the generated JSON, and then only > add new invocations if we *ever* see violations of the schema, to verify > that we don't regress on that front again? > > Such a script can't take more than a couple of seconds to run, I am > sure. And it would totally strike a sensible balance between benefit and > time spent. I think this suggestion has been covered a little upthread, but apologies for not paying it the fuller attention that it deserves. Such a test would check that * the validation code still works, the schema is well formed, etc * traces produced by common commands fit the schema That makes it very likely the right thing to do. This is a test that's cheap to run, so we can make it happen automatically as part of a normal test suite run for any developer that has ajv installed (including various CI jobs). And it unblocks getting patches 1 and 2 from this series in. So, basically, you're right. :) That said, it still leaves an unmet need that is important to us. We cannot trust the JSON output and rely on it without a more exhaustive test that the schema is accurate. It is very easy to add a new field or event type that only shows up in a specialized code path, without remembering to update the schema or docs, because of the nature of how the trace2 API works. In this thread, you can already see some of the fruit of this more exhaustive approach: - it identifies the tests that write invalid UTF-8 to the traces, illustrating an issue with the format (i.e., a real bug) that we can now be more aware of - it identified a field that had been renamed in code where we had forgotten to update the documentation The exhaustive approach really helps. Arguing against it kind of feels like saying "leak checkers are great, but why run one on the full test suite instead of a dedicated tool that exercises the code paths where you expect to find leaks?" In the short term, we can run tests internally to check that Git keeps following the schema. Let's not block patches 1 and 2 by this --- instead, the simple basic functionality test you're describing seems like a good way to make progress for the moment. We can upstream it later. Thanks, Jonathan
On Thu, Aug 01, 2019 at 06:52:47PM -0700, Jonathan Nieder wrote: > What do you think of making the validation disabled by default and > using a parameter (see "Running tests with special setups" in > t/README) to turn it on? That way, it should be okay for it to take > 10 minutes because this would only affect such specialized workers as > choose to set that parameter, instead of all of them. That's kind of what the third patch in this series already does, as it performs this schema validation only in the 'linux-gcc' build job. > Gábor, if we introduce such a parameter, do you think it would make > sense for us to set up a worker that passes it? That would be even worse than the current approach of the third patch, because the additional worker would have to install dependencies, build Git and run the test suite, in addition to the enormous overhead of redundantly validating the trace output of every git command executed during 'make test'. So instead of adding "only" 10 minutes to every build, it would add over 20.
On Fri, Aug 02, 2019 at 09:59:13AM -0700, Jonathan Nieder wrote: > The exhaustive approach really helps. Arguing against it kind of > feels like saying "leak checkers are great, but why run one on the > full test suite instead of a dedicated tool that exercises the code > paths where you expect to find leaks?" Not at all. We don't run any leak checker on the full test suite as part of each and every CI build, do we? And just like no one is arguing against you running a leak checker on the full test suite, no one is arguing against you running a JSON schema validation on the full test suite, either. > In the short term, we can run tests internally to check that Git keeps > following the schema. Let's not block patches 1 and 2 by this --- To my understanding patch 2 is only a proof of concept: it starts using a programming language that has not been used before in this project, to implement functionality that is readily available in several existing tools, without even arguing (let alone convincingly arguing) in the commit message why this approach is a good idea.
SZEDER Gábor wrote: > On Thu, Aug 01, 2019 at 06:52:47PM -0700, Jonathan Nieder wrote: >> Gábor, if we introduce such a parameter, do you think it would make >> sense for us to set up a worker that passes it? > > That would be even worse than the current approach of the third patch, > because the additional worker would have to install dependencies, > build Git and run the test suite, in addition to the enormous overhead > of redundantly validating the trace output of every git command > executed during 'make test'. So instead of adding "only" 10 minutes > to every build, it would add over 20. Thanks, that's helpful to know. It sounds like if we want to run this kind of expensive test in CI, we would want to set it up differently: e.g. daily runs against "pu" instead of running on every push. Jonathan
SZEDER Gábor wrote: > On Fri, Aug 02, 2019 at 09:59:13AM -0700, Jonathan Nieder wrote: >> In the short term, we can run tests internally to check that Git keeps >> following the schema. Let's not block patches 1 and 2 by this --- > > To my understanding patch 2 is only a proof of concept: it starts > using a programming language that has not been used before in this > project, to implement functionality that is readily available in > several existing tools, without even arguing (let alone convincingly > arguing) in the commit message why this approach is a good idea. Well, Golang has been used in contrib/ before. ;-) If I understand [1] and [2] correctly, we haven't found an existing standalone tool that is able to efficiently validate a high volume of traces. But for the purpose of sanity-checking that running a few typical commands generates a trace that fits the schema, it's not important that the validator be super fast. So we can use a tool like jq, picking one using the criteria that it - allows performing the validation without too much fuss - is likely to already be present on some developers' machines Thanks, Jonathan [1] https://public-inbox.org/git/20190726220348.GF43313@google.com/ [2] https://public-inbox.org/git/cover.1564009259.git.steadmon@google.com/
On Fri, Aug 02, 2019 at 04:06:50PM -0700, Jonathan Nieder wrote: > SZEDER Gábor wrote: > > On Thu, Aug 01, 2019 at 06:52:47PM -0700, Jonathan Nieder wrote: > > >> Gábor, if we introduce such a parameter, do you think it would make > >> sense for us to set up a worker that passes it? > > > > That would be even worse than the current approach of the third patch, > > because the additional worker would have to install dependencies, > > build Git and run the test suite, in addition to the enormous overhead > > of redundantly validating the trace output of every git command > > executed during 'make test'. So instead of adding "only" 10 minutes > > to every build, it would add over 20. > > Thanks, that's helpful to know. > > It sounds like if we want to run this kind of expensive test in CI, we > would want to set it up differently: e.g. daily runs against "pu" > instead of running on every push. OK, I think that sounds acceptable, though I would suggest to: - Limit it to 'git/git's 'pu' branch, so others can have their own 'pu' branch without suffering from the consequences. It seems easy to do so, on Travis CI these are available in the TRAVIS_BRANCH and TRAVIS_REPO_SLUG environment variables. I'm not sure what are the equivalent variables on Azure Pipelines, though. - Additionally, check an environment variable with a name like GIT_CI_RUN_EXPENSIVE_JSON_SCHEMA_VALIDATION or something, and if set, then, well, run the schema validation anyway, no matter what repo and what branch is being checked. This way if a contributor cares about schema validation that deeply, then they can set this env var in Travis CI's repo settings panel for all or for specific branches, and thus enable validation on those branches of their git fork, without having to modify any files in the repository.
On Sat, Aug 03, 2019 at 09:35:49AM +0200, SZEDER Gábor wrote: > - Limit it to 'git/git's 'pu' branch, so others can have their own > 'pu' branch without suffering from the consequences. > It seems easy to do so, on Travis CI these are available in the > TRAVIS_BRANCH and TRAVIS_REPO_SLUG environment variables. > I'm not sure what are the equivalent variables on Azure Pipelines, > though. Ah, I forgot that we are kind of CI-system-agnostic these days, and our 'ci/lib.sh' sets the common CI_BRANCH and CI_REPO_SLUG environment variables both on Travis CI and Azure Pipelines.
Hi Jonathan, On Fri, 2 Aug 2019, Jonathan Nieder wrote: > SZEDER Gábor wrote: > > On Fri, Aug 02, 2019 at 09:59:13AM -0700, Jonathan Nieder wrote: > > >> In the short term, we can run tests internally to check that Git keeps > >> following the schema. Let's not block patches 1 and 2 by this --- > > > > To my understanding patch 2 is only a proof of concept: it starts > > using a programming language that has not been used before in this > > project, to implement functionality that is readily available in > > several existing tools, without even arguing (let alone convincingly > > arguing) in the commit message why this approach is a good idea. > > Well, Golang has been used in contrib/ before. ;-) Yes, and the rules for contrib/ are a lot more lenient than for the core of Git for a purpose: we tried to invite a lot of projects into that folder that would otherwise have fallen undiscoverable and/or unmaintained (this was before GitHub, and I would argue that the contrib/ directory's original purpose was kinda outlived by the quite popular Git hosters out there, with some things like the completions probably wanting to be promoted to a better directory, i.e. a location that does suggest that it is maintained within git.git proper, and not an unbeloved step-child, and others simply be moved into their own public repository). But as both you and I know, the patch in question is not at all about contrib/. It is about a part that the Git project is asked to maintain, because it is now part of the CI build, and even worse: it is now expected that the contributors whose patches break the CI build are able to investigate, even if it is a bug in that very new validator in that very language that we did not require contributors to know beforehand. It is easy for old Unix graybeards to forget that they, too, struggled with new languages when they started out, and nowadays it is even harder, as you are kinda stretched between languages and frameworks and new techniques like multi-threading, closures, algebraic effects, etc that Unix graybeards will never truly understand. Just as a reminder: if you truly want to understand, or even only debug, or even only scratch your own itch with, core Git, you have to know - C - Unix shell scripting (POSIX-compliant, including the vagaries of the options/particulars of sed, xargs, etc, in particular which ones are *not* limited to the GNU flavors) - Perl - GNU Makefile (you better understand macros, and the difference between `=` and `:=`, and `:` and `::`!) - Javascript - Markdown - AsciiDoc (we seriously do expect contributors to know what AsciiDoc makes of a line consisting of a single `+` character!) - Python - RUby (for AsciiDoctor extensions) - Tcl And that's just core Git. You won't hear me complaining too much about the above, as I am fluent enough in all of the above, and since it is a relatively rare skillset, it makes it easy for me to keep my job. At the same time, you do realize that this makes it harder than necessary to get contributions? That it fosters the kind of environment that makes every potential contributor feel stupid, when it is totally not their fault that _we_ created this environment? Adding a new language to the fray, even something as universally respected as Rust, would be a hard sell. Golang is much, much harder a sell. I know that you, working at Google, might think that Go is a mainstream language and not even controversial, but that just looks a bit biased: Go is very much a language controlled by Google, much like .NET Framework (although not as much .NET Core) is controlled by Microsoft. Therefore, I would have considered it to go without saying that both Go and C# are obviously bad choices for _new_ contributions to core Git, in particular if they are intended to become part of the CI-tested portion of git.git. It just looks a bit like forcing your stuff onto others. I constantly push back against trying to contribute C# code for that reason. > If I understand [1] and [2] correctly, we haven't found an existing > standalone tool that is able to efficiently validate a high volume of > traces. Gábor put it very aptly: the problem is the high volume. You are free to test this as part of your development, but to push this high volume testing (for almost nobody's benefit) down the throat of git.git's regular CI is a bit... much. > But for the purpose of sanity-checking that running a few typical > commands generates a trace that fits the schema, it's not important > that the validator be super fast. So we can use a tool like jq, > picking one using the criteria that it > > - allows performing the validation without too much fuss > - is likely to already be present on some developers' machines I don't really care what you use, as long as it is something we don't need to maintain ourselves, and as long as it is well-maintained in its own right. `jq` seems to fit that bill. I do care as soon as you increase the average runtime of the CI build by a _double-digit_ percentage, especially if it is for the purpose of a feature that your team and my team and maybe two or five other teams might care about, i.e. maybe even as much as 50 people in total, but not the millions and millions of Git users out there. We _cannot_ penalize the CI build for a feature a couple of dozen users want and nobody else, no matter how useful the feature _to them_. It would have been a totally different matter if the contribution would have come as a patch to contrib/, with a few test cases marked up with a prereq that is off unless a user sets a specific environment variable, so that you could run your integration tests, and nobody else would have to pay the price. But that's not what was proposed, and that's why I object. Ciao, Dscho