Message ID | pull.1746.git.git.1721061218993.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | userdiff: add builtin diff driver for TypeScript language | expand |
On Mon, Jul 15, 2024 at 04:33:38PM +0000, Matthew Hughes via GitGitGadget wrote: > diff --git a/userdiff.c b/userdiff.c > index c4ebb9ff734..7247d351cde 100644 > --- a/userdiff.c > +++ b/userdiff.c > @@ -333,6 +333,17 @@ PATTERNS("scheme", > "|([^][)(}{[ \t])+"), > PATTERNS("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$", > "\\\\[a-zA-Z@]+|\\\\.|([a-zA-Z0-9]|[^\x01-\x7f])+"), > +PATTERNS("typescript", > + "^[\t ]*((class|constructor|public|private|protected|function|interface)[ \t].*)$\n" > + // arrow funcs > + "^[\t ]*((const|let|var)?[^()]*)=[\t ]*\\([^()]*\\)[\t ]*.*=>.*$", > + /* -- */ > + "[a-zA-Z_][a-zA-Z0-9_]*" > + // numeric constants > + "|[-+0-9.e]+|0[xX]?[0-9a-fA-F]" > + // operators > + "|[-+*/<>%&^|=!]" > + "|--|\\+\\+|//=?|<<=?|>>?=?"), > { .name = "default", .binary = -1 }, > }; > #undef PATTERNS > > base-commit: a7dae3bdc8b516d36f630b12bb01e853a667e0d9 > -- > gitgitgadget This needs some updates. For the arrow function, definitions can cover multiple lines e.g.: const bar = ( name: string ) => console.log(name) The funcname pattern should also consider the `export` keyword, since both of the following are valid: export const bar = ( name: string ) => console.log(name) export function foo() {} Some docs: https://www.typescriptlang.org/docs/handbook/modules/reference.html#module-syntax
Matthew Hughes <matthewhughes934@gmail.com> writes:
> This needs some updates.
What does it mean?
The patterns that were posted were so broken that they are unusable
and harm the users by giving misleading information?
Or do the patterns work just fine in basic or tutorial cases, but
with more advanced or realistic uses of the language construct, they
highlight wrong lines as the function header and/or split at wrong
word boundaries that are obviously much less optimal than ideal that
any human users would find questionable?
In the latter case, how far from the ideal are the decisions done by
the current patterns, and what's the rough percentage of usual code
we see in the real world, for which the current patterns do not work
well?
What I am trying to gauge is if it is so broken that it should not
exist (in other words, you regret sending the patch to the list
before doing these updates), or is "already serviceable, but not
perfect yet". Waiting for perfection takes forever. If the latter,
letting the general public to use it to gather feedbacks by waiting
for the dust to settle before making such updates is often better.
Oe Tue, Jul 16, 2024 at 08:45:08AM -0700, Junio C Hamano wrote: > What does it mean? > > The patterns that were posted were so broken that they are unusable > and harm the users by giving misleading information? I think this would be a good summary. It's sufficient for some simpler cases considered, and does even give some benefits e.g. for function headers for nested functions. However, the cases where it fails can be significant, e.g. hundreds of lines away from the correct function header for files with multiple consecutive multi-line arrow functions. > In the latter case, how far from the ideal are the decisions done by > the current patterns, and what's the rough percentage of usual code > we see in the real world, for which the current patterns do not work > well? I think just the missing `export` keyword handling would be equivalent to missing all public functions in other programming languages, so that alone would be a decent percentage. > What I am trying to gauge is if it is so broken that it should not > exist (in other words, you regret sending the patch to the list > before doing these updates), or is "already serviceable, but not > perfect yet". Waiting for perfection takes forever. If the latter, > letting the general public to use it to gather feedbacks by waiting > for the dust to settle before making such updates is often better. I'm leaning towards the former case: that this patch was premature. I think it's far enough from perfect that it would greatly benefit from me more actively reaching out to the TypeScript language team and asking some devs there try out the changes and gather some more input (and identify some more missing cases, of which I now expect that are many) before getting something out to general users.
We have had a submission for typescript just recently: https://lore.kernel.org/git/20240404163827.5855-1-utsavp0213@gmail.com/ And two for Javascript https://lore.kernel.org/git/20240301074048.188835-1-sergiusnyah@gmail.com/ https://lore.kernel.org/git/20220403132508.28196-1-a97410985new@gmail.com/ Please review these earlier submissions. If you think you can improve on them, you are very welcome to do so. But you can also just resend one of these series with a note that they are sufficiently mature and that you support the submitted version. I may very well mistaken, but I think that Typescript is a syntactical superset of Javascript, so we need just one language driver and mention in the documentation that it can be used for both languages. Thanks, -- Hannes
Matthew Hughes <matthewhughes934@gmail.com> writes:
> I'm leaning towards the former case: that this patch was premature.
OK. Then let's take sufficient time. After all, we are never in a
hurry ;-)
Thanks for giving an honest assessment. Will keep the topic in
'seen' without marking it for 'next' (at least until it gets
replaced with a version that is more suitable to the public).
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt index e6150595af8..88ebfb2e6c9 100644 --- a/Documentation/gitattributes.txt +++ b/Documentation/gitattributes.txt @@ -907,6 +907,8 @@ patterns are available: - `tex` suitable for source code for LaTeX documents. +- `typescript` suitable for source code in the TypeScript language. + Customizing word diff ^^^^^^^^^^^^^^^^^^^^^ diff --git a/t/t4018/typescript-arrow-func b/t/t4018/typescript-arrow-func new file mode 100644 index 00000000000..c884c2e6bdc --- /dev/null +++ b/t/t4018/typescript-arrow-func @@ -0,0 +1,7 @@ +const RIGHT = (a: number) : number => { + const res = a + 1 + + // some comment + console.log("ChangeMe") + return res +} diff --git a/t/t4018/typescript-class b/t/t4018/typescript-class new file mode 100644 index 00000000000..a373f7af1dd --- /dev/null +++ b/t/t4018/typescript-class @@ -0,0 +1,8 @@ +class RIGHT { + // some + // comments + // ChangeMe + constructor() { + console.log("constructing...") + } +} diff --git a/t/t4018/typescript-func b/t/t4018/typescript-func new file mode 100644 index 00000000000..94cc659a452 --- /dev/null +++ b/t/t4018/typescript-func @@ -0,0 +1,7 @@ +function RIGHT(a: number) : number { + const res = a + 1 + + // some comment + console.log("ChangeMe") + return res +} diff --git a/t/t4018/typescript-indented-func b/t/t4018/typescript-indented-func new file mode 100644 index 00000000000..fee0373126e --- /dev/null +++ b/t/t4018/typescript-indented-func @@ -0,0 +1,7 @@ +class Foo { + public RIGHT() { + // some + // comments + console.log("ChangeMe") + } +} diff --git a/t/t4018/typescript-interface b/t/t4018/typescript-interface new file mode 100644 index 00000000000..78d14523446 --- /dev/null +++ b/t/t4018/typescript-interface @@ -0,0 +1,5 @@ +interface RIGHT { + name: string; + // some comment + ChangeMe: number; +} diff --git a/t/t4018/typescript-nested-arrow-func b/t/t4018/typescript-nested-arrow-func new file mode 100644 index 00000000000..52dbc6a92e5 --- /dev/null +++ b/t/t4018/typescript-nested-arrow-func @@ -0,0 +1,7 @@ +class Foo { + RIGHT = () => { + // some + // comment + console.log("ChangeMe") + } +} diff --git a/t/t4034-diff-words.sh b/t/t4034-diff-words.sh index 74586f3813c..4e3cf415c23 100755 --- a/t/t4034-diff-words.sh +++ b/t/t4034-diff-words.sh @@ -338,6 +338,7 @@ test_language_driver python test_language_driver ruby test_language_driver scheme test_language_driver tex +test_language_driver typescript test_expect_success 'word-diff with diff.sbe' ' cat >pre <<-\EOF && diff --git a/t/t4034/typescript/expect b/t/t4034/typescript/expect new file mode 100644 index 00000000000..fde27423339 --- /dev/null +++ b/t/t4034/typescript/expect @@ -0,0 +1,42 @@ +<BOLD>diff --git a/pre b/post<RESET> +<BOLD>index 6a5ba06c60..db4788b1a6 100644<RESET> +<BOLD>--- a/pre<RESET> +<BOLD>+++ b/post<RESET> +<CYAN>@@ -1,24 +1,24 @@<RESET> +console.log("Hello world<RED>!<RESET><GREEN>?<RESET>") +foo===<RED>bar<RESET><GREEN>buz<RESET> +foo!==<RED>bar<RESET><GREEN>buz<RESET> +function fn() : <RED>number<RESET><GREEN>string<RESET> { return <RED>1<RESET><GREEN>"ok"<RESET> } +<RED>a<RESET><GREEN>x<RESET>.someFunc() +const <RED>a<RESET><GREEN>x<RESET> = () => {} +<GREEN>(<RESET>1<GREEN>) (<RESET>-1e10<GREEN>) (<RESET>0xabcdef<GREEN>)<RESET> '<RED>x<RESET><GREEN>y<RESET>' +"<RED>x<RESET><GREEN>y<RESET>" +[<RED>a<RESET><GREEN>x<RESET>] <RED>a<RESET><GREEN>x<RESET>.<RED>b<RESET><GREEN>y<RESET> +!<RED>a<RESET><GREEN>x<RESET> ~<RED>a a<RESET><GREEN>x x<RESET>++ <RED>a<RESET><GREEN>x<RESET>-- <RED>a<RESET><GREEN>x<RESET>*<RED>b a<RESET><GREEN>y x<RESET>&<RED>b<RESET> +<RED>a<RESET><GREEN>y<RESET> +<GREEN>x<RESET>*<RED>b a<RESET><GREEN>y x<RESET>/<RED>b a<RESET><GREEN>y x<RESET>%<RED>b<RESET> +<RED>a<RESET><GREEN>y<RESET> +<GREEN>x<RESET>+<RED>b a<RESET><GREEN>y x<RESET>-<RED>b<RESET> +<RED>a<RESET><GREEN>y<RESET> +<GREEN>x<RESET><<<RED>b a<RESET><GREEN>y x<RESET>>><RED>b<RESET> +<RED>a<RESET><GREEN>y<RESET> +<GREEN>x<RESET>>>>=<RED>b a<RESET><GREEN>y x<RESET>>>><RED>b<RESET> +<RED>a<RESET><GREEN>y<RESET> +<GREEN>x<RESET><<RED>b a<RESET><GREEN>y x<RESET><=<RED>b a<RESET><GREEN>y x<RESET>><RED>b a<RESET><GREEN>y x<RESET>>=<RED>b<RESET> +<RED>a<RESET><GREEN>y<RESET> +<GREEN>x<RESET>==<RED>b a<RESET><GREEN>y x<RESET>!=<RED>b<RESET> +<RED>a<RESET><GREEN>y<RESET> +<GREEN>x<RESET>&<RED>b<RESET> +<RED>a<RESET><GREEN>y<RESET> +<GREEN>x<RESET>^<RED>b<RESET> +<RED>a<RESET><GREEN>y<RESET> +<GREEN>x<RESET>|<RED>b<RESET> +<RED>a<RESET><GREEN>y<RESET> +<GREEN>x<RESET>&&<RED>b<RESET> +<RED>a<RESET><GREEN>y<RESET> +<GREEN>x<RESET>||<RED>b<RESET> +<RED>a<RESET><GREEN>y<RESET> +<GREEN>x<RESET>?<RED>b<RESET><GREEN>y<RESET>:z +<RED>a<RESET><GREEN>x<RESET>=<RED>b a<RESET><GREEN>y x<RESET>+=<RED>b a<RESET><GREEN>y x<RESET>-=<RED>b a<RESET><GREEN>y x<RESET>*=<RED>b a<RESET><GREEN>y x<RESET>/=<RED>b a<RESET><GREEN>y x<RESET>%=<RED>b a<RESET><GREEN>y x<RESET><<=<RED>b a<RESET><GREEN>y x<RESET>>>=<RED>b a<RESET><GREEN>y x<RESET>&=<RED>b a<RESET><GREEN>y x<RESET>^=<RED>b a<RESET><GREEN>y x<RESET>|=<RED>b<RESET> +<RED>a<RESET><GREEN>y<RESET> +<GREEN>x<RESET>,y diff --git a/t/t4034/typescript/post b/t/t4034/typescript/post new file mode 100644 index 00000000000..4aca03f8ce6 --- /dev/null +++ b/t/t4034/typescript/post @@ -0,0 +1,24 @@ +console.log("Hello world?") +foo===buz +foo!==buz +function fn() : string { return "ok" } +x.someFunc() +const x = () => {} +(1) (-1e10) (0xabcdef) 'y' +"y" +[x] x.y +!x ~x x++ x-- x*y x&y +x*y x/y x%y +x+y x-y +x<<y x>>y +x>>>=y x>>>y +x<y x<=y x>y x>=y +x==y x!=y +x&y +x^y +x|y +x&&y +x||y +x?y:z +x=y x+=y x-=y x*=y x/=y x%=y x<<=y x>>=y x&=y x^=y x|=y +x,y diff --git a/t/t4034/typescript/pre b/t/t4034/typescript/pre new file mode 100644 index 00000000000..ef654a92c61 --- /dev/null +++ b/t/t4034/typescript/pre @@ -0,0 +1,24 @@ +console.log("Hello world!") +foo===bar +foo!==bar +function fn() : number { return 1 } +a.someFunc() +const a = () => {} +1 -1e10 0xabcdef 'x' +"x" +[a] a.b +!a ~a a++ a-- a*b a&b +a*b a/b a%b +a+b a-b +a<<b a>>b +a>>>=b a>>>b +a<b a<=b a>b a>=b +a==b a!=b +a&b +a^b +a|b +a&&b +a||b +a?b:z +a=b a+=b a-=b a*=b a/=b a%=b a<<=b a>>=b a&=b a^=b a|=b +a,y diff --git a/userdiff.c b/userdiff.c index c4ebb9ff734..7247d351cde 100644 --- a/userdiff.c +++ b/userdiff.c @@ -333,6 +333,17 @@ PATTERNS("scheme", "|([^][)(}{[ \t])+"), PATTERNS("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$", "\\\\[a-zA-Z@]+|\\\\.|([a-zA-Z0-9]|[^\x01-\x7f])+"), +PATTERNS("typescript", + "^[\t ]*((class|constructor|public|private|protected|function|interface)[ \t].*)$\n" + // arrow funcs + "^[\t ]*((const|let|var)?[^()]*)=[\t ]*\\([^()]*\\)[\t ]*.*=>.*$", + /* -- */ + "[a-zA-Z_][a-zA-Z0-9_]*" + // numeric constants + "|[-+0-9.e]+|0[xX]?[0-9a-fA-F]" + // operators + "|[-+*/<>%&^|=!]" + "|--|\\+\\+|//=?|<<=?|>>?=?"), { .name = "default", .binary = -1 }, }; #undef PATTERNS