Message ID | 20230204134329.251451-1-rybak.a.v@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | userdiff: Java updates | expand |
Am 04.02.23 um 14:43 schrieb Andrei Rybak: > On 04/02/2023 10:22, Tassilo Horn wrote: >> Thanks for including me being the last contributor to java userdiff. >> The patches look good from my POV and are safe-guarded with tests, so >> I'm all for it. > > Thank you for review! > > I've realized that I've been writing modifiers "abstract" and "sealed" in a > technically correct, but not the conventional order. Here's a reroll with the > order of modifiers following the style of original authors of > https://openjdk.org/jeps/409. It doesn't matter for the purposes of the test, > but it will be less annoying to any future readers :-) I've looked through the patches and run the tests, and they all make sense to me. By just looking at the patch text I noted that no whitespace between the identifier and the opening angle bracket is permitted and whether it should be allowed, but the commit messages make quite clear that whitespace is not allowed in this position. Hence: Reviewed-by: Johannes Sixt <j6t@kdbg.org> Thanks, -- Hannes
On 2023-02-05T11:09, Johannes Sixt wrote: > Am 04.02.23 um 14:43 schrieb Andrei Rybak: >> On 04/02/2023 10:22, Tassilo Horn wrote: >>> Thanks for including me being the last contributor to java userdiff. >>> The patches look good from my POV and are safe-guarded with tests, so >>> I'm all for it. >> >> Thank you for review! >> >> I've realized that I've been writing modifiers "abstract" and "sealed" in a >> technically correct, but not the conventional order. Here's a reroll with the >> order of modifiers following the style of original authors of >> https://openjdk.org/jeps/409. It doesn't matter for the purposes of the test, >> but it will be less annoying to any future readers :-) > > I've looked through the patches and run the tests, and they all make > sense to me. By just looking at the patch text I noted that no > whitespace between the identifier and the opening angle bracket is > permitted and whether it should be allowed, but the commit messages make > quite clear that whitespace is not allowed in this position. There is some kind of misunderstanding. I guess the wording in commit messages of the first and second patches could have been clearer. In Java, whitespace is allowed between type name and the brackets. It is permitted both for angle brackets of type parameters: class SpacesBeforeTypeParameters <A, B> { } and for round brackets of components in records: record SpacesBeforeComponents (String comp1, int comp2) { } The common convention, is however, to omit the whitespace before the brackets. The regular expression on branch master already allows for whitespace after the name of the type: "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface)[ \t]+[A-Za-z][A-Za-z0-9_$]*[ \t]+.*)$\n" ^^^^^^ so I didn't need to cover this case. Note that it requires a non-zero amount of whitespace. This part of the regular expression was left as is (v2 after patch 3/3): "^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ \t]+[A-Za-z][A-Za-z0-9_$]*([ \t]+|[<(]).*)$\n" ^^^^^^ That being said, I guess it would be an improvement to also allow the name of the type be followed by the end of the line, for users with fairly common code style that puts braces on separate lines: class WithLineBreakBeforeOpeningBrace { } or `extends` and `implements` clauses after a line break: class ExtendsOnSeparateLine extends Number implements Serializable { } even type parameters: class TypeParametersOnSeparateLine <A, B> { } Something like the following: "^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ \t]+[A-Za-z][A-Za-z0-9_$]*(([ \t]+|[<(]).*)?)$\n" ^ ^^ perhaps? Technically, the following is also valid Java: class WithComment//comment immediately after class name { } but I'm not sure if allowing it is needed. If so, we might as well just do this: "^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ \t]+[A-Za-z][A-Za-z0-9_$]*.*)$\n" ^^
Am 05.02.23 um 20:27 schrieb Andrei Rybak: > On 2023-02-05T11:09, Johannes Sixt wrote: >> Am 04.02.23 um 14:43 schrieb Andrei Rybak: >>> On 04/02/2023 10:22, Tassilo Horn wrote: >>>> Thanks for including me being the last contributor to java userdiff. >>>> The patches look good from my POV and are safe-guarded with tests, so >>>> I'm all for it. >>> >>> Thank you for review! >>> >>> I've realized that I've been writing modifiers "abstract" and >>> "sealed" in a >>> technically correct, but not the conventional order. Here's a reroll >>> with the >>> order of modifiers following the style of original authors of >>> https://openjdk.org/jeps/409. It doesn't matter for the purposes of >>> the test, >>> but it will be less annoying to any future readers :-) >> >> I've looked through the patches and run the tests, and they all make >> sense to me. By just looking at the patch text I noted that no >> whitespace between the identifier and the opening angle bracket is >> permitted and whether it should be allowed, but the commit messages make >> quite clear that whitespace is not allowed in this position. > > There is some kind of misunderstanding. I guess the wording in commit > messages of the first and second patches could have been clearer. > > In Java, whitespace is allowed between type name and the brackets. > It is permitted both for angle brackets of type parameters: > > class SpacesBeforeTypeParameters <A, B> { > } > > and for round brackets of components in records: > > record SpacesBeforeComponents (String comp1, int comp2) { > } > > The common convention, is however, to omit the whitespace before the > brackets. > > The regular expression on branch master already allows for whitespace > after the name of the type: > > "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface)[ > \t]+[A-Za-z][A-Za-z0-9_$]*[ \t]+.*)$\n" > ^^^^^^ > so I didn't need to cover this case. Note that it requires a non-zero > amount of whitespace. This part of the regular expression was left as > is (v2 after patch 3/3): > > "^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ > \t]+[A-Za-z][A-Za-z0-9_$]*([ \t]+|[<(]).*)$\n" > ^^^^^^ > > > That being said, I guess it would be an improvement to also allow > the name of the type be followed by the end of the line, for users > with fairly common code style that puts braces on separate lines: > > class WithLineBreakBeforeOpeningBrace > { > } > > or `extends` and `implements` clauses after a line break: > > class ExtendsOnSeparateLine > extends Number > implements Serializable > { > } > > even type parameters: > > class TypeParametersOnSeparateLine > <A, B> > { > } > > Something like the following: > > "^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ > \t]+[A-Za-z][A-Za-z0-9_$]*(([ \t]+|[<(]).*)?)$\n" > ^ ^^ > perhaps? Technically, the following is also valid Java: > > class WithComment//comment immediately after class name > { > } > > but I'm not sure if allowing it is needed. If so, we might as well just > do this: > > "^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ > \t]+[A-Za-z][A-Za-z0-9_$]*.*)$\n" > ^^ Having seen all these examples, I think the following truncated expression might do the right thing for all cases that are valid Java: "^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ \t].*)$" i.e., we recognize a whitespace in order to identify the keyword, and then capture anything that follows without being specific. My reasoning is that "class", "enum", "interface", and "record" cannot occur in any other context than the beginning of a class definition. (But please do correct me; I know next to nothing about Java syntax.) As always, userdiff regular expressions can assume that only valid constructs are inspected. -- Hannes