[RFC,1/1] Add separator lines into `git log --graph`.

Message ID 20240407051031.6018-2-leduyquang753@gmail.com (mailing list archive)
State New, archived
Series Add lines to `git log --graph` to separate connected regions

Commit Message

Quang Lê Duy April 7, 2024, 5:10 a.m. UTC
This is to separate out connected regions of the resulting commit graph so as
to not have them confused as belonging to the same timeline.
---
 graph.c                                |  55 +++++++++++-
 t/t4218-log-graph-connected-regions.sh | 119 +++++++++++++++++++++++++
 2 files changed, 170 insertions(+), 4 deletions(-)
 create mode 100755 t/t4218-log-graph-connected-regions.sh

Comments

Eric Sunshine April 7, 2024, 5:47 a.m. UTC | #1
On Sun, Apr 7, 2024 at 1:10 AM Lê Duy Quang <leduyquang753@gmail.com> wrote:
> This is to separate out connected regions of the resulting commit graph so as
> to not have them confused as belonging to the same timeline.
> ---

I'm not particularly a user of --graph, so I don't necessarily have an
opinion about the utility of this change or its mechanics, but I can
make a few observations to help you improve the patch and increase its
chances of being accepted.

First, move the information from the cover letter into the commit
message of the patch itself since that information will be helpful to
future readers of the patch if it becomes part of the permanent
history.

Second, following Documentation/SubmittingPatches guidelines, the
subject could instead be written something like this:

    log: visually separate `git log --graph` regions

Third, add a Signed-off-by: trailer after the commit message (see
SubmittingPatches).

> diff --git a/graph.c b/graph.c
> @@ -729,9 +742,9 @@ static int graph_num_expansion_rows(struct git_graph *graph)
>  static int graph_needs_pre_commit_line(struct git_graph *graph)
>  {
> -       return graph->num_parents >= 3 &&
> +       return graph->connected_region_state == CONNECTED_REGION_NEW_REGION || (graph->num_parents >= 3 &&

Style: This line is overly long and should be wrapped; we aim (as much
as possible) to fit within an 80-column limit.

>                graph->commit_index < (graph->num_columns - 1) &&
> -              graph->expansion_row < graph_num_expansion_rows(graph);
> +              graph->expansion_row < graph_num_expansion_rows(graph));
>  void graph_update(struct git_graph *graph, struct commit *commit)
> @@ -760,6 +773,12 @@ void graph_update(struct git_graph *graph, struct commit *commit)
> +
> +       /*
> +        * Determine whether this commit belongs to a new connected region.
> +        */
> +       graph->connected_region_state = (graph->connected_region_state != CONNECTED_REGION_FIRST_COMMIT &&
> +               graph->num_new_columns == 0) ? CONNECTED_REGION_NEW_REGION : CONNECTED_REGION_USE_CURRENT;

Style: overly long lines

> +static void graph_output_separator_line(struct git_graph *graph, struct graph_line *line)
> +{
> +       /*
> +        * This function adds a row that separates two disconnected graphs,
> +        * as the appearance of multiple separate commits on top of each other
> +        * may cause a misunderstanding that they belong to a timeline.
> +        */

This comment seems to explain the purpose of the function itself. As
such, it should precede the function definition rather than being
embedded within it.

> +       assert(graph->connected_region_state == CONNECTED_REGION_NEW_REGION);

We tend to use BUG() rather than assert():

    if (graph->connected_region_state != CONNECTED_REGION_NEW_REGION)
        BUG("explain the failure here");

> +       /*
> +        * Output the row.
> +        */
> +       graph_line_addstr(line, "---");

The code itself is obvious enough without the comment, so the comment
is mere noise, thus should be dropped.

> +       /*
> +        * Immediately move to GRAPH_COMMIT state as there for sure aren't going to be
> +        * any more pre-commit lines.
> +        */
> +       graph_update_state(graph, GRAPH_COMMIT);
> +}
> diff --git a/t/t4218-log-graph-connected-regions.sh b/t/t4218-log-graph-connected-regions.sh
> new file mode 100755

We typically try to avoid creating new test scripts if an existing
script would be a logical place to house the new tests. I haven't
personally checked if such a script already exists, but if so, it
would be good to add new tests to it. If not, then creating a new
script, as you do here, may be fine.

> @@ -0,0 +1,119 @@
> +#!/bin/sh
> +
> +test_description="git log --graph connected regions"
> +
> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> +
> +. ./test-lib.sh
> +. "$TEST_DIRECTORY/lib-terminal.sh"
> +. "$TEST_DIRECTORY/lib-log-graph.sh"

"lib-terminal.sh" doesn't seem to be needed by these tests.

> +test_cmp_graph () {
> +       lib_test_cmp_graph --format=%s "$@"
> +}
> +
> +add_commit () {
> +       touch $1 &&

If the timestamp of the empty file being created is not significant,
we avoid `touch` and instead use `>` to create the file:

    >"$1" &&
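
As a minimal sketch of the difference (the scratch directory and the file
names below are made up for illustration and are not part of the patch), a
bare redirect creates the empty file without running any command, and,
unlike `touch`, it also truncates a file that already exists:

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, for illustration only.
dir=$(mktemp -d) || exit 1

# A bare redirect creates an empty file without running any command.
>"$dir/a1"

# Unlike touch, the redirect also truncates a file that already has content.
echo "old content" >"$dir/a2"
>"$dir/a2"

# test -s succeeds only for non-empty files, so both files are reported.
for f in a1 a2
do
	test -s "$dir/$f" || echo "$f is empty"
done
```

Inside a test body the redirect would be chained with `&&` like any other
step, as in the one-line form shown above.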

> +       git add $1 &&
> +       git commit -m $1
> +       git tag "$1-commit"
> +}

Is this add_commit() function more or less duplicating the
functionality of test_commit() from t/test-lib-functions.sh?

> +cat > expect <<\EOF

Style: drop whitespace following redirect operators:

    cat >expect <<\EOF

> +* a3
> +* a2
> +* a1
> +| *   b4
> +| |\
> +| | * c3
> +| * | b3
> +| |/
> +| * b2
> +| * b1
> +|/
> +| * d4
> +| * d3
> +| | * e3
> +| |/
> +| * d2
> +| * d1
> +|/
> +* root
> +EOF
> +
> +test_expect_success 'all commits' '
> +       test_cmp_graph a b c d e
> +'

Modern test style is to perform all actions inside the
test_expect_success body itself, so:

    test_expect_success 'all commits' '
        cat >expect <<-\EOF
        ...
        EOF
        test_cmp_graph a b c d e
    '

Note the use of <<- to allow you to indent the here-doc body.
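
To illustrate the `<<-` behavior in isolation, here is a small sketch that
runs outside the test framework. The scratch directory and script names are
hypothetical, and the tiny scripts are generated with `printf` so that the
tab characters are explicit. It shows that `<<-` strips leading tabs from
the body and the terminator, but leaves leading spaces alone:

```shell
#!/bin/sh
# Hypothetical scratch directory; printf writes the tabs explicitly (\t)
# so the demonstration does not depend on how this message is rendered.
dir=$(mktemp -d) || exit 1

# Body and terminator indented with tabs: <<- strips them all.
printf 'cat <<-\\EOF\n\t* a3\n\t* a2\n\tEOF\n' >"$dir/tabs.sh"
actual=$(sh "$dir/tabs.sh")
expected=$(printf '* a3\n* a2')
test "$actual" = "$expected" && echo "leading tabs were stripped"

# Body indented with spaces: <<- leaves the spaces in place.
printf 'cat <<-\\EOF\n  * a3\n\tEOF\n' >"$dir/spaces.sh"
spaces=$(sh "$dir/spaces.sh")
test "$spaces" = "  * a3" && echo "leading spaces were kept"
```

This is why the here-doc body in a test must be indented with tabs, not
spaces, when `<<-` is used.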

Eric Sunshine April 7, 2024, 5:52 a.m. UTC | #2
On Sun, Apr 7, 2024 at 1:47 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
> I'm not particularly a user of --graph, so I don't necessarily have an
> opinion about the utility of this change or its mechanics, but I can
> make a few observations to help you improve the patch and increase its
> chances of being accepted.

I forgot to mention that application of your patch results in some warnings:

    % git am add-sep-lines.patch
    Applying: Add separator lines into `git log --graph`.
    .git/rebase-apply/patch:61: trailing whitespace.
    .git/rebase-apply/patch:147: trailing whitespace.
    .git/rebase-apply/patch:151: trailing whitespace.
    .git/rebase-apply/patch:160: trailing whitespace.
    warning: 4 lines add whitespace errors.

Quang Lê Duy April 7, 2024, 7:03 a.m. UTC | #3
On Sun, Apr 7, 2024 at 12:47 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
> > diff --git a/graph.c b/graph.c
> > @@ -729,9 +742,9 @@ static int graph_num_expansion_rows(struct git_graph *graph)
> >  static int graph_needs_pre_commit_line(struct git_graph *graph)
> >  {
> > -       return graph->num_parents >= 3 &&
> > +       return graph->connected_region_state == CONNECTED_REGION_NEW_REGION || (graph->num_parents >= 3 &&
>
> Style: This line is overly long and should be wrapped; we aim (as much
> as possible) to fit within an 80-column limit.
>
> >                graph->commit_index < (graph->num_columns - 1) &&
> > -              graph->expansion_row < graph_num_expansion_rows(graph);
> > +              graph->expansion_row < graph_num_expansion_rows(graph));
> >  void graph_update(struct git_graph *graph, struct commit *commit)
> > @@ -760,6 +773,12 @@ void graph_update(struct git_graph *graph, struct commit *commit)
> > +
> > +       /*
> > +        * Determine whether this commit belongs to a new connected region.
> > +        */
> > +       graph->connected_region_state = (graph->connected_region_state != CONNECTED_REGION_FIRST_COMMIT &&
> > +               graph->num_new_columns == 0) ? CONNECTED_REGION_NEW_REGION : CONNECTED_REGION_USE_CURRENT;
>
> Style: overly long lines

May I ask how I am expected to place the line breaks? The Linux kernel style
guide I consulted
(https://www.kernel.org/doc/html/v4.10/process/coding-style.html) doesn't seem
to go into too much detail on this.

> > +static void graph_output_separator_line(struct git_graph *graph, struct graph_line *line)
> > +{
> > +       /*
> > +        * This function adds a row that separates two disconnected graphs,
> > +        * as the appearance of multiple separate commits on top of each other
> > +        * may cause a misunderstanding that they belong to a timeline.
> > +        */
>
> This comment seems to explain the purpose of the function itself. As
> such, it should precede the function definition rather than being
> embedded within it.

I just followed what the surrounding code did (particularly in the original
`graph_output_pre_commit_line` function), but on second look that functionality
comment seems to only serve as context for the sentence below that so OK.

> > +       assert(graph->connected_region_state == CONNECTED_REGION_NEW_REGION);
>
> We tend to use BUG() rather than assert():

Same thing, I just followed what `graph_output_pre_commit_line` did. So should I
forgo the consistency here? Or is that usage of `assert` in the existing code
also to be updated?

>     if (graph->connected_region_state != CONNECTED_REGION_NEW_REGION)
>         BUG("explain the failure here");
>
> > +       /*
> > +        * Output the row.
> > +        */
> > +       graph_line_addstr(line, "---");
>
> The code itself is obvious enough without the comment, so the comment
> is mere noise, thus should be dropped.

Also same thing that I followed for consistency.

> > +       /*
> > +        * Immediately move to GRAPH_COMMIT state as there for sure aren't going to be
> > +        * any more pre-commit lines.
> > +        */
> > +       graph_update_state(graph, GRAPH_COMMIT);
> > +}
> > diff --git a/t/t4218-log-graph-connected-regions.sh b/t/t4218-log-graph-connected-regions.sh
> > new file mode 100755
>
> We typically try to avoid creating new test scripts if an existing
> script would be a logical place to house the new tests. I haven't
> personally checked if such a script already exists, but if so, it
> would be good to add new tests to it. If not, then creating a new
> script, as you do here, may be fine.

I tried looking and didn't see a script that these tests would fit nicely into.
I would really appreciate having a second set of eyes.

> Modern test style is to perform all actions inside the
> test_expect_success body itself, so:
>
>     test_expect_success 'all commits' '
>         cat >expect <<-\EOF
>         ...
>         EOF
>         test_cmp_graph a b c d e
>     '
>
> Note the use of <<- to allow you to indent the here-doc body.

This is also because I followed what `t4202-log.sh` did, but if that represents
outdated practice then I'll change.

(My apologies, the email client doesn't automatically add CC to the mailing list
in the reply and I forgot to do it myself, so I have to resend this message.)

Quang Lê Duy April 7, 2024, 7:06 a.m. UTC | #4
On Sun, Apr 7, 2024 at 12:52 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
> I forgot to mention that application of your patch results in some warnings:
>
>     % git am add-sep-lines.patch
>     Applying: Add separator lines into `git log --graph`.
>     .git/rebase-apply/patch:61: trailing whitespace.
>     .git/rebase-apply/patch:147: trailing whitespace.
>     .git/rebase-apply/patch:151: trailing whitespace.
>     .git/rebase-apply/patch:160: trailing whitespace.
>     warning: 4 lines add whitespace errors.

Indeed I failed to notice the whitespace `vim` added to the empty lines.
Appreciate your notice.

Dragan Simic April 7, 2024, 8:35 a.m. UTC | #5
On 2024-04-07 09:06, Quang Lê Duy wrote:
> On Sun, Apr 7, 2024 at 12:52 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
>> I forgot to mention that application of your patch results in some warnings:
>>
>>     % git am add-sep-lines.patch
>>     Applying: Add separator lines into `git log --graph`.
>>     .git/rebase-apply/patch:61: trailing whitespace.
>>     .git/rebase-apply/patch:147: trailing whitespace.
>>     .git/rebase-apply/patch:151: trailing whitespace.
>>     .git/rebase-apply/patch:160: trailing whitespace.
>>     warning: 4 lines add whitespace errors.
>
> Indeed I failed to notice the whitespace `vim` added to the empty lines.
> Appreciate your notice.

As a note, vim can be configured to highlight the trailing
whitespace, making it easy to spot.
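
Independently of editor settings, trailing whitespace can also be caught
from the shell before sending. Below is a minimal sketch, assuming a
made-up sample file that mimics a test script with a tab left on an
otherwise empty line; it uses plain `grep` to report the offending line
numbers:

```shell
#!/bin/sh
# Hypothetical scratch directory and sample file, for illustration only.
dir=$(mktemp -d) || exit 1

# Line 2 consists of a single tab, like whitespace left on an "empty" line.
printf 'add_commit a1 &&\n\t\nadd_commit a2\n' >"$dir/sample.sh"

# Collect the numbers of all lines that end in a space or a tab.
bad=$(grep -n '[[:space:]]$' "$dir/sample.sh" | cut -d: -f1)
echo "trailing whitespace on line(s): $bad"
# prints "trailing whitespace on line(s): 2"
```

Within a Git working tree, `git diff --check` reports the same class of
whitespace errors for changes that have not been committed yet.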

Eric Sunshine April 7, 2024, 9:07 a.m. UTC | #6
On Sun, Apr 7, 2024 at 3:04 AM Quang Lê Duy <leduyquang753@gmail.com> wrote:
> On Sun, Apr 7, 2024 at 12:47 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
> > > +       return graph->connected_region_state == CONNECTED_REGION_NEW_REGION || (graph->num_parents >= 3 &&
> > > +       graph->connected_region_state = (graph->connected_region_state != CONNECTED_REGION_FIRST_COMMIT &&
> > > +               graph->num_new_columns == 0) ? CONNECTED_REGION_NEW_REGION : CONNECTED_REGION_USE_CURRENT;
> >
> > Style: overly long lines
>
> May I ask how I am expected to place the line breaks? The Linux kernel style
> guide I consulted
> (https://www.kernel.org/doc/html/v4.10/process/coding-style.html) doesn't seem
> to go into too much detail on this.

I don't have a precise answer other than "use good taste". One
reasonably solid rule is that when wrapping at `&&` and `||`, those
operators should appear at the end of the line rather than the
beginning of the next line. So, a possible wrapping for these two
cases might be:

    return graph->connected_region_state == CONNECTED_REGION_NEW_REGION ||
        (graph->num_parents >= 3 &&
        graph->commit_index < (graph->num_columns - 1) &&
        graph->expansion_row < graph_num_expansion_rows(graph));

    graph->connected_region_state =
        (graph->connected_region_state != CONNECTED_REGION_FIRST_COMMIT &&
        graph->num_new_columns == 0) ?
        CONNECTED_REGION_NEW_REGION : CONNECTED_REGION_USE_CURRENT;

Since this enum is private to the C file and not part of an expressive
public API, another possibility for reducing the line length is to
shorten some of the names. For instance:

    enum connected_region_state {
        CONNREG_FIRST_COMMIT,
        CONNREG_USE_CURRENT,
        CONNREG_NEW_REGION
    };

> > > +static void graph_output_separator_line(struct git_graph *graph, struct graph_line *line)
> > > +{
> > > +       /*
> > > +        * This function adds a row that separates two disconnected graphs,
> > > +        * as the appearance of multiple separate commits on top of each other
> > > +        * may cause a misunderstanding that they belong to a timeline.
> > > +        */
> >
> > This comment seems to explain the purpose of the function itself. As
> > such, it should precede the function definition rather than being
> > embedded within it.
>
> I just followed what the surrounding code did (particularly in the original
> `graph_output_pre_commit_line` function), but on second look that functionality
> comment seems to only serve as context for the sentence below that so OK.

Indeed, looking at graph_output_pre_commit_line(), the comment seems
to be explaining the reason for the assert() in that function, whereas
the comment you wrote here seems to be explaining the purpose of the
function itself.

> > > +       assert(graph->connected_region_state == CONNECTED_REGION_NEW_REGION);
> >
> > We tend to use BUG() rather than assert():
>
> Same thing, I just followed what `graph_output_pre_commit_line` did. So should I
> forgo the consistency here? Or is that usage of `assert` in the existing code
> also to be updated?

I see what you mean, now that I'm looking at graph.c. Since assert()
is used so heavily in this file already (and there are no BUG()
invocations at all), it probably makes sense to be consistent and use
assert() here, as well. Adding a sentence to the commit message
explaining that you're using assert() for consistency rather than
BUG() will be helpful to reviewers.

While it might be a nice cleanup to eventually swap out assert() in
favor of BUG(), we should leave that for another day in order to keep
this patch well-focused. (We don't want to add a bunch of "while at
it, let's also change this" items, thus losing focus on what you
actually want to achieve.)

> > > +       /*
> > > +        * Output the row.
> > > +        */
> > > +       graph_line_addstr(line, "---");
> >
> > The code itself is obvious enough without the comment, so the comment
> > is mere noise, thus should be dropped.
>
> Also same thing that I followed for consistency.

Understandable. In this case, I don't personally feel that this
comment is adding any value, thus would drop it, but others (including
yourself) may feel differently.

> > Modern test style is to perform all actions inside the
> > test_expect_success body itself, so:
> >
> >     test_expect_success 'all commits' '
> >         cat >expect <<-\EOF
> >         ...
> >         EOF
> >         test_cmp_graph a b c d e
> >     '
> >
> > Note the use of <<- to allow you to indent the here-doc body.
>
> This is also because I followed what `t4202-log.sh` did, but if that represents
> outdated practice then I'll change.

Understood.

Generally speaking, when adding new tests, we do want to follow modern
practice; that's especially true when creating a brand new test
script, but even when adding new tests to an existing script.

If you're modifying an existing test, then being consistent with the
surrounding code is a good idea. Consistency may also be reasonable
sometimes when inserting a new test into a block of existing
closely-related tests. Saying so in the commit message will help
reviewers understand.
Patch

diff --git a/graph.c b/graph.c
index 1ca34770ee..c0107c02fa 100644
--- a/graph.c
+++ b/graph.c
@@ -69,6 +69,12 @@  enum graph_state {
 	GRAPH_COLLAPSING
 };
 
+enum connected_region_state {
+	CONNECTED_REGION_FIRST_COMMIT,
+	CONNECTED_REGION_USE_CURRENT,
+	CONNECTED_REGION_NEW_REGION
+};
+
 static void graph_show_line_prefix(const struct diff_options *diffopt)
 {
 	if (!diffopt || !diffopt->line_prefix)
@@ -310,6 +316,12 @@  struct git_graph {
 	 * stored as an index into the array column_colors.
 	 */
 	unsigned short default_column_color;
+	/*
+	 * The state of which connected region the current commit belongs to.
+	 * This is used to output a clarifying separator line between
+	 * connected regions.
+	 */
+	enum connected_region_state connected_region_state;
 };
 
 static struct strbuf *diff_output_prefix_callback(struct diff_options *opt, void *data)
@@ -380,6 +392,7 @@  struct git_graph *graph_init(struct rev_info *opt)
 	 * This way we start at 0 for the first commit.
 	 */
 	graph->default_column_color = column_colors_max - 1;
+	graph->connected_region_state = CONNECTED_REGION_FIRST_COMMIT;
 
 	/*
 	 * Allocate a reasonably large default number of columns
@@ -729,9 +742,9 @@  static int graph_num_expansion_rows(struct git_graph *graph)
 
 static int graph_needs_pre_commit_line(struct git_graph *graph)
 {
-	return graph->num_parents >= 3 &&
+	return graph->connected_region_state == CONNECTED_REGION_NEW_REGION || (graph->num_parents >= 3 &&
 	       graph->commit_index < (graph->num_columns - 1) &&
-	       graph->expansion_row < graph_num_expansion_rows(graph);
+	       graph->expansion_row < graph_num_expansion_rows(graph));
 }
 
 void graph_update(struct git_graph *graph, struct commit *commit)
@@ -760,6 +773,12 @@  void graph_update(struct git_graph *graph, struct commit *commit)
 	 * commit.
 	 */
 	graph->prev_commit_index = graph->commit_index;
+	
+	/*
+	 * Determine whether this commit belongs to a new connected region.
+	 */
+	graph->connected_region_state = (graph->connected_region_state != CONNECTED_REGION_FIRST_COMMIT &&
+		graph->num_new_columns == 0) ? CONNECTED_REGION_NEW_REGION : CONNECTED_REGION_USE_CURRENT;
 
 	/*
 	 * Call graph_update_columns() to update
@@ -865,8 +884,28 @@  static void graph_output_skip_line(struct git_graph *graph, struct graph_line *l
 		graph_update_state(graph, GRAPH_COMMIT);
 }
 
-static void graph_output_pre_commit_line(struct git_graph *graph,
-					 struct graph_line *line)
+static void graph_output_separator_line(struct git_graph *graph, struct graph_line *line)
+{
+	/*
+	 * This function adds a row that separates two disconnected graphs,
+	 * as the appearance of multiple separate commits on top of each other
+	 * may cause a misunderstanding that they belong to a timeline.
+	 */
+	assert(graph->connected_region_state == CONNECTED_REGION_NEW_REGION);
+
+	/*
+	 * Output the row.
+	 */
+	graph_line_addstr(line, "---");
+
+	/*
+	 * Immediately move to GRAPH_COMMIT state as there for sure aren't going to be
+	 * any more pre-commit lines.
+	 */
+	graph_update_state(graph, GRAPH_COMMIT);
+}
+
+static void graph_output_parent_expansion_line(struct git_graph *graph, struct graph_line *line)
 {
 	int i, seen_this;
 
@@ -928,6 +967,14 @@  static void graph_output_pre_commit_line(struct git_graph *graph,
 		graph_update_state(graph, GRAPH_COMMIT);
 }
 
+static void graph_output_pre_commit_line(struct git_graph *graph, struct graph_line *line)
+{
+	if (graph->connected_region_state == CONNECTED_REGION_NEW_REGION)
+		graph_output_separator_line(graph, line);
+	else
+		graph_output_parent_expansion_line(graph, line);
+}
+
 static void graph_output_commit_char(struct git_graph *graph, struct graph_line *line)
 {
 	/*
diff --git a/t/t4218-log-graph-connected-regions.sh b/t/t4218-log-graph-connected-regions.sh
new file mode 100755
index 0000000000..4efe17827e
--- /dev/null
+++ b/t/t4218-log-graph-connected-regions.sh
@@ -0,0 +1,119 @@ 
+#!/bin/sh
+
+test_description="git log --graph connected regions"
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+. "$TEST_DIRECTORY/lib-terminal.sh"
+. "$TEST_DIRECTORY/lib-log-graph.sh"
+
+test_cmp_graph () {
+	lib_test_cmp_graph --format=%s "$@"
+}
+
+add_commit () {
+	touch $1 &&
+	git add $1 &&
+	git commit -m $1
+	git tag "$1-commit"
+}
+
+test_expect_success setup '
+	git checkout -b a &&
+	add_commit root &&
+	
+	add_commit a1 &&
+	add_commit a2 &&
+	add_commit a3 &&
+	
+	git checkout -b b root-commit &&
+	add_commit b1 &&
+	add_commit b2 &&
+	git checkout -b c &&
+	add_commit c3 &&
+	git checkout b &&
+	add_commit b3 &&
+	git merge c -m b4 &&
+	
+	git checkout -b d root-commit &&
+	add_commit d1 &&
+	add_commit d2 &&
+	git checkout -b e &&
+	add_commit e3 &&
+	git checkout d &&
+	add_commit d3 &&
+	add_commit d4
+'
+
+cat > expect <<\EOF
+* a3
+* a2
+* a1
+| *   b4
+| |\
+| | * c3
+| * | b3
+| |/
+| * b2
+| * b1
+|/
+| * d4
+| * d3
+| | * e3
+| |/
+| * d2
+| * d1
+|/
+* root
+EOF
+
+test_expect_success 'all commits' '
+	test_cmp_graph a b c d e
+'
+
+cat > expect <<\EOF
+* a3
+* a2
+* a1
+---
+*   b4
+|\
+| * c3
+* | b3
+|/
+* b2
+* b1
+---
+* d4
+* d3
+| * e3
+|/
+* d2
+* d1
+EOF
+
+test_expect_success 'without root commit' '
+	test_cmp_graph a b c d e ^root-commit
+'
+
+cat > expect <<\EOF
+* a3
+---
+*   b4
+|\
+| * c3
+* b3
+---
+* d4
+* d3
+---
+* e3
+EOF
+
+test_expect_success "branches' tips" '
+	test_cmp_graph a b c d e ^a2-commit ^b2-commit ^d2-commit
+'
+
+test_done