Proposal: add issue number to PR title

Maybe a little off-topic, but I think the main problem with tikv is that the squash merge method is inherently incompatible with DCO. They shouldn’t be used that way.

I don’t see any problems between squash merge and DCO. Maybe you can send a new thread to explain your thoughts.

@Mini256 is verifying the approach proposed above to retain current way and extract info from body.

As there are projects write issue number in commit message header, I don’t think it is a great pity. If your opinion is mainly about the format, you can propose your own, and we will join all thoughts into one final solution.

Because TiKV requests sign-off (DCO requirement) but TiDE cannot extract sign-off info and populate into the commit message, you get that squash content in mess. Actually, if we write commit message by hands, there should have no such “issues”.

As there are projects write issue number in commit message header, I don’t think it is a great pity.

Is it a coincident that these projects use their own issue tracking systems?

If your opinion is mainly about the format, you can propose your own

I don’t think we need to write the issue number in title. If a pull request is about to close an issue, it should be stated clearly either in pull requests body or commit messages and github will link those issues to pull requests automatically. If it’s written in commit messages, then current squash and merge can also make the issue number appear in the git log history.

Because TiKV requests sign-off (DCO requirement) but TiDE cannot extract sign-off info and populate into the commit message, you get that squash content in mess.

Yes, so it’s a problem of TiDE.

Actually, if we write commit message by hands, there should have no such “issues”.

I’m not sure what does “write commit message by hands” mean. I usually writes details commit messages and separate a pull requests into several commits, but TiDE will squash them into one, which is still a mess. https://github.com/tikv/tikv/commit/e2713589cac40d6888d7f2c891bce36e252f3b47 is an example.

1 Like

[123] becomes magic to any new members. And it looks terrible.

@BusyJay, I have updated the proposal, do you think these two formats will be better?

[TI-11123] pkg: what's changed in this one package (#14232)
[issue-11123] pkg: what's changed in this one package (#14232)

Better than just [number].

I strongly against adding issue number to title for following reasons:

  • We already have good practice to link PR and issues by writing #number in the PR body. Don’t need to reinvent the wheel.
  • Adding issue number in title is a duplicated work when we have to fill it in PR template already.

My suggestions to the proposal are:

  • If the purpose is to make it searchable in git log, then I suggest to ask everyone write one close #number in their commit messages or make bot to parse the pr body. Writing close #number in commit messages will also appear in the PR draft.
  • If the purpose is to link between pr and issues, then currently an issue is linked to all PRs which have an issue number in their body already.

One of the example is https://github.com/tikv/tikv/issues/9012. https://github.com/tikv/tikv/pull/10971 fixes the issue with well written message. And all its cherry-pick also contains the well written messages, though surrounded by noises. You are able to search the issue number in all release branch after cherry-picks are merged, hence meet the proposal goal. It’s also well recognized by github, so even you view the history in github, you can still easily find the fixing issues or commits by simply clicking. No weird format of PR title is required and no duplicated work.

1 Like

i proposal use below format as suffix to PR title:

(#pr_number operation #issue_number), operation can be:

> means close
>> means ref

example:
close:
*: Forbid drop placement policy when it is in used by db or partition (#28374 > #27985)

ref then close:
*: Forbid drop placement policy when it is in used by db or partition (#28374 >> #27985)
*: Forbid drop placement policy when it is in used by db or partition (#28374 > #27986)

this way it is clear, meaniful and similiar to current habit:

  1. you can get pr, issue nubmer together for one glance of the tail.
  2. sign # and > make it meaningful and short
  3. operation support can tells more info and extendable

@feitian124, Thank you for your suggestion. However, it should be noted that there is no PR number in our current PR title format, which is automatically added by the bot while merging.

yes i know. is it possible to update the logic of bot to add both pr number and issue number?
for example:

in body has close #1 become: my pr title (#2 > #1)
in body has ref #1 become: my pr title (#2 >> #1)

@Mini256

  • We already have good practice to link PR and issues by writing #number in the PR body. Don’t need to reinvent the wheel.

Yes, I know that, and our release note collects tool benefits from this, our requirement is to put the issue number into the commit message and take the issue as the center to associate with systems including bug management, requirement management, release management, etc. Maybe the demand side can give you a better explanation.

  • If the purpose is to make it searchable in git log , then I suggest to ask everyone write one close #number in their commit messages or make bot to parse the pr body. Writing close #number in commit messages will also appear in the PR draft.

Before this proposal is proposed, we first consider placing the issue number in the PR body.

But the reality is that we rely on the tide to help us complete the automatic merging of PR, but it only provides a merge_commit_template configuration to allow us to make simple customizations.

It is possible for us to use the entire PR body as the commit message body, but it needs our PR body to be concise enough, otherwise, there will still be a lot of noise for the commit message。

It should be noted that such as parse PR body and extract issue number are simple in code implementation, but such customized functions that lack universality are difficult to be received by the upstream (The tide component is currently being maintained by the k8s community).

cc: @feitian124

is it possible to update the logic of bot to add both pr number and issue number?
for example:

For the issue will be closed by PR, yes, We can submit a PR to the upstream of the robot to add the required fields,but if the PR is associated with an issue, but does not close it, GitHub does not provide a direct API to let us get the associated issue numbers.

I have created an issue in the upstream repository to request support the issue number field.

@feitian124, I tried to use the regular expression mentioned by @hi-rustin to extract the issue number from the PR body, and then passed it to the commit message template of tide, but I found that the regular expression matches #xxx, the number in the format may be a pull request number, which will cause the robot to mistakenly fill the pr number into the commit message.

If we want to further distinguish between the issue number and the pr number, you will need to use the GitHub API to determine, so I think this automatically generated scheme may not be a good method.

Refer: Proposal: add issue number to PR title

consider placing the issue number in the PR body.

You may miss my point. I also suggest to write the issue number in commit message as

Close #number.

You can check the link of TiKV PR I provided above, the number is written in the commit messages and appear in the final merge commit.

1 Like

I like your idea. If I get it correctly, actually it pushes us contributors to write pretty commit message.

For the squash part, I found the PR below that the author of TiDE said

just tell your users to do proper commits and squash themselves

1 Like

Sorry, I misunderstood.

This is indeed a solution, but currently, repositories under pingcap org do not squash the commit message in the PR together as the final commit message body.

Personally, I think another disadvantage of adding issue number to the title is that it increases the length of the PR title or commit message subject, but one of the advantages is that we can intuitively know which PR or commit are associated with the issue, whether it is on git or github, because the issue number is always placed at the front.

I guess this explains why people don’t write meaningful commit messages. I’m not sure whether TiDB community consider “only keeping title” is a reasonable behavior. If you do, then I suggest to use different style between TiDB org and TiKV org. The bot can check whether there are issue numbers in either title or commit messages.

I suggest to use different style between TiDB org and TiKV org

What do you think? @tison

@BusyJay, but I think according to the method you proposed, it also requires some learning cost to attach issue number to commit message in PR, if the robot finds that all commits in a PR do not have an issue number, the PR author may need to submit an empty commit to attach the issue number or re-modify the commit message that has been submitted before.

it also requires some learning cost to attach issue number to commit message in PR

Yes, but it’s worthy, right? We want developers to write meaningful message as Discuss: Fruitful commit message suggested. Learning attaching issue number is the first step. Writing close #number or ref #number is a common practice on Github, it’s not just pingcap or tikv specific.

A vote for TiKV community is present:

I’m going to conclude the proposal above and apply our thoughts on tikv & pd first to see what we gain.

Summarize the above discussion and answers:

Why do we need to link a pull request to an issue?

As the GitHub document explains:

You can link a pull request to an issue to show that a fix is in progress and to automatically close the issue when the pull request is merged.

In addition:

  • Through the association with PR and Issue, we can track the fixes of bug issues easily, because the number of PR fixing the same bug is different on different branches, the number of associated issue recording bug details is the same.

  • Some of our automation tools may also use the relationship between PR and Issue.

Why do we not use the GitHub API to get the association?

We can get issues that were may be closed by this pull request through the ClosingissueSreferences field in GitHub’s Graph API.

But in some cases, a PR will fix an Issue, but there is no complete fix, the author of the PR does not want to close the Issue immediately after the PR is merged. We cannot obtain these associated issues without close semantics through the API.

The PR body has already provided the issue number, why do we still need to add it to the PR title?

We can use the support keywords to associate with the Issue that may be closed by current PR in the PR body. And then use other keywords like ref, see to indicate the non-close semantic association between PR and Issue.

Unfortunately, the PR body is not structured, and the above keywords may also be used to link another PR. For example, if the PR author wants to close a previous PR after the current PR is merged automatically, he may add close {pull_number} to the PR body.

In order to reduce the difficulty of automatic tools in identifying associated issues, we propose to fill in issue number in PR title according to a certain format manually.

What is the difference between the two ways?

Adding Issue Number to PR title and PR body

  1. The bot will make sure that ALL numbers beginning with the issue- in the PR title are issue numbers.

  2. Automation tools can extract issue numbers directly from the PR title through regex without checking if number is issue number again via Github API.

  3. The issue number added to the PR title will not make GitHub generate a reference link to the issue. At the same time, we ALSO need to add the issue number in the PR body through the syntax supported.

  4. This will grow the length of the commit message title. (For example: In the commits list page of Spark main repo, because GitHub can only display up to 72 characters of commit message title, the exceeding part of most of the message title will be omitted with ···)

Only Adding Issue Number to PR body

  1. The bot can only ensure that there is AT LEAST ONE associated issue number in the PR body.

  2. Automation tools can use the regex ( Such as: (?i)(ref|close[sd]?|resolve[sd]?|fix(e[sd])?)\\s*#([1-9]\\d*) ) to extract all #number from the PR body, and then further confirm whether it is issue number through the Github API.

I’ve proposed a PR for some TiDB repos: