There is an RFC on github now. Please go github and comment, the link is here. The RFC will be seen as accepted, once it is LGTM3 and there is no objection in 3 days.
In short, I want to move pingcap/parser back to the pingcap/tidb repository in the form of go sub-module. It is moved to a separate repository due to dependency problems but now the thing changed.
Why and How
I will explain why and how to migration parser back to the tidb repository.
Why did we move parser into a separate repository?
The PR is tidb#2923: Move TiDB parser to a separate repository. In short, when we migrated to go module 111, @tiancaiamao decided to do so, because
We can put generated
parser.goin the repository directly, rather than generate it using Makefile script. The drawback of this way is that every time
parser.yis touched, there will be many lines change in
parser.go. Then the TiDB repo will be inflate quickly.
A better way is moving parser to a separated repo, so every time parser is changed, only go.mod need to change accordingly.
Personally, that is a weird argument for me. Indeed, pingcap/tidb does not store the diff of
parser.go anymore, but pingcap/parser does. How is moving the inflation from one repo to another better?
Another argument is about the messy dependency. The separation does improve the case.
What is wrong with a separate repository?
In fact, there is no big problem. However the development becomes inconvenient. You could argue that we have workarounds, but why not make the development easier?
We always need to ensure the version of two go modules. Cherry-picking becomes troublesome since the import version must be a specific branch of pingcap/parser.
It is a mysql parser, but is closely related to TiDB(basically everything except ast/mysql packages). You must code TiDB logic into the “standalone mysql parser” first before your real implementation, which means at least two PRs, even for small changes.
Before the PR of parser merged, the PR of TiDB must use module directive to redirect a specific version of parser to fix CI.
Problems above have defeated the previous argument
only go.mod need to change accordingly. We only separated code into two repositories, actual work did not decrease. However, our development experience does getting worse.
How do we do?
Obviously, we could move back to the old good way. But it must be a go sub-module, because:
- To keep compatibility. Parser has been imported by other tools and other users.
- To keep the clean dependency. That is why it is separated. We don’t want to make things worse again.
The detailed plan:
- Create parser directory in pingcap/tidb, which is synchronized with pingcap/parser, with
replace github.com/pingcap/parser => ./parser.
- Stop sending new PRs and issues to pingcap/parser, new PRs should be directly sent to pingcap/tidb. And merge recent PRs in pingcap/parser as much as possible. Changes will be synchronized from pingcap/parser to pingcap/tidb, of course.
- Create wrapper packages for pingcap/parser, which re-exports things from
pingcap/tidb/parser. I mean something like
import ( . github.com/pingcap/tidb/parser/xxx), create a dummy function will eliminate the error of
not used import. All code of pingcap/parser will be removed in this step.
- Announce the deprecation of pingcap/parser to possible users. Clean up old issues and PRs, create it in pingcap/tidb again, close it, or just forget about it.
- Archive pingcap/parser after, e.g. 4 months, i.e. two dev cycles. It mainly depends on users of pingcap/parser.
CI is not a problem. pingcap/parser only has unit tests, and integration tests
make gotest of
And by @hundundm, since sub-module is merely another module inside another module, we need to release it separately from the main module. That means our release team needs some extra work, which can be solved by a script: thanks to the identical release cycle/version between parser and tidb.