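I have a repository with branches master and A, and a lot of merge activity between the two. How can I find the commit in my repository at which branch A was created from master?

My repository basically looks like this: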

-- X -- A -- B -- C -- D -- F  (master) 
          \     /   \     /
           \   /     \   /
             G -- H -- I -- J  (branch A)

I'm looking for revision A, which is not what git merge-base (--all) finds.


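Current answer

Sometimes it is effectively impossible (with some exceptions where you might be lucky enough to have additional data), and the solutions here won't work.

Git doesn't preserve ref history (which includes branches). It only stores the current position of each branch (the head). This means you can lose some branch history in git over time. Whenever you branch, for example, the information about which branch was the original one is lost immediately. All a branch does is: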

git checkout branch1    # refs/branch1 -> commit1
git checkout -b branch2 # branch2 -> commit1

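You might assume that the first one committed to is the branch. This tends to be the case, but it's not always so. There's nothing stopping you from committing to either branch first after the above operation. Additionally, git timestamps aren't guaranteed to be reliable. It's not until you commit to both that they truly become branches structurally.

While in diagrams we tend to number commits conceptually, git has no real stable concept of sequence when the commit tree branches. In this case you can assume the numbers (indicating order) are determined by timestamp (it might be fun to see how a git UI handles things when you set all the timestamps to the same value).

This is what a human expects conceptually: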

After branch:
       C1 (B1)
      /
    -
      \
       C1 (B2)
After first commit:
       C1 (B1)
      /
    - 
      \
       C1 - C2 (B2)

This is what you actually get:

After branch:
    - C1 (B1) (B2)
After first commit (human):
    - C1 (B1)
        \
         C2 (B2)
After first commit (real):
    - C1 (B1) - C2 (B2)

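You would assume B1 to be the original branch, but it could in fact simply be a dead branch (someone did checkout -b but never committed to it). It's not until you commit to both that you get a legitimate branch structure within git: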

Either:
      / - C2 (B1)
    -- C1
      \ - C3 (B2)
Or:
      / - C3 (B1)
    -- C1
      \ - C2 (B2)

You always know that C1 came before C2 and C3, but you never reliably know whether C2 came before C3 or C3 came before C2 (because you can set the time on your workstation to anything, for example). The labels B1 and B2 are also misleading, as you can't know which branch came first. You can make a very good and usually accurate guess at it in many cases. It is a bit like a race track: all things generally being equal between the cars, you can assume that a car that comes in a lap behind started a lap behind. We also have conventions that are very reliable; for example, master will nearly always represent the longest-lived branch, although sadly I have seen cases where even this is not true.

The example given here is one that preserves history:

Human:
    - X - A - B - C - D - F (B1)
           \     / \     /
            G - H ----- I - J (B2)
Real:
            B ----- C - D - F (B1)
           /       / \     /
    - X - A       /   \   /
           \     /     \ /
            G - H ----- I - J (B2)

"Real" here is also misleading, because we as humans read it left to right, root to leaf (ref). Git does not do that. Where we do (A->B) in our heads, git does (A<-B or B->A). It reads from ref to root. Refs can be anywhere but tend to be leaves, at least for active branches. A ref points to a commit, and commits only contain a link to their parent(s), not to their children. When a commit is a merge commit it will have more than one parent. The first parent is always the original commit that was merged into. The other parents are always commits that were merged into the original commit.

Paths:
    F->(D->(C->(B->(A->X)),(H->(G->(A->X))))),(I->(H->(G->(A->X))),(C->(B->(A->X)),(H->(G->(A->X)))))
    J->(I->(H->(G->(A->X))),(C->(B->(A->X)),(H->(G->(A->X)))))

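This isn't a very efficient representation, but rather an expression of all the paths git can take from each ref (B1 and B2).

Git's internal storage looks more like this (note that A appears twice as a parent):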

    F->D,I | D->C | C->B,H | B->A | A->X | J->I | I->H,C | H->G | G->A

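If you dump a raw git commit you'll see zero or more parent fields. If there are zero, the commit has no parent and is a root (you can actually have multiple roots). If there is one, there was no merge and it is not a root commit. If there is more than one, the commit is the result of a merge, and all of the parents after the first are the commits that were merged in.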
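The parent fields can be inspected directly with git cat-file -p. The following is a minimal sketch, not part of the original answer: it builds a throwaway repository (the branch names b1 and b2, the demo identity and the count_parents helper are all made up for illustration) and counts the parent lines of a root commit, an ordinary commit and a merge commit.

```shell
#!/bin/sh
# Sketch: count the "parent" header fields of raw commit objects.
# Everything here (repo location, names, helper) is illustrative only.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/b1       # name the first branch explicitly
git commit -q --allow-empty -m A          # root commit: no parent field
git checkout -q -b b2
git commit -q --allow-empty -m C
git checkout -q b1
git commit -q --allow-empty -m B          # ordinary commit: one parent
git merge -q -m D b2                      # true merge: two parents

# Print only the header of the raw commit and count its parent lines.
count_parents() {
    git cat-file -p "$1" | sed '/^$/q' | grep -c '^parent ' || true
}

root_parents=$(count_parents b1~2)        # A
plain_parents=$(count_parents b1~1)       # B
merge_parents=$(count_parents b1)         # D
echo "root=$root_parents plain=$plain_parents merge=$merge_parents"
# prints: root=0 plain=1 merge=2
```

Running git cat-file -p b1 on its own shows the two parent lines in order: the first is the commit that was checked out when the merge happened, the second is the tip that was merged in.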

Paths simplified:
    F->(D->C),I | J->I | I->H,C | C->(B->A),H | H->(G->A) | A->X
Paths first parents only:
    F->(D->(C->(B->(A->X)))) | F->D->C->B->A->X
    J->(I->(H->(G->(A->X)))) | J->I->H->G->A->X
Or:
    F->D->C | J->I | I->H | C->B->A | H->G->A | A->X
Paths first parents only simplified:
    F->D->C->B->A | J->I->H->G->A | A->X
Topological:
    - X - A - B - C - D - F (B1)
           \
            G - H - I - J (B2)

When both hit A their chains will be the same; before that their chains will be entirely different. The first commit that two other commits have in common is their common ancestor, and the point from which they diverged. There might be some confusion here between the terms commit, branch and ref. You can in fact merge a commit; this is what merge really does. A ref simply points to a commit, and a branch is nothing more than a ref in the folder .git/refs/heads. The folder location is what determines that a ref is a branch rather than something else, such as a tag.

Where you lose history is that merge will do one of two things, depending on the circumstances.

Consider:

      / - B (B1)
    - A
      \ - C (B2)

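In this case a merge in either direction will create a new commit, with the first parent being the commit pointed to by the currently checked-out branch and the second parent being the commit at the tip of the branch you merged into your current branch. It has to create a new commit, because both branches have changes since their common ancestor that must be combined.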

      / - B - D (B1)
    - A      /
      \ --- C (B2)

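At this point D (B1) has both sets of changes from both branches (itself and B2). However, the second branch doesn't have the changes from B1. If you merge the changes from B1 into B2 so that they are synchronised, you might expect something that looks like this (and you can force git merge to do it this way with --no-ff):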

Expected:
      / - B - D (B1)
    - A      / \
      \ --- C - E (B2)
Reality:
      / - B - D (B1) (B2)
    - A      /
      \ --- C

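You will get that even if B1 has additional commits. As long as there are no changes in B2 that B1 doesn't have, the two branches will be merged this way. Git does a fast-forward, which is like a rebase (rebases also eat or linearise history), except that, unlike a rebase, only one branch has a change set, so it doesn't have to apply a change set from one branch on top of that from another.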

From:
      / - B - D - E (B1)
    - A      /
      \ --- C (B2)
To:
      / - B - D - E (B1) (B2)
    - A      /
      \ --- C
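The difference between the fast-forward and the --no-ff outcome can be reproduced in a throwaway repository. This is a rough sketch under assumptions of my own (branch names b1 and b2, empty commits, a demo identity); it is not taken from the original answer.

```shell
#!/bin/sh
# Sketch: fast-forward versus --no-ff when merging B1 back into B2.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/b1
git commit -q --allow-empty -m A
git checkout -q -b b2
git commit -q --allow-empty -m C
git checkout -q b1
git commit -q --allow-empty -m B
git merge -q -m D b2                 # D (B1): a real merge commit
git commit -q --allow-empty -m E     # B1 moves on to E

git checkout -q b2
old_b2=$(git rev-parse b2)           # C, the tip B2 is about to leave
git merge -q b1                      # fast-forward: no new commit is made
ff_b1=$(git rev-parse b1)
ff_b2=$(git rev-parse b2)            # same commit as b1 now

# Rewind B2 and repeat the merge, this time forcing a merge commit.
git reset -q --hard "$old_b2"
git merge -q --no-ff -m "merge b1 into b2" b1
noff_b2=$(git rev-parse b2)
noff_first_parent=$(git rev-parse 'b2^1')

[ "$ff_b1" = "$ff_b2" ] && echo "fast-forward: b1 and b2 are the same commit"
[ "$noff_first_parent" = "$old_b2" ] && echo "--no-ff: first parent is the old b2 tip"
```

After the fast-forward nothing in the commit graph records that B2 ever pointed at C; with --no-ff the old tip survives as the first parent of the new merge commit.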

If you cease work on B2 then things are largely fine for preserving history in the long run. Typically only B1 (which might be master) will advance, so the location of B2 in B2's history successfully represents the point at which it was merged into B1. This is what git expects you to do: branch B from A, then merge A into B as much as you like as changes accumulate; however, once you merge B back into A, you are not expected to work on B any further. If you carry on working on your branch after fast-forward merging it back into the branch you were working from, then you're erasing B's previous history each time. You're really creating a new branch each time you fast-forward merge to the source and then commit to the branch. What you end up with is lots of branches/merges that you can see in the history and structure, but without the ability to determine what the name of each branch was, or whether what looks like two separate branches is really the same branch.

         0   1   2   3   4 (B1)
        /-\ /-\ /-\ /-\ /
    ----   -   -   -   -
        \-/ \-/ \-/ \-/ \
         5   6   7   8   9 (B2)

0 to 3 and 5 to 8 are structural branches that show up if you follow the history of either 4 or 9. There's no way in git to know which of these unnamed and unreferenced structural branches belongs to which of the named and referenced branches at the end of the structure. You might assume from this drawing that 0 to 4 belongs to B1 and 5 to 9 belongs to B2, but apart from 4 and 9 we can't know which commit belongs to which branch; I've simply drawn it in a way that gives that illusion. 0 might belong to B2 and 5 might belong to B1. There are 16 different possibilities in this case for which named branch each of the structural branches could belong to. This assumes that none of these structural branches came from a deleted branch, or from merging a branch into itself when pulling from master (the same branch name on two repos is in fact two branches; a separate repository is like branching all branches).

There are a number of git strategies that work around this. You can force git merge to never fast-forward and always create a merge commit. A horrible way to preserve branch history is with tags and/or branches (tags are really recommended) according to some convention of your choosing. I really wouldn't recommend a dummy empty commit in the branch you're merging into. A very common convention is to not merge into an integration branch until you genuinely want to close your branch. This is a practice people should attempt to adhere to, as otherwise you're working around the point of having branches. However, in the real world the ideal is not always practical, meaning doing the right thing is not viable for every situation. If what you're doing on a branch is isolated, that can work; otherwise you might be in a situation where multiple developers working on something need to share their changes quickly (ideally you might really want to be working on one branch, but not all situations suit that either, and generally two people working on one branch is something you want to avoid).

Other answers

I was looking for the same thing, and I found this question. Thank you for asking it!

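However, I found that the answers I see here don't seem to quite give the answer you asked for (or that I was looking for): they seem to give the G commit instead of the A commit.

So, I've created the following tree (letters assigned in chronological order), so I could test things out: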

A - B - D - F - G   <- "master" branch (at G)
     \   \     /
      C - E --'     <- "topic" branch (still at E)

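This looks a little different from yours, because I wanted to make sure that I got (referring to this graph, not yours) B, but not A (and not D or E). Here are the letters attached to SHA prefixes and commit messages (my repo can be cloned from here, if that's interesting to anyone):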

G: a9546a2 merge from topic back to master
F: e7c863d commit on master after master was merged to topic
E: 648ca35 merging master onto topic
D: 37ad159 post-branch commit on master
C: 132ee2a first commit on topic branch
B: 6aafd7f second commit on master before branching
A: 4112403 initial commit on master

So, the goal: find B. Here are three ways that I found, after a bit of tinkering:


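1. Visually, with gitk:

You should see a tree like this (as viewed from master):

or here (as viewed from topic):

In both cases, I've selected the commit that is B in my graph. Once you click on it, its full SHA is presented in a text input field just below the graph.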


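2. Visually, but from the terminal:

git log --graph --oneline --all

(Edit/side note: adding --decorate can also be interesting; it adds an indication of branch names, tags, etc. I'm not adding it to the command line above since the output below doesn't reflect its use.)

which shows (assuming git config --global color.ui auto):

Or, in straight text: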

*   a9546a2 merge from topic back to master
|\  
| *   648ca35 merging master onto topic
| |\  
| * | 132ee2a first commit on topic branch
* | | e7c863d commit on master after master was merged to topic
| |/  
|/|   
* | 37ad159 post-branch commit on master
|/  
* 6aafd7f second commit on master before branching
* 4112403 initial commit on master

In either case, we see the 6aafd7f commit as the lowest common point, i.e. B in my graph, or A in yours.


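3. With shell magic:

You don't specify in your question whether you wanted something like the above, or a single command that just gets you the one revision and nothing else. Well, here's the latter: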

diff -u <(git rev-list --first-parent topic) \
             <(git rev-list --first-parent master) | \
     sed -ne 's/^ //p' | head -1
6aafd7ff98017c816033df18395c5c1e7829960d
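To check the idea without cloning anything, the test tree can be rebuilt locally. The sketch below is an illustration of mine, not the answerer's repository: it uses empty commits and a demo identity, so the hashes will differ from the ones listed above, and it writes the two rev-list outputs to temporary files (topic.list and master.list, names chosen here) so that it also runs under plain sh, which lacks the <() syntax.

```shell
#!/bin/sh
# Sketch: rebuild the A..G test tree and compare first-parent chains.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "A: initial commit on master"
git commit -q --allow-empty -m "B: second commit on master before branching"
B=$(git rev-parse HEAD)
git checkout -q -b topic
git commit -q --allow-empty -m "C: first commit on topic branch"
git checkout -q master
git commit -q --allow-empty -m "D: post-branch commit on master"
git checkout -q topic
git merge -q -m "E: merging master onto topic" master
E=$(git rev-parse HEAD)
git checkout -q master
git commit -q --allow-empty -m "F: commit on master after master was merged to topic"
git merge -q -m "G: merge from topic back to master" topic

# First-parent chains: topic is E C B A, master is G F D B A.
git rev-list --first-parent topic  > topic.list
git rev-list --first-parent master > master.list

# The first unchanged (context) line of the diff is the first shared commit.
found=$(diff -u topic.list master.list | sed -ne 's/^ //p' | head -1)
merge_base=$(git merge-base master topic)

[ "$found" = "$B" ] && echo "first-parent diff finds B"
[ "$merge_base" = "$E" ] && echo "merge-base reports E, the tip of topic"
```

In this tree git merge-base returns E because topic has already been merged into master, which is the situation the question describes; the first-parent comparison skips past the merges and lands on B.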

You can also put this into your ~/.gitconfig as (note: the trailing dash is important; thanks to Brian for bringing attention to that):

[alias]
    oldest-ancestor = !zsh -c 'diff -u <(git rev-list --first-parent "${1:-master}") <(git rev-list --first-parent "${2:-HEAD}") | sed -ne \"s/^ //p\" | head -1' -

The same can be set up from the command line (with convoluted quoting):

git config --global alias.oldest-ancestor '!zsh -c '\''diff -u <(git rev-list --first-parent "${1:-master}") <(git rev-list --first-parent "${2:-HEAD}") | sed -ne "s/^ //p" | head -1'\'' -'

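Note: zsh could just as easily have been bash, but sh will not work, because the <() syntax doesn't exist in vanilla sh. (Thank you again, @conny, for making me aware of it in a comment on another answer on this page!)

Note: alternate version of the above:

Thanks to liori for pointing out that the above could fall down when comparing identical branches, and for coming up with an alternate diff form which removes the sed form from the mix and makes this "safer" (i.e. it returns a result, namely the most recent commit, even when you compare master to master):

As a .git-config line: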

[alias]
    oldest-ancestor = !zsh -c 'diff --old-line-format='' --new-line-format='' <(git rev-list --first-parent "${1:-master}") <(git rev-list --first-parent "${2:-HEAD}") | head -1' -

From the shell:

git config --global alias.oldest-ancestor '!zsh -c '\''diff --old-line-format='' --new-line-format='' <(git rev-list --first-parent "${1:-master}") <(git rev-list --first-parent "${2:-HEAD}") | head -1'\'' -'

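So, in my test tree (which was unavailable for a while, sorry; it's back), that now works on both master and topic (giving commits G and B, respectively). Thanks again, liori, for the alternate form.


So, that's what I (and liori) came up with. It seems to work for me. It also allows an additional couple of aliases that might prove handy: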

git config --global alias.branchdiff '!sh -c "git diff `git oldest-ancestor`.."'
git config --global alias.branchlog '!sh -c "git log `git oldest-ancestor`.."'

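Happy git-ing!

The problem appears to be finding the most recent single-commit cut between both branches on one side and the earliest common ancestor on the other (probably the initial commit of the repo). This matches my intuition of what the "branching off" point is.

With that in mind, this is not at all easy to compute with normal git shell commands, since git rev-list, our most powerful tool, doesn't let us restrict the path by which a commit is reached. The closest we have is git rev-list --boundary, which can give us the set of all commits that "blocked our way". (Note: git rev-list --ancestry-path is interesting, but I don't know how to make it useful here.)

Here is the script: https://gist.github.com/abortz/d464c88923c520b79e3d. It's relatively simple, but because of a loop it's complicated enough to warrant a gist.

Note that most other solutions proposed here can't possibly work in all situations, for a simple reason: git rev-list --first-parent isn't reliable for linearizing history, because there can be merges with either ordering.

git rev-list --topo-order, on the other hand, is very useful for walking commits in topological order, but doing diffs on it is brittle: there are multiple possible topological orderings for a given graph, so you are depending on a certain stability of the orderings. That said, strongk7's solution probably works damn well most of the time. However, it's slower than mine as a result of having to walk the entire history of the repo... twice. :-)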

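This is not quite a solution to the question, but I thought it was worth noting the approach I use when I have a long-lived branch:

At the same time as I create the branch, I also create a tag with the same name but with an -init suffix, for example feature-branch and feature-branch-init.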
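As a rough illustration of that convention (the repository, the commits and the demo identity are assumptions made for this sketch), the tag keeps pointing at the starting commit however much merging happens afterwards:

```shell
#!/bin/sh
# Sketch: record a branch's starting point with a "<branch>-init" tag.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "initial commit"
branch_point=$(git rev-parse HEAD)

git checkout -q -b feature-branch
git tag feature-branch-init              # remember where the branch started

git commit -q --allow-empty -m "work on feature"
git checkout -q master
git commit -q --allow-empty -m "work on master"
git checkout -q feature-branch
git merge -q -m "merge master into feature-branch" master

# The tag is unaffected by later commits and merges.
recorded=$(git rev-parse feature-branch-init)
[ "$recorded" = "$branch_point" ] && echo "tag still marks the branch point"
```

A later git log feature-branch-init..feature-branch then lists what has happened on the branch since it was created.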

(It is kind of bizarre that this is such a hard question to answer!)

I've used git rev-list for this sort of thing. For example (note the 3 dots):

$ git rev-list --boundary branch-a...master | grep "^-" | cut -c2-

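will spit out the branch point. Now, it's not perfect; since you've merged master into branch A a couple of times, that will spit out a couple of possible branch points (basically, the original branch point and then each point at which you merged master into branch A). However, it should at least narrow down the possibilities.

I've added that command to my aliases in ~/.gitconfig as: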

[alias]
    diverges = !sh -c 'git rev-list --boundary $1...$2 | grep "^-" | cut -c2-' -

so I can call it as:

$ git diverges branch-a master
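For the simple case with no merges between the two branches, the behaviour can be checked as follows. This is a sketch with made-up branch names and empty commits, and it deliberately avoids cross merges, where, as noted above, more than one candidate may be printed.

```shell
#!/bin/sh
# Sketch: rev-list --boundary on two branches that have never been merged.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m A
git commit -q --allow-empty -m B
fork=$(git rev-parse HEAD)               # where branch-a will start
git checkout -q -b branch-a
git commit -q --allow-empty -m C
git checkout -q master
git commit -q --allow-empty -m D

# Boundary commits are printed with a leading "-"; strip it off.
points=$(git rev-list --boundary branch-a...master | grep "^-" | cut -c2-)
[ "$points" = "$fork" ] && echo "single boundary commit: the fork point"
```

Here the symmetric difference contains only C and D, and their shared parent B is the one boundary commit.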

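After a lot of research and discussion, it's clear there's no magic bullet that works in all situations, at least not in the current version of Git.

That's why I wrote a couple of patches that add the concept of a tail branch. Each time a branch is created, a pointer to the original point is created too: the tail ref. This ref gets updated every time the branch is rebased.

To find the branch point of the devel branch, all you have to do is use devel@{tail}, that's it.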

https://github.com/felipec/git/commits/fc/tail