We have a Git repository with over 400 commits, the first couple dozen of which were a lot of trial-and-error. We want to clean up these commits by squashing many down into a single commit. Naturally, git-rebase seems the way to go. My problem is that it ends up with merge conflicts, and these conflicts are not easy to resolve. I don't understand why there should be any conflicts at all, since I'm just squashing commits (not deleting or rearranging). Very likely, this demonstrates that I'm not completely understanding how git-rebase does its squashes.

Here is a modified version of the script I'm using:


repo_squash.sh (this is the script that is actually run):


rm -rf repo_squash
git clone repo repo_squash
cd repo_squash/
GIT_EDITOR=../repo_squash_helper.sh git rebase --strategy theirs -i bd6a09a484b8230d0810e6689cf08a24f26f287a

repo_squash_helper.sh (this script is used only by repo_squash.sh):


if grep -q "pick " $1
then
#  cp $1 ../repo_squash_history.txt
#  emacs -nw $1
  sed -f ../repo_squash_list.txt < $1 > $1.tmp
  mv $1.tmp $1
else
  if grep -q "initial import" $1
  then
    cp ../repo_squash_new_message1.txt $1
  elif grep -q "fixing bad import" $1
  then
    cp ../repo_squash_new_message2.txt $1
  else
    emacs -nw $1
  fi
fi

repo_squash_list.txt (this file is used only by repo_squash_helper.sh):


# Initial import
s/pick \(251a190\)/squash \1/g
# Leaving "Needed subdir" for now
# Fixing bad import
s/pick \(46c41d1\)/squash \1/g
s/pick \(5d7agf2\)/squash \1/g
s/pick \(3da63ed\)/squash \1/g

I'll leave the "new message" contents to your imagination. Initially, I did this without the "--strategy theirs" option (i.e., using the default strategy, which if I understand the documentation correctly is recursive, but I'm not sure which recursive strategy is used), and it also didn't work. Also, I should point out that, using the commented out code in repo_squash_helper.sh, I saved off the original file that the sed script works on and ran the sed script against it to make sure it was doing what I wanted it to do (it was). Again, I don't even know why there would be a conflict, so it wouldn't seem to matter so much which strategy is used. Any advice or insight would be helpful, but mostly I just want to get this squashing working.

Update with additional information from the discussion with Jefromi:

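Before unleashing this on our massive "real" repository, I used a similar script on a test repository. It was a very simple repository, and the test worked cleanly.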

The message I get when it fails is:

Finished one cherry-pick.
# Not currently on any branch.
nothing to commit (working directory clean)
Could not apply 66c45e2... Needed subdir

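This is the first pick after the first squash commit. Running git status yields a clean working directory. If I then run git rebase --continue, I get a very similar message a few commits later. If I do it again, I get another very similar message a couple dozen commits later. If I do it yet again, this time it goes through about a hundred commits and yields this message: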

Automatic cherry-pick failed.  After resolving the conflicts,
mark the corrected paths with 'git add <paths>', and
run 'git rebase --continue'
Could not apply f1de3bc... Incremental

If I then run git status, I get:

# Not currently on any branch.
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
# modified:   repo/file_A.cpp
# modified:   repo/file_B.cpp
#
# Unmerged paths:
#   (use "git reset HEAD <file>..." to unstage)
#   (use "git add/rm <file>..." as appropriate to mark resolution)
#
# both modified:      repo/file_X.cpp
#
# Changed but not updated:
#   (use "git add/rm <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
# deleted:    repo/file_Z.imp

The "both modified" bit sounds weird to me, since this was just the result of a pick. It's also worth noting that if I look at the "conflict", it boils down to a single line with one version beginning it with a [tab] character, and the other one with four spaces. This sounded like it might be an issue with how I've set up my config file, but there's nothing of the sort in it. (I did note that core.ignorecase is set to true, but evidently git-clone did that automatically. I'm not completely surprised by that considering that the original source was on a Windows machine.)

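If I manually fix file_X.cpp, the rebase fails shortly afterward with another conflict, this time over a file (CMakeLists.txt) that one version thinks should exist and the other thinks shouldn't. If I fix this conflict by saying I do want the file (which I do), a few commits later I get another conflict in the same file, now involving some rather non-trivial changes. At that point it is still only about 25% of the way through the commits.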

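I should also point out, since it might be very important, that this project started out in an svn repository. The initial history was very likely imported from that svn repository.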

Update #2:

On a lark (influenced by Jefromi's comments), I decided to change my repo_squash.sh to be:

rm -rf repo_squash
git clone repo repo_squash
cd repo_squash/
git rebase --strategy theirs -i bd6a09a484b8230d0810e6689cf08a24f26f287a

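And then I just accepted the original entries as they were. That is, the "rebase" should not have changed a thing. It ended up with the same results described previously.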

Update #3:

Alternatively, if I omit the strategy and replace the last command with:

git rebase -i bd6a09a484b8230d0810e6689cf08a24f26f287a

I no longer get the "nothing to commit" rebase problems, but I'm still left with the other conflicts.

Update with a toy repository that recreates the problem:

test_squash.sh (this is the file you actually run):

#========================================================
# Initialize directories
#========================================================
rm -rf test_squash/ test_squash_clone/
mkdir -p test_squash
mkdir -p test_squash_clone
#========================================================

#========================================================
# Create repository with history
#========================================================
cd test_squash/
git init
echo "README">README
git add README
git commit -m"Initial commit: can't easily access for rebasing"
echo "Line 1">test_file.txt
git add test_file.txt
git commit -m"Created single line file"
echo "Line 2">>test_file.txt 
git add test_file.txt 
git commit -m"Meant for it to be two lines"
git checkout -b dev
echo Meaningful code>new_file.txt
git add new_file.txt 
git commit -m"Meaningful commit"
git checkout master
echo Conflicting meaningful code>new_file.txt
git add new_file.txt 
git commit -m"Conflicting meaningful commit"
# This will conflict
git merge dev
# Fixes conflict
echo Merged meaningful code>new_file.txt
git add new_file.txt
git commit -m"Merged dev with master"
cd ..

#========================================================
# Save off a clone of the repository prior to squashing
#========================================================
git clone test_squash test_squash_clone
#========================================================

#========================================================
# Do the squash
#========================================================
cd test_squash
GIT_EDITOR=../test_squash_helper.sh git rebase -i HEAD@{7}
#========================================================

#========================================================
# Show the results
#========================================================
git log
git gc
git reflog
#========================================================

test_squash_helper.sh (used by test_squash.sh):

# If the file has the phrase "pick " in it, assume it's the log file
if grep -q "pick " $1
then
  sed -e "s/pick \(.*\) \(Meant for it to be two lines\)/squash \1 \2/g" < $1 > $1.tmp
  mv $1.tmp $1
# Else, assume it's the commit message file
else
# Use our pre-canned message
  echo "Created two line file" > $1
fi

P.S.: Yes, I know some of you cringe when you see me using emacs as a fall-back editor.

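P.P.S.: We do know that we'll have to blow away all of our clones of the existing repository after the rebase. (Along the lines of "thou shalt not rebase a repository after it's been published".)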

P.P.P.S.: Can anyone tell me how to add a bounty to this? I'm not seeing the option anywhere, whether I'm in edit mode or view mode.


All right, I'm confident enough to throw out an answer. I may have to edit it, but I believe I know what your problem is.

Your toy repo test case has a merge in it - worse, it has a merge with conflicts. And you're rebasing across the merge. Without -p (which doesn't totally work with -i), the merges are ignored. This means that whatever you did in your conflict resolution isn't there when the rebase tries to cherry-pick the next commit, so its patch may not apply. (I believe this is shown as a merge conflict because git cherry-pick can apply the patch by doing a three-way merge between the original commit, the current commit, and the common ancestor.)
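Before rebasing, it can help to check whether the range you are about to rewrite crosses any merges at all. The following is a minimal sketch, not part of the original scripts: it builds a throwaway repository (the file names, branch name `side`, and commit messages are all made up for illustration) and counts the merge commits between a base commit and HEAD with `git rev-list --merges`.

```shell
set -e
# Throwaway repository; every name below is illustrative only.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > file.txt
git add file.txt
git commit -q -m "base"
base=$(git rev-parse HEAD)
trunk=$(git rev-parse --abbrev-ref HEAD)   # whatever the default branch is called

git checkout -q -b side
echo side > side.txt
git add side.txt
git commit -q -m "side work"

git checkout -q "$trunk"
echo trunk > trunk.txt
git add trunk.txt
git commit -q -m "trunk work"
git merge -q --no-ff -m "merge side" side

# Count the merge commits that a rebase onto $base would have to cross.
merge_count=$(git rev-list --merges --count "$base"..HEAD)
echo "merge commits in range: $merge_count"
```

In a real repository only the last two commands matter, with `$base` replaced by the commit you pass to git rebase -i. A non-zero count suggests the rebase will run into the situation described above.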

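Unfortunately, as we noted in the comments, -i and -p (preserve merges) don't get along very well. I know that editing/rewording works, and that reordering doesn't. However, I believe it works okay with squashes. This is not documented, but it worked for the test cases I describe below. If your case is way, way more complex, you may have a lot of trouble doing what you want, though it will still be possible. (Moral of the story: clean up with rebase -i before merging.)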

So, let's suppose we have a very simple case, where we want to squash together A, B, and C:

- o - A - B - C - X - D - E - F (master)
   \             /
    Z -----------

Now, like I said, if there were no conflicts in X, git rebase -i -p works as you'd expect.

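If there are conflicts, things get a little trickier. The squashing will go fine, but when the rebase tries to recreate the merge, the conflicts will happen again. You'll have to resolve them again, add them to the index, and then use git rebase --continue to move on. (Of course, you can resolve them again by checking out the version from the original merge commit.)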
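The "check out the version from the original merge commit" step can be sketched as follows. This is only an illustration in a throwaway repository: the branch `dev`, the file `f.txt`, and the variables `pre_merge` and `orig_merge` are made up, and a plain second `git merge` stands in for the point where the rebase stops while recreating the merge.

```shell
set -e
# Throwaway repository; every name below is illustrative only.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > f.txt
git add f.txt
git commit -q -m "base"
trunk=$(git rev-parse --abbrev-ref HEAD)

git checkout -q -b dev
echo "dev version" > f.txt
git commit -q -am "dev change"
git checkout -q "$trunk"
echo "trunk version" > f.txt
git commit -q -am "trunk change"
pre_merge=$(git rev-parse HEAD)

# The original merge conflicts and is resolved by hand.
git merge dev >/dev/null 2>&1 || true
echo "merged version" > f.txt
git add f.txt
git commit -q -m "Merged dev"
orig_merge=$(git rev-parse HEAD)

# Stand-in for the rebase recreating the merge: the same conflict appears again.
git checkout -q --detach "$pre_merge"
git merge dev >/dev/null 2>&1 || true

# Reuse the old resolution: take the file as it was in the original merge commit.
# Checking out a path from a commit also stages it, so no separate git add is needed.
git checkout "$orig_merge" -- f.txt
git commit -q -m "Merged dev (recreated)"
resolved=$(cat f.txt)
echo "$resolved"
```

During an actual rebase you would run the `git checkout <original-merge> -- <path>` step for each conflicted file and then git rebase --continue instead of committing yourself.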

If you happen to have rerere enabled in your repo (rerere.enabled set to true), this will be way easier - git will be able to reuse the recorded resolution from when you originally had the conflicts, and all you have to do is inspect it to make sure it worked right, add the files to the index, and continue. (You can even go one step farther, turning on rerere.autoupdate, and it'll add them for you, so the merge won't even fail). I'm guessing, however, that you didn't ever enable rerere, so you're going to have to do the conflict resolution yourself.*

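*Or, you could try the rerere-train.sh script from git-contrib, which attempts to "Prime [the] rerere database from existing merge commits" - basically, it checks out all the merge commits, tries to merge them, and if a merge fails, it grabs the results and shows them to git-rerere. This could be time-consuming, and I've never actually used it, but it might be very helpful.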
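To see what rerere does in practice, here is a small sketch in a throwaway repository (branch `dev`, file `f.txt`, and the variable names are made up for illustration). It enables rerere.enabled and rerere.autoupdate, resolves a conflict once, then redoes the same merge and lets git replay the recorded resolution.

```shell
set -e
# Throwaway repository; every name below is illustrative only.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git config rerere.enabled true      # record and reuse conflict resolutions
git config rerere.autoupdate true   # stage files that rerere resolved

echo base > f.txt
git add f.txt
git commit -q -m "base"
trunk=$(git rev-parse --abbrev-ref HEAD)

git checkout -q -b dev
echo "dev version" > f.txt
git commit -q -am "dev change"
git checkout -q "$trunk"
echo "trunk version" > f.txt
git commit -q -am "trunk change"
pre_merge=$(git rev-parse HEAD)

# First merge: conflict, resolved by hand; the commit records the resolution.
git merge dev >/dev/null 2>&1 || true
echo "merged version" > f.txt
git add f.txt
git commit -q -m "Merged dev"

# Redo the same merge from scratch: rerere replays the recorded resolution.
git reset -q --hard "$pre_merge"
git merge dev >/dev/null 2>&1 || true
replayed=$(cat f.txt)
unmerged=$(git ls-files -u)
echo "$replayed"
```

The catch, as noted above, is that rerere must have been enabled when the original conflict was resolved; otherwise there is nothing recorded to replay.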


Note that -X and strategy options were ignored when used in an interactive rebase.

See commit db2b3b820e2b28da268cc88adff076b396392dfe (July 2013, git 1.8.4+):

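Do not ignore merge options in interactive rebase

Merge strategy and its options can be specified in git rebase, but with --interactive, they were completely ignored.

Signed-off-by: Arnaud Fontaine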

That means -X and strategy now work with interactive rebase as well as plain rebase, and your initial script could now work better.
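One detail worth keeping in mind: `theirs` is an option (-X theirs) of the default merge strategy, not a strategy name of its own, so the spelling used here is `-X theirs` rather than `--strategy theirs`. The sketch below, in a throwaway repository with made-up names (`topic`, `f.txt`), shows an interactive rebase honouring -X theirs; `GIT_SEQUENCE_EDITOR=true` simply accepts the todo list unchanged so the example runs unattended.

```shell
set -e
# Throwaway repository; every name below is illustrative only.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > f.txt
git add f.txt
git commit -q -m "base"
trunk=$(git rev-parse --abbrev-ref HEAD)

git checkout -q -b topic
echo "topic version" > f.txt
git commit -q -am "topic change"
git checkout -q "$trunk"
echo "trunk version" > f.txt
git commit -q -am "trunk change"

# Rebase topic onto trunk. During a rebase, "theirs" refers to the commits
# being replayed, so conflicts are resolved in favour of the topic branch.
git checkout -q topic
GIT_SEQUENCE_EDITOR=true git rebase -q -i -X theirs "$trunk"
result=$(cat f.txt)
echo "$result"
```

Whether blanket conflict resolution like this is appropriate for a 400-commit history is a separate question; it silently discards one side of every conflicting hunk.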


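I had a similar requirement, namely discarding the intermediate commits of my development branch, and I found that this procedure worked for me. On my working branch: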

git reset --hard mybranch-start-commit
git checkout mybranch-end-commit .   # take only the files from the latest commit
git add -A
git commit -m "New Message intermediate commits discarded"

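We have now connected the latest commit to the start commit of the branch, with no merge conflict issues! This is the conclusion I have reached at this stage of my learning; is there a better approach for this purpose?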


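I ran into a simpler but similar issue, where I had 1) resolved a merge conflict on a local branch, 2) kept working, adding lots more little commits, and 3) wanted to rebase and hit merge conflicts.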

For me, git rebase -p -i master worked. It kept the original conflict resolution commit and allowed me to squash the others on top.
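A note for newer Git versions: the -p (--preserve-merges) option was later deprecated and eventually removed, and --rebase-merges (available since Git 2.18) is the closest replacement. The following is only a sketch under that assumption, run in a throwaway repository with made-up names and a merge that does not conflict. It squashes "second try" into "first try" while keeping the merge, using a `sed` one-liner as the sequence editor and `GIT_EDITOR=true` to accept the combined commit message.

```shell
set -e
# Throwaway repository; every name below is illustrative only.
# Assumes a Git version that supports --rebase-merges (2.18 or later).
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo README > README
git add README
git commit -q -m "base"
base=$(git rev-parse HEAD)
trunk=$(git rev-parse --abbrev-ref HEAD)

git checkout -q -b side
echo side > side.txt
git add side.txt
git commit -q -m "side work"

git checkout -q "$trunk"
echo "Line 1" > test_file.txt
git add test_file.txt
git commit -q -m "first try"
echo "Line 2" >> test_file.txt
git commit -q -am "second try"
git merge -q --no-ff -m "merge side" side

# Turn the "second try" pick into a squash; leave every other todo line alone.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/second try/s/^pick /squash /'" \
GIT_EDITOR=true \
git rebase -q -i --rebase-merges "$base"

commits=$(git rev-list --count "$base"..HEAD)
merges=$(git rev-list --merges --count "$base"..HEAD)
echo "commits: $commits, merges: $merges"
```

If the merge being recreated had conflicts, the rebase would still stop there and the conflicts would need to be resolved again before continuing.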

Hope this helps someone!


If you don't mind creating a new branch, this is how I dealt with the problem:

While on main:

# create a new branch
git checkout -b new_clean_branch

# apply all changes
git merge original_messy_branch

# forget the commits but have the changes staged for commit
git reset --soft main        

git commit -m "Squashed changes from original_messy_branch"

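Building on @hlidka's answer above, which minimises manual intervention, I wanted to add a version that preserves any new commits on master that aren't in the branch to be squashed.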

I believe these could easily be lost in the git reset step of that example.

# create a new branch 
# ...from the commit in master original_messy_branch was originally based on. eg 5654da06
git checkout -b new_clean_branch 5654da06

# apply all changes
git merge original_messy_branch

# forget the commits but have the changes staged for commit
# ...base the reset on the base commit from Master
git reset --soft 5654da06       

git commit -m "Squashed changes from original_messy_branch"

# Rebase onto HEAD of master
git rebase origin/master

# Resolve any new conflicts from the new commits
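
If the base commit (5654da06 in the example above) is not known offhand, git merge-base can find it. The sketch below replays the same steps in a throwaway repository; the file names and commit messages are made up, and a local default branch stands in for origin/master.

```shell
set -e
# Throwaway repository; every name below is illustrative only.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > base.txt
git add base.txt
git commit -q -m "base"
trunk=$(git rev-parse --abbrev-ref HEAD)

# A messy branch with several small commits.
git checkout -q -b original_messy_branch
for i in 1 2 3; do
  echo "try $i" >> work.txt
  git add work.txt
  git commit -q -m "attempt $i"
done

# Meanwhile the trunk moves on.
git checkout -q "$trunk"
echo newer > trunk.txt
git add trunk.txt
git commit -q -m "newer trunk work"

# Find the commit the messy branch was originally based on.
base=$(git merge-base "$trunk" original_messy_branch)

git checkout -q -b new_clean_branch "$base"
git merge -q original_messy_branch        # fast-forwards to the messy tip
git reset -q --soft "$base"               # drop the commits, keep changes staged
git commit -q -m "Squashed changes from original_messy_branch"
git rebase -q "$trunk"                    # replay the single commit on the new trunk

squashed=$(git rev-list --count "$trunk"..new_clean_branch)
echo "commits on top of trunk: $squashed"
```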

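If you want to create exactly one commit out of a long branch of commits, some of which are merge commits, the easiest way is to reset your branch to the point before the first commit while keeping all your changes, and then recommit them: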

git reset $(git merge-base origin/master @)
git add .
git commit

Replace origin/master with the name of the branch your branch is based on.

The add . is necessary because newly added files show up as untracked after the reset.
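
As a rough illustration of this approach, the sketch below runs it in a throwaway repository where the feature branch contains a merge from the trunk. All names (`feature`, `a.txt`, `b.txt`, the commit messages) are made up, and a local default branch stands in for origin/master.

```shell
set -e
# Throwaway repository; every name below is illustrative only.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > base.txt
git add base.txt
git commit -q -m "base"
trunk=$(git rev-parse --abbrev-ref HEAD)

git checkout -q -b feature
echo one > a.txt
git add a.txt
git commit -q -m "feature: add a.txt"

git checkout -q "$trunk"
echo t > t.txt
git add t.txt
git commit -q -m "trunk work"

# The feature branch picks up trunk changes through a merge commit.
git checkout -q feature
git merge -q --no-ff -m "merge trunk into feature" "$trunk"
echo two > b.txt
git add b.txt
git commit -q -m "feature: add b.txt"

# Move the branch back to where it meets the trunk, keeping the files on disk.
git reset -q "$(git merge-base "$trunk" @)"
git add .     # a.txt and b.txt are untracked after the reset
git commit -q -m "Feature as a single commit"

count=$(git rev-list --count "$trunk"..@)
merges=$(git rev-list --merges --count "$trunk"..@)
echo "commits: $count, merges: $merges"
```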