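I accidentally dropped a DVD rip into a website project, then carelessly ran git commit -a -m ..., and, zap, the repo was bloated by 2.2 GB. Next time I made some edits, deleted the video file, and committed everything, but the compressed file is still there in the repository, in history.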
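To confirm that a deleted file is still taking up space, you can list the largest blobs reachable from any ref. The following is only a sketch in a throwaway repository: the file names, the 1 MiB size, and the demo identity are assumptions made for illustration, not details from the question.

```shell
# Sketch: a file removed in a later commit is still stored in history.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)            # throwaway repository (hypothetical)
cd "$repo"
git init -q .

echo '<html></html>' > index.html
head -c 1048576 /dev/urandom > oops.iso    # 1 MiB stand-in for the DVD rip
git add -A
git commit -q -m 'Careless'

git rm -q oops.iso
git commit -q -m 'Remove DVD-rip'

# List the five largest blobs reachable from any ref, as "size path".
largest=$(git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
  awk '$1 == "blob" { print $2, $3 }' |
  sort -rn | head -n 5)
echo "$largest"
```

In a real repository you would run only the final pipeline; the path printed next to the largest size is the one to remove from history.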






rm path/to/your/large/file        # delete the large file


mkdir large_files                       # create directory large_files
touch .gitignore                        # create .gitignore file if needed
echo '/large_files/' >> .gitignore      # ignore directory large_files
mv path/to/your/large/file large_files/ # move the large file into the untracked directory


git add path/to/your/large/file   # add the deletion to the index
git commit -m 'delete large file' # commit the deletion


git filter-branch --force --index-filter \
  "git rm --cached --ignore-unmatch path/to/your/large/file" \
  --prune-empty --tag-name-filter cat -- --all
git push <remote> <branch>


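What you want to do is highly disruptive if you have already published the history to other developers. See “Recovering From Upstream Rebase” in the git rebase documentation for the steps that are necessary after repairing your history.

You have at least two options: git filter-branch and an interactive rebase, both explained below.

Using git filter-branch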



$ git lola --name-status
* f772d66 (HEAD, master) Login page
| A     login.html
* cb14efd Remove DVD-rip
| D     oops.iso
* ce36c98 Careless
| A     oops.iso
| A     other.html
* 5af4522 Admin page
| A     admin.html
* e738b63 Index
  A     index.html

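Note that git lola is a non-standard but highly useful alias. (See the addendum at the end of this answer for details.) The --name-status switch to git log shows the tree modifications associated with each commit.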


git filter-branch --prune-empty -d /dev/shm/scratch \
  --index-filter "git rm --cached -f --ignore-unmatch oops.iso" \
  --tag-name-filter cat -- --all


--prune-empty removes commits that become empty (i.e., do not change the tree) as a result of the filter operation. In the typical case, this option produces a cleaner history.

-d names a temporary directory that does not yet exist to use for building the filtered history. If you are running on a modern Linux distribution, specifying a tree in /dev/shm will result in faster execution.

--index-filter is the main event and runs against the index at each step in the history. You want to remove oops.iso wherever it is found, but it isn’t present in all commits. The command git rm --cached -f --ignore-unmatch oops.iso deletes the DVD-rip when it is present and does not fail otherwise.

--tag-name-filter describes how to rewrite tag names. A filter of cat is the identity operation. Your repository, like the sample above, may not have any tags, but I included this option for full generality.

-- specifies the end of options to git filter-branch.

--all following -- is shorthand for all refs. Your repository, like the sample above, may have only one ref (master), but I included this option for full generality.
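If you want to try these options before touching a real repository, here is a rough sketch that rebuilds the sample history in a temporary directory and runs the same filter. The commit_file helper and the demo identity are made up for the example, -d is left out because /dev/shm is not available everywhere, and FILTER_BRANCH_SQUELCH_WARNING only silences the deprecation notice that newer Git versions print.

```shell
# Sketch: rebuild the sample history, then filter oops.iso out of it.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

repo=$(mktemp -d)
cd "$repo"
git init -q .

commit_file() {    # hypothetical helper: create one file and commit it
  echo "$1" > "$1"
  git add "$1"
  git commit -q -m "$2"
}
commit_file index.html 'Index'
commit_file admin.html 'Admin page'
echo other > other.html
echo 'pretend DVD rip' > oops.iso
git add other.html oops.iso
git commit -q -m 'Careless'
git rm -q oops.iso
git commit -q -m 'Remove DVD-rip'
commit_file login.html 'Login page'

admin_before=$(git rev-parse HEAD~3)    # "Admin page" before the rewrite

git filter-branch --prune-empty \
  --index-filter 'git rm --cached -f --ignore-unmatch oops.iso' \
  --tag-name-filter cat -- --all >/dev/null

# Commits on the rewritten branch that still touch oops.iso.
still_there=$(git log --oneline HEAD -- oops.iso | wc -l | tr -d ' ')
subjects=$(git log --format=%s HEAD | tr '\n' '|')
admin_after=$(git rev-parse HEAD~2)     # "Admin page" after the rewrite

echo "commits touching oops.iso: $still_there"
echo "history: $subjects"
```

The "Remove DVD-rip" commit disappears because --prune-empty drops it once it no longer changes the tree, while "Admin page" keeps its original object name.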


$ git lola --name-status
* 8e0a11c (HEAD, master) Login page
| A     login.html
* e45ac59 Careless
| A     other.html
| * f772d66 (refs/original/refs/heads/master) Login page
| | A   login.html
| * cb14efd Remove DVD-rip
| | D   oops.iso
| * ce36c98 Careless
|/  A   oops.iso
|   A   other.html
* 5af4522 Admin page
| A     admin.html
* e738b63 Index
  A     index.html

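Notice that the new “Careless” commit adds only other.html and that the “Remove DVD-rip” commit is no longer on the master branch. The branch labeled refs/original/refs/heads/master contains your original commits in case you made a mistake. To remove it, follow the steps in “Checklist for Shrinking a Repository.”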

$ git update-ref -d refs/original/refs/heads/master
$ git reflog expire --expire=now --all
$ git gc --prune=now
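To check that the cleanup really frees the object, you can ask Git whether the blob still exists before and after. This is a sketch in a scratch repository with a made-up file and demo identity; the for-each-ref loop is simply a generalization that deletes every backup ref under refs/original/ rather than only the one for master.

```shell
# Sketch: the blob survives filter-branch until backup refs and reflogs are gone.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo index > index.html
git add index.html
git commit -q -m 'Index'
echo 'pretend DVD rip' > oops.iso
git add oops.iso
git commit -q -m 'Careless'
blob=$(git rev-parse HEAD:oops.iso)     # object name of the unwanted blob
git rm -q oops.iso
git commit -q -m 'Remove DVD-rip'

git filter-branch --prune-empty \
  --index-filter 'git rm --cached -f --ignore-unmatch oops.iso' \
  -- --all >/dev/null

# refs/original and the reflog still keep the old commits (and the blob) alive.
git cat-file -e "$blob" && before=present

# Delete every backup ref, expire the reflog, then prune unreachable objects.
git for-each-ref --format='%(refname)' refs/original/ |
  while read -r ref; do git update-ref -d "$ref"; done
git reflog expire --expire=now --all
git gc --quiet --prune=now

if git cat-file -e "$blob" 2>/dev/null; then after=present; else after=gone; fi
echo "before cleanup: $before, after cleanup: $after"
```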


$ cd ~/src
$ mv repo repo.old
$ git clone file:///home/user/src/repo.old repo



$ git lola --name-status
* 8e0a11c (HEAD, master) Login page
| A     login.html
* e45ac59 Careless
| A     other.html
* 5af4522 Admin page
| A     admin.html
* e738b63 Index
  A     index.html

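The SHA1 object names for the first two commits (“Index” and “Admin page”) stayed the same because the filter operation did not modify those commits. “Careless” lost oops.iso and “Login page” got a new parent, so their SHA1s did change.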
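The reason a new parent changes the SHA1 is that a commit object records its parent's name as part of its own content. A small sketch, using a scratch repository and demo identity invented for the example, prints a commit object so you can see the tree and parent lines:

```shell
# Sketch: a commit object names its tree and its parent, so changing
# either one produces a different commit ID.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo index > index.html
git add index.html
git commit -q -m 'Index'
echo admin > admin.html
git add admin.html
git commit -q -m 'Admin page'

# Raw content of the newest commit: tree, parent, author, committer, message.
commit_text=$(git cat-file -p HEAD)
echo "$commit_text"

# The recorded parent is exactly the previous commit's ID.
recorded_parent=$(echo "$commit_text" | awk '$1 == "parent" { print $2 }')
actual_parent=$(git rev-parse HEAD~1)
```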



$ git lola --name-status
* f772d66 (HEAD, master) Login page
| A     login.html
* cb14efd Remove DVD-rip
| D     oops.iso
* ce36c98 Careless
| A     oops.iso
| A     other.html
* 5af4522 Admin page
| A     admin.html
* e738b63 Index
  A     index.html


Running $ git rebase -i 5af4522 starts an editor with the following contents.

pick ce36c98 Careless
pick cb14efd Remove DVD-rip
pick f772d66 Login page

# Rebase 5af4522..f772d66 onto 5af4522
# Commands:
#  p, pick = use commit
#  r, reword = use commit, but edit the commit message
#  e, edit = use commit, but stop for amending
#  s, squash = use commit, but meld into previous commit
#  f, fixup = like "squash", but discard this commit's log message
#  x, exec = run command (the rest of the line) using shell
# If you remove a line here THAT COMMIT WILL BE LOST.
# However, if you remove everything, the rebase will be aborted.


edit ce36c98 Careless
pick f772d66 Login page

# Rebase 5af4522..f772d66 onto 5af4522
# ...

That is, we delete the line with “Remove DVD-rip” and change the operation on “Careless” to be edit rather than pick.


Stopped at ce36c98... Careless
You can amend the commit now, with

        git commit --amend

Once you are satisfied with your changes, run

        git rebase --continue


$ git rm --cached oops.iso
$ git commit --amend -C HEAD
$ git rebase --continue

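The first command removes the offending file from the index. The second amends “Careless” to match the updated index, and -C HEAD instructs git to reuse the old commit message. Finally, git rebase --continue goes ahead with the rest of the rebase operation.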
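The same rebase can be driven without typing in an editor by letting sed edit the todo list through GIT_SEQUENCE_EDITOR. This is only a sketch under a few assumptions: a scratch repository with a demo identity, a sed that accepts -i.bak, and commit subjects that are unique enough to match by text.

```shell
# Sketch: scripted version of the interactive rebase.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_EDITOR=true       # never open an interactive editor in this demo

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo index > index.html
git add index.html
git commit -q -m 'Index'
echo admin > admin.html
git add admin.html
git commit -q -m 'Admin page'
base=$(git rev-parse HEAD)   # plays the role of 5af4522
echo other > other.html
echo 'pretend DVD rip' > oops.iso
git add other.html oops.iso
git commit -q -m 'Careless'
git rm -q oops.iso
git commit -q -m 'Remove DVD-rip'
echo login > login.html
git add login.html
git commit -q -m 'Login page'

# Drop the "Remove DVD-rip" line and change "pick" to "edit" for "Careless".
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/Remove DVD-rip/d' -e '/Careless/s/^pick/edit/'" \
  git rebase -i "$base" >/dev/null 2>&1

# The rebase is now stopped at "Careless": fix it up and carry on.
git rm -q --cached oops.iso
git commit -q --amend -C HEAD
git rebase --continue >/dev/null 2>&1

subjects=$(git log --format=%s | tr '\n' '|')
touching=$(git log --oneline -- oops.iso | wc -l | tr -d ' ')
echo "history: $subjects"
echo "commits touching oops.iso: $touching"
```

Because git rm --cached only updates the index, oops.iso is left behind as an untracked file in the working tree; delete it by hand once the rebase has finished.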


$ git lola --name-status
* 93174be (HEAD, master) Login page
| A     login.html
* a570198 Careless
| A     other.html
* 5af4522 Admin page
| A     admin.html
* e738b63 Index
  A     index.html


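Addendum: Enable git lola via ~/.gitconfig


The best tip I learned at Scott Chacon's talk at linux.conf.au 2010, on advanced Git tips and tricks, was this alias:

lol = log --graph --decorate --pretty=oneline --abbrev-commit

This provides a really nice graph of your tree, showing the branch structure of merges etc. Of course there are really nice GUI tools for showing such graphs, but the advantage of git lol is that it works on a console or over ssh, so it is useful for remote development, or native development on an embedded board ...

So, just copy the following into ~/.gitconfig for your full color git lola action:

[alias]
        lol = log --graph --decorate --pretty=oneline --abbrev-commit
        lola = log --graph --decorate --pretty=oneline --abbrev-commit --all
[color]
        branch = auto
        diff = auto
        interactive = auto
        status = auto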
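If you would rather not edit the file by hand, git config can write the same settings. The sketch below writes them to a scratch file, which is an assumption made purely so the example does not touch your real configuration; swapping --file "$cfg" for --global should put them in ~/.gitconfig.

```shell
# Sketch: create the lol/lola aliases and color settings with git config.
set -eu
cfg=$(mktemp)    # hypothetical scratch config file

git config --file "$cfg" alias.lol \
  'log --graph --decorate --pretty=oneline --abbrev-commit'
git config --file "$cfg" alias.lola \
  'log --graph --decorate --pretty=oneline --abbrev-commit --all'

for key in branch diff interactive status; do
  git config --file "$cfg" "color.$key" auto
done

lola=$(git config --file "$cfg" --get alias.lola)
cat "$cfg"
```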


git filter-branch --tree-filter 'rm -f DVD-rip' HEAD





$ git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch YOURFILENAME" HEAD
$ rm -rf .git/refs/original/ 
$ git reflog expire --all 
$ git gc --aggressive --prune
$ git push origin master --force

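git filter-branch --tree-filter 'rm -f path/to/file' HEAD worked pretty well for me, although I ran into the same problem as described here, which I solved by following this suggestion.


I basically did what was in this answer: https://stackoverflow.com/a/11032521/1286423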


$ git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch YOURFILENAME" HEAD
$ rm -rf .git/refs/original/ 
$ git reflog expire --all 
$ git gc --aggressive --prune
$ git push origin master --force

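It didn't work, because I like to rename and move things a lot. Some of the big files were in folders that had been renamed, and I think gc could not drop them because tree objects still held references pointing to those files. My ultimate solution to really kill them was: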

# First, apply the commands from the answer linked above,
# and before doing the gc --prune --aggressive, do:

# Go back to the very first commit of the repository
git checkout -b newinit <sha1 of first commit>
# Create a parallel initial commit
git commit --amend
# Go back to the master branch, which still has the big files
# referenced in its history, even though
# we thought we had removed them.
git checkout master
# Rebase onto the newinit branch created earlier. Reapplying the patches
# makes the history really forget the references to the hidden big files.
git rebase newinit

# Do the previous part (checkout + rebase) for each branch
# still connected to the original initial commit, 
# so we remove all the references.

# Remove the .git/logs folder, also containing references
# to commits that could make git gc not remove them.
rm -rf .git/logs/

# Then you can do a garbage collection,
# and the hidden files really will get gc'ed
git gc --prune --aggressive

My repo (the .git directory) went from 32MB to 388KB, which filter-branch alone had not been able to achieve.
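If renamed folders are the reason a path-based filter misses files, one thing that may help is listing every path a given blob has ever lived under before running the filter. This is a sketch rather than part of the approach above: it relies on the --find-object option of git log (Git 2.16 or newer), and the repository layout, file names, and demo identity are invented for the example.

```shell
# Sketch: find all historical paths of one blob, including renamed folders.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir assets
echo 'pretend big file' > assets/big.bin
git add assets
git commit -q -m 'Add asset'
blob=$(git rev-parse HEAD:assets/big.bin)   # object name of the big file

git mv assets media
git commit -q -m 'Rename folder'
git rm -q -r media
git commit -q -m 'Remove big file'

# --no-renames reports a rename as delete + add, so both paths are listed.
paths=$(git log --all --no-renames --find-object="$blob" \
          --format= --name-only | sed '/^$/d' | sort -u)
echo "$paths"
```

Each listed path can then be handed to git rm --cached --ignore-unmatch inside the index filter, so the file is removed under its old names as well.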