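My code base is currently about 200 MB in total.

But my .git folder is an amazing 5 GB (!). Since I push my work to an external server, I don't need any big local history...

How can I shrink the .git folder to free up some space on my notebook? Can I delete all changes older than 30 days?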


Current answer

Here is what Linus, the creator of Git, has to say about how to shrink your Git repo:

The equivalent of "git gc --aggressive" - but done *properly* - is to do (overnight) something like git repack -a -d --depth=250 --window=250 where that depth thing is just about how deep the delta chains can be (make them longer for old history - it's worth the space overhead), and the window thing is about how big an object window we want each delta candidate to scan. And here, you might well want to add the "-f" flag (which is the "drop all old deltas", since you now are actually trying to make sure that this one actually finds good candidates.

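Source: http://gcc.gnu.org/ml/gcc/2007-12/msg00165.html

Will this get rid of binary data that is orphaned in my repo? No: "git repack" will not get rid of images or binary data that you checked into your repo and later deleted. To remove that kind of data permanently, you have to rewrite your history. A common example is accidentally checking your passwords into git. You can go back and delete the files, but you then have to rewrite the history from that point up to now and force-push the rewritten repo to your origin.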
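Before rewriting history, it helps to know which blobs are actually taking up the space. The following is a minimal sketch and not part of the original answer: `largest_blobs` is a hypothetical helper name, and it simply combines `git rev-list --objects --all` with `git cat-file --batch-check` to list the biggest files ever committed, including ones that have since been deleted from the working tree.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the N largest blobs found anywhere in history,
# as "<size in bytes> <path>", biggest first. Default N is 10.
largest_blobs() {
    local count="${1:-10}"
    # rev-list prints "<sha> <path>" for every object reachable from any ref;
    # cat-file adds the object type and size and echoes the path via %(rest).
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        grep '^blob ' |
        cut -d' ' -f2- |
        sort -rn |
        head -n "$count"
}
```

Running `largest_blobs 20` inside a repo gives a list of candidates; whether any of them is worth a history rewrite is still a judgment call.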

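Other answers

How to shrink your .git folder in your git repo

Summary

Do these in this order, from least dangerous and/or most effective and/or fastest to more dangerous and/or least effective and/or slowest.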

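Note that the git lfs lines apply only if you have git lfs installed. Google it and you'll see that it is a third-party, stand-alone application. If you don't have git lfs installed, just ignore those lines.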

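These test results are for a repo where `du -hs --exclude=.git .` shows that the total repo size, NOT including the .git dir, is about 80 GB, and `du -hs .git` shows that the .git folder alone started at about 162 GB: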

#                                                                   Memory Saved
#                                               Time it took        in .git dir
#                                               ------------        ------------
time git lfs prune                              #  1~60 min          62 GB
time git gc                                     #  3 min            < 1 GB
time git prune                                  #  1 min            < 1 GB
time git repack -a -d --depth=250 --window=250  #  2 min            < 1 GB
# (Note: `--prune` does nothing extra here; `man git gc` says 
# `--prune is on by default`)
time git gc --aggressive --prune                #  1.25 hrs         < 1 GB

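As you can see, the last command takes a very long time for very little benefit, so don't even run it!

Also, an alternative to running git lfs prune is to manually delete the entire .git/lfs directory and then re-fetch the lfs (Git Large File Storage) contents from scratch. WARNING: do NOT accidentally delete the entire .git dir! You would lose all of your git history, branches, and commits! Delete only the .git/lfs directory. Something like this might work: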

# 1. Delete the whole git lfs directory
rm -rf .git/lfs


# 2. Re-fetch the git lfs contents again from scratch.
# See my answer here: https://stackoverflow.com/a/72610495/4561887

# Option 1 (recommended): fetch (to the ".git/lfs" dir) AND check out just the
# git lfs files for just the one branch or commit you currently have
# checked-out. 
# - this might download ~20 GB of data on a large corporate mono-repo
git lfs pull
# OR do this (these two commands do the exact same thing as `git lfs pull`)
git lfs fetch
git lfs checkout

# Option 2: fetch (to the ".git/lfs" dir) ALL git lfs files for ALL branches on
# the remote
# - this might download ~1000 GB of data on the same large corporate mono-repo
#   as above
git lfs fetch --all
# Also check out, or "activate" the git lfs files for your currently-checked-out
# branch or commit, by updating all file placeholders or pointers in your
# active filesystem for the current branch with the actual files these git lfs
# placeholders point to.
git lfs checkout

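For details on the git lfs commands shown above, see my other answer: How to use git lfs as a basic user: What is the difference between git lfs fetch, git lfs fetch --all, git lfs pull, and git lfs checkout?

Details

First off, you need to know what is taking up so much space in your .git folder. One technique is to run the ncurses-based (GUI-like) ncdu (NCurses Disk Usage) command inside your repo. Another way is to run this: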

du -h --max-depth=1 .git

Side note: to see how big your repo is, NOT including your .git folder, run this instead:

du -h --max-depth=1 --exclude=.git .

Sample output of the first command above:

$ du -h --max-depth=1 .git
158G    .git/lfs
6.2M    .git/refs
4.0K    .git/branches
2.5M    .git/info
3.7G    .git/objects
6.2M    .git/logs
68K     .git/hooks
162G    .git

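As you can see, my .git folder is 162 GB in total, but 158 GB of that is my .git/lfs folder, since I use the third-party "Git Large File Storage" (git lfs) tool to store large binary files. So, run this to reduce that significantly. Note: the `time` part of all commands below is optional: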

time git lfs prune

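(If git lfs prune fails with "panic: runtime error: invalid memory address or nil pointer dereference", see my notes below.)

Source: How to shrink a git LFS repo
Official documentation: git-lfs-prune(1) - Delete old LFS files from local storage

This took 60 seconds to run!

Now I've just freed up 62 GB! My .git/lfs folder is now only 96 GB, as shown here: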

$ du -h --max-depth=1 .git
96G     .git/lfs
6.2M    .git/refs
4.0K    .git/branches
2.5M    .git/info
3.0G    .git/objects
6.2M    .git/logs
68K     .git/hooks
99G     .git

Next, run this to shrink your .git/objects folder by a few hundred MB up to about 1 GB:

time git gc
time git prune

git gc takes about 3 minutes, and git prune takes about 1 minute.

Check your disk usage again with `du -h --max-depth=1 .git`. If you'd like to save even more space, run this:

time git repack -a -d --depth=250 --window=250

This takes about 2 minutes and saves a few hundred more MB.

Now, you can stop here, or you can run this last command:

time git gc --aggressive --prune

This last command will save a few hundred more MB but takes about 1.25 hours.
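The du checks between steps can be automated. This is a minimal sketch, assuming a POSIX `du` and a Bash-compatible shell; `git_dir_kb` and `report_step` are hypothetical helper names introduced here, not git commands.

```shell
#!/usr/bin/env bash
# Size of the current repo's .git directory in KiB (du -k is POSIX).
git_dir_kb() {
    du -sk "$(git rev-parse --git-dir)" | cut -f1
}

# Hypothetical helper: run any command, then report how much the .git
# directory shrank. Returns the command's exit status if it fails.
report_step() {
    local before after
    before=$(git_dir_kb)
    "$@" || return
    after=$(git_dir_kb)
    echo "$*: ${before} KiB -> ${after} KiB (saved $((before - after)) KiB)"
}

# Usage, in the order recommended in this answer:
#   report_step git gc
#   report_step git prune
#   report_step git repack -a -d --depth=250 --window=250
```

Seeing the saved amount after each step makes it easier to decide when to stop, for example before committing to the slow `git gc --aggressive` run.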

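If git lfs prune fails with "panic: runtime error: invalid memory address or nil pointer dereference"

If git lfs prune fails with:

panic: runtime error: invalid memory address or nil pointer dereference

then you may have an old version of git-lfs installed and need to update it. Here is how:

First, check which version you have installed. Run man git-lfs and scroll to the bottom to see the date. Perhaps it says 2017, for example. Now, update your version with these commands. The first command comes from here: https://packagecloud.io/github/git-lfs/install.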

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt update
sudo apt install git-lfs

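Run man git-lfs again and scroll to the bottom. I now see my date as "March 2021", whereas before it was some date in 2017.

Also, if I run sudo apt install git-lfs again, it tells me:

git-lfs is already the newest version (2.13.3).

So, the update of git-lfs worked, the error is now gone, and git lfs prune works again!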
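A quicker check than reading the man page date is to ask the binary for its version directly. This is a small sketch: `lfs_version` is a hypothetical wrapper name, and the exact format of the version string varies between releases.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: print the installed git-lfs version, or a hint
# if the git-lfs binary is not on the PATH.
lfs_version() {
    if command -v git-lfs >/dev/null 2>&1; then
        git-lfs version
    else
        echo "git-lfs is not installed"
    fi
}

lfs_version
```

Running it before and after the apt upgrade shows whether the upgrade actually took effect.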

I first documented this in a comment on GitHub here: https://github.com/git-lfs/git-lfs/issues/3395#issuecomment-889393444.

References:

How to shrink the .git folder
git lfs prune: How to shrink a git LFS repo
Linus Torvalds on git repack -a -d --depth=250 --window=250: https://gcc.gnu.org/legacy-ml/gcc/2007-12/msg00165.html
https://github.com/git-lfs/git-lfs/blob/main/docs/man/git-lfs-prune.1.ronn

See also:

My answer: Unix & Linux: All about finding, filtering, and sorting with find, based on file size - see the example near the end, titled "(Figure out which file extensions to add to git lfs next)".
Other really useful git lfs info:
Great article!: my developer planet: Git LFS: Why and how to use
https://git-lfs.github.com/
My repo and notes: https://github.com/ElectricRCAircraftGuy/eRCaGuy_dotfiles#how-to-clone-this-repo-and-all-git-submodules
***** [my Q&A] How to use git lfs as a basic user: What is the difference between git lfs fetch, git lfs fetch --all, git lfs pull, and git lfs checkout?
[my Q&A] How to resume `git lfs post-checkout` hook after failed `git checkout`
Note: for pure synchronization, try FreeFileSync or rsync, as I explain in my answer here. That being said, occasionally I use git for synchronization too, as I explain for my sync_git_repo_from_pc1_to_pc2.sh tool here, and in my other answer here: Work on a remote project with Eclipse via SSH.

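I tried the approaches above and nothing worked in my case (I had accidentally killed the git process during a git push), so in the end I had to delete the repo and clone it again. Now the .git folder is a normal size.

You should not delete all changes older than 30 days (I think it is somehow possible by exploiting Git, but it is really not recommended).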
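For completeness, one way this might be done is a shallow re-clone, which is not necessarily what the author had in mind. The sketch below is an assumption on my part: `shallow_reclone` is a hypothetical helper name, and it relies on the `--shallow-since` option of `git clone`. A shallow clone limits some operations, so treat it as an illustration rather than a recommendation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: clone a repo keeping only commits newer than a cutoff.
#   $1 = remote URL (use file:// for a local source, otherwise git ignores
#        the shallow options for plain local paths)
#   $2 = destination directory
#   $3 = cutoff date understood by git, default "30 days ago"
shallow_reclone() {
    local url="$1" dest="$2" since="${3:-30 days ago}"
    git clone --shallow-since="$since" "$url" "$dest"
}

# Usage (the URL is a placeholder):
#   shallow_reclone https://example.com/project.git project-shallow
```

The older history stays on the server, so it can still be fetched later with `git fetch --unshallow` if it turns out to be needed.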

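You can call git gc --aggressive --prune, which will perform garbage collection in your repository and prune old objects. Do you have a lot of binary files (archives, images, executables) that change often? Those usually lead to huge .git folders (remember, Git stores a snapshot for each revision, and binary files compress badly).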
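To see whether garbage collection made a difference, the object store can be inspected before and after with `git count-objects`. This is a minimal sketch; `object_store_summary` is a hypothetical wrapper name, and the `-H` flag only switches the sizes to human-readable units.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: summarize loose and packed object usage for a repo.
#   $1 = path to the repo, default is the current directory
object_store_summary() {
    git -C "${1:-.}" count-objects -vH
}

# Usage:
#   object_store_summary          # note the "size-pack" line
#   git gc --aggressive --prune
#   object_store_summary          # compare "size-pack" again
```

If the `size-pack` value barely changes, the space is most likely held by large blobs that are still reachable from history, which garbage collection cannot remove.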

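5 GB vs 200 MB is kind of weird. Try running git gc.

But no, unless you split your repository into modules, you can't decrease the size of the .git directory.

Each clone of a git repo is a full-fledged repository that can act as a server. That is the basic principle of distributed version control.

Shrink a Git repository by removing some of the log history from the .git folder, based on when files were last updated.

I faced the same issue on my local machine. The reason was that I had deleted some massive files locally and committed that to the central repository. But even after git status, git fetch, and git pull, my .git folder was still about 3 GB. I then ran the following command to reduce the size of the .git folder by expiring entries that were changed more than a month ago.

Command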

$ git remote prune origin && git repack && git prune-packed && git reflog expire --expire=1.month.ago && git gc --aggressive

Git commands and their short descriptions:

git-prune - Prune all unreachable objects from the object database
git-repack - Pack unpacked objects in a repository
git-prune-packed - Remove extra objects that are already in pack files.
git reflog: Git keeps track of updates to the tip of branches using a mechanism called reference logs, or "reflogs." Reflogs track when Git refs were updated in the local repository. In addition to branch tip reflogs, a special reflog is maintained for the Git stash. Reflogs are stored in directories under the local repository's .git directory. git reflog directories can be found at .git/logs/refs/heads/., .git/logs/HEAD, and also .git/logs/refs/stash if the git stash has been used on the repo. git reflog at a high level on the Rewriting History Page.
git reflog expire --expire=now --expire-unreachable=now --all In addition to preserving history in the reflog, Git has internal expiration dates on when it will prune detached commits. Again, these are all implementation details that git gc handles and git prune should not be used standalone.
git gc --aggressive: git-gc - Cleanup unnecessary files and optimize the local repository. Behind the scenes git gc actually executes a bundle of other internal subcommands like git prune, git repack, git pack and git rerere. The high-level responsibility of these commands is to identify any Git objects that are outside the threshold levels set from the git gc configuration. Once identified, these objects are then compressed, or pruned accordingly.
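The reflog and pruning steps described here can be tried on a throwaway repository first. This is a minimal sketch, with `drop_unreachable_now` as a hypothetical helper name. It passes `--all` because, as far as I can tell, git reflog expire only processes the refs it is given, and it uses `--expire=now` instead of a one-month cutoff, so every recovery point is removed immediately.

```shell
#!/usr/bin/env bash
# Hypothetical helper: expire every reflog entry, then delete all objects
# that are no longer reachable from any branch, tag, or the index.
# WARNING: after this, commits that were dropped by a reset, rebase, or
# branch deletion can no longer be recovered from the local repo.
drop_unreachable_now() {
    git reflog expire --expire=now --expire-unreachable=now --all &&
        git gc --prune=now --quiet
}

# Usage: run inside the repository you want to clean.
#   drop_unreachable_now
```

This only frees space held by unreachable objects. Files that are still part of some commit on a branch or tag stay in the repository no matter how often it is run.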

Typical output:

$ git remote prune origin && git repack && git prune-packed && git reflog expire --expire=1.month.ago && git gc --aggressive
Enumerating objects: 535, done.
Counting objects: 100% (340/340), done.
Delta compression using up to 2 threads
Compressing objects: 100% (263/263), done.
Writing objects: 100% (340/340), done.
Total 340 (delta 104), reused 0 (delta 0)
Enumerating objects: 904, done.
Counting objects: 100% (904/904), done.
Delta compression using up to 2 threads
Compressing objects: 100% (771/771), done.
Writing objects: 100% (904/904), done.
Total 904 (delta 343), reused 561 (delta 0)