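I have a project with a Git submodule. It comes from an ssh://... URL and is on commit A. Commit B has been pushed to that URL, and I want the submodule to retrieve that commit and switch to it.

Now, my understanding is that git submodule update should do this, but it doesn't. It doesn't do anything (no output, success exit code). Here's an example: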

$ mkdir foo
$ cd foo
$ git init .
Initialized empty Git repository in /.../foo/.git/
$ git submodule add ssh://user@host/git/mod mod
Cloning into mod...
user@host's password: hunter2
remote: Counting objects: 131, done.
remote: Compressing objects: 100% (115/115), done.
remote: Total 131 (delta 54), reused 0 (delta 0)
Receiving objects: 100% (131/131), 16.16 KiB, done.
Resolving deltas: 100% (54/54), done.
$ git commit -m "Hello world."
[master (root-commit) 565b235] Hello world.
 2 files changed, 4 insertions(+), 0 deletions(-)
 create mode 100644 .gitmodules
 create mode 160000 mod
# At this point, ssh://user@host/git/mod changes; submodule needs to change too.
$ git submodule init
Submodule 'mod' (ssh://user@host/git/mod) registered for path 'mod'
$ git submodule update
$ git submodule sync
Synchronizing submodule url for 'mod'
$ git submodule update
$ man git-submodule 
$ git submodule update --rebase
$ git submodule update
$ echo $?
0
$ git status
# On branch master
nothing to commit (working directory clean)
$ git submodule update mod
$ ...

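I've also tried git fetch mod, which appears to do a fetch (but can't possibly, because it doesn't prompt for a password!), yet git log and git show deny the existence of new commits. So far I've just been rm-ing the module and re-adding it, but this is both wrong in principle and tedious in practice.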


Current answer

How to update all git submodules in a repo (two ways to do two very different things!)

Quick summary

# Option 1: as a **user** of the outer repo, pull the latest changes of the
# sub-repos as previously specified (pointed to as commit hashes) by developers
# of this outer repo.
# - This recursively updates all git submodules to their commit hash pointers as
#   currently committed in the outer repo.
git submodule update --init --recursive

# Option 2. As a **developer** of the outer repo, update all subrepos to force
# them each to pull the latest changes from their respective upstreams (ex: via
# `git pull origin main` or `git pull origin master`, or similar, for each
# sub-repo). 
git submodule update --init --recursive --remote

# now add and commit these subrepo changes
git add -A
git commit -m "Update all subrepos to their latest upstream changes"

Details

Option 1: as a user of the outer repo, trying to get all submodules into the state intended by the developers of the outer repo:

git submodule update --init --recursive

Option 2: as a developer of the outer repo, trying to update all submodules to the latest commit pushed to the default branch of each of their remote repos (i.e. update all subrepos to the latest state intended by the developers of each subrepo):

git submodule update --init --recursive --remote

...in place of using git submodule foreach --recursive git pull origin master or git submodule foreach --recursive git pull origin main.

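In my opinion, the two options above are the best answer, and neither needs the --merge or --force options I see in some other answers.

Explanation of the options above: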

--init initializes the submodule in case you just cloned the repo and haven't done that yet.

--recursive does this for submodules within submodules, recursively down forever.

--remote says to update the submodule to the latest commit on the default branch on the default remote for the submodule. It is like doing git pull origin master or git pull origin main in most cases, for example, for each submodule. If you want to update to the commit specified by the outer-most repo (super repo) instead, leave --remote off.
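To make the difference between the two options concrete, here is a minimal throwaway sketch. Everything in it is an assumption made for illustration: the repositories live in a temporary directory, the names upstream, outer and mod are invented, and the g wrapper only exists to supply a commit identity and to allow local-path submodule clones, which recent Git releases refuse by default.

```shell
#!/bin/sh
set -eu

demo=$(mktemp -d)   # scratch area; all repo names below are hypothetical

# Wrapper: fixed commit identity, and permission for local-path submodule
# clones, which recent Git releases refuse by default.
g() {
  git -c user.name=demo -c user.email=demo@example.invalid \
      -c protocol.file.allow=always "$@"
}

# An upstream repository holding commit A.
g init -q "$demo/upstream"
g -C "$demo/upstream" commit -q --allow-empty -m "A"

# An outer repo that records commit A as the submodule "mod".
g init -q "$demo/outer"
g -C "$demo/outer" submodule --quiet add "$demo/upstream" mod
g -C "$demo/outer" commit -q -m "Add mod at commit A"

# Upstream moves on to commit B.
g -C "$demo/upstream" commit -q --allow-empty -m "B"

# Option 1: check out what the outer repo recorded.
g -C "$demo/outer" submodule --quiet update --init --recursive
after_plain=$(git -C "$demo/outer/mod" log -1 --format=%s)

# Option 2: fetch and check out the tip of the submodule's upstream branch.
g -C "$demo/outer" submodule --quiet update --init --recursive --remote
after_remote=$(git -C "$demo/outer/mod" log -1 --format=%s)

echo "without --remote: $after_plain"
echo "with --remote:    $after_remote"

rm -rf "$demo"
```

On a recent Git, the submodule should still be on commit A after the plain update and on commit B after the --remote update. In a real project the outer repo would then show mod as modified, and the new pointer still has to be added and committed.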

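git submodule foreach --recursive git pull (don't use this; it frequently fails) vs git submodule update --recursive --remote (use this! it always works)

I left the following comments under this answer. I think they are important, so I'm putting them in my answer too.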

Basically, for some situations, git submodule foreach --recursive git pull might work. For others, git submodule foreach --recursive git pull origin master might be what you need instead. For others, git submodule foreach --recursive git pull origin main might be what you need. And for others still, none of those might work! You might need git submodule foreach --recursive git pull upstream develop, for instance. OR, even worse, there might not be any git submodule foreach command which works for your outer repo, as each submodule might require a different command to update itself from its default remote and default branch. In all cases I can find, however, this does work, including for all cases you might use one of the several git submodule foreach commands I just presented above. So, use this instead:

git submodule update --recursive --remote

Anyway, here are my comments under that answer:

(1/4) @DavidZ, a lot of people think that git submodule foreach git pull and git submodule update --remote are the same thing, with the latter simply being the newer command. They aren't the same thing, however. git submodule foreach git pull will fail under multiple circumstances for which git submodule update --remote works just fine! If your submodule points to a commit hash that doesn't have a branch pointing to it, which is frequently the case in real-life development where you want a particular version of the submodule for your outer repo, then that submodule...

(2/4) ...is in a detached HEAD state. In this case, git submodule foreach git pull fails to run git pull on that submodule since a detached HEAD cannot have an upstream branch. git submodule update --remote, however, works just fine! It appears to call git pull origin main on that submodule if origin is the default remote and main is the default branch on that default remote, or git pull origin master, for instance, if origin is the default remote but master is the default branch.

(3/4) Furthermore, git submodule foreach git pull origin master will even fail in many cases where git submodule update --remote works just fine, since many submodules use master as the default branch, and many other submodules use main as the default branch since GitHub changed from master to main recently in order to get away from terms related to slavery in the United States ("master" and "slave").

(4/4) So, I added the explicit remote and branch to make it more clear that they are frequently needed, and to remind people that git pull is frequently not enough, and git pull origin master may not work, and git pull origin main may work when the former doesn't, but also may not even work, and that none of them by themselves are the same as git submodule update --remote, since that latter command is smart enough to just do git pull <default_remote> <default_branch> for you for each submodule, apparently adjusting the remote and branch as necessary for each submodule.
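Since the comments above hinge on whether a submodule is in a detached HEAD state, a small check can help when diagnosing a failing git pull. This is only a sketch: the function name is_detached is made up here, and it relies on git symbolic-ref -q exiting with status 1, silently, when HEAD is not a symbolic ref.

```shell
# is_detached DIR: succeeds when the repository at DIR has a detached HEAD.
# (Hypothetical helper name, introduced for this sketch.)
is_detached() {
  # Attached HEAD: symbolic-ref succeeds, so report "not detached".
  git -C "$1" symbolic-ref -q HEAD >/dev/null 2>&1 && return 1
  # Exit status 1 means "HEAD is not a symbolic ref", i.e. detached.
  # Any other status (for example, DIR is not a repository) is not treated
  # as detached.
  [ "$?" -eq 1 ]
}
```

The same idea can be run across all submodules in one go, for example with git submodule foreach 'git symbolic-ref -q --short HEAD || echo "(detached HEAD)"', which prints the branch name or a marker under each "Entering ..." line.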

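Related, and other research

How to find the main branch of a repo: https://stackoverflow.com/a/49384283/4561887

How to update each subrepo by running a custom command via git submodule foreach <cmd>: https://stackoverflow.com/a/45744725/4561887

man git submodule, then search for foreach, --remote, etc.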

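Other answers

It seems like two different scenarios are being mixed together in this discussion:

Scenario 1

Using my parent repository's pointers to submodules, I want to check out the commit in each submodule that the parent repository is pointing to, possibly after first iterating through all submodules and updating/pulling these from the remote.

This is, as pointed out, done with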

git submodule foreach git pull origin BRANCH
git submodule update

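Scenario 2, which I think is what OP is aiming at

New stuff has happened in one or more submodules, and I want to 1) pull these changes and 2) update the parent repository to point to the HEAD (latest) commit of this/these submodules.

This would be done by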

git submodule foreach git pull origin BRANCH
git add module_1_name
git add module_2_name
......
git add module_n_name
git commit -m "Update submodule pointers"
git push origin BRANCH

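Not very practical, since you would have to hardcode n paths to all n submodules in, e.g., a script that updates the parent repository's commit pointers.

It would be cool to have an automated iteration through each submodule, updating the parent repository's pointer (using git add) to point to the head of the submodule(s).

For this, I made this small Bash script: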

git-update-submodules.sh

#!/bin/bash

APP_PATH=$1
shift

if [ -z "$APP_PATH" ]; then
  echo "Missing 1st argument: should be path to folder of a git repo";
  exit 1;
fi

BRANCH=$1
shift

if [ -z "$BRANCH" ]; then
  echo "Missing 2nd argument (branch name)";
  exit 1;
fi

echo "Working in: $APP_PATH"
cd "$APP_PATH" || exit 1

git checkout $BRANCH && git pull --ff origin $BRANCH

git submodule sync
git submodule init
git submodule update
git submodule foreach "(git checkout $BRANCH && git pull --ff origin $BRANCH && git push origin $BRANCH) || true"

for i in $(git submodule foreach --quiet 'echo $path')
do
  echo "Adding $i to root repo"
  git add "$i"
done

git commit -m "Updated $BRANCH branch of deployment repo to point to latest head of submodules"
git push origin $BRANCH

To run it, execute

git-update-submodules.sh /path/to/base/repo BRANCH_NAME

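Elaboration

First of all, I assume that a branch named $BRANCH (the second argument) exists in all repositories. Feel free to make this more elaborate.

The first couple of sections check that the arguments are present. Then I pull the parent repository's latest changes (I prefer to use --ff (fast-forward) whenever I am just pulling; I have rebase turned off, by the way).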

git checkout $BRANCH && git pull --ff origin $BRANCH

Then some submodule initialization might be necessary, if new submodules have been added or are not initialized yet:

git submodule sync
git submodule init
git submodule update

Then I update/pull all submodules:

git submodule foreach "(git checkout $BRANCH && git pull --ff origin $BRANCH && git push origin $BRANCH) || true"

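Notice a few things. First, I'm chaining the Git commands with &&, meaning each command runs only if the previous one succeeded.

After a possibly successful pull (if new stuff was found on the remote), I do a push to make sure that a possible merge commit is not left behind on the client. Again, this only happens if the pull actually brought in something new.

Finally, the trailing || true ensures that the script continues on errors. To make this work, everything in the iteration must be wrapped in double quotes, and the Git commands are wrapped in parentheses (operator precedence).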
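The effect of the || true guard can be seen without any repositories at all. The sketch below uses a made-up function, failing_chain, as a stand-in for one submodule's checkout, pull and push chain whose first step fails.

```shell
# Hypothetical stand-in for one submodule's "checkout && pull && push" chain:
# the first step fails, so the later steps are skipped.
failing_chain() {
  false && echo "pull" && echo "push"
}

# Without the guard, the group reports the failure.
status_unguarded=0
( failing_chain ) || status_unguarded=$?

# With the guard, the group as a whole reports success.
status_guarded=0
( failing_chain ) || true
status_guarded=$?

echo "unguarded: $status_unguarded, guarded: $status_guarded"
```

This matters because git submodule foreach stops at the first submodule whose command returns a non-zero status. As a side note, in POSIX shells && and || have equal precedence and group left to right, so the parentheses mainly make the intended grouping explicit.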

My favourite part:

for i in $(git submodule foreach --quiet 'echo $path')
do
  echo "Adding $i to root repo"
  git add "$i"
done

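This iterates over all submodules with --quiet, which suppresses the 'Entering MODULE_PATH' output. Using 'echo $path' (must be in single quotes), the path to each submodule is written to the output.

This list of relative submodule paths is captured with command substitution ($(...)) and split into words; finally I iterate over it and run git add $i to update the parent repository.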
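One caveat with looping over $(...) is that the shell splits the output on whitespace, so a submodule path containing a space would be broken apart. The sketch below illustrates this with a made-up list_paths function standing in for git submodule foreach --quiet 'echo $path'; both paths are invented.

```shell
# Stand-in for `git submodule foreach --quiet 'echo $path'`; the two paths
# are invented, and the second contains a space on purpose.
list_paths() {
  printf '%s\n' "libs/core" "libs/my module"
}

# Word-splitting loop, as in the script: the second path is split in two.
count_split=0
for p in $(list_paths); do
  count_split=$((count_split + 1))
done

# Line-by-line loop: each path stays intact.
count_lines=0
while IFS= read -r p; do
  count_lines=$((count_lines + 1))
done <<EOF
$(list_paths)
EOF

echo "for loop saw $count_split items, read loop saw $count_lines"
```

If none of your submodule paths contain whitespace, the original for loop behaves the same; the read loop is just the more defensive variant.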

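Finally, a commit with a message explaining that the parent repository was updated. If nothing changed, this commit is skipped by default, because git commit refuses to create an empty commit. Push this to origin, and you're done.

I have a script running this in a Jenkins job that chains to a scheduled automated deployment afterwards, and it works like a charm.

I hope this will be of help to someone.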

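The git submodule update command actually tells Git that you want each of your submodules to check out the commit already specified in the index of the superproject. If you want to update your submodules to the latest commit available from their remote, you need to do this directly in the submodules.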
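To see the distinction between what the superproject records and what a submodule currently has checked out, the two commits can be compared directly. This is a rough sketch; the three function names are invented for illustration, and it compares against the superproject's HEAD commit rather than its index.

```shell
# recorded_commit SUPER PATH: the commit that SUPER's HEAD records for the
# submodule at PATH (hypothetical helper name).
recorded_commit() {
  git -C "$1" rev-parse "HEAD:$2"
}

# checked_out_commit SUPER PATH: the commit currently checked out inside
# the submodule (hypothetical helper name).
checked_out_commit() {
  git -C "$1/$2" rev-parse HEAD
}

# submodule_moved SUPER PATH: succeeds when the two commits differ, i.e. the
# submodule has moved away from what the superproject records.
submodule_moved() {
  [ "$(recorded_commit "$1" "$2")" != "$(checked_out_commit "$1" "$2")" ]
}
```

For example, submodule_moved . submodule_dir would succeed after pulling inside submodule_dir and before committing the new pointer. git submodule status reports the same condition by prefixing the line with a +.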

So in summary:

# Get the submodule initially
git submodule add ssh://bla submodule_dir
git submodule init

# Time passes, submodule upstream is updated
# and you now want to update

# Change to the submodule directory
cd submodule_dir

# Checkout desired branch
git checkout master

# Update
git pull

# Get back to your project root
cd ..

# Now the submodules are in the state you want, so
git commit -am "Pulled down update to submodule_dir"

Or, if you're a busy person:

git submodule foreach git pull origin master

If you don't know the branch of the host (parent) repository, do this:

git submodule foreach git pull origin $(git rev-parse --abbrev-ref HEAD)

It will get the current branch of the main Git repository and then, for each submodule, pull that same branch.
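The reason the parent's branch is used is that an unquoted or double-quoted $(...) is expanded by the calling shell before the inner command ever starts. The sketch below shows the same effect with plain sh instead of git; the variable name label is arbitrary.

```shell
label=parent

# Double quotes: $(echo $label) is expanded by the calling shell first,
# so the child shell only ever sees the literal word "parent".
expanded_by_caller=$(sh -c "label=child; echo $(echo $label)")

# Single quotes: the substitution is passed through untouched and runs
# inside the child shell, after label has been reassigned there.
expanded_by_child=$(sh -c 'label=child; echo $(echo $label)')

echo "caller: $expanded_by_caller, child: $expanded_by_child"
```

By the same logic, single-quoting the foreach command would evaluate git rev-parse --abbrev-ref HEAD inside each submodule instead. Note that this prints HEAD for a submodule in a detached HEAD state, so that variant is only useful when the submodules are on a branch.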

Git 1.8.2 features a new option, --remote, that enables exactly this behavior. Running

git submodule update --remote --merge

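will fetch the latest changes from upstream in each submodule, merge them in, and check out the latest revision of the submodule. As the documentation puts it:

--remote
This option is only valid for the update command. Instead of using the superproject's recorded SHA-1 to update the submodule, use the status of the submodule's remote-tracking branch.

This is equivalent to running git pull <remote> <default_branch> (usually git pull origin master or git pull origin main) in each submodule, which is generally exactly what you want.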

The simplest way to handle git projects containing submodules is to always add

--recurse-submodules 

at the end of each git command that supports it. Example:

git fetch --recurse-submodules

Another one:

git pull --recurse-submodules

And so on...