Our Git repositories started out as parts of a single, very large SVN repository in which each project had its own tree, like so:

project1/branches
        /tags
        /trunk
project2/branches
        /tags
        /trunk

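Obviously, it was easy to move files from one project to another with svn mv. In Git, however, each project lives in its own repository, and today I was asked to move a subdirectory from project2 to project1. I did something like this: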

$ git clone project2 
$ cd project2
$ git filter-branch --subdirectory-filter deeply/buried/java/source/directory/A -- --all
$ git remote rm origin  # so I don't accidentally overwrite the repo ;-)
$ mkdir -p deeply/buried/different/java/source/directory/B
$ for f in *.java; do
>  git mv "$f" deeply/buried/different/java/source/directory/B
>  done
$ git commit -m "moved files to new subdirectory"
$ cd ..
$
$ git clone project1
$ cd project1
$ git remote add p2 ../project2
$ git fetch p2
$ git branch p2 remotes/p2/master
$ git merge p2 # --allow-unrelated-histories for git 2.9+
$ git remote rm p2
$ git push

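But that seems pretty convoluted. Is there a better way to do this sort of thing in general? Or have I adopted the right approach?

Note that this involves merging the history into an existing repository, rather than simply creating a new standalone repository from part of another one (as in an earlier question).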


Current answer

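Keeping the directory name

The subdirectory-filter (or the shorter command git subtree) works well, but it did not work for me, because it strips the directory name from the commit information. In my scenario I only wanted to merge parts of one repository into another and retain the history WITH the full path names.

My solution was to use the tree-filter to remove the unwanted files and directories from a temporary clone of the source repository, and then pull from that clone into the target repository in five simple steps, plus a final check.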

# 1. clone the source
git clone ssh://<user>@<source-repo url>
cd <source-repo>
# 2. remove the stuff we want to exclude
git filter-branch --tree-filter "rm -rf <files to exclude>" --prune-empty HEAD
# 3. move to target repo and create a merge branch (for safety)
cd <path to target-repo>
git checkout -b <merge branch>
# 4. Add the source-repo as remote 
git remote add source-repo <path to source-repo>
# 5. fetch and merge it (on Git 2.9+ append --allow-unrelated-histories)
git pull source-repo master
# 6. check that you got it right (better safe than sorry, right?)
gitk
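
As a rough illustration of why the tree-filter keeps path names, the sketch below builds a throwaway repository and applies step 2 to it. The directory names keep/me and drop/me, the file names, and the demo identity are all hypothetical; FILTER_BRANCH_SQUELCH_WARNING only silences the warning that newer Git versions print before running filter-branch.

```shell
#!/bin/bash
# Hypothetical demo: every path and name below is made up for illustration.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning pause on newer Git
work=$(mktemp -d)

# Throwaway source repo with one directory to keep and one to drop.
git init -q "$work/src"
cd "$work/src"
git config user.email demo@example.com
git config user.name demo
mkdir -p keep/me drop/me
echo a > keep/me/a.txt
echo b > drop/me/b.txt
git add .
git commit -q -m "add both directories"

# Step 2 of the recipe: delete what should not travel, leave the rest in place.
git filter-branch --tree-filter "rm -rf drop" --prune-empty HEAD >/dev/null 2>&1

# List what survived; the kept file should still sit under its original full path.
kept_paths=$(git ls-tree -r --name-only HEAD)
echo "$kept_paths"
```

Because nothing is moved to the repository root, the history that gets pulled into the target repository refers to the same paths the source repository used.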

Other answers

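I wanted something robust and reusable (a single command plus an undo function), so I wrote the bash script below. It has worked for me on several occasions, so I thought I would share it here.

It can move an arbitrary folder /path/to/foo from repo1 into /some/other/folder/bar in repo2 (the folder paths can be the same or different, and their depth below the root folder may differ).

Since it only walks the commits that touch files in the input folder (not all commits of the source repo), it should be quite fast even on big source repos, provided you are extracting a deeply nested subfolder that was not touched in every commit.

Since what it does is create an orphaned branch with all of the old repo's history and then merge that branch into HEAD, it even works when file names clash (in which case you have to resolve the merge at the end, of course).

If there are no file name clashes, you only need to run git commit at the end to finalize the merge.

The downside is that it will probably not follow file renames (outside the REWRITE_FROM folder) in the source repo. Pull requests on GitHub to accommodate that are welcome.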

GitHub link: git-move-folder-between-repos-keep-history

#!/bin/bash

# Copy a folder from one git repo to another git repo,
# preserving full history of the folder.

SRC_GIT_REPO='/d/git-experimental/your-old-webapp'
DST_GIT_REPO='/d/git-experimental/your-new-webapp'
SRC_BRANCH_NAME='master'
DST_BRANCH_NAME='import-stuff-from-old-webapp'
# Most likely you want the REWRITE_FROM and REWRITE_TO to have a trailing slash!
REWRITE_FROM='app/src/main/static/'
REWRITE_TO='app/src/main/static/'

verifyPreconditions() {
    #echo 'Checking if SRC_GIT_REPO is a git repo...' &&
      { test -d "${SRC_GIT_REPO}/.git" || { echo "Fatal: SRC_GIT_REPO is not a git repo" >&2; exit 1; } } &&
    #echo 'Checking if DST_GIT_REPO is a git repo...' &&
      { test -d "${DST_GIT_REPO}/.git" || { echo "Fatal: DST_GIT_REPO is not a git repo" >&2; exit 1; } } &&
    #echo 'Checking if REWRITE_FROM is not empty...' &&
      { test -n "${REWRITE_FROM}" || { echo "Fatal: REWRITE_FROM is empty" >&2; exit 1; } } &&
    #echo 'Checking if REWRITE_TO is not empty...' &&
      { test -n "${REWRITE_TO}" || { echo "Fatal: REWRITE_TO is empty" >&2; exit 1; } } &&
    #echo 'Checking if REWRITE_FROM folder exists in SRC_GIT_REPO' &&
      { test -d "${SRC_GIT_REPO}/${REWRITE_FROM}" || { echo "Fatal: REWRITE_FROM does not exist inside SRC_GIT_REPO" >&2; exit 1; } } &&
    #echo 'Checking if SRC_GIT_REPO has a branch SRC_BRANCH_NAME' &&
      { cd "${SRC_GIT_REPO}"; git rev-parse --verify "${SRC_BRANCH_NAME}" || { echo "Fatal: SRC_BRANCH_NAME does not exist inside SRC_GIT_REPO" >&2; exit 1; } } &&
    #echo 'Checking if DST_GIT_REPO has a branch DST_BRANCH_NAME' &&
      { cd "${DST_GIT_REPO}"; git rev-parse --verify "${DST_BRANCH_NAME}" || { echo "Fatal: DST_BRANCH_NAME does not exist inside DST_GIT_REPO" >&2; exit 1; } } &&
    echo '[OK] All preconditions met'
}

# Import folder from one git repo to another git repo, including full history.
#
# Internally, it rewrites the history of the src repo (by creating
# a temporary orphaned branch; isolating all the files from REWRITE_FROM path
# to the root of the repo, commit by commit; and rewriting them again
# to the original path).
#
# Then it creates another temporary branch in the dest repo,
# fetches the commits from the rewritten src repo, and does a merge.
#
# Before any work is done, all the preconditions are verified: all folders
# and branches must exist (except REWRITE_TO folder in dest repo, which
# can exist, but does not have to).
#
# The code should work reasonably on repos with reasonable git history.
# I did not test pathological cases, like folder being created, deleted,
# created again etc. but probably it will work fine in that case too.
#
# In case you realize something went wrong, you should be able to reverse
# the changes by calling `undoImportFolderFromAnotherGitRepo` function.
# However, to be safe, please back up your repos just in case, before running
# the script. `git filter-branch` is a powerful but dangerous command.
importFolderFromAnotherGitRepo(){
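    # Prefix every path in the index listing with REWRITE_TO. '&' keeps the tab (and any
    # opening quote) that precedes the path. REWRITE_TO must not contain '-', the sed delimiter.
    SED_COMMAND='s-\t\"*-&'${REWRITE_TO}'-'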

    verifyPreconditions &&
    cd "${SRC_GIT_REPO}" &&
      echo "Current working directory: ${SRC_GIT_REPO}" &&
      git checkout "${SRC_BRANCH_NAME}" &&
      echo 'Backing up current branch as FILTER_BRANCH_BACKUP' &&
      git branch -f FILTER_BRANCH_BACKUP &&
      SRC_BRANCH_NAME_EXPORTED="${SRC_BRANCH_NAME}-exported" &&
      echo "Creating temporary branch '${SRC_BRANCH_NAME_EXPORTED}'..." &&
      git checkout -b "${SRC_BRANCH_NAME_EXPORTED}" &&
      echo 'Rewriting history, step 1/2...' &&
      git filter-branch -f --prune-empty --subdirectory-filter "${REWRITE_FROM}" &&
      echo 'Rewriting history, step 2/2...' &&
      git filter-branch -f --index-filter \
       "git ls-files -s | sed \"$SED_COMMAND\" |
        GIT_INDEX_FILE=\$GIT_INDEX_FILE.new git update-index --index-info &&
        mv \"\$GIT_INDEX_FILE.new\" \"\$GIT_INDEX_FILE\"" HEAD &&
    cd - &&
    cd "${DST_GIT_REPO}" &&
      echo "Current working directory: ${DST_GIT_REPO}" &&
      echo "Adding git remote pointing to SRC_GIT_REPO..." &&
      git remote add old-repo "${SRC_GIT_REPO}" &&
      echo "Fetching from SRC_GIT_REPO..." &&
      git fetch old-repo "${SRC_BRANCH_NAME_EXPORTED}" &&
      echo "Checking out DST_BRANCH_NAME..." &&
      git checkout "${DST_BRANCH_NAME}" &&
      echo "Merging SRC_GIT_REPO/" &&
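      # On Git 2.9+ this merge needs --allow-unrelated-histories appended,
      # because the imported history shares no ancestor with the destination branch.
      git merge "old-repo/${SRC_BRANCH_NAME}-exported" --no-commit &&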
    cd -
}

# If something didn't work as you'd expect, you can undo, tune the params, and try again
undoImportFolderFromAnotherGitRepo(){
  cd "${SRC_GIT_REPO}" &&
    SRC_BRANCH_NAME_EXPORTED="${SRC_BRANCH_NAME}-exported" &&
    git checkout "${SRC_BRANCH_NAME}" &&
    git branch -D "${SRC_BRANCH_NAME_EXPORTED}" &&
  cd - &&
  cd "${DST_GIT_REPO}" &&
    git remote rm old-repo &&
    git merge --abort
  cd -
}

importFolderFromAnotherGitRepo
#undoImportFolderFromAnotherGitRepo

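Yes, hitting on the --subdirectory-filter of filter-branch was the key. The fact that you used it essentially proves there is no easier way: you had no choice but to rewrite history, because you wanted to end up with only a (renamed) subset of the files, and that by definition changes the hashes. Since none of the standard commands (e.g. pull) rewrite history, there is no way you could have used them to accomplish this.

You could refine the details, of course (some of your cloning and branching was not strictly necessary), but the overall approach is good! It is a shame that it is complicated, but the point of git is certainly not to make rewriting history easy.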
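
To make the point about changed hashes concrete, here is a minimal sketch on a throwaway repository. The file names top.txt and sub/dir/two.txt and the demo identity are assumptions made up for the example; it only shows that the commit ID before and after --subdirectory-filter differs, which is why ordinary commands such as pull cannot produce the same result.

```shell
#!/bin/bash
# Hypothetical demo repository; all names are invented for illustration.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning pause on newer Git
work=$(mktemp -d)

git init -q "$work/repo"
cd "$work/repo"
git config user.email demo@example.com
git config user.name demo
mkdir -p sub/dir
echo one > top.txt
echo two > sub/dir/two.txt
git add .
git commit -q -m "initial"

before=$(git rev-parse HEAD)

# Rewrite history so that sub/dir becomes the repository root.
git filter-branch --subdirectory-filter sub/dir HEAD >/dev/null 2>&1

after=$(git rev-parse HEAD)
remaining=$(git ls-tree -r --name-only HEAD)

echo "before: $before"
echo "after:  $after"
echo "files:  $remaining"
```

The rewritten commit points at a different tree (only the contents of sub/dir, now at the root), so it gets a new ID even though author, date, and message are unchanged.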

This becomes simpler with git-filter-repo.

To move project2/sub/dir to project1/sub/dir:

# Create a new repo containing only the subdirectory:
git clone project2 project2_clone --no-local
cd project2_clone
git filter-repo --path sub/dir

# Merge the new repo:
cd ../project1
git remote add tmp ../project2_clone/
git fetch tmp master
git merge remotes/tmp/master --allow-unrelated-histories
git remote remove tmp

To install the tool, simply run: pip3 install git-filter-repo (more details and options are in the README)

# Before: (root)
.
|-- project1
|   `-- 3
`-- project2
    |-- 1
    `-- sub
        `-- dir
            `-- 2

# After: (project1)
.
├── 3
└── sub
    └── dir
        └── 2
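
The commands above keep the directory at the same path. If the directory should land somewhere else in project1, as in the original question, git-filter-repo's --path-rename option can be combined with --path. The following is only a sketch on a throwaway repository: the destination other/place is a hypothetical name, and the script skips the rewrite when git-filter-repo is not installed.

```shell
#!/bin/bash
# Hypothetical demo; "other/place" is an invented destination path.
set -e
work=$(mktemp -d)

# Stand-in for project2 with the layout shown in the "Before" tree.
git init -q "$work/project2"
cd "$work/project2"
git config user.email demo@example.com
git config user.name demo
mkdir -p sub/dir
echo 1 > 1
echo 2 > sub/dir/2
git add .
git commit -q -m "project2 content"

if ! git filter-repo --version >/dev/null 2>&1; then
    echo "git-filter-repo is not installed; skipping the rewrite" >&2
    moved_paths="skipped"
else
    # Work on a fresh clone so the original project2 is never rewritten.
    git clone -q --no-local "$work/project2" "$work/project2_clone"
    cd "$work/project2_clone"
    # Keep only sub/dir, then rename it; the options are applied in the order given.
    git filter-repo --path sub/dir --path-rename sub/dir:other/place
    moved_paths=$(git ls-tree -r --name-only HEAD)
fi
echo "$moved_paths"
```

The rewritten clone can then be merged into project1 with the same remote add, fetch, and merge sequence shown above.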

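Here is the method I used to migrate my Git repository from Stash to GitLab while keeping all branches and preserving history.

Clone the old repository to your local machine.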

git clone --bare <STASH-URL>

Create an empty repository in GitLab, then push to it from inside the bare clone.

git push --mirror <GitLab-URL>

When we migrated our code from Stash to GitLab, I performed the steps above and they worked very well.
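
The two commands can be tried out locally before touching real servers. In the sketch below, local directories stand in for the Stash and GitLab URLs; the names old, old.git, and new.git, the branch feature, and the tag v1 are all hypothetical.

```shell
#!/bin/bash
# Hypothetical demo: local directories stand in for <STASH-URL> and <GitLab-URL>.
set -e
work=$(mktemp -d)

# Stand-in for the old repository, with an extra branch and a tag.
git init -q "$work/old"
cd "$work/old"
git config user.email demo@example.com
git config user.name demo
echo hi > readme.txt
git add .
git commit -q -m "first"
git branch feature
git tag v1

# Stand-in for the empty repository created in GitLab.
git init -q --bare "$work/new.git"

# The two steps from the answer: bare clone, then mirror push from inside it.
git clone -q --bare "$work/old" "$work/old.git"
cd "$work/old.git"
git push -q --mirror "$work/new.git"

# Compare every ref (branches and tags) between source and destination.
source_refs=$(git --git-dir="$work/old/.git" for-each-ref --format='%(refname) %(objectname)')
mirrored_refs=$(git --git-dir="$work/new.git" for-each-ref --format='%(refname) %(objectname)')
echo "$mirrored_refs"
```

If the two ref listings match, every branch and tag arrived with the same commit IDs, so nothing was rewritten along the way.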