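Why isn't there a Parent in the Family?
What follows
Summary:
gitr: find (possibly multiple) related branches
gitp: find likely parents via git-flow-like internal rules/regex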
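Why would anyone want to read this long post?
Because while the previous answers clearly
understand the problem with the original question,
they fall short of correct/meaningful results;
or accurately solve a different problem.
Feel free to review just the first section;
it solves the "find something" problem
and should highlight the scope of the problem.
For some, that may be sufficient.
This will show you a way to
extract correct and meaningful results from git
(you may not like them),
and demonstrate one way to apply
your knowledge of your conventions
to those results
to extract what you are really looking for.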
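The sections below cover:
An Unbiased Question & Solution:
nearest git branches using git show-branch.
what the expected results should look like
Example Graph & Results
Batching Branches: working around the limits of git show-branch
A Biased Question & Solution:
introducing (naming) conventions to improve results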
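The Problem with the Question
As has been mentioned, git does not track relationships between branches;
branches are simply names referencing a commit.
In the official git documentation and other sources we often encounter somewhat misleading diagrams such as:
A---B---C---D    <- master branch
     \
      E---F      <- work branch
Let's change the form of the diagram and the hierarchically suggestive names to show an equivalent graph:
      E---F   <- jack
     /
A---B
     \
      C---D   <- jill
The graph (and hence git) tells us absolutely nothing about which branch was created first (hence, which was branched off the other).
That master is the parent of work in the first graph is a matter of convention.
Therefore:
simple tooling will produce responses that ignore the bias
more complex tooling incorporates conventions (biases).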
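To see the symmetry concretely, here is a minimal sketch. It assumes a repository that really contains branches named jack and jill, as in the graph above; forkPoint is a hypothetical helper name, not part of git. git merge-base reports the commit where two histories meet, and the answer is the same whichever branch is listed first, so by itself it cannot say which branch is the "parent".

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the best common ancestor of two branches.
# The result is symmetric: forkPoint jack jill == forkPoint jill jack.
forkPoint() {
    git merge-base "$1" "$2"
}

# Example (only meaningful inside a repo that has both branches):
#   forkPoint jack jill    # prints the hash of commit B
#   forkPoint jill jack    # prints the same hash
```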
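An Unbiased Question
First, I must acknowledge primarily Joe Chrysler's response, the other responses here, and the many comments/suggestions all around;
they inspired me and pointed the way!
Allow me to rephrase Joe's rephrasing, giving consideration to multiple branches related to that nearest commit (it happens!):
"What is the nearest commit that resides on a branch other than
the current branch, and which branches is that?"
Or, in other words:
Q1
Given a branch B:
consider the commit C nearest to B'HEAD
(C could be B'HEAD)
that is shared by other branches:
what branches, other than B, have C in their commit history?
An Unbiased Solution
Apologies up front; it seems folks prefer one-liners. Feel free to suggest (readable/maintainable) improvements!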
#!/usr/bin/env bash
# git show-branch supports 29 branches; reserve 1 for current branch
GIT_SHOW_BRANCH_MAX=28
CURRENT_BRANCH="$(git rev-parse --abbrev-ref HEAD)"
if (( $? != 0 )); then
echo "Failed to determine git branch; is this a git repo?" >&2
exit 1
fi
##
# Given Params:
# EXCEPT : $1
# VALUES : $2..N
#
# Return all values except EXCEPT, in order.
#
function valuesExcept() {
local except=$1 ; shift
for value in "$@"; do
if [[ "$value" != "$except" ]]; then
echo "$value"
fi
done
}
##
# Given Params:
# BASE_BRANCH : $1 : base branch; default is current branch
# BRANCHES : [ $2 .. $N ] : list of unique branch names (no duplicates);
# perhaps possible parents.
# Default is all branches except base branch.
#
# For the most recent commit in the commit history for BASE_BRANCH that is
# also in the commit history of at least one branch in BRANCHES: output all
# BRANCHES that share that commit in their commit history.
#
function nearestCommonBranches() {
local BASE_BRANCH
if [[ -z "${1+x}" || "$1" == '.' ]]; then
BASE_BRANCH="$CURRENT_BRANCH"
else
BASE_BRANCH="$1"
fi
shift
local -a CANDIDATES
if [[ -z "${1+x}" ]]; then
CANDIDATES=( $(git rev-parse --symbolic --branches) )
else
CANDIDATES=("$@")
fi
local BRANCHES=( $(valuesExcept "$BASE_BRANCH" "${CANDIDATES[@]}") )
local BRANCH_COUNT=${#BRANCHES[@]}
if (( $BRANCH_COUNT > $GIT_SHOW_BRANCH_MAX )); then
echo "Too many branches: limit $GIT_SHOW_BRANCH_MAX" >&2
exit 1
fi
local MAP=( $(git show-branch --topo-order "${BRANCHES[@]}" "$BASE_BRANCH" \
| tail -n +$(($BRANCH_COUNT+3)) \
| sed "s/ \[.*$//" \
| sed "s/ /_/g" \
| sed "s/*/+/g" \
| grep -E '^_*[^_].*[^_]$' \
| head -n1 \
| sed 's/\(.\)/\1\n/g'
) )
for idx in "${!BRANCHES[@]}"; do
## to include "merge", symbolized by '-', use
## ALT: if [[ "${MAP[$idx]}" != "_" ]]
if [[ "${MAP[$idx]}" == "+" ]]; then
echo "${BRANCHES[$idx]}"
fi
done
}
# Usage: gitr [ baseBranch [branchToConsider]* ]
# baseBranch: '.' (no quotes needed) corresponds to default current branch
# branchToConsider* : list of unique branch names (no duplicates);
# perhaps possible (bias?) parents.
# Default is all branches except base branch.
nearestCommonBranches "${@}"
How it Works
Consider the output of: git show-branch
For git show-branch --topo-order feature/g hotfix master release/2 release/3 feature/d, the output would look similar to:
! [feature/g] TEAM-12345: create X
 * [hotfix] TEAM-12345: create G
  ! [master] TEAM-12345: create E
   ! [release/2] TEAM-12345: create C
    ! [release/3] TEAM-12345: create C
     ! [feature/d] TEAM-12345: create S
------
+      [feature/g] TEAM-12345: create X
+      [feature/g^] TEAM-12345: create W
     + [feature/d] TEAM-12345: create S
     + [feature/d^] TEAM-12345: create R
     + [feature/d~2] TEAM-12345: create Q
     ...
  +    [master] TEAM-12345: create E
 *     [hotfix] TEAM-12345: create G
 *     [hotfix^] TEAM-12345: create F
 *+    [master^] TEAM-12345: create D
+*+++  [release/2] TEAM-12345: create C
+*++++ [feature/d~8] TEAM-12345: create B
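A few points:
the original command listed N (6) branch names on the command line
those branch names appear, in order, as the first N lines of the output
the lines following the header represent commits
the first N columns of the commit lines represent (as a whole) a "branch/commit matrix", where a single character in column X indicates the relationship (or lack of one) between a branch (header row X) and the current commit.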
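As a small illustration of reading that matrix, the sketch below decodes a single commit line into the names of the branches that contain the commit. branchesForRow is a hypothetical helper (not part of git or of the script above); it assumes the branch names are passed in the same order as the header lines, and that any non-space character in a column means the commit is reachable from that branch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decode one commit line of `git show-branch` output.
#   $1    : the commit line, e.g. ' *+    [master^] TEAM-12345: create D'
#   $2..N : branch names, in header order
# Prints each branch whose matrix column is not a space.
branchesForRow() {
    local row="$1"; shift
    local -a names=("$@")
    local i
    for i in "${!names[@]}"; do
        # Column i of the matrix is character i of the line.
        if [[ -n "${row:i:1}" && "${row:i:1}" != " " ]]; then
            echo "${names[i]}"
        fi
    done
}

# Commit D from the sample output: prints hotfix, then master.
branchesForRow ' *+    [master^] TEAM-12345: create D' \
    feature/g hotfix master release/2 release/3 feature/d
```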
Primary Steps
Given a BASE_BRANCH
Given an ordered set (unique) BRANCHES that does not include BASE_BRANCH
For brevity, let N be BRANCH_COUNT,
which is the size of BRANCHES;
it does not include BASE_BRANCH
git show-branch --topo-order $BRANCHES $BASE_BRANCH:
Since BRANCHES contains only unique names (presumed valid)
the names will map 1-1 with the header lines of the output,
and correspond to the first N columns of the branch/commit matrix.
Since BASE_BRANCH is not in BRANCHES
it will be the last of the header lines,
and corresponds to the last column of the branch/commit matrix.
tail: start with line N+3; throw away the first N+2 lines: N branches + base branch + separator row ---...
sed: these could be combined in one... but are separated for clarity
remove everything after the branch/commit matrix
replace spaces with underscores '_';
my primary reason was to avoid potential IFS parsing hassles
and for debugging/readability.
replace * with +; base branch is always in last column,
and that's sufficient. Also, if left alone it goes through bash
pathname expansion, and that's always fun with *
egrep: grep for commits that map to at least one branch ([^_]) AND to the BASE_BRANCH ([^_]$). Maybe that base branch pattern should be \+$?
head -n1: take the first remaining commit
sed: separate each character of the branch/commit matrix to separate lines.
Capture the lines in an array MAP, at which point we have two arrays:
BRANCHES: length N
MAP: length N+1: first N elements 1-1 with BRANCHES, and the last element corresponding to the BASE_BRANCH.
Iterate over BRANCHES (that's all we want, and it's shorter) and check the corresponding element in MAP: output BRANCHES[$idx] if MAP[$idx] is +.
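The selection step (the grep/head part of the pipeline) can also be sketched in pure bash, which may be easier to step through when debugging. firstSharedRow is a hypothetical name; the sketch assumes the commit lines (header already removed) arrive on standard input and that the base branch occupies column N+1, as in the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read show-branch commit lines on stdin and print the
# matrix (first N+1 characters) of the first commit that is in the base
# branch (column N+1) and in at least one of the other N branches.
#   $1 : N, the number of non-base branches
firstSharedRow() {
    local n="$1" line matrix others base
    while IFS= read -r line; do
        matrix="${line:0:n+1}"
        others="${matrix:0:n}"
        base="${matrix:n:1}"
        # ${others// /} strips spaces; non-empty means some branch has it.
        if [[ -n "${others// /}" && -n "$base" && "$base" != " " ]]; then
            echo "$matrix"
            return 0
        fi
    done
    return 1
}
```

Unlike the pipeline, this keeps the spaces and the * marker as they are, so the result has to be quoted wherever it is used.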
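Example Graph & Results
Consider the following somewhat contrived example graph:
Biased names will be used, as they help (me) weigh and consider results.
Presume merges exist and are being ignored.
The graph generally attempts to highlight branches as such (forking),
without visually suggesting a preference/hierarchy;
ironically, master stands out after I was done with this thing.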
                         J                   <- feature/b
                        /
                       H
                      / \
                     /   I                   <- feature/a
                    /
                   D---E                     <- master
                  / \
                 /   F---G                   <- hotfix
                /
       A---B---C                             <- feature/f, release/2, release/3
            \   \
             \   W--X                        <- feature/g
              \
               \       M                     <- support/1
                \     /
                 K---L                       <- release/4
                      \
                       \       T---U---V     <- feature/e
                        \     /
                         N---O
                              \
                               P             <- feature/c
                                \
                                 Q---R---S   <- feature/d
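Unbiased Results for Example Graph
Assuming the script is in the executable file gitr, run:
gitr <baseBranch>
For different branches B we obtain the following results: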
GIVEN B   | Shared Commit C | Branches !B with C in their history? |
feature/a | H | feature/b |
feature/b | H | feature/a |
feature/c | P | feature/d |
feature/d | P | feature/c |
feature/e | O | feature/c, feature/d |
feature/f | C | feature/a, feature/b, feature/g, hotfix, master, release/2, release/3 |
feature/g | C | feature/a, feature/b, feature/f, hotfix, master, release/2, release/3 |
hotfix    | D | feature/a, feature/b, master |
master    | D | feature/a, feature/b, hotfix |
release/2 | C | feature/a, feature/b, feature/f, feature/g, hotfix, master, release/3 |
release/3 | C | feature/a, feature/b, feature/f, feature/g, hotfix, master, release/2 |
release/4 | L | feature/c, feature/d, feature/e, support/1 |
support/1 | L | feature/c, feature/d, feature/e, release/4 |
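Batching Branches
[Presented at this stage
because it fits best into the final script at this point.
This section is not required; feel free to skip ahead.]
git show-branch limits itself to 29 branches.
That may be a blocker for some (no judgement, just sayin!).
We can improve results, in some situations,
by grouping branches into batches.
BASE_BRANCH must be submitted with each batch.
If there is a large number of branches in a repo,
this may have limited value by itself.
It may provide more value if you find other ways
to limit the branches (that would be batched).
The previous point fits my use case,
so charging ahead!
This mechanism is NOT perfect;
as the result size approaches the max (29),
expect it to fail. Details below.
Batch Solution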
#
# Remove/comment-out the function call at the end of script,
# and append this to the end.
##
##
# Given:
# BASE_BRANCH : $1 : first param on every batch
# BRANCHES : [ $2 .. $N ] : list of unique branch names (no duplicates);
# perhaps possible parents
# Default is all branches except base branch.
#
# Output all BRANCHES that share that commit in their commit history.
#
function repeatBatchingUntilStableResults() {
local BASE_BRANCH="$1"
shift
local -a CANDIDATES
if [[ -z "${1+x}" ]]; then
CANDIDATES=( $(git rev-parse --symbolic --branches) )
else
CANDIDATES=("$@")
fi
local BRANCHES=( $(valuesExcept "$BASE_BRANCH" "${CANDIDATES[@]}") )
local SIZE=$GIT_SHOW_BRANCH_MAX
local COUNT=${#BRANCHES[@]}
local LAST_COUNT=$(( $COUNT + 1 ))
local NOT_DONE=1
while (( $NOT_DONE && $COUNT < $LAST_COUNT )); do
NOT_DONE=$(( $SIZE < $COUNT ))
LAST_COUNT=$COUNT
local -a BRANCHES_TO_BATCH=( "${BRANCHES[@]}" )
local -a AGGREGATE=()
while (( ${#BRANCHES_TO_BATCH[@]} > 0 )); do
local -a BATCH=( "${BRANCHES_TO_BATCH[@]:0:$SIZE}" )
AGGREGATE+=( $(nearestCommonBranches "$BASE_BRANCH" "${BATCH[@]}") )
BRANCHES_TO_BATCH=( "${BRANCHES_TO_BATCH[@]:$SIZE}" )
done
BRANCHES=( "${AGGREGATE[@]}" )
COUNT=${#BRANCHES[@]}
done
if (( ${#BRANCHES[@]} > $SIZE )); then
echo "Unable to reduce candidate branches below MAX for git-show-branch" >&2
echo " Base Branch : $BASE_BRANCH" >&2
echo " MAX Branches: $SIZE" >&2
echo " Candidates : ${BRANCHES[@]}" >&2
exit 1
fi
echo "${BRANCHES[@]}"
}
repeatBatchingUntilStableResults "$@"
exit 0
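How it Works
Repeat until results stabilize:
Break BRANCHES into batches of
GIT_SHOW_BRANCH_MAX (aka SIZE) elements
call nearestCommonBranches BASE_BRANCH BATCH
Aggregate the results into a new (smaller?) set of branches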
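The slicing used for the batches can be tried in isolation. The following is only a sketch of that one step: printBatches is a hypothetical helper, and it assumes branch names contain no whitespace (which git ref names do not allow anyway).

```shell
#!/usr/bin/env bash
# Hypothetical helper: split the arguments after $1 into batches of at most
# $1 items, printing one space-separated batch per line.
printBatches() {
    local size="$1"; shift
    local -a rest=("$@")
    while (( ${#rest[@]} > 0 )); do
        # Take the first `size` elements, then drop them from the list.
        echo "${rest[@]:0:size}"
        rest=("${rest[@]:size}")
    done
}

# Prints three lines: "b1 b2", "b3 b4", "b5"
printBatches 2 b1 b2 b3 b4 b5
```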
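How it can fail
If the number of aggregated branches exceeds the max SIZE
and further batching/processing cannot reduce that number,
then either:
the aggregated branches ARE the solution,
but that can't be verified by git show-branch, or
each batch doesn't reduce;
possibly a branch from one batch would help reduce another
(different merge base); the current algorithm admits defeat and fails.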
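Consider an Alternative
Individually pair the base branch with every other branch of interest and determine a commit node (merge base) for each pair; sort the set of merge bases in commit-history order, take the nearest node, and determine all branches associated with that node.
I present that from a position of hindsight.
It's probably really the right way to go.
I'm moving forward;
perhaps there is value outside of the current topic.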
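For what it's worth, the merge-base alternative might look roughly like the sketch below. This is not the script from this answer: nearestByMergeBase is a hypothetical name, "nearest" is assumed to mean the fewest commits between the merge base and the tip of the base branch (counted with git rev-list --count), and merges are ignored as elsewhere in this post. It has no 29-branch limit, at the cost of one git merge-base call per candidate.

```shell
#!/usr/bin/env bash
# Sketch of the merge-base alternative (an assumption-laden illustration).
#   $1    : base branch
#   $2..N : candidate branches (the base branch itself is skipped)
# Prints every candidate whose merge base with the base branch is the
# nearest one, measured in commits back from the base branch tip.
nearestByMergeBase() {
    local base="$1"; shift
    local -a names=() dists=()
    local cand mb dist min="" i
    for cand in "$@"; do
        [[ "$cand" == "$base" ]] && continue
        # Skip candidates that share no history with the base branch.
        mb="$(git merge-base "$base" "$cand" 2>/dev/null)" || continue
        # Commits reachable from base but not from the merge base.
        dist="$(git rev-list --count "$mb..$base")"
        names+=("$cand"); dists+=("$dist")
        if [[ -z "$min" ]] || (( dist < min )); then
            min="$dist"
        fi
    done
    for i in "${!names[@]}"; do
        if (( dists[i] == min )); then
            echo "${names[i]}"
        fi
    done
}

# Example (inside a repo): nearestByMergeBase hotfix $(git rev-parse --symbolic --branches)
```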
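A Biased Question
You may have noticed that the core function nearestCommonBranches
in the earlier script answers more than question Q1 asks.
In fact, the function answers a more general question:
Q2
Given a branch B and
an ordered set (no duplicates) P of branches (B not in P):
consider the commit C nearest to B'HEAD (C could be B'HEAD)
that is shared by branches in P:
in order, per the order of P, what branches in P have C in their commit history?
Choosing P provides bias, or describes a (limited) convention.
Matching all the characteristics of your biases/conventions may require additional tools, which is out of scope for this discussion.
Modeling Simple Bias/Convention
Bias varies between organizations and practices,
and the following may not be suitable for your organization.
If nothing else, perhaps some of the ideas here might help
you find a solution to your needs.
A Biased Solution; Bias by Branch Naming Convention
Perhaps the bias can be mapped into, and extracted from,
the naming convention in use.
Bias by P (Those Other Branch Names)
We're going to need this for the next step,
so let's see what we can do by filtering branch names by regex.
The combined previous code and new code below is available as a gist: gitr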
#
# Remove/comment-out the function call at the end of script,
# and append this to the end.
##
##
# Given Params:
# BASE_BRANCH : $1 : base branch
# REGEXs : $2 [ .. $N ] : regex(s)
#
# Output:
# - git branches matching at least one of the regex params
# - base branch is excluded from result
# - order: branches matching the Nth regex will appear before
# branches matching the (N+1)th regex.
# - no duplicates in output
#
function expandUniqGitBranches() {
local -A BSET=()
BSET["$1"]=1
shift
local ALL_BRANCHES=$(git rev-parse --symbolic --branches)
for regex in "$@"; do
for branch in $ALL_BRANCHES; do
## ${BSET[$branch]+x} expands to 'x' only when the key is set; -z means "not seen yet"
if [[ $branch =~ $regex && -z "${BSET[$branch]+x}" ]]; then
echo "$branch"
BSET[$branch]=1
fi
done
done
}
##
# Params:
# BASE_BRANCH: $1 : "." equates to the current branch;
# REGEXS : $2..N : regex(es) matching the other branches to include
#
function findBranchesSharingFirstCommonCommit() {
if [[ -z "$1" ]]; then
echo "Usage: findBranchesSharingFirstCommonCommit ( . | baseBranch ) [ regex [ ... ] ]" >&2
exit 1
fi
local BASE_BRANCH
if [[ -z "${1+x}" || "$1" == '.' ]]; then
BASE_BRANCH="$CURRENT_BRANCH"
else
BASE_BRANCH="$1"
fi
shift
local REGEXS
if [[ -z "$1" ]]; then
REGEXS=(".*")
else
REGEXS=("$@")
fi
local BRANCHES=( $(expandUniqGitBranches "$BASE_BRANCH" "${REGEXS[@]}") )
## nearestCommonBranches can also be used here, if batching not used.
repeatBatchingUntilStableResults "$BASE_BRANCH" "${BRANCHES[@]}"
}
findBranchesSharingFirstCommonCommit "$@"
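Biased Results for Example Graph
Let's consider the ordered set
P = { ^release/.*$ ^support/.*$ ^master$ }
Assuming the script (all parts) is in the executable file gitr, run:
gitr <baseBranch> '^release/.*$' '^support/.*$' '^master$'
For different branches B we obtain the following results: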
GIVEN B   | Shared Commit C | Branches P with C in their history (in order) |
feature/a | D | master |
feature/b | D | master |
feature/c | L | release/4, support/1 |
feature/d | L | release/4, support/1 |
feature/e | L | release/4, support/1 |
feature/f | C | release/2, release/3, master |
feature/g | C | release/2, release/3, master |
hotfix    | D | master |
master    | C | release/2, release/3 |
release/2 | C | release/3, master |
release/3 | C | release/2, master |
release/4 | L | support/1 |
support/1 | L | release/4 |
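That's getting closer to a definitive answer; the responses for the release branches aren't ideal. Let's take this one step further.
Bias by BASE_NAME and P
One direction to take this could be to use a different P for different
base names. Let's work out a design for that.
Conventions
DISCLAIMER: A git flow purist I am not; make allowances for me, please.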
A support branch shall branch off master.
There will NOT be two support branches sharing a common commit.
A hotfix branch shall branch off a support branch or master.
A release branch shall branch off a support branch or master.
There may be multiple release branches sharing a common commit;
i.e. branched off master at the same time.
A bugfix branch shall branch off a release branch.
A feature branch may branch off a feature, release, support, or master:
for the purpose of "parent",
one feature branch cannot be established as
a parent over another (see initial discussion).
therefore: skip feature branches and
look for "parent" among release, support, and/or master branches.
Any other branch name is to be considered a working branch,
with the same conventions as a feature branch.
Let's see how far we get with this:
Base Branch Pattern | Parent Branches, Ordered | Comment(s) |
^master$     | n/a | no parent |
^support/.*$ | ^master$ | |
^hotfix/.*$  | ^support/.*$ ^master$ | give preference to a support branch over master (ordering) |
^release/.*$ | ^support/.*$ ^master$ | give preference to a support branch over master (ordering) |
^bugfix/.*$  | ^release/.*$ | |
^feature/.*$ | ^release/.*$ ^support/.*$ ^master$ | |
^.*$         | ^release/.*$ ^support/.*$ ^master$ | Redundant, but keep design concerns separate |
Script
The combined previous code and new code below is available as a gist: gitp
#
# Remove/comment-out the function call at the end of script,
# and append this to the end.
##
# bash associative arrays do NOT maintain key/entry order.
# So, use two maps, values correlated by index:
declare -a MAP_BASE_BRANCH_REGEX=( "^master$" \
"^support/.*$" \
"^hotfix/.*$" \
"^release/.*$" \
"^bugfix/.*$" \
"^feature/.*$" \
"^.*$" )
declare -a MAP_BRANCHES_REGEXS=("" \
"^master$" \
"^support/.*$ ^master$" \
"^support/.*$ ^master$" \
"^release/.*$" \
"^release/.*$ ^support/.*$ ^master$" \
"^release/.*$ ^support/.*$ ^master$" )
function findBranchesByBaseBranch() {
local BASE_BRANCH
if [[ -z "${1+x}" || "$1" == '.' ]]; then
BASE_BRANCH="$CURRENT_BRANCH"
else
BASE_BRANCH="$1"
fi
for idx in "${!MAP_BASE_BRANCH_REGEX[@]}"; do
local BASE_BRANCH_REGEX=${MAP_BASE_BRANCH_REGEX[$idx]}
if [[ "$BASE_BRANCH" =~ $BASE_BRANCH_REGEX ]]; then
local BRANCHES_REGEXS=( ${MAP_BRANCHES_REGEXS[$idx]} )
if (( ${#BRANCHES_REGEXS[@]} > 0 )); then
findBranchesSharingFirstCommonCommit "$BASE_BRANCH" "${BRANCHES_REGEXS[@]}"
fi
break
fi
done
}
findBranchesByBaseBranch "$1"
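Biased Results for Example Graph
Assuming the script (all parts) is in the executable file gitp, run:
gitp <baseBranch>
For different branches B we obtain the following results: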
GIVEN B   | Shared Commit C | Branches P with C in their history (in order) |
feature/a | D | master |
feature/b | D | master |
feature/c | L | release/4, support/1 |
feature/d | L | release/4, support/1 |
feature/e | L | release/4, support/1 |
feature/f | C | release/2, release/3, master |
feature/g | C | release/2, release/3, master |
hotfix    | D | master |
master    |   | (blank, no value) |
release/2 | C | master |
release/3 | C | master |
release/4 | L | support/1 |
support/1 | L | master |
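Refactor for the Win!
Opportunities!
In this last example, the release branch shares a common commit
with multiple others: release, support, or master branches.
Let's "refactor" or re-evaluate the conventions in use, and tighten them a bit.
Consider this git usage convention:
When creating a new release branch:
immediately create a new commit; perhaps update a version, or the README file.
This ensures that feature/work branches
for the release (branched off the release)
will have a commit shared with the release branch
prior to (and not shared by) the commit for the underlying
support or master branch.
For example:
        G---H   <- feature/z
       /
      E         <- release/1
     /
A---B---C---D   <- master
     \
      F         <- release/2
A feature branched off release/1 could not have a common commit
that includes release/1 (its parent) and master or release/2.
That provides one result, the parent, for every branch,
with these conventions.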
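The convention itself is easy to script. The sketch below is one possible way to do it: startReleaseBranch is a hypothetical helper, and it uses an empty commit purely to keep the example short (in practice the first commit would more likely be a version bump or a README change).

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a release branch and give it its own first
# commit right away, so branches taken off it later share a commit with it
# that master (or a support branch) does not have.
#   $1 : new release branch name, e.g. release/1
#   $2 : branch to start from, e.g. master
startReleaseBranch() {
    local release="$1" from="$2"
    git checkout -q -b "$release" "$from" || return 1
    # Empty commit for illustration only; a real version bump works as well.
    git commit -q --allow-empty -m "Start $release"
}

# Example (inside a repo): startReleaseBranch release/1 master
```

After this, git merge-base between release/1 and any branch created from it resolves to a commit that is not in master's history.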
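DONE! With tools and conventions, I can live in an OCD-friendly structured git world.
Your mileage may vary!
Parting Thoughts
Gists
gitr: find (possibly multiple) related branches
gitp: find likely parents via git-flow-like internal rules/regex
Foremost: I've come to the conclusion that,
beyond what's been presented here,
at some point one may need to accept that there may be multiple
branches to deal with.
Perhaps validations might be done on all potential branches;
"at least one" or "all" or ?? rules might be applied.
It's weeks like this that I really think it's time I learn Python.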