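How do I count the total number of lines changed by a specific author in a Git repository?

Is there a command I can run to count the lines changed by a specific author in a Git repository? I know there must be a way to count the number of commits, because GitHub does this for its Impact graph.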

Current answer

Git fame

https://github.com/oleander/git-fame-rb

This is a nice tool for getting the counts for all authors at once, including the number of commits and the number of modified files:

sudo apt-get install ruby-dev
sudo gem install git_fame
cd /path/to/gitdir && git fame

There is also a Python version at https://github.com/casperdcl/git-fame (mentioned by @fracz):

sudo apt-get install python-pip python-dev build-essential 
pip install --user git-fame
cd /path/to/gitdir && git fame

Sample output:

Total number of files: 2,053
Total number of lines: 63,132
Total number of commits: 4,330

+------------------------+--------+---------+-------+--------------------+
| name                   | loc    | commits | files | percent            |
+------------------------+--------+---------+-------+--------------------+
| Johan Sørensen         | 22,272 | 1,814   | 414   | 35.3 / 41.9 / 20.2 |
| Marius Mathiesen       | 10,387 | 502     | 229   | 16.5 / 11.6 / 11.2 |
| Jesper Josefsson       | 9,689  | 519     | 191   | 15.3 / 12.0 / 9.3  |
| Ole Martin Kristiansen | 6,632  | 24      | 60    | 10.5 / 0.6 / 2.9   |
| Linus Oleander         | 5,769  | 705     | 277   | 9.1 / 16.3 / 13.5  |
| Fabio Akita            | 2,122  | 24      | 60    | 3.4 / 0.6 / 2.9    |
| August Lilleaas        | 1,572  | 123     | 63    | 2.5 / 2.8 / 3.1    |
| David A. Cuadrado      | 731    | 111     | 35    | 1.2 / 2.6 / 1.7    |
| Jonas Ängeslevä        | 705    | 148     | 51    | 1.1 / 3.4 / 2.5    |
| Diego Algorta          | 650    | 6       | 5     | 1.0 / 0.1 / 0.2    |
| Arash Rouhani          | 629    | 95      | 31    | 1.0 / 2.2 / 1.5    |
| Sofia Larsson          | 595    | 70      | 77    | 0.9 / 1.6 / 3.8    |
| Tor Arne Vestbø        | 527    | 51      | 97    | 0.8 / 1.2 / 4.7    |
| spontus                | 339    | 18      | 42    | 0.5 / 0.4 / 2.0    |
| Pontus                 | 225    | 49      | 34    | 0.4 / 1.1 / 1.7    |
+------------------------+--------+---------+-------+--------------------+

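Be warned, though: as Jared mentioned in the comments, running it on a very large repository will take hours. I'm not sure that can be improved, considering how much Git data it has to process.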

2014-09-27 19:57:23

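Other answers

AaronM's answer using a shell one-liner is good, but it has one more bug: when the amount of whitespace between the user name and the date varies, the extra spaces corrupt the user names. The corrupted names produce several rows for the same user, and you have to add them up yourself.

This small change fixed the issue for me: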

git ls-files -z | xargs -0n1 git blame -w --show-email | perl -n -e '/^.*?\((.*?)\s+[\d]{4}/; print $1,"\n"' | sort -f | uniq -c | sort -n

Note the + after \s, which consumes all the whitespace between the name and the date.

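Actually, I'm adding this answer as much for my own memory as to help anyone else, since this is at least the second time I've googled the subject :)

I added --show-email to git blame -w to aggregate by email instead, since some people use different name formats on different computers, and sometimes two people with the same name work in the same repository.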

2013-03-19 15:08:37

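After looking at Alex's and Gerty3000's answers, I tried to shorten the one-liner.

Basically, it uses git log --numstat and does not keep track of the number of files changed.

Git version 2.1.0 on Mac OS X: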

git log --format='%aN' | sort -u | while read name; do echo -en "$name\t"; git log --author="$name" --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }' -; done

Example:

Jared Burrows   added lines: 6826, removed lines: 2825, total lines: 4001
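Since the question asks about one specific author, the loop can be dropped. The following is a sketch rather than a drop-in replacement: the function name `author_line_stats` is made up, and it skips the `-` entries that `--numstat` prints for binary files instead of treating them as zero.

```shell
# Hypothetical helper: sum added and removed lines for a single author.
# Note that --author is matched as a pattern against "Name <email>",
# so a short name can also match other authors.
author_line_stats() {
  git log --author="$1" --pretty=tformat: --numstat |
    awk '$1 != "-" { add += $1; subs += $2 }
         END { printf "added lines: %d, removed lines: %d, total lines: %d\n",
                      add, subs, add - subs }'
}
```

Run it inside the repository, for example as `author_line_stats "Jared Burrows"`.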

2014-09-16 18:38:21

I found the following useful for seeing who owns the most lines in the current code base:

git ls-files -z | xargs -0n1 git blame -w | ruby -n -e '$_ =~ /^.*\((.*?)\s[\d]{4}/; puts $1.strip' | sort -f | uniq -c | sort -n

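The other answers mostly focus on lines changed in commits, but if those changes did not survive and were later overwritten, they may just have been churn. The incantation above also ranks all committers by line count, instead of handling one at a time. You can add options to git blame (-C -M) to get better numbers that take file moves and line moves between files into account, but the command may run much longer if you do.

Also, if you are looking for lines changed across all commits for all committers, the following little script is helpful: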

http://git-wt-commit.rubyforge.org/#git-rank-contributors

2011-03-19 05:53:09

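The variants given by @mmrobins, @AaronM, @ErikZ and @JamesMishra all share a problem: they ask git to produce a mixture of information not intended for script consumption, with line contents from the repository on the same line, and then pick through the mess with a regexp.

That breaks when some lines are not valid UTF-8 text, and also when some line contents happen to match the regexp (which happened here).

Here is a modified one-liner that does not have these problems. It asks git to output the data cleanly on separate lines, which makes it easy to filter out exactly what we want: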

git ls-files -z | xargs -0n1 git blame -w --line-porcelain | grep -a "^author " | sort -f | uniq -c | sort -n

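You can grep for other strings as well, such as author-mail, committer, and so on.

It may help to run export LC_ALL=C first (assuming bash) to force byte-level processing. This also happens to speed up grep considerably compared with UTF-8-based locales.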
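Putting those two remarks together, a sketch might look like the following. The function name `blame_counts` and its optional field argument are assumptions for illustration. The body runs in a subshell so that the LC_ALL setting does not leak into the calling shell.

```shell
# Hypothetical helper: count surviving lines per value of one
# --line-porcelain header field (author, author-mail, committer, ...).
# Defaults to "author" when no argument is given.
blame_counts() (
  field=${1:-author}
  export LC_ALL=C   # byte-level processing for grep and sort
  git ls-files -z |
    xargs -0 -n1 git blame -w --line-porcelain |
    grep -a "^$field " |
    sort -f | uniq -c | sort -n
)
```

Inside a repository, `blame_counts author-mail` groups by email address. The trailing space in the pattern keeps `author` from also matching `author-mail`, and file content lines cannot match because the porcelain format prefixes them with a tab.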

2016-03-18 17:03:55

In case anyone wants to see the stats for every user in their code base, a couple of my coworkers recently came up with this horrific one-liner:

git log --shortstat --pretty="%cE" | sed 's/\(.*\)@.*/\1/' | grep -v "^$" | awk 'BEGIN { line=""; } !/^ / { if (line=="" || !match(line, $0)) {line = $0 "," line }} /^ / { print line " # " $0; line=""}' | sort | sed -E 's/# //;s/ files? changed,//;s/([0-9]+) ([0-9]+ deletion)/\1 0 insertions\(+\), \2/;s/\(\+\)$/\(\+\), 0 deletions\(-\)/;s/insertions?\(\+\), //;s/ deletions?\(-\)//' | awk 'BEGIN {name=""; files=0; insertions=0; deletions=0;} {if ($1 != name && name != "") { print name ": " files " files changed, " insertions " insertions(+), " deletions " deletions(-), " insertions-deletions " net"; files=0; insertions=0; deletions=0; name=$1; } name=$1; files+=$2; insertions+=$3; deletions+=$4} END {print name ": " files " files changed, " insertions " insertions(+), " deletions " deletions(-), " insertions-deletions " net";}'

(It takes a few minutes to crunch through our repo, which has around 10-15k commits.)

2013-12-06 01:49:25
