我是否可以调用一个命令来计算Git存储库中特定作者更改的行数?我知道一定有方法来计算提交的数量,因为Github为他们的影响图这样做。


当前回答

要统计给定作者(或所有作者)在给定分支上提交的数量,可以使用git-shortlog;特别是它的——编号和——摘要选项,例如在git存储库上运行时:

$ git shortlog v1.6.4 --numbered --summary
  6904  Junio C Hamano
  1320  Shawn O. Pearce
  1065  Linus Torvalds
    692  Johannes Schindelin
    443  Eric Wong

其他回答

你可以使用whodid (https://www.npmjs.com/package/whodid)

$ npm install whodid -g
$ cd your-project-dir

and

$ whodid author --include-merge=false --path=./ --valid-threshold=1000 --since=1.week

或者直接输入

$ whodid

然后你可以看到这样的结果

Contribution state
=====================================================
 score  | author
-----------------------------------------------------
 3059   | someguy <someguy@tensorflow.org>
 585    | somelady <somelady@tensorflow.org>
 212    | niceguy <nice@google.com>
 173    | coolguy <coolgay@google.com>
=====================================================

下面是一个快速ruby脚本,针对给定的日志查询汇总每个用户的影响。

例如rubinius:

Brian Ford: 4410668
Evan Phoenix: 1906343
Ryan Davis: 855674
Shane Becker: 242904
Alexander Kellett: 167600
Eric Hodel: 132986
Dirkjan Bussink: 113756
...

脚本:

#!/usr/bin/env ruby

impact = Hash.new(0)

IO.popen("git log --pretty=format:\"%an\" --shortstat #{ARGV.join(' ')}") do |f|
  prev_line = ''
  while line = f.gets
    changes = /(\d+) insertions.*(\d+) deletions/.match(line)

    if changes
      impact[prev_line] += changes[1].to_i + changes[2].to_i
    end

    prev_line = line # Names are on a line of their own, just before the stats
  end
end

impact.sort_by { |a,i| -i }.each do |author, impact|
  puts "#{author.strip}: #{impact}"
end

该问题要求提供关于特定作者的信息,但许多答案都是根据修改的代码行返回作者排名列表的解决方案。

这正是我想要的,但现有的解决方案并不完美。为了方便人们通过谷歌找到这个问题,我对它们进行了一些改进,并将其制成一个shell脚本,下面显示该脚本。

它不依赖于Perl或Ruby。此外,空格、重命名和行移动都在行更改计数中被考虑在内。只需将其放入一个文件中,并将Git存储库作为第一个参数传递。

#!/bin/bash
git --git-dir="$1/.git" log > /dev/null 2> /dev/null
if [ $? -eq 128 ]
then
    echo "Not a git repository!"
    exit 128
else
    echo -e "Lines  | Name\nChanged|"
    git --work-tree="$1" --git-dir="$1/.git" ls-files -z |\
    xargs -0n1 git --work-tree="$1" --git-dir="$1/.git" blame -C -M  -w |\
    cut -d'(' -f2 |\
    cut -d2 -f1 |\
    sed -e "s/ \{1,\}$//" |\
    sort |\
    uniq -c |\
    sort -nr
fi

我对上面的一个简短的回答作了修改,但这不足以满足我的需要。我需要能够对提交的行和最终代码中的行进行分类。我还想按文件进行细分。这段代码不递归,它只返回单个目录的结果,但如果有人想进一步了解,这是一个很好的开始。复制并粘贴到文件中并使其可执行或使用Perl运行。

#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;

my $dir = shift;

die "Please provide a directory name to check\n"
    unless $dir;

chdir $dir
    or die "Failed to enter the specified directory '$dir': $!\n";

if ( ! open(GIT_LS,'-|','git ls-files') ) {
    die "Failed to process 'git ls-files': $!\n";
}
my %stats;
while (my $file = <GIT_LS>) {
    chomp $file;
    if ( ! open(GIT_LOG,'-|',"git log --numstat $file") ) {
        die "Failed to process 'git log --numstat $file': $!\n";
    }
    my $author;
    while (my $log_line = <GIT_LOG>) {
        if ( $log_line =~ m{^Author:\s*([^<]*?)\s*<([^>]*)>} ) {
            $author = lc($1);
        }
        elsif ( $log_line =~ m{^(\d+)\s+(\d+)\s+(.*)} ) {
            my $added = $1;
            my $removed = $2;
            my $file = $3;
            $stats{total}{by_author}{$author}{added}        += $added;
            $stats{total}{by_author}{$author}{removed}      += $removed;
            $stats{total}{by_author}{total}{added}          += $added;
            $stats{total}{by_author}{total}{removed}        += $removed;

            $stats{total}{by_file}{$file}{$author}{added}   += $added;
            $stats{total}{by_file}{$file}{$author}{removed} += $removed;
            $stats{total}{by_file}{$file}{total}{added}     += $added;
            $stats{total}{by_file}{$file}{total}{removed}   += $removed;
        }
    }
    close GIT_LOG;

    if ( ! open(GIT_BLAME,'-|',"git blame -w $file") ) {
        die "Failed to process 'git blame -w $file': $!\n";
    }
    while (my $log_line = <GIT_BLAME>) {
        if ( $log_line =~ m{\((.*?)\s+\d{4}} ) {
            my $author = $1;
            $stats{final}{by_author}{$author}     ++;
            $stats{final}{by_file}{$file}{$author}++;

            $stats{final}{by_author}{total}       ++;
            $stats{final}{by_file}{$file}{total}  ++;
            $stats{final}{by_file}{$file}{total}  ++;
        }
    }
    close GIT_BLAME;
}
close GIT_LS;

print "Total lines committed by author by file\n";
printf "%25s %25s %8s %8s %9s\n",'file','author','added','removed','pct add';
foreach my $file (sort keys %{$stats{total}{by_file}}) {
    printf "%25s %4.0f%%\n",$file
            ,100*$stats{total}{by_file}{$file}{total}{added}/$stats{total}{by_author}{total}{added};
    foreach my $author (sort keys %{$stats{total}{by_file}{$file}}) {
        next if $author eq 'total';
        if ( $stats{total}{by_file}{$file}{total}{added} ) {
            printf "%25s %25s %8d %8d %8.0f%%\n",'', $author,@{$stats{total}{by_file}{$file}{$author}}{qw{added removed}}
            ,100*$stats{total}{by_file}{$file}{$author}{added}/$stats{total}{by_file}{$file}{total}{added};
        } else {
            printf "%25s %25s %8d %8d\n",'', $author,@{$stats{total}{by_file}{$file}{$author}}{qw{added removed}} ;
        }
    }
}
print "\n";

print "Total lines in the final project by author by file\n";
printf "%25s %25s %8s %9s %9s\n",'file','author','final','percent', '% of all';
foreach my $file (sort keys %{$stats{final}{by_file}}) {
    printf "%25s %4.0f%%\n",$file
            ,100*$stats{final}{by_file}{$file}{total}/$stats{final}{by_author}{total};
    foreach my $author (sort keys %{$stats{final}{by_file}{$file}}) {
        next if $author eq 'total';
        printf "%25s %25s %8d %8.0f%% %8.0f%%\n",'', $author,$stats{final}{by_file}{$file}{$author}
            ,100*$stats{final}{by_file}{$file}{$author}/$stats{final}{by_file}{$file}{total}
            ,100*$stats{final}{by_file}{$file}{$author}/$stats{final}{by_author}{total}
        ;
    }
}
print "\n";


print "Total lines committed by author\n";
printf "%25s %8s %8s %9s\n",'author','added','removed','pct add';
foreach my $author (sort keys %{$stats{total}{by_author}}) {
    next if $author eq 'total';
    printf "%25s %8d %8d %8.0f%%\n",$author,@{$stats{total}{by_author}{$author}}{qw{added removed}}
        ,100*$stats{total}{by_author}{$author}{added}/$stats{total}{by_author}{total}{added};
};
print "\n";


print "Total lines in the final project by author\n";
printf "%25s %8s %9s\n",'author','final','percent';
foreach my $author (sort keys %{$stats{final}{by_author}}) {
    printf "%25s %8d %8.0f%%\n",$author,$stats{final}{by_author}{$author}
        ,100*$stats{final}{by_author}{$author}/$stats{final}{by_author}{total};
}

我发现下面的方法对于查看当前代码库中谁拥有最多的行很有用:

git ls-files -z | xargs -0n1 git blame -w | ruby -n -e '$_ =~ /^.*\((.*?)\s[\d]{4}/; puts $1.strip' | sort -f | uniq -c | sort -n

其他答案主要集中在提交中更改的行,但如果提交无法存活并被覆盖,则它们可能只是被更改了。上面的咒语还可以让您按行对所有提交者进行排序,而不是一次只排序一个。您可以向git blame (-C -M)添加一些选项,以获得一些更好的数字,将文件移动和文件之间的行移动考虑在内,但如果这样做,该命令可能会运行更长时间。

同样,如果你正在为所有提交者寻找在所有提交中更改的行,下面的小脚本很有帮助:

http://git-wt-commit.rubyforge.org/#git-rank-contributors