我是否可以调用一个命令来计算Git存储库中特定作者更改的行数?我知道一定有方法来计算提交的数量,因为Github为他们的影响图这样做。
当前回答
AaronM使用shell一行程序得到的答案很好,但实际上,还有另一个错误,如果用户名和日期之间有不同数量的空格,空格会破坏用户名。损坏的用户名将给出多行用户计数,您必须自己将它们相加。
这个小小的改变解决了我的问题:
git ls-files -z | xargs -0n1 git blame -w --show-email | perl -n -e '/^.*?\((.*?)\s+[\d]{4}/; print $1,"\n"' | sort -f | uniq -c | sort -n
注意\s后面的+,它将占用从名称到日期的所有空白。
实际上,添加这个答案既是为了帮助别人,也是为了我自己的记忆,因为这至少是我第二次谷歌这个主题:)
增加了——show-email to git blame -w来聚合email,因为有些人在不同的计算机上使用不同的Name格式,有时两个同名的人在同一个git中工作。
其他回答
使用以下方法将日志保存到文件:
git log --author="<authorname>" --oneline --shortstat > logs.txt
对于Python爱好者:
with open(r".\logs.txt", "r", encoding="utf8") as f:
files = insertions = deletions = 0
for line in f:
if ' changed' in line:
line = line.strip()
spl = line.split(', ')
if len(spl) > 0:
files += int(spl[0].split(' ')[0])
if len(spl) > 1:
insertions += int(spl[1].split(' ')[0])
if len(spl) > 2:
deletions += int(spl[2].split(' ')[0])
print(str(files).ljust(10) + ' files changed')
print(str(insertions).ljust(10) + ' insertions')
print(str(deletions).ljust(10) + ' deletions')
你的输出是这样的:
225 files changed
6751 insertions
1379 deletions
我编写了这个Perl脚本来完成这项任务。
#!/usr/bin/env perl
use strict;
use warnings;
# save the args to pass to the git log command
my $ARGS = join(' ', @ARGV);
#get the repo slug
my $NAME = _get_repo_slug();
#get list of authors
my @authors = _get_authors();
my ($projectFiles, $projectInsertions, $projectDeletions) = (0,0,0);
#for each author
foreach my $author (@authors) {
my $command = qq{git log $ARGS --author="$author" --oneline --shortstat --no-merges};
my ($files, $insertions, $deletions) = (0,0,0);
my @lines = `$command`;
foreach my $line (@lines) {
if ($line =~ m/^\s(\d+)\s\w+\s\w+,\s(\d+)\s\w+\([\+|\-]\),\s(\d+)\s\w+\([\+|\-]\)$|^\s(\d+)\s\w+\s\w+,\s(\d+)\s\w+\(([\+|\-])\)$/) {
my $lineFiles = $1 ? $1 : $4;
my $lineInsertions = (defined $6 && $6 eq '+') ? $5 : (defined $2) ? $2 : 0;
my $lineDeletions = (defined $6 && $6 eq '-') ? $5 : (defined $3) ? $3 : 0;
$files += $lineFiles;
$insertions += $lineInsertions;
$deletions += $lineDeletions;
$projectFiles += $lineFiles;
$projectInsertions += $lineInsertions;
$projectDeletions += $lineDeletions;
}
}
if ($files || $insertions || $deletions) {
printf(
"%s,%s,%s,+%s,-%s,%s\n",
$NAME,
$author,
$files,
$insertions,
$deletions,
$insertions - $deletions
);
}
}
printf(
"%s,%s,%s,+%s,-%s,%s\n",
$NAME,
'PROJECT_TOTAL',
$projectFiles,
$projectInsertions,
$projectDeletions,
$projectInsertions - $projectDeletions
);
exit 0;
#get the remote.origin.url joins that last two pieces (project and repo folder)
#and removes any .git from the results.
sub _get_repo_slug {
my $get_remote_url = "git config --get remote.origin.url";
my $remote_url = `$get_remote_url`;
chomp $remote_url;
my @parts = split('/', $remote_url);
my $slug = join('-', @parts[-2..-1]);
$slug =~ s/\.git//;
return $slug;
}
sub _get_authors {
my $git_authors = 'git shortlog -s | cut -c8-';
my @authors = `$git_authors`;
chomp @authors;
return @authors;
}
我将其命名为git-line-changes-by-author,并放入/usr/local/bin。因为它保存在我的路径中,所以我可以在2020-01-01之后发出命令git line-changes-by-author—before 2018-12-31—以获得2019年的报告。举个例子。如果我拼错了名字,git会建议正确的拼写。
你可能想要调整_get_repo_slug子只包括remote.origin.url的最后一部分,因为我的回购保存为项目/回购,而你的可能不是。
你想责怪Git。
有一个——show-stats选项来打印一些统计数据。
我对上面的一个简短的回答作了修改,但这不足以满足我的需要。我需要能够对提交的行和最终代码中的行进行分类。我还想按文件进行细分。这段代码不递归,它只返回单个目录的结果,但如果有人想进一步了解,这是一个很好的开始。复制并粘贴到文件中并使其可执行或使用Perl运行。
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my $dir = shift;
die "Please provide a directory name to check\n"
unless $dir;
chdir $dir
or die "Failed to enter the specified directory '$dir': $!\n";
if ( ! open(GIT_LS,'-|','git ls-files') ) {
die "Failed to process 'git ls-files': $!\n";
}
my %stats;
while (my $file = <GIT_LS>) {
chomp $file;
if ( ! open(GIT_LOG,'-|',"git log --numstat $file") ) {
die "Failed to process 'git log --numstat $file': $!\n";
}
my $author;
while (my $log_line = <GIT_LOG>) {
if ( $log_line =~ m{^Author:\s*([^<]*?)\s*<([^>]*)>} ) {
$author = lc($1);
}
elsif ( $log_line =~ m{^(\d+)\s+(\d+)\s+(.*)} ) {
my $added = $1;
my $removed = $2;
my $file = $3;
$stats{total}{by_author}{$author}{added} += $added;
$stats{total}{by_author}{$author}{removed} += $removed;
$stats{total}{by_author}{total}{added} += $added;
$stats{total}{by_author}{total}{removed} += $removed;
$stats{total}{by_file}{$file}{$author}{added} += $added;
$stats{total}{by_file}{$file}{$author}{removed} += $removed;
$stats{total}{by_file}{$file}{total}{added} += $added;
$stats{total}{by_file}{$file}{total}{removed} += $removed;
}
}
close GIT_LOG;
if ( ! open(GIT_BLAME,'-|',"git blame -w $file") ) {
die "Failed to process 'git blame -w $file': $!\n";
}
while (my $log_line = <GIT_BLAME>) {
if ( $log_line =~ m{\((.*?)\s+\d{4}} ) {
my $author = $1;
$stats{final}{by_author}{$author} ++;
$stats{final}{by_file}{$file}{$author}++;
$stats{final}{by_author}{total} ++;
$stats{final}{by_file}{$file}{total} ++;
$stats{final}{by_file}{$file}{total} ++;
}
}
close GIT_BLAME;
}
close GIT_LS;
print "Total lines committed by author by file\n";
printf "%25s %25s %8s %8s %9s\n",'file','author','added','removed','pct add';
foreach my $file (sort keys %{$stats{total}{by_file}}) {
printf "%25s %4.0f%%\n",$file
,100*$stats{total}{by_file}{$file}{total}{added}/$stats{total}{by_author}{total}{added};
foreach my $author (sort keys %{$stats{total}{by_file}{$file}}) {
next if $author eq 'total';
if ( $stats{total}{by_file}{$file}{total}{added} ) {
printf "%25s %25s %8d %8d %8.0f%%\n",'', $author,@{$stats{total}{by_file}{$file}{$author}}{qw{added removed}}
,100*$stats{total}{by_file}{$file}{$author}{added}/$stats{total}{by_file}{$file}{total}{added};
} else {
printf "%25s %25s %8d %8d\n",'', $author,@{$stats{total}{by_file}{$file}{$author}}{qw{added removed}} ;
}
}
}
print "\n";
print "Total lines in the final project by author by file\n";
printf "%25s %25s %8s %9s %9s\n",'file','author','final','percent', '% of all';
foreach my $file (sort keys %{$stats{final}{by_file}}) {
printf "%25s %4.0f%%\n",$file
,100*$stats{final}{by_file}{$file}{total}/$stats{final}{by_author}{total};
foreach my $author (sort keys %{$stats{final}{by_file}{$file}}) {
next if $author eq 'total';
printf "%25s %25s %8d %8.0f%% %8.0f%%\n",'', $author,$stats{final}{by_file}{$file}{$author}
,100*$stats{final}{by_file}{$file}{$author}/$stats{final}{by_file}{$file}{total}
,100*$stats{final}{by_file}{$file}{$author}/$stats{final}{by_author}{total}
;
}
}
print "\n";
print "Total lines committed by author\n";
printf "%25s %8s %8s %9s\n",'author','added','removed','pct add';
foreach my $author (sort keys %{$stats{total}{by_author}}) {
next if $author eq 'total';
printf "%25s %8d %8d %8.0f%%\n",$author,@{$stats{total}{by_author}{$author}}{qw{added removed}}
,100*$stats{total}{by_author}{$author}{added}/$stats{total}{by_author}{total}{added};
};
print "\n";
print "Total lines in the final project by author\n";
printf "%25s %8s %9s\n",'author','final','percent';
foreach my $author (sort keys %{$stats{final}{by_author}}) {
printf "%25s %8d %8.0f%%\n",$author,$stats{final}{by_author}{$author}
,100*$stats{final}{by_author}{$author}/$stats{final}{by_author}{total};
}
下面是一个简短的一行代码,用于生成所有作者的统计信息。它比Dan在https://stackoverflow.com/a/20414465/1102119上的解决方案快得多(我的解决方案的时间复杂度是O(N),而不是O(NM),其中N是提交的数量,M是作者的数量)。
git log --no-merges --pretty=format:%an --numstat | awk '/./ && !author { author = $0; next } author { ins[author] += $1; del[author] += $2 } /^$/ { author = ""; next } END { for (a in ins) { printf "%10d %10d %10d %s\n", ins[a] - del[a], ins[a], del[a], a } }' | sort -rn
推荐文章
- 如何查看远程标签?
- Maven命令行如何指向特定的settings.xml为单个命令?
- Git:在推送后删除提交的文件
- Git分支之间的视觉差异
- 在GitHub中编辑git提交消息
- 是否有可能' git状态'只修改文件?
- Git:如何区分两个不同的文件在不同的分支?
- 如何从远程Git存储库中提取并覆盖本地存储库中的更改?
- Github:导入上游分支到fork
- Git单次修订的日志
- Git在不改变提交时间戳的情况下进行改基
- 如何循环通过文件匹配通配符在批处理文件
- VS 2017 Git本地提交数据库。每次提交时锁定错误
- 如何在过去的一些任意提交之间注入一个提交?
- 从GitHub克隆项目后拉git子模块