Not in a Git repository, but on GitHub: how do I search the commit messages of a specific repository/branch?


Current answer

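You can do this for repositories that Google has crawled (results vary from repository to repository).

Search all branches of all crawled repositories for "change license":

"change license" site:https://github.com/*/*/commits

Search the master branch of all crawled repositories for "change license":

"change license" site:https://github.com/*/*/commits/master

Search the master branch of all crawled twitter repositories for "change license":

"change license" site:https://github.com/twitter/*/commits/master

Search all branches of the twitter/some_project repository for "change license":

"change license" site:https://github.com/twitter/some_project/commits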
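
The four queries above follow one pattern, so they can be generated rather than typed. A minimal sketch, assuming a POSIX shell; the function name `google_commit_query` and its argument order are hypothetical, and the output is only a string to paste into Google:

```shell
# Hypothetical helper: build the Google query for a given scope.
# Arguments: phrase, owner (default *), repo (default *), branch (optional).
google_commit_query() {
  phrase=$1
  owner=${2:-*}
  repo=${3:-*}
  branch=${4:-}
  url="https://github.com/$owner/$repo/commits"
  # Append the branch only when one was given; otherwise all branches match.
  if [ -n "$branch" ]; then
    url="$url/$branch"
  fi
  printf '"%s" site:%s\n' "$phrase" "$url"
}

# All branches of one repository.
google_commit_query "change license" twitter some_project
# The master branch of every crawled twitter repository.
google_commit_query "change license" twitter '*' master
```

Whether a given repository shows up still depends on what Google has crawled, as noted above.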

Other answers

Update January 2023 (eight years later):

With GitHub CLI gh v2.22.0 (January 2023), you can search from within your locally cloned GitHub repository:

See gh search commits:

Examples:

# search commits matching set of keywords "readme" and "typo"
$ gh search commits readme typo

# search commits matching phrase "bug fix"
$ gh search commits "bug fix"

# search commits committed by user "monalisa"
$ gh search commits --committer=monalisa
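
Since the question is about one specific repository, the same command can be narrowed with its `--repo` flag. A minimal sketch, assuming `gh` is installed and authenticated; the wrapper name `search_repo_commits`, the `DRY_RUN` variable, and the repository `cli/cli` are illustrations only:

```shell
# Hypothetical wrapper around `gh search commits` for a single repository.
# With DRY_RUN=1 it prints the command instead of contacting GitHub.
search_repo_commits() {
  repo=$1   # OWNER/REPO
  shift
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "gh search commits --repo=$repo $*"
  else
    gh search commits --repo="$repo" "$@"
  fi
}

# Preview the command without running it.
DRY_RUN=1 search_repo_commits cli/cli readme typo
# prints: gh search commits --repo=cli/cli readme typo
```

Dropping `DRY_RUN=1` runs the real search, which needs network access and a logged-in `gh`.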

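Update January 2017 (two years later):

You can now search commit messages! (Still only in the master branch.)


February 2015: Not sure this is possible, given the current search infrastructure based on Elasticsearch (introduced in January 2013).

As an answer "from a credible and/or official source", here is an interview (August 2013) with the person at GitHub responsible for introducing Elasticsearch.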

Tim Pease: We have two document types in there: One is a source code file and the other one is a repository. The way that git works is you have commits and you have a branch for each commit. Repository documents keep track of the most recent commit for that particular repository that has been indexed. When a user pushes a new commit up to Github, we then pull that repository document from elasticsearch. We then see the most recently indexed commit and then we get a list of all the files that had been modified, or added, or deleted between this recent push and what we have previously indexed. Then we can go ahead and just update those documents which have been changed. We don’t have to re-index the entire source code tree every time someone pushes.

Andrew Cholakian: So, you guys only index, I’m assuming, the master branch.

Tim Pease: Correct. It’s only the head of the master branch that you’re going to get in there and still that’s a lot of data, two billion documents, 30 terabytes.

Andrew Cholakian: That is awesomely huge.

[...]

Tim Pease: With indexing source code on push, it’s a self-healing process. We have that repository document which keeps track of the last indexed commit. If we missed, just happen to miss three commits where those jobs fail, the next commit that comes in, we’re still looking at the diff between the previous commit that we indexed and the one that we’re seeing with this new push. You do a git diff and you get all the files that have been updated, deleted, or added. You can just say, “Okay, we need to remove these files. We need to add these files, and all that.” It’s self-healing and that’s the approach that we have taken with pretty much all of the architecture.

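This means that not all branches of all repos would be indexed with that approach. A global commit message search is not available at the moment. Tim Pease himself also confirmed that commit messages are not indexed.

Note that it is not impossible to build your own local Elasticsearch index of a local clone: see "Searching a git repo with ElasticSearch".

But for a specific repo, the simplest way is still to clone it and run:

git log --all --grep='my search'

(More options in "How to search a Git repository by commit message?")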
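
To see how `--all` and a branch name change the scope of the `git log --grep` search, here is a small self-contained sketch. It assumes `git` and `mktemp` are available; the throwaway repository, its commit messages, and the branch name `feature` are made up for the demonstration:

```shell
# Throwaway repository, created only to have something to search (hypothetical data).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Two commits on the default branch, one more on a side branch.
git commit -q --allow-empty -m "Initial commit"
git commit -q --allow-empty -m "Change license to MIT"
git checkout -q -b feature
git commit -q --allow-empty -m "Fix typo in README"

# Every branch, case-insensitive match on the commit message.
all_matches=$(git log --all -i --grep='change license' --format=%s)
echo "$all_matches"

# Only the commits reachable from one branch.
feature_matches=$(git log feature --grep='typo' --format=%s)
echo "$feature_matches"
```

On a real clone, replace `feature` with the branch you care about (for example `origin/master`) and drop the setup lines.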

Update (2017/01/05):

GitHub has published an update that now lets you search commit messages from within their UI. See the blog post for more information.


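I had the same question and contacted someone at GitHub yesterday:

Since they switched their search engine to Elasticsearch, it is not possible to search commit messages using the GitHub UI. But this feature is on the team's wishlist.

Unfortunately, there is no release date for that feature yet.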

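Using Advanced Search on GitHub in combination with the other answers seems to be the easiest approach. It is basically a search string builder. https://github.com/search/advanced

For example, I wanted to find all commits in Autodesk/maya-usd containing "USD".

Then, in the search results, you can choose Commits from the list on the left.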
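
The result of those clicks is an ordinary search URL, so it can also be built directly. A minimal sketch, assuming the `repo:` qualifier and the `type=commits` parameter behave as they do on github.com at the time of writing; the function name `commit_search_url` is hypothetical:

```shell
# Hypothetical helper: build a github.com search URL restricted to commits
# in one repository. Spaces in the query are encoded as '+'.
commit_search_url() {
  repo=$1    # OWNER/REPO
  terms=$2   # words to look for in commit messages
  query=$(printf 'repo:%s %s' "$repo" "$terms" | sed 's/ /+/g')
  printf 'https://github.com/search?q=%s&type=commits\n' "$query"
}

commit_search_url Autodesk/maya-usd USD
# prints: https://github.com/search?q=repo:Autodesk/maya-usd+USD&type=commits
```

The helper only handles spaces; terms containing characters such as `&` or `#` would need proper URL encoding.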

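The short answer is that you cannot search commit messages directly on the github.com website. For the time being, we recommend the local git grep solution that others have proposed in this thread.

At one point in time, GitHub did offer a git grep style search of commit messages for a single repository. Unfortunately, this approach exposed a denial of service that could render a file server inaccessible. For this reason, we removed git grep searching.

Current back-of-the-envelope estimates put the number of commits on GitHub somewhere around the 80 billion mark. Although Google engineers laugh behind our backs, that is a fairly large number of documents to store in ElasticSearch. We would love to make this dataset searchable, but it is not a trivial project.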

As of mid-2019:

Type your query into the search box at the top left, press Enter, then click "Commits".
