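Not in a local Git repository, but on GitHub itself: how do I search commit messages for a specific repository/branch?
Current answer
Update January 2023 (eight years later):
With GitHub CLI gh v2.22.0 (January 2023), you can search from within your locally cloned GitHub repository:
See gh search commits:
Examples: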
# search commits matching set of keywords "readme" and "typo"
$ gh search commits readme typo
# search commits matching phrase "bug fix"
$ gh search commits "bug fix"
# search commits committed by user "monalisa"
$ gh search commits --committer=monalisa
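The examples above search all of GitHub, while the question asks about one specific repository. A minimal sketch of narrowing the search with the `--repo` flag of `gh search commits` follows. The wrapper name `search_repo_commits`, the `GH` override variable, and the `cli/cli` repository are illustrative assumptions, not part of gh.

```shell
# search_repo_commits is a hypothetical wrapper, not a gh subcommand.
# Usage: search_repo_commits OWNER/REPO [search terms or gh flags...]
# Set GH=echo to preview the arguments without contacting GitHub.
search_repo_commits() {
  repo=$1; shift
  "${GH:-gh}" search commits --repo "$repo" "$@"
}

# Preview only (runs in a subshell so GH=echo does not leak):
( GH=echo; search_repo_commits cli/cli readme typo )

# Real search (needs gh installed and authenticated):
# search_repo_commits cli/cli "bug fix" --limit 10
```

The preview line prints the arguments that would be passed to gh, which makes it easy to check the query before sending it.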
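Update January 2017 (two years later):
You can now search commit messages! (Still only in the master branch.)
February 2015: Not sure this was ever possible, considering the current search infrastructure based on Elasticsearch (introduced in January 2013).
As an answer "drawing from credible and/or official sources", here is an interview with the GitHub people in charge of introducing Elasticsearch at GitHub (August 2013).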
Tim Pease: We have two document types in there: One is a source code file and the other one is a repository. The way that git works is you have commits and you have a branch for each commit. Repository documents keep track of the most recent commit for that particular repository that has been indexed. When a user pushes a new commit up to Github, we then pull that repository document from elasticsearch. We then see the most recently indexed commit and then we get a list of all the files that had been modified, or added, or deleted between this recent push and what we have previously indexed. Then we can go ahead and just update those documents which have been changed. We don’t have to re-index the entire source code tree every time someone pushes.
Andrew Cholakian: So, you guys only index, I’m assuming, the master branch.
Tim Pease: Correct. It’s only the head of the master branch that you’re going to get in there and still that’s a lot of data, two billion documents, 30 terabytes.
Andrew Cholakian: That is awesomely huge.
[...]
Tim Pease: With indexing source code on push, it’s a self-healing process. We have that repository document which keeps track of the last indexed commit. If we missed, just happen to miss three commits where those jobs fail, the next commit that comes in, we’re still looking at the diff between the previous commit that we indexed and the one that we’re seeing with this new push. You do a git diff and you get all the files that have been updated, deleted, or added. You can just say, “Okay, we need to remove these files. We need to add these files, and all that.” It’s self-healing and that’s the approach that we have taken with pretty much all of the architecture.
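This means that not all branches of all repositories would be indexed with that approach. A global commit message search is not available for now. Tim Pease himself also confirms that commit messages are not indexed.
Note that it is not impossible to get your own local Elasticsearch index for a local clone: see "Searching a git repository with ElasticSearch".
But for a specific repository, the easiest way is still to clone it and run: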
git log --all --grep='my search'
(More options in "How to search a Git repository by commit message?")
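Because the question is about a specific branch, it may help to see how `--all` differs from searching one branch. The sketch below builds a throwaway repository (the commit messages, the `feature` branch name, and the `g` helper are made up for illustration) and runs the same `--grep` search three ways.

```shell
# Throwaway repository so the search can be tried without touching real work.
demo=$(mktemp -d)
# g is a hypothetical helper: git inside the demo repo with a fixed identity.
g() { git -C "$demo" -c user.name=Demo -c user.email=demo@example.com "$@"; }

g init -q
g commit -q --allow-empty -m "Initial commit"
g checkout -q -b feature
g commit -q --allow-empty -m "Fix typo in README"
g checkout -q -                      # back to the default branch
g commit -q --allow-empty -m "Change license to MIT"

# Every ref: finds the commit even though it lives only on 'feature'.
all_hits=$(g log --all --grep='typo' --format=%s)
# Current branch only: the feature commit is not reachable from here.
head_hits=$(g log --grep='typo' --format=%s)
# One named branch, case-insensitive match.
feature_hits=$(g log -i --grep='TYPO' --format=%s feature)

echo "all branches: $all_hits"
echo "current branch: $head_hits"
echo "feature branch: $feature_hits"
```

In a real clone, replace `feature` with the branch you care about, for example `origin/some-branch` for a remote-tracking branch.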
Other answers
You can do this for repositories that Google has crawled (results vary from repository to repository).
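Search all branches of all crawled repositories for "change license":
"change license" site:https://github.com/*/*/commits
Search the master branch of all crawled repositories for "change license":
"change license" site:https://github.com/*/*/commits/master
Search the master branch of all crawled twitter repositories for "change license":
"change license" site:https://github.com/twitter/*/commits/master
Search all branches of the twitter/some_project repository for "change license":
"change license" site:https://github.com/twitter/some_project/commits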
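The four queries above follow one pattern: a quoted phrase, then `site:` with the owner, repository, and optional branch filled in or left as `*`. A small sketch that formats such a query is shown here; `commits_site_query` is a hypothetical helper name, and it only prints text to paste into Google.

```shell
# commits_site_query PHRASE [OWNER] [REPO] [BRANCH]
# Hypothetical helper: omitted OWNER/REPO default to the * wildcard,
# and an omitted BRANCH means "all branches" (no trailing path segment).
commits_site_query() {
  phrase=$1 owner=${2:-*} repo=${3:-*} branch=${4:-}
  printf '"%s" site:https://github.com/%s/%s/commits%s\n' \
    "$phrase" "$owner" "$repo" "${branch:+/$branch}"
}

commits_site_query 'change license'
commits_site_query 'change license' twitter '*' master
commits_site_query 'change license' twitter some_project
```

Quote the `*` when passing it explicitly so the shell does not expand it to file names.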
As of 2017, GitHub itself includes this feature.
The example search they use is repo:torvalds/linux merge:false crypto policy
(GIF from https://github.com/blog/2299-search-commit-messages)
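Using GitHub's advanced search, combined with the other answers, seems to be the easiest way. It is basically a search-string builder: https://github.com/search/advanced
For example, I wanted to find all commits in Autodesk/maya-usd that contain "USD".
Then, in the search results, you can select Commits from the list on the left.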
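The same search can be opened directly from a link. The sketch below assembles one for the Autodesk/maya-usd example; `commit_search_url` is a hypothetical helper, and the URL shape (`q=` with a `repo:` qualifier plus `type=commits`) is an assumption based on what the search results page shows in the address bar.

```shell
# commit_search_url OWNER/REPO TERM...
# Hypothetical helper. Terms are joined with '+'; terms containing spaces or
# special characters are NOT URL-encoded here, so keep them to simple words.
commit_search_url() {
  repo=$1; shift
  query="repo:$repo"
  for term in "$@"; do
    query="$query+$term"
  done
  printf 'https://github.com/search?q=%s&type=commits\n' "$query"
}

commit_search_url Autodesk/maya-usd USD
```

Opening the printed link should land on the Commits tab of the results, which saves the extra click in the sidebar.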
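The short answer is that you cannot search commit messages directly on the github.com website. For the time being, we recommend the local git grep solution that others in this thread have proposed.
At one point in time, GitHub did offer a git grep style search of commit messages for a single repository. Unfortunately, this approach exposed a denial of service that could render a file server inaccessible. For this reason, we removed git grep searching.
Current back-of-the-envelope estimates put the number of commits on GitHub at around 80 billion. Although Google engineers are laughing behind our backs, this is a rather large number of documents to store in ElasticSearch. We would love to make this dataset searchable, but it is not a trivial project.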