这些技术之间的核心架构差异是什么?
另外,哪些用例通常更适合每种用例?
这些技术之间的核心架构差异是什么?
另外,哪些用例通常更适合每种用例?
当前回答
While all of the above links have merit, and have benefited me greatly in the past, as a linguist "exposed" to various Lucene search engines for the last 15 years, I have to say that elastic-search development is very fast in Python. That being said, some of the code felt non-intuitive to me. So, I reached out to one component of the ELK stack, Kibana, from an open source perspective, and found that I could generate the somewhat cryptic code of elasticsearch very easily in Kibana. Also, I could pull Chrome Sense es queries into Kibana as well. If you use Kibana to evaluate es, it will further speed up your evaluation. What took hours to run on other platforms was up and running in JSON in Sense on top of elasticsearch (RESTful interface) in a few minutes at worst (largest data sets); in seconds at best. The documentation for elasticsearch, while 700+ pages, didn't answer questions I had that normally would be resolved in SOLR or other Lucene documentation, which obviously took more time to analyze. Also, you may want to take a look at Aggregates in elastic-search, which have taken Faceting to a new level.
Bigger picture: if you're doing data science, text analytics, or computational linguistics, elasticsearch has some ranking algorithms that seem to innovate well in the information retrieval area. If you're using any TF/IDF algorithms, Text Frequency/Inverse Document Frequency, elasticsearch extends this 1960's algorithm to a new level, even using BM25, Best Match 25, and other Relevancy Ranking algorithms. So, if you are scoring or ranking words, phrases or sentences, elasticsearch does this scoring on the fly, without the large overhead of other data analytics approaches that take hours--another elasticsearch time savings. With es, combining some of the strengths of bucketing from aggregations with the real-time JSON data relevancy scoring and ranking, you could find a winning combination, depending on either your agile (stories) or architectural(use cases) approach.
注意:上面确实有关于聚合的类似讨论,但没有关于聚合和相关性评分的讨论——我为任何重叠道歉。 披露:我不为elastic工作,而且由于不同的架构路径,在不久的将来也无法从他们的出色工作中受益,除非我用elasticsearch做一些慈善工作,这也不是一个坏主意
其他回答
While all of the above links have merit, and have benefited me greatly in the past, as a linguist "exposed" to various Lucene search engines for the last 15 years, I have to say that elastic-search development is very fast in Python. That being said, some of the code felt non-intuitive to me. So, I reached out to one component of the ELK stack, Kibana, from an open source perspective, and found that I could generate the somewhat cryptic code of elasticsearch very easily in Kibana. Also, I could pull Chrome Sense es queries into Kibana as well. If you use Kibana to evaluate es, it will further speed up your evaluation. What took hours to run on other platforms was up and running in JSON in Sense on top of elasticsearch (RESTful interface) in a few minutes at worst (largest data sets); in seconds at best. The documentation for elasticsearch, while 700+ pages, didn't answer questions I had that normally would be resolved in SOLR or other Lucene documentation, which obviously took more time to analyze. Also, you may want to take a look at Aggregates in elastic-search, which have taken Faceting to a new level.
Bigger picture: if you're doing data science, text analytics, or computational linguistics, elasticsearch has some ranking algorithms that seem to innovate well in the information retrieval area. If you're using any TF/IDF algorithms, Text Frequency/Inverse Document Frequency, elasticsearch extends this 1960's algorithm to a new level, even using BM25, Best Match 25, and other Relevancy Ranking algorithms. So, if you are scoring or ranking words, phrases or sentences, elasticsearch does this scoring on the fly, without the large overhead of other data analytics approaches that take hours--another elasticsearch time savings. With es, combining some of the strengths of bucketing from aggregations with the real-time JSON data relevancy scoring and ranking, you could find a winning combination, depending on either your agile (stories) or architectural(use cases) approach.
注意:上面确实有关于聚合的类似讨论,但没有关于聚合和相关性评分的讨论——我为任何重叠道歉。 披露:我不为elastic工作,而且由于不同的架构路径,在不久的将来也无法从他们的出色工作中受益,除非我用elasticsearch做一些慈善工作,这也不是一个坏主意
我已经创建了elasticsearch和Solr和splunk之间的主要差异表,您可以使用它作为2016年的更新:
在solr中添加嵌套文档非常复杂,嵌套数据搜索也非常复杂。但弹性搜索容易添加嵌套文档和搜索
我发现上面的一些答案现在有点过时了。从我的角度来看,我每天都在使用Solr(云和非云)和ElasticSearch,这里有一些有趣的区别:
Community: Solr has a bigger, more mature user, dev, and contributor community. ES has a smaller, but active community of users and a growing community of contributors Maturity: Solr is more mature, but ES has grown rapidly and I consider it stable Performance: hard to judge. I/we have not done direct performance benchmarks. A person at LinkedIn did compare Solr vs. ES vs. Sensei once, but the initial results should be ignored because they used non-expert setup for both Solr and ES. Design: People love Solr. The Java API is somewhat verbose, but people like how it's put together. Solr code is unfortunately not always very pretty. Also, ES has sharding, real-time replication, document and routing built-in. While some of this exists in Solr, too, it feels a bit like an after-thought. Support: there are companies providing tech and consulting support for both Solr and ElasticSearch. I think the only company that provides support for both is Sematext (disclosure: I'm Sematext founder) Scalability: both can be scaled to very large clusters. ES is easier to scale than pre-Solr 4.0 version of Solr, but with Solr 4.0 that's no longer the case.
有关Solr vs. ElasticSearch主题的更全面报道,请查看https://sematext.com/blog/solr-vs-elasticsearch-part-1-overview/。这是Sematext系列文章中的第一篇,对Solr和ElasticSearch进行了直接和中立的比较。披露:我在Sematext工作。
我使用Elasticsearch 3年了,使用Solr大约一个月,我觉得与Solr安装相比,Elasticsearch集群非常容易安装。Elasticsearch有一个帮助文档池,其中有很好的解释。其中一个用例是直方图聚合,它在ES中可用,但在Solr中找不到。