What are the core architectural differences between these technologies?
Also, which use cases is each one generally better suited to?
Current answer
While all of the above links have merit, and have benefited me greatly in the past, as a linguist "exposed" to various Lucene search engines for the last 15 years, I have to say that Elasticsearch development is very fast in Python. That said, some of the code felt non-intuitive to me. So I turned to one component of the ELK stack, Kibana, from an open source perspective, and found that I could generate Elasticsearch's somewhat cryptic queries very easily in Kibana. I could also pull queries from Sense (the Chrome extension) into Kibana. If you use Kibana to evaluate Elasticsearch, it will further speed up your evaluation. What took hours to run on other platforms was up and running as JSON in Sense on top of Elasticsearch's RESTful interface in a few minutes at worst (largest data sets) and in seconds at best. The Elasticsearch documentation, while 700+ pages, didn't answer questions I had that would normally be resolved in the Solr or other Lucene documentation, so those questions took more time to analyze. You may also want to take a look at aggregations in Elasticsearch, which have taken faceting to a new level.
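To make the point about aggregations versus plain faceting a little more concrete, here is a minimal sketch of the kind of JSON body one might paste into Sense or Kibana, built as a Python dict. The field names `category` and `price` and the helper `build_facet_request` are hypothetical; only the `terms` and `avg` aggregation types come from Elasticsearch itself.

```python
import json


def build_facet_request(facet_field, metric_field, size=10):
    """Build a search body that buckets documents by facet_field and
    computes the average of metric_field inside each bucket."""
    return {
        "size": 0,  # return aggregation results only, no individual hits
        "aggs": {
            "by_facet": {
                # classic facet: one bucket per distinct value
                "terms": {"field": facet_field, "size": size},
                # the part plain faceting lacks: a metric nested per bucket
                "aggs": {
                    "avg_metric": {"avg": {"field": metric_field}}
                },
            }
        },
    }


# Hypothetical field names; adjust to your own mapping.
facet_body = build_facet_request("category", "price")
print(json.dumps(facet_body, indent=2))
```

The printed JSON can be sent to an index's `_search` endpoint. Nesting a metric inside each bucket is what goes beyond a flat facet count.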
Bigger picture: if you're doing data science, text analytics, or computational linguistics, Elasticsearch has some ranking algorithms that seem to innovate well in the information retrieval area. If you're using any TF/IDF (Term Frequency/Inverse Document Frequency) algorithms, Elasticsearch takes this decades-old algorithm to a new level, even offering BM25 (Best Match 25) and other relevancy ranking algorithms. So if you are scoring or ranking words, phrases, or sentences, Elasticsearch does this scoring on the fly, without the large overhead of other data analytics approaches that take hours, which is another Elasticsearch time saving. Combining the bucketing strengths of aggregations with real-time relevancy scoring and ranking of JSON data, you could find a winning combination, depending on either your agile (stories) or architectural (use cases) approach.
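For readers who have not met BM25 before, the following is a minimal sketch of the textbook scoring formula in plain Python. It assumes pre-tokenized, non-empty documents and the commonly cited defaults `k1=1.2` and `b=0.75`; the function name `bm25_scores` and the sample documents are illustrative, and the exact formula inside Elasticsearch/Lucene may differ in details.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in docs against query_terms
    using the textbook BM25 formula."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1; b normalizes by document length.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "solr is a search server".split(),
    "elasticsearch is a distributed search engine".split(),
    "a cache is not a search engine".split(),
]
scores = bm25_scores(["distributed", "search"], docs)
for doc, score in zip(docs, scores):
    print(round(score, 3), " ".join(doc))
```

Compared with raw TF/IDF, the two additions are term-frequency saturation (controlled by `k1`) and document-length normalization (controlled by `b`).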
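Note: there is indeed a similar discussion of aggregations above, but none that covers aggregations together with relevancy scoring; I apologize for any overlap. Disclosure: I don't work for Elastic, and because of a different architectural path I won't be able to benefit from their excellent work in the near future, unless I do some charity work with Elasticsearch, which wouldn't be a bad idea either.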
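Other answers
Update
Now that the scope of the question has been corrected, I can add something in this regard as well:
There are many comparisons between Apache Solr and ElasticSearch, so I'll reference the ones I found most useful myself, i.e. those covering the most important aspects: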
Bob Yoplait already linked kimchy's answer to "ElasticSearch, Sphinx, Lucene, Solr, Xapian. Which fits for which usage?", which summarizes the reasons why he went ahead and created ElasticSearch, which in his opinion provides a much superior distributed model and ease of use in comparison to Solr.

Ryan Sonnek's "Realtime Search: Solr vs Elasticsearch" provides an insightful analysis/comparison and explains why he switched from Solr to ElasticSearch, despite being a happy Solr user already. He summarizes this as follows:

Solr may be the weapon of choice when building standard search applications, but Elasticsearch takes it to the next level with an architecture for creating modern realtime search applications. Percolation is an exciting and innovative feature that singlehandedly blows Solr right out of the water. Elasticsearch is scalable, speedy and a dream to integrate with. Adios Solr, it was nice knowing you. [emphasis mine]

The Wikipedia article on ElasticSearch quotes a comparison from the reputable German iX magazine, listing advantages and disadvantages, which pretty much summarize what has been said above already:

Advantages:
ElasticSearch is distributed. No separate project required. Replicas are near real-time too, which is called "Push replication".
ElasticSearch fully supports the near real-time search of Apache Lucene.
Handling multitenancy is not a special configuration, where with Solr a more advanced setup is necessary.
ElasticSearch introduces the concept of the Gateway, which makes full backups easier.

Disadvantages:
Only one main developer [not applicable anymore according to the current elasticsearch GitHub organization, besides having a pretty active committer base in the first place]
No autowarming feature [not applicable anymore according to the new Index Warmup API]
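Since percolation is singled out in the quote above, a toy sketch of the idea may help: instead of running a query against stored documents, queries are registered up front and each incoming document is matched against them. This is a conceptual illustration in plain Python only, not Elasticsearch's percolator API; `make_term_query`, `percolate`, and the query names are all hypothetical.

```python
def make_term_query(*required_terms):
    """Return a predicate that matches documents containing every required term."""
    required = {term.lower() for term in required_terms}

    def matches(document_text):
        return required <= set(document_text.lower().split())

    return matches


# Queries are stored ahead of time, the reverse of a normal search.
registered_queries = {
    "solr-alerts": make_term_query("solr"),
    "realtime-search": make_term_query("realtime", "search"),
}


def percolate(document_text, queries):
    """Return the names of all registered queries that match the document."""
    return sorted(name for name, query in queries.items() if query(document_text))


hits = percolate("Elasticsearch offers realtime search", registered_queries)
print(hits)
```

This "reverse search" pattern suits alerting and saved-search notifications, where the set of queries is known before the documents arrive.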
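Initial answer
These are completely different technologies addressing completely different use cases, so they cannot be compared in any meaningful way at all: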
Apache Solr - Apache Solr offers Lucene's capabilities in an easy to use, fast search server with additional features like faceting, scalability and much more

Amazon ElastiCache - Amazon ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud. Please note that Amazon ElastiCache is protocol-compliant with Memcached, a widely adopted memory object caching system, so code, applications, and popular tools that you use today with existing Memcached environments will work seamlessly with the service (see Memcached for details).
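(emphasis mine)
Maybe this has been confused with the following two related technologies:
ElasticSearch - It is an open source (Apache 2), distributed, RESTful search engine built on top of Apache Lucene.

Amazon CloudSearch - Amazon CloudSearch is a fully managed search service in the cloud that allows customers to easily integrate fast and highly scalable search functionality into their applications.
The Solr and ElasticSearch offerings sound strikingly similar at first sight, and both use the same backend search engine, namely Apache Lucene.
While Solr is older, quite versatile and mature, and accordingly widely used, ElasticSearch was developed specifically to address Solr's shortcomings with the scalability requirements of modern cloud environments, which are hard to address with Solr.
As such, it would probably be most useful to compare ElasticSearch with the recently introduced Amazon CloudSearch (see the introductory post "Start Searching in One Hour for Less Than $100 / Month"), because both claim to cover the same use cases in principle.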
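I have been using Elasticsearch for 3 years and Solr for about a month, and I find an Elasticsearch cluster very easy to install compared to a Solr installation. Elasticsearch has a pool of help documents with good explanations. One use case where I got stuck was histogram aggregation, which is available in ES but which I could not find in Solr.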
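For reference, a histogram aggregation request of the kind mentioned above might look roughly like this when built in Python. The field name `price`, the interval of 50, and the helper `histogram_request` are assumptions for illustration; `histogram` with `field` and `interval` is the Elasticsearch aggregation itself.

```python
import json


def histogram_request(field, interval):
    """Build a search body that groups a numeric field into fixed-width buckets."""
    return {
        "size": 0,  # only the buckets are of interest, not the hits
        "aggs": {
            "value_histogram": {
                "histogram": {"field": field, "interval": interval}
            }
        },
    }


# Hypothetical numeric field and bucket width.
histogram_body = histogram_request("price", 50)
print(json.dumps(histogram_body, indent=2))
```

Each returned bucket carries a key (the lower bound of the interval) and a document count.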
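I have been working with both Solr and Elasticsearch for .NET applications. The major differences I have faced are:
Elasticsearch:
More code and less configuration; there are APIs to make changes, but it is still a code change.
Supports complex types, i.e. types within types (nested types), which I wasn't able to achieve in Solr.
Solr:
Less code and more configuration, and hence less maintenance.
Supports grouping results at query time (in Elasticsearch there is, in short, no straightforward way).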
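The nested types mentioned in this answer can be sketched as an index mapping plus a matching query, again as Python dicts. The index fields (`title`, `comments.author`, `comments.stars`) and the helper `nested_query` are hypothetical, and the mapping uses the typeless layout of recent Elasticsearch versions; older releases nest it under a type name.

```python
import json

# Each comment is indexed as its own hidden document, so "author" and
# "stars" from the same comment stay associated at query time.
nested_mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "comments": {
                "type": "nested",
                "properties": {
                    "author": {"type": "keyword"},
                    "stars": {"type": "integer"},
                },
            },
        }
    }
}


def nested_query(path, author, min_stars):
    """Match parents having one nested object that satisfies both conditions."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {
                    "bool": {
                        "must": [
                            {"term": {f"{path}.author": author}},
                            {"range": {f"{path}.stars": {"gte": min_stars}}},
                        ]
                    }
                },
            }
        }
    }


comment_query = nested_query("comments", "alice", 4)
print(json.dumps(comment_query, indent=2))
```

Without the `nested` type, the two conditions could be satisfied by different comments on the same parent document, which is usually not what is intended.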
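I only use Elasticsearch, because I found Solr very hard to get started with. Features of Elasticsearch:
Easy to start, with very little setup. Even a newbie can set up a cluster step by step.
Simple RESTful API, which uses NoSQL-style queries, plus client libraries for many languages for easy access.
Good documentation; you can read the book: . There is a web version on the official website.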
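As a rough illustration of how simple the RESTful API is, the sketch below builds a search request with nothing but the Python standard library. The node address `http://localhost:9200`, the index name `articles`, and the helper `build_search_request` are assumptions; the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_search_request(base_url, index, query_body):
    """Build (but do not send) a POST request to an index's _search endpoint."""
    url = f"{base_url.rstrip('/')}/{index}/_search"
    data = json.dumps(query_body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical local node and index name.
search_request = build_search_request(
    "http://localhost:9200",
    "articles",
    {"query": {"match": {"title": "search"}}},
)
print(search_request.get_method(), search_request.full_url)

# To send it against a running node:
# with urllib.request.urlopen(search_request) as response:
#     print(json.load(response))
```

The official client libraries wrap this same HTTP and JSON exchange, so anything that works in Sense or curl maps directly onto them.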
I have created a table of the major differences between Elasticsearch, Solr, and Splunk, which you can use as an update for 2016: