It’s rather late, but I finally got my hands on the videos from the Lucene/Solr Revolution 2014 conference, and these are my takeaways from last year’s edition.
I’ve compiled some key points (from my own perspective of the conference) into a short, summarized list that I want to share. Keep in mind that these are my own opinions.
- If search is key to your business, app, or website, then you must monitor how your users are using the search capabilities you’re publishing. This is true whether you’re using Solr, Elasticsearch, or even plain Lucene; at the conference we saw references to this in several presentations (Airbnb, Evernote, etc.). This data can be used to improve your search, and it also provides very valuable insight into how your search is being used, basically telling you whether your “formula” is working or not.
- Sometimes it’s OK to use Lucene and build your own custom search solution from the ground up. This is not for everybody, but in some cases it will be worth the effort (Twitter, LinkedIn).
- Solr can be used for the most unexpected use cases. Yes, we know that when you have text you can use Solr/Elasticsearch/Lucene to search it, but did you know that you can even search images by color? Solr can also be used just to deduplicate content; how cool is that? Do you need an engine that you can feed data into and then run quick queries against? Then Solr is for you. The use cases are limited only by your own imagination (or your own needs).
- If search is a core feature for your use case, consider abstracting the inner workings of your search engine of choice from the rest of your engineering team. That means providing a library that allows other team members to create amazing apps without spending time learning how Solr/Lucene works or how you scaled your infrastructure; these are complex issues. In previous editions of this same conference we’ve seen good examples of this approach; the case of CareerBuilder comes to mind.
- Paired with the previous point, you must also provide tools that allow the other members of your engineering team to debug a query, run A/B tests, and bring in people who know the content to judge the quality of the results. Of course, none of this is easy to build or maintain, but the effort pays off: the idea is to create an ecosystem that democratizes search in your organization.
- The new Analytics component is a powerful addition to Solr, coming in the recently released Solr 5.0 and available for previous versions in the form of a patch. This awesome feature was presented by Steve Bower from Bloomberg and is a leap forward compared to the Stats Component, that old friend that some of us use. I think that this brand new search component, combined with `AnalyticsQuery` and the introduction of the `PostFilter` interface, is putting Solr on a path to become one of the most customizable analytics platforms, an area that, in my personal opinion, Elasticsearch tackled before Solr did.
- It’s always nice to see more advanced solutions that use Lucene at their core with their own layers on top. LinkedIn is a great example of this, although keep in mind that this is not trivial: you’ll need very talented engineers to create this type of system, and in most cases it is not really required.
- Use `DocValues`. There is no other way of saying this: if you want to do analytics or faceting on very large collections, you’ll have to use `DocValues`, since it improves memory usage a lot. If you don’t trust me, you’ll hear exactly that in several talks at the conference from more advanced folks.
- I think that search engines are getting a lot of traction, not in the traditional Google, Bing, Yahoo! style but as a technology that can power very interesting use cases, mostly analytics. Because of this, people and companies have been looking for ways to run these products at an even larger scale. Take a look at the talks by Tomás Fernández Löbbe from the Amazon CloudSearch team and Jessica Mallet from Apple and you’ll think entirely differently about your own setup; trust me on this.
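
To make the first point in the list a bit more concrete, here is a minimal sketch of what "monitoring how your users search" could look like. It is not taken from any of the talks: the `SearchEvent` record, its field names, and the `summarize` helper are all hypothetical, and I’m assuming you already log each search with its result count and whether the user clicked anything.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SearchEvent:
    """One logged search. These field names are assumptions, not a Solr format."""
    query: str
    num_results: int
    clicked: bool


def normalize(query: str) -> str:
    """Lowercase and collapse whitespace so near-identical queries group together."""
    return " ".join(query.lower().split())


def summarize(events, top_n=3):
    """Aggregate a list of SearchEvent records into a few basic health metrics."""
    total = len(events)
    all_queries = Counter(normalize(e.query) for e in events)
    zero_hits = Counter(normalize(e.query) for e in events if e.num_results == 0)
    clicks = sum(1 for e in events if e.clicked)
    return {
        "total_searches": total,
        "top_queries": all_queries.most_common(top_n),
        "zero_result_queries": zero_hits.most_common(top_n),
        # Guard against division by zero when the log is empty.
        "zero_result_rate": sum(zero_hits.values()) / total if total else 0.0,
        "click_through_rate": clicks / total if total else 0.0,
    }


sample_log = [
    SearchEvent("Solr  Facets", 10, True),
    SearchEvent("solr facets", 8, False),
    SearchEvent("lucene docvalues", 0, False),
    SearchEvent("image color search", 3, True),
]
report = summarize(sample_log)
print(report)
```

Even something this small tells you which queries come back empty and how often people click a result, which is a reasonable first signal of whether your “formula” is working.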
I’d love to have the opportunity to go to this conference in the near future; it’s a real joy to share a room with the most talented engineers out there pushing search into the future. To finish this post, I just want to pass along an invitation to the next Lucene/Solr Revolution event, which will be in Austin, TX, October 13-16. Registration opens this spring, so if you want to stay informed about the event, visit the site or follow the @LuceneSolrRev or @Lucidworks Twitter accounts for the latest news.