Solr

This tag is associated with 6 posts

Insight into Search on Digg.com powered by Solr

Grant interviewed Sammy Yu, who worked on the search system for Digg.com, which uses Solr as its platform. Here are some notes from the interview:
Number of Documents in Digg’s index: 13 Million
Index Size (Lucene) on Disk: 8 GB
Architecture: Master - Slave setup, with 10 slaves, running behind a load balancer with some [...]

Lucene / Solr still needs hackers to get up and running

I just read an article posted on the Lucene blog: “Lucene and the Corporate Environment”
If the companies on the list of Lucene users are not “corporate” environments, then I don’t know what corporate means. If by corporate packaging you mean that it has a lot of bloat and charges exorbitant license fees, then no, unfortunately, Lucene [...]

How to influence the query plan in Lucene / Solr?

I was looking through Lucene’s source code today (okay - night) to find out whether you can give Lucene hints to change the clause precedence during query execution. Unfortunately, I found that Lucene does not let users supply any such hint (I was looking at ConjunctionScorer).
At work, we have a use case, where we [...]

Solr/Lucene Feature Alert: TrieRange Capabilities

Are you doing range searches in Lucene / Solr in your application? If so, you can get a performance boost by using the new TrieRange package.
Here is a PowerPoint presentation that details the capability.
If you want to read more, you can read the article posted by Grant on Lucid’s site.
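To make the idea concrete, here is a minimal sketch of the principle behind trie-encoded ranges: each value is indexed at several precisions, so a range query can match a few coarse prefix terms plus a handful of exact terms at the edges. This is plain Python, not Lucene's API or its actual algorithm; the names `trie_terms` and `cover_range`, the 4-bit precision step, and the 16-bit value size are all assumptions chosen for illustration.

```python
def trie_terms(value, step=4, bits=16):
    """Terms indexed for one non-negative value: its prefix at every precision level."""
    return [(shift, value >> shift) for shift in range(0, bits, step)]


def cover_range(lo, hi, step=4, bits=16):
    """Split the inclusive range [lo, hi] into aligned blocks.

    Each block is described by one (shift, prefix) term, so a query only
    needs to look up these terms instead of every value in the range.
    """
    terms = []
    while lo <= hi:
        shift = 0
        # Grow the block while it stays aligned at lo and fits inside the range.
        while (shift + step < bits
               and lo % (1 << (shift + step)) == 0
               and lo + (1 << (shift + step)) - 1 <= hi):
            shift += step
        terms.append((shift, lo >> shift))
        lo += 1 << shift
    return terms


if __name__ == "__main__":
    # The range 3..40 holds 38 distinct values but is covered by 23 terms here.
    print(len(cover_range(3, 40)))
```

A document matches the range if any of its indexed terms appears in the cover. Wider ranges benefit more, because most of the range collapses into coarse blocks.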

Solr adds Tika support - Entity Extraction next?

Solr now supports Tika through the ExtractingRequestHandler.
It is now possible to send any of Tika’s supported document types (MS Office, PDF, XML, HTML, etc.) and have the content extracted and then indexed, all within Solr.
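As a rough sketch of what sending a document looks like, the snippet below builds (but does not send) an HTTP POST for the extracting handler. It assumes the handler is mapped to `/update/extract`, as in Solr's example configuration, and that the schema has an `id` field; the base URL, the document id, and the helper name `build_extract_request` are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_extract_request(solr_base, doc_id, data, content_type):
    """Build a POST that uploads raw document bytes for extraction and indexing.

    Assumes the ExtractingRequestHandler is registered at /update/extract.
    literal.id sets the document's id field; commit=true makes it searchable
    right away (convenient for a demo, costly in production).
    """
    params = urlencode({"literal.id": doc_id, "commit": "true"})
    url = f"{solr_base.rstrip('/')}/update/extract?{params}"
    return Request(url, data=data,
                   headers={"Content-Type": content_type}, method="POST")


if __name__ == "__main__":
    # Hypothetical local Solr instance and document id.
    req = build_extract_request("http://localhost:8983/solr", "doc1",
                                b"%PDF-1.4 ...", "application/pdf")
    print(req.get_method(), req.full_url)
```

Passing the returned request to `urllib.request.urlopen` would perform the upload against a running Solr instance.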
A natural enhancement / extension to a metadata extraction and identification toolkit would be to layer a content analysis framework on top. For [...]

Solr 1.3 comes with Search enhancements

Grant Ingersoll has published a new article on IBM developerWorks that covers the new features in Solr 1.3.

“Did you mean” Spellchecking
Finding similar pages (More like this)
Editorial results placement - Ability to specify that a particular document (or documents) appear at a particular place in the search results.
Distributed Search - Solr adds distributed search capabilities [...]
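Several of these features are switched on through request parameters. The sketch below only assembles a query URL using the `spellcheck`, `mlt`, and `shards` parameters; it assumes the corresponding components are enabled on the `/select` handler, and the host names, field names, and the helper `build_select_url` are hypothetical.

```python
from urllib.parse import urlencode


def build_select_url(solr_base, query, shards=None, spellcheck=False, mlt_fields=None):
    """Assemble a /select URL that opts in to a few Solr 1.3 features."""
    params = [("q", query)]
    if spellcheck:
        # "Did you mean": ask for suggestions and a collated, corrected query.
        params += [("spellcheck", "true"), ("spellcheck.collate", "true")]
    if mlt_fields:
        # More like this: find similar documents based on the given fields.
        params += [("mlt", "true"), ("mlt.fl", ",".join(mlt_fields))]
    if shards:
        # Distributed search: comma-separated list of shards to query.
        params.append(("shards", ",".join(shards)))
    return f"{solr_base.rstrip('/')}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_select_url(
        "http://localhost:8983/solr", "lucen",
        shards=["host1:8983/solr", "host2:8983/solr"],
        spellcheck=True, mlt_fields=["title", "body"]))
```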