Archive for the ‘Search’ Category:
23 Feb
The Google Maps team has added a new feature that lets you see the full scope of local businesses by showing more than 10 results on the map. The idea is to use a map layer that is activated when there are more relevant results to show.
Below are some of the things the feature can be used for.
Neighborhood Exploration / Discovery
When visiting a new city or area, a search on a business category (restaurants, bars, hotels, rental cars) shows you which areas have a high density for that category. For example, I searched for “Bars in Hollywood, CA” (screenshot above), and you can deduce at a glance that Hollywood Blvd, between Highland and Vine, is where the action is. You can go directly to that street and explore.
As David Mihm points out, it’s like a heat map of the city that shows you the hot spots.
More Relevant Local Results
Below is a snapshot of a search for “Dry Cleaners in 90278”. The search query is already very localized because I am providing a zip code. In the past, Google would show the top dry cleaners (perhaps based on ratings, reviews, authority, etc.) but not necessarily the closest ones. With the new feature (Andrew Shotland calls it K-Pack), I can quickly see that there are a few dry cleaners closer to me than the ones in the top 10 results, and I might not even have to take the car out to get to one.
Business Research / Market Research
As Miriam Ellis’s post points out, there is another aspect of the feature that business enthusiasts can take advantage of: they can use it to identify pockets within an area that are not well served in a particular category and capitalize on that.
This feature has taken local search and exploration to the next level, and the whole experience feels lightweight and fast.
Improvements?
I would like Google Maps to give me the ability to combine directions and local search. That way, when I am looking at the directions for a place I want to go to, I can find relevant businesses along that route.
04 Feb
Last month, I wrote about the various options for providing search in Django-based applications. Kudos to Sean at screely.com for finishing the conversion of the Django Solr module to Django 1.0. This adds one more option to the list, and it may end up replacing the django lucene module. You can find the code and other related links on the project page hosted at Google Code.
Providing search and faceted navigation in a Django application just became a little easier with this.
16 Jan
I have seen many Django-based applications that do not provide intuitive and powerful search capabilities to their users. If you pick up a product built with Django and try to do a search, you will be disappointed by how primitive the search is. No spelling correction, no fuzzy searching, and no complex multi-field searches either. Oh, and don’t even think about relevance-based search.
I have created some applications, but I have mostly written one-off custom code to integrate best-of-breed open source engines like Solr and Lucene into my web applications. While doing that, it got me thinking:
What options does a developer have for providing search in a Django-based application?
- Use “contains” lookups in the QuerySet API
- Use Django Sphinx
- Use the django-search-lucene app
The first approach is by far the most commonly used in Django applications. It relies on the LIKE operator of the underlying database. The problem with this approach is that it’s too primitive: no relevancy, no complex constraints, and it won’t work for a multi-word query where the words don’t appear together.
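To make the multi-word limitation concrete, here is a minimal sketch that runs the same kind of LIKE substring match directly against an in-memory SQLite table. The table name, the sample rows, and the like_search helper are made up for illustration; a real Django "contains" lookup goes through the ORM rather than raw SQL.

```python
import sqlite3


def like_search(rows, query):
    """Emulate a `contains`-style lookup: one SQL LIKE '%query%' substring
    match, with no tokenizing and no relevance ranking."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE business (name TEXT)")  # hypothetical table
    conn.executemany("INSERT INTO business (name) VALUES (?)", [(r,) for r in rows])
    cur = conn.execute(
        "SELECT name FROM business WHERE name LIKE ? ORDER BY name",
        ("%" + query + "%",),
    )
    result = [row[0] for row in cur]
    conn.close()
    return result


# Sample data, invented for this example.
ROWS = ["Dry Cleaners Express", "Cleaners for Dry Goods", "Hollywood Bar"]

# Words adjacent and in the same order: the substring match finds one row.
adjacent = like_search(ROWS, "Dry Cleaners")

# Same two words, different order: nothing matches, even though two rows
# contain both words.
separated = like_search(ROWS, "Cleaners Dry")
```

A search engine would tokenize the query and match each word independently, which is why both rows containing "Dry" and "Cleaners" would come back (and be ranked) there.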
The second option is to use the django-sphinx project. Sphinx is an open source search engine that was primarily written to integrate well with databases (SQL focused). It seems to be gaining some momentum, but for a really powerful and feature-rich search engine, I have found Lucene much better. Also, the integration with Django requires you to install Sphinx and set it up as a separate server, which is more than you are usually looking for.
Lastly, there is an app called django-search-lucene that provides Lucene and Django integration using PyLucene (a Python port of Lucene). The application provides easy integration with the Django ORM and simple APIs to perform searches. For power users, it also exposes an API through which you can fire native Lucene queries. In addition, it exposes some basic status reporting in the Django admin, which is helpful in monitoring the index and searches.
I am also noticing a flurry of activity in the last 48 hours on the project and am curious to know what new additions are being made.
The next time you are creating a Django-based application, look at the kind of search you are providing to your user base and see if you can use one of the two options above (2, 3) to enhance their experience. Also, do tell me what your experience was; I always look forward to hearing that.
28 Dec
Solr now supports Tika through ExtractingRequestHandler
It is now possible to send any of Tika’s supported document types (MS Office, PDF, XML, HTML, etc.) and have the content extracted and then indexed, all within Solr.
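As a rough sketch of what a client-side call could look like, the snippet below only builds the request URL. It assumes the handler is mapped at /update/extract and that the document's unique key is passed as literal.id, as in the Solr example configuration; the host, port, and document ID are placeholders. The file itself would then be sent as the body of an HTTP POST to this URL.

```python
from urllib.parse import urlencode


def build_extract_url(solr_base, doc_id, commit=True):
    """Build the URL for posting a rich document to Solr for extraction.

    Assumes the ExtractingRequestHandler is mapped at /update/extract
    (an assumption taken from the Solr example configuration).
    """
    params = {"literal.id": doc_id}  # unique key for the extracted document
    if commit:
        params["commit"] = "true"    # make the document searchable right away
    return solr_base.rstrip("/") + "/update/extract?" + urlencode(params)


# Placeholder host and ID, for illustration only.
url = build_extract_url("http://localhost:8983/solr", "doc1")
```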
A natural extension to a metadata extraction and identification toolkit would be to layer a content analysis framework on top. For some verticals (especially news), there is value in extracting named entities from content sources (documents or web pages). These named entities can then be added to Solr, allowing users to slice and dice information by People, Company, Places, etc. Once that is in place, it becomes a great platform for entrepreneurs to develop applications on top of, without having to worry about entity extraction.
There are already a number of options for extracting entities from text (LingPipe, OpenCalais). The task is to standardize them and wrap them in a framework that can be easily plugged into Solr (at least to start with).
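One way to picture such a wrapper is sketched below. Everything here is hypothetical: the Extractor signature, the toy gazetteer-based extractor (a stand-in for a real engine such as LingPipe or OpenCalais, whose APIs are not shown), and the entity_* field names. The point is only that each engine is hidden behind the same small interface, and the result is a plain document with one multi-valued field per entity type, ready to be indexed and faceted on.

```python
from typing import Any, Callable, Dict, List

# Hypothetical common interface: text in, {entity_type: [names]} out.
Extractor = Callable[[str], Dict[str, List[str]]]


def toy_extractor(text: str) -> Dict[str, List[str]]:
    """Stand-in for a real extraction engine: looks names up in a tiny,
    hard-coded gazetteer."""
    gazetteer = {
        "people": ["John Battelle"],
        "company": ["Google", "IBM"],
        "places": ["Hollywood"],
    }
    return {kind: [n for n in names if n in text] for kind, names in gazetteer.items()}


def enrich(doc: Dict[str, Any], extractors: List[Extractor]) -> Dict[str, Any]:
    """Return a copy of doc with one multi-valued field per entity type,
    merging the output of every plugged-in extractor without duplicates."""
    enriched = dict(doc)
    for extract in extractors:
        for kind, names in extract(str(doc["text"])).items():
            field = "entity_" + kind  # hypothetical field naming scheme
            merged = list(enriched.get(field, []))
            merged.extend(n for n in names if n not in merged)
            enriched[field] = merged
    return enriched


doc = enrich({"id": "1", "text": "Google opened an office in Hollywood."}, [toy_extractor])
```

Swapping in a different engine would then mean writing one adapter function with the same signature, with no change to the indexing side.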
Grant, is someone already working on it? Any plans in the pipeline?
05 Dec
All of us are creating fountains of ambient data, from our phones, our web surfing, our offline purchasing, our interactions with tollbooths, you name it. Combine that ambient data (the imprint we leave on the digital world from our actions) with declarative data (what we proactively say we are doing right now) and you’ve got a major, delicious, wonderful, massive search problem, er, opportunity
Again from John’s post: applying the above to the local search industry, there is a huge opportunity for companies to incorporate lifestreaming and microblogging into their existing systems. Currently, sites like YellowPages.com, CitySearch, Yelp, etc. are focused on reviews and ratings left explicitly by their users. But why not aggregate the updates from my social network about the restaurant I am looking at? Web publishers have started doing this per topic: look at the HuffingtonPost topic pages and you’ll see real-time conversations about that topic (summize widget).
So what’s stopping the local guys from innovating here?
04 Dec
Very soon, we will be able to ask Search a very basic and extraordinarily important question that I can best summarize as this: What are people saying about (my query) right now?
John Battelle wrote a post in which he talks about a new evolution of search, where the system would return everything that people are saying about the “topic” of your search: a search engine that lets you search the conversations that are happening on the web in real time, and that can tell you how your social network has interacted with what you are searching for (e.g. a Canon EOS camera).
Where is that Search engine? Is it an extension of BackType or much more than that?
06 Nov
Grant Ingersoll has published a new article on IBM developerWorks that covers the new features in Solr 1.3.
- “Did you mean” Spellchecking
- Finding similar pages (More like this)
- Editorial results placement - Ability to specify that a particular document (or documents) appear at a particular place in the search results.
- Distributed Search - Solr adds distributed search capabilities, which let you scale the index size by splitting the documents across several machines (shards)
- Performance Gains - ~5x improvement in indexing speed
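To illustrate the sharding idea from the list above, here is a small sketch under stated assumptions: the client decides which shard each document goes to (shown here with a simple stable hash, which is my choice and not something the article prescribes), and a distributed query names the shards in Solr's shards request parameter as a comma-separated list. The host names are placeholders.

```python
import zlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Pick a shard index with a stable hash, so the same document ID
    always lands on the same machine."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def shards_param(hosts):
    """Build the value of the `shards` request parameter: a comma-separated
    list of shard addresses to fan the query out to."""
    return ",".join(hosts)


# Placeholder shard addresses, for illustration only.
HOSTS = ["host1:8983/solr", "host2:8983/solr"]

target = HOSTS[shard_for("doc1", len(HOSTS))]  # where "doc1" would be indexed
param = shards_param(HOSTS)                    # sent with the search request
```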
Solr is maturing as it adds the features and functionality that modern information access applications need. I have spent a lot of time integrating and enhancing product offerings that used Lucene as the underlying search engine, so it’s great to see the Solr project moving forward with some thrust.
Companies that use Lucene in their applications and products should definitely start evaluating Solr and how they can take advantage of it.