About the Talk
June 24, 2012 9:45 AM
San Diego, CA (UCSD Campus)
If your application relies on simple string comparison to search through text-based data, you might want to learn about an alternative approach. In this session, I will talk about terms such as inverted index, term frequency, document frequency, proximity and more. I will introduce you to Apache Lucene, discuss what it does, and show you how to use it to build your own search feature.
We will cover:
* Inverted index
* Scoring with the TF-IDF metric
* Different types of queries (term, phrase, wildcard/regular expression, fuzzy, range, boolean, proximity, etc.)
* Filtering
* Sorting
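To give a feel for the first two topics, here is a minimal sketch of an inverted index with textbook TF-IDF scoring. It is plain Python, not Lucene code and not Lucene's actual scoring formula; the function names (`tokenize`, `build_index`, `search`) and the sample documents are made up for illustration, and the whitespace tokenizer stands in for a real analyzer.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return text.lower().split()


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency in that doc}."""
    index = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def search(index, num_docs, query):
    """Rank documents by the sum of tf * idf over the query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any document
        # Rare terms get a higher weight; a term in every document gets 0.
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data
docs = [
    "the quick brown fox",
    "the lazy dog",
    "the quick dog chases the fox",
]
index = build_index(docs)
results = search(index, len(docs), "lazy dog")
for doc_id, score in results:
    print(doc_id, round(score, 3), docs[doc_id])
```

The point of the inverted index is that a query only touches the posting lists of its own terms instead of scanning every document, which is what separates this approach from simple string comparison.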