
As a locker developer, I want to be able to index and search Locker data

erictj edited this page Aug 18, 2011 · 8 revisions

Wireframes

TBD

Inspiration

Indexing

I can call a Locker API function to index any Locker JSON object into a flat-file index using the CLucene library.

var search = require('lsearch');
var contacts = <JSON results of calling Locker API query of "/Me/contacts/allContacts", or your own list of Locker contact objects>
for (var i = 0; i < contacts.length; i++) {
    search.index('contact', contacts[i], function(err, indexTime) {});
}

When I index a document that already exists in the index, the existing entry is updated rather than a duplicate being added. This lets documents in the index be updated as they change in their original sources.

var search = require('lsearch');
var contact = <JSON representation of existing Locker contact object, with an "_id" field that's already stored in the search index>
search.index('contact', contact, function(err, indexTime) {});

When a document exists in the index and its source document is deleted, I need to be able to delete the document from the index.

var search = require('lsearch');
var contact = <JSON representation of existing Locker contact object, with an "_id" field that's already stored in the search index>
search.deleteDocument(contact._id, function(err, indexTime) {});

Implementation Results

The implemented API differs slightly from the original proposal. It has a single indexing method called indexType:

indexType("contact", {"_id":1234, "name":"Thomas Muldowney", "nickname":"temas", "email":[{"type":"work", "value":"temas@singly.com"}]}, function(err, indexTime) {});

The type is always required so that the function can determine how to map the JSON object into a usable content string to index. The implementation checks the _id automatically so that only one copy of each document is kept in the index at any given time. Therefore deleteDocument is not currently exposed directly. A future revision still needs this functionality to handle fully deleted records.
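The "one copy per _id" behaviour described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory store keyed by type and _id. It is not the CLucene-backed implementation, and the `store` variable and the key format are invented for illustration only.

```javascript
// Hypothetical in-memory stand-in for the index (the real module writes to CLucene).
var store = {};

// Index a document under a type, replacing any earlier copy that has the same _id.
function indexType(type, doc, callback) {
    var started = Date.now();
    if (!doc || doc._id === undefined) {
        return callback(new Error('document is missing an _id'));
    }
    // Assumed key format: one slot per (type, _id) pair, so re-indexing overwrites.
    store[type + ':' + doc._id] = doc;
    callback(null, Date.now() - started);
}

// Indexing the same _id twice leaves a single, updated entry.
indexType('contact', {"_id": 1234, "name": "Thomas Muldowney"}, function(err, indexTime) {});
indexType('contact', {"_id": 1234, "name": "Thomas Muldowney", "nickname": "temas"}, function(err, indexTime) {});
```

Because the second call overwrites the first, no separate update call is needed. Removing a record that has disappeared from its source would still need an explicit delete.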

Searching

I can call a Locker API function to search a single service type (i.e. a single index file) using the Query Parser Syntax and get JSON results back.

var search = require('lsearch');
search.query('contact', 'firstname:eric', 0, 10, function(err, results) {
    // do cool stuff with data in results
});

I can call a Locker API function to search all Locker service types (i.e. all index files) using the Query Parser Syntax and get JSON results back.

var search = require('lsearch');
search.query('*', 'firstname:eric', 0, 10, function(err, results) {
    // do cool stuff with data in results
});

Implementation Results

The final implementation of querying is nearly the same as the proposed API, with a few deviations. First, the limit and offset arguments were rolled together into a single parameter object:

{
  limit:10,
  offset:5
}

NOTE: These query parameters are largely unused at the moment, because result sets larger than 100 tend to hurt performance. They are noted here for reference.
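As a rough illustration of how a parameter object like the one above could be interpreted, here is a minimal sketch that applies limit and offset to an array of results. The `page` helper is hypothetical and not part of lsearch. The defaults (offset 0, no limit) are assumptions.

```javascript
// Hypothetical helper: apply a {limit, offset} parameter object to an array of results.
function page(results, params) {
    params = params || {};
    var offset = params.offset || 0;
    // Assumed default: return everything after the offset when no limit is given.
    var limit = params.limit === undefined ? results.length : params.limit;
    return results.slice(offset, offset + limit);
}

// Twenty placeholder results, numbered 0 to 19.
var allResults = [];
for (var n = 0; n < 20; n++) {
    allResults.push({_id: n});
}

// Skips the first 5 results and returns the next 10.
var firstPage = page(allResults, {limit: 10, offset: 5});
```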

Next, there are two query APIs rather than a single one: queryType and queryAll. queryType limits the results to a single type, as specified when the document was indexed. queryAll returns results of all types.
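The difference between the two calls can be sketched over a hypothetical in-memory list of documents. The argument order, the sample documents, the "link" type, and the substring matching are all assumptions made for illustration. The real functions take a Lucene query and are backed by CLucene.

```javascript
// Hypothetical documents, each tagged with the type it was indexed under.
var docs = [
    {type: 'contact', _id: 1, content: 'Eric Example'},
    {type: 'contact', _id: 2, content: 'Thomas Muldowney temas'},
    {type: 'link', _id: 3, content: 'Photo shared by Eric'}
];

// Simplified matching: case-insensitive substring test instead of a Lucene query.
function contains(doc, term) {
    return doc.content.toLowerCase().indexOf(term.toLowerCase()) !== -1;
}

// Search every type.
function queryAll(term, callback) {
    callback(null, docs.filter(function(doc) { return contains(doc, term); }));
}

// Search only the documents indexed under the given type.
function queryType(type, term, callback) {
    callback(null, docs.filter(function(doc) {
        return doc.type === type && contains(doc, term);
    }));
}

var everywhere, contactsOnly;
queryAll('eric', function(err, results) { everywhere = results; });
queryType('contact', 'eric', function(err, results) { contactsOnly = results; });
```

With this sample data, queryAll matches both the contact and the link, while queryType returns only the contact.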

Finally, the complete Lucene query language is not supported. Searches are currently restricted to a single field rather than letting the user specify multiple fields. The term-based language constructs for boosting, mandatory terms, fuzzy matching, etc. are still allowed.
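For reference, the term-based constructs mentioned above look like this in Lucene's query parser syntax: a leading `+` marks a mandatory term, a trailing `~` requests a fuzzy match, and `^n` boosts a term. The sketch below only assembles such a query string. The `term` and `buildQuery` helpers are hypothetical, and passing the resulting string to the search API is left out because the exact call signature is not shown here.

```javascript
// Hypothetical helper: render one term with optional Lucene modifiers.
function term(word, options) {
    options = options || {};
    var text = (options.mandatory ? '+' : '') + word;
    if (options.fuzzy) {
        text += '~';                  // fuzzy match
    }
    if (options.boost !== undefined) {
        text += '^' + options.boost;  // boost this term's weight
    }
    return text;
}

// Join terms with spaces; no "field:" prefixes, since searches use a single field.
function buildQuery(terms) {
    return terms.join(' ');
}

var query = buildQuery([
    term('eric', {mandatory: true}),
    term('muldowney', {fuzzy: true}),
    term('temas', {boost: 2})
]);
// query is "+eric muldowney~ temas^2"
```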

User

I can use the existing search app to search across my entire locker, using the new CLucene search API, and see basic results in a raw JSON format.

Acceptance Criteria

  • Tests of lsearch API
    • Make tests pass if clucene is not installed
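One way to meet the last criterion is to probe for the native module before running the search tests and skip them when it is missing. The sketch below assumes the module is loaded under the name 'clucene'. The `loadOptional` helper is hypothetical, and where the actual test cases go is only marked by a comment.

```javascript
// Hypothetical helper: return the module if it can be loaded, or null if it is not installed.
function loadOptional(name) {
    try {
        return require(name);
    } catch (err) {
        if (err.code === 'MODULE_NOT_FOUND') {
            return null;
        }
        throw err; // any other load failure is a real error
    }
}

// Assumed module name; adjust to however the CLucene bindings are actually installed.
var clucene = loadOptional('clucene');

if (!clucene) {
    console.log('clucene is not installed; skipping lsearch tests');
} else {
    // The lsearch API tests would run here.
}
```

Skipping rather than failing keeps the rest of the test suite usable on machines without the CLucene bindings.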