Implementation Story: CloudSearch on Node.js (Lambda) via LaunchRight.com

Implementation Stories are quick, practical examples (including code) of how the LaunchRight team uses technology such as third-party APIs, cloud services, and more within larger Launch Stories. Here we go...


As part of a larger launch story (coming soon), we recently implemented Amazon CloudSearch to support a MASSIVE data-indexing endeavor. Building on previous development work, we decided to implement a search API using CloudSearch along with a couple of other Amazon Web Services, namely AWS Lambda and the AWS API Gateway.

Amazon CloudSearch is the Amazon Web Services "Search As A Service" offering or, as described by Amazon:

Amazon CloudSearch is a managed service in the AWS Cloud that makes it simple and cost-effective to set up, manage, and scale a search solution for your website or application. Amazon CloudSearch supports 34 languages and popular search features such as highlighting, autocomplete, and geospatial search.

AWS Lambda is Amazon's "on-demand compute service" offering, while the AWS API Gateway is the Amazon Web Services "API As A Service" offering. You can read our implementation stories on both of these Amazon Web Services soon.

Before we get started

This implementation runs on AWS Lambda, so some steps required for using AWS CloudSearch in a stand-alone node.js environment have been omitted. Drop us a comment below if you have any questions.

To keep things focused on the implementation within node.js, we have not covered how to set up your CloudSearch domain as part of this story. We'll be posting that implementation story soon.

At various points below you will see me reference a domain. Note that this is not your web address or your website domain. It refers to the AWS CloudSearch domain that you set up. Simply put, your CloudSearch domain can be thought of as the database or warehouse that you upload data (documents) to and search through.

Now let's get started...

The AWS SDK

The AWS SDK is readily available when utilizing node.js with AWS Lambda so the first step here, including the AWS SDK in your project, is pretty easy. Simply add the AWS SDK to your project:

var aws = require('aws-sdk');

Once you have included the AWS SDK in your project, you'll want to configure it for use with your Amazon Web Services access key and secret. The following should be AFTER the code shown above:

aws.config.update({ accessKeyId: 'AKXXXXXXXXXXXXXXXXXX', secretAccessKey: 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX', region: 'us-east-1' });

Note, where region above is set to 'us-east-1' you should set this to the respective region for your environment.
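If you would rather not hard-code your key and secret, a minimal sketch along these lines reads them from environment variables instead. The helper name buildAwsConfig and the 'us-east-1' fallback are our own assumptions for illustration; the environment variable names are the standard ones the AWS SDK recognizes, and AWS Lambda sets AWS_REGION for you.

```javascript
// Hypothetical helper (not part of the AWS SDK): builds the object
// that gets passed to aws.config.update().
function buildAwsConfig(env) {
  // Fall back to 'us-east-1' only as an example default (assumption).
  var config = { region: env.AWS_REGION || 'us-east-1' };

  // Only set explicit keys when both are present; otherwise leave the
  // SDK to resolve credentials on its own.
  if (env.AWS_ACCESS_KEY_ID && env.AWS_SECRET_ACCESS_KEY) {
    config.accessKeyId = env.AWS_ACCESS_KEY_ID;
    config.secretAccessKey = env.AWS_SECRET_ACCESS_KEY;
  }
  return config;
}

// Intended usage, with the aws object required earlier:
//   aws.config.update(buildAwsConfig(process.env));
console.log(buildAwsConfig({ AWS_REGION: 'eu-west-1' }));
```

Inside Lambda the function's execution role normally supplies credentials, so in that setting the helper would return only the region.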

Now that our access to the Amazon Web Services SDK is set up correctly, we can set up the interface to our AWS CloudSearch domain.

CloudSearch in Node.js

First, let's set up a variable with our domain's document endpoint URL:

var csDomain = 'xxxxxxxxxxxxxxxxxxxxxx.xx-xxxx-x.cloudsearch.amazonaws.com';

You can find the document endpoint via the AWS CloudSearch Management Dashboard listed as Document Endpoint for your domain.

After we know our document endpoint, we configure the AWS SDK to use this document endpoint in our CloudSearch requests.

We are focusing here on the data side of AWS CloudSearch, specifically adding and searching for data. For this we are going to use the CloudSearchDomain() interface to AWS CloudSearch. There is also a CloudSearch() interface which we don't touch here. The CloudSearch() interface is meant for management of AWS CloudSearch and your CloudSearch domains.

Set up CloudSearchDomain() for your document domain:

var cloudsearch = new aws.CloudSearchDomain({endpoint:csDomain});

Of course, you don't have to use the variable csDomain as I show here. You can always define the document endpoint directly in the request for the CloudSearch object:

var cloudsearch = new aws.CloudSearchDomain({endpoint:'xxxxxxxxxxxxxxxxxxxxxx.xx-xxxx-x.cloudsearch.amazonaws.com'});

Now let's look at a couple of example requests to our CloudSearch domain utilizing the CloudSearch object we just created.

You have access to the full CloudSearch SDK but specifically we're going to look at uploading data and searching for data.

You can always review the full documentation of available endpoints within CloudSearchDomain() and CloudSearch().

Uploading a single document to AWS CloudSearch

We are uploading a simple document to our CloudSearch domain below.

Our document has two indexed fields, title and description, but a document can have as many fields as your schema requires.

Let's look at the full code block then break it down:

var documentsBatch = [];
var document = {};
document.id = 1;
document.type = 'add';
document.fields = { title: 'My First Document', description: 'This is my first document' };
documentsBatch.push(document);
var params = { contentType: 'application/json', documents: JSON.stringify(documentsBatch) };
cloudsearch.uploadDocuments(params, function(err, data) {
  if (err) {
    context.fail(err.stack);
  } else {
    // context.succeed() is the Lambda context method that ends the function successfully
    context.succeed('document uploaded successfully');
  }
});

Step by step:

We first set up our main document "batch," an array of objects that we fill with one or more documents (as needed) to be uploaded to our AWS CloudSearch domain:

var documentsBatch = []

Next we create our document object for our single document:

var document = {}

Now we want to fill our document object with our data:

document.id = 1;
document.type = 'add';
document.fields = { title: 'My First Document', description: 'This is my first document' };

Next we add our document to our document batch:

documentsBatch.push(document)

We now prep our data to send to our AWS CloudSearch domain:

var params = { contentType: 'application/json', documents:JSON.stringify(documentsBatch) };

This step lets the AWS CloudSearch domain know what format our data is in (contentType) and what the data is we want to upload (documents).

We're formatting our data as JSON so we set our contentType to reflect this by setting it to 'application/json'.

To format our documents batch we only need to run it through JSON.stringify(documentsBatch).
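To make the serialized batch concrete, here is a small stand-alone sketch of what JSON.stringify() produces for the single document built above. It needs no AWS access, and the variable name serialized is just ours for illustration.

```javascript
// The same single-document batch as in the walkthrough above.
var documentsBatch = [{
  id: 1,
  type: 'add',
  fields: { title: 'My First Document', description: 'This is my first document' }
}];

// This string is what ends up in params.documents.
var serialized = JSON.stringify(documentsBatch);
console.log(serialized);
// prints: [{"id":1,"type":"add","fields":{"title":"My First Document","description":"This is my first document"}}]
```

Printing the string like this before uploading is a cheap way to spot a malformed batch.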

With our data package ready, the only step left is sending it over to our AWS CloudSearch domain.

cloudsearch.uploadDocuments(params, function(err, data) {
  if (err) {
    context.fail(err.stack);
  } else {
    context.succeed('document uploaded successfully');
  }
});

If all goes well, your AWS Lambda function will exit successfully. If anything is wrong with the AWS CloudSearch domain setup OR the formatting of your data, the AWS Lambda function will exit with the error details via err.stack.
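If you want to see both outcomes without touching AWS, one option is to pass the client in and swap in a stub. This is only a sketch: makeUploadHandler, the event.documents shape, and the stub objects are all assumptions of ours, and context.succeed() / context.fail() are the callback-style Lambda context methods.

```javascript
// Hypothetical factory: returns a Lambda-style handler bound to a client
// that exposes uploadDocuments(params, callback).
function makeUploadHandler(client) {
  return function (event, context) {
    var params = {
      contentType: 'application/json',
      documents: JSON.stringify(event.documents) // event shape is assumed
    };
    client.uploadDocuments(params, function (err, data) {
      if (err) {
        context.fail(err.stack);
      } else {
        context.succeed('document uploaded successfully');
      }
    });
  };
}

// Stubs standing in for the real context and aws.CloudSearchDomain client.
var results = [];
var stubContext = {
  succeed: function (message) { results.push('ok: ' + message); },
  fail: function (details) { results.push('failed'); }
};
var okClient = {
  uploadDocuments: function (params, callback) { callback(null, {}); }
};
var badClient = {
  uploadDocuments: function (params, callback) { callback(new Error('bad batch')); }
};

makeUploadHandler(okClient)({ documents: [] }, stubContext);
makeUploadHandler(badClient)({ documents: [] }, stubContext);
console.log(results);
// prints: [ 'ok: document uploaded successfully', 'failed' ]
```

In the real function you would pass the cloudsearch object created earlier instead of a stub.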

Although the example above uploads only a single document, uploading several takes just a minor modification: loop over your data and add each document to the batch via documentsBatch.push(document) before calling uploadDocuments().
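A minimal sketch of that loop might look like the following. The helper name buildUploadParams and the shape of the input records (id, title, description) are assumptions made for this example, not something CloudSearch prescribes.

```javascript
// Hypothetical helper: turns plain records into the params object
// expected by cloudsearch.uploadDocuments().
function buildUploadParams(records) {
  var documentsBatch = [];
  for (var i = 0; i < records.length; i++) {
    // One 'add' operation per record, mirroring the single-document example.
    documentsBatch.push({
      id: records[i].id,
      type: 'add',
      fields: {
        title: records[i].title,
        description: records[i].description
      }
    });
  }
  return {
    contentType: 'application/json',
    documents: JSON.stringify(documentsBatch)
  };
}

var params = buildUploadParams([
  { id: 1, title: 'My First Document', description: 'This is my first document' },
  { id: 2, title: 'My Second Document', description: 'This is my second document' }
]);
console.log(JSON.parse(params.documents).length);
// prints: 2
```

The resulting params object can be handed to cloudsearch.uploadDocuments() exactly as in the single-document example.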

Now that we have our document uploaded to our AWS CloudSearch domain, we can dive into searching it.

Simple searching of Amazon Web Services CloudSearch domains

You can use the search interface with the same cloudsearch object we created at the beginning. Let's look at the complete code block for searching your AWS CloudSearch domain:

var aws = require('aws-sdk');
aws.config.update({ accessKeyId: 'AKXXXXXXXXXXXXXXXXXX', secretAccessKey: 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX', region: 'us-east-1' });
var csDomain = 'xxxxxxxxxxxxxxxxxxxxxx.xx-xxxx-x.cloudsearch.amazonaws.com';
var cloudsearch = new aws.CloudSearchDomain({endpoint:csDomain});
var csParams = { // template of the available parameters; query is the only required one
  cursor: 'STRING_VALUE', expr: 'STRING_VALUE', facet: 'STRING_VALUE',
  filterQuery: 'STRING_VALUE', highlight: 'STRING_VALUE',
  queryParser: 'simple', // or 'structured', 'lucene', 'dismax'
  return: 'STRING_VALUE', size: 0, sort: 'STRING_VALUE', start: 0, stats: 'STRING_VALUE',
  partial: true, // or false
  queryOptions: 'STRING_VALUE', query: 'STRING_VALUE' };
cloudsearch.search(csParams, function(err, data) {
  if (err) { context.fail(err.stack); } // an error occurred
  else { context.succeed(data); } });

After we set up our cloudsearch object, we submit our search request, csParams, to the AWS CloudSearch domain via cloudsearch.search(). The request, if successful, will return a JSON-encoded dataset for our application to use.
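For a plain keyword search you typically need only a few of those parameters. The sketch below shows one way to build a small request and pull titles out of the response. The helper names buildSearchParams and summarizeHits, the default size of 10, and the hand-written sampleResponse are assumptions for illustration; the sample follows the hits.hit layout returned by search(), where each field value arrives as an array of strings.

```javascript
// Hypothetical helper: a minimal request for the simple query parser.
function buildSearchParams(term, size) {
  return { query: term, queryParser: 'simple', size: size || 10 };
}

// Hypothetical helper: reduces a search response to ids and first titles.
function summarizeHits(data) {
  return data.hits.hit.map(function (hit) {
    var titles = hit.fields && hit.fields.title;
    return { id: hit.id, title: titles ? titles[0] : null };
  });
}

// Hand-written sample standing in for the data argument of the callback.
var sampleResponse = {
  hits: {
    found: 1,
    start: 0,
    hit: [{
      id: '1',
      fields: {
        title: ['My First Document'],
        description: ['This is my first document']
      }
    }]
  }
};

console.log(buildSearchParams('first document'));
console.log(summarizeHits(sampleResponse));
```

With the real client, the call would be cloudsearch.search(buildSearchParams('first document'), callback), with summarizeHits(data) used inside the callback.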

The key to submitting your search request, after you have this code block, is understanding how to search AWS CloudSearch in general (not node.js specific). A separate implementation story for the nuances we encountered while searching our AWS CloudSearch domain will be coming soon.

To wrap things up, we've reviewed how to configure the AWS SDK connection for your specific Amazon Web Services account, set up our interface to AWS CloudSearch, uploaded our first document to AWS CloudSearch, and, finally, submitted our first search request. There's plenty more to cover and more implementation stories to come, so check back soon at LaunchRight.com/Blog.