Used in this project

NodeJS, NodeJS cluster management, ZombieJS, ExpressJS, Mocha, Supertest, Chai, Chai-Things, CasperJS, PhantomJS, Bitbucket, Trello, Heroku, Heroku Buildpacks (multi, node, casper/phantom), Mailgun (including Node integration), Q (for JavaScript promises in Node), MongoDB, MongoLab, WebPageTest, WebPageTest API, Mongoose, Bluebird, Twitter Bootstrap, Yelp API, Foursquare API, Google Maps API, Yellow Pages API, Facebook Places API - Places endpoint, Node Async module, Handsontable


Work summary

ZombieJS memory leak

Fixed ZombieJS memory leak using Node cluster and process management.
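
A minimal sketch of one way cluster can contain a leak like this, not the project's actual code: the primary process replaces any worker that exits, and each worker exits on its own once its memory passes a limit. The 300 MB limit and the names shouldRecycle, startPrimary, startWorker and runJob are all hypothetical.

```javascript
const cluster = require('cluster');

// Hypothetical limit: recycle a worker once its resident memory passes 300 MB.
const MAX_RSS_BYTES = 300 * 1024 * 1024;

// Decide whether a worker has grown enough that it should be replaced.
function shouldRecycle(rssBytes, limitBytes = MAX_RSS_BYTES) {
  return rssBytes > limitBytes;
}

// Primary process: start the workers and fork a replacement whenever one exits.
function startPrimary(workerCount) {
  for (let i = 0; i < workerCount; i++) {
    cluster.fork();
  }
  cluster.on('exit', () => {
    cluster.fork();
  });
}

// Worker process: run each job sent by the primary (runJob is a hypothetical
// function that would drive the leaky browser), then exit if memory has
// grown past the limit so the primary can start a fresh process.
function startWorker(runJob) {
  process.on('message', async (job) => {
    try {
      await runJob(job);
    } finally {
      if (shouldRecycle(process.memoryUsage().rss)) {
        process.exit(0);
      }
    }
  });
}
```

The leaked memory is never reclaimed inside a process; it goes away because the whole process does.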

Test suite - API endpoints

Wrote test suite for API endpoints, using Mocha and Supertest.

Test suite - end-to-end for their store locators

Wrote an end-to-end test suite for their store locators in CasperJS, including e2e testing for their Facebook app. Scraping the locators for test purposes was difficult for two reasons: the content sits in different frames, and the JavaScript loads asynchronously (via yepnope). Switching from Zombie to Phantom/Casper solved the async loading issue, but the iframes were a bit trickier.

Email notifications

Added email notifications for the test suite via Mailgun. This took some scripting because the test suite runs separate scripts, and admins need a single email with the output of both scripts if an error occurs in either. Using Q's promises made things a bit cleaner.
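
The idea can be sketched roughly as follows. Native promises are used here in place of Q, and sendEmail is a hypothetical stand-in for the Mailgun call, so this shows the shape of the approach rather than the original script.

```javascript
const { execFile } = require('child_process');

// Run one script and always resolve (never reject) with its name,
// its combined output, and whether it exited cleanly.
function runScript(name, command, args) {
  return new Promise((resolve) => {
    execFile(command, args, (error, stdout, stderr) => {
      resolve({ name, ok: !error, output: stdout + stderr });
    });
  });
}

// Wait for every run to finish. If at least one failed, send a single email
// containing the output of all runs; otherwise send nothing.
// `sendEmail` is a hypothetical async function supplied by the caller.
async function runAndNotify(runs, sendEmail) {
  const results = await Promise.all(runs);
  const failed = results.filter((r) => !r.ok);
  if (failed.length === 0) {
    return null;
  }
  const body = results
    .map((r) => `== ${r.name} (${r.ok ? 'passed' : 'FAILED'}) ==\n${r.output}`)
    .join('\n\n');
  const subject = `Test suite failed: ${failed.map((r) => r.name).join(', ')}`;
  await sendEmail({ subject, text: body });
  return body;
}
```

Because runScript never rejects, a failure in one script cannot hide the output of the other.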

Test suite - performance tests from different locations

Wrote performance tests that measure load time from 7 different countries via the WebPageTest API and log the results to a Mongo database hourly. Reports are then generated daily and sent out to admins.

Location fetcher

Rewrote a Node-backed site where users can copy and paste names and locations from a spreadsheet, search for additional data on Yelp, Foursquare, Google Places, Facebook Places, and Yellow Pages, and populate the rest of the columns in the spreadsheet.

Added specialized throttling for each API data source, both client-side and server-side. Client-side, I used timeouts and requested results incrementally (e.g. it might take 10 server requests to fill out the entire data set) while giving the user some UI feedback. Without this, a user requesting 1000 rows might have to wait 5 or so minutes with no feedback on progress in the UI. Server-side, I used a library (Bluebird, or possibly the Async module) to throttle the async work (e.g. at most 4 API calls in flight at once). I then tested and tuned these settings to be as efficient as possible without going over API limits.
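
Both halves of the throttling can be sketched with plain promises. This is an illustration under assumptions, not the original code: mapWithLimit and toBatches are hypothetical helpers, and a library feature such as Bluebird's Promise.map with its concurrency option covers the server-side part in far fewer lines.

```javascript
// Server-side idea: run `fn` over `items` with at most `limit` calls in
// flight at once. Results are returned in the same order as the input.
async function mapWithLimit(items, limit, fn) {
  const results = new Array(items.length);
  let next = 0;

  // Each worker repeatedly claims the next unprocessed index until none remain.
  async function worker() {
    while (next < items.length) {
      const index = next++;
      results[index] = await fn(items[index], index);
    }
  }

  const workers = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return results;
}

// Client-side idea: split the rows into batches so each batch can be
// requested separately and progress shown after every response.
function toBatches(rows, batchSize) {
  const batches = [];
  for (let i = 0; i < rows.length; i += batchSize) {
    batches.push(rows.slice(i, i + batchSize));
  }
  return batches;
}
```

With a limit of 4, a fifth API call starts only when one of the first four has finished, which keeps the request rate predictable against each provider's limits.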

Improved result scoring, using criteria like distance and string match scoring.
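
A rough sketch of what scoring on those two criteria might look like. The specific choices here are assumptions for illustration: haversine distance, Levenshtein-based name similarity, the 0.6/0.4 weights, and the 5 km cutoff are not taken from the project.

```javascript
// Great-circle distance in kilometres between two {lat, lng} points (haversine).
function distanceKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLng = toRad(b.lng - a.lng);
  const h = Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLng / 2) ** 2;
  return 2 * 6371 * Math.asin(Math.sqrt(h));
}

// Levenshtein edit distance, computed one row at a time.
function editDistance(s, t) {
  let prev = Array.from({ length: t.length + 1 }, (_, j) => j);
  for (let i = 1; i <= s.length; i++) {
    const curr = [i];
    for (let j = 1; j <= t.length; j++) {
      const cost = s[i - 1] === t[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[t.length];
}

// Name similarity between 0 (nothing in common) and 1 (identical), ignoring case.
function nameSimilarity(a, b) {
  const s = a.trim().toLowerCase();
  const t = b.trim().toLowerCase();
  const longest = Math.max(s.length, t.length);
  return longest === 0 ? 1 : 1 - editDistance(s, t) / longest;
}

// Combined score between 0 and 1. Weights and maxKm are hypothetical tuning knobs.
function scoreResult(query, candidate, maxKm = 5) {
  const closeness = Math.max(0, 1 - distanceKm(query, candidate) / maxKm);
  return 0.6 * nameSimilarity(query.name, candidate.name) + 0.4 * closeness;
}
```

Candidates returned by each API could then be sorted by scoreResult and the top one used to fill the row.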

Everything was tested with Mocha.

Used Twitter Bootstrap to make it look decent, and added some UI improvements to the spreadsheet provided by Handsontable (e.g. clear rows). Also added copy and paste functionality to copy the entire spreadsheet to the clipboard.

MongoDB copier

Built a Node app on Heroku that lets you copy one Mongo database into another (essentially a dump and restore) by entering source and destination db info. Used local storage to save text field entries, so the user wouldn't have to re-enter information. Sends an email notification to the user after completion.

Real-time product availability updater

Built on top of Node/Mongo/Heroku. Given a list of places and products, we want to know which products are available in each of those stores, along with price and product details. The goal was for all data to be less than an hour old. Did some complexity analysis to see if this was possible given API constraints. Because of these constraints, everything had to be done very efficiently:

I tried to minimize db calls by doing bulk updates wherever possible. I stored and massaged API results into hashes to minimize processing time and avoid n+1 lookups. I also tried to take advantage of async processing to run updates and API requests in parallel where possible (using Q as the promise library).
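
As an illustration of the hash-plus-bulk-update idea, here is a minimal sketch. The field names (storeId, productId, available, price, updatedAt) and both helper functions are hypothetical. The operations are plain objects in the updateOne shape that the MongoDB driver's bulkWrite accepts.

```javascript
// Index API results by "storeId:productId" so each lookup is a single
// hash access instead of a scan or a database query per row.
function indexByKey(apiResults) {
  const byKey = new Map();
  for (const result of apiResults) {
    byKey.set(`${result.storeId}:${result.productId}`, result);
  }
  return byKey;
}

// Build one array of update operations for all tracked store/product pairs
// that have fresh API data, so they can be sent to the database in one call.
function buildBulkOps(tracked, byKey, now = new Date()) {
  const ops = [];
  for (const item of tracked) {
    const hit = byKey.get(`${item.storeId}:${item.productId}`);
    if (!hit) {
      continue; // no fresh data for this pair, leave the stored record alone
    }
    ops.push({
      updateOne: {
        filter: { storeId: item.storeId, productId: item.productId },
        update: { $set: { available: hit.available, price: hit.price, updatedAt: now } },
        upsert: true,
      },
    });
  }
  return ops;
}
```

With a current MongoDB driver, the resulting array could be passed to something like collection.bulkWrite(ops, { ordered: false }), turning what would be one round trip per product into one round trip per batch.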

Added better error handling, including notifying admins in case of errors.

Wrote script to provide a summary of the product availability data (e.g. what % of data is less than 2 hours old, 1 hour old, 30 minutes old, and so on).
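
The core of such a summary might look like the following sketch. It assumes each record carries an updatedAt Date, which is a hypothetical field name, and freshnessSummary is an illustrative helper rather than the original script.

```javascript
// For each age threshold (in minutes), report the percentage of records
// whose `updatedAt` falls within that many minutes of `now`.
function freshnessSummary(records, thresholdsMinutes, now = Date.now()) {
  const summary = {};
  for (const minutes of thresholdsMinutes) {
    const cutoff = now - minutes * 60 * 1000;
    const fresh = records.filter((r) => r.updatedAt.getTime() >= cutoff).length;
    summary[minutes] = records.length === 0
      ? 0
      : Math.round((fresh / records.length) * 100);
  }
  return summary;
}
```

Calling freshnessSummary(records, [120, 60, 30]) returns an object keyed by threshold, which maps directly onto the "less than 2 hours, 1 hour, 30 minutes old" breakdown.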

Scheduled the availability script and wrapped its runs in a Job data structure, on which I kept stats (e.g. number of products updated, time it took to update them, and so on).
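
A Job wrapper along these lines could be as small as the sketch below. The class shape and field names are assumptions for illustration and do not reproduce the original structure.

```javascript
// Wraps one run of a task and records basic stats about it.
class Job {
  constructor(name) {
    this.name = name;
    this.startedAt = null;
    this.finishedAt = null;
    this.productsUpdated = 0;
    this.errors = [];
  }

  // Run the task, passing it this job so it can record counts as it goes.
  // Timing is captured here, and a thrown error is recorded instead of propagated.
  async run(task) {
    this.startedAt = new Date();
    try {
      await task(this);
    } catch (err) {
      this.errors.push(err instanceof Error ? err.message : String(err));
    } finally {
      this.finishedAt = new Date();
    }
    return this.stats();
  }

  // Summary of the run, suitable for logging or storing alongside the data.
  stats() {
    return {
      name: this.name,
      productsUpdated: this.productsUpdated,
      durationMs: this.finishedAt - this.startedAt,
      errorCount: this.errors.length,
    };
  }
}
```

A scheduled run would then create a Job, call run with the update function, and persist the returned stats.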

Wrote some tests for the API calls in Mocha.