Creating Real-Time Data Mashups with Node.js and Adobe CQ by Josh Miller
DESCRIPTION
NASCAR case study on how Node.js can be used with Adobe CQ to handle real-time data.
TRANSCRIPT
NODE.CQ
CREATING REAL-TIME DATA MASHUPS WITH NODE.JS AND ADOBE CQ
PROBLEM SCENARIO
We want to mix authored content from Adobe CQ with Real-Time Race Data from our Timing and Scoring system.
This means combining Slowly Changing Dimensions such as Driver Team Name, Vehicle Manufacturer Name, Track Information, etc. with Constantly Changing Metrics such as Last Lap Speed, Driver Position, Lap Number, etc.
Adobe CQ is great at managing the authored content, but is less adept at handling the real-time data. The time it takes to ingest the data and replicate it is too long – the data will have already changed.
ENTER NODE.JS
THE SOLUTION
NODE.JS AND ADOBE CQ ARE COMPLEMENTARY
ADOBE CQ
• Enterprise-Scale CMS
• Excels at Document Storage
• Great Authoring Environment
• Replicates and Scales Nicely
NODE.JS
• Enterprise-Scale Throughput
• Excels at Real-Time Data
• Easily Connect Disparate Systems
• Scales Nicely
COMMON USE-CASES FOR NODE.JS
• Creating Network-Intensive Applications
• Creating and Consuming Real-Time Data
• Creating Scalable, High-Throughput Solutions for Large Numbers of Simultaneous Connections
• Creating and Consuming Service-Based APIs
• Creating Stateless, Request-Response Scenarios
• Creating Push Scenarios over Websockets
• Creating Event-Driven Services
WORKING WITH NODE.JS
A QUICK INTRODUCTION
WORKING WITH NODE.JS SHOULD BE LIKE WORKING WITH BUILDING BLOCKS
Node.JS has a broad and diverse developer community. If you want to build something with Node, chances are someone else has already done the same thing.
Before you start building from scratch, look at the packages that already exist on NPM (http://npmjs.org)
Using NPM (Node Package Manager), you can install packages that perform the tasks you need to accomplish.
ELEMENTS OF A NODE.JS APPLICATION
Web Server / Framework
• Express
• Flatiron
Logging Service
• Morgan
• Winston
Configuration
• Nconf
• config
Promise Library
• Q
• promise
Built-In Services
• HTTP / HTTPS
• FileSystem
• Crypto
• Events
• Stream
• Etc.
NODE.JS GOTCHAS
Some things about Node.JS are a bit different from working with other technologies.
• NODE.JS IS ASYNCHRONOUS: Getting familiar with JavaScript Promises and Deferred libraries, or understanding and developing very clear callback chains, is a must for working with Node.JS effectively.
• NODE.JS IS A PACKAGE-DRIVEN TECHNOLOGY: Getting comfortable working with a Package Manager (NPM) is a must for working with Node.JS effectively.
• YOUR APPLICATION IS YOUR SERVER: There is no Apache or nginx or IIS to work with. You build your server, or use a framework like Express or Flatiron.
• NODE.JS IS AS FAULT-TOLERANT AS YOU MAKE IT: Building solid functionality with lots of error handling and good logging is important.
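The last two points can be illustrated with a minimal sketch that uses only the built-in `http` module. The `loadLeader` lookup and its sample data are hypothetical stand-ins for a real data source; the point is only that the application itself is the server, and that every asynchronous step ends in an error handler.

```javascript
const http = require('http');

// Hypothetical async lookup; a real one would call a database or a feed.
function loadLeader() {
  return new Promise((resolve) => {
    setImmediate(() => resolve({ position: 1, driver: 'Example Driver' }));
  });
}

// Request handler. The promise chain ends in a catch, so a failed lookup
// produces a logged 500 response instead of an unhandled rejection.
function handleRequest(req, res, load = loadLeader) {
  return Promise.resolve()
    .then(load)
    .then((data) => {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(data));
    })
    .catch((err) => {
      console.error('lookup failed:', err.message);
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: 'internal error' }));
    });
}

// The application is the server: nothing else sits in front of it here.
const server = http.createServer((req, res) => handleRequest(req, res));
// server.listen(3000); // uncomment to start listening (port is an assumption)
```

A framework such as Express wraps the same idea in routing and middleware; the error-handling responsibility stays with the application either way.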
WTF DID YOU JUST BUILD?
Node.JS is Package-Driven and NPM provides you with a wealth of resources for working with Node, but be careful what packages you choose. If you see a package that has 25,000 downloads and a vibrant development history on GitHub then you’re probably safe.
If you’re the only one that has downloaded this package this calendar year and the last commit was made in 2010, you might want to keep looking for a more popular package.
Just because you have bricks in your bin, you don’t have to use them all together.
INTEGRATING ADOBE CQ
SO WHAT ARE OUR NEXT STEPS?
USING ADOBE CQ’S REST API WITH NODE.JS
Adobe CQ is built on top of Apache Sling – a Web Framework that provides a REST API to CRX - the Java Content Repository that sits beneath Adobe CQ
You can directly query CRX using simple REST commands and have the output formatted as JSON
JSON data can be directly consumed by the Node.JS application independent of your website’s front-end
USING THE NODE-DEPTH SELECTOR WITH ADOBE CQ
USING THE INFINITY NODE-DEPTH SELECTOR
Returns either all child nodes at the given path, or an array of the available numeric node-depth selectors if the structure is deemed too large.
USING A NUMERIC NODE-DEPTH SELECTOR
Returns data from the root path, and all child nodes down to the node depth indicated by the selector.
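As a rough sketch of how a Node application might request this JSON, the helper below builds a Sling URL of the form `<path>.<depth>.json` and fetches it with the built-in `http` module. The host `http://localhost:4502` and the path `/content/nascar/drivers` are assumptions for illustration, and an author instance will usually require credentials, which this sketch omits.

```javascript
const http = require('http');

// Build a Sling JSON URL. `depth` is a number or the string 'infinity';
// when omitted, only the node at `path` itself is requested.
function slingJsonUrl(baseUrl, path, depth) {
  const selector = depth === undefined ? '' : `.${depth}`;
  return `${baseUrl}${path}${selector}.json`;
}

// Fetch a URL and parse the body as JSON, using only built-in modules.
// For an https:// URL, require('https') would be used instead.
function fetchJson(url) {
  return new Promise((resolve, reject) => {
    http.get(url, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => {
        try {
          resolve(JSON.parse(body));
        } catch (err) {
          reject(err);
        }
      });
    }).on('error', reject);
  });
}

// Hypothetical usage against a local CQ instance:
// fetchJson(slingJsonUrl('http://localhost:4502', '/content/nascar/drivers', 2))
//   .then((nodes) => console.log(Object.keys(nodes)))
//   .catch((err) => console.error('CQ request failed:', err.message));
```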
NODE-DEPTH SELECTOR RESULTS
ARRAY OF AVAILABLE NODE-DEPTH SELECTORS
JSON OUTPUT OF THE AVAILABLE NODES
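Because the infinity selector can answer in either of these two shapes, the consuming code has to tell them apart. The sketch below is one way to do that; it assumes each entry of the selector array is a path ending in `.<depth>.json`, which should be checked against the actual response from your CQ version.

```javascript
// Classify a parsed response: an array means "too large, choose a depth",
// anything else is treated as the node tree itself.
function interpretDepthResponse(body) {
  if (Array.isArray(body)) {
    return { kind: 'selectors', urls: body };
  }
  return { kind: 'nodes', nodes: body };
}

// Pick the deepest numeric selector on offer,
// e.g. "/content/races.3.json" yields 3. Returns null if none match.
function deepestSelector(urls) {
  const depths = urls
    .map((url) => /\.(\d+)\.json$/.exec(url))
    .filter(Boolean)
    .map((match) => Number(match[1]));
  return depths.length ? Math.max(...depths) : null;
}
```

A caller could retry with the depth returned by `deepestSelector` whenever `interpretDepthResponse` reports `kind: 'selectors'`.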
HOW DO WE USE THE DATA WITH NODE.JS?
NOW THAT WE HAVE THIS DATA
HOW DO WE USE THIS DATA?
By itself, the data that comes from CQ is only as useful as the underlying data structure. The power of this data comes from our ability to use Node.JS to quickly extract it and then mash it up with other data sources.
Using Node.JS, not only can we query data from CRX, we can query data from a number of sources and combine our CRX data with other feeds to create new data sources.
This enables us to mix authored content from CRX with Real-Time data from our Timing and Scoring feed to create a new, single feed that can be used in our Mobile product.
HOW IS THE DATA JOINED INTO A NEW DATA SOURCE?
Creating the feed mashup is not out-of-the-box functionality for Node.JS – we have to custom-code a method by which to join feeds together
Node.JS enables us to build an application using the building blocks we discussed earlier, but also allows us to create new, custom blocks with which to build
Without too much effort, we have created a package that allows feeds to be joined together using the same Primary and Foreign Key relationships you would find in a typical RDBMS product.
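The package itself is not shown in the slides, so the following is only a sketch of the underlying idea: a left join of one JSON feed onto another, matching a foreign key in one feed to a primary key in the other. The function name, field names, and sample records are all hypothetical.

```javascript
// Left-join `right` onto `left`: each left record gains an `as` property
// holding the right record whose primaryKey equals its foreignKey,
// or null when there is no match.
function joinFeeds(left, right, foreignKey, primaryKey, as) {
  const index = new Map(right.map((row) => [row[primaryKey], row]));
  return left.map((row) => ({ ...row, [as]: index.get(row[foreignKey]) || null }));
}

// Hypothetical sample data: authored content (slowly changing) ...
const authoredDrivers = [
  { driverId: 7, teamName: 'Team A', manufacturer: 'Maker X' },
  { driverId: 9, teamName: 'Team B', manufacturer: 'Maker Y' },
];
// ... and live timing data (constantly changing).
const liveTiming = [
  { driverId: 9, position: 1, lastLapSpeed: 181.2 },
  { driverId: 7, position: 2, lastLapSpeed: 180.4 },
];

const leaderboard = joinFeeds(liveTiming, authoredDrivers, 'driverId', 'driverId', 'driver');
console.log(JSON.stringify(leaderboard, null, 2));
```

Indexing the right-hand feed in a `Map` first keeps the join linear in the size of both feeds, which matters when the live feed is re-joined on every update.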
HOW IS THE DATA JOINED INTO ONE FEED?
• Using simple JSON syntax, we can define a new feed that is composed of one or more feeds.
• Each feed has a “join” condition that allows the feed to be joined to the collection based on a specific JSON node value.
• Special syntax allows for variable replacement from URL parameters
• Special syntax allows for values from the new feed to be used throughout the feed
• Includes custom functions such as Date and String Formatting
• Includes dependency conditions where field values are calculated and/or displayed based on the value of other fields
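The actual configuration syntax is not shown in the slides. Purely as an illustration of the first three points, a feed definition and URL-parameter replacement might look something like the sketch below, where the `{{name}}` placeholder syntax, the property names, and the `example.invalid` URLs are all invented for this example.

```javascript
// Hypothetical feed definition: two feeds, the second joined to the first.
const feedDefinition = {
  name: 'race-leaderboard',
  feeds: [
    { id: 'live', url: 'http://example.invalid/live/{{raceId}}.json' },
    {
      id: 'drivers',
      url: 'http://example.invalid/content/drivers.infinity.json',
      join: { to: 'live', on: 'driverId' },
    },
  ],
};

// Replace {{name}} placeholders with URL parameter values.
// Placeholders with no matching parameter are left untouched.
function expandTemplate(text, params) {
  return text.replace(/\{\{(\w+)\}\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(params, key)
      ? encodeURIComponent(params[key])
      : match);
}

// Resolve every feed URL in a definition against the request parameters.
function resolveFeedUrls(definition, params) {
  return definition.feeds.map((feed) => ({ ...feed, url: expandTemplate(feed.url, params) }));
}

const resolved = resolveFeedUrls(feedDefinition, { raceId: '4321' });
console.log(resolved.map((feed) => feed.url));
```

Keeping the definition as plain JSON is what lets a new feed be added by configuration alone, with no application code changes.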
INTEGRATING REAL-TIME DATA
TAKING IT ONE STEP FURTHER
GETTING LIVE DATA FROM THE RACETRACK
During a race, NASCAR vehicles are monitored via transponders placed in the cars. As the cars cross over fiber optic sensors in the track, the data is transmitted to a piece of software called TimeGear.
TimeGear tracks the speed of each car and its position relative to the other race cars, and feeds this data into the Timing and Scoring system.
Timing and Scoring provides a feed that is consumed by Apex, our Mobile Cacher application, which streams the JSON feed out to Akamai where the data is consumed by internal applications and third-parties such as Yahoo!, Fox Sports and ESPN.
INTEGRATING OUR REAL-TIME DATA FEED
Using the same syntax and the same data providers, we can query our Real-Time race data directly from Timing and Scoring, or directly from Akamai to reduce the load on the T&S systems.
Without modifying any code, provided a relationship can be found in the data, we can now merge any JSON data source into our feed.
This allows us to merge our Real-Time race statistics right into our authored CQ content, providing a richer and more in-depth feed for our Mobile application without the delay of first ingesting the race data into Adobe CQ.
Now that our data is available in a new format, we can provide a single stream of data to the NASCAR Mobile application, reducing the number of calls that need to be made from a mobile device.
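One way to picture the single combined stream is a function that loads every source concurrently and merges records that share a key value. This is a sketch under assumptions, not the NASCAR implementation: each source is modeled as an async function returning an array of records, and `carNumber` is a hypothetical shared key.

```javascript
// Load all sources at once, then merge records that share the same key value.
// Later sources overwrite fields of the same name from earlier sources.
async function buildFeed(sources, key) {
  const results = await Promise.all(sources.map((load) => load()));
  const merged = new Map();
  for (const rows of results) {
    for (const row of rows) {
      merged.set(row[key], { ...(merged.get(row[key]) || {}), ...row });
    }
  }
  return [...merged.values()];
}

// Hypothetical sources standing in for CQ content and the live timing feed.
const loadAuthored = async () => [{ carNumber: 24, teamName: 'Team A' }];
const loadLive = async () => [{ carNumber: 24, position: 1, lapNumber: 57 }];

buildFeed([loadAuthored, loadLive], 'carNumber')
  .then((feed) => console.log(JSON.stringify(feed)))
  .catch((err) => console.error('feed build failed:', err.message));
```

Because `Promise.all` rejects as soon as any source fails, a production version would need to decide whether a missing source should fail the whole feed or simply be skipped.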
EXTENDING OUR DATASET WITH THIRD-PARTY SERVICES
Given the flexibility of this data aggregator, we can now start to layer new and powerful data from disparate sources on top of our existing data without having to store that data in CQ.
For example, we can pull Real-Time Weather Conditions into our data based on the zip code of the track. We could pull track records to note if a driver’s lap speed was the fastest in the track’s history. We could even pull in Sponsor information based on the current Race Leader.
We accomplish all of this without the need to add to the storage requirements of our application, or write custom aggregators for external content.
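The layering idea can be sketched as a list of small functions, each of which looks at a record and returns extra fields to merge into it. The layer names, the record speed of 180, and the weather values below are hypothetical; in practice they would come from third-party services.

```javascript
// Apply each layer in order; a layer receives the record so far and
// returns an object of additional fields to merge in.
function applyLayers(feed, layers) {
  return feed.map((entry) =>
    layers.reduce((record, layer) => ({ ...record, ...layer(record) }), entry));
}

// Layer: flag a lap that beats the track record (value is an assumption).
const trackRecordLayer = (recordSpeed) => (entry) => ({
  isTrackRecord: entry.lastLapSpeed > recordSpeed,
});

// Layer: attach current weather for the track (data is an assumption).
const weatherLayer = (weather) => () => ({ weather });

const enriched = applyLayers(
  [{ driverId: 9, lastLapSpeed: 181.2 }],
  [trackRecordLayer(180), weatherLayer({ tempF: 72, conditions: 'Clear' })]
);
console.log(JSON.stringify(enriched));
```

Nothing is written back to CQ here: the extra fields exist only in the outgoing feed.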
WHAT ARE THE BENEFITS OF USING NODE.JS?
NOW THAT WE’RE DONE
COULDN’T WE HAVE DONE THIS USING CQ?
Of course, we could have accomplished the same end-result using only Adobe CQ and some custom Java code. There are some real benefits to using Node.JS in this scenario though:
• There is no code to compile and new feeds only require JSON configuration
• Node.JS is an extremely high-throughput platform. We can serve hundreds of simultaneous connections per second.
• We reduce the load on our CQ environment by offloading tasks to an application with fewer hardware requirements
• We don’t use a large, complex web framework to deliver small streams of data with no user interface requirements
IS NODE.JS REALLY THAT MUCH MORE PERFORMANT?
We have used Node.JS for a number of new tasks here at NASCAR Digital Media lately and have found it to be incredibly performant. We recently launched a new RaaS implementation with Gigya and use Node.JS to authenticate users.
During our load tests, we found that in 10 minutes of sustained load we could serve all of the traffic that we expected the Node service to experience over the entire race season.
In fact, we have found that our load tests typically max out not because of Node’s inability to serve more requests, but because MySQL starts to queue requests, or Gigya begins to throttle requests per second.
WHERE CAN I LEARN MORE?
OK, I’M INTRIGUED …
RELATED RESOURCES
Node.JS: http://nodejs.org/
NPM: https://www.npmjs.org/
Adobe CQ: http://www.adobe.com/solutions/web-experience-management.html
Apache Sling: http://sling.apache.org
Apache Jackrabbit: http://jackrabbit.apache.org/
LEARNING NODE.JS
• Node.JS the Right Way (Book)
http://amzn.to/1wmI4hL
• NodeSchool (Tutorial)
http://nodeschool.io/
• Express Framework (Documentation)
http://expressjs.com/starter/hello-world.html
• JavaScript Promises (Article)
http://www.html5rocks.com/en/tutorials/es6/promises/