Automata Processing: Accelerating Big Data


Upload: microntechnology

Post on 28-Nov-2014


DESCRIPTION

Many of today’s most challenging computer science problems, such as those involving very large data structures, unstructured data, random access or real-time performance requirements, require highly parallel solutions. Current implementations of parallelism can be cumbersome and complex, challenging the capabilities of traditional CPU and memory system architectures and often requiring significant effort from programmers and system designers. For the past seven years, Micron Technology has been developing a hardware co-processor technology that can directly implement large-scale Non-deterministic Finite Automata (NFA) for efficient parallel execution. This new non-Von Neumann processor, currently in fabrication, borrows from the architecture of memory systems to achieve massive data parallelism, addressing complex problems in an efficient, manageable way. On September 17, the Wall Street Technology Association (WSTA) hosted a seminar for financial IT professionals entitled Delivering Big Data. As part of that event, Micron’s Dan Skinner delivered an introduction to this revolutionary new technology, its growing ecosystem, and potential applications in computational finance and data analytics.

TRANSCRIPT

Page 1: Automata Processing: Accelerating Big Data

1 | ©2014 Micron Technology, Inc. September 26, 2014

• Presented by: Dan Skinner

• Director, Business Development

• Micron Technology, Inc.

Automata Processing: Accelerating Big Data

Page 2: Automata Processing: Accelerating Big Data

Five Big Technology Trends

BIG DATA | CLOUD | NETWORKING | MOBILE | MACHINE TO MACHINE

Big Data Presents a Unique Challenge for Memory Systems

• Customers demand high performance for analytics.

• Increasing levels of parallelism drive complexity in system architectures.

• Massive scale requires aggressive power targets.

Page 3: Automata Processing: Accelerating Big Data

A Repetitive Cycle…

The Consistent Message

CPU Vendor, System OEM: “Memory is the bottleneck!” “We need faster memory!”

The Response

Memory Industry: “Sure, we can do that!”

Innovations in memory interfaces have been critical to improving performance.

[Timeline, 1970 to today: Broadside Addressing, Multiplexed Addressing, Fast Page Mode, Extended Data Out, Synchronous DRAM]

Page 4: Automata Processing: Accelerating Big Data

The New Standard for Memory Performance: Hybrid Memory Cube

Micron’s revolutionary approach combines logic + memory; breaks through the “Memory Wall”

• Provides 15X the bandwidth of a DDR3 module

• Uses 70% less energy per bit than existing memory technologies

• Reduces the memory footprint by nearly 90% compared to today’s RDIMMs

HMC Consortium: A Growing Ecosystem

[Logo groups: OEMs, Enablers, Tools]

Page 5: Automata Processing: Accelerating Big Data

5 | ©2014 Micron Technology, Inc. August 2011

The Memory Wall Keeps Getting Higher

Working harder and faster is the common approach to ‘getting over the wall’:

• Higher speed memory interfaces

• Complex algorithms to minimize traffic

• Multiple channel memory interfaces

• Advanced high speed signaling techniques

• And on, and on, and on…

[Diagram: input feeds the ‘Store’ (memory: instructions, data & variables) and the ‘Engine’ (processor: fixed computational pipeline), with the memory bottleneck between them. Hybrid engine: the store becomes a flexible computational engine.]

Page 6: Automata Processing: Accelerating Big Data

Staying a Step Ahead Requires New Technologies

Fact: The ability to generate and transport information has vastly exceeded our capacity to analyze that same information.

Fast, accurate analysis of data provides the winning edge in financial markets

Page 7: Automata Processing: Accelerating Big Data

Swamped with Data: Three Examples

• Sentiment Analysis (Speed): Internet, Wall Street

• Bioinformatics (Complexity): DNA, Database

• Surveillance (Speed & Complexity): Cameras, Monitor

Processing complexity and throughput requirements prevent information from being analyzed.

Page 8: Automata Processing: Accelerating Big Data

Breaking the Cycle

Big Data Pushes Memory to the Limit

CPU Vendor, System OEM: “Memory is the bottleneck!” “We need faster memory!”

New Response

Micron Technology: “Let’s rethink the problem”

• The modern relationship between processor and memory was conceived to avoid complications associated with physical reconfiguration of ENIAC.

• Since the mid-1940s, most computer systems have been built on this basic architectural concept. The role of memory in systems was firmly cast.

• Conclusion: important advancements can be made if we challenge this deeply rooted historical concept.

Page 9: Automata Processing: Accelerating Big Data

Introduction to Automata Processing

The Automata Processor (AP) is a programmable silicon device capable of performing very high-speed, comprehensive search and analysis of complex, unstructured data streams.

• Hardware implementation of non-deterministic finite automata, or NFA (with additional features)

• A massively parallel, scalable, two-dimensional fabric comprising 48K processing elements per chip, each programmed to perform a pattern matching and activation task each cycle

• Exploits the very high and natural level of parallelism found in memory devices

• Addresses complex computational problems with unprecedented parallelism and performance

• Deployable in single-chip, module, and multi-module forms

Page 10: Automata Processing: Accelerating Big Data

What is an NFA?

• A finite automaton is a set of states and transition rules that respond to input.

• It produces a unique computation (or run) of the automaton for each input string.

• Non-determinism allows multiple concurrent paths through the automaton. This is very powerful and handles combinatorial problems.

• Micron’s AP adds counters and Boolean elements to handle increased problem complexity without sacrificing capacity.
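The idea of multiple concurrent paths can be sketched in software by tracking the set of all active states at once. The sketch below is only an illustration: the three-state automaton `ENDS_IN_AB` and the function `run_nfa` are hypothetical names invented here, not part of any Micron tool, and a software loop only imitates what the slides describe the hardware as doing in parallel.

```python
from typing import Dict, Set, Tuple

# (state, input symbol) -> set of next states. Several next states for one
# (state, symbol) pair is what makes the automaton non-deterministic.
Transitions = Dict[Tuple[int, str], Set[int]]


def run_nfa(transitions: Transitions, start: int,
            accepting: Set[int], text: str) -> bool:
    """Return True if any path through the NFA ends in an accepting state."""
    active = {start}  # all states currently "alive"
    for symbol in text:
        next_active: Set[int] = set()
        for state in active:
            # Follow every possible transition from every active state.
            next_active |= transitions.get((state, symbol), set())
        active = next_active
    return bool(active & accepting)


# Hypothetical toy automaton: accepts strings over {a, b} that end in "ab".
ENDS_IN_AB: Transitions = {
    (0, "a"): {0, 1},  # stay in 0, or guess that this 'a' starts the final "ab"
    (0, "b"): {0},
    (1, "b"): {2},
}

if __name__ == "__main__":
    for word in ["aab", "aba", "ab", ""]:
        print(repr(word), run_nfa(ENDS_IN_AB, 0, {2}, word))
```

Each input symbol advances every active state together, so the cost per symbol depends on how many states are active rather than on how many distinct paths exist.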

Page 11: Automata Processing: Accelerating Big Data


Automata Equivalence

• Any nondeterministic machine can be modeled as deterministic at the expense of exponential growth in the state count.

Today’s supercomputers model an NFA as a DFA, traversing every edge to find the solution. This creates an explosion in memory space.

• SNORT example: 100 NFA nodes replace 10,000 DFA nodes
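The state growth can be reproduced with a standard textbook case, which is not the SNORT example from the slide: an NFA that accepts strings whose k-th symbol from the end is 'a' needs k + 1 states, while the equivalent DFA built by subset construction needs 2^k. The function names below are hypothetical and the code is a minimal sketch of that construction.

```python
from collections import deque
from typing import Dict, FrozenSet, Set, Tuple

Transitions = Dict[Tuple[int, str], Set[int]]


def kth_from_end_nfa(k: int) -> Transitions:
    """NFA with states 0..k accepting strings whose k-th symbol from the end is 'a'."""
    transitions: Transitions = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for state in range(1, k):
        # After the guessed 'a', any symbol just advances the count.
        transitions[(state, "a")] = {state + 1}
        transitions[(state, "b")] = {state + 1}
    return transitions


def count_dfa_states(transitions: Transitions, start: int,
                     alphabet: str = "ab") -> int:
    """Count the DFA states reachable by subset construction."""
    start_set: FrozenSet[int] = frozenset({start})
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            # One DFA state is the set of NFA states reachable on this symbol.
            target = frozenset(
                nxt for state in current
                for nxt in transitions.get((state, symbol), ())
            )
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return len(seen)


if __name__ == "__main__":
    for k in (3, 5, 10):
        nfa = kth_from_end_nfa(k)
        print(f"k={k}: NFA states={k + 1}, DFA states={count_dfa_states(nfa, 0)}")
```

The DFA has to remember which of the last k symbols were 'a', which is why its state count doubles with each increment of k while the NFA grows by one state.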

[Figure: the same pattern drawn as a Deterministic Finite Automaton (DFA) and as a Nondeterministic Finite Automaton (NFA); edges are labeled with symbols and symbol classes over A, C and U, such as ^A, ^C, ^[AU], ^[AC] and ^[ACU].]

Page 12: Automata Processing: Accelerating Big Data

Programmer Productivity

[Diagram: Pattern #1, Pattern #2, … Pattern ’n’]

Parallelization of automatons requires no special consideration by the user. Each automaton operates independently upon the input data stream.
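A rough software analogy of independent automata sharing one input stream is shown below. The class `LiteralAutomaton` and the function `scan` are hypothetical names made up for this sketch, and the inner loop runs the automata one after another, whereas the slide describes them operating independently on the device.

```python
from typing import List, Set, Tuple


class LiteralAutomaton:
    """Tiny automaton that reports every occurrence of one literal pattern."""

    def __init__(self, name: str, pattern: str) -> None:
        if not pattern:
            raise ValueError("pattern must be non-empty")
        self.name = name
        self.pattern = pattern
        self.active: Set[int] = set()  # pattern positions matched so far

    def step(self, symbol: str) -> bool:
        """Consume one symbol; return True if the pattern ends here."""
        next_active: Set[int] = set()
        # Position 0 is always enabled so a match can start at any offset.
        for position in self.active | {0}:
            if self.pattern[position] == symbol:
                next_active.add(position + 1)
        matched = len(self.pattern) in next_active
        next_active.discard(len(self.pattern))
        self.active = next_active
        return matched


def scan(stream: str, automata: List[LiteralAutomaton]) -> List[Tuple[int, str]]:
    """Feed each symbol to every automaton; collect (end offset, name) hits."""
    hits: List[Tuple[int, str]] = []
    for offset, symbol in enumerate(stream):
        for automaton in automata:  # no automaton depends on another
            if automaton.step(symbol):
                hits.append((offset, automaton.name))
    return hits


if __name__ == "__main__":
    patterns = [LiteralAutomaton(p, p) for p in ("buy", "sell", "ell")]
    print(scan("buy sell buy", patterns))
```

Adding another pattern means appending one more automaton to the list; none of the existing automata needs to change.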


Page 13: Automata Processing: Accelerating Big Data

Automata Processor Positioning

[Chart: one axis runs from Structured / Mathematical / Floating Point to Unstructured / Random / Comparison; the other from Low Parallelism to High Parallelism; regions labeled CPU and GPGPU.]

• The Automata Processor excels where the demand for highly parallel processing and unstructured data intersect.

Example: String matching from data services (email, Twitter, Facebook, voice communications, etc.) to provide:

• Sentiment analysis (Financial Services)

• Evidentiary finding (Legal Services)

• Threat detection (Security Services)

Page 14: Automata Processing: Accelerating Big Data

Example: Bioinformatics

• Massively parallel problem space

• Human genome: mapping ~100 base pair reads to a 3.2 billion base pair reference genome

• Comparisons across genomes

• Prosite protein sequence patterns mapped to Micron Automata Processor

Professor Srinivas Aluru is leading research on Automata Processors in bioinformatics applications.
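PROSITE-style patterns are essentially small automata, which is why they are a natural fit for pattern-matching hardware. As a hedged illustration of what such a pattern expresses, the sketch below translates a subset of the PROSITE syntax into a Python regular expression. The function name `prosite_to_regex` and the sample pattern are assumptions made for this example; the sketch says nothing about how patterns are actually compiled for the Automata Processor.

```python
import re

# One PROSITE element: x, a residue, [allowed], or {forbidden},
# optionally followed by a repeat such as (3) or (2,4).
_ELEMENT = re.compile(
    r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?"
)


def prosite_to_regex(pattern: str) -> str:
    """Translate a subset of PROSITE pattern syntax into a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):   # anchored at the sequence start
            prefix, element = "^", element[1:]
        if element.endswith(">"):     # anchored at the sequence end
            suffix, element = "$", element[:-1]
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # any residue except these
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


if __name__ == "__main__":
    # Illustrative pattern written in PROSITE syntax.
    sample = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H"
    regex = prosite_to_regex(sample)
    print(regex)
    print(bool(re.search(regex, "AA" + "C" + "AA" + "C" + "AAA" + "L" + "A" * 8 + "H" + "AAA" + "H")))
```

Only the common elements are handled here; anything outside that subset raises a `ValueError` rather than being silently mistranslated.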

Page 15: Automata Processing: Accelerating Big Data


Breakthrough Performance

Planted Motif Search Problem

| Automata Processor | UCONN - BECAT Hornet Cluster

Processors | 48 (PCIe Board) + CPU | 48 CPU (Cluster/OpenMPI)

Power | 245W-315W [1] | >2,000W [1]

Cost | TBD | ~$20,000 [1]

Performance (25,10) | 12.26 minutes [2] | 20.5 minutes

Performance (26,11) | 13.96 minutes [2] | 46.9 hours

Performance (36,16) | 36.22 minutes [2] | Unsolved

[1] Micron Technology estimates, not including memory of 4GB DRAM per core.

[2] Research conducted by Georgia Tech (Roy/Aluru).
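For scale, the ratios implied by the two problem sizes that both systems solved can be worked out directly from the figures on this slide. The snippet below is only arithmetic on those reported numbers; the variable and function names are made up for the example.

```python
# Run times taken from the slide, converted to minutes.
AP_MINUTES = {"(25,10)": 12.26, "(26,11)": 13.96}
CLUSTER_MINUTES = {"(25,10)": 20.5, "(26,11)": 46.9 * 60}  # 46.9 hours


def speedups() -> dict:
    """Cluster time divided by Automata Processor time, per problem size."""
    return {size: CLUSTER_MINUTES[size] / AP_MINUTES[size] for size in AP_MINUTES}


if __name__ == "__main__":
    for size, ratio in speedups().items():
        print(f"{size}: about {ratio:.1f}x")
```

The ratio is roughly 1.7x at (25,10) and roughly 200x at (26,11). No ratio can be computed for (36,16), since the cluster result is listed as unsolved.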

Planted Motif Search - a leading “NP Complete” problem in bioinformatics

Solutions involving high match lengths and substitution counts are often presented to HPC clusters for processing

Independent research predicts the Automata Processor significantly outperforms a multi-core HPC cluster in speed, power and estimated cost
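To make the problem concrete, here is a brute-force sketch of planted motif search, assuming the pairs in the table are (l, d): motif length l and at most d substitutions. It looks for every l-mer that occurs in each input sequence with at most d mismatches. This is a naive textbook approach with hypothetical function names, not the method used in the Georgia Tech research, and it is only practical for small l and d.

```python
from itertools import combinations, product
from typing import List, Set


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def neighbors(kmer: str, d: int, alphabet: str = "ACGT") -> Set[str]:
    """All strings within Hamming distance d of kmer."""
    result = {kmer}
    for count in range(1, d + 1):
        for positions in combinations(range(len(kmer)), count):
            for letters in product(alphabet, repeat=count):
                candidate = list(kmer)
                for position, letter in zip(positions, letters):
                    candidate[position] = letter
                result.add("".join(candidate))
    return result


def occurs_within(motif: str, sequence: str, d: int) -> bool:
    """True if some window of sequence is within d mismatches of motif."""
    length = len(motif)
    return any(
        hamming(motif, sequence[i:i + length]) <= d
        for i in range(len(sequence) - length + 1)
    )


def planted_motifs(sequences: List[str], length: int, d: int) -> Set[str]:
    """All motifs of the given length found in every sequence with <= d mismatches."""
    first, rest = sequences[0], sequences[1:]
    # Any valid motif must be a d-neighbor of some window of the first sequence.
    candidates: Set[str] = set()
    for i in range(len(first) - length + 1):
        candidates |= neighbors(first[i:i + length], d)
    return {m for m in candidates if all(occurs_within(m, s, d) for s in rest)}


if __name__ == "__main__":
    reads = ["TTACGTACTT", "GGACCTACGG", "CCAAGTACCC"]
    print("ACGTAC" in planted_motifs(reads, 6, 1))
```

The candidate set grows combinatorially with l and d, which hints at why large instances such as those in the table are handed to clusters or specialized hardware.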

Page 16: Automata Processing: Accelerating Big Data

Problems Aligned with the Automata Processor

Applications requiring deep analysis of data streams containing spatial and temporal information are often impacted by the memory wall and will benefit from the processing efficiency and parallelism of the Automata Processor.

• Network Security: millions of patterns, real-time results, unstructured data

• Bioinformatics: large operands, complex patterns, unstructured data

• Video Analytics: highly parallel operation, real-time results, unstructured data

• Data Analytics: highly parallel operation, real-time results, unstructured data

Page 17: Automata Processing: Accelerating Big Data

Automata Processor: Support & Tools

PCIe Development Board

• Industry standard PCIe bus interface

• Capacity for up to 48 APs

• Large FPGA capacity

• DDR3 for local storage

Workbench Tool

• Converts schematic automata to Micron ANML description language

Software Development Kit

• AP optimization, loading and debugging tools, and compiler

Page 18: Automata Processing: Accelerating Big Data