(future) in memory enterprise
8/18/2019 (Future) in Memory Enterprise
WHITE PAPER
In-Memory Database Technology: A Critical Success Factor for the Real-Time Enterprise
Sponsored by: SAP
Carl W. Olofson
November 2012
IDC OPINION
No one needs to be told that the pace of business for most enterprises is increasing
exponentially year after year and that new technology is needed for enterprises to
keep up with both the speed of business and the flood of data that can lead to new
business opportunities. To respond to this challenge, an enterprise must do the
following:
Adopt a strategy that enables reinvention as a real-time enterprise, capable of
transacting business at the required pace and of taking maximum advantage of
new business opportunities as they arise.
Develop a platform capable of coordinating all the IT assets of the enterprise
while flexibly adapting to changing business conditions.
Ensure that the platform is undergirded by an in-memory database (IMDB)
because only an IMDB is capable of keeping up with the speed and flexibility
demands of the real-time enterprise.
Consider adopting the SAP Real-Time Data Platform, which is grounded in SAP's
IMDB technology, SAP HANA.
IN THIS WHITE PAPER
This white paper explores the issue of high performance in data management and the
need for such performance in implementing a real-time enterprise. It describes
the needs of the real-time enterprise and how they are best met by an in-memory
database, which is managed by memory-based database management system
(DBMS) technology. The document contrasts memory-based DBMS technology with
disk-based DBMS technology, showing how the memory-based approach delivers
much greater performance and simpler management than the disk-based approach
without sacrificing database recoverability. It also shows how an in-memory database,
managed by a memory-based DBMS, fits within the framework of the real-time
enterprise.
This paper also discusses the architecture of a data and decision management platform
that includes in-memory database technology. Further, it suggests SAP HANA as
an example of such in-memory database technology and the possible role of SAP's
Real-Time Data Platform in the real-time enterprise.
Global Headquarters: 5 Speen Street, Framingham, MA 01701 USA   P.508.872.8200   F.508.935.4015   www.idc.com
2 #237635 ©2012 IDC
SITUATION OVERVIEW
The Real-Time Enterprise Challenge
Enterprises are struggling to keep up with the demands of business in a globally,
electronically connected world. Not only do events that require a response come at an
ever-quickening pace, but business intelligence (BI) about customers and competitors may be
gleaned from vast amounts of ever-changing data on the Internet, if only it can be rapidly
acquired and processed. Business managers know that their competitors are chasing this
intelligence as fast as they are.
How can a modern enterprise keep up with the pace of business demands and also
gain an edge from timely business intelligence? The key is in the ability to acquire and
ingest all the necessary data and to act on that data with as little delay as possible.
This means dealing with streaming data from external sources and sensor and other
machine-generated data from internal sources. It means adopting Big Data
technologies to collect and ingest large amounts of business intelligence data quickly.
Additionally, it means having a core data management platform that can enable
applications to put all the pieces together and act in a timely manner. Memory-based
database technology powering an IMDB needs to be at the heart of such a platform.
What Is a Real-Time Enterprise?
A real-time enterprise can act on events as they happen rather than wait for relevant
information to be entered into the system, stored, compiled, and made available for
query and reporting. A real-time enterprise needs to handle shifting external factors
such as customer demand, supplier pricing and product availability, and operational
costs as well as internal factors such as logistics, inventory, and production rates.
This is not simply about producing an "executive dashboard." This is about
marshalling IT resources based on the current situation on the ground as it changes.
It is about automated decision generation and action and about supporting just-in-time
tactical decisions. It affects both transactional operational systems and analytic
systems, including operational BI.
Big Data in Motion and at Rest
Part of what drives the real-time enterprise is Big Data. Most people are familiar with
Big Data at rest, which involves collecting large amounts of data that may be either
streaming, machine-generated data or unorganized collections of content; filtering,
ordering, and formatting that data; and making the data available to drive decisions
and actions. Hadoop falls into this category, and most Big Data at rest solutions,
including those based on Hadoop, are batch oriented or have a batch component to
them. As a result, there is always a delay of minutes to hours between when the data
arrives and when it is available to drive actions. For some classes of decisions and
actions, this is fine; for others, it is not.
Big Data in motion is different. This also involves large amounts of streaming data,
but instead of ingesting the data and then examining it, this approach involves
recognizing events in the data and taking immediate action. Such technology
generally involves the use of a complex event processing (CEP) engine that drives
actions in response to defined complex events. Of course, recognizing such events
often requires context, so such systems need to be able to reference facts or patterns
of facts that have previously occurred, such as those collected by Big Data at rest.
Thus, the two types of Big Data are complementary.
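As a rough illustration of how the two forms of Big Data complement each other, the following Python sketch recognizes a simple complex event in a stream (several large purchases by one customer within a short window) and then consults previously collected context to choose an action. The event shape, the thresholds, the action names, and identifiers such as `ComplexEventDetector` and `known_profiles` are assumptions made for this example; they do not describe any particular CEP product.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    customer: str
    amount: float
    timestamp: float  # seconds


class ComplexEventDetector:
    """Toy CEP rule: N large purchases by one customer inside a time window."""

    def __init__(self, context, window_seconds=60.0, min_count=3, large_amount=500.0):
        self.context = context            # "data at rest": facts collected earlier
        self.window_seconds = window_seconds
        self.min_count = min_count
        self.large_amount = large_amount
        self.recent = defaultdict(deque)  # customer -> timestamps of large purchases

    def on_event(self, event):
        """Process one streaming event; return an action string or None."""
        if event.amount < self.large_amount:
            return None
        window = self.recent[event.customer]
        window.append(event.timestamp)
        # Drop purchases that have fallen out of the sliding window.
        while window and event.timestamp - window[0] > self.window_seconds:
            window.popleft()
        if len(window) < self.min_count:
            return None
        # The complex event is recognized; stored context decides the action.
        window.clear()
        profile = self.context.get(event.customer, {})
        if profile.get("typical_spend", 0.0) >= self.large_amount:
            return f"offer-loyalty-upgrade:{event.customer}"
        return f"review-for-fraud:{event.customer}"


# Hypothetical context and stream, for illustration only.
known_profiles = {"alice": {"typical_spend": 800.0}, "bob": {"typical_spend": 40.0}}
detector = ComplexEventDetector(known_profiles)
stream = [
    Event("bob", 600.0, 0.0),
    Event("alice", 900.0, 5.0),
    Event("bob", 700.0, 10.0),
    Event("bob", 20.0, 12.0),
    Event("bob", 650.0, 30.0),
    Event("alice", 950.0, 200.0),
]
actions = [a for a in (detector.on_event(e) for e in stream) if a is not None]
print(actions)
```

The same burst of purchases leads to different actions depending on the stored profile, which is the sense in which event recognition depends on context held at rest.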
Right-Time Decision and Actions
For real-time decisions to be made and actions to be taken, a system that combines the
two forms of Big Data must be able to hold at least the most immediately relevant
elements of the Big Data at rest in a database that can respond immediately to requests.
For reasons that will soon become apparent, such data should reside in an IMDB.
Classic operational applications are designed to perform preprogrammed functions in
a fixed sequence, with a few conditions here and there modifying that sequence
slightly. A real-time enterprise requires applications that are driven by decisions, most
of them automated, based on the right criteria. Some of the criteria will call for
up-to-the-minute data, while others can tolerate various levels of latency.
In addition to the automated decisions of operational applications, human decisions
also vary in terms of their complexity, the number of people involved, and the amount
of data and its required timeliness. Figure 1 illustrates the range of decisions that IT
systems need to support.
FIGURE 1
IDC's Decision Management Framework
[Figure: strategic, operational, and tactical decisions positioned along the axes Degree of Automation, Scope and Degree of Risk, Level of Collaboration, and Number of Decisions.]
Strategic decisions set the long-term directions for the organization, a product, a service, or an initiative and result in guidelines within which operational decisions are made.
Operational decisions focus on a specific project or process and result in the formation of a type of policy or rule that drives tactical decisions.
Tactical decisions must apply the policy or rule in a specific case, which lends itself to automation.
Source: IDC, 2012
As may be seen, this framework highlights three decision types:
Strategic decisions tend to involve the collection and digestion of large amounts
of data over a considerable period of time, with lots of collaboration among
interested parties. A data warehouse is usually involved. The smallest number of
total decisions are strategic.
Operational decisions involve choices made within the framework of the larger
strategy, usually by line managers. They require somewhat less data, but on a
more timely basis. They may turn to data marts and to some Big Data analysis.
This type of decision ranks second in terms of the total number of decisions made.
Tactical decisions are made "on the ground" and in the heat of action. They must
be made immediately, based on key relevant information that has just arrived,
compared and processed against known patterns and facts. This involves streaming
data and a CEP engine to drive the decisions by examining held data that must
reside in an extremely low-latency database (that is, an IMDB). This represents by
far the largest number of decisions made.
Enterprise Information Management
To nimbly manage all these levels of decision making, an enterprise must have an
enterprise information management (EIM) strategy that embraces a means of defining
and coordinating data across a variety of different types of databases and data stores.
Figure 2 illustrates the kinds of data management involved. Note that in this figure,
"automated" and "real-time" decisions are both examples of "operational decisions," as
referenced in Figure 1.
FIGURE 2
Decision Management in an EIM Context
[Figure: an EIM Metadata Hub and Coordinator shown with four kinds of data management: Streaming + CEP (Big Data in Motion), Large Volume Volatile Data (Big Data at Rest), Large Volume Static Data (Data Warehouse + CM), and Fixed Operational Data (Transactional DBMS). The decision labels in the figure are Real-Time Decisions, Tactical Decisions, Strategic Decisions, and Automated Decisions.]
Source: IDC, 2012
An EIM strategy requires a unified data management platform, undergirded by a
system of data coordination among the databases in the environment, driven by
common metadata. It should be noted that for automated and real-time decisions to
be executed at the right time, contextual data must be available in an instant. This
requires IMDB technology.
In-Memory Database Technology
In-memory database technology involves maintaining current live data in memory
rather than on disk. The data of record is actually the data in memory all the time.
Such an approach requires a memory-based DBMS rather than a disk-based DBMS.
Memory-Based DBMS Versus Disk-Based DBMS
Most DBMS products in use today are disk-based DBMS products. They are
optimized for the management of data on disk and minimize disk I/O waits — a major
design point — while ensuring consistency based on the committed data in the
database. A memory-based DBMS, by contrast, is optimized for the organization and
management of data in memory rather than on disk. This does not mean that there is
no disk in the picture at all. Spinning disk is often used for the transaction log (to
ensure recoverability), and for very large databases, seldom-used data may be paged
to disk to free up memory.
Disk-Based DBMS
A disk-based DBMS manages the data in memory for purposes of mapping it to disk.
This means that all the data must be seen as copies of data on disk and that when
the data is changed, it needs to be written back to disk at some point. Because data
is stored on disk in ways that optimize I/O speed, it is not organized in ways that
make it easy to move from one table row to another in memory. Every time the
database server looks for a row, it needs to figure out where that row belongs on disk,
and then whether or not it is already in the buffer. If the row is not in the buffer, the
database server needs to flush the buffer and load the needed data.
As a result, most of the instructions the DBMS executes to respond to a data request
have nothing to do with the request and everything to do with disk and buffer
management. In fact, even for a simple query, the DBMS executes about 10
instructions for disk and buffer management for every one instruction that actually
involves getting the data and returning it to the requester.
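To make the contrast concrete, here is a minimal Python sketch of the lookup path in each style of system. It is a toy model, not a description of how any real DBMS is implemented: the page size, the LRU eviction policy, and the class names `DiskBasedStore` and `MemoryStore` are assumptions chosen for illustration. The step counter only gestures at the ratio of bookkeeping to useful work described above and is not meant to reproduce the 10-to-1 figure.

```python
from collections import OrderedDict

PAGE_SIZE = 4  # rows per page (hypothetical)


class DiskBasedStore:
    """Rows live in pages on 'disk'; a small LRU buffer pool caches pages."""

    def __init__(self, rows, buffer_pages=2):
        self.disk = {}  # page_id -> {row_id: value}
        for row_id, value in rows.items():
            self.disk.setdefault(row_id // PAGE_SIZE, {})[row_id] = value
        self.buffer = OrderedDict()  # page_id -> page, kept in LRU order
        self.buffer_pages = buffer_pages
        self.bookkeeping_steps = 0
        self.page_loads = 0

    def get(self, row_id):
        page_id = row_id // PAGE_SIZE              # map the row to its disk page
        self.bookkeeping_steps += 1
        if page_id in self.buffer:                 # page already buffered?
            self.buffer.move_to_end(page_id)       # refresh its LRU position
            self.bookkeeping_steps += 2
        else:
            self.bookkeeping_steps += 1
            if len(self.buffer) >= self.buffer_pages:
                self.buffer.popitem(last=False)    # evict least recently used page
                self.bookkeeping_steps += 1
            self.buffer[page_id] = self.disk[page_id]  # load the needed page
            self.page_loads += 1
            self.bookkeeping_steps += 1
        return self.buffer[page_id][row_id]        # the only step serving the request


class MemoryStore:
    """The in-memory structure is the database; a lookup goes straight to the row."""

    def __init__(self, rows):
        self.rows = dict(rows)

    def get(self, row_id):
        return self.rows[row_id]


rows = {i: f"row-{i}" for i in range(16)}
disk_store = DiskBasedStore(rows)
mem_store = MemoryStore(rows)
lookups = (0, 1, 5, 2, 9, 5)
for row_id in lookups:
    assert disk_store.get(row_id) == mem_store.get(row_id)
print(disk_store.page_loads, disk_store.bookkeeping_steps, len(lookups))
```

Both stores return the same rows; the difference is that every lookup in the disk-oriented model pays for page mapping and buffer management before it reaches the data.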
Memory-Based DBMS
A memory-based DBMS operates quite differently from a disk-based DBMS. Its normal
operations involve very little I/O, so optimization focuses on keeping the number of
computer instructions executed to a minimum. Currently, memory-based DBMS
products vary widely with respect to the way they organize the database in memory.
For a memory-based DBMS, each session still has its own buffer, but the database
itself serves as the standard or common buffer. Such a DBMS requires no reading
from disk, no mapping to disk, no pages, no flushes. When changed data is
committed, that data is simply copied to the database in memory, and the operation is
complete. In some cases, the database may be too big to manage in memory or the
cost of the total amount of memory required may be unacceptable. In such cases, the
least volatile data is relegated to disk, and the entire system operates as if all the
database were in memory, with background services synchronizing the disk-swapped
data. This way, database operations are not slowed down by the swapping activity.
Because a memory-based DBMS avoids disk mapping and buffer management
altogether, a memory-based database that has all the required data in memory will
run an average of at least 10 times faster than a disk-based database, even when
the disk-based database is held entirely in buffer memory.
In-Memory Database
Some memory-based DBMSs are designed to handle databases that are larger than
the available memory by swapping the least frequently used data to disk. An IMDB is
a database that is optimized by a memory-based DBMS so that the entire database
may be kept in memory. Examples exist of IMDBs that show upwards of 10 times
performance improvements over disk-based databases containing the same data,
and executing the exact same workloads, even when the disk-based databases are holding
all their data in buffer so there is no I/O except for the transaction log. Some IMDBs
show upwards of 200 times performance improvements for both query and update.
These results vary, of course, based on architecture and workload.
IMDB Recoverability Techniques
One objection often raised to a memory-based database, and especially an IMDB, is
that of recoverability. It is assumed that a disk-based database is more recoverable
because its data resides on disk. This is simply not true. Most modern disk-based
databases keep the majority of their data in buffers most of the time, writing changes
to a log. If the system fails, one must bring it up again and use the transaction log to
write all the changes, in sequence, to disk before the system may be used.
An IMDB also has a transaction log. In many cases, the database server writes
changed data to the transaction log and takes periodic background snapshot backups
(which don't take cycles from the database server itself). Recovery consists of
reloading the snapshot and rolling forward the logged transactions, which, in an
in-memory system, takes far less time than when writes to disk are involved.
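The snapshot-plus-log scheme can be sketched in a few lines of Python. This is a simplified model under stated assumptions: ordinary Python objects stand in for the durable log and the snapshot backup, the snapshot is taken inline rather than in the background, and the names `InMemoryDB`, `take_snapshot`, and `recover` are hypothetical.

```python
import copy


class InMemoryDB:
    """Key-value store whose data of record is the in-memory dict."""

    def __init__(self):
        self.data = {}
        self.log = []            # stand-in for the durable transaction log
        self.snapshot = {}       # stand-in for the last snapshot backup
        self.snapshot_lsn = 0    # number of log records covered by the snapshot

    def commit(self, key, value):
        # Log the change first, then apply it to memory.
        self.log.append((key, value))
        self.data[key] = value

    def take_snapshot(self):
        # A real system would do this in the background.
        self.snapshot = copy.deepcopy(self.data)
        self.snapshot_lsn = len(self.log)

    @classmethod
    def recover(cls, snapshot, snapshot_lsn, log):
        """Rebuild memory: reload the snapshot, then roll the log forward."""
        db = cls()
        db.data = copy.deepcopy(snapshot)
        for key, value in log[snapshot_lsn:]:
            db.data[key] = value
        db.log = list(log)
        db.snapshot = copy.deepcopy(snapshot)
        db.snapshot_lsn = snapshot_lsn
        return db


db = InMemoryDB()
db.commit("sku-1", 10)
db.commit("sku-2", 7)
db.take_snapshot()
db.commit("sku-1", 9)      # logged after the snapshot
db.commit("sku-3", 4)

# Simulate a crash: memory is lost, but the snapshot and the log survive.
restored = InMemoryDB.recover(db.snapshot, db.snapshot_lsn, db.log)
print(restored.data == db.data)
```

Only the log records written after the snapshot are replayed, so in this model the roll-forward work depends on how recently a snapshot was taken.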
This is not the only method of recovery. Many IMDBs are deployed on clusters of
servers that act as data servers as well as standby servers for each other. Still others
combine this approach with asynchronous log writes. This means that they don't wait
for the write to complete successfully before continuing. In the worst case, where the
entire cluster goes down, the log, together with snapshot backups, can be used to
recover the database within some designated interval before failure.
If the interval is zero, then the writes are, effectively, synchronous. Otherwise,
database operations are timed so that delays occur only if the log gets backed up and
can't meet the interval requirement. Most databases can tolerate a small interval of
potentially lost data (the system would need to be completely busy at the time of
failure for data to be lost); even if very small intervals are allowed, the performance
boost is huge.
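The interval-bounded asynchronous logging described above can be modeled with a small deterministic simulation. The sketch below is an assumption-laden illustration, not any vendor's mechanism: time is passed in explicitly instead of read from a clock, `flush` stands in for the background log writer, and the class name `BoundedLagLog` and its fields are hypothetical.

```python
class BoundedLagLog:
    """Asynchronous log writer that bounds how stale the durable log may get."""

    def __init__(self, max_lag_seconds):
        self.max_lag_seconds = max_lag_seconds
        self.pending = []   # (commit_time, record) pairs not yet durable
        self.durable = []   # records safely written to the log
        self.stalls = 0     # commits that had to wait for the log to catch up

    def flush(self):
        """Make all pending records durable (the background writer's job)."""
        self.durable.extend(record for _, record in self.pending)
        self.pending.clear()

    def commit(self, record, now):
        # If the oldest pending record has reached the allowed interval,
        # this commit must wait until the log has caught up.
        if self.pending and now - self.pending[0][0] >= self.max_lag_seconds:
            self.flush()
            self.stalls += 1
        self.pending.append((now, record))
        if self.max_lag_seconds == 0:
            # Zero interval: every commit waits for its own write (synchronous).
            self.flush()

    def at_risk(self):
        """Records that would be lost if the whole system failed right now."""
        return [record for _, record in self.pending]


log = BoundedLagLog(max_lag_seconds=0.5)
for t, rec in [(0.0, "t1"), (0.1, "t2"), (0.2, "t3")]:
    log.commit(rec, now=t)
log.flush()                          # background writer keeps up: no stall
for t, rec in [(1.0, "t4"), (1.2, "t5"), (1.6, "t6")]:
    log.commit(rec, now=t)           # writer fell behind: the last commit stalls
print(log.durable, log.at_risk(), log.stalls)

sync_log = BoundedLagLog(max_lag_seconds=0)
sync_log.commit("t1", now=0.0)
print(sync_log.at_risk())
```

In this model, commits are delayed only when the log falls behind the configured interval, and an interval of zero leaves nothing at risk because each commit waits for its own write.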
Benefits of Memory-Based Database Technology
Those who adopt memory-based database technologies can expect to realize the
following key benefits:
Improvements of 10 to 200 times in terms of throughput even against a fully
optimized disk-based database system. This means more and faster tactical
decisions, more and better operational decisions, and far wider analysis
capability for strategic decisions.
Ability to incorporate Big Data into decision-making activities — from real-time
decisions that exploit immediate sales opportunities, improve service, and avoid
risk to predictive analysis to reveal future market opportunities.
Far less disk space is required for the memory-based database, and especially
for an IMDB, resulting in a cost savings there.
Although a memory-based solution will require more main memory and
processors than a corresponding disk-based system, the total memory increase
is not as great as one might think because a good deal of duplicated memory
and overhead is eliminated. This is therefore not a matter of simply substituting
memory for disk; the net hardware cost difference should be favorable even when
taking memory cost into account.
A great deal of staff time is required to manage a disk-based database. DBAs
spend a lot of time on activities such as building and rebuilding indexes, mapping
data to partitions, unloading and reloading data, reallocating data across
volumes, and so on. All this effort is eliminated, along with the corresponding
storage management effort.
A memory-based database is far more nimble than a disk-based database. When
schema changes or rapid data growth occur in a disk-based system, they
generally require a reorganization that involves heavy disk volume work.
Memory-based databases, and especially IMDBs, can generally adjust fairly
dynamically to both schema changes and data growth, though the latter case
may require adding servers to the cluster.
The Right-Time Data Solution
Memory-based database technology is essential to enabling the application
environment to keep up with the pace of business. An IMDB is the only kind of
database that can be used with a streaming data-driven system without slowing it
down. The speed, simplicity, and nimble nature of memory-based databases, and
IMDBs in particular, make them critical success factors for achieving the real-time
enterprise, as illustrated in Figure 3.
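FIGURE 3
Elements of a Real-Time Enterprise
[Figure: diagram whose labeled elements are Streaming Data, CEP Engine, Business Action, Hadoop, Immediate Query, Data Warehouse, Enterprise Analytics, IMDB, High-Speed Transactions and Automated Decisioning, and Real-Time BI.]
Source: IDC, 2012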
Examples of the Real-Time Enterprise in Action
The following examples of real-time enterprise applications illustrate the principles
outlined previously. They are not specific use cases, but they incorporate elements of
a number of use cases with which IDC is familiar.
Real-time retailing. A retail firm tracks inventory in its stores by capturing RFID
information from pallets as they arrive at the loading dock and sales at
point-of-sale (POS) terminals. The sales data also reveals volumes and patterns of
sales moment by moment. Data about sales and inventory by store is loaded into
an IMDB, and analytic software determines whether changes in sales volumes
suggest a price change and whether trends compared with inventory suggest the
need to restock, and if so, which stores and from which warehouses. Such
operations minimize inventory-related costs and maximize competitiveness
without ill-considered price changes up or down.
Real-time trading. A portfolio management company maintains portfolio
holdings and rules in an IMDB. Such rules govern how frequently trades may
occur, how much risk is tolerated, what kinds of issues are to be considered for
inclusion in the portfolio, etc. The firm also receives streaming data about stock
trades. It records data about issues of interest in the IMDB and looks for
interesting trends in share prices. When trends are found, it compares changes
and the algorithmic buy-or-sell suggestion against each portfolio to determine
whether, for that portfolio, a trade is warranted based on its rules. If so, the trade
is executed. All this is done in milliseconds.
Real-time logistics. A trucking firm receives real-time data about the location
and condition of each truck on the road as well as traffic conditions, which it
maintains in an IMDB, along with the contents of all trucks on the road and their
delivery routes and schedules. Changes in traffic, new or canceled delivery
orders, exogenous events such as accidents or breakdowns, and the fuel level of
each truck affect orders that may be pushed out to the drivers that change their
routes and schedules, all subject to moment-by-moment change. The result is
more nimble pickup and delivery, faster response to problems, optimal truck
routing, and fuel cost savings.
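The real-time trading example can be sketched as follows: portfolio rules are held in memory, each incoming tick is recorded, a trend rule produces a buy-or-sell suggestion, and the suggestion is checked against every portfolio's rules before a trade is recorded. Everything specific here is an assumption for illustration, including the three-tick trend rule, the 2% threshold, the risk score supplied with each tick, and the names `Portfolio`, `suggest`, and `on_tick`. It is not a trading strategy and does not reflect any SAP or IDC design.

```python
from dataclasses import dataclass, field


@dataclass
class Portfolio:
    name: str
    allowed_symbols: set
    max_risk: float                    # highest risk score the rules tolerate
    min_seconds_between_trades: float  # how frequently trades may occur
    last_trade_time: float = float("-inf")
    trades: list = field(default_factory=list)


def suggest(prices, threshold=0.02):
    """Toy trend rule: suggest 'buy' or 'sell' on a move beyond the threshold."""
    if len(prices) < 2:
        return None
    change = (prices[-1] - prices[0]) / prices[0]
    if change >= threshold:
        return "buy"
    if change <= -threshold:
        return "sell"
    return None


def on_tick(symbol, price, risk, now, history, portfolios):
    """Record the tick in memory, then test the suggestion against each portfolio."""
    history.setdefault(symbol, []).append(price)
    action = suggest(history[symbol][-3:])
    if action is None:
        return []
    executed = []
    for p in portfolios:
        if symbol not in p.allowed_symbols:
            continue  # issue not eligible for this portfolio
        if risk > p.max_risk:
            continue  # too risky under this portfolio's rules
        if now - p.last_trade_time < p.min_seconds_between_trades:
            continue  # trading too frequently
        p.trades.append((action, symbol, price))
        p.last_trade_time = now
        executed.append(p.name)
    return executed


portfolios = [
    Portfolio("growth", {"ACME", "INIT"}, max_risk=0.8, min_seconds_between_trades=1.0),
    Portfolio("cautious", {"ACME"}, max_risk=0.3, min_seconds_between_trades=60.0),
]
history = {}
ticks = [("ACME", 100.0, 0.5, 0.0), ("ACME", 101.0, 0.5, 1.0),
         ("ACME", 103.0, 0.5, 2.0), ("ACME", 104.0, 0.5, 2.5)]
results = [on_tick(s, p, r, t, history, portfolios) for s, p, r, t in ticks]
print(results)
```

Because both the rules and the recent price history sit in memory, each tick is evaluated against every portfolio without any storage round trip, which is the property the example above relies on.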
SAP IMDB Technology
SAP is taking a multifaceted approach to the challenge of providing memory-based
DBMS technology to its customers. For SAP application customers, the company
offers SAP HANA, an IMDB aimed at providing very fast and efficient data operations
for both analytical and operational workloads. As of this writing, SAP HANA primarily
supports SAP's analytic applications, but the company plans to deliver transactional
support for ERP customers by the end of the year.
At the present time, SAP HANA is not a standalone DBMS; rather, it is used in
conjunction with a full-featured relational DBMS (RDBMS). SAP recommends use of
SAP Sybase IQ as the storage database for HANA's analytic workloads and SAP
Sybase Adaptive Server Enterprise for the ERP workloads. Going forward, the
company plans to bring these technologies closer together to form the SAP
Real-Time Data Platform.
Nonetheless, using SAP applications with SAP HANA positions the user to address
the extreme performance and flexibility demands of the real-time enterprise.
FUTURE OUTLOOK
The idea that memory-based approaches represent the future of database technology is
no longer seriously disputed. The only questions have to do with the proper form such
technology should take, and in most cases, the answer tends to vary depending on the
data management problem one is trying to solve. The result is that over the next few
years we will see a number of different memory-based DBMSs enter the market. Some
will focus on small but very complex analytic workloads. Some will focus on large analytic
workloads. Others will focus on transactional workloads, especially those that have some
real-time dimension. Some will come from established DBMS vendors, and others will
arise from start-ups that no one has ever heard of before.
The challenge for users is to sift through the different memory-based DBMSs and find
those that most effectively address the problem in question and that come from a
vendor that can be trusted to be around for a while — preferably one with a track
record of success. Users should not expect to standardize on one memory-based
DBMS, much less one IMDB, for all workloads; rather, they should pick the right tool
for the job and look to strategic data integration applied for enterprise information
management to reconcile and coordinate the data.
Cloud Architectures
As memory-based database technologies, and especially IMDBs, evolve, they are
perfect for both public and private cloud deployments because they eliminate the
classic problem of disk-based DBMSs: dedicated, hard-to-move or hard-to-change
storage assets. They fully embrace the cloud concept of resource fungibility that is the
whole point of cloud virtualization. By contrast, disk-based DBMSs make cloud
management much more complicated and constrained.
Evolving Hardware Architectures
We should also bear in mind that just as memory-based DBMS software engineers
are evolving their technology to take maximum advantage of today's processor and
memory configurations, hardware engineers are working to evolve better processor
and memory configurations in service of IMDBs. This dance of innovation will result in
a dizzying evolution of this technology for some time to come.
CHALLENGES/OPPORTUNITIES
As has been mentioned, memory-based DBMS technologies will emerge from all
quarters — some old, some new. SAP will need to face the competitive challenges of
these technologies. This does not mean that SAP needs to eclipse them all. In some
cases, SAP's IMDB technology will find a synergistic relationship with specialized
memory-based DBMS technology offered by other vendors. In other cases, especially
those involving the core of the SAP Real-Time Data Platform, SAP's IMDB technology,
SAP HANA, must be seen as the clear choice to drive enterprise data.
CONCLUSION
In the coming years, there will be a growing emphasis on real-time processing as
a key to success in the global, Internet-driven business world, regardless of what
business one happens to be in. A critical success factor in addressing this
requirement is memory-based DBMS technology. Many such technologies will
emerge that emphasize, to various degrees, speed, volume, flexibility, resiliency,
reliability, and consistency. In making strategic decisions about the future direction of
enterprise applications, enterprises should consider the following:
Some memory-based DBMSs will be just perfect for specific workloads within the
enterprise. Absolute uniformity across the enterprise in this regard is not
required. Pick the best tool for the job.
All the data management for both operational applications and analytic workloads
should ultimately rest on a platform that enables consistent, reliable data to be
delivered where it is needed in a timely manner, where "timely" may range from
days to hours to microseconds.
Such a platform, at its heart, should have a single IMDB technology that can
handle the speed and variety requirements of the data in the platform.
The SAP Real-Time Data Platform, driven by SAP HANA, which offers an
evolutionary path from a disk-based RDBMS and includes IMDB technology, is a
possible candidate in this regard.
Copyright Notice
External Publication of IDC Information and Data — Any IDC information that is to be
used in advertising, press releases, or promotional materials requires prior written
approval from the appropriate IDC Vice President or Country Manager. A draft of the
proposed document should accompany any such request. IDC reserves the right to
deny approval of external usage for any reason.
Copyright 2012 IDC. Reproduction without written permission is completely forbidden.