
G00261663

Critical Capabilities for Data Quality Tools

Published: 18 February 2015

Analyst(s): Ted Friedman, Saul Judah

Data quality tools are important in a wide range of use cases, and providers in the data quality tools market exhibit varying strength across the functionality critical to these use cases. Buyers need to understand the relative strengths to make optimal selections of vendors and tools.

Key Findings

■ Significant growth in the data quality tools market is fueled by demand for deployments in a range of use cases.

■ From traditional operational/transactional data quality applications, to contemporary big data scenarios, data quality tool buyers are specifying an increasingly wide range of requirements.

■ The tools available in the market exhibit significantly different degrees of strength across an increasing range of requirements.

■ Buyers often struggle to find tools that can be readily applied to ever-changing use cases and functional needs.

Recommendations

Buyers considering data quality tools should:

■ Plan for the full range of use cases you need to support, both now and in the future.

■ Focus on core product capabilities that are commonly needed across many use cases.

■ Recognize that tool functionality is one dimension to evaluate.

■ Consider market presence, availability of skills, support and service capabilities as well as pricing.

What You Need to Know

The functional characteristics of data quality tools are an increasingly important dimension of evaluation on which buyers (including information management leaders, information governance-related stakeholders and data quality project leaders) need to focus. The critical capabilities for data quality tools defined in this research represent the key functional characteristics that are most relevant across the range of contemporary use cases to which data quality tools are applied.

Organizations need to understand the relative importance of each capability for the use cases they are broaching, and use this insight to assess each provider's suitability for supporting those use cases in the specific way and to the specific depth required. By leveraging the critical capabilities ratings and use-case scores, buyers can identify a set of providers that may be the best fit to deliver the product functionality necessary to succeed in their data quality improvement efforts.

The critical capabilities defined in this research and used to assess the vendors align with a subset of the product evaluation criteria in the corresponding "Magic Quadrant for Data Quality Tools." The ratings allocated to each vendor are driven primarily by customer feedback from a survey of reference customers provided by each vendor. The degree to which the customer base feels a given capability meets its needs, along with the frequency of usage across the customer sample, influences the rating.

Buyers in this market should carefully consider the requirements for the specific use cases they are facing, and take note of the relative degree of importance of each capability for those use cases. In the case of our chosen set of critical capabilities, all vendors score above the "meets requirements" (3) level for all use cases (see the Critical Capabilities Methodology section at the end of this research). This reflects the fact that these vendors represent the most functionally complete across the data quality tools market. In addition, the relevance of the vendors across all use cases is validated by the customer survey data, which shows that all vendors have customers actively using their tools against each of the use cases.

Buyers in the data quality tools market need to recognize that the critical capabilities assessed in this document represent a subset of the evaluation criteria Gartner recommends when selecting vendors and tools. Therefore, the rankings of vendors expressed here do not represent overall vendor positioning in the market, and do not always align with positioning of vendors in the corresponding Magic Quadrant. While certain vendors score consistently strongly based on customer feedback about the critical capabilities assessed here, product capabilities do not provide a complete vendor and tool evaluation on their own. Organizations must also consider each vendor's market presence, track record, financial and organizational strength, availability of skills, product support, and the depth of its professional services.

Gartner recommends that buyers complement this Critical Capabilities assessment with our "Magic Quadrant for Data Quality Tools" (to understand the vendor landscape beyond product capabilities) and our "Toolkit: RFP Template for Data Quality Tools" (to ensure visibility to the complete breadth of product capabilities relevant in this market).

Page 2 of 26 Gartner, Inc. | G00261663

Analysis

Critical Capabilities Use-Case Graphics

Figure 1. Vendors' Product Scores for Master Data Management Use Case

Source: Gartner (February 2015)

Figure 2. Vendors' Product Scores for Operational/Transactional Data Quality Use Case

Source: Gartner (February 2015)

Figure 3. Vendors' Product Scores for Information Governance Initiatives Use Case

Source: Gartner (February 2015)

Figure 4. Vendors' Product Scores for Data Integration Use Case

Source: Gartner (February 2015)

Figure 5. Vendors' Product Scores for Data Migration Use Case

Source: Gartner (February 2015)

Figure 6. Vendors' Product Scores for Big Data Use Case

Source: Gartner (February 2015)

Vendors

Ataccama

Ataccama's DQ Analyzer, Data Quality Center, DQ Issue Tracker and DQ Dashboard products provide a range of capabilities that suit all of the main data quality tools use cases, and the Ataccama customer base reflects broad usage across that range. Reference customers provide very positive feedback regarding their experiences with the vendor's tools relative to each of the critical capabilities, with data profiling and matching functionality cited as particularly strong. While the Ataccama data quality tools are capable of supporting the full range of use cases, the vendor's strength in data profiling, combined with the core data quality operations and performance, supports a strong rating for data migration and data integration use cases. In addition, Ataccama's work in big data environments (including support for Hadoop) makes the vendor's tools particularly relevant to the big data use case, garnering a very strong score here.

DataMentors

DataMentors' DataFuse, ValiData and NetEffect products support the expected range of data quality technology capabilities, with a strong focus on the customer/party data domain. Reference customers report a very positive experience with the data profiling, matching and base data quality operations (parsing, standardization and cleansing), but reflect limitations when working across multiple data domains (beyond customer/party). Multidomain weakness drives a lower score in the master data management (MDM) use case, although for organizations focused predominantly on customer/party master data this should not be a deterrent to adoption. The relatively strong capabilities across the other critical capabilities, as reflected in feedback from reference customers, enable DataMentors to score consistently well across the other use cases, with operational data quality, data integration and data migration use cases having the highest scores for the vendor.

Experian

Experian Pandora, and the Capture, Clean and Enhance data quality tools represent the vendor's rebranding of its QAS assets and those of X88 Software, which is now part of the Experian group. The profiling functionality, related visualization capabilities, scalability and performance are cited as strengths by reference customers. This contributes to the vendor scoring most strongly against data migration, big data, operational and information governance use cases. Limited demonstrated use of the functionality beyond the customer/party domain, as well as weak feedback on workflow capabilities, supports lower scores for the master data management use case, although Experian has demonstrated strength in domain-specific MDM-related deployments focused on customer/party master data.

IBM

IBM's InfoSphere Information Server for Data Quality offering (including InfoSphere Information Analyzer, InfoSphere QualityStage and InfoSphere Discovery products) provides broad coverage for data quality functionality, including the critical capabilities identified for this market. Reference customers indicate a very positive experience with nearly all of the critical capabilities. Strength in the fundamentals (parsing, standardization, cleansing and matching) supports a strong score for IBM across the full range of data quality technology use cases. With only the visualization critical capability rating lagging slightly behind the others (although still well ahead of the basic requirements of the market), IBM scores well for all use cases, with operational, data integration and data migration use cases favoring the vendor most heavily.

Informatica

Informatica's data quality products (consisting of Data Quality Standard Edition, Data Quality Advanced Edition and Data Quality Governance Edition, Address Validation Services and StrikeIron) provide comprehensive coverage of the main data quality functionality required by the market. Reference customers rate the vendor's data profiling, parsing, standardization and cleansing capabilities as particularly strong. Visualization capabilities are perceived as lagging the other capabilities (although still ahead of the basic requirements of the market). With these characteristics, Informatica supports well the full range of data quality tools use cases, with a particularly strong score for operational, data integration and data migration use cases. In addition, the vendor's capabilities combined with its functionality for the Hadoop environment likewise support a strong score for the big data use case.

Information Builders

Information Builders' iWay Data Quality Suite provides a full range of data quality functionality. The tools are seen in deployments supporting each of the main data quality tools use cases. Reference customers rate the capabilities for profiling and visualization as particularly strong; the vendor's historical depth in analytics, business intelligence and reporting lends itself well to providing differentiation in these areas. As a result, Information Builders shows the strongest affinity for information governance and data migration use cases. Matching, merging and linking capabilities are rated lower, although still at the level of meeting requirements. As a result, Information Builders scores solidly, but slightly lower, for each of the other use cases.

Innovative Systems

Innovative Systems' data quality products include the i/Lytics Enterprise Data Quality Suite, Enlighten and i/Lytics PostLocate, and provide broad functionality with a focus on the customer data domain. Reference customers cite the vendor's core data quality functionality, specifically parsing, standardization, cleansing and matching, as primary strengths. They also praise the vendor's ability to deliver suitable scalability and performance in large-scale deployments. Customers also rate the vendor's data profiling capabilities as adequate, but slightly weaker than the core capabilities, and Innovative's tools are rarely deployed outside of the customer/party domain, indicating weakness in multidomain capabilities. As a result, the vendor received the strongest scores for operational/transactional, data integration and data migration use cases. The multidomain limitations, in contrast, are reflected in the master data management use case having the lowest score.

MIOsoft

MIOsoft is a new entrant to the data quality tools market and its MIOvantage product provides functionality for a variety of technology markets. The vendor has only recently begun positioning the product as a data quality toolset. In this product, the vendor provides a full range of data quality functionality, and customer deployments reflect a variety of the main use cases. Although reference customers overwhelmingly rate the vendor as extremely strong against each of the critical capabilities, customer deployments show limited usage in the master data management use case. Very positive customer perceptions drive very high scores in support of all of the other use cases.

Neopost

Neopost's products in the data quality tools market (Customer Hub, Location Hub, HIquality 6, Data Services Platform, First Time Right and Architect) reflect the vendor's longtime focus on customer data. These products cover the expected range of data quality functionality with an emphasis on customer/party and related location master data. Reference customers cite the strength of Neopost's tools relative to the fundamental capabilities of data profiling, parsing, standardization and matching. The strong emphasis on customer/party drives a lower rating for multidomain support, and reference customers also report that workflow and visualization capabilities need improvement. As such, Neopost scores most strongly for data migration and big data use cases, followed by operational data quality, data integration and information governance. While scoring lower for the master data management use case due to limited usage in multidomain scenarios, Neopost remains strong in the customer/party master data context.

Oracle

The Oracle Enterprise Data Quality product provides support for each of the critical capabilities, with the Product Data Extension providing deep support for the product/materials domain. Reference customers highlight data profiling, parsing, standardization and cleansing as key functional strengths, and also rate Oracle's scalability and performance highly. Visualization and workflow functionality is seen as capable, but also represents an opportunity for improvement. As a result, while Oracle scores well across the full range of data quality use cases, information governance, data migration and big data use cases are where the vendor demonstrates the greatest relevance and value.

Pitney Bowes

Pitney Bowes' primary data quality product is the Spectrum Technology Platform, which it positions as its strategic solution both for new customers and for those seeking to replace its legacy data quality offerings. Reference customers cite the base data quality capabilities (parsing, standardization, cleansing and matching) as particularly strong, and also cite the scalability and performance of the tools. While the products can potentially be used with various data domains, actual deployments and the vendor's strong focus on customer/party data contribute to a lower rating for multidomain capabilities. As a result, Pitney Bowes scores well across the range of data quality use cases. Each use case, with the exception of master data management due to its lower rating on multidomain, is supported by a score near or above the "meets or exceeds" range.

RedPoint

The RedPoint Data Management solution provides the full range of data quality functionality expected in this market. This solution also supports requirements in related markets, such as data integration tools. The vendor's data quality tools are seen in deployments across the full range of use cases. Reference customers view the tools' capabilities supporting the core data quality operations of parsing, standardization, cleansing and matching as strong. In addition, scalability and performance is cited as a significant strength, both in traditional environments as well as big data environments such as Hadoop (where the vendor has made substantial investments and continues to innovate). As a result, RedPoint scores most favorably for big data, data migration and information governance use cases, but also scores in or above the "meets or exceeds" range for the others.

SAP

SAP's Data Quality Management, Information Steward and Data Services offerings provide comprehensive data quality functionality for both SAP and non-SAP application and data environments. The vendor's tools are seen in deployments across all of the key data quality use cases with relatively equal frequency. Although reference customers rate the critical capabilities as consistently meeting requirements, they also expressed a desire for less complexity in deployment, and more rapid time to value. Core parsing, standardization, data profiling, and workflow capabilities are rated most strongly by customers across the functional requirements. As a result of the fairly consistent rating of the capabilities, SAP scores above the "meets requirements" level for all use cases, with no single use case standing out as being more relevant than any other.

SAS

SAS's Data Quality, Data Management, Data Quality Accelerator and Data Quality Desktop products represent a full suite of data quality functionality able to support deployments of all sizes. The vendor's tools are seen in deployments across all of the key data quality use cases with relatively equal frequency. Reference customers report a very positive experience (often exceeding requirements) for each of the critical capabilities, with the core data quality operations of parsing, standardization, cleansing and matching, as well as user-facing workflow capabilities, standing out above the others. This contributes to fairly consistent scores across each of the key use cases, all slightly above the "meets or exceeds" range, with operational data quality, data integration and data migration showing the strongest applicability for SAS's data quality tools.

Talend

Talend's Open Studio for Data Quality and Platform for Data Management are the products via which the vendor delivers data quality functionality to the market. Talend's data quality capabilities are most commonly seen in data integration scenarios (a core component of its broader product portfolio), but appear in other use cases as well. Reference customers report a generally positive experience with most of the critical capabilities, but parsing, standardization, cleansing, matching and workflow stand out as the most significant strengths. Data profiling and visualization capabilities are rated as weaker, leading to a lower use-case score for information governance. Master data management, operational/transactional, and data integration use cases score the highest, although all score well above the "meets requirements" level.

Trillium Software

Trillium approaches the data quality tools market via its Trillium Software System and Trillium Cloud products, which include TS Insight, TS Discovery, TS Quality and TS Case Management. With comprehensive functionality that can support each of the key data quality use cases, Trillium's critical capabilities are rated by reference customers as generally very strong, exceeding requirements in all areas with the exception of workflow and multidomain functionality. The latter is due to Trillium's historical focus on customer/party data, with fewer deployments focused on other data domains. However, even this capability is viewed as suitable to meet common requirements. The resulting scoring supports Trillium's relevance for all key use cases, including a particular affinity for big data, data migration and operational/transactional deployments. The master data management use case scores slightly lower due to the domain-specific orientation and experience base of the vendor.

Uniserv

Uniserv's data quality products (Data Analyzer, Data Cleansing, Data Protection and Data Governance) support the main functional requirements of this market, and are typically deployed in customer/party data applications. Reference customers rate the data profiling, visualization, matching and workflow capabilities as particularly strong (substantially exceeding requirements). Multidomain support is rated substantially lower given the vendor's historical focus solely on customer/party data, with limited functionality for, and experience with, other data domains. The resulting scores reflect good relevance for all the key use cases, with information governance, data migration and big data standing out above the others. The master data management use case receives the lowest score due to the vendor's domain-specific focus, but organizations intending to deploy these tools in the customer/party domain will find them highly relevant.

X88 Software

X88 Software's Pandora product has been rebranded as Experian Pandora since X88 became part of the Experian group. Pandora exhibits its greatest strength in data profiling, which reference customers rate as nearly always exceeding requirements. Core data quality requirements are also rated highly for the most part, with visualization and scalability/performance standing out in particular as very often exceeding requirements. The strong ratings across all critical capabilities support consistently high scores for all of the key data quality tools use cases. X88 seems to exhibit the greatest affinity for information governance and data migration scenarios due to the strong data profiling and visualization capabilities. While the product is relevant to master data management use cases, Experian's focus on customer/party and X88's slightly lower rating on matching result in that use case receiving the lowest score across the range of use cases.

Context

Technological developments in the field of the cloud, social media, mobile communications and information have enabled organizations to create new markets, reach more customers and deliver more value in the products and services they offer. As organizations increasingly seek to capitalize on these business opportunities to fulfill their digital business agendas, they turn to enabling technologies that support their business objectives. This synergy, which sees business opportunities fuel technology improvements, and in turn give rise to new business models and markets, has been the main undercurrent in the software market, which has grown strongly in the past few years. We expect this trend to continue in the medium to long term.

It is in this context that data quality technologies and practices must be considered. Organizations often have an understanding of what they need to achieve their strategic objectives, which are typically revenue growth, operational cost reduction, adherence to regulations, and better customer experience and retention. One requirement is that the information a company holds on its customers, products, suppliers and assets (and their interrelationships) be fit for purpose. Where this isn't the case, efforts to achieve objectives are impeded, which results in less value being delivered to shareholders, reduced competitiveness, rising operational costs, loss of customers to competitors, and fines due to noncompliance with industry and legal regulations.

The data quality tools market remains dynamic owing to its growth in size and volatility on both the supply side and the demand side. We see high demand for data quality tools, including from midsize organizations (which traditionally tended not to buy them). This demand is driven partly by activities in the fields of business intelligence and analytics (analytical scenarios), MDM (operational scenarios) and digital business. Also contributing to demand are information governance programs, which are growing in number, and requirements to support ongoing operations, data migrations and interenterprise data sharing.

Specifically, a set of most-significant use cases for data quality tools has emerged:

■ Master data management

■ Operational/transactional data quality

■ Information governance

■ Data integration

■ Data migration

■ Big data

Each of these use cases requires the emphasis of a different combination of functional characteristics of the tools, meaning that versatility and strength in many areas are critical as use cases grow more diverse. The critical capabilities for data quality tools defined in this research represent the most important of these functional characteristics given the trends in data quality demand in the market over the next several years.

Product/Service Class Definition

This market includes vendors that offer stand-alone software products to address the core functional requirements of the data quality improvement discipline, which are:

■ Data profiling and data quality measurement: The analysis of data to capture statistics (metadata) that provide insight into the quality of data and help to identify data quality issues.

■ Parsing and standardization: The decomposition of text fields into component parts and the formatting of values into consistent layouts, based on industry standards, local standards (for example, postal authority standards for address data), user-defined business rules, and knowledge bases of values and patterns.

■ Generalized cleansing: The modification of data values to meet domain restrictions, integrity constraints, or other business rules that define when the quality of data is sufficient for an organization.

■ Matching: The identifying, linking or merging of related entries within or across sets of data.

■ Monitoring: The deployment of controls to ensure that data continues to conform to business rules that define data quality for an organization.

■ Issue resolution and workflow: The identification, quarantining, escalation and resolution of data quality issues through processes and interfaces that enable collaboration with key roles, such as data stewards.

■ Enrichment: The enhancement of the value of internally held data by appending related attributes from external sources (for example, consumer demographic attributes and geographic descriptors).

The tools provided by vendors in this market are generally used by organizations for internal deployment in their IT infrastructure. They use them to directly support transactional processes that require data quality operations and to enable staff in data-quality-oriented roles, such as data stewards, to perform data quality improvement work. Off-premises solutions (in the form of hosted data quality offerings, SaaS delivery models and cloud services) continue to evolve and grow in popularity.

Critical Capabilities Definition

Profiling

The analysis of data attributes and datasets to capture statistics (metadata) that provide insight into the quality of data and help to identify data quality issues.

Data profiling functionality is increasingly critical as organizations wish to expose the facts about quality of data in the enterprise, help stakeholders to clearly understand levels of data quality, and rapidly identify shifts in the shape of data they are using to proactively identify new data quality flaws. The ability to analyze diverse sets of data, and to generate metadata and statistics that can be readily assessed to drive data quality improvement efforts, is increasingly important to buyers in this market.
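As a rough illustration of what capturing profiling statistics can mean in practice, the sketch below computes a few column-level metrics in plain Python. It is a minimal example under stated assumptions, not a description of any vendor's product: the function names `profile_column` and `shape`, the choice of metrics and the sample values are all hypothetical.

```python
import re
from collections import Counter


def shape(value):
    """Reduce a value to its character pattern: digits become 9, letters become A."""
    return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", str(value)))


def profile_column(values):
    """Return simple profiling statistics (metadata) for one column of raw values."""
    # Treat None and blank strings as missing values.
    non_null = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(non_null)
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(counts),
        "most_common": counts.most_common(3),
        # Pattern frequencies help reveal shifts in the shape of the data.
        "patterns": Counter(shape(v) for v in non_null),
    }


# Hypothetical sample column with inconsistent casing and missing entries.
sample = ["NY", "ny", "CA", None, "", "CA", "TX"]
stats = profile_column(sample)
```

Comparing such statistics between two loads of the same column is one simple way to notice a new data quality flaw, for example a sudden rise in `null_count` or an unfamiliar entry in `patterns`.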

Parsing, Standardizing & Cleansing

The decomposition and formatting of values based on industry standards, local standards, user-defined business rules, and knowledge bases of values and patterns. Modification of data values to meet domain restrictions, integrity constraints or other business rules.

Parsing, standardization and cleansing represent the fundamental building blocks of data quality improvement, performing the core operations of manipulating the syntax and semantics of data to meet "fit for purpose" requirements.
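To make the idea of decomposing and formatting values concrete, here is a small sketch that parses a free-text street address into components and standardizes the street suffix against a tiny lookup table. Everything in it is an assumption made for illustration: the names `SUFFIXES`, `ADDRESS_RE` and `parse_address`, the simplified address layout and the lookup contents are hypothetical and far smaller than the knowledge bases real tools rely on.

```python
import re

# Hypothetical knowledge base mapping suffix variants to a standard form.
SUFFIXES = {
    "st": "Street", "st.": "Street",
    "ave": "Avenue", "ave.": "Avenue",
    "rd": "Road", "rd.": "Road",
}

# Assumed layout: house number, street name, then a one-word suffix.
ADDRESS_RE = re.compile(
    r"^\s*(?P<number>\d+)\s+(?P<name>.+?)\s+(?P<suffix>\S+)\s*$"
)


def parse_address(raw):
    """Split a raw address into parts and standardize them, or return None."""
    match = ADDRESS_RE.match(raw)
    if match is None:
        # The value violates the assumed layout; a cleansing step could
        # route it to manual review instead of guessing.
        return None
    suffix = match.group("suffix")
    return {
        "number": match.group("number"),
        "name": match.group("name").title(),
        "suffix": SUFFIXES.get(suffix.lower(), suffix.title()),
    }
```

For instance, `parse_address("  42 main   st. ")` yields the parts `42`, `Main` and `Street`, while a value without a leading house number is rejected rather than silently altered.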

Visualization

Presentation of data profiling, monitoring and operations results and activity (in the form of reports, dashboards or other representation metaphors) and the openness and configurability of these capabilities.

With the increasing desire for business-side stakeholders, information stewards and other nontechnical roles to engage in the data quality improvement process, functionality to allow those roles to visually assess the state of data quality, identify issues, and track data quality metrics over time grows more important.

Matching, Linking & Merging

Identifying, linking or merging related entries within or across sets of data via a variety of algorithmic and rule-based approaches.

A common requirement in many of the data quality tools use cases is the ability to identify relationships between records, and to determine whether or not they are related or represent the same instance of a business concept. Sophisticated capabilities for matching, which can also compensate for the wide variety of semantics and representations in the typical large enterprise, are increasingly valuable.
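One simple algorithmic approach to matching is to normalize values and then compare them with a string similarity measure. The sketch below uses `difflib.SequenceMatcher` from the Python standard library for that purpose. It is only an illustration under assumptions: the helper names, the normalization rules and the 0.85 threshold are hypothetical, and commercial tools typically combine several techniques well beyond this.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, drop simple punctuation and collapse whitespace."""
    cleaned = name.lower().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 for two normalized values."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_matches(records, threshold=0.85):
    """Return candidate duplicate pairs whose similarity meets the threshold."""
    pairs = []
    # Compare every pair once; fine for a sketch, too slow for large volumes.
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            score = similarity(records[i], records[j])
            if score >= threshold:
                pairs.append((records[i], records[j], round(score, 2)))
    return pairs


# Hypothetical party names with differing representations.
candidates = find_matches(["Acme Corp.", "ACME corp", "Globex Inc"])
```

The pairwise loop compares every record with every other, which is why matching at scale usually depends on techniques such as blocking to limit the comparisons; that is outside the scope of this sketch.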

Multidomain Support

The ability to address multiple data subject areas (such as various master data domains), and depth of packaged support for specific subject areas.

While the roots of the data quality tools market are largely grounded in requirements to verify the quality of customer and party data types, demand is now very diverse. Other master data domains, as well as a wide range of scenarios beyond master data, must be supported by the tools.

Workflow

Monitoring and identification, quarantine, escalation and event-based resolution of data quality issues through processes and interfaces supporting key data-quality-related roles (data stewards, data owners or data quality analysts, for example).

With responsibility for stewardship of data moving out of IT and into the lines of business, and the desire to have a more formalized process around identifying and resolving data quality issues, workflow functionality grows more critical. This capability enables the design and deployment of an automated set of activities by which roles, such as the information steward, can most effectively support resolution of data quality issues in a managed and repeatable manner.

Scalability & Performance

Ability to deliver suitable throughput and response times to satisfy performance SLAs given increasingly substantial transaction and data volumes.

Data volumes continue to escalate, and the complexity of the application and data landscape continues to grow. As a result, data quality tools must be able to deal with highly complex and large-volume scenarios while delivering adequate throughput, response time and reliability.

Use Cases

Master Data Management

Capabilities applied to various key master data domains in the context of master data management (MDM) initiatives and the deployment of custom or packaged MDM solutions.

The master data management use case stresses the matching, workflow and multidomain capabilities of the tools most heavily, because of the common requirements to resolve master data authored in disparate sources, to support the work tasks of information stewards who perform the authoring and maintenance, and to deal with an increasingly wide range of master data domains.

Operational/Transactional Data Quality

Capabilities applied to controlling the quality of data created, maintained and housed within transactional applications.

As data quality controls are increasingly applied upstream, close to the source of data, the ability to embed data quality capabilities closely with operational applications is key. This use case emphasizes the core data quality operations (parsing, standardization, cleansing and matching) as well as the need for strong scalability and performance in the face of ever-increasing transaction volumes.

Information Governance Initiatives

Capabilities supporting the goals of an information governance initiative, and its associated roles and stakeholders (data stewards, for example).

Information leadership roles (such as the chief data officer) and initiatives focused on increasing the value of information assets are being established by more enterprises. A strong focus on information governance requires capabilities for data profiling, visualization and workflow to support information stewards.

Data Integration

Capabilities applied within data integration processes and architectures, in support of both data consolidation for analytics and operational application integration.

Data integration initiatives cannot be successful without mechanisms to assure the quality of the data being integrated and delivered. Core data quality operations performed in the face of increasing complexity and scale are key to success.

Data Migration

Capabilities used in the context of a data conversion, migration or modernization initiative (such as conversion from legacy to modern applications).

These initiatives require a strong focus on identifying data quality issues upfront. This use case therefore emphasizes the critical capabilities of data profiling and visualization, while also demanding strong scalability and performance to support large-scale migration efforts.

Big Data

Capabilities applied to the ingestion, correlation and interpretation of big data sources in support of operational analytics and sentiment analysis.

Big data scenarios increasingly involve the combination of diverse sets of data, many of which will come from outside the enterprise and are of unknown quality. Therefore, this use case emphasizes data profiling and matching capabilities most heavily. In addition, performance and scalability are key.

Vendors Added and Dropped

Because this is a new Critical Capabilities, no vendors have been added or dropped.

Inclusion Criteria

In the context of this Critical Capabilities analysis, we are using the same inclusion criteria as "Magic Quadrant for Data Quality Tools."

To be included, vendors had to meet the following criteria:

■ They must offer stand-alone packaged software tools or cloud-based services (not only embedded in, or dependent on, other products and services) that are positioned, marketed and sold specifically for general-purpose data quality applications.

■ They must deliver functionality that addresses, at minimum, profiling, parsing, standardization/cleansing, matching and monitoring. Vendors that offer narrow functionality (for example, that support only address cleansing and validation, or that deal only with matching) are excluded because they do not provide complete suites of data quality tools. Specifically, vendors must offer all of the following:

■ Profiling and visualization: They must provide packaged functionality for attribute-based analysis (for example, minimum, maximum, frequency distribution and so on) and dependency analysis (cross-table and cross-dataset analysis). Profiling results must be exposed in either a tabular or a graphical user interface delivered as part of the vendor's offering. Profiling results must be able to be stored and analyzed across time boundaries (trending).

■ Parsing: They must provide packaged routines for identifying and extracting components of textual strings, such as names, mailing addresses and other contact-related information. Parsing algorithms and rules must be applicable to a wide range of data types and domains, and must be configurable and extensible by the customer.

■ Matching: They must provide configurable matching rules or algorithms that enable users to customize their matching scenarios, audit the results, and tune the matching scenarios over time. The matching functionality must not be limited to specific data types and domains, nor limited to the number of attributes that can be considered in a matching scenario.

■ Standardization and cleansing: They must provide both packaged and extensible rules for handling syntax (formatting) and semantic (values) transformation of data to ensure conformance with business rules.

■ Monitoring: They must support the ability to deploy business rules for proactive, continuous monitoring of common and user-defined data conditions.

■ They must support this functionality with packaged capabilities for data in more than one language and for more than one country.

■ They must support this functionality both in scheduled (batch) and interactive (real-time) modes.

■ They must support large-scale deployment via server-based runtime architectures that can support concurrent users and applications.

■ They must maintain an installed base of at least 100 production, maintenance/subscription-paying customers for the data quality product(s) meeting the above functional criteria. The production customer base must include customers in more than one region (North America, Latin America, EMEA and Asia/Pacific).

■ They must be able to provide reference customers that demonstrate multidomain and/or multiproject use of the product(s) meeting the above functional criteria.

Vendors meeting the above criteria but limited to deployments in a single specific application environment, industry or data domain are excluded from this Critical Capabilities.

There are many vendors of data quality tools, but most do not meet the above criteria and are therefore not included. Many vendors provide products that deal with one very specific data quality problem, such as address cleansing and validation, but that cannot support other types of application, or that lack the full breadth of functionality expected of today's data quality solutions. Others provide a range of functionality, but operate only in a single country or support only narrow, departmental implementations. Others may meet all the functional, deployment and geographic requirements but are at a very early stage in their "life span" and, therefore, have few, if any, production customers.
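The attribute-based analysis named in the profiling criterion above (minimum, maximum, frequency distribution) can be sketched in a few lines of Python. The function name `profile_column`, the treatment of `None` and empty strings as missing, and the choice of three top values are assumptions made for this illustration.

```python
from collections import Counter


def profile_column(values, top_n=3):
    """Summarize one attribute: counts, range and most frequent values."""
    present = [v for v in values if v is not None and v != ""]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "top_values": Counter(present).most_common(top_n),
    }


state_profile = profile_column(["CT", "NY", "CT", None, "", "CA"])
```

Storing each profile together with the date it was produced would be one way to support the trending requirement, since successive results could then be compared over time.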

Table 1. Weighting for Critical Capabilities in Use Cases

| Critical Capabilities | Master Data Management | Operational/Transactional Data Quality | Information Governance Initiatives | Data Integration | Data Migration | Big Data |
|---|---|---|---|---|---|---|
| Profiling | 10% | 5% | 20% | 10% | 20% | 25% |
| Parsing, Standardizing & Cleansing | 5% | 25% | 10% | 20% | 20% | 10% |
| Visualization | 10% | 5% | 25% | 5% | 10% | 15% |
| Matching, Linking & Merging | 25% | 20% | 10% | 20% | 10% | 20% |
| Multidomain Support | 25% | 10% | 10% | 10% | 5% | 10% |
| Workflow | 20% | 10% | 20% | 15% | 15% | 5% |
| Scalability & Performance | 5% | 25% | 5% | 20% | 20% | 15% |
| Total | 100% | 100% | 100% | 100% | 100% | 100% |

As of January 2015

Source: Gartner (February 2015)

This methodology requires analysts to identify the critical capabilities for a class of products/services. Each capability is then weighted in terms of its relative importance for specific product/service use cases.

Critical Capabilities Rating

Each of the products/services has been evaluated on the critical capabilities on a scale of 1 to 5; a score of 1 = Poor (most or all defined requirements are not achieved), while 5 = Outstanding (significantly exceeds requirements).

Table 2. Product/Service Rating on Critical Capabilities

| Product or Service Ratings | Ataccama | DataMentors | Experian | IBM | Informatica | Information Builders | Innovative Systems | MIOsoft | Neopost | Oracle | Pitney Bowes | RedPoint | SAP | SAS | Talend | Trillium Software | Uniserv | X88 Software |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Profiling | 4.2 | 3.9 | 4.0 | 4.3 | 4.5 | 4.4 | 3.6 | 4.4 | 4.0 | 4.3 | 4.3 | 4.2 | 3.9 | 3.9 | 2.8 | 4.3 | 4.0 | 4.8 |
| Parsing, Standardizing & Cleansing | 3.9 | 4.4 | 3.5 | 4.4 | 4.4 | 4.2 | 4.2 | 4.5 | 4.0 | 4.1 | 4.2 | 4.8 | 3.9 | 4.4 | 3.7 | 4.6 | 3.3 | 4.2 |
| Visualization | 3.6 | 3.8 | 4.3 | 3.8 | 3.8 | 4.6 | 4.3 | 4.9 | 3.6 | 3.6 | 3.9 | 3.8 | 3.2 | 3.8 | 2.8 | 4.3 | 4.6 | 4.4 |
| Matching, Linking & Merging | 4.0 | 4.3 | 3.8 | 4.3 | 4.3 | 3.0 | 4.1 | 4.6 | 3.9 | 3.3 | 4.4 | 4.7 | 3.4 | 4.5 | 4.3 | 4.1 | 4.2 | 3.9 |
| Multidomain Support | 3.9 | 2.7 | 1.8 | 4.1 | 4.1 | 3.4 | 1.8 | 3.2 | 1.6 | 3.0 | 2.3 | 3.0 | 3.5 | 3.6 | 3.4 | 3.0 | 2.3 | 3.6 |
| Workflow | 3.6 | 4.0 | 2.5 | 4.2 | 4.3 | 3.6 | 3.9 | 4.0 | 3.3 | 3.6 | 4.0 | 4.6 | 3.4 | 4.2 | 3.8 | 3.6 | 4.3 | 4.2 |
| Scalability & Performance | 3.9 | 4.1 | 4.1 | 4.2 | 4.1 | 4.1 | 4.1 | 4.9 | 3.8 | 3.7 | 4.2 | 4.7 | 3.3 | 4.0 | 3.7 | 4.2 | 4.1 | 4.5 |

As of January 2015

Source: Gartner (February 2015)

Table 3 shows the product/service scores for each use case. The scores, which are generated by multiplying the use-case weightings by the product/service ratings, summarize how well the critical capabilities are met for each use case.

Table 3. Product Score in Use Cases

| Use Cases | Ataccama | DataMentors | Experian | IBM | Informatica | Information Builders | Innovative Systems | MIOsoft | Neopost | Oracle | Pitney Bowes | RedPoint | SAP | SAS | Talend | Trillium Software | Uniserv | X88 Software |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Master Data Management | 3.87 | 3.75 | 3.11 | 4.18 | 4.22 | 3.64 | 3.46 | 4.15 | 3.19 | 3.48 | 3.72 | 4.12 | 3.48 | 4.06 | 3.62 | 3.80 | 3.72 | 4.07 |
| Operational/Transactional Data Quality | 3.89 | 4.04 | 3.51 | 4.25 | 4.24 | 3.83 | 3.86 | 4.46 | 3.60 | 3.67 | 4.02 | 4.48 | 3.53 | 4.17 | 3.71 | 4.11 | 3.78 | 4.20 |
| Information Governance Initiatives | 3.84 | 3.88 | 3.49 | 4.14 | 4.20 | 4.02 | 3.79 | 4.38 | 3.50 | 3.71 | 3.94 | 4.20 | 3.51 | 4.02 | 3.35 | 4.04 | 4.00 | 4.30 |
| Data Integration | 3.89 | 4.01 | 3.45 | 4.24 | 4.26 | 3.81 | 3.82 | 4.41 | 3.58 | 3.67 | 4.02 | 4.44 | 3.53 | 4.15 | 3.67 | 4.07 | 3.83 | 4.21 |
| Data Migration | 3.90 | 4.03 | 3.60 | 4.23 | 4.26 | 4.01 | 3.90 | 4.47 | 3.69 | 3.80 | 4.09 | 4.43 | 3.57 | 4.10 | 3.49 | 4.15 | 3.92 | 4.34 |
| Big Data | 3.94 | 3.93 | 3.68 | 4.20 | 4.24 | 3.95 | 3.78 | 4.46 | 3.62 | 3.72 | 4.02 | 4.28 | 3.54 | 4.06 | 3.44 | 4.11 | 3.92 | 4.31 |

As of January 2015

Source: Gartner (February 2015)

To determine an overall score for each product/service in the use cases, multiply the ratings in Table 2 by the weightings shown in Table 1.
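The calculation can be sketched in Python using the Master Data Management weightings from Table 1 and the Ataccama ratings from Table 2. The function name `use_case_score` and the dictionary layout are assumptions made for this illustration; only the numbers come from the tables.

```python
# Weightings (percent) for the Master Data Management use case, from Table 1.
MDM_WEIGHTS = {
    "Profiling": 10,
    "Parsing, Standardizing & Cleansing": 5,
    "Visualization": 10,
    "Matching, Linking & Merging": 25,
    "Multidomain Support": 25,
    "Workflow": 20,
    "Scalability & Performance": 5,
}

# Ratings for Ataccama on each critical capability, from Table 2.
ATACCAMA_RATINGS = {
    "Profiling": 4.2,
    "Parsing, Standardizing & Cleansing": 3.9,
    "Visualization": 3.6,
    "Matching, Linking & Merging": 4.0,
    "Multidomain Support": 3.9,
    "Workflow": 3.6,
    "Scalability & Performance": 3.9,
}


def use_case_score(weights, ratings):
    """Weighted sum of capability ratings; weights are percentages totaling 100."""
    if sum(weights.values()) != 100:
        raise ValueError("weightings must total 100%")
    return sum(weights[cap] * ratings[cap] for cap in weights) / 100


score = use_case_score(MDM_WEIGHTS, ATACCAMA_RATINGS)
```

For these inputs the weighted sum is about 3.865, in line with the 3.87 that Table 3 reports for Ataccama in the Master Data Management use case.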

Gartner Recommended Reading

Some documents may not be available as part of your current Gartner subscription.

"How Products and Services Are Evaluated in Gartner Critical Capabilities"

"Magic Quadrant for Data Quality Tools"

"Toolkit: RFP Template for Data Quality Tools"

"Twelve Ways to Improve Your Data Quality"

"The State of Data Quality: Current Practices and Evolving Trends"

Evidence

This research is based on:

■ Extensive data on functional capabilities, customer-base demographics, financial status, pricing and other quantitative attributes gained via an RFI process engaging vendors in this market.

■ Interactive briefings in which the vendors provided Gartner insight on their product capabilities.

■ A Web-based survey of reference customers provided by each vendor, which captured data on usage patterns, levels of satisfaction with major product functionality categories, various nontechnology vendor attributes (such as pricing, product support and overall service delivery) and more. In total, 329 organizations across all major world regions provided input on their experiences with vendors and tools in this manner.

■ Feedback about tools and vendors captured during conversations with users of Gartner's client inquiry service.

Critical Capabilities Methodology

This methodology requires analysts to identify the critical capabilities for a class of products or services. Each capability is then weighted in terms of its relative importance for specific product or service use cases. Next, products/services are rated in terms of how well they achieve each of the critical capabilities. A score that summarizes how well they meet the critical capabilities for each use case is then calculated for each product/service.

"Critical capabilities" are attributes that differentiate products/services in a class in terms of their quality and performance. Gartner recommends that users consider the set of critical capabilities as some of the most important criteria for acquisition decisions.

In defining the product/service category for evaluation, the analyst first identifies the leading uses for the products/services in this market. What needs are end-users looking to fulfill, when considering products/services in this market? Use cases should match common client deployment scenarios. These distinct client scenarios define the Use Cases.

The analyst then identifies the critical capabilities. These capabilities are generalized groups of features commonly required by this class of products/services. Each capability is assigned a level of importance in fulfilling that particular need; some sets of features are more important than others, depending on the use case being evaluated.

Each vendor’s product or service is evaluated in terms of how well it delivers each capability, on a five-point scale. These ratings are displayed side-by-side for all vendors, allowing easy comparisons between the different sets of features.

Ratings and summary scores range from 1.0 to 5.0:

1 = Poor: most or all defined requirements not achieved

2 = Fair: some requirements not achieved

3 = Good: meets requirements

4 = Excellent: meets or exceeds some requirements

5 = Outstanding: significantly exceeds requirements

To determine an overall score for each product in the use cases, the product ratings are multiplied by the weightings to come up with the product score in use cases.

The critical capabilities Gartner has selected do not represent all capabilities for any product; therefore, they may not represent those most important for a specific use situation or business objective. Clients should use a critical capabilities analysis as one of several sources of input about a product before making a product/service decision.

