ec. file · web viewcentre of competence on data warehousing. questionnaire . on . the ....

25

Click here to load reader

Upload: voanh

Post on 30-Jan-2018

216 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: ec. file · Web viewCENTRE OF COMPETENCE On data warehousing. Questionnaire . on . the . use . of . software. tools . according. to . the . layered . S-DWH . architecture . Dear Colleagues,

QUESTIONNAIRE ON THE USE OF SOFTWARE TOOLS

ACCORDING TO THE LAYERED S-DWH ARCHITECTURE

CENTRE OF COMPETENCE

ON DATA WAREHOUSING

Page 2: ec. file · Web viewCENTRE OF COMPETENCE On data warehousing. Questionnaire . on . the . use . of . software. tools . according. to . the . layered . S-DWH . architecture . Dear Colleagues,

Dear Colleagues,

Some of the objectives of the Centre of Competence – Data Warehousing (CoC-DWH) are:

Contacting ESS members for identifying and prioritizing relevant projects and support requests included clear arrangements about support to be delivered by the CoC-DWH.

Setting up the knowledge repository in the ESSnet DWH domain of the CROS portal and implement a plan for active maintenance, including S-DWH best practice cases in ESS member states.

Sharing experiences on the general aspects of running a Centre of Competence and getting feedback for further evolution of this CoC concept.

As part of this process, we would like to invite you to fill in this questionnaire. The main goal of the questionnaire is to better understand your current situation and needs, and to provide the benefit of experience and best practice to other countries.

We would be grateful if you could complete questionnaire by 29 August 2014 if possible. Please send your answers to [email protected] (with CC to [email protected]).

Best regards,

the CoC-DWH project team

2

Page 3: ec. file · Web viewCENTRE OF COMPETENCE On data warehousing. Questionnaire . on . the . use . of . software. tools . according. to . the . layered . S-DWH . architecture . Dear Colleagues,

National Statistical Institute: _____________________________________________

First name / Surname: _____________________________________________

Position / department: _____________________________________________

Email: _____________________________________________

1. BackgroundIn an S-DWH system we identified four functional layers, starting from the most detailed bottom level up to the top of the S-DWH architecture where conceptual level is placed. The ground level corresponds to the area where the process starts, while the top of the pile is where the data warehousing process finishes. There are:

IV – access layer, for the final presentation, dissemination and delivery of the information sought specialized for external, relatively to NSI or Eurostat, users;

III – interpretation and data analysis layer, enables any data analysis or data mining, functional to support statistical design or any new strategies, as well as data re-use; functionality and data are optimized then for internal users, specifically for statistician methodologists or statistician experts.

II – integration layer, is where all operational activities needed for any statistical production process are carried out; in this layer data are manly transformed from raw to cleaned data and this activities are carried on by internal operators;

I – source layer, is the level in which we locate all the activities related to storing and managing internal or external data sources.

The layered architecture reflects a conceptual organization in which we will consider the first two levels as pure statistical operational infrastructures, functional for acquiring, storing, editing and validating data and the last two layers as the effective data warehouse, i.e. levels in which data are accessible for data analysis. This is generally composed of two main different functional environments: an operational where all available information is collect and build-up, usually defined as Extraction, Transformation and Loading (ETL) environment, while an analytical part is the actual data warehouse, i.e. where data analysis, or mining, reports for executives and statistical deliverables are realized.

3

Page 4: ec. file · Web viewCENTRE OF COMPETENCE On data warehousing. Questionnaire . on . the . use . of . software. tools . according. to . the . layered . S-DWH . architecture . Dear Colleagues,

The lower layers correspond to the area where the process starts, while the top layers represent where the data warehousing processes are carried out. This reflects a conceptual organization in which we consider the first two layers as operational IT infrastructures and the last two layers as the effective data warehouse.

Can the functional scheme described above be applied in your NSI ? What are the differences ?

4

Page 5: ec. file · Web viewCENTRE OF COMPETENCE On data warehousing. Questionnaire . on . the . use . of . software. tools . according. to . the . layered . S-DWH . architecture . Dear Colleagues,

2. Access layerThe principal purpose of data warehouse is to provide information to its users for strategic decision-making. These users interact with the warehouse throughout Access layer using end user access tools.

Specialised Business Intelligence (BI) Tools for data access

BI tools are a type of software that is designed to retrieve, analyse and report data. This broad definition includes everything from Reporting and Query Tools, Application Development Tools to Visual Analytics Software, Navigational Tools (OLAP viewers).

Name Licence type (Free / proprietary)

Version / Year of release

Are you satisfied with this product?

[ ] Oracle

[ ] Microsoft

[ ] SAS Institute

[ ] SAP

[ ] Tableau

[ ] IBM Cognos

[ ] QlikView

[ ] Other brand. Please name it:________________________

[ ] Not using

2.1 Do you plan to change / update currently using software in the next three years ? What are the reasons to do so ?

5

Page 6: ec. file · Web viewCENTRE OF COMPETENCE On data warehousing. Questionnaire . on . the . use . of . software. tools . according. to . the . layered . S-DWH . architecture . Dear Colleagues,

Office Automation Tools for regular productivity and collaboration instruments

By Office automation tools we understand all software programs which make it possible to meet office needs. In particular, an office suite therefore usually contains following software programs: word processing, a spreadsheet, a presentation tool, a database, a scheduler.

Name Licence type (Free / proprietary)

Version / Year of release

Are you satisfied with this product?

[ ] Microsoft Office

[ ] Corel WordPerfect

[ ] iWork

[ ] IBM‘s Lotus SmartSuite

[ ] OpenOffice

[ ] Other brand.Please name it:________________________

[ ] Not using

2.2 Do you plan to change / update currently using software in the next three years ? What are the reasons to do so ?

6

Page 7: ec. file · Web viewCENTRE OF COMPETENCE On data warehousing. Questionnaire . on . the . use . of . software. tools . according. to . the . layered . S-DWH . architecture . Dear Colleagues,

Graphics and Publishing tools

Graphics and publishing tools are tools with ability to create one or more infographics from a provided data set or to visualize information. There are a vast variety of tools and software to create any kind of information graphics, depending on the organization’s needs.

Name Licence type (Free / proprietary)

Version / Year of release

Are you satisfied with this product?

[ ] PSPP

[ ] SAS

[ ] SPSS

[ ] Stata

[ ] Statistical Lab

[ ] STATISTICA

[ ] Other brand. Please name it:________________________

[ ] Not using

2.3 Do you plan to change / update currently using software in the next three years ? What are the reasons to do so ?

7

Page 8: ec. file · Web viewCENTRE OF COMPETENCE On data warehousing. Questionnaire . on . the . use . of . software. tools . according. to . the . layered . S-DWH . architecture . Dear Colleagues,

Web services tools

Web services tools are used for creating and testing interaction between machines (M2M).

Name Licence type (Free / proprietary)

Version / Year of release

Are you satisfied with this product?

[ ] Stylus Studio

[ ] Microsoft Visual Studio

[ ] Apache Axis

[ ] SoapUI

[ ] JBoss

[ ] Other brand. Please name it:________________________

[ ] Not using

2.4 Do you plan to change / update currently using software in the next three years ? What are the reasons to do so ?

3. Interpretation and Data Analysis layer

The interpretation and data analysis layer is intended for statisticians and data scientists, and would enable any data manipulation or unstructured activities. In this layer expert users can carry out data mining or design new statistical strategies.

8

Page 9: ec. file · Web viewCENTRE OF COMPETENCE On data warehousing. Questionnaire . on . the . use . of . software. tools . according. to . the . layered . S-DWH . architecture . Dear Colleagues,

Statistical Data Mining Tools

The overall goal of the data mining tools is to extract information from a data set and transform it into an understandable structure for further use. Aside from the main goal of the data mining tools they should also be capable to visualise data, which was extracted in data mining process. Because of this feature a lot of tools from this category have been already covered in the Graphics and Publishing tools section and are used in the Access layer.

Name Licence type (Free / proprietary)

Version / Year of release

Are you satisfied with this product?

[ ] IBM SPSS Modeler

[ ] SAS Enterprise Miner

[ ] STATISTICA Data Miner

[ ] Microsoft Analysis Services

[ ] Oracle Data Mining

[ ] R

[ ] Other brand. Please name it:________________________

[ ] Not using

3.1 Do you plan to change / update currently using software in the next three years ? What are the reasons to do so ?

9

Page 10: ec. file · Web viewCENTRE OF COMPETENCE On data warehousing. Questionnaire . on . the . use . of . software. tools . according. to . the . layered . S-DWH . architecture . Dear Colleagues,

Business Intelligence Tools for data analysis in a direct connection with database.

Business Intelligence tools which allow users to create visual reports or dashboards and other summaries of specific sets of data for trending and other data analysis needs.

Name Licence type (Free / proprietary)

Version / Year of release

Are you satisfied with this product?

[ ] Eclipse BIRT Project

[ ] JasperReports

[ ] OpenOffice Base

[ ] Oracle Reports

[ ] SAS Web Report Studio

[ ] SQL Server Reporting Services

[ ] Crystal Reports

[ ] Zoho Reports

[ ] Other brand. Please name it:________________________

[ ] Not using

3.2 Do you plan to change / update currently using software in the next three years ? What are the reasons to do so ?

10

Page 11: ec. file · Web viewCENTRE OF COMPETENCE On data warehousing. Questionnaire . on . the . use . of . software. tools . according. to . the . layered . S-DWH . architecture . Dear Colleagues,

Tools for designing OLAP cubes

Name Licence type (Free / proprietary)

Version / Year of release

Are you satisfied with this product?

[ ] SAS OLAP Cube Studio

[ ] SQL Server Analysis Services

[ ] Analytic Workspace Manager 11g

] Pentaho Schema Workbench

[ ] Other brand. Please name it:________________________

[ ] Not using

3.3 Do you plan to change / update currently using software in the next three years ? What are the reasons to do so ?

11

Page 12: ec. file · Web viewCENTRE OF COMPETENCE On data warehousing. Questionnaire . on . the . use . of . software. tools . according. to . the . layered . S-DWH . architecture . Dear Colleagues,

4. Integration layerThe integration layer is where all operational activities needed for all statistical elaboration processes are carried out. This means operations carried out automatically or manually by operators to produce statistical information in an IT infrastructure. With this aim, different subprocesses are predefined and preconfigured by statisticians as a consequence of the statistical survey design in order to support the operational activities.For the Integration layer there are mostly dedicated software applications usually defined as Data Integration tools. This kind of software is used for metadata management and usually is developed and implemented on NSI request. This is because of specific needs and requirements from customer. It has a user friendly graphic interface to help the integration of different input sources and their manipulation.

Italy

Italy (Istat) has self-implemented system SIQual as metadata system. This is an information system on quality, contains information on the execution of Istat primary surveys and secondary studies and on activities developed to guarantee quality of the produced statistical information metadata managing system developed solution. Also this is a tool for generating quality reports.Istat don't have a metadata managing system for operational activities yet.

Lithuania

Statistics Lithuania don‘t use a single, centralized metadata management system yet. Most of the systems have been developed independently of each other. Any kind of metadata can be found in most of our systems. This is the reason why some of metadata are stored as different copies in different systems.Metadata related to quality of statistical data (such as relevance, accuracy, timeliness, punctuality, accessibility, clarity, coherence and comparability), statistical method descriptions are stored as free text using MS Office tools.Almost all of metadata used to analyse and process statistical data of business surveys is stored in Oracle DB with the much of the results processing being carried out in SAS. Only one business survey is carried out in FoxPro, while all the statistical data and metadata of social surveys is stored in MS SQL server.Statistics Lithuania also uses several other software systems, which have some basic metadata storage and management capability, in order to fulfil basic everyday needs.

United Kingdom

United Kingdom’s Office for National Statistics (ONS) doesn’t have a single, centralised metadata management system. The operational metadata systems are developed and supported on a variety of technology platforms:

Most business survey systems (including the business register) are run on Ingres DBMS with the much of the results processing being carried out in SAS.

Many of the older social surveys are using VB and SQL Server technology. Most new developments are carried out using Oracle / Java / SAS technology stack. Older systems supporting Life Events applications (births, marriages, deaths etc.) are

still maintained on Model 204 database.

12

Page 13: ec. file · Web viewCENTRE OF COMPETENCE On data warehousing. Questionnaire . on . the . use . of . software. tools . according. to . the . layered . S-DWH . architecture . Dear Colleagues,

As a result, each system or process supported by each of these technology implementations have their own metadata, which are managed using the specific applications developed for the statistical system storage, along with the data itself.

4.1 Please shortly describe software using for metadata handling and processing in your NSI

4.2 Would you be willing to share your software with other ESS NSIs, either the design and/or the tool ?

DESIGN:

TOOL:

5. Source LayerThe Source Layer is where we locate all the activities related to storing and managing internal or external data sources. Internal data are from direct data capturing carried out by CAWI, CAPI or CATI while external data are from administrative archives, for example from Customs Agencies, Revenue Agencies, Chambers of Commerce, Social Security Institutes.Generally, data from direct surveys are well-structured so they can flow directly into the integration layer. This is because NSIs have full control of their own applications. Differently, data from others institutions’ archives must come into the S-DWH with their metadata in order to be read correctly.

13

Page 14: ec. file · Web viewCENTRE OF COMPETENCE On data warehousing. Questionnaire . on . the . use . of . software. tools . according. to . the . layered . S-DWH . architecture . Dear Colleagues,

Name Licence type (Free / proprietary)

Version / Year of release

Are you satisfied with this product?

[ ] IBM Infosphere DataStage

[ ] Informatica PowerCenter

[ ] Oracle Warehouse Builder

[ ] SAS Data Integration Studio

[ ] SAP Business Objects Data Services

[ ] Microsoft SQL Server Integration Services

[ ] Pentaho Data Integration (Kettle)

[ ] CloverETL

[ ] JasperETL

[ ] Talend Data Integration

[ ] Other brand. Please name it:________________________

[ ] Not using

14

Page 15: ec. file · Web viewCENTRE OF COMPETENCE On data warehousing. Questionnaire . on . the . use . of . software. tools . according. to . the . layered . S-DWH . architecture . Dear Colleagues,

5.1 Do you plan to change / update currently using software in the next three years ? What are the reasons to do so ?

5.2 Would your NSI be willing to share software with other ESS members, either the design and/or the tool(s) ?If not, is there a particular reason ?

5.3 Is there something else you would like to mention/share ?

Thank You !!!

15