Data utility requirements
S.No | Functionality | Description | Data Processed | Remarks
1 | Data auditing | Data is audited using statistical and database methods to detect anomalies and contradictions; this gives an indication of the characteristics of the anomalies and their locations. | No |
2 | Workflow specification | The detection and removal of anomalies is performed by a sequence of operations on the data. | No |
3 | Workflow execution | The workflow is executed after its specification is complete and its correctness is verified. The implementation should be efficient even on large data sets, which poses a trade-off because a data-cleansing operation can be computationally expensive. | No |
4 | Parsing | A parser decides whether a string of data is acceptable within the allowed data specification and detects syntax errors. | No |
5 | Data transformation | The mapping of the data from its given format into the format expected by the appropriate application. This includes value conversions or translation functions, as well as normalizing numeric values to conform to minimum and maximum values. | No |
6 | Data Elimination | Requires an algorithm for determining whether data contains duplicate representations of the same entity. Usually, data is sorted by a key that brings duplicate entries closer together for faster identification. | No |
7 | Post-processing | After executing the cleansing workflow, the results are inspected to verify correctness. | No |
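The parsing, transformation and elimination rows above can be read as a sequence of operations applied in order. The following is a minimal sketch of such a workflow, not a definitive implementation. The 'name,score' row format, the field names and all function names are hypothetical, and min-max scaling is only one possible reading of "normalizing numeric values to conform to minimum and maximum values".

```python
def parse(raw_rows):
    """Accept rows of the (assumed) form 'name,score'; skip rows with syntax errors."""
    records = []
    for row in raw_rows:
        parts = [p.strip() for p in row.split(",")]
        if len(parts) != 2 or not parts[0]:
            continue  # wrong number of fields or empty name
        try:
            records.append({"name": parts[0], "score": float(parts[1])})
        except ValueError:
            continue  # score is not numeric
    return records


def normalize(records):
    """Min-max scale 'score' into [0, 1] (assumed interpretation of the requirement)."""
    if not records:
        return []
    lo = min(r["score"] for r in records)
    hi = max(r["score"] for r in records)
    span = (hi - lo) or 1.0  # avoid division by zero when all scores are equal
    return [{**r, "score": (r["score"] - lo) / span} for r in records]


def eliminate_duplicates(records):
    """Sort by a key so duplicates become adjacent, then keep the first of each run."""
    ordered = sorted(records, key=lambda r: r["name"].lower())
    result = []
    for rec in ordered:
        if result and result[-1]["name"].lower() == rec["name"].lower():
            continue  # same key as the previous record: treat as duplicate
        result.append(rec)
    return result


def run_workflow(raw_rows, steps):
    """Apply each cleansing step in order to the output of the previous one."""
    data = raw_rows
    for step in steps:
        data = step(data)
    return data


raw = ["Ann, 10", "bob, 30", "ANN, 10", "bad row", "Cy, x", "Dee, 20"]
cleaned = run_workflow(raw, [parse, normalize, eliminate_duplicates])
for rec in cleaned:
    print(rec)
```

Sorting on a case-insensitive name is a deliberately simple duplicate key; a real matching rule would likely compare more fields.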
S.No | Functionality | Description | Data Processed | Remarks
1 | Generic feeds | Feed will allow individual details from specific partners to be loaded. | |
2 | Transactional feed | Need to know data requirements for all partners/resources to feed in the transactional data. | |
3 | Product feeds | Processing of feeds related to products. | |
4 | Bespoke feeds | Processing of "ready to use" feeds. | |
5 | Subscription feeds | Processing of new consumers added. | |
S.No | Functionality | Description | Data Processed | Remarks
1 | Cycle initiation | Periodic cycles. | |
2 | Build reference data | Referencing all data used. | |
3 | Extract | Extraction of data from | |
4 | Transform | Clean, apply business rules, check for data integrity, create aggregates or disaggregates. | |
5 | Stage | Load data into staging tables (warehousing). | |
6 | Audit reports | Prepare audit report in compliance with business rules. | |
7 | Publish | Publish data to target tables. | |
8 | Archive | After publishing, maintain metadata. | |
9 | Clean up | Deletion of duplicate/undesirable data. | |
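The extract, transform, stage, audit and publish steps above could be chained roughly as follows. This is a sketch under stated assumptions: in-memory lists stand in for the source, staging and target tables, and the "region" and "amount" fields, the "no negative amounts" rule and every function name are hypothetical.

```python
from collections import defaultdict


def extract(source_rows):
    """Extract: copy rows out of the (assumed) source."""
    return list(source_rows)


def transform(rows):
    """Transform: clean values, apply an assumed business rule, build aggregates."""
    cleaned = []
    for row in rows:
        region = row["region"].strip().title()  # clean inconsistent spelling
        amount = row["amount"]
        if amount is None or amount < 0:  # assumed integrity rule
            continue
        cleaned.append({"region": region, "amount": amount})
    totals = defaultdict(float)
    for row in cleaned:
        totals[row["region"]] += row["amount"]  # aggregate per region
    return cleaned, dict(totals)


def audit(extracted, staged):
    """Audit report: row counts at each stage."""
    return {
        "extracted": len(extracted),
        "staged": len(staged),
        "rejected": len(extracted) - len(staged),
    }


def run_cycle(source_rows, target):
    """Run one cycle: extract, transform, stage, audit, publish."""
    extracted = extract(source_rows)
    staged, aggregates = transform(extracted)  # 'staged' plays the staging table
    report = audit(extracted, staged)
    target.extend(staged)  # publish to the target table
    return report, aggregates


source = [
    {"region": " north ", "amount": 10.0},
    {"region": "North", "amount": 5.0},
    {"region": "south", "amount": -1.0},
]
target_table = []
report, aggregates = run_cycle(source, target_table)
print(report)
print(aggregates)
```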
S.No | Functionality | Description | Data Processed | Remarks
1 | Data logging mechanism | Must contain a comprehensive log of all uploaded CRM data (e.g. details of all loaded files including load dates, filenames, errors) in dedicated database tables. The log history shall be retained on an indefinite basis. | |
2 | Deceased suppressions | Deceased individuals are identified via an external data cleansing process. This mechanism must allow updates via a user interface, list or API. | |
3 | Change updates | Must allow changes in email and postal address, contact numbers, etc. | |
4 | Improving data quality | Deceased individuals are identified via an external data cleansing process. This mechanism must allow updates via a user interface, list or API. | |
5 | Data modelling | A flexible and easily configurable data model for all CRM data stores. It can provide a visualisation of all objects present in the data model and allow changes (e.g. add table column, add new view) to be applied in real time. | |
6 | Admin interface | Allow administrators to manage all data warehouse feeds and support standard ETL features, i.e. data transformations, data workflows, data debugging, etc. | |
7 | Clickstream data integration | The unstructured data store shall be configured to ingest clickstream data (i.e. web behavioural data) sourced from a web analytics system (e.g. Google Analytics). The ingestion mechanism shall be able to stitch together sessions generated by the same user (e.g. based on cookie ID). | |
8 | Admin interface | Allow administrators to manage all data warehouse feeds and support standard ETL features, i.e. data transformations, data workflows, data debugging, etc. | |
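As an illustration of the session stitching mentioned under clickstream data integration, sessions can be grouped by cookie ID and ordered by start time. This is a minimal sketch: the field names "cookie_id", "session_id" and "start" are assumptions, not the schema of any particular web analytics export.

```python
from collections import defaultdict


def stitch_sessions(sessions):
    """Group sessions by cookie_id and order each group by start time."""
    by_cookie = defaultdict(list)
    for session in sessions:
        by_cookie[session["cookie_id"]].append(session)
    # ISO 8601 timestamps in the same format sort correctly as strings
    return {
        cookie_id: sorted(group, key=lambda s: s["start"])
        for cookie_id, group in by_cookie.items()
    }


sessions = [
    {"session_id": "s1", "cookie_id": "c1", "start": "2024-01-02T09:00"},
    {"session_id": "s2", "cookie_id": "c1", "start": "2024-01-01T18:30"},
    {"session_id": "s3", "cookie_id": "c2", "start": "2024-01-01T12:00"},
]
stitched = stitch_sessions(sessions)
for cookie_id, group in stitched.items():
    print(cookie_id, [s["session_id"] for s in group])
```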
S.No | Functionality | Description | Data Processed | Remarks
1 | Individual validation/deletion | Validate/delete emails, telephone numbers (landline and cell) and postal addresses for individuals via user interface, input lists or API. | |
2 | Individual multi-channel matching | The match mechanism that ensures individuals with the same or similar criteria are considered for merging as one individual in the single customer view. | |
3 | Marketing consent update | If a record is a 'Yes' to marketing, but we receive the individual's details via an offline partner source with a 'No' to marketing, we need to update the marketing consent field. | |
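The marketing consent rule above can be expressed as a small function. This sketch covers only the case the requirement states (stored 'Yes', incoming 'No'). Leaving every other combination unchanged is an assumption, as are the field and function names.

```python
def apply_consent_update(record, incoming):
    """Return the record with consent set to 'No' when the stated rule applies."""
    if record["marketing_consent"] == "Yes" and incoming["marketing_consent"] == "No":
        return {**record, "marketing_consent": "No"}
    return record  # assumption: all other combinations leave the record as it is


stored = {"id": 7, "marketing_consent": "Yes"}
from_partner = {"id": 7, "marketing_consent": "No"}
updated = apply_consent_update(stored, from_partner)
print(updated)
```

The function returns a new dictionary rather than changing the stored one, so the previous value stays available for an audit trail.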
S.No | Functionality | Description | Data Processed | Remarks
1 | Data storage | Must include a bespoke enterprise relational data warehouse for the storage of CRM data. | |
2 | Data structure | A relational database technology which allows the structured/unstructured data to be queried and aggregated. | |
3 | Data history | The data warehouse must retain all historic supplied data in dedicated staging tables. This requirement is to allow retrospective analysis of supplied data if required. | |
4 | Blending of multiple data sources | The UI will contain data sourced from structured and unstructured data sources. To allow users to run queries across datasets, a mechanism for joining these disparate datasets must be available. | |
5 | Audit history | Keep track of all updates to the system. The audit log shall include modified_date, username, record_ids, etc. | |
6 | Administration | The aggregated datastore must provide admin features that allow all aspects of an end user's account to be configured, including: role-based client access (e.g. Advanced user, Admin user, Standard user); permissions; ability to define user groups with associated standard layout, queries, field configuration, etc.; enabling/disabling of account. | |
7 | Reporting | Reports to be created which can be refreshed and updated with new figures when required. These reports may use complex queries and cross-tabulations. | |
8 | Hypothesis testing | Support a hypothesis testing feature. | |
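One possible shape for the audit history is a dedicated table with one row per modified record. The sketch below uses Python's built-in sqlite3 module with an in-memory database purely for illustration. The table name, the column layout (a single record_id per row instead of a record_ids list) and the function names are assumptions, not part of the requirement.

```python
import sqlite3
from datetime import datetime, timezone


def create_audit_log(conn):
    """Create the (hypothetical) audit_log table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audit_log ("
        "id INTEGER PRIMARY KEY, "
        "modified_date TEXT NOT NULL, "
        "username TEXT NOT NULL, "
        "record_id INTEGER NOT NULL)"
    )


def log_update(conn, username, record_ids):
    """Write one audit row per modified record, stamped with the current UTC time."""
    now = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT INTO audit_log (modified_date, username, record_id) VALUES (?, ?, ?)",
        [(now, username, record_id) for record_id in record_ids],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
create_audit_log(conn)
log_update(conn, "analyst1", [101, 102])
rows = conn.execute(
    "SELECT username, record_id FROM audit_log ORDER BY id"
).fetchall()
print(rows)
```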
S.No | Functionality | Description | Data Processed | Remarks
1 | Cross product holding analysis | Analyse the most common relationships between more than 2 products. | |
2 | Cluster analysis | Segment individuals into distinct groups, based on defined variables. | |
3 | Basket analysis | Identify products most commonly purchased in the same transaction. | |
4 | Linear and non-linear modelling | Both linear and non-linear regression modelling, including stepwise model selection. Example of both regression types. | |
5 | Sentiment analysis | Analysis which will help to identify the level of positive/negative sentiment with respect to specific topics. | |
6 | Text analysis | Derive high-quality information such as patterns and trends from unstructured data. | |
7 | User data imports/exports | Ability to extract data quickly, easily and in various formats/layouts. Data may be exported for direct mail or telemarketing campaigns, as well as for further analysis/presentation outside the system. | |
8 | Data plot types | Graphical plotting of data, i.e. scatter plot, line graph, pie chart, bar chart, column chart, bubble chart, boxplot, etc. | |
9 | Roll up capability | Allow users to navigate through the CRM dataset from table to table via a "rollup" feature. For example, a user may view details at the individual level (name, age) but then wish to view the sources that the individual has interacted with, e.g. website, store, office. | |
10 | Customer lifetime value | The analytical client must have the functionality to value and forecast customers' lifetime value (over a period of time). | |
11 | Time series analysis and forecasting | The analytical client must support time series analysis and forecasting. For example, we will be able to predict website traffic for the next 3 years on a monthly basis. Time series analysis will allow us to identify the different parts driving this, i.e. trend, seasonal impact, other. | |
12 | Venn diagrams | Venn diagrams that allow rapid selections for very complex queries, e.g. members in the XYZ country who are marketable, male and aged 25-35 can be selected in 2 minutes or less with the current solution, without prior preparation. | |
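For the basket analysis and cross product holding rows above, a simple starting point is to count how often product combinations appear in the same transaction. This is a minimal sketch with made-up transactions and a hypothetical function name. It counts raw co-occurrences only and does not compute measures such as support or lift.

```python
from collections import Counter
from itertools import combinations


def combination_counts(transactions, size=2):
    """Count how often each group of `size` distinct products shares a transaction."""
    counts = Counter()
    for basket in transactions:
        # sort so that the same group always yields the same tuple
        for group in combinations(sorted(set(basket)), size):
            counts[group] += 1
    return counts


transactions = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["milk", "eggs"],
    ["bread", "milk"],
]
pairs = combination_counts(transactions, size=2)
triples = combination_counts(transactions, size=3)
print(pairs.most_common(1))
print(triples.most_common(1))
```

Setting `size=3` or higher gives the "more than 2 products" case; the number of combinations per basket grows quickly, so large baskets would need pruning in practice.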
No, Structured, Unstructured, Both
Yes, No