Using ELK to Explore Defect Data
TRANSCRIPT
![Page 1: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/1.jpg)
Using ELK to Explore Defect Data
Xu Yabin
Singapore
![Page 2: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/2.jpg)
Content
Customer requirements and defect KPI definition
ELK solution
ELK compared to traditional analytics methods
![Page 3: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/3.jpg)
Customer Requirement
• Online web applications that need to be deployed frequently
• Serious defects and quality issues
• Not enough testing before applications are deployed
• Defects are always out of control after applications are deployed
• Serious defects are always found after the application is deployed
• Serious defects are not fixed on time
• Implement continuous integration and a defect management system
• Assess the results and continuously evaluate DevOps activities
![Page 4: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/4.jpg)
Defect KPI Definition
• Based on the customer’s requirements, the defect KPIs are defined as below:
• Defect number and distribution
• Number of defects before and after applications are deployed
• Number of serious defects before and after applications are deployed
• Time to fix serious defects
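These KPIs are simple aggregations over defect records. A minimal sketch in Python; the record fields (`severity`, `created`, `closed`, `release`) are hypothetical names for illustration, not the customer’s actual schema:

```python
from datetime import datetime

# Hypothetical defect records; field names are illustrative only.
defects = [
    {"severity": "serious", "created": "2015-03-01", "closed": "2015-03-05", "release": "2015-04-01"},
    {"severity": "minor",   "created": "2015-05-02", "closed": "2015-05-03", "release": "2015-04-01"},
    {"severity": "serious", "created": "2015-06-10", "closed": "2015-06-20", "release": "2015-04-01"},
]

def parse(d):
    return datetime.strptime(d, "%Y-%m-%d")

# KPI: defects found before vs. after the application was deployed
before = sum(1 for d in defects if parse(d["created"]) < parse(d["release"]))
after = len(defects) - before

# KPI: time to fix serious defects, in days
fix_days = [(parse(d["closed"]) - parse(d["created"])).days
            for d in defects if d["severity"] == "serious"]

print(before, after)  # 1 2
print(fix_days)       # [4, 10]
```

In the ELK solution these same counts come from aggregations and filters configured in Kibana rather than from code like this.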
![Page 5: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/5.jpg)
Data analytics tool requirements
• What data analytics tools do we need?
• Easily import defect data from the current defect system
• Easily configure and calculate to get the KPI data
• Explore defect data without any data model preparation
• Easily drill into the detailed information
• Easy to maintain
• We chose ELK (Elasticsearch, Logstash, Kibana)
![Page 6: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/6.jpg)
Content
Customer requirement and defect KPI definition
ELK solution
ELK compared to traditional analytics methods
![Page 7: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/7.jpg)
ELK Solution
Original defect data (defect management system) → Logstash (data collector) → Elasticsearch (distributed data storage and search engine) → Kibana (data analytics and results)
• Most of the work is done through configuration, not coding
![Page 8: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/8.jpg)
Original defect data
• The original defect data comes from the customer’s defect management system, in XML format
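The slides do not show the actual XML, but a defect record might look like the hypothetical sketch below; the element names are assumptions, not the customer’s real schema. Parsing it takes only a few lines of Python:

```python
import xml.etree.ElementTree as ET

# Hypothetical defect record; element names are illustrative only.
xml_data = """
<defect>
  <id>D-1024</id>
  <product>drivers</product>
  <severity>serious</severity>
  <created>2015-06-10</created>
  <closed>2015-06-20</closed>
</defect>
"""

root = ET.fromstring(xml_data)
record = {child.tag: child.text for child in root}
print(record["product"], record["severity"])  # drivers serious
```

In the actual pipeline, this parsing is done by a Logstash filter rather than hand-written code.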
![Page 9: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/9.jpg)
ELK data collector: Logstash
• Collect defect data using Logstash
• Compared to a traditional data collector (which requires a lot of coding), Logstash needs no code, only a few lines of configuration
• Defect data is put into Elasticsearch through the Logstash pipeline
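A minimal sketch of what such a configuration might look like; the file path and XML field names are assumptions, while the `file` input, `xml` filter, `date` filter, and `elasticsearch` output are standard Logstash plugins:

```conf
input {
  file {
    path => "/data/defects/*.xml"   # hypothetical location of the exported defect data
    start_position => "beginning"
  }
}
filter {
  xml {
    source => "message"
    target => "parsed_xml"          # matches the parsed_xml.* fields used in later slides
  }
  date {
    match => ["[parsed_xml][created]", "yyyy-MM-dd"]
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
}
```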
![Page 10: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/10.jpg)
ELK user interface configuration: Kibana
• Once data is imported into Elasticsearch, the UI can be configured using Kibana
• UI configuration focuses on what will be displayed
• Configuration is done in a very natural way
• No business data model is needed before doing the configuration
![Page 11: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/11.jpg)
ELK: User interface
• Easily add query conditions and filters to dig into the data
![Page 12: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/12.jpg)
ELK: Filter and dig into the data: defect distribution by time
• The defect data view shows all defect data
• Most defects were created in the year 2015; use the mouse to drag over that area of the chart
• The defect data is then filtered by the time range you selected
![Page 13: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/13.jpg)
ELK: Filter and dig into the data: defect distribution by product
• The defect data view shows all defect data
• The green segment represents one product; double-click it
• The defect data is filtered to the green product, and the view changes to show only that product’s defects
![Page 14: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/14.jpg)
ELK: Multidimensional analysis: defect distribution by product
• Defects of different products; a different color stands for each product
![Page 15: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/15.jpg)
ELK: Defect KPIs displayed
• Severity
• Defect before or after release
• Defect close time
![Page 16: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/16.jpg)
Content
Customer requirement and defect KPI definition
ELK solution
ELK compared to traditional analytics methods
![Page 17: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/17.jpg)
ELK: Advantages
• Analyze data without coding
• Fast delivery and low cost
• High flexibility in analyzing data
• Easy to deploy and maintain
• Learn the business data before a data model is created
• Explore and drill into the data step by step, based on your understanding of the business
• Big data approach
• Performance
• High availability
• Extensible
• Collect and import data easily
![Page 18: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/18.jpg)
ELK: Why analyze data without coding
• Data analysis and display
• Traditional method
• The bottleneck is the relational database
• Aggregated analysis can’t be done by the database alone
• We need to write SQL statements such as GROUP BY and COUNT
• Even simple code makes analytics difficult, because the data, the data processing, and the UI are coupled through the code
• ELK solution
• Powerful aggregated analysis and search capability
• The UI is not coupled with the data
• Query conditions and filters can easily be added to the current query
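The “traditional method” contrasted above can be sketched with SQL. A minimal Python/sqlite3 illustration of the GROUP BY / COUNT coding the slide says is required; the table and column names are hypothetical:

```python
import sqlite3

# In-memory database with a hypothetical defects table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE defects (product TEXT, severity TEXT)")
conn.executemany("INSERT INTO defects VALUES (?, ?)", [
    ("drivers", "serious"),
    ("drivers", "minor"),
    ("kernel",  "serious"),
])

# The aggregation must be coded as a SQL statement; changing the
# grouping or adding a filter means changing this code.
rows = conn.execute(
    "SELECT product, COUNT(*) FROM defects GROUP BY product ORDER BY product"
).fetchall()
print(rows)  # [('drivers', 2), ('kernel', 1)]
```

With ELK, the equivalent grouping is a terms aggregation added through Kibana configuration, as shown on the next slide.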
![Page 19: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/19.jpg)
ELK: Query from configuration, not coding
• Simple and powerful aggregated analysis, like SQL GROUP BY
• Business concepts can be learned from the data aggregation
• Below is the Elasticsearch aggregation request:
GET _search
{
  "aggs": {
    "product": {
      "terms": { "field": "parsed_xml.product" }
    }
  }
}
• The search result can be used for another query:
"query_string": { "query": "parsed_xml.product:\"drivers\" AND (*)" }
![Page 20: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/20.jpg)
Traditional data query issues
• Too much data returned from a SELECT statement
• The main reason is that people don’t know how much data will be returned before running the SELECT
• The data is not filtered
• Too much data in one single table
• If a table is split, the query code needs to be modified to merge the query results
• Too much impact on existing programs
• Not easy to extend when the data grows
![Page 21: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/21.jpg)
How ELK deals with the data query issues
• Big data method and concept
• When the amount of data cannot be processed by a single point of resources (machines, CPU, etc.), the data and the processing power can be split horizontally without substantially affecting the existing architecture
• ELK solution:
• Too much data returned from a SELECT statement
• Count before querying
• Filter before querying, using the aggregation result
• Too much data in a single table
• A table can be split with no need to change the query statement
• Time sequences are supported, so it is easy to split time-series data
• Easy to extend through distributed data storage
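Splitting time-series data is commonly done with one Elasticsearch index per day. A minimal sketch of the date-based index naming; the `defects-` prefix is a hypothetical choice, mirroring Logstash’s default `logstash-%{+YYYY.MM.dd}` pattern:

```python
from datetime import date

def index_for(day, prefix="defects"):
    # One index per day: range queries hit only the matching
    # indices, and old indices can be dropped wholesale.
    return f"{prefix}-{day:%Y.%m.%d}"

print(index_for(date(2015, 6, 10)))  # defects-2015.06.10
```

Because Elasticsearch can search across many indices with one request, queries do not need to change when the data is split this way.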
![Page 22: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/22.jpg)
ELK data storage: Elasticsearch distributed data storage
• From: https://www.elastic.co/guide/en/elasticsearch/guide/current/replica-shards.html
• “Elasticsearch allows you to start small and scale horizontally as you grow. Simply add more nodes, and let the cluster automatically take advantage of the extra hardware.
• Elasticsearch clusters are resilient — they will detect new or failed nodes, and reorganize and rebalance data automatically, to ensure that your data is safe and accessible.”
![Page 23: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/23.jpg)
Traditional data collector issues
• The database is strictly defined by data types (schema)
• The same data may have different data types in different systems
• The data schema relationships (data mappings) between different systems must be defined correctly before data import
• Otherwise the data import will fail
![Page 24: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/24.jpg)
How ELK deals with the data collector issues
• ELK solution:
• Schema-less data import
• No need to consider data types before import
• If the default data type is not right, it can be changed
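Changing a wrongly guessed type is done by supplying an explicit mapping and re-indexing. A sketch of the Elasticsearch mapping request; the index and field names are hypothetical, and the exact syntax varies by Elasticsearch version:

```json
PUT defects
{
  "mappings": {
    "properties": {
      "parsed_xml": {
        "properties": {
          "severity": { "type": "keyword" },
          "created":  { "type": "date", "format": "yyyy-MM-dd" }
        }
      }
    }
  }
}
```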
![Page 25: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/25.jpg)
ELK data import: Logstash pipeline
• With the existing plug-ins, little or no programming is needed
• Filtering, processing, and additional data can easily be added to an existing collection pipeline
• Input and output contents are flexible and extendable
• Current pipeline: Input (defect data file) → Filter1 (normalize the XML format) → Filter2 (get and parse the defect data) → Filter3 (change the time format of the input data) → Output (Elasticsearch)
• To get the defect fixed time, add Filter4 (a defect fixed time field calculated as defect close time minus defect open time): Input (defect data file) → Filter1 → Filter2 → Filter3 → Filter4 → Output (Elasticsearch)
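The calculation Filter4 performs can be sketched in Python; in Logstash itself it would typically be a small filter step on each event. The field names below are the hypothetical ones used earlier, not the customer’s real schema:

```python
from datetime import datetime

def add_fixed_time(event):
    # Filter4 sketch: fixed time = defect close time minus defect
    # open time, here reported in whole days.
    opened = datetime.strptime(event["created"], "%Y-%m-%d")
    closed = datetime.strptime(event["closed"], "%Y-%m-%d")
    event["fixed_days"] = (closed - opened).days
    return event

event = {"created": "2015-06-10", "closed": "2015-06-20"}
print(add_fixed_time(event)["fixed_days"])  # 10
```

Once the field is computed at import time, Kibana can chart and aggregate it like any other field, which is how the “serious defects fixed time” KPI is displayed.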
![Page 26: Using ELK Explore Defect Data](https://reader031.vdocuments.us/reader031/viewer/2022030313/58ecb7971a28ab0b128b463f/html5/thumbnails/26.jpg)
ELK data import: Logstash architecture
• From: https://www.elastic.co/guide/en/logstash/1.5/deploying-and-scaling.html