knowledge discovery with fca
TRANSCRIPT
1/67
Knowledge Discovery with FCA
Lecture 1: Introduction to Formal Concept Analysis
Babes-Bolyai University, Computer Science Department, [email protected]
February 27, 2018
2/67
WHAT IS FORMAL CONCEPT ANALYSIS?BRANCH OF APPLIED MATHEMATICS AND ARTIFICIAL INTELLIGENCE
� Based on Lattice Theory developed by Garrett Birkhoffand others in the 1930s
� Employs algebra in order to formalize notions of conceptand conceptual hierarchy
� Term Formal Concept Analysis (short: FCA) introduced byRudolf Wille in the 1980s.
3/67
WHY FORMAL CONCEPT ANALYSIS
Because...The methods of Formal Concept Analysis offers an algebraicapproach to data analysis and knowledge processing.
� Strengths of FCA are� a solid mathematical and philosophical foundation,� more than 2000 research publications,� experience of several hundred application projects,� an expressive and intuitive graphical representation,� Due to its elementary yet powerful formal theory, FCA can
express other methods, and therefore has the potential tounify the methodology of data analysis.
4/67
APPLICATIONS
Formal Concept Analysis has recently been applied in� Description Logics, for checking completeness of
knowledge bases,� Linguistics, for the investigation of thesauri and
ontologies,� Software Engineering, for modelling type hierarchies with
role types,� Biomathematics, for analysing gene expression data, item
Machine Learning, for discovering website duplicates,� Data Mining, for pattern matching problems,� Rough Set Theory, for studying granular data,� Web Usage Mining, for discovering usage patterns� Medicine, etc...
5/67
WHAT IS FORMAL CONCEPT ANALYSIS?FORMAL CONCEPT ANALYSIS (FCA) IS A
� mathematization/formalization of the philosophicalunderstanding of concepts
� human-centered method to structure and analyze data� method to visualize data and its inherent structures,
implications and dependencies
6/67
AGENDA
1 Concept Lattices� What is a concept?� Formal Context� Derivation Operators� Formal Concept� Concept Lattice� Computing All Concepts� Drawing Concept Lattices� Clarifying and Reducing a Formal Context� Interlude: ConExp, FCA Tools Bundle
7/67
WHAT IS A CONCEPT?
8/67
WHAT IS A CONCEPT?
Consider the concept bird. What drives us to call something abird?� Every object with certain attributes is called bird:
� A bird has feathers.� A bird has two legs.� A bird has a bill.
� All objects having these attributes are called birds:� Duck, goose, owl and parrot are birds.� Penguins are birds, too.� . . .
9/67
WHAT IS A CONCEPT?
We need� objects� attributes� . . .
� what else?� What makes a concept to be a concept?
9/67
WHAT IS A CONCEPT?
We need� objects� attributes� . . .
� what else?� What makes a concept to be a concept?
10/67
WHAT IS A CONCEPT IN FCA?
Formal Concept Analysis models concepts as units of thoughtthat consist of two parts:� The concept extent comprises all objects that belong to the
concept.� The concept inten contains all attributes that all of the
objects have in common.
FCA is used, amongst others, in data analysis, informationretrieval, data mining and software engineering.
11/67
WHAT IS A CONCEPT?FCA is working on the conceptual layer. The representationallayer plays only a minor role.
12/67
WHAT IS A (FORMAL) CONCEPT?
13/67
THE UNIVERSE OF DISCOURSE
13/67
THE UNIVERSE OF DISCOURSE
14/67
THE (FORMAL) CONTEXT
14/67
THE (FORMAL) CONTEXT
15/67
THE (FORMAL) CONTEXT
DefinitionA formal context is a triple (G,M, I), where G is a set of objects, M isa set of attributes, and I is a relation between G and M.
� What is a relation?� What types of relations do we know?
We read (g,m) ∈ I as object g has attribute m.
15/67
THE (FORMAL) CONTEXT
DefinitionA formal context is a triple (G,M, I), where G is a set of objects, M isa set of attributes, and I is a relation between G and M.
� What is a relation?� What types of relations do we know?
We read (g,m) ∈ I as object g has attribute m.
15/67
THE (FORMAL) CONTEXT
DefinitionA formal context is a triple (G,M, I), where G is a set of objects, M isa set of attributes, and I is a relation between G and M.
� What is a relation?� What types of relations do we know?
We read (g,m) ∈ I as object g has attribute m.
16/67
EXAMPLE 1
17/67
EXAMPLE 2
18/67
DATA REPRESENTATION
� By a formal context we represent data...� Nice... but...� However, why FCA and not SQL, Data Mining, etc?
19/67
DATA BIPLOT OF INTERVIEW DATA
20/67
CONCEPT LATTICE OF INTERVIEW DATA
21/67
UNFOLDING DATA IN A CONCEPT LATTICETHE BASIC PROCEDURE OF FORMAL CONCEPT ANALYSIS:
� Data is represented in a very basic data type, called formalcontext.
� Each formal context is transformed into a mathematicalstructure called concept lattice. The information containedin the formal context is preserved.
� The concept lattice is the basis for further data analysis. Itmay be represented graphically to supportcommunication, or it may be investigated with withalgebraic methods to unravel its structure.
22/67
WHAT IS A CONCEPT LATTICE?
� Graphical diagram� Is it a graph? Yes/No?
� Order diagram?� What is an order?� What is a lattice?
22/67
WHAT IS A CONCEPT LATTICE?
� Graphical diagram� Is it a graph? Yes/No?� Order diagram?� What is an order?
� What is a lattice?
22/67
WHAT IS A CONCEPT LATTICE?
� Graphical diagram� Is it a graph? Yes/No?� Order diagram?� What is an order?� What is a lattice?
23/67
MORE EXAMPLESDIVISOR LATTICE OF 200
24/67
MORE EXAMPLESRECOMMENDED SERVING TEMPERATURE FOR RED WINES
25/67
MORE EXAMPLESRECOMMENDED SERVING TEMPERATURE FOR WHITE WINES
26/67
HOW DO WE COMPUTE CONCEPTS?For the mathematical definition of formal concepts weintroduce the derivation operator ′.
For a set of objects A ⊆ G, A′ is defined as:
A′ = {all attributes in M common to the objects of A}.
For a set of attributes B ⊆M, B′ is defined as:
B′ = {objects in G having all attributes of B}.
� We are looking for pairs (A,B) of objects A and attributes Bthat satisfy the conditions
A′ = B and B′ = A
and we call these pairs formal concepts.
27/67
DERIVATION OPERATORS
28/67
29/67
30/67
31/67
32/67
33/67
34/67
35/67
36/67
37/67
38/67
39/67
40/67
41/67
COMPUTING ALL CONCEPTS
There are several algorithms to compute all concepts:� naive approach� intersection method� Next-Closure (Ganter 1984)� Titanic (Stumme et al. 2001)� Inclose family� and several incremental algorithms
42/67
43/67
44/67
45/67
46/67
47/67
48/67
49/67
50/67
51/67
52/67
53/67
54/67
55/67
56/67
57/67
58/67
59/67
60/67
61/67
62/67
63/67
64/67
65/67
66/67
67/67