Weave-D - 2nd Progress Evaluation Presentation
Weave-D: A cognitive approach towards data accumulation and fusion
Thushan Ganegedara, Ruwan Gunarathne, Lasindu Vidana Pathiranage, Buddhima Wijeweera
Why Weave-D?
•Growth of amount of information
•Handle data
▫Temporal
▫Multi-modal
•Prevent catastrophic interference
▫Incremental learning algorithms
•Visualizing information
▫Intuitive
▫Simple
•Apply previous knowledge to acquire new knowledge
•Conceptualization
▫Generalization of acquired knowledge
????
What is Weave-D?
•Accumulate data (i.e. images, text)
▫Accumulates temporal, multi-modal or multi-source data in an organized manner
•Feature extraction
▫Extracts information from data (e.g. color, edge and shape information of images)
•Incremental learning
▫Learns incrementally using the IKASL algorithm
•Link generation
▫Links represent relationships between multi-modal data
•Query & Visualize UI
Major Research Problems
•Integrating the incremental learning algorithm into the selected artificial perception model
•What are the potential performance improvements for the selected unsupervised learning algorithm?
•What are suitable feature extraction techniques for images and text?
•How can complex learning outcomes be visualized for the user?
Major Challenges
•General
▫Limited resources and novelty of the algorithms
▫Finding suitable datasets
•Image Feature Extraction
▫Deciding the best color space to represent images
▫Researching shape descriptor implementations
•Text Feature Extraction Techniques
▫Researching suitable text feature extraction techniques
Major Challenges (cont.)
•Unsupervised Learning Algorithms
▫Implementing IKASL
▫Testing and verifying correctness of IKASL
•Researching information visualization tools to fit our requirements
Project Scope
•The proposed system will be implemented to handle only image and text inputs
•The system will be designed for use by data analysts
•The system will
▫Extract feature vectors of images and texts
▫Acquire knowledge from the data input to Weave-D
▫Generate links between data
Project Scope
•A persistence technique (e.g. SQL DB, XML) will be used to store acquired knowledge and generated links
•Provide an interface for users to query and visualize information
Assumptions & Limitations
•Selected features (e.g. color, shape, …) provide a good representation of the data (e.g. images, …)
•In the artificial perception model, perception at a certain layer can be represented by the single most significant feature from the layer below
•Input data should be compatible with the feature extractors (i.e. type, format, …)
•Required tools (e.g. feature extraction, information visualization) can be used in the project with no or slight modifications
Deliverables
•Java implementation of the proposed system, including several sub-components
▫Incremental learning
▫Information persistence
▫Information linking
▫Information visualization
•Documentation
▫Research Proposal
▫Literature Review
▫Project Scope Document
▫Architectural Document
▫Project Report
▫User Manual
Deliverables
•Image - Color feature extraction tools (C#)
•Image - Shape feature extraction tools (Matlab)
•Image - Edge feature extraction tools (Java)
•Text feature extraction tools (Java)
•Unsupervised learning algorithm testing and visualization tools (GSOM, IKASL algorithms) (Java)
•Modified information visualization tool (Java)
Feasibility of Deliverables
•Feature extraction tools
▫Images (e.g. img (Rummager))
▫Text (e.g. WordNet, uClassify)
•Unsupervised learning algorithm tools
▫GSOM
▫IKASL
•Visualization
▫Arena 3D (with modifications)
Literature Review
•Artificial Perception Model
•Unsupervised Learning Algorithms
▫SOM
▫GSOM
▫IKASL
•Information Visualization
▫2D
▫3D
•Feature Extraction Techniques
▫Images
▫Text
•Other Similar Systems
Which Fits Where?
User
Perception Model
Artificial Perception Model [1]
•Inspired by the human perceptive and cognitive system
•Close resemblance to the human brain
•Key features
▫Supports multiple modalities
▫Ability to generate high-level perceptions by aggregating input stimuli belonging to multiple modalities
▫Conceptualization of information
[1] Bamunusinghe, Jeewanee, and Damminda Alahakoon. "Artificial Visual Percepts for Image Understanding." In Proceedings of the International Conference on Intelligent Systems. 2010.
Artificial Perception Model
Specialization of vision
Specialization of color
Unsupervised Learning Algorithms
Self-Organizing Maps (SOM) [2]
•Visualization technique which reduces the dimensions of data to help humans understand high dimensional data.
•Self-Organizing Map (SOM) is a type of unsupervised artificial neural network.
•Topology preserving map
[2] Kohonen, Teuvo. "The self-organizing map." Proceedings of the IEEE 78, no. 9 (1990):1464-1480
[Figure: SOM mapping a high-dimensional input space to a 2-dimensional output space]
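The mapping from a high-dimensional input space onto a fixed 2-dimensional grid can be sketched as below. This is a minimal illustration, not the project's implementation: the function names, the linear decay schedule, the Gaussian neighbourhood and all default parameter values are assumptions chosen for the example.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def best_matching_unit(weights, x):
    """Return the grid position whose weight vector is closest to x."""
    return min(weights, key=lambda pos: euclidean(weights[pos], x))


def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, radius0=1.5, seed=0):
    """Train a fixed-size SOM and return {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Fixed grid: the number of nodes never changes during training.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        # Assumed schedule: learning rate and radius shrink linearly.
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        radius = max(radius0 * frac, 0.5)
        for x in data:
            bmu = best_matching_unit(weights, x)
            for pos, w in weights.items():
                # Nodes near the winner on the grid are pulled more strongly,
                # which is what preserves the topology of the input space.
                grid_dist = euclidean(pos, bmu)
                h = math.exp(-(grid_dist ** 2) / (2 * radius ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights
```

Calling `best_matching_unit(train_som(data), x)` gives the 2D grid cell that represents an input vector `x`.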
Growing Self-Organizing Maps (GSOMs) [3]
• GSOM is an extension of the Self-Organizing Map (SOM), which is very popular in knowledge discovery applications.
• The GSOM algorithm overcomes several limitations of SOM.
• The main advantage of GSOM over SOM is that GSOM can grow and modify its shape to represent the data space better.
• Other similar work includes
▫ Growing Cell Structures (GCS)
▫ Neural Gas Algorithm (NGA)
▫ Incremental Grid Growing (IGG)
[3] Alahakoon, Damminda, Saman K. Halgamuge, and Bala Srinivasan. "Dynamic self-organizing maps with controlled growth for knowledge discovery." Neural Networks, IEEE Transactions on 11, no. 3 (2000): 601-614
GSOM Algorithm
•GSOM is an unsupervised neural network which is initialized with four nodes and grows to represent the input data space.
•Three main phases can be distinguished in the GSOM algorithm
▫Initialization phase
▫Growing phase
▫Smoothing phase
Initialization Phase
•The four starting nodes are initialized with random values from the input vector space.
[Figure: initial map with four nodes at (0,0), (1,0), (0,1) and (1,1)]
Growing phase
[Figure: the four-node map (0,0), (1,0), (0,1), (1,1) with the labels Input, Euclidean Distance, Winner and Neighborhood]
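One step of the growing phase can be sketched as follows. It assumes the growth threshold GT = -D × ln(SF) from [3], where D is the input dimension and SF the spread factor. The class and method names are hypothetical, and several parts are deliberate simplifications: new nodes start as copies of the winner's weights, neighbours are adapted at an assumed half rate, and the error of an interior winner is simply reset instead of being distributed to its neighbours.

```python
import math
import random

NEIGHBOR_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


class GSOM:
    def __init__(self, dim, spread_factor=0.5, lr=0.3, seed=0):
        rng = random.Random(seed)
        # Initialization phase: four nodes with random weight vectors.
        self.weights = {pos: [rng.random() for _ in range(dim)]
                        for pos in [(0, 0), (1, 0), (0, 1), (1, 1)]}
        self.errors = {pos: 0.0 for pos in self.weights}
        # Growth threshold derived from the spread factor: GT = -D * ln(SF).
        self.growth_threshold = -dim * math.log(spread_factor)
        self.lr = lr

    def winner(self, x):
        """Node with the smallest Euclidean distance to the input."""
        return min(self.weights, key=lambda pos: euclidean(self.weights[pos], x))

    def _adapt(self, pos, x, rate):
        w = self.weights[pos]
        for i in range(len(w)):
            w[i] += rate * (x[i] - w[i])

    def grow_step(self, x):
        """Present one input: adapt the winner's neighbourhood, grow if needed."""
        win = self.winner(x)
        self.errors[win] += euclidean(self.weights[win], x)
        self._adapt(win, x, self.lr)
        neighbors = [(win[0] + dx, win[1] + dy) for dx, dy in NEIGHBOR_OFFSETS]
        for pos in neighbors:
            if pos in self.weights:
                self._adapt(pos, x, self.lr * 0.5)  # assumed weaker pull
        if self.errors[win] > self.growth_threshold:
            # Grow into every free grid position next to the winner.
            for pos in neighbors:
                if pos not in self.weights:
                    # Simplification: copy the winner's weights.
                    self.weights[pos] = list(self.weights[win])
                    self.errors[pos] = 0.0
            # Simplification: reset the error for boundary and interior winners alike.
            self.errors[win] = 0.0
        return win
```

A lower spread factor gives a higher growth threshold, so the map grows less and stays coarser.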
Smoothing phase
•Growing phase stops when new node growth saturates
•Reduce learning rate and fix a small starting neighborhood.
•Find winner and adapt the weights of winner and neighbors in the same way as in growing phase.
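The smoothing phase can be sketched in the same spirit: no nodes are added, and only the winner and its immediate grid neighbours are adapted with a small learning rate. The function name, the default rate, the linear decay and the half-rate pull on neighbours are assumptions made for this illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def smooth(weights, data, lr=0.05, epochs=10):
    """Fine-tune an already grown map in place; the node set is left unchanged.

    weights maps grid positions (x, y) to weight vectors.
    """
    for epoch in range(epochs):
        # Small learning rate that decays towards zero (assumed schedule).
        rate = lr * (1.0 - epoch / epochs)
        for x in data:
            win = min(weights, key=lambda pos: euclidean(weights[pos], x))
            for pos, w in weights.items():
                # Small fixed neighbourhood: the winner and its direct neighbours.
                grid_dist = abs(pos[0] - win[0]) + abs(pos[1] - win[1])
                if grid_dist > 1:
                    continue
                factor = rate if grid_dist == 0 else rate * 0.5
                for i in range(len(w)):
                    w[i] += factor * (x[i] - w[i])
    return weights
```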
SOM vs. GSOM
•SOM: fixed number of nodes and grid size
•GSOM: ability to grow and change the shape
IKASL Algorithm [4]
• Most current Hebbian rule based algorithms do not encompass incremental learning and life-long learning
• Hebbian rule based unsupervised incremental learning algorithm
• Is both stable and plastic
• Can be understood as an n-layer structure
• A single layer comprises 2 sub-layers
▫Learning Layer
▫Generalized Layer
[4] De Silva, Daswin, and Damminda Alahakoon. "Incremental knowledge acquisition and self learning from text." In Neural Networks (IJCNN), The 2010 International Joint Conference on, pp. 1-8. IEEE, 2010.
IKASL Algorithm
Learn layer (L1)
Generalized layer (G1)
Learn layer (L2)
Input 1
Input 2
IKASL Algorithm (cont.)
Learn layer (L1)
Generalized layer (G1)
Learn layer (L2)
Generalized layer (G2)
Learn layer (L3)
Generalized layer (G3)
Feature Extraction
Image Feature Extraction
• In this project we do not work directly with raw images
• Images contain a lot of redundant data
• The solution is to use feature extraction techniques
• The process of transforming input data into a set of feature vectors is known as feature extraction
• The Moving Picture Experts Group (MPEG) has developed several standards
• MPEG-7 defines the Multimedia Content Description Interface
MPEG-7 Descriptors [5-7]
•Descriptors: a core set of quantitative measures of audio-visual features
•Some of the MPEG-7 descriptors are
▫Dominant Colour Descriptor
▫Colour Layout Descriptor
▫Edge Histogram Descriptor
[5] Ortiz, Edward, Cesar Pantoja, and María Trujillo. "An MPEG-7 Browser." In Latin American Conference on Networked Electronic Media. 2009.
[6] Wu, Peng, Yong Man Ro, Chee Sun Won, and Yanglim Choi. "Texture descriptors in MPEG-7." In Computer Analysis of Images and Patterns, pp. 21-28. Springer Berlin Heidelberg, 2001.
[7] Chatzichristofis, Savvas A., Yiannis S. Boutalis, and Mathias Lux. "Img (rummager): An interactive content based image retrieval system." In Similarity Search and Applications, 2009. SISAP'09. Second International Workshop on, pp. 151-153. IEEE, 2009.
Dominant Colour Descriptor
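The idea behind a dominant colour feature can be illustrated with a rough sketch. This is not the MPEG-7 extraction procedure, which is normally based on clustering the colours; uniform quantization of each RGB channel is swapped in here to keep the example short. The function name and the `levels` and `top` parameters are hypothetical.

```python
from collections import Counter


def dominant_colours(pixels, levels=4, top=3):
    """Return [((r, g, b), fraction), ...] for the most frequent quantized colours.

    pixels is a list of (r, g, b) tuples with channel values in 0..255.
    """
    step = 256 // levels
    # Map every channel to the centre of its quantization bin.
    counts = Counter(
        tuple((channel // step) * step + step // 2 for channel in pixel)
        for pixel in pixels
    )
    total = len(pixels)
    return [(colour, count / total) for colour, count in counts.most_common(top)]
```

The returned colours and their fractions can be flattened into a fixed-length feature vector for the learning algorithm.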
Colour Layout Descriptor
Edge Histogram Descriptor [17]
[17] Eitz, Mathias, Kristian Hildebrand, Tamy Boubekeur, and Marc Alexa. "An evaluation of descriptors for large-scale image retrieval from sketched feature lines." Computers & Graphics 34, no. 5 (2010): 482-498.
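A simplified edge histogram can be sketched as below. It uses the five 2x2 edge filters commonly given for the MPEG-7 edge histogram (vertical, horizontal, 45 degree, 135 degree, non-directional), but it is only an approximation: it classifies raw 2x2 pixel blocks and produces one global 5-bin histogram, whereas the standard descriptor works on 4x4 sub-images. The function names and the threshold value are assumptions.

```python
import math

SQRT2 = math.sqrt(2)

# Filter coefficients for a 2x2 block (top-left, top-right, bottom-left, bottom-right).
EDGE_FILTERS = {
    "vertical": (1, -1, 1, -1),
    "horizontal": (1, 1, -1, -1),
    "diagonal_45": (SQRT2, 0, 0, -SQRT2),
    "diagonal_135": (0, SQRT2, -SQRT2, 0),
    "non_directional": (2, -2, -2, 2),
}


def classify_block(block, threshold=11):
    """Return the strongest edge type of a 2x2 block, or None if it has no edge."""
    strengths = {name: abs(sum(c * v for c, v in zip(coeffs, block)))
                 for name, coeffs in EDGE_FILTERS.items()}
    name = max(strengths, key=strengths.get)
    # Assumed threshold: weaker responses are treated as "no edge".
    return name if strengths[name] >= threshold else None


def edge_histogram(gray):
    """gray is a 2D list of gray values; returns the proportion of blocks per edge type."""
    counts = {name: 0 for name in EDGE_FILTERS}
    blocks = 0
    for r in range(0, len(gray) - 1, 2):
        for c in range(0, len(gray[0]) - 1, 2):
            block = (gray[r][c], gray[r][c + 1], gray[r + 1][c], gray[r + 1][c + 1])
            blocks += 1
            label = classify_block(block)
            if label is not None:
                counts[label] += 1
    return {name: (count / blocks if blocks else 0.0)
            for name, count in counts.items()}
```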
Feature Vectors generated with EHG
Proportion
20018.Jpg, 0.197, 0.162, 0.323, 0.319, 0.437
20019.jpg, 0.340, 0.076, 0.282, 0.303, 0.374
20020.jpg, 0.180, 0.212, 0.333, 0.275, 0.333
20021.jpg, 0.165, 0.222, 0.278, 0.335, 0.409
20024.jpg, 0.324, 0.100, 0.295, 0.281, 0.243
……….
20066.jpg, 0.069, 0.317, 0.257, 0.358, 0.362
Position
20018.jpg, 4,4,4,4,4,4,4,3,4,4,4,4,4,4,4,4
20019.jpg, 4,0,4,4,0,4,4,4,4,4,0,4,3,3,4,4
20020.jpg, 4,4,4,4,2,2,4,2,4,4,1,1,4,4,2,4
20021.jpg, 4,3,3,4,4,4,4,4,4,4,4,4,1,1,1,2
20024.jpg, 3,0,2,0,4,4,4,4,4,3,4,1,3,3,0,0
……….
Existence
20018.Jpg, 0, 0, 1, 1, 1
20019.Jpg, 1, 0, 0, 1, 1
20020.Jpg, 0, 0, 1, 1, 1
20021.Jpg, 0, 0, 1, 1, 1
20024.Jpg, 1, 0, 1, 1, 0
……….
Results (Texture)
Existence Proportion Position
Shape Descriptors
•PHOG Descriptor [8]
▫Outcomes: local shape (given by each divided region) and spatial layout (given by HOGs of regions of finer spatial grids)
[8] Bosch, Anna, Andrew Zisserman, and Xavier Munoz. "Representing shape with a spatial pyramid kernel." In Proceedings of the 6th ACM international conference on Image and video retrieval, pp. 401-408. ACM, 2007.
Shape Descriptors
•GIST Descriptor [9]
▫A holistic representation of an image
▫Spatial envelope: described by the boundary of the surface of the image and its inner textures
▫Properties: naturalness, openness, roughness, ruggedness, expansion
▫Spatial envelope properties are estimated by calculating the energy spectrum of the image (DFT)
[9] Oliva, Aude, and Antonio Torralba. "Modeling the shape of the scene: A holistic representation of the spatial envelope." International journal of computer vision 42, no. 3 (2001): 145-175.
Text Feature Extraction
•Suitable text feature extraction techniques are limited. Why?
•Techniques
▫Encode the document as a histogram of words [10]
▫Select the set of keywords which are usually regarded as important keys to create a feature vector [11]
▫Use WordNet lexical categories to create the feature vector [12]
▫Use the uClassify web service to create the feature vector
[10] Kaski, Samuel, Timo Honkela, Krista Lagus, and Teuvo Kohonen. "WEBSOM–self-organizing maps of document collections." Neurocomputing 21, no. 1 (1998): 101-117.
[11] Chumwatana, Todsanai, K. Wong, and Hong Xie. "A SOM-Based Document Clustering Using Frequent Max Substring for Non-Segmented Texts." Journal of Intelligent Learning Systems & Applications 2 (2010): 117-125.
[12] Gharib, Tarek F., Mohammed M. Fouad, Abdulfattah Mashat, and Ibrahim Bidawi. "Self Organizing Map-based Document Clustering Using WordNet Ontologies." International Journal of Computer Science 9 (2012).
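The first technique, encoding a document as a histogram of words, can be sketched as follows. The function name, the fixed vocabulary and the simple lower-case tokenization are assumptions for the example; a real system would also need stop-word handling and some form of normalization.

```python
import re
from collections import Counter


def word_histogram(text, vocabulary):
    """Count how often each vocabulary word occurs in text (case-insensitive).

    The result is a feature vector with one entry per vocabulary word.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]
```

Every document is mapped onto the same vocabulary, so all feature vectors have the same length and can be fed to the same learning algorithm.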
Wordnet Lexical Categories
Act, Animal, Artifact, Food, .., Communication
Body, Creation, Emotion, Motion, .., Weather
uClassify output
docs  sport  games  society  Recreation  Arts  Science  Business  Computers  Health  Home
doc1  95.5   4.3    0.1      0           0     0        0         0          0       0
doc2  0      0      0        0           0     84       16        0          0       0
“Football refers to a number of sports that involve, to varying degrees, kicking a ball with the foot to score a goal. The most popular of these sports worldwide is association football, more commonly known as just "football" or "soccer". Unqualified, the word football applies to whichever form of football is the most popular in the regional context in which the word appears, including association football, as well as American football, Australian rules football, Canadian football, Gaelic football, rugby league, rugby union and other related games. These variations of football are known as football codes.”
http://en.wikipedia.org/wiki/Football
Information Visualization
Information Visualization
• The process of showing information in a more intuitive manner
• Today, data analysts prefer to use computer-generated models
• Information visualization can be represented by the following taxonomy of visualizing tools
▫2D: 2D perspective
▫3D: 2D perspective, 3D perspective
Tools
•Gephi [13]
•Arena [16]
•3D BioLayout [14]
•UbiGraph [15]
Other Similar Systems
Existing Similar Systems
IBM Watson [17]
•Watson is an artificial intelligence computer system capable of answering questions posed in natural language.
[17] IBM Watson. n.d. http://www-03.ibm.com/innovation/us/watson/index.shtml (accessed April 28, 2013).
Significance of Watson
• The ability to discern double meanings of words, puns, rhymes, and inferred hints
• Extremely rapid responses
• The ability to process vast amounts of information to make complex and subtle logical connections
Limitations of Watson
• Cannot process multi-modal data
• Cannot build a higher level perception of its data
• Does not learn incrementally
• Requires complex infrastructure
Contributions of Project Members
Major Task(s)                                Contributor
Implement SOM                                Ruwan
Research and Implement GSOM                  Thushan
Testing GSOM                                 Lasindu
Research IKASL                               Thushan
Research Fuzzy integral                      Lasindu
Research Image Feature Extraction - Color    Buddhima
Research Image Feature Extraction - Edge     Ruwan
Research Image Feature Extraction - Shape    Thushan
Research Text Feature Extraction             Ruwan
Research WSD                                 Lasindu
Research Information Visualization           Buddhima
References
[13] Gephi, an open source graph visualization and manipulation software. n.d. http://gephi.org/ (accessed April 28, 2013).
[14] BioLayout Express 3D. n.d. http://www.biolayout.org/ (accessed April 28, 2013).
[15] Ubigraph: Free dynamic graph visualization software. n.d. http://ubietylab.net/ubigraph/ (accessed April 28, 2013).
[16] Secrier, Maria. Arena3D. n.d. http://arena3d.org/ (accessed April 28, 2013).
Thank You!