encontra presentation
EnContRA Engine for Content-Based Retrieval Approaches
Ricardo José São Pedro Dias 07/02/2012
Context
ColaDI Project
– Platform for Project Collaboration in Industrial Design
Period: March 2010 / November 2011 (EnContRA – until February 2011)
Objectives
General Framework for (Content-Based) Retrieval Approaches and Applications
– Features:
• Indexing
• Feature Extraction
• Searching / Retrieval Algorithms
• Extensible Query Processing
Advantages for Development
1. Modularity
2. Easy to use – Low learning curve
3. Fast development of new approaches
– Examples:
• A new descriptor
• A new searching / retrieval algorithm
• A new indexing structure
• Etc.
Multimedia Support
Support for different multimedia types
– Pictures
– Drawings
– 3D Objects
– Audio / Music
Typical Application Architecture
EnContRA Modules
CREATING A SIMPLE APPROACH
Indexing and Retrieving Pictures using QBE
Objectives
Create a simple Query by Example Image Retrieval Application
Data Model – Input Data
Objective
Extracts: Scalable Color
Query
QBE
Pieces to Assemble
1. Image Descriptor: Scalable Color
2. Indexing Structure: In Memory Simple Index
3. Searching Algorithm: Simple Searcher
Choosing a Feature to Extract
Scalable Color Descriptor (Extractor)
DescriptorExtractor extractor = new ScalableColorDescriptor<IndexedObject>();
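To make the idea of a color descriptor concrete, here is a minimal sketch of a color-histogram extractor. It is an assumption-laden stand-in, not EnContRA's ScalableColorDescriptor: the MPEG-7 Scalable Color descriptor is, as far as I understand it, an HSV histogram that is additionally encoded with a Haar transform, and that encoding step is skipped here. The class name HsvHistogramSketch, the bin counts, and the L1 distance are all illustrative choices.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

/** Hypothetical stand-in for a color descriptor; not part of EnContRA. */
class HsvHistogramSketch {
    // Illustrative bin counts (assumption): 8 hue x 2 saturation x 2 value bins.
    static final int H_BINS = 8, S_BINS = 2, V_BINS = 2;

    /** Returns a normalized HSV histogram whose entries sum to 1. */
    static double[] extract(BufferedImage image) {
        double[] hist = new double[H_BINS * S_BINS * V_BINS];
        float[] hsv = new float[3];
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                int rgb = image.getRGB(x, y);
                Color.RGBtoHSB((rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF, hsv);
                // Clamp so that a component equal to 1.0 falls into the last bin.
                int h = Math.min((int) (hsv[0] * H_BINS), H_BINS - 1);
                int s = Math.min((int) (hsv[1] * S_BINS), S_BINS - 1);
                int v = Math.min((int) (hsv[2] * V_BINS), V_BINS - 1);
                hist[(h * S_BINS + s) * V_BINS + v]++;
            }
        }
        double total = (double) image.getWidth() * image.getHeight();
        for (int i = 0; i < hist.length; i++) {
            hist[i] /= total;
        }
        return hist;
    }

    /** L1 (city-block) distance between two histograms of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }
}
```

Two images with similar color distributions end up with a small distance; images whose colors fall into disjoint bins reach the maximum distance of 2.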
Choosing an Indexing Structure
In Memory Simple Index
AbstractIndex index = new SimpleIndex();
Searching
Linear Search (for now!)
Searcher searcher = new SimpleSearcher<IndexedObject>();
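What a linear search over an in-memory index amounts to can be sketched as follows. This is not the SimpleSearcher implementation; LinearSearchSketch, its insert/similar methods, and the Euclidean distance are hypothetical names and choices made only for illustration. Every stored descriptor is compared with the query and the k closest are returned.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical linear (brute-force) k-nearest-neighbour search; not part of EnContRA. */
class LinearSearchSketch {
    private final List<String> ids = new ArrayList<>();
    private final List<double[]> vectors = new ArrayList<>();

    /** Stores a descriptor under an identifier. */
    void insert(String id, double[] descriptor) {
        ids.add(id);
        vectors.add(descriptor);
    }

    /** Returns the ids of the k stored descriptors closest to the query, nearest first. */
    List<String> similar(final double[] query, int k) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < vectors.size(); i++) {
            order.add(i);
        }
        // Sort all positions by distance to the query: O(n log n) for n stored descriptors.
        order.sort(Comparator.comparingDouble((Integer i) -> distance(query, vectors.get(i))));
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(k, order.size()); i++) {
            result.add(ids.get(order.get(i)));
        }
        return result;
    }

    /** Euclidean distance between two vectors of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }
}
```

The cost grows with the size of the dataset, which is why the more complex approach later in the deck swaps in an indexing structure.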
Assembling all the components
Searcher searcher = new SimpleSearcher<IndexedObject>();
searcher.setDescriptorExtractor(extractor);
searcher.setIndex(index);
searcher.setObjectStorage(new SimpleObjectStorage(IndexedObject.class));
searcher.setResultProvider(new DefaultResultProvider());
Setting Main Properties
Not required (recommended)
Indexing the Dataset
File[] pictures = getFilePictures(dataset);
for (File pic : pictures) {
    BufferedImage image = ImageIO.read(pic);
    searcher.insert(new IndexedObject(image));
}
Extracts descriptors and indexes them!
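The getFilePictures helper is not shown on the slides. A minimal sketch of what it could look like follows, assuming that the dataset is a directory (java.io.File) and that pictures are recognised by file extension; both are assumptions, as are the class name and the extension list.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical helper for listing the pictures of a dataset directory; not part of EnContRA. */
class DatasetFilesSketch {
    /** Returns the image files directly inside the given directory, sorted by path. */
    static File[] getFilePictures(File dataset) {
        File[] files = dataset.listFiles((dir, name) -> {
            String lower = name.toLowerCase(Locale.ROOT);
            return lower.endsWith(".jpg") || lower.endsWith(".jpeg") || lower.endsWith(".png");
        });
        if (files == null) {
            // listFiles returns null when the path is not a readable directory.
            return new File[0];
        }
        Arrays.sort(files);
        return files;
    }
}
```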
Performing a Search – Similar
// Load the query image
BufferedImage image = readQuery();
// Perform the search using similar
ResultSet<IndexedObject> results =
    searcher.similar(new IndexedObject(image), 20);
// Print the results
printResults(results);
Performing a Search – Query API
CriteriaBuilderImpl cb = new CriteriaBuilderImpl();
Path<IndexedObject> modelPath = new Path<IndexedObject>(IndexedObject.class);
Similar similar = cb.similar(modelPath, new IndexedObject(image));
CriteriaQuery query = cb.createQuery().where(similar).limit(20);
ResultSet<StringObject> results = searcher.search(query);
Query Building
Searching
A MORE COMPLEX APPROACH
Indexing and Retrieving Pictures using QBE
Objectives
Create a Drawing Retrieval Application, by employing Query By Example (or Sketch)
– Queries:
• 2D Drawings (e.g., SVG files)
• Pictures
Input Model
Drawing Model
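public class DrawingModel implements IEntity<Long> {
    …
    private Drawing drawing;
    private BufferedImage image;
    …
    // Getters marked with @Indexed are the fields that get indexed
    @Indexed
    public BufferedImage getImage() { return image; }
    @Indexed
    public Drawing getDrawing() { return drawing; }
}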
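How a framework can discover getters marked as indexed is sketched below with plain Java reflection. This is only an illustration of the general technique; the Indexed annotation declared here, SampleModel, and indexedValues are hypothetical and do not reproduce EnContRA's own @Indexed handling.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical reflection-based scan for annotated getters; not part of EnContRA. */
class IndexedScanSketch {
    /** Stand-in marker annotation; it must be retained at runtime to be visible to reflection. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Indexed {}

    /** Small example model with two indexed getters and one plain getter. */
    public static class SampleModel {
        @Indexed public String getTitle() { return "chair"; }
        @Indexed public int getParts() { return 4; }
        public String getNotes() { return "not indexed"; }
    }

    /** Calls every public no-argument method marked @Indexed and collects the results by method name. */
    static Map<String, Object> indexedValues(Object model) {
        Map<String, Object> values = new TreeMap<>();
        try {
            for (Method m : model.getClass().getMethods()) {
                if (m.isAnnotationPresent(Indexed.class) && m.getParameterCount() == 0) {
                    values.put(m.getName(), m.invoke(model));
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not read indexed field", e);
        }
        return values;
    }
}
```

Each collected value could then be wrapped in its own indexed object, which matches the role the slides give to the indexed object factory.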
Model to IndexedObject
[Diagram: the Indexed Object Factory takes a model instance (picture & vectorial) and produces an Image Indexed Object and a Drawing Indexed Object; the labelled outputs are the CEDD, Edge, ColorL and Drawing indexed objects (IdxObj).]
Pieces to Assemble
1. Descriptor: Image Descriptors + TopoGeo
2. Indexing Structure: NBTree
3. Searching Algorithm: NBTree Searcher
Descriptors
Image Descriptors
DescriptorExtractor ceddExtractor = new CEDDDescriptor<IndexedObject>();
DescriptorExtractor edgeHistogram =
new EdgeHistogramDescriptor<IndexedObject>();
DescriptorExtractor colorLayout = new ColorLayoutDescriptor<IndexedObject>();
TopoGeo Descriptor
TopogeoDescriptorExtractor topogeoDescriptorExtractor =
new TopogeoDescriptorExtractor();
BTree for Indexing
Parameters:
– the name of the index
– the type of objects to be indexed (class)
BTreeIndex exampleIndex = new BTreeIndex("btreeName", Object.class);
NBTree Searcher
Two flavors:
– Regular (original)
AbstractSearcher searcher = new NBTreeSearcher();
– Parallel (to speed up the search)
AbstractSearcher searcher = new ParallelNBTreeSearcher();
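The idea usually associated with the NB-Tree is to reduce each multidimensional point to its Euclidean norm and keep the points ordered by that single value in a B+-tree. The sketch below illustrates that idea under stated assumptions: a java.util.TreeMap stands in for the B-tree, only a range search is shown, and NormIndexSketch with its methods is hypothetical rather than the NBTreeSearcher code. The filter on norms is safe because |‖p‖ − ‖q‖| ≤ ‖p − q‖, so any point within the radius of the query must have a norm within the same radius of the query's norm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical norm-ordered index in the spirit of the NB-Tree; not part of EnContRA. */
class NormIndexSketch {
    // Sorted map from Euclidean norm to the points having that norm (stand-in for a B+-tree).
    private final TreeMap<Double, List<double[]>> byNorm = new TreeMap<>();

    void insert(double[] point) {
        byNorm.computeIfAbsent(norm(point), k -> new ArrayList<>()).add(point);
    }

    /** Returns all stored points whose Euclidean distance to the query is at most radius. */
    List<double[]> range(double[] query, double radius) {
        double qn = norm(query);
        List<double[]> hits = new ArrayList<>();
        // Step 1: cheap one-dimensional filter on the norm.
        for (List<double[]> bucket : byNorm.subMap(qn - radius, true, qn + radius, true).values()) {
            // Step 2: exact distance check on the surviving candidates.
            for (double[] p : bucket) {
                if (distance(p, query) <= radius) {
                    hits.add(p);
                }
            }
        }
        return hits;
    }

    static double norm(double[] p) {
        return distance(p, new double[p.length]);
    }

    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }
}
```

Points with a similar norm can still be far apart, which is why the second, exact check is needed.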
Picture – Composed Searching
AbstractSearcher imageSearcher = new ImageSearchEngine();
imageSearcher.setQueryProcessor(new QueryProcessorDefaultParallelImpl());
imageSearcher.setIndexedObjectFactory(new ImageIndexedObjectFactory());
// Creating a combined searcher, with one sub-searcher per selected descriptor
for (Map.Entry<String, DescriptorExtractor> entry : availableDescriptors) {
    AbstractSearcher entrySearcher = new ParallelNBTreeSearcher();
    entrySearcher.setQueryProcessor(new QueryProcessorDefaultParallelImpl());
    // entry.getKey() is the descriptor name, entry.getValue() its extractor
    entrySearcher.setIndex(new BTreeIndex("image." + entry.getKey(), objClass));
    entrySearcher.setDescriptorExtractor(entry.getValue());
    imageSearcher.setSearcher("image." + entry.getKey(), entrySearcher);
}
searcher.setSearcher("image", imageSearcher);
Drawing Searcher
AbstractSearcher vectSearcher = new ParallelNBTreeSearcher();
vectSearcher.setQueryProcessor(new QueryProcessorDefaultParallelImpl());
vectSearcher.setIndex(new BTreeIndex("vectIndex", TopogeoDescriptor.class));
TopogeoDescriptorExtractor topogeoExtractor = new TopogeoDescriptorExtractor();
vectSearcher.setDescriptorExtractor(topogeoExtractor);
searcher.setSearcher("drawing", vectSearcher);
Performing a Search – Individual
CriteriaQuery<DrawingModel> criteriaQuery = cb.createQuery(DrawingModel.class);
Path<DrawingModel> modelPath = criteriaQuery.from(DrawingModel.class);
Path drawingPath = modelPath.get("drawing");
Similar similar = cb.similar(drawingPath, new IndexedObject(drawing));
CriteriaQuery query = cb.createQuery().where(similar).limit(20);
ResultSet<StringObject> results = searcher.search(query);
Query Building
Searching
Performing a Search – Combined
Similar simD = cb.similar(drawingPath, new IndexedObject(drawing));
Similar simP = cb.similar(picturePath, new IndexedObject(image));
And andPredicate = cb.and(simD, simP);
CriteriaQuery query = cb.createQuery().where(andPredicate).limit(20);
ResultSet<StringObject> results = searcher.search(query);
Query Building
Searching
Custom IndexedObject Factory
public class ImageIndexedObjectFactory extends AnnotatedIndexedObjectFactory {
    …
    protected List<IndexedObject> createObjects(List<IndexedField> indexedFields) {
        // create indexedObjects for the DrawingModel instances
        …
    }
    …
}
Custom Image Searcher
public class ImageSearchEngine extends AbstractSearcher<Long> {
    …
    protected List<IndexedObject> getIndexedObjects(Object o) throws IndexingException {
        // create different indexedObjects for the same image, and use
        // them in different individual searchers
    }
    public ResultSet search(Query query) {
        // create subqueries to perform search in the individual image
        // searchers
    }
    …
}
QUERY API
Some features of the Query API
Operators
• AND / OR
• EQUAL / NOT EQUAL
• SIMILAR
• NOT
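The boolean operators listed above compose naturally as a small expression tree. The sketch below shows that composition for AND, OR, NOT, EQUAL and NOT EQUAL over objects represented as string maps; it is not the EnContRA Query API, and QueryOperatorsSketch, Expr and the factory methods are hypothetical names. SIMILAR is left out because it ranks results rather than filtering them.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical composable query operators; not the EnContRA Query API. */
class QueryOperatorsSketch {
    /** A query expression that either matches an object or does not. */
    interface Expr {
        boolean matches(Map<String, String> object);
    }

    static Expr equal(String field, String value) {
        return object -> value.equals(object.get(field));
    }

    static Expr notEqual(String field, String value) {
        return not(equal(field, value));
    }

    static Expr and(Expr left, Expr right) {
        return object -> left.matches(object) && right.matches(object);
    }

    static Expr or(Expr left, Expr right) {
        return object -> left.matches(object) || right.matches(object);
    }

    static Expr not(Expr inner) {
        return object -> !inner.matches(object);
    }

    public static void main(String[] args) {
        Map<String, String> model = new HashMap<>();
        model.put("type", "drawing");
        model.put("author", "ana");
        // type = drawing AND NOT (author = rui)
        Expr query = and(equal("type", "drawing"), not(equal("author", "rui")));
        System.out.println(query.matches(model)); // prints "true"
    }
}
```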
Query Processors
• Cascade Processor
– Each sub-expression at a time
• Parallel Processor
– Optimization for sub-expressions like AND and OR
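The difference between the two processors can be sketched for an AND of sub-expressions: a cascade runs the sub-queries one after another, while a parallel processor submits them together and combines the results once all have finished. This is one plausible reading of the slide, not the actual EnContRA processors; QueryProcessorSketch, constant, cascadeAnd and parallelAnd are hypothetical, and each sub-query is modelled as a task returning a set of matching ids.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical cascade vs. parallel evaluation of an AND query; not part of EnContRA. */
class QueryProcessorSketch {
    /** A stand-in sub-query that simply returns a fixed set of matching ids. */
    static Callable<Set<String>> constant(String... ids) {
        return () -> new TreeSet<String>(Arrays.asList(ids));
    }

    /** Cascade: evaluate one sub-expression at a time, intersecting as we go. */
    @SafeVarargs
    static Set<String> cascadeAnd(Callable<Set<String>>... subQueries) {
        try {
            Set<String> result = null;
            for (Callable<Set<String>> q : subQueries) {
                result = intersect(result, q.call());
            }
            return result == null ? new TreeSet<String>() : result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Parallel: run all sub-expressions concurrently, then intersect their results. */
    @SafeVarargs
    static Set<String> parallelAnd(Callable<Set<String>>... subQueries) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, subQueries.length));
        try {
            Set<String> result = null;
            for (Future<Set<String>> f : pool.invokeAll(Arrays.asList(subQueries))) {
                result = intersect(result, f.get());
            }
            return result == null ? new TreeSet<String>() : result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    private static Set<String> intersect(Set<String> accumulated, Set<String> part) {
        if (accumulated == null) {
            return new TreeSet<String>(part);
        }
        accumulated.retainAll(part);
        return accumulated;
    }
}
```

Both variants return the same set; the parallel one only pays off when the individual sub-queries are expensive. An OR would use a union (addAll) in place of the intersection.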
DEMOS
Some demos developed during the project
Demos
Available at http://www.youtube.com/inevopt
Android Visual Search
Image & Vectorial Search
GETTING ENCONTRA
How to get EnContRA, and more documentation and support?
Checkout/Push Source Code
Checkout
git clone [email protected]:inevo/encontra.git
Commit and Push
git commit -m "+ Add: Texture Layout Descriptor."
git push
http://schacon.github.com/git/gittutorial.html
Contributing and Compiling Modules
mvn install: full build (compile, run tests, package, install into the local repository)
mvn package: compile, run tests, and package (no install)
mvn -DskipTests=true install: full build, skipping test execution to speed up
http://maven.apache.org/
Documentation & Support
• EnContRA 101 – Dev Tutorial (almost finished)
• Javadocs
• Source code
• People:
– Me [email protected]
– Tiago Cardoso [email protected]
– Nelson Silva [email protected]
The End