multilingual search system


Upload: manikant

Post on 28-Jan-2016


DESCRIPTION

Multilingual search system as part of Information Retrieval. The presentation deals with the implementation of a search system using Solr.

TRANSCRIPT

Multilingual Search System

Team name: SHIELD

Vamshi Krishna Padidela (50169645)

Manikant Manohar Kapuganti (50170071)

Pramod Rangaraju (50169514)

Sudheer Bondada (50170321)

Nikhil Ayyagari (50169485)

Introduction

In this project, we built a retrieval system powered by Solr to search within tweets.

The dataset comprises 11,000 tweets in multiple languages, collected through Twitter’s REST API. The tweets fall under two topics, ISIS and health, each with significant subtopics.

The UI for the search system is built on the Banana framework, which has powerful dashboard capabilities for visualizing big data analytics.

We have implemented the following components:

1. Content Tagging (Monolingual)

2. Faceted Search

3. Cross-Document Analytics

4. Topic Models and/or LSI

Content Tagging (Monolingual)

We implemented content tagging using Alchemy’s Entity Extraction API.

The Alchemy API identifies proper nouns (places, people, organizations) using Natural Language Processing.

The tags returned by the Alchemy API for each tweet are added to that tweet in an additional field, “tags”.

The new JSON file with the added “tags” field is re-indexed in Solr.

The tags give insight into metrics such as the popularity of a person or place over a period of time.
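The tagging step described above can be sketched roughly as follows. This is a minimal illustration, not the team's code: the entity extractor is passed in as a plain function, and `toy_extractor` is a hypothetical stand-in for the Alchemy call (it just keeps capitalized words). The tweet field names `id`, `text` and `lang` are also assumptions.

```python
import json
from typing import Callable, Dict, List


def add_tags(tweets: List[Dict], extract_entities: Callable[[str], List[str]]) -> List[Dict]:
    """Return copies of the tweets, each with an added "tags" field."""
    tagged = []
    for tweet in tweets:
        enriched = dict(tweet)  # shallow copy so the input list is left untouched
        enriched["tags"] = extract_entities(tweet.get("text", ""))
        tagged.append(enriched)
    return tagged


def toy_extractor(text: str) -> List[str]:
    """Hypothetical stand-in for the entity extraction service: keeps capitalized words."""
    words = (w.strip(".,!?#@") for w in text.split())
    return [w for w in words if w[:1].isupper()]


# Made-up sample tweets, for illustration only.
tweets = [
    {"id": "1", "text": "Aid convoy reaches Aleppo", "lang": "en"},
    {"id": "2", "text": "new flu vaccine trial starts in Berlin", "lang": "en"},
]

tagged = add_tags(tweets, toy_extractor)

# The enriched JSON is what would then be re-indexed in Solr.
print(json.dumps(tagged, indent=2))
```

Passing the extractor as a parameter keeps the merge logic independent of whichever tagging service is actually used.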

Results from Alchemy API’s content tagging

Tags for a search field

Tags displayed in order of most used

Faceted Search

Faceted search is available through the Banana framework, where a search can be narrowed by fields such as text, language and location.

Facets work like filters, with the addition of a document count for each value.

Faceted search helps build dashboards for various analytical purposes.

Faceted search is also called faceted browsing, faceted navigation, guided navigation and sometimes parametric search.
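As a rough sketch of how facets behave like filters with counts, the snippet below builds a Solr select URL using the standard `facet`, `facet.field` and `fq` parameters and reads the counts out of a response. The core name `tweets`, the field name `lang`, the host, and the sample response are all assumptions for illustration, not values taken from the project.

```python
from typing import Dict, List, Tuple
from urllib.parse import urlencode


def facet_query_url(base_url: str, query: str, facet_field: str, filters: List[str]) -> str:
    """Build a Solr select URL that returns facet counts for one field."""
    params = [
        ("q", query),
        ("rows", "0"),            # only the counts are needed, not the documents
        ("wt", "json"),
        ("facet", "true"),
        ("facet.field", facet_field),
    ]
    params += [("fq", f) for f in filters]  # each fq narrows the result set
    return base_url + "/select?" + urlencode(params)


def facet_counts(response: Dict, facet_field: str) -> List[Tuple[str, int]]:
    """Pair up Solr's default flat [value, count, value, count, ...] facet list."""
    flat = response["facet_counts"]["facet_fields"][facet_field]
    return list(zip(flat[0::2], flat[1::2]))


# Hypothetical core URL and field names.
url = facet_query_url("http://localhost:8983/solr/tweets", "text:vaccine", "lang", ["lang:en"])
print(url)

# Made-up response fragment in the shape Solr returns, for illustration only.
sample_response = {"facet_counts": {"facet_fields": {"lang": ["en", 120, "de", 45, "ru", 30]}}}
counts = facet_counts(sample_response, "lang")
print(counts)
```

The `fq` parameter does the filtering, while `facet.field` supplies the per-value document counts a dashboard panel would display.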

Facets and filters

Pie chart showing the geographical distribution

Cross Document Analytics

Distribution of tweets against time and location

Topic Models-LSI

We implemented Latent Semantic Indexing (LSI) on the collected data to demonstrate semantic search rather than keyword search.

Latent Dirichlet Allocation (LDA) is a probabilistic extension of the LSI technique.

LDA extracts collections of topics from the corpus.

LDA processes the tweets to find the topic distribution for each document and the document distribution for each topic.

The LDA algorithm is invoked on the vectors generated from the Sequence file.

We are using MALLET (Machine Learning for Language Toolkit) for topic generation. (Results pending)
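To make the two distributions mentioned above concrete, here is a small sketch that derives both from a table of topic assignment counts. It does not perform LDA inference (that is the job of a toolkit such as MALLET); the counts are made-up numbers and the names are hypothetical.

```python
from typing import List


def normalize(counts: List[float]) -> List[float]:
    """Scale non-negative counts so they sum to 1 (all zeros if the total is 0)."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


# Hypothetical token-to-topic assignment counts.
# Rows are documents (tweets), columns are topics.
doc_topic_counts = [
    [8, 2],
    [1, 9],
    [5, 5],
]

# Topic distribution for each document: normalize each row.
doc_topic = [normalize(row) for row in doc_topic_counts]

# Document distribution for each topic: normalize each column.
topic_doc = [normalize(list(col)) for col in zip(*doc_topic_counts)]

for i, dist in enumerate(doc_topic):
    print("document", i, "topic distribution:", dist)
for k, dist in enumerate(topic_doc):
    print("topic", k, "document distribution:", dist)
```

Reading the same count table by rows or by columns gives the two views: which topics a tweet is about, and which tweets contribute most to a topic.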

Search System UI – 1/2

Search System UI – 2/2

Thank You!!
