context sensitive ppt
-
8/13/2019 Context Sensitive Ppt
1/29
SPELL CHECKER: BASIC AND CONTEXT SENSITIVE
By:
K Satish Kumar (07131A0544)
B Krishna Chaitanya (07131A0547)
K Viswa Sai Raja (07131A0546)
-
SPELL CHECKER:
A spell checker is an application program that flags words in a document that may not be spelled correctly.
Our application provides a method for correcting misspelled and confused words in a phrase written in a natural language.
The application can offer several candidate words to replace the unrecognized word in the passage.
-
Basic Spell Checking
Errors that arise because the typed word is absent from the dictionary are known as non-word errors.
Errors of this kind can be detected and corrected using basic spell checking capabilities.
Examples: ths instead of this, spel instead of spell
-
BASIC SPELL CORRECTION APPROACH:
To perform basic spell checking, we first construct a trie containing all the words in a dictionary. Here a dictionary is simply a sequence of words stored in a text file.
Once the trie is constructed, the given text, which is either a single sentence or a group of sentences, is split into words. Every word is then looked up in the trie.
Any word that is not found is added to the misspelling list.
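The trie lookup described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the class name DictionaryTrie, its method names, and the tokenizing pattern are all assumptions, and the dictionary is a hard-coded array rather than a text file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Minimal trie sketch; class and method names are illustrative, not the project's own.
class DictionaryTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord; // true if the path from the root to this node spells a dictionary word
    }

    private final Node root = new Node();

    // Insert one dictionary word (stored in lower case).
    void insert(String word) {
        Node node = root;
        for (char c : word.toLowerCase(Locale.ROOT).toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.isWord = true;
    }

    // Return true only if this exact word was inserted earlier (case-insensitive).
    boolean contains(String word) {
        Node node = root;
        for (char c : word.toLowerCase(Locale.ROOT).toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return false;
            }
        }
        return node.isWord;
    }

    // Split the text into words and collect those that are missing from the trie.
    List<String> findMisspellings(String text) {
        List<String> misspelled = new ArrayList<>();
        for (String token : text.split("[^A-Za-z']+")) {
            if (!token.isEmpty() && !contains(token)) {
                misspelled.add(token);
            }
        }
        return misspelled;
    }

    public static void main(String[] args) {
        DictionaryTrie trie = new DictionaryTrie();
        for (String w : new String[] {"this", "is", "a", "spell", "checker"}) {
            trie.insert(w);
        }
        // Words absent from the small dictionary above end up in the misspelling list.
        System.out.println(trie.findMisspellings("Ths is a spel checker"));
    }
}
```

A prefix such as "thi" is not reported as a word because only nodes marked with isWord count as dictionary entries.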
Suggestions for the words in this list are generated using edit distance and phonetic distance criteria.
To provide suggestions based on phonetic distance, we use the DoubleMetaphone class from the commons-codec-1.3 package provided by the Apache Software Foundation.
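The edit distance criterion can be illustrated with a short sketch. The slides do not say which edit distance is used, so Levenshtein distance is assumed here; the class name EditDistanceSuggester, the threshold of one edit, and the tiny word list are likewise hypothetical. Phonetic matching is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of edit-distance suggestions; Levenshtein distance and the threshold are assumptions.
class EditDistanceSuggester {

    // Dynamic-programming Levenshtein distance (insert, delete, substitute each cost 1).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a to the first j characters of b
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Keep the dictionary words that lie within maxDistance edits of the misspelled word.
    static List<String> suggest(String misspelled, List<String> dictionary, int maxDistance) {
        List<String> suggestions = new ArrayList<>();
        for (String word : dictionary) {
            if (levenshtein(misspelled, word) <= maxDistance) {
                suggestions.add(word);
            }
        }
        return suggestions;
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("this", "the", "spell", "spelt", "tree");
        System.out.println(suggest("ths", dictionary, 1));
        System.out.println(suggest("spel", dictionary, 1));
    }
}
```

Scanning the whole dictionary is fine for a sketch; a real checker would likely restrict the candidates first.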
-
HOW OUR SPELL CHECKER IS DIFFERENT FROM A REGULAR SPELL CHECKER
INPUT: I saw TREI trees in the park
REGULAR SPELL CHECKER: I saw [ TREE | TREK ] trees in the park
-
INPUT: I saw TREE trees in the park
CONTEXT SENSITIVE SPELL CHECKER: I saw THREE trees in the park
-
Context Sensitive Spell Checking
Recently, research has focused on developing algorithms capable of recognizing a misspelled word, even if the word itself is in the vocabulary, based on the context of the surrounding words.
The detection and correction of spelling mistakes that result in real words of the target language, also known as real-word spell checking, is the most challenging task for a spell checking system.
The majority of spell checking systems are not able to catch errors such as the one in "Let us meat today" (meat was typed when meet was intended). Spell checking that catches such errors is known as context sensitive spell checking.
Indeed, empirical studies have estimated that errors resulting in valid words account for 25% to more than 50% of all errors, depending on the application.
-
Context Sensitive Spell Check Approach:
To perform context-based spell checking, we rely on a search engine. It can be Google, Yahoo!, Bing, or any other engine that exposes the search results of a query through an API.
Yahoo! provides an API through which we can issue an unlimited number of queries once we have registered with Yahoo! BOSS, so we use the search power of Yahoo!.
Yahoo! Search BOSS (Build your Own Search Service) is an initiative in Yahoo! Search to open up Yahoo!'s search infrastructure and enable third parties to build revolutionary search products leveraging their own data, content, technology, social graph, or other assets.
-
Context Sensitive Spell Check Approach:
In this project, we send requests to the Yahoo! BOSS web service to find a possible real-word error in the given sentence.
Consider the following sentence:
Let us meat today
This sentence is sent to the Yahoo! web server in the following forms, each with one word replaced by a wildcard:
* us meat today
Let * meat today
Let us * today
Let us meat *
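The wildcard forms of a sentence can be generated mechanically. The sketch below only builds the query strings; it does not call the Yahoo! BOSS service, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: build one wildcard query per word position. Names are illustrative.
class WildcardQueries {

    // Replace each word in turn with "*" and return the resulting query strings.
    static List<String> build(String sentence) {
        String[] words = sentence.trim().split("\\s+");
        List<String> queries = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            String[] masked = Arrays.copyOf(words, words.length);
            masked[i] = "*";
            queries.add(String.join(" ", masked));
        }
        return queries;
    }

    public static void main(String[] args) {
        for (String query : build("Let us meat today")) {
            System.out.println(query);
        }
    }
}
```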
-
HOW TO USE THE YAHOO SERVICE:
The Yahoo! web server returns the result count for each query sent. Based on the number of results received from the web server, we estimate which word in the given sentence is a possible real-word error.
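The slides do not state the exact rule used to turn result counts into a decision. One plausible rule, assumed here purely for illustration, is that the wildcard query with the largest result count marks the suspect word, since masking the wrong word leaves a common phrase. The class name and the counts below are made up.

```java
// Sketch of one possible decision rule; the rule and the counts below are assumptions.
class SuspectWordFinder {

    // resultCounts[i] is the result count of the query in which word i was replaced by "*".
    // Returns the index with the largest count (the first one on ties).
    static int suspectIndex(long[] resultCounts) {
        if (resultCounts.length == 0) {
            throw new IllegalArgumentException("at least one result count is required");
        }
        int best = 0;
        for (int i = 1; i < resultCounts.length; i++) {
            if (resultCounts[i] > resultCounts[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] words = {"Let", "us", "meat", "today"};
        long[] counts = {120L, 95L, 48000L, 300L}; // made-up numbers, not real search results
        System.out.println("Suspect word: " + words[suspectIndex(counts)]);
    }
}
```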
After the error has been detected, we generate suggestions based on features such as edit distance and phonetic distance.
During the testing phase of the spell checking application, however, we stored the most commonly confused word pairs, so that we need not compute the above features and can instead check directly against the word's most likely confused counterpart.
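A stored table of confused pairs could be consulted roughly as follows. This is a sketch under assumptions: the ConfusionPairs class, the rule of keeping the variant with the higher result count, and the stubbed count function (standing in for the search service) are all hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.ToLongFunction;

// Sketch of a confusion-pair lookup; names and the comparison rule are assumptions.
class ConfusionPairs {
    private final Map<String, String> pairs = new HashMap<>();

    // Register a pair in both directions, e.g. meat <-> meet.
    void addPair(String first, String second) {
        pairs.put(first, second);
        pairs.put(second, first);
    }

    // Return the confused counterpart if the sentence containing it has a strictly
    // higher result count than the original sentence; otherwise keep the original word.
    String correct(String[] words, int suspect, ToLongFunction<String> resultCount) {
        String original = words[suspect];
        String alternative = pairs.get(original.toLowerCase(Locale.ROOT));
        if (alternative == null) {
            return original; // no stored pair for this word
        }
        String[] replaced = words.clone();
        replaced[suspect] = alternative;
        long originalCount = resultCount.applyAsLong(String.join(" ", words));
        long alternativeCount = resultCount.applyAsLong(String.join(" ", replaced));
        return alternativeCount > originalCount ? alternative : original;
    }

    public static void main(String[] args) {
        ConfusionPairs confusion = new ConfusionPairs();
        confusion.addPair("meat", "meet");

        // Stub standing in for the search service; the counts are invented.
        Map<String, Long> fakeCounts = new HashMap<>();
        fakeCounts.put("Let us meat today", 40L);
        fakeCounts.put("Let us meet today", 52000L);

        String[] words = {"Let", "us", "meat", "today"};
        System.out.println(confusion.correct(words, 2, q -> fakeCounts.getOrDefault(q, 0L)));
    }
}
```

Passing the count lookup in as a function keeps the sketch independent of any particular search API.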
-
Yahoo BOSS Application ID
-
MAIL FEATURE:
JavaMail is a Java API used to receive and send email via SMTP, POP3 and IMAP. JavaMail is built into the Java EE platform, and is also available as an optional package for use in Java SE.
The JavaMail API provides a platform-independent and protocol-independent framework to build mail and messaging applications.
In our project, we give users the option to send the spell-checked text to their mail account. We use the JavaMail API to send the text content to the specified email address.
-
USE CASE DIAGRAM:
-
CLASS DIAGRAM:
-
ACTIVITY DIAGRAM:
-
SEQUENCE DIAGRAMS:
1. USER - APPLICATION:
-
2. USER - APPLICATION - WEBSERVICE:
-
3. USER - APPLICATION (MAIL):
-
SCREEN SHOTS:
-
OPEN FILE DIALOGUE:
-
OPENED DOCUMENT:
-
REPLACING MISSPELLINGS:
-
WHAT IF NO SUGGESTION FOUND:
-
ADD TO DICTIONARY:
-
Context Sensitive Spell Checking
-
MAIL FEATURE:
-
REQUEST DETAILS DIALOGUE:
-
Mail Received
-
CONCLUSION:
This spell checker can be used when we need rigorous checking of our text (for example, when sending a document to higher officials).
The Yahoo! BOSS API does not permit a large number of requests in a short time. For this reason we used a delay of 9 seconds between consecutive requests. This delay needs to be reduced.
Besides this, the result of our query depends greatly on the search results from the search engine. Sometimes the required pattern may not be found in the search results.
Future enhancements could therefore include using our own database. A database of about 200 GB can be built with the help of the Google trigram datasets; the sentence would be matched against the trigrams to find the central word, and suggestions would be offered based on features such as edit distance and phonetic distance.
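The trigram idea can be sketched with a small in-memory table in place of the proposed database. Everything here is illustrative: the TrigramTable class, its methods, and the counts are assumptions, and no real Google trigram data is used.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Sketch of a trigram lookup; an in-memory map stands in for the proposed database.
class TrigramTable {
    private final Map<String, Long> counts = new HashMap<>();

    private static String key(String left, String center, String right) {
        return (left + " " + center + " " + right).toLowerCase(Locale.ROOT);
    }

    // Store the frequency of one trigram.
    void put(String left, String center, String right, long count) {
        counts.put(key(left, center, right), count);
    }

    // Frequency of a trigram, or 0 if it is not in the table.
    long count(String left, String center, String right) {
        return counts.getOrDefault(key(left, center, right), 0L);
    }

    // Among the candidate central words, return the one whose trigram is most frequent.
    // On ties the earlier candidate is kept.
    String bestCenter(String left, String right, String... candidates) {
        if (candidates.length == 0) {
            throw new IllegalArgumentException("at least one candidate is required");
        }
        String best = candidates[0];
        long bestCount = count(left, best, right);
        for (int i = 1; i < candidates.length; i++) {
            long c = count(left, candidates[i], right);
            if (c > bestCount) {
                best = candidates[i];
                bestCount = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        TrigramTable table = new TrigramTable();
        table.put("us", "meet", "today", 5200L); // invented counts for illustration
        table.put("us", "meat", "today", 12L);
        System.out.println(table.bestCenter("us", "today", "meat", "meet"));
    }
}
```

The candidate central words would come from the edit distance and phonetic distance features mentioned above.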