Ministry of Higher Education & Scientific Research
University of AL-Qadisiyah
College of Computer Science & Information Technology
Department of Computer
Essay Question Assessment in E-learning
Undergraduate Graduation Project
A report submitted to the Department of Computer Science in partial fulfillment of the requirements for obtaining a
bachelor's degree in Computer Science and Information Technology / Computer Department.
College of Computer Science & Information Technology
University of AL-Qadisiyah
Under the supervision of Assistant Lecturer
Manar Joundy Hazar
2018 A.D. / 1440 A.H.
Mohammed Hassan Khudair Hussein Mohammed Jawad
Walaa Mohammed olaiwy Baneen Hussein Hameed
I
In the name of Allah, the Most Gracious, the Most Merciful
"And say: Work, for Allah will see your work, and (so will) His Messenger and the believers."
Allah Almighty has spoken the truth.
[Surat At-Tawbah: (105)]
II
Dedication
We began with more than one hand, endured many worries, and suffered many hardships; today, praise be to Allah, we fold away the weariness of the nights and the fatigue of the days.
To the beacon of knowledge, the unlettered Prophet, the master of creation, our noble Messenger Muhammad (peace and blessings be upon him).
To the inexhaustible spring of giving, who wove my happiness with threads from her heart: my dear mother.
To the one who spared no effort to set me on the path of success, who taught me to climb the ladder of life with wisdom and patience: my dear father.
To those whose love runs in my veins: my brothers and sisters.
To the companions with whom we walked hand in hand as we gathered the fruits of success and creativity.
To those who taught us letters of gold and words of pearls, whose knowledge is a beacon lighting the path of learning and success: our honorable professors.
III
Contents
The Verse ......................................................................................................... I
Dedication ...................................................................................................... II
Contents ......................................................................................................... III
Abstract ......................................................................................................... VI
Keywords ....................................................................................................... VI
Chapter One ..................................................................................................................... 1
1.1 Introduction ......................................................................................... 2
1.2 Problem Statement .............................................................................. 4
1.3 Importance of Study ............................................................................ 5
Chapter Two ..................................................................................................................... 6
2.1 Research Background ...................................................................................... 7
2.1.1 Essay questions ......................................................................................... 7
2.1.2 Criteria for writing essay questions .............................................................. 7
2.1.3 Advantages of essay questions ................................................................... 8
2.1.4 Disadvantages of essay questions ............................................................. 8
2.2 Related work ....................................................................................... 9
2.3 Automatic Essay Scoring Approaches ............................................... 11
Chapter Three ............................................................................................... 13
3.1 The Pre-Processing Layer ................................................................... 15
3.1.1 Open Natural Language Processing (OpenNLP) ....................................... 15
3.1.2 Sentence Detection ................................................................................... 16
3.1.3 Tokenization .............................................................................................. 16
IV
3.1.4 Name Finder ............................................................................................. 18
3.1.5 POS Tagger............................................................................................... 18
3.1.6 Stemming................................................................................................... 18
3.1.7 Porter algorithm.......................................................................................... 19
3.1.8 Chunking.................................................................................................... 19
3.1.9 Parsing ..................................................................................................... 19
3.1.10 Co-reference Resolution ......................................................................... 20
3.1.11 Stop words .............................................................................................. 20
3.2 Intermediate Processing Layer .......................................................... 20
3.2.1 WordNet ................................................................................................... 20
3.2.2 Summarization .......................................................................................... 22
3.2.3 The word ambiguity .................................................................................. 23
3.2.4 The adapted Lesk algorithm ..................................................................... 24
3.3 Post-Processing Layer ....................................................................... 24
Chapter Four ................................................................................................. 25
4.1 Metrics ................................................................................................ 26
4.2 Simulation Environment ...................................................................... 26
4.3 Results and Discussion ....................................................................... 27
4.3.1 Experiment 1-Evaluation of the algorithm with the human judgment ......... 27
4.3.2 Sentence Detection ................................................................................... 28
4.3.3 Tokenization .............................................................................................. 30
4.4 Conclusion ................................................................................................ 32
4.5 Future Work .............................................................................................. 32
References ..................................................................................................... 33
V
List of Tables
Table (1): First Test ......................................................................................... 27
Table (2): Proposed technique against human grading ................................... 28
Table (3): The result of the presented method against each measure using the
correlation metric ............................................................................................ 29
Table (4): RMSE measures ............................................................................ 31
List of Figures
Figure (1): Intelligent Tutoring System Main Components ............................... 2
Figure (2): Layers of the proposed framework ................................................ 14
Figure (3): Example of WordNet Structure ...................................................... 21
Figure (4): The Framework Interface ................................................................. 26
Figure (5): Manual student answer samples ................................................... 28
Figure (6): Comparison of knowledge-based and proposed method .............. 30
Figure (7): Suggested method Vs Corpus-based ............................................ 30
Figure (8): Suggested method Vs. Baseline .................................................... 30
VI
Abstract:
E-learning, the use of technology to support and promote learning, began decades ago.
Intelligent Tutoring Systems (ITSs) are a good example of exploiting intelligent software
agents in e-learning. Assessment plays a significant role in the educational process.
Automated Essay Scoring (AES) is defined as computer technology that evaluates and
scores subjective answers. Answers to essay questions are subjective, whereas answers to
multiple-choice or true/false questions are factual. The process of evaluating essays
automatically is therefore difficult, because it requires high accuracy in evaluating the
answers. In this project we present a model for assessing students' essay answers based
on linguistic knowledge. Simulation results of the proposed system indicate higher
precision than comparable methods. The efficiency of this framework was measured using
the Pearson correlation coefficient and the root mean square error (RMSE), based on the
assessment of student responses from the University of North Texas dataset, which is
available on the Internet.
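The two evaluation measures named above can be computed in a few lines. The sketch below is purely illustrative: the score lists are made-up numbers, not data from the North Texas dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between automatic and human score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rmse(xs, ys):
    """Root mean square error between the two score lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

human = [4.0, 3.5, 5.0, 2.0]   # made-up human grades
auto  = [3.8, 3.0, 4.5, 2.5]   # made-up automatic grades
print(round(pearson(human, auto), 3))  # 0.961
print(round(rmse(human, auto), 3))     # 0.444
```

Pearson correlation measures how well the automatic scores track the human ranking, while RMSE measures the average size of the scoring error.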
Keywords:
E-learning, WordNet, NLP, Semantic Similarity, Student Assessment, Essay Questions.
Page 1
Chapter One
Introduction
Page 2
1.1 Introduction
E-learning is an effective way of teaching using the Internet. With E-learning, you can
offer courses for students to study anytime, anywhere, as well as interact with them in an
easy and efficient way. E-learning has become one of the requirements of the educational process,
not only to keep pace with the rapid developments in learning organizations all over the
world, but also to play a real role in improving education and its outcomes [1].
E-Learning has become an important trend in recent years. In addition to providing
richer resources than the traditional classroom to facilitate learning, e-Learning also
overcomes the limitations of time and space of traditional teaching. E-learning allows
learners to learn independently, although this means it lacks the supervision and
enforcement mechanisms of traditional teaching [2], [3]. Intelligent Tutoring Systems (ITS)
are computer systems that seek to meet user needs, providing customized instruction and
feedback to individuals without human intervention. Using an ITS, students receive
customized learning materials and automatic feedback about correct performance and errors.
An ITS is adaptive in that it responds to learners with tasks or steps proportionate to each
learner's individual characteristics, requirements, or pace of learning.
Figure (1): Intelligent Tutoring System Main Components
Page 3
One of the most important aspects of the learning process is the assessment of the
knowledge acquired by the learner. In a typical classroom assessment (for example, a test,
an assignment, or an examination), the teacher or lecturer grades students' answers to the
relevant questions. However, in certain scenarios, such as distributed locations around
the world, online learning environments, and individual or group study sessions that occur
outside class, an instructor may not be readily available. In these cases, students still need
some assessment of their knowledge of the subject. Therefore, we must turn to Computer
Assisted Assessment (CAA). While some CAA forms do not require sophisticated
understanding of text (e.g. multiple choice or true/false), student answers consisting of
free text may require text analysis. Research has so far focused on two CAA sub-tasks:
grading essay responses, which includes verifying mechanics, grammaticality, and essay
coherence; and assessing short free-text student responses.
Assessment of learning outcomes with tests and examinations can use many types of
questions and methods of grading. Question types can range from multiple-choice
questions to questions that require natural language answers, such as short answers or
essays. The grading method may be either manual grading or automatic grading by
computational methods. In this project we focus on the short-answer question type and the
automatic grading method [4].
Many researchers argue that the subjective nature of essay evaluation leads to
disparities in the grades given by different human assessors, which students consider a
major source of injustice.
This problem may be addressed by adopting automated essay evaluation tools.
An automated evaluation system will at least be consistent in the way essays are
scored, and enormous cost and time savings can be achieved if the system can evaluate
essays within the score range granted by a human evaluator. Furthermore, according to
Hirst (2000), using computers to increase our understanding of the textual features and
cognitive skills involved in creating and understanding written texts will provide a number
of benefits to the educational community. It will also help us develop more effective
technologies, such as search engines and question-answering systems, to provide
universal access to electronic information [5].
Summit (an exam-preparation assistant) is a tool to evaluate students' essays based on
their content. It relies on a method of semantic text analysis called Latent Semantic
Analysis (LSA) [6]. Essentially, LSA represents each word of a text as a vector in a high-dimensional
Page 4
space, so that the proximity between two vectors is closely related to the semantic similarity
between the two words.
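The vector-proximity idea behind LSA can be illustrated with cosine similarity. Full LSA factorizes a term-document matrix with singular value decomposition; the sketch below skips SVD and uses raw, made-up co-occurrence counts only to show how proximity between vectors tracks relatedness.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Made-up co-occurrence counts: the contexts each word appears in.
plant = Counter({"grow": 3, "water": 2, "leaf": 2, "factory": 1})
tree  = Counter({"grow": 2, "water": 1, "leaf": 3})
car   = Counter({"road": 3, "engine": 2})

print(cosine(plant, tree) > cosine(plant, car))  # True: related words lie closer
```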
This change in how course content is delivered and how students interact with it is a major
departure from the traditional classroom course. Perhaps the most interesting aspect is the
actual location where the introduction of material and the deeper engagement with it take
place. Traditionally, the introduction is provided in the classroom through a lecture, and
deeper engagement is conducted outside the classroom through homework. In the
description above, the introduction takes place outside the classroom and the engagement
occurs within the classroom [7].
In this area, most researchers agree that some aspects of complex achievement are
difficult to measure with objective questions. Learning outcomes involving the ability to
recall, organize and integrate ideas, the ability to express oneself in writing, and the ability
to supply, rather than merely identify, interpretations and applications of data require less
structured responses than objective test items allow. In measuring such outcomes, at the
highest levels of Bloom's taxonomy (1956) (specifically evaluation and synthesis), essay
questions serve their most useful purpose [8].
1.2 Problem Statement
Unlike Multiple Choice Questions (MCQs), essays involve subjective answers
rather than the exact answers of MCQs (e.g. true or false) [9]. The student's skill and
ability play a large role in creating a strong answer free of misspellings and grammatical
errors, which will reduce the student's mark if present. Therefore, the process of automated
essay evaluation is a challenging task, because a comprehensive evaluation is needed
in order to validate the answers accurately [10].
The difference between, say, multiple-choice and short-answer questions is easy to
comprehend, but the difference between other question types, such as short answers and
essays, can become blurred. Therefore, we say that a short-answer question is one that
meets at least five specific criteria.
1. The question must require a response that recalls external knowledge.
2. The question must require a response given in natural language.
3. The answer length should be roughly between one phrase and one paragraph.
Page 5
4. The evaluation of the responses should focus on the content instead of the writing style.
5. The level of openness of open-ended versus close-ended responses should be restricted
by an objective question design [12].
1.3 Importance of Study
Much work has been conducted in the field of automatic grading, but existing systems are
mainly based on multiple-choice exams [9]. Such grading programs are not hard to build;
the difficulty lies rather in designing distractors that are close enough to the right answer
but still wrong. An alternative to multiple-choice tests is to ask students to write an essay
about what they know about a domain and then to compare that text to pre-graded texts.
Educators in Britain spend about 30% of their time assessing and grading students'
answers, which produces an estimated loss of 3 billion pounds per year. It is therefore
easy to imagine the benefits, such as economic savings and saved time, to be gained from
applying automated essay grading systems. Automated assessment of students' free-text
answers nevertheless presents several challenges [12].
Page 6
Chapter Two
Related Work
Page 7
2.1 Research Background
Automatic Essay Scoring (AES) has been proposed to assist teachers by providing an
automatic approach to scoring essays. Several techniques have been used for AES, in
which writing style, lexical analysis, semantic analysis, syntactic analysis and probabilistic
approaches have been examined as means of producing scores [13].
2.1.1 Essay questions:
This is the oldest pattern of test question, in common use since ancient times. These
questions allow the student to answer in the form of an essay formulated in his or her own
style; they usually require expressive or structured answers, giving students an opportunity
to express their ideas using their ability to create interrelated sentences. Such questions
have certain criteria that must be adhered to when they are written for exams. Here the
student's skill and ability play a large role in creating a strong answer free of misspellings
and grammatical errors, which will reduce the student's mark if present [14].
2.1.2 Criteria for writing essay questions:
1. Consider the situations in which these questions are suitable, such as a limited number
of students or the assessment of higher-level educational outcomes; their use should
depend on the situation, the purpose and the goal.
2. Develop a good plan during preparation, and adhere to the procedures and steps for
preparing the questions.
3. Choose appropriate, clear wording and forms for the question so that the student
understands it; examples of such formulas are: (discuss, explain, compare).
4. Avoid vague and open-ended formulations when developing the questions.
5. Ensure consistency between the required achievement and the nature of the questions.
6. Use words suited to the category and quality of the question, such as (compare, in terms
of, refute), and avoid formulas used for objective questions, such as where, what, and
when.
7. Make the questions cover all of the content and the goals students are expected to
achieve; prefer more essay questions, while paying attention to the time required for each
answer.
Page 8
8. Define a model answer for each question, to be adopted during marking, taking into
account the most important elements that must be mentioned for full marks. Also be careful
not to neglect partial answers; the model answer reduces and restricts the teacher's
freedom to accept whichever answer he wishes according to his mood [15].
2.1.3 Advantages of essay questions:
1. They allow students to select the right facts and ideas, and give them freedom to choose
and organize their answers.
2. They suit all the abilities and capacities of students, helping them to link their ideas and
information into an integrated, adequate whole.
3. They allow students to discover their abilities in finding solutions to problems by
employing the correct knowledge.
4. They allow students to express themselves in the way they want, which reveals their
culture and knowledge, as well as providing a fertile field to reveal their terminology and
information and confirm its validity, thus enabling the teacher to evaluate students based
on their stock of knowledge, abilities and skill in expression.
5. Preparing these questions requires little effort compared to objective questions [16,17].
2.1.4 Disadvantages of essay questions:
1. They do not allow the teacher to cover the entire curriculum, because few questions can
be asked given the long time each answer takes. The teacher cannot put many questions
in the exam while taking into consideration the abilities of the students and their capacity
to complete the exam within the specified time, nor set standards for all educational
outcomes.
2. They may not give the student the mark he deserves if the teacher marks arbitrarily,
according to his temperament, without taking into account the model answers, or because
he sees the correct answer as different from what the student answered; such questions
give the teacher the opportunity to intervene in the answer and its assessment.
3. They do not guarantee accuracy of marking, which is essential in the development of
questions [16,17].
Page 9
2.2 Related work
Several researchers have addressed the problem of automated essay scoring, also
called automatic essay assessment, using various techniques. The key characteristic
behind these techniques is a set of essays manually scored by humans, against which the
essay to be assessed is compared. Usually, the manually scored essays are called
pre-scored essays or training essays, whereas the essay to be assessed by the computer
is called the tested essay or the automatically scored essay [18].
The earliest system proposed for essay assessment is the Project Essay Grader (PEG),
which focused on the writing style of a given essay in order to produce a score. Writing
style concentrates on essay length and mechanics such as spelling errors, capitalization,
grammar and diction. This approach was criticized for its lack of semantic analysis, as the
content is ignored.
A question generation system was presented by Yao et al. based on the approach of
semantic rewriting. State-of-the-art deep linguistic parsing and generation tools are used
to map natural language sentences into meaning representations in the form of Minimal
Recursion Semantics (MRS) and vice versa. A principled way of generating questions is
thus obtained, which avoids ad-hoc manipulation of syntactic structures. Based on a
(partial) understanding of the sentence meaning, the system creates questions that are
semantically grounded and purposeful [19].
Graesser, Arthur C., et al. (2000) used several methods of computer scoring for essays
presented as test answers by a sample of students; one of these methods evaluates the
essays using a competency model based on pedagogical principles. One researcher
suggested a model for intelligently assessing students' test answers using a grammar
checker and providing feedback after evaluating the student's answer. Another student
assessment program proposes using a flow-control diagram to measure the
similarities [20].
Satav, Harshada, Trupti Nanekar, Supriya Pingale, and Nupur [21] presented an
examination system based on SQL and Microsoft .NET (C# and ASP.NET). They
implemented an examination system for computer applications. Their system has only
examination and evaluation subsystems. Moreover, their system included different types of
Page 10
questions related to computer applications, such as multiple choice, fill in the blank,
true/false, and programming design questions.
Ge Yu, Libin Hong, and Lei Sheng [22] developed a web-based examination and
evaluation system for computer programming. Their system has examination and
exercising subsystems for programming. Furthermore, it includes a question-preparation
subsystem for teachers to manage the set of exercises and questions. It also includes a
monitoring subsystem used to configure the exam settings. In addition, they provided an
examination subsystem that delivers the examination task. They used a fill-in-the-blank
evaluation method based on extracting the key words and matching them with the
answer key.
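The keyword-matching evaluation just described can be sketched in a few lines. `grade_blank` and its scoring rule are hypothetical illustrations of the general idea, not the cited system's code; real systems would add normalization, stemming and term weighting.

```python
def grade_blank(student_answer, keywords):
    """Score a fill-in-the-blank answer as the fraction of key words present.
    Hypothetical sketch of keyword matching against an answer key."""
    tokens = set(student_answer.lower().split())
    hits = sum(1 for kw in keywords if kw.lower() in tokens)
    return hits / len(keywords)

print(grade_blank("The CPU executes instructions", ["cpu", "instructions"]))  # 1.0
```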
Mohamed Jaballah, Saad Harous, and Sane M. Yagi [23] developed an Arabic
examination system for students at the University of Sharjah. Their system included only
examination and grading subsystems for different types of questions, such as true/false,
multiple choice, fill in the blank and essay questions. The exam paper is generated
automatically by the examination system. However, their grading subsystem was not
automated; grading is done manually by the teacher via a grading portal.
Chen Xiangjun and Wu Fangsheng [24] proposed an examination system which
provides login activity recording, user management, and test question management. It
includes an examination subsystem and a grading subsystem based on matching the
student answers with the answer key.
On the other hand, our WBSECIL includes an examination subsystem, a smart grading
subsystem, a homework submission subsystem, a smart discussion board subsystem, and
administration subsystems. In addition, it employs AI algorithms to smartly grade the
fill-in-the-blank questions by measuring sentence similarity.
Syntax-based Approach. In this approach, processing follows a common strategy for any
input sentence. This strategy is summarized in four basic steps:
1. Parsing the sentence to determine the syntactic structure: Sentence detection is the first
and most important pre-processing step in the question generation process. The sentence
detector is the main component of any natural-language processing framework, concerned
with splitting input text, whether a whole document, a paragraph or a sentence.
Page 11
2. Simplifying the sentence if possible: Sentence simplification is necessary because it
makes some aspects of question generation easier. This process uses one or more
simplification steps, including splitting sentences containing independent clauses,
appositive removal, prepositional phrase removal, discourse marker removal, and
relative clause removal. While simplification makes some aspects of question
generation easier, it also introduces new problems that must be handled, such as the level
of simplification required (applied separately or in a combined mode), and processing
different types of clauses (e.g. illative, concessive, conditional, consecutive, adjectival, or
adverbial).
2.3 Automatic Essay Scoring Approaches
In order to understand the mechanism of automated scoring systems, the approaches
used by previous research should be illustrated in detail. As mentioned earlier, automatic
essay scoring depends mainly on essays manually pre-scored by humans, against which
new tested essays are compared. The mechanism of such comparison is realized through
several approaches. One of the earliest is the writing-style approach, in which the
pre-scored essays are compared with the new tested essay in terms of number of
paragraphs, number of sentences and number of words. This is done by identifying the
pre-scored essay that shares the same writing-style characteristics as the tested essay;
the score of the most similar pre-scored essay is then assigned to the new tested essay.
For example, if the new tested essay contains five paragraphs, its automatic score will be
the same as that of a pre-scored essay that contains five paragraphs.
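A minimal sketch of this writing-style matching might look as follows; the feature set and the squared-distance rule are illustrative assumptions, not the actual PEG formulas.

```python
def style_features(essay):
    """Crude writing-style features: paragraph, sentence and word counts."""
    paragraphs = [p for p in essay.split("\n\n") if p.strip()]
    sentences = [s for s in essay.replace("!", ".").replace("?", ".").split(".")
                 if s.strip()]
    words = essay.split()
    return (len(paragraphs), len(sentences), len(words))

def style_score(tested, pre_scored):
    """Assign the tested essay the score of the pre-scored essay whose
    style features are closest (squared Euclidean distance)."""
    target = style_features(tested)
    def distance(pair):
        feats = style_features(pair[0])
        return sum((a - b) ** 2 for a, b in zip(target, feats))
    _, score = min(pre_scored, key=distance)
    return score

pre_scored = [("One sentence.", 2),
              ("First. Second. Third sentences here now.", 5)]
print(style_score("A. B. C words here too.", pre_scored))  # 5
```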
Other approaches aim to compare the pre-scored essays and the new tested essay
based on the content of the essays. Lexical analysis can be used to examine the lexical
similarity between their words. For example, if a pre-scored essay contains a particular
word such as 'plant' and the tested essay contains a derivation of the same word, such as
'planting', lexical analysis has the ability to identify this similarity. In addition, lexical
analysis is a useful approach for identifying the most frequent terms of the two essays; it is
then easy to measure the similarity between the frequent terms of the two essays [25].
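As a sketch of such lexical matching, the toy stemmer below (a stand-in for a real stemmer such as Porter's) strips a few suffixes and then measures word-set overlap; the suffix list and the Jaccard measure are illustrative choices, not the method of any cited system.

```python
def stem(word):
    """Tiny suffix stripper, an illustrative stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def lexical_similarity(a, b):
    """Jaccard overlap between the stemmed word sets of two texts."""
    sa = {stem(w.lower().strip(".,;")) for w in a.split()}
    sb = {stem(w.lower().strip(".,;")) for w in b.split()}
    return len(sa & sb) / len(sa | sb)

print(lexical_similarity("planting a tree", "plants and trees"))  # 0.5
```

Here 'planting', 'plants', 'tree' and 'trees' all reduce to shared stems, so the derivational similarity the text describes is detected even though the surface words differ.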
Page 12
Another approach used in essay scoring is semantic analysis, in which the similarity
between two essays can be computed based on the meaning of words, such as 'plant' and
'grass'. This capability is not provided by lexical analysis. However, to apply semantic
analysis, an external knowledge resource has to be provided, such as a dictionary,
thesaurus or lexicon. Some dictionaries, such as WordNet, are available, but WordNet
covers only the English language; this is a limitation for other languages such as Arabic.
Therefore, some researchers have devised semantic methods that do not require a
dictionary. These methods mainly depend on statistics, such as Latent Semantic Analysis
(LSA) and Distributional Semantic Co-occurrence (DISCO) [26].
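Dictionary-based semantic matching can be sketched with a miniature hand-made thesaurus; the synonym sets below are invented for illustration and are not taken from WordNet, whose real similarity measures traverse a full hypernym graph.

```python
# Miniature hand-made thesaurus standing in for a resource like WordNet;
# the synonym sets are invented for illustration only.
SYNSETS = [
    {"plant", "flora", "vegetation"},
    {"grass", "vegetation", "turf"},
    {"car", "automobile", "vehicle"},
]

def semantically_related(w1, w2):
    """Crude relatedness test: the words share a synset, or their synsets
    overlap (a one-step link such as plant - vegetation - grass)."""
    sets1 = [s for s in SYNSETS if w1 in s]
    sets2 = [s for s in SYNSETS if w2 in s]
    return any(s1 is s2 or s1 & s2 for s1 in sets1 for s2 in sets2)

print(semantically_related("plant", "grass"))  # True, via "vegetation"
print(semantically_related("plant", "car"))    # False
```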
Other researchers have focused on syntactic or grammatical analysis, in which verbs,
nouns and adjectives are analyzed along with their semantics. In addition, noun phrases
and verb phrases are separated in order to establish an independent comparison between
the pre-scored essay and the new tested essay.
Page 13
Chapter Three
Methodology
Page 14
A framework for the process of student evaluation in educational environments based on
linguistic knowledge is proposed in this chapter. The framework loads the student answer as
input, which is passed through the three layers illustrated in Figure (2).
Figure (2): Layers of the proposed framework
The suggested system is based on linguistic knowledge. Students' answers are
obtained electronically and compared to the ideal answer stored in the system. The
acquired answer is likewise assessed electronically.
Page 15
3.1 The Pre-Processing Layer:
This process is powered by the OpenNLP framework, an open-source statistical parser.
OpenNLP is used to process natural languages such as English. It performs the tasks of
sentence detection, tokenization, part-of-speech tagging, and detection of named entities
such as people, organizations, places, cities, countries and more.
The most common benefits of the statistical parser OpenNLP are [27,28]:
Simple and ready to use: every component of OpenNLP can be up and running after a
few simple steps.
Portable: the binaries are built for all platforms, which means no system configuration or
third-party dependencies are required.
Modular: unlike other NLP toolkits, which are often built in a monolithic architecture,
OpenNLP is built with a data-centric design so that modules can be picked and changed.
Efficient: piping the tokenizer (250K tokens per second), POS tagger and lemmatizer in one
process annotates over a thousand words per second. The Named Entity Recognition and
Classification (NERC) module annotates thousands of words per second.
Multilingual: annotations are currently offered for English, but other languages are being
added to the pipeline.
3.1.1 Open Natural Language Processing (OpenNLP).
The OpenNLP library is a machine learning based toolkit for the processing of natural
language text. It supports the most common NLP tasks, such as tokenization, sentence
segmentation, part-of-speech tagging, named entity extraction, chunking, and parsing. These
tasks are usually required to build more advanced text processing services. The library
contains several components, enabling one to build a full natural language processing pipeline.
These components include the sentence detector, tokenizer, name finder, part-of-speech
tagger, chunker, and parser. OpenNLP is not a complete application by itself; it is combined
with other software that drives the processing of the text [29, 30].
3.1.2 Sentence Detection
The OpenNLP Sentence Detector can detect whether a punctuation character marks the end of a sentence or not. In this sense, a sentence is defined as the longest whitespace-trimmed character sequence between two punctuation marks. The first and last sentences are exceptions to this rule: the first non-whitespace character is assumed to be the beginning of a sentence, and the last non-whitespace character is assumed to be a sentence end. The sample text below should be segmented into its sentences [29, 30].
Pierre Vinken, 61 years old, will join the board as a nonexecutive director
Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.
Rudolph Agnew, 55 years old and former chairman of Consolidated Gold Fields PLC,
was named a director of this British industrial conglomerate.
After detecting the sentence boundaries, each sentence is written on its own line.
Pierre Vinken, 61 years old, will join the board as a nonexecutive director
Nov. 29.
Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.
Rudolph Agnew, 55 years old and former chairman of Consolidated Gold
Fields PLC, was named a director of this British industrial conglomerate.
The pre-trained models on the Web site are trained to detect sentences before the text is tokenized. However, tokenization can take place first, and the sentence detector can then handle the tokenized text. The OpenNLP Sentence Detector cannot identify sentence boundaries based on the contents of the sentence [29, 31].
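OpenNLP's detector is a trained maximum-entropy model, but the punctuation-based definition above can be sketched in a few lines. The following Python sketch (not OpenNLP's actual implementation; the abbreviation list is a tiny hypothetical sample) illustrates why abbreviations such as "Mr." and "N.V." make the task non-trivial:

```python
import re

# Simplified illustration of sentence boundary detection. OpenNLP itself
# uses a trained maximum-entropy model; this rule-based sketch only shows
# the idea. The abbreviation list is a small hypothetical sample.
ABBREVIATIONS = {"Mr.", "Mrs.", "Dr.", "Nov.", "N.", "N.V."}

def detect_sentences(text):
    sentences, start = [], 0
    for match in re.finditer(r"[.!?]", text):
        end = match.end()
        # Skip boundaries that belong to a known abbreviation.
        last_word = text[start:end].split()[-1]
        if last_word in ABBREVIATIONS:
            continue
        sentences.append(text[start:end].strip())
        start = end
    if text[start:].strip():
        sentences.append(text[start:].strip())
    return sentences

print(detect_sentences("Mr. Vinken is chairman of Elsevier N.V., the Dutch "
                       "publishing group. Rudolph Agnew was named a director."))
```

Without the abbreviation check, the periods in "Mr." and "N.V." would incorrectly be treated as sentence ends; the trained model learns such distinctions from data instead of a fixed list.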
3.1.3 Tokenization
The OpenNLP Tokenizers segment an input character sequence into tokens. Tokens are usually words, punctuation, numbers, etc. [30]. Consider the sample text below:
Pierre Vinken, 61 years old, will join the board as a nonexecutive director
Nov. 29.
Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.
Rudolph Agnew, 55 years old and former chairman of Consolidated Gold
Fields PLC, was named a director of this British industrial
conglomerate.
The following result shows the individual tokens in a whitespace separated
representation.
Pierre Vinken , 61 years old , will join the board as a nonexecutive
director Nov. 29 .
Mr. Vinken is chairman of Elsevier N.V. , the Dutch publishing group .
Rudolph Agnew , 55 years old and former chairman of Consolidated Gold
Fields PLC , was named a director of this British industrial
conglomerate .
A form of asbestos once used to make Kent cigarette filters has caused a
high percentage of cancer deaths among a group of workers exposed to it
more than 30 years ago , researchers reported .
OpenNLP offers multiple tokenizer implementations [31, 32]:
Whitespace Tokenizer: non-whitespace sequences are identified as tokens.
Simple Tokenizer: a character-class tokenizer; sequences of the same character class are tokens.
Learnable Tokenizer: a maximum-entropy tokenizer; detects token boundaries based on a probability model.
Most part-of-speech taggers, parsers and so on work with text tokenized in this manner. It is important to ensure that your tokenizer produces tokens of the type expected by your later text processing components. With OpenNLP (as with many systems), tokenization is a two-stage process: first, sentence boundaries are identified, then the tokens within each sentence are identified [32].
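As a rough illustration of the difference between the first two tokenizers (a simplified sketch, not OpenNLP's actual code), the whitespace tokenizer keeps punctuation attached to words, while a character-class tokenizer splits it off:

```python
import re

def whitespace_tokenize(text):
    # Whitespace tokenizer: non-whitespace sequences are tokens.
    return text.split()

def simple_tokenize(text):
    # Character-class tokenizer sketch: runs of letters, runs of digits,
    # and individual punctuation characters each become a token.
    return re.findall(r"[A-Za-z]+|\d+|[^\w\s]", text)

print(whitespace_tokenize("Mr. Vinken is chairman."))
# -> ['Mr.', 'Vinken', 'is', 'chairman.']
print(simple_tokenize("Pierre Vinken, 61 years old."))
# -> ['Pierre', 'Vinken', ',', '61', 'years', 'old', '.']
```

The learnable tokenizer instead scores each candidate boundary with a probability model trained on annotated data, which is why its tokens match what downstream POS taggers and parsers expect.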
3.1.4 Name Finder
The Name Finder can detect named entities and numbers in text. To be able to detect entities, the Name Finder needs a model. The model depends on the language as well as the type of entity it was trained for.
OpenNLP offers a number of pre-trained name finder models that are trained on various freely available corpora. To find names in raw text, the text must first be segmented into tokens and sentences. It is important that the tokenization of the training data and of the input text be identical [33].
3.1.5 POS Tagger
Part-of-speech (POS) tagging aims to assign each word in a given text to one of a fixed set of parts of speech such as noun, verb, adjective, or adverb. Many words have several potential tags; the POS tagger is therefore used to disambiguate these words. Its primary role is to determine the exact tag for each word in context [34, 29].
3.1.6 Stemming:
Words with the same meaning appear in various morphological forms. To capture their similarity, they are normalized into a common root form; producing the stem involves suffix deletion. Suffixes are always added to the right side of the word (s, es, ed, ing, etc.). Automated word stemming plays an important role in information retrieval systems: it improves their performance by conflating a set of terms or words into a single term or keyword. For example, a group of words (treat, treats, treated, and treatment) can be reduced to a single root word (treat).
Furthermore, automatic suffix removal has a major effect on information retrieval system performance. It reduces the complexity and size of the data in information retrieval systems by reducing the total number of distinct words in the system. One of the most widely used algorithms for word stemming is the Porter stemming algorithm, which is described in the next section.
3.1.7 Porter algorithm
The Porter algorithm differs from the Lovins stemmer in two main ways. The first difference is a significant reduction in the complexity of the rules associated with suffix removal. The need for simplicity is evident in the Lovins algorithm, which contains no fewer than 294 suffixes, each associated with one of 29 context-sensitive rules determining when and how a suffix can be removed from the end of a word. The Porter algorithm is very simple in concept, with approximately 60 suffixes, two recoding rules and a single type of context-sensitive rule to determine whether a suffix removal should be made. Instead of rules based on the number of characters remaining after removal, Porter uses a minimum length based on the number of vowel-consonant sequences (the "measure", m) remaining after the removal. This idea, which can loosely be considered a count of syllables, was first studied by Dolby and Resnikoff (1964). A typical rule is therefore as follows:
(m > 0) FULNESS -> FUL
This means that the suffix FULNESS is replaced with the suffix FUL if, and only if, the resulting stem has a non-zero measure m.
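The measure m and the sample rule above can be sketched directly (a simplified illustration; the full Porter algorithm also treats 'y' specially and contains many more rules):

```python
def measure(stem):
    # Porter's measure m: the number of vowel-consonant sequences in the
    # stem, i.e. m in the form [C](VC)^m[V]. Simplified: 'y' handling,
    # which the real algorithm treats specially, is ignored here.
    vowels = set("aeiou")
    pattern = "".join("v" if ch in vowels else "c" for ch in stem.lower())
    return pattern.count("vc")

def apply_fulness_rule(word):
    # Porter rule (m > 0) FULNESS -> FUL: replace the suffix only if the
    # remaining stem has a non-zero measure.
    if word.endswith("fulness"):
        stem = word[: -len("fulness")]
        if measure(stem) > 0:
            return stem + "ful"
    return word

print(apply_fulness_rule("hopefulness"))  # -> hopeful
print(apply_fulness_rule("fulness"))      # empty stem has m == 0 -> unchanged
```

The condition (m > 0) prevents over-stemming: a word like "fulness" itself is left intact because removing the suffix would leave nothing with a measurable vowel-consonant sequence.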
3.1.8 Chunking
Text chunking consists of dividing the text into syntactically related groups of words, such as noun phrases and verb phrases, but does not specify their internal structure or their role in the main sentence [35].
3.1.9 Parsing
parsing is the most important stage used during the question generation process. It
is useful to understand and extract rich information about wholesale syntax. Distribution
represents a text entity in a tree structure based on the use of grammatical rules and input
sentences.
Parsing is done in – down manner or Progressivemanner. Top-down parsers
construct the parse tree by the derivation of the input sentence from the root node S down
to the leaves [28]. The searcher algorithm of the parser will expand All trees that have S as
a mother, using information about possible trees under the root, will construct a parallel tree
analyzer. In bottom-up parsing, the parse tree construction process is started With the
words of the input and tries to build the trees of the word, again by applying grammar rules
simultaneously. The success of the parser depends on if the parser Succeeds in building a
tree rooted in the start code S that covers all inputs [29].
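A toy recursive-descent parser over a hypothetical three-rule grammar (the grammar, lexicon and sentence below are invented for illustration) shows the top-down strategy: the tree is expanded from S and matched against the input left to right.

```python
# Toy top-down (recursive-descent) parser for a tiny hypothetical grammar,
# illustrating how the parse tree is derived from the root S to the leaves.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
}
LEXICON = {"Det": {"the", "a"}, "N": {"student", "answer"}, "V": {"writes"}}

def parse(symbol, tokens, pos):
    # Try to derive `symbol` from tokens[pos:]; return (tree, next_pos) or None.
    if symbol in LEXICON:
        if pos < len(tokens) and tokens[pos] in LEXICON[symbol]:
            return (symbol, tokens[pos]), pos + 1
        return None
    for production in GRAMMAR[symbol]:
        children, p = [], pos
        for child in production:
            result = parse(child, tokens, p)
            if result is None:
                break
            tree, p = result
            children.append(tree)
        else:  # every child of this production matched
            return (symbol, children), p
    return None

result = parse("S", "the student writes the answer".split(), 0)
print(result)
```

The parse succeeds only when the returned position covers the whole input, which is exactly the success condition described above; a bottom-up parser would instead start from the words and combine them into larger constituents.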
3.1.10 Co-reference Resolution:
Co-reference resolution is the process of matching all references to the same entity in a document, regardless of the form of the reference. It usually matches a name, a full name or a pronoun. Some work has shown that co-reference resolution can be used to improve summarization systems that rely heavily on word-frequency features [27].
A simple example is the use of a pronominal reference. In the sentence "John will travel tomorrow; he bought the ticket yesterday", the word "he" refers to "John". Thus, if such references are counted together, the entity may be rated as more important. This type of analysis is not widely used in summarization systems due to performance and accuracy problems.
3.1.11 Stop words:
In any correct English sentence, many words are purely grammatical, meaning that they do not contribute to the meaning of the sentence. Words like "to" and "from" are stop words: they are necessary for the sentence to be well formed, but they do not carry its real meaning. For example, in the phrase "a fast car", the words "car" and "fast" are considered the basic part, while the stop word only describes the relationship between them. Stop words are very common and appear in most sentences. They can therefore inflate the overall similarity score, so the suggestion is to remove these words before comparing sentences.
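The removal step can be sketched as follows (a minimal illustration; the stop list below is a small hypothetical sample, since there is no single standard set):

```python
# Minimal illustration of stop-word removal before sentence comparison.
# The stop list is a small hypothetical sample; real systems use larger
# lists, and there is no single standard set.
STOP_WORDS = {"a", "an", "the", "to", "from", "for", "and", "at", "is", "of"}

def remove_stop_words(tokens):
    return [t for t in tokens if t.lower() not in STOP_WORDS]

print(remove_stop_words("a fast car is at the garage".split()))
# -> ['fast', 'car', 'garage']
```

Only content words survive, so two sentences that share mostly stop words no longer receive an inflated similarity score.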
3.2 Intermediate Processing Layer:
Intermediate processing is the core of the proposed method for assessing summaries. It applies semantic and grammatical information to summary assessment. First, the source text and the summary text are split into sets of sentences. Then the similarity between each sentence of the summary text and all sentences of the source text is determined using a combination of word-order similarity and semantic similarity. The maximum value is taken as the degree of similarity for the current summary sentence.
3.2.1 WordNet
WordNet is a lexical database containing English words, including word descriptions, synonyms, and semantic relationships between words. Words in WordNet are organized hierarchically using hyponymy and hypernymy, and words can easily be seen as concepts. In this way WordNet can be interpreted as a taxonomy.
The vocabulary of any language can be defined as a set of lemmas, each of which has one or more senses. If a lemma has more than one sense, it is polysemous. If two words share a sense, they are synonymous. In WordNet, synonyms are placed together in groups called synsets; two different senses of the same word are therefore placed in different synsets. Apart from words, a synset also has relationships to other synsets. These relationships are based on hyponymy, hypernymy, meronymy and holonymy. Since this project is about semantic similarity, our focus is on hyponymy and hypernymy. WordNet contains words from the four word classes noun, verb, adjective, and adverb. Except for comparisons between nouns and adjectives using attributes, there is no connection between the classes, so one cannot compare across different categories using WordNet. This means that the similarity procedure consists only of noun-to-noun comparisons, verb-to-verb comparisons, and so on.
Figure (3): Example of WordNet Structure
The availability of the WordNet database was an important starting point. The synset model is simple enough to provide a fundamental relationship between a concept and the corresponding words. Moreover, WordNet covers the entire English dictionary and provides an unusually large amount of conceptual differentiation. It is also particularly useful from a computational point of view because it has been developed for easy access to and traversal of its hierarchies.
Starting with WordNet, we chose a subset of the appropriate synsets to represent the concepts of interest [36].
The first decision we made in the mapping project was to settle on the relationships that would be used to map onto WordNet synsets. There are three possible relationships: synonymy, hypernymy, and instantiation. Some examples should illustrate these three relationships and their use in mapping onto WordNet synsets; consider, for instance, an entry in the WordNet noun database.
Formally, the semantic similarity is defined as follows:
sim_res(c1, c2) = max_{c in S(c1, c2)} IC(c)
where S(c1, c2) is the set of concepts that subsume both c1 and c2. A theoretical similarity measure that uses the same IC idea is due to Lin, who explains his definition of similarity: "The similarity between A and B is measured by the ratio between the amount of information needed to state the commonality of A and B and the information needed to fully describe what A and B are."
Formally, the definition above can be expressed by:
sim_lin(c1, c2) = 2 * sim_res(c1, c2) / (IC(c1) + IC(c2))
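Under the definitions above, the two measures can be sketched with a toy is-a hierarchy and hypothetical information-content values (the concept names and probabilities below are invented for illustration; real IC values are estimated from corpus frequencies):

```python
import math

# Hypothetical concept probabilities for illustration only; real IC values
# come from corpus frequency counts over a sense-tagged corpus.
prob = {"entity": 1.0, "vehicle": 0.05, "car": 0.01, "bicycle": 0.01}
# Subsumers of each concept in a tiny invented is-a hierarchy.
subsumers = {
    "car": {"car", "vehicle", "entity"},
    "bicycle": {"bicycle", "vehicle", "entity"},
}

def ic(c):
    # Information content: IC(c) = -log p(c).
    return -math.log(prob[c])

def sim_res(c1, c2):
    # Resnik: IC of the most informative common subsumer.
    common = subsumers[c1] & subsumers[c2]
    return max(ic(c) for c in common)

def sim_lin(c1, c2):
    # Lin: 2 * IC(lcs) / (IC(c1) + IC(c2)).
    return 2 * sim_res(c1, c2) / (ic(c1) + ic(c2))

print(round(sim_res("car", "bicycle"), 3))  # -> 2.996 (IC of "vehicle")
print(round(sim_lin("car", "bicycle"), 3))  # -> 0.651
```

Note that sim_lin is normalized to [0, 1]: a concept compared with itself scores exactly 1, whereas Resnik's measure is unbounded.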
3.2.2 Summarization
To produce a concise sentence, a deletion strategy is used to remove unnecessary information in the sentence from the source text. Unnecessary information includes trivial details about topics, such as examples, scenarios, or repetitions of information that is already stated [37]. The main task of the deletion strategy is to remove non-important information such as stop words and interpretations. Given two sentences, the concise sentence and the original sentence, let Ss be the concise sentence and Os the original sentence, and let Len(Os) denote the length (number of words) of Os while Len(Ss) denotes the length of Ss. The first rule of the deletion strategy is then:
Len(Ss) < Len(Os)
Sentence Partitioning: Each sentence must be divided into a list of words in order to measure the similarity. The sentence is tokenized after first eliminating the stop words. The stop words are the words that should be removed from a sentence before any natural language processing technique is applied. There is no single fixed set of stop words; it can be any set, but the most common choice is the set of functional words such as "the", "for" and "at". Finally, the separator characters between words are recognized and the word list is created.
3.2.3. Word sense ambiguity
The POS classes in WordNet are divided into synsets that contain synonyms. We have also seen that a word can be polysemous, that is to say, it has different senses. All of these senses will be included in the same class, but in different synsets. When calculating the degree of similarity between two words, it is necessary to make sure that the correct senses of the words are compared. This is done by disambiguating the sense in the given context. The algorithm used most often for word sense disambiguation is the Lesk algorithm, proposed by Michael E. Lesk in 1986 [38]. The algorithm uses word definitions to determine whether two words have something in common.
First, the adapted algorithm starts by specifying the context around the target word. For example, if the length of the context window is K, the target word will have K/2 words on its right side and K/2 words on its left side as the neighboring context. After that, all possible senses of the verbs and nouns of each word in the specified context are looked up and listed. Next, for each sense listed in WordNet, the algorithm collects the glosses of the synsets related to the target word via the hypernymy, hyponymy, synonymy, and meronymy relations. After gathering all possible gloss pairs, the adapted Lesk score is calculated by comparing the gloss pairs and computing the overlap between them. The overlap between two glosses is counted by finding the longest common word sequence between them; the degree of overlap depends on the length of that common sequence. Moreover, the length of the common sequence affects the overlap score through its squared value. Finally, after each pair of glosses is scored, the target word is assigned the sense with the highest overlap score.
3.2.4 The adapted Lesk algorithm
Before presenting the adapted version of the Lesk algorithm, let us look at the original Lesk algorithm. The task of finding the meaning of a word in a certain context is called word sense disambiguation, which is the responsibility of the Lesk algorithm. The original Lesk algorithm uses the dictionary definition of a word in order to find its meaning [39]. It counts the number of common words between the glosses of two words: the larger the number of common words, the more likely that sense is to be assigned to the word. The original Lesk algorithm compares the gloss/definition of each word to the glosses of all the other words in the sentence. The sense with the highest number of common words is then assigned to the word.
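The overlap scoring described above can be sketched as follows (a simplified illustration, assuming squared longest-common-sequence scoring; the glosses for "bank" below are invented, not WordNet's actual entries):

```python
def overlap_score(gloss1, gloss2):
    # Adapted Lesk scoring sketch: find the longest common word sequence
    # between two glosses and score it by its squared length.
    a, b = gloss1.lower().split(), gloss2.lower().split()
    best = 0
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            best = max(best, k)
    return best * best

def best_sense(target_gloss_by_sense, context_gloss):
    # Assign the sense whose gloss overlaps most with the context gloss.
    return max(target_gloss_by_sense,
               key=lambda s: overlap_score(target_gloss_by_sense[s],
                                           context_gloss))

senses = {  # hypothetical glosses for the noun "bank"
    "bank#1": "a financial institution that accepts deposits",
    "bank#2": "sloping land beside a body of water",
}
print(best_sense(senses, "the institution that accepts deposits and lends money"))
# -> bank#1
```

Squaring the length rewards long contiguous matches: a four-word overlap scores 16, far more than four scattered single-word overlaps would under the original Lesk counting.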
3.3 Post-Processing Layer
This layer displays the system's results to the user. It shows the similarity measurement as a grade for the user. We use the following equation to calculate the final score (FS) for any summary written by the student:
FS = 2 * Token_Match(X, Y) / (|X| + |Y|)
where Token_Match(X, Y) represents the number of word tokens matching between X and Y. The important point is that the score is based on all individual similarity values and therefore reflects their effect. The Dice coefficient in the equation gives the ratio of the number of matching tokens to the total number of tokens. Thus, a value higher than average matching will always be returned by the Dice coefficient, making it more optimistic [40, 41]. In this regard, a threshold must be specified in advance in order to select matching pairs.
Dice coefficient = 2 * |X ∩ Y| / (|X| + |Y|)
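The final-score computation can be sketched as a Dice coefficient over the two token sets (a minimal illustration; the example answers are shortened from Table (2)):

```python
def final_score(student_tokens, model_tokens):
    # FS = 2 * |matching tokens| / (|X| + |Y|): the Dice coefficient
    # computed over the two token sets.
    x, y = set(student_tokens), set(model_tokens)
    if not x and not y:
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))

student = ["simulate", "behavior", "of", "software", "product"]
model = ["simulate", "behavior", "of", "portions", "of", "desired",
         "software", "product"]
print(round(final_score(student, model), 3))  # -> 0.833
```

A score of 1.0 means the token sets match exactly and 0.0 means no overlap; in practice a threshold on this value decides whether a token pair counts as a match.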
Chapter four
Results and discussion
This chapter discusses in detail the experiments performed in the research and the simulation settings. We then conclude the strengths and weaknesses of the research and recommend some future work for its development.
4.1. Metrics
We evaluate our system by measuring its performance using the following metrics:
- Pearson's Correlation Coefficient (PCC): measures the correlation between two variables. Its value lies in [-1, 1], where 1, 0, and -1 mean a positive, no, and a negative correlation respectively.
- Root Mean Square Error (RMSE): a measure of the differences between assessed and actual values.
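Both metrics can be computed directly from paired score lists (a self-contained sketch; the score lists below are hypothetical, not the experiment's data):

```python
import math

def pearson(xs, ys):
    # Pearson correlation coefficient between two paired score lists.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rmse(xs, ys):
    # Root mean square error between assessed and actual values.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

system = [9.0, 9.5, 10.0, 8.0]   # hypothetical system scores
human = [9.0, 9.0, 10.0, 8.5]    # hypothetical human grades
print(round(pearson(system, human), 3))  # -> 0.892
print(round(rmse(system, human), 3))     # -> 0.354
```

A PCC near 1 and an RMSE near 0 both indicate that the automatic scores track the human grades closely, which is how the tables later in this chapter should be read.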
4.2 Simulation Environment
Presented system applied using Open NLP open source library, to presses text. It
provide all NLP tasks (tokenization, POS, chunking, parsing …). Interface of proposed
system was build using visual studio 2013
Figure (4): The Framework Interface
4.3. Results and Discussion
To evaluate the framework, we carried out three experiments. In the first experiment, we measured the performance of the system against human judgment on grading the students' answers. In the second experiment, we compared the performance of the framework with the human grading, and in the third experiment we compared it with other well-known or recently proposed methods.
4.3.1. Experiment 1 - Evaluation of the algorithm against human judgment
In the summer of 2017, a dataset of student responses was collected from University of Al-Qadisiyah students in a Web Programming course. It contains one essay question. Table (1) displays the question used in this study. Figure (5) shows sample answers of three students.
Table (1): First Test
Topic: Web Programming
Question: List the benefits of CSS?
Suggested method score: 9.65
Human scores: 9, 9, 10, 10, 10
Figure (5): Manual student answer samples
The answer essays were ranked by five humans by hand; all five manual graders are computer science lecturers. The mark given for each answer was in the range between 0 and 10. Three samples were taken, and the automatic score for every answer was then associated with the lecturers' scores.
4.3.2. Experiment 2
To evaluate our technique for grading student responses, its performance was measured against human grading to determine the similarity between the system's scores for the student responses and the human grades.
Table (2): Proposed technique against human grading (sample question, precise answer, and student responses with scores)
Question: What is the role of a prototype program in problem solving?
Answer: To simulate the behavior of portions of the desired software product.
Student 1: A prototype program is used in problem solving to collect data for the problem. (scores: 2, 1)
Student 2: It simulates the behavior of portions of the desired software product. (scores: 4, 4)
Student 3: To find problems and errors in a program before it is finalized. (scores: 3, 2)
Question: What are the main advantages associated with object-oriented programming?
Answer: Abstraction and reusability.
Student 1: Re-usability and ease of maintenance. (scores: 5, 5)
Student 2: Object-oriented programming allows programmers to use an object with classes that can be changed and manipulated while not affecting the entire object at once. (scores: 1, 1)
Student 3: Easier to debug; reusability. (scores: 2, 2)
We independently test one component of our overall grading system in the final stage. To assess the performance of the suggested technique, we must use an evaluation metric. Several evaluation metrics are commonly used in NLP applications. In our test, the evaluation is carried out between the proposed framework and the dataset using the correlation metric over the full data set. Table (3) shows the results of the proposed framework compared with each of these measures, and figures (6), (7) and (8) compare the (knowledge-based, corpus-based, baseline) measures to the proposed framework, whose result was higher than all of these measures.
Table (3): The result of the presented method against each measures using correlation metric
Measure Correlation
Knowledge-based measures
Shortest path 0.4412
Leacock&Chodorow 0.2232
Lesk 0.3631
Wu&Palmer 0.3365
Resnik 0.2521
Lin 0.3915
Jiang&Conrath 0.4498
Hirst&St-Onge 0.1960
Corpus-based measures
LSA BNC 0.4072
LSA Wikipedia 0.4285
ESA Wikipedia 0.4682
Baseline
tf*idf 0.3646
Suggested method 0.490052586
Figure (6): Comparison of knowledge-based and proposed method
4.3.3. Experiment 3
In particular, to evaluate our method on the data set against other approaches, we selected the following groups of methods: corpus-based and baseline.
Figure (7): Suggested method Vs Corpus-based
Figure (8): Suggested method Vs. Baseline
As shown in figures (6), (7) and (8), it is very clear that the suggested method achieved a higher correlation coefficient than all other methods. We can notice the similarity between the suggested method and human judgment.
Lastly, Table (4) shows the RMSE of the suggested method and some state-of-the-art systems, giving an indication of the performance of these ranking methods [42].
Table (4): RMSE measures
Measures RMSE
Lesk 1.033
JCN 1.021
HSO 1.037
PATH 1.028
RES 1.046
Lin 1.068
LCH 1.069
WUP 1.08
ESA 1.030
LSA 1.066
Tf * idf 1.084
suggested method 0.61
4.4 CONCLUSION
This project has offered an analysis and a discussion of the most recently presented assessment approaches in tutoring. It then defined a system that can grade student essays based on their content from different points of view. In our testing we observed an important correlation between human scores and the proposed method. This constitutes a major move away from traditional systems, which are specific to a particular type of question (multiple choice / true-false exams). The proposed technology can be used in distance learning programs, where students can connect to the system and easily submit essays. The proposed system can evaluate essay questions without the teacher needing to store any knowledge; all that is required is the text of the course. This work is significant for schools, teachers, and students, particularly in developing countries, since it saves the time and cost that would otherwise be spent on these education activities.
4.5 Future work
- Dealing with the Arabic language in computer systems is a great challenge because of its morphological, semantic and grammatical complexity; we therefore recommend that future work in this field evaluate subjects taught in Arabic.
- Essay questions may contain mathematical symbols and formulas, which is a big challenge that makes the text more difficult to analyze. Although the accuracy of the current proposed system on such material is not yet satisfactory, it is a first step toward dealing with this type of topic, and we therefore recommend that future work develop the system to handle symbols, mathematical formulas and equations.
References
1. Motiwalla, L. F. (2007). Mobile learning: A framework and evaluation. Computers &
education, 49(3), 581-596.
2. S. Sarhan, “Intelligent Tutoring System", M.Sc. Thesis, Dept. of Computer Science,
Faculty of Computers and Information, Mansoura University, 2009.
3. S. Sarhan, R. Bahgat, A. Tolba, "Rough-Neuro Model for Improving Student State
Diagnosis in Intelligent Tutoring System”, Egyptian Rough Computing Journal (ERCJ),
2009.
4. Valenti, S., Neri, F., & Cucchiarelli, A. (2003). An overview of current research on
automated essay grading. Journal of Information Technology Education: Research, 2,
319-330.
5. Bin, L., & Jian-Min, Y. (2011, September). Automated essay scoring using multi-
classifier fusion. In International Conference on Information and Management
Engineering (pp. 151-157). Springer, Berlin, Heidelberg.
6. Felder, R. M., Woods, D. R., Stice, J. E., & Rugarcia, A. (2000). The future of
engineering education II. Teaching methods that work. Chemical Engineering
Education, 34(1), 26-39.
7. Lage, M. J., Platt, G. J., & Treglia, M. (2000). Inverting the classroom: A gateway to
creating an inclusive learning environment. The Journal of Economic Education, 31(1),
30-43.
8. Ghosh, S. (2010). Online Automated Essay Grading System as a Web Based Learning
(WBL) Tool in Engineering Education. Web-Based Engineering Education: Critical Design
and Effective Tools: Critical Design and Effective Tools, 53.
9. Landauer, T. K. (2003). Automatic essay assessment. Assessment in education:
Principles, policy & practice, 10 (3), 295-308.
10. Burrows, S., Gurevych, I., & Stein, B. (2015). The eras and trends of automatic short
answer grading. International Journal of Artificial Intelligence in Education, 25(1), 60-
117.
11. Lemaire, B., & Dessus, P. (2001). A system to assess the semantic content of student
essays. Journal of Educational Computing Research, 24(3), 305-320.
12. Lenz, B., Wells, J., & Kingston, S. (2015). Transforming schools using project-based
deeper learning, performance assessment, and common core standards. John Wiley &
Sons.
13. Islam, M. M., & Hoque, A. L. (2010, December). Automated essay scoring using
generalized latent semantic analysis. In Computer and Information Technology (ICCIT),
2010 13th International Conference on (pp. 358-363). IEEE.
14. Heywood, J. (2000). Assessment in higher education: Student learning, teaching,
programmes and institutions (Vol. 56). Jessica Kingsley Publishers.
15. Arum, R., & Roksa, J. (2011). Academically adrift: Limited learning on college
campuses. University of Chicago Press.
16. Hermet, M., & Szpakowicz, S. (2006). Symbolic assessment of free text answers in a
second-language tutoring system.
17. N. Kang, E. M. van Mulligen, and J. A. Kors, “Comparing and Combining Chunkers of
Biomedical Text”, Journal of Biomedical Informatics, Vol. 44, No.2, 2011, pp. 354-360.
18. J. Kurs, M. Lungu, and O. Nierstrasz, “Top-Down Parsing with Parsing Contexts”, In
Proceedings of International Workshop on Smalltalk Technologies (IWST14), England,
2014, pp. 1-7.
19. Yao, X., Bouma, G., & Zhang, Y. (2012). Semantics-based question generation and
implementation. Dialogue & Discourse, 3(2), 11-42.
20. Graesser, A. C., Wiemer-Hastings, P., Wiemer-Hastings, K., Harter, D., Tutoring
Research Group, T. R. G., & Person, N. (2000). Using latent semantic analysis to
evaluate the contributions of students in AutoTutor. Interactive learning environments,
8(2), 129-147.
21. Baker, R. S., D'Mello, S. K., Rodrigo, M. M. T., & Graesser, A. C. (2010). Better to be
frustrated than bored: The incidence, persistence, and impact of learners’ cognitive–
affective states during interactions with three different computer-based learning
environments. International Journal of Human-Computer Studies, 68(4), 223-241.
22. Yang, G. L. (2009). U.S. Patent No. 7,555,713. Washington, DC: U.S. Patent and
Trademark Office.
23. Jaballah, M., Harous, S., & Yagi, S. M. (2008, April). UOS EASY EXAM Arabic
Computer-Based Examination System. In Information and Communication
Technologies: From Theory to Applications, 2008. ICTTA 2008. 3rd International
Conference on (pp. 1-5). IEEE.
24. Ho, Y. S., Sang, J., Ro, Y. M., Kim, J., & Wu, F. (Eds.). (2015). Advances in Multimedia
Information Processing--PCM 2015: 16th Pacific-Rim Conference on Multimedia,
Gwangju, South Korea, September 16-18, 2015, Proceedings (Vol. 9314). Springer.
25. Attali, Y., Bridgeman, B., & Trapani, C. (2010). Performance of a generic approach in
automated essay scoring. The Journal of Technology, Learning and Assessment,
10(3).
26. Bond, F., & Foster, R. (2013). Linking and extending an open multilingual wordnet. In
Proceedings of the 51st Annual Meeting of the Association for Computational
Linguistics (Volume 1: Long Papers) (Vol. 1, pp. 1352-1362).
27. B. A. Galitsky, J. L. de la Rosa, and G. Dobrocsi,“ Inferring the Semantic Properties of
Sentences by Mining Syntactic Parse Trees”, Data & Knowledge Engineering, Vol. 81-
82, 2012, pp. 21-45.
28. B. A. Galitsky, “Transfer Learning of Syntactic Structures for Building Taxonomies for
Search Engines”, Engineering Applications of Artificial Intelligence, Vol. 26, Issue 10,
November 2013, pp. 2504-2515.
29. S. Anantpure, H. Jain, N. Alhat, S. Bhor, and S. Guru, “Literature Survey On Syntax
Parser For English Language Using Grammar Rules ˮ , International Journal of Advance
Foundation and Research in Computer (IJAFRC), Vol. 2, Special Issue (NCRTIT 2015),
January 2015, pp. 327-333.
30. B. A. Galitsky, J. L. de la Rosa, and G. Dobrocsi,“ Inferring the Semantic Properties of
Sentences by Mining Syntactic Parse Trees”, Data & Knowledge Engineering, Vol. 81-
82, 2012, pp. 21-45.
31. B. Galitsky, “Machine Learning of Syntactic Parse Trees for Search and Classification
of Text”, Engineering Applications of Artificial Intelligence, Vol. 26, No. 3, 2013, pp.153-
172.
32. S. W. Tu, M. Peleg, S. Carini, M. Bobak, J. Ross, D. Rubin, and I. Sim, “A Practical
Method for Transforming Free-Text Eligibility Criteria into Computable Criteria”, Journal
of Biomedical Informatics, Vol. 44, Issue 2, 2011, pp. 239–250.
33. B. A. Galitsky, “Transfer Learning of Syntactic Structures for Building Taxonomies for
Search Engines”, Engineering Applications of Artificial Intelligence, Vol. 26, Issue 10,
November 2013, pp. 2504-2515.
34. G. Wilcock, “Text Annotation with OpenNLP and UIMA”, Proceedings of the 17th Nordic
Conference of Computational Linguistics, University of Southern Denmark, Odense,
2009, pp. 7-8.
35. C. Arora, M. Sabetzadeh, L. Briand, F. Zimmer, and R. Gnaga, “Automatic Checking of
Conformance to Requirement Boilerplates via Text Chunking: An Industrial Case Study
ˮ, 7th ACM/IEEE International Symposium on Empirical Software Engineering and
Measurement (ESEM ), USA, 2013, pp.35-44.
36. Tufis, D., Cristea, D., & Stamou, S. (2004). BalkaNet: Aims, methods, results and
perspectives. a general overview. Romanian Journal of Information science and
technology, 7(1-2), 9-43.
37. Brown, A. L., & Day, J. D. (1983). Macrorules for summarizing texts: The development
of expertise. Journal of verbal learning and verbal behavior, 22(1), 1-14.
38. Agirre, E., & Edmonds, P. (Eds.). (2007). Word sense disambiguation: Algorithms and
applications (Vol. 33). Springer Science & Business Media.
39. Banerjee, Satanjeev, and Ted Pedersen. An Adapted Lesk Algorithm for Word Sense
Disambiguation Using WordNet. Tech. Duluth: University of Minnesota. Print
40. T. N. Dao and T. Simpson, "Measuring Similarity between sentences.," WordNet. Net,
Tech. Rep.,2005.
41. S. Banerjee, T. Pedersen, "An adapted Lesk algorithm for word sense disambiguation
using. WordNet.," Proceedings of the Third International Conference on Computational
Linguistics and Intelligent Text Processing, Springer Berlin Heidelberg., pp. 136-145,
2002.
42. M. Mohler, R. Bunescu, R. Mihalcea, "Learning to Grade Short Answer Questions using
Semantic Similarity Measures and Dependency Graph Alignments," Association for
Computational Linguistics, pp. 752–762, 2011.