essay question assessment in e-learning - product key...



Page 1: Essay question Assessment in E-learning - Product Key Freequ.edu.iq/repository/wp-content/uploads/2018/07/... · Essay question Assessment in E-learning Ander Graduating Project A

Ministry of Higher Education & Scientific Research

University of AL-Qadisiyah

College of Computer Science & Information Technology

Department of Computer

Essay question Assessment in E-learning

Undergraduate Graduation Project

A report submitted to the Department of Computer Science in partial fulfilment of the requirements for a bachelor's degree in computer science and information technology / Computer Department.

College of Computer Science & Information Technology

University of AL-Qadisiyah

Under the supervision of Assistant lecturer

Manar Joundy Hazar

2018 A.C 1440 A.H

Mohammed Hassan Khudair Hussein Mohammed Jawad

Walaa Mohammed olaiwy Baneen Hussein Hameed


In the name of God, the Most Gracious, the Most Merciful

"And say: Work, for God will see your deeds, and so will His Messenger and the believers."

God Almighty has spoken the truth

[Surat At-Tawbah: 105]


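Dedication

We began with more than one hand, bore more than one worry, and endured many difficulties, and here we are today, praise be to God, folding away the sleepless nights and the weariness of the days, with the summary of our journey lying between the covers of this humble work.

To the beacon of knowledge, the chosen Prophet, to the master of creation, our noble Messenger Muhammad, peace and blessings be upon him.

To the spring that never tires of giving, to the one who wove my happiness with threads spun from her heart, to my dear mother.

To the one who strove and toiled so that I might enjoy comfort and ease, who spared nothing to push me along the road to success, and who taught me to climb the ladder of life with wisdom and patience, to my dear father.

To those whose love runs in my veins and whose memory my heart cherishes, to my sisters and brothers.

To those with whom we walked side by side as we paved the way together toward success and creativity, to those with whom we joined hands as we gathered the flower of our learning, to my friends and colleagues.

To those who taught us letters of gold, words of pearl, and the loftiest and clearest expressions of knowledge, to those who shaped letters for us from their knowledge and made of their thought a beacon lighting our path of learning and success, to our honourable teachers.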


Contents

Quranic Verse
Dedication
Contents
Abstract
Keywords
Chapter One
1.1 Introduction
1.2 Problem Statement
1.3 Importance of Study
Chapter Two
2.1 Research Background
2.1.1 Essay questions
2.1.2 Criteria for writing essay questions
2.1.3 Advantages of essay questions
2.1.4 Disadvantages of essay questions
2.2 Related work
2.3 Automatic Essay Scoring Approaches
Chapter Three
3.1 The Pre-Processing Layer
3.1.1 Open Natural Language Processing (OpenNLP)
3.1.2 Sentence Detection
3.1.3 Tokenization
3.1.4 Name Finder
3.1.5 POS Tagger
3.1.6 Stemming
3.1.7 Porter algorithm
3.1.8 Chunking
3.1.9 Parsing
3.1.10 Co-reference Resolution
3.1.11 Stop words
3.2 Intermediate Processing Layer
3.2.1 WordNet
3.2.2 Summarization
3.2.3 The word ambiguity
3.2.4 The adapted Lesk algorithm
3.3 Post-treatment Layer
Chapter Four
4.1 Metrics
4.2 Simulation Environment
4.3 Results and Discussion
4.3.1 Experiment 1: Evaluation of the algorithm against human judgment
4.3.2 Sentence Detection
4.3.3 Tokenization
4.4 Conclusion
4.5 Future work
References


List of Tables

Table (1): First Test
Table (2): Proposed technique against human grading
Table (3): The result of the presented method against each measure using the correlation metric
Table (4): RMSE measures

List of Figures

Figure (1): Intelligent Tutoring System Main Components
Figure (2): Layers of the proposed framework
Figure (3): Example of WordNet Structure
Figure (4): The Framework Interface
Figure (5): Manual student answer samples
Figure (6): Comparison of knowledge-based and proposed method
Figure (7): Suggested method vs. Corpus-based
Figure (8): Suggested method vs. Baseline


Abstract:

E-learning, the use of technology to support and promote learning, began decades ago. Intelligent Tutoring Systems (ITSs) are a good example of employing intelligent software agents in e-learning. Assessment plays a significant role in the educational process. Automated Essay Scoring (AES) is defined as the computer technology that evaluates and scores subjective answers. Answers to essay questions are subjective, whereas answers to multiple-choice and true/false questions are factual. Automatic essay evaluation is therefore difficult, because it demands high accuracy in evaluating answers. In this project we present a model that assesses students' essay answers based on linguistic knowledge. Simulation results indicate that the proposed system achieves high precision compared with other methods. The efficiency of the framework was measured using the Pearson correlation coefficient and the root mean square error (RMSE) on student responses from the University of North Texas data set, which is available on the Internet.

Keywords:

E-learning, WordNet, NLP, Semantic Similarity, Student Assessment, Essay Question.


Chapter One

Introduction


1.1 Introduction

E-learning is an effective way of teaching over the Internet. With e-learning, instructors can offer courses that students study anytime and anywhere, and can interact with them easily and efficiently. E-learning has become a requirement of the educational process, not only to keep pace with the rapid developments in learning organizations all over the world, but also because it plays a real role in improving education and its outcomes [1].

E-learning has become an important trend in recent years. In addition to providing richer resources than the traditional classroom, e-learning overcomes the limitations of time and space of traditional teaching. E-learning allows learners to study independently, which also means that it lacks the supervision and enforcement mechanisms of traditional teaching [2], [3]. Intelligent Tutoring Systems (ITSs) are computer systems that seek to meet users' needs by providing customized instruction and responses to individuals without human intervention. Using an ITS, students receive customized learning materials and automatic feedback about their correct performance and their errors. An ITS is adaptive in that it responds to learners with tasks or steps suited to their individual characteristics, requirements or pace of learning.

Figure (1): Intelligent Tutoring System Main Components


One of the most important aspects of the learning process is the assessment of the knowledge acquired by the learner. In a typical classroom assessment (for example, a quiz, an assignment, or an examination), the teacher or a grader gives students feedback on their answers to questions on the subject. However, in certain scenarios, such as some locations around the world, online learning environments, and individual or group study sessions held outside class, an instructor may not be readily available. In these cases, students still need some assessment of their knowledge of the subject, so we must turn to Computer Assisted Assessment (CAA). While some forms of CAA do not require sophisticated understanding of the text (e.g. multiple choice or true/false), there are also student answers consisting of free text that may require text analysis. Research has so far focused on two CAA sub-tasks: grading essay responses, which includes checking style, grammaticality and coherence, and assessing short student answers.

Assessment of learning outcomes with tests and examinations can be supported by many question types and grading methods. Question types range from multiple-choice questions to questions that require natural language answers, such as short answers or essays. The grading method may be either manual grading or automatic grading by computational methods. In this report we focus on the short answer question type and the automatic grading method [4].

Many researchers argue that the subjective nature of essay evaluation leads to disparities in the grades given by different human assessors, which students consider a major source of unfairness.

This problem can be addressed by adopting automated essay evaluation tools. An automated evaluation system will at least be consistent in the way it scores essays, and enormous cost and time savings can be achieved if the system can be shown to grade essays within the range of grades awarded by human assessors. Furthermore, according to Hirst (2000), using computers to increase our understanding of the textual features and cognitive skills involved in creating and understanding written texts will provide a number of benefits to the educational community. It will also help us develop more effective technologies, such as search engines and question-answering systems, to provide universal access to electronic information [5].

Summit (an assistant for exam preparation) is a tool that evaluates students' essays based on their content. It relies on a semantic text analysis method called latent semantic analysis (LSA) [6]. Essentially, LSA represents each word of a text as a vector in a high-dimensional space, so that the proximity between two vectors is closely related to the semantic similarity between the two words.
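The proximity idea can be illustrated with a minimal sketch. It is not an implementation of LSA: in LSA the vectors are derived from a term-document matrix, whereas the three-dimensional vectors below are made-up values chosen only for illustration, and the function name `cosine_similarity` is our own. Cosine similarity is one common way to measure how close two word vectors are.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_u * norm_v)


# Hypothetical word vectors (illustrative values, not produced by LSA).
word_vectors = {
    "plant": [0.9, 0.1, 0.0],
    "grass": [0.8, 0.2, 0.1],
    "exam": [0.0, 0.1, 0.9],
}

related = cosine_similarity(word_vectors["plant"], word_vectors["grass"])
unrelated = cosine_similarity(word_vectors["plant"], word_vectors["exam"])
print(related > unrelated)
```

With these assumed vectors, the related pair scores higher than the unrelated pair, which is the behaviour the paragraph describes.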

This change in how course content is delivered and how students interact with it is a major departure from the traditional classroom. Perhaps the most interesting shift is in where the introduction to the material and the deeper engagement with it take place. Traditionally, the introduction is provided in the classroom through a lecture, and deeper engagement happens outside the classroom through homework. In this model, the introduction takes place outside the classroom and the deeper engagement occurs within it [7].

Most researchers in this area agree that some aspects of complex achievement are difficult to measure with objective questions. Learning outcomes that involve the ability to recall, organize and integrate ideas, the ability to express oneself in writing, and the ability to supply rather than merely identify the interpretation and application of data require a less structured response than objective test items impose. It is in measuring such outcomes, which correspond to the highest levels of Bloom's taxonomy (1956), namely evaluation and synthesis, that the essay question serves its most useful purpose [8].

1.2 Problem Statement

Unlike multiple-choice questions (MCQs), whose answers are exact (e.g. true or false), essay questions have subjective answers [9]. The student's skill and ability therefore play a large role in producing a strong answer free of the misspellings and grammatical errors that would otherwise reduce the student's mark. Consequently, automated essay evaluation is a challenging task, because a comprehensive evaluation is needed to validate the answers accurately [10].

The difference between, say, multiple-choice and short answer questions is easy to comprehend, but the difference between other question types, such as short answers and essays, can become blurred. We therefore say that a short answer question is one that meets at least five specific criteria:

1. The question must require a response that recalls external knowledge.

2. The question must require a response given in natural language.

3. The answer length should be roughly between one phrase and one paragraph.

4. The assessment of the responses should focus on the content rather than the writing style.

5. The level of openness in open-ended versus close-ended responses should be restricted with an objective question design [12].

1.3 Importance of Study

Much work has been conducted in the field of automatic grading, but the systems are mainly based on multiple-choice exams [9]. Such grading programs are not hard to build. The difficulty lies rather in designing the answer options, which should be close enough to the right answer but still wrong. An alternative to multiple-choice tests is to ask the student to write an essay about what he or she knows about a domain and then to compare that text with pre-graded texts.

Educators in Britain spend about 30% of their time assessing and grading students' answers, at an estimated cost of 3 billion pounds per year. Considerable benefits, such as economic savings and time savings, can therefore be expected from automated essay grading systems. Automated assessment of students' free-text answers nevertheless poses several challenges [12].


Chapter Two

Related Work


2.1 Research Background

Automatic Essay Scoring (AES) is a field of study proposed to assist teachers by providing an automatic approach to scoring essays. Several techniques have been used for AES, in which writing style, lexical analysis, semantic analysis, syntactic analysis and probabilistic approaches have been examined as bases for assigning scores [13].

2.1.1 Essay questions:

This is the oldest common type of test question, in use since ancient times. These questions allow the student to answer in the form of an essay written in his or her own style. They usually require expressive or structured answers, giving the student an opportunity to express ideas by composing interrelated sentences. Such questions have certain criteria that must be followed when they are written for exams. Here the student's skill and ability play a large role in producing a strong answer free of the misspellings and grammatical errors that would otherwise reduce the student's mark [14].

2.1.2 Criteria for writing essay questions:

1. Consider the settings in which essay questions are appropriate, such as a limited number of students or learning outcomes confined to the higher levels; the choice must depend on the situation, the purpose and the goal.

2. Develop a good plan during preparation, and follow the procedures and steps for preparing the questions. Choose clear and appropriate words and forms for each question (for example: discuss, explain, compare), so that the student understands it and can therefore answer correctly.

3. Avoid vague and open-ended wording when developing questions.

4. Ensure consistency between the required achievement and the nature of the questions.

5. Use words that suit the category and type of the question, such as compare, in terms of, or refute, and avoid forms typical of objective questions, such as where, what, and when.

6. Make sure the questions cover all of the content and the goals that students are expected to achieve, which means including more essay questions, while paying attention to the time required for each answer.

7. Define a model answer for each question, to be used during marking, specifying the most important elements that must be mentioned to earn the full mark. Also take care to include the sub-answers; this limits the teacher's latitude to accept whatever answer he or she sees as correct according to mood [15].

2.1.3 Advantages of essay questions:

1. They allow students to select the right facts and ideas, and leave them largely free to choose and organize their answers.

2. They suit all of the student's abilities and help the student link ideas and information into an integrated and adequate answer.

3. They allow students to discover their ability to find solutions to problems by applying the correct knowledge.

4. They allow the student to express himself or herself freely, which reveals the student's culture and knowledge, as well as the student's vocabulary and information and its validity. This gives the teacher the ability to evaluate the student based on that store of knowledge, ability and skill in expression.

5. The number of these questions is small compared with objective questions [16,17].

2.1.4 Disadvantages of essay questions:

1. They do not allow the teacher to cover the entire curriculum, because few questions can be asked given the long time each takes to answer. The teacher cannot include many questions in the exam, out of consideration for the students' abilities and their capacity to complete the exam within the specified time, so the exam does not measure all learning outcomes.

2. They may deny the student a fair mark if the teacher marks arbitrarily, according to mood and without reference to the model answers, or because the teacher's view of the correct answer differs from what the student wrote, especially as such questions give the teacher room to interpret the answer and decide on it.

3. They do not guarantee accurate marking, which is essential in the design of questions [16,17].


2.2 Related work

Several researchers have addressed the problem of automated essay scoring, also called automatic essay assessment, using various techniques. The key characteristic of these techniques is their reliance on a set of essays scored manually by humans, against which the essay to be assessed is compared. The manually scored essays are usually called pre-scored essays or training essays, whereas the essay to be assessed by the computer is called the tested essay or automatically scored essay [18].

The earliest system proposed for essay assessment is the Project Essay Grader (PEG), which focused on the writing style of a given essay in order to provide the score. Writing style here covers essay length and mechanics such as spelling errors, capitalization, grammar and diction. This approach was criticized for its lack of semantic analysis: the content of the essay is ignored.

Yao et al. presented a question generation system based on semantic rewriting. State-of-the-art deep linguistic parsing and generation tools are used to map natural language sentences into their meaning representations, in the form of Minimal Recursion Semantics (MRS), and vice versa. This yields a principled way of generating questions that avoids ad hoc manipulation of syntactic structures. Based on a (partial) understanding of the sentence meaning, the system creates questions that are semantically grounded and purposeful [19].

Graesser, Arthur C., et al. (2000) describe several methods for computer grading of the essays given as test answers by a sample of students. One of these methods evaluates the essays using a competence model based on the educational material. One researcher proposed an approach in which an intelligent tutoring system assesses students' test answers using a grammar checker and provides feedback after evaluating each answer. Another researcher proposed a program for assessing student work that uses a control-flow diagram to measure similarity [20].

Satav, Harshada, Trupti Nanekar, Supriya Pingale, and Nupur [21] presented an examination system based on SQL and Microsoft .NET (C# and ASP.NET). They implemented the examination system for a computer application base course. Their system has only examination and evaluation subsystems. It includes different types of questions related to computer applications, such as multiple choice, fill in the blank, true/false, and programming design questions.

Ge Yu, Libin Hong, and Lei Sheng [22] developed a web-based examination and evaluation system for computer programming. Their system has examination and exercise subsystems for programming. It also includes a preparation subsystem with which teachers manage the set of exercises and questions, and a monitoring system used to configure the exam settings. In addition, they provided an examination system that delivers the examination task. They used a fill-in-the-blank evaluation method that separates the keywords and matches them against the answer key.
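A minimal sketch of keyword matching for a fill-in-the-blank answer is given below. It is not the implementation of [22], whose details are not given here; the function names `tokenize` and `keyword_match_score` and the scoring rule (fraction of answer-key keywords found in the student's answer) are assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_match_score(student_answer, answer_key_keywords):
    """Return the fraction of answer-key keywords present in the answer.

    Assumed scoring rule for illustration: each keyword counts once,
    and matching is exact after lower-casing.
    """
    answer_tokens = set(tokenize(student_answer))
    keywords = {k.lower() for k in answer_key_keywords}
    if not keywords:
        return 0.0
    matched = keywords & answer_tokens
    return len(matched) / len(keywords)


print(keyword_match_score("A stack is a LIFO data structure", ["stack", "lifo"]))
```

Exact matching of this kind cannot recognize synonyms or inflected forms, which is one reason content-based scoring turns to lexical and semantic analysis.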

Mohamed Jaballah, Saad Harous, and Sane M. Yagi [23] developed an Arabic examination system for students at the University of Sharjah. Their system included only examination and grading subsystems, covering question types such as true/false, multiple choice, fill in the blank and essay questions. The exam paper is generated automatically by the examination system. However, their grading subsystem was not automated: grading is done manually by the teacher via a grading portal.

Chen Xiangjun and Wu Fangsheng [24] proposed an examination system that provides login activity recording, user management and test question management. It includes an examination subsystem and a grading subsystem based on matching the students' answers against the answer key.

In contrast, our WBSECIL includes an examination subsystem, a smart grading subsystem, a homework submission subsystem, a smart discussion board subsystem, and administration subsystems. In addition, it employs AI algorithms to grade fill-in-the-blank questions intelligently by measuring sentence similarity.

Syntax-based approach. In this approach, processing follows a common strategy for any input sentence. This strategy can be summarized in four basic steps:

1. Parsing the sentence to determine the syntactic structure: Sentence detection is the first and most important pre-processing step in the question generation process. A sentence detector is a core component of any natural language processing framework; it splits the input text, whether a whole document, a paragraph or a single sentence, into sentences.

2. Simplifying the sentence if possible: Sentence simplification is necessary because it makes some aspects of question generation easier. This process uses one or more simplification steps, including splitting sentences that contain independent clauses, appositive removal, prepositional phrase removal, discourse marker removal, and relative clause removal. While simplification makes some aspects of question generation easier, it also introduces new problems that must be handled, such as the level of simplification required (separately or in a combined mode) and the processing of different types of clauses (e.g. illative, concessive, conditional, consecutive, adjectival, or adverbial).
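The sentence detection mentioned in step 1 can be sketched with a simple rule. This is only a rough illustration under an assumed rule (a sentence ends at '.', '!' or '?' when followed by whitespace and an upper-case letter); the function name `split_sentences` is hypothetical, and trained sentence detectors handle abbreviations and other cases that this rule gets wrong.

```python
import re


def split_sentences(text):
    """Split text into sentences using a simple punctuation rule.

    Assumed rule: a boundary is whitespace that follows '.', '!' or '?'
    and precedes an upper-case letter. Abbreviations such as 'e.g. Text'
    will be split incorrectly.
    """
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text.strip())
    return [part for part in parts if part]


for sentence in split_sentences("E-learning is growing. Assessment matters! Is it hard?"):
    print(sentence)
```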

2.3 Automatic Essay Scoring Approaches

To understand the mechanism of automated scoring systems, the approaches used in previous research should be described in detail. As mentioned earlier, automatic essay scoring depends mainly on essays that have been scored manually by humans, against which each new essay is compared. This comparison can be conducted using several approaches. One of the earliest is the writing-style approach, in which the pre-scored essays are compared with the new essay in terms of the number of paragraphs, sentences, and words. The pre-scored essay that shares the most writing-style characteristics with the tested essay is identified, and its score is assigned to the new essay. For example, if the new essay contains five paragraphs, its automatic score will be the same as that of the pre-scored essay that contains five paragraphs.
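The writing-style comparison can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of any cited system: the function names, the blank-line paragraph convention, and the use of a summed absolute difference as the distance are all assumptions made for the example.

```python
import re

def style_features(essay):
    """Return (paragraph count, sentence count, word count) for a plain-text essay.

    Assumption: paragraphs are separated by blank lines and sentences end
    with '.', '!' or '?'.
    """
    paragraphs = [p for p in essay.split("\n\n") if p.strip()]
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    words = essay.split()
    return (len(paragraphs), len(sentences), len(words))

def score_by_style(new_essay, prescored):
    """Return the score of the pre-scored essay whose style counts are closest.

    prescored is a list of (essay_text, score) pairs; closeness is the sum of
    absolute differences of the three counts (a hypothetical choice).
    """
    target = style_features(new_essay)

    def distance(item):
        features = style_features(item[0])
        return sum(abs(a - b) for a, b in zip(target, features))

    return min(prescored, key=distance)[1]
```

A call such as `score_by_style(text, [(essay_a, 8), (essay_b, 3)])` returns the score of whichever stored essay has the most similar counts, which also shows the weakness of the approach: the content of the essay is never inspected.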

Other approaches compare the pre-scored essays and the new essay based on their content. Lexical analysis can be used to examine the lexical similarity between their words. For example, if a pre-scored essay contains a particular word such as ‘plant’ and the tested essay contains a derived form such as ‘planting’, lexical analysis can identify the similarity. In addition, lexical analysis is a useful way to identify the most frequent terms of the two essays, which makes it easy to measure the similarity between those frequent terms [25].
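As a rough illustration of lexical matching, the sketch below reduces words to a crude stem and intersects the most frequent stems of two essays. The suffix list, the minimum stem length, and the function names are assumptions chosen for the example; a real system would use a proper stemmer.

```python
from collections import Counter

# Hypothetical, deliberately tiny suffix list; not a complete stemmer.
SUFFIXES = ("ing", "ed", "es", "s")

def crude_stem(word):
    """Strip the first matching suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def top_terms(text, n=2):
    """Return the n most frequent stems in the text."""
    stems = [crude_stem(w.strip(".,;:!?").lower()) for w in text.split()]
    counts = Counter(s for s in stems if s)
    return [term for term, _ in counts.most_common(n)]

def shared_frequent_terms(essay_a, essay_b, n=2):
    """Return the frequent stems that the two essays have in common."""
    return set(top_terms(essay_a, n)) & set(top_terms(essay_b, n))
```

With this sketch, 'planting' and 'plant' map to the same stem, so two essays that use different forms of the word still share a frequent term.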


Another approach used in essay scoring is semantic analysis, in which the similarity between two essays is determined from the meaning of words, such as ‘plant’ and ‘grass’. This capability is not provided by lexical analysis. However, semantic analysis requires an external knowledge resource such as a dictionary, thesaurus, or lexicon. Some resources, such as WordNet, are available, but WordNet covers only the English language, which is a limitation for other languages such as Arabic. Therefore, some researchers have proposed semantic methods that do not require a dictionary. These methods rely mainly on statistics; examples are Latent Semantic Analysis (LSA) and Distributional Semantic Co-occurrence (DISCO) [26].

Other researchers have focused on syntactic or grammatical analysis, in which verbs, nouns, and adjectives are analyzed together with their semantics. In addition, noun phrases and verb phrases are separated so that they can be compared independently between the pre-scored essay and the new essay.


Chapter Three

Methodology


This chapter proposes a framework for evaluating students in educational environments based on linguistic knowledge. The framework takes the student answer as input and passes it through three layers, illustrated in Figure (2).

Figure (2): Layers of the proposed framework

The suggested system is based on linguistic knowledge. Students' answers are obtained electronically and compared to the ideal answer stored in the system. The acquired answer is then assessed electronically.


3.1 The Pre-Processing Layer:

This layer is powered by the OpenNLP framework, an open-source statistical parser. OpenNLP is used to process natural languages such as English. It performs sentence detection, tokenization, part-of-speech tagging, and detection of common named entities such as people, organizations, places, cities, and countries.

The main benefits of the statistical parser OpenNLP are [27,28]:

- Simple and ready to use: Every OpenNLP sample can be up and running after a few simple steps.
- Portable: The binaries are built for all platforms, so no system configuration or setup of third-party dependencies is required.
- Modular: Unlike other NLP toolkits, which are often built with a monolithic architecture, OpenNLP follows a data-centric design, so modules can be picked and swapped.
- Efficient: Piping the tokenizer (250K per second), POS tagger, and lemmatizer in a single process annotates over a thousand words per second. The Named Entity Recognition and Classification (NERC) module annotates thousands of words per second.
- Multilingual: OpenNLP annotations are currently offered for English, and other languages are being added to the pipeline.

3.1.1 Open Natural Language Processing (OpenNLP).

The OpenNLP library is a machine-learning-based toolkit for processing natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, and parsing. These tasks are usually required to build more advanced text processing services. The library contains several components that together make it possible to build a full natural language processing pipeline: a sentence detector, tokenizer, name finder, part-of-speech tagger, chunker, and parser. OpenNLP is not a complete solution on its own; it must be combined with other software to be effective in text processing [29-30].


3.1.2 Sentence Detection

The OpenNLP Sentence Detector can detect whether a punctuation character marks the end of a sentence. In this sense, a sentence is defined as the longest whitespace-trimmed character sequence between two punctuation marks. The first and last sentences are exceptions to this rule: the first non-whitespace character is assumed to be the beginning of a sentence, and the last non-whitespace character is assumed to be the end of a sentence. The sample text below should be segmented into its sentences [29, 30].

Pierre Vinken, 61 years old, will join the board as a nonexecutive director

Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.

Rudolph Agnew, 55 years old and former chairman of Consolidated. Fields PLC,

was named a director of this British industrial conglomerate.

After detecting the sentence boundaries, each sentence is written on its own line.

Pierre Vinken, 61 years old, will join the board as a nonexecutive director

Nov. 29.

Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.

Rudolph Agnew, 55 years old and former chairman of Consolidated.

Fields PLC, was named a director of this British industrial conglomerate.

Sentence detection is usually performed before the text is tokenized, and the pre-trained models on the OpenNLP web site are trained this way. However, tokenization can also take place first, in which case the sentence detector processes the already tokenized text. The OpenNLP Sentence Detector cannot identify sentence boundaries based on the contents of the sentence [29, 31].
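To show why deciding whether a period ends a sentence is not trivial, here is a small rule-based splitter in Python. It is not the OpenNLP detector, which relies on a trained model; the regular expression and the abbreviation list are assumptions made for this sketch.

```python
import re

# Hypothetical abbreviation list; a trained model would learn such cases.
ABBREVIATIONS = {"Mr.", "Mrs.", "Dr.", "Nov."}

def split_sentences(text):
    """Split on ., ! or ? followed by whitespace and a capital letter,
    then re-join pieces that were split after a known abbreviation."""
    pieces = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text.strip())
    sentences = []
    for piece in pieces:
        if sentences and sentences[-1].split()[-1] in ABBREVIATIONS:
            # The previous piece ended with an abbreviation, so the
            # boundary was spurious: glue the two pieces back together.
            sentences[-1] = sentences[-1] + " " + piece
        else:
            sentences.append(piece)
    return sentences
```

Without the abbreviation check, the naive rule would cut the sample text after "Mr." and produce a one-word sentence, which is the kind of error a model-based detector is meant to avoid.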

3.1.3 Tokenization

The Open NLP Tokenizers segment an input character sequence into tokens. Tokens

are usually words, punctuation, numbers, etc. [30].


Pierre Vinken, 61 years old, will join the board as a nonexecutive director

Nov. 29.

Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.

Rudolph Agnew, 55 years old and former chairman of Consolidated.

Fields PLC, was named a director of this British industrial

conglomerate.

The following result shows the individual tokens in a whitespace separated

representation.

Pierre Vinken , 61 years old , will join the board as a nonexecutive director Nov. 29 .

Mr. Vinken is chairman of Elsevier N.V. , the Dutch publishing group .

Rudolph Agnew , 55 years old and former chairman of Consolidated .

Fields PLC , was named a nonexecutive director of this British industrial conglomerate .

A form of asbestos once used to make Kent cigarette filters has caused a high percentage of cancer deaths among a group of workers exposed to it more than 30 years ago , researchers reported .

OpenNLP offers multiple tokenizer implementations [31, 32]:

- Whitespace Tokenizer: a whitespace tokenizer; non-whitespace sequences are identified as tokens.
- Simple Tokenizer: a character class tokenizer; sequences of the same character class are tokens.
- Learnable Tokenizer: a maximum entropy tokenizer; it detects token boundaries based on a probability model.

Most part-of-speech taggers, parsers, and so on work with text tokenized in this manner. It is important to ensure that the tokenizer produces tokens of the type expected by the later text processing components. With OpenNLP (as with many systems), tokenization is a two-stage process: first, sentence boundaries are identified, then tokens within each sentence are identified [32].
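The difference between the first two tokenizer styles can be sketched in Python. These functions only imitate the ideas described above and are not the OpenNLP classes; the regular expression used for character classes is an assumption of this sketch.

```python
import re

def whitespace_tokenize(text):
    """Every maximal run of non-whitespace characters is one token."""
    return text.split()

def simple_tokenize(text):
    """Character-class style: runs of letters, runs of digits, and each
    remaining non-space character become separate tokens."""
    return re.findall(r"[A-Za-z]+|\d+|[^\sA-Za-z\d]", text)
```

For the input `"Mr. Vinken, 61 years old."`, the whitespace version keeps punctuation attached to words (`"Vinken,"`), while the character-class version separates it, at the cost of also splitting abbreviations such as `"Mr."` into two tokens.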

3.1.4 Name Finder

The Name Finder can detect named entities and numbers in text. To be able to detect entities, the Name Finder needs a model. The model depends on the language and on the type of entity it was trained to handle.

OpenNLP offers a number of pre-trained name finder models that are trained on various freely available corpora. To find names in raw text, the text must first be segmented into tokens and sentences. It is important that the tokenization for the training data and the input text be identical [33].

3.1.5 POS Tagger

Part-of-speech (POS) tagging is a form of word disambiguation that assigns each word in a given text to one of a fixed set of parts of speech, such as noun, verb, adjective, or adverb. Many words have several possible tags, and the POS tagger is used to disambiguate them. The primary role of the POS tagger is therefore to determine the correct tag for each word in the text [34, 29].

3.1.6 Stemming:

Words with the same meaning appear in various morphological forms. To capture their similarity, they are normalized into a common root form, the stem. Stemming involves suffix removal; suffixes are always added to the right side of the word (s, es, ed, ing, etc.). Automated word stemming plays an important role in information retrieval systems. It improves their performance by conflating a set of terms or words into a single term or keyword. For example, a group of words such as 'treatment' and 'treatments' can be reduced to a single root word ('treatment').

Furthermore, automatic suffix removal has a major effect on the performance of information retrieval systems. It reduces the complexity and size of the data by reducing the total number of distinct words in the system. One of the best-known algorithms used for word stemming is the Porter stemming algorithm, which is described in the next section.


3.1.7 Porter algorithm

The Porter algorithm differs from the Lovins stemmer in two main ways. The first difference is a significant reduction in the complexity of the rules associated with suffix removal. The need for simplification is evident in the Lovins algorithm, which contains no fewer than 294 suffixes, each associated with one of 29 context-sensitive rules that determine whether and how it can be removed from the end of a word. The Porter algorithm is much simpler in concept, with approximately 60 suffixes, two recoding rules, and a single type of context-sensitive rule to determine whether a suffix should be removed. Instead of rules based on the number of characters remaining after removal, Porter uses a minimum length based on the number of consonant-vowel-consonant strings (the measure, m) remaining after suffix removal. This idea, which can be regarded as a rough count of syllables, was first studied by Dolby and Resnikoff (1964). A typical rule is therefore as follows:

(m > 0) *FULNESS → *FUL

This means that the suffix *FULNESS is replaced with the suffix *FUL if, and only if, the resulting stem has a non-zero measure (m).
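The measure and the example rule can be sketched in Python as follows. This is a minimal sketch of one rule only, not a full Porter stemmer; it assumes lowercase input, and the function names are chosen for the example. It uses Porter's convention that 'y' counts as a vowel when it follows a consonant.

```python
def is_consonant(word, i):
    """Porter's definition: a, e, i, o, u are vowels; 'y' is a vowel
    only when the preceding letter is a consonant."""
    ch = word[i]
    if ch in "aeiou":
        return False
    if ch == "y":
        return i == 0 or not is_consonant(word, i - 1)
    return True

def measure(stem):
    """Return m, the number of vowel-to-consonant transitions, i.e. the
    number of VC groups in the form [C](VC)^m[V]."""
    pattern = ["C" if is_consonant(stem, i) else "V" for i in range(len(stem))]
    return sum(1 for prev, cur in zip(pattern, pattern[1:])
               if prev == "V" and cur == "C")

def apply_rule(word, suffix="fulness", replacement="ful"):
    """Apply '(m > 0) *FULNESS -> *FUL': replace the suffix only when the
    stem before the suffix has a non-zero measure."""
    if word.endswith(suffix):
        stem = word[: -len(suffix)]
        if measure(stem) > 0:
            return stem + replacement
    return word
```

For instance, the stem of "hopefulness" before the suffix is "hope", whose measure is 1, so the rule fires and yields "hopeful".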

3.1.8 Chunking

Text chunking consists of dividing a text into syntactically correlated groups of words, such as noun groups and verb groups, but it specifies neither their internal structure nor their role in the main sentence [35].

3.1.9 Parsing

Parsing is the most important stage in the question generation process. It is used to understand and extract rich information about the syntax of a sentence. Parsing represents a text in a tree structure based on the grammatical rules and the input sentence.

Parsing is done in a top-down manner or a bottom-up manner. Top-down parsers construct the parse tree by deriving the input sentence from the root node S down to the leaves [28]. The search algorithm of the parser expands all trees that have S as the mother node and, using the information about the possible trees under the root, builds these trees in parallel. In bottom-up parsing, the construction of the parse tree starts with the words of the input and tries to build trees from the words up, again by applying the grammar rules. The parse is successful if the parser succeeds in building a tree rooted in the start symbol S that covers all of the input [29].


3.1.10 Co-reference Resolution:

Co-reference resolution is the process of matching all references to the same entity in a document, regardless of the form of the reference. It usually matches a name, a full name, or a pronoun. Some work has shown that co-reference resolution can be used to improve summarization systems that rely heavily on word frequency features [27].

A simple example is the use of a pronominal reference, as in "John will travel tomorrow; he bought the ticket yesterday." In this case, the word "he" refers to "John". Thus, if the two words are counted together, the entity they refer to may be judged more important. This type of analysis is not widely used in summarization systems because of performance and accuracy problems.

3.1.11 Stop words:

In any correct English sentence, many words are purely grammatical, meaning that they do not contribute to the meaning of the sentence. Words like 'to' and 'from' are stop words: they are necessary for the sentence to be well formed, but they do not describe its actual meaning. For example, in a sentence about a fast car, the words 'car' and 'fast' carry the essential meaning, while the remaining words only describe the relationship between them. Because stop words are very common and appear in most sentences, they can inflate the overall similarity score, so it is advisable to remove them before comparing sentences.
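The effect of stop words on a similarity score can be illustrated with a short Python sketch. The stop word list and the overlap ratio used here are assumptions for illustration only; they are not the list or the measure used by the proposed system.

```python
# Hypothetical stop word list, kept very small for the example.
STOP_WORDS = {"the", "a", "is", "to", "from", "of", "in"}

def content_words(sentence):
    """Lowercase the sentence and drop stop words."""
    return [w for w in sentence.lower().split() if w not in STOP_WORDS]

def overlap_ratio(words_a, words_b):
    """Shared distinct words divided by all distinct words."""
    a, b = set(words_a), set(words_b)
    if not (a | b):
        return 0.0
    return len(a & b) / len(a | b)
```

Comparing "the car is fast" with "the answer is in the book" gives a non-zero overlap when all words are kept, purely because of 'the' and 'is'; after `content_words` is applied to both sentences the overlap drops to zero, which better reflects that the sentences are unrelated.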

3.2 Intermediate Processing Layer:

Intermediate processing is the core of the proposed method for assessing summaries. It applies semantic and syntactic information to summary assessment. First, the source text and the summary text are split into sets of sentences. Then the similarity between each sentence of the summary and all sentences of the source text is computed using a combination of word-order similarity and semantic similarity. The maximum value is taken as the similarity score of the current summary sentence.
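The "take the maximum over all source sentences" step can be sketched as follows. The `word_overlap` function is only a stand-in assumed for this example; in the proposed method the similarity would be the combination of word-order and semantic similarity described above.

```python
def word_overlap(a, b):
    """Stand-in similarity: shared distinct words over all distinct words."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not (wa | wb):
        return 0.0
    return len(wa & wb) / len(wa | wb)

def best_match_scores(summary_sentences, source_sentences, similarity=word_overlap):
    """For each summary sentence, keep the highest similarity obtained
    against any sentence of the source text."""
    return [
        max(similarity(s, t) for t in source_sentences)
        for s in summary_sentences
    ]
```

Passing a different `similarity` function changes the measure without changing the matching logic.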

3.2.1 WordNet

WordNet is a lexical database of English words that includes word definitions, synonyms, and semantic relations between words. Words in WordNet are organized hierarchically using hyponymy and hypernymy, so words can easily be viewed as concepts. In this way WordNet can be interpreted as a taxonomy.

The vocabulary of any language can be defined as a set of word forms, each of which has one or more senses. If a word form has more than one sense, it is polysemous. If two words share a sense, they are synonymous. In WordNet, synonyms are placed together in groups called synsets. This means that the different senses of a polysemous word are placed in different synsets. Besides containing words, a synset has relations to other synsets. These relations are based on hyponymy, hypernymy, meronymy, and holonymy. Since this project is about semantic similarity, our focus will be on hyponymy and hypernymy. WordNet contains words from four word classes: noun, verb, adjective, and adverb. Except for a link between nouns and adjectives through attributes, there is no connection between the classes, so words from different classes cannot be compared using WordNet. This means that the similarity procedure consists only of noun-to-noun comparisons, verb-to-verb comparisons, and so on.

Figure (3): Example of WordNet Structure

The availability of the WordNet database was an important starting point. The synset model is simple enough to provide a basic relation between a concept and the corresponding words. Moreover, WordNet covers the entire English lexicon and provides an unusually large number of conceptual distinctions. It is also particularly useful from a computational point of view because it has been developed for easy access and navigation through its hierarchies. Starting from WordNet, we chose a subset of the appropriate synsets to represent the emotional concepts [36].

The first decision we made in the mapping project was to settle which relations would be used to map onto WordNet synsets. There are three possible relations: synonymy, hypernymy, and instantiation. Some examples should illustrate these three relations and their use in mapping onto WordNet synsets. Consider the following entry in the WordNet noun database.

Formally, the semantic similarity is defined as follows:

\[ \mathrm{sim}_{res}(c_1, c_2) = \max_{c \in S(c_1, c_2)} \mathrm{ic}_{res}(c) \]

where S(c_1, c_2) is the set of concepts that subsume both c_1 and c_2. Another information-theoretic similarity measure that uses the same IC idea was proposed by Lin, who explains his definition of similarity as follows: "The similarity between A and B is measured by the ratio between the amount of information needed to state the commonality of A and B and the information needed to fully describe what A and B are."

Formally, the definition above can be expressed by:

\[ \mathrm{sim}_{lin}(c_1, c_2) = \frac{2 \cdot \mathrm{sim}_{res}(c_1, c_2)}{\mathrm{ic}_{res}(c_1) + \mathrm{ic}_{res}(c_2)} \]
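The two measures can be tried out on a toy taxonomy in Python. Everything in this sketch is assumed for illustration: the concepts, the parent links, and the probabilities are invented, and the information content is taken as the negative logarithm of a concept's probability, which is the usual definition but is not stated in the text above.

```python
import math

# Hypothetical taxonomy (child -> parent) and concept probabilities.
PARENT = {"organism": None, "plant": "organism", "animal": "organism",
          "grass": "plant", "tree": "plant"}
PROB = {"organism": 1.0, "plant": 0.5, "animal": 0.5, "grass": 0.2, "tree": 0.3}

def ic(concept):
    """Information content, assumed here to be -log p(concept)."""
    return -math.log(PROB[concept])

def subsumers(concept):
    """The concept itself plus all of its ancestors."""
    found = set()
    while concept is not None:
        found.add(concept)
        concept = PARENT[concept]
    return found

def sim_res(c1, c2):
    """Highest information content among the common subsumers."""
    return max(ic(c) for c in subsumers(c1) & subsumers(c2))

def sim_lin(c1, c2):
    """Twice the shared information over the total information."""
    total = ic(c1) + ic(c2)
    return 2 * sim_res(c1, c2) / total if total else 0.0
```

In this toy setting 'grass' and 'tree' share the subsumer 'plant', so their similarity is positive, while 'grass' and 'animal' share only the root, whose information content is zero.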

3.2.2 Summarization

To produce a summary sentence, the deletion strategy is used to remove unnecessary information from a sentence of the source text. Unnecessary information includes trivial details about a topic, such as examples and scenarios, or redundant information that repeats important information [37]. The main task of the deletion strategy is to remove unimportant information such as stop words and explanations. Given two sentences, a summary sentence and an original sentence, let Ss be the summary sentence and Os the original sentence, and let Len(Os) denote the length of Os and Len(Ss) the length of Ss, where the length of a sentence is its number of words. The first rule of the deletion strategy is as follows:

Len(Ss) < Len(Os)

Sentence Partitioning: Each sentence must be divided into a list of words in order to measure similarity. The sentence is tokenized after first eliminating the stop words, which are the words that should be removed from a sentence before any natural language processing technique is applied. There is no single fixed set of stop words; any set can be used. The most common choice is the set of function words such as 'the', 'for', and 'at'. Finally, the separator characters between words are recognized and the word list is created.

3.2.3. The word ambiguity

Each POS class in WordNet is divided into synsets that contain synonyms. We have also seen that a word can be polysemous, that is, it can have different senses. All of these senses belong to the same taxonomy but to different synsets. When calculating the degree of similarity between two words, it is necessary to make sure that the correct senses of the words are compared. This is done by disambiguating the sense of each word in the given context. The algorithm most often used for word sense disambiguation is the Lesk algorithm, proposed by Michael E. Lesk in 1986 [38]. The algorithm uses the definition of a word to determine whether it has something in common with the definition of another word.

First, the adapted algorithm starts by specifying the context around the target word. For example, for a sentence of length N, a context window of K words is chosen. The target word then has K / 2 words on its right side and K / 2 words on its left side as its neighboring context. After that, all possible noun and verb senses of each word in the specified context are looked up and listed. Next, for each sense the algorithm collects the glosses stated in WordNet: the gloss of the synset itself and the glosses of the synsets associated with it through the hypernym, hyponym, holonym, and meronym relations. After listing all possible gloss pairs, the adapted Lesk algorithm scores each pair by computing the overlap between the two glosses. The overlap is found by searching for the longest common sequence of words between them. The overlap score depends on the length of the common sequence: each common sequence contributes the square of its length to the score. Finally, after every pair of glosses has been scored, the target word is assigned the sense with the highest overlap score.


3.2.4 The adapted Lesk algorithm

Before describing the adapted version of the Lesk algorithm, let us look at the original Lesk algorithm. Finding the meaning of a word in a given context is called word sense disambiguation, which is the task of the Lesk algorithm. The original Lesk algorithm uses the dictionary definition (gloss) of a word to find its meaning [39]. It compares the gloss of each sense of a word with the glosses of all the other words in the sentence and counts the number of words they have in common. The larger the number of common words, the more likely that sense is the correct one. Therefore, the sense with the highest number of common words is assigned to the word.
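The gloss-overlap idea of the original Lesk algorithm can be sketched in Python with a tiny hand-made dictionary. The glosses, sense labels, and stop word list below are invented for this example and do not come from WordNet; the sketch only counts shared words and does not implement the squared-length scoring of the adapted version.

```python
# Hypothetical mini-dictionary: word -> {sense label: gloss}.
GLOSSES = {
    "bank": {
        "bank#1": "financial institution that accepts deposits and lends money",
        "bank#2": "sloping land beside a body of water such as a river",
    },
    "deposit": {"deposit#1": "money placed in a financial institution"},
    "river": {"river#1": "large natural stream of water"},
}
STOP = {"a", "of", "and", "that", "in", "such", "as", "the"}

def gloss_words(gloss):
    """Distinct non-stop words of a gloss."""
    return {w for w in gloss.lower().split() if w not in STOP}

def lesk(target, context_words):
    """Return the sense of target whose gloss shares the most words with
    the glosses of the other words in the context."""
    best_sense, best_score = None, -1
    for sense, gloss in GLOSSES[target].items():
        target_words = gloss_words(gloss)
        score = 0
        for other in context_words:
            if other == target:
                continue
            for other_gloss in GLOSSES.get(other, {}).values():
                score += len(target_words & gloss_words(other_gloss))
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense
```

With this dictionary, 'bank' next to 'deposit' resolves to the financial sense and 'bank' next to 'river' resolves to the land sense, because those are the glosses with shared words.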

3.3 Post-processing Layer

This layer displays the system's results to the user. It shows the measured similarity to the user as a score. We use the following equation to calculate the final score (FS) for any summary written by a student:

\[ FS = \frac{2 \cdot \mathrm{Token\_Match}(X, Y)}{|X| + |Y|} \]

where Token_Match(X, Y) is the number of matching word tokens between X and Y. The important point is that the score is based on all individual similarity values and therefore reflects their effect. The Dice coefficient in the equation below gives the ratio of the number of tokens that match to the total number of tokens. Thus, the Dice coefficient always returns a value higher than the matching average, making it more optimistic [40,41]. In this regard, a threshold must be specified in advance in order to select matching pairs.

\[ \text{Dice coefficient} = \frac{2\,|X \cap Y|}{|X| + |Y|} \]
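A direct Python rendering of the Dice coefficient on token sets is shown below as a small worked example. The token lists are prepared by hand for illustration (lowercased, stop words removed); how the proposed system decides that two tokens match is not reproduced here.

```python
def dice(x_tokens, y_tokens):
    """Dice coefficient: twice the number of shared distinct tokens
    divided by the total number of distinct tokens in both inputs."""
    x, y = set(x_tokens), set(y_tokens)
    if not x and not y:
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))

# Hand-prepared tokens for an ideal answer and a student answer.
ideal = ["abstraction", "reusability"]
student = ["reusability", "ease", "maintenance"]
score = dice(ideal, student)  # one shared token out of 2 + 3 tokens
```

Here one token is shared, so the score is 2 * 1 / (2 + 3) = 0.4.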


Chapter Four

Results and Discussion


This chapter discusses in detail the experiments performed in the research and the simulation settings. We then summarize the strengths and weaknesses of the research and recommend future work for developing it further.

4.1. Metrics

We evaluate our system by measuring its performance using the following metrics:

- Pearson's Correlation Coefficient (PCC): measures the correlation between two variables. Its value lies in [-1, 1], where 1, 0, and -1 mean positive, no, and negative correlation, respectively.

- Root Mean Square Error (RMSE): measures the differences between predicted and actual values.
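Both metrics are short enough to write out in Python. This is a plain sketch using the standard formulas; the function names are chosen for the example, and any scores passed to them in the usage note are made up rather than taken from the experiments.

```python
import math

def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences.
    Note: undefined (division by zero) if either sequence is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def rmse(predicted, actual):
    """Root mean square error between predicted and actual values."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))
```

For example, `pearson(system_scores, human_scores)` close to 1 means the system ranks answers like the human graders do, while `rmse(system_scores, human_scores)` reports how far the scores are apart on the grading scale.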

4.2 Simulation Environment

The presented system was implemented using the OpenNLP open-source library to process text. It provides the required NLP tasks (tokenization, POS tagging, chunking, parsing, ...). The interface of the proposed system was built using Visual Studio 2013.

Figure (4): The Framework Interface


4.3. Results and Discussion

To evaluate the framework, we carried out three experiments. In the first experiment, we measured the performance of the system against human judgment in grading students' answers. In the second experiment, we compared the performance of the framework with human grading, and in the third experiment we compared it with other well-known or recently proposed methods.

4.3.1. Experiment 1-Evaluation of the algorithm with the human judgment

In the summer of 2017, a dataset of student responses was collected from University of AL-Qadisiyah students in a Web Programming course. It contains one essay question. Table (1) displays the question used in this study. Figure (5) shows sample answers of three students.

Table (1): First Test

Topic | Question | Suggested method | Score
Web Programming | List the benefits of CSS? | 9.65 | 9, 9, 10, 10, 10

Figure (5): Manual student answer samples


The answers were graded by hand by five human graders, all of whom are computer science lecturers. The mark given for each answer was in the range 0 to 10. Three samples were taken, and the automatic score for each answer was then compared with the lecturers' scores.

4.3.2. Experiment 2

To evaluate our technique for grading student responses, its performance is measured against human grading to determine the similarity between the grade assigned to each student response and the human grade.

Table (2): Proposed technique against human grading

Sample question, precise answer, and student response | Score
Question: What is the role of a prototype program in problem solving? |
Answer: To simulate the behavior of portions of the desired software product. |
Student 1: A prototype program is used in problem solving to collect data for the problem. | 2 1
Student 2: It simulates the behavior of portions of the desired software product. | 4 4
Student 3: To find problem and errors in a program before it is finalized | 3 2
Question: What are the main advantages associated with object-oriented programming? |
Answer: Abstraction and reusability. |
Student 1: Re-usability and ease of maintenance | 5 5
Student 2: Object oriented programming allows programmers to use an object with classes that can be changed and manipulated while not affecting the entire object at once. | 1 1
Student 3: Easier to debugg -Reusability | 2 2

We test each component of our overall grading system independently. In the final stage, to assess the performance of the suggested technique, we must use an evaluation metric. Several evaluation metrics are commonly used in NLP applications. In our test, the evaluation is carried out between the proposed framework and the dataset using the correlation metric over the full dataset. Table (3) shows the results of the proposed framework compared with each of these measures, and Figures (6), (7), and (8) compare the knowledge-based, corpus-based, and baseline measures with the proposed framework. The proposed framework achieved the highest correlation, exceeding the other measures by between about 0.02 (ESA Wikipedia) and about 0.3 (Hirst & St-Onge).

Table (3): Results of the presented method against each measure using the correlation metric

Measure | Correlation
Knowledge-based measures |
Shortest path | 0.4412
Leacock & Chodorow | 0.2232
Lesk | 0.3631
Wu & Palmer | 0.3365
Resnik | 0.2521
Lin | 0.3915
Jiang & Conrath | 0.4498
Hirst & St-Onge | 0.1960
Corpus-based measures |
LSA BNC | 0.4072
LSA Wikipedia | 0.4285
ESA Wikipedia | 0.4682
Baseline |
tf*idf | 0.3646
Suggested method | 0.490052586


Figure (6): Comparison of knowledge-based and proposed method

4.3.3. Experiment 3

To evaluate our method on the dataset, we select the following methods for comparison: corpus-based and baseline.

Figure (7): Suggested method Vs Corpus-based

Figure (8): Suggested method Vs. Baseline

[Charts for Figures (6), (7), and (8): bar charts of the correlation coefficient (vertical axis) for each measure (horizontal axis), comparing the proposed framework with the knowledge-based, corpus-based, and baseline measures.]


As shown in Figures (6), (7), and (8), the suggested method clearly achieved a higher correlation coefficient than all the other methods. We can observe the similarity between the suggested method and human judgment.

Lastly, Table (4) shows the RMSE of the suggested method and of some state-of-the-art systems, giving an indication of the performance of these methods [42].

Table (4): RMSE measures

Measures            RMSE

Lesk                1.033
JCN                 1.021
HSO                 1.037
PATH                1.028
RES                 1.046
Lin                 1.068
LCH                 1.069
WUP                 1.08
ESA                 1.030
LSA                 1.066
Tf * idf            1.084
Suggested method    0.61
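The RMSE values above follow the usual definition: the square root of the mean squared difference between predicted and human scores, so lower is better. The sketch below illustrates that definition only; the function name `rmse` and the sample grades are hypothetical and do not come from the project's data set.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual scores."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical grades on a 0-5 scale, for illustration only.
human_scores = [5.0, 3.5, 4.0, 2.0, 4.5]
system_scores = [4.6, 3.0, 4.2, 2.5, 3.9]
print(f"RMSE = {rmse(system_scores, human_scores):.3f}")
```

Unlike the correlation coefficient, RMSE is expressed in the units of the grading scale, so it shows how far a predicted grade typically lies from the human grade.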

4.4 CONCLUSION

This project has offered an analysis and a discussion of the most recently presented assessment approaches in tutoring. It then defines a system that can measure student essays based on their content from different points of view. In our testing we show a notable correlation between human scores and the proposed method. This constitutes a major move away from traditional systems, which are specific to a particular type of question (multiple choice / true-false exams). The proposed technology can be used in distance learning programs where students can connect to the system and easily submit essays. The proposed system can evaluate essay questions without the need for the teacher to store any knowledge; all that is required is the text of the course. This work is significant for schools, teachers, and students, particularly in developing countries, since it saves the time and cost that the earlier education activities required.

4.5 FUTURE WORK

- Dealing with the Arabic language in computer systems is a great challenge because of its morphological, semantic and grammatical complexity. We therefore recommend that future work in this field evaluate the system on subjects taught in Arabic.

- Essay questions may contain mathematical symbols and formulas, which is a big challenge and makes the text more difficult to analyze. Although the accuracy of the results of the current proposed system is not yet satisfactory, it is a first step towards dealing with this type of topic. We therefore recommend that future work develop the system to handle symbols, mathematical formulas and equations.

References

1. Motiwalla, L. F. (2007). Mobile learning: A framework and evaluation. Computers &

education, 49(3), 581-596.

2. S. Sarhan, “Intelligent Tutoring System”, M.Sc. Thesis, Dept. of Computer Science,

Faculty of Computers and Information, Mansoura University, 2009.

3. S. Sarhan, R. Bahgat, A. Tolba, “Rough-Neuro Model for Improving Student State

Diagnosis in Intelligent Tutoring System”, Egyptian Rough Computing Journal (ERCJ),

2009.

4. Valenti, S., Neri, F., & Cucchiarelli, A. (2003). An overview of current research on

automated essay grading. Journal of Information Technology Education: Research, 2,

319-330.

5. Bin, L., & Jian-Min, Y. (2011, September). Automated essay scoring using multi-

classifier fusion. In International Conference on Information and Management

Engineering (pp. 151-157). Springer, Berlin, Heidelberg.

6. Felder, R. M., Woods, D. R., Stice, J. E., & Rugarcia, A. (2000). The future of

engineering education II. Teaching methods that work. Chemical Engineering

Education, 34(1), 26-39.

7. Lage, M. J., Platt, G. J., & Treglia, M. (2000). Inverting the classroom: A gateway to

creating an inclusive learning environment. The Journal of Economic Education, 31(1),

30-43.

8. Ghosh, S. (2010). Online Automated Essay Grading System as a Web Based Learning

(WBL) Tool in Engineering Education. Web-Based Engineering Education: Critical Design

and Effective Tools, 53.

9. Landauer, T. K. (2003). Automatic essay assessment. Assessment in education:

Principles, policy & practice, 10 (3), 295-308.

10. Burrows, S., Gurevych, I., & Stein, B. (2015). The eras and trends of automatic short

answer grading. International Journal of Artificial Intelligence in Education, 25(1), 60-117.

11. Lemaire, B., & Dessus, P. (2001). A system to assess the semantic content of student

essays. Journal of Educational Computing Research, 24(3), 305-320.

12. Lenz, B., Wells, J., & Kingston, S. (2015). Transforming schools using project-based

deeper learning, performance assessment, and common core standards. John Wiley &

Sons.

13. Islam, M. M., & Hoque, A. L. (2010, December). Automated essay scoring using

generalized latent semantic analysis. In Computer and Information Technology (ICCIT),

2010 13th International Conference on (pp. 358-363). IEEE.

14. Heywood, J. (2000). Assessment in higher education: Student learning, teaching,

programmes and institutions (Vol. 56). Jessica Kingsley Publishers.

15. Arum, R., & Roksa, J. (2011). Academically adrift: Limited learning on college

campuses. University of Chicago Press.

16. Hermet, M., & Szpakowicz, S. (2006). Symbolic assessment of free text answers in a

second-language tutoring system.

17. N. Kang, E. M. van Mulligen, and J. A. Kors, “Comparing and Combining Chunkers of

Biomedical Text”, Journal of Biomedical Informatics, Vol. 44, No. 2, 2011, pp. 354-360.

18. J. Kurs, M. Lungu, and O. Nierstrasz, “Top-Down Parsing with Parsing Contexts”, In

Proceedings of International Workshop on Smalltalk Technologies (IWST14), England,

2014, pp. 1-7.

19. Yao, X., Bouma, G., & Zhang, Y. (2012). Semantics-based question generation and

implementation. Dialogue & Discourse, 3(2), 11-42.

20. Graesser, A. C., Wiemer-Hastings, P., Wiemer-Hastings, K., Harter, D., Tutoring

Research Group, & Person, N. (2000). Using latent semantic analysis to

evaluate the contributions of students in AutoTutor. Interactive learning environments,

8(2), 129-147.

21. Baker, R. S., D'Mello, S. K., Rodrigo, M. M. T., & Graesser, A. C. (2010). Better to be

frustrated than bored: The incidence, persistence, and impact of learners’ cognitive–

affective states during interactions with three different computer-based learning

environments. International Journal of Human-Computer Studies, 68(4), 223-241.

22. Yang, G. L. (2009). U.S. Patent No. 7,555,713. Washington, DC: U.S. Patent and

Trademark Office.

23. Jaballah, M., Harous, S., & Yagi, S. M. (2008, April). UOS EASY EXAM Arabic

Computer-Based Examination System. In Information and Communication

Technologies: From Theory to Applications, 2008. ICTTA 2008. 3rd International

Conference on (pp. 1-5). IEEE.

24. Ho, Y. S., Sang, J., Ro, Y. M., Kim, J., & Wu, F. (Eds.). (2015). Advances in Multimedia

Information Processing--PCM 2015: 16th Pacific-Rim Conference on Multimedia,

Gwangju, South Korea, September 16-18, 2015, Proceedings (Vol. 9314). Springer.

25. Attali, Y., Bridgeman, B., & Trapani, C. (2010). Performance of a generic approach in

automated essay scoring. The Journal of Technology, Learning and Assessment,

10(3).

26. Bond, F., & Foster, R. (2013). Linking and extending an open multilingual wordnet. In

Proceedings of the 51st Annual Meeting of the Association for Computational

Linguistics (Volume 1: Long Papers) (Vol. 1, pp. 1352-1362).

27. B. A. Galitsky, J. L. de la Rosa, and G. Dobrocsi, “Inferring the Semantic Properties of

Sentences by Mining Syntactic Parse Trees”, Data & Knowledge Engineering, Vol. 81-82,

2012, pp. 21-45.

28. B. A. Galitsky, “Transfer Learning of Syntactic Structures for Building Taxonomies for

Search Engines”, Engineering Applications of Artificial Intelligence, Vol. 26, Issue 10,

November 2013, pp. 2504-2515.

29. S. Anantpure, H. Jain, N. Alhat, S. Bhor, and S. Guru, “Literature Survey On Syntax

Parser For English Language Using Grammar Rules”, International Journal of Advance

Foundation and Research in Computer (IJAFRC), Vol. 2, Special Issue (NCRTIT 2015),

January 2015, pp. 327-333.

30. B. A. Galitsky, J. L. de la Rosa, and G. Dobrocsi, “Inferring the Semantic Properties of

Sentences by Mining Syntactic Parse Trees”, Data & Knowledge Engineering, Vol. 81-82,

2012, pp. 21-45.

31. B. Galitsky, “Machine Learning of Syntactic Parse Trees for Search and Classification

of Text”, Engineering Applications of Artificial Intelligence, Vol. 26, No. 3, 2013, pp. 153-172.

32. S. W. Tu, M. Peleg, S. Carini, M. Bobak, J. Ross, D. Rubin, and I. Sim, “A Practical

Method for Transforming Free-Text Eligibility Criteria into Computable Criteria”, Journal

of Biomedical Informatics, Vol. 44, Issue 2, 2011, pp. 239–250.

33. B. A. Galitsky, “Transfer Learning of Syntactic Structures for Building Taxonomies for

Search Engines”, Engineering Applications of Artificial Intelligence, Vol. 26, Issue 10,

November 2013, pp. 2504-2515.

34. G. Wilcock, “Text Annotation with OpenNLP and UIMA”, Proceedings of the 17th Nordic

Conference of Computational Linguistics, University of Southern Denmark, Odense,

2009, pp. 7-8.

35. C. Arora, M. Sabetzadeh, L. Briand, F. Zimmer, and R. Gnaga, “Automatic Checking of

Conformance to Requirement Boilerplates via Text Chunking: An Industrial Case Study”,

7th ACM/IEEE International Symposium on Empirical Software Engineering and

Measurement (ESEM), USA, 2013, pp. 35-44.

36. Tufis, D., Cristea, D., & Stamou, S. (2004). BalkaNet: Aims, methods, results and

perspectives. A general overview. Romanian Journal of Information Science and

Technology, 7(1-2), 9-43.

37. Brown, A. L., & Day, J. D. (1983). Macrorules for summarizing texts: The development

of expertise. Journal of verbal learning and verbal behavior, 22(1), 1-14.

38. Agirre, E., & Edmonds, P. (Eds.). (2007). Word sense disambiguation: Algorithms and

applications (Vol. 33). Springer Science & Business Media.

39. S. Banerjee and T. Pedersen, “An Adapted Lesk Algorithm for Word Sense

Disambiguation Using WordNet”, Tech. Rep., University of Minnesota, Duluth.

40. T. N. Dao and T. Simpson, “Measuring Similarity between Sentences”, WordNet.Net,

Tech. Rep., 2005.

41. S. Banerjee and T. Pedersen, “An adapted Lesk algorithm for word sense disambiguation

using WordNet”, Proceedings of the Third International Conference on Computational

Linguistics and Intelligent Text Processing, Springer Berlin Heidelberg, pp. 136-145,

2002.

42. M. Mohler, R. Bunescu, and R. Mihalcea, “Learning to Grade Short Answer Questions using

Semantic Similarity Measures and Dependency Graph Alignments”, Association for

Computational Linguistics, pp. 752–762, 2011.