Syntactic Parsing for Arabic
Ministry of Higher Education
Al-Imam Muhammad bin Saud Islamic University
College of Computer and Information Sciences
Presented by:
Alkaibari, Amani M.
Abaalhassan, Seham N.
Aloud, Moudhi A.
Instructor: Dr. Amal Alsaif
Syntactic Parsing for Arabic
Introduction
Language is an intellectual means of communication.
It is a way of understanding and conveying ideas.
Language applications have increased over time.
Applications include machine translation, speech understanding, automatic text analysis, text proofreading, and others.
SYNTACTIC PARSER
• Syntax provides rules for putting words together to form constituents.
• Grammar is the formal specification of the rules of a language.
• Parsing is a method for performing syntactic analysis of a sentence.
Parsing (Syntactic Structure)
• INPUT:
الكلب الكبير في الحديقة
• OUTPUT:
The Information Conveyed by Parse Trees
• Part of speech for each word
(N = noun, V = verb, D = determiner)
• Phrases
• Useful relationships
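As a minimal sketch of what a parser returns, the following Python toy parses the noun phrase "الكلب الكبير في الحديقة" with a hand-written two-rule grammar and reads the part-of-speech tags and phrases off the resulting tree. The grammar rules, the tag names ADJ and P, and all function names are assumptions made for illustration; they do not come from the slides or from any treebank.

```python
# Toy lexicon: word -> part of speech (tags are illustrative assumptions).
WORDS = ["الكلب", "الكبير", "في", "الحديقة"]
TAGS = ["N", "ADJ", "P", "N"]
LEXICON = dict(zip(WORDS, TAGS))


def parse_np(tokens, i):
    """NP -> N (ADJ) (PP). Returns (tree, next_index) or None."""
    if i >= len(tokens) or LEXICON.get(tokens[i]) != "N":
        return None
    children = [("N", tokens[i])]
    i += 1
    if i < len(tokens) and LEXICON.get(tokens[i]) == "ADJ":
        children.append(("ADJ", tokens[i]))
        i += 1
    pp = parse_pp(tokens, i)
    if pp is not None:
        children.append(pp[0])
        i = pp[1]
    return ("NP", children), i


def parse_pp(tokens, i):
    """PP -> P NP. Returns (tree, next_index) or None."""
    if i >= len(tokens) or LEXICON.get(tokens[i]) != "P":
        return None
    np = parse_np(tokens, i + 1)
    if np is None:
        return None
    return ("PP", [("P", tokens[i]), np[0]]), np[1]


def pos_tags(tree):
    """List (word, tag) pairs: the leaves of the tree, in sentence order."""
    label, body = tree
    if isinstance(body, str):
        return [(body, label)]
    return [pair for child in body for pair in pos_tags(child)]


def phrases(tree):
    """List (label, words) for every phrase (non-leaf node) in the tree."""
    label, body = tree
    if isinstance(body, str):
        return []
    found = [(label, " ".join(word for word, _ in pos_tags(tree)))]
    for child in body:
        found += phrases(child)
    return found


tree, end = parse_np(WORDS, 0)
for word, tag in pos_tags(tree):
    print(tag, word)
for label, words in phrases(tree):
    print(label, ":", words)
```

The same tree carries all three kinds of information from the slide: the leaves give the part of speech of each word, the inner nodes give the phrases, and the nesting (the PP inside the NP) gives the relationship between them.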
Examples Of Syntactic Ambiguity in Arabic
• Ambiguity caused by devocalization: the unvocalized word "بر" has several readings

#   Transliteration   Meaning          Word class
1   bur               flour            noun
2   ber               honouring        noun
3   bar               land             noun
4   beraa             righteousness    verb
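The effect of devocalization can be sketched in a few lines of Python: once the short-vowel marks are removed, differently vocalized words collapse into one written form. The vocalized spellings below are assumptions chosen for illustration (three of the readings, each written with a different short vowel and a shadda), and `strip_diacritics` is a hypothetical helper name.

```python
import unicodedata


def strip_diacritics(word: str) -> str:
    """Drop combining marks (Arabic short vowels, shadda, sukun)."""
    return "".join(ch for ch in word if not unicodedata.combining(ch))


# Assumed vocalized spellings, written with escapes so the marks are explicit:
# U+0628 = beh, U+0631 = reh, U+064F = damma, U+0650 = kasra,
# U+064E = fatha, U+0651 = shadda.
VOCALIZED = {
    "\u0628\u064f\u0631\u0651": "flour",
    "\u0628\u0650\u0631\u0651": "honouring",
    "\u0628\u064e\u0631\u0651": "land",
}

# Group the readings by their unvocalized written form.
readings = {}
for word, meaning in VOCALIZED.items():
    readings.setdefault(strip_diacritics(word), []).append(meaning)

for surface, meanings in readings.items():
    print(surface, "->", meanings)
```

All three entries end up under the same two-letter key, which is exactly the situation a parser faces with unvocalized text: one surface form, several candidate words.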
Two parse trees for the ambiguous sentence "أكل الطعام"
Tree 1: verbal sentence = verb (passive) "أكل" + pro-agent "الطعام"
Tree 2: verbal sentence = verb (active) "أكل" + subject "الطعام"
Grammatical ambiguity
A grammatical constituent can have more than one reading.
For example, in ( الطلاب والمدرسون المجتهدون ), the adjective "المجتهدون" can
describe the teachers only, or the students as well.
Ambiguity in pronoun reference
For example, in ( تركت الأم المريضة مع ممرضتها لترعاها ), the
pronoun (ها) in the word "ممرضتها" may refer to the patient or to the mother,
and the pronoun in the word "لترعاها" ("to care for her") may likewise refer to the patient or to the mother.
This makes grammatical analysis more difficult.
Parsing Arabic Dialects
The Arabic language is a collection of spoken dialects with important syntactic differences.
Modern Standard Arabic (MSA)
The standard written language, the same throughout the Arab world.
Used in some scripted spoken communication: newscasts, parliamentary debates.
Levantine Arabic (LA)
An example of the Arabic dialects.
Leveraging LA/MSA resources is feasible.
Linguistic Facts
Differences between LA and MSA, using an example ("The men do not like this work"):
1- ( الرجال بيحبوش الشغل هدا ). [LA] [Jordan]
2- ( لا يحب الرجال هذا العمل ). [MSA]
Lexically, we observe:
1- The word for ‘work’ is "الشغل" in LA but "العمل" in MSA.
2- The word for ‘men’ is the same in both LA and MSA.
3- There are typically also differences in function words; in our example, ش (LA) and لا (MSA) for ‘not’.
4- LA "بيحبو" has the same stem as MSA "يحب".
Syntactically, we observe three differences:
1- The subject precedes the verb in LA (SVO order), but follows it in MSA (VSO order).
2- The demonstrative determiner follows the noun in LA, but precedes it in MSA.
3- The negation marker follows the verb in LA, while it precedes the verb in MSA.
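The three syntactic differences can be checked mechanically on the example pair. The sketch below is only an illustration: the role labels (S, V, O, NEG, DEM), the split of the LA verb into "بيحبو" plus the negation marker "ش", and the helper names are assumptions, not an annotation scheme from the slides.

```python
# Each token is (word, role). "O" marks the object noun.
LA = [("الرجال", "S"), ("بيحبو", "V"), ("ش", "NEG"),
      ("الشغل", "O"), ("هدا", "DEM")]
MSA = [("لا", "NEG"), ("يحب", "V"), ("الرجال", "S"),
       ("هذا", "DEM"), ("العمل", "O")]


def index_of(sentence, role):
    """Position of the first token carrying the given role."""
    return next(i for i, (_, r) in enumerate(sentence) if r == role)


def basic_order(sentence):
    """Order of subject, verb and object, e.g. 'SVO' or 'VSO'."""
    return "".join(r for _, r in sentence if r in {"S", "V", "O"})


def follows(sentence, role, anchor):
    """True if the token with `role` comes after the token with `anchor`."""
    return index_of(sentence, role) > index_of(sentence, anchor)


for name, sentence in (("LA", LA), ("MSA", MSA)):
    print(name,
          "order:", basic_order(sentence),
          "| demonstrative follows noun:", follows(sentence, "DEM", "O"),
          "| negation follows verb:", follows(sentence, "NEG", "V"))
```

A parser trained on MSA trees expects the MSA values for all three properties, which is one way to see why MSA resources cannot be applied to LA text without some adaptation.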
Related works
“Better Arabic Parsing: Baselines, Evaluations, and Analysis”
By Spence Green and Christopher D. Manning
Compares manually annotated grammars.
All experiments used the ATB (Penn Arabic Treebank).
Trees marked as errors or non-linguistic text were ignored.
Parsing performance on the ATB is still low compared to the Wall Street Journal (WSJ) corpus.
PARSING ARABIC TEXTS USING REAL PATTERNS OF SYNTACTIC TREES
Built a parser for Arabic texts that takes advantage of the machine learning paradigm.
Defined a new way of modeling the knowledge contained in an Arabic treebank.
The new models are called patterns of syntactic trees.
TOWARDS RESOLVING AMBIGUITY IN UNDERSTANDING ARABIC SENTENCE
Describes an attempt to resolve certain types of ambiguity:
ambiguity between the main constituents of nominal and verbal sentences,
and ambiguity due to the affixes of either nouns or verbs.
Conclusion
In this paper we have seen some examples of ambiguity and the latest results reached by researchers in the analysis of Arabic.
We have shown that Arabic parsing performance is not as poor as previously thought, but remains much lower than for English.