Velina Slavova (Bulgaria) Vladimir Polyakov (Russia) THE METRICS OF COMPLEXITY BASED ON SYSTEM OF CASE RELATIONS IN TYPOLOGICAL STRUCTURE OF THE LANGUAGE (ON THE DATA OF DB «LANGUAGES OF THE WORLD») (*) * The research was supported by Russian Scientific Foundation of Humanities (grant № 10-04-12125в)


Post on 21-Jan-2016


Page 1:

Velina Slavova (Bulgaria), Vladimir Polyakov (Russia)

THE METRICS OF COMPLEXITY BASED ON SYSTEM OF CASE RELATIONS IN TYPOLOGICAL STRUCTURE OF THE LANGUAGE (ON THE DATA OF DB «LANGUAGES OF THE WORLD») (*)

* The research was supported by Russian Scientific Foundation of Humanities (grant № 10-04-12125в)

Page 2:

Screenshots. Win Version

Page 3:

Source of Data for DB JM

• The encyclopedic series “Jazyki Mira” (Languages of the World): 18 volumes, published by the Institute of Linguistics of the Russian Academy of Sciences from 1993 to 2011.

• Large Encyclopedic Dictionary. Linguistics (edited by Yarceva V.N.): provides interpretations of all terms used in the model of the DB.

Page 4:

List of some Encyclopedic Publications “Jazyki Mira” (Languages of the World)

• Languages of the world: Uralic (1993).
• Languages of the world: Paleoasiatic languages. Moscow: Publ. “Indrick”. (1996). - 231 p.
• Languages of the world: Turkic. Moscow: Publ. “Indrick”. (1997). - 544 p.
• Languages of the world: Mongolic languages. Manchu-Tungus languages. Japanese. Korean. (Ed.: Kibrik A.A., Rogova N.B., Romanova O.I.). Moscow: Publ. “Indrick”. (1997). - 408 p.
• Languages of the world: Iranian languages. I. South-Western Iranian languages. Moscow: Publ. “Indrick”. (1997). - 207 p.
• Languages of the world: Iranian languages. II. North-Western Iranian languages. Moscow: Publ. “Indrick”. (1999). - 302 p.
• Languages of the world: Dardic and Nuristani languages. Moscow: Publ. “Indrick”. (1998). - 143 p.
• Languages of the world: Iranian languages. III. East Iranian languages. Moscow: Publ. “Indrick”. (1999). - 343 p.
• Languages of the world: Germanic languages. Celtic languages. Moscow: Publ. “Academia”. (1999). - 472 p.
• Languages of the world: Caucasian languages. RAS. Institute of Linguistics. Moscow: Publ. “Academia”. (2001). - 480 p.
• Languages of the world: Romance languages. Moscow: Publ. “Academia”. (2001). - 720 p.
• Languages of the world: Indo-Aryan languages of Ancient and Middle Period. Moscow: Publ. “Academia”. (2004). - 160 p.
• Languages of the world: Slavonic languages. RAS. Institute of Linguistics. /Ed. A.M. Moldovan, S.S. Skorvid, A.A. Kibrik/. Moscow: Publ. “Academia”. (2005). - 656 p.
• Languages of the world: Baltic languages. RAS. Institute of Linguistics. /Ed. V.N. Toporov, M.V. Zavyalov, A.A. Kibrik/. Moscow: Publ. “Academia”. (2006). - 224 p.

Page 5:

Dictionary and source books

Dictionary

Two of 18 source books

Page 6:

Characteristics of Data Base “Languages of the World” Content

The Data Base “Languages of the World” has the following quantitative characteristics:

- more than 3800 features;
- 315 Eurasian languages;
- description of the following spheres of language: phonetics, morphology, syntax;
- binary representation of data.

The following language families and groupings are represented in the Data Base “Languages of the World”: Austroasiatic, Austronesian, Altaic, Afroasiatic, Indo-European, Caucasian, Paleoasiatic, Sino-Tibetan, Uralic, Hurro-Urartian. The DB also contains descriptions of the language isolates Ainu, Nivkh, Burushaski, Sumerian, and Elamite. A unique feature of the Data Base “Languages of the World” is its large collection of descriptions of extinct languages, comprising 54 essays. No other resource offers such a detailed and systematic description of extinct languages.

The main principles underlying the model of language description are binarity, hierarchy, and paradigmaticity.
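To make the binary representation concrete, here is a minimal Python sketch of how a language could be encoded as a 0/1 vector over a fixed feature list. The language names and the assignment of features to them are invented for illustration and are not taken from the DB; the storage format actually used by the DB is not described in these slides.

```python
# Hypothetical toy data: the language names and feature assignments are
# illustrative only, not values from the DB "Languages of the World".
FEATURES = ["case affixes", "word order", "auxiliary words"]

languages = {
    "LangA": {"case affixes", "word order"},
    "LangB": {"word order", "auxiliary words"},
}


def to_binary_vector(present, features=FEATURES):
    """Encode a set of present features as a 0/1 vector over a fixed feature list."""
    return [1 if feature in present else 0 for feature in features]


# One row per language: 1 means the feature is present, 0 means it is absent.
matrix = {name: to_binary_vector(present) for name, present in languages.items()}
```

Under this encoding, every later metric reduces to counting or comparing entries of such vectors.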

Page 7:

Task Formulation

1. Different grammatical constructions are assumed to require different brain resources for processing.

2. A further assumption is that the total amount of brain resources needed to process utterances of approximately equal meaning is constant.

3. Semantic cases (Fillmore’s cases) can serve as an example of a complex construction for verifying these assumptions.

4. The DB “Jazyki Mira” contains semantic cases that form a rather wide paradigm.

Page 8:

Example

Let’s study an example of the accusative case:

“Суд обвинил Вас-ю в краже.” “The court accused Basil of theft.”

In Russian, case is marked by the form of the noun (Вас-ю) and by a preposition (в), whereas in English it is marked only by a preposition (of).

Page 9:

Method of Data Processing

• Velina Slavova used the data of the DB “Jazyki Mira” to obtain a more convenient representation of the case paradigm.

• After a rather sophisticated reduction, we obtained the first results, which show examples of correlation between different case systems.

Page 10:

Case description in DB. Scope of the research.

In DB JM, 405 grammar features are devoted to the case system (Part 2.3.4 of the Model).

In this research only actant case meanings were investigated (140 grammar features).

They were divided into six fragments:
• --subject/object
• --contrastive case formation of subject
• --contrastive case formation of object
• --method of expressing subject--object-meanings
• --other actant cases
• --case of nominal predicate

At the first step only four fragments were investigated.

Page 11:

Examples of case description

--subject/object

---absolutive

---absolutive/relative

---dative

---narrative

---nominative/accusative

---nominative/accusative/genitive

---nominative/accusative-genitive

---nominative/accusative/indefinite accusative

---nominative/accusative/genitive/partitive

---nominative/accusative/privative/sociative

---nominative/accusative/locative

---nominative/accusative/partitive

---nominative/dative-accusative

---nominative/narrative

---nominative/partitive

---nominative/genitive

---nominative/genitive/partitive

---nominative/general indirect

---nominative/ergative

---nominative/ergative/genitive

The list above shows part of the “subject/object” paradigms in the DB. Below is a fragment of the description of the English language.

0.0.0.*LANGUAGE DENOMINATION
.English
…
2.3.4.CASE MEANINGS
.actant case meanings
..subjective/objective
...general case/accusative
..contrastive case formation
...of object
....nouns and pronouns
..method of expressing subject-object meanings
...case affixes
...word order
...auxiliary words
....in preposition
.case of attributive relation
..prepositional construction
.case of possessive relation
..prepositional construction
..possessive affix at possessor's name
.case of locative relations
..method of expression
...prepositions
…
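A description in this notation can be expanded into full feature paths. The Python sketch below assumes that the number of leading dots gives the depth of a feature under the numbered section heading; this reading is inferred from the English fragment and is not documented in the slides. The function name `parse_dotted` is hypothetical.

```python
def parse_dotted(lines):
    """Turn dot-prefixed lines into full feature paths.

    Assumption: the number of leading dots is the depth of the feature,
    and a line without leading dots starts a new section (depth 0).
    """
    paths = []
    stack = []  # ancestors of the current line, from depth 0 downward
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        name = line.lstrip(".")
        depth = len(line) - len(name)
        del stack[depth:]  # drop entries at the same depth or deeper
        stack.append(name)
        paths.append(tuple(stack))
    return paths


fragment = [
    "2.3.4.CASE MEANINGS",
    ".actant case meanings",
    "..subjective/objective",
    "...general case/accusative",
    "..contrastive case formation",
    "...of object",
    "....nouns and pronouns",
]
paths = parse_dotted(fragment)
```

Each resulting tuple names one feature together with all of its ancestors, which makes it straightforward to select, for example, every feature under "actant case meanings".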

Page 12:

Metrics of complexity

A separate complexity metric was developed for each of the six parts.

Part of case description (complex characteristic) | Type of feature coding | Metric
--subject/object | Paradigm (only one choice) | Maximal number of cases marked in the language
--contrastive case formation of subject | Multi-choice | Number of features present in the language
--contrastive case formation of object | Multi-choice | Number of features present in the language
--method of expressing subject--object-meanings | Multi-choice | Number of features present in the language
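The two kinds of metric can be sketched in Python as follows. This is one possible reading of the table, not the authors' implementation: it assumes that the cases of a paradigm are separated by "/" in its label and that a hyphenated label such as "accusative-genitive" counts as a single case. Both function names are hypothetical.

```python
def paradigm_metric(paradigm):
    """Number of cases in a single-choice paradigm label.

    Assumption: cases are separated by '/', and a hyphenated label such as
    'accusative-genitive' counts as one case.
    """
    return len([case for case in paradigm.split("/") if case.strip()])


def multi_choice_metric(present_features):
    """Number of distinct features of a multi-choice part present in the language."""
    return len(set(present_features))
```

Read this way, the English fragment's paradigm "general case/accusative" scores 2, and the three means of expression listed for English (case affixes, word order, auxiliary words) score 3.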

Page 13:

Correlation Analysis

Strong correlations are observed between three of the complex characteristics (highlighted in yellow).
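The slides do not state which correlation coefficient was used. As a sketch, assuming the Pearson coefficient, the correlation between two metric columns could be computed as below. The two value lists are invented toy numbers, not results from the DB.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical metric values for four languages (toy numbers only).
paradigm_size = [1, 2, 3, 4]
means_of_expression = [4, 3, 2, 1]
r = pearson(paradigm_size, means_of_expression)
```

A coefficient close to -1 or +1 between two metrics would correspond to the kind of strong relationship highlighted on the slide.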

Page 14:

Factor Analysis

The analysis yields two groups of factors (#1 in yellow, #2 in blue).

Page 15:

Tree Analysis

• The distances between languages under this “SO syntactic rules complexity” measure seem to keep languages from some genealogical groups close together. Nevertheless, Indo-European languages are clearly VERY dispersed.

• OLD languages seem to stay apart!
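The slides do not specify how the tree was built. One common way to obtain such a tree from per-language metric vectors is agglomerative clustering; the Python sketch below assumes Euclidean distance and single linkage, both of which are illustrative choices rather than the authors' documented method. The language labels and coordinates are invented.

```python
from math import dist


def single_linkage(points):
    """Agglomerative clustering with single linkage.

    `points` maps a language label to its metric vector. Returns the merges
    in order, each as (cluster_a, cluster_b, distance).
    """
    clusters = [frozenset([name]) for name in points]
    merges = []

    def cluster_distance(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(dist(points[x], points[y]) for x in a for y in b)

    while len(clusters) > 1:
        a, b = min(
            ((p, q) for i, p in enumerate(clusters) for q in clusters[i + 1:]),
            key=lambda pair: cluster_distance(*pair),
        )
        merges.append((a, b, cluster_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


# Hypothetical metric vectors for three languages (toy numbers only).
toy = {"L1": (1, 1), "L2": (1, 2), "L3": (5, 5)}
merges = single_linkage(toy)
```

Languages that merge early at small distances end up close together in the tree; a family whose members merge only at large distances would appear dispersed.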

Page 16:

ANALYSIS OF RESULTS

1. The hypothesis that the complexity of a language's grammatical structure is preserved at a certain level was confirmed. The study showed that languages with a complex case paradigm have simpler grammatical means of expressing cases and fewer differences in the description of cases for subject/object, whereas languages with a simple case paradigm have more complex means of expressing case relations and more differences in the description of cases for subject/object. This dichotomy explains 76% of the variation in the content of the DB “Jazyki Mira”.

2. In general, this description of the case system (as two groups of factors) correlates well with the genealogical tree. The exception is the Indo-European language family, which may be due to the wide geographical spread of Indo-European languages and, consequently, intensive borrowing during areal contacts. This hypothesis requires additional verification.

Page 17:

AS A CONCLUSION

• The present report aims to show that the DB “Jazyki Mira” is an interesting resource for studying the complexity of different parts of a language's grammar.

• These are only first results. The methods and approaches are still being established and developed. Work in this direction will continue.

Page 18:

Thank you for your attention

Contacts:

[email protected]@mail.ru