torsten zesch - spraakbanken.gu.se · generate bundles target words controlled for word frequency...

62
Automatically ________ gap-fill exercise items Torsten Zesch Assistant Professor Language Technology Lab University of Duisburg-Essen

Upload: others

Post on 07-Dec-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

Automatically ________ gap-fill exercise items

Torsten Zesch

Assistant ProfessorLanguage Technology LabUniversity of Duisburg-Essen

Page 2: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

2

Biographical Facts

Computer science background

2006 – 2012

Technische Universität Darmstadt (PhD/PostDoc) Lexical semantics / Wikipedia

2012 – 2013

German Institute for International Educational Research (DIPF), Frankfurt (Substitute Professor)

Automatic scoring (PISA data)

Now

University of Duisburg-Essen, “Language Technology Lab” (Assistant Professor)

LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Page 3: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

3

University of Duisburg-Essen

LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Page 4: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

4

University of Duisburg-Essen

LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Page 5: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

5

University of Duisburg-Essen

LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

4h

Page 6: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

7

Research Interests

NLP for Education

Social Media AnalysisGraduate school “User-centred Social Media”

Language Technology InfrastructureReproducibility & Replicability

LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Page 7: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

8LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

NLP4Edu @ LTLab

Page 8: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

9LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

NLP4Edu (but not CALL) @ LTLab

Automatic scoring Essays Short free-text answers

Plagiarism detection

Spell checking

Page 9: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

10LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

CALL @ LTLab

Language Proficiency Testing Lexical Recognition Tests (for Arabic) cTest difficulty prediction

Learner Language Cognates L2 spelling errors Readability assessment

Exercise generation This talk

Research network INDUS – “Individualized language learning” https://sites.google.com/site/indusnetzwerk/home (in German)

Page 10: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

11LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

In this talk

Automatically ________ gap-fill exercise items

generating

Page 11: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

12

Is there a problem?

He lost his watch yesterday

LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Page 12: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

13

Is there a problem?

He lost his _____ yesterday

LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Page 13: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

14LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

In this talk

Automatically ________ gap-fill exercise items

generating

checking

Page 14: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

15

Checking – Ambiguity Problem

housewifecarjob handkeysglasseswalletsoulphonemarblesconfidencehopewillfuturedegreefriendheadbookpenrubberhatring

He lost his _____ yesterday

LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Page 15: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

16

Checking – Multiple Choice

He lost his _____ yesterday

• go• mind• take• him

• watch• mind• wife• game

LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Page 16: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

17LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

In this talk

Automatically ________ gap-fill exercise items

generating

checking

adapting

Page 17: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

18

Adapting – Type of Knowledge

He lost his _____ yesterday• go• mind• take• him

He lost _____ mind yesterday• him• his• he• her

LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Page 18: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

19

Adapting – Difficulty

He lost his _____ yesterday• go• mind• take• him

His characteristic talk, with its keen _____ of detail and subtle power of inference held me amused and enthralled.

• instincts• observance• presumption• implements

LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Page 19: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

20LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Reducing Ambiguity

Horsmann, Tobias; Zesch, TorstenTowards  Automatic  Scoring  of  Cloze  Items  by  Selecting  Low­Ambiguity Contexts.In:  3rd  workshop  on  NLP  for  computer­assisted  language  learning, Uppsala, Sweden, 2014.

Page 20: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

21

Ambiguity Problem

housewifecarjob handkeysglasseswalletsoulphonemarblesconfidencehopewillfuturedegreefriendheadbookpenrubberhatring

He lost his _____ yesterday

Try to find contexts that areless ambiguous (or even unambiguous)

Try to find contexts that areless ambiguous (or even unambiguous)

LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Page 21: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

22

Ambiguity Problem

housewifecarjob handkeysglasseswalletsoulphonemarblesconfidencehopewillfuturedegreefriendheadbookpenrubberhatring

He lost his bag of _____ yesterday

LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Page 22: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

24LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Finding Low-Ambiguity Contexts

Hearst Patterns ”Z such as X and Y” Already small amounts of ____ such as beer or wine can affect your

ability to drive. → alcohol

Collocations Lessons learned from successful forest ____ prevention campaigns were

presented. → fire

Word Repetition Go through the ____ and mark each affected slice. → slices

Page 23: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

26

Evaluation Results

top-ranked are significantly less ambiguous still too many valid solutions

LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Finding unambiguous gaps might be impossibleFinding unambiguous gaps might be impossible

Page 24: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

27LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Challenging Distractors

Zesch, Torsten; Melamud, Oren Automatic Generation of Challenging Distractors Using Context­Sensitive Inference Rules.In:  Proceedings  of  the  9th  Workshop  on  Innovative  Use  of  NLP  for Building Educational Applications at ACL, Baltimore, USA, 2014.

Page 25: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

28

Tradeoff – Checking & Difficulty

Microsoft _______ Skype for 8.5b dollar.

LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Too easy drinks

Challenging gains

Invalid purchases

Solution acquires

Page 26: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

29LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Inference Rules

acquire gain

?students acquire knowledge

students gain knowledge

Microsoft acquires Skype

Microsoft gains Skype

Page 27: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

30LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Context-insensitive Rules

Page 28: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

31LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Context-sensitive Rules

Melamud Oren, Jonathan Berant, Ido Dagan, Jacob Goldberger, Idan Szpektor.A two level model for context sensitive inference rules.In: Proceedings of ACL 2013, Best paper runner­up

Page 29: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

32LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Finding Challenging Gap-fillers

attaingainhavelackloseobtainpurchaserecoversell

buyhaveloseobtainownpossesspurchaseretainsell

Page 30: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

33LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Finding Challenging Gap-fillers

buyhaveloseobtainownpossesspurchaseretainsell

attaingainhavelackloseobtainpurchaserecoversell

Page 31: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

34LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Evaluation Dataset

Dataset properties Sentences with main verb Medium-sized (5-12 tokens) Understandable by general audience

Examples ”His petition charged mental cruelty” ”The ball floated downstream” “The plan does not cover doctor bills”

Page 32: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

36LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Difficulty of Resulting Items

Randomly split 52 learners in two groups Group 1: our approach Group 2: randomly selected distractors (with comparable corpus frequency)

7 control items (same for both groups) 13 test items

Error rates

Page 33: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

37LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Conclusion

Works well for clearly separated verb senses For example: “He played basketball” “compose”, “tune”

Problems with fine-grained distinctions For example: “The ball floated downstream” ”glide”, “travel”

Contrary to hypothesis, generated distractors not more difficult Subjects with high language proficiency re-test with beginners Error rate does not directly measure difficulty / learning, as subjects might be

confused but still solve it in the end measure response time

Generating difficult distractors is a hard problemGenerating difficult distractors is a hard problem

Page 34: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

38LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Bundled Gap-fill

Wojatzki, Michael; Melamud, Oren; Zesch, TorstenBundled  Gap  Filling:  A  New  Paradigm  for  Unambiguous  Cloze ExercisesIn: Proceedings of  the Building Educational Applications Workshop at NAACL, ACL, San Diego, USA, 2016.

Page 35: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

39LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Bundled Gap-fill

pass

take

prepare

catch

share

Page 36: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

40LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Exercise Type Overview

Page 37: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

41LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Exercise Type Overview

Page 38: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

42LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Exercise Type Overview

Page 39: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

43LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Approach – Probability of Gap Fillers

Page 40: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

44LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Approach – Probability of Gap Fillers

Page 41: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

45LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Approach – Probability of Gap Fillers

Page 42: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

47LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Generate Bundles

Target words controlled for word frequency and word classes

Page 43: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

48LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Generate Bundles

Target words controlled for word frequency and word classes

Sentence base single gap selected at random GUM corpus (Zeldes, 2016)

Background Corpus ukWaC English (Baroni et al., 2009)

Probability distribution of gap-filler words KenLM toolkit (Heafield et al., 2013)

FASTSUBS to efficiently compute vectors (Yuret., 2012)

Page 44: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

49LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Short Wrap-Up

Approach to generate bundled gap-fill exercises

Remaining task: How well does it work?

Page 45: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

50LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

User Study

Near native speakers (N=35) Participants have to successively solve bundles with increasing sizes

Page 46: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

51LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

User Study

Near native speakers (N=35) Participants have to successively solve bundles with increasing sizes

How many participants filled the solution word? Success rate

Page 47: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

52LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

User Study

How many correct now?

If bundles work, score should increaseIf bundles work, score should increase

Near native speakers (N=35) Participants have to successively solve bundles with increasing sizes

Page 48: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

53LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

User Study

Near native speakers (N=35) Participants have to successively solve bundles with increasing sizes

How many correct now?

Page 49: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

54LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Results

Page 50: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

55LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Results

Page 51: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

56LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Results

Page 52: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

57LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Results

Page 53: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

58LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Detailed Results

?

Page 54: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

59LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Detailed Results

?

Page 55: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

60LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Error Analysis – “give” bundle

1. A string of islands and reefs along the eastern edge of the Group shelters the area from strong winds and ocean swells and humpback whales come to these protected waters to ___ birth.

2. Tape the bottom of the box securely so that it doesn't ___ way.

3. We take document security seriously , " she said , but refused to ___ any more details about how the papers came to be on a road.

4. Clap your hands or ___ it a gentle shove until it jumps up and walks away.

Page 56: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

61LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Detailed Results

r = .66r = .66

Page 57: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

62LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Summary – Bundled Gap-fill

Bundled gap fill tests can be effectively generated

Gap bundles decrease ambiguity

Page 58: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

63LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Wrap Up

Reducing ambiguity of cloze tests Works to some extent, but not suitable for

automatic generation

housewifecarjob handkeysglasseswalletsoulphonemarblesconfidencehopewillfuturedegreefriendheadbookpenrubberhatring

He lost his _____ yesterday

Page 59: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

64LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Wrap Up

Reducing ambiguity of cloze tests Works to some extent, but not suitable for

automatic generation

Generating challenging distractors Works to some extent, but not good enough for

automatic generation

Too easy drinksCarrier Sentence Microsoft acquires Skype for 8.5b dollar.

Carrier Sentencewith Gap

Microsoft ________ Skype for 8.5b dollar.Challenging gains

Invalid purchases

Fill-the-gap Example Distractor Difficulty

Page 60: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

65LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Wrap Up

Reducing ambiguity of cloze tests Works to some extent, but not suitable for

automatic generation

Generating challenging distractors Works to some extent, but not good enough for

automatic generation

New approach bundled gap-fill Can be reliably generated Many open questions …

Page 61: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

66LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

Future Work (Bundles only)

Adapt to other languages

Improve gap probability model

Relation to established proficiency tests

Predict and adapt bundle difficulty

Thanks for your ____Special ____ should be paid to this work____ is given to these exercises____ is a currency

Page 62: Torsten Zesch - spraakbanken.gu.se · Generate Bundles Target words controlled for word frequency and word classes Sentence base single gap selected at random GUM corpus (Zeldes,

67LTLab | Torsten Zesch | NLP4CALL & LA, Gothenburg - May 22, 2017

It’s all team work