TRANSCRIPT
- Slide 1
- SAY ANYTHING
  - Selma P. and Parth P.
- Slide 2
- SAY WHAT?
  - Data-driven system that allows for interactive storytelling
  - Human author takes turns with computer author
- Slide 3
- EXAMPLE
  - H: You'll never believe what happened last night.
  - C: Leigh laughed at my joke but I couldn't help but think Liz would have laughed harder.
  - H: The joke wasn't very funny in a ha ha kind of way.
  - C: It wasn't anything like that, I thought he was going to give me a good night kiss but he ended up licking my cheek, she declared.
  - H: It made me sneeze and snort out loud.
  - C: And now my nose hurts from the snorting.
- Slide 6
- WHY DO THIS?
  - Intriguing: blends structure with creativity
  - Void: game of sorts; fits in the space between language games and more graphically oriented video games
  - Foundational work in progress
  - Research
- Slide 7
- NARRATIVE MODELS (MAJEWSKI 2003)
  - Five ways:
    - Linear
    - String of pearls
    - Branching
    - Amusement park
    - Building blocks
- Slide 8
- WHAT'S BEEN DONE BEFORE?
  - Façade
  - http://interactivestoriesonline.com/
  - http://www.interactivenarratives.org/
  - Top-down approach
  - Strict domains
  - Technical / non-inclusive
- Slide 9
- HOW IT WORKS
  1. User enters a sentence.
  2. The user's sentence is used to search the corpus using the term frequency-inverse document frequency (tf-idf) algorithm.
  3. The highest-scored match is retrieved. The sentence after the best match is output by the computer.
  4. Repeat.
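The turn-taking loop described on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the system's actual implementation: the toy corpus, the function names (`tokenize`, `build_idf`, `tfidf_vector`, `cosine`, `computer_turn`), the smoothed idf variant, and the use of cosine similarity for scoring are all choices made here for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace, stripping surrounding punctuation."""
    words = (w.strip(".,!?;:\"'()") for w in text.lower().split())
    return [w for w in words if w]


def build_idf(sentences):
    """Smoothed inverse document frequency, treating each sentence as a document."""
    n = len(sentences)
    df = Counter()
    for sentence in sentences:
        df.update(set(tokenize(sentence)))
    return {word: math.log((1 + n) / (1 + count)) + 1.0 for word, count in df.items()}


def tfidf_vector(text, idf):
    """Sparse tf-idf vector; words unseen in the corpus are ignored."""
    counts = Counter(tokenize(text))
    return {w: c * idf[w] for w, c in counts.items() if w in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def computer_turn(user_sentence, stories, idf):
    """Steps 2-3: find the best-matching corpus sentence, return the one after it."""
    query = tfidf_vector(user_sentence, idf)
    best_score, best_next = -1.0, None
    for story in stories:
        # The last sentence of a story has no successor, so it is never a candidate.
        for current, following in zip(story, story[1:]):
            score = cosine(query, tfidf_vector(current, idf))
            if score > best_score:
                best_score, best_next = score, following
    return best_next


# Toy corpus (invented for illustration): each story is a list of sentences.
stories = [
    ["I told a joke at dinner last night.", "Nobody laughed except my sister."],
    ["We drove to the lake on Sunday.", "The water was freezing but we swam anyway."],
]
idf = build_idf([s for story in stories for s in story])
reply = computer_turn("You'll never believe the joke I heard last night.", stories, idf)
print(reply)  # prints: Nobody laughed except my sister.
```

Step 4 ("Repeat") would simply call `computer_turn` again with the human author's next sentence.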
- Slide 10
- HOW DID THEY DO IT?
- Slide 11
- GETTING DATA (DB)
  - Considered manual sources (StoryCorps / Fed. Writers)
    - Well curated, biased
  - Favored blog posts (3.4 million)
    - 1.06 billion words
  - Extraction
    - Only 17% of textual material on weblogs is narrational
    - 3.7 million story segments
    - 66.5 million sentences
  - Favored recall over precision
    - FN preferred over FP
- Slide 12
- GETTING DATA
  - Spinn3r.com: 44 million weblog posts
  - Sampled / hand-annotated 5,270 blogs
  - Annotated validation set
- Slide 13
- GETTING DATA
  - Randomized training / testing
  - Supervised machine learning
  - Binary classification problem
  - Confidence Weighted Linear Classifier [Dredze 2008]
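To make the "binary classification problem" concrete, here is a small sketch of a linear story / non-story classifier. It deliberately swaps in a plain perceptron rather than the confidence-weighted classifier cited on the slide, which additionally tracks a per-feature confidence. The bag-of-words features, the four training sentences, and the function names are assumptions made for illustration only.

```python
from collections import Counter


def features(text):
    """Bag-of-words counts; a hypothetical stand-in for the real feature set."""
    return Counter(text.lower().split())


def train_perceptron(examples, epochs=10):
    """examples: list of (text, label), label +1 (story) or -1 (not a story)."""
    weights = Counter()
    bias = 0.0
    for _ in range(epochs):
        for text, label in examples:
            x = features(text)
            activation = bias + sum(weights[w] * c for w, c in x.items())
            if label * activation <= 0:  # misclassified: nudge toward the label
                for w, c in x.items():
                    weights[w] += label * c
                bias += label
    return weights, bias


def is_story(text, weights, bias):
    """Predict True when the linear score is positive."""
    x = features(text)
    return bias + sum(weights[w] * c for w, c in x.items()) > 0


# Invented, hand-labeled examples standing in for the annotated blog sample.
train = [
    ("yesterday i went to the store and met my old friend", 1),
    ("last night we walked home and i lost my keys", 1),
    ("top ten tips for faster laptop performance", -1),
    ("the election results will be announced by the committee", -1),
]
weights, bias = train_perceptron(train)
print(is_story("last night i met my friend", weights, bias))   # prints: True
print(is_story("tips for laptop performance", weights, bias))  # prints: False
```

In practice the labeled data would be split at random into training and testing portions, with the held-out portion used to measure precision and recall.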
- Slide 14
- GETTING DATA
  - [Pipeline diagram; labels: Rando. Data Set, Sampling, annotation, training / testing, F(x)]
- Slide 15
- CORPUS (DB) CREATION
  - Crawled blogs and applied algorithm
  - (2012): Post-processing
    - Parse trees
    - Verb tenses
    - First-person pronouns
- Slide 16
- CORPUS (DB) CREATION
  - [Pipeline diagram; labels: Rando. Data Set, Sampling, annotation, training / testing, F(x), crawl blogs, Blog data, F(x), Story data]
- Slide 17
- APPLICATION
  - Querying corpus
  - Optimized: return a list of stories that contain any words matching the user input
  - Use TF-IDF!
  - [Diagram label: Story data]
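The optimization on this slide (first narrow the corpus to stories sharing any word with the user input, then score only those) is commonly done with an inverted index. The sketch below shows that idea under assumptions: the slides do not say which data structure was used, and the names `build_index` and `candidate_stories` and the three sample stories are hypothetical.

```python
from collections import defaultdict


def tokenize(text):
    """Set of lowercase words with surrounding punctuation stripped."""
    words = (w.strip(".,!?;:\"'()") for w in text.lower().split())
    return {w for w in words if w}


def build_index(stories):
    """Map each word to the set of story ids whose text contains it."""
    index = defaultdict(set)
    for story_id, text in enumerate(stories):
        for word in tokenize(text):
            index[word].add(story_id)
    return index


def candidate_stories(user_input, index):
    """Ids of stories sharing at least one word with the user input."""
    matches = set()
    for word in tokenize(user_input):
        matches |= index.get(word, set())
    return sorted(matches)


# Invented sample stories, one string per story.
stories = [
    "We drove to the lake on Sunday.",
    "My sister told a joke at dinner.",
    "It rained all week.",
]
index = build_index(stories)
print(candidate_stories("Did you hear the joke?", index))  # prints: [0, 1]
```

Only the returned candidates would then need a TF-IDF score, which avoids comparing the input against every sentence in the corpus.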
- Slide 18
- APPLICATION
  - [Pipeline diagram; labels: Rando. Data Set, Sampling, annotation, training / testing, F(x), crawl blogs, Blog data, F(x), Story data, User Input, Query, Story data, Match Algorithm (tf-idf), Computer Output]
- Slide 19
- TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)
  - TF-IDF is a numerical statistic that reflects how important a word is to a document in a corpus.
  - The term frequency measures how often a word appears in a document.
  - The inverse document frequency measures how rare a word is across the corpus as a whole. It tells us how much information a word provides: the fewer documents contain the word, the higher its weight.
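One common textbook formulation of the statistic is given below. The slides do not say which variant the system uses, so the exact form (raw counts, natural logarithm, no smoothing) is an assumption. Here $f_{t,d}$ is the number of times term $t$ occurs in document $d$, $D$ is the corpus, and $N = |D|$.

```latex
\begin{align}
  \mathrm{tf}(t, d)       &= f_{t,d} \\
  \mathrm{idf}(t, D)      &= \ln \frac{N}{\left|\{\, d \in D : t \in d \,\}\right|} \\
  \mathrm{tfidf}(t, d, D) &= \mathrm{tf}(t, d) \cdot \mathrm{idf}(t, D)
\end{align}
```

As a worked example with made-up numbers: a word that appears 3 times in a document and occurs in 10 of 1,000 documents has $\mathrm{idf} = \ln(1000/10) \approx 4.61$ and $\mathrm{tfidf} \approx 3 \times 4.61 \approx 13.8$. A word that occurs in every document has $\mathrm{idf} = \ln 1 = 0$ and therefore contributes nothing to the match score.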
- Slide 20
- TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)
  - [Image] Image credit: Li (2011)
- Slide 21
- RESULTS
- Slide 22
- AREAS TO IMPROVE
  - Metrics?
    - Entertainment
    - Coherence
- Slide 23
- AREAS TO IMPROVE
  - Metrics?
    - Entertainment
    - Coherence
    - Believability / usability
    - Compare how well the output matches the next sentence the user would have written
- Slide 24
- AREAS TO IMPROVE
  - Fails to use all preceding sentences.
  - Only returns the highest-ranked search result.
- Slide 27
- FUTURE WORK
  - No narrative plot
- Slide 28
- WORK IN PROGRESS
  - Foundational
  - Last article was in 2012
- Slide 29
- THANKS
- Slide 30
- DISCUSSION QS
- Slide 31
- IMPROVEMENTS?
- Slide 32
- QUALITY ASSESSMENT
- Slide 33
- MODIFY CORPUS
- Slide 34
- PAPER CONTENT?