![Page 1: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/1.jpg)
Text to 3D Scene Generation with Rich Lexical Grounding
ACL-IJCNLP July 27, 2015 Beijing, China
“There is a desk and there is a notepad on the desk. There is a pen next to the notepad.”
Angel Chang, Will Monroe, Manolis Savva, Christopher Potts, Christopher D. Manning
Stanford University
![Page 2: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/2.jpg)
Outline
● Introduction and prior work
● Dataset
● Lexical learning
● Generation with lexical grounding
● Evaluation
● Challenges and Conclusion
![Page 7: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/7.jpg)
The art of 3D scene design
● Call of Duty: Advanced Warfare [Activision / Sledgehammer Games]
● Toy Story 3 [Disney / Pixar]
● “Modern: Plywood, Plastic & Polished Metal” [Homedit Interior Design & Architecture]
![Page 9: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/9.jpg)
Generating 3D scenes from text
TOYS’ POV -- An idyllic day care classroom, filled with the happy bustle of four- and five-year-olds, playing with toys -- dinosaurs, a baby doll, a pink Teddy bear, a Ken doll. ...
A Tonka Truck races forward, then backs up in a quick 180 arc, revealing a large pink Teddy bear, LOTSO, in its bed. Lotso taps a Tinker Toy cane and the truck bed rises, “dumping” him out. Like Bob Hope stepping off the links in Palm Springs, Lotso exudes an easy, cheerful charisma.
(Screenplay by Michael Arndt)
![Page 10: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/10.jpg)
Selected prior work
● SHRDLU (Winograd, 1972)
● WordsEye (Coyne and Sproat, 2001)
![Page 14: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/14.jpg)
Scene generation pipeline
There is a room with a wooden desk and a black lamp. There is a chair to the right of the desk.
parsing → object selection → layout
(Chang et al., 2014)
![Page 15: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/15.jpg)
Handling lexical variety
● sofa, couch, loveseat
● dresser, chest of drawers, cabinet
![Page 17: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/17.jpg)
Identifying object mentions
Wood table and four wood chairs in the center of the room
Can we fix this by learning from data?
![Page 18: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/18.jpg)
Outline
● Introduction and prior work
● Dataset
● Lexical learning
● Generation with lexical grounding
● Evaluation
● Challenges and conclusion
![Page 20: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/20.jpg)
Dataset
There is a bed and there is a chair next to the bed.
![Page 25: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/25.jpg)
Structure of a 3D scene
{ 'modelID': '7bdc0aac', 'position': [118.545639, 97.979499, 3.098599], 'scale': 0.087807, 'rotation': -1.088704}

| Field | Value |
|---|---|
| name | ellington armchair |
| id | 7bdc0aac |
| tags | armchair, chair, ellington, haughton, sam, seating, woodmark |
| category | Chair |
| wnlemmas | armchair |
| unit | 0.028974 |
| up | [0, 0, 1] |
| front | [0, -1, 0] |

Annotations: WordNet; human-tagged keywords & categories; size & orientation suggestions
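As a rough illustration of the two records on this slide, the sketch below pairs a placed object with its model metadata through the shared id. The class names `SceneObject` and `ModelInfo`, the helper `from_record`, and the in-memory `MODEL_DB` are hypothetical names chosen for this example; only the field names and values come from the slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SceneObject:
    """One placed object in a scene (fields as shown on the slide)."""
    model_id: str
    position: Tuple[float, float, float]
    scale: float
    rotation: float


@dataclass
class ModelInfo:
    """Per-model metadata; only a subset of the slide's fields is kept."""
    name: str
    category: str
    tags: List[str]
    wnlemmas: List[str]


def from_record(record: dict) -> SceneObject:
    # Convert the dict form shown on the slide into a typed record.
    return SceneObject(
        model_id=record["modelID"],
        position=tuple(record["position"]),
        scale=record["scale"],
        rotation=record["rotation"],
    )


# Hypothetical in-memory model database keyed by model id.
MODEL_DB: Dict[str, ModelInfo] = {
    "7bdc0aac": ModelInfo(
        name="ellington armchair",
        category="Chair",
        tags=["armchair", "chair", "ellington", "haughton",
              "sam", "seating", "woodmark"],
        wnlemmas=["armchair"],
    ),
}

record = {"modelID": "7bdc0aac",
          "position": [118.545639, 97.979499, 3.098599],
          "scale": 0.087807,
          "rotation": -1.088704}

obj = from_record(record)
info = MODEL_DB[obj.model_id]  # join placement with metadata via the id
print(info.category, obj.position)
```

Keeping placement and metadata separate means a scene only stores ids and transforms, while the lexical information (tags, category, WordNet lemmas) lives once per model.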
![Page 30: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/30.jpg)
Dataset
There is a bed and there is a chair next to the bed.
Floor to ceiling windows on back wall. Green bed with two pillows and black blanket. Lights recessed into right side wall. Light wood flooring. A chair is in the upper right hand corner
There is a bed on the side of the room. There is a chair in the corner, next to the windows.
I see a bed and a chair.
The room has three windows on one wall. There is a red bed in the back of the room. Along side the bed is a side chair that is red and white.
This room has a bed with red bedding against the wall. Next to the bed is a chair.
there is a antique looking bed with red covers and pillows in a room. next to it is a recliner chair with red padding. also there are windows.
there is a bed with five pillows on it, and next to it is a chair
There is a bed in the room with two pillows and a small chair near to the right side of it.
There is a large grey bed in the bottom right corner of the room. Above the bed is a small black chair.
1128 scenes
4284 scene descriptions
60 seed sentences
![Page 31: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/31.jpg)
Outline
● Introduction and prior work
● Dataset
● Lexical learning
● Generation with lexical grounding
● Evaluation
● Challenges and conclusion
![Page 32: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/32.jpg)
Discrimination task
brown room with a refrigerator in the back corner
Candidate scenes: A, B, C, D, E
![Page 33: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/33.jpg)
Discrimination task
brown room with a refrigerator in the back corner
Matching scene: D
![Page 34: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/34.jpg)
Learning lexical items
● One-vs.-all logistic regression
● Features: 1{(language, object)}
– language: bag-of-words / bag-of-bigrams (brown, brown room, room, room with, with, ...)
– object: model id / category (room01, room02, 7bdc0aac, cat:Room, cat:Refrigerator, ...)
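The feature scheme above can be sketched as follows: every (n-gram, object) pair that co-occurs in a description and a scene becomes one indicator feature, and a scene is scored by summing the learned weights of its active features. This is a minimal sketch under stated assumptions, not the authors' implementation: the training step is a plain binary logistic update (correct scene vs. distractor) standing in for the one-vs.-all setup, and the names `pair_features`, `sgd_step`, `pick_scene`, and the model id `fridge01` are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product


def language_features(description):
    # Bag of words plus bag of bigrams.
    tokens = description.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return set(tokens) | set(bigrams)


def object_features(scene):
    # Model ids plus "cat:"-prefixed categories.
    feats = set()
    for obj in scene:
        feats.add(obj["modelID"])
        feats.add("cat:" + obj["category"])
    return feats


def pair_features(description, scene):
    # Indicator features 1{(language, object)}: the cross product.
    return set(product(language_features(description), object_features(scene)))


def score(theta, description, scene):
    return sum(theta[f] for f in pair_features(description, scene))


def sgd_step(theta, description, scene, label, lr=0.1):
    # One logistic-regression gradient step; label is 1 (match) or 0.
    p = 1.0 / (1.0 + math.exp(-score(theta, description, scene)))
    for f in pair_features(description, scene):
        theta[f] += lr * (label - p)


def pick_scene(theta, description, candidates):
    # Discrimination task: choose the highest-scoring candidate scene.
    return max(candidates, key=lambda name: score(theta, description, candidates[name]))


desc = "brown room with a refrigerator"
scene_a = [{"modelID": "room01", "category": "Room"},
           {"modelID": "fridge01", "category": "Refrigerator"}]  # hypothetical id
scene_b = [{"modelID": "room01", "category": "Room"},
           {"modelID": "7bdc0aac", "category": "Chair"}]

theta = defaultdict(float)
for _ in range(20):
    sgd_step(theta, desc, scene_a, label=1)
    sgd_step(theta, desc, scene_b, label=0)

best = pick_scene(theta, desc, {"A": scene_a, "B": scene_b})
print(best)
```

Adding the category features alongside model ids lets a weight learned for one refrigerator model transfer to other models of the same category.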
![Page 35: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/35.jpg)
Discrimination results
● Accuracy (% correct scenes identified)

| Object features | Random set |
|---|---|
| Model ids only | 71.5% |
| Model ids + categories | 83.3% |
![Page 36: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/36.jpg)
Lexical grounding examples

| text | category |
|---|---|
| chair | Chair |
| couch | Couch |
| sofa | Couch |
| fruit | Bowl |
| bookshelf | Bookcase |
![Page 37: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/37.jpg)
Lexical grounding examples
![Page 38: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/38.jpg)
Outline
● Introduction and prior work
● Dataset
● Lexical learning
● Generation with lexical grounding
● Evaluation
● Challenges and conclusion
![Page 39: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/39.jpg)
Generate!
There is a room with a wooden desk and a black lamp. There is a chair to the right of the desk.
?
![Page 40: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/40.jpg)
Baseline
There is a room with a wooden desk and a black lamp. There is a chair to the right of the desk.
n-grams shown: There is, a, room, a wooden, wooden desk, desk, a black, black lamp, chair
![Page 43: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/43.jpg)
Baseline
There is a room with a wooden desk and a black lamp. There is a chair to the right of the desk.
group by object, sum weights
Summed weights: 2.1, 1.5, 2.3, 2.0, 1.7, 1.8, 1.9
![Page 44: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/44.jpg)
Baseline
There is a room with a wooden desk and a black lamp. There is a chair to the right of the desk.
choose top k (k = 4)
k = 4: average number of objects in human-constructed scenes
Summed weights: 2.1, 1.5, 2.3, 2.0, 1.7, 1.8, 1.9
![Page 45: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/45.jpg)
Baseline
There is a room with a wooden desk and a black lamp. There is a chair to the right of the desk.
choose top k (k = 4)
No relationship enforced between objects! Combine with rule-based parser?
Summed weights: 2.1, 1.5, 2.3, 2.0, 1.7, 1.8, 1.9
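The baseline steps (group by object, sum weights, choose top k) can be sketched as below. The function name `baseline_select`, the object ids, and the individual weights are hypothetical values invented for illustration; only the procedure and k = 4 come from the slides.

```python
from collections import defaultdict


def baseline_select(ngram_object_weights, k=4):
    """Group learned (n-gram, object) weights by object, sum them,
    and return the k objects with the highest totals."""
    totals = defaultdict(float)
    for (_ngram, obj), weight in ngram_object_weights.items():
        totals[obj] += weight
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical weights for n-grams found in the example description.
weights = {
    ("desk", "desk_a"): 1.2,
    ("wooden desk", "desk_a"): 1.1,
    ("room", "room_a"): 2.1,
    ("chair", "chair_a"): 2.0,
    ("black lamp", "lamp_a"): 1.9,
    ("a black", "lamp_b"): 1.5,
    ("a", "rug_a"): 0.4,
}

selected = [obj for obj, _total in baseline_select(weights, k=4)]
print(selected)
```

Because each object is scored independently, nothing in this procedure ties the chair to the desk, which is the limitation the slide points out.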
![Page 50: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/50.jpg)
Rule-based parsing
There is a room with a wooden desk and a black lamp. There is a chair to the right of the desk.
● Identify object categories using noun phrases
● Identify attributes and keywords using modifiers and dependency patterns
● Identify spatial relations using dependency patterns
● Look up objects from DB using categories and keywords
(Chang et al., 2014)
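The last step, looking up objects by category and keywords, might look roughly like the sketch below. It assumes a model database with the `id`, `category`, and `tags` fields shown earlier in the talk; the function `lookup_models`, the ranking by tag overlap, and the entries `chair_plain` and `desk_wood` are assumptions made for this example rather than the parser's actual behaviour.

```python
def lookup_models(db, category, keywords):
    """Return ids of models in the given category, ranked by how many
    of the requested keywords appear among each model's tags."""
    wanted = {k.lower() for k in keywords}
    matches = []
    for model in db:
        if model["category"].lower() != category.lower():
            continue
        overlap = len(wanted & {t.lower() for t in model["tags"]})
        matches.append((overlap, model["id"]))
    # Stable sort: ties keep their database order.
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [model_id for _overlap, model_id in matches]


db = [
    {"id": "chair_plain", "category": "Chair",
     "tags": ["chair", "seating"]},                      # hypothetical entry
    {"id": "7bdc0aac", "category": "Chair",
     "tags": ["armchair", "chair", "ellington", "haughton",
              "sam", "seating", "woodmark"]},
    {"id": "desk_wood", "category": "Desk",
     "tags": ["desk", "wooden"]},                        # hypothetical entry
]

candidates = lookup_models(db, "Chair", ["armchair"])
print(candidates)
```

A lookup of this kind only succeeds when the description uses the same words as the tags, which is where learned lexical grounding is meant to help.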
![Page 56: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/56.jpg)
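Parsing + learned lexical grounding
there is a room with a wooden desk and a black lamp
0.302 0.460 -0.021
Lamp 2.304, Table 0.622, Vase -0.310

$$c = \arg\max_{c} \sum_{\phi_i \in \phi(p)} \theta(i, c)$$

$$m = \arg\max_{m \in c} \Big( \lambda_d \sum_{\phi_i \in \phi(d)} \theta(i, m) + \lambda_x \sum_{\phi_i \in \phi(x)} \theta(i, m) \Big)$$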
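The two argmax steps can be sketched as below: first pick the category with the highest summed weight, then pick a model within that category using a weighted combination of two feature sets. The slides do not spell out what p, d, and x denote, so the sketch simply treats them as three lists of text features. All weights, model ids, the λ values, and the names `feature_sum`, `choose_category`, and `choose_model` are hypothetical.

```python
def feature_sum(theta, features, target):
    # Sum of theta(i, target) over the given features; missing weights are 0.
    return sum(theta.get((f, target), 0.0) for f in features)


def choose_category(theta, p_feats, categories):
    # c = argmax_c sum_{phi_i in phi(p)} theta(i, c)
    return max(categories, key=lambda c: feature_sum(theta, p_feats, c))


def choose_model(theta, d_feats, x_feats, models, lam_d, lam_x):
    # m = argmax_{m in c} (lam_d * sum over phi(d) + lam_x * sum over phi(x))
    def combined(m):
        return (lam_d * feature_sum(theta, d_feats, m)
                + lam_x * feature_sum(theta, x_feats, m))
    return max(models, key=combined)


# Hypothetical learned weights, keyed by (text feature, category or model id).
theta = {
    ("lamp", "Lamp"): 2.0, ("black", "Lamp"): 0.3,
    ("lamp", "Table"): 0.6, ("black", "Vase"): -0.3,
    ("lamp", "lamp_a"): 0.3,
    ("black", "lamp_b"): 0.4, ("lamp", "lamp_b"): 0.1,
    ("wooden", "lamp_c"): 0.2,
}
models_by_category = {"Lamp": ["lamp_a", "lamp_b", "lamp_c"]}

p_feats = ["black", "lamp"]
d_feats = ["room", "wooden", "desk", "black", "lamp"]
x_feats = ["black", "lamp"]

category = choose_category(theta, p_feats, ["Lamp", "Table", "Vase"])
model = choose_model(theta, d_feats, x_feats,
                     models_by_category[category], lam_d=0.5, lam_x=1.0)
print(category, model)
```

Restricting the second argmax to models of the chosen category keeps a high-scoring but wrong-category model from being selected.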
![Page 57: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/57.jpg)
Parsing + learned lexical grounding
There is a room with a wooden desk and a black lamp. There is a chair to the right of the desk.
![Page 58: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/58.jpg)
Scene generation pipeline
There is a room with a wooden desk and a black lamp. There is a chair to the right of the desk.
parsing → object selection → layout
(Chang et al., 2014)
![Page 59: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/59.jpg)
Generated scene examples
A round table is in the center of the room with four chairs around the table. There is a double window facing west. A door is on the east side of the room.
![Page 60: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/60.jpg)
Outline
● Introduction and prior work
● Dataset
● Lexical learning
● Generation with lexical grounding
● Evaluation
● Challenges and conclusion
![Page 62: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/62.jpg)
Evaluation
● Turkers rated fidelity of generated scenes on a scale of 1 (poor) to 7 (good)
● Compare scenes generated with four methods against human-built scenes
![Page 63: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/63.jpg)
Evaluation
In between the doors and the window, there is a black couch with red cushions, two white pillows, and one black pillow. In front of the couch, there is a wooden coffee table with a glass top and two newspapers. Next to the table, facing the couch, is a wooden folding chair.
human-built
![Page 64: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/64.jpg)
![Page 65: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/65.jpg)
Evaluation
● Turkers rated fidelity of generated scenes on a scale of 1 (poor) to 7 (good)
● Compare scenes generated with 4 methods (random, lexical baseline, rule-based parser, combined) against human-built scenes
![Page 66: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/66.jpg)
Evaluation
● Turkers rated fidelity of generated scenes on a scale of 1 (poor) to 7 (good)
● Compare scenes generated with 4 methods (random, lexical baseline, rule-based parser, combined) against human-built scenes
● Two sets of scene descriptions: Seed (seed sentences) and Mturk (descriptions provided by Turkers)
![Page 67: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/67.jpg)
Dataset
Seed: There is a bed and there is a chair next to the bed.
![Page 68: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/68.jpg)
Dataset
Seed: There is a bed and there is a chair next to the bed.
Simple, no modifiers
![Page 69: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/69.jpg)
![Page 70: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/70.jpg)
Dataset
Seed: There is a bed and there is a chair next to the bed.
Mturk:
● Floor to ceiling windows on back wall. Green bed with two pillows and black blanket. Lights recessed into right side wall. Light wood flooring. A chair is in the upper right hand corner
● There is a bed on the side of the room. There is a chair in the corner, next to the windows.
● I see a bed and a chair.
● The room has three windows on one wall. There is a red bed in the back of the room. Along side the bed is a side chair that is red and white.
● This room has a bed with red bedding against the wall. Next to the bed is a chair.
● there is a antique looking bed with red covers and pillows in a room. next to it is a recliner chair with red padding. also there are windows.
● there is a bed with five pillows on it, and next to it is a chair
● There is a bed in the room with two pillows and a small chair near to the right side of it.
● There is a large grey bed in the bottom right corner of the room. Above the bed is a small black chair.
![Page 71: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/71.jpg)
Dataset
More complex, varied language
![Page 72: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/72.jpg)
Evaluation Results

| Method | Seed |
| --- | --- |
| Random | 2.03 |
| Lexical baseline | 3.51 |
| Rule-based parser | 5.44 |
| Combined | 5.23 |
| Human-built | 6.06 |

Turkers rated fidelity of generated scenes on a scale of 1 (poor) to 7 (good)
168 participants, average 4.2 ratings per scene-description pair
![Page 73: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/73.jpg)
![Page 74: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/74.jpg)
![Page 75: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/75.jpg)
![Page 76: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/76.jpg)
Evaluation Results

| Method | Seed | Mturk |
| --- | --- | --- |
| Random | 2.03 | 1.68 |
| Lexical baseline | 3.51 | 2.61 |
| Rule-based parser | 5.44 | 3.15 |
| Combined | 5.23 | 3.73 |
| Human-built | 6.06 | 5.87 |

Turkers rated fidelity of generated scenes on a scale of 1 (poor) to 7 (good)
168 participants, average 4.2 ratings per scene-description pair
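
As a rough aid for reading the results, the reported means can be compared programmatically. The sketch below copies the ten mean ratings from the results slide and computes differences between methods on each description set. The names `RATINGS`, `difference`, and `best_automatic` are illustrative assumptions and are not part of the authors' released code or dataset.

```python
# Mean fidelity ratings (1 = poor, 7 = good), copied from the results slide.
RATINGS = {
    "Random":            {"Seed": 2.03, "Mturk": 1.68},
    "Lexical baseline":  {"Seed": 3.51, "Mturk": 2.61},
    "Rule-based parser": {"Seed": 5.44, "Mturk": 3.15},
    "Combined":          {"Seed": 5.23, "Mturk": 3.73},
    "Human-built":       {"Seed": 6.06, "Mturk": 5.87},
}


def difference(method_a, method_b, dataset):
    """Return rating(method_a) - rating(method_b) on one description set."""
    return round(RATINGS[method_a][dataset] - RATINGS[method_b][dataset], 2)


def best_automatic(dataset):
    """Return the highest-rated method, excluding the human-built scenes."""
    automatic = {
        method: scores[dataset]
        for method, scores in RATINGS.items()
        if method != "Human-built"
    }
    return max(automatic, key=automatic.get)


if __name__ == "__main__":
    for dataset in ("Seed", "Mturk"):
        best = best_automatic(dataset)
        print(dataset)
        print("  best automatic method:", best)
        print("  Combined - Rule-based parser:",
              difference("Combined", "Rule-based parser", dataset))
        print("  Human-built - best automatic:",
              difference("Human-built", best, dataset))
```

On the reported means, Combined trails the rule-based parser by 0.21 on Seed but leads it by 0.58 on Mturk, and the gap between human-built scenes and the best automatic method grows from 0.62 (Seed) to 2.14 (Mturk).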
![Page 77: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/77.jpg)
![Page 78: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/78.jpg)
![Page 79: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/79.jpg)
![Page 80: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/80.jpg)
Outline
● Introduction and prior work
● Dataset
● Lexical learning
● Generation with lexical grounding
● Evaluation
● Challenges and conclusion
![Page 81: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/81.jpg)
![Page 82: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/82.jpg)
Generated scene examples
In between the doors and the window, there is a black couch with red cushions, two white pillows, and one black pillow. In front of the couch, there is a wooden coffee table with a glass top and two newspapers. Next to the table, facing the couch, is a wooden folding chair.
![Page 83: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/83.jpg)
![Page 84: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/84.jpg)
![Page 85: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/85.jpg)
![Page 86: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/86.jpg)
![Page 87: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/87.jpg)
Generated scene examples
In between the doors and the window, there is a black couch with red cushions, two white pillows, and one black pillow. In front of the couch, there is a wooden coffee table with a glass top and two newspapers. Next to the table, facing the couch, is a wooden folding chair.
?
![Page 88: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/88.jpg)
Remaining Challenges
● Grounding of spatial relations, e.g. “facing the couch”
● Coreference, e.g. “There in the middle is a table. On the table is a cup.”
![Page 89: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/89.jpg)
Summary
● Learning of lexical grounding to handle linguistic variation in scene description
![Page 90: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/90.jpg)
Summary
● Learning of lexical grounding to handle linguistic variation in scene description
● Combined rule-based parser and learned lexical groundings for scene generation
![Page 91: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/91.jpg)
Summary
● Learning of lexical grounding to handle linguistic variation in scene description
● Combined rule-based parser and learned lexical groundings for scene generation
● Evaluation demonstrating improved text to scene generation
![Page 92: Text to 3D Scene Generation with Rich Lexical … to 3D Scene Generation with Rich Lexical Grounding ACL-IJCNLP July 27, 2015 Beijing, China “There is a desk and there is a notepad](https://reader031.vdocuments.us/reader031/viewer/2022021819/5acda0417f8b9a93268dd419/html5/thumbnails/92.jpg)
Thank you!
Dataset is publicly available: http://nlp.stanford.edu/data/text2scene.shtml