![Page 1](https://reader035.vdocuments.us/reader035/viewer/2022070605/5a4d1ae97f8b9ab059979e99/html5/thumbnails/1.jpg)
Intelligent Email: Reply and Attachment Prediction
Mark Dredze, Tova Brooks, Josh Carroll, Joshua Magarick, John Blitzer, Fernando Pereira
Presented by Nareg Torosian
![Page 2](https://reader035.vdocuments.us/reader035/viewer/2022070605/5a4d1ae97f8b9ab059979e99/html5/thumbnails/2.jpg)
What’s the use?
- Whittaker & Sidner’s “email overload”
  - Task management
  - Personal archiving
  - Asynchronous communication
- Assist overwhelmed email users
- Support enhanced email interface
![Page 3](https://reader035.vdocuments.us/reader035/viewer/2022070605/5a4d1ae97f8b9ab059979e99/html5/thumbnails/3.jpg)
Intelligent? How?
- Prediction tasks treated as binary classification problems
- Each message is a binary vector, where each dimension represents a feature
- Learning performed with logistic regression
- System evaluated using F1, the harmonic mean of precision and recall
- Single-user (adaptive) and cross-user (adaptable) settings
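
The setup on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the training loop is plain stochastic gradient ascent, and the names `train_logistic`, `predict`, and `f1_score`, the learning rate, and the epoch count are all assumptions made for the example.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, epochs=200, lr=0.5):
    # X: list of binary feature vectors; y: list of 0/1 labels.
    # Hypothetical hyperparameters; the slides do not state any.
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            err = yi - p  # gradient of the log-likelihood w.r.t. the score
            b += lr * err
            w = [wj + lr * err * xj for wj, xj in zip(w, xi)]
    return w, b

def predict(w, b, x):
    # Predict 1 when the estimated probability is at least 0.5.
    return 1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) >= 0.5 else 0

def f1_score(y_true, y_pred):
    # F1 is the harmonic mean of precision and recall.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy data: the label simply copies the first feature.
X = [[1, 0], [1, 1], [0, 1], [0, 0]]
y = [1, 1, 0, 0]
w, b = train_logistic(X, y)
predictions = [predict(w, b, x) for x in X]
```

F1 is used rather than accuracy because it ignores true negatives, which matters when the positive class (for example, "needs reply") is rare.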
![Page 4](https://reader035.vdocuments.us/reader035/viewer/2022070605/5a4d1ae97f8b9ab059979e99/html5/thumbnails/4.jpg)
Reply prediction
- Indicate which messages require reply
- Allow user to manage these messages
![Page 5](https://reader035.vdocuments.us/reader035/viewer/2022070605/5a4d1ae97f8b9ab059979e99/html5/thumbnails/5.jpg)
Reply prediction features
- Relational features
  - Based on user profile
    - # of sent and received messages, address book, email address and domain
  - "I appear in the CC list," "I frequently reply to this user," etc.
  - 200 in Dredze et al.’s experiment
- Document features
  - Presence of question marks and question words
  - TF-IDF (term frequency-inverse document frequency) scores
  - Presence of attachments
  - 14,800 in Dredze et al.’s experiment
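
As a rough sketch of the document features listed above: the question-word list, the tokenizer, and the feature names below are illustrative assumptions, and `tf_idf` uses the textbook definition (relative term frequency times log inverse document frequency), which may differ from the variant used in the paper.

```python
import math
import re

# Illustrative list; the slides do not enumerate the question words used.
QUESTION_WORDS = {"who", "what", "when", "where", "why", "how", "which"}

def tokenize(text):
    # Lowercase the text and keep alphabetic tokens only.
    return re.findall(r"[a-z']+", text.lower())

def document_features(body, has_attachment):
    # Binary indicators; feature names are hypothetical.
    tokens = tokenize(body)
    return {
        "has_question_mark": int("?" in body),
        "has_question_word": int(any(t in QUESTION_WORDS for t in tokens)),
        "has_attachment": int(bool(has_attachment)),
    }

def tf_idf(term, doc_tokens, corpus):
    # corpus: list of token lists, one per message.
    if not doc_tokens:
        return 0.0
    tf = doc_tokens.count(term) / len(doc_tokens)
    df = sum(1 for d in corpus if term in d)
    if df == 0:
        return 0.0
    return tf * math.log(len(corpus) / df)

corpus = [["budget", "meeting"], ["meeting", "today"], ["lunch", "today"]]
features = document_features("When can you send the report?", False)
rare_score = tf_idf("budget", corpus[0], corpus)
common_score = tf_idf("meeting", corpus[0], corpus)
```

A term that appears in fewer messages ("budget") scores higher than one that appears in many ("meeting"), which is the property that makes TF-IDF useful for picking out distinctive words.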
![Page 6](https://reader035.vdocuments.us/reader035/viewer/2022070605/5a4d1ae97f8b9ab059979e99/html5/thumbnails/6.jpg)
The grand experiment
- Evaluated on 4 user mailboxes
- Users manually tagged messages as either needs reply or does not need reply
  - “It is not surprising that overwhelmed users acknowledge that a message did require their reply even though they failed to do so; classifiers trained on actual user reply behavior are thus very poor.”
- 2,391 total emails, excluding spam
- 80/20 train/test split
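
An 80/20 split can be sketched as follows. The slide does not say whether the split was random or performed per mailbox, so the seeded shuffle and the function name `train_test_split` here are assumptions made for illustration.

```python
import random

def train_test_split(items, train_fraction=0.8, seed=0):
    # Shuffle a copy with a fixed seed so the split is reproducible.
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Stand-in message IDs for the 2,391 emails mentioned on the slide.
messages = list(range(2391))
train, test = train_test_split(messages)
```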
![Page 7](https://reader035.vdocuments.us/reader035/viewer/2022070605/5a4d1ae97f8b9ab059979e99/html5/thumbnails/7.jpg)
The single-user results
![Page 8](https://reader035.vdocuments.us/reader035/viewer/2022070605/5a4d1ae97f8b9ab059979e99/html5/thumbnails/8.jpg)
The cross-user results
Only relational features were effective, so the other features were omitted
![Page 9](https://reader035.vdocuments.us/reader035/viewer/2022070605/5a4d1ae97f8b9ab059979e99/html5/thumbnails/9.jpg)
Attachment prediction
- “See attachment… hey, wait a minute…”
- Possible UI considerations
  - Document sidebar
  - Alert user before sending
  - Indicate which messages need attachments
![Page 10](https://reader035.vdocuments.us/reader035/viewer/2022070605/5a4d1ae97f8b9ab059979e99/html5/thumbnails/10.jpg)
Attachment prediction features
- Relational features
  - Based on user profile
    - # of sent and received messages, # of attachments, email address and domain
  - Conjunctions between volume of messages/attachments and TO/CC fields
  - 72 in Dredze et al.’s experiment
- Document features
  - Presence and placement of “attach”
  - Presence of attachments
  - 39,308 in Dredze et al.’s experiment
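
The "presence and placement" feature could look something like the sketch below. How the paper encodes placement is not stated on the slide, so splitting the message body into halves, along with the function and feature names, is purely an assumption.

```python
def attach_features(body):
    # Locate the first occurrence of the stem "attach" (matches
    # "attached", "attachment", "attaching", and so on).
    text = body.lower()
    pos = text.find("attach")
    present = pos != -1
    # Relative position in [0, 1); only meaningful when the stem is present.
    rel = pos / len(text) if present else 0.0
    return {
        "attach_present": int(present),
        "attach_in_first_half": int(present and rel < 0.5),
        "attach_in_second_half": int(present and rel >= 0.5),
    }

early = attach_features("Attached is the report for review.")
late = attach_features("Here is the long summary you asked for, see attachment")
none = attach_features("No files here")
```

Keeping every indicator binary matches the binary feature vectors described earlier in the slides.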
![Page 11](https://reader035.vdocuments.us/reader035/viewer/2022070605/5a4d1ae97f8b9ab059979e99/html5/thumbnails/11.jpg)
The grander experiment
- Evaluated on publicly available Enron email corpus
  - 150 users and 250,000 emails
  - Lots of cleanup needed
- Users manually tagged messages as needs attachment
  - Only popular document formats
  - Forwarded messages excluded
- Subset of 15,000 messages from 144 users
  - 1,020 with attachments
- 10-fold cross validation
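
For reference, 10-fold cross validation partitions the data into ten folds and uses each fold once as the test set while training on the other nine. The sketch below assumes a seeded random partition; the slide does not say how the folds were formed, and `k_fold_splits` is a hypothetical name.

```python
import random

def k_fold_splits(n_items, k=10, seed=0):
    # Shuffle the indices once, then deal them into k folds round-robin.
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx

# 15,000 messages, as in the subset described on the slide.
splits = list(k_fold_splits(15000))
```

Each message appears in exactly one test fold, so every message contributes to the evaluation exactly once.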
![Page 12](https://reader035.vdocuments.us/reader035/viewer/2022070605/5a4d1ae97f8b9ab059979e99/html5/thumbnails/12.jpg)
The results
![Page 13](https://reader035.vdocuments.us/reader035/viewer/2022070605/5a4d1ae97f8b9ab059979e99/html5/thumbnails/13.jpg)
GUEPs and CDs
- GUEPs
  - Mental model
  - Improvement
  - Consistency
- CDs
  - Premature commitment
  - Hidden dependencies
  - Abstraction
  - Consistency
  - Provisionality