![Page 1: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/1.jpg)
DeepDive Introduction
Dongfang Xu
Ph.D. student, School of Information, University of Arizona
Sept 10, 2015
![Page 2: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/2.jpg)
Agenda
Brief Introduction
KBC model
Workflow
Reference
Ce Zhang. DeepDive: A Data Management System for Automatic Knowledge Base Construction. Ph.D. Dissertation, University of Wisconsin-Madison, 2015.
![Page 3: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/3.jpg)
Brief Introduction
What is DeepDive?
• DeepDive is a new type of data management system that enables one to tackle extraction, integration, and prediction problems in a single system.
• It is built by generalizing from experience in building more than ten high-quality Knowledge Base Construction (KBC) systems. (Flexible framework)
What is KBC?
• Knowledge Base Construction (KBC) is the process of populating a knowledge base (KB).
![Page 4: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/4.jpg)
Brief Introduction
Why DeepDive? Or why KBC?
• Its potential to answer key scientific questions.
---- Collect facts, contribute to scientific discoveries.
• A typical knowledge base requires a large amount of resources.
• KBC is a common problem across scientific areas.
![Page 5: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/5.jpg)
Brief Introduction
Why DeepDive? Or why KBC?
• Its potential to answer key scientific questions.
• A typical knowledge base requires a large amount of resources.
• Its good performance.
---- The developer thinks about features (extraction rules), not algorithms.
---- It handles large amounts of data from a variety of sources.
---- It achieves high quality in extracting complex knowledge and building entity relations.
---- It gives calibrated probabilities for each assertion it makes.
---- Domain knowledge + the DeepDive framework = a KBC system for a specific domain.
![Page 6: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/6.jpg)
Brief Introduction
General description of DeepDive
---- An application of relational databases: all data in DeepDive is stored in a relational database.
---- The main task is to identify the entities and the relations between them.
---- A selection of target facts is typically defined for an information extraction (IE) task.
---- Multiple non-content cues such as layout information may be used to assist extraction, e.g. section headers or their layout in tabular data.
---- All kinds of information about entities and relations are extracted, so the data volume is high.
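To make the "everything lives in a relational database" idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 purely for illustration; the table names, column names, and the example sentence are hypothetical and are not taken from DeepDive itself.

```python
import sqlite3

# An in-memory SQLite database stands in for the relational store.
# This is an assumption for illustration; all table and column names
# below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sentences (sent_id INTEGER PRIMARY KEY, text TEXT);
CREATE TABLE person_mention (
    mention_id INTEGER PRIMARY KEY,
    sent_id    INTEGER REFERENCES sentences(sent_id),
    text       TEXT
);
""")

conn.execute(
    "INSERT INTO sentences VALUES (?, ?)",
    (1, "Barack Obama and his wife M. Obama attended the dinner."),
)
conn.executemany(
    "INSERT INTO person_mention VALUES (?, ?, ?)",
    [(1, 1, "Barack Obama"), (2, 1, "M. Obama")],
)

# Because mentions are rows, pairing up mentions that share a sentence
# is an ordinary self-join.
pairs = conn.execute("""
    SELECT a.text, b.text
    FROM person_mention a
    JOIN person_mention b
      ON a.sent_id = b.sent_id AND a.mention_id < b.mention_id
    ORDER BY a.mention_id, b.mention_id
""").fetchall()

print(pairs)
```

Keeping intermediate results as tables means each later step can be written as a query over the output of the previous one.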
![Page 7: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/7.jpg)
Agenda
Brief Introduction
KBC model
Workflow
![Page 8: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/8.jpg)
KBC model
Entity: An entity is a real-world person, place, or thing.
---- For example, the entity “Michelle Obama 1” represents the actual person whose name is “Michelle Obama”.
Relation: A relation associates two (or more) entities.
---- For example, the entities “Barack Obama 1” and “Michelle Obama 1” participate in the HasSpouse relation, which indicates that they are married.
Mention: A mention is a span of text in an input file that refers to an entity or relationship.
---- For example, “Michelle” may be a mention of the entity “Michelle Obama 1”.
Relation Mention: A relation mention is a phrase that connects two mentions that participate in a relation.
---- For example, the phrase “and his wife” connects the mentions “Barack Obama” and “M. Obama”.
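The four concepts above can be sketched as plain data types. This is only an illustrative model: the class names, field names, document id, and example sentence are assumptions, not DeepDive's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    entity_id: str  # e.g. "Michelle Obama 1"
    name: str       # e.g. "Michelle Obama"


@dataclass(frozen=True)
class Relation:
    name: str        # e.g. "HasSpouse"
    entities: tuple  # the Entity objects taking part in the relation


@dataclass(frozen=True)
class Mention:
    doc_id: str
    start: int       # character offset where the span begins
    end: int         # character offset just past the span
    text: str
    entity_id: str   # the entity this span refers to


@dataclass(frozen=True)
class RelationMention:
    phrase: str          # the connecting text, e.g. "and his wife"
    mentions: tuple      # the Mention objects the phrase connects
    relation_name: str


def make_mention(doc_id, sentence, text, entity_id):
    """Locate `text` in `sentence` and wrap the span as a Mention."""
    start = sentence.find(text)
    if start < 0:
        raise ValueError(f"{text!r} not found in sentence")
    return Mention(doc_id, start, start + len(text), text, entity_id)


# Hypothetical input sentence and document id.
sentence = "Barack Obama and his wife M. Obama attended the dinner."

barack = Entity("Barack Obama 1", "Barack Obama")
michelle = Entity("Michelle Obama 1", "Michelle Obama")
has_spouse = Relation("HasSpouse", (barack, michelle))

m1 = make_mention("doc1", sentence, "Barack Obama", barack.entity_id)
m2 = make_mention("doc1", sentence, "M. Obama", michelle.entity_id)
rel_mention = RelationMention("and his wife", (m1, m2), has_spouse.name)
```

Note that the mention text ("M. Obama") differs from the entity name ("Michelle Obama"); linking one to the other is part of what a KBC system has to work out.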
![Page 9: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/9.jpg)
KBC model
![Page 10: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/10.jpg)
Agenda
Brief Introduction
KBC model
Workflow
![Page 11: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/11.jpg)
Work Flow
![Page 12: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/12.jpg)
Work Flow
Input file
![Page 13: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/13.jpg)
Work Flow
Input file
User Schema
![Page 14: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/14.jpg)
Work Flow
Candidate Generation Feature Extraction
![Page 15: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/15.jpg)
Work Flow
Candidate Generation & Feature Extraction
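A rough sketch of what this step might look like for a spouse relation is shown below. It assumes person mentions have already been located as token spans; the function names, the feature names, and the example sentence are hypothetical and only illustrate the idea of pairing mentions and describing each pair with simple textual features.

```python
from itertools import combinations


def generate_candidates(person_spans):
    """Pair up every two person mentions found in one sentence."""
    return list(combinations(person_spans, 2))


def extract_features(tokens, span_a, span_b):
    """Describe a candidate pair by the words between its two mentions.

    Spans are (start, end) token indices with an exclusive end.
    """
    left, right = sorted([span_a, span_b])
    between = tokens[left[1]:right[0]]
    return [
        "WORDS_BETWEEN=" + "_".join(between),
        "NUM_WORDS_BETWEEN=%d" % len(between),
    ]


# Hypothetical sentence; the person spans are assumed to come from an
# upstream tagger that is not shown here.
tokens = "Barack Obama and his wife M. Obama attended the dinner".split()
person_spans = [(0, 2), (5, 7)]

candidates = generate_candidates(person_spans)
features = {pair: extract_features(tokens, *pair) for pair in candidates}

for pair, feats in features.items():
    print(pair, feats)
```

Candidate generation is deliberately permissive: every pair is kept, and it is left to the later learning step to decide which pairs actually express the relation.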
![Page 16: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/16.jpg)
Work Flow
Supervision
(1) hand-labeling, and (2) distant supervision
![Page 17: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/17.jpg)
Work Flow
Supervision
(1) hand-labeling, and (2) distant supervision
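Distant supervision can be sketched as follows: instead of labeling candidates by hand, candidates are labeled by looking them up in facts that are already known. The function name and the small fact sets below are hypothetical, and the sketch assumes each candidate has already been resolved to a pair of entity names.

```python
def distant_label(candidate, known_spouses, known_non_spouses):
    """Label a candidate pair from existing facts.

    Returns True or False when the pair appears in a known set, and None
    when nothing is known, so the pair stays unlabeled for inference.
    """
    pair = frozenset(candidate)
    if pair in known_spouses:
        return True
    if pair in known_non_spouses:
        return False
    return None


# Illustrative fact sets (assumed to come from an existing knowledge base).
known_spouses = {frozenset({"Barack Obama", "Michelle Obama"})}
known_non_spouses = {frozenset({"Malia Obama", "Sasha Obama"})}  # siblings

candidates = [
    ("Michelle Obama", "Barack Obama"),
    ("Malia Obama", "Sasha Obama"),
    ("Barack Obama", "Joe Biden"),
]

labels = [distant_label(c, known_spouses, known_non_spouses) for c in candidates]
print(labels)
```

Using a frozenset makes the lookup independent of the order in which the two names appear in the text. Labels obtained this way are noisy, which is one reason the system reports probabilities instead of hard decisions.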
![Page 18: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/18.jpg)
Work Flow
Learning and Inference
In the learning and inference phase, DeepDive generates a factor graph.
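To give a feel for what a factor graph computes, here is a toy sketch with two Boolean variables and three weighted factors. The marginal probability of each variable is obtained by exact enumeration, which only works for tiny graphs; the weights, factor functions, and the `marginals` helper are all assumptions for illustration and do not reflect how DeepDive performs learning or inference internally.

```python
import math
from itertools import product


def marginals(num_vars, factors):
    """Exact marginals P(variable is True) for a small Boolean factor graph.

    `factors` is a list of (weight, variable_indices, function) triples,
    where each function maps Boolean values to 0 or 1. An assignment has
    unnormalized probability exp(sum of weight * function value).
    """
    totals = [0.0] * num_vars
    z = 0.0  # normalizing constant
    for assignment in product([False, True], repeat=num_vars):
        energy = sum(
            w * fn(*(assignment[i] for i in idx)) for w, idx, fn in factors
        )
        score = math.exp(energy)
        z += score
        for i, value in enumerate(assignment):
            if value:
                totals[i] += score
    return [t / z for t in totals]


def is_true(x):
    return 1 if x else 0


def agree(x, y):
    return 1 if x == y else 0


# Hypothetical graph: variable 0 and variable 1 are two candidate facts.
factors = [
    (2.0, (0,), is_true),   # evidence in favor of candidate 0
    (-1.0, (1,), is_true),  # evidence against candidate 1
    (0.5, (0, 1), agree),   # a weak preference that the two agree
]

probs = marginals(2, factors)
print(probs)
```

With a single variable and a single factor of weight w, the marginal reduces to the logistic function 1 / (1 + exp(-w)), which is a handy sanity check for the enumeration.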
![Page 19: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/19.jpg)
Work Flow
Learning and Inference
![Page 20: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/20.jpg)
DeepDive
Resource
http://deepdive.stanford.edu
https://www.youtube.com/watch?v=SfkLvExfl-s
http://pages.cs.wisc.edu/~czhang/zhang.thesis.pdf
![Page 21: DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015](https://reader036.vdocuments.us/reader036/viewer/2022062807/5697c0021a28abf838cc2bbd/html5/thumbnails/21.jpg)
Thank you!
Q&A