Getting Started With Condor


Contents

Getting Started
Collecting Web Content
OneDegreeCollector
Building your own Startlists
Collecting your E-Mail
Collecting Facebook Data
Collecting Wikipedia Data
Collecting CoolPeople
Coolhunting Blueprint


Getting Data into Condor

Data sources (with collector) and the views Condor builds from them:
IMAP E-Mail, Eudora mailboxes (MailCollector): Communication View (social net), Term view (semantic net)
Web/Blog/News/Scholar (WebCollector): Communication View (link net), Term view (semantic net)
Wikipedia (WikiFactFetcher): Communication View (semantic net)
Snippets (OneDegreeCollector): Term view (semantic net)
Twitter (TwitterCollector): Communication View (social net), Term view (semantic net)
FlatFiles (FileLoader): Communication View (social net), Term view (semantic net)
PeopleNetworks (CoolPeople): Communication View (social net)
Facebook: Communication View (social net)


Temporal Visualization by a Sliding Time Frame

[Figure: frames 1 to 5 placed at positions n to n+4 along the time axis]


With and without history
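The sliding time frame can be illustrated with a short sketch. It is not Condor's implementation: the event format, the function name `sliding_frames`, and the reading of "with history" as "keep everything from the first day up to the end of the current frame" are all assumptions made for illustration.

```python
from datetime import date, timedelta


def sliding_frames(events, frame_days, step_days, with_history=False):
    """Yield (frame_start, frame_end, edges) for each position of the frame.

    events: list of (day, sender, receiver) tuples (hypothetical format).
    with_history=False keeps only the events inside the current frame.
    with_history=True keeps every event from the first day up to the frame end
    (an assumed reading of "with history").
    """
    if not events:
        return
    first = min(day for day, _, _ in events)
    last = max(day for day, _, _ in events)
    start = first
    while start <= last:
        end = start + timedelta(days=frame_days)
        lower = first if with_history else start
        edges = [(s, r) for day, s, r in events if lower <= day < end]
        yield start, end, edges
        start += timedelta(days=step_days)


# Made-up sample data, for illustration only.
events = [
    (date(2011, 9, 1), "ann", "bob"),
    (date(2011, 9, 2), "bob", "cat"),
    (date(2011, 9, 4), "cat", "ann"),
]

for start, end, edges in sliding_frames(events, frame_days=2, step_days=1):
    print(start, end, edges)
```

Calling the same function with `with_history=True` makes each frame accumulate the earlier edges instead of dropping them as the frame moves on.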


Preparation

Install MySQL
Install Java (only Windows)
Install Java 3D (only Windows)
Start Java (if it does not run yet)
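A quick way to check these prerequisites is sketched below. It assumes a MySQL server on the local machine listening on port 3306 (MySQL's default port); the helper names `java_on_path` and `port_open` are made up for this example and are not part of Condor.

```python
import shutil
import socket


def java_on_path():
    """Return the path of the java executable, or None if it is not on PATH."""
    return shutil.which("java")


def port_open(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("java:", java_on_path() or "not found")
    # Assumption: a local MySQL server on its default port 3306.
    print("MySQL reachable:", port_open("127.0.0.1", 3306))
```

An open port only shows that some server is listening there; it does not verify the MySQL username or password.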


Collect Web Content


Communication View


Term view index


Term view index - 2


One-Degree-Collector

Complementary to the Blog Collector
Fetches only one degree
Retrieved websites are not aggregated
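As a rough illustration of fetching "only one degree", the sketch below extracts the links found on a single start page and stops there, without following them any further. Reading "one degree" as "one hop from the start page" is an assumption, and the class and function names are hypothetical; this is not how the OneDegreeCollector is implemented.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect the absolute targets of all <a href="..."> tags on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def one_degree_edges(start_url, html):
    """Return (start_url, target) pairs for each distinct link on the page."""
    parser = LinkParser(start_url)
    parser.feed(html)
    targets = []
    for link in parser.links:
        if link != start_url and link not in targets:
            targets.append(link)
    return [(start_url, link) for link in targets]


# Made-up page content; a real run would download the start page first.
page = (
    '<a href="/about">About</a> '
    '<a href="http://example.org/x">X</a> '
    '<a href="/about">Again</a>'
)
print(one_degree_edges("http://example.com/", page))
```

Each linked page stays a separate node in the result, which matches the note that retrieved websites are not aggregated.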


One-Degree-Collector - UI


GUI resembles Blog Collector


One-Degree-Collector - result

typical result of one-degree search


Creating Term View Without OneDegreeCollector Start List: Create Stop List First


… then use this stop list for the term view


Creating Term View With Start AND Stop List
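The effect of combining the two lists can be sketched as a simple term filter. The semantics assumed here (a stop list removes terms, a start list restricts the view to the listed terms, and the stop list wins when a term is on both) are an illustration only and may differ from what Condor does; `term_counts` is a hypothetical helper.

```python
import re
from collections import Counter


def term_counts(texts, start_list=None, stop_list=None):
    """Count terms, dropping stop-list terms and, if a start list is given,
    keeping only the terms that appear on it."""
    start = {t.lower() for t in start_list} if start_list else None
    stop = {t.lower() for t in stop_list} if stop_list else set()
    counts = Counter()
    for text in texts:
        for term in re.findall(r"[a-z0-9']+", text.lower()):
            if term in stop:
                continue
            if start is not None and term not in start:
                continue
            counts[term] += 1
    return counts


# Made-up sample documents.
docs = ["The condor collects the web", "Condor and the mail collector"]

# Stop list only: everything except the stop words is counted.
print(term_counts(docs, stop_list=["the", "and"]))

# Start AND stop list: only start-list terms that are not stop words remain.
print(term_counts(docs, start_list=["condor", "mail", "the"], stop_list=["the"]))
```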


Collect E-Mail

java -Xmx2048M -jar condor-2.1.jar

Condor Key

MySQL password (default: no password)


Tools to collect data


MailCollector

Left side: enter the specification of the mailbox.
Right side: database-related data, e.g. username: root, no password.
For username, host, port, and SSL, check with your email provider (for gmail, see next slide).
Anonymize will replace email addresses with random identifiers.
Content: "yes" will download the whole emails; without content, only the sender, recipients, and the subject line are downloaded.
Delete the present data in the database?
Here you can choose specific folders to download.
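What the Anonymize option does can be pictured with the following sketch, which maps every email address to a random identifier and reuses that identifier whenever the address appears again. The class name, the `id-` prefix, and the case-insensitive matching are assumptions for illustration; the slides do not describe how MailCollector generates its identifiers.

```python
import secrets


class Anonymizer:
    """Replace email addresses with random but consistent identifiers."""

    def __init__(self):
        self._ids = {}

    def anonymize(self, address):
        # Assumption: addresses are compared case-insensitively.
        key = address.strip().lower()
        if key not in self._ids:
            self._ids[key] = "id-" + secrets.token_hex(8)
        return self._ids[key]


anon = Anonymizer()
# Made-up (sender, recipient) pairs.
messages = [
    ("Ann@example.com", "bob@example.com"),
    ("bob@example.com", "ann@example.com"),
]
print([(anon.anonymize(s), anon.anonymize(r)) for s, r in messages])
```

Because the mapping is consistent, the communication network keeps its structure even though the real addresses are gone.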


Settings for gmail

Your gmail address

Your gmail password

imap.gmail.com

Don’t forget the access information for your MySQL database on the right, then press Start. It might take a while (especially with huge mailboxes) before you see a progress bar.
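For readers who want to see what such an IMAP download looks like in code, here is a minimal sketch using Python's standard `imaplib`. It is not the MailCollector itself. The host comes from the slide; port 993 (the standard IMAP-over-SSL port), the `INBOX` folder, and the function names are assumptions. It fetches headers only, similar to collecting without content.

```python
import imaplib
from email import message_from_bytes
from email.utils import getaddresses


def header_summary(raw_message):
    """Extract sender, recipients, and subject from raw message bytes."""
    msg = message_from_bytes(raw_message)
    senders = getaddresses(msg.get_all("From", []))
    recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    return {
        "sender": senders[0][1] if senders else None,
        "recipients": [addr for _, addr in recipients],
        "subject": msg.get("Subject", ""),
    }


def fetch_headers(user, password, host="imap.gmail.com", port=993, folder="INBOX"):
    """Yield a header summary for every message in one folder.

    Assumption: port 993 with SSL; check with your email provider.
    """
    with imaplib.IMAP4_SSL(host, port) as conn:
        conn.login(user, password)
        conn.select(folder, readonly=True)
        _, data = conn.search(None, "ALL")
        for num in data[0].split():
            # BODY.PEEK avoids marking the message as read.
            _, parts = conn.fetch(num, "(BODY.PEEK[HEADER])")
            yield header_summary(parts[0][1])


# Offline example with a made-up message; no connection is opened here.
raw = (
    b"From: Ann <ann@example.com>\r\n"
    b"To: bob@example.com, Cat <cat@example.com>\r\n"
    b"Subject: Hello\r\n\r\nbody"
)
print(header_summary(raw))
```

`fetch_headers` needs real credentials and network access, so only the offline `header_summary` call is run in this example.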


Visualize Mail-Data


Visualize E-Mail Data (3)


Dynamic View of Communication


Visualize E-Mail Contents


Visualize E-Mail Contents (2)


Dynamic View of Terms


MIT OpenCourseWare
http://ocw.mit.edu

15.599 Workshop in IT: Collaborative Innovation Networks
Fall 2011

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.