introduction to information retrieval - kangwoncs.kangwon.ac.kr/~leeck/ir/postinglist.pdf ·...

Introduction to Information Retrieval PostingList Park Cheon Eum

Post on 20-Mar-2020

awk - array

Ch. 1

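Algorithm

1. Start
2. Input: doc1, … , 10
3. split(doc1,…,10): split each document into tokens
4. doc1,…,10 < id: attach the document ID to each token
5. append(docs, doc1,…,10): collect everything into docs
6. sort, uniq
7. posting
8. Posting result
9. End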

Algorithm - Indexer steps: Token sequence

Split each document's content into tokens and assign the document ID to each token.

Doc 1: I did enact Julius Caesar I was killed i' the Capitol; Brutus killed me.

Doc 2: So let it be with Caesar. The noble Brutus hath told you Caesar was ambitious
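
The token-sequence step can be sketched in Python roughly as follows. This is an illustration only, not the awk code from the slides. The `tokenize` and `token_sequence` helpers are hypothetical names, and the tokenization rule (lowercase, alphabetic runs only, so "i'" becomes "i") is an assumption.

```python
import re

# The two sample documents from the slide, keyed by document ID.
docs = {
    1: "I did enact Julius Caesar I was killed i' the Capitol; Brutus killed me.",
    2: "So let it be with Caesar. The noble Brutus hath told you Caesar was ambitious",
}

def tokenize(text):
    """Lowercase the text and return its alphabetic runs (assumed, simplistic rule)."""
    return re.findall(r"[a-z]+", text.lower())

def token_sequence(docs):
    """Return (term, doc_id) pairs in the order the tokens appear."""
    return [(term, doc_id) for doc_id, text in docs.items() for term in tokenize(text)]

pairs = token_sequence(docs)
```

Each pair carries the ID of the document the token came from, which is what the later sort and posting steps operate on.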

Algorithm - Indexer steps: Sort

Sort by term, then by document ID.
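
A minimal Python sketch of this sort, assuming the token sequence is held as a list of (term, doc_id) tuples. The sample pairs below are a hand-picked subset for illustration.

```python
# A handful of unsorted (term, doc_id) pairs.
pairs = [
    ("i", 1), ("did", 1), ("enact", 1), ("julius", 1), ("caesar", 1),
    ("so", 2), ("let", 2), ("caesar", 2), ("brutus", 2), ("caesar", 2),
    ("brutus", 1),
]

# Tuples compare element by element, so a plain sort orders by term first
# and breaks ties by document ID.
sorted_pairs = sorted(pairs)
```

After sorting, all entries for one term sit next to each other, which makes the duplicate handling in the next step a single pass.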

Algorithm - Indexer steps: Dictionary & Postings

Same term and same ID: keep only one entry (= frequency).

Same term and different ID: add the ID to the term's posting list.

Sec. 1.2
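
The two rules above could be sketched in Python as below, assuming the input is already sorted by term and then by ID. The `build_postings` name is hypothetical, and reading "frequency" as the per-document count of a term is an assumption. The number of documents containing a term is then the length of its posting list.

```python
from itertools import groupby

# Sorted (term, doc_id) pairs; a small hand-picked sample.
sorted_pairs = [
    ("brutus", 1), ("brutus", 2),
    ("caesar", 1), ("caesar", 2), ("caesar", 2),
    ("killed", 1), ("killed", 1),
]

def build_postings(sorted_pairs):
    """Map each term to a list of (doc_id, count) entries.

    The input must already be sorted by term and then by doc_id.
    """
    index = {}
    for (term, doc_id), group in groupby(sorted_pairs):
        # Same term and same ID: collapse to one entry, keeping the count.
        count = sum(1 for _ in group)
        # Same term, different ID: extend that term's posting list.
        index.setdefault(term, []).append((doc_id, count))
    return index

index = build_postings(sorted_pairs)
```

Here `index["caesar"]` holds one entry per document, with the second entry recording that the term occurs twice in Doc 2.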

Processing

doc1, … , 10

split(doc1,…,10)

doc1,…,10 < id

append(docs, doc1,…,10)

sort

frequency

posting: the easy way

posting: using an array
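
The array-based variant is not shown in the transcript. One plausible reading is that an associative array, keyed by term, collects the document IDs directly. The Python sketch below uses a dict in that role. The `build_index` name and the normalization rule (strip surrounding punctuation, lowercase) are assumptions for illustration.

```python
# The two sample documents, keyed by document ID.
docs = {
    1: "I did enact Julius Caesar I was killed i' the Capitol; Brutus killed me.",
    2: "So let it be with Caesar. The noble Brutus hath told you Caesar was ambitious",
}

def build_index(docs):
    """Build term -> sorted list of doc IDs, using a dict as the associative array."""
    postings = {}
    for doc_id, text in docs.items():
        for raw in text.split():
            term = raw.strip(".,;:'").lower()  # assumed, simplistic normalization
            if term:
                # A set drops repeated (term, doc_id) occurrences automatically.
                postings.setdefault(term, set()).add(doc_id)
    # Order the dictionary by term and each posting list by document ID.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}

index = build_index(docs)
```

With this approach the explicit sort and uniq pass over the full pair list is replaced by one sort over the dictionary keys at the end.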
