leeck/ir/postinglist.pdf · 2017-03-02
Introduction to Information Retrieval
PostingList
Park Cheon Eum
awk - array
Ch. 1
Algorithm
start
doc1, … , 10
split(doc1,…,10)
doc1,…,10 < id
append(docs, doc1,…,10)
sort, uniq
posting
Posting result
End
Algorithm - Indexer steps: Token sequence
Split each document's content into tokens and assign the document ID to every token.
Doc 1: I did enact Julius Caesar I was killed i' the Capitol; Brutus killed me.
Doc 2: So let it be with Caesar. The noble Brutus hath told you Caesar was ambitious
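The token-sequence step can be sketched as follows. The slides work in awk, so this Python version is a substitute for illustration only. The names `DOCS`, `tokenize`, and `token_sequence` are hypothetical, and the tokenization rule (lowercase, then runs of letters and apostrophes) is an assumption that the slides do not specify.

```python
import re

# The two example documents from the slide, keyed by document ID.
DOCS = {
    1: "I did enact Julius Caesar I was killed i' the Capitol; Brutus killed me.",
    2: "So let it be with Caesar. The noble Brutus hath told you Caesar was ambitious",
}


def tokenize(text):
    """Lowercase the text and return runs of letters/apostrophes as tokens.

    Assumption: the slides do not state a tokenization rule; this is one simple choice.
    """
    return re.findall(r"[a-z']+", text.lower())


def token_sequence(docs):
    """Return a list of (term, doc_id) pairs in document order."""
    pairs = []
    for doc_id, text in docs.items():
        for term in tokenize(text):
            pairs.append((term, doc_id))
    return pairs


pairs = token_sequence(DOCS)
print(pairs[:5])
```

Each token keeps the ID of the document it came from, so the later sort and posting steps can work on the pairs alone.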
Algorithm - Indexer steps: Sort
Sort the pairs by term, then by document ID.
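A minimal sketch of the sort step, written in Python rather than the awk used in the slides. The sample `pairs` list is made-up illustration data, not the full token sequence.

```python
# Hypothetical sample of (term, doc_id) pairs in document order.
pairs = [
    ("i", 1), ("did", 1), ("caesar", 1), ("brutus", 1),
    ("caesar", 2), ("brutus", 2), ("caesar", 2),
]

# Primary key: the term; secondary key: the document ID.
sorted_pairs = sorted(pairs, key=lambda pair: (pair[0], pair[1]))

for term, doc_id in sorted_pairs:
    print(term, doc_id)
```

After sorting, all occurrences of a term are adjacent and their document IDs are in ascending order, which is what the posting step relies on.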
Algorithm - Indexer steps: Dictionary & Postings
Same term && same ID: keep only one entry (= frequency).
Same term && different ID: add the ID to the term's posting list.
Sec. 1.2
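The two rules above can be sketched in Python as below (the slides themselves use awk). This assumes the input is already sorted by term and then by document ID, and it reads "frequency" as the number of times a term occurs in one document. `build_postings` and the sample data are hypothetical.

```python
from itertools import groupby

# Hypothetical sample: (term, doc_id) pairs already sorted by term, then ID.
sorted_pairs = [
    ("brutus", 1), ("brutus", 2),
    ("caesar", 1), ("caesar", 2), ("caesar", 2),
    ("killed", 1), ("killed", 1),
]


def build_postings(sorted_pairs):
    """Collapse duplicate pairs and group document IDs per term.

    Returns {term: [(doc_id, frequency), ...]}.
    """
    index = {}
    # groupby merges runs of identical (term, doc_id) pairs:
    # same term && same ID -> one entry, with the run length as frequency.
    for (term, doc_id), run in groupby(sorted_pairs):
        frequency = sum(1 for _ in run)
        # same term && different ID -> another entry in the term's posting list.
        index.setdefault(term, []).append((doc_id, frequency))
    return index


index = build_postings(sorted_pairs)
for term, postings in index.items():
    print(term, postings)
```

Because `groupby` only merges adjacent items, this approach depends on the preceding sort step.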
Algorithm
Processing
doc1, … , 10
split(doc1,…,10)
doc1,…,10 < id
Processing
append(docs, doc1,…,10)
Processing
sort
Processing
frequency
Processing
The easy way
posting
Processing
Posting using an array
Processing
Using an array
posting
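One possible reading of the array-based approach, sketched in Python: a dict plays the role of an awk associative array keyed by term, so the pairs do not need to be sorted first. The slide's actual awk code is not preserved in this transcript, so `post_with_array` and the sample data are hypothetical.

```python
# Hypothetical sample of (term, doc_id) pairs in arbitrary order.
pairs = [
    ("caesar", 2), ("brutus", 1), ("caesar", 1),
    ("caesar", 2), ("brutus", 2),
]


def post_with_array(pairs):
    """Build {term: [doc_id, ...]} using a dict as an associative array."""
    postings = {}
    for term, doc_id in pairs:
        doc_ids = postings.setdefault(term, [])
        if doc_id not in doc_ids:  # skip repeated (term, doc_id) pairs
            doc_ids.append(doc_id)
    # Sort terms and document IDs at the end for stable, readable output.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


postings = post_with_array(pairs)
for term, doc_ids in postings.items():
    print(term, doc_ids)
```

The membership test on a list is linear in the posting length, which is fine for ten small documents. A set per term would be the usual choice for larger inputs.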