Let's implement a succinct bit vector (sbv)

Upload: toshiyuki-maezawa

Post on 11-Jul-2015


echizen_tm Dec. 10, 2011

http://d.hatena.ne.jp/echizen_tm/20111210/1323541165

(1 slides) (3 slides / ) (7 slides / ) (1 slides) (25 slides / ) (1 slides) (1 slides)

ID: echizen_tm
Blog: EchizenBlog-Zwei (http://d.hatena.ne.jp/echizen_tm/)

web ()

(1/3) (Succinct Data Structure) (O(1) to O(log N))

(2/3)

mozc(google) (WEB+DB PRESS Vol.64 ) mozcLOUDS

Sedue(PFI)

(3/3)

LOUDS() LOUDS

LOUDS

(1/7) (Succinct Bit Vector)

rank/select (O(1) to O(log N)). N bits are stored in N + o(N) bits.
LOUDS o(N) N

(2/7) rank(i): the number of 1 bits before position i (positions are counted from 0)

Bit string (00110010), written with position 7 on the left and position 0 on the right:

i        0  1  2  3  4  5  6  7  8
rank(i)  0  0  1  1  1  2  3  3  3

(3/7) rank

char v[] = {0, 'a', 0, 0, 'n', 'x', 0, 0};     8 bytes

char w[] = {'a', 'n', 'x'};
uint8_t b = 0x32;     (0011 0010)     (3 + 1) bytes

get(w, b, 1) = w[rank(b, 1)] = w[0] = a
get(w, b, 4) = w[rank(b, 4)] = w[1] = n
get(w, b, 5) = w[rank(b, 5)] = w[2] = x
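The sparse-array lookup above can be sketched in C++ as follows. This is a minimal illustration, not code from ux, rx, or marisa: the name rank8, the bit-by-bit loop, and the check that returns 0 for an empty slot are assumptions added here. Position 0 is taken to be the least significant bit, which matches b = 0x32.

```cpp
#include <cassert>
#include <cstdint>

// Number of 1 bits in positions [0, i) of b.
// Position 0 is the least significant bit (assumption matching b = 0x32).
unsigned rank8(std::uint8_t b, unsigned i) {
    assert(i <= 8);
    unsigned count = 0;
    for (unsigned pos = 0; pos < i; ++pos) {
        count += (b >> pos) & 1u;
    }
    return count;
}

// Element i of the sparse array: w holds only the non-empty elements,
// b has a 1 bit at every non-empty position.
// Returning 0 for an empty slot is an addition; the slide only shows set positions.
char get(const char* w, std::uint8_t b, unsigned i) {
    assert(i < 8);
    if (((b >> i) & 1u) == 0) {
        return 0;
    }
    return w[rank8(b, i)];
}
```

With w = {'a', 'n', 'x'} and b = 0x32, get(w, b, 4) reads w[1], because exactly one 1 bit (position 1) precedes position 4.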

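(4/7) select(i): the position of the i-th 1 bit (i is counted from 0)

Bit string (00110010), written with position 7 on the left and position 0 on the right:

select(0) = 1, select(1) = 4, select(2) = 5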

(5/7) select

char s[] = "appleorangeremon";
char p[] = {0, 5, 11};     (16 + 3) bytes

char s[] = "appleorangeremon";
uint16_t b = 0x0821;     (0000 1000 0010 0001)     (16 + 2) bytes

get(s, b, 0) = select(b, 0) = 0
get(s, b, 1) = select(b, 1) = 5
get(s, b, 2) = select(b, 2) = 11
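As a rough sketch of the example above, select can replace the offset array p[] when cutting words out of the concatenated string. The names select16 and word are hypothetical, the linear scan is only for illustration, and returning 16 when no such bit exists is an assumption made here so that the last word has an end position.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Position of the i-th 1 bit (0-based) of b, position 0 = least significant bit.
// Returns 16 when b has fewer than i + 1 set bits (assumption for this sketch).
unsigned select16(std::uint16_t b, unsigned i) {
    for (unsigned pos = 0; pos < 16; ++pos) {
        if ((b >> pos) & 1u) {
            if (i == 0) {
                return pos;
            }
            --i;
        }
    }
    return 16;
}

// The i-th word of s, where b marks the first character of every word.
std::string word(const std::string& s, std::uint16_t b, unsigned i) {
    unsigned begin = select16(b, i);
    unsigned end = select16(b, i + 1);  // 16 when i is the last word
    assert(begin <= s.size());
    if (end > s.size()) {
        end = static_cast<unsigned>(s.size());
    }
    return s.substr(begin, end - begin);
}
```

For s = "appleorangeremon" and b = 0x0821, word(s, b, 1) spans positions select(b, 1) = 5 up to select(b, 2) = 11.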

(6/7) Relation between rank and select

For the bit string (00110010):

rank(select(0)) = rank(1) = 0
rank(select(1)) = rank(4) = 1
rank(select(2)) = rank(5) = 2

In general, rank(select(i)) = i.
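The relation can be checked mechanically. The following sketch uses naive loop-based versions of rank and select on a single byte; rank8, select8, and rank_inverts_select are names chosen here and are not taken from any of the libraries.

```cpp
#include <cassert>
#include <cstdint>

// Number of 1 bits in positions [0, i), position 0 = least significant bit.
unsigned rank8(std::uint8_t b, unsigned i) {
    assert(i <= 8);
    unsigned count = 0;
    for (unsigned pos = 0; pos < i; ++pos) {
        count += (b >> pos) & 1u;
    }
    return count;
}

// Position of the i-th 1 bit (0-based), or 8 if b has no such bit.
unsigned select8(std::uint8_t b, unsigned i) {
    for (unsigned pos = 0; pos < 8; ++pos) {
        if ((b >> pos) & 1u) {
            if (i == 0) {
                return pos;
            }
            --i;
        }
    }
    return 8;
}

// True when rank(select(i)) == i holds for every 1 bit of b.
bool rank_inverts_select(std::uint8_t b) {
    unsigned ones = rank8(b, 8);
    for (unsigned i = 0; i < ones; ++i) {
        if (rank8(b, select8(b, i)) != i) {
            return false;
        }
    }
    return true;
}
```

The opposite direction, select(rank(i)) = i, only holds when position i itself holds a 1 bit, which is why the check iterates over the 1 bits rather than over all positions.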

(7/7) (sparse)(dense)

501100011, 01100100

501000010, 11101101(00010010)012 01 5

Linux / 2.27GHz 2core / 24GB, rank/select 1000

(5000bit / 1bit)

name          rank(sec)   select(sec)   size(byte)
ux-trie       18.4        21.1          14,062,520
rx-trie       19.1        20.1          14,597,160
marisa-trie   18.7        18.8          15,234,440

(1000bit / 1bit)

name          rank(sec)   select(sec)   size(byte)
ux-trie       19.0        21.8          14,062,520
rx-trie       18.5        20.4          14,597,160
marisa-trie   18.7        19.7          14,921,944

(1/25) (ux,rx,marisa)

rankselect3

(2/25)

ux, marisa: C++
rx: C

ux, marisa: uint8_t, uint32_t, uint64_t (stdint.h)
rx: char, int
C++ (C/C++) uintXX_t

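(3/25) rank

rank = precomputed rank + popcount

popcount: counts the 1 bits in a uint32_t or uint64_t.
The rank is precomputed once every B bits.

When i is a multiple of B, rank(i) is obtained in O(1).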

(4/25) (B=8)

block      11001101   11000100   10000111   01100011
popcount   5          3          4          4
rank       5          8          12         16

When i is a multiple of B, rank(i) is O(1): rank(0) = 0, rank(8) = 5, rank(16) = 8, ...

/B

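(5/25) When i is not a multiple of B, rank(i) is still O(1)

rank(21)     11001101 11000100 10000111

Bits 0-15: the rank is read from the table in O(1): 8
Bits 16-23: shift by 3 and take popcount: 10000111 -> 00111000: 3

rank(21) = 8 + 3 = 11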
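The table lookup plus popcount on the remaining bits can be sketched as below for B = 8. This is an illustrative layout only: the struct name BlockRank, the loop-based popcount8, and storing one block per byte with position 0 in the least significant bit are assumptions, not the layout of ux, rx, or marisa.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Count the 1 bits of one byte by clearing the lowest set bit each round.
unsigned popcount8(std::uint8_t x) {
    unsigned v = x;
    unsigned c = 0;
    for (; v != 0; v &= v - 1) {
        ++c;
    }
    return c;
}

struct BlockRank {
    std::vector<std::uint8_t> V;  // bit vector, one B = 8 bit block per element
    std::vector<unsigned> R;      // R[k] = number of 1 bits in blocks 0..k

    explicit BlockRank(const std::vector<std::uint8_t>& bits) : V(bits) {
        unsigned total = 0;
        for (std::uint8_t block : V) {
            total += popcount8(block);
            R.push_back(total);
        }
    }

    // Number of 1 bits in positions [0, i).
    unsigned rank(std::size_t i) const {
        assert(i <= V.size() * 8);
        std::size_t k = i / 8;  // block that contains position i
        unsigned r = static_cast<unsigned>(i % 8);
        unsigned before = (k == 0) ? 0 : R[k - 1];  // table lookup
        if (r == 0) {
            return before;
        }
        // Shift out the upper 8 - r bits so only positions below i remain.
        std::uint8_t low = static_cast<std::uint8_t>(V[k] << (8 - r));
        return before + popcount8(low);
    }
};
```

With the blocks 11001101, 11000100, 10000111, 01100011 (0xCD, 0xC4, 0x87, 0x63), rank(21) reads R[1] = 8 and adds popcount(00111000) = 3.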

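(6/25) rank

The bit vector V is split into blocks of B bits, and the rank at each block boundary is stored in R.

rank(i) = R[i / B - 1] + popcount(the lower i % B bits of V[i / B])

>> 1 =   0 1 0 0 0 1 0 0
+        0 1 0 0 0 1 0 0
=        10 00 10 01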

(18/25) 2-bit units

(1)  10 00 10 01 & 0x33 (00110011) = 00 00 00 01
(2)  10 00 10 01 & 0xCC (11001100) = 10 00 10 00

Shift (2) right by 2 and add it to (1):

10 00 10 00 >> 2 = 00 10 00 10
00 00 00 01 + 00 10 00 10 = 0010 0011

(19/25) 4-bit units

(1)  0010 0011 & 0x0F (00001111) = 0000 0011
(2)  0010 0011 & 0xF0 (11110000) = 0010 0000

Shift (2) right by 4 and add it to (1):

0010 0000 >> 4 = 0000 0010
0000 0011 + 0000 0010 = 00000101

The number of 1 bits in (11001101) is 5 (= 00000101).
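The mask, shift, and add steps can be written out for a single byte as in this sketch. The masks 0x33/0xCC and 0x0F/0xF0 are the ones shown above; the first step with 0x55/0xAA is assumed by analogy, and the struct and function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Intermediate results of the parallel-add popcount on one byte.
struct Popcount8Steps {
    std::uint8_t pairs;    // four 2-bit counts
    std::uint8_t nibbles;  // two 4-bit counts
    std::uint8_t total;    // one 8-bit count
};

Popcount8Steps popcount8_steps(std::uint8_t x) {
    Popcount8Steps s;
    // 1-bit units: add each odd bit to its even neighbour (masks assumed by analogy).
    s.pairs = static_cast<std::uint8_t>((x & 0x55) + ((x & 0xAA) >> 1));
    // 2-bit units: add each upper pair to the pair below it.
    s.nibbles = static_cast<std::uint8_t>((s.pairs & 0x33) + ((s.pairs & 0xCC) >> 2));
    // 4-bit units: add the upper nibble to the lower nibble.
    s.total = static_cast<std::uint8_t>((s.nibbles & 0x0F) + ((s.nibbles & 0xF0) >> 4));
    assert(s.total <= 8);  // a byte cannot hold more than 8 one bits
    return s;
}
```

For x = 11001101 the three fields come out as 10 00 10 01, 0010 0011, and 00000101. The same pattern extends to uint32_t and uint64_t with wider masks and additional steps.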

(20/25) popcount rank

popcount

step   width (bits)   bit pattern        counts
(3)    1              1 1 0 0 1 1 0 1    1 1 0 0 1 1 0 1
(2)    2              10 00 10 01        2 0 2 1
(1)    4              0010 0011          2 3
       8              00000101           5

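(21/25) select(1) on 11001101

Look at (1) to decide which group of 4 bits contains the answer.

(1)  0010 0011
     2    3
width: 4

select(1) is the position of the 1 bit with index 1 (counting from 0). By (1), the lower 4 bits contain 3 one bits (indices 0, 1, 2), so select(1) lies in the lower 4 bits (positions 0-3 of positions 0-7).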

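(22/25) Look at (2) to decide between bits 0-1 and bits 2-3.

(2)  10 00 10 01
     2  0  2  1
width: 2

select(1) is the position of the 1 bit with index 1 (counting from 0). By (2), bits 0-1 contain 1 one bit and bits 2-3 contain 2 one bits (index 0 is in bits 0-1; indices 1 and 2 are in bits 2-3), so select(1) lies in bits 2-3.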

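(23/25) Look at (3) to decide between bit 2 and bit 3.

(3)  1 1 0 0 1 1 0 1
width: 1

select(1) is the position of the 1 bit with index 1 (counting from 0). By (2), bits 0-1 hold index 0, so the first 1 bit in bits 2-3 has index 1. By (3), bit 2 is 1, so select(1) = 2.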

(24/25) popcount 2^n

select(1)

step      (1)          (2)            (3)
pattern   0010 0011    10 00 10 01    1 1 0 0 1 1 0 1
counts    2 3          2 0 2 1        1 1 0 0 1 1 0 1
width     4            2              1
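The three narrowing steps can be sketched for one byte as follows. The function name select8 and the 0x55/0xAA first step are assumptions; the code reuses the intermediate popcount values (1) and (2) exactly as the walk-through does, halving the candidate range at each step.

```cpp
#include <cassert>
#include <cstdint>

// Position (0 = least significant bit) of the i-th 1 bit of x, i counted from 0.
// Precondition: x has more than i one bits.
unsigned select8(std::uint8_t x, unsigned i) {
    // Intermediate results of the parallel-add popcount.
    unsigned pairs = (x & 0x55u) + ((x & 0xAAu) >> 1);            // (2): four 2-bit counts
    unsigned nibbles = (pairs & 0x33u) + ((pairs & 0xCCu) >> 2);  // (1): two 4-bit counts
    assert(i < (nibbles & 0x0Fu) + (nibbles >> 4));

    unsigned pos = 0;
    // (1) Lower or upper 4 bits?
    unsigned c = nibbles & 0x0Fu;
    if (i >= c) {
        i -= c;
        pos += 4;
    }
    // (2) Lower or upper 2 bits of that half?
    c = (pairs >> pos) & 0x03u;
    if (i >= c) {
        i -= c;
        pos += 2;
    }
    // (3) Lower or upper bit of that pair?
    c = (x >> pos) & 0x01u;
    if (i >= c) {
        pos += 1;
    }
    return pos;
}
```

For x = 11001101 and i = 1, step (1) stays in the lower 4 bits (count 3), step (2) moves to bits 2-3 (bits 0-1 hold only 1 one bit), and step (3) stops at bit 2.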

(25/25) select B=64popcount B>646464

B=51264 B=512

popcount*8rank popcount (12/1000) (select) 8

marisa-trieselect

B=512 rank

rank: rank (O(1)) + popcount (O(B/64) = O(1))
select: (O(log(N/B)) = O(logN)) + (O(B/64) = O(1)) + popcount (O(log64) = O(1))
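One way to read the cost terms above is sketched below for B = 512 with 64-bit words. This is a sketch under assumptions, not marisa-trie's implementation: the class name BitVector512 and its layout are made up, the three select terms are interpreted as a binary search over the rank table, a scan over the words of one block, and a binary search inside one word, and the last step recomputes popcount on masked halves instead of reusing intermediate results.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Portable popcount for a 64-bit word (clears the lowest set bit each round).
inline unsigned popcount64(std::uint64_t x) {
    unsigned c = 0;
    for (; x != 0; x &= x - 1) ++c;
    return c;
}

class BitVector512 {
public:
    // words[k] holds positions 64k .. 64k+63, position 64k in the least significant bit.
    explicit BitVector512(std::vector<std::uint64_t> words) : words_(std::move(words)) {
        while (words_.size() % kWordsPerBlock != 0) words_.push_back(0);  // pad last block
        R_.push_back(0);
        for (std::size_t w = 0; w < words_.size(); w += kWordsPerBlock) {
            std::uint64_t ones = R_.back();
            for (std::size_t j = 0; j < kWordsPerBlock; ++j) ones += popcount64(words_[w + j]);
            R_.push_back(ones);  // R_[k + 1] = number of 1 bits in blocks 0..k
        }
    }

    // Number of 1 bits in positions [0, i).
    std::uint64_t rank(std::uint64_t i) const {
        assert(i <= words_.size() * 64);
        std::uint64_t r = R_[i / 512];                       // table lookup: O(1)
        std::size_t w = (i / 512) * kWordsPerBlock;
        for (; w < i / 64; ++w) r += popcount64(words_[w]);  // at most 7 words: O(B/64)
        if (i % 64 != 0) r += popcount64(words_[w] << (64 - i % 64));
        return r;
    }

    // Position of the i-th 1 bit (0-based). Precondition: i < total number of 1 bits.
    std::uint64_t select(std::uint64_t i) const {
        assert(i < R_.back());
        // Binary search for the block that contains the target: O(log(N/B)).
        std::size_t k = std::upper_bound(R_.begin(), R_.end(), i) - R_.begin() - 1;
        i -= R_[k];
        // Scan the words of that block: O(B/64).
        std::size_t w = k * kWordsPerBlock;
        for (;;) {
            unsigned c = popcount64(words_[w]);
            if (i < c) break;
            i -= c;
            ++w;
        }
        // Binary search inside the word by halving the width: O(log 64).
        std::uint64_t x = words_[w];
        unsigned pos = 0;
        for (unsigned width = 32; width >= 1; width /= 2) {
            unsigned c = popcount64(x & ((std::uint64_t{1} << width) - 1));
            if (i >= c) { i -= c; x >>= width; pos += width; }
        }
        return w * 64 + pos;
    }

private:
    static constexpr std::size_t kWordsPerBlock = 8;  // B = 512 bits
    std::vector<std::uint64_t> words_;
    std::vector<std::uint64_t> R_;
};
```

The rank table costs one 64-bit counter per 512 bits in this sketch; real implementations typically pack the counters more tightly.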

Sedue (http://preferred.jp/sedue.html)
mozc (http://code.google.com/p/mozc/)
ux-trie (http://code.google.com/p/ux-trie)
marisa-trie (http://code.google.com/p/marisa-trie)

revision
mozc: r73
ux-trie: r42
marisa-trie: r83