Cassandra 멘붕기 (Cassandra Meltdown Diary) | Devon 2012
TRANSCRIPT
![Page 1: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/1.jpg)
Cassandra 멘붕기 (Cassandra Meltdown Diary)
Sharing Cassandra tuning know-how earned through trial and error
Daum Biz Development Team 2, 안세준
![Page 2: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/2.jpg)
Search advertising? Ad search?
![Page 3: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/3.jpg)
![Page 4: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/4.jpg)
![Page 5: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/5.jpg)
![Page 6: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/6.jpg)
Search advertising? Ad search?
CPT, CPM, CPC, ... ?
What does the advertiser pay for?
Time period, impressions, clicks
![Page 7: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/7.jpg)
Search advertising? Ad search?
![Page 8: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/8.jpg)
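How many queries?
Daum integrated search: 50-60 million/day
Total incoming queries, including external media: 120-140 million/day
Cassandra reads (15x the queries): 2 billion/day
Total Cassandra queries: about 2.5 billion/day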
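The figures above can be cross-checked with a little arithmetic. This is only a sketch: the 130 million midpoint is an assumption taken from the 120-140 million range, and the per-second values are daily averages, not peak load.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def per_second(per_day: float) -> float:
    """Convert a daily count into an average per-second rate."""
    return per_day / SECONDS_PER_DAY

incoming_per_day = 130_000_000      # assumed midpoint of 120-140 million
reads_per_query = 15                # from the slide: reads are 15x the queries
reads_per_day = incoming_per_day * reads_per_query   # close to the quoted 2 billion
total_per_day = 2_500_000_000       # from the slide: about 2.5 billion

print(f"reads/day: {reads_per_day:,}")
print(f"average reads/sec: {per_second(reads_per_day):,.0f}")
print(f"average total ops/sec: {per_second(total_per_day):,.0f}")
```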
![Page 9: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/9.jpg)
How many queries?
Oracle cannot absorb that query load.
Of course, using JOIN SQL would reduce the number of queries.
![Page 10: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/10.jpg)
Since Oracle won't do...
Clix (Sales) DB -> search DB
Synchronization? Indexing?
The search DB must be fast and stable.
The information structure is different.
Reads outnumber writes 3-4x.
![Page 11: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/11.jpg)
Search DB
Message queue
Indexer
![Page 12: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/12.jpg)
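The glory days
Easy to implement.
Usable speed, as long as you use the MEMORY engine...
Incoming queries were about half of today's volume.
An approach similar to implementing a bulletin board
And so... we went and chose MySQL......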
![Page 13: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/13.jpg)
This won't do.
Advertisers and ad volume keep growing
Memory is expensive...
Volatile DB ➔ meltdown whenever a failure hits
![Page 14: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/14.jpg)
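Do we really need an RDBMS?
"Search keyword" ➔ ad list (i.e., the search result)
Search engines don't use an RDB, do they?
Relations?
We'd use them if we had them...
but we can do without...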
![Page 15: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/15.jpg)
NoSQL?
We had only ever heard the word NoSQL..
So it's not SQL?
Found out that's not what it means.. meltdown
Hadoop, Cassandra, MongoDB, ...
Why are there so many? Meltdown
![Page 16: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/16.jpg)
![Page 17: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/17.jpg)
Why Cassandra?
Cassandra's internal data storage model
Very similar to the data structure of a search index
Plus the general (and attractive) properties of NoSQL
High availability
Scalability
and so on...
![Page 18: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/18.jpg)
![Page 19: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/19.jpg)
![Page 20: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/20.jpg)
Cluster > Keyspace > CF > Row > Column
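The hierarchy above can be pictured as nested maps, which is also why it resembles a search index (keyword to list of ads). The sketch below is an illustration only: the keyspace, column family, keyword, and ad names are hypothetical and do not come from the talk.

```python
# Cluster > Keyspace > Column Family > Row > Column, modeled as nested dicts.
cluster = {
    "ads_keyspace": {                      # keyspace (hypothetical name)
        "ads_by_keyword": {                # column family (hypothetical name)
            "running shoes": {             # row key: the search keyword
                "ad:1001": "Shoe store A", # column name -> column value
                "ad:1002": "Shoe store B",
            },
        },
    },
}

def ads_for(keyword: str) -> list:
    """Return the ads stored under one row key, ordered by column name."""
    row = cluster["ads_keyspace"]["ads_by_keyword"].get(keyword, {})
    return [row[name] for name in sorted(row)]

print(ads_for("running shoes"))
```

One row lookup returns the whole ad list for a keyword, with no join involved.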
![Page 21: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/21.jpg)
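Why Cassandra?
Key characteristics in brief
Keys are placed on a token ring.
Every node behaves identically.
There is no update. Everything is a write.
Memory-first reads and writes
Guarantees A and P by sacrificing some C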
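Placing keys on a token ring can be sketched in a few lines. This is a simplified model, assuming an MD5-based hash and evenly spaced node tokens; the function names are made up for illustration, and replication is ignored.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # assumed token space for an MD5-style partitioner

def token_for(key: str) -> int:
    """Hash a row key to a position on the ring."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE

def build_ring(node_names):
    """Give each node an evenly spaced token, sorted by token."""
    step = RING_SIZE // len(node_names)
    return [(i * step, name) for i, name in enumerate(node_names)]

def owner(ring, key: str) -> str:
    """A key belongs to the first node whose token is >= the key's token,
    wrapping around to the first node at the end of the ring."""
    tokens = [token for token, _ in ring]
    index = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    return ring[index][1]

ring = build_ring([f"node{i:02d}" for i in range(20)])
print(owner(ring, "running shoes"))
```

Because every node applies the same rule, any node can work out which node owns a key, which matches the "every node behaves identically" point.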
![Page 22: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/22.jpg)
![Page 23: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/23.jpg)
1st trial
A 20-node cluster
Each node:
CPU: 4 cores
RAM: 8GB
DISK: SATA-3 x 2 volumes
...we thought that would be enough.
![Page 24: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/24.jpg)
1st meltdown
Implementation was possible.
We even ran load tests with Grinder.
But that turned out to be the trap.
![Page 25: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/25.jpg)
1st meltdown
Many hot spots appeared
The schema had to be designed "better".
Predict the size and number of CFs, rows, and columns
The replication strategy too..
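One way to predict hot spots before going live is to estimate how much data each row holds and see how unevenly it lands on the nodes. The sketch below is a rough model, not the team's method: it assumes a simple hash-modulo placement and hypothetical row sizes.

```python
from collections import Counter
import hashlib

def node_for(row_key: str, node_count: int) -> int:
    """Simplified placement: hash the row key and take it modulo the node count."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % node_count

def load_skew(row_sizes: dict, node_count: int) -> float:
    """Ratio of the busiest node's load to the average load (1.0 means perfectly even)."""
    load = Counter()
    for key, size in row_sizes.items():
        load[node_for(key, node_count)] += size
    if not load:
        return 0.0
    mean = sum(load.values()) / node_count
    return max(load.values()) / mean

# Hypothetical example: one huge row next to many small ones.
rows = {"popular keyword": 100_000}
rows.update({f"rare keyword {i}": 10 for i in range(1000)})
print(f"skew: {load_skew(rows, 20):.1f}x the average node")
```

A single oversized row cannot be split across nodes, so a high ratio here points to the row key design rather than the cluster size.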
![Page 26: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/26.jpg)
![Page 27: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/27.jpg)
![Page 28: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/28.jpg)
![Page 29: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/29.jpg)
1st meltdown
![Page 30: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/30.jpg)
1st meltdown
![Page 31: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/31.jpg)
2nd trial
Fixed the schema
The choice of ROW KEY matters
![Page 32: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/32.jpg)
2nd meltdown
Why does it still not work?
Compactions ➔ HEAVY I/O
iowait builds up, then swapping starts
Hot spots spread to the whole cluster
The memory-first property is defeated
![Page 33: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/33.jpg)
2nd meltdown
iostat -dkx 1
Disks with faster I/O performance are needed
More memory
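Alongside `iostat`, the iowait share can be computed from two samples of the aggregate `cpu` line in Linux's `/proc/stat`. This is a minimal sketch; the function name and the sample values are made up for illustration.

```python
def iowait_percent(before: str, after: str) -> float:
    """Percentage of CPU time spent in iowait between two 'cpu ...' lines.

    Field order in /proc/stat: user nice system idle iowait irq softirq steal.
    Only the first eight fields are summed, since guest time is already
    counted inside user time.
    """
    def fields(line: str):
        return [int(x) for x in line.split()[1:9]]

    delta = [b - a for a, b in zip(fields(before), fields(after))]
    total = sum(delta)
    return 100.0 * delta[4] / total if total else 0.0

# Two made-up samples taken some time apart.
sample_1 = "cpu 100 0 50 800 50 0 0 0"
sample_2 = "cpu 200 0 100 1400 300 0 0 0"
print(iowait_percent(sample_1, sample_2))
```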
![Page 34: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/34.jpg)
3rd trial
...written "3rd trial", read "server replacement".
SATA to SAS
RAID-10 configuration
The cache on the RAID controller didn't help much...
swapoff
Nehalem triple channel, fully populated: 4GB x 3EA
![Page 35: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/35.jpg)
3rd meltdown
Quiet for a few days, then read timeouts popping up everywhere
Once again, iowait load related to disk I/O
TCP sockets were somewhat of a problem too.
![Page 36: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/36.jpg)
3rd meltdown
![Page 37: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/37.jpg)
![Page 38: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/38.jpg)
3rd meltdown
A typo in the schema, of all things OTL
keys_cached was set to 8,000,000
A hack developed to avoid this mistake
800000 has no digit separators
800111
50111000
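A different way to guard against a stray zero is to write the value with separators and range-check it before it is applied. This is only a sketch of that alternative, not the trick on the slide: the function names and the allowed range are hypothetical, since the talk does not state the intended value.

```python
KEYS_CACHED_LIMITS = (1_000, 1_000_000)  # hypothetical sane range, an assumption

def parse_count(text: str) -> int:
    """Accept '800,000' or '800_000' so the magnitude is visible when written."""
    return int(text.replace(",", "").replace("_", ""))

def check_keys_cached(text: str) -> int:
    """Parse the value and reject anything outside the expected range."""
    value = parse_count(text)
    low, high = KEYS_CACHED_LIMITS
    if not low <= value <= high:
        raise ValueError(
            f"keys_cached={value:,} is outside the expected range {low:,}..{high:,}"
        )
    return value

print(check_keys_cached("800,000"))
```

With this check in place, `check_keys_cached("8,000,000")` raises a `ValueError` instead of silently reaching the cluster.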
![Page 39: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/39.jpg)
3rd meltdown
The problems are not only in Cassandra.
![Page 40: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/40.jpg)
Tuning again...
This time, let's do it properly.
v1.0 was released just in time, and many things resolved themselves
Hooray~~~~
![Page 41: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/41.jpg)
Tuning again...
Block device scheduler
noop, anticipatory, DEADLINE, cfq
ext4 mount options
rw,noatime,data=writeback,barrier=0,nobh,errors=remount-ro
Threatening integrity = sacrificing consistency?
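On Linux the active block device scheduler is shown in square brackets in `/sys/block/<device>/queue/scheduler`. A small sketch for checking it from a script follows; the function name is made up, and the sample string mirrors the list above.

```python
import re

def active_scheduler(sysfs_text: str):
    """Return the scheduler marked with [brackets], or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    return match.group(1) if match else None

# Sample content in the format the kernel exposes; on a real host, read the
# text from /sys/block/<device>/queue/scheduler instead.
sample = "noop anticipatory [deadline] cfq\n"
print(active_scheduler(sample))
```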
![Page 42: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/42.jpg)
![Page 43: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/43.jpg)
Tuning again...
/etc/sysctl.conf
vm.swappiness = 0
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.core.somaxconn = 3072
net.ipv4.tcp_fin_timeout = 10
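Across many nodes it helps to verify that each host's `/etc/sysctl.conf` actually contains these values. The sketch below parses the file text and reports differences; the function names are made up for illustration. Note that `net.ipv4.tcp_tw_recycle` was removed in Linux 4.12, so that line only applies to older kernels like the ones used at the time.

```python
WANTED = {
    "vm.swappiness": "0",
    "net.ipv4.tcp_tw_reuse": "1",
    "net.ipv4.tcp_tw_recycle": "1",   # only exists on kernels before 4.12
    "net.core.somaxconn": "3072",
    "net.ipv4.tcp_fin_timeout": "10",
}

def parse_sysctl(text: str) -> dict:
    """Parse 'key = value' lines, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings

def missing_or_different(text: str, wanted: dict = WANTED) -> dict:
    """Return the wanted settings that are absent or set to another value."""
    current = parse_sysctl(text)
    return {k: v for k, v in wanted.items() if current.get(k) != v}

sample = "# tuned\nvm.swappiness = 60\nnet.core.somaxconn = 3072\n"
print(missing_or_different(sample))
```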
![Page 44: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/44.jpg)
Tuning again...
Running on JVM
The small stuff...
-XX:+UseCompressedOops
-XX:+UseParNewGC (default)
-XX:+UseConcMarkSweepGC (default)
-XX:+CMSParallelRemarkEnabled (default)
![Page 45: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/45.jpg)
Tuning again...
Running on JVM
(continued)
-XX:SurvivorRatio=11
-XX:MaxTenuringThreshold=1 (default)
-XX:CMSInitiatingOccupancyFraction=90
-XX:+UseCMSInitiatingOccupancyOnly (default)
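`-XX:SurvivorRatio` is the ratio of eden to one survivor space, and the young generation holds eden plus two survivor spaces. A quick sketch of what `SurvivorRatio=11` means in practice follows; the 1300 MB young generation is a made-up figure, not from the talk.

```python
def young_gen_layout(young_mb: float, survivor_ratio: int):
    """Split the young generation into (eden, one survivor space), in MB.

    young = eden + 2 * survivor and eden = survivor_ratio * survivor,
    so survivor = young / (survivor_ratio + 2).
    """
    survivor = young_mb / (survivor_ratio + 2)
    eden = survivor * survivor_ratio
    return eden, survivor

eden, survivor = young_gen_layout(1300, 11)   # hypothetical 1300 MB young gen
print(f"eden={eden:.0f} MB, each survivor={survivor:.0f} MB")
```

A higher ratio leaves more of the young generation to eden and shrinks the survivor spaces.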
![Page 46: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/46.jpg)
Tuning again...
Using JNA is optional.
Use it, obviously.
Run it with root privileges
mmap
![Page 47: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/47.jpg)
Tuning again...
CompactionStrategy : LeveledCompactionStrategy
vs. SizeTieredCompactionStrategy
sstable_size_in_mb: 16~64MB
Use only the key cache, with no row cache
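The choice of `sstable_size_in_mb` determines how many SSTables leveled compaction has to manage. The sketch below is a simplified model: it assumes each level holds ten times as many SSTables as the one before it (10, 100, 1000, ...) and ignores level 0, and the 100 GB per-node figure is made up.

```python
import math

def leveled_layout(data_mb: float, sstable_size_mb: int, fanout: int = 10):
    """Return (number of SSTables, levels needed) under the simplified model."""
    sstables = math.ceil(data_mb / sstable_size_mb)
    levels, capacity = 0, 0
    while capacity < sstables:
        levels += 1
        capacity += fanout ** levels   # level N holds fanout**N SSTables
    return sstables, levels

for size in (16, 64):
    count, levels = leveled_layout(100 * 1024, size)   # hypothetical 100 GB node
    print(f"sstable_size_in_mb={size}: {count} SSTables across {levels} levels")
```

Smaller SSTables mean more files to track for the same data, while larger ones make each compaction step move more data.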
![Page 48: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/48.jpg)
4th meltdown
Just one of the nodes behaves strangely.
Hot spot?
Diagnosed as a faulty RAID controller
Ran smoothly after the replacement
Find faulty hardware early and get it serviced.
![Page 49: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/49.jpg)
It runs well now, haha
![Page 50: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/50.jpg)
Summary
SYSTEM
RAM, SAS, RAID
OS
scheduler
sysctl, swapoff
ext4, mount opt
JVM
OOPS
GC
Cassandra
JNA, MMAP
schema design
cache size
disable keep-alive
Endpoint snitch
column index size
disable preheat
![Page 51: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/51.jpg)
Summary
Whether it's Cassandra or anything else,
pick the storage that fits your purpose,
tune it well, and it performs well enough.
![Page 52: Cassandra 멘붕기 | Devon 2012](https://reader036.vdocuments.us/reader036/viewer/2022081720/558d2675d8b42a21638b463f/html5/thumbnails/52.jpg)
Reference
http://cassandra.apache.org/
http://www.datastax.com/docs/1.0/operations/tuning
http://www.wikipedia.org/
http://github.com/hector-client/hector