mariadb - a mysql replacement #self2014
Post on 26-Jan-2015
128 Views
Preview:
DESCRIPTION
TRANSCRIPT
MariaDB - a MySQL replacement
Colin Charles, Team MariaDB, SkySQL Ab colin@mariadb.org | http://mariadb.org/
http://bytebot.net/blog/ | @bytebot on Twitter South East Linux Fest, Charlotte, NC, USA
20 June 2014
whoami
• Work on MariaDB at SkySQL Ab
• Merged with Monty Program Ab, makers of MariaDB
• Formerly MySQL AB (exit: Sun Microsystems)
• Past lives include Fedora Project (FESCO), OpenOffice.org
Who are you?
• Developer?
• Operator? (DBA, sysadmin)
• A bit of both?
Global Top 20 sites1.Google
2.Facebook
3.YouTube
4.Yahoo!
5.Baidu
6.Wikipedia
7.QQ
8.Taobao
9.Twitter
10.Live
11.LinkedIn
12.Sina
13.Amazon
14.hao123.com
15.google.co.in
16.blogspot
17.weibo.com
18.wordpress.com
19.360.cn
20.yandex.ru
5W1H is MariaDB• Drop-in compatible MySQL replacement
• Community developed, Foundation backed, feature enhanced, backwards compatible, GPLv2 licensed
• Steady stream of releases in 4 years 4 months: 5.1, 5.2, 5.3, 5.5, 10.0, MariaDB Galera Cluster 5.5, MariaDB with TokuDB 5.5
• Enterprise features open: PAM authentication plugin, threadpool, audit plugin
• Default in Red Hat Enterprise Linux, Fedora, openSUSE, etc.
Microseconds• TIME, DATETIME, TIMESTAMP, temporal functions, CAST, dynamic
columns
CREATE TABLE microsec(
column_microsec DATETIME(6),
column_millisec TIME(3)
);
SELECT CURTIME(6);
MariaDB 5.3+
Microseconds & 5.6
• TIME_TO_SEC(), UNIX_TIMESTAMP() preserve microseconds of the argument
MariaDB 10.0 MySQL 5.6
SELECT TIME_TO_SEC('10:10:10.12345'); +-------------------------------+ | TIME_TO_SEC('10:10:10.12345') | +-------------------------------+ | 36610.12345 | +-------------------------------+ 1 row in set (0.01 sec)
SELECT TIME_TO_SEC('10:10:10.12345'); +-------------------------------+ | TIME_TO_SEC('10:10:10.12345') | +-------------------------------+ | 36610 | +-------------------------------+ 1 row in set (0.00 sec)
Virtual Columns
• A column in a table that has its value automatically calculated either with a pre-calculated/deterministic expression or values of other fields in the table
• VIRTUAL - computed on the fly when data is queried (like a VIEW)
• PERSISTENT - computed when data is inserted and stored in a table
MariaDB 5.2+
PCRE Regular Expressions• Powerful REGEXP/RLIKE operator
• New operators:
• REGEXP_REPLACE(sub,pattern,replace)
• REGEXP_INSTR(sub,pattern)
• REGEXP_SUBSTR(sub,pattern)
• Works with multi-byte character sets that MariaDB supports, including East-Asian sets
MariaDB 10.0+
GIS• MariaDB implements a subset of SQL with Geometry Types
• No longer just minimum bounding rectangles (MBR) - shapes considered
CREATE TABLE geom (g GEOMETRY NOT NULL, SPATIAL INDEX(g)) ENGINE=MyISAM;
• ST_ prefix - as per OpenGIS requirements
MariaDB 5.3+
Sample use cases• Import OpenStreetMap data into MariaDB: http://www.slideshare.net/
hholzgra/fosdem-2014mariadbgis
• Use the OpenStreetMap dataset: https://mariadb.com/kb/en/openstreetmap-dataset/
• Screencast: https://blog.mariadb.org/screencast-mariadb-gis-demo/
• node.js example use case for mapping GPX data: https://blog.mariadb.org/node-js-mariadb-and-gis/ & jQuery usage: https://blog.mariadb.org/jquery-and-gis-distance-in-mariadb/
Dynamic columns• Allows you to create virtual columns with dynamic content for each row in
table. Store different attributes for each item (like a web store).
• Basically a BLOB with handling functions: COLUMN_CREATE, COLUMN_ADD, COLUMN_GET, COLUMN_DELETE, COLUMN_EXISTS, COLUMN_LIST, COLUMN_CHECK, COLUMN_JSON
• In MariaDB 10.0: name support (instead of referring to columns by numbers, name it), convert all dynamic column content to JSON array, interface with Cassandra
INSERT INTO tbl SET dyncol_blob=COLUMN_CREATE("column_name", "value");
MariaDB 5.3+
Full-text search via SphinxSE
mysql> INSTALL PLUGIN sphinx SONAME 'ha_sphinx.so';
Query OK, 0 rows affected (0.01 sec)
MariaDB 5.2+
What is SphinxSE?• SphinxSE is just the storage engine that still depends on the Sphinx
daemon
• It doesn’t store any data itself
• Its just a built-in client to allow MariaDB to talk to Sphinx searchd, run queries, obtain results
• Indexing, searching is performed on Sphinx
Sphinx search tableCREATE TABLE t1
(
id INTEGER UNSIGNED NOT NULL,
weight INTEGER NOT NULL,
query VARCHAR(3072) NOT NULL,
group_id INTEGER,
INDEX(query)
) ENGINE=SPHINX CONNECTION="sphinx://localhost:9312/test";
!
SELECT * FROM t1 WHERE query='test it;mode=any';
Sphinx search tables
• 1st column: INTEGER UNSIGNED or BIGINT (document ID)
• 2nd column: match weight
• 3rd column: VARCHAR or TEXT (your query)
• Query column needs indexing, no other column needs to be
Query Cassandra
• Data is mapped: rowkey, static columns, dynamic columns
• super columns aren’t supported
• No 1-1 direct map for data types
• Write to Cassandra from SQL (SELECT, INSERT, UPDATE, DELETE)
MariaDB 10.0+
Cassandra II pk varchar(36) primary key,
data1 varchar(60),
data2 bigint
) engine=cassandra keyspace='ks1' column_family='cf1'
• Table must have a primary key
• name/type must match Cassandra’s rowkey
• Columns map to Cassandra’s static columns
• name must be same as in Cassandra, datatypes must match, can be subset of CF’s columns
Mapping
• Datatype mapping - complete table at KB
• Data mapping is safe - engine will refuse incorrect mappings
• Command mapping: INSERT overwrites rows, UPDATE reads then writes, DELETE reads then writes
Typical use cases
• Web page hits collection, streaming data
• Sensor data
• Reads served with a lookup
• Want an auto-replicated, fault-tolerant table?
CONNECT• Target: ETL for BI or analytics
• Import data from CSV, XML, ODBC, MS Access, etc.
• WHERE conditions pushed to ODBC source
• DROP TABLE just removes the stored definition, not data itself
• “Virtual” tables cannot be indexed
MariaDB 10.0+
SPIDER
• Horizontal partitioning, built on top of PARTITIONs
• Associates a partition with a remote server
• Transparent to user, easy to expand
• Has index condition pushdown support enabled
MariaDB 10.0+
TokuDB• Opensource - separate MariaDB 5.5+TokuDB/integrated in 10.0.5
• Improved insert (10-20x faster) & query speed, compression (up to 90% space reduction), replication performance and online schema flexibility
• Uses Fractal Tree Indexes instead of B-Tree
• Tests & builds of TokuDB on multiple platforms
Engines, etc• Plan for backups - TokuDB can be cool for your uses as an example
• Galera: study your workload patterns, your application, etc.
• SPIDER (built-in sharding capabilities, partitioning & XA transaction capable with multiple backends including Oracle)
• its not going to be straightforward to “just start” - need to know right tables to implement, etc.
Threadpool• Modified from 5.1 (libevent based), great for CPU bound
loads and short running queries
• Windows (threadpool), Linux (epoll), Solaris (event ports), FreeBSD/OSX (kevents)
• No minimization of concurrent transactions with dynamic pool size
• thread_handling=pool-of-threads
• https://mariadb.com/kb/en/thread-pool-in-mariadb-55/
MariaDB 5.5+
PAM Authentication• Authentication using /etc/shadow
• Authentication using LDAP, SSH pass phrases, password expiration, username mapping, logging every login attempt, etc.
• INSTALL PLUGIN pam SONAME ‘auth_pam.so’;
• CREATE USER foo@host IDENTIFIED via pam
• Remember to configure PAM (/etc/pam.d or /etc/pam.conf)
• http://www.mysqlperformanceblog.com/2013/02/24/using-two-factor-authentication-with-percona-server/
MariaDB 5.2+
Non-blocking client library• start operation, do work in thread, operation processed, result
travels back
• use cases: multiple queries against single server (utilize more CPUs); queries against multiple servers (SHOW STATUS on many machines)
• https://mariadb.com/kb/en/about-non-blocking-operation-in-the-client-library/
• fast node.js driver available: mariasql
MariaDB 5.5+
LIMIT ROWS EXAMINED
• The purpose of this optimization is to provide the means to terminate the execution of SELECTstatements which examine too many rows, and thus use too many resources.
• SELECT * from t1, t2 LIMIT 10 ROWS EXAMINED 1000;
• https://mariadb.com/kb/en/limit-rows-examined/
MariaDB 5.5+
SQL Error Logging Plugin
• Log errors sent to clients in a log file that can be analysed later. Log file can be rotated (recommended)
• a MYSQL_AUDIT_PLUGIN
install plugin SQL_ERROR_LOG soname 'sql_errlog.so';
MariaDB 5.5+
Audit Plugin
• Log server activity - who connects to the server, what queries run, what tables touched - rotating log file or syslogd
• a MYSQL_AUDIT_PLUGIN
INSTALL PLUGIN server_audit SONAME ‘server_audit.so’;
MariaDB 10.0+
Replication made better• Selective skipping of replication events (session-based or on master
or slave)
• Dynamic control of replication variables (no restarts!)
• Using row-based replication? Annotate the binary log with SQL statements
• Slaves perform checksums on binary log events
• Slaves crash-safe (data stored inside transaction tables)
MariaDB 5.3+
Replication made better II• Group commit in the binary log - finally, sync_binlog=1, innodb_flush_log_at_trx_commit=1 performs
• START TRANSACTION WITH CONSISTENT SNAPSHOT
• mysqldump —single-transaction —master-data - full non-blocking backup
• Parallel replication
• Multi-source replication - (real-time) analytics, shard provisioning, backups, etc.
New KILL syntax• HARD | SOFT & USER USERNAME are MariaDB-specific (5.3.2)
• KILL QUERY ID query_id (10.0.5) - kill by query id, rather than thread id
• SOFT ensures things that may leave a table in an inconsistent state aren’t interrupted (like REPAIR or INDEX creation for MyISAM or Aria)
KILL [HARD | SOFT] [CONNECTION | QUERY] [thread_id | USER user_name]
MariaDB 5.3+
Statistics• Understand server activity better to understand database loads
• SET GLOBAL userstat=1;
• SHOW CLIENT_STATISTICS; SHOW USER_STATISTICS;
• # of connections, CPU usage, bytes received/sent, row statistics
• SHOW INDEX_STATISTICS; SHOW TABLE_STATISTICS;
• # rows read, changed, indexes
• INFORMATION_SCHEMA.PROCESSLIST has MEMORY_USAGE, EXAMINED_ROWS (similar with SHOW STATUS output)
MariaDB 5.2+
MariaDB 10.0+
EXPLAIN enhanced
• Explain analyser: https://mariadb.org/explain_analyzer/analyze/
• SHOW EXPLAIN for <thread_id>
• EXPLAIN output in the slow query log
• EXPLAIN not just for SELECT but INSERT/UPDATE/DELETE
MariaDB 10.0+
Roles
• Bundles users together, with similar privileges - follows the SQL standard
CREATE ROLE audit_bean_counters;
GRANT SELECT ON accounts.* to audit_bean_counters;
GRANT audit_bean_counters to ceo;
MariaDB 10.0+
FusionIO
• If you have nvmfs (formerly DirectFS), you can disable the innodb_doublewrite buffer
• page level compression in background threads (reduces I/O, saves the life of your device)
MariaDB 10.0+
What else is there• Engines: Aria, OQGRAPH, FederatedX
• Progress reporting for ALTER/LOAD DATA INFILE
• Table Elimination
• HandlerSocket
• SHUTDOWN functionality
• And a lot more….
Feedback plugin
• feedback=on in my.cnf [mysql]
Connectors• The MariaDB project provides LGPL connectors (client libraries) for:
• C
• Java
• ODBC
• Embedding a connector? Makes sense to use these LGPL licensed ones…
OptimizerMariaDB 10 MySQL 5.6
index_merge=on index_merge_union=on
index_merge_sort_union=on index_merge_intersection=on
index_merge_sort_intersection=off engine_condition_pushdown=off index_condition_pushdown=on
derived_merge=on derived_with_keys=on
firstmatch=on loosescan=on
materialization=on in_to_exists=on
semijoin=on partial_match_rowid_merge=on partial_match_table_scan=on
subquery_cache=on mrr=off
mrr_cost_based=off mrr_sort_keys=off
outer_join_with_cache=on semijoin_with_cache=on
join_cache_incremental=on join_cache_hashed=on
join_cache_bka=on optimize_join_buffer_size=off
table_elimination=on extended_keys=on
exists_to_in=off
index_merge=on index_merge_union=on
index_merge_sort_union=on index_merge_intersection=on
engine_condition_pushdown=on
index_condition_pushdown=on mrr=on
mrr_cost_based=on block_nested_loop=on
batched_key_access=off materialization=on
semijoin=on loosescan=on firstmatch=on
subquery_materialization_cost_based=on
use_index_extensions=on
MariaDB 5.3+
MariaDB Galera Cluster
• MariaDB Galera Cluster is made for today’s cloud based environments. It is fully read-write scalable, comes with synchronous replication, allows multi-master topologies, and guarantees no lag or lost transactions.
• Currently 5.5-based
• 10.0 is in beta (almost ready for release)
Trusted by many• Google
• Wikipedia
• Tumblr
• SpamExperts
• Limelight Networks
• KakaoTalk
• Paybox Services
Quality matters• security@mariadb.org is now commonly on CC when it comes to
MySQL bugs
• Selective (not blind) merging
• Tests (mysql-test/)
• MySQL 5.5: 2,466
• MySQL 5.6: 3,603
• MariaDB 10.0: 3,812
Going forward• column level & block level encryption (Eperi, Google - InnoDB, Aria)
• Kerberos authentication plugin
• Query timeouts
• More Google Summer of Code features (4 students 2014; 3 students 2013)
• Full 5.6 compatibility + 5.7 features (so syntax will match for duplicated functionality)
Resources• We moved to github! https://github.com/MariaDB/server
• We’re still on launchpad for older branches: https://launchpad.net/maria
• maria-discuss@lists.launchpad.net
• maria-developers@lists.launchpad.net
• #maria on freenode
• facebook.com/MariaDB.dbms
• @mariadb / +MariaDB
Q&Acolin@mariadb.org | byte@bytebot.net http://skysql.com/ | http://mariadb.org/
twitter: @bytebot | url: http://bytebot.net/blog/
top related