Databases

SQL

MariaDB

Installation

CentOS [1]:

$ sudo vim /etc/yum.repos.d/mariadb.repo
[mariadb]
name = MariaDB
baseurl = http://yum.mariadb.org/10.1/centos7-amd64
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1
$ sudo yum install MariaDB-server MariaDB-client

Configuration Tuning

Here are common optimizations for MariaDB’s configuration. [12][13][14]

# Enable query cache.
query_cache_type = 1
# Increase in increments of 32M.
query_cache_size = 128M
# Increase temporary table size. Set these two options to the same value.
tmp_table_size = 1G
max_heap_table_size = 1G
# Set to ~85% of available RAM if it is the only major service.
innodb_buffer_pool_size = 6.8G
# Avoid reverse DNS lookups.
skip-name-resolve

Adjust the thread_cache_size until the percentage calculated below is as close to 100% as possible. [15]

mysql> SHOW STATUS LIKE 'Threads_created';
mysql> SHOW STATUS LIKE 'Connections';
mysql> SELECT 100 - (( <THREADS_CREATED> / <CONNECTIONS> ) * 100) as "Threads_created % of Connections"\G

The MySQLTuner script can also help to automatically find settings that should be changed based on MariaDB’s recent utilization.

Use the MySQL Calculator to ensure that the new settings will not use more than the available RAM on the system.

Table Engines

A full list of the available engines are provided at https://mariadb.com/kb/en/mariadb/storage-engines.

CassandraSE

The Cassandra Storage Engine (CassandraSE) is used to connect to a NoSQL Cassandra cluster. This allows the relational management of MariaDB to work with the fast and scalable Cassandra server. [2]

The Cassandra storage engine is missing from the official RHEL 7 repositories but can be installed from the RHEL 6 repository. [4]

$ sudo yum install http://yum.mariadb.org/10.1/centos6-amd64/rpms/MariaDB-10.1.18-centos6-x86_64-cassandra-engine.rpm
  • As the root MariaDB user, load the Cassandra library.

    $ sudo mysql
    > install soname 'ha_cassandra.so';
    
  • Configure the default Cassandra rpc/client IP to connect to.

    > SET GLOBAL cassandra_default_thrift_host='<CASSANDRA_IP>'
    
  • The Cassandra table has to be created with the “COMPACT STORAGE” option. Otherwise, MariaDB will be unable to access the table properly.

    $ cqlsh
    > CREATE TABLE <NAME> (<VALUE1> <TYPE> PRIMARY KEY, <VALUE2> <TYPE>) WITH COMPACT STORAGE;
    
  • Create a Cassandra table within a database. MariaDB needs to know the keyspace and the column/table to map to as well as what IP address to use to connect to Cassandra.

    > CREATE TABLE <NAME> ENGINE=cassandra KEYSPACE="<KEYSPACE_NAME>" COLUMN_FAMILY="<COLUMN_FAMILY_NAME>";
    

[2][3]

Clustering

Maxscale

MaxScale is a proxy that can load balance requests in different ways. This is useful for specifying reads and writes to specific servers. [5]

Configuration

The latest version of MariaDB’s MaxScale can be found at https://mariadb.com/downloads/maxscale.

RHEL Install [6]:

$ sudo yum install https://downloads.mariadb.com/MaxScale/2.0.1/rhel/7/x86_64/maxscale-2.0.1-1.rhel.7.x86_64.rpm

MaxScale requires the configuration of a listener that is associated with a router that serves requests to/from a list of servers.

Listener options:

  • [NAME]

  • type=listener

  • service = Specify the back-end service to use. This is usually a router.

  • protocol=MySQLClient

  • address = Specify the address to listen on.

  • port=3306

Listener example:

[listener]
type=listener
service="Read Write Service"
protocol=MySQLClient
port=3306

Router options:

  • [NAME]

  • type=service

  • router

    • readconnroute = Load balance requests.

    • readwritesplit = Send write requests to one node and read queries to all nodes.

    • schemarouter = Shard databases. Requests to a particular database will be routed to a specific server.

    • binlogrouter = Copy binary logs from servers other servers. If a back-end server fails, MaxScale will replace it and serve read requests from the available binary log.

  • router_options

    • master = Write only.

    • slave = Read only.

    • master,slave = Read and write.

  • servers = A comma separated list of back-end servers.

  • user = Specify a MySQL user to connect as.

  • passwd = Specify the password for the MySQL user.

Example:

[Read Write Service]
type=service
router=readwritesplit
servers=server1,server2,server3
user=maxscale
passwd=123456

Server options:

  • [NAME]

  • type=server

  • address = Specify the address of the MySQL server.

  • port= Specify the MySQL port (default: 3306).

  • protocol=MySQLBackend

[7]

For replication, a maxscale MySQL user needs “REPLICATION SET” and “SELECT” grants for all databases.

GRANT REPLICATION SET, SELECT ON *.* TO 'maxscale'@'%' IDENTIFIED BY 'securepassword123';

In a master-slave configuration, at least two servers are required to be running. This is because MaxScale is unsure if other nodes are present and cannot determine if a server is a master or a slave. This will prevent it from working properly and this error will occur for all connections. [8] It is ideal to follow the quorum theory by having 3 servers to support a failed host properly.

ERROR 1045 (28000): failed to create new session

NoSQL

Cassandra

Configuration

RHEL Install [9][10]:

$ sudo yum install java-1.8.0-openjdk

File: /etc/yum/repos.d/datastax.repo

[datastax]
name = DataStax Repo for Apache Cassandra
baseurl = http://rpm.datastax.com/community
enabled = 1
gpgcheck = 0
$ sudo yum install cassandra30
$ sudo systemctl daemon-reload

Configuration options:

  • cluster_name = The unique name for a cluster. Default: Test Cluster.

  • listen_address = The IP address to listen on for clustering. Default: localhost.

  • listen_interface = The network interface to listen on for clustering. Default: eth0.

  • rpc_address = The IP address to listen on for client requests. Default: localhost.

  • rpc_interface = The network interface to listen on for client requests. Default: eth1.

  • start_rpc = Start the client service to allow incoming connections. Default: false.

  • disk_optimization_strategy = Specify the type of disk to optimize reads/writes for. Default: ssd.

    • ssd = Solid state drivers.

    • spinning = Spinning disk hard drives.

  • disk_failure_policy = The action to take when a disk is missing or in a failed state. Default: stop.

    • best_effort = Do not use the disk but attempt to respond to requests with any data available.

    • die = Kill off all processes.

    • ignore = Ignore any major I/O errors and provide failure responses to any requests.

    • stop = Gracefully stop the service.

  • endpoint_snitch = Select a snitch interface for clustering.

    • CloudstackSnitch = Integrate with the Apache Cloudstack.

    • Ec2Snitch = Cluster based on Amazon EC2 regions and compute availability zones.

    • Ec2MultiRegionSnitch = Allows multiple Amazon EC2 regions to be used via public floating IPs.

    • GoogleCloudSnitch = Cluster based on the Google Cloud Platform’s regions and compute availability zones.

    • GossipingPropertyFileSnitch = Cluster based on the datacenter and rack location. Recommended for a multidatacenter cluster.

    • RackInferringSnitch = Similar to GossipingPropertyFileSnitch except that the datacenter is automatically determined by the 2nd octet of the IP and the rack is determined by the 3rd.

    • SimpleSnitch = Cluster based on proximity, but datacenter and rack location does not matter. Recommended for clusters in one region.

  • seed_provider = The IP addresses of Cassandra servers in other datacenters to replicate to. At least one node should be a seed provider in every datacenter. Not all nodes should be seed providers due to that leading to performance issues.

    • class_name: org.apache.cassandra.locator.SimpleSeedProvider

      • parameters:

      • seeds: “<IP_ADDRESS_1>”, “<IP_ADDRESS_2>

  • concurrent_reads = Default: 32. Recommended: (16 * <COUNT_OF_DISKS>).

  • concurrent_writes = Default: 32. Recommended: (16 * <COUNT_OF_CPU_CORES>).

  • concurrent_counter_writes = Default: 32. Recommended: 16 * <COUNT_OF_DISKS>).

  • concurrent_batchlog_writes = Default: 32. Recommended: (16 * <COUNT_OF_CPUS>).

  • concurrent_materialized_view_writes = Default: 32. Recommended: Use less than the concurrent reads/writes.

  • incremental_backups = Default: false. Choose whether or not to use incremental backups. When taking snapshots, hardlinks will be used to refer back to old data for efficient backups.

  • snapshot_before_compact = Default: false. Choose whether or not to automatically take backups before running a compaction.

[11]

History

Bibliography

  1. “Installing MariaDB with yum.” MariaDB Knowledgebase. Accessed October 16, 2016. https://mariadb.com/kb/en/mariadb/yum/

  2. “Cassandra Storage Engine Overview.” MariaDB Knowledgebase. Accessed October 16, 2016. https://mariadb.com/kb/en/mariadb/cassandra-storage-engine-overview/

  3. “Cassandra Storage Engine Use Example.” MariaDB Knowledgebase. Accessed October 16, 2016. https://mariadb.com/kb/en/mariadb/cassandra-storage-engine-use-example/

  4. “Missing CentOS7 RPM: MariaDB-10.1.16-centos7-x86_64-cassandra-engine.rpm?” MariaDB Knowledgebase. Accessed October 16, 2016. https://mariadb.com/kb/en/mariadb/missing-centos7-rpm-mariadb-10116-centos7-x86_64-cassandra-enginerpm/

  5. “About MariaDB MaxScale.” MariaDB Knowledgebase. Accessed October 16, 2016. https://mariadb.com/kb/en/mariadb-enterprise/about-mariadb-maxscale/

  6. “MariaDB MaxScale Installation Guide.” MariaDB Knowledgebase. Accessed October 22, 2016. https://mariadb.com/kb/en/mariadb-enterprise/mariadb-maxscale-14/mariadb-maxscale-installation-guide/

  7. “MaxScale Configuration & Usage Scenarios.” MariaDB Knowledgebase. Accessed October 22, 2016. https://mariadb.com/kb/en/mariadb-enterprise/mariadb-maxscale-14/maxscale-configuration-usage-scenarios/

  8. “Issue with MaxScale when slaves are broken.” MaxScale Google Groups. August 28, 2014. Accessed November 12, 2016. https://groups.google.com/forum/#!topic/maxscale/HK49D15s21s

  9. “How To Install Cassandra on CentOS 7” liquidweb Knowledgebase. Accessed October 16, 2016. https://www.liquidweb.com/kb/how-to-install-cassandra-on-centos-7/

  10. “Installing the DataStax Distribution of Apache Cassandra 3.x on RHEL-based systems.” DataStax Distribution of Apache Cassandra 3 Documentation. October 14, 2016. Accessed October 16, 2016. http://docs.datastax.com/en/cassandra/3.x/cassandra/install/installRHEL.html

  11. “The cassandra.yaml configuration file.” DataStax Documentation. Accessed February 8, 2018. http://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/configCassandra_yaml.html

  12. “Get the Best Out of MariaDB with Performance Tuning.” Open Source For You. May 1, 2017. Accessed May 11, 2020. https://opensourceforu.com/2017/05/get-best-mariadb-performance-tuning/

  13. “Calculating InnoDB Buffer Pool Size for your MySQL Server.” ScaleGrid Blog. March 28, 2018. Accessed May 11, 2020. https://scalegrid.io/blog/calculating-innodb-buffer-pool-size-for-your-mysql-server/

  14. “skip-name-resolve to speed up MySQL and avoid problems.” VION Technology Blog. September 18, 2012. Accessed May 11, 2020. https://www.vionblog.com/skip-name-resolve-to-speed-up-mysql-and-avoid-problems/

  15. “MySQL Optimization Tip - thread_cache_size.” Another MySQL DBA. September 2, 2013. http://anothermysqldba.blogspot.com/2013/09/mysql-optimization-tip-threadcachesize.html