The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra’s support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages.
This post walks through a basic multi-node Cassandra setup.
Initial Server Setup
Hardware Information
All the servers had the configuration below.
CPU : 40 Cores
RAM : 192GB
Setting Up Hosts for Cassandra
Set up the servers and update /etc/hosts on each node as below.
# Adding CASSANDRA NODES
10.130.18.35 CASSANDRA01 # SEED
10.130.18.93 CASSANDRA02 # Worker
10.130.18.98 CASSANDRA03 # Worker
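To verify that name resolution works on every node, a quick check:
getent hosts CASSANDRA01 CASSANDRA02 CASSANDRA03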
Updating hostname on all servers.
Update the hostnames as required.
sudo vim /etc/sysconfig/network
Update the hostname as below; do the same on all servers [CASSANDRA01, CASSANDRA02, CASSANDRA03].
NETWORKING=yes
HOSTNAME=CASSANDRA01
To update the hostname without a reboot, execute the command below.
sudo hostname CASSANDRA01
NOTE: The hostname command keeps the hostname only until the next reboot, so it is required that we also update the /etc/sysconfig/network file.
Creating a cassandra user with sudo permissions.
We have a script which creates the user on each server.
wget https://raw.githubusercontent.com/ahmedzbyr/create_user_script/master/create_user_script.sh
sh create_user_script.sh -s cassandra
This will create a cassandra user with sudo permissions.
Creating a passwordless SSH entry from the seed (CASSANDRA01) to the other servers.
Create an RSA key on CASSANDRA01.
ssh-keygen -t rsa
Create the .ssh directory on the other two servers.
ssh cassandra@CASSANDRA02 mkdir -p .ssh
ssh cassandra@CASSANDRA03 mkdir -p .ssh
Add id_rsa.pub to authorized_keys on each of them.
cat ~/.ssh/id_rsa.pub | ssh cassandra@CASSANDRA02 'cat >> .ssh/authorized_keys'
cat ~/.ssh/id_rsa.pub | ssh cassandra@CASSANDRA03 'cat >> .ssh/authorized_keys'
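Alternatively, where ssh-copy-id is available, it performs the key copy (and creates .ssh with sane permissions) in one step:
ssh-copy-id cassandra@CASSANDRA02
ssh-copy-id cassandra@CASSANDRA03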
Make sure the permissions are strict enough for sshd to accept the keys (the .ssh directory should be 700 and authorized_keys 600).
ssh cassandra@CASSANDRA02 'chmod 700 .ssh && chmod 600 .ssh/authorized_keys'
ssh cassandra@CASSANDRA03 'chmod 700 .ssh && chmod 600 .ssh/authorized_keys'
Testing.
ssh cassandra@CASSANDRA02
ssh cassandra@CASSANDRA03
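To confirm the logins are truly passwordless, BatchMode makes key-auth failures explicit instead of falling back to a password prompt:
for h in CASSANDRA02 CASSANDRA03; do ssh -o BatchMode=yes cassandra@$h hostname; done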
Extracting Files.
Extract the archive to /opt, create a symlink, and change ownership.
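This assumes the Cassandra 2.1.3 tarball has already been downloaded to the current directory; if not, it should be available from the Apache archive (URL assumed from the standard archive layout):
wget http://archive.apache.org/dist/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz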
sudo tar xvzf apache-cassandra-2.1.3-bin.tar.gz -C /opt
sudo ln -s /opt/apache-cassandra-2.1.3 /opt/cassandra
sudo chown cassandra:cassandra -R /opt/cassandra
sudo chown cassandra:cassandra -R /opt/apache-cassandra-2.1.3
Creating Required Directories.
Create the data directories and make sure the cassandra user owns them, so the daemon can write to them.
sudo mkdir -p /data1/cassandra/commitlog
sudo mkdir -p /data1/cassandra/data
sudo mkdir -p /data1/cassandra/saved_caches
sudo chown cassandra:cassandra -R /data1/cassandra
Updating the Configuration File.
Set initial_token in conf/cassandra.yaml on each node as below.
Node 0: 0
Node 1: 3074457345618258602
Node 2: 6148914691236517205
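For reference, Cassandra 2.1 defaults to the Murmur3Partitioner, whose token range runs from -2^63 to 2^63 - 1; evenly spaced tokens for N nodes are usually derived as i * (2^64 / N) - 2^63. A quick sketch to generate them (note this yields different values than the ones chosen above):
python -c "num=3; print('\n'.join(str(i * (2**64 // num) - 2**63) for i in range(num)))"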
On Node CASSANDRA01
cluster_name: 'MyCassandraCluster'
initial_token: 0
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "10.130.18.35"
listen_address: 10.130.18.35
endpoint_snitch: SimpleSnitch
data_file_directories:
  - /data1/cassandra/data
commitlog_directory: /data1/cassandra/commitlog
saved_caches_directory: /data1/cassandra/saved_caches
On Node CASSANDRA02
cluster_name: 'MyCassandraCluster'
initial_token: 3074457345618258602
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "10.130.18.35"
listen_address: 10.130.18.93
endpoint_snitch: SimpleSnitch
data_file_directories:
  - /data1/cassandra/data
commitlog_directory: /data1/cassandra/commitlog
saved_caches_directory: /data1/cassandra/saved_caches
On Node CASSANDRA03
cluster_name: 'MyCassandraCluster'
initial_token: 6148914691236517205
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "10.130.18.35"
listen_address: 10.130.18.98
endpoint_snitch: SimpleSnitch
data_file_directories:
  - /data1/cassandra/data
commitlog_directory: /data1/cassandra/commitlog
saved_caches_directory: /data1/cassandra/saved_caches
Starting Cassandra.
On Server CASSANDRA01.
sh /opt/cassandra/bin/cassandra
Wait until the seed node has initialized, then start the rest of the nodes.
On Server CASSANDRA02.
sh /opt/cassandra/bin/cassandra
On Server CASSANDRA03.
sh /opt/cassandra/bin/cassandra
Checking Cluster Information.
[cassandra@CASSANDRA01 bin]$ ./nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.130.18.98   72.09 KB  1       33.3%             1a5a0c77-b5e6-4057-87b4-a8e788786244  rack1
UN  10.130.18.35   46.24 KB  1       83.3%             67de1b1f-8070-48c1-ad88-2c0d4dd7a988  rack1
UN  10.130.18.93   55.64 KB  1       83.3%             7fba7cd0-6f99-4ce8-8194-c9a8b23488cd  rack1
Logging into CQL Shell.
We need to export CQLSH_HOST.
[cassandra@CASSANDRA01 bin]$ export CQLSH_HOST=10.130.18.35
[cassandra@CASSANDRA01 bin]$ cqlsh
Connected to MyCassandraCluster at 10.130.18.35:9042.
[cqlsh 5.0.1 | Cassandra 2.1.3 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
cqlsh>
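As a quick smoke test from cqlsh, we can create a throwaway keyspace and read a row back (the keyspace and table names here are illustrative, and replication_factor 2 is just an example for a three-node cluster):
CREATE KEYSPACE demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};
USE demo;
CREATE TABLE kv (k text PRIMARY KEY, v text);
INSERT INTO kv (k, v) VALUES ('hello', 'world');
SELECT * FROM kv;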
Data Location on CASSANDRA01, CASSANDRA02, CASSANDRA03
[cassandra@CASSANDRA01 bin]$ ls -l /data1/cassandra/
total 12
drwxr-xr-x 2 cassandra cassandra 4096 Mar 19 14:23 commitlog
drwxr-xr-x 4 cassandra cassandra 4096 Mar 19 14:23 data
drwxr-xr-x 2 cassandra cassandra 4096 Mar 19 13:18 saved_caches
[cassandra@CASSANDRA01 bin]$
Performance Tuning.
Updating the cassandra.yaml file.
# For workloads with more data than can fit in memory, Cassandra's
# bottleneck will be reads that need to fetch data from
# disk. "concurrent_reads" should be set to (16 * number_of_drives) in
# order to allow the operations to enqueue low enough in the stack
# that the OS and drives can reorder them. Same applies to
# "concurrent_counter_writes", since counter writes read the current
# values before incrementing and writing them back.
#
# On the other hand, since writes are almost never IO bound, the ideal
# number of "concurrent_writes" is dependent on the number of cores in
# your system; (8 * number_of_cores) is a good rule of thumb.
# concurrent_reads: 32
# concurrent_writes: 32
# Increased for our 40-core machine; the 8 * cores rule of thumb gives 320, we settled on 240.
concurrent_reads: 32
concurrent_writes: 240
concurrent_counter_writes: 32
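To derive these values on the target host, a small sketch following the rules of thumb from the comments above (16 per data drive for reads, 8 per core for writes); the drive count here is an assumed placeholder:
# DRIVES is a placeholder; set it to the number of disks backing /data1
DRIVES=2
CORES=$(nproc)
echo "concurrent_reads: $((16 * DRIVES))"
echo "concurrent_writes: $((8 * CORES))"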
Updating the cassandra-env.sh file.
# Override these to set the amount of memory to allocate to the JVM at
# start-up. For production use you may wish to adjust this for your
# environment. MAX_HEAP_SIZE is the total amount of memory dedicated
# to the Java heap; HEAP_NEWSIZE refers to the size of the young
# generation. Both MAX_HEAP_SIZE and HEAP_NEWSIZE should be either set
# or not (if you set one, set the other).
#
# The main trade-off for the young generation is that the larger it
# is, the longer GC pause times will be. The shorter it is, the more
# expensive GC will be (usually).
#
# The example HEAP_NEWSIZE assumes a modern 8-core+ machine for decent pause
# times. If in doubt, and if you do not particularly want to tweak, go with
# 100 MB per physical CPU core.
# Important: HEAP_NEWSIZE = 100MB * number of cores (40 cores in our case, hence 4G).
# MAX_HEAP_SIZE="4G"
# HEAP_NEWSIZE="800M"
MAX_HEAP_SIZE="15G"
HEAP_NEWSIZE="4G"
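A quick way to compute HEAP_NEWSIZE from the core count on the host (assuming nproc reflects the cores you want to count; MAX_HEAP_SIZE remains a judgment call for your workload and RAM):
CORES=$(nproc)
echo "HEAP_NEWSIZE=$((100 * CORES))M"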
Updating the cassandra-topology.properties file.
If the servers are in data centers at different locations, we need to update this file as well and specify the rack within each DC.
Each entry maps a node as IP_ADDRESS=DC_NAME:RACK_NAME.
NOTE: This has to match the cassandra-rackdc.properties file.
10.130.18.35=DC1:RAC1
10.130.18.93=DC2:RAC1
10.130.18.98=DC2:RAC2
When using this format we need to update cassandra-rackdc.properties and set endpoint_snitch: GossipingPropertyFileSnitch in cassandra.yaml, as sketched below.
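For example, on node 10.130.18.93 (DC2:RAC1 in the mapping above), the two files would contain:
# cassandra-rackdc.properties on 10.130.18.93
dc=DC2
rack=RAC1
# cassandra.yaml
endpoint_snitch: GossipingPropertyFileSnitch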
Installing OpsCenter Monitoring for Cassandra.
Setting up OpsCenter for our Cassandra cluster.
Download the OpsCenter archive.
wget http://downloads.datastax.com/community/opscenter-5.0.2.tar.gz
Extracting OpsCenter.
Extract, create the symlink, and change the owner.
sudo tar xvzf opscenter-5.0.2.tar.gz -C /opt/
cd /opt/
sudo ln -s opscenter-5.0.2 opscenter
sudo chown cassandra:cassandra -R opscenter*
Configure OpsCenter.
Update the configuration file.
vim /opt/opscenter/conf/opscenterd.conf
Update the interface as below.
[webserver]
port = 8888
interface = 10.130.18.35
Configuring Agent.
Update the file below.
vim /opt/opscenter/agent/conf/address.yaml
Add the line below.
stomp_interface: "10.130.18.35"
Starting OpsCenter.
/opt/opscenter/bin/opscenter
Open the browser with the URL below.
http://10.130.18.35:8888/opscenter/index.html
- In the UI, select Manage Existing Cluster. (Manage an existing DataStax Enterprise or Cassandra cluster with OpsCenter.)
- Add the server IPs as below. Our cluster is running on the default JMX port 7199.
Newline is the separator.
10.130.18.35
10.130.18.93
10.130.18.98
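Once opscenterd is running, a quick check that the web server responds before working in the UI:
curl -I http://10.130.18.35:8888/opscenter/index.html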
Starting Agent Manually.
The agent can be started from OpsCenter, but if there are issues we can start it manually. (Make sure to update address.yaml as above.)
/opt/opscenter/agent/bin/datastax-agent