Installing Cassandra 1.2.4 on FreeBSD 9.1
I've been wanting to try out Cassandra for a while now, ever since I heard how easy replication is. Let's start with the basics:
- Cassandra is a top-level Apache project (it is not part of Apache Hadoop, although it can integrate with it)
- Cassandra is a NoSQL database
- Cassandra needs Java
- Cassandra has two ways to manipulate data: the Thrift API and CQL
- Cassandra can use CQL (Cassandra Query Language) for its queries - parts of it resemble SQL but it's not the same thing
Requirements:
- Java needs to be installed already. If you don't have it installed, follow my post here, steps 1-4, to install OpenJDK. Note: Instead of steps 1-3 you can use this script.
1. Download Cassandra to the /tmp directory (16 MB):
cd /tmp ; wget ftp://apache.cs.utah.edu/apache.org/cassandra/1.2.4/apache-cassandra-1.2.4-bin.tar.gz
2. Create a directory to untar it to. According to http://wiki.apache.org/cassandra/GettingStarted, Cassandra's default settings expect its data directories to be under /var/lib/cassandra. It's easy to change those.
mkdir -p /opt
3. Untar it - in this case we untar to the directory we previously created, /opt:
tar -xzf apache-cassandra-1.2.4-bin.tar.gz -C /opt
4. Now let's change the directories that Cassandra expects to find. To do that we need to edit the file /opt/apache-cassandra-1.2.4/conf/cassandra.yaml:
ee /opt/apache-cassandra-1.2.4/conf/cassandra.yaml
5. Change lines 106-107:
data_file_directories:
    - /var/lib/cassandra/data
to:
data_file_directories:
    - /opt/cassandra/data
6. Change line 110:
commitlog_directory: /var/lib/cassandra/commitlog
to:
commitlog_directory: /opt/cassandra/commitlog
7. Change line 188:
saved_caches_directory: /var/lib/cassandra/saved_caches
to:
saved_caches_directory: /opt/cassandra/saved_caches
8. Now create the above 3 directories:
mkdir -p /opt/cassandra/data
mkdir -p /opt/cassandra/commitlog
mkdir -p /opt/cassandra/saved_caches
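If you'd rather not edit the file by hand, steps 5-8 can be scripted. Here is a minimal sketch in plain POSIX sh (not csh), assuming all three settings still point under /var/lib/cassandra as in the stock cassandra.yaml. The function name relocate_cassandra_dirs is my own invention, not something that ships with Cassandra. It writes to a temporary file and moves it into place instead of using sed -i, because the in-place flag behaves differently on BSD and GNU sed.

```shell
# Hypothetical helper: point cassandra.yaml at a new base directory
# and create the three directories Cassandra expects there.
# Usage: relocate_cassandra_dirs <path-to-cassandra.yaml> <new-base-dir>
relocate_cassandra_dirs() {
    yaml=$1
    base=$2
    tmp="$yaml.tmp.$$"
    # Replace the default prefix everywhere it appears in the file
    # (this also touches comments that mention the old path)
    sed "s|/var/lib/cassandra|$base|g" "$yaml" > "$tmp" || return 1
    mv "$tmp" "$yaml" || return 1
    # Create the data, commitlog and saved_caches directories
    mkdir -p "$base/data" "$base/commitlog" "$base/saved_caches"
}
```

For the layout used in this post you would call it as: relocate_cassandra_dirs /opt/apache-cassandra-1.2.4/conf/cassandra.yaml /opt/cassandra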
9. By default Cassandra also expects its logs to go under /var/log/cassandra/
This will work fine with FreeBSD but the directory needs to be created - create it with:
mkdir /var/log/cassandra
If your needs are different and you need to change this setting, you can do that in the file /opt/apache-cassandra-1.2.4/conf/log4j-server.properties on line 35:
log4j.appender.R.File=/var/log/cassandra/system.log
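If you do change the log location, you can read it back from the properties file instead of hard-coding it elsewhere. A small sketch in POSIX sh, assuming the log4j.appender.R.File key shown above is the one in use; the function name cassandra_log_file is hypothetical.

```shell
# Hypothetical helper: print the log file path configured in
# log4j-server.properties (the value of log4j.appender.R.File).
# Usage: cassandra_log_file <path-to-log4j-server.properties>
cassandra_log_file() {
    # -n suppresses normal output; p prints only lines where the
    # key was matched and stripped, leaving just the value
    sed -n 's/^log4j\.appender\.R\.File=//p' "$1"
}
```

For example, this creates whatever log directory is currently configured: mkdir -p "$(dirname "$(cassandra_log_file /opt/apache-cassandra-1.2.4/conf/log4j-server.properties)")"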
10. Set the PATH for Cassandra so you don't have to change to the correct directory all the time:
set path=(/opt/apache-cassandra-1.2.4/bin $path)
11. Make the PATH change permanent so that every time you log in it's still set:
echo 'set path=(/opt/apache-cassandra-1.2.4/bin $path)' >> ~/.cshrc
12. Start Cassandra with the -f switch to have it run in the foreground - recommended for the first run so you can see whether or not you get any errors:
cassandra -f
Note: If you get an error like this:
Error: Exception thrown by the agent : java.net.MalformedURLException: Local host name unknown: java.net.UnknownHostException: weirdbricks: weirdbricks
The solution is here: http://www.wowza.com/forums/showthread.php?337-Malformed-URL-exception
In my /etc/hosts I had this entry:
127.0.0.1 localhost localhost.my.domain
I replaced it with:
127.0.0.1 weirdbricks localhost localhost.my.domain
and tried the command again:
cassandra -f
If all went well you will get loads of text but the end should look like this:
INFO 18:22:28,485 Binding thrift service to localhost/127.0.0.1:9160
INFO 18:22:28,553 Using TFramedTransport with a max frame size of 15728640 bytes.
INFO 18:22:28,569 Using synchronous/threadpool thrift server on localhost : 9160
INFO 18:22:28,572 Listening for thrift clients...
INFO 18:22:38,482 Created default superuser 'cassandra'
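Going back to the hostname error in the note above: you can check for it before starting Cassandra at all. This is a rough sketch in POSIX sh, assuming the machine resolves its own name through /etc/hosts rather than DNS. The function name hosts_has_name is made up for this post.

```shell
# Hypothetical helper: succeed if <name> appears as a hostname or alias
# on a non-comment line of the given hosts file.
# Usage: hosts_has_name <hosts-file> <name>
hosts_has_name() {
    awk -v name="$2" '
        $1 ~ /^#/ { next }                   # skip comment lines
        {
            for (i = 2; i <= NF; i++) {      # field 1 is the IP address
                if ($i ~ /^#/) break         # stop at a trailing comment
                if ($i == name) found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

A quick check would be: hosts_has_name /etc/hosts "$(hostname)" || echo "add $(hostname) to /etc/hosts"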
13. OK - now use CTRL+C to stop it and restart without the -f flag so it runs in the background:
cassandra
Note: If you get stuck here:
INFO 18:31:16,933 Node localhost/127.0.0.1 state jump to normal
INFO 18:31:16,944 Startup completed! Now serving reads.
Just press ENTER to continue - then you'll be back at the shell.
14. Make sure Cassandra is running - check the open ports:
sockstat -4 | grep 9160
You should get something like this:
root java 1163 55 tcp4 127.0.0.1:9160 *:*
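A plain grep for 9160 will also match a PID or a foreign port that happens to contain those digits. If you want a stricter check, here is a sketch in POSIX sh that looks only at the local address column. It assumes the sockstat column layout shown above (local address in the sixth column); listening_on is a name I picked, not a system command.

```shell
# Hypothetical helper: read `sockstat -4` output on stdin and succeed
# if some socket's local address ends in :<port>.
# Usage: sockstat -4 -l | listening_on 9160
listening_on() {
    awk -v port="$1" '
        {
            n = split($6, addr, ":")         # column 6 is the local address
            if (n >= 2 && addr[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

Used like this: sockstat -4 -l | listening_on 9160 && echo "Thrift port is open"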
15. Now connect with cassandra-cli:
cd /opt/apache-cassandra-1.2.4/bin
./cassandra-cli
You should see something like this:
Connected to: "Test Cluster" on 127.0.0.1/9160
Welcome to Cassandra CLI version 1.2.4
Type 'help;' or '?' for help. Type 'quit;' or 'exit;' to quit.
16. To make Cassandra start up on boot let's write a small script.
Save the script under /opt/apache-cassandra-1.2.4/bin/cassandra-boot.sh and add the following code:
#!/bin/csh
# set the cassandra dir
set cassandra_dir=/opt/apache-cassandra-1.2.4/bin
# set the cassandra log dir
set cassandra_log_dir=/var/log/cassandra
# set the boot log file including a date
set boot_log_file=$cassandra_log_dir/boot-`date +%m-%d-%Y-%H%M%S`.log
# check if cassandra is already running
set cassandra_running=`ps auxwww | grep java | grep cassandra | grep -v grep | wc -l`
if ($cassandra_running >= 1) then
    logger -s "Cassandra already running!"
    exit 1
endif
logger -s "Cassandra starting.."
# start cassandra
$cassandra_dir/cassandra -p /var/run/cassandra.pid > $boot_log_file
# check if cassandra started successfully ($status holds the last exit code in csh)
if ($status == 0) then
    logger -s "Cassandra started - PID: `cat /var/run/cassandra.pid`"
else
    logger -s "Cassandra did not start!"
endif
# housekeeping - delete log files older than 3 days
find $cassandra_log_dir -type f -mtime +3 -exec rm {} \;
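The housekeeping step at the end of the script is the part most worth trying on a scratch directory first, since it deletes files. Here is a minimal sketch of the same idea as a POSIX sh function, restricted to regular files so it never tries to remove the log directory itself. The name purge_old_logs is hypothetical.

```shell
# Hypothetical helper: delete regular files under <dir> whose
# modification time is more than <days> days old.
# Usage: purge_old_logs <dir> <days>
purge_old_logs() {
    # The escaped semicolon terminates the -exec clause
    find "$1" -type f -mtime +"$2" -exec rm -f {} \;
}
```

Calling purge_old_logs /var/log/cassandra 3 would do the same cleanup as the last line of the boot script.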
17. Edit your crontab
ee /etc/crontab
Add the following:
@reboot root /opt/apache-cassandra-1.2.4/bin/cassandra-boot.sh
Now on the next reboot Cassandra will start automatically!
18. Oops, almost forgot to make the script executable!
chmod +x /opt/apache-cassandra-1.2.4/bin/cassandra-boot.sh
19. To kill Cassandra type:
kill `cat /var/run/cassandra.pid`
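The PID file can be left behind if Cassandra dies on its own, so it's worth checking that the PID still belongs to a live process before you rely on it. A small sketch in POSIX sh; pid_alive is my own name for it. Note that kill -0 also fails if the process belongs to another user, so run it as root or as the user that started Cassandra.

```shell
# Hypothetical helper: succeed if the PID stored in <pidfile> belongs
# to a process that is still running.
# Usage: pid_alive <pidfile>
pid_alive() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    [ -n "$pid" ] || return 1
    # Signal 0 delivers nothing; it only checks that the process exists
    kill -0 "$pid" 2>/dev/null
}
```

For example: pid_alive /var/run/cassandra.pid && kill "$(cat /var/run/cassandra.pid)"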
20. To start Cassandra again after stopping it, type from anywhere (the bin directory is already in your PATH from step 10):
cassandra-boot.sh
References:
http://wiki.apache.org/cassandra/GettingStarted
http://wiki.apache.org/cassandra/RunningCassandra