YCSB Tutorial
Workshop in information security by Yosi Barad, Ainat Chervin and Ilia Oshmiansky
Introduction:
This document aims to provide a few easy-to-follow steps for the first-time user.
We will cover the following subjects regarding the YCSB benchmark tool:
- Installation and configuration of YCSB.
- Run YCSB.
- Examples of usage: benchmark Cassandra using YCSB.
Installation and configuration of YCSB:
- YCSB uses Maven to build its Java source code, so you first need to install Maven on your machine. You can download it from here:
- Download YCSB from here:
- Extract the Maven files, e.g. to /specific/disk1/temp/maven/.
- Extract the YCSB files, e.g. to /specific/disk1/temp/YCSB/.
- Set environment variables:
- Add the following new variables and values to the system:
- setenv MAVEN_HOME "/specific/disk1/temp/maven"
This should be the path to the Maven folder.
- setenv JAVA_HOME "/usr/local/lib/jdk-6u25-ea-bin-b03"
This should be the path to the Java folder.
- Open a terminal or cmd and use Maven to build the YCSB source code.
Enter the following command from the YCSB folder: mvn clean package
This may take a few minutes to complete.
Once the build phase is completed, it should print: BUILD SUCCESS.
- Afterwards, edit the PATH environment variable and add the jar files of the database you would like to benchmark, both from its home folder and from the YCSB folder.
For example if you would like to benchmark Cassandra database you should edit the PATH variable in the following way:
setenv PATH "/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/games:/usr/local32/bin:/specific/scratches/parallel/yosibar1-2012-10-31/cassandra/lib:/specific/scratches/parallel/yosibar1-2012-10-31/YCSB/cassandra/target:."
Here we add two paths to the PATH variable: first the path to the Cassandra lib folder, which contains the Cassandra jar library files, and then the YCSB/cassandra/target folder, which contains the YCSB build jar file for Cassandra.
- Finally check your environment variables and verify that the PATH variable has changed successfully.
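For sh or bash users, the installation steps above can be sketched as follows. This is a minimal sketch, not part of the original setup: it assumes the example locations used in this tutorial, writes the variable values without a trailing `:.` on the assumption that each one names a single directory, and uses two hypothetical helpers, `build_succeeded` and `path_contains`. The build itself is skipped when `mvn` or the YCSB folder is not present.

```shell
#!/bin/sh
# sh/bash counterparts of the csh setenv lines (example locations from this tutorial).
MAVEN_HOME="/specific/disk1/temp/maven"
JAVA_HOME="/usr/local/lib/jdk-6u25-ea-bin-b03"
export MAVEN_HOME JAVA_HOME

# Example location of the extracted YCSB sources.
YCSB_DIR="/specific/disk1/temp/YCSB"

# Hypothetical helper: succeed if the Maven log file $1 reports BUILD SUCCESS.
build_succeeded() {
    grep -q "BUILD SUCCESS" "$1"
}

# Hypothetical helper: succeed if directory $1 is an entry of the
# colon-separated list $2 (defaults to the current PATH).
path_contains() {
    case ":${2-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Build YCSB only when Maven and the source folder are actually available.
if command -v mvn >/dev/null 2>&1 && [ -d "$YCSB_DIR" ]; then
    (cd "$YCSB_DIR" && mvn clean package > build.log 2>&1)
    build_succeeded "$YCSB_DIR/build.log" && echo "YCSB build succeeded"
fi

# Report whether the YCSB Cassandra target folder is on the PATH.
if path_contains "$YCSB_DIR/cassandra/target"; then
    echo "YCSB Cassandra jars are on the PATH"
else
    echo "YCSB Cassandra jars are NOT on the PATH"
fi
```

Keeping the PATH check in a function makes it easy to repeat for the Cassandra lib folder, e.g. `path_contains /path/to/cassandra/lib`.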
Run YCSB:
- Now we are ready to use YCSB:
Enter the following command in the command prompt (or terminal) from the YCSB folder location:
bin/ycsb
- YCSB should display the help menu.
This describes the commands, databases and options that YCSB supports.
You can run commands directly against the database using the shell command, for example:
bin/ycsb shell basic
You may replace the basic database in this example with any supported database you would like, but first make sure the database runs correctly before you try to invoke it with the YCSB client shell.
Moreover, you can start benchmarking your database using the load and run commands, but note that you must first create a table called "usertable" in the database before you start an automated test.
For example, in Cassandra you first need to create a keyspace called "usertable" and use it to create a column family called "data"; only afterwards can you start and run the tests. You may find more details regarding the creation of "usertable" in various databases and regarding running workloads in the YCSB documentation.
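The keyspace and column family described above can also be created from a statement file instead of being typed interactively. The following is a sketch under assumptions: it relies on the `-f` option of cassandra-cli for reading statements from a file, the file name `usertable_schema.txt` is hypothetical, the host and port are the example values used in this tutorial, and no credentials are passed. The client is only invoked when `bin/cassandra-cli` exists in the current folder.

```shell
#!/bin/sh
# Hypothetical file name for the cassandra-cli statements.
SCHEMA_FILE="usertable_schema.txt"

# Write the statements that create the keyspace and column family YCSB expects.
cat > "$SCHEMA_FILE" <<'EOF'
create keyspace usertable;
use usertable;
create column family data;
EOF

# Run the statements only if the Cassandra client is present
# (run this from the Cassandra home folder).
if [ -x bin/cassandra-cli ]; then
    bin/cassandra-cli -host 127.0.0.1 -p 9170 -f "$SCHEMA_FILE"
else
    echo "bin/cassandra-cli not found; statements left in $SCHEMA_FILE"
fi
```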
Examples of usage: benchmark Cassandra using YCSB
- First we'll bring the Cassandra server up and open the client shell with:
bin/cassandra-cli -host <ip address> -p 9170 -u <username> -pw <password>
for example:
bin/cassandra-cli -host 127.0.0.1 -p 9170 -u yosi -pw 123
- Next we'll create a new keyspace called usertable:
- create keyspace usertable;
- We'll use the keyspace and create a new column family called data:
- use usertable;
- create column family data;
- Now we are ready to benchmark Cassandra using YCSB.
In this example we'll use workloada from the YCSB core workloads, which is a 50/50 mix of reads and updates against the database.
First let's use the load command to insert into the Cassandra database the values that will be read. Afterwards we'll use the run command to perform the benchmark test of workloada.
Enter the following command in the command prompt (or terminal) from the YCSB folder location:
bin/ycsb load cassandra-10 -p hosts="132.67.105.254" -P workloads/workloada > workloada_res.txt
hosts="132.67.105.254" refers to the IP address Cassandra listens on.
-P workloads/workloada refers to the workload being used.
This should create a file called workloada_res.txt that contains the load phase information.
Next we need to run the workload using the YCSB run command.
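The load and run phases can be sketched as one small script. This is a sketch under assumptions rather than the exact commands used in the workshop: the run command is taken to mirror the load command with `run` in place of `load`, the helper `ycsb_cmd` and the output file name `workloada_run_res.txt` are hypothetical, and the commands are only printed unless `bin/ycsb` exists in the current folder.

```shell
#!/bin/sh
HOSTS="132.67.105.254"          # IP address Cassandra listens on (example value from this tutorial)
WORKLOAD="workloads/workloada"  # core workload file

# Hypothetical helper: print the YCSB command line for phase $1 ("load" or "run").
ycsb_cmd() {
    echo "bin/ycsb $1 cassandra-10 -p hosts=$HOSTS -P $WORKLOAD"
}

LOAD_CMD=$(ycsb_cmd load)
RUN_CMD=$(ycsb_cmd run)

# Show the two command lines, including the redirection of their output.
echo "$LOAD_CMD > workloada_res.txt"
echo "$RUN_CMD > workloada_run_res.txt"

# Execute both phases only when the YCSB launcher is available.
if [ -x bin/ycsb ]; then
    $LOAD_CMD > workloada_res.txt      # load phase: insert the records
    $RUN_CMD > workloada_run_res.txt   # run phase: perform the benchmark
fi
```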
The benchmark results will appear in terms of throughput, latency and run time.
For example, in this test we performed 10,000 operations: 89454 of the operations returned successfully after 1 millisecond, 4493 operations returned after 2 milliseconds, and so on.
The overall throughput was 1872 operations per second.
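To pull the summary figures out of a results file, something like the following could be used. It is a sketch that assumes the run output contains summary lines of the form `[OVERALL], Throughput(ops/sec), <value>`; the helper name `ycsb_overall` and the file name `workloada_run_res.txt` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the [OVERALL] metric named $2
# (e.g. "Throughput(ops/sec)" or "RunTime(ms)") found in results file $1.
ycsb_overall() {
    awk -F', ' -v metric="$2" '$1 == "[OVERALL]" && $2 == metric { print $3 }' "$1"
}

RESULTS="workloada_run_res.txt"   # hypothetical name of the run-phase output file

# Print the summary only when the results file exists.
if [ -f "$RESULTS" ]; then
    echo "Throughput: $(ycsb_overall "$RESULTS" 'Throughput(ops/sec)') ops/sec"
    echo "Run time:   $(ycsb_overall "$RESULTS" 'RunTime(ms)') ms"
fi
```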
You may change the operations count to 50000 for the workload by adding:
-p operationcount=50000 to the command line.
You may change the number of threads for YCSB to invoke the benchmark by adding:
-p threadcount=100.
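Putting both properties on one command line might look as follows. This sketch assumes the run command takes the same form as the load command shown earlier; the helper `with_props` is hypothetical and simply appends one `-p key=value` option per argument. The command is only printed unless `bin/ycsb` exists in the current folder.

```shell
#!/bin/sh
# Hypothetical helper: print command $1 followed by "-p key=value"
# for every remaining key=value argument.
with_props() {
    cmd="$1"
    shift
    for kv in "$@"; do
        cmd="$cmd -p $kv"
    done
    echo "$cmd"
}

BASE='bin/ycsb run cassandra-10 -p hosts=132.67.105.254 -P workloads/workloada'
RUN_CMD=$(with_props "$BASE" operationcount=50000 threadcount=100)
echo "$RUN_CMD"

# Execute the command only when the YCSB launcher is available.
if [ -x bin/ycsb ]; then
    $RUN_CMD
fi
```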
You may add any other property parameters to your workload by changing the YCSB source code using the getProperty mechanism (you may check the Java files and Javadoc for more information). After you insert your changes into the code, build the source code again using the "mvn clean package" command from the YCSB directory, and add the relevant parameter to the YCSB command using -p key=value.