Hot questions for Using Cassandra in nosql


I hope someone is able to help, because I'm currently stuck trying to work with Cassandra.

My setup: For development, I have a minimal Cassandra 3.0.4 cluster with two nodes (one on my working machine, one in a VM). Usually only the local one is up and running. I use the latest Java driver, version 3.0.0, to connect to the cluster.

In each node's cassandra.yaml, rpc_address and listen_address are set to the IP of that node. The seed is my primary working machine.

My problem: Everything works fine from cqlsh (at any time) and from Java when both nodes are up and running. But as soon as I stop the one in the VM, my Spring-based application throws errors during startup:

2016-03-29 09:05:33.515 | INFO  | main                 | com.datastax.driver.core.NettyUtil                          :83    | Did not find Netty's native epoll transport in the classpath, defaulting to NIO.
2016-03-29 09:05:34.147 | INFO  | main                 | com.datastax.driver.core.policies.DCAwareRoundRobinPolicy   :95    | Using data-center name 'datacenter1' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
2016-03-29 09:05:34.149 | INFO  | main                 | com.datastax.driver.core.Cluster$Manager                    :1475  | New Cassandra host / added
2016-03-29 09:05:34.149 | INFO  | main                 | com.datastax.driver.core.Cluster$Manager                    :1475  | New Cassandra host / added
2016-03-29 09:05:34.150 | INFO  | main                 | my_company.cassandra.dao.impl.CassandraDaoImpl     :55    | Connected to cluster: TestCaseCluster
2016-03-29 09:05:34.151 | INFO  | main                 | my_company.cassandra.dao.impl.CassandraDaoImpl     :57    | Datacenter: datacenter1; Host: /; Rack: rack1, State: UP|true
2016-03-29 09:05:34.151 | INFO  | main                 | my_company.cassandra.dao.impl.CassandraDaoImpl     :57    | Datacenter: datacenter1; Host: /; Rack: rack1, State: UP|true
2016-03-29 09:05:34.220 | WARN  | luster1-nio-worker-2 | com.datastax.driver.core.SessionManager$7                   :378   | Error creating pool to /
com.datastax.driver.core.exceptions.ConnectionException: [/] Pool was closed during initialization
    at com.datastax.driver.core.HostConnectionPool$2.onSuccess( [cassandra-driver-core-3.0.0.jar:?]
    at com.datastax.driver.core.HostConnectionPool$2.onSuccess( [cassandra-driver-core-3.0.0.jar:?]
    at$ [guava-16.0.1.jar:?]
    at$SameThreadExecutorService.execute( [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at$CombinedFuture.setOneValue( [guava-16.0.1.jar:?]
    at$CombinedFuture.access$400( [guava-16.0.1.jar:?]
    at$CombinedFuture$ [guava-16.0.1.jar:?]
    at$SameThreadExecutorService.execute( [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at$FallbackFuture$1$1.onSuccess( [guava-16.0.1.jar:?]
    at$ [guava-16.0.1.jar:?]
    at$SameThreadExecutorService.execute( [guava-16.0.1.jar:?]
    at$ImmediateFuture.addListener( [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at$FallbackFuture$1.onFailure( [guava-16.0.1.jar:?]
    at$ [guava-16.0.1.jar:?]
    at$SameThreadExecutorService.execute( [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at$FallbackFuture$1$1.onFailure( [guava-16.0.1.jar:?]
    at$ [guava-16.0.1.jar:?]
    at$SameThreadExecutorService.execute( [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at$FallbackFuture$1.onFailure( [guava-16.0.1.jar:?]
    at$ [guava-16.0.1.jar:?]
    at$SameThreadExecutorService.execute( [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at$ [guava-16.0.1.jar:?]
    at$SameThreadExecutorService.execute( [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at [guava-16.0.1.jar:?]
    at com.datastax.driver.core.Connection$1.operationComplete( [cassandra-driver-core-3.0.0.jar:?]
    at com.datastax.driver.core.Connection$1.operationComplete( [cassandra-driver-core-3.0.0.jar:?]
    at io.netty.util.concurrent.DefaultPromise.notifyListener0( [netty-common-4.0.33.Final.jar:4.0.33.Final]
    at io.netty.util.concurrent.DefaultPromise.notifyListeners0( [netty-common-4.0.33.Final.jar:4.0.33.Final]
    at io.netty.util.concurrent.DefaultPromise.notifyListeners( [netty-common-4.0.33.Final.jar:4.0.33.Final]
    at io.netty.util.concurrent.DefaultPromise.tryFailure( [netty-common-4.0.33.Final.jar:4.0.33.Final]
    at$AbstractNioUnsafe.fulfillConnectPromise( [netty-transport-4.0.33.Final.jar:4.0.33.Final]
    at$AbstractNioUnsafe.finishConnect( [netty-transport-4.0.33.Final.jar:4.0.33.Final]
    at [netty-transport-4.0.33.Final.jar:4.0.33.Final]
    at [netty-transport-4.0.33.Final.jar:4.0.33.Final]
    at [netty-transport-4.0.33.Final.jar:4.0.33.Final]
    at [netty-transport-4.0.33.Final.jar:4.0.33.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$ [netty-common-4.0.33.Final.jar:4.0.33.Final]
    at [?:1.8.0_74]

In this example, I find the following line interesting:

Datacenter: datacenter1; Host: /; Rack: rack1, State: UP|true

Because this is the VM mentioned above, which is actually down:

Datacenter: datacenter1
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens       Owns (effective)  Host ID                               Rack
DN  89.12 KB   256          100.0%            197f6f0f-b820-4ab8-b7ef-bcc8773a345c  rack1
UN  96.26 KB   256          100.0%            db7d053b-f8d1-4a59-9cb2-3abf54b24687  rack1

DN should mean "Down" and "Normal", according to this excerpt from nodetool status. So as far as I understand it, the Java driver doesn't recognize the second node as down and still tries to connect to it, because it's in the list of (available) nodes.

Can this be an incompatibility issue caused by the combination of driver version and Cassandra version? I thought they were compatible: DataStax Java-Driver documentation on GitHub

Please ask if you need more info. I will update this text accordingly.

Thanks and regards.


edit1: I have initialized a keyspace with replication class SimpleStrategy and factor 3. I read somewhere that the number should not exceed the number of nodes (I guess it was somewhere in the documentation, but I don't have the link anymore). Can this be a reason?


A pity that no one seems to know about this kind of problem. After several attempts and searches on the Internet (where I found close to nothing on this particular problem), I was close to giving up on the idea.


Then two things came to my attention:

  1. The replication factor for my test keyspace was 3, while I only had two nodes. Not very sensible.
  2. If I had looked closely at the exception, I would have seen that this is a warning and not a fatal error.

So what?

I was still able to connect to the cluster and actually query it, but always gave up too early because of this exception.

Almost "Much Ado About Nothing".

Everything is working so far, and I can keep developing my application, as well as learn a lot about this high-availability NoSQL database and where it differs from a "classic" relational database, even if the query language has many similarities. It's quite exciting!

So: Sorry for the fuss!

Cheers, Daniel


Table:

CREATE TABLE TEST_PAYLOAD (
  TIME_STAMP timestamp,
  TYPE text,
  PRIMARY KEY (TIME_STAMP)
);

 time_stamp               | type
--------------------------+----------
 2013-05-15 00:00:00-0700 | sometext
 2013-05-16 00:00:00-0700 | sometext
 2013-05-17 00:00:00-0700 | sometext

SELECT * FROM TEST_PAYLOAD WHERE TIME_STAMP>='2013-05-15 00:00:00-0700';

code=2200 [Invalid query] message="Only EQ and IN relation are supported on the partition key (unless you use the token() function)"

It doesn't work for > or any other range selection, while it works for =. As far as the index is concerned, the table has only one primary key and no partition key. Why does it ask for token()?

I would like to retrieve a relative range, by date only or by date and time, rather than by a specific timestamp that exists in the database.


I guess you are a bit confused about the Cassandra terminology.

Please refer here

partition key: The first column declared in the PRIMARY KEY definition

That is, when you create a table like this:

 PRIMARY KEY (key1, key2, key3)

key1 is called the partition key and key2, key3 are called clustering keys.

In your case you don't have clustering keys, so the single primary key which you declared became the partition key.

Also, range queries (<, >) should be performed on clustering keys.

If you don't have any other candidates for the primary key, I think you should remodel your table like this:

CREATE TABLE TEST_PAYLOAD (
  BUCKET varchar,
  TIME_STAMP timestamp,
  TYPE text,
  PRIMARY KEY (BUCKET, TIME_STAMP)
);

For BUCKET you can provide the year or a year and month combination, so your keys would look like these: 2013, 2014, 06-2014, 10-2014, etc.

So while querying go to the desired bucket and do range scans like TIME_STAMP >= '2013-05-15 00:00:00-0700'
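The bucket value has to be derived the same way on every write and every read. A minimal sketch of one way to do that, assuming a year-month bucket computed in UTC (the class name TimeBucket, the method bucketFor, and the "yyyy-MM" format are illustrative choices, not something the answer prescribes):

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class TimeBucket {
    // Year-month pattern such as "2013-05". The format is an assumption;
    // any stable format works as long as writers and readers agree on it.
    private static final DateTimeFormatter MONTH_BUCKET =
            DateTimeFormatter.ofPattern("yyyy-MM").withZone(ZoneOffset.UTC);

    /** Returns the bucket (partition key value) for a given event time. */
    public static String bucketFor(Instant timestamp) {
        return MONTH_BUCKET.format(timestamp);
    }

    public static void main(String[] args) {
        // 2013-05-15 00:00:00-0700 expressed in UTC
        Instant ts = Instant.parse("2013-05-15T07:00:00Z");
        System.out.println(bucketFor(ts));
    }
}
```

With such a helper, the query becomes an equality on BUCKET plus a range on TIME_STAMP. A range that crosses a bucket boundary needs one query per bucket.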


Using the Cassandra object mapper API, I want to do a batch persist.

For a single object it works fine:

Mapper<MyObj> mapper = new MappingManager(getSession()).mapper(MyObj.class);

For a batch update I tried it this way, but Cassandra thinks I am persisting a List, so it throws an exception saying the @Table annotation was not found in List, which is the expected behavior:

Mapper<List<MyObj>> mapper = new MappingManager(getSession()).mapper(List.class);

Is there a way to achieve the same in Cassandra using a batch?

NOTE: Alternatively, I can do this by repeating the first step in a for loop, but I am looking for batch insertion with the object mapper.


Here is the snippet that I wanted to summarize from @adutra's discussion in the thread above.

You can take advantage of BatchStatement, which comes with the cassandra-core driver and is fully compatible with cassandra-mapper.

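Mapper<MyObj> mapper = new MappingManager(getSession()).mapper(MyObj.class);
BatchStatement batch = new BatchStatement();
for (MyObj obj : myObjs) batch.add(mapper.saveQuery(obj)); // assumed completion; myObjs is a hypothetical List<MyObj>
getSession().execute(batch);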


I'm getting started with Cassandra. I can connect with "cmd", and I have Java and Python installed. (My keyspace name is Alvaro, my table name is alumnos.) I have this issue:

Exception in thread "main" com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/ (null))
    at com.datastax.driver.core.ControlConnection.reconnectInternal(
    at com.datastax.driver.core.ControlConnection.connect(
    at com.datastax.driver.core.Cluster$Manager.init(
    at com.datastax.driver.core.Cluster.init(
    at com.datastax.driver.core.Cluster.connect(
    at com.datastax.driver.core.Cluster.connect(
    at Test2.main(

My program is pretty simple. I just want to connect and insert:


import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class Test2 {
    public static void main(String[] args) {
        Session session;
        // Connect to the cluster and keyspace "alvaro"
        Cluster cluster = Cluster.builder().addContactPoint("localhost").build();
        session = cluster.connect("alvaro");
        // Insert into alumnos
        session.execute("INSERT INTO alumnos (matricula, nombre, edad) VALUES (now(),'Alvaroxx',23)");
        cluster.close();
    }
}

If I run the app with only lines 1-8, it doesn't break. But I can't connect. I've been trying all week, installing a lot of jars. By the way, my jars are:

C:\apache-cassandra-3.0.18-bin\cassandra_jars\cassandra-driver-core-2.0.2.jar
C:\apache-cassandra-3.0.18-bin\cassandra_jars\guava-16.0.1.jar
C:\apache-cassandra-3.0.18-bin\cassandra_jars\metrics-core-3.0.2.jar
C:\apache-cassandra-3.0.18-bin\cassandra_jars\netty-3.9.0.Final.jar
C:\apache-cassandra-3.0.18-bin\cassandra_jars\slf4j-simple-1.7.5.jar
C:\apache-cassandra-3.0.18-bin\cassandra_jars\lz4-1.3.0.jar
C:\apache-cassandra-3.0.18-bin\cassandra_jars\snappy-java-
C:\apache-cassandra-3.0.18-bin\cassandra_jars\slf4j-api-1.7.5.jar

Please help me. Sorry for my bad English.


The driver can't connect to localhost. That means you either have a firewall blocking port 9042 on localhost, or your server is not running correctly.

Have you tried connecting with a 3rd party tool, like the cassandra nosql manager?

What are the logs from the server saying? Verify that the server is listening on the correct interface.


I have used the Cassandra CQL driver to implement a module. I know the CQL driver works on port 9042. My module works fine on port 9042 for Cassandra servers (I tried both local and remote). However, due to some constraints in the data center, port 9042 is not open for Cassandra. I need to make my application work in this data center.

Can we use the same code in some way with port 9160? I know 9160 is the thrift port and used for many other drivers for Cassandra. I was just wondering if there is any hack possible to make CQL driver work on 9160 or modify the code as little as possible to make it work on 9160.

Currently when I try to use the same code with port 9160 it gives the following error

com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: xx.yy.zz.aa (com.datastax.driver.core.ConnectionException: [xx.yy.zz.aa:9160] Unexpected error during transport initialization
(com.datastax.driver.core.ConnectionException: [xx.yy.zz.aa:9160] Operation timed out)), xx.yy.zz.aa:9160

When I try to telnet to port 9160:

telnet xx.yy.zz.aa 9160
Trying xx.yy.zz.aa...
Connected to xx.yy.zz.aa.
Escape character is '^]'.
Connection closed by foreign host.

Looking for some help.


You can make the CQL interface available on port 9160 by changing the native_transport_port from 9042 to 9160 in cassandra.yaml on your cassandra server(s). You will also need to change rpc_port to something other than 9160.

In the DataStax Java driver you can configure the port in Cluster.Builder by using the withPort method.


I was trying to understand how PagingState works with Statement in Cassandra. I tried a sample that inserts a few thousand records into the database and reads them back with the fetch size set to 10, using the paging state. This works perfectly fine. Here is my sample JUnit code:

public void setup() {
    cassandraTemplate.executeQuery("create table if not exists pagesample(a int, b int, c int, primary key(a,b))");
    String insertQuery = "insert into pagesample(a,b,c) values(?,?,?)";
    PreparedStatement insertStmt = cassandraTemplate.getConnection().prepareStatement(insertQuery);
    for (int i = 0; i < 5; i++) {
        for (int j = 100; j < 1000; j++) {
            cassandraTemplate.executeQuery(insertStmt, new Object[]{i, j, RandomUtils.nextInt()});
        }
    }
}

public void testPagination() {
    String selectQuery = "select * from pagesample where a=?";
    String pagingStateStr = null;
    for (int run = 0; run < 90; run++) {
        ResultSet resultSet = selectRows(selectQuery, 10, pagingStateStr, 1);
        int fetchedCount = resultSet.getAvailableWithoutFetching();
        System.out.println(run + ". Fetched size: " + fetchedCount);
        for (Row row : resultSet) {
            System.out.print(row.getInt("b") + ", ");
            if (--fetchedCount == 0) {
                break; // stop before the driver transparently fetches the next page
            }
        }
        PagingState pagingState = resultSet.getExecutionInfo().getPagingState();
        pagingStateStr = pagingState.toString();
    }
}

public ResultSet selectRows(String cql, int fetchSize, String pagingState, Object... bindings) {
    SimpleStatement simpleStatement = new SimpleStatement(cql, bindings);
    // Apply the requested page size and, if present, resume from the saved paging state
    simpleStatement.setFetchSize(fetchSize);
    if (pagingState != null) {
        simpleStatement.setPagingState(PagingState.fromString(pagingState));
    }
    return getSession().execute(simpleStatement);
}

When I execute this program, I see that every iteration in testPagination prints exactly 10 records. But here is what the documentation says:

  • Note that setting a fetch size doesn’t mean that Cassandra will always return the exact number of rows, it is possible that it returns slightly more or less results.

I don't really understand why Cassandra would return a different number of rows than the specified fetch size. Is this the case when no where clause is provided in the query? Will it return the exact number of records when a query is constrained on a partition key? Please clarify.


From the CQL protocol specification:

Clients should also not assert that no result will have more than result_page_size results. While the current implementation always respect the exact value of result_page_size, we reserve ourselves the right to return slightly smaller or bigger pages in the future for performance reasons

So it's good practice to always rely on getAvailableWithoutFetching instead of the page size, in case Cassandra changes its implementation in the future.


I created a table column with the timestamp datatype. When storing, it uses the default timestamp format yyyy-mm-dd HH:mm:ssZ, but when I try to select based on the timestamp, it doesn't return the record.

  TS timestamp,
  VALUE text,
  EMAILID text,

INSERT INTO TEST(TS,VALUE,EMAILID) VALUES('1418026180922','sometext','email@');
SELECT * FROM TEST WHERE TS='2014-12-08 00:38:10-0800'  ALLOW FILTERING;

This query returns 0 rows. Why does it return that? Am I doing something wrong?

It works for the query below:



Your query

SELECT * FROM TEST WHERE TS='2014-12-08 00:38:10-0800'  ALLOW FILTERING;

discards (or rather ignores) the millisecond part of the timestamp you used to insert the row

1418026180922
//        ^^^

And so the dates aren't equal.

If you had instead inserted

INSERT INTO TEST(TS,VALUE,EMAILID) VALUES('1418026180000','sometext','email@');

you could retrieve it with

select * from test where ts='2014-12-08 00:09:40-0800'; // note that I've fixed it from your '2014-12-08 00:38:10-0800'
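The mismatch can be checked without a database. The following is a small sketch using only java.time (the class and method names are made up for illustration): it renders the inserted epoch-millisecond value at the -08:00 offset used in the question and shows the value that remains once the milliseconds are dropped.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

public class TimestampMillis {
    /** Drops the millisecond part of an epoch-millisecond value. */
    public static long truncateToSeconds(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis)
                .truncatedTo(ChronoUnit.SECONDS)
                .toEpochMilli();
    }

    /** Renders an epoch-millisecond value at the -08:00 offset used in the question. */
    public static String atMinusEight(long epochMillis) {
        return OffsetDateTime
                .ofInstant(Instant.ofEpochMilli(epochMillis), ZoneOffset.ofHours(-8))
                .toString();
    }

    public static void main(String[] args) {
        long inserted = 1418026180922L; // value from the question's INSERT
        System.out.println(atMinusEight(inserted));      // 2014-12-08T00:09:40.922-08:00
        System.out.println(truncateToSeconds(inserted)); // 1418026180000
    }
}
```

The stored value carries .922 seconds, so an equality test against a literal with whole seconds cannot match it, and the wall-clock time is 00:09:40 rather than 00:38:10.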


Here's a strange one.

I've got a Cassandra table, call it, "test". It has several columns. These are queried via the Java driver, with a simple "select * from test where uuid=foo".

This works locally, all columns are returned.

In production environment, however, one column is dropped, despite it having data in it (which demonstrates the column can be written to).

I've looked at ColumnDefinitions for the Row object, and it indeed does not contain the column in question.

Why? Anyone even have a workable theory?


If you are using 'select *' as a PreparedStatement, you will need to re-prepare your statement if your schema changes (columns added or removed). Cassandra doesn't update its prepared statement cache on schema change until 2.1.3, so this will not work on previous versions.

See: Cassandra nodejs DataStax driver don't return newly added columns via prepared statement execution


Suppose that I have a table in Cassandra which has a map field (map<int, text>) named map1. And I want to execute this statement: SELECT * from TABLE1 WHERE map1 = ? After creating an instance of PreparedStatement, I will need to call bind on it with a valid value for ? place-holder. How can I do that?

This is my incomplete code:

PreparedStatement stmt = session.prepare("SELECT * from TABLE1 WHERE map1 = ?");
session.execute(stmt.bind(?));

P.S. Assume I have enabled "ALLOW FILTERING"


The bindmarker ? is a positional placeholder, and positions are zero-based, so the first bindmarker gets the index 0 and so on. You can bind a value for this placeholder in two ways:

  1. Use the BoundStatement.setMap(int, Map) method:

    BoundStatement bs = stmt.bind();
    Map<Integer, String> map1 =...
    bs.setMap(0, map1); // 0 is the zero-based index of your bindmarker
  2. Use the PreparedStatement.bind(Object...) method, which is essentially a short-hand for the code above; each argument must correspond to a positional bindmarker in the query:

    Map<Integer, String> map1 =...
    BoundStatement bs = stmt.bind(map1);

And finally, because your column is of type map<int,text>, the driver naturally maps it in Java to Map<Integer, String>. That's why you should use the setMap() method in BoundStatement.


Is it a good idea to send batch update statements spanning multiple tables? Will it generate a heavier workload on the coordinator node?


No, it is a bad idea to send a LOGGED BATCH spanning multiple partitions, because it can overload the coordinator.

If you need to optimize network usage and don't care about the batching guarantee (that all mutations will eventually be applied), use an UNLOGGED BATCH.


I have worked with Cassandra quite a bit and feel like I have already put up with enough BS from the database over the years. Just wondering why this isn't working in Apache Cassandra 2.1.0 (or 2.0.8) w/ 2.1.1 of the datastax java driver. It really seems like this should work.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Metadata;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import java.util.Random;

public class BS {

    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("").build();
        Metadata metadata = cluster.getMetadata();
        System.out.printf("Connected to cluster: %s\n", metadata.getClusterName());
        Session session = cluster.connect();

        session.execute("CREATE KEYSPACE IF NOT EXISTS fook WITH replication= {'class':'SimpleStrategy', 'replication_factor':1 }");
        session.execute("CREATE TABLE IF NOT EXISTS fook.dooftbl (id bigint PRIMARY KEY, doof text, ownerid bigint)");

        long id = new Random().nextLong();
        long ownerid = new Random().nextLong();
        String doof = "broken db";

        String load = "INSERT INTO fook.dooftbl (id,doof,ownerid) VALUES (?,?,?)";
        PreparedStatement ps = session.prepare(load);
        session.execute(ps.bind(id, doof, ownerid)); // write the row that is read back below

        try {
            String cql = "SELECT doof FROM fook.dooftbl WHERE id=?";
            PreparedStatement ps2 = session.prepare(cql);
            ResultSet rs = session.execute(ps2.bind(id));
            System.out.println("Result set: " + rs.toString() + " size: " + rs.all().size() + " fullyFetched:" + rs.isFullyFetched());

            //Row one = rs.one();
            //if (one!=null)
            //  System.out.println("It worked. You will never have to worry about seeing this msg.");

            String msg = null;
            for (Row r : rs.all()) {
                msg = r.getString("doof");
            }
            System.out.println("msg:" + msg);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Am I doing something wrong here or is it something relatively minor?


Connected to cluster: Test Cluster

Result set: ResultSet[ exhausted: false, Columns[doof(varchar)]] size: 1 fullyFetched:true



You're invoking

rs.all()

which, as the javadoc says,

Returns all the remaining rows in this ResultSet as a list.

Note that, contrary to iterator() or successive calls to one(), this method forces fetching the full content of the ResultSet at once, holding it all in memory in particular. It is thus recommended to prefer iterations through iterator() when possible, especially if the ResultSet can be big.

So, when you then do

Row one = rs.one();

which, as the javadoc states, returns

the next row in this resultSet or null if this ResultSet is exhausted.

it will return null since the ResultSet is exhausted.

You would have seen

Result set: ResultSet[ exhausted: false, Columns[doof(varchar)]] size: 1 fullyFetched:true   

in your logs which shows the row you added previously.
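The same one-shot behavior can be reproduced with a plain java.util.Iterator. This is only an analogy, not the driver's ResultSet: the helper names drain and nextOrNull are hypothetical stand-ins for all() and one().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class OneShotRows {
    /** Drains every remaining element, similar in spirit to all(). */
    public static <T> List<T> drain(Iterator<T> rows) {
        List<T> out = new ArrayList<>();
        while (rows.hasNext()) {
            out.add(rows.next());
        }
        return out;
    }

    /** Returns the next element, or null when nothing is left, similar to one(). */
    public static <T> T nextOrNull(Iterator<T> rows) {
        return rows.hasNext() ? rows.next() : null;
    }

    public static void main(String[] args) {
        Iterator<String> rows = Arrays.asList("broken db").iterator();
        List<String> first = drain(rows);   // consumes the only row
        List<String> second = drain(rows);  // nothing left to return
        System.out.println(first.size() + " " + second.size() + " " + nextOrNull(rows)); // 1 0 null
    }
}
```

In the question's code, calling rs.all() once, keeping the returned list, and using that list for both the size and the loop avoids reading from an already exhausted result set.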