Hot questions on using Cassandra in DataStax Enterprise


Question:

I'm using com.datastax.cassandra cassandra-driver-core version 1.0.0 with Java 7 to connect to an Apache Cassandra cluster (version 1.2.12). While 1.2.12 is the version I am using today, that version is subject to change, and I would like to know, if possible, how to retrieve the Cassandra version programmatically (presumably using the driver, though I'm open to other suggestions).

I found Cluster.getMetadata() and Cluster.getMetadata().getKeyspace(), which return a Metadata object and a KeyspaceMetadata object, respectively, but neither of those seems to have any method that would return the version.

Any help is appreciated.

Answer:

Thanks to both Mikhail and Bryce, I've come up with the answer. This method returns the Cassandra version, or "Offline" if the cluster is down. I've tested this code, and it works flawlessly.

private String getCassandraVersion() {
    String[] cassandraServers = <insert your servers here>;
    String keyspace = "system";
    Cluster cluster = null;
    try {
        cluster = Cluster.builder().addContactPoints(cassandraServers)
                .build();
        Session session = cluster.connect(keyspace);
        // system.local holds a single row (key = 'local') describing this node
        PreparedStatement ps = session.prepare(
                "select release_version from system.local where key = 'local'");
        Row versionRow = session.execute(ps.bind()).one();
        if (versionRow != null) {
            return versionRow.getString("release_version");
        }
    } catch (Exception ex) {
        _log.error("Failed to connect to '" + keyspace + "' keyspace!", ex);
    } finally {
        if (cluster != null) {
            try {
                // shutting down the Cluster also closes any sessions it created
                cluster.shutdown();
            } catch (Exception ex) {
                // NOP
            }
        }
    }
    return "Offline";
}

Thanks again guys!


Answer:

In addition to Mikhail's very concise answer, let me just add that you can query release_version and other important items that a node knows about itself from the local column family in the system keyspace:

cqlsh> use system;
cqlsh:system> desc table local;

CREATE TABLE local (
  key text,
  bootstrapped text,
  cluster_name text,
  cql_version text,
  data_center text,
  gossip_generation int,
  host_id uuid,
  native_protocol_version text,
  partitioner text,
  rack text,
  release_version text,
  schema_version uuid,
  thrift_version text,
  tokens set<text>,
  truncated_at map<uuid, blob>,
  PRIMARY KEY ((key))
)...

That column family should only ever contain one row, keyed by key='local'.

For more information, check this doc: The Data Dictionary in Cassandra 1.2
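The same row can also be read through the Java driver rather than cqlsh. A minimal sketch (the contact point is an assumption, and a 2.x-or-later driver is assumed so the Cluster can be closed with close()):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class LocalNodeInfo {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        // The single row in system.local describes the node we are connected to
        Row local = session.execute(
                "SELECT cluster_name, data_center, rack, release_version "
              + "FROM system.local WHERE key = 'local'").one();

        System.out.printf("%s / %s / %s running Cassandra %s%n",
                local.getString("cluster_name"),
                local.getString("data_center"),
                local.getString("rack"),
                local.getString("release_version"));

        cluster.close();
    }
}
```

Note that the values describe only the coordinator node the driver happened to query, not the whole cluster.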

Question:

My code executes a bound statement that was prepared elsewhere, so I never see the exact query. How can I get the CQL it is actually running against the Cassandra database?

For example:

public <T> void save(T entity) {
    if (entity != null) {
        Statement statement = getEntityMapper(entity).saveQuery(entity);
        statement.setConsistencyLevel(consistencyLevelWrite);
        mappingManager.getSession().execute(statement);
    }
}

I am trying to get something like INSERT INTO "keyspace"."tableName"("column1","column2") VALUES (value1,value2)


Answer:

My most generic answer is to enable the query logger. It will show executed queries in your application logs.

If you need something more specific and want to manipulate the query string in your own code, you can take inspiration from the implementation: QueryLogger.java. In this particular case, you can get the "generic" query string (with placeholders) by casting to BoundStatement and invoking .preparedStatement().getQueryString() on it, then inspecting the bound statement for the values of the placeholders. As you'll see in the code, QueryLogger handles a lot of corner cases (e.g. truncating large parameters).
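A rough sketch of both approaches (assuming a 3.x driver, where QueryLogger.builder() takes no arguments; the contact point, keyspace, and column types are illustrative):

```java
import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.QueryLogger;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.Statement;

public class ShowBoundCql {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

        // Approach 1: register the query logger; executed queries then appear
        // in the com.datastax.driver.core.QueryLogger.* logs at DEBUG level
        cluster.register(QueryLogger.builder().build());

        Session session = cluster.connect();
        PreparedStatement ps = session.prepare(
                "INSERT INTO ks.tbl (column1, column2) VALUES (?, ?)");
        // e.g. what a mapper's saveQuery(entity) hands back
        Statement statement = ps.bind("value1", 42);

        // Approach 2: recover the generic query string (with ? placeholders)
        if (statement instanceof BoundStatement) {
            BoundStatement bound = (BoundStatement) statement;
            System.out.println(bound.preparedStatement().getQueryString());
            // then read bound.getString(0), bound.getInt(1), ... for the values
        }

        session.execute(statement);
        cluster.close();
    }
}
```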

Question:

I have 5GB worth of data in DSE 4.8.9. I am trying to load the same data into DSE 5.0.2. The command I use is the following:

root@dse:/mnt/cassandra/data$ sstableloader -d 10.0.2.91 /mnt/cassandra/data/my-keyspace/my-table-0b168ba1637111e6b40131c603254a9b/

This gives me following exception:

DEBUG 15:27:12,850 Using framed transport.
DEBUG 15:27:12,850 Opening framed transport to: 10.0.2.91:9160
DEBUG 15:27:12,850 Using thriftFramedTransportSize size of 16777216
DEBUG 15:27:12,851 Framed transport opened successfully to: 10.0.2.91:9160
Could not retrieve endpoint ranges: 
InvalidRequestException(why:unconfigured table schema_columnfamilies)
java.lang.RuntimeException: Could not retrieve endpoint ranges:
    at org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:342)
    at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:156)
    at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:109)
Caused by: InvalidRequestException(why:unconfigured table schema_columnfamilies)
    at org.apache.cassandra.thrift.Cassandra$execute_cql3_query_result$execute_cql3_query_resultStandardScheme.read(Cassandra.java:50297)
    at org.apache.cassandra.thrift.Cassandra$execute_cql3_query_result$execute_cql3_query_resultStandardScheme.read(Cassandra.java:50274)
    at org.apache.cassandra.thrift.Cassandra$execute_cql3_query_result.read(Cassandra.java:50189)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_execute_cql3_query(Cassandra.java:1734)
    at org.apache.cassandra.thrift.Cassandra$Client.execute_cql3_query(Cassandra.java:1719)
    at org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:321)
    ... 2 more

Thoughts?


Answer:

The "unconfigured table schema_columnfamilies" error occurs because the sstableloader shipped with DSE 4.8 (Cassandra 2.1) reads the schema from system.schema_columnfamilies, a table that no longer exists in the Cassandra 3.x line that DSE 5.0 is based on.

For scenarios where you have a few nodes and not a lot of data, you can follow these steps for a cluster migration (ensure the clusters are at most one major release apart):

1) Create the schema in the new cluster.

2) Copy each old node's data to the corresponding new node (into the new cfid table directories).

3) Run nodetool refresh to pick up the data.

4) Run nodetool cleanup to clear out the extra data.

5) If the old cluster was on a previous major version, run nodetool upgradesstables on the new cluster.

Question:

Can someone point me to detailed documentation on how paging is implemented, with pages and paging state? I have gone through https://datastax.github.io/java-driver/manual/paging/

But how is it implemented internally?

Does the coordinator pull the data and run a limit/offset query, drawing the data out of the replicas sequentially for every page request?

Or is it saving a file cursor and doing random access? If so, can I get that back from the driver and use it later on?


Answer:

The documentation you mentioned is the most up-to-date about pagination with the DataStax Java driver. You can also read this blog post, which is a bit old, but still valid.

Does the coordinator pull the data and run a limit/offset query [...]?

No. Actually, there is no "offset query" in Cassandra, see CASSANDRA-6511. This is also covered in the driver documentation on paging.

Or is it saving a file cursor and doing random access? If so, can I get that back from the driver and use it later on?

Yes to both. The paging state exposed by the driver is meant to be used in exactly that way; again, this is explained in the driver documentation on paging.
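A minimal sketch of saving and reusing the paging state (assuming driver 2.0.10+ APIs; the contact point, keyspace, and table name are illustrative):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PagingState;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class PagingStateDemo {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        // Fetch the first page of 20 rows
        Statement stmt = new SimpleStatement("SELECT * FROM ks.mytable")
                .setFetchSize(20);
        ResultSet rs = session.execute(stmt);

        // Serialize the paging state, e.g. to hand back to a web client
        PagingState pagingState = rs.getExecutionInfo().getPagingState();
        String token = pagingState.toString();

        // Later (even from another process), resume from the saved position
        Statement resumed = new SimpleStatement("SELECT * FROM ks.mytable")
                .setFetchSize(20)
                .setPagingState(PagingState.fromString(token));
        session.execute(resumed);

        cluster.close();
    }
}
```

The serialized state is an opaque token tied to the exact query string and parameters, so it can only be reused with the same statement.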