Hot questions for Using Cassandra in cassandra cli

Question:

I have a table like this in CQL3:

CREATE TABLE product_info (
    key text,
    value text,
    PRIMARY KEY (key)
);

It is a vertical table, since I can insert new rows as (key, value) pairs.

Sample data will be :

product_info

  key                |     value       
  -------------------------------------------
  product_name       |   sample_product   
  quantity           |   2
  manufacture        |   sample_manufacturer   
  ....                   ....

But what I need is a horizontal table, where I can add columns dynamically without altering the table.

product_info

    product_name     |   quantity   |  manufacture           |  ....
   ------------------------------------------------------------------------------    
    sample_product   |    2         |  sample_manufacturer   |  ....

I need a structure like the table above, where I can keep adding columns on the fly.

CQL3 provides an option to add columns dynamically, but it requires altering the table first.

I need to know whether there is any other method that allows this.

I found that it is possible using the Thrift API, but since Thrift is no longer supported, I cannot use that.

Is there any other API, like Hector or anything else, that supports this?

I did go through similar Stack Overflow posts, but I didn't find a better solution.


Answer:

CREATE TABLE product_info(
    product_name text,
    key text,
    value text,
    PRIMARY KEY (product_name, key)
);

Now you can store up to two billion key/value pairs per product, because key is now a clustering column.

INSERT INTO product_info (product_name, key, value) 
    VALUES ('iPhone 6', 'quantity', '2');

INSERT INTO product_info (product_name, key, value) 
    VALUES ('iPhone 6', 'manufacturer', 'Apple');

INSERT INTO product_info (product_name, key, value) 
    VALUES ('iPhone 6', 'another column name', 'another column value');

However, you did not specify your query access patterns, so this data model may be totally wrong (or ok) for your application.
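At read time, the key/value rows of one partition can be pivoted back into the "horizontal" shape the questioner wants. A minimal sketch in plain Java, where String arrays stand in for the driver's Row objects (the pivot method and sample data are illustrative, not driver API):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PivotExample {
    // Pivot the (key, value) rows of one partition into a single map,
    // keyed by "column name" — the horizontal view of the vertical table.
    static Map<String, String> pivot(List<String[]> rows) {
        Map<String, String> horizontal = new LinkedHashMap<>();
        for (String[] row : rows) {
            horizontal.put(row[0], row[1]); // row[0] = key column, row[1] = value column
        }
        return horizontal;
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
                new String[]{"quantity", "2"},
                new String[]{"manufacturer", "Apple"});
        Map<String, String> product = pivot(rows);
        System.out.println(product.get("quantity"));     // 2
        System.out.println(product.get("manufacturer")); // Apple
    }
}
```

This keeps the schema fixed while still allowing an arbitrary, growing set of "columns" per product.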

Question:

I tried to install and run Apache Cassandra on an Amazon EC2 instance. On the instance everything works fine, and I can also connect from a remote machine via cassandra-cli --host PUBLIC_IP --port 9160.

But when I try to connect via spring-data-cassandra (1.2.0 build snapshot), the driver throws an error:

All host(s) tried for query failed (tried: /PUBLIC_IP:9160  (com.datastax.driver.core.ConnectionException: [/PUBLIC_IP:9160] Unexpected error during transport initialization (com.datastax.driver.core.TransportException: [/PUBLIC_IP:9160] Channel has been closed)))

My cassandra.yaml:

listen_address: localhost
start_native_transport: true
native_transport_port: 9042
start_rpc: true
rpc_address: 0.0.0.0
rpc_port: 9160
broadcast_rpc_address: PRIVATE_AWS_INSTANCE_IP

cassandra-cli --host PUBLIC_IP --port 9160 works fine.

Cluster bean:

@Bean
public CassandraClusterFactoryBean cluster() throws Exception {

    CassandraClusterFactoryBean cluster = new CassandraClusterFactoryBean();
    cluster.setContactPoints(env.getProperty("cassandra.contactpoints"));
    cluster.setPort(Integer.parseInt(env.getProperty("cassandra.port")));  
    return cluster;
}

props:

cassandra.contactpoints=PUBLIC_IP
cassandra.port=9160
cassandra.keyspace=mykeyspace

Dependencies:

<properties>
    <spring.version>4.1.0.RELEASE</spring.version>
</properties>

<dependencies>

    <dependency>
        <groupId>org.springframework.data</groupId>
        <artifactId>spring-cql</artifactId>
        <version>1.1.0.RELEASE</version>
    </dependency>


    <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-core</artifactId>
        <version>${spring.version}</version>
    </dependency>

    <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-expression</artifactId>
        <version>${spring.version}</version>
    </dependency>

    <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-context</artifactId>
        <version>${spring.version}</version>
    </dependency>

    <dependency>
        <groupId>org.springframework.data</groupId>
        <artifactId>spring-data-cassandra</artifactId>
        <version>1.2.0.BUILD-SNAPSHOT</version>
    </dependency>

</dependencies>

UPD

The problem was resolved by changing the port to 9042 (the native protocol port).


Answer:

You're using the native-protocol driver but connecting on the Thrift port (9160). Try 9042, the native_transport_port from your cassandra.yaml.
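Concretely, only the port property needs to change (a sketch of the corrected props, keeping the questioner's PUBLIC_IP placeholder):

```properties
cassandra.contactpoints=PUBLIC_IP
# 9042 is the native-protocol port (native_transport_port in cassandra.yaml);
# 9160 is the legacy Thrift port used by cassandra-cli.
cassandra.port=9042
cassandra.keyspace=mykeyspace
```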

Question:

I have some questions about UDTs; I'm not sure if this is a bug or not.

Here are my type and table definitions:

CREATE TYPE test_udt_bigint (
    id    varchar,
    data Map<int, bigint>
);

CREATE TYPE test_udt (
    id    varchar,
    data Map<int, int>
);

CREATE TABLE test_tbl_bigint (
    row_id varchar PRIMARY KEY,
    udt_data map<varchar, frozen<test_udt_bigint>>
);

CREATE TABLE test_tbl_int (
    row_id varchar PRIMARY KEY,
    udt_data map<varchar, frozen<test_udt>>
);

After creating those objects, I used cqlsh to insert data; it succeeded, and I can use a SELECT to retrieve the data. But inserting data via Java causes lots of problems.

Here is the repository I used for inserting data: https://github.com/sophiah/cassandra_test/tree/master/cassandra-test-udt

After inserting data into test_tbl_int, everything looks great and I can select via cqlsh as normal:

cqlsh:testcassandra> select * from test_tbl_int;

row_id | udt_data
--------+------------------------------------------------
test | {'key-01': {id: 'mapkey-01 ', data: {10: 20}}}
xxx |  {'key-01': {id: 'mapkey-01', data: {10: 20}}}

But after inserting data into test_tbl_bigint, something goes wrong:

cqlsh:testcassandra> select * from test_tbl_int;
Traceback (most recent call last):
File "bin/cqlsh", line 1093, in perform_simple_statement
    rows = self.session.execute(statement, trace=self.tracing_enabled)
File "/opt/apache-cassandra-2.1.11/bin/../lib/cassandra-driver-internal-only-2.7.2.zip/cassandra-driver-2.7.2/cassandra/cluster.py", line 1602, in execute
    result = future.result()
File "/opt/apache-cassandra-2.1.11/bin/../lib/cassandra-driver-internal-only-2.7.2.zip/cassandra-driver-2.7.2/cassandra/cluster.py", line 3347, in result
    raise self._final_exception
error: unpack requires a string argument of length 4

cqlsh:testcassandra> select * from test_tbl_bigint;
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 127.0.0.1 datacenter1>: ConnectionShutdown('Connection to 127.0.0.1 is defunct',)})

Any suggestions?

Thanks.


Answer:

The short answer is that the table name is wrong in da_test_tbl_bigint.java: it inserts into test_tbl_int. It's not yet clear to me why the driver doesn't catch the error; I'll update my answer when I've figured it out.

Question:

My application runs at more than 100 transactions per second in production. I would like to know what configuration should be used to achieve this.

In the non-production environment I am using the cluster with DCAwareRoundRobinPolicy and a consistency level of LOCAL_QUORUM.

All remaining configuration is left at the defaults.

Is the default configuration enough, or do I need to specify all the connection options, like pooling options, socket options, consistency level, etc.?

PS: Cassandra version 3. Please suggest how to scale it.


Answer:

The Java driver defaults are quite good, especially for that load. Use the default load balancing policy, which is TokenAwarePolicy wrapped around DCAwareRoundRobinPolicy. You may tune the connection pooling to allow more in-flight requests per connection. Keep only a single Session instance per application, to avoid opening too many connections to the cluster. The real performance gains come from using asynchronous operations and, where the application allows it, a lower consistency level such as LOCAL_ONE.
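The asynchronous pattern referred to above means issuing many requests without blocking on each one, then collecting the futures. With the DataStax driver that is session.executeAsync(...); the same shape can be sketched with CompletableFuture, where the executeAsync below is a stand-in, not the driver API:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class AsyncPattern {
    // Stand-in for session.executeAsync(statement): each "query" just returns its id.
    static CompletableFuture<Integer> executeAsync(int id) {
        return CompletableFuture.supplyAsync(() -> id);
    }

    public static void main(String[] args) {
        // Fire off 100 requests without waiting for each one to finish...
        List<CompletableFuture<Integer>> futures = IntStream.range(0, 100)
                .mapToObj(AsyncPattern::executeAsync)
                .collect(Collectors.toList());
        // ...then wait for all of them at once and combine the results.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        int sum = futures.stream().mapToInt(CompletableFuture::join).sum();
        System.out.println(sum); // 4950
    }
}
```

With synchronous execution, throughput is capped at roughly 1/latency per thread; pipelining requests like this is how a single connection sustains hundreds of requests per second.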

Question:

Using the Datastax Cassandra Java Client (3.2.0), is there a way of controlling the log level programmatically for the client logs?


Answer:

Assuming Logback as the SLF4J backend, for instance:

import ch.qos.logback.classic.Level;
import ch.qos.logback.classic.Logger;
import ch.qos.logback.classic.LoggerContext;
import org.slf4j.LoggerFactory;

// Raise the driver's connection logger to TRACE at runtime.
LoggerContext loggerContext = (LoggerContext) LoggerFactory.getILoggerFactory();
Logger connectionLogger = loggerContext.getLogger("com.datastax.driver.core.Connection");
connectionLogger.setLevel(Level.TRACE);

Question:

There are a bunch of different Cassandra clients these days. Most of them were built on top of the Thrift driver and then adapted to use the DataStax Java driver. I can name Kundera and Astyanax as the biggest of them; the latter has only beta support for the Java driver. And there is the Achilles client, which is built on top of the DataStax Java driver and supports all the new Cassandra features. It's a little bit younger, and I know nothing of its use in production, but it looks very promising.

So, as someone new to the NoSQL world, I'm asking you for a hint: which client should I use for a new project? Suppose it will be a big solution with a 33-node Cassandra cluster and many different kinds of queries to the database.

Thank you in advance.


Answer:

If the platform you're on has a client from DataStax, use that. The DataStax drivers for the JVM and .NET (and possibly others) are quite polished, support all available features, and handle connection complexity internally very nicely. In addition, if you're looking to do Spark analytics, the DataStax Spark connector is the only option supporting good locality, and it uses the DataStax Java driver internally.