Hot questions for Using Cassandra in cqlsh

Question:

We're on Cassandra 2.0.15, and seeing huge read latencies (>60sec) coming up at regular intervals (about every 3min), from all app hosts. We measure this latency around calls to session.execute(stmt). At the same time, Cassandra traces report duration of <1s. We also ran, in a loop, a query via cqlsh from the same hosts during those peak latency times, and cqlsh always came back within 1s. What can explain this discrepancy at the Java driver level?

-- edit: in reply to comments --

Cassandra servers' JVM settings: -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003 -Xms32G -Xmx32G -XX:+UseG1GC -Djava.net.preferIPv4Stack=true -Dcassandra.jmx.local.port=7199 -XX:+DisableExplicitGC

Client-side GC is negligible. Client settings: -Xss256k -Xms4G -Xmx4G; the Cassandra driver version is 2.1.7.1.

Client-side measuring code:

val selectServiceNames = session.prepare(QueryBuilder.select("service_name").from("service_names"))

override def run(): Unit = {
  val start = System.currentTimeMillis()
  try {
    val resultSet = session.execute(selectServiceNames.bind())
    val serviceNames = resultSet.all()
    val elapsed = System.currentTimeMillis() - start
    latency.add(elapsed) // emits metric to statsd
    if (elapsed > 10000) {
      log.info("Canary2 sensed high Cassandra latency: " + elapsed + "ms")
    }
  } catch {
    case e: Throwable =>
      log.error(e, "Canary2 select failed")
  } finally {
    Thread.sleep(100)
    schedule()
  }
}

Cluster construction code:

def createClusterBuilder(): Cluster.Builder = {
  val builder = Cluster.builder()
  val contactPoints = parseContactPoints()
  val defaultPort = findConnectPort(contactPoints)
  builder.addContactPointsWithPorts(contactPoints)
  builder.withPort(defaultPort) // This ends up config.protocolOptions.port
  if (cassandraUsername.isDefined && cassandraPassword.isDefined)
    builder.withCredentials(cassandraUsername(), cassandraPassword())
  builder.withRetryPolicy(ZipkinRetryPolicy.INSTANCE)
  builder.withLoadBalancingPolicy(new TokenAwarePolicy(new LatencyAwarePolicy.Builder(new RoundRobinPolicy()).build()))
}

One more observation I cannot explain: I ran two threads that execute the same query in the same manner (as above) in a loop; the only difference is that the yellow thread sleeps 100 ms between queries while the green thread sleeps 60 s. The green thread hits low latency (under 1 s) much more often than the yellow one.


Answer:

This is a common problem when you get a component to test itself:

  • You can experience delays which are not visible to the tools in question.
  • Your component has no idea when the request should have started.
  • When the JVM stops, this can prevent you from seeing the delays you are trying to measure.

The most likely explanation is the second one. Say you have a queue of 100 tasks and, because the system is running slowly, each task takes 1 second. Each task times itself internally and sees that it took 1 second. However, the first of the 100 queued tasks starts after 0 seconds, while the last starts after 99 seconds and then reports that it took 1 second; from your point of view it took 100 seconds to complete, 99 of which were spent waiting to start.
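Here is a small, self-contained demonstration of that effect (hypothetical code, not the asker's): each task reports roughly 1 second of internal time, while the end-to-end latency measured from submission grows with queue depth.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class QueueDelayDemo {
        public static void main(String[] args) throws InterruptedException {
            // Single worker, so tasks queue up behind each other
            ExecutorService worker = Executors.newSingleThreadExecutor();
            for (int i = 0; i < 5; i++) {
                final long submitted = System.currentTimeMillis();
                worker.execute(() -> {
                    long started = System.currentTimeMillis();
                    try { Thread.sleep(1000); } catch (InterruptedException ignored) { }
                    long finished = System.currentTimeMillis();
                    // "internal" stays ~1000 ms; "end-to-end" grows ~1000 ms per queued task
                    System.out.printf("internal: %d ms, end-to-end: %d ms%n",
                            finished - started, finished - submitted);
                });
            }
            worker.shutdown();
            worker.awaitTermination(1, TimeUnit.MINUTES);
        }
    }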

There can also be delays in the result reaching you, but this is less likely unless the processing you do on the results takes longer than the database does, i.e. you might wrongly assume the bottleneck is on the server.

Question:

When I try to copy a table into Cassandra using the command:

copy images from 'images.csv'

I get the error:

'PicklingError: Can't pickle <class 'cqlshlib.copyutil.ImmutableDict'>: attribute lookup cqlshlib.copyutil.ImmutableDict failed'

I have successfully imported all of my other tables, but this one is not working. The only difference with this one is that it contains large binary blobs for images.

Here is a sample row from the csv file:

b267ba01-5420-4be5-b962-7e563dc245b0,,0x89504e...[large binary blob]...426082,0,7e700538-cce3-495f-bfd2-6a4fa968bdf6,pentium_e6600,01fa819e-3425-47ca-82aa-a3eec319a998,0,7e700538-cce3-495f-bfd2-6a4fa968bdf6,,,png,0

And here is the file that causes the error: https://www.dropbox.com/s/5mrl6nuwelpf3lz/images.csv?dl=0

Here is my schema:

CREATE TABLE dealtech.images (
    id uuid PRIMARY KEY,
    attributes map<text, text>,
    data blob,
    height int,
    item_id uuid,
    name text,
    product_id uuid,
    scale double,
    seller_id uuid,
    text_bottom int,
    text_top int,
    type text,
    width int
)

The tables were exported using Cassandra 2.x and I am currently using Cassandra 3.0.9 to import them.


Answer:

I ran into this same issue with Apache Cassandra 3.9, although my datasets were fairly small (46 rows in one table, 262 rows in another).

PicklingError: Can't pickle <class 'cqlshlib.copyutil.link'>: attribute lookup cqlshlib.copyutil.link failed

PicklingError: Can't pickle <class 'cqlshlib.copyutil.attribute'>: attribute lookup cqlshlib.copyutil.attribute failed

Where link and attribute are types I defined.

The COPY commands were part of a .cql script that was being run inside a Docker container as part of its setup process.

I read in a few places that people were seeing this PicklingError on Windows (it seemed to be related to NTFS), but the Docker container in this case was using Alpine Linux.

The fix was to add these options to the end of my COPY commands:

WITH MINBATCHSIZE=1 AND MAXBATCHSIZE=1 AND PAGESIZE=10;

http://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlshCopy.html
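Applied to the asker's original command, that would look something like:

copy images from 'images.csv' WITH MINBATCHSIZE=1 AND MAXBATCHSIZE=1 AND PAGESIZE=10;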

I was not seeing the PicklingError when running these .cql scripts containing COPY commands locally, so it seems to be an issue that only rears its head in a low-memory situation.

Related issues:

  • Pickling Error running COPY command: CQLShell on Windows
  • Cassandra multiprocessing can't pickle _thread.lock objects

Question:

My Cassandra won't start on a newly installed root server...

/var/log/cassandra/debug.log prints:

ERROR [main] 2018-02-03 07:13:46,463 CassandraDaemon.java:706 - Exception encountered during startup java.lang.AbstractMethodError: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject

root@srv49023:~# java -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)

root@srv49023:~# systemctl status cassandra
● cassandra.service - LSB: distributed storage system for structured data
   Loaded: loaded (/etc/init.d/cassandra)
   Active: active (exited) since Sat 2018-02-03 07:13:43 EST; 1min 8s ago
  Process: 213 ExecStart=/etc/init.d/cassandra start (code=exited, status=0/SUCCESS)

Feb 03 07:13:43 srv49023 systemd[1]: Started LSB: distributed storage system for structured data.

Thanks for helping!


Answer:

This seems to be related to the upgrade of the JDK to 8u161, which was released two days ago.

A ticket has been opened on the Cassandra Jira

There is no published workaround that I can find. You might have to go back to an earlier version of the JDK, or wait for Cassandra 3.11.2, which fixes the issue.

Question:

I have a CSV text file that I want to use to fill a table in my keyspace (using Java).

I tried using the COPY command, but I just realized that the COPY command is not an actual CQL command: it's something that can only be run with the cqlsh shell.

What I tried to do:

session.execute("COPY table(Column1,Column2,Column3) 
                 FROM ('textfile.txt') 
                 WITH DELIMITER='\t'");

(And here is the code before that, if you want it for context):

    Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
    Session session = cluster.connect( );
    session.execute("CREATE KEYSPACE test WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': '1'};";);
    session.execute("USE test");
    session.execute("CREATE TABLE table(Column1 text, Column2 text, Column3 text, PRIMARY KEY(Column1))");

And the main error message I have is:

Exception in thread "main" com.datastax.driver.core.exceptions.SyntaxError: line 1:0 no viable alternative at input 'COPY' ([COPY]...)

I googled this and found out that you can only use COPY through cqlsh.

Is there another way to easily import a CSV into a keyspace? One that I'll be able to use through CQL commands in Java?

Thanks!


Answer:

You're right that 'copy' is a cqlsh (shell) command rather than a CQL (protocol) command; the relevant docs explain which commands apply in each case.

To import from a CSV file within your Java app you would need to write the logic yourself, or use an existing tool: one popular option is cassandra-loader, which is written in Java. There may be other source-code snippets you could learn from as well.
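If you do want to roll your own, here is a minimal sketch using the DataStax Java driver. Assumptions: the three text columns and tab delimiter from the question; the table is renamed my_table since "table" is a reserved CQL word; the naive split does no CSV quoting or escaping.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;
    import java.io.BufferedReader;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class CsvImport {
        public static void main(String[] args) throws Exception {
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect("test")) {
                // Prepare once, bind per row
                PreparedStatement insert = session.prepare(
                    "INSERT INTO my_table (Column1, Column2, Column3) VALUES (?, ?, ?)");
                try (BufferedReader reader = Files.newBufferedReader(Paths.get("textfile.txt"))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        String[] cols = line.split("\t", -1); // -1 keeps trailing empty fields
                        session.execute(insert.bind(cols[0], cols[1], cols[2]));
                    }
                }
            }
        }
    }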

Question:

I have a table like this in Cassandra-

CREATE TABLE DATA_HOLDER (USER_ID TEXT, RECORD_NAME TEXT, RECORD_VALUE BLOB, PRIMARY KEY (USER_ID, RECORD_NAME));

I want to count the distinct USER_ID values in my above table. Is there any way I can do that?

My Cassandra version is:

[cqlsh 4.1.1 | Cassandra 2.0.10.71 | DSE 4.5.2 | CQL spec 3.1.1 | Thrift protocol 19.39.0]

Answer:

The select expression is defined as:

selection_list
 | DISTINCT selection_list

so you can:

SELECT DISTINCT USER_ID FROM DATA_HOLDER;
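Note that DISTINCT works on the partition key, so each returned row is one distinct USER_ID. As far as I know there is no COUNT(DISTINCT ...) in this CQL version, so to get the actual count you can count the returned rows client-side, e.g. with the DataStax Java driver (a sketch):

    ResultSet rs = session.execute("SELECT DISTINCT USER_ID FROM DATA_HOLDER");
    // Each row is one distinct partition key
    int distinctUsers = rs.all().size();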

Question:

I set up a new Cassandra instance and used ./cassandra to start it, but it fails. system.log shows the following error:

ERROR 13:30:31 Exception encountered during startup
java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Insufficient disk space to write 542 bytes
    at org.apache.cassandra.db.ColumnFamilyStore.runWithCompactionsDisabled(ColumnFamilyStore.java:2888) ~[apache-cassandra-2.2.10.jar:2.2.10]
    at org.apache.cassandra.db.ColumnFamilyStore.truncateBlocking(ColumnFamilyStore.java:2849) ~[apache-cassandra-2.2.10.jar:2.2.10]
    at org.apache.cassandra.db.SystemKeyspace.discardCompactionsInProgress(SystemKeyspace.java:404) ~[apache-cassandra-2.2.10.jar:2.2.10]
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:230) [apache-cassandra-2.2.10.jar:2.2.10]
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:533) [apache-cassandra-2.2.10.jar:2.2.10]
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:642) [apache-cassandra-2.2.10.jar:2.2.10]
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Insufficient disk space to write 542 bytes
    at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:394) ~[apache-cassandra-2.2.10.jar:2.2.10]
    at org.apache.cassandra.db.SystemKeyspace.forceBlockingFlush(SystemKeyspace.java:593) ~[apache-cassandra-2.2.10.jar:2.2.10]
    at org.apache.cassandra.db.SystemKeyspace.saveTruncationRecord(SystemKeyspace.java:432) ~[apache-cassandra-2.2.10.jar:2.2.10]
    at org.apache.cassandra.db.ColumnFamilyStore$13.run(ColumnFamilyStore.java:2843) ~[apache-cassandra-2.2.10.jar:2.2.10]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_141]
    at org.apache.cassandra.db.ColumnFamilyStore.runWithCompactionsDisabled(ColumnFamilyStore.java:2884) ~[apache-cassandra-2.2.10.jar:2.2.10]
    ... 5 common frames omitted
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Insufficient disk space to write 542 bytes
    at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) ~[guava-16.0.jar:na]
    at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) ~[guava-16.0.jar:na]
    at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[guava-16.0.jar:na]
    at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:390) ~[apache-cassandra-2.2.10.jar:2.2.10]
    ... 10 common frames omitted
Caused by: java.lang.RuntimeException: Insufficient disk space to write 542 bytes
    at org.apache.cassandra.db.Directories.getWriteableLocation(Directories.java:349) ~[apache-cassandra-2.2.10.jar:2.2.10]
    at org.apache.cassandra.db.Memtable.flush(Memtable.java:324) ~[apache-cassandra-2.2.10.jar:2.2.10]
    at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1187) ~[apache-cassandra-2.2.10.jar:2.2.10]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_141]

The output of "df -h":

root@srv-machine-learning01:/var/lib/cassandra# df -h
Filesystem                                       Size  Used Avail Use% Mounted on
/dev/xvda1                                        61G   60G     0 100% /
udev                                              10M     0   10M   0% /dev
tmpfs                                            2.4G  8.3M  2.4G   1% /run
tmpfs                                            5.9G     0  5.9G   0% /dev/shm
tmpfs                                            5.0M     0  5.0M   0% /run/lock
tmpfs                                            5.9G     0  5.9G   0% /sys/fs/cgroup
bak02.sovanta.com:/volume1/bak_machine_learning   70T   19T   52T  27%

Answer:

It looks like your root file system is full.

/dev/xvda1 61G 60G 0 100% /

Since Cassandra stores data by default in /var/lib/cassandra and you haven't set up a different file system for this path, it will use your root file system. I can think of several solutions for this issue:

  • Free up space on the root partition
  • Increase the disk space of your root file system
  • Add a new disk to your system and create a new filesystem mounted at /var/lib/cassandra (see the sketch below)
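A rough sketch of the last option (the device name /dev/xvdb is hypothetical; stop Cassandra first and carry over the existing data):

    systemctl stop cassandra
    mkfs.ext4 /dev/xvdb                  # format the new disk
    mount /dev/xvdb /mnt
    cp -a /var/lib/cassandra/. /mnt/     # preserve the existing data files
    umount /mnt
    mount /dev/xvdb /var/lib/cassandra   # add an /etc/fstab entry to persist this
    systemctl start cassandra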

Question:

I have a three-node cluster with replication factor 2, and I want to read the table 'local' from the 'system' keyspace. That is, I want to access the local data of a Cassandra node. Is that possible?


Answer:

Yes. You can query the system.local table from within Java just like any other.

    Session session = cluster.connect();
    ResultSet results = session
        .execute("SELECT key, broadcast_address, cql_version FROM system.local");

    for (Row row : results) {
        System.out.println(row.getString("key") + " "
            + row.getInet("broadcast_address") + " "
            + row.getString("cql_version"));
    }

Sample output:

local 127.0.0.1 3.4.4

Question:

I am running Cassandra 3.7 on an Ubuntu Server 16.04 LTS 32-bit machine that I'm re-purposing from storage. I managed to properly configure the JVM to operate within the 32-bit environment. nodetool status works fine, but I am having a hard time getting cqlsh to work. I end up with the following error:

('Unable to connect to any servers', {'127.0.0.1': ProtocolError("cql_version '3.4.2' is not supported by remote (w/ native protocol). Supported versions: [u'3.4.0']",)})

Obviously there is some version conflict here, but I'm confused as to why, and how to fix it. I had a previous 3.0 install of Cassandra for which I had configured cassandra-env.sh to properly start the JVM. I kept that .sh file when I upgraded to 3.7, and I updated cassandra.yaml with the new install.

So my question here is: does Cassandra 3.7 support CQL 3.4.2? If not, why does it appear to ship with it, and what do I need to do to downgrade to CQL 3.4.0? If it does support it and my configuration files are wrong, what is the error and how do I fix it? Or is there another configuration that I'm completely missing here?

Here are my current configurations:

cassandra-env.sh: https://www.dropbox.com/s/dcs99hgry5behqd/cassandra-env.sh?dl=0
cassandra.yaml: https://www.dropbox.com/s/boi8bh7gin0390f/cassandra.yaml?dl=0

Any assistance greatly appreciated.

EDIT: Figured out, with the help of the first answer, that I had an older instance of the server running. I couldn't get it to start with the correct version...so I deleted cassandra from /etc/init.d, and now I can't get the Debian package to automatically register the service.

EDIT2: Downloaded the cassandra script from the git repo and now Cassandra will start again after rebooting. Now cassandra -v shows me:

3.7

This is what I wanted to see, and it solved my issue pertaining to the version conflict. But now, unfortunately, I see a new error:

Connection error: ('Unable to connect to any servers', {'127.0.0.1': TypeError('ref() does not take keyword arguments',)})

This problem is addressed in: cqlsh connection error: 'ref() does not take keyword arguments'


Answer:

It looks like you're running the 3.7 cqlsh against an older server instance. First, be sure you are actually running Cassandra 3.7 (you can probably just check the jar path or look in the logs).

If you want to connect using the advertised cql_version, there is a command-line option: https://github.com/apache/cassandra/blob/cassandra-3.7/bin/cqlsh.py#L222
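For example, to match the version the server advertises in the error above (using the flag that script defines):

cqlsh 127.0.0.1 --cqlversion=3.4.0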

To see what versions the server actually reports, run:

select release_version, cql_version from system.local;

(or just look at the cqlsh welcome header).

Question:

As per the documentation and general knowledge on the internet, it seems that the below-given properties

- request_timeout_in_ms

- write_request_timeout_in_ms

- read_request_timeout_in_ms

work only for internal (server-side) Cassandra requests. I was even convinced of this fact when I set these parameters to 80000 in the cassandra.yaml file but still got a timeout error against my SELECT query on a somewhat bigger record, in the following two ways:

1) When I tried to connect to Cassandra via cqlsh without the additional parameter --request-timeout=80000. By adding this parameter, I was able to run the select statement that had failed before.

2) When I tried to update the same record via a Java client using the Cassandra driver without setting new SocketOptions().setReadTimeoutMillis(80000) during Cluster.Builder creation.

Question: Is there a way to set these request_timeout parameters in Cassandra for external (client-side) requests as well, so I don't have to specify these values when connecting via cqlsh, a Java client, or DevCenter by DataStax?


Answer:

The server can't really enforce a client-side timeout, as there are delays that occur outside the server. An example is a delay introduced by the Linux kernel in sending the request, then a 300 ms cross-DC latency spike: the request arrives 500 ms after the app sent it.

Worse comes into play with GCs, i.e. C* sends the response but then has a 2 second stop-the-world GC pause. The request has been "completed" from the server's perspective, but the client sees an additional 2 second delay. If your server is poorly configured you can easily see an 8 second GC periodically. The server-side timeout is as good as it can be given factors outside its control that it cannot know about (or could only know at prohibitive cost). If you have a strict timeout, it's best to handle it on the client side. I would recommend doing it in your request handler.

// Guava's Futures.withTimeout needs a ScheduledExecutorService to schedule the timeout
ListenableFuture<ResultSet> result = Futures.withTimeout(
    session.executeAsync(/* statement */), 8, TimeUnit.SECONDS, scheduledExecutor);

setReadTimeoutMillis is a little nuanced: it's per request, but an execute/executeAsync can end up issuing multiple requests, as it will potentially try multiple hosts as part of the query plan (retry/speculative retry). So, for example, an RF=3 request at LOCAL_ONE with a setReadTimeoutMillis of 2 seconds could actually take 6 seconds to time out, depending on the retry policy.
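For reference, the driver-side timeout mentioned in the question is configured like this (a sketch against the Java driver 2.x API):

    // Per-request read timeout on the client; see the multi-host caveat above
    Cluster cluster = Cluster.builder()
        .addContactPoint("127.0.0.1")
        .withSocketOptions(new SocketOptions().setReadTimeoutMillis(80000))
        .build();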

Question:

Can we set a user-defined field constraint in a Cassandra database using CQL? I would like to validate an input field in Cassandra, e.g. the age field should not be more than 60 or less than 18.


Answer:

No, there is no such thing in Cassandra.

You have to validate the input at the application layer.
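A minimal sketch of that application-layer check (hypothetical table and column names, DataStax Java driver):

    // Reject out-of-range ages before the write; Cassandra itself will not enforce this
    void insertPerson(Session session, String userId, int age) {
        if (age < 18 || age > 60) {
            throw new IllegalArgumentException("age must be between 18 and 60, got " + age);
        }
        session.execute("INSERT INTO users (user_id, age) VALUES (?, ?)", userId, age);
    }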