Hot questions for Using Cassandra in timeuuid


While working on a use case where data needs to be sorted on UUIDs that are all Type 1 (time-based) and generated using the DataStax Cassandra Java driver (UUIDs.timeBased()), I found that UUID.compareTo does not sort some of the UUIDs correctly. The logic in compareTo is:

/**
 * Compares this UUID with the specified UUID.
 * <p> The first of two UUIDs is greater than the second if the most
 * significant field in which the UUIDs differ is greater for the first
 * UUID.
 *
 * @param  val
 *         {@code UUID} to which this {@code UUID} is to be compared
 * @return  -1, 0 or 1 as this {@code UUID} is less than, equal to, or
 *          greater than {@code val}
 */
public int compareTo(UUID val) {
    // The ordering is intentionally set up so that the UUIDs
    // can simply be numerically compared as two numbers
    return (this.mostSigBits < val.mostSigBits ? -1 :
            (this.mostSigBits > val.mostSigBits ? 1 :
             (this.leastSigBits < val.leastSigBits ? -1 :
              (this.leastSigBits > val.leastSigBits ? 1 :
               0))));
}

I had the below two UUIDs generated using the DataStax Cassandra driver for Java:

UUID uuid1 = java.util.UUID.fromString("7fff5ab0-43be-11ea-8fba-0f6f28968a17")
UUID uuid2 = java.util.UUID.fromString("80004510-43be-11ea-8fba-0f6f28968a17")
uuid1.timestamp() //137997224058510000
uuid2.timestamp() //137997224058570000

From the above it is evident that uuid1's timestamp is smaller than uuid2's, but when we compare them using the UUID compareTo method we get a different result. We should get -1 (less than), but instead we get 1, which says that uuid1 is greater than uuid2:

uuid1.compareTo(uuid2) // output: 1

On further analysis, I found that the most significant bits (msb) for uuid2 become a negative number, whereas the msb for uuid1 is a positive number. Because of this, the logic in compareTo returns 1 instead of -1.

u_7fff5ab0 = {UUID@2623} "7fff5ab0-43be-11ea-8fba-0f6f28968a17"
mostSigBits = 9223190274975338986
leastSigBits = -8090136810520933865

u_80004510 = {UUID@2622} "80004510-43be-11ea-8fba-0f6f28968a17"
mostSigBits = -9223296100696452630
leastSigBits = -8090136810520933865
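The sign flip can be reproduced with plain java.util.UUID, no driver involved — a minimal sketch using the two UUIDs above (the class name is illustrative):

```java
import java.util.UUID;

public class MsbSignDemo {
    public static void main(String[] args) {
        UUID uuid1 = UUID.fromString("7fff5ab0-43be-11ea-8fba-0f6f28968a17");
        UUID uuid2 = UUID.fromString("80004510-43be-11ea-8fba-0f6f28968a17");

        // signed comparison of the raw 64-bit halves says uuid1 > uuid2 ...
        System.out.println(uuid1.compareTo(uuid2)); // 1
        // ... but the embedded 100-ns timestamps say uuid1 < uuid2
        System.out.println(Long.compare(uuid1.timestamp(), uuid2.timestamp())); // -1
    }
}
```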

Is this behaviour normal for UUIDs and their comparison with each other? If so, how do we handle sorting of such time-based UUIDs?

Thank you


Please note that comparing time-based UUIDs needs special care. From the docs:

Lastly, please note that Cassandra's timeuuid sorting is not compatible with UUID.compareTo(java.util.UUID) and hence the UUID created by this method are not necessarily lower bound for that latter method.

Time-based UUIDs should not be compared with java.util.UUID#compareTo. To compare two time-based UUIDs, you should compare the timestamps that these two UUIDs contain. You need a custom utility method, or you can just compare the two timestamps directly. Here is an example of how to do it:

// both arguments must be time-based (version 1) UUIDs
int compareTo(UUID a, UUID b) {
    return Long.compare(a.timestamp(), b.timestamp());
}

To learn more, go through the docs.
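For sorting whole collections, the same idea works as a Comparator — a sketch using only java.util (the class and constant names are illustrative; the two sample UUIDs come from the question):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;

public class TimeUuidSort {

    // orders version-1 (time-based) UUIDs by their embedded 100-ns timestamp;
    // UUID.timestamp() throws for non-version-1 UUIDs
    static final Comparator<UUID> BY_TIME = Comparator.comparingLong(UUID::timestamp);

    public static void main(String[] args) {
        List<UUID> ids = new ArrayList<>(List.of(
                UUID.fromString("80004510-43be-11ea-8fba-0f6f28968a17"),
                UUID.fromString("7fff5ab0-43be-11ea-8fba-0f6f28968a17")));
        ids.sort(BY_TIME);
        System.out.println(ids.get(0)); // 7fff5ab0-... (the earlier UUID now sorts first)
    }
}
```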


I'm trying to save a java.util.UUID to a Cassandra column of type timeuuid, using the default spring-data-cassandra mapping. The value of the UUID is generated by java.util.UUID#randomUUID(). I get an exception: "com.datastax.driver.core.exceptions.InvalidQueryException: Invalid version for TimeUUID type"

Studying the Cassandra source code reveals the reason:

    public void validate(byte[] bytes)
    {
        if (bytes.length != 16 && bytes.length != 0)
            throw new MarshalException(String.format("TimeUUID should be 16 or 0 bytes (%d)", bytes.length));
        // version is bits 4-7 of byte 6.
        if (bytes.length > 0)
            if ((bytes[6] & 0xf0) != 0x10)
                throw new MarshalException("Invalid version for TimeUUID type.");
    }


That means that the Cassandra timeuuid type accepts only time-based UUIDs. A value generated by java.util.UUID#randomUUID() is a type 4 (pseudo-randomly generated) UUID and does not pass validation. So the TimeUUID class works as expected, but the exception cause is not so obvious. Possible workarounds:

  1. Insert a timeuuid generated via the DataStax driver. In your case, since the column requires a version 1 timeuuid, you must use UUIDs.timeBased().


  2. Or, if you are mapping the entity through spring-data-cassandra and need to save UUIDs supplied by a third party, add the annotation @CassandraType(type = DataType.Name.UUID) to the entity field so the column maps to plain uuid instead of timeuuid.
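The version mismatch the validator rejects can be seen with plain java.util.UUID — a small sketch (the time-based sample value is borrowed from the first question on this page):

```java
import java.util.UUID;

public class UuidVersionCheck {
    public static void main(String[] args) {
        // randomUUID() produces a version 4 UUID -> rejected by the timeuuid validator
        System.out.println(UUID.randomUUID().version()); // 4
        // a time-based UUID (version 1) passes the (bytes[6] & 0xf0) == 0x10 check
        System.out.println(UUID.fromString("7fff5ab0-43be-11ea-8fba-0f6f28968a17").version()); // 1
    }
}
```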


The use case that we are working to solve with Cassandra is this: We need to retrieve a list of entity UUIDs that have been updated within a certain time range within the last 90 days. Imagine that we're building a document tracking system, so our relevant entity is a Document, whose key is a UUID.

The query we need to support in this use case is: Find all Document UUIDs that have changed between StartDateTime and EndDateTime.

Question 1: What's the best Cassandra table design to support this query?

I think the answer is as follows:

CREATE TABLE document_change_events (
    event_uuid TIMEUUID,
    document_uuid uuid,
    PRIMARY KEY ((event_uuid), document_uuid)
) WITH default_time_to_live = 7776000;

And given that we can't do range queries on partition keys, we'd need to use the token() method. As such the query would then be:

SELECT document_uuid 
  FROM document_change_events
 WHERE token(event_uuid) > token(minTimeuuid(?)) 
   AND token(event_uuid) < token(maxTimeuuid(?))

For example:

SELECT document_uuid 
  FROM document_change_events
 WHERE token(event_uuid) > token(minTimeuuid('2015-05-10 00:00+0000')) 
   AND token(event_uuid) < token(maxTimeuuid('2015-05-20 00:00+0000'))

Question 2: I can't seem to get the following Java code using DataStax's driver to reliably return the correct results.

If I run the following code 10 times pausing 30 seconds between, I will then have 10 rows in this table:

private void addEvent() {

    String cql = "INSERT INTO document_change_events (event_uuid, document_uuid) VALUES (?,?)";

    PreparedStatement preparedStatement = cassandraSession.prepare(cql);
    BoundStatement boundStatement = new BoundStatement(preparedStatement);

    boundStatement.setUUID("event_uuid", UUIDs.timeBased());
    boundStatement.setUUID("document_uuid", UUIDs.random());

    cassandraSession.execute(boundStatement);
}

Here are the results:

cqlsh:> select event_uuid, dateOf(event_uuid), document_uuid from document_change_events;

 event_uuid                           | dateOf(event_uuid)       | document_uuid
 414decc0-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:51:09-0500 | 92b6fb6a-9ded-47b0-a91c-68c63f45d338
 9abb4be0-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:53:39-0500 | 548b320a-10f6-409f-a921-d4a1170a576e
 6512b960-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:52:09-0500 | 970e5e77-1e07-40ea-870a-84637c9fc280
 53307a20-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:51:39-0500 | 11b4a49c-b73d-4c8d-9f88-078a6f303167
 ac9e0050-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:54:10-0500 | b29e7915-7c17-4900-b784-8ac24e9e72e2
 88d7fb30-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:53:09-0500 | c8188b73-1b97-4b32-a897-7facdeecea35
 0ba5cf70-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:49:39-0500 | a079b30f-be80-4a99-ae0e-a784d82f0432
 76f56dd0-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:52:39-0500 | 3b593ca6-220c-4a8b-8c16-27dc1fb5adde
 1d88f910-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:50:09-0500 | ec155e0b-39a5-4d2f-98f0-0cd7a5a07ec8
 2f6b3850-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:50:39-0500 | db42271b-04f2-45d1-9ae7-0c8f9371a4db

(10 rows)

But if I then run this code:

private static void retrieveEvents(Instant startInstant, Instant endInstant) {

    String cql = "SELECT document_uuid FROM document_change_events " + 
                 "WHERE token(event_uuid) > token(?) AND token(event_uuid) < token(?)";

    PreparedStatement preparedStatement = cassandraSession.prepare(cql);
    BoundStatement boundStatement = new BoundStatement(preparedStatement);

    // bind the range boundaries as timeuuids covering the two instants
    boundStatement.setUUID(0, UUIDs.startOf(startInstant.toEpochMilli()));
    boundStatement.setUUID(1, UUIDs.endOf(endInstant.toEpochMilli()));

    ResultSet resultSet = cassandraSession.execute(boundStatement);

    if (resultSet == null) {
        System.out.println("None found.");
        return;
    }

    while (!resultSet.isExhausted()) {
        System.out.println(resultSet.one().getUUID("document_uuid"));
    }
}

It only retrieves three results.


Why didn't it retrieve all 10 results? And what do I need to change to achieve the correct results to support this use case?

For reference, I've tested this against dsc-2.1.1, dse-4.6 and using the DataStax Java Driver v2.1.6.


First of all, please only ask one question at a time. Both of your questions here could easily stand on their own. I know these are related, but it just makes the readers come down with a case of tl;dr.

I'll answer your 2nd question first, because the answer ties into a fundamental understanding that is central to getting the data model correct. When I INSERT your rows and run the following query, this is what I get:

aploetz@cqlsh:stackoverflow2> SELECT document_uuid FROM document_change_events 
WHERE token(event_uuid) > token(minTimeuuid('2015-05-10 00:00-0500')) 
  AND token(event_uuid) < token(maxTimeuuid('2015-05-22 00:00-0500'));


(4 rows)

Which is similar to what you are seeing. Why didn't that return all 10? Well, the answer becomes apparent when I include token(event_uuid) in my SELECT:

aploetz@cqlsh:stackoverflow2> SELECT token(event_uuid),document_uuid FROM document_change_events WHERE token(event_uuid) > token(minTimeuuid('2015-05-10 00:00-0500')) AND token(event_uuid) < token(maxTimeuuid('2015-05-22 00:00-0500'));

 token(event_uuid)    | document_uuid
 -2112897298583224342 | a079b30f-be80-4a99-ae0e-a784d82f0432
  2990331690803078123 | 3b593ca6-220c-4a8b-8c16-27dc1fb5adde
  5049638908563824288 | ec155e0b-39a5-4d2f-98f0-0cd7a5a07ec8
  5577339174953240576 | db42271b-04f2-45d1-9ae7-0c8f9371a4db

(4 rows)

Cassandra stores partition keys (event_uuid in your case) in order by their hashed token value. You can see this when using the token function. Cassandra generates partition tokens with a process called consistent hashing to ensure even cluster distribution. In other words, querying by token range doesn't make sense unless the actual (hashed) token values are meaningful to your application.

Getting back to your first question, this means you will have to find a different column to partition on. My suggestion is to use a time-series mechanism called a "date bucket." Picking the date bucket can be tricky, as it depends on your requirements and queries; it's really up to you to pick one that is useful.

For the purposes of this example, I'll pick "month." So I'll re-create your table partitioning on month and clustering by event_uuid:

CREATE TABLE document_change_events2 (
    event_uuid TIMEUUID,
    document_uuid uuid,
    month text,
    PRIMARY KEY ((month),event_uuid, document_uuid)
) WITH default_time_to_live = 7776000;

Now I can query by a date range, when also filtering by month:

aploetz@cqlsh:stackoverflow2> SELECT document_uuid FROM document_change_events2 
WHERE month='201505'
  AND event_uuid > minTimeuuid('2015-05-10 00:00-0500')
  AND event_uuid < maxTimeuuid('2015-05-22 00:00-0500');


(10 rows)

Again, month may not work for your application. So put some thought behind coming up with an appropriate column to partition on, and then you should be able to solve this.
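At write time, every client must compute the same bucket for the same event. A hypothetical helper (the class and method names are illustrative; the yyyyMM pattern matches the '201505' value used above, and pinning the zone to UTC keeps all writers in agreement):

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class MonthBucket {

    // '201505'-style month bucket, always computed in UTC so every writer
    // derives the same partition key for the same instant
    private static final DateTimeFormatter BUCKET =
            DateTimeFormatter.ofPattern("yyyyMM").withZone(ZoneOffset.UTC);

    static String bucketFor(Instant eventTime) {
        return BUCKET.format(eventTime);
    }

    public static void main(String[] args) {
        System.out.println(bucketFor(Instant.parse("2015-05-21T18:51:09Z"))); // 201505
    }
}
```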


uuid (universally unique identifier) and timeuuid are 128-bit values.

In the Cassandra database, because of its design, I used uuid and timeuuid as our entities' identifiers, and now I want to compress a uuid or timeuuid, or reduce its size, since the client (user) can see the id in the URL bar.

For example, Twitter also uses Cassandra, but when you open a Tweet, the Tweet's id is something like 10153967535312713, while a simple uuid is like 10646334-2c02-11e6-bb4a-7720eb141b83, which is more characters and less user-friendly (of course, both IDs are not user-friendly :D)

In different programming languages there are compression functions, such as gzcompress in PHP and GZIPOutputStream in Java, but these functions (classes) compress data into the GZIP format, which cannot be used in a URL!

Now, just by the way, is there any way, function, or algorithm to get a smaller or compressed version of a uuid or timeuuid?


Twitter originally developed Snowflake a long time ago, and I believe it is this id scheme that Twitter is still using. There are now many flake id implementations available that can generate a UUID-like number instead of a true UUID. I have used Flake Java in a project of mine.
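If you need to keep true UUIDs, one lossless alternative (not part of the answer above, just a sketch with illustrative names) is to re-encode the 16 raw bytes as URL-safe Base64, which shortens the textual form from 36 characters to 22:

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

public class ShortUuid {

    // 16 raw bytes -> 22-character URL-safe Base64 (no padding)
    static String shorten(UUID uuid) {
        ByteBuffer bb = ByteBuffer.allocate(16);
        bb.putLong(uuid.getMostSignificantBits());
        bb.putLong(uuid.getLeastSignificantBits());
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bb.array());
    }

    // inverse: decode the 22 characters back into the original UUID
    static UUID expand(String s) {
        ByteBuffer bb = ByteBuffer.wrap(Base64.getUrlDecoder().decode(s));
        return new UUID(bb.getLong(), bb.getLong());
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("10646334-2c02-11e6-bb4a-7720eb141b83");
        String shortened = shorten(id);
        System.out.println(shortened.length());          // 22
        System.out.println(expand(shortened).equals(id)); // true
    }
}
```

This is a pure re-encoding, so it preserves uniqueness and round-trips exactly; it just trades hex-plus-dashes for a denser alphabet.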


I am using the libraryDependencies += "com.datastax.oss" % "java-driver-core" % "4.3.0" library for creating time-based UUIDs. Although it generates time-based UUIDs, it only gives me precision up to seconds, but I am looking for values in 100s of nanoseconds.

import com.datastax.oss.driver.api.core.uuid.Uuids

The output uuid is something like f642f350-0230-11ea-a02f-597f2801796a which corresponds to Friday, November 8, 2019 at 2:06:30 PM Greenwich Mean Time

Please help on how to get the time in sub-second precision, something like Friday, November 8, 2019 at 2:06:30:0000000 PM Greenwich Mean Time.

I want a timestamp to be converted into UUID format for testing (the tests accept only UUID format). I would then convert the UUID back to a time to measure some time difference.


There are a few steps here. The first is to convert the time-based UUID's timestamp (which is in 100s of nanoseconds from October 15, 1582) to one compatible with Java's date functionality (i.e. milliseconds from January 1, 1970). Notably, you asked for better than millisecond precision.

Next, we need to interpret that date into the correct time zone.

Finally, we need to format it into text in the desired format.

Here's the code:

// this is the difference between midnight October 15, 1582 UTC and midnight January 1, 1970 UTC as 100 nanosecond units
private static final long EPOCH_DIFFERENCE = 122192928000000000L;

private static final ZoneId GREENWICH_MEAN_TIME = ZoneId.of("GMT");

private static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
        .appendText(DAY_OF_WEEK, FULL)
        .appendLiteral(", ")
        .appendText(MONTH_OF_YEAR, FULL)
        .appendLiteral(' ')
        .appendValue(DAY_OF_MONTH)
        .appendLiteral(", ")
        .appendValue(YEAR, 4)
        .appendLiteral(" at ")
        .appendValue(CLOCK_HOUR_OF_AMPM)
        .appendLiteral(':')
        .appendValue(MINUTE_OF_HOUR, 2)
        .appendLiteral(':')
        .appendValue(SECOND_OF_MINUTE, 2)
        .appendLiteral('.')
        .appendFraction(NANO_OF_SECOND, 7, 7, false)
        .appendLiteral(' ')
        .appendText(AMPM_OF_DAY, FULL)
        .appendLiteral(' ')
        .appendZoneText(FULL)
        .toFormatter();

public static String formattedDateFromTimeBasedUuid(UUID uuid) {
    ZonedDateTime date = timeBasedUuidToDate(uuid);
    return FORMATTER.format(date);
}

public static ZonedDateTime timeBasedUuidToDate(UUID uuid) {
    if (uuid.version() != 1) {
        throw new IllegalArgumentException("Provided UUID was not time-based.");
    }
    // the UUID timestamp is in 100 nanosecond units;
    // convert that to nanoseconds
    long nanoseconds = (uuid.timestamp() - EPOCH_DIFFERENCE) * 100;
    long seconds = nanoseconds / 1000000000;
    long nanoAdjustment = nanoseconds % 1000000000;
    Instant instant = Instant.ofEpochSecond(seconds, nanoAdjustment);
    return ZonedDateTime.ofInstant(instant, GREENWICH_MEAN_TIME);
}
I'd drop those methods and constants in a utility class for convenient reuse.

A couple notes:

  • Lots of statically imported constants here. They come from java.time.format.TextStyle and java.time.temporal.ChronoField.
  • I used DateTimeFormatterBuilder instead of the more common DateTimeFormatter.ofPattern(String). I find it more readable and am willing to tolerate the resulting verbosity.
  • I tweaked one thing about your desired format: you asked for the time to be 2:06:30:001; this code yields 2:06:30.001 -- a decimal point between the seconds and milliseconds, rather than a colon. This is more correct, but if you prefer the colon, just change the corresponding .appendLiteral('.') to pass a colon instead.
  • You'll often find example code that defines DateTimeFormatters, ZoneIds, etc inline. These classes are thread-safe and reusable, so for best results, you should define them as constants, as I have done here. You'll get better performance and reduced memory usage.
  • Note that the DataStax driver's Uuids class uses the system clock's millisecond-precision value as input, so you're just going to see zeros in the last four positions, unless you implement your own nanosecond-based variant. You could do that using System.nanoTime(), but there are some complications -- check out the note on the JavaDoc for more.

To determine the amount of time between two ZonedDateTimes, you just do this:

Duration duration = Duration.between(date1, date2);

The Duration class has several useful methods you can use to interpret the result.
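Putting the conversion together with the UUID from the question — a self-contained sketch (EPOCH_DIFFERENCE is the same constant as above; the class and method names are illustrative):

```java
import java.time.Instant;
import java.util.UUID;

public class UuidToInstant {

    // 100-ns units between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01)
    private static final long EPOCH_DIFFERENCE = 122192928000000000L;

    static Instant toInstant(UUID uuid) {
        // uuid.timestamp() is in 100-ns units since 1582-10-15
        long hundredNanos = uuid.timestamp() - EPOCH_DIFFERENCE;
        long seconds = hundredNanos / 10_000_000;
        long nanoAdjustment = (hundredNanos % 10_000_000) * 100;
        return Instant.ofEpochSecond(seconds, nanoAdjustment);
    }

    public static void main(String[] args) {
        UUID u = UUID.fromString("f642f350-0230-11ea-a02f-597f2801796a");
        System.out.println(toInstant(u)); // 2019-11-08T14:06:30.149Z
    }
}
```

Note the sample UUID actually carries millisecond detail (.149), confirming the point above that the driver feeds millisecond-precision clock values into the timestamp.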


I would like to know a way to dynamically generate a timeUUID via a Beanshell pre-processor or post-processor in JMeter, for dynamic load testing that better fits a real-life scenario. I have tried adding the DataStax Cassandra driver jar to the /lib folder, but it complains about missing dependencies; it is apparently not a standalone jar. Any suggestion for a library that can generate a timeUUID would be appreciated.


Looking into Maven Central, the Cassandra driver has some dependencies which you need to have in JMeter's CLASSPATH as well.

Example steps (assume Apache Maven)

  1. Create an arbitrary folder somewhere
  2. Create pom.xml file in that folder with the following contents:

    <!-- groupId/artifactId of this wrapper pom are arbitrary -->
    <project xmlns="http://maven.apache.org/POM/4.0.0">
        <modelVersion>4.0.0</modelVersion>
        <groupId>example</groupId>
        <artifactId>cassandra-driver-deps</artifactId>
        <version>1.0</version>
        <dependencies>
            <dependency>
                <groupId>com.datastax.cassandra</groupId>
                <artifactId>cassandra-driver-core</artifactId>
                <version>2.1.10</version>
            </dependency>
        </dependencies>
    </project>
  3. Replace 2.1.10 with the driver version matching your Cassandra server

  4. In the folder created in step 1 execute the following command:

    mvn dependency:copy-dependencies
  5. Copy all the .jar files from the target/dependency folder to JMeter's CLASSPATH (i.e. to the "lib" folder of your JMeter installation)

  6. Restart JMeter to pick the jars up
  7. In Beanshell Test Elements use the following code:

    import com.datastax.driver.core.utils.UUIDs;
    UUID timeUUID = UUIDs.timeBased();
    String timeUUIDString = timeUUID.toString();
    vars.put("timeUUID", timeUUIDString);

See How to Use BeanShell: JMeter's Favorite Built-in Component for more information on using Beanshell in JMeter.


I have this Cassandra table:

CREATE TABLE xxx ( id timeuuid PRIMARY KEY);

and this class:

@Table(name = "xxx", schema = "yyy")
public class XXX {

    public UUID id;
}

Upon persisting, I get:

Exception in thread "main" com.impetus.kundera.KunderaException: java.lang.IllegalArgumentException: GenerationType.AUTO Strategy not supported by this client :com.impetus.client.cassandra.pelops.PelopsClient
    at com.impetus.kundera.persistence.EntityManagerImpl.persist(
    at Importer.exec(
    at Importer.main(
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(
    at java.lang.reflect.Method.invoke(
    at com.intellij.rt.execution.application.AppMain.main(
Caused by: java.lang.IllegalArgumentException: GenerationType.AUTO Strategy not supported by this client :com.impetus.client.cassandra.pelops.PelopsClient
    at com.impetus.kundera.persistence.IdGenerator.onAutoGenerator(
    at com.impetus.kundera.persistence.IdGenerator.generateAndSetId(
    at com.impetus.kundera.graph.ObjectGraphBuilder.getNode(
    at com.impetus.kundera.graph.ObjectGraphBuilder.getObjectGraph(
    at com.impetus.kundera.persistence.PersistenceDelegator.persist(
    at com.impetus.kundera.persistence.EntityManagerImpl.persist(
    ... 8 more

This is the relevant fragment of my pom.xml:

        <name>Kundera Public Repository</name>


From the logs:

GenerationType.AUTO Strategy not supported by this client com.impetus.client.cassandra.pelops.PelopsClient

I suggest you use the Thrift client. You can do this by changing the kundera.client.lookup.class property in persistence.xml to the following:

<property name="kundera.client.lookup.class" value="com.impetus.client.cassandra.thrift.ThriftClientFactory" />

Also, I suggest you use the latest version of Kundera-Cassandra.