1.1. UUIDv1

Fabio Lima edited this page Dec 20, 2024 · 5 revisions

Time-based UUID

The Time-based UUID has a timestamp and a node identifier. These two pieces of information represent WHEN and WHERE the UUID was created.

The timestamp is the count of 100-nanosecond intervals since 1582-10-15, the beginning of the Gregorian calendar.

The node identifier can be a MAC address, a hash of system information, a specific number or a random number (default).

💡 HINT
What are UUIDs?: https://playfulprogramming.com/posts/what-are-uuids#UUIDv1

See the section Choosing the node identifier.

// with a static random number as node identifier
UUID uuid = UuidCreator.getTimeBased();
// with a MAC address as node identifier
UUID uuid = UuidCreator.getTimeBasedWithMac();
// with a hash of hostname, MAC and IP as node identifier
UUID uuid = UuidCreator.getTimeBasedWithHash();
// with a changing random number as node identifier
UUID uuid = UuidCreator.getTimeBasedWithRandom();

Sequence of time-based UUIDs:

0edd7640-8eff-11e9-8649-972f32b091a1
0edd7641-8eff-11e9-8649-972f32b091a1
0edd7642-8eff-11e9-8649-972f32b091a1
0edd7643-8eff-11e9-8649-972f32b091a1
0edd7644-8eff-11e9-8649-972f32b091a1
0edd7645-8eff-11e9-8649-972f32b091a1
0edd7646-8eff-11e9-8649-972f32b091a1
0edd7647-8eff-11e9-8649-972f32b091a1
0edd7648-8eff-11e9-8649-972f32b091a1
0edd7649-8eff-11e9-8649-972f32b091a1
0edd764a-8eff-11e9-8649-972f32b091a1
0edd764b-8eff-11e9-8649-972f32b091a1
0edd764c-8eff-11e9-8649-972f32b091a1
0edd764d-8eff-11e9-8649-972f32b091a1
0edd764e-8eff-11e9-8649-972f32b091a1
0edd764f-8eff-11e9-8649-972f32b091a1
       ^ look

|-----------------|----|-----------|
     timestamp    clkseq  node id
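
The three fields marked above can be read back with the standard `java.util.UUID` accessors, which are defined for version 1 UUIDs. The following is a minimal sketch using only the JDK; the class name `DecodeTimeBased` and the helper `toInstant` are hypothetical and not part of this library.

```java
import java.time.Instant;
import java.util.UUID;

final class DecodeTimeBased {

    // Number of 100-nanosecond intervals between 1582-10-15 and 1970-01-01
    static final long GREGORIAN_OFFSET = 0x01B21DD213814000L;

    // Converts the 60-bit UUID timestamp into a java.time.Instant
    static Instant toInstant(UUID uuid) {
        long ticks = uuid.timestamp() - GREGORIAN_OFFSET; // 100-ns ticks since the Unix epoch
        return Instant.ofEpochSecond(ticks / 10_000_000L, (ticks % 10_000_000L) * 100L);
    }

    public static void main(String[] args) {
        UUID uuid = UUID.fromString("0edd7640-8eff-11e9-8649-972f32b091a1");
        System.out.println("version:   " + uuid.version());
        System.out.println("timestamp: " + toInstant(uuid));
        System.out.println("clock seq: 0x" + Integer.toHexString(uuid.clockSequence()));
        System.out.println("node id:   0x" + Long.toHexString(uuid.node()));
    }
}
```

Note that `clockSequence()` returns only the 14 clock sequence bits, so the `8649` group is reported as `0x649` once the variant bits are removed.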

Other usage examples:

// Time-based with a date and time chosen by you
Instant myInstant = Instant.parse("1987-01-23T01:23:45.123456789Z");
UUID uuid = UuidCreator.getTimeBased(myInstant, null, null);
// Time-based with a clock sequence chosen by you
Integer myClockSeq = 0xAAAA; // Override the random clock sequence
UUID uuid = UuidCreator.getTimeBased(null, myClockSeq, null);
// Time-based with a node identifier chosen by you
Long myNode = 0xAAAAAAAAAAAAL; // Override random node identifier
UUID uuid = UuidCreator.getTimeBased(null, null, myNode);

Implementation

The Time-based UUID has three parts: timestamp, clock-sequence and node identifier.

Time-based UUID structure

 00000000-0000-v000-m000-000000000000
|1-----------------|2---|3-----------|

1: timestamp
2: clock-sequence
3: node identifier

All these conditions must be true for a UUID collision to occur:

  1. The same timestamp;
    • it has millisecond precision;
    • the millisecond is combined with a counter to produce a 100-nanos timestamp;
    • the counter is a random number between 0 and 9,999;
    • the counter is incremented whenever the millisecond repeats.
  2. The same clock sequence;
    • it is a random number between 0 and 2^14-1;
    • it is incremented if the timestamp repeats;
    • it is unique within a class loader scope.
  3. The same node identifier;
    • it is a number between 0 and 2^48-1;
    • it can be one of these options:
      • a static random number (default);
      • a host's MAC address;
      • a hash of hostname, MAC and IP;
      • a changing random number;
      • a specific number chosen by you.

Timestamp

The timestamp is a value that represents date and time. It has 3 subparts: low timestamp, middle timestamp, high timestamp.

Standard timestamp arrangement

 00000000-0000-v000-m000-000000000000
|1-------|2---|3---|

1: timestamp low      *
2: timestamp mid     ***
3: timestamp high   *****

In a version 1 UUID the timestamp fields are rearranged: the lowest 32 bits come first, followed by the middle 16 bits, and the highest 12 bits come last, sharing a field with the version number. The standard timestamp resolution is one ten-millionth of a second (100 nanoseconds).

The timestamp is the number of 100-nanosecond intervals since 1582-10-15. Since the timestamp has 60 bits (unsigned), the greatest date and time that can be represented is 5236-03-31T21:21:00.684Z. As a rough check, 2^60/10^7/60/60/24/365.25 ≈ 3,653 years and 1582 + 3,653 = 5235; the exact date falls in 5236 because the count starts in October 1582.
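
The upper bound can be checked with `java.time`. This is a small worked example rather than library code; the class name `MaxTimestamp` is hypothetical.

```java
import java.time.Instant;

final class MaxTimestamp {

    // Start of the Gregorian calendar, the epoch of the UUID timestamp
    static final Instant GREGORIAN_EPOCH = Instant.parse("1582-10-15T00:00:00Z");

    // Returns the instant that corresponds to the largest 60-bit timestamp
    static Instant maxInstant() {
        long maxTicks = (1L << 60) - 1;                  // largest unsigned 60-bit value
        long seconds = maxTicks / 10_000_000L;           // whole seconds
        long nanos = (maxTicks % 10_000_000L) * 100L;    // remaining 100-ns ticks as nanoseconds
        return GREGORIAN_EPOCH.plusSeconds(seconds).plusNanos(nanos);
    }

    public static void main(String[] args) {
        System.out.println(maxInstant()); // an instant on 5236-03-31
    }
}
```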

In this implementation, the timestamp has millisecond precision, that is, it uses System.currentTimeMillis() to get the current time in milliseconds. An internal counter is used to simulate the standard timestamp resolution of 10 million intervals per second.

You can create a function that implements the TimeFunction interface if you don't like the default timestamp algorithm. Examples:

// with timestamp provided by a custom function
MyTimeFunction function = new MyTimeFunction();
TimeBasedFactory factory = TimeBasedFactory.builder()
        .withTimeFunction(function)
        .build();
// with timestamp provided by a lambda expression
TimeBasedFactory factory = TimeBasedFactory.builder()
        .withTimeFunction(() -> TimeFunction.toTimestamp(Instant.now()))
        .build();

Counter

The counter starts at a random number between 0 and 9,999.

Every time a request is made the counter is incremented by 1.

The timestamp is calculated with this formula: MILLISECONDS * 10,000 + COUNTER.
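
The formula can be sketched as a small function. This sketch assumes that MILLISECONDS is the Unix time returned by `System.currentTimeMillis()` and that the shift to the 1582-10-15 epoch is applied in the same step; where the library applies that shift is not stated here. The class and method names are hypothetical.

```java
final class CounterTimestamp {

    // Number of 100-nanosecond intervals between 1582-10-15 and 1970-01-01
    static final long GREGORIAN_OFFSET = 0x01B21DD213814000L;

    // Number of 100-nanosecond intervals in one millisecond
    static final long TICKS_PER_MILLI = 10_000L;

    // MILLISECONDS * 10,000 + COUNTER, shifted to the Gregorian epoch (assumption)
    static long toTimestamp(long unixMillis, int counter) {
        if (counter < 0 || counter >= TICKS_PER_MILLI) {
            throw new IllegalArgumentException("counter must be between 0 and 9,999");
        }
        return unixMillis * TICKS_PER_MILLI + counter + GREGORIAN_OFFSET;
    }

    public static void main(String[] args) {
        System.out.println(toTimestamp(System.currentTimeMillis(), 1234));
    }
}
```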

Overrun

RFC-4122 says:

   If a system overruns the generator by requesting too many UUIDs
   within a single system time interval, the UUID service MUST either
   return an error, or stall the UUID generator until the system clock
   catches up.

   Note: If the processors overrun the UUID generation frequently,
   additional node identifiers can be allocated to the system, which
   will permit higher speed allocation by making multiple UUIDs
   potentially available for each time stamp value.

If the counter reaches the maximum of 10,000 within a single millisecond, the generator waits for the next millisecond.
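
The stalling behavior can be illustrated with a simplified sketch. It is not the library's implementation: the class `StallingTimestamp` is hypothetical, the counter starts at 0 instead of a random number to keep the example deterministic, and the clock is injected so the overrun can be simulated.

```java
import java.util.function.LongSupplier;

final class StallingTimestamp {

    static final int TICKS_PER_MILLI = 10_000;

    private final LongSupplier clock; // returns the current time in milliseconds
    private long lastMillis = -1;
    private int used = 0;             // values already handed out in lastMillis

    StallingTimestamp(LongSupplier clock) {
        this.clock = clock;
    }

    synchronized long next() {
        long now = clock.getAsLong();
        if (now <= lastMillis && used >= TICKS_PER_MILLI) {
            // Overrun: all 10,000 values of this millisecond are taken,
            // so stall until the clock moves to the next millisecond.
            do {
                Thread.yield();
                now = clock.getAsLong();
            } while (now <= lastMillis);
        }
        if (now > lastMillis) {
            lastMillis = now;
            used = 0;
        }
        return lastMillis * TICKS_PER_MILLI + used++;
    }

    public static void main(String[] args) {
        StallingTimestamp generator = new StallingTimestamp(System::currentTimeMillis);
        System.out.println(generator.next());
    }
}
```

Stalling is one of the two behaviors the RFC allows; returning an error would be the other.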

You probably don't have to worry if your application doesn't reach the theoretical limit of 10 million UUIDs per second per node (10k/ms/node). But if the overrun occurs frequently, more than one time-based factory can be used to permit multiple UUIDs for each timestamp value. See an example in this GitHub Gist: LessBlockingFactory.

Clock sequence

The clock sequence helps to avoid duplicates. It comes into play when the system clock appears to have gone backwards.

The most significant bits of the clock sequence field are multiplexed with the variant number of RFC-4122, which leaves 14 bits for the clock sequence itself. Because of that, it has a range from 0 to 16383 (0x0000 to 0x3FFF). This value is increased by 1 if more than one request is made by the system at the same timestamp or if the timestamp goes backwards. In other words, it is a counter that is incremented whenever the timestamp repeats or may be repeated.
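
A minimal sketch of this rule follows. The class `ClockSeqSketch` is hypothetical and much simpler than the library's default function; in particular, wrapping back to 0 after 0x3FFF is an assumption of this sketch.

```java
final class ClockSeqSketch {

    private long lastTimestamp = Long.MIN_VALUE;
    private int clockSeq;

    ClockSeqSketch(int initial) {
        this.clockSeq = initial & 0x3FFF; // keep only the 14 clock sequence bits
    }

    // Returns the clock sequence to use with the given timestamp
    synchronized int next(long timestamp) {
        if (timestamp <= lastTimestamp) {
            // The timestamp repeated or went backwards: move to the next value
            clockSeq = (clockSeq + 1) & 0x3FFF;
        }
        lastTimestamp = timestamp;
        return clockSeq;
    }

    // Combines the 14-bit clock sequence with the RFC-4122 variant bits (binary 10)
    static int withVariant(int clockSeq) {
        return (clockSeq & 0x3FFF) | 0x8000;
    }

    public static void main(String[] args) {
        ClockSeqSketch sketch = new ClockSeqSketch(0x0649);
        System.out.println(Integer.toHexString(withVariant(sketch.next(1L)))); // 8649
    }
}
```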

The DefaultClockSeqFunction uses a pool to ensure that each instance of this function receives a unique clock sequence value in the JVM. This prevents more than one instance in the same JVM from sharing the same clock sequence at any time.

You can also use a custom function that implements the ClockSeqFunction interface if you want to control the clock sequence yourself. Examples:

// with clock sequence provided by a custom function
MyClockSeqFunction function = new MyClockSeqFunction();
TimeBasedFactory factory = TimeBasedFactory.builder()
        .withClockSeqFunction(function)
        .build();
// with clock sequence provided by a lambda expression
Random random = new Random();
TimeBasedFactory factory = TimeBasedFactory.builder()
        .withClockSeqFunction((timestamp) -> ClockSeqFunction.toExpectedRange(random.nextInt()))
        .build();

Node identifier

In this library the node identifier is generated by a secure random generator by default. Alternatively you can use an IEEE 802 MAC address or a system data hash as node identifier.

The effective way to avoid collisions is to ensure that each generator has its own node identifier. As a suggestion, a device ID managed by your app can be used if you don't want a node identifier based on a random number, a MAC address or a system data hash.
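
A random node identifier can be sketched with the JDK alone. RFC-4122 recommends setting the multicast bit on node identifiers that are not real MAC addresses, so that they cannot clash with an address from a network card. The sketch below follows that recommendation; whether this library does exactly the same is an assumption, and the class name is hypothetical.

```java
import java.security.SecureRandom;

final class RandomNodeSketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    // Least significant bit of the first octet of the 48-bit node identifier
    static final long MULTICAST_BIT = 0x010000000000L;

    // Returns a random 48-bit number with the multicast bit set
    static long randomNode() {
        return (RANDOM.nextLong() & 0xFFFFFFFFFFFFL) | MULTICAST_BIT;
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(randomNode()));
    }
}
```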

You can also create your own function that implements the NodeIdFunction interface. Examples:

// with node identifier provided by a custom function
MyNodeIdFunction function = new MyNodeIdFunction();
TimeBasedFactory factory = TimeBasedFactory.builder()
        .withNodeIdFunction(function)
        .build();
// with node identifier provided by a lambda expression
Random random = new Random();
TimeBasedFactory factory = TimeBasedFactory.builder()
        .withNodeIdFunction(() -> random.nextLong())
        .build();

Hardware address

The hardware address node identifier is the MAC address associated with the hostname. If that MAC address can't be found, it is the first MAC address that is up and running. If no MAC is found, it is a random number.
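
The lookup order described above could be approximated with `java.net.NetworkInterface`. This is only a sketch under assumptions: the class `HardwareNodeSketch` is hypothetical, loopback interfaces are skipped, and every lookup failure falls through to the next step.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.security.SecureRandom;
import java.util.Collections;

final class HardwareNodeSketch {

    // Packs a 6-byte MAC address into the lower 48 bits of a long
    static long macToLong(byte[] mac) {
        long node = 0;
        for (byte b : mac) {
            node = (node << 8) | (b & 0xFF);
        }
        return node & 0xFFFFFFFFFFFFL;
    }

    static long nodeId() {
        try {
            // 1. The MAC address associated with the host name
            NetworkInterface nic = NetworkInterface.getByInetAddress(InetAddress.getLocalHost());
            byte[] mac = (nic == null) ? null : nic.getHardwareAddress();
            if (mac != null && mac.length == 6) {
                return macToLong(mac);
            }
        } catch (Exception e) {
            // fall through to the next step
        }
        try {
            // 2. The first MAC address that is up and running
            for (NetworkInterface nic : Collections.list(NetworkInterface.getNetworkInterfaces())) {
                byte[] mac = nic.getHardwareAddress();
                if (nic.isUp() && !nic.isLoopback() && mac != null && mac.length == 6) {
                    return macToLong(mac);
                }
            }
        } catch (Exception e) {
            // fall through to the next step
        }
        // 3. A random number
        return new SecureRandom().nextLong() & 0xFFFFFFFFFFFFL;
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(nodeId()));
    }
}
```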

System data hash

The system data hash is calculated from a list of system properties: hostname, MAC and IP. This information is collected and passed to a SHA-256 message digest. The node identifier is the first 6 bytes of the resulting hash.
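
The hashing step can be sketched as follows. How the library formats and joins the three values before hashing is not described here, so joining them with a space is an assumption, and the class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class SystemDataHashSketch {

    // Hashes the system data and keeps the first 6 bytes as a 48-bit number
    static long nodeId(String hostname, String mac, String ip) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            String data = String.join(" ", hostname, mac, ip); // assumption: space-separated
            byte[] hash = sha256.digest(data.getBytes(StandardCharsets.UTF_8));
            long node = 0;
            for (int i = 0; i < 6; i++) {
                node = (node << 8) | (hash[i] & 0xFF);
            }
            return node;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(nodeId("myhost", "00:11:22:33:44:55", "192.168.0.10")));
    }
}
```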

Host name

The hostname is searched in the environment variables "HOSTNAME" (Linux) and "COMPUTERNAME" (Windows). If those variables are not found, it tries to look up the hostname by calling InetAddress.getLocalHost().getHostName().
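
A sketch of this lookup order, using only the JDK. The class `HostNameSketch` and the helper `firstNonEmpty` are hypothetical, and returning null when no host name can be resolved is an assumption of this sketch.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class HostNameSketch {

    // Returns the first candidate that is neither null nor empty, or null
    static String firstNonEmpty(String... candidates) {
        for (String candidate : candidates) {
            if (candidate != null && !candidate.isEmpty()) {
                return candidate;
            }
        }
        return null;
    }

    static String hostname() {
        // Environment variables first: HOSTNAME (Linux), then COMPUTERNAME (Windows)
        String name = firstNonEmpty(System.getenv("HOSTNAME"), System.getenv("COMPUTERNAME"));
        if (name != null) {
            return name;
        }
        try {
            // Fall back to a lookup through the network stack
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return null; // no host name available
        }
    }

    public static void main(String[] args) {
        System.out.println(hostname());
    }
}
```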

Choosing the node identifier

This library allows you to manage the node identifier for each machine by defining the system property uuidcreator.node or the environment variable UUIDCREATOR_NODE. The system property has priority over the environment variable. If no property or variable is defined, the node identifier is randomly chosen.

These options are accepted:

  • The string "mac" to use a MAC address;
  • The string "hash" to use a hash of hostname, MAC and IP;
  • The string "random" to use a random number that always changes;
  • The string representation of a specific number between 0 and 2^48-1.

The accepted number formats are: decimal, hexadecimal, and octal.
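
The precedence rule and the number formats can be illustrated with a short sketch. It uses `Long.decode`, which accepts decimal, hexadecimal (`0x` prefix) and octal (leading `0`) strings; whether the library parses the value this way is an assumption, and the class `NodeSettingSketch` is hypothetical.

```java
final class NodeSettingSketch {

    // The system property has priority over the environment variable
    static String resolve(String systemProperty, String environmentVariable) {
        if (systemProperty != null && !systemProperty.isEmpty()) {
            return systemProperty;
        }
        if (environmentVariable != null && !environmentVariable.isEmpty()) {
            return environmentVariable;
        }
        return null; // nothing defined: the node identifier is chosen randomly
    }

    // Reads the setting from the running JVM
    static String current() {
        return resolve(System.getProperty("uuidcreator.node"), System.getenv("UUIDCREATOR_NODE"));
    }

    // Parses a decimal, hexadecimal or octal string into a 48-bit number
    static long parseNumber(String value) {
        return Long.decode(value) & 0xFFFFFFFFFFFFL;
    }

    public static void main(String[] args) {
        System.out.println(current());
        System.out.println(Long.toHexString(parseNumber("0xC0DA0615BB23"))); // c0da0615bb23
    }
}
```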

  • Defining a system property:
# Append one of these examples to VM arguments

# Use a MAC address
-Duuidcreator.node="mac"

# Use a hash of hostname, MAC and IP
-Duuidcreator.node="hash"

# Use a random number that always changes
-Duuidcreator.node="random"

# Use a specific number
-Duuidcreator.node="0xC0DA0615BB23"
  • Defining an environment variable:
# Append one of these examples to /etc/environment or ~/.profile

# Use a MAC address
export UUIDCREATOR_NODE="mac"

# Use a hash of hostname, MAC and IP
export UUIDCREATOR_NODE="hash"

# Use a random number that always changes
export UUIDCREATOR_NODE="random"

# Use a specific number
export UUIDCREATOR_NODE="0xC0DA0615BB23"

More examples of how to define the environment variable:

# Append one of these examples to ~/.profile

# Use `hostid` command
export UUIDCREATOR_NODE=0x`hostid`

# Use `machine-id` file
export UUIDCREATOR_NODE=0x`cut -c-12 /etc/machine-id`

# Use the MD5 hash of hostname
export UUIDCREATOR_NODE=0x`hostname | md5sum | cut -c-12`

# Use MAC returned by `ifconfig`
export UUIDCREATOR_NODE=0x`ifconfig eth0 | egrep -o 'ether [0-9a-f:]+' | cut -c7- | tr -d ':'`

# Use IP returned by `ifconfig` in hexadecimal 
export UUIDCREATOR_NODE=0x`ifconfig eth0 | egrep -o 'inet [0-9\.]+' | cut -c6- | tr '.' ' ' | xargs printf '%02x'`

# Use the SHA-256 hash of hostname, MAC and IP
export UUIDCREATOR_NODE=0x`echo -n $(hostname) \
                           $(ifconfig eth0 | egrep -o 'ether [0-9a-f:]+' | cut -c7-) \
                           $(ifconfig eth0 | egrep -o 'inet [0-9\.]+' | cut -c6-) \
                           | sha256sum | cut -c-12`

More examples

A key generator that makes substitution easy if necessary:

package com.example;

import java.util.UUID;
import com.github.f4b6a3.uuid.UuidCreator;

public class UuidGenerator {
    public static UUID generate() {
        return UuidCreator.getTimeBased();
    }
}
    UUID uuid = UuidGenerator.generate();

A key generator that hides the complexity behind its generation method:

package com.example;

import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;
import com.github.f4b6a3.uuid.factory.rfc4122.TimeBasedFactory;

public class UuidGenerator {

    // a single shared factory whose node identifier is a changing random number
    private static final TimeBasedFactory FACTORY = TimeBasedFactory.builder()
        .withNodeIdFunction(() -> ThreadLocalRandom.current().nextLong())
        .build();

    public static UUID generate() {
        return FACTORY.create();
    }
}
    UUID uuid = UuidGenerator.generate();

A less-blocking factory that wraps an array of time-based factories to generate more than 10 million UUIDs per second:

package com.example;

import java.util.UUID;
import com.github.f4b6a3.uuid.factory.NoArgsFactory;
import com.github.f4b6a3.uuid.factory.rfc4122.TimeBasedFactory;

/**
 * A less-blocking factory that wraps an array of factories.
 * 
 * It can be used to generate UUIDs with less thread contention.
 * 
 * It can generate more than 10 million time-based UUIDs per second.
 */
public class LessBlockingFactory implements NoArgsFactory {

    private final NoArgsFactory[] factories;

    public LessBlockingFactory(int length) {
        factories = new NoArgsFactory[length];
        for (int i = 0; i < factories.length; i++) {
            factories[i] = new TimeBasedFactory();
        }
    }

    @Override
    public UUID create() {
        // calculate the factory index given the current thread ID
        final int index = (int) (Thread.currentThread().getId() % factories.length);
        return factories[index].create();
    }
}
    // instantiate a less-blocking factory with an array of 8 factories
    LessBlockingFactory factory = new LessBlockingFactory(8);

    // use the less-blocking factory
    UUID uuid = factory.create();

A utility that generates time-based UUIDs for uploaded files, transforming the file name into a node identifier:

package com.example;

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Instant;
import java.util.UUID;

import com.github.f4b6a3.uuid.UuidCreator;

/**
 * A utility that generates time-based UUIDs for uploaded files.
 * 
 * The upload date is transformed into the UUID's timestamp.
 * 
 * The file name or URL is transformed into the UUID's node identifier.
 * 
 * This example is inspired by a use case found on GitHub.
 */
public final class UploadId {

    private UploadId() {
    }

    public static UUID getUploadId(Instant date, String name) {
        final long node = getNameHash(name);
        return UuidCreator.getTimeBased(date, 0, node);
    }

    private static long getNameHash(String name) {
        try {
            MessageDigest hasher = MessageDigest.getInstance("MD5");
            byte[] md5 = hasher.digest(name.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.wrap(md5).getLong();
        } catch (NoSuchAlgorithmException e) {
            throw new InternalError("MD5 not supported", e);
        }
    }
}
    Instant date = Instant.now();         // the file creation date
    String name = "My uploaded file.jpg"; // the file name or URL
    UUID uuid = UploadId.getUploadId(date, name);