h12.io/sej provides composable components of distributed, persisted message queue and allows trading off between reliablilty, latency and throughput with minimal devops overhead.

Package Organization

  • h12.io/sej: writer, scanner and offset
    • shard: sharding
    • hub: copying across machines
    • cmd/sej: command line tool

SEJ Directory

[root-dir]/
    [sej-dir]/
        jnl.lck
        jnl/
            0000000000000000.jnl
            000000001f9e521e.jnl
            ......
        ofs/
            reader1.ofs
            reader1.lck
            reader2.ofs
            reader2.lck
            ......

Journal File format

segment_file = { message }                          .
message      = offset timestamp type key value size .
offset       = uint64                               .
timestamp    = int64                                .
type         = uint8                                .
key          = key_size { uint8 }                   .
key_size     = int8                                 .
value        = value_size { uint8 }                 .
value_size   = int32                                .
size         = int32                                .

All integers are written in the big endian format.

name description
offset the position of the message in the queue
timestamp the timestamp represented in nanoseconds since Unix Epoch
type an int8 value that could be used to indicate the type of the message
key the encoded key
value the encoded value
size the size of the whole message including itself, allowing reading backward

Writer

  • Append from the last offset in segmented journal files
  • File lock to prevent other writers from opening the journal files
  • Startup corruption detection & truncation

Scanner

  • Read from an offset in segmented journal files
  • Change monitoring
    • directory
    • file append
  • Handle incomplete last message
  • Truncation detection & fail fast
  • Timeout

Offset

  • First/last offset
  • Offset persistence

Sharding

[root-dir]/
    [shard0]/
    [shard1]/
    ......

Each shard directory is a SEJ directory with a name in the form of [prefix].[shard-bit].[shard-index].

  • prefix must satisfy [a-zA-Z0-9_-]*
  • when prefix is empty, [prefix]. including the dot is omitted
  • shard-bit: 1, 2, …, 9, a
  • shard-index: 000, 001, …, 3ff

Hub

[root-dir]/
    [client-id0].[shard0]/
    [client-id1].[shard0]/
    ......

client-dir is the SEJ directory name belonging to a client.