Tutorial
Published on: October 21, 2025
Author: pbdecoder.online
What is CBOR? Complete Guide to Concise Binary Object Representation
Comprehensive guide to CBOR (Concise Binary Object Representation) - a binary data serialization format similar to Protocol Buffers, with examples and comparisons

Overview

CBOR (Concise Binary Object Representation) is a binary data serialization format originally defined in RFC 7049 and now specified by RFC 8949, which obsoletes it. Like Protocol Buffers, CBOR is designed to be compact, fast, and suitable for constrained environments. While CBOR data itself is binary and not human-readable, it has a diagnostic notation that can represent CBOR data in a human-readable form for debugging and documentation purposes.

What is CBOR?

Basic Definition

CBOR is a binary encoding format that represents structured data in a compact way. It's designed to be:

  • Concise: Typically smaller than the equivalent JSON or XML
  • Fast: Quick to encode and decode
  • Self-describing: No schema required
  • Extensible: Supports custom data types
  • Interoperable: Works across different platforms and languages
  • Binary format: Not human-readable (except diagnostic notation)

Key Features

Feature           CBOR                              JSON               Protocol Buffers
Schema Required   No                                No                 Yes
Human Readable    No (Binary)                       Yes                No
Size Efficiency   High                              Low                Very High
Parsing Speed     Fast                              Medium             Very Fast
Data Types        Rich (8 major types, plus tags)   Limited (6 types)  Rich (custom)

CBOR Data Types

CBOR supports a rich set of data types organized into major types:

Major Types

// Major Type 0: Unsigned integers (0-23, 24-255, 16-bit, 32-bit, 64-bit)
0, 1, 23, 24, 255, 65535, 4294967295

// Major Type 1: Negative integers
-1, -24, -256, -65536

// Major Type 2: Byte strings
h'48656c6c6f'  // "Hello" in hex

// Major Type 3: Text strings
"Hello, World!"

// Major Type 4: Arrays
[1, 2, 3, "hello", true]

// Major Type 5: Maps (objects)
{"name": "John", "age": 30, "active": true}

// Major Type 6: Semantic tags
1(1609459200)  // Unix timestamp tag

// Major Type 7: Floats, simple values, break
true, false, null, undefined, 3.14159
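
To make the integer ranges for major types 0 and 1 concrete, here is a minimal Python sketch of how such values could be encoded by hand. The function name encode_int is an illustrative assumption and not part of any library; the byte layout follows RFC 8949, where a negative integer n is stored as the unsigned value -1 - n.

```python
import struct

def encode_int(value: int) -> bytes:
    """Encode a Python int as CBOR major type 0 (>= 0) or major type 1 (< 0)."""
    if value >= 0:
        major, arg = 0, value
    else:
        major, arg = 1, -1 - value  # a negative n is stored as -1 - n
    if arg < 24:
        # Small arguments live directly in the initial byte
        return bytes([(major << 5) | arg])
    if arg <= 0xFF:
        return bytes([(major << 5) | 24, arg])
    if arg <= 0xFFFF:
        return bytes([(major << 5) | 25]) + struct.pack(">H", arg)
    if arg <= 0xFFFFFFFF:
        return bytes([(major << 5) | 26]) + struct.pack(">I", arg)
    if arg <= 0xFFFFFFFFFFFFFFFF:
        return bytes([(major << 5) | 27]) + struct.pack(">Q", arg)
    raise OverflowError("values beyond 64 bits need a bignum tag (2 or 3)")

for n in (0, 23, 24, 255, 65535, 4294967295, -1, -24, -256, -65536):
    print(n, encode_int(n).hex())
```

The shortest form that fits is chosen at each step, which is why 23 takes one byte while 24 takes two.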

CBOR Diagnostic Notation

While CBOR data is stored in binary format and is not human-readable, the CBOR specification defines a diagnostic notation that provides a human-readable representation of CBOR data. This notation is primarily used for:

  • Documentation: Explaining CBOR data structures in specifications
  • Debugging: Understanding the content of CBOR messages during development
  • Testing: Writing test cases with readable CBOR data representations

Diagnostic Notation Examples

// Binary CBOR data (hex): 0x83010203
// Diagnostic notation: [1, 2, 3]

// Binary CBOR data (hex): 0xA26161016162820203
// Diagnostic notation: {"a": 1, "b": [2, 3]}

// Binary CBOR data with semantic tags
// Diagnostic notation: 1(1609459200)  // Unix timestamp
// Diagnostic notation: 32("https://example.com")  // URI tag
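
The mapping from hex bytes to diagnostic notation can be reproduced with a very small decoder. The sketch below is an assumption-laden illustration, not a full implementation: decode_item is a hypothetical helper that handles only small unsigned integers, text strings, arrays, and maps, which is enough for the first two examples. It prints the result with json.dumps because, for these particular values, JSON text and diagnostic notation look the same.

```python
import json

def decode_item(data: bytes, pos: int = 0):
    """Decode one CBOR item starting at pos; return (value, next_pos).

    Covers only small unsigned ints, text strings, arrays and maps.
    """
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:
        arg = info
    elif info == 24:
        arg = data[pos]
        pos += 1
    else:
        raise NotImplementedError("longer arguments are omitted in this sketch")
    if major == 0:  # unsigned integer
        return arg, pos
    if major == 3:  # text string of arg bytes
        return data[pos:pos + arg].decode("utf-8"), pos + arg
    if major == 4:  # array of arg items
        items = []
        for _ in range(arg):
            item, pos = decode_item(data, pos)
            items.append(item)
        return items, pos
    if major == 5:  # map of arg key/value pairs
        result = {}
        for _ in range(arg):
            key, pos = decode_item(data, pos)
            result[key], pos = decode_item(data, pos)
        return result, pos
    raise NotImplementedError(f"major type {major} is omitted in this sketch")

for hex_string in ("83010203", "A26161016162820203"):
    value, _ = decode_item(bytes.fromhex(hex_string))
    print(hex_string, "->", json.dumps(value))
```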

Important Note

The diagnostic notation is NOT the actual CBOR format - it's just a human-readable way to represent what the binary CBOR data contains. When working with CBOR in applications, you're always dealing with the compact binary representation.

CBOR vs Other Formats

Size Comparison Example

Let's compare the same data in different formats:

// JSON (59 bytes when minified)
{
  "name": "Alice",
  "age": 25,
  "active": true,
  "scores": [95, 87, 92]
}

// CBOR (40 bytes) - Diagnostic notation (human-readable representation)
{
  "name": "Alice",
  "age": 25,
  "active": true,
  "scores": [95, 87, 92]
}

// Actual CBOR binary data (hex):
// A4646E616D6565416C69636563616765181966616374697665F5667363...

// Protocol Buffers (≈20 bytes with schema)
// Requires .proto definition
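
The byte counts for a record like this can be checked without installing anything. The following Python sketch compares minified JSON against a hand-rolled CBOR encoding; the helpers head and encode are hypothetical names written for this illustration and cover only the types that appear in the record.

```python
import json

def head(major: int, arg: int) -> bytes:
    """Initial byte plus argument; only arguments below 256 are handled."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    if arg < 256:
        return bytes([(major << 5) | 24, arg])
    raise NotImplementedError("larger arguments are omitted in this sketch")

def encode(value) -> bytes:
    """Encode bools, small non-negative ints, str, list and dict as CBOR."""
    if value is True:
        return b"\xf5"
    if value is False:
        return b"\xf4"
    if isinstance(value, int) and value >= 0:
        return head(0, value)
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return head(3, len(raw)) + raw
    if isinstance(value, list):
        return head(4, len(value)) + b"".join(encode(v) for v in value)
    if isinstance(value, dict):
        body = b"".join(encode(k) + encode(v) for k, v in value.items())
        return head(5, len(value)) + body
    raise TypeError(f"unsupported value: {value!r}")

record = {"name": "Alice", "age": 25, "active": True, "scores": [95, 87, 92]}
as_json = json.dumps(record, separators=(",", ":")).encode("utf-8")
as_cbor = encode(record)
print(len(as_json), len(as_cbor))  # 59 40
print(as_cbor.hex())
```

Most of the saving comes from dropping quotes, colons, and commas; the keys themselves are still sent as text, which is where a schema-based format such as Protocol Buffers gains its additional advantage.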

Working with CBOR

Encoding Example (JavaScript)

const cbor = require('cbor');

// Data to encode
const data = {
  name: "Alice",
  age: 25,
  active: true,
  scores: [95, 87, 92],
  timestamp: new Date()
};

// Encode to CBOR
const encoded = cbor.encode(data);
console.log('CBOR bytes:', encoded.length);
console.log('CBOR hex:', encoded.toString('hex'));

// Decode from CBOR
const decoded = cbor.decode(encoded);
console.log('Decoded:', decoded);

Encoding Example (Python)

import cbor2
import datetime

# Data to encode
data = {
    'name': 'Alice',
    'age': 25,
    'active': True,
    'scores': [95, 87, 92],
    # cbor2 rejects naive datetimes unless a default timezone is configured
    'timestamp': datetime.datetime.now(datetime.timezone.utc)
}

# Encode to CBOR
encoded = cbor2.dumps(data)
print(f'CBOR bytes: {len(encoded)}')
print(f'CBOR hex: {encoded.hex()}')

# Decode from CBOR
decoded = cbor2.loads(encoded)
print(f'Decoded: {decoded}')

Streaming Example

const cbor = require('cbor');
const fs = require('fs');

// Create a CBOR encoder stream
const encoder = new cbor.Encoder();
const output = fs.createWriteStream('data.cbor');

encoder.pipe(output);

// Stream multiple objects
encoder.write({id: 1, name: "Alice"});
encoder.write({id: 2, name: "Bob"});
encoder.write({id: 3, name: "Charlie"});
encoder.end();

// Read back with a decoder stream once the file has been fully written
output.on('finish', () => {
  const decoder = new cbor.Decoder();
  const input = fs.createReadStream('data.cbor');

  input.pipe(decoder);

  decoder.on('data', (obj) => {
    console.log('Decoded object:', obj);
  });
});

CBOR Binary Format Structure

Basic Structure

CBOR uses a simple encoding scheme where each data item starts with an initial byte:

Initial Byte = Major Type (3 bits) + Additional Information (5 bits)

Bits: 7 6 5 | 4 3 2 1 0
      ------+----------
      Major | Additional
      Type  | Information
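
Splitting an initial byte into its two fields takes one shift and one mask. The short Python sketch below applies this to a few initial bytes; split_initial_byte is a hypothetical helper name chosen for the illustration.

```python
def split_initial_byte(byte: int) -> tuple[int, int]:
    """Return (major type, additional information) for a CBOR initial byte."""
    return byte >> 5, byte & 0x1F

for b in (0x18, 0x64, 0x83, 0xA1, 0xF5):
    major, info = split_initial_byte(b)
    print(f"0x{b:02X}: major type {major}, additional info {info}")
```

Additional information values 0 to 23 carry the argument directly, 24 to 27 announce that 1, 2, 4, or 8 argument bytes follow, and 31 marks an indefinite-length item or the "break" stop code.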

Encoding Examples

// Positive integer 42
// Major type 0, additional info 24 (1-byte follows)
0x18, 0x2A

// Text string "CBOR"
// Major type 3, length 4
0x64, 0x43, 0x42, 0x4F, 0x52

// Array [1, 2, 3]
// Major type 4, length 3, then elements
0x83, 0x01, 0x02, 0x03

// Map {"a": 1}
// Major type 5, length 1, then key-value pairs
0xA1, 0x61, 0x61, 0x01

Advanced CBOR Features

Semantic Tags

CBOR supports semantic tags for special data types:

const cbor = require('cbor');

// Common semantic tags
const taggedData = {
  // Tag 0: Standard date/time string
  datetime: new cbor.Tagged(0, "2023-12-25T10:30:00Z"),
  
  // Tag 1: Epoch-based date/time
  timestamp: new cbor.Tagged(1, 1703505000),
  
  // Tag 2: Positive bignum (byte string holding 2^32)
  bigint: new cbor.Tagged(2, Buffer.from([0x01, 0x00, 0x00, 0x00, 0x00])),
  
  // Tag 21: Byte string that should become base64url if converted to JSON
  base64url: new cbor.Tagged(21, Buffer.from("Hello World")),
  
  // Tag 32: URI
  uri: new cbor.Tagged(32, "https://example.com")
};
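
At the byte level a tag is just one more head placed in front of the item it describes. The Python sketch below shows this for two cases under stated assumptions: encode_epoch_tag and bignum_value are hypothetical helpers, the timestamp is assumed to fit in 32 bits, and the 4-byte integer form is always used.

```python
import struct

def encode_epoch_tag(seconds: int) -> bytes:
    """Tag 1 (0xC1) wrapping a 32-bit unsigned integer (0x1A + 4 bytes)."""
    if not 0 <= seconds < 2**32:
        raise ValueError("this sketch only handles 32-bit timestamps")
    return b"\xc1\x1a" + struct.pack(">I", seconds)

def bignum_value(content: bytes) -> int:
    """Value of a tag 2 (positive bignum) byte string: big-endian unsigned."""
    return int.from_bytes(content, "big")

print(encode_epoch_tag(1609459200).hex())  # c11a5fee6600
print(bignum_value(bytes([0x01, 0x00, 0x00, 0x00, 0x00])))  # 4294967296
```

A decoder that does not recognize a tag can still parse the enclosed item, because the tag does not change how the item itself is encoded.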

Indefinite-Length Items

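CBOR supports indefinite-length arrays, maps, and strings. The item is opened without a length and closed with a "break" stop code (0xFF), so an encoder can start streaming before it knows how many elements will follow:

const cbor = require('cbor');

// Indefinite-length array [1, 2, 3, 4, 5], framed by hand:
// 0x9f opens the array, 0xff (the "break" stop code) closes it
const indefiniteArray = Buffer.from([0x9f, 0x01, 0x02, 0x03, 0x04, 0x05, 0xff]);

// Indefinite-length map: 0xbf opens the map, 0xff closes it
const indefiniteMap = Buffer.concat([
  Buffer.from([0xbf]),
  cbor.encode("key1"), cbor.encode("value1"),
  cbor.encode("key2"), cbor.encode("value2"),
  Buffer.from([0xff])
]);

// A conforming decoder reads both like their definite-length equivalents
console.log(cbor.decode(indefiniteArray), cbor.decode(indefiniteMap));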

Use Cases and Applications

IoT and Constrained Devices

const cbor = require('cbor');

// Sensor data transmission
const sensorData = {
  deviceId: "sensor-001",
  temperature: 23.5,
  humidity: 65.2,
  battery: 87,
  timestamp: Date.now()
};

// CBOR is ideal for IoT due to small size
const cborData = cbor.encode(sensorData);
// Transmit over LoRaWAN, NB-IoT, etc.

Web APIs

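// Express.js middleware for CBOR
const express = require('express');
const cbor = require('cbor');

const app = express();

app.use('/api/cbor', (req, res, next) => {
  // req.is() also matches when the header carries parameters
  if (req.is('application/cbor')) {
    const chunks = [];
    req.on('data', chunk => chunks.push(chunk));
    req.on('end', () => {
      try {
        req.body = cbor.decode(Buffer.concat(chunks));
        next();
      } catch (err) {
        // Malformed or truncated CBOR: reject instead of crashing
        res.status(400).send('Invalid CBOR payload');
      }
    });
  } else {
    next();
  }
});

// API endpoint
app.post('/api/cbor/data', (req, res) => {
  // Process CBOR data (processData is a placeholder for your own logic)
  const result = processData(req.body);
  
  // Respond with CBOR
  res.setHeader('Content-Type', 'application/cbor');
  res.send(cbor.encode(result));
});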

Configuration Files

const fs = require('fs');
const cbor = require('cbor');

// config.cbor - Binary configuration
const config = {
  server: {
    host: "localhost",
    port: 8080,
    ssl: true
  },
  database: {
    url: "mongodb://localhost:27017",
    options: {
      maxPoolSize: 10,
      serverSelectionTimeoutMS: 5000
    }
  },
  features: {
    authentication: true,
    logging: true,
    metrics: false
  }
};

// Save as CBOR
fs.writeFileSync('config.cbor', cbor.encode(config));

// Load CBOR config
const loadedConfig = cbor.decode(fs.readFileSync('config.cbor'));

Performance Considerations

Encoding Performance

const Benchmark = require('benchmark');
const cbor = require('cbor');
const suite = new Benchmark.Suite();

const testData = {
  users: Array.from({length: 1000}, (_, i) => ({
    id: i,
    name: `User ${i}`,
    email: `user${i}@example.com`,
    active: i % 2 === 0,
    scores: [Math.random() * 100, Math.random() * 100]
  }))
};

suite
  .add('JSON.stringify', () => {
    JSON.stringify(testData);
  })
  .add('CBOR.encode', () => {
    cbor.encode(testData);
  })
  .on('complete', function() {
    console.log('Fastest is ' + this.filter('fastest').map('name'));
  })
  .run();

Memory Usage

// Memory-efficient streaming for large datasets
const stream = require('stream');
const cbor = require('cbor');

class CBORProcessor extends stream.Transform {
  constructor() {
    super({ objectMode: true });
  }
  
  _transform(chunk, encoding, callback) {
    try {
      // Process each CBOR object
      const processed = this.processObject(chunk);
      this.push(cbor.encode(processed));
      callback();
    } catch (error) {
      callback(error);
    }
  }
  
  processObject(obj) {
    // Your processing logic here
    return obj;
  }
}

Best Practices

1. Choose Appropriate Data Types

// Good: Use appropriate numeric types
const data = {
  count: 42,           // Small integer
  price: 19.99,        // Float
  id: BigInt(123456789012345)  // Big integer
};

// Avoid: Everything as strings
const badData = {
  count: "42",         // Should be number
  price: "19.99",      // Should be number
  id: "123456789012345"  // Could be BigInt
};

2. Use Semantic Tags

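// Good: Use semantic tags for special types
const metadataBuffer = Buffer.from('{"source":"web"}');  // placeholder payload

const eventData = {
  eventId: "evt-123",
  timestamp: new cbor.Tagged(1, Math.floor(Date.now() / 1000)),
  location: new cbor.Tagged(32, "https://maps.example.com/location/123"),
  // Tag 21 wraps the raw bytes; it marks them for base64url conversion later
  metadata: new cbor.Tagged(21, metadataBuffer)
};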

3. Handle Errors Gracefully

function safeCBORDecode(buffer) {
  try {
    return cbor.decode(buffer);
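  } catch (error) {
    // The exact message for truncated input depends on the library and its version;
    // check what your decoder actually throws before relying on this match
    if (error.message.includes('Unexpected end of CBOR data')) {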
      console.error('Incomplete CBOR data received');
      return null;
    }
    throw error;
  }
}

Common Pitfalls

1. Indefinite Length Confusion

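// Wrong: putting a marker value inside an ordinary array.
// Indefinite length is a property of the encoding, not an array element.
const wrongArray = [cbor.BREAK, 1, 2, 3];  // Don't do this

// Right: frame the elements with 0x9f ... 0xff, or simply keep the default
// definite-length encoding. Deterministic (canonical) encoding forbids
// indefinite-length items entirely.
const rightArray = Buffer.concat([
  Buffer.from([0x9f]),
  cbor.encode(1), cbor.encode(2), cbor.encode(3),
  Buffer.from([0xff])
]);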

2. Tag Misuse

// Wrong: Using wrong tag for data type
const wrongDate = new cbor.Tagged(2, "2023-12-25");  // Tag 2 is for bignums

// Right: Correct tag for date
const rightDate = new cbor.Tagged(0, "2023-12-25T00:00:00Z");

Conclusion

CBOR is an excellent choice for applications that need:

  • Compact binary encoding without schema requirements
  • Rich data type support beyond JSON's limitations
  • Fast parsing in constrained environments
  • Self-describing format for flexible data exchange
  • Streaming capabilities for large datasets

While Protocol Buffers might be more efficient for high-performance applications with stable schemas, CBOR offers a great balance of efficiency, flexibility, and ease of use, making it ideal for IoT, web APIs, and configuration files.
