Difficulty: nightmare · Status: upcoming

Distributed Key-Value Store

Build a distributed key-value store using the Raft consensus algorithm with leader election, log replication, and fault tolerance.

120 min limit · 0 participants
Tags: systems · distributed · networking · consensus
PRD
# Distributed Key-Value Store PRD

## Overview
Build a distributed key-value store from scratch that uses the Raft consensus algorithm to achieve fault-tolerant data replication across a cluster of nodes with leader election, log replication, and linearizable reads.
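As a sketch of what the inter-node protocol might look like, the two core Raft RPC types from the paper can be modeled as plain dataclasses (field names follow the Raft paper's definitions; the transport and serialization are left open, and all identifiers here are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class RequestVote:
    # Sent by a candidate to gather votes during an election.
    term: int            # candidate's current term
    candidate_id: str    # node requesting the vote
    last_log_index: int  # index of the candidate's last log entry
    last_log_term: int   # term of the candidate's last log entry

@dataclass
class AppendEntries:
    # Sent by the leader to replicate log entries.
    # An empty `entries` list doubles as a heartbeat.
    term: int
    leader_id: str
    prev_log_index: int  # index of the entry preceding the new ones
    prev_log_term: int   # term of that preceding entry
    entries: list = field(default_factory=list)
    leader_commit: int = 0  # leader's commit index

# A heartbeat is simply an AppendEntries with no entries:
heartbeat = AppendEntries(term=2, leader_id="n1", prev_log_index=10, prev_log_term=2)
```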

## Requirements
- Raft consensus implementation: leader election with randomized election timeouts to prevent split votes
- Raft log replication: leader replicates log entries to followers, advances commit index on majority acknowledgment
- Raft term management: monotonically increasing terms, step down on higher term discovery
- Cluster of 3 or more nodes communicating over TCP or HTTP
- Key-value operations: GET (read), SET (write), DELETE (remove), LIST (all keys)
- Linearizable reads: all reads go through the leader to guarantee consistency
- Client redirect: followers receiving client requests redirect or forward to the current leader
- Automatic leader election when the current leader fails or becomes unreachable
- Log compaction via snapshotting: periodically snapshot state and truncate the log
- Membership change support: dynamically add or remove nodes from the cluster
- Persistent state per node: current term, voted-for, and log entries survive process restart
- Write-ahead log per node for durability of Raft log entries
- Handle network partitions gracefully: partitioned minority cannot elect a leader or commit writes
- Split-brain prevention: require a majority quorum for all leader elections and log commits
- Heartbeat mechanism: leader sends periodic heartbeats to maintain authority and prevent elections
- Request forwarding: followers forward write requests to the leader transparently
- TTL (time-to-live) support on keys: keys automatically expire after a configurable duration
- Batch operations: SET and DELETE multiple keys in a single atomic request
- CLI client for interacting with the cluster: connect to any node, issue commands
- Metrics endpoint exposing: current leader ID, current term, commit index, log length, cluster member list
- Graceful node shutdown: node notifies cluster before leaving, and can rejoin and catch up
- Throughput: sustain at least 1,000 operations per second through the leader while followers replicate
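Two of the requirements above have a compact core: randomized election timeouts keep nodes from timing out simultaneously, and the leader advances its commit index to the highest log index replicated on a majority. A minimal sketch of both (the 150–300 ms range and the `match_index` map are illustrative choices, not mandated by the PRD):

```python
import random

def election_timeout(base_ms: int = 150, spread_ms: int = 150) -> float:
    # Randomized timeout in [base, base + spread) ms prevents split votes:
    # followers rarely time out at the same instant, so one candidate
    # usually gathers a majority before the others start elections.
    return (base_ms + random.random() * spread_ms) / 1000.0

def majority_commit_index(match_index: dict, leader_last_index: int) -> int:
    # The leader tracks the highest replicated index per follower in
    # match_index, and counts itself via its own last log index.
    # Sorting descending, the value at position n//2 is held by a
    # majority of the cluster, so it is safe to commit up to there.
    indices = sorted(list(match_index.values()) + [leader_last_index],
                     reverse=True)
    return indices[len(indices) // 2]
```

For a 3-node cluster where one follower has replicated up to index 5 and the other only to 3, the leader (at index 5) may commit through 5, since two of three nodes hold that entry.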

## Tech Stack
- TypeScript/Node.js, Go, Rust, Python, or Java
- Raw TCP sockets or HTTP for inter-node communication
- No distributed systems libraries (no etcd, no ZooKeeper, no existing Raft implementations)
- File system for persistent state

## Scoring Criteria
- **Functional (40%)**: Leader election works, log replication works, key-value operations return correct results, survives single node failure
- **Quality (20%)**: Correct Raft implementation (no split-brain, no lost writes), clean architecture, handles edge cases
- **Fidelity (25%)**: All features implemented including snapshotting, membership changes, TTL, batch ops, metrics, graceful shutdown
- **Speed (15%)**: Time bonus

Battle Stats

Time Limit: 120 min
Participants: 0
Status: upcoming

Rules

  • AI-assisted coding tools only -- no manual edits
  • Stay within the time limit
  • Scoring based on correctness, code quality, and speed
  • Session must be recorded via the CLI