This repository contains a list of open source distributed system projects, written in various programming languages, that may help you better understand how to build distributed services.
- (Golang) Jocko - a Kafka/distributed commit log service in Go. [Serf + Raft]
- (Golang) oklog - a distributed and coordination-free log management system for big ol' clusters [Archived]
- (Golang) elasticell - a distributed HA Redis-compatible NoSQL database with strong consistency and reliability
- (Erlang) CouchDB - a highly available, partition tolerant, eventually consistent document database. Supports master-master setups with automatic conflict detection.
- (Java) Apache HBase - a Hadoop database, a distributed, scalable, big data store. Useful when random, realtime read/write access to big data is needed
- (C++) Tair - a high-performance and high-availability distributed fast-access memory (MDB)/persistent (LDB) storage service
- (Golang) immudb - an immutable database based on zero trust, Key/Value & SQL, tamperproof, data change history
- (Rust) toydb - distributed SQL database in Rust, written as a learning project
- (Rust) DB3 Network - a decentralized Firebase Firestore alternative
- (Python) ZODB - an ACID transactional object-oriented database
- (Golang) requiemdb - a permanent storage for OTEL data
- (C) memcached - a high performance multithreaded event-based key/value cache store intended to be used in a distributed system
- (C) redis - an in-memory database with various value types that persists on disk
- (Rust) TiKV - a distributed transactional key-value database, originally created to complement TiDB
- (C++) leveldb - a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values
- (Golang) goleveldb - a LevelDB implementation in Golang
- (Golang) summitdb - an in-memory, NoSQL key/value database. It persists to disk, uses the Raft consensus algorithm, is ACID compliant, and is built on a transactional and strongly-consistent model. It supports custom indexes, geospatial data, JSON documents, and user-defined JS scripting
- (Python) pupdb - a simple file-based key-value database
- (Python) pickledb - an open source key-value store using Python's json module
- (C++) KeyDB - a faster drop-in multithreaded alternative to Redis
- (C++) Dragonfly - an in-memory data store fully compatible with the Redis and Memcached APIs and designed using modern algorithms
- (Golang) BadgerDB - an embeddable, persistent and fast key-value (KV) database written in pure Go
- (Golang) BuntDB - a low-level, in-memory, key/value store in pure Go. It persists to disk, is ACID compliant, and uses locking for multiple readers and a single writer. It supports custom indexes and geospatial data.
- (Rust) ConstDB - a Redis-like cache store that implements CRDTs and active-active replication
- (Golang) GhostDB - a distributed, in-memory, general purpose key-value data store that delivers microsecond performance at any scale
- (Dart) Hive - a lightweight and blazing fast key-value database written in pure Dart. Inspired by Bitcask
- (Golang) rosedb - a fast, stable, and embedded NoSQL database based on Bitcask that supports a variety of data structures such as string, list, hash, set, and sorted set
- (Rust) PumpkinDB - an immutable Ordered Key-Value Database Engine
- (Golang) FlashDB - a simple, in-memory, key/value store in pure Go. It persists to disk, is ACID compliant, and uses locking for multiple readers and a single writer. It supports Redis-like operations for data structures like SET, SORTED SET, HASH and STRING
- (PHP) Lazer - a PHP flat file database based on JSON files
- (Golang) Scribble - a tiny JSON database in Golang
- (Golang) FlyDB - a high-performance KV storage engine based on the Bitcask paper that supports the Redis protocol and the corresponding data structures
- (Rust) Engula - a distributed key-value store, used as a cache, database, and storage engine
- (Golang) Dice - an extremely simple Golang-based in-memory KV store that speaks Redis dialect
- (Golang) CockroachDB - a distributed fault-tolerant SQL database built on a transactional and strongly-consistent key-value store
- (C++) YugabyteDB - a cloud native distributed SQL database for mission-critical applications
- (Golang) RQLite - a lightweight, distributed relational database, which uses SQLite as its storage engine
- (Golang) Kingbus - a distributed MySQL binlog store based on Raft [Raft]
- (C++) YDB - an open-source distributed SQL database that combines high availability and scalability with strict consistency and ACID transactions
- (Golang) RadonDB - an open source, Cloud-native MySQL database for unlimited scalability and performance
- (C++) MongoDB - document database designed for ease of development and scaling
- (Golang) FerretDB - a proxy that converts MongoDB 6.0+ wire protocol queries to SQL, using PostgreSQL as a database engine
- (C#) LiteDB - NoSQL Document Store in a single data file
- (Python) tinydb - a lightweight document oriented database written in pure Python
- (PHP) SleekDB - a simple flat file NoSQL-like database implemented in PHP without any third-party dependencies that stores data in plain JSON files
- (Rust) BonsaiDB - an ACID, transactional KV or document dev-friendly database with configurable delayed on-disk data storing
- (Golang) CloverDB - a lightweight document-oriented NoSQL database written in pure Golang
- (Java) neo4j - Graph Database
- (Python) edgedb - a graph-relational database
- (C++) nebula - a distributed, fast open-source graph database featuring horizontal scalability and high availability
- (Golang) EliasDB - a graph-based lightweight database
- (Golang) VictoriaMetrics - fast, cost-effective monitoring solution and time series database
- (Golang) influxdb - scalable datastore for metrics, events, and real-time analytics
- (Java) trino - fast distributed SQL query engine for big data analytics
- (Java) Apache Doris - an easy-to-use, high performance and unified analytics database
- (Scala) FiloDB - Distributed, Prometheus-compatible, real-time, in-memory, massively scalable, multi-schema time series / event / operational database
- (Rust) ceresdb - high-performance, distributed, schema-less, cloud native time-series database that can handle both time-series and analytics workloads
- (Golang) tstorage - a lightweight local on-disk storage engine for time-series data with a straightforward API
- (Rust) CnosDB - a high-performance, high-compression, and easy-to-use open-source distributed time-series database. Used in fields such as IoT, industrial internet, connected cars, and IT operations
- (Golang) LinDB - a scalable, high performance, high availability distributed time series database
- (Java) Apache Cassandra - a highly-scalable partitioned row store. Rows are organized into tables with a required primary key
- (C++) scylladb - a real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB
- (Golang) FrostDB - an embeddable wide-column columnar database written in Go
- (Golang) SpiceDB - a Google Zanzibar-inspired database system for creating and managing security-critical application permissions
- (Golang) Keto - a Google Zanzibar-inspired open source permission database that ships gRPC and REST APIs, NewSQL support, and an easy, granular permission language. Supports ACL and RBAC
- (C++) BaikalDB - a distributed HTAP MySQL-compatible database designed for petabyte scale
- (Golang) AresDB - a GPU-powered real-time analytics storage and query engine
- (Rust) Qdrant - a vector similarity search engine and vector database
- (Golang) milvus - an open-source vector database built to power embedding similarity search and AI applications
- (Golang) Weaviate - an open source vector database that stores both objects and vectors
- (Golang) tobias-mayer/vector-db - a simple vector database that can be used to search for similar vectors in logarithmic time
- (Rust) DANNY - a decentralized vector database for building vector search applications
- (Golang) Glide - an open reliable fast LLM/model gateway for rapid development of GenAI apps
- (Golang) Traefik - a cloud-native app proxy
- (Lua) Kong - a cloud-native feature-rich API gateway
- (Golang) Skipper - an HTTP router and reverse proxy for service composition
- (Golang) janus - a lightweight API gateway and management platform
- (Golang) Lura - an ultra-performant API gateway with middleware
- (Python) MLflow Gateway - an LLM proxy
- (Golang) etcd - distributed reliable key-value store for the most critical data of a distributed system [Raft + gRPC]
- (Java) Apache Zookeeper - highly reliable distributed coordination
- (Golang) chubby - a (very simplified) implementation of Chubby, Google's distributed lock service
- (Java) Kafka - a distributed, highly scalable, elastic, fault-tolerant, and secure event streaming platform
- (Python) faust - a distributed stream processing library that ports the ideas from Kafka Streams to Python
- (Golang) Liftbridge - lightweight, fault-tolerant message streams implemented as a durable stream augmentation for the NATS messaging system
- (Rust) RisingWave - a distributed SQL database for stream processing, designed to reduce the complexity and cost of building real-time applications
- (Golang) dkron - a distributed, fault tolerant job scheduling system for cloud native environments
- (Python) Celery - a distributed task queue
- (Python) Apache Airflow - a platform to programmatically author, schedule, and monitor workflows
- (Golang) nsq - realtime fault tolerant distributed messaging platform designed to operate at scale, handling billions of messages per day
- (Golang) Sandglass - distributed, horizontally scalable, persistent, time ordered message queue
- (Golang) dnpipes - distributed version of Unix named pipes comparable to AWS SQS
- (PHP) GatewayWorker - distributed realtime messaging framework based on workerman
- (C++) ZeroMQ - abstraction of asynchronous message queues, multiple messaging patterns, message filtering (subscriptions), seamless access to multiple transport protocols and more
- (Java) Apache Pulsar - distributed pub-sub messaging platform with a very flexible messaging model and an intuitive client API
- (Java) Apache ActiveMQ - high performance Apache 2.0 licensed Message Broker
- (Java) Elasticsearch - distributed, RESTful search and analytics engine
- (Java) Apache Lucene - a high-performance, full-featured text search engine library
- (Rust) MeiliSearch - a lightning-fast, ultra-relevant, and typo-tolerant search engine
- (JS) FlexSearch - memory-flexible full-text search library
- (Golang) RiotSearch - distributed, simple and efficient full text search engine
- (C++) Typesense - fast, typo-tolerant, fuzzy search engine
- (Rust) Sonic - fast, lightweight & schema-less search backend. An alternative to Elasticsearch that runs on a few MBs of RAM
- (Golang) JuiceFS - a Hadoop-compatible, AWS S3-compatible, high-performance POSIX file system
- (Golang) SeaweedFS - a simple, highly scalable, Hadoop-compatible and AWS S3-compatible distributed file system
- (C) GlusterFS - distributed storage that can scale to several petabytes
- (C++) LizardFS - highly reliable, scalable and efficient distributed file system. It spreads data over a number of physical servers, making it visible to an end user as a single file system.
- (Golang) go-micro - framework for distributed systems development. Provides the core requirements for distributed systems development including RPC and event-driven communication
- (Golang) ergo - port of Erlang/OTP approaches in Golang
- (Golang) gosiris - an actor framework for Golang
- (Python) cotyledon - a framework for defining long-running services. It provides handling of Unix signals, spawning of workers, supervision of children processes, daemon reloading, sd-notify, rate limiting for worker spawning, and more.
- (Java) atomix - fully featured framework for building fault-tolerant distributed systems [REST + Raft]
- (Kotlin) orbit - virtual actor framework for building distributed systems
- (JS) hemera - A Node.js microservices toolkit for the NATS messaging system [RPC]
- (Python) Tooz - a coordination API that centralizes the most common distributed primitives, such as group membership, lock service, and leader election, to help developers build distributed applications
- (C++) Nebula - powerful framework for building highly concurrent, distributed, and resilient message-driven applications
- (Golang) Service Weaver - a framework that lets you write an application as a modular binary and deploy it as a set of microservices
- (Golang) Dapr - portable, serverless, event-driven runtime that works as a sidecar and makes it easy for developers to build resilient, stateless and stateful microservices
- (Golang) Dragonboat - a high performance multi-group Raft consensus library in pure Go
- (Golang) Golimit - Uber ringpop based distributed and decentralized rate limiter
- (Python) Tenacity - general-purpose retrying library
- (Elixir) ex_hash_ring - pure Elixir consistent hash ring implementation based on the excellent C hash-ring lib
- (Elixir) raft - Raft consensus implementation
- (C++) NuRaft - Raft implementation derived from the cornerstone project
- (Python) Hyx - lightweight fault tolerance primitives for your resilient and modern Python microservices
- (Python) Migdalor - Kubernetes-native peer discovery for Python asyncio nodes
- (Golang) skiplist - a Golang implementation of the skiplist data structure
- (Java) Waltz - a quorum-based distributed write-ahead log for replicating transactions
- awesome-scalability - Reading list for illustrating the patterns of scalable, reliable, and performant large-scale systems
- awesome-distributed-systems - a curated list of awesome material on distributed systems
- awesome-database-learning - a list of learning materials to understand database internals
- (C/C++) (Book) Build Your Own Redis with C/C++
- (C) (Article) Writing a sqlite clone from scratch in C
- Berkeley CS186: Introduction to Database Systems
- MIT 6.830: Database Systems