Using Nats Jetstream as an event store #3772
a. Can we use NATS JetStream for Event Sourcing? Can it store millions of events without affecting the speed and performance of an application?
b. I read that we can archive the event logs to cold storage, but I have not come across an example that demonstrates how to move the logs to, let's say, S3 storage. Is this "cold storage" feature as powerful as Pulsar's Tiered Storage?
c. Are there any applications that are using it as an Event Store in production? If yes, any feedback?
Thank you all for making such great software and making it available to all of us.
Hi @cloudcompute. Answers in order:
a. The short answer is yes. A stream in NATS (JetStream) is very well suited for event sourcing, since every event can be published/appended to a subject that represents an aggregate/entity/consistency boundary of your choosing. For example, take a stream called orders with the subject orders.* bound to it: all publishes/appends to orders.1, orders.2, etc. would be appended to that stream. You can enforce optimistic concurrency at the stream level or at a per-subject level within a stream using a message header, Nats-Expected-Last-Sequence or Nats-Expected-Last-Subject-Sequence, respectively, whose value is the expected sequence. This allows for concurrent appends across subjects without contention and linearizability on a per-subject basis (the entity's event stream), while still giving a total order of events across all subjects within a stream for consumption. Subjects are indexed within a stream, so the OCC check does not add overhead, and a stream in general can grow as large as you have resources to support it. On replay, to build the state of the aggregate so it can accept a new event, a consumer filtered to a specific subject uses that index and only performs a linear scan over the blocks between the earliest and latest events for that subject. If applying CQRS, many consumers with optional subject-based filtering can be used to derive the desired read models.
b. Currently, tiered storage support is not built in as an option in a stream's retention policy. This has been discussed several times, but needs to be prioritized. Depending on whether snapshotting is being used in conjunction with event streams, a separate consumer that archives event blocks should be suitable. If there is a desire to have transparent infinite retention, then tiered storage would be ideal; otherwise this could be abstracted in Rita, for example.
c. There has certainly been an increasing amount of interest from the DDD/ES/CQRS community, but I don't have a list of folks using it in production today. To my knowledge @pavelnikolov has used NATS for event sourcing (based on his talk), but I will let him share his experience and whether it was in a production scenario.
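To make the per-subject check in answer (a) a bit more concrete, here is a small in-memory sketch of the semantics. It is not the NATS client API and it does not talk to a server; ToyStream, Event and WrongLastSequence are hypothetical names made up for illustration. It also assumes that an expected sequence of 0 means "no event on this subject yet".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class WrongLastSequence(Exception):
    """Raised when the caller's expected sequence does not match the subject."""


@dataclass
class Event:
    seq: int      # position in the stream-wide total order
    subject: str  # e.g. "orders.1"
    data: str


@dataclass
class ToyStream:
    """Toy model only: one append log plus a per-subject 'last sequence' index."""
    events: List[Event] = field(default_factory=list)
    last_by_subject: Dict[str, int] = field(default_factory=dict)

    def append(self, subject: str, data: str,
               expected_last_subject_seq: Optional[int] = None) -> int:
        # Optimistic concurrency: reject the append if another writer has
        # already added an event to this subject since the caller last read it.
        if expected_last_subject_seq is not None:
            actual = self.last_by_subject.get(subject, 0)  # assumption: 0 = empty
            if actual != expected_last_subject_seq:
                raise WrongLastSequence(
                    f"{subject}: expected {expected_last_subject_seq}, got {actual}")
        seq = len(self.events) + 1  # sequences are assigned stream-wide
        self.events.append(Event(seq, subject, data))
        self.last_by_subject[subject] = seq
        return seq

    def replay(self, subject: Optional[str] = None) -> List[Event]:
        # A real stream would use its subject index; the toy just filters.
        return [e for e in self.events if subject is None or e.subject == subject]


stream = ToyStream()
first = stream.append("orders.1", "OrderPlaced", expected_last_subject_seq=0)
stream.append("orders.2", "OrderPlaced", expected_last_subject_seq=0)
stream.append("orders.1", "OrderPaid", expected_last_subject_seq=first)
try:
    # A stale writer still believes `first` is the latest event for orders.1.
    stream.append("orders.1", "OrderCancelled", expected_last_subject_seq=first)
except WrongLastSequence as err:
    print("conflict:", err)
print([(e.seq, e.data) for e in stream.replay("orders.1")])
```

Appends to orders.1 and orders.2 never conflict with each other, yet every event still gets a single stream-wide sequence, which is the total order consumers see.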
What I will say is that all of these features that are required for a "proper event store" are being used in other use cases and there are many, many people using NATS/JetStream in production. If there is a question about scale or performance, the NATS team will address any concerns or questions you have.
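For anyone curious what the expected-sequence header looks like on the wire, below is a rough sketch that builds an HPUB frame from the NATS client protocol using only the standard library. The helper name hpub_frame, the subject and the payload are made up for illustration, and in practice a client library would set the Nats-Expected-Last-Subject-Sequence header and handle the publish acknowledgement for you.

```python
from typing import Dict


def hpub_frame(subject: str, payload: bytes, headers: Dict[str, str]) -> bytes:
    """Build one HPUB protocol frame (hypothetical helper, no reply subject)."""
    # The header block starts with a version line and ends with a blank line.
    header_block = b"NATS/1.0\r\n"
    for name, value in headers.items():
        header_block += f"{name}: {value}\r\n".encode()
    header_block += b"\r\n"
    # The control line carries the header size and the total (header + payload) size.
    total_size = len(header_block) + len(payload)
    control = f"HPUB {subject} {len(header_block)} {total_size}\r\n".encode()
    return control + header_block + payload + b"\r\n"


# Append to orders.1 only if its latest event is still at stream sequence 3
# (illustrative values).
frame = hpub_frame(
    "orders.1",
    b'{"type":"OrderPaid"}',
    {"Nats-Expected-Last-Subject-Sequence": "3"},
)
print(frame.decode())
```

If the sequence does not match, the append would be rejected and the writer would typically re-read the subject's events and retry.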
@bruth Thanks for the detailed write-up; this is a great response. ES/CQRS is certainly picking up as companies start thinking about moving to microservices. I'd like to try implementing NATS for event sourcing, though I know it would be a huge task to bring it into production. Just one more question: Kafka and Pulsar use BookKeeper/ZooKeeper to store the event data, which are optimised for real-time workloads. What does NATS use? Is it a home-grown key-value store? If yes, why not something built on top of a key-value store like RocksDB? Regards
Hi @bruth, same use case as OP here. Thanks for your elaborate answer and further explanations! Are there any limits on the maximum number of streams in JetStream, or on the maximum number of subscribers or concurrent consumers? Are there any other limits, bottlenecks, or potential issues that would stand in the way of linear scalability for a large-scale event sourcing use case?
Hi @Fizmath, thank you for sharing your work. I have a few concerns/questions about the architecture.
a. Each microservice has its own PostgreSQL. You are probably using CDC to stream the changes to NATS. I don't know whether you are using NATS (JetStream) as an event store and as the single source of truth. If yes, I don't think there is any need for an intermediary PostgreSQL; the events can be inserted into JetStream directly.
b. As Payment, Customers, etc. are microservices, what does the term "monolith" refer to?
c. Ideally, we should strive for an architecture where microservices can be built in any language of our choice, particularly if we are going event-driven, but I don't know exactly how we can achieve this with little development effort. I know we can use the NATS SDKs available for different languages, but I don't know whether all SDKs support features like backpressure. Is there any NATS proxy?
With Regards
Hi @cloudcompute. Answers in order:
a. The short answer is yes. A stream in NATS (JetStream) is very well suited for event sourcing since every event can be published/appended to a subject that represents an aggregate/entity/consistency boundary of your choosing. For example, a stream called orders with subjects orders.* bound to it. Then all publishes/appends to orders.1, orders.2, etc. would be appended to that stream. You can enforce optimistic concurrency on a stream-level or a per-subject level within a stream using a message header, Nats-Expected-Last-Sequence or Nats-Expected-Last-Subject-Sequence, respectively, whose value is the expected sequence. Of course, if the sequence in th…