Commit 965c2e7: 129 changed files with 23,708 additions and 0 deletions.
@@ -0,0 +1,4 @@ | ||
# Sphinx build info version 1 | ||
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done. | ||
config: 516b136fe673602ffad261b5e36d4c20 | ||
tags: 645f666f9bcd5a90fca523b33c5a78b7 |

===============
Snuba Consumers
===============

Summary
-------

* ``snuba consumer`` is still the easiest way to deploy a consumer for a new
  dataset, as the message processor can be written in Python.

* ``snuba rust-consumer --use-rust-processor`` can be used to serve the needs
  of high-throughput datasets, as it sidesteps all problems with Python
  concurrency. For that, the message processor needs to be written in Rust.

* ``snuba rust-consumer --use-python-processor`` is an experiment that attempts
  to consolidate both consumers' logic, so that we can eventually remove the
  code beneath ``snuba consumer``. It is not deployed anywhere in prod.

* In order to port a dataset fully to Rust, register a new message processor in
  ``rust_snuba/src/processors/mod.rs`` and add it to
  ``tests/consumers/test_message_processors.py``. Later, use
  ``RustCompatProcessor`` to get rid of the Python implementation.

Message processors
------------------

Each storage in Snuba defines a conversion function mapping the layout of a
Kafka message to a list of ClickHouse rows.

Message processors are defined in :py:mod:`snuba.dataset.processors`, and
need to subclass from
:py:class:`snuba.dataset.processors.DatasetMessageProcessor`. Just by
subclassing, their name becomes available for reference in
``snuba/datasets/configuration/*/storages/*.yaml``.
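
For orientation, here is a minimal sketch of what such a processor can look
like. The class name, the row fields, and the exact ``process_message``
signature and return type are illustrative assumptions (the import path is the
one mentioned above), not a copy of a real Snuba processor:

.. code-block:: python

    from typing import Any, Mapping, Optional, Sequence

    from snuba.dataset.processors import DatasetMessageProcessor  # path as documented above

    class MyThingProcessor(DatasetMessageProcessor):
        """Hypothetical processor: turns one Kafka payload into ClickHouse rows."""

        def process_message(
            self, message: Mapping[str, Any], metadata: Any
        ) -> Optional[Sequence[Mapping[str, Any]]]:
            # Map the incoming payload onto the columns of the storage's table.
            row = {
                "project_id": message["project_id"],
                "timestamp": message["timestamp"],
                "value": message.get("value", 0),
            }
            # The consumer batches these rows into larger INSERT statements.
            return [row]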

Python consumers
----------------

``snuba consumer`` runs the Python consumer. It uses the Python message
processor mentioned above to convert Kafka messages into rows, batches those
rows up into larger ``INSERT`` statements, and sends them off to the cluster
defined in ``settings.py``.
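
The batching behaviour can be pictured with a small, self-contained sketch.
This is not Snuba's actual consumer code (which is built on arroyo); it only
illustrates the convert, batch, and insert flow described above, with made-up
names and a fixed batch size:

.. code-block:: python

    from typing import Any, Callable, Dict, Iterable, List

    Row = Dict[str, Any]

    def run_batched_inserts(
        messages: Iterable[Any],
        process: Callable[[Any], List[Row]],   # the message processor
        insert: Callable[[List[Row]], None],   # sends one INSERT to ClickHouse
        batch_size: int = 1000,
    ) -> None:
        batch: List[Row] = []
        for message in messages:
            # Convert one Kafka payload into zero or more ClickHouse rows.
            batch.extend(process(message))
            if len(batch) >= batch_size:
                insert(batch)  # one large INSERT instead of many small ones
                batch = []
        if batch:
            insert(batch)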

Test endpoints
--------------

In Sentry we have a lot of tests that want to insert into ClickHouse. Tests
have certain requirements that our Kafka consumers can't meet and which don't
apply to production:

* They require strong consistency, as they want to run a handful of queries +
  assertions, then insert a few rows, wait for them to be inserted, then run
  some more assertions depending on that new data.

* Because tests wait for every insert synchronously, insertion latency is
  really important, while throughput isn't.

* Basically, people want to write e2e tests involving Snuba similarly to how
  tests involving relational databases are written with the Django ORM.

Every storage can be inserted into and wiped using HTTP as well, using the
endpoints defined in ``snuba.web.views`` prefixed with ``/tests/``. Those
endpoints use the same message processors as the Python consumer, but there's
no batching at all. One HTTP request gets directly translated into a blocking
``INSERT`` statement against ClickHouse.
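
As a rough illustration, a test might talk to these endpoints roughly like
this. The host/port, URL layout, and payload shape are assumptions made for
the example, not a documented contract; check ``snuba.web.views`` for the real
routes:

.. code-block:: python

    import json
    import urllib.request

    SNUBA = "http://127.0.0.1:1218"  # assumed local Snuba test instance

    def post(path: str, payload: object) -> None:
        req = urllib.request.Request(
            SNUBA + path,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)

    # Hypothetical routes: insert a row synchronously, then wipe the storage.
    post("/tests/outcomes/insert", [{"org_id": 1, "outcome": 0}])
    post("/tests/outcomes/drop", {})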

Rust consumers
--------------

``snuba rust-consumer`` runs the Rust consumer. It comes in two flavors, "pure
Rust" and "hybrid":

* ``--use-rust-processor`` ("pure Rust") will attempt to find and load a Rust
  version of the message processor. There is a mapping from Python class names
  like ``QuerylogProcessor`` to the relevant function, defined in
  ``rust_snuba/src/processors/mod.rs``. If that function exists, it is used.
  The resulting consumer is sometimes 20x faster than the Python version.

  If a Rust port of the message processor can't be found, the consumer silently
  falls back to the second flavor:

* ``--use-python-processor`` ("hybrid") will use the Python message processor from
  within Rust. For this mode, no dataset-specific logic has to be ported to
  Rust, but at the same time the performance benefits of using a Rust consumer
  are negligible.

Python message processor compat shims
-------------------------------------

Even when a dataset is processed with 100% Rust in prod (i.e. the pure-Rust
consumer is being used), we still have those ``/tests/`` API endpoints, as
there is a need for testing, and so there still needs to be a Python
implementation of the same message processor. For this purpose
``RustCompatProcessor`` can be used as a base class that delegates all logic
back into Rust (see the sketch after the list below). This means that:

* ``snuba consumer`` will connect to Kafka using Python, but process messages
  in Rust (using Python's multiprocessing).
* ``snuba rust-consumer --use-python-processor`` makes no sense to deploy
  anywhere, as it will connect to Kafka using Rust, then perform a roundtrip
  through Python only to call Rust business logic again.
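
A minimal sketch of such a shim is below. The import path and the constructor
argument (the name under which the Rust implementation is registered) are
assumptions about the API; the point is only that the Python class keeps its
name, so YAML configs and tests keep working while the actual logic runs in
Rust:

.. code-block:: python

    from snuba.dataset.processors.rust_compat import RustCompatProcessor  # import path assumed

    class QuerylogProcessor(RustCompatProcessor):
        """Python-side stand-in; processing is delegated to the Rust port."""

        def __init__(self) -> None:
            # Assumed: the base class looks up the Rust processor registered
            # under this name in rust_snuba/src/processors/mod.rs.
            super().__init__("QuerylogProcessor")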

Testing
-------

When porting a message processor to Rust, we validate equivalence to Python by:

1. Registering the Python class in
   ``tests/consumers/test_message_processors.py``, where it will run all
   payloads from ``sentry-kafka-schemas`` against both message processors and
   assert the same rows come back. If there is missing test coverage, it's
   preferred to add more payloads to ``sentry-kafka-schemas`` than to write
   custom tests.

2. Removing the Python message processor and replacing it with
   ``RustCompatProcessor``, in which case the existing Python tests (of both
   Snuba and Sentry) will be run directly against the Rust message processor.
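
The registration in step 1 is typically just adding the class to a
parametrized list. The file layout below is a guess at the shape of that test
module; the names and the import location are assumptions:

.. code-block:: python

    # tests/consumers/test_message_processors.py -- illustrative shape only.
    import pytest

    from snuba.dataset.processors import QuerylogProcessor  # import path assumed

    PROCESSORS = [
        QuerylogProcessor,
        # Add the newly ported processor class here.
    ]

    @pytest.mark.parametrize("processor_cls", PROCESSORS)
    def test_python_and_rust_agree(processor_cls):
        # The real test runs every example payload from sentry-kafka-schemas
        # through both the Python and the Rust implementation and asserts that
        # the resulting rows are identical.
        ...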

Architecture
------------

Python
~~~~~~

In order to get around the GIL, Python consumers use arroyo's multiprocessing
support so that they can use multiple cores. This comes with significant
serialization overhead and an amount of complexity that is out of scope for
this document.

Pure Rust
~~~~~~~~~

Despite the name, this consumer is still launched from the Python CLI. The way
this works is that all Rust code is compiled into a shared library exposing a
``consumer()`` function, and packaged using ``maturin`` into a Python wheel.
The Python CLI for ``snuba rust-consumer``:

1. Parses CLI arguments
2. Resolves config (loads storages, ClickHouse settings)
3. Builds a new config JSON payload containing only information relevant to the
   Rust consumer (name of message processor, name of physical Kafka topic, name
   of ClickHouse table, and connection settings)
4. Calls ``rust_snuba.consumer(config)``, at which point Rust takes over the
   process entirely.
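
A stripped-down sketch of that handoff is shown below. The exact keys of the
config payload and whether ``rust_snuba.consumer`` takes a JSON string are
assumptions; the point is that Python only prepares a blob of configuration
and then hands control to the compiled extension:

.. code-block:: python

    import json

    import rust_snuba  # the maturin-built wheel exposing the Rust consumer

    def run_rust_consumer() -> None:
        # Keys below are illustrative; the real payload is built from the
        # resolved storage configuration.
        config = {
            "message_processor": "QuerylogProcessor",
            "kafka_topic": "snuba-queries",
            "clickhouse_table": "querylog_local",
            "clickhouse_host": "127.0.0.1",
            "clickhouse_port": 9000,
        }
        # From here on, Rust owns the process (threads, signals, shutdown).
        rust_snuba.consumer(json.dumps(config))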

The concurrency model in pure Rust is very simple: the message processors run
on a ``tokio::Runtime``, which means that we're using regular OS threads in
order to use multiple cores. The GIL is irrelevant since no Python code runs.

Hybrid
~~~~~~

The hybrid consumer is mostly the same as pure Rust. The main difference is
that it calls back into Python message processors. How that works is
work-in-progress, but fundamentally it is subject to the same concurrency
problems as the regular pure-Python consumer, and is therefore forced to spawn
subprocesses and perform IPC one way or another.

Since the consumer is launched from the Python CLI, it will find the Python
interpreter already initialized, and does not have to re-import Snuba
(except in subprocesses).

Signal handling is a bit tricky. Since no Python code runs for the majority of
the consumer's lifetime, Python's signal handlers cannot run. This also means
that the Rust consumer has to register its own handler for ``Ctrl-C``, but
doing so also means that Python's own signal handlers are completely ignored.
This is fine for the pure-Rust case, but in the hybrid case we have some Python
code still running. For that Python code, ``KeyboardInterrupt`` does
not work.
================
Snuba Data Model
================

This section explains how data is organized in Snuba and how user-facing
data is mapped to the underlying database (ClickHouse in this case).

The Snuba data model is divided horizontally into a **logical model** and
a **physical model**. The logical data model is what is visible to the Snuba
clients through the Snuba query language. Elements in this model may or may
not map 1:1 to tables in the database. The physical model, instead, maps 1:1
to database concepts (like tables and views).

The reasoning behind this division is that it allows Snuba to expose a
stable interface through the logical data model and perform complex mapping
internally to execute a query on different tables (part of the physical
model) to improve performance in a way that is transparent to the client.

The rest of this section outlines the concepts that compose the two models
and how they are connected to each other.

The main concepts, described below, are datasets, entities, and storages.

.. image:: /_static/architecture/datamodel.png

Datasets
========

A Dataset is a namespace over Snuba data. It provides its own schema and
is independent from other datasets in terms of both the logical model and
the physical model.

Examples of datasets are discover, outcomes, and sessions. There is no
relationship between them.

A Dataset can be seen as a container for the components that define its
abstract data model and its concrete data model, which are described below.

In terms of the query language, every Snuba query targets one and only one
Dataset, and the Dataset can provide extensions to the query language.

Entities and Entity Types
=========================

The fundamental block of the logical data model Snuba exposes to the client
is the Entity. In the logical model an entity represents an instance of an
abstract concept (like a transaction or an error). In practice an *Entity*
corresponds to a row in a table in the database. The *Entity Type* is the
class of the Entity (like Error**s** or Transaction**s**).

The logical data model is composed of a set of *Entity Types* and of their
relationships.

Each *Entity Type* has a schema which is defined by a list of fields with
their associated abstract data types. The schemas of all the *Entity Types*
of a Dataset (there can be several) compose the logical data model that is
visible to the Snuba client and against which Snuba Queries are validated.
No lower-level concept is supposed to be exposed.

Entity Types belong unequivocally to a single Dataset. An Entity Type cannot
be present in multiple Datasets.

Relationships between Entity Types
----------------------------------

Entity Types in a Dataset are logically related. There are two types of
relationships we support:

- Entity Set Relationship. This mimics foreign keys. This relationship is
  meant to allow joins between Entity Types. It only supports one-to-one
  and one-to-many relationships at this point in time.
- Inheritance Relationship. This mimics nominal subtyping. A group of Entity
  Types can share a parent Entity Type. Subtypes inherit the schema from the
  parent type. Semantically the parent Entity Type must represent the union
  of all the Entities whose types inherit from it. It also must be possible
  to query the parent Entity Type. This cannot be just a logical relationship.

Entity Type and consistency
---------------------------

The Entity Type is the largest unit where Snuba **can** provide some strong
data consistency guarantees. Specifically it is possible to query an Entity
Type expecting Serializable Consistency (please don't use that. Seriously,
if you think you need that, you probably don't). This does not extend to
any query that spans multiple Entity Types where, at best, we will have
eventual consistency.

This also has an impact on Subscription queries. These can only work on one
Entity Type at a time since, otherwise, they would require consistency between
Entity Types, which we do not support.

.. ATTENTION::
    To be precise, the unit of consistency (depending on the Entity Type)
    can be even smaller and depend on how the data ingestion topics
    are partitioned (by project_id, for example); the Entity Type is the
    maximum Snuba allows. More details are (ok, will be) provided in
    the Ingestion section of this guide.

Storage
=======

Storages represent and define the physical data model of a Dataset. Each
Storage is materialized in a physical database concept like a table
or a materialized view. As a consequence, each Storage has a schema defined
by fields with their types that reflects the physical schema of the DB
table/view the Storage maps to, and it is able to provide all the details
needed to generate DDL statements to build the tables on the database.

Storages are able to map the logical concepts in the logical model discussed
above to the physical concepts of the database, thus each Storage needs to be
related to an Entity Type. Specifically:

- Each Entity Type must be backed by at least one Readable Storage (a Storage
  we can run queries on), but can be backed by multiple Storages (for example
  a pre-aggregate materialized view). Multiple Storages per Entity Type are
  meant to allow query optimizations.
- Each Entity Type must be backed by one and only one Writable
  Storage that is used to ingest data and fill in the database tables.
- Each Storage backs exactly one Entity Type. These rules are made concrete
  in the sketch below.
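
The following is not Snuba's actual API; it is only a small, self-invented
illustration of the constraints above, showing one Entity Type with one
writable Storage and an additional read-only aggregate Storage (the table
names are made up):

.. code-block:: python

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class Storage:
        table: str              # physical ClickHouse table or view
        writable: bool = False

    @dataclass
    class EntityType:
        name: str
        readable_storages: List[Storage] = field(default_factory=list)
        writable_storage: Optional[Storage] = None  # exactly one per Entity Type

    raw = Storage(table="outcomes_raw_local", writable=True)
    hourly = Storage(table="outcomes_hourly_mv")    # pre-aggregated, read-only

    outcomes = EntityType(
        name="outcomes",
        readable_storages=[raw, hourly],  # at least one Readable Storage
        writable_storage=raw,             # the one and only Writable Storage
    )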

Examples
========

This section provides some examples of how the Snuba data model can represent
some real-world models.

These case studies do not necessarily reflect the current Sentry production
model, nor are they part of the same deployment. They should be considered
examples taken in isolation.

Single Entity Dataset
---------------------

This looks like the Outcomes dataset used by Sentry. This actually does not
reflect Outcomes as of April 2020. It is, though, the design Outcomes should
move towards.

.. image:: /_static/architecture/singleentity.png

This Dataset has only one Entity Type, which represents an individual Outcome
ingested by the Dataset. Querying raw Outcomes is painfully slow, so we have
two Storages. One is the Raw storage that reflects the data we ingest; the
other is a materialized view that computes hourly aggregations that are much
more efficient to query. The Query Planner would pick the storage depending
on whether the query can be executed on the aggregated data or not.

Multi Entity Type Dataset
-------------------------

The canonical example of this Dataset is the Discover dataset.

.. image:: /_static/architecture/multientity.png

This has three Entity Types: Errors, Transactions, and Events, where the
first two both inherit from Events. These form the logical data model, thus
querying the Events Entity Type gives the union of Transactions and Errors,
but it only allows the fields common to the two to be present in the query.

The Errors Entity Type is backed by two Storages for performance reasons.
One is the main Errors Storage that is used to ingest data; the other is a
read-only view that puts less load on ClickHouse when querying but
offers lower consistency guarantees. Transactions only have one Storage,
and there is a Merge Table to serve Events (which is essentially a view over
the union of the two tables).

Joining Entity Types
--------------------

This is a simple example of a dataset that includes multiple Entity Types
that can be joined together in a query.

.. image:: /_static/architecture/joins.png

GroupedMessage and GroupAssignee can be part of a left join query with Errors.
The rest is similar to what was discussed in the previous examples.