Releases: reddit/baseplate.py
v0.30.2
v0.30.1
This is a bugfix release with two changes:
- Properly do a graceful shutdown when the infrastructure informs us we are about to be shut down. If requests finish up fast enough, they won't be cut off in flight.
- Limit the size of the internal queue in the Kombu QueueConsumer. Previously, if the remote queue had a lot of messages this internal queue would fill up without bound and potentially consume excessive memory.
v0.30.0
Important News
This will be the last version of Baseplate.py to support Python 3.5 and lower (including 2.7). We will continue to publish bugfixes for this release going forward to support services stuck on old Pythons, but all new development will expect Python 3.6 or newer.
New Features
Ratelimit tools
Baseplate now has helpers for maintaining ratelimit counters in Memcached or Redis. You can use this to correctly apply ratelimits to actions in your application.
See the docs for more information.
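The underlying technique is a fixed-window counter: bump a per-key counter for the current time window and reject once it exceeds the allowance. A minimal in-memory sketch of the idea, not Baseplate's actual helper API; in production the counters live in Memcached or Redis with a TTL so they expire with the window:

```python
import time


class FixedWindowRatelimiter:
    """Fixed-window ratelimit counter, kept in a dict for illustration.
    A real backend would use an atomic INCR plus a TTL instead."""

    def __init__(self, allowance, interval_seconds):
        self.allowance = allowance
        self.interval = interval_seconds
        self.counters = {}  # (key, window number) -> count

    def consume(self, key, amount=1):
        """Record an action for `key`; return False once over the limit."""
        window = int(time.time() // self.interval)
        count = self.counters.get((key, window), 0) + amount
        if count > self.allowance:
            return False
        self.counters[(key, window)] = count
        return True


limiter = FixedWindowRatelimiter(allowance=3, interval_seconds=60)
results = [limiter.consume("user:1234") for _ in range(5)]
```

With an allowance of 3 per minute, the first three calls succeed and the rest are rejected until the window rolls over.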
More runtime metrics
The previous release (v0.29) added a new system of per-process server metrics. This release adds two more things to watch: garbage collector stats and event loop blocker monitoring. These metrics will help you understand if your application is stalling in ways that would cause weird p99 spikes across many requests.
See the docs for more information.
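For a sense of what garbage collector stats look like, Python's stdlib exposes per-generation counters that a metrics reporter can flatten into gauges. This only sketches gathering the numbers; the metric names here are made up, not the ones Baseplate emits:

```python
import gc


def gather_gc_metrics():
    """Flatten the collector's per-generation stats into gauge-style
    names (e.g. "gc.gen0.collections"). Names are illustrative only."""
    metrics = {}
    for generation, stats in enumerate(gc.get_stats()):
        for name, value in stats.items():
            metrics["gc.gen%d.%s" % (generation, name)] = value
    return metrics


metrics = gather_gc_metrics()
```

Sampling these counters periodically lets you graph collection frequency per generation and spot GC pressure behind p99 spikes.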
Credential secrets
Services often need to securely store username/password pairs. Baseplate now has a convention for doing so called a credential secret. In addition, the sqlalchemy integration now uses this new credential type and you can expect other integrations to do so in the future.
See the secrets store docs for more information on credential secrets and the sqlalchemy integration docs for how to use that with SQL databases.
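Conceptually, a credential secret is just a username/password pair stored as a single secret. A hypothetical parser for a `username:password`-style value, purely for illustration; the real secrets store has its own storage format and accessor:

```python
from collections import namedtuple

# Illustrative shape for a parsed credential secret.
CredentialSecret = namedtuple("CredentialSecret", ["username", "password"])


def parse_credential(raw):
    """Split a "username:password" value into its parts.
    (Hypothetical format; use the secrets store's accessor in practice.)"""
    username, sep, password = raw.partition(":")
    if not sep:
        raise ValueError("expected a 'username:password' pair")
    return CredentialSecret(username, password)


cred = parse_credential("svc_account:hunter2")
```

An integration like sqlalchemy can then build its connection URL from `cred.username` and `cred.password` without the service handling raw secret bytes.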
Changes
- The `FileWatcher` can now pass various options (encoding, binary mode, etc.) through to the `open` call when it loads the watched file.
- The configuration prefix used by `secrets_store_from_config` and `experiments_client_from_config` is now configurable.
- Potentially breaking: an accidental `logger` attribute has been removed from the context object.
- Timers can now be sent manually without start/stop.
- When a metrics batch is too large, we also log the counters found in that batch. This error generally indicates that the service is doing far too much in a single request and the counters can help figure out what operation is being repeated many times.
- Experiments can now use more targeting operators like gt, lt, etc. and specify ranges of values they apply to.
- Trace publishers now send larger batches to reduce load on Zipkin.
- `baseplate-tshell` now supports IPython 5+.
- `baseplate-tshell` now activates `readline` for proper text editing.
- Development has moved into the OneVM and the tooling has been modernized.
Bug Fixes
- A race condition in the Cassandra integration was fixed. You should no longer get "timer already stopped" errors from the Cassandra timers.
- A regression in parsing the `Sampled` header on upstream spans in Pyramid services was fixed.
- Thrift header names are now case-insensitive. This allows them to transit systems like Envoy that canonicalize the names to lower case.
- The message queue helper properly prints messages without `b""` artifacts.
- An exception in the experiments framework is now caught and safely turned into a mismatch.
v0.29
Important
Apache Thrift
This is a major, breaking change. The Thrift implementation has been changed from an ancient version of FBThrift to modern Apache Thrift. This comes with all the same bells and whistles (`THeaderProtocol`) and should generally offer comparable performance, but allows us to keep up with upstream much more effectively and gives us more flexibility overall. See the "Upgrading" section below for details of how to upgrade.
New Features
Runtime metrics
This is the first release of Baseplate that automatically sends per-process server metrics outside of those your application sends. The first metric to send is a gauge indicating the number of active requests being handled in each process. More server/runtime metrics will be coming very soon.
Pyramid CSRF Policy
A CSRF implementation for Pyramid suitable for use in an intranet environment is now included in Baseplate.
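A typical intranet-grade CSRF scheme derives a token from the session with an HMAC and verifies submissions in constant time. This sketch shows that general technique only; it is not Baseplate's policy class or its Pyramid wiring, and the signing key here is a stand-in for one from your secrets store:

```python
import hashlib
import hmac

# Hypothetical signing key; in practice this comes from the secrets store.
SIGNING_KEY = b"intranet-signing-key"


def make_csrf_token(session_id):
    """Derive a CSRF token bound to the session via HMAC-SHA256."""
    return hmac.new(SIGNING_KEY, session_id.encode(), hashlib.sha256).hexdigest()


def check_csrf_token(session_id, submitted_token):
    """Verify a submitted token; compare_digest avoids timing leaks."""
    return hmac.compare_digest(make_csrf_token(session_id), submitted_token)


token = make_csrf_token("session-abc")
```

The derived-token approach means no per-session token storage is needed; the server can recompute and compare on every request.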
Changes
- BREAKING CHANGE: The `max_concurrency` setting for servers is now mandatory. It is very important to configure this to something meaningful for your application based on the IO vs. CPU usage of requests.
- A client or server raising Thrift exceptions defined in the IDL will no longer count those as failed RPCs for metrics purposes.
- Thrift RPC failures now include service and method names in the error log.
- Setting up thrift clients on the request context now takes far less work. This performs significantly better on services with many clients.
- The secrets store will only re-check the file on disk for modifications once per request. This performs significantly better on services that use many secrets per request.
- When `baseplate-serve` binds its own socket, `SO_REUSEPORT` is now used to improve the balance of load across processes on the same host. This has no effect when running under an Einhorn configured to bind the socket (`--bind`).
- The method by which Pyramid servers determine if they'll accept trace headers from a client can now be controlled by the application.
- The secret fetcher sidecar now supports Vault 0.9+ authentication.
- FileWatcher, SecretsStore, and experiments config all have a timeout parameter that controls how long they will block waiting for the underlying file to become available during initial startup.
- The experiments framework no longer creates a local span each time bucketing happens.
- When Vault authentication fails, the error message gives some advice about how to resolve the situation.
- Event publishers now use exponential backoff when publishing to the event collectors.
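Since `max_concurrency` is now mandatory, every server section needs a value. A server block in the ini file might look like the following; the factory path and number here are illustrative, so size the limit to your own workload's IO vs. CPU profile:

```ini
[server:main]
factory = baseplate.server.thrift
# illustrative value: raise for IO-bound services, lower for CPU-bound ones
max_concurrency = 100
```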
Bug fixes
- The live data sidecar will now wait a while for the secrets store to be available rather than crashing immediately if it happens to come up first.
- Traces missing the `Sampled` header will no longer be thrown out. This was always meant to be optional.
Upgrading
Apache Thrift
Apache Thrift does not support the event handler interface that older versions of FBThrift did. Baseplate's integration with the Thrift server is different as a result. See below for an example diff covering what needs to change.
```diff
--- a/reddit_service_activity/__init__.py
+++ b/reddit_service_activity/__init__.py
@@ -14,7 +14,7 @@
     tracing_client_from_config,
 )
 from baseplate.context.redis import RedisContextFactory
-from baseplate.integration.thrift import BaseplateProcessorEventHandler
+from baseplate.integration.thrift import baseplateify_processor

 from .activity_thrift import ActivityService, ttypes
 from .counter import ActivityCounter
@@ -48,7 +48,7 @@ def from_json(cls, value):
     )

-class Handler(ActivityService.ContextIface):
+class Handler(ActivityService.Iface):
     def __init__(self, counter):
         self.counter = counter
@@ -142,8 +142,5 @@ def make_processor(app_config):  # pragma: nocover
     counter = ActivityCounter(cfg.activity.window.total_seconds())
     handler = Handler(counter=counter)
-    processor = ActivityService.ContextProcessor(handler)
-    event_handler = BaseplateProcessorEventHandler(logger, baseplate)
-    processor.setEventHandler(event_handler)
-
-    return processor
+    processor = ActivityService.Processor(handler)
+    return baseplateify_processor(processor, logger, baseplate)
diff --git a/requirements.txt b/requirements.txt
index 5f0ab24..bd3baf9 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,5 +1,5 @@
-baseplate==0.28.0
+baseplate==0.29.0
@@ -18,5 +18,5 @@ raven==5.27.0
-Thrift==0.1
+thrift==0.12.1
```
Additionally, the Apache Thrift compiler is called `thrift` rather than `thrift1`. If you're using the compiler directly you'll need to update this. Baseplate's built-in thriftfile compilation steps handle it automatically.
This new updated compiler has a few differences which your thrift IDL and application code will need to take into account:
- The `float` type in FBThrift isn't available in Apache Thrift; only the larger `double` type is. Unfortunately, this is a breaking change on the wire as the two types have quite different byte representations due to their different sizes. For an actively used field, you can add a new `double`-typed field and have your application populate or read both the float and double fields. Once all clients are using the new field you can drop the old one and then move to the new Baseplate.
- Optional arguments to RPC methods no longer get a default `=None` in the generated code. Clients will need to ensure they're passing values for all parameters.
- A list of keywords from various languages (e.g. `next`) is now blacklisted for use as field names in Thrift. If you have any fields with names like this, the new compiler will balk. Thankfully this is a purely code-side change with no effect on the wire, so you can update your code without worrying about clients.
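The float-to-double migration described above might look like this in the IDL, with hypothetical field names; the intermediate struct carries both fields and only compiles under FBThrift:

```thrift
struct LinkScore {
  // old FBThrift-only field; keep while any client still reads it
  1: optional float hotness
  // Apache-Thrift-compatible replacement, populated alongside the old field
  2: optional double hotness_d
}
```

Once all clients read `hotness_d`, drop the `float` field and switch to the new compiler.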
v0.28
Changes
- Accept B3- prefix for tracing headers to allow support for other tracing clients.
- Report crashes that happen outside server spans to Sentry. This includes stuff that happens during request parsing before the application receives the request.
- More additions to experiment framework.
- Targeting overrides
- SimpleExperiment targeting
- Include span information in experiment events
Bug fixes
- Fix creation of context attributes on local spans
- Fix local span support in sqlalchemy context client
v0.27.0
v0.26.0
New Features
Queue consumer support with Kombu
Baseplate now has first class support for consuming messages from queue brokers like RabbitMQ using Kombu. The full trace and diagnostic framework works here.
```python
from kombu import Connection, Exchange

from baseplate import queue_consumer


def process_links(context, msg_body, msg):
    print('processing %s' % msg_body)


queue_consumer.consume(
    baseplate=make_baseplate(cfg, app_config),
    exchange=Exchange('reddit_exchange', 'direct'),
    connection=Connection(
        hostname='amqp://guest:guest@reddit.local:5672',
        virtual_host='/',
    ),
    queue_name='process_links_q',
    routing_keys=[
        'link_created',
        'link_deleted',
        'link_updated',
    ],
    handler=process_links,
)
```
See the documentation for more details.
Changes
- The memcached instrumentation now adds details about each call to span tags. This includes key names, key counts, and other settings.
- When preparing CQL statements with the Cassandra integration, Baseplate will now cache the prepared statement for you. This means you can safely call `prepare()` every time.
- The secret fetcher daemon can now be run in a single-shot mode where it exits immediately after fetching secrets. This can be used for situations like cron jobs in Kubernetes.
- When installing as a wheel, the baseplate CLI scripts no longer have a Python version suffix: `baseplate-serve2` -> `baseplate-serve`.
- The Zipkin tracing observer can now ship spans to a sidecar span publisher daemon rather than sending from within the application itself.
- There are now new methods to check experiment names are valid and to get lists of all active experiments.
- Experiments now send exposure events.
Bug Fixes
- Fix a case where connection failures in the thrift connection pool implementation would cause the pool to lose connection slots and eventually be depleted.
- Fix an issue where bucketing was uneven for r2 experiments with low bucketing and three total treatments.
v0.25.0
New Features
baseplate-tshell
You can now fire up a REPL shell in a Thrift service's context for debugging. This is patterned off of `pshell` from Pyramid and supports IPython's REPL if installed.
```
$ baseplate-tshell2 example.ini
Python 2.7.6 (default, Nov 23 2017, 15:49:48)
Type "copyright", "credits" or "license" for more information.

IPython 2.2.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

Available Objects:

  app          This project's app instance
  context      The context for this shell instance's span

In [1]: context.redis.ping()
Out[1]: True
```
Changes
- Multiple increments of the same statsd counter in the same batch are now coalesced together before serialization to reduce metric datagram size. This reduces bandwidth usage for most non-trivial applications since span success/failure counters happen often.
- Exceptions raised from the OS during MessageQueue instantiation now come with hints of how to resolve the issue.
- Event logging in the experiments framework is now extensible and defaults to just logging at DEBUG level.
- Success/failure counters (added in v0.20) are now sent for server spans as well as client spans.
- The `EdgeRequestContext` now has a `service` property for use when the request's authentication token identifies an (internal) service rather than a user as the principal. The object has a single `name` property to get the authenticated service's name. This can be used to whitelist services for specific multi-user access to data.
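The counter coalescing mentioned above amounts to summing repeated `name:value|c` statsd datagrams before serialization, so each counter appears once per batch. A rough sketch of the technique (not baseplate's actual batching code):

```python
from collections import Counter


def coalesce_counters(datagrams):
    """Merge repeated statsd counter increments ("name:value|c") into one
    datagram per counter; pass other metric types through unchanged."""
    totals = Counter()
    passthrough = []
    for line in datagrams:
        name, _, rest = line.partition(":")
        value, _, kind = rest.partition("|")
        if kind == "c":
            totals[name] += int(value)
        else:
            passthrough.append(line)
    coalesced = ["%s:%d|c" % (name, total) for name, total in sorted(totals.items())]
    return coalesced + passthrough


batch = ["spans.ok:1|c", "rpc.time:12|ms", "spans.ok:1|c", "spans.ok:1|c"]
merged = coalesce_counters(batch)
```

Here the three `spans.ok` increments collapse into a single `spans.ok:3|c` line, shrinking the datagram.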
Bug Fixes
- Fix a crash in the `MetricsBaseplateObserver` when a metrics batch gets too big for IP fragmentation. Just log a warning and swallow the error now.
- Don't put connections back in the thrift connection pool after unexpected low-level errors from the remote application. This was a cause of at least some "TTransportException: read 0 bytes" errors, since the connection was broken but had been returned to the pool for the next request to fail on.
- Fix `baseplate-healthcheck3` against UNIX domain socket endpoints. There was a Python 3 incompatibility.
- Include Makefile in setuptools MANIFEST.
Upgrading
Event logging
Remove the `event_queue` parameter from your `ExperimentsContextFactory` object and add an appropriately constructed `EventLogger` instead.
v0.24.1
v0.24.0
New Features
EdgeRequestContext/AuthenticationToken unification
This isn't a new addition, but a breaking rework of authentication context in Baseplate. Authentication token propagation and access is now fully integrated into the edge request context. Authentication tokens are propagated inside the edge context header and the API for applications built on Baseplate is unified. See below for details on how to use this.
For context, here are the original release notes on the authentication system:
Authentication tokens provided by the authentication service can now be automatically propagated between services when making Thrift calls. This allows internal services to securely and accurately understand on whose behalf a given request is being made so they can decide if the requester is authorized for a particular action. The context is passed implicitly, in request headers, so no extra parameters need be added to service IDLs. Baseplate provides APIs for validating and accessing the tokens from within request context and will automatically pass upstream credentials to downstream services without extra work.
This should now be stable and ready for wide use.
Kubernetes authentication backend for Vault secrets fetcher
Baseplate's secret fetcher daemon can now authenticate to Vault using Kubernetes as its proof of identity. This allows Baseplate's Vault sidecar to be used inside Kubernetes pods.
Histogram metrics
You can now use Baseplate's metrics API to send arbitrary integers to StatsD to be collected into Histograms. This is useful for monitoring the distribution of a metric over a time interval.
thrift_pool_from_config
A new helper for creating `ThriftConnectionPool` objects from configuration without the boilerplate. See "Upgrading" below for details of how to switch over.
Changes
- Baseplate now emits a new `ServerSpanInitialized` event in Pyramid applications after it is done initializing a request's span context but before the request handler starts. This allows you to hook into the request lifecycle while having access to all the context attributes you have registered with Baseplate.
- Baseplate CLI tools now use `/usr/bin/env python` for compatibility with environments like virtualenv.
- The setup script (setup.py) now triggers the thrift build process. This makes `python setup.py sdist` and descendants work cleanly.
Bug Fixes
- Don't retry publishing forever when v2 event fails to validate.
- Use correct path for authentication token public key in Vault.
Upgrading
EdgeRequestContext
Most of the workings of this system are under the hood in Baseplate, but you'll need to configure it at application startup. The way this looks is a little different depending on where in the call graph your service sits.
For "landlocked" services inside the cluster that don't interact directly with clients, we assume that upstream services will propagate edge request context (including authentication) to us and all we need to do is verify this. Create an EdgeRequestContextFactory
passing in the secrets store for your application and then pass that along to your application's Baseplate integration. For an example Thrift service:
```diff
--- a
+++ b/
@@ -90,6 +90,7
+from baseplate.core import EdgeRequestContextFactory
 from baseplate.secrets import secrets_store_from_config
--- a
+++ b
@@ -115,9 +115,15 @@ def make_wsgi_app(app_config):
+    edge_context_factory = EdgeRequestContextFactory(secrets)
+
     handler = Handler()
     processor = {{ cookiecutter.service_name }}.ContextProcessor(handler)
-    event_handler = BaseplateProcessorEventHandler(logger, baseplate)
+    event_handler = BaseplateProcessorEventHandler(
+        logger,
+        baseplate,
+        edge_context_factory=edge_context_factory,
+    )
     processor.setEventHandler(event_handler)
     return processor
```
Now you can use `context.request_context` (an `EdgeRequestContext`) and all of its properties to access the context provided by upstream edge services.

For services at the edge, that is, ones that interact with external clients directly, we need to collect the right information from the external request and put it into the context to be propagated to downstream services. Create an `EdgeRequestContextFactory` and use it to make fresh context on each request. For a Pyramid service, the new `ServerSpanInitialized` event is particularly useful here because we can use any services managed through Baseplate in the event handler.
```diff
--- a
+++ b/
@@ -90,6 +90,7
+from baseplate.core import EdgeRequestContextFactory
+from baseplate.integration.pyramid import ServerSpanInitialized
 from baseplate.secrets import secrets_store_from_config
--- a
+++ b
@@ -115,9 +115,15 @@ def make_wsgi_app(app_config):
+    edge_context_factory = EdgeRequestContextFactory(secrets)
+
+    def add_edge_context(event):
+        bearer_token = event.request.headers[...]
+        authn_response = event.request.authn_service.authenticate_oauth2_bearer_token(bearer_token)
+        edge_context = edge_context_factory.new(
+            ... fill in details from the external request ...
+        )
+        edge_context.attach_context(event.request)
+
+    configurator.add_subscriber(add_edge_context, ServerSpanInitialized)
```
All of this context will be available in your application immediately and will also show up automatically in downstream services.
Both types of services will need access to the authentication service's public key to be able to validate authentication tokens. This is as simple as adding `secret/authentication/public-key` to the list of secrets managed by your service's secret fetcher daemon.
See the documentation for more details on all this.
thrift_pool_from_config
If your application was doing its own configuration parsing to build a Thrift connection pool, you can now use this function instead.
Before:
```python
cfg = config.parse_config(app_config, {
    "example_service": {
        "endpoint": config.Endpoint,
        "timeout": config.Timespan,
    },
})

...

example_pool = ThriftConnectionPool(
    cfg.example_service.endpoint,
    timeout=cfg.example_service.timeout.total_seconds(),
)

...
```
After:
```python
example_pool = thrift_pool_from_config(app_config, prefix="example_service.")
```