Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(astro): Add distributed tracing via <meta> tags #9483

Merged
merged 4 commits into from
Nov 9, 2023
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
79 changes: 79 additions & 0 deletions packages/astro/src/server/meta.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
import type { Hub } from '@sentry/core';
import { getDynamicSamplingContextFromClient } from '@sentry/core';
import type { Span } from '@sentry/types';
import {
dynamicSamplingContextToSentryBaggageHeader,
generateSentryTraceHeader,
logger,
TRACEPARENT_REGEXP,
} from '@sentry/utils';

/**
* Extracts the tracing data from the current span or from the client's scope
* (via transaction or propagation context) and renders the data to <meta> tags.
*
* This function creates two serialized <meta> tags:
* - `<meta name="sentry-trace" content="..."/>`
* - `<meta name="baggage" content="..."/>`
*
* TODO: Extract this later on and export it from the Core or Node SDK
*
* @param span the currently active span
* @param client the SDK's client
*
* @returns an object with the two serialized <meta> tags
*/
export function getTracingMetaTags(span: Span | undefined, hub: Hub): { sentryTrace: string; baggage?: string } {
const scope = hub.getScope();
const client = hub.getClient();
const { dsc, sampled, traceId } = scope.getPropagationContext();
const transaction = span?.transaction;

const sentryTrace = span ? span.toTraceparent() : generateSentryTraceHeader(traceId, undefined, sampled);

const dynamicSamplingContext = transaction
? transaction.getDynamicSamplingContext()
: dsc
? dsc
: client
? getDynamicSamplingContextFromClient(traceId, client, scope)
: undefined;
Lms24 marked this conversation as resolved.
Show resolved Hide resolved

const baggage = dynamicSamplingContextToSentryBaggageHeader(dynamicSamplingContext);

const isValidSentryTraceHeader = TRACEPARENT_REGEXP.test(sentryTrace);
if (!isValidSentryTraceHeader) {
logger.warn('Invalid sentry-trace data. Returning empty <meta name="sentry-trace"/> tag');
}

const validBaggage = isValidBaggageString(baggage);
if (!validBaggage) {
logger.warn('Invalid baggage data. Returning empty <meta name="baggage"/> tag');
}

return {
sentryTrace: `<meta name="sentry-trace" content="${isValidSentryTraceHeader ? sentryTrace : ''}"/>`,
baggage: baggage && `<meta name="baggage" content="${validBaggage ? baggage : ''}"/>`,
};
}

/**
* Tests string against baggage spec as defined in:
*
* - W3C Baggage grammar: https://www.w3.org/TR/baggage/#definition
* - RFC7230 token definition: https://datatracker.ietf.org/doc/html/rfc7230#section-3.2.6
*
* exported for testing
*/
export function isValidBaggageString(baggage?: string): boolean {
if (!baggage || !baggage.length) {
return false;
}
const keyRegex = "[-!#$%&'*+.^_`|~A-Za-z0-9]+";
const valueRegex = '[!#-+-./0-9:<=>?@A-Z\\[\\]a-z{-}]+';
const spaces = '\\s*';
const baggageRegex = new RegExp(
`^${keyRegex}${spaces}=${spaces}${valueRegex}(${spaces},${spaces}${keyRegex}${spaces}=${spaces}${valueRegex})*$`,
);
return baggageRegex.test(baggage);
}
31 changes: 26 additions & 5 deletions packages/astro/src/server/middleware.ts
Original file line number Diff line number Diff line change
@@ -1,7 +1,9 @@
import { captureException, configureScope, startSpan } from '@sentry/node';
import { captureException, configureScope, getCurrentHub, startSpan } from '@sentry/node';
import { addExceptionMechanism, objectify, stripUrlQueryAndFragment, tracingContextFromHeaders } from '@sentry/utils';
import type { APIContext, MiddlewareResponseHandler } from 'astro';

import { getTracingMetaTags } from './meta';

type MiddlewareOptions = {
/**
* If true, the client IP will be attached to the event by calling `setUser`.
Expand Down Expand Up @@ -93,11 +95,30 @@ export const handleRequest: (options?: MiddlewareOptions) => MiddlewareResponseH
},
},
async span => {
const res = await next();
if (span && res.status) {
span.setHttpStatus(res.status);
const originalResponse = await next();
if (span && originalResponse.status) {
span.setHttpStatus(originalResponse.status);
}

const hub = getCurrentHub();
const client = hub.getClient();
const contentType = originalResponse.headers.get('content-type');

const isPageloadRequest = contentType && contentType.startsWith('text/html');
if (!isPageloadRequest || !client) {
return originalResponse;
}
return res;

const html = await originalResponse.text();
if (typeof html !== 'string' || !html.includes('<head>')) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m: Honestly I am a little worried here for multiple reasons:

  • I don't understand the Request/Response APIs enough yet but my worry is that this will consume the repsonse stream and somehow make it unavailable for anything that tries to access it.
  • I believe with our approach here we will definitely prevent streamed responses from being streamed and just completely block right? This could hurt TTFB.
  • I don't know how to feel about the heuristic to find the head tag here 😂. Using an HTML parser to properly find it is probably overkill?
  • Let's make absolutely sure this doesn't open any XSS vectors. Maybe we run it through a regexp before writing to the HTML.

Copy link
Member Author

@Lms24 Lms24 Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the Request/Response APIs enough yet but my worry is that this will consume the repsonse stream and somehow make it unavailable for anything that tries to access it.
I believe with our approach here we will definitely prevent streamed responses from being streamed and just completely block right? This could hurt TTFB.

It's a good point. Astro uses streaming but they also show a similar example for modifying HTML in their middleware docs. I'm gonna ask the Astro folks if there are performance concerns with doing this.

I don't know how to feel about the heuristic to find the head tag here 😂. Using an HTML parser to properly find it is probably overkill?

Agreed, it's super basic but we use the identical approach in SvelteKit and so far it worked decently well. Using a parser is for sure a performance concern. My pragmatic take would be to go with this and improve the lookup logic once we discover problems here.

Let's make absolutely sure this doesn't open any XSS vectors.

Generally, I agree but I'm not entirely sure about the XSS vector. I guess for something malicious to end up in here, the SDK's options (release, environment) or the transaction data (name, ids) would need to be somehow modified beforehand. Do you see any obvious ways how this could happen?

Maybe we run it through a regexp before writing to the HTML.

I think this would work decently well for sentry-trace content but baggage is a little more arbitrary. Lemme try to come up with something.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Astro uses streaming but they also show a similar example for modifying HTML

Alright if Astro is recommending this it's probably fine.

My pragmatic take would be to go with this and improve the lookup logic once we discover problems here.

Sounds good to me.

Do you see any obvious ways how this could happen?

No obvious ways, but if we can come up with a simple way to ensure the string cant be escaped it would be great.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lforst I added a baggage regex check in 272cbbd

for sentry-trace, we already had a regex

We'll only serialize the baggage/sentry-trace content if the regexes match.

return originalResponse;
}

const { sentryTrace, baggage } = getTracingMetaTags(span, hub);
const content = `<head>\n${sentryTrace}\n${baggage}\n`;
const modifiedHtml = html.replace('<head>', content);

return new Response(modifiedHtml, originalResponse);
},
);
return res;
Expand Down
178 changes: 178 additions & 0 deletions packages/astro/test/server/meta.test.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,178 @@
import * as SentryCore from '@sentry/core';
import { vi } from 'vitest';

import { getTracingMetaTags, isValidBaggageString } from '../../src/server/meta';

const mockedSpan = {
toTraceparent: () => '12345678901234567890123456789012-1234567890123456-1',
transaction: {
getDynamicSamplingContext: () => ({
environment: 'production',
}),
},
};

const mockedHub = {
getScope: () => ({
getPropagationContext: () => ({
traceId: '123',
}),
}),
getClient: () => ({}),
};

describe('getTracingMetaTags', () => {
it('returns the tracing tags from the span, if it is provided', () => {
{
// @ts-expect-error - only passing a partial span object
const tags = getTracingMetaTags(mockedSpan, mockedHub);

expect(tags).toEqual({
sentryTrace: '<meta name="sentry-trace" content="12345678901234567890123456789012-1234567890123456-1"/>',
baggage: '<meta name="baggage" content="sentry-environment=production"/>',
});
}
});

it('returns propagationContext DSC data if no span is available', () => {
const tags = getTracingMetaTags(undefined, {
...mockedHub,
// @ts-expect-error - only passing a partial scope object
getScope: () => ({
getPropagationContext: () => ({
traceId: '12345678901234567890123456789012',
sampled: true,
spanId: '1234567890123456',
dsc: {
environment: 'staging',
public_key: 'key',
trace_id: '12345678901234567890123456789012',
},
}),
}),
});

expect(tags).toEqual({
sentryTrace: expect.stringMatching(
/<meta name="sentry-trace" content="12345678901234567890123456789012-(.{16})-1"\/>/,
),
baggage:
'<meta name="baggage" content="sentry-environment=staging,sentry-public_key=key,sentry-trace_id=12345678901234567890123456789012"/>',
});
});

it('returns only the `sentry-trace` tag if no DSC is available', () => {
vi.spyOn(SentryCore, 'getDynamicSamplingContextFromClient').mockReturnValueOnce({
trace_id: '',
public_key: undefined,
});

const tags = getTracingMetaTags(
// @ts-expect-error - only passing a partial span object
{
toTraceparent: () => '12345678901234567890123456789012-1234567890123456-1',
transaction: undefined,
},
mockedHub,
);

expect(tags).toEqual({
sentryTrace: '<meta name="sentry-trace" content="12345678901234567890123456789012-1234567890123456-1"/>',
});
});

it('returns only the `sentry-trace` tag if no DSC is available', () => {
vi.spyOn(SentryCore, 'getDynamicSamplingContextFromClient').mockReturnValueOnce({
trace_id: '',
public_key: undefined,
});

const tags = getTracingMetaTags(
// @ts-expect-error - only passing a partial span object
{
toTraceparent: () => '12345678901234567890123456789012-1234567890123456-1',
transaction: undefined,
},
{
...mockedHub,
getClient: () => undefined,
},
);

expect(tags).toEqual({
sentryTrace: '<meta name="sentry-trace" content="12345678901234567890123456789012-1234567890123456-1"/>',
});
});
});

describe('isValidBaggageString', () => {
it.each([
'sentry-environment=production',
'sentry-environment=staging,sentry-public_key=key,sentry-trace_id=abc',
// @ is allowed in values
'sentry-release=project@1.0.0',
// spaces are allowed around the delimiters
'sentry-environment=staging , sentry-public_key=key ,sentry-release=myproject@1.0.0',
'sentry-environment=staging , thirdparty=value ,sentry-release=myproject@1.0.0',
// these characters are explicitly allowed for keys in the baggage spec:
"!#$%&'*+-.^_`|~1234567890abcxyzABCXYZ=true",
// special characters in values are fine (except for ",;\ - see other test)
'key=(value)',
'key=[{(value)}]',
'key=some$value',
'key=more#value',
'key=max&value',
'key=max:value',
'key=x=value',
])('returns true if the baggage string is valid (%s)', baggageString => {
expect(isValidBaggageString(baggageString)).toBe(true);
});

it.each([
// baggage spec doesn't permit leading spaces
' sentry-environment=production,sentry-publickey=key,sentry-trace_id=abc',
// no spaces in keys or values
'sentry-public key=key',
'sentry-publickey=my key',
// no delimiters ("(),/:;<=>?@[\]{}") in keys
'asdf(x=value',
'asdf)x=value',
'asdf,x=value',
'asdf/x=value',
'asdf:x=value',
'asdf;x=value',
'asdf<x=value',
'asdf>x=value',
'asdf?x=value',
'asdf@x=value',
'asdf[x=value',
'asdf]x=value',
'asdf\\x=value',
'asdf{x=value',
'asdf}x=value',
// no ,;\" in values
'key=va,lue',
'key=va;lue',
Copy link
Member Author

@Lms24 Lms24 Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is technically incorrect because ";" is valid if baggage values have properties but afaik we don't support this when parsing so we should be good 🤞 (properly matching properties blows up the complexity of the regex quite a bit)

'key=va\\lue',
'key=va"lue"',
// baggage headers can have properties but we currently don't support them
'sentry-environment=production;prop1=foo;prop2=bar,nextkey=value',
// no fishy stuff
'absolutely not a valid baggage string',
'val"/><script>alert("xss")</script>',
'something"/>',
'<script>alert("xss")</script>',
'/>',
'" onblur="alert("xss")',
])('returns false if the baggage string is invalid (%s)', baggageString => {
expect(isValidBaggageString(baggageString)).toBe(false);
});

it('returns false if the baggage string is empty', () => {
expect(isValidBaggageString('')).toBe(false);
});

it('returns false if the baggage string is empty', () => {
expect(isValidBaggageString(undefined)).toBe(false);
});
});
Loading
Loading