fliskdata/analyze-tracking

@flisk/analyze-tracking

Automatically document your analytics setup by analyzing tracking code for tools like Segment, Amplitude, Mixpanel, and more, and generating data schemas 🚀

Why Use @flisk/analyze-tracking?

📊 Understand Your Tracking – Effortlessly analyze your codebase for track calls so you can see all your analytics events, properties, and triggers in one place. No more guessing what’s being tracked!

🔍 Auto-Document Events – Generates a complete YAML schema that captures all events and properties, including where they’re implemented in your codebase.

🕵️‍♂️ Track Changes Over Time – Easily spot unintended changes or ensure your analytics setup remains consistent across updates.

📚 Populate Data Catalogs – Automatically generate structured documentation that can help feed into your data catalog, making it easier for everyone to understand your events.

Quick Start

Run without installation! Just use:

npx @flisk/analyze-tracking /path/to/project [options]

Key Options:

  • -g, --generateDescription: Generate descriptions of fields (default: false)
  • -o, --output <output_file>: Name of the output file (default: tracking-schema.yaml)
  • -c, --customFunction <function_name>: Specify a custom tracking function

🔑  Important: You must set the OPENAI_API_KEY environment variable to use --generateDescription.
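
If you run the tool from a Node script (for example in CI), the options above can be assembled programmatically. The following is a minimal sketch, not part of the package: `buildArgs` is a hypothetical helper, and it uses only the flags listed above.

```javascript
// Hypothetical helper (not part of @flisk/analyze-tracking): builds the
// argument list for `npx` from the options documented above.
function buildArgs(projectPath, options = {}) {
  const env = options.env || process.env;
  const args = ['@flisk/analyze-tracking', projectPath];

  if (options.generateDescription) {
    // Per the note above, descriptions need OPENAI_API_KEY to be set.
    if (!env.OPENAI_API_KEY) {
      throw new Error('OPENAI_API_KEY must be set to use --generateDescription');
    }
    args.push('--generateDescription');
  }
  if (options.output) {
    args.push('--output', options.output);
  }
  if (options.customFunction) {
    args.push('--customFunction', options.customFunction);
  }
  return args;
}

// Example: print the command that would be run.
const args = buildArgs('./my-app', {
  output: 'tracking-schema.yaml',
  customFunction: 'trackEvent',
});
console.log(['npx', ...args].join(' '));
```

The resulting array could then be passed to Node's `child_process.spawnSync('npx', args, { stdio: 'inherit' })` to actually run the analysis.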

Note on Custom Functions 💡

Use this if you have your own in-house tracker or a wrapper function that calls other tracking libraries.

Currently, only functions that are called in the following format are supported:

yourCustomTrackFunctionName('<event_name>', {
  <event_parameters>
});
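
As an illustration, here is a minimal sketch of an in-house wrapper whose calls match the format above. The name `trackEvent`, the `destinations` list, and the event data are all hypothetical; the assumption is that the tool would be run with `--customFunction trackEvent` to pick up such calls.

```javascript
// Hypothetical in-house wrapper; the name `trackEvent` is illustrative.
// It forwards each event to every registered destination callback.
const destinations = [];

function trackEvent(eventName, properties = {}) {
  for (const send of destinations) {
    send(eventName, properties);
  }
}

// Stub destination that records calls instead of sending them anywhere.
const sent = [];
destinations.push((eventName, properties) => {
  sent.push({ eventName, properties });
});

// A call in the supported shape: event name first, then an object of parameters.
trackEvent('checkout_completed', {
  order_id: 'ord_123',
  total: 49.99,
});
```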

What’s Generated?

A clear YAML schema that shows where your events are tracked, their properties, and more. Here’s an example:

version: 1
source:
  repository: <repository_url>
  commit: <commit_sha>
  timestamp: <commit_timestamp>
events:
  <event_name>:
    description: <ai_generated_description>
    implementations:
      - description: <ai_generated_description>
        path: <path_to_file>
        line: <line_number>
        function: <function_name>
        destination: <platform_name>
    properties:
      <property_name>:
        description: <ai_generated_description>
        type: <property_type>

Use this to understand where your events live in the code and how they’re being tracked.
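
For example, once the YAML file has been parsed into a plain object (with a YAML parser of your choice), the schema can be flattened into one row per implementation. This is a rough sketch: `listImplementations` is a hypothetical helper, and the sample event, path, and destination values are made up to mirror the structure above.

```javascript
// Assumption: `schema` is the generated YAML already parsed into an object.
// All sample values below are invented for illustration.
const schema = {
  version: 1,
  events: {
    checkout_completed: {
      implementations: [
        { path: 'src/checkout.js', line: 42, function: 'submitOrder', destination: 'segment' },
      ],
      properties: {
        order_id: { type: 'string' },
        total: { type: 'number' },
      },
    },
  },
};

// Flatten the schema into one row per event implementation.
function listImplementations(parsedSchema) {
  const rows = [];
  for (const [eventName, event] of Object.entries(parsedSchema.events || {})) {
    for (const impl of event.implementations || []) {
      rows.push({
        event: eventName,
        location: `${impl.path}:${impl.line}`,
        destination: impl.destination,
        properties: Object.keys(event.properties || {}),
      });
    }
  }
  return rows;
}

console.log(listImplementations(schema));
```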

GPT-4o mini is used for generating descriptions of events, properties, and implementations.

See schema.json for a JSON Schema of the output.
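
Because the output is plain structured data, two versions of it can be compared to spot changes over time. The sketch below is one possible approach, not a feature of the package: `diffSchemas` is a hypothetical helper that assumes both files have already been parsed into objects with the `events` and `properties` layout described above, and the sample data is invented.

```javascript
// Hypothetical helper: reports added/removed events and properties,
// plus property type changes, between two parsed schema objects.
function diffSchemas(before, after) {
  const changes = [];
  const oldEvents = before.events || {};
  const newEvents = after.events || {};

  for (const name of Object.keys(newEvents)) {
    if (!(name in oldEvents)) changes.push(`event added: ${name}`);
  }
  for (const [name, oldEvent] of Object.entries(oldEvents)) {
    if (!(name in newEvents)) {
      changes.push(`event removed: ${name}`);
      continue;
    }
    const oldProps = oldEvent.properties || {};
    const newProps = newEvents[name].properties || {};
    for (const prop of Object.keys(newProps)) {
      if (!(prop in oldProps)) changes.push(`property added: ${name}.${prop}`);
    }
    for (const [prop, def] of Object.entries(oldProps)) {
      if (!(prop in newProps)) {
        changes.push(`property removed: ${name}.${prop}`);
      } else if (def.type !== newProps[prop].type) {
        changes.push(`type changed: ${name}.${prop} (${def.type} -> ${newProps[prop].type})`);
      }
    }
  }
  return changes;
}

// Invented sample data for illustration.
const previous = {
  events: {
    signup: { properties: { plan: { type: 'string' } } },
    legacy_click: { properties: {} },
  },
};
const current = {
  events: {
    signup: { properties: { plan: { type: 'number' }, referrer: { type: 'string' } } },
    checkout_completed: { properties: {} },
  },
};

console.log(diffSchemas(previous, current));
```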

Supported tracking libraries

Google Analytics

gtag('event', '<event_name>', {
  <event_parameters>
});

Segment

analytics.track('<event_name>', {
  <event_parameters>
});

Mixpanel

mixpanel.track('<event_name>', {
  <event_parameters>
});

Amplitude

amplitude.logEvent('<event_name>', {
  <event_parameters>
});

Rudderstack

rudderanalytics.track('<event_name>', {
  <event_parameters>
});

mParticle

mParticle.logEvent('<event_name>', {
  <event_parameters>
});

PostHog

posthog.capture('<event_name>', {
  <event_parameters>
});

Pendo

pendo.track('<event_name>', {
  <event_parameters>
});

Heap

heap.track('<event_name>', {
  <event_parameters>
});

Snowplow (struct events)

snowplow('trackStructEvent', {
  category: '<category>',
  action: '<action>',
  label: '<label>',
  property: '<property>',
  value: '<value>'
});

trackStructEvent({
  category: '<category>',
  action: '<action>',
  label: '<label>',
  property: '<property>',
  value: '<value>'
});

buildStructEvent({
  category: '<category>',
  action: '<action>',
  label: '<label>',
  property: '<property>',
  value: '<value>'
});

Note: Support for Snowplow self-describing events is coming soon!

Contribute

We’re actively improving this package. Found a bug? Want to request a feature? Open an issue or contribute directly!