Docs for Mobile Search Clients Sources Daily (#813)
Co-authored-by: Jan-Erik Rediger <jrediger@mozilla.com>
pissac17 and badboy authored Feb 12, 2024
1 parent 10521df commit df107e0
Showing 2 changed files with 52 additions and 0 deletions.
1 change: 1 addition & 0 deletions src/SUMMARY.md
@@ -120,6 +120,7 @@
- [Search Clients Engines Sources Daily](datasets/search/search_clients_engines_sources_daily/reference.md)
- [Search Clients Last Seen](datasets/search/search_clients_last_seen/reference.md)
- [Client LTV](datasets/search/client_ltv/reference.md)
- [Mobile Search Clients Sources Daily](datasets/search/mobile_search_clients_sources_daily/intro.md)
- [Non-Desktop Datasets](datasets/non_desktop.md)
- [Day 2-7 Activation](datasets/non_desktop/day_2_7_activation/reference.md)
- [Google Play Store](datasets/non_desktop/google_play_store/reference.md)
51 changes: 51 additions & 0 deletions src/datasets/search/mobile_search_clients_sources_daily/intro.md
@@ -0,0 +1,51 @@
`mobile_search_clients_engines_sources_daily` is designed to enable client-level search analyses for mobile.
Querying this dataset can be slow;
consider using `mobile_search_aggregates` for coarse analyses.

## Contents

`mobile_search_clients_engines_sources_daily` has one row for each unique combination of:
(`client_id`, `submission_date`, `engine`, `source`).

Alongside standard search metrics, this dataset includes client-specific descriptive information.
For example, each row includes `normalized_app_name` and `normalized_app_name_os`.
`normalized_app_name` modifies the raw `app_name` value to align it more consistently with KPI reporting,
while `normalized_app_name_os` combines the app name and the OS used by each client.
The table below gives the full mapping for these two fields.

| `app_name` | `os` | `normalized_app_name_os` | `normalized_app_name` |
| --------------------- | ------- | ---------------------------- | --------------------- |
| `Fenix` | Android | Firefox Android | Firefox |
| `Fennec` | Other | Fennec Other | Fennec |
| `Fennec` | Android | Legacy Firefox Android | Fennec |
| `Fennec` | iOS | Firefox iOS | Firefox |
| `Firefox Preview` | Android | Firefox Preview | Firefox Preview |
| `FirefoxConnect` | Android | Firefox for Echo Show | Firefox for Echo Show |
| `FirefoxForFireTV` | Android | Firefox for FireTV | Firefox for FireTV |
| `Focus Android Glean` | Android | Focus Android | Focus |
| `Focus iOS Glean` | iOS | Focus iOS | Focus |
| `Klar Android Glean` | Android | Klar Android | Klar |
| `Klar iOS Glean` | iOS | Klar iOS | Klar |
| `Other` | iOS | Other iOS | Other |
| `Other` | Other | Other | Other |
| `Other` | Android | Other Android | Other |
| `Zerda` | Android | Firefox Lite Android | Firefox Lite |
| `Zerda_cn` | Android | Firefox Lite Android (China) | Firefox Lite (China) |
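
The mapping above can be read as a lookup keyed on the (`app_name`, `os`) pair. The following is a minimal Python sketch of that reading, transcribed from the table. The names `NORMALIZED_NAMES` and `normalize` are hypothetical, and returning `None` for pairs not listed in the table is an assumption, not documented behavior of the dataset.

```python
# Transcribed from the mapping table:
# (app_name, os) -> (normalized_app_name_os, normalized_app_name)
NORMALIZED_NAMES = {
    ("Fenix", "Android"): ("Firefox Android", "Firefox"),
    ("Fennec", "Other"): ("Fennec Other", "Fennec"),
    ("Fennec", "Android"): ("Legacy Firefox Android", "Fennec"),
    ("Fennec", "iOS"): ("Firefox iOS", "Firefox"),
    ("Firefox Preview", "Android"): ("Firefox Preview", "Firefox Preview"),
    ("FirefoxConnect", "Android"): ("Firefox for Echo Show", "Firefox for Echo Show"),
    ("FirefoxForFireTV", "Android"): ("Firefox for FireTV", "Firefox for FireTV"),
    ("Focus Android Glean", "Android"): ("Focus Android", "Focus"),
    ("Focus iOS Glean", "iOS"): ("Focus iOS", "Focus"),
    ("Klar Android Glean", "Android"): ("Klar Android", "Klar"),
    ("Klar iOS Glean", "iOS"): ("Klar iOS", "Klar"),
    ("Other", "iOS"): ("Other iOS", "Other"),
    ("Other", "Other"): ("Other", "Other"),
    ("Other", "Android"): ("Other Android", "Other"),
    ("Zerda", "Android"): ("Firefox Lite Android", "Firefox Lite"),
    ("Zerda_cn", "Android"): ("Firefox Lite Android (China)", "Firefox Lite (China)"),
}


def normalize(app_name, os):
    """Return (normalized_app_name_os, normalized_app_name) for a pair.

    Returns None for pairs that do not appear in the table (an assumption).
    """
    return NORMALIZED_NAMES.get((app_name, os))
```

Note that the same `app_name` can map to different normalized names depending on the OS: `Fennec` on iOS is reported as Firefox, while `Fennec` on Android stays Fennec.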

Note that if there were no searches of a given type in a row's segment
(i.e. the count would be 0),
the corresponding search metric column is `null`.
Each search metric column represents a different type of search.
For more details, see the [search data documentation].
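
Because zero counts are stored as `null`, aggregations may need to convert them back to 0 explicitly. The sketch below illustrates this with a toy in-memory SQLite table standing in for the real dataset; the table name `searches`, the column `search_count`, and the sample rows are all hypothetical.

```python
import sqlite3

# Toy stand-in for the dataset; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE searches (client_id TEXT, engine TEXT, search_count INTEGER)"
)
conn.executemany(
    "INSERT INTO searches VALUES (?, ?, ?)",
    [("a", "google", 3), ("a", "bing", None), ("b", "google", None)],
)

# SUM skips nulls, so a client whose rows are all null gets null, not 0.
raw = dict(
    conn.execute(
        "SELECT client_id, SUM(search_count) FROM searches GROUP BY client_id"
    )
)

# COALESCE turns the missing counts into explicit zeros before summing.
filled = dict(
    conn.execute(
        "SELECT client_id, SUM(COALESCE(search_count, 0)) "
        "FROM searches GROUP BY client_id"
    )
)

print(raw)     # client "b" has a null total
print(filled)  # client "b" has a total of 0
```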

## Background and Caveats

`mobile_search_clients_engines_sources_daily` does not include
(`client_id`, `submission_date`) pairs
if we did not receive a ping for that `submission_date`.

We impute a `NULL` `engine` and `source` for pings with no search counts.
This ensures users who never search are included in this dataset.
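
One consequence of the imputed `NULL` rows is that counting clients and counting searching clients are different queries. The sketch below shows the difference on a toy in-memory SQLite table; the table name `daily`, the sample rows, and the `source` values are made up for illustration.

```python
import sqlite3

# Toy stand-in for the dataset; rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily "
    "(client_id TEXT, submission_date TEXT, engine TEXT, source TEXT)"
)
conn.executemany(
    "INSERT INTO daily VALUES (?, ?, ?, ?)",
    [
        ("a", "2024-02-01", "google", "actionbar"),
        ("a", "2024-02-01", "bing", "suggestion"),
        ("b", "2024-02-01", None, None),  # ping with no search counts
    ],
)

# Every client that sent a ping, whether or not they searched.
all_clients = conn.execute(
    "SELECT COUNT(DISTINCT client_id) FROM daily"
).fetchone()[0]

# Only clients with at least one non-null engine, i.e. those who searched.
searchers = conn.execute(
    "SELECT COUNT(DISTINCT client_id) FROM daily WHERE engine IS NOT NULL"
).fetchone()[0]

print(all_clients, searchers)
```

Filtering on `engine IS NOT NULL` drops the non-searching client, so the choice of filter depends on whether the analysis is about all active clients or only those who searched.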

This dataset is large.
If you're querying this dataset from STMO,
heavily limit the data you read using `submission_date` or `sample_id`.
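
As a rough sketch of what such a restricted query might look like, the helper below builds a SQL string that filters on both columns. The function `build_query` and the table path in `TABLE` are assumptions for illustration; substitute the path used in your own environment.

```python
from datetime import date

# Assumed table path; adjust to wherever the dataset lives in your project.
TABLE = "search.mobile_search_clients_engines_sources_daily"


def build_query(day: date, sample_id: int) -> str:
    """Build a query restricted to one submission date and one sample_id."""
    return (
        "SELECT client_id, engine, source\n"
        f"FROM {TABLE}\n"
        f"WHERE submission_date = '{day.isoformat()}'\n"
        f"  AND sample_id = {int(sample_id)}"
    )


print(build_query(date(2024, 2, 12), 42))
```

Filtering on a single `submission_date` and a single `sample_id` keeps the amount of data read small while iterating on a query; the filters can be widened once the query is known to work.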

[search data documentation]: ../../search.md
