-
Notifications
You must be signed in to change notification settings - Fork 5
/
README.Rmd
136 lines (96 loc) · 4.28 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
---
output: github_document
---
```{r opts, echo = FALSE}
knitr::opts_chunk$set(
fig.path = "images/"
)
```
# Prevelence of SARS-CoV-2 Variants of Concern in Aoteoroa New Zealand
The `data/` directory contains a single file, `all_variant_counts.csv`. This file contains
the 7-day rolling average for the number of genomes from a given variant sequenced by
ESR. Data in the most recent weeks will be updated as new cases reported in this time
are added to the sequencing dataset Data begins at the start of the August 2021 Delta outbreak.
It should be noted that the cases contributing to this data are not a random or representative
sample of all cases. Information on the demographic samplig of cases is provided in
the June 2023 Covid Genomics Insight report [pdf](https://www.esr.cri.nz/assets/HEALTH-CONTENT/COVID-Genomics-Insights-Dashboard-CGID/CGID_38_Report.pdf).
Changes to the set of lineages tracked in this repository are described in `changes.md`
In addition the `data/archived` subdirectory contians no-longer updated files
* `rbd_levels.csv`, which includes information on the number of key
mutations observed in the receptor binding domain of the spike protein. Higher values
of this measure are [associated with increased immune evasiveness](https://virological.org/t/sars-cov-2-evolution-post-omicron/911). This data was
part of ESR's reporting in late 2022 but is no longer useful.
* `community_variant_counts.csv` and `border_variant_counts.csv` break down the
lineage data into those cases associated with community-acquired infections and
border-associated infections respectively. Border and community cases can no
longer be reliably differentiated, so this data is no longer updated.
* `May_23_variant_classes.csv`, which includes variant classified following
reporting criteria used prior to May 2023 (inducing BQ.1.1, XBF and XBC which
are no longer tracked separately)
## Example
A brief example of how the data might be used, using the R language.
## Since Delta
```{r plot, message=FALSE, warning=FALSE}
library(lubridate)
library(tidyr)
library(ggplot2)
library(forcats)
voc <- read.csv("data/all_variant_counts.csv")
voc$report_date <- ymd(voc$report_date)
tidy_voc <- pivot_longer(voc, !report_date, names_to = "variant", values_to = "n_genomes")
stack_order <- c("Delta",
"BA.1",
"BA.2",
"BA.5",
"BA.4",
"BA.2.75",
"CH.1.1",
"FK.1.1",
"XBB",
"XBB.1.5",
"XBB.1.16",
"EG.5",
"HK.3",
"BA.2.86",
"JN.1",
"JN.1.4",
"Recombinant",
"XBC"
)
tidy_voc$variant <- fct_relevel(tidy_voc$variant, stack_order)
voc_pal <-c(Delta = "#3182bd",
BA.1 = "#bd0026",
BA.2 = "#fd8d3c",
BA.4 = "#B14B02",
BA.5 = "#FD3A6B",
BA.2.75 = "#fecc5c",
CH.1.1 = "#bd0026",
FK.1.1 = "#E81634",
XBB = "#b6db3a",
XBB.1.5 = "#5CCC34",
XBB.1.16 = "#9CC33C",
EG.5 = "#599945",
HK.3 = "#3e5836",
BA.2.86 = "#B399D4",
JN.1 = "#6f4f97",
JN.1.4 = "#422f5a",
Recombinant = "#57badb",
XBC = "#9AD6EA"
)
ggplot(tidy_voc, aes(report_date, n_genomes, fill=variant)) +
geom_area(position = 'fill', colour='black', size=0.3) +
scale_fill_manual(values=voc_pal)+
theme_bw(base_size = 16)
```
## Recent
```{r, recent_plot}
library(lubridate)
recent <- filter(tidy_voc, report_date >= today() - months(3), n_genomes > 0)
recent_pal <- voc_pal[ as.character(unique(recent$variant)) ]
ggplot(recent, aes(report_date, n_genomes, fill=variant)) +
geom_area(position = 'fill', colour='black', size=0.3) +
scale_fill_manual(values=recent_pal)+
theme_bw(base_size = 16)
```
## License
The data is released under a [CC-BY 4.0 international license](https://creativecommons.org/licenses/by/4.0/). You are free to copy, distribute or adapt this data as long as you acknowledge ESR.