-
Notifications
You must be signed in to change notification settings - Fork 0
/
oidd898_hw2.Rmd
128 lines (102 loc) · 7.11 KB
/
oidd898_hw2.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
---
author: "Jordan Farrer"
geometry: margin=.75in
output: function(...) {
fmt <- rmarkdown::pdf_document(toc = FALSE, number_section = FALSE, df_print = 'kable', includes = includes(in_header = "header.tex"),...)
fmt$knitr$knit_hooks$size = function(before, options, envir) {
if (before) return(paste0("\n \\", options$size, "\n\n"))
else return("\n\n \\normalsize \n")
}
return(fmt)
}
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE, fig.align = 'center', size = 'small')
pacman::p_load(tidyverse, pander, stringr, rvest, lubridate)
panderOptions('round', 4)
panderOptions('keep.trailing.zeros', TRUE)
panderOptions('big.mark', ',')
options(scipen = 10, expressions = 10000, digits = 4)
```
The R code that built this can be found in **[https://github.com/jrfarrer/oidd898_hw2](https://github.com/jrfarrer/oidd898_hw2)**.
##### 1) 5 cities to open your first stores
In order to create a set of 5 cities to open the first stores in I started with [the 298 largest cities in the US](https://simple.wikipedia.org/wiki/List_of_United_States_cities_by_population) and filtered down to the those in the 19 Midwest states specified by the [FiveThirtyEight](https://fivethirtyeight.com/datalab/what-states-are-in-the-midwest/). I attempted to use these in Facebook Ads; however, I did not find the filters useful enough to evaluate individual cities. Instead, I decided to look at the popularity of the words "weed", "cannabis", "marijuana", and "pot" on election day (Nov 8, 2017) across the US[^1]. The choice of election day was to identify people curious if the legalization of marijuana was on the ballot. In addition, this allowed each state to get a score. City-level data was not possible. Then, I combined the two datasets ranked the cities based on being in a state with a high-level of popularity for the marijuana-related terms and the largest population. Notably, this forces Colorado cities to the top, though marijuana was not a ballot question, though it was in #1 Billings, Montana (and #13 Fargo, North Dakota).
```{r echo = FALSE}
library("rvest")
url <- "https://simple.wikipedia.org/wiki/List_of_United_States_cities_by_population"
population <-
(url %>%
read_html() %>%
html_nodes(xpath='//*[@id="mw-content-text"]/table[1]') %>%
html_table())[[1]] %>%
mutate(population = str_replace_all(`Population (2014)`, "&", "")) %>%
separate(col = population, into = c("left", 'right'), sep = "\\.") %>%
mutate(Population = as.integer(str_replace_all(left, ",", ''))) %>%
select(Rank, City, State, Population) %>%
as_tibble()
midwest_states <-
population %>%
inner_join(
data_frame(
state = state.name
, abb = state.abb
) %>%
filter(abb %in% c('PA','WV','KY','OH','IN','IL','MI','WI','IA','MO',
'OK','KS','CO','NE','SD','ND','MN','WY','MT'))
, by = c("State" = "state")
)
fb_bulk_upload_string <-
midwest_states %>%
mutate(city_state = paste0(City, ", ", State)) %>%
select(city_state) %>%
unlist() %>%
paste0(collapse = "; ")
google_trends_elections_day <- read_csv("election_day.csv", skip = 3, col_names = c("state", "value"))
midwest_states %>%
inner_join(google_trends_elections_day, by = c("State" = "state")) %>%
arrange(desc(value), desc(Population)) %>%
mutate(Rank = row_number()) %>%
select(Rank, City, State, Population, `Google Trends State Value` = value) %>%
head(5) %>%
pander(caption = "Top 5 Cities to Open Stores")
```
![Attempt to use Facebook Ads](fb_ads.png)
##### 2) Hours your store will be opened
Google Trends only has hourly data if you look at data for the past 7 days. I did this for Washington state and Colorado (separately because it cannot be done combined). As these are two states have legal recreational marijuana people may search for "dispensaries near me". I wanted to find when this was most common. I used terms "dispensary" or "dispensaries" for WA[^2] and CO[^3] and plotted the two time series below. From this chart, I would have the stores open from noon to 3am each day except Sunday (early close at midnight).
```{r echo = FALSE}
dispensaries <-
purrr::map(c('dispensaries_wa.csv', 'dispensaries_co.csv'), read_csv, skip = 3,
col_names = c("time", "dispensary","dispensaries","dispensary + dispensaries")) %>%
bind_rows() %>%
cbind(state = c(rep("Washington", nrow(.)/ 2), rep("Colorado", nrow(.)/ 2))) %>%
gather(keyword, value, -state, -time) %>%
mutate(
keyword = factor(keyword, levels = c("dispensary","dispensaries","dispensary + dispensaries"))
, day = lubridate::wday(time, label = TRUE)
, hour = lubridate::hour(time)
) %>%
select(state, everything())
dispensaries %>%
filter(keyword == "dispensary + dispensaries") %>%
ggplot() +
geom_line(aes(x = hour, y = value, colour = state)) +
geom_smooth(aes(x = hour, y = value), se = FALSE) +
facet_grid(. ~ day) +
theme_bw() +
theme(legend.position = 'top') +
scale_x_continuous(breaks = c(6, 12, 18, 24)) +
labs(y = NULL, x = "Time of Day",
caption = "Feb 8, 2017 9:00pm to Feb 15, 2017 7:00pm",
colour = NULL, title = "Google Trends Popularity of 'dispensary' or 'dispensaries'")
```
##### 3) 2 foods to sell in your store in addition to the marijuana
Oftentimes our perception of food is more than just the taste. It is also the culmination of advertising messages that shape our feelings toward a food. Like most Americans I dream of hamburgers; however, each time I eat one I am let down by the taste. The years of seeing juicy hamburgers in advertising have made me crave the concept of a hamburger rather than the actual taste. In choosing two foods to sell in addition to marijuana, we need to remember that what is more important is what marketing has best positioned to be complements of marijuana rather than which foods *actually* make sense with the plant. I attempted to create Facebook Ads audiences with different combinations of terms that demonstrated interest in marijuana and affinity to a particular food. I also attempted to find search terms related to "munchies" on Google Trends or find terms correlated with marijuana-related terms using [Google Correlate](https://www.google.com/trends/correlate).
![Attempt find correlations with Facebook Ads](fb_ads2.png)
Unfortunately, I was unable to find anything that made reasonable sense. In the case of food, it might make sense to go do the Malcolm Gladwell approach to "go with your gut". Moreover, product choice is more likely to be something that can be tested and adjusted. As such I would recommend:
|Food|Justification|
|:----:|:----:|
|Doritos|Item marketed as one to crave|
|Burritos|Heavier, also associated with cravings|
[^1]: https://www.google.com/trends/explore?date=2016-11-08%202016-11-08&geo=US&q=weed%2Bpot%2Bcannabis%2Bmarijuana
[^2]: https://www.google.com/trends/explore?date=now%207-d&geo=US-WA&q=dispensary,dispensaries,dispensary%20%2B%20dispensaries
[^3]: https://www.google.com/trends/explore?date=now%207-d&geo=US-CO&q=dispensary,dispensaries,dispensary%20%2B%20dispensaries