The goal of this project was to collect, combine and analyse data obtained from Google Trends. My initial motivation was to perform a simple time series analysis of Relative Search Volumes (RSV) for Autism vs Autism Spectrum Disorder. I wanted to know whether the trends changed over time, with Autism Spectrum Disorder becoming more popular. At some point I decided it would be even more interesting to do this for as many languages as possible. I collected the terms for Autism/Autism Spectrum Disorder in different languages from Wikipedia pages, downloaded the data from Google Trends, then joined, cleaned and aggregated it. I discovered a few characteristics that the time series share across languages, and additionally used geopandas to visualize the data.
- Doing more in-depth research on RSV from Google Trends to improve my understanding of the data I collected
- Modeling the time series for different countries as an exercise in time-based cross-validation and predictive modeling
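To make the time-based cross-validation idea concrete, here is a minimal sketch of an expanding-window split, where each fold trains only on observations that come before its test window. The function name `expanding_window_splits` and its parameters are hypothetical and not part of this project; a library splitter could be used instead.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Hypothetical helper: every fold trains on all points before its test
    window, so no future observation leaks into training.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        train_idx = list(range(test_start))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx


# Example: 10 time steps, 3 folds, 2 test points per fold.
splits = list(expanding_window_splits(n_samples=10, n_splits=3, test_size=2))
```

The training window grows with each fold while the test window moves forward in time, which mirrors how a forecasting model would actually be used.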
The most challenging part was combining the RSV time series with geographical data, which required joining, melting and pivoting multiple tables. I also learned the basics of working with geopandas and discovered that I really enjoy visualizing my data on a map. Finally, I learned how to spot common characteristics in time series data and interpret them.
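The reshaping described above might look roughly like the sketch below. It is only an illustration with pandas: the column names (`date`, `language`, `rsv`, `iso_a3`), the toy values and the language-to-country lookup are assumptions, not the project's actual tables.

```python
import pandas as pd

# Hypothetical wide table: one row per month, one RSV column per language.
wide = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01", "2020-02-01"]),
    "en": [80, 90],
    "de": [40, 50],
})

# Hypothetical lookup from language code to ISO country code.
countries = pd.DataFrame({
    "language": ["en", "en", "de"],
    "iso_a3": ["GBR", "USA", "DEU"],
})

# Melt to long format: one row per (date, language) pair.
long_df = wide.melt(id_vars="date", var_name="language", value_name="rsv")

# Join each language's series onto every country that uses that language.
by_country = long_df.merge(countries, on="language", how="inner")

# Aggregate to one mean RSV value per country.
mean_rsv = by_country.groupby("iso_a3", as_index=False)["rsv"].mean()
```

A per-country table like `mean_rsv` could then be merged onto a GeoDataFrame of country geometries by a shared country-code column and plotted as a choropleth.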