This paper presents a system that predicts the risk of an epidemic outbreak from internet search activity. The system analyzes search data from the Google search engine, which is made available through the Google Trends service. Once downloaded, the data is preprocessed so that it can serve as the exogenous input of an autoregressive model. The procedure focuses on the COVID-19 epidemic. The predicted autoregressive signal represents the number of cases in the following weeks.
The study compares several autoregressive models, including models based on the Akaike information criterion, models based on an analysis of the sum of squared errors, and incremental models. Analysis of each model's score made it possible to draw conclusions about the efficiency of prediction based on internet searches. In addition, the characteristics of the autoregressive signal were analyzed.
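
To make the modelling idea concrete, the following is a minimal sketch of an autoregressive model with exogenous input, fitted by ordinary least squares and scored with the sum of squared errors and the Akaike information criterion. It is an illustration under stated assumptions, not the implementation used in the study: the function names `fit_arx` and `select_order` are hypothetical, the series `y` (weekly case counts) and `u` (search interest) are replaced by synthetic data, and the AIC is taken in its Gaussian least-squares form, n ln(SSE/n) + 2k, which may differ from the variant the authors applied.

```python
import numpy as np

def fit_arx(y, u, p, q, start=None):
    """Least-squares fit of y[t] = c + sum_i a_i*y[t-i] + sum_j b_j*u[t-j] + e[t].

    p and q are the numbers of lags of y and u; start is the first index used
    as a regression target (defaults to max(p, q)).
    Returns (theta, sse, aic) with theta = [c, a_1..a_p, b_1..b_q].
    """
    y = np.asarray(y, dtype=float)
    u = np.asarray(u, dtype=float)
    if start is None:
        start = max(p, q)
    # Each row holds an intercept, then y[t-1..t-p], then u[t-1..t-q].
    rows = [
        np.concatenate(([1.0], y[t - p:t][::-1], u[t - q:t][::-1]))
        for t in range(start, len(y))
    ]
    X = np.array(rows)
    target = y[start:]
    theta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ theta
    sse = float(resid @ resid)
    n, k = X.shape
    # Assumed AIC form for Gaussian residuals (up to an additive constant).
    aic = n * np.log(sse / n) + 2 * k
    return theta, sse, aic

def select_order(y, u, max_p, max_q):
    """Return the (p, q) pair with the lowest AIC on a common sample."""
    start = max(max_p, max_q)  # same targets for every candidate model
    best = None
    for p in range(1, max_p + 1):
        for q in range(0, max_q + 1):
            _, _, aic = fit_arx(y, u, p, q, start=start)
            if best is None or aic < best[0]:
                best = (aic, p, q)
    return best[1], best[2]

# Synthetic stand-ins for the real series (assumption, not study data).
rng = np.random.default_rng(0)
u = rng.normal(size=200)   # plays the role of preprocessed search interest
y = np.zeros(200)          # plays the role of weekly case counts
for t in range(1, 200):
    y[t] = 0.5 * y[t - 1] + 0.8 * u[t - 1] + 0.01 * rng.normal()

theta, sse, aic = fit_arx(y, u, p=1, q=1)
print(theta)
print(select_order(y, u, max_p=3, max_q=3))
```

Fitting every candidate on the same target range keeps the AIC values comparable across lag orders, since the criterion depends on the number of observations. Setting `q=0` drops the search-based input entirely, which gives a simple baseline for judging how much the exogenous signal contributes.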