<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.0.1">Jekyll</generator><link href="/dev/feed.xml" rel="self" type="application/atom+xml" /><link href="/dev/" rel="alternate" type="text/html" /><updated>2020-05-26T17:29:42+00:00</updated><id>/dev/feed.xml</id><title type="html">Turing.jl</title><subtitle>Turing: A robust, efficient and modular library for general-purpose probabilistic programming.
</subtitle><author><name>The Turing Team</name></author><entry><title type="html">Replication study: Estimating number of infections and impact of NPIs on COVID-19 in European countries (Imperial Report 13)</title><link href="/dev/posts/2020-05-04-Imperial-Report13-analysis" rel="alternate" type="text/html" title="Replication study: Estimating number of infections and impact of NPIs on COVID-19 in European countries (Imperial Report 13)" /><published>2020-05-14T00:00:00+00:00</published><updated>2020-05-14T00:00:00+00:00</updated><id>/dev/posts/Imperial-Report13-analysis</id><content type="html" xml:base="/dev/posts/2020-05-04-Imperial-Report13-analysis"><p>The Turing.jl team is currently exploring possibilities in an attempt to help with the ongoing SARS-CoV-2 crisis. As preparation for this and to get our feet wet, we decided to perform a replication study of the <a href="https://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/report-13-europe-npi-impact/">Imperial Report 13</a>, which attempts to estimate the real number of infections and impact of non-pharmaceutical interventions on COVID-19. In the report, the inference was performed using the probabilistic programming language (PPL) Stan. We have explicated their model and inference in Turing.jl, a Julia-based PPL. We believe the results and analysis of our study are relevant for the public, and for other researchers who are actively working on epidemiological models. To that end, our implementation and results are available <a href="https://github.com/cambridge-mlg/Covid19">here</a>.</p>
<p>In summary, we replicated the Imperial COVID-19 model using Turing.jl. Subsequently, we compared the inference results between Turing and Stan, and our comparison indicates that the results are reproducible across the two implementations. In particular, we performed 4 sets of simulations using the Imperial COVID-19 model. The resulting estimates of the expected real number of cases, in contrast to the <em>recorded</em> number of cases, the reproduction number \(R_t\), and the expected number of deaths as a function of time and non-pharmaceutical interventions (NPIs) for each simulation are shown below.</p>
<div id="simulation-1-full" class="plotly"></div>
<script>
Plotly.d3.json("../assets/figures/2020-05-04-Imperial-Report13-analysis/full_prior.json", function(err, fig) {
Plotly.plot("simulation-1-full", fig.data, fig.layout, {responsive: true});
});
</script>
<p><strong>Simulation (a):</strong> hypothetical simulation from the model without data (prior predictive) or non-pharmaceutical interventions. Under the prior assumptions of the Imperial COVID-19 model, there is a very wide range of epidemic progressions, with expected cases ranging from almost 0 to 100% of the population over time. The black bar corresponds to the date of the last observation. Note that \(R_t\) has a different time range than the other plots; following the original report, it shows the 100 days following the country-specific <code class="highlighter-rouge">epidemic_start</code>, which is defined to be 31 days prior to the first date of 10 cumulative deaths, while the other plots show the last 60 days.</p>
<div id="simulation-2-full" class="plotly"></div>
<script>
Plotly.d3.json("../assets/figures/2020-05-04-Imperial-Report13-analysis/full_posterior.json", function(err, fig) {
Plotly.plot("simulation-2-full", fig.data, fig.layout, {responsive: true});
});
</script>
<p><strong>Simulation (b):</strong> future simulation with non-pharmaceutical interventions kept in place (posterior predictive). After incorporating the observed infection data, we see a substantially more refined range of epidemic progressions. The reproduction rate estimate lies in the range of 3.5-5.6 before any intervention is introduced. The dotted lines correspond to observations, and the black bar corresponds to the date of the last observation.</p>
<div id="simulation-3-full" class="plotly"></div>
<script>
Plotly.d3.json("../assets/figures/2020-05-04-Imperial-Report13-analysis/full_counterfactual.json", function(err, fig) {
Plotly.plot("simulation-3-full", fig.data, fig.layout, {responsive: true});
});
</script>
<p><strong>Simulation (c):</strong> future simulation with non-pharmaceutical interventions removed. Here we see the hypothetical scenarios after incorporating the infection data but with the non-pharmaceutical interventions removed. This plot looks similar to Simulation (a), but with a more rapid progression of the pandemic, since the estimated reproduction rate is higher than the one assumed under the prior. The dotted lines correspond to observations, and the black bar corresponds to the date of the last observation.</p>
<div id="simulation-4-full" class="plotly"></div>
<script>
Plotly.d3.json("../assets/figures/2020-05-04-Imperial-Report13-analysis/full_counterfactual2.json", function(err, fig) {
Plotly.plot("simulation-4-full", fig.data, fig.layout, {responsive: true});
});
</script>
<p><strong>Simulation (d):</strong> future simulation in which <code class="highlighter-rouge">lockdown</code> is lifted two weeks before the last observation (posterior predictive). As a result, there is a clear, rapid rebound of the reproduction rate. Comparing with Simulation (b), we do not observe an <em>immediate</em> increase in the number of expected cases and deaths upon lifting lockdown, but there is a significant difference in the number of cases and deaths in the last few days of the plot: Simulation (d) results in a greater number of both cases and deaths, as expected. This demonstrates how the effects of lifting an intervention might not become apparent in the measurable variables, e.g. deaths, until several weeks later. The dotted lines correspond to observations, the black bar corresponds to the date of the last observation, and the red bar indicates when <code class="highlighter-rouge">lockdown</code> was lifted.</p>
<p>Overall, Simulation (a) shows the prior modelling assumptions and how these prior assumptions determine the predicted number of cases, etc. before seeing any data. Simulation (b) predicts the trend of the number of cases, etc. using the estimated parameters and with all the non-pharmaceutical interventions kept in place. Simulation (c) shows the estimates in the case where none of the intervention measures are ever put in place. Simulation (d) shows the estimates in the case where the lockdown is lifted two weeks prior to the last observation while all the other non-pharmaceutical interventions are kept in place.</p>
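<p>To make the simulation terminology concrete, the following is a minimal, purely illustrative Turing.jl sketch (a toy Poisson-count model, <em>not</em> the Imperial Report 13 model) showing how prior predictive draws (as in Simulation (a)) and posterior predictive draws (as in Simulations (b)-(d)) can be produced; the model, data, and parameter names are assumptions made for the example.</p>
<pre><code class="language-julia">using Turing

# Toy stand-in model: daily death counts with an unknown Poisson rate.
@model function toy_deaths(y)
    λ ~ truncated(Normal(10, 5), 0, Inf)
    for i in eachindex(y)
        y[i] ~ Poisson(λ)
    end
end

observed = [3, 5, 8, 13, 21, 34]

# Prior predictive (Simulation (a) style): draw the rate from its prior and
# simulate data without conditioning on any observations.
prior_λ = rand(truncated(Normal(10, 5), 0, Inf), 1_000)
prior_pred = [rand(Poisson(l)) for l in prior_λ]

# Posterior predictive (Simulation (b)-(d) style): condition on the data,
# sample the posterior with NUTS, and push the draws back through the likelihood.
chain = sample(toy_deaths(observed), NUTS(), 1_000)
post_λ = vec(Array(chain[:λ]))
post_pred = [rand(Poisson(l)) for l in post_λ]
</code></pre>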
<p>We want to emphasise that we do not provide additional analysis of the Imperial model yet, nor are we aiming to make any claims about the validity or the implications of the model. Instead we refer to Imperial Report 13 for more details and analysis. The purpose of this post is solely to add validation to the <em>inference</em> performed in the paper by obtaining the same results using a different probabilistic programming language (PPL) and by exploring whether or not Turing.jl can be useful for researchers working on these problems.</p>
<p>For our next steps, we’re looking into collaborating with other researchers and further developing this and similar models.
There are some immediate directions to explore:</p>
<ol>
<li>Incorporation of more sources of data, e.g. national mobility, seasonal changes, and behavioural changes in individuals.</li>
<li>How the assumptions incorporated into the priors and their parameters change the resulting posterior.</li>
<li>The current model does not directly include recovery as a possibility and assumes that once a person has been infected, they remain infectious until death. The number of recovered cases suffers from the same issue as the number of cases: it cannot be observed directly. But it can be handled in a similar manner to the number of cases and incorporated into the model for a potential improvement.
This will result in a plethora of different models, from which we can select the most realistic one using model comparison techniques, e.g. leave-one-out cross-validation (LOO-CV).</li>
</ol>
<p>Such model refinements are potentially valuable given the high impact of this pandemic and the uncertainty and debate surrounding its potential outcomes.</p>
<p><strong>Acknowledgement</strong> <em>We would like to thank the Julia community for creating such an excellent platform for scientific computing, and for the continuous feedback that we have received. We also thank researchers from the Computational and Biological Laboratory at Cambridge University for their feedback on an early version of this post.</em>
<!----- Footnotes -----></p></content><author><name>Tor Erlend Fjelde; Mohamed Tarek; Kai Xu; David Widmann; Martin Trapp; Cameron Pfiffer; Hong Ge</name></author><summary type="html">The Turing.jl team is currently exploring possibilities in an attempt to help with the ongoing SARS-CoV-2 crisis. As preparation for this and to get our feet wet, we decided to perform a replication study of the Imperial Report 13, which attempts to estimate the real number of infections and impact of non-pharmaceutical interventions on COVID-19. In the report, the inference was performed using the probabilistic programming language (PPL) Stan. We have explicated their model and inference in Turing.jl, a Julia-based PPL. We believe the results and analysis of our study are relevant for the public, and for other researchers who are actively working on epidemiological models. To that end, our implementation and results are available here.</summary></entry><entry><title type="html">Google Summer of Code/Julia Summer of Code</title><link href="/dev/posts/2020-02-12-jsoc" rel="alternate" type="text/html" title="Google Summer of Code/Julia Summer of Code" /><published>2020-02-12T00:00:00+00:00</published><updated>2020-02-12T00:00:00+00:00</updated><id>/dev/posts/jsoc</id><content type="html" xml:base="/dev/posts/2020-02-12-jsoc"><p>Last year, Turing participated in the Google Summer of Code (GSoC) through the Julia language organization. It was a fun time, and the project was better for it. Turing plans to participate in the upcoming GSoC, and we wanted to outline some potential projects and expectations we have for applicants.</p>
<p>If you are not aware, Google provides funds to students around the world to develop a project of their choice over the summer. Students receive funding from Google and spend three months working on an open source project.</p>
<p>The Turing development team has prepared a list of possible projects that we have deemed valuable to the project and feasible to complete within the three-month limit. This list is not exhaustive – if you have a good idea, you can write it up in your proposal, though it is recommended that you reach out to the Turing team on Julia’s <a href="https://julialang.slack.com/">Slack</a> (you can get an invite <a href="https://slackinvite.julialang.org/">here</a>) or <a href="https://discourse.julialang.org/c/domain/probprog">Discourse</a>. Messages on Discourse should be posted to the “Probabilistic programming” category – we’ll find you!</p>
<p>Possible project ideas:</p>
<ul>
<li><strong>Benchmarking</strong>. Turing’s performance has been sporadically benchmarked against various other probabilistic programming languages (e.g. Stan, PyMC3, TensorFlow Probability), but a systematic approach to studying where Turing excels and where it falls short would be useful. A GSoC student would implement identical models in many PPLs and build tools to benchmark all PPLs against one another.</li>
<li><strong>Nested sampling integration</strong>. Turing focuses on modularity in inference methods, and the development team would like to see more inference methods, particularly the popular nested sampling method. A Julia package (<a href="https://github.com/mileslucas/NestedSamplers.jl">NestedSamplers.jl</a>) exists, but it is not hooked up to Turing and does not currently have a stable API. A GSoC student would either integrate that package or construct their own nested sampling method and build it into Turing.</li>
<li><strong>Automated function memoization by model annotation</strong>. Function memoization is a way to reduce costly function evaluation by caching the output when the same inputs are given. Turing’s Gibbs sampler often ends up <a href="https://turing.ml/dev/docs/using-turing/performancetips#reuse-computations-in-gibbs-sampling">rerunning expensive functions</a> multiple times, and it would be a significant performance improvement to allow Turing’s model compiler to automatically memoize functions where appropriate (a minimal hand-rolled sketch of memoization appears after this list). A student working on this project would become intimately familiar with Turing’s model compiler and build in various automated improvements.</li>
<li><strong>Making Distributions GPU compatible</strong>. Julia’s GPU tooling is generally quite good, but currently Turing is not able to reliably use GPUs while sampling because <a href="https://github.com/JuliaStats/Distributions.jl">Distributions.jl</a> is not GPU compatible. A student on this project would work with the Turing developers and the Distributions developers to allow the use of GPU parallelism where possible in Turing.</li>
<li><strong>Static distributions</strong>. Small, fixed-size vectors and matrices are fairly common in Turing models. This means that sampling in Turing can probably benefit from using statically sized vectors and matrices from <a href="https://github.com/JuliaArrays/StaticArrays.jl">StaticArrays.jl</a> instead of the normal dynamic Julia arrays (see the StaticArrays sketch after this list). Besides the often superior performance of small static vectors and matrices, static arrays are also automatically compatible with the GPU stack in Julia. Currently, the main obstacle to using StaticArrays.jl is that distributions in <a href="https://github.com/JuliaStats/Distributions.jl">Distributions.jl</a> are not compatible with StaticArrays. A GSoC student would adapt the multivariate and matrix-variate distributions, as well as the univariate distributions with vector parameters, in Distributions.jl to make a spin-off package called StaticDistributions.jl. The student would then benchmark StaticDistributions.jl against Distributions.jl and showcase an example of using StaticDistributions.jl together with <a href="https://github.com/JuliaGPU/CuArrays.jl">CuArrays.jl</a> and/or <a href="https://github.com/JuliaGPU/CUDAnative.jl">CUDAnative.jl</a> for GPU acceleration.</li>
<li><strong>GPnet extensions</strong>. One of Turing’s satellite packages, <a href="https://github.com/TuringLang/GPnet.jl">GPnet</a>, is designed to provide a comprehensive suite of Gaussian process tools. See <a href="https://github.com/TuringLang/GPnet.jl/issues/2">this issue</a> for potential tasks – there’s a lot of interesting stuff going on with GPs, and this task in particular may have some creative freedom to it.</li>
<li><strong>Better chains and model diagnostics</strong>. One package that Turing (and many others) rely on heavily is <a href="https://github.com/TuringLang/MCMCChains.jl">MCMCChains.jl</a>, a package designed to format, store, and analyze parameter samples generated during MCMC inference. MCMCChains is currently showing its age a little and has many <a href="https://github.com/TuringLang/MCMCChains.jl/issues/171">bad design choices</a> that need to be fixed. Alternatively, a student could construct a far more lightweight chain system.</li>
<li><strong>Model comparison tools</strong>. Turing and its satellite packages do not currently provide a comprehensive suite of model comparison tools, which are critical for the applied statistician. A student who worked on this project would implement various model comparison tools such as <a href="https://mc-stan.org/loo/">LOO and WAIC</a>, among others.</li>
<li><strong>MLE/MAP tools</strong>. <a href="https://en.wikipedia.org/wiki/Maximum_likelihood_estimation">Maximum likelihood estimates</a> (MLE) and <a href="https://en.wikipedia.org/wiki/Maximum_a_posteriori_estimation">maximum a posteriori</a> (MAP) estimates can currently only be obtained by users through a <a href="https://turing.ml/dev/docs/using-turing/advanced#maximum-a-posteriori-estimation">clunky set of workarounds</a>. A streamlined function like <code class="highlighter-rouge">mle(model)</code> or <code class="highlighter-rouge">map(model)</code> would be very useful for the many Turing users who want to see what the MLE or MAP estimates look like, and it may be valuable to allow MCMC sampling to begin from the MLE or MAP estimates. Students working on this project will work with optimization packages such as <a href="https://github.com/JuliaNLSolvers/Optim.jl">Optim.jl</a> to make MLE and MAP estimation straightforward for Turing models (a rough sketch of MAP estimation with Optim.jl appears after this list).</li>
<li><strong>Particle sampler improvements</strong>. Turing’s development team has spent a lot of time and energy making inference methods more modular; two packages that resulted from this effort are <a href="https://github.com/TuringLang/AdvancedHMC.jl">AdvancedHMC</a> for Hamiltonian MCMC methods and <a href="https://github.com/TuringLang/AdvancedMH.jl">AdvancedMH</a> for Metropolis-Hastings-style inference methods. Turing’s particle samplers, however, have not yet been modernized and spun off into a separate package. A student who worked on this project would become very familiar with Turing’s inference backend and with particle sampling methods. This is a good project for people who love making things efficient and easily extendable.</li>
</ul>
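<p>To make the memoization idea above concrete, here is a minimal, hand-rolled sketch of function memoization; it is purely illustrative and is not Turing’s model compiler, which would apply such caching automatically to annotated model functions. The function names are assumptions made for the example.</p>
<pre><code class="language-julia"># Cache a function's results keyed by its argument tuple, so repeated calls
# with equal arguments return the stored value instead of recomputing it.
function memoize(f)
    cache = Dict{Any,Any}()
    function memoized(args...)
        return get!(() -> f(args...), cache, args)
    end
    return memoized
end

# Hypothetical expensive deterministic function used inside a model.
expensive_loglik(θ) = (sleep(0.1); sum(abs2, θ))

cached_loglik = memoize(expensive_loglik)
cached_loglik([1.0, 2.0])   # slow: computed and stored
cached_loglik([1.0, 2.0])   # fast: returned from the cache
</code></pre>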
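<p>For the static-distributions idea, the sketch below illustrates the kind of benefit fixed-size arrays provide for small vectors; it uses StaticArrays.jl and BenchmarkTools.jl directly and makes no claims about Distributions.jl or Turing internals.</p>
<pre><code class="language-julia">using StaticArrays, BenchmarkTools, LinearAlgebra

# A fixed-size 3-vector: its length is part of the type, so operations on it
# can be unrolled and kept off the heap.
v_static  = @SVector [1.0, 2.0, 3.0]
v_dynamic = [1.0, 2.0, 3.0]

@btime dot($v_static, $v_static)    # typically a few nanoseconds, no allocation
@btime dot($v_dynamic, $v_dynamic)  # generic method on a heap-allocated array
</code></pre>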
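<p>As a rough illustration of the workflow that a streamlined <code class="highlighter-rouge">map(model)</code>-style function would replace, the following sketch obtains a MAP estimate by maximising a hand-written log posterior with Optim.jl; the toy model and data are assumptions for the example, and this is not Turing’s API.</p>
<pre><code class="language-julia">using Optim

# Toy posterior: unknown mean μ with a standard normal prior and
# unit-variance Gaussian observations xs.
xs = [1.2, 0.7, 1.9, 1.4]
logpost(μ) = -0.5 * μ^2 - 0.5 * sum((x - μ)^2 for x in xs)

# MAP estimate: minimise the negative log posterior. BFGS with the default
# finite-difference gradients is enough for this one-dimensional problem.
result = optimize(v -> -logpost(first(v)), [0.0], BFGS())
μ_map = first(Optim.minimizer(result))   # ≈ sum(xs) / (length(xs) + 1)
</code></pre>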
<p>Other projects are welcome, but we do strongly recommend discussing any potential projects with members of the Turing team, as they will end up mentoring GSoC students for the duration of the project.</p>
<p>We’re looking forward to what people are interested in!</p></content><author><name>Cameron Pfiffer</name></author><summary type="html">Last year, Turing participated in the Google Summer of Code (GSoC) through the Julia language organization. It was a fun time, and the project was better for it. Turing plans to participate in the upcoming GSoC, and we wanted to outline some potential projects and expectations we have for applicants.</summary></entry><entry><title type="html">Turing’s Blog</title><link href="/dev/posts/2019-12-14-initial-post" rel="alternate" type="text/html" title="Turing's Blog" /><published>2019-12-14T00:00:00+00:00</published><updated>2019-12-14T00:00:00+00:00</updated><id>/dev/posts/initial-post</id><content type="html" xml:base="/dev/posts/2019-12-14-initial-post"><p>All good open source projects should have a blog, and Turing is one such project. Later on, members of the Turing team may be populating this feed with posts on topics like</p>
<ul>
<li>Interesting things you can do with Turing, or interesting things we have seen others do.</li>
<li>Development updates and major release announcements.</li>
<li>Research updates.</li>
<li>Explorations of Turing’s internals.</li>
<li>Updates to Turing’s satellite projects, such as <a href="https://github.com/TuringLang/AdvancedHMC.jl">AdvancedHMC.jl</a> and <a href="https://github.com/TuringLang/Bijectors.jl">Bijectors.jl</a>.</li>
</ul>
<p>Stay tuned!</p></content><author><name>Cameron Pfiffer</name></author><summary type="html">All good open source projects should have a blog, and Turing is one such project. Later on, members of the Turing team may be populating this feed with posts on topics like</summary></entry></feed>