Overview.
Python tools. Fama-French data in Pandas.
Applications.
Code.
The world is filled with differences. People are large and small, old and young. Countries are large and small, rich and poor. Companies, too.
What do we mean when we say something is "random"? Well, we might mean really crazy. But here we mean that a number of different things could happen, and we're not sure ahead of time which one will. The Steelers might win or lose. The stock market might go up or down. The economy might grow quickly or slowly. Your income might go up or down. You get the idea.
Examples:
- bar chart of equity returns
- boxplots
- distribution of one-day currency changes: euro, renminbi (RMB), Swiss franc
- distribution of ages
- income
- medical spending (MEPS)
- Kevin Williams long-tail data...
- BDS firm size and age distributions: http://www.census.gov/ces/dataproducts/bds/
- Fandango movie ratings from 538
See figs 2-4: http://public.econ.duke.edu/~psarcidi/aa.pdf
- https://jakevdp.github.io/blog/2013/12/01/kernel-density-estimation/
- http://stackoverflow.com/questions/4150171/how-to-create-a-density-plot-in-matplotlib
- http://pandas.pydata.org/pandas-docs/stable/visualization.html#density-plot
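To see what a kernel density estimate does under the hood, here is a minimal sketch in plain NumPy. It is not the pandas or matplotlib implementation behind the links above. The function name `gaussian_kde`, the bandwidth of 0.3, and the simulated sample are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kde(sample, grid, bandwidth):
    """Average a Gaussian bump centered on each observation, evaluated on grid."""
    sample = np.asarray(sample, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # z[i, j] = distance from grid point i to observation j, in bandwidth units
    z = (grid[:, None] - sample[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
    # Averaging over observations gives a density that integrates to one
    return bumps.mean(axis=1)

rng = np.random.default_rng(seed=0)    # arbitrary seed, for reproducibility
sample = rng.standard_normal(500)      # simulated data, not real returns
grid = np.linspace(-6, 6, 601)
density = gaussian_kde(sample, grid, bandwidth=0.3)

# Rough check: the area under the estimated density should be close to one
area = density.sum() * (grid[1] - grid[0])
print(f"area under the estimate: {area:.4f}")
```

A smaller bandwidth gives a bumpier estimate, a larger one a smoother estimate. Trying a few values on the same sample is a quick way to see the tradeoff.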
Dynamics: https://www.nact.org/resources/2014_SP_Global_Corporate_Default_Study.pdf
bar charts, pdfs, kde...
http://pandas.pydata.org/pandas-docs/stable/visualization.html#other-plots
Compare two distributions of movie ratings: http://fivethirtyeight.com/features/fandango-movies-ratings/
Show tools, generate random data
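One possible way to generate random data, sketched with NumPy's random generator and a Pandas dataframe. The column names, sample size, seed, and distribution parameters are arbitrary choices for illustration, not values taken from any dataset listed here.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)   # arbitrary seed, for reproducibility
n = 1000                               # number of draws per column (assumed)

df = pd.DataFrame({
    "normal": rng.normal(loc=0.0, scale=1.0, size=n),    # bell-shaped
    "uniform": rng.uniform(low=0.0, high=1.0, size=n),   # flat on [0, 1)
    "pareto": rng.pareto(a=3.0, size=n),                 # long right tail (Lomax form)
})

# Count, mean, standard deviation, and quantiles for each column
summary = df.describe()
print(summary)
```

Comparing the mean and the median (the 50% row) of the `pareto` column against the `normal` column gives a first hint of what a long tail does to summary statistics.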
What's a black swan? How big was the drop in the Chinese market?
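One way to make "black swan" concrete is to ask how likely a large drop would be if changes were normally distributed. That normality is an assumption made here for illustration, not a claim about actual market returns, and no real market data is used. The helper name `normal_tail_prob` is hypothetical.

```python
import math

def normal_tail_prob(k):
    """P(Z <= -k) for a standard normal Z: a drop of k or more standard deviations."""
    return 0.5 * math.erfc(k / math.sqrt(2))

# Probability of a drop at least k standard deviations below the mean
for k in (1, 2, 3, 5, 10):
    print(f"{k:>2} sd drop: probability {normal_tail_prob(k):.3e}")
```

If drops this large show up in the data far more often than these probabilities suggest, that is evidence the normal distribution understates the tails.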
- CPS data?
- Long-tail sales data (music, movies?)
- Births by age of mother
- Age distribution
https://terrytao.wordpress.com/2009/07/03/benfords-law-zipfs-law-and-the-pareto-distribution/
Scatterplot matrix: http://pandas.pydata.org/pandas-docs/stable/visualization.html#scatter-matrix-plot
Guess the correlation...
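A minimal sketch for a guess-the-correlation exercise: simulate pairs with a chosen correlation, then compare it with the sample estimate. The function name `correlated_pairs`, the target of 0.6, and the sample size are assumptions made for illustration.

```python
import numpy as np

def correlated_pairs(rho, n, rng):
    """Draw n pairs (x, y) of standard normals with population correlation rho."""
    x = rng.standard_normal(n)
    noise = rng.standard_normal(n)
    # Mixing x with independent noise keeps var(y) = 1 and sets corr(x, y) = rho
    y = rho * x + np.sqrt(1 - rho**2) * noise
    return x, y

rng = np.random.default_rng(seed=1)    # arbitrary seed, for reproducibility
true_rho = 0.6
x, y = correlated_pairs(true_rho, 5000, rng)

# Off-diagonal entry of the 2x2 correlation matrix is the sample correlation
sample_rho = np.corrcoef(x, y)[0, 1]
print(f"true {true_rho:.2f}, sample {sample_rho:.3f}")
```

To turn this into a game, draw `true_rho` at random, show a scatterplot of `x` against `y`, and reveal the value only after guessing.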