Describing data 1: Distributions of things


Overview.

Python tools. Fama-French data in Pandas.

Applications.

Code.


The world is filled with differences. People are large and small, old and young. Countries are large and small, rich and poor. Companies, too.

What do we mean when we say something is "random"? In everyday speech it can mean "really crazy." Here we mean that a number of different things could happen, and we don't know ahead of time which one will. The Steelers might win or lose. The stock market might go up or down. The economy might grow quickly or slowly. Your income might go up or down. You get the idea.
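A minimal sketch of this idea in NumPy: each draw has a known set of possible outcomes, but we can't say in advance which one we'll get. The equal up/down probability, the sample sizes, and the seed are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, fixed so the run is repeatable

# Ten hypothetical "days": each is up or down with equal probability (an assumption).
outcomes = rng.choice(["up", "down"], size=10)
print(outcomes)

# Any single draw is unpredictable, but over many draws the share of
# "up" days settles near the assumed probability of 0.5.
many = rng.choice(["up", "down"], size=100_000)
share_up = np.mean(many == "up")
print(f"share of up days: {share_up:.3f}")
```

The individual outcomes are unpredictable, yet their distribution is stable. That stability is what the rest of these notes try to describe.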

Examples:

  • Bar chart of equity returns
  • Boxplots
  • Distribution of one-day currency changes: euro, RMB, Swiss franc
  • Distribution of ages
  • Income
  • Medical spending (MEPS)
  • Kevin Williams long-tail data...
  • BDS firm size and age distributions: http://www.census.gov/ces/dataproducts/bds/
  • Fandango movie ratings from 538

See figs 2-4: http://public.econ.duke.edu/~psarcidi/aa.pdf

http://pandas.pydata.org/pandas-docs/stable/visualization.html#density-plot

Dynamics: https://www.nact.org/resources/2014_SP_Global_Corporate_Default_Study.pdf

Describing randomness

Bar charts, probability density functions (pdfs), kernel density estimates (kde)...

http://pandas.pydata.org/pandas-docs/stable/visualization.html#other-plots
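To make the histogram and kde ideas concrete, here is a small sketch that computes both without plotting. The data are simulated standard normal draws standing in for real returns, and the bin count and grid are arbitrary choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)
returns = rng.normal(loc=0.0, scale=1.0, size=5_000)  # simulated stand-in data

# Histogram with density=True: bar heights are scaled so the total bar area is 1,
# which makes the bars comparable to a probability density function.
heights, edges = np.histogram(returns, bins=30, density=True)
widths = np.diff(edges)
print("area under histogram:", np.sum(heights * widths))

# Kernel density estimate: a smooth curve in place of the bars.
kde = stats.gaussian_kde(returns)
grid = np.linspace(-4, 4, 9)
density = kde(grid)
for x, d in zip(grid, density):
    print(f"{x:5.1f}  {d:.3f}")
```

With pandas and matplotlib installed, `pd.Series(returns).plot.hist(bins=30, density=True)` and `pd.Series(returns).plot.kde()` should draw the same two objects.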

Compare two distributions of movie ratings: http://fivethirtyeight.com/features/fandango-movies-ratings/

Scipy and Numpy

Show tools, generate random data

Normal and other distributions
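One way to generate random data and compare distributions is sketched below. It draws from a normal and from a Student t distribution and counts how often each lands far from its mean. The t distribution with 5 degrees of freedom is an assumed stand-in for "fat tails", and the 4 standard deviation cutoff is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n = 200_000

normal_draws = rng.normal(size=n)
t_draws = rng.standard_t(df=5, size=n)  # df=5 is an arbitrary fat-tailed choice


def tail_share(x, k=4.0):
    """Share of observations more than k sample standard deviations from the mean."""
    z = (x - x.mean()) / x.std()
    return np.mean(np.abs(z) > k)


tail_normal = tail_share(normal_draws)
tail_t = tail_share(t_draws)
print(f"normal:    {tail_normal:.5f}")
print(f"Student t: {tail_t:.5f}")
```

Both samples are standardized before counting, so the comparison is about shape rather than scale: the t draws produce extreme observations far more often.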

What's a black swan? How big was the drop in the Chinese market?
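A rough way to frame the black swan question: if daily returns were normal, how often would we see a drop of k standard deviations? The sketch below uses only the standard library. The values of k are illustrative, not measured sizes of any actual market move, and normality is exactly the assumption in question.

```python
import math


def normal_left_tail(k):
    """P(Z <= -k) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(k / math.sqrt(2))


for k in (2, 3, 5, 8):  # illustrative sizes, not measured market moves
    p = normal_left_tail(k)
    print(f"{k} sd drop: probability {p:.3e}, about one day in {1 / p:,.0f}")
```

If drops of this size show up in the data much more often than these numbers suggest, the normal distribution is a poor description of the tails.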

Equity returns

More distributions

  • CPS data?
  • Long-tail sales data (music, movies?)
  • Births by age of mother
  • Age distribution

Pareto etc

https://terrytao.wordpress.com/2009/07/03/benfords-law-zipfs-law-and-the-pareto-distribution/
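A sketch of what "long tail" means in numbers: draw from a Pareto distribution and ask what share of the total is held by the largest 20% of observations, then do the same for a normal sample. The shape parameter 1.5, the normal mean and standard deviation, and the labels "sizes" and "heights" are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
n = 100_000

# NumPy's pareto() draws from the Lomax (Pareto II) form; adding 1 gives a
# classical Pareto with minimum value 1. The shape 1.5 is an arbitrary choice.
sizes = rng.pareto(1.5, size=n) + 1.0

# A thin-tailed comparison sample (hypothetical heights in cm).
heights = rng.normal(loc=170.0, scale=10.0, size=n)


def top_share(x, top=0.2):
    """Share of the total accounted for by the largest `top` fraction of values."""
    ordered = np.sort(x)[::-1]
    k = int(len(ordered) * top)
    return ordered[:k].sum() / ordered.sum()


print(f"Pareto: top 20% hold {top_share(sizes):.2f} of the total")
print(f"Normal: top 20% hold {top_share(heights):.2f} of the total")
```

In the normal sample the top fifth holds only slightly more than a fifth of the total. In the Pareto sample a small number of very large observations dominate.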

Scatterplots

Scatterplot matrix: http://pandas.pydata.org/pandas-docs/stable/visualization.html#scatter-matrix-plot

Guess the correlation...
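For the guessing game, one possible setup is to simulate pairs with a known correlation and check the sample estimate against it. The construction below (mixing x with independent noise) and the chosen correlation values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=4)
n = 20_000


def correlated_pair(rho, size):
    """Two standard normal series whose population correlation is rho."""
    x = rng.normal(size=size)
    noise = rng.normal(size=size)
    y = rho * x + np.sqrt(1.0 - rho**2) * noise
    return x, y


estimates = {}
for rho in (-0.8, 0.0, 0.3, 0.9):  # illustrative values to guess
    x, y = correlated_pair(rho, n)
    estimates[rho] = np.corrcoef(x, y)[0, 1]
    print(f"true {rho:+.1f}  sample {estimates[rho]:+.3f}")
```

To turn this into a visual quiz, plot `x` against `y` before revealing the number. With several series in a DataFrame, `pandas.plotting.scatter_matrix` draws every pairwise scatterplot at once.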

References