<!DOCTYPE html>
<html lang="en" xml:lang="en">
<head>
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<title>Chapter 2 Data Visualization | Statistical Inference via Data Science</title>
<meta name="description" content="An open-source and fully-reproducible electronic textbook for teaching statistical inference using tidyverse data science tools." />
<meta name="generator" content="bookdown 0.22 and GitBook 2.6.7" />
<meta property="og:title" content="Chapter 2 Data Visualization | Statistical Inference via Data Science" />
<meta property="og:type" content="book" />
<meta property="og:url" content="https://moderndive.com/" />
<meta property="og:image" content="https://moderndive.com//images/logos/book_cover.png" />
<meta property="og:description" content="An open-source and fully-reproducible electronic textbook for teaching statistical inference using tidyverse data science tools." />
<meta name="github-repo" content="moderndive/ModernDive_book" />
<meta name="twitter:card" content="summary" />
<meta name="twitter:title" content="Chapter 2 Data Visualization | Statistical Inference via Data Science" />
<meta name="twitter:site" content="@ModernDive" />
<meta name="twitter:description" content="An open-source and fully-reproducible electronic textbook for teaching statistical inference using tidyverse data science tools." />
<meta name="twitter:image" content="https://moderndive.com//images/logos/book_cover.png" />
<meta name="author" content="Chester Ismay and Albert Y. Kim; Foreword by Kelly S. McConville; Adapted by William R. Morgan" />
<meta name="date" content="2021-07-28" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black" />
<link rel="apple-touch-icon-precomposed" sizes="152x152" href="images/logos/favicons/apple-touch-icon.png" />
<link rel="shortcut icon" href="images/logos/favicons/favicon.ico" type="image/x-icon" />
<link rel="prev" href="1-getting-started.html"/>
<link rel="next" href="3-wrangling.html"/>
<script src="libs/header-attrs-2.9/header-attrs.js"></script>
<script src="libs/jquery-2.2.3/jquery.min.js"></script>
<link href="libs/gitbook-2.6.7/css/style.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-table.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-bookdown.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-highlight.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-search.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-fontsettings.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-clipboard.css" rel="stylesheet" />
<link href="libs/anchor-sections-1.0.1/anchor-sections.css" rel="stylesheet" />
<script src="libs/anchor-sections-1.0.1/anchor-sections.js"></script>
<script src="libs/kePrint-0.0.1/kePrint.js"></script>
<link href="libs/lightable-0.0.1/lightable.css" rel="stylesheet" />
<script src="libs/htmlwidgets-1.5.3/htmlwidgets.js"></script>
<link href="libs/dygraphs-1.1.1/dygraph.css" rel="stylesheet" />
<script src="libs/dygraphs-1.1.1/dygraph-combined.js"></script>
<script src="libs/dygraphs-1.1.1/shapes.js"></script>
<script src="libs/moment-2.8.4/moment.js"></script>
<script src="libs/moment-timezone-0.2.5/moment-timezone-with-data.js"></script>
<script src="libs/moment-fquarter-1.0.0/moment-fquarter.min.js"></script>
<script src="libs/dygraphs-binding-1.1.1.6/dygraphs.js"></script>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-89938436-1', 'auto');
ga('send', 'pageview');
</script>
<style type="text/css">
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
</style>
<style type="text/css">
/* Used with Pandoc 2.11+ new --citeproc when CSL is used */
div.csl-bib-body { }
div.csl-entry {
clear: both;
}
.hanging div.csl-entry {
margin-left:2em;
text-indent:-2em;
}
div.csl-left-margin {
min-width:2em;
float:left;
}
div.csl-right-inline {
margin-left:2em;
padding-left:1em;
}
div.csl-indent {
margin-left: 2em;
}
</style>
<link rel="stylesheet" href="style.css" type="text/css" />
</head>
<body>
<div class="book without-animation with-summary font-size-2 font-family-1" data-basepath=".">
<div class="book-summary">
<nav role="navigation">
<ul class="summary">
<li class="chapter" data-level="" data-path="index.html"><a href="index.html"><i class="fa fa-check"></i>Welcome to ModernDive</a></li>
<li class="chapter" data-level="" data-path="foreword.html"><a href="foreword.html"><i class="fa fa-check"></i>Foreword</a></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html"><i class="fa fa-check"></i>Preface</a>
<ul>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#introduction-for-students"><i class="fa fa-check"></i>Introduction for students</a>
<ul>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#what-we-hope-you-will-learn-from-this-book"><i class="fa fa-check"></i>What we hope you will learn from this book</a></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#datascience-pipeline"><i class="fa fa-check"></i>Data/science pipeline</a></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#reproducible-research"><i class="fa fa-check"></i>Reproducible research</a></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#final-note-for-students"><i class="fa fa-check"></i>Final note for students</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#introduction-for-instructors"><i class="fa fa-check"></i>Introduction for instructors</a>
<ul>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#resources"><i class="fa fa-check"></i>Resources</a></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#why-did-we-write-this-book"><i class="fa fa-check"></i>Why did we write this book?</a></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#who-is-this-book-for"><i class="fa fa-check"></i>Who is this book for?</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#connect-and-contribute"><i class="fa fa-check"></i>Connect and contribute</a></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#acknowledgements"><i class="fa fa-check"></i>Acknowledgements</a></li>
<li class="chapter" data-level="" data-path="preface.html"><a href="preface.html#about-this-book"><i class="fa fa-check"></i>About this book</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="about-the-authors.html"><a href="about-the-authors.html"><i class="fa fa-check"></i>About the authors</a></li>
<li class="chapter" data-level="1" data-path="1-getting-started.html"><a href="1-getting-started.html"><i class="fa fa-check"></i><b>1</b> Getting Started with Data in R</a>
<ul>
<li class="chapter" data-level="1.1" data-path="1-getting-started.html"><a href="1-getting-started.html#r-rstudio"><i class="fa fa-check"></i><b>1.1</b> What are R and RStudio?</a>
<ul>
<li class="chapter" data-level="1.1.1" data-path="1-getting-started.html"><a href="1-getting-started.html#installing"><i class="fa fa-check"></i><b>1.1.1</b> Installing R and RStudio</a></li>
<li class="chapter" data-level="1.1.2" data-path="1-getting-started.html"><a href="1-getting-started.html#using-r-via-rstudio"><i class="fa fa-check"></i><b>1.1.2</b> Using R via RStudio</a></li>
</ul></li>
<li class="chapter" data-level="1.2" data-path="1-getting-started.html"><a href="1-getting-started.html#code"><i class="fa fa-check"></i><b>1.2</b> How do I code in R?</a>
<ul>
<li class="chapter" data-level="1.2.1" data-path="1-getting-started.html"><a href="1-getting-started.html#programming-concepts"><i class="fa fa-check"></i><b>1.2.1</b> Basic programming concepts and terminology</a></li>
<li class="chapter" data-level="1.2.2" data-path="1-getting-started.html"><a href="1-getting-started.html#messages"><i class="fa fa-check"></i><b>1.2.2</b> Errors, warnings, and messages</a></li>
<li class="chapter" data-level="1.2.3" data-path="1-getting-started.html"><a href="1-getting-started.html#tips-code"><i class="fa fa-check"></i><b>1.2.3</b> Tips on learning to code</a></li>
</ul></li>
<li class="chapter" data-level="1.3" data-path="1-getting-started.html"><a href="1-getting-started.html#packages"><i class="fa fa-check"></i><b>1.3</b> What are R packages?</a>
<ul>
<li class="chapter" data-level="1.3.1" data-path="1-getting-started.html"><a href="1-getting-started.html#package-installation"><i class="fa fa-check"></i><b>1.3.1</b> Package installation</a></li>
<li class="chapter" data-level="1.3.2" data-path="1-getting-started.html"><a href="1-getting-started.html#package-loading"><i class="fa fa-check"></i><b>1.3.2</b> Package loading</a></li>
<li class="chapter" data-level="1.3.3" data-path="1-getting-started.html"><a href="1-getting-started.html#package-use"><i class="fa fa-check"></i><b>1.3.3</b> Package use</a></li>
</ul></li>
<li class="chapter" data-level="1.4" data-path="1-getting-started.html"><a href="1-getting-started.html#rfishbase"><i class="fa fa-check"></i><b>1.4</b> Explore your first datasets</a>
<ul>
<li class="chapter" data-level="1.4.1" data-path="1-getting-started.html"><a href="1-getting-started.html#rfishpackage"><i class="fa fa-check"></i><b>1.4.1</b> <code>rfishbase</code> package</a></li>
<li class="chapter" data-level="1.4.2" data-path="1-getting-started.html"><a href="1-getting-started.html#fishbasedataframe"><i class="fa fa-check"></i><b>1.4.2</b> <code>fishbase</code> data frame</a></li>
<li class="chapter" data-level="1.4.3" data-path="1-getting-started.html"><a href="1-getting-started.html#exploredataframes"><i class="fa fa-check"></i><b>1.4.3</b> Exploring data frames</a></li>
<li class="chapter" data-level="1.4.4" data-path="1-getting-started.html"><a href="1-getting-started.html#identification-vs-measurement-variables"><i class="fa fa-check"></i><b>1.4.4</b> Identification and measurement variables</a></li>
<li class="chapter" data-level="1.4.5" data-path="1-getting-started.html"><a href="1-getting-started.html#help-files"><i class="fa fa-check"></i><b>1.4.5</b> Help files</a></li>
</ul></li>
<li class="chapter" data-level="1.5" data-path="1-getting-started.html"><a href="1-getting-started.html#conclusion"><i class="fa fa-check"></i><b>1.5</b> Conclusion</a>
<ul>
<li class="chapter" data-level="1.5.1" data-path="1-getting-started.html"><a href="1-getting-started.html#additional-resources"><i class="fa fa-check"></i><b>1.5.1</b> Additional resources</a></li>
<li class="chapter" data-level="1.5.2" data-path="1-getting-started.html"><a href="1-getting-started.html#whats-to-come"><i class="fa fa-check"></i><b>1.5.2</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="part"><span><b>I Data Science with tidyverse</b></span></li>
<li class="chapter" data-level="2" data-path="2-viz.html"><a href="2-viz.html"><i class="fa fa-check"></i><b>2</b> Data Visualization</a>
<ul>
<li class="chapter" data-level="" data-path="2-viz.html"><a href="2-viz.html#needed-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="2.1" data-path="2-viz.html"><a href="2-viz.html#grammarofgraphics"><i class="fa fa-check"></i><b>2.1</b> The grammar of graphics</a>
<ul>
<li class="chapter" data-level="2.1.1" data-path="2-viz.html"><a href="2-viz.html#components-of-the-grammar"><i class="fa fa-check"></i><b>2.1.1</b> Components of the grammar</a></li>
<li class="chapter" data-level="2.1.2" data-path="2-viz.html"><a href="2-viz.html#gapminder"><i class="fa fa-check"></i><b>2.1.2</b> Gapminder data</a></li>
<li class="chapter" data-level="2.1.3" data-path="2-viz.html"><a href="2-viz.html#other-components"><i class="fa fa-check"></i><b>2.1.3</b> Other components</a></li>
<li class="chapter" data-level="2.1.4" data-path="2-viz.html"><a href="2-viz.html#ggplot2-package"><i class="fa fa-check"></i><b>2.1.4</b> ggplot2 package</a></li>
</ul></li>
<li class="chapter" data-level="2.2" data-path="2-viz.html"><a href="2-viz.html#FiveNG"><i class="fa fa-check"></i><b>2.2</b> Five named graphs - the 5NG</a></li>
<li class="chapter" data-level="2.3" data-path="2-viz.html"><a href="2-viz.html#scatterplots"><i class="fa fa-check"></i><b>2.3</b> 5NG#1: Scatterplots</a>
<ul>
<li class="chapter" data-level="2.3.1" data-path="2-viz.html"><a href="2-viz.html#geompoint"><i class="fa fa-check"></i><b>2.3.1</b> Scatterplots via <code>geom_point</code></a></li>
<li class="chapter" data-level="2.3.2" data-path="2-viz.html"><a href="2-viz.html#overplotting"><i class="fa fa-check"></i><b>2.3.2</b> Overplotting</a></li>
<li class="chapter" data-level="2.3.3" data-path="2-viz.html"><a href="2-viz.html#summary"><i class="fa fa-check"></i><b>2.3.3</b> Summary</a></li>
</ul></li>
<li class="chapter" data-level="2.4" data-path="2-viz.html"><a href="2-viz.html#linegraphs"><i class="fa fa-check"></i><b>2.4</b> 5NG#2: Linegraphs</a>
<ul>
<li class="chapter" data-level="2.4.1" data-path="2-viz.html"><a href="2-viz.html#geomline"><i class="fa fa-check"></i><b>2.4.1</b> Linegraphs via <code>geom_line</code></a></li>
<li class="chapter" data-level="2.4.2" data-path="2-viz.html"><a href="2-viz.html#summary-1"><i class="fa fa-check"></i><b>2.4.2</b> Summary</a></li>
</ul></li>
<li class="chapter" data-level="2.5" data-path="2-viz.html"><a href="2-viz.html#facets"><i class="fa fa-check"></i><b>2.5</b> Facets</a></li>
<li class="chapter" data-level="2.6" data-path="2-viz.html"><a href="2-viz.html#histograms"><i class="fa fa-check"></i><b>2.6</b> 5NG#3: Histograms</a>
<ul>
<li class="chapter" data-level="2.6.1" data-path="2-viz.html"><a href="2-viz.html#geomhistogram"><i class="fa fa-check"></i><b>2.6.1</b> Histograms via <code>geom_histogram</code></a></li>
<li class="chapter" data-level="2.6.2" data-path="2-viz.html"><a href="2-viz.html#adjustbins"><i class="fa fa-check"></i><b>2.6.2</b> Adjusting the bins</a></li>
<li class="chapter" data-level="2.6.3" data-path="2-viz.html"><a href="2-viz.html#summary-2"><i class="fa fa-check"></i><b>2.6.3</b> Summary</a></li>
</ul></li>
<li class="chapter" data-level="2.7" data-path="2-viz.html"><a href="2-viz.html#boxplots"><i class="fa fa-check"></i><b>2.7</b> 5NG#4: Boxplots</a>
<ul>
<li class="chapter" data-level="2.7.1" data-path="2-viz.html"><a href="2-viz.html#geomboxplot"><i class="fa fa-check"></i><b>2.7.1</b> Boxplots via <code>geom_boxplot</code></a></li>
<li class="chapter" data-level="2.7.2" data-path="2-viz.html"><a href="2-viz.html#summary-3"><i class="fa fa-check"></i><b>2.7.2</b> Summary</a></li>
</ul></li>
<li class="chapter" data-level="2.8" data-path="2-viz.html"><a href="2-viz.html#geombar"><i class="fa fa-check"></i><b>2.8</b> 5NG#5: Barplots</a>
<ul>
<li class="chapter" data-level="2.8.1" data-path="2-viz.html"><a href="2-viz.html#barplots-via-geom_bar-or-geom_col"><i class="fa fa-check"></i><b>2.8.1</b> Barplots via <code>geom_bar</code> or <code>geom_col</code></a></li>
<li class="chapter" data-level="2.8.2" data-path="2-viz.html"><a href="2-viz.html#must-avoid-pie-charts"><i class="fa fa-check"></i><b>2.8.2</b> Must avoid pie charts!</a></li>
<li class="chapter" data-level="2.8.3" data-path="2-viz.html"><a href="2-viz.html#two-categ-barplot"><i class="fa fa-check"></i><b>2.8.3</b> Two categorical variables</a></li>
<li class="chapter" data-level="2.8.4" data-path="2-viz.html"><a href="2-viz.html#summary-4"><i class="fa fa-check"></i><b>2.8.4</b> Summary</a></li>
</ul></li>
<li class="chapter" data-level="2.9" data-path="2-viz.html"><a href="2-viz.html#data-vis-conclusion"><i class="fa fa-check"></i><b>2.9</b> Conclusion</a>
<ul>
<li class="chapter" data-level="2.9.1" data-path="2-viz.html"><a href="2-viz.html#summary-table"><i class="fa fa-check"></i><b>2.9.1</b> Summary table</a></li>
<li class="chapter" data-level="2.9.2" data-path="2-viz.html"><a href="2-viz.html#function-argument-specification"><i class="fa fa-check"></i><b>2.9.2</b> Function argument specification</a></li>
<li class="chapter" data-level="2.9.3" data-path="2-viz.html"><a href="2-viz.html#additional-resources-1"><i class="fa fa-check"></i><b>2.9.3</b> Additional resources</a></li>
<li class="chapter" data-level="2.9.4" data-path="2-viz.html"><a href="2-viz.html#whats-to-come-3"><i class="fa fa-check"></i><b>2.9.4</b> What’s to come</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="3" data-path="3-wrangling.html"><a href="3-wrangling.html"><i class="fa fa-check"></i><b>3</b> Data Wrangling</a>
<ul>
<li class="chapter" data-level="" data-path="3-wrangling.html"><a href="3-wrangling.html#wrangling-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="3.1" data-path="3-wrangling.html"><a href="3-wrangling.html#piping"><i class="fa fa-check"></i><b>3.1</b> The pipe operator: <code>%&gt;%</code></a></li>
<li class="chapter" data-level="3.2" data-path="3-wrangling.html"><a href="3-wrangling.html#filter"><i class="fa fa-check"></i><b>3.2</b> <code>filter</code> rows</a></li>
<li class="chapter" data-level="3.3" data-path="3-wrangling.html"><a href="3-wrangling.html#slice-rows"><i class="fa fa-check"></i><b>3.3</b> <code>slice</code> rows</a></li>
<li class="chapter" data-level="3.4" data-path="3-wrangling.html"><a href="3-wrangling.html#select"><i class="fa fa-check"></i><b>3.4</b> <code>select</code> variables</a>
<ul>
<li class="chapter" data-level="3.4.1" data-path="3-wrangling.html"><a href="3-wrangling.html#rename"><i class="fa fa-check"></i><b>3.4.1</b> <code>rename</code> variables</a></li>
</ul></li>
<li class="chapter" data-level="3.5" data-path="3-wrangling.html"><a href="3-wrangling.html#summarize"><i class="fa fa-check"></i><b>3.5</b> <code>summarize</code> variables</a></li>
<li class="chapter" data-level="3.6" data-path="3-wrangling.html"><a href="3-wrangling.html#groupby"><i class="fa fa-check"></i><b>3.6</b> <code>group_by</code> rows</a>
<ul>
<li class="chapter" data-level="3.6.1" data-path="3-wrangling.html"><a href="3-wrangling.html#grouping-by-more-than-one-variable"><i class="fa fa-check"></i><b>3.6.1</b> Grouping by more than one variable</a></li>
</ul></li>
<li class="chapter" data-level="3.7" data-path="3-wrangling.html"><a href="3-wrangling.html#mutate"><i class="fa fa-check"></i><b>3.7</b> <code>mutate</code> existing variables</a></li>
<li class="chapter" data-level="3.8" data-path="3-wrangling.html"><a href="3-wrangling.html#arrange"><i class="fa fa-check"></i><b>3.8</b> <code>arrange</code> and sort rows</a></li>
<li class="chapter" data-level="3.9" data-path="3-wrangling.html"><a href="3-wrangling.html#joins"><i class="fa fa-check"></i><b>3.9</b> <code>join</code> data frames</a></li>
<li class="chapter" data-level="3.10" data-path="3-wrangling.html"><a href="3-wrangling.html#wrangling-conclusion"><i class="fa fa-check"></i><b>3.10</b> Conclusion</a>
<ul>
<li class="chapter" data-level="3.10.1" data-path="3-wrangling.html"><a href="3-wrangling.html#summary-table-1"><i class="fa fa-check"></i><b>3.10.1</b> Summary table</a></li>
<li class="chapter" data-level="3.10.2" data-path="3-wrangling.html"><a href="3-wrangling.html#additional-resources-2"><i class="fa fa-check"></i><b>3.10.2</b> Additional resources</a></li>
<li class="chapter" data-level="3.10.3" data-path="3-wrangling.html"><a href="3-wrangling.html#whats-to-come-1"><i class="fa fa-check"></i><b>3.10.3</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="4" data-path="4-tidy.html"><a href="4-tidy.html"><i class="fa fa-check"></i><b>4</b> Data Importing and “Tidy” Data</a>
<ul>
<li class="chapter" data-level="" data-path="4-tidy.html"><a href="4-tidy.html#tidy-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="4.1" data-path="4-tidy.html"><a href="4-tidy.html#csv"><i class="fa fa-check"></i><b>4.1</b> Importing data</a>
<ul>
<li class="chapter" data-level="4.1.1" data-path="4-tidy.html"><a href="4-tidy.html#using-the-console"><i class="fa fa-check"></i><b>4.1.1</b> Using the console</a></li>
<li class="chapter" data-level="4.1.2" data-path="4-tidy.html"><a href="4-tidy.html#using-rstudios-interface"><i class="fa fa-check"></i><b>4.1.2</b> Using RStudio’s interface</a></li>
</ul></li>
<li class="chapter" data-level="4.2" data-path="4-tidy.html"><a href="4-tidy.html#tidy-data-ex"><i class="fa fa-check"></i><b>4.2</b> “Tidy” data</a>
<ul>
<li class="chapter" data-level="4.2.1" data-path="4-tidy.html"><a href="4-tidy.html#tidy-definition"><i class="fa fa-check"></i><b>4.2.1</b> Definition of “tidy” data</a></li>
<li class="chapter" data-level="4.2.2" data-path="4-tidy.html"><a href="4-tidy.html#converting-to-tidy-data"><i class="fa fa-check"></i><b>4.2.2</b> Converting to “tidy” data</a></li>
</ul></li>
<li class="chapter" data-level="4.3" data-path="4-tidy.html"><a href="4-tidy.html#case-study-tidy"><i class="fa fa-check"></i><b>4.3</b> Case study: Weight loss data</a></li>
<li class="chapter" data-level="4.4" data-path="4-tidy.html"><a href="4-tidy.html#tidyverse-package"><i class="fa fa-check"></i><b>4.4</b> <code>tidyverse</code> package</a></li>
<li class="chapter" data-level="4.5" data-path="4-tidy.html"><a href="4-tidy.html#tidy-data-conclusion"><i class="fa fa-check"></i><b>4.5</b> Conclusion</a>
<ul>
<li class="chapter" data-level="4.5.1" data-path="4-tidy.html"><a href="4-tidy.html#additional-resources-3"><i class="fa fa-check"></i><b>4.5.1</b> Additional resources</a></li>
<li class="chapter" data-level="4.5.2" data-path="4-tidy.html"><a href="4-tidy.html#whats-to-come-2"><i class="fa fa-check"></i><b>4.5.2</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="part"><span><b>II Data Modeling with moderndive</b></span></li>
<li class="chapter" data-level="5" data-path="5-regression.html"><a href="5-regression.html"><i class="fa fa-check"></i><b>5</b> Basic Regression</a>
<ul>
<li class="chapter" data-level="" data-path="5-regression.html"><a href="5-regression.html#reg-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="5.1" data-path="5-regression.html"><a href="5-regression.html#model1"><i class="fa fa-check"></i><b>5.1</b> One numerical explanatory variable</a>
<ul>
<li class="chapter" data-level="5.1.1" data-path="5-regression.html"><a href="5-regression.html#model1EDA"><i class="fa fa-check"></i><b>5.1.1</b> Exploratory data analysis</a></li>
<li class="chapter" data-level="5.1.2" data-path="5-regression.html"><a href="5-regression.html#model1table"><i class="fa fa-check"></i><b>5.1.2</b> Simple linear regression</a></li>
<li class="chapter" data-level="5.1.3" data-path="5-regression.html"><a href="5-regression.html#model1points"><i class="fa fa-check"></i><b>5.1.3</b> Observed/fitted values and residuals</a></li>
</ul></li>
<li class="chapter" data-level="5.2" data-path="5-regression.html"><a href="5-regression.html#model2"><i class="fa fa-check"></i><b>5.2</b> One categorical explanatory variable</a>
<ul>
<li class="chapter" data-level="5.2.1" data-path="5-regression.html"><a href="5-regression.html#model2EDA"><i class="fa fa-check"></i><b>5.2.1</b> Exploratory data analysis</a></li>
<li class="chapter" data-level="5.2.2" data-path="5-regression.html"><a href="5-regression.html#model2table"><i class="fa fa-check"></i><b>5.2.2</b> Linear regression</a></li>
<li class="chapter" data-level="5.2.3" data-path="5-regression.html"><a href="5-regression.html#model2points"><i class="fa fa-check"></i><b>5.2.3</b> Observed/fitted values and residuals</a></li>
</ul></li>
<li class="chapter" data-level="5.3" data-path="5-regression.html"><a href="5-regression.html#reg-related-topics"><i class="fa fa-check"></i><b>5.3</b> Related topics</a>
<ul>
<li class="chapter" data-level="5.3.1" data-path="5-regression.html"><a href="5-regression.html#correlation-is-not-causation"><i class="fa fa-check"></i><b>5.3.1</b> Correlation is not necessarily causation</a></li>
<li class="chapter" data-level="5.3.2" data-path="5-regression.html"><a href="5-regression.html#leastsquares"><i class="fa fa-check"></i><b>5.3.2</b> Best-fitting line</a></li>
<li class="chapter" data-level="5.3.3" data-path="5-regression.html"><a href="5-regression.html#underthehood"><i class="fa fa-check"></i><b>5.3.3</b> <code>get_regression_x()</code> functions</a></li>
</ul></li>
<li class="chapter" data-level="5.4" data-path="5-regression.html"><a href="5-regression.html#reg-conclusion"><i class="fa fa-check"></i><b>5.4</b> Conclusion</a>
<ul>
<li class="chapter" data-level="5.4.1" data-path="5-regression.html"><a href="5-regression.html#additional-resources-basic-regression"><i class="fa fa-check"></i><b>5.4.1</b> Additional resources</a></li>
<li class="chapter" data-level="5.4.2" data-path="5-regression.html"><a href="5-regression.html#whats-to-come-4"><i class="fa fa-check"></i><b>5.4.2</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="6" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html"><i class="fa fa-check"></i><b>6</b> Multiple Regression</a>
<ul>
<li class="chapter" data-level="" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#mult-reg-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="6.1" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model4"><i class="fa fa-check"></i><b>6.1</b> One numerical and one categorical explanatory variable</a>
<ul>
<li class="chapter" data-level="6.1.1" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model4EDA"><i class="fa fa-check"></i><b>6.1.1</b> Exploratory data analysis</a></li>
<li class="chapter" data-level="6.1.2" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model4interactiontable"><i class="fa fa-check"></i><b>6.1.2</b> Interaction model</a></li>
<li class="chapter" data-level="6.1.3" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model4table"><i class="fa fa-check"></i><b>6.1.3</b> Parallel slopes model</a></li>
<li class="chapter" data-level="6.1.4" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model4points"><i class="fa fa-check"></i><b>6.1.4</b> Observed/fitted values and residuals</a></li>
</ul></li>
<li class="chapter" data-level="6.2" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model3"><i class="fa fa-check"></i><b>6.2</b> Two categorical explanatory variables</a>
<ul>
<li class="chapter" data-level="6.2.1" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model3EDA"><i class="fa fa-check"></i><b>6.2.1</b> Exploratory data analysis</a></li>
<li class="chapter" data-level="6.2.2" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model3table"><i class="fa fa-check"></i><b>6.2.2</b> Regression lines</a></li>
<li class="chapter" data-level="6.2.3" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model3points"><i class="fa fa-check"></i><b>6.2.3</b> Observed/fitted values and residuals</a></li>
</ul></li>
<li class="chapter" data-level="6.3" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#mult-reg-related-topics"><i class="fa fa-check"></i><b>6.3</b> Related topics</a>
<ul>
<li class="chapter" data-level="6.3.1" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#model-selection"><i class="fa fa-check"></i><b>6.3.1</b> Model selection using visualizations</a></li>
<li class="chapter" data-level="6.3.2" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#rsquared"><i class="fa fa-check"></i><b>6.3.2</b> Model selection using R-squared</a></li>
</ul></li>
<li class="chapter" data-level="6.4" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#mult-reg-conclusion"><i class="fa fa-check"></i><b>6.4</b> Conclusion</a>
<ul>
<li class="chapter" data-level="6.4.1" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#additional-resources-4"><i class="fa fa-check"></i><b>6.4.1</b> Additional resources</a></li>
<li class="chapter" data-level="6.4.2" data-path="6-multiple-regression.html"><a href="6-multiple-regression.html#whats-to-come-5"><i class="fa fa-check"></i><b>6.4.2</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="part"><span><b>III Statistical Inference with infer</b></span></li>
<li class="chapter" data-level="7" data-path="7-sampling.html"><a href="7-sampling.html"><i class="fa fa-check"></i><b>7</b> Sampling</a>
<ul>
<li class="chapter" data-level="" data-path="7-sampling.html"><a href="7-sampling.html#sampling-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="7.1" data-path="7-sampling.html"><a href="7-sampling.html#sampling-activity"><i class="fa fa-check"></i><b>7.1</b> Sampling bowl activity</a>
<ul>
<li class="chapter" data-level="7.1.1" data-path="7-sampling.html"><a href="7-sampling.html#what-proportion-of-this-bowls-balls-are-red"><i class="fa fa-check"></i><b>7.1.1</b> What proportion of this bowl’s balls are red?</a></li>
<li class="chapter" data-level="7.1.2" data-path="7-sampling.html"><a href="7-sampling.html#using-the-shovel-once"><i class="fa fa-check"></i><b>7.1.2</b> Using the shovel once</a></li>
<li class="chapter" data-level="7.1.3" data-path="7-sampling.html"><a href="7-sampling.html#student-shovels"><i class="fa fa-check"></i><b>7.1.3</b> Using the shovel 33 times</a></li>
<li class="chapter" data-level="7.1.4" data-path="7-sampling.html"><a href="7-sampling.html#sampling-what-did-we-just-do"><i class="fa fa-check"></i><b>7.1.4</b> What did we just do?</a></li>
</ul></li>
<li class="chapter" data-level="7.2" data-path="7-sampling.html"><a href="7-sampling.html#sampling-simulation"><i class="fa fa-check"></i><b>7.2</b> Virtual sampling</a>
<ul>
<li class="chapter" data-level="7.2.1" data-path="7-sampling.html"><a href="7-sampling.html#using-the-virtual-shovel-once"><i class="fa fa-check"></i><b>7.2.1</b> Using the virtual shovel once</a></li>
</ul></li>
<li class="chapter" data-level="7.3" data-path="7-sampling.html"><a href="7-sampling.html#sampling-framework"><i class="fa fa-check"></i><b>7.3</b> Sampling framework</a>
<ul>
<li class="chapter" data-level="7.3.1" data-path="7-sampling.html"><a href="7-sampling.html#terminology-and-notation"><i class="fa fa-check"></i><b>7.3.1</b> Terminology and notation</a></li>
<li class="chapter" data-level="7.3.2" data-path="7-sampling.html"><a href="7-sampling.html#sampling-definitions"><i class="fa fa-check"></i><b>7.3.2</b> Statistical definitions</a></li>
<li class="chapter" data-level="7.3.3" data-path="7-sampling.html"><a href="7-sampling.html#moral-of-the-story"><i class="fa fa-check"></i><b>7.3.3</b> The moral of the story</a></li>
</ul></li>
<li class="chapter" data-level="7.4" data-path="7-sampling.html"><a href="7-sampling.html#sampling-case-study"><i class="fa fa-check"></i><b>7.4</b> Case study: Polls</a></li>
<li class="chapter" data-level="7.5" data-path="7-sampling.html"><a href="7-sampling.html#sampling-conclusion-central-limit-theorem"><i class="fa fa-check"></i><b>7.5</b> Central Limit Theorem</a></li>
<li class="chapter" data-level="7.6" data-path="7-sampling.html"><a href="7-sampling.html#sampling-conclusion"><i class="fa fa-check"></i><b>7.6</b> Conclusion</a>
<ul>
<li class="chapter" data-level="7.6.1" data-path="7-sampling.html"><a href="7-sampling.html#sampling-conclusion-table"><i class="fa fa-check"></i><b>7.6.1</b> Sampling scenarios</a></li>
<li class="chapter" data-level="7.6.2" data-path="7-sampling.html"><a href="7-sampling.html#additional-resources-5"><i class="fa fa-check"></i><b>7.6.2</b> Additional resources</a></li>
<li class="chapter" data-level="7.6.3" data-path="7-sampling.html"><a href="7-sampling.html#whats-to-come-6"><i class="fa fa-check"></i><b>7.6.3</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="8" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html"><i class="fa fa-check"></i><b>8</b> Bootstrapping and Confidence Intervals</a>
<ul>
<li class="chapter" data-level="" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#CI-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="8.1" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#resampling-tactile"><i class="fa fa-check"></i><b>8.1</b> Pennies activity</a>
<ul>
<li class="chapter" data-level="8.1.1" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#what-is-the-average-year-on-us-pennies-in-2019"><i class="fa fa-check"></i><b>8.1.1</b> What is the average year on US pennies in 2019?</a></li>
<li class="chapter" data-level="8.1.2" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#resampling-once"><i class="fa fa-check"></i><b>8.1.2</b> Resampling once</a></li>
<li class="chapter" data-level="8.1.3" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#student-resamples"><i class="fa fa-check"></i><b>8.1.3</b> Resampling 35 times</a></li>
<li class="chapter" data-level="8.1.4" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#ci-what-did-we-just-do"><i class="fa fa-check"></i><b>8.1.4</b> What did we just do?</a></li>
</ul></li>
<li class="chapter" data-level="8.2" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#resampling-simulation"><i class="fa fa-check"></i><b>8.2</b> Computer simulation of resampling</a>
<ul>
<li class="chapter" data-level="8.2.1" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#virtually-resampling-once"><i class="fa fa-check"></i><b>8.2.1</b> Virtually resampling once</a></li>
<li class="chapter" data-level="8.2.2" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#bootstrap-35-replicates"><i class="fa fa-check"></i><b>8.2.2</b> Virtually resampling 35 times</a></li>
<li class="chapter" data-level="8.2.3" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#bootstrap-1000-replicates"><i class="fa fa-check"></i><b>8.2.3</b> Virtually resampling 1000 times</a></li>
</ul></li>
<li class="chapter" data-level="8.3" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#ci-build-up"><i class="fa fa-check"></i><b>8.3</b> Understanding confidence intervals</a>
<ul>
<li class="chapter" data-level="8.3.1" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#percentile-method"><i class="fa fa-check"></i><b>8.3.1</b> Percentile method</a></li>
<li class="chapter" data-level="8.3.2" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#se-method"><i class="fa fa-check"></i><b>8.3.2</b> Standard error method</a></li>
</ul></li>
<li class="chapter" data-level="8.4" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#bootstrap-process"><i class="fa fa-check"></i><b>8.4</b> Constructing confidence intervals</a>
<ul>
<li class="chapter" data-level="8.4.1" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#original-workflow"><i class="fa fa-check"></i><b>8.4.1</b> Original workflow</a></li>
<li class="chapter" data-level="8.4.2" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#infer-workflow"><i class="fa fa-check"></i><b>8.4.2</b> <code>infer</code> package workflow</a></li>
<li class="chapter" data-level="8.4.3" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#percentile-method-infer"><i class="fa fa-check"></i><b>8.4.3</b> Percentile method with <code>infer</code></a></li>
<li class="chapter" data-level="8.4.4" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#infer-se"><i class="fa fa-check"></i><b>8.4.4</b> Standard error method with <code>infer</code></a></li>
</ul></li>
<li class="chapter" data-level="8.5" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#one-prop-ci"><i class="fa fa-check"></i><b>8.5</b> Interpreting confidence intervals</a>
<ul>
<li class="chapter" data-level="8.5.1" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#ilyas-yohan"><i class="fa fa-check"></i><b>8.5.1</b> Did the net capture the fish?</a></li>
<li class="chapter" data-level="8.5.2" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#shorthand"><i class="fa fa-check"></i><b>8.5.2</b> Precise and shorthand interpretation</a></li>
<li class="chapter" data-level="8.5.3" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#ci-width"><i class="fa fa-check"></i><b>8.5.3</b> Width of confidence intervals</a></li>
</ul></li>
<li class="chapter" data-level="8.6" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#case-study-two-prop-ci"><i class="fa fa-check"></i><b>8.6</b> Case study: Is yawning contagious?</a>
<ul>
<li class="chapter" data-level="8.6.1" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#mythbusters-study-data"><i class="fa fa-check"></i><b>8.6.1</b> <em>Mythbusters</em> study data</a></li>
<li class="chapter" data-level="8.6.2" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#sampling-scenario"><i class="fa fa-check"></i><b>8.6.2</b> Sampling scenario</a></li>
<li class="chapter" data-level="8.6.3" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#ci-build"><i class="fa fa-check"></i><b>8.6.3</b> Constructing the confidence interval</a></li>
<li class="chapter" data-level="8.6.4" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#interpreting-the-confidence-interval"><i class="fa fa-check"></i><b>8.6.4</b> Interpreting the confidence interval</a></li>
</ul></li>
<li class="chapter" data-level="8.7" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#ci-conclusion"><i class="fa fa-check"></i><b>8.7</b> Conclusion</a>
<ul>
<li class="chapter" data-level="8.7.1" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#bootstrap-vs-sampling"><i class="fa fa-check"></i><b>8.7.1</b> Comparing bootstrap and sampling distributions</a></li>
<li class="chapter" data-level="8.7.2" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#theory-ci"><i class="fa fa-check"></i><b>8.7.2</b> Theory-based confidence intervals</a></li>
<li class="chapter" data-level="8.7.3" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#additional-resources-6"><i class="fa fa-check"></i><b>8.7.3</b> Additional resources</a></li>
<li class="chapter" data-level="8.7.4" data-path="8-confidence-intervals.html"><a href="8-confidence-intervals.html#whats-to-come-7"><i class="fa fa-check"></i><b>8.7.4</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="9" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html"><i class="fa fa-check"></i><b>9</b> Hypothesis Testing</a>
<ul>
<li class="chapter" data-level="" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#nhst-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="9.1" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#ht-activity"><i class="fa fa-check"></i><b>9.1</b> Promotions activity</a>
<ul>
<li class="chapter" data-level="9.1.1" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#does-gender-affect-promotions-at-a-bank"><i class="fa fa-check"></i><b>9.1.1</b> Does gender affect promotions at a bank?</a></li>
<li class="chapter" data-level="9.1.2" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#shuffling-once"><i class="fa fa-check"></i><b>9.1.2</b> Shuffling once</a></li>
<li class="chapter" data-level="9.1.3" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#shuffling-16-times"><i class="fa fa-check"></i><b>9.1.3</b> Shuffling 16 times</a></li>
<li class="chapter" data-level="9.1.4" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#ht-what-did-we-just-do"><i class="fa fa-check"></i><b>9.1.4</b> What did we just do?</a></li>
</ul></li>
<li class="chapter" data-level="9.2" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#understanding-ht"><i class="fa fa-check"></i><b>9.2</b> Understanding hypothesis tests</a></li>
<li class="chapter" data-level="9.3" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#ht-infer"><i class="fa fa-check"></i><b>9.3</b> Conducting hypothesis tests</a>
<ul>
<li class="chapter" data-level="9.3.1" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#infer-workflow-ht"><i class="fa fa-check"></i><b>9.3.1</b> <code>infer</code> package workflow</a></li>
<li class="chapter" data-level="9.3.2" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#comparing-infer-workflows"><i class="fa fa-check"></i><b>9.3.2</b> Comparison with confidence intervals</a></li>
<li class="chapter" data-level="9.3.3" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#only-one-test"><i class="fa fa-check"></i><b>9.3.3</b> “There is only one test”</a></li>
</ul></li>
<li class="chapter" data-level="9.4" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#ht-interpretation"><i class="fa fa-check"></i><b>9.4</b> Interpreting hypothesis tests</a>
<ul>
<li class="chapter" data-level="9.4.1" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#trial"><i class="fa fa-check"></i><b>9.4.1</b> Two possible outcomes</a></li>
<li class="chapter" data-level="9.4.2" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#types-of-errors"><i class="fa fa-check"></i><b>9.4.2</b> Types of errors</a></li>
<li class="chapter" data-level="9.4.3" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#choosing-alpha"><i class="fa fa-check"></i><b>9.4.3</b> How do we choose alpha?</a></li>
</ul></li>
<li class="chapter" data-level="9.5" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#ht-case-study"><i class="fa fa-check"></i><b>9.5</b> Case study: Are action or romance movies rated higher?</a>
<ul>
<li class="chapter" data-level="9.5.1" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#imdb-data"><i class="fa fa-check"></i><b>9.5.1</b> IMDb ratings data</a></li>
<li class="chapter" data-level="9.5.2" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#sampling-scenario-1"><i class="fa fa-check"></i><b>9.5.2</b> Sampling scenario</a></li>
<li class="chapter" data-level="9.5.3" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#conducting-the-hypothesis-test"><i class="fa fa-check"></i><b>9.5.3</b> Conducting the hypothesis test</a></li>
</ul></li>
<li class="chapter" data-level="9.6" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#nhst-conclusion"><i class="fa fa-check"></i><b>9.6</b> Conclusion</a>
<ul>
<li class="chapter" data-level="9.6.1" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#theory-hypo"><i class="fa fa-check"></i><b>9.6.1</b> Theory-based hypothesis tests</a></li>
<li class="chapter" data-level="9.6.2" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#when-inference-is-not-needed"><i class="fa fa-check"></i><b>9.6.2</b> When inference is not needed</a></li>
<li class="chapter" data-level="9.6.3" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#problems-with-p-values"><i class="fa fa-check"></i><b>9.6.3</b> Problems with p-values</a></li>
<li class="chapter" data-level="9.6.4" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#additional-resources-7"><i class="fa fa-check"></i><b>9.6.4</b> Additional resources</a></li>
<li class="chapter" data-level="9.6.5" data-path="9-hypothesis-testing.html"><a href="9-hypothesis-testing.html#whats-to-come-8"><i class="fa fa-check"></i><b>9.6.5</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="10" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html"><i class="fa fa-check"></i><b>10</b> Inference for Regression</a>
<ul>
<li class="chapter" data-level="" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#inf-packages"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="10.1" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#regression-refresher"><i class="fa fa-check"></i><b>10.1</b> Regression refresher</a>
<ul>
<li class="chapter" data-level="10.1.1" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#teaching-evaluations-analysis"><i class="fa fa-check"></i><b>10.1.1</b> Teaching evaluations analysis</a></li>
<li class="chapter" data-level="10.1.2" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#sampling-scenario-2"><i class="fa fa-check"></i><b>10.1.2</b> Sampling scenario</a></li>
</ul></li>
<li class="chapter" data-level="10.2" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#regression-interp"><i class="fa fa-check"></i><b>10.2</b> Interpreting regression tables</a>
<ul>
<li class="chapter" data-level="10.2.1" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#regression-se"><i class="fa fa-check"></i><b>10.2.1</b> Standard error</a></li>
<li class="chapter" data-level="10.2.2" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#regression-test-statistic"><i class="fa fa-check"></i><b>10.2.2</b> Test statistic</a></li>
<li class="chapter" data-level="10.2.3" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#p-value"><i class="fa fa-check"></i><b>10.2.3</b> p-value</a></li>
<li class="chapter" data-level="10.2.4" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#confidence-interval"><i class="fa fa-check"></i><b>10.2.4</b> Confidence interval</a></li>
<li class="chapter" data-level="10.2.5" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#regression-table-computation"><i class="fa fa-check"></i><b>10.2.5</b> How does R compute the table?</a></li>
</ul></li>
<li class="chapter" data-level="10.3" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#regression-conditions"><i class="fa fa-check"></i><b>10.3</b> Conditions for inference for regression</a>
<ul>
<li class="chapter" data-level="10.3.1" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#residuals-refresher"><i class="fa fa-check"></i><b>10.3.1</b> Residuals refresher</a></li>
<li class="chapter" data-level="10.3.2" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#linearity-of-relationship"><i class="fa fa-check"></i><b>10.3.2</b> Linearity of relationship</a></li>
<li class="chapter" data-level="10.3.3" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#independence-of-residuals"><i class="fa fa-check"></i><b>10.3.3</b> Independence of residuals</a></li>
<li class="chapter" data-level="10.3.4" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#normality-of-residuals"><i class="fa fa-check"></i><b>10.3.4</b> Normality of residuals</a></li>
<li class="chapter" data-level="10.3.5" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#equality-of-variance"><i class="fa fa-check"></i><b>10.3.5</b> Equality of variance</a></li>
<li class="chapter" data-level="10.3.6" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#what-is-the-conclusion"><i class="fa fa-check"></i><b>10.3.6</b> What’s the conclusion?</a></li>
</ul></li>
<li class="chapter" data-level="10.4" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#infer-regression"><i class="fa fa-check"></i><b>10.4</b> Simulation-based inference for regression</a>
<ul>
<li class="chapter" data-level="10.4.1" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#confidence-interval-for-slope"><i class="fa fa-check"></i><b>10.4.1</b> Confidence interval for slope</a></li>
<li class="chapter" data-level="10.4.2" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#hypothesis-test-for-slope"><i class="fa fa-check"></i><b>10.4.2</b> Hypothesis test for slope</a></li>
</ul></li>
<li class="chapter" data-level="10.5" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#inference-conclusion"><i class="fa fa-check"></i><b>10.5</b> Conclusion</a>
<ul>
<li class="chapter" data-level="10.5.1" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#theory-regression"><i class="fa fa-check"></i><b>10.5.1</b> Theory-based inference for regression</a></li>
<li class="chapter" data-level="10.5.2" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#summary-of-statistical-inference"><i class="fa fa-check"></i><b>10.5.2</b> Summary of statistical inference</a></li>
<li class="chapter" data-level="10.5.3" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#additional-resources-8"><i class="fa fa-check"></i><b>10.5.3</b> Additional resources</a></li>
<li class="chapter" data-level="10.5.4" data-path="10-inference-for-regression.html"><a href="10-inference-for-regression.html#whats-to-come-9"><i class="fa fa-check"></i><b>10.5.4</b> What’s to come?</a></li>
</ul></li>
</ul></li>
<li class="part"><span><b>IV Conclusion</b></span></li>
<li class="chapter" data-level="11" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html"><i class="fa fa-check"></i><b>11</b> Tell Your Story with Data</a>
<ul>
<li class="chapter" data-level="11.1" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#review"><i class="fa fa-check"></i><b>11.1</b> Review</a>
<ul>
<li class="chapter" data-level="" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#story-packages"><i class="fa fa-check"></i>Needed packages</a></li>
</ul></li>
<li class="chapter" data-level="11.2" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#seattle-house-prices"><i class="fa fa-check"></i><b>11.2</b> Case study: Seattle house prices</a>
<ul>
<li class="chapter" data-level="11.2.1" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#house-prices-EDA-I"><i class="fa fa-check"></i><b>11.2.1</b> Exploratory data analysis: Part I</a></li>
<li class="chapter" data-level="11.2.2" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#house-prices-EDA-II"><i class="fa fa-check"></i><b>11.2.2</b> Exploratory data analysis: Part II</a></li>
<li class="chapter" data-level="11.2.3" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#house-prices-regression"><i class="fa fa-check"></i><b>11.2.3</b> Regression modeling</a></li>
<li class="chapter" data-level="11.2.4" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#house-prices-making-predictions"><i class="fa fa-check"></i><b>11.2.4</b> Making predictions</a></li>
</ul></li>
<li class="chapter" data-level="11.3" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#data-journalism"><i class="fa fa-check"></i><b>11.3</b> Case study: Effective data storytelling</a>
<ul>
<li class="chapter" data-level="11.3.1" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#bechdel-test-for-hollywood-gender-representation"><i class="fa fa-check"></i><b>11.3.1</b> Bechdel test for Hollywood gender representation</a></li>
<li class="chapter" data-level="11.3.2" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#us-births-in-1999"><i class="fa fa-check"></i><b>11.3.2</b> US Births in 1999</a></li>
<li class="chapter" data-level="11.3.3" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#scripts-of-r-code"><i class="fa fa-check"></i><b>11.3.3</b> Scripts of R code</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="11-thinking-with-data.html"><a href="11-thinking-with-data.html#concluding-remarks"><i class="fa fa-check"></i>Concluding remarks</a></li>
</ul></li>
<li class="appendix"><span><b>Appendix</b></span></li>
<li class="chapter" data-level="A" data-path="A-appendixA.html"><a href="A-appendixA.html"><i class="fa fa-check"></i><b>A</b> Statistical Background</a>
<ul>
<li class="chapter" data-level="A.1" data-path="A-appendixA.html"><a href="A-appendixA.html#appendix-stat-terms"><i class="fa fa-check"></i><b>A.1</b> Basic statistical terms</a>
<ul>
<li class="chapter" data-level="A.1.1" data-path="A-appendixA.html"><a href="A-appendixA.html#mean"><i class="fa fa-check"></i><b>A.1.1</b> Mean</a></li>
<li class="chapter" data-level="A.1.2" data-path="A-appendixA.html"><a href="A-appendixA.html#median"><i class="fa fa-check"></i><b>A.1.2</b> Median</a></li>
<li class="chapter" data-level="A.1.3" data-path="A-appendixA.html"><a href="A-appendixA.html#appendix-sd-variance"><i class="fa fa-check"></i><b>A.1.3</b> Standard deviation and variance</a></li>
<li class="chapter" data-level="A.1.4" data-path="A-appendixA.html"><a href="A-appendixA.html#five-number-summary"><i class="fa fa-check"></i><b>A.1.4</b> Five-number summary</a></li>
<li class="chapter" data-level="A.1.5" data-path="A-appendixA.html"><a href="A-appendixA.html#distribution"><i class="fa fa-check"></i><b>A.1.5</b> Distribution</a></li>
<li class="chapter" data-level="A.1.6" data-path="A-appendixA.html"><a href="A-appendixA.html#outliers"><i class="fa fa-check"></i><b>A.1.6</b> Outliers</a></li>
</ul></li>
<li class="chapter" data-level="A.2" data-path="A-appendixA.html"><a href="A-appendixA.html#appendix-normal-curve"><i class="fa fa-check"></i><b>A.2</b> Normal distribution</a></li>
<li class="chapter" data-level="A.3" data-path="A-appendixA.html"><a href="A-appendixA.html#appendix-log10-transformations"><i class="fa fa-check"></i><b>A.3</b> log10 transformations</a></li>
</ul></li>
<li class="chapter" data-level="B" data-path="B-appendixB.html"><a href="B-appendixB.html"><i class="fa fa-check"></i><b>B</b> Inference Examples</a>
<ul>
<li class="chapter" data-level="" data-path="B-appendixB.html"><a href="B-appendixB.html#needed-packages-1"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="B.1" data-path="B-appendixB.html"><a href="B-appendixB.html#inference-mind-map"><i class="fa fa-check"></i><b>B.1</b> Inference mind map</a></li>
<li class="chapter" data-level="B.2" data-path="B-appendixB.html"><a href="B-appendixB.html#one-mean"><i class="fa fa-check"></i><b>B.2</b> One mean</a>
<ul>
<li class="chapter" data-level="B.2.1" data-path="B-appendixB.html"><a href="B-appendixB.html#problem-statement"><i class="fa fa-check"></i><b>B.2.1</b> Problem statement</a></li>
<li class="chapter" data-level="B.2.2" data-path="B-appendixB.html"><a href="B-appendixB.html#competing-hypotheses"><i class="fa fa-check"></i><b>B.2.2</b> Competing hypotheses</a></li>
<li class="chapter" data-level="B.2.3" data-path="B-appendixB.html"><a href="B-appendixB.html#exploring-the-sample-data"><i class="fa fa-check"></i><b>B.2.3</b> Exploring the sample data</a></li>
<li class="chapter" data-level="B.2.4" data-path="B-appendixB.html"><a href="B-appendixB.html#non-traditional-methods"><i class="fa fa-check"></i><b>B.2.4</b> Non-traditional methods</a></li>
<li class="chapter" data-level="B.2.5" data-path="B-appendixB.html"><a href="B-appendixB.html#traditional-methods"><i class="fa fa-check"></i><b>B.2.5</b> Traditional methods</a></li>
<li class="chapter" data-level="B.2.6" data-path="B-appendixB.html"><a href="B-appendixB.html#comparing-results"><i class="fa fa-check"></i><b>B.2.6</b> Comparing results</a></li>
</ul></li>
<li class="chapter" data-level="B.3" data-path="B-appendixB.html"><a href="B-appendixB.html#one-proportion"><i class="fa fa-check"></i><b>B.3</b> One proportion</a>
<ul>
<li class="chapter" data-level="B.3.1" data-path="B-appendixB.html"><a href="B-appendixB.html#problem-statement-1"><i class="fa fa-check"></i><b>B.3.1</b> Problem statement</a></li>
<li class="chapter" data-level="B.3.2" data-path="B-appendixB.html"><a href="B-appendixB.html#competing-hypotheses-1"><i class="fa fa-check"></i><b>B.3.2</b> Competing hypotheses</a></li>
<li class="chapter" data-level="B.3.3" data-path="B-appendixB.html"><a href="B-appendixB.html#exploring-the-sample-data-1"><i class="fa fa-check"></i><b>B.3.3</b> Exploring the sample data</a></li>
<li class="chapter" data-level="B.3.4" data-path="B-appendixB.html"><a href="B-appendixB.html#non-traditional-methods-1"><i class="fa fa-check"></i><b>B.3.4</b> Non-traditional methods</a></li>
<li class="chapter" data-level="B.3.5" data-path="B-appendixB.html"><a href="B-appendixB.html#traditional-methods-1"><i class="fa fa-check"></i><b>B.3.5</b> Traditional methods</a></li>
<li class="chapter" data-level="B.3.6" data-path="B-appendixB.html"><a href="B-appendixB.html#comparing-results-1"><i class="fa fa-check"></i><b>B.3.6</b> Comparing results</a></li>
</ul></li>
<li class="chapter" data-level="B.4" data-path="B-appendixB.html"><a href="B-appendixB.html#two-proportions"><i class="fa fa-check"></i><b>B.4</b> Two proportions</a>
<ul>
<li class="chapter" data-level="B.4.1" data-path="B-appendixB.html"><a href="B-appendixB.html#problem-statement-2"><i class="fa fa-check"></i><b>B.4.1</b> Problem statement</a></li>
<li class="chapter" data-level="B.4.2" data-path="B-appendixB.html"><a href="B-appendixB.html#competing-hypotheses-2"><i class="fa fa-check"></i><b>B.4.2</b> Competing hypotheses</a></li>
<li class="chapter" data-level="B.4.3" data-path="B-appendixB.html"><a href="B-appendixB.html#exploring-the-sample-data-2"><i class="fa fa-check"></i><b>B.4.3</b> Exploring the sample data</a></li>
<li class="chapter" data-level="B.4.4" data-path="B-appendixB.html"><a href="B-appendixB.html#non-traditional-methods-2"><i class="fa fa-check"></i><b>B.4.4</b> Non-traditional methods</a></li>
<li class="chapter" data-level="B.4.5" data-path="B-appendixB.html"><a href="B-appendixB.html#traditional-methods-2"><i class="fa fa-check"></i><b>B.4.5</b> Traditional methods</a></li>
<li class="chapter" data-level="B.4.6" data-path="B-appendixB.html"><a href="B-appendixB.html#test-statistic-2"><i class="fa fa-check"></i><b>B.4.6</b> Test statistic</a></li>
<li class="chapter" data-level="B.4.7" data-path="B-appendixB.html"><a href="B-appendixB.html#state-conclusion-2"><i class="fa fa-check"></i><b>B.4.7</b> State conclusion</a></li>
<li class="chapter" data-level="B.4.8" data-path="B-appendixB.html"><a href="B-appendixB.html#comparing-results-2"><i class="fa fa-check"></i><b>B.4.8</b> Comparing results</a></li>
</ul></li>
<li class="chapter" data-level="B.5" data-path="B-appendixB.html"><a href="B-appendixB.html#two-means-independent-samples"><i class="fa fa-check"></i><b>B.5</b> Two means (independent samples)</a>
<ul>
<li class="chapter" data-level="B.5.1" data-path="B-appendixB.html"><a href="B-appendixB.html#problem-statement-3"><i class="fa fa-check"></i><b>B.5.1</b> Problem statement</a></li>
<li class="chapter" data-level="B.5.2" data-path="B-appendixB.html"><a href="B-appendixB.html#competing-hypotheses-3"><i class="fa fa-check"></i><b>B.5.2</b> Competing hypotheses</a></li>
<li class="chapter" data-level="B.5.3" data-path="B-appendixB.html"><a href="B-appendixB.html#exploring-the-sample-data-3"><i class="fa fa-check"></i><b>B.5.3</b> Exploring the sample data</a></li>
<li class="chapter" data-level="B.5.4" data-path="B-appendixB.html"><a href="B-appendixB.html#non-traditional-methods-3"><i class="fa fa-check"></i><b>B.5.4</b> Non-traditional methods</a></li>
<li class="chapter" data-level="B.5.5" data-path="B-appendixB.html"><a href="B-appendixB.html#traditional-methods-3"><i class="fa fa-check"></i><b>B.5.5</b> Traditional methods</a></li>
<li class="chapter" data-level="B.5.6" data-path="B-appendixB.html"><a href="B-appendixB.html#test-statistic-3"><i class="fa fa-check"></i><b>B.5.6</b> Test statistic</a></li>
<li class="chapter" data-level="B.5.7" data-path="B-appendixB.html"><a href="B-appendixB.html#compute-p-value-1"><i class="fa fa-check"></i><b>B.5.7</b> Compute <span class="math inline">\(p\)</span>-value</a></li>
<li class="chapter" data-level="B.5.8" data-path="B-appendixB.html"><a href="B-appendixB.html#state-conclusion-3"><i class="fa fa-check"></i><b>B.5.8</b> State conclusion</a></li>
<li class="chapter" data-level="B.5.9" data-path="B-appendixB.html"><a href="B-appendixB.html#comparing-results-3"><i class="fa fa-check"></i><b>B.5.9</b> Comparing results</a></li>
</ul></li>
<li class="chapter" data-level="B.6" data-path="B-appendixB.html"><a href="B-appendixB.html#two-means-paired-samples"><i class="fa fa-check"></i><b>B.6</b> Two means (paired samples)</a>
<ul>
<li class="chapter" data-level="" data-path="B-appendixB.html"><a href="B-appendixB.html#problem-statement-4"><i class="fa fa-check"></i>Problem statement</a></li>
<li class="chapter" data-level="B.6.1" data-path="B-appendixB.html"><a href="B-appendixB.html#competing-hypotheses-4"><i class="fa fa-check"></i><b>B.6.1</b> Competing hypotheses</a></li>
<li class="chapter" data-level="B.6.2" data-path="B-appendixB.html"><a href="B-appendixB.html#exploring-the-sample-data-4"><i class="fa fa-check"></i><b>B.6.2</b> Exploring the sample data</a></li>
<li class="chapter" data-level="B.6.3" data-path="B-appendixB.html"><a href="B-appendixB.html#non-traditional-methods-4"><i class="fa fa-check"></i><b>B.6.3</b> Non-traditional methods</a></li>
<li class="chapter" data-level="B.6.4" data-path="B-appendixB.html"><a href="B-appendixB.html#traditional-methods-4"><i class="fa fa-check"></i><b>B.6.4</b> Traditional methods</a></li>
<li class="chapter" data-level="B.6.5" data-path="B-appendixB.html"><a href="B-appendixB.html#comparing-results-4"><i class="fa fa-check"></i><b>B.6.5</b> Comparing results</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="C" data-path="C-appendixC.html"><a href="C-appendixC.html"><i class="fa fa-check"></i><b>C</b> Tips and Tricks</a>
<ul>
<li class="chapter" data-level="" data-path="C-appendixC.html"><a href="C-appendixC.html#needed-packages-2"><i class="fa fa-check"></i>Needed packages</a></li>
<li class="chapter" data-level="C.1" data-path="C-appendixC.html"><a href="C-appendixC.html#data-wrangling"><i class="fa fa-check"></i><b>C.1</b> Data wrangling</a>
<ul>
<li class="chapter" data-level="C.1.1" data-path="C-appendixC.html"><a href="C-appendixC.html#appendix-missing-values"><i class="fa fa-check"></i><b>C.1.1</b> Dealing with missing values</a></li>
<li class="chapter" data-level="C.1.2" data-path="C-appendixC.html"><a href="C-appendixC.html#appendix-reordering-bars"><i class="fa fa-check"></i><b>C.1.2</b> Reordering bars in a barplot</a></li>
<li class="chapter" data-level="C.1.3" data-path="C-appendixC.html"><a href="C-appendixC.html#appendix-money-on-axis"><i class="fa fa-check"></i><b>C.1.3</b> Showing money on an axis</a></li>
<li class="chapter" data-level="C.1.4" data-path="C-appendixC.html"><a href="C-appendixC.html#appendix-changing-values"><i class="fa fa-check"></i><b>C.1.4</b> Changing values inside cells</a></li>
<li class="chapter" data-level="C.1.5" data-path="C-appendixC.html"><a href="C-appendixC.html#appendix-convert-numerical-categorical"><i class="fa fa-check"></i><b>C.1.5</b> Converting a numerical variable to a categorical one</a></li>
<li class="chapter" data-level="C.1.6" data-path="C-appendixC.html"><a href="C-appendixC.html#appendix-prop"><i class="fa fa-check"></i><b>C.1.6</b> Computing proportions</a></li>
<li class="chapter" data-level="C.1.7" data-path="C-appendixC.html"><a href="C-appendixC.html#appendix-commas"><i class="fa fa-check"></i><b>C.1.7</b> Dealing with %, commas, and $</a></li>
</ul></li>
<li class="chapter" data-level="C.2" data-path="C-appendixC.html"><a href="C-appendixC.html#interactive-graphics"><i class="fa fa-check"></i><b>C.2</b> Interactive graphics</a>
<ul>
<li class="chapter" data-level="C.2.1" data-path="C-appendixC.html"><a href="C-appendixC.html#interactive-linegraphs"><i class="fa fa-check"></i><b>C.2.1</b> Interactive linegraphs</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="D" data-path="D-appendixD.html"><a href="D-appendixD.html"><i class="fa fa-check"></i><b>D</b> Learning Check Solutions</a>
<ul>
<li class="chapter" data-level="D.1" data-path="D-appendixD.html"><a href="D-appendixD.html#chapter-1-solutions"><i class="fa fa-check"></i><b>D.1</b> Chapter 1 Solutions</a></li>
</ul></li>
<li class="chapter" data-level="E" data-path="E-appendixE.html"><a href="E-appendixE.html"><i class="fa fa-check"></i><b>E</b> Versions of R Packages Used</a></li>
<li class="chapter" data-level="" data-path="references.html"><a href="references.html"><i class="fa fa-check"></i>References</a></li>
</ul>
</nav>
</div>
<div class="book-body">
<div class="body-inner">
<div class="book-header" role="navigation">
<h1>
<i class="fa fa-circle-o-notch fa-spin"></i><a href="./">Statistical Inference via Data Science</a>
</h1>
</div>
<div class="page-wrapper" tabindex="-1" role="main">
<div class="page-inner">
<section class="normal" id="section-">
<img src='https://moderndive.com/wide_format.png' alt="ModernDive">
<div id="viz" class="section level1" number="2">
<h1><span class="header-section-number">Chapter 2</span> Data Visualization</h1>
<p>We begin the development of your data science toolbox with data visualization. By visualizing data, we gain valuable insights we couldn’t initially obtain from just looking at the raw data values. We’ll use the <code>ggplot2</code> package, as it provides an easy way to customize your plots. <code>ggplot2</code> is rooted in the data visualization theory known as <em>the grammar of graphics</em> <span class="citation">(<a href="#ref-wilkinson2005" role="doc-biblioref">Wilkinson 2005</a>)</span>, developed by Leland Wilkinson. </p>
<p>At their most basic, graphics/plots/charts (we use these terms interchangeably in this book) provide a nice way to explore the patterns in data, such as the presence of <em>outliers</em>, <em>distributions</em> of individual variables, and <em>relationships</em> between groups of variables. Graphics are designed to emphasize the findings and insights you want your audience to understand. This does, however, require a balancing act. On the one hand, you want to highlight as many interesting findings as possible. On the other hand, you don’t want to include so much information that it overwhelms your audience.</p>
<p>As we will see, plots also help us to identify patterns and outliers in our data. We’ll see that a common extension of these ideas is to compare the <em>distribution</em> of one numerical variable (for example, the center and spread of its values) across the levels of a different categorical variable.</p>
<div id="needed-packages" class="section level3 unnumbered">
<h3>Needed packages</h3>
<p>Let’s load all the packages needed for this chapter (this assumes you’ve already installed them). Read Section <a href="1-getting-started.html#packages">1.3</a> for information on how to install and load R packages.</p>
<div class="sourceCode" id="cb13"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb13-1"><a href="2-viz.html#cb13-1" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(rfishbase)</span>
<span id="cb13-2"><a href="2-viz.html#cb13-2" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(ggplot2)</span>
<span id="cb13-3"><a href="2-viz.html#cb13-3" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(dplyr)</span></code></pre></div>
</div>
<div id="grammarofgraphics" class="section level2" number="2.1">
<h2><span class="header-section-number">2.1</span> The grammar of graphics</h2>
<p>We start with a discussion of a theoretical framework for data visualization known as “the grammar of graphics.” This framework serves as the foundation for the <code>ggplot2</code> package which we’ll use extensively in this chapter. Think of how we construct and form sentences in English by combining different elements, like nouns, verbs, articles, subjects, objects, etc. We can’t just combine these elements in any arbitrary order; we must do so following a set of rules known as a linguistic grammar. Similarly to a linguistic grammar, “the grammar of graphics” defines a set of rules for constructing <em>statistical graphics</em> by combining different types of <em>layers</em>. This grammar was created by Leland Wilkinson <span class="citation">(<a href="#ref-wilkinson2005" role="doc-biblioref">Wilkinson 2005</a>)</span> and has been implemented in a variety of data visualization software platforms like R, but also <a href="https://plot.ly/">Plotly</a> and <a href="https://www.tableau.com/">Tableau</a>.</p>
<div id="components-of-the-grammar" class="section level3" number="2.1.1">
<h3><span class="header-section-number">2.1.1</span> Components of the grammar</h3>
<p>In short, the grammar tells us that:</p>
<blockquote>
<p><strong>A statistical graphic is a <code>mapping</code> of <code>data</code> variables to <code>aes</code>thetic attributes of <code>geom</code>etric objects.</strong></p>
</blockquote>
<p>Specifically, we can break a graphic into the following three essential components:</p>
<ol style="list-style-type: decimal">
<li><code>data</code>: the dataset containing the variables of interest.</li>
<li><code>geom</code>: the geometric object in question. This refers to the type of object we can observe in a plot. For example: points, lines, and bars.</li>
<li><code>aes</code>: aesthetic attributes of the geometric object. For example, x/y position, color, shape, and size. Aesthetic attributes are <em>mapped</em> to variables in the dataset.</li>
</ol>
<p>You might be wondering why we wrote the terms <code>data</code>, <code>geom</code>, and <code>aes</code> in a computer code type font. We’ll see very shortly that we’ll specify the elements of the grammar in R using these terms. However, let’s first break down the grammar with an example.</p>
</div>
<div id="gapminder" class="section level3" number="2.1.2">
<h3><span class="header-section-number">2.1.2</span> Gapminder data</h3>
<p>In February 2006, a Swedish physician and data advocate named Hans Rosling gave a TED talk titled <a href="https://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen">“The best stats you’ve ever seen”</a> where he presented global economic, health, and development data from the website <a href="http://www.gapminder.org/tools/#_locale_id=en;&chart-type=bubbles">gapminder.org</a>. The 2007 data cover 142 countries; Table <a href="2-viz.html#tab:gapminder-2007">2.1</a> shows the first three as a peek into the data.</p>
<table class="table" style="font-size: 16px; margin-left: auto; margin-right: auto;">
<caption style="font-size: initial !important;">
<span id="tab:gapminder-2007">TABLE 2.1: </span>Gapminder 2007 Data: First 3 of 142 countries
</caption>
<thead>
<tr>
<th style="text-align:left;">
Country
</th>
<th style="text-align:left;">
Continent
</th>
<th style="text-align:right;">
Life Expectancy
</th>
<th style="text-align:right;">
Population
</th>
<th style="text-align:right;">
GDP per Capita
</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;">
Afghanistan
</td>
<td style="text-align:left;">
Asia
</td>
<td style="text-align:right;">
43.8
</td>
<td style="text-align:right;">
31889923
</td>
<td style="text-align:right;">
975
</td>
</tr>
<tr>
<td style="text-align:left;">
Albania
</td>
<td style="text-align:left;">
Europe
</td>
<td style="text-align:right;">
76.4
</td>
<td style="text-align:right;">
3600523
</td>
<td style="text-align:right;">
5937
</td>
</tr>
<tr>
<td style="text-align:left;">
Algeria
</td>
<td style="text-align:left;">
Africa
</td>
<td style="text-align:right;">
72.3
</td>
<td style="text-align:right;">
33333216
</td>
<td style="text-align:right;">
6223
</td>
</tr>
</tbody>
</table>
<p>Each row in this table corresponds to a country in 2007. For each row, we have 5 columns:</p>
<ol style="list-style-type: decimal">
<li><strong>Country</strong>: Name of country.</li>
<li><strong>Continent</strong>: Which of the five continents the country is part of. Note that “Americas” includes countries in both North and South America and that Antarctica is excluded.</li>
<li><strong>Life Expectancy</strong>: Life expectancy in years.</li>
<li><strong>Population</strong>: Number of people living in the country.</li>
<li><strong>GDP per Capita</strong>: Gross domestic product per capita (in US dollars).</li>
</ol>
<p>Now consider Figure <a href="2-viz.html#fig:gapminder">2.1</a>, which plots these data for all 142 countries.</p>
<div class="figure" style="text-align: center"><span id="fig:gapminder"></span>
<img src="ModernDive_files/figure-html/gapminder-1.png" alt="Life expectancy over GDP per capita in 2007." width="\textwidth" />
<p class="caption">
FIGURE 2.1: Life expectancy over GDP per capita in 2007.
</p>
</div>
<p>Let’s view this plot through the grammar of graphics:</p>
<ol style="list-style-type: decimal">
<li>The <code>data</code> variable <strong>GDP per Capita</strong> gets mapped to the <code>x</code>-position <code>aes</code>thetic of the points.</li>
<li>The <code>data</code> variable <strong>Life Expectancy</strong> gets mapped to the <code>y</code>-position <code>aes</code>thetic of the points.</li>
<li>The <code>data</code> variable <strong>Population</strong> gets mapped to the <code>size</code> <code>aes</code>thetic of the points.</li>
<li>The <code>data</code> variable <strong>Continent</strong> gets mapped to the <code>color</code> <code>aes</code>thetic of the points.</li>
</ol>
<p>We’ll see shortly that <code>data</code> corresponds to the particular data frame where our data is saved and that “data variables” correspond to particular columns in the data frame. Furthermore, the <code>geom</code>etric objects in this plot are points. While this example uses points, graphics are not limited to them: we can also use lines, bars, and other geometric objects.</p>
<p>Let’s summarize the three essential components of the grammar in Table <a href="2-viz.html#tab:summary-table-gapminder">2.2</a>.</p>
<table class="table" style="font-size: 16px; margin-left: auto; margin-right: auto;">
<caption style="font-size: initial !important;">
<span id="tab:summary-table-gapminder">TABLE 2.2: </span>Summary of the grammar of graphics for this plot
</caption>
<thead>
<tr>
<th style="text-align:left;">
data variable
</th>
<th style="text-align:left;">
aes
</th>
<th style="text-align:left;">
geom
</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;">
GDP per Capita
</td>
<td style="text-align:left;">
x
</td>
<td style="text-align:left;">
point
</td>
</tr>
<tr>
<td style="text-align:left;">
Life Expectancy
</td>
<td style="text-align:left;">
y
</td>
<td style="text-align:left;">
point
</td>
</tr>
<tr>
<td style="text-align:left;">
Population
</td>
<td style="text-align:left;">
size
</td>
<td style="text-align:left;">
point
</td>
</tr>
<tr>
<td style="text-align:left;">
Continent
</td>
<td style="text-align:left;">
color
</td>
<td style="text-align:left;">
point
</td>
</tr>
</tbody>
</table>
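<p>As a preview of how the mappings in Table <a href="2-viz.html#tab:summary-table-gapminder">2.2</a> could be written in R, here is a minimal sketch. It assumes the <code>gapminder</code> package is installed (it is not among the packages loaded for this chapter) and relies on that package’s column names <code>year</code>, <code>gdpPercap</code>, <code>lifeExp</code>, <code>pop</code>, and <code>continent</code>. The name <code>gapminder_2007</code> is our own choice, and this is not necessarily the code that produced Figure <a href="2-viz.html#fig:gapminder">2.1</a>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)
library(dplyr)
library(gapminder)  # assumption: provides the gapminder data frame

# Keep only the 2007 rows, one per country
gapminder_2007 <- gapminder %>% 
  filter(year == 2007)

# Map each data variable to an aesthetic of the points, as in Table 2.2
ggplot(data = gapminder_2007, 
       mapping = aes(x = gdpPercap, y = lifeExp, size = pop, color = continent)) + 
  geom_point()</code></pre></div>
<p>Each argument inside <code>aes()</code> corresponds to one row of the summary table, and <code>geom_point()</code> supplies the geometric object shared by all four mappings.</p>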
</div>
<div id="other-components" class="section level3" number="2.1.3">
<h3><span class="header-section-number">2.1.3</span> Other components</h3>
<p>There are other components of the grammar of graphics we can control as well. As you start to delve deeper into the grammar of graphics, you’ll start to encounter these topics more frequently. In this book, we’ll keep things simple and only work with these two additional components:</p>
<ul>
<li><code>facet</code>ing breaks up a plot into several plots split by the values of another variable (Section <a href="2-viz.html#facets">2.5</a>) </li>
<li><code>position</code> adjustments for barplots (Section <a href="2-viz.html#geombar">2.8</a>) </li>
</ul>
<p>Other more complex components like <code>scales</code> and <code>coord</code>inate systems are left for a more advanced text such as <a href="http://r4ds.had.co.nz/data-visualisation.html#aesthetic-mappings"><em>R for Data Science</em></a> <span class="citation">(<a href="#ref-rds2016" role="doc-biblioref">Grolemund and Wickham 2017</a>)</span>. Generally speaking, the grammar of graphics allows for a high degree of customization of plots and also a consistent framework for easily updating and modifying them.</p>
</div>
<div id="ggplot2-package" class="section level3" number="2.1.4">
<h3><span class="header-section-number">2.1.4</span> ggplot2 package</h3>
<p>In this book, we will use the <code>ggplot2</code> package for data visualization, which is an implementation of the <code>g</code>rammar of <code>g</code>raphics for R <span class="citation">(<a href="#ref-R-ggplot2" role="doc-biblioref">Wickham, Chang, et al. 2021</a>)</span>. As we noted earlier, a lot of the previous section was written in a computer code type font. This is because the various components of the grammar of graphics are specified in the <code>ggplot()</code> function included in the <code>ggplot2</code> package. For the purposes of this book, we’ll always provide the <code>ggplot()</code> function with the following arguments (i.e., inputs) at a minimum:</p>
<ul>
<li>The data frame where the variables exist: the <code>data</code> argument.</li>
<li>The mapping of the variables to aesthetic attributes: the <code>mapping</code> argument which specifies the <code>aes</code>thetic attributes involved.</li>
</ul>
<p>After we’ve specified these components, we then add <em>layers</em> to the plot using the <code>+</code> sign. The most essential layer to add to a plot is the layer that specifies which type of <code>geom</code>etric object we want the plot to involve: points, lines, bars, and others. Other layers we can add to a plot include the plot title, axes labels, visual themes for the plots, and facets (which we’ll see in Section <a href="2-viz.html#facets">2.5</a>).</p>
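<p>The general pattern just described can be sketched as follows. The data frame <code>toy</code> and its columns <code>x</code> and <code>y</code> are hypothetical, made up purely to show where the <code>data</code> argument, the <code>mapping</code> argument, and the added layers go.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# Hypothetical toy data frame, made up for illustration only
toy <- data.frame(x = c(1, 2, 3, 4), y = c(2, 4, 3, 5))

ggplot(data = toy, mapping = aes(x = x, y = y)) +  # data and aesthetic mapping
  geom_point() +                                   # layer: geometric object
  labs(title = "A toy scatterplot",                # layer: title and axis labels
       x = "x values", y = "y values")</code></pre></div>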
<p>Let’s now put the theory of the grammar of graphics into practice.</p>
</div>
</div>
<div id="FiveNG" class="section level2" number="2.2">
<h2><span class="header-section-number">2.2</span> Five named graphs - the 5NG</h2>
<p>In order to keep things simple in this book, we will only focus on five different types of graphics, each with a commonly given name. We term these “five named graphs” or in abbreviated form, the <strong>5NG</strong>: </p>
<ol style="list-style-type: decimal">
<li>scatterplots</li>
<li>linegraphs</li>
<li>histograms</li>
<li>boxplots</li>
<li>barplots</li>
</ol>
<p>We’ll also present some variations of these plots, but with this basic repertoire of five graphics in your toolbox, you can visualize a wide array of different variable types. As we’ll see, certain plots are only appropriate for categorical variables, while others are only appropriate for numerical variables.</p>
</div>
<div id="scatterplots" class="section level2" number="2.3">
<h2><span class="header-section-number">2.3</span> 5NG#1: Scatterplots</h2>
<p>The simplest of the 5NG are <em>scatterplots</em>, also called <em>bivariate plots</em>. They allow you to visualize the <em>relationship</em> between two numerical variables. While you may already be familiar with scatterplots, let’s view them through the lens of the grammar of graphics we presented in Section <a href="2-viz.html#grammarofgraphics">2.1</a>. Specifically, we will visualize the relationship between the following two numerical variables in the <code>all_fishdata</code> data frame we created in Section <a href="1-getting-started.html#fishbasedataframe">1.4.2</a>:</p>
<ol style="list-style-type: decimal">
<li><code>Length</code>: typical length (in cm) of a fish species on the horizontal “x” axis and</li>
<li><code>Weight</code>: typical weight (in g) on the vertical “y” axis</li>
</ol>
<p>for brackish fish species. (If the <code>all_fishdata</code> data frame is not in your Environment, rerun the <code>all_fishdata <- species()</code> command to reproduce it.)</p>
<p>This requires paring down the data from all 34,299 fish species to only the 3,116 species that live in <em>brackish</em> (slightly salty) environments at any stage of development. We do this so our scatterplot will involve a manageable 3,116 points rather than an overwhelmingly large number. To achieve this, we’ll take the <code>all_fishdata</code> data frame, filter the rows so that only the 3,116 rows corresponding to brackish fish species are kept, and save the result in a new data frame called <code>brackish_fish</code> using the <code><-</code> <em>assignment</em> operator: </p>
<div class="sourceCode" id="cb14"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb14-1"><a href="2-viz.html#cb14-1" aria-hidden="true" tabindex="-1"></a>brackish_fish <span class="ot"><-</span> all_fishdata <span class="sc">%>%</span> </span>
<span id="cb14-2"><a href="2-viz.html#cb14-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">filter</span>(Brack <span class="sc">==</span> <span class="sc">-</span><span class="dv">1</span>)</span></code></pre></div>
<p>For now, don’t worry if you don’t fully understand this code. We’ll see in Chapter <a href="3-wrangling.html#wrangling">3</a> on data wrangling that it uses the <code>dplyr</code> package to achieve our goal: it takes the <code>all_fishdata</code> data frame and <code>filter</code>s it to return only the rows where <code>Brack</code> is equal to <code>-1</code> (in this dataset, <code>-1</code> indicates true and <code>0</code> indicates false, counterintuitive as that may seem). Recall from Section <a href="1-getting-started.html#code">1.2</a> that testing for equality is specified with <code>==</code> and not <code>=</code>. Convince yourself that this code does what it is supposed to by exploring the resulting data frame with <code>View(brackish_fish)</code>. You’ll see that it has 3,116 rows, one for each brackish fish species.</p>
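<p>If you would like to see the filtering step in isolation, the following sketch applies the same idea to a tiny made-up data frame. The names <code>toy_fish</code>, <code>toy_brackish</code>, and the species labels are hypothetical; only the <code>Brack</code> coding of <code>-1</code> and <code>0</code> is taken from the text above.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(dplyr)

# Hypothetical stand-in for all_fishdata:
# Brack is -1 for brackish species and 0 otherwise
toy_fish <- data.frame(
  Species = c("sp_a", "sp_b", "sp_c", "sp_d"),
  Brack = c(-1, 0, -1, 0)
)

# Keep only the rows where Brack equals -1
toy_brackish <- toy_fish %>% 
  filter(Brack == -1)

nrow(toy_brackish)      # number of rows kept after filtering
count(toy_fish, Brack)  # number of rows per value of Brack before filtering</code></pre></div>
<p>Running the same two checks on <code>brackish_fish</code> and <code>all_fishdata</code> is one way to confirm the row counts quoted above without scrolling through <code>View()</code>.</p>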
<div class="learncheck">
<p>
<strong><em>Learning check</em></strong>
</p>
</div>
<p><strong>(LC2.1)</strong> Take a look at both the <code>all_fishdata</code> and <code>brackish_fish</code> data frames by running <code>View(all_fishdata)</code> and <code>View(brackish_fish)</code>. Does the <code>brackish_fish</code> data frame seem to contain the expected rows?</p>
<div class="learncheck">
</div>
<div id="geompoint" class="section level3" number="2.3.1">
<h3><span class="header-section-number">2.3.1</span> Scatterplots via <code>geom_point</code></h3>
<p>Let’s now go over the code that will create the desired scatterplot, while keeping in mind the grammar of graphics framework we introduced in Section <a href="2-viz.html#grammarofgraphics">2.1</a>. Let’s take a look at the code and break it down piece-by-piece.</p>
<div class="sourceCode" id="cb15"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb15-1"><a href="2-viz.html#cb15-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(<span class="at">data =</span> brackish_fish, <span class="at">mapping =</span> <span class="fu">aes</span>(<span class="at">x =</span> Length, <span class="at">y =</span> Weight)) <span class="sc">+</span> </span>
<span id="cb15-2"><a href="2-viz.html#cb15-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>()</span></code></pre></div>
<p>Within the <code>ggplot()</code> function, we specify two of the components of the grammar of graphics as arguments (i.e., inputs):</p>
<ol style="list-style-type: decimal">
<li>The <code>data</code> as the <code>brackish_fish</code> data frame via <code>data = brackish_fish</code>.</li>
<li>The <code>aes</code>thetic <code>mapping</code> by setting <code>mapping = aes(x = Length, y = Weight)</code>. Specifically, the variable <code>Length</code> maps to the <code>x</code> position aesthetic, while the variable <code>Weight</code> maps to the <code>y</code> position.</li>
</ol>
<p>We then add a layer to the <code>ggplot()</code> function call using the <code>+</code> sign. The added layer in question specifies the third component of the grammar: the <code>geom</code>etric object. In this case, the geometric object is set to be points by specifying <code>geom_point()</code>. After running these two lines of code in your console, you’ll notice two outputs: a warning message and the graphic shown in Figure <a href="2-viz.html#fig:noalpha">2.2</a>.</p>
<pre><code>Warning: Removed 2532 rows containing missing values (geom_point).</code></pre>
<div class="figure" style="text-align: center"><span id="fig:noalpha"></span>
<img src="ModernDive_files/figure-html/noalpha-1.png" alt="Length versus weight for brackish species in all_fishdata." width="\textwidth" />
<p class="caption">
FIGURE 2.2: Length versus weight for brackish species in all_fishdata.
</p>
</div>
<p>Let’s first unpack the graphic in Figure <a href="2-viz.html#fig:noalpha">2.2</a>. Observe that a <em>positive relationship</em> exists overall between <code>Length</code> and <code>Weight</code>: as fish lengths increase, fish weights tend to also increase. Observe also the large number of points clustered near (0, 0).</p>
<p>Let’s turn our attention to the warning message. R is alerting us to the fact that many rows were ignored due to missing information. For these rows, either the value for <code>Length</code> or <code>Weight</code> or both were missing (recorded in R as <code>NA</code>), and thus these rows were ignored in our plot.</p>
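<p>To make the warning less mysterious, here is a small sketch of the rule it describes, using a hypothetical data frame <code>toy_fish</code> with made-up values. A row is left out of the scatterplot whenever its <code>x</code> value, its <code>y</code> value, or both are <code>NA</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical toy data with some missing values
toy_fish <- data.frame(
  Length = c(10, NA, 25, 40, NA),
  Weight = c(50, 80, NA, 900, NA)
)

# TRUE for rows where Length, Weight, or both are missing
dropped <- is.na(toy_fish$Length) | is.na(toy_fish$Weight)

sum(dropped)   # rows that would be removed from the plot
sum(!dropped)  # rows that would actually be plotted</code></pre></div>
<p>Applying the same logic to <code>brackish_fish</code> should give a count of removed rows that agrees with the number reported in the warning message.</p>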
<p>Before we continue, let’s make a few more observations about this code that created the scatterplot. Note that the <code>+</code> sign comes at the end of lines, and not at the beginning. You’ll get an error in R if you put it at the beginning of a line. When adding layers to a plot, you are encouraged to start a new line after the <code>+</code> (by pressing the Return/Enter button on your keyboard) so that the code for each layer is on a new line. As we add more and more layers to plots, you’ll see this will greatly improve the legibility of your code.</p>
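<p>The placement rule for <code>+</code> can be illustrated with a short sketch. The data frame <code>toy</code> is hypothetical, and the failing version is left commented out so that the block runs as written.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# Hypothetical toy data frame, made up for illustration only
toy <- data.frame(x = c(1, 2, 3), y = c(3, 1, 2))

# Works: the first line ends with +, so R knows the command continues
ggplot(data = toy, mapping = aes(x = x, y = y)) + 
  geom_point()

# Does not work: the first line is already a complete command,
# so the second line is run on its own and R reports an error
# ggplot(data = toy, mapping = aes(x = x, y = y))
#   + geom_point()</code></pre></div>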
<p>To stress the importance of adding the layer specifying the <code>geom</code>etric object, consider Figure <a href="2-viz.html#fig:nolayers">2.3</a> where no layers are added. Because the <code>geom</code>etric object was not specified, we have a blank plot which is not very useful!</p>
<div class="sourceCode" id="cb17"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb17-1"><a href="2-viz.html#cb17-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(<span class="at">data =</span> brackish_fish, <span class="at">mapping =</span> <span class="fu">aes</span>(<span class="at">x =</span> Length, <span class="at">y =</span> Weight))</span></code></pre></div>
<div class="figure" style="text-align: center"><span id="fig:nolayers"></span>
<img src="ModernDive_files/figure-html/nolayers-1.png" alt="A plot with no layers." width="\textwidth" />
<p class="caption">
FIGURE 2.3: A plot with no layers.
</p>
</div>
<div class="learncheck">
<p>
<strong><em>Learning check</em></strong>
</p>
</div>
<p><strong>(LC2.2)</strong> What are some practical reasons why <code>Length</code> and <code>Weight</code> have a positive relationship?</p>
<p><strong>(LC2.3)</strong> For which of these two variables, <code>Length</code> and <code>Weight</code>, is the data more frequently missing? How did you determine this?</p>
<p><strong>(LC2.4)</strong> What are some other features of the plot that stand out to you?</p>
<p><strong>(LC2.5)</strong> Create a new scatterplot using different quantitative variables in the <code>brackish_fish</code> data frame by modifying the example given.</p>
<div class="learncheck">
</div>
</div>
<div id="overplotting" class="section level3" number="2.3.2">
<h3><span class="header-section-number">2.3.2</span> Overplotting</h3>
<p>The large mass of points near (0, 0) in Figure <a href="2-viz.html#fig:noalpha">2.2</a> can cause some confusion since it is hard to tell the true number of points that are plotted. This is the result of a phenomenon called <em>overplotting</em>. As one may guess, this corresponds to points being plotted on top of each other over and over again. There are two methods to address overplotting:</p>
<ol style="list-style-type: decimal">
<li>Adjusting the transparency of the points or</li>
<li>Adding a little random “jitter,” or random “nudges,” to each of the points.</li>
</ol>
<p><strong>Method 1: Changing the transparency</strong></p>
<p>The first way of addressing overplotting is to change the transparency/opacity of the points by setting the <code>alpha</code> argument in <code>geom_point()</code>. We can change the <code>alpha</code> argument to be any value between <code>0</code> and <code>1</code>, where <code>0</code> sets the points to be 100% transparent and <code>1</code> sets the points to be 100% opaque. By default, <code>alpha</code> is set to <code>1</code>. In other words, if we don’t explicitly set an <code>alpha</code> value, R will use <code>alpha = 1</code>.</p>
<p>Note how the following code is identical to the code in Section <a href="2-viz.html#scatterplots">2.3</a> that created the scatterplot with overplotting, but with <code>alpha = 0.2</code> added to the <code>geom_point()</code> function:</p>
<div class="sourceCode" id="cb18"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb18-1"><a href="2-viz.html#cb18-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(<span class="at">data =</span> brackish_fish, <span class="at">mapping =</span> <span class="fu">aes</span>(<span class="at">x =</span> Length, <span class="at">y =</span> Weight)) <span class="sc">+</span> </span>
<span id="cb18-2"><a href="2-viz.html#cb18-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_point</span>(<span class="at">alpha =</span> <span class="fl">0.2</span>)</span></code></pre></div>
<div class="figure" style="text-align: center"><span id="fig:alpha"></span>
<img src="ModernDive_files/figure-html/alpha-1.png" alt="Weight vs. length scatterplot with alpha = 0.2." width="\textwidth" />
<p class="caption">
FIGURE 2.4: Weight vs. length scatterplot with alpha = 0.2.
</p>
</div>
<p>The key feature to note in Figure <a href="2-viz.html#fig:alpha">2.4</a> is that the transparency of the points is cumulative: areas with a high degree of overplotting are darker, whereas areas with a lower degree are less dark. Note furthermore that there is no <code>aes()</code> surrounding <code>alpha = 0.2</code>. This is because we are not mapping a variable to an aesthetic attribute, but rather merely changing the default setting of <code>alpha</code>. If you instead change the second line to read <code>geom_point(aes(alpha = 0.2))</code>, ggplot2 treats <code>0.2</code> as a data value to be mapped rather than as a fixed setting, so the points will not be drawn with the transparency you asked for and a legend for <code>alpha</code> is added to the plot.</p>
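<p>To get a feel for why overlapping transparent points look darker, here is a rough back-of-the-envelope calculation. It assumes the graphics device blends points in the standard way, so that <code>n</code> points stacked on the same spot, each with opacity <code>alpha</code>, have a combined opacity of <code>1 - (1 - alpha)^n</code>. The variable names are ours, chosen only for illustration.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Opacity of a single point, as set by alpha = 0.2 in geom_point()
alpha &lt;- 0.2

# Number of points stacked on exactly the same spot
n_overlapping &lt;- 1:5

# Assumed blending rule: each extra point covers a further 20% of what is
# still showing through from the background
combined_opacity &lt;- 1 - (1 - alpha)^n_overlapping
round(combined_opacity, 3)</code></pre></div>
<p>Under this assumption, a single point is 20% opaque, two overlapping points are 36% opaque, and five overlapping points are about 67% opaque, which is why dense regions of the plot appear darker.</p>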
<p><strong>Method 2: Jittering the points</strong></p>
<p>The second way of addressing overplotting is by <em>jittering</em> all the points. This means giving each point a small “nudge” in a random direction. You can think of “jittering” as shaking the points around a bit on the plot. Let’s illustrate using a simple example first. Say we have a data frame with 4 identical rows of x and y values: (0,0), (0,0), (0,0), and (0,0). In Figure <a href="2-viz.html#fig:jitter-example-plot-1">2.5</a>, we present both the regular scatterplot of these 4 points (on the left) and its jittered counterpart (on the right).</p>
<div class="figure" style="text-align: center"><span id="fig:jitter-example-plot-1"></span>
<img src="ModernDive_files/figure-html/jitter-example-plot-1-1.png" alt="Regular and jittered scatterplot." width="\textwidth" />
<p class="caption">
FIGURE 2.5: Regular and jittered scatterplot.
</p>
</div>
<p>In the left-hand regular scatterplot, observe that the 4 points are superimposed on top of each other. While we know there are 4 values being plotted, this fact might not be apparent to others. In the right-hand jittered scatterplot, it is now plainly evident that this plot involves four points since each point is given a random “nudge.”</p>
<p>Keep in mind, however, that jittering is strictly a visualization tool; even after creating a jittered scatterplot, the original values saved in the data frame remain unchanged. </p>
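<p>The small example with four identical points can be sketched in code as follows. This is only an illustration: the data frame name <code>identical_points</code> and the <code>width</code> and <code>height</code> values of <code>0.1</code> are assumptions made here, not necessarily the settings used to draw the figure. It uses <code>geom_jitter()</code>, the ggplot2 layer for jittered points.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# Four identical rows of x and y values, all equal to (0, 0)
identical_points &lt;- data.frame(x = c(0, 0, 0, 0), y = c(0, 0, 0, 0))

# Regular scatterplot: the four points are drawn on top of each other
regular_plot &lt;- ggplot(data = identical_points, mapping = aes(x = x, y = y)) +
  geom_point()

# Jittered scatterplot: each point is drawn with a small random offset
# (the width and height values are assumed for illustration)
jittered_plot &lt;- ggplot(data = identical_points, mapping = aes(x = x, y = y)) +
  geom_jitter(width = 0.1, height = 0.1)

# The offsets are applied only when the plot is drawn;
# the data frame itself still holds four rows of zeros
identical_points</code></pre></div>
<p>Printing <code>regular_plot</code> and <code>jittered_plot</code> draws the two versions, while printing <code>identical_points</code> confirms that the stored values are unchanged.</p>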
<p>To create a jittered scatterplot, instead of using <code>geom_point()</code>, we use <code>geom_jitter()</code>. Observe how the following code is very similar to the code that created the scatterplot with overplotting in Subsection <a href="2-viz.html#geompoint">2.3.1</a>, but with <code>geom_point()</code> replaced with <code>geom_jitter()</code>.</p>
<div class="sourceCode" id="cb19"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb19-1"><a href="2-viz.html#cb19-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(<span class="at">data =</span> brackish_fish, <span class="at">mapping =</span> <span class="fu">aes</span>(<span class="at">x =</span> Length, <span class="at">y =</span> Weight)) <span class="sc">+</span> </span>
<span id="cb19-2"><a href="2-viz.html#cb19-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">geom_jitter</span>(<span class="at">width =</span> <span class="dv">30</span>, <span class="at">height =</span> <span class="dv">30</span>)</span></code></pre></div>