forked from UBC-MDS/socceranalysisR
-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
137 lines (104 loc) · 4.94 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
---
output: github_document
editor_options:
markdown:
wrap: 72
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# socceranalysisR
<!-- badges: start -->
[![R-CMD-check](https://github.com/UBC-MDS/socceranalysisR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/UBC-MDS/socceranalysisR/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->
socceranalysisR is a powerful R package designed to make it easy to
analyze and understand soccer statistics. With its set of functions, you
can quickly obtain summary statistics for a particular team, identify
outliers based on market value, rank players by goals per game and
display different plots. The package is built in a way that allows user
to easily customize the functions to their own interests, giving them
the flexibility to analyze the data in a way that is most meaningful to
them. Whether you're a coach, a sports journalist or an analyst,
socceranalysisR will help you unlock the insights hidden in your soccer
data and make more informed decisions.
## Functions
1. `soc_find_team_stat`: provides a quick and easy way to understand
the descriptive statistics of a team.
2. `soc_rankingplayers`: Ranks players based on specific attributes
3. `soc_get_outliers`: Identifes outliers using statistical methods
(interquartile range or standard deviations)
4. `soc_viz_stats` : Generates meaningful visualizations to help users
understand and interpret the numerical data (scatter plots or histograms)
## R ecosystem
socceranalysisR can be used in conjunction with other popular R packages
such as [dplyr](https://github.com/tidyverse/dplyr) and
[ML_for_Hackers](https://github.com/johnmyleswhite/ML_for_Hackers) to
perform more advanced data analysis and machine learning tasks. For
example, users can use dplyr to manipulate and clean their soccer data,
and then use this package to perform specific soccer-related analysis on
the cleaned data. Additionally, socceranalysisR can be used in
conjunction with ML_for_Hackers for machine learning tasks on soccer
data. They are designed to be a higher-level, more user-friendly and
declarative interface based on
[ggplot2](https://github.com/tidyverse/ggplot2) for performing specific
soccer-related analysis and visualization tasks. Users can perform
similar visualization using [shiny](https://github.com/rstudio/shiny).
Overall, socceranalysisR is a valuable addition to the R ecosystem as it
provides a specialized tool for analyzing and understanding soccer data
without the need for writing complex code, this can be especially useful
for users who may not have extensive experience with data analysis.
## Installation
You can install the development version of socceranalysisR from
[GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("UBC-MDS/socceranalysisR")
```
## Example
This is a basic example which shows you how to solve a common problem:
```{r example}
library(socceranalysisR)
# Visualization
## toy data
small_data <- data.frame(age = - c(18, 20, 20, 25, 25, 24, 25, 30, 33, 32, 44, 30, 17),
Wages_Euros = c(300000, 575000, 150000, 475000,375000, 225000, 225000, 200000, 225000, 250000, 250000, 375000, 200000))
## scatter plots
soc_viz_stats('age', 'Wages_Euros', T , small_data)
## histograms
soc_viz_stats('age', 'Wages_Euros', FALSE , small_data)
## getting outliers
soc_get_outliers(small_data,Wages_Euros,"SD",2)
## toy data set for ranking
my_data <- data.frame (Name = c("Flora", "Mary", "Sarah", "Ester", "Sophie", "Maria"),
second_column = c(1, 4, 3, 5, 1, 6),
third_column = c(7, 30, 12, 17, 34,8))
## ranking players by their names based on attributes
rankingplayers(my_data, "second_column")
## toy dataset to test soc_find_team_stat
small_df <- data.frame(Team= c("A", "B", "B", "B", "C", "C") ,
Market_Value_Euros = c(375000, 225000, 225000, 200000, 225000, 250000),
age = c(18, 20, 20, 25, 25, 24))
## a team's descriptive statistics table and box-plot of selected feature
soc_find_team_stat(small_df, "B", Market_Value_Euros)
```
## Contributors
| Core contributor | Github.com username |
|------------------|---------------------|
| Flora Ouedraogo | @florawendy19 |
| Gaoxiang Wang | @louiewang820 |
| Manvir Kohli | @manvirsingh96 |
| Vincent Ho | @vincentho32 |
## Contributing
Authors: Vincent Ho, Manvir Singh Kohli, Gaoxiang Wang, Flora Ouedraogo
Interested in contributing? Check out the contributing guidelines.
Please note that this project is released with a Code of Conduct. By
contributing to this project, you agree to abide by its terms.
## License
`socceranalysisR` was created by Gaoxiang Wang, Manvir Kohli, Vincent Ho
and Flora Ouedraogo. It is licensed under the terms of the MIT license.