HLADiversity R Package

The goal of HLADiversity is to estimate the distinctiveness of HLA alleles.

The package does the following:
- calculates HLA allele frequency
- plots HLA allele frequency
- plots HLA allele count
- compares the frequency of a target dataset to a reference population
- plots the diversity of HLA alleles.

Data Format

The input data must be an HLA PED file in TSV (tab-separated) format. The columns of the HLA PED file should follow the format shown below.

- Include columns for HLA class I and II alleles (A, B, C, DQA1, DQB1, DRB1, DPA1, DPB1)
- Use "." as the separator in HLA allele column names (for example, A.1 and A.2)
- Include a population column

A.1	A.2	B.1	B.2	C.1	C.2	DQA1.1	DQA1.2	DQB1.1	DQB1.2	DRB1.1	DRB1.2	DPA1.1	DPA1.2	DPB1.1	DPB1.2
A*30:02	A*32:01	B*42:01	B*44:03	C*04:01	C*17:01	DQA1*02:01	DQA1*04:01	DQB1*02:01	DQB1*04:02	DRB1*03:02	DRB1*07:01	DPA1*01:03	DPA1*03:03	DPB1*04:01	DPB1*04:02
A*02:03	A*33:03	B*38:02	B*58:01	C*03:02	C*07:02	DQA1*01:02	DQA1*03:01	DQB1*03:02	DQB1*06:09	DRB1*04:03	DRB1*13:02	DPA1*01:03	DPA1*02:02	DPB1*04:02	DPB1*394:01

If your HLA PED file does not have a population column, create one with the name of the dataset. For example:

hped$Population <- "1KG"
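As a minimal sketch of the step above, the snippet below reads a PED file with base R and adds the population column only when it is missing. The temporary file and its two-column contents are illustrative assumptions, not data shipped with the package; substitute the path to your own file.

```r
# Write a tiny two-column HLA PED file to a temporary path (illustrative data only).
ped_path <- tempfile(fileext = ".tsv")
writeLines(c("A.1\tA.2", "A*30:02\tA*32:01", "A*02:03\tA*33:03"), ped_path)

# Read the tab-separated file; check.names = FALSE keeps column names such as "A.1" unchanged.
hped <- read.delim(ped_path, sep = "\t", check.names = FALSE, stringsAsFactors = FALSE)

# Add the population column only if the file does not already have one.
if (!"Population" %in% colnames(hped)) {
  hped$Population <- "1KG"
}
```

The single label is recycled across all rows, so every sample is assigned to the same dataset name.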

Installation

You can install the development version of HLADiversity from GitHub with:

devtools::install_github("yang-luo-lab/HLADiversity")

Running the functions

The example datasets are derived from the 1000 Genomes (reference) and GGVP (target) datasets: https://www.internationalgenome.org/

Load Library

library(HLADiversity)

Function 1: Calculate HLA allele frequency

head(calculate_HLA_frequency(reference))
#> Loading required package: pacman
#>    allele        freq count
#> 1 A*30:02 0.028009839  1207
#> 2 A*02:03 0.002784740   120
#> 3 A*02:01 0.200106748  8623
#> 4 A*01:01 0.093451221  4027
#> 5 A*24:02 0.081917757  3530
#> 6 A*02:07 0.006312076   272
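To illustrate what the freq and count columns represent, here is a hedged base R sketch that pools the two allele columns of one gene and divides each allele count by the total number of allele copies. It is not the package's implementation, and the toy genotypes and the `freq_table` name are assumptions made for this example.

```r
# Illustrative genotypes for one gene (two allele columns per sample).
hped <- data.frame(
  A.1 = c("A*02:01", "A*02:01", "A*30:02"),
  A.2 = c("A*01:01", "A*02:01", "A*02:01"),
  stringsAsFactors = FALSE
)

# Pool both allele columns, then count each distinct allele.
alleles <- c(hped$A.1, hped$A.2)
counts <- table(alleles)

# Frequency = allele count / total number of allele copies.
freq_table <- data.frame(
  allele = names(counts),
  freq = as.numeric(counts) / length(alleles),
  count = as.integer(counts),
  stringsAsFactors = FALSE
)

# Sort so the most common allele comes first.
freq_table <- freq_table[order(-freq_table$count), ]
```

Under this definition the frequencies for a gene sum to 1.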

Function 2: Plot HLA allele frequencies

Plot_HLA_allele_frequency(target, minFreq = 0.05)

Function 3: Plot HLA allele counts

Plot_HLA_allele_count(reference)
#> Warning: Expected 2 pieces. Additional pieces discarded in 1 rows [481].
#> Warning: Expected 2 pieces. Missing pieces filled with `NA` in 1 rows [757].

Function 4: Compare frequency of target to reference

Plot_HLA_target_vs_ref(target, reference)
#> `geom_smooth()` using formula = 'y ~ x'

Function 5: Plot HLA allele diversity

plot_HLA_Diversity(reference, gene = "A", ntop = 5)
#> Warning: Expected 2 pieces. Additional pieces discarded in 1 rows [1201].
#> Warning: Expected 2 pieces. Missing pieces filled with `NA` in 1 rows [1].
#> Warning: Expected 2 pieces. Additional pieces discarded in 1 rows [1200].

Help

To see a description of each function and how to run it, type a question mark before the function name in the R console.

?calculate_HLA_frequency
