ConPosER - CONserved POSition identifiER

Identify conserved positions in amino acid multiple sequence alignments in R.

The sequence-structure-function paradigm follows that a protein’s primary sequence dictates its three-dimensional structure, which in turn influences function. Sequence determinants are generally restricted to a limited number of residues, evidenced by structurally similar proteins having low sequence identity and the ability to experimentally mutate many positions along the protein sequence with no impact on activity. Those positions that do affect activity when mutated tend to be located at the protein's core or at functional sites. It follows that residues which are conserved across a protein family are likely determinants of function, as any mutations at these sites abolish function which would exclude these mutants from selection and sequencing.

ConPosER generates a multiple sequence alignment (MSA) of input amino acid (AA) sequences, then identifies positions 100% conserved across all sequences. It can then plot the 2D location of these positions on a gene map. MSAs are generated using the msa package.

ConPosER includes two functions:

conposer_id(): Takes input AA sequences, and outputs list of 100% conserved positions and the residue observed there.
conposer_plot(): Plots the 100% conserved positions on 2D gene map, plus a barplot of the number of AAs observed at each position. Calls a sub-function conposer_geneplot() which includes the formatting for the geneplot.

Content

Install

Install directly from GitHub:

source("https://raw.github.com/kevinsblake/ConPosER/main/ConPosE.R")

Alternatively, can download and then install using the filepath:

source("dir1/dir2/pathotype.R")

Functions

`conposer_id()`

Description

Function to list each conserved position and the amino acid at that positions.

Usage

library(Biostrings)
library(msa)
library(dplyr)

conposer_id(seqs.file="sequences.fasta", msa=c("ClustalW", "ClustalOmega", "Muscle"), gap.lim=0.30)

Arguments

seqs.file Filepath to fasta file containing input sequences.

msa Specifies the multiple sequence alignment algorithm used to align the input sequences.

gap.lim Specifies the fraction of sequences that can have a gap at a given position. Columns with gaps greater than this fraction are ignored. (Default = 0.30)

Examples

> df <- conposer_id("sequences.fasta", msa="ClustalOmega")
using Gonnet
> head(df)
  pos AA
1  39  G
2  44  G
3  51  L
4  62  E
5  70  R
6  73  G

`conposer_plot()`

Description

Function for generating a gene plot of the MSA indicating the locations of the conserved positions, plus a barplot of the number of AAs observed at each position.

Usage

library(Biostrings)
library(msa)
library(dplyr)

conposer_plot(seqs.file="sequences.fasta", filename="geneplot.pdf", linecol="black", gap.lim=0.30)

Arguments

seqs.file Path and filename of fasta file containing input sequences.

filename Path and filename of output geneplot. Default is "geneplot.pdf"

linecol Color of lines indicating conserved positions. Default is "black."

gap.lim Specifies the fraction of sequences that can have a gap at a given position. Columns with gaps greater than this fraction are ignored. (Default = 0.30)

Examples

> conposer_plot(seqs.file="sequences.fasta", filename="geneplot.pdf", linecol="red")
using Gonnet
pdf 
  2

References

Bodenhofer et al. msa: an R package for multiple sequence alignment. Bioinformatics. 2015 Dec 15; 31(24):3997-9. doi: 10.1093/bioinformatics/btv494.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
photos		photos
.gitignore		.gitignore
ConPosE.R		ConPosE.R
ConPosER.Rproj		ConPosER.Rproj
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ConPosER - CONserved POSition identifiER

Content

Install

Functions

`conposer_id()`

Description

Usage

Arguments

Examples

`conposer_plot()`

Description

Usage

Arguments

Examples

References

About

Releases

Packages

Languages

kevinsblake/ConPosER

Folders and files

Latest commit

History

Repository files navigation

ConPosER - CONserved POSition identifiER

Content

Install

Functions

conposer_id()

Description

Usage

Arguments

Examples

conposer_plot()

Description

Usage

Arguments

Examples

References

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

`conposer_id()`

`conposer_plot()`

Packages