Skip to content

Latest commit

 

History

History
47 lines (30 loc) · 2.23 KB

README.md

File metadata and controls

47 lines (30 loc) · 2.23 KB

Website Twitter Follow

logo

Activark is a data-driven, ML-based approach to predict the functional consequence of genetic changes in protein kinases. Activark was trained on a curated dataset of activating (i.e. constitutive-activation or increase in kinase activity), deactivating (i.e. loss or decrease in Kinase activity), and drug-resistance protein variants in human kinases and using sequence and structural features.

Briefly, we applied a random forest algorithm to develop 3 contrasting predictors based on seven types of sequence and structural features:

  1. Pred (A v D): The first predictor, activating vs deactivating, represents a typical situation when one has what is believed to be a functional variant (e.g. observed many times in a cohort or dataset) and wishes to distinguish these two possibilities.
  2. Pred (A vs D vs N): The second, activating, deactivating or neutral, is more reflective of a situation where one does not know if a variant is functional at all and thus one needs to predict neutrals.
  3. Pred (R vs N):The third predictor, resistance vs neutral, predicts if a given mutation is resistant or not.

To access the Activark webservice, go to here To know more about Activark, visit here


How to run Activark locally?

If you wish run Activark locally on your system, follow the steps below:

Create the environment (activark) from the environment.yml file

conda env create -f environment.yml

Activate the environment

conda activate activark

Move to the predictor directory (required)

cd ML/

Read the help section

./prepareTestData.py -h

Example of input:

./prepareTestData.py sample_mutations.txt


Performance of Activark

10-fold stratified CV results of all the 3 predictors ROC


Contact

Gurdeep Singh: gurdeep.singh[at]bioquant[dot]uni-heidelberg[dot]de
Torsten Schmenger: torsten.schmenger[at]bioquant[dot]uni-heidelberg[dot]de