achigbrow/CAS-523-Proj-2

CAS-523-Proj-2

Genome

Wuhan-Hu-1

We are using the complete genome of the SARS-CoV-2 strain Wuhan-Hu-1, obtained from the National Library of Medicine, for the first part of the project. The text version of the genome can be found in genomes/.

When reading this genome, we can break it into the 5'UTR region (sites 1-265) and the first gene, ORF1ab (sites 266-21555), according to the FASTA file linked in the above reference. Site numbers are 1-based and inclusive.
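Because the site ranges are 1-based and inclusive while Python slices are 0-based and end-exclusive, a small helper can make the conversion explicit. This is only a sketch: the helper name `slice_sites` and the toy sequence are hypothetical and not part of the repository.

```python
def slice_sites(genome: str, start: int, end: int) -> str:
    """Return the nucleotides at 1-based, inclusive sites start..end."""
    if not 1 <= start <= end <= len(genome):
        raise ValueError(
            f"sites {start}..{end} fall outside a genome of length {len(genome)}"
        )
    # Site 1 is index 0, and the inclusive end site becomes the exclusive slice bound.
    return genome[start - 1:end]


# Toy sequence standing in for the real genome (hypothetical data).
toy_genome = "ACGTACGTAC"
utr_like = slice_sites(toy_genome, 1, 4)    # sites 1-4
gene_like = slice_sites(toy_genome, 5, 10)  # sites 5-10
```

With the real sequence loaded, `slice_sites(genome, 1, 265)` would give the 5'UTR and `slice_sites(genome, 266, 21555)` the gene region described above.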

genomes/fasta_reader.py can be used to read the wuhan-hu-1.txt file by calling its read_wuhan_1 function with the filepath to wuhan-hu-1.txt. The function returns two strings: hu1_full_genome and hu1_rbd. hu1_full_genome is the complete sequence. hu1_rbd consists of the nucleotides at sites 21563..25384, which are intended to correspond to the RBD; note that in the reference annotation this range is the full spike (S) gene, which contains the RBD. The script also has an argument parser and can be called from the terminal:

python fasta_reader.py [-filepath=<file/you/are/reading> -opt=<which file you are reading>]
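For orientation, here is a minimal sketch of what a reader like read_wuhan_1 might do, assuming the file uses the standard FASTA layout (one header line starting with `>` followed by sequence lines). It is not the repository's actual implementation: only the function name, the two return values, and the site range come from the text above, while the function body, the helper `parse_fasta_text`, and the constants are assumptions.

```python
from pathlib import Path
from typing import Tuple

# Site range given above for hu1_rbd (1-based, inclusive).
RBD_START, RBD_END = 21563, 25384


def parse_fasta_text(text: str) -> str:
    """Join the sequence lines of a single-record FASTA string (hypothetical helper)."""
    lines = [line.strip() for line in text.splitlines()]
    # Skip the '>' header line and any blank lines; keep only sequence data.
    return "".join(
        line for line in lines if line and not line.startswith(">")
    ).upper()


def read_wuhan_1(filepath: str) -> Tuple[str, str]:
    """Return (hu1_full_genome, hu1_rbd) read from the file at filepath."""
    hu1_full_genome = parse_fasta_text(Path(filepath).read_text())
    # Convert the 1-based inclusive range to a 0-based, end-exclusive slice.
    hu1_rbd = hu1_full_genome[RBD_START - 1:RBD_END]
    return hu1_full_genome, hu1_rbd
```

Under these assumptions, hu1_rbd would be 3822 nucleotides long (25384 - 21563 + 1) whenever the genome covers the whole range.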

The only option for -opt at this stage is 0, which selects read_wuhan_1. If we use a different FASTA file for part 2, we can update the reader to portion it appropriately and add a new option for it.
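One way to leave room for further -opt values is a dispatch table that maps each option number to a reader function. The sketch below is an assumption about how this could be wired with `argparse`, not the script's actual code: the flag names come from the command above, but `READERS`, `build_parser`, `main`, the stub reader, and the example path are hypothetical.

```python
import argparse


def read_wuhan_1(filepath):
    """Stand-in for the real reader; returns placeholder strings (hypothetical stub)."""
    return "FULL_GENOME", "RBD"


# Map each -opt value to the reader that knows how to portion that file.
# A reader for a second FASTA file could be registered here under a new key.
READERS = {0: read_wuhan_1}


def build_parser():
    parser = argparse.ArgumentParser(description="Read a genome file.")
    parser.add_argument("-filepath", required=True, help="path to the file to read")
    parser.add_argument("-opt", type=int, choices=sorted(READERS), default=0,
                        help="which file is being read (0 = read_wuhan_1)")
    return parser


def main(argv=None):
    # argv=None makes argparse fall back to the real command-line arguments.
    args = build_parser().parse_args(argv)
    return READERS[args.opt](args.filepath)


# Equivalent to: python fasta_reader.py -filepath=genomes/wuhan-hu-1.txt -opt=0
full_genome, rbd = main(["-filepath=genomes/wuhan-hu-1.txt", "-opt=0"])
```

Restricting `choices` to the keys of `READERS` means an unsupported -opt value is rejected by the parser itself, so adding a reader for part 2 would only require one new dictionary entry.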

References:
