Question about the input of extract_codon_alignment.py #194
Comments
The first problem is solved: the transcript_id isn't a file, just an ID.
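Since transcript_id is a plain identifier rather than a file path, one way to process many transcripts is to read the IDs from the reference BED and build one command per ID. The following is a minimal sketch, not part of TOGA: the helper names are hypothetical, it assumes the transcript ID sits in the BED name column (4th field), and it copies the argument order and the `-s` flag verbatim from the command quoted in this thread without interpreting them.

```python
import shlex
from pathlib import Path


def transcript_ids_from_bed(bed_path):
    """Return the name column (4th field) of each BED record.

    Assumption: the reference BED stores the transcript ID in column 4.
    """
    ids = []
    with open(bed_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            # Skip blank lines, comment/header lines, and short records.
            if len(fields) < 4 or line.startswith(("#", "track", "browser")):
                continue
            ids.append(fields[3])
    return ids


def build_command(transcript_id, input_dirs, reference_bed, out_dir, macse_caller):
    """Assemble one invocation, mirroring the command shown in this thread."""
    out_file = Path(out_dir) / f"{transcript_id}.fa"
    return [
        "python3", "extract_codon_alignment.py",
        "-o", str(out_file),
        "-s",  # copied as-is from the thread's command
        input_dirs, reference_bed, transcript_id,
        "--macse_caller", macse_caller,
    ]


def format_command(cmd):
    """Render the argument list as a shell-safe string for a dry run."""
    return shlex.join(cmd)
```

Printing `format_command(build_command(...))` for each ID gives a dry run to inspect before handing the list to `subprocess.run`.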
Hi,
Regarding codonAlignments.allCESARexons.fa.gz: please use the new codonAlignments.fa.gz instead.
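Before running anything, it can help to check which downloaded species directories actually contain codonAlignments.fa.gz. This is a minimal sketch under the assumption that each species has its own directory with the file at the top level; the function name is hypothetical and nothing here is part of TOGA.

```python
from pathlib import Path

# File name recommended in the reply above.
PREFERRED = "codonAlignments.fa.gz"


def classify_species_dirs(species_dirs):
    """Split directories into those with the preferred file and those without."""
    ready, missing = [], []
    for directory in map(Path, species_dirs):
        if (directory / PREFERRED).is_file():
            ready.append(directory)
        else:
            missing.append(directory)
    return ready, missing
```

Directories that end up in `missing` are the ones to re-download or inspect by hand.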
Thank you for your reply. I would like to ask another question.
I want to get CDS multiple sequence alignment files for single-copy orthologous genes across multiple species. I used this script, cleaned the alignments with HmmCleaner, and filtered them based on the integrity of the start codons. The remaining genes include some with alternative splicing. I only need to keep the most suitable transcript for each gene, right? And I no longer need to filter for premature termination or gene duplications, correct?
The command I ran:
python3 extract_codon_alignment.py -o ./cds.msa/ENSMUST00000070533.Xkr4.fa -s input_dirs toga.transcripts.bed ENSMUST00000070533.Xkr4 --macse_caller "java -jar /home/software/MACSE_V2_PIPELINES-11.05/UTILS/macse_v2.03.jar"
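Keeping one transcript per gene can be sketched as follows. This is only one possible heuristic, not a recommendation from the maintainers: it prefers the transcript whose alignment covers the most species and, among those, the one with the most ungapped bases. The data layout (a gene-to-transcripts mapping and per-transcript alignments held as dictionaries) and all names are assumptions made for illustration.

```python
def score_alignment(alignment):
    """Score an alignment given as {species: aligned_sequence}.

    Returns (number of species, total ungapped bases); larger is better.
    """
    n_species = len(alignment)
    ungapped = sum(len(seq) - seq.count("-") for seq in alignment.values())
    return (n_species, ungapped)


def pick_one_transcript_per_gene(gene_to_transcripts, alignments):
    """Choose the best-scoring transcript for each gene.

    gene_to_transcripts: {gene: [transcript_id, ...]}
    alignments: {transcript_id: {species: aligned_sequence}}
    Genes with no aligned transcript are left out of the result.
    """
    chosen = {}
    for gene, transcripts in gene_to_transcripts.items():
        candidates = [t for t in transcripts if t in alignments]
        if candidates:
            # Sorting first makes ties resolve to the same ID on every run.
            chosen[gene] = max(
                sorted(candidates),
                key=lambda t: score_alignment(alignments[t]),
            )
    return chosen
```

Other criteria, such as choosing the longest reference isoform, may suit a given analysis better; the scoring function is the only part that would need to change.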
Dear professor,
I am attempting to obtain multi-sequence alignments of orthologous genes using the results from TOGA. Some of the species are from my own analysis, while others are from Zoonomia. I have two questions:
(1) I ran a test with a few species, but I encountered the error message: "# Warning! TOGA didn't find transcript_id orthologs for the following species." I suspect this may be due to a misunderstanding of the formats for the input_dirs, reference_bed, and transcript_id files. Could you kindly provide some guidance on how to resolve this issue?
(2) Some of the files downloaded from Zoonomia have either been renamed or are missing, which may affect the execution of extract_codon_alignment.py. For instance, there are files like codonAlignments.allCESARexons.fa.gz and codonAlignments.fa.gz. Should I use one of these as codon.fasta? Also, some files like query_isoforms.tsv are missing. Could this be problematic?
I would greatly appreciate any advice you can provide.
Best wishes