Option to turn OFF the filtering of low complexity regions #5
Comments
I think this should be possible. I will need to add an option to the command file to turn off filtering. If I understand correctly, this would mean supplying the following options to BLAST when running the forward BLAST search:
If you have any thoughts on this, please let me know. I will do my best to implement it this week.
I looked through the help pages for different blasts, and it looks like -dust filtering option exists only for blastn. Other blasts have -seg filtering option. According to the BLAST manual, segmasker is used to mask low-complexity regions of protein sequences, while dustmasker is used to do a similar thing for nucleotide sequences. I don't know for sure, but since the forward BLAST search uses tblastn, I think we need to change only one thing:
-seg no
# -soft_masking is false by default in tblastn
Let me know what you think.
Thanks, this is helpful. It is possible to perform DIGS using blastn, so it looks as though I will need to implement this slightly differently depending on which BLAST program is being used.
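The program-dependent handling discussed above could look roughly like the following. This is only a minimal sketch in Python, not DIGS code: the function names `low_complexity_off_args` and `build_blast_command` are hypothetical, and it assumes the BLAST+ options mentioned in this thread (`-dust no` for blastn, `-seg no` for the protein-based programs).

```python
# Hypothetical helpers for illustration only; they are not part of DIGS.

# BLAST+ programs that mask low-complexity regions with SEG (protein-level).
SEG_PROGRAMS = {"blastp", "blastx", "tblastn", "tblastx"}


def low_complexity_off_args(program):
    """Return the command-line arguments that switch off low-complexity
    filtering for the given BLAST+ program."""
    if program == "blastn":
        # Nucleotide searches use DUST for low-complexity masking.
        return ["-dust", "no"]
    if program in SEG_PROGRAMS:
        # Protein-level searches use SEG for low-complexity masking.
        return ["-seg", "no"]
    raise ValueError(f"Unsupported BLAST program: {program}")


def build_blast_command(program, query, db, disable_filter=False):
    """Assemble a BLAST+ command as an argument list (nothing is executed)."""
    cmd = [program, "-query", query, "-db", db]
    if disable_filter:
        cmd += low_complexity_off_args(program)
    return cmd
```

For example, `build_blast_command("tblastn", "probe.faa", "genome_db", disable_filter=True)` yields a tblastn command ending in `-seg no`, whereas the same call with `"blastn"` ends in `-dust no`. The file and database names here are placeholders.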
I've updated the DIGS code so that BLAST arguments can now be provided in the .ctl file, in the screensets block. e.g.
Hi Rob, I've been testing DIGS for the last couple of weeks. I've run it with the same settings as before, but this time turning off the pre-filtering.
I've managed to find missing Pro-rich protein regions this time. However, the problem now is that they have the wrong labelling (assigned_gene_name). I'm wondering if this has something to do with pre-filtering for the reverse BLAST run. When I turn off the pre-filtering, does it work only for the forward BLAST run or the reverse run as well? For instance, I've noticed that tblastn is now running with
OK thanks, this is good to know about. I'll take a look and get back to you.
Good afternoon,
I am wondering if there is a possibility to turn OFF the filtering of low-complexity regions in DIGS or make the filtering optional (e.g. in the control file). I am searching for a protein with a region of low complexity, and unfortunately, this region cannot be found unless the filtering is switched off. I tested it using DIGS, NCBI tblastn, and Ensembl tblastn. NCBI and Ensembl allow me to find the exon corresponding to this region only when I deselect the "Filter low complexity region" option, while I cannot find the region at all using DIGS. I hope this makes sense.
Thanks a lot in advance!