geneID v3 is a browser-based application that takes a user-defined sequenceID, or a comma-separated list of sequenceIDs, and returns the corresponding coding sequences (cds) and polypeptide (pep) sequences in a new tab.
The user can download the displayed results as TAB-separated (.txt) files, FASTA (.fasta) files, or both.
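The two download formats differ only in how each record is laid out. A minimal Python sketch of that difference, assuming each result is a (sequenceID, sequence) pair; the function names `to_tab` and `to_fasta` and the 60-character line width are illustrative choices, not taken from the geneID scripts:

```python
def to_tab(records):
    """One record per line: sequenceID<TAB>sequence."""
    return "".join(f"{seq_id}\t{seq}\n" for seq_id, seq in records)


def to_fasta(records, width=60):
    """FASTA layout: a '>' header line, then the sequence wrapped at `width`."""
    lines = []
    for seq_id, seq in records:
        lines.append(f">{seq_id}")
        # Slice the sequence into fixed-width chunks (assumed width, not a geneID setting).
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in lines)


# Made-up example records, for illustration only.
example = [("id1", "ATGC"), ("id2", "A" * 70)]
tab_text = to_tab(example)
fasta_text = to_fasta(example)
```

Writing `tab_text` to a `.txt` file and `fasta_text` to a `.fasta` file would give the two kinds of download described above.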
The user can also run a query at one of three hierarchical levels of information:
- Considering representative sequences only
- Considering alternative and representative sequences
- All sequences – i.e. alternative, representative and cultivar-specific sequences
At present, geneID contains potato sequenceIDs and their corresponding sequences; the application itself, however, is organism-agnostic.
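To make the query model more concrete, here is a minimal Python sketch of how a comma-separated query and the three levels could be handled. The row layout (a `category` field with the values `representative`, `alternative` and `cultivar-specific`), the level names and the example IDs are all assumptions for illustration; the actual application may store and filter this information differently.

```python
# Hypothetical mapping from query level to the sequence categories it includes.
LEVELS = {
    "representative": {"representative"},
    "alternative": {"representative", "alternative"},
    "all": {"representative", "alternative", "cultivar-specific"},
}


def parse_query(text):
    """Split a comma-separated query into unique, non-empty sequenceIDs, keeping order."""
    ids = []
    for part in text.split(","):
        seq_id = part.strip()
        if seq_id and seq_id not in ids:
            ids.append(seq_id)
    return ids


def lookup(rows, query, level):
    """Return the rows whose ID is in the query and whose category is allowed at `level`."""
    wanted = set(parse_query(query))
    allowed = LEVELS[level]
    return [r for r in rows if r["id"] in wanted and r["category"] in allowed]


# Made-up rows standing in for the database table.
rows = [
    {"id": "geneA.1", "category": "representative", "cds": "ATGAAA", "pep": "MK"},
    {"id": "geneA.2", "category": "alternative", "cds": "ATGAAG", "pep": "MK"},
    {"id": "geneA.cv1", "category": "cultivar-specific", "cds": "ATGAGA", "pep": "MR"},
]
```

With these rows, the same query returns one, two or three hits depending on the level chosen.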
Fig1: Landing page, one sequenceID query
Fig2: Landing page, comma-separated list query
Fig3: Results Tab
The user needs to install:
- Visual Studio Code (April 2022, v 1.67)
- In Visual Studio Code, add the extension SQL Server (mssql)
- XAMPP (v 8.1.6, PHP 8.1.6)
Note: Install only the Apache server; the other components (MySQL, FileZilla, Mercury and Tomcat) are not needed
- Microsoft SQL Server Management Studio (v 18.11.1)
- PyCharm (v 2022.1.1)
- Prepare/download the table(s) containing sequenceIDs and the corresponding sequences (e.g. preprocessed CDS and pep)
- Import the tables in Microsoft SQL Server Management Studio
right-click on your database -> Tasks -> Import Flat File -> select table(s)
- Download all files for Visual Studio Code from the Scripts directory
- Deposit all necessary files into the htdocs directory in your XAMPP installation folder
- Connect the database from Microsoft SQL Server Management Studio to Visual Studio Code
Note: Follow the steps listed at Connect to your database
- Open XAMPP and press the Start button (to the right of the Apache module)
- Open your browser and type localhost/the-name-of-directory, where the-name-of-directory is the directory in which you saved your Visual Studio Code files
Download all FASTA files and install PyCharm.
Reformatting FASTA files to tab-delimited files:
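out_lines = []
temp_line = ''
# Put each FASTA record on one line: header<TAB>sequence
with open('path/to/file', 'r') as fp:
    for line in fp:
        if line.startswith('>'):
            if temp_line:  # nothing to store before the first header
                out_lines.append(temp_line)
            temp_line = line.strip() + '\t'
        else:
            temp_line += line.strip()
if temp_line:  # store the last record, which no later header flushes
    out_lines.append(temp_line)
with open('path/to/new_file', 'w') as fp_out:
    fp_out.write('\n'.join(out_lines))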
Replace path/to/file with the path to a FASTA file, and path/to/new_file with the path of the tab-delimited file to be written.
Code taken from: stackoverflow 'convert a fasta file to a tab-delimited file using python script'
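For converting several FASTA files in one go, the same idea can be wrapped in functions. This is a sketch, not part of the geneID scripts: the names `fasta_to_rows`, `convert_file` and `convert_dir` are hypothetical, the `.fasta` and `.txt` extensions are assumed, and the leading `>` is dropped from each header on the assumption that this is more convenient for import.

```python
from pathlib import Path


def fasta_to_rows(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:  # flush the final record
        yield header, "".join(chunks)


def convert_file(fasta_path, tab_path):
    """Write one 'header<TAB>sequence' line per record; return the record count."""
    count = 0
    with open(fasta_path) as src, open(tab_path, "w") as dst:
        for header, seq in fasta_to_rows(src):
            dst.write(f"{header}\t{seq}\n")
            count += 1
    return count


def convert_dir(fasta_dir, out_dir):
    """Convert every *.fasta file in fasta_dir; return {file name: record count}."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    return {
        p.name: convert_file(p, out / (p.stem + ".txt"))
        for p in sorted(Path(fasta_dir).glob("*.fasta"))
    }
```

Calling `convert_dir('path/to/fasta_dir', 'path/to/out_dir')` would then write one tab-delimited `.txt` file per FASTA file.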
In Microsoft SQL Server Management Studio, import the converted files into the designated database.
Right-click on your database -> Tasks -> Import Flat File -> pick your file
If needed, modify the column names and data types.
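Choosing column types is easier when the longest value in each column is known beforehand. A small Python sketch that reports this for tab-delimited text; the function name `max_field_lengths` is hypothetical, and which SQL type to pick from the result is left to the user.

```python
def max_field_lengths(lines, sep="\t"):
    """Return the maximum field length seen in each column of delimited text lines."""
    widths = []
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        for i, field in enumerate(fields):
            if i == len(widths):  # first time this column index is seen
                widths.append(0)
            widths[i] = max(widths[i], len(field))
    return widths
```

For a converted file this could be used as `with open('path/to/new_file') as fp: print(max_field_lengths(fp))`; a very large value in the sequence column suggests choosing a type without a small length limit.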
Merge the tables imported from the FASTA files using:
SELECT * FROM (name_of_your_table)
UNION
SELECT * FROM (name_of_second_table)
(...)
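The effect of the `UNION` query can be tried out without a SQL Server instance. The sketch below uses Python's built-in `sqlite3` module as a stand-in (an assumption made for portability; the application itself uses Microsoft SQL Server), with made-up table names and rows. It shows that `UNION` drops rows that are identical in both tables, whereas `UNION ALL` keeps them.

```python
import sqlite3

# In-memory database with two small, made-up tables of the same shape.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cds_a (seq_id TEXT, sequence TEXT);
    CREATE TABLE cds_b (seq_id TEXT, sequence TEXT);
    INSERT INTO cds_a VALUES ('id1', 'ATGAAA'), ('id2', 'ATGCCC');
    INSERT INTO cds_b VALUES ('id2', 'ATGCCC'), ('id3', 'ATGGGG');
""")

# UNION removes the duplicate ('id2', 'ATGCCC') row.
merged = conn.execute(
    "SELECT * FROM cds_a UNION SELECT * FROM cds_b ORDER BY seq_id"
).fetchall()

# UNION ALL keeps every row from both tables.
merged_all = conn.execute(
    "SELECT * FROM cds_a UNION ALL SELECT * FROM cds_b"
).fetchall()
conn.close()
```

If the source tables are known not to overlap, `UNION ALL` avoids the duplicate check; otherwise `UNION` gives one row per distinct record.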
for f in *fasta; do
    echo "$f"
    xargs faidx -d ' ' "$f" \
        < 5cv_weak-components_extract-IDs.txt \
        > "./out/subset_$f" 2> "./err/subset_$f.error"
done
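The loop above appears to pull the sequences whose IDs are listed in `5cv_weak-components_extract-IDs.txt` out of every FASTA file in the current directory, using the `faidx` command. Roughly the same subsetting can be sketched in plain Python for readers who do not have that tool. The function name `subset_fasta` is hypothetical, and the sketch matches on the first whitespace-delimited word of each header, which is an assumption about how the IDs are written.

```python
def subset_fasta(lines, wanted_ids):
    """Return the FASTA lines of records whose ID (first word of the header) is wanted."""
    wanted = set(wanted_ids)
    keep = False
    out = []
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            # Decide once per record whether its lines should be kept.
            keep = bool(fields) and fields[0] in wanted
        if keep:
            out.append(line.rstrip("\n"))
    return out


# Made-up FASTA content, for illustration only.
fasta = [">id1 desc\n", "ATG\n", "AAA\n", ">id2\n", "CCC\n", ">id3\n", "GGG\n"]
subset = subset_fasta(fasta, ["id1", "id3"])
```

Unlike an indexed lookup, this reads each file from start to end, which is simple but slower for very large files.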
https://fairdomhub.org/data_files/3420
https://bioinformatics.stackexchange.com/questions/14818/creating-a-tab-delimited-file
https://www.mysql.com/products/workbench/migrate/
https://dev.mysql.com/doc/workbench/en/wb-migration-install.html
https://blog.appseed.us/flask-react-full-stack-seed-projects/
https://dev.to/dev_elie/connecting-a-react-frontend-to-a-flask-backend-h1o
https://flask.palletsprojects.com/en/2.0.x/
https://dev.to/gajesh/the-complete-flask-beginner-tutorial-124i
https://github.com/neo4j/neo4j
https://fairdomhub.org/assays/1268?graph_view=tree