IndexError with netsyn_getINSDCFiles.py or netsyn.py #1

Open
cmonat opened this issue Jun 9, 2022 · 27 comments

@cmonat

cmonat commented Jun 9, 2022

Hello,

I'm trying to run netsyn and got the following error:

Traceback (most recent call last):
  File "/grid/sw/netsyn/0.1.0/bin/netsyn", line 11, in <module>
    load_entry_point('netsyn==0.1.0', 'console_scripts', 'netsyn')()
  File "/grid/sw/netsyn/0.1.0/lib/python3.8/site-packages/netsyn-0.1.0-py3.8.egg/netsyn/netsyn.py", line 668, in main
    boxesManager(runFromBox, resultsDirectory,
  File "/grid/sw/netsyn/0.1.0/lib/python3.8/site-packages/netsyn-0.1.0-py3.8.egg/netsyn/netsyn.py", line 427, in boxesManager
    runBox(nameBox, resultsDirectory, analysisNumber, ORDERBOX, args)
  File "/grid/sw/netsyn/0.1.0/lib/python3.8/site-packages/netsyn-0.1.0-py3.8.egg/netsyn/netsyn.py", line 535, in runBox
    netsyn_getINSDCFiles.run(args.UniProtACList)
  File "/grid/sw/netsyn/0.1.0/lib/python3.8/site-packages/netsyn-0.1.0-py3.8.egg/netsyn/netsyn_getINSDCFiles.py", line 203, in run
    crossReference[entry]['Cross-reference (embl)'][index],
IndexError: list index out of range

My uniprot_acc.list contains 5951 lines, including the header line (UniProt_AC). Is there a limit on the number of UniProt accessions NetSyn can process?

Have a great day
Regards

C.

@baudstam
Contributor

baudstam commented Jun 9, 2022

Hello,
There is no limit on the number of UniProt accessions. I suspect an error in the format of the input file.
Could you please give me the command line you used and your uniprot_acc.list file?
I will try to reproduce the problem on my side and find a solution.

Best Regards
Mark Stam
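
A format problem of the kind suspected here can be checked before launching a run. The following is a minimal sketch, not part of NetSyn: it assumes the list has one accession per line after a `UniProt_AC` header (as described in the first message) and uses the accession pattern published in the UniProt help pages. The function name `find_bad_lines` is hypothetical.

```python
import re

# Accession pattern as published in the UniProt help pages.
UNIPROT_AC = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)


def find_bad_lines(lines, header="UniProt_AC"):
    """Return (line_number, text) pairs that do not look like accessions.

    Line 1 is expected to be the header (an assumption taken from the
    thread); blank lines are reported as problems too.
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if number == 1:
            if text != header:
                problems.append((number, text))
            continue
        if not UNIPROT_AC.match(text):
            problems.append((number, text))
    return problems
```

To check a real file, pass an open file handle, for example `find_bad_lines(open("uniprot_acc.list"))`; an empty list means every line matched the expected shape.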

@cmonat
Author

cmonat commented Jun 9, 2022

Hello,

the command line I used is:

netsyn -u uniprot_acc.list -o netsyn_test_for_toxphyl

And here is my uniprot list file:
uniprot_acc.txt

Thank you very much for your help
Regards
C.

@cmonat
Author

cmonat commented Jun 14, 2022

Hello,

I don't know if this information helps, but the error message appears right after the download of the M30127.embl INSDC file. I tried reinstalling the tool and rerunning the command, but I got the same error message at the same point in the process.
Any tips to make it work?

Thank you very much in advance
Regards
C.

@baudstam baudstam self-assigned this Jun 14, 2022
@baudstam
Contributor

Hello,
In fact, I found a bug in one of the scripts. I have fixed it, but I am still running some tests.
Once everything works, you will need to install the new version of netsyn.
I will let you know when it is done.
Best regards
Mark

@baudstam
Contributor

Hello,
The bug has been fixed and I was able to process your data.
You need to download the latest version of NetSyn and reinstall it.
Tell me if everything works on your side, so I can close the issue.
Best regards
Mark

@cmonat
Author

cmonat commented Jun 16, 2022

Hello,

I tried again with the latest version and the previous error is gone, but I still get an error. The problem now occurs at the Walktrap clustering step, with the following message:

Traceback (most recent call last):
  File "/grid/sw/netsyn/0.1.0/bin/netsyn", line 11, in <module>
    load_entry_point('netsyn==0.1.0', 'console_scripts', 'netsyn')()
  File "/grid/sw/netsyn/0.1.0/lib/python3.8/site-packages/netsyn-0.1.0-py3.8.egg/netsyn/netsyn.py", line 668, in main
    boxesManager(runFromBox, resultsDirectory,
  File "/grid/sw/netsyn/0.1.0/lib/python3.8/site-packages/netsyn-0.1.0-py3.8.egg/netsyn/netsyn.py", line 427, in boxesManager
    runBox(nameBox, resultsDirectory, analysisNumber, ORDERBOX, args)
  File "/grid/sw/netsyn/0.1.0/lib/python3.8/site-packages/netsyn-0.1.0-py3.8.egg/netsyn/netsyn.py", line 565, in runBox
    netsyn_syntenyFinder.run(proteins, targets, args.WindowSize, args.SyntenyGap,
  File "/grid/sw/netsyn/0.1.0/lib/python3.8/site-packages/netsyn-0.1.0-py3.8.egg/netsyn/netsyn_syntenyFinder.py", line 466, in run
    walktrap_clustering = graph_walktrap.as_clustering()
  File "/grid/sw/netsyn/0.1.0/lib/python3.8/site-packages/igraph-0.9.11-py3.8-linux-x86_64.egg/igraph/clustering.py", line 978, in as_clustering
    membership = community_to_membership(self._merges, num_elts, num_elts - n)
igraph._igraph.InternalError: Error at src/community/community_misc.c:111: Number of steps is greater than number of rows in merges matrix: found 3257 steps, 3052 rows. -- Invalid value

Any idea how I can get past this one?
Thank you very much for your help.

Have a great day
Regards
C.

@baudstam
Contributor

Hello,
I was able to reproduce your error.
I will let you know when I know how to fix it.
Best regards
Mark

@baudstam
Contributor

Hello,
Sorry for my late response.
I was able to process your data with an older version of igraph (0.8.2).
Could you please try installing netsyn with igraph 0.8.2 (pip install python-igraph==0.8.2)?
I am still working on fixing the error with the newer igraph.
In any case, I have the netsyn output for your data and can send you the file if you wish.
Best regards
Mark

@cmonat
Author

cmonat commented Jun 29, 2022

Hello,

thanks for your answer.
I'll ask the IT service to do so and see if I can generate the output. Otherwise I'll come back to you.
Thanks for your help.

Have a great day
Regards
C.

@baudstam
Contributor

baudstam commented Jul 1, 2022

Hello,
You will not be able to use NetSyn for the moment. On June 29, UniProt changed the URL used to query their website.
I need to make some changes to the NetSyn code.
But for the moment I am COVID positive and cannot work.
I will let you know when the new version is available.
Best regards
Mark
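
For readers wondering what such a query looks like after the URL change, here is a hedged sketch that only builds a request URL. It assumes the current UniProt REST search endpoint and the `xref_embl` return field; it is not taken from the NetSyn code, and the helper name `embl_xref_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: the UniProt REST search endpoint, not NetSyn's own constant.
UNIPROT_SEARCH = "https://rest.uniprot.org/uniprotkb/search"


def embl_xref_url(accession):
    """Build a search URL asking for the EMBL cross-references of one accession."""
    params = {
        "query": f"accession:{accession}",
        "fields": "accession,xref_embl",  # assumed return-field names
        "format": "tsv",
    }
    return f"{UNIPROT_SEARCH}?{urlencode(params)}"
```

The resulting URL can be fetched with any HTTP client, for instance `requests.get(embl_xref_url("P12345"), timeout=30)`.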

@baudstam
Contributor

Hello,
I have created a new release of NetSyn. This release changes the way UniProt is queried and fixes the bug with the Walktrap clustering (using the latest version of igraph, 0.9.11).
I was able to process your data with this new version of NetSyn.
Be aware that this new release requires an additional Python library (requests).
Best regards

@cmonat
Author

cmonat commented Aug 29, 2022

Hello,

sorry for taking so long to test the new version, I was on holiday :)
I've tested v0.1.1 and the Walktrap clustering seems to be OK, but... I have another error, this time with the MCL clustering, with the following message:

[INFO] MCL clustering...
Traceback (most recent call last):
  File "/grid/sw/netsyn/0.1.1/bin/netsyn", line 11, in <module>
    load_entry_point('netsyn==0.1.1', 'console_scripts', 'netsyn')()
  File "/grid/sw/netsyn/0.1.1/lib/python3.7/site-packages/netsyn-0.1.1-py3.7.egg/netsyn/netsyn.py", line 669, in main
    analysisNumber, ORDERBOX, args)
  File "/grid/sw/netsyn/0.1.1/lib/python3.7/site-packages/netsyn-0.1.1-py3.7.egg/netsyn/netsyn.py", line 427, in boxesManager
    runBox(nameBox, resultsDirectory, analysisNumber, ORDERBOX, args)
  File "/grid/sw/netsyn/0.1.1/lib/python3.7/site-packages/netsyn-0.1.1-py3.7.egg/netsyn/netsyn.py", line 566, in runBox
    args.SyntenyScoreCutoff, args.ClusteringAdvancedSettings)
  File "/grid/sw/netsyn/0.1.1/lib/python3.7/site-packages/netsyn-0.1.1-py3.7.egg/netsyn/netsyn_syntenyFinder.py", line 545, in run
    matrix_adjacency = nx.to_scipy_sparse_array(nxGraph, weight='weight')
  File "/grid/sw/python/3.7.1/lib/python3.7/site-packages/networkx/__init__.py", line 51, in __getattr__
    raise AttributeError(f"module {__name__} has no attribute {name}")
AttributeError: module networkx has no attribute to_scipy_sparse_array

Any idea how I can get past this one?
Thank you very much for your help.

Have a great day
Regards
C.

@baudstam
Contributor

Hello,
I am working on your bug.
I will let you know when a new version is available.
Best regards
Mark

@baudstam
Contributor

Hello,
Could you tell me which versions of Python and networkx you used?
I tested NetSyn installed with Python 3.7 and networkx 2.6.3 and I get the same error as you.
I also tested NetSyn installed with Python 3.8 and networkx 2.8.6 and I was able to process your data.

I have updated the README on the NetSyn GitHub repository.
Tell me if this solves your problem.
Best Regards
Mark

@cmonat
Author

cmonat commented Nov 25, 2022

Hello,

it took me quite a while to come back to this analysis, but I'm back ^^
Aaaaand I got a new error message right after the "[INFO] Edges content formatting ..." step (we get further each time, which is already a good thing!):

Traceback (most recent call last):
  File "/grid/sw/netsyn/0.1.1/bin/netsyn", line 11, in <module>
    load_entry_point('netsyn==0.1.1', 'console_scripts', 'netsyn')()
  File "/grid/sw/netsyn/0.1.1/lib/python3.8/site-packages/netsyn-0.1.1-py3.8.egg/netsyn/netsyn.py", line 668, in main
    boxesManager(runFromBox, resultsDirectory,
  File "/grid/sw/netsyn/0.1.1/lib/python3.8/site-packages/netsyn-0.1.1-py3.8.egg/netsyn/netsyn.py", line 427, in boxesManager
    runBox(nameBox, resultsDirectory, analysisNumber, ORDERBOX, args)
  File "/grid/sw/netsyn/0.1.1/lib/python3.8/site-packages/netsyn-0.1.1-py3.8.egg/netsyn/netsyn.py", line 572, in runBox
    netsyn_dataExport.run(nodesFile,
  File "/grid/sw/netsyn/0.1.1/lib/python3.8/site-packages/netsyn-0.1.1-py3.8.egg/netsyn/netsyn_dataExport.py", line 641, in run
    full_graph.write_graphml(graphmlOut)
igraph._igraph.InternalError: Error at src/io/graphml.c:1497: Forbidden control character 0x02 found in igraph_i_xml_escape. -- Invalid value

I now run it with python/3.8.5, but I am not sure which version of networkx is installed (BTW, do you know a quick way to check that, please?).

Thanks
Have a great day
C.
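
The igraph message above says a control character (0x02) in some attribute value cannot be written to GraphML, because XML 1.0 forbids most characters below 0x20. One possible workaround, shown here only as a sketch and not as the fix the maintainers applied, is to strip those characters from string attributes before export. The helper names are hypothetical.

```python
import re

# Characters below 0x20 that XML 1.0 forbids (tab, LF and CR are allowed).
_XML_FORBIDDEN = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_xml_control_chars(value):
    """Remove XML-forbidden control characters from str values.

    Non-string values (numbers, None, ...) are returned unchanged.
    """
    if isinstance(value, str):
        return _XML_FORBIDDEN.sub("", value)
    return value


def clean_attributes(attributes):
    """Return a copy of an attribute dict with every string value cleaned."""
    return {key: strip_xml_control_chars(val) for key, val in attributes.items()}
```

Applied to each node and edge attribute dictionary before writing, this would drop a stray 0x02 byte while leaving ordinary text, tabs and newlines untouched.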

@baudstam
Contributor

baudstam commented Dec 8, 2022

Hello,
I am looking into your bug.
To find your networkx version, you can open a prompt and type:
python
import networkx
networkx.__version__

This should print the networkx version.
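
The same information can be collected for several packages at once without importing them. This is a small sketch using the standard-library `importlib.metadata` (Python 3.8 and later); the function name `installed_versions` is hypothetical, and the distribution names to pass in are whatever your installation uses.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions
```

For the packages discussed in this thread, a call such as `installed_versions(["networkx", "scipy", "python-igraph"])` returns a dictionary that can be pasted directly into a bug report.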

@cmonat
Author

cmonat commented Dec 8, 2022

Hello,

thanks for that, and yes I tested with networkx 2.8.6

Have a great day
C.

@cmonat
Author

cmonat commented Jan 30, 2023

Hello,
have you had time to look into the bug and fix it?
Thanks for your help

have a great day
C.

@baudstam
Contributor

Hello,
I am sorry, but I haven't had time to fix the bug. I will try to do it this week.

Best regards
Mark

@JeanMainguy
Member

Hello,

We have added a fix to prevent the igraph error you mentioned earlier (in PR #3). We are now using networkx instead of igraph to write the GraphML file. On our side, this seems to solve the problem. Could you try this fix?

You will need to reinstall the current version of netsyn. To do so, you can for example use a conda environment:

# install netsyn with conda in a new environment and activate it
conda create -n netsyn_env -c bioconda netsyn
conda activate netsyn_env

# get the latest version of netsyn
git clone https://github.com/labgem/netsyn

# install the latest version in your conda environment
pip install netsyn/

Tell us if it works :-)
Best

@cmonat
Author

cmonat commented Feb 16, 2023

Hello,

it's working!
Thanks a lot :)

C.

@cmonat cmonat closed this as completed Feb 16, 2023
@cmonat
Author

cmonat commented May 30, 2023

Hello,

I'm using NetSyn with a new dataset and got a new error message :

Traceback (most recent call last):
  File "/grid/sw/netsyn/0.1.1/bin/netsyn", line 11, in <module>
    load_entry_point('netsyn==0.1.1', 'console_scripts', 'netsyn')()
  File "/grid/sw/netsyn/0.1.1/lib/python3.8/site-packages/netsyn-0.1.1-py3.8.egg/netsyn/netsyn.py", line 668, in main
    boxesManager(runFromBox, resultsDirectory,
  File "/grid/sw/netsyn/0.1.1/lib/python3.8/site-packages/netsyn-0.1.1-py3.8.egg/netsyn/netsyn.py", line 427, in boxesManager
    runBox(nameBox, resultsDirectory, analysisNumber, ORDERBOX, args)
  File "/grid/sw/netsyn/0.1.1/lib/python3.8/site-packages/netsyn-0.1.1-py3.8.egg/netsyn/netsyn.py", line 535, in runBox
    netsyn_getINSDCFiles.run(args.UniProtACList)
  File "/grid/sw/netsyn/0.1.1/lib/python3.8/site-packages/netsyn-0.1.1-py3.8.egg/netsyn/netsyn_getINSDCFiles.py", line 369, in run
    getEMBLfromENA(nucleicAccession, nucleicFilePath, http)
  File "/grid/sw/netsyn/0.1.1/lib/python3.8/site-packages/netsyn-0.1.1-py3.8.egg/netsyn/netsyn_getINSDCFiles.py", line 296, in getEMBLfromENA
    if contentType == 'text/plain;charset=UTF-8' and res.data.decode('utf-8') == 'Entry: {} display type is either not supported or entry is not found.\n'.format(nucleicAccession):
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8f in position 0: invalid start byte
Is there something I should change to make it work?
Thanks in advance
C.
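
The traceback shows the failure happening while the response body is decoded just to compare it with a fixed "entry not found" message. One way to make that comparison robust, sketched here under the assumption that `res.data` is a `bytes` object, is to compare bytes with bytes so that no decoding is needed. This does not explain why the server returned non-UTF-8 data, and the function name `is_not_found_reply` is hypothetical.

```python
# Message text copied from the traceback above.
NOT_FOUND_TEMPLATE = (
    "Entry: {} display type is either not supported or entry is not found.\n"
)


def is_not_found_reply(content_type, body, accession):
    """True if the body is exactly the plain-text 'entry not found' message.

    The comparison is done on bytes, so a body that is not valid UTF-8
    simply compares unequal instead of raising UnicodeDecodeError.
    """
    expected = NOT_FOUND_TEMPLATE.format(accession).encode("utf-8")
    return content_type == "text/plain;charset=UTF-8" and body == expected
```

With this shape, a body starting with byte 0x8f is treated as "not the error message" and the caller can go on to inspect or save it.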

@cmonat cmonat reopened this May 30, 2023
@baudstam
Contributor

Hello,
Could you share your new dataset? It looks like there is an unusual character in your file, and I would like to check.
Best regards
Mark

@cmonat
Author

cmonat commented Oct 12, 2023

Hello,

sorry, I forgot to answer, and unfortunately I am not able to share the data.
But I have now tried to run NetSyn with another dataset and got the following error:

[INFO] MCL clustering...
Traceback (most recent call last):
  File "/grid/sw/netsyn/0.1.1/bin/netsyn", line 11, in <module>
    load_entry_point('netsyn==0.1.1', 'console_scripts', 'netsyn')()
  File "/grid/sw/netsyn/0.1.1/lib/python3.8/site-packages/netsyn-0.1.1-py3.8.egg/netsyn/netsyn.py", line 668, in main
    boxesManager(runFromBox, resultsDirectory,
  File "/grid/sw/netsyn/0.1.1/lib/python3.8/site-packages/netsyn-0.1.1-py3.8.egg/netsyn/netsyn.py", line 427, in boxesManager
    runBox(nameBox, resultsDirectory, analysisNumber, ORDERBOX, args)
  File "/grid/sw/netsyn/0.1.1/lib/python3.8/site-packages/netsyn-0.1.1-py3.8.egg/netsyn/netsyn.py", line 565, in runBox
    netsyn_syntenyFinder.run(proteins, targets, args.WindowSize, args.SyntenyGap,
  File "/grid/sw/netsyn/0.1.1/lib/python3.8/site-packages/netsyn-0.1.1-py3.8.egg/netsyn/netsyn_syntenyFinder.py", line 545, in run
    matrix_adjacency = nx.to_scipy_sparse_array(nxGraph, weight='weight')
  File "/grid/sw/python/3.8.5/lib/python3.8/site-packages/networkx/convert_matrix.py", line 923, in to_scipy_sparse_array
    A = sp.sparse.coo_array((d, (r, c)), shape=(nlen, nlen), dtype=dtype)
AttributeError: module 'scipy.sparse' has no attribute 'coo_array'

I am running python/3.8.5, netsyn/0.1.1, mmseqs2/13.45111 and networkx 2.8.6.

Thanks for your help
Have a great day
C.

@cmonat
Author

cmonat commented Nov 28, 2023

Hello,

by any chance, have you had time to look at the error I reported just above?
Thanks in advance

Have a great day
C.

@baudstam
Contributor

baudstam commented Nov 28, 2023

Hello,
Unfortunately, we have not had time to make progress on this problem. We will try to solve it.

Best regards
Mark

@JeanMainguy
Member

Hello @cmonat,

The error you encountered is probably due to an old scipy version: scipy.sparse.coo_array was added in scipy 1.8.0.
Updating scipy to version 1.8.0 should resolve it.

If you installed netsyn using conda, you can try the following command:

conda install 'scipy==1.8'

Alternatively, it should also work using pip directly:

pip install "scipy==1.8.0"

Best,

Jean
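
Both MCL clustering failures in this thread came from a library function that the installed version did not yet provide (`networkx.to_scipy_sparse_array`, then `scipy.sparse.coo_array`). A quick pre-flight check can surface this before a long run. The sketch below is an illustration only, not part of NetSyn, and the names `has_attribute`, `REQUIRED` and `missing_requirements` are hypothetical.

```python
import importlib


def has_attribute(module_name, dotted_attr):
    """True if module_name imports and exposes dotted_attr (e.g. 'path.join')."""
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return False
    for part in dotted_attr.split("."):
        try:
            obj = getattr(obj, part)
        except AttributeError:
            return False
    return True


# The two functions named in the tracebacks of this thread.
REQUIRED = [
    ("networkx", "to_scipy_sparse_array"),
    ("scipy.sparse", "coo_array"),
]


def missing_requirements(required=REQUIRED):
    """List the 'module.attribute' names that are not available."""
    return [f"{mod}.{attr}" for mod, attr in required if not has_attribute(mod, attr)]
```

Calling `missing_requirements()` in the environment used for NetSyn returns an empty list when both functions are present; any name it returns points at the package that needs upgrading.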
