fetch_all does not work #130

Closed
kapedalex opened this issue Jan 10, 2024 · 11 comments

kapedalex commented Jan 10, 2024

from geofetch import Geofetcher

geof = Geofetcher(just_metadata=False,
                  processed=False,
                  max_soft_size="12GB",
                  data_source="all")

geof.fetch_all("GSE95654")

The response for geofetch -i GSE95654 is the same:

The system cannot find the path specified.
'{' is not recognized as an internal or external command,
operable program or batch file.
To download raw data You must first install the sratoolkit, with prefetch in your PATH. Installation instruction: http://geofetch.databio.org/en/latest/install/

The problem is that sratoolkit is actually installed and on my PATH. I can run prefetch GSE95654 in a terminal and it works fine.
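One way to see what the Python process itself finds, independent of geofetch, is `shutil.which`, which resolves an executable name against the PATH of the current process (and honours PATHEXT on Windows). This is only a diagnostic sketch; the helper name `find_tool` is made up for illustration and is not part of geofetch.

```python
import shutil
from typing import Optional


def find_tool(name: str) -> Optional[str]:
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool("prefetch")
    if path is None:
        print("prefetch is not visible on PATH from this Python process")
    else:
        print(f"prefetch found at: {path}")
```

If this reports a path while geofetch still claims prefetch is missing, the check inside geofetch is the more likely culprit than the PATH setup.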

Basic guides like

find_gse = Finder()
gse_list = find_gse.get_gse_all()

and

geof = Geofetcher(processed=True, acc_anno=True, discard_soft=True)
geof.get_projects("GSE160204")

work fine.

khoroshevskyi (Member) commented Jan 10, 2024

Thank you for raising an issue. Could you please provide us with the full geofetch log and information about the system on which you are running geofetch, including the Python version?
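For anyone else reporting a similar problem, the requested details can be collected with the standard library alone. The sketch below is just a convenience; the function name `environment_report` is hypothetical, and it assumes the package is installed under the distribution name `geofetch`.

```python
import platform
from importlib import metadata


def environment_report() -> dict:
    """Collect basic details that are useful in a bug report."""
    try:
        # Assumes the distribution is named "geofetch".
        geofetch_version = metadata.version("geofetch")
    except metadata.PackageNotFoundError:
        geofetch_version = "not installed"
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "geofetch": geofetch_version,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```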

I have just installed prefetch and Geofetch, and unfortunately, I can't reproduce the error. Additionally, note that prefetch can now be installed using the following command:

sudo apt install sra-toolkit

pedro-w (Contributor) commented Jan 11, 2024

I have also seen this, using the command

> geofetch --verbosity 5 -i GSE135644 -m dl
[INFO] [15:46:54] Metadata folder: D:\Libraries\lda\dl\GSE135644
The system cannot find the path specified.
'{' is not recognized as an internal or external command,
operable program or batch file.
To download raw data You must first install the sratoolkit, with prefetch in your PATH. Installation instruction: http://geofetch.databio.org/en/latest/install/

My version is

Python 3.11.7 (tags/v3.11.7:fa7a6f2, Dec  4 2023, 19:24:49) [MSC v.1937 64 bit (AMD64)]

(Windows 11)

khoroshevskyi (Member) commented

Thank you for your response, @pedro-w. Unfortunately, geofetch hasn't been tested on Windows. We will try to solve this issue ASAP.

pedro-w (Contributor) commented Jan 11, 2024

Thanks for the swift response. I don't know if @kapedalex is also on Windows?

I'm happy to help test anything, just let me know.

pedro-w (Contributor) commented Jan 12, 2024

I'll just add that it does work in WSL (Debian) - I am guessing it is assuming a POSIX shell somewhere which fails under Windows?
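The `'{' is not recognized as an internal or external command` line is the message cmd.exe prints when it is asked to run a command it cannot find, here one beginning with `{`, which is consistent with that guess. Where such a command is built is not established in this thread. As a general illustration only (not geofetch's code), a tool check can avoid the shell entirely by passing an argument list to `subprocess.run`. The helper name `tool_runs` is hypothetical.

```python
import subprocess
import sys
from typing import Sequence


def tool_runs(argv: Sequence[str]) -> bool:
    """Return True if the command starts and exits with status 0.

    The command is passed as an argument list and no shell is involved,
    so the result does not depend on which shell the platform provides.
    """
    try:
        result = subprocess.run(
            list(argv),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except OSError:
        # Covers FileNotFoundError (executable missing) and PermissionError.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # sys.executable is used only so the demo runs anywhere; a real check
    # might pass ["prefetch", "--version"] instead (an assumption here).
    print(tool_runs([sys.executable, "--version"]))
```

Because no POSIX constructs such as `{ ...; }` or `/dev/null` redirects are needed, this style of check has nothing for cmd.exe to misinterpret.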

khoroshevskyi (Member) commented

Sorry that this is taking so long. Unfortunately, I don't have access to a Windows laptop every day. However, I have identified the error, and it appears to occur within one of the imported libraries. I will continue working on resolving this issue next week.

pedro-w mentioned this issue Jan 19, 2024

pedro-w (Contributor) commented Jan 22, 2024

Hi. I tried your PR #132 and it seemed to work (see the log below), but it still prints these lines:

The system cannot find the path specified.
'{' is not recognized as an internal or external command, operable program or batch file.

I don't know where they come from, but if they are nothing to worry about then I'm fine with that.


(venv) PS D:\temp\geofetch> python -m geofetch -i GSE67303 -n red_algae -m d:\temp\gf-test
[INFO] [09:05:26] Metadata folder: d:\temp\gf-test\red_algae
The system cannot find the path specified.
'{' is not recognized as an internal or external command,
operable program or batch file.
[WARNING] [09:05:26] GEOfetch is not checking if prefetch is installed on Windows, please make sure it is installed and in your PATH, otherwise it will not be possible to download raw data.
[INFO] [09:05:26] Trying GSE67303 (not a file) as accession...
[INFO] [09:05:26] Skipped 0 accessions. Starting now.
[INFO] [09:05:26] Processing accession 1 of 1: 'GSE67303'
[INFO] [09:05:26] Found previous GSE file: d:\temp\gf-test\red_algae\GSE67303_GSE.soft
[INFO] [09:05:26] Found previous GSM file: d:\temp\gf-test\red_algae\GSE67303_GSM.soft
[INFO] [09:05:26] Processed 4 samples.
[INFO] [09:05:26] Expanding metadata list...
[INFO] [09:05:26] Found SRA Project accession: SRP056574
[INFO] [09:05:26] Found SRA metadata, opening..
[INFO] [09:05:26] Parsing SRA file to download SRR records
[INFO] [09:05:26] Getting SRR: SRR1930183  in (GSE67303)

2024-01-22T09:05:28 prefetch.3.0.10: Current preference is set to retrieve SRA Normalized Format files with full base quality scores.
2024-01-22T09:05:30 prefetch.3.0.10: 1) Downloading 'SRR1930183'...
2024-01-22T09:05:30 prefetch.3.0.10: SRA Normalized Format file is being retrieved, if this is different from your preference, it may be due to current file availability.
2024-01-22T09:05:30 prefetch.3.0.10:  Downloading via HTTPS...
2024-01-22T09:06:06 prefetch.3.0.10:  HTTPS download succeed
2024-01-22T09:06:06 prefetch.3.0.10:   verifying 'SRR1930183'...
2024-01-22T09:06:06 prefetch.3.0.10:  'SRR1930183' is valid
2024-01-22T09:06:06 prefetch.3.0.10: 1) 'SRR1930183' was downloaded successfully
2024-01-22T09:06:06 prefetch.3.0.10: 'SRR1930183' has 0 unresolved dependencies
[INFO] [09:06:06] Getting SRR: SRR1930184  in (GSE67303)

2024-01-22T09:06:08 prefetch.3.0.10: Current preference is set to retrieve SRA Normalized Format files with full base quality scores.
2024-01-22T09:06:09 prefetch.3.0.10: 1) Downloading 'SRR1930184'...
2024-01-22T09:06:09 prefetch.3.0.10: SRA Normalized Format file is being retrieved, if this is different from your preference, it may be due to current file availability.
2024-01-22T09:06:09 prefetch.3.0.10:  Downloading via HTTPS...
2024-01-22T09:06:27 prefetch.3.0.10:  HTTPS download succeed
2024-01-22T09:06:27 prefetch.3.0.10:   verifying 'SRR1930184'...
2024-01-22T09:06:27 prefetch.3.0.10:  'SRR1930184' is valid
2024-01-22T09:06:27 prefetch.3.0.10: 1) 'SRR1930184' was downloaded successfully
2024-01-22T09:06:27 prefetch.3.0.10: 'SRR1930184' has 0 unresolved dependencies
[INFO] [09:06:27] Getting SRR: SRR1930185  in (GSE67303)

2024-01-22T09:06:29 prefetch.3.0.10: Current preference is set to retrieve SRA Normalized Format files with full base quality scores.
2024-01-22T09:06:29 prefetch.3.0.10: 1) Downloading 'SRR1930185'...
2024-01-22T09:06:29 prefetch.3.0.10: SRA Normalized Format file is being retrieved, if this is different from your preference, it may be due to current file availability.
2024-01-22T09:06:29 prefetch.3.0.10:  Downloading via HTTPS...
2024-01-22T09:06:40 prefetch.3.0.10:  HTTPS download succeed
2024-01-22T09:06:40 prefetch.3.0.10:   verifying 'SRR1930185'...
2024-01-22T09:06:40 prefetch.3.0.10:  'SRR1930185' is valid
2024-01-22T09:06:40 prefetch.3.0.10: 1) 'SRR1930185' was downloaded successfully
2024-01-22T09:06:41 prefetch.3.0.10: 'SRR1930185' has 0 unresolved dependencies
[INFO] [09:06:41] Getting SRR: SRR1930186  in (GSE67303)

2024-01-22T09:06:42 prefetch.3.0.10: Current preference is set to retrieve SRA Normalized Format files with full base quality scores.
2024-01-22T09:06:43 prefetch.3.0.10: 1) Downloading 'SRR1930186'...
2024-01-22T09:06:43 prefetch.3.0.10: SRA Normalized Format file is being retrieved, if this is different from your preference, it may be due to current file availability.
2024-01-22T09:06:43 prefetch.3.0.10:  Downloading via HTTPS...
2024-01-22T09:06:51 prefetch.3.0.10:  HTTPS download succeed
2024-01-22T09:06:51 prefetch.3.0.10:   verifying 'SRR1930186'...
2024-01-22T09:06:51 prefetch.3.0.10:  'SRR1930186' is valid
2024-01-22T09:06:51 prefetch.3.0.10: 1) 'SRR1930186' was downloaded successfully
2024-01-22T09:06:52 prefetch.3.0.10: 'SRR1930186' has 0 unresolved dependencies
[INFO] [09:06:52] Finished processing 1 accession(s)
[INFO] [09:06:52] Creating complete project annotation sheets and config file...
[INFO] [09:06:52] Sample annotation sheet: d:\temp\gf-test\red_algae\GSE67303_PEP\GSE67303_PEP_raw.csv . Saved!
[INFO] [09:06:52] File has been saved successfully
[INFO] [09:06:52]   Config file: d:\temp\gf-test\red_algae\GSE67303_PEP\GSE67303_PEP.yaml

pedro-w mentioned this issue Feb 2, 2024

khoroshevskyi (Member) commented

@pedro-w Thank you for your support. I just released a new geofetch version, v0.12.6. If you have a chance, could you please confirm that everything works?

pedro-w (Contributor) commented Feb 12, 2024

@khoroshevskyi Apologies for the delay. I tried with a clone of the GitHub tag v0.12.6, on Windows 11.

  • without prefetch on the path I got a sensible-looking error message with instructions
  • after adding prefetch to the path I was able to download data sets.

So, I can confirm it's working as expected for me.
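For readers who want to check whether their own installation already includes this release, a rough sketch follows. It assumes the distribution name is `geofetch` and that versions are plain dotted numbers such as 0.12.6; suffixes like `rc1` are not handled. All function names here are hypothetical.

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a plain dotted version such as '0.12.6' into (0, 12, 6)."""
    return tuple(int(part) for part in text.split("."))


def is_at_least(installed: str, minimum: str) -> bool:
    """Compare two plain dotted versions numerically, part by part."""
    return parse_version(installed) >= parse_version(minimum)


def installed_geofetch_version() -> Optional[str]:
    """Return the installed geofetch version, or None if it is absent."""
    try:
        return metadata.version("geofetch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_geofetch_version()
    if version is None:
        print("geofetch is not installed")
    elif is_at_least(version, "0.12.6"):
        print(f"geofetch {version} includes the v0.12.6 release")
    else:
        print(f"geofetch {version} is older than v0.12.6")
```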

Thanks 👍

khoroshevskyi (Member) commented

Thank you very much for your help!

pedro-w (Contributor) commented Feb 12, 2024

@kapedalex did it fix your issue too 🤞 ?
