How to set parameters without a cluster? #186

Open
mili-ai opened this issue Oct 15, 2024 · 7 comments

Comments

@mili-ai

mili-ai commented Oct 15, 2024

I don't have a cluster on my server, only regular CPUs. How should I set thread parameters?

@MichaelHiller
Collaborator

Oh, good question. This can take a long time.
@kirilenkobm Does TOGA have the ability to run multi-threaded?

You might want to wait a few more weeks until we release TOGA 2.0 which will be much much faster and >300X less memory demanding.

@mili-ai
Author

mili-ai commented Oct 16, 2024

Thank you for your reply. I am looking forward to the TOGA 2.0 release.

@ReverendCasy
Collaborator

Hello @mili-ai,
Sorry for the late response. In the current TOGA versions, there are two ways to run it on your local machine:

  1. Do not set the --nextflow_config_dir option. This automatically sets the Nextflow executor to ‘local’.
  2. Modify the Nextflow configuration files in the nextflow_config_files/ directory by setting process.executor to local, then provide the path to that directory to TOGA with the --nextflow_config_dir option. This way also lets you control the number of CPUs used by each parallelized TOGA step.

Note that some CESAR jobs might need a lot of memory, so consider setting an upper memory threshold with --cesar_mem_limit.
Hope that helps. Please let me know if you tried any of these options and whether they worked for you.

Best,
Yury
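
The second option above could be sketched roughly as follows. This is only an illustration, not part of TOGA: the function name make_local_config_dir and the output directory are hypothetical, and it assumes the config files set the executor on a line of the form "process.executor = ...". Files that use a different syntax would be copied unchanged and need a manual edit.

```shell
#!/usr/bin/env bash
# Sketch: copy a directory of Nextflow config files and force the local executor.
# Assumption: the executor is set on a line like: process.executor = "slurm"

make_local_config_dir() {
    local src_dir="$1"   # e.g. nextflow_config_files/
    local dst_dir="$2"   # hypothetical output directory for the edited copies
    local f
    mkdir -p "$dst_dir"
    for f in "$src_dir"/*; do
        [ -f "$f" ] || continue
        # Rewrite any "process.executor = ..." assignment to 'local';
        # every other line is copied as-is.
        sed -E "s/(process\.executor[[:space:]]*=[[:space:]]*).*/\1'local'/" \
            "$f" > "$dst_dir/$(basename "$f")"
    done
}

# Usage (all paths and values are placeholders):
#   make_local_config_dir nextflow_config_files local_nf_config
#   ./toga.py <chain> <bed> <ref.2bit> <query.2bit> \
#       --nextflow_config_dir local_nf_config --cesar_mem_limit <limit>
```

Working on a copy keeps the original cluster configuration intact. To cap how many CPUs the local executor uses, Nextflow's executor.cpus setting may be worth checking in the Nextflow documentation; whether TOGA's config templates already set it is not confirmed here.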

@swagttt1

> Oh, good question. This can take a long time. @kirilenkobm Does TOGA have the ability to run multi-threaded?
>
> You might want to wait a few more weeks until we release TOGA 2.0 which will be much much faster and >300X less memory demanding.

Thank you for developing such a useful tool! It is currently December 12, 2024. When will TOGA v2.0 be released?

@MichaelHiller
Collaborator

There is no multithreaded option. We parallelize over the transcripts.

TOGA2 will be released in early January.

@swagttt1

swagttt1 commented Dec 12, 2024

> Hello @mili-ai , Sorry for the late response. In the current TOGA versions, there are two ways to run it on your local machine:
>
>   1. Do not set a --nextflow_config_dir option. This will automatically set Nextflow executor to ‘local’.
>   2. Modify Nextflow configuration files in the nextflow_config_files/ directory by setting process.executor to local, then provide the path to TOGA with a --nextflow_config_dir option. This way also allows you to control the number of CPUs used by each paralleled TOGA step.
>
> Note that some of CESAR jobs might take a lot of memory to compute, so consider setting the upper memory threshold with --cesar_mem_limit.
> Hope that helps. Please let me know if you tried any of these options and whether they worked fine for you.
>
> Best, Yury

Hi Yury,
Hope this message finds you well!
I don't have a Slurm cluster and have been running TOGA on a local CentOS server. Based on the discussion in another issue (#131), I understood that TOGA may not work well on a single machine and is better suited to a Slurm cluster. Following your guidance in this issue, I ran the command:
./toga.py test_input/hg38.mm10.chr11.chain test_input/hg38.genCode27.chr11.bed test_input/hg38.2bit test_input/mm10.2bit --kt --pn test -i supply/hg38.wgEncodeGencodeCompV34.isoforms.txt --cb 3,5 --cjn 500 --u12 supply/hg38.U12sites.tsv --ms
(with the --nc nextflow_config_files parameter removed), but encountered a new problem:

......
chain_runner_55: processing chain_id: 553928 transcripts: ENST00000320048,
chain_runner_55: processing chain_id: 513374 transcripts: ENST00000329434,ENST00000328375,ENST00000641167,
chain_runner_55: processing chain_id: 749433 transcripts: ENST00000301790,ENST00000540857,ENST00000539221,
chain_runner_55: processing chain_id: 86290 transcripts: ENST00000313555,ENST00000641320,
chain_runner_55: processing chain_id: 930267 transcripts: ENST00000625203,
chain_runner_55: processing chain_id: 653133 transcripts: ENST00000328188,

#### STEP 3: Merge step 2 output

Reading /home/liuyt/software/TOGA/test/temp/toga_filt_ref_annot.bed
merge_chains_output: got data for 3674 transcripts
merge_chains_output: Loading the results...
merge_chains_output: There are 60 result files to combine
merge_chains_output: got 17036 keys in chain_genes_data
merge_chains_output: got 20495 keys in chain_raw_data
merge_chains_output: There were 20482 transcript lines and 20495 chain lines
merge_chains_output: ERROR!
genes_counter and chain_counter hold different values:
20482 and 20495 respectively

Do you have any insights or suggestions?
Thanks
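
For reference, the mismatch in the log excerpt above is 20495 - 20482 = 13 lines. A small sketch for pulling that number out of a longer log is shown below. The function name count_gap is hypothetical, and the line format is assumed from this excerpt only; other TOGA versions may word the message differently.

```shell
#!/usr/bin/env bash
# Sketch: read a TOGA log on stdin and print (chain lines - transcript lines)
# for each merge summary line of the form seen in the excerpt:
#   "... There were N transcript lines and M chain lines"

count_gap() {
    awk '/There were [0-9]+ transcript lines and [0-9]+ chain lines/ {
        for (i = 1; i <= NF; i++) {
            # The number sits immediately before each keyword.
            if ($i == "transcript") t = $(i - 1)
            if ($i == "chain" && $(i + 1) == "lines") c = $(i - 1)
        }
        print c - t
    }'
}

# Usage (the log file name is a placeholder):
#   count_gap < toga_run.log
```

A non-zero result only confirms that the counts disagree; it does not say which jobs are responsible, so the full logs are still needed to find the cause.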

@ReverendCasy
Collaborator

Hi @swagttt1 ,
I guess some of the feature extraction jobs exited improperly. Could you please attach the whole TOGA log and/or feature extraction Nextflow log?

Best,
Yury
