How to analyze the time cost of each part #112
-
What is your question?

Hi, I'm trying to implement my project with your framework. I'd like to measure how long each part takes so I can make full use of the GPUs, but it's puzzling that the times I measure myself don't match what tqdm reports. Could you give me some advice on what is happening? The progress bar shows 1.4 s/it, while my measured data time is 0.003 s and GPU time is 0.5~0.7 s.

Code

# what I add to the trainer
# code added by me
batch_start_tic = time.time()
for batch_nb, data_batch in enumerate(self.tng_dataloader):
    self.batch_nb = batch_nb
    self.global_step += 1

    model = self.__get_model()
    model.global_step = self.global_step

    # stop when the flag is changed or we've gone past the amount
    # requested in the batches
    self.total_batch_nb += 1
    met_batch_limit = batch_nb > self.nb_tng_batches
    if met_batch_limit:
        break

    # ---------------
    # RUN TRAIN STEP
    # ---------------
    batch_fb = time.time()
    batch_result = self.__run_tng_batch(data_batch, batch_nb)
    early_stop_epoch = batch_result == -1

    # code added by me
    batch_fb_end = time.time()
    self.__add_tqdm_metrics({'data time': batch_fb - batch_start_tic,
                             'gpu time': batch_fb_end - batch_fb})
    batch_start_tic = time.time()

By the way, I see GPU utilization is around 80%; are there any tricks to push it up to 100%?

What's your environment?
Thanks a lot.
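One likely source of the mismatch, beyond tqdm's averaging: CUDA kernels launch asynchronously, so `time.time()` around an un-synchronized training step largely measures kernel launch time rather than execution time, and the per-part numbers can sum to less than the real iteration time. A minimal sketch of synchronized timing follows; `timed_step` is a hypothetical standalone helper, not part of the framework's trainer:

```python
import time
import torch

def timed_step(model, batch, optimizer):
    # Hypothetical helper: time one forward/backward/step while
    # accounting for asynchronous CUDA execution.
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # drain previously queued kernels
    tic = time.time()

    loss = model(batch).sum()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

    if torch.cuda.is_available():
        torch.cuda.synchronize()  # wait for this step's kernels to finish
    return time.time() - tic

model = torch.nn.Linear(8, 4)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
elapsed = timed_step(model, torch.randn(16, 8), opt)
print(f"step took {elapsed:.6f}s")
```

On a CPU-only machine the synchronize calls are skipped and the numbers match anyway; on a GPU they make the wall-clock difference reflect actual kernel execution time.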
Replies: 1 comment
-
tqdm time is a running average. You have to let it warm up for a bit before it converges to the correct time.
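To illustrate the warm-up effect: tqdm smooths its displayed rate with an exponential moving average (controlled by its `smoothing` parameter, default 0.3), so a few slow early iterations, e.g. CUDA initialization or dataloader workers spinning up, dominate the display until enough steps have passed. A small sketch of that averaging, assuming illustrative per-iteration times:

```python
def ema_update(avg, sample, smoothing=0.3):
    # tqdm-style exponential moving average of per-iteration times;
    # `smoothing` mirrors tqdm's parameter of the same name.
    return sample if avg is None else smoothing * sample + (1 - smoothing) * avg

# Two slow warm-up iterations, then a steady state of 1.4 s/it
# (made-up numbers matching the question's reported rate).
times = [5.0, 5.0] + [1.4] * 30
avg = None
for t in times:
    avg = ema_update(avg, t)
print(f"{avg:.3f} s/it")
```

After the warm-up steps decay out of the average, the displayed rate converges to the true steady-state iteration time.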