pkgr must more aggressively error on failed download #163

Open
dpastoor opened this issue Sep 30, 2019 · 10 comments

Comments

@dpastoor
Contributor

When downloading packages, a 404 or any other failure to download should not just print a warning and proceed, as the installation will obviously fail since that package does not exist.

1388:{"level":"warning","msg":"bad server response","package":"cmprsk","status":"404 Not Found","status_code":404,"time":"2019-09-30T08:42:15-04:00","url":"https://metrumresearchgroup.github.io/cran/2019-09-22/src/contrib/cmprsk_2.2-7.1.tar.gz"}
ERRO[0509] installation failed for packages: cmprsk
TRAC[0509] Resetting package environment
INFO[0510] duration:8m30.328218453s
FATA[0510] failed package install with err, failed installation for packages: cmprsk
@bschulth

Can you possibly add a Retry here:

https://github.com/metrumresearchgroup/pkgr/blob/develop/cran/download-package.go#L128

I added a log line to record the reason for the download failure:

log.WithField("package", d.Package.Package).Warn(err)

And the failure I am seeing is:

INFO[0087] downloading package                           package=nycflights13
WARN[0087] http2: server sent GOAWAY and closed the connection; LastStreamID=1999, ErrCode=NO_ERROR, debug=""  package=reticulate
WARN[0087] http2: server sent GOAWAY and closed the connection; LastStreamID=1999, ErrCode=NO_ERROR, debug=""  package=caret
WARN[0087] downloading failed                            package=reticulate
WARN[0087] downloading failed                            package=caret
WARN[0087] http2: server sent GOAWAY and closed the connection; LastStreamID=1999, ErrCode=NO_ERROR, debug=""  package=gt
WARN[0087] downloading failed                            package=gt
WARN[0087] http2: server sent GOAWAY and closed the connection; LastStreamID=1999, ErrCode=NO_ERROR, debug=""  package=tsibble
WARN[0087] downloading failed                            package=tsibble

Wondering if perhaps the server thinks it's a DoS attack?

@dpastoor
Contributor Author

Thanks for the report, Brian - what type of repo are you pulling from: GitLab Pages? RStudio Package Manager? MRAN? Other? There is pretty high concurrency of these requests.

This is on the queue to just redo anyway - pkgr shouldn't merely warn, it should straight up error, since if you're failing to download, obviously the install is going to go awry.

What would be helpful is knowing what we could add to also make this more robust - both some retries on the HTTP end + potentially some concurrent_dl setting.

@bschulth

bschulth commented Oct 13, 2021

INFO[0002] package installation sources
AmgenInternal=25
BioCann=1
BioCsoft=17
CRAN=994 ========> (https://cran.microsoft.com/snapshot/2021-10-08)
CRAN_20190901=1
CRAN_20200118=4
CRAN_20200510=2
H2O=1
INLA=1
glmmADMB_repo=1
tarballs=0

@dpastoor
Contributor Author

OK, thanks - that CRAN server is... not great; it has had a number of outages. pkgr needs improvement here, but in the meantime, some knowledge to drop: switch away from MRAN. RStudio now has CRAN snapshotted every night.

I've switched over and it's been much better.

@bschulth

Ah, cool, let me give that a try.
Yeah, my builds have been failing for the past few days.
I was about to go with a hack in your code here and put a 1 second sleep in place so that I don't spam that server.
https://github.com/metrumresearchgroup/pkgr/blob/develop/cran/download-package.go#L124
It definitely seems like it thinks I am DoS'ing it. The one-second delay on each round let it succeed.

I'll let you know how the RStudio snapshot works....

@bschulth

Package Manager is much slower (as if it has a built-in 1-2 second throttle), but my first attempt passed, so it seems like a good workaround. Thanks!

I think MRAN would be fine, but it needs a download throttle, so maybe a workaround is to introduce a new property in the yml file for a per-repo download throttle in seconds. I think this is an issue only because I am up to 999 packages from there.

@dpastoor
Contributor Author

kk give #389 a try - that has an exponential backoff built in (thanks hashicorp) and the concurrency control knob - perhaps

PKGR_DL_CONCURRENCY=3 pkgr install ...

combined with the retries just in case will do the trick. Interesting that Package Manager was slower for you; we've seen it speed things up for us compared to MRAN - though to be fair, 99.99% of the time we point to mpn.metworx.com these days :-) (though I'm not sure if your entire snapshot is present in MPN)

@bschulth

bschulth commented Oct 13, 2021

No errors during the download phase using #389 on the MRAN link, so that seems like a positive!

Regarding timing, RStudio Package Manager History Repo: 6m47s to download all.
MRAN: 3m29s to download all.

PKGR_DL_CONCURRENCY=2 works as well, sufficiently throttling the concurrent downloads so that the last few packages don't start failing and hit the download retry. (Though at 2 threads, it's as slow as RS Package Manager: 6m27s.)

So 2 positive changes.

@bschulth

bschulth commented Feb 3, 2022

Poking this topic.

  • Just an additional note: when packages passively fail to download upstream, installs seem to passively fail downstream (e.g. the process does not exit with a non-zero exit code, so docker builds do not fail).
 time="2022-02-03T04:35:15-08:00" level=info msg="Successfully Installed." package=libcoin remaining=660 repo=CRAN version=1.0-9
 time="2022-02-03T04:35:16-08:00" level=error msg="cmd output" exit_code=1 output="Warning: invalid package ‘/opt/local/docker/installers/runtime/pkgr’\nError: ERROR: no packages specified\n" package=coda stderr="Warning: invalid package ‘/opt/local/docker/installers/runtime/pkgr’\nError: ERROR: no packages specified\n" stdout=
 time="2022-02-03T04:35:16-08:00" level=warning msg="error installing" err="exit status 1"
 time="2022-02-03T04:35:19-08:00" level=info msg="Successfully Installed." package=mc2d remaining=659 repo=CRAN version=0.1-21

Also, it seems that the final failure does not exit with a non-zero exit code:

 time="2022-02-03T04:38:25-08:00" level=error msg="did not install IRdisplay"
 time="2022-02-03T04:38:25-08:00" level=error msg="did not install distributional"
 time="2022-02-03T04:38:25-08:00" level=error msg="did not install refund"
 time="2022-02-03T04:38:25-08:00" level=error msg="installation failed for packages: ucminf, proto, svUnit, wavelets, entropy, coda, clue"
 time="2022-02-03T04:38:25-08:00" level=info msg="starting individual tarball install"
 time="2022-02-03T04:38:25-08:00" level=info msg="total package install time" duration=37m12.389199139s
 time="2022-02-03T04:38:26-08:00" level=info msg="duration:37m13.64722306s"
 time="2022-02-03T04:38:26-08:00" level=error msg="failed package install with err, failed installation for packages: ucminf, proto, svUnit, wavelets, entropy, coda, clue"

So the main request is a non-zero exit code, so that automated builds fail properly.
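
A small sketch of the requested behavior, assuming the program ends with a list of packages that failed to install. `exitCodeFor` is a hypothetical helper, and for demonstration purposes the failed list is read from the command-line arguments rather than from a real install run.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// exitCodeFor maps the list of failed packages to a process exit code:
// non-zero when anything failed, zero otherwise. Hypothetical helper.
func exitCodeFor(failed []string) int {
	if len(failed) > 0 {
		return 1
	}
	return 0
}

func main() {
	// For illustration only: treat command-line arguments as the names of
	// packages that failed, e.g. `go run . coda clue`.
	failed := os.Args[1:]
	if code := exitCodeFor(failed); code != 0 {
		fmt.Fprintf(os.Stderr, "installation failed for packages: %s\n", strings.Join(failed, ", "))
		os.Exit(code)
	}
	fmt.Println("all packages installed")
}
```

With a non-zero exit status, a `RUN` step in a Dockerfile or a CI job stops at the failing command instead of carrying on with an incomplete library.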
