Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Put hydra advances query result in a public s3 bucket somewhere on a cron job #405

Open
lf- opened this issue Mar 23, 2024 · 1 comment
Open
Labels
enhancement new-service Request for a new service to be ran on the NixOS infra

Comments

@lf-
Copy link
Member

lf- commented Mar 23, 2024

@mweinelt :)

Is your feature request related to a problem? Please describe.

We have no good way of bisecting hydra anymore since the grahamc service died a year ago. But we have the data, it is just locked behind the hydra API being busted, and can relatively easily simply be obtained from the database. Although there is definitely merit in doing the https://pad.lassul.us/opendata-projects and making an open hydra read replica, this requires effort.

I have written a tool https://github.com/lf-/necktangler, and a ~5s sql query to run against hydra, where the result could simply be put in a s3 bucket on a nightly cron job and we could call it a day.

Describe the solution you'd like

Put this query in a cron job somewhere with hydra DB access, and run it with:

psql $SOME_CONNECTION_DETAILS -v job=tested -v project=nixos -v jobset=trunk-combined --csv -f hydra-query.sql > data-nixos-unstable.csv
psql $SOME_CONNECTION_DETAILS -v job=tested -v project=nixos -v jobset=unstable-small --csv -f hydra-query.sql > data-nixos-unstable-small.csv
psql $SOME_CONNECTION_DETAILS -v job=unstable -v project=nixpkgs -v jobset=trunk --csv -f hydra-query.sql > data-nixpkgs-unstable.csv
psql $SOME_CONNECTION_DETAILS -v job=darwin-tested -v project=nixpkgs -v jobset=nixpkgs-23.11-darwin --csv -f hydra-query.sql > data-nixpkgs-23.11-darwin.csv
psql $SOME_CONNECTION_DETAILS -v job=tested -v project=nixos -v jobset=release-23.11 --csv -f hydra-query.sql > data-nixos-23.11.csv
psql $SOME_CONNECTION_DETAILS -v job=tested -v project=nixos -v jobset=release-23.11-small --csv -f hydra-query.sql > data-nixos-23.11-small.csv
# s3 upload to some bucket somewhere
select
    jsei.revision as sha, b.timestamp as advance_ts, concat(j.project, ':', j.name, ':', b.job) as hydra_job, b.id as hydra_build_id
    from jobsets j
        inner join builds b on j.id = b.jobset_id
        inner join jobsetevalmembers jsem on jsem.build = b.id
        inner join jobsetevals jse on jsem.eval = jse.id
        inner join jobsetevalinputs jsei on jsei.eval = jse.id
    where
    b.job = :'job' and
    j.project = :'project' and
    j.name = :'jobset' and
    jsei.name = 'nixpkgs' and
    b.finished = 1 and
    -- arbitraily chosen start in time; there are some bogus commit IDs from
    -- 2014 or so, which we would rather not deal with
    b.timestamp > extract(epoch from '2018-01-01T00:00:00Z'::timestamptz)
    order by b.timestamp asc

(nixpkgs-unstable is 8571 rows, nixos-unstable is 3000 rows, from several years. imo we have no reason to shard this any time soon)

Describe alternatives you've considered

Going and building a public read replica of hydra (sounds like effort).

Being unable to bisect hydra.

Additional context

@lf- lf- changed the title Put hydra advances query result in a public s3 bucket somewhere Put hydra advances query result in a public s3 bucket somewhere on a cron job Mar 23, 2024
@RaitoBezarius
Copy link
Member

An alternative could be to add this particular API endpoint in the Hydra code and deploy it fast.

@mweinelt mweinelt added the new-service Request for a new service to be ran on the NixOS infra label Mar 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement new-service Request for a new service to be ran on the NixOS infra
Projects
None yet
Development

No branches or pull requests

3 participants