Migration from pulp to createrepo-agent #972
Comments
Discussion point: "upstream" repositories In the debian repositories, we're listing upstream repositories in a configuration files on the In Pulp, this is implemented by creating specifically named repository entities which can be synchronized from the upstream URL and then synchronized between pulp distributions just like a ROS distro sync. Though this is implemented, we aren't actually using it and the "bootstrap" repository for RHEL is empty. The sync feature implemented in The easiest way to align with how this is done on the debian side is simply to write out a script to a known location on the At the moment, I'm considering simply dropping CONCLUSION: Implement upstream repositories using |
I generally agree with this. My one worry is how quickly we can get releases out. Is it bottlenecked on you uploading them? |
In general, there are usually ~36 hrs of overhead and ~7 days of "baking" in the testing repositories. The latter can be expedited automatically if a few community members test the update manually and report their findings.
Yes, though the robotics SIG has the ability to push updates to these packages as well. We (Open Robotics) don't currently have any automation, documentation, or tooling for updating RPM packages to the bootstrap repository as it is, so it would probably still need to go through me. Rather than spending resources on standing that up, I'd rather focus on getting more team members into the Fedora robotics SIG who can push the packages and test them for faster update turnarounds. Great questions, BTW. |
The high-level write-up is superb and will be great for folks who aren't as familiar with the internals of the buildfarm or aren't at a scale where they're feeling our pain. One question: this looks like there's a hard cut between createrepo-agent and pulp. A. Is that the case or is it just not clear from the plan that both systems will be run in parallel during an initial phase? B. If there's a hard cut to createrepo-agent am I forgetting a conversation where you convinced me to go ahead with that? 😅 I think it is worth trying to build out a detailed checklist of steps either in this issue or one on the private config repository. Two examples (both on private config repos) are the original pulp deployment and the migration of the ROS build farm to Ubuntu Xenial. Writing out the exhaustive lists helps identify step ordering (and possible conflicts over what is expected to happen first) as well as providing a safety net during the migration when adrenaline impedes critical thinking.
I do not think that we need to block this deployment on having an import-upstream feature in the createrepo-agent. But I do not think that we can leave import_upstream support unimplemented or forego having a bootstrap repo for RPM repositories entirely. I do agree with Scott that our steady state should be pushing changes to infrastructure upstream. The release cadence of Debian and Ubuntu doesn't enable that, which creates potential conflicts between the upstream and ROS Infra-provided versions of the infrastructure packages. Those conflicts can be avoided by publishing directly to Fedora project archives (EPEL is a Fedora project).
I haven't been following the Fedora releases very closely, but my understanding is that the major messy issues with infra packages are caught and settled by people using the packages from the ROS repos on Debian/Ubuntu with enough swiftness that our Fedora infra people (cottsay alone, at present) can update the pending releases without having to release the intermediate duds. A good portion of those never make it to Fedora in the first place thanks to the intrepid community running out of the ROS repos on our more widely adopted platforms. My main concern is that the bootstrap repo has other uses beyond distributing the latest ROS infrastructure packages, some legacy which we're trying to move away from and some still valid. As far as legacy goes, we've used the bootstrap repository to pull in packages that are not available upstream to provide for ROS. With Jammy in particular, however, we've limited this to just very closely associated projects like Gazebo and Colcon where we work directly with upstream, and providing packages via the ROS bootstrap repository is primarily a matter of ensuring that consistent versions are available and in use. There are also still packages, like those provided by commercial DDS vendors, which we would need to publish but aren't suitable for Fedora Project repositories. Lastly, because of the way rosdep and bloom interact, we may also need the bootstrap repo to create equivs / empty packages for soft dependencies that are only available on, for example, amd64 but not arm64, which rosdep doesn't model even if the package is available in EPEL on another platform.
I agree that getting more of us active in the Fedora Robotics SIG is the primary focus. I think we can revisit implementation details for an RPM ros_bootstrap repo in future discussions, and I agree that time doesn't need to be spent there now.
Going along with expanding our involvement in the Robotics SIG, I do think it's important to be careful that we don't create a bloc here that unintentionally acts in concert to override the usual mechanisms for review. But if we get genuine, distributed, community input on issues, that's super valuable. |
Thanks for your detailed thoughts, @nuclearsandwich.
That's correct. The primary driver for this is that our implementations using Pulp and createrepo-agent use entirely different credentials - and even types of credentials - from each other. Supporting both in parallel would mean wiring new credentials into each
You're probably not missing anything, I probably just overlooked or forgot about it. Discussing it here (in writing) is good.
Sure, I can do that. Expect a follow-up comment on this issue.
The existing 'sync' scenario in createrepo-agent can handle what we need here. My hesitation mostly stemmed from shoehorning it into the same place in our workflows as the import jobs for reprepro. Namely, the fact that "upstream" repositories are specified as part of deployment and not part of configuration is...less than ideal. It is absolutely technically possible for me to make this work the same for the RPM repositories, but I wanted to weigh it against what we'd like to see. It requires very little effort to stand this up with the goal of feature parity with our use of reprepro here. I'll make that happen now that my hesitations have been heard. |
I'm fine with transitional configuration options in principle but I also respect the work you've put into this and the confidence that you have that a hard-cut is not a huge risk. I think we should discuss synchronously the pros and cons of a parallel deployment versus a hard-cut with a working revert path together with @clalancette and then adopt the approach we prefer coming out of that. |
We actually just finished a discussion where we decided to pursue the parallel deployment approach. It's more work than a hard cut but the value in recovering from unforeseen issues makes it worthwhile. |
I've updated all of the necessary branches to support the |
Pre-Deployment Checklist
Deployment Checklist
When there are no in-progress jobs, we can then use commands like this to look for differences between the repositories:
|
Am I right in thinking that Foxy is intentionally omitted from the above checklists because it has no RPM builds? |
We can also use repodiff to check the |
Right, there is no reconfigure job at all because there are no build files for RHEL or Fedora.
Absolutely. |
Man this is gonna start to feel like code review. Can we DevOps so hard we version control our checklists?
Beyond that I think this checklist is bang on and can stand as the plan. We'll pick a date later on once some of the other parts of this process are through review. Thanks for taking the time to build it up! |
Sure, added.
Also added - good call. I really don't think a Jenkins host snapshot will help us - we don't even need to deploy to that host. All of the job configurations are managed by Please find some time in the next week to review the three linked PRs in the pre-deployment checklist. |
Definitely! I focused on the plan first but I'll be sure to do those next. |
First pass reviews completed on each linked PR. |
Many thanks - I believe I've addressed this round of feedback. I think that after this next round, we should set a date. |
Agreed and all ✔️! I put it on Monday's agenda to pick a date. |
Cross-linking the discourse post in this thread. tl;dr - this is scheduled to happen tomorrow (2022-9-19). |
The initial deployment has concluded, but I'd like to keep this ticket open until we've removed Pulp completely. |
Summary
This ticket tracks migrating the RPM repository management tool used in ros_buildfarm from pulp to a new purpose-built tool called createrepo-agent.
Background
RPM repository metadata consists of a collection of XML files which reside in a subdirectory of the repository root. The root document, repomd.xml, can be signed using a GPG key. Unlike debian metadata which uses a "clearsign" signature, the repomd.xml.asc is a "detached" signature. Any modification to the contents of the repository typically results in changes to each of the ~5 XML files and the signature.
Pulp is a general-purpose content management solution with robust plugins specifically targeted at RPMs. It leverages postgresql, redis, Django, and stores payload data in a CAS. It is written in Python, and uses several daemon processes to implement different roles to service different types of requests.
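As a rough illustration of the metadata layout described above, the following sketch parses a repomd.xml document and lists the metadata files it points to. The sample document is invented for illustration (the `aaaa`/`bbbb`/`cccc` file names and the revision are placeholders, not taken from any ROS repository), and the helper name `list_metadata_files` is hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespace used by repomd.xml documents.
REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}

# Invented sample; real documents also carry checksums, sizes and timestamps.
SAMPLE_REPOMD = """
<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <revision>1</revision>
  <data type="primary">
    <location href="repodata/aaaa-primary.xml.gz"/>
  </data>
  <data type="filelists">
    <location href="repodata/bbbb-filelists.xml.gz"/>
  </data>
  <data type="other">
    <location href="repodata/cccc-other.xml.gz"/>
  </data>
</repomd>
"""


def list_metadata_files(repomd_text):
    """Map each metadata type to the relative path recorded in repomd.xml."""
    root = ET.fromstring(repomd_text.strip())
    files = {}
    for data in root.findall("repo:data", REPO_NS):
        location = data.find("repo:location", REPO_NS)
        files[data.get("type")] = location.get("href")
    return files


if __name__ == "__main__":
    for kind, href in list_metadata_files(SAMPLE_REPOMD).items():
        print(kind, href)
```

Because the signature is detached, a client verifies it by passing both files to GPG, for example `gpg --verify repomd.xml.asc repomd.xml`.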
Motivation for this change
Pulp is a very powerful content management tool, but it is extremely heavyweight and complex. Implementing the required queries to perform package invalidation (as is required by ros_buildfarm) means that we must perform import operations serially, and performance at our scale has become unsustainable. Central to our performance problems is that metadata generation in Pulp is far too slow.
Additionally, the way RPM repository metadata is hosted inherently allows races when metadata is updated while clients are downloading it, because several separate files must be updated together. Pulp has no mitigation for this problem, and it is causing jobs to occasionally fail to download repository metadata.
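One common way to narrow this kind of race, shown here only as a generic sketch and not as a description of how Pulp or createrepo-agent behave, is to write the new metadata files under names that the current repomd.xml does not reference, replace repomd.xml last with a single atomic rename, and delete the old files only after a grace period. The function names, the file naming and the grace period are all assumptions made for illustration. The sketch also ignores the detached signature, which is a second file and cannot be swapped in the same rename.

```python
import os
import tempfile
import time


def _atomic_write(path, payload):
    """Write payload to a temp file in the same directory, then rename it."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp, path)  # atomic when source and target share a filesystem
    except BaseException:
        os.unlink(tmp)
        raise


def publish_metadata(repodata_dir, new_files, repomd_bytes):
    """Publish uniquely named metadata files first, then switch repomd.xml."""
    os.makedirs(repodata_dir, exist_ok=True)
    for name, payload in new_files.items():
        _atomic_write(os.path.join(repodata_dir, name), payload)
    # Clients that fetch repomd.xml after this point see only the new files.
    _atomic_write(os.path.join(repodata_dir, "repomd.xml"), repomd_bytes)


def retire_stale(repodata_dir, keep, grace_seconds, now=None):
    """Remove files not in `keep` once they are older than the grace period."""
    now = time.time() if now is None else now
    removed = []
    for name in sorted(os.listdir(repodata_dir)):
        if name == "repomd.xml" or name in keep:
            continue
        path = os.path.join(repodata_dir, name)
        if now - os.path.getmtime(path) >= grace_seconds:
            os.remove(path)
            removed.append(name)
    return removed
```

A client that downloaded the previous repomd.xml just before the switch can still fetch the files it references until `retire_stale` removes them, which is the point of keeping the grace period longer than a typical metadata download.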
Another problem with our current solution is that the serialization of repository operations is tightly coupled to Jenkins, making it difficult to experiment with other orchestration and execution solutions.
After analyzing the performance problems we're currently experiencing with Pulp, it was decided that a new tool should be created which can solve several of the problems holding us back today.
Overview of createrepo-agent
High-level features:
repomd.xml) and retiring after it is unlikely to be requested.
Roll out process
See #972 (comment)