Implement S3 Object Storage for Package Repositories #291

Merged: 16 commits, Aug 28, 2024

Commits on Aug 26, 2024

  1. buildmaster: Check system-packages dir in builder cache pruning.

    Packages can come from either packages or system-packages directories
    and both need to be checked to see if the cached package should be
    kept.
    
    Packages coming from the system-packages directory were not taken into
    account and pruned from the builder cache on each buildrun only to then
    be re-uploaded.
    mmlr committed Aug 26, 2024
    65cf92b
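The pruning rule described in this commit can be sketched as follows. This is a minimal illustration, not the buildmaster code: the function name `split_builder_cache`, its arguments, and the `*.hpkg` glob are assumptions made for the example.

```python
from pathlib import Path


def split_builder_cache(cached_names, package_dirs):
    """Split cached package names into (keep, prune).

    A cached package is kept when a file of the same name exists in ANY of
    the given directories (for example packages and system-packages).
    """
    available = set()
    for directory in package_dirs:
        directory = Path(directory)
        if directory.is_dir():
            available.update(path.name for path in directory.glob("*.hpkg"))

    keep = sorted(name for name in cached_names if name in available)
    prune = sorted(name for name in cached_names if name not in available)
    return keep, prune
```

Checking only one of the two directories would put every package that originates from the other one into the prune list, which matches the prune and re-upload cycle the commit message describes.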
  2. buildmaster: Replace manual file moving with posix_rename.

    As the comment indicated, earlier versions of Paramiko did not provide
    a rename/move operation that was compatible with BFS due to the use of
    hardlinks. Newer versions do now provide posix_rename that can be used
    instead of the manual "mv" shell command.
    mmlr committed Aug 26, 2024
    8d0c7b0
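A hedged sketch of how `posix_rename` might be used for an upload followed by a move. The helper name `upload_then_rename` and the ".temp" suffix are assumptions for illustration; `putfo` and `posix_rename` are methods of Paramiko's `SFTPClient`, and `posix_rename` relies on the server supporting the corresponding OpenSSH extension.

```python
def upload_then_rename(sftp, file_object, destination):
    """Upload to a temporary remote path, then move it into place.

    `sftp` is expected to be a paramiko.SFTPClient (or any object with the
    same putfo/posix_rename methods). The ".temp" suffix is an assumption
    of this sketch.
    """
    temporary = destination + ".temp"
    sftp.putfo(file_object, temporary)
    # posix_rename asks the server for a rename(2)-style move instead of the
    # hardlink-based rename, so no manual "mv" shell command is needed.
    sftp.posix_rename(temporary, destination)
    return destination
```

Writing to a temporary name first means a partially transferred file never appears under the final package name.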
  3. buildmaster: Drop sources and provide licenses in the image.

    Since the host tools are now built as part of the container image build,
    the source volume is not actually needed for the backend, and the
    frontend never needed that volume in the first place.
    
    Originally the shared source volume was meant to reduce used disk space
    when running multiple instances. This is not needed anymore as the image
    contains and shares the host tools and the bootstrap process for getting
    the system-packages has been externalized.
    
    The only user of the shared sources was the built-in licenses in the
    Haiku repository. For now, provide these in the image as well. This
    could later also be moved to an external archive like what is done for
    the system-packages.
    mmlr committed Aug 26, 2024
    2977680
  4. buildmaster: Improve caching in backend container image build.

    Adding the HaikuPorter sources to the image invalidates the cache for
    each change that is made. Move that install to the end and into separate
    steps so that package installation and minisign build can be cached.
    mmlr committed Aug 26, 2024
    b01b751
  5. d3c1ffc
  6. buildmaster: Record current HaikuPorts revision on bootstrap.

    This makes a fresh bootstrap work closer to out of the box.
    mmlr committed Aug 26, 2024
    b9a39e2
  7. buildmaster: Fix typo in environment variable name.

    This is only printed when system-packages are missing.
    mmlr committed Aug 26, 2024
    2ddda59

Commits on Aug 28, 2024

  1. d8926e2
  2. buildmaster: Fix detection of repository creation failure.

    The echo command, introduced to make the output easier to read, was
    hiding the return value of the actual package repository creation
    command.
    mmlr committed Aug 28, 2024
    7691e6c
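The failure mode described here can be reproduced with a small sketch. The shell command `false` stands in for a failing repository creation command, and the helper `exit_status` is a name made up for this example; how the buildmaster actually chains the commands is not shown in this commit message.

```python
import subprocess


def exit_status(command):
    """Run a command line through sh and return its exit status."""
    completed = subprocess.run(
        ["sh", "-c", command], capture_output=True, text=True
    )
    return completed.returncode


# "false" stands in for a failing repository creation command.
# Chained with ";", the exit status is that of the trailing echo, so the
# failure is hidden from the caller.
hidden = exit_status("false; echo 'done'")

# Chained with "&&", echo only runs on success and the failing status of
# the first command is what the caller sees.
preserved = exit_status("false && echo 'done'")
```

With a POSIX shell, `hidden` is 0 while `preserved` is non-zero, which is why a check on the return value only works in the second form.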
  3. c3ac25a
  4. buildmaster: Delegate package reads/writes to PackageRepository.

    This furthers abstraction and will be needed when packages are not
    necessarily local anymore.
    
    Read and write are implemented as streaming operations using file
    objects to allow for various backends without the need for local
    temporary copies of files.
    mmlr committed Aug 28, 2024
    cd077da
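A minimal sketch of what streaming reads and writes through file objects could look like for a purely local store. The class `LocalPackageStore` and its method names are hypothetical and not taken from PackageRepository; only the idea of passing file objects instead of paths comes from the commit message.

```python
import shutil
from pathlib import Path


class LocalPackageStore:
    """Hypothetical local store that moves package data via file objects."""

    def __init__(self, root):
        self.root = Path(root)

    def read_package_into(self, name, output):
        # Copy in chunks so the whole package never has to sit in memory.
        with open(self.root / name, "rb") as source:
            shutil.copyfileobj(source, output)

    def write_package_from(self, name, source):
        with open(self.root / name, "wb") as output:
            shutil.copyfileobj(source, output)
```

Because callers only hand over readable or writable file objects, a different backend could implement the same two methods against a network stream without creating a local temporary copy.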
  5. Cleanup: Rename argument of ScheduledBuild to missingPackageIDs.

    That's what the member variable is called and what that list actually
    contains.
    mmlr committed Aug 28, 2024
    f797928
  6. Cleanup: Remove unused package obsoletion functions.

    These are never used as the obsoletion is handled at the Repository and
    PackageRepository level.
    mmlr committed Aug 28, 2024
    4d64533
  7. b0aed94
  8. Implement S3 based storage backend for PackageRepository.

    The storage backend is used to hold the actual packages while the local
    packages directory is only used to keep track of the current package
    list.
    
    New packages are spooled to the local packages directory as they are
    built and are kept there for adding them to the package repo file
    (where package information is needed and the checksum is calculated).
    Once added to the repo, the packages are uploaded to object storage and
    the local copy is stubbed out to a 0-byte file.
    
    When dependency packages are needed on the builder (and are not already
    cached there), they are streamed directly from object storage without
    repopulating the local packages directory.
    
    After the package repo is updated it is uploaded to object storage as
    well, along with its info file, sha256 checksum and the package list
    file. This allows the object storage to be used as a complete package
    repo by pkgman directly.
    
    Finally, packages in object storage are pruned based on the list of
    current local stub package files to keep the state in sync.
    
    Note that this requires a "package_repo" command that supports the "-t"
    argument to the "update" command as only stub packages are available
    locally and the package info can therefore not be extracted from them.
    Instead the package names are assumed to be canonical and the package
    info to be immutable. This is unproblematic, as the buildmaster setup
    ensures that packages cannot be overwritten (this would also have failed
    previously as the checksums were intentionally not revalidated).
    
    The storage backend config file path is given with a new
    "--storage-backend-config" option. It should point to a JSON file with
    a "backend_type" string (only "s3" is supported for now). A sample
    config is also included. An empty path is allowed and causes no storage
    backend to be used.
    
    The S3 storage backend needs an "endpoint_url", "access_key_id",
    "secret_access_key" and "bucket_name" to be specified in the config
    file. An optional "prefix" can also be supplied to place multiple
    instances into the same bucket.
    
    Include the storage backend config option in the buildmaster scripts fed
    from a "STORAGE_BACKEND_CONFIG" environment variable for easy
    configuration.
    mmlr committed Aug 28, 2024
    452f7b3
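The config handling and the upload, stub and prune cycle described in this commit can be sketched roughly as below. Only the config keys ("backend_type", "endpoint_url", "access_key_id", "secret_access_key", "bucket_name", "prefix") come from the commit message. Everything else is an assumption for illustration: the function and class names are hypothetical, and `InMemoryBackend` is a dictionary-based stand-in for a real S3 client.

```python
import json
from pathlib import Path

REQUIRED_S3_KEYS = (
    "endpoint_url", "access_key_id", "secret_access_key", "bucket_name",
)


def load_storage_backend_config(path):
    """Parse the JSON config; an empty path means no storage backend."""
    if not path:
        return None
    with open(path) as handle:
        config = json.load(handle)
    if config.get("backend_type") != "s3":
        raise ValueError("unsupported backend_type")
    missing = [key for key in REQUIRED_S3_KEYS if key not in config]
    if missing:
        raise ValueError("missing config keys: " + ", ".join(missing))
    config.setdefault("prefix", "")  # the prefix is optional
    return config


class InMemoryBackend:
    """Stand-in for an object store: maps prefixed object keys to bytes."""

    def __init__(self, prefix=""):
        self.prefix = prefix
        self.objects = {}

    def write(self, name, data):
        self.objects[self.prefix + name] = data

    def names(self):
        return {
            key[len(self.prefix):]
            for key in self.objects
            if key.startswith(self.prefix)
        }

    def delete(self, name):
        del self.objects[self.prefix + name]


def upload_and_stub(backend, package_path):
    """Upload a package, then truncate the local copy to a 0-byte stub."""
    package_path = Path(package_path)
    # A real backend would stream the file; reading it whole keeps this short.
    backend.write(package_path.name, package_path.read_bytes())
    package_path.write_bytes(b"")


def prune_backend(backend, packages_dir):
    """Delete stored packages that have no local stub anymore."""
    current = {path.name for path in Path(packages_dir).glob("*.hpkg")}
    for name in sorted(backend.names() - current):
        backend.delete(name)
```

In this sketch the local stub files act as the source of truth for which packages are current, so removing a stub and pruning is what removes the package from storage. The repo file, info file, checksum and package list that the commit also uploads are left out here.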
  9. buildmaster: Drop shared packages volume.

    The packages repository never actually needed to be shared or separate
    and can just as well be located on the main buildmaster volume. It was
    originally shared only so that repositories for multiple architectures
    could be served from a single server.
    
    When using object storage as the storage backend, the repository
    directories are only used to keep the state and don't provide the
    actual repo or package files. In this case a separate volume is even
    less useful.
    
    Point frontend container to the single buildmaster volume instead of the
    previously shared instances directory on the packages volume. This means
    that the frontend will generally not be shared across architectures
    anymore. Since it reduces the scope of the shared volumes, this does ease
    deployment.
    
    The "repo_consistency.txt" and "report.txt" files, which report the
    consistency of the recipe and package repository respectively, are moved
    from the packages volume to the output directory as this makes them
    accessible through the normal frontend.
    mmlr committed Aug 28, 2024
    40047c4