New feature: reservations #37

Merged
merged 20 commits into from
Oct 12, 2023

@fridim fridim commented Sep 21, 2023

See GPTEINFRA-5361 Dedicated AWS Sandboxes for events

User story

As Ops, I want to be able to reserve AWS sandboxes for events (for example, Summit) in advance, so that we can:

  • provide the list of accounts to AWS for monitoring
  • clean up more easily once the event is done
  • track costs more easily

The selection of which reservation to use can happen in the AgnosticV definition of the catalog items.

This change

Implementation of a new feature to be able to reserve resources.

Part of this change:

  • Update OpenAPI schema - Add new and update existing endpoints to interact with reservations
  • Add functional tests using Hurl
  • Add DB migration
    • Create a new table, reservations, that holds the state of live reservation definitions
    • Create a new table, reservations_events, with associated triggers and SQL functions, to track every modification made to the reservations table
  • Patch conan the destroyer (cleanup daemon) to preserve the reservation name when cleaning up.
  • Patch sandbox-list to display a new "reservation" column
  • Patch prometheus endpoint to add the reservation
  • Rename tag of temporary images built from Pull Requests: add temporary- prefix to be explicit. Change expiration from 1 day to 7 days.
  • Fix locality of workers
    Currently, the PostgreSQL channels in the sandbox-api have no concept of locality:
    • lifecycle_placement_jobs_status_channel
    • lifecycle_resource_jobs_status_channel
    This means that workers running on another cluster and listening to the same PostgreSQL channel can pick up a request and do the work.
    That makes troubleshooting very hard, as one would have to read the logs of every worker to understand the flow.
    We fix this by implementing locality in the request channels, so that workers from the originating pod get the first chance to process a request. If they don't, after a while, any worker can claim it.

@fridim fridim force-pushed the GPTEINFRA-5361 branch 2 times, most recently from 2676038 to bf55569 Compare September 21, 2023 12:36
Also cleanup formatting in 000.hurl
This commit includes:
- Update to Go version 1.21
- Consolidate the Kind of resource for AWS sandboxes: use AwsSandbox.
  Defensive programming: handle all possibilities in switch cases.
- Fix GetAccounts: do not manually specify all fields, as it's the
  default.
- Fix GetAccounts, which was limited to 10. It now uses the 'count'
  argument properly.
- Add schema migration.  There is a table 'reservations_events' to track
any update/insert/delete so we keep full history of what happens.
- It's possible to scale down a reservation.
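
To illustrate the idea of keeping a full history, here is a small in-memory sketch in Go. It is not the migration itself: the PR does this with SQL triggers on the reservations table, whereas this example swaps in a plain Go store that appends an event on every insert, update and delete. The `Reservation`, `Event` and `Store` types and their fields are hypothetical, as are the reservation name and counts.

```go
package main

import (
	"fmt"
	"time"
)

// Reservation is a hypothetical, simplified live reservation.
type Reservation struct {
	Name  string
	Count int // number of sandboxes reserved
}

// Event is one history entry, similar in spirit to a row in reservations_events.
type Event struct {
	Op    string       // "INSERT", "UPDATE" or "DELETE"
	Name  string       // reservation name
	Old   *Reservation // state before the change, nil on insert
	After *Reservation // state after the change, nil on delete
	At    time.Time
}

// Store holds the live reservations plus an append-only history.
type Store struct {
	live   map[string]Reservation
	events []Event
}

func NewStore() *Store { return &Store{live: map[string]Reservation{}} }

func (s *Store) record(op, name string, old, after *Reservation) {
	s.events = append(s.events, Event{Op: op, Name: name, Old: old, After: after, At: time.Now()})
}

// Upsert creates or updates a reservation and records the change.
func (s *Store) Upsert(r Reservation) {
	if prev, ok := s.live[r.Name]; ok {
		s.record("UPDATE", r.Name, &prev, &r)
	} else {
		s.record("INSERT", r.Name, nil, &r)
	}
	s.live[r.Name] = r
}

// Delete removes a reservation; the history entry keeps its last state.
func (s *Store) Delete(name string) {
	prev, ok := s.live[name]
	if !ok {
		return
	}
	delete(s.live, name)
	s.record("DELETE", name, &prev, nil)
}

func main() {
	s := NewStore()
	s.Upsert(Reservation{Name: "summit", Count: 20})
	s.Upsert(Reservation{Name: "summit", Count: 5}) // scale down
	s.Delete("summit")
	for _, e := range s.events {
		fmt.Println(e.Op, e.Name)
	}
}
```

The live map ends up empty once the reservation is deleted, but the history still shows that it was created, scaled down and removed.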
Error:

popen failure: Cannot allocate memory
initdb: error: program "postgres" is needed by initdb but was not found in the same directory as "/usr/lib/postgresql/15/bin/initdb"

Keep using bullseye version of the image for now
Make sure workers consume local jobs first.
Give a 2-second delay for non-local workers. They can claim after that
delay if the job is still 'new' and available for pickup.
* Add new endpoints and patch existing handlers.
* Update functional tests
* Update OpenAPI schema
As we'll onboard more account providers in the future, make it clear that
this one is for Aws.
@fridim fridim added the enhancement New feature or request label Sep 27, 2023
@fridim fridim marked this pull request as ready for review October 11, 2023 09:41

@aleixhub aleixhub left a comment


LGTM

@fridim fridim merged commit 6d2bea9 into main Oct 12, 2023