
support partitioned AMI creation #129

Open
kvick opened this issue Aug 26, 2013 · 25 comments

kvick (Contributor) commented Aug 26, 2013

Aminator currently assumes that we use a partition-less disk. Some users have requested something like this in the past (#59). This may be a duplicate, so feel free to close this one and reopen #59.

The end goal is that we can aminate a volume that contains a partition table. We may require some additional configs to know where/how to mount the volume.

@jhohertz

Pull #89 speaks to the same thing.

I think I've run into this issue, and may try to take something on around it.

@jhohertz

Reading the other issues around working with partitioned AMIs, I think my issue is a bit different. Aminator seems to identify that there is a partition just fine; the mistake is in trying to tell EC2 to attach a volume as a partition rather than as a block device.

I could probably tweak it so that the call detects whether there is a partition number and strips it from the attach (and I guess detach) operation, but I expect further changes will be needed, such as the suggested --partition parameter to tell Aminator which partition to care about for the remainder of the operations.
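For illustration, a minimal sketch of that suffix-stripping idea; the helper name is hypothetical and this is not existing aminator code:

    import re

    def strip_partition_suffix(device):
        # '/dev/sdf1' -> '/dev/sdf'; a name without a trailing digit is unchanged
        return re.sub(r'\d+$', '', device)

    # the attach/detach call would use the whole-device name,
    # while the mount step keeps the partition-qualified node
    print(strip_partition_suffix('/dev/sdf1'))  # /dev/sdf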

If anyone notices my messages and knows of any other activity around such support that I should look at before diving in, it would be appreciated.

Thanks.

coryb (Contributor) commented Mar 12, 2014

The base AMI that you aminate on top of would have partitions, so there is no way to detect the partitions without attaching it first, but before you attach it you have to give it a proper device name (i.e. sdg vs sdg1). So there is a bit of a chicken-and-egg problem. One option is to have a special tag on the base AMI to indicate that it has partitions, and also which partition is the root. I have a feeling the changes might be a bit messy; there are quite a few assumptions (in the form of device naming conventions) that presume we are dealing with partition-less disks.

-Cory
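A rough boto sketch of Cory's tag idea; the tag names here are invented for illustration and are not an existing aminator convention:

    import boto.ec2

    conn = boto.ec2.connect_to_region('us-east-1')
    base = conn.get_all_images(image_ids=['ami-0d9c9f64'])[0]

    # hypothetical tags on the base AMI describing its layout
    has_partitions = base.tags.get('partitioned') == 'true'
    root_partition = int(base.tags.get('root_partition', '1'))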

@jhohertz

Thanks for the feedback. I'm trying to trace the references to device nodes through the code, but I must be missing something fundamental, as I can't currently see how it would work on whole devices vs. partitions, mostly because of this bit of code:

    self._allowed_devices = [device_format.format(self._device_prefix, major, minor)
                             for major in majors
                             for minor in xrange(1, 16)]

This generates a list consisting only of partition devices (i.e. every candidate has a numeric suffix from 1-15) to consider attaching a volume to.

But again, I must be missing something, as this must work for other folks.
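For contrast, a whole-device candidate list (the model a partition-less setup would want) could be generated like this; the prefix and range of majors below are assumptions for illustration, not a proposed patch:

    # whole-device candidates: no minor/partition number at all
    device_prefix = 'xvd'
    majors = 'fghijklmnop'  # assumed range of device letters
    allowed_whole_devices = ['/dev/{0}{1}'.format(device_prefix, major)
                             for major in majors]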

bmoyles (Contributor) commented Mar 12, 2014

A partitioned AMI will likely be registered with a different pvgrub AKI than non-partitioned AMIs. I believe the partitioned AMIs are registered with the hd00-flavor AKI, as they most often have /boot (and thus menu.lst) on hd0,0. The partition-less images are usually registered with the hd0 AKI. This doesn't account for custom partition schemes, of course; someone could choose to have partitions but no distinct /boot (and thus be forced to use hd0), but that's a more distant edge case than vanilla partitioning...

@jhohertz

I was half wondering whether some property of the AMI was driving the selection of the device node it tries to attach the created EBS volume to, but the code above, with the prefix being "xvd", the majors starting at "f", and numbers 1-15, seems to be the only source of candidate devices it will ask EC2 to attach to. So it tries to do the equivalent in boto of:

ec2-attach-volume vol-123456 -i i-myintid -d /dev/sdf1

And AWS, not expecting a device node with a partition, fails the call. (I get the same API error if I do it that way myself, and dropping the partition off the device node succeeds.)

From a run with --debug on, this is where I fall down:

2014-03-12 18:15:18 [INFO] looking up base AMI with ID ami-0d9c9f64
2014-03-12 18:15:18 [INFO] Successfully resolved ubuntu/images/hvm/ubuntu-precise-12.04-amd64-server-20140227(ami-0d9c9f64)
2014-03-12 18:15:18 [INFO] Searching for an available block device
2014-03-12 18:15:19 [INFO] Block device /dev/xvdf1 allocated
2014-03-12 18:15:20 [ERROR] 400 Bad Request
2014-03-12 18:15:20 [ERROR] InvalidParameterValue: Value (/dev/sdf1) for parameter device is invalid. /dev/sdf1 is not a valid EBS device name. (request ID c30c7150-168f-47f0-a81e-b5c51e7b3b7e)

So again, I'm not even sure I'm getting as far as the part of the code that needs to be made aware of partitions; it feels like I'm still in the setup phase of allocating resources for the job.

coryb (Contributor) commented Mar 12, 2014

I think you should have -d sdf1 not -d /dev/sdf1

Aminator starts at sdf because some AWS instance types have a root disk (a) + 4 ephemeral disks (b-e).

-Cory
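In boto terms, the working attach/detach calls look roughly like this (IDs are placeholders; note the whole-device name with no partition suffix):

    import boto.ec2

    conn = boto.ec2.connect_to_region('us-east-1')
    # 'sdf' (or '/dev/sdf') is accepted; 'sdf1' is rejected by the EC2 API
    conn.attach_volume('vol-123456', 'i-myintid', 'sdf')
    # ... provision the volume ...
    conn.detach_volume('vol-123456', instance_id='i-myintid', device='sdf')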

@jhohertz

Hmmm. I just did a test: I can drop /dev/ from the name, but it still refuses to deal with a partition node, i.e.:

ec2-attach-volume vol-f07499bc -i i-a3a36680 -d sdf1

Client.InvalidParameterValue: Value (sdf1) for parameter device is invalid. sdf1 is not a valid EBS device name.

ec2-attach-volume vol-f07499bc -i i-a3a36680 -d sdf

ATTACHMENT vol-f07499bc i-a3a36680 sdf attaching 2014-03-12T19:24:34+0000

ec2-detach-volume vol-f07499bc -i i-a3a36680 -d sdf

ATTACHMENT vol-f07499bc i-a3a36680 sdf detaching 2014-03-12T19:24:34+0000

But Aminator, and the discussion thus far, imply it should be possible to attach a volume at a partition device node. I must be missing something, as that seems wrong on several levels to me. (Sure, skip partitioning and format the whole disk, but a partition without a larger volume containing it? Seems wrong to me.)

Can you confirm that I should be able to attach directly to a partition node, and that this is behaviour Aminator relies upon? (I'm experienced with server admin, but a little newer to EC2, so maybe my intuition is steering me wrong here...)

I just ran this by a peer here, who affirmed I'm not crazy: block devices need to be attached as block devices, and calling attach-volume with an attachment point ending in a number rather than a letter just isn't valid... which leads me back to being curious why the allowed-devices list contains partition numbers (see the third comment).

@jhohertz

Sorry for all the updates/edits; I'm trying to provide as much info as I can.

I've been using the master branch, but I notice there has been a lot of activity since then on the testing branch, so perhaps I will give that a whirl...

bmoyles (Contributor) commented Mar 12, 2014

Aha! Your debug output has the key.

  • First, the image you're using as a foundation is an HVM image. You won't be able to create another HVM image from that. I would use the PV image for 12.04 instead, unless there's something special about the HVM image that you really need.
  • Second, Canonical hasn't given public create-volume permissions on the snapshots backing their public AMIs, so you won't be able to use their AMI directly, unfortunately. You will need to do something along the lines of: launch the 12.04 instance, use create-image to create your own private copy of 12.04, and then use THAT as your foundation. A more advanced way to go about it (which ensures you don't create state on the new image by launching it) is to create a bare volume and dd the contents of the 12.04 rootfs onto it (you can find tarballs and such at http://cloud-images.ubuntu.com/). You'd then snapshot that volume, create an image from the snapshot (see the sketch below), and have your own private copy of 12.04 that can be used as a foundation.
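A rough boto sketch of the snapshot-and-register step at the end of that second bullet; the IDs, name, and AKI are placeholders, and the layout is assumed to be the usual single-partition /dev/sda1 root:

    import boto.ec2
    from boto.ec2.blockdevicemapping import BlockDeviceMapping, BlockDeviceType

    conn = boto.ec2.connect_to_region('us-east-1')

    # snapshot the bare volume that the 12.04 rootfs was dd'd onto
    snap = conn.create_snapshot('vol-123456', 'precise foundation rootfs')

    # register a private foundation AMI from that snapshot
    bdm = BlockDeviceMapping()
    bdm['/dev/sda1'] = BlockDeviceType(snapshot_id=snap.id,
                                       delete_on_termination=True)
    conn.register_image(name='precise-foundation',
                        architecture='x86_64',
                        kernel_id='aki-xxxxxxxx',     # pvgrub AKI for the region
                        root_device_name='/dev/sda1',
                        block_device_map=bdm)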

@jhohertz

Thanks, I was starting to come to the same realization on your second point after some reading, but you've made it much clearer. (I realized I was missing the "foundation" AMI.)

On the first point: I've been directed to create HVM-based images, as we're aligning to instance types that require them. Can I create those from a PV image? I assumed I'd need to base them on the same type.

Thanks a lot for your help and feedback.

bmoyles (Contributor) commented Mar 12, 2014

The only way for regular users to create HVM images (today) is to launch an instance and use create-image, unfortunately :\ Amazon clearly has the ability to grant folks the ability to create HVM images directly, as Canonical obviously can create their own, but they have not released that functionality widely. Until they do, you won't be able to use aminator to create HVM images, unfortunately.

coryb (Contributor) commented Mar 12, 2014

Well, sort of. We recently added a --vm-type hvm option to aminator (testing branch) to register an AMI as HVM. We aminate on top of a PV base image that is capable of booting either HVM or PV. I am not sure whether the Canonical images can boot both HVM and PV out of the box; I believe we did some grub magic to get it working right (I don't have the details at the moment).

-Cory
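For reference, registering a snapshot as HVM via boto looks roughly like this (later boto releases expose virtualization_type on register_image; the IDs and name are placeholders):

    import boto.ec2
    from boto.ec2.blockdevicemapping import BlockDeviceMapping, BlockDeviceType

    conn = boto.ec2.connect_to_region('us-east-1')
    bdm = BlockDeviceMapping()
    bdm['/dev/sda1'] = BlockDeviceType(snapshot_id='snap-12345678',
                                       delete_on_termination=True)
    conn.register_image(name='my-hvm-ami',
                        architecture='x86_64',
                        virtualization_type='hvm',    # no AKI needed for HVM
                        root_device_name='/dev/sda1',
                        block_device_map=bdm)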

bmoyles (Contributor) commented Mar 12, 2014

cough you guys have special permission from Amazon too cough
:)

coryb (Contributor) commented Mar 12, 2014

Could be, but I think it is generally available now. It is fully supported via the boto Python client, although of course I have only tested it in the accounts we own.

-Cory

bmoyles (Contributor) commented Mar 12, 2014

Hmm, weird: the API does support it, but none of the official documentation describes the process of actually creating one. Their line was that they weren't ready to support it widely due to the complexity of creating the image (you need to monkey with grub and other foo, since the HVM bootloader is more akin to a real x86 bootloader than pvgrub is).

bmoyles (Contributor) commented Mar 12, 2014

Huh, I take it all back. I just registered one of our snapshots with hvm as the VM type and sure enough it works. These aren't the droids you're looking for... move along :)

@jhohertz

This sounds good. :) I am just about to sit down and build my foundation image now, planning to do one by hand by working through the scripts in netflixoss-ansible and the wiki page with the warnings on it. Then I can find out whether I have an actual problem with partitions (which seem to be more common on HVM types, perhaps?).

Someone else today described HVM in their build as only recently "kind of" working (nothing to do with Aminator). No idea whether that image was generated from within EC2 or not.

@jhohertz

Just a note that I'm a lot further along now, partly from ensuring I'm using a foundation image and partly from switching to running Aminator on a non-HVM instance. I'm still debugging things to do with the Ansible playbooks, so not quite testing yet, but I'm well past the issue that brought me here.

@jhohertz

So with one little patch to aminator-plugins/ansible-provisioner, I have an Ansible-provisioned Aminator instance, aminating an Aminator AMI, for PV/EBS. Digging in deeper, I see my mistake was in expecting "normal" semantics around device attachment and partitions.

I tried --vm-type hvm from my PV foundation, and Aminator is happy with it, but the result fails to boot, probably due to something lacking in the foundation. I found a reference in the EC2 docs about downgrading GRUB; as I understand HVM, it looks for a first stage in the boot sector and then jumps into the later stage from /boot, and my whole-disk foundation can't handle that. Perhaps I just need to grub-install... I'd forgotten until writing this that there is room for a boot sector on the whole-disk filesystems. I may need to sort something out with the kernel too... I'm still catching up: did the Xen and mainline kernels ever unify?

Back to the topic of the ticket... since I've taken it so far off course, I had a thought.

Some of the challenge seems to be around how to handle an arbitrary partitioning arrangement (/boot on partition 1 vs. a single /). I wonder if you really need to. Since everyone needs to craft a foundation, they'll know how it's set up. Maybe allow passing an fstab (or a subset of one) into Aminator as a parameter. That said, a --partition option would allow fetching the actual table for setting up a chroot layout (and its presence or absence would switch between caring about partitions vs. the whole disk).
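To make the --partition idea concrete, a very rough sketch of what the chroot mount step might look like; the helper is hypothetical and ignores everything aminator's real volume plugins do beyond the mount itself:

    import subprocess

    def mount_root(device, partition, mountpoint):
        # e.g. device='/dev/xvdf', partition=1 -> mount /dev/xvdf1;
        # partition=None falls back to the current whole-disk behaviour
        node = '{0}{1}'.format(device, partition) if partition else device
        subprocess.check_call(['mount', node, mountpoint])
        return node

    mount_root('/dev/xvdf', 1, '/mnt/aminator-chroot')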

@jhohertz

Okay, so I've now generated partitioned AMIs for PV and HVM referencing the same snapshot, giving me the parity between environments I was hoping for, and I'm back to looking at Aminator... I just want to run my thinking by you, as I'd hope whatever I do would be worthwhile to pull.

I am basing my thinking on not really wanting multiple partitions; the partition table is more a mechanism to support HVM+PV from a single source. For my purposes a single partition is just fine, it somewhat matches Aminator's existing expectations, and I think it may be fairly simple to add support for this model.

So what I am looking at is implementing a new "blockdevice" and "volume" plugin, and adding a new environment definition that switches to using them. The working name I am using is "linux1part". I suspect I may need to derive a new finalizer as well, but I haven't dug that far yet.

Does this align with any plans/thinking you have internally on this topic? There are certainly more elaborate ways this could be done, but I'm not sure how much value there really is in elaborate partitioning arrangements for the root volume... it would increase complexity a bit around managing multiple mounts when setting up the chroot. Maybe a "linuxNpart" setup could be looked at later.

@jhohertz

Actually, the more I look at it, I'm not sure I'd need to change anything about the volume plugin.

I'm thinking of renaming the current linux blockdevice "slice" (it's not a partition per se, so I'm borrowing a BSD-ism) and adding "disk" for the model of a numberless "whole" block device.

Beyond that, I just see one bit where we need to work around oddness in the EC2 API's block device mappings. To get PV mounting the whole disk, you do the sensible:

  -b /dev/sda=<snap>:<params>

On HVM this apparently fails, and you do the very counter-intuitive:

  -b /dev/sda1=<snap>:<params>

Which, if you think about it, is the exact opposite of how this effort is trying to flip things around...

But I think those are the two main things that would need doing to get a new blockdevice plugin that groks single-partition volumes.

If these come from the foundation AMI's metadata, the latter point may not even need treatment by Aminator. That would mean looking at the metadata of two AMIs, though, which would get awkward. I have both types pointing at the same snapshot, with the assumption of always having a single partition, and a desire to source both types from one process... so now I'm looking at cloud.ec2's handling of mappings. It feels a bit hard-coded to do what I suggest above around the odd mapping needs; not doing it that way would involve, I think, externalizing mappings into a plugin type, with single-partition as the first such plugin and more complex layouts as others. That may be the way to step towards the larger theme of this ticket.
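In boto terms, the PV/HVM difference described above only shows up in the mapping key (and root device name); a sketch, assuming the same snapshot backs both registrations:

    from boto.ec2.blockdevicemapping import BlockDeviceMapping, BlockDeviceType

    def root_mapping(snapshot_id, hvm=False):
        # PV whole-disk: map the bare device; HVM: counter-intuitively, sda1
        root = '/dev/sda1' if hvm else '/dev/sda'
        bdm = BlockDeviceMapping()
        bdm[root] = BlockDeviceType(snapshot_id=snapshot_id,
                                    delete_on_termination=True)
        return root, bdm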

@jhohertz

Just a note on where I'm at: I split the volume/blockdevice plugins into slice and disk variants. That was simple enough.

Where I'm stuck is what to do with the twists needed in the root device mapping. In "slice" mode this matches root-device-name (/dev/sda1), whereas we need /dev/sda for the PV disk type and /dev/sda1 for the HVM disk type in the block device mapping (not sure why the latter).

I think the volume type needs to register something for both a root-device-name and a "mapping-root-name" for the device map. That seems cleaner than having the tag plugins inspect which volume plugin was run.

I must confess that higher-order Python isn't one of my first languages, so I'm not sure of the best way to approach this. I see there are some context models, and some places where things stuff keys into them, but I'm unclear enough on the specifics (what scopes there are, etc.) to be a little wary of making naive changes.
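To make that concrete, a purely hypothetical sketch of the two values a volume plugin could publish for later plugins to read; the key names and the plain-dict "context" are illustrative, not aminator's actual context API:

    # values a 'disk'-style volume plugin might publish into the shared context
    context = {
        'root_device_name': '/dev/sda1',  # what the OS inside the image expects
        'mapping_root_name': '/dev/sda',  # what the PV block device mapping needs
    }

    # a registration/tag plugin would read both keys instead of inspecting
    # which volume plugin produced them
    root_device = context['root_device_name']
    mapping_root = context['mapping_root_name']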

@jhohertz

Just a note that my own effort to add such support has stalled. I worked around the issue for a while by avoiding Aminator, and I've now managed to build Aminator-compatible foundations that work with HVM machine types. That makes this a whole lot less pressing for me personally, as HVM support was the big initial driver for wanting partitioned-AMI support in Aminator.

@grahamlyons

This was addressed in #207
