An Overview of the Razor Microkernel
The Razor Microkernel is a small, in-memory Linux kernel that is used to boot 'nodes' (in this case, servers) that are discovered by the Razor Server. It gives the Razor Server the point of control that it needs in order to discover and manage these nodes in the network. The Razor Microkernel also performs an essential task for the Razor Server: discovering the capabilities of these nodes and reporting those capabilities back to the Razor Server (so that the Razor Server can determine what it should do with them). Without the Razor Microkernel (or something similar to it), the Razor Server cannot do its job (classifying nodes and, based on their capabilities, applying models to them). So how exactly does the Razor Microkernel work? What does it do, and how does it interact with the Razor Server in order to accomplish those tasks? Perhaps a diagram would be helpful in understanding the interactions between these two components:
As you can see in this diagram, the Razor Server (which is represented by the yellow-colored components in the center of this diagram) actually has two “channels” that it can use when interacting with the Razor Microkernel instances that it is managing (which are labeled as MK, colored brown, and appear on the left-hand side of this diagram). The first communication channel is request-based, and is driven by the HTTP 'checkin' and 'register' requests that are sent to the Razor Server by the Razor Microkernel Controller (using an API that is defined by one of the Node.js instances provided by the Razor Server). The Razor Server’s response to these checkin requests can include meta-data and commands that it would like to pass back to the Microkernel instances (more on this below). The second channel that the Razor Server could use to communicate with the Microkernel instances is the Marionette Collective, or MCollective (shown as the MColl node in the block diagram above). An MCollective daemon instance runs on every Microkernel instance, and that daemon could be used for communication between the Razor Server and one (or more) Microkernel instances. This channel is currently not used, but it is available, and it may be used (in a future version of the Razor Microkernel) to control multiple Microkernel instances from a single Razor Server command. The other two components shown in this diagram (the MongoDB instance and the Puppet Master instance) are key components that the Razor Server interacts with over time, but we won’t go into any specifics here as to how those interactions occur, since our primary goal here is to show how the Razor Server and Razor Microkernel work together to discover and manage servers in the network.
So, now that we’ve shown how the Razor Microkernel and Razor Server interact, what exactly is the role that the Razor Microkernel plays in this process? As some of you may already know, the primary responsibility of the Razor Server is to use the properties (or “facts”) of the hardware (or nodes) that it “discovers” to determine what should be done with those nodes. The properties that are reported to Razor during the node registration process (more on this below) are used to “tag” the nodes being managed by Razor, and those tags can then be used to map a policy to each of the nodes (which can trigger the process of provisioning an OS to one or more nodes, for example). In this picture, the primary responsibility of the Razor Microkernel is to provide the Razor Server with the facts for the nodes onto which the Microkernel is “deployed”. The Razor Microkernel gathers these facts using a combination of tools (primarily the Facter tool, from Puppet Labs, along with the lshw, dmidecode, and lscpu commands), and the Microkernel reports them back to the Razor Server as part of the node registration process. Without the Microkernel (or something like it) running on these nodes, the Razor Server has no way to determine what the capabilities of the nodes are and, using those capabilities, determine what sort of policy it should be applying to any given node.
The Razor Microkernel also has a secondary responsibility in this picture: to provide a default boot state for any node that is discovered by the Razor Server. When a new node is discovered (any node for which the Razor Server cannot find an applicable policy), the Razor Server applies a default policy to that node, which results in that newly discovered node being booted using the Razor Microkernel. This will typically trigger the process of node checkin and registration, but in the future we might use the same pattern to trigger additional actions using customized Microkernel instances (a Microkernel that performs a system audit or a “boot-nuke”, for example). The existence of the Microkernel instance (and the fact that the Razor Server is selecting the Microkernel instance based on policies defined within the Razor Server) means that the possibilities here are almost endless.
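The flow from facts to tags to policy, with the Microkernel as the fallback, can be sketched in a few lines of Python. This is only an illustration of the idea described above, not the Razor Server's actual matching logic: the `TagRule` class, the rule and policy names, the regular-expression matching, and the fact keys used here are all assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class TagRule:
    """Hypothetical rule: apply `tag` when the named fact matches `pattern`."""
    tag: str
    fact: str
    pattern: str  # regular expression matched against the whole fact value

    def matches(self, facts):
        value = facts.get(self.fact)
        return value is not None and re.fullmatch(self.pattern, str(value)) is not None


def tags_for(facts, rules):
    """Return the sorted list of tags whose rules match the reported facts."""
    return sorted({rule.tag for rule in rules if rule.matches(facts)})


def select_policy(tags, policies, default="microkernel"):
    """Return the first policy whose required tags are all present.

    When no policy applies, fall back to the default, which stands in for
    "boot this node using the Razor Microkernel".
    """
    for name, required in policies:
        if set(required) <= set(tags):
            return name
    return default


# Illustrative rules and policies only; none of these names come from Razor.
RULES = [
    TagRule("vmware", "virtual", "vmware"),
    TagRule("big_memory", "memorysize_mb", r"\d{5,}"),
]
POLICIES = [("install_hypervisor", ["big_memory"])]

facts = {"virtual": "vmware", "memorysize_mb": "2048"}
print(select_policy(tags_for(facts, RULES), POLICIES))
```

A node that has never registered has no facts, so it receives no tags and falls through to the default, which mirrors the "default boot state" role described above.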
Given that the Razor Microkernel is the default boot state for any new node encountered by the Razor Server, perhaps it would be worthwhile to describe the Microkernel boot process itself. This process begins with the delivery of the Razor Microkernel (as a compressed kernel image and a ram-disk image) by the Razor Server’s “Image Service”. As part of the Microkernel boot process, a number of “built-in” extensions to our base-line Tiny Core Linux OS are installed. In the current implementation, the built-in extensions (and their dependencies) that are installed include the extensions for Ruby (v1.8.7), bash, lshw, and dmidecode along with the drivers and/or firmware needed to support accessing SCSI disks and the Broadcom NetXtreme II networking card. In addition to the extensions that are installed at boot, there are also extensions that are installed during the post-boot process. Currently the only additional extension that is installed during the post-boot phase is an Open VM Tools extension that we have built, and that extension (and its dependencies) is only installed if we are booting a VM in a VMware environment using our Microkernel. In that case, not only is the extension installed, but the kernel modules provided by that extension are dynamically loaded, providing us with complete access to information about the underlying VM (information we would not be able to see without these kernel modules).
Once the Microkernel has been booted and these extensions have been installed, the next step in the boot process is to finish initializing the OS. This includes finalizing the initial configuration that will be used by our Microkernel Controller (using information gathered from the DHCP server to change this configuration so that it points to the correct Razor Server instance for its initial checkin, for example) and setting the hostname for the Microkernel instance (so that the hostnames will be unique, based on the underlying hardware). Once the configuration is finalized, a few key services are started up, along with the Microkernel Controller itself. When this process is finally complete, the following processes will be running in our Microkernel instance:
- The Microkernel Controller – a Ruby-based daemon process that interacts with the Razor Server via HTTP
- The Microkernel TCE Mirror – a WEBrick instance that provides a completely internal web-server that can be used to obtain the Tiny Core Linux (TCL) extensions that should be installed once the boot process has completed. As was mentioned previously, the only extension that is currently provided by this mirror is the Open VM Tools extension (and its dependencies).
- The Microkernel Web Server – a WEBrick instance that can be used to interact with the Microkernel Controller via HTTP; currently this server is only used by the Microkernel Controller itself to save any configuration changes it might receive from the Razor Server (this action actually triggers a restart of the Microkernel Controller by this web server instance), but this is the most likely point of interaction between MCollective and the Microkernel Controller in the future.
- The MCollective daemon – as was mentioned previously, this process is not currently used, but it is available for future use
- The OpenSSH server daemon – only installed and running if we are in a “development” Microkernel; in a “production” Microkernel this daemon process is not started (in fact, the package containing this daemon process isn’t even installed).
Once the node has been successfully booted using the Microkernel, the Microkernel Controller’s first action is to check in with the Razor Server (this “checkin action” is repeated periodically, and the timing of these checkins is set in the configuration that the Razor Server passes back to the Microkernel in the checkin response; more on this below). In the Razor Server’s response to these checkin requests, the server includes two additional components. The first component included by the server is a command that tells the Microkernel what it should do next. Currently, this set of commands is limited to the following:
- acknowledge – A command from the Razor Server indicating that the checkin request has been received and that there is no action necessary on the part of the Microkernel at this time
- register – A command from the Razor server asking the Microkernel to report back the “facts” that it can discover about the underlying hardware that it has been deployed onto
- reboot – A command from the Razor Server asking the Microkernel instance to reboot itself. This is typically the result of the Razor Server finding an applicable policy for that node after the Microkernel has registered the node with the Razor Server (or after a new policy has been defined), but this command might be sent back under other circumstances in the future.
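The three commands listed above amount to a small dispatch step in the controller. The following Python sketch shows one way such a step could look; the real Microkernel Controller is a Ruby daemon, and the function name, the callback parameters, and the returned labels are all hypothetical.

```python
def handle_command(command, register, reboot):
    """Dispatch one command taken from a checkin response.

    `register` and `reboot` are caller-supplied callables (hypothetical hooks
    for "report facts to the server" and "reboot this node"). The return value
    is just a label describing what was done.
    """
    if command == "acknowledge":
        # Checkin received; nothing to do until the next checkin.
        return "idle"
    if command == "register":
        register()
        return "registered"
    if command == "reboot":
        reboot()
        return "rebooting"
    # The source lists only three commands, so anything else is rejected here.
    raise ValueError(f"unknown command: {command!r}")


# Example usage with stand-in callbacks that only record what was requested.
actions = []
handle_command("register", lambda: actions.append("send facts"), lambda: actions.append("reboot"))
print(actions)
```

Passing the actions in as callables keeps the dispatch logic testable without touching the network or the host.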
The second component sent back to the Microkernel Controller by the Razor Server in response to a checkin request is the configuration that the Razor Server would like to apply to that Microkernel instance (this includes parameters like the periodicity that the Microkernel should be using for its checkin requests, a pattern indicating which “facts” should NOT be reported, the TCE mirror location that the Microkernel should use to download any extensions, and even the URL of the Razor Server itself). If this configuration has changed in any way since the last checkin by the Microkernel, the Microkernel Controller will save the new configuration and will then be restarted (forcing it to pick up the new configuration). This ability to set the Microkernel Controller’s configuration using the checkin response gives the Razor Server complete control over the behavior of the Microkernel instances that it is interacting with. The following sequence diagram can be used to visualize the sequence of actions outlined above:
It should be noted here that there are three situations under which the Microkernel Controller might register with the Razor Server (providing the Razor Server with the latest “facts” about the underlying node):
- Whenever the Razor Server sends back a “register” command in the response to the checkin request (or if the facts about the node have changed since the last checkin). This can occur if the Razor Server has never seen that node before, or if the Razor Server has not seen that node in a while (the timing for this is configurable).
- When the Microkernel Controller first starts after the Microkernel finishes booting. In that case, the Microkernel Controller sets a flag in the checkin request indicating that this is the first checkin after a (re)boot of the Microkernel and the Razor server will send back a “register” command in the checkin response (forcing the Microkernel Controller to register with the Razor server, see above).
- Whenever the Microkernel Controller detects that the “facts” that it gathers about the underlying node are different than they were during its last successful checkin with the Razor server. In that case, the Microkernel Controller will register with the Razor Server (without being prompted to do so).
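The three registration triggers above, together with the configuration handling described earlier, can be sketched from the controller's side as follows. This is a rough Python illustration rather than the actual Ruby implementation: the `ControllerState` class, the `first_checkin` field name, the use of a regular expression on fact names for the exclusion pattern, and the sample fact and configuration keys are all assumptions.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ControllerState:
    """Hypothetical in-memory state kept by the controller between checkins."""
    first_checkin: bool = True                       # set until the first checkin after a (re)boot
    last_facts: dict = field(default_factory=dict)   # facts as of the last successful checkin
    config: dict = field(default_factory=dict)       # configuration last received from the server


def build_checkin_request(state):
    """Build the checkin payload, flagging the first checkin after a (re)boot."""
    return {"first_checkin": state.first_checkin}


def filter_facts(facts, exclude_pattern):
    """Drop facts whose names match the exclusion pattern (assumed to be a regex)."""
    if not exclude_pattern:
        return dict(facts)
    skip = re.compile(exclude_pattern)
    return {name: value for name, value in facts.items() if not skip.search(name)}


def should_register(state, command, current_facts):
    """Return the reason for registering, or None if no registration is needed."""
    if command == "register":
        # Covers both a server-initiated request and the first checkin after boot,
        # since the server answers the first-checkin flag with "register".
        return "server requested"
    if current_facts != state.last_facts:
        return "facts changed"
    return None


def apply_config(state, new_config):
    """Store a changed configuration; return True when a restart is needed."""
    if new_config == state.config:
        return False
    state.config = dict(new_config)
    return True


state = ControllerState()
facts = filter_facts({"uptime_seconds": 5, "processorcount": 4}, "^uptime")
print(build_checkin_request(state), should_register(state, "acknowledge", facts))
```

In this sketch the facts are filtered before they are compared, so a fact excluded by the pattern cannot trigger a registration on its own; whether the real controller orders these steps the same way is an assumption.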