Embedding

K Lange edited this page Feb 9, 2021 · 2 revisions

There are two ways to connect Kuroko with C code: embedding and modules.

Embedding involves including the interpreter library and initializing and managing the VM state yourself.

C modules allow C code to provide functions through imported modules.

If you want to provide C functionality for Kuroko, build a module. If you want to provide Kuroko as a scripting language in a C project, embed the interpreter.

With either approach, the API provided by Kuroko is the same beyond initialization.

Embedding Kuroko

Kuroko is built as a shared library, libkuroko.so, which your application can link against. libkuroko.so generally depends on the system dynamic linker, which may require linking an additional library (e.g. -ldl).

The simplest example of embedding Kuroko is to initialize the VM and interpret an embedded line of code:

#include <stdio.h>
#include <kuroko.h>

int main(int argc, char *argv[]) {
    krk_initVM(0);
    krk_interpret("import kuroko\nprint('Kuroko',kuroko.version)\n", 1, "<stdin>","<stdin>");
    krk_freeVM();
    return 0;
}

There is a single, shared VM state. krk_initVM(flags) will initialize the compiler and create built-in objects.

krk_interpret compiles and executes a block of code and takes the following arguments:

    KrkValue krk_interpret(const char *sourceText, int newModuleScope, char *fromName, char *fromFile);

If newModuleScope is non-zero, the interpreter will parse code in the context of a new module and the KrkValue returned will be a module object.

If newModuleScope is zero, the return value will be the last value popped from the stack during execution of sourceText. This can be used, as in the REPL, when providing interactive sessions.

The fromName argument provides the name of the module created when newModuleScope is non-zero, and fromFile is used when displaying tracebacks.

Building Modules

Modules are shared objects with at least one exported symbol: krk_module_onload_{NAME} (where {NAME} is the name of your module, which should also be the name of your shared object file excluding the .so suffix).

Your module's krk_module_onload_... function should return a KrkValue representing a KrkInstance of the vm.baseClasses->moduleClass class.

KrkValue krk_module_onload_fileio(void) {
	KrkInstance * module = krk_newInstance(vm.baseClasses->moduleClass);
	/* Store it on the stack for now so we can do stuff that may trip GC
	 * and not lose it to garbage collection... */
	krk_push(OBJECT_VAL(module));

	/* ... */

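	/* Pop the module object before returning; it'll get pushed again
	 * by the VM before the GC has a chance to run, so it's safe.
	 * The pop is kept outside assert() so it still runs under NDEBUG. */
	KrkValue popped = krk_pop();
	assert(AS_INSTANCE(popped) == module);
	(void)popped;
	return OBJECT_VAL(module);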
}

Defining Native Functions

Simple functions may be added to the interpreter by binding them to vm.builtins or to your own module instance.

Native functions should have a call signature as follows:

KrkValue my_native_function(int argc, KrkValue argv[], int hasKw);

If hasKw is non-zero, then the value in argv[argc] will represent a dictionary of keyword and value pairs. Positional arguments will be provided in order in the other indexes of argv.

Functions must return a value. If you do not need to return data to callers, return NONE_VAL().
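To make the calling convention concrete, here is a standalone sketch that mimics the shape of a native function without linking against Kuroko. ArgValue, arg_none, arg_int, and native_sum are hypothetical stand-ins invented for this illustration, not part of Kuroko's API; a real native function would use KrkValue and return NONE_VAL() where this sketch returns its own "none" value.

```c
#include <stddef.h>

/* Hypothetical stand-in for KrkValue; not Kuroko's real type. */
typedef enum { ARG_NONE, ARG_INTEGER, ARG_KWARGS } ArgType;
typedef struct { ArgType type; long integer; } ArgValue;

ArgValue arg_none(void) { ArgValue v = { ARG_NONE, 0 }; return v; }
ArgValue arg_int(long i) { ArgValue v = { ARG_INTEGER, i }; return v; }

/* Same shape as the native signature above, using the stand-in type.
 * Sums the integer positional arguments in argv[0..argc-1]. */
ArgValue native_sum(int argc, ArgValue argv[], int hasKw) {
	if (hasKw) {
		/* argv[argc] would hold the keyword dictionary here; this
		 * sketch does not accept keywords and just returns "none". */
		return arg_none();
	}
	long total = 0;
	for (int i = 0; i < argc; ++i) {
		if (argv[i].type != ARG_INTEGER) return arg_none();
		total += argv[i].integer;
	}
	/* A native function always returns a value. */
	return arg_int(total);
}
```

The point of the sketch is the indexing: positional arguments occupy argv[0] through argv[argc-1], and the slot at argv[argc] is only meaningful when hasKw is non-zero.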

To bind the function, use krk_defineNative:

krk_defineNative(&vm.builtins->fields, "my_native_function", my_native_function);

Binding to vm.builtins->fields makes your function accessible from any scope, provided its name is not shadowed by a module global or function local. This is recommended for embedded applications but discouraged for modules.

Kuroko's Object Model

For both embedding and C modules, you will likely want to create and attach functions, classes, objects, and so on.

It is recommended you read Crafting Interpreters, particularly the third section describing the implementation of clox, as a primer on the basic mechanisms of the value system that Kuroko is built upon.

Essentially, everything accessible to the VM is represented as a KrkValue, which this documentation will refer to simply as a value from here on out.

Values are small, fixed-sized items and are generally considered immutable. Simple types, such as integers, booleans, and None, are directly represented as values and do not exist in any other form.

More complex types are represented by subtypes of KrkObj known as objects, and values that represent them contain pointers to these KrkObjs. The KrkObjs themselves live on the heap and are managed by the garbage collector.
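As a rough illustration of the split between values and objects, the following standalone sketch models a clox-style tagged value. TagValue, TagObj, and the tag_* helpers are hypothetical names made up for this example; Kuroko's real KrkValue and KrkObj definitions may be laid out differently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not Kuroko's real definitions. */
typedef enum { TAG_NONE, TAG_BOOLEAN, TAG_INTEGER, TAG_OBJECT } TagType;

/* Header shared by everything that lives on the heap. */
typedef struct TagObj {
	int kind; /* which heap type this is (string, class, ...) */
} TagObj;

/* A small, fixed-size value: a type tag plus an inline payload. */
typedef struct {
	TagType type;
	union {
		bool boolean;
		long integer;
		TagObj * object; /* only a pointer; the data lives on the heap */
	} as;
} TagValue;

TagValue tag_integer(long i) {
	TagValue v;
	v.type = TAG_INTEGER;
	v.as.integer = i;
	return v;
}

TagValue tag_object(TagObj * o) {
	TagValue v;
	v.type = TAG_OBJECT;
	v.as.object = o;
	return v;
}

/* Simple values compare by content; object values compare by identity. */
bool tag_same(TagValue a, TagValue b) {
	if (a.type != b.type) return false;
	switch (a.type) {
		case TAG_NONE:    return true;
		case TAG_BOOLEAN: return a.as.boolean == b.as.boolean;
		case TAG_INTEGER: return a.as.integer == b.as.integer;
		case TAG_OBJECT:  return a.as.object == b.as.object;
	}
	return false;
}
```

Copying a TagValue that holds an object copies only the pointer, so both copies refer to the same heap object.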

Strings, functions, closures, classes, instances, and tuples are all basic objects and carry additional data in their heap representations.

Strings (KrkString) are immutable and de-duplicated - any two strings with the same text have the same object. (See Crafting Interpreters, chapter 19) Strings play a heavy role in the object model, providing the basic type for indexing into attribute tables in classes and instances.
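The de-duplication idea can be sketched without Kuroko at all. The model below uses a linked list where a real VM would use a hash table, and InternString, intern_all, and intern_copyString are hypothetical names for this illustration only; they are not Kuroko's implementation of krk_copyString.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an interned string object. */
typedef struct InternString {
	size_t length;
	char * chars;
	struct InternString * next; /* next entry in the intern list */
} InternString;

/* Every string created so far. A real VM would use a hash table. */
InternString * intern_all = NULL;

/* Return the unique InternString for this text, creating it if needed. */
InternString * intern_copyString(const char * text, size_t length) {
	for (InternString * s = intern_all; s; s = s->next) {
		if (s->length == length && memcmp(s->chars, text, length) == 0) {
			return s; /* same text, same object */
		}
	}
	InternString * s = malloc(sizeof(InternString));
	if (!s) return NULL;
	s->chars = malloc(length + 1);
	if (!s->chars) { free(s); return NULL; }
	memcpy(s->chars, text, length);
	s->chars[length] = '\0';
	s->length = length;
	s->next = intern_all;
	intern_all = s;
	return s;
}
```

In this model, two lookups of the same text return the same pointer, so code that uses strings as table keys can compare pointers instead of comparing characters.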

Functions (KrkFunction) represent bytecode, argument lists, default values, local names, and constants - the underlying elements of execution for a function. Generally, functions are not relevant to either embedding or C modules and are an internal implementation detail of the VM.

Closures (KrkClosure) represent the callable objects for functions defined in user code. When embedding or building a C module, you may need to deal with closures for Kuroko code passed to your C code.

Bound methods (KrkBoundMethod) connect methods with the "self" object they belong to, allowing a single value to be passed on the stack and through fields.

Classes (KrkClass) represent collections of functions. In Kuroko, all object and value types have a corresponding KrkClass.

Instances (KrkInstance) represent user objects and store fields in a hashmap and also point to the class they are an instance of. Instances can represent many things, including collections like lists and dictionaries, modules, and so on.

Tuples (KrkTuple) represent simple fixed-sized lists and are intended to be immutable.

Finally, native functions (KrkNative) represent callable references to C code.

Most extensions to Kuroko, both in the form of embedded applications and C modules, will primarily deal with classes, instances, strings, and native functions.

Two of the high-level collection types, lists and dictionaries, are instances of classes provided by the __builtins__ module. While they are not their own type of KrkObj, some macros are provided to deal with them.

Creating Objects

Now that we've gotten the introduction out of the way, we can get to actually creating and using these things.

The C module example above demonstrates the process of creating an object in the form of an instance of the vm.baseClasses->moduleClass class. All C modules should create one of these instances to expose other data to user code that imports them.

Most extensions will also want to provide their own types through classes, as well as create instances of those classes.

NOTE: When creating and attaching objects, pay careful attention to the order in which you allocate new objects, including strings. If two allocations happen in sequence without the first allocated object being stored in a location reachable from the interpreter roots, the second allocation may trigger the garbage collector which will immediately free the first object. If you need to deal with complex allocation patterns, place values temporarily on the stack to prevent them from being collected.

    /* Create a class 'MyNewClass' and attach it to our module. */
    KrkClass * myNewClass = krk_newClass(krk_copyString("MyNewClass", 10), vm.baseClasses->objectClass);
    krk_attachNamedObject(&module->fields, "MyNewClass", (KrkObj*)myNewClass);

Here we have created a new class named MyNewClass and exposed it through the fields table of our module object under the same name. Passing vm.baseClasses->objectClass to krk_newClass as the base class ensures that the new class fits into the general inheritance hierarchy, which typically means inheriting from the base object class.

We're not done preparing our class, though.

Native functions are attached to class method tables in a similar manner to normal functions:

krk_defineNative(&myNewClass->methods, ".my_native_method", my_native_method);

When attaching methods, notice the . at the start of the name. This indicates to krk_defineNative that this method will take a "self" value as its first argument. This affects how the VM modifies the stack when calling native code and allows native functions to integrate with user code functions and methods.

In addition to methods, native functions may also provide classes with dynamic fields. A dynamic field works much like a method, but it is called implicitly when the field is accessed. Dynamic fields are used by the native classes to provide non-instance values with field values.

krk_defineNative(&myNewClass->methods, ":my_dynamic_field", my_dynamic_field);

If instances of your class will be created by user code, you can provide an __init__ method, or any of the other special methods described in the Examples above.

When you've finished attaching all of the relevant methods to your class, be sure to call krk_finalizeClass, which creates shortcuts within your class's struct representation that allow the VM to find special functions quickly:

krk_finalizeClass(myNewClass);

Specifically, this will search through the class's method table to find implementations for functions like __repr__ and __init__. This step is required for these functions to work as expected as the VM will not look them up by name.
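The following standalone sketch models why this step matters. It is not Kuroko's implementation: CacheClass, cache_defineMethod, cache_finalizeClass, and the cachedInit and cachedRepr members are hypothetical names chosen for the example, and the method table is a plain array rather than a hash table.

```c
#include <string.h>
#include <stddef.h>

typedef int (*CacheMethod)(void);

typedef struct { const char * name; CacheMethod fn; } CacheEntry;

/* Hypothetical stand-in for a class with a method table and shortcuts. */
typedef struct {
	CacheEntry methods[8];
	size_t count;
	CacheMethod cachedInit; /* filled in by cache_finalizeClass */
	CacheMethod cachedRepr; /* filled in by cache_finalizeClass */
} CacheClass;

void cache_defineMethod(CacheClass * cls, const char * name, CacheMethod fn) {
	if (cls->count < 8) {
		cls->methods[cls->count].name = name;
		cls->methods[cls->count].fn = fn;
		cls->count++;
	}
}

/* Slow path: search the method table by name. */
CacheMethod cache_lookup(const CacheClass * cls, const char * name) {
	for (size_t i = 0; i < cls->count; ++i) {
		if (strcmp(cls->methods[i].name, name) == 0) return cls->methods[i].fn;
	}
	return NULL;
}

/* Do the by-name search once and store the results in the struct. */
void cache_finalizeClass(CacheClass * cls) {
	cls->cachedInit = cache_lookup(cls, "__init__");
	cls->cachedRepr = cache_lookup(cls, "__repr__");
}

/* Sample methods used to exercise the sketch. */
int sample_init(void) { return 1; }
int sample_repr(void) { return 2; }
```

In this model, a caller that reads only cachedInit sees NULL until cache_finalizeClass has run, even if an "__init__" entry is already present in the method table.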

Creating Types with Internal State

There are two ways to attach internal state to new types:

  • If state lookup does not need to be fast and consists entirely of values that can be represented with Kuroko's type system, use the instance's fields table.
  • If state lookup needs to be immediate and involves non-Kuroko types, extend KrkInstance.

The first approach is easy to implement: Just attach named values to an instance's fields table where appropriate, such as in the type's __init__ method.

The second approach requires some additional work: The class must specify its allocation size, define a structure with a KrkInstance as its first member (followed by the additional members for the type), ensure that values are properly initialized on creation, and also provide callbacks for any work that needs to be done when the object is scanned or swept by the garbage collector.
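The "header as first member" pattern can be tried out in isolation. In the sketch below, ExtClass, ExtInstance, ExtRange, ext_newInstance, and ext_rangeLength are hypothetical stand-ins written for this illustration; only the general shape (an allocation size stored on the class, and a struct whose first member is the common instance header) is taken from the description above.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical stand-ins for a class and the common instance header. */
typedef struct ExtClass {
	const char * name;
	size_t allocSize; /* bytes to allocate for each instance */
} ExtClass;

typedef struct ExtInstance {
	ExtClass * cls;
} ExtInstance;

/* An extended type: the common header must be the first member so that
 * an ExtRange* and an ExtInstance* refer to the same address. */
typedef struct ExtRange {
	ExtInstance inst;
	long min;
	long max;
} ExtRange;

/* Allocate allocSize zeroed bytes, so the extra members start at zero. */
ExtInstance * ext_newInstance(ExtClass * cls) {
	ExtInstance * self = calloc(1, cls->allocSize);
	if (self) self->cls = cls;
	return self;
}

/* Code that knows the concrete type casts the header pointer back. */
long ext_rangeLength(ExtInstance * self) {
	ExtRange * r = (ExtRange *)self;
	return r->max - r->min;
}
```

If allocSize were left at sizeof(ExtInstance), the cast in ext_rangeLength would read past the end of the allocation, which is why the class has to record the size of the extended struct.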

The range class is an example of a simple type that extends KrkInstance:

struct Range {
	KrkInstance inst;
	krk_integer_type min;
	krk_integer_type max;
};

As the additional members min and max do not need any cleanup work, the range class only needs to indicate its allocation size when it is defined:

ADD_BASE_CLASS(vm.baseClasses->rangeClass, "range", vm.baseClasses->objectClass);
vm.baseClasses->rangeClass->allocSize = sizeof(struct Range);
...

The list class, however, stores Kuroko objects in a flexible array:

typedef struct {
	KrkInstance inst;
	KrkValueArray values;
} KrkList;

The list class must therefore bind callbacks to ensure that its contents are not garbage collected and that, when the list itself is garbage collected, the additional memory of its flexible array is correctly freed:

vm.baseClasses->listClass->allocSize = sizeof(KrkList);
vm.baseClasses->listClass->_ongcscan = _list_gcscan;
vm.baseClasses->listClass->_ongcsweep = _list_gcsweep;
...
static void _list_gcscan(KrkInstance * self) {
	for (size_t i = 0; i < ((KrkList*)self)->values.count; ++i) {
		krk_markValue(((KrkList*)self)->values.values[i]);
	}
}

static void _list_gcsweep(KrkInstance * self) {
	krk_freeValueArray(&((KrkList*)self)->values);
}
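To see how the two callbacks fit into a mark-and-sweep cycle, here is a small standalone model. It is only a sketch: GcObj, GcClass, GcList, gc_mark, and gc_sweep are hypothetical names for this example, and Kuroko's actual collector is more involved than this.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct GcObj GcObj;

/* Hypothetical stand-in for a class with collector callbacks. */
typedef struct {
	void (*ongcscan)(GcObj * self);  /* mark objects this one references */
	void (*ongcsweep)(GcObj * self); /* release extra memory it owns */
} GcClass;

struct GcObj {
	GcClass * cls;
	bool marked;
};

/* Mark an object as reachable, then let its class mark what it holds. */
void gc_mark(GcObj * obj) {
	if (!obj || obj->marked) return;
	obj->marked = true;
	if (obj->cls->ongcscan) obj->cls->ongcscan(obj);
}

/* Free the object if nothing marked it; returns true if it was freed. */
bool gc_sweep(GcObj * obj) {
	if (obj->marked) { obj->marked = false; return false; }
	if (obj->cls->ongcsweep) obj->cls->ongcsweep(obj);
	free(obj);
	return true;
}

/* A list-like type that owns a separately allocated array of children. */
typedef struct {
	GcObj obj;
	GcObj ** items;
	size_t count;
} GcList;

void list_gcscan(GcObj * self) {
	GcList * list = (GcList *)self;
	for (size_t i = 0; i < list->count; ++i) gc_mark(list->items[i]);
}

void list_gcsweep(GcObj * self) {
	free(((GcList *)self)->items);
}

GcClass gc_plainClass = { NULL, NULL };
GcClass gc_listClass = { list_gcscan, list_gcsweep };
```

Without list_gcscan, marking the list would leave its children unmarked and the sweep would free them while the list still points at them; without list_gcsweep, the items array would leak when the list is freed.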