[Reland][Libomptarget] Statically link all plugin runtimes (llvm#87009)
This patch overhauls the `libomptarget` and plugin interface. Currently,
we define a C API and compile each plugin as a separate shared library.
Then, `libomptarget` loads these API functions and forwards its internal
calls to them. This was originally designed to allow multiple
implementations of a library to be live. However, since then no one has
used this functionality and it prevents us from using much nicer
interfaces. If the old behavior is desired it should instead be
implemented as a separate plugin.

This patch replaces the `PluginAdaptorTy` interface with the
`GenericPluginTy` that is used by the plugins. Each plugin exports a
`createPlugin_<name>` function that is used to get the specific
implementation. This code is now shared with `libomptarget`.

This brings some notable improvements:
1. Massively improved lifetimes of live runtime objects
2. The plugins can use a C++ interface
3. Global state does not need to be duplicated for each plugin +
   libomptarget
4. Easier to use, extend with new features, and improve error handling
5. Less function call overhead / Improved LTO performance.

Additional changes in this patch are related to contending with the
fact that state is now shared. Initialization and deinitialization are
now handled correctly and in phase with the underlying runtime, allowing
us to actually know when something is getting deallocated.
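The ordering concern can be illustrated with a small model. Handlers registered with `atexit` run in reverse order of registration, so a handler registered after the runtime has set itself up runs before the runtime's own teardown. The sketch below models that rule with a hypothetical `ExitHandlerList` class rather than the real `atexit`, and it assumes the runtime registers its teardown through the same kind of exit-time mechanism during initialization. The log strings are illustrative only.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of the process exit-handler list. Like atexit, handlers
// run in reverse order of registration (last registered, first run).
class ExitHandlerList {
  std::vector<std::function<void()>> Handlers;

public:
  void registerHandler(std::function<void()> Fn) {
    Handlers.push_back(std::move(Fn));
  }

  void runAll() {
    while (!Handlers.empty()) {
      std::function<void()> Fn = std::move(Handlers.back());
      Handlers.pop_back();
      Fn();
    }
  }
};

// Simulates the registration constructor followed by process exit and
// returns the order in which the teardown steps ran.
std::vector<std::string> simulateStartupAndExit() {
  std::vector<std::string> Log;
  ExitHandlerList AtExit;

  // Step 1: registering the image initializes the runtime, which (by
  // assumption) registers its own teardown at that point.
  AtExit.registerHandler([&Log] { Log.push_back("runtime teardown"); });

  // Step 2: the image's unregistration handler is registered afterwards,
  // so it runs first at exit, while the runtime is still alive.
  AtExit.registerHandler([&Log] { Log.push_back("unregister image"); });

  AtExit.runAll();
  return Log;
}
```

Swapping the two registration steps in this model would run the runtime teardown first, leaving the unregistration handler to operate on an already destroyed runtime.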

Depends on llvm#86971
llvm#86875
llvm#86868
jhuber6 committed May 9, 2024
1 parent 846ffc7 commit fa9e90f
Showing 24 changed files with 125 additions and 523 deletions.
2 changes: 1 addition & 1 deletion clang/test/Driver/linker-wrapper-image.c
@@ -30,8 +30,8 @@

// OPENMP: define internal void @.omp_offloading.descriptor_reg() section ".text.startup" {
// OPENMP-NEXT: entry:
// OPENMP-NEXT: %0 = call i32 @atexit(ptr @.omp_offloading.descriptor_unreg)
// OPENMP-NEXT: call void @__tgt_register_lib(ptr @.omp_offloading.descriptor)
// OPENMP-NEXT: %0 = call i32 @atexit(ptr @.omp_offloading.descriptor_unreg)
// OPENMP-NEXT: ret void
// OPENMP-NEXT: }

7 changes: 4 additions & 3 deletions llvm/lib/Frontend/Offloading/OffloadWrapper.cpp
@@ -232,12 +232,13 @@ void createRegisterFunction(Module &M, GlobalVariable *BinDesc,
// Construct function body
IRBuilder<> Builder(BasicBlock::Create(C, "entry", Func));

Builder.CreateCall(RegFuncC, BinDesc);

// Register the destructors with 'atexit'. This is expected by the CUDA
// runtime and ensures that we clean up before dynamic objects are destroyed.
// This needs to be done before the runtime is called and registers its own.
// This needs to be done after plugin initialization to ensure that it is
// called before the plugin runtime is destroyed.
Builder.CreateCall(AtExit, UnregFunc);

Builder.CreateCall(RegFuncC, BinDesc);
Builder.CreateRetVoid();

// Add this function to constructors.
61 changes: 16 additions & 45 deletions offload/include/PluginManager.h
@@ -13,10 +13,11 @@
#ifndef OMPTARGET_PLUGIN_MANAGER_H
#define OMPTARGET_PLUGIN_MANAGER_H

#include "PluginInterface.h"

#include "DeviceImage.h"
#include "ExclusiveAccess.h"
#include "Shared/APITypes.h"
#include "Shared/PluginAPI.h"
#include "Shared/Requirements.h"

#include "device.h"
@@ -34,38 +35,7 @@
#include <mutex>
#include <string>

struct PluginManager;

/// Plugin adaptors should be created via `PluginAdaptorTy::create` which will
/// invoke the constructor and call `PluginAdaptorTy::init`. Eventual errors are
/// reported back to the caller, otherwise a valid and initialized adaptor is
/// returned.
struct PluginAdaptorTy {
/// Try to create a plugin adaptor from a filename.
static llvm::Expected<std::unique_ptr<PluginAdaptorTy>>
create(const std::string &Name);

/// Name of the shared object file representing the plugin.
std::string Name;

/// Access to the shared object file representing the plugin.
std::unique_ptr<llvm::sys::DynamicLibrary> LibraryHandler;

#define PLUGIN_API_HANDLE(NAME) \
using NAME##_ty = decltype(__tgt_rtl_##NAME); \
NAME##_ty *NAME = nullptr;

#include "Shared/PluginAPI.inc"
#undef PLUGIN_API_HANDLE

/// Create a plugin adaptor for filename \p Name with a dynamic library \p DL.
PluginAdaptorTy(const std::string &Name,
std::unique_ptr<llvm::sys::DynamicLibrary> DL);

/// Initialize the plugin adaptor, this can fail in which case the adaptor is
/// useless.
llvm::Error init();
};
using GenericPluginTy = llvm::omp::target::plugin::GenericPluginTy;

/// Struct for the data required to handle plugins
struct PluginManager {
@@ -80,6 +50,8 @@ struct PluginManager {

void init();

void deinit();

// Register a shared library with all (compatible) RTLs.
void registerLib(__tgt_bin_desc *Desc);

@@ -92,10 +64,9 @@ struct PluginManager {
std::make_unique<DeviceImageTy>(TgtBinDesc, TgtDeviceImage));
}

/// Initialize as many devices as possible for this plugin adaptor. Devices
/// that fail to initialize are ignored. Returns the offset the devices were
/// registered at.
void initDevices(PluginAdaptorTy &RTL);
/// Initialize as many devices as possible for this plugin. Devices that fail
/// to initialize are ignored.
void initDevices(GenericPluginTy &RTL);

/// Return the device presented to the user as device \p DeviceNo if it is
/// initialized and ready. Otherwise return an error explaining the problem.
@@ -151,8 +122,8 @@ struct PluginManager {
// Initialize all plugins.
void initAllPlugins();

/// Iterator range for all plugin adaptors (in use or not, but always valid).
auto pluginAdaptors() { return llvm::make_pointee_range(PluginAdaptors); }
/// Iterator range for all plugins (in use or not, but always valid).
auto plugins() { return llvm::make_pointee_range(Plugins); }

/// Return the user provided requirements.
int64_t getRequirements() const { return Requirements.getRequirements(); }
@@ -164,14 +135,14 @@ struct PluginManager {
bool RTLsLoaded = false;
llvm::SmallVector<__tgt_bin_desc *> DelayedBinDesc;

// List of all plugin adaptors, in use or not.
llvm::SmallVector<std::unique_ptr<PluginAdaptorTy>> PluginAdaptors;
// List of all plugins, in use or not.
llvm::SmallVector<std::unique_ptr<GenericPluginTy>> Plugins;

// Mapping of plugin adaptors to offsets in the device table.
llvm::DenseMap<const PluginAdaptorTy *, int32_t> DeviceOffsets;
// Mapping of plugins to offsets in the device table.
llvm::DenseMap<const GenericPluginTy *, int32_t> DeviceOffsets;

// Mapping of plugin adaptors to the number of used devices.
llvm::DenseMap<const PluginAdaptorTy *, int32_t> DeviceUsed;
// Mapping of plugins to the number of used devices.
llvm::DenseMap<const GenericPluginTy *, int32_t> DeviceUsed;

// Set of all device images currently in use.
llvm::DenseSet<const __tgt_device_image *> UsedImages;
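The role of the `DeviceOffsets` and `DeviceUsed` maps can be sketched as follows. This is a guess at the bookkeeping, not the actual `PluginManager` logic: it assumes each plugin's devices occupy one contiguous range of the global device table starting at that plugin's offset. `PluginTag`, `DeviceTable`, `addPlugin`, and `resolve` are hypothetical names, and `std::map` stands in for `llvm::DenseMap` so the example has no LLVM dependency (it needs C++17 for `std::optional`).

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <utility>

// Hypothetical stand-in for a plugin object; only its address is used as key.
struct PluginTag {
  const char *Name;
};

class DeviceTable {
  // Plugin -> first global device index owned by that plugin.
  std::map<const PluginTag *, int32_t> DeviceOffsets;
  // Plugin -> number of devices that plugin contributed.
  std::map<const PluginTag *, int32_t> DeviceUsed;
  int32_t NextOffset = 0;

public:
  // Append a plugin's devices to the table and return the assigned offset.
  int32_t addPlugin(const PluginTag &P, int32_t NumDevices) {
    int32_t Offset = NextOffset;
    DeviceOffsets[&P] = Offset;
    DeviceUsed[&P] = NumDevices;
    NextOffset += NumDevices;
    return Offset;
  }

  // Translate a global device number into (plugin, plugin-local device ID).
  // Returns std::nullopt if the number is outside every plugin's range.
  std::optional<std::pair<const PluginTag *, int32_t>>
  resolve(int32_t DeviceNo) const {
    for (const auto &[Plugin, Offset] : DeviceOffsets) {
      int32_t Used = DeviceUsed.at(Plugin);
      if (DeviceNo >= Offset && DeviceNo < Offset + Used)
        return std::make_pair(Plugin, DeviceNo - Offset);
    }
    return std::nullopt;
  }
};
```

Under this assumption, the pair returned by `resolve` corresponds to the `RTL` and `RTLDeviceID` fields of `DeviceTy`, while the global number corresponds to `DeviceID`.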
8 changes: 5 additions & 3 deletions offload/include/device.h
@@ -33,17 +33,19 @@
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/SmallVector.h"

#include "PluginInterface.h"
using GenericPluginTy = llvm::omp::target::plugin::GenericPluginTy;

// Forward declarations.
struct PluginAdaptorTy;
struct __tgt_bin_desc;
struct __tgt_target_table;

struct DeviceTy {
int32_t DeviceID;
PluginAdaptorTy *RTL;
GenericPluginTy *RTL;
int32_t RTLDeviceID;

DeviceTy(PluginAdaptorTy *RTL, int32_t DeviceID, int32_t RTLDeviceID);
DeviceTy(GenericPluginTy *RTL, int32_t DeviceID, int32_t RTLDeviceID);
// DeviceTy is not copyable
DeviceTy(const DeviceTy &D) = delete;
DeviceTy &operator=(const DeviceTy &D) = delete;
19 changes: 3 additions & 16 deletions offload/plugins-nextgen/CMakeLists.txt
@@ -14,7 +14,7 @@
set(common_dir ${CMAKE_CURRENT_SOURCE_DIR}/common)
add_subdirectory(common)
function(add_target_library target_name lib_name)
add_llvm_library(${target_name} SHARED
add_llvm_library(${target_name} STATIC
LINK_COMPONENTS
${LLVM_TARGETS_TO_BUILD}
AggressiveInstCombine
@@ -46,27 +46,14 @@ function(add_target_library target_name lib_name)
)

llvm_update_compile_flags(${target_name})
target_include_directories(${target_name} PUBLIC ${common_dir}/include)
target_link_libraries(${target_name} PRIVATE
PluginCommon ${OPENMP_PTHREAD_LIB})

target_compile_definitions(${target_name} PRIVATE TARGET_NAME=${lib_name})
target_compile_definitions(${target_name} PRIVATE
DEBUG_PREFIX="TARGET ${lib_name} RTL")

if(CMAKE_SYSTEM_NAME MATCHES "FreeBSD")
# On FreeBSD, the 'environ' symbol is undefined at link time, but resolved by
# the dynamic linker at runtime. Therefore, allow the symbol to be undefined
# when creating a shared library.
target_link_libraries(${target_name} PRIVATE "-Wl,--allow-shlib-undefined")
else()
target_link_libraries(${target_name} PRIVATE "-Wl,-z,defs")
endif()

if(LIBOMP_HAVE_VERSION_SCRIPT_FLAG)
target_link_libraries(${target_name} PRIVATE
"-Wl,--version-script=${common_dir}/../exports")
endif()
set_target_properties(${target_name} PROPERTIES CXX_VISIBILITY_PRESET protected)
set_target_properties(${target_name} PROPERTIES POSITION_INDEPENDENT_CODE ON)
endfunction()

foreach(plugin IN LISTS LIBOMPTARGET_PLUGINS_TO_BUILD)
5 changes: 0 additions & 5 deletions offload/plugins-nextgen/amdgpu/CMakeLists.txt
@@ -57,8 +57,3 @@ else()
libomptarget_say("Not generating AMDGPU tests, no supported devices detected."
" Use 'LIBOMPTARGET_FORCE_AMDGPU_TESTS' to override.")
endif()

# Install plugin under the lib destination folder.
install(TARGETS omptarget.rtl.amdgpu LIBRARY DESTINATION "${OFFLOAD_INSTALL_LIBDIR}")
set_target_properties(omptarget.rtl.amdgpu PROPERTIES
INSTALL_RPATH "$ORIGIN" BUILD_RPATH "$ORIGIN:${CMAKE_CURRENT_BINARY_DIR}/..")
14 changes: 8 additions & 6 deletions offload/plugins-nextgen/amdgpu/src/rtl.cpp
@@ -3064,10 +3064,6 @@ struct AMDGPUPluginTy final : public GenericPluginTy {
// HSA functions from now on, e.g., hsa_shut_down.
Initialized = true;

#ifdef OMPT_SUPPORT
ompt::connectLibrary();
#endif

// Register event handler to detect memory errors on the devices.
Status = hsa_amd_register_system_event_handler(eventHandler, nullptr);
if (auto Err = Plugin::check(
@@ -3155,6 +3151,8 @@ struct AMDGPUPluginTy final : public GenericPluginTy {

Triple::ArchType getTripleArch() const override { return Triple::amdgcn; }

const char *getName() const override { return GETNAME(TARGET_NAME); }

/// Get the ELF code for recognizing the compatible image binary.
uint16_t getMagicElfBits() const override { return ELF::EM_AMDGPU; }

@@ -3387,8 +3385,6 @@ Error AMDGPUKernelTy::printLaunchInfoDetails(GenericDeviceTy &GenericDevice,
return Plugin::success();
}

GenericPluginTy *PluginTy::createPlugin() { return new AMDGPUPluginTy(); }

template <typename... ArgsTy>
static Error Plugin::check(int32_t Code, const char *ErrFmt, ArgsTy... Args) {
hsa_status_t ResultCode = static_cast<hsa_status_t>(Code);
@@ -3476,3 +3472,9 @@ void *AMDGPUDeviceTy::allocate(size_t Size, void *, TargetAllocTy Kind) {
} // namespace target
} // namespace omp
} // namespace llvm

extern "C" {
llvm::omp::target::plugin::GenericPluginTy *createPlugin_amdgpu() {
return new llvm::omp::target::plugin::AMDGPUPluginTy();
}
}
4 changes: 1 addition & 3 deletions offload/plugins-nextgen/common/CMakeLists.txt
@@ -66,6 +66,4 @@ target_include_directories(PluginCommon PUBLIC
${LIBOMPTARGET_INCLUDE_DIR}
)

set_target_properties(PluginCommon PROPERTIES
POSITION_INDEPENDENT_CODE ON
CXX_VISIBILITY_PRESET protected)
set_target_properties(PluginCommon PROPERTIES POSITION_INDEPENDENT_CODE ON)
94 changes: 4 additions & 90 deletions offload/plugins-nextgen/common/include/PluginInterface.h
@@ -1010,6 +1010,9 @@ struct GenericPluginTy {
/// Get the target triple of this plugin.
virtual Triple::ArchType getTripleArch() const = 0;

/// Get the constant name identifier for this plugin.
virtual const char *getName() const = 0;

/// Allocate a structure using the internal allocator.
template <typename Ty> Ty *allocate() {
return reinterpret_cast<Ty *>(Allocator.Allocate(sizeof(Ty), alignof(Ty)));
@@ -1226,7 +1229,7 @@ namespace Plugin {
/// Create a success error. This is the same as calling Error::success(), but
/// it is recommended to use this one for consistency with Plugin::error() and
/// Plugin::check().
static Error success() { return Error::success(); }
static inline Error success() { return Error::success(); }

/// Create a string error.
template <typename... ArgsTy>
@@ -1246,95 +1249,6 @@ template <typename... ArgsTy>
static Error check(int32_t ErrorCode, const char *ErrFmt, ArgsTy... Args);
} // namespace Plugin

/// Class for simplifying the getter operation of the plugin. Anywhere on the
/// code, the current plugin can be retrieved by Plugin::get(). The class also
/// declares functions to create plugin-specific object instances. The check(),
/// createPlugin(), createDevice() and createGlobalHandler() functions should be
/// defined by each plugin implementation.
class PluginTy {
// Reference to the plugin instance.
static GenericPluginTy *SpecificPlugin;

PluginTy() {
if (auto Err = init())
REPORT("Failed to initialize plugin: %s\n",
toString(std::move(Err)).data());
}

~PluginTy() {
if (auto Err = deinit())
REPORT("Failed to deinitialize plugin: %s\n",
toString(std::move(Err)).data());
}

PluginTy(const PluginTy &) = delete;
void operator=(const PluginTy &) = delete;

/// Create and intialize the plugin instance.
static Error init() {
assert(!SpecificPlugin && "Plugin already created");

// Create the specific plugin.
SpecificPlugin = createPlugin();
assert(SpecificPlugin && "Plugin was not created");

// Initialize the plugin.
return SpecificPlugin->init();
}

// Deinitialize and destroy the plugin instance.
static Error deinit() {
assert(SpecificPlugin && "Plugin no longer valid");

for (int32_t DevNo = 0, NumDev = SpecificPlugin->getNumDevices();
DevNo < NumDev; ++DevNo)
if (auto Err = SpecificPlugin->deinitDevice(DevNo))
return Err;

// Deinitialize the plugin.
if (auto Err = SpecificPlugin->deinit())
return Err;

// Delete the plugin instance.
delete SpecificPlugin;

// Invalidate the plugin reference.
SpecificPlugin = nullptr;

return Plugin::success();
}

public:
/// Initialize the plugin if needed. The plugin could have been initialized by
/// a previous call to Plugin::get().
static Error initIfNeeded() {
// Trigger the initialization if needed.
get();

return Error::success();
}

/// Get a reference (or create if it was not created) to the plugin instance.
static GenericPluginTy &get() {
// This static variable will initialize the underlying plugin instance in
// case there was no previous explicit initialization. The initialization is
// thread safe.
static PluginTy Plugin;

assert(SpecificPlugin && "Plugin is not active");
return *SpecificPlugin;
}

/// Get a reference to the plugin with a specific plugin-specific type.
template <typename Ty> static Ty &get() { return static_cast<Ty &>(get()); }

/// Indicate whether the plugin is active.
static bool isActive() { return SpecificPlugin != nullptr; }

/// Create a plugin instance.
static GenericPluginTy *createPlugin();
};

/// Auxiliary interface class for GenericDeviceResourceManagerTy. This class
/// acts as a reference to a device resource, such as a stream, and requires
/// some basic functions to be implemented. The derived class should define an
2 changes: 0 additions & 2 deletions offload/plugins-nextgen/common/include/Utils/ELF.h
@@ -13,8 +13,6 @@
#ifndef LLVM_OPENMP_LIBOMPTARGET_PLUGINS_ELF_UTILS_H
#define LLVM_OPENMP_LIBOMPTARGET_PLUGINS_ELF_UTILS_H

#include "Shared/PluginAPI.h"

#include "llvm/Object/ELF.h"
#include "llvm/Object/ELFObjectFile.h"
