Popcorn Linux Compiler Toolchain, Copyright Systems Software Research Group at
Virginia Tech, 2017.
For more information, please visit http://popcornlinux.org or e-mail Rob Lyerly
(rlyerly@vt.edu).
--------
Overview
--------
The goal of the heterogeneous compiler toolchain is to prepare multi-ISA
binaries for migration through a series of analyses and transformations. We
use and extend clang/LLVM to prepare heterogeneous binaries, and a Python-based
tool to generate custom linker scripts that align program objects in the
generated binaries. Finally, a number of additional libraries are needed for
migration and runtime state transformation.

The binary must be prepared so that, before and after migrating between
architectures, the application can find the code and data it needs to
continue execution seamlessly. This is done both by using a common layout
(where that is possible without significant performance overhead) and by
transforming state (where state is dictated by the ISA or where a common
format would be too costly in performance).

The toolchain operates as follows:
1. Parsing/LLVM bitcode generation (clang) - clang frontend, mostly unmodified
2. Middle-end analysis, refactoring & optimization - in addition to standard
   optimizations, run several passes which adjust the linkage of some
   variables, insert migration points into the application and notify
   architecture-specific backends to generate stack frame metadata
3. Backend - modified LLVM backend(s) generate custom data and function
   location information and compile bitcode to object-code
4. Linker - modified gold linker generates detailed data and function linking
   information, aligns thread-local storage for all architectures
5. Alignment - python tool uses information provided by gold linker to generate
   linker scripts that align data & function symbols across binaries
6. Metadata Generation - parse information generated by LLVM backends and add
   stack transformation metadata to binaries (must be run post-alignment as it
   requires final symbol layouts)
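
To illustrate the idea behind step 5, the sketch below shows one way symbols
could be given the same address in every binary: each symbol is padded to the
largest size and the strictest alignment it has on any architecture. This is a
simplified illustration, not the algorithm used by the alignment tool. The
symbol names, sizes, alignments and base address are all made up.

```python
def align_up(addr, alignment):
    """Round addr up to the next multiple of alignment."""
    return (addr + alignment - 1) // alignment * alignment


def common_layout(symbols, base):
    """Assign one address per symbol that is valid on every architecture.

    symbols maps a symbol name to {arch: (size, alignment)} in bytes.
    Returns {name: address}. Hypothetical helper, for illustration only.
    """
    layout = {}
    addr = base
    for name in sorted(symbols):
        per_arch = list(symbols[name].values())
        size = max(s for s, _ in per_arch)        # largest version wins
        alignment = max(a for _, a in per_arch)   # strictest alignment wins
        addr = align_up(addr, alignment)
        layout[name] = addr
        addr += size
    return layout


# Made-up example data: the same two functions compiled for two ISAs.
symbols = {
    "compute": {"aarch64": (90, 4), "x86-64": (70, 16)},
    "helper": {"aarch64": (40, 4), "x86-64": (33, 16)},
}
layout = common_layout(symbols, base=0x400000)
for name, addr in layout.items():
    print(f"{name} -> {addr:#x}")
```

A layout like this could then be written out as one linker script per
architecture, with padding inserted after the smaller version of each symbol.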
In order to have functionally-identical implementations for all compiled source
code, the source must be compiled into a single intermediate representation
(IR), which is then used by each of the backends to generate
architecture-specific code:
                    ----------------
                    | Orig. Source |
                    ----------------
                           |
                           | (clang)
                           V
                    ----------------
                    |   LLVM IR    |
                    ----------------
                           |
                           | (opt)
                           V
                    ----------------
                    | Optimized IR |
                    ----------------
                           |
           ---------------------------------  (arch-specific backend)
           |               |               |
           V               V               V
    ---------------                 ---------------
    | aarch64 bin |       ...       |   x86 bin   |
    ---------------                 ---------------
clang is set up to perform this process automatically: it generates a single
set of instrumented IR and lowers that IR to each target's machine code.
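
The flow in the diagram can be sketched with the stock LLVM command-line tools
(clang, opt, llc). The script below only builds and prints the commands; it
does not run them. It is a conceptual sketch, not how the Popcorn clang driver
works internally: the Popcorn-specific passes and options are not shown, the
file names and target triples are assumptions, and the exact spelling of the
opt optimization flag depends on the LLVM version.

```python
import os


def build_commands(source, triples):
    """Return the command lines for: source -> IR -> optimized IR -> objects.

    Hypothetical helper using stock LLVM tools for illustration only.
    """
    stem = os.path.splitext(source)[0]
    ir = stem + ".bc"
    opt_ir = stem + ".opt.bc"
    commands = [
        # Frontend: emit LLVM bitcode instead of machine code.
        ["clang", "-c", "-emit-llvm", source, "-o", ir],
        # Middle-end: optimize the single shared IR once.
        ["opt", "-O2", ir, "-o", opt_ir],
    ]
    for triple in triples:
        arch = triple.split("-", 1)[0]
        # Backend: lower the same optimized IR for each target.
        commands.append(
            ["llc", "-mtriple=" + triple, "-filetype=obj",
             opt_ir, "-o", f"{stem}_{arch}.o"]
        )
    return commands


commands = build_commands("app.c", ["aarch64-linux-gnu", "x86_64-linux-gnu"])
for cmd in commands:
    print(" ".join(cmd))
```

The point of the sketch is that the frontend and middle-end run once, and only
the final lowering step is repeated per architecture.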
The toolchain's installation folder is organized as shown below, with a brief
description of the important sub-folders:
root
 \
  common - source/headers common between components in different folders
  |
  lib - libraries needed by the compiler and/or the compiled application
  |\
  | libelf - library for parsing & reading ELF objects
  | |
  | libopenpop - Popcorn Distributed OpenMP library
  | |
  | migration - functionality for migrating between architectures
  | |
  | musl-1.1.18 - standard C library
  | |
  | stack_transformation - runtime for transforming stacks between
  |                        ISA-specific layouts
  |
  patches - patches for compiler components
  |\
  | binutils-gold - patch for GNU gold linker
  | |
  | llvm - patches for clang/LLVM
  |
  tool - post-compilation binary tools for alignment & metadata-generation
  |\
  | alignment - python tool for generating linker scripts to align binaries
  | |
  | stack_metadata - tools for post-processing binaries in preparation for
  |                  stack transformation
  |
  util - various utilities, including a Makefile template, for heterogeneous
   \     applications
    scripts - scripts for patch generation, testing and running binaries
See the README file in each subdirectory for more information, and the INSTALL
file for installation instructions.
-------------
Prerequisites
-------------
Hardware requirements:
It is highly recommended that the toolchain be built & installed on a machine
with at least 4 cores and 8GB of RAM. This is because LLVM is a very large
codebase and is built with debugging information by default, making the compile
& linking phases taxing in terms of memory consumption.
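
As a convenience, the recommendation above can be checked before starting a
build. The following is a minimal sketch, assuming a Linux host (it relies on
os.sysconf) and treating "8GB" as 8 * 10^9 bytes, since machines with 8GB
installed usually report slightly less than 8 GiB of usable memory. The
function names are hypothetical and not part of the toolchain.

```python
import os

MIN_CORES = 4
MIN_RAM_BYTES = 8 * 10**9  # assumption: "8GB" interpreted as decimal gigabytes


def meets_recommendation(cores, ram_bytes):
    """True if the machine satisfies the recommended build resources."""
    return cores >= MIN_CORES and ram_bytes >= MIN_RAM_BYTES


def detect_resources():
    """Return (core count, physical RAM in bytes) on a Linux host."""
    cores = os.cpu_count() or 1
    ram = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return cores, ram


if __name__ == "__main__":
    cores, ram = detect_resources()
    verdict = "OK" if meets_recommendation(cores, ram) else "below recommendation"
    print(f"{cores} cores, {ram / 10**9:.1f} GB RAM: {verdict}")
```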
Software requirements:
Note: the toolchain has been tested on x86-64 with Ubuntu 16.04 and up. It
should work on other architectures and distributions (in particular, Debian 8),
but may require installing alternate packages and some extra hacking. It is
*highly* recommended that applications be built on x86-64.
+ gcc and g++ 4.8 or higher for x86-64, gcc 4.8 or higher for aarch64
  - On Ubuntu 16.04 & higher, x86-64, install the following packages:
      build-essential
      gcc-aarch64-linux-gnu
+ flex & bison (x86-64 only), needed by binutils
  - On Ubuntu 16.04 & higher, x86-64, install the following packages:
      flex
      bison
+ Python, v2.7 and v3
  - v2 is needed for the installation script and the alignment tool
  - v3 is needed for scripts in util/scripts
  - Both are pre-installed on Ubuntu 16.04 and up
+ CMake
  - On Ubuntu 16.04 & higher, x86-64, install the following package:
      cmake
+ SVN
  - On Ubuntu 16.04 & higher, x86-64, install the following package:
      subversion
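
The list above can be turned into a quick pre-flight check. The sketch below
looks for each tool on the PATH using Python's shutil.which. The executable
names are assumptions based on what the listed Ubuntu packages normally
install (for example, gcc-aarch64-linux-gnu providing aarch64-linux-gnu-gcc),
and the check does not verify version numbers.

```python
import shutil

# Assumed executable names for the prerequisites listed above.
REQUIRED_TOOLS = [
    "gcc", "g++", "aarch64-linux-gnu-gcc",
    "flex", "bison",
    "python2.7", "python3",
    "cmake", "svn",
]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from the list that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisite tools were found.")
```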
-----------
Limitations
-----------
The current version of the toolchain only works for aarch64 + x86-64.
Applications must be written in C (although any language for which LLVM IR
can be generated should, in principle, be supportable), and inline assembly is
not supported. All optimizations are supported except auto-vectorization and
frame-pointer elimination (disabling the latter is required by the stackmap
intrinsic).
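
Because inline assembly is not supported, it can help to scan an application's
C sources for it before building. The sketch below is a rough heuristic, not
part of the toolchain: it flags any line containing the asm, __asm or __asm__
keywords, so it will also report matches inside comments and string literals.
The function name is hypothetical.

```python
import re

# Keywords that introduce inline assembly in GNU-style C.
INLINE_ASM = re.compile(r"\b(?:asm|__asm|__asm__)\b")


def find_inline_asm(source_text):
    """Return the 1-based line numbers that appear to contain inline assembly."""
    return [
        number
        for number, line in enumerate(source_text.splitlines(), start=1)
        if INLINE_ASM.search(line)
    ]


example = 'int f(void) {\n  __asm__ volatile("nop");\n  return 0;\n}\n'
print(find_inline_asm(example))
```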