Remote cross-JIT a Mongoose HTTP Server
In a recent post I explained how to JIT compile and run a minimal C program on a remote target connected via TCP. We then inspected and modified the program state from the host machine with LLDB as if it were a static executable running locally.
Today we will cross-compile a simple HTTP server based on Mongoose and run it in the same way. The approach enables rapid edit-compile-test cycles while keeping resource-intensive compilation and linking tasks on the local host. The remote host can be a low-resource device that only runs the self-contained executor on a minimal Linux system.
Build and run locally
Mongoose is an established embedded web server and networking library. Let’s check out the sources and run the http-server example locally for illustration. In the end we won’t need the local build anymore, so you could just as well skip the make step:
> cd /path/to/demo
> git clone https://github.com/cesanta/mongoose
> git -C mongoose checkout 912dd518bf986e04
> cd mongoose/examples/http-server
> apt-get install libmbedtls-dev
> make
cc ../../mongoose.c main.c -I../.. -I../.. -W -Wall -DMG_ENABLE_IPV6=1 -DMG_ENABLE_LINES=1 -DMG_ENABLE_DIRECTORY_LISTING=1 -DMG_ENABLE_SSI=1 -o example
./example
2021-03-30 12:18:05 I sock.c:484:mg_listen 1 accepting on http://localhost:8000
2021-03-30 12:18:05 I main.c:78:main Starting Mongoose v7.3, serving [.]
Navigating to http://localhost:8000 should bring up a directory index in the browser.
Build a matching version of Clang
In order to run the server in ORC, we have to compile the sources to LLVM bitcode first. Bitcode is the serialized binary form of LLVM’s internal program representation. It is not stable and not forward compatible. This means we should exchange bitcode only between tools that use one and the same LLVM version. Hence, in a first step we build a Clang compiler that matches the version of our JIT.
Note: Not every new LLVM version introduces breaking changes. If you have a recent Clang installed, chances are that it works well and you can fast-forward to the next section.
Let’s check out the LLVM mono-repo and start the build. If you followed my recent post, you can simply reconfigure your existing LLVM build. Once again we try to avoid surprises by taking a specific commit that is sufficient and functional for this demo:
> cd /path/to/demo
> git clone https://github.com/llvm/llvm-project
> git -C llvm-project checkout 7b9df09e2050b8b2
> mkdir llvm-build && cd llvm-build
> cmake -GNinja -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD=host -DLLVM_ENABLE_PROJECTS=clang -DLLVM_BUILD_EXAMPLES=On ../llvm-project/llvm
> ninja clang
The CMake command should print a "clang project is enabled" status line. As usual we build with ninja, which will take a while. In the meantime we can take care of the other preparations.
Set up the remote executor
The executor is the remote endpoint for our JIT and it gets connected via TCP. We keep it simple and use an Alpine Linux Docker container in this demo. Mongoose depends on libc and the mbedtls library. In principle we could provide them as JIT modules as well, but that’s a topic for a whole other post. For now the remote executor provides them:
FROM weliveindetail/llvm-jit-remote
RUN apk add --no-cache mbedtls && \
ln -s $(ls -v /usr/lib/libmbedtls.so* | tail -1) /usr/lib/libmbedtls.so
We use the 8.83 MB Alpine Linux image llvm-jit-remote as a base. It’s even smaller than the one we used in the previous post because it comes without the debug server. We install mbedtls on top rather than mbedtls-dev, because the executor doesn’t need any headers or static build artifacts. Eventually, the JIT driver on our local host will ask the remote executor process to load the dynamic library before running our code.
Installed dynamic library files have an SO number suffix that encodes the required package version to prevent conflicts. As I don’t want to hardcode the demo for a specific package version, I added the extra ln command that creates a plain libmbedtls.so symlink to the file with the highest available SO number. This is fine for now. A solid solution for getting pre-installed library versions right is once again a topic for another post.
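To see why the ln command relies on version sorting, here is a minimal sketch that replays the same ls -v | tail pipeline in a scratch directory. The three file names and their SO numbers are made up for illustration; they are not the versions that Alpine actually ships.

```shell
# Sketch: pick the highest SO-numbered file with a version sort and symlink it.
# The scratch directory and the SO numbers below are hypothetical.
libdir=$(mktemp -d)
touch "$libdir/libmbedtls.so.9" "$libdir/libmbedtls.so.10" "$libdir/libmbedtls.so.13"

# ls -v compares the numbers inside the names, so ".13" sorts after ".9".
# A plain lexical sort would put ".9" last and select the wrong file.
newest=$(ls -v "$libdir"/libmbedtls.so* | tail -n 1)
ln -s "$newest" "$libdir/libmbedtls.so"

echo "libmbedtls.so -> $(readlink "$libdir/libmbedtls.so")"
```

The -v option is an extension of ls rather than a POSIX feature, so on systems where it is missing a different way of ordering the candidates would be needed.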
musl libc is an inherent part of Alpine, so we don’t have to install it. The executor binary links it dynamically and automatically exposes its symbols to the JITed code. We can build and run the container in a separate terminal like this:
> docker build -t llvm-jit-remote:mongoose /path/to/demo/executor-docker
> docker run --rm -p 9000:9000 -p 8000:8000 -it llvm-jit-remote:mongoose
Listening at 0.0.0.0:9000
Cross-compile bitcode for Alpine Linux
We have to pre-compile the Mongoose C sources to LLVM bitcode in order to feed it into the JIT. The pre-compilation is cross-platform, because our remote executor runs on a different operating system. Thus, Clang will need the target platform headers for our dependencies: Alpine’s musl libc system headers and those from APK’s mbedtls-dev package. Let’s run a Docker container for that, where we install the dependencies and mount the necessary paths back to the host system. This is especially easy in our case since the container runs on the host architecture and we really only need the headers:
> cd /path/to/demo
> mkdir x86_64-alpine-linux-musl
> docker run -v x86_64-alpine-linux-musl/usr:/usr -t alpine apk add --no-cache musl-dev mbedtls-dev
> find x86_64-alpine-linux-musl/usr/include | wc -l
304
Note: At the time of writing this post, the above approach worked and was very useful, but apparently the behavior was (considered) a bug in Docker. It no longer works in version 20.10.8:
> docker --version
Docker version 20.10.8, build 3967b7d
> docker run -v x86_64-alpine-linux-musl/usr:/usr -t alpine apk add --no-cache musl-dev mbedtls-dev
docker: Error response from daemon: create x86_64-alpine-linux-musl/usr: "x86_64-alpine-linux-musl/usr" includes invalid characters for a local volume name, only "[a-zA-Z0-9][a-zA-Z0-9_.-]" are allowed. If you intended to pass a host directory, use absolute path.
See 'docker run --help'.
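The error message itself suggests passing an absolute host path. The following sketch only assembles and prints such a command line; it does not call Docker. Whether a bind mount over /usr then behaves like the original relative-path invocation is an assumption that has not been verified against this Docker version.

```shell
# Sketch: build the -v argument from an absolute host path.
# Assumption: the script is started from /path/to/demo.
sysroot="$(pwd)/x86_64-alpine-linux-musl"

# Only print the command here; run the printed line by hand to try it.
docker_cmd="docker run -v $sysroot/usr:/usr -t alpine apk add --no-cache musl-dev mbedtls-dev"
echo "$docker_cmd"
```

If this variant does not populate the host directory, copying /usr/include out of a container is another route to the same headers.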
There is one detail we have to adjust in the http-server example before we pre-compile it to bitcode. Because it’s meant as a quick demo, it’s hardcoded to be reachable only from the local network namespace. Once we run it in a Docker container and want it to be reachable from the host, it must listen for external connections as well. Thus, we have to change the network address from localhost to 0.0.0.0 in the source code:
--- a/path/to/demo/mongoose/examples/http-server/main.c
+++ b/path/to/demo/mongoose/examples/http-server/main.c
@@ -6,7 +6,7 @@
static const char *s_debug_level = "2";
static const char *s_root_dir = ".";
-static const char *s_listening_address = "http://localhost:8000";
+static const char *s_listening_address = "http://0.0.0.0:8000";
static const char *s_enable_hexdump = "no";
static const char *s_ssi_pattern = "#.shtml";
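If you prefer not to edit the file by hand, the same one-line change can be scripted. This is a minimal sketch that runs against a scratch file holding two lines from the diff; for the real change, point src at mongoose/examples/http-server/main.c instead.

```shell
# Sketch: apply the address change with sed.
# A scratch file stands in for mongoose/examples/http-server/main.c.
src=$(mktemp)
cat > "$src" <<'EOF'
static const char *s_root_dir = ".";
static const char *s_listening_address = "http://localhost:8000";
EOF

# Touch only the s_listening_address line. Writing to a second file and moving
# it back avoids the differing in-place options of GNU and BSD sed.
sed '/s_listening_address/ s|http://localhost:8000|http://0.0.0.0:8000|' "$src" > "$src.new"
mv "$src.new" "$src"

grep s_listening_address "$src"
```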
Now is the time when we actually need the Clang executable that we started to build in the beginning. Let’s invoke it once for each source file:
> cd /path/to/demo
> mkdir -p build-mongoose/x86_64-alpine-linux-musl
> llvm-build/bin/clang --target=x86_64-alpine-linux-musl --sysroot=x86_64-alpine-linux-musl -iquote mongoose -DMG_ENABLE_DIRECTORY_LISTING=1 -v -g -S -emit-llvm -o build-mongoose/x86_64-alpine-linux-musl/mongoose.ll mongoose/mongoose.c
> llvm-build/bin/clang --target=x86_64-alpine-linux-musl --sysroot=x86_64-alpine-linux-musl -iquote mongoose -DMG_ENABLE_DIRECTORY_LISTING=1 -v -g -S -emit-llvm -o build-mongoose/x86_64-alpine-linux-musl/http-server.ll mongoose/examples/http-server/main.c
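The two invocations differ only in the source file and the output name, so they can be folded into a loop. The sketch below is a dry run: the run wrapper and the compile_all function are hypothetical helpers that print the commands instead of executing them.

```shell
# Sketch: generate both clang invocations from one template.
triple=x86_64-alpine-linux-musl
outdir=build-mongoose/$triple

# Dry-run wrapper: prints the command. Replace the body with "$@" to execute.
run() { echo "$@"; }

compile_all() {
  for pair in mongoose:mongoose/mongoose.c \
              http-server:mongoose/examples/http-server/main.c; do
    name=${pair%%:*}   # output base name, the text before the colon
    src=${pair#*:}     # source path, the text after the colon
    run llvm-build/bin/clang --target="$triple" --sysroot="$triple" \
      -iquote mongoose -DMG_ENABLE_DIRECTORY_LISTING=1 -v -g -S -emit-llvm \
      -o "$outdir/$name.ll" "$src"
  done
}

compile_all
```

Keeping the flags in one place makes it less likely that the two compile units end up with different target or preprocessor settings.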
Command-line arguments in detail
--target=<triple>
- Compile for the platform specified in the triple: x86_64-alpine-linux-musl for our Alpine Linux container.
--sysroot=<path>
- Set the root directory for system header and library search paths. Convenient, even though we only pre-compile and don’t need any libraries. We pass the mount point from the previous section here and get x86_64-alpine-linux-musl/usr/include as the search path for system headers.
-iquote <path>
- Add a private include path for quoted #include directives. In our case it’s the Mongoose source directory.
-emit-llvm
- Emit bitcode. We also add -S to obtain the human-readable assembly form known as LLVM IR instead of the equivalent binary form. It’s slower to parse but easier to read.
-v
- Run in verbose mode, i.e. dump the actual include search paths; useful to track down include errors.
-g
- Generate debug information.
-DMG_ENABLE_DIRECTORY_LISTING=1
- Mongoose preprocessor switch to enable code that populates file listings for a directory.
This gives us the two bitcode files mongoose.ll and http-server.ll in build-mongoose/x86_64-alpine-linux-musl. Have a look at the assembly and see how it relates to the original source. It’s an SSA representation of the individual compile units that is specific to our target platform and LLVM version. Since we didn’t pass an optimization flag, Clang compiled at its default -O0 level; together with the debug info we requested, this makes the amount of bitcode quite large. No linking has happened yet.
Having mounted the target system root inside our working directory caused all file entries in debug info tags to refer to the same directory. This comes in handy when setting up a source-map for debugging:
> cat build-mongoose/x86_64-alpine-linux-musl/mongoose.ll | grep DIFile
!3 = !DIFile(filename: "mongoose/mongoose.c", directory: "/path/to/demo")
!6 = !DIFile(filename: "mongoose/mongoose.h", directory: "/path/to/demo")
!46 = !DIFile(filename: "x86_64-alpine-linux-musl/usr/include/bits/alltypes.h", directory: "/path/to/demo")
!1369 = !DIFile(filename: "x86_64-alpine-linux-musl/usr/include/sys/socket.h", directory: "/path/to/demo")
!1379 = !DIFile(filename: "x86_64-alpine-linux-musl/usr/include/netinet/in.h", directory: "/path/to/demo")
!1825 = !DIFile(filename: "x86_64-alpine-linux-musl/usr/include/time.h", directory: "/path/to/demo")
!3600 = !DIFile(filename: "x86_64-alpine-linux-musl/usr/include/bits/stat.h", directory: "/path/to/demo")
!7901 = !DIFile(filename: "x86_64-alpine-linux-musl/usr/include/sys/select.h", directory: "/path/to/demo")
!8830 = !DIFile(filename: "x86_64-alpine-linux-musl/usr/include/ctype.h", directory: "/path/to/demo")
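A quick way to confirm that really all entries share one directory is to reduce the DIFile lines to their distinct directory values. The sketch below runs on three sample lines copied from the output above; on a real build, point ll at build-mongoose/x86_64-alpine-linux-musl/mongoose.ll instead.

```shell
# Sketch: list the distinct "directory:" values of all DIFile entries.
# A scratch file with three sample lines stands in for the real mongoose.ll.
ll=$(mktemp)
cat > "$ll" <<'EOF'
!3 = !DIFile(filename: "mongoose/mongoose.c", directory: "/path/to/demo")
!6 = !DIFile(filename: "mongoose/mongoose.h", directory: "/path/to/demo")
!1825 = !DIFile(filename: "x86_64-alpine-linux-musl/usr/include/time.h", directory: "/path/to/demo")
EOF

# Keep only the quoted directory of each DIFile line, then de-duplicate.
dirs=$(sed -n 's/.*!DIFile(.*directory: "\([^"]*\)".*/\1/p' "$ll" | sort -u)
echo "$dirs"
```

For the sample this prints the single line /path/to/demo. More than one line in the output would mean that several path prefixes need to be remapped when debugging.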
Remote cross-JIT Mongoose to Alpine Linux
Now that we have started our remote executor and cross-compiled the bitcode, we can build the demo JIT in a new terminal and use it to run the Mongoose server in the remote container:
> cd /path/to/demo
> ninja -C llvm-build LLJITWithRemoteDebugging
> llvm-build/bin/LLJITWithRemoteDebugging --connect localhost:9000 --dlopen /usr/lib/libmbedtls.so build-mongoose/x86_64-alpine-linux-musl/mongoose.ll build-mongoose/x86_64-alpine-linux-musl/http-server.ll
Parsed input IR code from: build-mongoose/x86_64-alpine-linux-musl/mongoose.ll
Parsed input IR code from: build-mongoose/x86_64-alpine-linux-musl/http-server.ll
Connected to executor at localhost:9000
Established TargetProcessControl connection to the executor
Initialized LLJIT for remote executor
Running: main()
The terminal running the Docker container should now dump the server’s stdout:
Connection established. Running OrcRPCTPCServer...
2021-05-19 10:01:49 I mongoose.c:3074:mg_listen 1 accepting on http://0.0.0.0:8000
2021-05-19 10:01:49 I main.c:67:main Starting Mongoose v7.3, serving [.]
Let’s open a browser and view the HTML page served by our cross-JITed Mongoose server.
Great, this looks as expected! Interestingly, if we navigate to the root folder we get an unexpected result: Not found [/]. Clicking on one of the linked directories shows a similar issue. When running the server locally on my machine, I didn’t observe any such misbehavior. This looks like an excellent opportunity for another post to demonstrate remote debugging in a real-world program!