<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title></title>
		<description>Sporadic details on compilers, code and tooling from the world of LLVM</description>
		<link>https://weliveindetail.github.io/blog</link>
		<atom:link href="https://weliveindetail.github.io/blog/feed.xml" rel="self" type="application/rss+xml" />
		
			<item>
				<title>Using the TPDE Codegen Backend in LLVM ORC</title>
				<description>&lt;style&gt;
  #banner-image {
    margin-bottom: 50px;
  }
  #large-image {
    max-width: min(100%, 230px);
  }
  .center {
    display: block;
    margin: 0 auto;
  }
&lt;/style&gt;

&lt;p&gt;&lt;img src=&quot;https://weliveindetail.github.io/blog/res/2025-tpde-orc.png&quot; alt=&quot;tpde-banner&quot; id=&quot;large-image&quot; class=&quot;center&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://arxiv.org/abs/2505.22610&quot; target=&quot;_blank&quot;&gt;TPDE&lt;/a&gt; is a single-pass compiler backend for LLVM that was &lt;a href=&quot;https://discourse.llvm.org/t/tpde-llvm-10-20x-faster-llvm-o0-back-end/&quot; target=&quot;_blank&quot;&gt;open-sourced earlier this year&lt;/a&gt; by researchers at &lt;a href=&quot;https://db.in.tum.de&quot; target=&quot;_blank&quot;&gt;TUM&lt;/a&gt;. The &lt;a href=&quot;https://docs.tpde.org/tpde-llvm-main.html&quot; target=&quot;_blank&quot;&gt;comprehensive documentation&lt;/a&gt; walks you through integrating TPDE into custom builds of Clang and Flang. Currently, it supports &lt;a href=&quot;https://github.com/tpde2/tpde/blob/c857798/llvm.ab51eccf88f5.patch&quot; target=&quot;_blank&quot;&gt;LLVM 19&lt;/a&gt; and &lt;a href=&quot;https://github.com/tpde2/tpde/blob/c857798/llvm.616f2b685b06.patch&quot; target=&quot;_blank&quot;&gt;LLVM 20&lt;/a&gt; release versions.&lt;/p&gt;

&lt;h3 id=&quot;integration-in-llvm-orc-jit&quot;&gt;Integration in LLVM ORC JIT&lt;/h3&gt;

&lt;p&gt;TPDE’s primary strength lies in delivering low-latency code generation while maintaining reasonable &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-O0&lt;/code&gt; code quality — making it an ideal choice for a baseline JIT compiler. LLVM’s &lt;a href=&quot;https://llvm.org/docs/ORCv2.html&quot; target=&quot;_blank&quot;&gt;On-Request Compilation (ORC)&lt;/a&gt; framework provides a set of libraries for building JIT compilers for LLVM IR. While ORC uses LLVM’s built-in backends by default, its flexible architecture makes it straightforward to swap in TPDE instead!&lt;/p&gt;

&lt;p&gt;Let’s say we use the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LLJITBuilder&lt;/code&gt; interface to instantiate an off-the-shelf JIT:&lt;/p&gt;

&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;ExitOnError&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ExitOnErr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Builder&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;LLJITBuilder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unique_ptr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LLJIT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;JIT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ExitOnErr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Builder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;create&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The builder offers several extension points to customize the JIT instance it creates. To integrate TPDE, we’ll override the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CreateCompileFunction&lt;/code&gt; member, which defines how LLVM IR gets compiled into machine code:&lt;/p&gt;
&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;Builder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CreateCompileFunction&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[](&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;JITTargetMachineBuilder&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;JTMB&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Expected&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unique_ptr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IRCompileLayer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IRCompiler&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;make_unique&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TPDECompiler&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;JTMB&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To use TPDE in this context, we need to wrap it in a class that’s compatible with ORC’s interface:&lt;/p&gt;
&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;TPDECompiler&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;IRCompileLayer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IRCompiler&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nl&quot;&gt;public:&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;TPDECompiler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;JITTargetMachineBuilder&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;JTMB&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;IRCompiler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;irManglingOptionsFromTargetOptions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;JTMB&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getOptions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Compiler&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tpde_llvm&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LLVMCompiler&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;create&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;JTMB&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getTargetTriple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;assert&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Compiler&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;nullptr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Unknown architecture&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

  &lt;span class=&quot;n&quot;&gt;Expected&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unique_ptr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MemoryBuffer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;operator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Module&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;override&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;nl&quot;&gt;private:&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unique_ptr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tpde_llvm&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LLVMCompiler&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Compiler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;vector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unique_ptr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;vector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;uint8_t&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Buffers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In the constructor, we initialize TPDE with a target triple (like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x86_64-pc-linux-gnu&lt;/code&gt;). TPDE currently works on ELF-based systems and supports both 64-bit Intel and ARM architectures (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x86_64&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;aarch64&lt;/code&gt;). For now let’s assume that’s all we need. Now let’s implement the actual wrapper code:&lt;/p&gt;
&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;Expected&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unique_ptr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MemoryBuffer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TPDECompiler&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;operator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Module&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;Buffers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;push_back&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;make_unique&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;vector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;uint8_t&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;vector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;uint8_t&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Buffers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;back&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Compiler&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;compile_to_elf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;string&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;raw_string_ostream&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;TPDE failed to compile: &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getName&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;createStringError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;move&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;inconvertibleErrorCode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

  &lt;span class=&quot;n&quot;&gt;StringRef&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BufferRef&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;reinterpret_cast&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MemoryBuffer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getMemBuffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BufferRef&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here’s what’s happening: we create a new buffer &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;B&lt;/code&gt; to store the compiled binary code, then pass both the buffer and the module &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;M&lt;/code&gt; to TPDE for compilation. If TPDE fails, we bail out with an error. On success, we wrap the result in a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MemoryBuffer&lt;/code&gt; and return it. (Note: LLVM still uses &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;char&lt;/code&gt; pointers for binary buffers, and the &lt;a href=&quot;https://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf&quot; target=&quot;_blank&quot;&gt;three-way definition of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;char&lt;/code&gt; in the C Standard&lt;/a&gt; trips us up sometimes, but that is difficult to change in a mature codebase like LLVM.)&lt;/p&gt;
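
&lt;p&gt;One detail worth spelling out: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MemoryBuffer::getMemBuffer&lt;/code&gt; references the bytes without copying them, so the storage behind &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;B&lt;/code&gt; has to outlive the returned buffer. The standalone sketch below uses only the standard library and illustrates one plausible reason for keeping each buffer behind a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;std::unique_ptr&lt;/code&gt;: the bytes stay at the same address even when the outer vector grows. The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BufferList&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;appendBuffer&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;firstBufferStaysPut&lt;/code&gt; are hypothetical and not part of the demo:&lt;/p&gt;
&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#include &amp;lt;cstddef&amp;gt;
#include &amp;lt;cstdint&amp;gt;
#include &amp;lt;memory&amp;gt;
#include &amp;lt;vector&amp;gt;

// Hypothetical stand-in for the Buffers member of TPDECompiler.
using BufferList = std::vector&amp;lt;std::unique_ptr&amp;lt;std::vector&amp;lt;std::uint8_t&amp;gt;&amp;gt;&amp;gt;;

// Append a zero-filled buffer of Size bytes and return a pointer to its storage.
const std::uint8_t *appendBuffer(BufferList &amp;amp;Buffers, std::size_t Size) {
  Buffers.push_back(std::make_unique&amp;lt;std::vector&amp;lt;std::uint8_t&amp;gt;&amp;gt;(Size, 0));
  return Buffers.back()-&amp;gt;data();
}

// Returns true if the first buffer keeps its address while the outer vector
// grows and reallocates: only the unique_ptr handles move, not the bytes.
bool firstBufferStaysPut() {
  BufferList Buffers;
  const std::uint8_t *First = appendBuffer(Buffers, 64);
  for (int I = 0; I &amp;lt; 1000; ++I)
    appendBuffer(Buffers, 64);
  return Buffers.front()-&amp;gt;data() == First;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The flip side of this layout is that the buffers are never released while the compiler instance is alive, which is fine for a demo but might need attention in a long-running JIT.&lt;/p&gt;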

&lt;p&gt;For the basic integration this is it! No need to patch LLVM — this works with official release versions. We can compile simple LLVM IR code already:&lt;/p&gt;
&lt;div class=&quot;language-llvm highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;cat&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;01&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;-basic&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;ll&lt;/span&gt; 
&lt;span class=&quot;c1&quot;&gt;; ModuleID = &apos;test.ll&apos;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;source_filename&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;test.ll&quot;&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;define&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;i32&lt;/span&gt; &lt;span class=&quot;vg&quot;&gt;@main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nl&quot;&gt;entry:&lt;/span&gt;
  &lt;span class=&quot;nv&quot;&gt;%1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;call&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;i32&lt;/span&gt; &lt;span class=&quot;vg&quot;&gt;@custom_entry&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  &lt;span class=&quot;nv&quot;&gt;%2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;sub&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;i32&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;%1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;123&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;ret&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;i32&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;%2&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;define&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;i32&lt;/span&gt; &lt;span class=&quot;vg&quot;&gt;@custom_entry&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nl&quot;&gt;entry:&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;ret&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;i32&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;123&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I’ve created a &lt;a href=&quot;https://github.com/weliveindetail/tpde-orc&quot; target=&quot;_blank&quot;&gt;complete working demo on GitHub&lt;/a&gt; that you can try out. The code in the repo handles a few more details that we’ll explore shortly. Here’s what the output looks like:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; ./tpde-orc 01-basic.ll 
Loaded module: 01-basic.ll
Executing main()
Program returned: 0

&amp;gt; ./tpde-orc 01-basic.ll --entrypoint custom_entry
Loaded module: 01-basic.ll
Executing custom_entry()
Program returned: 123
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;a name=&quot;compare-llvm-perf&quot;&gt;&lt;/a&gt;
Compiling a large self-contained module generated with &lt;a href=&quot;https://github.com/csmith-project/csmith&quot; target=&quot;_blank&quot;&gt;csmith&lt;/a&gt; 100 times, TPDE is roughly 20x faster than the built-in LLVM codegen:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; ./build/tpde-orc --par 1 tpde-orc/03-csmith-tpde.ll
...
Compile-time was: 329 ms

&amp;gt; ./build/tpde-orc --par 1 tpde-orc/03-csmith-tpde.ll --llvm
...
Compile-time was: 6796 ms
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Please note:&lt;/strong&gt; This is not a benchmark! It’s just one example on one machine with one possible implementation. The implementation isn’t tuned to the limit and the example might not be representative.&lt;/p&gt;

&lt;p&gt;My original post showed slower compile times on both sides. I addressed some feedback and the TPDE case got a lot faster, so I updated the numbers. It’s worth mentioning that this comparison isn’t really fair to LLVM, because the off-the-shelf ORC JIT uses optimization level &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-O2&lt;/code&gt;. After changing that to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-O0&lt;/code&gt;, the TPDE speedup comes down to about 12x:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; ./build/tpde-orc --par 1 tpde-orc/03-csmith-tpde.ll --llvm
...
Compile-time was: 4060 ms
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
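
&lt;p&gt;For readers who want to try the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-O0&lt;/code&gt; comparison themselves, here is a minimal sketch of one way to lower the optimization level of the built-in codegen. It assumes the host target is used, and the helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;useO0Codegen&lt;/code&gt; is hypothetical. The demo repo may do this differently:&lt;/p&gt;
&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#include &quot;llvm/ExecutionEngine/Orc/JITTargetMachineBuilder.h&quot;
#include &quot;llvm/ExecutionEngine/Orc/LLJIT.h&quot;
#include &quot;llvm/Support/CodeGen.h&quot;
#include &quot;llvm/Support/Error.h&quot;
#include &amp;lt;utility&amp;gt;

using namespace llvm;
using namespace llvm::orc;

// Hypothetical helper (not taken from the demo): make the target machines
// created by this builder use -O0 instead of the default level.
Error useO0Codegen(LLJITBuilder &amp;amp;Builder) {
  auto JTMB = JITTargetMachineBuilder::detectHost();
  if (!JTMB)
    return JTMB.takeError();
  JTMB-&amp;gt;setCodeGenOptLevel(CodeGenOptLevel::None);
  Builder.setJITTargetMachineBuilder(std::move(*JTMB));
  return Error::success();
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Called before &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Builder.create()&lt;/code&gt;, this would affect every target machine derived from the builder, including one created later from the same &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;JITTargetMachineBuilder&lt;/code&gt;.&lt;/p&gt;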

&lt;h3 id=&quot;lljitbuilder-has-a-catch&quot;&gt;LLJITBuilder has a Catch&lt;/h3&gt;

&lt;p&gt;While &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LLJITBuilder&lt;/code&gt; is convenient, it comes with a minor trade-off. The interface incorporates standard LLVM components &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/20.x/llvm/lib/ExecutionEngine/Orc/JITTargetMachineBuilder.cpp#L42&quot; target=&quot;_blank&quot;&gt;including &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TargetRegistry&lt;/code&gt;&lt;/a&gt;, which is perfectly reasonable for most use cases. However, this creates a dependency we might not want: the built-in LLVM target backend must be initialized first via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;InitializeNativeTarget()&lt;/code&gt;. This means we still need to ship the LLVM backend, even though TPDE could theoretically replace it entirely.&lt;/p&gt;

&lt;p&gt;If you want to avoid this dependency, you’ll need to set up your ORC JIT manually. For inspiration on this approach, check out &lt;a href=&quot;https://github.com/tpde2/tpde/blob/c2bf98b592a8/tpde-llvm/tools/tpde-lli.cpp#L134-L160&quot; target=&quot;_blank&quot;&gt;how the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tpde-lli&lt;/code&gt; tool implements it&lt;/a&gt;. Before diving into that rabbit hole though, let’s explore another important aspect!&lt;/p&gt;

&lt;h3 id=&quot;implementing-a-llvm-fallback&quot;&gt;Implementing an LLVM Fallback&lt;/h3&gt;

&lt;p&gt;One reason why TPDE is so fast and compact is that it focuses on the most common use cases rather than &lt;a href=&quot;https://docs.tpde.org/tpde-llvm-main.html#autotoc_md91&quot; target=&quot;_blank&quot;&gt;covering every edge case&lt;/a&gt; in the LLVM instruction set. The documentation provides this guideline:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Code generated by Clang (-O0/-O1) will typically compile; -O2 and higher will typically fail due to unsupported vector operations.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;When your code includes advanced features like vector operations or non-trivial floating-point types, TPDE won’t be able to handle it. In these cases, we need a fallback to LLVM. Since this scenario is quite common in real-world applications, most tools will include both backends (and we can keep using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LLJITBuilder&lt;/code&gt;). Implementing the fallback is straightforward using &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/20.x/llvm/include/llvm/ExecutionEngine/Orc/CompileUtils.h#L36&quot; target=&quot;_blank&quot;&gt;ORC’s CompileUtils&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-diff highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;@@ -29,7 +29,8 @@&lt;/span&gt; static cl::opt&amp;lt;std::string&amp;gt; EntryPoint(&quot;entrypoint&quot;,
 class TPDECompiler : public IRCompileLayer::IRCompiler {
 public:
   TPDECompiler(JITTargetMachineBuilder JTMB)
&lt;span class=&quot;gd&quot;&gt;-      : IRCompiler(irManglingOptionsFromTargetOptions(JTMB.getOptions())) {
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+      : IRCompiler(irManglingOptionsFromTargetOptions(JTMB.getOptions())),
+        JTMB(std::move(JTMB)) {
&lt;/span&gt;     Compiler = tpde_llvm::LLVMCompiler::create(JTMB.getTargetTriple());
     assert(Compiler != nullptr &amp;amp;&amp;amp; &quot;Unknown architecture&quot;);
   }
&lt;span class=&quot;p&quot;&gt;@@ -37,9 +38,9 @@&lt;/span&gt; Expected&amp;lt;std::unique_ptr&amp;lt;MemoryBuffer&amp;gt;&amp;gt; TPDECompiler::operator()(Module &amp;amp;M) {
   std::vector&amp;lt;uint8_t&amp;gt; &amp;amp;B = *Buffers.back();
 
   if (!Compiler-&amp;gt;compile_to_elf(M, B)) {
&lt;span class=&quot;gd&quot;&gt;-    std::string Msg;
-    raw_string_ostream(Msg) &amp;lt;&amp;lt; &quot;TPDE failed to compile: &quot; &amp;lt;&amp;lt; M.getName();
-    return createStringError(std::move(Msg), inconvertibleErrorCode());
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+    errs() &amp;lt;&amp;lt; &quot;Falling back to LLVM for module: &quot; &amp;lt;&amp;lt; M.getName() &amp;lt;&amp;lt; &quot;\n&quot;;
+    auto TM = ExitOnErr(JTMB.createTargetMachine());
+    return SimpleCompiler(*TM)(M);
&lt;/span&gt;   }
 
   StringRef BufferRef{reinterpret_cast&amp;lt;char *&amp;gt;(B.data()), B.size()};
&lt;span class=&quot;p&quot;&gt;@@ -50,6 +51,7 @@&lt;/span&gt; public:
 private:
   std::unique_ptr&amp;lt;tpde_llvm::LLVMCompiler&amp;gt; Compiler;
   std::vector&amp;lt;std::unique_ptr&amp;lt;std::vector&amp;lt;uint8_t&amp;gt;&amp;gt;&amp;gt; Buffers;
&lt;span class=&quot;gi&quot;&gt;+  JITTargetMachineBuilder JTMB;
&lt;/span&gt; };
 
 int main(int argc, char *argv[]) {
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Let’s test our fallback mechanism with the following IR file that uses an unsupported type:&lt;/p&gt;
&lt;div class=&quot;language-llvm highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;vg&quot;&gt;@const_val&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;global&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;bfloat&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;R4248&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;define&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;i32&lt;/span&gt; &lt;span class=&quot;vg&quot;&gt;@main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nl&quot;&gt;entry:&lt;/span&gt;
  &lt;span class=&quot;nv&quot;&gt;%c&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;load&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;bfloat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;ptr&lt;/span&gt; &lt;span class=&quot;vg&quot;&gt;@const_val&lt;/span&gt;
  &lt;span class=&quot;nv&quot;&gt;%i&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fptosi&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;bfloat&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;%c&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;i32&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;ret&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;i32&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;%i&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here’s what happens when we run it:&lt;/p&gt;
&lt;div class=&quot;language-terminal highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gp&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;./tpde-orc 02-bfloat.ll
&lt;span class=&quot;go&quot;&gt;Loaded module: 02-bfloat.ll
[2025-09-25 12:54:03.076] [error] unsupported type: bfloat
[2025-09-25 12:54:03.076] [error] Failed to compile function main
Falling back to LLVM for module: 02-bfloat.ll
Executing main()
Program returned: 50
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
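
&lt;p&gt;As a sanity check on the result: a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bfloat&lt;/code&gt; is the upper 16 bits of an IEEE-754 single-precision float, so the constant &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0xR4248&lt;/code&gt; can be decoded by hand. The standalone sketch below does that with the standard library only. The function names are hypothetical, and the truncating cast stands in for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fptosi&lt;/code&gt; in the IR:&lt;/p&gt;
&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#include &amp;lt;cstdint&amp;gt;
#include &amp;lt;cstring&amp;gt;

// Widen a bfloat16 bit pattern to a 32-bit float. bfloat16 keeps the upper
// 16 bits of a binary32 value, so shifting left by 16 restores the float.
float bfloatBitsToFloat(std::uint16_t Bits) {
  std::uint32_t Wide = static_cast&amp;lt;std::uint32_t&amp;gt;(Bits) &amp;lt;&amp;lt; 16;
  float Result;
  std::memcpy(&amp;amp;Result, &amp;amp;Wide, sizeof(Result));
  return Result;
}

// Convert to a signed integer by truncating toward zero, like fptosi.
int bfloatBitsToInt(std::uint16_t Bits) {
  return static_cast&amp;lt;int&amp;gt;(bfloatBitsToFloat(Bits));
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For the bit pattern &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0x4248&lt;/code&gt; this yields 50.0 and therefore 50 after conversion, which matches the value the program returned through the LLVM fallback.&lt;/p&gt;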

&lt;p&gt;In this implementation, we create a new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SimpleCompiler&lt;/code&gt; instance for each fallback case. While this adds some overhead, it’s acceptable since we’re already on the slow path. The key assumption is that most code in your workload will successfully compile with TPDE — if that’s not the case, then TPDE might not be the right choice in the first place. Interestingly, this approach has a valuable side-effect that becomes important in the next section: it’s inherently thread-safe!&lt;/p&gt;

&lt;h3 id=&quot;adding-concurrent-compilation-support&quot;&gt;Adding Concurrent Compilation Support&lt;/h3&gt;

&lt;p&gt;ORC JIT has built-in support for concurrent compilation. This is neat, but it requires attention when customizing the JIT. Our current setup uses a single &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TPDECompiler&lt;/code&gt; instance, but TPDE’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compile_to_elf()&lt;/code&gt; method isn’t thread-safe. Enabling concurrent compilation would cause multiple threads to call this method simultaneously, leading to failures.&lt;/p&gt;

&lt;p&gt;How can we solve this? One option would be creating a new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tpde_llvm::LLVMCompiler&lt;/code&gt; instance for each compilation job, but that adds an overhead of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;O(#jobs)&lt;/code&gt;, which is not ideal for our fast path. Essentially, we want to avoid calling into &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compile_to_elf()&lt;/code&gt; while there is another call in-flight on the same instance. We can achieve this easily by making the compiler instance inside &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TPDECompiler&lt;/code&gt; thread-local, reducing the overhead to just &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;O(#threads)&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-diff highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;@@ -32,7 +32,6 @@&lt;/span&gt; public:
   TPDECompiler(JITTargetMachineBuilder JTMB)
       : IRCompiler(irManglingOptionsFromTargetOptions(JTMB.getOptions())),
         JTMB(std::move(JTMB)) {
&lt;span class=&quot;gd&quot;&gt;-    Compiler = tpde_llvm::LLVMCompiler::create(JTMB.getTargetTriple());
&lt;/span&gt;     assert(Compiler != nullptr &amp;amp;&amp;amp; &quot;Unknown architecture&quot;);
   }
 
&lt;span class=&quot;p&quot;&gt;@@ -50,11 +49,14 @@&lt;/span&gt; public:
   }
 
 private:
&lt;span class=&quot;gd&quot;&gt;-  std::unique_ptr&amp;lt;tpde_llvm::LLVMCompiler&amp;gt; Compiler;
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+  static thread_local std::unique_ptr&amp;lt;tpde_llvm::LLVMCompiler&amp;gt; Compiler;
&lt;/span&gt;   std::vector&amp;lt;std::unique_ptr&amp;lt;std::vector&amp;lt;uint8_t&amp;gt;&amp;gt;&amp;gt; Buffers;
   JITTargetMachineBuilder JTMB;
 };
 
&lt;span class=&quot;gi&quot;&gt;+thread_local std::unique_ptr&amp;lt;tpde_llvm::LLVMCompiler&amp;gt; TPDECompiler::Compiler =
+    tpde_llvm::LLVMCompiler::create(Triple(LLVM_HOST_TRIPLE));
+
&lt;/span&gt; int main(int argc, char *argv[]) {
   InitLLVM X(argc, argv);
   cl::ParseCommandLineOptions(argc, argv, &quot;TPDE ORC JIT Compiler\n&quot;);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We also need to guard access to our underlying buffers:&lt;/p&gt;
&lt;div class=&quot;language-diff highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;@@ -35,15 +35,19 @@&lt;/span&gt; public:
   }
 
   Expected&amp;lt;std::unique_ptr&amp;lt;MemoryBuffer&amp;gt;&amp;gt; operator()(Module &amp;amp;M) override {
&lt;span class=&quot;gd&quot;&gt;-    Buffers.push_back(std::make_unique&amp;lt;std::vector&amp;lt;uint8_t&amp;gt;&amp;gt;());
-    std::vector&amp;lt;uint8_t&amp;gt; *B = Buffers.back().get();
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+    std::vector&amp;lt;uint8_t&amp;gt; *B;
+    {
+      std::lock_guard&amp;lt;std::mutex&amp;gt; Lock(BuffersAccess);
+      Buffers.push_back(std::make_unique&amp;lt;std::vector&amp;lt;uint8_t&amp;gt;&amp;gt;());
+      B = Buffers.back().get();
+    }
&lt;/span&gt; 
     if (!Compiler-&amp;gt;compile_to_elf(M, *B)) {
       errs() &amp;lt;&amp;lt; &quot;Falling back to LLVM for module: &quot; &amp;lt;&amp;lt; M.getName() &amp;lt;&amp;lt; &quot;\n&quot;;
&lt;span class=&quot;p&quot;&gt;@@ -50,6 +54,7 @@&lt;/span&gt; public:
 private:
   static thread_local std::unique_ptr&amp;lt;tpde_llvm::LLVMCompiler&amp;gt; Compiler;
   std::vector&amp;lt;std::unique_ptr&amp;lt;std::vector&amp;lt;uint8_t&amp;gt;&amp;gt;&amp;gt; Buffers;
&lt;span class=&quot;gi&quot;&gt;+  std::mutex BuffersAccess;
&lt;/span&gt;   JITTargetMachineBuilder JTMB;
 };

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
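
&lt;p&gt;The two changes above combine into one pattern: a private compiler per thread plus a lock around the shared buffer list. The following is a minimal, language-agnostic sketch of that pattern in Python, not the sample project’s code. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FakeCompiler&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compile_job&lt;/code&gt; and the job and thread counts are hypothetical stand-ins; only the structure mirrors the diffs.&lt;/p&gt;

```python
import threading
from concurrent.futures import ThreadPoolExecutor

NUM_THREADS = 4   # hypothetical pool size
NUM_JOBS = 32     # hypothetical number of compile jobs


class FakeCompiler:
    """Hypothetical stand-in for a compiler that is costly to create
    and must not be used from two threads at the same time."""

    created = 0
    created_lock = threading.Lock()

    def __init__(self):
        with FakeCompiler.created_lock:
            FakeCompiler.created += 1

    def compile(self, source, out):
        out.extend(source.encode())
        return True


tls = threading.local()          # one slot per thread, like thread_local in C++
buffers = []                     # shared by all threads, so guarded by a lock
buffers_lock = threading.Lock()


def compile_job(source):
    # Lazily create this thread's private compiler on first use.
    compiler = getattr(tls, "compiler", None)
    if compiler is None:
        compiler = tls.compiler = FakeCompiler()

    # Hold the lock only while touching the shared list.
    out = bytearray()
    with buffers_lock:
        buffers.append(out)

    # Compile outside the lock: `out` is written by this job only.
    return compiler.compile(source, out)


with ThreadPoolExecutor(max_workers=NUM_THREADS) as pool:
    results = list(pool.map(compile_job, [f"job{i}" for i in range(NUM_JOBS)]))

print(f"{NUM_JOBS} jobs, {FakeCompiler.created} compiler instances")
```

&lt;p&gt;The number of compiler instances follows the number of worker threads that actually ran a job, so in this sketch it never exceeds &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NUM_THREADS&lt;/code&gt;, no matter how many jobs are queued.&lt;/p&gt;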

&lt;p&gt;With thread safety handled, we can now enable concurrent compilation:&lt;/p&gt;
&lt;div class=&quot;language-diff highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;@@ -27,6 +27,10 @@&lt;/span&gt; static cl::opt&amp;lt;std::string&amp;gt; EntryPoint(&quot;entrypoint&quot;,
                                       cl::desc(&quot;Entry point function name&quot;),
                                       cl::init(&quot;main&quot;));
 
&lt;span class=&quot;gi&quot;&gt;+static cl::opt&amp;lt;unsigned&amp;gt;
+    Threads(&quot;par&quot;, cl::desc(&quot;Compile csmith code on N threads concurrently&quot;),
+            cl::init(0));
+
&lt;/span&gt; class TPDECompiler : public IRCompileLayer::IRCompiler {
 public:
&lt;span class=&quot;p&quot;&gt;@@ -65,6 +65,8 @@&lt;/span&gt; int main(int argc, char *argv[]) {
       -&amp;gt; Expected&amp;lt;std::unique_ptr&amp;lt;IRCompileLayer::IRCompiler&amp;gt;&amp;gt; {
     return std::make_unique&amp;lt;TPDECompiler&amp;gt;(JTMB);
   };
&lt;span class=&quot;gi&quot;&gt;+  Builder.SupportConcurrentCompilation = true;
+  Builder.NumCompileThreads = Threads;
&lt;/span&gt;   std::unique_ptr&amp;lt;LLJIT&amp;gt; JIT = ExitOnErr(Builder.create());
 
   ThreadSafeModule TSM(std::move(Mod), std::move(Context));
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;exercising-concurrent-lookup&quot;&gt;Exercising Concurrent Lookup&lt;/h3&gt;

&lt;p&gt;It takes a lot more support code to actually exercise concurrent compilation and take basic performance measurements. The &lt;a href=&quot;https://github.com/weliveindetail/tpde-orc&quot; target=&quot;_blank&quot;&gt;complete sample project on GitHub&lt;/a&gt; has one possible implementation: after loading the input module, it creates 100 duplicates of it with different entry-point names and issues a single JIT lookup for all the entry-points at once. Here’s a simplified version of how this works:&lt;/p&gt;
&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;SymbolMap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SymMap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;SymbolLookupSet&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;EntryPoints&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;addDuplicates&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;JIT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Mod&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;outs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Compiling &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;EntryPoints&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot; modules on &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Threads&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot; threads in parallel&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;using&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;namespace&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;chrono&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ES&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;JIT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getExecutionSession&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SO&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;makeJITDylibSearchOrder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;JIT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getMainJITDylib&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()});&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Start&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;steady_clock&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;now&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;// Look up all entry-points at once to exercise concurrent compilation&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;SymMap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ExitOnErr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ES&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lookup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SO&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;EntryPoints&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;End&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;steady_clock&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;now&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Elapsed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;duration_cast&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;milliseconds&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;End&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;outs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Compile-time was: &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Elapsed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot; ms&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Compile-times for our csmith example drop from ~2200ms to just ~740ms when utilizing 8 threads in parallel:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; ./tpde-orc --par 8 tpde-orc/03-csmith-tpde.ll
Load module: tpde-orc/03-csmith-tpde.ll
Compiling 100 modules on 8 threads in parallel
...
Compile-time was: 737 ms
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;et-voilà&quot;&gt;Et voilà!&lt;/h3&gt;

&lt;p&gt;Let’s wrap it up and appreciate the remarkable complexity that LLVM effortlessly handles in our little example. We parse a &lt;a href=&quot;https://llvm.org/docs/LangRef.html&quot; target=&quot;_blank&quot;&gt;well-defined, human-readable representation&lt;/a&gt; of Turing-complete programs generated from various general-purpose languages like C++, Fortran, Rust, Swift, Julia, and Zig.&lt;/p&gt;

&lt;p&gt;LLVM’s composable JIT engine seamlessly manages these parsed modules, automatically resolving symbols and dependencies. It compiles machine code in the native object format on-demand for multiple platforms and CPU architectures, while giving us complete control over the optimization pipeline, code generator (like our TPDE integration) and many more components. The engine then links everything into an executable form — all in-memory and without external tools or platform-specific dynamic library tricks! It’s really impressive that we can simply enable compilation on N threads in parallel and have it “just work” :-)&lt;/p&gt;

&lt;p&gt;&lt;a name=&quot;follow-ups&quot;&gt;&lt;/a&gt;
&lt;strong&gt;Follow-up:&lt;/strong&gt; After revisiting the single-threaded compile-times, the multi-threading result is rather disappointing. Digging deeper, there seem to be two important reasons why this is so slow:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;If we enable concurrent compilation, LLJIT activates a mechanism that makes sure the workload can actually be processed in parallel. This requires modules to live in distinct LLVM contexts and &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/20.x/llvm/lib/ExecutionEngine/Orc/LLJIT.cpp#L1040&quot; target=&quot;_blank&quot;&gt;LLJIT clones all modules&lt;/a&gt; before dispatch. This is very expensive and actually dominates compile-times. The mechanism wouldn’t be necessary in our case here. As long as we keep using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LLJITBuilder&lt;/code&gt; though, we cannot disable it, because the API gives no access to LLJIT’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;InitHelperTransformLayer&lt;/code&gt;, which is responsible for the cloning.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DynamicThreadPoolTaskDispatcher&lt;/code&gt; has a catch: It only pools compile jobs, but &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/87f0227cb60147a26a1eeb4fb06e3b505e9c7261/llvm/lib/ExecutionEngine/Orc/TaskDispatch.cpp#L67&quot; target=&quot;_blank&quot;&gt;keeps spawning new threads for everything else&lt;/a&gt;, which is the majority of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Task&lt;/code&gt;s it dispatches in our examples. As a result, the number of threads ends up larger than the number of jobs, so the thread-local compiler setup with its &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;O(#threads)&lt;/code&gt; overhead costs more than the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;O(#jobs)&lt;/code&gt; alternative and even slows down our multi-threading case! That’s a pity and it has already caused issues in real-world projects. Right now, implementing a custom task dispatcher downstream seems to be the best solution. This is how JuliaLang fixed it: &lt;a href=&quot;https://github.com/JuliaLang/julia/pull/58950&quot; target=&quot;_blank&quot;&gt;julia#58950&lt;/a&gt;. &lt;a href=&quot;https://github.com/lhames&quot; target=&quot;_blank&quot;&gt;Lang Hames&lt;/a&gt;, the author of ORC, wants to address this issue in &lt;a href=&quot;https://llvm.swoogo.com/2025devmtg/session/3366605/jit-loading-arbitrary-programs-%E2%80%94-powering-xcode-previews-with-llvm%E2%80%99s-jit&quot; target=&quot;_blank&quot;&gt;his upcoming presentation&lt;/a&gt; at the US LLVM Developers’ Meeting in October.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Last but not least, my post inspired &lt;a href=&quot;https://github.com/aengelke&quot; target=&quot;_blank&quot;&gt;Alexis Engelke&lt;/a&gt;, one of the TPDE authors, to &lt;a href=&quot;https://github.com/tpde2/tpde/commit/29bcf1841c572fcdc75dd61bb3efff5bfb1c5ac6&quot; target=&quot;_blank&quot;&gt;sketch a TPDE layer for ORC upstream&lt;/a&gt;. This is a pretty good idea indeed!&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;
</description>
				<pubDate>Tue, 30 Sep 2025 10:00:00 +0000</pubDate>
				<link>https://weliveindetail.github.io/blog/post/2025/09/30/tpde-in-llvm-orc.html</link>
				<guid isPermaLink="true">https://weliveindetail.github.io/blog/post/2025/09/30/tpde-in-llvm-orc.html</guid>
			</item>
		
			<item>
				<title>Native binary obfuscation in clang-repl</title>
				<description>&lt;style&gt;
  #banner-image {
    margin-bottom: 50px;
  }
  #large-image {
    max-width: min(100%, 800px);
  }
  .center {
    display: block;
    margin: 0 auto;
  }
&lt;/style&gt;

&lt;p&gt;&lt;img src=&quot;https://weliveindetail.github.io/blog/res/2024-omvll-clang-repl.png&quot; alt=&quot;docker-banner&quot; id=&quot;banner-image&quot; class=&quot;center&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;[TL;DR]&lt;/strong&gt; Check &lt;a href=&quot;https://hub.docker.com/r/weliveindetail/omvll-clang-repl&quot; target=&quot;_blank&quot;&gt;the PoC&lt;/a&gt; and &lt;a href=&quot;#conclusions-and-future-work&quot;&gt;skip to conclusions&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;background&quot;&gt;Background&lt;/h3&gt;

&lt;p&gt;Obfuscation is a technique for making code more difficult to understand. It has its origins in source-distributed languages like JavaScript and &lt;a href=&quot;https://stackoverflow.com/questions/194397/how-can-i-obfuscate-protect-javascript#answers&quot; target=&quot;_blank&quot;&gt;evolved in close relation&lt;/a&gt; to &lt;a href=&quot;https://web.archive.org/web/20160424125048/https://docs.webplatform.org/wiki/concepts/programming/javascript/minification&quot; target=&quot;_blank&quot;&gt;minification&lt;/a&gt; since the 2000s.&lt;/p&gt;

&lt;p&gt;More recently, mature tools for decompilation and binary analysis have become widely available. 2019 marked a turning point when NSA &lt;a href=&quot;https://www.nsa.gov/Press-Room/News-Highlights/Article/Article/1775584/ghidra-the-software-reverse-engineering-tool-youve-been-waiting-for-is-here/&quot; target=&quot;_blank&quot;&gt;open-sourced Ghidra&lt;/a&gt;. Combined with automation toolkits like &lt;a href=&quot;https://joern.io&quot; target=&quot;_blank&quot;&gt;Joern&lt;/a&gt;, vulnerability analysis for binary-distributed software (and &lt;a href=&quot;https://www.youtube.com/watch?v=hfxCDx9BTLo&quot; target=&quot;_blank&quot;&gt;even device firmware&lt;/a&gt;) has made great strides. As a consequence, anti-tampering solutions for native toolchains are gaining traction. Obfuscation plays an important role, because it raises the bar for attackers, e.g. to identify sensitive parts in the code through pattern matching.&lt;/p&gt;

&lt;p&gt;Obfuscation can be implemented as a post-link step: Lift binary code to an intermediate representation (like LLVM IR), transform it and lower the result back to binary. Trail of Bits’ &lt;a href=&quot;https://github.com/lifting-bits/mcsema&quot; target=&quot;_blank&quot;&gt;McSema&lt;/a&gt; and &lt;a href=&quot;https://github.com/lifting-bits/remill&quot; target=&quot;_blank&quot;&gt;Remill&lt;/a&gt; are well-known binary lifters. Meta’s &lt;a href=&quot;https://dl.acm.org/doi/abs/10.5555/3314872.3314876&quot; target=&quot;_blank&quot;&gt;BOLT&lt;/a&gt; uses the same approach for post-link optimizations. It doesn’t have to be like that though.&lt;/p&gt;

&lt;h3 id=&quot;obfuscation-as-compiler-pass&quot;&gt;Obfuscation as compiler pass&lt;/h3&gt;

&lt;p&gt;Compilers run a lot of transformations when they translate source code to binary. We call them &lt;em&gt;passes&lt;/em&gt;. There are analysis passes (e.g. reaching definitions, alias, branch probabilities), optimization passes (inlining, loop unrolling), elimination passes (dead-code, loops), allocation passes (registers, stack slots), instruction selection passes (isel, inst-combine) and many more. In clang we can dump them with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-fdebug-pass-structure&lt;/code&gt; option – give it a try and if you feel lucky, check &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-O3&lt;/code&gt;!&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;➜ clang -c hello.c -o /dev/null -O2 -fdebug-pass-structure
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The &lt;a href=&quot;https://llvm.org/docs/NewPassManager.html&quot; target=&quot;_blank&quot;&gt;New Pass Manager&lt;/a&gt; in LLVM provides a comfortable way for compilers to inject additional passes into the codegen pipeline. We call them &lt;em&gt;out-of-tree passes&lt;/em&gt;, because they are not part of the original distribution (unlike &lt;em&gt;in-tree&lt;/em&gt; passes that are built-in). Clang provides the command-line option &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-fpass-plugin&lt;/code&gt; and expects a shared library that registers new passes in the transformation pipeline (&lt;a href=&quot;https://llvm.org/docs/WritingAnLLVMNewPMPass.html&quot; target=&quot;_blank&quot;&gt;API docs&lt;/a&gt;, &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/main/llvm/examples/Bye/Bye.cpp&quot; target=&quot;_blank&quot;&gt;LLVM example plugin&lt;/a&gt;).&lt;/p&gt;

&lt;h3 id=&quot;obfuscatorre--o-mvll&quot;&gt;obfuscator.re / O-MVLL&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/open-obfuscator/o-mvll&quot; target=&quot;_blank&quot;&gt;O-MVLL&lt;/a&gt; uses pass injection to implement obfuscation at compile-time. It was built by &lt;a href=&quot;https://www.romainthomas.fr/post/22-10-open-obfuscator/&quot; target=&quot;_blank&quot;&gt;Romain Thomas&lt;/a&gt; and got open-sourced in 2022 as a framework to develop, tune and show-case obfuscation techniques. The project is maintained by &lt;a href=&quot;https://build38.com/&quot; target=&quot;_blank&quot;&gt;Build38&lt;/a&gt; and I &lt;a href=&quot;https://github.com/open-obfuscator/o-mvll/commits?author=weliveindetail&quot; target=&quot;_blank&quot;&gt;helped build some infrastructure&lt;/a&gt; for it in 2023.&lt;/p&gt;

&lt;p&gt;O-MVLL provides a Python configuration API that allows users to select and parameterize transformations for specific use-cases. I recently added a &lt;a href=&quot;https://github.com/open-obfuscator/o-mvll/pull/49&quot; target=&quot;_blank&quot;&gt;new API callback&lt;/a&gt; that exposes actual IR-level changes to Python. The following script enables the &lt;a href=&quot;https://obfuscator.re/omvll/passes/arithmetic/&quot; target=&quot;_blank&quot;&gt;Arithmetic Obfuscation pass&lt;/a&gt; and dumps a diff for each applied transformation:&lt;/p&gt;

&lt;div class=&quot;language-py highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;omvll&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;functools&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lru_cache&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;difflib&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;unified_diff&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;MyConfig&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;omvll&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ObfuscationConfig&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;super&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;obfuscate_arithmetic&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mod&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;omvll&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ArithmeticOpt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rounds&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;report_diff&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pass_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;original&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;obfuscated&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pass_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;applied obfuscation:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;green&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\x1b&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;[38;5;16;48;5;2m&apos;&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;red&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\x1b&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;[38;5;16;48;5;1m&apos;&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\x1b&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;[0m&apos;&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;diff&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;unified_diff&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;original&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;splitlines&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;obfuscated&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;splitlines&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;
                            &lt;span class=&quot;s&quot;&gt;&apos;original&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&apos;obfuscated&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lineterm&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;diff&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;+&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;green&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;-&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;red&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;o&quot;&gt;@&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lru_cache&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;maxsize&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;omvll_get_config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;omvll&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ObfuscationConfig&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MyConfig&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Let’s consider a minimal C++ example with an XOR operation in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;printf&lt;/code&gt; parameter:&lt;/p&gt;
&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;cp&quot;&gt;#include&lt;/span&gt; &lt;span class=&quot;cpf&quot;&gt;&amp;lt;cstdio&amp;gt;&lt;/span&gt;&lt;span class=&quot;cp&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;argv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[])&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;printf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;%d&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;^&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;123&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We can easily see how the obfuscator outlines the operation and replaces it with an equivalent, more complex expression:&lt;/p&gt;
&lt;div class=&quot;language-diff highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;err&quot;&gt;➜&lt;/span&gt; clang++ -O1 -fpass-plugin=/path/to/libOMVLL.so -o xor-example -c xor-example.cpp
&lt;span class=&quot;p&quot;&gt;omvll::Arithmetic applied obfuscation:
&lt;/span&gt;&lt;span class=&quot;gd&quot;&gt;--- original
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+++ obfuscated
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@@ -9,16 +9,45 @@&lt;/span&gt;
 ; Function Attrs: mustprogress norecurse uwtable
 define dso_local noundef i32 @main(i32 noundef %argc, ptr noundef %argv) #0 {
 entry:
&lt;span class=&quot;gd&quot;&gt;-  %xor = xor i32 %argc, 123
-  %call = call i32 (ptr, ...) @printf(ptr noundef @.str, i32 noundef %xor)
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+  %0 = call i32 @__omvll_mba(i32 %argc, i32 123)
+  %call = call i32 (ptr, ...) @printf(ptr noundef @.str, i32 noundef %0)
&lt;/span&gt;   ret i32 0
 }
 
 ; Function Attrs: nofree nounwind
 declare noundef i32 @printf(ptr nocapture noundef readonly, ...) #1
 
&lt;span class=&quot;gi&quot;&gt;+; Function Attrs: alwaysinline optnone
+define private i32 @__omvll_mba(i32 %0, i32 %1) #2 {
+entry:
+  %2 = add i32 %0, %1
+  %3 = add i32 %2, 1
+  %4 = xor i32 %0, -1
+  %5 = xor i32 %1, -1
+  %6 = or i32 %4, %5
+  %7 = add i32 %3, %6
+  %8 = or i32 %0, %1
+  %9 = add i32 %0, %1
+  %10 = or i32 %0, %1
+  %11 = sub i32 %9, %10
+  %12 = and i32 %0, %1
+  %13 = sub i32 0, %11
+  %14 = xor i32 %7, %13
+  %15 = sub i32 0, %11
+  %16 = and i32 %7, %15
+  %17 = mul i32 2, %16
+  %18 = add i32 %14, %17
+  %19 = sub i32 %7, %11
+  %20 = or i32 %0, %1
+  %21 = and i32 %0, %1
+  %22 = sub i32 %20, %21
+  %23 = xor i32 %0, %1
+  ret i32 %18
+}
+
&lt;/span&gt; attributes #0 = { mustprogress norecurse uwtable &quot;min-legal-vector-width&quot;=&quot;0&quot; &quot;no-trapping-math&quot;=&quot;true&quot; &quot;stack-protector-buffer-size&quot;=&quot;8&quot; &quot;target-cpu&quot;=&quot;x86-64&quot; &quot;target-features&quot;=&quot;+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87&quot; &quot;tune-cpu&quot;=&quot;generic&quot; }
 attributes #1 = { nofree nounwind &quot;no-trapping-math&quot;=&quot;true&quot; &quot;stack-protector-buffer-size&quot;=&quot;8&quot; &quot;target-cpu&quot;=&quot;x86-64&quot; &quot;target-features&quot;=&quot;+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87&quot; &quot;tune-cpu&quot;=&quot;generic&quot; }
&lt;span class=&quot;gi&quot;&gt;+attributes #2 = { alwaysinline optnone }
&lt;/span&gt; 
 !llvm.module.flags = !{!0, !1, !2, !3}
 !llvm.ident = !{!4}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
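
&lt;p&gt;To convince ourselves that the outlined helper really computes an XOR, we can replay its live instructions outside the compiler. The following is only a minimal sketch: the function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;omvll_mba&lt;/code&gt; and the variable names are made up for illustration, inputs are assumed to be unsigned 32-bit values, and the sketch mirrors only the instructions that feed the returned &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;%18&lt;/code&gt; in the diff above.&lt;/p&gt;

```python
from operator import and_, or_, xor

MOD = 2**32      # i32 arithmetic wraps modulo 2**32
ONES = MOD - 1   # the i32 constant -1 (all bits set)

def omvll_mba(a, b):
    # Hypothetical Python mirror of the live instructions in @__omvll_mba.
    # Assumes a and b are already in the unsigned 32-bit range.
    v3 = (a + b + 1) % MOD               # %2, %3
    v6 = or_(xor(a, ONES), xor(b, ONES)) # %4, %5, %6: NOT a OR NOT b
    v7 = (v3 + v6) % MOD                 # %7: collapses to a OR b
    v11 = (a + b - or_(a, b)) % MOD      # %9, %10, %11: collapses to a AND b
    v13 = (0 - v11) % MOD                # %13 (and the identical %15)
    v14 = xor(v7, v13)                   # %14
    v16 = and_(v7, v13)                  # %16
    v17 = (2 * v16) % MOD                # %17
    return (v14 + v17) % MOD             # %18: x + y written as (x XOR y) + 2 * (x AND y)

# Spot-check a few inputs against the plain XOR
samples = [(1, 123), (0, 0), (ONES, 123), (0x12345678, 0x0F0F0F0F)]
results = [omvll_mba(a, b) for a, b in samples]
expected = [xor(a, b) for a, b in samples]
print(results == expected)
print(omvll_mba(1, 123))
```

&lt;p&gt;The trick is that the last step adds (a OR b) and the negated (a AND b), which is just another way to write a XOR b. The remaining instructions in the helper are never used for the result.&lt;/p&gt;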

&lt;p&gt;The obfuscated code no longer contains the original, single XOR operation. Potential attackers need to &lt;a href=&quot;https://obfuscator.re/omvll/passes/arithmetic/#limitations--attacks&quot; target=&quot;_blank&quot;&gt;invest more effort to uncover it&lt;/a&gt;. In this simple case, we could get the same insight by diffing the IR manually:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;➜ clang++ -O1 -S -emit-llvm xor-example.cpp -o original.ll
➜ clang++ -O1 -S -emit-llvm xor-example.cpp -o obfuscated.ll -fpass-plugin=/path/to/libOMVLL.so
➜ diff -u original.ll obfuscated.ll
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But the script-based approach already has a few advantages:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;If multiple obfuscations are applied, we can see each individual step and not only the sum of all transformations.&lt;/li&gt;
  &lt;li&gt;We can enable/disable/fine-tune subsequent obfuscations based on actual transformations.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;interactive-c-with-obfuscations&quot;&gt;Interactive C++ with obfuscations&lt;/h3&gt;

&lt;p&gt;The outstanding benefit of the script-based approach is the ability to use it in tools that don’t write their outputs to files on disk! In particular, this is interesting for &lt;a href=&quot;https://clang.llvm.org/docs/ClangRepl.html&quot; target=&quot;_blank&quot;&gt;clang-repl&lt;/a&gt;, the interactive C++ interpreter in upstream LLVM. It uses the clang frontend internally and supports the same options (with a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-Xcc&lt;/code&gt; prefix), including &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-fpass-plugin&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-diff highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;err&quot;&gt;➜&lt;/span&gt; clang-repl -Xcc -O1 -Xcc -fpass-plugin=/path/to/libOMVLL.so
&lt;span class=&quot;p&quot;&gt;clang-repl&amp;gt; #include &amp;lt;cstdio&amp;gt;
clang-repl&amp;gt; int a = 1;
clang-repl&amp;gt; printf(&quot;%d\n&quot;, a^123);
omvll::Arithmetic applied obfuscation:
&lt;/span&gt;&lt;span class=&quot;gd&quot;&gt;--- original
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+++ obfuscated
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@@ -11,8 +11,8 @@&lt;/span&gt;
 define internal void @__stmts__0() #0 {
 entry:
   %0 = load i32, ptr @a, align 4, !tbaa !5
&lt;span class=&quot;gd&quot;&gt;-  %xor = xor i32 %0, 123
-  %call = call i32 (ptr, ...) @printf(ptr noundef @.str, i32 noundef %xor)
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+  %1 = call i32 @__omvll_mba(i32 %0, i32 123)
+  %call = call i32 (ptr, ...) @printf(ptr noundef @.str, i32 noundef %1)
&lt;/span&gt;   ret void
 }
 
&lt;span class=&quot;p&quot;&gt;@@ -26,9 +26,38 @@&lt;/span&gt;
   ret void
 }
 
&lt;span class=&quot;gi&quot;&gt;+; Function Attrs: alwaysinline optnone
+define private i32 @__omvll_mba(i32 %0, i32 %1) #3 {
+entry:
+  %2 = add i32 %0, %1
+  %3 = add i32 %2, 1
+  %4 = xor i32 %0, -1
+  %5 = xor i32 %1, -1
+  %6 = or i32 %4, %5
+  %7 = add i32 %3, %6
+  %8 = or i32 %0, %1
+  %9 = add i32 %0, %1
+  %10 = or i32 %0, %1
+  %11 = sub i32 %9, %10
+  %12 = and i32 %0, %1
+  %13 = sub i32 0, %11
+  %14 = xor i32 %7, %13
+  %15 = sub i32 0, %11
+  %16 = and i32 %7, %15
+  %17 = mul i32 2, %16
+  %18 = add i32 %14, %17
+  %19 = sub i32 %7, %11
+  %20 = or i32 %0, %1
+  %21 = and i32 %0, %1
+  %22 = sub i32 %20, %21
+  %23 = xor i32 %0, %1
+  ret i32 %18
+}
+
&lt;/span&gt; attributes #0 = { &quot;min-legal-vector-width&quot;=&quot;0&quot; }
 attributes #1 = { nofree nounwind &quot;no-trapping-math&quot;=&quot;true&quot; &quot;stack-protector-buffer-size&quot;=&quot;8&quot; &quot;target-cpu&quot;=&quot;x86-64&quot; &quot;target-features&quot;=&quot;+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87&quot; &quot;tune-cpu&quot;=&quot;generic&quot; }
 attributes #2 = { uwtable &quot;min-legal-vector-width&quot;=&quot;0&quot; &quot;no-trapping-math&quot;=&quot;true&quot; &quot;stack-protector-buffer-size&quot;=&quot;8&quot; &quot;target-cpu&quot;=&quot;x86-64&quot; &quot;target-features&quot;=&quot;+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87&quot; &quot;tune-cpu&quot;=&quot;generic&quot; }
&lt;span class=&quot;gi&quot;&gt;+attributes #3 = { alwaysinline optnone }
&lt;/span&gt; 
 !llvm.module.flags = !{!0, !1, !2, !3}
 !llvm.ident = !{!4}

122
&lt;span class=&quot;p&quot;&gt;clang-repl&amp;gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;limitations-in-the-poc&quot;&gt;Limitations in the PoC&lt;/h3&gt;

&lt;p&gt;Pass implementations in O-MVLL use LLVM’s bare C++ interfaces, so the LLVM version it links against must match the one in the target compiler exactly. O-MVLL was built for LLVM 14 and &lt;a href=&quot;https://github.com/open-obfuscator/o-mvll/commit/0ba3907da23ae1a188515238407c033b3374a3a7&quot; target=&quot;_blank&quot;&gt;just recently moved on to LLVM 16&lt;/a&gt;. clang-repl is still under development and works best with very recent versions of LLVM.&lt;/p&gt;

&lt;p&gt;For the proof-of-concept in this post, I had to do a &lt;a href=&quot;https://github.com/weliveindetail/o-mvll/commit/poc-clang-repl&quot; target=&quot;_blank&quot;&gt;partial upgrade of O-MVLL to LLVM 18&lt;/a&gt;. It’s just enough to run the demo, so please don’t expect other obfuscation passes to work yet.&lt;/p&gt;

&lt;h3 id=&quot;conclusions-and-future-work&quot;&gt;Conclusions and future work&lt;/h3&gt;

&lt;p&gt;In the race between protection techniques and reverse engineering, the accessibility of tools for exploration plays a crucial role. This PoC might inspire some form of obfuscation workbench. &lt;a href=&quot;https://hub.docker.com/r/weliveindetail/omvll-clang-repl&quot; target=&quot;_blank&quot;&gt;My Docker image&lt;/a&gt; is probably not the ideal distribution method. (It was the fastest for sure!) &lt;a href=&quot;https://jupyter.org/&quot; target=&quot;_blank&quot;&gt;Jupyter notebooks&lt;/a&gt; provide a much better user experience and they are very popular in scientific communities. &lt;a href=&quot;https://github.com/compiler-research/xeus-clang-repl&quot; target=&quot;_blank&quot;&gt;xeus-clang-repl&lt;/a&gt; is a Jupyter implementation for C++ with Python interop and might serve as a basis.&lt;/p&gt;

&lt;p&gt;O-MVLL provides a nice framework for exploring obfuscation techniques. The extensible Python API is flexible and powerful. However, I imagine that tinkering with O-MVLL is still difficult for security experts. We have to implement obfuscations in C++ and build the plugin from source. The plugin shared library and the target compiler must be ABI-compatible. This is very easy to break and quite complicated to debug. The version locking is likely to remain an issue for the foreseeable future. We can use the &lt;a href=&quot;https://github.com/open-obfuscator/o-mvll/actions/runs/10582169607/job/29321225687#step:3:1&quot; target=&quot;_blank&quot;&gt;prebuilt deps packages&lt;/a&gt; from the upstream CI, but this limits us to the target compilers supported on current mainline (Android NDK r26d and Xcode 15.2 at the time of writing this post).&lt;/p&gt;

&lt;p&gt;Another way to solve the build problem seems interesting: Why not extend the script API in a way that allows implementing entire obfuscations in Python? For performance reasons, it’s certainly not suitable for use in production, but for experimentation that seems acceptable. It could reuse existing bindings like &lt;a href=&quot;https://github.com/numba/llvmlite&quot;&gt;Numba’s llvmlite&lt;/a&gt;. Key requirements are indeed similar: “Numba and many JIT compilers do not need a full LLVM API. Only the IR builder, optimizer, and JIT compiler APIs are necessary. The IR builder is pure Python code and decoupled from LLVM’s frequently-changing C++ APIs.”&lt;/p&gt;

&lt;p&gt;Thanks for reading! Please ping me if any of this sparks your interest!&lt;/p&gt;
</description>
				<pubDate>Thu, 29 Aug 2024 10:00:00 +0000</pubDate>
				<link>https://weliveindetail.github.io/blog/post/2024/08/29/omvll-clang-repl.html</link>
				<guid isPermaLink="true">https://weliveindetail.github.io/blog/post/2024/08/29/omvll-clang-repl.html</guid>
			</item>
		
			<item>
				<title>LLVM Projects in 2023 Google Summer of Code</title>
				<description>&lt;style&gt;
  #banner-image {
    max-width: min(100%, 500px);
  }
  #large-image {
    max-width: min(100%, 800px);
  }
  .center {
    display: block;
    margin: 0 auto;
  }
&lt;/style&gt;

&lt;p&gt;&lt;img src=&quot;https://weliveindetail.github.io/blog/res/gsoc-banner.png&quot; alt=&quot;gsoc-banner&quot; id=&quot;banner-image&quot; class=&quot;center&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The LLVM Compiler Infrastructure has been accepted again as a &lt;a href=&quot;https://summerofcode.withgoogle.com/programs/2023/organizations/llvm-compiler-infrastructure&quot; target=&quot;_blank&quot;&gt;mentoring organisation for Google Summer of Code 2023&lt;/a&gt; (GSoC). This is a great opportunity for &lt;a href=&quot;https://developers.google.com/open-source/gsoc/faq#what_are_the_eligibility_requirements_for_participation&quot; target=&quot;_blank&quot;&gt;students and other eligible people&lt;/a&gt; to gain substantial development experience with LLVM in a short period of time!&lt;/p&gt;

&lt;p&gt;The application period for contributors is starting today and &lt;a href=&quot;https://developers.google.com/open-source/gsoc/timeline#march_20_-_1800_utc&quot; target=&quot;_blank&quot;&gt;ends on 4 April 2023&lt;/a&gt;. Please find the full list of projects on &lt;a href=&quot;https://llvm.org/OpenProjects.html#gsoc23&quot; target=&quot;_blank&quot;&gt;llvm.org&lt;/a&gt;. In the following I want to provide more details on the projects I am mentoring and highlight a few others that caught my interest.&lt;/p&gt;

&lt;h3 id=&quot;out-of-process-execution-for-clang-repl&quot;&gt;&lt;a href=&quot;https://discourse.llvm.org/t/clang-out-of-process-execution-for-clang-repl/68225&quot; target=&quot;_blank&quot;&gt;Out-of-process execution for clang-repl&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/llvm/llvm-project/tree/release/16.x/clang/tools/clang-repl&quot; target=&quot;_blank&quot;&gt;clang-repl&lt;/a&gt; is an interactive C++ command line (REPL). It’s based on Clang and the ORC/JITLink infrastructure and it aims to implement the core functionality of &lt;a href=&quot;https://github.com/root-project/cling&quot; target=&quot;_blank&quot;&gt;CERN’s Cling&lt;/a&gt; in upstream LLVM.&lt;/p&gt;

&lt;p&gt;Right now clang-repl can execute user code only in its own process. This has a number of drawbacks and limits its applications. It’s the goal of this GSoC project to implement an out-of-process execution model in clang-repl. LLVM’s underlying &lt;a href=&quot;https://llvm.org/docs/ORCv2.html&quot; target=&quot;_blank&quot;&gt;ORC&lt;/a&gt; and &lt;a href=&quot;https://llvm.org/docs/JITLink.html&quot; target=&quot;_blank&quot;&gt;JITLink&lt;/a&gt; libraries provide most of the basic functionality already. Now, we need to teach clang-repl to generate code that makes use of it and extend RPC features where necessary.&lt;/p&gt;

&lt;p&gt;Applicants need a good understanding of C++ and basic assembly debugging skills. Experience with RPC and generation of LLVM IR code is a plus, but not strictly required.&lt;/p&gt;

&lt;p&gt;Get in touch with us in the &lt;a href=&quot;https://discourse.llvm.org/t/clang-out-of-process-execution-for-clang-repl/68225&quot; target=&quot;_blank&quot;&gt;Discourse topic&lt;/a&gt; or in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;#jit&lt;/code&gt; channel of the &lt;a href=&quot;https://discord.gg/x6rszdMN&quot; target=&quot;_blank&quot;&gt;LLVM Discord server&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;jitlink-new-backends&quot;&gt;&lt;a href=&quot;https://discourse.llvm.org/t/jitlink-new-backends/68223&quot; target=&quot;_blank&quot;&gt;JITLink new backends&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;LLVM’s &lt;a href=&quot;https://llvm.org/docs/JITLink.html&quot; target=&quot;_blank&quot;&gt;JITLink&lt;/a&gt; library links and loads static build artifacts for immediate execution. We can use clang to compile source code into an object file and execute it right away with the &lt;a href=&quot;https://github.com/llvm/llvm-project/tree/release/16.x/llvm/tools/llvm-jitlink&quot; target=&quot;_blank&quot;&gt;llvm-jitlink&lt;/a&gt; tool.&lt;/p&gt;

&lt;p&gt;JITLink runs generic link steps and uses dedicated backends for target- and object-format-specific operations. This includes resolution of relocations, registration of exception handlers and preparation of thread-local storage. There is good support for a number of formats and platforms already, but we aren’t done yet. The goal of this GSoC project is to complete one of the existing implementations or add an entirely new one. In particular, we are thinking about AArch32 (&lt;a href=&quot;https://reviews.llvm.org/D144083&quot;&gt;skeleton is in review&lt;/a&gt;), BPF and PowerPC (both new).&lt;/p&gt;

&lt;p&gt;Applicants need intermediate knowledge of both C++ and target assembly. They will go through the target ABI documentation (like &lt;a href=&quot;https://github.com/ARM-software/abi-aa/blob/main/aaelf32/aaelf32.rst#relocation-codes-table&quot; target=&quot;_blank&quot;&gt;this for AArch32&lt;/a&gt;) and implement missing relocation types in their JITLink backend, populate &lt;a href=&quot;https://refspecs.linuxfoundation.org/ELF/zSeries/lzsabi0_zSeries/x2251.html#GLOBALOFFSETTABLE&quot; target=&quot;_blank&quot;&gt;GOT&lt;/a&gt; and &lt;a href=&quot;https://refspecs.linuxfoundation.org/ELF/zSeries/lzsabi0_zSeries/x2251.html#PROCEDURELINKAGETABLE&quot; target=&quot;_blank&quot;&gt;PLT&lt;/a&gt; and add &lt;a href=&quot;https://llvm.org/docs/JITLink.html#connection-to-the-orc-runtime&quot;&gt;ORC Runtime&lt;/a&gt; support for advanced features.&lt;/p&gt;

&lt;p&gt;Get in touch with us in the &lt;a href=&quot;https://discourse.llvm.org/t/jitlink-new-backends/68223&quot; target=&quot;_blank&quot;&gt;Discourse topic&lt;/a&gt; or in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;#jit&lt;/code&gt; channel of the &lt;a href=&quot;https://discord.gg/x6rszdMN&quot; target=&quot;_blank&quot;&gt;LLVM Discord server&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;patch-based-test-coverage-for-quick-test-feedback&quot;&gt;&lt;a href=&quot;https://discourse.llvm.org/t/coverage-patch-based-test-coverage-for-quick-test-feedback/68628&quot; target=&quot;_blank&quot;&gt;Patch based test coverage for quick test feedback&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;Regression tests are a fundamental part of any serious software engineering, but test suites keep growing and we rarely invest time to sort them out:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://weliveindetail.github.io/blog/res/2021-check-llvm-stats.png&quot; alt=&quot;check-llvm-stats&quot; id=&quot;check-llvm-stats&quot; class=&quot;center&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Running all the tests gets increasingly time-consuming. It would be very useful if we could figure out which tests are actually affected by a given change in the code-base. In practice, this should be possible in many situations. The goal of this GSoC project is to extend LLVM’s &lt;a href=&quot;https://weliveindetail.github.io/blog/post/2021/08/06/debug-llvm-lit.html&quot; target=&quot;_blank&quot;&gt;llvm-lit&lt;/a&gt; test driver tool to record coverage information and feed it back in. (I guess in a first step limited to subsequent runs on the same system?) As mentioned by the mentor, applicants don’t need compiler experience, but they do need knowledge of Python, data processing and diff processing. I assume experience with llvm-lit is a plus, but not strictly required.&lt;/p&gt;

&lt;h3 id=&quot;addressing-rust-optimization-failures&quot;&gt;&lt;a href=&quot;https://discourse.llvm.org/t/llvm-addressing-rust-optimization-failures-in-llvm/68096&quot; target=&quot;_blank&quot;&gt;Addressing Rust optimization failures&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;While the LLVM codegen backends and optimization pipelines aim to be language agnostic, they have been developed with a focus on C++ and it is not a surprise that they are not a perfect fit (yet) for IR code generated from other languages. Rust is a notable example here. The goal of this GSoC project is to find and mitigate missing optimization opportunities. Applicants will likely have a deep dive into the LLVM optimizer and learn a lot about the involved tooling. Prior knowledge of C++, LLVM IR and Rust is required.&lt;/p&gt;

&lt;h3 id=&quot;improving-compile-times&quot;&gt;&lt;a href=&quot;https://discourse.llvm.org/t/llvm-improving-compile-times/68094&quot; target=&quot;_blank&quot;&gt;Improving compile times&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;One unfortunate drawback of the continuous growth of LLVM’s code-base is that compile-time is &lt;a href=&quot;https://www.npopov.com/2020/05/10/Make-LLVM-fast-again.html&quot; target=&quot;_blank&quot;&gt;getting a few percent slower with every release&lt;/a&gt;. The goal of this GSoC project is to help mitigate that. Applicants will review profiling information to identify bottlenecks and try to improve the underlying code structures to reduce overhead. Prior knowledge of C++ and profiling tools will be necessary.&lt;/p&gt;
</description>
				<pubDate>Mon, 20 Mar 2023 08:00:00 +0000</pubDate>
				<link>https://weliveindetail.github.io/blog/post/2023/03/20/llvm-gsoc-2023.html</link>
				<guid isPermaLink="true">https://weliveindetail.github.io/blog/post/2023/03/20/llvm-gsoc-2023.html</guid>
			</item>
		
			<item>
				<title>Remote C++ live coding on Raspberry Pi with ez-clang-linux</title>
				<description>&lt;style&gt;
  #banner-image {
    max-width: min(100%, 500px);
  }
  #large-image {
    max-width: min(100%, 800px);
  }
  .center {
    display: block;
    margin: 0 auto;
  }
&lt;/style&gt;

&lt;p&gt;&lt;img src=&quot;https://weliveindetail.github.io/blog/res/ez-clang-linux-banner.png&quot; alt=&quot;ez-clang-pycfg-banner&quot; id=&quot;banner-image&quot; class=&quot;center&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://echtzeit.dev/ez-clang/&quot; target=&quot;_blank&quot;&gt;ez-clang&lt;/a&gt; is a C++ REPL for bare-metal embedded devices. Serial connections to such devices can be hairy. Hosted remote devices are usually much easier to handle and might help with experimentation and development. In &lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang/releases/tag/v0.0.6&quot; target=&quot;_blank&quot;&gt;release 0.0.6&lt;/a&gt; ez-clang gained a reference implementation for Linux-based remotes, that can now be connected via the &lt;a href=&quot;https://weliveindetail.github.io/blog/post/2023/02/03/ez-clang-pycfg.html&quot;&gt;pycfg layer&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In Python it’s easy to set up a socket connection to another device via TCP. If we find a 32-bit ARM device that runs Linux, then we just have to port our remote endpoint there and connect it to the TCP socket. Well, no need to search for long: Raspberry Pi Models 2 and 3b are both a great fit. Model 2 features a Cortex-A7 (ARMv7a instruction set) and Model 3b a Cortex-A53 with ARMv8a. While ARMv8a is in fact a 64-bit instruction set, it does provide user-space compatibility with ARMv7a. This is what we want for ez-clang! We install 32-bit Raspbian and are ready to go.&lt;/p&gt;
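
&lt;p&gt;For illustration, here is roughly what such a TCP connection looks like with Python’s standard &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;socket&lt;/code&gt; module. This is only a sketch and not the actual pycfg code: the helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;open_remote&lt;/code&gt; is hypothetical, and a loopback echo server stands in for the Raspberry Pi so that the snippet runs without any hardware.&lt;/p&gt;

```python
import socket
import threading

def open_remote(host, port, timeout=5.0):
    # Hypothetical helper: open a plain TCP connection to a remote endpoint
    return socket.create_connection((host, port), timeout=timeout)

# Loopback stand-in for the device (assumption: the real remote listens on a fixed port)
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

def serve_once():
    # Accept one connection and echo back whatever arrives first
    conn, _ = server.accept()
    with conn:
        conn.sendall(conn.recv(16))

worker = threading.Thread(target=serve_once)
worker.start()

with open_remote("127.0.0.1", port) as remote:
    remote.sendall(b"ping")
    reply = remote.recv(16)

worker.join()
server.close()
print(reply)
```

&lt;p&gt;With a real device we would pass its IP address and port instead of the loopback values and hand the connected socket to the serializer.&lt;/p&gt;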

&lt;script id=&quot;asciicast-3Mdujgd1EI4OL8FAnHEmU1h9G&quot; src=&quot;https://asciinema.org/a/3Mdujgd1EI4OL8FAnHEmU1h9G.js&quot; async=&quot;&quot;&gt;&lt;/script&gt;

&lt;p&gt;The screencast shows how to SSH into a Raspberry Pi Model 3b and configure it as an ez-clang remote. The relevant steps are:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Check out the &lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang-linux&quot; target=&quot;_blank&quot;&gt;ez-clang-linux&lt;/a&gt; repository&lt;/li&gt;
  &lt;li&gt;Configure with CMake and build the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ez-clang-linux-socket&lt;/code&gt; target with Ninja. You can keep the default system toolchain or use your own one. There are no special requirements here.&lt;/li&gt;
  &lt;li&gt;Use the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ez-clang-linux-socket.service&lt;/code&gt; file produced by CMake to configure Systemd so that the executable is auto-restarted infinitely. This is great because we don’t need to implement session handling on our own. Each connection will automatically get its own process and on disconnect it simply shuts down.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;On our host machine we can just start ez-clang now and connect to the respective IP and port. In my local network it looks like this when running with docker:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; docker run -it echtzeit/ez-clang:0.0.6 --connect=raspi32@192.168.1.105:10819
Welcome to ez-clang, your friendly remote C++ REPL. Type `.q` to exit.
Connected: TCP -&amp;gt; raspi32 (arm-none-eabi @ cortex-a53)
raspi32&amp;gt;.q
&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;On the remote we can inspect the RPC traffic from this (minimal) session with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;journalctl&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; journalctl --user -u ez-clang-linux-socket.service --follow
systemd[629]: Started Permanent ez-clang-linux remote host service.
ez-clang-linux-socket[26884]: Listening on port 10819
ez-clang-linux-socket[26884]: Connected
ez-clang-linux-socket[26884]: Send -&amp;gt;
ez-clang-linux-socket[26884]:   6a 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ez-clang-linux-socket[26884]: Send -&amp;gt;
ez-clang-linux-socket[26884]:   05 00 00 00 00 00 00 00 30 2e 30 2e 36 00 30 aa 00 00 00 00 00 00 80 00 00 00 00 00 00 01 00 00 00 00 00 00 00 15 00 00 00 00 00 00 00 5f 5f 65 7a 5f 63 6c 61 6e 67 5f 72 70 63 5f 6c 6f 6f 6b 75 70 74 26 01 00 00 00 00 00
ez-clang-linux-socket[26884]: Receive &amp;lt;-
ez-clang-linux-socket[26884]:   47 00 00 00 00 00 00 00 03 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 74 26 01 00 00 00 00 00
ez-clang-linux-socket[26884]: Receive &amp;lt;-
ez-clang-linux-socket[26884]:   01 00 00 00 00 00 00 00 17 00 00 00 00 00 00 00 5f 5f 65 7a 5f 63 6c 61 6e 67 5f 72 65 70 6f 72 74 5f 76 61 6c 75 65
ez-clang-linux-socket[26884]: Send -&amp;gt;
ez-clang-linux-socket[26884]:   31 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ez-clang-linux-socket[26884]: Send -&amp;gt;
ez-clang-linux-socket[26884]:   00 01 00 00 00 00 00 00 00 c0 2b 01 00 00 00 00 00
ez-clang-linux-socket[26884]: Receive &amp;lt;-
ez-clang-linux-socket[26884]:   20 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ez-clang-linux-socket[26884]: Send -&amp;gt;
ez-clang-linux-socket[26884]:   21 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ez-clang-linux-socket[26884]: Send -&amp;gt;
ez-clang-linux-socket[26884]:   00
ez-clang-linux-socket[26884]: Receive &amp;lt;-
ez-clang-linux-socket[26884]:   20 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ez-clang-linux-socket[26884]: Exiting
systemd[629]: ez-clang-linux-socket.service: Succeeded.
systemd[629]: ez-clang-linux-socket.service: Scheduled restart job, restart counter is at 1.
systemd[629]: Stopped Permanent ez-clang-linux remote host service.
systemd[629]: Started Permanent ez-clang-linux remote host service.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
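
&lt;p&gt;The byte dumps are easier to read once we split them into fields. The sketch below decodes the first received message from the trace above. It assumes that each 32-byte header consists of four little-endian 64-bit words and that the payload starts with two more such words followed by an ASCII string. That layout is inferred from the dump itself and not taken from a specification, and the variable names are made up for illustration.&lt;/p&gt;

```python
# Bytes copied from the first "Receive" pair in the journalctl trace
HEADER = ("47 00 00 00 00 00 00 00 03 00 00 00 00 00 00 00 "
          "01 00 00 00 00 00 00 00 74 26 01 00 00 00 00 00")
PAYLOAD = ("01 00 00 00 00 00 00 00 17 00 00 00 00 00 00 00 "
           "5f 5f 65 7a 5f 63 6c 61 6e 67 5f 72 65 70 6f 72 74 5f 76 61 6c 75 65")

def words(raw):
    # Assumption: fields are little-endian 64-bit unsigned integers
    return [int.from_bytes(raw[i:i + 8], "little") for i in range(0, len(raw), 8)]

header = bytes.fromhex(HEADER)    # fromhex ignores the spaces between bytes
payload = bytes.fromhex(PAYLOAD)

# Hypothetical field names; only the first one is checked against the data below
size, field1, field2, field3 = words(header)
count, name_len = words(payload[:16])
name = payload[16:16 + name_len].decode("ascii")

print(size, len(header) + len(payload))
print(hex(field3), count, name)
```

&lt;p&gt;The first header word matches the combined length of header and payload, and the string turns out to be a symbol name. The last header word carries the same value as the final eight bytes of the preceding Send payload, which suggests that it is an address handed out earlier in the session.&lt;/p&gt;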

&lt;h3 id=&quot;voilà&quot;&gt;Voilà!&lt;/h3&gt;

&lt;p&gt;With these few steps we can start and run C++ code on Raspberry Pi on the fly! Keep in mind, of course, that the &lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang&quot; target=&quot;_blank&quot;&gt;ez-clang project&lt;/a&gt; is still in a very experimental state ;-)&lt;/p&gt;
</description>
				<pubDate>Mon, 06 Feb 2023 21:00:00 +0000</pubDate>
				<link>https://weliveindetail.github.io/blog/post/2023/02/06/ez-clang-linux.html</link>
				<guid isPermaLink="true">https://weliveindetail.github.io/blog/post/2023/02/06/ez-clang-linux.html</guid>
			</item>
		
			<item>
				<title>ez-clang Python Device Configuration Layer</title>
				<description>&lt;style&gt;
  #banner-image {
    max-width: min(100%, 500px);
  }
  #large-image {
    max-width: min(100%, 800px);
  }
  .center {
    display: block;
    margin: 0 auto;
  }
&lt;/style&gt;

&lt;p&gt;&lt;img src=&quot;https://weliveindetail.github.io/blog/res/ez-clang-pycfg-banner.png&quot; alt=&quot;ez-clang-pycfg-banner&quot; id=&quot;banner-image&quot; class=&quot;center&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;br /&gt;
&lt;a href=&quot;http://echtzeit.dev/ez-clang/&quot; target=&quot;_blank&quot;&gt;ez-clang&lt;/a&gt; is a C++ REPL for bare-metal embedded devices. In &lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang/releases/tag/v0.0.6&quot; target=&quot;_blank&quot;&gt;release 0.0.6&lt;/a&gt; it gained a &lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang-pycfg&quot; target=&quot;_blank&quot;&gt;Python Device Configuration Layer (pycfg)&lt;/a&gt; that allows users to connect their own devices.&lt;/p&gt;

&lt;h3 id=&quot;elements-of-a-device-script&quot;&gt;Elements of a device script&lt;/h3&gt;

&lt;p&gt;As before we specify the target device in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--connect&lt;/code&gt; parameter, e.g.:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;ez-clang --connect=teensylc@/dev/ttyACM0
ez-clang --connect=raspi32@192.168.0.100:20000
ez-clang --connect=lm3s811@qemu
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang-pycfg/blob/v0.0.6/.share/ez/scan.py#L7&quot; target=&quot;_blank&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scan()&lt;/code&gt;&lt;/a&gt; function will parse this string and load the respective device script. The &lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang-pycfg/blob/v0.0.6/.share/ez/util/script.py#L27&quot; target=&quot;_blank&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Script&lt;/code&gt; class&lt;/a&gt; implements the interface between ez-clang and Python. Compatible device scripts define the following freestanding functions.&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;accept()&lt;/code&gt; checks whether the provided info matches the device. The first candidate script that returns a non-Null value here will be chosen. The type of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;info&lt;/code&gt; parameter depends on the selected transport type. For serial transport we get a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;serial.tools.list_ports_linux.SysFS&lt;/code&gt; object and will probably check the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hwid&lt;/code&gt; field (&lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang-pycfg/blob/v0.0.6/teensylc/serial.py#L115&quot; target=&quot;_blank&quot;&gt;example for TeensyLC&lt;/a&gt;).&lt;/p&gt;
&lt;div class=&quot;language-py highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;accept&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Any&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Any&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
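&lt;p&gt;As a rough illustration of such a check, the sketch below accepts a port when its &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hwid&lt;/code&gt; string contains an expected USB vendor/product id. The id value, the returned object and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SimpleNamespace&lt;/code&gt; stand-ins for real port objects are assumptions for illustration and not taken from the linked TeensyLC script.&lt;/p&gt;
&lt;div class=&quot;language-py highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;from types import SimpleNamespace

# Assumption: USB vendor/product id reported by the board&apos;s serial interface.
EXPECTED_ID = &quot;VID:PID=16C0:0483&quot;


def accept(info):
    # Port objects are expected to expose the hardware id as a plain string.
    hwid = getattr(info, &quot;hwid&quot;, &quot;&quot;) or &quot;&quot;
    if EXPECTED_ID in hwid.upper():
        return info  # any value other than None claims the device
    return None


# Stand-ins for the port objects that a serial scan would hand to accept().
teensy = SimpleNamespace(device=&quot;/dev/ttyACM0&quot;, hwid=&quot;USB VID:PID=16C0:0483 SER=1234&quot;)
other = SimpleNamespace(device=&quot;/dev/ttyUSB0&quot;, hwid=&quot;USB VID:PID=0403:6001&quot;)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;With these stand-ins, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;accept(teensy)&lt;/code&gt; returns the port object and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;accept(other)&lt;/code&gt; returns &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;None&lt;/code&gt;, so the next candidate script would get its turn.&lt;/p&gt;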
&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;connect()&lt;/code&gt; establishes the raw connection to the device and returns a serializer for it. The serializer has a &lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang-pycfg/blob/v0.0.6/.share/ez/repl/serialize.py&quot; target=&quot;_blank&quot;&gt;standard implementation&lt;/a&gt; that can typically be reused (&lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang-pycfg/blob/v0.0.6/teensylc/serial.py#L122&quot; target=&quot;_blank&quot;&gt;example for TeensyLC&lt;/a&gt;).&lt;/p&gt;
&lt;div class=&quot;language-py highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Any&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ez_clang_api&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ez_clang_api&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Device&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ez&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;repl&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Serializer&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;setup()&lt;/code&gt; configures the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ez_clang_api.Device&lt;/code&gt; and adds it to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ez_clang_api.Host&lt;/code&gt;. This is the core function for device configuration. We read the setup message from the device, initialize endpoints, set CPU, target triple and code buffer (memory range available for JITed code) as well as header search paths and compiler flags. Examples: &lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang-pycfg/blob/v0.0.6/teensylc/serial.py#L134&quot; target=&quot;_blank&quot;&gt;TeensyLC&lt;/a&gt;, &lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang-pycfg/blob/v0.0.6/due/serial.py#L84&quot; target=&quot;_blank&quot;&gt;Arduino Due&lt;/a&gt;, &lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang-pycfg/blob/v0.0.6/raspi32/socket.py#L111&quot; target=&quot;_blank&quot;&gt;Raspberry Pi (Socket)&lt;/a&gt;, &lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang-pycfg/blob/v0.0.6/lm3s811/qemu.py#L95&quot; target=&quot;_blank&quot;&gt;LM3S811 (QEMU)&lt;/a&gt;.&lt;/p&gt;
&lt;div class=&quot;language-py highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;setup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stream&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ez&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;repl&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Serializer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ez_clang_api&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ez_clang_api&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Device&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;call()&lt;/code&gt; allows invoking the RPC function &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;endpoint&lt;/code&gt; on the device with the parameters in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;data&lt;/code&gt;. As in the last release, &lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang-pycfg/blob/v0.0.6/.share/ez/repl/__init__.py#L223-L228&quot; target=&quot;_blank&quot;&gt;built-in endpoints&lt;/a&gt; are &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lookup&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;commit&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;execute&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memory.read.cstr&lt;/code&gt; (see &lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang/blob/v0.0.5/release/0.0.5/docs/rpc.md&quot; target=&quot;_blank&quot;&gt;binary interface docs&lt;/a&gt;).&lt;/p&gt;
&lt;div class=&quot;language-py highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;endpoint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;dict&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;disconnect()&lt;/code&gt; shuts down the session and closes the device connection.&lt;/p&gt;
&lt;div class=&quot;language-py highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;disconnect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;host-and-device-interfaces&quot;&gt;Host and Device interfaces&lt;/h3&gt;

&lt;p&gt;These interfaces describe specific properties of the host and the device, respectively. Both are implemented in C++ inside ez-clang and are not formally documented yet. For testing, however, we &lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang-pycfg/blob/main/.share/ez_clang_api/__init__.py&quot; target=&quot;_blank&quot;&gt;mock these types&lt;/a&gt;, and the mocks can serve as informal documentation for now.&lt;/p&gt;

&lt;h3 id=&quot;install&quot;&gt;Install&lt;/h3&gt;

&lt;p&gt;Clone the repository and install dependencies:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; git clone https://github.com/echtzeit-dev/ez-clang-pycfg
&amp;gt; python3 --version
Python 3.8.10
&amp;gt; cd ez-clang-pycfg
&amp;gt; python3 -m pip install -e .share
&amp;gt; python3 -m pip install -r requirements.txt
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;testing&quot;&gt;Testing&lt;/h3&gt;

&lt;p&gt;One major benefit of pycfg is that connectivity testing and debugging can be done in Python. This is a lot easier and quicker than C++ and it allows for better isolation. Each target device implementation comes with a test suite that covers all relevant connectivity features (example for &lt;a href=&quot;https://github.com/echtzeit-dev/ez-clang-pycfg/tree/v0.0.6/due/test&quot; target=&quot;_blank&quot;&gt;Arduino Due&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_all.py&lt;/code&gt; script is for batch testing. It uploads a fresh firmware and runs all tests in sequence:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; python3 due/test/run_all.py
Found compatible device at /dev/ttyACM2
Device unique identifier: 7503130343135130F0A0
Uploading firmware: /usr/lib/ez-clang/0.0.6/firmware/due/firmware.bin
Running tests from /usr/lib/ez-clang/0.0.6/pycfg/due/test
Selecting 8 out of 9 discovered tests
  [basics] connect ........................................ 4.51s
  [basics] setup .......................................... 4.52s
  [basics] connect_setup_repeat ........................... 9.02s
  [endpoints] ez.rpc.lookup ............................... 9.07s
  [endpoints] ez.rpc.commit ............................... 5.46s
  [endpoints] ez.rpc.execute .............................. 5.02s
  [endpoints] memory.read.cstr ............................ 5.43s
  [recovery] replace_firmware ............................. 19.96s

Testing Time: 88.65s
  Disabled: 1
  Excluded: 0
  Failed  : 0
  Passed  : 8

SUCCESS
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The individual tests are self-contained and can be executed as standalone Python scripts. The output shows the serialized RPC traffic:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; python3 due/test/00-basics/01-connect.py
Found compatible device at /dev/ttyACM2
Connect &amp;lt;-
  6a 00 00 00 00 00 00 00
  00 00 00 00 00 00 00 00
  00 00 00 00 00 00 00 00
  00 00 00 00 00 00 00 00
  05 00 00 00 00 00 00 00 30 2e 30 2e 35 50 1a 07 20 00 00 00 00
  b0 61 01 00 00 00 00 00 01 00 00 00 00 00 00 00 15 00 00 00 00
  00 00 00 5f 5f 65 7a 5f 63 6c 61 6e 67 5f 72 70 63 5f 6c 6f 6f
  6b 75 70 19 05 08 00 00 00 00 00
Disconnect -&amp;gt;
  20 00 00 00 00 00 00 00
  01 00 00 00 00 00 00 00
  01 00 00 00 00 00 00 00
  00 00 00 00 00 00 00 00
Disconnect &amp;lt;-
  21 00 00 00 00 00 00 00
  01 00 00 00 00 00 00 00
  00 00 00 00 00 00 00 00
  00 00 00 00 00 00 00 00
  00
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
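
&lt;p&gt;Judging from the dump above, every message appears to start with four 8-byte little-endian fields, the first of which matches the total message length (0x6a = 106 bytes for the connect message, 0x21 = 33 bytes for the disconnect reply). The following sketch decodes such a header. The field names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;opcode&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;seq_id&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tag_addr&lt;/code&gt; are assumptions; the binary interface docs are the authoritative reference.&lt;/p&gt;
&lt;div class=&quot;language-py highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;HEADER_SIZE = 32  # four 8-byte little-endian fields


def parse_message(raw):
    # Decode the four header fields; field names are assumptions.
    fields = [int.from_bytes(raw[i:i + 8], &quot;little&quot;) for i in range(0, HEADER_SIZE, 8)]
    size, opcode, seq_id, tag_addr = fields
    if size != len(raw):
        raise ValueError(f&quot;size field says {size} bytes, got {len(raw)}&quot;)
    return {&quot;size&quot;: size, &quot;opcode&quot;: opcode, &quot;seq_id&quot;: seq_id,
            &quot;tag_addr&quot;: tag_addr, &quot;payload&quot;: raw[HEADER_SIZE:]}


# The disconnect reply from the dump above.
reply = bytes.fromhex(
    &quot;21 00 00 00 00 00 00 00 &quot;
    &quot;01 00 00 00 00 00 00 00 &quot;
    &quot;00 00 00 00 00 00 00 00 &quot;
    &quot;00 00 00 00 00 00 00 00 &quot;
    &quot;00&quot;
)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;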

&lt;h3 id=&quot;run-in-ez-clang&quot;&gt;Run in ez-clang&lt;/h3&gt;

&lt;p&gt;In order to use new or modified scripts with ez-clang, just mount the repo folder in docker with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-v&lt;/code&gt; parameter:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; docker run -it -v $(pwd)/ez-clang-pycfg:/lib/ez-clang/0.0.6/pycfg echtzeit/ez-clang:0.0.6 --connect=raspi32@192.168.1.105:10819
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;debugging-in-ez-clang&quot;&gt;Debugging in ez-clang&lt;/h3&gt;

&lt;p&gt;We can also debug scripts when they run in ez-clang. For that we have &lt;a href=&quot;https://github.com/microsoft/debugpy&quot; target=&quot;_blank&quot;&gt;debugpy&lt;/a&gt; hooks at the start of each device script:&lt;/p&gt;
&lt;div class=&quot;language-py highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ez_clang_api&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Host&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;debugPython&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;__debug__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;debugpy&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;debugpy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;listen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&apos;0.0.0.0&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5678&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ez&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;io&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;note&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Python API waiting for debugger. Attach to 0.0.0.0:5678 to proceed.&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;debugpy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;wait_for_client&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;debugpy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;breakpoint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;They get enabled by passing the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--rpc-debug-python&lt;/code&gt; flag to ez-clang. Additionally we have to forward the debug port from the docker container to the host with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-p&lt;/code&gt; parameter:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; docker run --rm -p 5678:5678 -it echtzeit/ez-clang:0.0.6 --connect=raspi32@192.168.1.105:10819 --rpc-debug-python
Welcome to ez-clang, your friendly remote C++ REPL. Type `.q` to exit.
Python API waiting for debugger. Attach to 0.0.0.0:5678 to proceed.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now any appropriate debugger should be able to attach, e.g. VS Code with a launch configuration like this:&lt;/p&gt;
&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;ez-clang-pycfg attach&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;python&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;request&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;attach&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;connect&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;host&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;localhost&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;port&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5678&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;pathMappings&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;localRoot&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;${workspaceFolder}/ez-clang-pycfg&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;remoteRoot&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;/usr/lib/ez-clang/0.0.6/pycfg&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;voilà&quot;&gt;Voilà!&lt;/h3&gt;

&lt;p&gt;&lt;img src=&quot;https://weliveindetail.github.io/blog/res/ez-clang-pycfg-debug.png&quot; alt=&quot;ez-clang-pycfg-debug&quot; id=&quot;large-image&quot; class=&quot;center&quot; /&gt;&lt;/p&gt;
</description>
				<pubDate>Fri, 03 Feb 2023 22:00:00 +0000</pubDate>
				<link>https://weliveindetail.github.io/blog/post/2023/02/03/ez-clang-pycfg.html</link>
				<guid isPermaLink="true">https://weliveindetail.github.io/blog/post/2023/02/03/ez-clang-pycfg.html</guid>
			</item>
		
			<item>
				<title>GDB JIT Interface 101</title>
				<description>&lt;style&gt;
  #large-image {
    max-width: min(100%, 500px);
  }
  .center {
    display: block;
    margin: 0 auto;
  }
&lt;/style&gt;

&lt;p&gt;&lt;img src=&quot;https://weliveindetail.github.io/blog/res/gdb-jit-register-sequence-diagram.png&quot; alt=&quot;GDB JIT registration sequence diagram&quot; id=&quot;large-image&quot; class=&quot;center&quot; /&gt;&lt;/p&gt;

&lt;p&gt;LLVM’s JIT libraries allow us to link and load static build artifacts at runtime. We can play with object files in a JIT session without having to link them into a static executable. That’s great, as long as everything works as expected.&lt;/p&gt;

&lt;p&gt;If something goes wrong, we face a little extra effort to inspect our code with a debugger. Mainstream debuggers scan executable and shared library files on disk to collect their symbols and debug info. And they know how to intercept shared library events like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dlopen()&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dlclose()&lt;/code&gt;. When we compile and link code in-memory, however, and load it on the fly, there is not much they can do. In order to debug such code we have to collaborate.&lt;/p&gt;

&lt;p&gt;A surprising number of things can go wrong when debugging JITed code. This article explains debug info registration at runtime and offers guidance on what to check when things go south.&lt;/p&gt;

&lt;h3 id=&quot;gdb-jit-interface&quot;&gt;GDB JIT Interface&lt;/h3&gt;

&lt;p&gt;In 2009 &lt;a href=&quot;https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=4efc6507960ac76505ebb1be9886f207ceb46c3a&quot; target=&quot;_blank&quot;&gt;GDB introduced an ingenious way&lt;/a&gt; for executables to register new code at runtime, documented in the &lt;a href=&quot;http://sourceware.org/gdb/onlinedocs/gdb/JIT-Interface.html&quot; target=&quot;_blank&quot;&gt;JIT Compilation Interface docs&lt;/a&gt;. It’s a loose standard with a single version so far. Other debuggers like LLDB picked it up later on. The interface relies on two symbols:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/15.x/llvm/lib/ExecutionEngine/Orc/TargetProcess/JITLoaderGDB.cpp#L24-L53&quot; target=&quot;_blank&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_descriptor&lt;/code&gt;&lt;/a&gt; has type &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jit_descriptor&lt;/code&gt; and implements a linked list of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jit_code_entry&lt;/code&gt; items. Our program adds items here for any new code that the debugger should know about.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/15.x/llvm/lib/ExecutionEngine/Orc/TargetProcess/JITLoaderGDB.cpp#L55-L63&quot; target=&quot;_blank&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_register_code&lt;/code&gt;&lt;/a&gt; is an empty function that our program calls in order to signal the debugger to process new list items.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Debuggers that implement the interface apply special handling for those symbols. At launch they check for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_register_code&lt;/code&gt; symbol and set a breakpoint that triggers the JIT registration hook: When it hits, the debugger walks the list in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_descriptor&lt;/code&gt; for new items, reads them from process memory and extracts debug info.&lt;/p&gt;

&lt;p&gt;An application that uses the interface will add new items to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_descriptor&lt;/code&gt; list whenever it emits new code. Each such item refers to an in-memory object file. Once the list is up to date, the application calls its own &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_register_code&lt;/code&gt; knowing that a debugger might interrupt execution and process the debug sections of the in-memory objects.&lt;/p&gt;
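
&lt;p&gt;The registration sequence described above can be modelled in a few lines. The sketch below mirrors the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jit_code_entry&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jit_descriptor&lt;/code&gt; layouts from the GDB docs with Python’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ctypes&lt;/code&gt;. It is only a simulation of the bookkeeping: a real JIT performs these steps in C or C++ on the actual &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_descriptor&lt;/code&gt; symbol, and no debugger will notice this Python version. The helper names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;register_object&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pending_sizes&lt;/code&gt; as well as the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;notify&lt;/code&gt; callback, which stands in for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_register_code&lt;/code&gt;, are hypothetical.&lt;/p&gt;
&lt;div class=&quot;language-py highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import ctypes

JIT_NOACTION, JIT_REGISTER_FN, JIT_UNREGISTER_FN = 0, 1, 2


class JitCodeEntry(ctypes.Structure):
    pass  # fields are set below because the struct refers to itself


JitCodeEntry._fields_ = [
    (&quot;next_entry&quot;, ctypes.POINTER(JitCodeEntry)),
    (&quot;prev_entry&quot;, ctypes.POINTER(JitCodeEntry)),
    (&quot;symfile_addr&quot;, ctypes.c_void_p),
    (&quot;symfile_size&quot;, ctypes.c_uint64),
]


class JitDescriptor(ctypes.Structure):
    _fields_ = [
        (&quot;version&quot;, ctypes.c_uint32),
        (&quot;action_flag&quot;, ctypes.c_uint32),
        (&quot;relevant_entry&quot;, ctypes.POINTER(JitCodeEntry)),
        (&quot;first_entry&quot;, ctypes.POINTER(JitCodeEntry)),
    ]


def register_object(descriptor, buffer, notify):
    # Describe the in-memory object file.
    entry = JitCodeEntry()
    entry.symfile_addr = ctypes.addressof(buffer)
    entry.symfile_size = ctypes.sizeof(buffer)
    # Link the new entry in at the head of the list.
    entry.next_entry = descriptor.first_entry
    if descriptor.first_entry:
        descriptor.first_entry.contents.prev_entry = ctypes.pointer(entry)
    descriptor.first_entry = ctypes.pointer(entry)
    # Record which entry changed and how, then hit the hook.
    descriptor.relevant_entry = ctypes.pointer(entry)
    descriptor.action_flag = JIT_REGISTER_FN
    notify()
    return entry  # the caller must keep entry and buffer alive


def pending_sizes(descriptor):
    # Walk the list the way a debugger would, collecting object sizes.
    sizes, node = [], descriptor.first_entry
    while node:
        sizes.append(node.contents.symfile_size)
        node = node.contents.next_entry
    return sizes
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;New entries go to the front of the list here, so a walk from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;first_entry&lt;/code&gt; visits the most recently registered object first.&lt;/p&gt;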

&lt;p&gt;There was an attempt to add a &lt;a href=&quot;https://pwparchive.wordpress.com/2011/11/20/new-jit-interface-for-gdb/&quot; target=&quot;_blank&quot;&gt;more advanced JIT interface in GDB&lt;/a&gt; that involves GDB-side plugins, but the approach apparently didn’t gain enough momentum (even &lt;a href=&quot;https://v8.dev/docs/gdb-jit&quot; target=&quot;_blank&quot;&gt;V8&lt;/a&gt; still uses the original interface).&lt;/p&gt;

&lt;h3 id=&quot;lldb-support&quot;&gt;LLDB Support&lt;/h3&gt;

&lt;p&gt;In 2014 LLDB gained an &lt;a href=&quot;https://github.com/llvm/llvm-project/commit/17220c188635721e948cf02d2b6ad36b267ea393&quot; target=&quot;_blank&quot;&gt;initial implementation for the GDB JIT interface&lt;/a&gt; which was first released as part of LLDB 3.5. However, the feature &lt;a href=&quot;https://github.com/llvm/llvm-project/issues/35557&quot; target=&quot;_blank&quot;&gt;silently broke during the 6.0 development cycle&lt;/a&gt; since we had no good test for it. I managed to fix it a little later with two patches: The original &lt;a href=&quot;https://reviews.llvm.org/rGf0ee69f75d61dc5c2ad476536ef695a50b320a6e&quot; target=&quot;_blank&quot;&gt;registration bug in 2019&lt;/a&gt; and &lt;a href=&quot;https://reviews.llvm.org/rG203b4774b88322de22c99881a3e1e4c78a9d5a0e&quot; target=&quot;_blank&quot;&gt;source-level debugging in 2020&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;LLDB has fully supported the GDB JIT interface for ELF object files &lt;a href=&quot;https://weliveindetail.github.io/blog/post/2021/04/19/lldb-12-jit-interface.html&quot;&gt;again since release 12&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;llvm-jit-support&quot;&gt;LLVM JIT support&lt;/h3&gt;

&lt;p&gt;At the time of writing, LLVM comes with two different JIT-linker implementations: RuntimeDyld and JITLink. RuntimeDyld was designed as a dynamic loader for LLVM’s monolithic MCJIT implementation. As it grew into a fully fledged cross-platform JIT-linker, it surpassed the limits of its architecture and became increasingly hard to maintain and extend (for different code models, EH registration, TLS support, etc.). With the rise of LLVM’s composable ORC JIT libraries, JITLink came up as a replacement for RuntimeDyld and today it is set to become the default JIT-linker for most LLVM target platforms.&lt;/p&gt;

&lt;p&gt;RuntimeDyld provides a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;JITEventListener&lt;/code&gt; interface for clients to run actions when new code is loaded or unloaded. Debug object registration is implemented in the &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/15.x/llvm/lib/ExecutionEngine/GDBRegistrationListener.cpp&quot; target=&quot;_blank&quot;&gt;GDBRegistrationListener&lt;/a&gt;. JITLink provides a much more comprehensive &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/15.x/llvm/include/llvm/ExecutionEngine/Orc/ObjectLinkingLayer.h#L60-L95&quot; target=&quot;_blank&quot;&gt;plugin interface&lt;/a&gt;, which allows clients to hook into the linking process at various stages. In early 2021, I implemented a simple &lt;a href=&quot;https://reviews.llvm.org/rGef2389235c5dec03be93f8c9585cd9416767ef4c&quot; target=&quot;_blank&quot;&gt;DebugObjectManagerPlugin for JITLink&lt;/a&gt; that works for ELF objects in both cases, in-process and &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/15.x/llvm/examples/OrcV2Examples/LLJITWithRemoteDebugging/LLJITWithRemoteDebugging.cpp&quot; target=&quot;_blank&quot;&gt;out-of-process JITing&lt;/a&gt;. Later that year the &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/15.x/llvm/include/llvm/ExecutionEngine/Orc/DebuggerSupportPlugin.h&quot; target=&quot;_blank&quot;&gt;GDBJITDebugInfoRegistrationPlugin&lt;/a&gt; for MachO support landed upstream.&lt;/p&gt;

&lt;h3 id=&quot;in-memory-object-files&quot;&gt;In-memory object files&lt;/h3&gt;

&lt;p&gt;The LLVM JIT libraries work with position-independent code. In principle, relocations for in-memory objects can be resolved on either side, the JIT or the debugger. It appears reasonable, however, to leave it to the debugger, because debug sections contain loads of relocations and the debugger can defer the task until it eventually needs to access the data. In order to resolve relocations, the debugger must know the load address in target memory for each allocated section. For that purpose the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DebugObjectManagerPlugin&lt;/code&gt; &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/main/llvm/lib/ExecutionEngine/Orc/DebugObjectManagerPlugin.cpp#L440-L445&quot; target=&quot;_blank&quot;&gt;collects section load addresses at link-time&lt;/a&gt; and &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/main/llvm/lib/ExecutionEngine/Orc/DebugObjectManagerPlugin.cpp#L67-L73&quot; target=&quot;_blank&quot;&gt;writes them to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sh_addr&lt;/code&gt; field&lt;/a&gt; in the respective section headers (and leaves the object untouched otherwise).&lt;/p&gt;
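
&lt;p&gt;To illustrate what writing a load address into a section header means at the byte level, here is a small sketch that patches the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sh_addr&lt;/code&gt; field of one section header in a little-endian ELF64 image. It is not the plugin’s code (which is C++ and works on LLVM’s ELF types); the function names are hypothetical and the offsets are the standard ELF64 header and section header layout.&lt;/p&gt;
&lt;div class=&quot;language-py highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;E_SHOFF, E_SHENTSIZE, E_SHNUM = 0x28, 0x3A, 0x3C  # offsets in the ELF64 header
SH_ADDR = 0x10                                    # offset within a section header


def read_int(buf, offset, width):
    return int.from_bytes(buf[offset:offset + width], &quot;little&quot;)


def set_section_load_address(obj, index, load_addr):
    # obj is a bytearray holding a little-endian ELF64 object file.
    if bytes(obj[:4]) != b&quot;\x7fELF&quot; or obj[4] != 2 or obj[5] != 1:
        raise ValueError(&quot;not a little-endian ELF64 image&quot;)
    shoff = read_int(obj, E_SHOFF, 8)          # start of the section header table
    shentsize = read_int(obj, E_SHENTSIZE, 2)  # size of one section header
    shnum = read_int(obj, E_SHNUM, 2)          # number of section headers
    if index not in range(shnum):
        raise IndexError(f&quot;no section header {index}&quot;)
    field = shoff + index * shentsize + SH_ADDR
    # Only the sh_addr field changes; the rest of the object stays untouched.
    obj[field:field + 8] = load_addr.to_bytes(8, &quot;little&quot;)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;A debugger that later reads the patched headers finds each section’s address in target memory and can apply the remaining relocations itself.&lt;/p&gt;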

&lt;h3 id=&quot;troubleshooting&quot;&gt;Troubleshooting&lt;/h3&gt;

&lt;p&gt;When debugging JITed code doesn’t work as expected, it helps to have a known-good baseline. We will use a minimal test executable to compare results against:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;cp&quot;&gt;#include&lt;/span&gt; &lt;span class=&quot;cpf&quot;&gt;&quot;stdint.h&quot;&lt;/span&gt;&lt;span class=&quot;cp&quot;&gt;
&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;jit_descriptor&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;kt&quot;&gt;uint32_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;version&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;kt&quot;&gt;uint32_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;action_flag&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;relevant_entry&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;first_entry&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;// External global symbol for debuggers to obtain debug info at runtime.&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;jit_descriptor&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__jit_debug_descriptor&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;// Debuggers put a special breakpoint in this function. The noinline and the asm&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// prevent calls to this function from being optimized out.&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;__attribute__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;noinline&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__jit_debug_register_code&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;asm&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;volatile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:::&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;memory&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;__jit_debug_register_code&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// Trap into the debugger&apos;s JIT registration&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h4 id=&quot;1-are-we-running-lldb-version-12&quot;&gt;1. Are we running LLDB version 12+?&lt;/h4&gt;

&lt;p&gt;Check the version and upgrade to a more recent one if necessary:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; lldb --version
lldb version 15.0.2
&amp;gt; lldb
(lldb) version
lldb version 15.0.2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h4 id=&quot;2-does-our-platform-disable-the-jit-debug-hook-by-default&quot;&gt;2. Does our platform disable the JIT debug hook by default?&lt;/h4&gt;

&lt;p&gt;Apple platforms prefer to &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/15.x/lldb/source/Plugins/JITLoader/GDB/JITLoaderGDB.cpp#L418&quot; target=&quot;_blank&quot;&gt;disable the JIT debug hook&lt;/a&gt; in LLDB by default, while everyone else wants it enabled by default. As a compromise, the &lt;a href=&quot;https://reviews.llvm.org/D57689&quot; target=&quot;_blank&quot;&gt;setting is now platform-dependent&lt;/a&gt;, so reading it doesn’t tell us whether the hook is actually active:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;(lldb) settings show plugin.jit-loader.gdb.enable
plugin.jit-loader.gdb.enable (enum) = default
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;When in doubt, turn it &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;on&lt;/code&gt; explicitly:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;(lldb) settings set plugin.jit-loader.gdb.enable on
(lldb) settings show plugin.jit-loader.gdb.enable
plugin.jit-loader.gdb.enable (enum) = on
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h4 id=&quot;3-print-logs-from-the-jit-category-to-see-whats-going-on&quot;&gt;3. Print logs from the JIT category to see what’s going on&lt;/h4&gt;

&lt;p&gt;With the JIT log category enabled, LLDB dumps status info for the important steps. For our demo executable, the output should look like this:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;(lldb) b main
(lldb) log enable lldb jit
(lldb) r
lldb-15          JITLoaderGDB::SetJITBreakpoint looking for JIT register hook
lldb-15          JITLoaderGDB::SetJITBreakpoint looking for JIT register hook
lldb-15          JITLoaderGDB::SetJITBreakpoint looking for JIT register hook
lldb-15          JITLoaderGDB::SetJITBreakpoint setting JIT breakpoint
Process 2363705 launched: &apos;/path/to/test&apos; (x86_64)
Process 2363705 stopped
* thread #1, name = &apos;test&apos;, stop reason = breakpoint 1.1
    frame #0: 0x000055555555514f test`main at test.cpp:51:3
   48   } // extern &quot;C&quot;
   49
   50   int main() {
-&amp;gt; 51     __jit_debug_register_code();
   52     return 0;
   53   }
(lldb) n
intern-state     JITLoaderGDB::JITDebugBreakpointHit hit JIT breakpoint
Process 2363705 stopped
* thread #1, name = &apos;test&apos;, stop reason = step over
    frame #0: 0x0000555555555154 test`main at test.cpp:52:3
   49
   50   int main() {
   51     __jit_debug_register_code();
-&amp;gt; 52     return 0;
   53   }
(lldb) c
Process 2363705 resuming
Process 2363705 exited with status = 0 (0x00000000)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h4 id=&quot;4-can-lldb-see-our-jit-interface-symbols&quot;&gt;4. Can LLDB see our JIT interface symbols?&lt;/h4&gt;

&lt;p&gt;For our demo executable, the symbols are in the symbol table of the binary:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; nm test | grep __jit_debug_
0000000000004028 D __jit_debug_descriptor
0000000000001130 T __jit_debug_register_code
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In LLDB we can look them up right away:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;(lldb) target create &quot;test&quot;
Current executable set to &apos;/path/to/test&apos; (x86_64).
(lldb) image lookup -s __jit_debug_descriptor
1 symbols match &apos;__jit_debug_descriptor&apos; in /path/to/test:
        Address: test[0x0000000000004028] (test.PT_LOAD[3]..data + 16)
        Summary: __jit_debug_descriptor
(lldb) image lookup -s __jit_debug_register_code
1 symbols match &apos;__jit_debug_register_code&apos; in /path/to/test:
        Address: test[0x0000000000001130] (test.PT_LOAD[1]..text + 240)
        Summary: test`::__jit_debug_register_code() at test.cpp:40
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;It’s not always that easy though!&lt;/strong&gt; In &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lli&lt;/code&gt; from the official LLVM 15 release, the symbols live in the LLVM shared library that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lli&lt;/code&gt; links against. This is why we won’t find them in LLDB until the executable has launched and resolved its load-time libraries:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; lldb-15 -- lli-15
(lldb) target create &quot;lli-15&quot;
Current executable set to &apos;/usr/bin/lli-15&apos; (x86_64).
(lldb) image lookup -s __jit_debug_descriptor
(lldb) image lookup -s __jit_debug_register_code
(lldb) b main
Breakpoint 1: where = lli-15`main, address = 0x00000000000169c0
(lldb) r
Process 2364389 launched: &apos;/usr/bin/lli-15&apos; (x86_64)
Process 2364389 stopped
* thread #1, name = &apos;lli-15&apos;, stop reason = breakpoint 1.1
    frame #0: 0x000055555556a9c0 lli-15`main
lli-15`main:
-&amp;gt;  0x55555556a9c0 &amp;lt;+0&amp;gt;: push   rbp
    0x55555556a9c1 &amp;lt;+1&amp;gt;: push   r15
    0x55555556a9c3 &amp;lt;+3&amp;gt;: push   r14
    0x55555556a9c5 &amp;lt;+5&amp;gt;: push   r13
(lldb) image lookup -s __jit_debug_descriptor
1 symbols match &apos;__jit_debug_descriptor&apos; in /usr/lib/llvm-15/lib/libLLVM-15.so.1:
        Address: libLLVM-15.so.1[0x0000000006f0e1f8] (libLLVM-15.so.1.PT_LOAD[1]..data + 255592)
        Summary: __jit_debug_descriptor
(lldb) image lookup -s __jit_debug_register_code
1 symbols match &apos;__jit_debug_register_code&apos; in /usr/lib/llvm-15/lib/libLLVM-15.so.1:
        Address: libLLVM-15.so.1[0x0000000002a54420] (libLLVM-15.so.1.PT_LOAD[0]..text + 30459296)
        Summary: libLLVM-15.so.1`__jit_debug_register_code
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Moreover, shared libraries contain “undefined” &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;U&lt;/code&gt; records for external symbols, and this triggered a &lt;a href=&quot;https://github.com/llvm/llvm-project/issues/56085&quot; target=&quot;_blank&quot;&gt;bug in LLDB’s JITLoaderGDB&lt;/a&gt;. If we linked against such a shared library and LLDB encountered it before the one with the actual definition, LLDB failed to resolve &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_register_code&lt;/code&gt;. The issue &lt;a href=&quot;https://reviews.llvm.org/D138750&quot; target=&quot;_blank&quot;&gt;should be fixed&lt;/a&gt; in the upcoming LLDB release 16. We can stop in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main()&lt;/code&gt; and look up the symbol manually to check for this case, e.g. in a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BUILD_SHARED_LIBS&lt;/code&gt; build of mainline LLVM:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; lldb-15 -- /path/to/llvm-project/build/bin/lli
(lldb) b main
(lldb) run --version
(lldb) image lookup -s __jit_debug_register_code
1 symbols match &apos;__jit_debug_register_code&apos; in /path/to/llvm-project/build/lib/libLLVMExecutionEngine.so.16git:
        Name: __jit_debug_register_code
        Value: 0x0000000000000000

1 symbols match &apos;__jit_debug_register_code&apos; in /path/to/llvm-project/build/lib/libLLVMOrcTargetProcess.so.16git:
        Address: libLLVMOrcTargetProcess.so.16git[0x000000000000fa5a] (libLLVMOrcTargetProcess.so.16git.PT_LOAD[1]..text + 20058)
        Summary: libLLVMOrcTargetProcess.so.16git`__jit_debug_register_code
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
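
&lt;p&gt;The same situation can be cross-checked outside the debugger. As a minimal sketch, assuming the library paths from the build above and a binutils-style &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;nm&lt;/code&gt;: the library that only references the symbol should list it as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;U&lt;/code&gt; in its dynamic symbol table, while the one that defines it should list it as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;T&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; nm -D /path/to/llvm-project/build/lib/libLLVMExecutionEngine.so.16git | grep __jit_debug_register_code
&amp;gt; nm -D /path/to/llvm-project/build/lib/libLLVMOrcTargetProcess.so.16git | grep __jit_debug_register_code
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;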

&lt;h4 id=&quot;6-does-our-jited-code-contain-debug-info&quot;&gt;6. Does our JITed code contain debug info?&lt;/h4&gt;

&lt;p&gt;We can check for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DI&lt;/code&gt; metadata in LLVM IR code and for debug sections in object files:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; clang -S -emit-llvm -g -o - test_foo.c | grep &quot;\!DI&quot; | wc -l
9
&amp;gt; clang -S -emit-llvm -o - test_foo.c | grep &quot;\!DI&quot; | wc -l
0
&amp;gt; clang -c -g -o - test_foo.c | llvm-objdump -h - | grep debug | wc -l
11
&amp;gt; clang -c -o - test_foo.c | llvm-objdump -h - | grep debug | wc -l
0
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With debug info, we can break on both functions and source locations. Without it, we can only break on functions and only get disassembly:&lt;/p&gt;
&lt;div class=&quot;language-diff highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; 1 location added to breakpoint 1
 Process 1625659 stopped
 * thread #1, name = &apos;lli&apos;, stop reason = breakpoint 1.1
&lt;span class=&quot;gd&quot;&gt;-    frame #0: 0x00007ffff70d0004 JIT(0x56e4e0)`foo
-JIT(0x56e4e0)`foo:
--&amp;gt;  0x7ffff70d1000 &amp;lt;+0&amp;gt;: mov    eax, 0x2a
-    0x7ffff70d1005 &amp;lt;+5&amp;gt;: ret
-    0x7ffff70d1006:      add    byte ptr [rax], al
-    0x7ffff70d1008:      add    byte ptr [rax], al
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+    frame #0: 0x00007ffff70d0004 JIT(0x56e4e0)`foo at test_foo.c:4:10
+   1    static int some_value = 42;
+   2
+   3    int foo() {
+-&amp;gt; 4      return some_value;
+   5    }
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h4 id=&quot;7-does-llvm-orcjit-support-debugging-for-your-platform&quot;&gt;7. Does LLVM OrcJIT support debugging for your platform?&lt;/h4&gt;

&lt;p&gt;As of 2022 the &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/15.x/llvm/include/llvm/ExecutionEngine/Orc/DebugObjectManagerPlugin.h&quot; target=&quot;_blank&quot;&gt;DebugObjectManagerPlugin&lt;/a&gt; covers ELF on all matching platforms. (I never tested it outside of 64-bit x86 systems but it might actually work!) There are upstream tests in &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/15.x/llvm/test/ExecutionEngine/OrcLazy/debug-objects-elf-minimal.ll&quot; target=&quot;_blank&quot;&gt;LLVM&lt;/a&gt; and &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/15.x/lldb/test/Shell/Breakpoint/jit-loader_jitlink_elf.test&quot; target=&quot;_blank&quot;&gt;LLDB&lt;/a&gt; and the plugin is wired up in both tools, &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/15.x/llvm/tools/lli/lli.cpp#L948-949&quot; target=&quot;_blank&quot;&gt;lli&lt;/a&gt; and &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/15.x/llvm/tools/llvm-jitlink/llvm-jitlink.cpp#L1026-1027&quot; target=&quot;_blank&quot;&gt;llvm-jitlink&lt;/a&gt;. There is an example for debugging &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/15.x/llvm/examples/OrcV2Examples/LLJITWithRemoteDebugging/LLJITWithRemoteDebugging.cpp&quot; target=&quot;_blank&quot;&gt;out-of-process&lt;/a&gt; and a &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/15.x/llvm/test/Examples/OrcV2Examples/lljit-with-remote-debugging.test&quot; target=&quot;_blank&quot;&gt;test for it&lt;/a&gt; upstream.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/15.x/llvm/include/llvm/ExecutionEngine/Orc/DebuggerSupportPlugin.h&quot; target=&quot;_blank&quot;&gt;GDBJITDebugInfoRegistrationPlugin&lt;/a&gt; implements debug support for MachO on Apple systems. Right now it doesn’t appear to be tested and it’s only wired up in &lt;a href=&quot;https://github.com/llvm/llvm-project/blob/release/15.x/llvm/tools/llvm-jitlink/llvm-jitlink.cpp#L986-987&quot; target=&quot;_blank&quot;&gt;llvm-jitlink&lt;/a&gt;.&lt;/p&gt;
</description>
				<pubDate>Sun, 27 Nov 2022 10:00:00 +0000</pubDate>
				<link>https://weliveindetail.github.io/blog/post/2022/11/27/gdb-jit-interface-101.html</link>
				<guid isPermaLink="true">https://weliveindetail.github.io/blog/post/2022/11/27/gdb-jit-interface-101.html</guid>
			</item>
		
			<item>
				<title>EuroLLVM 2022 Trip Report</title>
				<description>&lt;style&gt;
  #large-image {
    max-width: min(100%, 500px);
  }
  #badge-image {
    max-width: min(100%, 300px);
  }
  div.youtubeWrapper {
    width: 100%;
    height: auto;
    max-width: min(100%, 500px);
  }
  div.youtubeWrapper &gt; iframe {
    width: 100%;
  }
  .center {
    display: block;
    margin: 0 auto;
  }
&lt;/style&gt;

&lt;p&gt;&lt;img src=&quot;https://weliveindetail.github.io/blog/res/eurollvm2022-logo.png&quot; alt=&quot;EuroLLVM 2022 Logo&quot; id=&quot;large-image&quot; class=&quot;center&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://llvm.swoogo.com/2022eurollvm&quot; target=&quot;_blank&quot;&gt;EuroLLVM conference&lt;/a&gt; happened again in person last week in London. It was a blast and two very intense days for me. Even though preparations were slightly last-minute, it was totally worth all the effort. After a two-year break, it was great to see old and new colleagues again in person and at the same time meet so many new developers who joined the community in the meantime.&lt;/p&gt;

&lt;p&gt;The collective expertise in the audience remains the outstanding quality of the LLVM conferences. Presentations, discussions and round tables are on a level of technical excellence unequaled by any other conference I’ve seen. All recordings from the conference are published on the &lt;a href=&quot;https://www.youtube.com/c/LLVMPROJ/search?query=2022%20EuroLLVM&quot; target=&quot;_blank&quot;&gt;LLVM YouTube channel&lt;/a&gt;. Here is a summary of my personal highlights from the conference.&lt;/p&gt;

&lt;h3 id=&quot;finding-missed-optimizations-through-the-lens-of-dead-code-elimination&quot;&gt;Finding Missed Optimizations Through the Lens of Dead Code Elimination&lt;/h3&gt;

&lt;p&gt;Day 2 started with a keynote that made big waves. &lt;a href=&quot;https://thetheodor.github.io/&quot; target=&quot;_blank&quot;&gt;Theodoros Theodoridis from ETH Zürich&lt;/a&gt; presented a novel approach that uses dead code elimination as an indicator for misbehaviors in optimization passes: &lt;a href=&quot;https://www.youtube.com/watch?v=jD0WDB5bFPo&quot; target=&quot;_blank&quot;&gt;recording&lt;/a&gt;, &lt;a href=&quot;https://llvm.org/devmtg/2022-05/slides/2022EuroLLVM-FindingMissedOptimizations.pdf&quot; target=&quot;_blank&quot;&gt;slides&lt;/a&gt;, &lt;a href=&quot;https://llvm.swoogo.com/2022eurollvm/session/861298/finding-missed-optimizations-through-the-lens-of-dead-code-elimination&quot; target=&quot;_blank&quot;&gt;abstract&lt;/a&gt;. The tool augments source input with extra function calls and measures success by checking whether these calls are still present in compiled output. Comparing results between different versions of Clang is a great way to find regressions. Furthermore, the approach is not concerned with details like intermediate representations and thus also allows comparing against other compilers like GCC.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://twitter.com/weliveindetail/status/1524304422414819328&quot; target=&quot;_blank&quot;&gt;&lt;img src=&quot;https://weliveindetail.github.io/blog/res/eurollvm2022-twitter.png&quot; alt=&quot;Twitter Post&quot; id=&quot;large-image&quot; class=&quot;center&quot; /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That said, the potential for automation is amazing: A script can drive generation of input with &lt;a href=&quot;https://github.com/csmith-project/csmith&quot; target=&quot;_blank&quot;&gt;Csmith&lt;/a&gt;, instrumentation, compilation and result comparison. Differences indicate misbehaviors and they get checked against a database of known and previously found issues, before the input is reduced with &lt;a href=&quot;https://github.com/csmith-project/creduce&quot; target=&quot;_blank&quot;&gt;C-Reduce&lt;/a&gt;. Outputs are almost complete bug reports, all without human intervention!&lt;/p&gt;

&lt;h3 id=&quot;ez-clang-c-repl-for-bare-metal-embedded-devices&quot;&gt;ez-clang C++ REPL for bare-metal embedded devices&lt;/h3&gt;

&lt;p&gt;Naturally, presenting my very own project was a highlight for me as well: &lt;a href=&quot;https://www.youtube.com/watch?v=_qYqEYh1nHE&quot; target=&quot;_blank&quot;&gt;recording&lt;/a&gt;, &lt;a href=&quot;https://github.com/weliveindetail/talks/blob/master/EuroLLVM22-ez-clang.pdf&quot; target=&quot;_blank&quot;&gt;slides&lt;/a&gt;, &lt;a href=&quot;https://llvm.swoogo.com/2022eurollvm/session/861310/ez-clang-c++-repl-for-bare-metal-embedded-devices&quot; target=&quot;_blank&quot;&gt;abstract&lt;/a&gt;. It was also a first for me in terms of setup. When doing a live demo with an &lt;a href=&quot;https://echtzeit.dev/ez-clang/&quot;&gt;early-stage experimental tool&lt;/a&gt; that closely interacts with external hardware, things can go wrong in any possible way.&lt;/p&gt;

&lt;p&gt;So I had to prepare. I brought my development board, a camera, loads of cables and a backup for everything. When submitting the talk, I asked to get scheduled after a break so I had enough time to set up and double-check. There were still many more details to deal with: the spotlights on stage, reflections in the glass on the table and the resolution of the projector made it hard for the audience to see the LED blink. However, both presentation and demo worked out pretty well! Please find the recording here:&lt;/p&gt;

&lt;div class=&quot;youtubeWrapper center&quot;&gt;
  &lt;iframe src=&quot;https://www.youtube.com/embed/_qYqEYh1nHE?rel=0&amp;amp;showinfo=0&quot; width=&quot;500&quot; height=&quot;280&quot; title=&quot;ez-clang C++ REPL for bare-metal embedded devices&quot; allowfullscreen=&quot;&quot; frameborder=&quot;0&quot;&gt;
  &lt;/iframe&gt;
&lt;/div&gt;

&lt;h3 id=&quot;round-tables&quot;&gt;Round tables&lt;/h3&gt;

&lt;p&gt;Round tables are a great way to gather people for informal discussions on a defined topic. I scheduled &lt;a href=&quot;https://twitter.com/weliveindetail/status/1521109781171351553&quot;&gt;two round tables beforehand&lt;/a&gt;, JIT and Embedded development with LLVM/Clang.&lt;/p&gt;

&lt;p&gt;The JIT round table went as usual: A handful of attendees with very specific use-cases and almost no overlap. We talked about the state of &lt;a href=&quot;https://llvm.org/docs/ORCv2.html&quot; target=&quot;_blank&quot;&gt;ORC&lt;/a&gt;, &lt;a href=&quot;https://llvm.org/docs/JITLink.html&quot; target=&quot;_blank&quot;&gt;JITLink&lt;/a&gt;, &lt;a href=&quot;https://llvm.org/docs/MCJITDesignAndImplementation.html&quot;&gt;MCJIT and RuntimeDyld&lt;/a&gt;. We shared tips and best practices and discussed porting roadmaps from MCJIT to ORCv1 and ORCv2. As always I pointed out the convenience of the &lt;a href=&quot;https://discord.com/channels/636084430946959380/687692371038830597&quot; target=&quot;_blank&quot;&gt;#JIT channel on LLVM’s Discord server&lt;/a&gt; for quick questions and discussions online.&lt;/p&gt;

&lt;p&gt;The Embedded development with LLVM/Clang round table was a surprise for me. The embedded ecosystem is dominated by GCC and so there is usually not too much interest in LLVM anywhere online. On-site at the conference, however, I found myself with 30 people awaiting moderation. Targeted discussions are hard with a group of this size. After a small introduction and general topics, we collected common interests and split up into smaller groups: Clang multilib support, LLDB on embedded devices, size optimizations in Clang and the current state of standard library support. The last two even spawned their own follow-up round tables in the course of the conference.&lt;/p&gt;

&lt;p&gt;I visited a few more round tables, like LLDB which was held by my former colleagues &lt;a href=&quot;https://jonasdevlieghere.com/&quot; target=&quot;_blank&quot;&gt;Jonas&lt;/a&gt; and &lt;a href=&quot;https://se.linkedin.com/in/raphael-isemann-5b350284&quot; target=&quot;_blank&quot;&gt;Raphael&lt;/a&gt;. The dominating topic was debugging of optimized code. The Formal Verification round table discussed SMT solvers and at the Freestanding libc++ round table, we agreed to make one of the libc++ downstream adaptations public in order to identify their potential for upstreaming.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://weliveindetail.github.io/blog/res/eurollvm2022-badge.jpeg&quot; alt=&quot;Speaker Badge&quot; id=&quot;badge-image&quot; class=&quot;center&quot; /&gt;&lt;/p&gt;

&lt;p&gt;All in all the conference was a great experience. As always, the atmosphere is welcoming and attendees diverse. I am very much looking forward to EuroLLVM 2023 in Paris next year!&lt;/p&gt;
</description>
				<pubDate>Sun, 15 May 2022 06:00:00 +0000</pubDate>
				<link>https://weliveindetail.github.io/blog/post/2022/05/15/eurollvm.html</link>
				<guid isPermaLink="true">https://weliveindetail.github.io/blog/post/2022/05/15/eurollvm.html</guid>
			</item>
		
			<item>
				<title>Porting Cling to LLVM 13 and ORCv2</title>
				<description>&lt;style&gt;
  #teaser-image {
    max-width: min(100%, 500px);
  }
  .center {
    display: block;
    margin: 0 auto;
    width: 80vmin;
  }
  .baobab {
    display: block;
    margin: 0 auto;
    width: 80vmin;
    height: 80vmin;
    max-width: 500px;
    max-height: 500px;
    border: 0;
  }
  .flex-grid {
    display: flex;
    flex-direction: row;
    margin-top: -20px;
    margin-bottom: 20px;
  }
  .flex-grid .baobab {
    width: 40vmin;
    height: 50vmin;
    min-width: 250px;
    min-height: 300px;
  }
  @media only screen and (max-width: 500px) {
    .flex-grid {
      flex-direction: column;
    }
  }
&lt;/style&gt;

&lt;p&gt;&lt;img src=&quot;https://weliveindetail.github.io/blog/res/cling-llvm13-orcv2-teaser.png&quot; alt=&quot;cling demo on llvm13 branch&quot; id=&quot;teaser-image&quot; class=&quot;center&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/root-project/cling&quot; target=&quot;_blank&quot;&gt;Cling&lt;/a&gt; is a Clang-based C++ interpreter developed by &lt;a href=&quot;https://root.cern/cling/&quot; target=&quot;_blank&quot;&gt;CERN&lt;/a&gt; as part of the high-energy physics data analysis project &lt;a href=&quot;https://root.cern/&quot; target=&quot;_blank&quot;&gt;ROOT&lt;/a&gt;. It had been built against LLVM 5 since 2015 and was &lt;a href=&quot;https://lists.llvm.org/pipermail/llvm-dev/2020-July/143257.html&quot; target=&quot;_blank&quot;&gt;ported to LLVM 9&lt;/a&gt; last year. With this latest official version, it still uses LLVM’s &lt;a href=&quot;https://reviews.llvm.org/D64609&quot; target=&quot;_blank&quot;&gt;now deprecated ORCv1 JIT libraries&lt;/a&gt;. &lt;a href=&quot;https://llvm.org/docs/ORCv2.html&quot; target=&quot;_blank&quot;&gt;ORCv2&lt;/a&gt; has evolved in parallel to its predecessor since LLVM 7 and introduced a complete redesign of the JIT API. It’s one of the breaking changes in LLVM that Cling hasn’t caught up with yet.&lt;/p&gt;

&lt;p&gt;Updating an external dependency can be a major effort for any project. LLVM bears a high risk here, because the C++ APIs don’t offer any guarantees for stability across release versions. Furthermore, Cling maintains its own set of downstream modifications on &lt;a href=&quot;https://github.com/llvm/llvm-project/commits/release/9.x&quot; target=&quot;_blank&quot;&gt;LLVM’s release/9.x&lt;/a&gt; branch. &lt;a href=&quot;https://weliveindetail.github.io/blog/res/cling-llvm09-baobab.html&quot; target=&quot;_blank&quot;&gt;Here&lt;/a&gt; is an interactive visualization:&lt;/p&gt;

&lt;iframe src=&quot;https://weliveindetail.github.io/blog/res/cling-llvm09-baobab.html&quot; title=&quot;git-baobab: cling downstream changes LLVM 9&quot; class=&quot;baobab&quot;&gt;
  &lt;a href=&quot;https://weliveindetail.github.io/blog/res/cling-llvm09-baobab.html&quot; target=&quot;_blank&quot; class=&quot;baobab&quot;&gt;
    &lt;img src=&quot;https://weliveindetail.github.io/blog/res/cling-llvm09-baobab.png&quot; alt=&quot;git-baobab: cling downstream changes LLVM 9&quot; /&gt;
  &lt;/a&gt;
&lt;/iframe&gt;

&lt;h3 id=&quot;strategy&quot;&gt;Strategy&lt;/h3&gt;

&lt;p&gt;Cling’s downstream patches for LLVM won’t apply anymore if the code was modified upstream. We will get merge conflicts when rebasing them. The majority of conflicts are plain mix-and-match exercises: Find the colliding commit(s) and adopt the change in the downstream patch. However, there will be non-obvious side effects from time to time. We’d be more confident in our conflict resolutions if we could run a smoke test that exercised the modified code. Unfortunately, Cling won’t compile until we have finished rebasing onto the new version &lt;strong&gt;and&lt;/strong&gt; also fixed Cling’s usage of the new API!&lt;/p&gt;

&lt;p&gt;Eventually, we want to bring Cling’s downstream patches to LLVM’s most recent release branch. But the distance from release 9 is huge, because upstream LLVM changes constantly. There are 73k commits between releases 9 and 13:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;➜ git clone https://github.com/weliveindetail/llvm-project
➜ cd llvm-project
➜ git log --oneline $(git merge-base release/9.x release/13.x)..release/13.x | wc -l
73303
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Trying to do this in a single step bears the risk of ending up with runtime failures that will be hard to pin down to a specific patch. Instead, it’s common practice to go from one release version to the next. In each iteration we will follow a simple plan:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Rebase Cling’s downstream patches to the next release.&lt;/strong&gt; Solve each conflict on a best effort basis and make sure the affected LLVM libraries compile.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Build Cling against the new LLVM libraries.&lt;/strong&gt; Each compiler error indicates an API change. For now we fix one after the other in the Cling code. Maybe we have to add further downstream patches in LLVM as well.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Run a smoke test&lt;/strong&gt; once Cling compiles and links. Each bug that didn’t exist in the previous version was likely introduced by us. Fix it and amend the changes to the respective commit(s).&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Sort out the Cling API fixes&lt;/strong&gt; into meaningful commits and track them on a new branch. Don’t skip this! A clean history is key for future iterations.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3 id=&quot;rebase-to-the-next-llvm-release-version&quot;&gt;Rebase to the next LLVM release version&lt;/h3&gt;

&lt;p&gt;In our first iteration we go from &lt;a href=&quot;https://github.com/llvm/llvm-project/commits/release/9.x&quot; target=&quot;_blank&quot;&gt;release/9.x&lt;/a&gt; to &lt;a href=&quot;https://github.com/llvm/llvm-project/commits/release/10.x&quot; target=&quot;_blank&quot;&gt;release/10.x&lt;/a&gt;:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;➜ cd llvm-project
➜ git checkout cling-09
➜ git fetch origin release/9.x release/10.x
➜ git log --oneline release/9.x..HEAD
497d28c58a51 Allow interfaces to operate on in-memory buffers ...
...
5c50e7430981 Enable unicode output on terminals.
➜ git checkout -b cling-10
➜ git rebase release/10.x
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This is mostly straightforward; &lt;a href=&quot;https://github.com/weliveindetail/llvm-project/commits/cling-10&quot; target=&quot;_blank&quot;&gt;here&lt;/a&gt; is the rebased set of commits. There is &lt;a href=&quot;https://github.com/weliveindetail/llvm-project/commit/792e5b9169d005730178ee11c03d14cf7f7f0104&quot; target=&quot;_blank&quot;&gt;one additional patch&lt;/a&gt; that fixes the &lt;a href=&quot;https://llvm.org/docs/CMake.html#llvm-related-variables&quot; target=&quot;_blank&quot;&gt;shared libraries&lt;/a&gt; build in upstream LLVM (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;libLLVMExtensions&lt;/code&gt; didn’t link against &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;libLLVMSupport&lt;/code&gt; and thus missed definitions for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LLVM_ENABLE_ABI_BREAKING_CHECKS&lt;/code&gt;). Quick hacks like this can be useful, because &lt;strong&gt;we can’t fix the world while rebasing&lt;/strong&gt;. Make sure to mark commits as such!&lt;/p&gt;
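
&lt;p&gt;When the rebase stops on a conflict, the usual Git loop applies. A minimal sketch with a placeholder file path (not an actual conflict from this rebase): &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git status&lt;/code&gt; lists the conflicting files, each resolved file gets staged, and then the rebase continues with the next commit. If things go sideways, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git rebase --abort&lt;/code&gt; restores the state from before the rebase.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;➜ git status
➜ git add path/to/resolved/file
➜ git rebase --continue
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;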

&lt;p&gt;Now that our downstream LLVM does compile, we can fix Cling’s usage of the API. This is a &lt;strong&gt;search-intensive task&lt;/strong&gt;. An editor or IDE with reference search and efficient code navigation makes a big difference. I’ve been using VS Code with the &lt;a href=&quot;https://marketplace.visualstudio.com/items?itemName=llvm-vs-code-extensions.vscode-clangd&quot; target=&quot;_blank&quot;&gt;clangd plugin&lt;/a&gt;. It’s good for plain text and regex search. For search in CMake files I use &lt;a href=&quot;https://github.com/weliveindetail/the_silver_searcher/commit/e50ba4de24415f421f3fd945dcadfe14bafcda19&quot; target=&quot;_blank&quot;&gt;my own version of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ag&lt;/code&gt; (The Silver Searcher)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/weliveindetail/cling/commits/cling-10&quot; target=&quot;_blank&quot;&gt;The number of changes&lt;/a&gt; is fairly small. The set includes &lt;a href=&quot;https://github.com/weliveindetail/cling/commit/db5be6e732fa02096843dd35a9bf59c398e4df61&quot; target=&quot;_blank&quot;&gt;another hack&lt;/a&gt; that is useful to &lt;strong&gt;suppress all compiler warnings&lt;/strong&gt; in Cling. Adopting API changes means that we compile and scan through the output over and over again. &lt;a href=&quot;https://weliveindetail.github.io/blog/post/2021/10/20/vscode-hide-warning-markers.html&quot;&gt;Any warning that pops up&lt;/a&gt; on the way adds noise and slows down the process. We can deal with warnings later.&lt;/p&gt;

&lt;p&gt;Once we have tested the intermediate state and sorted out the changes, we can start the next iteration.&lt;/p&gt;

&lt;h3 id=&quot;skipping-release12x&quot;&gt;Skipping release/12.x&lt;/h3&gt;

&lt;p&gt;Sometimes it’s worth skipping a release. In the case of Cling, LLVM 12 is a good candidate, because it had &lt;a href=&quot;https://lists.llvm.org/pipermail/llvm-dev/2020-September/144885.html&quot; target=&quot;_blank&quot;&gt;removed the ORCv1 JIT libraries&lt;/a&gt; (for good reasons) while the ORCv2 API was still quite unstable: &lt;a href=&quot;https://github.com/llvm/llvm-project/llvm/include/llvm/ExecutionEngine/Orc&quot; target=&quot;_blank&quot;&gt;The public ORCv2 API&lt;/a&gt; (left/top) saw &lt;a href=&quot;https://weliveindetail.github.io/blog/res/llvm13-orcv2-api.html?path=llvm-project/llvm/include/llvm/ExecutionEngine&quot; target=&quot;_blank&quot;&gt;4.7K insertions/deletions&lt;/a&gt; during that time! For comparison, it had only &lt;a href=&quot;https://weliveindetail.github.io/blog/res/llvm11-orcv2-api.html?path=llvm-project/llvm/include/llvm/ExecutionEngine&quot; target=&quot;_blank&quot;&gt;2K insertions/deletions&lt;/a&gt; in the LLVM 11 cycle (right/bottom).&lt;/p&gt;

&lt;div class=&quot;flex-grid&quot;&gt;
&lt;iframe src=&quot;https://weliveindetail.github.io/blog/res/llvm13-orcv2-api.html?path=llvm-project/llvm/include/llvm/ExecutionEngine&quot; title=&quot;git-baobab: ORCv2 API changes in LLVM 13&quot; class=&quot;baobab&quot;&gt;
  &lt;a href=&quot;https://weliveindetail.github.io/blog/res/llvm13-orcv2-api.html?path=llvm-project/llvm/include/llvm/ExecutionEngine&quot; target=&quot;_blank&quot; class=&quot;baobab&quot;&gt;
    &lt;img src=&quot;https://weliveindetail.github.io/blog/res/llvm13-orcv2-api.html.png&quot; alt=&quot;git-baobab: ORCv2 API changes in LLVM 13&quot; /&gt;
  &lt;/a&gt;
&lt;/iframe&gt;

&lt;iframe src=&quot;https://weliveindetail.github.io/blog/res/llvm11-orcv2-api.html?path=llvm-project/llvm/include/llvm/ExecutionEngine&quot; title=&quot;git-baobab: ORCv2 API changes in LLVM 11&quot; class=&quot;baobab&quot;&gt;
  &lt;a href=&quot;https://weliveindetail.github.io/blog/res/llvm11-orcv2-api.html?path=llvm-project/llvm/include/llvm/ExecutionEngine&quot; target=&quot;_blank&quot; class=&quot;baobab&quot;&gt;
    &lt;img src=&quot;https://weliveindetail.github.io/blog/res/llvm11-orcv2-api.html.png&quot; alt=&quot;git-baobab: ORCv2 API changes in LLVM 11&quot; /&gt;
  &lt;/a&gt;
&lt;/iframe&gt;
&lt;/div&gt;

&lt;p&gt;There was a moderate risk that reimplementing &lt;a href=&quot;https://github.com/weliveindetail/cling/blob/cling-09/lib/Interpreter/IncrementalJIT.cpp&quot; target=&quot;_blank&quot;&gt;Cling’s IncrementalJIT&lt;/a&gt; for LLVM 12 would cause a large refactor in the next iteration. For me this justified a bigger rebase step from 11 directly to 13.&lt;/p&gt;

&lt;p&gt;In the end, it didn’t matter much, because the basic &lt;a href=&quot;https://github.com/weliveindetail/cling/blob/cling-13/lib/Interpreter/IncrementalJIT_ORCv2.cpp&quot; target=&quot;_blank&quot;&gt;ORCv2 replacement&lt;/a&gt; uses the API on a pretty high level. Looking back, an &lt;strong&gt;alternative strategy&lt;/strong&gt; might have been better: Reimplement the IncrementalJIT based on LLVM 11 (where ORCv1 was still present) and then move on to the next versions as usual.&lt;/p&gt;

&lt;h3 id=&quot;incrementaljit-orcv2-replacement&quot;&gt;IncrementalJIT ORCv2 replacement&lt;/h3&gt;

&lt;p&gt;The rebase plan works very well until Cling’s downstream patches &lt;a href=&quot;https://github.com/weliveindetail/llvm-project/commits/cling-13&quot; target=&quot;_blank&quot;&gt;arrive on release/13.x&lt;/a&gt;. Step 2 of this last iteration is more difficult. In order to get Cling to build against the new LLVM libraries, we would need to reimplement the entire IncrementalJIT against the ORCv2 API.&lt;/p&gt;

&lt;p&gt;This ORCv2 replacement is a big effort and we’re better off with a version that’s incomplete but executable. For that we first &lt;a href=&quot;https://github.com/weliveindetail/cling/compare/baa8416fb30cde7371c412672eb7bcf8fc250111..eb078ba782dc9e2e7a7556525ce9c2f47f38d9af&quot; target=&quot;_blank&quot;&gt;adopt all non-JIT API changes&lt;/a&gt; and &lt;a href=&quot;https://github.com/weliveindetail/cling/commit/40647a9fa77f6136c513f69df6e3bbbdb7cdbb2f&quot; target=&quot;_blank&quot;&gt;comment out the old IncrementalJIT&lt;/a&gt; until Cling compiles and links again. Next we &lt;a href=&quot;https://github.com/weliveindetail/cling/commit/e107d757734a035b512a52ba02f499e8de2f0e08&quot; target=&quot;_blank&quot;&gt;set up an ORCv2 replacement class&lt;/a&gt; with stub functions that fit the existing interface. Note that we have to keep including the old &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IncrementalJIT.h&lt;/code&gt;, because other components in Cling are using some of its type definitions.&lt;/p&gt;

&lt;p&gt;From here on we can run Cling in a debugger again. We can put breakpoints in our stubs and run commands in Cling until they hit. This is a very &lt;strong&gt;convenient way to reimplement the existing semantics&lt;/strong&gt;, because we can inspect the relevant states at runtime!&lt;/p&gt;

&lt;p&gt;In the constructor we create a &lt;a href=&quot;https://github.com/weliveindetail/cling/commit/5917f160ce452e2bbd7651ce5ccdf6c55ce4bc89&quot; target=&quot;_blank&quot;&gt;basic greedy &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;llvm::orc::LLJIT&lt;/code&gt;&lt;/a&gt; instance. The &lt;a href=&quot;https://github.com/weliveindetail/cling/commit/21d5a188df1a7fe9dc15dedc4c7a5c47cb45eb11&quot; target=&quot;_blank&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;addModule()&lt;/code&gt; function&lt;/a&gt; wraps an incoming &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;llvm::Module&lt;/code&gt; in a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;llvm::orc::ThreadSafeModule&lt;/code&gt; and passes it on to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LLJIT&lt;/code&gt;. The existing interface is not prepared to propagate LLVM’s &lt;a href=&quot;https://weliveindetail.github.io/blog/post/2017/10/22/llvm-expected.html&quot; target=&quot;_blank&quot;&gt;rich recoverable errors&lt;/a&gt;. If we get an error, we log it to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stderr&lt;/code&gt; and stick to the existing interface semantics. 
The &lt;a href=&quot;https://github.com/weliveindetail/cling/commit/2ace5803d4f385968db7b5a010783682117e7b9c&quot; target=&quot;_blank&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;getSymbolAddress()&lt;/code&gt; function&lt;/a&gt; does a simple lookup in the JIT, while &lt;a href=&quot;https://github.com/weliveindetail/cling/commit/b2ac058af82fc7c83ccedb8dd0d88d8c2dfd52bd&quot; target=&quot;_blank&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lookupSymbol()&lt;/code&gt;&lt;/a&gt; appears to be used only to inject symbols for existing function addresses (like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__cxa_atexit&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__dso_handle&lt;/code&gt;). It runs a lookup beforehand to make sure the symbol doesn’t exist and it records the injected symbols in a map.&lt;/p&gt;

&lt;p&gt;Cling has its own little runtime library. It’s used when we assign existing values or print them out. In order to get it to work, we have to &lt;a href=&quot;https://github.com/weliveindetail/cling/commit/a606044670c3813184ffd62e29d19ab788fdc655&quot; target=&quot;_blank&quot;&gt;implement host process lookup&lt;/a&gt;. With that we can evaluate simple expressions already!&lt;/p&gt;

&lt;h3 id=&quot;basic-transaction-rollback&quot;&gt;Basic transaction rollback&lt;/h3&gt;

&lt;p&gt;Cling’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.undo&lt;/code&gt; command rolls back the JITed program to the state of the previous expression. The implementation in &lt;a href=&quot;https://github.com/weliveindetail/cling/blob/cling-13/lib/Interpreter/TransactionUnloader.h&quot; target=&quot;_blank&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cling::TransactionUnloader&lt;/code&gt;&lt;/a&gt; makes some assumptions about the way &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IncrementalJIT&lt;/code&gt; works. In particular, it expects to find the transaction’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;llvm::Module&lt;/code&gt; in a map of “non-pending” modules. I guess this design has historic reasons: With &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ORCv1&lt;/code&gt; modules could only be unloaded once the JIT pipeline had finished processing them. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ORCv2&lt;/code&gt; removed that restriction and allows any module to be unloaded, independent of its materialization state. Because fixing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TransactionUnloader&lt;/code&gt; is not our goal, we hack the system and &lt;a href=&quot;https://github.com/weliveindetail/cling/commit/bfeac9e7f752c0e1105ed24f4d703333e7d66d2f&quot; target=&quot;_blank&quot;&gt;report any module as non-pending immediately&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Now we can add resource tracking to our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LLJIT&lt;/code&gt; and &lt;a href=&quot;https://github.com/weliveindetail/cling/commit/5a554cc883f3f269aeba2ef81b0eb2d937920b49&quot; target=&quot;_blank&quot;&gt;implement &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;removeModule()&lt;/code&gt; for basic code unloading&lt;/a&gt;. The patch adds a callback to our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IRCompileLayer&lt;/code&gt;, which transfers back ownership of processed modules to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IncrementalJIT&lt;/code&gt;, so we can hand it out to the caller of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;removeModule()&lt;/code&gt;. But there is a final hurdle: We must extract the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;llvm::Module&lt;/code&gt; from the received &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;llvm::orc::ThreadSafeModule&lt;/code&gt; and LLVM’s upstream API doesn’t allow this (for good reasons). We’d like to keep it simple for now, so &lt;a href=&quot;https://github.com/weliveindetail/llvm-project/commit/949bfdb1381bec244280e6317515392cc9defe24&quot; target=&quot;_blank&quot;&gt;we add a new downstream patch&lt;/a&gt; that does it.&lt;/p&gt;

&lt;h3 id=&quot;voilà&quot;&gt;Voilà!&lt;/h3&gt;

&lt;p&gt;This way we got Cling to work with LLVM 13 and ORCv2. What is left are a few non-functional cleanup steps, like &lt;a href=&quot;https://github.com/weliveindetail/cling/commit/987486d2a26396bd4bee8b1df07f8b067e3dfb67&quot; target=&quot;_blank&quot;&gt;removing ORCv1 types from function signatures&lt;/a&gt; and &lt;a href=&quot;https://github.com/weliveindetail/cling/commit/93881402b425c7a144e89f27712f6bf18ad5fe9d&quot; target=&quot;_blank&quot;&gt;removing the now unused ORCv1 IncrementalJIT&lt;/a&gt;. Here is the final set of &lt;a href=&quot;https://github.com/weliveindetail/cling/commits/cling-13&quot;&gt;changes in Cling&lt;/a&gt; and in &lt;a href=&quot;https://github.com/weliveindetail/llvm-project/commits/cling-13&quot; target=&quot;_blank&quot;&gt;Cling’s LLVM version&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The resulting Cling is a best-effort version. I didn’t go and investigate test-suite failures. It’s good enough to produce &lt;a href=&quot;#teaser-image&quot;&gt;the initial screenshot&lt;/a&gt; on macOS, but it doesn’t support the full feature-set yet. I haven’t looked at other platforms so far. Linux should be low-hanging fruit; Windows is a different story. There was &lt;a href=&quot;https://github.com/weliveindetail/cling/commit/8d8fe60a4ce45281e2a9320d08646009638591bc&quot; target=&quot;_blank&quot;&gt;one API change&lt;/a&gt; where I didn’t find a solution quickly, so I left a note and skipped it. It causes a number of issues, like failing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.undo&lt;/code&gt; commands in non-trivial cases. I am sure there are more problems that I didn’t spot, but from here on we can iterate.&lt;/p&gt;

&lt;p&gt;Thanks for reading! Here is how to build our new version of Cling in debug mode:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;➜ git clone https://github.com/weliveindetail/llvm-project
➜ git -C llvm-project checkout cling-13
➜ git clone https://github.com/weliveindetail/cling llvm-project/llvm/tools/cling
➜ git -C llvm-project/llvm/tools/cling checkout cling-13
➜ ln -s ../../clang llvm-project/llvm/tools/clang
➜ mkdir build
➜ cmake -S &quot;llvm-project/llvm&quot; -B &quot;build&quot; -GNinja -DBUILD_SHARED_LIBS=On -DLLVM_ENABLE_PROJECTS=clang -DLLVM_EXTERNAL_PROJECTS=cling -DLLVM_TARGETS_TO_BUILD=&quot;ARM;NVPTX;X86&quot;
➜ ninja -C build cling
➜ build/bin/cling --version
1.0~dev
LLVM (http://llvm.org/):
  LLVM version 13.0.0
  DEBUG build with assertions.
  Default target: x86_64-apple-darwin20.6.0
  Host CPU: skylake
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
</description>
				<pubDate>Wed, 27 Oct 2021 11:30:00 +0000</pubDate>
				<link>https://weliveindetail.github.io/blog/post/2021/10/27/cling-llvm13-orcv2.html</link>
				<guid isPermaLink="true">https://weliveindetail.github.io/blog/post/2021/10/27/cling-llvm13-orcv2.html</guid>
			</item>
		
			<item>
				<title>Stop warning markers from polluting vscode minimap</title>
				<description>&lt;p&gt;&lt;img src=&quot;https://weliveindetail.github.io/blog/res/vscode-hide-warning-markers-on.png&quot; alt=&quot;vscode with clang-tidy warnings and minimap markers on&quot; id=&quot;teaser-image&quot; /&gt;&lt;/p&gt;

&lt;script&gt;
  // Make it easier to spot the difference, by animating the teaser with markers on/off
  var other = &quot;https://weliveindetail.github.io/blog/res/vscode-hide-warning-markers-off.png&quot;;
  window.setInterval(function() {
    var elem = document.getElementById(&apos;teaser-image&apos;);
    [elem.src, other] = [other, elem.src];
  }, 1000);
&lt;/script&gt;

&lt;p&gt;&lt;a href=&quot;https://marketplace.visualstudio.com/items?itemName=llvm-vs-code-extensions.vscode-clangd&quot; target=&quot;_blank&quot;&gt;clangd&lt;/a&gt; and other plugins tremendously increase efficiency when working with C++ in Visual Studio Code (vscode), but more plugins means more warnings and not everything is configurable. The above screenshot shows how a load of &lt;a href=&quot;https://clang.llvm.org/extra/clang-tidy/&quot; target=&quot;_blank&quot;&gt;clang-tidy&lt;/a&gt; warning highlights from clangd clashes with the highlights for our search results in the minimap. This is going to slow down search-intensive tasks.&lt;/p&gt;

&lt;p&gt;Ideally, we would fix the warnings altogether or adjust the project’s clang-tidy settings so they won’t show up anymore. However, we all know these situations where we don’t have the time right now to deal with such things. So what else can we do? Disable warnings in clangd? Unfortunately that doesn’t seem possible:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://weliveindetail.github.io/blog/res/vscode-hide-warning-markers-clangd-settings.png&quot; alt=&quot;clangd plugin settings&quot; /&gt;&lt;/p&gt;

&lt;p&gt;To the rescue, there is a way to &lt;a href=&quot;https://code.visualstudio.com/api/references/theme-color&quot; target=&quot;_blank&quot;&gt;change highlight colors globally&lt;/a&gt; in vscode! So we can make warnings transparent in the minimap and in the overview-ruler:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;{
  &quot;workbench.colorCustomizations&quot;: {
    &quot;minimap.warningHighlight&quot;: &quot;#00000000&quot;,
    &quot;editorOverviewRuler.warningForeground&quot;: &quot;#00000000&quot;,
  },
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;voilà&quot;&gt;Voilà!&lt;/h3&gt;

&lt;p&gt;We still see the warning squiggles in the editor, but they don’t pollute the minimap anymore.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://weliveindetail.github.io/blog/res/vscode-hide-warning-markers-off.png&quot; alt=&quot;vscode with clang-tidy warnings and minimap markers off&quot; /&gt;&lt;/p&gt;
</description>
				<pubDate>Wed, 20 Oct 2021 16:00:00 +0000</pubDate>
				<link>https://weliveindetail.github.io/blog/post/2021/10/20/vscode-hide-warning-markers.html</link>
				<guid isPermaLink="true">https://weliveindetail.github.io/blog/post/2021/10/20/vscode-hide-warning-markers.html</guid>
			</item>
		
			<item>
				<title>Diffing Clang AST dumps</title>
				<description>&lt;p&gt;&lt;img src=&quot;https://weliveindetail.github.io/blog/res/clang-ast-dump-diffable.png&quot; alt=&quot;Inspect lit.local.cfg in vscode&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Clang makes it easy to dump the AST, but searching for differences in two given AST dumps is a little tricky. &lt;a href=&quot;https://github.com/weliveindetail/astpp&quot; target=&quot;_blank&quot;&gt;A short Python script&lt;/a&gt; can fix most of it. Let’s have a look at an example.&lt;/p&gt;

&lt;p&gt;Here is the AST for a function with a plain C++11 lambda:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;➜ clang++ -fsyntax-only -fno-color-diagnostics -Xclang -ast-dump plain.cpp 1&amp;gt;plain.ast
➜ cat plain.ast
TranslationUnitDecl 0x7f8916040008 &amp;lt;&amp;lt;invalid sloc&amp;gt;&amp;gt; &amp;lt;invalid sloc&amp;gt;
`-FunctionDecl 0x7f891607bc28 &amp;lt;plain.cpp:1:1, line:4:1&amp;gt; line:1:5 main &apos;int (int, char **)&apos;
  |-ParmVarDecl 0x7f891607b9c0 &amp;lt;col:10, col:14&amp;gt; col:14 used argc &apos;int&apos;
  |-ParmVarDecl 0x7f891607bb08 &amp;lt;col:20, col:31&amp;gt; col:26 argv &apos;char **&apos;:&apos;char **&apos;
  `-CompoundStmt 0x7f89160a9ba8 &amp;lt;col:34, line:4:1&amp;gt;
    |-DeclStmt 0x7f89160a9a20 &amp;lt;line:2:3, col:46&amp;gt;
    | `-VarDecl 0x7f891607bd88 &amp;lt;col:3, col:45&amp;gt; col:8 used lambda &apos;(lambda at plain.cpp:2:17)&apos;:&apos;(lambda at plain.cpp:2:17)&apos; cinit
    |   `-ExprWithCleanups 0x7f89160a9a08 &amp;lt;col:17, col:45&amp;gt; &apos;(lambda at plain.cpp:2:17)&apos;:&apos;(lambda at plain.cpp:2:17)&apos;
    |     `-CXXConstructExpr 0x7f89160a99d8 &amp;lt;col:17, col:45&amp;gt; &apos;(lambda at plain.cpp:2:17)&apos;:&apos;(lambda at plain.cpp:2:17)&apos; &apos;void ((lambda at plain.cpp:2:17) &amp;amp;&amp;amp;) noexcept&apos; elidable
    |       `-MaterializeTemporaryExpr 0x7f89160a9970 &amp;lt;col:17, col:45&amp;gt; &apos;(lambda at plain.cpp:2:17)&apos; xvalue
    |         `-LambdaExpr 0x7f891607c4c8 &amp;lt;col:17, col:45&amp;gt; &apos;(lambda at plain.cpp:2:17)&apos;
    |           |-CXXRecordDecl 0x7f891607bed0 &amp;lt;col:17&amp;gt; col:17 implicit class definition
    |           | |-DefinitionData lambda pass_in_registers empty standard_layout trivially_copyable can_const_default_init
    |           | | |-DefaultConstructor defaulted_is_constexpr
    |           | | |-CopyConstructor simple trivial has_const_param implicit_has_const_param
    |           | | |-MoveConstructor exists simple trivial
    |           | | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
    |           | | |-MoveAssignment
    |           | | `-Destructor simple irrelevant trivial
    |           | |-CXXMethodDecl 0x7f891607c018 &amp;lt;col:28, col:45&amp;gt; col:17 used operator() &apos;int (int) const&apos; inline
    |           | | |-ParmVarDecl 0x7f891607be08 &amp;lt;col:20, col:24&amp;gt; col:24 used argc &apos;int&apos;
    |           | | `-CompoundStmt 0x7f891607c118 &amp;lt;col:30, col:45&amp;gt;
    |           | |   `-ReturnStmt 0x7f891607c108 &amp;lt;col:32, col:39&amp;gt;
    |           | |     `-ImplicitCastExpr 0x7f891607c0f0 &amp;lt;col:39&amp;gt; &apos;int&apos; &amp;lt;LValueToRValue&amp;gt;
    |           | |       `-DeclRefExpr 0x7f891607c0d0 &amp;lt;col:39&amp;gt; &apos;int&apos; lvalue ParmVar 0x7f891607be08 &apos;argc&apos; &apos;int&apos;
    |           | |-CXXConversionDecl 0x7f891607c358 &amp;lt;col:17, col:45&amp;gt; col:17 implicit operator int (*)(int) &apos;int (*() const noexcept)(int)&apos; inline
    |           | |-CXXMethodDecl 0x7f891607c408 &amp;lt;col:17, col:45&amp;gt; col:17 implicit __invoke &apos;int (int)&apos; static inline
    |           | | `-ParmVarDecl 0x7f891607c2f0 &amp;lt;col:20, col:24&amp;gt; col:24 argc &apos;int&apos;
    |           | |-CXXDestructorDecl 0x7f891607c4f0 &amp;lt;col:17&amp;gt; col:17 implicit referenced ~ &apos;void () noexcept&apos; inline default trivial
    |           | |-CXXConstructorDecl 0x7f89160a9600 &amp;lt;col:17&amp;gt; col:17 implicit constexpr  &apos;void (const (lambda at plain.cpp:2:17) &amp;amp;)&apos; inline default trivial noexcept-unevaluated 0x7f89160a9600
    |           | | `-ParmVarDecl 0x7f89160a9730 &amp;lt;col:17&amp;gt; col:17 &apos;const (lambda at plain.cpp:2:17) &amp;amp;&apos;
    |           | `-CXXConstructorDecl 0x7f89160a97d0 &amp;lt;col:17&amp;gt; col:17 implicit used constexpr  &apos;void ((lambda at plain.cpp:2:17) &amp;amp;&amp;amp;) noexcept&apos; inline default trivial
    |           |   |-ParmVarDecl 0x7f89160a9900 &amp;lt;col:17&amp;gt; col:17 &apos;(lambda at plain.cpp:2:17) &amp;amp;&amp;amp;&apos;
    |           |   `-CompoundStmt 0x7f89160a99c8 &amp;lt;col:17&amp;gt;
    |           `-CompoundStmt 0x7f891607c118 &amp;lt;col:30, col:45&amp;gt;
    |             `-ReturnStmt 0x7f891607c108 &amp;lt;col:32, col:39&amp;gt;
    |               `-ImplicitCastExpr 0x7f891607c0f0 &amp;lt;col:39&amp;gt; &apos;int&apos; &amp;lt;LValueToRValue&amp;gt;
    |                 `-DeclRefExpr 0x7f891607c0d0 &amp;lt;col:39&amp;gt; &apos;int&apos; lvalue ParmVar 0x7f891607be08 &apos;argc&apos; &apos;int&apos;
    `-ReturnStmt 0x7f89160a9b98 &amp;lt;line:3:3, col:21&amp;gt;
      `-CXXOperatorCallExpr 0x7f89160a9b58 &amp;lt;col:10, col:21&amp;gt; &apos;int&apos; &apos;()&apos;
        |-ImplicitCastExpr 0x7f89160a9ad8 &amp;lt;col:16, col:21&amp;gt; &apos;int (*)(int) const&apos; &amp;lt;FunctionToPointerDecay&amp;gt;
        | `-DeclRefExpr 0x7f89160a9a78 &amp;lt;col:16, col:21&amp;gt; &apos;int (int) const&apos; lvalue CXXMethod 0x7f891607c018 &apos;operator()&apos; &apos;int (int) const&apos;
        |-ImplicitCastExpr 0x7f89160a9b28 &amp;lt;col:10&amp;gt; &apos;const (lambda at plain.cpp:2:17)&apos; lvalue &amp;lt;NoOp&amp;gt;
        | `-DeclRefExpr 0x7f89160a9a38 &amp;lt;col:10&amp;gt; &apos;(lambda at plain.cpp:2:17)&apos;:&apos;(lambda at plain.cpp:2:17)&apos; lvalue Var 0x7f891607bd88 &apos;lambda&apos; &apos;(lambda at plain.cpp:2:17)&apos;:&apos;(lambda at plain.cpp:2:17)&apos;
        `-ImplicitCastExpr 0x7f89160a9b40 &amp;lt;col:17&amp;gt; &apos;int&apos; &amp;lt;LValueToRValue&amp;gt;
          `-DeclRefExpr 0x7f89160a9a58 &amp;lt;col:17&amp;gt; &apos;int&apos; lvalue ParmVar 0x7f891607b9c0 &apos;argc&apos; &apos;int&apos;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And the corresponding source code:&lt;/p&gt;
&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;argv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[])&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[](&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lambda&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here is an equivalent implementation that uses a &lt;a href=&quot;https://isocpp.org/wiki/faq/cpp14-language#generic-lambdas&quot; target=&quot;_blank&quot;&gt;C++14 generic lambda&lt;/a&gt; instead:&lt;/p&gt;
&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;argv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[])&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[](&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lambda&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now let’s try to figure out what parts of the AST changed:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;➜ clang++ -fsyntax-only -fno-color-diagnostics -Xclang -ast-dump -std=c++14 generic.cpp 1&amp;gt;generic.ast
➜ diff -u plain.ast generic.ast | wc -l
     101
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Well, for now the diff doesn’t help much:&lt;/p&gt;
&lt;div class=&quot;language-diff highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gd&quot;&gt;--- plain.ast	2021-09-03 11:50 +0200
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+++ generic.ast	2021-09-03 11:50 +0200
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@@ -1,46 +1,58 @@&lt;/span&gt;
&lt;span class=&quot;gd&quot;&gt;-TranslationUnitDecl 0x7f9492040008 &amp;lt;&amp;lt;invalid sloc&amp;gt;&amp;gt; &amp;lt;invalid sloc&amp;gt;
-`-FunctionDecl 0x7f949207bc28 &amp;lt;plain.cpp:1:1, line:4:1&amp;gt; line:1:5 main &apos;int (int, char **)&apos;
-  |-ParmVarDecl 0x7f949207b9c0 &amp;lt;col:10, col:14&amp;gt; col:14 used argc &apos;int&apos;
-  |-ParmVarDecl 0x7f949207bb08 &amp;lt;col:20, col:31&amp;gt; col:26 argv &apos;char **&apos;:&apos;char **&apos;
-  `-CompoundStmt 0x7f94920a9d58 &amp;lt;col:34, line:4:1&amp;gt;
-    |-DeclStmt 0x7f94920a9b90 &amp;lt;line:2:3, col:46&amp;gt;
-    | `-VarDecl 0x7f949207bd88 &amp;lt;col:3, col:45&amp;gt; col:8 used lambda &apos;(lambda at plain.cpp:2:17)&apos;:&apos;(lambda at plain.cpp:2:17)&apos; cinit
-    |   `-ExprWithCleanups 0x7f94920a9b78 &amp;lt;col:17, col:45&amp;gt; &apos;(lambda at plain.cpp:2:17)&apos;:&apos;(lambda at plain.cpp:2:17)&apos;
-    |     `-CXXConstructExpr 0x7f94920a9b48 &amp;lt;col:17, col:45&amp;gt; &apos;(lambda at plain.cpp:2:17)&apos;:&apos;(lambda at plain.cpp:2:17)&apos; &apos;void ((lambda at plain.cpp:2:17) &amp;amp;&amp;amp;) noexcept&apos; elidable
-    |       `-MaterializeTemporaryExpr 0x7f94920a9ae0 &amp;lt;col:17, col:45&amp;gt; &apos;(lambda at plain.cpp:2:17)&apos; xvalue
-    |         `-LambdaExpr 0x7f949207c6d8 &amp;lt;col:17, col:45&amp;gt; &apos;(lambda at plain.cpp:2:17)&apos;
-    |           |-CXXRecordDecl 0x7f949207bef8 &amp;lt;col:17&amp;gt; col:17 implicit class definition
-    |           | |-DefinitionData lambda pass_in_registers empty standard_layout trivially_copyable can_const_default_init
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+TranslationUnitDecl 0x7f8fde840008 &amp;lt;&amp;lt;invalid sloc&amp;gt;&amp;gt; &amp;lt;invalid sloc&amp;gt;
+`-FunctionDecl 0x7f8fde87bc28 &amp;lt;generic.cpp:1:1, line:4:1&amp;gt; line:1:5 main &apos;int (int, char **)&apos;
+  |-ParmVarDecl 0x7f8fde87b9c0 &amp;lt;col:10, col:14&amp;gt; col:14 used argc &apos;int&apos;
+  |-ParmVarDecl 0x7f8fde87bb08 &amp;lt;col:20, col:31&amp;gt; col:26 argv &apos;char **&apos;:&apos;char **&apos;
+  `-CompoundStmt 0x7f8fde88c228 &amp;lt;col:34, line:4:1&amp;gt;
+    |-DeclStmt 0x7f8fde88bc10 &amp;lt;line:2:3, col:47&amp;gt;
+    | `-VarDecl 0x7f8fde87bd88 &amp;lt;col:3, col:46&amp;gt; col:8 used lambda &apos;(lambda at generic.cpp:2:17)&apos;:&apos;(lambda at generic.cpp:2:17)&apos; cinit
+    |   `-ExprWithCleanups 0x7f8fde88bbf8 &amp;lt;col:17, col:46&amp;gt; &apos;(lambda at generic.cpp:2:17)&apos;:&apos;(lambda at generic.cpp:2:17)&apos;
+    |     `-CXXConstructExpr 0x7f8fde88bbc8 &amp;lt;col:17, col:46&amp;gt; &apos;(lambda at generic.cpp:2:17)&apos;:&apos;(lambda at generic.cpp:2:17)&apos; &apos;void ((lambda at generic.cpp:2:17) &amp;amp;&amp;amp;) noexcept&apos; elidable
+    |       `-MaterializeTemporaryExpr 0x7f8fde88bb60 &amp;lt;col:17, col:46&amp;gt; &apos;(lambda at generic.cpp:2:17)&apos; xvalue
+    |         `-LambdaExpr 0x7f8fde88b578 &amp;lt;col:17, col:46&amp;gt; &apos;(lambda at generic.cpp:2:17)&apos;
+    |           |-CXXRecordDecl 0x7f8fde87c048 &amp;lt;col:17&amp;gt; col:17 implicit class definition
+    |           | |-DefinitionData generic lambda pass_in_registers empty standard_layout trivially_copyable can_const_default_init
&lt;/span&gt;     |           | | |-DefaultConstructor defaulted_is_constexpr
     |           | | |-CopyConstructor simple trivial has_const_param implicit_has_const_param
     |           | | |-MoveConstructor exists simple trivial
 ...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;challenges&quot;&gt;Challenges&lt;/h3&gt;

&lt;p&gt;For Clang it’s efficient to identify AST nodes by their memory address. Almost all lines in our diff represent an AST node and thus contain an address. That’s why they never match: addresses are not deterministic. Some are referred to in subsequent lines, but most of them are not. We can drop the solo ones and replace the others with a simple incremental ID. This eliminates the major source of conflicts, but for now it only reduces our diff by 9 lines:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;➜ astpp plain.ast --process node-ids &amp;gt; plain1.ast
➜ astpp generic.ast --process node-ids &amp;gt; generic1.ast
➜ diff -u plain1.ast generic1.ast | wc -l
      92
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
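
&lt;p&gt;To make the node-ID step concrete, here is a minimal Python sketch of the idea described above. It is not the actual &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;astpp&lt;/code&gt; implementation: the function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;replace_node_ids&lt;/code&gt;, the address regex, the shortened example addresses and the numbering that starts at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ID0001&lt;/code&gt; are all assumptions made for illustration.&lt;/p&gt;

```python
import re
from collections import Counter

# Assumption: node addresses look like hexadecimal pointers (0x7f8fde88bc10)
# and are preceded by a single space, as in the dumps shown in this post.
ADDRESS = re.compile(r"( ?)(0x[0-9a-f]+)")


def replace_node_ids(lines):
    """Drop addresses that occur once; number the others by first appearance."""
    # First pass: count how often each address occurs in the whole dump.
    counts = Counter(
        match.group(2) for line in lines for match in ADDRESS.finditer(line)
    )
    ids = {}

    def substitute(match):
        space, addr = match.group(1), match.group(2)
        if counts[addr] == 1:
            # Solo address: nothing refers to it, so remove it together
            # with the space in front of it.
            return ""
        if addr not in ids:
            ids[addr] = "ID{:04d}".format(len(ids) + 1)
        return space + ids[addr]

    # Second pass: rewrite every line with the stable IDs.
    return [ADDRESS.sub(substitute, line) for line in lines]


# Hypothetical three-line dump with shortened addresses.
dump = [
    "|-ParmVarDecl 0x1a10 used argc 'int'",
    "`-ReturnStmt 0x1a20",
    "  `-DeclRefExpr 0x1a30 'int' lvalue ParmVar 0x1a10 'argc' 'int'",
]
for line in replace_node_ids(dump):
    print(line)
# |-ParmVarDecl ID0001 used argc 'int'
# `-ReturnStmt
#   `-DeclRefExpr 'int' lvalue ParmVar ID0001 'argc' 'int'
```

&lt;p&gt;The sketch needs two passes over the dump, because whether an address is solo is only known once the whole file has been seen.&lt;/p&gt;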

&lt;p&gt;Another source of conflicts that we are not really interested in is source locations: varying file names as well as line and column numbers. We can drop them altogether and make sure they don’t leave artifacts like brackets or whitespace. This cuts our diff by another 27 lines:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;➜ astpp plain.ast --process node-ids source-locs &amp;gt; plain2.ast
➜ astpp generic.ast --process node-ids source-locs &amp;gt; generic2.ast
➜ diff -u plain2.ast generic2.ast | wc -l
      65
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Running the script without the explicit &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--process&lt;/code&gt; parameter runs both steps and additionally trims trailing whitespace.&lt;/p&gt;

&lt;h3 id=&quot;voilà&quot;&gt;Voilà!&lt;/h3&gt;

&lt;p&gt;The result gives good insight into what has changed in the AST:&lt;/p&gt;

&lt;div class=&quot;language-diff highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gd&quot;&gt;--- plain.out.ast	2021-09-03 11:50 +0200
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+++ generic.out.ast	2021-09-03 11:50 +0200
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@@ -10,36 +10,48 @@&lt;/span&gt;
     |       `-MaterializeTemporaryExpr &apos;(lambda at  )&apos; xvalue
     |         `-LambdaExpr &apos;(lambda at  )&apos;
     |           |-CXXRecordDecl implicit class definition
&lt;span class=&quot;gd&quot;&gt;-    |           | |-DefinitionData lambda pass_in_registers empty standard_layout trivially_copyable can_const_default_init
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+    |           | |-DefinitionData generic lambda pass_in_registers empty standard_layout trivially_copyable can_const_default_init
&lt;/span&gt;     |           | | |-DefaultConstructor defaulted_is_constexpr
     |           | | |-CopyConstructor simple trivial has_const_param implicit_has_const_param
     |           | | |-MoveConstructor exists simple trivial
     |           | | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
     |           | | |-MoveAssignment
     |           | | `-Destructor simple irrelevant trivial
&lt;span class=&quot;gd&quot;&gt;-    |           | |-CXXMethodDecl ID0003 used operator() &apos;int (int) const&apos; inline
-    |           | | |-ParmVarDecl ID0004 used argc &apos;int&apos;
-    |           | | `-CompoundStmt ID0005
-    |           | |   `-ReturnStmt ID0006
-    |           | |     `-ImplicitCastExpr ID0007 &apos;int&apos; &amp;lt;LValueToRValue&amp;gt;
-    |           | |       `-DeclRefExpr ID0008 &apos;int&apos; lvalue ParmVar ID0004 &apos;argc&apos; &apos;int&apos;
-    |           | |-CXXConversionDecl implicit operator int (*)(int) &apos;int (*() const noexcept)(int)&apos; inline
-    |           | |-CXXMethodDecl implicit __invoke &apos;int (int)&apos; static inline
-    |           | | `-ParmVarDecl argc &apos;int&apos;
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+    |           | |-FunctionTemplateDecl operator()
+    |           | | |-TemplateTypeParmDecl ID0003 implicit class depth 0 index 0 argc:auto
+    |           | | |-CXXMethodDecl operator() &apos;auto (auto) const&apos; inline
+    |           | | | |-ParmVarDecl ID0004 referenced argc &apos;auto&apos;
+    |           | | | `-CompoundStmt ID0005
+    |           | | |   `-ReturnStmt ID0006
+    |           | | |     `-DeclRefExpr ID0007 &apos;auto&apos; lvalue ParmVar ID0004 &apos;argc&apos; &apos;auto&apos;
+    |           | | `-CXXMethodDecl ID0008 used operator() &apos;int (int) const&apos; inline
+    |           | |   |-TemplateArgument type &apos;int&apos;
+    |           | |   | `-BuiltinType  &apos;int&apos;
+    |           | |   |-ParmVarDecl ID0009 used argc &apos;int&apos;:&apos;int&apos;
+    |           | |   `-CompoundStmt
+    |           | |     `-ReturnStmt
+    |           | |       `-ImplicitCastExpr &apos;int&apos;:&apos;int&apos; &amp;lt;LValueToRValue&amp;gt;
+    |           | |         `-DeclRefExpr &apos;int&apos;:&apos;int&apos; lvalue ParmVar ID0009 &apos;argc&apos; &apos;int&apos;:&apos;int&apos;
+    |           | |-FunctionTemplateDecl implicit operator auto (*)(type-parameter-0-0)
+    |           | | |-TemplateTypeParmDecl ID0003 implicit class depth 0 index 0 argc:auto
+    |           | | `-CXXConversionDecl implicit operator auto (*)(type-parameter-0-0) &apos;auto (*() const noexcept)(auto)&apos; inline
+    |           | |-FunctionTemplateDecl implicit __invoke
+    |           | | |-TemplateTypeParmDecl ID0003 implicit class depth 0 index 0 argc:auto
+    |           | | `-CXXMethodDecl implicit __invoke &apos;auto (auto)&apos; static inline
+    |           | |   `-ParmVarDecl argc &apos;auto&apos;
&lt;/span&gt;     |           | |-CXXDestructorDecl implicit referenced ~ &apos;void () noexcept&apos; inline default trivial
&lt;span class=&quot;gd&quot;&gt;-    |           | |-CXXConstructorDecl ID0009 implicit constexpr  &apos;void (const (lambda at  ) &amp;amp;)&apos; inline default trivial noexcept-unevaluated ID0009
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+    |           | |-CXXConstructorDecl ID0010 implicit constexpr  &apos;void (const (lambda at  ) &amp;amp;)&apos; inline default trivial noexcept-unevaluated ID0010
&lt;/span&gt;     |           | | `-ParmVarDecl &apos;const (lambda at  ) &amp;amp;&apos;
     |           | `-CXXConstructorDecl implicit used constexpr  &apos;void ((lambda at  ) &amp;amp;&amp;amp;) noexcept&apos; inline default trivial
     |           |   |-ParmVarDecl &apos;(lambda at  ) &amp;amp;&amp;amp;&apos;
     |           |   `-CompoundStmt
     |           `-CompoundStmt ID0005
     |             `-ReturnStmt ID0006
&lt;span class=&quot;gd&quot;&gt;-    |               `-ImplicitCastExpr ID0007 &apos;int&apos; &amp;lt;LValueToRValue&amp;gt;
-    |                 `-DeclRefExpr ID0008 &apos;int&apos; lvalue ParmVar ID0004 &apos;argc&apos; &apos;int&apos;
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+    |               `-DeclRefExpr ID0007 &apos;auto&apos; lvalue ParmVar ID0004 &apos;argc&apos; &apos;auto&apos;
&lt;/span&gt;     `-ReturnStmt
       `-CXXOperatorCallExpr &apos;int&apos;:&apos;int&apos; &apos;()&apos;
         |-ImplicitCastExpr &apos;int (*)(int) const&apos; &amp;lt;FunctionToPointerDecay&amp;gt;
&lt;span class=&quot;gd&quot;&gt;-        | `-DeclRefExpr &apos;int (int) const&apos; lvalue CXXMethod ID0003 &apos;operator()&apos; &apos;int (int) const&apos;
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+        | `-DeclRefExpr &apos;int (int) const&apos; lvalue CXXMethod ID0008 &apos;operator()&apos; &apos;int (int) const&apos;
&lt;/span&gt;         |-ImplicitCastExpr &apos;const (lambda at  )&apos; lvalue &amp;lt;NoOp&amp;gt;
         | `-DeclRefExpr &apos;(lambda at  )&apos;:&apos;(lambda at  )&apos; lvalue Var ID0002 &apos;lambda&apos; &apos;(lambda at  )&apos;:&apos;(lambda at  )&apos;
         `-ImplicitCastExpr &apos;int&apos; &amp;lt;LValueToRValue&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
</description>
				<pubDate>Fri, 03 Sep 2021 16:00:00 +0000</pubDate>
				<link>https://weliveindetail.github.io/blog/post/2021/09/03/clang-ast-dump-diffable.html</link>
				<guid isPermaLink="true">https://weliveindetail.github.io/blog/post/2021/09/03/clang-ast-dump-diffable.html</guid>
			</item>
		
	</channel>
</rss>
