Reverse Engineering Jieli SDK
Jieli
There is an interesting series of chips from a Chinese company called Jieli, used in a lot of cheap Bluetooth headsets on AliExpress. If you want to know more about these chips, check out the Jieli website and the third-party documentation on GitHub.
I wanted to make Bluetooth headphones as a hobby project and came across these chips. Eventually I found that the SDK for these chips is published on GitHub by the company itself (https://github.com/Jieli-Tech/fw-AC63_BT_SDK). Unfortunately, even the most basic functionality, like UART or clock initialization code, ships as precompiled static libraries in this SDK.
So I got curious and wanted to see if I could reverse engineer some of this functionality. There have already been a few attempts (#1, #2) at this, but things are made difficult by the fact that these chips use a set of custom ISAs. There is a Ghidra processor module for these architectures, but for the chip I have (AC6965A, with the pi32v2 architecture) the module has limited support and can’t disassemble a lot of instructions, so the decompilation is incomplete and most functions are unrecoverable.
While poking at the SDK, I noticed that the static library files provided by the SDK don’t contain real object code, but rather LLVM bitcode. I think this is because Jieli wants to ship a single library for multiple related ISAs and re-target it at compile time to the exact architecture of the chip.
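A quick way to check this claim for a given library is to walk the archive and look at the first bytes of each member. The sketch below is only an illustration: it assumes a classic (non-thin) `ar` archive, reports member names exactly as stored in the headers, and the function names are made up for this example. The two magic values are the raw bitcode signature and the bitcode wrapper signature.

```python
import io
import sys

BITCODE_MAGIC = b"BC\xc0\xde"        # raw LLVM bitcode
WRAPPER_MAGIC = b"\xde\xc0\x17\x0b"  # bitcode wrapper header (0x0B17C0DE, little-endian)


def is_llvm_bitcode(data):
    """Return True if the payload starts with an LLVM bitcode signature."""
    return data[:4] in (BITCODE_MAGIC, WRAPPER_MAGIC)


def iter_ar_members(stream):
    """Yield (raw_name, payload) for each member of a classic ar archive."""
    if stream.read(8) != b"!<arch>\n":
        raise ValueError("not a classic ar archive")
    while True:
        header = stream.read(60)  # fixed-size member header
        if len(header) < 60:
            break
        name = header[:16].decode("ascii", "replace").rstrip()
        size = int(header[48:58].decode("ascii").strip())
        payload = stream.read(size)
        if size % 2:
            stream.read(1)  # member data is padded to an even length
        yield name, payload


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        for name, payload in iter_ar_members(f):
            kind = "bitcode" if is_llvm_bitcode(payload) else "other"
            print(f"{name:20} {kind}")
```

Special members such as the symbol table or the long-name table will show up as "other"; only the object members are of interest here.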
There are a few tools out there to decompile LLVM bitcode files back to C code. Let’s take a look at two of them.
LLVM-CBE
llvm-cbe, or “LLVM C backend”, is not a real decompiler but rather a compiler backend for LLVM. It takes LLVM bitcode and produces C code as output. The output is not bad, but it’s riddled with gotos and LLVM IR internals like phi nodes, and it leaves out things that could be recovered from debug metadata, like struct field names and pointer types.
You’ll have to compile llvm-cbe from source, then find the cpu.a for your chip and extract it somewhere. You can use 7z or ar to extract these archives.
Then,
|
|
Some example code from the output
|
|
Rellic
As the Rellic GitHub repo mentions, it takes LLVM bitcode and produces goto-free C code as output. It’s similar to CBE, but it produces much nicer, more readable code. Download it from the releases section of the Rellic GitHub repo.
When I tried to use it, it didn’t like the bitcode file:
|
|
The problem is that the bitcode’s target triple (architecture) is not supported by the LLVM version Rellic depends on. Since bitcode is supposed to be target independent, I can disassemble it with llvm-dis and change the target triple. Rellic is finicky, and some bitcode files need more fiddling than others, so I’m using a different bitcode file from here.
|
|
Then edit vm_api.c.o.ll and replace target triple = "pi32v2" with target triple = "i386-pc-none-elf". Then I can feed this file directly to Rellic.
|
|
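The manual triple edit described above could also be scripted. This is a minimal sketch, assuming the disassembled file contains exactly one top-level `target triple = "..."` line; the function name `retarget` is hypothetical, and the `target datalayout` line is deliberately left untouched.

```python
import re

# Matches the module-level target triple line in textual LLVM IR.
_TRIPLE = re.compile(r'^target triple = "[^"]*"$', re.MULTILINE)


def retarget(ir_text, new_triple="i386-pc-none-elf"):
    """Return ir_text with its target triple replaced by new_triple."""
    new_text, count = _TRIPLE.subn(
        lambda _m: f'target triple = "{new_triple}"', ir_text, count=1
    )
    if count != 1:
        raise ValueError("no target triple line found")
    return new_text
```

Reading vm_api.c.o.ll, passing its contents through `retarget`, and writing the result back gives the same file as the manual edit.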
Some example code
|
|
Rellic’s output has some of the same issues, like not being able to recover some struct field values, function argument names, etc. It also takes a long time to run on some files because it uses a constraint solver (Z3) under the hood to simplify the output, for example to remove gotos. On some bitcode files it just crashes outright with cryptic errors.
Bitcode Re-targeting
While poking at the SDK some more, I noticed that it saves the intermediate compilation stages of the output binary to files named like sdk.elf.*.*.****.bc (due to the compiler argument --plugin-opt=save-temps). All of these are bitcode files. This gave me an idea: why not recompile these files for an architecture that is well supported by decompiler tools like Ghidra?
Let’s try to compile the unoptimized bitcode file to x86
|
|
It turns out current LLVM toolchains don’t handle this old bitcode. The SDK uses a toolchain based on the very old llvm-4.0.1. It is possible to “upgrade” the bitcode by disassembling it with llvm-dis and assembling it back, but I’ve had some other issues with that, so I decided to do everything with llvm-4.0.1. The Jieli toolchain doesn’t seem to support any targets other than their own chips, so I got a prebuilt copy of llvm-4.0.1 from the download section of the LLVM website. It will complain about a missing libtinfo.so.5 on modern Ubuntu versions; just download libtinfo5_6.4-4_amd64.deb from the Debian or Ubuntu archives, extract libtinfo.so.5.x from it, and put it into the LLVM lib directory as libtinfo.so.5.
Let’s try to compile with llvm-4.0.1
|
|
Oops, looks like this bitcode has a bunch of inline assembly doing various low-level operations, and inline assembly is not architecture independent. Removing it shouldn’t affect the decompilation, but I can’t just delete the entire inline assembly line because that would break the bitcode semantics. So I decided to replace each inline assembly body with a nop.
First, disassemble the bitcode into its text format.
|
|
To replace the inline assembly, I wrote a Python script:
|
|
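For illustration, the replacement idea can be sketched as follows. This is not the script used in this post, just a minimal approximation that assumes inline assembly shows up in the textual IR as `asm [sideeffect] [alignstack] [inteldialect] "body", "constraints"`. Only the body string is swapped for `nop`; the constraint string and operands stay in place so the call remains well-formed. Module-level `module asm` lines are skipped. The name `nop_inline_asm` is hypothetical.

```python
import re

# An inline asm expression: optional flags, then the quoted assembly body.
# Textual IR escapes quotes inside strings as \22, so the body ends at the next '"'.
_INLINE_ASM = re.compile(
    r'(?<!module )\basm\s+'
    r'(?P<flags>(?:(?:sideeffect|alignstack|inteldialect)\s+)*)'
    r'"[^"]*"'
)


def nop_inline_asm(ir_text):
    """Replace every inline asm body with "nop".

    Returns a (new_text, replacement_count) tuple.
    """
    return _INLINE_ASM.subn(
        lambda m: f'asm {m.group("flags")}"nop"', ir_text
    )
```

Because the constraints are kept as they are, the register allocator still has to satisfy them on the new target.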
Run it with
|
|
Then compile it
|
|
This produces an x86 binary that I can import into Ghidra, which recovers all function parameters, struct field names, etc. from the debug metadata.
With some features enabled, inline assembly in the produced bitcode might try to allocate more registers than x86 has, and the compiler throws this error:
|
|
I tried the ARM architecture since it has more general-purpose registers, and this seems to work fine.
|
|
Some inline assembly blocks may still allocate more registers than even ARM has. I tried MIPS, which has even more registers, but it had issues with relocations. The only option was to find these blocks (uncomment the relevant line in the Python code above) and remove the body of the entire function. In my case, this was only a few functions related to some DSP code.
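To get a list of candidate functions before editing anything by hand, something like the sketch below can help. It is an assumption-laden approximation: it relies on the usual textual IR layout where each function starts with a `define` line and ends with a `}` in the first column, it only reports function names, and it does not remove any bodies. The name `functions_with_inline_asm` is made up for this example.

```python
import re

# Start of a function definition; captures the (possibly quoted) symbol name.
_DEFINE = re.compile(r'^define\b.*?@(?P<name>"[^"]+"|[-\w$.]+)\s*\(')
# An inline asm expression inside a function body (not module-level asm).
_ASM_CALL = re.compile(
    r'(?<!module )\basm\s+(?:(?:sideeffect|alignstack|inteldialect)\s+)*"'
)


def functions_with_inline_asm(ir_text):
    """Return names of defined functions whose body contains inline asm."""
    found = []
    current = None  # name of the function being scanned, if any
    for line in ir_text.splitlines():
        m = _DEFINE.match(line)
        if m:
            current = m.group("name").strip('"')
        elif line.startswith("}"):
            current = None
        elif current and _ASM_CALL.search(line) and current not in found:
            found.append(current)
    return found
```

The returned names can then be checked one by one to see which functions actually trigger the register allocation error.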
Using PowerPC like this can generate a properly linked ELF file. The binary generated by llc might have register regions etc. overlaid by code because llc doesn’t place them properly.
|
|
You might have to edit cpu/br25/sdk.ld and expand the code0 and ram0 regions a little bit. You will also have to build clang+llvm with the LLVMgold plugin enabled.