disrobe: a universal decompiler, deobfuscator, and unpacker
One tool to decompile, deobfuscate, and unpack almost anything, deterministically, in a single Rust binary.
disrobe is a universal multi-language decompiler and deobfuscator. It decompiles Python .pyc bytecode, unpacks PyArmor and PyInstaller, reads Nuitka-compiled binaries, decompiles WebAssembly, deobfuscates JavaScript, decompiles .NET / CIL and JVM / Java, recovers Android DEX, and unwraps native PE / ELF / Mach-O packers, all from one static binary built for malware analysis and reverse engineering.
Try it in your browser: the disrobe playground. Decompile a .pyc, scan a pickle for malicious reduce callables, and summarize a .wasm module, all client-side, with the core passes compiled to WebAssembly. Nothing is uploaded.
disrobe reverses the bytecode, packers, freezers, and protectors layered onto compiled and frozen software across 20+ ecosystems: Python, JavaScript/TypeScript, WebAssembly, JVM and Android, .NET, native PE/ELF/Mach-O, Go, Lua, PHP, Ruby, Erlang/Elixir (BEAM), Swift/Objective-C, ActionScript 3, React Native Hermes, Flutter Dart AOT, and the native packer tier layered on top of them (UPX, MPRESS, NSPack, FSG, kkrunchy, MEW, ASPack, PECompact, Petite, Yoda's Crypter). It ships as a single static Rust binary.
Built for forensic and recovery work where reproducibility matters:
- Deterministic. No model anywhere in the decompile path. The same input produces byte-identical output on every machine and every run, usable as evidence and as a diff baseline.
- Single static binary. No JVM, no Python runtime, no Docker image required to run the core. Builds from one cargo build --release. Drops into CI headlessly.
- Content-addressed. Every recovered artifact persists as a .dr envelope: an rkyv hot payload plus a postcard cold sidecar, rooted by a BLAKE3 hash. Cache hits are byte-identical and chains compose offline.
- Honest. Every Python decompile is recompiled on the matching interpreter and compared opcode-for-opcode: 92.76% per-code-object equivalence on the CPython 3.14 stdlib (5831 of 6286), measured against the interpreter, not the tool's own output. Recovery that is not perfect is labelled SEMANTIC, PARTIAL, or SKELETON rather than presented as ground truth. Commercial-tier packers that disrobe cannot fully unpack are reported as detect-only by design, never faked.
Who this is for
- Malware analysts and incident responders who receive a packed, frozen, or obfuscated sample and need to read what it does, without executing it.
- Security researchers auditing a closed binary for interoperability or vulnerability research.
- Developers recovering their own lost source from a shipped .pyc, .jar, .dll, or bundled .js.
- Coding agents. Every pass can emit a structured metadata sidecar (--llm) carrying the call graph, type signatures, control-flow shape, capability surface, and decompile provenance, so an LLM can reason about recovered code without re-deriving its structure.
What makes it different
disrobe ships passes for every ecosystem above from a single binary. Where mature FOSS already exists (CFR, Vineflower, jadx, ILSpy, JPEXS, unluac, hermes-dec, Ghidra), disrobe wraps it headlessly behind a unified CLI and adds chain auto-detection, deterministic .dr envelopes, and round-trip verification. Where FOSS coverage is thin (PyArmor v9-pro, the native packer tier, Hermes against a live bundle, Flutter Dart AOT, MicroPython .mpy, PEP 750 t-strings), it is among the few tools handling these statically and offline. Where established tools already dominate (Ghidra/IDA/Binary Ninja for native decompilation), disrobe is the unpack, symbol-recovery, and chain-detect layer that feeds them cleaner input.
Measured recovery
Every figure below is produced by a committed test gate or a local measurement harness graded against an independent oracle, never the tool's own output. The full per-value sourcing lives in xtask/data/recovery.json.
| Ecosystem | Measured | Oracle |
|---|---|---|
| Python bytecode | 92.76% per-code-object equivalence on the CPython 3.14 stdlib (5831 of 6286) | recompile on CPython 3.14.5, opcode diff |
| CPython legacy 1.0-3.7 | 152 of 191 proven-correct (CI floor); 166 of 191 measured locally | recompile-equivalence or structural token-match |
| WebAssembly | 100% op-coverage on the 30 parseable corpus modules; 24 of 24 execution-eligible functions equivalent | execution differential under wasmtime |
| JVM classfile | 93.1% of methods recompile error-free (122 of 131 floor; 128 measured) | real javac |
| Android (Dalvik) | 99% of verifiable classes pass the JVM verifier (102 of 103) | -Xverify:all over assembled jar |
| Ruby YARV | greeter 100%, megafile 85% opcode-multiset equivalence | recompile on MRI |
| PyArmor | 72 of 72 real-corpus samples recovered | plaintext-absent oracle |
| Containers | 98 formats detected, 98 extracted in-tree | per-format byte length |
The numbers that are not perfect are labelled SEMANTIC, PARTIAL, or SKELETON, and the information-theoretic walls (native-virtualized code, runtime-only keys, RSA-wrapped capsule keys) are reported as detect-only by design.
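The percentages in the table follow directly from the counts quoted beside them. A minimal sketch that recomputes three of them; the helper name pct is illustrative and not part of disrobe:

```python
def pct(passed: int, total: int, digits: int = 2) -> float:
    """Return passed/total as a percentage rounded to `digits` decimal places."""
    return round(100.0 * passed / total, digits)


# Counts copied from the Measured column of the table.
python_stdlib = pct(5831, 6286)        # Python per-code-object equivalence
jvm_floor = pct(122, 131, digits=1)    # asserted CI floor for JVM methods
dalvik = pct(102, 103, digits=1)       # Dalvik classes passing the JVM verifier

print(python_stdlib, jvm_floor, dalvik)
```

Recomputing from the raw counts is a quick way to check that a quoted percentage and its numerator/denominator have not drifted apart between releases.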
How to read these docs
- New here? Start with Installation and Quickstart.
- Want to understand the design? Read the Architecture overview, then the five-rung IR ladder.
- Looking for a specific language? Jump to its language guide.
- Need an exact command or flag? See the CLI command reference.
- Running disrobe against untrusted samples? Read Forensics and malware-safety posture first.
Installation
disrobe is distributed two ways: prebuilt binaries from the GitHub Releases tab, and build from source with a single Rust toolchain. There is intentionally no PyPI/npm/Homebrew/crates.io/Docker channel for the binary itself; GitHub Releases is the canonical distribution point.
Prebuilt binaries (recommended)
Each tagged release attaches prebuilt, statically-linkable binaries for the common targets, alongside SHA256SUMS, a cosign keyless signature, and a minisign signature.
| OS | Architectures |
|---|---|
| Windows 10/11 | x86-64, ARM64 |
| Linux (glibc + musl) | x86-64, ARM64 |
| macOS 13+ | x86-64, ARM64 (Apple Silicon) |
- Download the archive for your platform from the Releases page.
- Verify the checksum:
  sha256sum -c SHA256SUMS # Linux / macOS
- (Optional) verify the cosign signature against the Sigstore transparency log:
  cosign verify-blob \
    --certificate-identity-regexp '^https://github.com/1-3-7/disrobe/' \
    --certificate-oidc-issuer 'https://token.actions.githubusercontent.com' \
    --signature disrobe-<version>-<target>.tar.zst.sig \
    --bundle disrobe-<version>-<target>.tar.zst.cosign.bundle \
    disrobe-<version>-<target>.tar.zst
- Extract and place disrobe (disrobe.exe on Windows) anywhere on your PATH.
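On platforms without sha256sum, the checksum step can be approximated in a few lines of Python. This is a sketch, assuming SHA256SUMS uses the coreutils line format (hex digest, whitespace, file name) that sha256sum -c expects; the function names are illustrative:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_sums(text: str) -> dict[str, str]:
    """Parse lines of the form '<hex digest>  <name>' or '<hex digest> *<name>'."""
    sums = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, name = line.split(maxsplit=1)
        sums[name.lstrip("*").strip()] = digest.lower()
    return sums


def verify(archive: Path, sums_file: Path) -> bool:
    """True only if the archive is listed and its digest matches."""
    expected = parse_sums(sums_file.read_text()).get(archive.name)
    return expected is not None and expected == sha256_of(archive)
```

A checksum match only shows the download is intact; the cosign and minisign signatures are what tie the archive to the release workflow.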
Build from source
Building requires Rust 1.95 or newer (stable). That is the only build dependency for the core; the optional external backends are fetched separately (see below).
git clone https://github.com/1-3-7/disrobe
cd disrobe
cargo build --release
./target/release/disrobe --version
A release build takes roughly four to six minutes on commodity hardware. The binary lands at target/release/disrobe; copy it onto your PATH.
Verifying the install
disrobe --version # print the version
disrobe passes # list every registered pass with a one-line summary
disrobe --help # full subcommand surface
disrobe <pass> --help # drill into any pass, e.g. `disrobe py --help`
Optional external backends
disrobe's in-house passes run with zero external dependencies. A subset of capabilities, however, wrap mature external tools headlessly: Ghidra for native decompilation; CFR / Vineflower / Procyon / jadx for the JVM and Android; ILSpy / dnSpy / de4dot for .NET; Rizin and friends elsewhere. These are never the product for bytecode languages (disrobe ships its own in-house decompilers there) and are always optional.
Probe what is installed and what is missing:
disrobe doctor # probe ~50 optional external tools
disrobe doctor --auto-install # install every missing tool with a known action
Install a single tool through your platform's native package manager (winget / brew / apt / dnf / pacman / apk). disrobe never installs itself this way; it only fetches the optional backends:
disrobe install --list # list every known tool + per-platform package name
disrobe install ghidra
disrobe install upx
Heavyweight dependencies that ship as upstream release archives rather than OS packages (Ghidra, for instance) have a dedicated installer:
disrobe install-deps ghidra
disrobe install-deps --all
Shell completions and man pages
disrobe completions bash --install # also: zsh, fish, powershell, elvish
disrobe man --out ./man # one .1 page per subcommand
Quickstart
The fastest path is disrobe auto: hand it a file and it fingerprints the input, then chains the right passes end to end.
Auto-detect and chain
disrobe auto suspect.exe --out recovered/
# detected: PE -> UPX -> rust-demangle
# stage 01-upx ok (.text byte-identical, 1.18 MiB in 9 ms)
# stage 02-demangle ok (4172 Rust symbols, 312 C++ symbols, 0 unresolved)
# final ok -> recovered/final/
disrobe auto understands chains such as:
- PE -> UPX -> rust-demangle -> symbol recovery
- PyInstaller -> PyArmor -> .pyc decompile
- APK -> dex -> JADX + Smali + manifest
- Electron .asar -> webcrack -> source
Use --capture-stages to mirror every executed pass's byte-exact output under <out>/NN-<pass>/ and link the terminal stage(s) under <out>/final/. Cap the chain depth with --max-depth (default 8).
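The NN-<pass> naming can be pictured with a small sketch. It assumes two-digit, 1-based stage numbers (matching the 01-upx and 02-demangle labels in the sample output) and assumes that exceeding the depth cap is an error; how disrobe actually reacts to an over-deep chain is not stated here, so treat that part as hypothetical:

```python
def stage_dirs(passes: list[str], max_depth: int = 8) -> list[str]:
    """Map an ordered list of pass names to NN-<pass> directory names."""
    if len(passes) > max_depth:
        # Assumption: an over-deep chain is rejected rather than truncated.
        raise ValueError(f"chain depth {len(passes)} exceeds cap {max_depth}")
    return [f"{index:02d}-{name}" for index, name in enumerate(passes, start=1)]


print(stage_dirs(["upx", "demangle"]))
```

Zero-padded prefixes keep the stage directories in execution order under a plain lexicographic listing.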
Per-language one-liners
Every one of these is real and backed by an in-tree fixture and integration test:
# Python
disrobe py decompile module.pyc --out recovered/
disrobe py disasm module.pyc --out trace.txt
disrobe py deob obfuscated.py --out clean.py --cleanup
disrobe pyinstaller extract onefile.exe --out out/
disrobe pyarmor unpack protected.py --out out/ # add --allow-dynamic only on trusted samples
disrobe nuitka extract app.exe --out out/
# JavaScript / TypeScript / WebAssembly
disrobe js deob bundle.min.js --out clean.js
disrobe js unbundle app.bundle.js --out src/
disrobe wasm decompile module.wasm --target rust --out lifted.rs
# JVM / Android / .NET
disrobe jvm decompile app.apk --backend jadx --out src/
disrobe dotnet decompile App.dll --backend ilspy --out src/
# Native
disrobe native unpack packed.exe --out unpacked.bin
disrobe native symbols app.exe --out symbols.json
disrobe go recover app --out symbols.json
# Mobile / Lua / others
disrobe hermes decompile index.android.bundle --out surface/
disrobe flutter dump libapp.so --out layout.json
disrobe lua decompile script.luac --out script.lua
disrobe ruby decompile app.rb
disrobe php decode payload.php --out clean.php
disrobe beam parse module.beam
Structured output
Every command accepts the global --json, --ndjson, or --sarif flags for machine-readable output. SARIF 2.1.0 drops straight into GitHub code scanning:
disrobe scan firmware.bin --sarif > findings.sarif
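A consumer that does not go through GitHub code scanning can read the SARIF file directly. The sketch below relies only on the standard SARIF 2.1.0 layout (runs, results, ruleId, level, message.text); the rule id in the sample is made up, and the rule ids disrobe actually emits are not shown here:

```python
import json


def summarize_sarif(text: str) -> list[tuple[str, str, str]]:
    """Flatten a SARIF log into (ruleId, level, message) rows."""
    doc = json.loads(text)
    rows = []
    for run in doc.get("runs", []):
        for result in run.get("results", []):
            rows.append((
                result.get("ruleId", "<no rule>"),
                # SARIF treats a missing level as "warning" by default.
                result.get("level", "warning"),
                result.get("message", {}).get("text", ""),
            ))
    return rows


# Hypothetical sample log, hand-written for illustration.
sample = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "example-tool"}},
        "results": [
            {"ruleId": "EXAMPLE-0001", "level": "error",
             "message": {"text": "embedded credential"}},
            {"ruleId": "EXAMPLE-0002", "message": {"text": "packed section"}},
        ],
    }],
})

for row in summarize_sarif(sample):
    print(row)
```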
Inspecting a run
After any chain or pass, inspect what landed:
disrobe status # per-stage artifact counts, sizes, manifests in ./out/
disrobe context --out recovered/ # per-pass status, confidence tiers, verdict, provenance
disrobe envelope inspect out/final/module.dr
disrobe verify out/final/module.dr
Generating an LLM sidecar
Add --llm to any pass to emit a schema-conforming metadata bundle next to the recovered artifact, ready for a coding agent to consume:
disrobe py decompile module.pyc --out recovered/ --llm --llm-briefs
See LLM sidecar and provenance for the full category and pack model.
Comparison
How disrobe sits next to the established tool for each ecosystem. The aim is not to win every cell. For several targets a mature dedicated tool already exists and is the better choice for that one format; disrobe's value is doing all of them from one binary, behind a deterministic pipeline that records what it recovered and what it could not.
Recovery is always measured against an independent reference, never the tool's own output, and lossy results are reported as measured. The limits are listed at the bottom of this page and in Limits.
Source recovery (bytecode to source)
| Ecosystem | Established tools | Where disrobe differs |
|---|---|---|
| Python | pycdc, pylingual, uncompyle6, decompyle3 | One engine spans 3.6-3.15; each construct is recompiled and diffed opcode-for-opcode, reaching 92.76% per-code-object equivalence on the CPython 3.14 stdlib (5831 of 6286). uncompyle6 stops near 3.8 and decompyle3 near 3.9; the ML decompilers are non-deterministic and carry benchmark-contamination risk. |
| JVM / Kotlin / Scala | CFR, Vineflower, Procyon, Fernflower | In-house Rust decompiler is the default; on the EdgeCases corpus at least 93.1% of decompiled methods recompile error-free under real javac (asserted floor 122 of 131, 128 measured on JDK 25). CFR, Vineflower, Procyon, and jadx remain available as --backend. Adds ProGuard/R8 mapping replay and obfuscator peel in the same pass. |
| Android (DEX) | jadx, dex2jar + a Java decompiler | Direct DEX-to-Java without the dex2jar hop; APK signature v1-v3 verification and RASP detection in one binary. Recovered bodies are graded by the real JVM verifier (-Xverify:all) on the committed dex corpus, where 99% of verifiable classes pass (102 of 103). On the gitignored real FOSS apks the lifter self-reports a body for 89% to 92.5% of methods (transmissionic 92.5%, enrecipes 90.7%, rustdesk 89.0%), not verifier-attested; a live-range-splitting pass recovers bodies whose registers carry conflicting JVM types across control-flow joins (307 re-hosted bodies verify clean), leaving only the one class whose Kotlin supertype is unbundled in the corpus. |
| .NET / CIL | ILSpy, dnSpy, de4dot | In-house CIL to C#/F#/VB plus an actively maintained obfuscator reverser; de4dot has been unmaintained since 2020. ILSpy, dnSpy, and de4dot stay available as --backend. |
| Lua | unluac, luadec | Covers 5.1-5.4, LuaJIT, full Luau (all 82 opcodes, child-proto linking), and GLua in one decoder, plus 11 obfuscator reversers. For real IronBrew2 2.7.0 register-VM output, disrobe reconstructs the opcode permutation and xor key from the bootstrap, decodes the constant pool, and lifts the VM bytecode back to runnable Lua in standard and MAX mode, validated by a real-Lua execution differential against the original on a committed sample corpus (the execution differentials CI job). MoonSec-shape recovery runs against a synthetic bootstrap of our own design pending a real sample. Runtime-key and anti-tamper variants return unrecovered. unluac is the maturity benchmark for stock luac. |
| Ruby | none (no FOSS YARV/mruby decompiler) | Decompiles MRI/YARV 2.6-3.4 and mruby, with a recompile-equivalence oracle reaching 100% opcode-equivalence on a greeter fixture and 85% on a mixed-construct megafile (about 90% recoverable, the rest an oracle comment-stripping artifact). No comparable open-source Ruby decompiler exists. |
| BEAM (Erlang / Elixir) | beam_disasm (disassembly only) | Lifts BEAM chunks to Core Erlang and recovers Elixir from the Dbgi chunk. The standard tooling disassembles but does not reconstruct source. |
| PHP | none for modern bytecode (ionCube/SourceGuardian are commercial) | Detects ionCube/SourceGuardian/Zend Guard and walls them (the decrypt key is native-loader-resident, so no static opcode recovery), with a partial op_array skeleton for the legacy statically-keyed cases; full Phar archive extraction. No maintained FOSS PHP-bytecode decompiler exists. |
| WebAssembly | wasm-decompile, wasm2c, wasm-tools | Lifts to typed Rust, TypeScript, WAT, or C with DWARF recovery and four obfuscator reversers (Jscrambler-WASM, Wobfuscator, Tigress-via-Emscripten, Wasmixer), rather than emitting a single fixed target. A fifth family, wasm-name-obfuscator, is detected and classified but its hex renames are not reversible. |
| JavaScript / TypeScript | webcrack, synchrony, REstringer; for V8 .jsc, View8, ghidra_nodejs, jscdecompiler.com | obfuscator.io (full pipeline), JS-Confuser, Jscrambler, esoteric encoders, V8/Bytenode, and 11 bundlers with scope-aware renaming and source-map reconstruction. For .jsc, the self-contained static option: View8 needs a patched V8 binary, ghidra_nodejs needs Ghidra, jscdecompiler.com is an online service; disrobe recovers the user-string layer plus structure offline with Node 18-24 detection. Internalized identifiers (console, log) live in V8's read-only snapshot heap, absent from the .jsc, an information-theoretic boundary the binary-dependent tools cross and disrobe does not. |
| ActionScript 3 / Flash | JPEXS Free Flash Decompiler (FFDec) | FFDec is the mature, full-graph SWF and AS3 decompiler and remains the better tool for deep Flash work. disrobe parses SWF and disassembles ABC bytecode as part of the unified chain; this path is local-corpus only and not CI-validated. |
Unpacking and extraction (byte-exact where possible)
| Ecosystem | Established tools | Where disrobe differs |
|---|---|---|
| Python freezers | pyinstxtractor (PyInstaller only), per-freezer scripts | PyInstaller, cx_Freeze, py2exe, PyOxidizer, shiv, pex, Briefcase, and Nuitka unpacked by one tool, then handed to the decompiler. Nuitka onefile/standalone is byte-exact; its native bodies are lossy. |
| Python protectors | Pyarmor-Static-Unpack-1shot, GDATA Pyarmor-Tooling (PyArmor v8-9.x) | PyArmor v6-v9-pro runtime-key extraction and AES decrypt (72 of 72 real-corpus samples) paired with an in-tree decompiler for end-to-end source, plus SourceDefender .pye decryption; the v3-v5 RSA-wrapped-key tier is an information-theoretic wall. |
| Native packers | unipacker; UPX (unpacks UPX only); per-packer one-off scripts | General-purpose unpacker for the tier in one static Rust binary (no Python or unicorn): UPX, kkrunchy, NSPack, Petite, MPRESS, MEW, ASPack, and PECompact are decompressed via our own decoders and an in-house x86 stub emulator, scored against real committed originals. NSPack reaches ~99% of the content section. ASPack and PECompact rebuild the decompressed section image at its load RVA on committed samples: the packed .text of near-random entropy and zero resolvable calls drops to ~6.2-6.5 with dozens to hundreds of disassembler-resolvable intra-code calls (no disk-aligned byte-identity reference for the overlay). MEW rebuilds a flat image of the committed Sysinternals samples, read as the entropy drop to ~4.2-4.9 and tens of thousands of decoded instructions. FSG decodes via the same aPLib-clone path but ships no committed fixture, so no number reproduces from a checkout. Yoda's Crypter recovers .rsrc byte-identical and decrypts .text to full plaintext through the stub emulator; Yoda's Protector and the virtualized tier (VMProtect, Themida, Enigma) are detect + carve (runtime-only keys). |
| React Native Hermes | hermes-dec, hbctool | Bytecode v60-v96. A structured lifter, not a full decompiler: variadic call args are marked <arg?> where the Hermes frame-register layout is not modeled, and no-op opcodes count as fallback. On a 66 MiB production bundle, disrobe parsed the 122,633-function table with no parse failure (header scale, not body lifts, measured locally, not a CI-gated correctness percentage; the committed hello fixture is the CI-gated case). |
| Containers and archives | unsquashfs, 7-Zip, binwalk, asar, format-specific CLIs | Detects 98 container/archive/filesystem/firmware formats and writes member bytes in-tree for all 98 (ZIP, tar, 7z, .deb, .rpm, MSI, NSIS, Docker/OCI; bare gz/bz2/zst/lzma/lzip/lz4-frame/zlib/.Z streams; legacy ar/arj/arc/lzh/lzop/uzip/xalz/par2/ELF-overlay; embedded-linux squashfs/cramfs/ext4/romfs/minixfs/jffs2/ubifs/yaffs/erofs/ntfs/android-sparse/btrfs-send; GPT/MBR/VHD/VHDX/WIM disk images walked through FAT; vendor firmware via D-Link AES, EnGenius XOR, Autel table, QNAP PC1 and CRC-verified Netgear/Xiaomi/Tesla carves) with a recursive carve-everything engine and universal zip-slip and decompression-bomb guards; WIM XPRESS/LZX/LZMS chunk payloads decompress in-tree, InnoSetup writes the decoded setup-data block stream and InstallShield its stored and zlib members. Member-byte decoding is byte-exact for the documented codecs; ARJ method 4, ARC methods 5-7, EROFS microlzma and the compact index, StuffIt compressed forks, and OTP-AES airoha firmware are carved or reported rather than decoded. binwalk surveys firmware but does not decrypt the vendor schemes or walk these filesystems to byte-exact files. |
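The entropy figures quoted in the native packers row (near-random when packed, roughly 6.2-6.5 after unpacking) read most naturally as Shannon entropy in bits per byte, where 8.0 is the maximum. That interpretation is an assumption; the exact estimator disrobe uses is not specified here. A minimal sketch of the measure:

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


constant = shannon_entropy(b"\x00" * 1024)    # a single repeated byte value
uniform = shannon_entropy(bytes(range(256)))  # every byte value exactly once
print(constant, uniform)
```

Compressed or encrypted sections sit close to 8 bits per byte, while ordinary machine code is noticeably lower, which is why a drop in entropy is usable as evidence that a section was really decompressed.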
Native and AOT-compiled (symbols and demangling, not source)
These targets compile to machine code, so function bodies are not recoverable as source. The deliverable is unpacking, symbol and metadata recovery, and demangling that gives a disassembler cleaner input.
| Ecosystem | Established tools | Where disrobe differs |
|---|---|---|
| Native (PE/ELF/Mach-O) | Ghidra, IDA, Binary Ninja | Not a competitor on raw decompilation. The unpack, symbol-recovery, and chain-detect layer that feeds those tools cleaner input. DWARF/PDB/STABS across x86, ARM, RISC-V, MIPS, PowerPC, SPARC, and eBPF. |
| Go | GoReSym, redress | Symbol recovery plus garble undo and embedded-FS walking; pclntab eras go1.2 through go1.26, with type-name resolution gated above 85%. |
| Swift / Objective-C | class-dump, swift-demangle | Mach-O class-dump with SwiftConfidential and SwiftShield rename-undo in one pass. Bodies stay native; the output is the interface and restored names. |
| Flutter (Dart) | blutter, doldrums | Two paths: a Dart kernel (.dill) yields byte-exact source bodies from the kernel source table, while an ARM64 AOT libapp.so erases bodies and recovers class and method names and library URIs plus disassembly. Both are CI-gated on a committed sample. |
| Nim / Zig / Crystal | binary disassemblers, language demanglers | Detect, demangle, and recover symbols and metadata from each binary's own symtab and type table. Source is not recoverable. |
Where disrobe is not the right tool
- Deep, interactive Flash reversing: use FFDec.
- Full native decompilation to C: use Ghidra, IDA, or Binary Ninja. disrobe prepares input for them; it does not replace them.
- Devirtualizing a commercial protector against a running licensed sample (VMProtect, Themida, Enigma): disrobe detects the protector and carves intact sections. Its generic devirtualizer (disrobe native devirt) recovers the interpreter, fingerprints each handler behaviorally, and lifts the handler bytecode to a re-executable IR plus pseudo-code, validated end-to-end on a self-authored Tigress-shape VM; the commercial front-ends are extended from published RE write-ups, not a running commercial sample, and a handler stream assembled at run time from a per-machine key is the wall.
- Any AOT-compiled language (Nim, Zig, Crystal, Flutter Dart, Swift bodies): source bodies are gone at compile time; demangling and symbol recovery are the ceiling, the same ceiling every tool faces.
Architecture overview
disrobe is a workspace of focused Rust crates orchestrated by one CLI. Every recovery flows through the same shape: bytes in, a chain of passes that each transform the artifact up the IR ladder, an optional .dr envelope for content-addressed caching, and a final emit.
The shape of a recovery
┌──────────────────────────────────────────────┐
input bytes ──> │ detect ──> pass 1 ──> pass 2 ──> ... │ ──> recovered artifact
└──────────────────────────────────────────────┘
│ │ │
│ │ └─ each pass: raw -> disasm -> mir -> hir -> surface
│ └─ capability resolver gates what can run next
└─ chain runner records chain.json + stage mirrors + recovery.json
Crate map
The workspace splits into a small set of shared cores and one crate per ecosystem pass:
| Crate | Role |
|---|---|
| disrobe-core | Shared types: error codes (DR-<DOMAIN>-<NNNN>), progress, confidence tiers, secret/credential scanning, cyclomatic metrics. |
| disrobe-ir | The five-rung IR ladder, the .dr envelope (rkyv hot + postcard cold + BLAKE3 root), transcoders, capability descriptors. |
| disrobe-binfmt | Container, archive, filesystem, and firmware layer (98 formats detected, 98 extracted in-tree, plus a recursive carve-everything engine) with shared zip-slip and decompression-bomb quota machinery. |
| disrobe-llm-metadata | The --llm sidecar: 18 categories, 4 packs, AGENTS.md / SKILL.md brief generation. |
| disrobe-mcp | The rmcp Model Context Protocol companion wired to disrobe serve --mcp. |
| disrobe-py-marshal | CPython marshal reader: code objects across 1.0-3.15. |
| disrobe-pass-* | One crate per ecosystem (py-decompile, py-disasm, py-deob, pyarmor, pyinstaller, pyfreeze, nuitka, js-deob, wasm-deob, jvm, dotnet, native, go, lua, php, ruby, beam, pickle, swift-objc, as3, mobile, sourcedefender). The native pass adds the iced-backed disassembler, symbol-independent function discovery, call graph and basic-block CFG, instruction re-encode/relocate, C++ RTTI/vtable recovery, and emulation-driven string recovery. |
| disrobe-query | Queryable-IR layer over the disassembled native code: functions, calls-to, xrefs, string-decoders, complexity, capability sites, behind disrobe query. |
| disrobe-capabilities | Capability rule engine over the queryable IR, mapping matched behaviors to MITRE ATT&CK and MBC, behind disrobe capabilities. |
| disrobe-cli | The disrobe binary: argument parsing, output formats, the chain runner, the daemon. |
| disrobe-validator | Walks a corpus and validates every fixture round-trips, used in CI. |
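The zip-slip guard mentioned for disrobe-binfmt protects against archive members whose names climb out of the extraction directory. The sketch below shows the general technique in Python, not disrobe's Rust implementation; the function name is illustrative:

```python
import os


def safe_member_path(out_dir: str, member_name: str) -> str:
    """Resolve an archive member name under out_dir, rejecting any escape."""
    root = os.path.realpath(out_dir)
    # An absolute member name makes os.path.join discard the root entirely,
    # which the containment check below then catches.
    target = os.path.realpath(os.path.join(root, member_name))
    if os.path.commonpath([root, target]) != root:
        raise ValueError(f"member escapes extraction root: {member_name!r}")
    return target
```

Resolving both paths before comparing them matters: a plain string-prefix test would accept names such as a/../../evil and would be fooled by symlinks inside the output tree.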
The Pass trait
Every pass implements one trait. A pass takes a .dr envelope at some rung, does its work, and returns an envelope one or more rungs higher, declaring which capabilities it requires on the way in and which it produces on the way out. Because every pass speaks the same envelope dialect, the chain runner can compose any pass with any other as long as the capability resolver is satisfied. This is what lets PyInstaller -> PyArmor -> .pyc decompile work as a single disrobe auto invocation rather than three hand-wired steps.
Each pass also exposes a standardized set of emits (source, disasm, ast, cfg, ir, manifest, sourcemap, symbols, strings, imports, signatures, report). A pass that cannot produce a given emit writes an explicit applicable: false stub with the DR-IR-NotApplicable code rather than silently dropping it.
The four pillars
The rest of this section expands each pillar:
- The five-rung IR ladder: the common intermediate representation every artifact climbs.
- Passes and the capability model: how passes declare and resolve what they need.
- The chain runner: auto-detection, stage mirrors, depth and cycle caps.
- The .dr envelope: the content-addressed wire format that makes caching deterministic.
Determinism is a design constraint, not a feature
The entire architecture exists to make output reproducible. There is no model in the decompile path. RNG-backed backends take an explicit --seed. Timing tokens are scrubbed from golden outputs so that two runs hash identically. The .dr envelope is content-addressed (BLAKE3) rather than timestamp-addressed, so a cache hit is provably the same bytes. This is what makes disrobe output usable as a forensic baseline and as a disrobe diff input across versions.
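Content addressing is what turns "cache hit" into "same bytes". A minimal in-memory sketch of the idea follows. Note the loud assumption: Python's standard library has no BLAKE3, so BLAKE2b stands in for it here, and the ContentStore class is hypothetical rather than the .dr envelope format:

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key is derived only from the bytes."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    @staticmethod
    def address(payload: bytes) -> str:
        # Stand-in hash: BLAKE2b, NOT the BLAKE3 root that disrobe records.
        return hashlib.blake2b(payload, digest_size=32).hexdigest()

    def put(self, payload: bytes) -> str:
        key = self.address(payload)
        self._blobs.setdefault(key, payload)  # storing twice is a no-op
        return key

    def get(self, key: str) -> bytes:
        payload = self._blobs[key]
        if self.address(payload) != key:
            raise ValueError("stored bytes no longer match their address")
        return payload


store = ContentStore()
first = store.put(b"recovered artifact")
second = store.put(b"recovered artifact")
other = store.put(b"different artifact")
```

Because the key never includes a timestamp, host name, or run id, two machines that recover the same bytes arrive at the same address independently.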
The five-rung IR ladder
Every artifact disrobe recovers climbs the same five-rung intermediate-representation ladder. A pass never jumps arbitrarily; it lifts an artifact from one rung to the next, which is what lets passes from completely different ecosystems compose through a shared envelope.
Raw ──> Disasm ──> MIR ──> HIR ──> Surface
bytes opcodes mid high source
The rungs are defined once in disrobe-ir as Rung::{Raw, Disasm, Mir, Hir, Surface} and are the same for Python bytecode, JVM classfiles, .NET CIL, Lua chunks, WebAssembly, and native binaries.
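The ordering of the rungs is what makes chains checkable. The sketch below mirrors the five names in Python; the real definition is the Rust enum in disrobe-ir, and the chain_is_valid helper is hypothetical. It allows a pass to stay on its rung because unpackers are described as taking raw bytes and emitting raw bytes:

```python
from enum import IntEnum


class Rung(IntEnum):
    RAW = 1
    DISASM = 2
    MIR = 3
    HIR = 4
    SURFACE = 5


def chain_is_valid(steps: list[tuple[Rung, Rung]], start: Rung = Rung.RAW) -> bool:
    """Each step is (takes, produces); a chain must never skip its input rung
    or move back down the ladder."""
    current = start
    for takes, produces in steps:
        if takes != current or produces < takes:
            return False
        current = produces
    return True


good = [(Rung.RAW, Rung.RAW), (Rung.RAW, Rung.DISASM), (Rung.DISASM, Rung.SURFACE)]
bad = [(Rung.RAW, Rung.DISASM), (Rung.MIR, Rung.HIR)]  # nothing produced MIR
print(chain_is_valid(good), chain_is_valid(bad))
```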
Rung 1: Raw
The original bytes, exactly as received, wrapped in an envelope with its detected format and BLAKE3 root recorded. Unpacking and decryption passes (UPX unpack, PyArmor decrypt, PyInstaller extract, container extraction) operate at this rung: they take raw bytes and emit raw bytes that are structurally cleaner but still pre-disassembly. This is where byte-exact recovery lives: a UPX-unpacked image at the Raw rung carries the original executable code (.text) and exception data (.pdata) bit-for-bit, with only the loader-rebuilt relocations and IAT (resolved by the OS loader at run time, never stored in the packed file) left as residual.
Rung 2: Disasm
The artifact decoded into a per-instruction listing: CPython opcodes, JVM bytecode, CIL, Lua VM instructions, Wasm operators, or native machine instructions via Capstone/iced-x86/yaxpeax. Disassembly is faithful and lossless: a 1:1 decode of the bytes, with offsets preserved. disrobe py disasm, disrobe beam disasm, disrobe as3 disasm, and disrobe pickle disasm all stop here deliberately.
Rung 3: MIR (mid-level IR)
A normalized, control-flow-aware representation: basic blocks, an explicit CFG, stack effects resolved, super-instructions un-fused, jump targets relocated. This is the rung where decompilers do the structural work that separates a faithful disassembly from readable source. For Python specifically, the frame-tree pre-pass runs here: the nested source-construct tree is reconstructed from the 3.11+ exception table before the instruction walk, which eliminates the single-pass stack-walker desync that other Python decompilers suffer from.
Rung 4: HIR (high-level IR)
Structured control flow recovered: loops, conditionals, try/except/finally, with-blocks, comprehensions, pattern-match arms, generator state machines. Names, types, and idioms are recovered where the source language has them (Kotlin idiom recovery from JVM bytecode, C# vs F# vs VB shapes from CIL). The HIR is language-shaped but not yet rendered as text.
Rung 5: Surface
The final rendered source: Python, Java, Kotlin, C#, F#, VB, Rust, TypeScript, WAT, C, Lua, Erlang, or whatever the target dictates. For Python this is where the round-trip metric runs: the emitted file is recompiled on the matching interpreter and compared opcode-for-opcode against the original. The verdict is recorded as PERFECT (byte-identical), SEMANTIC (same program, different layout), or CODE_DIFF (a real bug, fixed before ship).
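The recompile-and-compare idea can be tried on a small scale with the standard dis module. This is a sketch of the technique only: it compiles both sources on whichever interpreter runs it (the real oracle uses the interpreter matching the original bytecode), and the fingerprint it compares is an illustrative choice, not disrobe's equivalence definition:

```python
import dis
from types import CodeType


def code_objects(code: CodeType):
    """Yield a code object and every nested code object, depth-first."""
    yield code
    for const in code.co_consts:
        if isinstance(const, CodeType):
            yield from code_objects(const)


def fingerprint(code: CodeType) -> tuple:
    """Opcode stream plus the tables the opcode arguments index into.
    Line numbers and file names are deliberately left out."""
    ops = tuple((ins.opname, ins.arg) for ins in dis.get_instructions(code))
    consts = tuple(repr(c) for c in code.co_consts if not isinstance(c, CodeType))
    return (code.co_name, ops, consts, code.co_names, code.co_varnames)


def equivalence(original_src: str, recovered_src: str) -> tuple[int, int]:
    """Return (matching code objects, code objects in the original)."""
    a = [fingerprint(c) for c in code_objects(compile(original_src, "<orig>", "exec"))]
    b = [fingerprint(c) for c in code_objects(compile(recovered_src, "<rec>", "exec"))]
    return sum(1 for x, y in zip(a, b) if x == y), len(a)


original = "def f(x):\n    return x + 1\n"
relaid = "def f(x):\n\n    # recovered by a decompiler\n    return x + 1\n"
wrong = "def f(x):\n    return x + 2\n"
```

Counting per code object rather than per file is what makes a figure like "5831 of 6286" possible: one bad function lowers the score by one unit instead of failing the whole module.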
Why the ladder matters
- Composition. Because a pass only declares "I take rung N, I produce rung N+1," the chain runner can stitch passes from different crates without any of them knowing about each other.
- Honest partial recovery. If a decompiler can climb to HIR but not cleanly render Surface, it can stop and emit the HIR/Disasm artifact with a PARTIAL or SKELETON confidence tier rather than fabricating source.
- Transcoding. disrobe-ir carries a transcode registry keyed on (from_version, from_rung, to_version, to_rung), so an envelope can be migrated across schema versions while staying at the same rung. disrobe envelope migrate-check validates that such a path exists and that every required capability stays satisfiable.
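A registry keyed on (from_version, from_rung, to_version, to_rung) is effectively a directed graph, and checking that a migration path exists is a breadth-first search over it. The sketch below illustrates that reading; the version numbers are invented, the function is hypothetical, and the capability check that migrate-check also performs is left out:

```python
from collections import deque


def migration_path(registry, start, goal):
    """registry: iterable of (from_version, from_rung, to_version, to_rung).
    start and goal are (version, rung) pairs. Returns the shortest list of
    (version, rung) hops, or None if no path exists."""
    edges = {}
    for from_version, from_rung, to_version, to_rung in registry:
        edges.setdefault((from_version, from_rung), []).append((to_version, to_rung))
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(edges.get(path[-1], [])):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Invented schema versions 1..3, all staying at the "mir" rung.
registry = {(1, "mir", 2, "mir"), (2, "mir", 3, "mir")}
found = migration_path(registry, (1, "mir"), (3, "mir"))
missing = migration_path(registry, (1, "mir"), (4, "mir"))
```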
Confidence tiers
Surface output is tagged with one of four tiers, defined in disrobe-core:
| Tier | Meaning |
|---|---|
| exact | Reserved for byte-roundtrip-verified output. |
| semantic | Same program, different but equivalent layout. |
| partial | Some bodies recovered, some left as disasm or stubs. |
| skeleton | Structure recovered, bodies emitted as pass/placeholder. |
These tiers propagate into the recovery.json sidecar and the --llm bundle, so a downstream consumer always knows how much to trust each recovered symbol.
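A downstream consumer will usually want to filter by tier. The sketch below assumes the four tiers are ordered exact > semantic > partial > skeleton, as the table order suggests, and works on a plain symbol-to-tier mapping; the actual recovery.json schema is not shown here, so the input shape is hypothetical:

```python
# Least trusted to most trusted; the ordering is inferred from the tier table.
TIER_ORDER = ["skeleton", "partial", "semantic", "exact"]


def at_least(symbols: dict[str, str], minimum: str) -> list[str]:
    """Names of symbols whose tier is `minimum` or better."""
    floor = TIER_ORDER.index(minimum)
    return sorted(
        name for name, tier in symbols.items()
        if TIER_ORDER.index(tier) >= floor
    )


def weakest(symbols: dict[str, str]) -> str:
    """The lowest tier present, useful as a whole-artifact summary."""
    return min(symbols.values(), key=TIER_ORDER.index)


# Hypothetical per-symbol tiers for one recovered module.
recovered = {"parse": "exact", "render": "semantic", "main": "partial", "hook": "skeleton"}
trusted = at_least(recovered, "semantic")
overall = weakest(recovered)
```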
Passes and the capability model
A pass is the unit of work in disrobe. Each pass lives in its own crate, implements a shared trait, and declares the capabilities it requires and produces. The capability resolver is what allows arbitrary passes to chain.
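One way to picture the resolver: each pass names the capabilities it requires and produces, and a pass becomes runnable once everything it requires is available. The sketch below is a simplified greedy resolver under that assumption; the capability strings are invented for illustration and the function is not disrobe's API:

```python
def resolve_order(passes: dict[str, tuple[set[str], set[str]]],
                  available: set[str]) -> list[str]:
    """passes maps name -> (requires, produces). Returns a runnable order,
    or raises if some pass can never have its requirements met."""
    have = set(available)
    pending = dict(passes)
    order = []
    progress = True
    while pending and progress:
        progress = False
        for name in sorted(pending):          # sorted for deterministic output
            requires, produces = pending[name]
            if requires <= have:
                order.append(name)
                have |= produces
                del pending[name]
                progress = True
                break
    if pending:
        raise ValueError(f"unsatisfied capabilities for: {sorted(pending)}")
    return order


# Invented capability names modelling PyInstaller -> PyArmor -> .pyc decompile.
chain = {
    "py-decompile": ({"pyc"}, {"python-source"}),
    "pyarmor": ({"pyarmor-blob"}, {"pyc"}),
    "pyinstaller": ({"pe-bytes"}, {"pyarmor-blob"}),
}
order = resolve_order(chain, {"pe-bytes"})
print(order)
```

Iterating candidates in sorted order keeps the result independent of dictionary insertion order, in keeping with the determinism constraint described earlier.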
Registered passes
Run disrobe passes for the live list. As of the current release:
| Pass | Capability summary |
|---|---|
| pyarmor | PyArmor v6 / v7 (dynamic-hook) + v8 / v9-pro static unpack. |
| pyinstaller | PyInstaller 2.x-6.20+ extract + AES-CTR / CFB decrypt. |
| pyfreeze | cx_Freeze / py2exe / shiv / pex / PyOxidizer / Briefcase detect + extract. |
| nuitka | --onefile payload extract (zstd) + symbol / constants scan. |
| py | Deobfuscate (peel + cleanup) / disassemble / decompile / extract / SourceDefender decrypt. |
| js | Deobfuscate (string-array + unminify + scope-aware rename) / unbundle 11 bundlers. |
| wasm | Analyze / decompile (JSON / Rust / TypeScript / WAT / C) / reverse 4 obfuscator families (plus wasm-name-obfuscator detect + classify). |
| envelope | .dr create / inspect / verify / diff / migrate-check. |
| query | Query a Disasm-rung .dr IR: functions / calls-to / xrefs-to / string-decoders / complexity-over / capability sites. |
| capabilities | Match a binary against built-in capability rules with evidence addresses and MITRE ATT&CK / MBC tags. |
| native | Ghidra-headless decompile / symbol dump / unpack / devirt / entropy / crypto signatures / disasm / callgraph / patch / sigmaker / diff. |
| jvm | Classfile / .jar / .dex / .apk decompile via CFR / Vineflower / Procyon / JADX. |
| apk | AndroidManifest.xml decode + resource id-to-name map + signer-cert SHA-256. |
| dotnet | .NET PE decompile via ILSpy / dnSpyEx / de4dot + protector detection. |
| hermes | React Native Hermes bundle disasm + JS surface lift. |
| macho | Mach-O / fat / .ipa dump + ObjC + Swift class-dump. |
| lua | Lua 5.1-5.4 / LuaJIT / Luau / GLua decompile + obfuscator peel. |
| php | Encoder decode (phar / ionCube / SourceGuardian / ZendGuard) + eval-chain peel. |
| ruby | MRI / YARV / mruby / JRuby / TruffleRuby / Ruby2Exe / Ocra analysis. |
| beam | .beam IFF parse + Core Erlang lift + Code chunk disasm. |
| pickle | Disasm + decompile + safety + symbolic trace + polyglot + ML model detect. |
| go | pclntab + moduledata + garble report + embed.FS extraction. |
| swift | Swift / ObjC class-dump + SwiftShield undo + Confidential XOR-decrypt. |
| as3 | ActionScript 3 .swf DoABC tag disasm. |
| flutter | Dart AOT / libapp.so dump + obfuscation_map parse. |
| chain | Explicit pass pipeline orchestrator. |
| serve | HTTP daemon + WebSocket stream + LSP-stdio + gRPC + MCP. |
The capability resolver
Rather than hard-coding which pass follows which, each pass declares:
- Requires: the capabilities and IR rung its input envelope must already carry.
- Produces: the capabilities and IR rung its output envelope will carry.
When the chain runner picks the next pass, it matches the current envelope's produced capabilities against each candidate pass's requirements. A pass only runs if its requirements are satisfiable. This is why disrobe auto can detect that a PyInstaller archive contains a PyArmor-protected module and route it through the unpack-then-decompile chain without any per-combination glue code.
Capabilities are versioned. A pass can require, for example, "a CPython 3.12 code object at the Disasm rung," and the resolver will refuse to feed it a 2.7 object. This keeps the chain sound across the wide version ranges disrobe supports.
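The matching rule can be sketched in a few lines. This is a minimal illustration, not disrobe's resolver: the Envelope and Pass classes, the pass names, and the capability strings such as cpython-3.12-code are hypothetical, chosen only to show how a versioned requirement keeps a 2.7 object away from a 3.12 decompiler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    rung: str                # IR rung the envelope currently sits at
    capabilities: frozenset  # capabilities it already carries


@dataclass(frozen=True)
class Pass:
    name: str
    requires_rung: str
    requires: frozenset
    produces_rung: str
    produces: frozenset

    def accepts(self, env: Envelope) -> bool:
        # Runnable only if the envelope is at the required rung and
        # already carries every required capability.
        return env.rung == self.requires_rung and self.requires <= env.capabilities


def runnable(passes, env):
    """Return the names of the passes whose requirements the envelope satisfies."""
    return [p.name for p in passes if p.accepts(env)]


# Hypothetical pass declarations; real capability names are not shown in this README.
PASSES = [
    Pass("py-decompile-3.12", "disasm", frozenset({"cpython-3.12-code"}),
         "surface", frozenset({"python-source"})),
    Pass("py-decompile-2.7", "disasm", frozenset({"cpython-2.7-code"}),
         "surface", frozenset({"python-source"})),
]

env = Envelope("disasm", frozenset({"cpython-2.7-code"}))
print(runnable(PASSES, env))  # ['py-decompile-2.7']
```

Because each pass only states what it needs and what it yields, adding a new pass to the list changes which chains are possible without touching any existing pass.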
Standardized emits
Every pass exposes the same twelve emit kinds:
source disasm ast cfg ir manifest sourcemap symbols strings imports signatures report
Pass --emit source,disasm,report (comma-separated) to select a subset, or --all-emits on passes that support it to write every kind. A pass that cannot produce a given emit writes an explicit stub:
{
"schema": "disrobe.emit.stub/v0",
"pass": "pyarmor",
"emit_kind": "source",
"applicable": false,
"error_code": "DR-IR-NotApplicable",
"reason": "pyarmor pass does not produce source; chain with disrobe py decompile"
}
A downstream tool can request any emit from any pass and get a well-formed, self-describing answer: either the artifact or a "not applicable here, chain with X."
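A downstream consumer could tell a stub from a real artifact along these lines. The sketch relies only on the fields shown in the stub above; the helper names are made up for illustration, and it assumes anything that is not a JSON object with the stub schema is a real artifact.

```python
import json

STUB_SCHEMA = "disrobe.emit.stub/v0"


def is_stub(doc) -> bool:
    """True when the parsed document is a 'not applicable' emit stub."""
    return (isinstance(doc, dict)
            and doc.get("schema") == STUB_SCHEMA
            and doc.get("applicable") is False)


def describe(raw: str) -> str:
    """Summarize an emit file: either a stub explanation or 'artifact present'."""
    try:
        doc = json.loads(raw)
    except json.JSONDecodeError:
        return "artifact present"  # e.g. recovered source text, not JSON
    if is_stub(doc):
        return f"{doc['pass']}: no {doc['emit_kind']} ({doc['error_code']}): {doc['reason']}"
    return "artifact present"


raw_stub = json.dumps({
    "schema": "disrobe.emit.stub/v0",
    "pass": "pyarmor",
    "emit_kind": "source",
    "applicable": False,
    "error_code": "DR-IR-NotApplicable",
    "reason": "pyarmor pass does not produce source; chain with disrobe py decompile",
})
print(describe(raw_stub))
```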
Error codes
Every failure carries a DR-<DOMAIN>-<NNNN> code rendered through miette diagnostics. Look any code up with:
disrobe explain DR-PYARM-0050
disrobe explain CLI-1 # short form also works
The chain runner
The chain runner is what turns a pile of single-purpose passes into a one-command recovery. It backs both disrobe auto (detect and chain automatically) and disrobe chain (drive an explicit pipeline).
Auto-detection
disrobe auto suspect.exe --out recovered/
disrobe auto fingerprints the input, picks the first pass, runs it, then re-fingerprints the output and repeats, following the capability resolver, until no further pass applies or the depth cap is hit. Detection spans 23 pass crates: native packer, PyArmor, JS deob, Python deob, container formats, SourceDefender, py-decompile, py-disasm, PyInstaller, JVM, .NET, Go, mobile, AS3, BEAM, Lua, Ruby, shell, PHP, Nuitka, Wasm, pyfreeze, and swift-objc.
Representative chains:
- PE -> UPX -> rust-demangle
- PyInstaller -> PyArmor -> .pyc decompile
- APK -> dex -> JADX + Smali + manifest
- Electron .asar -> webcrack -> source
Explicit chains
When you want to pin the pipeline rather than auto-detect:
disrobe chain input.bin --chain 'pyarmor+py-decompile' --out recovered/
disrobe chain input.bin --chain 'auto:8' --out recovered/ # auto-detect, depth 8
disrobe chain input.bin --chain 'pyarmor+py-decompile' --chain-pin pyarmor@0.10.0,py-decompile@0.10.0
--chain-pin locks each pass to a specific version so a recovery is reproducible against an exact pass build.
Depth and cycle safety
Adversarial input can try to make a chain recurse forever (an archive nested inside itself, a packer that re-emits its own signature). The chain runner defends against this:
- Depth cap. --max-depth (default 8) bounds how many passes can run in one chain.
- Cycle detection. Each stage's output is content-hashed (BLAKE3); if a stage produces bytes already seen earlier in the chain, the runner stops rather than looping.
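The two defenses fit in one loop. The sketch below is an illustration under stated assumptions, not the chain runner itself: pick_pass is a hypothetical callback standing in for fingerprinting, and hashlib.blake2b stands in for BLAKE3, which is not in the Python standard library.

```python
import hashlib


def content_id(data: bytes) -> str:
    # Stand-in hash: disrobe uses BLAKE3; blake2b is used here only
    # because it ships with Python.
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def run_chain(data, pick_pass, max_depth=8):
    """Run passes until none applies, the depth cap is hit, or output repeats.

    pick_pass(data) returns (name, transform) for the next pass, or None.
    """
    seen = {content_id(data)}
    executed = []
    while True:
        choice = pick_pass(data)
        if choice is None:
            return data, executed, "done"
        if len(executed) >= max_depth:
            return data, executed, "depth-cap"
        name, transform = choice
        data = transform(data)
        executed.append(name)
        digest = content_id(data)
        if digest in seen:
            # Bytes already produced earlier in this chain: stop, do not loop.
            return data, executed, "cycle"
        seen.add(digest)


def peel(data):
    # Toy pass: strip one leading b"A" layer while one is present.
    return ("peel", lambda b: b[1:]) if data.startswith(b"A") else None


def reemit(data):
    # Toy adversarial pass: always applies and returns its input unchanged.
    return ("reemit", lambda b: b)


print(run_chain(b"AAAx", peel))      # three peels, then "done"
print(run_chain(b"payload", reemit)) # stops after one pass with "cycle"
```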
Stage mirrors
Pass --capture-stages to materialize every executed pass's byte-exact output:
recovered/
├── 01-pyinstaller/ # byte-exact output of pass 1
├── 02-pyarmor/ # byte-exact output of pass 2
├── 03-py-decompile/ # byte-exact output of pass 3
├── final/ # terminal stage(s), linked
│ └── 03-py-decompile/ # symlink -> NTFS junction -> recursive copy fallback (Windows)
├── chain.json # the chain topology descriptor
└── recovery.json # per-pass status, confidence histogram, timings
The final/ link prefers a symlink, falls back to an NTFS junction on Windows, and finally to a recursive copy, so final/ always resolves to the terminal artifact regardless of platform and privilege.
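The fallback order can be sketched as follows. This is an assumption-laden illustration: link_final is a hypothetical helper, and the NTFS junction step is left as a comment because the Python standard library has no public junction API.

```python
import os
import shutil
import tempfile
from pathlib import Path


def link_final(target: Path, link: Path) -> str:
    """Make `link` resolve to `target`; return the strategy that worked."""
    try:
        os.symlink(target, link, target_is_directory=True)
        return "symlink"
    except (OSError, NotImplementedError):
        pass
    # On Windows an NTFS junction attempt would go here (omitted in this sketch).
    shutil.copytree(target, link)
    return "copy"


with tempfile.TemporaryDirectory() as tmp:
    stage = Path(tmp) / "03-py-decompile"
    stage.mkdir()
    (stage / "module.py").write_text("print('hi')\n")
    final = Path(tmp) / "final"
    final.mkdir()
    how = link_final(stage, final / stage.name)
    # Whichever strategy ran, the file is reachable through final/.
    print(how, (final / stage.name / "module.py").exists())
```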
chain.json: the topology descriptor
chain.json records the executed pipeline: each pass, its version, the input and output BLAKE3 hashes, the rung transition, byte sizes, and the per-stage verdict. It is the document disrobe diff and disrobe guard verify operate on (see Diff and guard tooling).
recovery.json: the provenance sidecar
recovery.json is the per-run report: each pass's status, a confidence-tier histogram, and timings. Summarize it without reading raw JSON:
disrobe context --out recovered/
This prints per-pass status, confidence tiers, the overall verdict, and provenance, which is the human-facing view of what the chain actually managed to recover and how much to trust it.
The .dr envelope
The .dr envelope is disrobe's content-addressed wire format. Every recovered artifact can be persisted as one, and the chain runner uses envelopes internally to pass work between stages. The format is designed for one thing above all: deterministic, verifiable, offline-composable caching.
Anatomy
A .dr envelope has three parts:
- Hot payload (rkyv). The primary data, serialized with rkyv 0.8 for zero-copy access. An envelope can be mmap-ed and the payload read without a deserialize pass, measured at roughly 21 ns to "deserialize" a cached envelope, because there is effectively nothing to deserialize.
- Cold sidecar (postcard). Secondary metadata serialized with postcard, kept out of the hot path so the common case stays fast.
- BLAKE3 root hash. A content hash over the payload that is the envelope's identity. Two envelopes with the same root hash are byte-identical by construction.
Every envelope also carries its schema version, its IR rung (see the ladder), its capability set, and a provenance record describing which pass produced it.
Why content-addressed, not timestamp-addressed
Because the identity is the BLAKE3 hash of the content, a cache hit is provably the same bytes, not "probably the same, modified at the same time." This is what makes --no-cache an optimization toggle rather than a correctness toggle: with the cache on or off, the output is identical. It is also what lets chains compose offline: a downstream pass can trust that an upstream envelope it pulled from cache is exactly what would have been produced live.
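A toy content-addressed store makes the argument concrete. It is a sketch only: ContentCache is a hypothetical class, the real envelope layout is not modeled, and hashlib.blake2b stands in for BLAKE3 because BLAKE3 is not in the Python standard library.

```python
import hashlib


def root_hash(payload: bytes) -> str:
    # Stand-in for the BLAKE3 root hash.
    return hashlib.blake2b(payload, digest_size=32).hexdigest()


class ContentCache:
    """Store payloads under the hash of their own bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, payload: bytes) -> str:
        key = root_hash(payload)
        self._blobs[key] = payload
        return key

    def get(self, key: str) -> bytes:
        payload = self._blobs[key]
        # Re-verify on read: a hit is accepted only if the bytes still
        # hash to the key they were stored under.
        if root_hash(payload) != key:
            raise ValueError("root hash mismatch: cached payload was altered")
        return payload


cache = ContentCache()
key = cache.put(b"recovered module bytes")
assert cache.get(key) == b"recovered module bytes"
# Storing the same bytes again yields the same identity, with or without a cache.
assert cache.put(b"recovered module bytes") == key
```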
Working with envelopes
# Create an envelope from a source file
disrobe envelope create source.bin --out source.dr
# Inspect: version, rung, capabilities, provenance, root hash
disrobe envelope inspect source.dr
# Verify the BLAKE3 root against the payload
disrobe envelope verify source.dr
disrobe verify source.dr # convenience alias
# Structurally diff two envelopes
disrobe envelope diff a.dr b.dr # version, rung, flags, root hash, producer, capabilities, provenance
# Validate a migration is sound before performing it
disrobe envelope migrate-check a.dr --to-version 0.10.0 --to-rung surface
migrate-check answers a precise question: can this envelope be transcoded from its (version, rung) to the target (version, rung) such that a transcode path exists and every Requires capability remains satisfiable? It is how disrobe stays sound across schema bumps without silently dropping capability guarantees.
Transcoding across schema versions
disrobe-ir carries a transcode registry keyed on (from_version, from_rung, to_version, to_rung). Identity transcodes are registered for every rung at every version, and real transcodes are registered for the migration paths disrobe supports. Transcoding never changes the rung implicitly: a transcode moves an envelope across schema versions while it stays at the same rung, which keeps the operation auditable.
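The registry shape described above can be modeled with a dictionary keyed on the four-tuple. This is a sketch under assumptions: the class, the version strings, and the toy migration function are hypothetical, and the capability check that migrate-check also performs is not modeled.

```python
from typing import Callable, Dict, Optional, Tuple

# (from_version, from_rung, to_version, to_rung)
Key = Tuple[str, str, str, str]


class TranscodeRegistry:
    def __init__(self, versions, rungs):
        self._table: Dict[Key, Callable[[bytes], bytes]] = {}
        # Identity transcodes for every rung at every version.
        for version in versions:
            for rung in rungs:
                self._table[(version, rung, version, rung)] = lambda payload: payload

    def register(self, from_version, to_version, rung, fn):
        # A transcode crosses schema versions but keeps the rung fixed,
        # so the same rung is used on both sides of the key.
        self._table[(from_version, rung, to_version, rung)] = fn

    def find(self, from_version, from_rung, to_version, to_rung) -> Optional[Callable]:
        return self._table.get((from_version, from_rung, to_version, to_rung))


# Hypothetical versions and one hypothetical migration path.
registry = TranscodeRegistry(["0.9.0", "0.10.0"], ["disasm", "hir", "surface"])
registry.register("0.9.0", "0.10.0", "hir", lambda payload: b"v0.10:" + payload)

print(registry.find("0.9.0", "hir", "0.10.0", "hir") is not None)      # True
print(registry.find("0.9.0", "hir", "0.10.0", "surface") is not None)  # False
```

A lookup that would change the rung finds nothing, which mirrors the rule that transcoding never changes the rung implicitly.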
Hardening
The envelope decoder parses a content-addressed binary format and is treated as a security-sensitive surface. Adversarial envelopes that attempt read-past-end, integer overflow, or BLAKE3-mismatch acceptance are in scope for the security policy. The decoder lives in crates/disrobe-ir/src/envelope.rs and is fuzzed.
LLM sidecar and provenance
disrobe is built to feed coding agents. Any pass can emit a structured metadata bundle that lets an LLM reason about recovered code without re-deriving its structure: call graph, types, control flow, capability surface, decompile provenance, and round-trip verdict, all in one schema-conforming sidecar.
Turning it on
disrobe py decompile module.pyc --out recovered/ --llm
disrobe py decompile module.pyc --out recovered/ --llm --llm-briefs
--llm is an alias for the full Pack-4 selection. --llm-briefs additionally renders AGENTS.md and SKILL.md reconstruction briefs next to the bundle, so the output drops straight into an agentic IDE.
By default the bundle is written next to the primary output as <stem>.disrobe.llm.json. Override with --metadata-out <path> and choose the format with --metadata-format json|jsonl|cbor|msgpack.
The four packs
Packs are cumulative presets over the 18 categories:
| Pack | Flag | Adds |
|---|---|---|
| Pack-1 | --metadata-pack-1 | ast + disasm + symbols + strings |
| Pack-2 | --metadata-pack-2 | Pack-1 + cfg + types + imports + provenance |
| Pack-3 | --metadata-pack-3 | Pack-2 + dfg + signatures + constants + roundtrip + sourcemap + manifest |
| Pack-4 | --metadata-pack-4 / --llm | Pack-3 + confidence + opcode-coverage + pii-map + decryption-keys (auth-gated) |
The 18 categories
Each category can also be toggled individually:
ast disasm cfg dfg symbols strings types imports constants signatures
provenance roundtrip-verdict source-map manifest-cat decryption-keys
confidence opcode-coverage pii-map
Fine-tune any pack:
disrobe py decompile m.pyc --metadata-pack-3 --metadata-exclude ast,symbols
disrobe py decompile m.pyc --metadata-include cfg,types,provenance
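How the cumulative packs and the include/exclude flags combine can be modeled as below. The category lists come from the pack table; the select function is hypothetical, and the precedence it uses (include first, then exclude) is an assumption, not documented behavior.

```python
# What each pack adds on top of the previous one, as listed in the pack table.
PACK_ADDS = {
    1: ["ast", "disasm", "symbols", "strings"],
    2: ["cfg", "types", "imports", "provenance"],
    3: ["dfg", "signatures", "constants", "roundtrip", "sourcemap", "manifest"],
    4: ["confidence", "opcode-coverage", "pii-map", "decryption-keys"],
}


def select(pack=0, include=(), exclude=()):
    """Expand a pack level, add extra categories, then drop excluded ones."""
    chosen = []
    for level in range(1, pack + 1):
        chosen.extend(PACK_ADDS[level])
    chosen.extend(c for c in include if c not in chosen)
    # Assumption: exclusion wins over inclusion.
    return [c for c in chosen if c not in exclude]


print(len(select(4)))                            # 18
print(select(include=("cfg", "types", "provenance")))
print(select(3, exclude=("ast", "symbols")))
```

Pack-4 expanding to exactly 18 entries is a quick consistency check against the category count.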
Auth-gated categories
The decryption-keys category exposes recovered keys and IVs and is gated: passing --decryption-keys without --i-have-authorization fails with DR-CLI-0420. This is the same authorization gate that protects grey-zone commercial-protector behavior elsewhere in the tool.
Provenance sidecars
Independently of the --llm bundle, a chain run writes two provenance artifacts:
- recovery.json: per-pass status, confidence-tier histogram, and timings. Summarize with disrobe context --out <dir>.
- provenance/<file>.map.json: a line-level map from each recovered source line to (pass, source_offset, opcode_range, confidence). This is what lets an agent (or a human) trace any line of recovered source back to the exact bytes it came from and how confident the recovery is.
The .disrobe/ workspace
Scaffold a full agent workspace in the current directory:
disrobe init # scaffold .disrobe/
disrobe init --ide claude # also generate IDE-specific settings (claude, cursor, windsurf, aider)
This lays down an AGENTS.md forensic-framing template, per-symbol annotation schemas under .disrobe/annotations/, skill packs under .disrobe/skills/, slash commands, and a settings hook template that denies edits to ground-truth stage directories (see Diff and guard tooling). Maintain it with:
disrobe annot refresh # rebuild .disrobe/annotations/<stem>.annot.json
disrobe rename oldName newName --note "why" # append-only rename record
disrobe context --out recovered/ # summarize the recovery report
Python
Python is disrobe's most contested and most developed ecosystem. disrobe ships an in-house Rust decompiler as the product, not a wrapper around pycdc, pylingual, decompyle3, or uncompyle6 (those are benchmark competitors, available only as optional --backend fallbacks).
At a glance
| Layer | Coverage |
|---|---|
| Bytecode disassembly | CPython 1.0-3.15, PyPy, MicroPython .mpy v0-v6, Jython, IronPython, Brython |
| Decompilation | In-house engine across CPython 1.0-3.15 with per-version opcode dispatch; 92.76% per-code-object recompile-equivalence on a pinned 200-module CPython 3.14 stdlib corpus (5831 of 6286, above a 90% CI floor), and the legacy 1.0-3.7 band asserts a CI floor of 152 of 191 proven-correct (67 by recompile-equivalence, the rest by structural token-match) |
| Modern constructs | match, walrus, f-strings and PEP 750 t-strings, exception groups, PEP 695/696/709 |
| Freezers | PyInstaller 2.x-6.20+, Nuitka, cx_Freeze, py2exe, PyOxidizer, shiv, pex, Briefcase, SourceDefender |
| Protectors | PyArmor v6-v9-pro and 18 source obfuscators with an AST-evaluator backend |
Decompiling .pyc
disrobe py decompile module.pyc --out recovered/
disrobe py decompile module.pyc --out recovered/ --backend native # default; deterministic, no external tools
disrobe py decompile module.pyc --out recovered/ --emit source,disasm,ast
The default native backend is the in-tree engine: it runs a frame-tree pre-pass, per-version opcode dispatch, and then round-trip verification. The optional --backend pycdc|decompyle3|uncompyle6 flags shell out to those external tools (which must be on PATH) purely for benchmark comparison; they are never the default.
How the in-house engine works
- Frame-tree pre-pass. Before walking instructions, the engine reconstructs the nested source-construct tree from the 3.11+ exception table. This eliminates the single-pass stack-walker desync that causes other decompilers to mis-nest try/except and with-blocks.
- Provably-inert normalizations. Twelve normalizations (padding, super-instruction fusion, constant-pool ordering, and more) run before the round-trip check, each gated by an adversarial test proving it masks no real bug.
- Round-trip metric. Every emitted file is recompiled on the matching interpreter and compared opcode-for-opcode against the original. PERFECT is byte-identical; SEMANTIC is the same program with a different layout; CODE_DIFF flags a real bug that is fixed before ship. The normalizer preserves jump-condition polarity rather than collapsing all jumps, so an inverted condition reads as a CODE_DIFF instead of passing silently.
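The round-trip idea can be tried with the standard dis module. This is a deliberately small sketch, not disrobe's checker: it handles only the top-level code object, its single normalization (ignore padding-like opcodes and jump targets, keep jump opcode names so polarity survives) stands in for the twelve described above, and the function names are made up.

```python
import dis
import types

# Opcodes treated as layout-only in this sketch.
SKIPPED = {"NOP", "CACHE", "RESUME", "EXTENDED_ARG"}
JUMP_OPCODES = set(dis.hasjrel) | set(dis.hasjabs)


def normalized(code):
    """Opcode stream with layout noise removed but jump polarity kept."""
    stream = []
    for ins in dis.get_instructions(code):
        if ins.opname in SKIPPED:
            continue
        if ins.opcode in JUMP_OPCODES:
            arg = None          # targets move with layout; the opname keeps polarity
        elif isinstance(ins.argval, types.CodeType):
            arg = "<code>"      # nested bodies are not compared in this sketch
        else:
            arg = ins.argval
        stream.append((ins.opname, arg))
    return stream


def classify(original, recovered):
    if (original.co_code == recovered.co_code
            and original.co_consts == recovered.co_consts
            and original.co_names == recovered.co_names):
        return "PERFECT"
    if normalized(original) == normalized(recovered):
        return "SEMANTIC"
    return "CODE_DIFF"


def compile_module(source):
    return compile(source, "<recovered>", "exec")


a = compile_module("if flag:\n    x = 1\n")
b = compile_module("if not flag:\n    x = 1\n")
print(classify(a, a), classify(a, b))  # PERFECT CODE_DIFF
```

The second comparison is the inverted-condition case: the two modules differ only in which conditional-jump opcode they use, and the sketch reports that as a difference rather than normalizing it away.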
Measured equivalence
The per-code-object figure is measured against an independent oracle, not the tool's own output: each recovered module is recompiled on CPython 3.14.5 and its code objects are diffed against the originals. On a pinned 200-module stdlib corpus (6286 code objects) the rate is 92.76% (5831 of 6286), above the 90% floor enforced by a committed CI gate (arbitrary_recompile_gate.rs). uncompyle6 stops near 3.8 and decompyle3 near 3.9; the ML-based decompilers self-flag benchmark contamination, and there is no model here to contaminate.
Disassembling
disrobe py disasm module.pyc --out trace.txt
A faithful per-instruction trace across every supported interpreter dialect. This is the Disasm rung: lossless, offset-preserving, no structural reconstruction.
Deobfuscating source
disrobe py deob obfuscated.py --out clean.py
disrobe py deob obfuscated.py --out clean.py --cleanup
Peels source-level obfuscator wrappers (Kramer/Specter, Berserker, Jawbreaker, BlankOBF, PlusOBF, Wodx, pyobfuscate.com, PyObfuscator (mauricelambert), python-obfuscator (PyPI), ObfuXtreme, Manglify, Oxyry, pyminifier, online obfuscator family, Xindex, pyobfus, Pypacker, Patchwork) with an AST-evaluator backend. --cleanup runs a ruff-AST constant-fold and dead-branch-elimination pass afterward.
Freezers and packagers
disrobe pyinstaller extract onefile.exe --out out/ # PyInstaller 2.1 .. 6.x, AES-CTR/CFB decrypt
disrobe pyinstaller detect onefile.exe # cookie, Python version, TOC offsets, no extract
disrobe pyfreeze extract app.exe --out out/ # cx_Freeze / py2exe / shiv / pex / PyOxidizer / Briefcase
disrobe nuitka detect app.exe # flavor + Python version
disrobe nuitka extract app.exe --out out/ # --onefile payload (zstd)
disrobe nuitka symbols app.exe # impl_* + module-init scan on --standalone builds
disrobe py sourcedefender app.pye --out app.msgpack # SourceDefender .pye decrypt
PyArmor
disrobe pyarmor unpack protected.py --out out/
Unpacks a PyArmor wrapper back to its original .pyc. v8 and v9-pro are handled by a pure-static path (no code execution). v6/v7 can optionally use a dynamic-hook fallback that runs the obfuscated wrapper in a watched subprocess to capture marshal streams; this is opt-in and unsafe on untrusted input:
disrobe pyarmor unpack protected.py --out out/ --allow-dynamic --dynamic-timeout 60
The --allow-dynamic path executes the sample. Only enable it on trusted samples or inside an isolated sandbox. See Forensics and malware-safety posture.
Other useful flags: --mode auto|standard|super, --target 3.11 (rewrite emitted .pyc magic), --allow-bcc (BCC native-body lift via Ghidra-headless), --strict (exit non-zero on any partial decode), and --all-emits.
End-to-end
A real-world Python sample is often frozen, then protected, then compiled. disrobe auto chains the whole stack:
disrobe auto suspect.exe --out recovered/ # PyInstaller -> PyArmor -> .pyc decompile
JavaScript / TypeScript
disrobe deobfuscates obfuscated JS/TS, splits bundled output back into per-module sources, and inspects packaged JS runtimes, all behind a deterministic codegen.
At a glance
| Layer | Coverage |
|---|---|
| Family detector | obfuscator.io, Jscrambler, jsobfu, plus bundler and minified-only classification, each with confidence and markers |
| obfuscator.io (--full) | string-array decode, control-flow unflattening, opaque-predicate folding, packing expansion, dead-code and debug-protection strip, iterated to a fixpoint |
| Reverser library | JS-Confuser (string encoding/compression, dispatcher, flatten, opaque predicates, RGF, shuffle, variable masking, locks and integrity) and Jscrambler template reversals; Arxan-JS, JSDefender, and PACE protector detectors |
| Esoteric encoders | jsfuck, JJEncode, AAEncode, JSFiretruck, Dean Edwards Packer, atob/eval indirection |
| Renaming | --rename (hex idents to var_N) and --rename-scope-aware (oxc_semantic, conflict-checked) |
| Bundlers | Webpack 4/5, Vite, Rollup, Rolldown, esbuild, Turbopack, Bun, Browserify, Parcel, SystemJS, AMD |
| Packaged runtimes | V8 cached-data .jsc (bytenode), Node SEA blobs, nexe, nw.js zip-suffix, Electron .asar |
Deobfuscation
disrobe js deob bundle.min.js --out clean.js --full --rename-scope-aware
disrobe js deob legacy.js --out clean.js --legacy auto --unminify
The default path runs string-array recovery and writes a detection.json sidecar naming the matched family. --full runs the complete obfuscator.io reversal pipeline and records per-stage statistics in a pipeline.json (string-array call sites inlined, dispatch blocks collapsed, opaque predicates folded, packed blocks expanded). --legacy jsobfu|jscrambler-free|auto targets the older families; --unminify adds the !0/void 0/string-concat peepholes.
Unbundling
disrobe js unbundle app.bundle.js --out src/
disrobe js unbundle app.bundle.js --out src/ --emit sourcemap
Auto-detects the bundler runtime from its markers (the full table above) or forces one with --target webpack|webpack4|webpack5|vite|rollup|rolldown|esbuild|turbopack|bun|browserify|parcel|systemjs|amd. Modules land as separate files with chunk and module identifiers preserved, plus a manifest.json. --emit sourcemap synthesizes per-chunk v3 source maps and decodes embedded data-url maps.
Packaged JS runtimes
disrobe js v8 app.jsc
disrobe js v8 app.asar --json-out report.json
Classifies the artifact and prints real detection: bytenode header layout and Node version for .jsc, SEA flags and code length, nexe/nw.js payload geometry, or the .asar entry listing. For .jsc, disrobe is the self-contained, static, offline option: it recovers the user-string layer plus structure and detects the serializer version across Node 18-24, with no patched V8 binary (View8), Ghidra (ghidra_nodejs), or online service (jscdecompiler.com) required. The boundary is that internalized identifiers (most variable and property names, for example console and log) are serialized as references into V8's read-only snapshot heap, not as inline bytes in the .jsc; resolving them needs the exact V8 binary's RO heap. disrobe reports that as a lossy-internalized-roots boundary rather than fabricating past it. For V8 snapshots it reports a SnapshotDeserializeWall: the format prevents full bytecode recovery, so disrobe scrapes the string pool (tunable via --scrape-min) and states the boundary rather than fabricating past it.
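String-pool scraping is the same technique as the classic strings utility: collect runs of printable bytes at or above a minimum length. The sketch below shows that technique in isolation; scrape_strings and its default threshold of 4 are assumptions for illustration and say nothing about the default of --scrape-min.

```python
import re


def scrape_strings(blob: bytes, min_len: int = 4):
    """Return every run of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(blob)]


blob = b"\x00\x01hello\x00ab\x00world!!\x00"
print(scrape_strings(blob, 4))  # ['hello', 'world!!']
```

Raising the threshold trades recall for less noise: short runs such as "ab" above are usually accidental byte sequences rather than user strings.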
Chaining
Electron and Node packaging chains run end to end:
disrobe auto app.asar --out recovered/
WebAssembly
disrobe parses WebAssembly modules and lifts them to four target surfaces, reverses Wasm-specific obfuscators, and decodes the Component Model and GC type graphs.
Decompilation
disrobe wasm decompile module.wasm --target rust --out lifted.rs
disrobe wasm decompile module.wasm --target ts --out lifted.ts
disrobe wasm decompile module.wasm --target wat --out lifted.wat
disrobe wasm decompile module.wasm --target c --out lifted.c
disrobe wasm decompile module.wasm --target json --out summary.json
Lifts to Rust, TypeScript, WAT, or C pseudo-source, or a JSON summary, with DWARF / source-map name recovery where debug info is present.
Per-op coverage is measured, not assumed. Op-coverage is scored only when every operator in a function is lowered (no unreachable/todo! stub) and the result validates through an independent re-parser; parseability alone does not count. On the committed corpus all 94 functions in the 30 parseable modules are fully op-covered: every operator has a lowering rule and the re-emitted WAT re-parses. The MVP instruction set plus the SIMD, atomics, bulk-memory, table/element, reference, and tail-call proposals are lowered. Three corpus modules are skipped on wat-parse or signature-extraction failure, so this is op-coverage of the supported subset, not of all wasm.
Op-coverage is not the same as execution-equivalence; equivalence is claimed only for the functions that are execution-verified. All 24 execution-eligible functions (those with a pure-numeric i32/i64/f32/f64 ABI) are execution-equivalent to the original under wasmtime: the semantic_differential test compares return values, trap parity, and linear memory between the original and the recovered module (1 is byte-identical in memory). The remaining functions are op-coverage-only. The Component Model envelope, threads, memory64, and the GC type graph are parsed and decoded by dedicated scanners; that is distinct from lifting their per-instruction semantics to source.
Deobfuscation
disrobe wasm deob module.wasm --out clean.wasm
Reverses four Wasm obfuscator families with byte- or IR-transforming passes: Jscrambler-WASM (strip integrity imports, fold opaque predicates), Wobfuscator (recover the eval op-table and lift each handler), Tigress-via-Emscripten (unflatten the dispatcher, demangle _Z names), and Wasmixer (unwrap the XOR decrypt stub, defragment). A fifth family, wasm-name-obfuscator, is detected and its rename strategy classified, but its high-entropy hex renames destroy the original names, so there is nothing to reverse.
Component Model and GC types
disrobe wasm component module.wasm # parse the Component Model envelope -> world / adapter manifest
disrobe wasm gc-types module.wasm # recover the GC type graph (struct / array / ref types)
JVM and Android
disrobe decompiles JVM classfiles and Android DEX through a unified command, wrapping the best FOSS decompilers headlessly while adding obfuscator reversal, ProGuard/R8 mapping replay, and chain auto-detection.
Decompiling
disrobe jvm decompile App.class --out src/
disrobe jvm decompile app.jar --backend vineflower --out src/
disrobe jvm decompile app.apk --backend jadx --out src/
disrobe jvm decompile classes.dex --backend jadx --out src/
Routes a .class, .jar, .dex, or .apk through a JVM/Android backend: CFR, Vineflower, Procyon, JADX, and others. disrobe validates the classfile itself (format 1.0.2-25) and recovers records, sealed types, and pattern matching where the backend supports them, plus Kotlin and Scala idioms.
Fidelity
The in-house classfile decompiler is gated against real javac: on the EdgeCases corpus, the asserted floor is 122 of 131 decompiled methods (93.1%) recompiling error-free, and 128 of 131 (97.7%) are measured on JDK 25. CI provisions a JDK so this gate runs there. The Dalvik lifter's recovered bodies are graded by the real JVM bytecode verifier rather than by the lifter's own output: a committed gate assembles the recovered classes from the committed dex corpus, loads them under -Xverify:all, and asserts that the recovered classes pass the verifier; 99% of verifiable classes pass (102 of 103, 0 lifter verify failures; the 103rd is link-blocked by an unbundled Kotlin Function1 supertype, a test-harness limit not a lifter defect). On the gitignored real FOSS apks the lifter self-reports a lowered body for 89% to 92.5% of methods (transmissionic 92.5%, enrecipes 90.7%, rustdesk 89.0%), but that figure counts the lifter returning a body rather than a throw-stub and is not verifier-attested; those apks cannot run in CI. A live-range-splitting pass recovers method bodies whose registers carry conflicting JVM types across control-flow joins; 307 re-hosted bodies verify clean under the same gate. The committed-corpus verifier floor and the EdgeCases recompile floor are asserted by committed test gates.
Inventory and backends
disrobe jvm extract app.apk --out classes/ # extract a .jar / .apk + dump classfile inventory
disrobe jvm backends # report available JVM/Android backends on PATH
Obfuscator reversal
disrobe reverses JVM obfuscators that the raw decompilers cannot (Zelix KlassMaster, Allatori, Stringer, DashO, and DexGuard control-flow obfuscation on the Android side) and replays ProGuard/R8 mapping files to restore original names.
String-encryption protectors are decrypted by emulating each class's decrypt method on the encrypted constants rather than by guessing a key, so Allatori and DashO style decrypt(String) and decrypt(int, String) routines are recovered statically. The evaluator runs <clinit> for a per-class static key, constructs the receiver and runs <init> for an instance decrypt keyed on a field, synthesizes the calling frame for a caller/stack-trace-keyed key, and covers the long-accumulator arithmetic, getfield/putfield, and switch-dispatched keystreams these routines emit, all under a hard step cap. Runtime-keyed schemes (system property, environment, clock, secure random, or a live cross-class table, as Stringer sometimes uses) are flagged as walled instead of faked. ProGuard and R8 names are restored from a mapping.txt with disrobe jvm decompile --mapping mapping.txt, disambiguating overloaded methods by their descriptor argument count.
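Mapping replay with argument-count disambiguation can be sketched against the ProGuard mapping.txt text format. This is an illustration only: the sample mapping, the regular expressions, and parse_mapping are hypothetical, fields are skipped, and overloads that share an argument count are not distinguished here.

```python
import re

CLASS_RE = re.compile(r"^(\S+) -> (\S+):$")
# Optional "start:end:" prefix, return type, name, optional "(args)",
# optional ":origStart:origEnd" suffix, then " -> obfuscatedName".
MEMBER_RE = re.compile(
    r"^(?:\d+:\d+:)?(\S+)\s+([^\s(]+)(?:\((.*?)\))?(?::\d+(?::\d+)?)?\s+->\s+(\S+)$"
)


def parse_mapping(text):
    """Return ({obf_class: original_class}, {(obf_class, obf_method, argc): original_method})."""
    classes, methods = {}, {}
    current = None
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        if not line[0].isspace():
            found = CLASS_RE.match(line)
            if found:
                current = found.group(2)
                classes[current] = found.group(1)
            continue
        found = MEMBER_RE.match(line.strip())
        if found is None or found.group(3) is None or current is None:
            continue  # fields (no parentheses) and unrecognized lines are skipped
        args = found.group(3)
        argc = len(args.split(",")) if args else 0
        methods[(current, found.group(4), argc)] = found.group(2)
    return classes, methods


MAPPING = """\
com.example.Job -> a.b:
    int retries -> a
    void run() -> a
    void stop(int) -> a
    1:5:void run(int,java.lang.String):12:16 -> a
"""

classes, methods = parse_mapping(MAPPING)
print(classes["a.b"], methods[("a.b", "a", 1)])  # com.example.Job stop
```

Three different methods share the obfuscated name a in this sample; the argument count taken from the call-site descriptor is what selects the right original name.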
On the Android side, disrobe deflattens BlackObfuscator control-flow flattening: it recognizes the String.hashCode() keyed dispatcher, matches each block's const-string block-name to its switch case, and recovers the original linear block order, annotating the deflattened sequence directly in the decompiled output. Obfuscator-planted out-of-range exception-table entries are dropped before structuring so they cannot poison the control-flow graph, and jsr/ret subroutines are inlined into a linear stream.
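The dispatcher match rests on Java's String.hashCode, which is fully specified (h = 31*h + c over UTF-16 code units, in 32-bit signed arithmetic). The sketch below reproduces that hash and pairs block-name strings with switch cases; the block names, handler labels, and match_blocks helper are hypothetical, and recovering the original block order is not modeled.

```python
def java_string_hash(text: str) -> int:
    """Compute java.lang.String.hashCode() for text."""
    data = text.encode("utf-16-be")  # Java hashes UTF-16 code units
    h = 0
    for i in range(0, len(data), 2):
        unit = int.from_bytes(data[i:i + 2], "big")
        h = (31 * h + unit) & 0xFFFFFFFF
    # Reinterpret the 32-bit value as a signed Java int.
    return h - (1 << 32) if h >= (1 << 31) else h


def match_blocks(block_names, switch_cases):
    """Map each const-string block name to the switch target keyed by its hash."""
    matched = {}
    for name in block_names:
        key = java_string_hash(name)
        if key in switch_cases:
            matched[name] = switch_cases[key]
    return matched


# Hypothetical dispatcher: case constants as they would appear in the switch.
switch_cases = {96354: "handler_2", 3105: "handler_1"}
print(match_blocks(["ab", "abc", "zz"], switch_cases))
# {'ab': 'handler_1', 'abc': 'handler_2'}
```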
DexGuard hides string constants in an encrypted static String[] decrypted at run time through java.lang.reflect.Method.invoke rather than a direct call. The key and ciphertext are present in the dex; only the dispatch is reflective. disrobe runs a constrained Dalvik register machine over the dex's own routine: it executes the class <clinit> to rebuild the encrypted table, then runs the decrypt(int) body for each index (read the table element, apply the per-char transform against the embedded key, rebuild the string) and emits the plaintext, with the Class.getDeclaredMethod + Method.invoke call sites resolved to their concrete target. disrobe jvm decompile app.dex surfaces the recovered strings and resolved sites in the manifest. A routine that derives its key from runtime-only state (a system property, the environment, the clock, or a secure random) is reported as runtime-keyed rather than guessed. Commercial DexGuard is paid Guardsquare software whose protected output is unsafe to build on an analysis box, so this path is validated against a self-authored benign dex that exhibits the same reflection-string-decryption technique, graded against its authored plaintext.
Chaining
disrobe auto app.apk --out recovered/ # APK -> dex -> JADX + Smali + manifest
.NET / CIL
disrobe parses the full .NET PE + CLR metadata surface, decompiles CIL to C#, F#, and VB pseudo-source, detects 19 protectors, and handles ReadyToRun and Native AOT images. In-house static recovery reverses ConfuserEx2 constant decryption on a real committed sample (its encrypted-resource layer is carved byte-exact but walled on the runtime key, and full ConfuserEx2 cleanup delegates to --backend de4dot); the Eazfuscator VM-tier is devirtualized at 57 of 57 instructions against an in-repo EazVM virtualizer of our own (the committed assembly is encoded by that virtualizer, not the shipping Eazfuscator.NET product), and ILProtector and MaxToCode are detected and structurally enumerated with their method bodies walled on the native-runtime key (derived in the loader, absent from the assembly), never fabricated. The rest are detected with watermark-strip and encrypted-resource classification.
Decompiling
disrobe dotnet decompile App.dll --backend ilspy --out src/
disrobe dotnet decompile App.exe --backend dnspyex --out src/
disrobe dotnet decompile App.dll --backend de4dot --out src/
Routes a .NET PE (.dll / .exe) through ILSpy, dnSpy, dnSpyEx, or de4dot. disrobe owns the in-house CIL disassembler (full opcode table) and the CIL-to-C#/F#/VB lowering, so the structural recovery is its own even when a rendering backend is used.
Static analysis
disrobe dotnet analyze App.dll
disrobe dotnet backends # report available .NET backends on PATH
analyze reports the PE header, CLR metadata, and protector detection, and probes for ReadyToRun (R2R) and Native AOT images, with symbol recovery on AOT builds.
Obfuscator reversal
disrobe detects 19 protector families. Recovery depth varies by protector and by what is statically present in the artifact. The model for in-house recovery is the same one used by the JVM and Lua passes: locate the decryptor method or key inside the assembly and emulate it over the encrypted data through the in-house CIL stack-machine, never a re-derived or hard-coded key.
Reversed on a real committed sample (plaintext recovered from the artifact, plaintext-absent oracle):
- ConfuserEx2: in-house recovery reverses the constants protection (the documented FOSS "Ki.Constants" block-XOR / LZMA-validated algorithm) on a real committed SampleConstants.confuserex2.dll, with a test whose fixture holds only ciphertext plus the real decryptor and asserts that the plaintext is not present anywhere in it. The encrypted-resource layer is carved byte-exact but walled on the runtime key. Full deobfuscation (control-flow flattening, runtime-VM string decryption, anti-tamper) is delegated to de4dot via disrobe auto / --backend de4dot; disrobe does not reimplement that tier in-house.
In-assembly-decryptor recovery, graded by round-trip against the pre-encryption original:
- Eazfuscator.NET: locates the static char[]/byte[] string-decryptor method and emulates its CIL over the encrypted #US literal table to recover the pre-VM plaintext strings. The VM-tier is devirtualized against an in-repo EazVM virtualizer of our own: the committed assembly is encoded by that virtualizer, not the shipping Eazfuscator.NET product. disrobe reads the embedded resource, recovers the per-build opcode map from the in-assembly dispatch table by fingerprinting each handler, decrypts the position-keyed instruction stream, and lifts every virtualized method body back to CIL, then grades that CIL against the clean DLL. The grade is an ordered instruction comparison (opcode and operand, with branch targets resolved to instruction index, not raw token): 57 of 57 instructions match in sequence across the five bodies (100%). A second gate rebuilds a runnable assembly from the recovered CIL and asserts its stdout is byte-identical to the clean baseline (run wherever a .NET runtime is on PATH). Per-build randomization is fully recovered; only a runtime-only homomorphic key, not present statically, would bound a given build.
- ILProtector / MaxToCode: classified by Invoke-stub and zero-RVA method enumeration, runtime-resource and .mtc/.text1 section location, and container-framing parse. Real builds derive the per-method key inside the native loader (Protect32/64.dll) at run time, not in the managed assembly, so the encrypted bodies are walled and reported absent, never fabricated.
- KoiVM (ConfuserEx VM): detected by #Koi stream and VMDispatcher method markers; the VM-dispatch handler table is fingerprinted and the virtualized methods are lifted back to CIL through the same in-house CIL stack-machine used for Eazfuscator.
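The ordered instruction grade described for the Eazfuscator VM-tier can be sketched as follows. This is a minimal illustration, not disrobe's internal code: the `Insn` and `Operand` types and the `grade_ordered` function are hypothetical names, and the choice to divide by the longer body is an assumption. The point it shows is that branch operands are compared by resolved instruction index, so two bodies with different raw offsets or tokens can still match.

```rust
/// Hypothetical instruction model for illustration; not disrobe's types.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Operand {
    /// The instruction takes no operand.
    Absent,
    /// An integer immediate.
    Int(i64),
    /// A branch target already resolved to an instruction index.
    BranchIndex(usize),
    /// Any other operand, compared by a normalized text form.
    Text(String),
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Insn {
    opcode: String,
    operand: Operand,
}

/// Compare two method bodies position by position.
/// Returns (matched, total), where total is the length of the longer
/// body, so missing or extra instructions count as mismatches.
fn grade_ordered(recovered: &[Insn], clean: &[Insn]) -> (usize, usize) {
    let total = recovered.len().max(clean.len());
    let matched = recovered
        .iter()
        .zip(clean.iter())
        .filter(|(a, b)| a == b)
        .count();
    (matched, total)
}
```

Under this scheme a result such as 57 of 57 means every position agreed on both opcode and operand across the compared bodies.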
Doable now with an in-assembly key (detected + classified today; a per-protector decryptor emulation can be added on the same model when a real sample is available to fix the exact algorithm against):
- SmartAssembly, .NET Reactor, Babel, Dotfuscator (Pro), Skater, Goliath, DeepSea: the string key (per-string XOR lane, AES/Rijndael resource key, RC4(SHA1(resource)), or single-byte XOR) is embedded in the assembly, so the data is present and not a wall. These are currently detected with watermark-strip, identifier, and encrypted-resource classification, and the generic static-decoder opportunistically recovers in-lined integer/string constants where the decoder is a pure transform. Confirming a per-build algorithm needs a real protected sample (flagged for consent; nothing is downloaded).
- CryptoObfuscator, Agile.NET (CV tier): 3DES/Rijndael string keys are in-assembly; same status as above.
Needs a real sample to build against (flagged for consent, never downloaded): Spices.Net (Cyrillic-homoglyph + ROT-N per-method scramble), and the commercial protectors above whose exact per-build transform is not documented.
Genuine walls (the key or the original code is not in the static artifact):
- Themida / .NET wrapper: managed methods are lifted into the Oreans native VM; per project policy disrobe does not ship a native-VM devirtualizer.
- ArmDot: custom per-method VM with LCG-encrypted opcodes; static devirtualization is not performed.
- ILProtector / MaxToCode native-keyed configurations: when the per-method key is computed inside the native stub, the original CIL is not statically present.
Other:
- Obfuscar: dedicated in-house peeler (NameMaker odometer classification; rename-only metadata, so there is no byte rewrite and no embedded name map to recover).
Grey-zone commercial protectors are gated behind --i-have-authorization.
Chaining
disrobe auto App.exe --out recovered/ # ConfuserEx2 PE -> de4dot -> ILSpy -> C#
Native (PE / ELF / Mach-O)
disrobe does not compete with Ghidra, IDA, or Binary Ninja on raw decompilation. It is the unpack, symbol-recovery, and chain-detect layer that feeds those tools cleaner input, and it wraps Ghidra headlessly when you want a full decompile in one command.
Symbol recovery and dumping
disrobe native symbols app.exe --out symbols.json
Dumps symbols, sections, segments, imports, and debug info from a PE / ELF / Mach-O. Demangles and restores Rust and C++ symbols across x86 / ARM / RISC-V / MIPS / PowerPC / SPARC / eBPF / AVR, reading DWARF, PDB, and STABS debug formats. For C++ binaries it recovers the class hierarchy from the in-memory RTTI and vtable layout: ABI, each class's inheritance kind, direct base classes (virtual flagged), virtual-method slot counts, and detected STL templates.
Disassembly, call graph, and CFG
disrobe native disasm app.exe --out app.asm # recovered per-function listing
disrobe native disasm app.exe --emit cfg-dot --out cfg.dot # per-function basic-block CFG
disrobe native disasm app.exe --emit json --out disasm.json # structured instruction stream
disrobe native disasm --raw shellcode.bin --base 0x1000 --bits 64 --syntax intel # linear sweep
disrobe native callgraph app.exe --out callgraph.dot # whole-program call graph
An in-tree iced-x86 disassembler discovers functions without symbols (call-target and prologue scanning), partitions each into basic blocks, builds the whole-program call graph, and renders the per-function listing or, with --emit cfg-dot, the basic-block CFG. --raw disassembles a flat code blob with no container, and --syntax intel|at&t|nasm|masm selects the dialect for raw output. Each decoded instruction carries its register, memory, and rflags read/write effects, and the native layer can re-encode and relocate instruction blocks (the engine behind native patch).
Patching, signatures, and binary diff
disrobe native patch app.exe --at 0x1400 --bytes 0x90,0x90 --out patched.exe # rewrite + revalidate
disrobe native patch app.exe --nop-range 0x1400:0x1410 --out patched.exe # nop a span
disrobe native sigmaker app.exe --at 0x1400 # wildcarded signature
disrobe native diff old.exe new.exe # match functions across builds
native patch rewrites bytes at a virtual address (or nops a VA span), maps the VA to a file offset, applies the edit, and revalidates the image. native sigmaker generates a wildcarded byte signature from the function at an address (immediates and displacements masked via the instruction decoder), then uniqueness-tests it across the image. native diff matches functions across two binaries by content, relocation-invariant, and control-flow-graph fingerprints and reports the added, removed, and changed functions. All three work on stripped input through the in-tree function discovery.
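The uniqueness test behind a wildcarded signature can be sketched as below. This is an illustrative sketch under stated assumptions, not the sigmaker implementation: a pattern is modeled as a slice of `Option<u8>` where `None` is a masked byte, and the function names are hypothetical. Deciding which bytes to mask (immediates and displacements) needs an instruction decoder and is out of scope here.

```rust
/// Return every offset in `image` where `pattern` matches.
/// `Some(b)` must equal the image byte; `None` is a wildcard.
fn find_matches(image: &[u8], pattern: &[Option<u8>]) -> Vec<usize> {
    if pattern.is_empty() || pattern.len() > image.len() {
        return Vec::new();
    }
    image
        .windows(pattern.len())
        .enumerate()
        .filter(|(_, window)| {
            window.iter().zip(pattern.iter()).all(|(byte, want)| match want {
                Some(w) => w == byte,
                None => true,
            })
        })
        .map(|(offset, _)| offset)
        .collect()
}

/// A signature is only useful if it matches exactly once in the image.
fn is_unique(image: &[u8], pattern: &[Option<u8>]) -> bool {
    find_matches(image, pattern).len() == 1
}

/// Render the pattern in the common "48 8B ?? ??" text form.
fn render(pattern: &[Option<u8>]) -> String {
    pattern
        .iter()
        .map(|p| match p {
            Some(b) => format!("{:02X}", b),
            None => "??".to_string(),
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

A signature that masks too much matches in several places; extending it with more fixed bytes until `is_unique` holds is the usual remedy.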
Queryable IR and capabilities
disrobe query app.exe functions # discovered functions, complexity, exports
disrobe query app.exe calls-to malloc # call sites to a target
disrobe query app.exe xrefs-to sekret # references to a symbol
disrobe query app.exe string-decoders # decoder-shaped functions (loops + byte arith)
disrobe query app.exe complexity-over 20 # functions over a cyclomatic threshold
disrobe query app.exe capability network # instructions tied to a capability
disrobe capabilities app.exe # MITRE ATT&CK + MBC behavior report
disrobe query runs a queryable-IR layer over the disassembled code (functions, calls-to, xrefs-to, string-decoders, complexity-over, capability), symbol-independent and driven by the same function discovery. disrobe capabilities runs a rule engine over that IR and reports matched behaviors mapped to MITRE ATT&CK techniques and Malware Behavior Catalog (MBC) IDs, with the matching instruction offsets as per-rule evidence. Both accept a stripped binary directly or a .dr envelope.
Emulation-driven string recovery
disrobe strings app.exe # static strings + decoder-execution recovery
Beyond a static scan, disrobe strings locates decoder-shaped functions and drives each through the in-house x86 emulator, recovering the plaintext that only exists after the decoder runs. Emulation-recovered strings are deduplicated against the static set and reported with the decoder and buffer addresses, so a single-byte or multi-byte XOR/sub stack-string scheme yields the decoded text without executing the sample natively.
Identifying the compiler, packer, and protector
disrobe native identify app.exe --out identity.json
An in-house signature engine fingerprints what built and protected a binary, the way Detect-It-Easy does, then routes each finding to the disrobe pass that handles it. It detects compilers and toolchains (Go, Rust, MSVC, GCC, Clang, MinGW, Delphi, Nim, Zig, Crystal, Swift, GHC, .NET, Nuitka), packers (UPX, ASPack, PECompact, FSG, MEW, MPRESS, Petite, NSPack, kkrunchy), protectors (Themida, VMProtect, Enigma, Obsidium, Armadillo, ConfuserEx, .NET Reactor, Eazfuscator), installers (NSIS, Inno Setup, InstallShield, AutoIt, PyInstaller, Electron, Bun), linkers (MSVC link via the Rich header, GNU ld, LLD), and code-signing. Every hit carries a support route: a Go binary points at disrobe go recover, a packed PE at disrobe native unpack, a .NET assembly at disrobe dotnet decompile, an installer at disrobe auto. Virtualizing protectors (Themida, VMProtect) are detect-and-carve only, never a fabricated devirtualization.
Unpacking native packers
disrobe native unpack packed.exe --out unpacked.bin
Detects the runtime packer and unpacks it. In-house decoders cover UPX (.text and .pdata byte-identical, ~96% whole loaded image), kkrunchy (byte-exact), NSPack (~99% content-section), MPRESS, Petite, MEW, ASPack, and PECompact.
On committed samples ASPack and PECompact rebuild the decompressed section image at its load RVA: the section report confirms the recovered .text byte-identical and the import table >=98% byte-identical to the original, both gated in CI, while the .text entropy drops from near-random (with zero resolvable calls) in the packed file to ~6.2-6.5 in the rebuild, with hundreds of disassembler-resolvable intra-code calls. Because the whole rebuild is a loaded-memory image rather than a disk-aligned file, the bench marks whole-output byte-identity n/a. MEW rebuilds a flat image of the committed Sysinternals samples, evidenced by the entropy drop to ~4.2-4.9 and tens of thousands of decoded instructions. FSG decodes through the same aPLib-clone path but ships no committed fixture (its samples live under the gitignored .developer/ tree), so no number reproduces from a checkout.
ASProtect, Morphine, nPack, NeoLite, and Yoda's Crypter are recovered by driving their unpack stub through the in-house x86 stub emulator: the decompressor or stream decryptor runs to the original entry point inside the emulator, then the reconstructed sections are read back and sliced byte-for-byte (Yoda's Crypter .rsrc recovers byte-identical and .text decrypts to full plaintext). Yoda's Protector is detect + resource-carve, its stream key being a runtime-only value absent from the file.
On UPX and NSPack the whole-image residual is the loader-rebuilt zone (bound import address table and base relocations): those addresses are resolved by the OS loader at run time and are not present in the packed stream, so the residual is not a decoder gap.
The virtualizing protector tier (VMProtect, Themida, Enigma, and 15+ others) is detect-and-carve: the stub is still driven through the emulator, but the original code is decrypted only by a per-machine key assembled after the stub validates an un-instrumented host (RDTSC deltas, debugger-handler identity, BOUND/FPU exception fingerprints). That key is not present in the file, so faithful recovery is an information-theoretic wall; disrobe carves what survives in place and reports the wall rather than fabricating an unpack. Per-fixture recovery scores are pinned in corpus/native/packers/MANIFEST.toml.
Devirtualizing a bytecode VM
disrobe native devirt protected.exe --out recovered/
disrobe native devirt targets the bytecode-VM tier rather than the compression tier. It locates the interpreter, fingerprints each handler's micro-op behaviorally by probing it through the in-tree x86 emulator (so a per-build handler permutation does not break the lift), recovers the handler-to-opcode table, reconstructs the VM CFG, and lifts the handler bytecode to a re-executable IR plus pseudo-code. The output directory receives the recovered listing, the pseudo-code, and a devirt.manifest.json (schema disrobe.native.devirt/v1).
The lifter is validated end-to-end on a self-authored Tigress-shape bytecode VM: the recovered IR re-executes to the same outputs as the original across arithmetic, loop, and branch programs, lifted from machine code alone (vm_devirt_oracle.rs). The commercial front-ends (VMProtect, Themida, Code Virtualizer, Enigma, WinLicense, PELock) mutate their handler set per build; the lifter is the engine, the Tigress-shape VM is the validated level, and the commercial families are extended from published RE write-ups, not a running licensed sample. A handler stream assembled at run time from a per-machine key, or fetched over the network, is the one residual, and section carve plus protector identification stay available for every family.
Forensic primitives
disrobe native entropy app.exe # ASCII heat-strip + byte histogram + packed-region runs
disrobe native entropy app.exe --format svg --svg map.svg # dark-theme SVG entropy map with section overlays
disrobe native entropy app.exe --format json --out e.json # the disrobe.native.entropy/v0 document
disrobe native signatures app.exe --out sigs.json # AES T-tables, SHA/MD5 IV+K, ChaCha20 sigma
disrobe native signatures app.exe --flirt db.sig # match against an IDA FLIRT database
disrobe native fingerprint app.exe # crypto + FLIRT + string-xref sidecar
disrobe native graph app.exe --out imports.dot # import/export table as Graphviz DOT
disrobe native sbom app.exe --out app.cyclonedx.json # CycloneDX 1.5 SBOM from cargo-auditable metadata
Entropy map & byte histogram
disrobe native entropy slides a 4 KB window across the file computing Shannon entropy (bits/byte) to locate packed, compressed, or encrypted regions, and renders the profile three ways via --format text|json|svg:
- text (default): a Unicode heat-strip sparkline (one glyph per 4 KB block, taller = higher entropy), a 16-bucket ASCII byte-frequency bar chart, and a list of contiguous high-entropy runs (entropy >= 7.0 bits/byte) with their file offsets, the candidate packed/encrypted regions. Pass --out <path> to also drop the JSON document.
- svg: a self-contained, dependency-free neutral-gray entropy map. Each block is a column colored on a calm-gray to amber to red ramp, with PE/ELF/Mach-O section boundaries (parsed from the file's own section table) overlaid as labeled dashed markers and a color legend. Written to ./out/<stem>.entropy.svg or the explicit --svg <path> (which implies SVG rendering). The SVG is byte-stable for a given input (no clock or RNG) and all section names are XML-escaped.
- json: the machine-readable disrobe.native.entropy/v0 document: per-block entropy, the sparkline/heat-strip strings, the 16-bucket histogram, the detected high-entropy runs, and the section spans.
The reusable rendering logic lives in disrobe_pass_native::entropy_viz (entropy_sparkline, byte_histogram, histogram_ascii_16, high_entropy_runs, render_entropy_svg) so other tools can embed it.
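The per-block entropy and high-entropy run detection described above can be sketched in a few lines. This is a standalone illustration with hypothetical function names; it does not reproduce the signatures in disrobe_pass_native::entropy_viz. It assumes non-overlapping blocks (one value per 4 KB block) and reports runs as block-aligned byte ranges.

```rust
/// Shannon entropy of a byte slice in bits per byte (0.0 for empty input).
fn shannon_entropy(block: &[u8]) -> f64 {
    if block.is_empty() {
        return 0.0;
    }
    let mut counts = [0usize; 256];
    for &b in block {
        counts[b as usize] += 1;
    }
    let len = block.len() as f64;
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / len;
            -p * p.log2()
        })
        .sum()
}

/// Entropy of each consecutive `window`-byte block (the last may be short).
/// `window` must be non-zero.
fn block_entropies(data: &[u8], window: usize) -> Vec<f64> {
    data.chunks(window).map(shannon_entropy).collect()
}

/// Contiguous runs of blocks at or above `threshold`, returned as
/// block-aligned (start_offset, end_offset) byte ranges.
fn runs_at_or_above(entropies: &[f64], window: usize, threshold: f64) -> Vec<(usize, usize)> {
    let mut runs = Vec::new();
    let mut start: Option<usize> = None;
    for (i, &e) in entropies.iter().enumerate() {
        match (e >= threshold, start) {
            (true, None) => start = Some(i),
            (false, Some(s)) => {
                runs.push((s * window, i * window));
                start = None;
            }
            _ => {}
        }
    }
    if let Some(s) = start {
        runs.push((s * window, entropies.len() * window));
    }
    runs
}
```

With a 4096-byte window and a 7.0 threshold, a block of uniformly distributed bytes scores 8.0 bits/byte and is reported as a run, while zero padding scores 0.0 and is skipped.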
Full decompile via Ghidra
disrobe native decompile app.exe --out decompiled/
Runs Ghidra headlessly (install it with disrobe install-deps ghidra) and returns pseudo-C alongside the standardized emits. This is the one place where an external native engine is the legitimate primary: disrobe's job is to hand it a clean, unpacked, symbol-rich input.
Go
disrobe recovers symbols from stripped and garbled Go binaries across PE, ELF, and Mach-O by parsing the Go runtime's own metadata tables. The deliverable is symbols, types, and embedded data, not source bodies.
At a glance
| Layer | Coverage |
|---|---|
| Binary formats | PE, ELF, Mach-O |
| pclntab | Header eras go1.2, go1.16, go1.18, and go1.20, located structurally even when the magic word has been stomped |
| Symbol recovery | pclntab function table, moduledata, typelinks/itablinks type metadata, buildversion |
| Obfuscation | garble report graded None / Detected / Partial / Full, with per-scheme literal-recovery statistics |
| Embedded data | embed.FS usage report and directive extraction |
| Debug info | DWARF report when the sections survive |
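The four pclntab header eras map to the magic values defined in Go's own debug/gosym package, and the bytes after the magic have a fixed shape that can be checked even when the magic itself is overwritten. The sketch below illustrates both checks; the type and function names are hypothetical, it reads the magic little-endian only, and a real structural locator would need further cross-checks (for example against the function table) before trusting a candidate.

```rust
/// Header eras named in the table above (hypothetical enum for illustration).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum PclntabEra {
    Go1_2,
    Go1_16,
    Go1_18,
    Go1_20,
}

/// Map the 4-byte magic to an era. Assumes a little-endian target;
/// a big-endian binary stores the same value byte-swapped.
fn era_from_magic(header: &[u8]) -> Option<PclntabEra> {
    let b = header.get(..4)?;
    match u32::from_le_bytes([b[0], b[1], b[2], b[3]]) {
        0xFFFF_FFFB => Some(PclntabEra::Go1_2),
        0xFFFF_FFFA => Some(PclntabEra::Go1_16),
        0xFFFF_FFF0 => Some(PclntabEra::Go1_18),
        0xFFFF_FFF1 => Some(PclntabEra::Go1_20),
        _ => None,
    }
}

/// Magic-independent plausibility check on bytes 4..8:
/// two zero pad bytes, a pc quantum of 1, 2, or 4, and a pointer size of 4 or 8.
fn header_shape_ok(header: &[u8]) -> bool {
    match header.get(4..8) {
        Some(&[0, 0, quantum, ptr_size]) => {
            matches!(quantum, 1 | 2 | 4) && matches!(ptr_size, 4 | 8)
        }
        _ => false,
    }
}
```

A candidate offset whose magic is unrecognized but whose shape check passes is the kind of location a structural scan would keep for deeper validation.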
Recovering a binary
disrobe go recover app --out app-go.json
disrobe go info app
recover writes the full analysis JSON (default ./out/<stem>-go.json); info prints the fingerprint without writing anything. Output shape (illustrative):
go recover: OK
input: app
image kind: elf
ptr size: 8
pclntab ver: go1.20
buildversion: go1.26.3
funcs: ...
packages: ...
garble: None
embed.FS: used=true directives=...
wrote: ./out/app-go.json
info adds the stripped-binary fingerprint: whether the symbol table was stripped, how many functions were still recovered from pclntab, and the stdlib-name ratio that feeds the garble grading.
Garble
The garble report separates a real wall from a tooling boundary. Standard-library names survive in pclntab and are recovered, while hashed user identifiers stay walled: garble hashes them with a keyed HMAC-SHA256 over a build seed that is not in the binary, so the original names are information-theoretically gone and are reported as a name_recovery_wall rather than guessed at.
The garble -literals string encryption is a different story and is not a one-time pad. Each literal is decrypted by an init-time thunk whose key material is itself compiled into the binary, so the plaintext is statically recoverable by emulating that thunk. For the simple obfuscator disrobe does exactly this: garble emits both the key[] and data[] byte arrays as rodata []byte literals and the runtime loop computes plaintext = data <op> key (op in XOR/ADD/SUB), so disrobe locates the adjacent equal-length blob pair, applies the inverse op, strips the junk padding, and bridges the scattered external-key byte-mutations. Recovery is counted per scheme (plain_ascii, single_xor, single_add, single_sub, repeating_xor, garble_simple) and the simple-scheme scan only runs once garble is positively detected, so a normal Go binary is never mined for phantom literals. The residual boundary is concrete rather than information-theoretic: the external-key mutation index and op live as immediate operands in the decrypt thunk's machine code (not in rodata), and the swap/split/shuffle/seed obfuscators wrap each literal in its own control flow, so byte-exact recovery of heavily mutated or control-flow obfuscated literals needs full decrypt-thunk emulation over the .text immediates. The key material is present in the file in every case, which is what makes this a depth boundary and not a wall.
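The core of the simple-scheme recovery is a per-byte inverse over an equal-length key/data pair. The sketch below shows only that step, under the assumption that `Op` names the direction applied when the literal was encoded; the names are hypothetical, and locating the blob pair, stripping junk padding, and bridging external-key mutations are not modeled.

```rust
/// Per-byte operation assumed to have been applied at build time.
#[derive(Debug, Clone, Copy)]
enum Op {
    Xor,
    Add,
    Sub,
}

/// Forward transform, included so the round trip can be checked.
fn encode(plain: &[u8], key: &[u8], op: Op) -> Vec<u8> {
    plain
        .iter()
        .zip(key)
        .map(|(&p, &k)| match op {
            Op::Xor => p ^ k,
            Op::Add => p.wrapping_add(k),
            Op::Sub => p.wrapping_sub(k),
        })
        .collect()
}

/// Inverse transform over an equal-length data/key pair.
/// Returns None when the lengths differ, since the pair is then not a match.
fn decode(data: &[u8], key: &[u8], op: Op) -> Option<Vec<u8>> {
    if data.len() != key.len() {
        return None;
    }
    Some(
        data.iter()
            .zip(key)
            .map(|(&d, &k)| match op {
                Op::Xor => d ^ k,
                Op::Add => d.wrapping_sub(k),
                Op::Sub => d.wrapping_add(k),
            })
            .collect(),
    )
}
```

Wrapping arithmetic matters here: byte addition and subtraction overflow routinely, and the inverse only round-trips if both directions wrap modulo 256.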
Validation and chaining
The pass is validated against a go1.26.3 fixture, and the test suite gates type-name recovery at >= 85% on that fixture. UPX-on-Go chains automatically: disrobe auto unpacks the UPX layer first, then recovers the Go symbols underneath.
Lua
disrobe decompiles compiled Lua chunks across every common dialect, detects the major Lua obfuscators, peels the seven families that have a --family peeler, and devirtualizes IronBrew2 custom-VM wrappers back to runnable Lua source.
At a glance
| Layer | Coverage |
|---|---|
| Dialects | Lua 5.1 / 5.2 / 5.3 / 5.4, LuaJIT 2.0 / 2.1, full Luau (all 82 opcodes, child-proto linking), GLua |
| Decompile output | Lua source plus a manifest recording the detected dialect, fidelity grade (Lossless / Lossy / BestEffort), and any warnings |
| Obfuscator detection | Prometheus, MoonSec v1 / v2 / v3, IronBrew2, WeAreDevs, SLua, AztupBrew, DarkSec, Boronide, PSU, luaobfuscator.com |
| Peelers (--family) | prometheus, moonsec-v1, moonsec-v2, moonsec-v3, ironbrew2, wearedevs, slua; default auto detects first |
| VM devirtualization | IronBrew2 2.7.0 reversed on real committed output, graded by real-Lua execution differential (hello / arith / control / tables / edge in standard and MAX mode); MoonSec-shape recovery is pending a real sample |
Decompiling
disrobe lua decompile script.luac --out script.lua
disrobe lua detect script.luac
decompile writes the recovered source (default ./out/<stem>.lua) and a manifest.json recording the format, fidelity grade, and warnings. detect reports the dialect and header field summary (constant, proto, and code counts) without writing output.
Output shape (illustrative):
lua decompile: OK
input: script.luac
format: Lua54
fidelity: Lossless
warnings: 0
wrote: ./out/script.lua
manifest: ./out/script.manifest.json
Peeling an obfuscator
disrobe lua deobfuscate obfuscated.lua --out clean.lua
disrobe lua deobfuscate dumped.lua --family moonsec-v3 --i-have-authorization
MoonSec v3 and IronBrew2 are commercial-tier wrappers; their peelers require the explicit --i-have-authorization flag. Output shape (illustrative):
lua deobfuscate: OK
family: Auto
detected: Prometheus (confidence=0.97)
passes run: 3
- string_decode
- bytecode_unwrap
- emit
recovered: 12 string(s)
fully peeled: true
residual: 0
wrote: ./out/obfuscated.peeled.lua
The report lists every pass that ran, recovered string constants, a fully peeled verdict, and any residual markers. Where fully peeled is false the report carries the residual marker names and the reason (runtime key, anti-tamper variant, or unmodeled VM tier).
VM devirtualization
IronBrew2 and MoonSec ship their payload behind a custom register-VM: a permuted opcode-handler table and an embedded constant pool that a stock decompiler cannot read. The permutation is not stored in the loader; it is computed at load time inside the obfuscated bootstrap, then used to dispatch handlers and key the constant decryptor. disrobe reconstructs it the same way the loader does.
For IronBrew2 2.7.0, the devirtualizer parses the bootstrap's dispatch chain to derive the encoded -> canonical permutation and the XOR key, decodes the constant pool, and lifts the VM bytecode back to runnable Lua. The committed corpus/lua/ironbrew2 set carries real obfuscated bootstraps for five programs in both standard and MAX mode; each must produce byte-identical output to the original under a real lua binary (tests/ironbrew2_real_oracle.rs).
MAX mode adds three layers on top of standard: a control-flow-flattened dispatch (a nested binary search over the opcode enum, which the same handler walker un-flattens), comparison-polarity number-mutation (the EQ handler tests ~= and jumps on equality, captured as the literal operator plus jump direction), and fused super-operator handlers (one VM step covering several real ops, whose hoisted scratch locals are stripped after classification).
The MoonSec emulate_perm_builder path interprets a bootstrap table-builder over its seed and is unit-tested on a realistic synthetic bootstrap of our own design whose permutation is derived at runtime. End-to-end validation against a real captured MoonSec dump is pending: no live sample is publicly available.
Runtime-key and anti-tamper variants (MoonSec v3 with an encrypted constant pool keyed at runtime) are the wall: the key is not present statically, so disrobe returns fully_recovered: false with a runtime keys residual marker.
PHP
No maintained FOSS tool handles the commercial PHP encoder formats offline. disrobe handles all three dominant encoders fully offline, with nothing uploaded anywhere: it detects and parses each envelope and recovers as much as the static artifact allows. It also peels stacked eval-chain obfuscation and walks Phar archives.
At a glance
| Layer | Coverage |
|---|---|
| Commercial encoders | ionCube, SourceGuardian, Zend Guard: envelope detect and wall (the decrypt key is native-loader-resident); a partial op_array skeleton only for legacy statically-keyed cases (Zend legacy XOR), graded StructuralOnly otherwise |
| Phar archives | Manifest walker with path-sanitized extraction |
| Eval-chain layers | base64_decode, gzinflate, gzuncompress, gzdecode, str_rot13, strrev, str_replace, urldecode / rawurldecode, hex escapes, pack-hex, chr() concatenation, uudecode, single-key XOR, create_function, nested eval, FOPO, Better PHP Obfuscator |
| Recovery grading | EvalChainPeeled / OpArrayDecompiled / StructuralOnly / PlainSource |
Decoding an encoder envelope
disrobe php decode payload.php --out out/payload-php/
disrobe php decode payload.php --encoder ioncube --i-have-authorization
--encoder is auto (default), phar, ioncube, sourceguardian, or zendguard. Commercial encoders require the explicit --i-have-authorization flag. The output directory receives the decoded payload, a skeleton .php when an op_array was decompiled, and a manifest.json recording the encoder, version label, marker offset, ciphertext and plaintext byte counts, and the recovery stage.
Output shape (illustrative):
php decode: OK
input: payload.php
encoder: Ioncube
out dir: ./out/payload-php
manifest: ./out/payload-php/manifest.json
Peeling eval chains
disrobe php deobfuscate obfuscated.php --out clean.php
Unwraps stacked eval() layers until the residue is plain PHP. Output shape (illustrative):
php deobfuscate: OK
input: obfuscated.php
layers: 3
residual_eval:false
wrote: ./out/obfuscated.peeled.php
manifest: ./out/obfuscated.peeled.manifest.json
The manifest counts each layer kind that was peeled and flags whether any eval remains in the residue.
Phar archives
disrobe php extract archive.phar --out extracted/
Walks the Phar manifest and extracts every entry through a path-sanitizer (no .. escapes), writing a manifest.json with the entry count and API version.
Output shape (illustrative):
php extract: OK
input: archive.phar
entries: 14
out dir: ./out/archive-phar
manifest: ./out/archive-phar/manifest.json
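The path sanitizer mentioned for Phar extraction guards against entries that would write outside the output directory. A minimal sketch of that kind of check, using only the standard library, is shown below; `sanitize_entry` is a hypothetical name and the rejection rules (including refusing backslashes and NUL bytes outright) are conservative assumptions rather than disrobe's exact policy.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolve an archive entry name under `out_dir`.
/// Returns None if the name is empty or could escape the directory.
fn sanitize_entry(out_dir: &Path, entry_name: &str) -> Option<PathBuf> {
    // Refuse characters that are separators or terminators on some platforms.
    if entry_name.contains(|c: char| c == '\\' || c == '\0') {
        return None;
    }
    let mut clean = PathBuf::new();
    for component in Path::new(entry_name).components() {
        match component {
            Component::Normal(part) => clean.push(part),
            Component::CurDir => {}
            // Reject "..", absolute roots, and Windows drive prefixes.
            Component::ParentDir | Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    if clean.as_os_str().is_empty() {
        return None;
    }
    Some(out_dir.join(clean))
}
```

Rejecting a suspicious entry instead of rewriting it keeps the extracted tree an honest mirror of the archive: nothing lands at a path the manifest did not name.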
When an encoder's key lives only in its runtime loader, the decode is graded StructuralOnly and the manifest carries the residual ciphertext length rather than presenting the payload as plaintext.
Ruby
disrobe is the only active Ruby bytecode decompiler with a measured fidelity grade. It analyzes every major Ruby artifact format and decompiles YARV and mruby bytecode toward source, verified by a recompile-equivalence oracle on a real MRI interpreter.
At a glance
| Layer | Coverage |
|---|---|
| Flavors detected | MRI source, YARV binary (YARB magic), mruby RITE, JRuby .class, TruffleRuby AOT, Ruby2Exe, Ocra |
| YARV | IBF reader (iseqs, object table, literals) plus a decompiler driven by per-version opcode tables for Ruby 2.6 through 3.4 |
| mruby | RITE reader covering format versions 0001-0007, 0030, 0200, and 0300, with irep disassembly and decompilation |
| Fidelity | 100% opcode-multiset equivalence on a greeter fixture; 85% on a mixed-construct megafile (gate floor, CI-enforced); approximately 90% genuinely recoverable, the remainder an oracle artifact from comment-stripping magic pragmas |
| Output | Analysis JSON plus a .rb source file carrying the decompiled body and a YARV disassembly trailer |
Analyzing an artifact
disrobe ruby decompile app.bin --out app-ruby.json
disrobe ruby detect app.bin
decompile sniffs the flavor, runs the matching reader and decompiler, and writes the analysis JSON (default ./out/<stem>-ruby.json) plus a .rb source file beside it. detect reports the flavor and exits without writing output.
Output shape (illustrative):
ruby decompile: OK
input: app.bin
flavor: YarvBinary
yarv header: major=3 minor=4
yarv iseqs: 12
yarv bodies: 12
yarv objects: 34
yarv literals:18
yarv insns: 97
yarv decomp: Lossless
yarv stmts: 23
decompiled: ./out/app.rb (yarv)
wrote: ./out/app-ruby.json
For MRI source the summary reports token and definition counts. For YARV it adds the IBF header fields, iseq and object counts, instruction count, decompile fidelity, and statement count. For mruby it reports the compiler version string, irep count, instruction count, and whether a body was recovered.
Fidelity measurement
A committed recompile-equivalence oracle compiles the recovered YARV source on the matching interpreter and diffs the opcode multiset. The gate asserts 100% equivalence on the greeter fixture and at least 85% on the megafile fixture; both run in CI. The roughly 5-point megafile gap is an oracle artifact: the oracle strips comment lines before recompiling, discarding the original's magic pragmas (# frozen_string_literal: true and similar), which shifts a handful of string and array literals to non-frozen opcodes.
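An opcode-multiset comparison ignores instruction order and asks only whether the two streams contain the same opcodes with the same counts. The sketch below is one plausible way to score it, not the committed oracle: the function names are hypothetical, and dividing the shared count by the longer stream's length is an assumption made so that both missing and extra opcodes lower the score.

```rust
use std::collections::HashMap;

/// Count how many times each opcode name occurs.
fn multiset<'a>(ops: &[&'a str]) -> HashMap<&'a str, usize> {
    let mut bag = HashMap::new();
    for &op in ops {
        *bag.entry(op).or_insert(0) += 1;
    }
    bag
}

/// Share of opcodes (with multiplicity) common to both streams,
/// relative to the longer stream. Two empty streams score 1.0.
fn multiset_equivalence(original: &[&str], recompiled: &[&str]) -> f64 {
    let longest = original.len().max(recompiled.len());
    if longest == 0 {
        return 1.0;
    }
    let want = multiset(original);
    let got = multiset(recompiled);
    let shared: usize = want
        .iter()
        .map(|(op, &n)| n.min(got.get(*op).copied().unwrap_or(0)))
        .sum();
    shared as f64 / longest as f64
}
```

A literal that compiles to a frozen-object opcode in the original and to a non-frozen string opcode after recompilation shows up here as one unmatched opcode on each side, which is the kind of shift the pragma stripping causes.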
Wrappers
Ruby2Exe and Ocra self-extracting packages are detected as their own flavors so the chain layer can route the embedded payload onward. JRuby .class files and TruffleRuby AOT images are classified but not decompiled here: JVM-class material belongs to the JVM guide, and AOT-compiled native code has no recoverable Ruby body.
BEAM (Erlang / Elixir)
disrobe parses BEAM IFF files, recovers Erlang or Elixir source when debug chunks survive, lifts to Core Erlang otherwise, and disassembles the Code chunk per instruction. A flat text disassembly listing lands beside the JSON automatically.
At a glance
| Layer | Coverage |
|---|---|
| Chunks | AtU8 (short and long form), Code, StrT, Attr, CInf, Dbgi, Docs, ExpT, ImpT, LocT, FunT, Line, LitT (zlib-deflated on OTP 26 and earlier, raw on OTP 27+); unknown chunks are preserved verbatim |
| Source recovery | Erlang abstract code when present, Elixir source from a Dbgi form, best-effort Core Erlang lift as the floor; provenance is recorded in recovered_from |
| Disassembly | Per-instruction Code-chunk trace including the bs_match (opcode 182) command list; a flat .txt listing lands beside the JSON |
| Containers | .ez archives extract through the container layer |
Parsing
disrobe beam parse module.beam --out ./out/module-beam.json
Reports the module name, atom / export / import / fun counts, which optional chunks are present, and any unrecognized chunk names.
Output shape (illustrative):
beam parse: OK
module: my_module
atoms: 42
exports: 8
imports: 15
funs: 3
wrote: ./out/module-beam.json
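The container layout that parse walks is a standard IFF form: a `FOR1` tag, a big-endian 32-bit length, a `BEAM` form type, then a sequence of chunks, each with a 4-byte id, a big-endian 32-bit size, and a payload padded to a 4-byte boundary. A minimal walker is sketched below; `beam_chunks` is a hypothetical name, the outer length field is not validated, and chunk payloads are returned raw without decoding.

```rust
/// Walk the chunks of a BEAM IFF container.
/// Returns (chunk id, payload) pairs borrowed from `file`.
fn beam_chunks(file: &[u8]) -> Result<Vec<(String, &[u8])>, String> {
    if file.len() < 12 || &file[0..4] != b"FOR1" || &file[8..12] != b"BEAM" {
        return Err("not a BEAM IFF container".to_string());
    }
    let mut chunks = Vec::new();
    let mut pos = 12;
    while pos < file.len() {
        let header = file.get(pos..pos + 8).ok_or("truncated chunk header")?;
        let id = String::from_utf8_lossy(&header[..4]).into_owned();
        let size = u32::from_be_bytes([header[4], header[5], header[6], header[7]]) as usize;
        let start = pos + 8;
        let end = start.checked_add(size).ok_or("chunk size overflow")?;
        let payload = file
            .get(start..end)
            .ok_or_else(|| format!("chunk {} runs past end of file", id))?;
        chunks.push((id, payload));
        // Payloads are padded so the next chunk header is 4-byte aligned.
        pos = (end + 3) & !3;
    }
    Ok(chunks)
}
```

Because every chunk is returned with its id regardless of whether the id is recognized, a caller can preserve unknown chunks verbatim and report their names.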
Lifting to source
disrobe beam lift module.beam --out out/module-beam-lift/
Writes three files: <stem>.<ext> (recovered Erlang or Elixir source, extension derived from recovered_from), <stem>.surface.json (the surface record with provenance), and <stem>.core.json (lifted Core Erlang functions), plus a manifest.json linking them.
When a Dbgi chunk is present the original forms are recovered directly and labelled AbstractCode (Erlang) or ElixirDbgiForm (Elixir). Without it the output is a best-effort Core Erlang lift labelled CoreLifted.
Output shape (illustrative):
beam lift: OK
module: my_module
core fns: 8
recovered: AbstractCode
source: ./out/module-beam-lift/module.erl
surface: ./out/module-beam-lift/module.surface.json
core erlang: ./out/module-beam-lift/module.core.json
manifest: ./out/module-beam-lift/manifest.json
Disassembling
disrobe beam disasm module.beam --out ./out/module-beam.disasm.json
Emits the per-instruction Code-chunk trace as JSON and a flat .txt listing beside it. Opcodes beyond the known table fail with an explicit DR-BEAM-0012 error naming the offending opcode rather than silently skipping bytes.
Output shape (illustrative):
beam disasm: OK
input: module.beam
instructions: 214
wrote: ./out/module-beam.disasm.json
listing: ./out/module-beam.disasm.txt
Swift / Objective-C
Swift and Objective-C compile to native machine code, so source-level function bodies do not survive compilation. What survives in the binary is the type metadata the Objective-C and Swift runtimes need at run time. disrobe extracts that metadata, demangles it, and reverses the two dominant rename obfuscators so a dump of a shielded binary reads with its original names.
Commands
disrobe swift classdump App.app/App --out dump.json
disrobe swift shield-undo map.txt --out renames.json
disrobe swift confidential-decrypt blob.bin --key 0x55 --out strings.json
disrobe macho classdump App.ipa --out dump.json
disrobe macho dump App.app/App
disrobe macho slices universal.bin
Class-dump
classdump reconstructs the type interface from two metadata sources the runtime leaves in the binary.
The Objective-C side walks __objc_classlist, __objc_catlist, and __objc_protolist to recover classes, categories, protocols, ivars, properties, and method selectors with their type encodings.
The Swift side parses the reflection sections (__swift5_types, __swift5_fieldmd, __swift5_proto) and demangles the symbols to recover type names, stored fields, and conformances.
The output is a header-style interface listing. Beside the JSON it writes a .swift source file with all recovered type declarations. disrobe swift classdump handles single-slice inputs; for fat binaries and .ipa containers use disrobe macho classdump, which walks every slice.
Output shape (illustrative):
swift classdump: OK
input: App
cpu/bits: arm64 / Bits64
swift types: 24
reflected: 18
mangled syms: 312
demangled: 312
swift source: ./out/App-swift.swift
wrote: ./out/App-swift.json
Rename-undo
shield-undo reverses a SwiftShield run. SwiftShield renames symbols to high-entropy identifiers and emits an obf ==> original mapping in the .dSYM. disrobe parses that mapping and builds the undo lookup, so a subsequent class-dump of the shielded binary reads with the original names.
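The undo lookup is conceptually a dictionary keyed by the obfuscated name. A minimal Python sketch, assuming one `obf ==> original` pair per line as described above (the real map file may carry headers or a different layout, and `parse_rename_map` is a hypothetical helper, not disrobe's API):

```python
def parse_rename_map(text: str) -> dict[str, str]:
    """Build an obfuscated-name -> original-name lookup from mapping text."""
    undo: dict[str, str] = {}
    for line in text.splitlines():
        obf, sep, original = line.partition("==>")
        if not sep:
            continue  # skip headers, blank lines, and anything without the separator
        undo[obf.strip()] = original.strip()
    return undo
```

Applying the lookup to a class-dump is then a plain substitution of each recovered identifier that appears as a key.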
confidential-decrypt recovers plaintext strings from a SwiftConfidential XOR-obfuscated blob given its single-byte key (--key, default 0x55).
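Single-byte XOR is its own inverse, which is why one key is enough to recover the plaintext. A minimal sketch of that operation in Python; `xor_decrypt` is a hypothetical name and the blob framing disrobe expects is not modelled here:

```python
def xor_decrypt(blob: bytes, key: int = 0x55) -> bytes:
    """XOR every byte with a single-byte key; applying it twice restores the input."""
    if not 0 <= key <= 0xFF:
        raise ValueError("key must fit in one byte")
    return bytes(b ^ key for b in blob)
```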
Mach-O commands
disrobe macho dump reports the header, load commands, segments, sections, and any LC_ENCRYPTION_INFO or LC_ENCRYPTION_INFO_64 records. disrobe macho slices walks a fat binary and reports each slice's CPU type, subtype, and offset.
FairPlay-encrypted regions (App Store DRM) are reported detect-only via LC_ENCRYPTION_INFO: the decryption key is not present in the binary, so class-dump of those regions is an information-theoretic wall.
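Detection of this kind only needs the load-command table. The Python sketch below walks a thin, little-endian 64-bit Mach-O and collects its encryption-info records; a non-zero `cryptid` marks the range as encrypted. It is an illustration of the file format, not disrobe's implementation, and it deliberately ignores fat binaries, 32-bit headers, and big-endian images.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
LC_ENCRYPTION_INFO = 0x21
LC_ENCRYPTION_INFO_64 = 0x2C

def encryption_records(macho: bytes) -> list[dict[str, int]]:
    """List encryption-info load commands in a thin little-endian 64-bit Mach-O."""
    magic, _cpu, _sub, _ftype, ncmds, _size, _flags, _rsvd = struct.unpack_from("<8I", macho, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("only thin little-endian 64-bit Mach-O is handled here")
    records = []
    pos = 32  # sizeof(mach_header_64)
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<2I", macho, pos)
        if cmd in (LC_ENCRYPTION_INFO, LC_ENCRYPTION_INFO_64):
            cryptoff, cryptsize, cryptid = struct.unpack_from("<3I", macho, pos + 8)
            records.append({"cryptoff": cryptoff, "cryptsize": cryptsize, "cryptid": cryptid})
        pos += cmdsize  # every load command states its own length
    return records
```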
ActionScript 3 / Flash
disrobe parses SWF files, disassembles their embedded ActionScript 3 bytecode (AVM2), and lifts method bodies back to readable AS3 pseudocode via operand-stack abstract interpretation.
Commands
disrobe as3 disasm movie.swf --out out/
disrobe as3 tags movie.swf
disasm walks every DoABC and DoABCDefine tag, emits a per-instruction AVM2 listing as <label>.disasm.txt beside the JSON, and reconstructs class skeletons with lifted method bodies as <label>.source.as3. tags lists every tag in the SWF: TagCode, byte offset, and payload size.
Output shape (illustrative):
as3 disasm: OK
input: movie.swf
swf version: 10
abc blocks: 2
classes: 6
methods: 24
instructions: 512
source files: 2
disasm files: 2
out dir: ./out
manifest: ./out/manifest.json
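The tag listing reported by tags follows directly from the SWF record layout: an 8-byte file header, a bit-packed frame-size RECT, frame rate and count, then tag records whose 16-bit header packs a 10-bit code and a 6-bit length (0x3F means a 32-bit length follows). A Python sketch of that walk, assuming FWS or zlib-compressed CWS input only (LZMA-compressed ZWS is out of scope) and using the hypothetical name `list_swf_tags`:

```python
import struct
import zlib

def list_swf_tags(swf: bytes) -> list[tuple[int, int, int]]:
    """Return (tag_code, byte_offset, payload_size) for FWS or CWS input."""
    sig = swf[:3]
    if sig == b"CWS":  # zlib-compressed body after the 8-byte header
        swf = b"FWS" + swf[3:8] + zlib.decompress(swf[8:])
    elif sig != b"FWS":
        raise ValueError("unsupported SWF signature")
    # Skip the frame-size RECT: a 5-bit field width, then four fields of that width.
    nbits = swf[8] >> 3
    pos = 8 + (5 + 4 * nbits + 7) // 8
    pos += 4  # frame rate (u16) and frame count (u16)
    tags = []
    while pos + 2 <= len(swf):
        offset = pos
        (code_and_len,) = struct.unpack_from("<H", swf, pos)
        pos += 2
        code, size = code_and_len >> 6, code_and_len & 0x3F
        if size == 0x3F:  # long form: the real length follows as a u32
            (size,) = struct.unpack_from("<I", swf, pos)
            pos += 4
        tags.append((code, offset, size))
        pos += size
        if code == 0:  # End tag
            break
    return tags
```

Offsets here are relative to the decompressed stream, which is an assumption of this sketch rather than a statement about disrobe's output.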
Lifted output
The source lifter reconstructs class skeletons with property access, calls, arithmetic, and if / goto control flow by abstractly interpreting the operand stack. Any method the lifter could not fully model is prefixed with a /// DR-AS3-PARTIAL: line naming the unmodelled opcodes or fabricated operands; a partial recovery is never silently presented as complete.
Obfuscation detection
disrobe fingerprints commercial AS3 obfuscators (secureSWF, DoSWF, Kindi, Irrfuscator, swfLock) and flags techniques: string encryption, name mangling, control-flow flattening, register and stack shuffle, string-pool-rebuild candidates. Each finding carries a confidence score. This is detection only: no decryption, pool rebuild, or unflattening is performed.
Limits
ABC erases local variable names (non-parameter slots surface as loc{n}) and the compiler erases generics before ABC; both are hard ceilings. FFDec is the mature full Flash decompiler and goes further on source-level recovery; disrobe covers SWF parsing and AVM2 disassembly as part of its chain pass, not as a Flash-decompiler replacement.
Mobile (Hermes / Flutter)
disrobe detects the runtime inside a mobile package, extracts React Native and other bundles, lifts Hermes bytecode to a JavaScript surface, and recovers Dart source or disassembles the ARM64 AOT snapshot from Flutter artifacts.
Runtime detection and extraction
disrobe mobile detect app.apk
disrobe mobile extract app.apk --out bundles/
disrobe mobile hermes index.android.bundle --out disasm/
disrobe mobile flutter libapp.so --out layout.json
detect classifies a package as one of: react-native-apk, react-native-ipa, hermes-raw-bytecode, flutter-libapp-so, flutter-dart-kernel, xamarin-apk, cordova-apk, capacitor-apk, nativescript-apk, ipa, android-apk-dex, or unknown. extract pulls bundles out of the container and writes a manifest.json listing each artifact.
Hermes
disrobe hermes decompile index.android.bundle --out surface/
disrobe hermes disasm index.android.bundle --out disasm/
disrobe hermes info index.android.bundle
decompile handles Hermes bytecode versions v60 through v96 and lifts each function back to pseudo-JavaScript. On a hermesc-built HBC v96 sample (8 functions, CI-gated) every function lifts with 0 fallback opcodes. On a non-redistributable 66 MiB production bundle, the 122,633-function table parsed without failure (measured locally, not CI-gated). disasm emits a per-function summary without a JS surface. info prints the version, function count, string count, and identifier count.
This is a structured lifter, not a full decompiler. Variadic call arguments are marked <arg?> where the Hermes frame-register layout is not modeled; unreconstructed opcodes appear in disasm form inline.
Output shape (illustrative):
hermes decompile: OK
input: index.android.bundle
hermes ver: 96
functions: 8
with body: 8
identifiers: 24
strings: 12
opcode cov: 100.0% (312 reconstructed / 0 fallback)
if/loop/try: 3/2/1
source: ./out/index.android.bundle-hermes/index.android.bundle.js
manifest: ./out/index.android.bundle-hermes/manifest.json
Flutter
disrobe flutter dump libapp.so --out layout.json
disrobe flutter decompile libapp.so --out estimate.json
disrobe flutter kernel app.dill --out kernel.json
disrobe flutter disasm libapp.so --emit-listing
disrobe flutter map obfuscation_map.json --out map.json
Two distinct recovery paths cover two distinct Flutter artifacts.
Dart kernel (.dill / kernel_blob.bin). A kernel is the serialized Dart AST. disrobe parses the kernel binary format (magic 0x90abcdef): the footer component index, the string table, per-library class and procedure offset tables, and the embedded UriSource table. From the source table it recovers byte-exact original Dart bodies, sliced per procedure by the kernel file offsets. The recovered .dart source file is always written beside the JSON without needing --emit-source.
Output shape (illustrative):
flutter kernel: OK
input: app.dill
format ver: 130
libraries: 3
classes: 8
procedures: 21
fields: 14
bodies: 21 recovered (byte-exact Dart source from the kernel source table)
strings: 112
wrote: ./out/app-dart-kernel.json
dart source: ./out/app-dart-kernel.recovered.dart
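A quick way to tell whether a file is a kernel at all is to check the magic quoted above. The Python sketch below assumes the magic is stored big-endian and that the 32-bit format version follows it immediately; both are assumptions about the kernel binary layout made for illustration, and `kernel_format_version` is a hypothetical helper.

```python
import struct

KERNEL_MAGIC = 0x90ABCDEF

def kernel_format_version(data: bytes) -> int:
    """Return the format version from a Dart kernel header, or raise if the magic is wrong."""
    magic, version = struct.unpack_from(">II", data, 0)
    if magic != KERNEL_MAGIC:
        raise ValueError(f"not a Dart kernel file (magic {magic:#010x})")
    return version
```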
ARM64 AOT snapshot (libapp.so). The AOT snapshot is ordinary AArch64 machine code. disrobe locates the four _kDart*Snapshot* symbols, recovers class and method names from the isolate-data string table, scans frame prologues to bound functions, and disassembles each body to readable instructions with resolved direct-call and branch targets. flutter decompile also recovers the class table estimate, library URIs, and a string pool from the isolate image.
flutter dump reports the four Dart snapshot sections and their sizes. flutter map parses a Flutter obfuscation_map.json into a typed original-to-obfuscated lookup.
Exact Dart source for an optimized AOT function is not byte-recoverable from the machine code (register allocation and inlining are lossy); use the kernel path for source bodies.
Python pickle
Pickle is a code-execution format wearing a data format's clothes. Unpickling a crafted stream runs arbitrary code through __reduce__ / REDUCE, which makes weaponized .pkl and .pt files a recurring ML supply-chain vector. disrobe analyzes pickles statically: it never unpickles, never calls __reduce__, never executes a REDUCE opcode. You can audit what a stream would do on load before it ever touches a real interpreter.
Commands
disrobe pickle disasm model.pkl --out trace.txt
disrobe pickle decompile model.pkl --out graph.py
disrobe pickle safety model.pkl
disrobe pickle trace model.pkl
disrobe pickle polyglot suspicious.bin
disrobe pickle model-detect model.bin
The symbolic VM
disrobe pickle trace walks the opcode stream and reconstructs the object graph the same way a real unpickler would build it, but every operation is symbolic. A GLOBAL records a (module, name) reference without importing the module; a REDUCE records "this callable would be applied to these arguments" without calling it; the memo, stack, and STOP result are all inert values. The output is the full graph (objects, memo, resolved globals, reduce count) with nothing executed. decompile renders that graph back to equivalent Python assignments.
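The same idea can be approximated with the standard library alone: `pickletools.genops` decodes opcodes without running them. The sketch below is far cruder than a symbolic VM (it keeps no real stack or memo, resolves STACK_GLOBAL heuristically from the two most recent string pushes, and ignores INST / OBJ), and `symbolic_summary` is a hypothetical name, but it shows that imports and REDUCE counts are readable without loading anything.

```python
import pickle
import pickletools

def symbolic_summary(payload: bytes) -> dict:
    """Summarize a pickle stream by reading opcodes only; nothing is loaded."""
    globals_seen = []
    reduce_count = 0
    strings = []  # recent string pushes, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(payload):
        if opcode.name == "GLOBAL":
            module, name = arg.split(" ", 1)  # genops reports "module name"
            globals_seen.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            globals_seen.append((strings[-2], strings[-1]))
        elif opcode.name == "REDUCE":
            reduce_count += 1
        if isinstance(arg, str) and opcode.name != "GLOBAL":
            strings.append(arg)
    return {"globals": globals_seen, "reduce_count": reduce_count}

# A classic hand-written malicious stream; it is only read here, never loaded.
CLASSIC = b"cos\nsystem\n(S'echo hi'\ntR."
print(symbolic_summary(CLASSIC))
print(symbolic_summary(pickle.dumps([1, 2, 3])))
```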
Safety grading
disrobe pickle safety grades a stream into one of three severity tiers. Each finding is tagged with a confidence tier so a reviewer can tell a certain hit from an inference.
| Severity | Meaning |
|---|---|
| benign | No dangerous import, no reduce against a risky callable, no memo abuse |
| suspicious | A pattern that can be malicious in context (unusual import, opaque reduce, memo manipulation) |
| overtly_malicious | A reduce against a known code-execution sink (os.system, subprocess.Popen, builtins.eval / exec, __import__) |
| Confidence | Meaning |
|---|---|
| signature_certain | The finding follows directly from the opcodes (a GLOBAL os system then REDUCE) |
| pattern_inferred | A heuristic shape, not a literal signature match |
| context_dependent | Risky only depending on how the loader uses it |
The report also surfaces the resolved import list, the REDUCE count, and the unused-memo count (a common obfuscation tell), so a triage decision does not require reading the raw opcodes.
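To make the three tiers concrete, here is a deliberately small Python sketch of a grading function over a resolved import list and a REDUCE count. The sink set mirrors the examples in the severity table; the allowlist, the function name `grade`, and the decision order are all assumptions for illustration and are not disrobe's actual rule set.

```python
# Code-execution sinks named in the severity table above.
SINKS = {
    ("os", "system"), ("subprocess", "Popen"),
    ("builtins", "eval"), ("builtins", "exec"), ("builtins", "__import__"),
}
# Hypothetical allowlist of top-level modules a model file is expected to import.
ALLOWED_MODULES = {"collections", "numpy", "torch"}

def grade(globals_seen: list[tuple[str, str]], reduce_count: int) -> str:
    """Map (module, name) imports plus a REDUCE count to a severity tier."""
    if reduce_count and any(g in SINKS for g in globals_seen):
        return "overtly_malicious"
    if any(module.split(".")[0] not in ALLOWED_MODULES for module, _ in globals_seen):
        return "suspicious"  # unusual import: could be fine, needs a reviewer
    return "benign"
```

A real grader would tie each REDUCE to the callable it applies rather than treating the import list and the count separately; this sketch only shows the tiering.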
Polyglots and model files
disrobe pickle polyglot catches the trick of hiding a malicious pickle inside a zip, zip64, or tar that a model loader will open as an archive and then unpickle. disrobe pickle model-detect recognizes PyTorch, TorchScript, and numpy containers and lists every embedded pickle stream, so a multi-file .pt archive is enumerated rather than treated as one opaque blob.
Python bindings
The same static suite is available as a library. Nothing is ever unpickled.
import disrobe
from disrobe import PickleSafety
with open("model.pkl", "rb") as fh:
    payload: bytes = fh.read()
safety: PickleSafety = disrobe.pickle_safety(payload)
severity: str | None = safety.severity # "benign" | "suspicious" | "overtly_malicious"
finding_count: int = safety.finding_count
reduce_count: int = safety.reduce_count
listing: str = disrobe.pickle_disasm(payload)
Shell / PowerShell
disrobe deobfuscates PowerShell, Bash, Batch, VBScript, and VBA. It reverses the PowerShell obfuscator families listed under Covered families as well as Bashfuscator, recovers VBA source from Office documents, and decompiles VBA p-code with stomp detection.
Commands
disrobe shell deob payload.ps1 --out recovered.ps1
disrobe shell detect payload.ps1
deob auto-detects the dialect and obfuscator family, applies the right reversal, and writes the recovered source plus a manifest.json. detect reports the dialect, family, confidence score, and detection markers without writing output.
Output shape (illustrative):
shell deob: OK
input: payload.ps1
dialect: PowerShell
family: InvokeObfuscationToken
confidence: 0.94
markers: ["iex", "token-replace"]
wrote: ./out/payload.deob.ps1
manifest: ./out/payload.deob.manifest.json
Covered families
| Dialect | Families |
|---|---|
| PowerShell | Invoke-Obfuscation (Token, AST, String, Encoding, Compress, Launcher), Invoke-Stealth, PowerHell, Chameleon, psobf, ISESteroids |
| Bash | Bashfuscator (Token, String, Obfuscate, Compress modes), indirection peeler |
| Batch | .bat / .cmd random-char and set-indirection patterns |
| VBA / VBScript | VBA module source recovery, VBScript WSH patterns |
VBA source and p-code
From a .docm / .xlsm / .bin Office container, disrobe parses the dir stream (MS-OVBA), maps each module to its stream and TextOffset, and MS-OVBA-decompresses the CompressedSourceCode at that offset to emit the original .bas / .cls text per module (multi-chunk compression and CopyToken bit-count edges handled). Validated against real Word and Excel documents authored via COM, byte-for-byte against the known module text.
The p-code path lifts a 264-opcode table across VBA3 / VBA5 / VBA6 / VBA7 (32-bit and 64-bit) with identifier resolution. VBA-stomping detection runs a p-code-vs-source classifier that flags modules whose compiled p-code diverges from the stored source and recovers the stomped behavior from the p-code.
Auto-dispatch
disrobe auto detects the dialect and routes obfuscated shell automatically:
disrobe auto payload.ps1 --out recovered/
Containers and archives
Before disrobe can decompile anything, it often has to get inside a container. The disrobe-binfmt layer detects 98 archive, installer, filesystem, and firmware formats and writes member bytes in-tree for all 98, with auto-detection, recursive chaining through nested layers, and shared zip-slip and decompression-bomb guards.
A recursive carve-everything engine scans for every known magic, models chunked payloads, recurses by depth, and uses entropy to separate code from padding.
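The entropy signal is cheap to compute. A minimal Python sketch of per-window Shannon entropy in bits per byte follows; the 4096-byte non-overlapping window and the function names are assumptions for illustration, not the engine's actual parameters. Padding and sparse tables score near 0, while compressed or encrypted payloads approach 8.

```python
import math
from collections import Counter

def shannon_entropy(chunk: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not chunk:
        return 0.0
    total = len(chunk)
    return sum(n / total * math.log2(total / n) for n in Counter(chunk).values())

def window_entropy(data: bytes, window: int = 4096) -> list[float]:
    """Entropy of each consecutive fixed-size window of the input."""
    return [shannon_entropy(data[i:i + window]) for i in range(0, len(data), window)]
```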
Supported formats
| Category | Formats |
|---|---|
| Archives and installers | ZIP (incl. ZIP64 + AES), tar.gz / tar.bz2 / tar.xz / tar.zst, 7z, RAR4 and RAR5 (stored members from both; RAR5 LZ "normal" method decoded in-tree; RAR 2.9/3.x LZ used by compressed RAR4 is named per-entry, not decoded in-tree), .cab, MSI, MSIX / APPX, NSIS (solid and non-solid), Inno Setup (decoded setup-data block stream; per-file split via version-specific TSetupHeader parse is the documented limit), InstallShield (stored and zlib members), .deb, .rpm (metadata), AppImage, Flatpak, Snap |
| Bare compression streams | gzip, bzip2, zstd, lzma, lzip, lz4-frame, zlib, .Z (Unix compress) |
| Legacy archives | ar, arj (methods 1-3 decoded; method 4 carved verbatim), arc (rle / squeeze / lzw decoded; methods 5-7 carved verbatim), lzh, lzop, FreeBSD uzip, Xamarin xalz, par2, ELF appended-overlay carve, StuffIt (classic stored forks decoded; compressed forks carved verbatim with a documented note), partclone (decoded) |
| Embedded-linux filesystems | squashfs, cramfs, ext4, romfs, minixfs, jffs2, UBI + UBIFS, yaffs, erofs (chunk and lcluster lz4 / deflate / zstd decoded; microlzma and compact index carved), NTFS, android-sparse, btrfs-send |
| Disk images and partitions | GPT and MBR (partition tables parsed; each partition carved and recursed in-tree), VHD (fixed and dynamic BAT), VHDX (region table + BAT; logical disk materialized from the block-allocation table, then partition-carved and FAT12 / 16 / 32 walked to pull individual stored files), WIM (header resources with XPRESS / LZX / LZMS chunk payloads decompressed in-tree), FAT12 / 16 / 32 (boot sector, FAT chain walk, root and subdirectory traversal) |
| Apple | .dmg (UDIF: koly trailer + blkx mish chunks; ADC / zlib / bzip2 / LZFSE / LZMA chunk decoders; then HFS+ catalog walk extracts individual files, all in-tree), .pkg (xar TOC + gzip / bzip2 heap, extracted in-tree) |
| Vendor firmware | D-Link (SHRS / encrypted-img AES / alpha / fpkg), EnGenius XOR, Autel ECC table, QNAP PC1, plus CRC-verified Netgear (chk / trx), Xiaomi, Tesla, HP, Moxa, INSTAR, and Airoha carves; OTP-AES Airoha firmware is an information-theoretic wall and is carved verbatim |
| Standalone executables | Bun --compile binaries (embedded JS module graph + sourcemaps), Unity AssetBundle (UnityFS) |
| App / runtime | Electron .asar, Docker image tarball, OCI image manifest + layers, ISO 9660 + Joliet (extracted in-tree) |
Extraction
Most extraction happens implicitly inside disrobe auto, which detects a container, extracts it, and recurses into the contents. Archive-shaped inputs are also available directly:
disrobe py extract package.whl --out extracted/
disrobe auto installer.msi --out extracted/
disrobe auto firmware.bin --out extracted/ --recursive --max-depth 6
--recursive enables recursive carve; --max-depth limits nesting depth (default 8).
Safety guards
Every extractor shares the quota machinery in crates/disrobe-binfmt/src/quota.rs:
- Per-entry size cap and aggregate size cap defuse decompression bombs.
- Recursion-depth cap defuses container-in-container bombs.
- Zip-slip path sanitization (sanitize_entry_path): every entry path is sanitized so no extraction can escape the output directory, on every format.
Bypasses of any of these are treated as security issues; see the security policy.
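For readers unfamiliar with zip-slip, the guard amounts to turning every archive entry name into a relative path that cannot climb out of the extraction root. The Python sketch below shows one conservative policy (strip any leading root, reject every `..` segment and anything that looks like a drive letter). It illustrates the technique only; the real sanitize_entry_path lives in the Rust crate and may apply different rules, and `sanitize_entry` is a hypothetical name.

```python
from pathlib import PurePosixPath

def sanitize_entry(entry_name: str) -> PurePosixPath:
    """Return a relative path that cannot climb out of the extraction root."""
    normalized = entry_name.replace("\\", "/")  # treat Windows separators alike
    parts = []
    for part in PurePosixPath(normalized).parts:
        if not part.strip("/") or part == ".":
            continue  # drop the root marker ("/" or "//") and no-op segments
        if part == ".." or ":" in part:
            raise ValueError(f"entry escapes output directory: {entry_name!r}")
        parts.append(part)
    if not parts:
        raise ValueError("empty entry path")
    return PurePosixPath(*parts)
```

Rejecting `..` outright, instead of resolving it, is the simpler choice: a legitimate archive rarely needs it, and resolution logic is where bypasses tend to hide.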
CLI overview
Every disrobe operation follows one shape:
disrobe <pass> <action> <input> [--out <path>] [flags]
A few top-level commands (auto, chain, scan, ioc, strings, behavior, identify, detect, query, capabilities, extract, report, diff, status, verify, passes, doctor) take their arguments directly rather than through a pass/action pair.
Discovering the surface
disrobe --help # every subcommand
disrobe <pass> --help # actions and flags for one pass, e.g. `disrobe py --help`
disrobe passes # one-line capability summary per registered pass
disrobe explain DR-CLI-0030 # look up any error code
Subcommand inference is enabled: unambiguous prefixes work (disrobe dec ... resolves if only one subcommand starts with dec).
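Prefix inference reduces to "exactly one candidate starts with what was typed". A Python sketch of that rule, using a handful of the top-level command names from this page as sample data; the function name and the exact-match-wins behavior are assumptions about how the parser resolves ties:

```python
def resolve_subcommand(prefix: str, commands: list[str]) -> str:
    """Resolve a typed prefix to a subcommand, failing when it is not unique."""
    if prefix in commands:
        return prefix  # an exact name always wins
    matches = [c for c in commands if c.startswith(prefix)]
    if len(matches) != 1:
        raise ValueError(f"{prefix!r} is ambiguous or unknown: {matches}")
    return matches[0]

COMMANDS = ["detect", "diff", "doctor", "scan", "status", "strings"]
print(resolve_subcommand("dif", COMMANDS))  # prints "diff"
```

In scripts it is safer to spell commands out in full, since a prefix that is unique today can become ambiguous when a new subcommand is added.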
Output formats
The output format is a global flag, so it applies to any command:
| Flag | Output |
|---|---|
| (default) | Human-readable text |
| --json | A single structured JSON document |
| --ndjson | Newline-delimited JSON (streaming) |
| --sarif | SARIF 2.1.0, for GitHub code scanning and other SARIF consumers |
disrobe scan firmware.bin --sarif > findings.sarif
disrobe py decompile m.pyc --json
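Because SARIF 2.1.0 is a fixed schema, a downstream consumer does not need disrobe-specific code. The Python sketch below flattens any SARIF log into rows; the sample document is hand-written for illustration (the rule id and message are invented, not real disrobe output).

```python
import json

def sarif_findings(document: str) -> list[tuple[str, str, str]]:
    """Flatten a SARIF 2.1.0 log into (rule_id, level, message) tuples."""
    log = json.loads(document)
    rows = []
    for run in log.get("runs", []):
        for result in run.get("results", []):
            rows.append((
                result.get("ruleId", ""),
                result.get("level", "warning"),  # SARIF's default level
                result.get("message", {}).get("text", ""),
            ))
    return rows

# Hand-written minimal log; the rule id is hypothetical.
SAMPLE = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "example-tool"}},
        "results": [{"ruleId": "EXAMPLE-0001", "level": "error",
                     "message": {"text": "hard-coded credential"}}],
    }],
})
print(sarif_findings(SAMPLE))
```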
The standard recovery loop
disrobe auto input.bin --out recovered/ --capture-stages # recover
disrobe status # what landed in ./out/
disrobe context --out recovered/ # per-pass verdict + confidence
disrobe verify recovered/final/*.dr # check envelope integrity
The next pages cover global flags in full, the complete command reference, the diff and guard tooling, and the daemon surface.
Global flags
These flags are accepted on every subcommand (they are declared global = true on the root parser).
Output and verbosity
| Flag | Effect |
|---|---|
| -v, -vv, -vvv | Increase log verbosity. |
| -q, --quiet | Suppress non-error output. |
| --color <auto\|always\|never> | Control ANSI color in terminal output. |
| --json | Emit a structured JSON document instead of human text. |
| --ndjson | Emit newline-delimited JSON (streaming). |
| --sarif | Emit SARIF 2.1.0 (GitHub code scanning, etc.). |
| --progress <auto\|always\|never> | Progress-bar rendering. auto renders only on a TTY. |
Execution control
| Flag | Effect |
|---|---|
| --seed <N> | RNG seed for any non-deterministic backend (keeps runs reproducible). |
| --config <PATH> | Load a .disrobe.toml config file. Without it, disrobe walks up from the CWD to discover one. See project configuration. |
| --in-place | Rewrite the input file in place. |
| --force | Overwrite existing outputs without prompting. |
| -j, --threads <N> | Worker thread-pool size (defaults to detected CPU count). |
| --no-cache | Bypass the .dr envelope cache. Output is identical with or without this; it is a performance toggle, not a correctness one. |
| --dry-run | Report what would happen without writing any output. |
LLM sidecar flags
The full --llm family is also global. See LLM sidecar and provenance for the complete pack/category model. Summary:
| Flag | Effect |
|---|---|
| --llm | Alias for --metadata-pack-4 (full bundle, auth-gated categories included). |
| --metadata-pack-1 | Pack-1: ast + disasm + symbols + strings. |
| --metadata-pack-2 | Pack-2: pack-1 + cfg + types + imports + provenance. |
| --metadata-pack-3 | Pack-3: pack-2 + dfg + signatures + constants + roundtrip + sourcemap + manifest. |
| --metadata-pack-4 | Pack-4: pack-3 + confidence + opcode-coverage + pii-map + decryption-keys (auth-gated). |
| --ast, --disasm, --cfg, --dfg | Add individual AST / disassembly / CFG / DFG categories. |
| --symbols, --strings, --types, --imports | Add symbols / strings / recovered-types / imports categories. |
| --constants, --signatures, --provenance | Add constants / function-signatures / provenance categories. |
| --roundtrip-verdict, --source-map, --manifest-cat | Add roundtrip-verdict / source-map / manifest categories. |
| --confidence, --opcode-coverage, --pii-map | Add confidence-scores / opcode-coverage / pii-map categories. |
| --decryption-keys | Add decryption-keys category (requires --i-have-authorization). |
| --metadata-include <cats> / --metadata-exclude <cats> | Toggle comma-separated categories after applying a pack preset. |
| --metadata-out <PATH> | Override the bundle output path (default: <stem>.disrobe.llm.json next to the primary output). |
| --metadata-format <json\|jsonl\|cbor\|msgpack> | Bundle serialization format (default json). |
| --llm-briefs | Also emit AGENTS.md and SKILL.md reconstruction briefs next to the bundle. |
| --i-have-authorization | Unlocks the auth-gated decryption-keys category and grey-zone protector behavior. |
The authorization gate
--i-have-authorization is the single gate guarding behavior that is legally sensitive: grey-zone commercial-protector reversal and the decryption-keys LLM category. Without it, those paths refuse to run (DR-CLI-0420 for decryption keys). It is your assertion that you are authorized to analyze the input under the statutory framing in LEGAL.md.
Command reference
The authoritative source is always disrobe <command> --help. This page is a complete map of the command surface. [--out] and the standardized [--emit ...] selector are available on most passes; see the global flags for flags that apply everywhere.
Python
| Command | Purpose |
|---|---|
| disrobe py decompile <pyc> | Decompile a .pyc to source. --backend native (the only supported value). --no-roundtrip skips the recompile-equivalence check. |
| disrobe py disasm <pyc> | Per-instruction disassembly (1.0-3.15 + PyPy/MicroPython/Jython/IronPython/Brython). |
| disrobe py deob <src> | Peel a source obfuscator. --cleanup runs a ruff-AST fold. |
| disrobe py extract <archive> | Extract a wheel / sdist / egg / .whl / .zip / any archive. |
| disrobe py sourcedefender <pye> | Decrypt a SourceDefender .pye envelope. |
| disrobe pyarmor unpack <py> | Unpack PyArmor v6-v9-pro. --allow-dynamic permits the dynamic-hook fallback (trusted/sandboxed samples only). --dynamic-timeout <SECS>. --mode auto\|standard\|super. --target <PYVER>. --allow-bcc. --strict. --no-cextract / --cextract-only. --all-emits writes stubs for all 12 emit kinds. --cache <DIR>. |
| disrobe pyinstaller extract <exe> | Extract a PyInstaller build (2.x-6.20+, AES decrypt). |
| disrobe pyinstaller detect <exe> | Report cookie / Python version / TOC offsets without extracting. |
| disrobe pyfreeze extract <exe> | Extract cx_Freeze / py2exe / shiv / pex / PyOxidizer / Briefcase. |
| disrobe pyfreeze detect <exe> | Identify the freezer without extracting. |
| disrobe nuitka detect\|extract\|symbols\|decompile\|const <input> | Nuitka flavor detect, --onefile extract, symbol scan, constants decompile, single .const decode. |
JavaScript / WebAssembly
| Command | Purpose |
|---|---|
| disrobe js deob <js> | Deobfuscate (obfuscator.io, JS-Confuser, Jscrambler, esoteric encoders). |
| disrobe js unbundle <js> | Split a bundle into per-module sources (11 bundlers). |
| disrobe js v8 <blob> | Inspect V8 .jsc / Node SEA / nexe / nw.js / Electron .asar. |
| disrobe wasm decompile <wasm> | Lift to --target json\|rust\|ts\|wat\|c. |
| disrobe wasm deob <wasm> | Reverse Wasm obfuscator families. |
| disrobe wasm component <wasm> | Parse a Component Model envelope. |
| disrobe wasm gc-types <wasm> | Recover the GC type graph. |
JVM / Android / .NET
| Command | Purpose |
|---|---|
| disrobe jvm decompile <class\|jar\|dex\|apk> | Decompile via --backend cfr\|vineflower\|procyon\|jadx. |
| disrobe jvm extract <jar\|apk> | Extract container + dump classfile inventory. |
| disrobe jvm backends | Report JVM/Android backends on PATH. |
| disrobe apk <apk> | Decode the binary AndroidManifest.xml, map resource ids to names, and dump each signer certificate's SHA-256. --out <DIR> writes the decoded manifest and resource table to disk. |
| disrobe dotnet decompile <dll\|exe> | Decompile via --backend ilspy\|dnspy\|dnspyex\|de4dot. |
| disrobe dotnet analyze <dll> | PE/CLR metadata, protector detection, R2R + NativeAOT probe. |
| disrobe dotnet backends | Report .NET backends on PATH. |
Native
| Command | Purpose |
|---|---|
| disrobe native decompile <bin> | Ghidra-headless decompile. --emit source,disasm,ast,cfg,ir,manifest,sourcemap,symbols,strings,imports,signatures,report. |
| disrobe native symbols <bin> | Dump symbols, sections, segments, imports, and debug info. |
| disrobe native identify <bin> | Fingerprint compiler / packer / protector / installer, each routed to its pass. |
| disrobe native unpack [bin] | Detect + unpack UPX/kkrunchy/NSPack/Petite/MPRESS/MEW/FSG/ASPack/PECompact/Yoda's Crypter via in-house decoders + x86 stub emulator. Input is optional; --list shows all supported packers. |
| disrobe native devirt <bin> | Devirtualize the bytecode-VM tier: recover the handler table, lift to a re-executable IR + pseudo-code. |
| disrobe native export <bin> | Unpack, recover symbols, and export a backend-ready bundle: a rebuilt loadable PE + a Ghidra post-script / IDAPython / JSON symbol map. --format ghidra\|ida\|json (default ghidra). |
| disrobe native disasm <bin> | Per-function listing / --emit cfg-dot CFG / --emit json / --raw linear sweep (--syntax intel\|at&t\|nasm\|masm). Accepts a .dr envelope. |
| disrobe native callgraph <bin> | Whole-program call graph as Graphviz DOT. |
| disrobe native patch <bin> | Rewrite bytes at a VA (or nop a span) and revalidate the image. |
| disrobe native sigmaker <bin> | Wildcarded byte signature from a function, uniqueness-tested. |
| disrobe native diff <a> <b> | Match functions across two builds by content + CFG fingerprint. |
| disrobe native entropy <bin> | 4KB sliding-window Shannon entropy; ASCII heat-strip + byte histogram + packed-region runs. --format text\|json\|svg (default text), --svg <out> for a dark-theme entropy map with section overlays. |
| disrobe native signatures <bin> | Crypto-constant fingerprints (AES, SHA, ChaCha20). --flirt <sig> to match a FLIRT DB. |
| disrobe native fingerprint <bin> | Aggregate crypto-constant + FLIRT + string-xref sidecar at .disrobe/fingerprints/<stem>.json. --flirt <sig>. |
| disrobe native sbom <bin> | CycloneDX 1.5 SBOM from cargo-auditable metadata embedded in the binary. |
| disrobe native graph <bin> | Import/export table as Graphviz DOT. |
| disrobe query <bin\|.dr> <q...> | Queryable IR: functions, calls-to <sym>, xrefs-to <sym>, string-decoders, complexity-over <n>, capability <network\|crypto\|filesystem\|process>. Accepts a raw binary or a Disasm-rung .dr envelope. |
| disrobe capabilities <bin\|.dr> | Rule engine over the IR, mapping behaviors to MITRE ATT&CK + MBC with per-match evidence. |
Other languages
| Command | Purpose |
|---|---|
| disrobe go recover\|info <bin> | Go symbol recovery / build fingerprint. |
| disrobe lua decompile\|deobfuscate\|detect <chunk> | Lua decompile / obfuscator peel / dialect detect. |
| disrobe php decode\|deobfuscate\|extract <input> | Encoder decode / eval-chain peel / Phar extract. |
| disrobe ruby decompile\|detect <input> | Ruby artifact analysis / flavor detection. |
| disrobe beam parse\|lift\|disasm <beam> | BEAM chunk parse / Core Erlang lift / Code disasm. |
| disrobe pickle disasm\|decompile\|safety\|trace\|polyglot\|model-detect <input> | Pickle static analysis suite. |
| disrobe swift classdump\|shield-undo\|confidential-decrypt <input> | Swift/ObjC class-dump, SwiftShield rename-undo, Confidential XOR-decrypt. |
| disrobe macho dump\|classdump\|slices <input> | Mach-O / fat / .ipa inspection. |
| disrobe as3 disasm\|tags <swf> | AS3 DoABC disasm / SWF tag list. |
| disrobe hermes decompile\|disasm\|info <bundle> | Hermes JS-surface lift / disasm / header. |
| disrobe flutter dump\|decompile\|kernel\|disasm\|map <input> | Flutter Dart AOT + kernel inspection. |
| disrobe mobile detect\|extract\|hermes\|flutter <input> | Mobile runtime pipeline. |
Chain, envelope, and forensics
| Command | Purpose |
|---|---|
| disrobe detect <input> | Run every obfuscator/packer catalog detector against a file and report each hit (pass, obfuscator, confidence, markers). |
| disrobe auto <input> | Auto-detect + chain. --max-depth <N> (default 8), --capture-stages, --emit recovery, --dry-run. A directory input is batch-processed recursively (--include <GLOB>, --exclude <GLOB>, --batch-max-depth <N>, --jobs <N>) into an aggregate manifest.json. |
| disrobe chain <input> | Explicit pipeline. --chain 'auto:8' or 'pyarmor+py-decompile', --chain-pin <ver>, --capture-stages. |
| disrobe diff <left> <right> | Structurally diff two chain.json documents (passes, stage BLAKE3 hashes, sizes, verdicts). |
| disrobe guard verify <subject> --reference <ref> | Verify a subject chain.json's per-stage output hashes against a committed reference. |
| disrobe guard check <path> [--root <subtree>...] | Deny writes to ground-truth stage paths (out/**/stages, out/**/final, .disrobe-stage-lock). --root adds extra protected subtrees (repeatable). |
| disrobe envelope create\|inspect\|verify\|diff\|migrate-check <dr> | .dr envelope operations. |
| disrobe verify <dr> | Alias for disrobe envelope verify. |
| disrobe scan <path> | Scan raw bytes for leaked credentials. |
| disrobe ioc <path> [--format text\|json\|sarif] [--defang] | Extract indicators of compromise (URLs, IPs, domains, emails, paths, registry keys, wallets, crypto constants); decodes one base64/hex layer. |
| disrobe strings <path> [--min-len N] [--no-decode] | Cross-format string extraction: ASCII + UTF-16LE, with single-byte XOR / base64 / ROT-n / stack-string deobfuscation. |
| disrobe behavior <path> | Behavior / capability summary across 7 categories, tagged with MITRE ATT&CK technique ids. |
| disrobe yara parse <path> | Parse a YARA ruleset into a typed AST (read-only, no matching). |
| disrobe yara generate <input> [--name N] [--sha256 H] [--date D] | Generate a candidate YARA rule from an artifact; output round-trips through the parser. |
| disrobe status | Summarize ./out/: per-stage counts, sizes, manifests. |
| disrobe context --out <dir> | Summarize a recovery report (status, confidence, verdict, provenance). |
| disrobe report <dir-or-input> [--format text\|json\|markdown\|html] | Consolidate a completed run (or raw input) into a forensic summary: identity, topology, per-stage verdicts/scores, artifact inventory, timings. --format html emits a self-contained, offline, dark-theme report (inline SVG bars, IOC + ATT&CK tables, XSS-escaped). |
Workspace, agents, and meta
| Command | Purpose |
|---|---|
| disrobe init [--ide claude\|cursor\|windsurf\|aider] [--force] | Scaffold a .disrobe/ workspace. |
| disrobe config [show] | Print the resolved .disrobe.toml config (honors --json). See project configuration. |
| disrobe config init [--out <path>] [--force] | Write a documented .disrobe.toml template. |
| disrobe annot refresh\|regenerate | Rebuild a symbol annotation file. |
| disrobe rename <old> <new> [--note] | Record an append-only rename. |
| disrobe passes | List every registered pass with a one-line capability summary. |
| disrobe explain <code> | Look up a DR-* error code and print its description and common fixes. |
| disrobe doctor [--auto-install] [-y] | Probe ~50 optional external tools; report installed, missing, or stale. |
| disrobe install <tool> [--list] [-y] [--dry-run] | Install one optional tool via the native package manager. |
| disrobe install-deps [<dep>] [--all] [--dry-run] | Install heavyweight deps (Ghidra) from upstream releases. |
| disrobe serve [--bind <ADDR>] [--stdio\|--mcp\|--grpc] | Run the daemon. See the daemon. |
| disrobe completions <shell> [--install] [--rc-file <PATH>] | Generate shell completions (bash, zsh, fish, PowerShell, elvish). |
| disrobe man [--out <dir>] | Generate man pages (one .1 per subcommand). |
| disrobe bug-report [--out <PATH\|->] | Collect environment, manifests, and tooling versions into a markdown bug report. |
| disrobe self-update [--check-only] [--dry-run] | Print self-update guidance (source-only distribution; no network by default). |
Project configuration (.disrobe.toml)
disrobe reads an optional .disrobe.toml so a project can set its own defaults instead of repeating the same global flags on every invocation.
Resolution order
Values are merged from three layers, lowest precedence first:
- Built-in defaults: what disrobe does with no config and no flags.
- .disrobe.toml: the discovered or explicitly named config file.
- CLI flags: anything you type on the command line.
A flag you actually pass always wins over the config file; a flag you leave off falls back to the config value, and only then to the built-in default. "Did the user pass this flag" is decided by clap's value source, so there is no ambiguity between --json being absent and being false.
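The precedence rule can be sketched as a plain dictionary merge. This is an illustration only, not disrobe's implementation: `BUILTIN_DEFAULTS`, `resolve`, and the three keys shown are hypothetical, and `cli_passed` stands in for "flags whose clap value source is the command line".

```python
from typing import Any, Mapping

# Hypothetical built-in defaults; only three keys are shown for illustration.
BUILTIN_DEFAULTS = {"json": False, "max_depth": 8, "color": "auto"}


def resolve(config_file: Mapping[str, Any], cli_passed: Mapping[str, Any]) -> dict[str, Any]:
    """Merge the three layers, lowest precedence first.

    cli_passed holds only the flags the user actually typed, so a flag
    that was left off never overrides the config file with a default.
    """
    resolved = dict(BUILTIN_DEFAULTS)   # layer 1: built-in defaults
    resolved.update(config_file)        # layer 2: .disrobe.toml
    resolved.update(cli_passed)         # layer 3: flags that were passed
    return resolved
```

Keeping "absent" distinct from "false" is what lets `json = true` in the config survive a command line that never mentions `--json`.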
Discovery
- With --config <PATH>, that file is loaded. A missing path is a hard error (DR-CLI-0332).
- Without --config, disrobe walks up from the current working directory looking for .disrobe.toml, exactly the way cargo finds Cargo.toml. The first one found wins. If none exists, the built-in defaults are used.
Malformed TOML (DR-CLI-0330) and unknown keys are hard errors: a typo fails fast instead of being silently ignored.
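A minimal sketch of the discovery walk, assuming nothing beyond the rules above. `discover_config` is a hypothetical helper; it maps the DR-CLI-0332 case onto a Python `FileNotFoundError` purely for illustration.

```python
from pathlib import Path
from typing import Optional

CONFIG_NAME = ".disrobe.toml"


def discover_config(start: Path, explicit: Optional[Path] = None) -> Optional[Path]:
    """Return the config file to load, or None to use built-in defaults."""
    if explicit is not None:
        # --config <PATH>: a missing file is a hard error, never a fallback.
        if not explicit.is_file():
            raise FileNotFoundError(f"DR-CLI-0332: config not found: {explicit}")
        return explicit
    start = start.resolve()
    # Check the starting directory first, then each ancestor up to the root.
    for directory in (start, *start.parents):
        candidate = directory / CONFIG_NAME
        if candidate.is_file():
            return candidate      # the first one found wins
    return None
```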
The config command
| Command | Purpose |
|---|---|
disrobe config / disrobe config show | Print the resolved effective config and the file it came from. Honors --json. |
disrobe config init [--out <PATH>] [--force] | Write a fully documented .disrobe.toml template (default ./.disrobe.toml). |
Schema
[output]
dir = "out" # default output directory for chain/auto runs
emit = ["source", "manifest"] # default --emit kinds where a pass accepts them
json = false # default machine-output toggles (CLI flags still override)
ndjson = false
sarif = false
color = "auto" # auto | always | never
progress = "auto" # auto | always | never
verbosity = "warn" # warn | info | debug | trace
quiet = false
[execution]
threads = 8 # worker pool size (default: detected CPU count)
force = false
in_place = false
no_cache = false
cache_dir = "/var/cache/disrobe" # content-addressed .dr envelope cache (default: OS cache dir)
dry_run = false
seed = 42 # RNG seed for non-deterministic backends
max_depth = 8 # default chain depth for `auto`
[backends]
py = "native" # native (in-tree CPython 1.0..3.15 engine; the only supported value)
jvm = "cfr" # cfr | vineflower | procyon | jadx
dotnet = "ilspy" # ilspy | dnspy | dnspyex | de4dot
wasm = "wat" # json | rust | ts | wat | c
lua = "native"
[passes]
enable = ["pyarmor.unpack", "py.decompile"] # restrict chain runs to these passes
disable = ["native.packer-unpack"] # never run these passes
All tables and all keys are optional. An empty file is valid and resolves to the built-in defaults.
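Because unknown keys are hard errors, a pre-flight check is easy to sketch. The key sets here are copied from the schema listing above and may not be exhaustive; `KNOWN_KEYS` and `unknown_keys` are hypothetical names, and the function works on an already-parsed dictionary.

```python
# Tables and keys copied from the schema listing; the real set may be larger.
KNOWN_KEYS = {
    "output": {"dir", "emit", "json", "ndjson", "sarif", "color",
               "progress", "verbosity", "quiet"},
    "execution": {"threads", "force", "in_place", "no_cache", "cache_dir",
                  "dry_run", "seed", "max_depth"},
    "backends": {"py", "jvm", "dotnet", "wasm", "lua"},
    "passes": {"enable", "disable"},
}


def unknown_keys(config: dict) -> list[str]:
    """List every table or key that the schema above does not mention."""
    problems = []
    for table, body in config.items():
        if table not in KNOWN_KEYS or not isinstance(body, dict):
            problems.append(table)
            continue
        problems.extend(f"{table}.{key}" for key in body if key not in KNOWN_KEYS[table])
    return sorted(problems)
```

On Python 3.11+, `tomllib.loads(text)` produces a dictionary of the shape this function expects; an empty file parses to `{}` and reports no problems.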
Batch directory processing
disrobe auto accepts a directory as well as a single file. Point it at a directory and it walks the tree, runs the auto-chain on every file, writes each file's outputs under out/<relative-stem>/, and emits one aggregate out/manifest.json.
Single-file behavior is unchanged: disrobe auto <file> still writes chain.json + recovery.json into a single out dir.
Usage
disrobe auto ./samples
disrobe auto ./samples --out ./out/triage --include '**/*.pyc' --exclude '*_test.*' --jobs 4
If --out is omitted, batch output lands in ./out/<dir-name>-batch/.
Flags (batch-only)
| Flag | Effect |
|---|---|
--batch-max-depth <N> | Maximum directory recursion depth (default: unlimited). Depth 0 is the directory itself; depth 1 is its immediate children. |
--include <GLOB> | Only process files matching this glob. Repeatable. With no include, all files are in scope. |
--exclude <GLOB> | Skip files matching this glob. Repeatable. Exclude wins over include. |
--jobs <N> | Bounded worker concurrency. Default is 1, kept conservative because chains can be memory-heavy. Raise it on machines with headroom. |
The --max-depth <N> (default 8), --capture-stages, --emit recovery, and global flags continue to apply. --max-depth is the per-file chain depth; --batch-max-depth is the directory recursion depth.
Glob syntax
Globs match against the slash-normalized path relative to the root.
| Token | Matches |
|---|---|
| * | Any run of characters within a single path segment (does not cross /). |
| ** | Any run including / (spans directories). |
| ? | Exactly one non-/ character. |
| [abc], [a-z], [!0-9] | A character class, with !/^ negation and a-z ranges. |
A bare pattern with no / (for example *.bin) also matches files in subdirectories, so the common "all .bin files" case works without writing **/.
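The token table and the include/exclude rules can be approximated by translating each glob into a regular expression. This is a sketch of the documented semantics, not the matcher disrobe ships; `glob_to_regex`, `glob_match`, and `in_scope` are hypothetical names, and edge cases such as whether `**/x` matches a root-level `x` are not covered by the text above.

```python
import re


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    out, i = [], 0
    while i < len(pattern):
        ch = pattern[i]
        if pattern.startswith("**", i):
            out.append(".*")              # may span directories
            i += 2
        elif ch == "*":
            out.append("[^/]*")           # stays inside one path segment
            i += 1
        elif ch == "?":
            out.append("[^/]")            # exactly one non-slash character
            i += 1
        elif ch == "[" and "]" in pattern[i + 2:]:
            end = pattern.index("]", i + 2)
            body = pattern[i + 1:end]
            negate = len(body) > 1 and body[0] in "!^"
            if negate:
                body = body[1:]
            # Escape everything except '-', which must keep its range meaning.
            cls = "".join(c if c == "-" else re.escape(c) for c in body)
            out.append("[" + ("^" if negate else "") + cls + "]")
            i = end + 1
        else:
            out.append(re.escape(ch))
            i += 1
    return re.compile("".join(out))


def glob_match(pattern: str, rel_path: str) -> bool:
    rel_path = rel_path.replace("\\", "/")          # slash-normalize
    regex = glob_to_regex(pattern)
    if regex.fullmatch(rel_path):
        return True
    # A bare pattern (no '/') is also tried against the file name alone.
    return "/" not in pattern and bool(regex.fullmatch(rel_path.rsplit("/", 1)[-1]))


def in_scope(rel_path: str, includes=(), excludes=()) -> bool:
    if any(glob_match(g, rel_path) for g in excludes):
        return False                                 # exclude wins over include
    return not includes or any(glob_match(g, rel_path) for g in includes)
```

With the flags from the usage example, `in_scope("pkg/app.pyc", ["**/*.pyc"], ["*_test.*"])` is in scope while `tests/app_test.pyc` is skipped.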
manifest.json
Schema disrobe.batch.manifest/v1:
{
"schema": "disrobe.batch.manifest/v1",
"tool_version": "0.10.0",
"root": "samples",
"out_root": "out/samples-batch",
"chain": "auto:8",
"jobs": 4,
"summary": { "processed": 12, "recovered": 9, "detect_only": 2, "errors": 1 },
"entries": [
{
"input": "samples/app.pyc",
"relative": "app.pyc",
"size": 4096,
"detected_format": "Python",
"chain": ["py.decompile"],
"verdict": "Complete",
"recovery_score": 0.67,
"output_dir": "out/samples-batch/app.pyc",
"duration_ms": 31,
"error": null
}
]
}
- recovery_score is the mean per-pass confidence-tier rank across the chain, normalized to [0, 1] (skeleton 0, partial 0.33, semantic 0.67, exact 1.0), or null when no pass ran.
- A file that fails (unreadable, or its chain errors) is recorded with a non-null error and counted under summary.errors; one bad file never aborts the batch.
- Files with no pass in their chain are counted as detect_only.
The human-readable summary line mirrors the manifest: N processed, M recovered, K detect-only, E errors.
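As a consumer-side illustration, the summary block and the summary line can be recomputed from the entries. The sketch assumes the three outcome buckets are mutually exclusive and that an error takes precedence over an empty chain; that is consistent with the example counts (9 + 2 + 1 = 12) but is not stated outright. `summarize` and `summary_line` are hypothetical helpers.

```python
def summarize(entries: list[dict]) -> dict[str, int]:
    """Recompute the manifest summary from its entries."""
    summary = {"processed": 0, "recovered": 0, "detect_only": 0, "errors": 0}
    for entry in entries:
        summary["processed"] += 1
        if entry.get("error") is not None:
            summary["errors"] += 1          # failed file, batch continues
        elif not entry.get("chain"):
            summary["detect_only"] += 1     # no pass ran for this file
        else:
            summary["recovered"] += 1
    return summary


def summary_line(summary: dict[str, int]) -> str:
    return (f"{summary['processed']} processed, {summary['recovered']} recovered, "
            f"{summary['detect_only']} detect-only, {summary['errors']} errors")
```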
Run reports (disrobe report)
disrobe report consolidates a completed run into a single forensic summary: input identity, chain topology, per-stage verdicts and recovery scores, the recovered-artifact inventory, and timings. It is the read-side companion to auto and chain.
Usage
disrobe report ./out/sample-auto # a completed single-file run
disrobe report ./out/samples-batch # a completed batch run
disrobe report ./malware.bin # raw input: runs auto first, then reports
disrobe report ./out/sample-auto --format markdown
disrobe report ./out/sample-auto --format html > report.html # self-contained, offline
Target resolution
The single positional argument can be:
| Target | Behavior |
|---|---|
| A directory with manifest.json | Read it and render a batch report. |
| A directory with chain.json + recovery.json | Read them and render a single-run report. |
| A raw input file | Run auto into ./out/<stem>-auto/ first, then report. |
| A raw directory (not an out dir) | Run a batch into ./out/<dir>-batch/ first, then report. |
A non-existent target is a hard error (DR-CLI-0350).
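The resolution table reads naturally as a small decision function. In this sketch the checks run in table order (an assumption for directories that contain both kinds of file), and the returned labels are hypothetical names for the four behaviors.

```python
from pathlib import Path


def resolve_target(target: Path) -> str:
    """Classify a report target following the table above."""
    if not target.exists():
        raise FileNotFoundError(f"DR-CLI-0350: no such target: {target}")
    if target.is_dir():
        if (target / "manifest.json").is_file():
            return "batch-report"
        if (target / "chain.json").is_file() and (target / "recovery.json").is_file():
            return "single-run-report"
        return "run-batch-then-report"      # raw directory, not an out dir
    return "run-auto-then-report"           # raw input file
```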
Formats
--format text|json|markdown|html (default text). The global --json flag forces JSON regardless of --format.
- text: a clean, aligned human report for the terminal.
- markdown: a shareable report with tables, ready to paste into an issue or PR.
- json: the machine-readable disrobe.report/v1 document.
- html: a single self-contained HTML file (printed to stdout; redirect it to a .html file). CSS is inlined; there is no JavaScript and no external/CDN reference, so it renders offline when double-clicked. Flat neutral-gray dark theme, single green accent, JetBrains-Mono numerics. Sections: input identity, a chain-topology flow, per-stage verdicts with inline SVG recovery bars, a recovery-tier histogram, the recovered-artifact inventory, and, when the input is still readable, an IOC table (defanged) plus a behavior / MITRE ATT&CK summary. Every interpolated value is HTML-escaped (XSS-safe), and the renderer uses no clock or randomness, so the document is byte-stable for identical report data.
Single-run report contents
- Input identity: path, size, BLAKE3, detected format chain, final format.
- Topology + verdict: linear vs tree, and the overall chain verdict.
- Recovery score: the mean per-stage confidence-tier rank normalized to [0, 1], plus a tier label (skeleton / partial / semantic / exact).
- Tier histogram: exact / semantic / partial / skeleton counts.
- Per-stage table: index, pass id, confidence, score, duration.
- Recovered-artifact inventory: the union of artifact names produced by the stages.
- Notes: detect-only and skeleton-tier caveats.
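The recovery score and tier histogram can be reproduced from a list of per-stage tiers. This sketch assumes the ranks are 0 to 3 divided by 3 and rounded to two decimals, which matches the documented values (0, 0.33, 0.67, 1.0) but is inferred rather than confirmed; `TIER_RANK`, `recovery_score`, and `tier_histogram` are hypothetical names.

```python
from collections import Counter
from typing import Optional

# Assumed integer ranks; dividing by 3 yields 0, 0.33, 0.67, 1.0.
TIER_RANK = {"skeleton": 0, "partial": 1, "semantic": 2, "exact": 3}


def recovery_score(tiers: list[str]) -> Optional[float]:
    """Mean confidence-tier rank normalized to [0, 1]; None when no pass ran."""
    if not tiers:
        return None
    mean_rank = sum(TIER_RANK[tier] for tier in tiers) / len(tiers)
    return round(mean_rank / 3, 2)


def tier_histogram(tiers: list[str]) -> dict[str, int]:
    """Counts in the order the report lists them."""
    counts = Counter(tiers)
    return {tier: counts.get(tier, 0) for tier in ("exact", "semantic", "partial", "skeleton")}
```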
Batch report contents
- The aggregate counts (processed, recovered, detect-only, errors) and mean recovery score.
- A per-file table: file, detected format, score, and status (recovered / detect-only / error).
Analysis-depth commands
These static-analysis primitives make disrobe useful as a triage and reverse-engineering tool, not only a decompiler. Each one operates on raw bytes and (where relevant) on the strings/source a chain has already recovered, so they compose with the rest of the pipeline. None of them executes the sample.
| Command | What it does |
|---|---|
disrobe scan | Scan raw bytes for leaked credentials and secrets. |
disrobe identify | Fingerprint compiler, packer, protector, and installer. |
disrobe ioc | Pull indicators of compromise out of an artifact. |
disrobe strings | Cross-format string extraction with single-byte XOR / base64 / ROT brute-forcing. |
disrobe yara generate | Synthesize a candidate YARA rule from an artifact. |
disrobe behavior | Summarize what a binary does, tagged with MITRE ATT&CK technique IDs. |
Credential scan
disrobe scan firmware.bin
disrobe scan firmware.bin --json
disrobe scan firmware.bin --sarif > findings.sarif
disrobe scan scans the target's raw bytes for leaked credentials: cloud provider keys (AWS, GCP, Azure, GitHub, Stripe, and others), VCS tokens, JWTs, PEM and SSH private keys, and other high-confidence secret patterns.
Unlike disrobe ioc, scan focuses exclusively on secrets that represent an immediate credential exposure rather than general network or host indicators. Output is text (one finding per line), JSON (--json), or SARIF 2.1.0 (--sarif).
Format identification
disrobe identify sample.exe
disrobe identify sample.exe --json
disrobe identify fingerprints what built or packed a PE, ELF, or Mach-O binary. It reports the compiler, linker, packer, protector, and installer if detected, with structural evidence and the disrobe pass that handles each detected layer. The output is the same as disrobe native identify but works as a top-level command without routing through the native subcommand tree.
IOC extraction
disrobe ioc suspicious.bin
disrobe ioc suspicious.bin --format json
disrobe ioc suspicious.bin --defang # hxxp://, 1[.]2[.]3[.]4 for safe reporting
disrobe ioc malware.exe --format sarif # GitHub code-scanning ingest
disrobe ioc scans the target's bytes and any UTF-16 / ASCII text inside it for:
- Network: URLs (http/https/ftp/ftps/smb/file), bare domains, IPv4, IPv6, email addresses.
- Host artifacts: Windows file paths, registry keys (HKLM\..., HKEY_CURRENT_USER\...), Unix paths under well-known roots (/etc, /usr, /var, /Users, ...).
- Crypto wallets: Bitcoin (legacy 1/3 and bech32 bc1), Ethereum (0x...40), Monero (4...).
- Crypto constants: AES S-box and inverse S-box, MD5 / SHA-1 / SHA-256 / SHA-512 init vectors, ChaCha20 sigma/tau, and the standard/URL base64 alphabets.
When the input is a native PE/ELF/Mach-O binary, the import table (library!symbol) is folded into the scan so DLL- and symbol-borne indicators surface too.
Encoding recursion
Base64 and hex blobs in the input are decoded and re-scanned one level deep. An indicator found inside a decoded blob is tagged with its encoding (base64 or hex) so you can tell a plaintext URL from one that was hidden behind a layer of encoding. The recursion is intentionally single-level to keep the scan bounded.
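A reduced sketch of the single-level recursion, limited to URLs for brevity. The patterns, their bounds, and the `scan` helper are illustrative assumptions, not the patterns disrobe uses; the point is that decoded blobs are scanned once and never decoded again.

```python
import base64
import binascii
import re

# Every quantifier carries an explicit upper bound.
URL_RE = re.compile(rb"https?://[A-Za-z0-9._~:/?#@!$&'()*+,;=%-]{1,2048}")
B64_RE = re.compile(rb"[A-Za-z0-9+/]{16,4096}={0,2}")
HEX_RE = re.compile(rb"(?:[0-9A-Fa-f]{2}){8,2048}")


def _b64(blob: bytes):
    try:
        return base64.b64decode(blob, validate=True)
    except (binascii.Error, ValueError):
        return None


def _hex(blob: bytes):
    try:
        return bytes.fromhex(blob.decode("ascii"))
    except ValueError:
        return None


def scan(data: bytes) -> list[tuple[int, str, str]]:
    """Return (offset, encoding, url) tuples, deduplicated and offset-sorted."""
    hits = {(m.start(), "plain", m.group().decode("ascii")) for m in URL_RE.finditer(data)}
    for encoding, regex, decode in (("base64", B64_RE, _b64), ("hex", HEX_RE, _hex)):
        for blob in regex.finditer(data):
            decoded = decode(blob.group())
            if decoded:
                # One level only: the decoded bytes are scanned, not re-decoded.
                hits.update((blob.start(), encoding, m.group().decode("ascii"))
                            for m in URL_RE.finditer(decoded))
    return sorted(hits)
```

A hit found inside a blob reports the blob's offset in the original input, since the decoded bytes have no position of their own.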
Output
- Text (default): one indicator per line, kind<TAB>encoding<TAB>@offset<TAB>value, followed by a count.
- JSON (--format json or the global --json): the disrobe.ioc/v0 document, { schema, uri, byte_len, total, indicators[] }, each indicator carrying kind, value, offset, encoding, and an optional context window.
- SARIF (--format sarif or the global --sarif): SARIF 2.1.0 with one result per indicator and a DR-IOC-<KIND> rule id, for GitHub code scanning.
--defang rewrites URLs, domains, IPs, and emails into a non-clickable form (hxxp://, 1[.]2[.]3[.]4, user@host[.]tld) in every format.
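The three defanged forms shown above can be reproduced with a few string operations. This sketch covers only those forms; the kind names (`url`, `email`, `ipv4`, `domain`) are hypothetical, and whether disrobe also brackets the dots inside a URL host is not stated, so the sketch leaves URL hosts alone.

```python
import re


def defang(kind: str, value: str) -> str:
    """Rewrite an indicator into a non-clickable form."""
    if kind == "url":
        # http:// -> hxxp://, https:// -> hxxps://
        return re.sub(r"^http", "hxxp", value, count=1, flags=re.IGNORECASE)
    if kind == "email":
        user, _, host = value.rpartition("@")
        return f"{user}@{host.replace('.', '[.]')}"
    if kind in ("ipv4", "domain"):
        return value.replace(".", "[.]")
    return value    # paths, registry keys, wallets are left untouched
```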
Safety and determinism
Every pattern is bounded (explicit upper repetition counts) so adversarial input cannot trigger catastrophic regex backtracking, and the indicator set is deduplicated and offset-sorted, so the same bytes always produce the same report. The library logic lives in disrobe_core::ioc and is reused by the daemon and by disrobe report.
String extraction
disrobe strings sample.bin
disrobe strings sample.bin --min-len 6
disrobe strings sample.bin --no-decode # plain ASCII / UTF-16 only
disrobe strings sample.bin --json
An in-house FLOSS-style extractor. It pulls printable ASCII and UTF-16LE runs at or above --min-len (default 4), then runs a set of deobfuscation passes and tags each result by how it was recovered:
| Tag | Meaning |
|---|---|
plain / plain:wide | Printable ASCII run / UTF-16LE run. |
xor:0xKK | Recovered by brute-forcing single-byte XOR key KK over a printable run; kept only when the decoded text clears a printable-ratio bar and hits at least two dictionary words. |
base64 | A base64 token whose decoded bytes are printable text. |
rot:N | A run that, rotated by N (ROT13 and other ROT-n), becomes dictionary-rich text. |
stack-string | A run reconstructed from interleaved-NUL / fragmented bytes characteristic of compiler-built stack strings. |
The XOR, ROT, and stack-string heuristics are deliberately conservative: they require dictionary hits, trading recall for precision so the output stays signal, not noise. Results are deduplicated by (value, tag) and offset-sorted.
Output is text (tag<TAB>@offset<TAB>value) or the disrobe.strings/v0 JSON document via --json. The library logic lives in disrobe_core::strings.
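To make the xor:0xKK rule concrete, here is a minimal brute-force sketch. The word list, the 0.95 printable-ratio bar, and the `xor_candidates` name are stand-ins chosen for illustration; disrobe's actual dictionary and thresholds are not documented here.

```python
import re
import string

# Printable ASCII minus vertical tab and form feed.
PRINTABLE = set(string.printable.encode("ascii")) - set(b"\x0b\x0c")
# Tiny stand-in dictionary; a real extractor would use a far larger one.
WORDS = {"this", "program", "cannot", "run", "mode", "password", "http"}


def xor_candidates(run: bytes, min_ratio: float = 0.95, min_words: int = 2):
    """Try every non-zero single-byte key and keep only convincing decodes."""
    results = []
    if not run:
        return results
    for key in range(1, 256):
        decoded = bytes(b ^ key for b in run)
        ratio = sum(b in PRINTABLE for b in decoded) / len(decoded)
        if ratio < min_ratio:
            continue                                   # mostly binary noise
        text = decoded.decode("ascii", errors="replace")
        hits = sum(word in WORDS for word in re.findall(r"[a-z]+", text.lower()))
        if hits >= min_words:                          # precision over recall
            results.append((f"xor:0x{key:02X}", text))
    return results
```

Requiring both conditions is what keeps the 255 wrong keys out of the output: most fail the printable bar, and the rest rarely produce two real words.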
YARA rule generation
disrobe yara generate sample.bin
disrobe yara generate sample.bin --name Trojan_Foo_2026
disrobe yara generate sample.bin --sha256 <hash> --date 2026-06-10
disrobe yara generate sample.bin --json
Synthesizes a candidate YARA rule from an artifact. It selects high-signal strings (long, multi-character-class, non-dictionary; strings recovered via XOR/base64/ROT get a scoring bonus), detects the file's magic / format header, emits a leading $magic hex pattern, and produces a well-formed rule:
rule <name> : disrobe generated {
meta:
generated_by = "disrobe <version>"
schema = "disrobe.yara.generated/v0"
format = "pe"
sha256 = "..." // only when --sha256 is given
date = "..." // only when --date is given
strings:
$magic = { 4D 5A 90 00 ... }
$s0 = "..." ascii
...
condition:
$magic at 0 and N of ($s*)
}
The condition combines an anchored magic check (when a format was recognized) with an "N of" string threshold (half the selected strings, rounded up).
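The threshold arithmetic is small enough to show directly. `build_condition` is a hypothetical helper, and raising on an empty rule is an assumption made for the sketch; the text above does not say what disrobe does when there is neither a magic pattern nor any string.

```python
def build_condition(num_strings: int, has_magic: bool) -> str:
    """Compose the condition line from the rules described above."""
    parts = []
    if has_magic:
        parts.append("$magic at 0")                 # anchored magic check
    if num_strings > 0:
        threshold = (num_strings + 1) // 2          # half, rounded up
        parts.append(f"{threshold} of ($s*)")
    if not parts:
        raise ValueError("nothing to match on")
    return " and ".join(parts)
```

For example, five selected strings on a recognized format give `$magic at 0 and 3 of ($s*)`.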
Provenance
disrobe has no wall clock available to its analysis core, so the rule is not stamped with the current date automatically. Pass --sha256 and --date to embed those values in the meta block; otherwise they are omitted rather than fabricated.
Self-verification
Every generated rule is parsed back through the in-house YARA parser (the same one behind disrobe yara parse) before it is returned. If the emitter ever produces something the parser cannot read, generation fails loudly with DR-YARAGEN-0001 rather than emitting a broken rule. The library logic lives in disrobe_core::yara_gen.
Behavior summary
disrobe behavior sample.exe
disrobe behavior sample.exe --json
disrobe behavior answers "what does this binary do?" by classifying it across seven categories:
| Category | Covers |
|---|---|
network | Sockets, WinHTTP/WinINet, DNS lookups, downloads. |
filesystem | File create/read/write/delete, directory enumeration. |
process_exec | Process / command execution, remote-thread injection. |
registry_persistence | Registry writes, Run keys, scheduled tasks, services. |
crypto | CryptoAPI / BCrypt / embedded crypto constants. |
anti_analysis | Debugger checks, timing checks, sandbox evasion. |
dynamic_code | LoadLibrary/GetProcAddress, VirtualProtect, dlopen, RWX memory. |
Signals come from three sources, each tagged in the evidence list:
- import: the native import table (PE/ELF/Mach-O), parsed when the input is a binary.
- string: API / symbol names found in the extracted strings (including XOR/base64-recovered ones), so signals survive light obfuscation.
- ioc: network/host/crypto indicators from the IOC extractor.
MITRE ATT&CK mapping
Confident matches are tagged with a MITRE ATT&CK technique id (for example LoadLibrary -> T1129, IsDebuggerPresent -> T1622, a Run key -> T1547.001). The mapping is a small, hand-curated static table: only techniques that follow directly from the signal are emitted, never a probabilistic guess. The aggregate attack_ids list at the end of the report is the union across all categories, ready to paste into a triage ticket. The library logic lives in disrobe_core::behavior and is reusable by disrobe report.
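A static lookup table of this kind might look like the following sketch. Only the three mappings named above are included; the `RunKey` signal name, the table layout, and `summarize_signals` are hypothetical.

```python
# signal -> (category, ATT&CK technique id); only the three documented examples.
ATTACK_TABLE = {
    "LoadLibrary": ("dynamic_code", "T1129"),
    "IsDebuggerPresent": ("anti_analysis", "T1622"),
    "RunKey": ("registry_persistence", "T1547.001"),   # hypothetical signal name
}


def summarize_signals(signals: list[str]):
    """Group techniques by category and build the aggregate attack_ids list."""
    by_category: dict[str, set[str]] = {}
    for signal in signals:
        entry = ATTACK_TABLE.get(signal)
        if entry is None:
            continue            # no direct mapping: emit nothing rather than guess
        category, technique = entry
        by_category.setdefault(category, set()).add(technique)
    attack_ids = sorted(set().union(*by_category.values()))
    return by_category, attack_ids
```

A signal absent from the table contributes nothing, which mirrors the "never a probabilistic guess" rule.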
Scope
This is a static summary: disrobe never executes the sample. A signal means the capability is present in the binary's imports/strings, not that it necessarily fires at runtime. Treat it as a lead, not a verdict.
Diff and guard tooling
disrobe treats recovered artifacts as a forensic baseline you can diff across versions and protect against tampering. Two command families support this: disrobe diff and disrobe guard.
disrobe diff: structural chain diff
disrobe diff left/chain.json right/chain.json
Structurally diffs two chain.json documents, the topology descriptors written by disrobe auto / disrobe chain. It compares the passes that ran, each stage's BLAKE3 output hash, byte sizes, and per-stage verdicts. This is how you answer "did upgrading the PyArmor pass change what we recover from this sample?" without eyeballing two output trees.
There is also a parallel disrobe envelope diff for two .dr envelopes directly, which compares version, rung, flags, root hash, producer, capability set, and provenance.
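The idea behind a structural chain diff can be sketched over plain dictionaries. The chain.json schema is not reproduced in this section, so every field name here (`stages`, `pass`, `blake3`, `size`, `verdict`) is a hypothetical stand-in for whatever the real document uses.

```python
# Hypothetical per-stage fields; the real chain.json layout may differ.
FIELDS = ("pass", "blake3", "size", "verdict")


def diff_chains(left: dict, right: dict) -> list[tuple]:
    """Return (stage_index, field, left_value, right_value) for each difference."""
    changes = []
    l_stages = left.get("stages", [])
    r_stages = right.get("stages", [])
    for index in range(max(len(l_stages), len(r_stages))):
        a = l_stages[index] if index < len(l_stages) else None
        b = r_stages[index] if index < len(r_stages) else None
        if a is None or b is None:
            # A stage exists on one side only.
            changes.append((index, "stage", (a or {}).get("pass"), (b or {}).get("pass")))
            continue
        for field in FIELDS:
            if a.get(field) != b.get(field):
                changes.append((index, field, a.get(field), b.get(field)))
    return changes
```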
disrobe guard: ground-truth protection
In a recovery workspace, the byte-exact stage outputs (out/**/stages, out/**/final) are ground truth; an analyst or an agent should never edit them in place, because that would silently corrupt the provenance chain. disrobe guard enforces this.
guard verify: hash verification
disrobe guard verify subject/chain.json --reference reference/chain.json
Verifies that a subject chain.json's per-stage output hashes match a committed reference. Use it in CI to assert that a recovery is reproducible: re-run the chain, then verify the new chain.json against the checked-in reference.
guard check: edit denial
disrobe guard check out/final/module.py
disrobe guard check some/path --root extra/protected/subtree --root other/protected
Decides whether a path about to be written or edited is inside a protected ground-truth subtree. It denies writes to out/**/stages, out/**/final, and any .disrobe-stage-lock-marked path, and allows writes elsewhere. --root adds extra protected subtrees; it is repeatable and also accepts comma-separated values.
This is the command wired into the agent settings hook that disrobe init --ide claude generates: a PreToolUse hook calls disrobe guard check and denies edits to the 01-*/ and 02-*/ stage directories, so a coding agent working in a recovery workspace cannot accidentally rewrite the ground truth it is supposed to be analyzing.
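The path decision itself can be sketched without touching the filesystem. This version handles the out/**/stages and out/**/final rules and extra --root subtrees, including comma-separated values; it deliberately omits the .disrobe-stage-lock marker, which would need a directory lookup. `is_protected` and `expand_roots` are hypothetical names.

```python
from pathlib import PurePosixPath

PROTECTED_DIRS = {"stages", "final"}


def expand_roots(root_args) -> list[str]:
    """--root is repeatable and also accepts comma-separated values."""
    return [root for arg in root_args for root in arg.split(",") if root]


def is_protected(path: str, root_args=()) -> bool:
    parts = PurePosixPath(path.replace("\\", "/")).parts
    if "out" in parts:
        below_out = parts[parts.index("out") + 1:]
        if PROTECTED_DIRS & set(below_out):
            return True                      # out/**/stages or out/**/final
    for root in expand_roots(root_args):
        root_parts = PurePosixPath(root).parts
        if root_parts and parts[:len(root_parts)] == root_parts:
            return True                      # inside an extra protected subtree
    return False
```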
The daemon: HTTP, gRPC, LSP, MCP
disrobe serve runs disrobe as a long-lived service over four protocols. The core capability is the same (analyze bytes, return recovered artifacts), exposed through whichever transport fits the caller.
disrobe serve # HTTP on 127.0.0.1:7373
disrobe serve --bind 0.0.0.0:7373 # HTTP on all interfaces (emits a warning banner)
disrobe serve --grpc # HTTP on :7373, gRPC on :7374
disrobe serve --stdio # LSP over stdio
disrobe serve --mcp # MCP companion over stdio (rmcp)
Flags
| Flag | Default | Effect |
|---|---|---|
--bind <ADDR> | 127.0.0.1:7373 | HTTP bind address. |
--stdio | off | Serve LSP JSON-RPC over stdin/stdout instead of HTTP. |
--mcp | off | Serve the MCP companion over stdio instead of HTTP/LSP. |
--grpc | off | Expose the gRPC surface alongside HTTP (binds to <bind-ip>:<bind-port+1>). |
--cors-origin <ORIGIN> | (none) | Additional CORS origin to allow. Repeatable. With no origins, all origins are allowed. |
--max-body-size <N> | 52428800 (50 MiB) | Maximum request body size in bytes. |
HTTP
The default surface. Binds to loopback (127.0.0.1:7373) by default; a non-loopback bind emits a warning at startup.
The API is versioned: /v1/* and /v2/* are aliases. The spec is served at /openapi.json.
| Method | Path | Description |
|---|---|---|
GET | /v1/health | Server liveness ({ status, version }). |
GET | /v1/version | Tool name, version, and API level ({ name, version, api }). |
GET | /v1/passes | List registered passes ({ passes: [{ name, description }] }). |
POST | /v1/analyze | Classify and route bytes. Body: { bytes_b64, hint? }. Returns: { routed_action, bytes_read, blake3_hash, reason, candidates }. |
POST | /v1/explain/{code} | Look up a DR-* error code. Returns: { code, known, title?, description?, crate_path? }. |
POST | /v1/envelope/verify | Verify a .dr envelope. Body: { bytes_b64 }. Returns: { verified, version, rung, hot_bytes, cold_bytes, root_hash_blake3 }. |
POST | /v1/envelope/create | Wrap raw bytes into a Raw-rung .dr envelope. Body: { bytes_b64, source_label?, produced_by?, detected_format? }. Returns: { envelope_b64, envelope_bytes, bytes_in, root_hash_blake3, source_hash_blake3 }. |
WS | /v1/stream | WebSocket NDJSON stream (upgrade with Sec-WebSocket-Protocol: disrobe-stream.v1). Payloads are sent inline as bytes_b64 only; the server never reads from disk. |
GET | /openapi.json | OpenAPI 3 spec for the HTTP surface. |
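A client for POST /v1/analyze needs only the standard library. This sketch builds the request from the documented body shape ({ bytes_b64, hint? }); the JSON content type, the string type of hint, and the helper names are assumptions, and `analyze` requires a daemon listening on the default address.

```python
import base64
import json
import urllib.request


def build_analyze_request(data: bytes, hint=None, base_url="http://127.0.0.1:7373"):
    """Build (but do not send) a POST /v1/analyze request."""
    body = {"bytes_b64": base64.b64encode(data).decode("ascii")}
    if hint is not None:
        body["hint"] = hint                 # optional field, assumed to be a string
    return urllib.request.Request(
        f"{base_url}/v1/analyze",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},   # assumed content type
        method="POST",
    )


def analyze(data: bytes, hint=None) -> dict:
    """Send the request; needs a running `disrobe serve` on loopback."""
    with urllib.request.urlopen(build_analyze_request(data, hint), timeout=10) as resp:
        return json.load(resp)
```

The bytes always travel inline; there is no field through which a client could name a file for the server to open.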
gRPC
Exposed alongside HTTP with --grpc. The gRPC server binds to the same IP as --bind but at <port> + 1 (default 127.0.0.1:7374). Includes health and reflection services.
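The port rule and the loopback check are simple enough to sketch. The helpers are hypothetical and assume --bind carries a numeric IP address (a hostname such as localhost would make `ipaddress` raise).

```python
import ipaddress


def split_bind(bind: str) -> tuple[str, int]:
    """Split 'host:port', accepting bracketed IPv6 such as '[::1]:7373'."""
    host, _, port = bind.rpartition(":")
    return host.strip("[]"), int(port)


def grpc_address(bind: str) -> str:
    """Same IP as the HTTP bind, port + 1."""
    host, port = split_bind(bind)
    shown = f"[{host}]" if ":" in host else host
    return f"{shown}:{port + 1}"


def is_loopback(bind: str) -> bool:
    """True when the bind address would not trigger the startup warning."""
    host, _ = split_bind(bind)
    return ipaddress.ip_address(host).is_loopback
```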
LSP over stdio
--stdio speaks JSON-RPC over stdin/stdout using the LSP framing and initialize handshake. It does not implement the standard textDocument surface. It advertises two custom methods under capabilities.experimental.disrobe:
| Method | Description |
|---|---|
disrobe/analyze | Params: { bytes_b64, label? }. Returns the same classification payload as POST /v1/analyze. |
disrobe/explain | Params: { code }. Returns the error-code lookup payload. |
Unknown fields in params are rejected (hard error). A path field is specifically rejected; all bytes are transmitted inline.
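LSP framing prefixes each JSON-RPC message with a Content-Length header. The sketch below frames a disrobe/analyze request using the documented params; the helper names are hypothetical, and a real session would send the initialize handshake first.

```python
import base64
import json


def frame(message: dict) -> bytes:
    """Wrap a JSON-RPC message in LSP base-protocol framing."""
    body = json.dumps(message, separators=(",", ":")).encode("utf-8")
    return b"Content-Length: %d\r\n\r\n" % len(body) + body


def analyze_request(request_id: int, data: bytes, label=None) -> bytes:
    """Frame a disrobe/analyze call; bytes travel inline, never as a path."""
    params = {"bytes_b64": base64.b64encode(data).decode("ascii")}
    if label is not None:
        params["label"] = label
    return frame({"jsonrpc": "2.0", "id": request_id,
                  "method": "disrobe/analyze", "params": params})
```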
MCP companion
--mcp runs the rmcp-based Model Context Protocol companion, exposing disrobe's capabilities as MCP tools (auto, decompile, ioc, behavior, strings, verify, rename, annot, and provenance_lookup) so an MCP-aware agent (Claude Code, Cursor, and others) can call disrobe directly as a tool server. The standalone disrobe-mcp binary is equivalent. See the dedicated MCP server integration page for the full tool catalog and client registration.
Security posture
All four transports share one hard rule: the server never reads a file from disk based on client input. Requests carry bytes_b64 only, never a filesystem path. Unknown JSON fields are rejected via deny_unknown_fields. Any way to make the server read a file via a client-controlled string is a high-severity vulnerability under the security policy. Run the daemon on loopback unless you have a specific reason not to.
Python bindings
disrobe ships a typed Python library that mirrors the full CLI surface. The
importable disrobe module is built from crates/disrobe-python with pyo3
(abi3, Python 3.9+) and wraps the same Rust library the CLI uses. Bytes go in;
a concrete typed report object comes out. Output is deterministic: the same
input produces the same report bytes.
The library does not read or write the filesystem; the caller owns all I/O. Wheels are not published to PyPI; build from source.
Install
git clone https://github.com/1-3-7/disrobe
cd disrobe/bindings/python
pip install maturin
maturin develop --release
For a redistributable wheel:
maturin build --release
pip install target/wheels/disrobe-*.whl
The pyproject.toml pins maturin>=1.5,<2.0, sets module-name = "disrobe.disrobe", and points at crates/disrobe-python/Cargo.toml. On
Windows the crate's build.rs searches PYO3_PYTHON, an active
VIRTUAL_ENV, and standard install locations; set PYO3_PYTHON=<path-to-python.exe>
if none is found. A py.typed marker is shipped so pyright and mypy resolve
every attribute from the .pyi stub.
import disrobe
version: str = disrobe.__version__
Report model
Every analysis function returns a concrete subclass of _Report. The base
carries the full serialization surface every report shares.
_Report
| Member | Signature | Notes |
|---|---|---|
raw | @property -> dict[str, Any] | Full underlying record; no detail dropped |
to_json | () -> str | Compact JSON string |
from_json_str | classmethod(text: str) -> Self | Rebuild from a to_json string |
from_obj | classmethod(obj: dict[str, Any]) -> Self | Wrap an already-decoded dict |
Reports compare equal when their underlying records are equal (== / !=).
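The contract in the table can be illustrated with a pure-Python stand-in. `ReportSketch` is not the real class (that one is implemented in Rust via pyo3); it only mirrors the documented members so the round-trip and equality behavior are easy to see, and its use of sorted keys for compact JSON is an assumption.

```python
import json
from typing import Any


class ReportSketch:
    """Stand-in mirroring the documented _Report members."""

    def __init__(self, record: dict[str, Any]) -> None:
        self._record = record

    @property
    def raw(self) -> dict[str, Any]:
        return self._record                 # full underlying record

    def to_json(self) -> str:
        # Compact and key-sorted so equal records serialize identically.
        return json.dumps(self._record, separators=(",", ":"), sort_keys=True)

    @classmethod
    def from_json_str(cls, text: str) -> "ReportSketch":
        return cls(json.loads(text))

    @classmethod
    def from_obj(cls, obj: dict[str, Any]) -> "ReportSketch":
        return cls(obj)

    def __eq__(self, other: object) -> bool:
        return isinstance(other, ReportSketch) and self._record == other._record
```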
_LlmReport
Subclasses _Report. Adds one property:
| Member | Signature | Notes |
|---|---|---|
llm | @property -> LlmBundle | None | Populated on LLM-wired passes; None otherwise |
LLM-wired passes: py_decompile, py_disasm, py_deob, pyarmor_detect,
pyarmor_unpack. Functions that build an LLM bundle accept
pack: Pack | None where Pack = Literal["pack-1", "pack-2", "pack-3", "pack-4"].
LlmBundle
A TypedDict(total=False) mirroring the disrobe.metadata.llm.v1 on-disk schema.
Keys present depend on which pack was requested.
from disrobe import LlmBundle
# LlmBundle is a TypedDict(total=False), so any subset of these keys type-checks.
bundle: LlmBundle = {
"schema": "disrobe.metadata.llm.v1",
"schema_version": "1",
"generated_at": "2026-06-16T00:00:00Z",
"tool": {},
"selection": {},
"input": {},
"pipeline": [],
"categories": {},
}
Literal type aliases
| Alias | Values |
|---|---|
Pack | "pack-1", "pack-2", "pack-3", "pack-4" |
RoundtripStatus | "perfect", "semantic", "code-diff", "no-interpreter", "recompile-failed", "skipped" |
PyarmorUnpackStatus | "functional", "bcc-partial", "detect-only", "skeleton" |
ContainerListing | "enumerated", "requires-extraction", "unreadable" |
SymbolKind | "function", "data", "label", "export", "import" |
InstructionFlow | "sequential", "call", "indirect-call", "conditional-branch", "unconditional-branch", "indirect-branch", "return", "interrupt" |
SourceLanguage | "python", "py", "python3" |
JsLanguage | "javascript", "js", "typescript", "ts" |
ByteLanguage | "python-bytecode", "pyc", "jvm-class", "class", "dex", "beam", "hermes", "hermes-bundle", "hbc", "wasm" |
ParseByteLanguage | "go", "swift", "objc", "objective-c", "kotlin", "ruby", "lua", "php" |
DisasmByteLanguage | "ruby", "ruby-bytecode", "yarv", "mruby", "php", "php-bytecode" |
DecompileLanguage | "python-bytecode", "pyc", "jvm-class", "class", "java", "kotlin", "lua", "ruby", "php", "php-bytecode", "javascript", "js", "typescript", "ts" |
Exception hierarchy
| Class | Base | Raised when |
|---|---|---|
DisrobeError | Exception | Any binding call fails |
UnsupportedLanguage | DisrobeError | disasm/parse/compile/decompile is called for a language with no backing implementation; the message includes a hint |
Module-level functions: full surface
| Category | Function | Returns |
|---|---|---|
| Auto chain | auto(input, *, max_depth=8, path_hint=None) | ChainReport |
| Generic dispatch | decompile(language, source) | CanonicalSource |
| | disasm(language, source) | str |
| | parse(language, source) | typed report or dict[str, Any] |
| | compile(language, source, *, version=None) | bytes |
| Custom pass | register_pass(name, callable) | None |
| | register_consumer(name, callable) | None |
| | registered_passes() | list[str] |
| | registered_consumers() | list[str] |
| | unregister(name) | bool |
| | run_pass(name, data) | Any |
| | run_chain(names, data) | Any |
| | emit(name, result, **context) | Any |
| Analysis | strings_extract(data, *, min_len=4, decode=True) | StringsReport |
| | ioc_extract(data) | IocReport |
| | behavior_analyze(data) | BehaviorReport |
| | identify(data) | IdentifyReport |
| | secret_scan(data) | SecretScanReport |
| | capabilities(binary_bytes) | Capabilities |
| | extract(data, out_dir) | ExtractionResult |
| | extract_recursive(data, *, source_label='inline', max_depth=8) | OverlayReport |
| | yara_parse(ruleset_source) | YaraReport |
| | yara_generate(data, *, name=None) | YaraReport |
| Native | native_symbols(data) | SymbolsReport |
| | native_disasm(data) | DisasmPayload |
| | native_callgraph(data) | CallGraph |
| | native_imports_dot(data) | str |
| | native_entropy(data) | EntropyReport |
| | native_sbom(data) | SbomReport |
| | native_fingerprint(data, *, flirt=None) | FingerprintReport |
| | native_signatures(data, *, flirt=None) | SignatureReport |
| | native_sigmaker(data, at) | SigmakerReport |
| | native_diff(a, b) | DiffReport |
| | native_patch(data, *, at, replacement=None, nop_start=None, nop_end=None) | tuple[bytes, PatchReport] |
| | native_format(binary_bytes) | NativeFormat |
| | native_detect(binary_bytes) | DetectionList |
| | native_probe_backends() | BackendList |
| | native_deobfuscate(code, *, bits=64, base=0, entry=0) | NativeDeobfuscation |
| Query IR | query_functions(dr_bytes) | FunctionList |
| | query_calls_to(dr_bytes, target) | QueryReport |
| | query_xrefs_to(dr_bytes, symbol) | QueryReport |
| | query_string_decoders(dr_bytes) | QueryReport |
| | query_complexity_over(dr_bytes, threshold) | QueryReport |
| | query_capability_sites(dr_bytes, capability) | QueryReport |
| | query_call_graph(dr_bytes) | CallGraph |
| Envelope | envelope_create(payload, *, source_label='inline', produced_by=None, detected_format=None) | bytes |
| | envelope_verify(envelope_bytes) | EnvelopeReport |
| LLM renders | agents_md(result) | str |
| | skill_md(result) | str |
| | provenance(result) | Provenance |
| Python decompile | py_decompile(pyc_bytes, *, roundtrip=False, pack=None) | PyDecompileReport |
| | py_disasm(pyc_bytes, *, pack=None) | PyDisasmReport |
| Python deobfuscate | py_deob(source, *, cleanup=True, pack=None) | PyDeobReport |
| | py_deob_detect(source) | PyDeobDetection |
| | py_deob_list_passes() | list[ObfuscatorPass] |
| | py_deob_detect_pass(source, pass_id) | PyDeobDetection |
| PyArmor | pyarmor_detect(source, *, pack=None) | PyarmorDetection |
| | pyarmor_unpack(wrapper_bytes, *, pack=None) | PyarmorUnpack |
| | pyarmor_classify(source, payload) | PyarmorClassification |
| PyInstaller | pyinstaller_extract(image_bytes) | PyInstallerArchive |
| | pyinstaller_entry_bytes(image_bytes, entry_name) | bytes |
| Nuitka | nuitka_detect(image_bytes) | NuitkaDetection |
| | nuitka_extract(image_bytes) | NuitkaExtraction |
| Hermes | hermes_disasm(bundle_bytes) | HermesDisassembly |
| | hermes_lift(bundle_bytes) | HermesLift |
| | hermes_info(bundle_bytes) | HermesInfo |
| Mach-O | macho_dump(macho_bytes) | MachoReport |
| | swift_analyze(macho_bytes) | SwiftReport |
| JVM / Android | jvm_parse_class(class_bytes) | JvmClass |
| | jvm_parse_dex(dex_bytes) | DexFileReport |
| | jvm_decompile_class(class_bytes) | JvmDecompiledClass |
| | jvm_detect(class_bytes) | DetectionList |
| | jvm_backends() | JvmBackends |
| | apk_resources(apk_bytes) | ApkResources |
| .NET | dotnet_parse_pe(pe_bytes) | DotnetPe |
| | dotnet_parse_metadata(pe_bytes) | DotnetMetadata |
| | dotnet_detect(pe_bytes) | DotnetDetection |
| | dotnet_analyze(pe_bytes) | DotnetAnalysis |
| | dotnet_decompile(pe_bytes) | DotnetDecompilation |
| | dotnet_recover_decoders(pe_bytes) | DotnetDecoders |
| | dotnet_backends() | BackendList |
| WebAssembly | wasm_analyze(wasm_bytes) | WasmAnalysis |
| | wasm_detect(wasm_bytes) | WasmDetection |
| JavaScript | js_detect(js_source) | JsDetection |
| | js_unminify(js_source) | JsUnminify |
| | js_unbundle(js_source, *, bundler=None) | JsUnbundle |
| Lua | lua_detect(bytecode) | LuaDetection |
| | lua_decompile(bytecode) | LuaDecompilation |
| | lua_deobfuscate(source, *, authorize=False, strict=False) | LuaDeobfuscation |
| Go | go_analyze(binary_bytes) | GoAnalysis |
| | go_symbols(binary_bytes) | GoSymbols |
| | go_pclntab(binary_bytes) | GoPclntab |
| | go_garble(binary_bytes) | GarbleReport |
| Ruby | ruby_detect(ruby_bytes, *, source_path=None) | RubyDetection |
| | ruby_decompile(ruby_bytes, *, source_path=None) | RubyAnalysis |
| PHP | php_detect(php_bytes) | PhpDetection |
| | php_scan(php_bytes) | PhpScan |
| | php_decode(php_bytes, *, max_depth=None) | PhpDecode |
| Shell | batch_deobfuscate(script, *, args=None) | BatchDeobReport |
| | powershell_detect(script) | PowershellDetection |
| | powershell_deobfuscate(script) | PowershellDeobfuscation |
| Containers | container_detect(container_bytes) | ContainerDetection |
| | container_members(container_bytes) | ContainerMembers |
| Pickle | pickle_disasm(pickle_bytes) | str |
| | pickle_decompile(pickle_bytes) | PickleDecompilation |
pickle_safety(pickle_bytes) | PickleSafety | |
pickle_trace(pickle_bytes) | PickleTrace | |
pickle_polyglot(file_bytes) | PicklePolyglot | |
pickle_ml_detect(file_bytes) | PickleMlReport |
Auto chain
import disrobe
from disrobe import ChainReport
with open("sample.bin", "rb") as fh:
    chain: ChainReport = disrobe.auto(fh.read(), max_depth=8)
spec: str | None = chain.spec
pass_count: int = chain.pass_count
terminated: bool = chain.terminated
full_plan: dict[str, object] = chain.raw
auto runs the chain detector against raw bytes and returns a ChainReport
carrying the full chain.json plan the CLI produces; it does not write stage
outputs to disk. max_depth must be 1-16; out-of-range values raise
DisrobeError. The registered pass tree covers pyarmor, pyinstaller, nuitka,
py-decompile, py-deob, container, js, jvm, dotnet, wasm, mobile, swift-objc,
and the native packer detector.
ChainReport accessors
| Property | Type |
|---|---|
| spec | str \| None |
| pass_count | int |
| terminated | bool |
Generic dispatch
Language-keyed entry points that fan out to the per-language passes.
decompile
def decompile(language: str, source: str | bytes) -> CanonicalSource: ...
Wired families: python/pyc (py-decompile), jvm-class/class/java/kotlin
(JVM lifter), lua (register lifter), ruby (YARV/mruby recovery),
php/php-bytecode (eval-chain peel/op-array skeleton),
javascript/js/typescript/ts (unminify). Binary-only targets (go,
swift, wasm) have no single source body; call their structural binding or
parse instead.
import disrobe
from disrobe import CanonicalSource
with open("module.pyc", "rb") as fh:
    recovered: CanonicalSource = disrobe.decompile("pyc", fh.read())
source: str | None = recovered.source
language: str | None = recovered.language
produced_by: str | None = recovered.produced_by
confidence: float | None = recovered.confidence
CanonicalSource accessors
| Property | Type |
|---|---|
| source | str \| None |
| language | str \| None |
| produced_by | str \| None |
| confidence | float \| None |
disasm
def disasm(language: str, source: str | bytes) -> str: ...
Returns a rendered instruction listing as text. Wired: python/pyc,
jvm-class/class, dex, beam, hermes, wasm,
ruby/yarv/mruby, and php/php-bytecode. For Lua use
decompile('lua', ...) or parse('lua', ...) instead.
parse
def parse(language: str, source: str | bytes) -> (
    dict[str, Any]
    | GoAnalysis
    | SwiftReport
    | JvmClass
    | RubyAnalysis
    | LuaDecompilation
    | PhpDecode
    | JsUnminify
): ...
Returns a typed report for structural-recovery languages: go -> GoAnalysis,
swift/objc/objective-c -> SwiftReport, kotlin -> JvmClass,
ruby -> RubyAnalysis, lua -> LuaDecompilation, php -> PhpDecode,
javascript/js/typescript/ts -> JsUnminify. Container and bytecode
formats (pyc, jvm-class, dex, wasm, hermes, beam) return a nested
dict[str, Any] because their full parse records have no single typed shape.
import disrobe
from typing import Any
with open("Hello.class", "rb") as fh:
    parsed: dict[str, Any] = disrobe.parse("jvm-class", fh.read())
method_count: int = len(parsed["methods"])
compile
def compile(language: str, source: str, *, version: str | None = None) -> bytes: ...
Implemented for Python only; returns raw marshal.dumps bytes (no .pyc
header) via the host interpreter. lua, ruby, and php raise
UnsupportedLanguage with a hint pointing at the CLI subcommand or toolchain.
import disrobe
blob: bytes = disrobe.compile("python", "x: int = 1 + 2\n")
listing: str = disrobe.disasm("python", "x: int = 1 + 2\n")
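For orientation, the headerless marshal blob described above can be approximated with the standard library alone. This is a sketch of the concept, not disrobe's implementation: the helper names to_marshal_blob and with_pyc_header are hypothetical, and the 16-byte header shown is the timestamp-style layout used by CPython 3.7 and later.

```python
import importlib.util
import marshal
import struct


def to_marshal_blob(source: str, filename: str = "<string>") -> bytes:
    """Compile source on the running interpreter and marshal the code object."""
    code = compile(source, filename, "exec")
    return marshal.dumps(code)


def with_pyc_header(blob: bytes, *, mtime: int = 0, source_size: int = 0) -> bytes:
    """Prefix a marshal blob with a timestamp-style .pyc header (CPython 3.7+)."""
    header = importlib.util.MAGIC_NUMBER           # 4 bytes, interpreter-specific
    header += struct.pack("<I", 0)                 # flags: 0 = timestamp-validated
    header += struct.pack("<II", mtime, source_size)
    return header + blob


blob = to_marshal_blob("x: int = 1 + 2\n")
pyc_image = with_pyc_header(blob)

# The blob round-trips through marshal and runs on the same interpreter.
namespace: dict[str, object] = {}
exec(marshal.loads(blob), namespace)
```

Because marshal output is specific to the interpreter version, a blob produced this way is only loadable by a matching CPython, which is presumably why the binding compiles via the host interpreter.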
Custom pass plugin protocol
Register and compose named passes and output consumers in the host process.
from typing import Any
import disrobe
from disrobe import Pass, OutputConsumer
def my_pass(data: Any) -> Any:
    return data[::-1]
def my_consumer(result: Any, **context: Any) -> Any:
    print(result, context)
disrobe.register_pass("reverse", my_pass)
disrobe.register_consumer("print", my_consumer)
names: list[str] = disrobe.registered_passes()
consumer_names: list[str] = disrobe.registered_consumers()
output: Any = disrobe.run_pass("reverse", b"hello")
chained: Any = disrobe.run_chain(["reverse", "reverse"], b"hello")
disrobe.emit("print", chained, source="example")
removed: bool = disrobe.unregister("reverse")
Pass and OutputConsumer protocols
Both are @runtime_checkable protocols.
| Protocol | Signature |
|---|---|
| Pass | __call__(self, data: Any) -> Any |
| OutputConsumer | __call__(self, result: Any, **context: Any) -> Any |
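A minimal sketch of what the two signatures above imply for user code. The Pass and OutputConsumer classes here are local stand-ins written from the table, not imports from disrobe, and run_chain_sketch assumes left-to-right composition, which the source does not state explicitly.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class Pass(Protocol):
    def __call__(self, data: Any) -> Any: ...


@runtime_checkable
class OutputConsumer(Protocol):
    def __call__(self, result: Any, **context: Any) -> Any: ...


def reverse(data: Any) -> Any:
    return data[::-1]


def run_chain_sketch(passes: list[Pass], data: Any) -> Any:
    """Feed each pass the previous pass's output (assumed left-to-right order)."""
    for step in passes:
        data = step(data)
    return data


# runtime_checkable only checks that __call__ exists, not its signature.
is_pass = isinstance(reverse, Pass)
not_a_pass = isinstance(b"bytes", Pass)
round_trip = run_chain_sketch([reverse, reverse], b"hello")
```

Since a runtime isinstance check cannot validate parameters, any callable satisfies both protocols; signature mismatches only surface when the pass is actually invoked.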
Analysis
strings_extract
def strings_extract(data: bytes, *, min_len: int = 4, decode: bool = True) -> StringsReport: ...
Extracts ASCII and UTF-16 strings from a binary blob.
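As an illustration of the idea, printable-run extraction can be sketched with two regular expressions. This is not the binding's implementation: extract_strings_sketch is a hypothetical helper, and it assumes UTF-16 means little-endian text restricted to the printable ASCII range.

```python
import re


def extract_strings_sketch(data: bytes, min_len: int = 4) -> list[str]:
    """Return printable ASCII and UTF-16LE runs of at least min_len characters."""
    printable = rb"[\x20-\x7e]"
    ascii_runs = re.findall(printable + rb"{%d,}" % min_len, data)
    utf16_runs = re.findall(rb"(?:" + printable + rb"\x00){%d,}" % min_len, data)
    found = [run.decode("ascii") for run in ascii_runs]
    found += [run.decode("utf-16-le") for run in utf16_runs]
    return found


sample = b"\x00\x01hello world\xff\xfe" + "wide text".encode("utf-16-le") + b"\x02ab\x03"
strings = extract_strings_sketch(sample, min_len=4)
# strings == ["hello world", "wide text"]; the 2-byte run "ab" is below min_len
```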
ioc_extract
def ioc_extract(data: bytes) -> IocReport: ...
Harvests indicators of compromise from bytes and recovered strings.
behavior_analyze
def behavior_analyze(data: bytes) -> BehaviorReport: ...
Behavioral summary by category with MITRE ATT&CK IDs.
identify
def identify(data: bytes) -> IdentifyReport: ...
Compiler/linker/packer/protector/installer fingerprint.
secret_scan
def secret_scan(data: bytes) -> SecretScanReport: ...
Leaked-credential scan over raw bytes.
capabilities
def capabilities(binary_bytes: bytes) -> Capabilities: ...
Capability rule-set matches for a native binary.
extract
def extract(data: bytes, out_dir: str) -> ExtractionResult: ...
Carves container/firmware members to out_dir.
extract_recursive
def extract_recursive(
    data: bytes, *, source_label: str = "inline", max_depth: int = 8
) -> OverlayReport: ...
Recursive multi-magic carve; classifies every chunk by entropy and nesting.
yara_parse / yara_generate
def yara_parse(ruleset_source: str) -> YaraReport: ...
def yara_generate(data: bytes, *, name: str | None = None) -> YaraReport: ...
Parse a YARA ruleset AST or generate a candidate rule from a binary blob.
import disrobe
from disrobe import (
    StringsReport, IocReport, BehaviorReport, IdentifyReport,
    SecretScanReport, Capabilities, OverlayReport, YaraReport,
)
with open("suspect.bin", "rb") as fh:
    data: bytes = fh.read()
strings: StringsReport = disrobe.strings_extract(data, min_len=6)
string_count: int = strings.string_count
iocs: IocReport = disrobe.ioc_extract(data)
indicator_count: int = iocs.indicator_count
behavior: BehaviorReport = disrobe.behavior_analyze(data)
category_count: int = behavior.category_count
ident: IdentifyReport = disrobe.identify(data)
fmt: str | None = ident.format
finding_count: int = ident.finding_count
caps: Capabilities = disrobe.capabilities(data)
match_count: int = caps.match_count
overlay: OverlayReport = disrobe.extract_recursive(data, max_depth=4)
chunks_total: int | None = overlay.chunks_total
bytes_carved: int | None = overlay.bytes_carved
rule: YaraReport = disrobe.yara_generate(data, name="suspect")
rule_count: int = rule.rule_count
Analysis report classes
| Class | Notable typed accessors |
|---|---|
| StringsReport | string_count: int |
| IocReport | indicator_count: int |
| BehaviorReport | category_count: int |
| IdentifyReport | format: str \| None, finding_count: int |
| SecretScanReport | finding_count: int |
| Capabilities | match_count: int, format: str \| None |
| ExtractionResult | kind: str \| None, entry_count: int, integrity_violation_count: int |
| OverlayReport | max_depth: int \| None, nodes_visited: int \| None, chunks_total: int \| None, bytes_carved: int \| None |
| YaraReport | rule_count: int |
Native binary
Functions
| Function | Returns | Notes |
|---|---|---|
| native_symbols(data) | SymbolsReport | Symbols, sections, imports, debug info |
| native_disasm(data) | DisasmPayload | Full disassembly: functions, stream, symbols |
| native_callgraph(data) | CallGraph | Whole-program call graph |
| native_imports_dot(data) | str | GraphViz DOT of the import graph |
| native_entropy(data) | EntropyReport | Sliding-window Shannon entropy map |
| native_sbom(data) | SbomReport | CycloneDX 1.5 SBOM from cargo-auditable section |
| native_fingerprint(data, *, flirt=None) | FingerprintReport | Crypto-constant + FLIRT + string-xref sidecar |
| native_signatures(data, *, flirt=None) | SignatureReport | Crypto-primitive signatures and FLIRT matches |
| native_sigmaker(data, at) | SigmakerReport | Wildcarded byte signature for a VA |
| native_diff(a, b) | DiffReport | Function-level diff of two binaries |
| native_patch(data, *, at, ...) | tuple[bytes, PatchReport] | Rewrite bytes and revalidate |
| native_format(binary_bytes) | NativeFormat | Format: kind, bitness, subsystem |
| native_detect(binary_bytes) | DetectionList | Packer/protector detection hits |
| native_probe_backends() | BackendList | Probe for installed external tools |
| native_deobfuscate(code, *, bits=64, base=0, entry=0) | NativeDeobfuscation | x86 OLLVM/Tigress deflattening |
import disrobe
from disrobe import (
    SymbolsReport, DisasmPayload, CallGraph, EntropyReport,
    SbomReport, FingerprintReport, SignatureReport, SigmakerReport,
    DiffReport, PatchReport, NativeFormat, DetectionList,
    BackendList, NativeDeobfuscation,
)
with open("binary.elf", "rb") as fh:
    data: bytes = fh.read()
syms: SymbolsReport = disrobe.native_symbols(data)
symbol_count: int = syms.symbol_count
section_count: int = syms.section_count
import_count: int = syms.import_count
disasm_payload: DisasmPayload = disrobe.native_disasm(data)
instruction_count: int = disasm_payload.instruction_count
source_hash: str | None = disasm_payload.source_hash
entropy: EntropyReport = disrobe.native_entropy(data)
mean: float | None = entropy.mean
sig: SigmakerReport = disrobe.native_sigmaker(data, at=0x1000)
ida_pattern: str | None = sig.ida_pattern
patched_bytes: bytes
patch_report: PatchReport
patched_bytes, patch_report = disrobe.native_patch(data, at=0x1234, nop_start=0x1234, nop_end=0x1240)
revalidated: bool = patch_report.revalidated
deob: NativeDeobfuscation = disrobe.native_deobfuscate(data, bits=64, base=0x400000)
recovered_blocks: int | None = deob.recovered_blocks
fully_recovered: bool = deob.fully_recovered
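The sliding-window Shannon entropy map that native_entropy is described as producing can be pictured with the sketch below. The window and step sizes are assumptions chosen for illustration, and entropy_map is a hypothetical helper rather than the binding's own algorithm.

```python
import math
from collections import Counter


def shannon_entropy(window: bytes) -> float:
    """Shannon entropy in bits per byte, from 0.0 (constant) to 8.0 (uniform)."""
    if not window:
        return 0.0
    total = len(window)
    return -sum((n / total) * math.log2(n / total) for n in Counter(window).values())


def entropy_map(data: bytes, window: int = 256, step: int = 128) -> list[tuple[int, float]]:
    """Return (offset, entropy) pairs for every full window in data."""
    return [
        (off, shannon_entropy(data[off:off + window]))
        for off in range(0, max(len(data) - window + 1, 0), step)
    ]


low = entropy_map(b"\x00" * 512)             # constant padding
high = entropy_map(bytes(range(256)) * 2)    # every byte value equally likely
```

Windows near 8 bits per byte typically indicate compressed or encrypted regions, which is why such a map is useful when hunting for packed payloads.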
Native report classes
| Class | Notable typed accessors |
|---|---|
| SymbolsReport | symbol_count: int, section_count: int, import_count: int |
| DisasmPayload | instruction_count: int, symbol_count: int, source_hash: str \| None |
| CallGraph | node_count: int, edge_count: int |
| EntropyReport | window_count: int, mean: float \| None, min: float \| None, max: float \| None |
| SbomReport | component_count: int, bom_format: str \| None, spec_version: str \| None |
| FingerprintReport | crypto_hit_count: int |
| SignatureReport | signature_count: int |
| SigmakerReport | ida_pattern: str \| None, byte_count: int |
| DiffReport | added: int, removed: int, changed: int |
| PatchReport | at: int \| None, bytes_written: int \| None, revalidated: bool |
| NativeFormat | kind: str \| None, bits: int \| None, subsystem: str \| None |
| DetectionList | count: int |
| BackendList | count: int, available_count: int |
| NativeDeobfuscation | bits: int \| None, recovered_blocks: int \| None, original_blocks: int \| None, fully_recovered: bool |
Query IR
The query functions operate on a Disasm-rung .dr envelope (raw bytes). See
Editable IR objects for how to produce and consume .dr
envelopes programmatically.
| Function | Returns | Notes |
|---|---|---|
| query_functions(dr_bytes) | FunctionList | All recovered functions |
| query_calls_to(dr_bytes, target) | QueryReport | Call sites targeting a symbol name |
| query_xrefs_to(dr_bytes, symbol) | QueryReport | Data/code cross-references to a symbol |
| query_string_decoders(dr_bytes) | QueryReport | Functions with string-decode patterns |
| query_complexity_over(dr_bytes, threshold) | QueryReport | Functions with cyclomatic complexity above threshold |
| query_capability_sites(dr_bytes, capability) | QueryReport | Sites exercising a named capability |
| query_call_graph(dr_bytes) | CallGraph | Whole-program call graph from IR |
import disrobe
from disrobe import FunctionList, QueryReport, CallGraph
with open("module.dr", "rb") as fh:
    dr: bytes = fh.read()
functions: FunctionList = disrobe.query_functions(dr)
fn_count: int = functions.count
callers: QueryReport = disrobe.query_calls_to(dr, "malloc")
match_count: int = callers.match_count
complex_fns: QueryReport = disrobe.query_complexity_over(dr, threshold=20)
graph: CallGraph = disrobe.query_call_graph(dr)
edge_count: int = graph.edge_count
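For readers unfamiliar with the metric behind query_complexity_over, the sketch below computes McCabe cyclomatic complexity (M = E - N + 2 for one connected control-flow graph) and filters by a threshold. The CFG dictionaries and the strict greater-than comparison are assumptions for illustration; the source only says "above threshold" and does not document the IR's graph representation.

```python
def cyclomatic_complexity(cfg: dict[str, list[str]]) -> int:
    """McCabe complexity M = E - N + 2 for one connected control-flow graph."""
    nodes = set(cfg) | {dst for targets in cfg.values() for dst in targets}
    edges = sum(len(targets) for targets in cfg.values())
    return edges - len(nodes) + 2


def over_threshold(functions: dict[str, dict[str, list[str]]], threshold: int) -> list[str]:
    """Names of functions whose complexity is strictly above threshold."""
    return sorted(
        name for name, cfg in functions.items()
        if cyclomatic_complexity(cfg) > threshold
    )


# Hypothetical CFGs keyed by basic-block label.
straight_line = {"entry": ["exit"], "exit": []}
if_else = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}

flagged = over_threshold({"a": straight_line, "b": if_else}, threshold=1)
```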
Query report classes
| Class | Notable typed accessors |
|---|---|
| FunctionList | kind: str \| None, count: int |
| QueryReport | kind: str \| None, match_count: int |
| CallGraph | node_count: int, edge_count: int |
Envelope
envelope_create wraps a payload as a Raw-rung .dr envelope and returns the
encoded bytes. envelope_verify decodes and verifies, returning an
EnvelopeReport.
import disrobe
from disrobe import EnvelopeReport
envelope: bytes = disrobe.envelope_create(
    b"payload",
    source_label="inline",
    produced_by="my-tool",
    detected_format="elf64",
)
report: EnvelopeReport = disrobe.envelope_verify(envelope)
ok: bool = report.verified
root_hash: str | None = report.root_hash
rung: str | None = report.rung
hot_bytes: int | None = report.hot_bytes
cold_bytes: int | None = report.cold_bytes
version: int | None = report.version
EnvelopeReport accessors
| Property | Type |
|---|---|
| verified | bool |
| rung | str \| None |
| version | int \| None |
| hot_bytes | int \| None |
| cold_bytes | int \| None |
| root_hash | str \| None |
The sidecar DrEnvelope TypedDict (bindings/python/dr-envelope.pyi) mirrors
the raw on-disk header shape: magic, version, rung, flags, hot_len,
cold_len, root_hash.
LLM renders
agents_md and skill_md render the AGENTS.md and SKILL.md reconstruction
briefs for a report from an LLM-enabled pass (or a bare bundle dict), returning
a str. provenance extracts tool/selection/input metadata as a typed
Provenance. Passing a report whose llm slot is None raises DisrobeError.
import disrobe
from disrobe import PyDecompileReport, Provenance
with open("module.pyc", "rb") as fh:
    report: PyDecompileReport = disrobe.py_decompile(fh.read(), pack="pack-2")
agents_brief: str = disrobe.agents_md(report)
skill_brief: str = disrobe.skill_md(report)
prov: Provenance = disrobe.provenance(report)
generated_at: str | None = prov.generated_at
schema: str | None = prov.schema
schema_version: str | None = prov.schema_version
Provenance accessors
| Property | Type |
|---|---|
| schema | str \| None |
| schema_version | str \| None |
| generated_at | str \| None |
Python passes
See also Python decompiler for the full decompiler design.
py_decompile
Decompiles a .pyc (with header) to source. 92.76% of CPython 3.14 stdlib
code objects recompile to bytecode-equivalent output (5831 of 6286, CI floor
90%). Legacy CPython 1.0-3.7: 79.6% proven-correct (CI floor 152 of 191;
166 of 191 with the full interpreter zoo present).
import disrobe
from disrobe import PyDecompileReport, RoundtripStatus
with open("module.pyc", "rb") as fh:
    report: PyDecompileReport = disrobe.py_decompile(fh.read(), roundtrip=True)
source: str | None = report.source
marshal_version: str | None = report.marshal_version
decompile_version: str | None = report.decompile_version
recovered_directly: bool = report.recovered_directly
fallback_reason: str | None = report.fallback_reason
status: RoundtripStatus | None = report.roundtrip_status
roundtrip_detail: str | None = report.roundtrip_detail
interpreter_path: str | None = report.interpreter_path
interpreter_version: str | None = report.interpreter_version
if status == "perfect":
    print("recompiled bytecode matched")
Round-tripping (when roundtrip=True) shells out to a matching host
interpreter; it is the only binding that may launch an external Python process.
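The opcode-for-opcode comparison behind the round-trip check can be sketched with the standard dis module. This illustrates the general technique only: codes_equivalent is a hypothetical helper, it runs on whatever interpreter executes it, and disrobe's actual equivalence rules may be stricter or looser.

```python
import dis
from types import CodeType


def opcode_stream(code: CodeType) -> list[tuple[str, object]]:
    """List (opname, argument) pairs, masking nested code-object arguments."""
    return [
        (ins.opname, None if isinstance(ins.argval, CodeType) else ins.argval)
        for ins in dis.get_instructions(code)
    ]


def codes_equivalent(a: CodeType, b: CodeType) -> bool:
    """Compare two code objects opcode-for-opcode, recursing into nested ones."""
    if opcode_stream(a) != opcode_stream(b):
        return False
    nested_a = [c for c in a.co_consts if isinstance(c, CodeType)]
    nested_b = [c for c in b.co_consts if isinstance(c, CodeType)]
    return len(nested_a) == len(nested_b) and all(
        codes_equivalent(x, y) for x, y in zip(nested_a, nested_b)
    )


original = compile("def f(a):\n    return a + 1\n", "<original>", "exec")
recovered = compile("def f(a):\n    return a+1\n", "<recovered>", "exec")
different = compile("def f(a):\n    return a + 2\n", "<different>", "exec")

same = codes_equivalent(original, recovered)      # formatting differs, bytecode does not
changed = codes_equivalent(original, different)   # the constant differs
```

Comparing at the code-object level rather than on source text is what makes the check independent of whitespace, comments, and other details that never reach the bytecode.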
py_disasm
import disrobe
from disrobe import PyDisasmReport
with open("module.pyc", "rb") as fh:
    result: PyDisasmReport = disrobe.py_disasm(fh.read())
marshal_version: str | None = result.marshal_version
instruction_count: int = result.instruction_count
text: str | None = result.text
py_deob, py_deob_detect, py_deob_list_passes, py_deob_detect_pass
import disrobe
from disrobe import ObfuscatorPass, PyDeobDetection, PyDeobReport
obfuscated: str = "exec(__import__('base64').b64decode('cHJpbnQoMSk='))\n"
deob: PyDeobReport = disrobe.py_deob(obfuscated, cleanup=True)
peeled_source: str | None = deob.peeled_source
cleanup_source: str | None = deob.cleanup_source
layer_count: int = deob.layer_count
detection: PyDeobDetection = disrobe.py_deob_detect(obfuscated)
match_count: int = detection.match_count
passes: list[ObfuscatorPass] = disrobe.py_deob_list_passes()
first_id: str | None = passes[0].id if passes else None
per_pass: PyDeobDetection = disrobe.py_deob_detect_pass(obfuscated, "base64-exec")
py_deob_detect_pass raises DisrobeError for an unknown pass_id.
Python pass report classes
| Class | Notable typed accessors |
|---|---|
| PyDecompileReport | source, marshal_version, decompile_version, recovered_directly, fallback_reason, roundtrip_status, roundtrip_detail, interpreter_path, interpreter_version, llm |
| PyDisasmReport | marshal_version: str \| None, instruction_count: int, text: str \| None, llm |
| PyDeobReport | peeled_source: str \| None, cleanup_source: str \| None, layer_count: int, llm |
| PyDeobDetection | match_count: int, llm |
| ObfuscatorPass | id: str \| None |
PyArmor
pyarmor_detect
Parses a PyArmor wrapper from source text.
import disrobe
from disrobe import PyarmorDetection
detection: PyarmorDetection = disrobe.pyarmor_detect(open("wrapped.py").read())
version: str | None = detection.version
protection: str | None = detection.protection
confidence: str | None = detection.confidence
serial: str | None = detection.serial
python_major: int | None = detection.python_major
python_minor: int | None = detection.python_minor
payload_offset: int | None = detection.payload_offset
payload_size: int | None = detection.payload_size
pyarmor_unpack
Statically unpacks a PyArmor wrapper image. 72 of 72 PyArmor samples (v6-v9)
recover. The bindings expose only the static path; there is no
--allow-dynamic equivalent.
import disrobe
from disrobe import PyarmorUnpack, PyarmorUnpackStatus
with open("wrapper.pyc", "rb") as fh:
    unpacked: PyarmorUnpack = disrobe.pyarmor_unpack(fh.read())
status: PyarmorUnpackStatus | None = unpacked.status
pyarmor_version: str | None = unpacked.pyarmor_version
protection_kind: str | None = unpacked.protection_kind
plaintext_len: int | None = unpacked.plaintext_len
digest: str | None = unpacked.plaintext_blake3_hex
bcc_blob_count: int | None = unpacked.bcc_blob_count
inner_cipher_recovered_co: int | None = unpacked.inner_cipher_recovered_co
pyarmor_classify
import disrobe
from disrobe import PyarmorClassification
with open("payload.bin", "rb") as fh:
    payload: bytes = fh.read()
classification: PyarmorClassification = disrobe.pyarmor_classify(open("wrapped.py").read(), payload)
script_type: str | None = classification.script_type
bootstrap_import: str | None = classification.bootstrap_import
disposition: str | None = classification.disposition
rft_enabled: bool = classification.rft_enabled
ecc_enabled: bool = classification.ecc_enabled
The sidecar PyarmorDetection TypedDict (bindings/python/pyarmor-detection.pyi)
names the confidence and protection Literal values used in the raw dict.
PyArmor report classes
| Class | Notable typed accessors |
|---|---|
| PyarmorDetection | version, protection, confidence, serial, python_major, python_minor, payload_offset, payload_size, llm |
| PyarmorUnpack | status, pyarmor_version, protection_kind, plaintext_len, plaintext_blake3_hex, bcc_blob_count, inner_cipher_recovered_co, llm |
| PyarmorClassification | script_type, bootstrap_import, disposition, rft_enabled, ecc_enabled |
PyInstaller and Nuitka
pyinstaller_extract / pyinstaller_entry_bytes
import disrobe
from disrobe import PyInstallerArchive
with open("app.exe", "rb") as fh:
    image: bytes = fh.read()
archive: PyInstallerArchive = disrobe.pyinstaller_extract(image)
entry_count: int = archive.entry_count
encrypted: bool = archive.encrypted
encryption_key_present: bool = archive.encryption_key_present
python_major: int | None = archive.python_major
python_minor: int | None = archive.python_minor
entries: list[dict[str, object]] = archive.raw["entries"]
main_payload: bytes = disrobe.pyinstaller_entry_bytes(image, str(entries[0]["name"]))
nuitka_detect / nuitka_extract
import disrobe
from disrobe import NuitkaDetection, NuitkaExtraction
with open("app.exe", "rb") as fh:
    image: bytes = fh.read()
det: NuitkaDetection = disrobe.nuitka_detect(image)
flavor: str | None = det.flavor
version: str | None = det.version
wheel_marker: str | None = det.wheel_marker
onefile_payload_offset: int | None = det.onefile_payload_offset
onefile_payload_compressed: bool = det.onefile_payload_compressed
extraction: NuitkaExtraction = disrobe.nuitka_extract(image)
variant: str | None = extraction.variant
The sidecar FreezerManifest TypedDict (bindings/python/freezer-manifest.pyi)
describes the manifest schema for cx-freeze/py2exe/shiv/pex/py-oxidizer/briefcase
freezer families; reach it via report.raw.
PyInstaller/Nuitka report classes
| Class | Notable typed accessors |
|---|---|
| PyInstallerArchive | entry_count: int, encrypted: bool, encryption_key_present: bool, python_major: int \| None, python_minor: int \| None, llm |
| NuitkaDetection | flavor: str \| None, version: str \| None, wheel_marker: str \| None, onefile_payload_offset: int \| None, onefile_payload_compressed: bool, llm |
| NuitkaExtraction | variant: str \| None, llm |
Hermes (React Native)
All 8 functions in the committed hermesc-built HBC v96 sample lift at 100% op-coverage with 0 fallback ops. 122,633 functions lift with no failure on a production React Native bundle.
import disrobe
from disrobe import HermesDisassembly, HermesLift, HermesInfo
with open("index.android.bundle", "rb") as fh:
    bundle: bytes = fh.read()
disasm_result: HermesDisassembly = disrobe.hermes_disasm(bundle)
function_count: int = disasm_result.function_count
identifier_count: int = disasm_result.identifier_count
string_count: int = disasm_result.string_count
lift: HermesLift = disrobe.hermes_lift(bundle)
function_surface_count: int = lift.function_surface_count
info: HermesInfo = disrobe.hermes_info(bundle)
version: int | None = info.version
header_size: int | None = info.header_size
Hermes report classes
| Class | Notable typed accessors |
|---|---|
| HermesDisassembly | function_count: int, identifier_count: int, string_count: int, llm |
| HermesLift | function_surface_count: int, string_count: int, identifier_count: int, llm |
| HermesInfo | version: int \| None, function_count: int \| None, string_count: int \| None, header_size: int \| None, llm |
Mach-O and Swift
import disrobe
from disrobe import MachoReport, SwiftReport
with open("universal.dylib", "rb") as fh:
    data: bytes = fh.read()
macho: MachoReport = disrobe.macho_dump(data)
kind: str | None = macho.kind
fat_entry_count: int = macho.fat_entry_count
slice_count: int = macho.slice_count
swift: SwiftReport = disrobe.swift_analyze(data)
container: str | None = swift.container
swift_fat_entry_count: int = swift.fat_entry_count
swift_slice_count: int = swift.slice_count
Mach-O report classes
| Class | Notable typed accessors |
|---|---|
| MachoReport | kind: str \| None, fat_entry_count: int, slice_count: int, llm |
| SwiftReport | container: str \| None, fat_entry_count: int, slice_count: int, llm |
JVM and Android
93.1% of JVM methods recompile error-free under javac (CI floor 122 of 131;
128 of 131 measured with JDK 25). 99% of committed DEX classes pass -Xverify:all.
import disrobe
from disrobe import (
    JvmClass, DexFileReport, JvmDecompiledClass,
    DetectionList, JvmBackends, ApkResources,
)
with open("Hello.class", "rb") as fh:
    cls: JvmClass = disrobe.jvm_parse_class(fh.read())
major_version: int | None = cls.major_version
minor_version: int | None = cls.minor_version
method_count: int = cls.method_count
field_count: int = cls.field_count
constant_pool_count: int = cls.constant_pool_count
with open("classes.dex", "rb") as fh:
    dex: DexFileReport = disrobe.jvm_parse_dex(fh.read())
class_count: int = dex.class_count
dex_method_count: int = dex.method_count
with open("Hello.class", "rb") as fh:
    decompiled: JvmDecompiledClass = disrobe.jvm_decompile_class(fh.read())
source: str | None = decompiled.source
fully_lifted_methods: int = decompiled.fully_lifted_methods
fallback_methods: int = decompiled.fallback_methods
detections: DetectionList = disrobe.jvm_detect(open("obf.class", "rb").read())
detection_count: int = detections.count
backends: JvmBackends = disrobe.jvm_backends()
jvm_count: int = backends.jvm_count
android_count: int = backends.android_count
with open("app.apk", "rb") as fh:
    apk: ApkResources = disrobe.apk_resources(fh.read())
package: str | None = apk.package
manifest_xml: str | None = apk.manifest_xml
resource_entry_count: int = apk.resource_entry_count
certificate_count: int = apk.certificate_count
dex_count: int = apk.dex_count
jvm_backends and dotnet_backends probe the host for installed external
tools but never shell out to them. Counts are informational only.
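The major_version, minor_version, and constant_pool_count accessors shown for jvm_parse_class correspond to fields in the fixed class-file prefix defined by the JVM specification. The sketch below reads that prefix directly; read_class_header is a hypothetical helper, and whether the binding reports constant_pool_count as the raw field (which is the entry count plus one) is not stated in the source.

```python
import struct


def read_class_header(class_bytes: bytes) -> dict[str, int]:
    """Parse the fixed 10-byte prefix of a JVM class file (big-endian)."""
    if len(class_bytes) < 10:
        raise ValueError("truncated class file")
    magic, minor, major, cp_count = struct.unpack(">IHHH", class_bytes[:10])
    if magic != 0xCAFEBABE:
        raise ValueError("not a JVM class file")
    return {
        "major_version": major,
        "minor_version": minor,
        "constant_pool_count": cp_count,  # raw field: number of entries + 1
    }


# Synthetic header: major version 52 is the Java 8 class-file version.
header = bytes.fromhex("cafebabe" "0000" "0034" "0010")
info = read_class_header(header)
```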
JVM/Android report classes
| Class | Notable typed accessors |
|---|---|
| JvmClass | major_version: int \| None, minor_version: int \| None, method_count: int, field_count: int, constant_pool_count: int, llm |
| DexFileReport | string_count: int, type_count: int, class_count: int, method_count: int, llm |
| JvmDecompiledClass | source: str \| None, method_count: int, field_count: int, fully_lifted_methods: int, fallback_methods: int |
| DetectionList | count: int |
| JvmBackends | jvm_count: int, android_count: int, llm |
| ApkResources | package: str \| None, manifest_xml: str \| None, resource_entry_count: int, certificate_count: int, dex_count: int, llm |
.NET
import disrobe
from disrobe import (
    DotnetPe, DotnetMetadata, DotnetDetection,
    DotnetAnalysis, DotnetDecompilation, DotnetDecoders, BackendList,
)
with open("Sample.dll", "rb") as fh:
    pe_bytes: bytes = fh.read()
pe: DotnetPe = disrobe.dotnet_parse_pe(pe_bytes)
bitness: str | None = pe.bitness
machine: int | None = pe.machine
section_count: int = pe.section_count
entry_point_rva: int | None = pe.entry_point_rva
metadata: DotnetMetadata = disrobe.dotnet_parse_metadata(pe_bytes)
version: str | None = metadata.version
major_runtime_version: int | None = metadata.major_runtime_version
stream_count: int = metadata.stream_count
detection: DotnetDetection = disrobe.dotnet_detect(pe_bytes)
primary: str | None = detection.primary
match_count: int = detection.match_count
analysis: DotnetAnalysis = disrobe.dotnet_analyze(pe_bytes)
pe_bitness: str | None = analysis.pe_bitness
native_aot: bool = analysis.native_aot
primary_protector: str | None = analysis.primary_protector
opcode_spec_coverage_pct: int | None = analysis.opcode_spec_coverage_pct
decompilation: DotnetDecompilation = disrobe.dotnet_decompile(pe_bytes)
module_name: str | None = decompilation.module_name
methods_decompiled: int | None = decompilation.methods_decompiled
methods_bodyless: int | None = decompilation.methods_bodyless
methods_failed: int | None = decompilation.methods_failed
decoders: DotnetDecoders = disrobe.dotnet_recover_decoders(pe_bytes)
pure_decoders_found: int | None = decoders.pure_decoders_found
constants_recovered: int = decoders.constants_recovered
backend_list: BackendList = disrobe.dotnet_backends()
available_count: int = backend_list.available_count
.NET report classes
| Class | Notable typed accessors |
|---|---|
| DotnetPe | bitness: str \| None, machine: int \| None, section_count: int, entry_point_rva: int \| None, llm |
| DotnetMetadata | version: str \| None, major_runtime_version: int \| None, stream_count: int, llm |
| DotnetDetection | primary: str \| None, match_count: int, llm |
| DotnetAnalysis | pe_bitness: str \| None, clr_runtime_version: str \| None, native_aot: bool, primary_protector: str \| None, opcode_spec_coverage_pct: int \| None, llm |
| DotnetDecompilation | module_name: str \| None, methods_decompiled: int \| None, methods_bodyless: int \| None, methods_failed: int \| None, llm |
| DotnetDecoders | pure_decoders_found: int \| None, constants_recovered: int, llm |
WebAssembly
100% op-coverage on 94 functions across 30 parseable corpus modules. 24 of 24 execution-eligible functions are execution-equivalent under wasmtime.
import disrobe
from disrobe import WasmAnalysis, WasmDetection
with open("module.wasm", "rb") as fh:
    wasm_bytes: bytes = fh.read()
analysis: WasmAnalysis = disrobe.wasm_analyze(wasm_bytes)
import_count: int = analysis.import_count
export_count: int = analysis.export_count
func_count: int | None = analysis.func_count
code_size_bytes: int | None = analysis.code_size_bytes
has_dwarf: bool = analysis.has_dwarf
detection: WasmDetection = disrobe.wasm_detect(wasm_bytes)
obfuscator: str | None = detection.obfuscator
confidence: float | None = detection.confidence
has_name_section: bool = detection.has_name_section
function_count: int | None = detection.function_count
WebAssembly report classes
| Class | Notable typed accessors |
|---|---|
| WasmAnalysis | import_count: int, export_count: int, func_count: int \| None, code_size_bytes: int \| None, has_dwarf: bool, llm |
| WasmDetection | obfuscator: str \| None, confidence: float \| None, has_name_section: bool, function_count: int \| None, llm |
JavaScript
Supports 11 bundlers: webpack4, webpack5/webpack, vite, rollup,
rolldown, esbuild, turbopack, bun, browserify, parcel,
systemjs. An unrecognised hint string raises DisrobeError.
import disrobe
from disrobe import JsDetection, JsUnminify, JsUnbundle
source: str = open("main.js").read()
detection: JsDetection = disrobe.js_detect(source)
family: str | None = detection.family
confidence: float | None = detection.confidence
marker_count: int = detection.marker_count
unminified: JsUnminify = disrobe.js_unminify(source)
recovered_source: str | None = unminified.source
bundle_source: str = open("bundle.js").read()
unbundled: JsUnbundle = disrobe.js_unbundle(bundle_source)
module_count: int = unbundled.module_count
bundler: str | None = unbundled.bundler
unbundled_hinted: JsUnbundle = disrobe.js_unbundle(bundle_source, bundler="webpack5")
JavaScript report classes
| Class | Notable typed accessors |
|---|---|
| `JsDetection` | `family: str \| None`, `confidence: float \| None`, `marker_count: int`, `llm` |
| `JsUnminify` | `source: str \| None`, `llm` |
| `JsUnbundle` | `module_count: int`, `bundler: str \| None`, `llm` |
Lua
Detects, decompiles, and deobfuscates 11 Lua obfuscator families. IronBrew2 2.7.0 is reversed against real committed output with a Lua execution differential.
import disrobe
from disrobe import LuaDetection, LuaDecompilation, LuaDeobfuscation
with open("chunk.luac", "rb") as fh:
    bytecode: bytes = fh.read()
det: LuaDetection = disrobe.lua_detect(bytecode)
lua_format: str | None = det.format
decompiled: LuaDecompilation = disrobe.lua_decompile(bytecode)
decompiled_source: str | None = decompiled.source
fidelity: str | None = decompiled.fidelity
warning_count: int = decompiled.warning_count
deob: LuaDeobfuscation = disrobe.lua_deobfuscate(open("obf.lua").read(), authorize=True)
obfuscator: str | None = deob.obfuscator
deobfuscated: str | None = deob.deobfuscated
fully_recovered: bool = deob.fully_recovered
passes_run_count: int = deob.passes_run_count
recovered_string_count: int = deob.recovered_string_count
Lua report classes
| Class | Notable typed accessors |
|---|---|
| `LuaDetection` | `format: str \| None`, `llm` |
| `LuaDecompilation` | `source: str \| None`, `fidelity: str \| None`, `warning_count: int`, `llm` |
| `LuaDeobfuscation` | `obfuscator: str \| None`, `deobfuscated: str \| None`, `fully_recovered: bool`, `passes_run_count: int`, `recovered_string_count: int`, `llm` |
Go
85%+ type-name recovery on stripped go1.26 fixtures; 528 of 528 measured.
import disrobe
from disrobe import GoAnalysis, GoSymbols, GoPclntab, GarbleReport
with open("binary", "rb") as fh:
    go_bytes: bytes = fh.read()
analysis: GoAnalysis = disrobe.go_analyze(go_bytes)
image_kind: str | None = analysis.image_kind
pclntab_version: str | None = analysis.pclntab_version
buildversion: str | None = analysis.buildversion
ptr_size: int | None = analysis.ptr_size
symbols: GoSymbols = disrobe.go_symbols(go_bytes)
version_label: str | None = symbols.version_label
function_count: int = symbols.function_count
source_file_count: int = symbols.source_file_count
package_count: int = symbols.package_count
pclntab: GoPclntab = disrobe.go_pclntab(go_bytes)
version: str | None = pclntab.version
func_count: int | None = pclntab.func_count
garble: GarbleReport = disrobe.go_garble(go_bytes)
quality: str | None = garble.quality
detection_score: int | None = garble.detection_score
seed_recoverable: bool = garble.seed_recoverable
seed_hash: str | None = garble.seed_hash
recovered_string_count: int = garble.recovered_string_count
Go report classes
| Class | Notable typed accessors |
|---|---|
| `GoAnalysis` | `image_kind: str \| None`, `pclntab_version: str \| None`, `buildversion: str \| None`, `ptr_size: int \| None`, `llm` |
| `GoSymbols` | `version_label: str \| None`, `function_count: int`, `source_file_count: int`, `package_count: int`, `llm` |
| `GoPclntab` | `version: str \| None`, `ptr_size: int \| None`, `func_count: int \| None`, `image_kind: str \| None`, `llm` |
| `GarbleReport` | `quality: str \| None`, `detection_score: int \| None`, `seed_recoverable: bool`, `seed_hash: str \| None`, `recovered_string_count: int`, `llm` |
Ruby
import disrobe
from disrobe import RubyDetection, RubyAnalysis
with open("hello.rb.enc", "rb") as fh:
    ruby_bytes: bytes = fh.read()
det: RubyDetection = disrobe.ruby_detect(ruby_bytes, source_path="hello.rb.enc")
flavor: str | None = det.flavor
analysis: RubyAnalysis = disrobe.ruby_decompile(ruby_bytes, source_path="hello.rb.enc")
ruby_flavor: str | None = analysis.flavor
source_path: str | None = analysis.source_path
input_len: int | None = analysis.input_len
Ruby report classes
| Class | Notable typed accessors |
|---|---|
| `RubyDetection` | `flavor: str \| None`, `llm` |
| `RubyAnalysis` | `flavor: str \| None`, `source_path: str \| None`, `input_len: int \| None`, `llm` |
PHP
import disrobe
from disrobe import PhpDetection, PhpScan, PhpDecode
with open("obfuscated.php", "rb") as fh:
    php_bytes: bytes = fh.read()
det: PhpDetection = disrobe.php_detect(php_bytes)
kind: str | None = det.kind
confidence: str | None = det.confidence
open_tag_offset: int | None = det.open_tag_offset
has_halt_compiler: bool = det.has_halt_compiler
scan: PhpScan = disrobe.php_scan(php_bytes)
hit_count: int = scan.hit_count
family_count: int = scan.family_count
decoded: PhpDecode = disrobe.php_decode(php_bytes, max_depth=10)
php_source: str | None = decoded.source
layer_count: int = decoded.layer_count
residual_eval: bool = decoded.residual_eval
PHP report classes
| Class | Notable typed accessors |
|---|---|
| `PhpDetection` | `kind: str \| None`, `confidence: str \| None`, `open_tag_offset: int \| None`, `has_halt_compiler: bool`, `llm` |
| `PhpScan` | `hit_count: int`, `family_count: int`, `llm` |
| `PhpDecode` | `source: str \| None`, `layer_count: int`, `residual_eval: bool`, `llm` |
Shell
import disrobe
from disrobe import BatchDeobReport, PowershellDetection, PowershellDeobfuscation
batch_script: str = open("dropper.bat").read()
batch_result: BatchDeobReport = disrobe.batch_deobfuscate(batch_script, args=["/run"])
output: str | None = batch_result.output
embedded_payload_count: int = batch_result.embedded_payload_count
decrypted_stage_count: int = batch_result.decrypted_stage_count
commands_emulated: int | None = batch_result.commands_emulated
ps_script: str = open("obf.ps1").read()
ps_det: PowershellDetection = disrobe.powershell_detect(ps_script)
obfuscator: str | None = ps_det.obfuscator
ps_confidence: float | None = ps_det.confidence
marker_count: int = ps_det.marker_count
ps_deob: PowershellDeobfuscation = disrobe.powershell_deobfuscate(ps_script)
ps_output: str | None = ps_deob.output
level: str | None = ps_deob.level
transformation_count: int = ps_deob.transformation_count
Shell report classes
| Class | Notable typed accessors |
|---|---|
| `BatchDeobReport` | `output: str \| None`, `embedded_payload_count: int`, `decrypted_stage_count: int`, `commands_emulated: int \| None`, `llm` |
| `PowershellDetection` | `obfuscator: str \| None`, `confidence: float \| None`, `marker_count: int`, `llm` |
| `PowershellDeobfuscation` | `output: str \| None`, `level: str \| None`, `transformation_count: int`, `llm` |
Containers
98 container families detected and extracted in-tree. See container docs for the full family list.
import disrobe
from disrobe import ContainerDetection, ContainerMembers, ContainerListing
with open("archive.zip", "rb") as fh:
    container_bytes: bytes = fh.read()
det: ContainerDetection = disrobe.container_detect(container_bytes)
detected: bool = det.detected
kind: str | None = det.kind
is_zip_family: bool = det.is_zip_family
members: ContainerMembers = disrobe.container_members(container_bytes)
fmt: str | None = members.format
size: int | None = members.size
listing: ContainerListing | None = members.listing
entry_count: int = members.entry_count
Container report classes
| Class | Notable typed accessors |
|---|---|
| `ContainerDetection` | `detected: bool`, `kind: str \| None`, `is_zip_family: bool`, `llm` |
| `ContainerMembers` | `format: str \| None`, `size: int \| None`, `listing: ContainerListing \| None`, `entry_count: int`, `llm` |
Pickle
Nothing is ever unpickled; the VM is symbolic.
import disrobe
from disrobe import (
    PickleDecompilation, PickleSafety, PickleTrace,
    PicklePolyglot, PickleMlReport,
)
with open("model.pkl", "rb") as fh:
    pkl: bytes = fh.read()
listing: str = disrobe.pickle_disasm(pkl)
decompilation: PickleDecompilation = disrobe.pickle_decompile(pkl)
pkl_source: str | None = decompilation.source
safety: PickleSafety = disrobe.pickle_safety(pkl)
severity: str | None = safety.severity
finding_count: int = safety.finding_count
import_count: int = safety.import_count
reduce_count: int | None = safety.reduce_count
trace: PickleTrace = disrobe.pickle_trace(pkl)
protocol: int | None = trace.protocol
memo_count: int | None = trace.memo_count
max_stack_depth: int | None = trace.max_stack_depth
global_ref_count: int = trace.global_ref_count
trace_reduce_count: int | None = trace.reduce_count
polyglot: PicklePolyglot = disrobe.pickle_polyglot(pkl)
is_pickle: bool = polyglot.is_pickle
is_polyglot: bool = polyglot.is_polyglot
kind_count: int = polyglot.kind_count
with open("model.pt", "rb") as fh:
    ml_report: PickleMlReport = disrobe.pickle_ml_detect(fh.read())
fmt: str | None = ml_report.format
framing: str | None = ml_report.framing
embedded_count: int = ml_report.embedded_count
Pickle report classes
| Class | Notable typed accessors |
|---|---|
| `PickleDecompilation` | `source: str \| None`, `llm` |
| `PickleSafety` | `severity: str \| None`, `finding_count: int`, `import_count: int`, `reduce_count: int \| None`, `llm` |
| `PickleTrace` | `protocol: int \| None`, `memo_count: int \| None`, `max_stack_depth: int \| None`, `global_ref_count: int`, `reduce_count: int \| None`, `llm` |
| `PicklePolyglot` | `is_pickle: bool`, `is_polyglot: bool`, `kind_count: int`, `llm` |
| `PickleMlReport` | `format: str \| None`, `framing: str \| None`, `embedded_count: int`, `llm` |
Editable IR objects
CodeObject, Instruction, and Symbol let you load a Disasm-rung .dr
envelope, modify it in Python, and write a fresh integrity-hashed .dr.
Instruction
from disrobe import Instruction, InstructionFlow
instr: Instruction = Instruction(
    offset=0,
    mnemonic="mov",
    operands=["rax", "rbx"],
    bytes=b"\x48\x89\xd8",
)
instr.branch_target = None
flow: InstructionFlow = instr.flow
text: str = instr.text()
| Member | Type | Notes |
|---|---|---|
| `offset` | `int` | Mutable |
| `mnemonic` | `str` | Mutable |
| `operands` | `list[str]` | Mutable |
| `bytes` | `bytes` | Mutable |
| `branch_target` | `int \| None` | Mutable |
| `flow` | `@property InstructionFlow` | Read-only |
| `text()` | `-> str` | Rendered disassembly line |
Symbol
from disrobe import Symbol, SymbolKind
sym: Symbol = Symbol(address=0x1000, name="entry", kind="function")
sym.name = "main"
sym.kind = "export"
| Member | Type | Notes |
|---|---|---|
| `address` | `int` | Mutable |
| `name` | `str` | Mutable |
| `kind` | `SymbolKind` | Mutable |
CodeObject
import disrobe
from disrobe import CodeObject, Instruction, Symbol
with open("module.dr", "rb") as fh:
    co: CodeObject = CodeObject.from_dr(fh.read())
instruction_count: int = co.instruction_count
symbol_count: int = co.symbol_count
source_hash: str = co.source_hash
produced_by: str = co.produced_by
instrs: list[Instruction] = co.instructions
syms: list[Symbol] = co.symbols
metadata: dict[str, str] = co.metadata
capabilities: list[str] = co.capabilities
llm_bundle: dict[str, object] | None = co.llm
new_sym: Symbol = Symbol(address=0x2000, name="renamed_fn", kind="function")
co.add_symbol(new_sym)
co.set_metadata("analysis", "patched")
co.add_capability("NETWORK_CONNECT", 1)
fresh_dr: bytes = co.to_dr()
with open("module_patched.dr", "wb") as fh:
    fh.write(fresh_dr)
CodeObject.from_dr parses a Disasm-rung .dr envelope. to_dr produces a
fresh envelope with a recomputed integrity hash. The set_instructions and
set_symbols methods replace the full list; add_instruction / add_symbol
append. set_metadata(key, value) sets a single string key; clear_metadata
resets all. set_llm(sidecar) attaches or removes the LLM sidecar dict.
| Member | Signature | Notes |
|---|---|---|
| `from_dr` | `staticmethod(dr_bytes: bytes) -> CodeObject` | Parse existing envelope |
| `instructions` | `@property -> list[Instruction]` | |
| `set_instructions` | `(instructions: list[Instruction]) -> None` | Replace |
| `add_instruction` | `(instruction: Instruction) -> None` | Append |
| `symbols` | `@property -> list[Symbol]` | |
| `set_symbols` | `(symbols: list[Symbol]) -> None` | Replace |
| `add_symbol` | `(symbol: Symbol) -> None` | Append |
| `instruction_count` | `@property -> int` | |
| `symbol_count` | `@property -> int` | |
| `source_hash` | `str` | Mutable attribute |
| `produced_by` | `str` | Mutable attribute |
| `metadata` | `@property -> dict[str, str]` | |
| `set_metadata` | `(key: str, value: str) -> None` | |
| `clear_metadata` | `() -> None` | |
| `capabilities` | `@property -> list[str]` | |
| `add_capability` | `(name: str, major: int) -> None` | |
| `llm` | `@property -> dict[str, Any] \| None` | |
| `set_llm` | `(sidecar: dict[str, Any] \| None) -> None` | |
| `to_dr` | `() -> bytes` | Produce fresh integrity-hashed envelope |
Scope
- No file or directory handling: no `--out` trees, no `--capture-stages`, no container extraction to disk. `auto` returns the plan document only.
- External backend tools (`jvm_backends`, `dotnet_backends`, `native_probe_backends`) are probed for availability but never executed.
- AS3, Flutter, BEAM (beyond `disasm`/`parse`), and the freezer family beyond PyInstaller/Nuitka have no dedicated bindings in this release.
- No SARIF/NDJSON emitters and no `serve` daemon; drive the CLI or daemon directly.
Forensics and malware-safety posture
disrobe is designed to be run against hostile input by analysts who must not detonate it. This page states precisely what does and does not execute, so you can decide what to run inside a sandbox.
The default is static analysis, no sample execution
By default, disrobe does not execute the sample. Every default path is pure static analysis: it parses bytes, decodes bytecode, walks structures, and emits derived artifacts. It does not unpickle, does not call __reduce__, does not run a packed binary, does not invoke a sample's entry point.
This holds for the entire pickle suite in particular. disrobe pickle trace runs a symbolic VM: it walks the opcode stream and builds the object graph without instantiating a single real object or resolving a single real global. disrobe pickle safety grades danger statically. You can audit a downloaded .pt or .pkl for what it would do on load without ever letting it load.
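The idea of auditing a pickle without loading it can be illustrated with the Python standard library alone. The sketch below is not disrobe's symbolic VM: it relies on `pickletools.genops`, which only tokenizes the opcode stream, to list the globals a pickle would import and to count its call-style opcodes. The function name `scan_pickle`, the report shape, and the string-tracking heuristic for `STACK_GLOBAL` are all assumptions made for illustration.

```python
import os
import pickle
import pickletools

# Opcodes that push a text string (candidates for STACK_GLOBAL operands).
STRING_PUSHES = {"UNICODE", "BINUNICODE", "SHORT_BINUNICODE", "BINUNICODE8"}
# Opcodes that would call or instantiate something at load time.
CALL_OPCODES = {"REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}


def scan_pickle(data: bytes) -> dict:
    """Tokenize the opcode stream; nothing is imported, built, or called."""
    imports: list[str] = []
    strings: list[str] = []
    call_count = 0
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # The argument is "module name", separated by one space.
            imports.append(arg.replace(" ", ".", 1))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocol 4+: module and name are the two latest string pushes.
            # Heuristic only; a real symbolic VM would model the stack and memo.
            imports.append(f"{strings[-2]}.{strings[-1]}")
        if opcode.name in STRING_PUSHES:
            strings.append(arg)
        if opcode.name in CALL_OPCODES:
            call_count += 1
    return {"imports": imports, "call_count": call_count}


class Evil:
    """Hostile fixture: loading its pickle would run a shell command."""

    def __reduce__(self):
        return (os.system, ("echo pwned",))


# Serializing only records the callable and its arguments; nothing runs.
report = scan_pickle(pickle.dumps(Evil(), protocol=4))
```

The report names the `system` import and one call site, which is enough to flag the file, and at no point is `pickle.loads` invoked.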
The opt-in execution paths
There are a small number of paths that can execute code, and every one is behind an explicit, named flag. None of them is on by default.
| Path | Gate | What it does |
|---|---|---|
| PyArmor v6/v7 dynamic-hook | --allow-dynamic | Runs the obfuscated wrapper in a watched subprocess to capture marshal streams. Watchdog timeout via --dynamic-timeout (default 60s). |
| PyArmor BCC native-body lift | --allow-bcc | Lifts BCC-protected native bodies via Ghidra-headless on PATH (Ghidra runs, not the sample's logic in-process). |
If you must use --allow-dynamic, do it inside an isolated sandbox (a disposable VM or container with no network and no access to anything you care about). disrobe gives you the watchdog timeout and a captured-marshal manifest, but a dynamic hook is, by definition, executing adversarial code. The pure-static paths (v8, v9-pro) need no such gate.
Subprocess backends
The optional external backends (Ghidra, CFR, Vineflower, jadx, ILSpy, dnSpy, de4dot, Rizin) run as subprocesses over the artifact, not by executing the sample's own logic. Command lines are constructed from configuration and sometimes user input; command injection or argument smuggling there is in scope for the security policy.
Hardened parsing surface
disrobe parses adversarial binary input constantly, so the parsing surface is hardened deliberately:
- Pure-Rust, `unsafe` forbidden workspace-wide. The only opt-outs are the two pyo3 C-interop crates (`disrobe-pyarmor-cextract`, `disrobe-pyarmor-pytrace`), gated behind explicit features. Any panic or abort on adversarial input that is not a clean `Result::Err` is a bug.
- Resource-exhaustion guards. Zip-bombs, decompression bombs, container-recursion bombs, and malformed-length-field bombs are defused by the shared quota machinery in `crates/disrobe-binfmt/src/quota.rs` (per-entry cap, aggregate cap, recursion-depth cap).
- Path-traversal guards. Zip-slip and equivalents are sanitized on every container extraction path.
- Envelope decoder hardening. The `.dr` decoder is fuzzed; read-past-end, integer overflow, and BLAKE3-mismatch acceptance are all in scope.
- Chain safety. Depth cap (default 8) and content-hash cycle detection stop a malicious input from making a chain recurse forever.
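The chain-safety guard in the last bullet can be sketched in a few lines. This is a minimal illustration, not disrobe's chain runner: `run_chain`, `ChainLimitError`, and the toy `b64:` layer format are hypothetical, and SHA-256 stands in for BLAKE3 because the latter is not in the Python standard library.

```python
import base64
import hashlib
from typing import Callable, Optional


class ChainLimitError(Exception):
    """Raised when a chain exceeds its depth cap or revisits content."""


def run_chain(
    data: bytes,
    unwrap: Callable[[bytes], Optional[bytes]],
    max_depth: int = 8,
) -> list[bytes]:
    """Peel layers until `unwrap` returns None, with two termination guards."""
    seen: set[str] = set()
    stages = [data]
    current = data
    while True:
        digest = hashlib.sha256(current).hexdigest()  # stand-in for BLAKE3
        if digest in seen:
            raise ChainLimitError("content-hash cycle detected")
        seen.add(digest)
        unwrapped = unwrap(current)
        if unwrapped is None:
            return stages
        if len(stages) - 1 >= max_depth:
            raise ChainLimitError(f"depth cap of {max_depth} exceeded")
        stages.append(unwrapped)
        current = unwrapped


def unwrap_b64(blob: bytes) -> Optional[bytes]:
    """Toy layer: a 'b64:' prefix followed by base64 of the inner stage."""
    prefix = b"b64:"
    if blob.startswith(prefix):
        return base64.b64decode(blob[len(prefix):])
    return None


def wrap(blob: bytes, layers: int) -> bytes:
    for _ in range(layers):
        blob = b"b64:" + base64.b64encode(blob)
    return blob


stages = run_chain(wrap(b"payload", 3), unwrap_b64)
```

The depth cap bounds honest-but-deep nesting, while the hash set catches a stage that reproduces earlier content, which a depth cap alone would only stop after wasted work.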
Reporting
Found a way to make a default path execute a sample, or to escape a container, or to crash the parser? That is a security issue: report it privately, never as a public issue. See Security.
Threat model
This page is the explicit trust-boundary analysis for disrobe. It states what the tool treats as untrusted, where the boundaries are drawn, what each boundary defends against, and what is deliberately out of scope. It complements the operational Security policy and the Forensics and malware-safety posture: the security policy says how to report a problem and what is in scope for a report; this page says what the design assumes an attacker can do and where the walls are.
The single load-bearing assumption is this: the input is hostile. disrobe exists to parse protector output, packed executables, obfuscated bytecode, and exotic encoders. Every byte of every sample is treated as attacker-controlled. The analyst running disrobe is trusted; the artifact they point it at is not.
Trust boundaries at a glance
┌─────────────────────────── trusted ───────────────────────────┐
│ analyst, host filesystem, disrobe binary, configuration       │
└───────────────▲───────────────────────────────▲───────────────┘
                │ B1                            │ B2
┌───────────────┴───────────────┐  ┌────────────┴───────────────┐
│ untrusted sample bytes        │  │ untrusted .dr envelope     │
│ (file / bytes_b64 / stdin)    │  │ (cache hit, peer-supplied) │
└───────────────────────────────┘  └────────────────────────────┘
                │ B3                            │ B4
┌───────────────┴───────────────┐  ┌────────────┴───────────────┐
│ network surface               │  │ subprocess backends +      │
│ (serve: HTTP / gRPC / LSP)    │  │ optional sample execution  │
└───────────────────────────────┘  └────────────────────────────┘
There are four boundaries. Boundary 1 (sample bytes) and Boundary 2 (envelope bytes) are always present. Boundary 3 (network) is present only when disrobe serve is running. Boundary 4 (subprocess and dynamic execution) is present only when an explicit opt-in flag is passed.
Boundary 1: untrusted sample bytes
Trusted side: the disrobe process, the host, the analyst's intent.
Untrusted side: the sample. It arrives as a filesystem path, a bytes_b64 blob over the daemon, or stdin. The parser must assume every length field, offset, opcode, and nested container is chosen by an adversary to break it.
What this boundary defends against, and how:
| Threat | Defense | Where |
|---|---|---|
| Memory-corruption via the parser | Pure-Rust, #![forbid(unsafe_code)] workspace-wide; the only opt-out is the two pyo3 C-interop crates, gated behind explicit features and never on the default path. | workspace lint config |
| Panic / abort on adversarial input | Any non-Result::Err failure on hostile bytes is a bug. Decoders return errors, they do not unwrap. | every disrobe-pass-* decoder |
| Decompression and zip bombs | Per-entry cap, aggregate cap, and an observed-ratio ceiling in the shared quota machinery. | crates/disrobe-binfmt/src/quota.rs |
| Path traversal (zip-slip and kin) | Every container extraction path routes through sanitize_entry_path and siblings before any write. | crates/disrobe-binfmt/src/quota.rs |
| Container-recursion bombs | Recursion-depth cap plus content-hash cycle detection in the chain runner (default depth 8). | chain runner |
| Malformed-length-field bombs | Length fields are validated against remaining buffer length before allocation; no length field is trusted to size an allocation. | binfmt + envelope decoder |
| Signature defeat (scrambled magic, renamed sections, corrupted markers) | Detection falls back from magic to self-consistent internal structure, which an adversary cannot break without breaking the file's own functionality. | crates/disrobe-binfmt/src/structural.rs |
The envelope decoder and the container layer are the two most-exposed parsing surfaces and are fuzzed.
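To make the decompression-bomb row of the table concrete, here is a minimal sketch of a per-entry cap combined with an observed-ratio ceiling, written against Python's `zlib`. It is not the code in `quota.rs`: the name `bounded_inflate`, the default limits, and the chunk size are assumptions, and the aggregate cap across entries is omitted for brevity.

```python
import zlib


class QuotaExceeded(Exception):
    """Raised when inflated output breaks the size or ratio budget."""


def bounded_inflate(
    compressed: bytes,
    max_output: int = 1 << 20,   # per-entry cap (assumed: 1 MiB)
    max_ratio: float = 200.0,    # observed output/input ceiling (assumed)
    chunk: int = 4096,
) -> bytes:
    inflater = zlib.decompressobj()
    out = bytearray()
    consumed = 0

    def check() -> None:
        if len(out) > max_output:
            raise QuotaExceeded("per-entry output cap exceeded")
        if len(out) > max_ratio * consumed:
            raise QuotaExceeded("observed compression ratio too high")

    for start in range(0, len(compressed), chunk):
        pending = compressed[start:start + chunk]
        consumed += len(pending)
        while pending:
            # Request at most `chunk` bytes per call so the limits are
            # checked before a bomb can force a large allocation.
            out += inflater.decompress(pending, chunk)
            pending = inflater.unconsumed_tail
            check()
    out += inflater.flush()
    check()
    return bytes(out)


ordinary = bytes(range(256)) * 4
restored = bounded_inflate(zlib.compress(ordinary))
bomb = zlib.compress(b"\0" * 10_000_000)
```

Calling `bounded_inflate(bomb)` raises `QuotaExceeded` after inflating well under the 10 MB the stream declares, because the checks run after every bounded step rather than once at the end.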
Signature defeat and header scrambling
A common evasion against signature-based detectors and unpackers is to scramble the parts a fast scanner keys on: flip the MZ of a PE, zero the \x7fELF of an ELF, mangle the Mach-O / DEX / class-file / wasm magic, rename UPX0/UPX1 and corrupt the UPX! marker. These edits defeat a tool that identifies a format by a leading magic byte or a section name, but they do not change what the file actually is: the loader, the OS, or the runtime still has to find the real structure to execute it, so the structure is the one thing the adversary cannot remove.
disrobe identifies a format by magic on the fast path, then falls back to structural validation when the magic is absent or wrong. The fallback parses far enough into the format's own header tables to confirm they refer to one another consistently, which keeps false positives low (a loose pattern match would not satisfy a full cross-referenced walk):
- PE. Resolve `e_lfanew` to a `PE\0\0` signature, then a COFF header with a known machine type, a PE32/PE32+ optional header, and a section table that fits the file. A corrupted `e_lfanew` itself is recovered by scanning for the `PE\0\0` whose following headers validate, so a flipped `MZ` and a mangled `e_lfanew` together still parse.
- ELF. Validate the class / endianness / version bytes and confirm the program- and section-header table offsets, entry sizes, and counts are self-consistent against the declared entry sizes and the file length. A zeroed `\x7fELF` does not move any of those fields.
- Mach-O. Walk the load-command stream (`ncmds` / `sizeofcmds` and each `cmdsize`) for a single-arch image, or the arch offset/size table for a fat image, accepting only when the run lands exactly at its declared end.
- Native packers (UPX). Detect and unpack by the decompressor stub's `PackHeader` (a known method id, self-consistent compressed/uncompressed lengths, a plausible version) located by structural scan rather than by the `UPX!` marker, and resolve packed-section data through the structural PE header rather than a literal `MZ`. A renamed-section, corrupted-marker UPX still unpacks byte-identically.
- ZIP and zip-family archives. Anchor on the End-of-Central-Directory record (the format's authoritative trailer) and confirm its central-directory offset and size land on a record carrying the central-directory-header signature. A scrambled first local header does not move the EOCD.
- DEX. Confirm `header_size == 0x70`, a legal endian tag, and string / type / proto / method / class section sizes and offsets self-consistent against `file_size` and the byte length; a zeroed `dex\n0XX\0` magic still parses, defaulting the version when the version triple is unreadable.
- JVM class file. Confirm a major version in the JVM-known range and walk the constant pool (Utf8 lengths, long/double double-slots) to its end; a scrambled `0xCAFEBABE` still parses.
- wasm. Confirm a version word of 1 and that the section id/size LEB128 stream validates end to end, terminating exactly at end of file; a scrambled `\0asm` still lifts.
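The wasm bullet is the simplest of these walks to show in code. The sketch below is a hypothetical stand-in for the real validator (the names `read_leb128_u32` and `looks_like_wasm` and the section-id ceiling are assumptions): it ignores the four magic bytes entirely, checks the version word, and accepts only when the section stream ends exactly at end of file.

```python
MAX_SECTION_ID = 13  # assumed ceiling: 12 is data count, 13 is the tag section


def read_leb128_u32(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode an unsigned LEB128 u32 at `pos`; return (value, next_pos)."""
    result = 0
    for i in range(5):  # a u32 needs at most five LEB128 bytes
        if pos >= len(buf):
            raise ValueError("truncated LEB128")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << (7 * i)
        if (byte & 0x80) == 0:
            if result > 0xFFFFFFFF:
                raise ValueError("LEB128 value exceeds u32")
            return result, pos
    raise ValueError("LEB128 too long")


def looks_like_wasm(buf: bytes) -> bool:
    """Structural check that never inspects the 4-byte magic."""
    if len(buf) < 8 or buf[4:8] != b"\x01\x00\x00\x00":
        return False
    pos = 8
    try:
        while pos < len(buf):
            if buf[pos] > MAX_SECTION_ID:
                return False
            size, pos = read_leb128_u32(buf, pos + 1)
            pos += size  # skip the section body without interpreting it
    except ValueError:
        return False
    return pos == len(buf)  # the walk must land exactly on end of file


# Tiny module: a type section with one empty function type, then a
# custom section whose name is "name".
module = (
    b"\x00asm" + b"\x01\x00\x00\x00"
    + b"\x01\x04\x01\x60\x00\x00"
    + b"\x00\x05\x04name"
)
scrambled = b"XXXX" + module[4:]
```

Both `module` and `scrambled` pass, while a truncated or padded copy fails because the walk no longer ends on the last byte. A production validator would go further and parse section bodies; this sketch only covers the id/size framing the bullet describes.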
The structural detector is shared (identify_by_structure) so the central sniffer (classify.rs), the container detector, and the native packer and identity passes all benefit from the same validated logic, and every validator is bounds-checked against deliberately malformed input. Python .pyc / marshal detection is handled on a separate path and is not part of this fallback. The behavior is proven by adversarial tests that take real committed corpus samples, scramble their magic bytes, section names, and markers, and assert disrobe still detects the correct format and, where it unpacks or parses, still produces the correct recovered output.
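An adversarial test of the kind described can be sketched for the ZIP case with the standard library. This is an illustration under stated assumptions, not the project's test suite: `find_zip_by_eocd` is a hypothetical name, the archive is generated in memory rather than taken from the corpus, and ZIP64 is not handled.

```python
import io
import struct
import zipfile

# End-of-Central-Directory record: signature, disk numbers, entry counts,
# central-directory size and offset, comment length (22 bytes total).
EOCD = struct.Struct("<4sHHHHIIH")
EOCD_SIG = b"PK\x05\x06"
CDH_SIG = b"PK\x01\x02"


def find_zip_by_eocd(buf: bytes) -> bool:
    """Detect a ZIP by its trailer, ignoring the leading local-header magic."""
    lowest = max(0, len(buf) - EOCD.size - 0xFFFF)  # comment is at most 64 KiB
    pos = buf.rfind(EOCD_SIG, lowest)
    while pos != -1:
        if pos + EOCD.size <= len(buf):
            fields = EOCD.unpack_from(buf, pos)
            total_entries, cd_size, cd_offset, comment_len = fields[4:8]
            trailer_ok = pos + EOCD.size + comment_len == len(buf)
            cd_fits = cd_offset + cd_size <= pos
            cd_signed = total_entries == 0 or buf[cd_offset:cd_offset + 4] == CDH_SIG
            if trailer_ok and cd_fits and cd_signed:
                return True
        pos = buf.rfind(EOCD_SIG, lowest, pos)
    return False


# Build a real archive in memory, then scramble its first local header.
sink = io.BytesIO()
with zipfile.ZipFile(sink, "w") as archive:
    archive.writestr("a.txt", b"hello")
original = sink.getvalue()
scrambled = b"\x00\x00\x00\x00" + original[4:]
```

`scrambled` no longer starts with `PK\x03\x04`, so a magic-only sniffer would miss it, yet `find_zip_by_eocd` still accepts it because the trailer and the central directory it points at are untouched.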
Boundary 2: untrusted .dr envelopes
The .dr envelope is content-addressed (BLAKE3-rooted, rkyv hot payload + postcard cold sidecar). A cache hit, a peer-supplied envelope, or a downstream stage all cross this boundary. An envelope is not trusted merely because it claims a hash.
What this boundary defends against:
- Read-past-end. The zero-copy rkyv access path is bounds-checked at decode; an envelope whose declared lengths exceed its actual bytes is rejected, not read past.
- Integer overflow in length math. Offset and length arithmetic is checked; an envelope cannot induce a wrapping add that yields an in-bounds-looking slice.
- BLAKE3-mismatch acceptance. The root hash is recomputed over the payload and compared; an envelope whose content does not match its claimed root is rejected. This is the property that makes `--no-cache` an optimization toggle and not a correctness toggle: a cache hit is provably the same bytes.
The decoder lives in crates/disrobe-ir/src/envelope.rs and is fuzzed against exactly these three attacks.
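The verify-before-trust pattern can be shown on a toy envelope. Everything about the layout below is hypothetical (a 4-byte length, a 32-byte digest, then the payload); it is not the `.dr` wire format, and BLAKE2b from `hashlib` stands in for BLAKE3, which the Python standard library does not provide. Python integers do not wrap, so the overflow defense has no direct analogue here.

```python
import hashlib
import hmac
import struct

HEADER = struct.Struct("<I32s")  # toy layout: payload length + claimed digest


class EnvelopeError(Exception):
    """Raised when an envelope fails structural or integrity checks."""


def root_hash(payload: bytes) -> bytes:
    return hashlib.blake2b(payload, digest_size=32).digest()  # BLAKE3 stand-in


def seal(payload: bytes) -> bytes:
    return HEADER.pack(len(payload), root_hash(payload)) + payload


def open_envelope(blob: bytes) -> bytes:
    if len(blob) < HEADER.size:
        raise EnvelopeError("truncated header")
    declared_len, claimed = HEADER.unpack_from(blob, 0)
    # Validate the declared length against the bytes actually present
    # before slicing, so a lying length field cannot cause a read past end.
    if declared_len != len(blob) - HEADER.size:
        raise EnvelopeError("declared length does not match available bytes")
    payload = blob[HEADER.size:]
    # Recompute the root over the payload; never trust the claimed value.
    if not hmac.compare_digest(root_hash(payload), claimed):
        raise EnvelopeError("root hash mismatch")
    return payload


envelope = seal(b"recovered artifact")
tampered = envelope[:-1] + bytes([envelope[-1] ^ 0x01])
```

`open_envelope(envelope)` returns the payload, while `open_envelope(tampered)` raises `EnvelopeError`, which is the behavior that lets a cache hit be treated as the same bytes.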
Boundary 3: the network surface (disrobe serve)
When the daemon runs, HTTP, gRPC, and LSP-over-stdio each cross a trust boundary. The governing rule is that the server never opens a file based on a client-controlled string.
- HTTP, gRPC, and the LSP `disrobe/analyze` method accept `bytes_b64` only, never a path. There is no client-reachable code path that turns a request field into a filesystem read.
- All request bodies reject unknown fields via `#[serde(deny_unknown_fields)]`, closing field-smuggling and forward-compat-confusion attacks.
- A non-loopback HTTP bind emits a `tracing::warn!` banner at startup, so an operator who exposes the daemon beyond localhost is told so explicitly.
The daemon is intended for localhost / trusted-network use; it is not an authenticated multi-tenant service, and exposing it publicly is an operator decision the warning banner flags.
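The two request-parsing rules above (bytes only, and no unknown fields) translate to other languages directly. The following is a Python sketch of the same policy, not the daemon's Rust code: the `command` field, the function name, and the error type are assumptions made for the example.

```python
import base64
import binascii
import json

ALLOWED_FIELDS = frozenset({"bytes_b64", "command"})  # hypothetical schema


class RequestError(ValueError):
    """Raised for any request that does not match the schema exactly."""


def parse_analyze_request(body: str) -> bytes:
    """Return the sample bytes, or raise; never touches the filesystem."""
    try:
        doc = json.loads(body)
    except json.JSONDecodeError as exc:
        raise RequestError("body is not valid JSON") from exc
    if not isinstance(doc, dict):
        raise RequestError("request must be a JSON object")
    unknown = set(doc) - ALLOWED_FIELDS
    if unknown:
        # Mirrors the deny-unknown-fields rule: reject, do not ignore.
        raise RequestError(f"unknown fields: {sorted(unknown)}")
    blob = doc.get("bytes_b64")
    if not isinstance(blob, str):
        raise RequestError("bytes_b64 is required and must be a string")
    try:
        return base64.b64decode(blob, validate=True)
    except binascii.Error as exc:
        raise RequestError("bytes_b64 is not valid base64") from exc


good = json.dumps({"bytes_b64": base64.b64encode(b"\x00asm").decode()})
smuggled = json.dumps({"bytes_b64": "AA==", "path": "/etc/passwd"})
sample = parse_analyze_request(good)
```

Rejecting `smuggled` outright, rather than silently dropping the `path` key, is what keeps a later refactor from accidentally starting to honor it.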
Boundary 4: subprocess backends and optional sample execution
This is the boundary an analyst can choose not to cross at all. It has two distinct sub-cases:
Subprocess backends over the artifact (not the sample's logic). Optional external tools (Ghidra, CFR, Vineflower, jadx, ILSpy, dnSpy, de4dot, Rizin) run as subprocesses over the derived artifact. They never execute the sample's own entry point. The exposure here is command-line construction: command lines are built from configuration and sometimes from user input, so command injection and argument smuggling are the in-scope threats, mitigated by constructing argument vectors directly rather than shelling out through a string.
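The argument-vector mitigation looks like this in practice. The sketch is illustrative rather than a description of disrobe's launcher: `build_argv` and `run_backend` are hypothetical names, a small Python one-liner stands in for an external tool, and the `./` prefix for dash-leading paths is an extra precaution assumed here, not something the text above states.

```python
import subprocess
import sys


def build_argv(tool: list[str], artifact: str) -> list[str]:
    """Append the artifact as exactly one argument, never as shell text."""
    if artifact.startswith("-"):
        # Assumed extra guard: keep a hostile file name from being
        # parsed as an option by the backend.
        artifact = "./" + artifact
    return [*tool, artifact]


def run_backend(tool: list[str], artifact: str, timeout: float = 60.0) -> str:
    # A list argv with the default shell=False means no shell ever
    # interprets metacharacters in the artifact name.
    done = subprocess.run(
        build_argv(tool, artifact),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return done.stdout


# Stand-in "backend" that prints the single argument it received.
echo_tool = [sys.executable, "-c", "import sys; print(sys.argv[1])"]
hostile_name = "sample.bin; echo INJECTED && rm -rf ~"
output = run_backend(echo_tool, hostile_name)
```

The backend receives the whole hostile string as one argument and prints it back verbatim; had the command been assembled as a single string and passed through a shell, the `;` and `&&` would have been executed instead.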
Dynamic execution of the sample. A small number of paths can run adversarial code, and none is on by default. Each sits behind a named flag:
| Path | Gate | What runs |
|---|---|---|
| PyArmor v6/v7 dynamic-hook | --allow-dynamic | The obfuscated wrapper, in a watched subprocess, to capture marshal streams. Watchdog via --dynamic-timeout (default 60s). |
| PyArmor BCC native-body lift | --allow-bcc | Ghidra-headless over the native body: the analysis tool runs, not the sample's logic in-process. |
The default static paths (pickle symbolic VM, the v8 and v9-pro PyArmor peels) need no such gate: they parse and walk, they do not detonate. When dynamic execution is unavoidable, run it inside a disposable, network-isolated sandbox. disrobe gives you a watchdog and a captured-marshal manifest, but a dynamic hook is, by definition, executing attacker code.
Non-execution stance (restated as an invariant)
The default-static stance is a design invariant, not a configuration default that can drift:
- disrobe does not unpickle. `disrobe pickle trace` walks the opcode stream symbolically, building the object graph without instantiating a single real object or resolving a single real global; `disrobe pickle safety` grades danger statically.
- disrobe does not call `__reduce__`, does not run a packed binary, does not invoke a sample's entry point on any default path.
- Any way to make a default path execute a sample is a vulnerability, in scope for the Security policy.
Plugin and WASM isolation
Where disrobe loads analysis logic as data rather than as native code, that logic runs sandboxed: WASM-hosted analysis executes inside a wasmparser-validated, memory-bounded interpreter with no ambient filesystem or network capability, so a malicious or malformed module can consume bounded compute and nothing more. This keeps the extensibility surface from becoming a fresh native-code execution boundary.
Supply chain
The integrity of the binary the analyst runs is its own boundary:
- No untrusted bytecode in the public corpus. The repository does not ship third-party copyrighted obfuscated bytecode; fixtures are either self-generated by `corpus/generate.{sh,ps1}` or referenced by BLAKE3 hash only. Every shipped fixture is pinned by hash in `corpus/native/packers/MANIFEST.toml` and sibling registries, and tests verify byte-identity before the parser ever sees the bytes.
- Signed releases. Release artifacts are signed with cosign keyless OIDC and minisign; every cosign signature is recorded in the Rekor public transparency log. Verification commands are in the Security policy.
- Dependency hygiene. `cargo deny` (advisories / bans / licenses / sources) runs on every push and weekly; `cargo audit` runs weekly. The clippy gate (`-D warnings`) is required for every commit on `main`.
- History hygiene. CI runs on every push, and the local verification chain (clippy `-D warnings`, fmt, tests, `cargo deny`) is the enforced pre-push gate; commit authorship uses the GitHub noreply form so personal email never enters history.
Explicitly out of scope
The threat model deliberately does not defend against:
- Decompilation-output correctness on adversarial bytecode. disrobe will sometimes emit wrong source for hostile input; the round-trip metric exists to flag this. A non-byte-perfect decompile is correctness work, not a security boundary.
- Compute exhaustion via legitimate input. Decompiling a 66 MiB Hermes bundle is genuinely expensive; a slow-but-bounded decompile of real input is not a vulnerability. (Adversarial amplification, a tiny input that forces unbounded work, is in scope under Boundary 1.)
- Vulnerabilities inside wrapped third-party tools. Ghidra, jadx, CFR, and friends have their own security channels; we forward where we can identify the upstream.
- Trusting the analyst. disrobe assumes the operator is authorized and acting in good faith; it gates grey-zone protector behavior behind `--i-have-authorization` but does not, and cannot, adjudicate that authorization.
Reporting
If you find a way to cross a boundary that this model claims is sealed (make a default path execute a sample, escape a container, accept a hash-mismatched envelope, or make the daemon read a file from a client string), that is a security issue. Report it privately, never as a public issue. See the Security policy.
GitHub Action
disrobe ships a composite GitHub Action that downloads the matching release binary, runs a scan over a path or glob, and uploads the result to GitHub code scanning as SARIF. It runs entirely in the runner shell (no Docker image, no build step) so it starts in seconds.
Quick start
Add a workflow that scans build artifacts on every push and surfaces findings in the Security -> Code scanning tab.
name: disrobe-scan
on:
  push:
  pull_request:
permissions:
  contents: read
  security-events: write # required for the SARIF upload
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: 1-3-7/disrobe@v0.10.0
        with:
          path: dist/
          command: auto
          fail-on: failed
The security-events: write permission is what lets the action publish SARIF to code scanning; without it the upload step is skipped by GitHub.
What it does
- Resolves the runner OS/arch to a release target triple (`x86_64-unknown-linux-gnu`, `aarch64-apple-darwin`, `x86_64-pc-windows-msvc`, and the rest of the release matrix).
- Downloads `disrobe-<version>-<target>.tar.zst` (or `.zip` on Windows) plus `SHA256SUMS` from this repository's Releases, and verifies the archive against `SHA256SUMS` before extracting. A checksum mismatch fails the step.
- Runs `disrobe <command> <path> <args> --sarif --out <out-dir>`, capturing the SARIF document.
- Uploads the SARIF to code scanning and the recovered-artifact directory as a workflow artifact.
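The verify-before-extract step can be reproduced by hand when mirroring releases. The sketch below assumes `SHA256SUMS` uses the common `sha256sum` line format (hex digest, whitespace, file name, with an optional `*` before the name); that format, the function names, and the in-memory fake archive are assumptions for illustration, not a description of the action's internals.

```python
import hashlib
import hmac


def parse_sha256sums(text: str) -> dict[str, str]:
    """Map file name -> lowercase hex digest from sha256sum-style lines."""
    sums: dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, _, name = line.partition(" ")
        sums[name.strip().lstrip("*")] = digest.lower()
    return sums


def verify_archive(archive: bytes, name: str, sums_text: str) -> None:
    """Raise ValueError unless `archive` matches its listed checksum."""
    expected = parse_sha256sums(sums_text).get(name)
    if expected is None:
        raise ValueError(f"{name} is not listed in SHA256SUMS")
    actual = hashlib.sha256(archive).hexdigest()
    if not hmac.compare_digest(actual, expected):
        raise ValueError(f"checksum mismatch for {name}")


# Stand-in for a downloaded release asset and its checksum file.
asset_name = "disrobe-v0.10.0-x86_64-unknown-linux-gnu.tar.zst"
asset = b"pretend this is the release archive"
sums_file = f"{hashlib.sha256(asset).hexdigest()}  {asset_name}\n"

verify_archive(asset, asset_name, sums_file)  # passes silently
```

A missing entry is treated as a failure rather than a pass, so an attacker cannot bypass the check by serving an archive under a name the checksum file does not mention.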
Inputs
| Input | Default | Description |
|---|---|---|
| `path` | (required) | File, directory, or glob to analyze. Passed verbatim to the command. |
| `command` | `auto` | disrobe subcommand (`auto`, `scan`, `behavior`, ...). |
| `args` | `""` | Extra arguments inserted after the command and before the path (for example `--max-depth 12`). |
| `version` | action ref, then `latest` | Release tag to download (`v0.10.0`, `latest`). |
| `fail-on` | `never` | Fail the step at or above a verdict: `never`, `incomplete`, `failed`, `any`. |
| `sarif-file` | `disrobe.sarif` | Path the action writes the SARIF to. |
| `out-dir` | `disrobe-out` | Directory disrobe writes recovered artifacts into. |
| `upload-sarif` | `true` | Upload SARIF to GitHub code scanning. |
| `upload-artifact` | `true` | Upload the recovered-artifact directory. |
| `token` | `${{ github.token }}` | Token used to download the release asset. |
Outputs
| Output | Description |
|---|---|
| sarif | Path to the SARIF file the action produced. |
| verdict | Worst verdict observed (ok, incomplete, failed). |
| summary | One-line human-readable run summary. |
Pinning the version
Pin a tag for reproducible CI:
- uses: 1-3-7/disrobe@v0.10.0
  with:
    path: suspect.bin
    version: v0.10.0
Leaving version unset downloads the release matching the action ref, falling back to the rolling latest release. Pin a tag in production so a new release cannot change your scan results silently.
Gating a pull request
Block a merge when a packed or obfuscated artifact slips into the tree:
- uses: 1-3-7/disrobe@v0.10.0
  with:
    path: "build/**/*"
    command: auto
    args: --max-depth 12
    fail-on: failed
fail-on: failed fails only when the chain itself errors; fail-on: incomplete is stricter and also fails when disrobe reports findings it could not fully resolve. The action reports what disrobe detects; it does not invent verdicts.
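The gating rule reads as a threshold over an ordered verdict scale. The sketch below is one interpretation, not the action's implementation: it assumes the ordering ok < incomplete < failed implied by "at or above a verdict", and it deliberately leaves out the any level because this section does not define it. The name `should_fail` is hypothetical.

```python
# Verdict severity, lowest to highest (assumed ordering).
SEVERITY = {"ok": 0, "incomplete": 1, "failed": 2}


def should_fail(verdict: str, fail_on: str) -> bool:
    """Return True when the step should fail for this verdict and fail-on level."""
    if fail_on == "never":
        return False
    if fail_on not in ("incomplete", "failed"):
        # "any" is documented as an input value but not defined here, so the
        # sketch refuses to guess its meaning.
        raise ValueError(f"unsupported fail-on level in this sketch: {fail_on!r}")
    return SEVERITY[verdict] >= SEVERITY[fail_on]
```

Under this reading, an incomplete verdict passes with fail-on: failed but fails with fail-on: incomplete.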
Security posture
The action verifies every download against SHA256SUMS before extracting it, and every release archive additionally carries a cosign signature bundle you can verify out of band. disrobe itself performs pure static analysis by default; see the forensics and malware-safety posture.
pre-commit hook
disrobe ships a pre-commit.com hook that scans staged files and fails the commit when a packed or protected artifact is detected. Use it to stop someone from accidentally (or maliciously) committing a UPX-packed binary, a PyArmor-protected module, a PyInstaller one-file build, or a SourceDefender/Nuitka/PyFreeze blob.
Setup
Add the hook to a consuming project's .pre-commit-config.yaml:
repos:
  - repo: https://github.com/1-3-7/disrobe
    rev: v0.10.0
    hooks:
      - id: disrobe
Then install it:
pre-commit install
The hook requires the disrobe binary on PATH (install from the Releases page), or point it at an explicit path with the DISROBE_BIN environment variable. It also needs python3 available to parse the chain report.
What it detects
For each staged file the hook runs disrobe auto <file> --json against a throwaway output directory and inspects the chain's chosen detector picks. By default it blocks only the high-precision packer/protector detectors, which key off unambiguous structural magic and do not false-positive on ordinary source:
| Detector pass | Blocks |
|---|---|
| native.packer-unpack | UPX, Petite, kkrunchy, and other native packers |
| pyarmor.unpack | PyArmor-protected Python |
| pyinstaller.extract | PyInstaller one-file / one-dir builds |
| sourcedefender.decrypt | SourceDefender-encrypted Python |
| nuitka.extract | Nuitka-compiled binaries |
| pyfreeze.extract | Frozen-Python blobs |
This is deliberately conservative: it blocks exactly the formats above, nothing more.
Tuning
Three environment variables configure the hook; the last two tune the gate:
| Variable | Default | Effect |
|---|---|---|
| DISROBE_BIN | disrobe | Path to the disrobe binary. |
| DISROBE_BLOCK_PASSES | the six passes above | Comma-separated detector pass-ids to block. |
| DISROBE_BLOCK_FAMILIES | (empty) | Comma-separated detector families to additionally block. |
The family override is broader but noisier. disrobe's source-level obfuscation classifiers (js.deob, lua.deob, py.deob) are tuned to attempt recovery aggressively, so they can fire at high confidence on ordinary text and markdown. Enabling DISROBE_BLOCK_FAMILIES=obfuscator-wrapper,packer-archive will catch source-level obfuscation but expect false positives on benign files; scope it with the hook's files:/exclude: patterns.
repos:
  - repo: https://github.com/1-3-7/disrobe
    rev: v0.10.0
    hooks:
      - id: disrobe
        files: '\.(exe|dll|so|dylib|pyc|pyz|bin)$'
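How the pass list and the family list combine can be sketched as follows. This is an illustration, not the hook's source. The default pass-ids and variable names come from the tables above, but the function names, the whitespace trimming, and the treatment of an empty DISROBE_BLOCK_PASSES as "use the defaults" are assumptions. The `picks` argument stands in for the (pass-id, family) pairs the hook reads from the chain report, whose JSON layout is not described in this section.

```python
DEFAULT_BLOCK_PASSES = (
    "native.packer-unpack",
    "pyarmor.unpack",
    "pyinstaller.extract",
    "sourcedefender.decrypt",
    "nuitka.extract",
    "pyfreeze.extract",
)


def _split(value: str) -> set[str]:
    """Split a comma-separated list, dropping empty items and stray spaces."""
    return {item.strip() for item in value.split(",") if item.strip()}


def load_block_lists(env: dict[str, str]) -> tuple[set[str], set[str]]:
    """Return (blocked pass-ids, blocked families) for the given environment."""
    passes = _split(env.get("DISROBE_BLOCK_PASSES", "")) or set(DEFAULT_BLOCK_PASSES)
    families = _split(env.get("DISROBE_BLOCK_FAMILIES", ""))
    return passes, families


def blocked(picks: list[tuple[str, str]], env: dict[str, str]) -> list[str]:
    """Return the pass-ids among `picks` that should block the commit."""
    passes, families = load_block_lists(env)
    return [pid for pid, family in picks if pid in passes or family in families]
```

Passing `os.environ` as `env` would reproduce the environment-driven behavior; taking the mapping as a parameter keeps the logic easy to test.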
Bypassing
A legitimately-committed protected artifact can skip the hook for one commit:
SKIP=disrobe git commit -m "vendor signed third-party binary"
Security posture
The hook runs disrobe auto, which performs pure static analysis by default; it does not execute the staged file. See the forensics and malware-safety posture. The scan writes recovered artifacts only into a temporary directory that the hook deletes on exit; your working tree is never modified.
MCP server
disrobe ships a Model Context Protocol server so an AI agent (Claude, Cursor, and other MCP clients) can drive deobfuscation and decompilation directly as tools. It speaks MCP over stdio via rmcp.
Every tool takes inline base64 bytes and returns structured JSON. The server never reads a file from disk based on client input; there is no path parameter on any tool. This is the same hard rule the HTTP/gRPC/LSP transports enforce; see the security posture.
Running it
Two equivalent entry points:
disrobe serve --mcp # the CLI's MCP companion over stdio
disrobe-mcp # the standalone MCP binary
Registering with Claude Code
claude mcp add disrobe -- disrobe serve --mcp
Or register the standalone binary:
claude mcp add disrobe -- disrobe-mcp
Registering with a generic MCP client
Most clients accept a JSON server entry. Point command at disrobe (or disrobe-mcp) and run over stdio:
{
  "mcpServers": {
    "disrobe": {
      "command": "disrobe",
      "args": ["serve", "--mcp"]
    }
  }
}
Tool catalog
| Tool | Input | Output |
|---|---|---|
| auto | bytes_b64, optional max_depth | Chain verdict, detected formats, and per-pass recovery summary. Auto-detects and chains disrobe's Python + native-packer passes. |
| decompile | bytes_b64, optional max_depth | Every terminal recovered-source artifact (language-keyed text), for example a .pyc decompiled to Python. |
| ioc | bytes_b64 | Indicators of compromise: URLs, domains, IPs, emails, paths, registry keys, wallet addresses, crypto constants (one decode layer of base64/hex). |
| behavior | bytes_b64, optional imports | Static capability summary across network, filesystem, process-exec, registry-persistence, crypto, anti-analysis, and dynamic-code categories, with MITRE ATT&CK ids. |
| strings | bytes_b64, optional min_len, decode | Printable ASCII + UTF-16 strings, optionally decoding base64/rot/stack-string obfuscation, tagged with their encoding. |
| verify | bytes_b64 | Verify a .dr envelope: blake3 root hash, rung, hot/cold sizes. |
| rename | old, new, optional note | Append a symbol-rename record to .disrobe/notes/renames.json. |
| annot | target | Regenerate and validate an annotation sidecar under .disrobe/annotations/. |
| provenance_lookup | map_json, line | Look up the provenance entry for a line in a provenance-map document. |
The auto and decompile tools cover disrobe's Python and native-packer surface (PyArmor, PyInstaller, SourceDefender, Nuitka, PyFreeze, .pyc disassembly + decompilation, native packers, and container formats), the highest-value chain for an agent triaging an unknown blob. For the full language matrix, drive the CLI or the HTTP daemon.
Example call
A client calls decompile with the base64 of a .pyc and receives the recovered Python:
{
  "name": "decompile",
  "arguments": { "bytes_b64": "4w0NCgAAAAA..." }
}

{
  "schema": "disrobe.decompile/v1",
  "verdict": "Complete",
  "recovered": [
    { "pass": "py.decompile", "language": "Python", "formatted": true, "source": "x = a + b\n..." }
  ]
}
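Building the request is mostly a matter of base64-encoding the file. Here is a minimal sketch that assumes only the name/arguments shape shown above and the optional max_depth field from the tool catalog. The helper name `decompile_call` is hypothetical, and the transport (an MCP client speaking over stdio) is left out.

```python
import base64
import json
from typing import Optional


def decompile_call(blob: bytes, max_depth: Optional[int] = None) -> str:
    """Serialize a decompile tool call carrying `blob` as inline base64."""
    if not blob:
        # The server is documented to reject empty input, so fail early.
        raise ValueError("refusing to send empty input")
    arguments: dict[str, object] = {
        "bytes_b64": base64.b64encode(blob).decode("ascii"),
    }
    if max_depth is not None:
        arguments["max_depth"] = max_depth
    return json.dumps({"name": "decompile", "arguments": arguments})
```

Because the bytes travel inline, the client reads the file itself; the server never receives a path.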
Security posture
The server performs pure static analysis by default and never executes the supplied bytes. It rejects empty or malformed base64 with a typed error, and rejects unknown JSON fields on every tool. Because no tool accepts a filesystem path, there is no way to make the server read an arbitrary file via a client-controlled string. See the forensics and malware-safety posture and the threat model.
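The two rejection rules are easy to mirror on the client side before a request is sent. This is a rough Python sketch of equivalent checks, not the server's code: the `InvalidInput` class, the `decode_arguments` name, and the allowed-field set (taken from the decompile row of the tool catalog) are all hypothetical.

```python
import base64
import binascii


class InvalidInput(ValueError):
    """Hypothetical typed error for rejected tool arguments."""


# Fields the tool catalog lists for the decompile tool.
ALLOWED_FIELDS = {"bytes_b64", "max_depth"}


def decode_arguments(arguments: dict) -> bytes:
    """Validate tool arguments and return the decoded input bytes."""
    unknown = set(arguments) - ALLOWED_FIELDS
    if unknown:
        raise InvalidInput(f"unknown fields: {sorted(unknown)}")
    encoded = arguments.get("bytes_b64")
    if not isinstance(encoded, str) or not encoded:
        raise InvalidInput("bytes_b64 is missing or empty")
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(encoded, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise InvalidInput(f"malformed base64: {exc}") from exc
```

Rejecting unknown fields is what makes a smuggled path argument an error rather than something silently ignored.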
Contributing
Contributions are welcome; see the contributing guide.
Building and testing
disrobe builds with a single stable Rust 1.95+ toolchain:
cargo build --release
cargo test -p <crate> # test a single crate
The JVM decompiler can be memory-intensive on adversarial input. Prefer per-crate test runs over a bare workspace-wide cargo test --workspace when iterating locally.
The quality bar
Every commit on main must pass the workspace clippy gate with zero warnings:
cargo clippy --all-targets -- -D warnings -W unreachable_pub -W missing_debug_implementations -W unused
cargo fmt --all -- --check
The workspace lints are strict by design: unwrap_used is denied, todo! and unimplemented! are denied, and unsafe is forbidden outside the two C-interop crates. New code is fully type-annotated and self-documenting; the codebase carries durable context in dedicated docs rather than inline comments.
README graphs
The dark-theme benchmark and architecture SVGs in the README are generated, not drawn by hand. The data lives in xtask/data/*.json (every plotted value cites its source gate or harness inline), and xtask renders deterministic SVGs into docs/assets/:
cargo run -p xtask -- graphs # regenerate docs/assets/*.svg
cargo run -p xtask -- graphs --check # fail if committed SVGs are stale (CI runs this)
After changing a number in xtask/data/, rerun graphs and commit the regenerated SVGs; the graphs CI job rejects any drift. Numbers come only from a committed test gate or a local measurement harness, never from the tool grading its own output, and no competitor recovery percentage is plotted.
Adding a pass
A new ecosystem pass is a new disrobe-pass-<name> crate that:
- Implements the shared Pass trait, declaring its required and produced capabilities and its rung transition.
- Climbs the five-rung IR ladder rather than jumping rungs.
- Ships a pass_run_envelope_roundtrip test and at least one real-fixture integration test in crates/disrobe-cli/tests/.
- Wires its standardized emits, returning explicit applicable: false stubs for emits it cannot produce.
Every capability claim must be backed by a fixture in corpus/ and a passing test; nothing aspirational ships as done. Fixtures are baked locally from known-good inputs by corpus/generate.{sh,ps1}; copyrighted third-party obfuscated bytecode is never committed to the public corpus.
No fabrication
A decode that only passes against synthetic, self-generated fixtures is not a feature. Per-pass work is verified against a real corpus and the upstream format spec. Partial recovery carries a confidence tier; detect-only is stated as detect-only. If you are not sure a capability works against real-world input, say so in the PR.
Reporting bugs
Generate an environment report to attach to an issue:
disrobe bug-report --out report.md
disrobe bug-report --out - # write to stdout
For security issues, do not open a public issue; use the private advisory channel. See Security.
Security
The full security policy lives in SECURITY.md. This page summarizes it.
Reporting a vulnerability
Do not open a public issue for security reports. Use GitHub's private advisory channel:
→ https://github.com/1-3-7/disrobe/security/advisories/new
Include a description and impact, a minimal reproducer (input bytes, command line, expected vs observed), the disrobe --version output, the OS/arch, and whether you have a candidate fix. Reports are acknowledged within 72 hours; high-severity fixes target 30 days, with same-week turnaround for parsing-of-untrusted-input issues. Reporters are credited (with their preferred handle) in the advisory and release notes; anonymous reports are welcome.
In scope
- Memory safety in the parsing surface. Any panic/abort on adversarial input that is not a clean Result::Err; any heap corruption is high severity.
- Resource exhaustion. Zip-bombs, decompression bombs, recursion bombs, and malformed-length-field bombs: bypasses of the crates/disrobe-binfmt/src/quota.rs quotas.
- Path traversal. zip-slip and equivalents on every container extraction path.
- Server input handling. disrobe serve (HTTP/gRPC/LSP/MCP) accepts bytes_b64 only; any way to make it read a file via a client-controlled string is high severity.
- Subprocess invocation. Command injection or argument smuggling in backend invocation.
- .dr envelope handling. Read-past-end, integer overflow, or BLAKE3-mismatch acceptance.
- Supply chain. Tampering with published binaries, signature bypass, replay, cosign-bundle manipulation.
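To make the path-traversal class concrete, here is a generic containment check of the kind a container extraction path needs. It is not disrobe's code (which is Rust); the function name `safe_member_path` and the exact rejection rules are assumptions, chosen to be conservative.

```python
from pathlib import PurePosixPath


def safe_member_path(member_name: str) -> PurePosixPath:
    """Return a relative path for an archive member, or raise ValueError."""
    # Treat backslashes as separators so Windows-style names cannot slip past.
    path = PurePosixPath(member_name.replace("\\", "/"))
    if not path.parts or path.is_absolute():
        raise ValueError(f"empty or absolute member name: {member_name!r}")
    if any(part == ".." for part in path.parts):
        raise ValueError(f"parent-directory component in: {member_name!r}")
    if ":" in path.parts[0]:
        raise ValueError(f"drive-qualified member name: {member_name!r}")
    return path
```

A report in this category would show a member name that passes the extractor's own check yet still lands outside the output directory.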
Out of scope
- Decompilation output correctness on adversarial input: that is correctness work flagged by the round-trip metric, not a security bug. File a normal issue.
- Compute exhaustion from legitimate input (a slow decompile of a 66 MiB bundle is not a vulnerability).
- Issues in third-party tools disrobe wraps: report to their upstreams.
Hardening posture
#![forbid(unsafe_code)] workspace-wide (except the two pyo3-interop crates). Strict clippy gate on every commit. cargo deny on every push plus weekly; cargo audit weekly. Shared container quota machinery. BLAKE3-pinned fixtures. Loopback-default servers with a warning banner on non-loopback binds. Branch protection on main (1 approval + green CI + linear history + no force-push).
Verifying release artifacts
Release binaries are signed with cosign keyless OIDC (recorded in the Rekor transparency log) and minisign:
cosign verify-blob \
--certificate-identity-regexp '^https://github.com/1-3-7/disrobe/' \
--certificate-oidc-issuer 'https://token.actions.githubusercontent.com' \
--signature disrobe-<version>-<target>.tar.zst.sig \
--bundle disrobe-<version>-<target>.tar.zst.cosign.bundle \
disrobe-<version>-<target>.tar.zst
Legal
Decompilation for security research, interoperability, and recovery of your own source is permitted in most jurisdictions: US DMCA §1201(f), EU Software Directive 2009/24/EC art. 6, UK CDPA §50B/50BA, and equivalents in CA / AU / JP. The full statutory posture, with citations and a takedown channel, is in LEGAL.md.
The authorization gate
Grey-zone commercial protectors (PyArmor paid tier, ionCube, SourceGuardian, Zend Guard, the commercial native-packer tier, and the grey-zone .NET/JVM obfuscators) are gated behind the explicit --i-have-authorization flag and never run otherwise. Passing the flag is your assertion that you are authorized to analyze the input under the statutory framing above. Use is your responsibility.
The same flag unlocks the decryption-keys category of the --llm sidecar; without it, requesting that category fails with DR-CLI-0420.
What disrobe will not do
- It does not ship copyrighted third-party obfuscated bytecode in its public corpus. Fixtures are baked locally from known-good inputs.
- Grey-zone protectors ship recognizers first; escalation to a full peel only happens after a written legal-posture review committed to docs/legal/<protector>-stance.md.
- It does not phone home. The only documented network endpoint is disrobe self-update --check-only, and the binary is distributed source-and-release-only.
License
disrobe is licensed under the Elastic License 2.0. Companies and security researchers may use, copy, modify, and distribute it for free; attribution is required, so keep the author, copyright, and licensing notices intact. You may not provide disrobe to third parties as a hosted or managed service, and you may not remove or obscure any licensing, copyright, or other notices. The "disrobe" name and marks are reserved; the license grants no trademark rights. See LICENSE and NOTICE.