disrobe: a universal decompiler, deobfuscator, and unpacker

disrobe

One tool to decompile, deobfuscate, and unpack almost anything, deterministically, in a single Rust binary.

disrobe is a universal multi-language decompiler and deobfuscator. It decompiles Python .pyc bytecode, unpacks PyArmor and PyInstaller, reads Nuitka-compiled binaries, decompiles WebAssembly, deobfuscates JavaScript, decompiles .NET / CIL and JVM / Java, recovers Android DEX, and unwraps native PE / ELF / Mach-O packers, all from one static binary built for malware analysis and reverse engineering.


Try it in your browser: the disrobe playground. Decompile a .pyc, scan a pickle for malicious reduce callables, and summarize a .wasm module, all client-side, with the core passes compiled to WebAssembly. Nothing is uploaded.

disrobe reverses the bytecode, packers, freezers, and protectors layered onto compiled and frozen software across 20+ ecosystems: Python, JavaScript/TypeScript, WebAssembly, JVM and Android, .NET, native PE/ELF/Mach-O, Go, Lua, PHP, Ruby, Erlang/Elixir (BEAM), Swift/Objective-C, ActionScript 3, React Native Hermes, Flutter Dart AOT, and the native packer tier layered on top of them (UPX, MPRESS, NSPack, FSG, kkrunchy, MEW, ASPack, PECompact, Petite, Yoda's Crypter). It ships as a single static Rust binary.

Built for forensic and recovery work where reproducibility matters:

  • Deterministic. No model anywhere in the decompile path. The same input produces byte-identical output on every machine and every run, usable as evidence and as a diff baseline.
  • Single static binary. No JVM, no Python runtime, no Docker image required to run the core. Builds from one cargo build --release. Drops into CI headlessly.
  • Content-addressed. Every recovered artifact persists as a .dr envelope: an rkyv hot payload plus a postcard cold sidecar, rooted by a BLAKE3 hash. Cache hits are byte-identical and chains compose offline.
  • Honest. Every Python decompile is recompiled on the matching interpreter and compared opcode-for-opcode: 92.76% per-code-object equivalence on the CPython 3.14 stdlib (5831 of 6286), measured against the interpreter, not the tool's own output. Recovery that is not perfect is labelled SEMANTIC, PARTIAL, or SKELETON rather than presented as ground truth. Commercial-tier packers that disrobe cannot fully unpack are reported as detect-only by design, never faked.

Who this is for

  • Malware analysts and incident responders who receive a packed, frozen, or obfuscated sample and need to read what it does, without executing it.
  • Security researchers auditing a closed binary for interoperability or vulnerability research.
  • Developers recovering their own lost source from a shipped .pyc, .jar, .dll, or bundled .js.
  • Coding agents. Every pass can emit a structured metadata sidecar (--llm) carrying the call graph, type signatures, control-flow shape, capability surface, and decompile provenance, so an LLM can reason about recovered code without re-deriving its structure.

What makes it different

disrobe ships passes for every ecosystem above from a single binary. Where mature FOSS already exists (CFR, Vineflower, jadx, ILSpy, JPEXS, unluac, hermes-dec, Ghidra), disrobe wraps it headlessly behind a unified CLI and adds chain auto-detection, deterministic .dr envelopes, and round-trip verification. Where FOSS coverage is thin (PyArmor v9-pro, the native packer tier, Hermes against a live bundle, Flutter Dart AOT, MicroPython .mpy, PEP 750 t-strings), it is among the few tools that handle these targets statically and offline. Where established tools dominate (Ghidra/IDA/Binary Ninja for native decompilation), disrobe is the unpack, symbol-recovery, and chain-detect layer that feeds them cleaner input.

Measured recovery

Every figure below is produced by a committed test gate or a local measurement harness graded against an independent oracle, never the tool's own output. The full per-value sourcing lives in xtask/data/recovery.json.

Measured recovery by ecosystem

Ecosystem | Measured | Oracle
Python bytecode | 92.76% per-code-object equivalence on the CPython 3.14 stdlib (5831 of 6286) | recompile on CPython 3.14.5, opcode diff
CPython legacy 1.0-3.7 | 152 of 191 proven-correct (CI floor); 166 of 191 measured locally | recompile-equivalence or structural token-match
WebAssembly | 100% op-coverage on the 30 parseable corpus modules; 24 of 24 execution-eligible functions equivalent | execution differential under wasmtime
JVM classfile | 93.1% of methods recompile error-free (122 of 131 floor; 128 measured) | real javac
Android (Dalvik) | 99% of verifiable classes pass the JVM verifier (102 of 103) | -Xverify:all over assembled jar
Ruby YARV | greeter 100%, megafile 85% opcode-multiset equivalence | recompile on MRI
PyArmor | 72 of 72 real-corpus samples recovered | plaintext-absent oracle
Containers | 98 formats detected, 98 extracted in-tree | per-format byte length

Results that are not perfect are labelled SEMANTIC, PARTIAL, or SKELETON, and the information-theoretic walls (native-virtualized code, runtime-only keys, RSA-wrapped capsule keys) are reported as detect-only by design.
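The percentages in the table above are plain ratios of the quoted counts. A minimal sketch of that arithmetic follows; `percent` is a hypothetical helper written for this page, not a function from disrobe.

```rust
/// Hypothetical helper (not part of disrobe): share of `passed` out of
/// `total`, as a percentage rounded to two decimal places.
fn percent(passed: u32, total: u32) -> f64 {
    assert!(total > 0, "total must be non-zero");
    let raw = f64::from(passed) / f64::from(total) * 100.0;
    // Round to two decimals: scale up, round to an integer, scale back down.
    (raw * 100.0).round() / 100.0
}
```

With the counts quoted in the table, `percent(5831, 6286)` gives 92.76, `percent(122, 131)` gives 93.13, and `percent(102, 103)` gives 99.03; the table rounds the last two more coarsely.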


Installation

disrobe is distributed two ways: as prebuilt binaries on the GitHub Releases page, and as source that builds with a single Rust toolchain. There is intentionally no PyPI/npm/Homebrew/crates.io/Docker channel for the binary itself; GitHub Releases is the canonical distribution point.

Each tagged release attaches prebuilt, statically linked binaries for the common targets, alongside SHA256SUMS, a cosign keyless signature, and a minisign signature.

OS | Architectures
Windows 10/11 | x86-64, ARM64
Linux (glibc + musl) | x86-64, ARM64
macOS 13+ | x86-64, ARM64 (Apple Silicon)

  1. Download the archive for your platform from the Releases page.

  2. Verify the checksum:

    sha256sum -c SHA256SUMS        # Linux / macOS
    
  3. (Optional) verify the cosign signature against the Sigstore transparency log:

    cosign verify-blob \
      --certificate-identity-regexp '^https://github.com/1-3-7/disrobe/' \
      --certificate-oidc-issuer 'https://token.actions.githubusercontent.com' \
      --signature disrobe-<version>-<target>.tar.zst.sig \
      --bundle    disrobe-<version>-<target>.tar.zst.cosign.bundle \
      disrobe-<version>-<target>.tar.zst
    
  4. Extract and place disrobe (disrobe.exe on Windows) anywhere on your PATH.

Build from source

Building requires Rust 1.95 or newer (stable). That is the only build dependency for the core; the optional external backends are fetched separately (see below).

git clone https://github.com/1-3-7/disrobe
cd disrobe
cargo build --release
./target/release/disrobe --version

A release build takes roughly four to six minutes on commodity hardware. The binary lands at target/release/disrobe; copy it onto your PATH.

Verifying the install

disrobe --version          # print the version
disrobe passes             # list every registered pass with a one-line summary
disrobe --help             # full subcommand surface
disrobe <pass> --help      # drill into any pass, e.g. `disrobe py --help`

Optional external backends

disrobe's in-house passes run with zero external dependencies. A subset of capabilities, however, wrap mature external tools headlessly: Ghidra for native decompilation; CFR / Vineflower / Procyon / jadx for the JVM and Android; ILSpy / dnSpy / de4dot for .NET; Rizin and friends elsewhere. These are never the product for bytecode languages (disrobe ships its own in-house decompilers there) and are always optional.

Probe what is installed and what is missing:

disrobe doctor                 # probe ~50 optional external tools
disrobe doctor --auto-install  # install every missing tool with a known action

Install a single tool through your platform's native package manager (winget / brew / apt / dnf / pacman / apk). disrobe never installs itself this way; it only fetches the optional backends:

disrobe install --list         # list every known tool + per-platform package name
disrobe install ghidra
disrobe install upx

Heavyweight dependencies that ship as upstream release archives rather than OS packages (Ghidra, for instance) have a dedicated installer:

disrobe install-deps ghidra
disrobe install-deps --all

Shell completions and man pages

disrobe completions bash --install        # also: zsh, fish, powershell, elvish
disrobe man --out ./man                   # one .1 page per subcommand

Quickstart

The fastest path is disrobe auto: hand it a file and it fingerprints the input, then chains the right passes end to end.

Auto-detect and chain

disrobe auto suspect.exe --out recovered/
# detected: PE -> UPX -> rust-demangle
# stage 01-upx        ok    (.text byte-identical, 1.18 MiB in 9 ms)
# stage 02-demangle   ok    (4172 Rust symbols, 312 C++ symbols, 0 unresolved)
# final               ok    -> recovered/final/
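Fingerprinting usually starts from a file's leading magic bytes. The sketch below shows that general idea for a few formats named in this document. The `Format` enum and the `sniff` function are hypothetical and do not describe disrobe's actual detector, which also has to look inside the outer container to find layered packers.

```rust
/// Hypothetical container classes, for illustration only.
#[derive(Debug, PartialEq, Eq)]
enum Format {
    Pe,
    Elf,
    Wasm,
    Zip,
    Unknown,
}

/// Classify a buffer by its leading magic bytes.
fn sniff(bytes: &[u8]) -> Format {
    if bytes.starts_with(b"MZ") {
        Format::Pe // DOS/PE executables begin with "MZ"
    } else if bytes.starts_with(b"\x7fELF") {
        Format::Elf
    } else if bytes.starts_with(b"\0asm") {
        Format::Wasm
    } else if bytes.starts_with(b"PK\x03\x04") {
        Format::Zip // also the outer layer of .apk and .jar files
    } else {
        Format::Unknown
    }
}
```

A magic-byte match only identifies the outermost layer; recognizing a chain such as PE -> UPX needs further inspection of the matched container.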

disrobe auto understands chains such as:

  • PE -> UPX -> rust-demangle -> symbol recovery
  • PyInstaller -> PyArmor -> .pyc decompile
  • APK -> dex -> JADX + Smali + manifest
  • Electron .asar -> webcrack -> source

Use --capture-stages to mirror every executed pass's byte-exact output under <out>/NN-<pass>/ and link the terminal stage(s) under <out>/final/. Cap the chain depth with --max-depth (default 8).
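The `<out>/NN-<pass>/` layout and the depth cap can be pictured with a short sketch. `stage_dir` and `plan_stages` are hypothetical helpers, and rejecting an over-long chain with an error is an assumption; the text does not say what disrobe does when the cap is reached.

```rust
use std::path::{Path, PathBuf};

/// Depth cap quoted in the text as the --max-depth default.
const DEFAULT_MAX_DEPTH: usize = 8;

/// Hypothetical helper: mirror directory for the `index`-th executed pass
/// (1-based), following the `<out>/NN-<pass>/` layout described above.
fn stage_dir(out: &Path, index: usize, pass: &str) -> PathBuf {
    out.join(format!("{index:02}-{pass}"))
}

/// Hypothetical helper: plan mirror directories for a chain.
/// ASSUMPTION: a chain longer than `max_depth` is rejected with an error.
fn plan_stages(out: &Path, passes: &[&str], max_depth: usize) -> Result<Vec<PathBuf>, String> {
    if passes.len() > max_depth {
        return Err(format!("chain depth {} exceeds cap {}", passes.len(), max_depth));
    }
    Ok(passes
        .iter()
        .enumerate()
        .map(|(i, pass)| stage_dir(out, i + 1, pass))
        .collect())
}
```

For the chain in the example output, this yields `recovered/01-upx` and `recovered/02-demangle`, matching the stage names printed there.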

Per-language one-liners

Every one of these is real and backed by an in-tree fixture and integration test:

# Python
disrobe py decompile module.pyc --out recovered/
disrobe py disasm module.pyc --out trace.txt
disrobe py deob obfuscated.py --out clean.py --cleanup
disrobe pyinstaller extract onefile.exe --out out/
disrobe pyarmor unpack protected.py --out out/             # add --allow-dynamic only on trusted samples
disrobe nuitka extract app.exe --out out/

# JavaScript / TypeScript / WebAssembly
disrobe js deob bundle.min.js --out clean.js
disrobe js unbundle app.bundle.js --out src/
disrobe wasm decompile module.wasm --target rust --out lifted.rs

# JVM / Android / .NET
disrobe jvm decompile app.apk --backend jadx --out src/
disrobe dotnet decompile App.dll --backend ilspy --out src/

# Native
disrobe native unpack packed.exe --out unpacked.bin
disrobe native symbols app.exe --out symbols.json
disrobe go recover app --out symbols.json

# Mobile / Lua / others
disrobe hermes decompile index.android.bundle --out surface/
disrobe flutter dump libapp.so --out layout.json
disrobe lua decompile script.luac --out script.lua
disrobe ruby decompile app.rb
disrobe php decode payload.php --out clean.php
disrobe beam parse module.beam

Structured output

Every command accepts the global --json, --ndjson, or --sarif flags for machine-readable output. SARIF 2.1.0 drops straight into GitHub code scanning:

disrobe scan firmware.bin --sarif > findings.sarif

Inspecting a run

After any chain or pass, inspect what landed:

disrobe status                    # per-stage artifact counts, sizes, manifests in ./out/
disrobe context --out recovered/  # per-pass status, confidence tiers, verdict, provenance
disrobe envelope inspect out/final/module.dr
disrobe verify out/final/module.dr

Generating an LLM sidecar

Add --llm to any pass to emit a schema-conforming metadata bundle next to the recovered artifact, ready for a coding agent to consume:

disrobe py decompile module.pyc --out recovered/ --llm --llm-briefs

See LLM sidecar and provenance for the full category and pack model.

Comparison

This page shows how disrobe sits next to the established tools for each ecosystem. The aim is not to win every cell. For several targets a mature dedicated tool already exists and is the better choice for that one format; disrobe's value is doing all of them from one binary, behind a deterministic pipeline that records what it recovered and what it could not.

Recovery is always measured against an independent reference, never the tool's own output, and lossy results are reported as measured. The limits are listed at the bottom of this page and in Limits.

Source recovery (bytecode to source)

Ecosystem | Established tools | Where disrobe differs
Python | pycdc, pylingual, uncompyle6, decompyle3 | One engine spans 3.6-3.15; each construct is recompiled and diffed opcode-for-opcode, reaching 92.76% per-code-object equivalence on the CPython 3.14 stdlib (5831 of 6286). uncompyle6 stops near 3.8 and decompyle3 near 3.9; the ML decompilers are non-deterministic and carry benchmark-contamination risk.
JVM / Kotlin / Scala | CFR, Vineflower, Procyon, Fernflower | In-house Rust decompiler is the default; on the EdgeCases corpus at least 93.1% of decompiled methods recompile error-free under real javac (asserted floor 122 of 131, 128 measured on JDK 25). CFR, Vineflower, Procyon, and jadx remain available as --backend. Adds ProGuard/R8 mapping replay and obfuscator peel in the same pass.
Android (DEX) | jadx, dex2jar + a Java decompiler | Direct DEX-to-Java without the dex2jar hop; APK signature v1-v3 verification and RASP detection in one binary. Recovered bodies are graded by the real JVM verifier (-Xverify:all) on the committed dex corpus, where 99% of verifiable classes pass (102 of 103). On the gitignored real FOSS apks the lifter self-reports a body for 89% to 92.5% of methods (transmissionic 92.5%, enrecipes 90.7%, rustdesk 89.0%), not verifier-attested; a live-range-splitting pass recovers bodies whose registers carry conflicting JVM types across control-flow joins (307 re-hosted bodies verify clean), leaving only the one class whose Kotlin supertype is unbundled in the corpus.
.NET / CIL | ILSpy, dnSpy, de4dot | In-house CIL to C#/F#/VB plus an actively maintained obfuscator reverser; de4dot has been unmaintained since 2020. ILSpy, dnSpy, and de4dot stay available as --backend.
Lua | unluac, luadec | Covers 5.1-5.4, LuaJIT, full Luau (all 82 opcodes, child-proto linking), and GLua in one decoder, plus 11 obfuscator reversers. For real IronBrew2 2.7.0 register-VM output, disrobe reconstructs the opcode permutation and xor key from the bootstrap, decodes the constant pool, and lifts the VM bytecode back to runnable Lua in standard and MAX mode, validated by a real-Lua execution differential against the original on a committed sample corpus (the execution differentials CI job). MoonSec-shape recovery runs against a synthetic bootstrap of our own design pending a real sample. Runtime-key and anti-tamper variants return unrecovered. unluac is the maturity benchmark for stock luac.
Ruby | none (no FOSS YARV/mruby decompiler) | Decompiles MRI/YARV 2.6-3.4 and mruby, with a recompile-equivalence oracle reaching 100% opcode-equivalence on a greeter fixture and 85% on a mixed-construct megafile (about 90% recoverable, the rest an oracle comment-stripping artifact). No comparable open-source Ruby decompiler exists.
BEAM (Erlang / Elixir) | beam_disasm (disassembly only) | Lifts BEAM chunks to Core Erlang and recovers Elixir from the Dbgi chunk. The standard tooling disassembles but does not reconstruct source.
PHP | none for modern bytecode (ionCube/SourceGuardian are commercial) | Detects ionCube/SourceGuardian/Zend Guard and walls them (the decrypt key is native-loader-resident, so no static opcode recovery), with a partial op_array skeleton for the legacy statically-keyed cases; full Phar archive extraction. No maintained FOSS PHP-bytecode decompiler exists.
WebAssembly | wasm-decompile, wasm2c, wasm-tools | Lifts to typed Rust, TypeScript, WAT, or C with DWARF recovery and four obfuscator reversers (Jscrambler-WASM, Wobfuscator, Tigress-via-Emscripten, Wasmixer), rather than emitting a single fixed target. A fifth family, wasm-name-obfuscator, is detected and classified but its hex renames are not reversible.
JavaScript / TypeScript | webcrack, synchrony, REstringer; for V8 .jsc, View8, ghidra_nodejs, jscdecompiler.com | obfuscator.io (full pipeline), JS-Confuser, Jscrambler, esoteric encoders, V8/Bytenode, and 11 bundlers with scope-aware renaming and source-map reconstruction. For .jsc, the self-contained static option: View8 needs a patched V8 binary, ghidra_nodejs needs Ghidra, jscdecompiler.com is an online service; disrobe recovers the user-string layer plus structure offline with Node 18-24 detection. Internalized identifiers (console, log) live in V8's read-only snapshot heap, absent from the .jsc, an information-theoretic boundary the binary-dependent tools cross and disrobe does not.
ActionScript 3 / Flash | JPEXS Free Flash Decompiler (FFDec) | FFDec is the mature, full-graph SWF and AS3 decompiler and remains the better tool for deep Flash work. disrobe parses SWF and disassembles ABC bytecode as part of the unified chain; this path is local-corpus only and not CI-validated.

Unpacking and extraction (byte-exact where possible)

Ecosystem | Established tools | Where disrobe differs
Python freezers | pyinstxtractor (PyInstaller only), per-freezer scripts | PyInstaller, cx_Freeze, py2exe, PyOxidizer, shiv, pex, Briefcase, and Nuitka unpacked by one tool, then handed to the decompiler. Nuitka onefile/standalone is byte-exact; its native bodies are lossy.
Python protectors | Pyarmor-Static-Unpack-1shot, GDATA Pyarmor-Tooling (PyArmor v8-9.x) | PyArmor v6-v9-pro runtime-key extraction and AES decrypt (72 of 72 real-corpus samples) paired with an in-tree decompiler for end-to-end source, plus SourceDefender .pye decryption; the v3-v5 RSA-wrapped-key tier is an information-theoretic wall.
Native packers | unipacker; UPX (unpacks UPX only); per-packer one-off scripts | General-purpose unpacker for the tier in one static Rust binary (no Python or unicorn): UPX, kkrunchy, NSPack, Petite, MPRESS, MEW, ASPack, and PECompact are decompressed via our own decoders and an in-house x86 stub emulator, scored against real committed originals. NSPack reaches ~99% of the content section. ASPack and PECompact rebuild the decompressed section image at its load RVA on committed samples: the packed .text of near-random entropy and zero resolvable calls drops to ~6.2-6.5 with dozens to hundreds of disassembler-resolvable intra-code calls (no disk-aligned byte-identity reference for the overlay). MEW rebuilds a flat image of the committed Sysinternals samples, read as the entropy drop to ~4.2-4.9 and tens of thousands of decoded instructions. FSG decodes via the same aPLib-clone path but ships no committed fixture, so no number reproduces from a checkout. Yoda's Crypter recovers .rsrc byte-identical and decrypts .text to full plaintext through the stub emulator; Yoda's Protector and the virtualized tier (VMProtect, Themida, Enigma) are detect + carve (runtime-only keys).
React Native Hermes | hermes-dec, hbctool | Bytecode v60-v96. A structured lifter, not a full decompiler: variadic call args are marked <arg?> where the Hermes frame-register layout is not modeled, and no-op opcodes count as fallback. On a 66 MiB production bundle, disrobe parsed the 122,633-function table with no parse failure (header scale, not body lifts, measured locally, not a CI-gated correctness percentage; the committed hello fixture is the CI-gated case).
Containers and archives | unsquashfs, 7-Zip, binwalk, asar, format-specific CLIs | Detects 98 container/archive/filesystem/firmware formats and writes member bytes in-tree for all 98 (ZIP, tar, 7z, .deb, .rpm, MSI, NSIS, Docker/OCI; bare gz/bz2/zst/lzma/lzip/lz4-frame/zlib/.Z streams; legacy ar/arj/arc/lzh/lzop/uzip/xalz/par2/ELF-overlay; embedded-linux squashfs/cramfs/ext4/romfs/minixfs/jffs2/ubifs/yaffs/erofs/ntfs/android-sparse/btrfs-send; GPT/MBR/VHD/VHDX/WIM disk images walked through FAT; vendor firmware via D-Link AES, EnGenius XOR, Autel table, QNAP PC1 and CRC-verified Netgear/Xiaomi/Tesla carves) with a recursive carve-everything engine and universal zip-slip and decompression-bomb guards; WIM XPRESS/LZX/LZMS chunk payloads decompress in-tree, InnoSetup writes the decoded setup-data block stream and InstallShield its stored and zlib members. Member-byte decoding is byte-exact for the documented codecs; ARJ method 4, ARC methods 5-7, EROFS microlzma and the compact index, StuffIt compressed forks, and OTP-AES airoha firmware are carved or reported rather than decoded. binwalk surveys firmware but does not decrypt the vendor schemes or walk these filesystems to byte-exact files.

Native and AOT-compiled (symbols and demangling, not source)

These targets compile to machine code, so function bodies are not recoverable as source. The deliverable is unpacking, symbol and metadata recovery, and demangling that gives a disassembler cleaner input.

Ecosystem | Established tools | Where disrobe differs
Native (PE/ELF/Mach-O) | Ghidra, IDA, Binary Ninja | Not a competitor on raw decompilation. The unpack, symbol-recovery, and chain-detect layer that feeds those tools cleaner input. DWARF/PDB/STABS across x86, ARM, RISC-V, MIPS, PowerPC, SPARC, and eBPF.
Go | GoReSym, redress | Symbol recovery plus garble undo and embedded-FS walking; pclntab eras go1.2 through go1.26, with type-name resolution gated above 85%.
Swift / Objective-C | class-dump, swift-demangle | Mach-O class-dump with SwiftConfidential and SwiftShield rename-undo in one pass. Bodies stay native; the output is the interface and restored names.
Flutter (Dart) | blutter, doldrums | Two paths: a Dart kernel (.dill) yields byte-exact source bodies from the kernel source table, while an ARM64 AOT libapp.so erases bodies and recovers class and method names and library URIs plus disassembly. Both are CI-gated on a committed sample.
Nim / Zig / Crystal | binary disassemblers, language demanglers | Detect, demangle, and recover symbols and metadata from each binary's own symtab and type table. Source is not recoverable.

Where disrobe is not the right tool

  • Deep, interactive Flash reversing: use FFDec.
  • Full native decompilation to C: use Ghidra, IDA, or Binary Ninja. disrobe prepares input for them; it does not replace them.
  • Devirtualizing a commercial protector against a running licensed sample (VMProtect, Themida, Enigma): disrobe detects the protector and carves intact sections. Its generic devirtualizer (disrobe native devirt) recovers the interpreter, fingerprints each handler behaviorally, and lifts the handler bytecode to a re-executable IR plus pseudo-code, validated end-to-end on a self-authored Tigress-shape VM; the commercial front-ends are extended from published RE write-ups, not a running commercial sample, and a handler stream assembled at run time from a per-machine key is the wall.
  • Any AOT-compiled language (Nim, Zig, Crystal, Flutter Dart, Swift bodies): source bodies are gone at compile time; demangling and symbol recovery are the ceiling, the same ceiling every tool faces.

Architecture overview

disrobe is a workspace of focused Rust crates orchestrated by one CLI. Every recovery flows through the same shape: bytes in, a chain of passes that each transform the artifact up the IR ladder, an optional .dr envelope for content-addressed caching, and a final emit.

The shape of a recovery

                   ┌──────────────────────────────────────────────┐
   input bytes ──> │  detect  ──>  pass 1  ──>  pass 2  ──>  ...  │ ──> recovered artifact
                   └──────────────────────────────────────────────┘
                          │           │            │
                          │           │            └─ each pass: raw -> disasm -> mir -> hir -> surface
                          │           └─ capability resolver gates what can run next
                          └─ chain runner records chain.json + stage mirrors + recovery.json

Crate map

The workspace splits into a small set of shared cores and one crate per ecosystem pass:

Crate | Role
disrobe-core | Shared types: error codes (DR-<DOMAIN>-<NNNN>), progress, confidence tiers, secret/credential scanning, cyclomatic metrics.
disrobe-ir | The five-rung IR ladder, the .dr envelope (rkyv hot + postcard cold + BLAKE3 root), transcoders, capability descriptors.
disrobe-binfmt | Container, archive, filesystem, and firmware layer (98 formats detected, 98 extracted in-tree, plus a recursive carve-everything engine) with shared zip-slip and decompression-bomb quota machinery.
disrobe-llm-metadata | The --llm sidecar: 18 categories, 4 packs, AGENTS.md / SKILL.md brief generation.
disrobe-mcp | The rmcp Model Context Protocol companion wired to disrobe serve --mcp.
disrobe-py-marshal | CPython marshal reader: code objects across 1.0-3.15.
disrobe-pass-* | One crate per ecosystem (py-decompile, py-disasm, py-deob, pyarmor, pyinstaller, pyfreeze, nuitka, js-deob, wasm-deob, jvm, dotnet, native, go, lua, php, ruby, beam, pickle, swift-objc, as3, mobile, sourcedefender). The native pass adds the iced-backed disassembler, symbol-independent function discovery, call graph and basic-block CFG, instruction re-encode/relocate, C++ RTTI/vtable recovery, and emulation-driven string recovery.
disrobe-query | Queryable-IR layer over the disassembled native code: functions, calls-to, xrefs, string-decoders, complexity, capability sites, behind disrobe query.
disrobe-capabilities | Capability rule engine over the queryable IR, mapping matched behaviors to MITRE ATT&CK and MBC, behind disrobe capabilities.
disrobe-cli | The disrobe binary: argument parsing, output formats, the chain runner, the daemon.
disrobe-validator | Walks a corpus and validates every fixture round-trips, used in CI.

The Pass trait

Every pass implements one trait. A pass takes a .dr envelope at some rung, does its work, and returns an envelope one or more rungs higher, declaring which capabilities it requires on the way in and which it produces on the way out. Because every pass speaks the same envelope dialect, the chain runner can compose any pass with any other as long as the capability resolver is satisfied. This is what lets PyInstaller -> PyArmor -> .pyc decompile work as a single disrobe auto invocation rather than three hand-wired steps.
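One way to picture that contract is the sketch below. The trait name, its methods, the `Envelope` fields, and the string capability tags are all hypothetical; the real trait lives in the disrobe workspace and is not reproduced here.

```rust
use std::collections::BTreeSet;

/// The five rungs named in this document, in climbing order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Rung { Raw, Disasm, Mir, Hir, Surface }

/// Hypothetical stand-in for a .dr envelope: a rung, capability tags, payload.
#[derive(Debug, Clone)]
struct Envelope {
    rung: Rung,
    capabilities: BTreeSet<&'static str>,
    payload: Vec<u8>,
}

/// Hypothetical shape of the shared trait (an assumption, not disrobe's API).
trait Pass {
    fn name(&self) -> &'static str;
    fn requires(&self) -> BTreeSet<&'static str>;
    fn produces(&self) -> BTreeSet<&'static str>;
    fn run(&self, input: Envelope) -> Result<Envelope, String>;
}

/// Compose passes in order, checking each pass's requirements before it runs
/// and recording what it produced afterwards.
fn run_chain(passes: &[&dyn Pass], mut env: Envelope) -> Result<Envelope, String> {
    for pass in passes {
        if !pass.requires().is_subset(&env.capabilities) {
            return Err(format!("{}: requirements not satisfied", pass.name()));
        }
        env = pass.run(env)?;
        env.capabilities.extend(pass.produces());
    }
    Ok(env)
}
```

Because `run_chain` only looks at the declared capability sets, it needs no knowledge of what any individual pass does, which is the property the paragraph above describes.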

Each pass also exposes a standardized set of emits (source, disasm, ast, cfg, ir, manifest, sourcemap, symbols, strings, imports, signatures, report). A pass that cannot produce a given emit writes an explicit applicable: false stub with the DR-IR-NotApplicable code rather than silently dropping it.
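The "explicit stub rather than silent drop" rule can be sketched as follows. The `Emit` enum and `collect_emits` function are hypothetical; only the twelve emit names and the DR-IR-NotApplicable code come from the text above.

```rust
/// The standardized emit kinds listed above.
const EMIT_KINDS: [&str; 12] = [
    "source", "disasm", "ast", "cfg", "ir", "manifest",
    "sourcemap", "symbols", "strings", "imports", "signatures", "report",
];

/// Hypothetical result of asking a pass for one emit.
#[derive(Debug, PartialEq, Eq)]
enum Emit {
    Produced(String),
    /// Explicit stub: `applicable: false` plus the code quoted in the text.
    NotApplicable { code: &'static str },
}

/// Fill in every emit kind, writing a stub where the pass has nothing to
/// offer, so that a consumer never has to guess whether an emit was skipped.
fn collect_emits(produce: impl Fn(&str) -> Option<String>) -> Vec<(&'static str, Emit)> {
    EMIT_KINDS
        .iter()
        .map(|&kind| {
            let emit = match produce(kind) {
                Some(body) => Emit::Produced(body),
                None => Emit::NotApplicable { code: "DR-IR-NotApplicable" },
            };
            (kind, emit)
        })
        .collect()
}
```

The output always has one entry per emit kind, so a missing file can be told apart from an emit that does not apply.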

The four pillars

The rest of this section expands each pillar:

  1. The five-rung IR ladder: the common intermediate representation every artifact climbs.
  2. Passes and the capability model: how passes declare and resolve what they need.
  3. The chain runner: auto-detection, stage mirrors, depth and cycle caps.
  4. The .dr envelope: the content-addressed wire format that makes caching deterministic.

Determinism is a design constraint, not a feature

The entire architecture exists to make output reproducible. There is no model in the decompile path. RNG-backed backends take an explicit --seed. Timing tokens are scrubbed from golden outputs so that two runs hash identically. The .dr envelope is content-addressed (BLAKE3) rather than timestamp-addressed, so a cache hit is provably the same bytes. This is what makes disrobe output usable as a forensic baseline and as a disrobe diff input across versions.
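A minimal sketch of content addressing, assuming a cache keyed only by a hash of the input bytes. To stay dependency-free it uses FNV-1a as a stand-in; the document states that disrobe uses BLAKE3, which is cryptographic, and FNV-1a is not. The `Cache` type and its method are hypothetical.

```rust
use std::collections::HashMap;

/// FNV-1a, used here ONLY as a dependency-free stand-in for BLAKE3.
/// It is not collision-resistant and must not be used for evidence.
fn stand_in_hash(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a 64-bit offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV 64-bit prime
    }
    hash
}

/// Hypothetical content-addressed cache: the key is derived from the input
/// bytes alone, never from a timestamp or a path.
#[derive(Default)]
struct Cache {
    entries: HashMap<u64, Vec<u8>>,
}

impl Cache {
    /// Return the cached output for `input`, computing it once on a miss.
    fn get_or_compute(&mut self, input: &[u8], compute: impl FnOnce(&[u8]) -> Vec<u8>) -> &[u8] {
        self.entries
            .entry(stand_in_hash(input))
            .or_insert_with(|| compute(input))
    }
}
```

Two runs over the same bytes map to the same key, so the second run returns the stored output without recomputing it. This only holds if `compute` is itself deterministic, which is why the seed and timing rules above matter.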

The five-rung IR ladder

Every artifact disrobe recovers climbs the same five-rung intermediate-representation ladder. A pass never jumps arbitrarily; it lifts an artifact from one rung to the next, which is what lets passes from completely different ecosystems compose through a shared envelope.

   Raw  ──>  Disasm  ──>  MIR  ──>  HIR  ──>  Surface
   bytes     opcodes      mid       high      source

The rungs are defined once in disrobe-ir as Rung::{Raw, Disasm, Mir, Hir, Surface} and are the same for Python bytecode, JVM classfiles, .NET CIL, Lua chunks, WebAssembly, and native binaries.
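As an illustration, the ladder can be modelled as an ordered enum. The variant names come from the text; the derives and the `next` and `remaining` helpers are assumptions added for this sketch and may not match the definition in disrobe-ir.

```rust
/// The five rungs as named in the text; derive order follows the ladder.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Rung {
    Raw,
    Disasm,
    Mir,
    Hir,
    Surface,
}

impl Rung {
    /// The rung directly above, or `None` at the top of the ladder.
    fn next(self) -> Option<Rung> {
        match self {
            Rung::Raw => Some(Rung::Disasm),
            Rung::Disasm => Some(Rung::Mir),
            Rung::Mir => Some(Rung::Hir),
            Rung::Hir => Some(Rung::Surface),
            Rung::Surface => None,
        }
    }

    /// Rungs still to climb from `self` up to `Surface`.
    fn remaining(self) -> Vec<Rung> {
        std::iter::successors(self.next(), |r| r.next()).collect()
    }
}
```

Deriving `Ord` in declaration order makes comparisons such as `Rung::Raw < Rung::Surface` follow the ladder directly.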

Rung 1: Raw

The original bytes, exactly as received, wrapped in an envelope with its detected format and BLAKE3 root recorded. Unpacking and decryption passes (UPX unpack, PyArmor decrypt, PyInstaller extract, container extraction) operate at this rung: they take raw bytes and emit raw bytes that are structurally cleaner but still pre-disassembly. This is where byte-exact recovery lives: a UPX-unpacked image at the Raw rung carries the original executable code (.text) and exception data (.pdata) bit-for-bit, with only the loader-rebuilt relocations and IAT (resolved by the OS loader at run time, never stored in the packed file) left as residual.

Rung 2: Disasm

The artifact decoded into a per-instruction listing: CPython opcodes, JVM bytecode, CIL, Lua VM instructions, Wasm operators, or native machine instructions via Capstone/iced-x86/yaxpeax. Disassembly is faithful and lossless: a 1:1 decode of the bytes, with offsets preserved. disrobe py disasm, disrobe beam disasm, disrobe as3 disasm, and disrobe pickle disasm all stop here deliberately.

Rung 3: MIR (mid-level IR)

A normalized, control-flow-aware representation: basic blocks, an explicit CFG, stack effects resolved, super-instructions un-fused, jump targets relocated. This is the rung where decompilers do the structural work that separates a faithful disassembly from readable source. For Python specifically, the frame-tree pre-pass runs here: the nested source-construct tree is reconstructed from the 3.11+ exception table before the instruction walk, which eliminates the single-pass stack-walker desync that other Python decompilers suffer from.
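The basic-block step can be illustrated with the textbook "leader" algorithm: the first instruction, every jump target, and every instruction after a jump or return starts a new block. This is a generic sketch over a hypothetical, simplified `Instr` type. It is not disrobe's MIR builder and does not model the exception-table-driven frame-tree pre-pass.

```rust
use std::collections::BTreeSet;

/// Hypothetical, heavily simplified instruction set.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Instr {
    Op,            // any straight-line instruction
    Jump(usize),   // unconditional jump to an instruction index
    Branch(usize), // conditional jump; may also fall through
    Return,
}

/// Split a linear listing into basic blocks, as half-open index ranges.
fn basic_blocks(code: &[Instr]) -> Vec<(usize, usize)> {
    let mut leaders = BTreeSet::new();
    if !code.is_empty() {
        leaders.insert(0); // the first instruction always starts a block
    }
    for (i, instr) in code.iter().enumerate() {
        match *instr {
            Instr::Jump(target) | Instr::Branch(target) => {
                leaders.insert(target); // a jump target starts a block
                leaders.insert(i + 1);  // so does the instruction after a jump
            }
            Instr::Return => {
                leaders.insert(i + 1);
            }
            Instr::Op => {}
        }
    }
    // Drop leaders that point past the end of the listing.
    let starts: Vec<usize> = leaders.into_iter().filter(|&l| l < code.len()).collect();
    starts
        .iter()
        .enumerate()
        .map(|(n, &start)| (start, starts.get(n + 1).copied().unwrap_or(code.len())))
        .collect()
}
```

For the listing `[Op, Branch(3), Op, Op, Return]` this returns the blocks `(0, 2)`, `(2, 3)`, and `(3, 5)`; CFG edges would then connect each block to its jump target and fall-through successor.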

Rung 4: HIR (high-level IR)

Structured control flow recovered: loops, conditionals, try/except/finally, with-blocks, comprehensions, pattern-match arms, generator state machines. Names, types, and idioms are recovered where the source language has them (Kotlin idiom recovery from JVM bytecode, C# vs F# vs VB shapes from CIL). The HIR is language-shaped but not yet rendered as text.

Rung 5: Surface

The final rendered source: Python, Java, Kotlin, C#, F#, VB, Rust, TypeScript, WAT, C, Lua, Erlang, or whatever the target dictates. For Python this is where the round-trip metric runs: the emitted file is recompiled on the matching interpreter and compared opcode-for-opcode against the original. The verdict is recorded as PERFECT (byte-identical), SEMANTIC (same program, different layout), or CODE_DIFF (a real bug, fixed before release).
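The three verdicts can be sketched as a small classifier. The verdict names come from the text; everything else is an assumption. In particular, "same program, different layout" is approximated here as "identical opcode sequence once layout-only details are stripped", which may be looser or stricter than the real comparison.

```rust
/// Verdict names as quoted in the text.
#[derive(Debug, PartialEq, Eq)]
enum Verdict {
    Perfect,
    Semantic,
    CodeDiff,
}

/// Hypothetical view of one compiled code object.
struct CodeObject<'a> {
    raw: &'a [u8],         // the bytes as compiled
    opcodes: Vec<&'a str>, // opcode names with layout-only details stripped
}

/// ASSUMPTION: SEMANTIC means the normalized opcode sequences match even
/// though the raw bytes differ; anything else is a real difference.
fn classify(original: &CodeObject<'_>, recompiled: &CodeObject<'_>) -> Verdict {
    if original.raw == recompiled.raw {
        Verdict::Perfect
    } else if original.opcodes == recompiled.opcodes {
        Verdict::Semantic
    } else {
        Verdict::CodeDiff
    }
}
```

The byte comparison runs first so that the strictest verdict wins whenever it applies.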

Why the ladder matters

  • Composition. Because a pass only declares "I take rung N, I produce rung N+1," the chain runner can stitch passes from different crates without any of them knowing about each other.
  • Honest partial recovery. If a decompiler can climb to HIR but not cleanly render Surface, it can stop and emit the HIR/Disasm artifact with a PARTIAL or SKELETON confidence tier rather than fabricating source.
  • Transcoding. disrobe-ir carries a transcode registry keyed on (from_version, from_rung, to_version, to_rung), so an envelope can be migrated across schema versions while staying at the same rung. disrobe envelope migrate-check validates that such a path exists and that every required capability stays satisfiable.
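The transcode registry described in the last bullet can be pictured as a map keyed on the four-tuple. The key shape comes from the text; the `u32` version type, the function-pointer transcoder, and the method names are assumptions made for this sketch.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Rung { Raw, Disasm, Mir, Hir, Surface }

/// Key shape quoted in the text: (from_version, from_rung, to_version, to_rung).
/// ASSUMPTION: schema versions are plain integers.
type Key = (u32, Rung, u32, Rung);

/// Hypothetical transcoder: bytes in one schema version to bytes in another.
type Transcoder = fn(&[u8]) -> Vec<u8>;

#[derive(Default)]
struct TranscodeRegistry {
    routes: HashMap<Key, Transcoder>,
}

impl TranscodeRegistry {
    fn register(&mut self, key: Key, transcoder: Transcoder) {
        self.routes.insert(key, transcoder);
    }

    /// Roughly the question a migrate-check has to answer: is there a direct
    /// route between two schema versions that keeps the envelope on one rung?
    fn has_same_rung_route(&self, from: u32, to: u32, rung: Rung) -> bool {
        self.routes.contains_key(&(from, rung, to, rung))
    }
}
```

This sketch only checks direct routes; a fuller check would also search multi-hop paths and confirm that required capabilities stay satisfiable, as the bullet describes.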

Confidence tiers

Surface output is tagged with one of four tiers, defined in disrobe-core:

Tier | Meaning
exact | Reserved for byte-roundtrip-verified output.
semantic | Same program, different but equivalent layout.
partial | Some bodies recovered, some left as disasm or stubs.
skeleton | Structure recovered, bodies emitted as pass/placeholder.

These tiers propagate into the recovery.json sidecar and the --llm bundle, so a downstream consumer always knows how much to trust each recovered symbol.
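A downstream consumer might model the tiers as an ordered enum, as in the sketch below. The four names come from the table; the trust ordering (skeleton lowest, exact highest) and the "weakest symbol" policy are assumptions of this example, not behaviour documented for disrobe.

```rust
/// The four tiers from the table.
/// ASSUMPTION: declared from least to most trusted so that `Ord` reflects trust.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Tier {
    Skeleton,
    Partial,
    Semantic,
    Exact,
}

impl Tier {
    /// Parse a tier name, accepting either lowercase or uppercase spelling.
    fn parse(name: &str) -> Option<Tier> {
        match name.to_ascii_lowercase().as_str() {
            "exact" => Some(Tier::Exact),
            "semantic" => Some(Tier::Semantic),
            "partial" => Some(Tier::Partial),
            "skeleton" => Some(Tier::Skeleton),
            _ => None,
        }
    }
}

/// Hypothetical consumer-side policy: treat a bundle as no more trustworthy
/// than its weakest symbol.
fn weakest(tiers: &[Tier]) -> Option<Tier> {
    tiers.iter().copied().min()
}
```

Taking the minimum is deliberately conservative: one skeleton body is enough to stop a consumer from treating the whole bundle as verified.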

Passes and the capability model

A pass is the unit of work in disrobe. Each pass lives in its own crate, implements a shared trait, and declares the capabilities it requires and produces. The capability resolver is what allows arbitrary passes to chain.

Registered passes

Run disrobe passes for the live list. As of the current release:

Pass | Capability summary
pyarmor | PyArmor v6 / v7 (dynamic-hook) + v8 / v9-pro static unpack.
pyinstaller | PyInstaller 2.x-6.20+ extract + AES-CTR / CFB decrypt.
pyfreeze | cx_Freeze / py2exe / shiv / pex / PyOxidizer / Briefcase detect + extract.
nuitka | --onefile payload extract (zstd) + symbol / constants scan.
py | Deobfuscate (peel + cleanup) / disassemble / decompile / extract / SourceDefender decrypt.
js | Deobfuscate (string-array + unminify + scope-aware rename) / unbundle 11 bundlers.
wasm | Analyze / decompile (JSON / Rust / TypeScript / WAT / C) / reverse 4 obfuscator families (plus wasm-name-obfuscator detect + classify).
envelope | .dr create / inspect / verify / diff / migrate-check.
query | Query a Disasm-rung .dr IR: functions / calls-to / xrefs-to / string-decoders / complexity-over / capability sites.
capabilities | Match a binary against built-in capability rules with evidence addresses and MITRE ATT&CK / MBC tags.
native | Ghidra-headless decompile / symbol dump / unpack / devirt / entropy / crypto signatures / disasm / callgraph / patch / sigmaker / diff.
jvm | Classfile / .jar / .dex / .apk decompile via CFR / Vineflower / Procyon / JADX.
apk | AndroidManifest.xml decode + resource id-to-name map + signer-cert SHA-256.
dotnet | .NET PE decompile via ILSpy / dnSpyEx / de4dot + protector detection.
hermes | React Native Hermes bundle disasm + JS surface lift.
macho | Mach-O / fat / .ipa dump + ObjC + Swift class-dump.
lua | Lua 5.1-5.4 / LuaJIT / Luau / GLua decompile + obfuscator peel.
php | Encoder decode (phar / ionCube / SourceGuardian / ZendGuard) + eval-chain peel.
ruby | MRI / YARV / mruby / JRuby / TruffleRuby / Ruby2Exe / Ocra analysis.
beam | .beam IFF parse + Core Erlang lift + Code chunk disasm.
pickle | Disasm + decompile + safety + symbolic trace + polyglot + ML model detect.
go | pclntab + moduledata + garble report + embed.FS extraction.
swift | Swift / ObjC class-dump + SwiftShield undo + Confidential XOR-decrypt.
as3 | ActionScript 3 .swf DoABC tag disasm.
flutter | Dart AOT / libapp.so dump + obfuscation_map parse.
chain | Explicit pass pipeline orchestrator.
serve | HTTP daemon + WebSocket stream + LSP-stdio + gRPC + MCP.

The capability resolver

Rather than hard-coding which pass follows which, each pass declares:

  • Requires: the capabilities and IR rung its input envelope must already carry.
  • Produces: the capabilities and IR rung its output envelope will carry.

When the chain runner picks the next pass, it matches the current envelope's produced capabilities against each candidate pass's requirements. A pass only runs if its requirements are satisfiable. This is why disrobe auto can detect that a PyInstaller archive contains a PyArmor-protected module and route it through the unpack-then-decompile chain without any per-combination glue code.

Capabilities are versioned. A pass can require, for example, "a CPython 3.12 code object at the Disasm rung," and the resolver will refuse to feed it a 2.7 object. This keeps the chain sound across the wide version ranges disrobe supports.
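The Requires/Produces matching can be pictured with a minimal Python sketch. Everything named here is hypothetical: the PassSpec type, the (kind, version, rung) capability triple, and the pass names are illustrations, not disrobe's real Rust trait or capability encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PassSpec:
    name: str
    requires: frozenset  # capabilities the input envelope must already carry
    produces: frozenset  # capabilities the output envelope will carry


def satisfiable(spec, envelope_caps):
    """A pass may run only if every required capability is present."""
    return spec.requires <= envelope_caps


def next_passes(passes, envelope_caps):
    """Names of the candidate passes whose requirements are satisfied."""
    return [p.name for p in passes if satisfiable(p, envelope_caps)]


# Hypothetical capability triples: (kind, version, rung).
CODE_312 = ("cpython-code", "3.12", "disasm")
CODE_27 = ("cpython-code", "2.7", "disasm")

PASSES = [
    PassSpec("py-decompile-3.12", frozenset({CODE_312}),
             frozenset({("python-source", "3.12", "surface")})),
    PassSpec("py-decompile-2.7", frozenset({CODE_27}),
             frozenset({("python-source", "2.7", "surface")})),
]
```

Because the version is part of the capability itself, an envelope carrying only a 2.7 code object never satisfies the 3.12 pass, which is the refusal described above.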

Standardized emits

Every pass exposes the same twelve emit kinds:

source  disasm  ast  cfg  ir  manifest  sourcemap  symbols  strings  imports  signatures  report

Pass --emit source,disasm,report (comma-separated) to select a subset, or --all-emits on passes that support it to write every kind. A pass that cannot produce a given emit writes an explicit stub:

{
  "schema": "disrobe.emit.stub/v0",
  "pass": "pyarmor",
  "emit_kind": "source",
  "applicable": false,
  "error_code": "DR-IR-NotApplicable",
  "reason": "pyarmor pass does not produce source; chain with disrobe py decompile"
}

A downstream tool can request any emit from any pass and get a well-formed, self-describing answer: either the artifact or a "not applicable here, chain with X."
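A consumer might branch on the stub like this. This is a sketch that assumes the emit is delivered as JSON text and relies only on the fields shown in the stub above; read_emit is a hypothetical helper, not part of disrobe.

```python
import json

STUB_SCHEMA = "disrobe.emit.stub/v0"


def read_emit(text):
    """Return ("stub", reason) for a not-applicable stub, else ("artifact", doc)."""
    doc = json.loads(text)
    is_stub = (
        isinstance(doc, dict)
        and doc.get("schema") == STUB_SCHEMA
        and doc.get("applicable") is False
    )
    if is_stub:
        return ("stub", doc.get("reason", ""))
    return ("artifact", doc)


# The stub from the text above, verbatim.
STUB = """{
  "schema": "disrobe.emit.stub/v0",
  "pass": "pyarmor",
  "emit_kind": "source",
  "applicable": false,
  "error_code": "DR-IR-NotApplicable",
  "reason": "pyarmor pass does not produce source; chain with disrobe py decompile"
}"""

kind, detail = read_emit(STUB)
```

Non-JSON emits (for example rendered source) would need a separate path; the sketch covers only the JSON case.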

Error codes

Every failure carries a DR-<DOMAIN>-<NNNN> code rendered through miette diagnostics. Look any code up with:

disrobe explain DR-PYARM-0050
disrobe explain CLI-1            # short form also works
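For tooling that collects codes from logs, the long form can be split with a small pattern. This sketch assumes an uppercase alphanumeric domain and exactly four digits, as the DR-<DOMAIN>-<NNNN> shape suggests; the short form (CLI-1) and any non-numeric code are deliberately returned as None rather than guessed at.

```python
import re

# Assumed shape: DR-<DOMAIN>-<NNNN>, uppercase domain, four digits.
CODE_RE = re.compile(r"^DR-(?P<domain>[A-Z0-9]+)-(?P<number>\d{4})$")


def parse_code(code):
    """Return (domain, number) for a long-form code, or None if it does not match."""
    m = CODE_RE.match(code)
    if m is None:
        return None
    return m.group("domain"), int(m.group("number"))
```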

The chain runner

The chain runner is what turns a pile of single-purpose passes into a one-command recovery. It backs both disrobe auto (detect and chain automatically) and disrobe chain (drive an explicit pipeline).

Auto-detection

disrobe auto suspect.exe --out recovered/

disrobe auto fingerprints the input, picks the first pass, runs it, then re-fingerprints the output and repeats, following the capability resolver, until no further pass applies or the depth cap is hit. Detection spans 23 pass crates: native packer, PyArmor, JS deob, Python deob, container formats, SourceDefender, py-decompile, py-disasm, PyInstaller, JVM, .NET, Go, mobile, AS3, BEAM, Lua, Ruby, shell, PHP, Nuitka, Wasm, pyfreeze, and swift-objc.

Representative chains:

  • PE -> UPX -> rust-demangle
  • PyInstaller -> PyArmor -> .pyc decompile
  • APK -> dex -> JADX + Smali + manifest
  • Electron .asar -> webcrack -> source

Explicit chains

When you want to pin the pipeline rather than auto-detect:

disrobe chain input.bin --chain 'pyarmor+py-decompile' --out recovered/
disrobe chain input.bin --chain 'auto:8' --out recovered/        # auto-detect, depth 8
disrobe chain input.bin --chain 'pyarmor+py-decompile' --chain-pin pyarmor@0.10.0,py-decompile@0.10.0

--chain-pin locks each pass to a specific version so a recovery is reproducible against an exact pass build.

Depth and cycle safety

Adversarial input can try to make a chain recurse forever (an archive nested inside itself, a packer that re-emits its own signature). The chain runner defends against this:

  • Depth cap. --max-depth (default 8) bounds how many passes can run in one chain.
  • Cycle detection. Each stage's output is content-hashed (BLAKE3); if a stage produces bytes already seen earlier in the chain, the runner stops rather than looping.
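The two defenses combine into a short loop. The sketch below is illustrative only: it uses hashlib.blake2b from the Python standard library as a stand-in for BLAKE3 (which the standard library does not provide), and step is a hypothetical callback that returns the next stage's bytes or None when no pass applies. Seeding the seen-set with the input's own hash is an assumption of the sketch.

```python
import hashlib


def run_chain(data, step, max_depth=8):
    """Apply step repeatedly; stop when no pass applies, on a repeat, or at the cap.

    Returns (last_good_bytes, passes_completed, stop_reason).
    """
    seen = {hashlib.blake2b(data).digest()}  # stand-in for a BLAKE3 content hash
    depth = 0
    while depth < max_depth:
        out = step(data)
        if out is None:
            return data, depth, "done"
        digest = hashlib.blake2b(out).digest()
        if digest in seen:
            # These exact bytes were already seen earlier in the chain.
            return data, depth, "cycle"
        seen.add(digest)
        data = out
        depth += 1
    return data, depth, "depth-cap"
```

An archive that unpacks to itself stops immediately with "cycle", while a stage that always emits fresh bytes is cut off after eight passes.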

Stage mirrors

Pass --capture-stages to materialize every executed pass's byte-exact output:

recovered/
├── 01-pyinstaller/        # byte-exact output of pass 1
├── 02-pyarmor/            # byte-exact output of pass 2
├── 03-py-decompile/       # byte-exact output of pass 3
├── final/                 # terminal stage(s), linked
│   └── 03-py-decompile/   # symlink -> NTFS junction -> recursive copy fallback (Windows)
├── chain.json             # the chain topology descriptor
└── recovery.json          # per-pass status, confidence histogram, timings

The final/ link prefers a symlink, falls back to an NTFS junction on Windows, and finally to a recursive copy, so final/ always resolves to the terminal artifact regardless of platform and privilege.
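The fallback order can be approximated in a few lines. This is a simplified sketch, not disrobe's implementation: it tries a symlink and falls back to a recursive copy, and it omits the NTFS junction step because the Python standard library has no public API for creating junctions. link_final is a hypothetical name.

```python
import os
import shutil
import tempfile


def link_final(target_dir, link_path):
    """Make link_path resolve to target_dir; returns which strategy was used."""
    try:
        os.symlink(os.path.abspath(target_dir), link_path, target_is_directory=True)
        return "symlink"
    except (OSError, NotImplementedError):
        # A junction attempt would go here on Windows (omitted in this sketch).
        shutil.copytree(target_dir, link_path)
        return "copy"


# Usage on a throwaway directory tree.
root = tempfile.mkdtemp()
stage = os.path.join(root, "03-py-decompile")
os.makedirs(stage)
with open(os.path.join(stage, "m.py"), "w") as fh:
    fh.write("x = 1\n")
os.makedirs(os.path.join(root, "final"))
final_entry = os.path.join(root, "final", "03-py-decompile")
how = link_final(stage, final_entry)
```

Whichever branch runs, reading final/03-py-decompile/m.py yields the same bytes as the stage directory.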

chain.json: the topology descriptor

chain.json records the executed pipeline: each pass, its version, the input and output BLAKE3 hashes, the rung transition, byte sizes, and the per-stage verdict. It is the document disrobe diff and disrobe guard verify operate on (see Diff and guard tooling).

recovery.json: the provenance sidecar

recovery.json is the per-run report: each pass's status, a confidence-tier histogram, and timings. Summarize it without reading raw JSON:

disrobe context --out recovered/

This prints per-pass status, confidence tiers, the overall verdict, and provenance, which is the human-facing view of what the chain actually managed to recover and how much to trust it.

The .dr envelope

The .dr envelope is disrobe's content-addressed wire format. Every recovered artifact can be persisted as one, and the chain runner uses envelopes internally to pass work between stages. The format is designed for one thing above all: deterministic, verifiable, offline-composable caching.

Anatomy

A .dr envelope has three parts:

  1. Hot payload (rkyv). The primary data, serialized with rkyv 0.8 for zero-copy access. An envelope can be mmap-ed and its payload read without a deserialize pass: "deserializing" a cached envelope is measured at roughly 21 ns, because there is effectively nothing to deserialize.
  2. Cold sidecar (postcard). Secondary metadata serialized with postcard, kept out of the hot path so the common case stays fast.
  3. BLAKE3 root hash. A content hash over the payload that is the envelope's identity. Two envelopes with the same root hash are byte-identical by construction.

Every envelope also carries its schema version, its IR rung (see the ladder), its capability set, and a provenance record describing which pass produced it.

Why content-addressed, not timestamp-addressed

Because the identity is the BLAKE3 hash of the content, a cache hit is provably the same bytes, not "probably the same, modified at the same time." This is what makes --no-cache an optimization toggle rather than a correctness toggle: with the cache on or off, the output is identical. It is also what lets chains compose offline: a downstream pass can trust that an upstream envelope it pulled from cache is exactly what would have been produced live.
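A toy cache shows why a hit cannot change the result. The sketch assumes deterministic passes and again uses hashlib.blake2b as a stand-in for BLAKE3; ContentCache and its methods are hypothetical names, not disrobe types.

```python
import hashlib


class ContentCache:
    """Cache keyed by the hash of the input bytes, never by path or mtime."""

    def __init__(self):
        self._store = {}
        self.hits = 0

    def run(self, pass_name, data, fn):
        key = (pass_name, hashlib.blake2b(data).hexdigest())  # BLAKE3 stand-in
        if key in self._store:
            self.hits += 1
            return self._store[key]
        out = fn(data)
        self._store[key] = out
        return out


cache = ContentCache()
first = cache.run("upper", b"abc", bytes.upper)   # computed live
second = cache.run("upper", b"abc", bytes.upper)  # served from cache
```

Since the key depends only on the pass and the content, the cached answer is the same bytes a live run would produce, so turning the cache off affects speed and nothing else.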

Working with envelopes

# Create an envelope from a source file
disrobe envelope create source.bin --out source.dr

# Inspect: version, rung, capabilities, provenance, root hash
disrobe envelope inspect source.dr

# Verify the BLAKE3 root against the payload
disrobe envelope verify source.dr
disrobe verify source.dr               # convenience alias

# Structurally diff two envelopes
disrobe envelope diff a.dr b.dr        # version, rung, flags, root hash, producer, capabilities, provenance

# Validate a migration is sound before performing it
disrobe envelope migrate-check a.dr --to-version 0.10.0 --to-rung surface

migrate-check answers a precise question: can this envelope be transcoded from its (version, rung) to the target (version, rung) such that a transcode path exists and every Requires capability remains satisfiable? It is how disrobe stays sound across schema bumps without silently dropping capability guarantees.

Transcoding across schema versions

disrobe-ir carries a transcode registry keyed on (from_version, from_rung, to_version, to_rung). Identity transcodes are registered for every rung at every version, and real transcodes are registered for the migration paths disrobe supports. Transcoding never changes the rung implicitly: a transcode moves an envelope across schema versions while it stays at the same rung, which keeps the operation auditable.
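A registry with that key shape, and the check migrate-check performs over it, might look like the following sketch. The class, the version "0.9.0", and the way capabilities are compared (required set must be a subset of what the target offers) are assumptions made for illustration.

```python
class TranscodeRegistry:
    """Transcodes keyed on (from_version, from_rung, to_version, to_rung)."""

    def __init__(self):
        self._paths = {}

    def register(self, from_version, from_rung, to_version, to_rung, fn):
        self._paths[(from_version, from_rung, to_version, to_rung)] = fn

    def register_identities(self, versions, rungs):
        # Identity transcodes: every rung at every version maps to itself.
        for version in versions:
            for rung in rungs:
                self.register(version, rung, version, rung, lambda payload: payload)

    def migrate_check(self, version, rung, to_version, to_rung, requires, target_caps):
        """True only if a path is registered and every required capability survives."""
        has_path = (version, rung, to_version, to_rung) in self._paths
        caps_ok = set(requires) <= set(target_caps)
        return has_path and caps_ok


reg = TranscodeRegistry()
reg.register_identities(["0.9.0", "0.10.0"], ["disasm", "hir", "surface"])
# One real migration path: same rung, newer schema version.
reg.register("0.9.0", "disasm", "0.10.0", "disasm", lambda payload: payload)
```

Asking for a rung change that was never registered fails the check, which mirrors the rule that a transcode never changes the rung implicitly.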

Hardening

The envelope decoder parses a content-addressed binary format and is treated as a security-sensitive surface. Adversarial envelopes that attempt read-past-end, integer overflow, or BLAKE3-mismatch acceptance are in scope for the security policy. The decoder lives in crates/disrobe-ir/src/envelope.rs and is fuzzed.

LLM sidecar and provenance

disrobe is built to feed coding agents. Any pass can emit a structured metadata bundle that lets an LLM reason about recovered code without re-deriving its structure: call graph, types, control flow, capability surface, decompile provenance, and round-trip verdict, all in one schema-conforming sidecar.

Turning it on

disrobe py decompile module.pyc --out recovered/ --llm
disrobe py decompile module.pyc --out recovered/ --llm --llm-briefs

--llm is an alias for the full Pack-4 selection. --llm-briefs additionally renders AGENTS.md and SKILL.md reconstruction briefs next to the bundle, so the output drops straight into an agentic IDE.

By default the bundle is written next to the primary output as <stem>.disrobe.llm.json. Override with --metadata-out <path> and choose the format with --metadata-format json|jsonl|cbor|msgpack.

The four packs

Packs are cumulative presets over the 18 categories:

| Pack | Flag | Adds |
|---|---|---|
| Pack-1 | --metadata-pack-1 | ast + disasm + symbols + strings |
| Pack-2 | --metadata-pack-2 | Pack-1 + cfg + types + imports + provenance |
| Pack-3 | --metadata-pack-3 | Pack-2 + dfg + signatures + constants + roundtrip + sourcemap + manifest |
| Pack-4 | --metadata-pack-4 / --llm | Pack-3 + confidence + opcode-coverage + pii-map + decryption-keys (auth-gated) |

The 18 categories

Each category can also be toggled individually:

ast  disasm  cfg  dfg  symbols  strings  types  imports  constants  signatures
provenance  roundtrip-verdict  source-map  manifest-cat  decryption-keys
confidence  opcode-coverage  pii-map

Fine-tune any pack:

disrobe py decompile m.pyc --metadata-pack-3 --metadata-exclude ast,symbols
disrobe py decompile m.pyc --metadata-include cfg,types,provenance
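How a cumulative pack combines with include and exclude lists can be sketched as a small resolver. The category names follow the list above; mapping the pack table's roundtrip, sourcemap, and manifest to roundtrip-verdict, source-map, and manifest-cat is an assumption, as is the rule that excludes are applied after includes. The authorization gate is not modelled.

```python
PACK_ADDS = {
    1: ["ast", "disasm", "symbols", "strings"],
    2: ["cfg", "types", "imports", "provenance"],
    3: ["dfg", "signatures", "constants", "roundtrip-verdict", "source-map", "manifest-cat"],
    4: ["confidence", "opcode-coverage", "pii-map", "decryption-keys"],
}
ALL_CATEGORIES = [name for level in sorted(PACK_ADDS) for name in PACK_ADDS[level]]


def resolve(pack=0, include=(), exclude=()):
    """Expand a pack level (0 = none), add includes, then drop excludes."""
    for name in list(include) + list(exclude):
        if name not in ALL_CATEGORIES:
            raise ValueError("unknown category: " + name)
    selected = []
    for level in range(1, pack + 1):
        selected.extend(PACK_ADDS[level])  # packs are cumulative
    for name in include:
        if name not in selected:
            selected.append(name)
    dropped = set(exclude)
    return [name for name in selected if name not in dropped]
```

For example, resolve(3, exclude=["ast", "symbols"]) corresponds to the first command above and resolve(include=["cfg", "types", "provenance"]) to the second.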

Auth-gated categories

The decryption-keys category exposes recovered keys and IVs and is gated: passing --decryption-keys without --i-have-authorization fails with DR-CLI-0420. This is the same authorization gate that protects grey-zone commercial-protector behavior elsewhere in the tool.

Provenance sidecars

Independently of the --llm bundle, a chain run writes two provenance artifacts:

  • recovery.json: per-pass status, confidence-tier histogram, and timings. Summarize with disrobe context --out <dir>.
  • provenance/<file>.map.json: a line-level map from each recovered source line to (pass, source_offset, opcode_range, confidence). This is what lets an agent (or a human) trace any line of recovered source back to the exact bytes it came from and how confident the recovery is.

The .disrobe/ workspace

Scaffold a full agent workspace in the current directory:

disrobe init                    # scaffold .disrobe/
disrobe init --ide claude       # also generate IDE-specific settings (claude, cursor, windsurf, aider)

This lays down an AGENTS.md forensic-framing template, per-symbol annotation schemas under .disrobe/annotations/, skill packs under .disrobe/skills/, slash commands, and a settings hook template that denies edits to ground-truth stage directories (see Diff and guard tooling). Maintain it with:

disrobe annot refresh           # rebuild .disrobe/annotations/<stem>.annot.json
disrobe rename oldName newName --note "why"   # append-only rename record
disrobe context --out recovered/              # summarize the recovery report

Python

Python is disrobe's most contested and most developed ecosystem. It ships an in-house Rust decompiler as the product, never a wrapper around pycdc, pylingual, decompyle3, or uncompyle6 (those are benchmark competitors, available only as optional --backend fallbacks).

At a glance

| Layer | Coverage |
|---|---|
| Bytecode disassembly | CPython 1.0-3.15, PyPy, MicroPython .mpy v0-v6, Jython, IronPython, Brython |
| Decompilation | In-house engine across CPython 1.0-3.15 with per-version opcode dispatch; 92.76% per-code-object recompile-equivalence on a pinned 200-module CPython 3.14 stdlib corpus (5831 of 6286, above a 90% CI floor), and the legacy 1.0-3.7 band asserts a CI floor of 152 of 191 proven-correct (67 by recompile-equivalence, the rest by structural token-match) |
| Modern constructs | match, walrus, f-strings and PEP 750 t-strings, exception groups, PEP 695/696/709 |
| Freezers | PyInstaller 2.x-6.20+, Nuitka, cx_Freeze, py2exe, PyOxidizer, shiv, pex, Briefcase, SourceDefender |
| Protectors | PyArmor v6-v9-pro and 18 source obfuscators with an AST-evaluator backend |

Decompiling .pyc

disrobe py decompile module.pyc --out recovered/
disrobe py decompile module.pyc --out recovered/ --backend native    # default; deterministic, no external tools
disrobe py decompile module.pyc --out recovered/ --emit source,disasm,ast

The default native backend is the in-tree engine: it runs a frame-tree pre-pass, per-version opcode dispatch, and then round-trip verification. The optional --backend pycdc|decompyle3|uncompyle6 flags shell out to those external tools (which must be on PATH) purely for benchmark comparison; they are never the default.

How the in-house engine works

  1. Frame-tree pre-pass. Before walking instructions, the engine reconstructs the nested source-construct tree from the 3.11+ exception table. This eliminates the single-pass stack-walker desync that causes other decompilers to mis-nest try/except and with-blocks.
  2. Provably-inert normalizations. Twelve normalizations (padding, super-instruction fusion, constant-pool ordering, and more) run before the round-trip check, each gated by an adversarial test proving it masks no real bug.
  3. Round-trip metric. Every emitted file is recompiled on the matching interpreter and compared opcode-for-opcode against the original. PERFECT is byte-identical; SEMANTIC is the same program with a different layout; CODE_DIFF flags a real bug that is fixed before ship. The normalizer preserves jump-condition polarity rather than collapsing all jumps, so an inverted condition reads as a CODE_DIFF instead of passing silently.
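The round-trip comparison in step 3 can be illustrated with Python's own compile and dis modules. This is a heavily simplified sketch, not the engine's metric: it compiles both sides from source on the running interpreter (disrobe compares against the original bytecode), applies none of the twelve normalizations, and the verdict function is a hypothetical name. In this reduced form SEMANTIC only appears when the decoded instruction streams agree but the raw bytes do not.

```python
import dis
import types


def _stream(code):
    """Flatten a code object and its nested code objects into (opname, argval) pairs."""
    out = []
    for ins in dis.get_instructions(code):
        if isinstance(ins.argval, types.CodeType):
            out.append((ins.opname, "<code>"))
            out.extend(_stream(ins.argval))
        else:
            out.append((ins.opname, ins.argval))
    return out


def _raw(code):
    """Raw bytecode of a code object followed by that of its nested code objects."""
    parts = [code.co_code]
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            parts.extend(_raw(const))
    return parts


def verdict(original_src, recovered_src):
    a = compile(original_src, "<original>", "exec")
    b = compile(recovered_src, "<recovered>", "exec")
    if _stream(a) != _stream(b):
        return "CODE_DIFF"  # opcode-for-opcode mismatch: a real difference
    return "PERFECT" if _raw(a) == _raw(b) else "SEMANTIC"
```

Because jump opcodes are compared by name rather than collapsed, an inverted condition (if a versus if not a) shows up as CODE_DIFF, which is the polarity property described above.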

Measured equivalence

The per-code-object figure is measured against an independent oracle, not the tool's own output: each recovered module is recompiled on CPython 3.14.5 and its code objects are diffed against the originals. On a pinned 200-module stdlib corpus (6286 code objects) the rate is 92.76% (5831 of 6286), above the 90% floor enforced by a committed CI gate (arbitrary_recompile_gate.rs). uncompyle6 stops near 3.8 and decompyle3 near 3.9; the ML-based decompilers self-flag benchmark contamination, and there is no model here to contaminate.

Disassembling

disrobe py disasm module.pyc --out trace.txt

A faithful per-instruction trace across every supported interpreter dialect. This is the Disasm rung: lossless, offset-preserving, no structural reconstruction.
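For intuition about what a Disasm-rung trace contains, the same idea can be reproduced for the running interpreter with the standard library. This sketch is not how disrobe works internally: it only reads a .pyc written by the same CPython (3.7 or later, where the header is 16 bytes), because marshal data is version-specific, and it does not descend into nested code objects.

```python
import dis
import importlib.util
import marshal
import os
import py_compile
import tempfile


def load_pyc(path):
    """Return (magic, code object) from a CPython 3.7+ .pyc file."""
    with open(path, "rb") as fh:
        header = fh.read(16)  # magic, flags, then mtime+size or a source hash
        return header[:4], marshal.loads(fh.read())


def trace(code):
    """Yield (offset, opname, argrepr) for each instruction, in order."""
    for ins in dis.get_instructions(code):
        yield ins.offset, ins.opname, ins.argrepr


# Usage: compile a tiny module with the running interpreter, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "m.py")
    with open(src, "w") as fh:
        fh.write("x = 1 + 2\n")
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "m.pyc"), doraise=True)
    magic, code = load_pyc(pyc)
    listing = list(trace(code))

# The header magic identifies the interpreter that wrote the file.
magic_matches = magic == importlib.util.MAGIC_NUMBER
```

Every entry keeps its byte offset, which is what makes a trace of this kind offset-preserving rather than a structural reconstruction.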

Deobfuscating source

disrobe py deob obfuscated.py --out clean.py
disrobe py deob obfuscated.py --out clean.py --cleanup

Peels source-level obfuscator wrappers (Kramer/Specter, Berserker, Jawbreaker, BlankOBF, PlusOBF, Wodx, pyobfuscate.com, PyObfuscator (mauricelambert), python-obfuscator (PyPI), ObfuXtreme, Manglify, Oxyry, pyminifier, online obfuscator family, Xindex, pyobfus, Pypacker, Patchwork) with an AST-evaluator backend. --cleanup runs a ruff-AST constant-fold and dead-branch-elimination pass afterward.

Freezers and packagers

disrobe pyinstaller extract onefile.exe --out out/       # PyInstaller 2.1 .. 6.x, AES-CTR/CFB decrypt
disrobe pyinstaller detect onefile.exe                   # cookie, Python version, TOC offsets, no extract
disrobe pyfreeze extract app.exe --out out/              # cx_Freeze / py2exe / shiv / pex / PyOxidizer / Briefcase
disrobe nuitka detect app.exe                            # flavor + Python version
disrobe nuitka extract app.exe --out out/                # --onefile payload (zstd)
disrobe nuitka symbols app.exe                           # impl_* + module-init scan on --standalone builds
disrobe py sourcedefender app.pye --out app.msgpack      # SourceDefender .pye decrypt

PyArmor

disrobe pyarmor unpack protected.py --out out/

Unpacks a PyArmor wrapper back to its original .pyc. v8 and v9-pro are handled by a pure-static path (no code execution). v6/v7 can optionally use a dynamic-hook fallback that runs the obfuscated wrapper in a watched subprocess to capture marshal streams; this is opt-in and unsafe on untrusted input:

disrobe pyarmor unpack protected.py --out out/ --allow-dynamic --dynamic-timeout 60

The --allow-dynamic path executes the sample. Only enable it on trusted samples or inside an isolated sandbox. See Forensics and malware-safety posture.

Other useful flags: --mode auto|standard|super, --target 3.11 (rewrite emitted .pyc magic), --allow-bcc (BCC native-body lift via Ghidra-headless), --strict (exit non-zero on any partial decode), and --all-emits.

End-to-end

A real-world Python sample is often frozen, then protected, then compiled. disrobe auto chains the whole stack:

disrobe auto suspect.exe --out recovered/    # PyInstaller -> PyArmor -> .pyc decompile

JavaScript / TypeScript

disrobe deobfuscates obfuscated JS/TS, splits bundled output back into per-module sources, and inspects packaged JS runtimes, all behind a deterministic codegen.

At a glance

| Layer | Coverage |
|---|---|
| Family detector | obfuscator.io, Jscrambler, jsobfu, plus bundler and minified-only classification, each with confidence and markers |
| obfuscator.io (--full) | string-array decode, control-flow unflattening, opaque-predicate folding, packing expansion, dead-code and debug-protection strip, iterated to a fixpoint |
| Reverser library | JS-Confuser (string encoding/compression, dispatcher, flatten, opaque predicates, RGF, shuffle, variable masking, locks and integrity) and Jscrambler template reversals; Arxan-JS, JSDefender, and PACE protector detectors |
| Esoteric encoders | jsfuck, JJEncode, AAEncode, JSFiretruck, Dean Edwards Packer, atob/eval indirection |
| Renaming | --rename (hex idents to var_N) and --rename-scope-aware (oxc_semantic, conflict-checked) |
| Bundlers | Webpack 4/5, Vite, Rollup, Rolldown, esbuild, Turbopack, Bun, Browserify, Parcel, SystemJS, AMD |
| Packaged runtimes | V8 cached-data .jsc (bytenode), Node SEA blobs, nexe, nw.js zip-suffix, Electron .asar |

Deobfuscation

disrobe js deob bundle.min.js --out clean.js --full --rename-scope-aware
disrobe js deob legacy.js --out clean.js --legacy auto --unminify

The default path runs string-array recovery and writes a detection.json sidecar naming the matched family. --full runs the complete obfuscator.io reversal pipeline and records per-stage statistics in a pipeline.json (string-array call sites inlined, dispatch blocks collapsed, opaque predicates folded, packed blocks expanded). --legacy jsobfu|jscrambler-free|auto targets the older families; --unminify adds the !0/void 0/string-concat peepholes.

Unbundling

disrobe js unbundle app.bundle.js --out src/
disrobe js unbundle app.bundle.js --out src/ --emit sourcemap

Auto-detects the bundler runtime from its markers (the full table above) or forces one with --target webpack|webpack4|webpack5|vite|rollup|rolldown|esbuild|turbopack|bun|browserify|parcel|systemjs|amd. Modules land as separate files with chunk and module identifiers preserved, plus a manifest.json. --emit sourcemap synthesizes per-chunk v3 source maps and decodes embedded data-url maps.

Packaged JS runtimes

disrobe js v8 app.jsc
disrobe js v8 app.asar --json-out report.json

Classifies the artifact and prints what it actually detects: bytenode header layout and Node version for .jsc, SEA flags and code length, nexe/nw.js payload geometry, or the .asar entry listing.

For .jsc, disrobe is the self-contained, static, offline option: it recovers the user-string layer plus structure and detects the serializer version across Node 18-24, with no patched V8 binary (View8), Ghidra (ghidra_nodejs), or online service (jscdecompiler.com) required. The boundary is that internalized identifiers (most variable and property names, for example console and log) are serialized as references into V8's read-only snapshot heap, not as inline bytes in the .jsc; resolving them needs the exact V8 binary's RO heap. disrobe reports that as a lossy-internalized-roots boundary rather than fabricating past it.

For V8 snapshots it reports a SnapshotDeserializeWall: the format prevents full bytecode recovery, so disrobe scrapes the string pool (tunable via --scrape-min) and states the boundary instead of guessing.

Chaining

Electron and Node packaging chains run end to end:

disrobe auto app.asar --out recovered/

WebAssembly

disrobe parses WebAssembly modules and lifts them to four target surfaces, reverses Wasm-specific obfuscators, and decodes the Component Model and GC type graphs.

Decompilation

disrobe wasm decompile module.wasm --target rust --out lifted.rs
disrobe wasm decompile module.wasm --target ts   --out lifted.ts
disrobe wasm decompile module.wasm --target wat  --out lifted.wat
disrobe wasm decompile module.wasm --target c    --out lifted.c
disrobe wasm decompile module.wasm --target json --out summary.json

Lifts to Rust, TypeScript, WAT, or C pseudo-source, or a JSON summary, with DWARF / source-map name recovery where debug info is present.

Per-op coverage is measured, not assumed. Op-coverage is scored only when every operator in a function is lowered (no unreachable/todo! stub) and the result validates through an independent re-parser; parseability alone does not count. On the committed corpus all 94 functions in the 30 parseable modules are fully op-covered: every operator has a lowering rule and the re-emitted WAT re-parses. The MVP instruction set plus the SIMD, atomics, bulk-memory, table/element, reference, and tail-call proposals are lowered. Three corpus modules are skipped on wat-parse or signature-extraction failure, so this is op-coverage of the supported subset, not of all wasm.

Op-coverage is not the same as execution-equivalence. Execution-equivalence is measured separately: all 24 execution-eligible functions (those with a pure-numeric i32/i64/f32/f64 ABI) are execution-equivalent to the original under wasmtime. The semantic_differential test compares return values, trap parity, and linear memory between the original and the recovered module (1 is byte-identical in memory). The remaining functions are op-coverage-only. The Component Model envelope, threads, memory64, and the GC type graph are parsed and decoded by dedicated scanners; that is distinct from lifting their per-instruction semantics to source.

Deobfuscation

disrobe wasm deob module.wasm --out clean.wasm

Reverses four Wasm obfuscator families with byte- or IR-transforming passes: Jscrambler-WASM (strip integrity imports, fold opaque predicates), Wobfuscator (recover the eval op-table and lift each handler), Tigress-via-Emscripten (unflatten the dispatcher, demangle _Z names), and Wasmixer (unwrap the XOR decrypt stub, defragment). A fifth family, wasm-name-obfuscator, is detected and its rename strategy classified, but its high-entropy hex renames destroy the original names, so there is nothing to reverse.

Component Model and GC types

disrobe wasm component module.wasm        # parse the Component Model envelope -> world / adapter manifest
disrobe wasm gc-types module.wasm         # recover the GC type graph (struct / array / ref types)

JVM and Android

disrobe decompiles JVM classfiles and Android DEX through a unified command, wrapping the best FOSS decompilers headlessly while adding obfuscator reversal, ProGuard/R8 mapping replay, and chain auto-detection.

Decompiling

disrobe jvm decompile App.class --out src/
disrobe jvm decompile app.jar --backend vineflower --out src/
disrobe jvm decompile app.apk --backend jadx --out src/
disrobe jvm decompile classes.dex --backend jadx --out src/

Routes a .class, .jar, .dex, or .apk through a JVM/Android backend: CFR, Vineflower, Procyon, JADX, and others. disrobe validates the classfile itself (format 1.0.2-25) and recovers records, sealed types, and pattern matching where the backend supports them, plus Kotlin and Scala idioms.

Fidelity

The in-house classfile decompiler is gated against real javac: on the EdgeCases corpus, the asserted floor is 122 of 131 decompiled methods (93.1%) recompiling error-free, and 128 of 131 (97.7%) are measured on JDK 25. CI provisions a JDK so this gate runs there.

The Dalvik lifter's recovered bodies are graded by the real JVM bytecode verifier rather than by the lifter's own output: a committed gate assembles the recovered classes from the committed dex corpus, loads them under -Xverify:all, and asserts that the recovered classes pass the verifier; 99% of verifiable classes pass (102 of 103, 0 lifter verify failures; the 103rd is link-blocked by an unbundled Kotlin Function1 supertype, a test-harness limit not a lifter defect).

On the gitignored real FOSS apks the lifter self-reports a lowered body for 89% to 92.5% of methods (transmissionic 92.5%, enrecipes 90.7%, rustdesk 89.0%), but that figure counts the lifter returning a body rather than a throw-stub and is not verifier-attested; those apks cannot run in CI.

A live-range-splitting pass recovers method bodies whose registers carry conflicting JVM types across control-flow joins; 307 re-hosted bodies verify clean under the same gate. The committed-corpus verifier floor and the EdgeCases recompile floor are asserted by committed test gates.

Inventory and backends

disrobe jvm extract app.apk --out classes/    # extract a .jar / .apk + dump classfile inventory
disrobe jvm backends                          # report available JVM/Android backends on PATH

Obfuscator reversal

disrobe reverses JVM obfuscators that the raw decompilers cannot (Zelix KlassMaster, Allatori, Stringer, DashO, and DexGuard control-flow obfuscation on the Android side) and replays ProGuard/R8 mapping files to restore original names.

String-encryption protectors are decrypted by emulating each class's decrypt method on the encrypted constants rather than by guessing a key, so Allatori and DashO style decrypt(String) and decrypt(int, String) routines are recovered statically. The evaluator runs <clinit> for a per-class static key, constructs the receiver and runs <init> for an instance decrypt keyed on a field, synthesizes the calling frame for a caller/stack-trace-keyed key, and covers the long-accumulator arithmetic, getfield/putfield, and switch-dispatched keystreams these routines emit, all under a hard step cap. Runtime-keyed schemes (system property, environment, clock, secure random, or a live cross-class table, as Stringer sometimes uses) are flagged as walled instead of faked. ProGuard and R8 names are restored from a mapping.txt with disrobe jvm decompile --mapping mapping.txt, disambiguating overloaded methods by their descriptor argument count.
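The mapping replay rests on the plain-text ProGuard/R8 mapping.txt format: an unindented "original -> obfuscated:" line per class, followed by indented member lines. The sketch below parses class lines and method lines and keys methods by argument count, in the spirit of the disambiguation described above. It is a minimal illustration with hypothetical names, it ignores fields, and argument count alone cannot separate overloads of the same arity.

```python
import re

CLASS_LINE = re.compile(r"^(\S+) -> (\S+):$")
# Optional "start:end:" prefix, return type, name(args), optional ":line[:line]" suffix.
METHOD_LINE = re.compile(
    r"^\s+(?:\d+:\d+:)?\S+ (\S+)\(([^)]*)\)(?::\d+(?::\d+)?)? -> (\S+)$"
)


def parse_mapping(text):
    classes = {}  # obfuscated class -> original class
    methods = {}  # (obfuscated class, obfuscated method, argc) -> original method
    current = None
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # comment lines, as R8 emits at the top of the file
        m = CLASS_LINE.match(line)
        if m:
            original, current = m.group(1), m.group(2)
            classes[current] = original
            continue
        m = METHOD_LINE.match(line)
        if m and current is not None:
            name, args, obfuscated = m.groups()
            argc = len(args.split(",")) if args else 0
            methods[(current, obfuscated, argc)] = name
    return classes, methods


SAMPLE = """# compiler: R8
com.example.Login -> a.a:
    java.lang.String user -> a
    1:3:void check(int,java.lang.String):10:12 -> a
    4:4:void reset(int):20:20 -> a
com.example.Main -> a.b:
    void main(java.lang.String[]) -> a
"""

classes, methods = parse_mapping(SAMPLE)
```

Here two different methods share the obfuscated name a inside a.a, and the argument count is what tells check (two arguments) from reset (one).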

On the Android side, disrobe deflattens BlackObfuscator control-flow flattening: it recognizes the String.hashCode() keyed dispatcher, matches each block's const-string block-name to its switch case, and recovers the original linear block order, annotating the deflattened sequence directly in the decompiled output. Obfuscator-planted out-of-range exception-table entries are dropped before structuring so they cannot poison the control-flow graph, and jsr/ret subroutines are inlined into a linear stream.
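Matching a block name to its switch case depends on reproducing Java's String.hashCode(), which is fully specified: h = 31*h + c over the UTF-16 code units, wrapped to a signed 32-bit integer. The sketch below implements that hash and a hypothetical match_blocks helper; the switch_cases dictionary shape is an assumption, and how disrobe then derives the linear block order is not modelled.

```python
def java_string_hash(s):
    """Java's String.hashCode(): 31*h + unit over UTF-16 code units, as int32."""
    h = 0
    data = s.encode("utf-16-be")
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed Java int.
    return h - (1 << 32) if h & 0x80000000 else h


def match_blocks(block_names, switch_cases):
    """Pair each const-string block name with the switch case its hash selects.

    switch_cases maps a case key (int) to an opaque block label (hypothetical shape).
    Names whose hash has no case are skipped.
    """
    pairs = []
    for name in block_names:
        key = java_string_hash(name)
        if key in switch_cases:
            pairs.append((name, switch_cases[key]))
    return pairs
```

With the hash reproduced exactly, the dispatcher's integer case keys become readable block names again, which is the first step toward recovering the original order.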

DexGuard hides string constants in an encrypted static String[] decrypted at run time through java.lang.reflect.Method.invoke rather than a direct call. The key and ciphertext are present in the dex; only the dispatch is reflective. disrobe runs a constrained Dalvik register machine over the dex's own routine: it executes the class <clinit> to rebuild the encrypted table, then runs the decrypt(int) body for each index (read the table element, apply the per-char transform against the embedded key, rebuild the string) and emits the plaintext, with the Class.getDeclaredMethod + Method.invoke call sites resolved to their concrete target. disrobe jvm decompile app.dex surfaces the recovered strings and resolved sites in the manifest. A routine that derives its key from runtime-only state (a system property, the environment, the clock, or a secure random) is reported as runtime-keyed rather than guessed. Commercial DexGuard is paid Guardsquare software whose protected output is unsafe to build on an analysis box, so this path is validated against a self-authored benign dex that exhibits the same reflection-string-decryption technique, graded against its authored plaintext.

Chaining

disrobe auto app.apk --out recovered/    # APK -> dex -> JADX + Smali + manifest

.NET / CIL

disrobe parses the full .NET PE + CLR metadata surface, decompiles CIL to C#, F#, and VB pseudo-source, detects 19 protectors, and handles ReadyToRun and Native AOT images. In-house static recovery reverses ConfuserEx2 constant decryption on a real committed sample (its encrypted-resource layer is carved byte-exact but walled on the runtime key, and full ConfuserEx2 cleanup delegates to --backend de4dot); the Eazfuscator VM-tier is devirtualized at 57 of 57 instructions against an in-repo EazVM virtualizer of our own (the committed assembly is encoded by that virtualizer, not the shipping Eazfuscator.NET product), and ILProtector and MaxToCode are detected and structurally enumerated with their method bodies walled on the native-runtime key (derived in the loader, absent from the assembly), never fabricated. The rest are detected with watermark-strip and encrypted-resource classification.

Decompiling

disrobe dotnet decompile App.dll --backend ilspy --out src/
disrobe dotnet decompile App.exe --backend dnspyex --out src/
disrobe dotnet decompile App.dll --backend de4dot --out src/

Routes a .NET PE (.dll / .exe) through ILSpy, dnSpy, dnSpyEx, or de4dot. disrobe owns the in-house CIL disassembler (full opcode table) and the CIL-to-C#/F#/VB lowering, so the structural recovery is its own even when a rendering backend is used.

Static analysis

disrobe dotnet analyze App.dll
disrobe dotnet backends                  # report available .NET backends on PATH

analyze reports the PE header, CLR metadata, and protector detection, and probes for ReadyToRun (R2R) and Native AOT images, with symbol recovery on AOT builds.

Obfuscator reversal

disrobe detects 19 protector families. Recovery depth varies by protector and by what is statically present in the artifact. The model for in-house recovery is the same one used by the JVM and Lua passes: locate the decryptor method or key inside the assembly and emulate it over the encrypted data through the in-house CIL stack-machine, never a re-derived or hard-coded key.

Reversed on a real committed sample (plaintext recovered from the artifact, plaintext-absent oracle):

  • ConfuserEx2: in-house recovery reverses the constants protection (the documented FOSS "Ki.Constants" block-XOR / LZMA-validated algorithm) on a real committed SampleConstants.confuserex2.dll, with a test whose fixture holds only ciphertext plus the real decryptor and asserts that the plaintext is not present anywhere in it. The encrypted-resource layer is carved byte-exact but walled on the runtime key. Full deobfuscation (control-flow flattening, runtime-VM string decryption, anti-tamper) is delegated to de4dot via disrobe auto / --backend de4dot; disrobe does not reimplement that tier in-house.

In-assembly-decryptor recovery, graded by round-trip against the pre-encryption original:

  • Eazfuscator.NET: locates the static char[]/byte[] string-decryptor method and emulates its CIL over the encrypted #US literal table to recover the pre-VM plaintext strings. The VM-tier is devirtualized against an in-repo EazVM virtualizer of our own: the committed assembly is encoded by that virtualizer, not the shipping Eazfuscator.NET product. disrobe reads the embedded resource, recovers the per-build opcode map from the in-assembly dispatch table by fingerprinting each handler, decrypts the position-keyed instruction stream, and lifts every virtualized method body back to CIL, then grades that CIL against the clean DLL. The grade is an ordered instruction comparison (opcode and operand, with branch targets resolved to instruction index, not raw token): 57 of 57 instructions match in sequence across the five bodies (100%). A second gate rebuilds a runnable assembly from the recovered CIL and asserts its stdout is byte-identical to the clean baseline (run wherever a .NET runtime is on PATH). Per-build randomization is fully recovered; only a runtime-only homomorphic key, not present statically, would bound a given build.
  • ILProtector / MaxToCode: classified by Invoke-stub and zero-RVA method enumeration, runtime-resource and .mtc/.text1 section location, and container-framing parse. Real builds derive the per-method key inside the native loader (Protect32/64.dll) at run time, not in the managed assembly, so the encrypted bodies are walled and reported absent, never fabricated.
  • KoiVM (ConfuserEx VM): detected by #Koi stream and VMDispatcher method markers; the VM-dispatch handler table is fingerprinted and the virtualized methods are lifted back to CIL through the same in-house CIL stack-machine used for Eazfuscator.
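The ordered instruction comparison used for the Eazfuscator grade can be illustrated as follows. This is a sketch under assumptions: the `Insn` record, the `BRANCHES` subset, and the scoring function are hypothetical names for illustration, not disrobe's grader. The point it shows is why branch targets are compared by instruction index: two encodings of the same body can place instructions at different byte offsets while still being the same program.

```python
from typing import NamedTuple


class Insn(NamedTuple):
    offset: int      # byte offset of the instruction in its method body
    opcode: str
    operand: object  # for branches: the target byte offset


# ASSUMPTION: a small illustrative subset of CIL branch opcodes.
BRANCHES = {"br", "brtrue", "brfalse", "beq", "blt", "bgt"}


def normalize(body):
    """Rewrite branch operands from byte offsets to instruction indices."""
    index_of = {insn.offset: i for i, insn in enumerate(body)}
    out = []
    for insn in body:
        operand = insn.operand
        if insn.opcode in BRANCHES:
            operand = ("insn", index_of[operand])
        out.append((insn.opcode, operand))
    return out


def grade(recovered, clean):
    """Return (matched, total) for a position-by-position comparison."""
    a, b = normalize(recovered), normalize(clean)
    matched = sum(1 for x, y in zip(a, b) if x == y)
    return matched, max(len(a), len(b))
```

A body whose instructions sit at offsets 0, 5, 10, 15 and one whose instructions sit at 0, 2, 4, 6 grade as identical when both branch to the fourth instruction, and as a mismatch when one branches elsewhere.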

Doable now with an in-assembly key (detected and classified today; a per-protector decryptor emulation can be added on the same model once a real sample is available to confirm the exact algorithm):

  • SmartAssembly, .NET Reactor, Babel, Dotfuscator (Pro), Skater, Goliath, DeepSea: the string key (per-string XOR lane, AES/Rijndael resource key, RC4(SHA1(resource)), or single-byte XOR) is embedded in the assembly, so the data is present and not a wall. These are currently detected with watermark-strip, identifier, and encrypted-resource classification, and the generic static-decoder opportunistically recovers in-lined integer/string constants where the decoder is a pure transform. Confirming a per-build algorithm needs a real protected sample (flagged for consent; nothing is downloaded).
  • CryptoObfuscator, Agile.NET (CV tier): 3DES/Rijndael string keys are in-assembly; same status as above.

Needs a real sample to build against (flagged for consent, never downloaded): Spices.Net (Cyrillic-homoglyph + ROT-N per-method scramble), and the commercial protectors above whose exact per-build transform is not documented.

Genuine walls (the key or the original code is not in the static artifact):

  • Themida / .NET wrapper: managed methods are lifted into the Oreans native VM; per project policy disrobe does not ship a native-VM devirtualizer.
  • ArmDot: custom per-method VM with LCG-encrypted opcodes; static devirtualization is not performed.
  • ILProtector / MaxToCode native-keyed configurations: when the per-method key is computed inside the native stub, the original CIL is not statically present.

Other:

  • Obfuscar: dedicated in-house peeler (NameMaker odometer classification; rename-only metadata, so there is no byte rewrite and no embedded name map to recover).

Grey-zone commercial protectors are gated behind --i-have-authorization.

Chaining

disrobe auto App.exe --out recovered/    # ConfuserEx2 PE -> de4dot -> ILSpy -> C#

Native (PE / ELF / Mach-O)

disrobe does not compete with Ghidra, IDA, or Binary Ninja on raw decompilation. It is the unpack, symbol-recovery, and chain-detect layer that feeds those tools cleaner input, and it wraps Ghidra headlessly when you want a full decompile in one command.

Symbol recovery and dumping

disrobe native symbols app.exe --out symbols.json

Dumps symbols, sections, segments, imports, and debug info from a PE / ELF / Mach-O. Demangles and restores Rust and C++ symbols across x86 / ARM / RISC-V / MIPS / PowerPC / SPARC / eBPF / AVR, reading DWARF, PDB, and STABS debug formats. For C++ binaries it recovers the class hierarchy from the in-memory RTTI and vtable layout: ABI, each class's inheritance kind, direct base classes (virtual flagged), virtual-method slot counts, and detected STL templates.

Disassembly, call graph, and CFG

disrobe native disasm app.exe --out app.asm                  # recovered per-function listing
disrobe native disasm app.exe --emit cfg-dot --out cfg.dot   # per-function basic-block CFG
disrobe native disasm app.exe --emit json --out disasm.json  # structured instruction stream
disrobe native disasm --raw shellcode.bin --base 0x1000 --bits 64 --syntax intel   # linear sweep
disrobe native callgraph app.exe --out callgraph.dot         # whole-program call graph

An in-tree iced-x86 disassembler discovers functions without symbols (call-target and prologue scanning), partitions each into basic blocks, builds the whole-program call graph, and renders the per-function listing or, with --emit cfg-dot, the basic-block CFG. --raw disassembles a flat code blob with no container, and --syntax intel|at&t|nasm|masm selects the dialect for raw output. Each decoded instruction carries its register, memory, and rflags read/write effects, and the native layer can re-encode and relocate instruction blocks (the engine behind native patch).

Patching, signatures, and binary diff

disrobe native patch app.exe --at 0x1400 --bytes 0x90,0x90 --out patched.exe   # rewrite + revalidate
disrobe native patch app.exe --nop-range 0x1400:0x1410 --out patched.exe        # nop a span
disrobe native sigmaker app.exe --at 0x1400                                      # wildcarded signature
disrobe native diff old.exe new.exe                                             # match functions across builds

native patch rewrites bytes at a virtual address (or nops a VA span), maps the VA to a file offset, applies the edit, and revalidates the image. native sigmaker generates a wildcarded byte signature from the function at an address (immediates and displacements masked via the instruction decoder), then uniqueness-tests it across the image. native diff matches functions across two binaries by content, relocation-invariant, and control-flow-graph fingerprints and reports the added, removed, and changed functions. All three work on stripped input through the in-tree function discovery.
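The uniqueness test behind a wildcarded signature is simple to sketch. The text format below (`??` for a masked byte) and the function names are assumptions for illustration; the masking step itself, which needs an instruction decoder to find immediates and displacements, is not shown.

```python
def parse_signature(sig):
    """'E8 ?? ?? ?? ?? C3' -> [0xE8, None, None, None, None, 0xC3].

    None marks a wildcard position (a masked immediate or displacement).
    """
    return [None if tok in ("?", "??") else int(tok, 16) for tok in sig.split()]


def find_matches(image, pattern):
    """Return every offset in image where the pattern matches."""
    hits = []
    for start in range(len(image) - len(pattern) + 1):
        if all(p is None or image[start + i] == p for i, p in enumerate(pattern)):
            hits.append(start)
    return hits


def is_unique(image, sig):
    """A signature is usable only if it matches exactly once in the image."""
    return len(find_matches(image, parse_signature(sig))) == 1
```

A bare `E8 ?? ?? ?? ??` (a relative call with its displacement masked) usually matches many call sites, so a generator has to keep extending the pattern with following bytes until exactly one match remains.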

Queryable IR and capabilities

disrobe query app.exe functions                         # discovered functions, complexity, exports
disrobe query app.exe calls-to malloc                   # call sites to a target
disrobe query app.exe xrefs-to sekret                   # references to a symbol
disrobe query app.exe string-decoders                   # decoder-shaped functions (loops + byte arith)
disrobe query app.exe complexity-over 20                # functions over a cyclomatic threshold
disrobe query app.exe capability network                # instructions tied to a capability
disrobe capabilities app.exe                            # MITRE ATT&CK + MBC behavior report

disrobe query runs a queryable-IR layer over the disassembled code (functions, calls-to, xrefs-to, string-decoders, complexity-over, capability), symbol-independent and driven by the same function discovery. disrobe capabilities runs a rule engine over that IR and reports matched behaviors mapped to MITRE ATT&CK techniques and Malware Behavior Catalog (MBC) IDs, with the matching instruction offsets as per-rule evidence. Both accept a stripped binary directly or a .dr envelope.
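For readers unfamiliar with the metric behind complexity-over, cyclomatic complexity of a single function's control-flow graph is E - N + 2 (edges minus nodes plus two). The sketch below assumes a CFG given as a successor map and treats the threshold as strictly greater-than; both the representation and that boundary choice are assumptions, not a statement of how disrobe stores its IR.

```python
def cyclomatic_complexity(cfg):
    """cfg maps each basic block name to the list of its successor blocks."""
    nodes = set(cfg) | {s for succs in cfg.values() for s in succs}
    edges = sum(len(succs) for succs in cfg.values())
    return edges - len(nodes) + 2


def complexity_over(functions, threshold):
    """Names of functions whose complexity exceeds the threshold."""
    return sorted(
        name
        for name, cfg in functions.items()
        if cyclomatic_complexity(cfg) > threshold
    )
```

Straight-line code scores 1, and each additional decision point (an if, a loop back-edge) adds one, which is why a high score is a cheap signal for dispatchers, parsers, and flattened control flow.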

Emulation-driven string recovery

disrobe strings app.exe                                  # static strings + decoder-execution recovery

Beyond a static scan, disrobe strings locates decoder-shaped functions and drives each through the in-house x86 emulator, recovering the plaintext that only exists after the decoder runs. Emulation-recovered strings are deduplicated against the static set and reported with the decoder and buffer addresses, so a single-byte or multi-byte XOR/sub stack-string scheme yields the decoded text without executing the sample natively.

Identifying the compiler, packer, and protector

disrobe native identify app.exe --out identity.json

An in-house signature engine fingerprints what built and protected a binary, the way Detect-It-Easy does, then routes each finding to the disrobe pass that handles it. It detects compilers and toolchains (Go, Rust, MSVC, GCC, Clang, MinGW, Delphi, Nim, Zig, Crystal, Swift, GHC, .NET, Nuitka), packers (UPX, ASPack, PECompact, FSG, MEW, MPRESS, Petite, NSPack, kkrunchy), protectors (Themida, VMProtect, Enigma, Obsidium, Armadillo, ConfuserEx, .NET Reactor, Eazfuscator), installers (NSIS, Inno Setup, InstallShield, AutoIt, PyInstaller, Electron, Bun), linkers (MSVC link via the Rich header, GNU ld, LLD), and code-signing. Every hit carries a support route: a Go binary points at disrobe go recover, a packed PE at disrobe native unpack, a .NET assembly at disrobe dotnet decompile, an installer at disrobe auto. Virtualizing protectors (Themida, VMProtect) are detect-and-carve only, never a fabricated devirtualization.

Unpacking native packers

disrobe native unpack packed.exe --out unpacked.bin

Detects the runtime packer and unpacks it. In-house decoders cover UPX (.text and .pdata byte-identical, ~96% whole loaded image), kkrunchy (byte-exact), NSPack (~99% content-section), MPRESS, Petite, MEW, ASPack, and PECompact.

On committed samples, ASPack and PECompact rebuild the decompressed section image at its load RVA. The section report confirms the recovered .text is byte-identical and the import table >=98% byte-identical to the original, both gated in CI. The packed .text starts at near-random entropy with zero resolvable calls; after unpacking, entropy drops to ~6.2-6.5 and the disassembler resolves hundreds of intra-code calls. Because the whole rebuild is a loaded-memory image rather than a disk-aligned file, the bench marks whole-output byte-identity n/a. MEW rebuilds a flat image of the committed Sysinternals samples, evidenced by the entropy drop to ~4.2-4.9 and tens of thousands of decoded instructions. FSG decodes through the same aPLib-clone path but ships no committed fixture (its samples live under the gitignored .developer/ tree), so no number reproduces from a checkout.

ASProtect, Morphine, nPack, NeoLite, and Yoda's Crypter are recovered by driving their unpack stub through the in-house x86 stub emulator: the decompressor or stream decryptor runs to the original entry point inside the emulator, then the reconstructed sections are read back and sliced byte-for-byte (Yoda's Crypter .rsrc recovers byte-identical and .text decrypts to full plaintext). Yoda's Protector is detect + resource-carve, because its stream key is a runtime-only value absent from the file.

On UPX and NSPack the whole-image residual is the loader-rebuilt zone (bound import address table and base relocations): those addresses are resolved by the OS loader at run time and are not present in the packed stream, so the residual is not a decoder gap.

The virtualizing protector tier (VMProtect, Themida, Enigma, and 15+ others) is detect-and-carve: the stub is still driven through the emulator, but the original code is decrypted only by a per-machine key assembled after the stub validates an un-instrumented host (RDTSC deltas, debugger-handler identity, BOUND/FPU exception fingerprints). That key is not present in the file, so faithful recovery is an information-theoretic wall; disrobe carves what survives in place and reports the wall rather than fabricating an unpack. Per-fixture recovery scores are pinned in corpus/native/packers/MANIFEST.toml.

Devirtualizing a bytecode VM

disrobe native devirt protected.exe --out recovered/

disrobe native devirt targets the bytecode-VM tier rather than the compression tier. It locates the interpreter, fingerprints each handler's micro-op behaviorally by probing it through the in-tree x86 emulator (so a per-build handler permutation does not break the lift), recovers the handler-to-opcode table, reconstructs the VM CFG, and lifts the handler bytecode to a re-executable IR plus pseudo-code. The output directory receives the recovered listing, the pseudo-code, and a devirt.manifest.json (schema disrobe.native.devirt/v1).

The lifter is validated end-to-end on a self-authored Tigress-shape bytecode VM: the recovered IR re-executes to the same outputs as the original across arithmetic, loop, and branch programs, lifted from machine code alone (vm_devirt_oracle.rs). The commercial front-ends (VMProtect, Themida, Code Virtualizer, Enigma, WinLicense, PELock) mutate their handler set per build; the lifter is the engine, the Tigress-shape VM is the validated level, and the commercial families are extended from published RE write-ups, not a running licensed sample. A handler stream assembled at run time from a per-machine key, or fetched over the network, is the one residual, and section carve plus protector identification stay available for every family.

Forensic primitives

disrobe native entropy app.exe                           # ASCII heat-strip + byte histogram + packed-region runs
disrobe native entropy app.exe --format svg --svg map.svg # SVG entropy map with section overlays
disrobe native entropy app.exe --format json --out e.json # the disrobe.native.entropy/v0 document
disrobe native signatures app.exe --out sigs.json        # AES T-tables, SHA/MD5 IV+K, ChaCha20 sigma
disrobe native signatures app.exe --flirt db.sig         # match against an IDA FLIRT database
disrobe native fingerprint app.exe                       # crypto + FLIRT + string-xref sidecar
disrobe native graph app.exe --out imports.dot           # import/export table as Graphviz DOT
disrobe native sbom app.exe --out app.cyclonedx.json     # CycloneDX 1.5 SBOM from cargo-auditable metadata

Entropy map & byte histogram

disrobe native entropy slides a 4 KB window across the file computing Shannon entropy (bits/byte) to locate packed, compressed, or encrypted regions, and renders the profile three ways via --format text|json|svg:

  • text (default): a Unicode heat-strip sparkline (one glyph per 4 KB block, taller = higher entropy), a 16-bucket ASCII byte-frequency bar chart, and a list of contiguous high-entropy runs (entropy >= 7.0 bits/byte) with their file offsets, the candidate packed/encrypted regions. Pass --out <path> to also drop the JSON document.
  • svg: a self-contained, dependency-free neutral-gray entropy map. Each block is a column colored on a calm-gray to amber to red ramp, with PE/ELF/Mach-O section boundaries (parsed from the file's own section table) overlaid as labeled dashed markers and a color legend. Written to ./out/<stem>.entropy.svg or the explicit --svg <path> (which implies SVG rendering). The SVG is byte-stable for a given input (no clock or RNG) and all section names are XML-escaped.
  • json: the machine-readable disrobe.native.entropy/v0 document: per-block entropy, the sparkline/heat-strip strings, the 16-bucket histogram, the detected high-entropy runs, and the section spans.

The reusable rendering logic lives in disrobe_pass_native::entropy_viz (entropy_sparkline, byte_histogram, histogram_ascii_16, high_entropy_runs, render_entropy_svg) so other tools can embed it.
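The measurement itself is standard Shannon entropy over fixed 4 KB blocks, followed by a scan for contiguous blocks at or above 7.0 bits/byte. The Python below is a minimal sketch of that calculation for illustration; the function names are hypothetical and are not bindings to the Rust functions listed above.

```python
import math
from collections import Counter

BLOCK = 4096       # block size in bytes, as described above
THRESHOLD = 7.0    # bits/byte at or above which a block counts as high-entropy


def shannon_entropy(block):
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    n = len(block)
    return -sum((c / n) * math.log2(c / n) for c in Counter(block).values())


def block_entropies(data, block=BLOCK):
    """Entropy of each consecutive block of the input."""
    return [shannon_entropy(data[i:i + block]) for i in range(0, len(data), block)]


def find_high_entropy_runs(entropies, block=BLOCK, threshold=THRESHOLD):
    """Return (start_offset, end_offset) for each contiguous high-entropy run."""
    runs, start = [], None
    for i, value in enumerate(entropies):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            runs.append((start * block, i * block))
            start = None
    if start is not None:
        runs.append((start * block, len(entropies) * block))
    return runs
```

A block of identical bytes scores 0 and a block in which all 256 byte values are equally frequent scores 8, so compressed or encrypted regions cluster near the top of the range while code and text sit well below the 7.0 cut-off.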

Full decompile via Ghidra

disrobe native decompile app.exe --out decompiled/

Runs Ghidra headlessly (install it with disrobe install-deps ghidra) and returns pseudo-C alongside the standardized emits. This is the one place where an external native engine is the legitimate primary: disrobe's job is to hand it a clean, unpacked, symbol-rich input.

Go

disrobe recovers symbols from stripped and garbled Go binaries across PE, ELF, and Mach-O by parsing the Go runtime's own metadata tables. The deliverable is symbols, types, and embedded data, not source bodies.

At a glance

Layer | Coverage
Binary formats | PE, ELF, Mach-O
pclntab | Header eras go1.2, go1.16, go1.18, and go1.20, located structurally even when the magic word has been stomped
Symbol recovery | pclntab function table, moduledata, typelinks/itablinks type metadata, buildversion
Obfuscation | garble report graded None / Detected / Partial / Full, with per-scheme literal-recovery statistics
Embedded data | embed.FS usage report and directive extraction
Debug info | DWARF report when the sections survive
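As background for the pclntab row, the Go runtime's line-table header starts with a 4-byte magic, two zero bytes, the instruction-size quantum, and the pointer size. The sketch below classifies a candidate header by magic and, when the magic has been overwritten, falls back to the shape of the remaining bytes. It is an illustration only: a shape check this loose would produce false positives on real binaries, and how disrobe cross-validates candidates is not described here.

```python
import struct

# Magic values used by the Go runtime for each pclntab header era.
MAGICS = {
    0xFFFFFFFB: "go1.2",
    0xFFFFFFFA: "go1.16",
    0xFFFFFFF0: "go1.18",
    0xFFFFFFF1: "go1.20",
}


def header_shape_ok(hdr):
    """Bytes 4-5 are zero, byte 6 is the pc quantum, byte 7 the pointer size."""
    return (
        len(hdr) >= 8
        and hdr[4] == 0
        and hdr[5] == 0
        and hdr[6] in (1, 2, 4)
        and hdr[7] in (4, 8)
    )


def classify(hdr):
    """Return the header era, 'unknown-magic' if only the shape fits, or None."""
    if not header_shape_ok(hdr):
        return None
    for fmt in ("<I", ">I"):  # the magic follows the target's byte order
        (magic,) = struct.unpack(fmt, hdr[:4])
        if magic in MAGICS:
            return MAGICS[magic]
    return "unknown-magic"  # shape matches but the magic word was stomped
```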

Recovering a binary

disrobe go recover app --out app-go.json
disrobe go info app

recover writes the full analysis JSON (default ./out/<stem>-go.json); info prints the fingerprint without writing anything. Output shape (illustrative):

go recover: OK
  input:        app
  image kind:   elf
  ptr size:     8
  pclntab ver:  go1.20
  buildversion: go1.26.3
  funcs:        ...
  packages:     ...
  garble:       None
  embed.FS:     used=true directives=...
  wrote:        ./out/app-go.json

info adds the stripped-binary fingerprint: whether the symbol table was stripped, how many functions were still recovered from pclntab, and the stdlib-name ratio that feeds the garble grading.

Garble

The garble report separates a real wall from a tooling boundary. Standard-library names survive in pclntab and are recovered, while hashed user identifiers stay walled: garble hashes them with a keyed HMAC-SHA256 over a build seed that is not in the binary, so the original names are information-theoretically gone and are reported as a name_recovery_wall rather than guessed at.

The garble -literals string encryption is a different story and is not a one-time pad. Each literal is decrypted by an init-time thunk whose key material is itself compiled into the binary, so the plaintext is statically recoverable by emulating that thunk. For the simple obfuscator disrobe does exactly this: garble emits both the key[] and data[] byte arrays as rodata []byte literals and the runtime loop computes plaintext = data <op> key (op in XOR/ADD/SUB), so disrobe locates the adjacent equal-length blob pair, applies the inverse op, strips the junk padding, and bridges the scattered external-key byte-mutations. Recovery is counted per scheme (plain_ascii, single_xor, single_add, single_sub, repeating_xor, garble_simple) and the simple-scheme scan only runs once garble is positively detected, so a normal Go binary is never mined for phantom literals. The residual boundary is concrete rather than information-theoretic: the external-key mutation index and op live as immediate operands in the decrypt thunk's machine code (not in rodata), and the swap/split/shuffle/seed obfuscators wrap each literal in its own control flow, so byte-exact recovery of heavily mutated or control-flow obfuscated literals needs full decrypt-thunk emulation over the .text immediates. The key material is present in the file in every case, which is what makes this a depth boundary and not a wall.
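The core of the simple-scheme recovery is one byte-wise operation over an equal-length data/key pair. The sketch below shows only that step, assuming the pair and the operation are already known; locating the blob pair, stripping junk padding, and bridging external-key mutations are not modeled, and the function names are hypothetical.

```python
# The operation the runtime loop applies: plaintext = data <op> key (mod 256).
OPS = {
    "xor": lambda d, k: d ^ k,
    "add": lambda d, k: (d + k) & 0xFF,
    "sub": lambda d, k: (d - k) & 0xFF,
}


def recover_literal(data, key, op):
    """Apply the runtime operation byte by byte to recover the plaintext."""
    if len(data) != len(key):
        raise ValueError("data and key must be the same length")
    f = OPS[op]
    return bytes(f(d, k) for d, k in zip(data, key))


def printable_candidates(data, key):
    """Try every op and keep results that are entirely printable ASCII.

    ASSUMPTION: a printable-ASCII filter is a heuristic for picking the op
    when it is not read from the thunk; it can keep more than one candidate.
    """
    out = {}
    for name in OPS:
        plain = recover_literal(data, key, name)
        if all(32 <= b < 127 for b in plain):
            out[name] = plain
    return out
```

When the operation is read from the decrypt thunk rather than guessed, the filter is unnecessary; it is shown here because the same key and data often decode to printable text under only one of the three operations.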

Validation and chaining

The pass is validated against a go1.26.3 fixture, and the test suite gates type-name recovery at >= 85% on that fixture. UPX-on-Go chains automatically: disrobe auto unpacks the UPX layer first, then recovers the Go symbols underneath.

Lua

disrobe decompiles compiled Lua chunks across every common dialect, detects the major Lua obfuscator families, ships dedicated peelers for seven of them, and devirtualizes custom-VM wrappers back to runnable Lua source.

At a glance

Layer | Coverage
Dialects | Lua 5.1 / 5.2 / 5.3 / 5.4, LuaJIT 2.0 / 2.1, full Luau (all 82 opcodes, child-proto linking), GLua
Decompile output | Lua source plus a manifest recording the detected dialect, fidelity grade (Lossless / Lossy / BestEffort), and any warnings
Obfuscator detection | Prometheus, MoonSec v1 / v2 / v3, IronBrew2, WeAreDevs, SLua, AztupBrew, DarkSec, Boronide, PSU, luaobfuscator.com
Peelers (--family) | prometheus, moonsec-v1, moonsec-v2, moonsec-v3, ironbrew2, wearedevs, slua; default auto detects first
VM devirtualization | IronBrew2 2.7.0 reversed on real committed output, graded by real-Lua execution differential (hello / arith / control / tables / edge in standard and MAX mode); MoonSec-shape recovery is pending a real sample

Decompiling

disrobe lua decompile script.luac --out script.lua
disrobe lua detect script.luac

decompile writes the recovered source (default ./out/<stem>.lua) and a manifest.json recording the format, fidelity grade, and warnings. detect reports the dialect and header field summary (constant, proto, and code counts) without writing output.

Output shape (illustrative):

lua decompile: OK
  input:        script.luac
  format:       Lua54
  fidelity:     Lossless
  warnings:     0
  wrote:        ./out/script.lua
  manifest:     ./out/script.manifest.json
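Dialect detection for PUC-Lua and LuaJIT chunks starts from the header bytes: stock Lua chunks begin with ESC "Lua" followed by a version byte, and LuaJIT dumps begin with ESC "LJ" followed by a dump-format version. The sketch below covers only those two families; Luau and GLua need different handling and are left out, and the function name is hypothetical.

```python
LUA_VERSIONS = {0x51: "Lua 5.1", 0x52: "Lua 5.2", 0x53: "Lua 5.3", 0x54: "Lua 5.4"}
LUAJIT_VERSIONS = {1: "LuaJIT 2.0", 2: "LuaJIT 2.1"}


def detect_lua_dialect(chunk):
    """Classify a compiled chunk from its signature and version byte."""
    if len(chunk) >= 5 and chunk[:4] == b"\x1bLua":
        version = chunk[4]
        return LUA_VERSIONS.get(version, "Lua (unknown version byte 0x%02x)" % version)
    if len(chunk) >= 4 and chunk[:3] == b"\x1bLJ":
        return LUAJIT_VERSIONS.get(chunk[3], "LuaJIT (unknown dump version)")
    return "unknown"
```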

Peeling an obfuscator

disrobe lua deobfuscate obfuscated.lua --out clean.lua
disrobe lua deobfuscate dumped.lua --family moonsec-v3 --i-have-authorization

MoonSec v3 and IronBrew2 are commercial-tier wrappers; their peelers require the explicit --i-have-authorization flag. Output shape (illustrative):

lua deobfuscate: OK
  family:       Auto
  detected:     Prometheus (confidence=0.97)
  passes run:   3
    - string_decode
    - bytecode_unwrap
    - emit
  recovered:    12 string(s)
  fully peeled: true
  residual:     0
  wrote:        ./out/obfuscated.peeled.lua

The report lists every pass that ran, recovered string constants, a fully peeled verdict, and any residual markers. Where fully peeled is false the report carries the residual marker names and the reason (runtime key, anti-tamper variant, or unmodeled VM tier).

VM devirtualization

IronBrew2 and MoonSec ship their payload behind a custom register-VM: a permuted opcode-handler table and an embedded constant pool that a stock decompiler cannot read. The permutation is not stored in the loader; it is computed at load time inside the obfuscated bootstrap, then used to dispatch handlers and key the constant decryptor. disrobe reconstructs it the same way the loader does.

For IronBrew2 2.7.0, the devirtualizer parses the bootstrap's dispatch chain to derive the encoded -> canonical permutation and the XOR key, decodes the constant pool, and lifts the VM bytecode back to runnable Lua. The committed corpus/lua/ironbrew2 set carries real obfuscated bootstraps for five programs in both standard and MAX mode; each must produce byte-identical output to the original under a real lua binary (tests/ironbrew2_real_oracle.rs).

MAX mode adds three layers on top of standard: a control-flow-flattened dispatch (a nested binary search over the opcode enum, which the same handler walker un-flattens), comparison-polarity number-mutation (the EQ handler tests ~= and jumps on equality, captured as the literal operator plus jump direction), and fused super-operator handlers (one VM step covering several real ops, whose hoisted scratch locals are stripped after classification).

The MoonSec emulate_perm_builder path interprets a bootstrap table-builder over its seed and is unit-tested on a realistic synthetic bootstrap of our own design whose permutation is derived at runtime. End-to-end validation against a real captured MoonSec dump is pending: no live sample is publicly available.

Runtime-key and anti-tamper variants (MoonSec v3 with an encrypted constant pool keyed at runtime) are the wall: the key is not present statically, so disrobe returns fully_recovered: false with a runtime keys residual marker.

PHP

The commercial PHP encoder market has no maintained FOSS competition offline. disrobe handles all three dominant encoders fully offline, and nothing is uploaded anywhere: it detects the envelope, grades what is statically recoverable, and reports the rest as a wall. It also peels stacked eval-chain obfuscation and walks Phar archives.

At a glance

Layer | Coverage
Commercial encoders | ionCube, SourceGuardian, Zend Guard: envelope detect and wall (the decrypt key is native-loader-resident); a partial op_array skeleton only for legacy statically-keyed cases (Zend legacy XOR), graded StructuralOnly otherwise
Phar archives | Manifest walker with path-sanitized extraction
Eval-chain layers | base64_decode, gzinflate, gzuncompress, gzdecode, str_rot13, strrev, str_replace, urldecode / rawurldecode, hex escapes, pack-hex, chr() concatenation, uudecode, single-key XOR, create_function, nested eval, FOPO, Better PHP Obfuscator
Recovery grading | EvalChainPeeled / OpArrayDecompiled / StructuralOnly / PlainSource

Decoding an encoder envelope

disrobe php decode payload.php --out out/payload-php/
disrobe php decode payload.php --encoder ioncube --i-have-authorization

--encoder is auto (default), phar, ioncube, sourceguardian, or zendguard. Commercial encoders require the explicit --i-have-authorization flag. The output directory receives the decoded payload, a skeleton .php when an op_array was decompiled, and a manifest.json recording the encoder, version label, marker offset, ciphertext and plaintext byte counts, and the recovery stage.

Output shape (illustrative):

php decode: OK
  input:        payload.php
  encoder:      Ioncube
  out dir:      ./out/payload-php
  manifest:     ./out/payload-php/manifest.json

Peeling eval chains

disrobe php deobfuscate obfuscated.php --out clean.php

Unwraps stacked eval() layers until the residue is plain PHP. Output shape (illustrative):

php deobfuscate: OK
  input:        obfuscated.php
  layers:       3
  residual_eval:false
  wrote:        ./out/obfuscated.peeled.php
  manifest:     ./out/obfuscated.peeled.manifest.json

The manifest counts each layer kind that was peeled and flags whether any eval remains in the residue.
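The peeling loop can be sketched for the simplest layer shape, eval(f1(f2('literal')));. This is an illustration under assumptions: it recognizes layers with one regular expression rather than a PHP parser, supports only five of the listed decoders, and its names are hypothetical. It never executes PHP; each decoder is reimplemented as a pure transform.

```python
import base64
import codecs
import re
import zlib

# Pure-Python equivalents of a few PHP decoders.
DECODERS = {
    "base64_decode": lambda b: base64.b64decode(b),
    "gzinflate": lambda b: zlib.decompress(b, -15),  # raw DEFLATE stream
    "gzuncompress": lambda b: zlib.decompress(b),    # zlib-wrapped stream
    "str_rot13": lambda b: codecs.encode(b.decode("latin-1"), "rot13").encode("latin-1"),
    "strrev": lambda b: b[::-1],
}

# One layer: optional <?php, eval( fn( fn( 'literal' ) ) ); optional ?>
LAYER = re.compile(
    rb"^\s*(?:<\?php\s*)?eval\(\s*((?:\w+\s*\(\s*)+)'([^']*)'[\s)]*;\s*(?:\?>)?\s*$"
)


def peel(source, max_layers=32):
    """Unwrap eval layers until the residue no longer matches the layer shape."""
    layers = []
    for _ in range(max_layers):
        match = LAYER.match(source)
        if not match:
            break
        funcs = [f.decode() for f in re.findall(rb"\w+", match.group(1))]
        if any(f not in DECODERS for f in funcs):
            break  # unknown wrapper: stop and leave the residue as-is
        payload = match.group(2)
        for name in reversed(funcs):  # innermost call is applied first
            payload = DECODERS[name](payload)
        source = payload
        layers.append(funcs)
    return source, layers
```

The returned layer list plays the role of the per-layer counts in the manifest, and a residue that still starts with eval( after the loop stops corresponds to the residual flag.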

Phar archives

disrobe php extract archive.phar --out extracted/

Walks the Phar manifest and extracts every entry through a path-sanitizer (no .. escapes), writing a manifest.json with the entry count and API version.

Output shape (illustrative):

php extract: OK
  input:        archive.phar
  entries:      14
  out dir:      ./out/archive-phar
  manifest:     ./out/archive-phar/manifest.json
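A path sanitizer for archive extraction has one job: make sure no entry name can resolve outside the output directory. The sketch below shows a conservative version that rejects rather than rewrites suspicious names; the exact policy and the function name are assumptions, not disrobe's implementation.

```python
def sanitize_entry(name):
    """Return a safe relative path for an archive entry, or None to reject it."""
    name = name.replace("\\", "/")
    # Reject absolute paths and Windows drive-letter paths outright.
    if name.startswith("/") or (len(name) > 1 and name[1] == ":"):
        return None
    # Drop empty and '.' components; any '..' component rejects the entry.
    parts = [p for p in name.split("/") if p not in ("", ".")]
    if not parts or ".." in parts:
        return None
    return "/".join(parts)
```

Rejecting every `..` component, even ones that would stay inside the output directory after normalization, trades a little strictness for a rule that is easy to audit.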

When an encoder's key lives only in its runtime loader, the decode is graded StructuralOnly and the manifest carries the residual ciphertext length rather than pretending at plaintext.

Ruby

disrobe is the only active Ruby bytecode decompiler with a measured fidelity grade. It analyzes every major Ruby artifact format and decompiles YARV and mruby bytecode toward source, verified by a recompile-equivalence oracle on a real MRI interpreter.

At a glance

Layer | Coverage
Flavors detected | MRI source, YARV binary (YARB magic), mruby RITE, JRuby .class, TruffleRuby AOT, Ruby2Exe, Ocra
YARV | IBF reader (iseqs, object table, literals) plus a decompiler driven by per-version opcode tables for Ruby 2.6 through 3.4
mruby | RITE reader covering format versions 0001-0007, 0030, 0200, and 0300, with irep disassembly and decompilation
Fidelity | 100% opcode-multiset equivalence on a greeter fixture; 85% on a mixed-construct megafile (gate floor, CI-enforced); approximately 90% genuinely recoverable, the remainder an oracle artifact from comment-stripping magic pragmas
Output | Analysis JSON plus a .rb source file carrying the decompiled body and a YARV disassembly trailer

Analyzing an artifact

disrobe ruby decompile app.bin --out app-ruby.json
disrobe ruby detect app.bin

decompile sniffs the flavor, runs the matching reader and decompiler, and writes the analysis JSON (default ./out/<stem>-ruby.json) plus a .rb source file beside it. detect reports the flavor and exits without writing output.

Output shape (illustrative):

ruby decompile: OK
  input:        app.bin
  flavor:       YarvBinary
  yarv header:  major=3 minor=4
  yarv iseqs:   12
  yarv bodies:  12
  yarv objects: 34
  yarv literals:18
  yarv insns:   97
  yarv decomp:  Lossless
  yarv stmts:   23
  decompiled:   ./out/app.rb (yarv)
  wrote:        ./out/app-ruby.json

For MRI source the summary reports token and definition counts. For YARV it adds the IBF header fields, iseq and object counts, instruction count, decompile fidelity, and statement count. For mruby it reports the compiler version string, irep count, instruction count, and whether a body was recovered.

Fidelity measurement

A committed recompile-equivalence oracle compiles the recovered YARV source on the matching interpreter and diffs the opcode multiset. The gate asserts 100% equivalence on the greeter fixture and at least 85% on the megafile fixture; both run in CI. The roughly 5-point megafile gap is an oracle artifact: the oracle strips comment lines before recompiling, discarding the original's magic pragmas (# frozen_string_literal: true and similar), which shifts a handful of string and array literals to non-frozen opcodes.
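An opcode-multiset comparison ignores instruction order and asks only how many opcodes the two compilations share, counted with multiplicity. The scoring formula below (shared count over the larger total) is an assumption chosen for illustration; the committed oracle may normalize differently.

```python
from collections import Counter


def multiset_equivalence(original, recompiled):
    """Fraction of opcodes shared between two opcode lists, with multiplicity."""
    a, b = Counter(original), Counter(recompiled)
    shared = sum((a & b).values())  # multiset intersection
    total = max(sum(a.values()), sum(b.values()))
    return 1.0 if total == 0 else shared / total
```

The pragma effect described above shows up directly in such a score: a string literal compiled under frozen_string_literal: true is emitted as putobject, while the same literal recompiled without the pragma is emitted as putstring, so one opcode drops out of the intersection even though the recovered source is the same.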

Wrappers

Ruby2Exe and Ocra self-extracting packages are detected as their own flavors so the chain layer can route the embedded payload onward. JRuby .class files and TruffleRuby AOT images are classified but not decompiled here: JVM-class material belongs to the JVM guide, and AOT-compiled native code has no recoverable Ruby body.

BEAM (Erlang / Elixir)

disrobe parses BEAM IFF files, recovers Erlang or Elixir source when debug chunks survive, lifts to Core Erlang otherwise, and disassembles the Code chunk per instruction. A flat text disassembly listing lands beside the JSON automatically.

At a glance

Layer | Coverage
Chunks | AtU8 (short and long form), Code, StrT, Attr, CInf, Dbgi, Docs, ExpT, ImpT, LocT, FunT, Line, LitT (zlib-deflated on OTP 26 and earlier, raw on OTP 27+); unknown chunks are preserved verbatim
Source recovery | Erlang abstract code when present, Elixir source from a Dbgi form, best-effort Core Erlang lift as the floor; provenance is recorded in recovered_from
Disassembly | Per-instruction Code-chunk trace including the bs_match (opcode 182) command list; a flat .txt listing lands beside the JSON
Containers | .ez archives extract through the container layer

Parsing

disrobe beam parse module.beam --out ./out/module-beam.json

Reports the module name, atom / export / import / fun counts, which optional chunks are present, and any unrecognized chunk names.

Output shape (illustrative):

beam parse: OK
  module:       my_module
  atoms:        42
  exports:      8
  imports:      15
  funs:         3
  wrote:        ./out/module-beam.json
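The container that beam parse walks is a small IFF layout: the bytes FOR1, a big-endian 32-bit form length, the bytes BEAM, then a sequence of chunks, each with a 4-byte ID, a big-endian 32-bit size, and a body padded to a 4-byte boundary. The sketch below walks that layout and flags chunk names outside the list given above; the function names are hypothetical and no chunk body is decoded.

```python
import struct

KNOWN_CHUNKS = {
    "AtU8", "Code", "StrT", "Attr", "CInf", "Dbgi", "Docs",
    "ExpT", "ImpT", "LocT", "FunT", "Line", "LitT",
}


def iter_chunks(data):
    """Yield (chunk_id, body) for every chunk in a BEAM IFF file."""
    if data[:4] != b"FOR1" or data[8:12] != b"BEAM":
        raise ValueError("not a BEAM IFF file")
    (form_len,) = struct.unpack(">I", data[4:8])
    end = min(len(data), 8 + form_len)
    pos = 12
    while pos + 8 <= end:
        chunk_id = data[pos:pos + 4].decode("latin-1")
        (size,) = struct.unpack(">I", data[pos + 4:pos + 8])
        body = data[pos + 8:pos + 8 + size]
        if len(body) != size:
            raise ValueError("chunk %r is truncated" % chunk_id)
        yield chunk_id, body
        pos += 8 + ((size + 3) & ~3)  # bodies are padded to 4 bytes


def unrecognized_chunks(data):
    """Chunk IDs that are not in the known set, in file order."""
    return [cid for cid, _ in iter_chunks(data) if cid not in KNOWN_CHUNKS]
```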

Lifting to source

disrobe beam lift module.beam --out out/module-beam-lift/

Writes three files: <stem>.<ext> (recovered Erlang or Elixir source, extension derived from recovered_from), <stem>.surface.json (the surface record with provenance), and <stem>.core.json (lifted Core Erlang functions), plus a manifest.json linking them.

When a Dbgi chunk is present the original forms are recovered directly and labelled AbstractCode (Erlang) or ElixirDbgiForm (Elixir). Without it the output is a best-effort Core Erlang lift labelled CoreLifted.

Output shape (illustrative):

beam lift: OK
  module:       my_module
  core fns:     8
  recovered:    AbstractCode
  source:       ./out/module-beam-lift/module.erl
  surface:      ./out/module-beam-lift/module.surface.json
  core erlang:  ./out/module-beam-lift/module.core.json
  manifest:     ./out/module-beam-lift/manifest.json

Disassembling

disrobe beam disasm module.beam --out ./out/module-beam.disasm.json

Emits the per-instruction Code-chunk trace as JSON and a flat .txt listing beside it. Opcodes beyond the known table fail with an explicit DR-BEAM-0012 error naming the offending opcode rather than silently skipping bytes.

Output shape (illustrative):

beam disasm: OK
  input:        module.beam
  instructions: 214
  wrote:        ./out/module-beam.disasm.json
  listing:      ./out/module-beam.disasm.txt

Swift / Objective-C

Swift and Objective-C compile to native machine code; function bodies are gone at compile time. What survives in the binary is the type metadata the Objective-C and Swift runtimes need at run time. disrobe extracts that metadata, demangles it, and reverses the two dominant rename obfuscators so a dump of a shielded binary reads with its original names.

Commands

disrobe swift classdump App.app/App --out dump.json
disrobe swift shield-undo map.txt --out renames.json
disrobe swift confidential-decrypt blob.bin --key 0x55 --out strings.json

disrobe macho classdump App.ipa --out dump.json
disrobe macho dump App.app/App
disrobe macho slices universal.bin

Class-dump

classdump reconstructs the type interface from two metadata sources the runtime leaves in the binary.

The Objective-C side walks __objc_classlist, __objc_catlist, and __objc_protolist to recover classes, categories, protocols, ivars, properties, and method selectors with their type encodings.

The Swift side parses the reflection sections (__swift5_types, __swift5_fieldmd, __swift5_proto) and demangles the symbols to recover type names, stored fields, and conformances.

The output is a header-style interface listing. Beside the JSON it writes a .swift source file with all recovered type declarations. disrobe swift classdump handles single-slice inputs; for fat binaries and .ipa containers use disrobe macho classdump, which walks every slice.

Output shape (illustrative):

swift classdump: OK
  input:        App
  cpu/bits:     arm64 / Bits64
  swift types:  24
  reflected:    18
  mangled syms: 312
  demangled:    312
  swift source: ./out/App-swift.swift
  wrote:        ./out/App-swift.json

Rename-undo

shield-undo reverses a SwiftShield run. SwiftShield renames symbols to high-entropy identifiers and emits an obf ==> original mapping in the .dSYM. disrobe parses that mapping and builds the undo lookup, so a subsequent class-dump of the shielded binary reads with the original names.

confidential-decrypt recovers plaintext strings from a SwiftConfidential XOR-obfuscated blob given its single-byte key (--key, default 0x55).
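The single-byte XOR step itself is simple enough to sketch in Python. This assumes the blob is a flat run of XOR-ed bytes with no framing, which may not match the real blob layout; the function name is hypothetical.

```python
def xor_decrypt(blob: bytes, key: int = 0x55) -> bytes:
    """XOR every byte with a single-byte key; applying it twice restores the input."""
    if not 0 <= key <= 0xFF:
        raise ValueError("key must fit in one byte")
    return bytes(b ^ key for b in blob)


secret = xor_decrypt(b"api.example.com")       # obfuscate with the default key
print(xor_decrypt(secret).decode("ascii"))     # prints api.example.com
```

Because XOR is its own inverse, the same function serves for both directions, which is also why a known plaintext byte reveals the key.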

Mach-O commands

disrobe macho dump reports the header, load commands, segments, sections, and any LC_ENCRYPTION_INFO or LC_ENCRYPTION_INFO_64 records. disrobe macho slices walks a fat binary and reports each slice's CPU type, subtype, and offset.
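As an illustration of what a slice walk reads, the following Python sketch parses a 32-bit fat header (magic 0xCAFEBABE, big-endian, 20-byte fat_arch records). It is a simplified sketch, not disrobe's implementation: the 64-bit fat variant is not handled and the CPU table covers only two types.

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # note: Java class files share this magic
CPU_NAMES = {0x01000007: "x86_64", 0x0100000C: "arm64"}


def fat_slices(data: bytes) -> list[dict]:
    """List the architecture slices declared in a 32-bit Mach-O fat header."""
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a 32-bit fat header")
    slices = []
    for index in range(count):
        cputype, subtype, offset, size, _align = struct.unpack_from(">5I", data, 8 + 20 * index)
        slices.append({
            "cpu": CPU_NAMES.get(cputype, hex(cputype)),
            "subtype": subtype & 0x00FFFFFF,  # drop capability bits
            "offset": offset,
            "size": size,
        })
    return slices


# Synthetic two-slice header (no slice bodies) to exercise the parser.
header = struct.pack(">II", FAT_MAGIC, 2)
header += struct.pack(">5I", 0x01000007, 3, 0x4000, 100, 14)
header += struct.pack(">5I", 0x0100000C, 0, 0x8000, 200, 14)
for entry in fat_slices(header):
    print(entry)
```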

FairPlay-encrypted regions (App Store DRM) are reported detect-only via LC_ENCRYPTION_INFO: the decryption key is not present in the binary, so class-dump of those regions is an information-theoretic wall.

ActionScript 3 / Flash

disrobe parses SWF files, disassembles their embedded ActionScript 3 bytecode (AVM2), and lifts method bodies back to readable AS3 pseudocode via operand-stack abstract interpretation.

Commands

disrobe as3 disasm movie.swf --out out/
disrobe as3 tags movie.swf

disasm walks every DoABC and DoABCDefine tag, emits a per-instruction AVM2 listing as <label>.disasm.txt beside the JSON, and reconstructs class skeletons with lifted method bodies as <label>.source.as3. tags lists every tag in the SWF: TagCode, byte offset, and payload size.

Output shape (illustrative):

as3 disasm: OK
  input:        movie.swf
  swf version:  10
  abc blocks:   2
  classes:      6
  methods:      24
  instructions: 512
  source files: 2
  disasm files: 2
  out dir:      ./out
  manifest:     ./out/manifest.json
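A tag listing of the kind `tags` produces can be approximated in a few lines of Python. The sketch below follows the published SWF layout (8-byte header, bit-packed frame RECT, frame rate and count, then tag records with a 10-bit code and 6-bit length, 0x3F signalling a 32-bit length). It handles FWS and zlib-compressed CWS only; the function name and offset convention are assumptions.

```python
import struct
import zlib


def swf_tags(data: bytes) -> list[tuple[int, int, int]]:
    """Return (tag code, offset in the uncompressed body, payload size) per tag."""
    if data[:3] == b"FWS":
        body = data[8:]
    elif data[:3] == b"CWS":
        body = zlib.decompress(data[8:])
    else:
        raise ValueError("unsupported SWF signature")  # ZWS (LZMA) is not handled here
    nbits = body[0] >> 3                        # RECT field width lives in the top 5 bits
    pos = (5 + 4 * nbits + 7) // 8 + 4          # RECT, then frame rate and frame count
    tags = []
    while pos + 2 <= len(body):
        start = pos
        (head,) = struct.unpack_from("<H", body, pos)
        code, size = head >> 6, head & 0x3F
        pos += 2
        if size == 0x3F:                        # long form: explicit 32-bit length
            (size,) = struct.unpack_from("<I", body, pos)
            pos += 4
        tags.append((code, start, size))
        pos += size
        if code == 0:                           # End tag
            break
    return tags


# Synthetic movie: empty RECT, one DoABC tag (code 82), ShowFrame, End.
tag_stream = struct.pack("<HI", (82 << 6) | 0x3F, 3) + b"abc"
tag_stream += struct.pack("<H", 1 << 6) + struct.pack("<H", 0)
body = b"\x00" + b"\x00\x18" + b"\x01\x00" + tag_stream
movie = b"FWS" + bytes([10]) + struct.pack("<I", 8 + len(body)) + body
print(swf_tags(movie))
```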

Lifted output

The source lifter reconstructs class skeletons with property access, calls, arithmetic, and if / goto control flow by abstractly interpreting the operand stack. Any method the lifter could not fully model is prefixed with a /// DR-AS3-PARTIAL: line naming the unmodelled opcodes or fabricated operands; a partial recovery is never silently presented as complete.
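The idea of lifting by abstractly interpreting the operand stack can be shown with a toy Python sketch. It models a handful of AVM2-style opcodes on straight-line code only; the instruction encoding, the function name, and the text after the DR-AS3-PARTIAL marker are assumptions, and control flow is out of scope.

```python
def lift(instructions: list[tuple]) -> list[str]:
    """Turn a straight-line stack program into statements by symbolic execution."""
    stack: list[str] = []       # holds expression strings instead of runtime values
    lines: list[str] = []
    unmodelled: list[str] = []
    binary = {"add": "+", "subtract": "-", "multiply": "*"}
    for op, *args in instructions:
        if op == "pushbyte":
            stack.append(str(args[0]))
        elif op == "getlocal":
            stack.append(f"loc{args[0]}")
        elif op in binary and len(stack) >= 2:
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(f"({lhs} {binary[op]} {rhs})")
        elif op == "setlocal" and stack:
            lines.append(f"loc{args[0]} = {stack.pop()};")
        elif op == "returnvalue" and stack:
            lines.append(f"return {stack.pop()};")
        else:
            unmodelled.append(op)   # record it rather than guessing
    if unmodelled:
        lines.insert(0, "/// DR-AS3-PARTIAL: unmodelled " + ", ".join(unmodelled))
    return lines


program = [("getlocal", 1), ("pushbyte", 2), ("multiply",),
           ("getlocal", 2), ("add",), ("returnvalue",)]
print("\n".join(lift(program)))  # prints return ((loc1 * 2) + loc2);
```

Keeping the list of unmodelled opcodes and surfacing it in the output is what lets a partial result be labelled as partial instead of passing for a complete one.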

Obfuscation detection

disrobe fingerprints commercial AS3 obfuscators (secureSWF, DoSWF, Kindi, Irrfuscator, swfLock) and flags techniques: string encryption, name mangling, control-flow flattening, register and stack shuffle, string-pool-rebuild candidates. Each finding carries a confidence score. This is detection only: no decryption, pool rebuild, or unflattening is performed.

Limits

ABC erases local variable names (non-parameter slots surface as loc{n}) and the compiler erases generics before ABC; both are hard ceilings. FFDec is the mature full Flash decompiler and goes further on source-level recovery; disrobe covers SWF parsing and AVM2 disassembly as part of its chain pass, not as a Flash-decompiler replacement.

Mobile (Hermes / Flutter)

disrobe detects the runtime inside a mobile package, extracts React Native and other bundles, lifts Hermes bytecode to a JavaScript surface, and recovers Dart source or disassembles the ARM64 AOT snapshot from Flutter artifacts.

Runtime detection and extraction

disrobe mobile detect app.apk
disrobe mobile extract app.apk --out bundles/
disrobe mobile hermes index.android.bundle --out disasm/
disrobe mobile flutter libapp.so --out layout.json

detect classifies a package as one of: react-native-apk, react-native-ipa, hermes-raw-bytecode, flutter-libapp-so, flutter-dart-kernel, xamarin-apk, cordova-apk, capacitor-apk, nativescript-apk, ipa, android-apk-dex, or unknown. extract pulls bundles out of the container and writes a manifest.json listing each artifact.

Hermes

disrobe hermes decompile index.android.bundle --out surface/
disrobe hermes disasm index.android.bundle --out disasm/
disrobe hermes info index.android.bundle

decompile handles Hermes bytecode versions v60 through v96 and lifts each function back to pseudo-JavaScript. On a hermesc-built HBC v96 sample (8 functions, CI-gated), every function lifts with 0 fallback opcodes. On a non-redistributable 66 MiB production bundle, the 122,633-function table parsed without failure (measured locally, not CI-gated). disasm emits a per-function summary without a JS surface. info prints the version, function count, string count, and identifier count.

This is a structured lifter, not a full decompiler. Variadic call arguments are marked <arg?> where the Hermes frame-register layout is not modeled; unreconstructed opcodes appear in disasm form inline.

Output shape (illustrative):

hermes decompile: OK
  input:        index.android.bundle
  hermes ver:   96
  functions:    8
  with body:    8
  identifiers:  24
  strings:      12
  opcode cov:   100.0% (312 reconstructed / 0 fallback)
  if/loop/try:  3/2/1
  source:       ./out/index.android.bundle-hermes/index.android.bundle.js
  manifest:     ./out/index.android.bundle-hermes/manifest.json

Flutter

disrobe flutter dump libapp.so --out layout.json
disrobe flutter decompile libapp.so --out estimate.json
disrobe flutter kernel app.dill --out kernel.json
disrobe flutter disasm libapp.so --emit-listing
disrobe flutter map obfuscation_map.json --out map.json

Two distinct recovery paths cover two distinct Flutter artifacts.

Dart kernel (.dill / kernel_blob.bin). A kernel is the serialized Dart AST. disrobe parses the kernel binary format (magic 0x90abcdef): the footer component index, the string table, per-library class and procedure offset tables, and the embedded UriSource table. From the source table it recovers byte-exact original Dart bodies, sliced per procedure by the kernel file offsets. The recovered .dart source file is always written beside the JSON without needing --emit-source.

Output shape (illustrative):

flutter kernel: OK
  input:        app.dill
  format ver:   130
  libraries:    3
  classes:      8
  procedures:   21
  fields:       14
  bodies:       21 recovered (byte-exact Dart source from the kernel source table)
  strings:      112
  wrote:        ./out/app-dart-kernel.json
  dart source:  ./out/app-dart-kernel.recovered.dart
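A quick way to tell whether a file is a kernel at all is to check the magic quoted above. The Python sketch below assumes the format version is the big-endian 32-bit word immediately after the magic, which matches the kernel binary layout as I understand it; treat that field position and the function name as assumptions.

```python
import struct

KERNEL_MAGIC = 0x90ABCDEF


def kernel_format_version(data: bytes) -> int:
    """Return the format version of a Dart kernel file, or raise if the magic is wrong."""
    if len(data) < 8:
        raise ValueError("too short for a kernel header")
    magic, version = struct.unpack_from(">II", data, 0)  # both words are big-endian
    if magic != KERNEL_MAGIC:
        raise ValueError(f"bad magic {magic:#010x}")
    return version


sample = struct.pack(">II", KERNEL_MAGIC, 130)   # synthetic header only
print(kernel_format_version(sample))             # prints 130
```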

ARM64 AOT snapshot (libapp.so). The AOT snapshot is ordinary AArch64 machine code. disrobe locates the four _kDart*Snapshot* symbols, recovers class and method names from the isolate-data string table, scans frame prologues to bound functions, and disassembles each body to readable instructions with resolved direct-call and branch targets. flutter decompile also recovers the class table estimate, library URIs, and a string pool from the isolate image.

flutter dump reports the four Dart snapshot sections and their sizes. flutter map parses a Flutter obfuscation_map.json into a typed original-to-obfuscated lookup.

Exact Dart source for an optimized AOT function is not byte-recoverable from the machine code (register allocation and inlining are lossy); use the kernel path for source bodies.

Python pickle

Pickle is a code-execution format wearing a data format's clothes. Unpickling a crafted stream runs arbitrary code through __reduce__ / REDUCE, which makes weaponized .pkl and .pt files a recurring ML supply-chain vector. disrobe analyzes pickles statically: it never unpickles, never calls __reduce__, never executes a REDUCE opcode. You can audit what a stream would do on load before it ever touches a real interpreter.

Commands

disrobe pickle disasm model.pkl --out trace.txt
disrobe pickle decompile model.pkl --out graph.py
disrobe pickle safety model.pkl
disrobe pickle trace model.pkl
disrobe pickle polyglot suspicious.bin
disrobe pickle model-detect model.bin

The symbolic VM

disrobe pickle trace walks the opcode stream and reconstructs the object graph the same way a real unpickler would build it, but every operation is symbolic. A GLOBAL records a (module, name) reference without importing the module; a REDUCE records "this callable would be applied to these arguments" without calling it; the memo, stack, and STOP result are all inert values. The output is the full graph (objects, memo, resolved globals, reduce count) with nothing executed. decompile renders that graph back to equivalent Python assignments.
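The same "read the opcodes, execute nothing" idea can be sketched with Python's standard pickletools module, which parses a stream without unpickling it. This is a much smaller model than the symbolic VM described above: it only collects import references and counts REDUCE opcodes, and its STACK_GLOBAL resolution is a labelled heuristic. The function name is hypothetical.

```python
import pickle
import pickletools

# Opcodes that push a string; used to resolve STACK_GLOBAL heuristically.
STRING_OPS = {"STRING", "BINSTRING", "SHORT_BINSTRING", "UNICODE",
              "BINUNICODE", "SHORT_BINUNICODE", "BINUNICODE8"}


def trace_pickle(payload: bytes) -> dict:
    """Collect imports and the REDUCE count from the opcode stream; nothing runs."""
    imports: list[tuple[str, str]] = []
    strings: list[str] = []
    reduce_count = 0
    for opcode, arg, _pos in pickletools.genops(payload):
        if opcode.name in ("GLOBAL", "INST"):
            module, name = arg.split(" ", 1)     # arg is "module name"
            imports.append((module, name))
        elif opcode.name == "STACK_GLOBAL":
            # Heuristic: assume the two most recent string pushes are module and name.
            imports.append(tuple(strings[-2:]) if len(strings) >= 2 else ("?", "?"))
        elif opcode.name in STRING_OPS:
            strings.append(arg)
        elif opcode.name == "REDUCE":
            reduce_count += 1
    return {"imports": imports, "reduce_count": reduce_count}


crafted = b"cos\nsystem\n(S'echo hi'\ntR."       # GLOBAL os system, then REDUCE
print(trace_pickle(crafted))
print(trace_pickle(pickle.dumps([1, 2, 3])))     # serializing only; a plain list
```

A real symbolic VM tracks the full stack and memo so it knows which callable each REDUCE applies to; the string-history shortcut here can be fooled by memo indirection.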

Safety grading

disrobe pickle safety grades a stream into one of three severity tiers. Each finding is tagged with a confidence tier so a reviewer can tell a certain hit from an inference.

| Severity | Meaning |
| --- | --- |
| benign | No dangerous import, no reduce against a risky callable, no memo abuse |
| suspicious | A pattern that can be malicious in context (unusual import, opaque reduce, memo manipulation) |
| overtly_malicious | A reduce against a known code-execution sink (os.system, subprocess.Popen, builtins.eval / exec, __import__) |

| Confidence | Meaning |
| --- | --- |
| signature_certain | The finding follows directly from the opcodes (a GLOBAL os system then REDUCE) |
| pattern_inferred | A heuristic shape, not a literal signature match |
| context_dependent | Risky only depending on how the loader uses it |

The report also surfaces the resolved import list, the REDUCE count, and the unused-memo count (a common obfuscation tell), so a triage decision does not require reading the raw opcodes.
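To show how the three severity tiers could fall out of the import list and REDUCE count, here is a deliberately coarse Python sketch. The sink set is taken from the table above (with the platform aliases posix.system and nt.system added as an assumption); the decision rule is illustrative and far simpler than a real grader, which would check which callable each REDUCE actually targets.

```python
# Code-execution sinks named in the severity table, plus platform aliases (assumed).
SINKS = {
    ("os", "system"), ("posix", "system"), ("nt", "system"),
    ("subprocess", "Popen"),
    ("builtins", "eval"), ("builtins", "exec"), ("builtins", "__import__"),
}


def grade(imports: list[tuple[str, str]], reduce_count: int) -> str:
    """Map an import list and REDUCE count to a severity tier (coarse sketch)."""
    if reduce_count > 0 and any(ref in SINKS for ref in imports):
        return "overtly_malicious"
    if reduce_count > 0 or imports:
        return "suspicious"      # something is imported or called; needs review
    return "benign"


print(grade([("os", "system")], 1))            # prints overtly_malicious
print(grade([("collections", "OrderedDict")], 1))
print(grade([], 0))                            # prints benign
```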

Polyglots and model files

disrobe pickle polyglot catches the trick of hiding a malicious pickle inside a zip, zip64, or tar that a model loader will open as an archive and then unpickle. disrobe pickle model-detect recognizes PyTorch, TorchScript, and numpy containers and lists every embedded pickle stream, so a multi-file .pt archive is enumerated rather than treated as one opaque blob.
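A minimal version of "enumerate the pickles inside an archive" can be sketched with the standard zipfile module. The check below only looks for the two-byte PROTO header of pickle protocols 2 through 5, so protocol 0 and 1 streams (which have no header) are missed; the function name is hypothetical and tar handling is omitted.

```python
import io
import pickle
import zipfile


def embedded_pickles(blob: bytes) -> list[str]:
    """Names of zip members that start with a pickle PROTO header (protocol 2 to 5)."""
    hits = []
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        for info in archive.infolist():
            with archive.open(info) as member:
                head = member.read(2)            # peek only; never unpickle
            if len(head) == 2 and head[0] == 0x80 and 2 <= head[1] <= 5:
                hits.append(info.filename)
    return hits


# Build a small archive in memory to exercise the scan.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("archive/data.pkl", pickle.dumps({"weights": [0.1, 0.2]}))
    archive.writestr("archive/version", b"3\n")
print(embedded_pickles(buffer.getvalue()))       # prints ['archive/data.pkl']
```

Reading only the first two bytes of each member keeps the scan cheap and avoids inflating a large member just to classify it.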

Python bindings

The same static suite is available as a library. Nothing is ever unpickled.

import disrobe
from disrobe import PickleSafety

with open("model.pkl", "rb") as fh:
    payload: bytes = fh.read()

safety: PickleSafety = disrobe.pickle_safety(payload)

severity: str | None = safety.severity          # "benign" | "suspicious" | "overtly_malicious"
finding_count: int   = safety.finding_count
reduce_count: int    = safety.reduce_count
listing: str         = disrobe.pickle_disasm(payload)

Shell / PowerShell

disrobe deobfuscates PowerShell, Bash, Batch, VBScript, and VBA. It reverses the PowerShell obfuscator families listed under Covered families as well as Bashfuscator, recovers VBA source from Office documents, and decompiles VBA p-code with stomp detection.

Commands

disrobe shell deob payload.ps1 --out recovered.ps1
disrobe shell detect payload.ps1

deob auto-detects the dialect and obfuscator family, applies the right reversal, and writes the recovered source plus a manifest.json. detect reports the dialect, family, confidence score, and detection markers without writing output.

Output shape (illustrative):

shell deob: OK
  input:        payload.ps1
  dialect:      PowerShell
  family:       InvokeObfuscationToken
  confidence:   0.94
  markers:      ["iex", "token-replace"]
  wrote:        ./out/payload.deob.ps1
  manifest:     ./out/payload.deob.manifest.json

Covered families

| Dialect | Families |
| --- | --- |
| PowerShell | Invoke-Obfuscation (Token, AST, String, Encoding, Compress, Launcher), Invoke-Stealth, PowerHell, Chameleon, psobf, ISESteroids |
| Bash | Bashfuscator (Token, String, Obfuscate, Compress modes), indirection peeler |
| Batch | .bat / .cmd random-char and set-indirection patterns |
| VBA / VBScript | VBA module source recovery, VBScript WSH patterns |

VBA source and p-code

From a .docm / .xlsm / .bin Office container, disrobe parses the dir stream (MS-OVBA), maps each module to its stream and TextOffset, and MS-OVBA-decompresses the CompressedSourceCode at that offset to emit the original .bas / .cls text per module (multi-chunk compression and CopyToken bit-count edges handled). Validated against real Word and Excel documents authored via COM, byte-for-byte against the known module text.

The p-code path lifts a 264-opcode table across VBA3 / VBA5 / VBA6 / VBA7 (32-bit and 64-bit) with identifier resolution. VBA-stomping detection runs a p-code-vs-source classifier that flags modules whose compiled p-code diverges from the stored source and recovers the stomped behavior from the p-code.

Auto-dispatch

disrobe auto detects the dialect and routes obfuscated shell automatically:

disrobe auto payload.ps1 --out recovered/

Containers and archives

Before disrobe can decompile anything, it often has to get inside a container. The disrobe-binfmt layer detects 98 archive, installer, filesystem, and firmware formats and writes member bytes in-tree for all 98, with auto-detection, recursive chaining through nested layers, and shared zip-slip and decompression-bomb guards.

A recursive carve-everything engine scans for every known magic, models chunked payloads, recurses by depth, and uses entropy to separate code from padding.
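The entropy signal is easy to illustrate. The Python sketch below computes Shannon entropy per fixed, non-overlapping window; the 4096-byte window size and the function names are assumptions, and a real carver would combine this with magic-number hits rather than use entropy alone.

```python
import math
from collections import Counter


def shannon_entropy(window: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not window:
        return 0.0
    total = len(window)
    return -sum(n / total * math.log2(n / total) for n in Counter(window).values())


def windowed_entropy(data: bytes, size: int = 4096) -> list[float]:
    """Entropy of each consecutive window of `size` bytes."""
    return [shannon_entropy(data[i:i + size]) for i in range(0, len(data), size)]


padding = b"\x00" * 4096                 # zero fill scores 0.0
dense = bytes(range(256)) * 16           # every byte value equally often scores 8.0
print(windowed_entropy(padding + dense))
```

Padding and sparse tables sit near the bottom of the scale, machine code and text land in the middle, and compressed or encrypted payloads approach 8 bits per byte.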

Supported formats

| Category | Formats |
| --- | --- |
| Archives and installers | ZIP (incl. ZIP64 + AES), tar.gz / tar.bz2 / tar.xz / tar.zst, 7z, RAR4 and RAR5 (stored members from both; RAR5 LZ "normal" method decoded in-tree; RAR 2.9/3.x LZ used by compressed RAR4 is named per-entry, not decoded in-tree), .cab, MSI, MSIX / APPX, NSIS (solid and non-solid), Inno Setup (decoded setup-data block stream; per-file split via version-specific TSetupHeader parse is the documented limit), InstallShield (stored and zlib members), .deb, .rpm (metadata), AppImage, Flatpak, Snap |
| Bare compression streams | gzip, bzip2, zstd, lzma, lzip, lz4-frame, zlib, .Z (Unix compress) |
| Legacy archives | ar, arj (methods 1-3 decoded; method 4 carved verbatim), arc (rle / squeeze / lzw decoded; methods 5-7 carved verbatim), lzh, lzop, FreeBSD uzip, Xamarin xalz, par2, ELF appended-overlay carve, StuffIt (classic stored forks decoded; compressed forks carved verbatim with a documented note), partclone (decoded) |
| Embedded-linux filesystems | squashfs, cramfs, ext4, romfs, minixfs, jffs2, UBI + UBIFS, yaffs, erofs (chunk and lcluster lz4 / deflate / zstd decoded; microlzma and compact index carved), NTFS, android-sparse, btrfs-send |
| Disk images and partitions | GPT and MBR (partition tables parsed; each partition carved and recursed in-tree), VHD (fixed and dynamic BAT), VHDX (region table + BAT; logical disk materialized from the block-allocation table, then partition-carved and FAT12 / 16 / 32 walked to pull individual stored files), WIM (header resources with XPRESS / LZX / LZMS chunk payloads decompressed in-tree), FAT12 / 16 / 32 (boot sector, FAT chain walk, root and subdirectory traversal) |
| Apple | .dmg (UDIF: koly trailer + blkx mish chunks; ADC / zlib / bzip2 / LZFSE / LZMA chunk decoders; then HFS+ catalog walk extracts individual files, all in-tree), .pkg (xar TOC + gzip / bzip2 heap, extracted in-tree) |
| Vendor firmware | D-Link (SHRS / encrypted-img AES / alpha / fpkg), EnGenius XOR, Autel ECC table, QNAP PC1, plus CRC-verified Netgear (chk / trx), Xiaomi, Tesla, HP, Moxa, INSTAR, and Airoha carves; OTP-AES Airoha firmware is an information-theoretic wall and is carved verbatim |
| Standalone executables | Bun --compile binaries (embedded JS module graph + sourcemaps), Unity AssetBundle (UnityFS) |
| App / runtime | Electron .asar, Docker image tarball, OCI image manifest + layers, ISO 9660 + Joliet (extracted in-tree) |

Extraction

Most extraction happens implicitly inside disrobe auto, which detects a container, extracts it, and recurses into the contents. Archive-shaped inputs are also available directly:

disrobe py extract package.whl --out extracted/
disrobe auto installer.msi --out extracted/
disrobe auto firmware.bin --out extracted/ --recursive --max-depth 6

--recursive enables recursive carve; --max-depth limits nesting depth (default 8).

Safety guards

Every extractor shares the quota machinery in crates/disrobe-binfmt/src/quota.rs:

  • Per-entry size cap and aggregate size cap defuse decompression bombs.
  • Recursion-depth cap defuses container-in-container bombs.
  • Zip-slip path sanitization (sanitize_entry_path): every entry path is sanitized so no extraction can escape the output directory, on every format.

Bypasses of any of these are treated as security issues; see the security policy.
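The zip-slip guard described above can be illustrated with a small Python sketch. It is not the Rust sanitize_entry_path implementation; the function name is hypothetical, and the policy shown (strip leading slashes, reject any ".." or drive-letter component) is one reasonable assumption about how such a sanitizer behaves.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def safe_member_path(entry: str) -> PurePosixPath | None:
    """Return a relative path that stays under the output directory, or None to reject."""
    parts = []
    for part in entry.replace("\\", "/").split("/"):
        if part in ("", "."):
            continue                 # also drops the leading "/" of absolute paths
        if part == ".." or ":" in part:
            return None              # traversal or drive-letter component
        parts.append(part)
    return PurePosixPath(*parts) if parts else None


for name in ["docs/readme.txt", "../../etc/passwd", "/etc/passwd", "C:\\Windows\\evil.dll"]:
    print(f"{name!r} -> {safe_member_path(name)}")
```

Rejecting an entry outright is stricter than rewriting it, and it avoids two hostile names collapsing onto the same sanitized path.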

CLI overview

Every disrobe operation follows one shape:

disrobe <pass> <action> <input> [--out <path>] [flags]

A few top-level commands (auto, chain, scan, ioc, strings, behavior, identify, detect, query, capabilities, extract, report, diff, status, verify, passes, doctor) take their arguments directly rather than through a pass/action pair.

Discovering the surface

disrobe --help                # every subcommand
disrobe <pass> --help         # actions and flags for one pass, e.g. `disrobe py --help`
disrobe passes                # one-line capability summary per registered pass
disrobe explain DR-CLI-0030   # look up any error code

Subcommand inference is enabled: unambiguous prefixes work (disrobe dec ... resolves if only one subcommand starts with dec).
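Prefix inference of this kind boils down to "exactly one candidate starts with the typed text". The Python sketch below shows that rule on a hand-picked subset of the top-level commands; it is a model of the behavior, not the parser disrobe uses, and the function name is hypothetical.

```python
def infer_subcommand(typed: str, commands: list[str]) -> str:
    """Resolve a typed prefix to the single command it identifies, or raise."""
    if typed in commands:
        return typed                               # exact names always win
    matches = [name for name in commands if name.startswith(typed)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown subcommand {typed!r}")
    raise ValueError(f"ambiguous prefix {typed!r}: {', '.join(sorted(matches))}")


COMMANDS = ["detect", "diff", "doctor", "scan", "status", "strings"]
print(infer_subcommand("sc", COMMANDS))    # prints scan
print(infer_subcommand("det", COMMANDS))   # prints detect
```

With this subset, `st` is rejected as ambiguous because both status and strings match, so scripts are safer spelling commands out in full.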

Output formats

The output format is a global flag, so it applies to any command:

| Flag | Output |
| --- | --- |
| (default) | Human-readable text |
| --json | A single structured JSON document |
| --ndjson | Newline-delimited JSON (streaming) |
| --sarif | SARIF 2.1.0, for GitHub code scanning and other SARIF consumers |

disrobe scan firmware.bin --sarif > findings.sarif
disrobe py decompile m.pyc --json
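A SARIF file can be consumed with nothing more than a JSON parser. The Python sketch below counts results per rule using the standard SARIF 2.1.0 structure (runs, results, ruleId); the rule ids in the sample are made up for illustration and are not disrobe rule names.

```python
import json
from collections import Counter


def sarif_rule_counts(document: str) -> Counter:
    """Count SARIF results per ruleId across every run in the document."""
    sarif = json.loads(document)
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "<none>")] += 1
    return counts


# Hypothetical findings document with the minimal SARIF 2.1.0 skeleton.
sample = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "example-scanner"}},
        "results": [
            {"ruleId": "example-aws-key", "message": {"text": "possible key"}},
            {"ruleId": "example-aws-key", "message": {"text": "possible key"}},
            {"ruleId": "example-private-key", "message": {"text": "PEM block"}},
        ],
    }],
})
print(sarif_rule_counts(sample).most_common())
```

In practice the input would be the contents of a file such as findings.sarif, for example `sarif_rule_counts(open("findings.sarif").read())`.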

The standard recovery loop

disrobe auto input.bin --out recovered/ --capture-stages   # recover
disrobe status                                              # what landed in ./out/
disrobe context --out recovered/                           # per-pass verdict + confidence
disrobe verify recovered/final/*.dr                        # check envelope integrity

The next pages cover global flags in full, the complete command reference, the diff and guard tooling, and the daemon surface.

Global flags

These flags are accepted on every subcommand (they are declared global = true on the root parser).

Output and verbosity

FlagEffect
-v, -vv, -vvvIncrease log verbosity.
-q, --quietSuppress non-error output.
--color <auto|always|never>Control ANSI color in terminal output.
--jsonEmit a structured JSON document instead of human text.
--ndjsonEmit newline-delimited JSON (streaming).
--sarifEmit SARIF 2.1.0 (GitHub code scanning, etc.).
--progress <auto|always|never>Progress-bar rendering. auto renders only on a TTY.

Execution control

FlagEffect
--seed <N>RNG seed for any non-deterministic backend (keeps runs reproducible).
--config <PATH>Load a .disrobe.toml config file. Without it, disrobe walks up from the CWD to discover one. See project configuration.
--in-placeRewrite the input file in place.
--forceOverwrite existing outputs without prompting.
-j, --threads <N>Worker thread-pool size (defaults to detected CPU count).
--no-cacheBypass the .dr envelope cache. Output is identical with or without this; it is a performance toggle, not a correctness one.
--dry-runReport what would happen without writing any output.

LLM sidecar flags

The full --llm family is also global. See LLM sidecar and provenance for the complete pack/category model. Summary:

FlagEffect
--llmAlias for --metadata-pack-4 (full bundle, auth-gated categories included).
--metadata-pack-1Pack-1: ast + disasm + symbols + strings.
--metadata-pack-2Pack-2: pack-1 + cfg + types + imports + provenance.
--metadata-pack-3Pack-3: pack-2 + dfg + signatures + constants + roundtrip + sourcemap + manifest.
--metadata-pack-4Pack-4: pack-3 + confidence + opcode-coverage + pii-map + decryption-keys (auth-gated).
--ast, --disasm, --cfg, --dfgAdd individual AST / disassembly / CFG / DFG categories.
--symbols, --strings, --types, --importsAdd symbols / strings / recovered-types / imports categories.
--constants, --signatures, --provenanceAdd constants / function-signatures / provenance categories.
--roundtrip-verdict, --source-map, --manifest-catAdd roundtrip-verdict / source-map / manifest categories.
--confidence, --opcode-coverage, --pii-mapAdd confidence-scores / opcode-coverage / pii-map categories.
--decryption-keysAdd decryption-keys category (requires --i-have-authorization).
--metadata-include <cats> / --metadata-exclude <cats>Toggle comma-separated categories after applying a pack preset.
--metadata-out <PATH>Override the bundle output path (default: <stem>.disrobe.llm.json next to the primary output).
--metadata-format <json|jsonl|cbor|msgpack>Bundle serialization format (default json).
--llm-briefsAlso emit AGENTS.md and SKILL.md reconstruction briefs next to the bundle.
--i-have-authorizationUnlocks the auth-gated decryption-keys category and grey-zone protector behavior.

The authorization gate

--i-have-authorization is the single gate guarding behavior that is legally sensitive: grey-zone commercial-protector reversal and the decryption-keys LLM category. Without it, those paths refuse to run (DR-CLI-0420 for decryption keys). It is your assertion that you are authorized to analyze the input under the statutory framing in LEGAL.md.

Command reference

The authoritative source is always disrobe <command> --help. This page is a complete map of the command surface. [--out] and the standardized [--emit ...] selector are available on most passes; see the global flags for flags that apply everywhere.

Python

CommandPurpose
disrobe py decompile <pyc>Decompile a .pyc to source. --backend native (the only supported value). --no-roundtrip skips the recompile-equivalence check.
disrobe py disasm <pyc>Per-instruction disassembly (1.0-3.15 + PyPy/MicroPython/Jython/IronPython/Brython).
disrobe py deob <src>Peel a source obfuscator. --cleanup runs a ruff-AST fold.
disrobe py extract <archive>Extract a wheel / sdist / egg / .whl / .zip / any archive.
disrobe py sourcedefender <pye>Decrypt a SourceDefender .pye envelope.
disrobe pyarmor unpack <py>Unpack PyArmor v6-v9-pro. --allow-dynamic permits the dynamic-hook fallback (trusted/sandboxed samples only). --dynamic-timeout <SECS>. --mode auto|standard|super. --target <PYVER>. --allow-bcc. --strict. --no-cextract / --cextract-only. --all-emits writes stubs for all 12 emit kinds. --cache <DIR>.
disrobe pyinstaller extract <exe>Extract a PyInstaller build (2.x-6.20+, AES decrypt).
disrobe pyinstaller detect <exe>Report cookie / Python version / TOC offsets without extracting.
disrobe pyfreeze extract <exe>Extract cx_Freeze / py2exe / shiv / pex / PyOxidizer / Briefcase.
disrobe pyfreeze detect <exe>Identify the freezer without extracting.
disrobe nuitka detect|extract|symbols|decompile|const <input>Nuitka flavor detect, --onefile extract, symbol scan, constants decompile, single .const decode.

JavaScript / WebAssembly

CommandPurpose
disrobe js deob <js>Deobfuscate (obfuscator.io, JS-Confuser, Jscrambler, esoteric encoders).
disrobe js unbundle <js>Split a bundle into per-module sources (11 bundlers).
disrobe js v8 <blob>Inspect V8 .jsc / Node SEA / nexe / nw.js / Electron .asar.
disrobe wasm decompile <wasm>Lift to --target json|rust|ts|wat|c.
disrobe wasm deob <wasm>Reverse Wasm obfuscator families.
disrobe wasm component <wasm>Parse a Component Model envelope.
disrobe wasm gc-types <wasm>Recover the GC type graph.

JVM / Android / .NET

CommandPurpose
disrobe jvm decompile <class|jar|dex|apk>Decompile via --backend cfr|vineflower|procyon|jadx.
disrobe jvm extract <jar|apk>Extract container + dump classfile inventory.
disrobe jvm backendsReport JVM/Android backends on PATH.
disrobe apk <apk>Decode the binary AndroidManifest.xml, map resource ids to names, and dump each signer certificate's SHA-256. --out <DIR> writes the decoded manifest and resource table to disk.
disrobe dotnet decompile <dll|exe>Decompile via --backend ilspy|dnspy|dnspyex|de4dot.
disrobe dotnet analyze <dll>PE/CLR metadata, protector detection, R2R + NativeAOT probe.
disrobe dotnet backendsReport .NET backends on PATH.

Native

CommandPurpose
disrobe native decompile <bin>Ghidra-headless decompile. --emit source,disasm,ast,cfg,ir,manifest,sourcemap,symbols,strings,imports,signatures,report.
disrobe native symbols <bin>Dump symbols, sections, segments, imports, and debug info.
disrobe native identify <bin>Fingerprint compiler / packer / protector / installer, each routed to its pass.
disrobe native unpack [bin]Detect + unpack UPX/kkrunchy/NSPack/Petite/MPRESS/MEW/FSG/ASPack/PECompact/Yoda's Crypter via in-house decoders + x86 stub emulator. Input is optional; --list shows all supported packers.
disrobe native devirt <bin>Devirtualize the bytecode-VM tier: recover the handler table, lift to a re-executable IR + pseudo-code.
disrobe native export <bin>Unpack, recover symbols, and export a backend-ready bundle: a rebuilt loadable PE + a Ghidra post-script / IDAPython / JSON symbol map. --format ghidra|ida|json (default ghidra).
disrobe native disasm <bin>Per-function listing / --emit cfg-dot CFG / --emit json / --raw linear sweep (--syntax intel|at&t|nasm|masm). Accepts a .dr envelope.
disrobe native callgraph <bin>Whole-program call graph as Graphviz DOT.
disrobe native patch <bin>Rewrite bytes at a VA (or nop a span) and revalidate the image.
disrobe native sigmaker <bin>Wildcarded byte signature from a function, uniqueness-tested.
disrobe native diff <a> <b>Match functions across two builds by content + CFG fingerprint.
disrobe native entropy <bin>4KB sliding-window Shannon entropy; ASCII heat-strip + byte histogram + packed-region runs. --format text|json|svg (default text), --svg <out> for a dark-theme entropy map with section overlays.
disrobe native signatures <bin>Crypto-constant fingerprints (AES, SHA, ChaCha20). --flirt <sig> to match a FLIRT DB.
disrobe native fingerprint <bin>Aggregate crypto-constant + FLIRT + string-xref sidecar at .disrobe/fingerprints/<stem>.json. --flirt <sig>.
disrobe native sbom <bin>CycloneDX 1.5 SBOM from cargo-auditable metadata embedded in the binary.
disrobe native graph <bin>Import/export table as Graphviz DOT.
disrobe query <bin|.dr> <q...>Queryable IR: functions, calls-to <sym>, xrefs-to <sym>, string-decoders, complexity-over <n>, capability <network|crypto|filesystem|process>. Accepts a raw binary or a Disasm-rung .dr envelope.
disrobe capabilities <bin|.dr>Rule engine over the IR, mapping behaviors to MITRE ATT&CK + MBC with per-match evidence.

Other languages

CommandPurpose
disrobe go recover|info <bin>Go symbol recovery / build fingerprint.
disrobe lua decompile|deobfuscate|detect <chunk>Lua decompile / obfuscator peel / dialect detect.
disrobe php decode|deobfuscate|extract <input>Encoder decode / eval-chain peel / Phar extract.
disrobe ruby decompile|detect <input>Ruby artifact analysis / flavor detection.
disrobe beam parse|lift|disasm <beam>BEAM chunk parse / Core Erlang lift / Code disasm.
disrobe pickle disasm|decompile|safety|trace|polyglot|model-detect <input>Pickle static analysis suite.
disrobe swift classdump|shield-undo|confidential-decrypt <input>Swift/ObjC class-dump, SwiftShield rename-undo, Confidential XOR-decrypt.
disrobe macho dump|classdump|slices <input>Mach-O / fat / .ipa inspection.
disrobe as3 disasm|tags <swf>AS3 DoABC disasm / SWF tag list.
disrobe hermes decompile|disasm|info <bundle>Hermes JS-surface lift / disasm / header.
disrobe flutter dump|decompile|kernel|disasm|map <input>Flutter Dart AOT + kernel inspection.
disrobe mobile detect|extract|hermes|flutter <input>Mobile runtime pipeline.

Chain, envelope, and forensics

CommandPurpose
disrobe detect <input>Run every obfuscator/packer catalog detector against a file and report each hit (pass, obfuscator, confidence, markers).
disrobe auto <input>Auto-detect + chain. --max-depth <N> (default 8), --capture-stages, --emit recovery, --dry-run. A directory input is batch-processed recursively (--include <GLOB>, --exclude <GLOB>, --batch-max-depth <N>, --jobs <N>) into an aggregate manifest.json.
disrobe chain <input>Explicit pipeline. --chain 'auto:8' or 'pyarmor+py-decompile', --chain-pin <ver>, --capture-stages.
disrobe diff <left> <right>Structurally diff two chain.json documents (passes, stage BLAKE3 hashes, sizes, verdicts).
disrobe guard verify <subject> --reference <ref>Verify a subject chain.json's per-stage output hashes against a committed reference.
disrobe guard check <path> [--root <subtree>...]Deny writes to ground-truth stage paths (out/**/stages, out/**/final, .disrobe-stage-lock). --root adds extra protected subtrees (repeatable).
disrobe envelope create|inspect|verify|diff|migrate-check <dr>.dr envelope operations.
disrobe verify <dr>Alias for disrobe envelope verify.
disrobe scan <path>Scan raw bytes for leaked credentials.
disrobe ioc <path> [--format text|json|sarif] [--defang]Extract indicators of compromise (URLs, IPs, domains, emails, paths, registry keys, wallets, crypto constants); decodes one base64/hex layer.
disrobe strings <path> [--min-len N] [--no-decode]Cross-format string extraction: ASCII + UTF-16LE, with single-byte XOR / base64 / ROT-n / stack-string deobfuscation.
disrobe behavior <path>Behavior / capability summary across 7 categories, tagged with MITRE ATT&CK technique ids.
disrobe yara parse <path>Parse a YARA ruleset into a typed AST (read-only, no matching).
disrobe yara generate <input> [--name N] [--sha256 H] [--date D]Generate a candidate YARA rule from an artifact; output round-trips through the parser.
disrobe statusSummarize ./out/: per-stage counts, sizes, manifests.
disrobe context --out <dir>Summarize a recovery report (status, confidence, verdict, provenance).
disrobe report <dir-or-input> [--format text|json|markdown|html]Consolidate a completed run (or raw input) into a forensic summary: identity, topology, per-stage verdicts/scores, artifact inventory, timings. --format html emits a self-contained, offline, dark-theme report (inline SVG bars, IOC + ATT&CK tables, XSS-escaped).

Workspace, agents, and meta

Command | Purpose
disrobe init [--ide claude|cursor|windsurf|aider] [--force] | Scaffold a .disrobe/ workspace.
disrobe config [show] | Print the resolved .disrobe.toml config (honors --json). See project configuration.
disrobe config init [--out <path>] [--force] | Write a documented .disrobe.toml template.
disrobe annot refresh|regenerate | Rebuild a symbol annotation file.
disrobe rename <old> <new> [--note] | Record an append-only rename.
disrobe passes | List every registered pass with a one-line capability summary.
disrobe explain <code> | Look up a DR-* error code and print its description and common fixes.
disrobe doctor [--auto-install] [-y] | Probe ~50 optional external tools; report installed, missing, or stale.
disrobe install <tool> [--list] [-y] [--dry-run] | Install one optional tool via the native package manager.
disrobe install-deps [<dep>] [--all] [--dry-run] | Install heavyweight deps (Ghidra) from upstream releases.
disrobe serve [--bind <ADDR>] [--stdio|--mcp|--grpc] | Run the daemon. See the daemon.
disrobe completions <shell> [--install] [--rc-file <PATH>] | Generate shell completions (bash, zsh, fish, PowerShell, elvish).
disrobe man [--out <dir>] | Generate man pages (one .1 per subcommand).
disrobe bug-report [--out <PATH|->] | Collect environment, manifests, and tooling versions into a markdown bug report.
disrobe self-update [--check-only] [--dry-run] | Print self-update guidance (source-only distribution; no network by default).

Project configuration (.disrobe.toml)

disrobe reads an optional .disrobe.toml so a project can set its own defaults instead of repeating the same global flags on every invocation.

Resolution order

Values are merged from three layers, lowest precedence first:

  1. Built-in defaults: what disrobe does with no config and no flags.
  2. .disrobe.toml: the discovered or explicitly named config file.
  3. CLI flags: anything you type on the command line.

A flag you actually pass always wins over the config file; a flag you leave off falls back to the config value, and only then to the built-in default. "Did the user pass this flag" is decided by clap's value source, so there is no ambiguity between --json being absent and being false.
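
The three-layer merge can be sketched in a few lines of Python. This is an illustration of the precedence rule only, not disrobe's Rust implementation; BUILTIN_DEFAULTS and resolve are hypothetical names, and the default values shown are placeholders.

```python
from typing import Any

# Hypothetical built-in defaults, for illustration only.
BUILTIN_DEFAULTS: dict[str, Any] = {"json": False, "max_depth": 8, "color": "auto"}


def resolve(file_values: dict[str, Any], cli_values: dict[str, Any]) -> dict[str, Any]:
    """Merge the three layers, lowest precedence first.

    cli_values must contain only flags the user actually typed, so an
    omitted flag never masks a value from the config file.
    """
    resolved = dict(BUILTIN_DEFAULTS)   # layer 1: built-in defaults
    resolved.update(file_values)        # layer 2: .disrobe.toml
    resolved.update(cli_values)         # layer 3: explicitly passed flags
    return resolved


effective = resolve({"json": True, "max_depth": 4}, {"max_depth": 12})
print(effective)
```

The key design point is that the CLI layer holds only explicitly passed flags, which is why an absent --json cannot override json = true from the file.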

Discovery

  • With --config <PATH>, that file is loaded. A missing path is a hard error (DR-CLI-0332).
  • Without --config, disrobe walks up from the current working directory looking for .disrobe.toml, exactly the way cargo finds Cargo.toml. The first one found wins. If none exists, the built-in defaults are used.

Malformed TOML (DR-CLI-0330) and unknown keys are hard errors: a typo fails fast instead of being silently ignored.
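
The walk-up lookup described above can be approximated as follows. This is a minimal sketch of the search order, assuming nothing about disrobe's internals; discover_config is a hypothetical helper name.

```python
from pathlib import Path
from typing import Optional


def discover_config(start: Path, name: str = ".disrobe.toml") -> Optional[Path]:
    """Return the nearest config file in start or any ancestor, else None."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / name
        if candidate.is_file():
            return candidate        # the first one found wins
    return None                     # caller falls back to built-in defaults
```

Calling discover_config(Path.cwd()) mirrors the no --config case; with --config the search is skipped entirely and a missing file is an error rather than a fallback.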

The config command

Command | Purpose
disrobe config / disrobe config show | Print the resolved effective config and the file it came from. Honors --json.
disrobe config init [--out <PATH>] [--force] | Write a fully documented .disrobe.toml template (default ./.disrobe.toml).

Schema

[output]
dir = "out"                  # default output directory for chain/auto runs
emit = ["source", "manifest"] # default --emit kinds where a pass accepts them
json = false                 # default machine-output toggles (CLI flags still override)
ndjson = false
sarif = false
color = "auto"               # auto | always | never
progress = "auto"            # auto | always | never
verbosity = "warn"           # warn | info | debug | trace
quiet = false

[execution]
threads = 8                  # worker pool size (default: detected CPU count)
force = false
in_place = false
no_cache = false
cache_dir = "/var/cache/disrobe"  # content-addressed .dr envelope cache (default: OS cache dir)
dry_run = false
seed = 42                    # RNG seed for non-deterministic backends
max_depth = 8                # default chain depth for `auto`

[backends]
py = "native"                # native (in-tree CPython 1.0..3.15 engine; the only supported value)
jvm = "cfr"                  # cfr | vineflower | procyon | jadx
dotnet = "ilspy"             # ilspy | dnspy | dnspyex | de4dot
wasm = "wat"                 # json | rust | ts | wat | c
lua = "native"

[passes]
enable = ["pyarmor.unpack", "py.decompile"]  # restrict chain runs to these passes
disable = ["native.packer-unpack"]           # never run these passes

All tables and all keys are optional. An empty file is valid and resolves to the built-in defaults.

Batch directory processing

disrobe auto accepts a directory as well as a single file. Point it at a directory and it walks the tree, runs the auto-chain on every file, writes each file's outputs under out/<relative-stem>/, and emits one aggregate out/manifest.json.

Single-file behavior is unchanged: disrobe auto <file> still writes chain.json + recovery.json into a single out dir.

Usage

disrobe auto ./samples
disrobe auto ./samples --out ./out/triage --include '**/*.pyc' --exclude '*_test.*' --jobs 4

If --out is omitted, batch output lands in ./out/<dir-name>-batch/.

Flags (batch-only)

Flag | Effect
--batch-max-depth <N> | Maximum directory recursion depth (default: unlimited). Depth 0 is the directory itself; depth 1 is its immediate children.
--include <GLOB> | Only process files matching this glob. Repeatable. With no include, all files are in scope.
--exclude <GLOB> | Skip files matching this glob. Repeatable. Exclude wins over include.
--jobs <N> | Bounded worker concurrency. Default is 1, kept conservative because chains can be memory-heavy. Raise it on machines with headroom.

The --max-depth <N> (default 8), --capture-stages, --emit recovery, and global flags continue to apply. --max-depth is the per-file chain depth; --batch-max-depth is the directory recursion depth.

Glob syntax

Globs match against the slash-normalized path relative to the root.

Token | Matches
* | Any run of characters within a single path segment (does not cross /).
** | Any run including / (spans directories).
? | Exactly one non-/ character.
[abc], [a-z], [!0-9] | A character class, with !/^ negation and a-z ranges.

A bare pattern with no / (for example *.bin) also matches files in subdirectories, so the common "all .bin files" case works without writing **/.
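
To make the token rules concrete, here is a rough Python translation of a glob into a regular expression. It is a sketch under stated assumptions, not disrobe's matcher: glob_to_regex and matches are hypothetical names, ** is read literally as "any run of characters" (so **/*.pyc needs at least one / in the path), and character classes are not restricted from matching /.

```python
import re


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a glob into an anchored regex following the token table."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if pattern.startswith("**", i):
            out.append(".*")                     # ** may cross '/'
            i += 2
            continue
        if ch == "[":
            negated = pattern[i + 1:i + 2] in ("!", "^")
            start = i + 2 if negated else i + 1
            end = pattern.find("]", start + 1)   # a class needs at least one member
            if end != -1:
                body = pattern[start:end].replace("\\", "\\\\")
                out.append("[" + ("^" if negated else "") + body + "]")
                i = end + 1
                continue
        if ch == "*":
            out.append("[^/]*")                  # stays inside one path segment
        elif ch == "?":
            out.append("[^/]")                   # exactly one non-'/' character
        else:
            out.append(re.escape(ch))            # literal, including an unmatched '['
        i += 1
    regex = "".join(out)
    if "/" not in pattern:
        regex = "(?:.*/)?" + regex               # bare pattern: any directory depth
    return re.compile(regex + r"\Z")


def matches(pattern: str, relative_path: str) -> bool:
    """Match against the slash-normalized path relative to the root."""
    return glob_to_regex(pattern).match(relative_path.replace("\\", "/")) is not None
```

For example, matches("*.bin", "a/b/c.bin") holds because the pattern is bare, while matches("src/*.py", "src/pkg/a.py") does not because * cannot cross a /.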

manifest.json

Schema disrobe.batch.manifest/v1:

{
  "schema": "disrobe.batch.manifest/v1",
  "tool_version": "0.10.0",
  "root": "samples",
  "out_root": "out/samples-batch",
  "chain": "auto:8",
  "jobs": 4,
  "summary": { "processed": 12, "recovered": 9, "detect_only": 2, "errors": 1 },
  "entries": [
    {
      "input": "samples/app.pyc",
      "relative": "app.pyc",
      "size": 4096,
      "detected_format": "Python",
      "chain": ["py.decompile"],
      "verdict": "Complete",
      "recovery_score": 0.67,
      "output_dir": "out/samples-batch/app.pyc",
      "duration_ms": 31,
      "error": null
    }
  ]
}
  • recovery_score is the mean per-pass confidence-tier rank across the chain, normalized to [0, 1] (skeleton 0, partial 0.33, semantic 0.67, exact 1.0), or null when no pass ran.
  • A file that fails (unreadable, or its chain errors) is recorded with a non-null error and counted under summary.errors; one bad file never aborts the batch.
  • Files with no pass in their chain are counted as detect_only.
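
The recovery_score rule can be sketched as a mean over tier ranks. The rank values are the ones listed above; rounding to two decimals and the names TIER_RANK and recovery_score are assumptions made for this illustration, and the tool itself may compute the ranks at higher precision.

```python
from typing import Optional, Sequence

# Tier ranks as documented for recovery_score.
TIER_RANK = {"skeleton": 0.0, "partial": 0.33, "semantic": 0.67, "exact": 1.0}


def recovery_score(tiers: Sequence[str]) -> Optional[float]:
    """Mean per-pass tier rank in [0, 1], or None when no pass ran."""
    if not tiers:
        return None
    return round(sum(TIER_RANK[t] for t in tiers) / len(tiers), 2)
```

A single semantic-tier pass therefore scores 0.67, which matches the example entry in the manifest above.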

The human-readable summary line mirrors the manifest: N processed, M recovered, K detect-only, E errors.
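
Because failures are recorded per entry instead of aborting the batch, a CI job has to inspect the manifest to notice them. The following sketch lists failed inputs using only the fields shown in the schema above; failed_entries is a hypothetical helper, not part of disrobe.

```python
import json


def failed_entries(manifest_text: str) -> list[tuple[str, str]]:
    """Return (input, error) pairs for every entry with a non-null error."""
    manifest = json.loads(manifest_text)
    if manifest["schema"] != "disrobe.batch.manifest/v1":
        raise ValueError(f"unexpected schema: {manifest['schema']}")
    return [
        (entry["input"], entry["error"])
        for entry in manifest["entries"]
        if entry["error"] is not None
    ]
```

A caller would read out/manifest.json, pass its text to failed_entries, and fail the job if the returned list is non-empty.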

Run reports (disrobe report)

disrobe report consolidates a completed run into a single forensic summary: input identity, chain topology, per-stage verdicts and recovery scores, the recovered-artifact inventory, and timings. It is the read-side companion to auto and chain.

Usage

disrobe report ./out/sample-auto                 # a completed single-file run
disrobe report ./out/samples-batch               # a completed batch run
disrobe report ./malware.bin                     # raw input: runs auto first, then reports
disrobe report ./out/sample-auto --format markdown
disrobe report ./out/sample-auto --format html > report.html   # self-contained, offline

Target resolution

The single positional argument can be:

Target | Behavior
A directory with manifest.json | Read it and render a batch report.
A directory with chain.json + recovery.json | Read them and render a single-run report.
A raw input file | Run auto into ./out/<stem>-auto/ first, then report.
A raw directory (not an out dir) | Run a batch into ./out/<dir>-batch/ first, then report.

A non-existent target is a hard error (DR-CLI-0350).

Formats

--format text|json|markdown|html (default text). The global --json flag forces JSON regardless of --format.

  • text: a clean, aligned human report for the terminal.
  • markdown: a shareable report with tables, ready to paste into an issue or PR.
  • json: the machine-readable disrobe.report/v1 document.
  • html: a single self-contained HTML file (printed to stdout; redirect to a .html). CSS is inlined; there is no JavaScript and no external/CDN reference, so it renders offline when double-clicked. Flat neutral-gray dark theme, single green accent, JetBrains-Mono numerics. Sections: input identity, a chain-topology flow, per-stage verdicts with inline SVG recovery bars, a recovery-tier histogram, the recovered-artifact inventory, and, when the input is still readable, an IOC table (defanged) plus a behavior / MITRE ATT&CK summary. Every interpolated value is HTML-escaped (XSS-safe), and the renderer uses no clock or randomness, so the document is byte-stable for identical report data.

Single-run report contents

  • Input identity: path, size, BLAKE3, detected format chain, final format.
  • Topology + verdict: linear vs tree, and the overall chain verdict.
  • Recovery score: the mean per-stage confidence-tier rank normalized to [0, 1], plus a tier label (skeleton / partial / semantic / exact).
  • Tier histogram: exact / semantic / partial / skeleton counts.
  • Per-stage table: index, pass id, confidence, score, duration.
  • Recovered-artifact inventory: the union of artifact names produced by the stages.
  • Notes: detect-only and skeleton-tier caveats.

Batch report contents

  • The aggregate counts (processed, recovered, detect-only, errors) and mean recovery score.
  • A per-file table: file, detected format, score, and status (recovered / detect-only / error).

Analysis-depth commands

These static-analysis primitives make disrobe useful as a triage and reverse-engineering tool, not only a decompiler. Each one operates on raw bytes and, where relevant, on the strings and source a chain has already recovered, so they compose with the rest of the pipeline. None of them executes the sample.

Command | What it does
disrobe scan | Scan raw bytes for leaked credentials and secrets.
disrobe identify | Fingerprint compiler, packer, protector, and installer.
disrobe ioc | Pull indicators of compromise out of an artifact.
disrobe strings | Cross-format string extraction with single-byte XOR / base64 / ROT brute-forcing.
disrobe yara generate | Synthesize a candidate YARA rule from an artifact.
disrobe behavior | Summarize what a binary does, tagged with MITRE ATT&CK technique IDs.

Credential scan

disrobe scan firmware.bin
disrobe scan firmware.bin --json
disrobe scan firmware.bin --sarif > findings.sarif

disrobe scan scans the target's raw bytes for leaked credentials: cloud provider keys (AWS, GCP, Azure, GitHub, Stripe, and others), VCS tokens, JWTs, PEM and SSH private keys, and other high-confidence secret patterns.

Unlike disrobe ioc, scan focuses exclusively on secrets that represent an immediate credential exposure rather than general network or host indicators. Output is text (one finding per line), JSON via --json, or SARIF 2.1.0 via --sarif.

Format identification

disrobe identify sample.exe
disrobe identify sample.exe --json

disrobe identify fingerprints what built or packed a PE, ELF, or Mach-O binary. It reports the compiler, linker, packer, protector, and installer if detected, with structural evidence and the disrobe pass that handles each detected layer. Its output matches disrobe native identify; it is exposed as a top-level command so you do not have to route through the native subcommand tree.

IOC extraction

disrobe ioc suspicious.bin
disrobe ioc suspicious.bin --format json
disrobe ioc suspicious.bin --defang        # hxxp://, 1[.]2[.]3[.]4 for safe reporting
disrobe ioc malware.exe --format sarif      # GitHub code-scanning ingest

disrobe ioc scans the target's bytes and any UTF-16 / ASCII text inside it for:

  • Network: URLs (http/https/ftp/ftps/smb/file), bare domains, IPv4, IPv6, email addresses.
  • Host artifacts: Windows file paths, registry keys (HKLM\..., HKEY_CURRENT_USER\...), Unix paths under well-known roots (/etc, /usr, /var, /Users, ...).
  • Crypto wallets: Bitcoin (legacy 1/3 and bech32 bc1), Ethereum (0x...40), Monero (4...).
  • Crypto constants: AES S-box and inverse S-box, MD5 / SHA-1 / SHA-256 / SHA-512 init vectors, ChaCha20 sigma/tau, and the standard/URL base64 alphabets.

When the input is a native PE/ELF/Mach-O binary, the import table (library!symbol) is folded into the scan so DLL- and symbol-borne indicators surface too.

Encoding recursion

Base64 and hex blobs in the input are decoded and re-scanned one level deep. An indicator found inside a decoded blob is tagged with its encoding (base64 or hex) so you can tell a plaintext URL from one that was hidden behind a layer of encoding. The recursion is intentionally single-level to keep the scan bounded.
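
The single-level recursion can be illustrated with a small URL scanner. This sketch covers base64 only and uses simplified patterns; URL_RE, B64_RE, and find_urls are hypothetical names, and the real extractor's patterns and bounds are not shown in this document.

```python
import base64
import binascii
import re

# Simplified, bounded patterns for illustration only.
URL_RE = re.compile(rb"https?://[\x21-\x7e]{1,256}")
B64_RE = re.compile(rb"[A-Za-z0-9+/]{16,4096}={0,2}")


def find_urls(data: bytes) -> list[tuple[str, str]]:
    """Return (encoding, url) pairs; encoded blobs are decoded exactly once."""
    found = [("plain", m.group().decode("ascii")) for m in URL_RE.finditer(data)]
    for blob in B64_RE.finditer(data):
        try:
            decoded = base64.b64decode(blob.group(), validate=True)
        except (binascii.Error, ValueError):
            continue
        # Only URLs are searched for here: the decoded bytes are never
        # decoded again, which keeps the scan bounded to one level.
        found += [("base64", m.group().decode("ascii")) for m in URL_RE.finditer(decoded)]
    return found
```

A URL that was base64-encoded twice is not reported by this sketch, which is the trade-off the single-level rule accepts.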

Output

  • Text (default): one indicator per line, kind<TAB>encoding<TAB>@offset<TAB>value, followed by a count.
  • JSON (--format json or the global --json): the disrobe.ioc/v0 document, { schema, uri, byte_len, total, indicators[] }, each indicator carrying kind, value, offset, encoding, and an optional context window.
  • SARIF (--format sarif or the global --sarif): SARIF 2.1.0 with one result per indicator and a DR-IOC-<KIND> rule id, for GitHub code scanning.

--defang rewrites URLs, domains, IPs, and emails into a non-clickable form (hxxp://, 1[.]2[.]3[.]4, user@host[.]tld) in every format.
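
A defanging transform along these lines might look like the sketch below. It reproduces the three example forms given above; leaving dots in URL paths and in email local parts untouched is an assumption, and defang is a hypothetical helper rather than disrobe's code.

```python
def defang(indicator: str) -> str:
    """Rewrite a URL, domain, IP, or email into a non-clickable form."""
    scheme, sep, rest = indicator.partition("://")
    if sep:
        # URL: neutralize the scheme and bracket dots in the host only.
        host, slash, path = rest.partition("/")
        return scheme.replace("tt", "xx") + sep + host.replace(".", "[.]") + slash + path
    # Email, domain, or IP: bracket dots after the last '@' (if any).
    local, at, domain = indicator.rpartition("@")
    return local + at + domain.replace(".", "[.]")
```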

Safety and determinism

Every pattern is bounded (explicit upper repetition counts) so adversarial input cannot trigger catastrophic regex backtracking, and the indicator set is deduplicated and offset-sorted, so the same bytes always produce the same report. The library logic lives in disrobe_core::ioc and is reused by the daemon and by disrobe report.

String extraction

disrobe strings sample.bin
disrobe strings sample.bin --min-len 6
disrobe strings sample.bin --no-decode      # plain ASCII / UTF-16 only
disrobe strings sample.bin --json

disrobe strings is an in-house FLOSS-style extractor. It pulls printable ASCII and UTF-16LE runs at or above --min-len (default 4), then runs a set of deobfuscation passes and tags each result by how it was recovered:

Tag | Meaning
plain / plain:wide | Printable ASCII run / UTF-16LE run.
xor:0xKK | Recovered by brute-forcing single-byte XOR key KK over a printable run; kept only when the decoded text clears a printable-ratio bar and hits at least two dictionary words.
base64 | A base64 token whose decoded bytes are printable text.
rot:N | A run that, rotated by N (ROT13 and other ROT-n), becomes dictionary-rich text.
stack-string | A run reconstructed from interleaved-NUL / fragmented bytes characteristic of compiler-built stack strings.

The XOR, ROT, and stack-string heuristics are deliberately conservative: they require dictionary hits, trading recall for precision so the output stays signal, not noise. Results are deduplicated by (value, tag) and offset-sorted.
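
The precision-over-recall trade-off is easiest to see in the XOR case. The sketch below brute-forces all single-byte keys and keeps a candidate only when it is mostly printable and hits at least two dictionary words. The word list, the 0.95 ratio, and the name brute_xor are assumptions for illustration; the thresholds disrobe actually uses are not stated here.

```python
import string

# Tiny stand-in dictionary; a real one would be far larger.
WORDS = (b"http", b"password", b"user", b"download", b"key")
PRINTABLE = set(string.printable.encode("ascii"))


def brute_xor(run: bytes, min_ratio: float = 0.95, min_hits: int = 2):
    """Yield (key, decoded) for each single-byte key whose output looks like text."""
    if not run:
        return
    for key in range(1, 256):                    # key 0 would be the plain run
        decoded = bytes(b ^ key for b in run)
        ratio = sum(b in PRINTABLE for b in decoded) / len(decoded)
        hits = sum(word in decoded.lower() for word in WORDS)
        if ratio >= min_ratio and hits >= min_hits:
            yield key, decoded
```

Requiring two dictionary hits rejects most accidental decodings, at the cost of missing short or non-English strings.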

Output is text (tag<TAB>@offset<TAB>value) or the disrobe.strings/v0 JSON document via --json. The library logic lives in disrobe_core::strings.

YARA rule generation

disrobe yara generate sample.bin
disrobe yara generate sample.bin --name Trojan_Foo_2026
disrobe yara generate sample.bin --sha256 <hash> --date 2026-06-10
disrobe yara generate sample.bin --json

disrobe yara generate synthesizes a candidate YARA rule from an artifact. It selects high-signal strings (long, multi-character-class, and non-dictionary, with a scoring bonus for any recovered via XOR, base64, or ROT), detects the file's magic / format header, and emits a leading $magic hex pattern. The result is a well-formed rule:

rule <name> : disrobe generated {
    meta:
        generated_by = "disrobe <version>"
        schema = "disrobe.yara.generated/v0"
        format = "pe"
        sha256 = "..."        // only when --sha256 is given
        date = "..."          // only when --date is given
    strings:
        $magic = { 4D 5A 90 00 ... }
        $s0 = "..." ascii
        ...
    condition:
        $magic at 0 and N of ($s*)
}

The condition combines an anchored magic check (when a format was recognized) with an "N of" string threshold (half the selected strings, rounded up).
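
As a worked example of the threshold rule, the condition string can be derived like this. build_condition is a hypothetical helper that mirrors the description above; it is not the emitter's actual code.

```python
import math


def build_condition(string_count: int, has_magic: bool) -> str:
    """Build the rule condition: optional magic anchor plus an N-of threshold."""
    threshold = math.ceil(string_count / 2)      # half the strings, rounded up
    n_of = f"{threshold} of ($s*)"
    return f"$magic at 0 and {n_of}" if has_magic else n_of
```

With five selected strings and a recognized format this yields "$magic at 0 and 3 of ($s*)"; with four strings and no recognized format it yields "2 of ($s*)".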

Provenance

disrobe has no wall clock available to its analysis core, so the rule is not stamped with the current date automatically. Pass --sha256 and --date to embed those values in the meta block; otherwise they are omitted rather than fabricated.

Self-verification

Every generated rule is parsed back through the in-house YARA parser (the same one behind disrobe yara parse) before it is returned. If the emitter ever produced something the parser could not read, generation fails loudly with DR-YARAGEN-0001 rather than emitting a broken rule. The library logic lives in disrobe_core::yara_gen.

Behavior summary

disrobe behavior sample.exe
disrobe behavior sample.exe --json

disrobe behavior answers "what does this binary do?" by classifying it across seven categories:

Category | Covers
network | Sockets, WinHTTP/WinINet, DNS lookups, downloads.
filesystem | File create/read/write/delete, directory enumeration.
process_exec | Process / command execution, remote-thread injection.
registry_persistence | Registry writes, Run keys, scheduled tasks, services.
crypto | CryptoAPI / BCrypt / embedded crypto constants.
anti_analysis | Debugger checks, timing checks, sandbox evasion.
dynamic_code | LoadLibrary/GetProcAddress, VirtualProtect, dlopen, RWX memory.

Signals come from three sources, each tagged in the evidence list:

  • import: the native import table (PE/ELF/Mach-O), parsed when the input is a binary.
  • string: API / symbol names found in the extracted strings (including XOR/base64-recovered ones), so signals survive light obfuscation.
  • ioc: network/host/crypto indicators from the IOC extractor.

MITRE ATT&CK mapping

Confident matches are tagged with a MITRE ATT&CK technique id (for example LoadLibrary -> T1129, IsDebuggerPresent -> T1622, a Run key -> T1547.001). The mapping is a small, hand-curated static table: only techniques that follow directly from the signal are emitted, never a probabilistic guess. The aggregate attack_ids list at the end of the report is the union across all categories, ready to paste into a triage ticket. The library logic lives in disrobe_core::behavior and is reusable by disrobe report.
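
A static lookup of this kind can be sketched as a plain dictionary. The three technique ids are the examples given above; the signal key "run_key", the table name ATTACK_TABLE, and the function attack_ids are hypothetical, and the real table is larger.

```python
# Three example mappings from the text; "run_key" is a made-up signal name.
ATTACK_TABLE = {
    "LoadLibrary": "T1129",
    "IsDebuggerPresent": "T1622",
    "run_key": "T1547.001",
}


def attack_ids(signals_by_category: dict[str, list[str]]) -> list[str]:
    """Union of technique ids across all categories, in a stable sorted order."""
    ids = {
        ATTACK_TABLE[signal]
        for signals in signals_by_category.values()
        for signal in signals
        if signal in ATTACK_TABLE        # unmapped signals emit nothing
    }
    return sorted(ids)
```

Signals without a table entry contribute no technique id, which is the "never a probabilistic guess" property in miniature.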

Scope

This is a static summary: disrobe never executes the sample. A signal means the capability is present in the binary's imports/strings, not that it necessarily fires at runtime. Treat it as a lead, not a verdict.

Diff and guard tooling

disrobe treats recovered artifacts as a forensic baseline you can diff across versions and protect against tampering. Two command families support this: disrobe diff and disrobe guard.

disrobe diff: structural chain diff

disrobe diff left/chain.json right/chain.json

Structurally diffs two chain.json documents, the topology descriptors written by disrobe auto / disrobe chain. It compares the passes that ran, each stage's BLAKE3 output hash, byte sizes, and per-stage verdicts. This is how you answer "did upgrading the PyArmor pass change what we recover from this sample?" without eyeballing two output trees.

There is also a parallel disrobe envelope diff for two .dr envelopes directly, which compares version, rung, flags, root hash, producer, capability set, and provenance.

disrobe guard: ground-truth protection

In a recovery workspace, the byte-exact stage outputs (out/**/stages, out/**/final) are ground truth; an analyst or an agent should never edit them in place, because that would silently corrupt the provenance chain. disrobe guard enforces this.

guard verify: hash verification

disrobe guard verify subject/chain.json --reference reference/chain.json

Verifies that a subject chain.json's per-stage output hashes match a committed reference. Use it in CI to assert that a recovery is reproducible: re-run the chain, then verify the new chain.json against the checked-in reference.

guard check: edit denial

disrobe guard check out/final/module.py
disrobe guard check some/path --root extra/protected/subtree --root other/protected

Decides whether a path about to be written or edited is inside a protected ground-truth subtree. It denies writes to out/**/stages, out/**/final, and any .disrobe-stage-lock-marked path, and allows writes elsewhere. --root adds extra protected subtrees; it is repeatable and also accepts comma-separated values.
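
The decision itself is a path-prefix test. The sketch below approximates it for workspace-relative paths; is_protected is a hypothetical name, and the .disrobe-stage-lock marker is deliberately not modeled because it requires looking at the filesystem.

```python
from pathlib import PurePosixPath


def is_protected(path: str, extra_roots: tuple[str, ...] = ()) -> bool:
    """Return True when a workspace-relative path is inside a protected subtree."""
    parts = PurePosixPath(path.replace("\\", "/")).parts
    # out/**/stages and out/**/final: 'out' first, then a 'stages' or
    # 'final' component anywhere below it.
    if parts[:1] == ("out",) and any(p in ("stages", "final") for p in parts[1:]):
        return True
    for root in extra_roots:
        root_parts = PurePosixPath(root).parts
        if root_parts and parts[:len(root_parts)] == root_parts:
            return True                  # inside a --root subtree
    return False
```

A hook would deny the write when is_protected returns True and allow it otherwise.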

This is the command wired into the agent settings hook that disrobe init --ide claude generates: a PreToolUse hook calls disrobe guard check and denies edits to the 01-*/ and 02-*/ stage directories, so a coding agent working in a recovery workspace cannot accidentally rewrite the ground truth it is supposed to be analyzing.

The daemon: HTTP, gRPC, LSP, MCP

disrobe serve runs disrobe as a long-lived service over four protocols. The core capability is the same (analyze bytes, return recovered artifacts), exposed through whichever transport fits the caller.

disrobe serve                              # HTTP on 127.0.0.1:7373
disrobe serve --bind 0.0.0.0:7373          # HTTP on all interfaces (emits a warning banner)
disrobe serve --grpc                       # HTTP on :7373, gRPC on :7374
disrobe serve --stdio                      # LSP over stdio
disrobe serve --mcp                        # MCP companion over stdio (rmcp)

Flags

Flag | Default | Effect
--bind <ADDR> | 127.0.0.1:7373 | HTTP bind address.
--stdio | off | Serve LSP JSON-RPC over stdin/stdout instead of HTTP.
--mcp | off | Serve the MCP companion over stdio instead of HTTP/LSP.
--grpc | off | Expose the gRPC surface alongside HTTP (binds to <bind-ip>:<bind-port+1>).
--cors-origin <ORIGIN> | (none) | Additional CORS origin to allow. Repeatable. With no origins, all origins are allowed.
--max-body-size <N> | 52428800 (50 MiB) | Maximum request body size in bytes.

HTTP

The default surface. Binds to loopback (127.0.0.1:7373) by default; a non-loopback bind emits a warning at startup.

The API is versioned: /v1/* and /v2/* are aliases. The spec is served at /openapi.json.

Method | Path | Description
GET | /v1/health | Server liveness ({ status, version }).
GET | /v1/version | Tool name, version, and API level ({ name, version, api }).
GET | /v1/passes | List registered passes ({ passes: [{ name, description }] }).
POST | /v1/analyze | Classify and route bytes. Body: { bytes_b64, hint? }. Returns: { routed_action, bytes_read, blake3_hash, reason, candidates }.
POST | /v1/explain/{code} | Look up a DR-* error code. Returns: { code, known, title?, description?, crate_path? }.
POST | /v1/envelope/verify | Verify a .dr envelope. Body: { bytes_b64 }. Returns: { verified, version, rung, hot_bytes, cold_bytes, root_hash_blake3 }.
POST | /v1/envelope/create | Wrap raw bytes into a Raw-rung .dr envelope. Body: { bytes_b64, source_label?, produced_by?, detected_format? }. Returns: { envelope_b64, envelope_bytes, bytes_in, root_hash_blake3, source_hash_blake3 }.
WS | /v1/stream | WebSocket NDJSON stream (upgrade with Sec-WebSocket-Protocol: disrobe-stream.v1). Bytes are sent bytes_b64-only; the server never reads from disk.
GET | /openapi.json | OpenAPI 3 spec for the HTTP surface.
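
A client call to POST /v1/analyze could be assembled with the Python standard library as sketched below. The JSON content type and a string-valued hint are assumptions (the table gives only the field names), and build_analyze_request is a hypothetical helper. The request is built but not sent, so the sketch runs without a daemon.

```python
import base64
import json
import urllib.request


def build_analyze_request(data: bytes, base_url: str = "http://127.0.0.1:7373", hint=None):
    """Build (but do not send) a POST /v1/analyze request carrying inline bytes."""
    body = {"bytes_b64": base64.b64encode(data).decode("ascii")}
    if hint is not None:
        body["hint"] = hint              # optional field; value type is assumed
    return urllib.request.Request(
        base_url + "/v1/analyze",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},   # assumed content type
        method="POST",
    )


request = build_analyze_request(b"\x00asm\x01\x00\x00\x00")
# With a daemon running: json.load(urllib.request.urlopen(request))
```

Note that the payload travels as bytes_b64 and no filesystem path is sent, in line with the daemon's rule that it never reads from disk on a client's behalf.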

gRPC

Exposed alongside HTTP with --grpc. The gRPC server binds to the same IP as --bind but at <port> + 1 (default 127.0.0.1:7374). Includes health and reflection services.
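
The port rule is simple arithmetic on the bind address. As a sketch, with grpc_address being a hypothetical helper name:

```python
def grpc_address(bind: str) -> str:
    """Derive the gRPC address: same IP as --bind, port + 1."""
    host, _, port = bind.rpartition(":")
    return f"{host}:{int(port) + 1}"
```

So the default --bind of 127.0.0.1:7373 puts gRPC on 127.0.0.1:7374, and a firewall rule for the daemon needs to cover both ports when --grpc is on.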

LSP over stdio

--stdio speaks JSON-RPC over stdin/stdout using the LSP framing and initialize handshake. It does not implement the standard textDocument surface. It advertises two custom methods under capabilities.experimental.disrobe:

Method | Description
disrobe/analyze | Params: { bytes_b64, label? }. Returns the same classification payload as POST /v1/analyze.
disrobe/explain | Params: { code }. Returns the error-code lookup payload.

Unknown fields in params are rejected (hard error). A path field is specifically rejected; all bytes are transmitted inline.
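
For callers writing their own client, the sketch below frames a disrobe/analyze request with the standard LSP Content-Length header. The frame helper is hypothetical, and a real session must complete the initialize handshake before sending this message.

```python
import base64
import json


def frame(method: str, params: dict, request_id: int) -> bytes:
    """Wrap a JSON-RPC 2.0 request in LSP base-protocol framing."""
    payload = json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    ).encode("utf-8")
    # The header counts payload bytes, not characters.
    return b"Content-Length: %d\r\n\r\n" % len(payload) + payload


message = frame(
    "disrobe/analyze",
    {"bytes_b64": base64.b64encode(b"MZ\x90\x00").decode("ascii")},
    request_id=1,
)
```

The params object carries only bytes_b64 here; adding a path key, or any other unknown key, would be rejected by the server.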

MCP companion

--mcp runs the rmcp-based Model Context Protocol companion, exposing disrobe's capabilities as MCP tools (auto, decompile, ioc, behavior, strings, verify, rename, annot, and provenance_lookup) so an MCP-aware agent (Claude Code, Cursor, and others) can call disrobe directly as a tool server. The standalone disrobe-mcp binary is equivalent. See the dedicated MCP server integration page for the full tool catalog and client registration.

Security posture

All four transports share one hard rule: the server never reads a file from disk based on client input. Requests carry bytes_b64 only, never a filesystem path. Unknown JSON fields are rejected via deny_unknown_fields. Any way to make the server read a file via a client-controlled string is a high-severity vulnerability under the security policy. Run the daemon on loopback unless you have a specific reason not to.

Python bindings

disrobe ships a typed Python library that mirrors the full CLI surface. The importable disrobe module is built from crates/disrobe-python with pyo3 (abi3, Python 3.9+) and wraps the same Rust library the CLI uses. Bytes go in; a concrete typed report object comes out. Output is deterministic: the same input produces the same report bytes.

The library does not read or write the filesystem; the caller owns all I/O. Wheels are not published to PyPI; build from source.

Install

git clone https://github.com/1-3-7/disrobe
cd disrobe/bindings/python
pip install maturin
maturin develop --release

For a redistributable wheel:

maturin build --release
pip install target/wheels/disrobe-*.whl

The pyproject.toml pins maturin>=1.5,<2.0, sets module-name = "disrobe.disrobe", and points at crates/disrobe-python/Cargo.toml. On Windows the crate's build.rs searches PYO3_PYTHON, an active VIRTUAL_ENV, and standard install locations; set PYO3_PYTHON=<path-to-python.exe> if none is found. A py.typed marker is shipped so pyright and mypy resolve every attribute from the .pyi stub.

import disrobe

version: str = disrobe.__version__

Report model

Every analysis function returns a concrete subclass of _Report. The base carries the full serialization surface every report shares.

_Report

Member | Signature | Notes
raw | @property -> dict[str, Any] | Full underlying record; no detail dropped
to_json | () -> str | Compact JSON string
from_json_str | classmethod(text: str) -> Self | Rebuild from a to_json string
from_obj | classmethod(obj: dict[str, Any]) -> Self | Wrap an already-decoded dict

Reports compare equal when their underlying records are equal (== / !=).

_LlmReport

Subclasses _Report. Adds one property:

Member | Signature | Notes
llm | @property -> LlmBundle | None | Populated on LLM-wired passes; None otherwise

LLM-wired passes: py_decompile, py_disasm, py_deob, pyarmor_detect, pyarmor_unpack. Functions that build an LLM bundle accept pack: Pack | None where Pack = Literal["pack-1", "pack-2", "pack-3", "pack-4"].

LlmBundle

A TypedDict(total=False) mirroring the disrobe.metadata.llm.v1 on-disk schema. Keys present depend on which pack was requested.

from disrobe import LlmBundle
from typing import Any

bundle: LlmBundle = {
    "schema": "disrobe.metadata.llm.v1",
    "schema_version": "1",
    "generated_at": "2026-06-16T00:00:00Z",
    "tool": {},
    "selection": {},
    "input": {},
    "pipeline": [],
    "categories": {},
}

Literal type aliases

Alias | Values
Pack | "pack-1", "pack-2", "pack-3", "pack-4"
RoundtripStatus | "perfect", "semantic", "code-diff", "no-interpreter", "recompile-failed", "skipped"
PyarmorUnpackStatus | "functional", "bcc-partial", "detect-only", "skeleton"
ContainerListing | "enumerated", "requires-extraction", "unreadable"
SymbolKind | "function", "data", "label", "export", "import"
InstructionFlow | "sequential", "call", "indirect-call", "conditional-branch", "unconditional-branch", "indirect-branch", "return", "interrupt"
SourceLanguage | "python", "py", "python3"
JsLanguage | "javascript", "js", "typescript", "ts"
ByteLanguage | "python-bytecode", "pyc", "jvm-class", "class", "dex", "beam", "hermes", "hermes-bundle", "hbc", "wasm"
ParseByteLanguage | "go", "swift", "objc", "objective-c", "kotlin", "ruby", "lua", "php"
DisasmByteLanguage | "ruby", "ruby-bytecode", "yarv", "mruby", "php", "php-bytecode"
DecompileLanguage | "python-bytecode", "pyc", "jvm-class", "class", "java", "kotlin", "lua", "ruby", "php", "php-bytecode", "javascript", "js", "typescript", "ts"

Exception hierarchy

Class | Base | Raised when
DisrobeError | Exception | Any binding fails
UnsupportedLanguage | DisrobeError | disasm/parse/compile/decompile for a language with no backing implementation; message includes a hint

Module-level functions: full surface

Category | Function | Returns
Auto chain | auto(input, *, max_depth=8, path_hint=None) | ChainReport
Generic dispatch | decompile(language, source) | CanonicalSource
Generic dispatch | disasm(language, source) | str
Generic dispatch | parse(language, source) | typed report or dict[str, Any]
Generic dispatch | compile(language, source, *, version=None) | bytes
Custom pass | register_pass(name, callable) | None
Custom pass | register_consumer(name, callable) | None
Custom pass | registered_passes() | list[str]
Custom pass | registered_consumers() | list[str]
Custom pass | unregister(name) | bool
Custom pass | run_pass(name, data) | Any
Custom pass | run_chain(names, data) | Any
Custom pass | emit(name, result, **context) | Any
Analysis | strings_extract(data, *, min_len=4, decode=True) | StringsReport
Analysis | ioc_extract(data) | IocReport
Analysis | behavior_analyze(data) | BehaviorReport
Analysis | identify(data) | IdentifyReport
Analysis | secret_scan(data) | SecretScanReport
Analysis | capabilities(binary_bytes) | Capabilities
Analysis | extract(data, out_dir) | ExtractionResult
Analysis | extract_recursive(data, *, source_label='inline', max_depth=8) | OverlayReport
Analysis | yara_parse(ruleset_source) | YaraReport
Analysis | yara_generate(data, *, name=None) | YaraReport
Native | native_symbols(data) | SymbolsReport
Native | native_disasm(data) | DisasmPayload
Native | native_callgraph(data) | CallGraph
Native | native_imports_dot(data) | str
Native | native_entropy(data) | EntropyReport
Native | native_sbom(data) | SbomReport
Native | native_fingerprint(data, *, flirt=None) | FingerprintReport
Native | native_signatures(data, *, flirt=None) | SignatureReport
Native | native_sigmaker(data, at) | SigmakerReport
Native | native_diff(a, b) | DiffReport
Native | native_patch(data, *, at, replacement=None, nop_start=None, nop_end=None) | tuple[bytes, PatchReport]
Native | native_format(binary_bytes) | NativeFormat
Native | native_detect(binary_bytes) | DetectionList
Native | native_probe_backends() | BackendList
Native | native_deobfuscate(code, *, bits=64, base=0, entry=0) | NativeDeobfuscation
Query IR | query_functions(dr_bytes) | FunctionList
Query IR | query_calls_to(dr_bytes, target) | QueryReport
Query IR | query_xrefs_to(dr_bytes, symbol) | QueryReport
Query IR | query_string_decoders(dr_bytes) | QueryReport
Query IR | query_complexity_over(dr_bytes, threshold) | QueryReport
Query IR | query_capability_sites(dr_bytes, capability) | QueryReport
Query IR | query_call_graph(dr_bytes) | CallGraph
Envelope | envelope_create(payload, *, source_label='inline', produced_by=None, detected_format=None) | bytes
Envelope | envelope_verify(envelope_bytes) | EnvelopeReport
LLM renders | agents_md(result) | str
LLM renders | skill_md(result) | str
LLM renders | provenance(result) | Provenance
Python decompile | py_decompile(pyc_bytes, *, roundtrip=False, pack=None) | PyDecompileReport
Python decompile | py_disasm(pyc_bytes, *, pack=None) | PyDisasmReport
Python deobfuscate | py_deob(source, *, cleanup=True, pack=None) | PyDeobReport
Python deobfuscate | py_deob_detect(source) | PyDeobDetection
Python deobfuscate | py_deob_list_passes() | list[ObfuscatorPass]
Python deobfuscate | py_deob_detect_pass(source, pass_id) | PyDeobDetection
PyArmor | pyarmor_detect(source, *, pack=None) | PyarmorDetection
PyArmor | pyarmor_unpack(wrapper_bytes, *, pack=None) | PyarmorUnpack
PyArmor | pyarmor_classify(source, payload) | PyarmorClassification
PyInstaller | pyinstaller_extract(image_bytes) | PyInstallerArchive
PyInstaller | pyinstaller_entry_bytes(image_bytes, entry_name) | bytes
Nuitka | nuitka_detect(image_bytes) | NuitkaDetection
Nuitka | nuitka_extract(image_bytes) | NuitkaExtraction
Hermes | hermes_disasm(bundle_bytes) | HermesDisassembly
Hermes | hermes_lift(bundle_bytes) | HermesLift
Hermes | hermes_info(bundle_bytes) | HermesInfo
Mach-O | macho_dump(macho_bytes) | MachoReport
Mach-O | swift_analyze(macho_bytes) | SwiftReport
JVM / Android | jvm_parse_class(class_bytes) | JvmClass
JVM / Android | jvm_parse_dex(dex_bytes) | DexFileReport
JVM / Android | jvm_decompile_class(class_bytes) | JvmDecompiledClass
JVM / Android | jvm_detect(class_bytes) | DetectionList
JVM / Android | jvm_backends() | JvmBackends
JVM / Android | apk_resources(apk_bytes) | ApkResources
.NET | dotnet_parse_pe(pe_bytes) | DotnetPe
.NET | dotnet_parse_metadata(pe_bytes) | DotnetMetadata
.NET | dotnet_detect(pe_bytes) | DotnetDetection
.NET | dotnet_analyze(pe_bytes) | DotnetAnalysis
.NET | dotnet_decompile(pe_bytes) | DotnetDecompilation
.NET | dotnet_recover_decoders(pe_bytes) | DotnetDecoders
.NET | dotnet_backends() | BackendList
WebAssembly | wasm_analyze(wasm_bytes) | WasmAnalysis
WebAssembly | wasm_detect(wasm_bytes) | WasmDetection
JavaScript | js_detect(js_source) | JsDetection
JavaScript | js_unminify(js_source) | JsUnminify
JavaScript | js_unbundle(js_source, *, bundler=None) | JsUnbundle
Lua | lua_detect(bytecode) | LuaDetection
Lua | lua_decompile(bytecode) | LuaDecompilation
Lua | lua_deobfuscate(source, *, authorize=False, strict=False) | LuaDeobfuscation
Go | go_analyze(binary_bytes) | GoAnalysis
Go | go_symbols(binary_bytes) | GoSymbols
Go | go_pclntab(binary_bytes) | GoPclntab
Go | go_garble(binary_bytes) | GarbleReport
Ruby | ruby_detect(ruby_bytes, *, source_path=None) | RubyDetection
Ruby | ruby_decompile(ruby_bytes, *, source_path=None) | RubyAnalysis
PHP | php_detect(php_bytes) | PhpDetection
PHP | php_scan(php_bytes) | PhpScan
PHP | php_decode(php_bytes, *, max_depth=None) | PhpDecode
Shell | batch_deobfuscate(script, *, args=None) | BatchDeobReport
Shell | powershell_detect(script) | PowershellDetection
Shell | powershell_deobfuscate(script) | PowershellDeobfuscation
Containers | container_detect(container_bytes) | ContainerDetection
Containers | container_members(container_bytes) | ContainerMembers
Pickle | pickle_disasm(pickle_bytes) | str
Pickle | pickle_decompile(pickle_bytes) | PickleDecompilation
Pickle | pickle_safety(pickle_bytes) | PickleSafety
Pickle | pickle_trace(pickle_bytes) | PickleTrace
Pickle | pickle_polyglot(file_bytes) | PicklePolyglot
Pickle | pickle_ml_detect(file_bytes) | PickleMlReport

Auto chain

import disrobe
from disrobe import ChainReport

with open("sample.bin", "rb") as fh:
    chain: ChainReport = disrobe.auto(fh.read(), max_depth=8)

spec: str | None = chain.spec
pass_count: int = chain.pass_count
terminated: bool = chain.terminated
full_plan: dict[str, object] = chain.raw

auto runs the chain detector against raw bytes and returns a ChainReport carrying the full chain.json plan the CLI produces; it does not write stage outputs to disk. max_depth must be 1-16; out-of-range values raise DisrobeError. The registered pass tree covers pyarmor, pyinstaller, nuitka, py-decompile, py-deob, container, js, jvm, dotnet, wasm, mobile, swift-objc, and the native packer detector.
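
The depth bound can also be checked on the caller's side before any bytes are handed to the bindings. The following is a minimal sketch and not part of disrobe: checked_auto and fake_auto are hypothetical names, and fake_auto only stands in for disrobe.auto so the snippet runs without the bindings installed. It raises ValueError rather than DisrobeError because it never reaches the library.

```python
from typing import Any, Callable

MIN_DEPTH = 1   # lower bound documented for max_depth
MAX_DEPTH = 16  # upper bound documented for max_depth


def checked_auto(auto_fn: Callable[..., Any], data: bytes, max_depth: int = 8) -> Any:
    """Validate max_depth locally, then delegate to auto_fn (for example disrobe.auto)."""
    if not MIN_DEPTH <= max_depth <= MAX_DEPTH:
        raise ValueError(f"max_depth must be {MIN_DEPTH}-{MAX_DEPTH}, got {max_depth}")
    return auto_fn(data, max_depth=max_depth)


# Hypothetical stand-in for disrobe.auto; it only echoes its arguments.
def fake_auto(data: bytes, *, max_depth: int) -> dict:
    return {"size": len(data), "max_depth": max_depth}


print(checked_auto(fake_auto, b"\x00" * 4, max_depth=8))
```

With the bindings installed, the same wrapper would be called as checked_auto(disrobe.auto, data, max_depth=8).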

ChainReport accessors

Property    Type
spec        str | None
pass_count  int
terminated  bool

Generic dispatch

Language-keyed entry points that fan out to the per-language passes.

decompile

def decompile(language: str, source: str | bytes) -> CanonicalSource: ...

Wired families: python/pyc (py-decompile), jvm-class/class/java/kotlin (JVM lifter), lua (register lifter), ruby (YARV/mruby recovery), php/php-bytecode (eval-chain peel/op-array skeleton), javascript/js/typescript/ts (unminify). Binary-only targets (go, swift, wasm) have no single source body; call their structural binding or parse instead.

import disrobe
from disrobe import CanonicalSource

with open("module.pyc", "rb") as fh:
    recovered: CanonicalSource = disrobe.decompile("pyc", fh.read())

source: str | None = recovered.source
language: str | None = recovered.language
produced_by: str | None = recovered.produced_by
confidence: float | None = recovered.confidence

CanonicalSource accessors

Property     Type
source       str | None
language     str | None
produced_by  str | None
confidence   float | None

disasm

def disasm(language: str, source: str | bytes) -> str: ...

Returns a rendered instruction listing as text. Wired: python/pyc, jvm-class/class, dex, beam, hermes, wasm, ruby/yarv/mruby, and php/php-bytecode. For Lua use decompile('lua', ...) or parse('lua', ...) instead.

parse

def parse(language: str, source: str | bytes) -> (
    dict[str, Any]
    | GoAnalysis
    | SwiftReport
    | JvmClass
    | RubyAnalysis
    | LuaDecompilation
    | PhpDecode
    | JsUnminify
): ...

Returns a typed report for structural-recovery languages: go -> GoAnalysis, swift/objc/objective-c -> SwiftReport, kotlin -> JvmClass, ruby -> RubyAnalysis, lua -> LuaDecompilation, php -> PhpDecode, javascript/js/typescript/ts -> JsUnminify. Container and bytecode formats (pyc, jvm-class, dex, wasm, hermes, beam) return a nested dict[str, Any] because their full parse records have no single typed shape.

import disrobe
from typing import Any

with open("Hello.class", "rb") as fh:
    parsed: dict[str, Any] = disrobe.parse("jvm-class", fh.read())
method_count: int = len(parsed["methods"])

compile

def compile(language: str, source: str, *, version: str | None = None) -> bytes: ...

Implemented for Python only; returns raw marshal.dumps bytes (no .pyc header) via the host interpreter. lua, ruby, and php raise UnsupportedLanguage with a hint pointing at the CLI subcommand or toolchain.

import disrobe

blob: bytes = disrobe.compile("python", "x: int = 1 + 2\n")
listing: str = disrobe.disasm("python", "x: int = 1 + 2\n")
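
To make the "raw marshal bytes, no .pyc header" distinction concrete, here is a minimal stdlib-only sketch of what the host interpreter produces and what a header would add. It is not how disrobe.compile is implemented; compile_to_marshal and wrap_as_pyc are hypothetical helper names, and the 16-byte header layout shown is the CPython 3.7+ one (magic, flags, mtime, source size).

```python
import importlib.util
import marshal
import struct


def compile_to_marshal(source: str) -> bytes:
    """Compile with the running interpreter and return the bare marshal payload."""
    code = compile(source, "<sketch>", "exec")
    return marshal.dumps(code)


def wrap_as_pyc(marshal_blob: bytes, source_size: int = 0, mtime: int = 0) -> bytes:
    """Prepend a CPython 3.7+ header: 4-byte magic, then flags, mtime, source size."""
    header = importlib.util.MAGIC_NUMBER + struct.pack("<III", 0, mtime, source_size)
    return header + marshal_blob


blob = compile_to_marshal("x: int = 1 + 2\n")
pyc = wrap_as_pyc(blob)

# Skipping the 16-byte header recovers the same code object.
namespace: dict = {}
exec(marshal.loads(pyc[16:]), namespace)
print(namespace["x"])
```

A consumer that expects a .pyc with header, such as py_decompile, would need the wrapped form rather than the bare payload.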

Custom pass plugin protocol

Register and compose named passes and output consumers in the host process.

from typing import Any
import disrobe
from disrobe import Pass, OutputConsumer

def my_pass(data: Any) -> Any:
    return data[::-1]

def my_consumer(result: Any, **context: Any) -> Any:
    print(result, context)

disrobe.register_pass("reverse", my_pass)
disrobe.register_consumer("print", my_consumer)

names: list[str] = disrobe.registered_passes()
consumer_names: list[str] = disrobe.registered_consumers()

output: Any = disrobe.run_pass("reverse", b"hello")
chained: Any = disrobe.run_chain(["reverse", "reverse"], b"hello")
disrobe.emit("print", chained, source="example")

removed: bool = disrobe.unregister("reverse")

Pass and OutputConsumer protocols

Both are @runtime_checkable protocols.

Protocol        Signature
Pass            __call__(self, data: Any) -> Any
OutputConsumer  __call__(self, result: Any, **context: Any) -> Any
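
For readers unfamiliar with runtime-checkable protocols, the sketch below declares two protocols with the signatures listed above and checks plain callables against them. PassLike and ConsumerLike are hypothetical stand-ins, not the classes exported by disrobe, so the snippet runs with the standard library alone.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class PassLike(Protocol):
    def __call__(self, data: Any) -> Any: ...


@runtime_checkable
class ConsumerLike(Protocol):
    def __call__(self, result: Any, **context: Any) -> Any: ...


def reverse(data: Any) -> Any:
    return data[::-1]


class Collect:
    """A consumer that records every result together with its context."""

    def __init__(self) -> None:
        self.seen: list = []

    def __call__(self, result: Any, **context: Any) -> Any:
        self.seen.append((result, context))
        return result


print(isinstance(reverse, PassLike))        # plain functions have __call__
print(isinstance(Collect(), ConsumerLike))  # so do instances defining it
print(isinstance(42, PassLike))             # an int is not callable
```

An isinstance check against a runtime-checkable protocol only confirms that __call__ exists; it does not compare parameter lists, so reverse would also pass the ConsumerLike check.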

Analysis

strings_extract

def strings_extract(data: bytes, *, min_len: int = 4, decode: bool = True) -> StringsReport: ...

Extracts ASCII and UTF-16 strings from a binary blob.
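
As an illustration of the technique, the following stdlib-only sketch pulls printable ASCII runs and UTF-16LE runs out of a byte blob. It is an assumption-laden approximation, not the extractor disrobe ships: extract_strings is a hypothetical name, only little-endian UTF-16 with ASCII-range code units is handled, and no decoding beyond that is attempted.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return printable ASCII runs, then UTF-16LE runs, of at least min_len characters."""
    ascii_run = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    utf16_run = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    found = [m.group().decode("ascii") for m in ascii_run.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in utf16_run.finditer(data)]
    return found


sample = b"\x00\x01hello world\x00\xff" + "wide".encode("utf-16-le") + b"\x00\x00ab"
print(extract_strings(sample))
```

Raising min_len trades recall for less noise, which is the same knob the min_len parameter of strings_extract exposes.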

ioc_extract

def ioc_extract(data: bytes) -> IocReport: ...

Harvests indicators of compromise from bytes and recovered strings.

behavior_analyze

def behavior_analyze(data: bytes) -> BehaviorReport: ...

Behavioral summary by category with MITRE ATT&CK IDs.

identify

def identify(data: bytes) -> IdentifyReport: ...

Compiler/linker/packer/protector/installer fingerprint.

secret_scan

def secret_scan(data: bytes) -> SecretScanReport: ...

Leaked-credential scan over raw bytes.

capabilities

def capabilities(binary_bytes: bytes) -> Capabilities: ...

Capability rule-set matches for a native binary.

extract

def extract(data: bytes, out_dir: str) -> ExtractionResult: ...

Carves container/firmware members to out_dir.

extract_recursive

def extract_recursive(
    data: bytes, *, source_label: str = "inline", max_depth: int = 8
) -> OverlayReport: ...

Recursive multi-magic carve; classifies every chunk by entropy and nesting.
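
The first step of any multi-magic carve is locating candidate signatures inside the blob. The sketch below shows only that step, with a small table of well-known magic numbers; scan_magics and MAGICS are hypothetical names, and the recursion, entropy classification, and nesting that extract_recursive reports are not modelled here.

```python
# Well-known file signatures; the set disrobe recognises is not documented here.
MAGICS: dict[bytes, str] = {
    b"\x7fELF": "elf",
    b"PK\x03\x04": "zip",
    b"\x1f\x8b\x08": "gzip",
    b"\x00asm": "wasm",
}


def scan_magics(data: bytes) -> list[tuple[int, str]]:
    """Return (offset, kind) for every occurrence of a known magic, sorted by offset."""
    hits: list[tuple[int, str]] = []
    for magic, kind in MAGICS.items():
        start = 0
        while (pos := data.find(magic, start)) != -1:
            hits.append((pos, kind))
            start = pos + 1  # keep scanning past this hit
    return sorted(hits)


blob = b"junk" + b"\x7fELF" + b"\x00" * 8 + b"PK\x03\x04" + b"tail"
print(scan_magics(blob))
```

Each hit is only a candidate: a real carver still has to parse the structure at that offset to confirm it and to find where the member ends.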

yara_parse / yara_generate

def yara_parse(ruleset_source: str) -> YaraReport: ...
def yara_generate(data: bytes, *, name: str | None = None) -> YaraReport: ...

Parse a YARA ruleset AST or generate a candidate rule from a binary blob.

import disrobe
from disrobe import (
    StringsReport, IocReport, BehaviorReport, IdentifyReport,
    SecretScanReport, Capabilities, OverlayReport, YaraReport,
)

with open("suspect.bin", "rb") as fh:
    data: bytes = fh.read()

strings: StringsReport = disrobe.strings_extract(data, min_len=6)
string_count: int = strings.string_count

iocs: IocReport = disrobe.ioc_extract(data)
indicator_count: int = iocs.indicator_count

behavior: BehaviorReport = disrobe.behavior_analyze(data)
category_count: int = behavior.category_count

ident: IdentifyReport = disrobe.identify(data)
fmt: str | None = ident.format
finding_count: int = ident.finding_count

caps: Capabilities = disrobe.capabilities(data)
match_count: int = caps.match_count

overlay: OverlayReport = disrobe.extract_recursive(data, max_depth=4)
chunks_total: int | None = overlay.chunks_total
bytes_carved: int | None = overlay.bytes_carved

rule: YaraReport = disrobe.yara_generate(data, name="suspect")
rule_count: int = rule.rule_count

Analysis report classes

Class             Notable typed accessors
StringsReport     string_count: int
IocReport         indicator_count: int
BehaviorReport    category_count: int
IdentifyReport    format: str | None, finding_count: int
SecretScanReport  finding_count: int
Capabilities      match_count: int, format: str | None
ExtractionResult  kind: str | None, entry_count: int, integrity_violation_count: int
OverlayReport     max_depth: int | None, nodes_visited: int | None, chunks_total: int | None, bytes_carved: int | None
YaraReport        rule_count: int

Native binary

Functions

Function                                               Returns                    Notes
native_symbols(data)                                   SymbolsReport              Symbols, sections, imports, debug info
native_disasm(data)                                    DisasmPayload              Full disassembly: functions, stream, symbols
native_callgraph(data)                                 CallGraph                  Whole-program call graph
native_imports_dot(data)                               str                        GraphViz DOT of the import graph
native_entropy(data)                                   EntropyReport              Sliding-window Shannon entropy map
native_sbom(data)                                      SbomReport                 CycloneDX 1.5 SBOM from cargo-auditable section
native_fingerprint(data, *, flirt=None)                FingerprintReport          Crypto-constant + FLIRT + string-xref sidecar
native_signatures(data, *, flirt=None)                 SignatureReport            Crypto-primitive signatures and FLIRT matches
native_sigmaker(data, at)                              SigmakerReport             Wildcarded byte signature for a VA
native_diff(a, b)                                      DiffReport                 Function-level diff of two binaries
native_patch(data, *, at, ...)                         tuple[bytes, PatchReport]  Rewrite bytes and revalidate
native_format(binary_bytes)                            NativeFormat               Format: kind, bitness, subsystem
native_detect(binary_bytes)                            DetectionList              Packer/protector detection hits
native_probe_backends()                                BackendList                Probe for installed external tools
native_deobfuscate(code, *, bits=64, base=0, entry=0)  NativeDeobfuscation        x86 OLLVM/Tigress deflattening

import disrobe
from disrobe import (
    SymbolsReport, DisasmPayload, CallGraph, EntropyReport,
    SbomReport, FingerprintReport, SignatureReport, SigmakerReport,
    DiffReport, PatchReport, NativeFormat, DetectionList,
    BackendList, NativeDeobfuscation,
)

with open("binary.elf", "rb") as fh:
    data: bytes = fh.read()

syms: SymbolsReport = disrobe.native_symbols(data)
symbol_count: int = syms.symbol_count
section_count: int = syms.section_count
import_count: int = syms.import_count

disasm_payload: DisasmPayload = disrobe.native_disasm(data)
instruction_count: int = disasm_payload.instruction_count
source_hash: str | None = disasm_payload.source_hash

entropy: EntropyReport = disrobe.native_entropy(data)
mean: float | None = entropy.mean

sig: SigmakerReport = disrobe.native_sigmaker(data, at=0x1000)
ida_pattern: str | None = sig.ida_pattern

patched_bytes: bytes
patch_report: PatchReport
patched_bytes, patch_report = disrobe.native_patch(data, at=0x1234, nop_start=0x1234, nop_end=0x1240)
revalidated: bool = patch_report.revalidated

deob: NativeDeobfuscation = disrobe.native_deobfuscate(data, bits=64, base=0x400000)
recovered_blocks: int | None = deob.recovered_blocks
fully_recovered: bool = deob.fully_recovered

Native report classes

Class                Notable typed accessors
SymbolsReport        symbol_count: int, section_count: int, import_count: int
DisasmPayload        instruction_count: int, symbol_count: int, source_hash: str | None
CallGraph            node_count: int, edge_count: int
EntropyReport        window_count: int, mean: float | None, min: float | None, max: float | None
SbomReport           component_count: int, bom_format: str | None, spec_version: str | None
FingerprintReport    crypto_hit_count: int
SignatureReport      signature_count: int
SigmakerReport       ida_pattern: str | None, byte_count: int
DiffReport           added: int, removed: int, changed: int
PatchReport          at: int | None, bytes_written: int | None, revalidated: bool
NativeFormat         kind: str | None, bits: int | None, subsystem: str | None
DetectionList        count: int
BackendList          count: int, available_count: int
NativeDeobfuscation  bits: int | None, recovered_blocks: int | None, original_blocks: int | None, fully_recovered: bool
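
The window_count, mean, min, and max accessors on EntropyReport summarise a sliding-window Shannon entropy map. A minimal stdlib sketch of that computation follows; shannon_entropy and entropy_map are hypothetical names, and the window and step sizes are illustrative choices, not the values native_entropy uses.

```python
import math
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not chunk:
        return 0.0
    total = len(chunk)
    return -sum((n / total) * math.log2(n / total) for n in Counter(chunk).values())


def entropy_map(data: bytes, window: int = 256, step: int = 128) -> list[tuple[int, float]]:
    """Return (offset, entropy) for each window; the final partial window is dropped."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if not data:
        return []
    last_start = max(len(data) - window, 0)
    return [
        (off, shannon_entropy(data[off:off + window]))
        for off in range(0, last_start + 1, step)
    ]


sample = b"\x00" * 256 + bytes(range(256))
windows = entropy_map(sample, window=256, step=256)
values = [h for _, h in windows]
print(len(windows), min(values), max(values), sum(values) / len(values))
```

Windows near 8 bits per byte usually indicate compressed or encrypted regions, which is why an entropy map is a common first look at a suspected packed binary.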

Query IR

The query functions operate on a Disasm-rung .dr envelope (raw bytes). See Editable IR objects for how to produce and consume .dr envelopes programmatically.

Function                                      Returns       Notes
query_functions(dr_bytes)                     FunctionList  All recovered functions
query_calls_to(dr_bytes, target)              QueryReport   Call sites targeting a symbol name
query_xrefs_to(dr_bytes, symbol)              QueryReport   Data/code cross-references to a symbol
query_string_decoders(dr_bytes)               QueryReport   Functions with string-decode patterns
query_complexity_over(dr_bytes, threshold)    QueryReport   Functions with cyclomatic complexity above threshold
query_capability_sites(dr_bytes, capability)  QueryReport   Sites exercising a named capability
query_call_graph(dr_bytes)                    CallGraph     Whole-program call graph from IR

import disrobe
from disrobe import FunctionList, QueryReport, CallGraph

with open("module.dr", "rb") as fh:
    dr: bytes = fh.read()

functions: FunctionList = disrobe.query_functions(dr)
fn_count: int = functions.count

callers: QueryReport = disrobe.query_calls_to(dr, "malloc")
match_count: int = callers.match_count

complex_fns: QueryReport = disrobe.query_complexity_over(dr, threshold=20)
graph: CallGraph = disrobe.query_call_graph(dr)
edge_count: int = graph.edge_count

Query report classes

Class         Notable typed accessors
FunctionList  kind: str | None, count: int
QueryReport   kind: str | None, match_count: int
CallGraph     node_count: int, edge_count: int

Envelope

envelope_create wraps a payload as a Raw-rung .dr envelope and returns the encoded bytes. envelope_verify decodes and verifies, returning an EnvelopeReport.

import disrobe
from disrobe import EnvelopeReport

envelope: bytes = disrobe.envelope_create(
    b"payload",
    source_label="inline",
    produced_by="my-tool",
    detected_format="elf64",
)
report: EnvelopeReport = disrobe.envelope_verify(envelope)
ok: bool = report.verified
root_hash: str | None = report.root_hash
rung: str | None = report.rung
hot_bytes: int | None = report.hot_bytes
cold_bytes: int | None = report.cold_bytes
version: int | None = report.version

EnvelopeReport accessors

Property    Type
verified    bool
rung        str | None
version     int | None
hot_bytes   int | None
cold_bytes  int | None
root_hash   str | None

The sidecar DrEnvelope TypedDict (bindings/python/dr-envelope.pyi) mirrors the raw on-disk header shape: magic, version, rung, flags, hot_len, cold_len, root_hash.

LLM renders

agents_md and skill_md render the AGENTS.md and SKILL.md reconstruction briefs for a report from an LLM-enabled pass (or a bare bundle dict), returning a str. provenance extracts tool/selection/input metadata as a typed Provenance. Passing a report whose llm slot is None raises DisrobeError.

import disrobe
from disrobe import PyDecompileReport, Provenance

with open("module.pyc", "rb") as fh:
    report: PyDecompileReport = disrobe.py_decompile(fh.read(), pack="pack-2")

agents_brief: str = disrobe.agents_md(report)
skill_brief: str = disrobe.skill_md(report)

prov: Provenance = disrobe.provenance(report)
generated_at: str | None = prov.generated_at
schema: str | None = prov.schema
schema_version: str | None = prov.schema_version

Provenance accessors

Property        Type
schema          str | None
schema_version  str | None
generated_at    str | None

Python passes

See also Python decompiler for the full decompiler design.

py_decompile

Decompiles a .pyc (with header) to source. 92.76% of CPython 3.14 stdlib code objects recompile to bytecode-equivalent output (5831 of 6286, CI floor 90%). Legacy CPython 1.0-3.7: 79.6% proven-correct (CI floor 152 of 191; 166 of 191 with the full interpreter zoo present).

import disrobe
from disrobe import PyDecompileReport, RoundtripStatus

with open("module.pyc", "rb") as fh:
    report: PyDecompileReport = disrobe.py_decompile(fh.read(), roundtrip=True)

source: str | None = report.source
marshal_version: str | None = report.marshal_version
decompile_version: str | None = report.decompile_version
recovered_directly: bool = report.recovered_directly
fallback_reason: str | None = report.fallback_reason
status: RoundtripStatus | None = report.roundtrip_status
roundtrip_detail: str | None = report.roundtrip_detail
interpreter_path: str | None = report.interpreter_path
interpreter_version: str | None = report.interpreter_version

if status == "perfect":
    print("recompiled bytecode matched")

Round-tripping (when roundtrip=True) shells out to a matching host interpreter; it is the one binding that may run an external Python interpreter.
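
The idea behind an opcode-for-opcode comparison can be sketched with the standard dis module. This is not disrobe's comparator: bytecode_equivalent and its helpers are hypothetical names, both sources are compiled by the running interpreter rather than a matched external one, and only opcode names and argument values are compared.

```python
import dis
from types import CodeType


def opcode_stream(code: CodeType) -> list[tuple[str, object]]:
    """Flatten one code object to (opname, argval); nested code objects become a marker."""
    stream = []
    for ins in dis.get_instructions(code):
        arg = "<code>" if isinstance(ins.argval, CodeType) else ins.argval
        stream.append((ins.opname, arg))
    return stream


def nested_code(code: CodeType) -> list[CodeType]:
    return [c for c in code.co_consts if isinstance(c, CodeType)]


def bytecode_equivalent(a: CodeType, b: CodeType) -> bool:
    """Compare two code objects and, recursively, every code object nested in them."""
    if opcode_stream(a) != opcode_stream(b):
        return False
    kids_a, kids_b = nested_code(a), nested_code(b)
    return len(kids_a) == len(kids_b) and all(
        bytecode_equivalent(x, y) for x, y in zip(kids_a, kids_b)
    )


original = compile("def f(x):\n    return x + 1\n", "original.py", "exec")
recovered = compile("def f(x):\n    # recovered\n    return x+1\n", "recovered.py", "exec")
different = compile("def f(x):\n    return x + 2\n", "different.py", "exec")

print(bytecode_equivalent(original, recovered))
print(bytecode_equivalent(original, different))
```

Comments, whitespace, file names, and line numbers do not enter the comparison, so a recovered source that differs only in layout still counts as equivalent.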

py_disasm

import disrobe
from disrobe import PyDisasmReport

with open("module.pyc", "rb") as fh:
    result: PyDisasmReport = disrobe.py_disasm(fh.read())

marshal_version: str | None = result.marshal_version
instruction_count: int = result.instruction_count
text: str | None = result.text

py_deob, py_deob_detect, py_deob_list_passes, py_deob_detect_pass

import disrobe
from disrobe import ObfuscatorPass, PyDeobDetection, PyDeobReport

obfuscated: str = "exec(__import__('base64').b64decode('cHJpbnQoMSk='))\n"

deob: PyDeobReport = disrobe.py_deob(obfuscated, cleanup=True)
peeled_source: str | None = deob.peeled_source
cleanup_source: str | None = deob.cleanup_source
layer_count: int = deob.layer_count

detection: PyDeobDetection = disrobe.py_deob_detect(obfuscated)
match_count: int = detection.match_count

passes: list[ObfuscatorPass] = disrobe.py_deob_list_passes()
first_id: str | None = passes[0].id if passes else None

per_pass: PyDeobDetection = disrobe.py_deob_detect_pass(obfuscated, "base64-exec")

py_deob_detect_pass raises DisrobeError for an unknown pass_id.
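
The sample string above wraps its payload in a single base64-exec layer. The sketch below peels that pattern statically with ast and base64, never executing the payload. It is a simplified illustration and not the py_deob implementation: find_b64_exec_payload and peel_layers are hypothetical names, and only the exact shape exec(<something>.b64decode(<literal>)) is recognised.

```python
from __future__ import annotations

import ast
import base64


def find_b64_exec_payload(source: str) -> str | bytes | None:
    """Return the literal passed to b64decode in `exec(<x>.b64decode(<literal>))`, if any."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return None
    if len(tree.body) != 1 or not isinstance(tree.body[0], ast.Expr):
        return None
    outer = tree.body[0].value
    if not (isinstance(outer, ast.Call) and isinstance(outer.func, ast.Name)
            and outer.func.id == "exec" and len(outer.args) == 1):
        return None
    inner = outer.args[0]
    if not (isinstance(inner, ast.Call) and isinstance(inner.func, ast.Attribute)
            and inner.func.attr == "b64decode" and len(inner.args) == 1):
        return None
    literal = inner.args[0]
    if isinstance(literal, ast.Constant) and isinstance(literal.value, (str, bytes)):
        return literal.value
    return None


def peel_layers(source: str, max_layers: int = 8) -> tuple[str, int]:
    """Decode nested base64-exec wrappers; returns (innermost source, layers peeled)."""
    layers = 0
    while layers < max_layers:
        payload = find_b64_exec_payload(source)
        if payload is None:
            break
        try:
            source = base64.b64decode(payload).decode("utf-8", errors="replace")
        except ValueError:  # malformed base64: stop and keep the current layer
            break
        layers += 1
    return source, layers


obfuscated = "exec(__import__('base64').b64decode('cHJpbnQoMSk='))\n"
peeled, layer_count = peel_layers(obfuscated)
print(peeled, layer_count)
```

Working on the syntax tree rather than on raw text keeps the match robust to spacing and quoting differences, and the max_layers bound prevents unbounded unwrapping.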

Python pass report classes

Class              Notable typed accessors
PyDecompileReport  source, marshal_version, decompile_version, recovered_directly, fallback_reason, roundtrip_status, roundtrip_detail, interpreter_path, interpreter_version, llm
PyDisasmReport     marshal_version: str | None, instruction_count: int, text: str | None, llm
PyDeobReport       peeled_source: str | None, cleanup_source: str | None, layer_count: int, llm
PyDeobDetection    match_count: int, llm
ObfuscatorPass     id: str | None

PyArmor

pyarmor_detect

Parses a PyArmor wrapper from source text.

import disrobe
from disrobe import PyarmorDetection

with open("wrapped.py") as fh:
    detection: PyarmorDetection = disrobe.pyarmor_detect(fh.read())
version: str | None = detection.version
protection: str | None = detection.protection
confidence: str | None = detection.confidence
serial: str | None = detection.serial
python_major: int | None = detection.python_major
python_minor: int | None = detection.python_minor
payload_offset: int | None = detection.payload_offset
payload_size: int | None = detection.payload_size

pyarmor_unpack

Statically unpacks a PyArmor wrapper image. 72 of 72 PyArmor samples (v6-v9) recover. The bindings expose only the static path; there is no --allow-dynamic equivalent.

import disrobe
from disrobe import PyarmorUnpack, PyarmorUnpackStatus

with open("wrapper.pyc", "rb") as fh:
    unpacked: PyarmorUnpack = disrobe.pyarmor_unpack(fh.read())

status: PyarmorUnpackStatus | None = unpacked.status
pyarmor_version: str | None = unpacked.pyarmor_version
protection_kind: str | None = unpacked.protection_kind
plaintext_len: int | None = unpacked.plaintext_len
digest: str | None = unpacked.plaintext_blake3_hex
bcc_blob_count: int | None = unpacked.bcc_blob_count
inner_cipher_recovered_co: int | None = unpacked.inner_cipher_recovered_co

pyarmor_classify

import disrobe
from disrobe import PyarmorClassification

with open("payload.bin", "rb") as fh:
    payload: bytes = fh.read()

with open("wrapped.py") as fh:
    wrapper_source: str = fh.read()

classification: PyarmorClassification = disrobe.pyarmor_classify(wrapper_source, payload)
script_type: str | None = classification.script_type
bootstrap_import: str | None = classification.bootstrap_import
disposition: str | None = classification.disposition
rft_enabled: bool = classification.rft_enabled
ecc_enabled: bool = classification.ecc_enabled

The sidecar PyarmorDetection TypedDict (bindings/python/pyarmor-detection.pyi) names the confidence and protection Literal values used in the raw dict.

PyArmor report classes

Class                  Notable typed accessors
PyarmorDetection       version, protection, confidence, serial, python_major, python_minor, payload_offset, payload_size, llm
PyarmorUnpack          status, pyarmor_version, protection_kind, plaintext_len, plaintext_blake3_hex, bcc_blob_count, inner_cipher_recovered_co, llm
PyarmorClassification  script_type, bootstrap_import, disposition, rft_enabled, ecc_enabled

PyInstaller and Nuitka

pyinstaller_extract / pyinstaller_entry_bytes

import disrobe
from disrobe import PyInstallerArchive

with open("app.exe", "rb") as fh:
    image: bytes = fh.read()

archive: PyInstallerArchive = disrobe.pyinstaller_extract(image)
entry_count: int = archive.entry_count
encrypted: bool = archive.encrypted
encryption_key_present: bool = archive.encryption_key_present
python_major: int | None = archive.python_major
python_minor: int | None = archive.python_minor

entries: list[dict[str, object]] = archive.raw["entries"]
main_payload: bytes = disrobe.pyinstaller_entry_bytes(image, str(entries[0]["name"]))

nuitka_detect / nuitka_extract

import disrobe
from disrobe import NuitkaDetection, NuitkaExtraction

with open("app.exe", "rb") as fh:
    image: bytes = fh.read()

det: NuitkaDetection = disrobe.nuitka_detect(image)
flavor: str | None = det.flavor
version: str | None = det.version
wheel_marker: str | None = det.wheel_marker
onefile_payload_offset: int | None = det.onefile_payload_offset
onefile_payload_compressed: bool = det.onefile_payload_compressed

extraction: NuitkaExtraction = disrobe.nuitka_extract(image)
variant: str | None = extraction.variant

The sidecar FreezerManifest TypedDict (bindings/python/freezer-manifest.pyi) describes the manifest schema for cx-freeze/py2exe/shiv/pex/py-oxidizer/briefcase freezer families; reach it via report.raw.

PyInstaller/Nuitka report classes

Class               Notable typed accessors
PyInstallerArchive  entry_count: int, encrypted: bool, encryption_key_present: bool, python_major: int | None, python_minor: int | None, llm
NuitkaDetection     flavor: str | None, version: str | None, wheel_marker: str | None, onefile_payload_offset: int | None, onefile_payload_compressed: bool, llm
NuitkaExtraction    variant: str | None, llm

Hermes (React Native)

All 8 functions in the committed hermesc-built HBC v96 sample lift at 100% op-coverage with 0 fallback ops. 122,633 functions lift with no failure on a production React Native bundle.

import disrobe
from disrobe import HermesDisassembly, HermesLift, HermesInfo

with open("index.android.bundle", "rb") as fh:
    bundle: bytes = fh.read()

disasm_result: HermesDisassembly = disrobe.hermes_disasm(bundle)
function_count: int = disasm_result.function_count
identifier_count: int = disasm_result.identifier_count
string_count: int = disasm_result.string_count

lift: HermesLift = disrobe.hermes_lift(bundle)
function_surface_count: int = lift.function_surface_count

info: HermesInfo = disrobe.hermes_info(bundle)
version: int | None = info.version
header_size: int | None = info.header_size

Hermes report classes

Class              Notable typed accessors
HermesDisassembly  function_count: int, identifier_count: int, string_count: int, llm
HermesLift         function_surface_count: int, string_count: int, identifier_count: int, llm
HermesInfo         version: int | None, function_count: int | None, string_count: int | None, header_size: int | None, llm

Mach-O and Swift

import disrobe
from disrobe import MachoReport, SwiftReport

with open("universal.dylib", "rb") as fh:
    data: bytes = fh.read()

macho: MachoReport = disrobe.macho_dump(data)
kind: str | None = macho.kind
fat_entry_count: int = macho.fat_entry_count
slice_count: int = macho.slice_count

swift: SwiftReport = disrobe.swift_analyze(data)
container: str | None = swift.container
swift_fat_entry_count: int = swift.fat_entry_count
swift_slice_count: int = swift.slice_count

Mach-O report classes

Class        Notable typed accessors
MachoReport  kind: str | None, fat_entry_count: int, slice_count: int, llm
SwiftReport  container: str | None, fat_entry_count: int, slice_count: int, llm

JVM and Android

93.1% of JVM methods recompile error-free under javac (CI floor 122 of 131; 128 of 131 measured with JDK 25). 99% of committed DEX classes pass -Xverify:all.

import disrobe
from disrobe import (
    JvmClass, DexFileReport, JvmDecompiledClass,
    DetectionList, JvmBackends, ApkResources,
)

with open("Hello.class", "rb") as fh:
    cls: JvmClass = disrobe.jvm_parse_class(fh.read())
major_version: int | None = cls.major_version
minor_version: int | None = cls.minor_version
method_count: int = cls.method_count
field_count: int = cls.field_count
constant_pool_count: int = cls.constant_pool_count

with open("classes.dex", "rb") as fh:
    dex: DexFileReport = disrobe.jvm_parse_dex(fh.read())
class_count: int = dex.class_count
dex_method_count: int = dex.method_count

with open("Hello.class", "rb") as fh:
    decompiled: JvmDecompiledClass = disrobe.jvm_decompile_class(fh.read())
source: str | None = decompiled.source
fully_lifted_methods: int = decompiled.fully_lifted_methods
fallback_methods: int = decompiled.fallback_methods

with open("obf.class", "rb") as fh:
    detections: DetectionList = disrobe.jvm_detect(fh.read())
detection_count: int = detections.count

backends: JvmBackends = disrobe.jvm_backends()
jvm_count: int = backends.jvm_count
android_count: int = backends.android_count

with open("app.apk", "rb") as fh:
    apk: ApkResources = disrobe.apk_resources(fh.read())
package: str | None = apk.package
manifest_xml: str | None = apk.manifest_xml
resource_entry_count: int = apk.resource_entry_count
certificate_count: int = apk.certificate_count
dex_count: int = apk.dex_count

jvm_backends and dotnet_backends probe the host for installed external tools but never shell out to them. Counts are informational only.
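
Probing for a tool without running it can be done with a PATH lookup alone. The sketch below uses shutil.which for that; probe_tools is a hypothetical name, the tool names are illustrative guesses, and nothing here reflects which tools jvm_backends or dotnet_backends actually look for.

```python
from __future__ import annotations

import shutil


def probe_tools(names: list[str]) -> dict[str, str | None]:
    """Map each tool name to its resolved path, or None when it is not on PATH."""
    # shutil.which only inspects PATH entries; it never launches the tool.
    return {name: shutil.which(name) for name in names}


report = probe_tools(["java", "javap", "dotnet"])  # illustrative names only
available = sum(1 for path in report.values() if path is not None)
print(f"{available} of {len(report)} tools found")
```

Because nothing is executed, the result says only that a file with that name is reachable, not that it works or which version it is.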

JVM/Android report classes

Class               Notable typed accessors
JvmClass            major_version: int | None, minor_version: int | None, method_count: int, field_count: int, constant_pool_count: int, llm
DexFileReport       string_count: int, type_count: int, class_count: int, method_count: int, llm
JvmDecompiledClass  source: str | None, method_count: int, field_count: int, fully_lifted_methods: int, fallback_methods: int
DetectionList       count: int
JvmBackends         jvm_count: int, android_count: int, llm
ApkResources        package: str | None, manifest_xml: str | None, resource_entry_count: int, certificate_count: int, dex_count: int, llm

.NET

import disrobe
from disrobe import (
    DotnetPe, DotnetMetadata, DotnetDetection,
    DotnetAnalysis, DotnetDecompilation, DotnetDecoders, BackendList,
)

with open("Sample.dll", "rb") as fh:
    pe_bytes: bytes = fh.read()

pe: DotnetPe = disrobe.dotnet_parse_pe(pe_bytes)
bitness: str | None = pe.bitness
machine: int | None = pe.machine
section_count: int = pe.section_count
entry_point_rva: int | None = pe.entry_point_rva

metadata: DotnetMetadata = disrobe.dotnet_parse_metadata(pe_bytes)
version: str | None = metadata.version
major_runtime_version: int | None = metadata.major_runtime_version
stream_count: int = metadata.stream_count

detection: DotnetDetection = disrobe.dotnet_detect(pe_bytes)
primary: str | None = detection.primary
match_count: int = detection.match_count

analysis: DotnetAnalysis = disrobe.dotnet_analyze(pe_bytes)
pe_bitness: str | None = analysis.pe_bitness
native_aot: bool = analysis.native_aot
primary_protector: str | None = analysis.primary_protector
opcode_spec_coverage_pct: int | None = analysis.opcode_spec_coverage_pct

decompilation: DotnetDecompilation = disrobe.dotnet_decompile(pe_bytes)
module_name: str | None = decompilation.module_name
methods_decompiled: int | None = decompilation.methods_decompiled
methods_bodyless: int | None = decompilation.methods_bodyless
methods_failed: int | None = decompilation.methods_failed

decoders: DotnetDecoders = disrobe.dotnet_recover_decoders(pe_bytes)
pure_decoders_found: int | None = decoders.pure_decoders_found
constants_recovered: int = decoders.constants_recovered

backend_list: BackendList = disrobe.dotnet_backends()
available_count: int = backend_list.available_count

.NET report classes

Class                Notable typed accessors
DotnetPe             bitness: str | None, machine: int | None, section_count: int, entry_point_rva: int | None, llm
DotnetMetadata       version: str | None, major_runtime_version: int | None, stream_count: int, llm
DotnetDetection      primary: str | None, match_count: int, llm
DotnetAnalysis       pe_bitness: str | None, clr_runtime_version: str | None, native_aot: bool, primary_protector: str | None, opcode_spec_coverage_pct: int | None, llm
DotnetDecompilation  module_name: str | None, methods_decompiled: int | None, methods_bodyless: int | None, methods_failed: int | None, llm
DotnetDecoders       pure_decoders_found: int | None, constants_recovered: int, llm

WebAssembly

100% op-coverage on 94 functions across 30 parseable corpus modules. 24 of 24 execution-eligible functions are execution-equivalent under wasmtime.

import disrobe
from disrobe import WasmAnalysis, WasmDetection

with open("module.wasm", "rb") as fh:
    wasm_bytes: bytes = fh.read()

analysis: WasmAnalysis = disrobe.wasm_analyze(wasm_bytes)
import_count: int = analysis.import_count
export_count: int = analysis.export_count
func_count: int | None = analysis.func_count
code_size_bytes: int | None = analysis.code_size_bytes
has_dwarf: bool = analysis.has_dwarf

detection: WasmDetection = disrobe.wasm_detect(wasm_bytes)
obfuscator: str | None = detection.obfuscator
confidence: float | None = detection.confidence
has_name_section: bool = detection.has_name_section
function_count: int | None = detection.function_count

WebAssembly report classes

Class          Notable typed accessors
WasmAnalysis   import_count: int, export_count: int, func_count: int | None, code_size_bytes: int | None, has_dwarf: bool, llm
WasmDetection  obfuscator: str | None, confidence: float | None, has_name_section: bool, function_count: int | None, llm

JavaScript

Supports 11 bundlers: webpack4, webpack5/webpack, vite, rollup, rolldown, esbuild, turbopack, bun, browserify, parcel, systemjs. An unrecognised hint string raises DisrobeError.

import disrobe
from disrobe import JsDetection, JsUnminify, JsUnbundle

with open("main.js") as fh:
    source: str = fh.read()

detection: JsDetection = disrobe.js_detect(source)
family: str | None = detection.family
confidence: float | None = detection.confidence
marker_count: int = detection.marker_count

unminified: JsUnminify = disrobe.js_unminify(source)
recovered_source: str | None = unminified.source

with open("bundle.js") as fh:
    bundle_source: str = fh.read()
unbundled: JsUnbundle = disrobe.js_unbundle(bundle_source)
module_count: int = unbundled.module_count
bundler: str | None = unbundled.bundler

unbundled_hinted: JsUnbundle = disrobe.js_unbundle(bundle_source, bundler="webpack5")
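
Since an unrecognised hint raises, a caller taking the hint from user input may want to check it first. The sketch below validates a hint against the names listed above; checked_bundler_hint and KNOWN_BUNDLERS are hypothetical, and exact, case-sensitive matching is an assumption about how the binding compares hints.

```python
from __future__ import annotations

# The bundler names listed in this section; "webpack" is treated as its own spelling.
KNOWN_BUNDLERS = frozenset({
    "webpack4", "webpack5", "webpack", "vite", "rollup", "rolldown",
    "esbuild", "turbopack", "bun", "browserify", "parcel", "systemjs",
})


def checked_bundler_hint(hint: str | None) -> str | None:
    """Return the hint unchanged if it is None or a listed name, else raise ValueError."""
    if hint is None:
        return None  # no hint: let detection decide
    if hint not in KNOWN_BUNDLERS:
        raise ValueError(f"unknown bundler hint: {hint!r}")
    return hint


print(checked_bundler_hint("webpack5"))
print(checked_bundler_hint(None))
```

The validated value can then be passed as the bundler keyword of js_unbundle.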

JavaScript report classes

Class        Notable typed accessors
JsDetection  family: str | None, confidence: float | None, marker_count: int, llm
JsUnminify   source: str | None, llm
JsUnbundle   module_count: int, bundler: str | None, llm

Lua

Detects, decompiles, and deobfuscates 11 Lua obfuscator families. IronBrew2 2.7.0 is reversed against real committed output with a Lua execution differential.

import disrobe
from disrobe import LuaDetection, LuaDecompilation, LuaDeobfuscation

with open("chunk.luac", "rb") as fh:
    bytecode: bytes = fh.read()

det: LuaDetection = disrobe.lua_detect(bytecode)
lua_format: str | None = det.format

decompiled: LuaDecompilation = disrobe.lua_decompile(bytecode)
decompiled_source: str | None = decompiled.source
fidelity: str | None = decompiled.fidelity
warning_count: int = decompiled.warning_count

with open("obf.lua") as fh:
    deob: LuaDeobfuscation = disrobe.lua_deobfuscate(fh.read(), authorize=True)
obfuscator: str | None = deob.obfuscator
deobfuscated: str | None = deob.deobfuscated
fully_recovered: bool = deob.fully_recovered
passes_run_count: int = deob.passes_run_count
recovered_string_count: int = deob.recovered_string_count

Lua report classes

Class             Notable typed accessors
LuaDetection      format: str | None, llm
LuaDecompilation  source: str | None, fidelity: str | None, warning_count: int, llm
LuaDeobfuscation  obfuscator: str | None, deobfuscated: str | None, fully_recovered: bool, passes_run_count: int, recovered_string_count: int, llm

Go

At least 85% type-name recovery on stripped go1.26 fixtures; the measured set recovers 528 of 528.

import disrobe
from disrobe import GoAnalysis, GoSymbols, GoPclntab, GarbleReport

with open("binary", "rb") as fh:
    go_bytes: bytes = fh.read()

analysis: GoAnalysis = disrobe.go_analyze(go_bytes)
image_kind: str | None = analysis.image_kind
pclntab_version: str | None = analysis.pclntab_version
buildversion: str | None = analysis.buildversion
ptr_size: int | None = analysis.ptr_size

symbols: GoSymbols = disrobe.go_symbols(go_bytes)
version_label: str | None = symbols.version_label
function_count: int = symbols.function_count
source_file_count: int = symbols.source_file_count
package_count: int = symbols.package_count

pclntab: GoPclntab = disrobe.go_pclntab(go_bytes)
version: str | None = pclntab.version
func_count: int | None = pclntab.func_count

garble: GarbleReport = disrobe.go_garble(go_bytes)
quality: str | None = garble.quality
detection_score: int | None = garble.detection_score
seed_recoverable: bool = garble.seed_recoverable
seed_hash: str | None = garble.seed_hash
recovered_string_count: int = garble.recovered_string_count

Go report classes

Class         Notable typed accessors
GoAnalysis    image_kind: str | None, pclntab_version: str | None, buildversion: str | None, ptr_size: int | None, llm
GoSymbols     version_label: str | None, function_count: int, source_file_count: int, package_count: int, llm
GoPclntab     version: str | None, ptr_size: int | None, func_count: int | None, image_kind: str | None, llm
GarbleReport  quality: str | None, detection_score: int | None, seed_recoverable: bool, seed_hash: str | None, recovered_string_count: int, llm

Ruby

import disrobe
from disrobe import RubyDetection, RubyAnalysis

with open("hello.rb.enc", "rb") as fh:
    ruby_bytes: bytes = fh.read()

det: RubyDetection = disrobe.ruby_detect(ruby_bytes, source_path="hello.rb.enc")
flavor: str | None = det.flavor

analysis: RubyAnalysis = disrobe.ruby_decompile(ruby_bytes, source_path="hello.rb.enc")
ruby_flavor: str | None = analysis.flavor
source_path: str | None = analysis.source_path
input_len: int | None = analysis.input_len

Ruby report classes

Class          Notable typed accessors
RubyDetection  flavor: str | None, llm
RubyAnalysis   flavor: str | None, source_path: str | None, input_len: int | None, llm

PHP

import disrobe
from disrobe import PhpDetection, PhpScan, PhpDecode

with open("obfuscated.php", "rb") as fh:
    php_bytes: bytes = fh.read()

det: PhpDetection = disrobe.php_detect(php_bytes)
kind: str | None = det.kind
confidence: str | None = det.confidence
open_tag_offset: int | None = det.open_tag_offset
has_halt_compiler: bool = det.has_halt_compiler

scan: PhpScan = disrobe.php_scan(php_bytes)
hit_count: int = scan.hit_count
family_count: int = scan.family_count

decoded: PhpDecode = disrobe.php_decode(php_bytes, max_depth=10)
php_source: str | None = decoded.source
layer_count: int = decoded.layer_count
residual_eval: bool = decoded.residual_eval

PHP report classes

Class         Notable typed accessors
PhpDetection  kind: str | None, confidence: str | None, open_tag_offset: int | None, has_halt_compiler: bool, llm
PhpScan       hit_count: int, family_count: int, llm
PhpDecode     source: str | None, layer_count: int, residual_eval: bool, llm

Shell

import disrobe
from disrobe import BatchDeobReport, PowershellDetection, PowershellDeobfuscation

with open("dropper.bat") as fh:
    batch_script: str = fh.read()
batch_result: BatchDeobReport = disrobe.batch_deobfuscate(batch_script, args=["/run"])
output: str | None = batch_result.output
embedded_payload_count: int = batch_result.embedded_payload_count
decrypted_stage_count: int = batch_result.decrypted_stage_count
commands_emulated: int | None = batch_result.commands_emulated

with open("obf.ps1") as fh:
    ps_script: str = fh.read()
ps_det: PowershellDetection = disrobe.powershell_detect(ps_script)
obfuscator: str | None = ps_det.obfuscator
ps_confidence: float | None = ps_det.confidence
marker_count: int = ps_det.marker_count

ps_deob: PowershellDeobfuscation = disrobe.powershell_deobfuscate(ps_script)
ps_output: str | None = ps_deob.output
level: str | None = ps_deob.level
transformation_count: int = ps_deob.transformation_count

Shell report classes

Class                    Notable typed accessors
BatchDeobReport          output: str | None, embedded_payload_count: int, decrypted_stage_count: int, commands_emulated: int | None, llm
PowershellDetection      obfuscator: str | None, confidence: float | None, marker_count: int, llm
PowershellDeobfuscation  output: str | None, level: str | None, transformation_count: int, llm

Containers

98 container families detected and extracted in-tree. See container docs for the full family list.

import disrobe
from disrobe import ContainerDetection, ContainerMembers, ContainerListing

with open("archive.zip", "rb") as fh:
    container_bytes: bytes = fh.read()

det: ContainerDetection = disrobe.container_detect(container_bytes)
detected: bool = det.detected
kind: str | None = det.kind
is_zip_family: bool = det.is_zip_family

members: ContainerMembers = disrobe.container_members(container_bytes)
fmt: str | None = members.format
size: int | None = members.size
listing: ContainerListing | None = members.listing
entry_count: int = members.entry_count

Container report classes

Class               Notable typed accessors
ContainerDetection  detected: bool, kind: str | None, is_zip_family: bool, llm
ContainerMembers    format: str | None, size: int | None, listing: ContainerListing | None, entry_count: int, llm

Pickle

Nothing is ever unpickled; the VM is symbolic.

import disrobe
from disrobe import (
    PickleDecompilation, PickleSafety, PickleTrace,
    PicklePolyglot, PickleMlReport,
)

with open("model.pkl", "rb") as fh:
    pkl: bytes = fh.read()

listing: str = disrobe.pickle_disasm(pkl)

decompilation: PickleDecompilation = disrobe.pickle_decompile(pkl)
pkl_source: str | None = decompilation.source

safety: PickleSafety = disrobe.pickle_safety(pkl)
severity: str | None = safety.severity
finding_count: int = safety.finding_count
import_count: int = safety.import_count
reduce_count: int | None = safety.reduce_count

trace: PickleTrace = disrobe.pickle_trace(pkl)
protocol: int | None = trace.protocol
memo_count: int | None = trace.memo_count
max_stack_depth: int | None = trace.max_stack_depth
global_ref_count: int = trace.global_ref_count
trace_reduce_count: int | None = trace.reduce_count

polyglot: PicklePolyglot = disrobe.pickle_polyglot(pkl)
is_pickle: bool = polyglot.is_pickle
is_polyglot: bool = polyglot.is_polyglot
kind_count: int = polyglot.kind_count

with open("model.pt", "rb") as fh:
    ml_report: PickleMlReport = disrobe.pickle_ml_detect(fh.read())
fmt: str | None = ml_report.format
framing: str | None = ml_report.framing
embedded_count: int = ml_report.embedded_count

Pickle report classes

Class                Notable typed accessors
PickleDecompilation  source: str | None, llm
PickleSafety         severity: str | None, finding_count: int, import_count: int, reduce_count: int | None, llm
PickleTrace          protocol: int | None, memo_count: int | None, max_stack_depth: int | None, global_ref_count: int, reduce_count: int | None, llm
PicklePolyglot       is_pickle: bool, is_polyglot: bool, kind_count: int, llm
PickleMlReport       format: str | None, framing: str | None, embedded_count: int, llm

Editable IR objects

CodeObject, Instruction, and Symbol let you load a Disasm-rung .dr envelope, modify it in Python, and write a fresh integrity-hashed .dr.

Instruction

from disrobe import Instruction, InstructionFlow

instr: Instruction = Instruction(
    offset=0,
    mnemonic="mov",
    operands=["rax", "rbx"],
    bytes=b"\x48\x89\xd8",
)
instr.branch_target = None
flow: InstructionFlow = instr.flow
text: str = instr.text()

Member         Type                       Notes
offset         int                        Mutable
mnemonic       str                        Mutable
operands       list[str]                  Mutable
bytes          bytes                      Mutable
branch_target  int | None                 Mutable
flow           @property InstructionFlow  Read-only
text()         -> str                     Rendered disassembly line

Symbol

from disrobe import Symbol, SymbolKind

sym: Symbol = Symbol(address=0x1000, name="entry", kind="function")
sym.name = "main"
sym.kind = "export"

Member   Type        Notes
address  int         Mutable
name     str         Mutable
kind     SymbolKind  Mutable

CodeObject

import disrobe
from disrobe import CodeObject, Instruction, Symbol

with open("module.dr", "rb") as fh:
    co: CodeObject = CodeObject.from_dr(fh.read())

instruction_count: int = co.instruction_count
symbol_count: int = co.symbol_count
source_hash: str = co.source_hash
produced_by: str = co.produced_by

instrs: list[Instruction] = co.instructions
syms: list[Symbol] = co.symbols
metadata: dict[str, str] = co.metadata
capabilities: list[str] = co.capabilities
llm_bundle: dict[str, object] | None = co.llm

new_sym: Symbol = Symbol(address=0x2000, name="renamed_fn", kind="function")
co.add_symbol(new_sym)
co.set_metadata("analysis", "patched")
co.add_capability("NETWORK_CONNECT", 1)

fresh_dr: bytes = co.to_dr()
with open("module_patched.dr", "wb") as fh:
    fh.write(fresh_dr)

CodeObject.from_dr parses a Disasm-rung .dr envelope. to_dr produces a fresh envelope with a recomputed integrity hash. The set_instructions and set_symbols methods replace the full list; add_instruction / add_symbol append. set_metadata(key, value) sets a single string key; clear_metadata resets all. set_llm(sidecar) attaches or removes the LLM sidecar dict.

Member             Signature                                    Notes
from_dr            staticmethod(dr_bytes: bytes) -> CodeObject  Parse existing envelope
instructions       @property -> list[Instruction]
set_instructions   (instructions: list[Instruction]) -> None    Replace
add_instruction    (instruction: Instruction) -> None           Append
symbols            @property -> list[Symbol]
set_symbols        (symbols: list[Symbol]) -> None              Replace
add_symbol         (symbol: Symbol) -> None                     Append
instruction_count  @property -> int
symbol_count       @property -> int
source_hash        str                                          Mutable attribute
produced_by        str                                          Mutable attribute
metadata           @property -> dict[str, str]
set_metadata       (key: str, value: str) -> None
clear_metadata     () -> None
capabilities       @property -> list[str]
add_capability     (name: str, major: int) -> None
llm                @property -> dict[str, Any] | None
set_llm            (sidecar: dict[str, Any] | None) -> None
to_dr              () -> bytes                                  Produce fresh integrity-hashed envelope

Scope

  • No file or directory handling: no --out trees, no --capture-stages, no container extraction to disk. auto returns the plan document only.
  • External backend tools (jvm_backends, dotnet_backends, native_probe_backends) are probed for availability but never executed.
  • AS3, Flutter, BEAM (beyond disasm/parse), and the freezer family beyond PyInstaller/Nuitka have no dedicated bindings in this release.
  • No SARIF/NDJSON emitters and no serve daemon in the bindings; for those, drive the CLI or the daemon directly.

Forensics and malware-safety posture

disrobe is designed to be run against hostile input by analysts who must not detonate it. This page states precisely what does and does not execute, so you can decide what to run inside a sandbox.

The default is static analysis, no sample execution

By default, disrobe does not execute the sample. Every default path is pure static analysis: it parses bytes, decodes bytecode, walks structures, and emits derived artifacts. It does not unpickle, does not call __reduce__, does not run a packed binary, does not invoke a sample's entry point.

This holds for the entire pickle suite in particular. disrobe pickle trace runs a symbolic VM: it walks the opcode stream and builds the object graph without instantiating a single real object or resolving a single real global. disrobe pickle safety grades danger statically. You can audit a downloaded .pt or .pkl for what it would do on load without ever letting it load.
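
To make the idea of auditing a pickle without loading it concrete, here is a minimal sketch that uses only the Python standard library. It is not disrobe's symbolic VM: it only walks the opcode stream with pickletools.genops and counts the opcodes that would import a global or call something on load. The scan_pickle name and the two opcode groupings are assumptions made for illustration.

```python
import pickle
import pickletools

# Opcodes that resolve a global at load time (illustrative grouping).
IMPORT_OPS = {"GLOBAL", "STACK_GLOBAL", "INST"}
# Opcodes that invoke a callable or constructor at load time (illustrative grouping).
CALL_OPS = {"REDUCE", "OBJ", "NEWOBJ", "NEWOBJ_EX", "BUILD"}


def scan_pickle(data: bytes) -> dict[str, int]:
    """Count import and call opcodes by walking the stream; nothing is loaded."""
    counts = {"imports": 0, "calls": 0}
    for opcode, _arg, _pos in pickletools.genops(data):
        if opcode.name in IMPORT_OPS:
            counts["imports"] += 1
        elif opcode.name in CALL_OPS:
            counts["calls"] += 1
    return counts


class Evil:
    """A payload whose load-time effect is a call to print."""

    def __reduce__(self):
        return (print, ("detonated",))


# Serializing calls __reduce__ but never calls print; scanning never loads.
hostile = scan_pickle(pickle.dumps(Evil()))
benign = scan_pickle(pickle.dumps([1, 2, 3]))
```

The hostile payload shows at least one import and one call opcode, while the plain list shows none, and at no point is pickle.loads invoked.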

The opt-in execution paths

There are a small number of paths that can execute code, and every one is behind an explicit, named flag. None of them is on by default.

Path                          Gate             What it does
PyArmor v6/v7 dynamic-hook    --allow-dynamic  Runs the obfuscated wrapper in a watched subprocess to capture marshal streams. Watchdog timeout via --dynamic-timeout (default 60s).
PyArmor BCC native-body lift  --allow-bcc      Lifts BCC-protected native bodies via Ghidra-headless on PATH (Ghidra runs, not the sample's logic in-process).

If you must use --allow-dynamic, do it inside an isolated sandbox (a disposable VM or container with no network and no access to anything you care about). disrobe gives you the watchdog timeout and a captured-marshal manifest, but a dynamic hook is, by definition, executing adversarial code. The pure-static paths (v8, v9-pro) need no such gate.

Subprocess backends

The optional external backends (Ghidra, CFR, Vineflower, jadx, ILSpy, dnSpy, de4dot, Rizin) run as subprocesses over the artifact, not by executing the sample's own logic. Command lines are constructed from configuration and sometimes user input; command injection or argument smuggling there is in scope for the security policy.

Hardened parsing surface

disrobe parses adversarial binary input constantly, so the parsing surface is hardened deliberately:

  • Pure-Rust, unsafe forbidden workspace-wide. The only opt-outs are the two pyo3 C-interop crates (disrobe-pyarmor-cextract, disrobe-pyarmor-pytrace), gated behind explicit features. Any panic or abort on adversarial input that is not a clean Result::Err is a bug.
  • Resource-exhaustion guards. Zip-bombs, decompression bombs, container-recursion bombs, and malformed-length-field bombs are defused by the shared quota machinery in crates/disrobe-binfmt/src/quota.rs (per-entry cap, aggregate cap, recursion-depth cap).
  • Path-traversal guards. zip-slip and equivalents are sanitized on every container extraction path.
  • Envelope decoder hardening. The .dr decoder is fuzzed; read-past-end, integer overflow, and BLAKE3-mismatch acceptance are all in scope.
  • Chain safety. Depth cap (default 8) and content-hash cycle detection stop a malicious input from making a chain recurse forever.

Reporting

Found a way to make a default path execute a sample, or to escape a container, or to crash the parser? That is a security issue: report it privately, never as a public issue. See Security.

Threat model

This page is the explicit trust-boundary analysis for disrobe. It states what the tool treats as untrusted, where the boundaries are drawn, what each boundary defends against, and what is deliberately out of scope. It complements the operational Security policy and the Forensics and malware-safety posture: the security policy says how to report a problem and what is in scope for a report; this page says what the design assumes an attacker can do and where the walls are.

The single load-bearing assumption is this: the input is hostile. disrobe exists to parse protector output, packed executables, obfuscated bytecode, and exotic encoders. Every byte of every sample is treated as attacker-controlled. The analyst running disrobe is trusted; the artifact they point it at is not.

Trust boundaries at a glance

   ┌─────────────────────────── trusted ───────────────────────────┐
   │  analyst, host filesystem, disrobe binary, configuration       │
   └───────────────▲───────────────────────────────▲───────────────┘
                   │ B1                             │ B2
   ┌───────────────┴───────────────┐   ┌────────────┴───────────────┐
   │  untrusted sample bytes       │   │  untrusted .dr envelope     │
   │  (file / bytes_b64 / stdin)   │   │  (cache hit, peer-supplied) │
   └───────────────────────────────┘   └────────────────────────────┘
                   │ B3                             │ B4
   ┌───────────────┴───────────────┐   ┌────────────┴───────────────┐
   │  network surface              │   │  subprocess backends +      │
   │  (serve: HTTP / gRPC / LSP)   │   │  optional sample execution  │
   └───────────────────────────────┘   └────────────────────────────┘

There are four boundaries. Boundary 1 (sample bytes) and Boundary 2 (envelope bytes) are always present. Boundary 3 (network) is present only when disrobe serve is running. Boundary 4 (subprocess and dynamic execution) is present only when an explicit opt-in flag is passed.

Boundary 1: untrusted sample bytes

Trusted side: the disrobe process, the host, the analyst's intent. Untrusted side: the sample. It arrives as a filesystem path, a bytes_b64 blob over the daemon, or stdin. The parser must assume every length field, offset, opcode, and nested container is chosen by an adversary to break it.

What this boundary defends against, and how:

Threat | Defense | Where
Memory-corruption via the parser | Pure-Rust, #![forbid(unsafe_code)] workspace-wide; the only opt-outs are the two pyo3 C-interop crates, gated behind explicit features and never on the default path. | workspace lint config
Panic / abort on adversarial input | Any non-Result::Err failure on hostile bytes is a bug. Decoders return errors, they do not unwrap. | every disrobe-pass-* decoder
Decompression and zip bombs | Per-entry cap, aggregate cap, and an observed-ratio ceiling in the shared quota machinery. | crates/disrobe-binfmt/src/quota.rs
Path traversal (zip-slip and kin) | Every container extraction path routes through sanitize_entry_path and siblings before any write. | crates/disrobe-binfmt/src/quota.rs
Container-recursion bombs | Recursion-depth cap plus content-hash cycle detection in the chain runner (default depth 8). | chain runner
Malformed-length-field bombs | Length fields are validated against remaining buffer length before allocation; no length field is trusted to size an allocation. | binfmt + envelope decoder
Signature defeat (scrambled magic, renamed sections, corrupted markers) | Detection falls back from magic to self-consistent internal structure, which an adversary cannot break without breaking the file's own functionality. | crates/disrobe-binfmt/src/structural.rs

The envelope decoder and the container layer are the two most-exposed parsing surfaces and are fuzzed.

Signature defeat and header scrambling

A common evasion against signature-based detectors and unpackers is to scramble the parts a fast scanner keys on: flip the MZ of a PE, zero the \x7fELF of an ELF, mangle the Mach-O / DEX / class-file / wasm magic, rename UPX0/UPX1 and corrupt the UPX! marker. These edits defeat a tool that identifies a format by a leading magic byte or a section name, but they do not change what the file actually is: the loader, the OS, or the runtime still has to find the real structure to execute it, so the structure is the one thing the adversary cannot remove.

disrobe identifies a format by magic on the fast path, then falls back to structural validation when the magic is absent or wrong. The fallback parses far enough into the format's own header tables to confirm they refer to one another consistently, which keeps false positives low (a loose pattern match would not satisfy a full cross-referenced walk):

  • PE. Resolve e_lfanew to a PE\0\0 signature, then a COFF header with a known machine type, a PE32/PE32+ optional header, and a section table that fits the file. A corrupted e_lfanew itself is recovered by scanning for the PE\0\0 whose following headers validate, so a flipped MZ and a mangled e_lfanew together still parse.
  • ELF. Validate the class / endianness / version bytes and confirm the program- and section-header table offsets, entry sizes, and counts are self-consistent against the declared entry sizes and the file length. A zeroed \x7fELF does not move any of those fields.
  • Mach-O. Walk the load-command stream (ncmds / sizeofcmds and each cmdsize) for a single-arch image, or the arch offset/size table for a fat image, accepting only when the run lands exactly at its declared end.
  • Native packers (UPX). Detect and unpack by the decompressor stub's PackHeader (a known method id, self-consistent compressed/uncompressed lengths, a plausible version) located by structural scan rather than by the UPX! marker, and resolve packed-section data through the structural PE header rather than a literal MZ. A renamed-section, corrupted-marker UPX still unpacks byte-identically.
  • ZIP and zip-family archives. Anchor on the End-of-Central-Directory record (the format's authoritative trailer) and confirm its central-directory offset and size land on a record carrying the central-directory-header signature. A scrambled first local header does not move the EOCD.
  • DEX. Confirm header_size == 0x70, a legal endian tag, and string / type / proto / method / class section sizes and offsets self-consistent against file_size and the byte length; a zeroed dex\n0XX\0 magic still parses, defaulting the version when the version triple is unreadable.
  • JVM class file. Confirm a major version in the JVM-known range and walk the constant pool (Utf8 lengths, long/double double-slots) to its end; a scrambled 0xCAFEBABE still parses.
  • wasm. Confirm a version word of 1 and that the section id/size LEB128 stream validates end to end, terminating exactly at end of file; a scrambled \0asm still lifts.

The structural detector is shared (identify_by_structure) so the central sniffer (classify.rs), the container detector, and the native packer and identity passes all benefit from the same validated logic, and every validator is bounds-checked against deliberately malformed input. Python .pyc / marshal detection is handled on a separate path and is not part of this fallback. The behavior is proven by adversarial tests that take real committed corpus samples, scramble their magic bytes, section names, and markers, and assert disrobe still detects the correct format and, where it unpacks or parses, still produces the correct recovered output.
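
The shape of such an adversarial test can be sketched in Python for the ZIP case: build a real archive, scramble its leading magic, and assert that detection anchored on the End-of-Central-Directory record still succeeds. The looks_like_zip function is a hypothetical, simplified detector (no ZIP64, no empty archives), not identify_by_structure.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"  # End-of-Central-Directory signature
CDH_SIG = b"PK\x01\x02"   # central-directory-header signature


def looks_like_zip(buf: bytes) -> bool:
    """Detect a ZIP from its EOCD trailer, ignoring the leading local-header magic."""
    # The EOCD is 22 bytes plus a comment of at most 65535 bytes.
    start = max(0, len(buf) - 22 - 0xFFFF)
    pos = buf.rfind(EOCD_SIG, start)
    while pos != -1:
        if pos + 22 <= len(buf):
            cd_size, cd_offset = struct.unpack_from("<II", buf, pos + 12)
            in_bounds = cd_offset + cd_size <= pos
            if in_bounds and buf[cd_offset:cd_offset + 4] == CDH_SIG:
                return True
        pos = buf.rfind(EOCD_SIG, start, pos)  # try an earlier candidate
    return False


# Build a real archive in memory, then damage it in two different ways.
stream = io.BytesIO()
with zipfile.ZipFile(stream, "w") as archive:
    archive.writestr("a.txt", "hello")
original = stream.getvalue()

scrambled_magic = b"XXXX" + original[4:]
eocd = original.rfind(EOCD_SIG)
bad_cd_offset = original[:eocd + 16] + b"\x00\x00\x00\x00" + original[eocd + 20:]
```

Scrambling the first local header leaves the archive detectable, while pointing the central-directory offset at something that is not a central-directory header makes the cross-reference fail.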

Boundary 2: untrusted .dr envelopes

The .dr envelope is content-addressed (BLAKE3-rooted, rkyv hot payload + postcard cold sidecar). A cache hit, a peer-supplied envelope, or a downstream stage all cross this boundary. An envelope is not trusted merely because it claims a hash.

What this boundary defends against:

  • Read-past-end. The zero-copy rkyv access path is bounds-checked at decode; an envelope whose declared lengths exceed its actual bytes is rejected, not read past.
  • Integer overflow in length math. Offset and length arithmetic is checked; an envelope cannot induce a wrapping add that yields an in-bounds-looking slice.
  • BLAKE3-mismatch acceptance. The root hash is recomputed over the payload and compared; an envelope whose content does not match its claimed root is rejected. This is the property that makes --no-cache an optimization toggle and not a correctness toggle: a cache hit is provably the same bytes.

The decoder lives in crates/disrobe-ir/src/envelope.rs and is fuzzed against exactly these three attacks.
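
A toy version of the first and third checks can be written in a few lines of Python. The layout below (a 32-byte root hash, a 4-byte length, then the payload) is invented for illustration and is not the .dr format, and BLAKE2b stands in for BLAKE3, which the Python standard library does not provide. Python integers do not wrap, so the overflow case has no direct analogue here.

```python
import hashlib
import struct

HEADER = struct.Struct("<32sI")  # toy layout: root hash, then payload length


def root_hash(payload: bytes) -> bytes:
    """Stand-in for the BLAKE3 root: a 32-byte BLAKE2b digest."""
    return hashlib.blake2b(payload, digest_size=32).digest()


def seal(payload: bytes) -> bytes:
    """Wrap a payload in the toy envelope."""
    return HEADER.pack(root_hash(payload), len(payload)) + payload


def open_envelope(blob: bytes) -> bytes:
    """Return the payload, or raise ValueError if the envelope is malformed."""
    if len(blob) < HEADER.size:
        raise ValueError("truncated header")
    claimed_root, declared_len = HEADER.unpack_from(blob)
    # Compare the declared length with the bytes actually present before slicing.
    if declared_len != len(blob) - HEADER.size:
        raise ValueError("declared length does not match envelope size")
    payload = blob[HEADER.size:]
    # Recompute the root over the payload; a claimed hash is never trusted.
    if root_hash(payload) != claimed_root:
        raise ValueError("root hash mismatch")
    return payload


good = seal(b"recovered artifact")
tampered = good[:-1] + bytes([good[-1] ^ 0x01])
oversized = HEADER.pack(root_hash(b"abc"), 10_000) + b"abc"
```

Only the untouched envelope opens; the tampered and oversized ones are rejected before any caller sees their payload.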

Boundary 3: the network surface (disrobe serve)

When the daemon runs, HTTP, gRPC, and LSP-over-stdio each cross a trust boundary. The governing rule is that the server never opens a file based on a client-controlled string.

  • HTTP, gRPC, and the LSP disrobe/analyze method accept bytes_b64 only, never a path. There is no client-reachable code path that turns a request field into a filesystem read.
  • All request bodies reject unknown fields via #[serde(deny_unknown_fields)], closing field-smuggling and forward-compat-confusion attacks.
  • A non-loopback HTTP bind emits a tracing::warn! banner at startup, so an operator who exposes the daemon beyond localhost is told so explicitly.

The daemon is intended for localhost / trusted-network use; it is not an authenticated multi-tenant service, and exposing it publicly is an operator decision the warning banner flags.
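
The two request rules (bytes only, unknown fields rejected) can be mirrored in a short Python sketch. The daemon enforces them in Rust with serde; the function name and the single-field schema below are assumptions made for illustration, not the daemon's actual request type.

```python
import base64
import binascii
import json

ALLOWED_FIELDS = {"bytes_b64"}  # hypothetical schema with one permitted field


def parse_analyze_request(body: str) -> bytes:
    """Decode a request body, rejecting unknown fields and invalid base64."""
    request = json.loads(body)
    if not isinstance(request, dict):
        raise ValueError("request must be a JSON object")
    unknown = set(request) - ALLOWED_FIELDS
    if unknown:
        # Mirrors deny_unknown_fields: an extra field is an error, not ignored.
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    encoded = request.get("bytes_b64")
    if not isinstance(encoded, str):
        raise ValueError("bytes_b64 must be a string")
    try:
        return base64.b64decode(encoded, validate=True)
    except binascii.Error as exc:
        raise ValueError("bytes_b64 is not valid base64") from exc


sample_b64 = base64.b64encode(b"MZ").decode("ascii")
ok_body = json.dumps({"bytes_b64": sample_b64})
smuggled_body = json.dumps({"bytes_b64": sample_b64, "path": "/etc/passwd"})
path_only_body = json.dumps({"path": "/etc/passwd"})
```

Because the parser returns bytes and never a path, there is nothing downstream that could be handed to a file-open call.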

Boundary 4: subprocess backends and optional sample execution

This is the boundary an analyst can choose not to cross at all. It covers two distinct sub-cases:

Subprocess backends over the artifact (not the sample's logic). Optional external tools (Ghidra, CFR, Vineflower, jadx, ILSpy, dnSpy, de4dot, Rizin) run as subprocesses over the derived artifact. They never execute the sample's own entry point. The exposure here is command-line construction: command lines are built from configuration and sometimes from user input, so command injection and argument smuggling are the in-scope threats, mitigated by constructing argument vectors directly rather than shelling out through a string.
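
The difference between an argument vector and a shell string is easy to show in Python. The sketch below is an assumption-laden illustration, not disrobe's backend launcher: run_backend is a hypothetical helper, and the Python interpreter itself plays the external tool so the example runs anywhere.

```python
import subprocess
import sys


def run_backend(tool_argv: list[str], artifact_path: str, timeout: float = 30.0) -> str:
    """Run an external tool over an artifact using an argument vector, never a shell."""
    argv = [*tool_argv, artifact_path]  # the path stays one argument, whatever it contains
    done = subprocess.run(
        argv,
        shell=False,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return done.stdout


# Stand-in for a real backend: a child process that echoes its first argument.
ECHO_TOOL = [sys.executable, "-c", "import sys; print(sys.argv[1])"]

hostile_name = "sample.bin; echo pwned $(id)"
echoed = run_backend(ECHO_TOOL, hostile_name)
```

The child receives the hostile file name as a single literal argument; no shell ever parses the semicolon or the command substitution.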

Dynamic execution of the sample. A small number of paths can run adversarial code, and none is on by default. Each sits behind a named flag:

Path                          Gate             What runs
PyArmor v6/v7 dynamic-hook    --allow-dynamic  The obfuscated wrapper, in a watched subprocess, to capture marshal streams. Watchdog via --dynamic-timeout (default 60s).
PyArmor BCC native-body lift  --allow-bcc      Ghidra-headless over the native body: the analysis tool runs, not the sample's logic in-process.

The default static paths (pickle symbolic VM, the v8 and v9-pro PyArmor peels) need no such gate: they parse and walk, they do not detonate. When dynamic execution is unavoidable, run it inside a disposable, network-isolated sandbox. disrobe gives you a watchdog and a captured-marshal manifest, but a dynamic hook is, by definition, executing attacker code.

Non-execution stance (restated as an invariant)

The default-static stance is a design invariant, not a configuration default that can drift:

  • disrobe does not unpickle. disrobe pickle trace walks the opcode stream symbolically, building the object graph without instantiating a single real object or resolving a single real global; disrobe pickle safety grades danger statically.
  • disrobe does not call __reduce__, does not run a packed binary, does not invoke a sample's entry point on any default path.
  • Any way to make a default path execute a sample is a vulnerability, in scope for the Security policy.

Plugin and WASM isolation

Where disrobe loads analysis logic as data rather than as native code, that logic runs sandboxed: WASM-hosted analysis executes inside a wasmparser-validated, memory-bounded interpreter with no ambient filesystem or network capability, so a malicious or malformed module can consume bounded compute and nothing more. This keeps the extensibility surface from becoming a fresh native-code execution boundary.

Supply chain

The integrity of the binary the analyst runs is its own boundary:

  • No untrusted bytecode in the public corpus. The repository does not ship third-party copyrighted obfuscated bytecode; fixtures are either self-generated by corpus/generate.{sh,ps1} or referenced by BLAKE3 hash only. Every shipped fixture is pinned by hash in corpus/native/packers/MANIFEST.toml and sibling registries, and tests verify byte-identity before the parser ever sees the bytes.
  • Signed releases. Release artifacts are signed with cosign keyless OIDC and minisign; every cosign signature is recorded in the Rekor public transparency log. Verification commands are in the Security policy.
  • Dependency hygiene. cargo deny (advisories / bans / licenses / sources) runs on every push and weekly; cargo audit runs weekly. The clippy gate (-D warnings) is required for every commit on main.
  • History hygiene. CI runs on every push, and the local verification chain (clippy -D warnings, fmt, tests, cargo deny) is the enforced pre-push gate; commit authorship uses the GitHub noreply form so personal email never enters history.

Explicitly out of scope

The threat model deliberately does not defend against:

  • Decompilation-output correctness on adversarial bytecode. disrobe will sometimes emit wrong source for hostile input; the round-trip metric exists to flag this. A non-byte-perfect decompile is correctness work, not a security boundary.
  • Compute exhaustion via legitimate input. Decompiling a 66 MiB Hermes bundle is genuinely expensive; a slow-but-bounded decompile of real input is not a vulnerability. (Adversarial amplification, a tiny input that forces unbounded work, is in scope under Boundary 1.)
  • Vulnerabilities inside wrapped third-party tools. Ghidra, jadx, CFR, and friends have their own security channels; we forward where we can identify the upstream.
  • Trusting the analyst. disrobe assumes the operator is authorized and acting in good faith; it gates grey-zone protector behavior behind --i-have-authorization but does not, and cannot, adjudicate that authorization.

Reporting

If you find a way to cross a boundary that this model claims is sealed (make a default path execute a sample, escape a container, accept a hash-mismatched envelope, or make the daemon read a file from a client string), that is a security issue. Report it privately, never as a public issue. See the Security policy.

GitHub Action

disrobe ships a composite GitHub Action that downloads the matching release binary, runs a scan over a path or glob, and uploads the result to GitHub code scanning as SARIF. It runs entirely in the runner shell (no Docker image, no build step) so it starts in seconds.

Quick start

Add a workflow that scans build artifacts on every push and surfaces findings in the Security -> Code scanning tab.

name: disrobe-scan
on:
  push:
  pull_request:

permissions:
  contents: read
  security-events: write   # required for the SARIF upload

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: 1-3-7/disrobe@v0.10.0
        with:
          path: dist/
          command: auto
          fail-on: failed

The security-events: write permission is what lets the action publish SARIF to code scanning; without it the upload step is skipped by GitHub.

What it does

  1. Resolves the runner OS/arch to a release target triple (x86_64-unknown-linux-gnu, aarch64-apple-darwin, x86_64-pc-windows-msvc, and the rest of the release matrix).
  2. Downloads disrobe-<version>-<target>.tar.zst (or .zip on Windows) plus SHA256SUMS from this repository's Releases, and verifies the archive against SHA256SUMS before extracting. A checksum mismatch fails the step.
  3. Runs disrobe <command> <args> <path> --sarif --out <out-dir>, capturing the SARIF document.
  4. Uploads the SARIF to code scanning and the recovered-artifact directory as a workflow artifact.

Inputs

Input            Default                  Description
path             (required)               File, directory, or glob to analyze. Passed verbatim to the command.
command          auto                     disrobe subcommand (auto, scan, behavior, ...).
args             ""                       Extra arguments inserted after the command and before the path (for example --max-depth 12).
version          action ref, then latest  Release tag to download (v0.10.0, latest).
fail-on          never                    Fail the step at or above a verdict: never, incomplete, failed, any.
sarif-file       disrobe.sarif            Path the action writes the SARIF to.
out-dir          disrobe-out              Directory disrobe writes recovered artifacts into.
upload-sarif     true                     Upload SARIF to GitHub code scanning.
upload-artifact  true                     Upload the recovered-artifact directory.
token            ${{ github.token }}      Token used to download the release asset.

Outputs

Output   Description
sarif    Path to the SARIF file the action produced.
verdict  Worst verdict observed (ok, incomplete, failed).
summary  One-line human-readable run summary.

Pinning the version

Pin a tag for reproducible CI:

      - uses: 1-3-7/disrobe@v0.10.0
        with:
          path: suspect.bin
          version: v0.10.0

Leaving version unset downloads the release matching the action ref, falling back to the rolling latest release. Pin a tag in production so a new release cannot change your scan results silently.

Gating a pull request

Block a merge when a packed or obfuscated artifact slips into the tree:

      - uses: 1-3-7/disrobe@v0.10.0
        with:
          path: "build/**/*"
          command: auto
          args: --max-depth 12
          fail-on: failed

fail-on: failed fails only when the chain itself errors; fail-on: incomplete is stricter and also fails when disrobe reports findings it could not fully resolve. The action reports what disrobe detects; it does not invent verdicts.
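The threshold behaviour can be pictured as an ordered comparison. This is a minimal sketch, assuming the verdicts rank ok < incomplete < failed as the Outputs table suggests; the any value is deliberately left out because its exact meaning is not spelled out here, and should_fail is a hypothetical name.

```python
# Assumed ranking of the verdicts the action reports.
SEVERITY = {"ok": 0, "incomplete": 1, "failed": 2}


def should_fail(verdict, fail_on):
    """Return True when the observed verdict is at or above the fail-on threshold."""
    if fail_on == "never":
        return False
    if fail_on not in ("incomplete", "failed"):
        # "any" is a valid input to the action but is not modeled in this sketch.
        raise ValueError(f"threshold not modeled in this sketch: {fail_on!r}")
    return SEVERITY[verdict] >= SEVERITY[fail_on]
```

With fail-on: failed an incomplete verdict passes the step, while with fail-on: incomplete both incomplete and failed verdicts fail it.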

Security posture

The action verifies every download against SHA256SUMS before extracting it, and every release archive additionally carries a cosign signature bundle you can verify out of band. disrobe itself performs pure static analysis by default; see the forensics and malware-safety posture.

pre-commit hook

disrobe ships a pre-commit.com hook that scans staged files and fails the commit when a packed or protected artifact is detected. Use it to stop someone from accidentally (or maliciously) committing a UPX-packed binary, a PyArmor-protected module, a PyInstaller one-file build, or a SourceDefender/Nuitka/PyFreeze blob.

Setup

Add the hook to a consuming project's .pre-commit-config.yaml:

repos:
  - repo: https://github.com/1-3-7/disrobe
    rev: v0.10.0
    hooks:
      - id: disrobe

Then install it:

pre-commit install

The hook requires the disrobe binary on PATH (install it from the Releases page); alternatively, set the DISROBE_BIN environment variable to an explicit path. It also needs python3 available to parse the chain report.

What it detects

For each staged file the hook runs disrobe auto <file> --json against a throwaway output directory and inspects which detectors the chain picked. By default it blocks only the high-precision packer/protector detectors, which key off unambiguous structural magic and do not false-positive on ordinary source:

| Detector pass | Blocks |
| --- | --- |
| native.packer-unpack | UPX, Petite, kkrunchy, and other native packers |
| pyarmor.unpack | PyArmor-protected Python |
| pyinstaller.extract | PyInstaller one-file / one-dir builds |
| sourcedefender.decrypt | SourceDefender-encrypted Python |
| nuitka.extract | Nuitka-compiled binaries |
| pyfreeze.extract | Frozen-Python blobs |

This is deliberately conservative: it blocks exactly the formats above, nothing more.

Tuning

Three environment variables tune the gate:

| Variable | Default | Effect |
| --- | --- | --- |
| DISROBE_BIN | disrobe | Path to the disrobe binary. |
| DISROBE_BLOCK_PASSES | the six passes above | Comma-separated detector pass-ids to block. |
| DISROBE_BLOCK_FAMILIES | (empty) | Comma-separated detector families to additionally block. |
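How these variables could shape the gate is sketched below. It is an illustration, not the hook's actual script: the function names are hypothetical, and the list of picked pass-ids is taken as an argument because the layout of the chain report is not described here.

```python
# The six pass-ids the hook blocks by default, as listed under "What it detects".
DEFAULT_BLOCK_PASSES = (
    "native.packer-unpack",
    "pyarmor.unpack",
    "pyinstaller.extract",
    "sourcedefender.decrypt",
    "nuitka.extract",
    "pyfreeze.extract",
)


def split_csv(value):
    """Split a comma-separated variable into a set, ignoring blanks and spaces."""
    return {item.strip() for item in value.split(",") if item.strip()}


def gate_config(env):
    """Read the three tuning variables from an environment mapping."""
    passes_raw = env.get("DISROBE_BLOCK_PASSES")
    return {
        "bin": env.get("DISROBE_BIN") or "disrobe",
        # Setting DISROBE_BLOCK_PASSES is assumed to replace the default list.
        "passes": split_csv(passes_raw) if passes_raw else set(DEFAULT_BLOCK_PASSES),
        "families": split_csv(env.get("DISROBE_BLOCK_FAMILIES", "")),
    }


def scan_command(config, staged_file):
    """Build the per-file command: disrobe auto <file> --json."""
    return [config["bin"], "auto", staged_file, "--json"]


def blocking_hits(config, picked_passes):
    """Return the picked pass-ids that are on the block list, sorted."""
    return sorted(set(picked_passes) & config["passes"])
```

Calling gate_config(os.environ) in a wrapper script would give the effective block list; a non-empty blocking_hits result is what would fail the commit.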

The family override is broader but noisier. disrobe's source-level obfuscation classifiers (js.deob, lua.deob, py.deob) are tuned to attempt recovery aggressively, so they can fire at high confidence on ordinary text and markdown. Setting DISROBE_BLOCK_FAMILIES=obfuscator-wrapper,packer-archive catches source-level obfuscation, but expect false positives on benign files; scope the hook with its files:/exclude: patterns.

repos:
  - repo: https://github.com/1-3-7/disrobe
    rev: v0.10.0
    hooks:
      - id: disrobe
        files: '\.(exe|dll|so|dylib|pyc|pyz|bin)$'
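pre-commit treats files: as a Python regular expression searched against each staged path, so the pattern above can be checked locally before committing the config. A small sketch (hook_would_scan is a hypothetical helper):

```python
import re

# Same pattern as the files: entry in the config above.
FILES_PATTERN = re.compile(r"\.(exe|dll|so|dylib|pyc|pyz|bin)$")


def hook_would_scan(path):
    """Return True when the staged path matches the hook's files: pattern."""
    return FILES_PATTERN.search(path) is not None
```

Note that the trailing $ means a versioned shared object such as lib/libfoo.so.1 is not matched; widen the pattern if such files should be scanned too.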

Bypassing

A legitimately-committed protected artifact can skip the hook for one commit:

SKIP=disrobe git commit -m "vendor signed third-party binary"

Security posture

The hook runs disrobe auto, which performs pure static analysis by default; it does not execute the staged file. See the forensics and malware-safety posture. The scan writes recovered artifacts only into a temporary directory that the hook deletes on exit; your working tree is never modified.

MCP server

disrobe ships a Model Context Protocol server so an AI agent (Claude, Cursor, and other MCP clients) can drive deobfuscation and decompilation directly as tools. It speaks MCP over stdio via rmcp.

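Every analysis tool takes inline base64 bytes and returns structured JSON; the note-keeping tools (rename, annot, provenance_lookup) take the small structured arguments listed in the tool catalog. The server never reads a file from disk based on client input; there is no path parameter on any tool. This is the same hard rule the HTTP/gRPC/LSP transports enforce; see the security posture.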

Running it

Two equivalent entry points:

disrobe serve --mcp     # the CLI's MCP companion over stdio
disrobe-mcp             # the standalone MCP binary

Registering with Claude Code

claude mcp add disrobe -- disrobe serve --mcp

Or register the standalone binary:

claude mcp add disrobe -- disrobe-mcp

Registering with a generic MCP client

Most clients accept a JSON server entry. Point command at disrobe (or disrobe-mcp) and run over stdio:

{
  "mcpServers": {
    "disrobe": {
      "command": "disrobe",
      "args": ["serve", "--mcp"]
    }
  }
}

Tool catalog

ToolInputOutput
autobytes_b64, optional max_depthChain verdict, detected formats, and per-pass recovery summary. Auto-detects and chains disrobe's Python + native-packer passes.
decompilebytes_b64, optional max_depthEvery terminal recovered-source artifact (language-keyed text), for example a .pyc decompiled to Python.
iocbytes_b64Indicators of compromise: URLs, domains, IPs, emails, paths, registry keys, wallet addresses, crypto constants (one decode layer of base64/hex).
behaviorbytes_b64, optional importsStatic capability summary across network, filesystem, process-exec, registry-persistence, crypto, anti-analysis, and dynamic-code categories, with MITRE ATT&CK ids.
stringsbytes_b64, optional min_len, decodePrintable ASCII + UTF-16 strings, optionally decoding base64/rot/stack-string obfuscation, tagged with their encoding.
verifybytes_b64Verify a .dr envelope: blake3 root hash, rung, hot/cold sizes.
renameold, new, optional noteAppend a symbol-rename record to .disrobe/notes/renames.json.
annottargetRegenerate and validate an annotation sidecar under .disrobe/annotations/.
provenance_lookupmap_json, lineLook up the provenance entry for a line in a provenance-map document.

The auto and decompile tools cover disrobe's Python and native-packer surface (PyArmor, PyInstaller, SourceDefender, Nuitka, PyFreeze, .pyc disassembly + decompilation, native packers, and container formats), the highest-value chain for an agent triaging an unknown blob. For the full language matrix, drive the CLI or the HTTP daemon.

Example call

A client calls decompile with the base64 of a .pyc and receives the recovered Python:

{
  "name": "decompile",
  "arguments": { "bytes_b64": "4w0NCgAAAAA..." }
}

{
  "schema": "disrobe.decompile/v1",
  "verdict": "Complete",
  "recovered": [
    { "pass": "py.decompile", "language": "Python", "formatted": true, "source": "x = a + b\n..." }
  ]
}
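On the client side, preparing such a call mostly means base64-encoding the input bytes. The sketch below wraps the arguments in the JSON-RPC tools/call request that MCP clients send and pulls the recovered sources out of a result shaped like the example above. The helper names are hypothetical, and real clients normally build the envelope for you.

```python
import base64


def tool_call(name, data, request_id=1, **options):
    """Build an MCP tools/call request carrying the input as inline base64."""
    if not data:
        raise ValueError("refusing to send an empty payload")
    arguments = {"bytes_b64": base64.b64encode(data).decode("ascii"), **options}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def recovered_sources(payload):
    """Return (language, source) pairs from a disrobe.decompile/v1 payload."""
    return [(item["language"], item["source"]) for item in payload.get("recovered", [])]
```

For instance, tool_call("decompile", open("module.pyc", "rb").read(), max_depth=4) yields a request whose arguments contain only bytes_b64 and max_depth, the two inputs the catalog lists for decompile.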

Security posture

The server performs pure static analysis by default and never executes the supplied bytes. It rejects empty or malformed base64 with a typed error, and rejects unknown JSON fields on every tool. Because no tool accepts a filesystem path, there is no way to make the server read an arbitrary file via a client-controlled string. See the forensics and malware-safety posture and the threat model.
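The input rules in this section (no unknown fields, no empty or malformed base64) amount to a strict decode step in front of every tool. The following is a Python illustration of that idea only; the server itself is Rust, and the names InvalidInput and decode_arguments are hypothetical.

```python
import base64
import binascii


class InvalidInput(ValueError):
    """Stand-in for the typed error a strict server would return."""


def decode_arguments(arguments, allowed=("bytes_b64",)):
    """Reject unknown fields and bad base64, then return the decoded bytes."""
    unknown = set(arguments) - set(allowed)
    if unknown:
        raise InvalidInput(f"unknown field(s): {sorted(unknown)}")
    encoded = arguments.get("bytes_b64")
    if not isinstance(encoded, str) or not encoded:
        raise InvalidInput("bytes_b64 must be a non-empty string")
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently skipping them.
        data = base64.b64decode(encoded, validate=True)
    except binascii.Error as exc:
        raise InvalidInput(f"malformed base64: {exc}") from None
    if not data:
        raise InvalidInput("decoded payload is empty")
    return data
```

Because a field such as path is simply not in the allowed set, a request carrying one is refused before any analysis starts, which is the property the paragraph above describes.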

Contributing

Contributions are welcome; see the contributing guide.

Building and testing

disrobe builds with a single stable Rust 1.95+ toolchain:

cargo build --release
cargo test -p <crate>          # test a single crate

The JVM decompiler can be memory-intensive on adversarial input. When iterating locally, prefer per-crate test runs over a workspace-wide cargo test --workspace.

The quality bar

Every commit on main must pass the workspace clippy gate with zero warnings, along with the formatting check:

cargo clippy --all-targets -- -D warnings -W unreachable_pub -W missing_debug_implementations -W unused
cargo fmt --all -- --check

The workspace lints are strict by design: unwrap_used is denied, todo! and unimplemented! are denied, and unsafe is forbidden outside the two C-interop crates. New code is fully type-annotated and self-documenting; the codebase carries durable context in dedicated docs rather than inline comments.

README graphs

The dark-theme benchmark and architecture SVGs in the README are generated, not drawn by hand. The data lives in xtask/data/*.json (every plotted value cites its source gate or harness inline), and xtask renders deterministic SVGs into docs/assets/:

cargo run -p xtask -- graphs            # regenerate docs/assets/*.svg
cargo run -p xtask -- graphs --check    # fail if committed SVGs are stale (CI runs this)

After changing a number in xtask/data/, rerun graphs and commit the regenerated SVGs; the graphs CI job rejects any drift. Numbers come only from a committed test gate or a local measurement harness, never from the tool grading its own output, and no competitor recovery percentage is plotted.

Adding a pass

A new ecosystem pass is a new disrobe-pass-<name> crate that:

  1. Implements the shared Pass trait, declaring its required and produced capabilities and its rung transition.
  2. Climbs the five-rung IR ladder rather than jumping rungs.
  3. Ships a pass_run_envelope_roundtrip test and at least one real-fixture integration test in crates/disrobe-cli/tests/.
  4. Wires its standardized emits, returning explicit applicable: false stubs for emits it cannot produce.

Every capability claim must be backed by a fixture in corpus/ and a passing test; nothing aspirational ships as done. Fixtures are baked locally from known-good inputs by corpus/generate.{sh,ps1}; copyrighted third-party obfuscated bytecode is never committed to the public corpus.

No fabrication

A decode that only passes against synthetic, self-generated fixtures is not a feature. Per-pass work is verified against a real corpus and the upstream format spec. Partial recovery carries a confidence tier; detect-only is stated as detect-only. If you are not sure a capability works against real-world input, say so in the PR.

Reporting bugs

Generate an environment report to attach to an issue:

disrobe bug-report --out report.md
disrobe bug-report --out -          # write to stdout

For security issues, do not open a public issue; use the private advisory channel. See Security.

Security

The full security policy lives in SECURITY.md. This page summarizes it.

Reporting a vulnerability

Do not open a public issue for security reports. Use GitHub's private advisory channel:

https://github.com/1-3-7/disrobe/security/advisories/new

Include a description and impact, a minimal reproducer (input bytes, command line, expected vs observed), the disrobe --version output, the OS/arch, and whether you have a candidate fix. Reports are acknowledged within 72 hours; high-severity fixes target 30 days, with same-week turnaround for parsing-of-untrusted-input issues. Reporters are credited (with their preferred handle) in the advisory and release notes; anonymous reports are welcome.

In scope

  • Memory safety in the parsing surface. Any panic or abort on adversarial input, where a clean Result::Err was expected, is in scope; any heap corruption is high severity.
  • Resource exhaustion. Zip-bombs, decompression bombs, recursion bombs, and malformed-length-field bombs that bypass the quotas in crates/disrobe-binfmt/src/quota.rs.
  • Path traversal. zip-slip and equivalents on every container extraction path.
  • Server input handling. disrobe serve (HTTP/gRPC/LSP/MCP) accepts bytes_b64 only; any way to make it read a file via a client-controlled string is high severity.
  • Subprocess invocation. Command injection or argument smuggling in backend invocation.
  • .dr envelope handling. Read-past-end, integer overflow, or BLAKE3-mismatch acceptance.
  • Supply chain. Tampering with published binaries, signature bypass, replay, cosign-bundle manipulation.

Out of scope

  • Decompilation output correctness on adversarial input: that is correctness work flagged by the round-trip metric, not a security bug. File a normal issue.
  • Compute exhaustion from legitimate input (a slow decompile of a 66 MiB bundle is not a vulnerability).
  • Issues in third-party tools disrobe wraps: report to their upstreams.

Hardening posture

  • #![forbid(unsafe_code)] workspace-wide (except the two pyo3-interop crates).
  • Strict clippy gate on every commit.
  • cargo deny on every push plus weekly; cargo audit weekly.
  • Shared container quota machinery.
  • BLAKE3-pinned fixtures.
  • Loopback-default servers with a warning banner on non-loopback binds.
  • Branch protection on main (1 approval + green CI + linear history + no force-push).

Verifying release artifacts

Release binaries are signed with cosign keyless OIDC (recorded in the Rekor transparency log) and minisign:

cosign verify-blob \
  --certificate-identity-regexp '^https://github.com/1-3-7/disrobe/' \
  --certificate-oidc-issuer 'https://token.actions.githubusercontent.com' \
  --signature disrobe-<version>-<target>.tar.zst.sig \
  --bundle    disrobe-<version>-<target>.tar.zst.cosign.bundle \
  disrobe-<version>-<target>.tar.zst

Legal

Decompilation for security research, interoperability, and recovery of your own source is permitted in most jurisdictions: US DMCA §1201(f), EU Software Directive 2009/24/EC art. 6, UK CDPA §50B/50BA, and equivalents in CA / AU / JP. The full statutory posture, with citations and a takedown channel, is in LEGAL.md.

The authorization gate

Grey-zone commercial protectors (PyArmor paid tier, ionCube, SourceGuardian, Zend Guard, the commercial native-packer tier, and the grey-zone .NET/JVM obfuscators) are gated behind the explicit --i-have-authorization flag and never run otherwise. Passing the flag is your assertion that you are authorized to analyze the input under the statutory framing above. Use is your responsibility.

The same flag unlocks the decryption-keys category of the --llm sidecar; without it, requesting that category fails with DR-CLI-0420.

What disrobe will not do

  • It does not ship copyrighted third-party obfuscated bytecode in its public corpus. Fixtures are baked locally from known-good inputs.
  • Grey-zone protectors ship recognizers first; escalation to a full peel only happens after a written legal-posture review committed to docs/legal/<protector>-stance.md.
  • It does not phone home. The only documented network endpoint is disrobe self-update --check-only, and the binary is distributed source-and-release-only.

License

disrobe is licensed under the Elastic License 2.0. Companies and security researchers may use, copy, modify, and distribute it for free; attribution is required, so keep the author, copyright, and licensing notices intact. You may not provide disrobe to third parties as a hosted or managed service, and you may not remove or obscure any licensing, copyright, or other notices. The "disrobe" name and marks are reserved; the license grants no trademark rights. See LICENSE and NOTICE.