chore: bump plugin version to 1.5.0

feat: post-filter --urls to drop dictionary noise while keeping IPs and apex hosts
The hardening patch widened STRICT_URL to recover IPv4 literals, apex 2-label domains and internal hosts that the PR's strict-only regex discarded as collateral while killing Kotlin-stdlib dictionary noise. Widening alone reopened a narrow noise class: 'word.word' fragments such as "www.this" / "this.introduction" pass as apex domains.

Keep extraction permissive and add a small awk pass that decides per host:
- IPv4 literal: always keep (dict fragments are words, never dotted-quads)
- >=3 labels: always keep (any TLD; same tolerance as the original regex)
- any host with a :port or /path: always keep (structured = high signal)
- bare 2-label apex: keep only when the TLD is a real one, matched as a whole field (so "introduction" != "in", the prefix-match bug a single mega-regex would have)

Trade-off documented inline: a first-party host referenced bare with an uncommon TLD (e.g. https://foo.store with no path) is dropped; a path or port keeps it.

awk is POSIX (sub/split/~/print), which is more portable than the bash>=4 'declare -A' already used in the summary header.

Verified: dictionary noise dropped; IPs, apex, internal and subdomain hosts kept; --all on a zero-match tree still exits 0; host list and full-URL list stay consistent (no orphan hosts).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
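
A minimal sketch of the per-host decision described in this commit message, assuming one URL per line on stdin. The function name `keep_url` and the short TLD list are illustrative placeholders, not the names or the list used in `find-api-calls.sh`; any URL that matches none of the four rules is dropped.

```shell
# Hypothetical helper: reads URLs on stdin, prints the ones worth keeping.
keep_url() {
  awk '
    BEGIN {
      # Illustrative TLD allowlist (an assumption, deliberately short).
      n = split("com org net io dev app co in de fr uk it", t, " ")
      for (i = 1; i <= n; i++) tld[t[i]] = 1
    }
    {
      url = $0
      rest = url
      sub(/^[A-Za-z][A-Za-z0-9+.-]*:\/\//, "", rest)   # drop the scheme
      host = rest
      sub(/[\/:].*$/, "", host)                        # drop :port and /path
      structured = (length(rest) > length(host))       # had a port or a path
      nlabels = split(host, label, ".")

      # Rule 1: IPv4 literal, always keep.
      if (host ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/) { print url; next }
      # Rule 2: three or more labels, always keep.
      if (nlabels >= 3) { print url; next }
      # Rule 3: any host with a port or path, always keep.
      if (structured) { print url; next }
      # Rule 4: bare 2-label apex, keep only when the TLD matches as a whole
      # field, so "introduction" is never confused with "in".
      if (nlabels == 2 && (tolower(label[2]) in tld)) { print url; next }
    }
  '
}
```

Fed `https://this.introduction`, `https://example.com`, `https://foo.store` and `https://foo.store/v1`, this sketch keeps only `https://example.com` and `https://foo.store/v1`, which mirrors the trade-off noted above for bare hosts with an uncommon TLD.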
2026-06-10 11:49:02 +02:00 · 2026-06-10 11:06:30 +02:00 · 2026-06-10 10:33:53 +02:00 · 2026-06-10 10:22:16 +02:00 · 2026-04-29 01:40:50 +02:00 · 2026-04-29 01:39:55 +02:00
12 changed files with 1176 additions and 26 deletions
--- a/.claude-plugin/marketplace.json
+++ b/.claude-plugin/marketplace.json
@ -7,14 +7,14 @@
  },
  "metadata": {
    "description": "Claude Code plugins for Android reverse engineering",
-    "version": "1.1.0"
+    "version": "1.5.0"
  },
  "plugins": [
    {
      "name": "android-reverse-engineering",
      "source": "./plugins/android-reverse-engineering",
      "description": "Decompile Android APK/JAR/AAR with jadx, trace call flows through libraries, and document extracted APIs.",
-      "version": "1.1.0",
+      "version": "1.5.0",
      "author": {
        "name": "Simone Avogadro"
      },
--- a/README.md
+++ b/README.md
@ -4,6 +4,8 @@

 A Claude Code skill that decompiles Android APK/XAPK/JAR/AAR files and **extracts the HTTP APIs** used by the app — Retrofit endpoints, OkHttp calls, hardcoded URLs, authentication patterns — so you can document and reproduce them without the original source code.

+> **First-class Kotlin support**: modern Android apps are Kotlin/KMP, heavily obfuscated with R8. This skill recovers the **original Kotlin class names** from metadata R8 cannot strip, and extracts APIs from **Ktor**, **Apollo (GraphQL)** and **Koin** — not just the classic Retrofit/OkHttp stack. See [Kotlin name recovery](#kotlin-name-recovery-r8-deobfuscation) below.
+
 > **Windows / PowerShell support (experimental)**: The `*.ps1` scripts alongside the bash ones are a recent community contribution, still being stabilised. For any issues please open an issue on **this** repository (not on the contributors' upstream forks): the PowerShell scripts are maintained here by [@SimoneAvogadro](https://github.com/SimoneAvogadro).

 ## Table of Contents
@ -22,11 +24,13 @@ A Claude Code skill that decompiles Android APK/XAPK/JAR/AAR files and **extract

 | Capability | Description |
 |------------|-------------|
+| **Fingerprint first (Phase 0)** | Triage an APK/XAPK in seconds — detect the framework (Flutter / React Native / Cordova / Xamarin / native-Kotlin), HTTP stack, obfuscation level and native libs *before* spending time on a full decompile |
 | **Decompile** | APK, XAPK, JAR, and AAR files using jadx and Fernflower/Vineflower (single engine or side-by-side comparison) |
-| **Extract APIs** | Retrofit endpoints, OkHttp calls, hardcoded URLs, auth headers and tokens |
+| **Recover Kotlin names** | Rebuild original `*Repository` / `*ViewModel` / `*UseCase` class names from R8-obfuscated binaries using Kotlin metadata that R8 cannot strip |
+| **Extract APIs** | Retrofit, OkHttp, Volley **and modern Kotlin/KMP stacks: Ktor, Apollo (GraphQL), Koin DI** — endpoints, hardcoded URLs, auth headers, tokens and HMAC request-signing schemes |
 | **Trace call flows** | From Activities/Fragments through ViewModels and repositories down to HTTP calls |
 | **Analyze structure** | Manifest, packages, architecture patterns |
-| **Handle obfuscation** | Strategies for navigating ProGuard/R8 output |
+| **Handle obfuscation** | R8-resistant path/URL extraction plus strategies for navigating ProGuard/R8 output |

 ## Requirements

@ -100,6 +104,10 @@ bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scri
 bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/install-dep.sh jadx
 bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/install-dep.sh vineflower

+# Fingerprint an APK/XAPK BEFORE decompiling (Phase 0 triage):
+# framework, HTTP stack, obfuscation level, native libs, notable SDKs
+bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/fingerprint.sh app.apk
+
 # Decompile APK with jadx (default)
 bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/decompile.sh app.apk

@ -112,10 +120,38 @@ bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scri
 # Run both engines and compare
 bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/decompile.sh --engine both --deobf app.apk

-# Find API calls
+# Find API calls — defaults to a full scan across every supported stack
 bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/find-api-calls.sh output/sources/
 bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/find-api-calls.sh output/sources/ --retrofit
 bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/find-api-calls.sh output/sources/ --urls
+
+# Modern Kotlin/KMP stacks and obfuscation-resistant extraction
+bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/find-api-calls.sh output/sources/ --ktor    # Ktor client
+bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/find-api-calls.sh output/sources/ --apollo  # Apollo / GraphQL
+bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/find-api-calls.sh output/sources/ --paths   # quoted path literals that survive R8 inlining
+```
+
+### Kotlin name recovery (R8 deobfuscation)
+
+Most real-world Kotlin/KMP apps ship through R8, so the decompiled classes come
+out as `a.b.c`. R8 renames the JVM symbols but **cannot strip the Kotlin
+metadata strings** — the Kotlin runtime (reflection, coroutines) needs the
+original fully-qualified names at runtime. This skill mines those
+`@DebugMetadata` / `@Metadata` annotations to rebuild an `obfuscated → real`
+class-name map. On a typical app it recovers ~100 % of the
+`*Repository` / `*ViewModel` / `*UseCase` / `*Impl` classes you actually want to
+read.
+
+```bash
+# 1. Build the mapping from the decompiled sources
+bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/recover-kotlin-names.sh output/sources/ output/names/
+#    → output/names/mapping.tsv, mapping.json, by_package/
+
+# 2. Query it: resolve an obfuscated name, search by real name, or grep
+#    the sources with each hit annotated with its recovered class name
+bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/lookup-name.sh output/names/ LoginRepository
+bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/lookup-name.sh output/names/ -o a.b.c
+bash plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/lookup-name.sh output/names/ --grep 'login' output/sources/
 ```

 ## Repository Structure
@ -130,12 +166,14 @@ android-reverse-engineering-skill/
 │       │   └── plugin.json                 # Plugin manifest
 │       ├── skills/
 │       │   └── android-reverse-engineering/
-│       │       ├── SKILL.md                # Core workflow (5 phases)
+│       │       ├── SKILL.md                # Core workflow (Phase 0–5)
 │       │       ├── references/
 │       │       │   ├── setup-guide.md
 │       │       │   ├── jadx-usage.md
 │       │       │   ├── fernflower-usage.md
 │       │       │   ├── api-extraction-patterns.md
+│       │       │   ├── kotlin-name-recovery.md
+│       │       │   ├── third_party_hosts.txt   # denylist for first/third-party bucketing
 │       │       │   └── call-flow-analysis.md
 │       │       └── scripts/
 │       │           ├── check-deps.sh       # Bash
@ -144,6 +182,9 @@ android-reverse-engineering-skill/
 │       │           ├── install-dep.ps1
 │       │           ├── decompile.sh
 │       │           ├── decompile.ps1
+│       │           ├── fingerprint.sh          # Phase 0 — pre-decompile triage
+│       │           ├── recover-kotlin-names.sh # R8 → real Kotlin class names
+│       │           ├── lookup-name.sh          # query the recovered name map
 │       │           ├── find-api-calls.sh
 │       │           └── find-api-calls.ps1
 │       └── commands/
@ -164,6 +205,7 @@ android-reverse-engineering-skill/

 Thanks to the contributors who have shaped this skill:

+- [@tajchert](https://github.com/tajchert) — Phase 0 fingerprinting, R8-resistant Kotlin name recovery (`recover-kotlin-names.sh`, `lookup-name.sh`), and Ktor / Apollo / Koin / HMAC extraction patterns (#16)
 - [@philjn](https://github.com/philjn) — Native Windows / PowerShell support (`check-deps.ps1`, `install-dep.ps1`, `decompile.ps1`, `find-api-calls.ps1`) and split/bundled APK detection in `decompile.sh` (#8)
 - [@txhno](https://github.com/txhno) — Migration to the maintained [`ThexXTURBOXx/dex2jar`](https://github.com/ThexXTURBOXx/dex2jar) fork (#12)
 - [@muqiao215](https://github.com/muqiao215) — Decompile partial-success handling, Fernflower timeout safeguard, intermediate-artifact directory (#10)
--- a/plugins/android-reverse-engineering/.claude-plugin/plugin.json
+++ b/plugins/android-reverse-engineering/.claude-plugin/plugin.json
@ -1,6 +1,6 @@
 {
  "name": "android-reverse-engineering",
-  "version": "1.1.0",
+  "version": "1.5.0",
  "description": "Decompile Android APK/JAR/AAR with jadx, trace call flows through libraries, and document extracted APIs.",
  "author": {
    "name": "Simone Avogadro"
--- a/plugins/android-reverse-engineering/skills/android-reverse-engineering/SKILL.md
+++ b/plugins/android-reverse-engineering/skills/android-reverse-engineering/SKILL.md
@ -24,6 +24,31 @@ If anything is missing, follow the installation instructions in `${CLAUDE_PLUGIN

 ## Workflow

+### Phase 0: Fingerprint the App (recommended before anything else)
+
+Before installing tools or decompiling, run a fast triage to determine what
+kind of app you are looking at. **Decompiling Java is mostly useless for
+Flutter, React Native, Cordova/Capacitor, and Xamarin apps** — the real code
+lives elsewhere. The fingerprint script tells you which.
+
+```bash
+bash ${CLAUDE_PLUGIN_ROOT}/skills/android-reverse-engineering/scripts/fingerprint.sh <file.apk|file.xapk>
+```
+
+It prints, in one screen:
+
+- **Mobile framework** (Flutter / React Native / Cordova / Xamarin / Native Kotlin / etc.) with the file marker that triggered the verdict.
+- **HTTP stack** (Retrofit, OkHttp, Ktor, Apollo, Volley) detected via DEX string scan — works even when class names are obfuscated.
+- **DI / serialization** signals (Hilt, Dagger, Koin, kotlinx.serialization, Moshi, Gson, Jackson).
+- **Obfuscation level** estimate based on root-level short-named packages.
+- **Notable third-party SDKs** (AppsFlyer, Datadog, Sentry, Firebase, payment SDKs, support/chat SDKs, etc.).
+- **Consolidated native libraries** across the base APK and all splits — XAPK split bundles often place `.so` files in `config.<abi>.apk`, not in `base.apk`.
+- **Recommended next step**, which differs by framework (e.g. for Flutter the script suggests `blutter` / `strings libapp.so` rather than jadx).
+
+If the fingerprint says the app is Flutter / RN / Cordova / Xamarin, **stop**
+and switch to the framework-appropriate tooling. Phases 1–5 below assume a
+native (Java/Kotlin) Android app.
+
 ### Phase 1: Verify and Install Dependencies

 Before decompiling, confirm that the required tools are available — and install any that are missing.
@ -123,12 +148,45 @@ Navigate the decompiled output to understand the app's architecture.
   - Distinguish app code from third-party libraries
   - Look for packages named `api`, `network`, `data`, `repository`, `service`, `retrofit`, `http` — these are where API calls live

-3. **Identify the architecture pattern**:
+3. **Read every `BuildConfig.java`** — these are almost never obfuscated and frequently leak the highest-signal constants in the entire APK (base URLs, flavor names, build type, third-party API keys, feature flags):
+   ```bash
+   find <output>/sources -name BuildConfig.java -exec grep -H '=' {} \;
+   ```
+   Each Gradle module emits its own `BuildConfig`, so expect 1–N hits. Read all of them.
+
+4. **Identify the architecture pattern**:
   - MVP: look for `Presenter` classes
   - MVVM: look for `ViewModel` classes and `LiveData`/`StateFlow`
   - Clean Architecture: look for `domain`, `data`, `presentation` packages
   - This informs where to look for network calls in the next phases

+### Phase 3.5: Recover Kotlin Class Names (only for obfuscated Kotlin apps)
+
+If Phase 0 reported moderate / high obfuscation **and** the app is Kotlin
+(Compose / kotlin_module markers detected), run the metadata recovery
+script before tracing call flows. R8 obfuscates JVM symbols but cannot
+strip Kotlin metadata strings, so original FQNs leak through
+`@DebugMetadata` and `@Metadata.d2`.
+
+```bash
+bash ${CLAUDE_PLUGIN_ROOT}/skills/android-reverse-engineering/scripts/recover-kotlin-names.sh \
+    <output>/sources <output>/mapping
+```
+
+Then use the lookup helper instead of plain grep — every hit comes
+annotated with the owning class's real name:
+
+```bash
+bash ${CLAUDE_PLUGIN_ROOT}/skills/android-reverse-engineering/scripts/lookup-name.sh \
+    <output>/mapping --grep '"/api/' <output>/sources
+```
+
+Typical recovery on a real-world Kotlin app: ~100% of `*Repository` /
+`*ViewModel` / `*UseCase` / `*Impl` classes, ~80% of DTOs.
+
+See `${CLAUDE_PLUGIN_ROOT}/skills/android-reverse-engineering/references/kotlin-name-recovery.md`
+for the full technique and limitations.
+
 ### Phase 4: Trace Call Flows

 Follow execution paths from user-facing entry points down to network calls.
@ -190,15 +248,32 @@ On Windows (PowerShell):
 & "${CLAUDE_PLUGIN_ROOT}/skills/android-reverse-engineering/scripts/find-api-calls.ps1" <output>/sources/ -Auth
 ```

-Then, for each discovered endpoint, read the surrounding source code to extract:
- HTTP method and path
- Base URL
- Path parameters, query parameters, request body
- Headers (especially authentication)
- Response type
- Where it's called from (the call chain from Phase 4)
+Document the endpoints in **two tiers** — going deep on every endpoint is
+prohibitively expensive on apps with 100+ paths, and most of them do not
+warrant it. Always produce Tier 1; expand Tier 2 only for the endpoints
+that matter.

-**Document each endpoint** using this format:
+#### Tier 1 — flat inventory (always)
+
+A single table covering every discovered endpoint. Aim for one line each;
+if you cannot determine a column, write `?`.
+
+| Host | Method | Path | Auth | Source file |
+|------|--------|------|------|-------------|
+| `api.example.com` | GET | `/v1/users/profile` | Bearer | `com/example/api/UserApi.java` |
+| `api.example.com` | POST | `/v1/auth/login` | none | `com/example/api/AuthApi.java` |
+
+This table answers "what does the backend look like" in one screen and
+takes ~5 minutes to produce from the `--paths` output even on a large app.
+
+#### Tier 2 — per-endpoint detail (only for high-value endpoints)
+
+Reserve the detailed format for the few endpoints that actually need it:
+
+- the entire authentication flow (login, refresh, logout, OTP/SMS, anonymous, registration)
+- payment / checkout / order-creation endpoints
+- anything the user explicitly asked about
+- anything that looked unusual during the scan (custom signing, undocumented headers, etc.)

 ```markdown
 ### `METHOD /path`
@ -213,6 +288,10 @@ Then, for each discovered endpoint, read the surrounding source code to extract:
 - **Called from**: `LoginActivity → LoginViewModel → UserRepository → ApiService`
 ```

+As a default, do not produce Tier 2 entries for more than ~10 endpoints
+unless the user explicitly asks for more — Tier 1 plus a Tier 2 deep dive
+on auth + 1-2 key flows is what most consumers of this work actually want.
+
 See `${CLAUDE_PLUGIN_ROOT}/skills/android-reverse-engineering/references/api-extraction-patterns.md` for library-specific search patterns and the full documentation template.

 ## Output
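
The Tier 1 flat inventory described in Phase 5 of `SKILL.md` can be produced mechanically once the findings are collected. The sketch below assumes a hand-maintained, tab-separated notes file with the columns host, method, path, auth and source file; that input format and the helper name `tier1_table` are assumptions for illustration and are not something the skill's scripts emit. Columns that could not be determined are rendered as `?`, as the workflow asks.

```shell
# Hypothetical helper: TSV (host, method, path, auth, source) on stdin,
# Markdown Tier 1 table on stdout.
tier1_table() {
  awk -F '\t' '
    BEGIN {
      print "| Host | Method | Path | Auth | Source file |"
      print "|------|--------|------|------|-------------|"
    }
    NF > 0 {
      # Missing or empty columns become "?".
      for (i = 1; i <= 5; i++) col[i] = ($i == "") ? "?" : $i
      printf "| `%s` | %s | `%s` | %s | `%s` |\n", col[1], col[2], col[3], col[4], col[5]
    }
  '
}
```

Usage, with a hypothetical notes file: `tier1_table < endpoints.tsv > tier1.md`.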
--- a/plugins/android-reverse-engineering/skills/android-reverse-engineering/references/api-extraction-patterns.md
+++ b/plugins/android-reverse-engineering/skills/android-reverse-engineering/references/api-extraction-patterns.md
@ -55,6 +55,65 @@ grep -rn 'Interceptor\|addInterceptor\|addNetworkInterceptor\|intercept(' source
 grep -rn '\.execute()\|\.enqueue(' sources/
 ```

+## Ktor (Kotlin)
+
+Ktor is the dominant HTTP client in Kotlin Multiplatform and modern
+Kotlin-only Android apps. Unlike Retrofit, Ktor does **not** use annotations
+to declare endpoints — paths appear as plain string arguments to
+`client.get(...)` / `client.post(...)`, often inside an extension function.
+
+```bash
+# Calls
+grep -rn '\b\(client\|httpClient\|HttpClient\)\.\(get\|post\|put\|delete\|patch\|head\|request\)\s*[<(]' sources/
+
+# Default request / base URL configuration
+grep -rn 'HttpRequestBuilder\|defaultRequest\s*{\|\burl\s*(\s*"\|URLBuilder' sources/
+
+# Auth plugin (bearer / refresh)
+grep -rn '\bbearer\s*{\|BearerTokens\s*(\|loadTokens\s*{\|refreshTokens\s*{' sources/
+```
+
+Typical Ktor call (after decompile):
+
+```java
+client.get("api/v1/users/profile") {
+    parameter("locale", "en-US");
+}
+```
+
+The base URL is usually applied via `defaultRequest { url { host = "..." } }`
+in the client builder. Search for `host =` and `URLProtocol.HTTPS` references
+to pin it down.
+
+**Note on obfuscation:** in heavily R8-shrunk apps the call site
+`client.get("path")` is inlined to something like `aVar.a(dVar, "path")`
+and the `client.<verb>(` regex misses it. The path string itself is **not**
+obfuscated, however — fall back to the generic path-literal search
+(`--paths`) for the endpoint inventory in those cases. Ktor library
+symbols (`BearerTokens`, `loadTokens`, `refreshTokens`, `URLProtocol`)
+remain searchable because they are part of Ktor's public API.
+
+Ktor's authentication plugin uses the
+[`Auth { bearer { loadTokens { ... }; refreshTokens { ... } } }`](https://ktor.io/docs/auth.html)
+DSL — bearer access tokens with automatic refresh. After R8, the DSL
+lambdas appear as `Function2`/`Function3` impls referencing
+`BearerTokens(...)` calls.
+
+## Apollo Kotlin (GraphQL)
+
+```bash
+# Client setup
+grep -rn 'ApolloClient\|\.serverUrl(\|HttpNetworkTransport' sources/
+
+# Operations (queries / mutations / subscriptions)
+grep -rn '\.query(\s*[A-Z]\|\.mutation(\s*[A-Z]\|\.subscription(\s*[A-Z]' sources/
+```
+
+Apollo generates one class per operation under a generated package; once you
+find the GraphQL endpoint URL via `ApolloClient.serverUrl("...")`, use the
+operation classes themselves as the API documentation — each carries its
+GraphQL document text in `OPERATION_DOCUMENT`.
+
 ## Volley

 ```bash
@ -77,6 +136,25 @@ grep -rn 'loadUrl\|evaluateJavascript\|addJavascriptInterface\|WebViewClient\|sh

 WebView-based apps may load API endpoints via JavaScript bridges. Look for `@JavascriptInterface` annotated methods.

+## Endpoint-Shaped Path Literals (obfuscation-resistant)
+
+When the HTTP client cannot be identified (custom abstraction, heavy
+inlining, KMP shared module), or the call sites are obfuscated to
+`a.b(c, "path")`, fall back to extracting the path string literals
+themselves. R8 does not obfuscate string contents, so paths leak through.
+
+```bash
+# All quoted strings shaped like an API path, deduplicated
+grep -rhoE '"(/[A-Za-z0-9_{}.\-]+(/[A-Za-z0-9_{}.\-]+)+/?|(api|v[0-9]+|graphql|users?|account|auth|sso|oauth|profile|cart|basket|order|product|inventory|search|category|address|location|delivery|payment|invoice|favo[u]?rites?)(/[A-Za-z0-9_{}.\-]+)+/?)"' sources/ \
+    | grep -Ev '^"(image|video|audio|text|application|content)/|^"/(proc|sys|dev|tmp|etc)/' \
+    | sort -u
+```
+
+The skill ships this as `find-api-calls.sh --paths`, which prints both a
+deduplicated inventory and the full list of call sites. On real-world
+Kotlin apps this single command typically produces 100–300 distinct
+endpoint paths, which is the most useful first artifact for documentation.
+
 ## Hardcoded URLs and Secrets

 ```bash
--- a/plugins/android-reverse-engineering/skills/android-reverse-engineering/references/call-flow-analysis.md
+++ b/plugins/android-reverse-engineering/skills/android-reverse-engineering/references/call-flow-analysis.md
@ -84,9 +84,9 @@ Look for:
 - Firebase/analytics initialization
 - Base URL configuration

-## 5. Dependency Injection (Dagger / Hilt)
+## 5. Dependency Injection

-Modern Android apps use DI. Trace bindings to find implementations:
+### Dagger / Hilt

 ```bash
 # Hilt modules
@ -102,10 +102,43 @@ grep -rn '@Component\|@Subcomponent' sources/
 grep -rn '@Inject' sources/
 ```

-To trace a call flow through DI:
-1. Find where an interface is used (e.g., `ApiService` injected into a repository)
-2. Find the `@Provides` or `@Binds` method that creates the implementation
-3. Follow the implementation to the actual HTTP call
+### Koin
+
+Koin is the dominant DI framework in Kotlin Multiplatform and a large
+share of Kotlin-only Android apps. It uses a runtime DSL rather than
+compile-time generated factories, so the search patterns are different:
+
+```bash
+# Confirm Koin is actually wired up
+grep -rn 'org\.koin\.' sources/
+
+# DI module declarations
+grep -rn 'fun [A-Za-z]\+Module\|module\s*{\|module(' sources/
+
+# Bindings inside a module DSL
+grep -rn 'single\s*[<{(]\|factory\s*[<{(]\|viewModel\s*[<{(]\|scoped\s*[<{(]\|singleOf\|factoryOf' sources/
+
+# Resolution call-sites (where a binding is consumed)
+grep -rn '\bget\s*<\|\binject\s*<\|by\s\+inject\b\|by\s\+viewModel\b\|getKoin' sources/
+```
+
+After R8, every binding lambda becomes an anonymous
+`Function2<Scope, ParametersHolder, T>` impl. To find the binding for an
+interface `Foo`, look for files that contain both a Koin import / module
+DSL marker and a reference to `Foo`:
+
+```bash
+grep -rln 'org\.koin\.core\.module' sources/ | xargs grep -l 'Foo'
+```
+
+### Trace through DI
+
+1. Find where an interface is used (e.g. `ApiService` injected into a
+   repository).
+2. Find the `@Provides` / `@Binds` method (Hilt) **or** the
+   `single { ... }` / `factory { ... }` block (Koin) that creates the
+   implementation.
+3. Follow the implementation to the actual HTTP call.

 ## 6. Find Constants and Configuration

@ -145,8 +178,9 @@ When code is obfuscated (ProGuard/R8):
 1. **Start from strings**: Search for URLs, error messages, and known constants
 2. **Start from framework classes**: Activities and Fragments are named in the manifest
 3. **Follow library calls**: Retrofit `@GET`/`@POST` annotations are readable even when the interface class name is obfuscated
-4. **Use `--deobf`**: jadx can generate readable replacement names
+4. **Recover original Kotlin names from metadata**: `@DebugMetadata` and `@Metadata.d2` strings preserve the original FQNs even after R8 obfuscation. Run `scripts/recover-kotlin-names.sh` to build an `obf -> real` map (typically recovers 30-50% of classes — and almost 100% of `*Repository` / `*ViewModel` / `*Impl`). See [`kotlin-name-recovery.md`](./kotlin-name-recovery.md). This is the single highest-leverage step on any Kotlin app.
 5. **Cross-reference**: If `class a` calls `Retrofit.create(b.class)`, then `b` is a Retrofit service interface
+6. **`--deobf` is rarely enough on its own**: jadx's `--deobf` renames obfuscated symbols with synthetic placeholders (`p001a`, `C0123Foo`) — useful for disambiguation but it does **not** recover original names. Pair it with the metadata recovery above.

 ## 8. Tracing a Complete Call Flow: Example

--- a/plugins/android-reverse-engineering/skills/android-reverse-engineering/references/kotlin-name-recovery.md
+++ b/plugins/android-reverse-engineering/skills/android-reverse-engineering/references/kotlin-name-recovery.md
@ -0,0 +1,108 @@
+# Recovering Original Class Names from Kotlin Metadata
+
+When R8/ProGuard obfuscates a Kotlin app, JVM symbols are renamed but the
+**Kotlin metadata strings cannot be stripped**: the Kotlin runtime depends
+on them for reflection, coroutines, and `data class` features.
+
+Two annotations leak the original fully-qualified names:
+
+## `@DebugMetadata`
+
+Generated for nearly every Kotlin coroutine `SuspendLambda` (i.e. almost
+every `suspend` function in a modern app):
+
+```java
+@DebugMetadata(
+    c  = "com.example.feature.account.AccountRepositoryImpl$fetch$1",
+    f  = "AccountRepositoryImpl.kt",
+    l  = {42, 51},
+    m  = "invokeSuspend"
+)
+public final class a extends SuspendLambda implements Function2<...> { ... }
+```
+
+The `c =` field carries the original outer class FQN (with a `$` suffix
+for inner / lambda scopes — strip everything after the first `$` to get the
+declaring class).
+
+## `@Metadata.d2`
+
+Every Kotlin class carries a top-level `@Metadata` annotation. The `d2`
+array lists internal class refs in JVM type-descriptor format
+(`Lcom/example/Foo;`):
+
+```java
+@Metadata(d1 = {"..."},
+          d2 = {"...","Lcom/example/feature/account/AccountRepositoryImpl;","..."})
+public final class b implements ... { ... }
+```
+
+The first non-stdlib descriptor in `d2` is usually the file's primary
+class.
+
+## How to mine them
+
+The skill ships two scripts:
+
+```bash
+# Build a mapping from a decompiled sources directory:
+bash scripts/recover-kotlin-names.sh <output>/sources [mapping-dir]
+
+# Outputs:
+#   <mapping-dir>/mapping.tsv        obf_fqn  real_fqn  file
+#   <mapping-dir>/mapping.json       same data, JSON
+#   <mapping-dir>/by_package/        per-real-package index files
+
+# Query the mapping:
+bash scripts/lookup-name.sh <mapping-dir> Repository                 # search
+bash scripts/lookup-name.sh <mapping-dir> -o ab.cd                   # obf -> real
+bash scripts/lookup-name.sh <mapping-dir> -p com.example.feature     # list package
+bash scripts/lookup-name.sh <mapping-dir> --grep '"api/' <output>/sources
+   # ^ greps decompiled code and appends '// real.fqn' to each hit
+```
+
+## What you typically recover
+
+On a real-world obfuscated Kotlin app the script recovers **30 – 50 % of
+classes** — but more importantly, **almost 100 % of the classes you
+actually want to read**:
+
+| Class kind                | Recovery rate |
+|---------------------------|---------------|
+| `*Repository` / `*Impl`   | ~100 %        |
+| `*ViewModel`              | ~100 %        |
+| `*UseCase` / `*Interactor`| ~100 %        |
+| Plain `data class` DTOs   | ~80 %         |
+| Pure-Java helper classes  | low (no Kotlin metadata) |
+| Anonymous inner classes   | sometimes recovered as the parent FQN |
+
+## Why `jadx --deobf` is not enough
+
+`--deobf` renames obfuscated identifiers using internal heuristics, but the
+output is still synthetic (`p001a`, `C0123Foo`). It does **not** recover
+the *original* names. Kotlin metadata recovery is the only reliable way to
+map back to the names the developer actually wrote, and it costs essentially
+nothing — just a regex pass over the decompiled sources.
+
+Run both: `--deobf` for fields/methods that have no metadata source, plus
+the recovery script for class names.
+
+## Limitations
+
+- **Method names and field names** are not recovered. Kotlin metadata only
+  preserves class-level FQNs and a few signatures. For method names you
+  still need jadx-gui's interactive rename or pattern inference.
+- **Pure-Java classes** carry no `@Metadata`, so they remain obfuscated.
+- **Heavily inlined classes** (`@JvmInline value class`, top-level function
+  files compiled into shared `*Kt.class` synthetic classes) sometimes show
+  up under the wrong filename; treat results as a strong hint, not gospel.
+
+## Reading flow with the mapping
+
+1. Run `recover-kotlin-names.sh` once after decompiling.
+2. Use `lookup-name.sh --grep '<pattern>' <sources>` instead of plain `grep`
+   so every hit comes annotated with the real owning class.
+3. When you hit an obfuscated FQN in code (e.g. `nq.e`), resolve it with
+   `lookup-name.sh <mapping-dir> -o nq.e` — you will often see siblings
+   (`nq.d`, `nq.f`, ...) that are the same class's split lambdas/inner
+   classes, which is useful context.
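
To make the `@DebugMetadata` technique from `kotlin-name-recovery.md` concrete, here is a minimal sketch of the regex pass it describes. It is not the shipped `recover-kotlin-names.sh`: the function name `debugmetadata_classes` is hypothetical, it only lists the recovered declaring-class names (no `obf -> real` mapping), and it assumes the annotation's `c = "..."` field appears on the `@DebugMetadata(` line or on a following line before the closing parenthesis.

```shell
# Hypothetical helper: list declaring classes leaked via @DebugMetadata(c = "...").
# $1 is a decompiled sources directory.
debugmetadata_classes() {
  find "$1" -type f -name '*.java' -exec awk '
    FNR == 1 { inside = 0 }                 # reset state for every new file
    /@DebugMetadata\(/ { inside = 1 }       # annotation starts on this line
    inside && match($0, /(^|[^A-Za-z0-9_])c[ \t]*=[ \t]*"[^"]+"/) {
      fqn = substr($0, RSTART, RLENGTH)
      sub(/^[^"]*"/, "", fqn)               # drop everything up to the opening quote
      sub(/"$/, "", fqn)                    # drop the closing quote
      sub(/\$.*$/, "", fqn)                 # strip everything after the first "$"
      print fqn
      inside = 0
      next
    }
    inside && /\)/ { inside = 0 }           # annotation closed without a c field
  ' {} + | LC_ALL=C sort -u
}
```

Usage: `debugmetadata_classes output/sources/` prints one original class FQN per line, deduplicated.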
--- a/plugins/android-reverse-engineering/skills/android-reverse-engineering/references/third_party_hosts.txt
+++ b/plugins/android-reverse-engineering/skills/android-reverse-engineering/references/third_party_hosts.txt
@ -0,0 +1,122 @@
+# Third-party host denylist used by find-api-calls.sh --urls.
+#
+# Patterns are extended-regex hostname suffixes / fragments. A host is
+# considered "third-party noise" if any pattern below matches anywhere
+# in the hostname. Lines starting with '#' and blank lines are ignored.
+#
+# This list is intentionally conservative: when a pattern would hide a
+# legitimate first-party host (e.g. an app may run its own *.s3.amazonaws.com
+# bucket), keep the pattern but expect manual review of the bucketed output.
+
+# Google / Firebase / Play / Crashlytics
+\.googleapis\.com$
+\.google\.com$
+\.gstatic\.com$
+\.googleusercontent\.com$
+\.googletagmanager\.com$
+\.googlesyndication\.com$
+\.firebaseio\.com$
+\.firebaseapp\.com$
+\.firebaseinstallations\.googleapis\.com$
+\.firebaseremoteconfig\.googleapis\.com$
+\.crashlytics\.com$
+\.app-measurement\.com$
+
+# Apple / Microsoft / Adobe
+\.apple\.com$
+\.icloud\.com$
+\.microsoft\.com$
+\.live\.com$
+\.office\.com$
+\.adobe\.com$
+ns\.adobe\.com
+
+# Meta
+\.facebook\.com$
+\.fbcdn\.net$
+\.instagram\.com$
+\.whatsapp\.com$
+
+# Other social / messaging / video
+\.twitter\.com$
+\.x\.com$
+\.tiktok\.com$
+\.youtube\.com$
+\.youtu\.be$
+\.linkedin\.com$
+\.snapchat\.com$
+\.pinterest\.com$
+\.reddit\.com$
+
+# Mobile attribution / analytics / observability
+\.appsflyersdk\.com$
+\.appsflyer\.com$
+\.adjust\.com$
+\.branch\.io$
+\.amplitude\.com$
+\.segment\.com$
+\.mixpanel\.com$
+\.hotjar\.com$
+\.clarity\.ms$
+\.datadoghq\.(com|eu|us)$
+\.sentry\.io$
+\.bugsnag\.com$
+\.newrelic\.com$
+\.instabug\.com$
+\.embrace\.io$
+\.rollout\.io$
+\.launchdarkly\.com$
+
+# Push / notifications
+\.onesignal\.com$
+\.urbanairship\.com$
+\.airship\.com$
+
+# Support / chat
+\.zendesk\.com$
+\.intercom\.io$
+\.intercomcdn\.com$
+\.helpshift\.com$
+\.salesforce\.com$
+\.freshchat\.com$
+\.kustomerapp\.com$
+
+# Payments
+\.stripe\.com$
+\.braintreepayments\.com$
+\.braintreegateway\.com$
+\.payu\.com$
+\.payu\.in$
+\.paypal\.com$
+\.adyen\.com$
+\.checkout\.com$
+\.klarna\.com$
+
+# Maps / location
+\.mapbox\.com$
+\.openstreetmap\.org$
+
+# Storage / CDN (often third-party even when the bucket name is app-specific)
+\.s3\.amazonaws\.com$
+\.cloudfront\.net$
+\.akamaihd\.net$
+\.akamaized\.net$
+\.fastly\.net$
+\.cloudflare\.com$
+\.azureedge\.net$
+
+# DNS / well-known infra
+\.localhost$
+^localhost
+^127\.
+
+# Standards / RFCs / placeholders that show up as XML/XMP namespaces
+\.w3\.org$
+\.w3c\.org$
+(^|\.)example\.(com|org|net)$
+
+# Certificate authorities
+\.sectigo\.com$
+\.entrust\.com$
+\.digicert\.com$
+\.letsencrypt\.org$
--- a/plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/find-api-calls.sh
+++ b/plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/find-api-calls.sh
@ -14,8 +14,12 @@ Arguments:
 Options:
  --retrofit      Search only for Retrofit annotations
  --okhttp        Search only for OkHttp patterns
+  --ktor          Search only for Ktor client patterns
+  --apollo        Search only for Apollo (GraphQL) patterns
  --volley        Search only for Volley patterns
  --urls          Search only for hardcoded URLs
+  --paths         Extract unique endpoint-shaped path string literals
+                  (works on heavily obfuscated apps where call sites are inlined)
  --auth          Search only for auth-related patterns
  --all           Search all patterns (default)
  -h, --help      Show this help message
@ -29,8 +33,11 @@ EOF
 SOURCE_DIR=""
 SEARCH_RETROFIT=false
 SEARCH_OKHTTP=false
+SEARCH_KTOR=false
+SEARCH_APOLLO=false
 SEARCH_VOLLEY=false
 SEARCH_URLS=false
+SEARCH_PATHS=false
 SEARCH_AUTH=false
 SEARCH_ALL=true

@ -38,8 +45,11 @@ while [[ $# -gt 0 ]]; do
  case "$1" in
    --retrofit) SEARCH_RETROFIT=true; SEARCH_ALL=false; shift ;;
    --okhttp)   SEARCH_OKHTTP=true;   SEARCH_ALL=false; shift ;;
+    --ktor)     SEARCH_KTOR=true;     SEARCH_ALL=false; shift ;;
+    --apollo)   SEARCH_APOLLO=true;   SEARCH_ALL=false; shift ;;
    --volley)   SEARCH_VOLLEY=true;    SEARCH_ALL=false; shift ;;
    --urls)     SEARCH_URLS=true;      SEARCH_ALL=false; shift ;;
+    --paths)    SEARCH_PATHS=true;     SEARCH_ALL=false; shift ;;
    --auth)     SEARCH_AUTH=true;      SEARCH_ALL=false; shift ;;
    --all)      SEARCH_ALL=true; shift ;;
    -h|--help)  usage ;;
@ -72,6 +82,58 @@ run_grep() {
  grep $GREP_OPTS -E "$pattern" "$SOURCE_DIR" 2>/dev/null || true
 }

+# Print a one-screen summary FIRST so a reader knows what to expect from
+# the long output that follows. Skipped when a single section flag was
+# requested (the user wants raw matches, not an overview). One pass over
+# the tree, counts bucketed by tag — running 8 separate greps was too slow.
+if [[ "$SEARCH_ALL" == true ]]; then
+  section "Summary (counted in a single pass)"
+  declare -A H=(
+    [retrofit]=0 [okhttp]=0 [ktor]=0 [apollo]=0 [volley]=0
+    [hilt]=0 [koin]=0 [bearer]=0 [hmac]=0
+  )
+  while IFS= read -r line; do
+    case "$line" in
+      *"@GET("*|*"@POST("*|*"@PUT("*|*"@DELETE("*|*"@PATCH("*|*"@HTTP("*) H[retrofit]=$((H[retrofit]+1));;
+    esac
+    case "$line" in
+      *"Request.Builder"*|*"HttpUrl"*|*".newCall("*) H[okhttp]=$((H[okhttp]+1));;
+    esac
+    case "$line" in
+      *"BearerTokens"*|*"defaultRequest {"*|*"client.get("*|*"client.post("*|*"httpClient.get("*|*"httpClient.post("*|*"HttpClient.get("*) H[ktor]=$((H[ktor]+1));;
+    esac
+    case "$line" in
+      *"ApolloClient"*|*".serverUrl("*) H[apollo]=$((H[apollo]+1));;
+    esac
+    case "$line" in
+      *"StringRequest"*|*"JsonObjectRequest"*|*"RequestQueue"*) H[volley]=$((H[volley]+1));;
+    esac
+    case "$line" in
+      *"@HiltAndroidApp"*|*"@AndroidEntryPoint"*|*"@HiltViewModel"*|*"@Provides"*|*"@Binds"*) H[hilt]=$((H[hilt]+1));;
+    esac
+    case "$line" in
+      *"org.koin."*|*"module {"*|*"single<"*|*"factory<"*|*"singleOf("*|*"factoryOf("*) H[koin]=$((H[koin]+1));;
+    esac
+    case "$line" in
+      *'"Bearer '*|*'"bearer '*|*"BearerTokens"*) H[bearer]=$((H[bearer]+1));;
+    esac
+    case "$line" in
+      *"HmacSHA"*|*'Mac.getInstance("Hmac'*) H[hmac]=$((H[hmac]+1));;
+    esac
+  done < <(grep -rEh --include='*.java' --include='*.kt' \
+      '@(GET|POST|PUT|DELETE|PATCH|HTTP)\(|Request\.Builder|HttpUrl|\.newCall\(|BearerTokens|defaultRequest \{|[Cc]lient\.(get|post)\(|ApolloClient|\.serverUrl\(|StringRequest|JsonObjectRequest|RequestQueue|@HiltAndroidApp|@AndroidEntryPoint|@HiltViewModel|@Provides|@Binds|org\.koin\.|module \{|single<|factory<|singleOf\(|factoryOf\(|"[Bb]earer |HmacSHA|Mac\.getInstance' \
+      "$SOURCE_DIR" 2>/dev/null || true)
+  printf '  HTTP framework:   Retrofit=%-5s OkHttp=%-5s Ktor=%-5s Apollo=%-5s Volley=%-5s\n' \
+      "${H[retrofit]}" "${H[okhttp]}" "${H[ktor]}" "${H[apollo]}" "${H[volley]}"
+  printf '  DI framework:     Hilt/Dagger=%-5s Koin=%-5s\n' \
+      "${H[hilt]}" "${H[koin]}"
+  printf '  Auth signals:     Bearer=%-5s HMAC/Sign=%-5s\n' \
+      "${H[bearer]}" "${H[hmac]}"
+  echo
+  echo "  Run with one of --retrofit / --okhttp / --ktor / --apollo / --volley /"
+  echo "  --paths / --urls / --auth to inspect a single section."
+fi
+
 # --- Retrofit ---
 if [[ "$SEARCH_ALL" == true || "$SEARCH_RETROFIT" == true ]]; then
  section "Retrofit Annotations"
@ -90,16 +152,157 @@ if [[ "$SEARCH_ALL" == true || "$SEARCH_OKHTTP" == true ]]; then
  run_grep '(\.url\s*\(|\.addQueryParameter|\.addPathSegment|\.scheme\s*\(|\.host\s*\()'
 fi

+# --- Ktor (Kotlin) ---
+# Ktor doesn't use annotations. Endpoints appear as string args to
+# client.get/post/etc., or are built via HttpRequestBuilder.url(...). Auth
+# is configured via the bearer { loadTokens / refreshTokens } DSL.
+if [[ "$SEARCH_ALL" == true || "$SEARCH_KTOR" == true ]]; then
+  section "Ktor — Client Calls"
+  run_grep '\b(client|httpClient|HttpClient)\.(get|post|put|delete|patch|head|request)\s*[<(]'
+  section "Ktor — Request Building / Default Request"
+  run_grep '(HttpRequestBuilder|defaultRequest\s*\{|\burl\s*\(\s*"|URLBuilder|URLProtocol)'
+  section "Ktor — Auth Plugin (Bearer / Refresh)"
+  run_grep '(\bbearer\s*\{|BearerTokens\s*\(|loadTokens\s*\{|refreshTokens\s*\{|\bAuth\s*\)\s*\{)'
+fi
+
+# --- Apollo (GraphQL) ---
+if [[ "$SEARCH_ALL" == true || "$SEARCH_APOLLO" == true ]]; then
+  section "Apollo — GraphQL Client"
+  run_grep '(ApolloClient|\.serverUrl\s*\(|\.subscriptionNetworkTransport|HttpNetworkTransport)'
+  section "Apollo — Operations"
+  run_grep '(\.query\s*\(\s*[A-Z]|\.mutation\s*\(\s*[A-Z]|\.subscription\s*\(\s*[A-Z])'
+fi
+
 # --- Volley ---
 if [[ "$SEARCH_ALL" == true || "$SEARCH_VOLLEY" == true ]]; then
  section "Volley Requests"
  run_grep '(StringRequest|JsonObjectRequest|JsonArrayRequest|ImageRequest|RequestQueue|Volley\.newRequestQueue)'
 fi

+# --- Endpoint-shaped path literals ---
+# Survives R8 obfuscation: even when call sites are inlined to a.b(c, "path"),
+# the path strings themselves are not obfuscated. This produces a deduplicated
+# inventory of likely API endpoints that other modes miss.
+if [[ "$SEARCH_ALL" == true || "$SEARCH_PATHS" == true ]]; then
+  section "Endpoint-Shaped Path Literals (deduplicated)"
+  # Quoted strings that begin with /<segment> or <segment>/ where the leading
+  # segment is a typical API root word. Cap segment count and length to keep
+  # the regex grounded.
+  # An endpoint-shaped string is one of:
+  #   "/seg/seg..."                   — absolute path with >= 2 segments
+  #   "api-root/seg/seg..."           — relative path starting with a known
+  #                                     API root keyword and containing >= 1
+  #                                     '/' followed by another segment
+  # Segments are URL-safe chars plus {} for path-template placeholders.
+  SEG='[A-Za-z0-9_{}.\-]+'
+  ROOT='(api|v[0-9]+|graphql|rest|mobile|auth|oauth|sso|users?|account|session|token|register|signup|signin|logout|password|verify|otp|sms|profile|customer|cart|basket|order|checkout|payment|invoice|product|catalog|inventory|search|category|favo[u]?rites?|wishlist|address|location|delivery|shipping|review|feedback|notification|push|message|chat|track|event|stat[a-z]*|metric|config|settings?|feature|flag|banner|content|media|upload|download|file|image|video|live|stream|webhook|callback)'
+  PATHS_REGEX="\"(/${SEG}(/${SEG})+/?|${ROOT}(/${SEG})+/?)\""
+  # Filter out frequent false positives (MIME types, /proc, /sys, /dev).
+  EXCLUDE='^"(image|video|audio|text|application|content|font|model|multipart|message)/|^"/(proc|sys|dev|tmp|etc|usr|var|opt)/'
+  # Print a flat unique list rather than file:line — this is the inventory.
+  grep -rhoE --include='*.java' --include='*.kt' "$PATHS_REGEX" "$SOURCE_DIR" 2>/dev/null \
+      | grep -Ev "$EXCLUDE" \
+      | sort -u || true
+  echo
+  section "Endpoint-Shaped Path Literals — call sites"
+  grep $GREP_OPTS -E "$PATHS_REGEX" "$SOURCE_DIR" 2>/dev/null \
+      | grep -Ev ":[0-9]+:.*(${EXCLUDE//^/})" || true
+fi
+
 # --- Hardcoded URLs ---
+# A loose grep for http(s)://... drowns in compression-dictionary garbage and
+# in third-party SDK URLs (Google, Firebase, AppsFlyer, Datadog, ...). The
+# strict regex requires a syntactically valid hostname and rejects strings
+# containing whitespace, angle brackets, or non-printable bytes. Hosts are
+# then bucketed into "first-party candidates" vs "third-party (denylist)".
 if [[ "$SEARCH_ALL" == true || "$SEARCH_URLS" == true ]]; then
-  section "Hardcoded URLs (http:// and https://)"
-  run_grep '"https?://[^"]+'
+  HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+  DENYLIST="$HERE/../references/third_party_hosts.txt"
+  # Accept three host shapes, all rejecting whitespace / angle brackets /
+  # non-printables in the path:
+  #   * IPv4 literal (dev/staging endpoints, high signal)            192.168.0.1
+  #   * dotted host: >=2 labels ending in a 2+ letter TLD (incl apex) example.com
+  #   * bare single-label host, BUT only when followed by ':port' or  localhost:3000
+  #     '/path' — keeps internal hosts (localhost, internal-backend)  svc/health
+  #     while still dropping Kotlin-stdlib dictionary fragments like
+  #     "http://An Introduction..." (bare word, no port/path follows).
+  STRICT_URL='https?://(([0-9]{1,3}(\.[0-9]{1,3}){3}|[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,})(:[0-9]{1,5})?(/[^"<>[:space:]]*)?|[A-Za-z0-9-]+(:[0-9]{1,5}(/[^"<>[:space:]]*)?|/[^"<>[:space:]]*))'
+
+  TMP="$(mktemp)"
+  trap 'rm -f "$TMP" "${HOSTS_TMP:-}"' EXIT
+  # Extraction (STRICT_URL) is deliberately permissive; this awk pass drops the
+  # residual Kotlin-stdlib dictionary noise WITHOUT losing the high-signal
+  # shapes a strict-only regex discards (IPs, apex domains, internal hosts).
+  # Decision table, top-down, on the host (authority before any :port / /path):
+  #   * IPv4 literal                    -> keep  (dict fragments are words,
+  #                                              never dotted-quads)
+  #   * >=3 labels (sub.domain.tld)     -> keep  (any TLD; same tolerance the
+  #                                              original strict regex had)
+  #   * any host WITH a :port or /path  -> keep  (structured = high signal:
+  #                                              localhost:3000, svc/health)
+  #   * bare 2-label apex, no port/path -> keep ONLY if the TLD is a real one,
+  #                                              compared as a whole field (kills
+  #                                              "www.this" / "this.introduction",
+  #                                              keeps "mytrackera-api.com")
+  # Trade-off: a first-party host referenced bare with an uncommon TLD (e.g.
+  # https://foo.store with no path) is dropped — give it a path/port, or add the
+  # TLD to the list below, if you hit that case.
+  { grep -rhoE --include='*.java' --include='*.kt' "$STRICT_URL" "$SOURCE_DIR" 2>/dev/null || true; } \
+      | sort -u \
+      | awk '
+          { rest=$0; sub(/^https?:\/\//,"",rest)
+            host=rest; sub(/[\/:].*/,"",host)
+            haspathport = (rest ~ /[\/:]/)
+            if (host ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/) { print; next }   # IPv4
+            n = split(host, a, ".")
+            if (n >= 3)      { print; next }                                 # sub.domain.tld
+            if (haspathport) { print; next }                                 # has :port or /path
+            if (n == 2 && a[2] ~ /^(com|net|org|io|co|app|dev|me|ai|xyz|info|biz|gov|edu|mil|int|tech|cloud|uk|de|fr|it|es|nl|in|us|ca|au|jp|cn|br|ru|eu|ch|se|no|fi|dk|pl|pt|gr|ie|be|at|cz|sg|hk|kr|tw|mx|ar|cl|za|nz)$/) print  # real apex TLD
+          }' > "$TMP"
+
+  # Extract host: strip scheme, take part up to first ':' or '/'.
+  HOSTS_TMP="$(mktemp)"
+  sed -E 's#^https?://##; s#[/:].*$##' "$TMP" | sort -u > "$HOSTS_TMP"
+
+  if [[ -f "$DENYLIST" ]] && grep -qvE '^[[:space:]]*(#|$)' "$DENYLIST"; then
+    # Join the patterns with '|'. The guard above skips an empty list, whose regex would match every host.
+    DENY_REGEX="$(grep -vE '^[[:space:]]*(#|$)' "$DENYLIST" | paste -sd '|' -)"
+    THIRD_HOSTS=$(grep -E "$DENY_REGEX" "$HOSTS_TMP" || true)
+    FIRST_HOSTS=$(grep -vE "$DENY_REGEX" "$HOSTS_TMP" || true)
+  else
+    THIRD_HOSTS=""
+    FIRST_HOSTS=$(cat "$HOSTS_TMP")
+  fi
+
+  section "Likely First-Party Hosts (frequency-sorted)"
+  if [[ -n "$FIRST_HOSTS" ]]; then
+    while IFS= read -r h; do
+      [[ -z "$h" ]] && continue
+      n=$(grep -cE "://${h//./\\.}([/:\"]|$)" "$TMP" || true)
+      printf '  %5d  %s\n' "$n" "$h"
+    done <<< "$FIRST_HOSTS" | sort -rn -k1
+  else
+    echo "  (none — every URL matched the third-party denylist)"
+  fi
+
+  section "Third-Party Hosts (denylist matches, collapsed)"
+  if [[ -n "$THIRD_HOSTS" ]]; then
+    echo "$THIRD_HOSTS" | sed 's/^/  /'
+  else
+    echo "  (none)"
+  fi
+
+  section "All First-Party URLs (full strings)"
+  if [[ -n "$FIRST_HOSTS" ]]; then
+    while IFS= read -r h; do
+      [[ -z "$h" ]] && continue
+      grep -E "://${h//./\\.}([/:\"]|$)" "$TMP" | sed 's/^/  /'
+    done <<< "$FIRST_HOSTS"
+  fi
+
+  rm -f "$HOSTS_TMP" "$TMP"
+  trap - EXIT
+
  section "HttpURLConnection"
  run_grep '(openConnection|setRequestMethod|HttpURLConnection|HttpsURLConnection)'
  section "WebView URLs"
@ -109,9 +312,27 @@ fi
 # --- Auth patterns ---
 if [[ "$SEARCH_ALL" == true || "$SEARCH_AUTH" == true ]]; then
  section "Authentication & API Keys"
-  run_grep -i '(api[_-]?key|auth[_-]?token|bearer|authorization|x-api-key|client[_-]?secret|access[_-]?token)'
+  run_grep -i '(api[_-]?key|auth[_-]?token|bearer|authorization|x-api-key|client[_-]?secret|access[_-]?token|refresh[_-]?token)'
+
+  # Request-signing schemes: a hardcoded HMAC / RSA secret in an APK is a
+  # security finding worth surfacing prominently. These patterns catch the
+  # common shapes of homegrown / SDK-issued request signers.
+  section "Request Signing (HMAC / signature schemes)"
+  run_grep '(HmacSHA(1|256|512)|Mac\.getInstance\("Hmac|SecretKeySpec\(|Signature\.getInstance\()'
+  run_grep -i '(x-signature|x-client-authorization|x-amz-signature|x-hmac|aws4-hmac|signRequest|signatureFor|computeSignature|signaturev[0-9])'
+
+  # Hardcoded high-entropy strings adjacent to "secret"/"key" assignments
+  # are the canonical leaked-credential pattern.
+  section "Possible Hardcoded Secrets / Keys"
+  run_grep -i '(app[_-]?secret|client[_-]?secret|signing[_-]?key|hmac[_-]?secret|consumer[_-]?secret|private[_-]?key)'
+
  section "Base URLs and Constants"
  run_grep -i '(BASE_URL|API_URL|SERVER_URL|ENDPOINT|API_BASE|HOST_NAME)'
+
+  # Ktor BearerTokens / refresh DSL — common on Kotlin apps and lives on
+  # Ktor's public API, so it survives R8 unchanged.
+  section "Ktor Auth (Bearer + Refresh)"
+  run_grep '(BearerTokens|loadTokens\s*\{|refreshTokens\s*\{|\bbearer\s*\{)'
 fi

 echo
--- a/plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/fingerprint.sh
+++ b/plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/fingerprint.sh
@ -0,0 +1,241 @@
+#!/usr/bin/env bash
+# fingerprint.sh — Triage an APK/XAPK before decompiling.
+#
+# Detects mobile framework (Flutter, React Native, Cordova/Capacitor,
+# Xamarin, KMP/native), HTTP-stack hints, obfuscation level, native libs,
+# and notable third-party SDKs.
+#
+# Decompiling Java is mostly useless for Flutter / RN / Xamarin / Cordova
+# apps — different tools are needed. Run this BEFORE Phase 2 to choose
+# the right path.
+
+set -euo pipefail
+
+usage() {
+  cat <<EOF
+Usage: fingerprint.sh <file.apk|file.xapk>
+
+Prints a one-screen summary:
+  * mobile framework (with rationale)
+  * HTTP / DI / serialization stack hints
+  * obfuscation indicator
+  * native libraries (consolidated across split APKs)
+  * notable third-party SDKs (detected from asset paths and dex package names)
+EOF
+  exit 0
+}
+
+[[ $# -lt 1 || "$1" == "-h" || "$1" == "--help" ]] && usage
+INPUT="$1"
+[[ ! -f "$INPUT" ]] && { echo "File not found: $INPUT" >&2; exit 1; }
+
+TMP="$(mktemp -d -t apkfp.XXXXXX)"
+trap 'rm -rf "$TMP"' EXIT
+
+# Resolve to a list of APKs (handle XAPK = ZIP of APKs)
+APKS=()
+case "${INPUT,,}" in
+  *.xapk|*.apks|*.apkm)
+    unzip -q -o "$INPUT" -d "$TMP/xapk"
+    while IFS= read -r p; do APKS+=("$p"); done < <(find "$TMP/xapk" -maxdepth 2 -type f -name '*.apk')
+    ;;
+  *.apk)
+    APKS=("$INPUT")
+    ;;
+  *)
+    echo "Unsupported input: $INPUT" >&2; exit 1 ;;
+esac
+
+# Aggregate ZIP listings from every APK in the bundle (split-aware view)
+LISTING="$TMP/listing.txt"
+: > "$LISTING"
+for apk in "${APKS[@]}"; do
+  unzip -Z1 -- "$apk" 2>/dev/null >> "$LISTING" || true
+done
+
+# Most class-level libs live inside classes*.dex, not as visible zip paths.
+# Extract the type-name strings out of each dex with `strings` into a second
+# file that `has()` also searches, so it can match e.g. 'io/ktor/' or 'org/koin/'.
+DEX_STRINGS="$TMP/dex_strings.txt"
+: > "$DEX_STRINGS"
+for apk in "${APKS[@]}"; do
+  for dex in $(unzip -Z1 -- "$apk" 2>/dev/null | grep -E '^classes[0-9]*\.dex$' || true); do
+    # DEX type descriptors look like "Lcom/foo/Bar;". Extract the inner
+    # slash-separated FQN so callers can match e.g. 'io/ktor/' directly.
+    unzip -p -- "$apk" "$dex" 2>/dev/null \
+      | strings -n 8 \
+      | grep -oE 'L[a-z][a-zA-Z0-9_]*(/[a-zA-Z0-9_$]+)+;' \
+      | sed -E 's/^L//; s/;$//' \
+      >> "$DEX_STRINGS" || true
+  done
+done
+sort -u "$DEX_STRINGS" -o "$DEX_STRINGS"
+
+has() { grep -qE "$1" "$LISTING" || grep -qE "$1" "$DEX_STRINGS"; }
+
+# ----------------------------------------------------------------------
+# Framework detection (priority order — first match wins)
+# ----------------------------------------------------------------------
+FRAMEWORK="unknown"
+RATIONALE=""
+
+if has '^lib/[^/]+/libflutter\.so$'; then
+  FRAMEWORK="Flutter"
+  RATIONALE="lib/<abi>/libflutter.so present"
+  has '^lib/[^/]+/libapp\.so$' && RATIONALE+="; libapp.so contains AOT-compiled Dart"
+elif has '^lib/[^/]+/libhermes\.so$' || has '^assets/index\.android\.bundle$' || has '^lib/[^/]+/libreactnativejni\.so$'; then
+  FRAMEWORK="React Native"
+  reasons=()
+  has '^lib/[^/]+/libhermes\.so$'             && reasons+=("libhermes.so")
+  has '^lib/[^/]+/libreactnativejni\.so$'     && reasons+=("libreactnativejni.so")
+  has '^assets/index\.android\.bundle$'       && reasons+=("assets/index.android.bundle")
+  RATIONALE="${reasons[*]}"
+elif has '^assets/www/index\.html$' || has '^assets/www/cordova\.js$' || has '^assets/public/index\.html$'; then
+  FRAMEWORK="Cordova / Capacitor (WebView hybrid)"
+  RATIONALE="assets/www/ or assets/public/ shell present"
+elif has '^lib/[^/]+/libmonodroid\.so$' || has '^assemblies/'; then
+  FRAMEWORK="Xamarin / .NET MAUI"
+  RATIONALE="libmonodroid.so or assemblies/ present — code is in .NET DLLs"
+elif has '^lib/[^/]+/libmaui\.so$'; then
+  FRAMEWORK=".NET MAUI"
+  RATIONALE="libmaui.so present"
+elif has '^assets/flutter_assets/' && ! has '^lib/[^/]+/libflutter\.so$'; then
+  FRAMEWORK="Flutter (code-only split?)"
+  RATIONALE="flutter_assets/ but no libflutter.so in this APK — check splits"
+else
+  # Native: distinguish Compose vs classic Android by androidx.compose presence
+  if has 'androidx[./]compose'; then
+    FRAMEWORK="Native Android (Kotlin + Jetpack Compose)"
+    RATIONALE="androidx.compose.* libraries detected"
+  elif has '^META-INF/.*\.kotlin_module$'; then
+    FRAMEWORK="Native Android (Kotlin)"
+    RATIONALE="kotlin_module metadata present, no Compose markers"
+  else
+    FRAMEWORK="Native Android (Java/Kotlin)"
+    RATIONALE="no cross-platform framework markers found"
+  fi
+fi
+
+# ----------------------------------------------------------------------
+# HTTP / DI / serialization stack hints
+# ----------------------------------------------------------------------
+http=()
+has 'retrofit2'                && http+=("Retrofit")
+has 'okhttp3'                  && http+=("OkHttp")
+has 'io/ktor/'                 && http+=("Ktor")
+has 'com/apollographql/'       && http+=("Apollo (GraphQL)")
+has 'com/android/volley'       && http+=("Volley")
+
+di=()
+has 'dagger/hilt/'              && di+=("Hilt")
+has '^META-INF/.*dagger.*'      && di+=("Dagger")
+has 'org/koin/'                 && di+=("Koin")
+has 'javax/inject/'             && [[ ${#di[@]} -eq 0 ]] && di+=("javax.inject")
+
+ser=()
+has 'kotlinx/serialization/'    && ser+=("kotlinx.serialization")
+has 'com/google/gson/'          && ser+=("Gson")
+has 'com/squareup/moshi/'       && ser+=("Moshi")
+has 'com/fasterxml/jackson/'    && ser+=("Jackson")
+
+# ----------------------------------------------------------------------
+# Obfuscation indicator (R8/ProGuard) — count single-letter dex packages
+# ----------------------------------------------------------------------
+# Note: pipefail is on, so guard greps that may legitimately return 0 matches.
+short_dirs=$( { grep -ohE '^[a-z]{1,2}/' "$LISTING" "$DEX_STRINGS" || true; } | sort -u | wc -l | tr -d ' ')
+if [[ "$short_dirs" -gt 30 ]]; then
+  OBFUSCATION="HIGH ($short_dirs single/double-letter root dirs or packages)"
+elif [[ "$short_dirs" -gt 10 ]]; then
+  OBFUSCATION="MODERATE ($short_dirs short root dirs or packages)"
+else
+  OBFUSCATION="LOW (no significant short-name namespace pollution)"
+fi
+
+# ----------------------------------------------------------------------
+# Native libraries (consolidated)
+# ----------------------------------------------------------------------
+NATIVE=$(grep -E '^lib/[^/]+/[^/]+\.so$' "$LISTING" | sort -u || true)
+
+# ----------------------------------------------------------------------
+# Notable third-party SDKs (asset-path and dex-package markers)
+# ----------------------------------------------------------------------
+sdks=()
+has '^assets/com/appsflyer/'        && sdks+=("AppsFlyer")
+has 'datadog\.buildId|com/datadog/' && sdks+=("Datadog")
+has 'io/sentry/'                    && sdks+=("Sentry")
+has 'com/google/firebase/'          && sdks+=("Firebase")
+has 'com/google/android/gms/'       && sdks+=("Google Play Services")
+has 'com/facebook/'                 && sdks+=("Facebook SDK")
+has 'com/payu/'                     && sdks+=("PayU")
+has 'com/stripe/'                   && sdks+=("Stripe")
+has 'com/braintreepayments/'        && sdks+=("Braintree")
+has 'com/storyteller/'              && sdks+=("Storyteller")
+has 'zendesk/'                      && sdks+=("Zendesk")
+has 'io/intercom/|com/intercom/'    && sdks+=("Intercom")
+has 'com/segment/analytics'         && sdks+=("Segment")
+has 'com/amplitude/'                && sdks+=("Amplitude")
+has 'com/mixpanel/'                 && sdks+=("Mixpanel")
+has 'com/onesignal/'                && sdks+=("OneSignal")
+has 'com/microsoft/clarity'         && sdks+=("Microsoft Clarity")
+has 'com/hotjar/'                   && sdks+=("Hotjar")
+has 'com/instabug/'                 && sdks+=("Instabug")
+
+# BuildConfig is rarely obfuscated and often holds base URLs / flavor; an APK has no .class entries, so match the dex type name.
+if has '/BuildConfig$'; then
+  BUILDCONFIG="present (grep BuildConfig.java after decompile for base URLs / flavor)"
+else
+  BUILDCONFIG="not detected in dex strings (still worth grepping after decompile)"
+fi
+
+# ----------------------------------------------------------------------
+# Summary
+# ----------------------------------------------------------------------
+echo "=== APK Fingerprint: $(basename "$INPUT") ==="
+echo
+echo "Framework:        $FRAMEWORK"
+echo "  Rationale:      $RATIONALE"
+echo "Obfuscation:      $OBFUSCATION"
+echo
+echo "HTTP stack:       ${http[*]:-none detected}"
+echo "DI:               ${di[*]:-none detected}"
+echo "Serialization:    ${ser[*]:-none detected}"
+echo "BuildConfig:      $BUILDCONFIG"
+echo
+echo "Third-party SDKs: ${sdks[*]:-none detected}"
+echo
+echo "Native libraries (consolidated across splits):"
+if [[ -n "$NATIVE" ]]; then
+  echo "$NATIVE" | sed 's/^/  /'
+else
+  echo "  (none)"
+fi
+echo
+
+# ----------------------------------------------------------------------
+# Recommendation
+# ----------------------------------------------------------------------
+echo "Recommended next step:"
+case "$FRAMEWORK" in
+  Flutter*)
+    echo "  Java decompilation will yield ~no app code. The Dart logic lives in"
+    echo "  libapp.so (AOT). Use tools designed for Flutter:"
+    echo "    - reFlutter / Doldrums / blutter (extract Dart class structure)"
+    echo "    - strings/rabin2 on libapp.so for endpoints & string constants"
+    ;;
+  React*)
+    echo "  Java code is just the RN host. Real app logic is in JS/Hermes:"
+    echo "    - if Hermes: hbctool disasm assets/index.android.bundle <out-dir>"
+    echo "    - if JSC:    js-beautify the bundle and grep for 'fetch('/'axios'"
+    ;;
+  Cordova*)
+    echo "  All app code is in assets/www/ (or assets/public/). Just unzip and"
+    echo "  inspect the HTML/JS — no Java decompile needed."
+    ;;
+  Xamarin*|.NET*)
+    echo "  App logic is in .NET DLLs (assemblies/). Use ILSpy or dotPeek;"
+    echo "  jadx will only show the Mono host."
+    ;;
+  *)
+    echo "  Proceed with Phase 2: bash scripts/decompile.sh <file>"
+    ;;
+esac
--- a/plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/lookup-name.sh
+++ b/plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/lookup-name.sh
@ -0,0 +1,85 @@
+#!/usr/bin/env bash
+# lookup-name.sh — Query the mapping produced by recover-kotlin-names.sh.
+#
+# Modes:
+#   lookup-name.sh <mapping-dir> <substring>      search by real-FQN substring
+#   lookup-name.sh <mapping-dir> -o <obf>         resolve obf -> real
+#   lookup-name.sh <mapping-dir> -p <pkg>         list a real package
+#   lookup-name.sh <mapping-dir> --grep <regex> <sources-dir>
+#       grep decompiled sources and annotate each hit with the real class name
+
+set -euo pipefail
+
+usage() {
+  cat <<EOF
+Usage: lookup-name.sh <mapping-dir> <query>
+       lookup-name.sh <mapping-dir> -o <obf-fqn>
+       lookup-name.sh <mapping-dir> -p <real-package-substring>
+       lookup-name.sh <mapping-dir> --grep <regex> <sources-dir>
+
+<mapping-dir> is the directory produced by recover-kotlin-names.sh
+(must contain mapping.json).
+EOF
+  exit 0
+}
+
+[[ $# -lt 2 ]] && usage
+DIR="$1"; shift
+[[ ! -f "$DIR/mapping.json" ]] && { echo "no mapping.json in $DIR" >&2; exit 1; }
+
+python3 - "$DIR" "$@" <<'PY'
+import json, os, re, sys, subprocess
+DIR = sys.argv[1]
+args = sys.argv[2:]
+MAP = json.load(open(os.path.join(DIR, "mapping.json")))
+REV = {}
+for o, r in MAP.items():
+    REV.setdefault(r, []).append(o)
+
+def search(q):
+    ql = q.lower()
+    for r in sorted(REV):
+        if ql in r.lower():
+            print(r)
+            for o in sorted(REV[r]):
+                print(f"    {o}")
+
+def by_obf(o):
+    if o not in MAP:
+        print(f"no mapping for {o}", file=sys.stderr); sys.exit(1)
+    print(f"{o}  ->  {MAP[o]}")
+    sibs = [s for s in REV[MAP[o]] if s != o]
+    for s in sorted(sibs):
+        print(f"    sibling: {s}")
+
+def by_pkg(p):
+    pl = p.lower()
+    for r in sorted(REV):
+        if pl in r.rsplit(".", 1)[0].lower():
+            print(r)
+            for o in sorted(REV[r]):
+                print(f"    {o}")
+
+def grep_annot(pattern, sources):
+    res = subprocess.run(
+        ["grep", "-rEn", "--include=*.java", "-e", pattern, "--", sources],
+        capture_output=True, text=True)
+    for line in res.stdout.splitlines():
+        try:
+            path, lineno, content = line.split(":", 2)
+        except ValueError:
+            continue
+        rel = os.path.relpath(path, sources)
+        obf = rel.replace(os.sep, ".")[:-5]
+        suffix = f"  // {MAP[obf]}" if obf in MAP else ""
+        print(f"{rel}:{lineno}:{content}{suffix}")
+
+if args[0] == "-o" and len(args) == 2:
+    by_obf(args[1])
+elif args[0] == "-p" and len(args) == 2:
+    by_pkg(args[1])
+elif args[0] == "--grep" and len(args) == 3:
+    grep_annot(args[1], args[2])
+else:
+    search(" ".join(args))
+PY
--- a/plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/recover-kotlin-names.sh
+++ b/plugins/android-reverse-engineering/skills/android-reverse-engineering/scripts/recover-kotlin-names.sh
@ -0,0 +1,140 @@
+#!/usr/bin/env bash
+# recover-kotlin-names.sh — Rebuild a (obfuscated -> real) class-name map
+# from Kotlin metadata strings left in decompiled sources.
+#
+# R8 obfuscates JVM symbols but usually leaves the Kotlin metadata strings in
+# place: coroutine debugging and kotlin-reflect read them at runtime. Two
+# annotations carry the original FQN:
+#
+#   * @DebugMetadata(c = "<full.qualified.Name>", f = "<File.kt>", ...)
+#     emitted for almost every `suspend` function (every coroutine
+#     SuspendLambda).
+#
+#   * @Metadata(... d2 = {"...L<pkg/Class>;..."} ...) listing internal
+#     class refs of the file.
+#
+# Typical recovery on a real-world app: 30-50 % of classes regain their real
+# names — usually 100 % of the *Repository / *ViewModel / *UseCase / *Impl
+# classes you actually want to read.
+
+set -euo pipefail
+
+usage() {
+  cat <<EOF
+Usage: recover-kotlin-names.sh <decompiled-sources-dir> [output-dir]
+
+Walks every *.java under <decompiled-sources-dir>, mines @DebugMetadata
+and @Metadata annotations, and writes:
+
+  <output-dir>/mapping.tsv   tab-separated  obf_fqn <TAB> real_fqn <TAB> file
+  <output-dir>/mapping.json  same data as JSON  { obf_fqn: real_fqn, ... }
+  <output-dir>/by_package/   one file per real package, listing
+                             real_fqn <TAB> obf_fqn <TAB> file
+
+If [output-dir] is omitted, files are written to a mapping/ directory next to the sources dir.
+EOF
+  exit 0
+}
+
+[[ $# -lt 1 || "$1" == "-h" || "$1" == "--help" ]] && usage
+SRC="$1"
+OUT="${2:-$(dirname "$SRC")/mapping}"
+[[ ! -d "$SRC" ]] && { echo "not a directory: $SRC" >&2; exit 1; }
+
+mkdir -p "$OUT/by_package"
+
+python3 - "$SRC" "$OUT" <<'PY'
+import os, re, sys, json
+from collections import defaultdict
+
+SRC, OUT = sys.argv[1], sys.argv[2]
+
+# @DebugMetadata(c = "com.foo.Bar$Inner$1", ...)
+RE_DEBUG = re.compile(r'@DebugMetadata\([^)]*?c\s*=\s*"([^"]+)"', re.S)
+# @Metadata(... d2 = { "...Lcom/foo/Bar;..." ...} )
+RE_DTWO  = re.compile(r'@Metadata\([^)]*?d2\s*=\s*\{([^}]*)\}', re.S)
+RE_LCLASS = re.compile(r'L([A-Za-z][\w/$]+);')
+# jadx sometimes emits this comment for renamed classes
+RE_RENAMED = re.compile(r'/\*\s*renamed from:\s*([\w.$]+)\s*\*/')
+
+# Skip third-party / framework trees — their names are already real.
+SKIP_PREFIXES = (
+    "kotlin.", "kotlinx.", "androidx.", "android.", "java.", "javax.",
+    "com.google.", "com.facebook.", "com.appsflyer.", "com.datadog.",
+    "io.ktor.", "io.sentry.", "io.realm.", "okhttp3.", "okio.",
+    "com.squareup.", "com.bumptech.", "com.airbnb.", "com.payu.",
+    "com.storyteller.", "zendesk.", "io.intercom.", "com.microsoft.",
+    "com.tinder.", "com.hotjar.", "com.amplitude.", "com.segment.",
+    "com.mixpanel.", "com.onesignal.", "com.stripe.", "com.braintreepayments.",
+    "retrofit2.", "dagger.", "javax.inject.", "org.jetbrains.",
+)
+
+mapping = {}
+file_real = {}
+counts = defaultdict(int)
+
+for dp, _, files in os.walk(SRC):
+    for f in files:
+        if not f.endswith(".java"):
+            continue
+        path = os.path.join(dp, f)
+        rel = os.path.relpath(path, SRC)
+        obf = rel[:-5].replace(os.sep, ".")
+        if obf.startswith(SKIP_PREFIXES):
+            continue
+        try:
+            text = open(path, "r", errors="replace").read()
+        except OSError:
+            continue
+        real = None
+
+        m = RE_DEBUG.search(text)
+        if m:
+            real = m.group(1).split("$", 1)[0]
+            counts["debug_meta"] += 1
+
+        if not real:
+            m = RE_DTWO.search(text)
+            if m:
+                for lm in RE_LCLASS.finditer(m.group(1)):
+                    cand = lm.group(1).replace("/", ".").split("$", 1)[0]
+                    if "." in cand and not cand.startswith(("kotlin.", "java.", "android")):
+                        real = cand
+                        counts["d2"] += 1
+                        break
+
+        if not real:
+            m = RE_RENAMED.search(text)
+            if m:
+                real = m.group(1)
+                counts["renamed"] += 1
+
+        if real:
+            mapping[obf] = real
+            file_real[obf] = path
+
+with open(os.path.join(OUT, "mapping.tsv"), "w") as f:
+    f.write("obf_fqn\treal_fqn\tfile\n")
+    for k in sorted(mapping):
+        f.write(f"{k}\t{mapping[k]}\t{file_real[k]}\n")
+
+with open(os.path.join(OUT, "mapping.json"), "w") as f:
+    json.dump(mapping, f, indent=2, sort_keys=True)
+
+by_pkg = defaultdict(list)
+for obf, real in mapping.items():
+    pkg = real.rsplit(".", 1)[0] if "." in real else "(default)"
+    by_pkg[pkg].append((real, obf, file_real[obf]))
+
+for pkg, rows in by_pkg.items():
+    safe = os.path.basename(pkg).replace(".", "_") or "default"
+    with open(os.path.join(OUT, "by_package", f"{safe}.txt"), "w") as f:
+        for real, obf, p in sorted(rows):
+            f.write(f"{real}\t{obf}\t{p}\n")
+
+print(f"Recovered {len(mapping)} class names")
+for k, v in counts.items():
+    print(f"  via {k}: {v}")
+print(f"Real packages: {len(by_pkg)}")
+print(f"Wrote {OUT}/mapping.tsv, mapping.json, by_package/")
+PY
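To make the two annotation patterns in the script above easier to follow, here is a minimal sketch that applies the same regexes to two small snippets. The snippets, the class names in them (`com.example.app...`) and the helper `recover_real_name` are hypothetical and exist only for illustration; real jadx output will vary in annotation layout.

```python
import re

# Same patterns as in recover-kotlin-names.sh above.
RE_DEBUG = re.compile(r'@DebugMetadata\([^)]*?c\s*=\s*"([^"]+)"', re.S)
RE_DTWO = re.compile(r'@Metadata\([^)]*?d2\s*=\s*\{([^}]*)\}', re.S)
RE_LCLASS = re.compile(r'L([A-Za-z][\w/$]+);')


def recover_real_name(text):
    """Return (real_fqn, source) for one decompiled file, or (None, None).

    Hypothetical helper: it mirrors the script's first two lookups
    (@DebugMetadata, then @Metadata d2) and omits the 'renamed from' fallback.
    """
    m = RE_DEBUG.search(text)
    if m:
        # Drop inner-class / lambda suffixes such as $load$1.
        return m.group(1).split("$", 1)[0], "debug_meta"
    m = RE_DTWO.search(text)
    if m:
        for lm in RE_LCLASS.finditer(m.group(1)):
            cand = lm.group(1).replace("/", ".").split("$", 1)[0]
            if "." in cand and not cand.startswith(("kotlin.", "java.", "android")):
                return cand, "d2"
    return None, None


# Hypothetical decompiled coroutine lambda (obfuscated class name "a").
SAMPLE_SUSPEND = '''
@DebugMetadata(c = "com.example.app.data.UserRepository$load$1", f = "UserRepository.kt", l = {42}, m = "invokeSuspend")
final class a extends SuspendLambda {}
'''

# Hypothetical decompiled Kotlin class (obfuscated class name "b").
SAMPLE_CLASS = '''
@Metadata(d1 = {"..."}, d2 = {"Lcom/example/app/ui/LoginViewModel;", "Landroidx/lifecycle/ViewModel;", "<init>", "()V"}, k = 1, mv = {1, 9, 0})
public final class b extends ViewModel {}
'''

print(recover_real_name(SAMPLE_SUSPEND))
print(recover_real_name(SAMPLE_CLASS))
print(recover_real_name("public final class c {}"))
```

In the script, `@DebugMetadata` is tried first and `@Metadata(d2 = ...)` only when it yields nothing; the sketch keeps that order. In the `d2` case the first non-framework `L...;` descriptor wins, so the recovered name depends on the order of the `d2` entries.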
Author	SHA1	Message	Date
Simone Avogadro	e8dde9d058	chore: bump plugin version to 1.5.0	2026-06-10 11:49:02 +02:00
Simone Avogadro	f68d9ce3be	feat: post-filter --urls to drop dictionary noise while keeping IPs and apex hosts The hardening patch widened STRICT_URL to recover IPv4 literals, apex 2-label domains and internal hosts that the PR's strict-only regex discarded as collateral while killing Kotlin-stdlib dictionary noise. Widening alone reopened a narrow noise class: 'word.word' fragments such as "www.this" / "this.introduction" pass as apex domains. Keep extraction permissive and add a small awk pass that decides per host: - IPv4 literal: always keep (dict fragments are words, never dotted-quads) - >=3 labels: always keep (any TLD; same tolerance as the original regex) - any host with a :port or /path: always keep (structured = high signal) - bare 2-label apex: keep only when the TLD is a real one, matched as a whole field (so "introduction" != "in" — the prefix-match bug a single mega-regex would have) Trade-off documented inline: a first-party host referenced bare with an uncommon TLD (e.g. https://foo.store with no path) is dropped; a path or port keeps it. awk is POSIX (sub/split/~/print) — more portable than the bash>=4 'declare -A' already used in the summary header. Verified: dictionary noise dropped; IPs, apex, internal and subdomain hosts kept; --all on a zero-match tree still exits 0; host list and full-URL list stay consistent (no orphan hosts). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 11:06:30 +02:00
Simone Avogadro	ed97b8508b	docs: document PR #16 features in README (Kotlin name recovery, fingerprint, Ktor/Apollo/Koin) The PR #16 additions were wired into SKILL.md and references/ but the human-facing README was never updated. Surface them, with prominent emphasis on first-class Kotlin support: - Top blurb: callout for R8 Kotlin name recovery + Ktor/Apollo/Koin - "What it does" table: Phase 0 fingerprint, Kotlin name recovery, modern Kotlin/KMP stacks (Ktor, Apollo, Koin, HMAC) - Usage: fingerprint.sh example, --ktor/--apollo/--paths flags, and a dedicated "Kotlin name recovery (R8 deobfuscation)" subsection - Repository Structure: add the three new scripts + two new references - Acknowledgments: credit @tajchert (#16) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 10:33:53 +02:00
Simone Avogadro	2047f99d01	fix: harden find-api-calls.sh and recover-kotlin-names.sh from PR #16 review - find-api-calls.sh: add missing '|| true' on the --paths inventory and --urls extraction pipelines; with set -euo pipefail a zero-match grep aborted the whole script (including the default --all run) with exit 1. - find-api-calls.sh: widen STRICT_URL to also match IPv4 literals, apex 2-label domains and bare single-label hosts followed by :port or /path (localhost, internal backends) while still rejecting dictionary-fragment noise from the Kotlin stdlib. - recover-kotlin-names.sh: sanitize the by_package/ filename with os.path.basename; a crafted absolute path in untrusted @DebugMetadata package names could otherwise escape the output directory. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 10:22:16 +02:00
Michał Tajchert	a2a0a97f23	docs: call out BuildConfig.java and adopt a two-tier endpoint doc template Two small changes that together meaningfully reduce wasted effort: 1. Phase 3 now explicitly tells the agent to read every BuildConfig.java. These files are almost never obfuscated and routinely contain the single highest-signal constants in the APK — base URLs, flavor names, build types, third-party API keys, feature flags. They were not mentioned in the previous workflow despite being the cheapest possible high-value target. One grep, finds them all. 2. The Phase 5 documentation template was a single per-endpoint block asking for path params, query params, request body, response type, and call chain. On apps with 100+ endpoints that easily becomes hours of work for output the consumer will not read. Replace it with two tiers: * Tier 1 — flat table covering every endpoint (host, method, path, auth required, source file). Always produced. Takes ~5 minutes from the --paths output. * Tier 2 — the existing detailed block, but explicitly reserved for high-value endpoints: the entire auth flow, payment/checkout, and anything the user specifically asked about. Default cap of ~10 Tier-2 entries unless asked for more. This matches the natural shape of how analysts actually use this work (one inventory table to know the surface area, plus a deep dive on auth and a couple of flows) and prevents over-investment in detail for endpoints nobody will read about.	2026-04-29 01:40:50 +02:00
Michał Tajchert	627889a4c6	feat: add summary header to find-api-calls.sh Without an overview the script dumps thousands of file:line: matches across many sections, leaving the reader to figure out which framework even applies. A short summary at the top makes the rest of the output actionable. The summary counts hits per framework / DI / auth-signal category in a single grep pass over the source tree (8 separate greps would have roughly octupled the runtime on a large decompile). Output is a 3-line table: HTTP framework: Retrofit=N OkHttp=N Ktor=N Apollo=N Volley=N DI framework: Hilt/Dagger=N Koin=N Auth signals: Bearer=N HMAC/Sign=N A reader can immediately see which framework the app actually uses, whether auth is bearer-token or signed, and whether to spend time on a section or skip it. The summary is suppressed when a single section flag (--retrofit, --ktor, --paths, ...) is given, so the existing single-section workflows are unchanged. A reminder of the available section flags is printed below the counts so the agent does not have to consult --help.	2026-04-29 01:39:55 +02:00
Michał Tajchert	ec2b14c171	feat: detect Koin DI and HMAC request-signing schemes Two gaps in the previous coverage: 1. Koin was not mentioned anywhere: Hilt/Dagger got a full section in call-flow-analysis.md, but Koin (the dominant DI in KMP and a large share of Kotlin-only Android apps) had zero patterns. Add a Koin subsection with the runtime-DSL patterns (module {}, single<>, factory<>, viewModel<>, by inject, by viewModel) plus the practical trick for resolving an interface to its impl after R8 obfuscation: intersect "files that import org.koin.core.module" with "files that reference the interface name". 2. The --auth mode caught Bearer / API-key / OAuth header patterns but missed HMAC and other request-signing schemes. A hardcoded HMAC secret embedded in an APK is a security finding worth surfacing: whatever authority the secret grants the app, a decompiler grants to anyone. Add patterns for: * JCA primitives: HmacSHA{1,256,512}, Mac.getInstance(...), SecretKeySpec(...), Signature.getInstance(...) * Header conventions: X-Signature, X-Hmac, X-Amz-Signature, X-Client-Authorization, AWS4-HMAC, signRequest(), signaturev2/3 * Likely secret-bearing identifiers: app_secret, client_secret, signing_key, hmac_secret, consumer_secret, private_key * Ktor BearerTokens / loadTokens / refreshTokens DSL These survive R8 because the JCA and Ktor APIs are public and not shrunk. On a real-world app with a homegrown HMAC scheme they pinpoint the signing class and its hardcoded key directly.	2026-04-29 01:26:40 +02:00
Michał Tajchert	2e6fc63453	feat: bucketed --urls output with strict regex and third-party denylist The previous --urls mode was a plain grep for "https?://..." which on a real APK produced thousands of lines, half of them junk strings extracted from Kotlin stdlib's compression dictionary ("http://An Introduction to..." fragments) and the other half SDK URLs (Google, Firebase, AppsFlyer, Datadog, Sentry, ...) that the analyst is not looking for. The signal — first-party backend hosts — was buried. Two changes: 1. Strict URL regex: hostname must have at least one dot and end in a 2+ letter TLD, with no whitespace / angle brackets / non-printables in the path. This eliminates the dictionary-fragment noise. 2. Bucket the surviving URLs into "likely first-party" vs "third-party" using references/third_party_hosts.txt — a curated denylist of ~80 patterns covering Google/Firebase/Apple/Microsoft/Adobe, attribution and observability vendors (AppsFlyer, Datadog, Sentry, Bugsnag, ...), payments (Stripe, PayU, Adyen, ...), support/chat SDKs, CAs, and standards namespaces (w3.org, etc.). The new output starts with a frequency-sorted list of likely first-party hosts — which is the artifact every reverse-engineer wants on the first page — followed by the collapsed third-party list and the full URL set for first-party hosts only. The denylist is a sidecar text file (one regex per line) so users can extend or override it without editing the script.	2026-04-29 01:23:56 +02:00
Michał Tajchert	dbb19f0a22	feat: add --paths mode for obfuscation-resistant endpoint extraction When R8 inlines call sites — client.get("/api/users") becomes a.b(c, "/api/users") — the existing framework-specific patterns find nothing, but the path string literal itself is never obfuscated. This single observation is the most useful endpoint-extraction technique on heavily shrunk apps; the existing --urls mode only catches full "https://..." URLs, missing every relative path. Add a --paths mode that greps for quoted strings matching either: * an absolute path with at least two slash-separated segments, or * a relative path beginning with a known API root keyword (api, v1/v2/v3, graphql, users, auth, profile, cart, order, ...) with a {0,8}-segment cap and a small denylist for MIME types and system paths (image/png, /proc/, /sys/, /dev/, etc.) which would otherwise pollute results. The output is a deduplicated inventory followed by the full call-site list. On a real-world Kotlin/Ktor app this produced ~240 distinct API paths in one shot — paths that the Retrofit/OkHttp/Ktor patterns missed entirely because every call was inlined. This is the recommended first extraction step on any obfuscated app. Document the regex and rationale in references/api-extraction-patterns.md.	2026-04-29 01:21:25 +02:00
Michał Tajchert	371d3d4bed	feat: add Ktor and Apollo (GraphQL) API-extraction patterns The previous find-api-calls.sh covered only Retrofit, OkHttp, and Volley. Modern Kotlin and KMP apps increasingly ship Ktor as their HTTP client (used by ~25 % of new Kotlin apps as of 2025), and many product apps use Apollo Kotlin for GraphQL. Both produced zero hits with the old patterns. Add two new modes to find-api-calls.sh: --ktor Ktor client calls (client.get/post/...), HttpRequestBuilder, defaultRequest blocks, and the Auth bearer DSL (BearerTokens / loadTokens / refreshTokens) --apollo ApolloClient, .serverUrl(), HttpNetworkTransport, and .query/.mutation/.subscription operation calls Document both in references/api-extraction-patterns.md with example post-decompile snippets and a note on R8 obfuscation: Ktor call sites get inlined to obfuscated method calls, but the path string literals and Ktor library symbols (BearerTokens, URLProtocol, etc.) survive, so library-internal patterns still work as anchors.	2026-04-29 01:16:43 +02:00
Michał Tajchert	5b63fcb418	feat: recover original Kotlin class names from R8-stripped binaries R8 obfuscates JVM symbols but cannot strip the Kotlin metadata strings, because the Kotlin runtime needs them for reflection, coroutines, and data-class features. The original FQNs leak through: * @DebugMetadata(c = "<real.fqn>") emitted for every coroutine SuspendLambda (roughly every suspend function in modern apps) * @Metadata(d2 = {"L<real/fqn>;"}) on every Kotlin class Add scripts/recover-kotlin-names.sh that walks decompiled sources, mines both annotations, and writes an obf -> real mapping (TSV + JSON + per-real-package index). On a real-world Kotlin app this recovers ~100% of Repository / ViewModel / UseCase / Impl classes, which are exactly the classes worth reading. Add scripts/lookup-name.sh as a CLI over the mapping with four modes: search by real-name substring, resolve obf -> real, list a real package, and an annotated `--grep` that suffixes every hit with the owning real class. This is a strict upgrade over plain grep against decompiled sources. Replace the misleading 'use --deobf' tip in call-flow-analysis.md with a pointer to this technique. --deobf only renames symbols with synthetic placeholders; metadata recovery returns actual developer-written names. Document the technique, expected recovery rates, and limitations in references/kotlin-name-recovery.md, and reference it from SKILL.md as optional Phase 3.5 (only when Phase 0 reports an obfuscated Kotlin app).	2026-04-29 01:12:31 +02:00
Michał Tajchert	213818fc27	feat: add Phase 0 fingerprint script for fast pre-decompile triage Decompiling Java is wasted effort for Flutter, React Native, Cordova/Capacitor, and Xamarin apps: their code lives in libapp.so, the JS bundle, assets/www/, or .NET DLLs respectively. The previous workflow jumped straight to Phase 1 (install deps) and Phase 2 (decompile), so the agent had no way to know which path to take until after a full jadx run. The new fingerprint.sh inspects an APK/XAPK in seconds and reports: * Detected mobile framework with the file marker that triggered it * HTTP stack hints (Retrofit, OkHttp, Ktor, Apollo, Volley) via DEX string scanning, which survives R8 obfuscation * DI and serialization libraries * Obfuscation level estimate * Notable third-party SDKs found in assets/ and DEX * Consolidated native libraries across base + split APKs (split bundles often place .so files only in config.<abi>.apk) * A framework-specific recommendation for the next step SKILL.md documents this as Phase 0 and explicitly tells the agent to stop and switch tooling if the app is non-native. PowerShell port (fingerprint.ps1) intentionally not included; happy to add if needed, since the behavior is straightforward to mirror.	2026-04-29 01:07:40 +02:00
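
The per-host decision rule described in commit f68d9ce3be (keep IPv4 literals, hosts with three or more labels, and anything with a :port or /path; keep a bare 2-label apex only when its TLD matches a known one as a whole field) can be sketched as follows. This is a Python illustration of the rule as stated in the commit message, not the awk pass from find-api-calls.sh. The `KNOWN_TLDS` set and the `keep_url` helper are assumptions made for the example; the script's actual TLD list is not shown here.

```python
import re

# Assumption: a small illustrative TLD set, not the list used by the script.
KNOWN_TLDS = {"com", "net", "org", "io", "dev", "app", "in", "de", "uk"}

# Dotted-quad shape only; octet ranges are not validated in this sketch.
IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def keep_url(url):
    """Return True if an extracted URL should survive the post-filter."""
    rest = re.sub(r"^https?://", "", url)
    # Separate the authority (host[:port]) from any path.
    authority, slash, _path = rest.partition("/")
    host, colon, _port = authority.partition(":")

    if IPV4.match(host):
        return True              # IPv4 literal: always keep
    labels = host.split(".")
    if len(labels) >= 3:
        return True              # >= 3 labels: always keep, any TLD
    if colon or slash:
        return True              # structured (:port or /path): always keep
    if len(labels) == 2:
        # Whole-field comparison, so "introduction" never matches "in".
        return labels[1].lower() in KNOWN_TLDS
    return False                 # bare single-label host: drop


samples = [
    "http://www.this",
    "http://this.introduction",
    "https://example.com",
    "http://10.0.2.2",
    "http://localhost:8080",
    "https://foo.store",
    "https://foo.store/api",
    "https://api.example.xyz",
]
for u in samples:
    print(("keep" if keep_url(u) else "drop"), u)
```

Comparing the TLD as a whole field is what avoids the prefix-match problem the commit message mentions. The trade-off it documents also shows up here: `https://foo.store` is dropped when bare but kept once a path or port is present.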