pyaxml User Guide

Overview

pyaxml is a tool for converting Android binary formats to and from human-readable representations. It handles two primary formats:

  • AXML – Android Binary XML, the compiled XML format used inside APK files (e.g. AndroidManifest.xml, layout files, other resource XML).

  • ARSC – Android Resource Table (resources.arsc), the compiled resource index bundled in APK files.

pyaxml can parse these binary formats, convert them to readable XML or protobuf text, and recompile XML back into binary AXML.

Installation

Building from source

Requires Rust 1.56+ (edition 2021).

cd rust-axml
cargo build --release

The binary is produced at target/release/pyaxml-rs.

Python package

pyaxml also provides a Python package, built with PyO3 and maturin. To build and install it into the current environment:

cd rust-axml
pip install maturin
uv run maturin develop --release --features python

This makes pyaxml.AXML, pyaxml.ARSC, pyaxml.AXMLGuess, and pyaxml.StringBlocks available from Python.

CLI Reference

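The CLI is provided as a native binary; the same subcommands are also available through the pyaxml Python CLI shown in the usage below.

  • pyaxml-rs: native Rust binary, built with cargo build --release.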

General usage:

pyaxml-rs [-h] [-i INPUT] [-o OUTPUT] [-p PATH] [-v] [-l LANGUAGE]
          [--stringblocks-file STRINGBLOCKS_FILE]
          {axml2xml,xml2axml,arsc2xml,arsc2proto,axml2proto}

# or equivalently via the Python CLI:
pyaxml {axml2xml,xml2axml,arsc2xml,arsc2proto,axml2proto} [options]

Common flags

-i INPUT, --input INPUT

Path to the input file. Can be a raw binary file or a ZIP/APK archive.

-o OUTPUT, --output OUTPUT

Path to the output file. If omitted, output is written to stdout (except xml2axml, which requires -o).

-p PATH, --path PATH

When the input is a ZIP/APK, specifies the entry name to extract. Defaults to AndroidManifest.xml for the AXML commands and resources.arsc for the ARSC commands (arsc2xml, arsc2proto).

-v, --version

Print the version and exit.

-h, --help

Print help and exit.

Commands

axml2xml

Convert binary AXML to readable XML.

pyaxml-rs axml2xml -i AndroidManifest.xml
pyaxml-rs axml2xml -i AndroidManifest.xml -o manifest.xml

With --stringblocks-file, the string pool is exported to a JSON file alongside the XML conversion. This is useful for preserving string ordering when round-tripping through xml2axml.

pyaxml-rs axml2xml -i AndroidManifest.xml -o manifest.xml --stringblocks-file strings.json

xml2axml

Compile a readable XML file back into Android binary AXML format. Requires -o to specify the output path.

pyaxml-rs xml2axml -i manifest.xml -o AndroidManifest.xml

To restore the original string pool ordering, supply a previously exported stringblocks JSON file:

pyaxml-rs xml2axml -i manifest.xml -o AndroidManifest.xml --stringblocks-file strings.json
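
The decode/recompile pair above can also be driven from a script. The following is a minimal sketch that only assembles the two argument lists from the flags documented in this guide; the helper name roundtrip_commands and the file names are hypothetical, and the sketch assumes pyaxml-rs is on the PATH when the commands are eventually run.

```python
import shlex


def roundtrip_commands(binary_in, xml_path, binary_out, strings_json, tool="pyaxml-rs"):
    """Build the axml2xml and xml2axml invocations for one round trip.

    Both commands share the same stringblocks JSON file so that the
    string pool ordering exported by the first is reused by the second.
    """
    decode = [tool, "axml2xml", "-i", binary_in, "-o", xml_path,
              "--stringblocks-file", strings_json]
    encode = [tool, "xml2axml", "-i", xml_path, "-o", binary_out,
              "--stringblocks-file", strings_json]
    return decode, encode


# Hypothetical file names, used only for illustration.
decode, encode = roundtrip_commands(
    "AndroidManifest.xml", "manifest.xml", "AndroidManifest.new.xml", "strings.json"
)
print(shlex.join(decode))
print(shlex.join(encode))
```

Each list can then be passed to subprocess.run(cmd, check=True) so that a failed decode stops the script before the recompile step.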

arsc2xml

Parse a resources.arsc file and produce an XML listing of all resource entries, grouped by locale.

pyaxml-rs arsc2xml -i resources.arsc
pyaxml-rs arsc2xml -i resources.arsc -o resources.xml

Use -l / --language to filter output to a single locale:

# Show only default (no-locale) entries
pyaxml-rs arsc2xml -i resources.arsc -l default

# Show only English entries
pyaxml-rs arsc2xml -i resources.arsc -l en

# Show only French (France) entries
pyaxml-rs arsc2xml -i resources.arsc -l fr-FR

arsc2proto

Convert a binary ARSC resource table to protobuf text format.

pyaxml-rs arsc2proto -i resources.arsc
pyaxml-rs arsc2proto -i app.apk -o resources.proto.txt
pyaxml-rs arsc2proto -i resources.arsc --pretty

axml2proto

Convert binary AXML to protobuf text format. This produces a human-readable protobuf representation of the internal structure.

pyaxml-rs axml2proto -i AndroidManifest.xml
pyaxml-rs axml2proto -i AndroidManifest.xml -o manifest.proto.txt
pyaxml-rs axml2proto -i AndroidManifest.xml --pretty

Working with APK Files

All commands transparently handle ZIP/APK archives. When the input file begins with the ZIP magic bytes (PK), pyaxml-rs opens it as a ZIP archive and extracts the appropriate entry.

# Extract and decode AndroidManifest.xml from an APK
pyaxml-rs axml2xml -i app.apk

# Extract a specific layout file from an APK
pyaxml-rs axml2xml -i app.apk -p res/layout/activity_main.xml

# Extract resources.arsc from an APK
pyaxml-rs arsc2xml -i app.apk

# Extract a specific entry by path
pyaxml-rs arsc2xml -i app.apk -p resources.arsc
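
The archive detection described in this section can be approximated in a few lines of standard-library Python. This is a sketch of the idea, not pyaxml-rs's actual implementation: the helper name load_entry is hypothetical, and it assumes the check is a simple comparison of the first two bytes against PK.

```python
import io
import zipfile

ZIP_MAGIC = b"PK"  # leading bytes of a ZIP/APK archive


def load_entry(raw: bytes, entry: str = "AndroidManifest.xml") -> bytes:
    """Return `entry` from a ZIP/APK payload, or `raw` unchanged if it is not a ZIP."""
    if raw[:2] != ZIP_MAGIC:
        # Not an archive: treat the input as the raw binary file itself.
        return raw
    with zipfile.ZipFile(io.BytesIO(raw)) as archive:
        return archive.read(entry)


# Build a tiny in-memory archive to exercise both paths.
payload = b"\x03\x00\x08\x00"
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("AndroidManifest.xml", payload)

from_apk = load_entry(buffer.getvalue())  # extracted from the archive
from_raw = load_entry(payload)            # passed through untouched
```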

Output Formats

axml2xml output

Produces standard XML with an XML declaration header:

<?xml version='1.0' encoding='utf-8'?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
  package="com.example.app"
  android:versionCode="1"
  android:versionName="1.0">
  <application android:label="My App">
    ...
  </application>
</manifest>

Namespace declarations are emitted on the root element. Attribute values are decoded from their typed representation (booleans, hex integers, references, dimensions, colors, etc.) back into their string form.
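
Because the output is ordinary XML, it can be consumed with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree on a shortened copy of the sample above (the elided body of <application> is left out); it shows that android: attributes are addressed by their namespace URI rather than by the prefix.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Shortened version of the sample output shown above.
decoded = """<?xml version='1.0' encoding='utf-8'?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
  package="com.example.app"
  android:versionCode="1"
  android:versionName="1.0">
  <application android:label="My App" />
</manifest>"""

# Parse from bytes so the encoding declaration is honoured.
root = ET.fromstring(decoded.encode("utf-8"))

package = root.get("package")
# Namespaced attributes use the {uri}name form in ElementTree.
version_code = root.get(f"{{{ANDROID_NS}}}versionCode")
label = root.find("application").get(f"{{{ANDROID_NS}}}label")

print(package, version_code, label)
```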

arsc2xml output

Produces resource entries grouped inside <resources> elements tagged with the locale:

<resources lang="default">
  <public type="string" name="app_name" id="0x7f040000" data="My App" data_size=8/>
  <public type="color" name="primary" id="0x7f050000" data="0xff6200ee" data_size=8/>
</resources>
<resources lang="en">
  <public type="string" name="app_name" id="0x7f040000" data="My App" data_size=8/>
</resources>
<resources lang="fr-FR">
  <public type="string" name="app_name" id="0x7f040000" data="Mon App" data_size=8/>
</resources>

Each <public> element contains:

  • type – resource type name (string, color, layout, drawable, etc.)

  • name – resource key name

  • id – full 32-bit resource ID in hex (package 0x7f, type, entry)

  • data – resolved value (string content for string resources, hex for others)

  • data_size – byte size of the value cell
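
The id field packs three values into one 32-bit integer: the package in the top byte, the type in the next byte, and the entry index in the low 16 bits. As an illustration, a small helper (split_resource_id is a hypothetical name, not part of pyaxml) can take an id string from the listing apart:

```python
from typing import NamedTuple


class ResourceId(NamedTuple):
    package: int  # top 8 bits, 0x7f for application resources
    type: int     # next 8 bits, index of the resource type
    entry: int    # low 16 bits, index of the entry within the type


def split_resource_id(value: str) -> ResourceId:
    """Split a hex resource ID such as "0x7f040000" into its three fields."""
    res_id = int(value, 16)
    return ResourceId(
        package=(res_id >> 24) & 0xFF,
        type=(res_id >> 16) & 0xFF,
        entry=res_id & 0xFFFF,
    )


parts = split_resource_id("0x7f040000")
print(f"package=0x{parts.package:02x} type=0x{parts.type:02x} entry=0x{parts.entry:04x}")
```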

Stringblocks JSON format

The --stringblocks-file export produces a JSON object whose keys are string pool indices (written as strings) and whose values are the decoded strings:

{
  "0": "http://schemas.android.com/apk/res/android",
  "1": "android",
  "2": "package",
  ...
}

This file can be fed back to xml2axml via the same flag to preserve string pool ordering during round-trip conversions.
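
When inspecting an exported file from a script, note that JSON keys are strings, so the indices must be converted to integers before sorting. The sketch below assumes the layout shown above; the entry at index 10 is made up for illustration, to show why a plain string sort would give the wrong order.

```python
import json

# Sample in the format shown above; index "10" is a made-up entry.
exported = """{
  "0": "http://schemas.android.com/apk/res/android",
  "1": "android",
  "2": "package",
  "10": "manifest"
}"""


def ordered_strings(text: str) -> list:
    """Return the pool strings sorted by numeric index."""
    pool = json.loads(text)
    # sorted(pool) alone would place "10" before "2".
    return [pool[key] for key in sorted(pool, key=int)]


strings = ordered_strings(exported)
for index, value in enumerate(strings):
    print(index, value)
```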

Python API

The pyaxml Python package exposes four public classes.

AXML

import pyaxml

# Read and parse binary AXML
with open("AndroidManifest.xml", "rb") as f:
    data = f.read()
axml = pyaxml.AXML.from_axml(data)

# Convert to an XML Element (lxml or stdlib ET)
element = axml.to_xml()

# Recompile from an XML Element or string
axml.from_xml(element)

# Serialize back to binary
binary = axml.pack()

# String pool access
count = axml.string_count()
s = axml.get_string(0)

# String pool manipulation via proto
sb = pyaxml.StringBlocks(proto=axml.stringblocks.proto)
sb.switch("oldName", "newName")
axml.stringblocks.proto = sb.proto

# Proto serialization round-trip
proto_msg = axml.to_proto()           # axml_pb2.AXML message
axml2 = pyaxml.AXML.from_proto(proto_msg)

ARSC

import pyaxml

# Read and parse binary ARSC
with open("resources.arsc", "rb") as f:
    data = f.read()
arsc = pyaxml.ARSC.from_axml(data)

# List all resource entries as an XML string
xml_str = arsc.list_packages()

# Filter by locale tag
xml_str = arsc.list_packages(language="en")
xml_str = arsc.list_packages(language="default")

# No-op finalization (backward compatibility)
arsc.compute()

# Access the full ARSC proto message
proto = arsc.proto  # or arsc.to_proto()

# Iterate packages
packages = arsc.get_packages()  # list of AXMLResTablePackage protos

# Look up a resource ID by type and name
pkg = packages[0]
res_id, key_idx = arsc.get_id_public(pkg, "string", "app_name")

# Filter by locale when looking up a resource
res_id, key_idx = arsc.get_id_public(pkg, "string", "app_name", language="en")
res_id, key_idx = arsc.get_id_public(pkg, "xml", "network_security_config", language="default")

# Package count
n = arsc.package_count()

# Proto serialization
proto_msg = arsc.to_proto()    # axml_pb2.ARSC message
arsc2 = pyaxml.ARSC.from_proto(proto_msg)

AXMLGuess

Auto-detects whether the input is AXML or ARSC and returns the appropriate object.

import pyaxml

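# data holds the raw bytes of an AXML or ARSC file
with open("AndroidManifest.xml", "rb") as f:
    data = f.read()

# Returns an AXML or ARSC instance depending on the file type
obj = pyaxml.AXMLGuess.from_axml(data)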
binary = obj.pack()

StringBlocks

Provides direct access to the string pool for advanced manipulation.

import pyaxml

with open("AndroidManifest.xml", "rb") as f:
    data = f.read()
axml = pyaxml.AXML.from_axml(data)

# Wrap the string pool proto for manipulation
sb = pyaxml.StringBlocks(proto=axml.stringblocks.proto)

# Decode a string by index
s = sb.decode_str(0)

# Replace a string everywhere in the pool
sb.switch("oldString", "newString")

# Write the modified pool back into the AXML
axml.stringblocks.proto = sb.proto
axml.compute()
binary = axml.pack()

Accessing chunk headers

Both XmlElement and ResourceMap expose the chunk header values recorded during parsing.

XmlElement chunk headers:

Every XML element (StartNamespace, EndNamespace, StartElement, EndElement, CData) stores its header_size and chunk_size values when parsed from binary.

import pyaxml

axml = pyaxml.AXML.from_axml(open("AndroidManifest.xml", "rb").read())

# Iterate elements (requires AXMLGuess or internal access)
# Note: Direct element iteration via element_at() not yet exposed in Python

# Access via proto (for advanced users)
for el in axml.proto.resourcexml.elts:
    if el.HasField('header'):
        print(f"chunk_type=0x{el.header.type:04x}, "
              f"header_size={el.header.header_size}, "
              f"chunk_size={el.header.size}")
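
For orientation, the three values printed above correspond to the 8-byte header that starts every chunk in Android's binary resource formats: a 16-bit type, a 16-bit header size, and a 32-bit total chunk size, all little-endian. The following standalone sketch reads such a header with the standard struct module; read_chunk_header is a hypothetical helper, not a pyaxml function, and the sample bytes are constructed by hand.

```python
import struct

# type (u16), header_size (u16), chunk_size (u32), little-endian
CHUNK_HEADER = struct.Struct("<HHI")


def read_chunk_header(data: bytes, offset: int = 0):
    """Return (chunk_type, header_size, chunk_size) for the chunk at `offset`."""
    return CHUNK_HEADER.unpack_from(data, offset)


# Hand-built header: type 0x0003 (binary XML), 8-byte header, 8-byte chunk.
sample = struct.pack("<HHI", 0x0003, 8, 8)

chunk_type, header_size, chunk_size = read_chunk_header(sample)
print(f"chunk_type=0x{chunk_type:04x}, header_size={header_size}, chunk_size={chunk_size}")
```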

Get/set methods (Rust library):

When using the Rust library directly, getter and setter methods are available:

// Getters
let header_size = element.header_size();   // u16
let chunk_size = element.chunk_size();    // u32

// Setter (for compute())
element.set_chunk_size(new_size);

ResourceMap chunk headers:

The resource map also exposes its chunk headers:

import pyaxml

axml = pyaxml.AXML.from_axml(open("AndroidManifest.xml", "rb").read())

# Access resource map if present
if axml.resource_map:
    header_size = axml.resource_map.header_size()
    chunk_size = axml.resource_map.chunk_size()

Important: pack() behavior

  • XmlElement::pack() uses stored header_size/chunk_size directly

  • ResourceMap::pack() auto-computes when chunk_size is 0 (for direct usage)

  • Always call axml.compute() before axml.pack() to ensure correct sizes

import pyaxml

# Create new AXML from XML
axml = pyaxml.AXML()
axml.from_xml(element)

# MUST call compute() before pack() for correct chunk sizes
axml.compute()
binary = axml.pack()