==========================
pyaxml User Guide
==========================
Overview
========
pyaxml is a tool for converting Android binary formats to and from human-readable
representations. It handles two primary formats:
- **AXML** -- Android Binary XML, the compiled XML format used inside APK files
(e.g. ``AndroidManifest.xml``, layout files, other resource XML).
- **ARSC** -- Android Resource Table (``resources.arsc``), the compiled resource
index bundled in APK files.
pyaxml can parse these binary formats, convert them to readable XML or protobuf
text, and recompile XML back into binary AXML.
Installation
============
Building from source
--------------------
Requires Rust 1.56+ (edition 2021).
.. code-block:: bash
cd rust-axml
cargo build --release
The binary is produced at ``target/release/pyaxml-rs``.
Python package
--------------
pyaxml exposes a Python package via PyO3 and maturin. To build and install
the ``pyaxml`` Python package into the current environment:
.. code-block:: bash
cd rust-axml
pip install maturin
uv run maturin develop --release --features python
This makes ``pyaxml.AXML``, ``pyaxml.ARSC``, ``pyaxml.AXMLGuess``, and
``pyaxml.StringBlocks`` available from Python.
CLI Reference
=============
One CLI binary is provided:
- ``pyaxml-rs``: native Rust binary, built with ``cargo build --release``.
General usage::
pyaxml-rs [-h] [-i INPUT] [-o OUTPUT] [-p PATH] [-v] [-l LANGUAGE]
[--stringblocks-file STRINGBLOCKS_FILE]
{axml2xml,xml2axml,arsc2xml,arsc2proto,axml2proto}
# or equivalently via the Python CLI:
pyaxml {axml2xml,xml2axml,arsc2xml,arsc2proto,axml2proto} [options]
Common flags
------------
``-i INPUT``, ``--input INPUT``
Path to the input file. Can be a raw binary file or a ZIP/APK archive.
``-o OUTPUT``, ``--output OUTPUT``
Path to the output file. If omitted, output is written to stdout
(except ``xml2axml``, which requires ``-o``).
``-p PATH``, ``--path PATH``
When the input is a ZIP/APK, specifies the entry name to extract.
Defaults to ``AndroidManifest.xml`` for AXML commands and
``resources.arsc`` for ARSC commands (``arsc2xml``, ``arsc2proto``).
``-v``, ``--version``
Print the version and exit.
``-h``, ``--help``
Print help and exit.
Commands
--------
axml2xml
~~~~~~~~
Convert binary AXML to readable XML.
.. code-block:: bash
pyaxml-rs axml2xml -i AndroidManifest.xml
pyaxml-rs axml2xml -i AndroidManifest.xml -o manifest.xml
With ``--stringblocks-file``, the string pool is exported to a JSON file
alongside the XML conversion. This is useful for preserving string ordering
when round-tripping through ``xml2axml``.
.. code-block:: bash
pyaxml-rs axml2xml -i AndroidManifest.xml -o manifest.xml --stringblocks-file strings.json
xml2axml
~~~~~~~~
Compile a readable XML file back into Android binary AXML format. Requires
``-o`` to specify the output path.
.. code-block:: bash
pyaxml-rs xml2axml -i manifest.xml -o AndroidManifest.xml
To restore the original string pool ordering, supply a previously exported
stringblocks JSON file:
.. code-block:: bash
pyaxml-rs xml2axml -i manifest.xml -o AndroidManifest.xml --stringblocks-file strings.json
arsc2xml
~~~~~~~~
Parse a ``resources.arsc`` file and produce an XML listing of all resource
entries, grouped by locale.
.. code-block:: bash
pyaxml-rs arsc2xml -i resources.arsc
pyaxml-rs arsc2xml -i resources.arsc -o resources.xml
Use ``-l`` / ``--language`` to filter output to a single locale:
.. code-block:: bash
# Show only default (no-locale) entries
pyaxml-rs arsc2xml -i resources.arsc -l default
# Show only English entries
pyaxml-rs arsc2xml -i resources.arsc -l en
# Show only French (France) entries
pyaxml-rs arsc2xml -i resources.arsc -l fr-FR
arsc2proto
~~~~~~~~~~
Convert a binary ARSC resource table to protobuf text format.
.. code-block:: bash
pyaxml-rs arsc2proto -i resources.arsc
pyaxml-rs arsc2proto -i app.apk -o resources.proto.txt
pyaxml-rs arsc2proto -i resources.arsc --pretty
axml2proto
~~~~~~~~~~
Convert binary AXML to protobuf text format. This produces a human-readable
protobuf representation of the internal structure.
.. code-block:: bash
pyaxml-rs axml2proto -i AndroidManifest.xml
pyaxml-rs axml2proto -i AndroidManifest.xml -o manifest.proto.txt
pyaxml-rs axml2proto -i AndroidManifest.xml --pretty
Working with APK Files
======================
All commands transparently handle ZIP/APK archives. When the input file begins
with the ZIP magic bytes (``PK``), pyaxml-rs opens it as a ZIP archive and extracts
the appropriate entry.
.. code-block:: bash
# Extract and decode AndroidManifest.xml from an APK
pyaxml-rs axml2xml -i app.apk
# Extract a specific layout file from an APK
pyaxml-rs axml2xml -i app.apk -p res/layout/activity_main.xml
# Extract resources.arsc from an APK
pyaxml-rs arsc2xml -i app.apk
# Extract a specific entry by path
pyaxml-rs arsc2xml -i app.apk -p resources.arsc
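
The detection step described in this section can be approximated in Python with the standard library alone. This is a minimal sketch under stated assumptions, not pyaxml's actual implementation: the helper name ``read_entry`` and its default entry name are hypothetical and chosen only for illustration.

.. code-block:: python

    import io
    import zipfile

    ZIP_MAGIC = b"PK"  # ZIP signatures start with these two bytes


    def read_entry(data: bytes, path: str = "AndroidManifest.xml") -> bytes:
        """Return the named entry if data is a ZIP/APK, else data unchanged.

        Hypothetical helper, not part of the pyaxml API.
        """
        if data[:2] != ZIP_MAGIC:
            # Not an archive: treat the input as a raw AXML/ARSC file.
            return data
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            return archive.read(path)

Passing ``path="resources.arsc"`` would mirror the ``-p`` flag for ARSC input.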
Output Formats
==============
axml2xml output
---------------
Produces standard XML with an XML declaration header::
...
Namespace declarations are emitted on the root element. Attribute values are
decoded from their typed representation (booleans, hex integers, references,
dimensions, colors, etc.) back into their string form.
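
To make the decoding of typed values concrete, the sketch below renders a few ``Res_value`` data types as text. The type codes come from the Android platform's ``ResourceTypes.h``; the function name ``format_value`` and the exact output spellings (for example ``@7f010000`` for a reference) are assumptions for illustration and may differ from what pyaxml emits.

.. code-block:: python

    # Res_value data type codes from the Android platform's ResourceTypes.h.
    TYPE_REFERENCE = 0x01
    TYPE_INT_DEC = 0x10
    TYPE_INT_HEX = 0x11
    TYPE_INT_BOOLEAN = 0x12
    TYPE_INT_COLOR_ARGB8 = 0x1C


    def format_value(data_type: int, data: int) -> str:
        """Render a 32-bit typed value as text (output spellings are assumed)."""
        if data_type == TYPE_REFERENCE:
            return f"@{data:08x}"
        if data_type == TYPE_INT_BOOLEAN:
            return "true" if data != 0 else "false"
        if data_type == TYPE_INT_HEX:
            return f"0x{data:x}"
        if data_type == TYPE_INT_COLOR_ARGB8:
            return f"#{data:08x}"
        if data_type == TYPE_INT_DEC:
            # The data field is stored unsigned; reinterpret it as signed.
            return str(data - (1 << 32) if data & 0x80000000 else data)
        raise ValueError(f"unhandled type 0x{data_type:02x}")

Only five of the available types are handled here; dimensions, fractions, and floats need additional unit and radix decoding that this sketch leaves out.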
arsc2xml output
---------------
Produces resource entries grouped inside container elements tagged with
the locale.
Each entry element contains:
- ``type`` -- resource type name (string, color, layout, drawable, etc.)
- ``name`` -- resource key name
- ``id`` -- full 32-bit resource ID in hex (package 0x7f, type, entry)
- ``data`` -- resolved value (string content for string resources, hex for others)
- ``data_size`` -- byte size of the value cell
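
The ``id`` field packs three numbers into one 32-bit value. As a small illustration, the sketch below splits an ID into its package, type, and entry parts using the standard Android layout (8 bits, 8 bits, 16 bits). The function name ``split_resource_id`` and the sample ID are hypothetical.

.. code-block:: python

    from typing import Tuple


    def split_resource_id(res_id: int) -> Tuple[int, int, int]:
        """Split a 32-bit resource ID into (package, type, entry)."""
        package = (res_id >> 24) & 0xFF   # top byte, 0x7f for app resources
        type_id = (res_id >> 16) & 0xFF   # index of the resource type
        entry = res_id & 0xFFFF           # index of the entry within the type
        return package, type_id, entry


    package, type_id, entry = split_resource_id(0x7F0A0012)
    print(f"package=0x{package:02x} type=0x{type_id:02x} entry=0x{entry:04x}")
    # prints: package=0x7f type=0x0a entry=0x0012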
Stringblocks JSON format
-------------------------
The ``--stringblocks-file`` export produces a JSON dictionary mapping string
pool indices to their decoded values::
{
"0": "http://schemas.android.com/apk/res/android",
"1": "android",
"2": "package",
...
}
This file can be fed back to ``xml2axml`` via the same flag to preserve
string pool ordering during round-trip conversions.
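
Because JSON object keys are strings, a consumer of this file has to sort them numerically to recover the pool order. The sketch below shows one way to do that with the standard library; ``load_string_pool`` is a hypothetical helper, not part of pyaxml.

.. code-block:: python

    import json


    def load_string_pool(text: str) -> list:
        """Return the pool strings ordered by their integer index."""
        mapping = json.loads(text)
        # Sort with key=int: a plain string sort would place "10" before "2".
        return [mapping[key] for key in sorted(mapping, key=int)]


    example = (
        '{"1": "android", '
        '"0": "http://schemas.android.com/apk/res/android", '
        '"2": "package"}'
    )
    pool = load_string_pool(example)
    print(pool[2])  # prints: package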
Python API
==========
The ``pyaxml`` Python package exposes four public classes.
AXML
----
.. code-block:: python
import pyaxml
# Parse binary AXML
axml = pyaxml.AXML.from_axml(data)
# Convert to an XML Element (lxml or stdlib ET)
element = axml.to_xml()
# Recompile from an XML Element or string
axml.from_xml(element)
# Serialize back to binary
binary = axml.pack()
# String pool access
count = axml.string_count()
s = axml.get_string(0)
# String pool manipulation via proto
sb = pyaxml.StringBlocks(proto=axml.stringblocks.proto)
sb.switch("oldName", "newName")
axml.stringblocks.proto = sb.proto
# Proto serialization round-trip
proto_msg = axml.to_proto() # axml_pb2.AXML message
axml2 = pyaxml.AXML.from_proto(proto_msg)
ARSC
----
.. code-block:: python
import pyaxml
# Parse binary ARSC
arsc = pyaxml.ARSC.from_axml(data)
# List all resource entries as an XML string
xml_str = arsc.list_packages()
# Filter by locale tag
xml_str = arsc.list_packages(language="en")
xml_str = arsc.list_packages(language="default")
# No-op finalization (backward compatibility)
arsc.compute()
# Access the full ARSC proto message
proto = arsc.proto # or arsc.to_proto()
# Iterate packages
packages = arsc.get_packages() # list of AXMLResTablePackage protos
# Look up a resource ID by type and name
pkg = packages[0]
res_id, key_idx = arsc.get_id_public(pkg, "string", "app_name")
# Filter by locale when looking up a resource
res_id, key_idx = arsc.get_id_public(pkg, "string", "app_name", language="en")
res_id, key_idx = arsc.get_id_public(pkg, "xml", "network_security_config", language="default")
# Package count
n = arsc.package_count()
# Proto serialization
proto_msg = arsc.to_proto() # axml_pb2.ARSC message
arsc2 = pyaxml.ARSC.from_proto(proto_msg)
AXMLGuess
---------
Auto-detects whether the input is AXML or ARSC and returns the appropriate object.
.. code-block:: python
import pyaxml
# Returns an AXML or ARSC instance depending on the file type
obj = pyaxml.AXMLGuess.from_axml(data)
binary = obj.pack()
StringBlocks
------------
Provides direct access to the string pool for advanced manipulation.
.. code-block:: python
import pyaxml
axml = pyaxml.AXML.from_axml(data)
# Wrap the string pool proto for manipulation
sb = pyaxml.StringBlocks(proto=axml.stringblocks.proto)
# Decode a string by index
s = sb.decode_str(0)
# Replace a string everywhere in the pool
sb.switch("oldString", "newString")
# Write the modified pool back into the AXML
axml.stringblocks.proto = sb.proto
axml.compute()
binary = axml.pack()
Accessing chunk headers
-----------------------
Both ``XmlElement`` and ``ResourceMap`` expose the chunk headers
read during parsing.
**XmlElement chunk headers:**
Every XML element (StartNamespace, EndNamespace, StartElement, EndElement, CData)
stores its ``header_size`` and ``chunk_size`` values when parsed from binary.
.. code-block:: python
import pyaxml
axml = pyaxml.AXML.from_axml(open("AndroidManifest.xml", "rb").read())
# Direct element iteration via element_at() is not yet exposed in Python,
# so read the chunk headers through the proto message instead.
for el in axml.proto.resourcexml.elts:
if el.HasField('header'):
print(f"chunk_type=0x{el.header.type:04x}, "
f"header_size={el.header.header_size}, "
f"chunk_size={el.header.size}")
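
For readers unfamiliar with the binary layout, the three header values can also be read straight from the raw bytes. The sketch below assumes the standard Android ``ResChunk_header`` layout (little-endian ``u16`` type, ``u16`` header size, ``u32`` chunk size); ``read_chunk_header`` and the sample bytes are hypothetical and independent of pyaxml.

.. code-block:: python

    import struct

    # ResChunk_header: type (u16), header size (u16), chunk size (u32),
    # all little-endian.
    CHUNK_HEADER = struct.Struct("<HHI")


    def read_chunk_header(data: bytes, offset: int = 0):
        """Return (chunk_type, header_size, chunk_size) at the given offset."""
        return CHUNK_HEADER.unpack_from(data, offset)


    # Hand-written header: type 0x0003, header size 8, chunk size 16.
    sample = bytes.fromhex("0300080010000000")
    print(read_chunk_header(sample))  # prints: (3, 8, 16)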
**Get/set methods (Rust library):**
When using the Rust library directly, getter and setter methods are available:
.. code-block:: rust
// Getters
let header_size = element.header_size(); // u16
let chunk_size = element.chunk_size(); // u32
// Setter (for compute())
element.set_chunk_size(new_size);
**ResourceMap chunk headers:**
The resource map also exposes its chunk headers:
.. code-block:: python
import pyaxml
axml = pyaxml.AXML.from_axml(open("AndroidManifest.xml", "rb").read())
# Access resource map if present
if axml.resource_map:
header_size = axml.resource_map.header_size()
chunk_size = axml.resource_map.chunk_size()
**Important: pack() behavior**
- ``XmlElement::pack()`` uses the stored ``header_size`` and ``chunk_size`` directly.
- ``ResourceMap::pack()`` auto-computes the size when ``chunk_size`` is 0 (for direct usage).
- Always call ``axml.compute()`` before ``axml.pack()`` to ensure correct sizes.
.. code-block:: python
import pyaxml
# Create new AXML from XML
axml = pyaxml.AXML()
axml.from_xml(element)
# MUST call compute() before pack() for correct chunk sizes
axml.compute()
binary = axml.pack()
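
The reason stored sizes go stale is easy to see at the byte level: a chunk's size field must equal its header size plus its body length, so any change to the body invalidates the old value. The sketch below illustrates that relationship with the standard library only. It does not reproduce pyaxml internals; ``pack_chunk`` is a hypothetical helper, and ``0x0180`` is the Android chunk type for an XML resource map.

.. code-block:: python

    import struct

    RES_XML_RESOURCE_MAP_TYPE = 0x0180  # from Android's ResourceTypes.h


    def pack_chunk(chunk_type: int, body: bytes) -> bytes:
        """Serialize a chunk with a plain 8-byte header and a fresh size."""
        header_size = 8
        chunk_size = header_size + len(body)  # recomputed on every call
        return struct.pack("<HHI", chunk_type, header_size, chunk_size) + body


    # A resource map body is a flat array of 32-bit resource IDs.
    two_ids = struct.pack("<2I", 0x01010003, 0x0101021B)
    three_ids = two_ids + struct.pack("<I", 0x0101021C)

    small = pack_chunk(RES_XML_RESOURCE_MAP_TYPE, two_ids)
    large = pack_chunk(RES_XML_RESOURCE_MAP_TYPE, three_ids)

    # The size field occupies bytes 4..8 of the header.
    print(struct.unpack_from("<I", small, 4)[0])  # prints: 16
    print(struct.unpack_from("<I", large, 4)[0])  # prints: 20

Reusing the size from ``small`` after appending a third ID would leave the header four bytes short, which is the kind of mismatch ``compute()`` is meant to prevent.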