==========================
pyaxml User Guide
==========================

Overview
========

pyaxml is a tool for converting Android binary formats to and from
human-readable representations. It handles two primary formats:

- **AXML** -- Android Binary XML, the compiled XML format used inside APK
  files (e.g. ``AndroidManifest.xml``, layout files, other resource XML).
- **ARSC** -- Android Resource Table (``resources.arsc``), the compiled
  resource index bundled in APK files.

pyaxml can parse these binary formats, convert them to readable XML or
protobuf text, and recompile XML back into binary AXML.

Installation
============

Building from source
--------------------

Requires Rust 1.56+ (edition 2021).

.. code-block:: bash

   cd rust-axml
   cargo build --release

The binary is produced at ``target/release/pyaxml``.

Python package
--------------

pyaxml exposes a Python package via PyO3 and maturin. To build and install the
``pyaxml`` Python package into the current environment:

.. code-block:: bash

   cd rust-axml
   pip install maturin
   uv run maturin develop --release --features python

This makes ``pyaxml.AXML``, ``pyaxml.ARSC``, ``pyaxml.AXMLGuess``, and
``pyaxml.StringBlocks`` available from Python.

CLI Reference
=============

One CLI binary is provided:

- ``pyaxml-rs``: native Rust binary, built with ``cargo build --release``.

General usage::

    pyaxml-rs [-h] [-i INPUT] [-o OUTPUT] [-p PATH] [-v] [-l LANGUAGE]
              [--stringblocks-file STRINGBLOCKS_FILE]
              {axml2xml,xml2axml,arsc2xml,arsc2proto,axml2proto}

    # or equivalently via the Python CLI:
    pyaxml {axml2xml,xml2axml,arsc2xml,arsc2proto,axml2proto} [options]

Common flags
------------

``-i INPUT``, ``--input INPUT``
    Path to the input file. Can be a raw binary file or a ZIP/APK archive.

``-o OUTPUT``, ``--output OUTPUT``
    Path to the output file. If omitted, output is written to stdout (except
    ``xml2axml``, which requires ``-o``).

``-p PATH``, ``--path PATH``
    When the input is a ZIP/APK, specifies the entry name to extract.
    Defaults to ``AndroidManifest.xml`` for AXML commands and
    ``resources.arsc`` for ``arsc2xml``.

``-v``, ``--version``
    Print the version and exit.

``-h``, ``--help``
    Print help and exit.

Commands
--------

axml2xml
~~~~~~~~

Convert binary AXML to readable XML.

.. code-block:: bash

   pyaxml-rs axml2xml -i AndroidManifest.xml
   pyaxml-rs axml2xml -i AndroidManifest.xml -o manifest.xml

With ``--stringblocks-file``, the string pool is exported to a JSON file
alongside the XML conversion. This is useful for preserving string ordering
when round-tripping through ``xml2axml``.

.. code-block:: bash

   pyaxml-rs axml2xml -i AndroidManifest.xml -o manifest.xml --stringblocks-file strings.json

xml2axml
~~~~~~~~

Compile a readable XML file back into Android binary AXML format. Requires
``-o`` to specify the output path.

.. code-block:: bash

   pyaxml-rs xml2axml -i manifest.xml -o AndroidManifest.xml

To restore the original string pool ordering, supply a previously exported
stringblocks JSON file:

.. code-block:: bash

   pyaxml-rs xml2axml -i manifest.xml -o AndroidManifest.xml --stringblocks-file strings.json

arsc2xml
~~~~~~~~

Parse a ``resources.arsc`` file and produce an XML listing of all resource
entries, grouped by locale.

.. code-block:: bash

   pyaxml-rs arsc2xml -i resources.arsc
   pyaxml-rs arsc2xml -i resources.arsc -o resources.xml

Use ``-l`` / ``--language`` to filter output to a single locale:

.. code-block:: bash

   # Show only default (no-locale) entries
   pyaxml-rs arsc2xml -i resources.arsc -l default

   # Show only English entries
   pyaxml-rs arsc2xml -i resources.arsc -l en

   # Show only French (France) entries
   pyaxml-rs arsc2xml -i resources.arsc -l fr-FR

arsc2proto
~~~~~~~~~~

Convert a binary ARSC resource table to protobuf text format.

.. code-block:: bash

   pyaxml-rs arsc2proto -i resources.arsc
   pyaxml-rs arsc2proto -i app.apk -o resources.proto.txt
   pyaxml-rs arsc2proto -i resources.arsc --pretty

axml2proto
~~~~~~~~~~

Convert binary AXML to protobuf text format.
This produces a human-readable protobuf representation of the internal
structure.

.. code-block:: bash

   pyaxml-rs axml2proto -i AndroidManifest.xml
   pyaxml-rs axml2proto -i AndroidManifest.xml -o manifest.proto.txt
   pyaxml-rs axml2proto -i AndroidManifest.xml --pretty

Working with APK Files
======================

All commands transparently handle ZIP/APK archives. When the input file begins
with the ZIP magic bytes (``PK``), pyaxml-rs opens it as a ZIP archive and
extracts the appropriate entry.

.. code-block:: bash

   # Extract and decode AndroidManifest.xml from an APK
   pyaxml-rs axml2xml -i app.apk

   # Extract a specific layout file from an APK
   pyaxml-rs axml2xml -i app.apk -p res/layout/activity_main.xml

   # Extract resources.arsc from an APK
   pyaxml-rs arsc2xml -i app.apk

   # Extract a specific entry by path
   pyaxml-rs arsc2xml -i app.apk -p resources.arsc

Output Formats
==============

axml2xml output
---------------

Produces standard XML with an XML declaration header::

    ...

Namespace declarations are emitted on the root element. Attribute values are
decoded from their typed representation (booleans, hex integers, references,
dimensions, colors, etc.) back into their string form.

arsc2xml output
---------------

Produces resource entries grouped inside elements tagged with the locale.
Each resource entry contains:

- ``type`` -- resource type name (string, color, layout, drawable, etc.)
- ``name`` -- resource key name
- ``id`` -- full 32-bit resource ID in hex (package 0x7f, type, entry)
- ``data`` -- resolved value (string content for string resources, hex for
  others)
- ``data_size`` -- byte size of the value cell

Stringblocks JSON format
-------------------------

The ``--stringblocks-file`` export produces a JSON dictionary mapping string
pool indices to their decoded values::

    {
      "0": "http://schemas.android.com/apk/res/android",
      "1": "android",
      "2": "package",
      ...
    }

This file can be fed back to ``xml2axml`` via the same flag to preserve string
pool ordering during round-trip conversions.

Python API
==========

The ``pyaxml`` Python package exposes four public classes.

AXML
----

.. code-block:: python

   import pyaxml

   # Read the raw bytes of a binary AXML file
   with open("AndroidManifest.xml", "rb") as f:
       data = f.read()

   # Parse binary AXML
   axml = pyaxml.AXML.from_axml(data)

   # Convert to an XML Element (lxml or stdlib ET)
   element = axml.to_xml()

   # Recompile from an XML Element or string
   axml.from_xml(element)

   # Serialize back to binary
   binary = axml.pack()

   # String pool access
   count = axml.string_count()
   s = axml.get_string(0)

   # String pool manipulation via proto
   sb = pyaxml.StringBlocks(proto=axml.stringblocks.proto)
   sb.switch("oldName", "newName")
   axml.stringblocks.proto = sb.proto

   # Proto serialization round-trip
   proto_msg = axml.to_proto()  # axml_pb2.AXML message
   axml2 = pyaxml.AXML.from_proto(proto_msg)

ARSC
----

.. code-block:: python

   import pyaxml

   # Read the raw bytes of a binary resource table
   with open("resources.arsc", "rb") as f:
       data = f.read()

   # Parse binary ARSC
   arsc = pyaxml.ARSC.from_axml(data)

   # List all resource entries as an XML string
   xml_str = arsc.list_packages()

   # Filter by locale tag
   xml_str = arsc.list_packages(language="en")
   xml_str = arsc.list_packages(language="default")

   # No-op finalization (backward compatibility)
   arsc.compute()

   # Access the full ARSC proto message
   proto = arsc.proto  # or arsc.to_proto()

   # Iterate packages
   packages = arsc.get_packages()  # list of AXMLResTablePackage protos

   # Look up a resource ID by type and name
   pkg = packages[0]
   res_id, key_idx = arsc.get_id_public(pkg, "string", "app_name")

   # Filter by locale when looking up a resource
   res_id, key_idx = arsc.get_id_public(pkg, "string", "app_name", language="en")
   res_id, key_idx = arsc.get_id_public(pkg, "xml", "network_security_config", language="default")

   # Package count
   n = arsc.package_count()

   # Proto serialization
   proto_msg = arsc.to_proto()  # axml_pb2.ARSC message
   arsc2 = pyaxml.ARSC.from_proto(proto_msg)

AXMLGuess
---------

Auto-detects whether the input is AXML or ARSC and returns the appropriate
object.

.. code-block:: python

   import pyaxml

   # Read the raw bytes of either an AXML or an ARSC file
   with open("AndroidManifest.xml", "rb") as f:
       data = f.read()

   # Returns an AXML or ARSC instance depending on the file type
   obj = pyaxml.AXMLGuess.from_axml(data)
   binary = obj.pack()

StringBlocks
------------

Provides direct access to the string pool for advanced manipulation.

.. code-block:: python

   import pyaxml

   # Read the raw bytes of a binary AXML file
   with open("AndroidManifest.xml", "rb") as f:
       data = f.read()

   axml = pyaxml.AXML.from_axml(data)

   # Wrap the string pool proto for manipulation
   sb = pyaxml.StringBlocks(proto=axml.stringblocks.proto)

   # Decode a string by index
   s = sb.decode_str(0)

   # Replace a string everywhere in the pool
   sb.switch("oldString", "newString")

   # Write the modified pool back into the AXML
   axml.stringblocks.proto = sb.proto
   axml.compute()
   binary = axml.pack()

Accessing chunk headers
-----------------------

Both ``XmlElement`` and ``ResourceMap`` now expose their chunk headers during
parsing.

**XmlElement chunk headers:**

Every XML element (StartNamespace, EndNamespace, StartElement, EndElement,
CData) stores its ``header_size`` and ``chunk_size`` values when parsed from
binary.

.. code-block:: python

   import pyaxml

   with open("AndroidManifest.xml", "rb") as f:
       axml = pyaxml.AXML.from_axml(f.read())

   # Iterate elements (requires AXMLGuess or internal access)
   # Note: Direct element iteration via element_at() not yet exposed in Python

   # Access via proto (for advanced users)
   for el in axml.proto.resourcexml.elts:
       if el.HasField('header'):
           print(f"chunk_type=0x{el.header.type:04x}, "
                 f"header_size={el.header.header_size}, "
                 f"chunk_size={el.header.size}")

**Get/set methods (Rust library):**

When using the Rust library directly, getter and setter methods are available:

.. code-block:: rust

   // Getters
   let header_size = element.header_size(); // u16
   let chunk_size = element.chunk_size();   // u32

   // Setter (for compute())
   element.set_chunk_size(new_size);

**ResourceMap chunk headers:**

The resource map also exposes its chunk headers:

.. code-block:: python

   import pyaxml

   # The resource map is a chunk of a binary XML file, so parse an AXML input
   with open("AndroidManifest.xml", "rb") as f:
       axml = pyaxml.AXML.from_axml(f.read())

   # Access resource map if present
   if axml.resource_map:
       header_size = axml.resource_map.header_size()
       chunk_size = axml.resource_map.chunk_size()

**Important: pack() behavior**

- ``XmlElement::pack()`` uses the stored ``header_size`` and ``chunk_size``
  directly.
- ``ResourceMap::pack()`` auto-computes the size when ``chunk_size`` is 0
  (for direct usage).
- Always call ``axml.compute()`` before ``axml.pack()`` to ensure correct
  sizes.

.. code-block:: python

   import pyaxml

   # Create new AXML from XML; `element` is an XML Element or string
   # prepared by the caller
   axml = pyaxml.AXML()
   axml.from_xml(element)

   # MUST call compute() before pack() for correct chunk sizes
   axml.compute()
   binary = axml.pack()
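
The ``header_size`` and ``chunk_size`` values discussed above come from the
8-byte header that starts every chunk. As a rough illustration, the sketch
below decodes such a header with the standard library only. It is not part of
pyaxml: ``ChunkHeader`` and ``read_chunk_header`` are hypothetical names, and
the layout (little-endian ``u16`` type, ``u16`` header size, ``u32`` chunk
size) is assumed from the Android chunk header format rather than taken from
pyaxml's source.

.. code-block:: python

   import struct
   from typing import NamedTuple


   class ChunkHeader(NamedTuple):
       """Hypothetical container for the three chunk header fields."""
       chunk_type: int   # u16
       header_size: int  # u16
       chunk_size: int   # u32


   def read_chunk_header(data: bytes, offset: int = 0) -> ChunkHeader:
       """Decode the 8-byte chunk header that starts at ``offset``."""
       if len(data) - offset < 8:
           raise ValueError("need at least 8 bytes for a chunk header")
       # "<HHI" = little-endian u16, u16, u32
       return ChunkHeader(*struct.unpack_from("<HHI", data, offset))


   # Synthetic header for demonstration: type 0x0003 (binary XML document),
   # header_size 8, chunk_size 8.
   sample = struct.pack("<HHI", 0x0003, 8, 8)
   header = read_chunk_header(sample)
   print(f"chunk_type=0x{header.chunk_type:04x}, "
         f"header_size={header.header_size}, "
         f"chunk_size={header.chunk_size}")

Reading the first header of a file this way can serve as a quick sanity check
that ``chunk_size`` matches the file length before handing the bytes to
``pyaxml.AXML.from_axml``.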