Releases: rocky/python-xdis

5.0.2

25 Jul 10:36
  • Add Python 3.8.4 as a 3.8 release
  • pydisasm.py Python 3.3 tolerance
  • Make pydoc's version reporting show xdis's version

5.0.1

28 Jun 16:43

Two small improvements that are useful in the forthcoming trepan3k release:

  • interpret RAISE_VARARGS's argc parameter. Some other formatting was extended too.
  • check_object_path() is more lenient about the path name (it doesn't have to end in .py anymore), but it is
    more stringent about what constitutes Python source (it compiles the text to determine validity)
  • In the above, is_python_source() and is_bytecode_extension() are used. They are also exported.
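
The "compile the text to determine validity" idea can be sketched as follows. This is only an illustration using the built-in compile(); the function name looks_like_python_source() is hypothetical and this is not the code xdis uses in is_python_source().

```python
import os
import tempfile


def looks_like_python_source(path):
    """Return True if the file at `path` compiles as Python source.

    Hypothetical helper: the file extension is deliberately ignored,
    only the result of compiling the contents matters.
    """
    try:
        with open(path, "rb") as f:
            source = f.read()
        compile(source, path, "exec")
    except (OSError, SyntaxError, ValueError):
        return False
    return True


# Usage: a file without a .py extension can still be valid source.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "script_without_extension")
    bad = os.path.join(tmp, "not_python.py")
    with open(good, "w") as f:
        f.write("x = 1 + 2\n")
    with open(bad, "w") as f:
        f.write("this is ( not python\n")
    good_ok = looks_like_python_source(good)
    bad_ok = looks_like_python_source(bad)
```

Note that the file is compiled with the running interpreter, so source written for a different Python version may be rejected.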

5.0.0

28 Jun 02:29

Disassembly format and options have been simplified and improved.

I had this "Aha!" moment working on the cross-version interpreter x-python. It can show a better disassembly because it has materialized stack entries.
So for example when a COMPARE_OP instruction is run it can show what operands are getting compared.

It was then that I realized that this is also true much of the time statically. For example, you'll often find a LOAD_CONST instruction before a RETURN_VALUE, and when you do, the disassembly can show exactly what is getting returned. Although cute, the place where something like this is most appreciated and needed is in calling functions, such as via CALL_FUNCTION. The situation here is that the name of the function is on the stack, and it can be several instructions back depending on the number of parameters. However, in a large number of cases, by tracking stack effects (added in a previous release), we can locate the LOAD_CONST of that function name.

Note though that we don't attempt to work across basic blocks to track down information. Nor do we even attempt to recreate expression trees. We don't track across a call that has a parameter which is the return value of another call. Still, I find this all very useful.
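
To make the backward search concrete, here is a minimal sketch over a toy instruction list. Everything in it is an assumption made for illustration: the Instr tuple, the tiny pops/pushes table, and find_callee() are hypothetical and are not xdis's implementation. It walks backward from a CALL_FUNCTION, using stack effects to skip over the arguments, and gives up at anything it does not understand (such as a jump, which would end a basic block). Unlike the release described here, the toy version does step over nested calls.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Instr(NamedTuple):
    opname: str
    arg: int
    argrepr: str


def pops_pushes(ins: Instr) -> Optional[Tuple[int, int]]:
    """(entries popped, entries pushed) for a small toy subset of opcodes."""
    if ins.opname in ("LOAD_CONST", "LOAD_NAME", "LOAD_GLOBAL", "LOAD_FAST"):
        return (0, 1)
    if ins.opname == "BINARY_ADD":
        return (2, 1)
    if ins.opname == "CALL_FUNCTION":
        return (ins.arg + 1, 1)  # argc arguments plus the function itself
    return None  # unknown opcode or control flow: stop tracking


def find_callee(instrs: Sequence[Instr], call_index: int) -> Optional[Instr]:
    """Find the instruction that pushed the function used by a CALL_FUNCTION."""
    depth = instrs[call_index].arg  # stack entries sitting above the function
    for ins in reversed(instrs[:call_index]):
        effect = pops_pushes(ins)
        if effect is None:
            return None
        pops, pushes = effect
        if pushes > depth:  # this instruction produced the entry we want
            return ins
        depth = depth - pushes + pops
    return None


# Toy bytecode for: print(x + 1)
example = [
    Instr("LOAD_NAME", 0, "print"),
    Instr("LOAD_NAME", 1, "x"),
    Instr("LOAD_CONST", 0, "1"),
    Instr("BINARY_ADD", 0, ""),
    Instr("CALL_FUNCTION", 1, ""),
]
callee = find_callee(example, 4)
```

Here callee is the first LOAD_NAME, so a disassembler could annotate the CALL_FUNCTION line with the name print.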

This is not shown by default though. Instead the default is a mode called "classic". To get the extended output in pydisasm, use --format extended or --format extended-bytes.

And that brings up a second change in formatting. Before, we had separate flags and command-line options for whether to show just the header, and whether to include bytecode ops in the output. Now there is just a single parameter called asm_format, and choice option --format (short option -F).

As a result this release is incompatible with prior releases, hence the version bump.

A slight change was made in "classic" output. Before we had shown the index into some code table, like co_consts or co_varnames. That no longer appears. If you want that information select either the bytes or extended-bytes formats.

A bug was fixed in all offsets in the recently-added xdis.lineoffsets module.

Fleetwood66

12 Jun 23:24

Routines for extracting line and offset information from code objects were added.

Specifically in module xdis.lineoffsets:
* classes: LineOffsetInfo, LineOffsets, and LineOffsetsCompact
* functions: lineoffsets_in_file(), lineoffsets_in_module()

This is needed to better support debugging, which is done via the module
pyficache.

In the future, I intend to make use of this to disambiguate which offset to break at when there are several for a line. Or to indicate better which function or module the line is located in when reporting lines.

For example in:

  z = lambda x, y: x + y

there are two offsets associated with that line. The first is to the assignment of z while the second is to the addition expression inside the lambda.
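
The two locations can be seen with the standard library alone. The sketch below uses dis.findlinestarts() from the stdlib rather than xdis.lineoffsets (whose exact API is not shown in these notes), and the helper name offsets_for_line() is made up for the example. It walks the module code object and any nested code objects, collecting every place where line 1 starts.

```python
import dis
import types


def offsets_for_line(code, lineno):
    """Yield (code object name, offset) pairs where `lineno` starts.

    Nested code objects (lambdas, functions) live in co_consts, so they
    are searched recursively.
    """
    for offset, line in dis.findlinestarts(code):
        if line == lineno:
            yield code.co_name, offset
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from offsets_for_line(const, lineno)


module_code = compile("z = lambda x, y: x + y\n", "<example>", "exec")
hits = list(offsets_for_line(module_code, 1))
names = {name for name, _ in hits}
```

The names found are <module> (the assignment to z) and <lambda> (the addition), which is the ambiguity a debugger has to resolve when asked to break at that line. The exact offsets depend on the Python version doing the compiling.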

In other news, a long-standing bug was fixed to handle bytestring constants in 3.x. We had been erroneously converting bytestrings into strings in 3.x. However, when decompiling 1.x or 2.x bytecode from 3.x we still need to convert bytestrings into strings.

Also, operand formatting in assembly for BUILD_MAP_UNPACK_WITH_CALL has been improved, and
we note how the operand encoding has changed between 3.5 and 3.6.

Disassembly now properly marks offsets where the line number doesn't change from the previous entry.

Lady Elaine

30 May 16:54

The main purpose of this release is to support x-python better.

  • Fix a bad bug in handling byte constants in 3.x. How could this go so long unfixed?
  • More custom formatting across more opcodes
    • CALL_FUNCTION, CALL_FUNCTION_KW, CALL_FUNCTION_VAR, etc
    • MAKE_FUNCTION
    • LOAD_CONST in some cases
  • Go over magic numbers, yet again
  • Update See also links

Décadi 30th Floréal - Shepherd's Crook (L.B)

19 May 04:49

The major impetus for this release is expanding the Python in Python interpreter x-python.
(A new release of that will go out after this.)

  • 3.8.3 added as a valid 3.8 release
  • command program pydisasm disassembles more Python source files now
  • Add better argument formatting on CALL_FUNCTION and MAKE_FUNCTION
  • bytecode.py now has distb
  • opcode modules now have a variable python_implementation which is either "CPython" or "PyPy"
  • Reformat a number of files using blacken, and lint using flymake
  • Update __init__.py exports based on what is used in projects uncompyle6, decompyle3, trepan3k,
    xasm and x-python
  • Remove duplicate findlinestarts() code. Remove testing on the Python version and simplify
    this where possible.
  • get_opcode_module allows either a float or a string datatype for the version, and converts
    the bytecode datatype when needed
  • Fix bugs in marshal and unmarshal

See the commit history or ChangeLog file for a full list of changes.

Stack-effect redux

27 Apr 01:37
  • Fix bug in marshal for 3.8+ (include posonlyargcount)
  • Go over stack effects from 2.5 to 3.4 using an idea from Maynard
  • Expand stack-effect testing

stack_effect() and code reorganization

24 Apr 06:15
  • stack_effects() checked against Python 3.4+ is now in place.
  • Added stack_effects() function to std.py since this is part of the API
  • cross_dis.py file/module now has dis.py functions split off from bytecode.py
  • Instructions class is in its own module too.
  • Python 2.7.18 added into magics.

Incompatibility with earlier versions:

Note: as a result of the reorganization, exported functions from
bytecode are now in cross_dis. However functions are exported from
the top-level so use that and there will be no disruption in the
future. For example from xdis import iscode, instruction_size, code_info.

modern-pitch A

20 Apr 14:33

Incompatibility: load_module() and load_module_from_file_object() now return a couple more values: is_pypy, and the sip_hash value when that is available. The timestamp and file_size returned by these functions are now None when they aren't available. Previously timestamp had been 0.

  • --asm option fixes
  • Show sip hash in 3.7+ when that is used
  • Handle PEP 552 bytecode-file variations more properly
  • Detect more intermediate Python versions in load_code_from_file_object()
  • 3.8+ posonlyargcount in assembly... rename Kw-only field to Keyword-only
  • Add 3.5 canonic bytecode version Marshal dumps()
  • convert from byte() to str() in dumps() when needed in 3.x
  • to_native() convert to bytes from string when needed in 3.x.
  • clean up loading code by using float version values rather than magic values

Introducing the Portable Code Type

16 Apr 19:43

A portable version of types.CodeType was rewritten, to make it

  • easier to use
  • better at catching errors
  • more complete in tracking Python types.CodeType changes
  • simpler in implementation by using type inheritance
  • more general

Previously, getting bytecode read from a bytecode file or from a code
object required knowing a lot about the Python version of the code
type and of the currently running interpreter. That is gone now.

Use codeType2Portable() to turn a native types.CodeType or a structure read
in from a bytecode file into a portable code type. The portable code
type allows fields to be mutated, and is more flexible in the kinds of
datatypes it allows.

For example, lists of things like co_consts or co_varnames can be
Python lists as well as tuples. The line number table is stored as a
dictionary mapping of bytecode offset to line number rather than as a
compressed structure. Bytecode can either be a string (which is
advantageous if you are running Python before 3.x) or a sequence of
bytes which is the datatype of a code object for 3.x.

However, when you need a types.CodeType that can be
eval()'d by the Python interpreter you are running, use the
to_native() method on the portable code type returned. It will
compress and encode the line number table, turn lists into tuples,
and convert other datatypes to the right type as needed.

If you have a complete types.CodeType structure for a particular
Python version, whether it is the one the current Python interpreter
is using or not, use the to_portable() function and it will figure
out, based on the version parameter supplied (or the current Python
interpreter version if none is supplied), which particular portable code
type is the right one.

If, on the other hand, you have a number of code-type fields which may
be incomplete, but still want to work with something that has
code-type characteristics while not worrying about which fields are
required and their exact proper datatypes, use the CodeTypeUnion structure.

Internally, we use OO inheritance to reduce the amount of duplicate
code. The load_code_internal() function from unmarshal.py is now a
lot shorter and cleaner as a result of this reorganization.

New Portable Code Methods, Modules and Classes

  • Python 3.8-ish replace() method has been added to the portable code types
  • Portable code type classes Code13, Code15 have been added to more precisely distinguish Python 1.3 and 1.5 code types. The other portable code classes are Code2, Code3, and Code38.
  • the to_native() method converts a portable code type into a native code type
  • the decode_lineno_tab() method on portable code types from Python 1.5 on decompresses the Python-encoded line number table into a dictionary mapping offset to line number.
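
For readers unfamiliar with the compressed table, here is a minimal sketch of decompressing it into an offset-to-line dictionary. It assumes the classic co_lnotab layout used before Python 3.6: pairs of unsigned bytes (offset increment, line increment). Later versions use signed line increments, and 3.10+ use a different table altogether. The function name decode_lnotab() is hypothetical and this is not the code behind decode_lineno_tab().

```python
def decode_lnotab(lnotab, first_lineno):
    """Expand a classic (pre-3.6) co_lnotab into {offset: line number}."""
    table = {}
    lineno = first_lineno
    last_lineno = None
    addr = 0
    # Even-indexed bytes are offset increments, odd-indexed bytes are
    # line increments.
    for byte_incr, line_incr in zip(lnotab[0::2], lnotab[1::2]):
        if byte_incr:
            # The offset is about to advance, so record the line that
            # is in effect at the current offset.
            if lineno != last_lineno:
                table[addr] = lineno
                last_lineno = lineno
            addr += byte_incr
        lineno += line_incr
    if lineno != last_lineno:
        table[addr] = lineno
    return table


# Offsets 0, 6 and 50 start lines 1, 2 and 7 respectively.
decoded = decode_lnotab(bytes([6, 1, 44, 5]), first_lineno=1)
```

The dictionary form is easier to query and edit; the compressed form only has to be rebuilt when converting back to a native code object.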

Incompatibility

The module xdis.code has been renamed to xdis.codetype and with
that the function iscode() moved as well. In previous versions, to
use iscode() you might import it from xdis.code; now simply import
it from xdis. In general, functions that had been imported from a
module under xdis can now be imported simply from xdis.

The class Compat3Code and the functions code2compat() and
code3compat() have been removed. Compat2Code is still around for
dropbox 2.5, but that is deprecated and will be removed when I can
figure out how to remove it from dropbox 2.5.

Other Changes

CI testing for older Python versions has been fixed now that 2.7 is even more deprecated.

Note: the deleted releases 4.3.0 and 4.3.1 had bugs that were detected by decompilers, and bugs in handling some 3.8 bytecode.