Releases: rocky/python-xdis
5.0.2
5.0.1
Two small improvements that are useful in the forthcoming trepan3k release:
- interpret RAISE_VARARGS's argc parameter. Some other formatting was extended too
- check_object_path() is more lenient in the path name (it doesn't have to end in .py anymore), but it is more stringent about what constitutes Python source (it compiles the text to determine validity)
- In the above, is_python_source() and is_bytecode_extension() are used. They are also exported.
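The "compile the text to determine validity" idea can be sketched with the standard library alone. This is only an illustration of the technique, not the code behind is_python_source(); the function name looks_like_python_source() is hypothetical.

```python
def looks_like_python_source(text, filename="<string>"):
    """Return True if `text` compiles as Python source on the running interpreter."""
    try:
        compile(text, filename, "exec")
    except (SyntaxError, ValueError):
        # SyntaxError: not valid Python for this interpreter.
        # ValueError: raised by some versions for embedded null bytes.
        return False
    return True


print(looks_like_python_source("x = 1\n"))
print(looks_like_python_source("def (:\n"))
```

Note that a check like this depends on the interpreter doing the compiling: source that is valid only for another Python version may be rejected.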
5.0.0
Disassembly format and options have been simplified and improved.
I had this "Aha!" moment working on the cross-version interpreter x-python. It can show a better disassembly because it has materialized stack entries.
So for example, when a COMPARE_OP instruction is run, it can show what operands are getting compared.
It was then that I realized that this is also true much of the time statically. For example, you'll often find a LOAD_CONST instruction before a RETURN_VALUE, and when you do, we can show exactly what is getting returned. Although cute, the place where something like this is most appreciated and needed is in calling functions, such as via CALL_FUNCTION. The situation here is that the name of the function is on the stack and it can be several instructions back depending on the number of parameters. However, in a large number of cases, by tracking use of stack effects (added in a previous release), we can locate the LOAD_CONST of that function name.
Note though that we don't attempt to work across basic blocks to track down information. Nor do we even attempt to recreate expression trees. We don't track across a call that has a parameter which is the return value from another call. Still, I find this all very useful.
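As a rough sketch of the static lookup described above, the following works on a hand-written instruction list rather than real bytecode. Everything here (Instr, find_callee(), annotate(), and the assumption that every LOAD_* pushes exactly one stack entry) is hypothetical and much simpler than what xdis does; it only shows the idea of walking back over stack effects and giving up when things get complicated.

```python
from typing import List, NamedTuple, Optional

NO_ARG = object()  # marks instructions that carry no operand


class Instr(NamedTuple):
    opname: str
    arg: object = NO_ARG


def find_callee(instrs: List[Instr], call_index: int) -> Optional[Instr]:
    """Walk back from a CALL_FUNCTION to the instruction that pushed the callee."""
    remaining = instrs[call_index].arg  # positional-argument count
    for ins in reversed(instrs[:call_index]):
        if not ins.opname.startswith("LOAD_"):
            return None  # not a simple push: give up rather than guess
        if remaining == 0:
            return ins  # everything pushed after it was an argument
        remaining -= 1
    return None


def annotate(instrs: List[Instr]) -> List[str]:
    lines = []
    for i, ins in enumerate(instrs):
        text = ins.opname if ins.arg is NO_ARG else f"{ins.opname} {ins.arg!r}"
        prev = instrs[i - 1] if i > 0 else None
        if (ins.opname == "RETURN_VALUE" and prev is not None
                and prev.opname == "LOAD_CONST"):
            text += f"  (return {prev.arg!r})"
        elif ins.opname == "CALL_FUNCTION":
            callee = find_callee(instrs, i)
            if callee is not None:
                text += f"  (call {callee.arg})"
        lines.append(text)
    return lines


# Roughly: print("hi", 3); return 42
program = [
    Instr("LOAD_NAME", "print"),
    Instr("LOAD_CONST", "hi"),
    Instr("LOAD_CONST", 3),
    Instr("CALL_FUNCTION", 2),
    Instr("POP_TOP"),
    Instr("LOAD_CONST", 42),
    Instr("RETURN_VALUE"),
]
for line in annotate(program):
    print(line)
```

With a nested call such as f(g(1)), the outer CALL_FUNCTION is left unannotated because the walk back hits the inner CALL_FUNCTION first, which matches the limitation noted above.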
This is not shown by default though. Instead we use a mode called "classic". To get the extended output, in pydisasm use --format extended or --format extended-bytes.
And that brings up a second change in formatting. Before, we had separate flags and command-line options for whether to show just the header, and whether to include bytecode ops in the output. Now there is just a single parameter called asm_format, and a choice option --format (short option -F).
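A single choice option replacing several flags might look like the sketch below. It uses argparse from the standard library purely for illustration and is not pydisasm's actual option handling; the program name is made up, and only the format names mentioned in these notes are listed, so the real tool may accept others.

```python
import argparse

# Format names taken from these release notes; the real list may be longer.
FORMATS = ("classic", "bytes", "extended", "extended-bytes")


def build_parser():
    parser = argparse.ArgumentParser(prog="disasm-sketch")
    parser.add_argument(
        "-F", "--format",
        dest="asm_format",      # one parameter instead of several booleans
        choices=FORMATS,
        default="classic",
        help="disassembly output format",
    )
    parser.add_argument("files", nargs="*")
    return parser


args = build_parser().parse_args(["-F", "extended-bytes", "example.pyc"])
print(args.asm_format, args.files)
```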
As a result this release is incompatible with prior releases, hence the version bump.
A slight change was made in "classic" output. Before, we had shown the index into some code table, like co_consts or co_varnames. That no longer appears. If you want that information, select either the bytes or extended-bytes formats.
A bug was fixed in all offsets in the recently-added xdis.lineoffsets module.
Fleetwood66
Routines for extracting line and offset information from code objects were added.
Specifically in module xdis.lineoffsets:
* classes: LineOffsetInfo, LineOffsets, and LineOffsetsCompact
* functions: lineoffsets_in_file(), lineoffsets_in_module()
This is needed to better support debugging, which is done via module pyficache.
In the future, I intend to make use of this to disambiguate which offset to break at when there are several for a line. Or to indicate better which function or module the line is located in when reporting lines.
For example in:
z = lambda x, y: x + y
there are two offsets associated with that line. The first is to the assignment of z, while the second is to the addition expression inside the lambda.
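To see the two kinds of offsets for that line, here is a small sketch that uses only the standard library's dis.findlinestarts() on the running interpreter. It does not use xdis.lineoffsets, and the helper name offsets_by_line() is hypothetical. The exact offsets differ between Python versions, so none are shown.

```python
import dis
import types
from collections import defaultdict


def offsets_by_line(code, table=None):
    """Map line number -> [(code object name, offset), ...], including nested code."""
    if table is None:
        table = defaultdict(list)
    for offset, lineno in dis.findlinestarts(code):
        if lineno is not None:  # newer Pythons may report offsets with no line
            table[lineno].append((code.co_name, offset))
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            offsets_by_line(const, table)  # descend into the lambda's code
    return table


code = compile("z = lambda x, y: x + y\n", "<example>", "exec")
line1 = offsets_by_line(code)[1]
for name, offset in line1:
    print(name, offset)
```

Line 1 shows up under both the module's code object and the lambda's code object, which is the ambiguity a debugger has to resolve.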
In other news, a long-standing bug was fixed to handle bytestring constants in 3.x. We had been erroneously converting bytestrings into strings in 3.x. However, when decompiling 1.x or 2.x bytecode from 3.x we still need to convert bytestrings into strings.
Also, operand formatting in assembly for BUILD_MAP_UNPACK_WITH_CALL has been improved, and we note how the operand encoding has changed between 3.5 and 3.6.
Disassembly now properly marks offsets where the line number doesn't change from the previous entry.
Lady Elaine
The main purpose of this release is to support x-python better.
- Fix a bad bug in handling byte constants in 3.x. How could this go so long unfixed?
- More custom formatting across more opcodes
  - CALL_FUNCTION, CALL_FUNCTION_KW, CALL_FUNCTION_VAR, etc.
  - MAKE_FUNCTION
  - LOAD_CONST in some cases
- Go over magic numbers, yet again
- Update See also links
Décadi 30th Floréal - Shepherd's Crook (L.B)
The major impetus for this release is expanding the Python in Python interpreter x-python. (A new release of that will go out after this.)
- 3.8.3 added as a valid 3.8 release
- command program pydisasm disassembles more Python source files now
- Add better argument formatting on CALL_FUNCTION and MAKE_FUNCTION
- bytecode.py now has distb
- opcode modules now have variable python_implementation which is either "CPython" or "PyPy"
- Reformat a number of files using blacken, and lint using flymake
- Update __init__.py exports based on what is used in projects uncompyle6, decompyle3, trepan3k, xasm and x-python
- Remove duplicate findlinestarts() code. Remove testing on the Python version and simplify this where possible.
- get_opcode_module allows either a float or a string datatype for the version, and converts the bytecode datatype when needed
- Fix bugs in marshal and unmarshal
See the commit history or ChangeLog file for a full list of changes.
Stack-effect redux
- Fix bug in marshal for 3.8+ (include posonlyargcount)
- Go over stack effects from 2.5 to 3.4 using an idea from Maynard
- Expand stack-effect testing
stack_effect() and code reorganization
- stack_effects() checked against Python 3.4+ is now in place.
- Added stack_effects() function to std.py since this is part of the API
- cross_dis.py file/module now has dis.py functions split off from bytecode.py
- Instructions class is in its own module too.
- Python 2.7.18 added into magics.
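Checking a stack-effect table against the interpreter can be sketched with the standard library's dis.stack_effect(), which exists in Python 3.4 and later. The table EXPECTED and the helper mismatches() below are hypothetical stand-ins, not xdis's tables or tests, and only opcodes that take no operand are covered to keep the call simple.

```python
import dis

# Hypothetical hand-maintained table: opcode name -> net stack effect.
# Only operand-free opcodes are listed, so no oparg needs to be passed.
EXPECTED = {"POP_TOP": -1, "NOP": 0}


def mismatches(expected):
    """Return (opname, expected, actual) for entries the interpreter disagrees with."""
    bad = []
    for opname, effect in expected.items():
        opcode = dis.opmap.get(opname)
        if opcode is None:
            continue  # opcode does not exist in the running Python version
        actual = dis.stack_effect(opcode)
        if actual != effect:
            bad.append((opname, effect, actual))
    return bad


print(mismatches(EXPECTED))
```

A check like this can only validate the table for the Python version that is running it, so cross-version tables need to be checked under each interpreter in turn.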
Incompatibility with earlier versions:
Note: as a result of the reorganization, exported functions from
bytecode are now in cross_dis. However, functions are exported from
the top level, so use that and there will be no disruption in the
future. For example: from xdis import iscode, instruction_size, code_info.
modern-pitch A
Incompatibility: load_module() and load_module_from_file_object() now return a couple more parameters: is_pypy, and the sip_hash value when that is available. The timestamp and file_size returned by these functions are now None when they aren't available. Previously timestamp had been 0.
- --asm option fixes
- Show sip hash in 3.7+ when that is used
- Handle PEP 552 bytecode-file variations more properly
- Detect more intermediate Python versions in load_code_from_file_object()
- 3.8+ posonlyargcount in assembly... rename Kw-only field to Keyword-only
- Add 3.5 canonic bytecode version
- Marshal dumps()
- convert from byte() to str() in dumps() when needed in 3.x
- to_native() convert to bytes from string when needed in 3.x.
- clean up loading code by using float version values rather than magic values
Introducing the Portable Code Type
A portable version of types.CodeType was rewritten, to make it
- easier to use
- and catch more errors
- more complete in tracking Python types.CodeType changes
- simpler in implementation by using type inheritance
- more general
Previously, getting bytecode read from a bytecode file or from a code
object required knowing a lot about the Python version of the code
type and of the currently running interpreter. That is gone now.
Use codeType2Portable() to turn a native types.CodeType or a structure
read in from a bytecode file into a portable code type. The portable
code type allows fields to be mutated, and is more flexible in the
kinds of datatypes it allows.
For example, lists of things like co_consts or varnames can be Python
lists as well as tuples. The line number table is stored as a
dictionary mapping bytecode offset to line number rather than as a
compressed structure. Bytecode can either be a string (which is
advantageous if you are running Python before 3.x) or a sequence of
bytes, which is the datatype of a code object for 3.x.
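To make the "dictionary rather than compressed structure" point concrete, here is a sketch that expands the classic co_lnotab encoding into an offset-to-line dictionary. It follows the pair-of-bytes layout CPython used before 3.10 (with signed line deltas as in 3.6 and later) and is not the portable code type's own decoder; the name decode_lnotab() and the sample bytes are made up for illustration.

```python
def decode_lnotab(lnotab: bytes, first_lineno: int) -> dict:
    """Expand (offset delta, line delta) byte pairs into {offset: line number}."""
    table = {}
    offset, lineno, last_lineno = 0, first_lineno, None
    for offset_incr, line_incr in zip(lnotab[0::2], lnotab[1::2]):
        if offset_incr:
            # Record the line in effect before moving to the next offset.
            if lineno != last_lineno:
                table[offset] = lineno
                last_lineno = lineno
            offset += offset_incr
        if line_incr >= 0x80:
            line_incr -= 0x100  # line deltas are signed bytes in 3.6+
        lineno += line_incr
    if lineno != last_lineno:
        table[offset] = lineno
    return table


# Two pairs: (+4 bytes, +1 line) then (+6 bytes, +2 lines), starting at line 1.
print(decode_lnotab(bytes([4, 1, 6, 2]), first_lineno=1))
```

A dictionary like this is easy to inspect and edit; going back to a native code object means re-encoding it into the compact form.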
However, when you need a types.CodeType that can be eval()'d by the
Python interpreter you are running, use the to_native() method on the
portable code type returned. It will compress and encode the line
number table, turn lists into tuples, and convert other datatypes to
the right type as needed.
If you have a complete types.CodeType structure for a particular
Python version, whether it is the one the current Python interpreter
is using or not, use the to_portable() function and it will figure
out, based on the version parameter supplied (or the current Python
interpreter version if none is supplied), which particular portable
code type is the right one.
If, on the other hand, you have a number of code-type fields which may
be incomplete, but still want to work with something that has
code-type characteristics while not worrying about which fields are
required and their exact proper datatypes, use the CodeTypeUnion
structure.
Internally, we use OO inheritance to reduce the amount of duplicate
code. The load_code_internal() function from unmarshal.py is now a
lot shorter and cleaner as a result of this reorganization.
New Portable Code Methods, Modules and Classes
- Python 3.8-ish replace() method has been added to the portable code types
- Portable code type classes Code13, Code15 have been added to more precisely distinguish Python 1.3 and 1.5 code types. The other portable code classes are Code2, Code3, and Code38.
- the to_native() method converts a portable code type into a native code type
- the decode_lineno_tab() method on portable code types from Python 1.5 on decompresses the Python encoded line number table into a dictionary mapping offset to line number.
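For readers who have not met the "3.8-ish" replace() method, this is what the native version on types.CodeType looks like in Python 3.8 and later. The sketch uses only the running interpreter's own code objects; it assumes, without showing it, that the portable code types offer a method in the same spirit.

```python
import types


def answer():
    return 42


original = answer.__code__
# replace() returns a new code object with the named fields changed;
# the original code object is left untouched.
renamed = original.replace(co_name="renamed_answer")

func = types.FunctionType(renamed, {})
print(original.co_name, renamed.co_name, func())
```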
Incompatibility
The module xdis.code has been renamed to xdis.codetype, and with
that the function iscode() moved as well. In previous versions, to
use iscode() you might import it from xdis.code; now simply import
it from xdis. In general, functions that had been imported from a
module under xdis can now be imported simply from xdis.
The class Compat3Code and the functions code2compat() and
code3compat() have been removed. Compat2Code is still around for
dropbox 2.5, but that is deprecated and will be removed when I can
figure out how to remove it from dropbox 2.5.
Other Changes
CI testing for older Python versions has been fixed now that 2.7 is even more deprecated.
Note: Deleted releases 4.3.0 and 4.3.1 had bugs detected by decompilers and in handling some 3.8 bytecode.