Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve DWARF5 support and refactor (new history) #3703

Merged
merged 3 commits into from
Aug 25, 2023
Merged

Improve DWARF5 support and refactor (new history) #3703

merged 3 commits into from
Aug 25, 2023

Conversation

imbillow
Copy link
Contributor

@imbillow imbillow commented Jul 31, 2023

DO NOT SQUASH ME!

Your checklist for this pull request

  • I've read the guidelines for contributing to this repository
  • I made sure to follow the project's coding style
  • I've documented or updated the documentation of every function and struct this PR changes. If not so I've explained why.
  • I've added tests that prove my fix is effective or that my feature works (if possible)
  • I've updated the rizin book with the relevant information (if needed)

Detailed description

Existing DWARF loaders have incomplete support for DWARF5, and the support for DWARF location expressions is too simplified. It also uses a lot of SDBs to store DWARF information.

This PR completely refactors the DWARF loader, fixes a lot of bugs, and makes a lot of improvements to these issues.

TODO (after merge)

Test plan

CI is green

Closing issues

closes #3548
closes #3535
closes #1004
closes #3541
closes #3700
partially addresses #3581
...

See also https://gcc.gnu.org/wiki/DebugFission

Second version of #3565

XVilka

This comment was marked as resolved.

@XVilka

This comment was marked as resolved.

@XVilka

This comment was marked as resolved.

@XVilka

This comment was marked as resolved.

@imbillow

This comment was marked as resolved.

@XVilka

This comment was marked as resolved.

XVilka

This comment was marked as resolved.

@XVilka

This comment was marked as resolved.

@XVilka

This comment was marked as resolved.

Rot127

This comment was marked as resolved.

librz/bin/dwarf/lists.c Outdated Show resolved Hide resolved
librz/bin/dwarf/lists.c Show resolved Hide resolved
librz/bin/dwarf/loclists.c Outdated Show resolved Hide resolved
librz/bin/dwarf/loclists.c Show resolved Hide resolved
librz/bin/dwarf/loclists.c Outdated Show resolved Hide resolved
librz/core/disasm.c Outdated Show resolved Hide resolved
librz/include/rz_analysis.h Show resolved Hide resolved
librz/include/rz_analysis.h Outdated Show resolved Hide resolved
librz/include/rz_bin_dwarf.h Outdated Show resolved Hide resolved
meson.build Outdated Show resolved Hide resolved
@Rot127

This comment was marked as resolved.

@XVilka XVilka modified the milestones: 0.6.0, 0.7.0 Aug 1, 2023
@XVilka

This comment was marked as resolved.

TRY_LOAD_SECTION("debug_line", dw->line,
rz_bin_dwarf_line_from_buf(buf, &dw->encoding, dw->info, RZ_BIN_DWARF_LINE_INFO_MASK_LINES_ALL));

RzBuffer *loc = load_section("debug_loc", sdb);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thestr4ng3r what do you think about this approach of loading/saving the debug sections? cc @wargio @ret2libc

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these sections actually info that has to be saved/loaded at all? If it is something that is parsed directly out of the binary and will never be edited during the session, it does not have to be saved as it can be restored from the binary again on the next load (might need tests so this property is actually guaranteed).

For the RzAnalysisDwarfVariable on the other hand, we have no choice other than to serialize it, or at least some kind of reference into the binary where the same info can be restored from, since variables can be renamed, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thestr4ng3r Rizin uses only a small part of that information, in fact, especially at this stage. Thus do not storing whole sections should reduce the resulting size of the project file. Moreover, as some binaries have huge debug information sections, storing them serialized might become a problem, for example for files like this one: #676

Copy link
Contributor Author

@imbillow imbillow Aug 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that RzAnalysisDwarfVariable should not be edited and it can be restored via sections of DWARF. then we can reference it via DIE offset.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@imbillow but then you don't need to store the whole section content in the SDB, only pointers to it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@imbillow but then you don't need to store the whole section content in the SDB, only pointers to it?

Yes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. If a DIE offset for a variable is enough to reconstruct the info, then it is enough to store that as part of the variable storage info and then reconstruct the info from the bin itself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, lets stick to this approach then - saving pointers to the information. Then, if it works - the only missing thing is project migration tests, also test for loading this kind of storage from the project file. Then the PR will be ready to merge

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

@XVilka

This comment was marked as resolved.

@imbillow
Copy link
Contributor Author

imbillow commented Aug 24, 2023

There are more functions detected now. Is this intended?

What new were added?


[XX] db/abi/compilers/clang_64 ELF_ABI : Clang Of flagspaces

RZ_NOPLUGINS=1 /private/var/folders/bc/yvvw2g2s4jlg9r8x7tf61xv800009d/T/woodpecker-local-1287038090/rizinorg/rizin/prefix/bin/rizin -escr.utf8=0 -escr.color=0 -escr.interactive=0 -eflirt.sigdb.load.system=false -eflirt.sigdb.load.home=false -N -A -Qc 'fsl~functions

' bins/abi_bins/elf/compilers/clang/echo_clang_Of

-- stdout

--- expected

+++ actual

@@ -1,1 +1,1 @@

-   51 * functions

+   68 * functions



-- stderr

[x] Analyze all flags starting with sym. and entry0 (aa)

[x] Analyze function calls

[x] Analyze len bytes of instructions for references

[x] Check for classes

[x] Analyze local variables and arguments

[x] Type matching analysis for all functions

[x] Applied 0 FLIRT signatures via sigdb

[x] Propagate noreturn information

[x] Integrate dwarf function information.

[x] Resolve pointers to data sections

[x] Use -AA or aaaa to perform additional experimental analysis.

See https://ci.rizin.re/rizinorg/rizin/build/3026/6

Some functions that had no name are now have auto-named

image

@XVilka

This comment was marked as resolved.

@XVilka
Copy link
Member

XVilka commented Aug 24, 2023

@imbillow could you please add a new test for this kind of anonymous function too?

@imbillow
Copy link
Contributor Author

@imbillow could you please add a new test for this kind of anonymous function too?

My bad. It's actually just a mistake to rename an existing function with the generated name, should has been fixed.

XVilka

This comment was marked as resolved.

@imbillow

This comment was marked as resolved.

@XVilka

This comment was marked as resolved.

@XVilka

This comment was marked as resolved.

XVilka

This comment was marked as resolved.

@XVilka

This comment was marked as resolved.

.reuse/dep5 Outdated Show resolved Hide resolved
- rz_base_type_clone_into
- rz_base_type_clone
- rz_bits_leading_zeros
- rz_json_eq
- rz_json_string_eq
- RZ_STR_EQ
- RZ_STR_NE
- rz_vector_clone_into
- sdb_diff_eq
- `RZ_PROJECT_VERSION` 14
- Add `RzCallable`.has_unspecified_parameters
- Add `RzAnalysis`.debug_info
- Add `RzAnalysisVarStorageType` composite and `eval` `pending`
- Support for parse DWARF section "debug_loclists", "debug_ranges", "debug_rnglists"
- Partial support for eval DWARF expr_loc
- Support for anonymous Type, function variable, struct member
- Cache all DWARF information in `RzAnalysisDebugInfo` and remove `SDB` based caching.
- Add arm32, arm64, TriCore DWARF register name
- Fix same name basetype
@XVilka XVilka merged commit 8e29b95 into dev Aug 25, 2023
44 checks passed
@XVilka XVilka deleted the dwarf5 branch August 25, 2023 15:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants