Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Module analysis #100

Merged
merged 13 commits into from
Dec 12, 2024
Merged

Module analysis #100

merged 13 commits into from
Dec 12, 2024

Conversation

Y-Nak
Copy link
Collaborator

@Y-Nak Y-Nak commented Dec 4, 2024

Please start reviewing after #96 is merged.

NOTE: A.I. generates the comment below, but I'm not entirely sure if this comment is helpful or not.

Pull Request: Module Analysis Framework Enhancement

Summary

This PR introduces a framework for analyzing the structure and properties of a Module in the Sonatina IR. The additions provide functionality to assess the module's dependency flow, call graph, and strongly connected components (SCCs), enabling insights into the recursive and external dependencies of functions.

Key Features

1. Module Analysis Framework

  • Module Analysis API:

    • Adds the analyze_module function to perform analysis of a module, returning a ModuleInfo object.
    • ModuleInfo encapsulates:
      • Strongly connected components (CallGraphSccs).
      • Call graph structure (CallGraph).
      • Per-function information (FuncInfo).
      • Dependency flow of the module (DependencyFlow).
  • DependencyFlow Enumeration:

    • Represents four states of call dependency: OutgoingOnly, IncomingOnly, Bidirectional, and Closed.
    • Implements a lattice-based join operation for combining dependency states and a remove_flow function for refining flow states.

2. Call Graph Analysis

  • Call Graph Construction:

    • Extracts and organizes function call relationships into a CallGraph.
    • Supports traversal, leaf function detection, and callee inspection.
  • SCC Analysis:

    • Implements Tarjan's algorithm via the SccBuilder to identify SCCs in the call graph.
    • Annotates SCCs with cyclicity and member functions.

3. Function Information

  • Determines per-function properties:
    • Recursiveness: Detects whether a function is part of a recursive call chain.
    • Leaf Functionality: Identifies functions without outgoing calls.
    • Dependency Flow: Categorizes function dependencies with respect to external entities.

4. DependencyFlow analysis

  • Aggregates data to determine the global DependencyFlow of the module.
  • Utilizes the condensation DAG of the call graph to analyze all functions.

@Y-Nak Y-Nak marked this pull request as ready for review December 6, 2024 17:56
@Y-Nak Y-Nak requested a review from sbillig December 6, 2024 18:17
Copy link
Collaborator

@sbillig sbillig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to support analyzing a known-to-be-complete program, composed of linked modules (but that can of course come later).

Can you remind me what is_leaf is useful for? It seems that any function that's known to not be involved in a recursive loop can have a stable frame region; is there something special we can do with leaf functions that I'm forgetting? Two separate leaves could have overlapping stable regions, but that's a special case of disjoint branches.

SCC: [`%f6`]
SCC: [`%f7`]

`%fx` = FuncInfo { is_non_recursive: false, flow: OutgoingOnly, is_leaf: true }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The FuncInfo for externally-defined %fx is weird; local functions call it, so its flow is known to be Incoming, and we should assume that it may be Bidirectional. We also can't say that it's a leaf.

Copy link
Collaborator Author

@Y-Nak Y-Nak Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is bad. I initially thought that we should remove the external function from the result, but also hesitated to do that because it breaks the consistency between the result and FuncStore or other modules, which always manage the mapping from each FuncRef to some information.

I'm inclined to remove the external function info from the ModuleInfo now because the information won't be used anywhere(and actually, if it's used, it just gives wrong information or the most conservative information). What's your thought?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it's fine to remove it and for the backend to assume the most conservative for calls to mysterious external functions.

@Y-Nak
Copy link
Collaborator Author

Y-Nak commented Dec 6, 2024

It would be nice to support analyzing a known-to-be-complete program, composed of linked modules (but that can of course come later).

What is additional analysis in your mind when the entire program is provided? I assumed that by using module_linker to combine multiple modules before analysis, it'd be possible to obtain more fine-grained and comprehensive analysis results(and for the EVM backend, the great difference in the quality of the analysis would depend on whether the module is Bidirectional or not).

Can you remind me what is_leaf is useful for?

It depends on the backend. It won't be so useful for EVM codegen(but, we might need the information in addition to the DependencyFlow/Sccs for inlining). But for the "normal" ISA, it's useful for codgen as well.

@sbillig
Copy link
Collaborator

sbillig commented Dec 7, 2024

It would be nice to support analyzing a known-to-be-complete program, composed of linked modules (but that can of course come later).

What is additional analysis in your mind when the entire program is provided? I assumed that by using module_linker to combine multiple modules before analysis, it'd be possible to obtain more fine-grained and comprehensive analysis results(and for the EVM backend, the great difference in the quality of the analysis would depend on whether the module is Bidirectional or not).

Just more comprehensive information; no need to be overly conservative when calling external functions, for example. In practice, fe or solc can just generate the whole program in a single sonatina module and this 'multi-module whole-program analysis' isn't needed for now.

@Y-Nak
Copy link
Collaborator Author

Y-Nak commented Dec 7, 2024

I see. I probably understand what you mean. So, you mean that it'd be nicer if the analysis phase could take an arbitrary number of Modules or integrate multiple ModuleInfo to refine information accordingly, right?
By assuming my understanding is correct, I changed the external function information to the most conservative form
instead of removing it so that we can easily perform the improvement(i.e., after such refinement, the external function also could have more useful information).

@sbillig sbillig merged commit c9dee84 into fe-lang:main Dec 12, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants