Module analysis #100

Y-Nak · 2024-12-04T22:27:46Z

Please start reviewing after #96 is merged.

NOTE: The description below was generated by A.I.; I'm not entirely sure whether it is helpful.

Pull Request: Module Analysis Framework Enhancement

Summary

This PR introduces a framework for analyzing the structure and properties of a Module in the Sonatina IR. The additions provide functionality to assess the module's dependency flow, call graph, and strongly connected components (SCCs), enabling insights into the recursive and external dependencies of functions.

Key Features

1. Module Analysis Framework

Module Analysis API:
- Adds the analyze_module function to perform analysis of a module, returning a ModuleInfo object.
- ModuleInfo encapsulates:
  - Strongly connected components (CallGraphSccs).
  - Call graph structure (CallGraph).
  - Per-function information (FuncInfo).
  - Dependency flow of the module (DependencyFlow).

DependencyFlow Enumeration:
- Represents four states of call dependency: OutgoingOnly, IncomingOnly, Bidirectional, and Closed.
- Implements a lattice-based join operation for combining dependency states and a remove_flow function for refining flow states.
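
As a rough illustration of the lattice described above, here is a minimal sketch. The variant names come from the description, but the method bodies, the private helpers (`from_bits`, `has_outgoing`, `has_incoming`) and the exact semantics of `remove_flow` are assumptions for illustration, not the code in this PR. It treats a flow as a pair of booleans (calls out / is called from outside), so `Closed` is the bottom element and `Bidirectional` the top.

```rust
/// Hypothetical mirror of the four states named in the description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DependencyFlow {
    Closed,
    OutgoingOnly,
    IncomingOnly,
    Bidirectional,
}

impl DependencyFlow {
    // Assumed helper: build a state from (has outgoing, has incoming).
    fn from_bits(outgoing: bool, incoming: bool) -> Self {
        match (outgoing, incoming) {
            (false, false) => Self::Closed,
            (true, false) => Self::OutgoingOnly,
            (false, true) => Self::IncomingOnly,
            (true, true) => Self::Bidirectional,
        }
    }

    fn has_outgoing(self) -> bool {
        matches!(self, Self::OutgoingOnly | Self::Bidirectional)
    }

    fn has_incoming(self) -> bool {
        matches!(self, Self::IncomingOnly | Self::Bidirectional)
    }

    /// Least upper bound: a direction is present if either side has it.
    pub fn join(self, other: Self) -> Self {
        Self::from_bits(
            self.has_outgoing() || other.has_outgoing(),
            self.has_incoming() || other.has_incoming(),
        )
    }

    /// Assumed semantics: drop every direction that `other` contains.
    pub fn remove_flow(self, other: Self) -> Self {
        Self::from_bits(
            self.has_outgoing() && !other.has_outgoing(),
            self.has_incoming() && !other.has_incoming(),
        )
    }
}
```

Under these assumptions, `OutgoingOnly.join(IncomingOnly)` yields `Bidirectional`, and `Bidirectional.remove_flow(OutgoingOnly)` yields `IncomingOnly`.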

2. Call Graph Analysis

Call Graph Construction:
- Extracts and organizes function call relationships into a CallGraph.
- Supports traversal, leaf function detection, and callee inspection.

SCC Analysis:
- Implements Tarjan's algorithm via the SccBuilder to identify SCCs in the call graph.
- Annotates SCCs with cyclicity and member functions.
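
For readers unfamiliar with the approach, the following is a minimal sketch of Tarjan's algorithm over a call graph. It is not the `SccBuilder` from this PR: functions are plain `usize` indices into an adjacency list, and the `Scc`, `Tarjan` and `compute_sccs` names are hypothetical. Each component is annotated with its members and whether it is cyclic (more than one member, or a single function that calls itself).

```rust
/// One strongly connected component of the call graph.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Scc {
    pub members: Vec<usize>,
    pub is_cycle: bool,
}

struct Tarjan<'a> {
    callees: &'a [Vec<usize>],
    index: Vec<Option<usize>>,
    lowlink: Vec<usize>,
    on_stack: Vec<bool>,
    stack: Vec<usize>,
    next_index: usize,
    sccs: Vec<Scc>,
}

impl Tarjan<'_> {
    fn visit(&mut self, v: usize) {
        self.index[v] = Some(self.next_index);
        self.lowlink[v] = self.next_index;
        self.next_index += 1;
        self.stack.push(v);
        self.on_stack[v] = true;

        // Copy the shared reference out so `self` can be borrowed mutably below.
        let callees = self.callees;
        for &w in &callees[v] {
            match self.index[w] {
                None => {
                    // Unvisited callee: recurse, then inherit its lowlink.
                    self.visit(w);
                    self.lowlink[v] = self.lowlink[v].min(self.lowlink[w]);
                }
                Some(w_index) if self.on_stack[w] => {
                    // Back edge into the component currently being built.
                    self.lowlink[v] = self.lowlink[v].min(w_index);
                }
                Some(_) => {} // Callee belongs to an already finished SCC.
            }
        }

        // `v` is the root of an SCC: pop its members off the stack.
        if Some(self.lowlink[v]) == self.index[v] {
            let mut members = Vec::new();
            loop {
                let w = self.stack.pop().expect("root is still on the stack");
                self.on_stack[w] = false;
                members.push(w);
                if w == v {
                    break;
                }
            }
            members.sort_unstable();
            let is_cycle = members.len() > 1 || self.callees[v].contains(&v);
            self.sccs.push(Scc { members, is_cycle });
        }
    }
}

/// Returns SCCs in reverse topological order (callees before callers).
pub fn compute_sccs(callees: &[Vec<usize>]) -> Vec<Scc> {
    let n = callees.len();
    let mut t = Tarjan {
        callees,
        index: vec![None; n],
        lowlink: vec![0; n],
        on_stack: vec![false; n],
        stack: Vec::new(),
        next_index: 0,
        sccs: Vec::new(),
    };
    for v in 0..n {
        if t.index[v].is_none() {
            t.visit(v);
        }
    }
    t.sccs
}
```

The recursive `visit` keeps the sketch short; very deep call chains would need an explicit work stack to avoid overflowing the native stack.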

3. Function Information

Determines per-function properties:
- Recursiveness: Detects whether a function is part of a recursive call chain.
- Leaf status: Identifies functions that make no outgoing calls.
- Dependency Flow: Categorizes function dependencies with respect to external entities.
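
To make the first two properties concrete, here is a small sketch that derives them from an adjacency-list call graph. It deliberately uses plain reachability instead of the SCC results, and it omits the dependency-flow field. The `FuncInfo` shape, `reaches_itself` and `func_infos` are hypothetical and only loosely follow the description above.

```rust
/// Hypothetical per-function summary (the flow field is left out here).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FuncInfo {
    pub is_recursive: bool,
    pub is_leaf: bool,
}

/// True if `start` can reach itself through one or more calls.
fn reaches_itself(callees: &[Vec<usize>], start: usize) -> bool {
    let mut seen = vec![false; callees.len()];
    let mut worklist: Vec<usize> = callees[start].clone();
    while let Some(f) = worklist.pop() {
        if f == start {
            return true;
        }
        if !seen[f] {
            seen[f] = true;
            worklist.extend(callees[f].iter().copied());
        }
    }
    false
}

/// Computes one `FuncInfo` per function index.
pub fn func_infos(callees: &[Vec<usize>]) -> Vec<FuncInfo> {
    (0..callees.len())
        .map(|f| FuncInfo {
            is_recursive: reaches_itself(callees, f),
            // A leaf makes no calls at all.
            is_leaf: callees[f].is_empty(),
        })
        .collect()
}
```

With SCCs already available, `is_recursive` could instead be read off the cyclicity flag of the component that contains the function, which avoids one traversal per function.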

4. `DependencyFlow` analysis

- Aggregates data to determine the global DependencyFlow of the module.
- Utilizes the condensation DAG of the call graph to analyze all functions.
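
One way such an aggregation over the condensation DAG could work is sketched below. The propagation rules are assumptions made for illustration, not taken from this PR: outgoing dependencies flow from callees up to their callers, incoming dependencies flow from callers down to their callees, and the module flow is the join over all components. `Flow` and `propagate` are hypothetical names, and SCC ids are assumed to be numbered in reverse topological order, so every callee has a smaller id than its caller.

```rust
/// Hypothetical flow record: which directions of external dependency exist.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct Flow {
    pub outgoing: bool,
    pub incoming: bool,
}

impl Flow {
    /// A direction is present if either operand has it.
    pub fn join(self, other: Flow) -> Flow {
        Flow {
            outgoing: self.outgoing || other.outgoing,
            incoming: self.incoming || other.incoming,
        }
    }
}

/// `scc_callees[i]` lists the SCCs called by SCC `i` (all ids smaller than `i`);
/// `local[i]` is the flow caused directly by the members of SCC `i`.
/// Returns the per-SCC flows after propagation and the module-wide join.
pub fn propagate(scc_callees: &[Vec<usize>], local: &[Flow]) -> (Vec<Flow>, Flow) {
    let mut flows = local.to_vec();

    // Callees first: a caller inherits the outgoing dependency of its callees.
    for scc in 0..flows.len() {
        for &callee in &scc_callees[scc] {
            let callee_outgoing = flows[callee].outgoing;
            flows[scc].outgoing |= callee_outgoing;
        }
    }

    // Callers first: a callee inherits the incoming dependency of its callers.
    for scc in (0..flows.len()).rev() {
        for &callee in &scc_callees[scc] {
            let caller_incoming = flows[scc].incoming;
            flows[callee].incoming |= caller_incoming;
        }
    }

    let module = flows.iter().fold(Flow::default(), |acc, &f| acc.join(f));
    (flows, module)
}
```

Because the condensation is acyclic, a single pass in each direction is enough; no fixed-point iteration is needed.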

sbillig

It would be nice to support analyzing a known-to-be-complete program, composed of linked modules (but that can of course come later).

Can you remind me what is_leaf is useful for? It seems that any function that's known to not be involved in a recursive loop can have a stable frame region; is there something special we can do with leaf functions that I'm forgetting? Two separate leaves could have overlapping stable regions, but that's a special case of disjoint branches.

sbillig · 2024-12-06T20:43:47Z

crates/codegen/test_files/module_analysis/bidirectional.snap

+SCC: [`%f6`]
+SCC: [`%f7`]
+
+`%fx` = FuncInfo { is_non_recursive: false, flow: OutgoingOnly, is_leaf: true }


The FuncInfo for externally-defined %fx is weird; local functions call it, so its flow is known to be Incoming, and we should assume that it may be Bidirectional. We also can't say that it's a leaf.

Yeah, this is bad. I initially thought that we should remove the external function from the result, but also hesitated to do that because it breaks the consistency between the result and FuncStore or other modules, which always manage the mapping from each FuncRef to some information.

I'm now inclined to remove the external function info from ModuleInfo because it won't be used anywhere (and if it were used, it would give either wrong information or only the most conservative information). What do you think?

Yeah, I think it's fine to remove it and for the backend to assume the most conservative case for calls to mysterious external functions.

Y-Nak · 2024-12-06T21:35:14Z

> It would be nice to support analyzing a known-to-be-complete program, composed of linked modules (but that can of course come later).

What additional analysis do you have in mind when the entire program is provided? I assumed that by using module_linker to combine multiple modules before analysis, it would be possible to obtain more fine-grained and comprehensive analysis results (and for the EVM backend, the quality of the analysis would differ greatly depending on whether the module is Bidirectional or not).

> Can you remind me what is_leaf is useful for?

It depends on the backend. It won't be very useful for EVM codegen (though we might need it, in addition to DependencyFlow/SCCs, for inlining). But for a "normal" ISA, it's useful for codegen as well.

sbillig · 2024-12-07T03:03:39Z

> > It would be nice to support analyzing a known-to-be-complete program, composed of linked modules (but that can of course come later).
>
> What additional analysis do you have in mind when the entire program is provided? I assumed that by using module_linker to combine multiple modules before analysis, it would be possible to obtain more fine-grained and comprehensive analysis results (and for the EVM backend, the quality of the analysis would differ greatly depending on whether the module is Bidirectional or not).

Just more comprehensive information; no need to be overly conservative when calling external functions, for example. In practice, fe or solc can just generate the whole program in a single sonatina module and this 'multi-module whole-program analysis' isn't needed for now.

Y-Nak · 2024-12-07T09:32:50Z

I see. I think I understand what you mean: it would be nicer if the analysis phase could take an arbitrary number of Modules, or integrate multiple ModuleInfos, to refine the information accordingly, right?
Assuming my understanding is correct, I changed the external function information to the most conservative form instead of removing it, so that we can easily make that improvement later (i.e., after such refinement, external functions could also carry more useful information).

Y-Nak added 8 commits December 4, 2024 21:48

Add callgraph

62e8bf4

Implement Tarjan's algorithm for callgraph Sccs computation

75c8f95

Add CallInfo inst property trait

92d4fba

Remove redundant fields from SccInfo

f8de933

Implement module analysis

62102b3

Refactor module analysis

d23040b

Add tests for module analysis

f12ea73

Add missing build.rs in ir crate

3c1494f

Y-Nak force-pushed the module-analysis branch from 80447c1 to ad2eabf Compare December 6, 2024 17:55

Y-Nak marked this pull request as ready for review December 6, 2024 17:56

Y-Nak requested a review from sbillig December 6, 2024 18:17

Y-Nak force-pushed the module-analysis branch from ad2eabf to 3c1494f Compare December 6, 2024 19:28

sbillig approved these changes Dec 6, 2024

Y-Nak added 3 commits December 6, 2024 22:08

Move fixtures for ir test under ir/test_files for consistency

bd7e035

Modify .gitignore

521872a

Expose FuncRef from FunctionBuilder

8e777c9

Fix the wrong assumption for the external function information

8dc5498

Y-Nak force-pushed the module-analysis branch from e8a5ba6 to 8dc5498 Compare December 7, 2024 10:37

Rename is_non_recursive to is_recursive in FuncInfo

a2192fc

sbillig merged commit c9dee84 into fe-lang:main Dec 12, 2024
10 checks passed
