Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactored and Optimized Logic:: Parser Logic, Latex Based Output & Shared Modules #283

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

IIITM-Jay
Copy link

@IIITM-Jay IIITM-Jay commented Sep 16, 2024

This PR intends to refactor the parse.py to achieve maintainability and scalability.

Scopes Covered

  1. Modularization: Breaking the codes in small and maintainable helper functions containing understandable lines of code
  2. Optimization: Reducing number of lines of codes using techniques such as list comprehensions etc.
  3. Removing Redundancy: Removing duplicacy and utilizing reusablity of logics
  4. Refactorization: Refactor a long script into dedicated scripts for scalability

Approach Followed

  1. The parse.py file now contains only the main function where it calls respective scripts for generating the output based on extensions selected.
  2. Common methods/ functions are moved to shared_utils.py so that other scripts can re-use them efficiently. These functions are shared between various extension scripts
  3. The output for Latex is refactored and optimized in larex_utils.py
  4. Scripts: parse.py and shared_utils.py are refactored, modularized and optimized as well
  5. Separate dedicated scripts being created for each extension

Needs to be done

  • The codes for other extension outputs are only moved to their respective scripts(except for latex). They require improvements and enhancements. They are just simply taken from parse.py to their modules like c_utils.py for generating c based output
  • In the shared_utils.py, the method create_inst_dict() is yet to be refactored.

@IIITM-Jay
Copy link
Author

IIITM-Jay commented Sep 16, 2024

Hi @aswaterman and @rpsene, I have made an attempt to refactor the parse.py. For now, only latex_utils.py, shared_utils.py and parse.py and refactored, optimized and modularized.

The test cases are failing as we need to modify the tests.py as well based on the accepted refactored code.
P.S. Note: For the time being, I have imported the shared module methods for running the test cases, now all checks passed

Requesting feedback on the modified code and suggestions on what best we can do to achieve maintainability and scalability.

@IIITM-Jay
Copy link
Author

Hi @aswaterman, in addition to the written explanation of the parsing logic inside "Flow of parse.py" section of README file, I believe including flowcharts would greatly enhance the clarity and readability of the process. By visualizing the flow, readers can get a clearer understanding of the key steps involved in parsing instruction encodings.

Like, we have three main steps with each having sequence of procedures:
1. The first pass, we cover only the regular expression and follow the below steps:

flowchart TD
    A[Start: parse.py] --> B[Create list of all rv* files]
    B --> C{File contains regular instructions?}
    C -->|Yes| D[Parse file line by line]
    D --> E[Perform checks on regular instructions]
    
    E --> F[Check 1: msb > lsb in range assignment]
    E --> G[Check 2: Value representable in range]
    E --> H[Check 3: No multiple assignments to same bit]
    E --> I[Check 4: All bit positions must be accounted for]
    
    F --> J[Pass checks?]
    G --> J
    H --> J
    I --> J
    
    J -->|Yes| K[Create dictionary for regular instruction]
    K --> L[Add encoding, extension, mask, match, variable_fields]
    L --> M[Add to instr_dict]
    
    M --> N[Process next regular instruction]
    N -->|All regular instructions processed| O[End of Regular Instruction Parsing]
Loading

@IIITM-Jay
Copy link
Author

2. In the second pass, we do the checks for pseudo_instr carrying out similar procedure.

3. In the last step, the output generation,

flowchart TD
    A[Start Output Generation] --> B[Generate LaTeX tables]
    B --> C[Generate encoding.h file]
    C --> D[Generate other artifacts]

    D --> E[Output files generated]
    E --> F[End]

Loading

@IIITM-Jay
Copy link
Author

IIITM-Jay commented Sep 21, 2024

@aswaterman and @rpsene , These flowchart creation process are markdown friendly. I think it will also help to give a nutshell view of what we are doing. Let me know the feedback and suggestions, if these will add up to an enhancement for the existing repository.

@IIITM-Jay
Copy link
Author

IIITM-Jay commented Sep 25, 2024

@aswaterman and @rpsene , Refactored and Optimized the method used for creating instruction dictionary.

P.S. The second point from Needs to be Done header in PR description is ticked(marked as completed) as of now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant