RENAS toolkit

Installation

Requirements

Docker
- Confirmed working with Docker 24.0.2 on macOS 12.6.4 and Docker 20.10.22 on Linux (Ubuntu 22.04 LTS)
Docker Compose plugin v2
- Confirmed working with v2.19.1
- To reproduce our results, Docker must be able to use at least 14 GB of memory. To just run the tool, at least 4 GB of memory is needed (depending on the target project).
Git
To reproduce our results, you will need at least 60 GB of free disk space. To just run the tool, about 5 GB of free disk space is probably enough (depending on the target project).

Setup

Clone the project repository. We refer to this project directory as $RENAS.

$ git clone https://github.com/salab/RENAS
$ cd RENAS

Run the pre-setup script.

$ ./setup.sh

This will result in a directory structure like this:

  RENAS
  ├ AbbrExpansion
  │  ├ code
  │     ├ ...
  ├ RefactoringMiner
  │  ├ RefactoringMiner-2.0.2
  │     ├ ...

Prepare target project directories (see the sections below).
Start Docker

$ docker compose up -d

Use RENAS

$ docker compose exec renas bash

This gives you a shell that is ready to run RENAS.

If you'd like to use our tools, see the Basic Usage section below.
If you'd like to reproduce (part of) our results, see the Reproduction (lightweight) or Reproduction section below.

Stop the tool

$ docker compose down

Basic Usage

This is a general explanation of how to use the RENAS tool. If you want to replicate our results, see the Reproduction section below.

Performing Recommendation

In the projects directory, create a directory named after each project to be analyzed. For example, if the projects to be analyzed are named "proj1" and "proj2", the directory structure will be as follows:

  RENAS
  ├ projects
  │  ├ proj1
  │  ├ proj2

Create a directory called repo in the created directory and place the Git repository you want to analyze in it.

  RENAS
  ├ projects
  │  ├ proj1
  │  │  ├ repo 
  │  │    ├ foo.java
  │  │    ├ bar.java
  │  │    ├ baz
  │  │      ├ ... 
  │  │
  │  ├ proj2
  │  │  ├ repo 
  │  │    ├ qux.java
  │  │    ├ xyzzy

List the names of the projects for which you'd like recommendations in "$RENAS/projects.txt", one per line, as shown below.

proj1
proj2
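Before starting a run, it can help to confirm that every project listed in projects.txt has the repo directory described above. The following is a minimal sketch of such a check. The function name missing_repos is hypothetical and is not part of RENAS; the demonstration builds a throwaway directory rather than touching a real checkout.

```python
import tempfile
from pathlib import Path


def missing_repos(renas_root):
    """Return project names listed in projects.txt that lack projects/<name>/repo."""
    root = Path(renas_root)
    lines = (root / "projects.txt").read_text(encoding="utf-8").splitlines()
    names = [line.strip() for line in lines if line.strip()]
    return [name for name in names if not (root / "projects" / name / "repo").is_dir()]


# Demonstration on a temporary directory that mimics the layout described above.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "projects" / "proj1" / "repo").mkdir(parents=True)
    (demo_root / "projects" / "proj2").mkdir(parents=True)  # repo deliberately absent
    (demo_root / "projects.txt").write_text("proj1\nproj2\n", encoding="utf-8")
    print(missing_repos(demo_root))  # only proj2 is reported
```

For a real checkout, calling missing_repos with the path to $RENAS should return an empty list once all repositories are in place.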

Create "$RENAS/projects/*project name*/rename.json" and store the renaming information in it. Recommendations are made based on the renamings specified there. If you'd like to make recommendations based on the renamings obtained from RefactoringMiner, you do not need to create this file. The format of rename.json is as follows.

[
    {
        "commit":"34ee18a88a29d45ada4138e5e8f8b2e11143203c",
        "oldname":"render",
        "newname":"call",
        "typeOfIdentifier":"MethodName",
        "line":96,
        "files":"ratpack-core\/src\/main\/groovy\/org\/ratpackframework\/templating\/GroovyTemplateRenderer.java"
    },
    {
        "commit":"6872a6f3cdf4100dee5d5e902a7cdf7a199218c0",
        "oldname":"byteBuf",
        "newname":"buffer",
        "typeOfIdentifier":"ParameterName",
        "line":322,
        "files":"ratpack-core\/src\/main\/java\/ratpack\/http\/internal\/DefaultResponse.java"
    },
    {
        ...
    }
]

"commit": Hash of commit.
"oldname": identifier before renaming.
"newname": identifier after renaming.
"typeOfIdentifier": Type of identifier. You can specify one of "ClassName", "FieldName", "MethodName", "ParameterName", and "VariableName".
"line": Line where the identifier is defined.
"files": The path from "repo" to the file where the identifier is defined.
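A rename.json file can also be generated from a script. The sketch below checks each entry against the six keys and five identifier types listed above before writing the file. The helper names validate_entry and write_rename_json are hypothetical and are not part of RENAS, and the checks are only as strict as the description above.

```python
import json
import tempfile
from pathlib import Path

ALLOWED_TYPES = {"ClassName", "FieldName", "MethodName", "ParameterName", "VariableName"}
REQUIRED_KEYS = ("commit", "oldname", "newname", "typeOfIdentifier", "line", "files")


def validate_entry(entry):
    """Return a list of problems in one rename.json entry (empty when it looks fine)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in entry]
    if "typeOfIdentifier" in entry and entry["typeOfIdentifier"] not in ALLOWED_TYPES:
        problems.append(f"unknown typeOfIdentifier: {entry['typeOfIdentifier']}")
    if "line" in entry and not isinstance(entry["line"], int):
        problems.append("line must be an integer")
    return problems


def write_rename_json(project_dir, entries):
    """Validate all entries, then write them to <project_dir>/rename.json."""
    for index, entry in enumerate(entries):
        problems = validate_entry(entry)
        if problems:
            raise ValueError(f"entry {index}: {'; '.join(problems)}")
    path = Path(project_dir) / "rename.json"
    path.write_text(json.dumps(entries, indent=4), encoding="utf-8")
    return path


# Demonstration with the first entry from the example above, written to a temp directory.
example = {
    "commit": "34ee18a88a29d45ada4138e5e8f8b2e11143203c",
    "oldname": "render",
    "newname": "call",
    "typeOfIdentifier": "MethodName",
    "line": 96,
    "files": "ratpack-core/src/main/groovy/org/ratpackframework/templating/GroovyTemplateRenderer.java",
}
with tempfile.TemporaryDirectory() as tmp:
    written = write_rename_json(tmp, [example])
    print(written.name, len(json.loads(written.read_text(encoding="utf-8"))))
```

Python's json module writes "/" without the backslash escape used in the example above. Both spellings are valid JSON and parse to the same string.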

Run sh renas/execRenas. The recommendations are written to "projects/*project name*/recommend.json.gz".

Reproduction (lightweight)

The reproduction process explained later takes a very long time (more than a week). If you just want to check the reproduction process briefly, we provide a lightweight version (less than 1 hour) that limits the target project to baasbox. To try it, first prepare the baasbox repository outside the Docker environment:

$ mkdir -p projects/baasbox
$ git clone https://github.com/baasbox/baasbox.git projects/baasbox/repo
$ (cd projects/baasbox/repo && git reset --hard 42a265288906070f031ce9e0e24aeeac26c3a952)

Then run bash evaluation-lightweight.sh inside the Docker environment. The recommendation result is written to projects/baasbox/recommend.json.gz.

You may see the following result (snapshot of a console) when you run evaluation-lightweight.sh. Note that log4j warnings may be produced when RefactoringMiner runs internally; they can be ignored.

Reproduction

This is the reproduction process for the results presented in our paper. Download the dataset and extract its contents (referred to as $Dataset). Some of its files are used as inputs in the reproduction process.

The projects we used are as follows:

17 projects

The hash in parentheses indicates the latest commit.

Dataset used in the preliminary study (Section III-E in the paper)

baasbox (42a265288906070f031ce9e0e24aeeac26c3a952)
cordova-plugin-local-notifications (eb0ac58a8a8a9b4602f9c795c285abe089d5d10f)
morphia (cd0426c32b7c8426fbbcd4cbbfad3596246265f0)
spring-integration (6207bca3bd74cee3f37e2e9df18156a89aa90ab9)

Automatically identified dataset

testng (d01a4f1079e61b3f6990ba55a1ef1138266baedd)
jackson-databind (bd9bf1b89195051a127d0a946aaf95259058c0e8)
rest.li (1d43edee1a9277324f75b4e90362dd6dc367ecdf)
Activiti (d9277212b01279079cfe71465e16398310d1c216)
k-9 (cba9ca31aa6bdb8911a2787afc145c27cf366bec)
genie (e0c62669f1016522ea1faaf8b1a18833c65cda0e)
eucalyptus (95e0cef57eba3da26ed798317900da4eeac44263)
graylog2-server (80a9e8e69f0635e489b076c7dac62a7ef45c409f)
core (49cada01fc2b71646ec36b1215d805c1c3a3b198)
gnucash-android (2ad44adf6dd846aabf8883d41be3719b723bf4f1)
giraph (14a74297378dc1584efbb698054f0e8bff4f90bc)

Manually validated dataset

ratpack (29434f7ac6fd4b36a4495429b70f4c8163100332)
argouml (be952fcfa77451e594a41779db83e1a0d7221002)

Create directories for the above 17 projects and place each repository in the repo directory of its project. An easy way to do this is to run bash ./clone_repository.sh.
Place "manualValidation.csv" in the ratpack and argouml directories. This CSV file is located in $Dataset/projects/{ratpack, argouml}.

$ cp $Dataset/projects/ratpack/manualValidation.csv projects/ratpack/
$ cp $Dataset/projects/argouml/manualValidation.csv projects/argouml/

Run the following commands in the order of top to bottom: preliminary study, evaluation with the automatically identified dataset, evaluation with the manually validated dataset.

# bash renas/preliminaryResearch.sh         # (may take ca. 2 days)
# bash renas/researchQuestion.sh            # (may take ca. 1 week)
# bash renas/researchQuestionManually.sh    # (may take ca. 2 days)

The results will be placed in the result directory.
- The "Output File Format" section below describes each file.

Inputs and Outputs of Commands

renas/repository_analyzer.py

Running the following command extracts the renames using RefactoringMiner, analyzes the relationships in the source code, and normalizes the identifiers.
python3 -m renas.repository_analyzer projects/*project name*
Input:

repository which you'd like to analyze

Output directory:

projects/*project name*/archives
projects/*project name*/archives/*commit id*

Output file:

projects/*project name*/archives/*commit id*/exTable.csv.gz
projects/*project name*/archives/*commit id*/classRecord.json.gz
projects/*project name*/archives/*commit id*/record.json.gz
projects/*project name*/goldset.json.gz

The above programs mainly involve the files below.

renas/refactoringminer.py
Run RefactoringMiner
renas/refactoring/rename_extractor.py
Extract rename refactorings from RefactoringMiner
renas/relationship/analyzer.py
Analyze the relationships of source code where rename refactoring was done and normalize identifiers.
- AbbrExpansion/out/ParseCode-all.jar
  - parse relationships using AST (The source code is in AbbrExpansion/code/ParseCode)
- AbbrExpansion/code/SemanticExpand/out/libs/SemanticExpand-all.jar
  - Abbreviation expansion using KgExpander (The source code is in AbbrExpansion/code/SemanticExpand)
- renas/relationship/normalize.py
  - Remove identifier inflections

renas/recommendation.py

Running the following command produces recommendations with four approaches (None, Relation, Relation + Normalize, RENAS) based on the renamings.
python3 -m renas.recommendation projects/*project name*

Input:

projects/*project name*/archives/*commit id*/exTable.csv.gz
projects/*project name*/archives/*commit id*/classRecord.json.gz
projects/*project name*/archives/*commit id*/record.json.gz
projects/*project name*/goldset.json.gz

Output file:

projects/*project name*/recommend.json.gz

The above programs primarily involve the "renas/approaches/" directory.

renas/evaluator.py

Running the following command evaluates the recommendation results of the four approaches (None, Relation, Relation + Normalize, RENAS) based on co-renamings.
python3 -m renas.evaluator *option* projects/*project name*

The options are:

option	description	Output
-pre	Preliminary study	result/preliminary
-rq1	Research question 1	result/rq1
-rq2	Research question 2	result/rq2
-manual	Evaluation with the manually validated dataset. Use with -rq1 and/or -rq2	result/rq1_manual and/or result/rq2_manual
-sim	Similarity study	result/similarity

This script is executed by the following shell scripts:

renas/preliminaryResearch.sh (Executing with -sim -pre)
renas/researchQuestion.sh (Executing with -rq1 -rq2)
renas/researchQuestionManually.sh (Executing with -manual -rq1 -rq2)

Input:

projects/*project name*/recommend.json.gz
projects/*project name*/manualValidation.csv (if you choose -manual)

Output directory (option):

result
result/preliminary (-pre)
result/rq1 (-rq1)
result/rq2 (-rq2)
result/rq1_manual (-rq1 -manual)
result/rq2_manual (-rq2 -manual)
result/similarity (-sim)

Output file (option):

result/preliminary/values_by_alpha_beta.csv (-pre)
result/rq1/rq1.csv (-rq1)
result/rq2/ranking_evaluation.csv (-rq2)
result/rq1_manual/rq1.csv (-rq1 -manual)
result/rq2_manual/ranking_evaluation.csv (-rq2 -manual)
result/similarity/similarity.csv (-sim)

The above programs primarily involve the "renas/evaluation/" directory.

Output File Format

result/{rq1, rq1_manual}/rq1.csv

(Generated by renas/evaluator.py)

Evaluation result of RQ1

project name: Evaluated project name
approach: Approach name
precision average
recall average
fscore average
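For readers unfamiliar with these columns, precision, recall, and F-score can be computed for a single renaming as in the sketch below. It assumes that recommendations and actual co-renamed identifiers are compared as sets of identifier IDs. How renas/evaluator.py matches them and averages the values is not described here, so treat this as an illustration of the standard definitions only.

```python
def precision_recall_fscore(recommended, actual):
    """Standard set-based precision, recall, and F1 for one renaming."""
    recommended, actual = set(recommended), set(actual)
    hits = len(recommended & actual)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(actual) if actual else 0.0
    total = precision + recall
    fscore = 2 * precision * recall / total if total else 0.0
    return precision, recall, fscore


# Hypothetical identifier IDs: 2 of 4 recommendations are correct, 2 of 3 actual ones are found.
p, r, f = precision_recall_fscore({"a", "b", "c", "d"}, {"a", "b", "e"})
print(round(p, 3), round(r, 3), round(f, 3))
```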

result/{rq2, rq2_manual}/ranking_evaluation.csv

(Generated by renas/evaluator.py)

Evaluation result of RQ2

alpha: Parameter used in calculating priority
MAP: Mean Average Precision
MRR: Mean Reciprocal Rank
top1 Recall
top5 Recall
top10 Recall
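The ranking metrics above follow well-known definitions, which can be sketched for a single ranked list as follows. MAP and MRR are then the means of average precision and reciprocal rank over all renamings. The normalization choices here (for example, dividing average precision by the number of relevant items) are the common textbook ones and are an assumption; renas/evaluator.py may differ in detail.

```python
def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none appears."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked, relevant):
    """Mean of precision@k over the positions k that hold a relevant item."""
    hits, total = 0, 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def top_k_recall(ranked, relevant, k):
    """Fraction of relevant items found within the first k ranked items."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)


# Hypothetical ranking: relevant items "a" and "b" appear at ranks 2 and 4.
ranking, truth = ["x", "a", "y", "b"], {"a", "b"}
print(reciprocal_rank(ranking, truth), average_precision(ranking, truth))
print(top_k_recall(ranking, truth, 1), top_k_recall(ranking, truth, 5))
```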

result/similarity/similarity.csv

(Generated by renas/evaluator.py)

Evaluation result of Section III E-(1)

commit: Hash of commit
name1 file: File where name1 is defined
name1 line: Line where name1 is defined
name1: Identifier after normalization
name2 file: File where name2 is defined
name2 line: Line where name2 is defined
name2: Identifier after normalization
similarity: Similarity score calculated by Dice coefficient
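As a reminder of how the Dice coefficient works, the sketch below computes it over the word lists of two normalized identifiers: twice the number of shared words divided by the total number of words. Treating each identifier as a multiset of words is an assumption made for illustration; the exact formulation used by RENAS is defined in the paper and the source code, not here.

```python
from collections import Counter


def dice_coefficient(words_a, words_b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) over two word lists (multiset overlap)."""
    if not words_a and not words_b:
        return 0.0
    overlap = sum((Counter(words_a) & Counter(words_b)).values())
    return 2 * overlap / (len(words_a) + len(words_b))


# Hypothetical normalized identifiers sharing two of three words: 2 * 2 / (3 + 3).
print(dice_coefficient(["get", "user", "name"], ["get", "user", "id"]))
```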

result/preliminary/value_by_alpha_beta.csv

(Generated by renas/evaluator.py)

Evaluation result of Section III E-(3)

alpha: Parameter used in calculating priority
beta: Threshold of priority
precision average
recall average
fscore average

projects/*project name*/recommend.json.gz

(Generated by renas/recommendation.py)

Each commit has the following structure:

goldset is the renaming database obtained from RefactoringMiner.
"none", "relation", "relationNormalize", "renas" are the recommendation results for each approach.
- "0" is the recommendation result when the 0th renaming of "goldset" is done.

"0b169b7d2286620eb346ddc625f97cd6ce6bb392": {
        "goldset": [
            {
                goldset_information
            },
            ...
        ],
        "none": {
              "0": [
                {
                  recommend_information
                },
              ...]
        },
        "relation": {
              "0": [
                {
                  recommend_information
                },
              ...]
        },
        "relationNormalize": {
              "0": [
                {
                  recommend_information
                },
              ...]
        },
        "renas": {
              "0": [
                {
                  recommend_information
                },
              ...]
        }
      }

Below is the goldset_information.

{
    "type": "Rename Parameter",
    "commit": "0b169b7d2286620eb346ddc625f97cd6ce6bb392",
    "oldname": "toUnfollow",
    "newname": "theFollowed",
    "typeOfIdentifier": "ParameterName",
    "line": 850,
    "files": "app/com/baasbox/controllers/Admin.java",
    "operation": [
        [
            "replace",
            [
                "to",
                "unfollow"
            ],
            [
                "the",
                "follow"
            ]
        ]
    ],
    "normalized": [
        "to",
        "unfollow"
    ],
    "id": "Lcom/baasbox/controllers/Admin;.removeFollowRelationship(Ljava/lang/String;Ljava/lang/String;)LResult;#toUnfollow#0#1"
},

Below is the recommend_information.

similarity is Score_sim
relationship is Score_rel

{
  "id": "Lcom/baasbox/service/storage/DocumentService;
  "files": "app/com/baasbox/service/storage/DocumentService.java",
  "line": 171,
  "name": "grantPermissionToUser",
  "typeOfIdentifier": "MethodName",
  "similarity": 0.6666666666666667,
  "relationship": 7.0,
},
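The structure above can be read with the Python standard library. The sketch below loads a recommend.json.gz file and lists the candidates for one renaming. It assumes the top level is an object keyed by commit hash, as the structure above suggests, and it sorts by the "similarity" field alone purely for illustration; the actual RENAS priority also involves "relationship" and alpha, and its formula is not given in this README. The function names are hypothetical.

```python
import gzip
import json


def load_recommendations(path):
    """Load a gzip-compressed JSON file such as recommend.json.gz."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


def top_candidates(data, commit, approach="renas", rename_index="0", limit=5):
    """Return up to `limit` candidates for one renaming, highest similarity first."""
    candidates = data[commit][approach][rename_index]
    return sorted(candidates, key=lambda c: c["similarity"], reverse=True)[:limit]


# Usage (path and commit hash taken from the examples in this document):
# data = load_recommendations("projects/baasbox/recommend.json.gz")
# for c in top_candidates(data, "0b169b7d2286620eb346ddc625f97cd6ce6bb392"):
#     print(c["name"], c["files"], c["line"], c["similarity"], c["relationship"])
```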

projects/*project name*/archives/*commit id*/exTable.csv.gz

(Generated by renas/repository_analyzer.py)

column	description
id	Unique ID attached to the identifier
files	File where the identifier is defined
line	Line where the identifier is defined
name	The identifier name
typeOfIdentifier	Type of identifier
subclass	Part of the relationship "parent" (a class → its subclass)
descendant	Part of the relationship "ancestor" (a class → the subclass of its subclass or more)
parent	Part of the relationship "parent" (a class → its parent class)
ancestor	Part of the relationship "ancestor" (a class → its ancestor class)
method	The relationship "method"
field	The relationship "field"
sibling-members	The relationship "sibling-members"
comment	comment
type	The relationship "type"
enclosingClass	The relationship "enclosingClass"
assignmentEquation	The relationship "assignmentEquation"
pass	The relationship "pass"
argumentToParameter	Part of the relationship "argument" (argument of a method → parameter of the method)
parameter	The relationship "parameter"
enclosingMethod	The relationship "enclosingMethod"
parameterToArgument	Part of the relationship "argument" (parameter of a method → argument of the method)
split	Identifier after splitting
delimiter	Delimiter character
case	Case of words
pattern	Naming pattern of identifier
heuristic	Abbreviated forms
expanded	Identifier after expanding abbreviations
postag	POS tag for each word
normalized	Normalized identifier
parameterOverload	The relationship "parameterOverload"
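To inspect an exTable.csv.gz file, a short script is often enough. The sketch below assumes the file is a comma-separated CSV whose header row uses the column names listed above; that layout is an assumption based on the file extension and this table, not something confirmed here. The function name names_by_type is hypothetical.

```python
import csv
import gzip


def names_by_type(ex_table_path, identifier_type):
    """Return the `name` column of all rows whose typeOfIdentifier matches."""
    with gzip.open(ex_table_path, "rt", encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle)  # assumes a header row with the column names
        return [row["name"] for row in reader if row["typeOfIdentifier"] == identifier_type]


# Usage (replace the placeholders with a real project name and commit id):
# print(names_by_type("projects/<project name>/archives/<commit id>/exTable.csv.gz", "MethodName"))
```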

projects/*project name*/archives/*commit id*/classRecord.json.gz

(Generated by renas/repository_analyzer.py)

A file that records the expanded abbreviations for each file. For example, if "buf" is expanded to "buffer" four times in temp.java, it is recorded as follows.

{
"temp.java": 
    {
    "buf==buffer":4
    }
}
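The "abbreviation==expansion" keys shown above can be split into their parts with a few lines of Python. This is a minimal sketch that assumes every key contains the "==" separator exactly as in the example; the function name expansions_by_file is hypothetical.

```python
def expansions_by_file(class_record):
    """Turn {"file": {"abbr==expansion": count}} into {"file": [(abbr, expansion, count)]}."""
    result = {}
    for file_name, counts in class_record.items():
        result[file_name] = [(*key.split("==", 1), count) for key, count in counts.items()]
    return result


# The example record from above.
record = {"temp.java": {"buf==buffer": 4}}
print(expansions_by_file(record))
```

To apply it to a real file, load classRecord.json.gz with gzip.open in text mode and json.load first, then pass the resulting dictionary to the function.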

projects/*project name*/archives/*commit id*/record.json.gz

(Generated by renas/repository_analyzer.py)

A file that records the abbreviations expanded within the project.

Related Publications

If you use or mention this tool in a scientific publication, we would appreciate citations to the following paper:

Naoki Doi, Yuki Osumi, and Shinpei Hayashi, "RENAS: Prioritizing Co-Renaming Opportunities of Identifiers," in Proceedings of the 40th IEEE International Conference on Software Maintenance and Evolution (ICSME 2024), pp. TBD, Arizona, United States, 2024, doi: TBD. Preprint: http://arxiv.org/abs/2408.09716

@inproceedings{doi-icsme2024,
  author = {Naoki Doi and Yuki Osumi and Shinpei Hayashi},
  title = {{RENAS}: Prioritizing Co-Renaming Opportunities of Identifiers}, 
  booktitle = {Proceedings of the 40th IEEE International Conference on Software Maintenance and Evolution (ICSME 2024)},
  pages = {TBD},
  doi = {TBD},
  year = {2024},
}
