Inside SemDiff
See Terminology for details on the terms used on this page.
The SemDiff solution (sln) contains the following projects:
Name | Purpose |
---|---|
SemDiff.Core | Implements analyzer interface, and all supporting functionality |
SemDiff.Test | VSTest project |
SemDiff.Vsix | Extension project used for debugging |
SemDiff can be divided into three major components:
- Interaction with GitHub, local cache maintenance, and repo identification
- Analysis of the data from GitHub and from the analyzer
- Conversion of detected conditions into `Diagnostic`s that are reported
SemDiff searches within the project folder for the .git file in order to obtain the information necessary to query the GitHub API about the repository. Once this is found, a .semdiff folder is created in the root directory. SemDiff uses this directory to store the files from open pull requests, in numbered folders named after the pull request number on GitHub. SemDiff only looks for C# files and stores them in the same folder structure as they are found on GitHub. SemDiff also writes a .json file every time these files are updated, so that the data can be reloaded the next time Visual Studio is launched and the project is opened.
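The cache layout described above can be sketched as a small path helper. This is an illustrative Python sketch, not SemDiff's own C# code: the function name `cache_path` and the use of the `.cs` extension to recognize C# files are assumptions made for the example.

```python
from pathlib import PurePosixPath


def cache_path(repo_root, pr_number, github_path):
    """Return where a pull request file would be cached, or None if skipped.

    Hypothetical helper: mirrors the layout
    <repo root>/.semdiff/<pull request number>/<path as found on GitHub>.
    """
    # Assumption: C# files are recognized by their .cs extension.
    if not github_path.lower().endswith(".cs"):
        return None
    root = PurePosixPath(repo_root)
    return str(root / ".semdiff" / str(pr_number) / github_path)
```

For example, a file `SemDiff.Core/Repo.cs` in pull request 42 would map to `.semdiff/42/SemDiff.Core/Repo.cs` under the repository root, while a `README.md` in the same pull request would be skipped.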
When SemDiff queries GitHub it looks for a few things. First, it passes an ETag to check and see if any changes to the pull requests have been made. If changes have been made, SemDiff receives a list of pull requests. It then requests a file list for each pull request. It uses the links provided within the pull request file list to download every C# file within the pull request. SemDiff also handles pagination within both the pull request list query and pull request file list query.
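The two HTTP mechanisms mentioned above can be sketched without touching the network. GitHub's REST API accepts a stored ETag in the `If-None-Match` request header (answering `304 Not Modified` when nothing changed) and advertises further pages through a `Link` response header with `rel="next"`. The helper names below are hypothetical, and the sketch is in Python for illustration only; it does not reproduce SemDiff's C# code.

```python
import re

# Matches one entry of a Link header, e.g. <https://...?page=2>; rel="next"
_LINK_ENTRY = re.compile(r'<([^>]+)>;\s*rel="([^"]+)"')


def conditional_headers(etag=None):
    """Build request headers that ask the server to skip unchanged data."""
    headers = {}
    if etag:
        # The server can reply 304 Not Modified if the ETag still matches.
        headers["If-None-Match"] = etag
    return headers


def next_page_url(link_header):
    """Return the URL of the next page, or None when on the last page."""
    if not link_header:
        return None
    for url, rel in _LINK_ENTRY.findall(link_header):
        if rel == "next":
            return url
    return None
```

A client would keep requesting `next_page_url(...)` until it returns `None`, for both the pull request list and each pull request's file list.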
Repos are identified by the `Discover` API provided by LibGit2Sharp. When a repo is located, the first remote URL that points to a repo on GitHub is used to initialize a Repo object. The repo owner and name are determined by the path in the URL.
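Extracting the owner and name from a remote URL might look like the following. This is a minimal Python sketch under assumptions: the regular expression, the set of URL schemes it accepts, and the function names are illustrative and are not taken from SemDiff's source.

```python
import re

# Assumed remote forms: https://github.com/owner/name(.git) and
# git@github.com:owner/name(.git), plus git:// and ssh:// variants.
_GITHUB_REMOTE = re.compile(
    r"^(?:https?://|git://|ssh://git@|git@)github\.com[:/]"
    r"(?P<owner>[^/]+)/(?P<name>[^/]+?)(?:\.git)?/?$"
)


def parse_github_remote(url):
    """Return (owner, name) for a GitHub remote URL, else None."""
    match = _GITHUB_REMOTE.match(url.strip())
    if match is None:
        return None
    return match.group("owner"), match.group("name")


def first_github_remote(urls):
    """Return (owner, name) of the first remote that is on GitHub."""
    for url in urls:
        parsed = parse_github_remote(url)
        if parsed is not None:
            return parsed
    return None
```

Given the remotes `https://gitlab.com/a/b.git` and `git@github.com:octocat/Hello-World.git`, the first is skipped and the second yields the owner `octocat` and the name `Hello-World`.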
### Detecting Moved Methods
Moved methods are detected by storing a list of removed methods (existed in the ancestor, doesn't exist in the new version) for the local document and for the remote document. If an added method (didn't exist in the ancestor, but exists in the new version) is found in the same version, the other version is checked to see whether the same method has been changed. When it has been both changed and moved in both versions, it is reported as a false positive.
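The removed/added bookkeeping can be illustrated with plain dictionaries. The Python sketch below rests on several assumptions that the text above does not confirm: methods are identified by name alone, a move means a method disappears from one file and appears in another within the same version, and a change means the method body differs from the ancestor. The data shape `{file path: {method name: body}}` and all function names are hypothetical.

```python
def removed_methods(ancestor, version):
    """Names that existed in the ancestor but not in the new version."""
    return {name for name in ancestor if name not in version}


def added_methods(ancestor, version):
    """Names that did not exist in the ancestor but exist in the new version."""
    return {name for name in version if name not in ancestor}


def moved_and_changed(ancestor, mover, other):
    """Find methods moved in `mover` and changed in `other`.

    Each argument maps file path -> {method name: method body}.
    Returns a sorted list of (method name, old file, new file).
    """
    # Remember which file each removed method used to live in.
    removed_from = {}
    for path, methods in ancestor.items():
        for name in removed_methods(methods, mover.get(path, {})):
            removed_from[name] = path

    conflicts = []
    for path, methods in mover.items():
        for name in added_methods(ancestor.get(path, {}), methods):
            origin = removed_from.get(name)
            if origin is None:
                continue  # genuinely new method, not a move
            before = ancestor[origin][name]
            after = other.get(origin, {}).get(name)
            # The other version kept the method in place but edited it.
            if after is not None and after != before:
                conflicts.append((name, origin, path))
    return sorted(conflicts)
```

Running the check twice, once with the local version as `mover` and once with the remote version, covers moves made on either side.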
SemDiff uses the `SemanticModel` provided by Roslyn to find the tree containing the base class. The file path in the tree allows SemDiff to look up the base class file in the local cache. If the file was in a pull request, the `ClassDeclarationSyntax` nodes for each file are mapped. The class contents are then diffed. If there are changes, a false positive is reported to the user.
The components are connected by the `SemDiffAnalyzer` class. This class is the entry point for the analyzer. Documentation for `DiagnosticAnalyzer`s can be found on the Roslyn Wiki. SemDiff is run on every compilation of the project (`CompilationAction` callback).
The following steps happen when SemDiff's `OnCompilation` method is called:
- A mapping between repositories and trees is constructed
  - The common case is a single repo, but nothing prevents a project from being stored in multiple repositories
  - Trees that are not contained in a repo are filtered out. The compilation often includes temporary files that should be ignored.
- A new repo instance is created if a new repo is found
- The GitHub data is updated for each repo
- Each syntax tree for each repo is analyzed
- If any false positives or false negatives were found, they are converted into `Diagnostic` instances
- If errors occur while executing the other steps, special diagnostics are returned
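The first step, mapping repositories to trees and dropping trees that belong to no repo, can be sketched as follows. This Python sketch assumes the repository roots are already known and that trees are identified by POSIX-style file paths; SemDiff itself discovers repos through LibGit2Sharp, and the function name here is hypothetical.

```python
from pathlib import PurePosixPath


def group_trees_by_repo(tree_paths, repo_roots):
    """Map each repo root to the tree paths it contains.

    Paths that are not inside any known repo (for example temporary
    files added to the compilation) are left out of the result.
    """
    roots = {PurePosixPath(root) for root in repo_roots}
    mapping = {}
    for raw in tree_paths:
        path = PurePosixPath(raw)
        # parents is ordered nearest-first, so the innermost repo wins.
        root = next((p for p in path.parents if p in roots), None)
        if root is None:
            continue  # not contained in a repo: filtered out
        mapping.setdefault(str(root), []).append(raw)
    return mapping
```

With a single root `/src/proj`, the trees `/src/proj/A.cs` and `/src/proj/sub/B.cs` are grouped under it, while a stray `/tmp/x.cs` is ignored.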
SemDiff is a unique analyzer. Typical analyzers only perform static analysis on the trees provided by the analyzer interface. Because Roslyn provides most of the tools needed for static analysis, analyzers don't typically have dependencies on other libraries.
LibGit2Sharp is a library that wraps the native git library with a managed API. This allows most git functions to be executed in C# code. SemDiff uses LibGit2Sharp to query which files have changed locally and to retrieve the remote URLs from `.git/config`.
Json.NET (or Newtonsoft.Json) is a JSON serialization library that SemDiff uses with GitHub's JSON API.