ConDec-498: Manage file system directories for branches of git repositories (#137)

* CONDEC-498: manage paths for git repo branches

* For a given project, git URI, and default branch name, the GitRepositoryFSManager class should return a unique path on the server's file system.

* Assuming multiple independent Gitclient instances are used in this project, each branch should be checked out in its own folder on the file system to prevent conflicts with work on other branches. GitRepositoryFSManager arranges the file system and provides interfaces that Gitclient implementations can use later.

[issue] Cloning git repositories takes a lot of time. How can providing a branch-specific folder be sped up? [/issue]
[decision] Copy or rename an already existing repository folder to a branch-dedicated folder, pull updates, and check out the branch files in the dedicated folder![/decision]
[pro] Simple but sufficient implementation. [/pro]
[con] Inefficient usage of disk space. [/con]
[con] Branches that are no longer used should be recycled or deleted. [/con]
[pro] Disk space is cheap compared to network, RAM and CPU resources.[/pro]
[pro] Branch folder recycling strategies could be implemented.[/pro]
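
The decision above amounts to a three-step fallback: reuse the branch folder if it exists, otherwise rename a pooled temporary folder, otherwise copy the default branch folder. The following is only a minimal sketch of that order using java.nio.file, not the code from this commit; the class BranchFolderSketch and its method names are hypothetical, and the "TEMP" prefix is borrowed from the diff below.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical illustration class; not part of the commit.
final class BranchFolderSketch {
    static final String TEMP_PREFIX = "TEMP";

    /** Returns the branch folder, obtaining it by the cheapest available route. */
    static Path prepare(Path repoRoot, String branchFolder, String defaultFolder) throws IOException {
        Path target = repoRoot.resolve(branchFolder);
        if (Files.isDirectory(target)) {
            return target; // best case: folder already exists, no I/O needed
        }
        Optional<Path> temp = findTemp(repoRoot);
        if (temp.isPresent()) {
            return Files.move(temp.get(), target); // good case: cheap rename
        }
        Path source = repoRoot.resolve(defaultFolder);
        if (!Files.isDirectory(source)) {
            throw new IOException("No folder to start from under " + repoRoot);
        }
        copyTree(source, target); // bad case: I/O-heavy full copy
        return target;
    }

    /** Finds any pooled temporary folder directly under the repository root. */
    static Optional<Path> findTemp(Path repoRoot) throws IOException {
        try (Stream<Path> children = Files.list(repoRoot)) {
            return children
                    .filter(Files::isDirectory)
                    .filter(p -> p.getFileName().toString().startsWith(TEMP_PREFIX))
                    .findFirst();
        }
    }

    /** Copies a directory tree; Files.walk visits a directory before its entries. */
    static void copyTree(Path source, Path target) throws IOException {
        try (Stream<Path> walk = Files.walk(source)) {
            for (Path p : (Iterable<Path>) walk::iterator) {
                Path dest = target.resolve(source.relativize(p));
                if (Files.isDirectory(p)) {
                    Files.createDirectories(dest);
                } else {
                    Files.copy(p, dest);
                }
            }
        }
    }
}
```

Unlike File.renameTo, Files.move throws an IOException when the rename fails, so a failed rename cannot be mistaken for success.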

* maintain/recycle old branch folders

[issue]Branch folders that are no longer used waste disk space, and users will most likely not recycle them. How can this be improved?[/issue]

[decision]Upon a request for a branch folder, GitRepositoryFSManager checks whether any branch folders have gone unused for a longer time and, if so, returns them to the pool of temporary folders!
Temporary folders can be reused by future requests, which keeps the number of folders per git repository low.[/decision]
[pro]Relatively simple strategy to implement.[/pro]
[con]"Longer time" is not defined. Deciding by which parameters a branch folder is no longer needed is not black and white.[/con]
[con]Does not help much in some edge cases, such as when many branches were accessed within a short period and not manually released by the user/programmer.[/con]
[pro]Implementation can begin with the last request time for a branch and a fixed look-back-duration constant.[/pro]

[alternative]React only when the operating system runs out of space, then delete the oldest branch folder![/alternative]
[pro]Once checked out, a branch stays available on the file system almost indefinitely.[/pro]
[con]Very dangerous for the machine hosting the solution.[/con]
[con]Irresponsible use of resources.[/con]
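
The look-back idea from the decision above can be sketched as follows: each branch request leaves a marker file, and a branch counts as unused once its marker's last-modified time is older than a fixed duration. This is an illustration under assumptions, not the commit's implementation; StaleBranchSketch and findStale are hypothetical names, and passing the current time in as a parameter is a choice made here to keep the check testable.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical illustration class; not part of the commit.
final class StaleBranchSketch {

    /** Lists marker files under repoRoot that were last modified before now - lookBack. */
    static List<String> findStale(Path repoRoot, Duration lookBack, Instant now) throws IOException {
        List<String> stale = new ArrayList<>();
        Instant cutoff = now.minus(lookBack);
        try (Stream<Path> children = Files.list(repoRoot)) {
            for (Path child : (Iterable<Path>) children::iterator) {
                // Only regular files are markers; branch and temp folders are skipped.
                if (Files.isRegularFile(child)
                        && Files.getLastModifiedTime(child).toInstant().isBefore(cutoff)) {
                    stale.add(child.getFileName().toString());
                }
            }
        }
        return stale;
    }
}
```

The timestamp is read from each marker file itself, not from the directory that contains it; the directory's own modification time changes whenever any entry is added or removed.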
lw2011 authored Jun 7, 2019
1 parent 341950c commit 2774b68
Showing 2 changed files with 507 additions and 0 deletions.
@@ -0,0 +1,243 @@
package de.uhd.ifi.se.decision.management.jira.extraction.versioncontrol;


import org.apache.commons.io.FileUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import javax.xml.bind.DatatypeConverter;
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Date;

public class GitRepositoryFSManager {
private static final String TEMP_DIR_PREFIX = "TEMP";
private static final long BRANCH_OUTDATED_AFTER = 60 * 60 * 1000; // 1 hour = 60 minutes * 60 seconds * 1000 milliseconds (1 day would be 24 * 60 * 60 * 1000)
private String baseProjectUriPath;
private String baseProjectUriDefaultPath;

private static final Logger LOGGER = LoggerFactory.getLogger(GitRepositoryFSManager.class);

public GitRepositoryFSManager(String home, String project, String repoUri, String defaultBranch) {
String baseProjectPath = home + File.separator + project;
baseProjectUriPath = baseProjectPath + File.separator + getShortHash(repoUri);
baseProjectUriDefaultPath = baseProjectUriPath + File.separator + defaultBranch;
// clean up if possible after previous requests.
maintainNotUsedBranchPaths();
}

/**
* Returns target directory path for the default branch of the repository.
* @return absolute path to directory of the default branch
*/
public String getDefaultBranchPath() {
return baseProjectUriDefaultPath;
}

/**
* Makes branch's folder available in the temporary pool.
* It can significantly contribute to improving the speed
* of check-outs of other branches as folder renaming
* is not costly compared to copying.
*
* This method stays public as the developer might intend
* to release the folder and not wait for maintenance
* strategy to trigger it.
*
* @param branchShortName branch name
* @return null on failure, absolute path to new temporary directory
*/
public String releaseBranchDirectoryNameToTemp(String branchShortName) {
File oldDir = new File(getBranchPath(branchShortName));
if (!oldDir.isDirectory()) {
return null;
}
else {
Date date= new Date();
long time = date.getTime();
String tempDirString = baseProjectUriPath
+File.separator
+TEMP_DIR_PREFIX+String.valueOf(time);
File tempDir = new File(tempDirString);
boolean renameResult = false;
try {
renameResult = oldDir.renameTo(tempDir);
}
catch (Exception e) {
LOGGER.error("Could not rename "+oldDir
+" to "+tempDirString+". "+e.getMessage());
return null;
}
if (!renameResult) {
LOGGER.error("Could not rename "+oldDir
+" to "+tempDirString+". The reason is not known.");
return null;
}
removeBranchPathMarker(branchShortName);
return tempDirString;
}
}

/**
* Provides filesystem directory for targeted branch.
* Best case: branch already exists, costs no I/O operations.
* Good case: temporary folder exists and can be renamed to branch's
* target folder name.
* Bad case: branch folder is copied in I/O heavy operation
* from default branch.
*
* @param branchShortName branch name
* @return null on failure, absolute path to branch's directory
*/
public String prepareBranchDirectory(String branchShortName) {
if (!useFromExistingBranchFolder(branchShortName)
&& !useFromTemporaryFolder(branchShortName)
&& !useFromDefaultFolder(branchShortName)) {
LOGGER.warn("Neither branch, nor temporary," +
" nor default folder could be found under: "
+ baseProjectUriPath);
return null;
}
rememberBranchPathRequest(branchShortName);
return getBranchPath(branchShortName);
}

/*
* @issue: the file system does not allow all characters in folder and file names,
* therefore md5 can be used to get unique strings for inputs like uris etc.
* But md5 hashes can produce too long paths and corrupt the filesystem, how
* can this be overcome?
* @decision: use the first 5 characters from the generated hash
* @pro: it is common practice to shorten hashes
* @con: entropy might suffer too much from using only 5 chars.
*/
private String getShortHash(String text) {
try {
MessageDigest md = MessageDigest.getInstance("MD5");
md.update(text.getBytes(Charset.forName("UTF-8"))); // explicit charset keeps the hash independent of the platform default
byte[] digest = md.digest();
return DatatypeConverter.printHexBinary(digest).toUpperCase().substring(0,5);
}
catch (NoSuchAlgorithmException e) {
LOGGER.error("MD5 algorithm is not available.");
return "";
}
}


/*
* Makes sure not too much disk space is wasted, if there
* is no need for many folders.
*
* Still does not prevent disk space waste when many branches
* need to be accessed in parallel.
*/
private void maintainNotUsedBranchPaths() {
String[] notUsedBranchPaths = findOutdatedBranchPaths();
if (notUsedBranchPaths!=null) {
for (String branch : notUsedBranchPaths) {
// log success only if the folder was actually released
if (releaseBranchDirectoryNameToTemp(branch)!=null) {
LOGGER.info("Returned "+branch+" to temporary directory pool.");
}
}
}
}

/*
* Writes a marker file for each branch folder request; the last-modified
* time of these files can later be used for branch folder clean-ups.
*/
private void rememberBranchPathRequest(String branchShortName) {
// ignore the last marker
removeBranchPathMarker(branchShortName);
// add new marker
File file = new File(baseProjectUriPath+File.separator+branchShortName);
file.setWritable(true);
try {
// assumes branch names are valid file names
FileUtils.writeStringToFile(file, getShortHash(branchShortName), Charset.forName("UTF-8"));
}
catch (IOException ex) {
LOGGER.warn("Could not write marker file "+file+". "+ex.getMessage());
}
}

/* If the branch file marker does not exist, the maintenance
* shall not try to recycle the branch folder
*/
private void removeBranchPathMarker(String branchShortName) {
File file = new File(baseProjectUriPath,branchShortName);
file.delete();
}

private String getBranchPath(String branchShortName) {
return baseProjectUriPath+File.separator+getShortHash(branchShortName);
}

private boolean useFromDefaultFolder(String branchShortName) {
File defaultDir = new File(baseProjectUriDefaultPath);
if (!defaultDir.isDirectory()) {
return false;
}
else {
try {
File newDir = new File(getBranchPath(branchShortName));
FileUtils.copyDirectory(defaultDir,newDir);
}
catch (Exception e) {
LOGGER.error("Could not copy "+defaultDir
+" to "+getBranchPath(branchShortName)+".\n\t"+e.getMessage());
return false;
}
}
return true;
}

private boolean useFromExistingBranchFolder(String branchShortName) {
File dir = new File(getBranchPath(branchShortName));
return dir.isDirectory();
}

private boolean useFromTemporaryFolder(String branchShortName) {
String[] tempDirs = findTemporaryDirectoryNames();
if (tempDirs==null || tempDirs.length<1) {
return false;
}
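File dir = new File(baseProjectUriPath+File.separator
+tempDirs[0]); // get the 1st of temp dirs
File newDir = new File(getBranchPath(branchShortName));
boolean renameResult = false;
try {
// renameTo reports failure through its return value, not an exception
renameResult = dir.renameTo(newDir);
}
catch (Exception e) {
LOGGER.error("Could not rename "+tempDirs[0]
+" to "+getBranchPath(branchShortName)+". "+e.getMessage());
return false;
}
if (!renameResult) {
LOGGER.error("Could not rename "+tempDirs[0]
+" to "+getBranchPath(branchShortName)+". The reason is not known.");
}
return renameResult;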
}

private String[] findTemporaryDirectoryNames() {
File file = new File(baseProjectUriPath);
String[] directories = file.list((current, name) ->
(name.toString().startsWith(TEMP_DIR_PREFIX)
&& new File(current, name).isDirectory()));
return directories;
}

/* Searches for marker files inside baseProjectUriPath
* and looks at their last-modified times
*/
private String[] findOutdatedBranchPaths() {
File file = new File(baseProjectUriPath);
long now = new Date().getTime();
String[] branchTouchFiles = file.list((current, name) ->
{
// "current" is the parent directory, so inspect the marker file itself
File marker = new File(current, name);
boolean outDated = (now-marker.lastModified())>BRANCH_OUTDATED_AFTER;
return marker.isFile() && outDated;
});
return branchTouchFiles;
}
}
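
The short-hash decision in getShortHash can also be sketched without javax.xml.bind.DatatypeConverter, which was removed from the JDK in Java 11. The class below is a hypothetical illustration, not part of the commit; ShortHashSketch, shortHash, and the length parameter are assumptions introduced here. Five hexadecimal characters allow 16^5 = 1,048,576 distinct values, so by the usual birthday estimate a collision becomes more likely than not once roughly 1,200 different inputs share a parent folder.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration class; not part of the commit.
final class ShortHashSketch {

    /** Returns the first `length` upper-case hex characters of the MD5 digest of `text`. */
    static String shortHash(String text, int length) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02X", b)); // two hex digits per byte
            }
            return hex.substring(0, length);
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform implementation is required to support MD5.
            throw new IllegalStateException("MD5 is not available", e);
        }
    }
}
```

Throwing here instead of returning an empty string is a deliberate difference from the diff above: an empty hash would silently map every repository URI and branch name to the same folder.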