This codebase is a C++ project for refining protein structures against experimental in-solution small-angle X-ray scattering (SAXS) data. It includes functionality for generating and manipulating molecular structures, analyzing their properties, and fitting them to experimental data.
To build the project using CMake, follow these steps:
- Open a terminal and check that CMake is installed on your system (version 3.10 or higher is recommended):
cmake -version
- Navigate to the carbonara root directory:
cd path/to/carbonara
- Inside the carbonara directory, create a build directory and navigate into it:
mkdir build
cd build
- Generate the build files:
cmake ..
- Build the project:
make
This class represents a molecular structure. It includes methods for:
- Reading in sequence and coordinate data
- Manipulating the molecular structure
- Analyzing properties like hydrophobicity and coiled-coil potential
Key methods to focus on:
- readInSequence(): Parses sequence data
- readInCoordinates(): Loads coordinate data
- changeMoleculeSingleMulti(): Modifies a specific part of the molecule
Handles the calculation of hydration shells around molecules. Key areas:
- Generation of hydration layer
- Calculation of solvent-molecule distances
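The solvent-molecule distance calculation can be illustrated with a minimal sketch. It assumes a plain point type and a brute-force search; `Point3`, `pointDistance()`, and `nearestMoleculeDistance()` are hypothetical names for illustration and are not taken from the project's hydration shell code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical 3D point type; the project's own coordinate class may differ.
struct Point3 {
    double x, y, z;
};

// Euclidean distance between two points.
double pointDistance(const Point3& a, const Point3& b) {
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    const double dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Distance from one solvent position to the nearest point of the molecule,
// found by checking every molecule point in turn.
double nearestMoleculeDistance(const Point3& solvent,
                               const std::vector<Point3>& molecule) {
    assert(!molecule.empty());
    double best = std::numeric_limits<double>::infinity();
    for (const Point3& atom : molecule) {
        best = std::min(best, pointDistance(solvent, atom));
    }
    return best;
}
```

A brute-force loop like this costs one distance evaluation per molecule point for each solvent position; a spatial grid or neighbour list would be the usual next step if this became a bottleneck.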
Deals with experimental scattering data and fitting. Important methods:
- fitToScattering(): Fits molecular model to scattering data
- setPhases(): Sets up scattering phases for calculations
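As a rough illustration of what fitting a model to scattering data involves, the sketch below computes a reduced chi-squared between an experimental intensity curve and a model curve after applying the least-squares optimal scale factor. This is a generic textbook formulation, not the implementation of fitToScattering(); `FitResult` and `scaledChiSquared()` are hypothetical names, and the real code may also fit other terms such as a constant background or hydration contribution.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: the fitted scale factor and the fit quality.
struct FitResult {
    double scale;
    double chiSquared;
};

// Scales the model intensities to best match the experimental intensities
// (weighted least squares), then returns the mean squared weighted residual.
// All three vectors are assumed to be sampled at the same q values.
FitResult scaledChiSquared(const std::vector<double>& experimental,
                           const std::vector<double>& sigma,
                           const std::vector<double>& model) {
    assert(!experimental.empty());
    assert(experimental.size() == sigma.size());
    assert(experimental.size() == model.size());

    // Optimal scale c minimises sum(((I_exp - c * I_mod) / sigma)^2).
    double numerator = 0.0;
    double denominator = 0.0;
    for (std::size_t i = 0; i < experimental.size(); ++i) {
        const double weight = 1.0 / (sigma[i] * sigma[i]);
        numerator += weight * experimental[i] * model[i];
        denominator += weight * model[i] * model[i];
    }
    assert(denominator > 0.0);
    const double scale = numerator / denominator;

    double chi = 0.0;
    for (std::size_t i = 0; i < experimental.size(); ++i) {
        const double residual = (experimental[i] - scale * model[i]) / sigma[i];
        chi += residual * residual;
    }
    return {scale, chi / static_cast<double>(experimental.size())};
}
```

A lower `chiSquared` means the model curve agrees more closely with the data once the overall intensity scale has been factored out.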
Generates random molecular structures. Key functionality:
- Creation of random sections with specific properties
- Blending different structural elements (e.g., loops to helices)
Calculates writhe (a topological property) for molecular structures.
- DIDownSample(): Calculates downsampled writhe
- compareFingerPrints(): Compares writhe "fingerprints" between structures
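For readers unfamiliar with writhe, the sketch below gives a crude numerical estimate for an open polyline by applying a midpoint rule to the Gauss double integral, Wr = 1/(4π) ∬ (dr₁ × dr₂) · (r₁ − r₂) / |r₁ − r₂|³. It is only meant to show the quantity being computed: `Vec3` and `approximateWrithe()` are hypothetical names, and DIDownSample() may well use a different and more accurate evaluation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 3D vector type used only for this illustration.
struct Vec3 {
    double x, y, z;
};

Vec3 subtract(const Vec3& a, const Vec3& b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z};
}

Vec3 midpoint(const Vec3& a, const Vec3& b) {
    return {0.5 * (a.x + b.x), 0.5 * (a.y + b.y), 0.5 * (a.z + b.z)};
}

Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Midpoint-rule estimate of the writhe of an open polyline.
// Each segment is represented by its edge vector and its midpoint.
double approximateWrithe(const std::vector<Vec3>& points) {
    assert(points.size() >= 2);
    const double pi = std::acos(-1.0);
    const std::size_t segments = points.size() - 1;
    double sum = 0.0;
    for (std::size_t i = 0; i < segments; ++i) {
        const Vec3 ei = subtract(points[i + 1], points[i]);
        const Vec3 mi = midpoint(points[i], points[i + 1]);
        for (std::size_t j = i + 1; j < segments; ++j) {
            const Vec3 ej = subtract(points[j + 1], points[j]);
            const Vec3 mj = midpoint(points[j], points[j + 1]);
            const Vec3 d = subtract(mi, mj);
            const double dist = std::sqrt(dot(d, d));
            if (dist == 0.0) continue;  // skip coincident midpoints
            sum += dot(cross(ei, ej), d) / (dist * dist * dist);
        }
    }
    // The integrand is symmetric in (i, j), so each unordered pair counts
    // twice in the full double sum: 2 / (4 * pi) = 1 / (2 * pi).
    return sum / (2.0 * pi);
}
```

Two properties make useful sanity checks for any writhe routine: a curve lying in a plane has zero writhe, and mirroring a curve flips the sign of its writhe.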
Located primarily in the randomMol class. The main method to focus on is makeRandomMolecule().
Implemented in the experimentalData class. The key method is fitToScatteringMultiple().
Implemented in the writheFP class. The main method is DIDownSample().
The primary execution flow is in mainPredictionFinalQvar.cpp. It follows these steps:
- Initialize parameters and data structures
- Load experimental data
- Generate or load initial molecular structures
- Iteratively modify structures and evaluate fit
- Output results
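The "iteratively modify structures and evaluate fit" step above can be pictured as a simple propose-and-accept loop. The sketch below is a toy version under stated assumptions: a structure is stood in for by a vector of numbers, a change is a small random perturbation of one entry, and a change is kept only if the score improves. `Structure`, `RefinementResult`, and `refine()` are hypothetical names, and the acceptance rule actually used in mainPredictionFinalQvar.cpp may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <utility>
#include <vector>

// Stand-in for a molecular structure; the real code uses its own classes.
using Structure = std::vector<double>;

struct RefinementResult {
    Structure best;
    double bestScore;
};

// Repeatedly perturbs one entry of the structure and keeps the change only
// when the score (lower is better) improves.
RefinementResult refine(Structure current,
                        const std::function<double(const Structure&)>& score,
                        int iterations,
                        unsigned seed) {
    assert(!current.empty());
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, current.size() - 1);
    std::normal_distribution<double> step(0.0, 0.1);

    double currentScore = score(current);
    for (int i = 0; i < iterations; ++i) {
        Structure trial = current;       // propose a modified structure
        trial[pick(rng)] += step(rng);
        const double trialScore = score(trial);
        if (trialScore < currentScore) { // accept only improvements
            current = std::move(trial);
            currentScore = trialScore;
        }
    }
    return {current, currentScore};
}
```

Passing the scoring function in as a parameter keeps the loop independent of how the fit is evaluated, which is also one way a long main routine could be split into smaller, testable pieces.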
- Code Organization: Many functions, especially in main files, are very long and could be broken down.
- Error Handling: More robust error checking and handling is needed throughout.
- Memory Management: Consider replacing raw pointers with smart pointers.
- Parallelism: There's potential for more parallelism in computationally intensive parts.
- Testing: Implement unit tests for key components.
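To make the memory management suggestion concrete, here is a minimal sketch of replacing a manually managed raw pointer with std::unique_ptr. The `Molecule` struct and `makeMolecule()` function are hypothetical stand-ins, not the project's actual types, and the sketch assumes a compiler with C++14 support for std::make_unique.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a heap-allocated structure object.
struct Molecule {
    std::vector<double> coordinates;
};

// Raw-pointer style this replaces:
//   Molecule* mol = new Molecule;  ...  delete mol;  // easy to leak on early return
//
// With unique_ptr the object is released automatically when the owner goes
// out of scope, and ownership transfer is explicit in the return type.
std::unique_ptr<Molecule> makeMolecule(std::size_t atomCount) {
    assert(atomCount > 0);
    auto mol = std::make_unique<Molecule>();
    mol->coordinates.assign(3 * atomCount, 0.0);  // x, y, z per atom
    return mol;
}
```

Callers that only need to read or modify the object can still take a `Molecule&` or `const Molecule&`, so most function signatures need not mention the smart pointer at all.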
- Refactor mainPredictionFinalQvar.cpp to improve readability and maintainability.
- Implement more comprehensive error handling.
- Optimize performance-critical sections, possibly using parallel computing techniques.
- Improve documentation throughout the codebase.
- Implement a testing framework and write unit tests for key components.