<head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>IgSimulator 2.0.alpha Manual</title>
<style type="text/css">
.code {
background-color: lightgray;
}
</style>
<style>
</style>
</head>
<body>
<h1>IgSimulator 2.0.alpha manual</h1>
1. <a href="#intro">What is IgSimulator?</a><br>
2. <a href="#install">Installation</a><br>
2.1. <a href="#test_datasets">Verifying your installation</a><br>
3. <a href="#usage">IgSimulator usage</a><br>
3.1. <a href="#basic_options">Basic options</a><br>
3.2. <a href="#advanced_options">Advanced options</a><br>
3.3. <a href="#examples">Examples</a><br>
3.4. <a href="#output">Output files</a><br>
4. <a href="#output_files">Output file formats</a><br>
4.1. <a href="#base_repertoire.fasta">Base repertoire fasta</a><br>
4.2. <a href="#base_repertoire.info">Base repertoire info</a><br>
4.3. <a href="#pools">Full and filtered pool fasta</a><br>
4.4. <a href="#trees_dir">Clonal trees files</a><br>
<!--- 5. <a href = "#plot_descr">Plot description</a><br> --->
5. <a href="#feedback">Feedback and bug reports</a><br>
<!--- 5.1. <a href="#citation">Citation</a><br> --->
<!-- -------- --->
<h2 id = "intro">1. What is IgSimulator?</h2>
<p>
IgSimulator is a tool for simulating antibody repertoires, clonal lineages, and clonal trees.
It performs the following steps:
<ul>
<li>simulates <b>metaroots</b>, each the result of a V(D)J recombination,</li>
<li>for each metaroot, simulates the <b>number of trees</b> that will be rooted at this metaroot in the third step,</li>
<li>for each metaroot, simulates that many <b>clonal trees</b>, imitating the evolutionary process and clonal selection.</li>
</ul>
</p>
Some vertices of a clonal tree are marked absent to imitate the evolutionary process.
<!-- -------- --->
<a id="install"></a>
<h2>2. Installation</h2>
IgSimulator has the following dependencies:
<ul>
<li>64-bit Linux or macOS system,</li>
<li>g++ (version 4.7 or higher) or clang compiler,</li>
<li>CMake (version 2.8.8 or higher),</li>
<li>Python 2.7.</li>
</ul>
To build IgSimulator, type
<pre class="code">
<code>
make
</code>
</pre>
To install IgSimulator after building it, type
<pre class="code">
<code>
make install
</code>
</pre>
If you want to install IgSimulator to a particular path <code>$YOUR_PATH</code>, type
<pre class="code">
<code>
make install prefix=$YOUR_PATH
</code>
</pre>
<a id="test_datasets"></a>
<h3>2.1. Verifying your installation</h3>
To try IgSimulator, run:
<pre class="code"><code>
./ig_simulator.py --test
</code>
</pre>
The test run should take no more than a few seconds.
If the installation of IgSimulator is successful, you will find the following information at the end of the log:
<pre class="code">
<code>
Thank you for using IgSimulator!
Log was written to &lt;your_installation_dir&gt;/ig_simulator_test/ig_simulator.log
</code>
</pre>
<!-- -------- --->
<h2 id = "usage">3. IgSimulator usage</h2>
To run IgSimulator, type:
<pre class="code">
<code>
./ig_simulator.py [options] -o &lt;output_dir&gt;
</code>
</pre>
<!-- --->
<h3 id = "basic_options">3.1. Basic options</h3>
<code>-o / --output &lt;output_dir&gt;</code><br>
Output directory (required).
<br><br>
<code>--test</code><br>
Runs IgSimulator with default parameters in a test directory.
The test run is equivalent to the following command line:
<pre class = "code">
<code>
./ig_simulator.py -o ig_simulator_test
</code>
</pre>
<code>--help</code><br>
Prints the help message.
<br><br>
<!-- --->
<h3 id = "advanced_options">3.2. Advanced options</h3>
<code>-l / --loci &lt;str&gt;</code><br>
Immunoglobulin locus for which V(D)J recombination is simulated. <br>
Available values are <code>IGH</code> / <code>IGL</code> / <code>IGK</code>.
Default value is <code>IGH</code>.
<br><br>
<code>-n / --n_metaroots &lt;int&gt;</code><br>
Number of <b>metaroots</b> (results of V(D)J-recombinations) to simulate.
Default value is <code>10</code>.
<br><br>
<code>-s / --tree_strategy &lt;str&gt;</code><br>
Strategy to simulate <b>clonal trees</b>.
Available values are <code>deep</code> / <code>wide</code> / <code>uniform</code>.
Default value is <code>deep</code>.
<!-- --->
<h3 id = "examples">3.3. Examples</h3>
To simulate <code>50</code> <b>metaroots</b> with the <code>uniform</code> clonal tree strategy
and write the results to the <code>ig_simulator_test</code> directory, run
<pre class="code">
<code>
./ig_simulator.py -n 50 -s uniform -o ig_simulator_test
</code>
</pre>
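<p>
When several runs with different settings are needed, the command line can be assembled programmatically.
The following is a minimal Python sketch that uses only the options documented in sections 3.1 and 3.2;
the helper name <code>build_command</code> is hypothetical and is not part of IgSimulator.
</p>
<pre class="code">
```python
# Values documented in sections 3.1 and 3.2 of this manual.
LOCI = ("IGH", "IGL", "IGK")
STRATEGIES = ("deep", "wide", "uniform")


def build_command(output_dir, n_metaroots=10, strategy="deep", loci="IGH"):
    """Build an ig_simulator.py argument list (hypothetical helper)."""
    if loci not in LOCI:
        raise ValueError("unknown loci: %r" % (loci,))
    if strategy not in STRATEGIES:
        raise ValueError("unknown tree strategy: %r" % (strategy,))
    return [
        "./ig_simulator.py",
        "-l", loci,
        "-n", str(int(n_metaroots)),
        "-s", strategy,
        "-o", output_dir,
    ]
```
</pre>
<p>
The returned list can be passed to <code>subprocess.check_call</code> from the installation directory,
for example <code>subprocess.check_call(build_command("ig_simulator_test", 50, "uniform"))</code>.
</p>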
<!-- --->
<h3 id = "output">3.4. Output files</h3>
IgSimulator creates the working directory (whose name is given by option <code>-o</code>)
and writes the following files and directories to it:
<ul>
<li>
<b>base_repertoire.fasta</b> — simulated metaroots — results of V(D)J recombination
(<a href = "#base_repertoire.fasta">Description</a>).
</li>
<li>
<b>base_repertoire.info</b> — detailed information about V(D)J recombination for each metaroot
(<a href = "#base_repertoire.info">Description</a>).
</li>
<li>
<b>filtered_pool.fasta</b>, <b>full_pool.fasta</b>: the simulated repertoire with certain clones filtered out (imitating clonal selection)
and the full repertoire without any filtering, respectively
(<a href = "#pools">Description</a>).
</li>
</ul>
<ul>
<li>
<b>trees_dir</b>: directory that contains ready-to-draw dot files for all simulated clonal trees
(<a href = "#trees_dir">Description</a>).
</li>
</ul>
<ul>
<li>
<b>ig_simulator.log</b>: the full IgSimulator log.
</li>
</ul>
<!--- -->
<h2 id = "output_files">4. Output file formats</h2>
<h3 id = "base_repertoire.fasta">4.1. Base repertoire fasta</h3>
<b>base_repertoire.fasta</b> contains all simulated metaroots in fasta format.
The ID of each metaroot matches the pattern <code>forest_X_multiplicity_Y</code>, where <code>X</code> is the zero-based number of the metaroot (at most the value of <code>-n</code> minus one)
and <code>Y</code> is the number of trees that are simulated with this metaroot as a root.
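<p>
The ID pattern above can be parsed with a short script. The following Python sketch assumes that the fasta header
contains exactly the ID described here, with no extra description text; the helper name <code>parse_metaroot_id</code> is hypothetical.
</p>
<pre class="code">
```python
import re

# Assumed layout: forest_X_multiplicity_Y, as described in this section.
METAROOT_ID = re.compile(r"^forest_(\d+)_multiplicity_(\d+)$")


def parse_metaroot_id(seq_id):
    """Return (metaroot_index, number_of_trees) for a base repertoire ID."""
    # Accept either a bare ID or a fasta header line starting with the marker.
    cleaned = seq_id.strip().lstrip(">")
    match = METAROOT_ID.match(cleaned)
    if match is None:
        raise ValueError("unexpected metaroot id: %r" % (seq_id,))
    return int(match.group(1)), int(match.group(2))
```
</pre>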
<h3 id = "base_repertoire.info">4.2. Base repertoire info</h3>
<b>base_repertoire.info</b> contains the following information about each simulated metaroot:
<ul>
<li><code>Index</code> corresponds to the index in the id of the metaroot in the <code>base_repertoire.fasta</code> (<a href = "#base_repertoire.fasta">here</a>).</li>
<li><code>V/D/J names and sequences</code> — names of gene segments and their sequences that form the metaroot.</li>
<li><code>Cleavage in V/D(left)/D(right)/J gene</code>: the length of the cleavage. A negative cleavage represents a palindromic insertion.</li>
<li><code>Insertion in VD/DJ junction</code> — non-genomic insertions. </li>
<li><code>CDR1/2/3 positions and sequences</code>: zero-based positions of the CDRs in the metaroot sequence and the corresponding sequences.</li>
</ul>
<h3 id = "pools">4.3. Full and filtered pool fasta</h3>
<b>full_pool.fasta</b> contains the full pool of all simulated sequences.
The ID of each sequence matches the pattern <code>forest_X_tree_Y_antibody_Z</code>, where
<ul>
<li><code>X</code> is the zero-based number of the metaroot (at most the value of <code>-n</code> minus one),</li>
<li><code>Y</code> is a zero-based number of the clonal tree that the sequence is a vertex of,</li>
<li><code>Z</code> is a zero-based number of the sequence in the <code>Y</code>th clonal tree.</li>
</ul>
Because of clonal selection, some sequences are absent from a real repertoire.
<b>filtered_pool.fasta</b> imitates this: it uses the same format as <b>full_pool.fasta</b>
but contains only a subset of its sequences.
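<p>
Since the filtered pool is a subset of the full pool, the sequences removed by the simulated clonal selection can be recovered
by comparing the IDs in the two files. The following Python sketch illustrates this under the assumption that each ID is the first
whitespace-delimited token of a fasta header; the helper names are hypothetical.
</p>
<pre class="code">
```python
import re

# Assumed layout: forest_X_tree_Y_antibody_Z, as described in this section.
POOL_ID = re.compile(r"^forest_(\d+)_tree_(\d+)_antibody_(\d+)$")


def parse_pool_id(seq_id):
    """Return (metaroot, tree, antibody) indices for a pool sequence ID."""
    match = POOL_ID.match(seq_id)
    if match is None:
        raise ValueError("unexpected pool id: %r" % (seq_id,))
    return tuple(int(group) for group in match.groups())


def fasta_ids(lines):
    """Collect the IDs found on the header lines of a fasta file."""
    ids = set()
    for line in lines:
        fields = line[1:].split() if line.startswith(">") else []
        if fields:
            ids.add(fields[0])
    return ids


def absent_ids(full_lines, filtered_lines):
    """IDs present in the full pool but missing from the filtered pool."""
    return fasta_ids(full_lines) - fasta_ids(filtered_lines)
```
</pre>
<p>
Both functions accept any iterable of lines, so open file objects for <b>full_pool.fasta</b> and <b>filtered_pool.fasta</b> can be passed directly.
</p>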
<h3 id = "trees_dir">4.4. Clonal trees files</h3>
Each file in the directory <code>trees_dir</code> represents a certain simulated clonal tree in ready-to-draw <code>dot</code> format.
The name of each file matches the pattern <code>forest_X_tree_Y.dot</code> where
<ul>
<li><code>X</code> is the zero-based number of the metaroot (at most the value of <code>-n</code> minus one),</li>
<li><code>Y</code> is a zero-based number of the clonal tree among those possessing the metaroot as their root.</li>
</ul>
The ID of each vertex is the <code>Z</code> defined <a href = "#pools">here</a>.
Productive sequences are drawn as circles and non-productive sequences as rectangles.
Absent sequences (those present in <b>full_pool.fasta</b> but not in <b>filtered_pool.fasta</b>) are colored magenta.
In addition, each dot file contains comments about the simulated SHMs.
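<p>
The dot files can be rendered with Graphviz, which is not among the IgSimulator dependencies listed above and is assumed here
to be installed separately. The following Python sketch parses a tree file name and builds the corresponding
<code>dot</code> command line; the helper names are hypothetical.
</p>
<pre class="code">
```python
import os
import re

# Assumed layout: forest_X_tree_Y.dot, as described in this section.
TREE_FILE = re.compile(r"^forest_(\d+)_tree_(\d+)\.dot$")


def parse_tree_filename(path):
    """Return (metaroot, tree) indices encoded in a tree file name."""
    match = TREE_FILE.match(os.path.basename(path))
    if match is None:
        raise ValueError("unexpected tree file name: %r" % (path,))
    return int(match.group(1)), int(match.group(2))


def render_command(dot_path, fmt="png"):
    """Build a Graphviz command that renders one tree next to its dot file."""
    out_path = os.path.splitext(dot_path)[0] + "." + fmt
    return ["dot", "-T" + fmt, dot_path, "-o", out_path]
```
</pre>
<p>
The list returned by <code>render_command</code> can be passed to <code>subprocess.check_call</code> to produce the image.
</p>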
<a id="feedback"></a>
<h2>5. Feedback and bug reports</h2>
Your comments, bug reports, and suggestions are very welcome.
They will help us to further improve IgSimulator.
<br><br>
If you have any trouble running IgSimulator, please send us the log file from the output directory.
<br><br>
Address for communications: <a href="mailto:igtools_support@googlegroups.com">igtools_support@googlegroups.com</a>.