updated documentation

KevinMenden · Mar 25, 2021 · 238c2dc
1 parent c984d91

Showing 5 changed files with 23 additions and 7 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -8,6 +8,7 @@
 * Improved logging and using rich progress bar for training
 * Gene subsetting is now done only when merging datasets, which makes it possible to generate different combinations
 of simulated datasets
+* Added `scaden merge` command which allows merging of previously created datasets  
 
 ### Version 1.0.2
 

diff --git a/docs/blog.md b/docs/blog.md
@@ -2,10 +2,13 @@
 Apart from the changelog, this is a more informal section where I will inform about new features
 that have been (or will be) implemented in Scaden.
 
-# Scaden v1.1.0 - Performance Improvements (21.03.2021)
+# Scaden v1.1.0 - Performance Improvements and `scaden merge` tool (21.03.2021)
+
+Scaden v1.1.0 brings significantly reduced memory consumption for the data simulation step, which was a frequently requested feature.
+Now, instead of using about 4 GB of memory to simulate a small dataset, Scaden only uses 1 GB. Memory usage no longer increases
+with the number of datasets. This makes it possible to create datasets from large collections of scRNA-seq datasets without
+needing excessive memory. Furthermore, Scaden now stores the simulated data in `.h5ad` format with the full list of genes.
+This way you can simulate from a scRNA-seq dataset once and combine it with other datasets in the future. To help with this,
+I've added the `scaden merge` command, which takes a list of datasets or a directory with `.h5ad` datasets and creates
+a new training dataset from them.
 
-Scaden v1.1.0 brings significantly improved memory consumption for the data simulation step, which was a asked for 
-quite frequently. Now, instead of using about 4 GB of memory to simulate a small dataset, Scaden only uses 1 GB. This will
-allow to create datasets from large collections of scRNA-seq datasets without needing excessive memory. Furthermore,
-Scaden now stores the simulated data in `.h5ad` format with the full list of genes. This way you can simulate from a
-scRNA-seq dataset once and combine it with other datasets in the future.
diff --git a/docs/changelog.md b/docs/changelog.md
@@ -8,6 +8,8 @@
 * Improved logging and using rich progress bar for training
 * Gene subsetting is now done only when merging datasets, which makes it possible to generate different combinations
 of simulated datasets
+* Added `scaden merge` command which allows merging of previously created datasets  
+
 
 ### Version 1.0.2
 

diff --git a/docs/index.md b/docs/index.md
@@ -8,3 +8,7 @@ at the [DZNE Tübingen](https://www.dzne.de/en/about-us/sites/tuebingen/) and th
 
 A paper describing Scaden has been published in Science Advances:
 [Deep-learning based cell composition analysis from tissue expression profiles](https://advances.sciencemag.org/content/6/30/eaba2619)
+
+For information about how to install Scaden, go to the [Installation](installation.md) section. Look in the [Usage](usage.md)
+section for general help with Scaden usage. In the [Datasets](datasets.md) section you'll find a list of prepared training datasets.
+You can also have a look in the [Blog](blog.md) section, where I summarize new features that are added to Scaden.
diff --git a/docs/usage.md b/docs/usage.md
@@ -120,7 +120,13 @@ An example for a pattern would be `*_counts.txt`. This pattern would find the fo
 
 Make sure to include an `*` in your pattern!
 
-This command will create the artificial samples in the current working directory. You can also specificy an output directory using the `--out` parameter. Scaden will also directly create a .h5ad file in this directory, which is the file you will need for training. By default, this file will be called `data.h5ad`, however you can change the prefix using the `--prefix` flag.
+This command will create the artificial samples in the current working directory. You can also specify an output directory using the `--out` parameter.
+Scaden will also directly create a `.h5ad` file in this directory, which is the file you will need for training.
+By default, this file will be called `data.h5ad`; however, you can change the prefix using the `--prefix` flag.
+
+Alternatively, you can manually merge `.h5ad` files that have been created with `scaden simulate` from v1.1.0 on using
+the `scaden merge` command. Either point it to a directory of `.h5ad` files, or give it a comma-separated list of files
+to merge. Type `scaden merge --help` for details.
 
 ## File Formats
 For Scaden to work properly, your input files have to be correctly formatted. As long as you use Scaden's built-in functionality to generate the training data, you should have no problem
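
The `docs/usage.md` change above says `scaden merge` accepts either a directory of `.h5ad` files or a comma-separated list of files. A minimal Python sketch of how those two input forms could be told apart is shown below. It is only an illustration: `resolve_merge_inputs` is a hypothetical helper and not part of Scaden, and this commit does not show how Scaden actually parses its arguments. Run `scaden merge --help` for the real options.

```python
from pathlib import Path
from typing import List


def resolve_merge_inputs(spec: str) -> List[Path]:
    """Hypothetical helper (not Scaden code): list the .h5ad files named by `spec`.

    `spec` is either a directory containing .h5ad files or a
    comma-separated list of .h5ad file paths.
    """
    if not spec.strip():
        raise ValueError("no input given")

    path = Path(spec)
    if path.is_dir():
        # Directory form: take every .h5ad file directly inside it, sorted by name.
        return sorted(path.glob("*.h5ad"))

    # List form: split on commas and ignore surrounding whitespace.
    files = [Path(part.strip()) for part in spec.split(",") if part.strip()]
    wrong_suffix = [f for f in files if f.suffix != ".h5ad"]
    if wrong_suffix:
        names = ", ".join(str(f) for f in wrong_suffix)
        raise ValueError(f"not .h5ad files: {names}")
    return files
```

Checking for a directory first keeps the list form unambiguous, and rejecting other file extensions early avoids handing a raw count table to a step that expects simulated `.h5ad` data.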