refineTagsAndSort settings for Cell + UMI with partial whitelist #1089
How can we handle UMI correction if we have a whitelist for only one portion of the barcode? For example, for the tag pattern: Let's say I expect the middle part of the UMIs to be GGGCCC or GGGTTT. Should the entries in the umi_whitelist.txt file be NNNNNGGGCCCNNNNN and NNNNNGGGTTTNNNNN?
Alternatively, should I break the UMI in the tag pattern like the following
I was worried that, without a whitelist for UMI1 and UMI3, they would not get corrected. I was also looking for presets that process both cell barcodes and UMIs. I found your presets for BD and 10x. They look similar for the first half:
But then BD continues with
and 10x continues with
What are the implications of these differences?
Hi,
Regarding the UMI whitelist, the right way is the second variant: split the UMI into three groups (e.g., UMI1, UMI2, and UMI3) and provide a whitelist only for the middle group (UMI2). When supplying a whitelist locally, you need to include the path to the file that contains the barcodes.
Note that the presence or absence of a whitelist does not affect the correction step, so groups without a whitelist (here UMI1 and UMI3) still go through correction.
With regard to the filters, they are used to remove erroneous cells based on the UMI count. For BD data, a strict threshold of 3 UMIs per cell is applied, whereas for 10x data an automatic thresholding approach based on a modified Otsu algorithm is used. The automatic thresholding approach does not work properly for BD data, which is why the strict threshold is used instead.
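To make the split concrete, here is a minimal Python sketch. It is not MiXCR code and does not show tag pattern syntax. It assumes a 16-nt UMI laid out as 5 + 6 + 5 nucleotides (taken from the `NNNNNGGGCCCNNNNN` example in the question) and a plain-text whitelist with one sequence per line; the function names and the file name `umi2_whitelist.txt` are hypothetical.

```python
import os
import tempfile

# Assumption: segment lengths inferred from the NNNNNGGGCCCNNNNN example.
SEGMENT_LENGTHS = (5, 6, 5)  # UMI1, UMI2, UMI3


def split_umi(umi, lengths=SEGMENT_LENGTHS):
    """Split a UMI string into consecutive segments of the given lengths."""
    if len(umi) != sum(lengths):
        raise ValueError(f"expected {sum(lengths)} nt, got {len(umi)}")
    parts, start = [], 0
    for n in lengths:
        parts.append(umi[start:start + n])
        start += n
    return tuple(parts)


def write_whitelist(path, sequences):
    """Write one sequence per line (assumed whitelist file layout)."""
    with open(path, "w") as handle:
        handle.write("\n".join(sequences) + "\n")


def read_whitelist(path):
    """Read the whitelist back as a set, skipping blank lines."""
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


# The whitelist holds only the middle segment, with no N padding.
whitelist_path = os.path.join(tempfile.mkdtemp(), "umi2_whitelist.txt")
write_whitelist(whitelist_path, ["GGGCCC", "GGGTTT"])
umi2_whitelist = read_whitelist(whitelist_path)

umi1, umi2, umi3 = split_umi("ACGTAGGGCCCTTGCA")
print(umi1, umi2, umi3, umi2 in umi2_whitelist)
```

The point of the sketch is only that the whitelist file lists the bare middle sequences, while the flanking segments are carried as separate groups.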
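To illustrate the difference between a fixed cutoff and an automatic one, here is a minimal Python sketch. It is not MiXCR's implementation: it uses the textbook Otsu criterion (maximizing between-class variance), whereas the reply describes a modified variant. The log10 transform, the inclusive cutoff, the function names, and the toy counts are all assumptions made for illustration.

```python
import math


def fixed_threshold_filter(umi_counts, min_umis=3):
    """Keep cells with at least min_umis UMIs.

    Assumption: the cutoff is inclusive; the source only says "3 UMI per cell".
    """
    return {cell: n for cell, n in umi_counts.items() if n >= min_umis}


def otsu_threshold(values):
    """Return the split value that maximizes between-class variance.

    Values <= threshold form the low class, values > threshold the high class.
    This is plain Otsu over sorted values, not a modified variant.
    """
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    best_threshold, best_score = ordered[0], -1.0
    low_sum = 0.0
    for i in range(1, n):
        low_sum += ordered[i - 1]
        if ordered[i] == ordered[i - 1]:
            continue  # cannot split between two equal values
        w_low, w_high = i / n, (n - i) / n
        mean_low = low_sum / i
        mean_high = (total - low_sum) / (n - i)
        score = w_low * w_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_score, best_threshold = score, ordered[i - 1]
    return best_threshold


def otsu_filter(umi_counts):
    """Keep cells whose log10 UMI count lies above the Otsu split."""
    logs = {cell: math.log10(n) for cell, n in umi_counts.items()}
    cutoff = otsu_threshold(list(logs.values()))
    return {cell: umi_counts[cell] for cell, v in logs.items() if v > cutoff}


# Toy data (made up): four low-count barcodes and three high-count cells.
counts = {"c1": 1, "c2": 2, "c3": 1, "c4": 4, "c5": 900, "c6": 1200, "c7": 1000}
print(sorted(fixed_threshold_filter(counts)))
print(sorted(otsu_filter(counts)))
```

On this toy input the fixed cutoff keeps the 4-UMI barcode, while the Otsu split places it in the low class; which behavior is preferable depends on the shape of the UMI count distribution.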