Problems with the decontamination of metagenomic data or viral data #136

1023011930 · 2023-09-23T10:10:19Z

Dear Sir or Madam, I appreciate your significant contribution to decontaminating sequencing data.
I haven't used decontam yet, but I think most of the examples are based on ASV/OTU relative abundance tables.
For metagenomic data, as I understand it, contaminants can be removed by constructing MAG abundance tables.
But viral sequences are difficult to bin into MAGs, and they are often found as contigs. I would like to ask whether decontam can decontaminate based on contig abundance (is this scientifically sound?).
From what I've read, the article "https://www.nature.com/articles/s41586-020-2192-1" uses Kraken classification data as the basis for decontamination. Is this a good approach?
Thank you very much for any guidance!

1023011930 · 2023-09-23T10:15:31Z

To summarize, my problem is that some metagenomic data, such as viral data, are difficult to bin into "metaOTUs" (they cannot yield the kind of ASV relative abundance table that normal 16S data produce). How should I process these metagenomic data so that they can be used as input for decontam?

benjjneb · 2023-09-26T15:56:01Z

You can use decontam with any feature type that has a relative abundance in each sample. This includes contigs.

1023011930 · 2023-09-27T06:47:49Z

> You can use decontam with any feature type that has a relative abundance in each sample. This includes contigs.

Does this mean that I can use software such as BWA or Bowtie2 to map reads to the contigs and quantify them, then calculate RPM or TPM, and use those values as the relative abundance for decontam? Thank you for your kindness!
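For readers following along, the TPM step asked about here can be sketched as below. This is a minimal illustration of the usual TPM arithmetic only, not part of decontam (which is an R package); the function name `tpm` and the example counts and contig lengths are hypothetical, and the mapped read counts are assumed to come from whatever read mapper is used.

```python
def tpm(read_counts, contig_lengths_bp):
    """Return TPM values for one sample.

    read_counts: mapped reads per contig (hypothetical input).
    contig_lengths_bp: length of each contig in base pairs.
    """
    # Step 1: reads per kilobase, which corrects for contig length.
    rpk = [count / (length / 1000)
           for count, length in zip(read_counts, contig_lengths_bp)]
    total = sum(rpk)
    if total == 0:
        # No mapped reads in this sample: every contig gets zero.
        return [0.0] * len(rpk)
    # Step 2: scale so the sample sums to one million.
    return [value / total * 1_000_000 for value in rpk]


# Hypothetical sample with three contigs.
counts = [100, 300, 50]
lengths = [2000, 3000, 1000]
sample_tpm = tpm(counts, lengths)
print(sample_tpm)
```

Because every sample is scaled to the same total, the TPM values within a sample are comparable to one another in the same way percentages are.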

1023011930 · 2023-09-27T14:43:29Z

As I understand it, a standardized contig abundance table is not the same as a relative abundance table like the one from 16S data.

In a 16S abundance table, each sample sums to 100% and each OTU takes up a portion of that 100%; the percentage (0-100%) it takes up is the data.

Standardized contig abundance tables, on the other hand, generally use the TPM (or read count) of each contig as the data, so their values are not necessarily in the 0-100 range.

It seems to me that these two abundance tables are not the same. May I ask how a contig abundance table is generally handled? If there is a corresponding tutorial or paper, I would be very grateful!
Thank you for your answer.

benjjneb · 2023-10-07T01:16:56Z

TPM is also a relative abundance. You can use it just the same.

A "relative abundance" measure is any metric that informs about the abundance of this relative to that. If the TPM of contig X is double that of contig Y, then X has double the relative abundance (by this measure) of contig Y.
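A small sketch of the point above, under the assumption that TPM values within a sample sum to a fixed total: dividing by that total gives proportions between 0 and 1, and the ratio between any two contigs is unchanged. The helper `to_proportions` and the numbers are hypothetical and only illustrate the arithmetic; they are not decontam code.

```python
def to_proportions(values):
    """Rescale one sample's abundance values so they sum to 1."""
    total = sum(values)
    if total == 0:
        return [0.0 for _ in values]
    return [v / total for v in values]


# Hypothetical TPM values for contigs Y, X and Z in one sample.
tpm_sample = [250_000.0, 500_000.0, 250_000.0]
props = to_proportions(tpm_sample)

# The X-to-Y ratio is the same on both scales.
ratio_tpm = tpm_sample[1] / tpm_sample[0]
ratio_prop = props[1] / props[0]
print(props, ratio_tpm, ratio_prop)
```

The scale (per million, per hundred, or per one) is a constant factor within each sample, so it does not change which contig is more abundant or by how much.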

1023011930 · 2023-10-07T11:56:53Z

> TPM is also a relative abundance. You can use it just the same.
>
> A "relative abundance" measure is any metric that informs about the abundance of this relative to that. If the TPM of contig X is double that of contig Y, then X has double the relative abundance (by this measure) of contig Y.

I think I see what you mean. I will try using standardized contig relative abundance. Thank you very much for your reply!
