TMT data analysis with DEqMS - with multiple set of TMT data sets #9

seongminlab · 2022-11-02T14:42:05Z

Dear DEqMS developer team,

How can I run DEqMS with multiple TMT data sets?

I have 55 samples multiplexed with TMT-11plex across 5 sets.

In this case, how should I run DEqMS?

Thank you

yafeng · 2022-11-03T03:43:27Z

Hi! To analyze data from multiple TMT sets with DEqMS, you first need to know the experimental design.
There are two different situations.

Multiple TMT sets without internal standards.
In this case, combine the samples from all TMT sets into one large protein intensity matrix that includes all samples (remember to log2-transform it before the next steps).
At the design matrix step, add an extra factor:

design = model.matrix(~0+group+TMTset)

"group" is the factor that tells which group each sample belongs to, such as "ctrl" or "treated".
"TMTset" is the factor that tells which TMT set each sample was analyzed in, such as "set1", "set2", ..., "set5".

The "TMTset" factor is meant to account for batch differences between TMT sets. However, it relies on a reasonable experimental design: each TMT set should contain samples from the groups being compared, otherwise group and set effects cannot be separated.

Alternatively, you can use another approach such as ComBat (from the sva package) to remove batch differences between TMT sets instead of the linear-model term described above.
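The design matrix for this first situation can be sketched in base R as follows. This is only an illustration: the layout of 5 sets with 11 channels each matches the question, but the assignment of samples to "ctrl" and "treated" is invented, and a real analysis would use the actual sample annotation.

```r
# Hypothetical layout: 5 TMT sets x 11 channels = 55 samples.
# The group assignment below is invented for illustration only.
TMTset <- factor(rep(paste0("set", 1:5), each = 11))
group  <- factor(rep(rep(c("ctrl", "treated"), length.out = 11), times = 5))

# Every set should contain samples from every group; otherwise the
# group effect cannot be separated from the set (batch) effect.
print(table(group, TMTset))

# One column per group, plus one column for each TMT set except the first.
design <- model.matrix(~ 0 + group + TMTset)
print(colnames(design))
```

With a design like this, the treated versus control comparison would be the contrast grouptreated - groupctrl, set up with limma in the same way as in the DEqMS vignette.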

Multiple TMT sets with internal standards.
Internal standards are aliquots of a pooled sample; they are normally used to account for variation between TMT sets.
In this case, first calculate protein ratios using the intensity of the internal standard as the denominator.
Then combine the protein ratio matrices from the different TMT sets and log2-transform them.
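A minimal sketch of the ratio step, using toy numbers. The column name "pool" for the internal standard, the sample names, and the helper log2_ratio_to_pool are all assumptions made for this example; it also assumes one internal standard channel per set and the same proteins in the same order in both sets.

```r
# Toy protein intensities for two TMT sets (hypothetical values).
# The column named "pool" is the internal standard; the name is an assumption.
set1 <- cbind(s1_ctrl    = c(P1 = 100, P2 = 200, P3 = 400),
              s1_treated = c(P1 = 200, P2 = 200, P3 = 100),
              pool       = c(P1 = 100, P2 = 100, P3 = 200))
set2 <- cbind(s2_ctrl    = c(P1 = 50,  P2 = 100, P3 = 100),
              s2_treated = c(P1 = 100, P2 = 100, P3 = 25),
              pool       = c(P1 = 50,  P2 = 50,  P3 = 50))

# Hypothetical helper: divide every sample column by the internal standard
# of the same set (row by row), then log2-transform the ratios.
log2_ratio_to_pool <- function(x, ref_col = "pool") {
  samples <- setdiff(colnames(x), ref_col)
  log2(x[, samples, drop = FALSE] / x[, ref_col])
}

# cbind() is only safe here because both sets list the same proteins
# in the same order.
stopifnot(identical(rownames(set1), rownames(set2)))
ratios <- cbind(log2_ratio_to_pool(set1), log2_ratio_to_pool(set2))
print(ratios)
```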

The next step is to make a table of PSM counts per protein. This is the same for both situations.
Extract the PSM counts from each TMT set, and use the minimum count across sets as the PSM count of each protein.
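The minimum-count step could look roughly like this in base R. The counts are made up, and treating a protein that is absent from a set as having 0 PSMs in that set is an assumption of this sketch, not something the package enforces.

```r
# Hypothetical PSM counts per protein in three TMT sets (named vectors).
psm_counts <- list(
  set1 = c(P1 = 12, P2 = 3, P3 = 7),
  set2 = c(P1 = 9,  P2 = 5),          # P3 not identified in set2
  set3 = c(P1 = 15, P2 = 1, P3 = 4)
)

proteins <- Reduce(union, lapply(psm_counts, names))

# Proteins x sets matrix; a protein absent from a set gets a count of 0
# (an assumption of this sketch).
count_mat <- sapply(psm_counts, function(cnt) {
  out <- cnt[proteins]
  out[is.na(out)] <- 0
  unname(out)
})
rownames(count_mat) <- proteins

# Minimum PSM count of each protein across all sets.
min_count <- apply(count_mat, 1, min)
print(min_count)
```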

seongminlab · 2022-11-03T04:11:42Z

Hi!

Yes, we are using the second experimental design.

I now have intensity tables normalized against the pooled samples (generally called IRS normalization).

But in this case, can I not use DEqMS with my PSM tables (psm.tsv from FragPipe, 5 files from the experimental sets, or PSM table results from PD)?

In the other case, called "Differential protein expression analysis with DEqMS using a protein table" in the DEqMS tutorial,

a single PSM count table is imported:
```r
library(matrixStats)  # provides rowMins()
# Per-protein PSM count: the minimum across the count columns
psm.count.table = data.frame(count = rowMins(
  as.matrix(df.prot[,count_columns])), row.names = df.prot$Protein.accession)
fit3$count = psm.count.table[rownames(fit3$coefficients),"count"]
fit4 = spectraCounteBayes(fit3)
```
Here fit3$count holds the PSM counts (this is the PSM count table, right?)

that is passed to the function spectraCounteBayes().

But I have 5 PSM count tables from different TMT sets.

You mentioned:

"Extract PSM count from different TMT sets, and use the minimum counts of different sets to assign it as the PSM count of each protein."

Does this mean I should assign the minimum PSM count across the TMT sets to fit3$count?
Is that OK? Some proteins have 0 PSM counts in some TMT sets.

And do these PSM counts not cause problems for the statistical analysis?

Thank you!!

yafeng · 2022-11-03T06:30:53Z

If you have five PSM tables from five TMT sets, you can get a protein table for each set separately by following the tutorial section
"DEqMS analysis using a PSM table".
https://bioconductor.org/packages/release/bioc/vignettes/DEqMS/inst/doc/DEqMS-package-vignette.html#deqms-analysis-using-a-psm-table-isobaric-labelled-data

First log2-transform the PSM tables, then use medianSummary to get a protein matrix for each TMT set separately.

dat.gene.nm = medianSummary(dat.psm.log, group_col = 2, ref_col = c(4, 5) )

"group_col" refers to the column of protein IDs.
"ref_col" refers to the column of your internal standards.
The PSM table "dat.psm.log" should be organized as "Sequence", "Proteins", "reporter intensity 1", "reporter intensity 2" ...

After you get the protein matrix for each TMT set, combine them into one matrix.
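Combining the per-set protein matrices might be sketched like this. The two small matrices stand in for per-set protein matrices such as those returned by medianSummary; their values and sample names are hypothetical, and dropping proteins that are missing from any set is an assumption of this example rather than a requirement.

```r
# Stand-ins for the per-set protein matrices (hypothetical log2 values).
prot_set1 <- matrix(c(0.1, -0.2, 0.3, 0.0), nrow = 2,
                    dimnames = list(c("P1", "P2"), c("set1_s1", "set1_s2")))
prot_set2 <- matrix(c(0.5, 0.4, -0.1, 0.2), nrow = 2,
                    dimnames = list(c("P2", "P3"), c("set2_s1", "set2_s2")))
mats <- list(prot_set1, prot_set2)

# All proteins seen in at least one set.
all_proteins <- Reduce(union, lapply(mats, rownames))

# Align every matrix to the same protein order; proteins missing from a
# set become rows of NA.
combined <- do.call(cbind, lapply(mats, function(m) {
  aligned <- m[match(all_proteins, rownames(m)), , drop = FALSE]
  rownames(aligned) <- all_proteins
  aligned
}))
print(combined)

# Keep only proteins quantified in every set (an assumption of this sketch).
complete <- combined[complete.cases(combined), , drop = FALSE]
print(complete)
```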

"Does this mean I should assign the minimum PSM count across the TMT sets to fit3$count?
Is that OK? Some proteins have 0 PSM counts in some TMT sets."

If the minimum PSM count is 0, you can add a pseudo count of 1:
fit3$count = fit3$count +1
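A small illustration of why a zero count is a problem, on the assumption (my reading, not stated above) that the variance is modelled against the logarithm of the PSM count. The counts are hypothetical. Note that the line above adds 1 to the count of every protein, not only to the zeros.

```r
# Hypothetical minimum PSM counts; P3 was not seen in one TMT set.
min_count <- c(P1 = 9, P2 = 1, P3 = 0)

print(log2(min_count))       # P3 becomes -Inf
print(log2(min_count + 1))   # all values are finite
```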
