The immune checkpoint blockade-related circRNA signature (ICBcircSig) is a score model based on progression-free survival-related circRNAs that predicts immunotherapy efficacy. Here, we provide code for identifying circRNAs in the ICB samples, identifying differentially expressed circRNAs between responders and non-responders, and developing the ICBcircSig score model by machine learning.
Four tools, CIRI2, find_circ, CIRCexplorer2, and circRNA_finder, were applied to identify circRNAs with default settings. After data quality was assessed with FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/), reads that passed the thresholds were aligned to the reference genome (GRCh38) using HISAT2 with default settings to obtain mapped and unmapped reads in BAM files. Unmapped reads were retrieved from the BAM files with samtools and converted to FASTQ format with bedtools bamtofastq. Each program was run with default parameters, and the identified circRNAs were annotated with GENCODE v28. CircRNAs identified by at least two tools with ≥ 2 back-splice reads were retained for further analysis.
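The retention rule above can be sketched in Python as follows. This is a minimal illustration and not the repository's own code. The input layout (one dictionary of circRNA ID to back-splice read count per tool), the function name `consensus_circrnas`, and the circRNA IDs are hypothetical. It also assumes the rule means that each supporting tool must itself report at least two back-splice reads.

```python
from collections import defaultdict

MIN_TOOLS = 2  # a circRNA must be supported by at least two tools
MIN_READS = 2  # assumption: each supporting tool reports >= 2 back-splice reads


def consensus_circrnas(calls_by_tool, min_tools=MIN_TOOLS, min_reads=MIN_READS):
    """Return the circRNA IDs supported by enough tools.

    calls_by_tool maps a tool name to {circRNA ID: back-splice read count}.
    This layout is hypothetical; real tool outputs need parsing first.
    """
    support = defaultdict(set)
    for tool, calls in calls_by_tool.items():
        for circ_id, reads in calls.items():
            if reads >= min_reads:
                support[circ_id].add(tool)
    return {circ_id for circ_id, tools in support.items() if len(tools) >= min_tools}


# Toy calls with made-up coordinates and read counts.
example_calls = {
    "CIRI2": {"chr1:100-500:+": 5, "chr2:10-90:-": 1},
    "find_circ": {"chr1:100-500:+": 3, "chr2:10-90:-": 4},
    "CIRCexplorer2": {"chr3:7-70:+": 2},
    "circRNA_finder": {"chr2:10-90:-": 2, "chr3:7-70:+": 1},
}
retained = consensus_circrnas(example_calls)
```

In this toy input, the chr3 candidate is dropped because only one tool reports it with at least two back-splice reads.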
To identify circRNAs differentially expressed between responder and non-responder samples at the pre-treatment (PRE) and early-during-treatment (EDT) time points, respectively, we used a linear mixed-effects (LME) model, which allows for nested random effects (each individual sample) and accounts for potential confounding factors, fitted with the lme function in R. CircRNAs with P-value < 0.05 and |log2(fold change)| ≥ 0.5 were considered statistically significant.
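The significance cutoffs above can be applied to per-circRNA model results as in the sketch below. It covers only the thresholding step and does not fit the mixed-effects model. The record fields `p_value` and `log2_fc` and both function names are hypothetical.

```python
P_CUTOFF = 0.05    # P-value must be strictly below this
LFC_CUTOFF = 0.5   # |log2(fold change)| must be at least this


def is_significant(p_value, log2_fc, p_cutoff=P_CUTOFF, lfc_cutoff=LFC_CUTOFF):
    """Apply the two cutoffs stated in the text to one circRNA."""
    return p_value < p_cutoff and abs(log2_fc) >= lfc_cutoff


def select_differential(results):
    """Keep the records that pass both cutoffs.

    results is a list of dicts with hypothetical keys
    'circ_id', 'p_value' and 'log2_fc'.
    """
    return [r for r in results if is_significant(r["p_value"], r["log2_fc"])]
```

Because the fold-change cutoff is on the absolute value, both up- and down-regulated circRNAs pass the filter.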
We used a machine learning-based algorithm to construct ICBcircSig. Briefly, (i) we performed univariate survival analysis to identify prognosis-relevant circRNAs by assessing the association between progression-free survival (PFS) and circRNA expression. (ii) We fitted a LASSO Cox regression model with the cv.glmnet function in the R package glmnet, setting the seed to 123 and using deviance as the cross-validation loss with 5 folds. We then retained the circRNAs from (i) with a LASSO coefficient > 0 as the optimal combination. The final signature, named “ICBcircSig”, includes the circRNAs from (ii) that were significant (p < 0.05) in multivariate Cox analysis. (iii) The ICBcircSig score of each sample was calculated from the expression values and the multivariate Cox regression coefficients as follows: ICBcircSig score = 1.001 × circTMTC3 + 1.048 × circFAM117B.
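The score formula in step (iii) can be computed per sample as in the short sketch below. The two coefficients are taken from the text. The function name `icbcircsig_score`, the dictionary-based input, and the expression values in the example are hypothetical.

```python
# Multivariate Cox regression coefficients stated in the text.
COEFFICIENTS = {"circTMTC3": 1.001, "circFAM117B": 1.048}


def icbcircsig_score(expression):
    """Weighted sum of the two signature circRNAs for one sample.

    expression maps circRNA name to its expression value in the sample
    and must contain both signature circRNAs.
    """
    return sum(coef * expression[name] for name, coef in COEFFICIENTS.items())


# Made-up expression values, for illustration only.
sample = {"circTMTC3": 2.0, "circFAM117B": 1.0}
score = icbcircsig_score(sample)  # 1.001 * 2.0 + 1.048 * 1.0
```

A sample missing either circRNA raises a `KeyError`, which makes incomplete expression tables fail loudly rather than score silently.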