The 'group1' or 'group2' column in the 'daa_results_df' data frame contains more than one group. Please filter each to contain only one group. #72

Open
saassli21 opened this issue Nov 16, 2023 · 6 comments

Comments

@saassli21

saassli21 commented Nov 16, 2023

Hello @cafferychen777,
I have two datasets. I tested the script from #65 on the first one and it worked well, because when I run this command
(daa_sub_method_results_df1 <- daa_results_df %>% filter(p_adjust < 0.05) %>% slice(1:29)) there is only one group in group1 (P1) and one group in group2 (CONTROL).
However, when I ran the same script on my second dataset:

metadata <- read_delim("metadata_exp3.txt", delim = "\t", escape_double = FALSE, trim_ws = TRUE , comment = "#")

kegg_abundance <- ko2kegg_abundance("pred_metagenome_unstrat_exp3.tsv")

daa_results_df <- pathway_daa(abundance = kegg_abundance, metadata = metadata, group = "STATUS", daa_method = "LinDA", select = NULL, p.adjust = "BH" , reference = "T1")

daa_sub_method_results_df1 <- daa_results_df %>% filter(p_adjust < 0.05) %>% slice(1:29)

max(daa_sub_method_results_df1$p_adjust)

daa_annotated_sub_method_results_df1 <- pathway_annotation(pathway = "KO", daa_results_df = daa_sub_method_results_df1, ko_to_kegg = TRUE)

p <- pathway_errorbar(abundance = kegg_abundance, daa_results_df = daa_annotated_sub_method_results_df1, Group = metadata$STATUS, p_values_threshold = 0.05, order = "pathway_class", select = NULL, ko_to_kegg = TRUE, p_value_bar = TRUE, colors = NULL, x_lab = "pathway_name")

I get this error: The 'group1' or 'group2' column in the 'daa_results_df' data frame contains more than one group. Please filter each to contain only one group.

Error in `levels<-`(`*tmp*`, value = as.character(levels)) :
factor level [5] is duplicated
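
For context, R raises the second message whenever a factor is built with a repeated entry in its levels. The snippet below reproduces that message in isolation with base R. Whether this is also what happens inside pathway_errorbar() is only an assumption, and the group labels are placeholders.

```r
# Placeholder group labels; "T1" appears twice in the levels on purpose.
groups <- c("T1", "T2", "T3")
dup_levels <- c("T1", "T2", "T3", "T1")

# factor() refuses duplicated levels, so capture the error message.
msg <- tryCatch(
  {
    factor(groups, levels = dup_levels)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Dropping the repeats with unique() gives valid levels.
ok <- factor(groups, levels = unique(dup_levels))
print(levels(ok))
```

If the levels are derived from a results table that holds several comparisons, repeated labels like this are one plausible source of the failure.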

This error is due to the presence of multiple groups in group1.
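
A quick way to confirm this is to list the distinct group1/group2 pairs in the results table. The sketch below uses a made-up stand-in for daa_results_df (only the column names feature, group1, group2, and p_adjust come from this thread; all values are invented) and base R so it runs without extra packages.

```r
# Toy stand-in for the table returned by pathway_daa(); values are made up.
daa_results_df <- data.frame(
  feature  = c("ko00010", "ko00020", "ko00030", "ko00010", "ko00020"),
  group1   = c("T2", "T2", "T2", "T3", "T3"),
  group2   = c("T1", "T1", "T1", "T1", "T1"),
  p_adjust = c(0.01, 0.20, 0.03, 0.04, 0.002),
  stringsAsFactors = FALSE
)

# One row per distinct comparison present in the results.
comparisons <- unique(daa_results_df[, c("group1", "group2")])
print(comparisons)

# The error message in this issue corresponds to either count exceeding 1.
n_group1 <- length(unique(daa_results_df$group1))
n_group2 <- length(unique(daa_results_df$group2))
more_than_one_pair <- n_group1 > 1 || n_group2 > 1
print(more_than_one_pair)
```

With dplyr loaded, daa_results_df %>% distinct(group1, group2) gives the same list of pairs.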

I tried the following. Is it correct?

metadata <- read_delim("metadata_exp3.txt", delim = "\t", escape_double = FALSE, trim_ws = TRUE, comment = "#")

kegg_abundance <- ko2kegg_abundance("pred_metagenome_unstrat_exp3.tsv")

daa_results_df <- pathway_daa(abundance = kegg_abundance, metadata = metadata, group = "STATUS", daa_method = "LinDA", select = NULL, p.adjust = "BH", reference = "T1")

# Get unique groups in 'group1'
unique_groups_group1 <- unique(daa_results_df$group1)

# Initialize an empty list to store results
results_list <- list()

# Iterate over each unique group
for (selected_group in unique_groups_group1) {

  # Filter the data for the current group
  current_group_results <- daa_results_df %>% filter(group1 == selected_group)

  # Filter based on p_adjust and slice
  filtered_slice <- current_group_results %>% filter(p_adjust < 0.05) %>% slice(1:29)

  # Annotate pathway results using KO to KEGG conversion
  daa_annotated_results_df <- pathway_annotation(pathway = "KO", daa_results_df = filtered_slice, ko_to_kegg = TRUE)

  # Generate pathway error bar plot
  p <- pathway_errorbar(abundance = kegg_abundance, daa_results_df = daa_annotated_results_df, Group = metadata$STATUS, p_values_threshold = 0.05, order = "pathway_class", select = NULL, ko_to_kegg = TRUE, p_value_bar = TRUE, colors = NULL, x_lab = "pathway_name")

  # Generate PCA plot
  p1 <- pathway_pca(kegg_abundance, metadata, "STATUS")

  # Generate pathway heatmap
  p2 <- pathway_heatmap(kegg_abundance %>% rownames_to_column("feature") %>% filter(feature %in% daa_annotated_results_df$feature) %>% column_to_rownames("feature"), metadata, "STATUS")

  # Save the plots to PNG files
  ggsave(paste0("pathway_errorbar_", selected_group, ".png"), p, width = 30, height = 20, units = "in", dpi = 300)
  ggsave(paste0("pathway_pca_", selected_group, ".png"), p1, width = 30, height = 20, units = "in", dpi = 300)
  ggsave(paste0("pathway_heatmap_", selected_group, ".png"), p2, width = 30, height = 20, units = "in", dpi = 300)

  # Save the results in the list
  results_list[[selected_group]] <- list(daa_annotated_results_df = daa_annotated_results_df, p = p, p1 = p1, p2 = p2)
}
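
The filtering step of a loop like this can be tried on its own, without the plotting calls. The sketch below is a base R variant under stated assumptions: the table is an invented stand-in with the column names used in this thread, rows are split by the full group1/group2 pair rather than by group1 alone, and rows are sorted by p_adjust before the first 29 are kept (the loop above keeps them in their existing order).

```r
# Toy stand-in for the table returned by pathway_daa(); values are made up.
daa_results_df <- data.frame(
  feature  = c("ko00010", "ko00020", "ko00030", "ko00010", "ko00020"),
  group1   = c("T2", "T2", "T2", "T3", "T3"),
  group2   = c("T1", "T1", "T1", "T1", "T1"),
  p_adjust = c(0.01, 0.20, 0.03, 0.04, 0.002),
  stringsAsFactors = FALSE
)

# Split on the full pair so each piece holds exactly one group1 and one group2.
pair_label <- paste(daa_results_df$group1, daa_results_df$group2, sep = "_vs_")
by_pair <- split(daa_results_df, pair_label)

# Per comparison: keep significant rows, sort by adjusted p, take at most 29.
significant_by_pair <- lapply(by_pair, function(df) {
  df <- df[which(df$p_adjust < 0.05), , drop = FALSE]
  df <- df[order(df$p_adjust), , drop = FALSE]
  head(df, 29)
})

print(names(significant_by_pair))
print(significant_by_pair[["T3_vs_T1"]])
```

Each element of significant_by_pair then plays the role of filtered_slice in the loop. Note that the PCA plot in the loop does not depend on selected_group, so it could be generated once outside the loop.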

@saassli21 saassli21 reopened this Nov 17, 2023
@cafferychen777
Owner

Hello,

Thank you for reaching out with your issue. To better assist you in resolving the problem, it would be very helpful if you could share your datasets with me. Having access to the actual data will allow me to replicate the error and provide a more accurate solution. If the datasets contain sensitive information, please ensure they are anonymized before sharing.

You can share the data by uploading it to a secure location and providing a download link, or if the datasets are not too large, you could also send them as email attachments. Please let me know the most convenient method for you.

Looking forward to your response so we can address this issue promptly.

Best regards,
Chen YANG

@saassli21
Author

Thank you very much. Can I have your email address, please?

@cafferychen777
Owner

@ivanllampy

Hi all,

I encountered the same error with my dataset, which has three groups, using LinDA as the DAA method.
Does pathway_errorbar() limit the number of groups that can be displayed?
The code works for me if I subset my dataset to only two groups.

Thanks in advance!
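
For anyone trying the same workaround, here is a minimal base R sketch of subsetting to two groups before running the DAA. Everything in it is an assumption for illustration: the toy metadata and abundance table, the sample_name column, and the group labels. Only the STATUS column name comes from the original post.

```r
# Toy metadata: six samples in three groups (all values are made up).
metadata <- data.frame(
  sample_name = paste0("S", 1:6),
  STATUS = rep(c("T1", "T2", "T3"), each = 2),
  stringsAsFactors = FALSE
)

# Toy abundance table: pathways in rows, samples in columns.
abundance <- as.data.frame(matrix(
  1:12,
  nrow = 2,
  dimnames = list(c("ko00010", "ko00020"), metadata$sample_name)
))

# Keep only the two groups to compare.
keep_groups <- c("T1", "T2")
metadata_sub <- metadata[metadata$STATUS %in% keep_groups, , drop = FALSE]

# Keep the matching sample columns, in the same order as the metadata rows.
abundance_sub <- abundance[, metadata_sub$sample_name, drop = FALSE]

# Sanity check before passing both objects on to the DAA step.
stopifnot(identical(colnames(abundance_sub), metadata_sub$sample_name))
print(dim(abundance_sub))
```

The subset objects would then replace kegg_abundance and metadata in the pathway_daa() and pathway_errorbar() calls from the first post, and Group would need to come from metadata_sub so the lengths match.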

@cmetadea

Hi, are there any updates on this?
If I subset the samples, I get two different graphs. Is there any way to put multiple groups on the same graph, preferably sorted by p-value?
Thanks
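
One possible workaround, sketched here outside of ggpicrust2, is to pool the significant rows from all comparisons into a single table sorted by adjusted p-value and plot that table directly. This is not a feature of pathway_errorbar(). The data below is an invented stand-in that only reuses the column names from this thread, and the plot is a plain bar chart of -log10(p_adjust), drawn only if ggplot2 is installed.

```r
# Toy stand-in for the table returned by pathway_daa(); values are made up.
daa_results_df <- data.frame(
  feature  = c("ko00010", "ko00020", "ko00030", "ko00010", "ko00020"),
  group1   = c("T2", "T2", "T2", "T3", "T3"),
  group2   = c("T1", "T1", "T1", "T1", "T1"),
  p_adjust = c(0.01, 0.20, 0.03, 0.04, 0.002),
  stringsAsFactors = FALSE
)

# Label each row with its comparison so rows stay distinguishable once pooled.
daa_results_df$comparison <- paste(
  daa_results_df$group1, daa_results_df$group2, sep = " vs "
)

# Keep significant rows from every comparison and sort by adjusted p-value.
combined <- daa_results_df[which(daa_results_df$p_adjust < 0.05), , drop = FALSE]
combined <- combined[order(combined$p_adjust), , drop = FALSE]
rownames(combined) <- NULL
print(combined)

# Optional plot: one bar per feature/comparison, smallest p-value at the top.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  combined$label <- paste(combined$feature, combined$comparison, sep = " | ")
  combined$label <- factor(combined$label, levels = rev(combined$label))
  p_combined <- ggplot2::ggplot(
    combined,
    ggplot2::aes(x = -log10(p_adjust), y = label, fill = comparison)
  ) +
    ggplot2::geom_col() +
    ggplot2::labs(x = "-log10(adjusted p)", y = NULL)
}
```

This gives up the abundance bars and the pathway annotation that pathway_errorbar() draws, so it is only a rough overview across comparisons.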

@mvirgilio

Same problem for me,
cheers,
M.
