Question: how to gain annotation from KEGG IDs? #120

Open
AMosca96 opened this issue Oct 5, 2024 · 1 comment
Labels
bug Something isn't working

Comments


AMosca96 commented Oct 5, 2024

Hi there,

First of all, thank you for this really helpful package!

I converted the KO IDs to KEGG IDs using the "ko2kegg_abundance" function because I'm more interested in KEGG IDs.
I'd therefore like to retrieve the functional annotation of these IDs. I used "pathway_annotation" to collect the annotation of the KO IDs, but the same function doesn't work with an input file of KEGG IDs (the one I got from "ko2kegg_abundance"). Is there a way to obtain annotations for KEGG IDs? I know that the differential abundance analysis would give me the functional annotation of the KEGG IDs; however, I'm more interested in all the KEGG IDs.

Thank you again for your support!

AMosca96 added the bug label Oct 5, 2024
cafferychen777 (Owner) commented

Hi @AMosca96,

Thank you for your interest in ggpicrust2! Based on the code in pathway_annotation.R and ko2kegg_abundance.R, I can suggest a solution for obtaining KEGG pathway annotations:

  1. Direct Annotation Method
# Convert your KO abundance to KEGG pathway abundance
kegg_abundance <- ko2kegg_abundance(file = "your_file.tsv")

# Create a data frame with your KEGG IDs
kegg_ids_df <- data.frame(
    feature = rownames(kegg_abundance),
    p_adjust = 0  # Dummy value; 0 also passes a p_adjust < 0.05 filter if pathway_annotation() applies one
)

# Get KEGG annotations
kegg_annotations <- pathway_annotation(
    daa_results_df = kegg_ids_df,
    pathway = "KO",
    ko_to_kegg = TRUE
)
  2. Alternative Approach Using Internal Functions
# If you already have KEGG IDs
kegg_ids <- rownames(kegg_abundance)

# Create a data frame with required columns
kegg_df <- data.frame(
    feature = kegg_ids,
    p_adjust = 0,  # Dummy value; 0 also passes a p_adjust < 0.05 filter if one is applied
    stringsAsFactors = FALSE
)

# Process KEGG annotations
annotated_df <- process_kegg_annotations(kegg_df)

The resulting data frame will contain:

  • pathway_name
  • pathway_description
  • pathway_class
  • pathway_map

Example output:

head(annotated_df)
#   feature  pathway_name  pathway_description      pathway_class  pathway_map
# 1 ko00010  Glycolysis    Carbohydrate metabolism  Metabolism     map00010
# ...

Note:

  • The annotation process connects to the KEGG database, so an internet connection is required
  • Some KEGG pathways might return NA if they're deprecated or not found
  • The process includes automatic retries for failed requests
  • Progress is logged by default (can be disabled with options(ggpicrust2.verbose = FALSE))
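
If only the pathway names are needed for every KEGG ID, one possible shortcut is to fetch the full ID-to-name table from KEGG once and join it locally. The following is a minimal sketch and not part of ggpicrust2: `annotate_pathway_ids` and `toy_lookup` are hypothetical names introduced here, and the commented-out lines assume the KEGGREST package is installed and that `kegg_abundance` has pathway IDs as row names. IDs missing from the lookup come back as NA, in line with the note above.

```r
# Join pathway IDs against a named lookup vector
# (names = pathway IDs, values = pathway titles).
annotate_pathway_ids <- function(ids, lookup) {
  # Some KEGG responses prefix names with "path:"; strip it if present.
  names(lookup) <- sub("^path:", "", names(lookup))
  data.frame(
    feature = ids,
    pathway_name = unname(lookup[ids]),  # NA where an ID has no match
    stringsAsFactors = FALSE
  )
}

# Offline illustration with a small hand-written lookup.
toy_lookup <- c(
  ko00010 = "Glycolysis / Gluconeogenesis",
  ko00020 = "Citrate cycle (TCA cycle)"
)
toy_result <- annotate_pathway_ids(
  c("ko00010", "ko00020", "ko99999"),
  toy_lookup
)
print(toy_result)

# With network access, the real lookup could come from KEGGREST
# (assumption: KEGGREST is installed):
# lookup <- KEGGREST::keggList("pathway", "ko")
# annotated <- annotate_pathway_ids(rownames(kegg_abundance), lookup)
```

This only returns pathway names; the description, class and map columns listed above would still need the per-pathway requests.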

Let me know if you need any clarification or run into any issues!

Best regards,
Chen Yang

P.S. For bulk annotations, consider using the cache system to improve performance:

# Set cache options
options(ggpicrust2.use_cache = TRUE)
options(ggpicrust2.cache_dir = "path/to/cache")
