Add Count of INDRA Statements for Individual Terms #10

Open
cannin opened this issue Jul 10, 2020 · 4 comments
Labels
enhancement New feature or request

Comments

@cannin
Owner

cannin commented Jul 10, 2020

Add another column, INDRA_QUERY_TERM_STATEMENT_COUNT, using the following example code:

# GET QUERY GROUNDING ----
import requests
from urllib.parse import urljoin
from indra.sources import indra_db_rest

grounding_service_url = 'http://grounding.indra.bio/'

# Example query terms; the second assignment overrides the first
txt = 'BRAF'
txt = 'topotecan'

resp = requests.post(urljoin(grounding_service_url, 'ground'), json={'text': txt})
grounding_results = resp.json()
grounding_results 

# TODO: Test if grounding_results has entries
term_id = grounding_results[0]['term']['id']
term_db = grounding_results[0]['term']['db']
term = term_id + '@' + term_db
term

# Get statements for query term 
out = indra_db_rest.get_statements(agents=[term])
out.statements
len(out.statements)
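
One way the TODO above could be handled is sketched below. It assumes the grounding service returns a list that is empty when nothing matches, with the best match first, as the indexing in the example implies. The helper name term_from_grounding is made up for illustration and is not part of INDRA.

```python
def term_from_grounding(grounding_results):
    """Return 'ID@DB' for the first grounding result, or None if there is none.

    Hypothetical helper: assumes each entry looks like
    {'term': {'id': ..., 'db': ...}, ...}, as in the example above.
    """
    if not grounding_results:
        # Nothing was grounded, so there is no term to query with
        return None
    top_term = grounding_results[0]['term']
    return top_term['id'] + '@' + top_term['db']
```

When the helper returns None, the INDRA_QUERY_TERM_STATEMENT_COUNT column could be set to 0 or left empty; which of the two is preferable is a design choice not settled in this issue.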

cannin added the enhancement (New feature or request) label Jul 10, 2020
cannin added this to the Metadata Additions milestone Jul 10, 2020

@cannin
Owner Author

cannin commented Jul 11, 2020

The harder challenge: return only statements whose evidence comes from specific source_apis (e.g., reach), like this one:

        "evidence": [
            {
                "source_api": "reach",
                "pmid": "28972042",
                "text": "TMCO1 dysregulates cell cycle progression via suppression of the AKT pathway, and S60 of the TMCO1 protein is crucial for its tumor suppressor roles.",
                "annotations": {
                    "found_by": "Negative_activation_syntax_1_verb",
                    "agents": {
                        "raw_text": [
                            "TMCO1",
                            "cell cycle"
                        ]
                    },

I converted the statements to JSON with stmts_to_json ('from indra.statements.statements import stmts_to_json'). We might try to submit a PR related to this.
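
As a rough, dependency-free sketch of the filtering described above, the JSON form of the statements could be filtered in plain Python. This assumes each statement dict carries an "evidence" list whose entries have a "source_api" key, as in the fragment above. The function name filter_by_source_api and the sample data are made up for illustration.

```python
def filter_by_source_api(stmts_json, source_apis):
    """Keep statement dicts that have at least one evidence from source_apis."""
    wanted = set(source_apis)
    return [
        stmt for stmt in stmts_json
        if any(ev.get('source_api') in wanted for ev in stmt.get('evidence', []))
    ]


# Made-up sample data in the same shape as the fragment above
sample = [
    {'type': 'Inhibition', 'evidence': [{'source_api': 'reach'}]},
    {'type': 'Activation', 'evidence': [{'source_api': 'biopax'}]},
    {'type': 'Complex', 'evidence': [{'source_api': 'sparser'},
                                     {'source_api': 'reach'}]},
    {'type': 'Phosphorylation'},  # no evidence list at all
]

reach_only = filter_by_source_api(sample, ['reach'])
print(len(reach_only))
```

A statement is kept here if any one of its evidences matches; requiring every evidence to match would be a different policy and would use all() instead of any().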

@cannin
Owner Author

cannin commented Jul 11, 2020

You might want to message the INDRA team to see whether they already have a function somewhere that filters statements based on their properties; it should be a fairly independent function.

I have tackled similar challenges with JSONPath (https://github.com/h2non/jsonpath-ng), though I am not sure it will work here. You might want to mention this as well, since INDRA might not want the extra dependency. Example code:

import json
from jsonpath_ng import jsonpath
from jsonpath_ng.ext import parse

def get_jsonpath(json_file, json_str, jsonpath_expr_str):
    # Load JSON from json_str when no file is given, otherwise from json_file
    if json_file is None:
        dat = json.loads(json_str)
    else:
        with open(json_file) as f:
            dat = json.load(f)

    jsonpath_expr = parse(jsonpath_expr_str)

    results = jsonpath_expr.find(dat)

    # Collect the value of every match
    results_list = []

    for match in results:
        results_list.append(match.value)

    return results_list

if __name__ == "__main__":

    # json_file = 'covid19_model_2020-03-22-03-16-47.json'
    # jsonpath_expr_str = "$..text_refs"
    # jsonpath_expr_str = "$..stmts[?(@.belief == 1)]"
    # jsonpath_expr_str = "$..stmts[?(@.stmt.type == 'IncreaseAmount')]"
    # jsonpath_expr_str = "$..stmts[?(@.stmt.obj.db_refs.UP == 'P16278')]"
    # jsonpath_expr_str = "$..stmts[?(@.stmt.evidence[*].text_refs.PMCID == 'PMC331007')]"

    json_file = None
    json_str = '[{"id": "a", "foo": [{"baz": 1}, {"baz": 2}]}, {"id": "b", "foo": [{"baz": 3}, {"baz": 4}]}]'
    # Example expressions; the second assignment overrides the first
    jsonpath_expr_str = '$..foo[*].baz'
    jsonpath_expr_str = '$[?(@.id == "a")].foo'

    print(get_jsonpath(json_file, json_str, jsonpath_expr_str))

@cannin
Owner Author

cannin commented Jul 11, 2020

This JSONPath expression retrieves what I'd like:

jsonpath_expr_str = "$[?(@.evidence[*].source_api == 'reach')]"

@PritiShaw
Collaborator

This JSONPath expression retrieves what I'd like:

jsonpath_expr_str = "$[?(@.evidence[*].source_api == 'reach')]"

Hi Mentor,
I have received a reply from Ben regarding our query (sorgerlab/indra#1141).
He pointed to the method indra.tools.assemble_corpus.filter_evidence_source(stmts_in, source_apis, policy='one', **kwargs).

This is also implemented in the INDRA REST API, documented at http://api.indra.bio:8000/ under the "Preassembly" heading.
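
Putting the pieces of this thread together, a source-restricted count might look roughly like the sketch below. It relies on indra_db_rest.get_statements and the filter_evidence_source signature quoted above; the wrapper name count_statements_from_sources is made up for illustration, and the code has not been run against the live INDRA DB.

```python
def count_statements_from_sources(term, source_apis, policy='one'):
    """Count INDRA DB statements for term that have evidence from source_apis.

    Hypothetical wrapper: term is an 'ID@DB' string as built earlier in this
    issue, and source_apis is a list such as ['reach'].
    """
    # Imported inside the function so that defining it does not require INDRA
    from indra.sources import indra_db_rest
    from indra.tools import assemble_corpus as ac

    processor = indra_db_rest.get_statements(agents=[term])
    filtered = ac.filter_evidence_source(processor.statements, source_apis,
                                         policy=policy)
    return len(filtered)


# Example call (needs INDRA installed and network access):
# count_statements_from_sources(term, ['reach'])
```

The policy argument is passed through unchanged; its default of 'one' is taken from the signature quoted above.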
