Add preprint use case tutorials #48

Merged: 8 commits, Sep 29, 2024

Lilly-May
Collaborator

This PR adds the three use cases from the pertpy preprint to the tutorials. I created a separate notebook for each of the Norman, McFarland, and Zhang use cases. For the Zhang use case, the Dialogue section is still missing. I’ll add that once this PR is merged, as I need to run it on the cluster.


Lilly-May marked this pull request as ready for review September 24, 2024 12:54
Lilly-May requested a review from Zethson September 24, 2024 12:54
Lilly-May mentioned this pull request Sep 26, 2024
3 tasks

review-notebook-app bot commented Sep 27, 2024

Zethson commented on 2024-09-27T15:22:38Z
----------------------------------------------------------------

Can you add one more sentence of context on what the goal of this analysis is and why it matters, please?


Lilly-May commented on 2024-09-29T10:01:05Z
----------------------------------------------------------------

I added:

Overall, this tutorial demonstrates how to use pertpy to derive insights into perturbation responses in a complex dataset comprising various cell lines and drugs, with the goal of better understanding the molecular mechanisms underlying drug responses and how they vary across cell lines.


Zethson commented on 2024-09-27T15:22:39Z
----------------------------------------------------------------

Should we show obs here once? And maybe describe the types of perturbations?


Lilly-May commented on 2024-09-29T10:05:33Z
----------------------------------------------------------------

Done!


Zethson commented on 2024-09-27T15:22:40Z
----------------------------------------------------------------

Line #6.    #sc.pp.highly_variable_genes(adata, subset=True, n_top_genes=4000)

Why is it commented out? Either it was run or it wasn't.


Lilly-May commented on 2024-09-29T10:06:21Z
----------------------------------------------------------------

Yes, true - I deleted it.


Zethson commented on 2024-09-27T15:22:41Z
----------------------------------------------------------------

One sentence of motivation on why this is useful and why we're doing it would be awesome.


Lilly-May commented on 2024-09-29T10:10:45Z
----------------------------------------------------------------

I changed the paragraph to:

Datasets often come with metadata that can enable more detailed analyses, such as those presented in this tutorial. However, the extent of metadata can vary greatly between datasets, and each dataset usually has its own metadata format. To address this, there are databases that allow for standardized metadata annotation. Pertpy offers multiple metadata classes to query these databases and annotate your data. Here, we will use the CellLine and Moa classes to annotate the cell lines and mechanisms of action (MOA) of the drugs, respectively.


Zethson commented on 2024-09-27T15:22:42Z
----------------------------------------------------------------

Ensure that DepMap_ID is available in 'adata.obs'.

It is, right? Is this a misleading message?


Lilly-May commented on 2024-09-29T10:15:46Z
----------------------------------------------------------------

Yes, if the reference_id parameter is not set, DepMap_ID is used by default. The message you're referencing is part of the warning in the line above. Personally, I would remove all \n characters from the warning so that the formatting is cleaner when printed. What do you think?

Zethson commented on 2024-09-29T10:21:09Z
----------------------------------------------------------------

Yeah, so removing the line breaks helps. But I think that the message "ensure that it's available" should not be written in the first place. The code should just check whether this column is inside the object and if not -> error with a message. Else, don't print this useless logging message.

In general, the UX is suboptimal here. There should simply be better inference of parameters and just defaults.

Lilly-May commented on 2024-09-29T13:06:44Z
----------------------------------------------------------------

I opened an issue (scverse/pertpy#664) because I think this requires a bit more discussion and possibly a general revision of the metadata UX.
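
As a rough illustration of the check-then-error behavior suggested in this thread, here is a minimal sketch. It is not pertpy's implementation: the helper name `require_obs_column` is hypothetical, a plain pandas DataFrame stands in for `adata.obs`, and the toy values are made up. Only the `DepMap_ID` default for `reference_id` comes from the discussion above.

```python
import pandas as pd


def require_obs_column(obs: pd.DataFrame, reference_id: str = "DepMap_ID") -> None:
    """Raise a clear error if the lookup column is missing; stay silent otherwise.

    Hypothetical helper for illustration only, not part of pertpy.
    """
    if reference_id not in obs.columns:
        raise ValueError(
            f"Column '{reference_id}' not found in adata.obs. "
            f"Available columns: {list(obs.columns)}. "
            "Pass the column holding the cell line identifiers via reference_id."
        )


# A plain DataFrame stands in for adata.obs; the values are toy data.
obs = pd.DataFrame(
    {
        "DepMap_ID": ["ACH-000001", "ACH-000002"],
        "perturbation": ["control", "drug_a"],
    }
)

# The column is present, so nothing is logged and nothing is raised.
require_obs_column(obs)
```

With this shape, a user only sees a message when something is actually wrong, and the message names the missing column and the columns that are available.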


Zethson commented on 2024-09-27T15:22:43Z
----------------------------------------------------------------

This might require one comment of justification? Maybe not necessary idk. Up to you.


Lilly-May commented on 2024-09-29T10:17:58Z
----------------------------------------------------------------

I've added one.


Zethson commented on 2024-09-27T15:22:44Z
----------------------------------------------------------------

Ideally we wouldn't tell the users to "ensure" but rather we check and if it's not the case, we print a warning. Can you either fix this or open an issue, please?


Lilly-May commented on 2024-09-29T11:21:55Z
----------------------------------------------------------------

This is what is done. Here is the respective section in the code:

if sum(adata.obsm[metadata_key].columns != adata.var.index.values) > 0:
    logger.warning(
        "Column name of metadata is not the same as the index of adata.var. Ensure that the genes are in the same order."
    )

So, we could either change the warning and make the message clearer, or raise an error and let the method fail when this happens?

Zethson commented on 2024-09-29T11:23:38Z
----------------------------------------------------------------

Yeah if it's critical for the usage of the function it should error. Else we should clarify the effect that this has. If it has no effect at all -> remove warning.

Lilly-May commented on 2024-09-29T13:06:56Z
----------------------------------------------------------------

I opened an issue (scverse/pertpy#664) for this because I think this requires a bit more discussion and possibly a general revision of the metadata UX.
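
To make the options discussed in this thread concrete, here is one possible sketch. The helper name `align_metadata_to_genes` is hypothetical and not part of pertpy, and the policy it encodes (reorder silently when the same genes are present, error when genes are missing or unexpected) is only one way to resolve the question, not a decision from this thread. It assumes unique gene names and uses plain pandas objects in place of `adata.obsm[metadata_key]` and `adata.var.index`.

```python
import pandas as pd


def align_metadata_to_genes(metadata: pd.DataFrame, var_names: pd.Index) -> pd.DataFrame:
    """Return metadata with its columns in the same order as var_names.

    Hypothetical helper for illustration only; assumes unique gene names.
    """
    # Index.equals compares both the elements and their order.
    if metadata.columns.equals(var_names):
        return metadata

    missing = var_names.difference(metadata.columns)
    unexpected = metadata.columns.difference(var_names)
    if len(missing) > 0 or len(unexpected) > 0:
        raise ValueError(
            "Metadata columns do not match adata.var.index. "
            f"Missing genes: {list(missing)}; unexpected genes: {list(unexpected)}."
        )

    # Same genes in a different order: reorder instead of asking the user to.
    return metadata.loc[:, var_names]


# Toy example: same genes, different column order.
var_names = pd.Index(["gene_a", "gene_b", "gene_c"])
metadata = pd.DataFrame([[3.0, 1.0, 2.0]], columns=["gene_c", "gene_a", "gene_b"])
aligned = align_metadata_to_genes(metadata, var_names)
```

Using `Index.equals` also avoids the element-wise `!=` comparison, which pandas refuses to evaluate when the two sides have different lengths.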


Zethson commented on 2024-09-27T15:22:45Z
----------------------------------------------------------------

Is there a way for us to use the pertpy DE volcano plot here? I know that we might need something tuned for the paper...


Lilly-May commented on 2024-09-29T11:45:16Z
----------------------------------------------------------------

Yes, we can - thanks for noting that! I changed it.


Zethson commented on 2024-09-27T15:22:46Z
----------------------------------------------------------------

1 sentence on what this analysis is biologically about or why it matters, please.


Lilly-May commented on 2024-09-29T12:07:33Z
----------------------------------------------------------------

I changed the sentence to:

We will then use the identified perturbed cells to compute a perturbation space and explore the genetic interaction manifolds, meaning we will explore how distinct perturbations target similar or different molecular mechanisms and how these effects can be visualized in a low-dimensional space.


Zethson commented on 2024-09-27T15:22:47Z
----------------------------------------------------------------

1 sentence on the interpretation maybe? I know that it got criticized but it's in the paper as well. Fine for now.


Lilly-May commented on 2024-09-29T11:57:15Z
----------------------------------------------------------------

I added:

Pertpy facilitates the preprocessing of perturbation datasets, for example, by removing confounding effects using Mixscape. We can then use the identified perturbed cells to compute a perturbation space and explore genetic interaction manifolds. Here, we successfully reproduced the results presented by Norman et al. (2019), as perturbations with the same pre-annotated gene programs cluster together in the pseudobulked perturbation space. Additionally, we extended the gene program annotation to perturbations with unannotated gene programs using label transfer. Overall, this analysis enables the exploration of how distinct perturbations target similar or different molecular mechanisms.
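
For readers unfamiliar with the term, a pseudobulked perturbation space can be thought of as one averaged expression profile per perturbation. The sketch below shows that idea with pandas and NumPy only. It does not use pertpy's own perturbation space classes, and the gene names, perturbation labels, and values are toy assumptions.

```python
import numpy as np
import pandas as pd

# Toy expression matrix: rows are cells, columns are genes.
expr = pd.DataFrame(
    [
        [1.0, 0.0, 2.0],
        [3.0, 2.0, 0.0],
        [0.0, 4.0, 4.0],
        [2.0, 6.0, 0.0],
    ],
    index=["cell_1", "cell_2", "cell_3", "cell_4"],
    columns=["gene_a", "gene_b", "gene_c"],
)

# One perturbation label per cell (hypothetical labels).
labels = pd.Series(
    ["control", "control", "pert_x", "pert_x"],
    index=expr.index,
    name="perturbation",
)

# Pseudobulk: average all cells that share a perturbation label.
pseudobulk = expr.groupby(labels).mean()

# Perturbations with similar effects end up close together in this space;
# here, the Euclidean distance between the two averaged profiles.
distance = np.linalg.norm(pseudobulk.loc["pert_x"] - pseudobulk.loc["control"])
```

Clustering or embedding the rows of `pseudobulk` is then what lets perturbations with shared gene programs group together, as described in the paragraph above.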


Zethson commented on 2024-09-27T15:22:48Z
----------------------------------------------------------------

Line #1.    def filter_data(adata_temp):

Maybe rename this to something more descriptive?


Lilly-May commented on 2024-09-29T12:09:22Z
----------------------------------------------------------------

I renamed it to subset_to_common_cell_types


Zethson commented on 2024-09-27T15:22:49Z
----------------------------------------------------------------

Can we add at least 1 sentence of takeaway here? What do the plots mean?


Lilly-May commented on 2024-09-29T12:41:22Z
----------------------------------------------------------------

I added:

The heatmaps above show that patients who responded to chemotherapy showed a larger difference between their pre- and post-treatment expression profiles compared to those who responded to the combination of anti-PD-L1 and chemotherapy. This indicates that the combination therapy may have led to a less intense response or was applied in cases with poorer prognoses.


Zethson commented on 2024-09-27T15:22:50Z
----------------------------------------------------------------

Line #1.    # Get reference cell type

That's a comment for line 13, no?


Lilly-May commented on 2024-09-29T12:00:55Z
----------------------------------------------------------------

The entire code cell is just there to obtain the reference cell type for the entire dataset, i.e. the adata which includes all treatments. I then create new Sccoda models for the adatas with individual treatments.


Zethson commented on 2024-09-27T15:22:51Z
----------------------------------------------------------------

Can we get a bit more interpretation here? What do the effects mean etc? Just some bla from the paper or so is good enough.


Lilly-May commented on 2024-09-29T12:45:06Z
----------------------------------------------------------------

I added:

This is consistent with our earlier findings from the distance metric analysis, where we observed that the Anti-PD-L1 + Chemo group showed a smaller difference between their pre- and post-treatment expression profiles compared to the Chemo group. Overall, these findings indicate that the Anti-PD-L1 + Chemo combination therapy may have led to a less intense response or was applied in cases with poorer prognoses.

Member

Zethson commented Sep 29, 2024

Thanks! Feel free to merge and pull then.

Lilly-May merged commit 894e14e into main on Sep 29, 2024