Inference of GCN model on unseen test data #223

KristinaUlicna · 2023-09-08T17:31:45Z

PR contribution summary

Why is this PR useful / good for? Please describe the problem(s) you're trying to address.

Assuming we have a working GCN, this PR provides a new notebook to run inference & evaluate the performance of the GCN on real, annotated data.

It builds a GraphLabelPredictor which sets the GraphAttrs.NODE_PREDICTION and GraphAttrs.EDGE_PREDICTION attributes on every node and edge to a pair of (predicted label, class probabilities):

grace/grace/evaluation/inference.py

Lines 35 to 41 in 1e60b41

    
    for idx, node in G.nodes(data=True):
        prediction = [int(n_pred[idx].item()), n_probabs[idx].numpy()]
        node[GraphAttrs.NODE_PREDICTION] = prediction

    for e_idx, edge in enumerate(G.edges(data=True)):
        prediction = (int(e_pred[e_idx].item()), e_probabs[e_idx].numpy())
        edge[-1][GraphAttrs.EDGE_PREDICTION] = prediction

It allows Annotation.UNKNOWN relabelling to be configured separately for nodes & edges.
This PR additionally annotates the other notebooks so that each step of the workflow is clearly explained.
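
As a rough illustration of the split relabelling, the sketch below replaces `UNKNOWN` labels independently for nodes and edges. The `Annotation` values, the `relabel_unknown` helper and its `relabel_to` argument are assumptions made for this example; they are not taken from the grace config.

```python
from enum import IntEnum
from typing import Optional


class Annotation(IntEnum):
    # Assumed label values, for illustration only.
    TRUE_NEGATIVE = 0
    TRUE_POSITIVE = 1
    UNKNOWN = 2


def relabel_unknown(labels, relabel_to: Optional[Annotation]):
    """Return a copy of `labels` with UNKNOWN replaced; None keeps them as-is."""
    if relabel_to is None:
        return list(labels)
    return [relabel_to if lab == Annotation.UNKNOWN else lab for lab in labels]


node_labels = [Annotation.TRUE_POSITIVE, Annotation.UNKNOWN, Annotation.TRUE_NEGATIVE]
edge_labels = [Annotation.UNKNOWN, Annotation.TRUE_POSITIVE]

# Nodes and edges are handled independently: unknown nodes stay unknown,
# while unknown edges are treated as negatives.
new_nodes = relabel_unknown(node_labels, relabel_to=None)
new_edges = relabel_unknown(edge_labels, relabel_to=Annotation.TRUE_NEGATIVE)
```

Keeping the two settings apart means a sparsely annotated graph can keep its unknown nodes masked while still treating unannotated edges as negatives, or the other way round.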

List of proposed changes / linked issues & discussions

Resolves [DEVELOPMENT] Split the unknown masking for nodes & edges #221
Resolves [DEVELOPMENT] Rename normalise to normalize for tests to pass #224
Resolves Measure area under [receiver-operator, precision-recall] curve metrics #194
Resolves Run model inference on random graph + real annotated data #188
Resolves Transform inference procedure from notebook to script #189
Resolves [FEATURE REQUEST] Implement inference function #101
Resolves Graph pre-process #133
Resolves [DEVELOPMENT] Repurpose 2x2 confusion matrix for use outside of object metrics #225
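
For readers unfamiliar with the area-under-curve metric named in #194, here is a minimal sketch of what the ROC area measures for binary labels: the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as half. The `roc_auc` function is written for this example only and is not the implementation added in this PR.

```python
def roc_auc(y_true, y_score):
    """Area under the ROC curve via pairwise ranking (binary labels 0/1)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("Both classes must be present to compute the ROC area.")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))


# Three of the four (positive, negative) pairs are ranked correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise form is quadratic in the number of samples, so it is only meant to make the definition concrete; a library routine is the better choice on real graphs.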

What should a reviewer concentrate their feedback on?

✅ Scripts to check
🏃 Notebooks to run (especially infer_predictions.ipynb, others only have minor documentation / comments changes)
💻 Code quality
📝 Everything looks OK?

What type of PR is this? (check all applicable)

🪄 Feature
🐛 Bug fix
#️⃣ Documentation / code annotation
🔥 Performance Improvements

Added tests?

🙋 no, because I need some help

PR review summary

Describe what this PR does & how you reviewed the individual items, where needed:

Some helper checks to tick off:

Focus on image annotation
Focus on model training
Could any optimization be applied?
Is there any redundant code?
Are there any spelling errors?

In conclusion, after my review, I'd like to:

🙋 ask some clarifying questions
🙅 suggest some specific changes

review-notebook-app · 2023-09-08T17:31:50Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

crangelsmith · 2023-09-12T10:07:29Z

notebooks/infer_predictions.ipynb

+   "source": [
+    "from grace.io.image_dataset import ImageGraphDataset\n",
+    "from grace.models.feature_extractor import FeatureExtractor\n",
+    "from grace.evaluation.visualisation import plot_simple_graph\n",


I get an error trying to run this cell:
ModuleNotFoundError: No module named 'grace.evaluation'

I think there is a missing __init__.py in the evaluation directory that would make this work.

Adding the __init__.py in the next commit 🚀

crangelsmith · 2023-09-12T10:09:42Z

notebooks/infer_predictions.ipynb

+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "extractor_filename = \"/Users/kulicna/Desktop/classifier/extractor/resnet152.pt\"\n",


If we want people to be able to run this easily, maybe we need to add the line that downloads the ResNet beforehand, or add a comment pointing to where it is.

Great point! I included the instructions on how to download the ResNet locally in the training subfolder README:

grace/grace/training/README.md

Lines 67 to 78 in 488f60b

### Downloading the feature extractor:

In case you decide to use a pre-trained image classifier, such as resnet-152, you can use this snippet to import the model, load the default weights & download the model:

```python
import torch
from grace.models.feature_extractor import resnet

resnet_model = resnet(resnet_type="resnet152")
extractor_fn = "/path/to/your/feature/extractor/resnet152.pt"
torch.save(resnet_model, extractor_fn)
```

Would you like to see it added in the notebook, too?
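
If it does go into the notebook, one option is to guard the download so it only happens when the file is missing. This is a minimal sketch under that assumption; `ensure_extractor` and its `build_and_save` callback are hypothetical names, and in practice the callback would wrap the README snippet above.

```python
from pathlib import Path
from typing import Callable


def ensure_extractor(path: Path, build_and_save: Callable[[Path], None]) -> Path:
    """Create the extractor file only if it is not on disk yet."""
    path = Path(path)
    if not path.exists():
        # Make sure the target folder exists before the callback writes to it.
        path.parent.mkdir(parents=True, exist_ok=True)
        build_and_save(path)
    return path
```

In the notebook, `build_and_save` could be a small function that calls `resnet(resnet_type="resnet152")` and `torch.save(...)`, so repeated runs reuse the saved file instead of fetching the weights again.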

crangelsmith · 2023-09-12T10:13:11Z

notebooks/infer_predictions.ipynb

+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "grace_path = \"/Users/kulicna/Desktop/dataset/shape_stars/train\"\n",


In general we should remove paths to personal directories if we want these notebooks to run easily for everyone. We could add a new variable, e.g. input_path = "/path/to/data/", to be replaced by the user.

Also, we need to host this data somewhere that is easy to download from.

Fully agreed, we had this problem in the examples/ folder, where I replaced the path with a user-specified input:

grace/examples/show_data.py

Lines 13 to 20 in 488f60b

# e.g. /Users/kulicna/Desktop/dataset/shape_squares/MRC_Synthetic_File_000.mrc
IMAGE_PATH = Path(
    input(
        "Enter absolute path to your file "
        "(e.g. /Users/path/to/your/data/image.mrc, omit ''): "
    )
)

I'm not sure how to do this from the notebook tho...
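
One possible way to do this from a notebook, sketched under the assumption that an environment variable is acceptable: read the dataset location from the environment and fall back to a placeholder the user edits. The variable name `GRACE_DATA_PATH` and the helper `resolve_data_path` are made up for this example. Calling `input()` also works in an interactive Jupyter cell, but it blocks automated runs of the notebook.

```python
import os
from pathlib import Path


def resolve_data_path(
    env_var: str = "GRACE_DATA_PATH",  # hypothetical variable name
    default: str = "/path/to/your/data",
) -> Path:
    """Read the dataset location from an environment variable, else a placeholder."""
    return Path(os.environ.get(env_var, default)).expanduser()


grace_path = resolve_data_path()
if not grace_path.exists():
    # Fail early with a readable hint instead of a deep traceback later on.
    print(f"Dataset folder not found: {grace_path}. Set GRACE_DATA_PATH or edit the default.")
```

This keeps personal directories out of the committed notebook while still letting each user point it at their own copy of the data.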

crangelsmith · 2023-09-12T10:23:44Z

Hi @KristinaUlicna, the notebook runs, apart from the minor comments above, but I'm a bit confused about why the inference is done in a notebook rather than added to the run.py script as part of the pipeline. Couldn't we just add an eval option to the config file and do this in run.py after training?
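
A minimal sketch of what such an eval option could look like. The `RunConfig` fields and the `run`, `train` and `infer` names are hypothetical and not taken from the grace config or run.py.

```python
from dataclasses import dataclass


@dataclass
class RunConfig:
    # Hypothetical fields; the real grace config may differ.
    epochs: int = 10
    run_inference: bool = False


def run(config, train, infer):
    """Train, then optionally run inference when the config asks for it."""
    model = train(config)
    results = infer(model) if config.run_inference else None
    return model, results
```

With the flag defaulting to `False`, existing training runs would behave as before, and inference would only run when a config opts in.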

crangelsmith · 2023-09-12T10:27:12Z

Also, it would be good to visualise the resulting graph on inference, as described in #222

KristinaUlicna · 2023-09-12T11:12:59Z

> Also, it would be good to visualise the resulting graph on inference, as described in #222

Coming up in PR #229 ! 🚀

KristinaUlicna added 5 commits September 8, 2023 18:28

Implement areas under curve metrics

5f7bd2c

Move & rename metrics scripts

1c77b57

Included inference notebooks

b92b8bb

Updated y labels in whole graph dataset

93cbe14

Run inference notebook with new GCN models

4c2f976

KristinaUlicna added enhancement New feature or request methodology Building functional & diverse pipeline labels Sep 8, 2023

KristinaUlicna self-assigned this Sep 8, 2023

KristinaUlicna added 15 commits September 8, 2023 19:04

Split relabel config param into node & edges

3c46acc

Unify dataset_from_graph outputs

5ac53de

Plotting simple inference metrics

51342da

Contribute GraphLabelPredictor

c2d69e8

Create universal confusion matrix plotting fn

730d703

Add inference.py script

63b40a6

Annotate AUC plot figure

e69f21d

Inference notebook

804ac65

Curate inference script

3c787c0

Recycle confusion matrix plotting fn

53979b9

Placeholder for TSNE viz script

48230a6

Cleaned inference + metrics notebooks

263173a

Separate dim_reduction for next PR

38fc928

Separate TSNE ntbk for next PR

735fd57

Annotate & curate notebooks

1e60b41

KristinaUlicna requested a review from crangelsmith September 11, 2023 09:55

crangelsmith reviewed Sep 12, 2023

crangelsmith mentioned this pull request Sep 12, 2023

Graph visualisation & embedding dimensionality reduction #229

Merged

4 tasks

crangelsmith approved these changes Sep 12, 2023

Add init script to evaluation subfolder

9b92304

KristinaUlicna merged commit f530eba into main Sep 12, 2023
1 check passed

KristinaUlicna deleted the inference branch September 12, 2023 11:48
