
About the accuracy from this repo and the paper #8

Open · lygztq opened this issue Jan 4, 2021 · 15 comments

@lygztq commented Jan 4, 2021

Hi, your work is excellent. However, I find a gap between the results obtained with your code and the results reported in your paper. Specifically:

| | Mutagenicity | NCI109 | NCI1 | DD |
|---|---|---|---|---|
| Code | 79.68 (1.68) | 73.86 (1.72) | 76.29 (2.14) | 75.46 (3.86) |
| Paper | 82.15 (0.58) | 80.67 (1.16) | 78.45 (0.77) | 80.96 (1.26) |

I followed your hyper-parameter settings and did not change any part of your code, so what could cause this? I'm using Python 3.7 with the latest versions of PyTorch and pytorch_geometric. Could a mismatched PyG version cause such a gap?
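Since a version mismatch is the suspicion here, a minimal sketch for printing the installed versions when reporting such gaps (assuming the standard `__version__` attributes these packages expose):

```python
# Print the versions of the packages suspected of causing the gap.
import torch
import torch_geometric
import torch_sparse

print('torch:', torch.__version__)
print('torch_geometric:', torch_geometric.__version__)
print('torch_sparse:', torch_sparse.__version__)
```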

@cszhangzhen (Owner)

Hi,

This is very strange. Let me check it.

@ScottLiao920

Same for the PROTEINS dataset: I can only achieve 72% accuracy (same code, same version, same config) compared to 80%+ in the paper. Are there any other hyperparameters that were tuned to get 80% accuracy?

@cszhangzhen (Owner) commented Mar 24, 2021

Hi,

Sorry for the late reply.

I found that this is caused by an update to torch-sparse. Since the release of torch-sparse 0.4.4 (reference here), spspmm no longer supports autograd. However, our model needs to learn the sparse adjacency matrix, so I suggest using an older version of torch-sparse.
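A minimal sketch for checking whether the installed torch-sparse still propagates gradients through spspmm (using the documented spspmm(indexA, valueA, indexB, valueB, m, k, n) signature):

```python
import torch
from torch_sparse import spspmm

# Square a tiny sparse matrix and check whether the output values
# are still connected to the autograd graph.
index = torch.tensor([[0, 1], [1, 0]])
value = torch.ones(2, requires_grad=True)
out_index, out_value = spspmm(index, value, index, value, 2, 2, 2)
print(out_value.requires_grad)  # False means spspmm broke the autograd chain
```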

@lygztq (Author) commented Mar 24, 2021

Actually, I also tried your 'full-graph' version (which has no spspmm operation) and got similar results...

| | Mutagenicity | NCI109 | NCI1 | DD |
|---|---|---|---|---|
| Reported in paper | 82.15 (0.58) | 80.67 (1.16) | 78.45 (0.77) | 80.96 (1.26) |
| Author's code (full graph) | 78.44 (2.10) | 74.44 (2.05) | 77.37 (2.09) | OOM |
| Author's code (sample) | 79.68 (1.68) | 73.86 (1.72) | 76.29 (2.14) | 75.46 (3.86) |

Also, it is worth pointing out that the three options --sample_neighbor, --sparse_attention and --structure_learning are effectively no-ops as command-line flags: they are declared with type=bool, and Python's bool() treats every non-empty string (including "False") as True, as demonstrated below.
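A minimal demonstration of the pitfall (the flag name is taken from main.py; the behavior is plain argparse):

```python
import argparse

# With type=bool, argparse calls bool() on the raw argument string,
# and bool() of any non-empty string is True -- even for "False".
parser = argparse.ArgumentParser()
parser.add_argument('--sample_neighbor', type=bool, default=True)

args = parser.parse_args(['--sample_neighbor', 'False'])
print(args.sample_neighbor)  # True
```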

@cszhangzhen (Owner)

These three options --sample_neighbor, --sparse_attention, --structure_learning are defined as bool type, so they should be initialized as True or False.

I also find that the performance drops when updating pytorch_geometric to the latest version, and I'm not sure whether this is caused solely by spspmm. I will check this in detail.

@lygztq (Author) commented Mar 24, 2021

Thanks for the reply :)

For bool arguments in Python's argparse, you can find an explanation here; a sketch of the usual fix follows.
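A common fix from that discussion is a small string-to-bool converter (a minimal sketch; str2bool is the conventional name, not something defined in this repo):

```python
import argparse

def str2bool(v):
    # Explicitly map CLI strings to booleans instead of relying on bool().
    if isinstance(v, bool):
        return v
    if v.lower() in ('yes', 'true', 't', '1'):
        return True
    if v.lower() in ('no', 'false', 'f', '0'):
        return False
    raise argparse.ArgumentTypeError('Boolean value expected.')

# Usage: parser.add_argument('--sample_neighbor', type=str2bool, default=True)
```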

@cszhangzhen (Owner)

Hi,

I have tested the code:

```
python main.py --sample_neighbor False
```

with the following line inserted into main.py:

```python
print(args.sample_neighbor)
```

It outputs False.

@Anwar-Said commented May 5, 2021

Is it possible to reproduce the results on the PROTEINS dataset with the latest torch_sparse version?

@cszhangzhen (Owner)

Hi, I'm afraid not, since spspmm does not support autograd.

@Anwar-Said commented May 6, 2021

ok, thanks for your response.

@Anwar-Said

Hi,
I would be glad if someone could share the best parameter settings for reproducing the results on the graph classification datasets.

@GageDeZoort

@cszhangzhen @lygztq @Anwar-Said It may now be possible for the gradients to propagate through the TwoHopNeighborhood call, using something like this:


```python
import torch
from torch_sparse import coalesce


class TwoHopNeighborhood(object):
    def __call__(self, data):
        edge_index, edge_attr = data.edge_index, data.edge_attr
        n = data.num_nodes

        # Use a large fill value so that, after the 'min' coalesce below,
        # original one-hop edge attributes win over the two-hop fill values.
        fill = 1e16
        value = edge_index.new_full((edge_index.size(1),), fill,
                                    dtype=torch.float, requires_grad=True)

        # torch.sparse.mm supports autograd, unlike torch_sparse.spspmm
        # in torch-sparse >= 0.4.4, so gradients can flow through here.
        index = torch.sparse_coo_tensor(edge_index, value, (n, n)).coalesce()
        c = torch.sparse.mm(index, index).coalesce()
        row, col = c.indices()[0], c.indices()[1]
        index = torch.stack([row, col], dim=0)
        value = c.values()

        # Append the two-hop edges to the original edge list.
        edge_index = torch.cat([edge_index, index], dim=1)
        if edge_attr is None:
            data.edge_index, _ = coalesce(edge_index, None, n, n)
        else:
            # Broadcast the two-hop fill values to the edge_attr shape.
            value = value.view(-1, *[1 for _ in range(edge_attr.dim() - 1)])
            value = value.expand(-1, *list(edge_attr.size())[1:])
            edge_attr = torch.cat([edge_attr, value], dim=0)

            # Deduplicate edges, keeping the smaller (original) attribute,
            # then zero out the attributes of purely two-hop edges.
            data.edge_index, edge_attr = coalesce(edge_index, edge_attr,
                                                  n, n, op='min')
            edge_attr[edge_attr >= fill] = 0
            data.edge_attr = edge_attr

        return data

    def __repr__(self):
        return '{}()'.format(self.__class__.__name__)
```
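A quick sanity check of the transform on a toy graph (a hypothetical usage sketch; the toy edge_index is not from the repo):

```python
import torch
from torch_geometric.data import Data

# Path graph 0 - 1 - 2: the transform should add the two-hop 0 - 2 edge
# (plus self-loops, since A @ A has a non-zero diagonal here).
data = Data(edge_index=torch.tensor([[0, 1, 1, 2],
                                     [1, 0, 2, 1]]), num_nodes=3)
data = TwoHopNeighborhood()(data)
print(data.edge_index)
```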

I'm still not seeing exactly the results quoted in the paper, but I get values close to them, provided I use the hyperparameters given in the default main.py.

@cszhangzhen (Owner)

@GageDeZoort Thanks for your code. I will try it.

@GageDeZoort

Awesome, let me know if it helps - I'm very interested in training the structure learning component and reproducing your excellent results!

@mperozek11

@GageDeZoort Thanks so much for posting! Your fix seems to be working for me. Training is extremely slow relative to a simple GCN, though, even with a similar number of parameters. Is this expected, @cszhangzhen?
