Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement hasVertexP and hasEdgeP #135

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open

Conversation

jmtd
Copy link

@jmtd jmtd commented Oct 26, 2018

This is an initial PR for hasVertexP and hasEdgeP - predicate variants of hasVertex and hasEdge (reimplemented in terms of the predicate versions). (the latter of which will be useful for some rewriting work I hope to build on top of Algebra.Graph).

I've only looked at the code so far; on my todo list is at least

  • check docstrings are appropriate
  • write tests for hasEdgeP explicitly (but probably both)
  • double-check the SPECIALISE pragmas are (still) appropriate

@snowleopard
Copy link
Owner

snowleopard commented Oct 26, 2018

@jmtd Cool, thank you! I like how nicely it all worked out.

There seem to be a performance regression for hasVertex and possibly hasEdge too:

hasEdge: 1.17 (bad)
hasVertex: 1.66 (bad)

While 1.17 may just be noise, 1.66 feels like a proper regression, likely due to the missing SPECIALISE pragma. @nobrakal Do you agree?

Performace regression is reported by our CI infrastructure on every commit, so if you push any fixes you should be able to see the results in the DOBENCH="Graph" Travis job. For example:

https://travis-ci.org/snowleopard/alga/jobs/446759541

Further things to do:

  • We probably want to return the actual matching vertices and edges from these functions. What about calling them findVertex and findEdge and modifying the type signatures slightly?

    findVertex :: (a -> Bool) -> Graph a -> [a]
    findEdge :: (a -> Bool) -> (a -> Bool) -> Graph a -> [(a, a)]
    

    Then hasVertex could simply check if the list is empty. But I'm not sure if this can be implemented without sacrificing performance.

  • Implement this functionality in Algebra.Graph.NonEmpty.

@nobrakal
Copy link
Contributor

Hi,

While 1.17 may just be noise, 1.66 feels like a proper regression

The two are proper regression in this case :)

About the SPECIALIZE pragma: It is here only to optimize out a class constraint, so it is still needed for both hasVertex and hasEdge (and adding it to hasVertexP or hasEdgeP will do nothing, since these functions does not have any class constraint).

The problem is that hasVertex contains a reference to an other function (hasVertexP) which is currently not inlined, so it add a level of indirection in the code of hasVertex

The solution is to add an INLINE pragma to both hasVertexP and hasEdgeP :) (and it is anyway a good idea to inline these !).

Then hasVertex could simply check if the list is empty. But I'm not sure if this can be implemented without sacrificing performance.

Maybe lazyness can help preventing a performance drop, but I am not sure.

jmtd added 2 commits October 30, 2018 21:55
Predicate versions of hasVertex and hasEdge.

Re-implement hasVertex and hasEdge in terms of the predicate
versions.
@snowleopard
Copy link
Owner

snowleopard commented Oct 31, 2018

Adding INLINE pragmas seems to have fixed the regression.

(There is still this strange result hasEdge: 0.69 (good), which we believe is a bug.)

@jmtd Could you try the following?

{-# INLINE [1] findVertices #-}
findVertices :: (a -> Bool) -> Graph a -> [a]
findVertices p = foldg [] [ x | p x ] (++) (++)

hasVertex :: Eq a => a -> Graph a -> Bool
hasVertex x = not . null $ findVertices (== x)

I'm interested to see whether this will be as fast as your implementation via hasVertexP.

@jmtd
Copy link
Author

jmtd commented Nov 1, 2018

Hi @snowleopard

findVertices p = foldg [] [ x | p x ] (++) (++)

I think there must be a typo in the list comprehension here, my GHCI doesn't like it either, or perhaps I need to load a language extension?

@snowleopard
Copy link
Owner

@jmtd Oh, I'm sorry, I meant:

findVertices p = foldg [] (\x -> [ x | p x ]) (++) (++)

re-implement hasVertex in terms of findVertices
@jmtd
Copy link
Author

jmtd commented Nov 2, 2018

NP. I've just pushed that, let's see what the performance is like.

I'd quite like to reproduce the performance measurement stuff locally (to avoid round-tripping github PRs to test things). I should perhaps also take a look at Stack if I'm spring-cleaning my setup.

I'm working on a findEdges now

@jmtd
Copy link
Author

jmtd commented Nov 2, 2018

As an aside, would you like PRs to add things like {-# LANGUAGE TupleSections #-} where necessary — I see this is already specified in the cabal file, but is still needed (I think) for e.g. loading the local copy into ghci w/o also specifying it on the command line

@jmtd
Copy link
Author

jmtd commented Nov 2, 2018

travis and appveyor results for the findVertices change only (ab78e059) show a regression

hasVertex: 1.55 (bad)

I'm going to push an initial findEdges now to see how that performs.

(this patch series will need some rebasing before it's mergeable)

@jmtd
Copy link
Author

jmtd commented Nov 2, 2018

Results for findEdges as of 5ed33b63284afb275d7584d0203d0b4327ecc02e

hasEdge: 1.35 (bad)
hasVertex: 2.03 (bad)

@snowleopard
Copy link
Owner

@jmtd Thanks for giving it a try. I suggest we proceed as follows.

Let's keep the Algebra.Graph module as is, but instead add new functionality to the generic module that defines the ToGraph type class. All we need to implement both findVertices and hasVertexP is the foldg method -- we do not need to know the graph itself:

findVertices :: ToGraph g => (ToVertex g -> Bool) -> g -> [ToVertex g]
findVertices p = foldg [] (\x -> [ x | p x ]) (++) (++)

With this approach other instances of the ToGraph class can use these functions too.

Also, I'd like to understand your use case a bit better: how do you plan to use these functions? Merely checking if a vertex with the right properties exist (using hasVertexP) is probably not too useful. Same about hasEdgeP. Also, the function findVertices could probably be implemented more directly as a combination of Set.filter and vertexSet.

My understanding is that you actually want to write some sort of graph transformation function that, for example, turns merges all edges that satisfy a given property into a vertex. Could you provide some more details? Otherwise, I worry that we might end up providing too many functions with similar looking functionality, hence confusing Alga users: hasVertex, hasVertexP, findVertices, vertexSet.

@jmtd
Copy link
Author

jmtd commented Nov 6, 2018

@snowleopard wrote:

Also, I'd like to understand your use case a bit better: how do you plan to use these functions? Merely checking if a vertex with the right properties exist (using hasVertexP) is probably not too useful.

My understanding is that you actually want to write some sort of graph transformation function that, for example, turns merges all edges that satisfy a given property into a vertex.

That's right. Here's an example of a graph rewrite rule, coded in a string, using composition to represent the flow of the stream data

streamMap f . streamMap g = streamMap (f . g)

We have stream graphs coded up using Graph and and ADT like the following

data StreamVertex = StreamVertex                                                                                                                                                   
    { vertexId   :: Int            -- necessary to stop unifying distinct nodes which are otherwise identical
    , operator   :: StreamOperator -- data StreamOperator = Map | Filter | Expand…                                                         
    , parameters :: [String]       -- strings of code representing operator arguments
    , intype     :: String         -- capturing the type flowing into this node                                                                        
    } deriving (Eq) 

What I am hoping to do is encode the stream rules using a similar ADT to the above instead of strings (so probably make the parameters argument a type variable and re-use the above), then apply rules to instances of Graph StreamVertex

All the rules I am working with at the moment are derived from analysing operators pair-wise, and generally swap the order of two operators, but also change their parameters; so from a Graph POV adding and removing both edges and nodes. More complex rewrite rules could involve introducing more nodes and consequently edges. We don't currently label edges, but direction is important.

@snowleopard
Copy link
Owner

@jmtd Thanks! Let me think about this for a few days. I'm travelling until 11 November, so it's a bit difficult to find time to figure out a small set of graph transformation primitives that will be useful both for your problem and for other users of the library.

@jmtd
Copy link
Author

jmtd commented Nov 9, 2018

Thanks @snowleopard. I'm going to see what PoC I can build on top of these primitives to further assess how this all might work for me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants