scalax.collection.Graph User Guide

Peter Empen

1 Introduction
1.1 Why Use Graph
1.2 Terminology
1.3 Status of Work
1.4 Limitations
2 Initializing Graphs
2.1 Prerequisites
2.2 Type Parameters
2.3 Edge Factories
2.4 Instantiating Graphs
2.5 Type Parameter Inference
2.6 Choosing an Implementation
3 Inner and Outer Objects
4 Graph Operations
4.1 Iterating
4.2 Looking up Nodes and Edges
4.3 Equality of Inner and Outer Objects
4.4 Adding and Subtracting
4.5 Computing Union, Difference and Intersection
4.6 Inspecting Endpoints
4.7 Inspecting Neighbors and Incident Edges
4.8 Querying by Function
4.9 Finding Paths
4.10 Traversing
4.11 Measuring Graphs and Grouping Nodes by Degree
4.12 Classifying Graphs
4.13 Chaining Method Calls
5 Customizing Graphs
5.1 Defining Custom Edges
5.2 Defining User Constraints
5.3 Narrowing Type Parameters
5.4 Modifying Methods
5.5 Adding new Methods
5.6 Providing new Implementations
6 Run-time Characteristics


1 Introduction

This document provides example-driven, comprehensive coverage of the functionality of Graph as part of the Extended Scala Library. In each chapter, examples are listed first. You may then read the explanations following the examples for more context, or skip them and go directly to the next chapter.

References to specific Graph classes in this document may be looked up in the Scaladoc API reference. This guide is not meant to be complete.

For the sake of simplicity, most examples are based on graphs spanned over nodes of the type Int. Graph customization is shown by the node type Airport and the edge type Flight.

1.1 Why Use Graph

The most important reasons why Graph speeds up your development are:

a) Simplicity: Creating, manipulating and querying Graph is intuitive.

b) Consistency: Graph seamlessly maintains a consistent state of nodes and edges including prevention of duplicates, intelligent addition and removal.

c) Conformity: As a regular collection class, Graph has the same “look and feel” as other members of the Scala collection framework. Whenever appropriate, result types are Scala collection types themselves.

d) Flexibility: All kinds of graphs including mixed graphs, multi-graphs and hypergraphs are supported.

e) Functional Style: Graph facilitates a concise, functional style of utilizing graph functionality, including traversals, not seen in Java-based libraries.

f) Extendibility: You can easily customize Graph to reflect the needs of your application while retaining all benefits of Graph.

g) Documentation: Adequate documentation supports a smooth learning curve.

Look and see!

1.2 Terminology

Throughout the library we use the term node as a synonym for vertex and the term edge as a generic term for hyperedge, line (undirected edge) or arc (directed edge).

1.3 Status of Work

Graph creation, editing as well as functional traversal and path operations have been completed. More functionality is due to be added. You are invited to request enhancements based on your problem domain.

1.4 Limitations

• There is no direct support for half-edges but they can be simulated by Option.

• Neither node nor edge sets may be infinite although this could be achieved by a custom implementation.


2 Initializing Graphs

2.1 Prerequisites

The downloadable binaries are named Graph-<module>_<Scala-version>-<Graph-version>.jar following the community-convention. For each <module> (here: core) there is a separate User Guide. To run the examples in the current documentation ensure that Graph-core_*.jar is on your class path.

In most cases you need to import the following:

import scalax.collection.Graph // or scalax.collection.mutable.Graph
import scalax.collection.GraphPredef._, scalax.collection.GraphEdge._

2.2 Type Parameters

trait Graph[N, E[X] <: EdgeLikeIn[X]]

N is the type of the nodes of a graph instance. E[X] is the kind of type of edges of a graph instance.

The trait Graph and most related templates have two type parameters: one for the contained nodes and one for the kind of contained edges. Graph doesn’t impose any restriction on the type of nodes meaning that for the upper bound of the node type parameter any type including Any may be chosen. In contrast, the type parameter for edges is required to have at least the upper bound EdgeLikeIn .

When selecting between the predefined edge types it might help you to understand their inheritance relationships:

Table 2.2.1: Edge type hierarchy

All edge classes derive from trait EdgeLike and, optionally, from trait DiEdgeLike. There are four edge categories: hyperedge, directed hyperedge, undirected and directed edge. Each of these categories has predefined edge classes representing any combination of non-weighted, weighted (W), key-weighted (Wk), labeled (L) and key-labeled (Lk). See 2.3 Edge Factories for examples and more details.

Based on the above inheritance hierarchy, graph types in the sense of graph theory are governed by the edge type parameter E[X] as follows:


Type of graph | Definition: the graph contains… | Edge type (kind of type parameter), or any corresponding custom edge type

Simple | only undirected edges without multi-edges | UnDiEdge or any of its variants W*, L* or WL*; the user is responsible for not adding directed edges because the type system does not prevent this

Mixed | more than one type of edges without multi-edges | UnDiEdge or its parent types such as HyperEdge or any of their variants except the keyed ones

Mixed (use case) | directed and undirected edges without multi-edges | UnDiEdge or any of its variants except the keyed ones

Weighted | edges with the special label weight | any of the predefined edge classes containing W in its prefix

Multi | one or more edge types with multi-edges | any of the predefined edge classes containing k in its prefix (Wk, Lk)

Table 2.2.2: Graph types mapped to edge types

Although there is no constraint on the node type, you should be especially careful when designing mutable nodes: the hashCode returned by your nodes should not change over their lifetime even if nodes are mutating [1]. Otherwise, looking up mutated elements in a graph will fail, at least in case of the HashMap-based default implementation [2]. This is clearly the very same behavior as with scala.collection.Set or Map.
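The hashCode caveat can be illustrated without Graph itself. The following is a minimal sketch using a plain scala.collection.mutable.HashSet; the node class Station is hypothetical and not part of the library. Its equality and hash code depend only on an immutable id, so mutating the remaining fields does not break look-ups.

```scala
import scala.collection.mutable

object StableHashDemo {
  // Hypothetical mutable node type: `name` may change, `id` never does.
  final class Station(val id: Int, var name: String) {
    // Equality and hash code are derived from the immutable id only,
    // so they stay stable over the lifetime of the node.
    override def equals(other: Any): Boolean = other match {
      case that: Station => that.id == id
      case _             => false
    }
    override def hashCode: Int = id.##
  }

  // Stores a node in a hash-based set, mutates it, then looks it up again.
  def stillFoundAfterMutation: Boolean = {
    val station = new Station(1, "Main St")
    val nodes   = mutable.HashSet(station)
    station.name = "Main Street" // the mutation leaves hashCode untouched
    nodes.contains(station)
  }
}
```

Had hashCode included name, the look-up after the mutation would probe with a different hash value and could miss the stored element.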

2.3 Edge Factories

Shortcut          Named Factory          Meaning

1~2~3             HyperEdge(1,2,3)       (undirected) hyperedge between 1, 2 and 3
1~>2~>3           DiHyperEdge(1,2,3)     directed hyperedge from 1 via 2 to 3
"A"~"B"           UnDiEdge("A","B")      undirected edge between "A" and "B"
"A"~>"B"          DiEdge("A","B")        directed edge from "A" to "B"
1~2 % 5           WUnDiEdge(1,2,5)       undirected edge between 1 and 2 with a weight of 5
1~>2 % 0          WDiEdge(1,2,0)         directed edge from 1 to 2 with a weight of 0
(1 ~+> 2)(x)      LDiEdge(1,2)(x)        directed edge from 1 to 2 labeled with x
(1 ~+#> 2)(x)     LkDiEdge(1,2)(x)       directed edge from 1 to 2 key-labeled with x
(1 ~%+# 2)(5,x)   WLkUnDiEdge(1,2)(5,x)  undirected edge between 1 and 2 with a weight of 5 and key-labeled with x

Table 2.3.1: Edge factory examples

Edge factories are simply apply methods of companion objects of edge classes. They are typically called whenever an edge is passed to a Graph method as an argument. The above table contains factory examples for some of the predefined edge classes.

[1] “Note: great care must be exercised if mutable objects are used as map keys. The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map.” (http://download.oracle.com/javase/1.4.2/docs/api/java/util/Map.html)

[2] You may still find mutated elements by calling find or filter with a having argument, but these calls start a full scan.


Apart from the four basic edge classes [Di]HyperEdge and [Un]DiEdge there are plenty of convenience edge class variants enabling easy creation of weighted and / or labeled edges:

Edge Category       Edge Class       Shortcut  Description

Hyperedge           HyperEdge        ~         hyperedge
                    WHyperEdge       ~%        weighted hyperedge
                    WkHyperEdge      ~%#       key-weighted hyperedge
                    LHyperEdge       ~+        labeled hyperedge
                    LkHyperEdge      ~+#       key-labeled hyperedge
                    WLHyperEdge      ~%+       weighted labeled hyperedge
                    WkLHyperEdge     ~%#+      key-weighted labeled hyperedge
                    WLkHyperEdge     ~%+#      weighted key-labeled hyperedge
                    WkLkHyperEdge    ~%#+#     key-weighted key-labeled hyperedge

Directed Hyperedge  DiHyperEdge      ~>        directed hyperedge
                    WDiHyperEdge     ~%>       weighted directed hyperedge
                    WkDiHyperEdge    ~%#>      key-weighted directed hyperedge
                    LDiHyperEdge     ~+>       labeled directed hyperedge
                    LkDiHyperEdge    ~+#>      key-labeled directed hyperedge
                    WLDiHyperEdge    ~%+>      weighted labeled directed hyperedge
                    WkLDiHyperEdge   ~%#+>     key-weighted labeled directed hyperedge
                    WLkDiHyperEdge   ~%+#>     weighted key-labeled directed hyperedge
                    WkLkDiHyperEdge  ~%#+#>    key-weighted key-labeled directed hyperedge

Undirected Edge     UnDiEdge         ~         undirected edge
                    WUnDiEdge        ~%        weighted undirected edge
                    WkUnDiEdge       ~%#       key-weighted undirected edge
                    LUnDiEdge        ~+        labeled undirected edge
                    LkUnDiEdge       ~+#       key-labeled undirected edge
                    WLUnDiEdge       ~%+       weighted labeled undirected edge
                    WkLUnDiEdge      ~%#+      key-weighted labeled undirected edge
                    WLkUnDiEdge      ~%+#      weighted key-labeled undirected edge
                    WkLkUnDiEdge     ~%#+#     key-weighted key-labeled undirected edge

Directed Edge       DiEdge           ~>        directed edge
                    WDiEdge          ~%>       weighted directed edge
                    WkDiEdge         ~%#>      key-weighted directed edge
                    LDiEdge          ~+>       labeled directed edge
                    LkDiEdge         ~+#>      key-labeled directed edge
                    WLDiEdge         ~%+>      weighted labeled directed edge
                    WkLDiEdge        ~%#+>     key-weighted labeled directed edge
                    WLkDiEdge        ~%+#>     weighted key-labeled directed edge
                    WkLkDiEdge       ~%#+#>    key-weighted key-labeled directed edge

Table 2.3.2: Predefined edge classes

To bring the predefined weighted or labeled edge classes and the corresponding shortcuts into scope you need additional imports, for instance

import scalax.collection.edge.LDiEdge // labeled directed edge
import scalax.collection.edge.Implicits._ // shortcuts

While simple directed and undirected edge shortcuts may be used without parentheses like 1 ~> 2, weighted and / or labeled edge shortcuts implicitly map to calls of functions with two parameter lists and thus require parentheses like (1 ~+> 2)(x).

Keyed edges (key-weighted and / or key-labeled edges) come into play whenever you are designing multi-edges. They ensure that, when calculating the hash-code of the edge, the hash-code of the keyed weight and / or that of the keyed label will be included in the hash-code calculation of the edge.

Therefore, when utilizing any of the predefined key-labeled edge classes you are responsible for providing an appropriate label type. If your label is of a compound type, in the sense that it is made up of several attributes, you should carefully analyze which part of your label should become part of the edge key (hash-code). Unless the edge key includes all parts of such a compound label you must override equals and hashCode to reflect the real key parts. For instance, when designing a Flight label for the predefined edge class LkDiEdge, you might observe that only flightNo should be part of the edge key. Otherwise it would be possible to connect two airports with arcs pointing in the same direction and with the same flightNo if the edges differ only in their departure. To cope with this problem you have to override equals and hashCode as follows:

case class Flight(val flightNo: String,
                  val departure: DayTime = DayTime(0,0),
                  val duration: Duration = Duration(0,0)) {
  override def equals(other: Any) = other match {
    case that: Flight => that.flightNo == this.flightNo
    case _ => false
  }
  override def hashCode = flightNo.##
}

(The full example is included in TEdge.scala .)
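The effect of the overridden equals and hashCode can be checked in isolation. The sketch below repeats the Flight label with hypothetical stand-ins for DayTime and Duration (their real definitions are not shown in this guide) and places two labels that differ only in their departure into a plain Set.

```scala
object FlightKeyDemo {
  // Hypothetical stand-ins for the DayTime and Duration types used by Flight.
  case class DayTime(hour: Int, min: Int)
  case class Duration(hours: Int, mins: Int)

  case class Flight(flightNo: String,
                    departure: DayTime = DayTime(0, 0),
                    duration: Duration = Duration(0, 0)) {
    // Only flightNo takes part in the key.
    override def equals(other: Any): Boolean = other match {
      case that: Flight => that.flightNo == this.flightNo
      case _            => false
    }
    override def hashCode: Int = flightNo.##
  }

  val morning = Flight("LH 400", DayTime(10, 25), Duration(8, 20))
  val evening = Flight("LH 400", DayTime(18, 5), Duration(8, 20))

  // Both labels share the key flightNo, so a hash-based set keeps only one.
  val distinctLabels: Set[Flight] = Set(morning, evening)
}
```

Because the two labels are equal and hash alike, two key-labeled edges between the same airports carrying these labels would be treated as the same edge.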

Labeled edges contain a reference to the label but graph instances do not know about the actual label type. This is why edges of the same graph instance are allowed to have labels of different types by design. As a consequence, if you are utilizing a predefined edge class and want to call label methods, you need to match (cast down) the label value of the edge to your label type. This conversion is supported by the trait LEdgeImplicits[L]:

case class MyLabel(val i: Int)
val eOuter = LUnDiEdge(1,3)(MyLabel(4))
val four_1 = eOuter.label.i // label must be dereferenced first

import scalax.collection.edge.LBase.LEdgeImplicits
object MyImplicit extends LEdgeImplicits[MyLabel]
import MyImplicit._
val four_2 = eOuter.i // implicitly dereferenced

What is more, mutable Graph instances provide edge creation methods to be called directly on Graph instances (addEdge, +~=, addAndGetEdge) or on inner nodes (connectWith, +~). These methods should save execution time since less conversion is necessary to complete edge creation.

See also 5.1 Defining Custom Edges.

2.4 Instantiating Graphs

a) val g1 = Graph(3~1, 5) // Graph[Int,UnDiEdge](1, 3, 5, 3~1)

b) val g2 = Graph(UnDiEdge(3, 1), 5) // same as above

c) val gA = Graph(3~>1.2) // Graph[AnyVal,DiEdge](3, 1.2, 3~>1.2)

d) val h = Graph(1~1, 1~2~3) // Graph[Int,HyperEdge](1, 2, 3, 1~1, 1~2~3)

e) val (jfc, fra, dme) = (Airport("JFC"), Airport("FRA"), Airport("DME"))
   val flights = Graph(
     (jfc ~+#> fra)(Flight("LH 400" ,10 o 25, 8 h 20)),
     (fra ~+#> dme)(Flight("LH 1444", 7 o 50, 3 h 10)))

f) val nodes = List(5)
   val edges = List(3~1)
   val g3 = Graph.from(nodes, edges)

g) var n, m = 0
   val f = Graph.fill(100)( {n = m; m += 1; n~m} )

h) val gs = Graph.from[Int,WLDiEdge,String,WLEdgeCompanion,WLEdgeAdapter](
     edgeStreams = Seq(new WLDiEdgeStream(WLDiEdge))) // Graph[Int,WLDiEdge](…)


As to

a) Creates an undirected graph of the type Graph[Int,UnDiEdge] with the node set {1, 3, 5} and the edge set {3~1}. Operator ~ is used to create undirected edges. N is inferred to be Int because both the edge ends and the single node are of type Int.

b) Creates the very same undirected graph as a). Operator ~ is just a shorthand for invoking the factory UnDiEdge.

c) Creates a directed graph of the type Graph[AnyVal,DiEdge] with the node set {3, 1.2} and the edge set {3~>1.2}. The operator ~> is used to create directed edges. N is inferred to be AnyVal because this is the smallest common super type of the edge ends.

d) Creates a hypergraph of the type Graph[Int,HyperEdge] with the node set {1, 2, 3} and the edge set {1~1, 1~2~3}. In 1~2~3 the second ~ operator creates an undirected hyperedge because the left operand is already an edge.

e) Creates an instance of Graph[Airport,LkDiEdge] with a node set containing the airports JFC, FRA and DME derived from the edge ends and an edge set containing two flights. The edges have labels of the type Flight; ~+#> is an edge factory shortcut to pass the labels. The Flight attributes are flight number, departure time and flight duration. See also 2.3 Edge Factories.

f) Creates a Graph from the Iterable values nodes and edges. This Graph instance g3 equals g1.

g) Creates a Graph instance with 100 edges {0~1, 1~2, …, 99~100}.

h) Creates a Graph instance reading from the edge input stream WLDiEdgeStream which has the edge factory WLDiEdge. Here, an overloaded from factory method is invoked with the type arguments Int for the node type, WLDiEdge for the edge type, String for the label type, WLEdgeCompanion for the edge factory and WLEdgeAdapter for the adapter type. WLDiEdgeStream must be implemented by the user. Input streams make it possible to instantiate graphs without prior creation of outer nodes and edges. Implementation examples for node/edge input streams are given in TStream.scala.
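Example g) relies on the fill argument being re-evaluated for every element. As a library-independent sketch, the same counter trick is shown below with List.fill from the Scala standard library, producing pairs instead of edges; the object and method names are chosen for illustration only.

```scala
object FillDemo {
  // Builds `count` consecutive pairs (0,1), (1,2), ... in the same way as
  // example g) builds consecutive edges: the by-name element expression is
  // evaluated once per element, advancing the two counters each time.
  def consecutivePairs(count: Int): List[(Int, Int)] = {
    var n, m = 0
    List.fill(count)({ n = m; m += 1; (n, m) })
  }
}
```

FillDemo.consecutivePairs(100) starts with (0,1) and ends with (99,100), mirroring the edge set {0~1, 1~2, …, 99~100}.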

2.5 Type Parameter Inference

a) val g = Graph() // Graph[Nothing,Nothing]()

b) val g = Graph(1) // Graph[Int,Nothing](1)

c) val g = Graph(1~>2); g += 1.2 // Graph[Int,DiEdge](1, 2, 1~>2); compiler error

d) Graph(1~>2) + 2~3 // compiler error

Calling the default Graph factory without type parameters is in general satisfactory if at least one edge is passed. Otherwise, analogously to any regular Scala collection, you are advised to explicitly define the type parameters.

As to

a) In absence of any arguments, Nothing is inferred for both the node and the edge type so you cannot add any nodes or edges to this instance later on.

b) In absence of any edge arguments, Nothing is inferred for the edge type so you can add only nodes to this instance later on.

c) To be able to add a Double to this instance, you must define its type parameters broader at creation time like Graph[AnyVal,DiEdge] .

d) This addition is rejected by the compiler because [Int,DiEdge] is inferred and 2~3 is not a directed but an undirected edge. Directed edges inherit from undirected edges but the opposite is not true.


As a final note, whenever you implement your own functionality on top of Graph, the Scala type system requires you to maintain the type parameter bounds in your declarations, as shown in the following example:

def myFunction[N, E[X] <: EdgeLikeIn[X]](g: Graph[N,E])

2.6 Choosing an Implementation

import scalax.collection.immutable.{TinyGraphImpl => Graph}
val Graph = scalax.collection.immutable.TinyGraphImpl

Of course, Graph comes with separate implementations for immutable and mutable graphs and normally you needn’t change the default implementations.

It is still worth mentioning that there is an adjacency list based and a solely set based implementation for each. The default implementation is the adjacency list based one designed for good query performance while the solely set based (also: tiny) implementation variant is preferable whenever minimal memory footprint is highly prioritized. The above example shows how to choose the tiny implementation.


3 Inner and Outer Objects

a) val g = Graph(1~2) // Graph[Int,UnDiEdge](1, 2, 1~2)

b) val n1 = g.nodes.head // g.NodeT = 1 or 2

c) val e1 = g.edges.head // g.EdgeT = 1~2

d) e1._1 // g.NodeT = 1

e) n1.diSuccessors // Set[g.NodeT] = Set(2) or Set(1) if n1 == 2

f) n1 % 2 // Int = 1

g) e1.toEdgeIn // UnDiEdge[Int] = 1~2

As to

a) Int and UnDiEdge are the types of outer nodes and outer edges of g respectively.

b) Querying the first node of the node set is a means to show that the type of the inner node is g.NodeT (head does not guarantee a specific order). See 4.2 for how to look up nodes.

c) Similarly, the inner edge e1 is of the type g.EdgeT .

d) Remarkably, the first node incident with e1 is an inner node of the type g.NodeT.

e) On n1 you can call both graph (NodeT) methods like diSuccessors …

f) …and any method of the outer node type, here % of Int .

g) Reconstructs the outer edge 1~2 ; g.edges.toEdgeInSet would reconstruct all outer edges.

A basic understanding of inner and outer objects (nodes and edges) is essential for working with Graph efficiently. From the perspective of Graph we distinguish between

• Outer Nodes: An outer node is, roughly speaking, an object satisfying the node upper type bound of a given graph. It may or may not be contained in the node set. When such an object is passed to a graph method, it is implicitly converted to a node wrapper of the type NodeIn. Further, when added to the node set of a graph, outer nodes are transparently converted to inner nodes.

• Outer Edges: Similarly, an outer edge is an object satisfying the edge upper type bound of a given graph. Outer edges must inherit from EdgeLike. An outer edge is not coupled with any graph instance: it may or may not be contained in the edge set of a graph. Edge factories are responsible for correct typing. When added to the edge set of a graph, outer edges are transparently converted to inner edges.

• Inner Nodes: Inner nodes are instances of an inner class of the type NodeT implementing the InnerNodeLike interface. They wrap outer nodes, providing a wealth of graph functionality such as diSuccessors or pathTo. At the same time you can interact with an inner node as if it were your original object.

• Inner Edges: Similarly, inner edges are instances of an inner class of the type EdgeT implementing the InnerEdgeLike interface. They wrap outer edges, providing a wealth of graph functionality. At the same time you can interact with an inner edge and its incident nodes as if they were your original objects.

See also:
object scalax.collection.Graph#InnerNodeLike
object scalax.collection.Graph#InnerEdgeLike
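The relationship between inner and outer nodes can be sketched with a toy container. The code below is an illustration under simplifying assumptions, not the library's implementation: MiniGraph, its find method and the way NodeT is built are all hypothetical. It only shows how a path-dependent inner type can wrap an outer value, add graph functionality and still compare equal to the wrapped value.

```scala
object InnerOuterDemo {
  // Toy container, NOT the library's implementation: every MiniGraph instance
  // owns a path-dependent inner node type NodeT that wraps an outer node N.
  class MiniGraph[N](outerNodes: Set[N], edges: Set[(N, N)]) {
    class NodeT(val value: N) {
      // Graph functionality that only inner nodes offer.
      def diSuccessors: Set[NodeT] =
        edges.collect { case (from, to) if from == value => new NodeT(to) }

      // Inner nodes equal inner nodes and outer nodes with the same value.
      override def equals(other: Any): Boolean = other match {
        case that: NodeT => that.value == value
        case outer       => outer == value
      }
      override def hashCode: Int = value.##
    }

    // Looks up the inner node wrapping the given outer node, if contained.
    def find(outer: N): Option[NodeT] =
      if (outerNodes.contains(outer)) Some(new NodeT(outer)) else None
  }

  val g  = new MiniGraph(Set(1, 2), Set((1, 2)))
  val n1 = g.find(1) // Option[g.NodeT]
}
```

Here n1 has the type Option[g.NodeT], so the node found is tied to the instance g, while n1.get.value gives back the outer node 1.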


4 Graph Operations

4.1 Iterating

val g = Graph(2~3, 3~1)

a) g mkString "-" // 3-1-2-2~3-3~1

b) g.nodes mkString "-" // 3-1-2

c) g.edges mkString "-" // 2~3-3~1

As to

a) Iterating over a Graph instance you get all nodes and all edges. Graph extends Set[GraphParam] . GraphParam is an algebraic type for outer nodes, outer edges, inner nodes and inner edges.

b) nodes returns the node set which in turn is an instance of Set[NodeT] .

c) edges returns the edge set which in turn is an instance of Set[EdgeT] .

Both Graph and its inner classes NodeSet and EdgeSet extend scala.collection.Set . Thus, you can call the gamut of well-known TraversableLike methods on these such as filter , foreach etc.

4.2 Looking up Nodes and Edges

val g = Graph(1~2)

a) g find 1 // Option[g.NodeT] = Some(1)

b) g find 3 // Option[g.NodeT] = None

c) g get 1 // g.NodeT = 1

d) g get 3 // NoSuchElementException

e) g find 1~2 // Option[g.EdgeT] = Some(1~2)

f) g addAndGet 5 // g.NodeT = 5

g) g find (g having (node = _ == 1)) // Option[g.NodeT] = Some(1)

As to

a) Searches for and finds the inner node wrapping the outer node 1.

b) Searches for the inner node wrapping the outer node 3 but cannot find it.

c) Searches for, finds and gets the inner node wrapping the outer node 1.

d) Searches for but does not find an inner node wrapping the outer node 3.

e) Searches for and finds the inner edge wrapping the outer edge 1~2 . Looking up edges works in the same manner as for nodes.

f) In case of a mutable Graph, adds 5 to the node set and returns the added inner node. This method enables you to couple adding and getting the same node.

g) This is no longer a direct access to a specific element but rather a search over all nodes, stopping at the first node that satisfies the predicate _ == 1. having may take a node and/or an edge argument.

Looking up nodes and edges is necessary to invoke graph functionality on a specific element (also called root) of a given graph. This is similar to getting the value of a Map entry by passing a key to the map. In case of Graph the “key” is an outer node or edge and the returned “value” is the inner node or edge. The default implementation uses hashCode for look-ups.


4.3 Equality of Inner and Outer Objects

val g = Graph(1~2)

a) (g get 1) == 1 // true

b) (g get 1~2) == 2~1 // true

c) (g get 1~2) eq 2~1 // false

d) (g get 1~2) == 2~2 // false

As to

a) The inner node 1 equals the outer node 1.

b) The inner edge 1~2 equals the outer edge 2~1. This example also demonstrates that, in case of undirected edges, the order of nodes does not influence equality.

c) eq fails because an inner edge can never be the same instance as an outer edge.

d) Obviously, 1~2 does not equal 2~2.

4.4 Adding and Subtracting

val g = Graph(1, 2~3) // immutable or mutable

a) g + 1 // == g

b) g + 0 // Graph(0, 1, 2, 3, 2~3)

c) g + 1.2 // error: overloaded method…

d) g + 0~1 // Graph(0, 1, 2, 3, 0~1, 2~3)

e) g ++ List(1~2, 2~3) // Graph(1, 2, 3, 1~2, 2~3)

f) g ++ List[GraphParam[Int,UnDiEdge]](1~2, 2~3, 0) // Graph(0, 1, 2, 3, 1~2, 2~3)

g) g - 0 // == g

h) g - 1 // Graph(2, 3, 2~3)

i) g - 2 // Graph(1, 3)

j) g -? 2 // == g

k) g - 2~3 // Graph(1, 2, 3)

l) g -! 2~3 // Graph(1)

m) g -- List(2, 3~3) // Graph(1, 3)

val g = scalax.collection.mutable.Graph(1, 2~3) // Graph[Int,UnDiEdge]; to be assigned before each example

n) g += 0 // Graph(0, 1, 2, 3, 2~3)

o) g …= // (mutated graph)

p) g +~= (3~>1) // Graph(1, 2, 3, 2~3, 3~>1)

q) implicit val factory = scalax.collection.edge.LDiEdge
   g.addLEdge(3,4)("red") // true
   g // Graph(1, 2, 3, 4, 2~3, 3~>4 'red)


As to

a) Node 1 is not added because it is already contained, so the same instance is returned.

b) Node 0 is added to a new Graph instance.

c) Double is incompatible with Int so the compiler will issue an error.

d) Edge 0~1 is added to a new Graph instance.

e) All edges contained in the right hand operand are added to a new Graph instance.

f) All edges or nodes contained in the right hand operand are added to a new Graph instance.

g) Node 0 is not removed because it is not contained, so the same instance is returned.

h) Node 1 is removed from a new Graph instance. This node was not incident with any edge.

i) Node 2 along with 2~3 is “ripple” removed from a new Graph instance. Node 2 was incident with 2~3 .

j) Node 2 is not removed because this “gentle” removal succeeds only if the node is not incident with any edge.

k) Edge 2~3 is removed from a new Graph instance leaving its incident nodes in place.

l) Edge 2~3 along with its incident nodes 2 and 3 is removed from a new Graph instance. Incident nodes will only be removed if they are not incident with any other edge.

m) All edges or nodes contained in the right hand operand are removed from a new Graph instance.

n) Node 0 is added to the same mutable Graph instance.

o) All addition and subtraction operations valid for immutable graphs have their mutable counterparts with = as the last operator character.

p) +~= is an edge creation operator with its counterpart methods addEdge and addAndGetEdge .

q) addLEdge (same as +~+=) adds a labeled edge to a mutable graph specifying its nodes and label. This class of methods (operators) doesn’t require outer edge parameters but has an implicit edge factory parameter instead.

The Graph operators +, += for adding and - , -= for subtracting may have an inner or outer node or edge at the right hand side. Graph guarantees a consistent state of the node and edge sets after any operation including duplicate node/edge prevention as demonstrated in e).

Both ripple and gentle removal schemas are supported. Using the standard operators - and -= , removals comply with the definition of node and edge removal in graph theory: ripple removal of nodes as in i) and gentle removal of edges as in k). In addition, gentle removal of nodes by -? and -?= as in j) as well as ripple removal of edges by -! and -!= as in l) are also supported.
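The four removal variants can be sketched on a toy model. The code below is not the library's implementation: MiniGraph and its method names are hypothetical, and an undirected edge is simply modelled as the set of its end nodes. It reproduces the results of examples i) to l) for Graph(1, 2~3).

```scala
object RemovalDemo {
  // Toy model, NOT the library's implementation: an undirected edge is the
  // set of its end nodes, e.g. 2~3 is modelled as Set(2, 3).
  final case class MiniGraph(nodes: Set[Int], edges: Set[Set[Int]]) {
    private def isIncident(node: Int): Boolean = edges.exists(_.contains(node))

    // Ripple removal of a node: incident edges are removed as well (like -).
    def removeNode(node: Int): MiniGraph =
      MiniGraph(nodes - node, edges.filterNot(_.contains(node)))

    // Gentle removal of a node: succeeds only if no edge is incident (like -?).
    def removeNodeGently(node: Int): MiniGraph =
      if (isIncident(node)) this else removeNode(node)

    // Gentle removal of an edge: its end nodes stay in place (like -).
    def removeEdge(edge: Set[Int]): MiniGraph = copy(edges = edges - edge)

    // Ripple removal of an edge: end nodes left without any incident edge
    // are removed, too (like -!). An edge not contained changes nothing.
    def removeEdgeRipple(edge: Set[Int]): MiniGraph =
      if (!edges.contains(edge)) this
      else {
        val remaining = removeEdge(edge)
        val orphaned  = edge.filterNot(remaining.isIncident)
        remaining.copy(nodes = remaining.nodes -- orphaned)
      }
  }

  // Corresponds to Graph(1, 2~3) in the examples above.
  val g = MiniGraph(Set(1, 2, 3), Set(Set(2, 3)))
}
```

With this model, g.removeNode(2) leaves the nodes 1 and 3, g.removeNodeGently(2) returns g unchanged, g.removeEdge(Set(2, 3)) keeps all three nodes and g.removeEdgeRipple(Set(2, 3)) leaves only node 1.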

It is also possible to add elements to or subtract elements from node and edge sets. However, these operations have less practical relevance as the result is always a decoupled immutable set.

In case of mutable graphs, two kinds of edge addition are supported:

• Standard addition is achieved by the operator += requiring an instance of EdgeLike at the right hand side – see example d).

• Factory-based addition is achieved by special add* methods or the corresponding operators having the incident nodes and optionally an edge weight and label as arguments. The edge factory to be used for internal edge creation must be defined as implicit val – see example q). Factory based addition should slightly outperform standard addition.

For adding or subtracting elements of another Graph instance in bulk see 4.5.


4.5 Computing Union, Difference and Intersection

val g = Graph(1~2, 2~3, 2~4, 3~5, 4~5)

val h = Graph(3~4, 3~5, 4~6, 5~6)

a) g union h // Graph(1~2, 2~3, 2~4, 3~5, 4~5, 3~4, 4~6, 5~6)

b) g diff h // Graph(1~2)

c) g intersect h // Graph(3~5, 4)

d) g &= h // Graph(3~5, 4), same instance

Union (same as ++), difference (diff, same as --) and intersection (intersect, same as &) work in compliance with the corresponding definitions in graph theory. Use any of the previous operators followed by = for the mutable variants.
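One reading of these operations that reproduces the results of examples a) to c) is sketched below on node and edge sets. This is a toy model under stated assumptions, not the library's implementation: MiniGraph and fromEdges are hypothetical, and undirected edges are modelled as sets of their end nodes.

```scala
object SetOpsDemo {
  // Toy model, NOT the library's implementation: undirected edges as node sets.
  final case class MiniGraph(nodes: Set[Int], edges: Set[Set[Int]]) {
    def union(that: MiniGraph): MiniGraph =
      MiniGraph(nodes | that.nodes, edges | that.edges)

    def intersect(that: MiniGraph): MiniGraph =
      MiniGraph(nodes & that.nodes, edges & that.edges)

    // Removes the nodes of `that`; edges losing an end node are dropped, too.
    def diff(that: MiniGraph): MiniGraph = {
      val keptNodes = nodes -- that.nodes
      MiniGraph(keptNodes, edges.filter(_.subsetOf(keptNodes)))
    }
  }

  // Builds a graph from edges only; the node set is derived from the edge ends.
  def fromEdges(edges: (Int, Int)*): MiniGraph = {
    val edgeSet = edges.map { case (a, b) => Set(a, b) }.toSet
    MiniGraph(edgeSet.flatten, edgeSet)
  }

  val g = fromEdges(1 -> 2, 2 -> 3, 2 -> 4, 3 -> 5, 4 -> 5)
  val h = fromEdges(3 -> 4, 3 -> 5, 4 -> 6, 5 -> 6)
}
```

In this model g diff h consists of the nodes 1 and 2 with the edge 1~2, and g intersect h consists of the nodes 3, 4 and 5 with the single edge 3~5.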

4.6 Inspecting Endpoints

val uE = 3~4 // UnDiEdge[Int] = 3~4

a) uE._1 * uE._2 // Int = 12

b) uE product // Int = 12

val dE = 1~>2 // DiEdge[Int] = 1~>2

c) dE.from - dE.to // Int = -1

d) uE.arity == dE.arity // Boolean = true

val hE = 1~2~11~12 // HyperEdge[Int] = 1~2~11~12

e) hE._n(hE.arity - 1) // Int = 12

f) hE sum // Int = 26

As to

a) _1 and _2 provide access to the first and second node of an edge.

b) Edges are Iterable , too.

c) The first node of a directed edge may be accessed by _1 , from or source , the second by _2 , to or target .

d) Both uE and dE have an arity of 2.

e) _n also enables direct access. Here we access the last node of hE.

f) Once again, all edges including hyperedges are Iterable .

4.7 Inspecting Neighbors and Incident Edges

val g = Graph(0, 1~3, 3~>2)

val (n0, n2, n3) = (g get 0, g get 2, g get 3)

a) n0 diSuccessors // Set[g.NodeT] = Set()

b) n2.diSuccessors.isEmpty // Boolean = true

c) n3 diSuccessors // Set[g.NodeT] = Set(1, 2)

d) n3 diPredecessors // Set[g.NodeT] = Set(1)

e) n2 incoming // Set[g.EdgeT] = Set(3~>2)

f) n3 ~>? n2 // Option[g.EdgeT] = Some(3~>2)


As to

a) Node n0 is independent so the set of direct successor nodes is empty.

b) Node n2 is reachable but has no direct successor so the set of its out-neighbors is empty, too.

c) Nodes 1 and 2 are the direct successors of node n3 : 1 via the undirected edge 1~3 and 2 via the directed edge 3~>2 .

d) The only direct predecessor of node n3 is node 1.

e) From the perspective of node n2 there is only one incoming edge namely 3~>2 .

f) ~>? is a synonym for findOutgoingTo .

All in all, given a specific node, the following methods are available to inspect incident edges and neighbors:

Result Type      Method name        Synonym (method name)   Synonym (operator)

Set[NodeT]       diSuccessors       outNeighbors            ~>|
Set[NodeT]       diPredecessors     inNeighbors             <~|
Set[NodeT]       neighbors          (none)                  ~|
Set[EdgeT]       outgoing           (none)                  ~>
Set[EdgeT]       outgoingTo         (none)                  ~>
Set[EdgeT]       incoming           (none)                  <~
Set[EdgeT]       incomingFrom       (none)                  <~
Option[EdgeT]    findOutgoingTo     (none)                  ~>?
Option[EdgeT]    findIncomingFrom   (none)                  <~?

Table 4.7: Neighbor and Incident Methods

See also: object scalax.collection.Graph#InnerNodeLike .
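How direct successors and predecessors come about in a mixed graph can be sketched in plain Scala. This is an illustration of the concept only, not the library's code: NeighborSketch and Edge are hypothetical names, the two functions merely borrow the method names from the table, and loops are not treated specially.

```scala
object NeighborSketch {
  // Hypothetical edge model: an undirected edge may be followed in both directions.
  final case class Edge(from: Int, to: Int, directed: Boolean)

  // The edges of Graph(0, 1~3, 3~>2); node 0 is isolated and has no entry here.
  val edges: Set[Edge] = Set(
    Edge(1, 3, directed = false),
    Edge(3, 2, directed = true))

  // Nodes reachable from n by following exactly one edge.
  def diSuccessors(n: Int): Set[Int] = edges.collect {
    case Edge(`n`, to, _)       => to    // any edge leaving n
    case Edge(from, `n`, false) => from  // undirected edge ending at n
  }

  // Nodes from which n is reachable by following exactly one edge.
  def diPredecessors(n: Int): Set[Int] = edges.collect {
    case Edge(from, `n`, _)   => from  // any edge arriving at n
    case Edge(`n`, to, false) => to    // undirected edge starting at n
  }
}
```

For node 3 the sketch yields the successors 1 and 2 and the single predecessor 1, in line with c) and d); nodes 0 and 2 have no successors.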

4.8 Querying by Function

val g = Graph(2~>3, 3~1, 5)

a) g.nodes filter (_ > 2) // Set [g.NodeT] = Set(5,3)

b) g.nodes filter (_.degree > 1) // Set [g.NodeT] = Set(3)

c) g.edges filter (_.diSuccessors.isEmpty) // Set[g.EdgeT] = Set()

d) g filter ((i: Int) => i >= 2) // Graph[Int,DiEdge] = Graph(2,3,5, 2~>3)

e) g filter g.having(node = _ >= 2) // Graph(2,3,5, 2~>3)

f) g filter g.having(edge = _.directed) // Graph(2,3, 2~>3)

g) g count g.having(node = _ >= 3, edge = _.directed) // Int = 3

As to

a) Filters the node set by (NodeT) => _ > 2 .

b) Filters nodes with a degree > 1 .

c) Filters edges with no adjacent edges.

d) Creates a subgraph of g with nodes satisfying _ >= 2 and their incident edges.

e) Same as d) utilizing having . Method having , returning a partial function, helps to reduce code by internally matching nodes and edges.


f) Creates a subgraph of g consisting of directed edges only. Note that filtering g.edges is not an alternative because it would return a set of contained edges, not a subgraph.

g) Counts the number of nodes and edges satisfying either of the given predicates.

Graph queries may start at the node set nodes , the edge set edges or at the Graph instance. Filtering a graph results in a subgraph obtained by node and / or edge predicates.
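The difference between filtering a node set and filtering a graph can be shown with a small library-independent sketch. FilterSketch, SimpleGraph and filterNodes are hypothetical names; the sketch assumes that a node predicate keeps exactly those edges whose ends both satisfy it, which agrees with result d) above.

```scala
object FilterSketch {
  final case class Edge(from: Int, to: Int, directed: Boolean)

  final case class SimpleGraph(nodes: Set[Int], edges: Set[Edge]) {
    // Subgraph induced by a node predicate: the matching nodes plus
    // every edge that connects two matching nodes.
    def filterNodes(p: Int => Boolean): SimpleGraph = {
      val kept = nodes.filter(p)
      SimpleGraph(kept, edges.filter(e => kept(e.from) && kept(e.to)))
    }
  }

  // Plain-data counterpart of Graph(2~>3, 3~1, 5).
  val g = SimpleGraph(
    Set(1, 2, 3, 5),
    Set(Edge(2, 3, directed = true), Edge(3, 1, directed = false)))
}
```

Calling FilterSketch.g.nodes.filter(_ >= 2) returns just the set of the nodes 2, 3 and 5, whereas FilterSketch.g.filterNodes(_ >= 2) returns a graph with these nodes and the directed edge from 2 to 3.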

4.9 Finding Paths

val g = Graph( 1~2 % 4, 2~3 % 2, 1~>3 % 5, 1~5 % 3,

3~5 % 2, 3~4 % 1, 4~>4 % 1, 4~>5 % 0)

def n(outer: Int) = g get outer // looks up inner node equaling outer

a) n(1) findSuccessor (_.outDegree > 3) // Option[g.NodeT] = None

b) n(1) findSuccessor (_.outDegree >= 3) // Option[g.NodeT] = Some(3)

c) n(4) findSuccessor (_.edges forall (_.undirected)) // Some(2)

d) n(4) isPredecessorOf n(1) // true

e) n(1) pathTo n(4) // Some(Path(1, 1~5 % 3, 5, 3~5 % 2, 3, 3~4 % 1, 4) )

f) n(1) pathUntil (_.outDegree >= 3) // Some(Path(1, 1~5 % 3, 5, 3~5 % 2, 3))

g) val p = n(3) shortestPathTo n(1) get // Path(3, 3~4 % 1, 4, …, 1)

h) p nodes // List[g.NodeT] = List(3, 4, 5, 1)

i) p weight // Long = 4

j) val oP = n(4) pathTo (n(2), nodeFilter = _ < 4)
   oP map (_.nodes) // Some(List(4, 3, 2))

k) val oP = n(4) pathTo (n(2), edgeFilter = _.weight != 2)
   oP map (_.nodes) // Some(List(4, 5, 1, 2))

As to

a) Searches for any (direct or indirect) successor of node 1 having outDegree > 3 and finds None.

b) Searches for any successor of node 1 having outDegree >= 3 and finds node 3. This also means that there exists a path from node 1 to node 3 but this path is not to be determined by this method call.

c) Searches for any successor of node 4 having only undirected edges and finds node 2.

d) Successfully tests for node 4 being a predecessor of node 1.

e) Finds an arbitrary path from node 1 to node 4.

f) Finds a path from node 1 to an arbitrary node having outDegree >= 3 .


g) Determines the shortest path from node 3 to node 1 and gets the result from Option . Calling get is okay in this example because we know that there must exist a path.

h) Reduces path p to the List of nodes on the path. The returned path is an instance of the inner class Path facilitating further functionality…

i) … so, among others, it provides weight to calculate the total of the edge weights on this path.

j) Finds a path from node 4 to node 2 under the constraint that all nodes on the path must have a value less than 4. The inner case class g.Nav enables functional navigation.

k) Finds a path from node 4 to node 2 under the constraint that no edge on the path may have a weight of 2.

All methods relying on recursion are tail recursive.
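The result of example g) can be checked by hand with a small shortest-path sketch in plain Scala. Which algorithm the library uses internally is not stated here; Dijkstra's algorithm is used as a stand-in, and ShortestPathSketch, Edge, adjacency and shortestPath are hypothetical names.

```scala
import scala.collection.mutable

object ShortestPathSketch {
  // Hypothetical weighted edge; an undirected edge may be followed both ways.
  final case class Edge(from: Int, to: Int, weight: Long, directed: Boolean)

  // The mixed graph of this section as plain data.
  val edges: List[Edge] = List(
    Edge(1, 2, 4, directed = false), Edge(2, 3, 2, directed = false),
    Edge(1, 3, 5, directed = true),  Edge(1, 5, 3, directed = false),
    Edge(3, 5, 2, directed = false), Edge(3, 4, 1, directed = false),
    Edge(4, 4, 1, directed = true),  Edge(4, 5, 0, directed = true))

  // Outgoing (neighbor, weight) pairs per node.
  val adjacency: Map[Int, List[(Int, Long)]] =
    edges.flatMap { e =>
      val forward = (e.from, (e.to, e.weight))
      if (e.directed) List(forward) else List(forward, (e.to, (e.from, e.weight)))
    }.groupBy(_._1).map { case (node, pairs) => node -> pairs.map(_._2) }

  // Dijkstra's algorithm; returns the nodes on a shortest path and its total weight.
  def shortestPath(source: Int, target: Int): Option[(List[Int], Long)] = {
    val dist  = mutable.Map(source -> 0L)
    val prev  = mutable.Map.empty[Int, Int]
    val done  = mutable.Set.empty[Int]
    // Smallest tentative distance first.
    val queue = mutable.PriorityQueue.empty[(Long, Int)](
      Ordering.by[(Long, Int), Long](_._1).reverse)
    queue.enqueue((0L, source))

    while (queue.nonEmpty) {
      val (d, node) = queue.dequeue()
      if (done.add(node)) { // skip stale queue entries
        for ((next, w) <- adjacency.getOrElse(node, Nil)) {
          val candidate = d + w
          if (candidate < dist.getOrElse(next, Long.MaxValue)) {
            dist(next) = candidate
            prev(next) = node
            queue.enqueue((candidate, next))
          }
        }
      }
    }

    dist.get(target).map { total =>
      // Walk the predecessor links back from target to source.
      var path = List(target)
      while (path.head != source) path = prev(path.head) :: path
      (path, total)
    }
  }
}
```

ShortestPathSketch.shortestPath(3, 1) yields the nodes 3, 4, 5, 1 with a total weight of 4, in line with h) and i): the route via 4 costs 1 + 0 + 3, which beats 3, 5, 1 (weight 5) and 3, 2, 1 (weight 6).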

4.10 Traversing

Let g be the mixed graph defined in chapter 4.9 Finding Paths. Then

def n(outer: Int) = g get outer // looks up inner node equaling outer

import scalax.collection.GraphTraversal.VisitorReturn._

var sum = 0 // for node visitor side-effects
var weights = 0L // for edge visitor side-effects

object Visitors { // needed only in REPL
  def add(node: g.NodeT) = { // a node visitor
    sum += node // adds the value of the visited node to sum
    Continue // returns Continue as opposed to Cancel
  }
  def add(edge: g.EdgeT) = { // an edge visitor
    weights += edge.weight // adds the weight of the visited edge to weights
  }
}
import Visitors._

a) sum = 0
   n(4).traverseNodes() { add(_) }
   sum // Int = 15

b) sum = 0
   n(4).traverseNodes(maxDepth = 1) { add(_) }
   sum // Int = 12

c) sum = 0
   n(4).traverseNodes(nodeFilter = _ <= 4) { add(_) }
   sum // Int = 10

d) weights = 0
   n(4).traverseEdges() { weights += _.weight }
   weights // Long = 30

e) weights = 0
   n(4).traverseEdges(maxDepth = 1) { add(_) }
   weights // Long = 2

f) sum = 0; weights = 0
   n(4).traverse(nodeFilter = _ <= 4, edgeFilter = _.weight < 4)(
     nodeVisitor = add(_), edgeVisitor = add(_))
   sum // Int = 9
   weights // Long = 9

g) val traversal = g.newTraversal(nodeFilter = _ <= 4, nodeVisitor = add(_))
   sum = 0
   traversal (n(4), maxDepth = 1)
   traversal (n(2), breadthFirst = false)
   sum // Int = 17

As to

a) Sums up the values of all nodes in g because the traversal is unrestricted and g is connected.

b) With maxDepth = 1 the traversal is restricted to one layer so the root node and the nodes 3 and 5 will be visited.

c) Filtering nodes by _ <= 4 results in the exclusion of node 5 from the traversal.

d) As in a) this is a full traversal but with an edge visitor. The value of weights after this traversal is greater than the sum of weights over all edges because undirected edges can be visited from both ends.

e) This traversal has the same restriction as that in b). The signature of traverseEdges ensures that the edge visitor will be called. The resulting weights is the sum of 0, 1 and 1 including the hook (loop) at node 4.

f) traverse has a generic signature which allows both a node and an edge visitor argument.

g) It is advantageous to explicitly create a Traversal instance and use this to start traversals on different roots whenever subsequent traversals share filters and visitors.

Inspecting the neighbors of a node could also be regarded as a one-step traversal (covered in 4.7). But a traversal in the real sense of the word allows unlimited steps beginning at a specific root node with each step following some kind of edges from the node being visited. Set the direction parameter to one of Successors (default), Predecessors or AnyConnected to determine whether only outgoing, only incoming or all edges should be followed.

Unfiltered traversals in an undirected graph or in a directed graph with direction = AnyConnected will visit all nodes and edges of the graph provided it is connected. By their nature, traversals are limited to the component containing the root node.

It is worth mentioning that, among others, visitors may also call nested traversals.
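The effect of maxDepth and nodeFilter on a node traversal can be reproduced with a short breadth-first sketch in plain Scala. It is an illustration under stated assumptions rather than the library's implementation: TraversalSketch and successors are hypothetical names, traverseNodes only borrows the method name, and the successor lists were derived by hand from the graph in 4.9.

```scala
import scala.annotation.tailrec
import scala.collection.immutable.Queue

object TraversalSketch {
  // Direct successors per node; undirected edges appear in both directions.
  val successors: Map[Int, List[Int]] = Map(
    1 -> List(2, 3, 5),
    2 -> List(1, 3),
    3 -> List(2, 5, 4),
    4 -> List(3, 4, 5),
    5 -> List(1, 3))

  // Breadth-first traversal from root, calling visit once per reached node.
  def traverseNodes(root: Int,
                    maxDepth: Int = Int.MaxValue,
                    nodeFilter: Int => Boolean = _ => true)(visit: Int => Unit): Unit = {
    @tailrec
    def loop(queue: Queue[(Int, Int)], seen: Set[Int]): Unit =
      queue.dequeueOption match {
        case None => ()
        case Some(((node, depth), rest)) =>
          visit(node)
          // Expand only below the depth limit and only to unseen, admitted nodes.
          val next =
            if (depth >= maxDepth) Nil
            else successors.getOrElse(node, Nil).distinct
                   .filter(n => nodeFilter(n) && !seen(n))
          loop(rest ++ next.map(n => (n, depth + 1)), seen ++ next)
      }

    if (nodeFilter(root)) loop(Queue((root, 0)), Set(root))
  }
}
```

Summing the visited nodes from root 4 gives 15 without restrictions, 12 with maxDepth = 1 and 10 with nodeFilter = _ <= 4, the same values as in a), b) and c).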


4.11 Measuring Graphs and Grouping Nodes by Degree

Assume g is the mixed graph from chapter 4.9 Finding Paths. Then

a) g.order // Int = 5

b) g.graphSize // Int = 8

c) g.size // Int = 13

d) g.totalDegree // Int = 16

e) g.degreeSet // TreeSet(4, 3, 2)

f) g.degreeNodeSeq(g.InDegree) // List((4,3), (3,5), (2,1), (2,4), (2,2))

g) g.degreeNodesMap // Map(2 -> Set(2), 3 -> Set(5,1), 4 -> Set(3,4))

h) g.degreeNodesMap(degreeFilter = _ > 3) // Map(4 -> Set(3,4))

As to

a) The order of (number of nodes in) g.

b) The size of (number of edges in) g.

c) The number of all contained elements (nodes and edges) of g.

d) The total degree (sum of degrees over all nodes) of g.

e) The distinct, decreasing set of degrees over all nodes of g.

f) The non-decreasing sequence of in-degrees over all nodes in g with inner node references.

g) A map from each degree occurring in g to the set of nodes having that degree.

h) The same map as in g) restricted to degrees greater than 3.

All degree methods have implicit parameters to select in- or out-degrees and to filter degrees.
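The degree figures above can be recomputed from the edge list alone. The following plain-Scala sketch uses hypothetical names (DegreeSketch, edges, degree) and assumes the usual convention that a loop contributes 2 to the degree of its node.

```scala
object DegreeSketch {
  // End nodes of the eight edges of the graph in 4.9; direction does not
  // matter for the (total) degree. (4, 4) is the loop at node 4.
  val edges: List[(Int, Int)] =
    List((1, 2), (2, 3), (1, 3), (1, 5), (3, 5), (3, 4), (4, 4), (4, 5))

  // Every edge end counts once, so the loop adds 2 to the degree of node 4.
  val degree: Map[Int, Int] =
    edges.flatMap { case (a, b) => List(a, b) }
      .groupBy(identity)
      .map { case (node, ends) => node -> ends.size }

  // Sum of degrees over all nodes; always twice the number of edges.
  val totalDegree: Int = degree.values.sum

  // Nodes grouped by their degree.
  val degreeNodesMap: Map[Int, Set[Int]] =
    degree.groupBy(_._2).map { case (d, nodes) => d -> nodes.keySet }
}
```

The sketch arrives at a total degree of 16 and at the grouping 2 -> Set(2), 3 -> Set(1, 5), 4 -> Set(3, 4), which agrees with d) and g).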

4.12 Classifying Graphs

isBipartite

isComplete

isConnected

isRegular

Note: Some of the above classifying methods are not yet implemented.

4.13 Chaining Method Calls

g.nodes find (_.inDegree > g.order / 5) orElse Some (g.nodes.head)

Graph is designed for functional-style programming, so method calls can be chained freely.


5 Customizing Graphs

You can start using Graph out of the box, but often you will want some degree of customization. The following kinds of customization are available:

5.1 Defining Custom Edges

If, for any reason, you decide not to use any of the predefined edge classes, you may design your own edge class. Recapping the Flight label example, it is also possible to define a custom edge type Flight . With such a custom edge type you can create Graph instances of the type Graph[Airport, Flight] . For the sake of simplicity we design the Flight edge to have flightNo as its only attribute. Given the node class

case class Airport( val code: String) // node type

val (ham, ny) = (Airport("HAM"), Airport("JFK")) // two nodes

assume we want to be able to write

val flight = ham ~> ny ## "007" // flightNo 007 - doesn’t work yet

val g = Graph(flight) // Graph[Airport, Flight] - doesn’t work yet

Here is how to achieve the above requirements:

a) class Flight[Airport](nodes: Product, val flightNo: String)
     extends DiEdge[Airport](nodes)               // a1)
     with ExtendedKey[Airport]                    // a2)
     with EdgeCopy[Flight]                        // a3)
     with EdgeIn[Airport,Flight]                  // a4)
   {
     def keyAttributes = Seq(flightNo)            // a5) ExtendedKey
     override def copy[NN](newNodes: Product) =   // a6) EdgeCopy
       new Flight[NN](newNodes, flightNo)
   }
   object Flight {
     def apply(from: Airport, to: Airport, no: String) = // a7)
       new Flight[Airport](NodeProduct(from, to), no)
     def unapply(e: Flight[Airport]) = Some(e)
   }

b) final class FlightAssoc[A <: Airport](val e: DiEdge[A]) {     // b1)
     @inline def ## (flightNo: String) =
       new Flight[A](e.nodes, flightNo) with EdgeIn[A,Flight]
   }
   implicit def edge2FlightAssoc[A <: Airport](e: DiEdge[A]) =  // b2)
     new FlightAssoc[A](e)

As to

a1) DiEdge should be the base of any directed custom edge. Airport is the node type.

a2) If any of the label attributes is part of the key of the edge type, ExtendedKey must be mixed in. An attribute is a key attribute if it must be considered by equals . flightNo is such a key attribute because there may exist several flights from and to the same airport, so we distinguish them by flightNo .

a3) All edge implementations must mix in EdgeCopy .

a4) All edge implementations must mix in EdgeIn .

a5) Key attributes must be added to this Seq.

a6) copy will be called by Graph transparently to create an inner edge. Thus copy plays the role of an inner edge factory. It must return an instance of the edge class.


a7) This standard apply makes use of NodeProduct which creates a Product of the arguments.

b1) Establishes the Flight edge factory shortcut ## which converts a directed edge to Flight .

b2) Enables the implicit usage of the ## operator.

Note that the supplied tests contain a more complete implementation of the flight example – see Flight.scala , TFlight.scala and FlightRouteMap.jpg in the repository.

In general, when deciding on how to define your custom edge, the following steps apply:

(A) Decide which predefined edge class to derive from. These are listed in 2.3 Edge Factories.

(B) If your edge type is labeled by one or more attributes, decide for each attribute whether it counts as a key or as a non-key attribute. You have to deal with key attributes only if your graph is to support multi-edges. Then, similar to SQL database primary key design, those attributes making an edge unique must be defined as key attributes. If you have one or more key attributes, mix in ExtendedKey and enter all key attributes into the list returned by def keyAttributes , which you must override. For any non-key attribute it is sufficient to add it as a constructor parameter.

(C) EdgeCopy must always be mixed in to override its abstract method copy .

(D) If you want to avoid any loop in your graph, you can achieve this by simply mixing in LoopFreeEdge .

(E) If your edges are weighted you may have selected either WUnDiEdge or WDiEdge in (A). If weight is your single custom attribute you don’t need to implement a custom edge. Otherwise you can use these predefined classes as a template to define your own weighted custom edge.

(F) In your custom edge class you are free to override validate , which is called at edge instantiation time. For instance, the predefined validate ensures that no edge can be instantiated with any of its ends being null .

(G) Optionally, customize toString . Notice that EdgeLike comes with several protected methods for prefixes, braces etc., such as attributesToString , which just need to be overridden. Thus they remove the burden of programming toString from scratch.

(H) Add a type and an implicit def to make passing of the custom edge to Graph methods easy.

(I) Optionally, design and implement your custom edge factory shortcut. Doing this you may also opt to use the standard edge factory shortcuts ~ and ~> for your custom edge.

In case of mixed graphs you may also design your own hierarchy of custom edge classes.
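The role of key attributes described in a2) and (B) can be illustrated without the library: key attributes take part in equals and hashCode, non-key attributes do not. The sketch below is plain Scala with hypothetical names (KeyAttributeSketch, FlightSketch, remark); it does not use ExtendedKey or any other Graph trait.

```scala
object KeyAttributeSketch {
  final case class Airport(code: String)

  // Hypothetical stand-in for a custom edge: flightNo is a key attribute,
  // remark is a non-key attribute and is ignored by equals and hashCode.
  final class FlightSketch(val from: Airport, val to: Airport,
                           val flightNo: String, val remark: String = "") {
    // End nodes plus key attributes identify the edge.
    private def key = (from, to, flightNo)

    override def equals(other: Any): Boolean = other match {
      case that: FlightSketch => key == that.key
      case _                  => false
    }
    override def hashCode: Int = key.hashCode
  }

  val (ham, jfk) = (Airport("HAM"), Airport("JFK"))

  val flights: Set[FlightSketch] = Set(
    new FlightSketch(ham, jfk, "007"),
    new FlightSketch(ham, jfk, "008"),               // same airports, other key: kept
    new FlightSketch(ham, jfk, "007", "codeshare"))  // same key: treated as a duplicate
}
```

The resulting set holds two flights: the two edges between the same airports coexist because their key attribute differs, while the differing non-key attribute alone does not create a new edge.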

See also: object scalax.collection.GraphEdge._ object scalax.collection.GraphPredef._

5.2 Defining User Constraints

It is planned to support extensibility by allowing custom constraints on the edge set.

5.3 Narrowing Type Parameters

(To be completed…)

As demonstrated in the test case custom.NarrowedType / custom.TNarrowedType , it is a relatively easy task to narrow down the type parameters to your needs although a single type definition does not suffice.

5.4 Modifying Methods

You may want to modify methods of the library implementation of Graph or its inner traits NodeT , NodeSet , EdgeT or EdgeSet to alter their run-time behavior or extend their functionality. However, be careful not to alter their semantics. If you need new semantics, it is better to add new methods.


5.5 Adding new Methods

You may want to add new methods to the library implementation of Graph or its inner traits NodeT , NodeSet , EdgeT or EdgeSet .

Adding new methods to Graph may be achieved by implicit functions. See the test case custom.TExtByImplicitTest .

5.6 Providing new Implementations

See custom.ExtNode and TExtNode .

6 Run-time Characteristics

The default Graph implementation is based on scala.collection.mutable.HashMap and therefore performs reasonably well. Also, it is guaranteed that the only limit for storing Graph instances and calling Graph algorithms is the JVM heap space, as the implementation of all algorithms is tail recursive. Typically, graphs made up of several hundred thousand nodes and edges are instantiated quickly.

In terms of the Big-O notation, operation costs on graphs with the default Graph implementation can be summarized as follows:

Operation               Cost of Operation

instance creation       O(V + E)
element addition        O(1)
element removal         O(1)
element look-up         O(1)
set-based traversal     O(V + E)
root-based traversal    O(V + E)
path search             O(V + E)
shortest path search    O(VlogV + E)

If you seek even better performance, one option is to ask for a special implementation or to provide it yourself, which is a straightforward process thanks to the extensible nature of the Graph for Scala design.
