Adjacency matrix for polygons using Java

Adjacency matrix for polygons using Java

I am trying to create an adjacency matrix from a set of polygons. That is, I have a bunch of polygons and I want to identify which polygons have a common edge or "touch" each other. Once I find this information, I want to create an n x n matrix that indicates whether those each polygon either touches or does not touch the other polygon. Something like this:

Poly1 Poly2 Poly3 Poly1 0 1 0 Poly2 1 0 1 Poly3 0 1 0

That is the goal anyway. I was wondering if anyone knew of a Java package or such that could help me do this. I read some posts on doing this using VBA but I am wondering if Java Eclipse can be used for this purpose because I need to implement the adjacency matrix in a bigger program that I am writing on Java.

Edit: I have the polygon as a shapefile drawing. I think triangular Matrix will be fine because of the symmetry.

I've modified your Floyd-Warshall implementation to correctly initialize adjMat for the diagonal elements of the adjacency matrix, which should have a value 0. I also changed the floydWarshall method to not re-allocate adjMat , which has already been allocated in the main method. I also removed a duplicate call to floydWarshall in your arrayCondenser method. I have also modified the the arrayCondenser method to calculate the condensed array, and added a printCondensedGrid method to display the condensed array:

I believe the initialization below (moved into an initialize method) should reflect your picture more accurately.

In even rows, there are edges from every node to the node to its right. In odd rows, there are edges from every node to the node to its left. In every even column, there are edges from every node to the node below. In every odd column, there are edges from every node to the node above.

I've also changed the dimensions of the matrix to reflect the fact that there are 100 nodes, and not 99.

To test the initialization, I have added a test method testInit . This method runs through each node in the graph and checks nodes to the left, right, above and below to see if there are edges to those nodes. You can inspect the output of the test method and compare to the picture above.

Both initialize and testInit are called from main .

EDIT: I have implemented Gene's suggestion of using a 100 X 4 matrix where the 4 numbers represent the N, S, E, W directions, because those are the only directions in which links can exist. Gene suggested 4 bits, but I've used 4 array locations which uses more space, but I've put the actual adjacent nodes in those locations (if non-zero).

If you want to keep your original indexing, you can modify the if statements in the initialize method like so:


Authors: Justine Blanford, Fritz Kessler, Amy Griffin, David O'Sullivan, MGIS program, The Pennsylvania State University.

This courseware module is part of Penn State's College of Earth and Mineral Sciences' OER Initiative.

The College of Earth and Mineral Sciences is committed to making its websites accessible to all users, and welcomes comments or suggestions on access improvements. Please send comments or suggestions on accessibility to the site editor. The site editor may also be contacted with questions or comments about this Open Educational Resource.

Finding path-lengths by the power of Adjacency matrix of an undirected graph

I knew from Mark Newman's book - Networks: An Introduction (Page 137, Eq: 6.31) that, if $A$ is the adjacency matrix of a graph, then $ij


For a simple graph with vertex set U = <u1, …, un >, the adjacency matrix is a square n × n matrix A such that its element Aij is one when there is an edge from vertex ui to vertex uj , and zero when there is no edge. [1] The diagonal elements of the matrix are all zero, since edges from a vertex to itself (loops) are not allowed in simple graphs. It is also sometimes useful in algebraic graph theory to replace the nonzero elements with algebraic variables. [2] The same concept can be extended to multigraphs and graphs with loops by storing the number of edges between each two vertices in the corresponding matrix element, and by allowing nonzero diagonal elements. Loops may be counted either once (as a single edge) or twice (as two vertex-edge incidences), as long as a consistent convention is followed. Undirected graphs often use the latter convention of counting loops twice, whereas directed graphs typically use the former convention.

Of a bipartite graph Edit

The adjacency matrix A of a bipartite graph whose two parts have r and s vertices can be written in the form

where B is an r × s matrix, and 0r,r and 0s,s represent the r × r and s × s zero matrices. In this case, the smaller matrix B uniquely represents the graph, and the remaining parts of A can be discarded as redundant. B is sometimes called the biadjacency matrix.

Formally, let G = (U, V, E) be a bipartite graph with parts U = <u1, …, ur >, V = <v1, …, vs > and edges E . The biadjacency matrix is the r × s 0–1 matrix B in which bi,j = 1 if and only if (ui, vj) ∈ E .

If G is a bipartite multigraph or weighted graph, then the elements bi,j are taken to be the number of edges between the vertices or the weight of the edge (ui, vj) , respectively.

The adjacency matrix of a bipartite graph is totally unimodular. This means that the determinant of every square submatrix of it is −1, 0, or +1.

Variations Edit

An (a, b, c) -adjacency matrix A of a simple graph has Ai,j = a if (i, j) is an edge, b if it is not, and c on the diagonal. The Seidel adjacency matrix is a (−1, 1, 0) -adjacency matrix. This matrix is used in studying strongly regular graphs and two-graphs. [3]

The distance matrix has in position (i, j) the distance between vertices vi and vj . The distance is the length of a shortest path connecting the vertices. Unless lengths of edges are explicitly provided, the length of a path is the number of edges in it. The distance matrix resembles a high power of the adjacency matrix, but instead of telling only whether or not two vertices are connected (i.e., the connection matrix, which contains boolean values), it gives the exact distance between them.

Undirected graphs Edit

The convention followed here (for undirected graphs) is that each edge adds 1 to the appropriate cell in the matrix, and each loop adds 2. [4] This allows the degree of a vertex to be easily found by taking the sum of the values in either its respective row or column in the adjacency matrix.

Coordinates are 0–23.
White fields are zeros, colored fields are ones.

Directed graphs Edit

The adjacency matrix of a directed graph can be asymmetric. One can define the adjacency matrix of a directed graph either such that

  1. a non-zero element Aij indicates an edge from i to j or
  2. it indicates an edge from j to i .

The former definition is commonly used in graph theory and social network analysis (e.g., sociology, political science, economics, psychology). [5] The latter is more common in other applied sciences (e.g., dynamical systems, physics, network science) where A is sometimes used to describe linear dynamics on graphs. [6]

Using the first definition, the in-degrees of a vertex can be computed by summing the entries of the corresponding column and the out-degree of vertex by summing the entries of the corresponding row. When using the second definition, the in-degree of a vertex is given by the corresponding row sum and the out-degree is given by the corresponding column sum.

Coordinates are 0–23.
As the graph is directed, the matrix is not necessarily symmetric.

Trivial graphs Edit

The adjacency matrix of a complete graph contains all ones except along the diagonal where there are only zeros. The adjacency matrix of an empty graph is a zero matrix.

Spectrum Edit

The adjacency matrix of an undirected simple graph is symmetric, and therefore has a complete set of real eigenvalues and an orthogonal eigenvector basis. The set of eigenvalues of a graph is the spectrum of the graph. [7] It is common to denote the eigenvalues by λ 1 ≥ λ 2 ≥ ⋯ ≥ λ n . geq lambda _<2>geq cdots geq lambda _.>

For d -regular graphs, d is the first eigenvalue of A for the vector v = (1, …, 1) (it is easy to check that it is an eigenvalue and it is the maximum because of the above bound). The multiplicity of this eigenvalue is the number of connected components of G , in particular λ 1 > λ 2 >lambda _<2>> for connected graphs. It can be shown that for each eigenvalue λ i > , its opposite − λ i = λ n + 1 − i =lambda _> is also an eigenvalue of A if G is a bipartite graph. [8] In particular − d is an eigenvalue of bipartite graphs.

Isomorphism and invariants Edit

Suppose two directed or undirected graphs G1 and G2 with adjacency matrices A1 and A2 are given. G1 and G2 are isomorphic if and only if there exists a permutation matrix P such that

In particular, A1 and A2 are similar and therefore have the same minimal polynomial, characteristic polynomial, eigenvalues, determinant and trace. These can therefore serve as isomorphism invariants of graphs. However, two graphs may possess the same set of eigenvalues but not be isomorphic. [9] Such linear operators are said to be isospectral.

Matrix powers Edit

If A is the adjacency matrix of the directed or undirected graph G , then the matrix A n (i.e., the matrix product of n copies of A ) has an interesting interpretation: the element (i, j) gives the number of (directed or undirected) walks of length n from vertex i to vertex j . If n is the smallest nonnegative integer, such that for some i , j , the element (i, j) of A n is positive, then n is the distance between vertex i and vertex j . This implies, for example, that the number of triangles in an undirected graph G is exactly the trace of A 3 divided by 6. The adjacency matrix can be used to determine whether or not the graph is connected.

The adjacency matrix may be used as a data structure for the representation of graphs in computer programs for manipulating graphs. The main alternative data structure, also in use for this application, is the adjacency list. [10] [11]

Because each entry in the adjacency matrix requires only one bit, it can be represented in a very compact way, occupying only | V | 2 /8 bytes to represent a directed graph, or (by using a packed triangular format and only storing the lower triangular part of the matrix) approximately | V | 2 /16 bytes to represent an undirected graph. Although slightly more succinct representations are possible, this method gets close to the information-theoretic lower bound for the minimum number of bits needed to represent all n -vertex graphs. [12] For storing graphs in text files, fewer bits per byte can be used to ensure that all bytes are text characters, for instance by using a Base64 representation. [13] Besides avoiding wasted space, this compactness encourages locality of reference. However, for a large sparse graph, adjacency lists require less storage space, because they do not waste any space to represent edges that are not present. [11] [14]

An alternative form of adjacency matrix (which, however, requires a larger amount of space) replaces the numbers in each element of the matrix with pointers to edge objects (when edges are present) or null pointers (when there is no edge). [14] It is also possible to store edge weights directly in the elements of an adjacency matrix. [11]

Besides the space tradeoff, the different data structures also facilitate different operations. Finding all vertices adjacent to a given vertex in an adjacency list is as simple as reading the list, and takes time proportional to the number of neighbors. With an adjacency matrix, an entire row must instead be scanned, which takes a larger amount of time, proportional to the number of vertices in the whole graph. On the other hand, testing whether there is an edge between two given vertices can be determined at once with an adjacency matrix, while requiring time proportional to the minimum degree of the two vertices with the adjacency list. [11] [14]

Adjacency For WinBUGS Tool

The Adjacency For WinBUGS Tool is a customization of ArcMap. The tool iteratively processes each polygon within a selected layer and creates a text file of polygonal adjacency where each polygon is identified by a unique Adj_ID value. The output of this tool is used in WinBUGS (MRC Biostatistics Unit Cambridge) by the car.normal, car.l1, and conditional autoregressive distributions.

“We have been working with a class of models called hierarchical spatial count models ( These models are fitted with Markov chain Monte Carlo methods in WinBUGS. To accommodate the potential for spatial correlation between the responses, we employ a spatial conditional autoregressive component (spatial CAR). This spatial CAR requires an adjacency matrix, i.e., a matrix relating one areal unit to a collection of neighboring areal units. A requirement of this matrix is symmetry. GeoBUGS within the WinBUGS software can facilitate this adjacency matrix creation, but we find it much more convenient to conduct all of our mapping operations in ArcGIS. As such, a tool that facilitates the creation of the adjacency matrix for use in WinBUGS is most useful.”
Dr. Wayne Thogmartin

  1. Save the file: to your local hard drive
  2. Extract contents
  3. Double click AdjacencyForWinBUGS.esriAddIn file
  4. Open ArcMap 10
  5. Click on the Customize menu and the Customize Mode… item
  6. On the Customize dialog click the Commands tab
  7. Scroll down and click on Fox Tools in the Categories list
  8. Drag the Adjacency For WinBUGS button to a toolbar on the ArcMap 10 interface

Figure 1. The Adjacency For WinBUGS dialog.

  1. Click the AdjacencyForWinBUGS button to open the Adjacency For WinBUGS dialog (Figure 1).
  2. Choose an editable polygon shapefile.
  3. Click the Browse button and identify the output directory.
  4. Click OK button.
  5. Program execution:
    1. The folder specified by the user in the Output Directory textbox is created.
      1. The folder’s name will have the following structure: User identified directory + Layer Name +_Adjacency_+ Iterative Number.
      2. Characters within the new folder’s name that are problematic to ArcMap will be replaced with underscore characters.
      3. Problematic characters include

      1. Raw.txt: each line in the output file contains a list of numbers. The first number in the list is the Adj_ID of the input polygon all subsequent numbers in the list are the Adj_ID values of polygons adjacent to the input polygon. Only the input polygon’s Adj_ID will be listed if it has no adjacent neighbors.
      2. Adj.txt: is identical to the Raw.txt file with one exception, the leading Adj_ID value of the input polygon is omitted. If the input polygon does not have any adjacent neighbors, then a blank line is entered.
      3. Num.txt: contains the number of neighbors for each input polygon.
      4. SumNumNeigh.txt: the sum of values contained in the Num.txt.

      Impact of UMESC Science

      Bayesian computation is becoming an increasingly important means of statistical analysis of data. These methods are well-suited to modeling complex data, including spatial data, and software such as WinBUGS and R exist to facilitate these calculations. Unfortunately, these complex spatial data require careful preparation for proper analysis. The Adjacency Tool facilitates this analysis by easily formatting spatial data in ArcGIS, the globally predominant geographic information system. Tools such as the Adjacency Tool help bridge the information gap between statistical software and geographic information systems.

      Adjacency matrix for polygons using Java - Geographic Information Systems

      A graph is a data structure that consists of the following two components:
      1. A finite set of vertices also called as nodes.
      2. A finite set of ordered pair of the form (u, v) called as edge. The pair is ordered because (u, v) is not the same as (v, u) in case of a directed graph(di-graph). The pair of the form (u, v) indicates that there is an edge from vertex u to vertex v. The edges may contain weight/value/cost.
      Graphs are used to represent many real-life applications: Graphs are used to represent networks. The networks may include paths in a city or telephone network or circuit network. Graphs are also used in social networks like linkedIn, Facebook. For example, in Facebook, each person is represented with a vertex(or node). Each node is a structure and contains information like person id, name, gender, and locale. See this for more applications of graph.
      Following is an example of an undirected graph with 5 vertices.

      The following two are the most commonly used representations of a graph.
      1. Adjacency Matrix
      2. Adjacency List
      There are other representations also like, Incidence Matrix and Incidence List. The choice of graph representation is situation-specific. It totally depends on the type of operations to be performed and ease of use.
      Adjacency Matrix:
      Adjacency Matrix is a 2D array of size V x V where V is the number of vertices in a graph. Let the 2D array be adj[][], a slot adj[i][j] = 1 indicates that there is an edge from vertex i to vertex j. Adjacency matrix for undirected graph is always symmetric. Adjacency Matrix is also used to represent weighted graphs. If adj[i][j] = w, then there is an edge from vertex i to vertex j with weight w.

      The adjacency matrix for the above example graph is:

      Pros: Representation is easier to implement and follow. Removing an edge takes O(1) time. Queries like whether there is an edge from vertex ‘u’ to vertex ‘v’ are efficient and can be done O(1).
      Cons: Consumes more space O(V^2). Even if the graph is sparse(contains less number of edges), it consumes the same space. Adding a vertex is O(V^2) time.
      Please see this for a sample Python implementation of adjacency matrix.
      Adjacency List:
      An array of lists is used. The size of the array is equal to the number of vertices. Let the array be an array[]. An entry array[i] represents the list of vertices adjacent to the ith vertex. This representation can also be used to represent a weighted graph. The weights of edges can be represented as lists of pairs. Following is the adjacency list representation of the above graph.

      Note that in the below implementation, we use dynamic arrays (vector in C++/ArrayList in Java) to represent adjacency lists instead of the linked list. The vector implementation has advantages of cache friendliness.

      Generic Adjacency List Graph implementation

      I am trying to come up with a decent Adjacency List graph implementation so I can start tooling around with all kinds of graph problems and algorithms like traveling salesman and other problems. But I can't seem to come up with a decent implementation. This is probably because I am trying to dust the cobwebs off my data structures class.

      But what I have so far. and this is implemented in Java. is basically an edgeNode class that has a generic type and a weight-in the event the graph is indeed weighted.

      I have a graph class that has a list of edges a value for the number of Vertices and and an int value for edges as well as a boolean value for whether or not it is directed. The brings up my first question, if the graph is indeed directed, shouldn't I have a value in my edgeNode class? Or would I just need to add another vertices to my LinkedList? That would imply that a directed graph is 2X as big as an undirected graph wouldn't it?

      Finally does anybody have a standard way of initializing there graph? I was thinking of reading in a pipe-delimited file but that is so 1997.

      I guess when I am adding edges if boolean is true I would be adding a second edge. So far, this all depends on the file I write. So if I wrote a file with the following Vertices and weights.

      I would obviously load them into my list of edges, but how can I represent one vertices connected to the other. so on. I would need the opposite vertices? Say I was representing Highways connected to each city weighted and un-directed (each edge is bi-directional with weights in some fictional distance unit). Would my implementation be the best way to do that?

      I found this tutorial online Graph Tutorial that has a connector object. This appears to me be a collection of vertices pointing to each other. So you would have A and B each with there weights and so on, and you would add this to a list and this list of connectors to your graph. That strikes me as somewhat cumbersome and a little dismissive of the adjacency list concept? Am I wrong and that is a novel solution?

      This is all inspired by steve skiena's Algorithm Design Manual. Which I have to say is pretty good so far. Thanks for any help you can provide.

      Finding the shortest path in a fully-connected undirected graph

      I am trying to solve a problem in which I have a list of two-dimensional coordinates, and I want to find the shortest path that connects all of them.

      At first I thought this was a case of the Traveling Salesman Problem, however:

      • You don't need to get back to the starting city, which I believe TSP wants you to.
      • On this two-dimensional plane where we work with the euclidean distance metric, the triangle inequality holds, which the normal TSP algorithms don't care about, if I recall correctly.
      • There is no 'you only can visit a node once' rule in this problem the 'shortest path' could form a tree.

      I then thought to 'just make a graph and use Prim's or Kruskal's algorithm to find the (length of the) minimum spanning tree'. However, the graph representations commonly used are either an adjacency matrix, which seems a waste for an undirected graph, or an adjacency list, which is slower for a sparse graph (and a fully-connected graph is of course the exact opposite of sparse).

      I have the feeling that I am discarding information this specific problem has, which will result in using more time and/or memory that would be required to solve the shortest connecting path problem for a fully-connected, undirected graph in 2d-space.

      Object field declarations

      Then you don't need to specify a constructor at all.

      As a general rule, it is preferred to use the interface as the type rather than the implementation. This makes it easier to change implementations in the future. Both because you specify the implementation in fewer places and because this forces you to code to the interface.

      Redundant variables

      Why bother with numVertices ? Why not

      Then you don't have to manually maintain an extra variable that tracks information that you already have.

      I prefer singular names (e.g. vertex count) for singular variables and plural names for collections. In this case, you have a simple scalar variable and I would give it a singular name. This is of course a personal preference rather than a standard.

      You don't have to use this. with object fields in Java unless there is a conflict between a local variable/parameter and an object field. You can. It will work fine. But you don't need to do so. Some find that it makes the code more readable in that it indicates that a particular variable is an object field and not something local. My personal preference is to leave it off unless necessary.

      Contrast this with the variable holding the edge count. That variable has actual impact, as it saves having to count the number of edges, which is stored in many collections. But here, the vertex count is already tracked exactly by the number of keys in the adjacencyMap . Adding an extra variable means that there is one more thing to maintain. One more thing that can fail.

      Inconsistent idiom

      In several places, you enforce that for there to be a path between two vertices, both vertices must be in the graph. However, when adding an edge, you don't do this.

      Either way, you only have edges in the graph between two vertices in the graph.

      Now, rather than always checking for presence, we assume that things will be there until told that they aren't. This change in pattern almost always works when containsKey is immediately followed by get .

      Similarly, we now check if the add did anything after calling it rather than checking if it will work and then calling it.

      Because the && operator short circuits, this has the same effect as if the second condition were in a second if inside the first.