---
title: "An introduction to geoGraph"
author: "Thibaut Jombart and Andrea Manica"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{geoGraph}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, echo=FALSE}
# knitr::opts_chunk$set(fig.width = 7, fig.height = 6)
options(digits = 4)
```

*geoGraph*: walking through the geographic space using graphs.
=================================================

This document describes the *geoGraph* package for the R software.
*geoGraph* aims at implementing graph approaches for geographic data.
In *geoGraph*, a given geographic area is modelled by a fine regular grid, where each vertice
has a set of spatial coordinates and a set of attributes, which can be for instance habitat
descriptors, or the presence/abundance of a given species.
'Travelling' within the geographic area can then be easily modelled as moving between connected vertices.
The cost of moving from one vertex to another can be defined according to attribute values, which
allows for instance to define friction routes based on habitat.


*geoGraph* harnesses the full power of graph algorithms implemented in R by the *graph*
and *RBGL* (R Boost Graph Library) packages.
In particular, RBGL is an interface between R and the comprehensive *Boost Graph Library* in C++,
which provides fast and efficient implementations of a wide range of graph algorithms.
Once we have defined frictions for an entire geographic area, we can easily, for instance, find the least
costs path from one location to another, or find the most parsimonious way of connecting a set of locations.


Interfacing spatial data and graphs can be a complicated task.
The purpose of *geoGraph* is to provide tools to achieve and simplify this 'preliminary' step.
This is achieved by defining new classes of objects which are essentially geo-referenced graphs
with node attributes (`gGraph` objects), and interfaced spatial data (`gData` objects).
In this vignette, we show how to install *geoGraph*, construct and handle
`gGraph`/`gData` objects, and illustrate some basic features of graph algorithms.

# First steps

## Installing the package

All the following instructions should be entered from a new R session to avoid errors due to installing attached packages.

*devtools* is also needed to install *geoGraph*:
```{r eval = FALSE}
install.packages("devtools")
```

Then, to install *geoGraph*, simply type:
```{r eval = FALSE}
library(devtools)
install_github("thibautjombart/geoGraph")
```

Once installed, the package can be loaded using:
```{r load, message = FALSE}
library("geoGraph")
```

If you have an error regarding missing packages, you may need to install manually the packages *graph* and *RBGL* from *Bioconductor*:
```{r eval = FALSE}
install.packages("BiocManager")
BiocManager::install(c("graph", "RBGL"))
```

And then attempt to reinstall *geoGraph* from GitHub.

## Data representation

Data representation refers to the way a given type of data is handled by a
computer program.  Two types of objects are used in *geoGraph*: `gGraph`, and
`gData` objects.  Both objects are defined as formal (S4) classes and often have
methods for similar generic function (e.g. `getNodes` is defined for both
objects).  Essentially, `gGraph` objects contain underlying layers of
informations, including a spatial grid and possibly node attributes, and
covering the area of interest.  `gData` are sets of locations (like sampled
sites, for instance) which have been interfaced to a `gGraph` object, to allow
further manipulations such as finding paths on the grid between pairs of
locations.

### gGraph objects

The definition of the formal class `gGraph` can be obtained using:
```{r }
getClass("gGraph")
```
and a new empty object can be obtained using the constructor:
```{r }
new("gGraph")
```

The documentation `?gGraph` explains the basics about the object's content.  In
a nutshell, these objects are spatial grids with nodes and segments connecting
neighbouring nodes, and additional informations on the nodes or on the graph
itself.  `coords` is a matrix of longitudes and latitudes of the nodes.
`nodes.attr` is a data.frame storing attributes of the nodes, such as habitat
descriptors; each row corresponds to a node of the grid, while each column
corresponds to a variable.  `meta` is a list containing miscellanous
informations about the graph itself.  There is no contraint applying to the
components of the list, but some typical components such as `$costs` or
`$colors` will be recognised by certain functions.  For instance, you can
specify plotting rules for representing a given node attribute by a given color
by defining a component `$colors`.  Similarly, you can associate costs to a
given node attribute by defining a component `$costs`.  An example of this can
be found in already existing `gGraph` objects.  For instance, `worldgraph.10k`
is a graph of the world with approximately 10,000 nodes, and only on-land
connectivity (\textit{i.e.` no travelling on the seas).  ```{r } worldgraph.10k
worldgraph.10k@meta ``` Lastly, the `graph` component is a `graphNEL` object,
which is the standard class for graphs in the *graph* and *RBGL* packages.  This
object contains all information on the connections between nodes, and the
weights (costs) of these connections.

Four main `gGraph` are provided with *geoGraph*: `rawgraph.10k`, `rawgraph.40k`,
`worldgraph.10k`, and `worldgraph.40k`.  These datasets are available using the
command `data`.  The grid used in these datasets are the best geometric
approximation of a regular grid for the surface of a sphere.  One advantage of
working with these grids is that we do not have to use a projection for
geographic coordinates, which is a usual issue in regular GIS.

The difference between rawgraphs and worldgraphs is that the first are entirely
connected, while in the second connections occur only on land.  Numbers `10k'
and `40k' indicate that the grids consist of roughly 10,000 and 40,000 nodes.
For illustrative purposes, we will often use the 10k grids, since they are less
heavy to handle.  For most large-scale applications, the 40k versions should
provide sufficient resolution.  New `gGraph` can be constructed using the
constructor (`new(...)`), but this topic is not documented in this vignette.

### gData objects

`gData` are essentially sets of locations that are interfaced with a `gGraph` object.
During this operation, each location is assigned to the closest node on the grid of the
`gGraph`, then allowing for travelling between locations using the grid.
Then, it is for instance possible to find the shortest path between two locations through various
types of habitats.

Like for `gGraph`, the content of the formal class `gData` can be obtained using:
```{r }
getClass("gData")
```
and a new empty object can be obtained using the constructor:
```{r }
new("gData")
```

As before, the description of the content of these objects can be found in the
documentation (`?gData`).  `coords` is a matrix of xy (longitude/latitude)
coordinates in which each row is a location.  `nodes.id` is vector of characters
giving the name of the vertices matching the locations; this is defined
automatically when creating a new `gData`, or using the function `closestNode`.
`data` is a slot storing data associated to the locations; it can be any type of
object, but a data.frame should cover most requirements for storing data.  Note
that this object should be subsettable (i.e. the `[` operator should be
defined), so that data can be subsetted when subsetting the `gData` object.
Lastly, the slot `gGraph.name` contains the name of the `gGraph` object to which
the `gData` has been interfaced.

Contrary to `gGraph` objects, `gData` objects will frequently be constructed by
the user.  In the next sections, we illustrate how we can build and use `gData`
objects from a set of locations.

# Using *geoGraph*

An overview of the material implemented in the package is summarized the
package's manpage, accessible via:

```{r eval=FALSE}
?geoGraph
```
The html version of this manpage may be preferred to browse more easily the content
of *geoGraph*; it is accessible by typing:
```{r eval=FALSE}
help("geoGraph", package = "geoGraph", html = TRUE)
```
To revert help back to text mode, simply type:
```{r eval=FALSE}
options(htmlhelp = FALSE)
```

In the following, we go through various tasks that can be achieve using *geoGraph*.

## Importing geographic data

GeoGraphic data consist of a set of locations, possibly accompanied by
additional information.  For instance, one may want to study the migrations
amongst a set of biological populations with known geographic coordinates.  In
*geoGraph*, geographic data are stored in `gData` objects.  These objects match
locations to the closest nodes on a grid (a `gGraph` object), and store
additional data if needed.

As a toy example, let us consider four locations: Bordeaux (France), London
(UK), Malaga (Spain), and Zagreb (Croatia).  Since we will be working with a
crude grid (10,000 nodes), locations need not be exact.  We enter the longitudes
and latitudes (in this order, that is, xy coordinates) of these cities in
decimal degrees, as well as approximate population sizes:

```{r cities}
Bordeaux <- c(-1, 45)
London <- c(0, 51)
Malaga <- c(-4, 37)
Zagreb <- c(16, 46)
cities.dat <- rbind.data.frame(Bordeaux, London, Malaga, Zagreb)
colnames(cities.dat) <- c("lon", "lat")
cities.dat$pop <- c(1e6, 13e6, 5e5, 1.2e6)
row.names(cities.dat) <- c("Bordeaux", "London", "Malaga", "Zagreb")
cities.dat
```
We load a `gGraph` object which contains the grid that will support the data:
```{r wg10plot, fig = TRUE, fig.width = 8}
worldgraph.10k
plot(worldgraph.10k)
```

 (we could use `worldgraph.40k` for a better resolution).  In this figure, each
node is represented with a color depending on the habitat type, either 'sea'
(blue) or 'land' (green).  We are going to interface the cities data with this
grid; to do so, we create a `gData` object using `new` (see `?gData` object):

```{r citiesplot, fig=TRUE}
cities <- new("gData", coords = cities.dat[, 1:2], data = cities.dat[, 3, drop = FALSE], gGraph.name = "worldgraph.10k")
cities
plot(cities, type = "both", reset = TRUE)
plotEdges(worldgraph.10k)
```

 This figure illustrates the matching of original locations (black crosses) to nodes of the grid
(red circles).
As we can see, an issue occured for Bordeaux, which has been assigned to a node in the sea (in blue).
Locations can be re-assigned to nodes with restrictions for some node attribute values using
`closestNode`; for instance, here we constrain matching nodes to have an `habitat` value
(defined as node attribute in `worldgraph.10k`) equalling `land` (green points):
```{r closeNode, fig=TRUE}
cities <- closestNode(cities, attr.name = "habitat", attr.value = "land")
plot(cities, type = "both", reset = TRUE)
plotEdges(worldgraph.10k)
```

Now, all cities have been assigned to a `land' node of the grid (again, better accuracy will be
gained on 40k or finer grids - we use 10k for illustrative purposes only).
Content of `cities` can be accessed via various accessors (see `?gData`).
For instance, we can retrieve original locations, assigned nodes, and stored data using:
```{r }
getCoords(cities)
getNodes(cities)
getData(cities)
```
We can also get the coordinates of the matching nodes (\textit{i.e.}, red circle on previous figure) using:
```{r }
getCoords(cities, original = FALSE)
```
More interestingly, we can now retrieve all the geographic information contained in the underlying
grid (\textit{i.e.}, `gGraph` object) as node attributes:
```{r }
getNodesAttr(cities)
```
In this example, the information stored in `worldgraph.10k` is rather crude: `habitat` only
distinguishes the land from the sea.
However, more complex habitat information could be incorporated, for instance from GIS shapefiles
(see dedicated section below).

## Visualizing data

An essential aspect of spatial analysis lies in visualizing the data.
In *geoGraph*, the spatial grids (`gGraph`) and spatial data (`gData`) can be plotted
and browsed using a variety of functions.

### Plotting `gGraph` objects

Displaying a `gGraph` object is done through `plot` and `points` functions.
The first opens a new plotting region, while the second draws in the current plotting region;
functions have otherwise similar arguments (see `?plot.gGraph`).

By default, plotting a `gGraph` displays the grid of nodes overlaying a shapefile (by default,
the landmasses).
Edges can be plotted at the same time (argument `edges`), or added afterwards using `plotEdges`.
If the `gGraph` object possesses an adequately formed `meta$colors` component, the
colors of the nodes are chosen according to the node attributes and the color scheme specified in `meta$colors`.
Alternatively, the color of the nodes can be specified via the `col` argument in `plot`/`points`.

Here is an example using `worldgraph.10k`:

```{r wg10kdefplot, fig=TRUE, fig.width=8}
worldgraph.10k@meta$colors
head(getNodesAttr(worldgraph.10k))
table(getNodesAttr(worldgraph.10k))
plot(worldgraph.10k, reset = TRUE)
title("Default plotting of worldgraph.10k")
```

 It may be worth noting that plotting `gGraph` objects involves plotting a fairly
large number of points and edges.
On some graphical devices, the resulting plotting can be slow.
For instance, one may want to disable `cairo` under linux: this graphical device yields better
graphics than `Xlib`, but at the expense of increase computational time.
To switch to `Xlib`, type:
```{r eval=FALSE}
X11.options(type = "Xlib")
```
and to revert to `cairo`, type:
```{r eval=FALSE}
X11.options(type = "cairo")
```

### Zooming in and out, sliding, etc.

In practice, it is often useful to be able to peer at specific regions, and more generally to
navigate inside the graphical representation of the data.
For this, we can use the interactive functions `geo.zoomin`, `geo.zoomout`, `geo.slide`,
`geo.back`, `geo.bookmark`, and `geo.goto`.
The zoom and slide functions require to left-click on the graphics to zoom in, zoom out, or slide to
adjacent areas; in all cases, a right click ends the function.
Also note that `geo.zoomin` can accept an argument specifying a rectangular region, which will
be adapted by the function to fit best a square area with similar position and centre, and zoom to
this area (see `?geo.zoomin`).
`geo.bookmark` and `geo.goto` respectively set and go to a bookmark, *i.e.* a tagged area.
This is most useful when one has to switch between distant areas repeatedly.

Here are some examples based on the previous plotting of `worldgraph.10k`:
Zooming in:
```{r eval=FALSE}
geo.zoomin()
```

```{r zoomin, echo=FALSE}
geo.zoomin(c(-19.25, 44.77, 15.09, 66.19))
```

Zooming out:
```{r eval=FALSE}
geo.zoomout()
```
```{r zoomout, echo=FALSE}
geo.zoomin(c(-51.26, 76.78, -23.38, 90.00))
```

Sliding to the east:
```{r eval=FALSE}
geo.slide()
```
```{r geoslide, echo=FALSE}
geo.zoomin(c(41.76, 169.80, -28.28, 99.76))
```

One important thing which makes plotting `gGraph` objects different from most other plotting in
R is that `geoGraph` keeps the changes made to the plotting area in memory.
This allows to undo one or several moves using `geo.back`.
Moreover, even if the graphical device is killed, plotting again a `gGraph` will use the old
parameters by default.
To disable this behavior, set the argument `reset=TRUE` when calling upon `plot`.
Technically, this 'plotting memory' is implemented by storing plotting information in an environment
defined as the hidden environment `geoGraph:::.geoGraphEnv`:
```{r}
ls(env = geoGraph:::.geoGraphEnv)
```
You can inspect individual variables within this environment:
```{r}
get("last.plot.param", envir = geoGraph:::.geoGraphEnv)
```

However, it is recommended not to modify these objects directly, unless you really know what you are doing.
In any case, plotting a `gGraph` object with argument `reset=TRUE` will remove previous
plotting history and undo possible wrong manipulations.

### Plotting `gData` objects

`gData` objects are by default plotted overlaying the corresponding `gGraph`.
For instance, using the `cities` example from above:
```{r citiesPlot2, fig=TRUE}
plot(cities, reset = TRUE)
text(getCoords(cities), rownames(getData(cities)))
```

 Note the argument `reset=TRUE`, which tells the plotting function to adapt the
plotting area to the geographic extent of the dataset.


To plot additional information, it can be useful to extract the spatial coordinates from the data.
This is achieved by `getCoords`.
This method takes an extra argument `original`, which is TRUE if original spatial coordinates
are seeked, or FALSE for coordinates of the nodes on the grid.
We can use this to represent, for instance, the population sizes for the different cities:
```{r cities_plot, fig=TRUE}
transp <- function(col, alpha = .5) {
  res <- apply(col2rgb(col), 2, function(c) rgb(c[1] / 255, c[2] / 255, c[3] / 255, alpha))
  return(res)
}

plot(cities, reset = TRUE)
par(xpd = TRUE)
text(getCoords(cities) + -.5, rownames(getData(cities)))
symbols(getCoords(cities)[, 1], getCoords(cities)[, 2], circ = sqrt(unlist(getData(cities))), inch = .2, bg = transp("red"), add = TRUE)
```

## Editing `gGraphs`

Editing graphs is an essential task in *geoGraph*.
While available `gGraph` objects provide a basis to work with (see `?worldgraph.10k`),
one may want to adapt a graph to a specific case.
For instance, connectivity should be defined according to biological knowledge of the organism under study.
`gGraph` can be modified in different ways: by changing the connectivity, the costs of
edges, or the attribute values.

### Changing the global connectivity of a `gGraph`

There are two main ways of changing the connectivity of a `gGraph`, which match two different objectives.
The first approach is to perform global and systematic changes of the connectivity of the graph.
Typically, one will want to remove all connections over a given type of landscape, which is
uncrossable by the organism under study.
Let's assume we are interested in saltwater fishes.
To model fish dispersal, we have to define a graph which connects only nodes overlaying the sea.
We load the `gGraph` object `rawgraph.10k`, and zoom in to a smaller area (Madagascar) to illustrate
changes in connectivity:
```{r plot_edges, fig=TRUE}
geo.zoomin(c(35, 54, -26, -10))
plotEdges(rawgraph.10k)
```

 We shall set a bookmark for this area, in case we would want to get back to this place
later on:
```{r }
geo.bookmark("madagascar")
```
What we now want to do is remove all but sea-sea connections.
To do so, the easiest approach is to i) define costs for the edges based on habitat, with land being
given large costs and ii) remove all edges with large costs.


Costs of a given node attribute (here, `habitat') are indicated in the `meta$costs` slot:
```{r }
rawgraph.10k@meta$costs
newGraph <- rawgraph.10k
newGraph@meta$costs[2:6, 2] <- 100
newGraph@meta$costs[1, 2] <- 1
newGraph@meta$costs
```
We have just changed the costs associated to habitat type, but this change is not yet effective on
edges between nodes.
We use `setCosts` to set the cost of an edge to the average of the costs of its nodes:
```{r fig = TRUE, fig.width = 8}
newGraph <- setCosts(newGraph, attr.name = "habitat")
plot(newGraph, edge = TRUE)
```

On this new graph, we represent the edges with a width inversely proportional to the associated
cost; that is, bold lines for easy travelling and light edges/dotted lines for more costly mouvement.
This is not enough yet, since travelling on land is still possible.
However, we can tell *geoGraph* to remove all edges associated to too strong a cost, as defined
by a given threshold (using `dropDeadEdges`).
Here, only sea-sea connections shall be retained, that is, edges with cost 1.
```{r fig=TRUE}
newGraph <- dropDeadEdges(newGraph, thres = 1.1)
plot(newGraph, edge = TRUE)
```

Here we are: `newGraph` only contains connections in the sea.
Note that, although we restrained the plotting area to Madagascar, this change is effective everywhere.
For instance, travelling to the nort-west Australian coasts:
```{r bookmark, fig=TRUE}
geo.zoomin(c(110, 130, -27, -12))
geo.bookmark("australia")
```


### Changing local properties of a `gGraph`

A second approach to changing a `gGraph` is to refine the graph by hand, adding or removing
locally some connections, or altering the attributes of some nodes.
This can be necessary to connect components such as islands to the main landmasses, or to correct
erroneous data.
Adding and removing edges from the grid of a `gGraph` can be achieved by `geo.add.edges`
and `geo.remove.edges`, respectively.
These functions are interactive, and require the user to select individual nodes or a rectangular
area in which edges are added or removed.
See `?geo.add.edges` for more information on these functions.
For instance, we can remove a few odd connections in the previous graph, near the Australian coasts
(note that we have to save the changes using `<-`):
```{r eval=FALSE}
geo.goto("australia")
newGraph <- geo.remove.edges(newGraph)
```

```{r eval=TRUE, echo=FALSE}
newGraph <- readRDS("Robjects/newGraph.RDS")
#load("Robjects/newGraph.RData")
```
![img](figs/georemove.png)

 When adding connections within an area or in an entire graph, node addition is based on
another `gGraph`,*i.e.* only connections existing in another `gGraph` serving as
reference can be added to the current `gGraph`.
For graphs based on 10k or 40k grids, the raw graphs provided in `geoGraph` should be used,
(`rawgraph.10k`, `rawgraph.40k`), since they are fully connected.


In addition to changing grid connectivity, we may also want to modify the attributes of specific nodes.
This is again done interactively, using the function `geo.change.attr`.
For instance, here, we define a new value `shalowwater` (plotted in light blue) for the attribute `habitat`,
selecting affected nodes using the 'area' mode first, and refining the changes using the 'point' mode:
```{r eval=FALSE}
plot(newGraph, edge = TRUE)
temp <- geo.change.attr(newGraph, mode = "area", attr.name = "habitat", attr.value = "shallowwater", newCol = "deepskyblue")
temp <- geo.change.attr(temp, attr.name = "habitat", attr.value = "shallowwater", newCol = "deepskyblue")
newGraph <- temp
```
```{r echo=FALSE}
newGraph <- readRDS("Robjects/newGraph2.RDS")
#load("Robjects/newGraph2.RData")
```
```{r fig=TRUE}
newGraph@meta$colors
plot(newGraph, edge = TRUE)
```

Again, note that the changes made to the graph have to be save in an object (using `<-`) to be
effective.


## Extracting information from GIS shapefiles

An important feature of *geoGraph* is serving as an interface between *geographic information
system* (GIS) layers and geographic data.
As currently implemented, *geoGraph* can extract information from shapefiles with the Arc GIS
(http://www.esri.com/software/arcgis/index.html) format, using the function `extractFromLayer`.
Here, we illustrate this procedure using the shapefile `world-countries.shp` provided with the
package. *geoGraph* still relies on `SpatialPolygonDataFrame` objects from the `sp` package.
A move to `sf` objects is planned, but for the moment we have to use `sf::st_read()` to read
a GIS shapefile into an `sf` object, and then cast it to a `SpatialPolygonDataFrame` with 
`sf::as_Spatial()`:
```{r }
library(sf)
world.countries <- sf::st_read(system.file("files/shapefiles/world-countries.shp", package = "geoGraph"))
world.countries <- sf::as_Spatial(world.countries)
class(world.countries)
summary(world.countries)
```
The summary of `world.countries` shows the data (attributes) stored in the layer.
Let us assume that we are interested in retrieving continent and country information for the
`worldgraph.10k` object.
Note that `extractFromLayer` can extract information to other types of objects than `gGraph` (see `?extractFromLayer`)
```{r }
summary(getNodesAttr(worldgraph.10k))
newGraph <- extractFromLayer(worldgraph.10k, layer = world.countries, attr = c("CONTINENT", "NAME"))
summary(getNodesAttr(newGraph))
```
The new object `newGraph` is a `gGraph` which now includes, for each node of the grid, the
corresponding continent and country retrieved from the GIS layer.
We can use the newly acquired information for plotting `newGraph`, by defining new color rules:
```{r fig=TRUE}
temp <- unique(getNodesAttr(newGraph)$"NAME")
col <- c("transparent", rainbow(length(temp) - 1))
colMat <- data.frame(NAME = temp, color = col)
head(colMat)
tail(colMat)
plot(newGraph, col.rules = colMat, reset = TRUE)
```

 This information could in turn be used to define costs for travelling on the grid.
For instance, one could import habitat descriptors from a GIS, use these values to formulate a
habitat model, and derive costs for dispersal on the grid.


As soon as a GIS layer has been extracted to a `gGraph`, this information becomes also available for
any `gData` interfaced with this object.
For instance, we can re-use the `cities` example defined in a previous section, and interface it
with `newGraph` to retrieve continent and country information for the cities of the dataset:
```{r }
cities.dat
cities <- new("gData", coords = cities.dat[, 1:2], data = cities.dat[, 3, drop = FALSE], gGraph.name = "newGraph")
cities <- closestNode(cities, attr.name = "habitat", attr.value = "land")
getData(cities)
getNodesAttr(cities)
```


## Finding least-cost paths

One of the most useful applications of *geoGraph* is the research of least-cost paths between
couples of locations.
This can be achieved using the functions `dijkstraFrom` and `dijkstraBetween` on a
`gData` object which contains all the locations of interest.
These functions return least-cost paths with the format `gPath`.
`dijkstraFrom` compute the paths from a given node of the grid to all locations of the
`gData`, while `dijkstraBetween` computes the paths between pairs of locations of the
`gData`.
Below, we detail the example of the documentation of these functions, which uses the famous dataset of native
Human populations, HGDP:
```{r fig=TRUE, fig.width = 8}
hgdp
plot(hgdp, reset = TRUE)
```

 Populations of the dataset are shown by red circles, while the underlying grid
(`worldgraph.40k`) is represented with colors depending on habitat (blue: sea; green: land;
pink: coasts).
Population genetics predicts that genetic diversity within populations should decay as populations are
located further away from the geographic origin of the species.
Here, we verify this relationship for a theoretical origin in Addis abeba, Ethiopia.
We shall seek all paths through landmasses to the HGDP populations.


First, we check that all populations are connected on the grid using `isConnected`:
```{r }
isConnected(hgdp)
```
Note that in practice, we may often want to assess graphically the connectivity of the underlying grid, especially
if not all locations of the `gData` are connected.
This can be done using `connectivityPlot`, which has methods for both `gGraph` and
`gData`, and represents different connected components using different colors.
For instance, for `worldgraph.10k`:
```{r connectivityPlot, fig.width = 8}
connectivityPlot(worldgraph.10k, edges = TRUE, seed = 1)
```
```{r fig=TRUE}
geo.zoomin(c(90, 150, 18, -25))
title("Different connected components\n in worldgraph.10k")
```


Since all locations in `hgdp` are connected, we can proceed further.
We have to set the costs of edges in the `gGraph` grid.
To do so, we can choose between i) strictly uniform costs (using `dropCosts`) ii)
distance-based costs -- roughly uniform -- (using `setDistCosts`) or iii) attribute-driven
costs (using `setCosts`).


We shall first illustrate the strictly uniform costs.
After setting a `gGraph` with uniform costs, we use `dijkstraFrom` to find the shortest
paths between Addis abeba and the populations of `hgdp`:
```{r }
myGraph <- dropCosts(worldgraph.40k)
hgdp@gGraph.name <- "myGraph"
addis <- cbind(38, 9)
ori <- closestNode(myGraph, addis)
paths <- dijkstraFrom(hgdp, ori)
```
The object `paths` contains the identified paths, which are stored as a list with class `gPath` (see `?gPath`).
Paths can be plotted easily:
```{r fig=TRUE, fig.width = 8}
addis <- as.vector(addis)
plot(myGraph, col = NA, reset = TRUE)
plot(paths)
points(addis[1], addis[2], pch = "x", cex = 2)
text(addis[1] + 35, addis[2], "Addis abeba", cex = .8, font = 2)
points(hgdp, col.node = "black")
```

In this graph, each path is plotted with a different color, but several paths overlap in several places.
We can extract the distances from the `origin' using `gPath2dist`, and then examine the
relationship between genetic diversity within populations (stored in `hgdp`) and the distance from the origin:
```{r fig=TRUE}
div <- getData(hgdp)$"Genetic.Div"
dgeo.unif <- gPath2dist(paths, res.type = "vector")
plot(div ~ dgeo.unif, xlab = "GeoGraphic distance (arbitrary units)", ylab = "Genetic diversity")
lm.unif <- lm(div ~ dgeo.unif)
abline(lm.unif, col = "red")
summary(lm.unif)
title("Genetic diversity vs geographic distance \n uniform costs ")
```


Alternatively, we can use costs based on habitat.
As a toy example, we will consider that coasts are four times more favourable for dispersal than
the rest of the landmasses.
We define these new costs, and then compute and plot the corresponding shortest paths:
```{r }
myGraph@meta$costs[7, ] <- c("coast", 0.25)
myGraph@meta$costs
myGraph <- setCosts(myGraph, attr.name = "habitat")
paths.2 <- dijkstraFrom(hgdp, ori)
```
```{r fig=TRUE, fig.width = 8}
plot(newGraph, col = NA, reset = TRUE)
plot(paths.2)
points(addis[1], addis[2], pch = "x", cex = 2)
text(addis[1] + 35, addis[2], "Addis abeba", cex = .8, font = 2)
points(hgdp, col.node = "black")
```

The new paths are slightly different from the previous ones.
We can examine the new relationship with genetic distance:
```{r fig=TRUE}
dgeo.hab <- gPath2dist(paths.2, res.type = "vector")
plot(div ~ dgeo.hab, xlab = "GeoGraphic distance (arbitrary units)", ylab = "Genetic diversity")
lm.hab <- lm(div ~ dgeo.hab)
abline(lm.hab, col = "red")
summary(lm.hab)
title("Genetic diversity vs geographic distance \n habitat costs ")
```

Of course, the distinction between coasts and inner landmasses is a somewhat poor description of habitat.
In practice, complex habitat models can be used as simply.