This package provides the infrastructure for seriation, with implementations of several seriation/sequencing techniques to reorder matrices, dissimilarity matrices, and dendrograms (see below for a full list). It also provides (optimally) reordered heatmaps, color images, and clustering visualizations such as dissimilarity plots and visual assessment of cluster tendency plots (VAT and iVAT).
Stable CRAN version: install from within R with
install.packages("seriation")
Current development version: download the package from AppVeyor or install it from GitHub (needs devtools).
library("devtools")
install_github("mhahsler/seriation")
Load the library, read the data, and calculate distances. Then apply the default seriation method.
library(seriation)
data("iris")
x <- as.matrix(iris[-5])
x <- x[sample(1:nrow(x)),]
d <- dist(x)
order <- seriate(d)
order
object of class ‘ser_permutation’, ‘list’
contains permutation vectors for 1-mode data
  vector length seriation method
1           150             ARSA
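The permutation object can be turned into a plain index vector or applied to the data directly. The following is a minimal sketch, assuming the package's `get_order()` and `permute()` helpers. The fixed seed and the names `o`, `idx`, and `d_reordered` are illustrative choices, not part of the example above.

```r
library(seriation)

data("iris")
x <- as.matrix(iris[-5])
set.seed(1234)                 # illustrative: makes the random shuffle reproducible
x <- x[sample(nrow(x)), ]
d <- dist(x)

o <- seriate(d)                # default method for dissimilarity data

idx <- get_order(o)            # object indices listed in seriation order
head(idx)

d_reordered <- permute(d, o)   # dist object rearranged according to o
```

Using a name such as `o` instead of `order` avoids masking the base R function `order()`.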
Compare the quality of the random and the reordered arrangement across the available criteria.
rbind(
random = criterion(d),
reordered = criterion(d, order)
)
          AR_events AR_deviations       RGAR Gradient_raw Gradient_weighted Path_length
random       550620    948833.712 0.49938328          741         -1759.954   392.77766
reordered     54846      9426.094 0.04974243       992214       1772123.418    83.95758
            Inertia Least_squares       ME Moore_stress Neumann_stress     2SUM      LS
random    214602194      78852819 291618.0    927570.00     461133.357 29954845 5669489
reordered 356945979      76487641 402332.1     13593.32       5274.093 17810802 4486900
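The criteria can also be checked visually. This sketch assumes the package's `pimage()` function and its `main` title argument. It rebuilds the data so that it runs on its own, and the seed is an illustrative addition.

```r
library(seriation)

data("iris")
x <- as.matrix(iris[-5])
set.seed(1234)                    # illustrative: reproducible shuffle
x <- x[sample(nrow(x)), ]
d <- dist(x)
o <- seriate(d)

# Image plot of the dissimilarity matrix in its shuffled order
pimage(d, main = "Random")

# Same matrix with rows and columns rearranged by the seriation result
pimage(d, o, main = "Reordered")
```

In the reordered image, small dissimilarities should typically concentrate near the diagonal, which is what the lower path length and stress values indicate.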
The following seriation methods are available for dissimilarity data:
- Branch-and-bound to minimize the unweighted/weighted column gradient
- DendSer - Dendrogram seriation heuristic to optimize various criteria
- GA - Genetic algorithm with warm start to optimize various criteria
- HC - Hierarchical clustering (single link, avg. link, complete link)
- GW - Hierarchical clustering reordered by Gruvaeus and Wainer heuristic
- OLO - Hierarchical clustering with optimal leaf ordering
- Identity permutation
- MDS - Multidimensional scaling (metric, non-metric, angle)
- ARSA - Simulated annealing (linear seriation)
- TSP - Traveling salesperson problem solver to minimize Hamiltonian path length
- R2E - Rank-two ellipse seriation
- Random permutation
- Spectral seriation (unnormalized, normalized)
- SPIN - Sorting points into neighborhoods (neighborhood algorithm, side-to-side algorithm)
- VAT - Visual assessment of clustering tendency ordering
- QAP - Quadratic assignment problem heuristic (2-SUM, linear seriation, inertia, banded anti-Robinson form)
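A method from this list is selected by name through the `method` argument of `seriate()`. The sketch below assumes the method identifiers `"OLO"`, `"GW"`, and `"Spectral"`; the names available in an installed version can be checked with `list_seriation_methods("dist")`. The seed and variable names are illustrative.

```r
library(seriation)

data("iris")
x <- as.matrix(iris[-5])
set.seed(1234)                    # illustrative: reproducible shuffle
x <- x[sample(nrow(x)), ]
d <- dist(x)

# Run several methods on the same dissimilarity data
methods <- c("OLO", "GW", "Spectral")
orders <- lapply(methods, function(m) seriate(d, method = m))
names(orders) <- methods

# Compare them on a single criterion (a shorter path is better)
path_length <- sapply(orders, function(o) {
  unname(criterion(d, o, method = "Path_length"))
})

# Put the shuffled input order next to the results as a baseline
c(random = unname(criterion(d, method = "Path_length")), path_length)
```

Which method performs best depends on the data and on the criterion chosen, so comparing a few candidates this way is usually worthwhile.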
For matrices:
- BEA - Bond Energy Algorithm to maximize the measure of effectiveness (ME)
- Identity permutation
- PCA - First principal component or angle on the projection on the first two principal components
- Random permutation
- TSP - Traveling salesperson problem solver to maximize ME
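For a data matrix, `seriate()` returns one permutation for the rows and one for the columns. A minimal sketch, assuming the method identifier `"PCA"`, the `dim` argument of `get_order()`, and the `"Moore_stress"` criterion for matrices; the seed and variable names are illustrative.

```r
library(seriation)

data("iris")
x <- as.matrix(iris[-5])
set.seed(1234)                    # illustrative: reproducible shuffle
x <- x[sample(nrow(x)), ]

# Seriate rows and columns of the matrix at once
o <- seriate(x, method = "PCA")

row_idx <- get_order(o, dim = 1)  # row indices in seriation order
col_idx <- get_order(o, dim = 2)  # column indices in seriation order

x_reordered <- permute(x, o)      # matrix with both dimensions rearranged

# Compare a matrix criterion before and after reordering
rbind(
  random = criterion(x, method = "Moore_stress"),
  reordered = criterion(x, o, method = "Moore_stress")
)
```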
- Michael Hahsler, Kurt Hornik and Christian Buchta, Getting Things in Order: An Introduction to the R Package seriation, Journal of Statistical Software, 25(3), 2008.
- Michael Hahsler. An experimental comparison of seriation methods for one-mode two-way data. European Journal of Operational Research, 257:133-143, 2017.
- Seriation package vignette with complete examples.
- Reference manual