Category: Cluster analysis

geWorkbench
geWorkbench (genomics Workbench) is an open-source software platform for integrated genomic data analysis. It is a desktop application written in the programming language Java. geWorkbench uses a comp
Archetypal analysis
Archetypal analysis in statistics is an unsupervised learning method similar to cluster analysis and introduced by Adele Cutler and Leo Breiman in 1994. Rather than "typical" observations (cluster cen
Consensus clustering
Consensus clustering is a method of aggregating (potentially conflicting) results from multiple clustering algorithms. Also called cluster ensembles or aggregation of clustering (or partitions), it re
Latent space
A latent space, also known as a latent feature space or embedding space, is an embedding of a set of items within a manifold in which items resembling each other are positioned closer to one another i
Geographical cluster
A geographical cluster is a localized anomaly, usually an excess of something given the distribution or variation of something else. Often it is considered as an incidence rate that is unusual in that
Determining the number of clusters in a data set
Determining the number of clusters in a data set, a quantity often labelled k as in the k-means algorithm, is a frequent problem in data clustering, and is a distinct issue from the process of actuall
Frequent pattern discovery
Frequent pattern discovery (or FP discovery, FP mining, or frequent itemset mining) is part of knowledge discovery in databases, Massive Online Analysis, and data mining; it describes the task of find
Cluster analysis
Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in
Behavioral clustering
Behavioral clustering is a statistical analysis method used in retailing to identify consumer purchase trends and group stores based on consumer buying behaviors.
Correlation clustering
Clustering is the problem of partitioning data points into groups based on their similarity. Correlation clustering provides a method for clustering a set of objects into the optimum number of cluster
Brown clustering
Brown clustering is a hard hierarchical agglomerative clustering problem based on distributional information proposed by Peter Brown, William A. Brown, Vincent Della Pietra, Peter V. deSouza, Jennifer Lai, and Robert
Farthest-first traversal
In computational geometry, the farthest-first traversal of a compact metric space is a sequence of points in the space, where the first point is selected arbitrarily and each successive point is as fa
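A minimal Python sketch of the greedy selection described above, for a finite set of points. It assumes Euclidean distance and takes "as far as possible" to mean maximizing the distance to the nearest already-selected point. The function name `farthest_first_traversal` and the `start` parameter are illustrative, not taken from any library.

```python
import math


def farthest_first_traversal(points, start=0):
    """Return the indices of `points` in a farthest-first order.

    Assumption: the distance from a point to the selected set is the
    minimum Euclidean distance to any selected point.
    """
    if not points:
        return []
    # nearest[i] = distance from point i to its nearest selected point;
    # selected points are marked with -1.0 so they are never picked again.
    nearest = [math.dist(p, points[start]) for p in points]
    nearest[start] = -1.0
    order = [start]
    for _ in range(len(points) - 1):
        # Pick the unselected point farthest from the selected set.
        nxt = max(range(len(points)), key=lambda i: nearest[i])
        order.append(nxt)
        nearest[nxt] = -1.0
        # Update each remaining point's distance to the selected set.
        for i, p in enumerate(points):
            if nearest[i] >= 0.0:
                nearest[i] = min(nearest[i], math.dist(p, points[nxt]))
    return order
```

For example, with the points `(0, 0)`, `(1, 0)`, `(10, 0)`, `(4, 0)` and `start=0`, the sketch visits the point at 10 first, then the one at 4, then the one at 1. It runs in quadratic time, which is adequate for small inputs.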
Clustering illusion
The clustering illusion is the tendency to erroneously consider the inevitable "streaks" or "clusters" arising in small samples from random distributions to be non-random. The illusion is caused by a
Mixture model
In statistics, a mixture model is a probabilistic model for representing the presence of subpopulations within an overall population, without requiring that an observed data set should identify the su
Constrained clustering
In computer science, constrained clustering is a class of semi-supervised learning algorithms. Typically, constrained clustering incorporates either a set of must-link constraints, cannot-link constra
Medoid
Medoids are representative objects of a data set or a cluster within a data set whose sum of dissimilarities to all the objects in the cluster is minimal. Medoids are similar in concept to means or ce
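The definition above can be illustrated with a short brute-force Python sketch: among the objects of a cluster, pick the one whose summed dissimilarity to all objects is smallest. The function name `medoid` and the default Euclidean dissimilarity are assumptions made for illustration; any dissimilarity function could be passed instead.

```python
import math


def medoid(points, dissimilarity=math.dist):
    """Return the member of `points` with the smallest total dissimilarity
    to all members (ties go to the earliest such point)."""
    return min(
        points,
        key=lambda candidate: sum(dissimilarity(candidate, p) for p in points),
    )
```

Unlike a mean, the result is always one of the input objects, so for the one-dimensional cluster `(0, 0)`, `(1, 0)`, `(2, 0)`, `(3, 0)`, `(10, 0)` the sketch returns `(2, 0)` even though the mean lies at 3.2. The brute-force search evaluates every pair and is only meant for small clusters.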
WACA clustering algorithm
WACA is a clustering algorithm for dynamic networks. WACA (Weighted Application-aware Clustering Algorithm) uses a heuristic weight function for self-organized cluster creation. The election of cluste
Clustering high-dimensional data
Clustering high-dimensional data is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions. Such high-dimensional spaces of data are often encountered in areas suc
Dendrogram
A dendrogram is a diagram representing a tree. This diagrammatic representation is frequently used in different contexts: * in hierarchical clustering, it illustrates the arrangement of the clusters
Biclustering
Biclustering, block clustering, co-clustering, or two-mode clustering is a data mining technique which allows simultaneous clustering of the rows and columns of a matrix. The term was first introduced b