Category: Statistical classification

Bayes classifier

In statistical classification, the Bayes classifier minimizes the probability of misclassification.
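As a minimal sketch of that statement, assume a toy problem in which the class priors and the class-conditional feature distributions are fully known. The class labels, the feature values and every probability below are hypothetical and chosen only for illustration. The rule picks the class with the largest joint probability, which is also the class with the largest posterior probability.

```python
# Hypothetical toy problem: two classes, one discrete feature with three values.
# All probabilities are illustrative assumptions, not data from any source.
priors = {"a": 0.6, "b": 0.4}                      # P(Y = y)
likelihoods = {                                    # P(X = x | Y = y)
    "a": {0: 0.7, 1: 0.2, 2: 0.1},
    "b": {0: 0.1, 1: 0.4, 2: 0.5},
}


def bayes_classify(x):
    """Return the class that maximizes the joint probability P(X = x, Y = y)."""
    return max(priors, key=lambda y: priors[y] * likelihoods[y][x])


def bayes_error():
    """Misclassification probability of the rule above, summed over feature values."""
    error = 0.0
    for x in (0, 1, 2):
        joint = {y: priors[y] * likelihoods[y][x] for y in priors}
        # Whatever probability mass is not on the chosen class is misclassified.
        error += sum(joint.values()) - max(joint.values())
    return error
```

Under these assumed numbers the rule assigns feature value 0 to class "a" and values 1 and 2 to class "b", with a misclassification probability of 0.22. No other rule on the same feature can do better for this distribution.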

Calibration (statistics)

There are two main uses of the term calibration in statistics that denote special types of statistical inference problems. "Calibration" can mean * a reverse process to regression, where instead of a

False positives and false negatives

A false positive is an error in binary classification in which a test result incorrectly indicates the presence of a condition (such as a disease when the disease is not present), while a false negati

Decision boundary

In a statistical-classification problem with two classes, a decision boundary or decision surface is a hypersurface that partitions the underlying vector space into two sets, one for each class. The c

Naive Bayes classifier

In statistics, naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features (see Bayes

Geodemographic segmentation

In marketing, geodemographic segmentation is a multivariate statistical classification technique for discovering whether the individuals of a population fall into different groups by making quantitati

Variable kernel density estimation

In statistics, adaptive or "variable-bandwidth" kernel density estimation is a form of kernel density estimation in which the size of the kernels used in the estimate is varied depending upon either t

Cover's theorem

Cover's theorem is a statement in computational learning theory and is one of the primary theoretical motivations for the use of non-linear kernel methods in machine learning applications. It is so te

Classification rule

Given a population whose members each belong to one of a number of different sets or classes, a classification rule or classifier is a procedure by which the elements of the population set are each pr

Leakage (machine learning)

In statistics and machine learning, leakage (also known as data leakage or target leakage) is the use of information in the model training process which would not be expected to be available at predic

Linear classifier

In the field of machine learning, the goal of statistical classification is to use an object's characteristics to identify which class (or group) it belongs to. A linear classifier achieves this by ma

Receiver operating characteristic

A receiver operating characteristic curve, or ROC curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied. The method

Typology (social science research method)

Typology is a composite measure that involves the classification of observations in terms of their attributes on multiple variables. Such classification is usually done on a nominal scale. Typologies

Neighbourhood components analysis

Neighbourhood components analysis is a supervised learning method for classifying multivariate data into distinct classes according to a given distance metric over the data. Functionally, it serves th

Prior knowledge for pattern recognition

Pattern recognition is a very active field of research intimately bound to machine learning. Also known as classification or statistical classification, pattern recognition aims at building a classifi

Support vector machine

In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data for classification and regression

Multiclass LDA

No description available.

Total operating characteristic

The total operating characteristic (TOC) is a statistical method to compare a Boolean variable versus a rank variable. TOC can measure the ability of an index variable to diagnose either presence or a

Kernel perceptron

In machine learning, the kernel perceptron is a variant of the popular perceptron learning algorithm that can learn kernel machines, i.e. non-linear classifiers that employ a kernel function to comput

Partial Area Under the ROC Curve

The Partial Area Under the ROC Curve (pAUC) is a metric for the performance of a binary classifier. It is computed based on the receiver operating characteristic (ROC) curve that illustrates the diagnos

Statistical classification

In statistics, classification is the problem of identifying which of a set of categories (sub-populations) an observation (or observations) belongs to. Examples are assigning a given email to the "spa

Platt scaling

In machine learning, Platt scaling or Platt calibration is a way of transforming the outputs of a classification model into a probability distribution over classes. The method was invented by John Pla

Probabilistic classification

In machine learning, a probabilistic classifier is a classifier that is able to predict, given an observation of an input, a probability distribution over a set of classes, rather than only outputting

Bayes error rate

In statistical classification, Bayes error rate is the lowest possible error rate for any classifier of a random outcome (into, for example, one of two categories) and is analogous to the irreducible

Evaluation of binary classifiers

The evaluation of binary classifiers compares two methods of assigning a binary attribute, one of which is usually a standard method and the other is being investigated. There are many metrics that ca

RCASE

Root Cause Analysis Solver Engine (informally RCASE) is a proprietary algorithm developed from research originally at the Warwick Manufacturing Group (WMG) at Warwick University. RCASE development com

Klecka's tau

Klecka's tau (τ) is a statistic which is used to test whether a given classification analysis improves one's classification to groups over a random allocation to the various groups under consideration

Values Modes

Values Modes is a segmentation tool in the United Kingdom, based on the British Values Survey.

Vapnik–Chervonenkis dimension

In Vapnik–Chervonenkis theory, the Vapnik–Chervonenkis (VC) dimension is a measure of the capacity (complexity, expressive power, richness, or flexibility) of a set of functions that can be learned by

Least-squares support vector machine

In statistics and statistical modeling, least-squares support-vector machines (LS-SVM) are least-squares versions of support-vector machines (SVM), which are a set of related supervised learning m

Automated essay scoring

Automated essay scoring (AES) is the use of specialized computer programs to assign grades to essays written in an educational setting. It is a form of educational assessment and an application of nat

Youden's J statistic

Youden's J statistic (also called Youden's index) is a single statistic that captures the performance of a dichotomous diagnostic test. Informedness is its generalization to the multiclass case and es

Predictive modelling

Predictive modelling uses statistics to predict outcomes. Most often the event one wants to predict is in the future, but predictive modelling can be applied to any type of unknown event, regardless o

Similarity measure

In statistics and related fields, a similarity measure or similarity function or similarity metric is a real-valued function that quantifies the similarity between two objects. Although no single defi

Mixture (probability)

In probability theory and statistics, a mixture is a probabilistic combination of two or more probability distributions. The concept arises mostly in two contexts: * A mixture defining a new probabil

Multiclass classification

In machine learning and statistical classification, multiclass classification or multinomial classification is the problem of classifying instances into one of three or more classes (classifying insta

Linear discriminant analysis

Linear discriminant analysis (LDA), normal discriminant analysis (NDA), or discriminant function analysis is a generalization of Fisher's linear discriminant, a method used in statistics and other fie

Multinomial probit

In statistics and econometrics, the multinomial probit model is a generalization of the probit model used when there are several possible categories that the dependent variable can fall into. As such,

One-class classification

In machine learning, one-class classification (OCC), also known as unary classification or class-modelling, tries to identify objects of a specific class amongst all objects, by primarily learning fro

Recursive partitioning

Recursive partitioning is a statistical method for multivariable analysis. Recursive partitioning creates a decision tree that strives to correctly classify members of the population by splitting it i

Probability matching

Probability matching is a decision strategy in which predictions of class membership are proportional to the class base rates. Thus, if in the training set positive examples are observed 60% of the ti

Sensitivity and specificity

Sensitivity and specificity mathematically describe the accuracy of a test which reports the presence or absence of a condition. Individuals for which the condition is satisfied are considered "positi

Binary classification

Binary classification is the task of classifying the elements of a set into two groups (each called a class) on the basis of a classification rule. Typical binary classification problems include: * Med

Confusion matrix

In the field of machine learning and specifically the problem of statistical classification, a confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of

Margin classifier

In machine learning, a margin classifier is a classifier which is able to give an associated distance from the decision boundary for each example. For instance, if a linear classifier (e.g. perceptron

Multiple discriminant analysis

Multiple Discriminant Analysis (MDA) is a multivariate dimensionality reduction technique. It has been used to predict signals as diverse as neural memory traces and corporate failure. MDA is not dire

Net reclassification improvement

Net reclassification improvement (NRI) is an index that attempts to quantify how well a new model reclassifies subjects - either appropriately or inappropriately - as compared to an old model. While c

Kernel Fisher discriminant analysis

In statistics, kernel Fisher discriminant analysis (KFD), also known as generalized discriminant analysis and kernel discriminant analysis, is a kernelized version of linear discriminant analysis (LDA

Growth function

The growth function, also called the shatter coefficient or the shattering number, measures the richness of a set family. It is especially used in the context of statistical learning theory, where it

Averaged one-dependence estimators

Averaged one-dependence estimators (AODE) is a probabilistic classification learning technique. It was developed to address the attribute-independence problem of the popular naive Bayes classifier. It

Industrial market segmentation

Industrial market segmentation is a scheme for categorizing industrial and business customers to guide strategic and tactical decision-making. Government agencies and industry associations use standar

Natarajan dimension

In the theory of Probably Approximately Correct Machine Learning, the Natarajan dimension characterizes the complexity of learning a set of functions, generalizing from the Vapnik–Chervonenkis dimension for boo

Firmographics

Firmographics (also known as emporographics or firm demographics) are sets of characteristics to segment prospect organizations. What demographics are to people, firmographics are to organizations. Ho

K-nearest neighbors algorithm

In statistics, the k-nearest neighbors algorithm (k-NN) is a non-parametric supervised learning method first developed by Evelyn Fix and Joseph Hodges in 1951, and later expanded by Thomas Cover. It i

Chi-square automatic interaction detection

Chi-square automatic interaction detection (CHAID) is a decision tree technique based on adjusted significance testing (Bonferroni correction, Holm-Bonferroni testing). The technique was developed in

Phi coefficient

In statistics, the phi coefficient (or mean square contingency coefficient and denoted by φ or rφ) is a measure of association for two binary variables. In machine learning, it is known as the Matthews correlation coefficie

Sagacity segmentation

Sagacity segmentation is a means of segmenting a population of interest using life-cycle stage, income and occupation variables. The logic behind this segmentation system is that as people pass throu

Data classification (business intelligence)

In business intelligence, data classification has close ties to data clustering, but where data clustering is descriptive, data classification is predictive. In essence data classification consists of

Bias–variance tradeoff

In statistics and machine learning, the bias–variance tradeoff is the property of a model that the variance of the parameter estimated across samples can be reduced by increasing the bias in the estim

Double descent

In statistics and machine learning, double descent is the phenomenon where a statistical model with a small number of parameters and a model with an extremely large number of parameters have a small e

Proaftn

Proaftn is a fuzzy classification method that belongs to the class of supervised learning algorithms. The acronym Proaftn stands for: (PROcédure d'Affectation Floue pour la problématique du Tri Nomina

Quadratic classifier

In statistics, a quadratic classifier is a statistical classifier that uses a quadratic decision surface to separate measurements of two or more classes of objects or events. It is a more general vers