Category: Regression analysis

Smoothing spline
Smoothing splines are function estimates, , obtained from a set of noisy observations of the target , in order to balance a measure of goodness of fit of to with a derivative based measure of the smoo
Calibration (statistics)
There are two main uses of the term calibration in statistics that denote special types of statistical inference problems. "Calibration" can mean * a reverse process to regression, where instead of a
Polynomial regression
In statistics, polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is modelled as an nth degree polynomial
Interaction cost
Interaction cost can comprise work, costs, and other expenses, required to complete a task or interaction. This applies to several categories, including: * Economy: the interaction cost of a purchase
Scatterplot smoothing
In statistics, several scatterplot smoothing methods are available to fit a function through the points of a scatterplot to best represent the relationship between the variables. Scatterplots may be s
Principal component regression
In statistics, principal component regression (PCR) is a regression analysis technique that is based on principal component analysis (PCA). More specifically, PCR is used for estimating the unknown re
Cross-sectional regression
In statistics and econometrics, a cross-sectional regression is a type of regression in which the explained and explanatory variables are all associated with the same single period or point in time. T
Fractional model
In applied statistics, fractional models are, to some extent, related to binary response models. However, instead of estimating the probability of being in one bin of a dichotomous variable, the fract
G-prior
In statistics, the g-prior is an objective prior for the regression coefficients of a multiple regression. It was introduced by Arnold Zellner.It is a key tool in Bayes and empirical Bayes variable se
Canonical analysis
In statistics, canonical analysis (from Ancient Greek: κανων bar, measuring rod, ruler) belongs to the family of regression methods for data analysis. Regression analysis quantifies a relationship bet
Smearing retransformation
The Smearing retransformation is used in regression analysis, after estimating the logarithm of a variable. Estimating the logarithm of a variable instead of the variable itself is a common technique
Outline of regression analysis
The following outline is provided as an overview of and topical guide to regression analysis: Regression analysis – use of statistical techniques for learning about the relationship between one or mor
Elastic net regularization
In statistics and, in particular, in the fitting of linear or logistic regression models, the elastic net is a regularized regression method that linearly combines the L1 and L2 penalties of the lasso
Unit-weighted regression
In statistics, unit-weighted regression is a simplified and robust version (Wainer & Thissen, 1976) of multiple regression analysis where only the intercept term is estimated. That is, it fits a model
Conjoint analysis
Conjoint analysis is a survey-based statistical technique used in market research that helps determine how people value different attributes (feature, function, benefits) that make up an individual pr
Sliced inverse regression
Sliced inverse regression (or SIR) is a tool for dimensionality reduction in the field of multivariate statistics. In statistics, regression analysis is a method of studying the relationship between a
Linear predictor function
In statistics and in machine learning, a linear predictor function is a linear function (linear combination) of a set of coefficients and explanatory variables (independent variables), whose value is
Frisch–Waugh–Lovell theorem
In econometrics, the Frisch–Waugh–Lovell (FWL) theorem is named after the econometricians Ragnar Frisch, Frederick V. Waugh, and Michael C. Lovell. The Frisch–Waugh–Lovell theorem states that if the r
Policy capturing
Policy capturing or "the PC technique" is a statistical method used in social psychology to quantify the relationship between a person's judgement and the information that was used to make that judgem
Limited dependent variable
A limited dependent variable is a variable whose range ofpossible values is "restricted in some important way." In econometrics, the term is often used whenestimation of the relationship between the l
Ridge regression
Ridge regression is a method of estimating the coefficients of multiple-regression models in scenarios where the independent variables are highly correlated. It has been used in many fields including
Nonhomogeneous Gaussian regression
Non-homogeneous Gaussian regression (NGR) is a type of statistical regression analysis used in the atmospheric sciences as a way to convert ensemble forecasts into probabilistic forecasts. Relative to
Blinder–Oaxaca decomposition
The Blinder–Oaxaca decomposition is a statistical method that explains the difference in the means of a dependent variable between two groups by decomposing the gap into that part that is due to diffe
Regression toward the mean
In statistics, regression toward the mean (also called reversion to the mean, and reversion to mediocrity) is the fact that if one sample of a random variable is extreme, the next sampling of the same
Causal inference
Causal inference is the process of determining the independent, actual effect of a particular phenomenon that is a component of a larger system. The main difference between causal inference and infere
Homoscedasticity and heteroscedasticity
In statistics, a sequence (or a vector) of random variables is homoscedastic (/ˌhoʊmoʊskəˈdæstɪk/) if all its random variables have the same finite variance. This is also known as homogeneity of varia
Contrast (statistics)
In statistics, particularly in analysis of variance and linear regression, a contrast is a linear combination of variables (parameters or statistics) whose coefficients add up to zero, allowing compar
Polygenic score
In genetics, a polygenic score (PGS), also called a polygenic risk score (PRS), polygenic index (PGI), genetic risk score, or genome-wide score, is a number that summarizes the estimated effect of man
Non-linear mixed-effects modeling software
Nonlinear mixed-effects models are a special case of regression analysis for which a range of different software solutions are available. The statistical properties of nonlinear mixed-effects models m
C+-probability
In statistics, a c+-probability is the probability that a contrast variable obtains a positive value.Using a replication probability, the c+-probability is defined as follows: if we get a random draw
Working–Hotelling procedure
In statistics, particularly regression analysis, the Working–Hotelling procedure, named after Holbrook Working and Harold Hotelling, is a method of simultaneous estimation in linear regression models.
Antecedent variable
In statistics and social sciences, an antecedent variable is a variable that can help to explain the apparent relationship (or part of the relationship) between other variables that are nominally in a
Standardized coefficient
In statistics, standardized (regression) coefficients, also called beta coefficients or beta weights, are the estimates resulting from a regression analysis where the underlying data have been standar
General regression neural network
Generalized regression neural network (GRNN) is a variation to radial basis neural networks. GRNN was suggested by D.F. Specht in 1991. GRNN can be used for regression, prediction, and classification.
Generated regressor
In least squares estimation problems, sometimes one or more regressors specified in the model are not observable. One way to circumvent this issue is to estimate or generate regressors from observable
Projection pursuit regression
In statistics, projection pursuit regression (PPR) is a statistical model developed by Jerome H. Friedman and which is an extension of additive models. This model adapts the additive models in that it
Propensity score matching
In the statistical analysis of observational data, propensity score matching (PSM) is a statistical matching technique that attempts to estimate the effect of a treatment, policy, or other interventio
Omitted-variable bias
In statistics, omitted-variable bias (OVB) occurs when a statistical model leaves out one or more relevant variables. The bias results in the model attributing the effect of the missing variables to t
Structural break
In econometrics and statistics, a structural break is an unexpected change over time in the parameters of regression models, which can lead to huge forecasting errors and unreliability of the model in
Identifiability analysis
Identifiability analysis is a group of methods found in mathematical statistics that are used to determine how well the parameters of a model are estimated by the quantity and quality of experimental
Simple linear regression
In statistics, simple linear regression is a linear regression model with a single explanatory variable. That is, it concerns two-dimensional sample points with one independent variable and one depend
Multinomial probit
In statistics and econometrics, the multinomial probit model is a generalization of the probit model used when there are several possible categories that the dependent variable can fall into. As such,
Lasso (statistics)
In statistics and machine learning, lasso (least absolute shrinkage and selection operator; also Lasso or LASSO) is a regression analysis method that performs both variable selection and regularizatio
Deming regression
In statistics, Deming regression, named after W. Edwards Deming, is an errors-in-variables model which tries to find the line of best fit for a two-dimensional dataset. It differs from the simple line
Instrumental variables estimation
In statistics, econometrics, epidemiology and related disciplines, the method of instrumental variables (IV) is used to estimate causal relationships when controlled experiments are not feasible or wh
Suppressor variable
A suppressor variable is a variable that increases the predictive validity of another variable when included in a regression equation. Suppression can occur when a single causal variable is related to
Commonality analysis
Commonality analysis is a statistical technique within multiple linear regression that decomposes a model's R2 statistic (i.e., explained variance) by all independent variables on a dependent variable
Difference in differences
Difference in differences (DID or DD) is a statistical technique used in econometrics and quantitative research in the social sciences that attempts to mimic an experimental research design using obse
Explained variation
In statistics, explained variation measures the proportion to which a mathematical model accounts for the variation (dispersion) of a given data set. Often, variation is quantified as variance; then,
Optimal design
In the design of experiments, optimal designs (or optimum designs) are a class of experimental designs that are optimal with respect to some statistical criterion. The creation of this field of statis
Simalto
SIMALTO – SImultaneous Multi-Attribute Trade Off – is a survey based statistical technique used in market research that helps determine how people prioritise and value alternative product and/or servi
Moderation (statistics)
In statistics and regression analysis, moderation (also known as effect modification) occurs when the relationship between two variables depends on a third variable. The third variable is referred to
DeFries–Fulker regression
In behavioural genetics, DeFries–Fulker (DF) regression, also sometimes called DeFries–Fulker extremes analysis, is a type of multiple regression analysis designed for estimating the magnitude of gene
Functional regression
Functional regression is a version of regression analysis when responses or covariates include functional data. Functional regression models can be classified into four types depending on whether the
Generalized estimating equation
In statistics, a generalized estimating equation (GEE) is used to estimate the parameters of a generalized linear model with a possible unmeasured correlation between observations from different timep
Errors and residuals
In statistics and optimization, errors and residuals are two closely related and easily confused measures of the deviation of an observed value of an element of a statistical sample from its "true val
Prediction interval
In statistical inference, specifically predictive inference, a prediction interval is an estimate of an interval in which a future observation will fall, with a certain probability, given what has alr
Function approximation
In general, a function approximation problem asks us to select a function among a well-defined class that closely matches ("approximates") a target function in a task-specific way. The need for functi
Heckman correction
The Heckman correction is a statistical technique to correct bias from non-randomly selected samples or otherwise incidentally truncated dependent variables, a pervasive issue in quantitative social s
Quantile regression
Quantile regression is a type of regression analysis used in statistics and econometrics. Whereas the method of least squares estimates the conditional mean of the response variable across values of t
Coefficient of multiple correlation
In statistics, the coefficient of multiple correlation is a measure of how well a given variable can be predicted using a linear function of a set of other variables. It is the correlation between the
Idempotent matrix
In linear algebra, an idempotent matrix is a matrix which, when multiplied by itself, yields itself. That is, the matrix is idempotent if and only if . For this product to be defined, must necessarily
Principle of marginality
In statistics, the principle of marginality is the fact that the average (or main) effects, of variables in an analysis are marginal to their interaction effect—that is, the main effect of one explana
Underfitting
No description available.
Interaction (statistics)
In statistics, an interaction may arise when considering the relationship among three or more variables, and describes a situation in which the effect of one causal variable on an outcome depends on t
Multicollinearity
In statistics, multicollinearity (also collinearity) is a phenomenon in which one predictor variable in a multiple regression model can be linearly predicted from the others with a substantial degree
Quantile regression averaging
Quantile Regression Averaging (QRA) is a forecast combination approach to the computation of prediction intervals. It involves applying quantile regression to the point forecasts of a small number of
Design matrix
In statistics and in particular in regression analysis, a design matrix, also known as model matrix or regressor matrix and often denoted by X, is a matrix of values of explanatory variables of a set
Symbolic regression
Symbolic regression (SR) is a type of regression analysis that searches the space of mathematical expressions to find the model that best fits a given dataset, both in terms of accuracy and simplicity
Heteroskedasticity-consistent standard errors
The topic of heteroskedasticity-consistent (HC) standard errors arises in statistics and econometrics in the context of linear regression and time series analysis. These are also known as heteroskedas
Sobel test
In statistics, the Sobel test is a method of testing the significance of a mediation effect. The test is based on the work of Michael E. Sobel, a statistics professor at Columbia University in New Yor
Projection matrix
In statistics, the projection matrix , sometimes also called the influence matrix or hat matrix , maps the vector of response values (dependent variable values) to the vector of fitted values (or pred
Virtual sensing
Virtual sensing techniques, also called soft sensing, proxy sensing, inferential sensing, or surrogate sensing, are used to provide feasible and economical alternatives to costly or impractical physic
Meta-regression
Meta-regression is defined to be a meta-analysis that uses regression analysis to combine, compare, and synthesize research findings from multiple studies while adjusting for the effects of available
Regression analysis
In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'l
Binary regression
In statistics, specifically regression analysis, a binary regression estimates a relationship between one or more explanatory variables and a single output binary variable. Generally the probability o
Target function
No description available.
Under-fitting
No description available.
Guess value
In mathematical modeling, a guess value is more commonly called a starting value or initial value. These are necessary for most optimization problems which use search algorithms, because those algorit
Regression discontinuity design
In statistics, econometrics, political science, epidemiology, and related disciplines, a regression discontinuity design (RDD) is a quasi-experimental pretest-posttest design that aims to determine th
Line fitting
Line fitting is the process of constructing a straight line that has the best fit to a series of data points. Several methods exist, considering: * Vertical distance: Simple linear regression * Resi
Moderated mediation
In statistics, moderation and mediation can occur together in the same model. Moderated mediation, also known as conditional indirect effects, occurs when the treatment effect of an independent variab
Mean and predicted response
In linear regression, mean response and predicted response are values of the dependent variable calculated from the regression parameters and a given value of the independent variable. The values of t
Radial basis function network
In the field of mathematical modeling, a radial basis function network is an artificial neural network that uses radial basis functions as activation functions. The output of the network is a linear c
Nonlinear regression
In statistics, nonlinear regression is a form of regression analysis in which observational data are modeled by a function which is a nonlinear combination of the model parameters and depends on one o
Pyrrho's lemma
In statistics, Pyrrho's lemma is the result that if one adds just one extra variable as a regressor from a suitable set to a linear regression model, one can get any desired outcome in terms of the co
Component analysis (statistics)
Component analysis is the analysis of two or more independent variables which comprise a treatment modality. It is also known as a dismantling study. The chief purpose of the component analysis is to
Dependent and independent variables
Dependent and independent variables are variables in mathematical modeling, statistical modeling and experimental sciences. Dependent variables receive this name because, in an experiment, their value
Interval predictor model
In regression analysis, an interval predictor model (IPM) is an approach to regression where bounds on the function to be approximated are obtained.This differs from other techniques in machine learni
Linkage disequilibrium score regression
In statistical genetics, linkage disequilibrium score regression (LDSR or LDSC) is a technique that aims to quantify the separate contributions of polygenic effects and various confounding factors, su
Haseman–Elston regression
In statistical genetics, Haseman–Elston (HE) regression is a form of statistical regression originally proposed for linkage analysis of quantitative traits for sibling pairs. It was first developed by
Knockoffs (statistics)
In statistics, the knockoff filter, or simply knockoffs, is a framework for variable selection. It was originally introduced for linear regression by Rina Barber and Emmanuel Candès, and later general
Curve fitting
Curve fitting is the process of constructing a curve, or mathematical function, that has the best fit to a series of data points, possibly subject to constraints. Curve fitting can involve either inte