Model selection | Regression variable selection

Focused information criterion

In statistics, the focused information criterion (FIC) is a method for selecting the most appropriate model among a set of competitors for a given data set. Unlike most other model selection strategies, such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the deviance information criterion (DIC), the FIC does not attempt to assess the overall fit of candidate models. Instead it focuses directly on the parameter of primary interest in the statistical analysis, say $\mu$, for which competing models lead to different estimates, say $\hat\mu_j$ for model $j$. The FIC method consists in first developing an exact or approximate expression for the precision or quality of each estimator, say $r_j$ for $\hat\mu_j$, and then using data to estimate these precision measures, say $\hat r_j$. In the end the model with the best estimated precision is selected. The FIC methodology was developed by Gerda Claeskens and Nils Lid Hjort, first in two 2003 discussion articles in the Journal of the American Statistical Association and later in other papers and in their 2008 book.

The concrete formulae and implementation for FIC depend firstly on the particular parameter of interest, the choice of which depends not on mathematics but on the scientific and statistical context. Thus the FIC apparatus may select one model as most appropriate for estimating a quantile of a distribution but prefer another model for estimating the mean value. Secondly, the FIC formulae depend on the specifics of the models used for the observed data and on how precision is to be measured. The clearest case is where precision is taken to be mean squared error, say $r_j = b_j^2 + \tau_j^2$ in terms of squared bias and variance for the estimator associated with model $j$. FIC formulae are then available in a variety of parametric, semiparametric and nonparametric situations; they involve separate estimation of squared bias and variance, leading to estimated precision $\hat r_j = \widehat{b_j^2} + \hat\tau_j^2$.
In the end the FIC selects the model with smallest estimated mean squared error.

Associated with the use of the FIC for selecting a good model is the FIC plot, designed to give a clear and informative picture of all estimates, across all candidate models, and their merit. It displays estimates on the $y$ axis along with FIC scores on the $x$ axis; thus estimates found to the left in the plot are associated with the better models, and those found in the middle and to the right stem from models that are less adequate, or not adequate, for estimating the focus parameter in question.

Generally speaking, complex models (with many parameters relative to sample size) tend to lead to estimators with small bias but high variance; more parsimonious models (with fewer parameters) typically yield estimators with larger bias but smaller variance. The FIC method balances the two desiderata of small bias and small variance in an optimal fashion. The main difficulty lies with the bias $b_j$, as it involves the distance from the expected value of the estimator to the true underlying quantity to be estimated, and the true data generating mechanism may lie outside each of the candidate models.

In situations where there is not a unique focus parameter, but rather a family of such, there are versions of average FIC (AFIC or wFIC) that find the best model in terms of suitably weighted performance measures, e.g. when searching for a regression model that performs particularly well in a portion of the covariate space.

It is also possible to keep several of the best models on board, ending the statistical analysis with a data-dictated weighted average of the estimators with the best FIC scores, typically giving highest weight to estimators associated with the best FIC scores. Such schemes of model averaging extend the direct FIC selection method.
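The selection rule, the FIC plot coordinates and a simple averaging scheme can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `Candidate` class, the function names, the numbers and the exponential weighting with tuning constant `kappa` are all assumptions made for the example. In practice the squared-bias and variance estimates would come from model-specific FIC formulae.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    estimate: float     # focus estimate for this model
    sq_bias_hat: float  # estimated squared bias (hypothetical input)
    var_hat: float      # estimated variance (hypothetical input)


def fic_score(c: Candidate) -> float:
    """Estimated mean squared error: squared bias (truncated at 0) plus variance."""
    return max(c.sq_bias_hat, 0.0) + c.var_hat


def select_by_fic(candidates):
    """Return the candidate with the smallest estimated mean squared error."""
    return min(candidates, key=fic_score)


def fic_plot_points(candidates):
    """(x, y) = (FIC score, estimate), sorted so the best models come first."""
    return sorted((fic_score(c), c.estimate) for c in candidates)


def averaging_weights(candidates, kappa=1.0):
    """Illustrative weights proportional to exp(-kappa * score / 2).

    The exponential form and kappa are assumptions for this sketch;
    larger kappa concentrates weight on the best-scoring model.
    """
    scores = [fic_score(c) for c in candidates]
    best = min(scores)
    raw = [math.exp(-0.5 * kappa * (s - best)) for s in scores]
    total = sum(raw)
    return [r / total for r in raw]


# Made-up numbers for three candidate models of increasing complexity.
candidates = [
    Candidate("narrow", 2.10, 0.090, 0.010),
    Candidate("medium", 2.45, 0.010, 0.030),
    Candidate("wide", 2.50, 0.000, 0.080),
]

best = select_by_fic(candidates)
weights = averaging_weights(candidates, kappa=20.0)
averaged = sum(w * c.estimate for w, c in zip(weights, candidates))
print(best.name, fic_plot_points(candidates), averaged)
```

With these made-up inputs the narrow model is penalised for its bias and the wide model for its variance, so the intermediate model has the smallest score. The averaged estimate always lies between the smallest and largest candidate estimates, because the weights are nonnegative and sum to one.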
The FIC methodology applies in particular to selection of variables in different forms of regression analysis, including the framework of generalised linear models and the semiparametric proportional hazards models (i.e. Cox regression). (Wikipedia).
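For variable selection in ordinary linear regression, the idea can be illustrated with a simplified, FIC-style score. The sketch below assumes the focus is the mean response at a chosen covariate point `x0`, treats the widest model as the reference, and estimates each submodel's squared bias as the squared difference from the wide-model estimate minus the estimated variance of that difference, truncated at zero. The function names, the simulated data and the choice of `x0` are hypothetical, and the score is a simplification in the spirit of the method rather than the published FIC formula.

```python
import itertools
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_focus(X, y, x0, cols):
    """Least squares on the chosen columns.

    Returns (mu_hat, q, rss): the focus estimate x0_S' beta_hat_S, the factor q
    with Var(mu_hat) = sigma^2 * q, and the residual sum of squares.
    """
    XtX = [[sum(row[a] * row[b] for row in X) for b in cols] for a in cols]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in cols]
    x0s = [x0[a] for a in cols]
    beta = solve(XtX, Xty)
    q = sum(u * v for u, v in zip(x0s, solve(XtX, x0s)))
    mu_hat = sum(u * v for u, v in zip(x0s, beta))
    fitted = [sum(row[a] * bj for a, bj in zip(cols, beta)) for row in X]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return mu_hat, q, rss


def focused_scores(X, y, x0, protected=(0,)):
    """Map each submodel (tuple of column indices) to (estimate, score)."""
    n, p = len(y), len(X[0])
    wide = tuple(range(p))
    mu_w, q_w, rss_w = fit_focus(X, y, x0, wide)
    sigma2 = rss_w / (n - p)  # error variance estimated in the wide model
    optional = [j for j in wide if j not in protected]
    out = {}
    for k in range(len(optional) + 1):
        for extra in itertools.combinations(optional, k):
            cols = tuple(sorted(tuple(protected) + extra))
            mu_s, q_s, _ = fit_focus(X, y, x0, cols)
            var_s = sigma2 * q_s
            # For nested least-squares fits, Var(mu_s - mu_w) = sigma^2 (q_w - q_s).
            var_diff = sigma2 * (q_w - q_s)
            sq_bias = max(0.0, (mu_s - mu_w) ** 2 - var_diff)
            out[cols] = (mu_s, sq_bias + var_s)
    return out


# Simulated data (hypothetical): intercept, one relevant and one irrelevant covariate.
rng = random.Random(0)
X, y = [], []
for _ in range(60):
    x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
    X.append([1.0, x1, x2])
    y.append(1.0 + 2.0 * x1 + rng.gauss(0, 1))

x0 = [1.0, 0.5, -1.0]  # covariate point defining the focus parameter
scores = focused_scores(X, y, x0)
best = min(scores, key=lambda cols: scores[cols][1])
for cols, (mu_hat, score) in sorted(scores.items(), key=lambda kv: kv[1][1]):
    print(cols, round(mu_hat, 3), round(score, 4))
```

Changing `x0` changes the focus parameter and may change which submodel scores best. Criteria such as AIC and BIC, by contrast, rank the models once for the whole data set.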

Related pages

Deviance information criterion | Variance | Semiparametric model | Estimator | Bayesian information criterion | Generalized linear model | Regression analysis | Model selection | Mean squared error | Akaike information criterion | Statistics | Hannan–Quinn information criterion | Bias of an estimator | Parametric model