Parametric vs. Non-Parametric Model Selection for Regression and Classification Based on Statistical Tests

PARAMETRIC MODEL

  • Highly explainable: easy to interpret for clients or stakeholders.
  • Suited to simple data whose relationship matches the assumed functional form (for example, a straight line).
  • Very fast to learn from data.

NON-PARAMETRIC MODEL

  • Suited to complex data.
  • Makes no assumptions (or only weak assumptions) about the underlying function.
  • Can yield higher predictive performance.
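
The trade-off between the two families can be illustrated with a small sketch. It assumes synthetic data (a sine curve, not taken from any real dataset) and compares a straight-line least-squares fit (parametric) with a k-nearest-neighbours average (non-parametric). The helpers `fit_line`, `knn_predict`, and `mse` are hypothetical names written only for this illustration.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def knn_predict(train_x, train_y, x, k=3):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Synthetic non-linear data (an assumption for illustration): y = sin(x)
xs = [i * 0.1 for i in range(63)]
ys = [math.sin(x) for x in xs]
train_x, train_y = xs[0::2], ys[0::2]   # even-indexed points for fitting
test_x, test_y = xs[1::2], ys[1::2]     # odd-indexed points for evaluation

a, b = fit_line(train_x, train_y)
linear_mse = mse(test_y, [a + b * x for x in test_x])
knn_mse = mse(test_y, [knn_predict(train_x, train_y, x) for x in test_x])

print(f"linear MSE: {linear_mse:.4f}, k-NN MSE: {knn_mse:.4f}")
```

On this curved data the straight line cannot follow the shape, so the k-NN error should come out much lower. On data that really is linear, the simpler model would be the easier one to explain.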

Generic Conditions for the Hypothesis Test:

REGRESSION:

  1. The dependent variable must be numeric.
  2. The independent variables must not show multicollinearity.
  3. The relationship between the dependent and independent variables must be linear.
  4. The error terms must not be autocorrelated.
  5. The error terms must be homoscedastic (constant variance).
  6. The error terms must follow a normal distribution.
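
Some of these conditions can be checked numerically. The sketch below is a minimal illustration, not a full diagnostic suite. It computes the Durbin-Watson statistic for condition 4 (values near 2 suggest no first-order autocorrelation) and, for the two-predictor case only, a variance inflation factor (VIF) for condition 2. The helper names and the toy inputs are assumptions made for this example.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic of a residual sequence.
    Near 2: no first-order autocorrelation; toward 0: positive; toward 4: negative."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF when there are exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Toy residuals (assumptions): one long run of same-sign errors vs. alternating errors
dw_trending = durbin_watson([1, 1, 1, 1, 1, -1, -1, -1, -1, -1])
dw_alternating = durbin_watson([1, -1] * 5)

# Toy predictors (assumptions): x2 is almost exactly 2 * x1, x3 is only loosely related
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
x3 = [3, 1, 4, 1, 5, 9]
vif_collinear = vif_two_predictors(x1, x2)
vif_loose = vif_two_predictors(x1, x3)

print(dw_trending, dw_alternating, vif_collinear, vif_loose)
```

A VIF above roughly 5 to 10 is a common rule of thumb for problematic multicollinearity; the exact cut-off is a judgment call.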
Linear dependency: the slope increases consistently and the p-value is below 0.05. As BMI increases, diabetes severity increases.
Non-linear dependency: even though the means are far apart, the slope is inconsistent, so linear regression struggles to model the relationship and the p-value is above 0.05.
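
The two cases above can be reproduced on toy data. The article does not say which test produced its p-values, so this sketch swaps in a permutation test on the Pearson correlation, using only the standard library. The BMI and severity values are synthetic and purely illustrative, and `permutation_p_value` is a hypothetical helper.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-sided p-value for 'no linear association', obtained by shuffling ys."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # The +1 terms keep the estimate away from an impossible p-value of exactly 0
    return (hits + 1) / (n_perm + 1)

# Synthetic data (assumption): BMI from 18 to 38
bmi = [18 + i for i in range(21)]
noise = random.Random(1)
severity_linear = [5 * b + noise.gauss(0, 10) for b in bmi]  # steady upward slope
severity_u = [(b - 28) ** 2 for b in bmi]                    # U-shaped, no overall slope

p_linear = permutation_p_value(bmi, severity_linear)
p_u = permutation_p_value(bmi, severity_u)
print(f"linear case p = {p_linear:.4f}, U-shaped case p = {p_u:.4f}")
```

The U-shaped case depends strongly on BMI, yet a test for linear association does not detect it, which is the situation where a non-parametric model is worth considering.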

CLASSIFICATION

Linear dependency: the relationship stays linear. As BMI increases, diabetes risk increases, and the slope is consistent.
Non-linear dependency: the relationship is not linear and the slope is inconsistent. Even though the result is statistically significant, a linear model struggles to model the relationship.
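
One rough way to see whether the slope is consistent in the classification case is to bin the feature and check that the positive-class rate only rises from bin to bin. This is a sketch on synthetic labels, not the procedure used in the article; `positive_rate_by_bin` and `is_monotonic_increasing` are hypothetical helpers.

```python
def positive_rate_by_bin(values, labels, edges):
    """Share of positive labels (1) in each [edges[i], edges[i+1]) bin."""
    rates = []
    for lo, hi in zip(edges, edges[1:]):
        in_bin = [lab for v, lab in zip(values, labels) if lo <= v < hi]
        rates.append(sum(in_bin) / len(in_bin) if in_bin else None)
    return rates

def is_monotonic_increasing(rates):
    """True if the rates never decrease (empty bins are skipped)."""
    known = [r for r in rates if r is not None]
    return all(a <= b for a, b in zip(known, known[1:]))

# Synthetic data (assumption): 12 patients, three per BMI bin
bmi = [19, 20, 21, 24, 25, 26, 29, 30, 31, 34, 35, 36]
edges = [18, 23, 28, 33, 38]
risk_linear = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1]   # risk rises steadily with BMI
risk_u = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]        # high risk at both extremes

rates_linear = positive_rate_by_bin(bmi, risk_linear, edges)
rates_u = positive_rate_by_bin(bmi, risk_u, edges)

print(rates_linear, is_monotonic_increasing(rates_linear))
print(rates_u, is_monotonic_increasing(rates_u))
```

When the rates rise and then fall, or fall and then rise, a single linear decision boundary on that feature cannot capture the pattern, whatever the significance test reports.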

Vignesh S

Aspiring data scientist, passionate about learning new technologies and sharing my thoughts with others.