STATISTICS FOR DATA SCIENCE

Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. It supports decision making by helping us understand the data.

Types of Statistics:

1. Descriptive statistics - Understand the sample
2. Inferential statistics - Understand the population
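
To make the distinction concrete, here is a minimal sketch using only Python's standard library. The sample values are made up for illustration, and the 1.96 critical value assumes a normal approximation (a t-based interval would usually be preferred for a sample this small).

```python
import math
import statistics

# Hypothetical sample: heights (cm) of 8 people drawn from a larger population.
sample = [162.0, 170.5, 168.0, 175.0, 159.5, 181.0, 166.5, 172.5]

# Descriptive statistics: summarize the sample itself.
sample_mean = statistics.mean(sample)
sample_median = statistics.median(sample)
sample_std = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# Inferential statistics: use the sample to say something about the population.
# Approximate 95% confidence interval for the population mean, assuming a
# normal approximation (critical value 1.96).
standard_error = sample_std / math.sqrt(len(sample))
ci_low = sample_mean - 1.96 * standard_error
ci_high = sample_mean + 1.96 * standard_error

print(f"mean={sample_mean:.2f}, median={sample_median:.2f}, std={sample_std:.2f}")
print(f"approx. 95% CI for population mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The first three numbers describe only the eight observed values; the interval is a statement about the wider population those values were drawn from.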

In descriptive statistics, no prediction is made; the aim is to understand the data of current…

REGRESSION ANALYSIS

Regression analysis is a statistical technique for studying linear relationships between variables.

Regression analysis is done for one of two purposes.

1. To predict the value of the dependent variable from the independent variables.
2. To estimate the effect of some explanatory variable on the dependent variable.
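
Both uses can be seen in a small sketch of simple linear regression fitted by ordinary least squares. The function name `fit_simple_linear_regression` and the experience/salary numbers are hypothetical, chosen only for illustration; in practice a library such as scikit-learn or statsmodels would normally do the fitting.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) minimizing squared error for y ~ intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # slope = sum of cross-deviations / sum of squared deviations of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: years of experience (independent variable)
# and salary in thousands (dependent variable).
experience = [1, 2, 3, 4, 5]
salary = [35, 40, 45, 50, 55]

intercept, slope = fit_simple_linear_regression(experience, salary)

# Estimating an effect: the slope is the estimated change in salary
# per additional year of experience.
print(f"estimated effect of one extra year: {slope:.2f}")

# Prediction: plug a new value of the independent variable into the fitted line.
predicted = intercept + slope * 6
print(f"predicted salary at 6 years: {predicted:.2f}")
```

The slope answers the "effect" question, while evaluating the fitted line at a new input answers the "prediction" question.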

LINEAR REGRESSION

Linear regression is a machine learning algorithm…

PARAMETRIC MODEL

In a parametric model, the number of parameters is fixed and does not grow with the sample size. The data must satisfy the model's assumptions for the results to be reliable.

A learning model that summarizes the data with a fixed set of parameters (for example, one coefficient per independent feature) is called a parametric model; linear models are a common example.

Benefits of Parametric Machine Learning Algorithms:

• Explainability is…

TRAIN TEST SPLIT

Before building the model, we perform a train-test split on the entire dataset. Suppose you have 1000 data points: depending on the data you have, you might split them 70% train / 30% test (700 and 300 points) or 80% train / 20% test (800 and 200 points).
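
The split described above can be sketched in plain Python as follows. The helper `split_train_test`, its parameter names, and the seed value are assumptions made for this illustration; in a typical workflow scikit-learn's `train_test_split` from `sklearn.model_selection` is used for the same job.

```python
import random


def split_train_test(data, test_fraction=0.3, seed=42):
    """Shuffle a copy of the data and return (train, test) lists."""
    shuffled = list(data)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# 1000 placeholder data points, standing in for real rows of a dataset.
data_points = list(range(1000))

train, test = split_train_test(data_points, test_fraction=0.3)
print(len(train), len(test))  # 700 300

train_80, test_20 = split_train_test(data_points, test_fraction=0.2)
print(len(train_80), len(test_20))  # 800 200
```

Shuffling before slicing matters when the rows are ordered (for example by date or by class), because otherwise the test set would not resemble the training set.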

These 70 % train data

Converting a data file to a CSV file and reading it in a Jupyter notebook

Hi, welcome! This is my first blog, so thanks to the Medium platform for providing this opportunity. I am very passionate about learning new technologies, and I love to share what I come across in my learning. … Vignesh S

Aspiring data scientist, passionate about learning new technologies and sharing my thoughts with others.