A data science project with a machine learning use case moves through several pipeline stages: data gathering, feature engineering, feature selection, model creation, and model deployment.

TRAIN TEST SPLIT

Before model creation, we perform a train-test split on the entire dataset. Suppose you have 1000 data points. You might split them into 70% train and 30% test (700 and 300 points), or 80% train and 20% test (800 and 200 points), depending on how much data you have.
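To make the split concrete, here is a minimal sketch in plain Python that uses only the standard library. The helper name `split_train_test`, the `seed` value, and the use of the integers 0 to 999 as stand-in data points are assumptions made for illustration. They are not from the original text.

```python
import random


def split_train_test(data, test_fraction=0.3, seed=42):
    """Shuffle a copy of `data` and split it into (train, test) lists.

    Hypothetical helper for illustration; `seed` only makes the shuffle repeatable.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(data)                   # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)   # shuffle before splitting to avoid ordering bias
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in dataset: 1000 data points (assumed for the example).
data_points = list(range(1000))

train_70, test_30 = split_train_test(data_points, test_fraction=0.3)
train_80, test_20 = split_train_test(data_points, test_fraction=0.2)

print(len(train_70), len(test_30))  # 700 300
print(len(train_80), len(test_20))  # 800 200
```

In practice, many projects use `train_test_split` from scikit-learn's `sklearn.model_selection` module, which takes a `test_size` and a `random_state` argument and does the same job for arrays and data frames.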