Converting a Data File to a CSV File and Reading It in Jupyter Notebook

Jul 3, 2021

Hi, and welcome. This is my first blog post, so thanks to the Medium platform for providing this opportunity. I am passionate about learning new technologies, and I love to share what I come across while learning. In this post I am going to show how to convert a data file into a CSV file and read it in Jupyter Notebook.

1. Dataset Extraction

There are various free sources where we can get data from almost any domain and infer business insights. Below I list some useful resources where we can find data files for our projects.

Kaggle: https://www.kaggle.com/tags/web-sites
UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/index.php
Data World: https://data.world/datasets/website
Government data: https://data.gov.in/

2. Data File to CSV File Through Excel

Some of the data files you download may not be in a condition where you can work with them directly; they may be unstructured. To make such a file more readable, you need to convert it into structured data. Here I will show how to convert an unstructured data file into a structured CSV file and use it in Jupyter, which makes the work of data scientists and data analysts simpler.

STEP 1: Download the file from the data source to your local desktop

STEP 2: Convert the data file to a Notepad (text) file

STEP 3: Open the Notepad file, copy the content, and paste it into Excel

**Paste the content of the Notepad file into Excel.**

**Click Data in the menu bar, then click Text to Columns.**

STEP 4: Change the content to a structured format

**Use the delimiter option to split the content into the relevant columns and rows, then click Next.**

**The content is now split into the relevant rows and columns.**

After making the changes, save the file as comma-separated values (CSV) and store it on your local desktop.
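The same Text to Columns conversion can also be done in Python. The following is a minimal sketch, not the method used in the steps above: it assumes the raw file uses a semicolon as its delimiter, and the file names `car_raw.txt` and `car_converted.csv`, the sample rows, and the helper `convert_to_csv` are hypothetical placeholders for illustration.

```python
import csv
import tempfile
from pathlib import Path


def convert_to_csv(src_path, dst_path, delimiter=";"):
    """Read a delimited text file and write it back out as comma-separated values.

    Returns the number of rows written.
    """
    rows_written = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter=delimiter)
        writer = csv.writer(dst)
        for row in reader:
            # Trim stray spaces around each field, as you might tidy cells in Excel.
            writer.writerow([field.strip() for field in row])
            rows_written += 1
    return rows_written


# Hypothetical sample data standing in for a downloaded data file.
work_dir = Path(tempfile.mkdtemp())
raw_file = work_dir / "car_raw.txt"
csv_file = work_dir / "car_converted.csv"
raw_file.write_text("3; alfa-romero; gas; 13495\n1; audi; gas; 13950\n", encoding="utf-8")

row_count = convert_to_csv(raw_file, csv_file)
converted_text = csv_file.read_text(encoding="utf-8")
print(converted_text)
```

If your file uses a different separator, such as a tab or a space, pass it through the `delimiter` argument, much as you would pick it in the Excel wizard.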

Reading the file in Jupyter Notebook with Python code, without changing anything in Excel

Open Jupyter Notebook and type the following commands to import the CSV file.

Import the pandas library (along with NumPy) to load the dataset

import numpy as np

import pandas as pd

After importing the libraries, read the CSV file

df = pd.read_csv("car.csv")

The data frame has been created, and we are now free to work with the imported dataset and infer insights from it.
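A quick way to confirm that the file loaded as expected is to look at the first rows, the shape, and the column types. The sketch below is illustrative only: it writes a tiny hypothetical `car.csv` with made-up rows into a temporary folder so that it runs on its own, whereas in practice you would point `read_csv` at your own file.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample file; replace the path with your own car.csv.
sample_path = Path(tempfile.mkdtemp()) / "car.csv"
sample_path.write_text(
    "make,fuel-type,price\n"
    "audi,gas,13950\n"
    "bmw,gas,16430\n",
    encoding="utf-8",
)

df = pd.read_csv(sample_path)

print(df.head())    # first rows of the data frame
print(df.shape)     # (number of rows, number of columns)
print(df.dtypes)    # data type pandas inferred for each column
```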

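If the file already contains a header row and you supply your own column names through the names parameter of read_csv, explicitly pass header=0 to be able to replace the existing names. The header can also be a list of integers that specify row locations for a multi-index on the columns, e.g. [0,1,3]. Refer: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html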

# Assign the column names (assuming DataFrame.set_axis, which returns a new data frame)
df = df.set_axis(
    ["symboling", "normalized-losses", "make", "fuel-type", "aspiration",
     "num-of-doors", "body-style", "drive-wheels", "enginelocation",
     "wheelbase", "length", "width", "height", "curb-weight", "engine-type",
     "num-of-cylinders", "engine-size", "fuel-system", "bore", "stroke",
     "compression-ratio", "horsepower", "peak-rpm", "city-mpg", "highway-mpg",
     "price"],
    axis="columns",
)
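Column names can also be supplied while reading the file, through the `names` parameter of `read_csv`. The following is a minimal sketch with hypothetical in-memory data and a shortened, three-column name list; it contrasts a file that has no header row (`header=None`) with one whose existing header row should be replaced (`header=0`).

```python
import io

import pandas as pd

column_names = ["make", "fuel-type", "price"]  # shortened list for illustration

# Case 1: the file has no header row, so every line is data.
no_header_text = "audi,gas,13950\nbmw,gas,16430\n"
df_no_header = pd.read_csv(io.StringIO(no_header_text), header=None, names=column_names)

# Case 2: the file already has a header row that we want to replace.
with_header_text = "a,b,c\naudi,gas,13950\nbmw,gas,16430\n"
df_replaced = pd.read_csv(io.StringIO(with_header_text), header=0, names=column_names)

# Renaming after loading gives the same labels; set_axis returns a new data frame.
df_renamed = pd.read_csv(io.StringIO(with_header_text)).set_axis(column_names, axis="columns")

print(df_no_header)
print(df_replaced)
print(list(df_renamed.columns))
```

Without `header=None` in the first case, pandas would treat the first data row as the header, so it is worth checking the first lines of your file before choosing between the two.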