Pandas Write CSV File | Mastering in Python Pandas Library
Writing a CSV file is usually one step in data preprocessing or data cleaning. Data preprocessing is a data mining technique that transforms raw data into an understandable format.
How to Write CSV File in Python
Here we discuss the parameters of the pd.read_csv function.
import pandas as pd

df = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv')
df
Output >>>
    ID  Name             Industry             Inception  Revenue      Expenses           Profit    Growth
0   1   Lamtone          IT Services          2009       $11,757,018  6,482,465 Dollars  5274553   30%
1   2   Stripfind        Financial            2010       $12,329,371  916,455 Dollars    11412916  20%
2   3   Canecorporation  Health               2012       $10,597,009  7,591,189 Dollars  3005820   7%
3   4   Mattouch         IT Services          2013       $14,026,934  7,429,377 Dollars  6597557   26%
4   5   Techdrill        Health               2009       $10,573,990  7,435,363 Dollars  3138627   8%
5   6   Techline         Health               2006       $13,898,119  5,470,303 Dollars  8427816   23%
6   7   Cityace          Health               2010       $9,254,614   6,249,498 Dollars  3005116   6%
7   8   Kayelectro       Health               2009       $9,451,943   3,878,113 Dollars  5573830   4%
8   9   Ganzlax          IT Services          2011       $14,001,180  3,878,153 Dollars  11901180  18%
9   10  Trantraxlax      Government Services  2011       $11,088,336  5,635,276 Dollars  5453060   7%
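For readers who do not have the file at the path above, the same call can be tried on in-memory text. This is only a sketch: the three rows below are a hypothetical excerpt shaped like Fortune_10.csv, not the real file, and io.StringIO stands in for the file on disk.

```python
import io
import pandas as pd

# Hypothetical three-row excerpt shaped like Fortune_10.csv (not the real file).
csv_text = """ID,Name,Industry,Inception,Revenue,Expenses,Profit,Growth
1,Lamtone,IT Services,2009,"$11,757,018","6,482,465 Dollars",5274553,30%
2,Stripfind,Financial Services,2010,"$12,329,371","916,455 Dollars",11412916,20%
3,Canecorporation,Health,2012,"$10,597,009","7,591,189 Dollars",3005820,7%
"""

# read_csv accepts a path or a file-like object; StringIO plays the role of the file.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # Revenue, Expenses and Growth load as text, not numbers
```

Fields that contain commas, such as the revenue figures, have to be quoted in the file, otherwise each comma would start a new column.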
To check the type of the object that read_csv returns, use the type function:
type(df)
Output >>> pandas.core.frame.DataFrame
The dataset is loaded as a pandas DataFrame.
To list all the column names:
df.columns
Output >>> Index(['ID', 'Name', 'Industry', 'Inception', 'Revenue', 'Expenses', 'Profit', 'Growth'], dtype='object')
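The two checks above can also be written so that the results are reusable values rather than notebook output. A minimal sketch, assuming a small hypothetical CSV held in memory instead of the Fortune_10 file:

```python
import io
import pandas as pd

# Hypothetical two-row sample; only the column layout matters here.
csv_text = "ID,Name,Industry\n1,Lamtone,IT Services\n2,Stripfind,Financial Services\n"
df = pd.read_csv(io.StringIO(csv_text))

# isinstance is the usual way to test for a DataFrame in code.
is_frame = isinstance(df, pd.DataFrame)

# .tolist() turns the Index of labels into a plain Python list.
column_names = df.columns.tolist()

print(is_frame, column_names)
```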
To read only the first n rows of the dataset, use the nrows parameter:
df = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv', nrows = 1)
df
Output >>>
    ID  Name             Industry             Inception  Revenue      Expenses           Profit    Growth
0   1   Lamtone          IT Services          2009       $11,757,018  6,482,465 Dollars  5274553   30%
df = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv', nrows = 5)
df
Output >>>
    ID  Name             Industry             Inception  Revenue      Expenses           Profit    Growth
0   1   Lamtone          IT Services          2009       $11,757,018  6,482,465 Dollars  5274553   30%
1   2   Stripfind        Financial            2010       $12,329,371  916,455 Dollars    11412916  20%
2   3   Canecorporation  Health               2012       $10,597,009  7,591,189 Dollars  3005820   7%
3   4   Mattouch         IT Services          2013       $14,026,934  7,429,377 Dollars  6597557   26%
4   5   Techdrill        Health               2009       $10,573,990  7,435,363 Dollars  3138627   8%
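Note that nrows counts data rows from the top of the file and does not include the header line. A small sketch of that behaviour, assuming a hypothetical ten-row CSV generated in memory (the CompanyN names are made up for illustration):

```python
import io
import pandas as pd

# Hypothetical ten-row file: a header line followed by IDs 1 to 10.
csv_text = "ID,Name\n" + "".join(f"{i},Company{i}\n" for i in range(1, 11))

# nrows=3 keeps the header and reads only the first three data rows.
first_three = pd.read_csv(io.StringIO(csv_text), nrows=3)

print(first_three)
```

On a large file this is a cheap way to inspect the columns before loading everything.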
To read only specific columns, pass their zero-based positions to the usecols parameter:
df = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv', usecols = [0])
df
Output >>>
    ID
0   1
1   2
2   3
3   4
4   5
5   6
6   7
7   8
8   9
9   10
df2 = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv', usecols = [0,1])
df2
Output >>>
    ID  Name
0   1   Lamtone
1   2   Stripfind
2   3   Canecorporation
3   4   Mattouch
4   5   Techdrill
5   6   Techline
6   7   Cityace
7   8   Kayelectronics
8   9   Ganzlax
9   10  Trantraxlax
df = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv', usecols = [1,2])
df
Output >>>
    Name             Industry
0   Lamtone          IT Services
1   Stripfind        Financial Services
2   Canecorporation  Health
3   Mattouch         IT Services
4   Techdrill        Health
5   Techline         Health
6   Cityace          Health
7   Kayelectronics   Health
8   Ganzlax          IT Services
9   Trantraxlax      Government Services
In the eight-column file shown above, positions 2, 4 and 6 are Industry, Revenue and Profit:
df = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv', usecols = [2,4,6])
df
Output >>>
    Industry             Revenue      Profit
0   IT Services          $11,757,018  5274553
1   Financial Services   $12,329,371  11412916
2   Health               $10,597,009  3005820
3   IT Services          $14,026,934  6597557
4   Health               $10,573,990  3138627
5   Health               $13,898,119  8427816
6   Health               $9,254,614   3005116
7   Health               $9,451,943   5573830
8   IT Services          $14,001,180  11901180
9   Government Services  $11,088,336  5453060
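usecols also accepts column names, which is often easier to read than positions. The sketch below assumes a hypothetical four-column CSV held in memory. It also shows that the order of the list is ignored, so the columns come back in file order.

```python
import io
import pandas as pd

# Hypothetical four-column sample (not the real Fortune_10 file).
csv_text = """ID,Name,Industry,Profit
1,Lamtone,IT Services,5274553
2,Stripfind,Financial Services,11412916
"""

# Select by name; the list order ("Profit" first) does not affect the result.
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["Profit", "Name"])

# The same two columns selected by zero-based position.
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[1, 3])

print(by_name.columns.tolist())  # columns appear in file order: Name, then Profit
```

To get the columns in a different order, reorder after reading, for example by_name[["Profit", "Name"]].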
df = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv')
df
Output >>>
    0   1                2                    3          4            5                  6         7
    ID  Name             Industry             Inception  Revenue      Expenses           Profit    Growth
0   1   Lamtone          IT Services          2009       $11,757,018  6,482,465 Dollars  5274553   30%
1   2   Stripfind        Financial            2010       $12,329,371  916,455 Dollars    11412916  20%
2   3   Canecorporation  Health               2012       $10,597,009  7,591,189 Dollars  3005820   7%
3   4   Mattouch         IT Services          2013       $14,026,934  7,429,377 Dollars  6597557   26%
4   5   Techdrill        Health               2009       $10,573,990  7,435,363 Dollars  3138627   8%
5   6   Techline         Health               2006       $13,898,119  5,470,303 Dollars  8427816   23%
6   7   Cityace          Health               2010       $9,254,614   6,249,498 Dollars  3005116   6%
7   8   Kayelectro       Health               2009       $9,451,943   3,878,113 Dollars  5573830   4%
8   9   Ganzlax          IT Services          2011       $14,001,180  3,878,153 Dollars  11901180  18%
9   10  Trantraxlax      Government Services  2011       $11,088,336  5,635,276 Dollars  5453060   7%
The skiprows parameter skips lines at the start of the file; an integer n skips the first n lines. The outputs in this part appear to come from a copy of the file that has one extra line of column numbers above the header, so skiprows = 1 drops that line and the real header is used:
df = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv', skiprows = 1)
df
Output >>>
    ID  Name             Industry             Inception  Employees  Revenue      Expenses           Profit    Growth
0   1   Lamtone          IT Services          2009       55         $11,757,018  6,482,465 Dollars  5274553   30%
1   2   Stripfind        Financial            2010       25         $12,329,371  916,455 Dollars    11412916  20%
2   3   Canecorporation  Health               2012       6          $10,597,009  7,591,189 Dollars  3005820   7%
3   4   Mattouch         IT Services          2013       6          $14,026,934  7,429,377 Dollars  6597557   26%
4   5   Techdrill        Health               2009       9          $10,573,990  7,435,363 Dollars  3138627   8%
5   6   Techline         Health               2006       65         $13,898,119  5,470,303 Dollars  8427816   23%
6   7   Cityace          Health               2010       25         $9,254,614   6,249,498 Dollars  3005116   6%
7   8   Kayelectro       Health               2009       687        $9,451,943   3,878,113 Dollars  5573830   4%
8   9   Ganzlax          IT Services          2011       75         $14,001,180  3,878,153 Dollars  11901180  18%
9   10  Trantraxlax      Government Services  2011       35         $11,088,336  5,635,276 Dollars  5453060   7%
df = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv', skiprows = 2)
df
Output >>>
    1   Lamtone          IT Services          2009       $11,757,018  6,482,465 Dollars  5274553   30%
0   2   Stripfind        Financial Services   2010       $12,329,371  916,455 Dollars    11412916  20%
1   3   Canecorporation  Health               2012       $10,597,009  7,591,189 Dollars  3005820   7%
2   4   Mattouch         IT Services          2013       $14,026,934  7,429,377 Dollars  6597557   26%
3   5   Techdrill        Health               2009       $10,573,990  7,435,363 Dollars  3138627   8%
4   6   Techline         Health               2006       $13,898,119  5,470,303 Dollars  8427816   23%
5   7   Cityace          Health               2010       $9,254,614   6,249,498 Dollars  3005116   6%
6   8   Kayelectronics   Health               2009       $9,451,943   3,878,113 Dollars  5573830   4%
7   9   Ganzlax          IT Services          2011       $14,001,180  3,878,113 Dollars  11901180  18%
8   10  Trantraxlax      Government Services  2011       $11,088,336  5,635,276 Dollars  5453060   7%
df = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv', skiprows = 0)
df
Output >>>
    0   1                2                    3          4          5            6                  7         8
0   ID  Name             Industry             Inception  Employees  Revenue      Expenses           Profit    Growth
1   1   Lamtone          IT Services          2009       55         $11,757,018  6,482,465 Dollars  5274553   30%
2   2   Stripfind        Financial            2010       25         $12,329,371  916,455 Dollars    11412916  20%
3   3   Canecorporation  Health               2012       6          $10,597,009  7,591,189 Dollars  3005820   7%
4   4   Mattouch         IT Services          2013       6          $14,026,934  7,429,377 Dollars  6597557   26%
5   5   Techdrill        Health               2009       9          $10,573,990  7,435,363 Dollars  3138627   8%
6   6   Techline         Health               2006       65         $13,898,119  5,470,303 Dollars  8427816   23%
7   7   Cityace          Health               2010       25         $9,254,614   6,249,498 Dollars  3005116   6%
8   8   Kayelectro       Health               2009       687        $9,451,943   3,878,113 Dollars  5573830   4%
9   9   Ganzlax          IT Services          2011       75         $14,001,180  3,878,153 Dollars  11901180  18%
10  10  Trantraxlax      Government Services  2011       35         $11,088,336  5,635,276 Dollars  5453060   7%
A list passed to skiprows skips the lines at those zero-based positions instead:
df = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv', skiprows = [0])
df
Output >>>
    ID  Name             Industry             Inception  Employees  Revenue      Expenses           Profit    Growth
0   1   Lamtone          IT Services          2009       55         $11,757,018  6,482,465 Dollars  5274553   30%
1   2   Stripfind        Financial            2010       25         $12,329,371  916,455 Dollars    11412916  20%
2   3   Canecorporation  Health               2012       6          $10,597,009  7,591,189 Dollars  3005820   7%
3   4   Mattouch         IT Services          2013       6          $14,026,934  7,429,377 Dollars  6597557   26%
4   5   Techdrill        Health               2009       9          $10,573,990  7,435,363 Dollars  3138627   8%
5   6   Techline         Health               2006       65         $13,898,119  5,470,303 Dollars  8427816   23%
6   7   Cityace          Health               2010       25         $9,254,614   6,249,498 Dollars  3005116   6%
7   8   Kayelectro       Health               2009       687        $9,451,943   3,878,113 Dollars  5573830   4%
8   9   Ganzlax          IT Services          2011       75         $14,001,180  3,878,153 Dollars  11901180  18%
9   10  Trantraxlax      Government Services  2011       35         $11,088,336  5,635,276 Dollars  5453060   7%
df = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv', skiprows = [1])
df
Output >>>
    0   1                2                    3          4            5                  6         7
0   1   Lamtone          IT Services          2009       $11,757,018  6,482,465 Dollars  5274553   30%
1   2   Stripfind        Financial Services   2010       $12,329,371  916,455 Dollars    11412916  20%
2   3   Canecorporation  Health               2012       $10,597,009  7,591,189 Dollars  3005820   7%
3   4   Mattouch         IT Services          2013       $14,026,934  7,429,377 Dollars  6597557   26%
4   5   Techdrill        Health               2009       $10,573,990  7,435,363 Dollars  3138627   8%
5   6   Techline         Health               2006       $13,898,119  5,470,303 Dollars  8427816   23%
6   7   Cityace          Health               2010       $9,254,614   6,249,498 Dollars  3005116   6%
7   8   Kayelectronics   Health               2009       $9,451,943   3,878,113 Dollars  5573830   4%
8   9   Ganzlax          IT Services          2011       $14,001,180  3,878,113 Dollars  11901180  18%
9   10  Trantraxlax      Government Services  2011       $11,088,336  5,635,276 Dollars  5453060   7%
df = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv', skiprows = [0,2,3])
df
Output >>>
    ID  Name             Industry             Inception  Revenue      Expenses           Profit    Growth
0   3   Canecorporation  Health               2012       $10,597,009  7,591,189 Dollars  3005820   7%
1   4   Mattouch         IT Services          2013       $14,026,934  7,429,377 Dollars  6597557   26%
2   5   Techdrill        Health               2009       $10,573,990  7,435,363 Dollars  3138627   8%
3   6   Techline         Health               2006       $13,898,119  5,470,303 Dollars  8427816   23%
4   7   Cityace          Health               2010       $9,254,614   6,249,498 Dollars  3005116   6%
5   8   Kayelectronics   Health               2009       $9,451,943   3,878,113 Dollars  5573830   4%
6   9   Ganzlax          IT Services          2011       $14,001,180  3,878,113 Dollars  11901180  18%
7   10  Trantraxlax      Government Services  2011       $11,088,336  5,635,276 Dollars  5453060   7%
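The difference between an integer and a list for skiprows is easy to miss, so here is a small sketch of both forms. It assumes a hypothetical CSV held in memory whose first line is the header (unlike the copy of the file that seems to be used above).

```python
import io
import pandas as pd

# Hypothetical sample: line 0 is the header, lines 1 to 4 are data rows.
csv_text = "ID,Name\n1,Lamtone\n2,Stripfind\n3,Canecorporation\n4,Mattouch\n"

# An integer skips that many lines from the top, so the header line is lost
# and the first data row is promoted to column names.
skip_int = pd.read_csv(io.StringIO(csv_text), skiprows=1)

# A list skips only the listed zero-based line numbers; line 0 (the header) stays.
skip_list = pd.read_csv(io.StringIO(csv_text), skiprows=[1])

# Skipping lines 2 and 3 leaves the rows with ID 1 and ID 4.
skip_middle = pd.read_csv(io.StringIO(csv_text), skiprows=[2, 3])

print(skip_int.columns.tolist())
print(skip_list["ID"].tolist())
print(skip_middle["ID"].tolist())
```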
df1 = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv')
df1
Output >>>
    ID  Name             Industry             Inception  Revenue      Expenses           Profit    Growth
0   1   Lamtone          IT Services          2009       $11,757,018  6,482,465 Dollars  5274553   30%
1   2   Stripfind        Financial            2010       $12,329,371  916,455 Dollars    11412916  20%
2   3   Canecorporation  Health               2012       $10,597,009  7,591,189 Dollars  3005820   7%
3   4   Mattouch         IT Services          2013       $14,026,934  7,429,377 Dollars  6597557   26%
4   5   Techdrill        Health               2009       $10,573,990  7,435,363 Dollars  3138627   8%
5   6   Techline         Health               2006       $13,898,119  5,470,303 Dollars  8427816   23%
6   7   Cityace          Health               2010       $9,254,614   6,249,498 Dollars  3005116   6%
7   8   Kayelectro       Health               2009       $9,451,943   3,878,113 Dollars  5573830   4%
8   9   Ganzlax          IT Services          2011       $14,001,180  3,878,153 Dollars  11901180  18%
9   10  Trantraxlax      Government Services  2011       $11,088,336  5,635,276 Dollars  5453060   7%
To use one of the columns as the row index instead of the default 0, 1, 2, ... numbering, pass its name or position to index_col:
df = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv', index_col = 'ID')
df
Output >>>
    Name             Industry             Inception  Employees  Revenue      Expenses           Profit    Growth
ID
1   Lamtone          IT Services          2009       55         $11,757,018  6,482,465 Dollars  5274553   30%
2   Stripfind        Financial            2010       25         $12,329,371  916,455 Dollars    11412916  20%
3   Canecorporation  Health               2012       6          $10,597,009  7,591,189 Dollars  3005820   7%
4   Mattouch         IT Services          2013       6          $14,026,934  7,429,377 Dollars  6597557   26%
5   Techdrill        Health               2009       9          $10,573,990  7,435,363 Dollars  3138627   8%
6   Techline         Health               2006       65         $13,898,119  5,470,303 Dollars  8427816   23%
7   Cityace          Health               2010       25         $9,254,614   6,249,498 Dollars  3005116   6%
8   Kayelectro       Health               2009       687        $9,451,943   3,878,113 Dollars  5573830   4%
9   Ganzlax          IT Services          2011       75         $14,001,180  3,878,153 Dollars  11901180  18%
10  Trantraxlax      Government Services  2011       35         $11,088,336  5,635,276 Dollars  5453060   7%
df = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv', index_col = 0)
df
Output >>>
    Name             Industry             Inception  Employees  Revenue      Expenses           Profit    Growth
ID
1   Lamtone          IT Services          2009       55         $11,757,018  6,482,465 Dollars  5274553   30%
2   Stripfind        Financial            2010       25         $12,329,371  916,455 Dollars    11412916  20%
3   Canecorporation  Health               2012       6          $10,597,009  7,591,189 Dollars  3005820   7%
4   Mattouch         IT Services          2013       6          $14,026,934  7,429,377 Dollars  6597557   26%
5   Techdrill        Health               2009       9          $10,573,990  7,435,363 Dollars  3138627   8%
6   Techline         Health               2006       65         $13,898,119  5,470,303 Dollars  8427816   23%
7   Cityace          Health               2010       25         $9,254,614   6,249,498 Dollars  3005116   6%
8   Kayelectro       Health               2009       687        $9,451,943   3,878,113 Dollars  5573830   4%
9   Ganzlax          IT Services          2011       75         $14,001,180  3,878,153 Dollars  11901180  18%
10  Trantraxlax      Government Services  2011       35         $11,088,336  5,635,276 Dollars  5453060   7%
df = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv', index_col = 'Name')
df
Output >>>
                 ID  Industry             Inception  Revenue      Expenses           Profit    Growth
Name
Lamtone          1   IT Services          2009       $11,757,018  6,482,465 Dollars  5274553   30%
Stripfind        2   Financial Services   2010       $12,329,371  916,455 Dollars    11412916  20%
Canecorporation  3   Health               2012       $10,597,009  7,591,189 Dollars  3005820   7%
Mattouch         4   IT Services          2013       $14,026,934  7,429,377 Dollars  6597557   26%
Techdrill        5   Health               2009       $10,573,990  7,435,363 Dollars  3138627   8%
Techline         6   Health               2006       $13,898,119  5,470,303 Dollars  8427816   23%
Cityace          7   Health               2010       $9,254,614   6,249,498 Dollars  3005116   6%
Kayelectronics   8   Health               2009       $9,451,943   3,878,113 Dollars  5573830   4%
Ganzlax          9   IT Services          2011       $14,001,180  3,878,113 Dollars  11901180  18%
Trantraxlax      10  Government Services  2011       $11,088,336  5,635,276 Dollars  5453060   7%
df1 = pd.read_csv('F:\\Machine Learning\\DataSet\\Fortune_10.csv', index_col = 2)
df1
Output >>>
                     ID  Name             Inception  Revenue      Expenses           Profit    Growth
Industry
IT Services          1   Lamtone          2009       $11,757,018  6,482,465 Dollars  5274553   30%
Financial Services   2   Stripfind        2010       $12,329,371  916,455 Dollars    11412916  20%
Health               3   Canecorporation  2012       $10,597,009  7,591,189 Dollars  3005820   7%
IT Services          4   Mattouch         2013       $14,026,934  7,429,377 Dollars  6597557   26%
Health               5   Techdrill        2009       $10,573,990  7,435,363 Dollars  3138627   8%
Health               6   Techline         2006       $13,898,119  5,470,303 Dollars  8427816   23%
Health               7   Cityace          2010       $9,254,614   6,249,498 Dollars  3005116   6%
Health               8   Kayelectronics   2009       $9,451,943   3,878,113 Dollars  5573830   4%
IT Services          9   Ganzlax          2011       $14,001,180  3,878,113 Dollars  11901180  18%
Government Services  10  Trantraxlax      2011       $11,088,336  5,635,276 Dollars  5453060   7%
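Setting an index at read time pays off when rows are later looked up by label. The sketch below assumes a hypothetical three-row CSV held in memory. It shows that the name and the position of a column are interchangeable for index_col, and that .loc can then select a row by its ID.

```python
import io
import pandas as pd

# Hypothetical three-row sample (not the real Fortune_10 file).
csv_text = """ID,Name,Industry
1,Lamtone,IT Services
2,Stripfind,Financial Services
3,Canecorporation,Health
"""

# ID is column 0, so both calls build the same DataFrame.
by_name = pd.read_csv(io.StringIO(csv_text), index_col="ID")
by_position = pd.read_csv(io.StringIO(csv_text), index_col=0)

# With ID as the index, .loc looks rows up by ID value rather than by position.
company = by_name.loc[2, "Name"]

print(by_name.equals(by_position), company)
```

An index with repeated values, such as Industry in the last example above, is allowed, but a .loc lookup on it returns every matching row rather than a single one.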
To Download dataset click here – Fortune_10
Download Jupyter file pandas write csv source code
Visit the official site of pandas