Python Pandas DataFrame
A pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns).
This article explains the DataFrame through practical examples.
Creating a DataFrame in different ways
1. Creating empty dataframe
import pandas as pd
emt_df = pd.DataFrame()
print(emt_df)
Output >>>
Empty DataFrame
Columns: []
Index: []
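An empty DataFrame is usually a starting point that gets filled in later. The sketch below checks that the DataFrame is empty and then adds a column to it; the column name 'ID' and its values are only placeholders for illustration.

```python
import pandas as pd

emt_df = pd.DataFrame()
print(emt_df.empty)   # True: there are no rows and no columns
print(emt_df.shape)   # (0, 0)

# Add a column to the empty DataFrame ('ID' is a hypothetical name)
emt_df['ID'] = [11, 22, 33]
print(emt_df.empty)   # False
print(emt_df.shape)   # (3, 1)
```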
2. Creating dataframe from list
lst = ['a', 'b', 'c'] # First creating a list
print(lst)
Output >>> ['a', 'b', 'c']
df1 = pd.DataFrame(lst) # Creating dataframe from above list
print(df1)
Output >>>
   0
0  a
1  b
2  c
In an interactive session (such as a Jupyter notebook or the Python shell), we can also display the DataFrame by typing just the variable name, without the print function
df1
Output >>>
   0
0  a
1  b
2  c
Here the first row (0) is the column label, the first column is the index (which starts from 0), and the second column holds the data values.
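The default labels (0, 1, 2, ...) can be replaced with our own. A minimal sketch using the columns and index parameters of pd.DataFrame; the names 'letter', 'r1', 'r2' and 'r3' are made up for this example.

```python
import pandas as pd

lst = ['a', 'b', 'c']

# 'letter' and the row labels are hypothetical names for illustration
df1_named = pd.DataFrame(lst, columns=['letter'], index=['r1', 'r2', 'r3'])
print(df1_named)

# Look up a single value by row label and column label
print(df1_named.loc['r2', 'letter'])  # b
```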
3. Creating dataframe from list of lists
ls_of_ls = [[1,2,3], [2,3,4], [4,5,6]] # Creating list of list
print(ls_of_ls)
Output >>> [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
df2 = pd.DataFrame(ls_of_ls) # Creating dataframe from above list of list
df2
Output >>>
   0  1  2
0  1  2  3
1  2  3  4
2  4  5  6
Here the first row (0, 1, 2) holds the column labels, followed by three columns of data values. Each inner list becomes one row of the DataFrame.
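To make the row-versus-column layout easier to see, the sketch below gives the columns names and then reads one row and one column back. The column names 'x', 'y' and 'z' are assumptions for this example only.

```python
import pandas as pd

ls_of_ls = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]

# 'x', 'y', 'z' are hypothetical column names
df2_named = pd.DataFrame(ls_of_ls, columns=['x', 'y', 'z'])

# The first inner list is the first row
print(df2_named.iloc[0].tolist())  # [1, 2, 3]

# Column 'y' collects the second element of every inner list
print(df2_named['y'].tolist())     # [2, 3, 5]
```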
4. Creating dataframe from a Python dictionary (dict)
dict1 = {'ID': [11,22,33,44]} # Creating dict
dict1
Output >>> {'ID': [11, 22, 33, 44]}
df3 = pd.DataFrame(dict1) # Creating dataframe from above dict
df3
Output >>>
   ID
0  11
1  22
2  33
3  44
To get more than one column of data values, add more keys to the dictionary
dict2 = {'ID': [11,22,33,44], 'SN': [1,2,3,4]}
dict2
Output >>> {'ID': [11, 22, 33, 44], 'SN': [1, 2, 3, 4]}
df4 = pd.DataFrame(dict2)
df4
Output >>>
   ID  SN
0  11   1
1  22   2
2  33   3
3  44   4
Here the dataframe has two columns, one for each key of the dictionary.
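When a DataFrame is built from a dict of lists, the keys become the column labels and all lists need to have the same length. The sketch below inspects the result and shows what happens when the lengths differ; the mismatched dictionary is a made-up example.

```python
import pandas as pd

dict2 = {'ID': [11, 22, 33, 44], 'SN': [1, 2, 3, 4]}
df4 = pd.DataFrame(dict2)

print(list(df4.columns))  # ['ID', 'SN']
print(df4.shape)          # (4, 2): four rows, two columns

# Lists of different lengths cannot be combined into columns
try:
    pd.DataFrame({'ID': [11, 22], 'SN': [1]})
except ValueError as err:
    print('ValueError:', err)
```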
5. Creating dataframe from list of dict
ls_dict = [{'a':1, 'b':2}, {'a':3, 'b':4}] # Creating list of dict
df5 = pd.DataFrame(ls_dict) # Creating dataframe from list of dict
df5
Output >>>
   a  b
0  1  2
1  3  4
# Creating dataframe from list of dict where the dicts have different keys
ls_dict = [{'a':1, 'b':2}, {'a':3, 'b':4, 'c':5}]
df6 = pd.DataFrame(ls_dict)
df6
Output >>>
   a  b    c
0  1  2  NaN
1  3  4  5.0
Here the first dictionary has no key 'c', but the command does not raise an error: pandas fills the missing value with NaN.
NaN means "Not a Number". Because NaN is a floating-point value, column c is stored as float, which is why 5 is shown as 5.0.
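Missing values can be located and replaced after the DataFrame is created. A short sketch using isna and fillna; replacing NaN with 0 is only an illustrative choice, and the right fill value depends on the data.

```python
import pandas as pd

ls_dict = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4, 'c': 5}]
df6 = pd.DataFrame(ls_dict)

# True marks the cells where a value is missing
print(df6.isna())

# Column 'c' is stored as float because it contains NaN
print(df6['c'].dtype)  # float64

# Replace missing values with 0 (an arbitrary choice for this example)
df6_filled = df6.fillna(0)
print(df6_filled)
```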
6. Creating dataframe from dict of series
dict_sr = {'ID': pd.Series([1,2,3]), 'SN': pd.Series([111,222,333])}
df7 = pd.DataFrame(dict_sr)
df7
Output >>>
   ID   SN
0   1  111
1   2  222
2   3  333
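With a dict of Series, pandas lines the values up by index label rather than by position. The sketch below uses two Series with different indexes to show this; the labels 'x', 'y', 'z' and the names dict_sr2 and df8 are assumptions made for this example.

```python
import pandas as pd

# Two Series whose indexes only partly overlap (hypothetical labels)
dict_sr2 = {
    'ID': pd.Series([1, 2, 3], index=['x', 'y', 'z']),
    'SN': pd.Series([111, 222], index=['x', 'y']),
}
df8 = pd.DataFrame(dict_sr2)
print(df8)

# 'z' has no SN value, so that cell is filled with NaN
print(df8['SN'].isna().tolist())  # [False, False, True]
```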