In this ML Algorithms course tutorial, we are going to learn Random Forest Classification in detail. We cover both the theoretical intuition and a practical implementation.
- What is the Random Forest?
- What is Random Forest used for?
- How does Random Forest work?
- What is the Random Forest Classification?
- What is Gini impurity, entropy, the cost function for the CART algorithm?
- What is the Random Forest diagram?
- What is the difference between a decision tree and random forest?
- How to implement Random Forest Classification in python using sklearn?
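As a quick reference for the Gini impurity, entropy, and CART cost function questions listed above, here is a minimal sketch using NumPy. The function names (`gini_impurity`, `entropy`, `cart_split_cost`) are hypothetical helpers written for illustration and are not part of sklearn. The cost shown is the commonly used weighted average of the child-node impurities for a single binary split.

```python
import numpy as np


def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum_k p_k^2 (illustrative helper)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()          # class proportions in the node
    return 1.0 - np.sum(p ** 2)


def entropy(labels):
    """Entropy of a node in bits: -sum_k p_k * log2(p_k) (illustrative helper)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()          # only classes that occur, so p > 0
    return -np.sum(p * np.log2(p))


def cart_split_cost(left, right, impurity=gini_impurity):
    """Weighted impurity of a binary split, the quantity a CART-style tree tries to minimize."""
    n_left, n_right = len(left), len(right)
    n = n_left + n_right
    return (n_left / n) * impurity(left) + (n_right / n) * impurity(right)


# A perfectly mixed node versus a split that separates the classes completely
mixed = [0, 0, 1, 1]
print(gini_impurity(mixed))
print(entropy(mixed))
print(cart_split_cost([0, 0], [1, 1]))
```

A lower split cost means purer child nodes. Passing `impurity=entropy` switches the criterion, which mirrors the choice between `criterion='gini'` and `criterion='entropy'` in `RandomForestClassifier`.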
Random Forest Classification Implementation
# -*- coding: utf-8 -*-
"""Random Forest Classification.ipynb
Automatically generated by Colaboratory.
Original file is located at
https://colab.research.google.com/drive/13nr6Ix-AQ3B2vbcwr7E18kXalX_hTKYA
## Random Forest Classification
### Import Libraries
"""
# import libraries
import numpy as np
import pandas as pd
"""### Load Dataset"""
#load dataset
from sklearn.datasets import load_breast_cancer
data = load_breast_cancer()
data.data
data.feature_names
data.target
data.target_names
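# create dataframe: 30 feature columns plus the target column
# (pass a flat list of names; a nested list would create MultiIndex columns)
df = pd.DataFrame(np.c_[data.data, data.target], columns=list(data.feature_names) + ['target'])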
df.head()
df.tail()
df.shape
"""### Split Data"""
X = df.iloc[:, 0:-1]
y = df.iloc[:, -1]
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=2020)
print('Shape of X_train = ', X_train.shape)
print('Shape of y_train = ', y_train.shape)
print('Shape of X_test = ', X_test.shape)
print('Shape of y_test = ', y_test.shape)
"""## Train Random Forest Classification Model"""
from sklearn.ensemble import RandomForestClassifier
# fix random_state so the trained forest and its score are repeatable
classifier = RandomForestClassifier(n_estimators=100, criterion='gini', random_state=2020)
classifier.fit(X_train, y_train)
classifier.score(X_test, y_test)
"""## Predict Cancer"""
patient1 = [17.99,
10.38,
122.8,
1001.0,
0.1184,
0.2776,
0.3001,
0.1471,
0.2419,
0.07871,
1.095,
0.9053,
8.589,
153.4,
0.006399,
0.04904,
0.05373,
0.01587,
0.03003,
0.006193,
25.38,
17.33,
184.6,
2019.0,
0.1622,
0.6656,
0.7119,
0.2654,
0.4601,
0.1189]
patient1 = np.array([patient1])
patient1
classifier.predict(patient1)
data.target_names
pred = classifier.predict(patient1)
# in this dataset, target 0 is 'malignant' and target 1 is 'benign'
if pred[0] == 0:
    print('Patient has Cancer (malignant tumor)')
else:
    print('Patient has no Cancer (benign tumor)')
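
The accuracy returned by `classifier.score` does not show which class is being misclassified, and that matters for a cancer screening task. The following is a minimal, self-contained sketch that retrains the same kind of model on the same split and prints a confusion matrix and per-class report using `sklearn.metrics`. The `random_state` passed to the classifier is an assumption added here for repeatability.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Same dataset and split settings as in the tutorial above
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=2020
)

# random_state on the classifier is an assumption, added for repeatable results
classifier = RandomForestClassifier(
    n_estimators=100, criterion='gini', random_state=2020
)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# Rows are true classes, columns are predicted classes (0 = malignant, 1 = benign)
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Precision, recall and F1 score for each class
print(classification_report(y_test, y_pred, target_names=data.target_names))
```

In the printed matrix, the top-right cell counts malignant tumors predicted as benign, which is usually the most costly kind of error for this problem.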