How can decision tree be used to construct a classifier in Python?
A decision tree is the basic building block of the random forest algorithm. It is one of the most popular algorithms in machine learning and is commonly used for classification. Decision trees are popular largely because they are easy to understand.
The path a decision tree follows to reach a decision can be used to explain why a certain prediction was made, so the whole process is transparent to the user. Decision trees are also the foundation for ensemble methods such as bagging, random forests, and gradient boosting. They are also known as CART, i.e. Classification And Regression Trees. A decision tree can be visualized as a binary tree (the kind studied in data structures and algorithms).
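To illustrate how a tree's decisions can be read back as an explanation, here is a minimal sketch that prints a fitted tree as nested rules using scikit-learn's `export_text`. The four-sample dataset and the feature names are invented for this illustration only.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data invented for this sketch: two numeric features, two classes.
X = [[1, 10], [2, 20], [8, 15], [9, 12]]
y = ['A', 'A', 'B', 'B']

clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Render the learned tree as nested if/else rules, one line per node.
rules = export_text(clf, feature_names=['Feature_1', 'Feature_2'])
print(rules)
```

Each printed line is either a threshold test on a feature or a leaf with its predicted class, so following one branch from top to bottom gives the reason for a prediction.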
Every internal node in the tree tests a single input variable, and the leaf nodes (also known as terminal nodes) hold the output variable. A prediction is made by following the splits from the root down to a leaf and returning the value stored there. When a decision tree is built, the basic idea is to divide the input space into multiple regions. Candidate split points are tried for each variable, and the one with the lowest cost is kept. Splits are chosen in a greedy manner: each split is the best one available at that step, with no look-ahead.
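The greedy search described above can be sketched in plain Python. This is a simplified illustration, not scikit-learn's implementation: it uses the number of misclassified samples as the cost (an assumption chosen for brevity), and the helper names `misclassified` and `best_split` as well as the toy data are hypothetical.

```python
from collections import Counter

def misclassified(labels):
    """Number of samples that disagree with the majority class of the group."""
    if not labels:
        return 0
    return len(labels) - Counter(labels).most_common(1)[0][1]

def best_split(rows, labels):
    """Try every feature and every observed value as a threshold.

    Returns (cost, feature_index, threshold) for the cheapest split found.
    """
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # a split must put samples on both sides
            cost = misclassified(left) + misclassified(right)
            if best is None or cost < best[0]:
                best = (cost, feature, threshold)
    return best

# Toy data invented for this sketch.
rows = [[1, 10], [2, 20], [8, 15], [9, 12]]
labels = ['A', 'A', 'B', 'B']
print(best_split(rows, labels))  # (0, 0, 2): split feature 0 at <= 2, zero errors
```

A full tree builder would apply the same search again to each of the two resulting groups, which is what makes the procedure greedy.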
Splitting continues until a stopping condition is met, such as reaching the maximum depth of the tree. The idea behind a decision tree is to divide the input dataset into smaller subsets based on specific feature values until every subset contains samples of a single target class. Each split is chosen to give the maximum information gain at that step.
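As a small sketch of the depth limit in practice, the snippet below compares an unrestricted tree with one capped through scikit-learn's `max_depth` parameter. The Iris dataset is used here only as a convenient stand-in and is not part of the original example; `random_state=0` is likewise an assumption added for repeatability.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# With max_depth=None (the default) the tree grows until the leaves are pure
# or too small to split further.
full = DecisionTreeClassifier(random_state=0).fit(X, y)

# Capping the depth stops the splitting early.
shallow = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

print("unrestricted depth:", full.get_depth())
print("capped depth:", shallow.get_depth())
```

A shallower tree is easier to read and less prone to overfitting, at the cost of leaves that may still mix several classes.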
Every decision tree begins with a root, and this is where the first split is made. A measure is needed to decide which split should define each node.
This is where the Gini value comes into the picture. Gini is one of the most commonly used measures of inequality. Here, inequality refers to how mixed the target classes (outputs) are among the samples in a node: a Gini value of 0 means that every sample in the node belongs to the same class.
Hence, the Gini value is calculated after every split. Based on the Gini value (the inequality value), the information gain of a split can be defined.
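A minimal sketch of these two quantities, assuming the usual definition of Gini impurity (one minus the sum of squared class proportions) and measuring the gain of a split as the drop in size-weighted impurity. The function names `gini_impurity` and `gini_gain` are illustrative and not part of scikit-learn.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a group of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_gain(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = (len(left) / n) * gini_impurity(left) \
             + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted

parent = ['Man', 'Man', 'Woman', 'Woman']
print(gini_impurity(parent))                                   # 0.5
print(gini_gain(parent, ['Man', 'Man'], ['Woman', 'Woman']))   # 0.5
```

A perfectly mixed two-class node has impurity 0.5, and a split that separates the classes completely removes all of it, so the gain equals the parent's impurity.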
The scikit-learn class DecisionTreeClassifier can be used to perform multiclass classification.
Its syntax is shown below.
class sklearn.tree.DecisionTreeClassifier(*, criterion='gini',…)
Following is the example −
from sklearn import tree
from sklearn.model_selection import train_test_split

# Feature vectors: one [Feature_1, Feature_2] pair per sample
my_data = [[16,19],[17,32],[13,3],[14,5],[141,28],[13,34],[186,2],[126,25],[176,28],
           [131,32],[166,6],[128,32],[79,110],[12,38],[19,91],[71,136],[116,25],[17,200],
           [15,25],[14,32],[13,35]]

# Target class for each sample, in the same order as my_data
target_vals = ['Man','Woman','Man','Woman',
               'Woman','Man','Woman','Woman',
               'Woman','Woman','Woman','Man','Man',
               'Man','Woman','Woman','Woman',
               'Woman','Man','Woman','Woman']
data_feature_names = ['Feature_1','Feature_2']

# Split into training and test sets
# (the split is not used below; the model is fit on the full dataset)
X_train, X_test, y_train, y_test = train_test_split(my_data, target_vals, test_size = 0.2, random_state = 1)

# Create the classifier and fit it to the data
clf = tree.DecisionTreeClassifier()
print("The decision tree classifier is being called")
DTclf = clf.fit(my_data, target_vals)

# Predict the class of a new sample
prediction = DTclf.predict([[135,29]])
print("The predicted value is ")
print(prediction)
The decision tree classifier is being called
The predicted value is
['Woman']
- The required packages are imported into the environment.
- The code predicts a target class from the feature values.
- The feature vector and target values are defined.
- The data is split into training and testing sets with the help of the ‘train_test_split’ function, although this example then fits the model on the full dataset.
- The DecisionTreeClassifier is instantiated and fit to the data.
- The ‘predict’ function is used to predict the class for a new feature vector.
- The output is displayed on the console.
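The example creates a train/test split but fits the classifier on all of the data, so the held-out samples are never used. A hedged variant that trains on the training portion and measures accuracy on the test portion might look like the sketch below. It reuses the article's data; `accuracy_score` and the `random_state=1` passed to the classifier are additions made here for illustration and repeatability.

```python
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

my_data = [[16,19],[17,32],[13,3],[14,5],[141,28],[13,34],[186,2],[126,25],[176,28],
           [131,32],[166,6],[128,32],[79,110],[12,38],[19,91],[71,136],[116,25],[17,200],
           [15,25],[14,32],[13,35]]
target_vals = ['Man','Woman','Man','Woman',
               'Woman','Man','Woman','Woman',
               'Woman','Woman','Woman','Man','Man',
               'Man','Woman','Woman','Woman',
               'Woman','Man','Woman','Woman']

# Hold out 20% of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    my_data, target_vals, test_size=0.2, random_state=1)

# Fit on the training portion only.
clf = DecisionTreeClassifier(random_state=1)
clf.fit(X_train, y_train)

# Score the model on samples it has not seen during training.
test_predictions = clf.predict(X_test)
accuracy = accuracy_score(y_test, test_predictions)
print("Held-out samples:", len(X_test))
print("Test accuracy:", accuracy)
```

With only 21 samples the test set is very small, so the reported accuracy is a rough indication rather than a reliable estimate.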