Python Articles - Page 598 of 829

How can scikit learn library be used to preprocess data in Python?

AmitDiwan
Updated on 10-Dec-2020 13:34:59

359 Views

Pre-processing data refers to cleaning the data: removing invalid data and noise, replacing data with relevant values, and so on. This doesn’t always mean text data; it could also be image or video data. It is an important step in the machine learning pipeline. Data pre-processing basically refers to the task of gathering all the data (which is collected from various resources or a single resource) into a common format or into uniform datasets (depending on the type of data). This is done so that the learning algorithm can learn from this dataset and give relevant results with high accuracy. Since real-world ...
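As a rough illustration of the cleaning steps described above, here is a minimal sketch using scikit-learn. The toy array and the choice of mean imputation followed by standard scaling are assumptions made for illustration; the excerpt does not say which transformers the full article uses.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: two numeric features, one missing value (assumption)
raw = np.array([[1.0, 200.0],
                [2.0, np.nan],
                [3.0, 600.0]])

# Replace the missing value with the mean of its column
imputed = SimpleImputer(strategy="mean").fit_transform(raw)

# Rescale each column to zero mean and unit variance
scaled = StandardScaler().fit_transform(imputed)

print(imputed)
print(scaled)
```

Other strategies (median imputation, min-max scaling) may suit a given dataset better; this only shows the general shape of a pre-processing step.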

How to apply functions element-wise in a dataframe in Python?

AmitDiwan
Updated on 10-Dec-2020 13:30:21

1K+ Views

It may sometimes be required to apply a function to every element of a dataframe. Not all functions can be vectorised. This is where the ‘applymap’ function comes into the picture. It takes a function that accepts a single value as input and returns a single value as output, and applies it to each element. In pandas 2.1 and later, ‘applymap’ is deprecated in favour of ‘DataFrame.map’, which is used the same way.

Example

    import pandas as pd
    import numpy as np
    my_df = pd.DataFrame(np.random.randn(5, 5), columns=['col_1', 'col_2', 'col_3', 'col_4', 'col_5'])
    print("The dataframe generated is ")
    print(my_df)
    print("Using the applymap function")
    # applymap returns a new dataframe, so the result is printed directly
    print(my_df.applymap(lambda x: x*11.45))

Output

    The dataframe generated is
        col_1       col_2      col_3      col_4     col_5
    0  -0.671510  -0.860741   0.886484   0.842158   ...

How can a specific operation be applied row wise or column wise in Pandas Python?

AmitDiwan
Updated on 10-Dec-2020 13:28:19

448 Views

It may sometimes be required to apply a function along an axis of a dataframe. By default the function is applied column-wise (axis=0), where every column is passed to the function as a Series. If axis=1 is specified, the function is applied row-wise instead. The ‘apply’ function is called on the dataframe using the dot operator. Let us see an example −

Example

    import pandas as pd
    import numpy as np
    my_data = {'Age':pd.Series([45, 67, 89, 12, 23]), 'value':pd.Series([8.79, 23.24, 31.98, 78.56, 90.20])}
    print("The dataframe is :")
    my_df = pd.DataFrame(my_data)
    print(my_df)
    print("The description of ...
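Since the excerpt's example is cut off, here is a small sketch of the column-wise versus row-wise behaviour, reusing the same ‘Age’ and ‘value’ data. The helper `value_range` is a hypothetical function introduced only for illustration.

```python
import pandas as pd

my_df = pd.DataFrame({'Age': [45, 67, 89, 12, 23],
                      'value': [8.79, 23.24, 31.98, 78.56, 90.20]})

def value_range(values):
    """Hypothetical helper: difference between the largest and smallest value."""
    return values.max() - values.min()

# Default (axis=0): the function receives each column in turn
per_column = my_df.apply(value_range)

# axis=1: the function receives each row in turn
per_row = my_df.apply(value_range, axis=1)

print(per_column)
print(per_row)
```

The column-wise call returns one result per column, while the row-wise call returns one result per row.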

How can data be summarized in Pandas Python?

AmitDiwan
Updated on 10-Dec-2020 13:27:07

170 Views

Lots of information about the data can be obtained by using different functions on it. But if we wish to get a statistical summary of the data in a single call, the ‘describe’ function can be used. For numeric columns, this function gives the count, mean, standard deviation, minimum, the 25th, 50th and 75th percentiles, and the maximum.

Example

    import pandas as pd
    my_data = {'Name':pd.Series(['Tom', 'Jane', 'Vin', 'Eve', 'Will']),
               'Age':pd.Series([45, 67, 89, 12, 23]),
               'value':pd.Series([8.79, 23.24, 31.98, 78.56, 90.20])
    }
    print("The dataframe is :")
    my_df = pd.DataFrame(my_data)
    print(my_df)
    print("The description of data is :")
    print(my_df.describe())

Output

    The dataframe is :
       Name  Age   value
    0  Tom   ...

How to find the standard deviation of specific columns in a dataframe in Pandas Python?

AmitDiwan
Updated on 10-Dec-2020 13:25:13

7K+ Views

Standard deviation tells how spread out the values in a dataset are, that is, how far the values typically lie from the arithmetic mean of the column. Sometimes, it may be required to get the standard deviation of a specific column that is numeric in nature. This is where the std() function can be used. The column whose standard deviation needs to be computed can be selected from the dataframe by name, and the std function can be called on it using the dot operator. The index of the column can also be used to select it and find the standard deviation. Let ...
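The excerpt ends before its example, so the following is only a sketch of the two selection styles it mentions (by name and by position). The data mirrors the dataframe used in the neighbouring entries. Note that pandas computes the sample standard deviation (ddof=1) by default.

```python
import pandas as pd

my_df = pd.DataFrame({'Name': ['Tom', 'Jane', 'Vin', 'Eve', 'Will'],
                      'Age': [45, 67, 89, 12, 23],
                      'value': [8.79, 23.24, 31.98, 78.56, 90.20]})

# Select the column by name, then call std() on it
age_std = my_df['Age'].std()

# Select the same column by position (column 1) instead
age_std_by_position = my_df.iloc[:, 1].std()

# Several numeric columns at once
both = my_df[['Age', 'value']].std()

print(age_std)
print(age_std_by_position)
print(both)
```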

How can decision tree be used to construct a classifier in Python?

AmitDiwan
Updated on 10-Dec-2020 13:20:24

230 Views

A decision tree is the basic building block of the random forest algorithm. It is considered one of the most popular algorithms in machine learning and is used for classification purposes. Decision trees are extremely popular because they are easy to understand. The decision given out by a decision tree can be used to explain why a certain prediction was made. This means the ins and outs of the process are clear to the user. They are also a foundation for ensemble methods such as bagging, random forests, and gradient boosting. They are also known as CART, i.e. Classification And Regression Trees. ...
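To make the interpretability point concrete, here is a minimal sketch of a decision tree classifier in scikit-learn. The iris dataset, the depth limit and the train/test split are assumptions chosen for illustration; the excerpt does not show which data or settings the full article uses.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Assumption: the bundled iris dataset stands in for real data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=0)

# A shallow tree keeps the learned rules short enough to read
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)

# Fraction of held-out samples classified correctly
accuracy = clf.score(X_test, y_test)

# Text rendering of the learned if/else rules
rules = export_text(clf, feature_names=list(iris.feature_names))

print(accuracy)
print(rules)
```

The printed rules show the threshold tests that lead to each predicted class, which is what makes a single tree easy to explain.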

How to view the pixel values of an image using scikit-learn in Python?

AmitDiwan
Updated on 10-Dec-2020 13:15:04

781 Views

Data pre-processing basically refers to the task of gathering all the data (which is collected from various resources or a single resource) into a common format or into uniform datasets (depending on the type of data). Since real-world data is never ideal, there is a possibility that the data would have missing cells, errors, outliers, discrepancies in columns, and much more. Sometimes, images may not be correctly aligned, or may not be clear or may have a very large size. The goal of pre-processing is to remove these discrepancies and errors. To get the pixels of an image, a built-in function named ‘flatten’ ...
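The excerpt is cut off before its example, so this is only a sketch of what flattening does. It assumes the image is already available as a NumPy array and uses a small synthetic array (hypothetical) in place of a real image file; `flatten` here is the NumPy array method.

```python
import numpy as np

# Hypothetical 2x3 RGB image: height 2, width 3, 3 colour channels
image = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

# flatten() returns a one-dimensional copy holding every pixel value
pixels = image.flatten()

print(image.shape)
print(pixels.shape)
print(pixels)
```

Because `flatten` returns a copy, later changes to `pixels` do not affect the original image array.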

How can scikit-learn library be used to get the resolution of an image in Python?

AmitDiwan
Updated on 10-Dec-2020 13:13:09

489 Views

Data pre-processing basically refers to the task of gathering all the data (which is collected from various resources or a single resource) into a common format or into uniform datasets (depending on the type of data). Since real-world data is never ideal, there is a possibility that the data would have missing cells, errors, outliers, discrepancies in columns, and much more. Sometimes, images may not be correctly aligned, or may not be clear or may have a very large size. The goal of pre-processing is to remove these discrepancies and errors. To get the resolution of an image, a built-in function ...

How to get the mean of columns that contains numeric values of a dataframe in Pandas Python?

AmitDiwan
Updated on 10-Dec-2020 13:11:53

1K+ Views

Sometimes, it may be required to get the mean of a specific column, or the mean of all columns that contain numerical values. This is where the mean() function can be used. The term ‘mean’ refers to finding the sum of all values and dividing it by the total number of values in the dataset. Let us see a demonstration of the same −

Example

    import pandas as pd
    my_data = {'Name':pd.Series(['Tom', 'Jane', 'Vin', 'Eve', 'Will']),
               'Age':pd.Series([45, 67, 89, 12, 23]),
               'value':pd.Series([8.79, 23.24, 31.98, 78.56, 90.20])
    }
    print("The dataframe is :")
    my_df = pd.DataFrame(my_data)
    print(my_df)
    print("The mean is :")
    # numeric_only=True skips the non-numeric 'Name' column
    print(my_df.mean(numeric_only=True))

Output

    The dataframe is ...

How to get the sum of a specific column of a dataframe in Pandas Python?

AmitDiwan
Updated on 10-Dec-2020 13:08:54

1K+ Views

Sometimes, it may be required to get the sum of a specific column. This is where the ‘sum’ function can be used. The column whose sum needs to be computed can be selected from the dataframe by name, and the sum function can be called on it. The index of the column can also be used to select it and find the sum. Let us see a demonstration of the same −

Example

    import pandas as pd
    my_data = {'Name':pd.Series(['Tom', 'Jane', 'Vin', 'Eve', 'Will']),
               'Age':pd.Series([45, 67, 89, 12, 23]),
               'value':pd.Series([8.79, 23.24, 31.98, 78.56, 90.20])
    }
    print("The dataframe is :")
    my_df = pd.DataFrame(my_data)
    print(my_df)
    print("The sum of 'Age' column is :")
    # Select the 'Age' column first, then sum its values
    print(my_df['Age'].sum())

Output

    The dataframe is ...
