Programming Articles - Page 1409 of 3363
595 Views
Time series data stored in a data frame cannot be plotted directly as a time series, and the labels for the series are not displayed either. We therefore first convert the data frame to a time series object with the function ts, as shown in the example below, and then pass that object to the plot function, which also displays the labels for the series. Consider the below data frame: Example: Time ...
731 Views
By subtotal we mean the sum of values within each level of a grouping column. For example, if a data frame called df contains three numerical columns x, y, and z and one categorical column Group, then the subtotals of x, y, and z for each category in Group can be found with the command aggregate(cbind(x,y,z)~Group,data=df,FUN=sum). Consider the below data frame: Example: x1 ...
309 Views
To create a random vector of integers with non-decreasing values, we can draw a random sample with sample.int and then apply the cummax function, which replaces each element with the running maximum. For example, a random vector of length 5 with values between 1 and 5 can be created with the command cummax(sample.int(5)). Example: x1 ...
2K+ Views
When we perform data analysis, many types of objects are created in the R environment, such as vectors, data frames, matrices, lists, and arrays. To list the data frames available in the global environment, we can use the below command: names(which(unlist(eapply(.GlobalEnv,is.data.frame)))) Example: x1 ...
4K+ Views
If a numeric column in an R data frame has only a few unique values, the column can be treated as a factor, so we can convert such numeric columns to factors. To do this with the dplyr package, we can use its mutate_if function. Loading the dplyr package and converting the numerical columns in the BOD data set (available in base R) to factor columns:
Example
library(dplyr)
str(BOD)
'data.frame': 6 obs. of 2 variables:
 $ Time  : num 1 2 3 4 5 7
 $ demand: num 8.3 10.3 19 16 15.6 19.8
 - ...
482 Views
Assume you have a dataframe and the results of trimming its values at a minimum threshold, at a maximum threshold, and at both are,
minimum threshold:
   Column1  Column2
0       30       30
1       34       30
2       56       30
3       78       50
4       30       90
maximum threshold:
   Column1  Column2
0       12       23
1       34       30
2       50       25
3       50       50
4       28       50
clipped dataframe is:
   Column1  Column2
0       30       30
1       34       30
2       50       30
3 ...
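The thresholds above can be reproduced with DataFrame.clip. The sketch below is a minimal illustration, not the article's own listing (which is truncated here): the input values are inferred from the minimum and maximum threshold outputs, and the bounds 30 and 50 are read off those same tables, so both are assumptions.

```python
import pandas as pd

# Input values inferred from the threshold outputs quoted above
# (an assumption; the original listing is not shown in full).
df = pd.DataFrame({
    "Column1": [12, 34, 56, 78, 28],
    "Column2": [23, 30, 25, 50, 90],
})

# Raise every value below 30 up to 30.
lower_only = df.clip(lower=30)

# Lower every value above 50 down to 50.
upper_only = df.clip(upper=50)

# Apply both bounds at once.
clipped = df.clip(lower=30, upper=50)

print("minimum threshold:\n", lower_only)
print("maximum threshold:\n", upper_only)
print("clipped dataframe is:\n", clipped)
```

Passing both bounds in one call is equivalent to applying the lower and upper clips one after the other.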
355 Views
Assume you have a dataframe and the results that quantify the shape of its distribution are,
kurtosis is:
Column1   -1.526243
Column2    1.948382
dtype: float64
asymmetry distribution - skewness is:
Column1   -0.280389
Column2    1.309355
dtype: float64
Solution
To solve this, we will follow the steps given below:
- Define a dataframe
- Apply df.kurt(axis=0) to calculate the unbiased kurtosis of each column, which describes the shape of the distribution
- Apply df.skew(axis=0) to calculate the unbiased skew over axis 0, which describes the asymmetry of the distribution
Example
Let's see the following code to get a better understanding:
import pandas as pd
data = {"Column1":[12, 34, 56, 78, 90], "Column2":[23, 30, 45, ...
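As a minimal sketch of the two calls, the example below uses only the Column1 values that are visible in the listing; the Column2 values are cut off, so that column is left out rather than guessed.

```python
import pandas as pd

# Only Column1 is shown in full in the listing above, so this sketch
# is restricted to that column.
df = pd.DataFrame({"Column1": [12, 34, 56, 78, 90]})

# Unbiased (sample) excess kurtosis of each column.
kurtosis = df.kurt(axis=0)

# Unbiased (sample) skewness of each column.
skewness = df.skew(axis=0)

print("kurtosis is:\n", kurtosis)
print("asymmetry distribution - skewness is:\n", skewness)
```

A negative kurtosis indicates a flatter distribution than the normal one, and a negative skewness indicates a longer tail on the left side.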
547 Views
Solution
Assume you have a dataframe and the mean absolute deviation of its columns and rows is,
mad of columns:
Column1    0.938776
Column2    0.600000
dtype: float64
mad of rows:
0    0.500
1    0.900
2    0.650
3    0.900
4    0.750
5    0.575
6    1.325
dtype: float64
To solve this, we will follow the steps given below:
- Define a dataframe
- Calculate the mean absolute deviation of the columns as df.mad()
- Calculate the mean absolute deviation of the rows as df.mad(axis=1)
Note that DataFrame.mad is available only in pandas versions before 2.0.
Example
Let's see the following code to get a better understanding:
import pandas as pd
data = {"Column1":[6, 5.3, 5.9, 7.8, 7.6, 7.45, 7.75], ...
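Because DataFrame.mad is not available in pandas 2.0 and later, the same quantity can be computed by hand as the mean of the absolute deviations from the mean. The sketch below is one possible way to do that; the helper name mean_abs_dev and the Column2 values are hypothetical (the original Column2 values are truncated in the listing), while Column1 is taken from the listing.

```python
import pandas as pd


def mean_abs_dev(frame, axis=0):
    """Hypothetical helper: mean absolute deviation along the given axis."""
    center = frame.mean(axis=axis)
    # Subtract the per-column (axis=0) or per-row (axis=1) mean by aligning
    # the means on the opposite axis, then average the absolute deviations.
    return frame.sub(center, axis=1 - axis).abs().mean(axis=axis)


df = pd.DataFrame({
    # Values taken from the listing above.
    "Column1": [6, 5.3, 5.9, 7.8, 7.6, 7.45, 7.75],
    # Hypothetical values, chosen only for illustration.
    "Column2": [5.0, 4.5, 6.5, 7.0, 8.0, 6.0, 9.0],
})

col_mad = mean_abs_dev(df, axis=0)
row_mad = mean_abs_dev(df, axis=1)

print("mad of columns:\n", col_mad)
print("mad of rows:\n", row_mad)
```

Since Column2 is made up here, only the Column1 result is comparable with the output quoted above.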
321 Views
Assume you have a Panel and the average of its first row is,
Average of first row is:
Column1    0.274124
dtype: float64
Note that pd.Panel was removed in pandas 0.25, so the code below runs only on older versions.
Solution
To solve this, we will follow the steps given below:
- Set the data value as a dictionary whose key is 'Column1' and whose value is pd.DataFrame(np.random.randn(5, 3)): data = {'Column1' : pd.DataFrame(np.random.randn(5, 3))}
- Assign data to a Panel and save it as p: p = pd.Panel(data)
- Print the column using the dict key Column1: print(p['Column1'])
- Calculate the average of the first row using major_xs(0): p.major_xs(0).mean()
Example
Let's see the following code to get a better understanding:
import pandas as pd
import numpy as np
data = {'Column1' : pd.DataFrame(np.random.randn(5, 3))}
p = pd.Panel(data)
print("Panel values:")
...
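On pandas versions without Panel, a comparable result can be obtained from a plain dictionary of dataframes. The sketch below is an assumed substitute for the Panel-based code, not the article's method: the names frames and first_row_avg and the fixed seed are introduced here for illustration, and the values are random, so they will not match the figure quoted above.

```python
import numpy as np
import pandas as pd

# Seeded generator so the run is repeatable (the seed is an arbitrary choice).
rng = np.random.default_rng(0)

# A dict of dataframes stands in for the Panel's items.
frames = {"Column1": pd.DataFrame(rng.standard_normal((5, 3)))}

print("Values:")
print(frames["Column1"])

# For each item, take the row labelled 0 and average its values,
# which mirrors p.major_xs(0).mean() on a Panel.
first_row_avg = pd.Series(
    {name: frame.loc[0].mean() for name, frame in frames.items()}
)

print("Average of first row is:")
print(first_row_avg)
```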
380 Views
Solution
Assume you have a dataframe and the minimum rank of a particular column is,
   Id     Name  Age  Rank
0   1     Adam   12   1.0
1   2    David   13   3.0
2   3  Michael   14   5.0
3   4    Peter   12   1.0
4   5  William   13   3.0
To solve this, we will follow the steps given below:
- Define a dataframe.
- Call the rank function on the df['Age'] column to calculate the minimum rank along axis 0: df["Age"].rank(axis=0, method='min', ascending=True)
Example
Let's see the following code to get a better understanding:
import pandas as pd
data = {'Id': [1, 2, 3, ...
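The ranking step can be sketched as follows. The input values are taken from the result table quoted above rather than from the truncated listing, and the axis argument is left at its default, so treat this as an illustration under those assumptions.

```python
import pandas as pd

# Values taken from the result table quoted above.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "Name": ["Adam", "David", "Michael", "Peter", "William"],
    "Age": [12, 13, 14, 12, 13],
})

# method="min" gives tied values the lowest rank in their group, so the
# two 12s share rank 1 and the two 13s share rank 3.
df["Rank"] = df["Age"].rank(method="min", ascending=True)

print(df)
```

With method='min', the ranks after a tie skip ahead by the size of the tied group, which is why no row receives rank 2 or 4.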