Rows can be extracted or selected from a data set in many ways, for example by an individual value or by a range of values. This is mostly needed when we want to compare subsets of the data set or use a subset for analysis. Selecting rows by a range of values can also be useful for testing. We can do this with the subset function.
Example
Consider the data frame below:
> df
Output
  x1 x2 x3
1  3  2  6
2  3  4  9
3  4  4 12
4  4  8 12
5  3  5 11
...
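As a minimal sketch of the subset approach described above, the snippet below rebuilds the five rows visible in the output and filters them. The bounds 9 and 11, the value 4, and the result names in_range and x2_is_4 are illustrative assumptions, not taken from the original article.

```r
# The five rows visible in the output above
df <- data.frame(
  x1 = c(3, 3, 4, 4, 3),
  x2 = c(2, 4, 4, 8, 5),
  x3 = c(6, 9, 12, 12, 11)
)

# Keep rows whose x3 value lies in the closed range [9, 11] (illustrative bounds)
in_range <- subset(df, x3 >= 9 & x3 <= 11)

# Keep rows that match one individual value (illustrative value)
x2_is_4 <- subset(df, x2 == 4)

print(in_range)
print(x2_is_4)
```

subset keeps the original row names, so the filtered rows can still be traced back to their positions in df.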
The default axis labels created by the plot function are fairly small and not very appealing. Because the appearance of a plot matters, we might want to change their size and color. This can be done by setting the color with col.lab and the size with cex.lab.
Example
> plot(x,y)
Changing the color and the size of the axis labels:
> plot(x,y,col.lab="blue",cex.lab=2)
> plot(x,y,col.lab="dark blue",cex.lab=3)
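The following sketch shows col.lab and cex.lab in a complete script. The values of x and y are not recoverable from the text above, so the vectors here are made up for illustration. Writing to a temporary PDF file is also an assumption chosen so the script runs without a display; in an interactive session the plot call alone is enough.

```r
# Illustrative data (the original x and y values are not shown above)
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 8.1, 9.8)

# Draw into a temporary PDF so the script also works without a screen device
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

# col.lab sets the color of the axis labels, cex.lab scales their size
plot(x, y, col.lab = "blue", cex.lab = 2)

dev.off()
```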
When a data frame contains only numerical columns, we might want to find the largest value in each row. For example, if we have a sales data set in which each row represents a customer and each column represents a product, with purchased quantities as values, then we might want to find the maximum of each row to see the largest quantity each customer bought. This can be done by using max with the apply function over rows.
Example
Consider the data frame below:
> df1
Output
x1 ...
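A minimal sketch of the row-wise maximum follows. The data frame values are invented for illustration because the original example is cut off, and the extra which.max step (naming the column that holds each maximum) is an addition of this sketch rather than something the text above shows.

```r
# Illustrative data: three customers (rows) and three products (columns)
df1 <- data.frame(
  x1 = c(4, 7, 1),
  x2 = c(9, 2, 5),
  x3 = c(3, 8, 6)
)

# MARGIN = 1 applies max to every row
row_max <- apply(df1, 1, max)

# Name of the column that holds each row's maximum (first one on ties)
top_column <- names(df1)[apply(df1, 1, which.max)]

print(row_max)
print(top_column)
```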
Instead of finding the common rows, we sometimes need to find the uncommon rows between two data frames. This is mostly useful when we expect many rows to differ rather than only a few. We can do this by using the negation operator (!) with the subset function.
Example
Consider the data frames below:
> df1
Output
   x1 y1
1  10  6
2   5  9
3  10 10
4   4 10
5   1  6
6   1  4
7   9  3
8   5 10
9  10  3
10  8  2
11  6 10
12 ...
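The exact call used in the original example is cut off, so the sketch below is one possible way to combine ! with subset. It assumes both data frames share the same columns, and it builds a text key per row with paste so whole rows can be compared. The data, the column names x and y, and the names key1, key2, only_in_df1 and only_in_df2 are all hypothetical.

```r
# Illustrative data frames with the same columns
df1 <- data.frame(x = c(1, 2, 3, 4), y = c("a", "b", "c", "d"))
df2 <- data.frame(x = c(2, 4, 5), y = c("b", "d", "e"))

# One text key per row, so that whole rows can be matched with %in%
key1 <- paste(df1$x, df1$y, sep = "|")
key2 <- paste(df2$x, df2$y, sep = "|")

# Negating the membership test keeps only the rows missing from the other frame
only_in_df1 <- subset(df1, !(key1 %in% key2))
only_in_df2 <- subset(df2, !(key2 %in% key1))

print(only_in_df1)
print(only_in_df2)
```

The separator in paste should be a character that cannot appear in the data, otherwise two different rows could produce the same key.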
A vector can contain values that are increasing, decreasing, or in no particular order, which means a higher value may come after a lower one and be followed by a lower value again. An example of an increasing arrangement is 1, 2, 3, and the reverse of that is a decreasing arrangement. We can check whether a vector is arranged in increasing or decreasing order by checking whether the differences between all consecutive values of the vector are greater than or equal to zero, and it can be done by using diff ...
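The check described above can be sketched with two small helper functions. The names is_increasing and is_decreasing are hypothetical, and the comparison with zero follows the "greater than or equal to zero" wording, so repeated values still count as ordered.

```r
# TRUE when no element is smaller than the one before it
is_increasing <- function(v) all(diff(v) >= 0)

# TRUE when no element is larger than the one before it
is_decreasing <- function(v) all(diff(v) <= 0)

up    <- c(1, 2, 3)
down  <- c(9, 7, 7, 2)
mixed <- c(1, 5, 2, 8)

print(is_increasing(up))     # TRUE
print(is_decreasing(down))   # TRUE
print(is_increasing(mixed))  # FALSE
print(is_decreasing(mixed))  # FALSE
```

Replacing >= with > (or <= with <) would require a strictly increasing or strictly decreasing vector instead.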
Sometimes a vector of strings follows a pattern, and sometimes we need to pick out strings from a vector based on the characters they contain. For example, we might want to extract the names of US states that start with a particular letter from a vector that contains all the names. This can be done by using the grepl function.
Example
Consider the vector US_states, which contains the names of the US states:
> US_states[grepl("^A", US_states)]
[1] "Alabama"        "Alaska"         "American Samoa" "Arizona"
[5] "Arkansas"
> US_states[grepl("^B", US_states)]
character(0)
> US_states[grepl("^C", US_states)]
[1] "California"  "Colorado"    "Connecticut"
> US_states[grepl("^D", US_states)]
[1] "Delaware"             "District of Columbia"
> US_states[grepl("^E", US_states)]
character(0)
> US_states[grepl("^F", US_states)]
[1] ...
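Because the definition of US_states is not shown above, the sketch below uses a shortened, illustrative vector of six names to show how the logical result of grepl is used as an index.

```r
# Shortened illustrative vector (the full list is not reproduced here)
US_states <- c("Alabama", "Alaska", "Arizona",
               "California", "Colorado", "Delaware")

# "^A" anchors the match at the start of the string
starts_with_A <- US_states[grepl("^A", US_states)]

# No name in this short vector starts with "B", so the result is empty
starts_with_B <- US_states[grepl("^B", US_states)]

print(starts_with_A)
print(starts_with_B)
```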
In data analysis, we sometimes need to find the difference between the current value and the previous value, and this may also be needed within groups. It helps us compare how the values change. In R, we can use the dplyr package’s group_by and mutate functions with lag.
Example
Consider the data frame below:
> df1
Output
   Group Frequency
1      A         7
2      A         6
3      A         9
4      A        12
5      B        19
6      B        19
7      B         4
8      B         6
9      C        14
10     C         6
...
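The sketch below uses the ten rows visible in the output above. It first computes the per-group difference in base R with ave, which is a swapped-in alternative and not the dplyr method the text names, so the script runs even without extra packages. The dplyr version with group_by, mutate and lag follows and only runs if dplyr is installed. The names diff_base, grouped and df1_dplyr are hypothetical.

```r
# The ten rows visible in the output above
df1 <- data.frame(
  Group = c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C"),
  Frequency = c(7, 6, 9, 12, 19, 19, 4, 6, 14, 6)
)

# Base R alternative: difference from the previous value within each group;
# the first row of every group has no previous value and becomes NA
diff_base <- ave(df1$Frequency, df1$Group, FUN = function(v) c(NA, diff(v)))
print(diff_base)

# dplyr version, run only when the package is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  grouped <- dplyr::group_by(df1, Group)
  df1_dplyr <- dplyr::mutate(grouped, Diff = Frequency - dplyr::lag(Frequency))
  print(df1_dplyr)
}
```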
Sometimes we want to extract a count from a data frame, such as the number of columns that satisfy a condition in each row. For example, if we have a data frame with three columns and fifty rows whose values are integers between 1 and 100, then we might want to find, for each row, the number of columns that have a value greater than 20. This can be done by using the rowSums function.
Example
Consider the data frame below:
> df
Output
  x1 x2 x3
1  9 72  9
2  5 20 ...
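A minimal sketch of the counting step follows. The data frame is a small invented one, since the original output is cut off after the first row, and the name n_above_20 is hypothetical. The comparison df > 20 yields a logical matrix, and rowSums counts the TRUE values in each row.

```r
# Illustrative data frame with three numeric columns
df <- data.frame(
  x1 = c(9, 5, 45),
  x2 = c(72, 20, 30),
  x3 = c(9, 18, 64)
)

# TRUE counts as 1, so the row sum is the number of columns above 20
n_above_20 <- rowSums(df > 20)

print(n_above_20)
```

Note that the value 20 itself is not counted, because the comparison is strictly greater than.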
The coefficients of simultaneous equations can be stored in a matrix, and the resulting matrix equation can then be solved to find the values of the variables. For example, if we have the three equations
x + y + z = 6
3x + 2y + 4z = 9
2x + 2y - 6z = 3
then we convert these equations into matrices and solve them using the solve function in R.
Example 1
> A
Output
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    3    2    4
[3,]    2    2   -6
> b ...
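As a sketch of the matrix approach, the snippet below encodes the three equations exactly as they are written in the text (x + y + z = 6, 3x + 2y + 4z = 9, 2x + 2y - 6z = 3) and passes them to solve. The name solution and the plain vector used for b are choices of this sketch.

```r
# One row of coefficients per equation, in the order x, y, z
A <- matrix(c(1, 1,  1,
              3, 2,  4,
              2, 2, -6), nrow = 3, byrow = TRUE)

# Right-hand sides of the three equations
b <- c(6, 9, 3)

# solve(A, b) returns the vector v that satisfies A %*% v = b
solution <- solve(A, b)
print(solution)

# Check: multiplying back should reproduce b
print(A %*% solution)
```

For this system the result is x = -5.25, y = 10.125 and z = 1.125, which can be confirmed by substituting the values into each equation.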
By default, the gridlines on a plot created with the ggplot2 package are spaced fairly far apart, but sometimes the plot looks better with the gridlines closer together. This can be done by setting breaks and minor_breaks in scale_y_continuous if the Y-axis shows a continuous variable.
Example
Consider the data frame below:
> df
Output
    x  y
1  14 16
2  36  1
3  78 18
4  61  6
5  19 11
6   2 40
7  93 23
8  10 13
9   3 21
10 55 31
...
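The sketch below uses the ten rows visible in the output above. The gridline spacings (every 5 units for major lines, every 1 unit for minor lines), the scatter plot layer, and the names major_breaks, minor_breaks and p are assumptions made for illustration. The plotting part only runs if ggplot2 is installed.

```r
# The ten rows visible in the output above
df <- data.frame(
  x = c(14, 36, 78, 61, 19, 2, 93, 10, 3, 55),
  y = c(16, 1, 18, 6, 11, 40, 23, 13, 21, 31)
)

# Illustrative spacing: major gridlines every 5 units, minor ones every 1 unit
major_breaks <- seq(0, 40, by = 5)
minor_breaks <- seq(0, 40, by = 1)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x, y)) +
    geom_point() +
    scale_y_continuous(breaks = major_breaks, minor_breaks = minor_breaks)
  # Call print(p) to draw the plot
}
```

A smaller step in either seq call brings the corresponding gridlines closer together.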