Find Column Mean Excluding NA in R Data Frame

Nizamuddin Siddiqui
Updated on 06-Mar-2021 12:59:35

884 Views

The column mean excluding NAs is easy to find with na.rm = TRUE, but returning NA when every value in a column is NA is less straightforward, because mean() with na.rm = TRUE returns NaN for an all-NA column. In that situation, we can use the ifelse function to return NA if all the values are NA.
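A minimal sketch of this idea, assuming a hypothetical data frame df whose column x2 is entirely NA (the data and the name col_means are illustrative, not from the original article):

```r
# Hypothetical data frame: x2 contains only missing values
df <- data.frame(x1 = c(1, NA, 3), x2 = c(NA_real_, NA_real_, NA_real_))

# For each column, return NA if every value is NA;
# otherwise return the mean of the non-missing values
col_means <- sapply(df, function(col) {
  ifelse(all(is.na(col)), NA, mean(col, na.rm = TRUE))
})

print(col_means)
```

The all(is.na(col)) test yields a single TRUE or FALSE per column, so ifelse returns one value per column.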

Find Sum of Non-Missing Values in R Data Frame Column

Nizamuddin Siddiqui
Updated on 06-Mar-2021 12:57:49

12K+ Views

To find the sum of non-missing values in an R data frame column, we can use the sum function with na.rm set to TRUE. For example, if we have a data frame called df that contains a column x with some missing values, then the sum of the non-missing values can be found with the command sum(df$x,na.rm=TRUE).
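As a small illustration of the command above, assuming a hypothetical column x with two missing values:

```r
# Hypothetical data frame with missing values in x
df <- data.frame(x = c(4, NA, 7, NA, 9))

# na.rm = TRUE drops the NA values before summing
total <- sum(df$x, na.rm = TRUE)

print(total)
```

Without na.rm = TRUE, sum(df$x) would return NA because at least one value is missing.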

Subset Data Table Object Using a Range of Values in R

Nizamuddin Siddiqui
Updated on 06-Mar-2021 12:54:18

3K+ Views

To subset a data.table object using a range of values, we can use single square brackets and choose the range using %between%. For example, if we have a data.table object DT that contains a column x whose values range from 1 to 10, then we can subset DT for values between 3 and 8 with the command DT[DT$x %between% c(3,8)], after loading the package with library(data.table).
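The following sketch assumes the data.table package is installed; the object DT and its columns are made up for illustration:

```r
library(data.table)

# Hypothetical data.table with x running from 1 to 10
DT <- data.table(x = 1:10, y = letters[1:10])

# Keep rows where x lies between 3 and 8; both bounds are included by default
subset_DT <- DT[x %between% c(3, 8)]

print(subset_DT)
```

Inside the square brackets the column can be referred to as x directly, which is equivalent here to DT$x.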

Convert a List to JSON in R

Nizamuddin Siddiqui
Updated on 06-Mar-2021 12:23:45

3K+ Views

To convert a list to JSON, we can use the toJSON function of the jsonlite package. For example, if we have a list called LIST, it can be converted to JSON with the command toJSON(LIST,pretty=TRUE,auto_unbox=TRUE). The jsonlite package must be loaded in the R environment, otherwise the command won't work.
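A short sketch of the conversion, assuming jsonlite is installed; the contents of LIST are hypothetical:

```r
library(jsonlite)

# Hypothetical list with a scalar and a vector element
LIST <- list(name = "A", values = c(1, 2, 3))

# pretty = TRUE adds indentation; auto_unbox = TRUE writes
# length-one vectors as JSON scalars instead of one-element arrays
json <- toJSON(LIST, pretty = TRUE, auto_unbox = TRUE)

print(json)
```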

Randomly Sample Rows from an R Data Frame using sample_n()

Nizamuddin Siddiqui
Updated on 06-Mar-2021 12:23:14

545 Views

To randomly sample rows from an R data frame, we can pass the sample size to the sample_n function of the dplyr package. For example, if we have a data frame called df, a random sample of 5 rows can be drawn with the command df%>%sample_n(5).
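A minimal sketch, assuming dplyr is installed; the data frame and the seed are hypothetical and only make the run reproducible:

```r
library(dplyr)

set.seed(1)  # fix the random seed so the same rows are drawn each run

# Hypothetical data frame with 20 rows
df <- data.frame(x1 = 1:20, x2 = rnorm(20))

# Draw 5 rows at random, without replacement by default
sampled <- df %>% sample_n(5)

print(sampled)
```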

Add Variable Description in R

Nizamuddin Siddiqui
Updated on 06-Mar-2021 12:16:47

2K+ Views

To add a variable description in R, we can use the comment function; to view the description, we can call str on the data frame. For example, if we have a data frame df that contains a column x, then we can describe x by using the command comment(df$x)
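The command in the excerpt above is cut off. As a sketch only, assuming the base R replacement form comment(x) <- value is what is meant, with a made-up description string:

```r
# Hypothetical data frame
df <- data.frame(x = c(10, 20, 30))

# Attach a description to column x (the text is illustrative)
comment(df$x) <- "Hypothetical description: measurement in cm"

# Retrieve the description directly
desc <- comment(df$x)
print(desc)

# str() lists column attributes, which include the comment
str(df)
```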

Format All Decimal Places in R Vector and Data Frame

Nizamuddin Siddiqui
Updated on 06-Mar-2021 12:16:25

4K+ Views

To format all decimal places in an R vector or data frame, we can use the formattable function of the formattable package, which lets us specify the number of digits after the decimal point. For example, if we have a numeric vector x, its values can be formatted to 2 decimal places with the command formattable(x,format="f",digits=2). The package must be loaded first with library(formattable).

Create Multiple Bar Plots with Same Width Bars using ggplot2 in R

Nizamuddin Siddiqui
Updated on 06-Mar-2021 12:14:11

1K+ Views

To create multiple bar plots for varying numbers of categories with bars of the same width using ggplot2, we need to adjust the width argument inside the geom_bar function so that the bar widths match across plots. One way to do this is to set width to 0.25 in the plot whose bars would otherwise be wider and to 0.50 in the plot whose bars would otherwise be narrower.

Find High Leverage Values for a Regression Model in R

Nizamuddin Siddiqui
Updated on 06-Mar-2021 12:08:26

2K+ Views

To find the high leverage values for a regression model, we first need the hat values (the diagonal elements of the hat matrix), which can be found with the hatvalues function; we then define the condition for high leverage and extract the matching observations. For example, if we have a regression model M, the hat values can be found with the command hatvalues(M), and the observations with hat values greater than 0.05 can be found with the command which(hatvalues(M)>0.05).
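A sketch of these two steps on simulated data; the data frame, the seed, and the deliberately extreme 100th observation are assumptions for illustration, and 0.05 is the cutoff used in the text rather than a general rule:

```r
set.seed(123)

# Hypothetical data: 99 ordinary x values plus one extreme value
df <- data.frame(x = c(rnorm(99), 10))
df$y <- 2 * df$x + rnorm(100)

# Fit a simple linear regression model
M <- lm(y ~ x, data = df)

# Hat values measure how far each observation's x is from the others
hv <- hatvalues(M)

# Indices of observations whose hat value exceeds 0.05
high <- which(hv > 0.05)

print(high)
```

The hat values sum to the number of estimated coefficients (two here), so a fixed cutoff such as 0.05 flags more observations in small samples than in large ones.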

Apply Multiple AND Conditions to a Data Frame in R

Nizamuddin Siddiqui
Updated on 06-Mar-2021 12:04:04

339 Views

To apply multiple conditions to a data frame, we can use the double ampersand operator, &&. This operator works on single logical values, and from R 4.3.0 onward it raises an error if either side has length greater than one, so the element to test must be selected explicitly. For example, if we have a data frame called df that contains three columns x, y, and z, and we want to add a value to all columns when the first elements of x and y are non-zero and the first element of z equals 5, it can be done with the command: if(df$x[1] && df$y[1] && df$z[1] == 5){    df$x = df$x+10    df$y = df$y+10    df$z = df$z+10 }
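A small sketch of combining conditions with && on single values; the data frame and the specific conditions on x and y are hypothetical:

```r
# Hypothetical data frame with three numeric columns
df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6), z = c(5, 8, 9))

# && expects length-one logicals, so test the first elements explicitly
if (df$x[1] > 0 && df$y[1] > 0 && df$z[1] == 5) {
  df$x <- df$x + 10
  df$y <- df$y + 10
  df$z <- df$z + 10
}

print(df)
```

For element-wise conditions across whole columns, the single & operator is the vectorised counterpart.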
