A matrix that contains both positive and negative values has a signed maximum, but if we ignore the sign, a negative entry may have the largest magnitude. To retrieve that entry with its sign intact, we can apply the which.max function to the absolute values of the matrix and use the resulting index to subset the original matrix.
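As a minimal sketch of this idea, the snippet below contrasts the signed maximum with the entry of largest magnitude. The matrix M1 and its values are made up for illustration; they are not taken from the original example.

```r
# Hypothetical 3x3 matrix with mixed signs (filled column by column)
M1 <- matrix(c(4, -9, 2, 7, -3, 5, 1, 8, -6), nrow = 3)

# Ordinary maximum: the largest signed value
signed_max <- max(M1)
signed_max        # 8

# Largest magnitude with its sign kept:
# which.max(abs(M1)) gives the position, M1[...] returns the original value
abs_max <- M1[which.max(abs(M1))]
abs_max           # -9
```

If several entries share the largest magnitude, which.max returns the position of the first one only.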
Ranking a variable serves many purposes, such as defining an order based on hierarchy, but in data science we use it mainly for analyzing non-parametric data. A column of an R data frame can be ranked with the rank function. For example, if a data frame df contains a column x, the ranks of its values are given by rank(df$x).
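The following sketch shows rank on a small data frame. The values in df are invented for illustration; the tie in the data is there to show how rank handles equal values by default.

```r
# Hypothetical data frame with one tied value
df <- data.frame(x = c(10, 30, 20, 30))

# Default behaviour: tied values receive the average of their ranks
avg_rank <- rank(df$x)
avg_rank          # 1.0 3.5 2.0 3.5

# Alternative: give tied values the smallest of their ranks
min_rank <- rank(df$x, ties.method = "min")
min_rank          # 1 3 2 3
```

The ties.method argument is worth choosing deliberately, since the default averaged ranks are not integers.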
By default, ggplot2 chooses the Y-axis tick marks from the data provided, but we can set them ourselves with the scale_y_continuous function of the ggplot2 package. For example, to place tick marks from 1 to 10 in steps of 1, we can use scale_y_continuous(breaks=seq(1,10,by=1)).
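A minimal sketch of the call in context, assuming the ggplot2 package is installed. The data frame df and its columns x and y are hypothetical.

```r
library(ggplot2)

# Hypothetical data with y values between 1 and 10
df <- data.frame(x = 1:10, y = c(2, 5, 3, 8, 1, 9, 4, 7, 6, 10))

# Scatterplot with Y-axis tick marks forced to 1, 2, ..., 10
p <- ggplot(df, aes(x, y)) +
  geom_point() +
  scale_y_continuous(breaks = seq(1, 10, by = 1))

p
```

Note that breaks only controls where tick marks are drawn; the visible range of the axis is still determined by the data unless limits is also set.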
If we want to match the values of a string vector, in sequence, against another vector that contains the same values, the pmatch function can be used. pmatch performs partial string matching: it accepts exact matches as well as unique partial matches and returns the index of each matched value.
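The sketch below illustrates pmatch with one partial match, one exact match, and one value that has no match. The vectors x1 and lookups are made-up examples.

```r
# Hypothetical table of values to match against
x1 <- c("apple", "banana", "cherry")

# Values to look up: a unique prefix, an exact value, and an unknown value
lookups <- c("ban", "cherry", "kiwi")

idx <- pmatch(lookups, x1)
idx               # 2 3 NA
```

Unmatched values come back as NA, so the result can be used directly to index x1 when missing entries are acceptable.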
Duplication is a common problem in data analysis. We can find the rows with duplicated values in a particular column of an R data frame by using the duplicated function inside the subset function. This returns only the duplicate rows for the chosen column, which means the first occurrence of each value does not appear in the output.
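A small sketch of this approach follows. The data frame df and its columns x1 and x2 are hypothetical.

```r
# Hypothetical data frame; x1 contains repeated values
df <- data.frame(x1 = c(1, 2, 2, 3, 3, 3), x2 = letters[1:6])

# duplicated(x1) is TRUE for every repeat after the first occurrence,
# so subset keeps rows 3, 5 and 6 only
dups <- subset(df, duplicated(x1))
dups
```

To keep every row that shares a value, including the first occurrence, one option is subset(df, duplicated(x1) | duplicated(x1, fromLast = TRUE)).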
Concatenating string vectors creates combinations of their values, which we can use to represent interactions between or among the vectors. In R, expand.grid together with apply can create such combinations.
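One way to sketch this combination of expand.grid and apply is shown below. The vectors x1 and x2 are invented for illustration.

```r
# Hypothetical string vectors
x1 <- c("A", "B")
x2 <- c("x", "y", "z")

# expand.grid builds every pairing; the first vector varies fastest
grid <- expand.grid(x1, x2, stringsAsFactors = FALSE)

# Paste the values of each row together into a single string
combos <- apply(grid, 1, paste, collapse = "")
combos            # "Ax" "Bx" "Ay" "By" "Az" "Bz"
```

Changing collapse (for example to "_" or ":") controls the separator used inside each combined string.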
Instructors and educators often need to teach missing value imputation to their students, so they require datasets that contain missing values, or they need to create one. R already provides some data sets with missing values, such as the airquality data in base R and the food data in the VIM package. Many other packages may contain data sets with missing values, but exploring them all would take a lot of time. Thus, we share the example of airquality and some data sets from the VIM package.

Example 1
head(airquality, 20)

Output
  Ozone Solar.R Wind Temp Month Day
1    41 ...
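As a rough sketch, the snippet below counts the missing values in airquality and also shows one possible way to create a small teaching dataset with missing values. The data frame toy, its columns, and the choice of three missing entries are assumptions made for illustration.

```r
# Count missing values per column in the built-in airquality data
na_counts <- colSums(is.na(airquality))
na_counts         # Ozone has 37 NAs and Solar.R has 7; other columns have none

# Hypothetical teaching dataset: blank out 3 randomly chosen scores
set.seed(1)
toy <- data.frame(id = 1:10, score = round(rnorm(10, mean = 50, sd = 10)))
toy$score[sample(nrow(toy), 3)] <- NA
sum(is.na(toy$score))   # 3
```

Setting the seed keeps the positions of the missing values reproducible, which helps when students compare imputation results.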
If a column in an R data frame contains only the two values 0 and 1, we call it a binary column. A binary column does not have to be coded as 0 and 1, but that is the general convention. To detect binary columns coded as 0 and 1 in an R data frame, we can use the apply function.
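The check can be sketched as follows. The data frame df and its columns are hypothetical, and the test used here (every value is either 0 or 1) is one reasonable reading of "binary", not necessarily the exact check used in the original example.

```r
# Hypothetical data frame: x1 and x3 are binary, x2 is not
df <- data.frame(x1 = c(0, 1, 1, 0),
                 x2 = c(2, 1, 0, 1),
                 x3 = c(1, 1, 0, 0))

# For each column (MARGIN = 2), test whether every value is 0 or 1
is_binary <- apply(df, 2, function(col) all(col %in% c(0, 1)))
is_binary         # x1 TRUE, x2 FALSE, x3 TRUE
```

Under this test a column holding only 0s or only 1s also counts as binary; a stricter check could additionally require length(unique(col)) == 2.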
Problem: You want to display open cursors in Oracle.

Solution: We can query the data dictionary to determine the number of cursors that are open per session. The 'opened cursors current' statistic in "V$SESSTAT", joined to "V$SESSION" for the session details, provides a more accurate number of the cursors currently open than "V$OPEN_CURSOR".

Example:
    select a.value, c.username, c.machine, c.sid, c.serial#
    from v$sesstat a, v$statname b, v$session c
    where a.statistic# = b.statistic#
      and c.sid = a.sid
      and b.name = 'opened cursors current'
      and a.value != 0
      and c.username IS NOT NULL
    order by 1, 2;

The OPEN_CURSORS initialization parameter determines the maximum number of cursors a session can have open.
Problem: You want to identify the SQL statements responsible for the most waits in your database.

Solution: We can use the SQL statement below to identify the SQL causing the problem. The query ranks the SQL statements that ran during the past 30 minutes and displays them by the total time each query waited.

Example:
    SELECT ash.user_id, u.username, s.sql_text,
           SUM(ash.wait_time + ash.time_waited) ttl_wait_time
    FROM v$active_session_history ash, v$sqlarea s, dba_users u
    WHERE ash.sample_time BETWEEN sysdate - 60/2880 AND sysdate
      AND ash.sql_id = s.sql_id
      AND ash.user_id = u.user_id
    GROUP BY ash.user_id, s.sql_text, u.username
    ORDER BY ttl_wait_time;

When you have a performance ...