How to find the intersection of elements in a string vector in R?


If a string vector contains more than one element, the elements may share some common values (words). To find those values, use the intersect function together with the strsplit and Reduce functions: strsplit splits each element into a vector of words, and Reduce applies intersect across those vectors so that only the words present in every element remain.
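As a minimal sketch of the mechanism described above, the following uses a small made-up vector (the data and the variable names `x`, `words` and `common` are illustrative and do not come from the examples in this article):

```r
# Toy vector (hypothetical data)
x <- c("red green blue", "green blue yellow", "blue green black")

# Step 1: strsplit() returns a list with one character vector of words per element
words <- strsplit(x, " ")

# Step 2: Reduce() applies intersect() pairwise, left to right, i.e.
# intersect(intersect(words[[1]], words[[2]]), words[[3]])
common <- Reduce(intersect, words)

print(common)  # "green" "blue"
```

The result keeps the order in which the words appear in the first element.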

The examples below show how this is done.

Example 1

x1 <- c("Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data, and apply knowledge and actionable insights from data across a broad range of application domains.","Data science is the domain of study that deals with vast volumes of data using modern tools and techniques to find unseen patterns, derive meaningful information, and make business decisions. Data science uses complex machine learning algorithms to build predictive models.")

x1

Output

If you execute the above given snippet, it generates the following Output −

[1] "Data science is an interdisciplinary field that uses scientific methods,
processes, algorithms and systems to extract knowledge and insights from
structured and unstructured data, and apply knowledge and actionable insights
from data across a broad range of application domains."
[2] "Data science is the domain of study that deals with vast volumes of data
using modern tools and techniques to find unseen patterns, derive meaningful
information, and make business decisions. Data science uses complex machine
learning algorithms to build predictive models."

Add the following code to the above snippet −

x1
Reduce(intersect, strsplit(x1," "))

Output

If you execute all the above given snippets as a single program, it generates the following Output −

[1] "Data" "science" "is" "that" "uses"
[6] "algorithms" "and" "to" "data" "of"
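Splitting on a single space is case-sensitive and leaves punctuation attached to the words, which is why "Data" and "data" appear as separate entries in the output above and why a token such as "data," would not match "data". If that is not what you want, one possible variation (an assumption, not part of the original example) is to lower-case the text and strip punctuation before splitting. The sketch below uses a small hypothetical vector:

```r
# Toy vector (hypothetical data): "data,", "Data" and "data." differ only
# by case and punctuation
x <- c("Big data, small data", "Data is everywhere", "We love data.")

# Plain split: the tokens keep their case and punctuation, so nothing is shared
plain <- Reduce(intersect, strsplit(x, " "))
print(plain)   # character(0)

# Normalise first: lower-case, then remove punctuation characters
clean <- gsub("[[:punct:]]", "", tolower(x))
common <- Reduce(intersect, strsplit(clean, " "))
print(common)  # "data"
```

Whether to normalise depends on the task; removing punctuation also alters tokens such as "(AI)" or "hands-free".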

Example 2

x2 <- c("Machine learning is a method of data analysis that automates analytical model building. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention.","Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. Machine learning focuses on the development of computer programs that can access data and use it to learn for themselves.","Machine Learning is the field of study that gives computers the capability to learn without being explicitly programmed. ML is one of the most exciting technologies that one would have ever come across. As it is evident from the name, it gives the computer that makes it more similar to humans: The ability to learn. Machine learning is actively being used today, perhaps in many more places than one would expect.")

x2

Output

If you execute the above given snippet, it generates the following Output −

[1] "Machine learning is a method of data analysis that automates analytical
model building. It is a branch of artificial intelligence based on the idea
that systems can learn from data, identify patterns and make decisions with
minimal human intervention."
[2] "Machine learning is an application of artificial intelligence (AI) that
provides systems the ability to automatically learn and improve from experience
without being explicitly programmed. Machine learning focuses on the
development of computer programs that can access data and use it to learn for
themselves."
[3] "Machine Learning is the field of study that gives computers the capability
to learn without being explicitly programmed. ML is one of the most exciting
technologies that one would have ever come across. As it is evident from the
name, it gives the computer that makes it more similar to humans: The ability
to learn. Machine learning is actively being used today, perhaps in many more
places than one would expect."

Add the following code to the above snippet −

x2
Reduce(intersect,strsplit(x2," "))

Output

If you execute all the above given snippets as a single program, it generates the following Output −

[1] "Machine" "learning" "is" "of" "that" "the" "learn"
[8] "from"
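With three or more elements, it can help to see how the set of common words shrinks at each step. A minimal sketch, assuming a small hypothetical vector rather than x2, uses the accumulate argument of Reduce to keep the intermediate results:

```r
# Toy vector (hypothetical data)
x <- c("a b c d", "b c d e", "c d e f")

# accumulate = TRUE returns every intermediate result instead of only the last
steps <- Reduce(intersect, strsplit(x, " "), accumulate = TRUE)

steps[[1]]  # words of the first element:          "a" "b" "c" "d"
steps[[2]]  # common to elements 1 and 2:          "b" "c" "d"
steps[[3]]  # common to all three (final result):  "c" "d"
```

The last entry of `steps` is the same value that a plain `Reduce(intersect, ...)` call returns.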

Example 3

x3 <- c("Deep Learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain called artificial neural networks.","Deep learning is an artificial intelligence (AI) function that imitates the workings of the human brain in processing data and creating patterns for use in decision making. Deep learning is a subset of machine learning in artificial intelligence that has networks capable of learning unsupervised from data that is unstructured or unlabeled. Also known as deep neural learning or deep neural network.","Deep learning is a machine learning technique that teaches computers to do what comes naturally to humans: learn by Example. Deep learning is a key technology behind driverless cars, enabling them to recognize a stop sign, or to distinguish a pedestrian from a lamppost. It is the key to voice control in consumer devices like phones, tablets, TVs, and hands-free speakers. Deep learning is getting lots of attention lately and for good reason. It’s achieving results that were not possible before.","Deep learning can be considered as a subset of machine learning. It is a field that is based on learning and improving on its own by examining computer algorithms. While machine learning uses simpler concepts, deep learning works with artificial neural networks, which are designed to imitate how humans think and learn. Until recently, neural networks were limited by computing power and thus were limited in complexity. However, advancements in Big Data analytics have permitted larger, sophisticated neural networks, allowing computers to observe, learn, and react to complex situations faster than humans. Deep learning has aided image classification, language translation, speech recognition. It can be used to solve any pattern recognition problem and without human intervention.")

x3

Output

If you execute the above given snippet, it generates the following Output −

[1] "Deep Learning is a subfield of machine learning concerned with algorithms
inspired by the structure and function of the brain called artificial neural
networks."
[2] "Deep learning is an artificial intelligence (AI) function that imitates
the workings of the human brain in processing data and creating patterns for
use in decision making. Deep learning is a subset of machine learning in
artificial intelligence that has networks capable of learning unsupervised from
data that is unstructured or unlabeled. Also known as deep neural learning or
deep neural network."
[3] "Deep learning is a machine learning technique that teaches computers to do
what comes naturally to humans: learn by Example. Deep learning is a key
technology behind driverless cars, enabling them to recognize a stop sign, or
to distinguish a pedestrian from a lamppost. It is the key to voice control in
consumer devices like phones, tablets, TVs, and hands-free speakers. Deep
learning is getting lots of attention lately and for good reason. It’s
achieving results that were not possible before."
[4] "Deep learning can be considered as a subset of machine learning. It is a
field that is based on learning and improving on its own by examining computer
algorithms. While machine learning uses simpler concepts, deep learning works
with artificial neural networks, which are designed to imitate how humans think
and learn. Until recently, neural networks were limited by computing power and
thus were limited in complexity. However, advancements in Big Data analytics
have permitted larger, sophisticated neural networks, allowing computers to
observe, learn, and react to complex situations faster than humans. Deep
learning has aided image classification, language translation, speech
recognition. It can be used to solve any pattern recognition problem and
without human intervention."

Add the following code to the above snippet −

x3
Reduce(intersect,strsplit(x3," "))

Output

If you execute all the above given snippets as a single program, it generates the following Output −

[1] "Deep" "is" "a" "of" "machine" "learning" "and"
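If the same operation is needed repeatedly, it can be wrapped in a small function. The helper below is only a sketch: the name `common_words` and its `split` argument are hypothetical and not part of base R. It splits on runs of whitespace rather than a single space, and it calls unique() on each word vector so that a one-element input also comes back without duplicates (Reduce returns a single list element unchanged, without ever calling intersect).

```r
# Hypothetical helper; the name and arguments are illustrative only
common_words <- function(x, split = "[[:space:]]+") {
  # An empty input has no words in common
  if (length(x) == 0) return(character(0))
  # One de-duplicated word vector per element
  words <- lapply(strsplit(x, split), unique)
  # Keep only the words present in every element
  Reduce(intersect, words)
}

common_words("to be or not to be")
# "to" "be" "or" "not"

common_words(c("deep  learning", "deep learning models"))
# "deep" "learning"   (the double space does not produce an empty token)
```

Calling `common_words(x3)` should give the same result as `Reduce(intersect, strsplit(x3, " "))` above, since x3 contains only single spaces between words.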

Updated on: 02-Nov-2021
