Find the words in a given sentence that have a given word as a prefix



When working with natural language processing or text analysis, it is often necessary to search for specific words or phrases within a larger body of text. One common task is finding all the words in a sentence that start with a given prefix. In this article, we will explore how to accomplish this task.

Algorithm

  • Read in the input sentence and prefix.

  • Tokenize the input sentence into individual words.

  • For each word in the sentence, check if it starts with the given prefix.

  • If the word starts with the prefix, add it to the list of words that match.

  • Print the list of words that match.
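The steps above can be sketched as a single reusable function. This is a minimal Python sketch rather than a definitive implementation: the function name `words_with_prefix` is a hypothetical name chosen for illustration, and it assumes that words are separated by whitespace and that matching is case-sensitive.

```python
def words_with_prefix(sentence, prefix):
    """Return the words of `sentence` that start with `prefix` (case-sensitive)."""
    # Hypothetical helper; tokenizing on whitespace is an assumption.
    words = sentence.split()
    # Keep only the words that begin with the prefix.
    return [w for w in words if w.startswith(prefix)]


if __name__ == "__main__":
    print(words_with_prefix("The quick brown fox jumps over the lazy dog", "fox"))
```

Returning the list instead of printing it inside the function keeps the matching logic separate from the output step, so the same function can be reused or tested on other sentences.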

Example

The following programs solve the problem in C, C++, Java, and Python, in that order.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main() {
   // Declare variables
   char sentence[] = "The quick brown fox jumps over the lazy dog";
   char prefix[] = "fox";
   char *word = strtok(sentence, " "); // Tokenization using space as delimiter
   char *words[100]; // Array to store tokenized words
   int wordCount = 0;

   // Tokenize the sentence into individual words
   while (word != NULL) {
      words[wordCount++] = word; // Store each word in the array
      word = strtok(NULL, " "); // Get the next word
   }

   char *matches[100]; // Array to store matching words
   int matchCount = 0;

   // Check for words that start with the given prefix
   for (int i = 0; i < wordCount; i++) {
      if (strncmp(words[i], prefix, strlen(prefix)) == 0) { // Compare with prefix
         matches[matchCount++] = words[i]; // Store matching words
      }
   }

   // Print the matching words
   printf("Matching words:\n");
   for (int i = 0; i < matchCount; i++) {
      printf("%s\n", matches[i]);
   }

   return 0;
}

Output

Matching words:
fox

#include <iostream>
#include <string>
#include <vector>

using namespace std;

int main() {
   string sentence, prefix;
   vector<string> words;
   
   // Read in the input sentence and prefix
   sentence="The quick brown fox jumps over the lazy dog";
   prefix="fox";
   
   // Tokenize the input sentence into individual words
   string word = "";
   for (auto c : sentence) {
      if (c == ' ') {
         words.push_back(word);
         word = "";
      }
      else {
         word += c;
      }
   }
   words.push_back(word);

   // Find all words in the sentence that start with the given prefix
   vector<string> matches;
   for (auto w : words) {
      if (w.substr(0, prefix.length()) == prefix) {
         matches.push_back(w);
      }
   }
   
   // Print the list of matching words
   cout << "Matching words:" << endl;
   for (auto m : matches) {
      cout << m << endl;
   }
   
   return 0;
}

Output

Matching words:
fox

import java.util.ArrayList;

public class Main {
   public static void main(String[] args) {
      // Declare variables
      String sentence = "The quick brown fox jumps over the lazy dog";
      String prefix = "fox";
      String[] words = sentence.split(" "); // Tokenization using space as delimiter

      ArrayList<String> matches = new ArrayList<>(); // ArrayList to store matching words

      // Check for words that start with the given prefix
      for (String w : words) {
         if (w.startsWith(prefix)) {
            matches.add(w); // Store matching words in ArrayList
         }
      }

      // Print the matching words
      System.out.println("Matching words:");
      for (String m : matches) {
         System.out.println(m);
      }
   }
}

Output

Matching words:
fox

sentence = "The quick brown fox jumps over the lazy dog"
prefix = "fox"
words = sentence.split() # Tokenization on whitespace

matches = [w for w in words if w.startswith(prefix)] # List comprehension to find matching words

print("Matching words:")
for m in matches:
   print(m)

Output

Matching words:
fox

Test Case Example

Suppose we have the following input sentence:

The quick brown fox jumps over the lazy dog

And we want to find all the words that start with the prefix "fox". Running any of the programs above with this input produces the following output:

Matching words:
fox

In this example, the only word in the sentence that starts with the prefix "fox" is "fox" itself, so it is the only word that is printed as a match.
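Note that the prefix check used in this article is case-sensitive, so the prefix "the" would match "the" but not "The" in the sentence above. If case should be ignored, one possible approach is to fold both the word and the prefix to a common case before comparing. The sketch below assumes this is the desired behavior; the function name `words_with_prefix_ignore_case` is hypothetical and introduced only for illustration.

```python
def words_with_prefix_ignore_case(sentence, prefix):
    """Return the words of `sentence` that start with `prefix`, ignoring case."""
    # Hypothetical helper; fold the prefix once, then fold each word to compare.
    folded_prefix = prefix.casefold()
    return [w for w in sentence.split() if w.casefold().startswith(folded_prefix)]


sentence = "The quick brown fox jumps over the lazy dog"

# Case-sensitive check, as in the programs in this article.
print([w for w in sentence.split() if w.startswith("the")])  # ['the']

# Case-insensitive variant.
print(words_with_prefix_ignore_case(sentence, "the"))  # ['The', 'the']
```

The matched words are returned in their original spelling; only the comparison uses the case-folded form.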

Conclusion

Finding all the words in a sentence that start with a given prefix is a useful task in natural language processing and text analysis. By tokenizing the input sentence into individual words and checking each word for a matching prefix, we can easily accomplish this task.

Updated on: 2023-10-20T14:59:21+05:30
