Multilingual Google Meet Summarizer: A Python Project


Introduction

Multilingual Google Meet Summarizer is a Chrome extension that transcribes Google Meet conversations and summarizes them in multiple languages. During the COVID-19 pandemic, people needed a tool that could effectively summarize meetings, classroom lectures, and conference videos. A tool like this is useful in exactly that setting.

In this article, let us have an overview of the project structure and explore some implementation aspects with the help of code.

What is this project all about?

This is a simple Chrome extension that, when enabled during a Google Meet session, generates a transcript of the meeting and summarizes the conversation in multiple languages.

What tools are we going to use?

  • Frontend − React.js, HTML, CSS, Bootstrap

  • Backend − Django REST service

  • Python (ML) − PyTorch, NLTK library

  • DB − SQLite

Flow Diagram and steps required for the Project

  • The user starts the meeting and enables the extension simultaneously.

  • The extension then captures the meeting audio using HTML audio and transcribes it.

  • It then sends the transcript to the backend, which uses an ML algorithm to summarize the text.

  • The ML translation model translates the summarized text into the target language.

  • Finally, the transcription can be downloaded from the extension.
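
The flow above can be sketched as a small orchestration function. This is only an illustrative sketch: `run_pipeline`, `MeetingResult`, and the three callables passed in are hypothetical names rather than part of the project's code, and the stand-in lambdas in the usage lines merely mimic each stage.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class MeetingResult:
    """Hypothetical container for what each stage of the flow produces."""
    transcript: str
    summary: str
    translated_summary: str


def run_pipeline(
    audio_chunks: Sequence[bytes],
    transcribe: Callable[[bytes], str],
    summarize: Callable[[str], str],
    translate: Callable[[str, str], str],
    target_lang: str,
) -> MeetingResult:
    # Transcribe each captured audio chunk and join the pieces.
    transcript = " ".join(transcribe(chunk) for chunk in audio_chunks)
    # The backend summarizes the full transcript.
    summary = summarize(transcript)
    # The summary is then translated into the target language.
    translated = translate(summary, target_lang)
    return MeetingResult(transcript, summary, translated)


# Stand-in stages, for illustration only.
result = run_pipeline(
    audio_chunks=[b"hello team", b"agenda is ranking"],
    transcribe=lambda chunk: chunk.decode("utf-8"),
    summarize=lambda text: text.split(" ")[0],
    translate=lambda text, lang: f"[{lang}] {text}",
    target_lang="hi",
)
print(result.translated_summary)
```

Passing the stages in as callables keeps the sketch independent of any particular speech-to-text, summarization, or translation library.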

Project Implementation

UI (Frontend) and Backend

  • An authentication service is created using a Django REST API so that users can log in and save the transcription.

  • A model is created in Django, and JWT tokens are used for authentication, with REST API views for generating tokens and authenticating users.

  • Next, a SQLite database is created, with its relations (tables) defined in SQL.

  • A transcript schema is created with the fields id, name, date, host, title, duration, and text.

  • Next, a REST API is created, which the Chrome extension uses to send data from the front end.

  • In the backend, APIs are also created for text summarization and translation using Python, NLTK, and NLP techniques.
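
A minimal sketch of the transcript table described above, written with Python's built-in `sqlite3` module instead of going through Django. The table name `transcript`, the column types, the units, and the sample row are all assumptions; the list above only names the fields.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE transcript (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,
        name     TEXT NOT NULL,
        date     TEXT NOT NULL,      -- ISO 8601 date string (assumed format)
        host     TEXT NOT NULL,
        title    TEXT NOT NULL,
        duration INTEGER NOT NULL,   -- minutes (assumed unit)
        text     TEXT NOT NULL
    )
    """
)

# Parameterized insert: values are bound, never interpolated into the SQL.
conn.execute(
    "INSERT INTO transcript (name, date, host, title, duration, text) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    ("weekly-sync", "2023-03-23", "alice", "Weekly sync", 30, "Meeting text..."),
)
conn.commit()

# Look up the stored transcript by host.
row = conn.execute(
    "SELECT id, title, duration FROM transcript WHERE host = ?", ("alice",)
).fetchone()
print(row)
```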

Code Implementation

Example

from googletrans import Translator
LANG_CODES = {
   'ENGLISH': 'en',
   'HINDI': 'hi',
   'MARATHI': 'mr',
   'ARABIC': 'ar',
   'BENGALI': 'bn',
   'CHINESE': 'zh-CN',
   'FRENCH': 'fr',
   'GUJARATI': 'gu',
   'JAPANESE': 'ja',
   'KANNADA': 'kn',
   'MALAYALAM': 'ml',
   'NEPALI': 'ne',
   'ORIYA': 'or',
   'PORTUGUESE': 'pt',
   'PUNJABI': 'pa',
   'RUSSIAN': 'ru',
   'SPANISH': 'es',
   'TAMIL': 'ta',
   'TELUGU': 'te',
   'URDU': 'ur'
}

def language_translate(input_data, input_lang, output_lang):
   # Language names are matched case-insensitively against LANG_CODES.
   input_lang, output_lang = input_lang.upper(), output_lang.upper()
   tr = Translator()
   # Assumes a googletrans release whose translate() is synchronous.
   text_translate = tr.translate(
      input_data, src=LANG_CODES[input_lang], dest=LANG_CODES[output_lang])
   output_text = text_translate.text
   return output_text
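
The translation helper above indexes `LANG_CODES` directly, so an unsupported language name surfaces as a bare `KeyError`. One possible way to give callers a clearer message is a small lookup helper. `resolve_lang_code` is a hypothetical name, and the dictionary here is a three-entry excerpt of the table above so that the sketch runs on its own, without googletrans or network access.

```python
# Excerpt of the LANG_CODES table, repeated so the sketch is self-contained.
LANG_CODES = {"ENGLISH": "en", "HINDI": "hi", "FRENCH": "fr"}


def resolve_lang_code(language: str) -> str:
    """Map a language name (any case, surrounding spaces ignored) to its code."""
    key = language.strip().upper()
    try:
        return LANG_CODES[key]
    except KeyError:
        supported = ", ".join(sorted(LANG_CODES))
        raise ValueError(
            f"Unsupported language {language!r}; choose one of: {supported}"
        ) from None


print(resolve_lang_code(" hindi "))
```

The translation function could call such a helper for both the source and the target language before contacting the translation service.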

Machine Learning Algorithm

Summarization

Data Cleaning --> Tokenization --> Vocab generation/Rank/Frequency --> Sentence Selection --> Summarization

Translation

Initialization --> Embedding generation --> Encoding --> Decoding --> Translation

Python Code for Summarization

Example

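import nltk
from nltk.corpus import stopwords
from nltk.tokenize import sent_tokenize, word_tokenize

# Download the stop-word list and the Punkt sentence tokenizer data.
nltk.download('stopwords')
nltk.download('punkt')
# Newer NLTK releases look for the 'punkt_tab' resource instead of 'punkt'.
nltk.download('punkt_tab')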

def text_clean(txt):
   # Split on the '**' markers and keep every second chunk after the first;
   # this presumably drops '**speaker**' style labels from the transcript.
   data = txt.split('**')
   data.pop(0)
   cleaned_data = ""
   for i, t in enumerate(data):
      if i % 2 != 0:
         cleaned_data += str(t)
   return cleaned_data


# English stop words to ignore when counting word frequencies.
stop_words = set(stopwords.words("english"))


def word_token(txt):
   return word_tokenize(txt)


def frequency_table(txt):
   # Count how often each non-stop-word token occurs (case-insensitive).
   frequencyT = dict()
   for w in word_token(txt):
      w = w.lower()
      if w in stop_words:
         continue
      frequencyT[w] = frequencyT.get(w, 0) + 1
   return frequencyT


def Senttokenizer(txt):
   return sent_tokenize(txt)


def sents_rank_table(txt):
   # Score each sentence by summing the frequencies of the words it contains.
   # Note: 'word in s.lower()' is a substring match, not a token match.
   sent_value = dict()
   freq_Table = frequency_table(txt)
   for s in Senttokenizer(txt):
      for word, freq in freq_Table.items():
         if word in s.lower():
            sent_value[s] = sent_value.get(s, 0) + freq
   return sent_value
def summary(txt):
   sent_value = sents_rank_table(txt)
   if not sent_value:
      # Nothing to rank (e.g. empty input), so there is nothing to summarize.
      return ""
   total = 0
   for sentence in sent_value:
      total += sent_value[sentence]
   average = int(total / len(sent_value))
   # Keep sentences scoring more than 1.2 times the average, in original order.
   summ = ""
   sents = Senttokenizer(txt)
   for s in sents:
      if (s in sent_value) and (sent_value[s] > (1.2 * average)):
         summ += " " + s
   return summ


def main(input_text):
   # Clean the text only when it contains '**' markers.
   if "**" not in input_text:
      txt = input_text
   else:
      txt = text_clean(input_text)
   summarized_text = summary(txt)
   print("Summary: ", summarized_text)
   return summarized_text

textdata = "Site Ranking/Position and SEO play a vital role in today's search trends and the relevancy of results obtained. Search Engine\
Ranking is widely adopted by many big tech organizations like google today with state-of-the-art algorithms.\
In this article, we are going to explore how Machine Learning can impact the ranking of sites and how it can utilize site data to fuel state-of-the-art algorithms.\
How can Machine Learning be useful in Ranking?\
For some time, back specialists in SEO have been keeping themselves away from using Deep Learning and neural\
networks for developing ranking algorithms, but today with the versatility of implementation of machine learning and deep learning algorithms this \
scenario has completely changed.\
Today big organizations like Google, Microsoft, and Yahoo are actively exploiting these algorithms."

main(textdata)

Output

Summary:	Search Engine Ranking is widely 
adopted by many big tech organizations  like google today with state-of-the-art 
algorithms.In this article, we are going to explore how Machine Learning can impact 
the ranking of sites and how it can utilize site data to fuel state-of-the-art 
algorithms.How can Machine Learning be useful in Ranking?For some time, back 
specialists in SEO have been keeping themselves away from using Deep Learning and 
neuralnetworks for developing ranking algorithms, but today with the versatility of 
implementation of machine learning and deep learning algorithms this scenario has 
completely changed.Today big organizations like Google, Microsoft, and Yahoo are 
actively exploiting these algorithms.
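
The selection rule used by `summary` can be worked through in isolation: a sentence is kept only when its score exceeds 1.2 times the truncated average score. The sentence labels and scores below are made-up numbers for illustration, and `select_sentences` is a hypothetical helper, not a function from the code above.

```python
from typing import Dict, List


def select_sentences(scores: Dict[str, int], factor: float = 1.2) -> List[str]:
    """Keep sentences whose score is above factor * (truncated) average."""
    if not scores:
        return []
    average = int(sum(scores.values()) / len(scores))
    return [s for s, score in scores.items() if score > factor * average]


scores = {"Sentence A": 10, "Sentence B": 4, "Sentence C": 4}
# average = int(18 / 3) = 6, so the cut-off is roughly 1.2 * 6 = 7.2
print(select_sentences(scores))  # ['Sentence A']
```

Because the cut-off sits above the average, a text whose sentences all score the same yields an empty summary, which is worth keeping in mind when tuning the 1.2 factor.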

Conclusion

Google Meet Summarizer is a simple project that can summarize a Google Meet session in multiple languages using a technology stack that spans the frontend, the backend, and machine learning.

Updated on: 23-Mar-2023
