
- Python Basic Tutorial
- Python - Home
- Python - Overview
- Python - Environment Setup
- Python - Basic Syntax
- Python - Comments
- Python - Variables
- Python - Data Types
- Python - Operators
- Python - Decision Making
- Python - Loops
- Python - Numbers
- Python - Strings
- Python - Lists
- Python - Tuples
- Python - Dictionary
- Python - Date & Time
- Python - Functions
- Python - Modules
- Python - Files I/O
- Python - Exceptions
Python - Word Embedding using Word2Vec
Word Embedding is a language modeling technique used for mapping words to vectors of real numbers. It represents words or phrases in vector space with several dimensions. Word embeddings can be generated using various methods like neural networks, co-occurrence matrix, probabilistic models, etc.
Word2Vec consists of models for generating word embedding. These models are shallow two-layer neural networks having one input layer, one hidden layer and one output layer.
Example
# importing all necessary modules from nltk.tokenize import sent_tokenize, word_tokenize import warnings warnings.filterwarnings(action = 'ignore') import gensim from gensim.models import Word2Vec # Reads ‘alice.txt’ file sample = open("C:\Users\Vishesh\Desktop\alice.txt", "r") s = sample.read() # Replaces escape character with space f = s.replace("\n", " ") data = [] # iterate through each sentence in the file for i in sent_tokenize(f): temp = [] # tokenize the sentence into words for j in word_tokenize(i): temp.append(j.lower()) data.append(temp) # Create CBOW model model1 = gensim.models.Word2Vec(data, min_count = 1, size = 100, window = 5) # Print results print("Cosine similarity between 'alice' " + "and 'wonderland' - CBOW : ", model1.similarity('alice', 'wonderland')) print("Cosine similarity between 'alice' " + "and 'machines' - CBOW : ", model1.similarity('alice', 'machines')) # Create Skip Gram model model2 = gensim.models.Word2Vec(data, min_count = 1, size = 100, window =5, sg = 1) # Print results print("Cosine similarity between 'alice' " + "and 'wonderland' - Skip Gram : ", model2.similarity('alice', 'wonderland')) print("Cosine similarity between 'alice' " + "and 'machines' - Skip Gram : ", model2.similarity('alice', 'machines'))
- Related Articles
- How can the ‘Word2Vec’ algorithm be trained using Tensorflow?
- Create Word Cloud using Python
- Word Dictionary using Python Tkinter
- How can Keras be used with Embedding layer to share layers using Python?
- Embedding an Image in a Tkinter Canvas widget using PIL
- Embedding Interfaces in Golang
- How to match a word in python using Regular Expression?
- Word Break in Python
- Word Search in Python
- Word Spacing using CSS
- Find the first repeated word in a string in Python using Dictionary
- Embedding small plots inside subplots in Matplotlib
- Shortest Completing Word in Python
- Word Break II in Python
- Word Search II in Python

Advertisements