spaCy - Init Model



Like the spacy model command in version 1.x, the init-model command creates a new model directory from raw data such as Brown clusters and word vectors.

The syntax of the init-model command is as follows −

python -m spacy init-model [lang] [output_dir] [--jsonl-loc] [--vectors-loc] [--prune-vectors]
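
As a rough illustration of how the pieces of this command fit together, the following sketch assembles the argument list in Python. The helper name build_init_model_command and the paths ./my_model and ./vectors.txt are hypothetical and chosen only for this example; the actual call is left commented out because it needs a spaCy 2.x installation and a real vectors file.

```python
import subprocess
import sys


def build_init_model_command(lang, output_dir, jsonl_loc=None,
                             vectors_loc=None, prune_vectors=None):
    """Assemble the argument list for `python -m spacy init-model`.

    Hypothetical helper for illustration; it only builds the list and
    does not validate paths or run anything.
    """
    cmd = [sys.executable, "-m", "spacy", "init-model", lang, output_dir]
    if jsonl_loc is not None:
        cmd += ["--jsonl-loc", jsonl_loc]
    if vectors_loc is not None:
        cmd += ["--vectors-loc", vectors_loc]
    if prune_vectors is not None:
        cmd += ["--prune-vectors", str(prune_vectors)]
    return cmd


# Hypothetical paths, used only for illustration.
command = build_init_model_command(
    "en", "./my_model", vectors_loc="./vectors.txt", prune_vectors=20000
)

# Show the arguments that follow the Python executable.
print(" ".join(command[1:]))

# Uncomment to run the command for real (assumes spaCy 2.x is installed
# and ./vectors.txt exists).
# subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when a path contains spaces.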

Arguments

The table below explains its arguments −

ARGUMENT | TYPE | DESCRIPTION
lang | positional | ISO code of the model language, for example en.
output_dir | positional | Model output directory. It is created if it does not already exist.
--jsonl-loc, -j | option | Optional location of a JSONL-formatted vocabulary file with lexical attributes.
--vectors-loc, -v | option | Optional location of the vectors. The first row of the file should contain the dimensions of the vectors, followed by a space-separated Word2Vec table. The file can be provided as a .txt file or as a zipped text file in .zip or .tar.gz format.
--truncate-vectors, -t | option | Introduced in version 2.3. Number of vectors to truncate to when reading in the vectors file. The default value, 0, means no truncation.
--prune-vectors, -V | option | Number of vectors to prune the vocabulary to. The default value, -1, means no pruning.
--vectors-name, -vn | option | Name to assign to the word vectors in meta.json, for example en_core_web_md.vectors.
--omit-extra-lookups, -OEL | flag | Introduced in version 2.3. Omits the extra lookups tables (cluster/prob/sentiment) from spacy-lookups-data in the model.
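
The vectors file layout described for --vectors-loc can be sketched with a few lines of Python. This is a minimal example under stated assumptions: the helper write_word2vec_text, the toy words, and the three-dimensional vectors are all made up for illustration, and the header row is written as the number of vectors followed by the vector width, which is the usual Word2Vec text convention.

```python
import os
import tempfile


def write_word2vec_text(path, vectors):
    """Write a {word: [floats]} mapping as a Word2Vec-style text table.

    Hypothetical helper for illustration only.
    """
    dims = {len(values) for values in vectors.values()}
    if len(dims) != 1:
        raise ValueError("all vectors must have the same length")
    (dim,) = dims
    with open(path, "w", encoding="utf-8") as f:
        # Header row: number of vectors, then vector width.
        f.write(f"{len(vectors)} {dim}\n")
        # One row per word: the word, then its space-separated values.
        for word, values in vectors.items():
            f.write(word + " " + " ".join(f"{x:.4f}" for x in values) + "\n")


# Toy data, not real embeddings.
toy_vectors = {
    "apple": [0.1, 0.2, 0.3],
    "banana": [0.4, 0.5, 0.6],
}

vectors_path = os.path.join(tempfile.mkdtemp(), "vectors.txt")
write_word2vec_text(vectors_path, toy_vectors)

with open(vectors_path, encoding="utf-8") as f:
    lines = f.read().splitlines()
print(lines)
```

A file written this way could then be passed to the command through --vectors-loc, optionally combined with --prune-vectors to limit how many vectors are kept.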