spaCy - Debug-data Command



This command lets us analyse, debug, and validate our training and development data. It also reports useful statistics and flags problems such as invalid entity annotations, cyclic dependencies, and labels with too few examples.

The Debug-data command is as follows −

python -m spacy debug-data [lang] [train_path] [dev_path] [--base-model] [--pipeline] [--tag-map-path] [--ignore-warnings] [--verbose] [--no-format]

Arguments

The table below explains its arguments −

ARGUMENT TYPE DESCRIPTION
lang Positional This argument represents the model language.
train_path Positional This is the location of JSON-formatted training data which can be either a file or a directory of files.
dev_path Positional This is the location of JSON-formatted development data for evaluation, which can either be a file or a directory of files.
--tag-map-path, -tm Option This is the location of a JSON-formatted tag map. It was introduced in version 2.2.4.
--base-model, -b Option This is the name of the base model to update. It is optional and can be any loadable spaCy model.
--pipeline, -p Option This is a comma-separated list of the names of the pipeline components to train. The default value is 'tagger,parser,ner'.
--ignore-warnings, -IW Flag As the name implies, this flag suppresses warnings, so that only statistics and errors are shown.
--verbose, -V Flag It will print additional information and explanations.
--no-format, -NF Flag It will print the results without pretty-printing them. You can use this argument if you want to write the output to a file.
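
The arguments above can be combined into a single call. The following is a minimal sketch, not an official spaCy utility, that assembles the command line in Python and optionally runs it. The helper names build_debug_data_command and run_debug_data, as well as the file names train.json and dev.json, are assumptions made for illustration only. The sketch assumes a spaCy v2 installation, where the command is spelled debug-data.

```python
import subprocess
import sys


def build_debug_data_command(lang, train_path, dev_path, base_model=None,
                             pipeline=None, ignore_warnings=False,
                             verbose=False, no_format=False):
    """Assemble the argument list for `python -m spacy debug-data`.

    Hypothetical helper: it only builds the list, it does not call spaCy.
    """
    # The three positional arguments come first, in the documented order.
    cmd = [sys.executable, "-m", "spacy", "debug-data",
           lang, train_path, dev_path]
    # Options take a value.
    if base_model:
        cmd += ["--base-model", base_model]
    if pipeline:
        # --pipeline expects comma-separated component names.
        cmd += ["--pipeline", ",".join(pipeline)]
    # Flags are simple on/off switches.
    if ignore_warnings:
        cmd.append("--ignore-warnings")
    if verbose:
        cmd.append("--verbose")
    if no_format:
        cmd.append("--no-format")
    return cmd


def run_debug_data(cmd):
    """Run the assembled command and return its exit code (requires spaCy v2)."""
    return subprocess.run(cmd).returncode


if __name__ == "__main__":
    # train.json and dev.json are placeholder paths, not real files.
    command = build_debug_data_command(
        "en", "train.json", "dev.json",
        pipeline=["tagger", "ner"], verbose=True,
    )
    # Show the command instead of running it, so the sketch works without spaCy.
    print(" ".join(command))
```

To actually validate the data, pass the returned list to run_debug_data once spaCy and the data files are in place. Building the command as a list rather than a single string avoids quoting problems when paths contain spaces.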