spaCy - Debug-data Command

With the help of this command, we can analyse, debug, and validate our training and development data. It also reports useful statistics and flags problems such as invalid entity annotations, cyclic dependencies, and labels with too few examples.

The syntax of the debug-data command is as follows −

python -m spacy debug-data [lang] [train_path] [dev_path] [--tag-map-path] [--base-model] [--pipeline] [--ignore-warnings] [--verbose] [--no-format]
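
As a minimal sketch of the syntax above, the following Python snippet assembles the command with only the three positional arguments. The file names train.json and dev.json are placeholders and the helper name build_basic_command is hypothetical; neither comes from spaCy itself. The snippet only builds and prints the command. Running it for real assumes a spaCy installation that provides debug-data and data files that actually exist.

```python
import sys


def build_basic_command(lang, train_path, dev_path):
    """Return the debug-data command as a list of arguments (hypothetical helper)."""
    # sys.executable is the Python interpreter running this script, so the
    # command uses the same environment in which spaCy is expected to live.
    return [sys.executable, "-m", "spacy", "debug-data", lang, train_path, dev_path]


# Placeholder values: an English model and two assumed JSON data files.
command = build_basic_command("en", "train.json", "dev.json")

# Show the arguments that follow the interpreter path.
print(" ".join(command[1:]))
```

The resulting list can be passed to subprocess.run(command) once the data files are in place.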

Arguments

The table below explains its arguments −

ARGUMENT	TYPE	DESCRIPTION
lang	Positional	This argument represents the model language.
train_path	Positional	This is the location of JSON-formatted training data which can be either a file or a directory of files.
dev_path	Positional	This is the location of JSON-formatted development data for evaluation, which can either be a file or a directory of files.
--tag-map-path, -tm	Option	This is the location of the JSON-formatted tag map. It was introduced in version 2.2.4.
--base-model, -b	Option	This optional argument is the name of the base model to update. It can be any loadable spaCy model.
--pipeline, -p	Option	This is a comma-separated list of the names of the pipeline components to train. The default value is 'tagger,parser,ner'.
--ignore-warnings, -IW	Flag	As the name implies, this argument suppresses warnings, so that only statistics and errors are shown.
--verbose, -V	Flag	It will print additional information and explanations.
--no-format, -NF	Flag	It will print the results without pretty-printing them. You can use this argument if you want to write the output to a file.
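
The options in the table can be combined with the positional arguments. The sketch below is one possible way to do that from Python, not an official spaCy API: build_debug_data_command and save_report are hypothetical helper names, and train.json, dev.json, and debug_report.txt are placeholder paths. It turns keyword arguments into the matching command-line options and, if save_report is called, writes the command's output to a file, which is the case the --no-format flag is meant for.

```python
import subprocess
import sys


def build_debug_data_command(lang, train_path, dev_path, base_model=None,
                             pipeline=None, ignore_warnings=False,
                             verbose=False, no_format=False):
    """Translate keyword arguments into debug-data options (hypothetical helper)."""
    cmd = [sys.executable, "-m", "spacy", "debug-data", lang, train_path, dev_path]
    if base_model:
        cmd += ["--base-model", base_model]
    if pipeline:
        # The CLI expects component names joined by commas, e.g. "tagger,ner".
        cmd += ["--pipeline", ",".join(pipeline)]
    if ignore_warnings:
        cmd.append("--ignore-warnings")
    if verbose:
        cmd.append("--verbose")
    if no_format:
        cmd.append("--no-format")
    return cmd


def save_report(cmd, report_path):
    """Run the command and write everything it prints to report_path."""
    with open(report_path, "w", encoding="utf-8") as handle:
        result = subprocess.run(cmd, stdout=handle, stderr=subprocess.STDOUT)
    return result.returncode


# Placeholder inputs; only the command is built here, nothing is executed.
cmd = build_debug_data_command("en", "train.json", "dev.json",
                               pipeline=["tagger", "ner"],
                               verbose=True, no_format=True)
print(cmd[1:])

# To run it for real (assumes spaCy is installed and the files exist):
# save_report(cmd, "debug_report.txt")
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.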