 
spaCy - Util.compile_prefix_regex
This utility function compiles a sequence of prefix rules into a single regex object. The object's search method can then be assigned to the tokenizer's prefix_search attribute.
Argument
The table below explains its argument −
| NAME | TYPE | DESCRIPTION | 
|---|---|---|
| entries | tuple | The prefix rules to compile. For example, lang.punctuation.TOKENIZER_PREFIXES. | 
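To make the argument more concrete, the following is a minimal sketch, using only Python's re module, of how a sequence of prefix rules can be combined into one regex. It approximates the behaviour described here and is not spaCy's source code; the name sketch_compile_prefix_regex is hypothetical and chosen only for illustration.

```python
import re

def sketch_compile_prefix_regex(entries):
    # Hypothetical helper (an assumption, not spaCy's code): anchor every
    # non-empty rule at the start of the string and join the rules into
    # a single alternation.
    expression = "|".join("^" + piece for piece in entries if piece.strip())
    return re.compile(expression)

prefix_re = sketch_compile_prefix_regex(("§", "%", "=", r"\+"))

# A rule matches only when it occurs at the very start of the string.
print(prefix_re.search("%done").group())  # prints: %
print(prefix_re.search("done%"))          # prints: None
```

Because every rule is anchored with ^, a character such as % counts as a prefix only at the start of the text being examined.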
Syntax
prefixes = ("§", "%", "=", r"\+")
prefix_reg = spacy.util.compile_prefix_regex(prefixes)
nlp.tokenizer.prefix_search = prefix_reg.search
Example
import spacy
nlp = spacy.load('en_core_web_sm')
prefixes = list(nlp.Defaults.prefixes)
prefixes.remove('\\[')
prefix_regex = spacy.util.compile_prefix_regex(prefixes)
nlp.tokenizer.prefix_search = prefix_regex.search
doc = nlp("[A] works for [B] in [C].")
print([t.text for t in doc])
# '[' is no longer split off as a prefix; ']' is still split off by the unchanged suffix rules
Output
['[A', ']', 'works', 'for', '[B', ']', 'in', '[C', ']', '.']
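In the output, '[' stays attached to the following letter because its prefix rule was removed, while ']' is still separated because the suffix rules were left unchanged. The effect of dropping a prefix rule can be illustrated with a toy model that uses only Python's re module. This is a simplified sketch, not spaCy's tokenizer; split_prefixes and the two small rule sets are hypothetical.

```python
import re

def split_prefixes(text, prefix_search):
    # Toy model (an assumption, not spaCy's algorithm): repeatedly peel
    # off whatever the prefix regex matches at the start of the text.
    pieces = []
    while text:
        match = prefix_search(text)
        if match is None or match.end() == 0:
            break
        pieces.append(text[:match.end()])
        text = text[match.end():]
    if text:
        pieces.append(text)
    return pieces

# Two hypothetical rule sets: one with the '[' rule, one without it.
with_bracket = re.compile(r"^\[|^\(").search
without_bracket = re.compile(r"^\(").search

print(split_prefixes("[A", with_bracket))     # prints: ['[', 'A']
print(split_prefixes("[A", without_bracket))  # prints: ['[A']
```

With the '[' rule present the bracket becomes its own piece; without it the text is left intact, which mirrors the '[A' tokens shown above.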