spaCy - Util.compile_prefix_regex
This utility function will compile a sequence of prefix rules into a regex object.
Argument
The table below explains its argument −
NAME | TYPE | DESCRIPTION |
---|---|---|
entries | tuple | The prefix rules to compile, for example lang.punctuation.TOKENIZER_PREFIXES. |
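To make the idea of "compiling prefix rules" concrete, here is a minimal sketch that uses only Python's `re` module. It assumes the compiled pattern is simply every rule anchored to the start of the string and joined with `|`. The helper name `compile_prefix_regex_sketch` is hypothetical and is not part of spaCy. The real function may differ in detail.

```python
import re


def compile_prefix_regex_sketch(entries):
    """Hypothetical stand-in for illustration only (not spaCy's code).

    Assumption: each non-empty rule is anchored with "^" so it can only
    match at the start of a string, and the rules are OR-ed together.
    """
    pieces = ["^" + piece for piece in entries if piece.strip()]
    return re.compile("|".join(pieces))


# Each entry is a regex fragment, so "+" has to be escaped.
prefixes = ("§", "%", "=", r"\+")
prefix_re = compile_prefix_regex_sketch(prefixes)

# A rule matches only when it sits at the very start of the string.
match = prefix_re.search("%off")
matched_prefix = match.group() if match else None

# The same character elsewhere in the string is not treated as a prefix.
no_match = prefix_re.search("off%")
```

The tokenizer only needs the `search` method of the compiled pattern, which is why the snippets on this page assign `prefix_regex.search` to `nlp.tokenizer.prefix_search`.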
Syntax
prefixes = ("§", "%", "=", r"\+")
prefix_reg = spacy.util.compile_prefix_regex(prefixes)
nlp.tokenizer.prefix_search = prefix_reg.search
Example
import spacy
nlp = spacy.load('en_core_web_sm')
# Start from the default prefix rules and drop the one for "["
prefixes = list(nlp.Defaults.prefixes)
prefixes.remove('\\[')
# Recompile the remaining rules and plug them into the tokenizer
prefix_regex = spacy.util.compile_prefix_regex(prefixes)
nlp.tokenizer.prefix_search = prefix_regex.search
doc = nlp("[A] works for [B] in [C].")
print([t.text for t in doc])
Output
['[A', ']', 'works', 'for', '[B', ']', 'in', '[C', ']', '.']