spaCy - Util.compile_suffix_regex
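This utility function compiles a sequence of suffix rules into a single regex object. The search method of that object can then be assigned to the tokenizer's suffix_search attribute.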
Argument
The table below explains its argument −
NAME | TYPE | DESCRIPTION |
---|---|---|
entries | Tuple | This argument represents the suffix rules. For example, lang.punctuation.TOKENIZER_SUFFIXES. |
Syntax
    from spacy import util

    # nlp is assumed to be an already loaded pipeline
    suffixes = ("'s", "'S", r"(?<=[0-9])\+")
    suffix_reg = util.compile_suffix_regex(suffixes)
    nlp.tokenizer.suffix_search = suffix_reg.search
Example
    import spacy

    nlp = spacy.load('en_core_web_sm')

    # Drop the closing-bracket rule so "]" is no longer split off as a suffix.
    suffixes = list(nlp.Defaults.suffixes)
    suffixes.remove('\\]')
    suffix_regex = spacy.util.compile_suffix_regex(suffixes)
    nlp.tokenizer.suffix_search = suffix_regex.search

    # "[" is still split off, because the prefix rules are unchanged.
    doc = nlp("[A] works for [B] in [C].")
    print([t.text for t in doc])
Output
['[', 'A]', 'works', 'for', '[', 'B]', 'in', '[', 'C]', '.']
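To get a feel for what the compiled object does on its own, here is a minimal sketch that uses only the standard re module. It assumes the compile step anchors every rule at the end of the string with $ and joins the rules with |. The helper sketch_compile_suffix_regex is a hypothetical stand-in written for illustration and is not part of the spaCy API; the suffix rules are illustrative as well.

```python
import re


def sketch_compile_suffix_regex(entries):
    """Hypothetical stand-in for util.compile_suffix_regex (an assumption, not spaCy code).

    Anchors every non-empty rule at the end of the string and joins the
    rules into a single alternation.
    """
    expression = "|".join(piece + "$" for piece in entries if piece.strip())
    return re.compile(expression)


# Illustrative rules: possessive endings and a plus sign preceded by a digit.
suffixes = ("'s", "'S", r"(?<=[0-9])\+")
suffix_search = sketch_compile_suffix_regex(suffixes).search

for text in ("John's", "10+", "a+", "hello"):
    match = suffix_search(text)
    # A match marks the trailing characters a tokenizer would split off.
    print(text, "->", match.group() if match else None)
```

A rule only fires when it matches at the very end of the string, so "10+" yields a match for the plus sign while "a+" does not, because the lookbehind requires a preceding digit. In spaCy itself, the search method of the compiled object is what gets assigned to nlp.tokenizer.suffix_search.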