spaCy - Container Span Class

This chapter explains the Span class in spaCy.

Span Class

A Span is a slice of a Doc object, which was discussed earlier.

Attributes

The table below explains its attributes −

NAME	TYPE	DESCRIPTION
doc	Doc	It represents the parent document.
tensor	Ndarray	Introduced in version 2.1.7. It represents the span's slice of the parent Doc's tensor.
sent	Span	The sentence span that this span is a part of.
start	Int	The token offset for the start of the span.
end	Int	The token offset for the end of the span.
start_char	Int	The character offset for the start of the span.
end_char	Int	The character offset for the end of the span.
text	Unicode	A Unicode representation of the span text.
text_with_ws	Unicode	The text content of the span, with a trailing whitespace character if the last token has one.
orth	Int	The ID of the verbatim text content.
orth_	Unicode	The verbatim text content, which is identical to Span.text. It exists mostly for consistency with the other attributes.
label	Int	The hash value of the span's label.
label_	Unicode	The span's label.
lemma_	Unicode	The span's lemma.
kb_id	Int	The hash value of the knowledge base ID referred to by the span.
kb_id_	Unicode	The knowledge base ID referred to by the span.
ent_id	Int	The hash value of the named entity the token is an instance of.
ent_id_	Unicode	The string ID of the named entity the token is an instance of.
sentiment	Float	A scalar value that indicates the positivity or negativity of the span.
_	Underscore	The user space for adding custom attribute extensions.
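
As an illustration of the attributes above, the following is a minimal sketch rather than a definitive reference. It assumes spaCy is installed and uses a blank English pipeline (tokenizer only), so attributes that need further annotation, such as sent, lemma_ and tensor, are left out. The sample sentence and the label name "EXAMPLE" are made up for this example.

```python
import spacy
from spacy.tokens import Span

# Assumption: a blank English pipeline, so no trained model is required.
nlp = spacy.blank("en")
doc = nlp("Give it back! He pleaded.")

# Slicing a Doc yields a Span; here it covers the tokens "it", "back" and "!".
span = doc[1:4]

print(span.text)                       # text of the span
print(repr(span.text_with_ws))         # text plus trailing whitespace, if any
print(span.start, span.end)            # token offsets within the Doc
print(span.start_char, span.end_char)  # character offsets within the Doc text
print(span.doc is doc)                 # the parent document

# A Span can also carry a label; "EXAMPLE" is a hypothetical label name.
labeled = Span(doc, 1, 4, label="EXAMPLE")
print(labeled.label_)                  # label as a string
print(labeled.label == doc.vocab.strings["EXAMPLE"])  # label as a hash value
```

The token offsets follow Python slice conventions, so end is the index just past the last token in the span.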

Methods

The following are the methods of the Span class −

Sr.No.	Method & Description
1	Span.__init__ To construct a Span object from the slice doc[start : end].
2	Span.__getitem__ To get the Token object at a particular position n, where n is an integer.
3	Span.__iter__ To iterate over the Token objects, from which the annotations can be easily accessed.
4	Span.__len__ To get the number of tokens in the span.
5	Span.similarity To make a semantic similarity estimate.
6	Span.merge To retokenize the document so that the span is merged into a single token. This method is deprecated as of spaCy 2.1 in favor of Doc.retokenize.

ClassMethods

The following are the classmethods of the Span class −

Sr.No.	Classmethod & Description
1	Span.set_extension To define a custom attribute on the Span.
2	Span.get_extension To look up a previously registered extension by name.
3	Span.has_extension To check whether an extension has been registered on the Span class or not.
4	Span.remove_extension To remove a previously registered extension on the Span class.
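
The classmethods above could be used together roughly as shown below. This is a hedged sketch: the extension name "is_city" and the sample sentence are hypothetical, and a blank English pipeline is assumed so that no trained model is needed.

```python
import spacy
from spacy.tokens import Span

# Span.has_extension / Span.set_extension: register a custom attribute once.
# "is_city" is a hypothetical extension name used only for this example.
if not Span.has_extension("is_city"):
    Span.set_extension("is_city", default=False)

nlp = spacy.blank("en")
doc = nlp("I like New York in autumn.")
span = doc[2:4]

# Custom attributes are accessed through the underscore (user space) attribute.
default_value = span._.is_city
span._.is_city = True
updated_value = span._.is_city

# Span.get_extension: returns a (default, method, getter, setter) tuple.
registered = Span.get_extension("is_city")

# Span.remove_extension: unregister the attribute again.
Span.remove_extension("is_city")
still_registered = Span.has_extension("is_city")

print(default_value, updated_value, registered, still_registered)
```

Extensions are registered on the Span class itself, so they apply to every Span created afterwards, not only to the one used here.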
