Introducing spaCy v3.6

Jul 7, 2023
the spaCy team

We’re excited to release v3.6 of the spaCy Natural Language Processing library. spaCy v3.6 adds the span finder component to the core spaCy library and introduces trained pipelines for Slovenian.

SpanFinder component

The SpanFinder component identifies potentially overlapping, unlabeled spans by predicting which tokens start a span and which tokens end one. It is intended for use in combination with a component like SpanCategorizer, which can further filter or label the spans. See our Spancat blog post for a more detailed introduction to the span finder design.
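
To make the idea concrete, here is a minimal sketch in plain Python of how candidate spans can be assembled from per-token start and end scores. It is an illustration only, not spaCy's implementation: the function name `candidate_spans`, the score lists, and the `threshold`, `min_length` and `max_length` parameters are all assumptions chosen for this example.

```python
from typing import List, Tuple


def candidate_spans(
    start_scores: List[float],
    end_scores: List[float],
    threshold: float = 0.5,
    min_length: int = 1,
    max_length: int = 5,
) -> List[Tuple[int, int]]:
    """Pair every likely start token with every likely end token.

    Returns (start, end) token offsets with an exclusive end. The spans
    may overlap and carry no label.
    """
    if len(start_scores) != len(end_scores):
        raise ValueError("start_scores and end_scores must have the same length")
    # Tokens whose score clears the threshold count as span boundaries.
    starts = [i for i, score in enumerate(start_scores) if score >= threshold]
    ends = [i for i, score in enumerate(end_scores) if score >= threshold]
    spans = []
    for start in starts:
        for end in ends:
            length = end - start + 1  # the end token is part of the span
            if min_length <= length <= max_length:
                spans.append((start, end + 1))
    return spans


# Hypothetical scores for the five tokens of "New York City is big".
start_scores = [0.9, 0.2, 0.1, 0.1, 0.1]
end_scores = [0.1, 0.8, 0.7, 0.1, 0.1]
print(candidate_spans(start_scores, end_scores))  # [(0, 2), (0, 3)]
```

Both "New York" and "New York City" come out as candidates, which is why a downstream component is needed to filter or label them.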

To train a pipeline with span_finder + spancat, add span_finder (and its tok2vec or transformer if required) to [training.annotating_components] so that the spancat component can be trained directly from its predictions:

[nlp]
pipeline = ["tok2vec","span_finder","spancat"]

[training]
annotating_components = ["tok2vec","span_finder"]
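
A quick sanity check on such a config can be sketched with the Python standard library alone. The snippet below assumes the list values are valid JSON, as they are in the excerpt above. The helper `check_annotating_components` is hypothetical and not part of spaCy.

```python
import json
from configparser import ConfigParser

CONFIG_TEXT = """
[nlp]
pipeline = ["tok2vec","span_finder","spancat"]

[training]
annotating_components = ["tok2vec","span_finder"]
"""


def check_annotating_components(config_text: str, trained: str = "spancat") -> list:
    """Return annotating components that are missing from the pipeline
    or that do not run before the `trained` component."""
    parser = ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # The list values in this excerpt are JSON-compatible.
    pipeline = json.loads(parser["nlp"]["pipeline"])
    annotating = json.loads(parser["training"]["annotating_components"])
    target = pipeline.index(trained)
    problems = []
    for name in annotating:
        if name not in pipeline or pipeline.index(name) >= target:
            problems.append(name)
    return problems


print(check_annotating_components(CONFIG_TEXT))  # []
```

An empty list means every annotating component is in the pipeline and runs before spancat, so its predictions are available when spancat is trained.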

Language updates

Initial support for Malay.
Support for noun chunks and other updates for Latin.

Read more about all the improvements, updates and bug fixes in the release notes.

Trained pipelines

New trained pipelines

v3.6 introduces new pipelines for Slovenian, which use the trainable lemmatizer and floret vectors.

Package	UPOS	Parser LAS	NER F
`sl_core_news_sm`	96.9	82.1	62.9
`sl_core_news_md`	97.6	84.3	73.5
`sl_core_news_lg`	97.7	84.3	79.0
`sl_core_news_trf`	99.0	91.7	90.0

Pipeline updates

The English pipelines have been updated to improve handling of contractions with various apostrophes and to lemmatize “get” as a passive auxiliary.

New additions to spaCy universe

Many cool new plugins, extensions and pipelines have been added to the spaCy universe since v3.5:


LatinCy	Synthetic trained spaCy pipelines for Latin NLP.
parsigs	Structuring prescriptions text made simple using spaCy.
Sentimental Onix	Use ONNX for sentiment models.
spaCysee	Visualize spaCy’s Dependency Parsing, POS tagging, and morphological analysis.
spaCy-SetFit	An easy and intuitive approach to using SetFit in combination with spaCy.
spaCy Visual Studio Code Extension	Work with spaCy’s config files in VS Code.
spacy-wasm	spaCy in the browser using WebAssembly.
SpanMarker	Effortless state-of-the-art NER in spaCy.
Vetiver	Version, share, deploy, and monitor models.