Introducing spaCy v3.6

· by the spaCy team · ~4 min. read

We’re excited to release v3.6 of the spaCy Natural Language Processing library. spaCy v3.6 adds the span finder component to the core spaCy library and introduces trained pipelines for Slovenian.

SpanFinder component

The SpanFinder component finds potentially overlapping, unlabeled spans by identifying span start and end tokens. It is intended to be used together with a component like SpanCategorizer, which can further filter or label the spans. See our Spancat blog post for a more detailed introduction to the span finder design.
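
To make the start/end idea concrete, here is a minimal pure-Python sketch. It is not spaCy's implementation: the function name `candidate_spans`, the `threshold` and `max_length` parameters, and the scores are all hypothetical, chosen only to illustrate how pairing likely start tokens with likely end tokens yields overlapping, unlabeled candidates.

```python
from typing import List, Tuple


def candidate_spans(
    start_scores: List[float],
    end_scores: List[float],
    threshold: float = 0.5,  # hypothetical cut-off for "likely" tokens
    max_length: int = 3,     # hypothetical cap on span length in tokens
) -> List[Tuple[int, int]]:
    """Pair every likely start token with every likely end token.

    Returns (start, end) token offsets with an exclusive end.
    """
    starts = [i for i, s in enumerate(start_scores) if s >= threshold]
    ends = [i for i, s in enumerate(end_scores) if s >= threshold]
    spans = []
    for start in starts:
        for end in ends:
            length = end - start + 1
            # Keep only well-formed spans that are not too long.
            if 1 <= length <= max_length:
                spans.append((start, end + 1))
    return spans


# Made-up scores for the tokens: "New", "York", "City", "is", "big"
start_scores = [0.9, 0.7, 0.1, 0.0, 0.2]
end_scores = [0.1, 0.8, 0.9, 0.0, 0.1]
print(candidate_spans(start_scores, end_scores))
```

The candidates overlap freely and carry no labels, which is why a downstream component such as SpanCategorizer is needed to filter or label them.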

To train a pipeline with span_finder + spancat, add span_finder (and its tok2vec or transformer if required) to [training.annotating_components] so that the spancat component can be trained directly from its predictions:

[nlp]
pipeline = ["tok2vec","span_finder","spancat"]

[training]
annotating_components = ["tok2vec","span_finder"]
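
As a quick sanity check of a fragment like the one above, the following sketch parses it with Python's standard library and reports common mistakes. It is an illustration only: spaCy uses its own config system, and the helper name `check_annotating_components` and its messages are hypothetical. It assumes the two list values are written as JSON-compatible lists, as they are here.

```python
import configparser
import json

CONFIG_TEXT = """
[nlp]
pipeline = ["tok2vec","span_finder","spancat"]

[training]
annotating_components = ["tok2vec","span_finder"]
"""


def check_annotating_components(config_text: str) -> list:
    """Return a list of problems found in the config fragment (hypothetical helper)."""
    # Disable interpolation so that special characters in values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    pipeline = json.loads(parser["nlp"]["pipeline"])
    annotating = json.loads(parser["training"]["annotating_components"])

    problems = []
    for name in annotating:
        if name not in pipeline:
            problems.append(f"{name} is not in the pipeline")
    if "spancat" in pipeline and "span_finder" in pipeline:
        # spancat can only train on span_finder predictions if they are annotated.
        if "span_finder" not in annotating:
            problems.append("span_finder is not an annotating component")
        if pipeline.index("span_finder") > pipeline.index("spancat"):
            problems.append("span_finder must run before spancat")
    return problems


print(check_annotating_components(CONFIG_TEXT))
```

An empty list means the fragment passes these particular checks; it says nothing about the rest of a full training config.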

Language updates

  • Initial support for Malay.
  • Support for noun chunks and other updates for Latin.

Trained pipelines

New trained pipelines

v3.6 introduces new pipelines for Slovenian, which use the trainable lemmatizer and floret vectors.

Package            UPOS   Parser LAS   NER F
sl_core_news_sm    96.9   82.1         62.9
sl_core_news_md    97.6   84.3         73.5
sl_core_news_lg    97.7   84.3         79.0
sl_core_news_trf   99.0   91.7         90.0

Pipeline updates

The English pipelines have been updated to improve handling of contractions with various apostrophes and to lemmatize “get” as a passive auxiliary.

New additions to spaCy universe

Many cool new plugins, extensions and pipelines have been added to the spaCy universe since v3.5:

  • LatinCy: Synthetic trained spaCy pipelines for Latin NLP.
  • parsigs: Structuring prescriptions text made simple using spaCy.
  • Sentimental Onix: Use ONNX for sentiment models.
  • spaCysee: Visualize spaCy’s dependency parsing, POS tagging, and morphological analysis.
  • spaCy-SetFit: An easy and intuitive approach to use SetFit in combination with spaCy.
  • spaCy Visual Studio Code Extension: Work with spaCy’s config files in VS Code.
  • spacy-wasm: spaCy in the browser using WebAssembly.
  • SpanMarker: Effortless state-of-the-art NER in spaCy.
  • Vetiver: Version, share, deploy, and monitor models.

About the authors

  • Matthew Honnibal (CTO, Founder)
  • Ines Montani (CEO, Founder)
  • Sofie Van Landeghem (Machine Learning Engineer, spaCy Lead)
  • Adriane Boyd (Machine Learning Engineer)
  • Raphael Mitsch (Machine Learning Engineer)
  • Ákos Kádár (Machine Learning Engineer)
  • Daniël de Kok (Machine Learning Engineer)
  • Madeesh Kannan (Machine Learning Engineer)
  • Victoria Slocum (Developer Advocate)
  • Basile Dura (Software Engineer)
  • Vinit Ravishankar (Machine Learning Engineer)