We’re pleased to publish v3.4 of the spaCy Natural Language Processing library. spaCy v3.4 brings typing and speed improvements along with new vectors for English pipelines and new trained pipelines for Croatian. This release also includes prebuilt Linux aarch64 wheels for all spaCy dependencies distributed by Explosion.
Typing improvements
spaCy v3.4 supports pydantic v1.9 and mypy 0.950+ through extensive updates to types in Thinc v8.1.
Speed improvements
- For the parser, use C saxpy/sgemm provided by the Ops implementation in order to use Accelerate through thinc-apple-ops.
- Improved speed of vector lookups.
- Improved speed for Example.get_aligned_parse and Example.get_aligned.
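
For readers unfamiliar with the BLAS routine names in the first item: saxpy computes a * x + y on single-precision vectors, and sgemm is single-precision matrix multiplication. The pure-Python sketch below only illustrates the arithmetic. It is not how Thinc or thinc-apple-ops implement or call these routines, and the function names here are illustrative stand-ins rather than any library's API.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def saxpy(a: float, x: Vector, y: Vector) -> Vector:
    """Return a * x + y elementwise.

    BLAS saxpy updates y in place; this sketch returns a new list instead.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [a * xi + yi for xi, yi in zip(x, y)]


def sgemm(A: Matrix, B: Matrix) -> Matrix:
    """Return the matrix product of A and B.

    This is the alpha=1, beta=0 special case of BLAS sgemm.
    """
    if any(len(row) != len(B) for row in A):
        raise ValueError("inner dimensions must match")
    n_cols = len(B[0])
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(n_cols)]
        for i in range(len(A))
    ]


print(saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))
print(sgemm([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]))
```

Optimized BLAS backends such as Accelerate perform the same computations with vectorized, hardware-tuned kernels, which is where the parser speedup comes from.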
Trained pipelines
New trained pipelines
v3.4 introduces new CPU/CNN pipelines for Croatian, which use the trainable lemmatizer and floret vectors. Due to the use of Bloom embeddings and subwords, the pipelines have compact vectors with no out-of-vocabulary words.
Package | UPOS | Parser LAS | NER F |
---|---|---|---|
hr_core_news_sm | 96.6 | 77.5 | 76.1 |
hr_core_news_md | 97.3 | 80.1 | 81.8 |
hr_core_news_lg | 97.5 | 80.4 | 83.0 |
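
The statement above that these pipelines have no out-of-vocabulary words follows from how hashed subword vectors work: a word is split into character n-grams, and each n-gram is hashed into several rows of a small, fixed-size table. The sketch below illustrates that idea under stated assumptions. The table size, n-gram range, number of hashes and the use of `zlib.crc32` are all illustrative choices, not floret's actual settings or hash function.

```python
import zlib
from typing import List

N_ROWS = 1000   # assumed table size, far smaller than a full vocabulary
N_HASHES = 2    # assumed number of hash functions per n-gram


def subword_ngrams(word: str, min_n: int = 3, max_n: int = 4) -> List[str]:
    """Return the boundary-marked word plus its character n-grams."""
    marked = f"<{word}>"
    grams = [marked]
    for n in range(min_n, max_n + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


def rows_for(word: str) -> List[int]:
    """Map every n-gram of a word to N_HASHES rows of the table."""
    rows = []
    for gram in subword_ngrams(word):
        for seed in range(N_HASHES):
            # Prefixing the seed simulates several independent hash functions.
            data = f"{seed}:{gram}".encode("utf8")
            rows.append(zlib.crc32(data) % N_ROWS)
    return rows


# Any string, including one never seen in training, maps to valid rows.
print(rows_for("Zagreb"))
```

A word's vector would then be built by combining the selected rows, for example by averaging them, so every input string receives a vector even though the table stays compact.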
Pipeline updates
All CNN pipelines have been extended with whitespace augmentation.
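
As a rough illustration of what whitespace augmentation means, the sketch below inserts an extra whitespace token at a random position in a tokenized text, producing a training variant with irregular spacing. The helper name and its parameters are hypothetical and this is not spaCy's augmenter; a real augmenter would also need to adjust the annotations (tags, heads, entity offsets) to match the new tokens.

```python
import random
from typing import List


def add_whitespace_token(
    words: List[str], rng: random.Random, whitespace: str = "\n\n"
) -> List[str]:
    """Return a copy of words with one whitespace token inserted at a random position."""
    position = rng.randint(0, len(words))  # inclusive, so appending at the end is possible
    return words[:position] + [whitespace] + words[position:]


rng = random.Random(0)  # seeded so repeated runs produce the same variant
original = ["This", "is", "a", "sentence", "."]
augmented = add_whitespace_token(original, rng)
print(augmented)
```

Training on such variants alongside the original examples is intended to make a pipeline less sensitive to stray spaces and newlines in real-world text.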
The English CNN pipelines have new word vectors, which improve NER performance and add words such as “AirTags”, “Brexit”, “covid” and “doomscrolling”:
Package | Model Version | TAG | Parser LAS | NER F |
---|---|---|---|---|
en_core_web_md | v3.3.0 | 97.3 | 90.1 | 84.6 |
en_core_web_md | v3.4.0 | 97.2 | 90.3 | 85.5 |
en_core_web_lg | v3.3.0 | 97.4 | 90.1 | 85.3 |
en_core_web_lg | v3.4.0 | 97.3 | 90.2 | 85.6 |
New in the spaCy universe
Many cool new plugins, extensions, pipelines and tutorials have been added to the spaCy universe since v3.3:
Aim-spacy | An Aim-based spaCy experiment tracker. |
Asent | Fast, flexible and transparent sentiment analysis. |
spaCy fishing | Named entity disambiguation and linking on Wikidata in spaCy with Entity-Fishing. |
spacy-report | Generates interactive reports for spaCy models. |