Software

We make a suite of AI developer tools that emphasize usability, performance and data privacy. We’re proud to be part of the best-in-class Python data science ecosystem. Most of our software is open-source, and the components that aren’t are just as privacy-conscious and developer-friendly. Unlike most AI companies, we don’t want your data: it never has to leave your servers if you don’t want it to.

spaCy

600m+ downloads
33k+ GitHub stars
139k+ GitHub projects
680+ contributors

spaCy is a free, open-source library for advanced Natural Language Processing (NLP) in Python. It’s designed specifically for production use and helps you build applications that process and “understand” large volumes of text. It can be used to build information extraction or natural language understanding systems, or to pre-process text for deep learning.

Prodigy

12k+ users
1000+ companies

Prodigy is a modern annotation tool for creating training data for machine learning models. It’s so efficient that data scientists can do the annotation themselves, enabling a new level of rapid iteration. Whether you’re working on entity recognition, intent detection or image classification, Prodigy can help you train and evaluate your models faster.

Ellf

private beta opened in 2026
10+ workflow modules

Ellf is an interactive AI-powered assistant for Natural Language Processing (NLP) and machine learning projects. It integrates with a coding assistant such as Claude Code and makes it proficient at planning and developing NLP solutions. The platform lets you plug in your own data-private cluster, makes it easy to run annotation tasks, auto-annotation agents, training experiments and more, and lets you collaborate on development with your team.

Demos

Demos and visualizations aren’t just eye candy — they’re an essential part of explaining and exploring AI technologies, especially during development. A good visualization lets you understand your model’s behavior and catch obvious problems early.

displaCy Dependency Visualizer

Visualize spaCy’s guess at the syntactic structure of a sentence. Arrows point from children to heads, and are labelled by their relation type.
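
As a rough illustration of that convention, the sketch below builds a tiny parse by hand and lists each arc from child to head. It assumes spaCy v3 is installed. The sentence, head indices and labels are written by hand for the example and are not the output of a trained pipeline.

```python
from spacy import displacy
from spacy.tokens import Doc
from spacy.vocab import Vocab

# Hand-annotated example parse (an assumption for illustration only).
words = ["She", "reads", "books"]
heads = [1, 1, 1]  # absolute index of each token's head; the root points to itself
deps = ["nsubj", "ROOT", "dobj"]
doc = Doc(Vocab(), words=words, heads=heads, deps=deps)

# One arc per non-root token: (child, relation label, head).
arcs = [(t.text, t.dep_, t.head.text) for t in doc if t.head.i != t.i]
for child, label, head in arcs:
    print(f"{child} --{label}--> {head}")

# displaCy renders the same arcs as SVG markup.
svg = displacy.render(doc, style="dep")
```

With a trained pipeline, the `Doc` would come from calling the pipeline on raw text instead of being constructed manually.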

Rule-based Matcher Explorer

Test spaCy’s rule-based Matcher by creating token patterns interactively and running them over your text. Explore how spaCy processes your text – and why your pattern matches, or doesn’t.
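
A minimal sketch of a token pattern of the kind the explorer works with, assuming spaCy v3 is installed. It uses a blank English pipeline (tokenizer only), so no trained model is needed. The pattern name and example text are made up for illustration.

```python
import spacy
from spacy.matcher import Matcher

nlp = spacy.blank("en")  # tokenizer only, no trained components
matcher = Matcher(nlp.vocab)

# Each dict describes one token: "hello", an optional punctuation mark, "world".
pattern = [
    {"LOWER": "hello"},
    {"IS_PUNCT": True, "OP": "?"},
    {"LOWER": "world"},
]
matcher.add("HELLO_WORLD", [pattern])

doc = nlp("Hello, world! Hello world!")
matches = [
    (nlp.vocab.strings[match_id], doc[start:end].text)
    for match_id, start, end in matcher(doc)
]
print(matches)
```

Patterns that rely on attributes such as part-of-speech tags or lemmas would need a pipeline with the corresponding trained components.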

displaCy Named Entity Visualizer

Visualize spaCy’s guess at the named entities in the document. You can filter the displayed types, to only show the annotations you’re interested in.
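
The filtering can be sketched with displaCy's `ents` option, assuming spaCy v3 is installed. The entity annotations below are assigned by hand for illustration and are not predictions from a trained pipeline.

```python
from spacy import displacy
from spacy.tokens import Doc
from spacy.vocab import Vocab

# Hand-annotated entities in IOB format (an assumption for illustration only).
words = ["Apple", "opened", "an", "office", "in", "Berlin"]
ents = ["B-ORG", "O", "O", "O", "O", "B-GPE"]
doc = Doc(Vocab(), words=words, ents=ents)

labels = [(ent.text, ent.label_) for ent in doc.ents]
print(labels)

# Only highlight ORG entities; other entity types are rendered as plain text.
html = displacy.render(doc, style="ent", options={"ents": ["ORG"]})
```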

sense2vec: Semantic Analysis of the Reddit Hivemind

We parsed every comment posted to Reddit in 2015 and 2019, and trained different word2vec models for each year.

spaCy v3.0 Trained Pipeline Explorer

Test and compare spaCy’s trained pipelines interactively with widgets for their components, powered by our Streamlit add-on, which you can use to build your own spaCy apps.

Prodigy Annotation Tool

Whether you’re working on entity recognition, intent detection or image classification, Prodigy can help you train and evaluate your models faster.

Open Source

🪐 projects

Project templates for end-to-end NLP workflows

🤖 curated-transformers

State-of-the-art transformers, brick by brick in PyTorch

💥 spacy-stanza

Use the latest Stanza (StanfordNLP) research models directly in spaCy

🦉 srsly

Modern high-performance serialization utilities for Python

👩‍💻 spacy-vscode

Visual Studio Code extension for spaCy

💥 preshed

Cython hash tables that assume keys are pre-hashed

🕊️ radicli

Radically lightweight command-line interfaces

🔌 prodigy-segment

Prodigy plugin to leverage Meta’s Segment-Anything model for image segmentation

🔌 prodigy-whisper

Prodigy plugin that leverages OpenAI’s Whisper model for audio transcription

🔮 thinc

Lightweight deep learning library powering spaCy

👩‍🏫 spacy-course

Advanced NLP with spaCy: A free online course

🧪 spacy-experimental

Cutting-edge experimental spaCy components and features

💥 cython-blis

Fast matrix-multiplication as a self-contained Python library – no system dependencies!

👩‍💻 prodigy-vscode

Visual Studio Code extension for Prodigy

📙 catalogue

Super lightweight function registries for your library

🤗 spacy-huggingface-hub

Upload spaCy pipelines to the Hugging Face Hub

🔌 prodigy-ann

Prodigy plugin to use approximate nearest neighbor techniques to help you annotate

🦙 spacy-llm

Integrating LLMs into structured NLP pipelines

🦆 sense2vec

Contextually-keyed word vectors

📚 spacy-layout

Process PDFs, Word documents and more with spaCy

🍏 thinc-apple-ops

Make Thinc faster on macOS by calling into Apple’s native Accelerate library.

🧬 jupyterlab-prodigy

JupyterLab extension for annotating data with Prodigy

🍬 confection

The sweetest config system for Python

🔌 prodigy-hf

Prodigy plugin to train Hugging Face models

🔌 prodigy-lunr

Prodigy plugin to use old-school string matching techniques to help you annotate

🛸 spacy-transformers

spaCy pipelines for pre-trained BERT and other transformers

👑 spacy-streamlit

spaCy building blocks and visualizers for Streamlit apps

🌸 floret

fastText + Bloom embeddings for compact, full-coverage vectors with spaCy

🍳 prodigy-recipes

Recipes for the Prodigy annotation tool

💥 cymem

Cython memory pool for RAII-style memory management

🦦 weasel

A small and easy workflow system

🔌 prodigy-pdf

Prodigy plugin to annotate PDF files

🔌 prodigy-evaluate

Prodigy plugin to compute evaluation metrics for spaCy pipelines