Software

We make a suite of AI developer tools that emphasize usability, performance and data privacy. We’re proud to be part of the best-in-class Python data science ecosystem. Most of our software is open-source, and the components that aren’t are just as privacy-conscious and developer-friendly. Unlike most AI companies, we don’t want your data: it never has to leave your servers if you don’t want it to.

spaCy

  • 590m+ downloads
  • 33k+ GitHub stars
  • 139k+ GitHub projects
  • 680+ contributors
Website | GitHub

spaCy is a free, open-source library for advanced Natural Language Processing (NLP) in Python. It’s designed specifically for production use and helps you build applications that process and “understand” large volumes of text. It can be used to build information extraction or natural language understanding systems, or to pre-process text for deep learning.

Prodigy

  • 12k+ users
  • 1000+ companies
Website | Live Demo

Prodigy is a modern annotation tool for creating training data for machine learning models. It’s so efficient that data scientists can do the annotation themselves, enabling a new level of rapid iteration. Whether you’re working on entity recognition, intent detection or image classification, Prodigy can help you train and evaluate your models faster.

Ellf

  • private beta opened in 2026
  • 10+ workflow modules
Website | Waitlist

Ellf is an interactive AI-powered assistant for Natural Language Processing (NLP) and machine learning projects. It integrates with coding assistants such as Claude Code and makes them proficient at planning and developing NLP solutions. The platform lets you plug in your own data-private cluster, makes it easy to run annotation tasks, auto-annotation agents, training experiments and more, and lets you collaborate on development with your team.

Open Source

GitHub