We make a suite of AI developer tools that emphasise usability, performance and data privacy. We’re proud to be part of the best-in-class Python data science ecosystem. Most of our software is open-source, and the components that aren’t are just as privacy-conscious and developer-friendly. Unlike most AI companies, we don’t want your data: it never has to leave your servers if you don’t want it to.

Industrial-strength Natural Language Processing

spaCy is a free, open-source library for advanced Natural Language Processing (NLP) in Python. It's designed specifically for production use and helps you build applications that process and "understand" large volumes of text. It can be used to build information extraction or natural language understanding systems, or to pre-process text for deep learning.

Try spaCy →

Radically efficient machine teaching

Prodigy is an annotation tool so efficient that data scientists can do the annotation themselves, enabling a new level of rapid iteration. Whether you're working on entity recognition, intent detection or image classification, Prodigy can help you train and evaluate your models faster. Stream in your own examples, update your model in real time and chain models together to build more complex systems.

Try Prodigy →

Next-generation Machine Learning for NLP

Thinc is the machine learning library powering spaCy. It is a practical toolkit for implementing models that follow the "Embed, encode, attend, predict" architecture. It's designed to be easy to install, efficient on CPU and optimised for NLP and deep learning with text – in particular, hierarchically structured input and variable-length sequences.

Read more →


Web-based visualisers for NLP models

Demos and visualisations aren't just eye candy: they're an essential part of explaining and exploring AI technologies, especially during development. A good visualisation lets you understand your model's behaviour and catch obvious problems early. Our demos include visualisations for spaCy's dependency trees, entity recognition and similarity models.

Try the demos →

Other open-source projects
