We make a suite of AI developer tools that emphasise usability, performance and data privacy. We’re proud to be part of the best-in-class Python data science ecosystem. Most of our software is open-source, and the components that aren’t are just as privacy-conscious and developer-friendly. Unlike most AI companies, we don’t want your data: it never has to leave your servers if you don’t want it to.
spaCy is a free, open-source library for advanced Natural Language Processing (NLP) in Python. It's designed specifically for production use and helps you build applications that process and "understand" large volumes of text. It can be used to build information extraction or natural language understanding systems, or to pre-process text for deep learning.
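As a rough illustration of the pre-processing use case, here is a minimal sketch that assumes spaCy is installed. It uses a blank English pipeline (tokenizer only), so no trained model is needed; the example sentence is made up.

```python
import spacy

# A blank English pipeline contains only the tokenizer,
# so nothing has to be downloaded beyond spaCy itself.
nlp = spacy.blank("en")

# Illustrative sentence, not taken from any real dataset.
doc = nlp("Apple is opening 3 new offices in Berlin.")

# Token texts, e.g. as input for a downstream model.
tokens = [token.text for token in doc]

# Lexical attributes work without a trained model.
numbers = [token.text for token in doc if token.like_num]

print(tokens)
print(numbers)
```

Named entities, part-of-speech tags and dependency parses need a trained pipeline, which is loaded with `spacy.load` instead of `spacy.blank`.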
Prodigy is an annotation tool so efficient that data scientists can do the annotation themselves, enabling a new level of rapid iteration. Whether you're working on entity recognition, intent detection or image classification, Prodigy can help you train and evaluate your models faster. Stream in your own examples, update your model in real time and chain models together to build more complex systems.
Prodigy Scale brings a new web-based interface to our successful Prodigy annotation system, to let you coordinate large or long-running projects. Manage a team of in-house annotators, review their progress, measure inter-annotator consistency, and trigger model training automatically – all while keeping your data fully private, and with full support for custom recipes and custom ETL logic.
Thinc is the machine learning library powering spaCy. It is a practical toolkit for implementing models that follow the "Embed, encode, attend, predict" architecture. It's designed to be easy to install, efficient on CPU and optimised for NLP and deep learning with text – in particular, hierarchically structured input and variable-length sequences.
Demos and visualisations aren't just eye candy: they're an essential part of explaining and exploring AI technologies, especially during development. A good visualisation lets you understand your model's behaviour and catch obvious problems early. Our demos include visualisations for spaCy's dependency trees, entity recognition and similarity models.
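For a sense of what an entity visualisation involves, the sketch below uses spaCy's built-in displaCy visualiser in manual mode, assuming spaCy is installed. The sentence, character offsets and labels are hand-written for illustration and do not come from a model.

```python
from spacy import displacy

# Hand-labelled example (an assumption for illustration):
# character offsets mark where each entity starts and ends.
example = {
    "text": "Explosion makes spaCy and Prodigy.",
    "ents": [
        {"start": 0, "end": 9, "label": "ORG"},
        {"start": 16, "end": 21, "label": "PRODUCT"},
        {"start": 26, "end": 33, "label": "PRODUCT"},
    ],
    "title": None,
}

# manual=True tells displaCy to render the dict directly instead of a Doc;
# jupyter=False makes it return the HTML markup as a string.
html = displacy.render(example, style="ent", manual=True, jupyter=False)

print(html[:200])
```

The returned markup can be written to an `.html` file and opened in a browser to check at a glance whether the highlighted spans look right.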
Other open-source projects