spaCy v3: Custom trainable relation extraction componentspaCy v3.0 features new transformer-based pipelines that get spaCy’s accuracy right up to the current state-of-the-art, and a new training config and workflow system to help you take projects from prototype to production. In this video, Sofie shows you how to apply all these new features when implementing a custom trainable component from scratch.
The Physical Traits that Define Men and Women in LiteratureThe PuddingAnalysis of physical traits most tied to gender in literature using spaCy.
Training a custom entity linking model with spaCyIn this video, we show you how to create a custom Entity Linking model in spaCy to disambiguate different mentions of the person “Emerson” to unique identifiers in a knowledge base.
Explosion in 2019: Our Year in ReviewAs 2019 draws to a close and we step into the 2020s, we thought we’d take a look back at the year and all we’ve accomplished. And we realized we had so much that we could give you a month-by-month rundown of everything that happened.
sense2vec reloaded: contextually-keyed word vectorsIn 2016 we trained a sense2vec model on the 2015 portion of the Reddit comments corpus, leading to a useful library and one of our most popular demos. That work is now due for an update. In this post, we present a new version and a demo NER project that we trained to usable accuracy in just a few hours.
spaCy meets Transformers: Fine-tune BERT, XLNet and GPT-2Huge transformer models like BERT, GPT-2 and XLNet have set a new standard for accuracy on almost every NLP leaderboard. You can now use these models in spaCy, via a new interface library we've developed that connects spaCy to Hugging Face's awesome implementations.
spaCy v3: Design concepts explained (behind the scenes)In this video, Ines shows you some of the new design concepts and explain what’s going on under the hood, how we’ve implemented them and most importantly, why.
Explosion in 2020: Our Year in ReviewWhile 2020 hasn’t been easy for anyone, at Explosion we’ve considered ourselves relatively fortunate in this most interesting year. We’ve always worked remotely, so we’ve been able to take both pride and comfort in continuing to ship good software. Here’s a look back at what we’ve been up to.
✨ prodigy v1.10.0Jun 16, 2020Dependency and relation annotation, audio, video, character-based NER & more
Image Captioning with Prodigy & PyTorchIn this video, we’ll show you how you can use Prodigy to script fully custom annotation workflows in Python, how to plug in your own machine learning models and how to mix and match different interfaces for your specific use case.
Interview with Ines MontaniSayak PaulInes talks about how she got into programming, how to stay up to date with the latest developments in our field and the ideas behind the PyCon India keynote “Let Them Write Code”.
Explosion awarded META Seal of RecognitionWe’re proud to accept the META Seal of Recognition at META-FORUM in Brussels, along with Mozilla. The META-FORUM is an international conference series backed by the European Union on powerful and innovative Language Technologies for a multilingual information society.
Millennials Kill EverythingThe PuddingAnalysis on media reporting of millenials using spaCy. From napkins to marriage to Applebees, just looking at headlines you’d guess that for the past decade the millennial generation’s been on a rampage.
✨ prodigy v1.8.0May 20, 2019Support for spaCy v2.1, basic auth, multi-user sessions, review workflow & more
Prodigy v1.10: Dependencies, relations, audio, video & moreVersion 1.10 of Prodigy includes tons of new features, including manual dependency and relation annotation, audio and video annotation, a new and improved image UI, new recipe callbacks, more settings for manual NER, plus various new config options and settings.
Training a Named Entity Recognition Model with Prodigy and Transfer LearningIn this video, we’ll show you how to use Prodigy to train a named entity recognition model from scratch, by taking advantage of semi-automatic annotation and modern transfer learning techniques.
PyCon Colombia Speaker InterviewKaro and Ines talked about getting into tech and machine learning, and what’s next for spaCy and our other tools.
Künstliche Intelligenz Beyond the HypeZündfunk Netzkongress (German)“Artificial intelligence” is everywhere in the headlines. Many futuristic-sounding things suddenly seem possible. It’s not easy to judge what all these technological advances mean. What is hype and what really works? And how should we imagine the future?
Using spaCy with Hugging Face TransformersPyCon IndiaTransformer models like BERT have set a new standard for accuracy on almost every NLP leaderboard. However, these models are very new, and most of the software ecosystem surrounding them is oriented towards the many opportunities for further research. In this talk, Matt describes how you can now use these models in spaCy to work on real problems and the many opportunities transfer learningfor production NLP, regardless of which software packages you choose.
Introducing spaCy v2.2Version 2.2 of the spaCy Natural Language Processing library is leaner, cleaner and even more user-friendly. In addition to new model packages and features for training, evaluation and serialization, we've made lots of bug fixes, improved debugging and error handling, and greatly reduced the size of the library on disk.
Intro to NLP with spaCy (1): Detecting programming languagesIn this new video series, data science instructor Vincent Warmerdam gets started with spaCy, an open-source library for Natural Language Processing in Python. His mission: building a system to automatically detect programming languages in large volumes of text.
spaCy IRL 2019: 2 days of NLP in BerlinWe were pleased to invite the spaCy community and other folks working on Natural Language Processing to Berlin this summer for a small and intimate event.
Introducing spaCy v3.0spaCy v3.0 is a huge release! It features new transformer-based pipelines that get spaCy's accuracy right up to the current state-of-the-art, and a new workflow system to help you take projects from prototype to production. It's much easier to configure and train your pipeline, and there are lots of new and improved integrations with the rest of the NLP ecosystem.
How We Analyzed Google’s Search ResultsThe MarkupUsing the Prodigy annotation tool, we created a user interface and a coder manual for two annotators to spot-check 741 stained images randomly sampled from our dataset.
Introducing spaCy v2.3spaCy now speaks Chinese, Japanese, Danish, Polish and Romanian! Version 2.3 of the spaCy Natural Language Processing library adds models for five new languages. We've also updated all 15 model families with word vectors and improved accuracy, while also decreasing model size and loading times for models with vectors.
The Future of NLP in PythonPyCon Colombia KeynoteThe data community came to Python for the language, and stayed for each other – once it got critical mass, it’s the ecosystem that counts. We’ve been proud to be part of that. So what does the future hold for NLP in Python?