The AI Revolution Will Not Be Monopolized: Behind the scenes (Open Source ML Mixer). A more in-depth look at the concepts and ideas, academic literature, related experiments and preliminary results for distilled task-specific models.
Zero-Shot NER with GLiNER and spaCy (Python Tutorials for Digital Humanities). A tutorial by WJB Mattingly on integrating the generalist GLiNER model for Named Entity Recognition with spaCy's versatile NLP environment.
Herding LLMs Towards Structured NLP (Global AI Conference). This talk shows how we integrate LLMs into spaCy, leveraging its modular and customizable framework. This allows for cheaper, faster and more robust NLP, driven by cutting-edge LLMs, without compromising on structured, validated data.
How many Labelled Examples do you need for a BERT-sized Model to Beat GPT-4 on Predictive Tasks? (Generative AI Summit). How does in-context learning compare to supervised approaches on predictive tasks? How many labelled examples do you need on different problems before a BERT-sized model can beat GPT-4 in accuracy? The answer might surprise you: models with fewer than 1B parameters are actually very good at classic predictive NLP, while in-context learning struggles on many problem shapes.
Panel: Large Language Models (Big PyData BBQ). With Ines, Alejandro Saucedo (Zalando, Institute for Ethical AI & ML), Alina Lehnhard (Cerence), Michael Gerz (Heidelberg University) and Alexander CS Hendorf (Königsweg).
✨ prodigy v1.12.0 (Jul 5, 2023). LLM-assisted workflows for annotation and prompt engineering, plus task routing for multi-annotator setups.
Against LLM maximalism. LLMs are not a direct solution to most of the NLP use cases companies have been working on. They are extremely useful, but if you want to deliver reliable software you can improve over time, you can't just write a prompt and call it a day. Once you're past prototyping and want to deliver the best system you can, supervised learning will often give you better efficiency, accuracy and reliability.
The AI Revolution Will Not Be Monopolized: How open-source beats economies of scale, even for LLMs (QCon London).
Constructing a knowledge base with spaCy and spacy-llm (MantisNLP Blog). This blog post shows how to use spaCy and LLMs to extract entities and relationships from text and quickly tackle the complex problem of constructing a knowledge graph from a corpus.
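An entity-and-relation pipeline of this kind is declared in spaCy's config format via spacy-llm. A rough sketch, assuming OpenAI access is configured in the environment; the label names are illustrative and the exact task/model versions should be checked against the spacy-llm docs:

```ini
[nlp]
lang = "en"
pipeline = ["llm_ner", "llm_rel"]

[components.llm_ner]
factory = "llm"

[components.llm_ner.task]
@llm_tasks = "spacy.NER.v3"
labels = ["PERSON", "ORG", "PRODUCT"]

[components.llm_ner.model]
@llm_models = "spacy.GPT-4.v2"

[components.llm_rel]
factory = "llm"

[components.llm_rel.task]
@llm_tasks = "spacy.REL.v1"
labels = ["works_for", "produces"]

[components.llm_rel.model]
@llm_models = "spacy.GPT-4.v2"
```

The extracted entities and relations can then be accumulated into graph nodes and edges to build the knowledge base.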
Prodigy in 2023: LLMs, task routers, QA and plugins. We shipped a ton of updates to Prodigy this year across the v1.12, v1.13 and v1.14 releases, so we decided to write a post about them.
Identifying Signs and Symptoms of Urinary Tract Infection from Emergency Department Clinical Notes Using Large Language Models. Iscoe, Socrates, Gilson, Chi, Li, Huang, Kearns, Perkins, Khandjian, Taylor (2023). "For annotation we employed Prodigy, a scriptable annotation tool designed to maximize efficiency, enabling data scientists to perform the annotation tasks themselves and facilitating rapid iterative development in natural language processing (NLP) projects."
Models as annotators in Prodigy. How to use models and LLMs as annotators to find disagreements and prioritize examples to annotate first.
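As a sketch of the underlying idea, not Prodigy's actual API: score each unlabelled example by how much two automatic annotators disagree, and queue the most contested examples first. All names below are hypothetical.

```python
# Minimal sketch of disagreement-based example selection, in the spirit of
# "models as annotators": given two automatic annotators (e.g. an LLM and a
# trained model), surface the examples they disagree on most so a human
# annotates those first. Function names are illustrative, not Prodigy's API.

def disagreement(labels_a, labels_b):
    """Fraction of token positions where two label sequences differ."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must be aligned")
    if not labels_a:
        return 0.0
    return sum(a != b for a, b in zip(labels_a, labels_b)) / len(labels_a)

def rank_by_disagreement(examples, annotator_a, annotator_b):
    """Return examples sorted by disagreement, most contested first."""
    return sorted(
        examples,
        key=lambda ex: disagreement(annotator_a(ex), annotator_b(ex)),
        reverse=True,
    )
```

With real models, `annotator_a` and `annotator_b` might wrap an LLM-backed pipeline and a trained spaCy model that each emit one label per token.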
🦙 spacy-llm v0.3.0 (Jun 14, 2023). Cohere, Anthropic, OpenLLaMA, StableLM, logging, a Streamlit demo and a lemmatization task.
Newsletter May 2023. We got so much amazing feedback from the spaCy user survey, thank you all for your contributions! The most requested feature was spaCy integration with LLMs, which is why we’re so excited to announce spacy-llm!
The AI Revolution Will Not Be Monopolized: How open-source beats economies of scale, even for LLMs (PyCon Lithuania Keynote). With the latest advancements in NLP and LLMs, and big companies like OpenAI dominating the space, many people wonder: are we heading further into a black-box era, with larger and larger models obscured behind APIs controlled by big tech monopolies?
T-RAG: Lessons from the LLM Trenches. Fatehkia, Lucas, Chawla (2024). "An important application area is question answering over private enterprise documents where the main considerations are data security, which necessitates applications that can be deployed on-prem, [and] limited computational resources. [...] In addition to retrieving contextual documents, we use the spaCy library with custom rules to detect named entities from the organization."
State-of-the-Art Transformer Pipelines in spaCy (aiGrunn). In this talk, we show how you can use transformer models, from pretrained models such as XLM-RoBERTa to large language models like Llama 2, to create state-of-the-art pipelines for text annotation tasks such as named entity recognition.
Newsletter September 2023. The latest edition of our newsletter, featuring our plans for premium models, LLMs, chain-of-thought prompting, upcoming events and talks, and exciting new Prodigy features. Plus exclusive discounts!
How to Host Your Own API of Open Language Models For Free. Powered by Explosion’s curated-transformers, FastAPI and ngrok.
Designing for tomorrow’s programming workflows (PyCon Lithuania). Modern editors and AI-powered tools like GitHub Copilot and ChatGPT are changing how people program, transforming our workflows and developer productivity. But what does this mean for how we should write and design our APIs and libraries?
spacy-llm: From quick prototyping with LLMs to more reliable and efficient NLP solutions (AstraZeneca NLP Community of Practice). LLMs are paving the way for fast prototyping of NLP applications. Here, Sofie showcases how to build a structured NLP pipeline to mine clinical trials using spaCy and spacy-llm. Moving beyond a fast prototype, she offers pragmatic solutions to make the pipeline more reliable and cost-efficient.
Half hour of labeling power: Can we beat GPT? (PyData NYC). Large Language Models (LLMs) offer a lot of value for modern NLP and can typically achieve surprisingly good accuracy on predictive tasks. But can we do even better than that? In this workshop we show how to use LLMs at development time to create high-quality datasets and train specific, smaller, private and more accurate models for your business problems.
MP Interests Tracker: Utilising GenAI to uncover insights in the UK Register of Financial Interests (JournalismAI Blog). A project from teams at The Times and the BBC using spacy-llm to make complex financial interests data more accessible.
🦙 spacy-llm v0.5.0 (Sep 8, 2023). Improved user API and novel chain-of-thought prompting for more accurate NER.
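The chain-of-thought NER prompting in this release is exposed as the `spacy.NER.v3` task. A minimal, illustrative config sketch (the labels are placeholders; see the spacy-llm docs for the full set of options and model names):

```ini
[components.llm]
factory = "llm"

[components.llm.task]
@llm_tasks = "spacy.NER.v3"
labels = ["DISH", "INGREDIENT"]

[components.llm.model]
@llm_models = "spacy.GPT-3-5.v1"
```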
Large Language Models: From Prototype to Production (EuroPython Keynote). Large Language Models (LLMs) have shown some impressive capabilities, and their impact is the topic of the moment. In this talk, Ines presents visions for NLP in the age of LLMs and a pragmatic, practical approach to using LLMs to ship more successful NLP projects from prototype to production today.
Large Disagreement Modelling. “In this blog post I’d like to talk about large language models. There’s a bunch of hype, sure, but there’s also an opportunity to revisit one of my favourite machine learning techniques: disagreement.”