FLARE 💥

FLARE, stands for Forward-Looking Active REtrieval augmented generation is a generic retrieval-augmented generation method that actively decides when and what to retrieve using a prediction of the upcoming sentence to anticipate future content and utilize it as the query to retrieve relevant documents if it contains low-confidence tokens.

Official Paper

Here’s a code snippet for using FLARE with Langchain:

from langchain.vectorstores import LanceDB
from langchain.document_loaders import ArxivLoader
from langchain.chains import FlareChain
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.llms import OpenAI

llm = OpenAI()

# load dataset

# LanceDB retriever
vector_store = LanceDB.from_documents(doc_chunks, embeddings, connection=table)
retriever = vector_store.as_retriever()

# define flare chain
flare = FlareChain.from_llm(llm=llm,retriever=vector_store_retriever,max_generation_len=300,min_prob=0.45)

result = flare.run(input_text)