BM25
BM25 (Wikipedia), also known as Okapi BM25, is a ranking function used in information retrieval systems to estimate the relevance of documents to a given search query.
The BM25Retriever uses the rank_bm25 package.
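To make the ranking function concrete, here is a minimal, self-contained sketch of a textbook BM25 score in plain Python. It is not the rank_bm25 implementation: the function name bm25_scores, the whitespace tokenization, the parameter values k1=1.5 and b=0.75, and the smoothed IDF variant are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score every document in `corpus` against `query` (illustrative sketch)."""
    docs = [doc.split() for doc in corpus]  # naive whitespace tokenization
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query.split():
            if term not in tf:
                continue
            # Smoothed IDF that stays non-negative (an assumed variant).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


corpus = ["foo", "bar", "world", "hello", "foo bar"]
scores = bm25_scores("foo", corpus)

# Sort documents from most to least relevant.
ranked = sorted(zip(corpus, scores), key=lambda pair: pair[1], reverse=True)
```

In this sketch, "foo" ranks above "foo bar": both contain the query term once, but the shorter document is penalized less by length normalization. Documents that do not contain the term score zero.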
%pip install --upgrade --quiet rank_bm25
from langchain_community.retrievers import BM25Retriever
Create a New Retriever with Texts
retriever = BM25Retriever.from_texts(["foo", "bar", "world", "hello", "foo bar"])
Create a New Retriever with Documents
You can also create a new retriever from a list of Document objects.
from langchain_core.documents import Document
retriever = BM25Retriever.from_documents(
    [
        Document(page_content="foo"),
        Document(page_content="bar"),
        Document(page_content="world"),
        Document(page_content="hello"),
        Document(page_content="foo bar"),
    ]
)
Use Retriever
We can now use the retriever!
result = retriever.invoke("foo")
result
[Document(page_content='foo', metadata={}),
 Document(page_content='foo bar', metadata={}),
 Document(page_content='hello', metadata={}),
 Document(page_content='world', metadata={})]