Embeddings

Our embeddings API lets you easily send arbitrary data to Pinecone to be vectorized, including:

Text snippets
Text files (.docx, .txt, .pdf, .csv)
Multiple files via URL
Webpage text, fetched by crawling a URL (todo)

Embeddings and RAG: read more about what embeddings are for and why they're useful.

Config

How text is embedded is configured in the configs/embeddings.yml file, where you can set:

The embedding model you want to use (text-embedding-3-large or text-embedding-3-small).
The embedding dimension (optional)

Leave the embedding dimension blank and it will be added for you based on which embedding model you choose.
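
The defaulting rule above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the EmbeddingConfig type and the resolveDimension function are hypothetical names, and the fallback values (3072 and 1536) are the native output sizes of the two OpenAI models.

```typescript
// Hypothetical shape of the values read from configs/embeddings.yml.
type EmbeddingModel = "text-embedding-3-large" | "text-embedding-3-small";

interface EmbeddingConfig {
  model: EmbeddingModel;
  dimension?: number; // optional: may be left blank in the config file
}

// Native output sizes of the two models.
const DEFAULT_DIMENSIONS: Record<EmbeddingModel, number> = {
  "text-embedding-3-large": 3072,
  "text-embedding-3-small": 1536,
};

// Use the configured dimension if present, otherwise fall back to the model default.
function resolveDimension(config: EmbeddingConfig): number {
  return config.dimension ?? DEFAULT_DIMENSIONS[config.model];
}
```

With this sketch, a config that only names text-embedding-3-small resolves to 1536, while an explicit dimension always wins.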

Which model should I use?

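There is a trade-off between the large and small models. text-embedding-3-large will likely provide more accurate retrieval results, but its vectors are twice the size by default (3072 dimensions versus 1536 for text-embedding-3-small), so it will fill up your Pinecone index twice as fast.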

We suggest starting with the text-embedding-3-small model and changing to large later if you’re not getting good results.
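
As a rough worked example of the storage trade-off, the sketch below estimates raw vector size. It assumes each dimension is stored as a 4-byte float and ignores metadata and index overhead; the function name and the vector count are purely illustrative.

```typescript
const BYTES_PER_FLOAT32 = 4;

// Raw storage for `vectorCount` vectors of the given dimension (no metadata, no overhead).
function estimateVectorBytes(dimension: number, vectorCount: number): number {
  return dimension * BYTES_PER_FLOAT32 * vectorCount;
}

const smallBytes = estimateVectorBytes(1536, 100000); // 614,400,000 bytes
const largeBytes = estimateVectorBytes(3072, 100000); // 1,228,800,000 bytes
console.log(largeBytes / smallBytes); // 2
```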

File uploads

When a file is uploaded via the embeddings API, it is also stored in your S3 storage in case it needs to be reindexed later.

URL Crawling

URLs are crawled in a very simple way: the content of the page is fetched with a basic GET request and reduced to its textual content, which is then embedded.

Our crawler doesn’t execute any JavaScript, so it won’t work for pages that render their content client-side. If you need that, you could implement a full webpage crawler with jsdom, Puppeteer, or Playwright.
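
The simple crawl described above might look roughly like the sketch below. It is not the project's actual crawler: htmlToText, crawlPage, and the FetchHtml type are hypothetical names, and the regex-based text extraction is a deliberately naive stand-in for a real HTML parser. The fetch step is passed in as a function so the sketch stays independent of any particular HTTP client.

```typescript
// Hypothetical helper: reduce an HTML document to its visible text.
// Regex stripping is naive and only suitable for simple, static pages.
function htmlToText(html: string): string {
  return html
    .replace(/<!--[\s\S]*?-->/g, " ") // drop comments
    .replace(/<(script|style|noscript)\b[^>]*>[\s\S]*?<\/\1>/gi, " ") // drop non-visible blocks
    .replace(/<[^>]+>/g, " ") // drop remaining tags
    .replace(/&nbsp;/g, " ")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, "&") // decode last so "&amp;lt;" becomes "&lt;", not "<"
    .replace(/\s+/g, " ") // collapse whitespace
    .trim();
}

// Assumed signature for whatever performs the GET request and returns the body.
type FetchHtml = (url: string) => Promise<string>;

// Fetch a page and return only its text, ready to be embedded.
async function crawlPage(url: string, fetchHtml: FetchHtml): Promise<string> {
  const html = await fetchHtml(url);
  return htmlToText(html);
}
```

In Node 18 or later, the second argument could be as simple as `(u) => fetch(u).then((r) => r.text())`. Because no scripts run, any text a page injects with JavaScript will be missing from the result.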

Context IDs

Every time you add something to the embeddings database, we return a contextId property. Pass this contextId in subsequent queries to use the embeddings it refers to.

If you upload multiple documents at once with the /embeddings/urls endpoint, they are all given the same contextId so that they can be searched together.

If you upload documents from the Example App Knowledge Base page, you can also specify a contextId with each upload.
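
To illustrate how a client might work with a shared contextId, here is a hedged sketch. Only the /embeddings/urls path and the contextId property come from the text above; the request body field (urls), the response type, and both helper names are assumptions made for the example.

```typescript
// Assumed request body for POST /embeddings/urls; the `urls` field name is hypothetical.
interface EmbedUrlsRequest {
  urls: string[];
}

// Assumed response shape; only `contextId` is documented.
interface EmbedResponse {
  contextId: string;
}

// Trim, drop blanks, and de-duplicate: every URL in one call ends up under the same contextId.
function buildEmbedUrlsRequest(urls: string[]): EmbedUrlsRequest {
  const cleaned = urls
    .map((u) => u.trim())
    .filter((u, i, all) => u.length > 0 && all.indexOf(u) === i);
  return { urls: cleaned };
}

// Pull the contextId out of a parsed JSON response, failing loudly if it is missing.
function readContextId(response: unknown): string {
  const id = (response as Partial<EmbedResponse> | null)?.contextId;
  if (typeof id !== "string" || id.length === 0) {
    throw new Error("Response did not include a contextId");
  }
  return id;
}
```

Storing the value returned by readContextId alongside your own record of the upload makes it easy to search the same group of documents later.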

How do I fetch the embeddings?

We haven’t implemented a dedicated endpoint for this yet. If you call the Chat endpoint with a contextIds property, the matching embeddings are searched automatically and the results are added to the context of the query.
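
A sketch of that flow: collect the contextId values returned by earlier uploads and pass them as contextIds in the Chat request. Only the contextIds property is named in this document; the message field, the buildChatRequest helper, and the "ctx_123" value are placeholders assumed for illustration.

```typescript
// Assumed Chat request body; `message` is a hypothetical field name.
interface ChatRequest {
  message: string;
  contextIds: string[];
}

// Build the request body, removing duplicate IDs so each context is searched once.
function buildChatRequest(message: string, contextIds: string[]): ChatRequest {
  const uniqueIds = contextIds.filter((id, i, all) => all.indexOf(id) === i);
  return { message, contextIds: uniqueIds };
}

// "ctx_123" stands in for a contextId returned by a previous upload.
const chatBody = buildChatRequest("What does the onboarding guide say about SSO?", ["ctx_123", "ctx_123"]);
console.log(JSON.stringify(chatBody));
```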