Pinecone Vector Database

Pinecone is a fully managed vector database: there is nothing to deploy or maintain, and it works out of the box.

Why choose Pinecone

Fully managed (no operations work)
Automatic scaling
Built-in embedding service
Deep integration with LangChain and LlamaIndex
Usage-based billing

Drawbacks

Paid service (a free tier exists, but it is limited)
Data is stored in the cloud (a privacy consideration)

Quick start

Sign up and get an API key

Go to pinecone.io
Create an account
Get your API key
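
Once you have a key, it is usually safer to keep it out of source code. The following is a minimal sketch of reading it from an environment variable; the helper name `load_api_key` and the variable name `PINECONE_API_KEY` are assumptions made for illustration, not something this guide prescribes.

```python
import os


def load_api_key(env_var: str = "PINECONE_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The variable name is an assumption for this sketch; use whatever
    name your deployment defines.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Pinecone API key."
        )
    return api_key
```

You can then pass `load_api_key()` as the `api_key` argument wherever the examples use a hard-coded string.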

Installation

bash

pip install pinecone-client

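Basic usage

python

import pinecone

# Note: pinecone.init() is the pinecone-client 2.x interface; releases from
# 3.0 onward replace it with a Pinecone client object.

# Initialize the client
pinecone.init(api_key="your-api-key", environment="us-west1-gcp")

# Create an index
pinecone.create_index(
    name="my-index",
    dimension=1536,  # dimension of OpenAI embeddings such as text-embedding-ada-002
    metric="cosine"
)

# Connect to the index
index = pinecone.Index("my-index")

# Insert data (upsert = update or insert).
# Each item is (id, vector, metadata). The vectors are truncated here;
# each must have as many values as the index dimension.
index.upsert([
    ("id1", [0.1, 0.2, ...], {"title": "Document 1"}),
    ("id2", [0.3, 0.4, ...], {"title": "Document 2"})
])

# Search for the 5 nearest vectors and return their metadata
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    include_metadata=True
)

print(results)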
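
To make `metric="cosine"` and `top_k` more concrete, here is a small in-memory sketch of what a cosine top-k query computes. It is not Pinecone's implementation (which uses approximate search on the server side); the function names and the two-dimensional toy vectors are hypothetical and exist only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k_matches(query, records, k):
    """Score every (id, vector, metadata) record and keep the k best."""
    scored = [
        (cosine_similarity(query, vector), record_id, metadata)
        for record_id, vector, metadata in records
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Toy records in the same (id, vector, metadata) shape used by upsert
records = [
    ("id1", [1.0, 0.0], {"title": "Document 1"}),
    ("id2", [0.0, 1.0], {"title": "Document 2"}),
    ("id3", [1.0, 1.0], {"title": "Document 3"}),
]

matches = top_k_matches([1.0, 0.1], records, k=2)
for score, record_id, metadata in matches:
    print(record_id, round(score, 3), metadata["title"])
```

Because cosine similarity depends only on direction, scaling a vector does not change its score, which is one reason it is a common choice for text embeddings.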

LangChain integration

python

# Note: these import paths come from older LangChain releases; newer releases
# move the integrations into the langchain_community and langchain_pinecone packages.
from langchain.vectorstores import Pinecone
from langchain.embeddings import OpenAIEmbeddings
import pinecone

# Initialize Pinecone
pinecone.init(api_key="your-key", environment="us-west1-gcp")

# Set up the embedding model and the target index name
embeddings = OpenAIEmbeddings()
index_name = "my-index"

# Option 1: wrap an existing index
vectorstore = Pinecone.from_existing_index(
    index_name=index_name,
    embedding=embeddings
)

# Option 2: load, split, and insert new documents
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

loader = TextLoader("docs.txt")
documents = loader.load()
splitter = CharacterTextSplitter(chunk_size=500)
docs = splitter.split_documents(documents)

vectorstore = Pinecone.from_documents(
    docs,
    embeddings,
    index_name=index_name
)

# Search for the 5 most similar chunks
results = vectorstore.similarity_search("query text", k=5)
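
The splitting step above decides what each stored vector represents, so it can help to see roughly what a character-based splitter does. The sketch below is a simplified stand-in, not LangChain's `CharacterTextSplitter` implementation: the function name `split_text` is hypothetical, and features such as chunk overlap are left out.

```python
def split_text(text, chunk_size=500, separator="\n\n"):
    """Greedily merge separator-delimited pieces into chunks of at most
    chunk_size characters. A single piece longer than chunk_size is kept
    whole rather than cut mid-paragraph."""
    chunks = []
    current = ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            # Adding this piece would overflow: close the chunk, start a new one
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
for chunk in split_text(sample, chunk_size=40):
    print(repr(chunk))
```

Smaller chunks give more precise matches but less context per result; the value of 500 in the example above is a starting point to tune, not a rule.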

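Metadata filtering

python

# vector1, vector2, and query_vector are placeholders for embedding vectors
# whose length matches the index dimension.

# Attach metadata when inserting
index.upsert([
    ("id1", vector1, {"category": "tech", "year": 2024}),
    ("id2", vector2, {"category": "news", "year": 2023})
])

# Filter at query time: only records with category == "tech" AND year >= 2024
results = index.query(
    vector=query_vector,
    top_k=5,
    filter={
        "category": {"$eq": "tech"},
        "year": {"$gte": 2024}
    }
)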
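
The filter syntax can be read as "every listed field must satisfy every listed operator". The following in-memory sketch shows that reading for the comparison operators only. It is an illustration under that assumption, not Pinecone's implementation; the names `OPERATORS` and `matches_filter` are hypothetical, and treating a missing field as a non-match is a choice made for this sketch.

```python
import operator

# Comparison operators covered by this sketch
OPERATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches_filter(metadata, filter_spec):
    """Return True if the metadata satisfies every condition in filter_spec."""
    for field, conditions in filter_spec.items():
        if field not in metadata:
            return False  # sketch assumption: missing fields never match
        value = metadata[field]
        for op_name, expected in conditions.items():
            if not OPERATORS[op_name](value, expected):
                return False
    return True


filter_spec = {"category": {"$eq": "tech"}, "year": {"$gte": 2024}}
records = {
    "id1": {"category": "tech", "year": 2024},
    "id2": {"category": "news", "year": 2023},
}
kept = [rid for rid, meta in records.items() if matches_filter(meta, filter_spec)]
print(kept)
```

Conceptually, only the records that pass the filter are candidates for the `top_k` nearest-neighbour ranking.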

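Free tier

Free tier:
- 1 index
- Up to 100 MB of storage
- 500,000 vectors (768 dimensions)

Production use requires a paid plan; see the official website for current pricing.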
