Upgrading from 7.x to 8.x
The v8 release of the Pinecone Python SDK has been published to PyPI as pinecone.
With a few exceptions noted below, nearly all changes are additive and non-breaking. The major version bump primarily reflects the step up to API version 2025-10 and the addition of a new dependency on orjson for fast JSON parsing.
Breaking Changes
The handling of the namespace parameter in GRPC methods has changed. When namespace=None, the parameter is now omitted from requests, allowing the API to apply its own namespace default. This change affects the upsert_from_dataframe method in GRPC clients. The API is moving toward "__default__" as the default namespace value, and this change ensures the SDK does not override that default.
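If your code depends on a particular namespace, pass it explicitly rather than relying on the default. A minimal sketch (the index host and DataFrame contents are placeholders):
import pandas as pd
from pinecone.grpc import PineconeGRPC

pc = PineconeGRPC()
index = pc.Index(host="your-index-host")

df = pd.DataFrame([
    {"id": "vec1", "values": [0.1, 0.2, 0.3]},
])

# Passing namespace explicitly sends it as given; with namespace=None
# the SDK now omits the field and the API applies its own default
# ("__default__").
index.upsert_from_dataframe(df, namespace="__default__")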
Note: The official SDK package was renamed last year from pinecone-client to pinecone, beginning with version 5.1.0. If you are upgrading from an earlier version, remove pinecone-client from your project dependencies and add pinecone instead to get the latest updates.
What's new in 8.x
Dedicated Read Capacity for Serverless Indexes
You can now configure dedicated read nodes for your serverless indexes, giving you more control over query performance and capacity planning. By default, serverless indexes use OnDemand read capacity, which automatically scales based on demand. With dedicated read capacity, you can allocate specific read nodes with manual scaling control.
Create an index with dedicated read capacity:
from pinecone import (
Pinecone,
ServerlessSpec,
CloudProvider,
AwsRegion,
Metric
)
pc = Pinecone()
pc.create_index(
name='my-index',
dimension=1536,
metric=Metric.COSINE,
spec=ServerlessSpec(
cloud=CloudProvider.AWS,
region=AwsRegion.US_EAST_1,
read_capacity={
"mode": "Dedicated",
"dedicated": {
"node_type": "t1",
"scaling": "Manual",
"manual": {
"shards": 2,
"replicas": 2
}
}
}
)
)

Configure read capacity on an existing index:
You can switch between OnDemand and Dedicated modes, or adjust the number of shards and replicas for dedicated read capacity:
from pinecone import Pinecone
pc = Pinecone()
# Switch to OnDemand read capacity
pc.configure_index(
name='my-index',
read_capacity={"mode": "OnDemand"}
)
# Switch to Dedicated read capacity with manual scaling
pc.configure_index(
name='my-index',
read_capacity={
"mode": "Dedicated",
"dedicated": {
"node_type": "t1",
"scaling": "Manual",
"manual": {
"shards": 3,
"replicas": 2
}
}
}
)
# Scale up by increasing shards and replicas
pc.configure_index(
name='my-index',
read_capacity={
"mode": "Dedicated",
"dedicated": {
"node_type": "t1",
"scaling": "Manual",
"manual": {
"shards": 4,
"replicas": 3
}
}
}
)

When you change read capacity configuration, the index will transition to the new configuration. You can use describe_index to check the status of the transition.
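For example, you could call describe_index after reconfiguring to see whether the transition has completed (a sketch; the exact fields on the returned description may differ):
from pinecone import Pinecone

pc = Pinecone()

# The returned description includes status and spec information, which
# reflect the read capacity configuration as it transitions.
desc = pc.describe_index(name='my-index')
print(desc.status)
print(desc.spec)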
See PR #528 for details.
Fetch and Update Vectors by Metadata
Fetch vectors by metadata filter
You can now fetch vectors using metadata filters instead of vector IDs. This is especially useful when you need to retrieve vectors based on their metadata properties.
from pinecone import Pinecone
pc = Pinecone()
index = pc.Index(host="your-index-host")
# Fetch vectors matching a complex filter
response = index.fetch_by_metadata(
filter={'genre': {'$in': ['comedy', 'drama']}, 'year': {'$eq': 2019}},
namespace='my_namespace',
limit=50
)
print(f"Found {len(response.vectors)} vectors")
# Iterate through fetched vectors
for vec_id, vector in response.vectors.items():
print(f"ID: {vec_id}, Metadata: {vector.metadata}")Pagination support:
When fetching large numbers of vectors, you can use pagination tokens to retrieve results in batches:
# First page
response = index.fetch_by_metadata(
filter={'status': 'active'},
limit=100
)
# Continue with next page if available
if response.pagination and response.pagination.next:
    next_response = index.fetch_by_metadata(
        filter={'status': 'active'},
        pagination_token=response.pagination.next,
        limit=100
    )

Update vectors by metadata filter
The update method used to require a vector id; now you can pass a metadata filter instead. This is useful for bulk metadata updates across many vectors.
There is also a dry_run option that allows you to preview the number of vectors that would be changed by the update before performing the operation.
from pinecone import Pinecone
pc = Pinecone()
index = pc.Index(host="your-index-host")
# Preview how many vectors would be updated (dry run)
response = index.update(
set_metadata={'status': 'active'},
filter={'genre': {'$eq': 'drama'}},
dry_run=True
)
print(f"Would update {response.matched_records} vectors")
# Apply the update by repeating the command without dry_run
response = index.update(
set_metadata={'status': 'active'},
filter={'genre': {'$eq': 'drama'}}
)

FilterBuilder for fluent filter construction
A new FilterBuilder utility class provides a type-safe, fluent interface for constructing metadata filters. While perhaps a bit verbose, it can help prevent common errors like misspelled operator names and provides better IDE support.
When you chain .build() onto the FilterBuilder it will emit a python dictionary representing the filter. Methods that take metadata filters as arguments will continue to accept dictionaries as before.
from pinecone import Pinecone, FilterBuilder
pc = Pinecone()
index = pc.Index(host="your-index-host")
# Simple equality filter
filter1 = FilterBuilder().eq("genre", "drama").build()
# Returns: {"genre": "drama"}
# Multiple conditions with AND using & operator
filter2 = (FilterBuilder().eq("genre", "drama") &
FilterBuilder().gt("year", 2020)).build()
# Returns: {"$and": [{"genre": "drama"}, {"year": {"$gt": 2020}}]}
# Multiple conditions with OR using | operator
filter3 = (FilterBuilder().eq("genre", "comedy") |
FilterBuilder().eq("genre", "drama")).build()
# Returns: {"$or": [{"genre": "comedy"}, {"genre": "drama"}]}
# Complex nested conditions
filter4 = ((FilterBuilder().eq("genre", "drama") &
FilterBuilder().gte("year", 2020)) |
(FilterBuilder().eq("genre", "comedy") &
FilterBuilder().lt("year", 2000))).build()
# Use with fetch_by_metadata
response = index.fetch_by_metadata(filter=filter2, limit=50)
# Use with update
index.update(
set_metadata={'status': 'archived'},
filter=filter3
)

The FilterBuilder supports all Pinecone filter operators: eq, ne, gt, gte, lt, lte, in_, nin, and exists. Compound expressions are built with & for AND and | for OR.
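For instance, the list and existence operators compose the same way as the comparison operators shown above (a sketch; the exact in_ and exists call signatures are assumptions based on the operator list):
from pinecone import FilterBuilder

# Match vectors whose genre is comedy or drama AND that have a 'year' field.
# NOTE: the in_ and exists signatures here are assumed; check the SDK
# reference for the exact argument forms.
filter5 = (FilterBuilder().in_("genre", ["comedy", "drama"]) &
           FilterBuilder().exists("year", True)).build()
# Expected shape:
# {"$and": [{"genre": {"$in": ["comedy", "drama"]}}, {"year": {"$exists": True}}]}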
See PR #529 for fetch_by_metadata, PR #544 for update() with filter, and PR #531 for FilterBuilder.
Other New Features
Create namespaces programmatically
You can now create namespaces in serverless indexes directly from the SDK:
from pinecone import Pinecone
pc = Pinecone()
index = pc.Index(host="your-index-host")
# Create a namespace with just a name
namespace = index.create_namespace(name="my-namespace")
print(f"Created namespace: {namespace.name}, Vector count: {namespace.vector_count}")
# Create a namespace with schema configuration
namespace = index.create_namespace(
name="my-namespace",
schema={
"fields": {
"genre": {"filterable": True},
"year": {"filterable": True}
}
}
)

Note: This operation is not supported for pod-based indexes.
See PR #532 for details.
Match terms in search operations
For sparse indexes with integrated embedding configured to use the pinecone-sparse-english-v0 model, you can now specify which terms must be present in search results:
from pinecone import Pinecone, SearchQuery
pc = Pinecone()
index = pc.Index(host="your-index-host")
response = index.search(
namespace="my-namespace",
query=SearchQuery(
inputs={"text": "Apple corporation"},
top_k=10,
match_terms={
"strategy": "all",
"terms": ["apple", "corporation"]
}
)
)

The match_terms parameter ensures that all specified terms are present in the text of each search hit. Terms are normalized and tokenized before matching, and order does not matter.
See PR #530 for details.
Admin API enhancements
Update API keys, projects, and organizations:
from pinecone import Admin
admin = Admin() # Auth with PINECONE_CLIENT_ID and PINECONE_CLIENT_SECRET
# Update an API key's name and roles
api_key = admin.api_key.update(
api_key_id='my-api-key-id',
name='updated-api-key-name',
roles=['ProjectEditor', 'DataPlaneEditor']
)
# Update a project's configuration
project = admin.project.update(
project_id='my-project-id',
name='updated-project-name',
max_pods=10,
force_encryption_with_cmek=True
)
# Update an organization
organization = admin.organization.update(
organization_id='my-org-id',
name='updated-organization-name'
)

Delete organizations:
from pinecone import Admin
admin = Admin()
# Delete an organization (use with caution!)
admin.organization.delete(organization_id='my-org-id')

See PR #527 and PR #543 for details.
Metadata schema configuration
You can now configure which metadata fields are filterable when creating serverless indexes. This helps optimize performance by only indexing metadata fields that you plan to use for filtering:
from pinecone import (
Pinecone,
ServerlessSpec,
CloudProvider,
AwsRegion,
Metric
)
pc = Pinecone()
pc.create_index(
name='my-index',
dimension=1536,
metric=Metric.COSINE,
spec=ServerlessSpec(
cloud=CloudProvider.AWS,
region=AwsRegion.US_EAST_1,
schema={
"genre": {"filterable": True},
"year": {"filterable": True},
"rating": {"filterable": True}
}
)
)

When using schemas, only fields marked as filterable: True in the schema can be used in metadata filters.
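For example, a query against the index above can filter on any of the fields declared filterable (a sketch; the index host is a placeholder):
from pinecone import Pinecone

pc = Pinecone()
index = pc.Index(host="your-index-host")

# 'genre' is declared filterable in the schema above, so it can appear in
# a metadata filter; a field omitted from the schema could not.
response = index.query(
    vector=[0.0] * 1536,
    top_k=5,
    filter={'genre': {'$eq': 'drama'}},
    include_metadata=True
)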
See PR #528 for details.
Response header information
The SDK now exposes header information from API responses. This information is available in response objects via the _response_info attribute and can be useful for debugging and monitoring.
from pinecone import Pinecone
pc = Pinecone()
index = pc.Index(host="your-index-host")
# Perform a query
response = index.query(
vector=[0.1, 0.2, 0.3, ...],
top_k=10,
namespace='my_namespace'
)
# Access response headers
if hasattr(response, '_response_info') and response._response_info:
    headers = response._response_info.get('raw_headers', {})
    # Access specific headers (header names are normalized to lowercase)
    lsn = headers.get('x-pinecone-request-lsn')
    if lsn:
        print(f"LSN: {lsn}")
    # View all available headers
    for header_name, header_value in headers.items():
        print(f"{header_name}: {header_value}")

See PR #539 for details.
Performance Improvements
orjson for faster JSON processing
We've replaced Python's standard library json module with orjson, a fast JSON library written in Rust. This provides significant performance improvements for both serialization and deserialization of request payloads:
- Serialization (dumps): 10-23x faster depending on payload size
- Deserialization (loads): 4-7x faster depending on payload size
- Round-trip operations: ~8-9x faster
These improvements are especially beneficial for:
- High-throughput applications making many API calls
- Applications handling large vector payloads
- Real-time applications where latency matters
No code changes are required - the API remains the same, and you'll automatically benefit from these performance improvements.
See PR #556 for details.
Optimized gRPC response parsing
We've optimized gRPC response parsing by replacing json_format.MessageToDict with direct protobuf field access. This optimization provides approximately 2x faster response parsing for gRPC operations.
Special thanks to @yorickvP for surfacing the json_format.MessageToDict refactor opportunity. While we didn't merge the specific PR, yorick's insight led us to implement a similar optimization that significantly improves gRPC performance.
See PR #553 for details.
Other Improvements
- Type hints and IDE support: Comprehensive type hints throughout the SDK improve IDE autocomplete and type checking. The SDK now uses Python 3.10+ type syntax throughout.
- Documentation: Updated docstrings with RST formatting and code examples for better developer experience.
- Dependency updates: Updated protobuf to 5.29.5 to address security vulnerabilities.
- Build system: Migrated from poetry to uv for faster dependency management.
Contributors
- @yorickvP - Thanks for surfacing the gRPC response parsing optimization opportunity!