Expose LSN Header Information #539
Merged
rohanshah18 (Contributor) reviewed on Nov 10, 2025
```diff
 jobs:
   dependency-matrix-grpc:
-    name: GRPC py3.9/py3.10
+    name: GRPC py3.10/py3.10
```
both versions are 3.10?
jhamon added a commit that referenced this pull request on Nov 18, 2025
⚠️ **Python 3.9 is no longer supported.** The SDK now requires Python 3.10 or later. Python 3.9 reached end-of-life on October 2, 2025. Users must upgrade to Python 3.10+ to continue using the SDK.

⚠️ **Namespace parameter default behavior changed.** The SDK no longer applies default values for the `namespace` parameter in GRPC methods. When `namespace=None`, the parameter is omitted from requests, allowing the API to handle namespace defaults appropriately. This change affects `upsert_from_dataframe` methods in GRPC clients. The API is moving toward `"__default__"` as the default namespace value, and this change ensures the SDK doesn't override API defaults.

Note: The official SDK package was renamed last year from `pinecone-client` to `pinecone` beginning in version 5.1.0. Please remove `pinecone-client` from your project dependencies and add `pinecone` instead to get the latest updates if upgrading from earlier versions.

You can now configure dedicated read nodes for your serverless indexes, giving you more control over query performance and capacity planning. By default, serverless indexes use OnDemand read capacity, which automatically scales based on demand. With dedicated read capacity, you can allocate specific read nodes with manual scaling control.

**Create an index with dedicated read capacity:**

```python
from pinecone import (
    Pinecone,
    ServerlessSpec,
    CloudProvider,
    AwsRegion,
    Metric
)

pc = Pinecone()
pc.create_index(
    name='my-index',
    dimension=1536,
    metric=Metric.COSINE,
    spec=ServerlessSpec(
        cloud=CloudProvider.AWS,
        region=AwsRegion.US_EAST_1,
        read_capacity={
            "mode": "Dedicated",
            "dedicated": {
                "node_type": "t1",
                "scaling": "Manual",
                "manual": {
                    "shards": 2,
                    "replicas": 2
                }
            }
        }
    )
)
```

**Configure read capacity on an existing index:**

You can switch between OnDemand and Dedicated modes, or adjust the number of shards and replicas for dedicated read capacity:

```python
from pinecone import Pinecone

pc = Pinecone()

pc.configure_index(
    name='my-index',
    read_capacity={"mode": "OnDemand"}
)

pc.configure_index(
    name='my-index',
    read_capacity={
        "mode": "Dedicated",
        "dedicated": {
            "node_type": "t1",
            "scaling": "Manual",
            "manual": {
                "shards": 3,
                "replicas": 2
            }
        }
    }
)

pc.configure_index(
    name='my-index',
    read_capacity={
        "mode": "Dedicated",
        "dedicated": {
            "node_type": "t1",
            "scaling": "Manual",
            "manual": {
                "shards": 4,
                "replicas": 3
            }
        }
    }
)
```

When you change read capacity configuration, the index will transition to the new configuration. You can use `describe_index` to check the status of the transition. See [PR #528](#528) for details.

You can now fetch vectors using metadata filters instead of vector IDs. This is especially useful when you need to retrieve vectors based on their metadata properties.
```python
from pinecone import Pinecone

pc = Pinecone()
index = pc.Index(host="your-index-host")

response = index.fetch_by_metadata(
    filter={'genre': {'$in': ['comedy', 'drama']}, 'year': {'$eq': 2019}},
    namespace='my_namespace',
    limit=50
)

print(f"Found {len(response.vectors)} vectors")
for vec_id, vector in response.vectors.items():
    print(f"ID: {vec_id}, Metadata: {vector.metadata}")
```

**Pagination support:**

When fetching large numbers of vectors, you can use pagination tokens to retrieve results in batches:

```python
response = index.fetch_by_metadata(
    filter={'status': 'active'},
    limit=100
)

if response.pagination and response.pagination.next:
    next_response = index.fetch_by_metadata(
        filter={'status': 'active'},
        pagination_token=response.pagination.next,
        limit=100
    )
```

The update method used to require a vector id to be passed, but now you have the option to pass a metadata filter instead. This is useful for bulk metadata updates across many vectors. There is also a `dry_run` option that allows you to preview the number of vectors that would be changed by the update before performing the operation.

```python
from pinecone import Pinecone

pc = Pinecone()
index = pc.Index(host="your-index-host")

response = index.update(
    set_metadata={'status': 'active'},
    filter={'genre': {'$eq': 'drama'}},
    dry_run=True
)
print(f"Would update {response.matched_records} vectors")

response = index.update(
    set_metadata={'status': 'active'},
    filter={'genre': {'$eq': 'drama'}}
)
```

A new `FilterBuilder` utility class provides a type-safe, fluent interface for constructing metadata filters. While perhaps a bit verbose, it can help prevent common errors like misspelled operator names and provides better IDE support. When you chain `.build()` onto the `FilterBuilder`, it will emit a Python dictionary representing the filter. Methods that take metadata filters as arguments will continue to accept dictionaries as before.

```python
from pinecone import Pinecone, FilterBuilder

pc = Pinecone()
index = pc.Index(host="your-index-host")

filter1 = FilterBuilder().eq("genre", "drama").build()
filter2 = (FilterBuilder().eq("genre", "drama") & FilterBuilder().gt("year", 2020)).build()
filter3 = (FilterBuilder().eq("genre", "comedy") | FilterBuilder().eq("genre", "drama")).build()
filter4 = ((FilterBuilder().eq("genre", "drama") & FilterBuilder().gte("year", 2020)) |
           (FilterBuilder().eq("genre", "comedy") & FilterBuilder().lt("year", 2000))).build()

response = index.fetch_by_metadata(filter=filter2, limit=50)

index.update(
    set_metadata={'status': 'archived'},
    filter=filter3
)
```

The FilterBuilder supports all Pinecone filter operators: `eq`, `ne`, `gt`, `gte`, `lt`, `lte`, `in_`, `nin`, and `exists`. Compound expressions are built with `&` (and) and `|` (or). See [PR #529](#529) for `fetch_by_metadata`, [PR #544](#544) for `update()` with filter, and [PR #531](#531) for FilterBuilder.

You can now create namespaces in serverless indexes directly from the SDK:

```python
from pinecone import Pinecone

pc = Pinecone()
index = pc.Index(host="your-index-host")

namespace = index.create_namespace(name="my-namespace")
print(f"Created namespace: {namespace.name}, Vector count: {namespace.vector_count}")

namespace = index.create_namespace(
    name="my-namespace",
    schema={
        "fields": {
            "genre": {"filterable": True},
            "year": {"filterable": True}
        }
    }
)
```

**Note:** This operation is not supported for pod-based indexes. See [PR #532](#532) for details.
For sparse indexes with integrated embedding configured to use the `pinecone-sparse-english-v0` model, you can now specify which terms must be present in search results:

```python
from pinecone import Pinecone, SearchQuery

pc = Pinecone()
index = pc.Index(host="your-index-host")

response = index.search(
    namespace="my-namespace",
    query=SearchQuery(
        inputs={"text": "Apple corporation"},
        top_k=10,
        match_terms={
            "strategy": "all",
            "terms": ["apple", "corporation"]
        }
    )
)
```

The `match_terms` parameter ensures that all specified terms must be present in the text of each search hit. Terms are normalized and tokenized before matching, and order does not matter. See [PR #530](#530) for details.

**Update API keys, projects, and organizations:**

```python
from pinecone import Admin

admin = Admin()  # Auth with PINECONE_CLIENT_ID and PINECONE_CLIENT_SECRET

api_key = admin.api_key.update(
    api_key_id='my-api-key-id',
    name='updated-api-key-name',
    roles=['ProjectEditor', 'DataPlaneEditor']
)

project = admin.project.update(
    project_id='my-project-id',
    name='updated-project-name',
    max_pods=10,
    force_encryption_with_cmek=True
)

organization = admin.organization.update(
    organization_id='my-org-id',
    name='updated-organization-name'
)
```

**Delete organizations:**

```python
from pinecone import Admin

admin = Admin()
admin.organization.delete(organization_id='my-org-id')
```

See [PR #527](#527) and [PR #543](#543) for details.

You can now configure which metadata fields are filterable when creating serverless indexes. This helps optimize performance by only indexing metadata fields that you plan to use for filtering:

```python
from pinecone import (
    Pinecone,
    ServerlessSpec,
    CloudProvider,
    AwsRegion,
    Metric
)

pc = Pinecone()
pc.create_index(
    name='my-index',
    dimension=1536,
    metric=Metric.COSINE,
    spec=ServerlessSpec(
        cloud=CloudProvider.AWS,
        region=AwsRegion.US_EAST_1,
        schema={
            "genre": {"filterable": True},
            "year": {"filterable": True},
            "rating": {"filterable": True}
        }
    )
)
```

When using schemas, only fields marked as `filterable: True` in the schema can be used in metadata filters. See [PR #528](#528) for details.

The SDK now exposes header information from API responses. This information is available in response objects via the `_response_info` attribute and can be useful for debugging and monitoring.

```python
from pinecone import Pinecone

pc = Pinecone()
index = pc.Index(host="your-index-host")

response = index.query(
    vector=[0.1, 0.2, 0.3, ...],
    top_k=10,
    namespace='my_namespace'
)

for k, v in response._response_info.get('raw_headers').items():
    print(f"{k}: {v}")
```

See [PR #539](#539) for details.

We've replaced Python's standard library `json` module with `orjson`, a fast JSON library written in Rust. This provides significant performance improvements for both serialization and deserialization of request payloads:

- **Serialization (dumps)**: 10-23x faster depending on payload size
- **Deserialization (loads)**: 4-7x faster depending on payload size

These improvements are especially beneficial for:

- High-throughput applications making many API calls
- Applications handling large vector payloads
- Real-time applications where latency matters

No code changes are required - the API remains the same, and you'll automatically benefit from these performance improvements. See [PR #556](#556) for details.

We've optimized gRPC response parsing by replacing `json_format.MessageToDict` with direct protobuf field access. This optimization provides approximately 2x faster response parsing for gRPC operations.
Special thanks to [@yorickvP](https://github.com/yorickvP) for surfacing the `json_format.MessageToDict` refactor opportunity. While we didn't merge the specific PR, yorick's insight led us to implement a similar optimization that significantly improves gRPC performance. See [PR #553](#553) for details.

- **Type hints and IDE support**: Comprehensive type hints throughout the SDK improve IDE autocomplete and type checking. The SDK now uses Python 3.10+ type syntax throughout.
- **Documentation**: Updated docstrings with RST formatting and code examples for better developer experience.
- **Dependency updates**: Updated protobuf to 5.29.5 to address security vulnerabilities. Updated `pinecone-plugin-assistant` to version 3.0.1.
- **Build system**: Migrated from poetry to uv for faster dependency management.

- [@yorickvP](https://github.com/yorickvP) - Thanks for surfacing the gRPC response parsing optimization opportunity!
Expose LSN Header Information in API Responses
Overview
This PR exposes LSN (Log Sequence Number) header information from Pinecone API responses through a new `_response_info` attribute on response objects. This enables faster test suite execution by using LSN-based freshness checks instead of polling `describe_index_stats()`.

Motivation
Integration tests currently rely on polling `describe_index_stats()` to verify data freshness, which is slow and inefficient. The Pinecone API includes LSN headers in responses that can be used to determine data freshness more efficiently:

- `x-pinecone-request-lsn`: Committed LSN from write operations (upsert, delete)
- `x-pinecone-max-indexed-lsn`: Reconciled LSN from read operations (query)

By extracting and exposing these headers, tests can use LSN-based polling to reduce test execution time significantly. Testing so far shows this cuts the time needed to run the db data plane integration suite by half or more.
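As a sketch of the idea (hypothetical helper names; the real helpers live in `tests/integration/helpers/lsn_utils.py`, and treating LSN values as comparable integers is an assumption here):

```python
import time

def lsn_from(response, header):
    # LSN headers arrive as strings in _response_info['raw_headers']
    value = response._response_info["raw_headers"].get(header)
    return int(value) if value is not None else None

def wait_until_reconciled(index, namespace, committed_lsn, timeout=30.0):
    # Poll cheap queries until the reconciled LSN catches up with the LSN
    # committed by our write, instead of polling describe_index_stats().
    deadline = time.time() + timeout
    while time.time() < deadline:
        resp = index.query(vector=[0.1] * 1536, top_k=1, namespace=namespace)
        reconciled = lsn_from(resp, "x-pinecone-max-indexed-lsn")
        if reconciled is not None and reconciled >= committed_lsn:
            return True
        time.sleep(0.5)
    return False

# Usage (hypothetical): after an upsert, wait for reads to reflect it
# committed = lsn_from(upsert_response, "x-pinecone-request-lsn")
# wait_until_reconciled(index, "my_namespace", committed)
```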
Changes
Core Implementation
Response Info Module
- Added `pinecone/utils/response_info.py` with:
  - `ResponseInfo` TypedDict for structured response metadata
  - `extract_response_info()` function to extract and normalize raw headers
  - `raw_headers` (dictionary of all response headers normalized to lowercase)
- LSN extraction is handled by separate helpers (`lsn_utils`) rather than in `ResponseInfo`
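In rough shape, the module looks something like this (a sketch of the interface described above, not the exact source):

```python
from typing import Mapping, TypedDict

class ResponseInfo(TypedDict):
    # Only raw headers are stored; LSN parsing is left to the lsn_utils helpers
    raw_headers: dict[str, str]

def extract_response_info(headers: Mapping[str, str]) -> ResponseInfo:
    # Normalize header names to lowercase for consistent lookups
    return {"raw_headers": {k.lower(): v for k, v in headers.items()}}
```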
REST API Client Integration

- Updated `api_client.py` and `asyncio_api_client.py` to automatically attach `_response_info` to db data plane response objects
- Always attaches `_response_info` to ensure `raw_headers` are always available, even when LSN fields are not present
gRPC Integration

- Updated `grpc_runner.py` to capture initial metadata from gRPC calls
- Updated parser functions in `grpc/utils.py` to accept an optional `initial_metadata` parameter
- Updated `index_grpc.py` to pass initial metadata to parser functions
- Updated `future.py` to extract initial metadata from gRPC futures
Response Dataclasses

- Added `QueryResponse` and `UpsertResponse` dataclasses in `pinecone/db_data/dataclasses/`
- Added a `_response_info` field to `FetchResponse`, `FetchByMetadataResponse`, `QueryResponse`, and `UpsertResponse`
- All inherit from `DictLike` for dictionary-style access
- `_response_info` is a required field (always present) with default `{"raw_headers": {}}`
Index Classes

- Updated `index.py` and `index_asyncio.py` to:
  - Return dataclasses with `_response_info` attached
  - Handle `async_req=True` with an `ApplyResult` wrapper for proper dataclass conversion
  - Expose `_response_info` from `upsert_records()` responses
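For example (a sketch of the `async_req=True` path noted above; the host and vector values are placeholders):

```python
from pinecone import Pinecone

pc = Pinecone()
index = pc.Index(host="your-index-host")

# async_req=True returns an ApplyResult; .get() blocks and yields the
# converted dataclass with _response_info attached.
future = index.query(vector=[0.1] * 1536, top_k=10, async_req=True)
response = future.get()
print(response._response_info["raw_headers"])
```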
Test Infrastructure

LSN Utilities
- Added `tests/integration/helpers/lsn_utils.py` with helper functions for extracting LSN values
- `pinecone/utils/lsn_utils.py` is deprecated in favor of the test helpers
Polling Helpers

- Updated `poll_until_lsn_reconciled()` to use query operations for LSN-based freshness checks
- Added `poll_until_lsn_reconciled_async()` for async tests
Integration Test Updates

- Updated `test_query.py`, `test_upsert_dense.py`, `test_search_and_upsert_records.py`
- Updated `test_fetch.py`, `test_fetch_by_metadata.py`, `test_upsert_hybrid.py`
- Updated `test_query_namespaces.py`, `seed.py`
- Updated `test_query.py` (async)
- Added assertions that `_response_info` is present when expected
Documentation

- Added `docs/maintainers/lsn-headers-discovery.md` documenting discovered headers
- Added `scripts/inspect_lsn_headers.py` for header discovery
Usage Examples

Accessing Response Info
The `_response_info` attribute is always available on all Index response objects:
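For example (host and vector values are placeholders; the header loop mirrors the release-notes example above):

```python
from pinecone import Pinecone

pc = Pinecone()
index = pc.Index(host="your-index-host")

response = index.query(vector=[0.1] * 1536, top_k=10, namespace='my_namespace')

# Header names are normalized to lowercase in raw_headers
for name, value in response._response_info["raw_headers"].items():
    print(f"{name}: {value}")
```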
Dictionary-Style Access

All response dataclasses inherit from `DictLike`, enabling dictionary-style access:
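Continuing the sketch above, attribute and dictionary access are interchangeable:

```python
# Dictionary-style access mirrors attribute access on response dataclasses
assert response['namespace'] == response.namespace
print(response['matches'])
```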
Technical Details

Response Info Flow
- REST API: `api_client.py` extracts headers → attaches `_response_info` to OpenAPI model → Index classes convert to dataclasses
- gRPC: `grpc_runner.py` captures initial metadata → parser functions extract → attach `_response_info` to response objects
Backward Compatibility

- `_response_info` is always present on response objects (required field)
- `raw_headers` in `_response_info` always contains response headers (may be an empty dict if no headers)
- Polling helpers (`poll_until_lsn_reconciled`, `poll_until_lsn_reconciled_async`) accept `_response_info` directly and extract LSN internally
Type Safety

- Added type annotations for `_response_info` fields
- Added `type: ignore` comments where necessary (e.g., `ApplyResult` wrapping)
Dataclass Enhancements

- Response dataclasses inherit from `DictLike` for dictionary-style access
- `QueryResponse` and `UpsertResponse` are new dataclasses replacing OpenAPI models
- `_response_info` field: `_response_info: ResponseInfo = field(default_factory=lambda: cast(ResponseInfo, {"raw_headers": {}}), repr=True, compare=False)`
- `repr=True` for all response dataclasses to aid debugging
- `raw_headers` always contains response headers (may be an empty dict)
- `ResponseInfo` only contains `raw_headers`
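A self-contained sketch of that field pattern (stand-in types; the real definitions live in `pinecone/db_data/dataclasses/`):

```python
from dataclasses import dataclass, field
from typing import TypedDict, cast

class ResponseInfo(TypedDict):
    raw_headers: dict[str, str]

@dataclass
class ExampleResponse:
    namespace: str
    # Always present; compare=False keeps equality checks focused on the data,
    # while repr=True surfaces headers when debugging.
    _response_info: ResponseInfo = field(
        default_factory=lambda: cast(ResponseInfo, {"raw_headers": {}}),
        repr=True,
        compare=False,
    )
```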
Testing

Unit Tests
- Added unit tests for the `extract_response_info()` function
Integration Tests

- `_response_info` assertions added to verify LSN data is present

Breaking Changes
None - This is a backward-compatible enhancement.
Response Type Changes
- `QueryResponse` and `UpsertResponse` are now dataclasses instead of OpenAPI models
- Both support dictionary-style access (via `DictLike`)
- Both can still be imported as before (`from pinecone import QueryResponse, UpsertResponse`)
- If you have `isinstance()` checks against OpenAPI models, they should still work when importing from `pinecone`
New Attribute

- `_response_info` is added to all Index response objects (`QueryResponse`, `UpsertResponse`, `FetchResponse`, `FetchByMetadataResponse`)
- `_response_info` is always present and contains `raw_headers`.
Compatibility Notes

- Response dataclasses inherit from `DictLike`, enabling dictionary-style access (`response['matches']`)
- Attribute access continues to work (`response.matches`, `response.namespace`, etc.)
- OpenAPI model methods such as `to_dict()` were not part of the public API

Related Issues