"Use the generated API key to intialize a client connection to Pinecone:"
94
-
],
95
-
"metadata": {
96
-
"id": "sbJFp5DO5ryT"
97
-
}
97
+
]
98
98
},
99
99
{
100
100
"cell_type": "code",
@@ -113,19 +113,24 @@
113
113
},
114
114
{
115
115
"cell_type": "markdown",
116
+
"metadata": {
117
+
"id": "bN9Rl7GP258C"
118
+
},
116
119
"source": [
117
120
"## Generate vectors\n",
118
121
"\n",
119
122
"A [vector embedding](https://www.pinecone.io/learn/vector-embeddings/) is a numerical representation of data that enables similarity-based search in vector databases like Pinecone. To convert data into this format, you use an embedding model.\n",
120
123
"\n",
121
-
"For this quickstart, use the [`multilingual-e5-large`](https://docs.pinecone.io/models/multilingual-e5-large) embedding model hosted by Pinecone to [convert](https://docs.pinecone.io/guides/inference/generate-embeddings) four sentences about apples into vectors, three concerning their health benefits, one concerning their cultivation."
122
-
],
123
-
"metadata": {
124
-
"id": "bN9Rl7GP258C"
125
-
}
124
+
"For this quickstart, use the [`multilingual-e5-large`](https://docs.pinecone.io/models/multilingual-e5-large) embedding model hosted by Pinecone to [convert](https://docs.pinecone.io/guides/inference/generate-embeddings) four sentences about apples into vectors, three related to health, one related to cultivation."
125
+
]
126
126
},
127
127
{
128
128
"cell_type": "code",
129
+
"execution_count": null,
130
+
"metadata": {
131
+
"id": "ZIclo2UK3NFE"
132
+
},
133
+
"outputs": [],
129
134
"source": [
130
135
"# Define a sample dataset where each item has a unique ID, text, and category\n",
131
136
"data = [\n",
@@ -162,12 +167,7 @@
162
167
")\n",
163
168
"\n",
164
169
"print(embeddings)"
165
-
],
166
-
"metadata": {
167
-
"id": "ZIclo2UK3NFE"
168
-
},
169
-
"execution_count": null,
170
-
"outputs": []
170
+
]
171
171
},
172
172
{
173
173
"cell_type": "markdown",
@@ -307,7 +307,7 @@
307
307
"source": [
308
308
"## Search the index\n",
309
309
"\n",
310
-
"Now, let’s say you want to search your index for information about “Disease prevention”.\n",
310
+
"Now, let’s say you want to search your index for information related to \"health risks\".\n",
311
311
"\n",
312
312
"Use the the `multilingual-e5-large` model hosted by Pinecone *to* convert your query into a vector embedding, and then use the [`query`](https://docs.pinecone.io/guides/data/query-data) operation to search for the three vectors in the index that are most semantically similar to the query vector:"
313
313
]
@@ -321,7 +321,7 @@
321
321
"outputs": [],
322
322
"source": [
323
323
"# Define your query\n",
324
-
"query = \"Disease prevention\"\n",
324
+
"query = \"Health risks\"\n",
325
325
"\n",
326
326
"# Convert the query into a numerical vector that Pinecone can search with\n",
327
327
"query_embedding = pc.inference.embed(\n",
@@ -350,24 +350,29 @@
350
350
"id": "9jAJDjSAjsvA"
351
351
},
352
352
"source": [
353
-
"Notice that the response includes only records about the health benefits of apples, not the cultivation of apple."
353
+
"Notice that the response includes only records related to health, not the cultivation of apple."
354
354
]
355
355
},
356
356
{
357
357
"cell_type": "markdown",
358
+
"metadata": {
359
+
"id": "ayZib8aEUYR_"
360
+
},
358
361
"source": [
359
362
"## Add reranking\n",
360
363
"\n",
361
364
"You can increase the accuracy of your search by reranking results based on their relevance to the query.\n",
362
365
"\n",
363
-
"Use the `rerank` operation and the `bge-reranker-v2-m3` reranking model hosted by Pinecone to rerank the values of the documents.source_text fields and return only the two most relevant documents:"
364
-
],
365
-
"metadata": {
366
-
"id": "ayZib8aEUYR_"
367
-
}
366
+
"Use the `rerank` operation and the `bge-reranker-v2-m3` reranking model hosted by Pinecone to rerank the values of the documents.source_text fields:"
367
+
]
368
368
},
369
369
{
370
370
"cell_type": "code",
371
+
"execution_count": null,
372
+
"metadata": {
373
+
"id": "SyPG_OmwUjtm"
374
+
},
375
+
"outputs": [],
371
376
"source": [
372
377
"# Rerank the search results based on their relevance to the query\n",
373
378
"ranked_results = pc.inference.rerank(\n",
@@ -378,7 +383,7 @@
378
383
" {\"id\": \"rec1\", \"source_text\": \"Apples are a great source of dietary fiber, which supports digestion and helps maintain a healthy gut.\"},\n",
379
384
" {\"id\": \"rec4\", \"source_text\": \"The high fiber content in apples can also help regulate blood sugar levels, making them a favorable snack for people with diabetes.\"}\n",
380
385
" ],\n",
381
-
" top_n=2,\n",
386
+
" top_n=3,\n",
382
387
" rank_fields=[\"source_text\"],\n",
383
388
" return_documents=True,\n",
384
389
" parameters={\n",
@@ -387,37 +392,37 @@
387
392
")\n",
388
393
"\n",
389
394
"print(ranked_results)\n"
390
-
],
391
-
"metadata": {
392
-
"id": "SyPG_OmwUjtm"
393
-
},
394
-
"execution_count": null,
395
-
"outputs": []
395
+
]
396
396
},
397
397
{
398
398
"cell_type": "markdown",
399
-
"source": [
400
-
"Notice that the two returned records are the most relevant for the query, the first relating to reducing chronic diseases, the second relating to preventing diabetes."
401
-
],
402
399
"metadata": {
403
400
"id": "QTkhBFJHUnj0"
404
-
}
401
+
},
402
+
"source": [
403
+
"Notice that the two records specifically related to \"health risks\" (chronic disease and diabetes) are now ranked highest."
404
+
]
405
405
},
406
406
{
407
407
"cell_type": "markdown",
408
+
"metadata": {
409
+
"id": "nGjpffT5UrrL"
410
+
},
408
411
"source": [
409
412
"## Add filtering\n",
410
413
"\n",
411
414
"You can use a [metadata filter](https://docs.pinecone.io/guides/data/understanding-metadata) to limit your search to records matching a filter expression.\n",
412
415
"\n",
413
416
"Your upserted records contain a `category` metadata field. Now use that field as a filter to search for records in the “digestive system” category:"
414
-
],
415
-
"metadata": {
416
-
"id": "nGjpffT5UrrL"
417
-
}
417
+
]
418
418
},
419
419
{
420
420
"cell_type": "code",
421
+
"execution_count": null,
422
+
"metadata": {
423
+
"id": "KkH7Wre3Ux5B"
424
+
},
425
+
"outputs": [],
421
426
"source": [
422
427
"# Search the index with a metadata filter\n",
423
428
"filtered_results = index.query(\n",
@@ -432,21 +437,16 @@
432
437
")\n",
433
438
"\n",
434
439
"print(filtered_results)"
435
-
],
436
-
"metadata": {
437
-
"id": "KkH7Wre3Ux5B"
438
-
},
439
-
"execution_count": null,
440
-
"outputs": []
440
+
]
441
441
},
442
442
{
443
443
"cell_type": "markdown",
444
-
"source": [
445
-
"Notice that the response includes only the one record in the “digestive system” category."
446
-
],
447
444
"metadata": {
448
445
"id": "awumy10tU2Lv"
449
-
}
446
+
},
447
+
"source": [
448
+
"Notice that the response includes only the one record in the “digestive system” category."