Commit fa3ab56

Merge pull request #406 from pinecone-io/quickstart-tweak
Quickstart notebook tweaks to align with docs quickstart
2 parents d7af87b + 3e6efc4 commit fa3ab56

1 file changed: +55 -55 lines changed

docs/pinecone-quickstart.ipynb

Lines changed: 55 additions & 55 deletions
@@ -56,25 +56,25 @@
 "cell_type": "code",
 "execution_count": 2,
 "metadata": {
-"id": "89S8G8oP61-t",
 "colab": {
 "base_uri": "https://localhost:8080/",
 "height": 247
 },
+"id": "89S8G8oP61-t",
 "outputId": "8cf57515-28e1-4953-b86d-d23dad4ea9fe"
 },
 "outputs": [
 {
-"output_type": "display_data",
 "data": {
-"text/plain": [
-"<IPython.core.display.HTML object>"
-],
 "text/html": [
 "<script type=\"text/javascript\" src=\"https://connect.pinecone.io/embed.js\"></script>"
+],
+"text/plain": [
+"<IPython.core.display.HTML object>"
 ]
 },
-"metadata": {}
+"metadata": {},
+"output_type": "display_data"
 }
 ],
 "source": [
@@ -87,14 +87,14 @@
 },
 {
 "cell_type": "markdown",
+"metadata": {
+"id": "sbJFp5DO5ryT"
+},
 "source": [
 "## Initialize a client\n",
 "\n",
 "Use the generated API key to intialize a client connection to Pinecone:"
-],
-"metadata": {
-"id": "sbJFp5DO5ryT"
-}
+]
 },
 {
 "cell_type": "code",
@@ -113,19 +113,24 @@
 },
 {
 "cell_type": "markdown",
+"metadata": {
+"id": "bN9Rl7GP258C"
+},
 "source": [
 "## Generate vectors\n",
 "\n",
 "A [vector embedding](https://www.pinecone.io/learn/vector-embeddings/) is a numerical representation of data that enables similarity-based search in vector databases like Pinecone. To convert data into this format, you use an embedding model.\n",
 "\n",
-"For this quickstart, use the [`multilingual-e5-large`](https://docs.pinecone.io/models/multilingual-e5-large) embedding model hosted by Pinecone to [convert](https://docs.pinecone.io/guides/inference/generate-embeddings) four sentences about apples into vectors, three concerning their health benefits, one concerning their cultivation."
-],
-"metadata": {
-"id": "bN9Rl7GP258C"
-}
+"For this quickstart, use the [`multilingual-e5-large`](https://docs.pinecone.io/models/multilingual-e5-large) embedding model hosted by Pinecone to [convert](https://docs.pinecone.io/guides/inference/generate-embeddings) four sentences about apples into vectors, three related to health, one related to cultivation."
+]
 },
 {
 "cell_type": "code",
+"execution_count": null,
+"metadata": {
+"id": "ZIclo2UK3NFE"
+},
+"outputs": [],
 "source": [
 "# Define a sample dataset where each item has a unique ID, text, and category\n",
 "data = [\n",
@@ -162,12 +167,7 @@
 ")\n",
 "\n",
 "print(embeddings)"
-],
-"metadata": {
-"id": "ZIclo2UK3NFE"
-},
-"execution_count": null,
-"outputs": []
+]
 },
 {
 "cell_type": "markdown",
@@ -307,7 +307,7 @@
 "source": [
 "## Search the index\n",
 "\n",
-"Now, let’s say you want to search your index for information about “Disease prevention”.\n",
+"Now, let’s say you want to search your index for information related to \"health risks\".\n",
 "\n",
 "Use the the `multilingual-e5-large` model hosted by Pinecone *to* convert your query into a vector embedding, and then use the [`query`](https://docs.pinecone.io/guides/data/query-data) operation to search for the three vectors in the index that are most semantically similar to the query vector:"
 ]
@@ -321,7 +321,7 @@
 "outputs": [],
 "source": [
 "# Define your query\n",
-"query = \"Disease prevention\"\n",
+"query = \"Health risks\"\n",
 "\n",
 "# Convert the query into a numerical vector that Pinecone can search with\n",
 "query_embedding = pc.inference.embed(\n",
@@ -350,24 +350,29 @@
 "id": "9jAJDjSAjsvA"
 },
 "source": [
-"Notice that the response includes only records about the health benefits of apples, not the cultivation of apple."
+"Notice that the response includes only records related to health, not the cultivation of apple."
 ]
 },
 {
 "cell_type": "markdown",
+"metadata": {
+"id": "ayZib8aEUYR_"
+},
 "source": [
 "## Add reranking\n",
 "\n",
 "You can increase the accuracy of your search by reranking results based on their relevance to the query.\n",
 "\n",
-"Use the `rerank` operation and the `bge-reranker-v2-m3` reranking model hosted by Pinecone to rerank the values of the documents.source_text fields and return only the two most relevant documents:"
-],
-"metadata": {
-"id": "ayZib8aEUYR_"
-}
+"Use the `rerank` operation and the `bge-reranker-v2-m3` reranking model hosted by Pinecone to rerank the values of the documents.source_text fields:"
+]
 },
 {
 "cell_type": "code",
+"execution_count": null,
+"metadata": {
+"id": "SyPG_OmwUjtm"
+},
+"outputs": [],
 "source": [
 "# Rerank the search results based on their relevance to the query\n",
 "ranked_results = pc.inference.rerank(\n",
@@ -378,7 +383,7 @@
 " {\"id\": \"rec1\", \"source_text\": \"Apples are a great source of dietary fiber, which supports digestion and helps maintain a healthy gut.\"},\n",
 " {\"id\": \"rec4\", \"source_text\": \"The high fiber content in apples can also help regulate blood sugar levels, making them a favorable snack for people with diabetes.\"}\n",
 " ],\n",
-" top_n=2,\n",
+" top_n=3,\n",
 " rank_fields=[\"source_text\"],\n",
 " return_documents=True,\n",
 " parameters={\n",
@@ -387,37 +392,37 @@
 ")\n",
 "\n",
 "print(ranked_results)\n"
-],
-"metadata": {
-"id": "SyPG_OmwUjtm"
-},
-"execution_count": null,
-"outputs": []
+]
 },
 {
 "cell_type": "markdown",
-"source": [
-"Notice that the two returned records are the most relevant for the query, the first relating to reducing chronic diseases, the second relating to preventing diabetes."
-],
 "metadata": {
 "id": "QTkhBFJHUnj0"
-}
+},
+"source": [
+"Notice that the two records specifically related to \"health risks\" (chronic disease and diabetes) are now ranked highest."
+]
 },
 {
 "cell_type": "markdown",
+"metadata": {
+"id": "nGjpffT5UrrL"
+},
 "source": [
 "## Add filtering\n",
 "\n",
 "You can use a [metadata filter](https://docs.pinecone.io/guides/data/understanding-metadata) to limit your search to records matching a filter expression.\n",
 "\n",
 "Your upserted records contain a `category` metadata field. Now use that field as a filter to search for records in the “digestive system” category:"
-],
-"metadata": {
-"id": "nGjpffT5UrrL"
-}
+]
 },
 {
 "cell_type": "code",
+"execution_count": null,
+"metadata": {
+"id": "KkH7Wre3Ux5B"
+},
+"outputs": [],
 "source": [
 "# Search the index with a metadata filter\n",
 "filtered_results = index.query(\n",
@@ -432,21 +437,16 @@
 ")\n",
 "\n",
 "print(filtered_results)"
-],
-"metadata": {
-"id": "KkH7Wre3Ux5B"
-},
-"execution_count": null,
-"outputs": []
+]
 },
 {
 "cell_type": "markdown",
-"source": [
-"Notice that the response includes only the one record in the “digestive system” category."
-],
 "metadata": {
 "id": "awumy10tU2Lv"
-}
+},
+"source": [
+"Notice that the response includes only the one record in the “digestive system” category."
+]
 },
 {
 "cell_type": "markdown",
@@ -494,4 +494,4 @@
 },
 "nbformat": 4,
 "nbformat_minor": 0
-}
+}
