Commit 29349c1

OskarStark authored and chr-hertel committed
[Examples][Store] Implement indexing pipeline
1 parent 2a09fb0 commit 29349c1

30 files changed, +1173 -126 lines changed

demo/CLAUDE.md

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ composer install
 echo "OPENAI_API_KEY='sk-...'" > .env.local

 # Initialize vector store
-symfony console app:blog:embed -vv
+symfony console ai:store:index blog -vv

 # Test vector store
 symfony console app:blog:query

demo/README.md

Lines changed: 1 addition & 1 deletion
@@ -74,7 +74,7 @@ The [Chroma DB](https://www.trychroma.com/) is a vector store that is used to st
 To initialize the Chroma DB, you need to run the following command:

 ```shell
-symfony console app:blog:embed -vv
+symfony console ai:store:index blog -vv
 ```

 Now you should be able to run the test command and get some results:

demo/config/packages/ai.yaml

Lines changed: 7 additions & 1 deletion
@@ -59,7 +59,11 @@ ai:
                 class: 'Symfony\AI\Platform\Bridge\OpenAi\Embeddings'
                 name: !php/const Symfony\AI\Platform\Bridge\OpenAi\Embeddings::TEXT_ADA_002
     indexer:
-        default:
+        blog:
+            loader: 'App\Blog\FeedLoader'
+            source: 'https://feeds.feedburner.com/symfony/blog'
+            transformers:
+                - 'Symfony\AI\Store\Document\Transformer\TextTrimTransformer'
             vectorizer: 'ai.vectorizer.openai_embeddings'
             store: 'ai.store.chroma_db.symfonycon'

@@ -75,3 +79,5 @@ services:
     Symfony\AI\Agent\Toolbox\Tool\Wikipedia: ~
     Symfony\AI\Agent\Toolbox\Tool\SimilaritySearch:
         $vectorizer: '@ai.vectorizer.openai_embeddings'
+
+    Symfony\AI\Store\Document\Transformer\TextTrimTransformer: ~

demo/src/Blog/Command/EmbedCommand.php

Lines changed: 0 additions & 37 deletions
This file was deleted.

demo/src/Blog/Embedder.php

Lines changed: 0 additions & 35 deletions
This file was deleted.

demo/src/Blog/FeedLoader.php

Lines changed: 17 additions & 5 deletions
@@ -11,23 +11,33 @@

 namespace App\Blog;

+use Symfony\AI\Store\Document\LoaderInterface;
+use Symfony\AI\Store\Document\Metadata;
+use Symfony\AI\Store\Document\TextDocument;
+use Symfony\AI\Store\Exception\InvalidArgumentException;
 use Symfony\Component\DomCrawler\Crawler;
 use Symfony\Component\Uid\Uuid;
 use Symfony\Contracts\HttpClient\HttpClientInterface;

-class FeedLoader
+final class FeedLoader implements LoaderInterface
 {
     public function __construct(
         private HttpClientInterface $httpClient,
     ) {
     }

     /**
-     * @return Post[]
+     * @param ?string $source RSS feed URL
+     * @param array<string, mixed> $options
+     *
+     * @return iterable<TextDocument>
      */
-    public function load(): array
+    public function load(?string $source, array $options = []): iterable
     {
-        $result = $this->httpClient->request('GET', 'https://feeds.feedburner.com/symfony/blog');
+        if (null === $source) {
+            throw new InvalidArgumentException('FeedLoader requires a RSS feed URL as source, null given.');
+        }
+        $result = $this->httpClient->request('GET', $source);

         $posts = [];
         $crawler = new Crawler($result->getContent());
@@ -44,6 +54,8 @@ public function load(): array
             );
         });

-        return $posts;
+        foreach ($posts as $post) {
+            yield new TextDocument($post->id, $post->toString(), new Metadata($post->toArray()));
+        }
     }
 }
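
The reworked `FeedLoader::load()` is a generator: it yields documents one by one instead of returning an array. The sketch below illustrates that pattern in isolation. `SketchLoaderInterface`, `SketchDocument`, and `LineLoader` are hypothetical stand-ins invented here so the snippet runs without symfony/ai installed; only the `load(?string $source, array $options = []): iterable` signature is taken from the diff, and the real `LoaderInterface` and `TextDocument` may differ in other respects.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the loader contract; it mirrors only the
// method signature visible in the FeedLoader diff.
interface SketchLoaderInterface
{
    /** @return iterable<SketchDocument> */
    public function load(?string $source, array $options = []): iterable;
}

// Hypothetical stand-in for a text document (id, content, metadata).
final class SketchDocument
{
    public function __construct(
        public readonly string $id,
        public readonly string $content,
        public readonly array $metadata = [],
    ) {
    }
}

// Treats every non-empty line of the source string as one document.
final class LineLoader implements SketchLoaderInterface
{
    public function load(?string $source, array $options = []): iterable
    {
        if (null === $source) {
            throw new InvalidArgumentException('LineLoader requires a source string, null given.');
        }

        foreach (explode("\n", $source) as $index => $line) {
            $line = trim($line);
            if ('' === $line) {
                continue; // skip blank lines
            }

            yield new SketchDocument((string) $index, $line, ['line' => $index + 1]);
        }
    }
}

$loader = new LineLoader();
$documents = iterator_to_array($loader->load("first post\n\nsecond post"), false);
```

Because the method body contains `yield`, PHP runs none of it until iteration starts, so in a loader written this way the null-source exception surfaces on the first iteration rather than at the `load()` call.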

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
+<?php
+
+/*
+ * This file is part of the Symfony package.
+ *
+ * (c) Fabien Potencier <fabien@symfony.com>
+ *
+ * For the full copyright and license information, please view the LICENSE
+ * file that was distributed with this source code.
+ */
+
+use Symfony\AI\Platform\Bridge\OpenAi\Embeddings;
+use Symfony\AI\Platform\Bridge\OpenAi\PlatformFactory;
+use Symfony\AI\Store\Bridge\Local\InMemoryStore;
+use Symfony\AI\Store\Document\Loader\TextFileLoader;
+use Symfony\AI\Store\Document\Transformer\TextReplaceTransformer;
+use Symfony\AI\Store\Document\Transformer\TextSplitTransformer;
+use Symfony\AI\Store\Document\Vectorizer;
+use Symfony\AI\Store\Indexer;
+
+require_once dirname(__DIR__).'/bootstrap.php';
+
+$platform = PlatformFactory::create(env('OPENAI_API_KEY'), http_client());
+$store = new InMemoryStore();
+$vectorizer = new Vectorizer($platform, new Embeddings('text-embedding-3-small'));
+$indexer = new Indexer(
+    loader: new TextFileLoader(),
+    vectorizer: $vectorizer,
+    store: $store,
+    source: [
+        dirname(__DIR__, 2).'/fixtures/movies/gladiator.md',
+        dirname(__DIR__, 2).'/fixtures/movies/inception.md',
+        dirname(__DIR__, 2).'/fixtures/movies/jurassic-park.md',
+    ],
+    transformers: [
+        new TextReplaceTransformer(search: '## Plot', replace: '## Synopsis'),
+        new TextSplitTransformer(chunkSize: 500, overlap: 100),
+    ],
+);
+
+$indexer->index();
+
+$vector = $vectorizer->vectorize('Roman gladiator revenge');
+$results = $store->query($vector);
+foreach ($results as $i => $document) {
+    echo sprintf("%d. %s\n", $i + 1, substr($document->id, 0, 40).'...');
+}
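
The example above passes `chunkSize: 500, overlap: 100` to `TextSplitTransformer`. As a rough illustration of what fixed-size chunking with overlap means, here is a standalone sketch. `splitWithOverlap()` is a hypothetical helper written for this note, not the library's implementation; it counts bytes via `strlen()`/`substr()`, and the real transformer may count characters and handle edge cases differently.

```php
<?php

declare(strict_types=1);

/**
 * Splits $text into chunks of at most $chunkSize bytes. Each chunk after
 * the first starts $overlap bytes before the previous chunk ended.
 *
 * @return list<string>
 */
function splitWithOverlap(string $text, int $chunkSize, int $overlap): array
{
    if ($chunkSize <= 0 || $overlap < 0 || $overlap >= $chunkSize) {
        throw new InvalidArgumentException('Expected 0 <= overlap < chunkSize.');
    }

    $length = strlen($text);
    $step = $chunkSize - $overlap; // how far the window advances each time
    $chunks = [];

    for ($start = 0; $start < $length; $start += $step) {
        $chunks[] = substr($text, $start, $chunkSize);

        if ($start + $chunkSize >= $length) {
            break; // this chunk already reaches the end of the text
        }
    }

    return $chunks;
}

$chunks = splitWithOverlap('abcdefghij', chunkSize: 4, overlap: 1);
// $chunks is ['abcd', 'defg', 'ghij']: neighbours share one byte
```

With a chunk size of 500 and an overlap of 100, a window like this advances 400 bytes at a time, so text near a chunk boundary appears in both neighbouring chunks.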

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
+<?php
+
+/*
+ * This file is part of the Symfony package.
+ *
+ * (c) Fabien Potencier <fabien@symfony.com>
+ *
+ * For the full copyright and license information, please view the LICENSE
+ * file that was distributed with this source code.
+ */
+
+use Symfony\AI\Platform\Bridge\OpenAi\Embeddings;
+use Symfony\AI\Platform\Bridge\OpenAi\PlatformFactory;
+use Symfony\AI\Store\Bridge\Local\InMemoryStore;
+use Symfony\AI\Store\Document\Loader\InMemoryLoader;
+use Symfony\AI\Store\Document\Metadata;
+use Symfony\AI\Store\Document\TextDocument;
+use Symfony\AI\Store\Document\Transformer\TextSplitTransformer;
+use Symfony\AI\Store\Document\Vectorizer;
+use Symfony\AI\Store\Indexer;
+use Symfony\Component\Uid\Uuid;
+
+require_once dirname(__DIR__).'/bootstrap.php';
+
+$platform = PlatformFactory::create(env('OPENAI_API_KEY'), http_client());
+$store = new InMemoryStore();
+$vectorizer = new Vectorizer($platform, new Embeddings('text-embedding-3-small'));
+
+$documents = [
+    new TextDocument(
+        Uuid::v4(),
+        'Artificial Intelligence is transforming the way we work and live. Machine learning algorithms can now process vast amounts of data and make predictions with remarkable accuracy.',
+        new Metadata(['title' => 'AI Revolution'])
+    ),
+    new TextDocument(
+        Uuid::v4(),
+        'Climate change is one of the most pressing challenges of our time. Renewable energy sources like solar and wind power are becoming increasingly important for a sustainable future.',
+        new Metadata(['title' => 'Climate Action'])
+    ),
+];
+
+$indexer = new Indexer(
+    loader: new InMemoryLoader($documents),
+    vectorizer: $vectorizer,
+    store: $store,
+    source: null,
+    transformers: [
+        new TextSplitTransformer(chunkSize: 100, overlap: 20),
+    ],
+);
+
+$indexer->index();
+
+$vector = $vectorizer->vectorize('machine learning artificial intelligence');
+$results = $store->query($vector);
+foreach ($results as $i => $document) {
+    echo sprintf("%d. %s\n", $i + 1, substr($document->id, 0, 40).'...');
+}
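
Both examples end by vectorizing a query string and passing the vector to `$store->query()`. How `InMemoryStore` ranks its results is not shown in this diff. As a general illustration only, the sketch below ranks stored vectors by cosine similarity, one common choice for embedding search. `cosineSimilarity()` and `rankBySimilarity()` are hypothetical helpers and the two-dimensional vectors are toy data.

```php
<?php

declare(strict_types=1);

/**
 * Cosine of the angle between two equal-length vectors.
 *
 * @param list<float> $a
 * @param list<float> $b
 */
function cosineSimilarity(array $a, array $b): float
{
    if ([] === $a || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    if (0.0 === $normA || 0.0 === $normB) {
        return 0.0; // a zero vector has no direction; treat it as unrelated
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Returns the ids of $stored ordered from most to least similar to $query.
 *
 * @param list<float>                $query
 * @param array<string, list<float>> $stored
 *
 * @return list<string>
 */
function rankBySimilarity(array $query, array $stored): array
{
    $scores = [];
    foreach ($stored as $id => $vector) {
        $scores[$id] = cosineSimilarity($query, $vector);
    }
    arsort($scores); // highest similarity first, keys preserved

    return array_keys($scores);
}

$ranked = rankBySimilarity([1.0, 0.0], [
    'climate' => [0.1, 0.9],
    'ai' => [0.9, 0.1],
]);
```

Here `$ranked` puts `'ai'` ahead of `'climate'`, because the `'ai'` vector points in nearly the same direction as the query.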

fixtures/movies/gladiator.md

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+# Gladiator (2000)
+
+**IMDB**: https://www.imdb.com/title/tt0172495/
+
+**Director:** Ridley Scott
+
+## Cast
+
+- **Russell Crowe** as Maximus Decimus Meridius
+- **Joaquin Phoenix** as Emperor Commodus
+- **Connie Nielsen** as Lucilla
+- **Oliver Reed** as Proximo
+- **Derek Jacobi** as Senator Gracchus
+- **Djimon Hounsou** as Juba
+- **Richard Harris** as Marcus Aurelius
+- **Ralf Möller** as Hagen
+- **Tommy Flanagan** as Cicero
+- **David Schofield** as Falco
+
+## Plot
+
+A former Roman General sets out to exact vengeance against the corrupt emperor who murdered his family and sent him into slavery.
+
+**Maximus Decimus Meridius** is a powerful Roman general beloved by the people and the aging Emperor **Marcus Aurelius**. As Marcus Aurelius lies dying, he makes known his wish that Maximus should succeed him and return Rome to the former glory of the Republic rather than the corrupt Empire it has become.
+
+However, Marcus Aurelius's son **Commodus** learns of his father's plan and murders him before he can publicly name Maximus as his successor. Commodus then orders the execution of Maximus and his family. Maximus escapes the execution but arrives at his farm too late to save his wife and son.
+
+Wounded and devastated, Maximus is captured by slave traders and forced to become a gladiator. Under the training of **Proximo**, a former gladiator, Maximus becomes a skilled fighter and eventually makes his way to the **Colosseum** in Rome, where he gains fame and the crowd's favor.
+
+Using his newfound popularity with the people, Maximus seeks to avenge the murder of his family and fulfill his promise to Marcus Aurelius to restore Rome to a republic. The film culminates in a final confrontation between Maximus and Commodus in the arena.
+
+The film explores themes of *honor*, *revenge*, *political corruption*, and the struggle between personal desires and duty to the greater good.

fixtures/movies/inception.md

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+# Inception (2010)
+
+**IMDB**: https://www.imdb.com/title/tt1375666/
+
+**Director:** Christopher Nolan
+
+## Cast
+
+- **Leonardo DiCaprio** as Dom Cobb
+- **Marion Cotillard** as Mal Cobb
+- **Tom Hardy** as Eames
+- **Elliot Page** as Ariadne
+- **Ken Watanabe** as Saito
+- **Dileep Rao** as Yusuf
+- **Cillian Murphy** as Robert Fischer Jr.
+- **Tom Berenger** as Peter Browning
+- **Michael Caine** as Professor Stephen Miles
+- **Lukas Haas** as Nash
+
+## Plot
+
+A skilled thief is given a chance at redemption if he can successfully perform inception, the act of planting an idea in someone's subconscious.
+
+**Dom Cobb** is a skilled thief who specializes in *extraction* - stealing secrets from people's subconscious minds while they dream. This unique skill has made him a valuable player in the world of corporate espionage, but it has also cost him everything he loves. Cobb's rare ability has made him a coveted player in this treacherous new world of corporate espionage, but it has also made him an international fugitive and cost him everything he has ever loved.
+
+Now Cobb is being offered a chance at redemption. One last job could give him his life back but only if he can accomplish the impossible - **inception**. Instead of the perfect heist, Cobb and his team of specialists have to pull off the reverse: their task is not to steal an idea but to plant one. If they succeed, it could be the perfect crime.
+
+The film explores themes of *reality*, *dreams*, *memory*, and the nature of consciousness through multiple layers of dream states, creating a complex narrative structure that challenges both characters and audience to question what is real.
