Commit 2fde656

Add support for computing CLIP image and text embeddings separately (Closes huggingface#148) (huggingface#227)
* Define custom CLIP ONNX configs
* Update conversion script
* Support specifying custom model file name
* Use int64 for CLIP input ids
* Add support for CLIP text and vision models
* Fix JSDoc
* Add docs for `CLIPTextModelWithProjection`
* Add docs for `CLIPVisionModelWithProjection`
* Add unit test for CLIP text models
* Add unit test for CLIP vision models
* Set resize precision to 3 decimal places
* Fix `RawImage.save()` function
* Throw error when reading image and status != 200
* Create basic semantic image search application
* Separate out components
* Add `update-database` script
* Update transformers.js version
1 parent 27920d8 commit 2fde656

30 files changed: +1060 -105 lines changed
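Computing text and image embeddings separately is useful because the two vectors can then be compared directly, typically with cosine similarity. The following is a minimal sketch of that comparison, not code from this commit: the function names are hypothetical and the 3-dimensional vectors are toy stand-ins for real CLIP embeddings.

```javascript
// Dot product of two equal-length numeric vectors.
function dot(a, b) {
    let sum = 0;
    for (let i = 0; i < a.length; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// Cosine similarity: the dot product divided by the product of the norms.
function cosineSimilarity(a, b) {
    if (a.length !== b.length) {
        throw new Error('Embeddings must have the same length.');
    }
    return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// Toy vectors (assumption): one "text" embedding and two "image" embeddings.
const textEmbedding = [1, 0, 1];
const imageEmbeddings = {
    beach: [2, 0, 2],  // same direction as the text embedding
    forest: [0, 3, 0], // orthogonal to the text embedding
};

// Score every image against the text query.
const scores = Object.fromEntries(
    Object.entries(imageEmbeddings).map(
        ([id, embedding]) => [id, cosineSimilarity(textEmbedding, embedding)],
    ),
);
console.log(scores);
```

Because cosine similarity ignores vector length, `beach` scores close to 1 even though its embedding is twice as long as the query, while the orthogonal `forest` scores 0.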
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+SUPABASE_URL=your-project-url
+SUPABASE_ANON_KEY=your-anon-key
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+{
+  "extends": "next/core-web-vitals"
+}
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.
+
+# dependencies
+/node_modules
+/.pnp
+.pnp.js
+
+# testing
+/coverage
+
+# next.js
+/.next/
+/out/
+
+# production
+/build
+
+# misc
+.DS_Store
+*.pem
+
+# debug
+npm-debug.log*
+yarn-debug.log*
+yarn-error.log*
+
+# local env files
+.env*.local
+
+# vercel
+.vercel
+
+# typescript
+*.tsbuildinfo
+next-env.d.ts
Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+# syntax=docker/dockerfile:1.4
+
+# Adapted from https://github.com/vercel/next.js/blob/e60a1e747c3f521fc24dfd9ee2989e13afeb0a9b/examples/with-docker/Dockerfile
+# For more information, see https://nextjs.org/docs/pages/building-your-application/deploying#docker-image
+
+FROM node:18 AS base
+
+# Install dependencies only when needed
+FROM base AS deps
+WORKDIR /app
+
+# Install dependencies based on the preferred package manager
+COPY --link package.json yarn.lock* package-lock.json* pnpm-lock.yaml* ./
+RUN \
+  if [ -f yarn.lock ]; then yarn --frozen-lockfile; \
+  elif [ -f package-lock.json ]; then npm ci; \
+  elif [ -f pnpm-lock.yaml ]; then yarn global add pnpm && pnpm i --frozen-lockfile; \
+  else echo "Lockfile not found." && exit 1; \
+  fi
+
+
+# Rebuild the source code only when needed
+FROM base AS builder
+WORKDIR /app
+COPY --from=deps --link /app/node_modules ./node_modules
+COPY --link . .
+
+# Next.js collects completely anonymous telemetry data about general usage.
+# Learn more here: https://nextjs.org/telemetry
+# Uncomment the following line in case you want to disable telemetry during the build.
+# ENV NEXT_TELEMETRY_DISABLED 1
+
+RUN npm run build
+
+# If using yarn comment out above and use below instead
+# RUN yarn build
+
+# Production image, copy all the files and run next
+FROM base AS runner
+WORKDIR /app
+
+ENV NODE_ENV production
+# Uncomment the following line in case you want to disable telemetry during runtime.
+# ENV NEXT_TELEMETRY_DISABLED 1
+
+RUN \
+  addgroup --system --gid 1001 nodejs; \
+  adduser --system --uid 1001 nextjs
+
+COPY --from=builder --link /app/public ./public
+
+# Automatically leverage output traces to reduce image size
+# https://nextjs.org/docs/advanced-features/output-file-tracing
+COPY --from=builder --link --chown=1001:1001 /app/.next/standalone ./
+COPY --from=builder --link --chown=1001:1001 /app/.next/static ./.next/static
+
+USER nextjs
+
+EXPOSE 3000
+
+ENV PORT 3000
+ENV HOSTNAME localhost
+
+# Allow the running process to write model files to the cache folder.
+# NOTE: In practice, you would probably want to pre-download the model files to avoid having to download them on-the-fly.
+RUN mkdir -p /app/node_modules/@xenova/.cache/
+RUN chmod 777 -R /app/node_modules/@xenova/
+
+CMD ["node", "server.js"]
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+This is a [Next.js](https://nextjs.org/) project bootstrapped with [`create-next-app`](https://github.com/vercel/next.js/tree/canary/packages/create-next-app).
+
+## Getting Started
+
+First, run the development server:
+
+```bash
+npm run dev
+# or
+yarn dev
+# or
+pnpm dev
+```
+
+Open [http://localhost:3000](http://localhost:3000) with your browser to see the result.
+
+You can start editing the page by modifying `app/page.js`. The page auto-updates as you edit the file.
+
+This project uses [`next/font`](https://nextjs.org/docs/basic-features/font-optimization) to automatically optimize and load Inter, a custom Google Font.
+
+## Learn More
+
+To learn more about Next.js, take a look at the following resources:
+
+- [Next.js Documentation](https://nextjs.org/docs) - learn about Next.js features and API.
+- [Learn Next.js](https://nextjs.org/learn) - an interactive Next.js tutorial.
+
+You can check out [the Next.js GitHub repository](https://github.com/vercel/next.js/) - your feedback and contributions are welcome!
+
+## Deploy on Vercel
+
+The easiest way to deploy your Next.js app is to use the [Vercel Platform](https://vercel.com/new?utm_medium=default-template&filter=next.js&utm_source=create-next-app&utm_campaign=create-next-app-readme) from the creators of Next.js.
+
+Check out our [Next.js deployment documentation](https://nextjs.org/docs/deployment) for more details.
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+{
+  "compilerOptions": {
+    "paths": {
+      "@/*": ["./src/*"]
+    }
+  }
+}
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+/** @type {import('next').NextConfig} */
+const nextConfig = {
+    // (Optional) Export as a standalone site
+    // See https://nextjs.org/docs/pages/api-reference/next-config-js/output#automatically-copying-traced-files
+    output: 'standalone', // Feel free to modify/remove this option
+
+    // Indicate that these packages should not be bundled by webpack
+    experimental: {
+        serverComponentsExternalPackages: ['sharp', 'onnxruntime-node'],
+    },
+
+    // Define which domains we are allowed to load images from
+    images: {
+        domains: ['images.unsplash.com'],
+    },
+};
+
+module.exports = nextConfig;
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+{
+  "name": "semantic-image-search",
+  "version": "0.1.0",
+  "private": true,
+  "scripts": {
+    "dev": "next dev",
+    "build": "next build",
+    "start": "next start",
+    "lint": "next lint"
+  },
+  "dependencies": {
+    "@xenova/transformers": "^2.5.0",
+    "@supabase/supabase-js": "^2.31.0",
+    "autoprefixer": "10.4.14",
+    "blurhash": "^2.0.5",
+    "eslint": "8.45.0",
+    "eslint-config-next": "13.4.12",
+    "next": "13.4.12",
+    "postcss": "8.4.27",
+    "react": "18.2.0",
+    "react-dom": "18.2.0",
+    "tailwindcss": "3.3.3"
+  },
+  "overrides": {
+    "protobufjs": "^7.2.4"
+  }
+}
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+module.exports = {
+  plugins: {
+    tailwindcss: {},
+    autoprefixer: {},
+  },
+}
Lines changed: 1 addition & 0 deletions
Lines changed: 1 addition & 0 deletions
Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
+// Helper script to update the database with image embeddings
+
+import { AutoProcessor, RawImage, CLIPVisionModelWithProjection } from '@xenova/transformers';
+import { createClient } from '@supabase/supabase-js'
+
+if (!process.env.SUPABASE_SECRET_KEY) {
+    throw new Error('Missing `SUPABASE_SECRET_KEY` environment variable.')
+}
+
+// Create a single supabase client for interacting with your database
+const supabase = createClient(
+    process.env.SUPABASE_URL,
+    process.env.SUPABASE_SECRET_KEY,
+)
+
+let { data, error } = await supabase
+    .from('images')
+    .select('*')
+    .neq('ignore', true)
+    .is('image_embedding', null);
+
+if (error) {
+    throw error;
+}
+
+// Load processor and vision model
+const model_id = 'Xenova/clip-vit-base-patch16';
+const processor = await AutoProcessor.from_pretrained(model_id);
+const vision_model = await CLIPVisionModelWithProjection.from_pretrained(model_id, {
+    quantized: false,
+});
+
+for (const image_data of data) {
+    let image;
+    try {
+        image = await RawImage.read(image_data.photo_image_url);
+    } catch (e) {
+        // Unable to load image, so we ignore it
+        console.warn('Ignoring image due to error', e)
+        await supabase
+            .from('images')
+            .update({ ignore: true })
+            .eq('photo_id', image_data.photo_id)
+            .select()
+        continue;
+    }
+
+    // Read image and run processor
+    let image_inputs = await processor(image);
+
+    // Compute embeddings
+    const { image_embeds } = await vision_model(image_inputs);
+    const embed_as_list = image_embeds.tolist()[0];
+
+    // https://supabase.com/docs/guides/ai/vector-columns#storing-a-vector--embedding
+    const { data, error } = await supabase
+        .from('images')
+        .update({ image_embedding: embed_as_list })
+        .eq('photo_id', image_data.photo_id)
+        .select()
+
+    if (error) {
+        console.error('error', error)
+    } else {
+        console.log('success', image_data.photo_id)
+    }
+}
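The script above stores `image_embeds.tolist()[0]`: the model output is a batch of embeddings, and since one image is processed at a time, index `0` selects that image's vector as a plain array. A rough sketch of the idea follows, assuming a tensor is held as flat row-major data plus a `[rows, cols]` shape. `toNestedList` is a hypothetical helper written for illustration, not the library's `tolist()` implementation, and the values are made up.

```javascript
// Hypothetical helper: convert flat row-major data with shape [rows, cols]
// into an array of plain-number rows.
function toNestedList(data, dims) {
    const [rows, cols] = dims;
    if (data.length !== rows * cols) {
        throw new Error('Data length does not match the given dimensions.');
    }
    const nested = [];
    for (let r = 0; r < rows; ++r) {
        // Copy one row out of the flat buffer into a regular array.
        nested.push(Array.from(data.slice(r * cols, (r + 1) * cols)));
    }
    return nested;
}

// Toy batch (assumption): 2 embeddings of dimension 3.
const flat = new Float32Array([0.5, -1, 2, 0.25, 4, 8]);
const nested = toNestedList(flat, [2, 3]);

// With a batch of one image, the first row is the embedding to store.
const firstEmbedding = nested[0];
console.log(firstEmbedding);
```

A plain array of numbers is convenient here because it serializes to JSON without any extra conversion.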
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+import { AutoTokenizer, CLIPTextModelWithProjection } from "@xenova/transformers";
+import { createClient } from '@supabase/supabase-js'
+
+// Use the Singleton pattern to enable lazy construction of the pipeline.
+// NOTE: We wrap the class in a function to prevent code duplication (see below).
+const S = () => class ApplicationSingleton {
+    static model_id = 'Xenova/clip-vit-base-patch16';
+    static tokenizer = null;
+    static text_model = null;
+    static database = null;
+
+    static async getInstance() {
+        // Load tokenizer and text model
+        if (this.tokenizer === null) {
+            this.tokenizer = AutoTokenizer.from_pretrained(this.model_id);
+        }
+
+        if (this.text_model === null) {
+            this.text_model = CLIPTextModelWithProjection.from_pretrained(this.model_id, {
+                quantized: false,
+            });
+        }
+
+        if (this.database === null) {
+            this.database = createClient(
+                process.env.SUPABASE_URL,
+                process.env.SUPABASE_ANON_KEY,
+            )
+        }
+
+        return Promise.all([
+            this.tokenizer,
+            this.text_model,
+            this.database,
+        ]);
+    }
+}
+
+let ApplicationSingleton;
+if (process.env.NODE_ENV !== 'production') {
+    // When running in development mode, attach the pipeline to the
+    // global object so that it's preserved between hot reloads.
+    // For more information, see https://vercel.com/guides/nextjs-prisma-postgres
+    if (!global.ApplicationSingleton) {
+        global.ApplicationSingleton = S();
+    }
+    ApplicationSingleton = global.ApplicationSingleton;
+} else {
+    ApplicationSingleton = S();
+}
+export default ApplicationSingleton;
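The singleton above combines two ideas: it stores the pending promise so the expensive load runs only once, and it caches the class on the global object so a module re-evaluation reuses it. The sketch below isolates both ideas without any model or database. `loadResource`, `createSingleton`, `getSingletonClass`, and the `__demoSingleton` key are hypothetical names used only for illustration.

```javascript
// Hypothetical stand-in for an expensive loader such as a model download.
let loadCount = 0;
function loadResource(name) {
    loadCount += 1;
    return Promise.resolve({ name });
}

// Wrapping the class in a factory means the definition is written once and
// can either be cached on `globalThis` or instantiated directly.
const createSingleton = () => class LazySingleton {
    static resource = null;

    static getInstance() {
        // Keep the promise itself, so concurrent callers share a single load.
        if (this.resource === null) {
            this.resource = loadResource('demo-model');
        }
        return this.resource;
    }
};

// Reuse the cached class if an earlier module evaluation already created it.
function getSingletonClass(cacheKey = '__demoSingleton') {
    if (!globalThis[cacheKey]) {
        globalThis[cacheKey] = createSingleton();
    }
    return globalThis[cacheKey];
}

const first = getSingletonClass();
const second = getSingletonClass(); // simulates the module being evaluated again

first.getInstance();
second.getInstance();
console.log(first === second, loadCount);
```

Without the global cache, each re-evaluation would create a fresh class whose static fields start at `null`, and the resource would be loaded again.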
Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
+import Image from 'next/image'
+import { blurHashToDataURL } from '../utils.js'
+
+export function ImageGrid({ images, setCurrentImage }) {
+    return (
+        <div className="columns-2 gap-4 sm:columns-3 xl:columns-4 2xl:columns-5">
+            {images && images.map(({
+                photo_id,
+                photo_url,
+                photo_image_url,
+                photo_aspect_ratio,
+                photo_width,
+                photo_height,
+                blur_hash,
+                photo_description,
+                ai_description,
+                similarity,
+            }) => (
+                <div
+                    key={photo_id}
+                    href={photo_url}
+                    className='after:content group cursor-pointer relative mb-4 block w-full after:pointer-events-none after:absolute after:inset-0 after:rounded-lg after:shadow-highlight'
+                    onClick={() => {
+                        setCurrentImage({
+                            photo_id,
+                            photo_url,
+                            photo_image_url,
+                            photo_aspect_ratio,
+                            photo_width,
+                            photo_height,
+                            blur_hash,
+                            photo_description,
+                            ai_description,
+                            similarity,
+                        });
+                    }}
+                >
+                    <Image
+                        alt={photo_description || ai_description || ""}
+                        className="transform rounded-lg brightness-90 transition will-change-auto group-hover:brightness-110"
+                        style={{ transform: 'translate3d(0, 0, 0)' }}
+                        placeholder="blur"
+                        blurDataURL={blurHashToDataURL(blur_hash)}
+                        src={`${photo_image_url}?auto=format&fit=crop&w=480&q=80`}
+                        width={480}
+                        height={480 / photo_aspect_ratio}
+                        unoptimized={true}
+                    />
+                </div>
+            ))}
+        </div>)
+}
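The component above requests a fixed-width thumbnail and derives its height from the stored aspect ratio (`480 / photo_aspect_ratio`), which lets the layout reserve the right amount of space before the image loads. A small sketch of that calculation is shown below. `thumbnailProps` is a hypothetical helper and the URL is a placeholder; the query parameters mirror the ones in the component.

```javascript
// Hypothetical helper: build the thumbnail URL and the matching dimensions.
function thumbnailProps(photoImageUrl, aspectRatio, width = 480) {
    // Same query parameters as the component's `src` template string.
    const params = new URLSearchParams({
        auto: 'format',
        fit: 'crop',
        w: String(width),
        q: '80',
    });
    return {
        src: `${photoImageUrl}?${params}`,
        width,
        // Aspect ratio is width / height, so height = width / aspectRatio.
        height: width / aspectRatio,
    };
}

// Placeholder URL (assumption) with a 3:2 landscape aspect ratio.
const props = thumbnailProps('https://images.unsplash.com/photo-example', 1.5);
console.log(props);
```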