Skip to content

Keep searching, reading webpages, reasoning until it finds the answer (or exceeding the token budget)

License

Notifications You must be signed in to change notification settings

jina-ai/node-DeepResearch

Repository files navigation

DeepResearch

Official UI | UI Code | Stable API | Blog

Keep searching, reading webpages, reasoning until an answer is found (or the token budget is exceeded). Useful for deeply investigating a query.

Important

Unlike OpenAI/Gemini/Perplexity's "Deep Research", we focus solely on finding the right answers via our iterative process. We don't optimize for long-form articles, that's a completely different problem – so if you need quick, concise answers from deep search, you're in the right place. If you're looking for AI-generated long reports like OpenAI/Gemini/Perplexity does, this isn't for you.

---
config:
  theme: mc
  look: handDrawn
---
flowchart LR
 subgraph Loop["until budget exceed"]
    direction LR
        Search["Search"]
        Read["Read"]
        Reason["Reason"]
  end
    Query(["Query"]) --> Loop
    Search --> Read
    Read --> Reason
    Reason --> Search
    Loop --> Answer(["Answer"])

Loading

Whether you like this implementation or not, I highly recommend you to read DeepSearch/DeepResearch implementation guide I wrote, which gives you a gentle intro to this topic.

Try it Yourself

We host an online deployment of this exact codebase, which allows you to do a vibe-check; or use it as daily productivity tools.

https://search.jina.ai

The official API is also available for you to use:

https://deepsearch.jina.ai/v1/chat/completions

Learn more about the API at https://jina.ai/deepsearch

Install

git clone https://github.com/jina-ai/node-DeepResearch.git
cd node-DeepResearch
npm install

安装部署视频教程 on Youtube

It is also available on npm but not recommended for now, as the code is still under active development.

Usage

We use Gemini (latest gemini-2.0-flash) / OpenAI / LocalLLM for reasoning, Jina Reader for searching and reading webpages, you can get a free API key with 1M tokens from jina.ai.

export GEMINI_API_KEY=...  # for gemini
# export OPENAI_API_KEY=... # for openai
# export LLM_PROVIDER=openai # for openai
export JINA_API_KEY=jina_...  # free jina api key, get from https://jina.ai/reader

npm run dev $QUERY

Official Site

You can try it on our official site.

Official API

You can also use our official DeepSearch API:

https://deepsearch.jina.ai/v1/chat/completions

You can use it with any OpenAI-compatible client.

For the authentication Bearer, API key, rate limit, get from https://jina.ai/deepsearch.

Client integration guidelines

If you are building a web/local/mobile client that uses Jina DeepSearch API, here are some design guidelines:

Demo

was recorded with gemini-1.5-flash, the latest gemini-2.0-flash leads to much better results!

Query: "what is the latest blog post's title from jina ai?" 3 steps; answer is correct! demo1

Query: "what is the context length of readerlm-v2?" 2 steps; answer is correct! demo1

Query: "list all employees from jina ai that u can find, as many as possible" 11 steps; partially correct! but im not in the list :( demo1

Query: "who will be the biggest competitor of Jina AI" 42 steps; future prediction kind, so it's arguably correct! atm Im not seeing weaviate as a competitor, but im open for the future "i told you so" moment. demo1

More examples:

# example: no tool calling 
npm run dev "1+1="
npm run dev "what is the capital of France?"

# example: 2-step
npm run dev "what is the latest news from Jina AI?"

# example: 3-step
npm run dev "what is the twitter account of jina ai's founder"

# example: 13-step, ambiguious question (no def of "big")
npm run dev "who is bigger? cohere, jina ai, voyage?"

# example: open question, research-like, long chain of thoughts
npm run dev "who will be president of US in 2028?"
npm run dev "what should be jina ai strategy for 2025?"

Use Local LLM

Note, not every LLM works with our reasoning flow, we need those who support structured output (sometimes called JSON Schema output, object output) well. Feel free to purpose a PR to add more open-source LLMs to the working list.

If you use Ollama or LMStudio, you can redirect the reasoning request to your local LLM by setting the following environment variables:

export LLM_PROVIDER=openai  # yes, that's right - for local llm we still use openai client
export OPENAI_BASE_URL=http://127.0.0.1:1234/v1  # your local llm endpoint
export OPENAI_API_KEY=whatever  # random string would do, as we don't use it (unless your local LLM has authentication)
export DEFAULT_MODEL_NAME=qwen2.5-7b  # your local llm model name

OpenAI-Compatible Server API

If you have a GUI client that supports OpenAI API (e.g. CherryStudio, Chatbox) , you can simply config it to use this server.

demo1

Start the server:

# Without authentication
npm run serve

# With authentication (clients must provide this secret as Bearer token)
npm run serve --secret=your_secret_token

The server will start on http://localhost:3000 with the following endpoint:

POST /v1/chat/completions

# Without authentication
curl http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "jina-deepsearch-v1",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'

# With authentication (when server is started with --secret)
curl http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your_secret_token" \
  -d '{
    "model": "jina-deepsearch-v1",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    "stream": true
  }'

Response format:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "jina-deepsearch-v1",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "YOUR FINAL ANSWER"
    },
    "logprobs": null,
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}

For streaming responses (stream: true), the server sends chunks in this format:

{
  "id": "chatcmpl-123",
  "object": "chat.completion.chunk",
  "created": 1694268190,
  "model": "jina-deepsearch-v1",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [{
    "index": 0,
    "delta": {
      "content": "..."
    },
    "logprobs": null,
    "finish_reason": null
  }]
}

Note: The think content in streaming responses is wrapped in XML tags:

<think>
[thinking steps...]
</think>
[final answer]

Docker Setup

Build Docker Image

To build the Docker image for the application, run the following command:

docker build -t deepresearch:latest .

Run Docker Container

To run the Docker container, use the following command:

docker run -p 3000:3000 --env GEMINI_API_KEY=your_gemini_api_key --env JINA_API_KEY=your_jina_api_key deepresearch:latest

Docker Compose

You can also use Docker Compose to manage multi-container applications. To start the application with Docker Compose, run:

docker-compose up

How Does it Work?

Not sure a flowchart helps, but here it is:

flowchart TD
    Start([Start]) --> Init[Initialize context & variables]
    Init --> CheckBudget{Token budget<br/>exceeded?}
    CheckBudget -->|No| GetQuestion[Get current question<br/>from gaps]
    CheckBudget -->|Yes| BeastMode[Enter Beast Mode]

    GetQuestion --> GenPrompt[Generate prompt]
    GenPrompt --> ModelGen[Generate response<br/>using Gemini]
    ModelGen --> ActionCheck{Check action<br/>type}

    ActionCheck -->|answer| AnswerCheck{Is original<br/>question?}
    AnswerCheck -->|Yes| EvalAnswer[Evaluate answer]
    EvalAnswer --> IsGoodAnswer{Is answer<br/>definitive?}
    IsGoodAnswer -->|Yes| HasRefs{Has<br/>references?}
    HasRefs -->|Yes| End([End])
    HasRefs -->|No| GetQuestion
    IsGoodAnswer -->|No| StoreBad[Store bad attempt<br/>Reset context]
    StoreBad --> GetQuestion

    AnswerCheck -->|No| StoreKnowledge[Store as intermediate<br/>knowledge]
    StoreKnowledge --> GetQuestion

    ActionCheck -->|reflect| ProcessQuestions[Process new<br/>sub-questions]
    ProcessQuestions --> DedupQuestions{New unique<br/>questions?}
    DedupQuestions -->|Yes| AddGaps[Add to gaps queue]
    DedupQuestions -->|No| DisableReflect[Disable reflect<br/>for next step]
    AddGaps --> GetQuestion
    DisableReflect --> GetQuestion

    ActionCheck -->|search| SearchQuery[Execute search]
    SearchQuery --> NewURLs{New URLs<br/>found?}
    NewURLs -->|Yes| StoreURLs[Store URLs for<br/>future visits]
    NewURLs -->|No| DisableSearch[Disable search<br/>for next step]
    StoreURLs --> GetQuestion
    DisableSearch --> GetQuestion

    ActionCheck -->|visit| VisitURLs[Visit URLs]
    VisitURLs --> NewContent{New content<br/>found?}
    NewContent -->|Yes| StoreContent[Store content as<br/>knowledge]
    NewContent -->|No| DisableVisit[Disable visit<br/>for next step]
    StoreContent --> GetQuestion
    DisableVisit --> GetQuestion

    BeastMode --> FinalAnswer[Generate final answer] --> End
Loading

About

Keep searching, reading webpages, reasoning until it finds the answer (or exceeding the token budget)

Topics

Resources

License

Stars

Watchers

Forks

Contributors 10

Languages