@Josipmrden (Contributor) commented Apr 15, 2025

Release note

General production guide, and GraphRAG production guide

Related product PRs

PRs from product repo this doc page is related to:
(paste the links to the PRs)

Checklist:

  • Add appropriate milestone (current release cycle)
  • Add bugfix or feature label, based on the product PR type you're documenting
  • Make sure all relevant tech details are documented
  • Check all content with Grammarly
  • Perform a self-review of my code
  • The build passes locally
  • My changes generate no new warnings or errors

vercel bot commented Apr 15, 2025

The latest updates on your projects.

| Name | Status | Updated (UTC) |
| --- | --- | --- |
| documentation | ✅ Ready | Apr 28, 2025 6:47am |

@Josipmrden changed the base branch from main to memgraph-3-2 April 15, 2025 10:04
@Josipmrden changed the base branch from memgraph-3-2 to main April 16, 2025 13:34
@Josipmrden changed the title from "Memgraph in production guides" to "Memgraph in production guides (general and GraphRAG)" Apr 24, 2025
@Josipmrden requested review from antejavor and matea16 April 24, 2025 11:39
* [functions](/querying/functions): return a null value

Please note that deleting same part of the graph from parallel transaction will lead to undefined behavior.
**Please note that deleting same part of the graph from parallel transaction will lead to undefined behavior.**

Suggested change
**Please note that deleting same part of the graph from parallel transaction will lead to undefined behavior.**
<Callout type="warning">
Please note that deleting the same part of the graph from a parallel transaction will lead to undefined behavior.
</Callout>

| Cores | 1 vCPU | ≥ 8 vCPUs (≥ 4 physical cores) |
| Network | 100 Mbps | ≥ 1 Gbps |

The disk is used for storing database [durability files](/configuration/data-durability-and-backup) - snapshots and write-ahead

Suggested change
The disk is used for storing database [durability files](/configuration/data-durability-and-backup) - snapshots and write-ahead
The disk is used for storing database [durability files](/configuration/data-durability-and-backup), including snapshots and write-ahead

| Network | 100 Mbps | ≥ 1 Gbps |

The disk is used for storing database [durability files](/configuration/data-durability-and-backup) - snapshots and write-ahead
logs. By default, Memgraph stores 3 latest snapshots in the database (flag to adjust this is `--storage-snapshot-retention-count`).

Suggested change
logs. By default, Memgraph stores 3 latest snapshots in the database (flag to adjust this is `--storage-snapshot-retention-count`).
logs. By default, Memgraph stores the **three most recent snapshots** in the database (this can be configured using the `--storage-snapshot-retention-count` flag).
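To make the retention behavior in this suggestion concrete, here is a minimal Python sketch of what "keep the N most recent snapshots" means. It is not Memgraph's implementation: `prune_snapshots` and the timestamped file names are hypothetical, and only the default of three comes from the text above.

```python
def prune_snapshots(snapshot_names, retention_count=3):
    """Split snapshots into (kept, removed), keeping the newest ones.

    Hypothetical helper: assumes the names sort chronologically,
    as timestamp-prefixed names do.
    """
    if retention_count < 1:
        raise ValueError("retention_count must be at least 1")
    ordered = sorted(snapshot_names)
    return ordered[-retention_count:], ordered[:-retention_count]


# Hypothetical snapshot names, oldest to newest.
snapshots = [
    "20250415T100000_snapshot",
    "20250416T100000_snapshot",
    "20250417T100000_snapshot",
    "20250418T100000_snapshot",
    "20250419T100000_snapshot",
]
kept, removed = prune_snapshots(snapshots)
print("kept:", kept)
print("removed:", removed)
```

With a retention count of three, disk usage for snapshots is bounded by roughly three times the size of one snapshot, plus the write-ahead logs.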

Comment on lines 124 to 125
The amount of CPU cores varies per use case. For horizontal scalability, you can further increase the amount of cores on your
system for additional scalability.

Suggested change
The amount of CPU cores varies per use case. For horizontal scalability, you can further increase the amount of cores on your
system for additional scalability.
The number of CPU cores required depends on your specific use case. For horizontal scalability, you can increase the number of available cores on your system for additional scalability.

Comment on lines 11 to 14
When deploying Memgraph in production, it is essential to consider a set of prerequisites to ensure optimal performance,
scalability, and resilience. This includes hardware considerations, the correct sizing of instances, configuring drivers, using
appropriate flags when starting Memgraph, importing data, and connecting to external sources. The guidelines in this section will
help you make informed decisions for your specific use case, ensuring that Memgraph performs effectively in your environment.

Suggested change
When deploying Memgraph in production, it is essential to consider a set of prerequisites to ensure optimal performance,
scalability, and resilience. This includes hardware considerations, the correct sizing of instances, configuring drivers, using
appropriate flags when starting Memgraph, importing data, and connecting to external sources. The guidelines in this section will
help you make informed decisions for your specific use case, ensuring that Memgraph performs effectively in your environment.
When deploying Memgraph in production, it is essential to consider a set of prerequisites to ensure optimal **performance**, **scalability** and **resilience**. That includes decisions about hardware configurations and integration strategies.
This guide is your starting point to production-readiness with Memgraph.
## ✅ What you'll need to consider
Before you dive into specific setups, it's important to think about:
- **Hardware requirements** and **instance sizing**
- **Driver configuration**
- **Flags** when starting Memgraph
- **Data import** best practices
- Connecting to **external sources**
These factors ensure Memgraph performs effectively in your environment, no matter the use case.


Reason for that is because environment variables always override any system settings that are set via queries.

Additionally, please check out [how to set up the Memgraph Lab Enterprise license](/data-visualization/user-manual/remote-storage#how-to-set-it-up)

Suggested change
Additionally, please check out [how to set up the Memgraph Lab Enterprise license](/data-visualization/user-manual/remote-storage#how-to-set-it-up)
Additionally, please check out [how to set up the Memgraph Lab Enterprise license](/memgraph-lab/configuration#adding-memgraph-enterprise-license)

Memgraph fully supports the **Cypher query language**, making it easy to express complex graph patterns. In addition to Cypher,
Memgraph has **built-in path traversal capabilities** at the core of the database, enabling **lightning-fast traversals** optimized for
performance-critical use cases. You can learn more about these in our
[Deep Path Traversal guide](/advanced-algorithms/deep-path-traversal).

Suggested change
[Deep Path Traversal guide](/advanced-algorithms/deep-path-traversal).
[Deep path traversal guide](/advanced-algorithms/deep-path-traversal).

**[MAGE library](/advanced-algorithms/available-algorithms)**, which includes a wide range of pre-built
**graph algorithms** and **procedures** for tasks like community detection, centrality scoring, node similarity, and more.

We encourage users to explore the other guides in the **"Memgraph in Production"** series, where you'll find detailed examples and

Suggested change
We encourage users to explore the other guides in the **"Memgraph in Production"** series, where you'll find detailed examples and
We encourage users to explore the other guides in the *Memgraph in Production* series, where you'll find detailed examples and

description: General suggestions when working with Memgraph, from testing to production.
---

import { Callout } from 'nextra/components'

Suggested change
import { Callout } from 'nextra/components'
import { Callout } from 'nextra/components'
import {CommunityLinks} from '/components/social-card/CommunityLinks'
import { Steps } from 'nextra/components'

**graph algorithms** and **procedures** for tasks like community detection, centrality scoring, node similarity, and more.

We encourage users to explore the other guides in the **"Memgraph in Production"** series, where you'll find detailed examples and
recommendations on which types of queries are most effective based on your specific **workload and use case**.

Suggested change
recommendations on which types of queries are most effective based on your specific **workload and use case**.
recommendations on which types of queries are most effective based on your specific **workload and use case**.
<CommunityLinks/>

page. It provides **foundational, use-case-agnostic advice** for deploying Memgraph in production.

This guide builds on that foundation, offering **additional recommendations tailored to specific workloads**.
In cases where guidance overlaps, the information in this chapter should be seen as **complementary or overriding**, depending

Suggested change
In cases where guidance overlaps, the information in this chapter should be seen as **complementary or overriding**, depending
In cases where guidance overlaps, consider the information here as **complementary or overriding**, depending

on the unique needs of your use case.
</Callout>

## When to use this guide

Suggested change
## When to use this guide
## Is this guide for you?

## When to use this guide

This guide is for you if you're exploring or building **GraphRAG (Graph-Augmented Retrieval-Augmented Generation)** systems.
Consider diving into this guide when:

Suggested change
Consider diving into this guide when:
You'll benefit from this content if:

Comment on lines 27 to 28
- 🔄 You need to **seamlessly extract knowledge graphs from multiple source systems**, especially when graph representation
naturally suits the data structure.

Suggested change
- 🔄 You need to **seamlessly extract knowledge graphs from multiple source systems**, especially when graph representation
naturally suits the data structure.
- 🔄 You need to **seamlessly extract knowledge graphs from multiple source systems**, especially when a graph structure naturally fits the data.

If any of these resonate with your project, this guide will walk you through the best practices and configurations to bring
GraphRAG to life using Memgraph.

## Why should you choose Memgraph for GraphRAG use cases

Suggested change
## Why should you choose Memgraph for GraphRAG use cases
## Why choose Memgraph for GraphRAG use cases?

Understand which **enterprise features** — such as security, access controls, and dynamic graph algorithms are
essential for production-ready GraphRAG deployments.

- **[Queries that best suit your workload](#queries-that-best-suit-your-workload)**

Suggested change
- **[Queries that best suit your workload](#queries-that-best-suit-your-workload)**
- [Queries that best suit your workload](#queries-that-best-suit-your-workload)

Learn how to use **deep path traversals**, **vector search**, and **dynamic MAGE algorithms** to efficiently retrieve contextual data and
handle **high-velocity graphs**.

- **[Memgraph ecosystem](#memgraph-ecosystem)**

Suggested change
- **[Memgraph ecosystem](#memgraph-ecosystem)**
- [Memgraph ecosystem](#memgraph-ecosystem)

- 🧭 **Reference-based indexing**: Embeddings will be stored only in the vector index, with the property storage holding just a reference.
Since embeddings are used exclusively for vector search, this eliminates duplication.

- ⚙️ **Support for float16 in usearch**: Users will be able to store embeddings as 2-byte floats (float16), commonly used in neural networks.

Suggested change
- ⚙️ **Support for float16 in usearch**: Users will be able to store embeddings as 2-byte floats (float16), commonly used in neural networks.
- ⚙️ **Support for float16 in usearch**: Users will be able to store embeddings as 2-byte floats (`float16`), commonly used in neural networks.
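As a rough illustration of the storage trade-off behind this roadmap item, the sketch below uses Python's standard `struct` module to compare 4-byte and 2-byte float encodings. It says nothing about how usearch or Memgraph lay out vectors internally, and the sample embedding values are made up.

```python
import struct

embedding = [0.12345678, -0.5, 0.000321, 3.25]  # made-up sample values

# "f" is IEEE 754 binary32 (4 bytes per value), "e" is binary16 (2 bytes).
as_float32 = struct.pack(f"<{len(embedding)}f", *embedding)
as_float16 = struct.pack(f"<{len(embedding)}e", *embedding)

print("float32 bytes:", len(as_float32))
print("float16 bytes:", len(as_float16))

# Round-tripping through float16 halves the size but loses some precision.
restored = struct.unpack(f"<{len(embedding)}e", as_float16)
max_error = max(abs(a - b) for a, b in zip(embedding, restored))
print("largest absolute rounding error:", max_error)
```

Whether that rounding error matters depends on the embedding model and the similarity metric, so it is worth measuring retrieval quality on your own data before switching.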

Comment on lines 110 to 118
- Consider the **appropriate embedding dimension** for your use case. While models like OpenAI use 1536 or 3072 dimensions, **lower-dimensional vectors
(e.g., 512 or 768)** often result in only a **5–6% drop in accuracy** and drastically reduce memory consumption.

- If in-memory embedding storage is too demanding, you can **offload embeddings to a third-party vector database** and still integrate it
into your GraphRAG pipeline alongside Memgraph.

- If you're also storing **document context** in Memgraph, keep in mind that these long strings are currently stored in memory as well. A short-term
roadmap item will enable **offloading static text content to disk**, since these strings rarely change. This will further
**increase your memory efficiency and scalability**.

Suggested change
- Consider the **appropriate embedding dimension** for your use case. While models like OpenAI use 1536 or 3072 dimensions, **lower-dimensional vectors
(e.g., 512 or 768)** often result in only a **5–6% drop in accuracy** and drastically reduce memory consumption.
- If in-memory embedding storage is too demanding, you can **offload embeddings to a third-party vector database** and still integrate it
into your GraphRAG pipeline alongside Memgraph.
- If you're also storing **document context** in Memgraph, keep in mind that these long strings are currently stored in memory as well. A short-term
roadmap item will enable **offloading static text content to disk**, since these strings rarely change. This will further
**increase your memory efficiency and scalability**.
- **Choose the right embedding dimension**:
While models like OpenAI use 1536 or 3072 dimensions, **lower-dimensional vectors like 512 or 768** often result in only a **5–6% drop in accuracy** and drastically reduce memory consumption.
- **Offload if needed**: If in-memory embedding storage is too demanding, you can **offload embeddings to a third-party vector database** and still integrate it
into your GraphRAG pipeline alongside Memgraph.
- **Watch for string context size**: If you're also storing **document context** in Memgraph, keep in mind that these long strings are currently stored in memory as well. A short-term
roadmap item will enable **offloading static text content to disk**, since these strings rarely change. This will further
**increase your memory efficiency and scalability**.
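The memory effect of the embedding dimension can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope calculation only: the corpus size of one million vectors is a hypothetical figure, 4 bytes per value assumes float32 storage, and index overhead is ignored.

```python
def embedding_memory_gib(num_vectors, dimension, bytes_per_value=4):
    """Raw size of the embedding values in GiB (no index overhead)."""
    return num_vectors * dimension * bytes_per_value / 1024**3


NUM_VECTORS = 1_000_000  # hypothetical corpus size

for dimension in (3072, 1536, 768, 512):
    size = embedding_memory_gib(NUM_VECTORS, dimension)
    print(f"{dimension:>5} dims: {size:.2f} GiB")
```

The raw footprint scales linearly with the dimension, so moving from 1536 to 768 dimensions halves it. Actual usage will be higher once the vector index structure and any duplicated property storage are counted.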



## Queries that best suit your workload
When integrating Memgraph into open-source GraphRAG frameworks like **LlamaIndex** or **LangGraph**,

Suggested change
When integrating Memgraph into open-source GraphRAG frameworks like **LlamaIndex** or **LangGraph**,
When integrating Memgraph into open-source GraphRAG frameworks like [LlamaIndex](/ai-ecosystem/integrations#llamaindex), [LangGraph](/ai-ecosystem/integrations#langchain) or [MCP](/ai-ecosystem/integrations#model-context-protocol-mcp),

* Update initial page

* Add page for evaluating memgraph -> mgbench

* Add title for evaluating memgraph

* Address PR comments
@matea16 merged commit 0df477d into main Apr 28, 2025
2 checks passed

Labels

priority: low (improvements) An idea how the representation of knowledge on a certain page could be improved