Memgraph in production guides (general and GraphRAG) #1239
| * [functions](/querying/functions): return a null value | ||
|
|
||
| Please note that deleting same part of the graph from parallel transaction will lead to undefined behavior. | ||
| **Please note that deleting same part of the graph from parallel transaction will lead to undefined behavior.** |
| **Please note that deleting same part of the graph from parallel transaction will lead to undefined behavior.** | |
| <Callout type="warning"> | |
| Please note that deleting the same part of the graph from a parallel transaction will lead to undefined behavior. | |
| </Callout> |
| | Cores | 1 vCPU | ≥ 8 vCPUs (≥ 4 physical cores) | | ||
| | Network | 100 Mbps | ≥ 1 Gbps | | ||
|
|
||
| The disk is used for storing database [durability files](/configuration/data-durability-and-backup) - snapshots and write-ahead |
| The disk is used for storing database [durability files](/configuration/data-durability-and-backup) - snapshots and write-ahead | |
| The disk is used for storing database [durability files](/configuration/data-durability-and-backup), including snapshots and write-ahead |
| | Network | 100 Mbps | ≥ 1 Gbps | | ||
|
|
||
| The disk is used for storing database [durability files](/configuration/data-durability-and-backup) - snapshots and write-ahead | ||
| logs. By default, Memgraph stores 3 latest snapshots in the database (flag to adjust this is `--storage-snapshot-retention-count`). |
| logs. By default, Memgraph stores 3 latest snapshots in the database (flag to adjust this is `--storage-snapshot-retention-count`). | |
| logs. By default, Memgraph stores the **three most recent snapshots** in the database (this can be configured using the `--storage-snapshot-retention-count` flag). | |
| The amount of CPU cores varies per use case. For horizontal scalability, you can further increase the amount of cores on your | ||
| system for additional scalability. |
| The amount of CPU cores varies per use case. For horizontal scalability, you can further increase the amount of cores on your | |
| system for additional scalability. | |
| The number of CPU cores required depends on your specific use case. For horizontal scalability, you can increase the number of available cores on your system. |
pages/memgraph-in-production.mdx
Outdated
| When deploying Memgraph in production, it is essential to consider a set of prerequisites to ensure optimal performance, | ||
| scalability, and resilience. This includes hardware considerations, the correct sizing of instances, configuring drivers, using | ||
| appropriate flags when starting Memgraph, importing data, and connecting to external sources. The guidelines in this section will | ||
| help you make informed decisions for your specific use case, ensuring that Memgraph performs effectively in your environment. |
| When deploying Memgraph in production, it is essential to consider a set of prerequisites to ensure optimal performance, | |
| scalability, and resilience. This includes hardware considerations, the correct sizing of instances, configuring drivers, using | |
| appropriate flags when starting Memgraph, importing data, and connecting to external sources. The guidelines in this section will | |
| help you make informed decisions for your specific use case, ensuring that Memgraph performs effectively in your environment. | |
| When deploying Memgraph in production, it is essential to consider a set of prerequisites to ensure optimal **performance**, **scalability** and **resilience**. That includes decisions about hardware configurations and integration strategies. | |
| This guide is your starting point to production-readiness with Memgraph. | |
| ## ✅ What you'll need to consider | |
| Before you dive into specific setups, it's important to think about: | |
| - **Hardware requirements** and **instance sizing** | |
| - **Driver configuration** | |
| - **Flags** when starting Memgraph | |
| - **Data import** best practices | |
| - Connecting to **external sources** | |
| These factors ensure Memgraph performs effectively in your environment, no matter the use case. |
|
|
||
| The reason is that environment variables always override any system settings that are set via queries. | ||
|
|
||
| Additionally, please check out [how to set up the Memgraph Lab Enterprise license](/data-visualization/user-manual/remote-storage#how-to-set-it-up) |
| Additionally, please check out [how to set up the Memgraph Lab Enterprise license](/data-visualization/user-manual/remote-storage#how-to-set-it-up) | |
| Additionally, please check out [how to set up the Memgraph Lab Enterprise license](/memgraph-lab/configuration#adding-memgraph-enterprise-license) | |
| Memgraph fully supports the **Cypher query language**, making it easy to express complex graph patterns. In addition to Cypher, | ||
| Memgraph has **built-in path traversal capabilities** at the core of the database, enabling **lightning-fast traversals** optimized for | ||
| performance-critical use cases. You can learn more about these in our | ||
| [Deep Path Traversal guide](/advanced-algorithms/deep-path-traversal). |
| [Deep Path Traversal guide](/advanced-algorithms/deep-path-traversal). | |
| [Deep path traversal guide](/advanced-algorithms/deep-path-traversal). | |
| **[MAGE library](/advanced-algorithms/available-algorithms)**, which includes a wide range of pre-built | ||
| **graph algorithms** and **procedures** for tasks like community detection, centrality scoring, node similarity, and more. | ||
|
|
||
| We encourage users to explore the other guides in the **"Memgraph in Production"** series, where you'll find detailed examples and |
| We encourage users to explore the other guides in the **"Memgraph in Production"** series, where you'll find detailed examples and | |
| We encourage users to explore the other guides in the *Memgraph in Production* series, where you'll find detailed examples and | |
| description: General suggestions when working with Memgraph, from testing to production. | ||
| --- | ||
|
|
||
| import { Callout } from 'nextra/components' |
| import { Callout } from 'nextra/components' | |
| import { Callout } from 'nextra/components' | |
| import {CommunityLinks} from '/components/social-card/CommunityLinks' | |
| import { Steps } from 'nextra/components' |
| **graph algorithms** and **procedures** for tasks like community detection, centrality scoring, node similarity, and more. | ||
|
|
||
| We encourage users to explore the other guides in the **"Memgraph in Production"** series, where you'll find detailed examples and | ||
| recommendations on which types of queries are most effective based on your specific **workload and use case**. |
| recommendations on which types of queries are most effective based on your specific **workload and use case**. | |
| recommendations on which types of queries are most effective based on your specific **workload and use case**. | |
| <CommunityLinks/> |
| page. It provides **foundational, use-case-agnostic advice** for deploying Memgraph in production. | ||
|
|
||
| This guide builds on that foundation, offering **additional recommendations tailored to specific workloads**. | ||
| In cases where guidance overlaps, the information in this chapter should be seen as **complementary or overriding**, depending |
| In cases where guidance overlaps, the information in this chapter should be seen as **complementary or overriding**, depending | |
| In cases where guidance overlaps, consider the information here as **complementary or overriding**, depending | |
| on the unique needs of your use case. | ||
| </Callout> | ||
|
|
||
| ## When to use this guide |
| ## When to use this guide | |
| ## Is this guide for you? | |
| ## When to use this guide | ||
|
|
||
| This guide is for you if you're exploring or building **GraphRAG (Graph-Augmented Retrieval-Augmented Generation)** systems. | ||
| Consider diving into this guide when: |
| Consider diving into this guide when: | |
| You'll benefit from this content if: | |
| - 🔄 You need to **seamlessly extract knowledge graphs from multiple source systems**, especially when graph representation | ||
| naturally suits the data structure. |
| - 🔄 You need to **seamlessly extract knowledge graphs from multiple source systems**, especially when graph representation | |
| naturally suits the data structure. | |
| - 🔄 You need to **seamlessly extract knowledge graphs from multiple source systems**, especially when a graph structure naturally fits the data. | |
| If any of these resonate with your project, this guide will walk you through the best practices and configurations to bring | ||
| GraphRAG to life using Memgraph. | ||
|
|
||
| ## Why should you choose Memgraph for GraphRAG use cases |
| ## Why should you choose Memgraph for GraphRAG use cases | |
| ## Why choose Memgraph for GraphRAG use cases? | |
| Understand which **enterprise features**, such as security, access controls, and dynamic graph algorithms, are | ||
| essential for production-ready GraphRAG deployments. | ||
|
|
||
| - **[Queries that best suit your workload](#queries-that-best-suit-your-workload)** |
| - **[Queries that best suit your workload](#queries-that-best-suit-your-workload)** | |
| - [Queries that best suit your workload](#queries-that-best-suit-your-workload) | |
| Learn how to use **deep path traversals**, **vector search**, and **dynamic MAGE algorithms** to efficiently retrieve contextual data and | ||
| handle **high-velocity graphs**. | ||
|
|
||
| - **[Memgraph ecosystem](#memgraph-ecosystem)** |
| - **[Memgraph ecosystem](#memgraph-ecosystem)** | |
| - [Memgraph ecosystem](#memgraph-ecosystem) | |
| - 🧭 **Reference-based indexing**: Embeddings will be stored only in the vector index, with the property storage holding just a reference. | ||
| Since embeddings are used exclusively for vector search, this eliminates duplication. | ||
|
|
||
| - ⚙️ **Support for float16 in usearch**: Users will be able to store embeddings as 2-byte floats (float16), commonly used in neural networks. |
| - ⚙️ **Support for float16 in usearch**: Users will be able to store embeddings as 2-byte floats (float16), commonly used in neural networks. | |
| - ⚙️ **Support for float16 in usearch**: Users will be able to store embeddings as 2-byte floats (`float16`), commonly used in neural networks. | |
| - Consider the **appropriate embedding dimension** for your use case. While models like OpenAI use 1536 or 3072 dimensions, **lower-dimensional vectors | ||
| (e.g., 512 or 768)** often result in only a **5–6% drop in accuracy** and drastically reduce memory consumption. | ||
|
|
||
| - If in-memory embedding storage is too demanding, you can **offload embeddings to a third-party vector database** and still integrate it | ||
| into your GraphRAG pipeline alongside Memgraph. | ||
|
|
||
| - If you're also storing **document context** in Memgraph, keep in mind that these long strings are currently stored in memory as well. A short-term | ||
| roadmap item will enable **offloading static text content to disk**, since these strings rarely change. This will further | ||
| **increase your memory efficiency and scalability**. |
| - Consider the **appropriate embedding dimension** for your use case. While models like OpenAI use 1536 or 3072 dimensions, **lower-dimensional vectors | |
| (e.g., 512 or 768)** often result in only a **5–6% drop in accuracy** and drastically reduce memory consumption. | |
| - If in-memory embedding storage is too demanding, you can **offload embeddings to a third-party vector database** and still integrate it | |
| into your GraphRAG pipeline alongside Memgraph. | |
| - If you're also storing **document context** in Memgraph, keep in mind that these long strings are currently stored in memory as well. A short-term | |
| roadmap item will enable **offloading static text content to disk**, since these strings rarely change. This will further | |
| **increase your memory efficiency and scalability**. | |
| - **Choose the right embedding dimension**: | |
| While OpenAI's embedding models use 1536 or 3072 dimensions, **lower-dimensional vectors like 512 or 768** often result in only a **5–6% drop in accuracy** and drastically reduce memory consumption. | |
| - **Offload if needed**: If in-memory embedding storage is too demanding, you can **offload embeddings to a third-party vector database** and still integrate it | |
| into your GraphRAG pipeline alongside Memgraph. | |
| - **Watch for string context size**: If you're also storing **document context** in Memgraph, keep in mind that these long strings are currently stored in memory as well. A short-term | |
| roadmap item will enable **offloading static text content to disk**, since these strings rarely change. This will further | |
| **increase your memory efficiency and scalability**. | |
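To make the memory trade-off in the suggestion above concrete, here is a rough back-of-the-envelope sketch. It only counts the raw embedding values; it assumes 4 bytes per value for `float32` and 2 bytes for `float16`, and it ignores vector index structures, property storage, and any other Memgraph overhead. The vector count of one million and the helper name `embedding_memory_bytes` are hypothetical, chosen purely for illustration.

```python
# Back-of-the-envelope estimate of raw embedding payload size.
# Assumption: only the vector values are counted; index structures and any
# database overhead are NOT included, so real usage will be higher.

FLOAT32_BYTES = 4  # size of one float32 value
FLOAT16_BYTES = 2  # size of one float16 value (2-byte float)


def embedding_memory_bytes(num_vectors: int, dimensions: int, bytes_per_value: int) -> int:
    """Return the raw size in bytes of num_vectors embeddings."""
    return num_vectors * dimensions * bytes_per_value


def to_gib(num_bytes: int) -> float:
    """Convert bytes to gibibytes."""
    return num_bytes / 1024**3


NUM_VECTORS = 1_000_000  # hypothetical dataset size

for dims in (3072, 1536, 768, 512):
    f32 = embedding_memory_bytes(NUM_VECTORS, dims, FLOAT32_BYTES)
    f16 = embedding_memory_bytes(NUM_VECTORS, dims, FLOAT16_BYTES)
    print(f"{dims:>5} dims: float32 = {to_gib(f32):6.2f} GiB, float16 = {to_gib(f16):6.2f} GiB")
```

Because the raw size is linear in both the dimension and the bytes per value, halving either one halves the payload. Whether the accuracy loss is acceptable still has to be measured on your own data.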
|
|
||
|
|
||
| ## Queries that best suit your workload | ||
| When integrating Memgraph into open-source GraphRAG frameworks like **LlamaIndex** or **LangGraph**, |
| When integrating Memgraph into open-source GraphRAG frameworks like **LlamaIndex** or **LangGraph**, | |
| When integrating Memgraph into open-source GraphRAG frameworks like [LlamaIndex](/ai-ecosystem/integrations#llamaindex), [LangGraph](/ai-ecosystem/integrations#langchain) or [MCP](/ai-ecosystem/integrations#model-context-protocol-mcp), | |
* Update initial page
* Add page for evaluating memgraph -> mgbench
* Add title for evaluating memgraph
* Address PR comments
Release note
General production guide, and GraphRAG production guide
Related product PRs
PRs from product repo this doc page is related to:
(paste the links to the PRs)
Checklist:
`bugfix` or `feature` label, based on the product PR type you're documenting