feat: add document structure into GraphRAG #2033
KingSkyLi
commented
Sep 20, 2024
It looks like a DB upsert error.
@KingSkyLi Hi, I was assigned to help review this PR. Could you please explain what kind of document structure needs to be added to the graph? Thanks a lot!
@KingSkyLi @Appointat
Very good refactoring work, although there are still some points for further improvement.
LGTM.
This reverts commit 9693984.
LGTM
LGTM
The comments below are to be fixed in another PR.
dbgpt/datasource/conn_tugraph.py
def delete_graph(self, graph_name: str) -> None:
    """Delete a graph."""
    """Delete a graph in the Neo4j database if it exists."""
Wrong comment: this file is the TuGraph connector, but the docstring mentions Neo4j.
fixed
DOCUMENT = "document"
CHUNK = "chunk"
ENTITY = "entity"  # view as general vertex in the general case
This should be described as the default node type in the knowledge graph.
fixed
DOCUMENT = "document"
CHUNK = "chunk"
ENTITY = "entity"  # view as general vertex in the general case
RELATION = "relation"  # view as general edge in the general case
This should be described as the default edge type in the knowledge graph.
fixed
def is_edge(self) -> bool:
    """Check if the element is an edge."""
    return not self.is_vertex()
"Not a vertex" does not mean "is an edge"; enumerate all valid edge types here.
fixed
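For readers following this thread, here is a minimal sketch of what enumerating the edge types explicitly could look like. It is not the code merged in this PR. The class name ElemType and the INCLUDE and NEXT members are hypothetical and added only for illustration; only DOCUMENT, CHUNK, ENTITY and RELATION appear in the diff above.

```python
from enum import Enum


class ElemType(Enum):
    """Illustrative element types (hypothetical class name)."""

    DOCUMENT = "document"
    CHUNK = "chunk"
    ENTITY = "entity"
    RELATION = "relation"
    # Hypothetical edge types, not taken from the diff.
    INCLUDE = "include"
    NEXT = "next"

    def is_vertex(self) -> bool:
        """Check membership in an explicit list of vertex types."""
        return self in (ElemType.DOCUMENT, ElemType.CHUNK, ElemType.ENTITY)

    def is_edge(self) -> bool:
        """Check membership in an explicit list of edge types."""
        return self in (ElemType.RELATION, ElemType.INCLUDE, ElemType.NEXT)
```

With explicit lists, a member added later that belongs to neither group is reported as neither a vertex nor an edge, whereas `not self.is_vertex()` would silently classify it as an edge.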
async def discover_communities(self, **kwargs) -> List[str]:
    """Run community discovery with leiden."""
    pass
This should return [] rather than pass, so the result matches the List[str] annotation.
fixed
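A small sketch of why the empty list matters, assuming a caller that iterates over the result. CommunityStoreStub is a hypothetical stand-in class, not part of the project: a body of `pass` returns None, which would fail on iteration, while `[]` is safe.

```python
import asyncio
from typing import List


class CommunityStoreStub:
    """Hypothetical store that has no community support."""

    async def discover_communities(self, **kwargs) -> List[str]:
        """Run community discovery; this stub finds no communities."""
        return []


communities = asyncio.run(CommunityStoreStub().discover_communities())
for community_id in communities:  # iterating an empty list does nothing
    print(community_id)
```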
)

for graph in graphs:
    graph_of_all.upsert_graph(graph)
Update the graph edge's _chunk_id.
done
self._graph_store_apdater.upsert_graph(graph_of_all)

# use asyncio.gather
remove unused comments
done
for graph in graphs:
    self._graph_store.insert_graph(graph)
# Support graph search by the document and the chunks
if self._graph_store.get_config().enable_document_graph:
Move this part of the code into _parse_chunks, and rename it to load_document_graph(chunks) -> List[Chunk].
done
subgraph_for_doc = self._graph_store_apdater.explore(
    subs=keywords_for_document_graph,
    limit=5,
Use the config value KNOWLEDGE_GRAPH_CHUNK_SEARCH_TOP_SIZE instead of the hard-coded limit.
done
if not subs:
    return MemoryGraph()

if depth is None or depth < 0 or depth > self.MAX_HIERARCHY_LEVEL:
Set depth = 3 by default.
done
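One possible reading of this suggestion, sketched as a standalone helper: a missing or out-of-range depth falls back to 3. The function name normalize_depth, the constant DEFAULT_DEPTH, and the value chosen for MAX_HIERARCHY_LEVEL are assumptions made for illustration; the thread does not show what the merged code does in this branch.

```python
from typing import Optional

MAX_HIERARCHY_LEVEL = 3  # assumed value; the real constant is not shown in the thread
DEFAULT_DEPTH = 3  # the default suggested by the reviewer


def normalize_depth(depth: Optional[int]) -> int:
    """Return a usable depth, falling back to the default when it is invalid."""
    if depth is None or depth < 0 or depth > MAX_HIERARCHY_LEVEL:
        return DEFAULT_DEPTH
    return depth
```

Keeping the fallback in one small function means the same condition quoted in the diff decides every case, and callers never see None or a negative depth.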