Dev to staging #859

Merged
merged 279 commits into from
Nov 12, 2024
94c493e
processing count updated on cancel
kartikpersistent Aug 26, 2024
489b5ae
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
kartikpersistent Aug 26, 2024
3ef88b6
format fixes
kartikpersistent Aug 27, 2024
4e2f909
Merge branch 'STAGING' into DEV
kartikpersistent Aug 27, 2024
dadaa28
remove whitespace for enviroment variable which due to an error "xxx …
edenbuaa Aug 27, 2024
4c6f676
updated disconnected nodes
abhishekkumar-27 Aug 27, 2024
568db51
updated disconnected nodes
abhishekkumar-27 Aug 27, 2024
501ec6b
fix: Processed count update on failed condition
kartikpersistent Aug 28, 2024
9941474
added disconnected and up nodes
abhishekkumar-27 Aug 28, 2024
8ae3b99
removed __Entity__ labels
vasanthasaikalluri Aug 28, 2024
450ba6f
removed graph_object
vasanthasaikalluri Aug 28, 2024
266c812
removed graph object in the function
vasanthasaikalluri Aug 28, 2024
cac1963
resetting the alert message on success scenario
kartikpersistent Aug 29, 2024
fd7a4bb
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
kartikpersistent Aug 29, 2024
43ec569
Modified queries
vasanthasaikalluri Aug 29, 2024
d010e41
populate graph schema
abhishekkumar-27 Aug 30, 2024
b3a00ac
not clearing the password when there is error scenario
kartikpersistent Aug 30, 2024
77b06db
fixed the vector index loading issue
kartikpersistent Aug 30, 2024
d166290
fix: empty credentials payload for recreate vector index api
kartikpersistent Aug 30, 2024
088eda2
chatbot status (#676)
prakriti-solankey Sep 2, 2024
dbfe2a7
added properties and modified to entity labels
vasanthasaikalluri Sep 2, 2024
3d25d78
Post processing call after all files completion (#716)
kartikpersistent Sep 3, 2024
8baedb0
modified the summary creation
vasanthasaikalluri Sep 3, 2024
87a9321
Merge branch 'communities' of https://github.com/neo4j-labs/llm-graph…
vasanthasaikalluri Sep 3, 2024
c666c36
fixed the summary creation
vasanthasaikalluri Sep 3, 2024
c159322
Configuration change. Update LLM models and remove --preload from doc…
praveshkumar1988 Sep 3, 2024
e08aeab
Retry processing (#698)
kartikpersistent Sep 4, 2024
0cb7ed9
ref added for keydown (#717)
prakriti-solankey Sep 4, 2024
18b0093
Remove total_pages propert. It is not used in DB. (#714)
praveshkumar1988 Sep 4, 2024
eb48190
Update main.py
aashipandya Sep 4, 2024
5ff0014
Add print statement for document status
praveshkumar1988 Sep 5, 2024
8215abd
refactored qa integration
vasanthasaikalluri Sep 5, 2024
6ba42e0
allow credentials true changes
kartikpersistent Sep 5, 2024
771753b
reset the values to 0 when the retry option is start from begining
kartikpersistent Sep 5, 2024
d56a60d
Update Node/Document status using SSE, Trying to fix Cancelled by Sco…
praveshkumar1988 Sep 5, 2024
949d2f9
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
praveshkumar1988 Sep 5, 2024
3f98459
modified the chat mode settings
vasanthasaikalluri Sep 5, 2024
09b0180
renamed QA_integration
vasanthasaikalluri Sep 5, 2024
19b7bf2
added graphdatascience
vasanthasaikalluri Sep 5, 2024
c91febb
moved settings to constants
vasanthasaikalluri Sep 5, 2024
169c272
modified constants
vasanthasaikalluri Sep 5, 2024
ba13144
resetting the nodescount and relationshipcount
kartikpersistent Sep 6, 2024
4cc8104
Add vector index exist condition to create
praveshkumar1988 Sep 6, 2024
122f6a6
Merge branch 'STAGING' into DEV
aashipandya Sep 6, 2024
1222942
Science Molecule & database icon addition (#722)
prakriti-solankey Sep 6, 2024
0c278de
Add communities check and show respective chat modes (#729)
kartikpersistent Sep 6, 2024
7ff05cc
youtube transcript issue (#736)
praveshkumar1988 Sep 9, 2024
13c7bc8
added chnages to graph schema
abhishekkumar-27 Sep 9, 2024
a12b7cd
added local search
vasanthasaikalluri Sep 10, 2024
0469138
725 add checkbox for create communities (#728)
prakriti-solankey Sep 11, 2024
177f6c8
modified local search query
vasanthasaikalluri Sep 11, 2024
36c9dd3
added entity details for chat
vasanthasaikalluri Sep 11, 2024
1aaeab6
modified chunkids
vasanthasaikalluri Sep 11, 2024
ebb4c91
modified chunk_entities
vasanthasaikalluri Sep 11, 2024
c72962c
Add communities Checkbox to graph viz (#739)
prakriti-solankey Sep 11, 2024
1ef2e29
Merge branch 'graph_communities' of https://github.com/neo4j-labs/llm…
prakriti-solankey Sep 12, 2024
2d4c35a
Added time to process file in extract API and it's functions
praveshkumar1988 Sep 12, 2024
ef77b50
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
prakriti-solankey Sep 12, 2024
29205dc
Add Severity level in cloud logging
praveshkumar1988 Sep 12, 2024
fd12e9a
changed name
vasanthasaikalluri Sep 12, 2024
21dd283
modified the query
vasanthasaikalluri Sep 12, 2024
ccfabc7
updated entities param
vasanthasaikalluri Sep 12, 2024
dde29f2
added document to details
vasanthasaikalluri Sep 13, 2024
82be95b
format fixes
kartikpersistent Sep 13, 2024
93a4698
added global embedding
vasanthasaikalluri Sep 13, 2024
7740336
Merge branch 'DEV' into communities
prakriti-solankey Sep 13, 2024
7e0f1c8
added database
vasanthasaikalluri Sep 13, 2024
1c302cb
Merge branch 'communities' of https://github.com/neo4j-labs/llm-graph…
vasanthasaikalluri Sep 13, 2024
d1eda2b
added database
vasanthasaikalluri Sep 13, 2024
3c0a5df
modified is_entity
vasanthasaikalluri Sep 13, 2024
9e34919
modifies chunk entities
vasanthasaikalluri Sep 16, 2024
fa1ca36
Added secweb to fix security issues
praveshkumar1988 Sep 16, 2024
62f454d
removed QA integration
vasanthasaikalluri Sep 16, 2024
4c7821e
created neo4j from existing index
vasanthasaikalluri Sep 16, 2024
4206ec0
modified script
abhishekkumar-27 Sep 16, 2024
801745e
Integrate local search to chat details (#746)
prakriti-solankey Sep 16, 2024
596cdb7
Merge branch 'STAGING' into DEV
prakriti-solankey Sep 16, 2024
107f065
removal of unused code
prakriti-solankey Sep 16, 2024
8f0a706
removed entity label
vasanthasaikalluri Sep 16, 2024
bc09b86
Added Description to chat mode menu (#743)
prakriti-solankey Sep 16, 2024
337cd25
Update log_struct method to add severity
praveshkumar1988 Sep 16, 2024
dd89e0b
community check
prakriti-solankey Sep 16, 2024
6ed968d
Entity Empty Label fix and Icon
kartikpersistent Sep 17, 2024
9237847
Update Utils.ts
prakriti-solankey Sep 17, 2024
5f2290b
Retry processing - node and rels count update condition for start fro…
aashipandya Sep 17, 2024
fec0d1e
uncommented the Retry Processing
kartikpersistent Sep 17, 2024
dff7535
removed __Entity__ labels
vasanthasaikalluri Sep 17, 2024
e3bcf2b
spell fix
kartikpersistent Sep 17, 2024
a86dd12
fixed postprocessing method invoking issue for odd no files
kartikpersistent Sep 18, 2024
ba0957e
lint fix
kartikpersistent Sep 18, 2024
3fecf70
Added filesource and name in chunks
kartikpersistent Sep 18, 2024
d6b2e6a
Merge branch 'DEV' into communities
kartikpersistent Sep 18, 2024
57550b7
Preload=True remove from HSTS
praveshkumar1988 Sep 18, 2024
1695f19
Graph communities (#748)
prakriti-solankey Sep 18, 2024
f0810c7
aria label addition
prakriti-solankey Sep 18, 2024
a1dce53
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
prakriti-solankey Sep 18, 2024
869c87e
code improvements used URL class for host url check
kartikpersistent Sep 19, 2024
b2097e5
host level check
kartikpersistent Sep 19, 2024
3819118
Update Security header
praveshkumar1988 Sep 19, 2024
1246b41
encryption of localstorage values
kartikpersistent Sep 19, 2024
0ec6846
Merge branch 'communities' of https://github.com/neo4j-labs/llm-graph…
prakriti-solankey Sep 19, 2024
1709a16
'mode-selection-changes'
prakriti-solankey Sep 19, 2024
f84e5c3
Merge branch 'communities' of https://github.com/neo4j-labs/llm-graph…
prakriti-solankey Sep 19, 2024
389fa7f
added local chat history
vasanthasaikalluri Sep 19, 2024
4756647
Merge branch 'DEV' into communities
kartikpersistent Sep 19, 2024
9210bf8
added neo4j from existing index to entity vector mode
vasanthasaikalluri Sep 19, 2024
2711b0c
Merge branch 'communities' of https://github.com/neo4j-labs/llm-graph…
vasanthasaikalluri Sep 19, 2024
b5b8545
label changes
prakriti-solankey Sep 19, 2024
6d9ae8a
Merge branch 'communities' of https://github.com/neo4j-labs/llm-graph…
prakriti-solankey Sep 19, 2024
adf4106
commented security header
praveshkumar1988 Sep 19, 2024
8b6b930
Merge branch 'DEV' into communities
kartikpersistent Sep 20, 2024
060fd2d
Communities (#721)
prakriti-solankey Sep 20, 2024
cc62f50
added global env for communities
vasanthasaikalluri Sep 20, 2024
92450eb
Merge branch 'communities' of https://github.com/neo4j-labs/llm-graph…
prakriti-solankey Sep 20, 2024
3732c38
comment all security header
praveshkumar1988 Sep 20, 2024
763dedf
added threading to chat summarization to improve chat response time (…
vasanthasaikalluri Sep 20, 2024
27272f4
formatted the queries and added logic for empty label (#752)
vasanthasaikalluri Sep 20, 2024
00730d3
Commented youtube google api code
praveshkumar1988 Sep 20, 2024
8d210f6
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
praveshkumar1988 Sep 20, 2024
48c9f20
added the error handling for passowrd decrypt error
kartikpersistent Sep 23, 2024
9d89cb2
wordings changes
kartikpersistent Sep 24, 2024
69109f9
Exclude default labels from get_labels_and_relationtypes
praveshkumar1988 Sep 24, 2024
d41a920
Post-Processing-Alerts (#758)
kartikpersistent Sep 24, 2024
5303f5c
added write access check
vasanthasaikalluri Sep 25, 2024
91a4d30
added write access param
vasanthasaikalluri Sep 25, 2024
2ab06d6
added fulltext creation
vasanthasaikalluri Sep 25, 2024
9414f31
disabled the write and delete actions for read only user mode
kartikpersistent Sep 26, 2024
85112a9
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
kartikpersistent Sep 26, 2024
f116300
modified query
vasanthasaikalluri Sep 26, 2024
dbb77d6
test updates
abhishekkumar-27 Sep 23, 2024
6d0e18b
test uupdated
abhishekkumar-27 Sep 26, 2024
db02bfa
Read Only User Support (#766)
kartikpersistent Sep 26, 2024
c7460de
storing the gds status and write access on refresh
kartikpersistent Sep 26, 2024
501ece4
Merge branch 'local_chat_history' of https://github.com/neo4j-labs/ll…
kartikpersistent Sep 26, 2024
ba6a9d2
Langchain libs update (#769)
aashipandya Sep 27, 2024
3c2e344
fixed the rerendering of the table while file status is processing
kartikpersistent Sep 27, 2024
292ae9e
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
kartikpersistent Sep 27, 2024
0d93f57
fix: Read Only User Fix
kartikpersistent Sep 27, 2024
dc994a2
Global search fulltext (#767)
prakriti-solankey Sep 27, 2024
1180406
Added elapsed time for extarction on each breakdown function
praveshkumar1988 Sep 27, 2024
9206fc8
lint and format fixes
prakriti-solankey Sep 27, 2024
34537f7
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
prakriti-solankey Sep 27, 2024
0405816
removed dev logs
kartikpersistent Sep 27, 2024
c9844b7
communities fix
prakriti-solankey Sep 27, 2024
e4c3349
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
prakriti-solankey Sep 27, 2024
085a0e4
disabled the generate graph for read only user
kartikpersistent Sep 27, 2024
7813750
format fixes
kartikpersistent Sep 27, 2024
7cb7957
graph labels change
prakriti-solankey Sep 27, 2024
b7559ac
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
prakriti-solankey Sep 27, 2024
e0e97fb
added the readonly check for already added waiting files
kartikpersistent Sep 27, 2024
42f4c82
Retriever evaluation using RAGAS
kaustubh-darekar Sep 27, 2024
c8e8387
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
kaustubh-darekar Sep 27, 2024
e189ccd
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
kartikpersistent Sep 27, 2024
b663ff4
deleted unused file
kartikpersistent Sep 30, 2024
7c65344
code optimization using memo
kartikpersistent Sep 30, 2024
c69b708
Added elapsed_time on each api and getiing time per_entity
praveshkumar1988 Sep 30, 2024
393c53c
Added the post processing Alert showcasing the ongoing post processin…
kartikpersistent Oct 1, 2024
0dec005
fix: readonly user retry option disable
kartikpersistent Oct 1, 2024
ac3d88a
update script to get details of extarcted doc
abhishekkumar-27 Oct 3, 2024
83351ac
Issue fixed, Latency count per entity
praveshkumar1988 Oct 3, 2024
178dacb
Multiple chat modes selection (#780)
kartikpersistent Oct 4, 2024
1a33f0d
Fix: ChatModes DeSelection on FIle Selection
kartikpersistent Oct 7, 2024
e576055
Fix: Order of the chatmodes accordoing to selected chatmodes
kartikpersistent Oct 7, 2024
d1a56ca
Community optimization (#790)
vasanthasaikalluri Oct 9, 2024
7f255ee
Async way to create entities from multiple chunks (#788)
aashipandya Oct 9, 2024
943f539
fixed graph mode error (#792)
vasanthasaikalluri Oct 10, 2024
126dd48
Raga's Evaluation Metrics (#787)
kartikpersistent Oct 10, 2024
b8296e9
Openai gemini config (#794)
aashipandya Oct 10, 2024
b04f382
Added the user action for metrics table
kartikpersistent Oct 10, 2024
6aa84c7
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
kartikpersistent Oct 10, 2024
f97806c
Graph enhancements (#795)
prakriti-solankey Oct 10, 2024
ba95afd
format changes
prakriti-solankey Oct 10, 2024
0cc118d
Communities Bug fixes (#775)
prakriti-solankey Oct 10, 2024
fcb9ab5
llm name changes
kartikpersistent Oct 11, 2024
f505488
build fix
kartikpersistent Oct 11, 2024
37de220
default mode fix
kartikpersistent Oct 11, 2024
a538226
ragas model names update
kaustubh-darekar Oct 11, 2024
784caa6
lint fixes
kartikpersistent Oct 11, 2024
b814f71
Chunk Entities API condition
kartikpersistent Oct 11, 2024
69793a6
added the tooltip for unsupported lllms for ragas metric loading
kartikpersistent Oct 11, 2024
c5a3dbf
removed unused imports
kartikpersistent Oct 11, 2024
acdc886
multimode fix when we get error response
kartikpersistent Oct 11, 2024
2734c4a
mode changes for score display
prakriti-solankey Oct 11, 2024
cbd3f25
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
prakriti-solankey Oct 11, 2024
a12a6ab
fix: Fixed the details state handling between multiple chats
kartikpersistent Oct 15, 2024
ba091a0
Fix: Entity Mode Width Fix
kartikpersistent Oct 15, 2024
93c3dd3
diffbot fix for async (#797)
aashipandya Oct 15, 2024
821b0f4
Minor changes (#798)
vasanthasaikalluri Oct 15, 2024
702ebf7
New: Added the supported llm models for ragas evaluation
kartikpersistent Oct 15, 2024
3f1633c
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
kartikpersistent Oct 15, 2024
dcb3975
Fix: Communitites Tab is displayed based communitites length
kartikpersistent Oct 15, 2024
d3e5365
added the conversation download button (#800)
kartikpersistent Oct 15, 2024
0ff8bd1
model name correction
prakriti-solankey Oct 15, 2024
441908b
Merge branch 'STAGING' into DEV
prakriti-solankey Oct 15, 2024
09ea071
chatmode switch mode fix
kartikpersistent Oct 17, 2024
99dc052
Add API payload GCP logging (#805)
praveshkumar1988 Oct 18, 2024
a8f821a
Adding Links to get neighboring nodes (#796)
prakriti-solankey Oct 18, 2024
6c6da26
added error message for doc retriver (#807)
vasanthasaikalluri Oct 18, 2024
3d587f0
copy row (#803)
prakriti-solankey Oct 18, 2024
845bfb7
Raga's Evaluation For Multi Modes (#806)
kartikpersistent Oct 18, 2024
952291d
lint fixes
kartikpersistent Oct 18, 2024
f5a5edd
fix: multimode metrics state handling
kartikpersistent Oct 21, 2024
b3f1dd0
fix: Multimode metrics mode change state issue
kartikpersistent Oct 21, 2024
fb5e000
fix: list style fix
kartikpersistent Oct 21, 2024
fd224a1
Correct TYPO mistake
praveshkumar1988 Oct 21, 2024
cb77c18
added new env for ragas embedding model
vasanthasaikalluri Oct 21, 2024
986ae29
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
vasanthasaikalluri Oct 21, 2024
30d6ea8
Merge branch 'STAGING' into DEV
praveshkumar1988 Oct 21, 2024
5c0081e
Props name changes (#811)
kartikpersistent Oct 22, 2024
ee71002
test
prakriti-solankey Oct 22, 2024
c115014
view graph
prakriti-solankey Oct 22, 2024
c200b61
nodes count and relationshipcount updation fix
kartikpersistent Oct 22, 2024
1cc81ce
Merge branch 'STAGING' into DEV
kartikpersistent Oct 22, 2024
340679b
sourceUrl Fix
kartikpersistent Oct 22, 2024
055e40f
Merge branch 'DEV' of https://github.com/neo4j-labs/llm-graph-builder…
kartikpersistent Oct 22, 2024
fb35bda
empty string "" fix to keep the default values we should keep the val…
kartikpersistent Oct 23, 2024
ed18462
prop changes
kartikpersistent Oct 23, 2024
985993e
props changes
kartikpersistent Oct 23, 2024
220bee7
retry condition update for failed files (#820)
aashipandya Oct 23, 2024
0508585
Chat modes name changes (#815)
kartikpersistent Oct 23, 2024
9dba361
Youtube transcript fix with proxy (#822)
aashipandya Oct 23, 2024
6839e52
update script for async func
abhishekkumar-27 Oct 23, 2024
a5d29fa
ragas changes for graph retrieval mode. context added in api output (…
kaustubh-darekar Oct 24, 2024
cb59a2a
Remove extract latency from logging and add LIMIT in duplicate nodes
praveshkumar1988 Oct 24, 2024
93d7f3b
Document updates (#828)
kaustubh-darekar Oct 24, 2024
0d2882c
Update README.md
kartikpersistent Oct 24, 2024
6a6dc05
updated api structire in docs (#827)
vasanthasaikalluri Oct 24, 2024
29ef09b
Update backend_docs.adoc
karanchellani Oct 24, 2024
c5cd025
821 llm model listing (#823)
prakriti-solankey Oct 24, 2024
dfbb042
Merge branch 'STAGING' into DEV
kartikpersistent Oct 24, 2024
4bed352
Exclude session lable node from duplicate nodes list
praveshkumar1988 Oct 25, 2024
3dfb42b
Added the tooltip for disabled llm option (#835)
kartikpersistent Oct 25, 2024
4d795bf
node size changes
prakriti-solankey Oct 25, 2024
1fac375
mode removal of rows check
prakriti-solankey Oct 25, 2024
eb14fbe
formatting
prakriti-solankey Oct 25, 2024
0331cc7
Merge branch 'STAGING' into DEV
prakriti-solankey Oct 25, 2024
5cd9724
Exclude __Entity__ node label from duplicate node list
praveshkumar1988 Oct 25, 2024
70cb004
Update README.md
kartikpersistent Oct 28, 2024
bf51e78
Update README.md
kartikpersistent Oct 29, 2024
76b325c
Update README.md
kartikpersistent Oct 29, 2024
1d607bc
fixed the youtube link
kartikpersistent Oct 30, 2024
d8af5a5
Security header and GZIPMiddleware (#847)
praveshkumar1988 Nov 8, 2024
358d5a6
Chunk Text Details (#850)
kaustubh-darekar Nov 8, 2024
6d35a34
Communities Id to Title (#851)
prakriti-solankey Nov 8, 2024
cd6b4c2
disconnected nodes (#852)
prakriti-solankey Nov 8, 2024
282cfa0
loading changes
prakriti-solankey Nov 8, 2024
399785f
loading changes
prakriti-solankey Nov 8, 2024
686ed95
Update score.py
karanchellani Nov 11, 2024
4f1af18
added middleware
kartikpersistent Nov 12, 2024
1c29940
removed the unused state
kartikpersistent Nov 12, 2024
5 changes: 2 additions & 3 deletions README.md
@@ -45,13 +45,13 @@ DIFFBOT_API_KEY="your-diffbot-key"

if you only want OpenAI:
```env
-VITE_LLM_MODELS="diffbot,openai-gpt-3.5,openai-gpt-4o"
+VITE_LLM_MODELS_PROD="diffbot,openai-gpt-3.5,openai-gpt-4o"
OPENAI_API_KEY="your-openai-key"
```

if you only want Diffbot:
```env
-VITE_LLM_MODELS="diffbot"
+VITE_LLM_MODELS_PROD="diffbot"
DIFFBOT_API_KEY="your-diffbot-key"
```
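
The variables above hold comma-separated model identifiers. As a rough illustration of how such a value can be turned into a list, here is a minimal sketch; `parse_model_list` is a hypothetical helper (it does not appear in this PR), and it assumes that whitespace around commas should be ignored and that an unset or empty value means "no models".

```python
import os
from typing import List, Optional


def parse_model_list(raw: Optional[str]) -> List[str]:
    """Split a comma-separated value into model names, dropping blank entries."""
    if not raw:
        return []
    return [name.strip() for name in raw.split(",") if name.strip()]


# Value copied from the README example; set here only so the sketch is self-contained.
os.environ["VITE_LLM_MODELS_PROD"] = "diffbot,openai-gpt-3.5,openai-gpt-4o"
models = parse_model_list(os.environ.get("VITE_LLM_MODELS_PROD"))
print(models)
```

The frontend itself reads these values at build time through Vite; the Python above only mirrors the string handling.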

@@ -149,7 +149,6 @@ Allow unauthenticated request : Yes
| VITE_BACKEND_API_URL | Optional | http://localhost:8000 | URL for backend API |
| VITE_BLOOM_URL | Optional | https://workspace-preview.neo4j.io/workspace/explore?connectURL={CONNECT_URL}&search=Show+me+a+graph&featureGenAISuggestions=true&featureGenAISuggestionsInternal=true | URL for Bloom visualization |
| VITE_REACT_APP_SOURCES | Mandatory | local,youtube,wiki,s3 | List of input sources that will be available |
-| VITE_LLM_MODELS | Mandatory | diffbot,openai-gpt-3.5,openai-gpt-4o | Models available for selection on the frontend, used for entities extraction and Q&A
| VITE_CHAT_MODES | Mandatory | vector,graph+vector,graph,hybrid | Chat modes available for Q&A
| VITE_ENV | Mandatory | DEV or PROD | Environment variable for the app |
| VITE_TIME_PER_PAGE | Optional | 50 | Time per page for processing |
100 changes: 91 additions & 9 deletions backend/score.py
@@ -12,14 +12,14 @@
from langchain_google_vertexai import ChatVertexAI
from src.api_response import create_api_response
from src.graphDB_dataAccess import graphDBdataAccess
-from src.graph_query import get_graph_results
+from src.graph_query import get_graph_results,get_chunktext_results
from src.chunkid_entities import get_entities_from_chunkids
from src.post_processing import create_vector_fulltext_indexes, create_entity_embedding
from sse_starlette.sse import EventSourceResponse
from src.communities import create_communities
from src.neighbours import get_neighbour_nodes
import json
-from typing import List, Mapping
+from typing import List, Mapping, Union
from starlette.middleware.sessions import SessionMiddleware
import google_auth_oauthlib.flow
from google.oauth2.credentials import Credentials
@@ -33,8 +33,10 @@
from Secweb.ContentSecurityPolicy import ContentSecurityPolicy
from Secweb.XContentTypeOptions import XContentTypeOptions
from Secweb.XFrameOptions import XFrame

+from fastapi.middleware.gzip import GZipMiddleware
 from src.ragas_eval import *
+from starlette.types import ASGIApp, Message, Receive, Scope, Send
+import gzip

logger = CustomLogger()
CHUNK_DIR = os.path.join(os.path.dirname(__file__), "chunks")
@@ -49,14 +51,42 @@ def healthy():

def sick():
return False

+class CustomGZipMiddleware:
+    def __init__(
+        self,
+        app: ASGIApp,
+        paths: List[str],
+        minimum_size: int = 1000,
+        compresslevel: int = 5
+    ):
+        self.app = app
+        self.paths = paths
+        self.minimum_size = minimum_size
+        self.compresslevel = compresslevel
+
+    async def __call__(self, scope: Scope, receive: Receive, send: Send):
+        if scope["type"] != "http":
+            return await self.app(scope, receive, send)
+
+        path = scope["path"]
+        should_compress = any(path.startswith(gzip_path) for gzip_path in self.paths)
+
+        if not should_compress:
+            return await self.app(scope, receive, send)
+
+        gzip_middleware = GZipMiddleware(
+            app=self.app,
+            minimum_size=self.minimum_size,
+            compresslevel=self.compresslevel
+        )
+        await gzip_middleware(scope, receive, send)
app = FastAPI()
# SecWeb(app=app, Option={'referrer': False, 'xframe': False})
# app.add_middleware(HSTS, Option={'max-age': 4})
-# app.add_middleware(ContentSecurityPolicy, Option={'default-src': ["'self'"], 'base-uri': ["'self'"], 'block-all-mixed-content': []}, script_nonce=False, style_nonce=False, report_only=False)
-# app.add_middleware(XContentTypeOptions)
-# app.add_middleware(XFrame, Option={'X-Frame-Options': 'DENY'})

+app.add_middleware(ContentSecurityPolicy, Option={'default-src': ["'self'"], 'base-uri': ["'self'"], 'block-all-mixed-content': []}, script_nonce=False, style_nonce=False, report_only=False)
+app.add_middleware(XContentTypeOptions)
+app.add_middleware(XFrame, Option={'X-Frame-Options': 'DENY'})
+#app.add_middleware(GZipMiddleware, minimum_size=1000, compresslevel=5)
+app.add_middleware(CustomGZipMiddleware, minimum_size=1000, compresslevel=5,paths=["/sources_list","/url/scan","/extract","/chat_bot","/chunk_entities","/get_neighbours","/graph_query","/schema","/populate_graph_schema","/get_unconnected_nodes_list","/get_duplicate_nodes","/fetch_chunktext"])
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
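
Review note: `CustomGZipMiddleware` selects routes by path prefix and hands matching requests to `GZipMiddleware`. The sketch below isolates those two decisions using only the standard library. It is a simplified model, not the middleware itself: `should_compress`, `maybe_gzip`, and the shortened `GZIP_PATHS` list are illustrative names, and the size check assumes that bodies below `minimum_size` are sent uncompressed.

```python
import gzip
from typing import Iterable


def should_compress(path: str, prefixes: Iterable[str]) -> bool:
    """Return True when the request path starts with any configured prefix."""
    return any(path.startswith(prefix) for prefix in prefixes)


def maybe_gzip(body: bytes, minimum_size: int = 1000, compresslevel: int = 5) -> bytes:
    """Compress bodies of at least minimum_size bytes; pass smaller ones through."""
    if len(body) < minimum_size:
        return body
    return gzip.compress(body, compresslevel=compresslevel)


# Shortened list, for illustration only.
GZIP_PATHS = ["/extract", "/chat_bot", "/schema"]

print(should_compress("/schema", GZIP_PATHS))     # exact route
print(should_compress("/schema_v2", GZIP_PATHS))  # also matches, because the test is a prefix test
print(should_compress("/health", GZIP_PATHS))
```

Because the check is `startswith`, any future route whose path begins with a listed entry would be compressed as well; an exact-match set would avoid that if it is not intended.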
@@ -818,5 +848,57 @@ async def calculate_metric(question: str = Form(),
     finally:
         gc.collect()

+@app.post("/fetch_chunktext")
+async def fetch_chunktext(
+    uri: str = Form(),
+    database: str = Form(),
+    userName: str = Form(),
+    password: str = Form(),
+    document_name: str = Form(),
+    page_no: int = Form(1)
+):
+    try:
+        payload_json_obj = {
+            'api_name': 'fetch_chunktext',
+            'db_url': uri,
+            'userName': userName,
+            'database': database,
+            'document_name': document_name,
+            'page_no': page_no,
+            'logging_time': formatted_time(datetime.now(timezone.utc))
+        }
+        logger.log_struct(payload_json_obj, "INFO")
+        start = time.time()
+        result = await asyncio.to_thread(
+            get_chunktext_results,
+            uri=uri,
+            username=userName,
+            password=password,
+            database=database,
+            document_name=document_name,
+            page_no=page_no
+        )
+        end = time.time()
+        elapsed_time = end - start
+        json_obj = {
+            'api_name': 'fetch_chunktext',
+            'db_url': uri,
+            'document_name': document_name,
+            'page_no': page_no,
+            'logging_time': formatted_time(datetime.now(timezone.utc)),
+            'elapsed_api_time': f'{elapsed_time:.2f}'
+        }
+        logger.log_struct(json_obj, "INFO")
+        return create_api_response('Success', data=result, message=f"Total elapsed API time {elapsed_time:.2f}")
+    except Exception as e:
+        job_status = "Failed"
+        message = "Unable to get chunk text response"
+        error_message = str(e)
+        logging.exception(f'Exception in fetch_chunktext: {error_message}')
+        return create_api_response(job_status, message=message, error=error_message)
+    finally:
+        gc.collect()
+
+
 if __name__ == "__main__":
     uvicorn.run(app)
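
Review note: the new endpoint runs the blocking query function through `asyncio.to_thread` and reports the elapsed time in the response message. A minimal, self-contained sketch of that pattern follows; `blocking_query` and `timed_call` are hypothetical stand-ins (the real endpoint calls `get_chunktext_results`), and `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
import time


def blocking_query(document_name: str, page_no: int) -> dict:
    """Hypothetical stand-in for a blocking database call."""
    time.sleep(0.05)
    return {"document_name": document_name, "page_no": page_no}


async def timed_call(document_name: str, page_no: int = 1) -> dict:
    """Run the blocking call in a worker thread and report how long it took."""
    start = time.perf_counter()
    # Keyword arguments are forwarded to the function unchanged.
    result = await asyncio.to_thread(
        blocking_query, document_name=document_name, page_no=page_no
    )
    elapsed = time.perf_counter() - start
    return {"data": result, "message": f"Total elapsed API time {elapsed:.2f}"}


response = asyncio.run(timed_call("example.pdf", 2))
print(response["message"])
```

Offloading to a thread keeps the event loop free to serve other requests while the query is running; it does not make the query itself faster.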
49 changes: 33 additions & 16 deletions backend/src/communities.py
@@ -107,24 +107,38 @@
STORE_COMMUNITY_SUMMARIES = """
UNWIND $data AS row
MERGE (c:__Community__ {id:row.community})
-SET c.summary = row.summary
+SET c.summary = row.summary,
+c.title = row.title
"""


COMMUNITY_SYSTEM_TEMPLATE = "Given input triples, generate the information summary. No pre-amble."

-COMMUNITY_TEMPLATE = """Based on the provided nodes and relationships that belong to the same graph community,
-generate a natural language summary of the provided information:
-{community_info}

-Summary:"""
+COMMUNITY_TEMPLATE = """
+Based on the provided nodes and relationships that belong to the same graph community,
+generate following output in exact format
+title: A concise title, no more than 4 words,
+summary: A natural language summary of the information
+{community_info}
+Example output:
+title: Example Title,
+summary: This is an example summary that describes the key information of this community.
+"""

PARENT_COMMUNITY_SYSTEM_TEMPLATE = "Given an input list of community summaries, generate a summary of the information"

PARENT_COMMUNITY_TEMPLATE = """Based on the provided list of community summaries that belong to the same graph community,
-generate a natural language summary of the information.Include all the necessary information as possible
+generate following output in exact format
+title: A concise title, no more than 4 words,
+summary: A natural language summary of the information. Include all the necessary information as much as possible.

 {community_info}

-Summary:"""
+Example output:
+title: Example Title,
+summary: This is an example summary that describes the key information of this community.
+"""


GET_COMMUNITY_DETAILS = """
@@ -277,8 +291,17 @@ def process_community_info(community, chain, is_parent=False):
             combined_text = " ".join(f"Summary {i+1}: {summary}" for i, summary in enumerate(community.get("texts", [])))
         else:
             combined_text = prepare_string(community)
-        summary = chain.invoke({'community_info': combined_text})
-        return {"community": community['communityId'], "summary": summary}
+        summary_response = chain.invoke({'community_info': combined_text})
+        lines = summary_response.splitlines()
+        title = "Untitled Community"
+        summary = ""
+        for line in lines:
+            if line.lower().startswith("title"):
+                title = line.split(":", 1)[-1].strip()
+            elif line.lower().startswith("summary"):
+                summary = line.split(":", 1)[-1].strip()
+        logging.info(f"Community Title : {title}")
+        return {"community": community['communityId'], "title":title, "summary": summary}
     except Exception as e:
         logging.error(f"Failed to process community {community.get('communityId', 'unknown')}: {e}")
         return None
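
Review note: the parsing above keeps only the text on the `summary:` line itself, and because the prompt's example output ends the title with a comma, a model that copies the format closely would produce a title such as `Example Title,`. The sketch below is one possible, more tolerant variant, not the code in this PR; `parse_title_and_summary` is a hypothetical name, and joining continuation lines into the summary is an assumption about the desired behavior.

```python
def parse_title_and_summary(response: str) -> dict:
    """Extract 'title:' and 'summary:' fields from a model response."""
    title = "Untitled Community"
    summary_lines = []
    in_summary = False
    for line in response.splitlines():
        stripped = line.strip()
        lowered = stripped.lower()
        if lowered.startswith("title:"):
            # Drop the trailing comma that the prompt's example format invites.
            candidate = stripped.split(":", 1)[1].strip().rstrip(",").strip()
            if candidate:
                title = candidate
            in_summary = False
        elif lowered.startswith("summary:"):
            summary_lines = [stripped.split(":", 1)[1].strip()]
            in_summary = True
        elif in_summary and stripped:
            # Treat following non-empty lines as a continuation of the summary.
            summary_lines.append(stripped)
    return {"title": title, "summary": " ".join(summary_lines).strip()}


sample = "title: Example Title,\nsummary: First sentence.\nSecond sentence."
print(parse_title_and_summary(sample))
```

A response with no recognizable fields falls back to the default title and an empty summary, matching the defaults used in the diff.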
@@ -291,7 +314,7 @@ def create_community_summaries(gds, model):
summaries = []
with ThreadPoolExecutor() as executor:
futures = [executor.submit(process_community_info, community, community_chain) for community in community_info_list.to_dict(orient="records")]

for future in as_completed(futures):
result = future.result()
if result:
@@ -482,9 +505,3 @@ def create_communities(uri, username, password, database,model=COMMUNITY_CREATIO
logging.warning("Failed to write communities. Constraint was not applied.")
except Exception as e:
logging.error(f"Failed to create communities: {e}")






2 changes: 1 addition & 1 deletion backend/src/graphDB_dataAccess.py
@@ -354,7 +354,7 @@ def get_duplicate_nodes_list(self):
score_value = float(os.environ.get('DUPLICATE_SCORE_VALUE'))
text_distance = int(os.environ.get('DUPLICATE_TEXT_DISTANCE'))
query_duplicate_nodes = """
-MATCH (n:!Chunk&!Session&!Document&!`__Community__`) with n
+MATCH (n:!Chunk&!Session&!Document&!`__Community__`&!`__Entity__`) with n
WHERE n.embedding is not null and n.id is not null // and size(toString(n.id)) > 3
WITH n ORDER BY count {{ (n)--() }} DESC, size(toString(n.id)) DESC // updated
WITH collect(n) as nodes
33 changes: 32 additions & 1 deletion backend/src/graph_query.py
@@ -3,7 +3,7 @@
from neo4j import GraphDatabase
import os
import json
-from src.shared.constants import GRAPH_CHUNK_LIMIT,GRAPH_QUERY
+from src.shared.constants import GRAPH_CHUNK_LIMIT,GRAPH_QUERY,CHUNK_TEXT_QUERY,COUNT_CHUNKS_QUERY
# from neo4j.debug import watch

# watch("neo4j")
@@ -226,3 +226,34 @@ def get_graph_results(uri, username, password,database,document_names):
driver.close()


+def get_chunktext_results(uri, username, password, database, document_name, page_no):
+    """Retrieves chunk text, position, and page number from graph data with pagination."""
+    try:
+        logging.info("Starting chunk text query process")
+        offset = 10
+        skip = (page_no - 1) * offset
+        limit = offset
+        driver = GraphDatabase.driver(uri, auth=(username, password))
+        with driver.session(database=database) as session:
+            total_chunks_result = session.run(COUNT_CHUNKS_QUERY, file_name=document_name)
+            total_chunks = total_chunks_result.single()["total_chunks"]
+            total_pages = (total_chunks + offset - 1) // offset  # Calculate total pages
+            records = session.run(CHUNK_TEXT_QUERY, file_name=document_name, skip=skip, limit=limit)
+            pageitems = [
+                {
+                    "text": record["chunk_text"],
+                    "position": record["chunk_position"],
+                    "pagenumber": record["page_number"]
+                }
+                for record in records
+            ]
+            logging.info(f"Query process completed with {len(pageitems)} chunks retrieved")
+            return {
+                "pageitems": pageitems,
+                "total_pages": total_pages
+            }
+    except Exception as e:
+        logging.error(f"An error occurred in get_chunktext_results. Error: {str(e)}")
+        raise Exception("An error occurred in get_chunktext_results. Please check the logs for more details.") from e
+    finally:
+        driver.close()
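
Review note: in `get_chunktext_results`, `driver` is first assigned inside the `try` block. If driver construction itself raises (for example on a malformed URI), the `finally` clause refers to a name that was never bound, and the resulting `UnboundLocalError` replaces the intended error. The sketch below shows one common guard. `FakeDriver`, `connect`, and `run_query` are hypothetical stand-ins so the example runs without a database; this is a suggestion, not code from the PR.

```python
class FakeDriver:
    """Hypothetical stand-in for a database driver; records whether close() ran."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def connect(should_fail: bool) -> FakeDriver:
    """Simulate driver construction that may fail before a driver object exists."""
    if should_fail:
        raise ConnectionError("could not connect")
    return FakeDriver()


def run_query(should_fail: bool) -> str:
    driver = None  # bound before the try block so finally can always inspect it
    try:
        driver = connect(should_fail)
        return "ok"
    except Exception as e:
        raise RuntimeError("query failed; see logs") from e
    finally:
        if driver is not None:
            driver.close()


print(run_query(False))
```

With the guard in place, a connection failure surfaces as the wrapped `RuntimeError` with the original exception attached as its cause.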
3 changes: 2 additions & 1 deletion backend/src/neighbours.py
@@ -20,7 +20,8 @@
labels: [coalesce(apoc.coll.removeAll(labels(node), ['__Entity__'])[0], "*")],
element_id: elementId(node),
properties: {
-id: CASE WHEN node.id IS NOT NULL THEN node.id ELSE node.fileName END
+id: CASE WHEN node.id IS NOT NULL THEN node.id ELSE node.fileName END,
+title: CASE WHEN node.title IS NOT NULL THEN node.title ELSE " " END
}
}
] AS nodes,
15 changes: 14 additions & 1 deletion backend/src/shared/constants.py
@@ -161,6 +161,19 @@
] AS entities
"""

+COUNT_CHUNKS_QUERY = """
+MATCH (d:Document {fileName: $file_name})<-[:PART_OF]-(c:Chunk)
+RETURN count(c) AS total_chunks
+"""
+
+CHUNK_TEXT_QUERY = """
+MATCH (d:Document {fileName: $file_name})<-[:PART_OF]-(c:Chunk)
+RETURN c.text AS chunk_text, c.position AS chunk_position, c.page_number AS page_number
+ORDER BY c.position
+SKIP $skip
+LIMIT $limit
+"""
+
## CHAT SETUP
CHAT_MAX_TOKENS = 1000
CHAT_SEARCH_KWARG_SCORE_THRESHOLD = 0.5
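
Review note: the `$skip` and `$limit` parameters of `CHUNK_TEXT_QUERY` are derived from a 1-based page number and a fixed page size of 10, and the total page count is a ceiling division of the chunk count. The arithmetic can be checked in isolation with the sketch below. `page_window` and `total_pages` are hypothetical helper names, and the validation of `page_no` is an added assumption, since a value below 1 would otherwise produce a negative skip.

```python
from typing import Tuple


def page_window(page_no: int, page_size: int = 10) -> Tuple[int, int]:
    """Return (skip, limit) for a 1-based page number."""
    if page_no < 1:
        raise ValueError("page_no must be 1 or greater")
    return (page_no - 1) * page_size, page_size


def total_pages(total_chunks: int, page_size: int = 10) -> int:
    """Ceiling division without floating point."""
    return (total_chunks + page_size - 1) // page_size


print(page_window(3))   # third page skips the first 20 chunks
print(total_pages(11))  # 11 chunks need two pages of 10
```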
@@ -717,4 +730,4 @@
value "2023-03-15"."
"## 5. Strict Compliance\n"
"Adhere to the rules strictly. Non-compliance will result in termination."
-"""
+"""
21 changes: 21 additions & 0 deletions backend/test_integrationqa.py
@@ -124,6 +124,27 @@ def test_graph_website(model_name):
print("Fail: ", e)
return weburl_result

+def test_graph_website(model_name):
+    """Test graph creation from a Website page."""
+    #graph, model, source_url, source_type
+    source_url = 'https://www.amazon.com/'
+    source_type = 'web-url'
+    create_source_node_graph_web_url(graph, model_name, source_url, source_type)
+
+    weburl_result = extract_graph_from_web_page(URI, USERNAME, PASSWORD, DATABASE, model_name, source_url, '', '')
+    logging.info("WebUrl test done")
+    print(weburl_result)
+
+    try:
+        assert weburl_result['status'] == 'Completed'
+        assert weburl_result['nodeCount'] > 0
+        assert weburl_result['relationshipCount'] > 0
+        print("Success")
+    except AssertionError as e:
+        print("Fail: ", e)
+    return weburl_result
+
+
def test_graph_from_youtube_video(model_name):
"""Test graph creation from a YouTube video."""
source_url = 'https://www.youtube.com/watch?v=T-qy-zPWgqA'
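
Review note: the added test catches `AssertionError` and prints the failure, so a failed check does not propagate to the caller, and the first failing assertion hides the ones after it. One alternative is to collect every problem and let the caller decide how to react. The sketch below is only a suggestion; `check_extraction_result` is a hypothetical helper, and the dictionary keys are the ones used in the test above.

```python
from typing import List


def check_extraction_result(result: dict) -> List[str]:
    """Return a list of problems; an empty list means the result looks fine."""
    problems = []
    if result.get("status") != "Completed":
        problems.append(f"status is {result.get('status')!r}, expected 'Completed'")
    if result.get("nodeCount", 0) <= 0:
        problems.append("nodeCount is not positive")
    if result.get("relationshipCount", 0) <= 0:
        problems.append("relationshipCount is not positive")
    return problems


good = {"status": "Completed", "nodeCount": 3, "relationshipCount": 2}
bad = {"status": "Failed", "nodeCount": 0, "relationshipCount": 0}
print(check_extraction_result(good))
print(check_extraction_result(bad))
```

A caller can then write `assert not problems, problems` to fail loudly while still reporting every failed check at once.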
2 changes: 1 addition & 1 deletion docker-compose.yml
@@ -53,7 +53,6 @@ services:
args:
- VITE_BACKEND_API_URL=${VITE_BACKEND_API_URL-http://localhost:8000}
- VITE_REACT_APP_SOURCES=${VITE_REACT_APP_SOURCES-local,wiki,s3}
-- VITE_LLM_MODELS=${VITE_LLM_MODELS-}
- VITE_GOOGLE_CLIENT_ID=${VITE_GOOGLE_CLIENT_ID-}
- VITE_BLOOM_URL=${VITE_BLOOM_URL-https://workspace-preview.neo4j.io/workspace/explore?connectURL={CONNECT_URL}&search=Show+me+a+graph&featureGenAISuggestions=true&featureGenAISuggestionsInternal=true}
- VITE_TIME_PER_PAGE=${VITE_TIME_PER_PAGE-50}
@@ -62,6 +61,7 @@ services:
- VITE_ENV=${VITE_ENV-DEV}
- VITE_CHAT_MODES=${VITE_CHAT_MODES-}
- VITE_BATCH_SIZE=${VITE_BATCH_SIZE-2}
+- VITE_LLM_MODELS=${VITE_LLM_MODELS-}
- VITE_LLM_MODELS_PROD=${VITE_LLM_MODELS_PROD-openai_gpt_4o,openai_gpt_4o_mini,diffbot,gemini_1.5_flash}
volumes:
- ./frontend:/app
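
Review note: the build args above use the `${NAME-default}` form of Compose interpolation, which applies the default only when the variable is unset. A variable that is set to an empty string stays empty, unlike the `${NAME:-default}` form. The sketch below emulates both forms in Python to make the difference concrete; `interpolate` is a hypothetical helper for illustration, not part of Compose.

```python
from typing import Mapping


def interpolate(env: Mapping[str, str], name: str, default: str, colon: bool = False) -> str:
    """Emulate ${NAME-default} (colon=False) and ${NAME:-default} (colon=True)."""
    if name not in env:
        return default
    value = env[name]
    if colon and value == "":
        return default
    return value


print(interpolate({}, "VITE_ENV", "DEV"))                            # unset: default applies
print(repr(interpolate({"VITE_ENV": ""}, "VITE_ENV", "DEV")))        # empty stays empty
print(interpolate({"VITE_ENV": ""}, "VITE_ENV", "DEV", colon=True))  # ':-' treats empty as unset
```

This matters for values such as `VITE_LLM_MODELS_PROD`: exporting the variable as an empty string in the shell or `.env` file suppresses the listed default.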
1 change: 0 additions & 1 deletion example.env
@@ -24,7 +24,6 @@ ENTITY_EMBEDDING=True
VITE_BACKEND_API_URL="http://localhost:8000"
VITE_BLOOM_URL="https://workspace-preview.neo4j.io/workspace/explore?connectURL={CONNECT_URL}&search=Show+me+a+graph&featureGenAISuggestions=true&featureGenAISuggestionsInternal=true"
VITE_REACT_APP_SOURCES="local,youtube,wiki,s3,web"
-VITE_LLM_MODELS="diffbot,openai-gpt-3.5,openai-gpt-4o" # ",ollama_llama3"
VITE_ENV="DEV"
VITE_TIME_PER_PAGE=50
VITE_CHUNK_SIZE=5242880
1 change: 0 additions & 1 deletion frontend/Dockerfile
@@ -20,7 +20,6 @@ RUN yarn install
COPY . ./
RUN VITE_BACKEND_API_URL=$VITE_BACKEND_API_URL \
VITE_REACT_APP_SOURCES=$VITE_REACT_APP_SOURCES \
-VITE_LLM_MODELS=$VITE_LLM_MODELS \
VITE_GOOGLE_CLIENT_ID=$VITE_GOOGLE_CLIENT_ID \
VITE_BLOOM_URL=$VITE_BLOOM_URL \
VITE_CHUNK_SIZE=$VITE_CHUNK_SIZE \