
Commit 6a02b34

praveshkumar1988, aashipandya, abhishekkumar-27, kartikpersistent, and vasanthasaikalluri authored
Dev (#433) (#448)
* Integration_qa test (#375)
* Test IntegrationQA added
* update test cases
* update test
* update node count assertions
* test changes
* update changes
* modification test
* Code refatctor test cases
* Handle allowedlist issue in test
* test changes
* update test
* test case execution
* test chatbot updates
* test case update file
* added file
---------
* recent merges
* pdf deletion due to out of diskspace
* fixed status blank issue
* Rendering the file name instead of link for gcs and s3 sources in the info modal
* Convert is_cancelled value from string to bool
* added the default page size
* Issue fixed Processed chunked as 0 when file re-process again
* Youtube timestamps (#386)
* Wikipedia source to accept all valid urls
* wikipedia url to support multiple languages
* integrated wiki langauge param for extract api
* Youtube video timestamps
---------
* groq llm integration backend (#286)
* groq llm integration backend
* groq and description in node properties
* added groq in options
---------
* offset in chunks (#389)
* page number in gcs loader (#393)
* added youtube timestamps (#392)
* chat pop up button (#387)
* expand
* minimize-icon
* css changes
* chat history
* chatbot wider Side Nav
* expand icon
* chatbot UI
* Delete
* merge fixes
* code suggestions
---------
* chunks create before extraction using is_pre_process variable (#383)
* chunks create before extraction using is_pre_process variable
* Return total pages for Model
* update requirement.txt
* total pages on uplaod API
* added the Confirmation Dialog
* added the selected files into the confirmation modal
* format and lint fixes
* added the stop watch image
* fileselection on alert dialog
* Add timeout in docker for gunicorn workers
* Add cancel icon to info popup (#384)
* Info Modal Changes
* css changes
* recent merges
* Integration_qa test (#375)
* Test IntegrationQA added
* update test cases
* update test
* update node count assertions
* test changes
* update changes
* modification test
* Code refatctor test cases
* Handle allowedlist issue in test
* test changes
* update test
* test case execution
* test chatbot updates
* test case update file
* added file
---------
* fixed status blank issue
* Rendering the file name instead of link for gcs and s3 sources in the info modal
* added the default page size
* Convert is_cancelled value from string to bool
* Issue fixed Processed chunked as 0 when file re-process again
* Youtube timestamps (#386)
* Wikipedia source to accept all valid urls
* wikipedia url to support multiple languages
* integrated wiki langauge param for extract api
* Youtube video timestamps
---------
* groq llm integration backend (#286)
* groq llm integration backend
* groq and description in node properties
* added groq in options
---------
* Save Total Pages in DB
* Added total Pages
* file selection when we didn't select anything from Main table
* added the danger icon only for large files
* added the overflow for more files and file selection for all new files
* moved the interface to types
* added the icon accoroding to the source
* set total page for wiki and youtube
* h3 heading
* merge
* updated the alert on basis if total pages
* deleted chunks
* polling based on total pages
* isNan check
* large file based on file size for s3 and gcs
* file source in server side event
* time calculation based on chunks for gcs and s3
---------
* fixed the layout issue
* Populate graph schema (#399)
* crreate new endpoint populate_graph_schema and update the query for getting lables from DB
* Added main.py changes
* conditionally-including-the-gcs-login-flow-in-gcs-as-source (#396)
* added the condtion
* removed llms
* Fixed issue : Remove extra unused param
* get emb only if used (#278)
* Chatbot chunks (#402)
* Added file name to the content sent to LLM
* added chunk text in the response
* increased the docs parts sent to llm
* Modified graph query
* mardown rendering
* youtube starttime
* icons
* offset changes
* removed the files due to codespace space issue
---------
* Settings modal to support generating the labels from the llm by using text given by user (#405)
* added the json
* added schema from text dialog
* integrated the schemaAPI
* added the alert
* resize fixes
* fixed css issue
* fixed status blank issue
* Modified response when no docs is retrived (#413)
* Fixed env/docker-compose for local deployments + README doc (#410)
* Fixed env/docker-compose for local deployments + README doc
* wrong place for ENV in README
* by default, removed langsmith + fixed knn score string to float
* by default, removed langsmith + fixed knn score string to float
* Fixed strings in docker-compose env
* Added requirements (neo4j 5.15 or later, APOC, and instructions for Neo4j Desktop)
* Missed the TIME_PER_PAGE env, was causing NaN issue in the approx time processing notification. fixed that
* Support for all unstructured files (#401)
* all unstructured files
* responsiveness
* added file type
* added the extensions
* spell mistake
* ppt file changes
---------
* Settings modal to support generating the labels from the llm by using text given by user with checkbox (#415)
* added the json
* added schema from text dialog
* integrated the schemaAPI
* added the alert
* resize fixes
* Extract schema using direct ChatOpenAI API and Chain
* integrated the checkbox for schema to text dialog
* Update SettingModal.tsx
---------
* gcs file content read via storage client (#417)
* gcs file content read via storage client
* added the access token the file state
---------
* pypdf2 to read files from gcs (#420)
* 407 remove driver from frontend (#416)
* removed driver
* removed API
* connecting to database on page refresh
---------
* Css handling of info modal and Tooltips (#418)
* css change
* toolTips
* Sidebar Tooltips
* copy to clip
* css change
* added image types
* added gcs
* type fix
* docker changes
* speech
* added the toolip for dropzone sources
---------
* Fixed retrival bugs (#421)
* yarn format fixes
* changed the delete message
* added the cancel button
* changed the message on tooltip
* added space
* UI fixes
* tooltip for setting
* updated req
* wikipedia URL input (#424)
* accept only wikipedia links
* added wikipedia link
* added wikilink regex
* wikipedia single url only
* changed the alert message
* wording change
* pushed validation state persist error
---------
* speech and copy (#422)
* speech and copy
* startTime
* added chunk properties
* tooltips
---------
* Fixed issue for out of range in KNN API
* solved conflicts
* conflict solved
* Remove logging info from update KNN API
* tooltip changes
* format and lint fixes
* responsiveness changes
* Fixed issue for total pages GCS, S3
* UI polishing (#428)
* button and tooltip changes
* checking validation on change
* settings module populate fix
* format fixes
* opening the modal after auth success
* removed the limit
* added the scrobar for dropdowns
* speech state (#426)
* speech state
* Button Details changes
* delete wording change
* Total pages in buckets (#431)
* page number NA for buckets
* added N/A for gcs and s3 pages
* total pages for gcs
* remove unwanted logger
---------
* removed the max width
* Update FileTable.tsx
* Update the docker file
* Modified prompt (#438)
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* rendering Fix
* Local file upload gcs (#442)
* Uplaod file to GCS
* GCS local upload fixed issue and delete file from GCS after processing and failed or cancelled
* Add life cycle rule on uploaded bucket
* pdf upload local and gcs bucket check
* delete files when processed and extract changes
---------
* Modified chat length and entities used (#443)
* metadata for unstructured files (#446)
* Unstructured file metadata (#447)
* metadata for unstructured files
* sleep in gcs upload
* updated
* icons added to chunks (#435)
* icons added to chunks
* info modal icons
---------
Co-authored-by: aashipandya <156318202+aashipandya@users.noreply.github.com>
Co-authored-by: abhishekkumar-27 <164544129+abhishekkumar-27@users.noreply.github.com>
Co-authored-by: kartikpersistent <101251502+kartikpersistent@users.noreply.github.com>
Co-authored-by: vasanthasaikalluri <165021735+vasanthasaikalluri@users.noreply.github.com>
Co-authored-by: Prakriti Solankey <156313631+prakriti-solankey@users.noreply.github.com>
Co-authored-by: Ajay Meena <meenajy1996@gmail.com>
Co-authored-by: Morgan Senechal <morgan@neo4j.com>
Co-authored-by: karanchellani <142801957+karanchellani@users.noreply.github.com>
1 parent ac6de74 commit 6a02b34

19 files changed, +288 -163 lines changed

backend/Dockerfile

Lines changed: 18 additions & 11 deletions
@@ -1,15 +1,22 @@
-FROM python:3.10
+FROM python:3.10-slim
 WORKDIR /code
 ENV PORT 8000
 EXPOSE 8000
+# Install dependencies and clean up in one layer
+RUN apt-get update && \
+    apt-get install -y --no-install-recommends \
+    libgl1-mesa-glx \
+    cmake \
+    poppler-utils \
+    tesseract-ocr && \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*
+# Set LD_LIBRARY_PATH
+ENV LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
+# Copy requirements file and install Python dependencies
+COPY requirements.txt /code/
+RUN pip install --no-cache-dir --upgrade -r requirements.txt
+# Copy application code
 COPY . /code
-RUN apt-get update \
-    && apt-get install -y libgl1-mesa-glx cmake \
-    && apt-get install -y poppler-utils \
-    && apt install -y tesseract-ocr \
-    && export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH \
-    && pip install --no-cache-dir --upgrade -r /code/requirements.txt
-
-# CMD ["uvicorn", "score:app", "--host", "0.0.0.0", "--port", "8000","--workers", "4"]
-CMD ["gunicorn", "score:app","--workers","4","--worker-class","uvicorn.workers.UvicornWorker", "--bind", "0.0.0.0:8000", "--timeout", "300"]
-
+# Set command
+CMD ["gunicorn", "score:app", "--workers", "2", "--worker-class", "uvicorn.workers.UvicornWorker", "--bind", "0.0.0.0:8000", "--timeout", "300"]

backend/example.env

Lines changed: 1 addition & 0 deletions
@@ -20,3 +20,4 @@ LANGCHAIN_API_KEY = ""
 LANGCHAIN_PROJECT = ""
 LANGCHAIN_TRACING_V2 = ""
 LANGCHAIN_ENDPOINT = ""
+GCS_FILE_CACHE = "" #save the file into GCS or local, SHould be True or False
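The new GCS_FILE_CACHE flag decides whether uploaded files are cached in a GCS bucket or kept on local disk. A minimal sketch of how the backend might read it; upload_file_to_gcs and BUCKET_UPLOAD come from this commit (gcs_bucket.py and the shared constants), while save_file_chunk_locally is a hypothetical local fallback:

import os

from src.document_sources.gcs_bucket import upload_file_to_gcs
from src.shared.constants import BUCKET_UPLOAD

def gcs_file_cache_enabled() -> bool:
    # The env var holds the string "True" or "False", per the comment in example.env
    return os.environ.get("GCS_FILE_CACHE", "False").lower() == "true"

def store_chunk(file_chunk, chunk_number, original_file_name):
    # Hypothetical dispatcher: push chunks to GCS when the cache flag is on,
    # otherwise fall back to a local chunk directory.
    if gcs_file_cache_enabled():
        upload_file_to_gcs(file_chunk, chunk_number, original_file_name, BUCKET_UPLOAD)
    else:
        save_file_chunk_locally(file_chunk, chunk_number, original_file_name)  # hypothetical local helper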

backend/requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ google-cloud-aiplatform
 google-cloud-bigquery==3.19.0
 google-cloud-core==2.4.1
 google-cloud-resource-manager==1.12.3
-google-cloud-storage==2.16.0
+google-cloud-storage
 google-crc32c==1.5.0
 google-resumable-media==2.7.0
 googleapis-common-protos==1.63.0

backend/score.py

Lines changed: 3 additions & 3 deletions
@@ -162,7 +162,7 @@ async def extract_knowledge_graph_from_file(
         merged_file_path = os.path.join(MERGED_DIR,file_name)
         logging.info(f'File path:{merged_file_path}')
         result = await asyncio.to_thread(
-            extract_graph_from_file_local_file, graph, model, file_name, merged_file_path, allowedNodes, allowedRelationship)
+            extract_graph_from_file_local_file, graph, model, merged_file_path, file_name, allowedNodes, allowedRelationship)

     elif source_type == 's3 bucket' and source_url:
         result = await asyncio.to_thread(
@@ -191,7 +191,7 @@ async def extract_knowledge_graph_from_file(
         error_message = str(e)
         graphDb_data_Access.update_exception_db(file_name,error_message)
         if source_type == 'local file':
-            delete_uploaded_local_file(merged_file_path, file_name)
+            delete_file_from_gcs(BUCKET_UPLOAD,file_name)
         josn_obj = {'message':message,'error_message':error_message, 'file_name': file_name,'status':'Failed','db_url':uri,'failed_count':1, 'source_type': source_type}
         logger.log_struct(josn_obj)
         logging.exception(f'File Failed in extraction: {josn_obj}')
@@ -342,7 +342,7 @@ async def upload_large_file_into_chunks(file:UploadFile = File(...), chunkNumber
                         password=Form(None), database=Form(None)):
     try:
         graph = create_graph_database_connection(uri, userName, password, database)
-        result = await asyncio.to_thread(upload_file, graph, model, file, chunkNumber, totalChunks, originalname, CHUNK_DIR, MERGED_DIR)
+        result = await asyncio.to_thread(upload_file, graph, model, file, chunkNumber, totalChunks, originalname, uri, CHUNK_DIR, MERGED_DIR)
         josn_obj = {'api_name':'upload','db_url':uri}
         logger.log_struct(josn_obj)
         if int(chunkNumber) == int(totalChunks):
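For reference, the chunked-upload endpoint above accepts multipart form data. A hedged client-side sketch follows; the route path "/upload" and the example values are assumptions, and only the field names visible in this handler (file, chunkNumber, totalChunks, originalname, model, uri, userName, password, database) are taken from the code:

import requests

with open("report.pdf", "rb") as f:
    response = requests.post(
        "http://localhost:8000/upload",  # assumed route for upload_large_file_into_chunks
        data={
            "chunkNumber": 1,
            "totalChunks": 1,
            "originalname": "report.pdf",
            "model": "<model-name>",       # placeholder
            "uri": "neo4j+s://<host>",     # placeholder connection details
            "userName": "neo4j",
            "password": "<password>",
            "database": "neo4j",
        },
        files={"file": f},
    )
    print(response.json())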

backend/src/QA_integration_new.py

Lines changed: 18 additions & 9 deletions
@@ -33,7 +33,7 @@
 MATCH (chunk)-[:PART_OF]->(d:Document)
 CALL { WITH chunk
 MATCH (chunk)-[:HAS_ENTITY]->(e)
-MATCH path=(e)(()-[rels:!HAS_ENTITY&!PART_OF]-()){0,3}(:!Chunk&!Document)
+MATCH path=(e)(()-[rels:!HAS_ENTITY&!PART_OF]-()){0,2}(:!Chunk&!Document)
 UNWIND rels as r
 RETURN collect(distinct r) as rels
 }
@@ -45,22 +45,26 @@
 apoc.text.join(texts,"\n----\n") +
 apoc.text.join(entities,"\n")
 as text, entities, chunkIds, page_numbers ,start_times
-RETURN text, score, {source: COALESCE(CASE WHEN d.url CONTAINS "None" THEN d.fileName ELSE d.url END, d.fileName), chunkIds:chunkIds, page_numbers:page_numbers,start_times:start_times} as metadata
+RETURN text, score, {source: COALESCE(CASE WHEN d.url CONTAINS "None" THEN d.fileName ELSE d.url END, d.fileName), chunkIds:chunkIds, page_numbers:page_numbers,start_times:start_times,entities:entities} as metadata
 """

 SYSTEM_TEMPLATE = """
-You are an AI-powered question-answering agent. Your task is to provide accurate and concise responses to user queries based on the given context, chat history, and available resources.
+You are an AI-powered question-answering agent. Your task is to provide accurate and comprehensive responses to user queries based on the given context, chat history, and available resources.

 ### Response Guidelines:
-1. **Direct Answers**: Provide straightforward answers to the user's queries without headers unless requested. Avoid speculative responses.
+1. **Direct Answers**: Provide clear and thorough answers to the user's queries without headers unless requested. Avoid speculative responses.
 2. **Utilize History and Context**: Leverage relevant information from previous interactions, the current user input, and the context provided below.
 3. **No Greetings in Follow-ups**: Start with a greeting in initial interactions. Avoid greetings in subsequent responses unless there's a significant break or the chat restarts.
 4. **Admit Unknowns**: Clearly state if an answer is unknown. Avoid making unsupported statements.
 5. **Avoid Hallucination**: Only provide information based on the context provided. Do not invent information.
-6. **Response Length**: Keep responses concise and relevant. Aim for clarity and completeness within 2-3 sentences unless more detail is requested.
+6. **Response Length**: Keep responses concise and relevant. Aim for clarity and completeness within 4-5 sentences unless more detail is requested.
 7. **Tone and Style**: Maintain a professional and informative tone. Be friendly and approachable.
 8. **Error Handling**: If a query is ambiguous or unclear, ask for clarification rather than providing a potentially incorrect answer.
 9. **Fallback Options**: If the required information is not available in the provided context, provide a polite and helpful response. Example: "I don't have that information right now." or "I'm sorry, but I don't have that information. Is there something else I can help with?"
+10. **Context Availability**: If the context is empty, do not provide answers based solely on internal knowledge. Instead, respond appropriately by indicating the lack of information.
+
+
+**IMPORTANT** : DO NOT ANSWER FROM YOUR KNOWLEDGE BASE USE THE BELOW CONTEXT

 ### Context:
 <context>
@@ -72,15 +76,18 @@
 AI Response: 'Hello there! How can I assist you today?'

 User: "What is Langchain?"
-AI Response: "Langchain is a framework that enables the development of applications powered by large language models, such as chatbots."
+AI Response: "Langchain is a framework that enables the development of applications powered by large language models, such as chatbots. It simplifies the integration of language models into various applications by providing useful tools and components."

 User: "Can you explain how to use memory management in Langchain?"
-AI Response: "Langchain's memory management involves utilizing built-in mechanisms to manage conversational context effectively, ensuring a coherent user experience."
+AI Response: "Langchain's memory management involves utilizing built-in mechanisms to manage conversational context effectively. It ensures that the conversation remains coherent and relevant by maintaining the history of interactions and using it to inform responses."

 User: "I need help with PyCaret's classification model."
-AI Response: "PyCaret simplifies the process of building and deploying machine learning models. For classification tasks, you can use PyCaret's setup function to prepare your data, then compare and tune models."
+AI Response: "PyCaret simplifies the process of building and deploying machine learning models. For classification tasks, you can use PyCaret's setup function to prepare your data. After setup, you can compare multiple models to find the best one, and then fine-tune it for better performance."
+
+User: "What can you tell me about the latest realtime trends in AI?"
+AI Response: "I don't have that information right now. Is there something else I can help with?"

-Note: This system does not generate answers based solely on internal knowledge. It answers from the information provided in the user's current and previous inputs, and from explicitly referenced external sources.
+Note: This system does not generate answers based solely on internal knowledge. It answers from the information provided in the user's current and previous inputs, and from the context.
 """

 def get_neo4j_retriever(graph, index_name="vector", search_k=CHAT_SEARCH_KWARG_K, score_threshold=CHAT_SEARCH_KWARG_SCORE_THRESHOLD):
@@ -288,7 +295,9 @@ def QA_RAG(graph,model,question,session_id):
             }
         )
         if docs:
+            # print(docs)
            formatted_docs,sources = format_documents(docs)
+
            doc_retrieval_time = time.time() - start_time
            logging.info(f"Modified question and Documents retrieved in {doc_retrieval_time:.2f} seconds")
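The vector retrieval query above (now capped at two relationship hops and returning entities in the metadata) is the kind of retrieval_query a LangChain Neo4j vector retriever consumes. A minimal sketch, assuming OpenAI embeddings and placeholder connection details; VECTOR_GRAPH_QUERY is a stand-in name for the module-level query string, and the exact wiring inside get_neo4j_retriever may differ:

from langchain_community.vectorstores import Neo4jVector
from langchain_openai import OpenAIEmbeddings

# Stand-in name for the retrieval query shown in the diff above.
VECTOR_GRAPH_QUERY = "...retrieval query from the diff..."

neo_db = Neo4jVector.from_existing_index(
    OpenAIEmbeddings(),
    url="neo4j+s://<host>",   # placeholder
    username="neo4j",
    password="<password>",
    index_name="vector",      # default index name used by get_neo4j_retriever
    retrieval_query=VECTOR_GRAPH_QUERY,
)
retriever = neo_db.as_retriever(search_kwargs={"k": 3})
docs = retriever.get_relevant_documents("What is Langchain?")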

backend/src/chunkid_entities.py

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@
 MATCH (chunk)-[:PART_OF]->(d:Document)
 CALL {WITH chunk
 MATCH (chunk)-[:HAS_ENTITY]->(e)
-MATCH path=(e)(()-[rels:!HAS_ENTITY&!PART_OF]-()){0,3}(:!Chunk&!Document)
+MATCH path=(e)(()-[rels:!HAS_ENTITY&!PART_OF]-()){0,2}(:!Chunk&!Document)
 UNWIND rels as r
 RETURN collect(distinct r) as rels
 }
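The same neighborhood expansion, reduced here from three hops to two, backs the chunk-details view. A hedged sketch of running such a query directly through LangChain's Neo4jGraph; the query-string name and the parameter name are stand-ins, not the identifiers used in chunkid_entities.py:

from langchain_community.graphs import Neo4jGraph

graph = Neo4jGraph(url="neo4j+s://<host>", username="neo4j", password="<password>")

CHUNK_ENTITIES_QUERY = "...query shown in the diff above..."  # stand-in name
records = graph.query(CHUNK_ENTITIES_QUERY, params={"chunksArr": ["chunk-id-1", "chunk-id-2"]})  # parameter name is an assumption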

backend/src/document_sources/gcs_bucket.py

Lines changed: 69 additions & 4 deletions
@@ -7,6 +7,7 @@
 from PyPDF2 import PdfReader
 import io
 from google.oauth2.credentials import Credentials
+import time

 def get_gcs_bucket_files_info(gcs_project_id, gcs_bucket_name, gcs_bucket_folder, creds):
     storage_client = storage.Client(project=gcs_project_id, credentials=creds)
@@ -41,7 +42,7 @@ def get_gcs_bucket_files_info(gcs_project_id, gcs_bucket_name, gcs_bucket_folder
 def load_pdf(file_path):
     return PyMuPDFLoader(file_path)

-def get_documents_from_gcs(gcs_project_id, gcs_bucket_name, gcs_bucket_folder, gcs_blob_filename, access_token):
+def get_documents_from_gcs(gcs_project_id, gcs_bucket_name, gcs_bucket_folder, gcs_blob_filename, access_token=None):

     if gcs_bucket_folder is not None:
         if gcs_bucket_folder.endswith('/'):
@@ -56,8 +57,12 @@ def get_documents_from_gcs(gcs_project_id, gcs_bucket_name, gcs_bucket_folder, g
     # pages = loader.load()
     # file_name = gcs_blob_filename
     #creds= Credentials(access_token)
-    creds= Credentials(access_token)
-    storage_client = storage.Client(project=gcs_project_id, credentials=creds)
+    if access_token is None:
+        storage_client = storage.Client(project=gcs_project_id)
+    else:
+        creds= Credentials(access_token)
+        storage_client = storage.Client(project=gcs_project_id, credentials=creds)
+    print(f'BLOB Name: {blob_name}')
     bucket = storage_client.bucket(gcs_bucket_name)
     blob = bucket.blob(blob_name)
     content = blob.download_as_bytes()
@@ -70,4 +75,64 @@
         text += page.extract_text()
     pages = [Document(page_content = text)]
     return gcs_blob_filename, pages
-
+
+def upload_file_to_gcs(file_chunk, chunk_number, original_file_name, bucket_name):
+    storage_client = storage.Client()
+
+    file_name = f'{original_file_name}_part_{chunk_number}'
+    bucket = storage_client.bucket(bucket_name)
+    file_data = file_chunk.file.read()
+    # print(f'data after read {file_data}')
+
+    blob = bucket.blob(file_name)
+    file_io = io.BytesIO(file_data)
+    blob.upload_from_file(file_io)
+    # Define the lifecycle rule to delete objects after 6 hours
+    # rule = {
+    #     "action": {"type": "Delete"},
+    #     "condition": {"age": 1}  # Age in days (24 hours = 1 days)
+    # }
+
+    # # Get the current lifecycle policy
+    # lifecycle = list(bucket.lifecycle_rules)
+
+    # # Add the new rule
+    # lifecycle.append(rule)
+
+    # # Set the lifecycle policy on the bucket
+    # bucket.lifecycle_rules = lifecycle
+    # bucket.patch()
+    time.sleep(1)
+    logging.info('Chunk uploaded successfully in gcs')
+
+def merge_file_gcs(bucket_name, original_file_name: str):
+    storage_client = storage.Client()
+    # Retrieve chunks from GCS
+    blobs = storage_client.list_blobs(bucket_name, prefix=f"{original_file_name}_part_")
+    chunks = []
+    for blob in blobs:
+        chunks.append(blob.download_as_bytes())
+        blob.delete()
+
+    # Merge chunks into a single file
+    merged_file = b"".join(chunks)
+    blob = storage_client.bucket(bucket_name).blob(original_file_name)
+    logging.info('save the merged file from chunks in gcs')
+    file_io = io.BytesIO(merged_file)
+    blob.upload_from_file(file_io)
+    pdf_reader = PdfReader(file_io)
+    file_size = len(merged_file)
+    total_pages = len(pdf_reader.pages)
+
+    return total_pages, file_size
+
+def delete_file_from_gcs(bucket_name, file_name):
+    try:
+        storage_client = storage.Client()
+        bucket = storage_client.bucket(bucket_name)
+        blob = bucket.blob(file_name)
+        if blob.exists():
+            blob.delete()
+        logging.info('File deleted from GCS successfully')
+    except:
+        raise Exception('BLOB not exists in GCS')
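Taken together, the new helpers cover the lifecycle of a chunked upload in GCS: each part is stored as "<name>_part_<n>", the parts are merged into a single blob once the last chunk arrives, and the merged object is deleted after processing finishes, fails, or is cancelled. A sketch of how a caller might string them together; the orchestration itself is an assumption, only the three helper functions and BUCKET_UPLOAD come from this commit:

from src.document_sources.gcs_bucket import upload_file_to_gcs, merge_file_gcs, delete_file_from_gcs
from src.shared.constants import BUCKET_UPLOAD

def handle_upload_chunk(file_chunk, chunk_number, total_chunks, original_name):
    # Store this part as "<original_name>_part_<chunk_number>" in the upload bucket
    upload_file_to_gcs(file_chunk, chunk_number, original_name, BUCKET_UPLOAD)
    if int(chunk_number) == int(total_chunks):
        # Last part: merge all "<original_name>_part_*" blobs into one object
        # and report the PDF page count and merged size back to the caller.
        total_pages, file_size = merge_file_gcs(BUCKET_UPLOAD, original_name)
        return total_pages, file_size
    return None, None

def cleanup_after_extraction(original_name):
    # Once extraction succeeds, fails, or is cancelled, drop the merged object from GCS.
    delete_file_from_gcs(BUCKET_UPLOAD, original_name)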

backend/src/document_sources/local_file.py

Lines changed: 5 additions & 2 deletions
@@ -43,13 +43,17 @@ def get_documents_from_file_by_path(file_path,file_name):

             if page.metadata['page_number']>page_number:
                 page_number+=1
+                if not metadata:
+                    metadata = {'total_pages':unstructured_pages[-1].metadata['page_number']}
                 pages.append(Document(page_content = page_content, metadata=metadata))
                 page_content=''

             if page == unstructured_pages[-1]:
+                if not metadata:
+                    metadata = {'total_pages':unstructured_pages[-1].metadata['page_number']}
                 pages.append(Document(page_content = page_content, metadata=metadata))

-            elif page.metadata['category']=='PageBreak':
+            elif page.metadata['category']=='PageBreak' and page!=unstructured_pages[0]:
                 page_number+=1
                 pages.append(Document(page_content = page_content, metadata=metadata))
                 page_content=''
@@ -65,5 +69,4 @@ def get_documents_from_file_by_path(file_path,file_name):
     else:
         logging.info(f'File {file_name} does not exist')
         raise Exception(f'File {file_name} does not exist')
-
     return file_name, pages , file_extension
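The total_pages value written into the Document metadata above is what lets callers report page counts for unstructured files. A small hedged sketch of reading it back and turning it into a processing-time estimate; TIME_PER_PAGE is an env var mentioned in this PR's description, but using it here, and the default of 50 seconds, are assumptions:

import os

from src.document_sources.local_file import get_documents_from_file_by_path

file_name, pages, file_extension = get_documents_from_file_by_path("/tmp/report.pptx", "report.pptx")
total_pages = pages[0].metadata.get("total_pages", 1) if pages else 1

# Rough estimate for an "approximate processing time" style notification
time_per_page = float(os.environ.get("TIME_PER_PAGE", "50"))  # assumed default
estimated_seconds = total_pages * time_per_page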

backend/src/graphDB_dataAccess.py

Lines changed: 3 additions & 3 deletions
@@ -2,8 +2,8 @@
 import os
 from datetime import datetime
 from langchain_community.graphs import Neo4jGraph
-from src.shared.common_fn import delete_uploaded_local_file
-from src.api_response import create_api_response
+from src.document_sources.gcs_bucket import delete_file_from_gcs
+from src.shared.constants import BUCKET_UPLOAD
 from src.entities.source_node import sourceNode
 import json

@@ -175,7 +175,7 @@ def delete_file_from_graph(self, filenames, source_types, deleteEntities:str, me
             merged_file_path = os.path.join(merged_dir, file_name)
             if source_type == 'local file':
                 logging.info(f'Deleted File Path: {merged_file_path} and Deleted File Name : {file_name}')
-                delete_uploaded_local_file(merged_file_path, file_name)
+                delete_file_from_gcs(BUCKET_UPLOAD,file_name)

             query_to_delete_document="""
                 MATCH (d:Document) where d.fileName in $filename_list and d.fileSource in $source_types_list
