forked from langchain4j/langchain4j
Sorry for a huge PR...

- added retries to OpenAiChatModel
- added @UserName: an option to define a name of a user as a parameter in AI Services API
- added an option to split multiple documents at once (see DocumentSplitter)
- redesigned document loaders (see FileSystemDocumentLoader)
- renamed DocumentSegment into TextSegment
- redesigned ConversationalRetrievalChain
- added EmbeddingStoreIngestor
- misc refactorings/fixes

---------

Co-authored-by: deep-learning-dynamo <deep-learning-dynamo@gmail.com>
1 parent cbc4462, commit 755c9d0
Showing 49 changed files with 869 additions and 411 deletions.
12 changes: 11 additions & 1 deletion
langchain4j-core/src/main/java/dev/langchain4j/data/document/DocumentSplitter.java
    @@ -1,8 +1,18 @@
     package dev.langchain4j.data.document;
     
    +import dev.langchain4j.data.segment.TextSegment;
    +
     import java.util.List;
     
    +import static java.util.stream.Collectors.toList;
    +
     public interface DocumentSplitter {
     
    -    List<DocumentSegment> split(Document document);
    +    List<TextSegment> split(Document document);
    +
    +    default List<TextSegment> split(List<Document> documents) {
    +        return documents.stream()
    +                .flatMap(document -> split(document).stream())
    +                .collect(toList());
    +    }
     }
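To illustrate how the new default `split(List<Document>)` overload behaves, here is a minimal, self-contained sketch. The `Document` and `TextSegment` classes below are simplified stand-ins written for this example (the real library types are not shown in this diff), and `LineSplitter` is a hypothetical implementation that emits one segment per non-blank line. Only the `DocumentSplitter` interface mirrors the diff above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import static java.util.stream.Collectors.toList;

// Simplified stand-in for the library's Document type (assumption).
final class Document {
    private final String text;
    Document(String text) { this.text = text; }
    String text() { return text; }
}

// Simplified stand-in for the library's TextSegment type (assumption).
final class TextSegment {
    private final String text;
    TextSegment(String text) { this.text = text; }
    String text() { return text; }
}

// Mirrors the interface shown in the diff.
interface DocumentSplitter {

    List<TextSegment> split(Document document);

    // Splits each document in turn and concatenates the results in input order.
    default List<TextSegment> split(List<Document> documents) {
        return documents.stream()
                .flatMap(document -> split(document).stream())
                .collect(toList());
    }
}

// Hypothetical splitter: one segment per non-blank line.
final class LineSplitter implements DocumentSplitter {
    @Override
    public List<TextSegment> split(Document document) {
        List<TextSegment> segments = new ArrayList<>();
        for (String line : document.text().split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                segments.add(new TextSegment(trimmed));
            }
        }
        return segments;
    }
}

class SplitDemo {
    public static void main(String[] args) {
        DocumentSplitter splitter = new LineSplitter();
        List<TextSegment> segments = splitter.split(Arrays.asList(
                new Document("alpha\nbeta"),
                new Document("gamma")));
        for (TextSegment segment : segments) {
            System.out.println(segment.text());
        }
    }
}
```

Because the list overload is a default method, an implementation only has to provide the single-document `split`; the multi-document variant comes for free and preserves the order of the input documents.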
6 changes: 3 additions & 3 deletions
langchain4j-core/src/main/java/dev/langchain4j/model/embedding/TokenCountEstimator.java
    @@ -1,14 +1,14 @@
     package dev.langchain4j.model.embedding;
     
    -import dev.langchain4j.data.document.DocumentSegment;
    +import dev.langchain4j.data.segment.TextSegment;
     
     import java.util.List;
     
     public interface TokenCountEstimator {
     
         int estimateTokenCount(String text);
     
    -    int estimateTokenCount(DocumentSegment documentSegment);
    +    int estimateTokenCount(TextSegment textSegment);
     
    -    int estimateTokenCount(List<DocumentSegment> documentSegments);
    +    int estimateTokenCount(List<TextSegment> textSegments);
     }
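As a rough illustration of how the three `estimateTokenCount` overloads might relate after the rename, here is a self-contained sketch. `WhitespaceTokenCountEstimator` is a hypothetical implementation invented for this example: it counts whitespace-separated words, which is not how a real model tokenizer works. `TextSegment` is again a simplified stand-in for the library type.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for the library's TextSegment type (assumption).
final class TextSegment {
    private final String text;
    TextSegment(String text) { this.text = text; }
    String text() { return text; }
}

// Mirrors the interface shown in the diff.
interface TokenCountEstimator {

    int estimateTokenCount(String text);

    int estimateTokenCount(TextSegment textSegment);

    int estimateTokenCount(List<TextSegment> textSegments);
}

// Hypothetical estimator: treats each whitespace-separated word as one token.
final class WhitespaceTokenCountEstimator implements TokenCountEstimator {

    @Override
    public int estimateTokenCount(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Delegates to the String overload using the segment's text.
    @Override
    public int estimateTokenCount(TextSegment textSegment) {
        return estimateTokenCount(textSegment.text());
    }

    // Sums the per-segment estimates.
    @Override
    public int estimateTokenCount(List<TextSegment> textSegments) {
        int total = 0;
        for (TextSegment segment : textSegments) {
            total += estimateTokenCount(segment);
        }
        return total;
    }
}

class EstimatorDemo {
    public static void main(String[] args) {
        TokenCountEstimator estimator = new WhitespaceTokenCountEstimator();
        List<TextSegment> segments = Arrays.asList(
                new TextSegment("a b c"),
                new TextSegment("d"));
        System.out.println(estimator.estimateTokenCount(segments));
    }
}
```

Delegating the segment and list overloads to the `String` overload keeps the counting logic in one place; whether the library's own implementations do this is not shown in the diff.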