Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic Chat templating code with text/json file based config; main chat updated to drive its in-prefix, in-suffix and reverse-prompt from same; chat-apply-template equivalent c-api to allow use by other codes also #6834

Draft
wants to merge 219 commits into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
219 commits
Select commit Hold shift + click to select a range
2146a25
ChatOn: Capture the idea
hanishkvc Apr 22, 2024
35f2519
ChatOn:Common: Add the needed cmdline arg params and its parsing
hanishkvc Apr 22, 2024
dc56be9
ChatOn:Main: Load and dump any specified chaton meta file
hanishkvc Apr 22, 2024
093abc2
ChatOn: Update sample meta json to be a valid json
hanishkvc Apr 22, 2024
1374a64
Chaton:Meta: Add chatml meta data to sample meta json file
hanishkvc Apr 22, 2024
050d329
ChatOn+Main: Initial go at chaton in main interactive flow
hanishkvc Apr 22, 2024
cdbe4f0
Chaton:Sample Meta JSON cleanup
hanishkvc Apr 22, 2024
d87d275
ChatOn: update sample meta json a bit
hanishkvc Apr 22, 2024
c4cf0e9
ChatON:Cleanup: BeginEnd, Debug log
hanishkvc Apr 22, 2024
f03dd24
ChatOn:No global-begin/end in ChatApplyTmplSingle, ChatApplyTmpl
hanishkvc Apr 22, 2024
221ccd6
ChatOn: Add SystemUser-1st-User-Has-Prefix flag support
hanishkvc Apr 22, 2024
11b47fb
ChatON:MetaJson: Add key constants, check metaJson loaded ifNeeded
hanishkvc Apr 22, 2024
b105564
ChatON: Update the notes a bit
hanishkvc Apr 23, 2024
e8c24c0
ChatOn:MetaOk: Allows template-id based cross check
hanishkvc Apr 23, 2024
efb758b
ChatON: Rename helpers to kv suffix, updated wrt metaok
hanishkvc Apr 23, 2024
42f6b45
ChatON: Use the constants defined for the keys
hanishkvc Apr 23, 2024
3f9dfc2
ChatON: Check for the boolean entries in meta-json
hanishkvc Apr 23, 2024
217544e
ChatON: Keep compiler happy
hanishkvc Apr 23, 2024
57bd772
ChatON: Cleanup logging
hanishkvc Apr 23, 2024
2a8028f
ChatON: Add Zephyr template to meta-json file
hanishkvc Apr 23, 2024
f4b5406
ChatON: Add template for Gemma
hanishkvc Apr 23, 2024
84367b9
ChatON: Add template for DeepSeek
hanishkvc Apr 23, 2024
bdd279c
ChatOn:User Begin+Prefix note update, keep things simple consistent
hanishkvc Apr 23, 2024
0f713d4
ChatOn: meta json update wrt the new begin related fields
hanishkvc Apr 23, 2024
d70fca7
ChatOn: Add begin to the mix along with prefix
hanishkvc Apr 23, 2024
724ff38
ChatOn: Wrap getting begin in try-catch,
hanishkvc Apr 23, 2024
f1f39c5
ChatON:Add Monarch model template, which uses Begin + Prefix
hanishkvc Apr 23, 2024
3064a36
ChatON+:Update tmpl_role_kv to retrieve wrt multiple keys
hanishkvc Apr 23, 2024
9de1d60
ChatON:ChatParts class initial go
hanishkvc Apr 23, 2024
d189972
ChatON: Test ChatParts in chat-template-apply
hanishkvc Apr 23, 2024
6b23f15
ChatON:ChatOnMetaJSon: Add suffix wrt assistant messages
hanishkvc Apr 24, 2024
92e780f
ChatON:ChatParts: Allow flexibility for more refined tokenization
hanishkvc Apr 24, 2024
825a78a
ChatOn: ChatTemplateApplySingle[Ex] return parts detail
hanishkvc Apr 24, 2024
3f09eb5
ChatOn: ChatTemplateApply[Ex] return tagged msgs parts detail
hanishkvc Apr 24, 2024
adab577
ChatON: more detailed/spreadout json fields
hanishkvc Apr 24, 2024
7ba0144
ChatOn:chaton_tmpl_role_kv: try except to ignore missing ifany
hanishkvc Apr 24, 2024
5d76f08
ChatON: Need to explicitly specify string to use c_str
hanishkvc Apr 24, 2024
f8ae21c
ChatON:ChatTemplateApplySingle: update begin+prefix, suffix+end
hanishkvc Apr 25, 2024
344857b
ChatOn:ChatOnTemplateApply: suffix,end flag based control
hanishkvc Apr 25, 2024
6a0214c
ChatON:MetaOK->MetaDump: Alert if user->end is needed or not
hanishkvc Apr 25, 2024
0cd7c62
ChatON: Keep compiler happy
hanishkvc Apr 25, 2024
bf1167b
ChatON: Backup the current simple meta json file
hanishkvc Apr 25, 2024
b9e3130
ChatON: Update to new detailed format wrt llama2 and llama3
hanishkvc Apr 25, 2024
13857f2
ChatON+Main: Updates wrt detailed meta json
hanishkvc Apr 25, 2024
01c8db7
ChatON+Main: Add C_API wrapper for single
hanishkvc Apr 26, 2024
ea3a0f1
ChatON: Rather check for tmpl existance in single_ex
hanishkvc Apr 26, 2024
e62699f
ChatON: Add alertAssistantAtEnd flag & logic wrt MultiMsgs Apply
hanishkvc Apr 26, 2024
308d3bf
ChatON:WIP:Add c api wrapper for chat_template_apply
hanishkvc Apr 26, 2024
a630564
ChatON:ChatTemplateApplyCAPI remaining base logic
hanishkvc Apr 26, 2024
58e1ff1
ChatON: switch to ordered_json from json library
hanishkvc Apr 26, 2024
fee887f
ChatON:Common:Update the cmdline argument name used
hanishkvc Apr 26, 2024
d61b071
Chaton:Common:Add missing newline wrt cmdline arg usage
hanishkvc Apr 26, 2024
a4b3285
ChatON:Show Log on screen when template is applied
hanishkvc Apr 27, 2024
403a6c4
ChatON:Gemma: update for detailed meta json
hanishkvc Apr 27, 2024
1b2e921
ChatON:DeepSeek: Update support wrt detailed meta json
hanishkvc Apr 27, 2024
006a398
ChatON:DeepSeekCoder: Update tmplid and wrt detailed meta json
hanishkvc Apr 27, 2024
18cd125
ChatON:Monarch:Update wrt detailed meta json
hanishkvc Apr 27, 2024
a64dcd7
ChatON:Zephyr: Update wrt detailed meta json, also update eos
hanishkvc Apr 27, 2024
368fbf1
ChatON:ChatML: Update wrt detailed meta json
hanishkvc Apr 27, 2024
ad5e521
ChatON:Mistral: Add detailed meta json entries
hanishkvc Apr 27, 2024
55e3d63
ChatON:Mistral: Update to match jinja file
hanishkvc Apr 27, 2024
cad50c5
ChatON: Update the note to match current logic
hanishkvc Apr 27, 2024
32e672c
ChatON: Dont log final tagged message string to screen
hanishkvc Apr 27, 2024
a724fd9
ChatON:Tests: Add a test templates program for chaton
hanishkvc Apr 27, 2024
c4e829d
ChatON:Mistral: Decouple \n from suffix, use wrt sys msg
hanishkvc Apr 27, 2024
ff5f688
ChatON:ChatTmplApplySingle: Avoid streamstring, update func notes
hanishkvc Apr 28, 2024
889a45f
ChatON:ChatTmplApply:Update the function notes
hanishkvc Apr 28, 2024
af9a0a2
ChatON:ChatTmplApply: Avoid the stringstream
hanishkvc Apr 28, 2024
ce75d43
SimpCfg: Initial skeleton : get and set string and bool values
hanishkvc Apr 28, 2024
f4687fa
SimpCfg:Parse config file and load string key-value fields
hanishkvc Apr 28, 2024
aea6850
SimpCfg: Keep compiler happy, also add newline wrt alt logging def
hanishkvc Apr 28, 2024
f728dbd
ChatON: Add simpcfg based config file matching chaton_meta.json
hanishkvc Apr 28, 2024
2cbb00c
SimpCfg: Add support for boolean fields wrt key-value
hanishkvc Apr 28, 2024
28ae0c5
SimpCfg:Make str_trim flexible, use to trim , wrt value
hanishkvc Apr 28, 2024
1ecca5a
SimpCfg: Convert to a class
hanishkvc Apr 28, 2024
6de8a14
SimpCfg: Rename member functions to avoid sc_ prefix
hanishkvc Apr 28, 2024
d514c81
SimpCfg: Add the const which I had forgotten wrt args
hanishkvc Apr 28, 2024
951fbc3
SimpCfg: Change logging to LDBUG and LERRR helpers
hanishkvc Apr 28, 2024
9940bd8
SimpCfg: Allow default values wrt set string and set bool
hanishkvc Apr 28, 2024
82348e2
SimpCfg: Put GroupMap back into Map, Iterate during get if DBUG
hanishkvc Apr 28, 2024
ca5a04d
SimpCfg: Remove double quotes around group, key or string value
hanishkvc Apr 28, 2024
0a534e6
SimpCfg: Rename test program related #define
hanishkvc Apr 28, 2024
d0b3ebf
SimpCfg: Use & wrt destination of [] operation
hanishkvc Apr 28, 2024
fb9a7dc
SimpCfg:Initial skeleton towards supporting int and floating point
hanishkvc Apr 28, 2024
4181164
SimpCfg:Implement set_int64 and set_double
hanishkvc Apr 28, 2024
a6648b0
SimpCfg:Show floating point values in normal and exponential form
hanishkvc Apr 28, 2024
000245b
SimpCfg:Warn possible nonstring strings, some invalid floats
hanishkvc Apr 28, 2024
8ad2c17
SimpCfg: get_int64 logic
hanishkvc Apr 28, 2024
44c0530
SimpCfg: Add support for get_double
hanishkvc Apr 28, 2024
a108000
ChatON:Phi3: Add template details to detailed meta.json
hanishkvc Apr 29, 2024
a095713
ChatON: meta-dump returns flag inturn returned by meta-ok
hanishkvc Apr 29, 2024
7302b3a
SimpCfg: Use stderr wrt internal Log messaging helpers
hanishkvc Apr 30, 2024
ee1a62c
SimpCfg:WIP:Switch to C++ Variant type - initial skeleton
hanishkvc Apr 30, 2024
1dc7fd0
SimpCfg:WIP:Variant TypeDef, to_str and std::get
hanishkvc Apr 30, 2024
f05f71b
SimpCfg:SetBool string value, str_tolower, SetValueTypeLogging
hanishkvc Apr 30, 2024
6b475e4
SimpCfg: Log Caller of Set/GetValue
hanishkvc Apr 30, 2024
5aa1072
SimpCfg: Move dump into its own func, Avoid KV iter wrt Get
hanishkvc Apr 30, 2024
ef5a2cf
SimpCfg:Dbug why bool is not setting properly
hanishkvc Apr 30, 2024
eb56517
SimpCfg:Bools:Make lowercase b4 checking true/false for bool path
hanishkvc Apr 30, 2024
0e0d7da
SimpCfg:Found issue with str_tolower, transform doesnt resize dst
hanishkvc Apr 30, 2024
08b9711
SimpCfg:Remove dbug logs wrt str_tolower and set_bool
hanishkvc Apr 30, 2024
8fdc805
SimpCfg:Cleanup:Cmdline Arg, GetValueCallerLogging, StringCmp
hanishkvc May 1, 2024
561f509
SimpCfg: initial go at adding support for spreadout arrays
hanishkvc May 1, 2024
1e1f54e
SimpCfg:GetArray flesh out, helpers to convert to string
hanishkvc May 1, 2024
86e776c
SimpCfg: Rename to get_vector, add some test code
hanishkvc May 1, 2024
56f19c7
SimpCfg: Test c++ string handling
hanishkvc May 2, 2024
713520c
SimpCfg:CheckStrings: Organise and Probe - p1 std::string
hanishkvc May 2, 2024
691d0d4
SimpCfg:CheckStrings: Organise and Probe - P2 - std::u8string
hanishkvc May 2, 2024
a448fec
SimpCfg:CheckString: organise and probe - p3 wstring
hanishkvc May 2, 2024
3ad5cec
SimpCfg:CheckStrings:MacOS, wstring and wcout
hanishkvc May 2, 2024
66d6fa6
SimpCfg: C++ and strings is a mess even after decades
hanishkvc May 2, 2024
1a618a4
SimpCfg: Update the func notes with alert
hanishkvc May 2, 2024
7607dbc
SimpCfg:CheckStrings: Try fixup wstring handling
hanishkvc May 2, 2024
2cda78f
SimpCfg:CheckStrings: WString2String finally
hanishkvc May 2, 2024
23acf07
SimpCfg:CheckStrings: Cleanup wstring flow to needed parts
hanishkvc May 2, 2024
2325764
SimpCfg:CheckStrings: Switch Mbs2Wcs to multithread safe calls
hanishkvc May 2, 2024
d1156cc
SimpCfg: As locale manipulation reqd for better processing
hanishkvc May 3, 2024
cae0fff
SimpCfg: Update notes; Try add a better trimming logic
hanishkvc May 3, 2024
554b00f
SimpCfg: Add some missing const refs
hanishkvc May 3, 2024
bf111a8
SimpCfg:TemplatedDumbTrim; Test dumb and oversmart trim logics
hanishkvc May 3, 2024
97ac443
SimpCfg:Cleanup, updated notes, templated code
hanishkvc May 3, 2024
d030a26
SimpCfg:Update TrimOverSmart use templated TrimDumb after wstrconv
hanishkvc May 3, 2024
5b8bf84
SimpCfg: Fixed & ~Variable Length to Native & MultiNativeCharSize
hanishkvc May 4, 2024
32ba195
SimpCfg: Templatize str_trim_single
hanishkvc May 4, 2024
33619a3
SimpCfg: Templatize str_lower
hanishkvc May 4, 2024
3287fdb
SimpCfg:Fix/cleanup trim related test samples and flow
hanishkvc May 4, 2024
f53c19b
SimpCfg: Update the notes wrt tolower and add test code
hanishkvc May 4, 2024
20e5b38
SimpCfg:Trim DumpHexString only if SC_DEBUG_VERBOSE
hanishkvc May 4, 2024
2b14bca
SimpCfg:ChatON: add by Humans for All note
hanishkvc May 4, 2024
5380b1e
ChatON:Update meta.json wrt command-r models
hanishkvc May 4, 2024
c6ecd93
SimpCfg: Use to_str instead of using stringstream directly
hanishkvc May 4, 2024
623d0b6
SimpCfg: General MultiPart support, KeyParts not Key wrt SetValue
hanishkvc May 4, 2024
19d3c88
SimpCfg:MultiPart keys wrt get_value etal
hanishkvc May 4, 2024
344c068
SimpCfg:MultiPart keys wrt get_vector
hanishkvc May 4, 2024
989c6c4
SimpCfg: Cleanup the Note a bit to avoid some ambiguities
hanishkvc May 5, 2024
93115a9
ChatON: initial go at OrionStar Ai chat model template
hanishkvc May 5, 2024
0f8f2a1
ChatON:chat template for OpenChat in meta.json initial go
hanishkvc May 5, 2024
b875b02
ChatON:Initial go at vicuna chat template in meta.json
hanishkvc May 5, 2024
f6a86cd
ChatON: Update the Note a bit
hanishkvc May 7, 2024
04b4a15
ChatON: Initial go at chat-template-apply c-api with parts info
hanishkvc May 7, 2024
7c288d3
ChatON: Rename to partstypes for consistency
hanishkvc May 7, 2024
43a3a91
ChatON: Cleanup/Refine initial go at tmpl_apply_ex_capi
hanishkvc May 7, 2024
0852f3b
ChatON:ExCApi: Rename for consistency
hanishkvc May 7, 2024
b3a5654
ChatON:Reposition alertAssistantAtEnd flag for consistency
hanishkvc May 7, 2024
76791ba
ChatON:Fix partsLengths to int32_t type, instead of int
hanishkvc May 7, 2024
8dfa31b
ChatON: Make c-api wrappers a bit robust incl some cross checks
hanishkvc May 8, 2024
b6da7d9
ChatON: tokenize keeping in mind the taggedMessage subparts
hanishkvc May 8, 2024
868ab60
ChatON: Add forceParseSpecial flag to subparts aware tokenizing
hanishkvc May 8, 2024
0d81ffe
Tests:ChatON: Add partial skeleton wrt subparts tokenizing
hanishkvc May 8, 2024
a49697b
ChatON: Keep compiler happy simbly
hanishkvc May 8, 2024
8fe8231
ChatON:SubPartsAwareTokenizePath: Allow extract subparts testing
hanishkvc May 8, 2024
abb406b
Merge branch 'master' into hkvc_chaton_v3
hanishkvc May 10, 2024
1f9a0eb
ChatON: Remove unneeded iostream
hanishkvc May 10, 2024
fe27902
SimpCfg: Avoid iostream/cout and format for direct library use
hanishkvc May 10, 2024
c0506f9
SimpCfg: Allow for direct initialization lists based init
hanishkvc May 10, 2024
86b842b
GroupKV: Duplicate SimpCfg to chop down into GroupKV
hanishkvc May 11, 2024
d764a9d
GroupKV: Simplify code to the minimal needed for GroupKV - P1
hanishkvc May 11, 2024
7d7c59e
GroupKV:Simplify:P2: Rename tags, Make debug logs conditional
hanishkvc May 11, 2024
0342124
GroupKV: Add to_str wrt vectors, help avoid compiler confusion
hanishkvc May 11, 2024
7f03dd0
GroupKV: Add int32_t to variant list, to simplify int use
hanishkvc May 11, 2024
fdefb39
GroupKV:Make LDBUG macros conditional, avoid condition at usage site
hanishkvc May 11, 2024
dde72df
GroupKV: Rename the internal map
hanishkvc May 11, 2024
f294fdd
GroupKV: Add group_exists checker
hanishkvc May 11, 2024
e999934
ChatON:WIP: initial go at GroupKV based flow, instead of json
hanishkvc May 11, 2024
9d4450d
GroupKV: Let dump return a string, rather than printing/logging
hanishkvc May 11, 2024
484c710
GroupKV:Add GetValue which throws exception
hanishkvc May 11, 2024
4a9a6ce
ChatON: ChatONMetaDump switch to GKV/ChatTemplates based flow
hanishkvc May 11, 2024
d9959b7
GroupKV: Get ready for use in llama.cpp ++
hanishkvc May 11, 2024
b944d04
ChatON: Add constructor for ChatTemplates which chains into GKV
hanishkvc May 11, 2024
b9d9700
CMakeLists.txt: Compile C++ code for -std=c++20
hanishkvc May 11, 2024
2efc09f
ChatON: Unnecessarily indirect nlohmann json
hanishkvc May 11, 2024
444d2cc
ChatON:LoadJSON: ChatTemplates - global/system/user/assistant
hanishkvc May 11, 2024
1574201
ChatON:LoadJSon:ChatTemplates: revPrompt, system-user flags
hanishkvc May 11, 2024
0c21a00
ChatON:p1: meta json to hpp conversion - Initial skeleton
hanishkvc May 12, 2024
078e04d
ChatON:P2:meta json to hpp conversion - add k-v pairs skeleton
hanishkvc May 12, 2024
7b5fb0a
ChatON:P3:meta json to hpp: Retain esc seqs and more kv pairs
hanishkvc May 12, 2024
b5b274a
ChatON:P4:meta json to hpp: Insert kv bool
hanishkvc May 12, 2024
b8590e3
ChatON:P5:meta json to hpp: Add required c++ inc and global var
hanishkvc May 12, 2024
a3285e8
ChatON:Include auto converted ChatONMeta.hpp chat template data
hanishkvc May 12, 2024
4232ec1
Main: Load json meta file only if specified
hanishkvc May 12, 2024
f94fed9
ChatON+MetaHpp: Had forgotten to conv reverse-prompt
hanishkvc May 12, 2024
4eae05a
ChatON: json access helper which raises exception if key missing
hanishkvc May 12, 2024
0249c07
ChatON:Switch to json_get_str to help identify missing keys better
hanishkvc May 12, 2024
470b888
ChatON: Switch to templated json_get for str/bool/etal
hanishkvc May 12, 2024
db2ffab
ChatON: use templated json_get when loading bool key-value fields
hanishkvc May 12, 2024
6048218
SimpCFG: COnvert to GroupKV extended version
hanishkvc May 12, 2024
f2dd126
GroupKV: Move test code into its own file in tests
hanishkvc May 12, 2024
3d33d62
SimpCfg: Move testing code into its own file in tests
hanishkvc May 12, 2024
9249649
ChatON+TestPrgs: Use specific log files
hanishkvc May 12, 2024
857570f
SimpCfgTest: Update dump usage to GKV return string semantic
hanishkvc May 12, 2024
d5b0bfb
SimpCfg: Remove now unused SC_DEBUG, rather GroupKV uses equiv
hanishkvc May 12, 2024
eb7554c
ChatON: Avoid -> to match simpcfg as well as corresponding keys
hanishkvc May 13, 2024
184ac32
ChatON: Make json_get efficient and flexible wrt its calling
hanishkvc May 13, 2024
0cfe990
ChatON:ChatTemplates: TmplExists, TmplGetKey, TmplRoleGetKeys
hanishkvc May 13, 2024
efbb87d
ChatON:ChatTemplates:TmplBasicCheck
hanishkvc May 13, 2024
fe0c9ce
ChatON:BasicCheck+:return a string with info, dont directly log
hanishkvc May 13, 2024
8165bd4
ChatON:WIP:chaton_tmpl_apply_single build on multi msg tagging
hanishkvc May 13, 2024
3fcaf19
ChatON+:Multi4Single: applyGlobalIfAny flag wrt templating api
hanishkvc May 13, 2024
6e13c0c
ChatON:Control SystemMsgSuffix+End tags only wrt 1st system msg
hanishkvc May 13, 2024
600653d
ChatON:Optional control of MsgCntBasedTagging
hanishkvc May 13, 2024
4dfd10a
ChatON: Move core templating/tagging code into ChatTemplates class
hanishkvc May 13, 2024
28ddd2c
ChatON: ChatParts dump returns info str rather than direct logging
hanishkvc May 13, 2024
bb9ce52
ChatON+: ValidateDump dumps All, wrapped in optional LDBUG_LN
hanishkvc May 14, 2024
bd5c39e
ChatOn+GroupKV: Cleanup a bit, including using debug logging
hanishkvc May 14, 2024
f8c0b47
ChatON+:RenameTo chaton_meta_load_json to match semantic
hanishkvc May 14, 2024
8975de9
ChatON: Update Notes to match the updated semantics and flows
hanishkvc May 14, 2024
14c28e7
GroupKV+: dump cleanup - forgot to commit earlier
hanishkvc May 14, 2024
a3d641b
ChatON: Move loading from json file into its own file
hanishkvc May 14, 2024
4a15989
ChatON: Forgot this note earlier
hanishkvc May 14, 2024
dc03a71
CMakeLists: base std::variantC++17, specificTest std::formatC++20
hanishkvc May 15, 2024
4f5add6
GroupKV:Dump/Log type of the variant instance also
hanishkvc May 15, 2024
cdd91f5
SimpCfg: Trap conversion error and raise appropriate exception
hanishkvc May 15, 2024
bb3fe48
SimpCfg+DataUtilsString: Move string helpers to its own file
hanishkvc May 15, 2024
397249d
DataUtilsString: string_as_hex and use direct log helpers
hanishkvc May 15, 2024
7a3ac0c
Merge branch 'master' into hkvc_chaton_v3
hanishkvc May 15, 2024
239b5be
ChatON+: Cleanup integration with CMake
hanishkvc May 16, 2024
1a0df95
C++17: Use and limit C++17 to common library for now
hanishkvc May 16, 2024
0cbfd40
ChatON: Option for a fallback tmpl to use wrt chat-tmpl-apply-ex
hanishkvc May 16, 2024
999bd39
ChatON: forgot to get c string format
hanishkvc May 16, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion common/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,7 @@ add_library(${TARGET} STATIC
train.cpp
ngram-cache.h
ngram-cache.cpp
chaton_meta.cpp
)

if (BUILD_SHARED_LIBS)
Expand All @@ -83,5 +84,5 @@ if (LLAMA_CURL)
endif ()

target_include_directories(${TARGET} PUBLIC .)
target_compile_features(${TARGET} PUBLIC cxx_std_11)
target_compile_features(${TARGET} PUBLIC cxx_std_17)
target_link_libraries(${TARGET} PRIVATE ${LLAMA_COMMON_EXTRA_LIBS} PUBLIC llama)
823 changes: 823 additions & 0 deletions common/chaton.hpp

Large diffs are not rendered by default.

100 changes: 100 additions & 0 deletions common/chaton_json.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,100 @@
#pragma once

/**
* Helper to load chaton's configurable template data from json file
* By Humans for All
*
* Any program which wants to load configurable template data from json file,
* can include this file to get the needed helpers for same.
*/

#include "chaton.hpp"

#include <json.hpp>
using json = nlohmann::ordered_json;


// Get value corresponding to the specified hierarchy/chain of keys.
// Also throw a more informative exception, if it is not found.
template <typename SupportedType>
inline SupportedType json_get(json &j, const std::vector<std::string_view> &keys, const std::string &msgTag) {
json curJ = j;
std::stringstream skey;
int i = 0;
for(auto key: keys) {
if (i != 0) skey << "-";
i += 1;
skey << key;
if (curJ.contains(key)) {
curJ = curJ[key];
} else {
std::stringstream ss;
ss << "ERRR:ChatON:" << __func__ << ":" << msgTag << ":KeyChain [" << skey.str() << "] is missing";
throw std::runtime_error(ss.str());
}
}
return curJ;
}

// Update/Extend the configurable template data in specified ChatTemplates instance from the specified json file.
// If nullptr is passed wrt ct, then update/extend the global compiled-in configurable template data.
inline bool chaton_meta_load_json(const std::string &fname, ChatTemplates *ct=nullptr) {
if (ct == nullptr) {
ct = &gCT;
}
std::ifstream f(fname);
json conMeta = json::parse(f);
for(auto it=conMeta.begin(); it != conMeta.end(); ++it) {

auto group = it.key();
auto curTmpl = conMeta[group];

std::string globalBegin = json_get<std::string>(curTmpl, { K_GLOBAL, K_BEGIN }, group);
ct->set_value<std::string>(group, { K_GLOBAL, K_BEGIN }, globalBegin);
std::string globalEnd = json_get<std::string>(curTmpl, { K_GLOBAL, K_END }, group);
ct->set_value<std::string>(group, { K_GLOBAL, K_END }, globalEnd);

std::string systemBegin = json_get<std::string>(curTmpl, { K_SYSTEM, K_BEGIN }, group);
ct->set_value<std::string>(group, { K_SYSTEM, K_BEGIN }, systemBegin);
std::string systemPrefix = json_get<std::string>(curTmpl, { K_SYSTEM, K_PREFIX }, group);
ct->set_value<std::string>(group, { K_SYSTEM, K_PREFIX }, systemPrefix);
std::string systemSuffix = json_get<std::string>(curTmpl, { K_SYSTEM, K_SUFFIX }, group);
ct->set_value<std::string>(group, { K_SYSTEM, K_SUFFIX }, systemSuffix);
std::string systemEnd = json_get<std::string>(curTmpl, { K_SYSTEM, K_END }, group);
ct->set_value<std::string>(group, { K_SYSTEM, K_END }, systemEnd);

std::string userBegin = json_get<std::string>(curTmpl, { K_USER, K_BEGIN }, group);
ct->set_value<std::string>(group, { K_USER, K_BEGIN }, userBegin);
std::string userPrefix = json_get<std::string>(curTmpl, { K_USER, K_PREFIX }, group);
ct->set_value<std::string>(group, { K_USER, K_PREFIX }, userPrefix);
std::string userSuffix = json_get<std::string>(curTmpl, { K_USER, K_SUFFIX }, group);
ct->set_value<std::string>(group, { K_USER, K_SUFFIX }, userSuffix);
std::string userEnd = json_get<std::string>(curTmpl, { K_USER, K_END }, group);
ct->set_value<std::string>(group, { K_USER, K_END }, userEnd);

std::string assistantBegin = json_get<std::string>(curTmpl, { K_ASSISTANT, K_BEGIN }, group);
ct->set_value<std::string>(group, { K_ASSISTANT, K_BEGIN }, assistantBegin);
std::string assistantPrefix = json_get<std::string>(curTmpl, { K_ASSISTANT, K_PREFIX }, group);
ct->set_value<std::string>(group, { K_ASSISTANT, K_PREFIX }, assistantPrefix);
std::string assistantSuffix = json_get<std::string>(curTmpl, { K_ASSISTANT, K_SUFFIX }, group);
ct->set_value<std::string>(group, { K_ASSISTANT, K_SUFFIX }, assistantSuffix);
std::string assistantEnd = json_get<std::string>(curTmpl, { K_ASSISTANT, K_END }, group);
ct->set_value<std::string>(group, { K_ASSISTANT, K_END }, assistantEnd);

std::string reversePrompt = json_get<std::string>(curTmpl, { K_REVERSE_PROMPT }, group);
ct->set_value<std::string>(group, { K_REVERSE_PROMPT }, reversePrompt);

bool systemHasSuffix = json_get<bool>(curTmpl, { K_SYSTEMUSER_SYSTEM_HAS_SUFFIX }, group);
ct->set_value(group, { K_SYSTEMUSER_SYSTEM_HAS_SUFFIX }, systemHasSuffix);
bool systemHasEnd = json_get<bool>(curTmpl, { K_SYSTEMUSER_SYSTEM_HAS_END }, group);
ct->set_value(group, { K_SYSTEMUSER_SYSTEM_HAS_END }, systemHasEnd);

bool userHasBegin = json_get<bool>(curTmpl, { K_SYSTEMUSER_1ST_USER_HAS_BEGIN }, group);
ct->set_value(group, { K_SYSTEMUSER_1ST_USER_HAS_BEGIN }, userHasBegin);
bool userHasPrefix = json_get<bool>(curTmpl, { K_SYSTEMUSER_1ST_USER_HAS_PREFIX }, group);
ct->set_value(group, { K_SYSTEMUSER_1ST_USER_HAS_PREFIX }, userHasPrefix);

}
LDBUG_LN("%s", ct->dump("", "DBUG:ChatONMetaLoad:ChatTemplates").c_str());
return true;
}
Loading
Loading