neolit123/cfg2
cfg2 0.99.0 README
a simplistic configuration parser for INI like syntax in C

last revisited: 21.02.2016
author: lubomir i. ivanov (neolit123 at gmail)

================================================================================
INTRODUCTION:

cfg2 is a simple library for parsing and writing a configuration file format
very similar to the INI file format. it's not related to any projects such as
libcfg*. it can be used for very fast processing of application configurations
and even translations.

public domain code - see LICENSE!

cfg2 format example:
--------------------------------------------------------------------------------
[section1]
key1=value1
key2=value2
; a comment line
key3="value with some special \n characters"
key4="some value \
on multiple lines"

[section2]
; integer value
key5=-12345678
; floating point value
key6=12345678.312
; hex value
key7=0xfafafafa
; key/value with whitespace
" key8 " = " value8 "
; HelloWorld as a case-insensitive hex value (needs extra parsing)
key9=48656c6c6f576f726c64
--------------------------------------------------------------------------------

the above pretty much covers what the library can be used for.

format notes:
* the required file format is UTF-8 with LF line endings! CRLF and CR are
  pretty much untested.
* since 0.15.0 the supported format structure is close to the INI format
  specified here:
  https://en.wikipedia.org/wiki/INI_file
* comment lines can still be defined starting with the ';' or '#' characters,
  or with custom characters set via cfg_t's 'comment_char[1/2]' fields
* the parser does not detect duplicate keys, i.e. memory will be allocated for
  each of them. to separate duplicates use sections!
* the parser is not very strict, so few errors are reported when things go
  wrong; warnings are printed only if cfg_t's 'verbose' field is greater than 0
* \x??[??] sequences are not supported as they take way too much space.
  use direct hex strings (e.g. key9) and parse them explicitly with
  cfg_hex_to_char().
* the parser, file i/o and hashing functions have a 32bit limit.

================================================================================
WHY WRITE A NEW CONFIG LIBRARY?:

while there are many different configuration formats and parser libraries out
there, the answer is "for fun", but perhaps also for the performance,
portability and overhead benefits. this library is a minimal and very fast
alternative.

================================================================================
HOW IT WORKS:

* STRUCTURE
arrays of objects (not object pointers) are used instead of linked lists for
performance reasons. all sections are allocated into cfg_section_t objects and
each section object has a list of cfg_entry_t objects which hold the key / value
pairs. everything is contiguous memory so cache misses when doing linear search
are very unlikely. there is always at least one section, which is called the
"root" section.

* PARSING
once an input buffer is passed to the parser the first stage is to convert it
into a "raw" buffer, which is readable by the second stage, while counting how
many sections and entries per section there are. in this stage the parser uses
a trick with special characters to separate sections, keys and values in the
"raw" buffer.

the second stage is to parse the strings from the "raw" buffer into the library
object memory, while allocating the cfg_section_t and cfg_entry_t objects for
each section.

the parsing is a two stage operation for the sake of not calling realloc()
multiple times on the output buffer. this is a very good optimization for any
size of the input buffer and parser implementation. realloc() is only used to
count the number of sections in a uint32 buffer, which grows in power-of-two
increments.

* ACCESS
linear search is used to traverse the list of entries in a section, but this
stays fast even for millions of entries because the entries sit in one
contiguous memory block.
if you truly have way too many entries, just separate them into multiple
sections. addition and deletion of entries is something at which arrays are
supposedly much worse than linked lists, yet performance in this library is
quite good as it takes milliseconds to move a list of millions of entries.

* WRITING
when writing the contents of the library objects the sections and entries are
serialized into strings one by one into a growing buffer. mind that all
formatting is lost in the process. unlike parsing, no size estimation is done
here, so realloc() is used for the output buffer. it grows in power-of-two
increments.

* CACHE
another optimization is caching N values in a separate cache buffer and always
checking this buffer first, instead of a section buffer (which can be huge).
this technique improves performance when an entry is used multiple times.
the cache buffer can be resized dynamically, or disabled by setting its size
to zero.

ironically, cache misses are more likely to happen when reading entries from
the library cache because of the way an entry in the cache is checked for its
parent section. however, the library cache is still a major performance boost,
if it has been set to a reasonable size.

================================================================================
PERFORMANCE:

the library should be pretty fast even without optimization flags. from some
outdated tests, it takes around 1.4 seconds to parse a 20MB text file with
~550,000 keys and ~80,000 sections on a modern dual core processor.

================================================================================
API:

the API is pretty simple, but mind that it might change here and there until
a "major" version is released.
it uses a naming scheme for functions similar to GLib and the usage logic is
quite common as well:

    create a library object
    init/allocate object
    work with object
    free object

for more specific information on each API function take a look at the comments
in "cfg2.h". a more complete usage example can be found in "test.c".

small usage example without error checking:
--------------------------------------------------------------------------------
#include <stdio.h> /* puts() */
#include "cfg2.h"

cfg_char *v;
cfg_t *st; /* library object */

st = cfg_alloc(); /* init the library */
cfg_file_parse(st, some_file); /* parse a file; some_file is a placeholder */
v = cfg_value_get(st, "section1", "key1"); /* get a key's value */
puts(v); /* value1 */
cfg_free(st); /* free the library object */

================================================================================
COMPILATION:

you need GNU Make and either GCC/MinGW or an MSVC toolchain.

--------------------------------------------------------------------------------
GCC
--------------------------------------------------------------------------------
to compile everything write:
    make

to create just the library files write:
    make lib

to compile the test write:
    make test

to run the test write:
    make run

NOTES:
- the Makefile has been tested on Win32 and Linux
- ./lib will contain both dynamic and static library builds (e.g. dll.a, .a)
- ./test will contain a test(.exe) executable which is based on test.c
- to link against the static library define CFG_LIB_STATIC
- by default the library builds in DEBUG mode (e.g. gcc -g);
  to build in release mode use:
    make RELEASE=1

--------------------------------------------------------------------------------
MSVC
--------------------------------------------------------------------------------
specify the custom Makefile:
    make -f Makefile.msvc
    make -f Makefile.msvc lib
    ...
NOTES:
- the same Makefile rules as the GCC version apply
- MSVC building has only been tested with the 1998 and 2003 toolchains

================================================================================
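HEX STRING EXAMPLE (ILLUSTRATION ONLY):

the format notes say that a direct hex string such as key9 needs extra parsing
after it has been read. the sketch below shows what such decoding can look like
in plain C. it is not the library's implementation: 'hex_digit' and
'hex_decode' are hypothetical names that are not part of cfg2. the library's
own routine is cfg_hex_to_char(); check the comments in "cfg2.h" for how it is
actually called.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* convert one hex digit (either case) to its value, or -1 if invalid */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * hypothetical helper, NOT part of cfg2: decode a hex string such as
 * "48656c6c6f" into bytes and NUL-terminate the result.
 * 'out' must have room for strlen(hex) / 2 + 1 bytes.
 * returns the number of decoded bytes, or -1 if the input has an odd
 * length or contains a character that is not a hex digit.
 */
long hex_decode(const char *hex, char *out)
{
    size_t len, i;

    assert(hex != NULL && out != NULL);
    len = strlen(hex);
    if (len % 2 != 0)
        return -1;

    for (i = 0; i < len; i += 2) {
        int hi = hex_digit(hex[i]);
        int lo = hex_digit(hex[i + 1]);

        if (hi < 0 || lo < 0)
            return -1;
        /* two hex digits form one byte: high nibble first */
        out[i / 2] = (char)((hi << 4) | lo);
    }
    out[len / 2] = '\0';
    return (long)(len / 2);
}
```

passing the value of key9 from the format example (48656c6c6f576f726c64) to
hex_decode() yields the 10 bytes of "HelloWorld". uppercase digits decode the
same way, which matches the "case-insensitive" remark in the format example.

================================================================================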