SylphAI-Inc · Sylph-AI · Jul 3, 2024 · Jul 2, 2024 · Jul 2, 2024 · Jul 2, 2024
diff --git a/README.md b/README.md
@@ -1,7 +1,11 @@
-# Introduction
+![LightRAG Logo](docs/source/_static/images/LightRAG-logo-doc.jpeg)
+
+⚡ The PyTorch Library for Large Language Model (LLM) Applications ⚡
+
+We help developers with both building and optimizing `Retriever`-`Agent`-`Generator` (RAG) pipelines.
+It is *light*, *modular*, and *robust*.
+
 
-LightRAG is the `PyTorch` library for building large language model (LLM) applications. We help developers with both building and optimizing `Retriever`-`Agent`-`Generator` (RAG) pipelines.
-It is light, modular, and robust.
 
 **PyTorch**
 
@@ -58,46 +62,46 @@ class SimpleQA(Component):
       return await self.generator.acall({"input_str": query})
 ```
 
-## Simplicity
+## Quick Install
 
-Developers who are building real-world Large Language Model (LLM) applications are the real heroes.
-As a library, we provide them with the fundamental building blocks with 100% clarity and simplicity.
+Install LightRAG with pip:
 
-* Two fundamental and powerful base classes: Component for the pipeline and DataClass for data interaction with LLMs.
-* We end up with less than two levels of subclasses. Class Hierarchy Visualization.
-* The result is a library with bare minimum abstraction, providing developers with maximum customizability.
+```bash
+pip install lightrag
+```
 
-Similar to the PyTorch module, our Component provides excellent visualization of the pipeline structure.
+Please refer to the [full installation guide](https://lightrag.sylph.ai/get_started/installation.html) for more details.
 
-```
-SimpleQA(
-   (generator): Generator(
-      model_kwargs={'model': 'llama3-8b-8192'},
-      (prompt): Prompt(
-         template: <SYS>
-               You are a helpful assistant.
-               </SYS>
-               User: {{input_str}}
-               You:
-               , prompt_variables: ['input_str']
-      )
-      (model_client): GroqAPIClient()
-   )
-)
-```
 
-## Controllability
 
-Our simplicity did not come from doing 'less'.
-On the contrary, we have to do 'more' and go 'deeper' and 'wider' on any topic to offer developers maximum control and robustness.
+You can place the above code in your project's root ``__init__.py`` file.
+This setup ensures that LightRAG can access all necessary configurations during runtime.
 
-* LLMs are sensitive to the prompt. We allow developers full control over their prompts without relying on API features such as tools and JSON format with components like Prompt, OutputParser, FunctionTool, and ToolManager.
-* Our goal is not to optimize for integration, but to provide a robust abstraction with representative examples. See this in ModelClient and Retriever.
-* All integrations, such as different API SDKs, are formed as optional packages but all within the same library. You can easily switch to any models from different providers that we officially support.
+# Documentation
 
-## Future of LLM Applications
+The full LightRAG documentation is available at [lightrag.sylph.ai](https://lightrag.sylph.ai/):
 
-On top of the easiness to use, we in particular optimize the configurability of components for researchers to build their solutions and to benchmark existing solutions.
-Like how PyTorch has united both researchers and production teams, it enables smooth transition from research to production.
-With researchers building on LightRAG, production engineers can easily take over the method and test and iterate on their production data.
-Researchers will want their code to be adapted into more products too.
+- [Introduction](https://lightrag.sylph.ai/)
+- [Full installation guide](https://lightrag.sylph.ai/get_started/installation.html)
+- [Design philosophy](https://lightrag.sylph.ai/developer_notes/lightrag_design_philosophy.html)
+- [Class hierarchy](https://lightrag.sylph.ai/developer_notes/class_hierarchy.html)
+- [Tutorials](https://lightrag.sylph.ai/developer_notes/index.html)
+- [API reference](https://lightrag.sylph.ai/apis/index.html)
+
+
+
+## Contributors
+
+[![contributors](https://contrib.rocks/image?repo=SylphAI-Inc/LightRAG&max=2000)](https://github.com/SylphAI-Inc/LightRAG/graphs/contributors)
+
+# Citation
+
+```bibtex
+@software{Yin-LightRAG-2024,
+  author = {Yin, Li},
+  title = {{LightRAG: The PyTorch Library for Large language Model (LLM) Applications}},
+  month = {7},
+  year = {2024},
+  url = {https://github.com/SylphAI-Inc/LightRAG}
+}
+```
diff --git a/developer_notes/parser_note.py b/developer_notes/parser_note.py
@@ -0,0 +1,283 @@
+def examples_of_different_ways_to_parse_string():
+
+    int_str = "42"
+    float_str = "42.0"
+    boolean_str = "True"  # json works with true/false
+    None_str = "None"
+    Null_str = "null"  # json works with null
+    dict_str = '{"key": "value"}'
+    list_str = '["key", "value"]'
+    nested_dict_str = (
+        '{"name": "John", "age": 30, "attributes": {"height": 180, "weight": 70}}'
+    )
+    yaml_dict_str = "key: value"
+    yaml_nested_dict_str = (
+        "name: John\nage: 30\nattributes:\n  height: 180\n  weight: 70"
+    )
+    yaml_list_str = "- key\n- value"
+
+    # string to int/float/bool
+    print("built-in parser:\n____________________")
+    print(int(int_str))
+    print(float(float_str))
+    print(bool(boolean_str))  # caution: bool() is True for any non-empty string, including "False"
+
+    # via json loads
+    import json
+
+    print("\njson parser:\n____________________")
+    json_int = json.loads(int_str)
+    json_float = json.loads(float_str)
+    json_bool = json.loads(
+        boolean_str.lower()
+    )  # json.loads only accepts true or false, not True or False
+    json_none = json.loads(Null_str)
+    json_dict = json.loads(dict_str)
+    json_list = json.loads(list_str)
+    json_nested_dict = json.loads(nested_dict_str)
+    # json_yaml_dict = json.loads(yaml_dict_str)  # won't work: YAML input is not valid JSON
+    # json_yaml_nested_dict = json.loads(yaml_nested_dict_str)
+    # json_yaml_list = json.loads(yaml_list_str)
+
+    print(int_str, type(json_int), json_int)
+    print(float_str, type(json_float), json_float)
+    print(boolean_str, type(json_bool), json_bool)
+    print(Null_str, type(json_none), json_none)
+    print(dict_str, type(json_dict), json_dict)
+    print(list_str, type(json_list), json_list)
+    print(nested_dict_str, type(json_nested_dict), json_nested_dict)
+
+    # via yaml
+    import yaml
+
+    print("\nyaml parser:\n____________________")
+
+    yaml_int = yaml.safe_load(int_str)
+    yaml_float = yaml.safe_load(float_str)
+    yaml_bool = yaml.safe_load(boolean_str)
+    yaml_bool_lower = yaml.safe_load(boolean_str.lower())
+    yaml_null = yaml.safe_load(Null_str)
+    yaml_none = yaml.safe_load(None_str)
+
+    yaml_dict = yaml.safe_load(dict_str)
+    yaml_list = yaml.safe_load(list_str)
+    yaml_nested_dict = yaml.safe_load(nested_dict_str)
+    yaml_yaml_dict = yaml.safe_load(yaml_dict_str)
+    yaml_yaml_nested_dict = yaml.safe_load(yaml_nested_dict_str)
+    yaml_yaml_list = yaml.safe_load(yaml_list_str)
+
+    print(int_str, type(yaml_int), yaml_int)
+    print(float_str, type(yaml_float), yaml_float)
+    print(boolean_str, type(yaml_bool), yaml_bool)
+    print(boolean_str.lower(), type(yaml_bool_lower), yaml_bool_lower)
+    print(Null_str, type(yaml_null), yaml_null)
+    print(None_str, type(yaml_none), yaml_none)
+    print(dict_str, type(yaml_dict), yaml_dict)
+    print(list_str, type(yaml_list), yaml_list)
+    print(nested_dict_str, type(yaml_nested_dict), yaml_nested_dict)
+    print(yaml_dict_str, type(yaml_yaml_dict), yaml_yaml_dict)
+    print(yaml_nested_dict_str, type(yaml_yaml_nested_dict), yaml_yaml_nested_dict)
+    print(yaml_list_str, type(yaml_yaml_list), yaml_yaml_list)
+
+    # via ast for python literal
+    import ast
+
+    print("\nast parser:\n____________________\n")
+
+    ast_int = ast.literal_eval(int_str)
+    ast_float = ast.literal_eval(float_str)
+    ast_bool = ast.literal_eval(boolean_str)
+    ast_none = ast.literal_eval(None_str)
+    ast_dict = ast.literal_eval(dict_str)
+    ast_list = ast.literal_eval(list_str)
+    ast_nested_dict = ast.literal_eval(nested_dict_str)
+
+    print(int_str, type(ast_int), ast_int)
+    print(float_str, type(ast_float), ast_float)
+    print(boolean_str, type(ast_bool), ast_bool)
+    print(None_str, type(ast_none), ast_none)
+    print(dict_str, type(ast_dict), ast_dict)
+    print(list_str, type(ast_list), ast_list)
+    print(nested_dict_str, type(ast_nested_dict), ast_nested_dict)
+
+    # via eval for any python expression, but not recommended for security reasons
+
+    print("\neval parser:\n____________________\n")
+
+    eval_int = eval(int_str)
+    eval_float = eval(float_str)
+    eval_bool = eval(boolean_str)
+    eval_dict = eval(dict_str)
+    eval_list = eval(list_str)
+    eval_nested = eval(nested_dict_str)
+    # eval_yaml_dict = eval(yaml_dict_str)  # won't work: YAML is not a Python expression
+
+    print(int_str, type(eval_int), eval_int)
+    print(float_str, type(eval_float), eval_float)
+    print(boolean_str, type(eval_bool), eval_bool)
+    print(dict_str, type(eval_dict), eval_dict)
+    print(list_str, type(eval_list), eval_list)
+    print(nested_dict_str, type(eval_nested), eval_nested)
+
+
+def int_parser():
+    from lightrag.core.string_parser import IntParser
+
+    int_str = "42"
+    int_str_2 = "42.0"
+    int_str_3 = "42.7"
+    int_str_4 = "the answer is 42.75"
+
+    # every input below parses to 42
+    parser = IntParser()
+    print(parser)
+    print(parser(int_str))
+    print(parser(int_str_2))
+    print(parser(int_str_3))
+    print(parser(int_str_4))
+
+
+def float_parser():
+    from lightrag.core.string_parser import FloatParser
+
+    float_str = "42.0"
+    float_str_2 = "42"
+    float_str_3 = "42.7"
+    float_str_4 = "the answer is 42.75"
+
+    # returns the first number in each string as a float (42.0 for the first two inputs)
+    parser = FloatParser()
+    print(parser(float_str))
+    print(parser(float_str_2))
+    print(parser(float_str_3))
+    print(parser(float_str_4))
+
+
+def bool_parser():
+    from lightrag.core.string_parser import BooleanParser
+
+    bool_str = "True"
+    bool_str_2 = "False"
+    bool_str_3 = "true"
+    bool_str_4 = "false"
+    # bool_str_5 = "1"  # will fail
+    # bool_str_6 = "0"  # will fail
+    # bool_str_7 = "yes"  # will fail
+    # bool_str_8 = "no"  # will fail
+
+    # each call returns True or False
+    parser = BooleanParser()
+    print(parser(bool_str))
+    print(parser(bool_str_2))
+    print(parser(bool_str_3))
+    print(parser(bool_str_4))
+    # print(parser(bool_str_5))
+    # print(parser(bool_str_6))
+    # print(parser(bool_str_7))
+    # print(parser(bool_str_8))
+
+
+def list_parser():
+
+    from lightrag.core.string_parser import ListParser
+
+    list_str = '["key", "value"]'
+    list_str_2 = 'prefix["key", 2]...'
+    list_str_3 = '[{"key": "value"}, {"key": "value"}]'
+    # dict_str = '{"key": "value"}'
+
+    parser = ListParser()
+    print(parser(list_str))
+    print(parser(list_str_2))
+    print(parser(list_str_3))
+    # print(parser(dict_str)) # will raise ValueError
+
+
+def json_parser():
+    from lightrag.core.string_parser import JsonParser
+
+    dict_str = '{"key": "value"}'
+    nested_dict_str = (
+        '{"name": "John", "age": 30, "attributes": {"height": 180, "weight": 70}}'
+    )
+    list_str = '["key", 2]'
+    list_dict_str = '[{"key": "value"}, {"key": "value"}]'
+
+    parser = JsonParser()
+    print(parser)
+    print(parser(dict_str))
+    print(parser(nested_dict_str))
+    print(parser(list_str))
+    print(parser(list_dict_str))
+
+
+def yaml_parser():
+    from lightrag.core.string_parser import YamlParser
+
+    yaml_dict_str = "key: value"
+    yaml_nested_dict_str = (
+        "name: John\nage: 30\nattributes:\n  height: 180\n  weight: 70"
+    )
+    yaml_list_str = "- key\n- value"
+
+    parser = YamlParser()
+    print(parser)
+    print(parser(yaml_dict_str))
+    print(parser(yaml_nested_dict_str))
+    print(parser(yaml_list_str))
+
+
+def json_output_parser():
+    from dataclasses import dataclass, field
+    from lightrag.components.output_parsers import JsonOutputParser
+    from lightrag.core import DataClass
+
+    @dataclass
+    class User(DataClass):
+        id: int = field(default=1, metadata={"description": "User ID"})
+        name: str = field(default="John", metadata={"description": "User name"})
+
+    user_example = User(id=1, name="John")
+
+    user_to_parse = '{"id": 2, "name": "Jane"}'
+
+    parser = JsonOutputParser(data_class=User, examples=[user_example])
+    print(parser)
+    output_format_str = parser.format_instructions()
+    print(output_format_str)
+    parsed_user = parser(user_to_parse)
+    print(parsed_user)
+
+
+def yaml_output_parser():
+    from dataclasses import dataclass, field
+    from lightrag.components.output_parsers import YamlOutputParser
+    from lightrag.core import DataClass
+
+    @dataclass
+    class User(DataClass):
+        id: int = field(default=1, metadata={"description": "User ID"})
+        name: str = field(default="John", metadata={"description": "User name"})
+
+    user_example = User(id=1, name="John")
+
+    user_to_parse = "id: 2\nname: Jane"
+
+    parser = YamlOutputParser(data_class=User, examples=[user_example])
+    print(parser)
+    output_format_str = parser.format_instructions()
+    print(output_format_str)
+    parsed_user = parser(user_to_parse)
+    print(parsed_user)
+
+
+if __name__ == "__main__":
+    examples_of_different_ways_to_parse_string()
+    int_parser()
+    float_parser()
+    bool_parser()
+    list_parser()
+    json_parser()
+    yaml_parser()
+    json_output_parser()
+    yaml_output_parser()
diff --git a/docs/source/apis/components/components.agent.rst b/docs/source/apis/components/components.agent.rst
@@ -1,6 +1,6 @@
 .. _components-agent:
 
-components.agent
+agent
 ========================
 
 Submodules

diff --git a/docs/source/apis/components/components.data_process.data_components.rst b/docs/source/apis/components/components.data_process.data_components.rst