Comparing changes

base repository: infiniflow/ragflow
base: main
head repository: infiniflow/ragflow
compare: api
  • 5 commits
  • 19 files changed
  • 1 contributor

Commits on Sep 18, 2024

  1. refactor(API): Refactor datasets API (#2439)

    ### What problem does this PR solve?
    
    Discussion: #1102
    
    #### Completed
    1. Integrated APIFlask to generate Swagger API documentation, available at
    http://ragflow_host:ragflow_port/v1/docs
    2. Refactored http_token_auth
    ```
    class AuthUser:
        def __init__(self, tenant_id, token):
            self.id = tenant_id
            self.token = token
    
        def get_token(self):
            return self.token
    
    
    @http_token_auth.verify_token
    def verify_token(token: str) -> Union[AuthUser, None]:
        try:
            objs = APIToken.query(token=token)
            if objs:
                api_token = objs[0]
                user = AuthUser(api_token.tenant_id, api_token.token)
                return user
        except Exception as e:
            server_error_response(e)
        return None
    
    # resources api
    @manager.auth_required(http_token_auth)
    def get_all_datasets(query_data):
        ...
    ```
    3. Refactored the Datasets (Knowledgebase) API to extract the
    implementation logic into the api/apps/services directory
    
    ![image](https://github.com/user-attachments/assets/ad1f16f1-b0ce-4301-855f-6e162163f99a)
    4. Python SDK: I added only get_all_datasets as a trial, just to
    verify that the SDK API and the Server API can use the same method.
    ```
    from ragflow.ragflow import RAGFLow
    ragflow = RAGFLow('<ACCESS_KEY>', 'http://127.0.0.1:9380')
    ragflow.get_all_datasets()
    ```
    5. Request parameter validation, added as a trial. It may not be necessary,
    because the data model layer already provides this feature; its main benefit
    is making the API easier to test in the Swagger docs service.
    ```
    class UpdateDatasetReq(Schema):
        kb_id = fields.String(required=True)
        name = fields.String(validate=validators.Length(min=1, max=128))
        description = fields.String(allow_none=True)
        permission = fields.String(validate=validators.OneOf(['me', 'team']))
        embd_id = fields.String(validate=validators.Length(min=1, max=128))
        language = fields.String(validate=validators.OneOf(['Chinese', 'English']))
        parser_id = fields.String(validate=validators.OneOf([parser_type.value for parser_type in ParserType]))
        parser_config = fields.Dict()
        avatar = fields.String()
    ```
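
The rules declared in `UpdateDatasetReq` can be approximated in plain Python to see what a request has to satisfy. This is a minimal sketch, not the marshmallow implementation: `validate_update_dataset`, `ALLOWED`, and `BOUNDED` are hypothetical names, only a subset of the fields is covered, and the error strings are illustrative rather than marshmallow's own messages.

```python
from typing import Any, Dict, List

# Allowed values copied from the schema above; parser_id is omitted because
# its choices come from the ParserType enum, which is not shown here.
ALLOWED = {
    "permission": ["me", "team"],
    "language": ["Chinese", "English"],
}
# Fields limited to 1..128 characters in the schema above.
BOUNDED = ["name", "embd_id"]


def validate_update_dataset(data: Dict[str, Any]) -> Dict[str, List[str]]:
    """Return a field -> messages mapping; an empty dict means the data passed."""
    errors: Dict[str, List[str]] = {}

    # kb_id is the only required field.
    if "kb_id" not in data:
        errors.setdefault("kb_id", []).append("required field is missing")

    # Length-bounded string fields are checked only when they are supplied.
    for field in BOUNDED:
        value = data.get(field)
        if field in data and not (isinstance(value, str) and 1 <= len(value) <= 128):
            errors.setdefault(field, []).append("length must be between 1 and 128")

    # Enumerated fields must match one of the listed choices when supplied.
    for field, choices in ALLOWED.items():
        if field in data and data[field] not in choices:
            errors.setdefault(field, []).append("must be one of: " + ", ".join(choices))

    return errors


print(validate_update_dataset({"kb_id": "kb1", "name": "docs", "permission": "team"}))
print(validate_update_dataset({"name": "", "language": "French"}))
```

The second call reports three problems at once (missing `kb_id`, empty `name`, unsupported `language`), which is the kind of per-field feedback that makes the endpoint easier to exercise from the Swagger docs page.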
    
    #### TODO
    
    1. Support multiple authentication methods simultaneously, so that
    the Web API can use the same method as the Server API, although this
    feature may not be important.
    I tried the approach below, but it did not work: it allows token
    authentication when the user is not logged in, but cannot skip token
    authentication when the user is logged in 😢
    ```
    def http_basic_auth_required(func):
        @wraps(func)
        def decorated_view(*args, **kwargs):
            if 'Authorization' in flask_request.headers:
                # If the request header contains a token, skip username and password verification
                return func(*args, **kwargs)
            if flask_request.method in EXEMPT_METHODS or current_app.config.get("LOGIN_DISABLED"):
                pass
            elif not current_user.is_authenticated:
                return current_app.login_manager.unauthorized()
    
            if callable(getattr(current_app, "ensure_sync", None)):
                return current_app.ensure_sync(func)(*args, **kwargs)
            return func(*args, **kwargs)
    
        return decorated_view
    ```
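
One way to frame the goal of TODO item 1 is to try several authenticators in order and let the first success through. The sketch below is framework-free and only illustrates that idea; it is not a fix for the Flask decorator above. `Request`, `TOKENS`, `token_auth`, `session_auth`, and `any_auth_required` are hypothetical names, and the `Bearer` scheme is an assumption.

```python
from dataclasses import dataclass, field
from functools import wraps
from typing import Callable, Dict, Optional


@dataclass
class Request:
    """Hypothetical stand-in for a framework request object."""
    headers: Dict[str, str] = field(default_factory=dict)
    session_user: Optional[str] = None  # set when a browser session is logged in


# Hypothetical token table; the code above looks tokens up via APIToken.query.
TOKENS = {"secret-token": "tenant-1"}


def token_auth(request: Request) -> Optional[str]:
    """Return the tenant id for a valid 'Bearer <token>' header, else None."""
    scheme, _, token = request.headers.get("Authorization", "").partition(" ")
    return TOKENS.get(token) if scheme == "Bearer" else None


def session_auth(request: Request) -> Optional[str]:
    """Return the logged-in session user, else None."""
    return request.session_user


def any_auth_required(*authenticators: Callable[[Request], Optional[str]]):
    """Allow the call if at least one authenticator recognises the request."""
    def decorator(func):
        @wraps(func)
        def wrapper(request: Request, *args, **kwargs):
            for authenticate in authenticators:
                user = authenticate(request)
                if user is not None:
                    return func(request, user, *args, **kwargs)
            # No authenticator accepted the request.
            return {"code": 401, "message": "unauthorized"}
        return wrapper
    return decorator


@any_auth_required(session_auth, token_auth)
def list_datasets(request: Request, user: str):
    return {"code": 0, "user": user}


print(list_datasets(Request(session_user="alice")))
print(list_datasets(Request(headers={"Authorization": "Bearer secret-token"})))
print(list_datasets(Request()))
```

Because each authenticator returns either a user or `None`, a logged-in session never has to pass the token check, and a token request never has to pass the session check.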
    2. Refactoring the SDK API to use the same method as the Server API is
    feasible and worthwhile, but it will take more time.
    There are some differences between the Web and SDK APIs, such as the
    key_mapping handling of returned results. Until I understand them, I
    will leave this code unchanged to avoid causing more problems.
    
    ```
        for kb in kbs:
            key_mapping = {
                "chunk_num": "chunk_count",
                "doc_num": "document_count",
                "parser_id": "parse_method",
                "embd_id": "embedding_model"
            }
            renamed_data = {}
            for key, value in kb.items():
                new_key = key_mapping.get(key, key)
                renamed_data[new_key] = value
            renamed_list.append(renamed_data)
        return get_json_result(data=renamed_list)
    ```
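
The renaming step above can be isolated as a small pure function, so that shared service code could return internal field names and only the SDK-facing layer applies the mapping. This is a minimal sketch, assuming the records are plain dicts; `rename_keys` and `rename_all` are hypothetical helper names, not functions from the repository.

```python
from typing import Any, Dict, List

# Internal field name -> name exposed by the SDK API, as in the snippet above.
KEY_MAPPING = {
    "chunk_num": "chunk_count",
    "doc_num": "document_count",
    "parser_id": "parse_method",
    "embd_id": "embedding_model",
}


def rename_keys(record: Dict[str, Any], mapping: Dict[str, str] = KEY_MAPPING) -> Dict[str, Any]:
    """Return a copy of record with mapped keys renamed; other keys pass through."""
    return {mapping.get(key, key): value for key, value in record.items()}


def rename_all(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Apply rename_keys to every record in the list."""
    return [rename_keys(record) for record in records]


kbs = [{"id": "kb1", "chunk_num": 12, "doc_num": 3, "parser_id": "naive"}]
print(rename_all(kbs))
# [{'id': 'kb1', 'chunk_count': 12, 'document_count': 3, 'parse_method': 'naive'}]
```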
    
    ### Type of change
    
    - [x] Refactoring
    Valdanitooooo authored Sep 18, 2024
    Commit 5c77792
  2. API: fixed documents API request data schema & fixed documents API request data schema (#2480)
    
    ### What problem does this PR solve?
    
    - Fixed the documents API request data schema
    - Added documents SDK API tests
    
    ### Type of change
    
    - [x] Bug Fix (non-breaking change which fixes an issue)
    Valdanitooooo authored Sep 18, 2024
    Commit 93114e4

Commits on Sep 20, 2024

  1. fix(API): fixed swagger docs error in nginx external port (#2509)

    ### What problem does this PR solve?
    
    1. Fixed the Swagger docs error on the nginx external port
    2. Added the retrieval API
    3. Added documentation for the SDK API
    
    ### Type of change
    
    - [x] Bug Fix (non-breaking change which fixes an issue)
    - [x] Documentation Update
    - [x] Refactoring
    Valdanitooooo authored Sep 20, 2024
    Commit 82b46d3
  2. refactor(API): Split SDK class to optimize code structure (#2515)

    ### What problem does this PR solve?
    
    1. Split the SDK class to optimize the code structure:
    `ragflow.get_all_datasets()`  ===>     `ragflow.dataset.list()`
    2. Fixed the parameter validation to allow empty values.
    3. Changed how parameter emptiness is checked. Even if a parameter is
    empty, its key still exists; this is a feature of
    [APIFlask](https://apiflask.com/schema/).
    
    `if "parser_config" in json_data` ===> `if json_data["parser_config"]`
    
    
    ![image](https://github.com/user-attachments/assets/dd2a26d6-b3e3-4468-84ee-dfcf536e59f7)
    
    4. Some common parameter error messages, all produced by
    [Marshmallow](https://marshmallow.readthedocs.io/en/stable/marshmallow.fields.html)
    
    Parameter validation configuration
    ```
        kb_id = fields.String(required=True)
        parser_id = fields.String(validate=validators.OneOf([parser_type.value for parser_type in ParserType]),
                                  allow_none=True)
    ```
    
    When the parameters are
    ```
    kb_id=None,
    parser_id='A4'
    ```
    
    Error messages
    ```
    {
        "detail": {
            "json": {
                "kb_id": [
                    "Field may not be null."
                ],
                "parser_id": [
                    "Must be one of: presentation, laws, manual, paper, resume, book, qa, table, naive, picture, one, audio, email, knowledge_graph."
                ]
            }
        },
        "message": "Validation error"
    }
    ```
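
Item 3 above turns on the difference between a key being present and its value being non-empty. This is a minimal sketch in plain Python, assuming the parsed request body behaves like a dict in which an unset optional field is present with the value `None`, as the PR describes. `json_data` is sample data, and the `.get()` form is an added variant that also tolerates a missing key.

```python
# Parsed body as described in the PR: the optional key exists but is empty.
json_data = {"kb_id": "kb1", "parser_config": None}

# Membership only checks the key, so it is true even for an empty value.
present = "parser_config" in json_data

# Truthiness checks the value, so None, {} and "" all count as "not provided".
provided = bool(json_data["parser_config"])

# .get() gives the same answer for present keys and does not raise for missing ones.
provided_safe = bool(json_data.get("parser_config"))

print(present, provided, provided_safe)  # True False False
```

One consequence of the truthiness check is that an explicitly empty `parser_config` (`{}`) is treated the same as an omitted one.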
    
    ### Type of change
    
    - [x] Bug Fix (non-breaking change which fixes an issue)
    Valdanitooooo authored Sep 20, 2024
    Commit 5110a3b

Commits on Sep 24, 2024

  1. fix(API): fixed retrieval api parameters matching (#2550)

    ### What problem does this PR solve?
    
    Fixed two errors in the /datasets/retrieval API:
    KeyError('size') and 'doc_ids': ['Field may not be null.']
    
    ### Type of change
    
    - [x] Bug Fix (non-breaking change which fixes an issue)
    Valdanitooooo authored Sep 24, 2024
    Commit 1fafdb8