Ranger-5081: CI: Add check to verify plugin installation in ranger-service containers #583

Open
wants to merge 11 commits into
base: master

ChinmayHegde24

Currently, the GitHub CI pipeline brings up all the containers supported by dev-support/ranger-docker.

What changes were proposed in this pull request?

This PR adds another check to the CI pipeline: once the containers are up, it verifies that the Ranger plugin setup succeeded.

kumaab (Contributor) commented Jun 5, 2025

I recommend writing these checks in Python: it makes them more readable, maintainable, reusable, and extensible, and it is better suited to extensive JSON parsing.

Please take a look at the sample code below (AI-generated) as a starting point; it can be improved further:

#!/usr/bin/env python3

"""
Check Ranger plugin status via Ranger Admin API.
This script is intended to be used as a CI check step, and follows
a function-based, maintainable structure for easy extension.
"""

import os
import sys
import time
from typing import List

import requests
from dotenv import load_dotenv

def load_env(env_path: str):
    """Load environment variables from a .env file."""
    load_dotenv(dotenv_path=env_path)

def trigger_knox_activity(knox_user: str, knox_pass: str, knox_endpoint: str):
    """Trigger activity for KNOX to make plugin active."""
    print("\nTriggering Knox activity to ensure plugin status is updated...")
    try:
        requests.get(knox_endpoint, auth=(knox_user, knox_pass), verify=False, timeout=10)
        print("Knox activity triggered.")
    except Exception as e:
        print(f"Warning: Knox trigger failed: {e}")

def fetch_plugin_info(ranger_admin_user: str, ranger_admin_pass: str, endpoint: str):
    """Fetch plugin info from Ranger Admin API."""
    print(f"\nFetching plugin info from {endpoint} ...")
    try:
        resp = requests.get(endpoint, auth=(ranger_admin_user, ranger_admin_pass), timeout=10)
        resp.raise_for_status()
        return resp.json()
    except Exception as e:
        print(f"Failed to fetch plugin info: {e}")
        return None

def check_plugin_status(response: list, expected_services: List[str]) -> bool:
    """Check the status of plugins for expected services."""
    print("\n<---------  Plugin Status  ---------->")
    failed = False
    for svc in expected_services:
        print(f"\nChecking service type: {svc}")
        entries = [entry for entry in response if entry.get("serviceType") == svc]
        count = len(entries)
        if count == 0:
            print(f"MISSING: No plugins found for service type '{svc}'.")
            failed = True
            continue
        active_plugins = [
            entry for entry in entries
            if entry.get("info", {}).get("policyActiveVersion")
        ]
        active_count = len(active_plugins)
        print(f"\U0001F7E2 Active plugins: {active_count} / {count} total plugins found.")
        if active_count == 0:
            print(f"WARNING: Plugins present but NONE are active for '{svc}'.")
            failed = True
        print("Details:")
        for entry in entries:
            host = entry.get("hostName", "unknown")
            app_type = entry.get("appType", "unknown")
            active_ver = entry.get("info", {}).get("policyActiveVersion", "null")
            print(f"- Host: {host}, AppType: {app_type}, PolicyActiveVersion: {active_ver}")
    return not failed

def main():
    # Load .env from the parent directory
    script_dir = os.path.dirname(os.path.abspath(__file__))
    env_path = os.path.join(script_dir, '..', '.env')
    load_env(env_path)

    RANGER_HOST = os.getenv("RANGER_HOST", "http://localhost:6080")
    ENDPOINT = f"{RANGER_HOST}/service/public/v2/api/plugins/info"

    KNOX_USER = os.getenv("KNOX_USER")
    KNOX_PASS = os.getenv("KNOX_PASS")
    KNOX_ENDPOINT = os.getenv("KNOX_ENDPOINT", "https://localhost:8443/gateway/sandbox/webhdfs/v1/?op=LISTSTATUS")

    RANGER_ADMIN_USER = os.getenv("RANGER_ADMIN_USER")
    RANGER_ADMIN_PASS = os.getenv("RANGER_ADMIN_PASS")

    expected_services = ["hdfs", "hbase", "kms", "yarn", "kafka", "ozone", "knox", "hive"]

    # 1. Trigger knox activity
    trigger_knox_activity(KNOX_USER, KNOX_PASS, KNOX_ENDPOINT)

    # 2. Wait for status update
    time.sleep(60)

    # 3. Fetch plugin info
    response = fetch_plugin_info(RANGER_ADMIN_USER, RANGER_ADMIN_PASS, ENDPOINT)
    if not response or not isinstance(response, list):
        print("No plugin info returned from API.")
        sys.exit(1)

    # 4. Check status
    all_ok = check_plugin_status(response, expected_services)
    print()
    if not all_ok:
        print("\u274C One or more plugins are missing or inactive.")
        sys.exit(1)
    else:
        print("\u2705 All expected plugins are present and active.")

if __name__ == "__main__":
    main()

@kumaab kumaab self-requested a review June 5, 2025 06:31
expected_services = ["hdfs", "hbase", "kms", "yarn", "kafka", "ozone", "knox", "hive"]


def trigger_knox_activity():
Contributor

suggested function name: init_knox_plugin

def trigger_knox_activity():
    print("\nTriggering Knox activity to ensure plugin status is updated...")
    try:
        response = requests.get(
Contributor

Please take a look at the apache-ranger Python library; more info: https://pypi.org/project/apache-ranger/
Use RangerClient to do a GET on the Knox endpoint instead of using requests directly.

def fetch_plugin_info():
    print(f"\nFetching plugin info from {ENDPOINT} ...")
    try:
        response = requests.get(
Contributor

Use RangerClient from the apache-ranger Python library to do a GET on the Ranger endpoint instead of using requests directly.
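
As a rough illustration of this suggestion, the sketch below routes the GET through a client object's session rather than calling `requests` directly. It assumes that `RangerClient` (from `apache_ranger.client.ranger_client`) is constructed as `RangerClient(url, auth)` and exposes a requests-style `session` attribute, which matches the `ranger.session.get(` call in the later revision quoted in this thread. The names `make_client` and `fetch_plugin_info`, and the timeout value, are illustrative rather than taken from the PR.

```python
from typing import Any, List, Optional, Tuple


def make_client(url: str, auth: Tuple[str, str]) -> Any:
    """Build a RangerClient. Imported lazily so the helper below can be tested without it."""
    # Assumption: the apache-ranger package is installed (pip install apache-ranger).
    from apache_ranger.client.ranger_client import RangerClient
    return RangerClient(url, auth)


def fetch_plugin_info(client: Any, base_url: str, timeout: int = 10) -> Optional[List[dict]]:
    """GET plugin info through the client's session; return None on any failure."""
    endpoint = f"{base_url.rstrip('/')}/service/public/v2/api/plugins/info"
    try:
        resp = client.session.get(endpoint, timeout=timeout)
        resp.raise_for_status()
        data = resp.json()
    except Exception as exc:  # network errors, HTTP errors, invalid JSON
        print(f"Failed to fetch plugin info: {exc}")
        return None
    # The checks in this thread expect a JSON list of plugin entries.
    return data if isinstance(data, list) else None
```

Passing the client in as an argument keeps credentials and session settings in one place, and lets a CI unit test substitute a stub for the real client.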

Author

Yes, I've made these changes. Please take another look.

trigger_knox_activity()

# wait for status update
for i in range(6): # Retry up to 3 minutes total
Contributor

Consider making the retry interval and the number of retries configurable.
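
One possible shape for this, offered as a minimal sketch rather than the PR's actual code: a small polling helper that takes the retry count and interval as parameters instead of hard-coding `range(6)`. The name `wait_until` and the injectable `sleep` argument are hypothetical, added here only for illustration and testability.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], retries: int, interval: float,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call check() up to `retries` times, pausing `interval` seconds between attempts."""
    for attempt in range(1, retries + 1):
        if check():
            return True
        print(f"Attempt {attempt}/{retries}: plugins not ready yet.")
        if attempt < retries:  # no need to sleep after the final attempt
            sleep(interval)
    return False
```

A caller could then pass whatever values the CI configuration provides, for example `wait_until(plugins_ready, retries=6, interval=30)`, where `plugins_ready` stands in for the status check and the 30-second interval is an assumed value.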

def init_knox_plugin():
    print("\nTriggering Knox activity to ensure plugin status is updated...")
    try:
        response = ranger.session.get(
Contributor

use a single line instead



def init_knox_plugin():
    print("\nTriggering Knox activity to ensure plugin status is updated...")
Contributor

Change the log message to "Initializing Knox plugin".

        )
        print("Knox activity triggered.")
    except Exception as e:
        print(f"Failed to trigger Knox activity: {e}")
Contributor

Update the message to "Failed to initialize Knox plugin".

KNOX_USER=admin
KNOX_PASS=admin-password

PLUGIN_RETRY_COUNT=4
Contributor

They are defined as env variables in maven.yml, so this is unnecessary.
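
If the values come from the workflow environment, the script can read them straight from the process environment without a `.env` entry. The sketch below shows one way to do that, assuming the variable is an integer such as `PLUGIN_RETRY_COUNT`; the helper name `read_int_setting` and the fallback default are hypothetical.

```python
import os
from typing import Mapping


def read_int_setting(name: str, default: int, env: Mapping[str, str] = os.environ) -> int:
    """Read a positive integer setting from the environment, falling back to `default`."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    value = int(raw)  # raises ValueError on non-numeric input, so a bad value fails the step early
    if value <= 0:
        raise ValueError(f"{name} must be positive, got {value}")
    return value
```

Accepting the mapping as a parameter means a test can pass a plain dictionary instead of modifying the real environment.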
