Skip to content

Conversation

neoclust
Copy link
Contributor

@neoclust neoclust commented Mar 25, 2025

Summary by Sourcery

Adds new plugins and functionalities for backup and restore operations, including file segmentation, transfer, and recombination, along with network information retrieval and management.

New Features:

  • Introduces a 'backup_restore_substitut' plugin for transferring backup files via XMPP, including ZIP archive creation, segmentation, and management.
  • Adds a 'send_file_xmpp' plugin to send files or directories to a machine via XMPP, including ZIP archive creation and segmentation.
  • Implements a 'load_send_Segment_file' plugin for cyclic file transfer to machines, scheduling file transfers, and managing segmented files.
  • Introduces a 'recombine_file' plugin for recombining segmented files received via XMPP, verifying MD5 hashes, and decompressing archives.
  • Adds functions to retrieve network information, including IPv4 addresses in CIDR format, and to find common IP addresses between networks.
  • Adds functions to search machine and network list machine

jfkneib added 12 commits March 24, 2025 23:58
Ce plugin est en charge de recevoir la liste des fichiers que la machine devra tirer en rsync sur le serveur backup.

(cherry picked from commit 2598a43)
…au serveur de backup.

               la machine devra tirer en scp sur le serveur backup.
- restriction : seulement pour les machines dans le lan du serveur urbackup
- si machine n'est pas dans le lan, alors le plugin quitte sans rien faire. il met juste 1 erreur

(cherry picked from commit eb9b373)
le serveur ssh pour le transfert de fichiers peut etre configure avec 1 port autre que 22
on peut donc configurer ce port dans le plugin.

; SSH Server machine urbackup port number
ssh_port = 22
dans le fichier de configuration. pulse_xmpp_agent/config/backup_restore.ini

(cherry picked from commit c0340b2)
- Le plugin load_send_Segment_file installe une fonction appelée cycliquement pour envoyer des fichiers à des machines.
- Les fichiers sont envoyés par morceaux via des messages XMPP vers l'agent qui recombine les morceaux pour refaire le fichier a l'aide du plugin_recombine_file
- Chaque envoi inclut un fichier manifeste et des fichiers de contenu.

pour active cette fonction il faut declarer l'utilisation du plugin dans la rubrique
[plugins]
pluginliststart = ..., ..., loadlogsrotation, load_send_Segment_file
exemple si on donne cette fonction au substitut reconf
dans
/etc/pulse-xmpp-agent-substitute/agent_master_substitute_reconf.ini
on a la rubrique avec load_send_Segment_file dans la liste
[plugins]
pluginliststart = loadreconf, loadarscheck, loadlogsrotation, load_send_Segment_file

(cherry picked from commit a91ceb9)
Plugin master.

Remarque importante.

la signature de la fonction principal utilise est la suivante.
def action(xmppobject, action, sessionid, data, message, ret, dataobj=None)

dataobj=None permet a ce plugin d'etre aussi appele depuis 1 substitut.

Ce plugin est appelé depuis mmc, il est utilisé par substitut master.
Il a comme but de transférer des fichiers de backup via xmpp.
Il crée une archive zip, celle de ce que l'on veut transférer via xmpp.
L'archive est décomposée en morceaux dans un répertoire /var/lib/pulse2/zip_transfert.

Le JSON reçu par ce plugin est de la forme suivante :
{
 "action": "backup_restore_substitut",
    "sessionid": "backup_restoreq3d0m6vl",
    "data": {
        "key_private": "-----BEGIN OPENSSH PRIVATE KEY--------\n",
        "ip_list_reseau": [
            "127.0.0.0/8",
            "10.10.0.0/16",
            "95.217.42.79/32"
        ],
        "ip_list": [
            [
                "10.10.0.30",
                "255.255.0.0"
            ],
            [
                "95.217.42.79",
                "255.255.255.255"
            ]
        ],
        "machine_source_backup": {
            "hostname": "DESKTOP-Q7O57H5",
            "jid": "DESKTOP-Q7O57H5.68i@pulse",
            "uuid_inventorymachine": "UUID5",
            "uuid_serial_machine": "4C4C4544-0030-4710-8054-C2C04F544C33",
            "id": 501,
            "enabled": 0
        },
        "machine_dest_backup": {
            "hostname": "DESKTOP-Q7O57H5",
            "jid": "DESKTOP-Q7O57H5.68i@pulse",
            "uuid_inventorymachine": "UUID5",
            "uuid_serial_machine": "4C4C4544-0030-4710-8054-C2C04F544C33",
            "id": 501,
            "enabled": 0
        },
        "base_path": "/media/BACKUP/urbackup/amu-win-6/240916-1727",
        "directorylist": [
            [
                "C_0/Users/pulseuser",
                "C:/Users/pulseuser"
            ]
        ],
        "filelist": [
            [
                "Users/desktop.ini",
                "C:\\Users\\desktop.ini"
            ]
        ]
    }
}

(cherry picked from commit fbd87c1)
ce plugin est appeler depuis la MMC.
remarque importante.
on ajoute a la signature de la fonction action, "dataobj=None" cela permet d'avoir 1 plugin appelable aussi depuis 1 substitut.
def action(xmppobject, action, sessionid, data, message, ret, dataobj=None)
comme par exemple le plugin WOL appeler depuis MMC et depuis le substitut de deployement.

Ce plugin est appelé depuis mmc, il est utilisé par substitut master.
Il a comme but de transférer des fichiers de backup via xmpp.
Il crée une archive zip, celle de ce que l'on veut transférer via xmpp.
L'archive est décomposée en morceaux dans un répertoire /var/lib/pulse2/zip_transfert.

Le JSON reçu par ce plugin est de la forme suivante :
{
 "action": "backup_restore_substitut",
    "sessionid": "backup_restoreq3d0m6vl",
    "data": {
        "key_private": "-----BEGIN OPENSSH PRIVATE KEY--------\n",
        "ip_list_reseau": [
            "127.0.0.0/8",
            "10.10.0.0/16",
            "95.217.42.79/32"
        ],
        "ip_list": [
            [
                "10.10.0.30",
                "255.255.0.0"
            ],
            [
                "95.217.42.79",
                "255.255.255.255"
            ]
        ],
        "machine_source_backup": {
            "hostname": "DESKTOP-Q7O57H5",
            "jid": "DESKTOP-Q7O57H5.68i@pulse",
            "uuid_inventorymachine": "UUID5",
            "uuid_serial_machine": "4C4C4544-0030-4710-8054-C2C04F544C33",
            "id": 501,
            "enabled": 0
        },
        "machine_dest_backup": {
            "hostname": "DESKTOP-Q7O57H5",
            "jid": "DESKTOP-Q7O57H5.68i@pulse",
            "uuid_inventorymachine": "UUID5",
            "uuid_serial_machine": "4C4C4544-0030-4710-8054-C2C04F544C33",
            "id": 501,
            "enabled": 0
        },
        "base_path": "/media/BACKUP/urbackup/amu-win-6/240916-1727",
        "directorylist": [
            [
                "C_0/Users/pulseuser",
                "C:/Users/pulseuser"
            ]
        ],
        "filelist": [
            [
                "Users/desktop.ini",
                "C:\\Users\\desktop.ini"
            ]
        ]
    }
}

(cherry picked from commit 12d6418)
Ajout du plugin plugin_recombine_file.py pour reconstituer les segments de fichiers transférés et décompresser l'archive à l'endroit précisé.
Ce plugin reçoit les segments de fichiers transférés via xmpp et les reconstitue. Ensuite, il décompresse l'archive à l'endroit précisé.

(cherry picked from commit 6f28fbe)
transfert xmpp

pluging agent

    pluging recombine_file
    plugin_backup_restore.py

pluging substitut

    plugin_backup_restore_substitut.py
    plugin_load_send_Segment_file.py
    plugin_send_file_xmpp.py

(cherry picked from commit 4e1b1d5)
Copy link

sourcery-ai bot commented Mar 25, 2025

Reviewer's Guide by Sourcery

This pull request introduces several new plugins and functions to enhance Urbackup's capabilities, including file transfer via XMPP, remote file restoration via SSH, and segmented file management. It also includes functions for network information retrieval and machine searching.

Sequence diagram for sending a file via XMPP

sequenceDiagram
    participant MMC
    participant MasterSubstitute
    participant Agent

    MMC->>MasterSubstitute: Request to send file
    MasterSubstitute->>XmppMasterDatabase: search_machine(machine_dest)
    XmppMasterDatabase-->>MasterSubstitute: Machine Details
    alt File exists
        MasterSubstitute->>MasterSubstitute: zipper_fichier(path_fichier, archive_fichier_name_zip)
        MasterSubstitute->>MasterSubstitute: ZipFileManager.analyze_and_cleanup()
        MasterSubstitute->>MasterSubstitute: process_zip_files()
        loop for each segment
            MasterSubstitute->>Agent: Send file segment
        end
    else File does not exist
        MasterSubstitute->>MMC: Error message
    end
Loading

Updated class diagram for ZipFileManager

classDiagram
    class ZipFileManager {
        -directory: str
        -files_to_keep: dict
        +__init__(directory: str)
        +analyze_and_cleanup()
        +delete_file(file: str)
    }
    note for ZipFileManager "Manages ZIP files in a directory, keeping the most recent files based on timestamp and deleting older ones."
Loading

Updated class diagram for FileHasher

classDiagram
    class FileHasher {
        -file_path: str
        +__init__(file_path: str)
        +calculate_hash(): str
    }
    note for FileHasher "Calculates the MD5 hash of a file."
Loading

File-Level Changes

Change Details Files
Added functions to find common IP addresses between two lists of CIDR networks and to retrieve IPv4 addresses in CIDR format for each network interface.
  • Added find_common_addresses function to find common IP addresses between two lists of CIDR networks.
  • Added get_CIDR_ipv4_addresses function to retrieve IPv4 addresses in CIDR format for each network interface.
pulse_xmpp_master_substitute/lib/networkinfo.py
pulse_xmpp_agent/lib/networkinfo.py
Implemented a plugin for Urbackup to transfer backup files via XMPP, including zipping, segmenting, and managing zip files.
  • Created plugin_backup_restore_substitut.py to handle backup file transfers via XMPP.
  • Implemented functions to zip directories and files, split zip files into segments, and manage zip files in a directory.
  • Added functions to calculate MD5 hashes, generate unique names with timestamps, and handle file splitting and recombination.
  • Implemented functions to replace and restore special characters in file names.
  • Added a class to manage zip files, including analyzing and cleaning up old files.
pulse_xmpp_master_substitute/pluginsmastersubstitute/plugin_backup_restore_substitut.py
Implemented a plugin for sending files via XMPP, including zipping, segmenting, and managing zip files.
  • Created plugin_send_file_xmpp.py to handle sending files via XMPP.
  • Implemented functions to zip directories and files, split zip files into segments, and manage zip files in a directory.
  • Added functions to calculate MD5 hashes, generate unique names with timestamps, and handle file splitting and recombination.
  • Implemented functions to replace and restore special characters in file names.
  • Added a class to manage zip files, including analyzing and cleaning up old files.
pulse_xmpp_master_substitute/pluginsmastersubstitute/plugin_send_file_xmpp.py
Implemented a plugin for recombining files received via XMPP, including manifest verification, file concatenation, and decompression.
  • Created plugin_recombine_file.py to handle recombining files received via XMPP.
  • Implemented functions to verify manifests, concatenate file segments, and decompress archives.
  • Added functions to calculate MD5 hashes, replace and restore special characters in file names, and validate paths.
  • Implemented functions to get file sizes and convert them to human-readable formats.
pulse_xmpp_agent/plugins_common/plugin_recombine_file.py
Implemented a plugin for Urbackup to restore files from a remote server via SSH, including configuration loading, SSH key management, and file transfer.
  • Created plugin_backup_restore.py to handle file restoration from a remote server via SSH.
  • Implemented functions to load configuration, manage SSH keys, and copy files and directories via SCP or rsync.
  • Added functions to find the best server address, retrieve local IP addresses and netmasks, and modify backup paths.
  • Created a default configuration file backup_restore.ini.
pulse_xmpp_agent/plugins_common/plugin_backup_restore.py
pulse_xmpp_agent/config/backup_restore.ini
Implemented a plugin to load and send segmented files to machines, including scheduling, file segmentation, and manifest management.
  • Created plugin_load_send_Segment_file.py to handle loading and sending segmented files.
  • Implemented functions to schedule file transfers, segment files, and manage manifests.
  • Added functions to list directories and files, read JSON files, and delete directories.
pulse_xmpp_master_substitute/pluginsmastersubstitute/plugin_load_send_Segment_file.py
Added a function to search for a machine by ID, UUID, JID, or serial number.
  • Added search_machine function to search for a machine by ID, UUID, JID, or serial number.
pulse_xmpp_master_substitute/lib/plugins/xmpp/__init__.py
Added a function to retrieve the list of networks associated with a specific machine.
  • Added network_list_machine function to retrieve the list of networks associated with a specific machine.
pulse_xmpp_master_substitute/lib/plugins/xmpp/__init__.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!
  • Generate a plan of action for an issue: Comment @sourcery-ai plan on
    an issue to generate a plan of action for it.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

This commit fixes the style issues introduced in 29403c5 according to the output
from Black and PHP CS Fixer.

Details: #561
Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @neoclust - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider adding error handling or logging for the os.remove calls in process_zip_files to handle cases where file deletion fails.
  • The find_common_addresses function is duplicated in both pulse_xmpp_master_substitute and pulse_xmpp_agent; consider refactoring into a common utility module.
Here's what I looked at during the review
  • 🟡 General issues: 7 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 3 issues found
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +11726 to +11735
def search_machine(self, session, search_term):
"""
Recherche une machine qui correspond à un id, uuid_inventorymachine, jid ou serialnumber.
Paramètres :
session (Session) : La session SQLAlchemy.
search_term (str) : Le terme de recherche.
Retourne :
dict : Les informations de la machine trouvée.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Consider using a lower log level for query tracing.

The use of logger.error for non‐error debugging information (e.g., printing the query and search term) might be misleading; consider using logger.debug or logger.info instead to better differentiate expected trace logs from actual errors.

Suggested implementation:

        logger.debug("Search query: %s, term: %s", query, search_term)

If there are additional logging statements for query tracing in this function (or other similar functions in the file), update those logger.error calls to logger.debug (or logger.info) as well to align with the desired log level for non‐error debugging. Also, ensure the logger is configured to output debug-level logs as needed.

Comment on lines +59 to +68
def get_CIDR_ipv4_addresses(exclude_localhost=True):
"""
Récupère les adresses IPv4 au format CIDR pour chaque interface réseau de la machine.
Cette fonction utilise des commandes spécifiques au système d'exploitation pour obtenir les informations
sur les interfaces réseau et les adresses IPv4. Elle retourne une liste d'adresses IPv4 au format CIDR.
Args:
exclude_localhost (bool): Si True, exclut les interfaces locales (localhost ou 127.0.0.1).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Duplicate implementations noticed across modules.

The functions find_common_addresses and get_CIDR_ipv4_addresses appear twice in the repository (once in the substitute and once in the agent). Consider centralizing these common utilities to avoid code duplication if possible.

Comment on lines 100 to 109
def process_zip_files(input_dir,
output_dir_base_trunck,
uuid_serial_machine,
segment_size=8000,
type_transfert="backup",
location=None,
contenttype="directory"):
"""
Lit tous les fichiers ZIP d'un répertoire, les découpe et enregistre les segments dans un répertoire de sortie basé sur le JID.
contenttype file ou directory
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Check early return logic in process_zip_files.

The loop returns True immediately after processing the first ZIP file (via os.remove call) preventing any additional files from being processed. Revisit the loop's flow to ensure all relevant ZIP files are handled as intended.

Comment on lines 64 to 73
# Exemple d'utilisation
manager = ZipFileManager(var_file_zipzer)
manager.analyze_and_cleanup()
output_dir_list = process_zip_files(var_file_zipzer, var_file_zipzer, datamsg['machine_dest_backup']['uuid_serial_machine'], segment_size=64000)

# Traitement des fichiers à zipper
if 'filelist' in datamsg and datamsg['filelist']:
for files in datamsg['filelist']:
# logger.debug("creation d'un fichier zip depuis 1 fichier : %s" % files)
if files:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Inconsistent exception handling in plugin action function.

Exceptions are caught by a broad catch block without re-raising or further handling. Consider logging a more detailed exception message or handling specific cases separately to improve debuggability.

Suggested implementation:

except Exception as e:
    logger.exception("An error occurred in plugin action function")
    continue

If there are more similar broad exception catch blocks elsewhere within the plugin action function (or in other functions in the file), consider applying a similar change to use logger.exception for a more detailed error log. Also, if possible, catch specific exceptions when you know the expected failure modes.

Comment on lines 35 to 44
def action(xmppobject, action, sessionid, data, message, ret, dataobj=None):
logger.debug("###################################################")
logger.debug("call %s from %s" % (plugin, message["from"]))
logger.debug("###################################################")
logger.debug("data %s" % json.dumps(data, indent=4))
# Spécifier le chemin où créer le répertoire ZIP
# Créer le répertoire s'il n'existe pas
os.makedirs(var_file_zipzer, exist_ok=True)

if not verifier_cles_non_vides(data, [ "machine_dest",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Input validation could be more rigorous.

The function verifier_cles_non_vides checks for missing keys, but it may be beneficial to also validate the types and formats of file paths and other critical input values to prevent unexpected runtime errors later on.

Suggested implementation:

import re
import base64
from lib.plugins.xmpp import XmppMasterDatabase

def validate_input_values(data):
    # Validate file path: must be a string and an absolute path (starting with '/')
    if "path_fichier" in data:
        if not isinstance(data["path_fichier"], str) or not re.match(r'^/', data["path_fichier"]):
            raise ValueError("Invalid 'path_fichier': It must be a string representing an absolute file path.")
    # Validate machine destination: must be a non-empty string.
    if "machine_dest" in data:
        if not isinstance(data["machine_dest"], str) or not data["machine_dest"].strip():
            raise ValueError("Invalid 'machine_dest': It must be a non-empty string.")
    return True
def action(xmppobject, action, sessionid, data, message, ret, dataobj=None):
    # New: Validate input values for required keys.
    validate_input_values(data)

    logger.debug("###################################################")
    logger.debug("call %s from %s" % (plugin, message["from"]))
    logger.debug("###################################################")
    logger.debug("data %s" % json.dumps(data, indent=4))

Note: Depending on your codebase, you may need to adjust the file path format validation or add additional rules for other inputs. Also ensure that any errors raised by validate_input_values are properly caught and handled by higher-level logic if needed.

Comment on lines +394 to +402
if millisecondes:
timestamp = int(time.time() * 1000) # Timestamp en millisecondes
else:
timestamp = int(time.time()) # Timestamp en secondes

# Combiner le UUID et le timestamp
uuid_timestamp = f"{timestamp}-{uuid_aleatoire}"

return uuid_timestamp
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code-quality): We've found these issues:

Suggested change
if millisecondes:
timestamp = int(time.time() * 1000) # Timestamp en millisecondes
else:
timestamp = int(time.time()) # Timestamp en secondes
# Combiner le UUID et le timestamp
uuid_timestamp = f"{timestamp}-{uuid_aleatoire}"
return uuid_timestamp
timestamp = int(time.time() * 1000) if millisecondes else int(time.time())
return f"{timestamp}-{uuid_aleatoire}"

# Analyser chaque fichier
for file in zip_files:
# Vérifier que le fichier correspond au format attendu : <timestamp>_<reste_du_nom>.zip
match = re.match(r'(\d+)_([A-Za-z0-9@:_\-\\]+)\.zip', file)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (code-quality): Use named expression to simplify assignment and conditional [×2] (use-named-expression)

Comment on lines +483 to +486
else:
# Inclure les fichiers vides si demandé
if fichier_vide or os.path.getsize(fichier) > 0:
zipf.write(fichier, os.path.basename(fichier))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code-quality): Merge else clause's nested if statement into elif (merge-else-if-into-elif)

Suggested change
else:
# Inclure les fichiers vides si demandé
if fichier_vide or os.path.getsize(fichier) > 0:
zipf.write(fichier, os.path.basename(fichier))
elif fichier_vide or os.path.getsize(fichier) > 0:
zipf.write(fichier, os.path.basename(fichier))

Comment on lines +534 to +537
else:
# Inclure les fichiers vides si demandé
if fichier_vide or os.path.getsize(chemin_complet) > 0:
zipf.write(chemin_complet, chemin_rel)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code-quality): Merge else clause's nested if statement into elif (merge-else-if-into-elif)

Suggested change
else:
# Inclure les fichiers vides si demandé
if fichier_vide or os.path.getsize(chemin_complet) > 0:
zipf.write(chemin_complet, chemin_rel)
elif fichier_vide or os.path.getsize(chemin_complet) > 0:
zipf.write(chemin_complet, chemin_rel)

Comment on lines +50 to +54
logger.debug("call %s from %s" % (plugin, msg["from"]))
logger.debug("=====================================================")
logger.debug("data %s" % json.dumps(data, indent=4))

compteurcallplugin = getattr(objectxmpp, "num_call%s" % action)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (code-quality): Replace interpolated string formatting with f-string [×2] (replace-interpolation-with-fstring)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant