
Conversation

@saforem2 (Owner) commented Oct 11, 2024

Summary by Sourcery

Refactor the training script to enhance logging, modularity, and data handling. Introduce new optimizers and refactor model components for better maintainability. Add helper scripts for ALCF system setup and deployment.

Enhancements:

  • Refactor the training script to improve logging and modularity, including the introduction of a logging system with different log levels and the use of decorators for profiling (a minimal sketch follows this list).
  • Enhance the data loading process by introducing a new dataset builder class and a concatenated dataset class to handle multiple datasets more efficiently.
  • Improve the optimizer setup by adding support for various optimizers, including new ones like GaLoreAdamW and SophiaG, and refactor the parameter group creation logic.
  • Refactor the model components, including the transformer layers and attention mechanisms, to improve readability and maintainability.
  • Introduce a new helper script for setting up and running training on ALCF systems, which includes functions for environment setup, job configuration, and command execution.
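
For a concrete picture of the rank-aware logging and profiling-decorator pattern described above, here is a minimal, hypothetical sketch; the names `get_logger` and `profile` are illustrative, not the actual API introduced by this PR:

```python
import functools
import logging
import os
import time

RANK = int(os.environ.get("RANK", "0"))


def get_logger(name: str, level: str = "INFO") -> logging.Logger:
    """Return a logger that only emits on rank 0 of a distributed job."""
    logging.basicConfig()
    logger = logging.getLogger(name)
    # Non-zero ranks stay quiet so messages are not duplicated once per rank.
    logger.setLevel(level if RANK == 0 else "CRITICAL")
    return logger


log = get_logger(__name__)


def profile(fn):
    """Decorator that logs the wall-clock time of each call."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        log.info("%s took %.3fs", fn.__name__, time.perf_counter() - start)
        return result
    return wrapper


@profile
def train_step():
    time.sleep(0.1)  # stand-in for real work


train_step()  # on rank 0, logs something like: "train_step took 0.100s"
```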

Build:

  • Add a new build script for data helpers to ensure that the necessary C++ extensions are compiled before running the training script.

Deployment:

  • Add a new script for launching training on Aurora using qsub, which sets up the environment and executes the training script.

saforem2 and others added 22 commits September 10, 2024 08:05
e.g.:
```bash
$ PBS_O_WORKDIR=$(pwd) LR=0.00020 OVERRIDE_CKPT_OPT_PARAM=1 bash train_aGPT_7B.sh --train-range-to-skip 43000 47000 --override-opt_param-scheduler
```

will override the `lr_scheduler` params from the checkpoint and use the specified value, `LR=0.00020`, instead.
Remove the `--train-range-to-skip` logic from `pretrain_gpt_alcf.py` and remove redundant code.
@sourcery-ai bot changed the title from "@sourcery-ai" to "Refactor training script for improved logging and modularity" on Oct 11, 2024

sourcery-ai bot commented Oct 11, 2024

Reviewer's Guide by Sourcery

This pull request refactors the training script for improved logging and modularity, introduces new features such as Llama2Tokenizer and support for additional optimizers, enhances dataset handling and logging, and updates build scripts for compatibility with ALCF systems. The changes span multiple files and include significant modifications to core training logic, data processing, and system-specific optimizations.

Sequence diagram for the training process with logging

```mermaid
sequenceDiagram
    participant User
    participant TrainingScript
    participant Logger
    User->>TrainingScript: Start training
    TrainingScript->>Logger: Initialize logging
    TrainingScript->>TrainingScript: initialize_megatron()
    TrainingScript->>TrainingScript: setup_model_and_optimizer()
    TrainingScript->>TrainingScript: train()
    TrainingScript->>Logger: Log training progress
    TrainingScript->>TrainingScript: evaluate()
    TrainingScript->>Logger: Log evaluation results
    TrainingScript->>TrainingScript: save_checkpoint_and_time()
    TrainingScript->>Logger: Log checkpoint status
    TrainingScript->>User: Training complete
```

ER diagram for dataset handling

```mermaid
erDiagram
    DATASET {
        string prefix
        string data_impl
        string splits_string
        int num_samples
        int seq_length
        int seed
        bool skip_warmup
    }
    DATASET ||--o{ DATASETBUILDER : builds
    DATASETBUILDER {
        string prefix
        string corpus
        int num_samples
        int seq_length
        int seed
        bool skip_warmup
    }
    DATASETBUILDER ||--o{ BUILDCONCATDATASET : concatenates
    BUILDCONCATDATASET {
        int num_datasets
        int num_samples
    }
```

Class diagram for the refactored training script

```mermaid
classDiagram
    class TrainingScript {
        +initialize_megatron()
        +setup_model_and_optimizer()
        +train()
        +evaluate()
        +save_checkpoint_and_time()
    }
    class DatasetBuilder {
        +Build()
    }
    class BuildConcatDataset {
        +__getitem__(int idx)
    }
    class RMSNorm {
        +__init__(int dim, float eps, bool sequence_parallel)
        +_norm(torch.Tensor x)
    }
    TrainingScript --> DatasetBuilder : uses
    TrainingScript --> BuildConcatDataset : uses
    TrainingScript --> RMSNorm : uses
```
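
The diagrams above reference `BuildConcatDataset` and its `__getitem__`. As a rough, illustrative reimplementation of the idea (a flat index mapped onto several underlying datasets), not the code from this PR, such a wrapper might look like:

```python
import numpy as np
import torch


class ConcatDatasetSketch(torch.utils.data.Dataset):
    """Present several datasets as one by mapping a flat index to (dataset, local index)."""

    def __init__(self, datasets):
        self.datasets = datasets
        # boundaries[i] is the cumulative number of samples through dataset i
        self.boundaries = np.cumsum([len(d) for d in datasets])
        self.num_samples = int(self.boundaries[-1])

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        ds_idx = int(np.searchsorted(self.boundaries, idx, side="right"))
        local_idx = idx if ds_idx == 0 else idx - int(self.boundaries[ds_idx - 1])
        return self.datasets[ds_idx][local_idx]


# usage with two toy datasets
a = torch.utils.data.TensorDataset(torch.arange(3))
b = torch.utils.data.TensorDataset(torch.arange(10, 15))
ds = ConcatDatasetSketch([a, b])
print(len(ds), ds[0], ds[4])  # 8 samples; ds[4] is the second item of `b`
```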

File-Level Changes

Refactor training script for improved logging and modularity (`megatron/training.py`)
  • Replace print statements with structured logging using the logging module
  • Introduce Profile and PerfTrace classes for performance monitoring
  • Refactor the main training loop for better readability and modularity
  • Add support for skipping specific training iterations
  • Implement more flexible command-line argument handling

Enhance dataset handling and introduce new dataset classes (`megatron/data/gpt_dataset.py`, `megatron/data/blendable_dataset.py`)
  • Implement BuildConcatDataset and DatasetBuilder classes for efficient multi-dataset handling
  • Refactor BlendableDataset for improved performance and flexibility
  • Add support for corpus-specific dataset building and weighting
  • Implement distributed dataset index building for improved efficiency

Add support for Llama2Tokenizer and additional optimizers (`megatron/tokenizer/tokenizer.py`, `megatron/optimizer/__init__.py`)
  • Implement Llama2Tokenizer class
  • Add support for AdamW, SophiaG, and other optimizer variants
  • Implement GaLoreAdamW and GaLoreAdafactor optimizers
  • Add support for 8-bit Adam optimizer

Update build scripts and environment setup for ALCF systems (`ALCF/helpers.sh`, `train_llama_alcf_aurora_qsub.sh`)
  • Add helper functions for setting up the environment on different ALCF systems
  • Implement machine-specific configurations for Aurora, Sunspot, and Polaris
  • Add support for CCL (Collective Communications Library) on Aurora
  • Implement flexible DeepSpeed configuration generation

Improve checkpoint handling and learning rate management (`megatron/checkpointing.py`; a hedged sketch follows this table)
  • Implement saving and loading of learning-rate state
  • Refactor checkpoint loading and saving logic
  • Add support for resuming training from checkpoints with the correct learning rate

Enhance parallel processing and distributed training capabilities (`megatron/model/transformer.py`, `megatron/core/tensor_parallel/cross_entropy.py`)
  • Refactor parallel transformer implementation for improved efficiency
  • Implement more flexible parallel attention mechanisms
  • Add support for different sequence parallelism configurations
  • Improve handling of rotary position embeddings in distributed settings
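
As a hedged illustration of the checkpoint and learning-rate pattern above (save the scheduler state with the checkpoint; optionally override it on resume, as `--override-opt_param-scheduler` does), the sketch below uses illustrative key names and an illustrative `override_lr` argument, not the actual Megatron checkpoint format:

```python
import torch


def save_checkpoint(path, model, optimizer, lr_scheduler, iteration):
    torch.save({
        "iteration": iteration,
        "model": model.state_dict(),
        "optimizer": optimizer.state_dict(),
        # learning-rate state travels with the checkpoint
        "lr_scheduler": lr_scheduler.state_dict(),
    }, path)


def load_checkpoint(path, model, optimizer, lr_scheduler, override_lr=None):
    state = torch.load(path, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    if override_lr is None:
        # default: resume with the schedule stored in the checkpoint
        lr_scheduler.load_state_dict(state["lr_scheduler"])
    else:
        # analogous to LR=... with --override-opt_param-scheduler: use the new LR
        for group in optimizer.param_groups:
            group["lr"] = override_lr
    return state["iteration"]
```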

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time. You can also use
    this command to specify where the summary should be inserted.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help


@sourcery-ai sourcery-ai bot left a comment


Hey @saforem2 - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 8 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 6 issues found
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

```python
first_chunk, rest_chunk = (
    layernorm_output[:first_ns],
    layernorm_output[first_ns:],
)
first_chunk = torch.nn.functional.pad(
```

suggestion: Consider more explicit handling of edge cases instead of padding

While padding is a valid approach, more explicit handling of these edge cases could improve readability and potentially performance. It might be worth exploring alternative approaches.

```python
if input_ids.size(1) < self.chunk_length:
    first_chunk = input_ids
else:
    first_chunk = input_ids[:, :self.chunk_length]
```

```python
        eps=args.adam_eps,
    )

elif args.optimizer.lower() == "galore_adamw":
```

suggestion: Consider refactoring optimizer initialization to reduce code duplication

The current implementation repeats similar initialization code for multiple optimizers. Consider creating a factory function or using a dictionary mapping to initialize optimizers, which would improve maintainability and reduce the likelihood of errors when adding new optimizers.

```python
def get_optimizer(args, model_params):
    optimizer_map = {
        "adam": torch.optim.Adam,
        "adamw": torch.optim.AdamW,
        "sgd": torch.optim.SGD,
        "galore_adamw": GaLoreAdamW if not args.use_8bit else GaLoreAdamW8bit,
    }
    optimizer_cls = optimizer_map.get(args.optimizer.lower())
    if optimizer_cls is None:
        raise ValueError(f"Unsupported optimizer: {args.optimizer}")
    return optimizer_cls(model_params, lr=args.lr, eps=args.adam_eps)
```

```python
        rho = args.sophiag_rho,
        weight_decay=args.weight_decay
    )
else:
```

suggestion: Improve error handling for unsupported optimizers

Instead of raising a generic TypeError, consider creating a custom exception (e.g., UnsupportedOptimizerError) and including the name of the unsupported optimizer in the error message. This would provide more informative error messages and make it easier to catch specific optimization-related errors.

```python
else:
    raise UnsupportedOptimizerError(f"Optimizer '{optimizer_name}' is not supported")


class UnsupportedOptimizerError(Exception):
    pass
```


dlp = Profile("DATASET")
class BlendableDataset(torch.utils.data.Dataset):
@dlp.log

suggestion (performance): Consider the performance impact of extensive logging

While logging is important for debugging and monitoring, excessive logging can impact performance. Consider adding a debug flag to conditionally enable detailed logging, allowing users to balance between performance and verbosity as needed.

Suggested change:

```diff
-@dlp.log
+@dlp.log(condition=__debug__)
```

```python
)


def training_log(
```

suggestion (performance): Optimize logging function to reduce performance overhead

The training_log function performs many operations and writes to multiple logging systems. Consider batching writes to TensorBoard and wandb, and use asynchronous logging where possible to minimize the impact on training performance. Additionally, consider making some of the more expensive logging operations configurable or less frequent.

```python
@dlp.log
async def training_log(
    loss_dict,
    total_loss_dict,
```

Comment on lines 11 to 15

```python
# Use glob to find all files matching the pattern
json_gz_files = glob.glob(search_pattern, recursive=True)

return json_gz_files
```

suggestion (code-quality): Inline variable that is immediately returned (inline-immediately-returned-variable)

Suggested change:

```diff
-# Use glob to find all files matching the pattern
-json_gz_files = glob.glob(search_pattern, recursive=True)
-return json_gz_files
+return glob.glob(search_pattern, recursive=True)
```

Comment on lines 20 to 21
```python
in_list = in_list + " " +str(i)
command = "cat" + in_list + " > " + output_file
```

suggestion (code-quality): Use f-string instead of string concatenation [×5] (use-fstring-for-concatenation)

Suggested change:

```diff
-in_list = in_list + " " +str(i)
-command = "cat" + in_list + " > " + output_file
+in_list = f"{in_list} {str(i)}"
+command = f"cat{in_list} > {output_file}"
```


```python
if vol + val > 4608:
    # add this item to list and reset vol, sublist
    vol = 0
    sublist.append(key)
```

issue (code-quality): We've found these issues:
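
For context, here is a self-contained sketch of the greedy, size-capped grouping that the excerpt above appears to implement; the 4608 threshold comes from the excerpt, while the function and variable names are illustrative:

```python
def pack_greedy(sizes, capacity=4608):
    """Greedily group items so that each group's total size stays at or under capacity.

    `sizes` maps item key -> size; items are taken in insertion order.
    """
    groups, current, vol = [], [], 0
    for key, val in sizes.items():
        if vol + val > capacity and current:
            # current group is full: start a new one
            groups.append(current)
            current, vol = [], 0
        current.append(key)
        vol += val
    if current:
        groups.append(current)
    return groups


print(pack_greedy({"a": 3000, "b": 2000, "c": 1000, "d": 4000}))
# [['a'], ['b', 'c'], ['d']]
```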

@saforem2 (Owner, Author) commented:

@sourcery-ai review

@sourcery-ai sourcery-ai bot left a comment


Hey @saforem2 - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 8 issues found
  • 🟡 Security: 1 issue found
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 7 issues found
  • 🟡 Documentation: 8 issues found

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

```python
elif args.optimizer.lower() == "galore_adamw":
    from galore_torch import GaLoreAdamW, GaLoreAdamW8bit
    # redefine way to call galore_adamw
    optimizer = GaLoreAdamW(param_groups, lr=args.lr, weight_decay=args.weight_decay)
```

suggestion (bug_risk): Error handling for missing GaLoreAdamW import

Consider adding a try-except block to handle potential ImportError if GaLoreAdamW is not available.

```python
try:
    from galore_torch import GaLoreAdamW
except ImportError as exc:
    raise ImportError("GaLoreAdamW is not available. Please install the required package.") from exc
optimizer = GaLoreAdamW(param_groups, lr=args.lr, weight_decay=args.weight_decay)
```

(Note that the guard belongs on the import itself; the constructor call would not raise ImportError.)

```python
log.setLevel(LOG_LEVEL) if RANK == 0 else log.setLevel("CRITICAL")
# --------------------------------------------------------------------------

dlp = Profile("DATASET")
```

suggestion: Document the purpose and impact of the Profile

Add a brief comment explaining what this profiling is measuring and how it affects performance.

```python
# Profile dataset operations for performance analysis
dlp = Profile("DATASET")
```


dlp = Profile("DATASET")
class BlendableDataset(torch.utils.data.Dataset):
@dlp.log

suggestion (performance): Consider the performance impact of frequent logging

Evaluate if this logging is necessary in production or if it should be conditionally enabled for debugging.

```python
@dlp.log_if_debug
```

```diff
@@ -0,0 +1,114 @@
+#!/usr/bin/env python
```

suggestion: Add a module docstring explaining the purpose of this script

A brief description of what this test script does and how to use it would be helpful for maintainers.

```python
#!/usr/bin/env python
"""
Test script for blendable dataset functionality.

This script performs tests on the blendable dataset implementation,
measuring performance and validating correctness of data blending operations.
Usage: Run this script directly to execute all tests.
"""
import time
start_time = time.time()
```

```diff
@@ -0,0 +1,1418 @@
+#!/bin/bash --login
```

suggestion: Consider modularizing the helpers.sh script

This script is over 1400 lines long and covers multiple concerns. Consider splitting it into smaller, focused modules (e.g., mpi_setup.sh, env_config.sh, utility_functions.sh) for better maintainability and readability.

```bash
#!/bin/bash --login

source "${BASH_SOURCE%/*}/mpi_setup.sh"
source "${BASH_SOURCE%/*}/env_config.sh"
source "${BASH_SOURCE%/*}/utility_functions.sh"
```

Comment on lines 541 to 543
```python
raise Exception(
    'Stage must have at least either encoder or decoder'
)
```

issue (code-quality): Raise a specific error instead of the general Exception or BaseException (raise-specific-error)

Explanation: If a piece of code raises a specific exception type rather than the generic [`BaseException`](https://docs.python.org/3/library/exceptions.html#BaseException) or [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception), the calling code can:
  • get more information about what type of error it is
  • define specific exception handling for it

This way, callers of the code can handle the error appropriately.

How can you solve this? Instead of having code raise Exception or BaseException like

```python
if incorrect_input(value):
    raise Exception("The input is incorrect")
```

you can have code raising a specific error like

```python
if incorrect_input(value):
    raise ValueError("The input is incorrect")
```

or

```python
class IncorrectInputError(Exception):
    pass


if incorrect_input(value):
    raise IncorrectInputError("The input is incorrect")
```

```diff
 else:
-    raise Exception("Unsupported layer type, '%s'." %
-                    self.layer_type.name)
+    raise Exception("Unsupported layer type, '%s'." % self.layer_type.name)
```

issue (code-quality): Raise a specific error instead of the general Exception or BaseException (raise-specific-error)


```python
    capturable: bool):

    for i, param in enumerate(params):
        grad = grads[i] if not maximize else -grads[i]
```

suggestion (code-quality): Swap if/else branches of if expression to remove negation (swap-if-expression)

Suggested change:

```diff
-grad = grads[i] if not maximize else -grads[i]
+grad = -grads[i] if maximize else grads[i]
```


Explanation: Negated conditions are more difficult to read than positive ones, so it is best to avoid them where we can. By swapping the if and else branches we can invert the condition and make it positive.

```diff
-raise Exception('dummy timer should not be used to '
-                'calculate elapsed time')
+raise Exception("dummy timer should not be used to " "calculate elapsed time")
```

issue (code-quality): Raise a specific error instead of the general Exception or BaseException (raise-specific-error)


```diff
 else:
-    raise Exception('unknown timing log option {}'.format(
-        self._log_option))
+    raise Exception("unknown timing log option {}".format(self._log_option))
```

issue (code-quality): Raise a specific error instead of the general Exception or BaseException (raise-specific-error)


@saforem2 closed this pull request by merging all changes into main in 33962ee on Nov 15, 2024.
saforem2 added a commit that referenced this pull request Nov 15, 2024