forked from sergts/botnet-traffic-analysis
Closed
Labels
- archive: Related to archiving old research code
- technical-debt: Technical debt and code quality
Description
Overview
Create new branch archive-2020-fixed from archive-2020-research to fix ONLY the critical bugs while keeping original 2020 dependencies.
Goal
Show what the 2020 results SHOULD have been if data leakage and bugs were caught during research.
Branch Strategy
- Source: archive-2020-research
- Target: New branch archive-2020-fixed
- Dependencies: Keep all original (Python 3.9, TF 2.10, Pandas 1.3.5)
- Changes: Minimal - fix bugs only
Critical Fixes Needed
1. Data Leakage in Scaler (Issue #13)
Files: anomaly-detection/train_og.py, anomaly-detection/test.py
# WRONG (current)
scaler.fit(x_train.append(x_opt))
# CORRECT (fix)
scaler.fit(x_train)
2. Deprecated pandas.append() (Issue #15)
Replace all uses with pd.concat() to prepare for Pandas 2.0.
Pattern:
# WRONG
df = df.append(other_df, ignore_index=True)
# CORRECT
df = pd.concat([df, other_df], ignore_index=True)
3. Mixed Keras Imports (Issue #16)
# WRONG
from keras.models import load_model
# CORRECT
from tensorflow.keras.models import load_model
Testing Plan
- Run classification training on subset
- Run anomaly detection training
- Compare results to original (should be LOWER accuracy due to leakage fix)
- Document findings in RETROSPECTIVE.md
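The accuracy drop expected in the comparison step follows from the scaler no longer seeing the optimization split. Below is a minimal sketch of that effect. It assumes the scaler is scikit-learn's `StandardScaler` (the issue only shows `scaler.fit(...)`), and the toy frames are made-up stand-ins for the real `x_train` and `x_opt`.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler  # assumption: scaler type is not stated in the issue

# Toy stand-ins for the real splits; values are illustrative only.
x_train = pd.DataFrame({"f": [0.0, 1.0, 2.0, 3.0]})
x_opt = pd.DataFrame({"f": [10.0, 12.0]})

# Leaky variant: statistics are computed over train + opt.
# pd.concat is used instead of DataFrame.append so the sketch runs on any pandas version.
leaky = StandardScaler().fit(pd.concat([x_train, x_opt]))

# Fixed variant: statistics come from the training split only.
fixed = StandardScaler().fit(x_train)

# The same train-only scaler is then applied to every split.
x_train_scaled = fixed.transform(x_train)
x_opt_scaled = fixed.transform(x_opt)

print("leaky mean:", leaky.mean_[0])
print("fixed mean:", fixed.mean_[0])
```

The two fitted means differ because the leaky scaler has absorbed information from `x_opt`. Anything tuned on the scaled optimization split then looks better than it would on truly unseen data.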
Out of Scope
- No modernization of dependencies
- No rewrite of FL code
- No refactoring for code quality
- Keep all original quirks/style
Dependencies
- Environment: Use existing botnet-archive-2020 conda env
- Blocks: Future testing and validation
Acceptance Criteria
- Branch created from archive-2020-research
- Data leakage fixed
- pandas.append() replaced
- Keras imports unified
- Tests run successfully
- Results documented and compared to original
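One possible way to check the "pandas.append() replaced" and "Keras imports unified" criteria is a small text scan over the branch. The sketch below is a heuristic, not part of the repository. The helper names `scan_source` and `scan_tree` are hypothetical, and the `.append(` pattern also matches ordinary `list.append` calls, so every hit needs manual review.

```python
import re
from pathlib import Path

# Heuristic patterns; `.append(` cannot tell DataFrame.append from list.append.
PATTERNS = {
    "append": re.compile(r"\.append\("),
    "bare-keras": re.compile(r"^\s*(from|import)\s+keras\b"),
}

def scan_source(text):
    """Return (line_number, pattern_name) pairs for lines that match a pattern."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits

def scan_tree(root):
    """Scan every .py file under root; return {path: hits} for files with hits."""
    report = {}
    for path in sorted(Path(root).rglob("*.py")):
        hits = scan_source(path.read_text(encoding="utf-8", errors="replace"))
        if hits:
            report[str(path)] = hits
    return report
```

Running `scan_tree("anomaly-detection")` from the repository root (the directory name comes from the file paths listed under fix 1) returns the candidate lines. A report with no `bare-keras` entries and no remaining DataFrame `append` hits would support ticking off both criteria.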