
[12.x] Binary File Size Validation Support #56223


Open: wants to merge 3 commits into base: 12.x

yitzwillroth (Contributor) commented Jul 6, 2025

💡 Description

This PR introduces hybrid file size validation. The unit system (binary or international) is detected automatically from the size suffix, and instance methods provide a fallback for values without a suffix. This resolves the long-standing confusion between 1024-based and 1000-based file size calculations in Laravel applications.

Problem

Laravel's file validation exclusively uses international units (1000-based), creating confusion and validation failures when developers expect binary units (1024-based) that match operating system file displays.

Industry Standards Conflict:

  • Binary Units (1024-based): Operating systems, memory allocation, developer tools, file managers
  • International Units (1000-based): Storage manufacturers, web applications, Laravel (current)

Real-World Impact:

  • A file the OS displays as "1.0 MB" (1,048,576 bytes) fails File::default()->max('1MB') validation, which allows only 1,000,000 bytes
  • Migration difficulties from systems using binary calculations
  • Inconsistent behavior across Laravel vs system tools
  • Developer confusion requiring manual size conversions
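
The first bullet comes down to simple arithmetic. The sketch below uses plain PHP rather than the File rule, and it takes the description's figure of 1,000,000 bytes for '1MB' at face value. The constant names are illustrative assumptions, not part of Laravel.

```php
<?php

// Illustrative constants only; these are not part of Laravel's API.
const BYTES_PER_MB = 1000 * 1000;  // SI megabyte: the limit the description attributes to max('1MB')
const BYTES_PER_MIB = 1024 * 1024; // IEC mebibyte: what many file managers label "MB"

// A file that a binary-unit file manager reports as "1.0 MB".
$fileSizeInBytes = 1 * BYTES_PER_MIB;

// Number of bytes by which the file exceeds the 1,000,000-byte limit.
$excessBytes = $fileSizeInBytes - BYTES_PER_MB;

// Whether the file would pass a 1,000,000-byte maximum.
$passesLimit = $fileSizeInBytes <= BYTES_PER_MB;
```

Under these assumptions the file is 48,576 bytes over the limit, roughly 4.9%, which is enough to reject an upload the user believes is exactly at the limit.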

Solution

Hybrid Suffix Detection automatically determines the unit system from IEC (binary) or SI (international) suffixes, with instance method fallbacks for naked numeric values.

// Automatic binary detection (IEC standard)
File::default()->max('1MiB');  // 1,048,576 bytes 
File::default()->max('5GiB');  // 5,368,709,120 bytes

// Automatic international detection (SI standard)  
File::default()->max('1MB');   // 1,000,000 bytes
File::default()->max('5GB');   // 5,000,000,000 bytes

// Instance methods for naked values
File::default()->binary()->max(1024);        // Binary interpretation
File::default()->international()->max(1000); // International interpretation

// Mixed constraints for complex scenarios
File::default()->min('1MB')->max('2MiB');    // 1,000,000 to 2,097,152 bytes

Precedence Logic:

  1. Suffix Detection (highest) - KiB/MiB/GiB/TiB forces Binary, KB/MB/GB/TB forces International
  2. Instance Setting (medium) - binary() or international() method calls
  3. Framework Default (lowest) - International Units (maintains backward compatibility)
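
The three precedence levels can be sketched as a standalone function. This is a minimal illustration, not the PR's code: `resolveUnits`, `UNITS_BINARY`, and `UNITS_INTERNATIONAL` are hypothetical names, and the instance setting is modelled as a nullable argument.

```php
<?php

// Hypothetical constants and helper, named here for illustration only.
const UNITS_BINARY = 'binary';
const UNITS_INTERNATIONAL = 'international';

function resolveUnits(string|int|float $size, ?string $instanceUnits = null): string
{
    $normalized = strtolower(trim((string) $size));

    // 1. A recognized suffix always wins, whatever the instance setting says.
    if (preg_match('/[kmgt]ib$/', $normalized) === 1) {
        return UNITS_BINARY;
    }

    if (preg_match('/[kmgt]b$/', $normalized) === 1) {
        return UNITS_INTERNATIONAL;
    }

    // 2. No suffix: use the instance setting when one was chosen.
    // 3. Otherwise fall back to the framework default (international).
    return $instanceUnits ?? UNITS_INTERNATIONAL;
}
```

For example, `resolveUnits('1MiB', UNITS_INTERNATIONAL)` yields `'binary'` because the suffix outranks the instance setting, while `resolveUnits(1024, UNITS_BINARY)` yields `'binary'` only because no suffix is present.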

⚙️ Usage Examples

Real-World Scenarios

// File upload matching OS display (binary)
File::default()->max('10MiB');  // Matches "10.0 MB" shown in file manager

// Storage limit matching disk specs (international)  
File::default()->max('5GB');    // Matches "5GB" drive capacity

// Professional document workflow
File::default()
    ->types(['pdf', 'docx'])
    ->between('100KiB', '50MiB');  // 102,400 to 52,428,800 bytes

// Image optimization pipeline
File::default()
    ->types(['jpg', 'png', 'webp'])
    ->max('2MB')               // 2,000,000 bytes (web standard)
    ->extensions(['jpg', 'png', 'webp']);

// Video processing with mixed constraints
File::default()
    ->types(['mp4', 'mov'])
    ->min('10MB')              // 10,000,000 bytes minimum file size
    ->max('2GiB');             // 2,147,483,648 bytes maximum (binary)

Migration Support

// Legacy systems using binary calculations
File::default()->binary()          // Set binary as instance default
    ->types(['backup', 'archive'])
    ->min(1024)                     // 1024 bytes (binary interpretation)
    ->max(524288);                  // 512 KiB (binary interpretation)

// Modern cloud storage (international)
File::default()->international()   // Set international as instance default  
    ->types(['media'])
    ->between(1000, 100000);        // 1KB to 100KB (international)

Backward Compatibility

// ALL EXISTING CODE WORKS UNCHANGED
File::default()->max('1MB');           // Still: 1,000,000 bytes
File::default()->between(1000, 5000);  // Still: raw byte values  
File::default()->size('500KB');        // Still: 500,000 bytes

🏗️ Technical Implementation

Core Architecture

Intelligent Suffix Detection:

protected function detectUnits(string $size): string
{
    return match (true) {
        is_numeric($size) => $this->units,  // Fallback to instance setting
        in_array(strtolower(substr(trim($size), -3)), ['kib', 'mib', 'gib', 'tib']) => self::BINARY,
        in_array(strtolower(substr(trim($size), -2)), ['kb', 'mb', 'gb', 'tb']) => self::INTERNATIONAL,
        default => $this->units,  // Fallback for unrecognized
    };
}

Multiplier Methods:

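protected function getBinaryMultiplier(string $size): int
{
    // Normalize the same way detectUnits() does, so "1MiB" and " 1mib " both match.
    // Multipliers are relative to one KiB, hence 'kib' => 1.
    return match (strtolower(substr(trim($size), -3))) {
        'kib' => 1, 'mib' => 1_024, 'gib' => 1_048_576, 'tib' => 1_073_741_824,
        default => throw new InvalidArgumentException('Invalid file size suffix.'),
    };
}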
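
The description shows only the binary multipliers. By analogy, an international counterpart might look like the sketch below. It is an assumption written as a free function with a hypothetical name (`internationalMultiplier`), with multipliers expressed relative to one KB in the same way the binary version is relative to one KiB.

```php
<?php

// Hypothetical counterpart to getBinaryMultiplier(); not taken from the PR.
function internationalMultiplier(string $size): int
{
    // Trim and lowercase so that " 5GB " and "5gb" resolve to the same suffix.
    $suffix = strtolower(substr(trim($size), -2));

    // Each step up the scale multiplies by 1000 rather than 1024.
    return match ($suffix) {
        'kb' => 1,
        'mb' => 1_000,
        'gb' => 1_000_000,
        'tb' => 1_000_000_000,
        default => throw new InvalidArgumentException('Invalid file size suffix.'),
    };
}
```

Keeping the largest multiplier at 1,000,000,000 means every case still fits in a 32-bit integer.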

Mathematical Precision

Integer Precision Preservation:

protected function prepareValueForPrecision(float $value): float|int
{
    return $value > PHP_INT_MAX || $value < PHP_INT_MIN || ((float) (int) $value) !== $value
        ? $value      // Keep as float for precision or overflow
        : (int) $value; // Convert to int for exact arithmetic
}

Overflow Protection:

protected function protectValueFromOverflow(float|int $value, int $multiplier): float|int
{
    return $value > PHP_INT_MAX / $multiplier || $value < PHP_INT_MIN / $multiplier || is_float($value)
        ? (float) $value * $multiplier  // Use float arithmetic to prevent overflow
        : (int) $value * $multiplier;   // Use integer arithmetic for precision
}
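
To see how the two steps interact, here is a standalone variant written as free functions. It is a sketch under assumptions, not the PR's implementation: the names `toPreciseNumber` and `multiplySafely` are hypothetical, the multiplier is assumed to be positive, and the range checks are slightly more conservative than the methods above.

```php
<?php

// Hypothetical helper: use an int when the float holds an exact, in-range whole number.
function toPreciseNumber(float $value): float|int
{
    // Stay in float when the value is at or beyond the int range or has a fractional part.
    if ($value >= PHP_INT_MAX || $value <= PHP_INT_MIN || floor($value) !== $value) {
        return $value;
    }

    return (int) $value;
}

// Hypothetical helper: multiply in integer arithmetic unless that would overflow.
// Assumes $multiplier is a positive integer.
function multiplySafely(float|int $value, int $multiplier): float|int
{
    if (is_float($value) || abs($value) > intdiv(PHP_INT_MAX, $multiplier)) {
        return (float) $value * $multiplier; // float result, no integer overflow
    }

    return $value * $multiplier; // exact integer result
}
```

With these definitions, a whole-number input such as `2.0` is multiplied exactly as an integer, while a fractional input such as `1.5` stays in floating point throughout.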

Enhanced Input Parsing:

  • Comma-separated values: "1,024MB", "2,048KiB"
  • Whitespace handling: " 2GB ", "1.5 MB "
  • Case insensitive: mb, MB, Mb, mB
  • Fractional sizes: "0.5MiB", "1.5GB"
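
The parsing rules listed above could be combined roughly as follows. This is a minimal sketch, not the PR's parser: `parseSizeString` is a hypothetical name, and it assumes a comma is always a thousands separator, which would not hold for locales that use a decimal comma.

```php
<?php

// Hypothetical parser illustrating the four input rules; not the PR's implementation.
function parseSizeString(string $size): array
{
    // Trim the ends, drop thousands separators and inner spaces, and lowercase the suffix.
    $normalized = strtolower(str_replace([',', ' '], '', trim($size)));

    // Accept a non-negative whole or fractional number followed by a known suffix.
    $pattern = '/^(\d+(?:\.\d+)?)(kib|mib|gib|tib|kb|mb|gb|tb)$/';

    if (preg_match($pattern, $normalized, $matches) !== 1) {
        throw new InvalidArgumentException("Unrecognized file size: {$size}");
    }

    return ['value' => (float) $matches[1], 'suffix' => $matches[2]];
}
```

Because the pattern has no place for a sign, an input such as "-1MB" is rejected rather than parsed.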

Supported Units

Binary (IEC) | Bytes             | International (SI) | Bytes
KiB          | 1,024             | KB                 | 1,000
MiB          | 1,048,576         | MB                 | 1,000,000
GiB          | 1,073,741,824     | GB                 | 1,000,000,000
TiB          | 1,099,511,627,776 | TB                 | 1,000,000,000,000

🪵 Changelog

File | Changes
src/Illuminate/Validation/Rules/File.php | Added hybrid suffix detection, simplified method signatures, enhanced mathematical precision, made constants protected, updated docblocks
tests/Validation/ValidationFileRuleTest.php | Added 30 comprehensive tests: suffix precedence validation, binary/international unit tests, precision arithmetic tests, backward compatibility verification

♻️ Backward Compatibility

Zero breaking changes - purely additive enhancement:

  • All existing code unchanged - Current validation rules continue working identically
  • Default behavior preserved - International units remain default for naked values
  • Method signatures simplified - Removed optional parameters for cleaner API
  • Progressive enhancement - New binary support available when needed

// BEFORE & AFTER - identical behavior
File::default()->max('1MB');           // 1,000,000 bytes
File::default()->between(1000, 5000);  // Raw byte validation
File::default()->binary()->max(1024);  // Instance method still works

// NEW - binary suffix support  
File::default()->max('1MiB');          // 1,048,576 bytes (binary)

✅ Testing

Added 30 new tests covering comprehensive scenarios:

Hybrid Functionality

  • Suffix precedence - Confirms MiB/GiB override instance settings
  • Naked value fallback - Instance methods work for numeric values
  • Mixed unit constraints - Complex international + binary combinations
  • All binary suffixes - KiB, MiB, GiB, TiB validation

Mathematical Precision

  • Integer precision - Exact arithmetic for whole numbers
  • Overflow protection - Large values on 32-bit systems
  • Fractional calculations - 0.5MiB, 1.5GB decimal handling
  • Boundary testing - Files at exact size limits

Input Processing

  • Case insensitive - mb, MB, Mb, mB all supported
  • Whitespace tolerance - " 2GB ", "1.5 MB " handled correctly
  • Comma parsing - "1,024MB" with locale awareness
  • Error validation - Invalid suffixes and negative values rejected

Framework Integration

  • Fluent chaining - All method combinations work seamlessly
  • Backward compatibility - Existing validation rules unchanged
  • Production scenarios - Real-world file size boundaries tested
  • Cross-platform - 32-bit and 64-bit system compatibility


💁🏼‍♂️ The author is available for hire -- inquire at yitzwillroth@gmail.com.


github-actions bot commented Jul 6, 2025

Thanks for submitting a PR!

Note that draft PR's are not reviewed. If you would like a review, please mark your pull request as ready for review in the GitHub user interface.

Pull requests that are abandoned in draft may be closed due to inactivity.

@yitzwillroth yitzwillroth marked this pull request as ready for review July 6, 2025 01:02
@yitzwillroth yitzwillroth force-pushed the add-file-validation-rules branch from 287d154 to 48777b8 Compare July 6, 2025 14:16
@yitzwillroth yitzwillroth marked this pull request as draft July 6, 2025 17:47
@yitzwillroth yitzwillroth force-pushed the add-file-validation-rules branch 2 times, most recently from 090d399 to f0cf58b Compare July 7, 2025 04:23
@yitzwillroth yitzwillroth marked this pull request as ready for review July 7, 2025 04:29
@yitzwillroth yitzwillroth force-pushed the add-file-validation-rules branch from f0cf58b to 914b354 Compare July 7, 2025 13:02
@yitzwillroth yitzwillroth requested a review from bert-w July 7, 2025 13:15
taylorotwell (Member) commented

To me it would be better to just use MB and MiB, etc.

@taylorotwell taylorotwell marked this pull request as draft July 7, 2025 14:51
yitzwillroth (Contributor Author) commented Jul 7, 2025

@taylorotwell Updated!

@yitzwillroth yitzwillroth marked this pull request as ready for review July 7, 2025 18:05
@yitzwillroth yitzwillroth force-pushed the add-file-validation-rules branch from 4429142 to a032838 Compare July 9, 2025 11:20
shaedrich (Contributor) left a comment

Maybe, we should embrace Unit Value Objects:

File::default()->max(new readonly class {
    public function __construct(
        public int $size,
        public BinaryUnit|InternationalUnit $unit = BinaryUnit::Bytes,
    ) {}
});
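
One way the value object idea could be fleshed out is sketched below. Everything here is an assumption for illustration: the enum cases, the `FileSize` class, and its `toBytes()` method are hypothetical, the File rule does not accept such an object, and the code needs PHP 8.2 or later for readonly classes.

```php
<?php

// Hypothetical unit enums; each case is backed by its size in bytes.
enum BinaryUnit: int
{
    case Bytes = 1;
    case KiB = 1_024;
    case MiB = 1_048_576;
    case GiB = 1_073_741_824;
}

enum InternationalUnit: int
{
    case Bytes = 1;
    case KB = 1_000;
    case MB = 1_000_000;
    case GB = 1_000_000_000;
}

// Hypothetical value object pairing an amount with an explicit unit.
final readonly class FileSize
{
    public function __construct(
        public int $size,
        public BinaryUnit|InternationalUnit $unit = BinaryUnit::Bytes,
    ) {}

    // Convert to bytes using the unit's backing value.
    public function toBytes(): int
    {
        return $this->size * $this->unit->value;
    }
}
```

With this shape the unit travels with the number, so `new FileSize(2, BinaryUnit::MiB)` and `new FileSize(2, InternationalUnit::MB)` cannot be confused at the call site.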

4 participants