Skip to content

pixtral SFT #296

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 11, 2025
Merged

pixtral SFT #296

merged 6 commits into from
Jun 11, 2025

Conversation

shruthan
Copy link

✨ Description

Some bug fixes for Image+Text SFTs

πŸ” Type of change

Select all that apply:

  • πŸ› Bug fix (non-breaking change that addresses a specific issue)
  • πŸš€ New feature (non-breaking change that adds functionality)
  • ⚠️ Breaking change (a change that could affect existing functionality)
  • πŸ“ˆ Performance improvement/optimization (improves speed, memory usage, or efficiency)
  • πŸ› οΈ Code refactor (non-functional changes that improve code readability, structure, etc.)
  • πŸ“¦ Dependency bump (updates dependencies, including Dockerfile or package changes)
  • πŸ“ Documentation change (updates documentation, including new content or typo fixes)
  • πŸ”§ Infrastructure/Build change (affects build process, CI/CD, or dependencies)

πŸ“ Changes

List the key changes introduced in this PR:

  1. Change A
  2. Change B

βœ… Checklist

Make sure the following tasks are completed before submitting the PR:

General

  • πŸ“œ I have read and followed the contributing guidelines.
  • 🏷️ I am using a clear and descriptive PR title that summarizes the key change or feature introduced.
  • πŸŽ‰ The functionality is complete, and I have tested the changes.
  • πŸ“ I have updated the documentation if needed.
  • ⚠️ The change does not introduce any new issues (e.g., runtime warnings, type checker errors, linting problems, unhandled edge cases).
  • 🧩 I have commented my code, especially in hard-to-understand areas.

Dependencies and Configuration

  • πŸ‹ I have updated the Docker configuration or dependencies, if applicable.
  • πŸ”„ I have ensured compatibility with the existing setup after dependency changes.

Testing

  • πŸ§ͺ I have added or updated tests to cover my changes.
  • βœ”οΈ New and existing tests pass locally with my changes.
  • 🚦 I have tested these changes on GPUs and verified training stability.
  • πŸ‹οΈ I have tested the changes on realistic training workloads, if applicable.

Performance Impact

  • πŸ“Š I have run benchmarks where applicable to evaluate the performance impact.
  • βœ… The benchmarks show no performance regression.
  • πŸš€ The benchmarks indicate a potential performance improvement.
  • ⚠️ The benchmarks indicate a potential performance degradation.
  • πŸ“ˆ I have provided benchmark results and detailed any performance impact below, if applicable.

πŸ“Š Performance Impact Details

If there is any impact on performance, describe it and provide benchmark results, if applicable:


πŸ—’οΈ Additional Notes

Include any additional context, information, or considerations here, such as known issues, follow-up tasks, or backward compatibility concerns.

@shruthan shruthan requested a review from sohamparikh June 11, 2025 17:46
Comment on lines +61 to +62
# not sure of assignment, reading flag to indicate whether preference loss-masking spans are present
self._has_preference_spans = struct.unpack("<B", stream.read(1))[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's already read above, why read it here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's another flag written after images, this is to read that, but not sure what the assignment should be

# Placeholder flag for preference spans
idx_stream.write(struct.pack("<B", 0))
# Flag to indicate whether images are present
idx_stream.write(struct.pack("<B", 1 if total_images > 0 else 0))
# Flag to indicate whether preference loss-masking spans are present
idx_stream.write(struct.pack("<B", 1 if chosen_spans.size > 0 and rejected_spans.size > 0 else 0))

Copy link
Member

@sohamparikh sohamparikh Jun 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yeah that order should be flipped, the chosen_spans byte should be before total_images, i'll fix it in my branch

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would break files with version==3 right?

It seems to me that we should rather fix the order in which those flags are dumped in the idx file below

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RaymondLi0 yes, I'm planning to fix it in #227

Comment on lines 217 to 219
total_pixels_needed = sum(
length[0] * length[1] * 3 for length in self._image_lengths[idx]
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
total_pixels_needed = sum(
length[0] * length[1] * 3 for length in self._image_lengths[idx]
)
total_pixels_needed = self._image_lengths[idx].prod(initial=3, axis=1).sum()

offset=self._pointers[idx] + self._document_sizes[idx] * np.dtype(self._dtype).itemsize,
)
images = []
start = 0
for image_length in self._image_lengths[idx]:
n_pixels = image_length.prod(initial=3)
n_pixels = image_length[0] * image_length[1] * 3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can leave it as using .prod?

@@ -549,7 +549,7 @@ def __getitem__(self, index: int) -> typing.Any:
use_loss_masking_spans=self._parameters.use_loss_masking_spans,
)
start_pos = 0
if sample.image_positions:
if sample.image_positions is not None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use a bool has_images = bool(sample.image_positions) and use it below as well?

Copy link
Author

@shruthan shruthan Jun 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think bool(sample.image_positions) will throw ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() when there are more than one image_positions: bool(np.array([2, 3]))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or simply has_images = True if sample.image_positions else False?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if sample.image_positions would throw the same error, changed it to has_image_positions = sample.image_positions is not None

Copy link
Member

@sohamparikh sohamparikh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks!

@sohamparikh sohamparikh merged commit 275fefa into soham/pixtral-support Jun 11, 2025
@sohamparikh sohamparikh deleted the shruthan/pixtral-sft branch June 11, 2025 19:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants