add 'right' option for 'truncation_strategy' #2754

Merged: 1 commit, Jan 1, 2025


Contributor

@zsxm1998 zsxm1998 commented Dec 24, 2024

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

This PR adds a 'right' option to truncation_strategy, which truncates a sample from the right side. This is needed especially when training multimodal large models. For instance, if an image appears at the beginning of a sample and the sequence, including the model's output, exceeds max_length, left-side truncation may cut off the image, so the model loses context that is crucial for generating the result.
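
To illustrate the difference, here is a minimal sketch, not the actual ms-swift implementation. The `truncate` helper, the `IMAGE` placeholder id, and the token values are all hypothetical; the sketch only assumes that 'left' drops tokens from the start of the sequence and 'right' drops them from the end.

```python
from typing import List

IMAGE = -1  # hypothetical placeholder id standing in for image tokens


def truncate(input_ids: List[int], max_length: int, strategy: str) -> List[int]:
    """Hypothetical helper: shorten input_ids to at most max_length tokens."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if len(input_ids) <= max_length:
        return list(input_ids)
    if strategy == "left":
        # Drop tokens from the start; keep the last max_length tokens.
        return input_ids[-max_length:]
    if strategy == "right":
        # Drop tokens from the end; keep the first max_length tokens.
        return input_ids[:max_length]
    raise ValueError(f"unknown strategy: {strategy!r}")


# Image tokens at the start, followed by text and response tokens.
sample = [IMAGE, IMAGE, 10, 11, 12, 13, 14, 15]

left = truncate(sample, 5, "left")    # image tokens are cut off
right = truncate(sample, 5, "right")  # image tokens are preserved

print(left)
print(right)
```

In this toy sample, 'left' removes both image tokens, while 'right' keeps them and drops the tail of the sequence instead.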

Experiment results

No experiment.

@Jintao-Huang
Collaborator

Hello, thanks for your PR.

The delete strategy seems to be able to handle this situation. If a right trim results in an entirely clean response, I believe it will throw an error.

@zsxm1998
Contributor Author

Thank you for your thoughtful response.

However, I would like to clarify that the delete strategy directly removes overly long samples, which may lead to wasted samples. Additionally, I apologize for any confusion, but I did not fully understand your last sentence. Would you be so kind as to provide a more detailed explanation? I greatly appreciate your insights and look forward to further discussion.
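
The point about wasted samples can be sketched as follows. This is an illustration under stated assumptions, not ms-swift code: `apply_strategy`, the toy `dataset`, and the `max_length` of 4 are hypothetical, and 'delete' is assumed to drop any sample longer than max_length.

```python
from typing import List, Optional


def apply_strategy(input_ids: List[int], max_length: int, strategy: str) -> Optional[List[int]]:
    """Hypothetical helper: return the processed sample, or None if it is dropped."""
    if len(input_ids) <= max_length:
        return list(input_ids)
    if strategy == "delete":
        # The overly long sample is discarded entirely.
        return None
    if strategy == "right":
        # The sample is kept, minus the tokens past max_length.
        return input_ids[:max_length]
    raise ValueError(f"unknown strategy: {strategy!r}")


def kept(dataset: List[List[int]], max_length: int, strategy: str) -> List[List[int]]:
    """Apply the strategy to every sample and keep those that survive."""
    processed = (apply_strategy(s, max_length, strategy) for s in dataset)
    return [s for s in processed if s is not None]


dataset = [[1, 2, 3], [4, 5, 6, 7, 8, 9], [10, 11, 12, 13, 14]]

kept_delete = kept(dataset, 4, "delete")  # only the short sample survives
kept_right = kept(dataset, 4, "right")    # all samples survive, shortened

print(len(kept_delete), len(kept_right))
```

With 'delete', two of the three toy samples are lost; with 'right', all three remain available for training in shortened form.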

@Jintao-Huang
Collaborator

I understand. Thanks.

@Jintao-Huang Jintao-Huang merged commit 980119a into modelscope:main Jan 1, 2025
1 of 2 checks passed
tastelikefeet added a commit to tastelikefeet/swift that referenced this pull request Jan 3, 2025
* commit '07f10d2a94e7342413fa7762b6ce6b101b93d130': (86 commits)
  Move optimizer to create_optimizer (modelscope#2851)
  support reward_model (modelscope#2849)
  1. fix hub ignore-pattern (modelscope#2848)
  Fix bugs (modelscope#2838)
  Update base_to_chat shell (modelscope#2833)
  Update padding side (modelscope#2832)
  Fix glm4v suffix (modelscope#2829)
  add 'right' option for 'truncation_strategy' (modelscope#2754)
  update docs (specific model arguments) (modelscope#2822)
  fix enable_cache (modelscope#2813)
  fix citest (modelscope#2812)
  support ZhipuAI/cogagent-9b-20241220 (modelscope#2810)
  fix swift deploy log error (repeat log) (modelscope#2808)
  fix glm4v (modelscope#2806)
  update base_model deploy example (modelscope#2803)
  fix world_size (modelscope#2801)
  fix (modelscope#2800)
  support swift app (modelscope#2792)
  fix some web-ui bugs (modelscope#2794)
  fix stream infer (modelscope#2793)
  ...

# Conflicts:
#	examples/train/multi-gpu/ddp/train.sh
#	swift/llm/__init__.py
#	swift/llm/argument/rlhf_args.py
#	swift/llm/template/base.py
#	swift/llm/template/template_inputs.py
#	swift/llm/template/utils.py
#	swift/llm/train/tuner.py
#	swift/trainers/mixin.py