refactor VL modules for internvl and qwen2-vl #2764

lvhan028 · 2024-11-17T15:29:19Z

No description provided.

* refactor VL modules for internvl and qwen2-vl (#2764)
* qwen2-vl
* internvl
* qwen2
* Refactor VL modules for glm4v, deepseek-vl, llava-hf, cogvlm (#2772)
* qwen2-vl
* internvl
* qwen2
* get image_tokens_per_patch for internvl2
* deepseek-vl
* cogvlm
* glm4v
* update internvl
* internvl_llava
* llava
* glm4v
* upate internvl
* cogvlm
* deepseek
* llava_hf
* rollback llava, internvl-llava
* Refactor VL modules for qwen-vl, llava and llava_next (#2773)
* qwen2-vl
* internvl
* qwen2
* get image_tokens_per_patch for internvl2
* deepseek-vl
* cogvlm
* glm4v
* update internvl
* internvl_llava
* llava
* glm4v
* upate internvl
* cogvlm
* deepseek
* llava_hf
* rollback llava, internvl-llava
* refactor qwen
* update internvl
* update llava_hf
* update qwen2-vl
* llava_next
* update llava_next
* update llava
* update llava
* update llava
* Refactor VL modules for qwen2-vl (#2777)
* qwen2-vl
* internvl
* qwen2
* get image_tokens_per_patch for internvl2
* deepseek-vl
* cogvlm
* glm4v
* update internvl
* internvl_llava
* llava
* glm4v
* upate internvl
* cogvlm
* deepseek
* llava_hf
* rollback llava, internvl-llava
* refactor qwen
* update internvl
* update llava_hf
* update qwen2-vl
* llava_next
* update llava_next
* update llava
* update llava
* update llava
* qwen2
* Fix side-effect to internvl (#2778)
* qwen2-vl
* internvl
* qwen2
* get image_tokens_per_patch for internvl2
* deepseek-vl
* cogvlm
* glm4v
* update internvl
* internvl_llava
* llava
* glm4v
* upate internvl
* cogvlm
* deepseek
* llava_hf
* rollback llava, internvl-llava
* refactor qwen
* update internvl
* update llava_hf
* update qwen2-vl
* llava_next
* update llava_next
* update llava
* update llava
* update llava
* qwen2
* fix internvl
* Refactor VL modules for phi3-vision (#2779)
* qwen2-vl
* internvl
* qwen2
* get image_tokens_per_patch for internvl2
* deepseek-vl
* cogvlm
* glm4v
* update internvl
* internvl_llava
* llava
* glm4v
* upate internvl
* cogvlm
* deepseek
* llava_hf
* rollback llava, internvl-llava
* refactor qwen
* update internvl
* update llava_hf
* update qwen2-vl
* llava_next
* update llava_next
* update llava
* update llava
* update llava
* qwen2
* fix internvl
* phi3-vision
* Refactor VL modules for mllama and yi-vl (#2781)
* qwen2-vl
* internvl
* qwen2
* get image_tokens_per_patch for internvl2
* deepseek-vl
* cogvlm
* glm4v
* update internvl
* internvl_llava
* llava
* glm4v
* upate internvl
* cogvlm
* deepseek
* llava_hf
* rollback llava, internvl-llava
* refactor qwen
* update internvl
* update llava_hf
* update qwen2-vl
* llava_next
* update llava_next
* update llava
* update llava
* update llava
* qwen2
* fix internvl
* phi3-vision
* refactor yi-vl
* refactor mllama
* Refactor VLM module for minicpm and molmo (#2794)
* Refactor VLM modules for xcomposer series (#2796)
* Refactor VLM modules for internvl-llava (#2797)
* Refactor VLM modules v2 (#2806)
* internvl2 v2
* cogvlm
* deepseek-vl
* glm-4v
* llava-hf
* llava-next
* llava
* internvl-llava
* mllama
* phi3-vision
* qwen
* qwen2
* yi-vl
* xcomposer
* minicpm
* molmo
* update
* update
* Remove vl template (#2809)
* Resolve conflicts (#2811)
* feature: support qwen2.5 fuction_call (#2737)
* feat: support qwen2.5 tools_call
* fix: npe bug
* fix: inconsistent template
* fix: adopting review suggestions
* fix: adopting review suggestions
* fix: adopting review suggestions
* fix: adopting review suggestions
* feat: Support multi tools calling
* feat: Support multi tools calling
* fix: Add '\n' between each tool
* fix: Add ensure_ascii=False
* bugfix: rfind
* bugfix: tools_call -> tool_calls
* bugfix: add toolName in tool_response
* fix: some '\n' error
* fix: remove toolname
* fix: replace '\n' to self.separator
* feat: add doc with multiple tool calling
* fix: update doc
* feat: add qwen2.5 prompt template test
* feat: add qwen2.5 no tool call prompt test

---------

Co-authored-by: gaozixiang <gaozixiang1@xiaomi.com>

* Update supported models & Ascend doc (#2765)
* update ascend supported model list
* fix markdown
* fix markdown
* fix lint
* Update get_started.md
* Update get_started.md
* [CI] Split vl testcases into turbomind and pytorch backend (#2751)
* updaet
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* [Feature] support minicpm-v_2_6 for pytorch engine. (#2767)
* support minicpmv_2_6.
* update supported_models.
* update supported_models.
* Support qwen2-vl AWQ quantization (#2787)
* Support qwen2-vl AWQ quantization
* Update config.yaml

---------

Co-authored-by: zhulinJulia24 <145004780+zhulinJulia24@users.noreply.github.com>

* [dlinfer] Fix qwenvl rope error for dlinfer backend (#2795)
* Optimize update_step_ctx on Ascend (#2804)
* opt update_ctx for ascend
* fix lint

---------

Co-authored-by: 逝夜长歌 <928926035@qq.com>
Co-authored-by: gaozixiang <gaozixiang1@xiaomi.com>
Co-authored-by: jinminxi104 <jinminxi104@hotmail.com>
Co-authored-by: zhulinJulia24 <145004780+zhulinJulia24@users.noreply.github.com>
Co-authored-by: zhoushenglong <87467364+Reinerzhou@users.noreply.github.com>
Co-authored-by: AllentDan <41138331+AllentDan@users.noreply.github.com>
Co-authored-by: Wei Tao <1136862851@qq.com>

* PytorchEngine refactor multimodal (#2742)
* WIP
* support mrope
* support long context
* support causal=false
* fix mask
* flash attn bound
* optimize
* Moskau, Moskau, wirf die Gläser an die Wand
* YMCA
* optimize mllama
* update processor
* support cogvlm
* all work and no play make jack a dull boy
* upgrade triton
* support qwen2vl
* support internvl
* phi3-v WIP
* glm4v WIP
* support chatglm and cogvlm
* use image tokens
* support llava
* support internvl-mono
* phi3v, mllama
* add llavanext
* use img token ids
* support multiimage chatglm cogvlm
* fix ut
* minor-fix
* minor-fix (#2813)
* fix
* fix mono
* fix docs
* read norm_type
* super().collect_images->self.collect_images
* add note in supported models
* define the parameters clearly
* better streaming
* fix molmo
* Fix vision model batch inference (#2868)
* remove forward from vl models that are not supported by tm
* support max_batch_size
* fix
* warn glm4v does not support multi images
* unconst
* fix deepseek-vl
* fix internvl
* fix llava
* fix minicpm 2.6
* fix callback
* fix minicpm v2.5
* fix minicpm v2.6
* update llava_next.py
* remove hardcode from xcomposer2.py
* rollback supported_models
* change to staticmethod
* fix vlm quantization
* update doc
* update

---------

Co-authored-by: q yao <streetyao@live.com>

qwen2-vl

c40a8ae

lvhan028 added the WIP label Nov 17, 2024

lvhan028 marked this pull request as draft November 17, 2024 15:30

lvhan028 added 2 commits November 18, 2024 18:38

internvl

e24b303

qwen2

dcc454b

lvhan028 changed the base branch from main to refactor-vl November 18, 2024 13:27

lvhan028 removed the WIP label Nov 18, 2024

lvhan028 marked this pull request as ready for review November 18, 2024 13:27

lvhan028 changed the title ~~refactor VL modules~~ refactor VL modules for internvl and qwen2-vl Nov 18, 2024

lvhan028 merged commit 464d451 into InternLM:refactor-vl Nov 18, 2024
2 of 5 checks passed
