refactor(models): consolidate duplication into shared httpx/observe/spec layers #1092

Open
lyingbug wants to merge 1 commit into Tencent:main from lyingbug:refactor/internal-models-clarity

@lyingbug
Collaborator

Summary

Consolidates the internal/models/ package (79 → 55 files, ~10.3k → ~9.0k LOC) without changing any public interface, wire format, or runtime behavior. Adding a new LLM provider now takes a ~15-line declarative spec entry instead of copy-pasting a ~200-line per-provider file.
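As a rough illustration of what a declarative spec entry could look like, here is a minimal sketch. The `providerSpec` type, its fields, the `specs` table, the `endpoint` helper, and the `example` provider with its URL are all hypothetical and are not taken from this PR; the real table in `providers.go` may be shaped quite differently.

```go
package main

import (
	"fmt"
	"strings"
)

// providerSpec is a hypothetical shape for one table entry.
// The field names are illustrative, not the PR's actual fields.
type providerSpec struct {
	Name       string // provider key used for lookup
	BaseURL    string // default base URL
	Path       string // endpoint path appended to the base URL
	AuthHeader string // header that carries the API key
	AuthPrefix string // prefix placed before the API key
}

// specs is the data table: adding a provider means adding one entry here.
var specs = map[string]providerSpec{
	"example": {
		Name:       "example",
		BaseURL:    "https://api.example.com/v1",
		Path:       "/embeddings",
		AuthHeader: "Authorization",
		AuthPrefix: "Bearer ",
	},
}

// endpoint resolves the request URL for a provider. A non-empty
// baseURLOverride (for example from a per-model config) wins over the default.
func endpoint(provider, baseURLOverride string) (string, error) {
	spec, ok := specs[provider]
	if !ok {
		return "", fmt.Errorf("unknown provider %q", provider)
	}
	base := spec.BaseURL
	if baseURLOverride != "" {
		base = baseURLOverride
	}
	return strings.TrimRight(base, "/") + spec.Path, nil
}

func main() {
	url, err := endpoint("example", "")
	fmt.Println(url, err)
}
```

The point of the pattern is that per-provider differences live in data, so a shared runner can stay provider-agnostic.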

What changed

Shared infrastructure (new)

  • internal/models/internal/httpx/ — one DoPOST with exponential-backoff retry replaces 11 near-identical doRequestWithRetry copies across embedding/rerank providers; a StreamingClient for raw chat streams.
  • internal/models/internal/observe/ — generic Wrap / WrapStream decorator machinery replaces 5 Langfuse wrappers + 5 LLM-debug wrappers.
  • internal/models/internal/modelconfig/ — one FromModel helper factors out the shared types.Model → Config mapping.

Per-sub-package consolidation

  • embedding/: openai.go, jina.go, nvidia.go, azure_openai.go, volcengine.go, aliyun.go (6 files, ~1.2k LOC) → one providers.go declarative spec table + runner.go shared HTTP runner. Ollama and WeKnoraCloud keep their own impls (different transports).
  • rerank/: openai/aliyun/jina/zhipu/nvidia reranker files (5 files, ~660 LOC) → providers.go + runner.go. WeKnoraCloud keeps its signer-based impl.
  • provider/: 25 single-struct files consolidated into providers.go (data table) + urls.go (BaseURL constants) + model_family.go (IsQwen3 / IsDeepSeek / IsLKEAP* detectors). The 4 struct names (OpenAIProvider, AliyunProvider, MiniMaxProvider, ZhipuProvider) that provider_test.go constructs directly are preserved as thin shells.
  • chat/remote_api.go (978 lines) split by responsibility into remote.go (struct + constructor + request builder), remote_nonstream.go (Chat + chatWithRawHTTP), remote_stream.go (ChatStream + streamState + SSE handling), and remote_tools.go (tool-call delta processing + final_answer/thinking streaming).
  • chat/, embedding/, rerank/, vlm/, asr/: each sub-package's langfuse_wrapper.go + llm_debug*.go collapsed into one observe.go using the shared decorator.

Guardrails verified

All externally-referenced symbols are unchanged:

  • Interfaces: chat.Chat, embedding.Embedder/EmbedderPooler, rerank.Reranker, vlm.VLM, asr.ASR.
  • Factories: NewChat, NewEmbedder, NewReranker, NewVLM, NewVLMFromLegacyConfig, NewASR.
  • Configs + ConfigFromModel in each sub-package.
  • DTOs used by handler/application/agent: chat.{Message, ChatOptions, Tool, ToolCall, FunctionCall, FunctionDef}, rerank.RankResult, asr.{Segment, TranscriptionResult}.
  • provider.List(), provider.ListByModelType(), provider.ProviderInfo, all provider.Provider* name constants, provider.WeKnoraCloudBaseURL.
  • utils.ChunkSlice (imported by application/service/graph.go and application/service/retriever/...) kept at its current path.
  • internal/models/utils/ollama.OllamaService (imported by container/, application/service/model.go, application/service/image_multimodal.go, handler/initialization.go) kept at its current path.
  • RemoteAPIChat.BuildChatCompletionRequest (asserted on in remote_api_test.go) kept as a method with the same signature.

Non-goals

  • No behavior change. Every existing test in internal/models/ continues to pass.
  • No wire-format change for any provider.
  • No new features, no removed features, no renamed Config fields.
  • No attempt to merge the WeKnoraCloud signer-based transport into the shared runner — its shape differs enough that forcing unification adds complexity.
  • No rewrite of chat/ollama.go — the Ollama-API coupling is unavoidable and the file is self-contained.

Test plan

  • go build ./... — clean
  • go test ./internal/models/... — all pass (chat / embedding / rerank / vlm / asr / provider)
  • Diffed against upstream main: the 3 pre-existing test failures in internal/application/{repository,service} (SQLite schema mismatch with wiki_config column) are unchanged — unrelated to this refactor
  • Sanity-check a chat completion, embedding, rerank, and VLM call end-to-end against live providers once deployed
  • Confirm Langfuse traces and LLM_DEBUG_LOG output shape match pre-refactor format

…pec layers

Collapses 79 → 55 files and ~10.3k → ~9k LOC without changing any public
interface, wire format, or behavior. Adding a new LLM provider now takes one
declarative spec entry instead of copy-pasting a ~200-line provider file.

Key changes:
- New internal/{httpx,observe,modelconfig} helpers replace 11 copies of
  doRequestWithRetry, 5 Langfuse wrappers, 5 LLM-debug wrappers, and the
  types.Model → Config mapping duplicated across 5 ConfigFromModel functions.
- embedding/: 6 provider files → providers.go + runner.go (declarative spec
  table + shared HTTP runner). Ollama and WeKnoraCloud keep own impls.
- rerank/: 5 provider files → providers.go + runner.go, same pattern.
- provider/: 25 single-struct files → providers.go data table, with the 4
  test-referenced struct names (OpenAIProvider/AliyunProvider/MiniMaxProvider/
  ZhipuProvider) preserved as thin shells.
- chat/remote_api.go (978 lines) split into remote.go + remote_nonstream.go +
  remote_stream.go + remote_tools.go by responsibility.
- Model-family detectors (IsQwen3/IsDeepSeek/IsLKEAP*) consolidated into
  provider/model_family.go; BaseURL constants into provider/urls.go.

Public API unchanged: Chat / Embedder / Reranker / VLM / ASR interfaces,
NewXxx factories, ConfigFromModel, provider.{List,ListByModelType}, and the
utils.ChunkSlice + utils/ollama paths that are imported by application/ and
handler/ all keep their signatures. RemoteAPIChat.BuildChatCompletionRequest
(asserted on in remote_api_test.go) stays a method with the same shape.

Verified: all internal/models tests pass; go build ./... clean; the 3
pre-existing test failures in application/{repository,service} are unchanged
(SQLite schema mismatch, unrelated).