
Commit 38c3e69

Merge pull request #15 from jparkerweb/develop

2.7.0

2 parents e556ae2 + 0d52e37 · commit 38c3e69

20 files changed: +1,289 −993 lines

.example.env

Lines changed: 1 addition & 2 deletions
````diff
@@ -9,5 +9,4 @@ AWS_SECRET_ACCESS_KEY = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
 # == LLM PARAMS ==
 # ================
 LLM_MAX_GEN_TOKENS = 800
-LLM_TEMPERATURE = 0.1
-LLM_TOP_P = 0.9
+LLM_TEMPERATURE = 0.2
````

CHANGELOG.md

Lines changed: 55 additions & 25 deletions
````diff
@@ -1,8 +1,38 @@
 # Changelog
 All notable changes to this project will be documented in this file.
 
+## [2.7.0] - 2025-11-18 (DeepSeek & Qwen 3)
+### ✨ Added
+- Support for DeepSeek foundation models
+  - DeepSeek-R1 (reasoning model with chain-of-thought capabilities, 8K max output tokens)
+  - DeepSeek-V3.1 (hybrid thinking mode for complex reasoning, 8K max output tokens, **Converse API only**)
+- Support for Qwen 3 foundation models
+  - Qwen3-32B (dense architecture, 32K max output tokens)
+  - Qwen3-Coder-30B-A3B (MoE architecture for code generation, 32K max output tokens)
+  - Qwen3-235B-A22B-2507 (MoE architecture for general reasoning, 32K max output tokens)
+  - Qwen3-Coder-480B-A35B (MoE architecture for advanced software engineering, 32K max output tokens)
+- Reasoning content extraction for DeepSeek-R1 via `reasoningContent.reasoningText`
+- Stop sequences support (max 10 items) for DeepSeek and Qwen models
+- Text-to-text completion with streaming support
+- MIT-licensed open weight models for commercial use (DeepSeek)
+- `converse_api_only` flag for models that only support Converse API (automatically forces `useConverseAPI = true`)
+- Long-context handling support for Qwen 3 (up to 256K tokens natively, 1M with extrapolation)
+- Hybrid thinking modes for complex problem-solving vs. fast responses
+- Repository-scale code analysis capabilities for Qwen Coder models
+
+### 🤬 Breaking Changes
+- Removed `top_p` parameter from all models as it is not fully supported by AWS Bedrock
+  - `temperature` should always be used instead
+
+### ⚙️ Technical Details
+- **Model Configuration**: All new models use messages API format (OpenAI-compatible)
+- **API Compatibility**:
+  - Qwen 3 models: Support both Invoke API and Converse API
+  - DeepSeek-R1: Supports both Invoke API and Converse API
+  - DeepSeek-V3.1: Converse API only (automatically enforced)
+
 ## [2.6.2] - 2025-10-16 (Claude Haiku 4.5)
-### Added
+### ✨ Added
 - Support for Claude Haiku 4.5 models
   - Claude-4-5-Haiku
   - Claude-4-5-Haiku-Thinking
````
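The `converse_api_only` flag introduced in 2.7.0 amounts to a pre-flight override. A minimal sketch, assuming a model config shaped like the entries in `bedrock-models.js` — the helper name `resolveApiOptions` is illustrative, not the wrapper's actual internals:

```javascript
// Minimal sketch of the converse_api_only behavior described above.
// resolveApiOptions is a hypothetical helper, not bedrock-wrapper's real code.
function resolveApiOptions(modelConfig, options) {
    // Models flagged converse_api_only (e.g. DeepSeek-V3.1) must go through
    // the Converse API, so the caller's preference is overridden.
    if (modelConfig.converse_api_only) {
        return { ...options, useConverseAPI: true };
    }
    return options;
}

const deepseekV31 = { modelName: "DeepSeek-V3.1", converse_api_only: true };
resolveApiOptions(deepseekV31, { useConverseAPI: false });
// -> { useConverseAPI: true }
```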
````diff
@@ -12,24 +42,24 @@ All notable changes to this project will be documented in this file.
 - Temperature/Top-P mutual exclusion parameter handling for Haiku 4.5 models
 
 ## [2.6.1] - 2025-09-30 (Claude Sonnet 4.5)
-### Added
+### ✨ Added
 - Support for Claude Sonnet 4.5 models
   - Claude-4-5-Sonnet
   - Claude-4-5-Sonnet-Thinking
 
 ## [2.5.0] - 2025-08-12 (Converse API)
-### Added
+### ✨ Added
 - Support for Converse API (streaming and non-streaming)
 
-### Technical Details
+### ⚙️ Technical Details
 - **Model Configuration**: All models use standard messages API format
 - **API Compatibility**: Supports OpenAI-style requests
 - **Response Processing**: Automatic reasoning tag handling based on model variant
 - **Streaming Fallback**: Automatic detection and fallback to non-streaming for unsupported models
 - **Testing Coverage**: Full integration with existing test suites and interactive example
 
 ## [2.4.5] - 2025-08-06 (GPT-OSS Models)
-### Added
+### ✨ Added
 - Support for OpenAI GPT-OSS models on AWS Bedrock
   - GPT-OSS-120B (120B parameter open weight model)
   - GPT-OSS-20B (20B parameter open weight model)
@@ -41,21 +71,21 @@ All notable changes to this project will be documented in this file.
 - Non-streaming support for GPT-OSS models (streaming not supported by AWS Bedrock)
 - OpenAI-compatible API format with `max_completion_tokens` parameter
 
-### Technical Details
+### ⚙️ Technical Details
 - **Model Configuration**: All GPT-OSS models use standard messages API format
 - **API Compatibility**: Supports OpenAI-style requests with Apache 2.0 licensed models
 - **Response Processing**: Automatic reasoning tag handling based on model variant
 - **Streaming Fallback**: Automatic detection and fallback to non-streaming for unsupported models
 - **Testing Coverage**: Full integration with existing test suites and interactive example
 
 ## [2.4.4] - 2025-08-05 (Claude 4.1 Opus)
-### Added
+### ✨ Added
 - Support for Claude 4.1 Opus models
   - Claude-4-1-Opus
   - Claude-4-1-Opus-Thinking
 
 ## [2.4.3] - 2025-07-31 (Stop Sequences Fixes)
-### Fixed
+### 🛠️ Fixed
 - **Critical Discovery**: Removed stop sequences support from Llama models
   - AWS Bedrock does not support stop sequences for Llama models (confirmed via official AWS documentation)
   - Llama models only support: `prompt`, `temperature`, `top_p`, `max_gen_len`, `images`
@@ -64,7 +94,7 @@ All notable changes to this project will be documented in this file.
 - Removed conflicting empty `inferenceConfig: {}` from Nova model configurations
 - Improved error handling for empty responses when stop sequences trigger early
 
-### Updated
+### 📝 Updated
 - **Documentation corrections**
   - Corrected stop sequences support claims (removed "all models support" language)
   - Added accurate model-specific support matrix with sequence limits
@@ -75,30 +105,30 @@ All notable changes to this project will be documented in this file.
   - ✅ Mistral models: Full support (up to 10 sequences)
   - ❌ Llama models: Not supported (AWS Bedrock limitation)
 
-### Technical Details
+### ⚙️ Technical Details
 - Based on comprehensive research of official AWS Bedrock documentation
 - All changes maintain full backward compatibility
 - Test results show significant improvements in stop sequences reliability for supported models
 - Added detailed explanations to help users understand AWS Bedrock's actual capabilities
 
 ## [2.4.2] - 2025-07-31 (Stop Sequences Support)
-### Added
+### ✨ Added
 - Stop sequences support for compatible models
   - OpenAI-compatible `stop` and `stop_sequences` parameters
   - Automatic string-to-array conversion for compatibility
   - Model-specific parameter mapping (stop_sequences for Claude, stopSequences for Nova, stop for Mistral)
 - Enhanced request building logic to include stop sequences in appropriate API formats
 - Comprehensive stop sequences testing and validation with `npm run test-stop`
 
-### Fixed
+### 🛠️ Fixed
 - **Critical Discovery**: Removed stop sequences support from Llama models
   - AWS Bedrock does not support stop sequences for Llama models (confirmed via official documentation)
   - Llama models only support: `prompt`, `temperature`, `top_p`, `max_gen_len`, `images`
   - This is an AWS Bedrock limitation, not a wrapper limitation
 - Fixed Nova model configuration conflicts that were causing stop sequence inconsistencies
 - Improved error handling for empty responses when stop sequences trigger early
 
-### Technical Details
+### ⚙️ Technical Details
 - **Model Support Matrix**:
   - ✅ Claude models: Full support (up to 8,191 sequences)
   - ✅ Nova models: Full support (up to 4 sequences)
@@ -110,7 +140,7 @@ All notable changes to this project will be documented in this file.
 - Added comprehensive documentation in README.md and CLAUDE.md explaining support limitations
 
 ## [2.4.0] - 2025-07-24 (AWS Nova Models)
-### Added
+### ✨ Added
 - Support for AWS Nova models
   - Nova-Pro (300K context, multimodal, 5K output tokens)
   - Nova-Lite (300K context, multimodal, optimized for speed)
@@ -120,15 +150,15 @@ All notable changes to this project will be documented in this file.
 - Automatic content array formatting for Nova message compatibility
 
 ## [2.3.1] - 2025-05-22 (Claude 4 Opus / Sonnet)
-### Added
+### ✨ Added
 - Support for Claude 4 Opus & Claude 4 Sonnet models
   - Claude-4-Opus
   - Claude-4-Opus-Thinking
   - Claude-4-Sonnet
   - Claude-4-Sonnet-Thinking
 
 ## [2.3.0] - 2025-02-15 (Claude 3.7 & Image Support)
-### Added
+### ✨ Added
 - Support for Claude 3.7 models
   - Claude-3-7-Sonnet
   - Claude-3-7-Sonnet-Thinking
@@ -140,20 +170,20 @@ All notable changes to this project will be documented in this file.
 - Enhanced message handling for multimodal content
 - Documentation for image support usage
 
-### Changed
+### 🔄 Changed
 - Updated model configuration for image-capable models
 - Improved response handling for multimodal inputs
 
 ## [2.2.0] - 2025-01-01 (Llama 3.3 70b)
-### Added
+### ✨ Added
 - Support for Llama 3.3 70b
 
 ## [2.1.0] - 2024-11-21 (Claude 3.5 Haiku)
-### Added
+### ✨ Added
 - Support for Claude 3.5 Haiku
 
 ## [2.0.0] - 2024-10-31 (Claude Sonnet & Haiku)
-### Added
+### ✨ Added
 - Support for Anthropic Sonnet & Haiku models
   - Claude-3-5-Sonnet-v2
   - Claude-3-5-Sonnet
@@ -163,37 +193,37 @@ All notable changes to this project will be documented in this file.
 - Standardize output to be a string via Streamed and non-Streamed responses
 > **NOTE:** This is a breaking change for previous non-streaming responses. Existing streaming responses will remain unchanged.
 
-### Changed
+### 🔄 Changed
 - Complete architecture overhaul for better model support
 - Improved message handling with role-based formatting
 - Enhanced error handling and response processing
 - Standardized model configuration format
 - Updated AWS SDK integration
 
-### Technical Details
+### ⚙️ Technical Details
 - Implemented messages API support for compatible models
 - Added system message handling as separate field where supported
 - Configurable token limits per model
 - Flexible response parsing with chunk/non-chunk handling
 - Cross-region profile support for certain models
 
 ## [1.3.0] - 2024-07-24 (Llama3.2)
-### Added
+### ✨ Added
 - Support for Llama 3.2 series models
   - Llama-3-2-1b
   - Llama-3-2-3b
   - Llama-3-2-11b
   - Llama-3-2-90b
 
 ## [1.1.0] - 2024-07-24 (Llama3.1)
-### Added
+### ✨ Added
 - Support for Llama 3.1 series models
   - Llama-3-1-8b
   - Llama-3-1-70b
 
 
 ## [1.0.14] - 2024-05-06 (Initial Stable Release)
-### Added
+### ✨ Added
 - Initial stable release of Bedrock Wrapper
 - Basic AWS Bedrock integration
 - OpenAI-compatible API object support
````
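The 2.7.0 entry's `reasoningContent.reasoningText` path matches the Converse API response shape documented in CLAUDE.md below. A hedged sketch of extracting DeepSeek-R1 reasoning text (the function name is illustrative):

```javascript
// Sketch only: pull DeepSeek-R1 reasoning text out of a Converse API response
// via the reasoningContent.reasoningText path mentioned in the 2.7.0 notes.
function extractReasoningText(converseResponse) {
    const blocks = converseResponse?.output?.message?.content ?? [];
    for (const block of blocks) {
        const text = block?.reasoningContent?.reasoningText?.text;
        if (text) return text; // first reasoning block wins
    }
    return null; // model returned no reasoning content
}
```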

CLAUDE.md

Lines changed: 11 additions & 5 deletions
````diff
@@ -4,7 +4,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 ## Project Overview
 
-Bedrock Wrapper (v2.5.0) is an npm package that translates OpenAI-compatible API objects to AWS Bedrock's serverless inference LLMs. It supports 32+ models including Claude, Nova, GPT-OSS, Llama, and Mistral families with features like vision support, thinking modes, and stop sequences.
+Bedrock Wrapper (v2.6.2) is an npm package that translates OpenAI-compatible API objects to AWS Bedrock's serverless inference LLMs. It supports 40 models including Claude, Nova, GPT-OSS, Llama, Mistral, and Qwen families with features like vision support, thinking modes, and stop sequences.
 
 ## Core Architecture
 
````
````diff
@@ -82,8 +82,7 @@ AWS_REGION=us-west-2
 AWS_ACCESS_KEY_ID=your_access_key
 AWS_SECRET_ACCESS_KEY=your_secret_key
 LLM_MAX_GEN_TOKENS=1024
-LLM_TEMPERATURE=0.1
-LLM_TOP_P=0.9
+LLM_TEMPERATURE=0.2
 ```
 
 ## Adding New Models
````
````diff
@@ -92,19 +91,25 @@ Required fields in bedrock-models.js:
 - `modelName`: Consumer-facing name
 - `modelId`: AWS Bedrock identifier
 - `vision`: Boolean for image support
-- `messages_api`: Boolean (true for Claude/Nova/GPT-OSS, false for prompt-based)
+- `messages_api`: Boolean (true for Claude/Nova/GPT-OSS/Qwen, false for prompt-based)
 - `response_chunk_element`: JSON path for streaming responses
 - `response_nonchunk_element`: JSON path for non-streaming responses
 - `special_request_schema`: Model-specific requirements
 - `stop_sequences_param_name`: Parameter name for stop sequences
 
````
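Taken together, a new entry might look like the sketch below. Every value here is illustrative (the modelId and JSON paths are made up); real entries live in bedrock-models.js:

```javascript
// Hypothetical entry showing the field shapes only — not a real model.
const exampleModel = {
    modelName: "Example-Model",                   // consumer-facing name
    modelId: "vendor.example-model-v1:0",         // AWS Bedrock identifier (made up)
    vision: false,                                // no image support
    messages_api: true,                           // messages format, like Claude/Nova/GPT-OSS/Qwen
    response_chunk_element: "delta.text",         // streaming JSON path (illustrative)
    response_nonchunk_element: "content[0].text", // non-streaming JSON path (illustrative)
    special_request_schema: {},                   // model-specific requirements
    stop_sequences_param_name: "stop_sequences",  // parameter name for stop sequences
};
```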
````diff
 ## Critical Implementation Details
 
+### Converse API Only Models
+Some models only support the Converse API and will automatically use it regardless of the `useConverseAPI` flag:
+- DeepSeek-V3.1
+
+These models have `converse_api_only: true` in their configuration and the wrapper automatically forces `useConverseAPI = true` for them.
+
 ### Converse API Thinking Support
 - Thinking configuration added via `additionalModelRequestFields`
 - Response thinking data extracted from `reasoningContent.reasoningText.text`
 - Budget tokens calculated with constraints: 1024 <= budget_tokens <= (maxTokens * 0.8)
-- Temperature forced to 1.0, top_p removed for thinking models
+- Temperature forced to 1.0 for thinking models
 
 ### Nova Models Special Handling
 - Detect via `special_request_schema.schemaVersion === "messages-v1"`
````
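The budget-token constraint above reduces to a clamp. A minimal sketch, assuming the requested budget and max token count are plain numbers (the function name and precedence of the lower bound are assumptions):

```javascript
// Sketch of the constraint 1024 <= budget_tokens <= (maxTokens * 0.8).
function clampBudgetTokens(requestedBudget, maxTokens) {
    const upper = Math.floor(maxTokens * 0.8);
    // Note: if maxTokens * 0.8 < 1024 the two bounds conflict; applying the
    // 1024 floor last, as done here, is an assumption.
    return Math.max(1024, Math.min(requestedBudget, upper));
}

clampBudgetTokens(512, 8000);   // -> 1024 (raised to the floor)
clampBudgetTokens(10000, 8000); // -> 6400 (capped at 80% of maxTokens)
```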
````diff
@@ -124,6 +129,7 @@ Required fields in bedrock-models.js:
 | Nova | ✅ | stopSequences | 4 |
 | GPT-OSS | ✅ | stop_sequences | TBD |
 | Mistral | ✅ | stop | 10 |
+| Qwen | ✅ | stop | TBD |
 | Llama | ❌ | N/A | N/A |
 
 ### Test Files Output
````
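The table above can be summarized as a lookup. A sketch — the object name is illustrative, and the Claude row is taken from the matching README/changelog matrix rather than this hunk:

```javascript
// Illustrative lookup of stop-sequence parameter names per model family.
// Llama maps to null: AWS Bedrock does not support stop sequences there.
const stopParamByFamily = {
    Claude: "stop_sequences",    // up to 8,191 sequences
    Nova: "stopSequences",       // up to 4 sequences
    "GPT-OSS": "stop_sequences", // limit TBD
    Mistral: "stop",             // up to 10 sequences
    Qwen: "stop",                // limit TBD
    Llama: null,
};
```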

README.md

Lines changed: 8 additions & 25 deletions
````diff
@@ -43,7 +43,6 @@ Bedrock Wrapper is an npm package that simplifies the integration of existing Op
     "max_tokens": LLM_MAX_GEN_TOKENS,
     "stream": true,
     "temperature": LLM_TEMPERATURE,
-    "top_p": LLM_TOP_P,
     "stop_sequences": ["STOP", "END"], // Optional: sequences that will stop generation
 };
 ```
````
````diff
@@ -158,7 +157,11 @@ Bedrock Wrapper is an npm package that simplifies the integration of existing Op
 | Mistral-7b | mistral.mistral-7b-instruct-v0:2 | ❌ |
 | Mixtral-8x7b | mistral.mixtral-8x7b-instruct-v0:1 | ❌ |
 | Mistral-Large | mistral.mistral-large-2402-v1:0 | ❌ |
-
+| Qwen3-32B | alibaba.qwen3-32b-instruct-v1:0 | ❌ |
+| Qwen3-Coder-30B-A3B | alibaba.qwen3-coder-30b-a3b-instruct-v1:0 | ❌ |
+| Qwen3-235B-A22B-2507 | alibaba.qwen3-235b-a22b-instruct-2507-v1:0 | ❌ |
+| Qwen3-Coder-480B-A35B | alibaba.qwen3-coder-480b-a35b-instruct-v1:0 | ❌ |
+
 To return the list programmatically you can import and call `listBedrockWrapperSupportedModels`:
 ```javascript
 import { listBedrockWrapperSupportedModels } from 'bedrock-wrapper';
````
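The README snippet continues past this hunk; a hedged usage sketch, assuming the export is async and resolves to the supported-model list:

```javascript
// Usage sketch — assumes listBedrockWrapperSupportedModels() returns a promise
// resolving to the supported-model list (now including the Qwen 3 entries).
import { listBedrockWrapperSupportedModels } from 'bedrock-wrapper';

const supportedModels = await listBedrockWrapperSupportedModels();
console.log(supportedModels);
```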
````diff
@@ -235,9 +238,10 @@ const openaiChatCompletionsCreateObject = {
 
 **Model Support:**
 - **Claude models**: Fully supported (up to 8,191 sequences)
-- **Nova models**: Fully supported (up to 4 sequences)
+- **Nova models**: Fully supported (up to 4 sequences)
 - **GPT-OSS models**: Fully supported
 - **Mistral models**: Fully supported (up to 10 sequences)
+- **Qwen models**: Fully supported
 - **Llama models**: Not supported (AWS Bedrock limitation)
 
 **Features:**
````
````diff
@@ -251,7 +255,7 @@ const openaiChatCompletionsCreateObject = {
 // Stop generation when model tries to output "7"
 const result = await bedrockWrapper(awsCreds, {
     messages: [{ role: "user", content: "Count from 1 to 10" }],
-    model: "Claude-3-5-Sonnet", // Use Claude, Nova, or Mistral models
+    model: "Claude-3-5-Sonnet", // Use Claude, Nova, Mistral, or Qwen models
     stop_sequences: ["7"]
 });
 // Response: "1, 2, 3, 4, 5, 6," (stops before "7")
````
````diff
@@ -274,27 +278,6 @@ Some AWS Bedrock models have specific parameter restrictions that are automatica
 - Claude-4-Opus & Claude-4-Opus-Thinking
 - Claude-4-1-Opus & Claude-4-1-Opus-Thinking
 
-**Restriction:** These models cannot accept both `temperature` and `top_p` parameters simultaneously.
-
-**Automatic Handling:** When both parameters are provided, the wrapper automatically:
-1. **Keeps `temperature`** (prioritized as more commonly used)
-2. **Removes `top_p`** to prevent validation errors
-3. **Works with both APIs** (Invoke API and Converse API)
-
-```javascript
-const request = {
-    messages: [{ role: "user", content: "Hello" }],
-    model: "Claude-4-5-Sonnet",
-    temperature: 0.7, // ✅ Kept
-    top_p: 0.9 // ❌ Automatically removed
-};
-
-// No error thrown - wrapper handles the restriction automatically
-const response = await bedrockWrapper(awsCreds, request);
-```
-
-**Why This Happens:** AWS Bedrock enforces this restriction on newer Claude models to ensure optimal performance and prevent conflicting sampling parameters.
-
 ---
 
 ### 🧪 Testing
````
