
[Gap Analysis] Implement In-House Model Serving Framework #308

@ooples

Description


User Story

As a developer who has trained a model with AiDotNet, I want to easily deploy it as a high-performance, production-ready REST API, so that I can integrate its predictions into my applications without needing a complex external serving solution.


Phase 1: Core Server and Model Management

Goal: Create the foundational web server project and the API endpoints for dynamically loading and managing models.

AC 1.1: Create the Serving Project (2 points)

Requirement: Set up a new ASP.NET Core Web API project for the server.

  • Create a new C# project: src/Serving/AiDotNet.Serving.
  • The project must be configured as an ASP.NET Core Web API.
  • Add necessary project references to the core AiDotNet library.

AC 1.2: Implement Model Repository (3 points)

Requirement: Create a thread-safe, singleton service to store and manage loaded models.

  • Create a new file: src/Serving/ModelRepository.cs.
  • Define a public class ModelRepository<T> as a singleton service.
  • It must contain a private ConcurrentDictionary<string, IModel<T>> to store the loaded models, mapping a model name to the model instance.
  • Implement methods: AddModel(string name, IModel<T> model), GetModel(string name), and RemoveModel(string name).
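A minimal sketch of the repository described above. `IModel<T>` here is a stand-in declaration for the core library's actual model interface, and the exact method signatures are assumptions:

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for AiDotNet's real model interface.
public interface IModel<T> { }

public class ModelRepository<T>
{
    // Thread-safe map from model name to loaded model instance.
    private readonly ConcurrentDictionary<string, IModel<T>> _models = new();

    public void AddModel(string name, IModel<T> model) => _models[name] = model;

    public IModel<T>? GetModel(string name) =>
        _models.TryGetValue(name, out var model) ? model : null;

    public bool RemoveModel(string name) => _models.TryRemove(name, out _);

    // Used by GET /models to list loaded model names.
    public IReadOnlyCollection<string> ListModels() => _models.Keys.ToArray();
}
```

The singleton lifetime would be established at registration time, e.g. `builder.Services.AddSingleton<ModelRepository<double>>();` in `Program.cs`.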

AC 1.3: Create Model Management API (5 points)

Requirement: Create an ASP.NET controller with endpoints to manage the models in the repository.

  • Create a new controller: src/Serving/Controllers/ModelsController.cs.
  • POST /models Endpoint:
    • This endpoint will accept a request body containing a model_name and a model_path.
    • It must load a saved AiDotNet model from the model_path using the library's existing deserialization logic.
    • It will then add the loaded model to the ModelRepository with the given name.
  • GET /models Endpoint:
    • This endpoint will return a list of the names of all currently loaded models from the ModelRepository.
  • DELETE /models/{model_name} Endpoint:
    • This endpoint will remove the specified model from the ModelRepository.
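The three endpoints above could take roughly this shape. `ModelSerializer.Load` is a placeholder for the library's existing deserialization entry point, and `double` is used as the concrete numeric type for illustration:

```csharp
using Microsoft.AspNetCore.Mvc;

public record LoadModelRequest(string ModelName, string ModelPath);

[ApiController]
[Route("models")]
public class ModelsController : ControllerBase
{
    private readonly ModelRepository<double> _repository;

    public ModelsController(ModelRepository<double> repository) =>
        _repository = repository;

    // POST /models — load a saved model from disk and register it by name.
    [HttpPost]
    public IActionResult LoadModel([FromBody] LoadModelRequest request)
    {
        var model = ModelSerializer.Load<double>(request.ModelPath); // placeholder call
        _repository.AddModel(request.ModelName, model);
        return Ok();
    }

    // GET /models — list the names of all currently loaded models.
    [HttpGet]
    public IActionResult ListModels() => Ok(_repository.ListModels());

    // DELETE /models/{modelName} — unload the named model.
    [HttpDelete("{modelName}")]
    public IActionResult RemoveModel(string modelName) =>
        _repository.RemoveModel(modelName) ? NoContent() : NotFound();
}
```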

Phase 2: High-Performance Inference Endpoint

Goal: Implement the /predict endpoint with dynamic request batching to maximize throughput.

AC 2.1: Implement Request Batching Service (13 points)

Requirement: Create the core service that collects, batches, and processes inference requests.

  • Create a new file: src/Serving/RequestBatcher.cs.
  • Define a public class RequestBatcher<T> as a singleton service.
  • It will contain a ConcurrentQueue to hold incoming requests. Each item in the queue will be a tuple containing the request data and a TaskCompletionSource to signal when the result is ready.
  • Background Worker: The RequestBatcher will start a background task (Task.Run) in its constructor that runs an infinite loop.
  • Batching Logic (inside the loop):
    1. await Task.Delay(10); (the batching window; the delay should be configurable).
    2. Dequeue all currently pending requests from the queue.
    3. If there are requests, collate their individual input data into a single, large batch tensor.
    4. Run the batch tensor through the appropriate model (this requires passing the model to the batcher).
    5. De-collate the model's output tensor back into individual results.
    6. For each original request, use its TaskCompletionSource to set its result, which unblocks the waiting HTTP request.
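The loop above can be sketched as follows. This is a simplified shape using flat arrays instead of the library's tensor type, and a `Func` delegate standing in for the model's batched forward pass; both are assumptions for illustration:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class RequestBatcher<T>
{
    private readonly ConcurrentQueue<(T[] Input, TaskCompletionSource<T[]> Completion)> _queue = new();
    private readonly Func<T[][], T[][]> _runBatch; // stand-in for the model's batched forward pass
    private readonly int _windowMs;

    public RequestBatcher(Func<T[][], T[][]> runBatch, int windowMs = 10)
    {
        _runBatch = runBatch;
        _windowMs = windowMs;
        _ = Task.Run(ProcessLoopAsync); // background worker started in the constructor
    }

    // Called by the /predict endpoint; completes when the batch containing
    // this request has been processed.
    public Task<T[]> EnqueueAsync(T[] input)
    {
        var tcs = new TaskCompletionSource<T[]>(TaskCreationOptions.RunContinuationsAsynchronously);
        _queue.Enqueue((input, tcs));
        return tcs.Task;
    }

    private async Task ProcessLoopAsync()
    {
        while (true)
        {
            await Task.Delay(_windowMs); // the batching window

            var pending = new List<(T[] Input, TaskCompletionSource<T[]> Completion)>();
            while (_queue.TryDequeue(out var item)) pending.Add(item);
            if (pending.Count == 0) continue;

            try
            {
                // Collate inputs into one batch, run one forward pass, de-collate.
                var batch = pending.Select(p => p.Input).ToArray();
                var results = _runBatch(batch);
                for (var i = 0; i < pending.Count; i++)
                    pending[i].Completion.SetResult(results[i]);
            }
            catch (Exception ex)
            {
                // Propagate failures to every waiting HTTP request in the batch.
                foreach (var (_, tcs) in pending) tcs.TrySetException(ex);
            }
        }
    }
}
```

`TaskCreationOptions.RunContinuationsAsynchronously` matters here: without it, `SetResult` would run the awaiting controller's continuation synchronously on the batcher's background thread.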

AC 2.2: Create the /predict Endpoint (5 points)

Requirement: Create the public-facing API endpoint that users will call for inference.

  • Create a new controller: src/Serving/Controllers/InferenceController.cs.
  • Implement a POST /predict/{model_name} endpoint.
  • Endpoint Logic:
    • This method will not call the model directly.
    • It will create a TaskCompletionSource.
    • It will add the request data and the TaskCompletionSource to the RequestBatcher's queue.
    • It will then await the TaskCompletionSource.Task to get the result.
    • Once the result is available (set by the batcher), it will be returned as the HTTP response.
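A sketch of the endpoint logic, assuming the `RequestBatcher` shape described in AC 2.1. For brevity it injects a single batcher and ignores per-model routing; a real implementation would look up (or create) one batcher per loaded model:

```csharp
using Microsoft.AspNetCore.Mvc;
using System.Threading.Tasks;

[ApiController]
[Route("predict")]
public class InferenceController : ControllerBase
{
    private readonly RequestBatcher<double> _batcher;

    public InferenceController(RequestBatcher<double> batcher) => _batcher = batcher;

    // POST /predict/{modelName}
    [HttpPost("{modelName}")]
    public async Task<IActionResult> Predict(string modelName, [FromBody] double[] input)
    {
        // The controller never calls the model directly; it enqueues the
        // request and awaits the TaskCompletionSource set by the batcher.
        var result = await _batcher.EnqueueAsync(input);
        return Ok(result);
    }
}
```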

Phase 3: Configuration and Testing

Goal: Make the server configurable and verify its functionality, especially the batching mechanism.

AC 3.1: Add Configuration (3 points)

Requirement: Allow server settings to be configured via appsettings.json.

  • In the appsettings.json file, add a section for ServingSettings.
  • Add configuration options for Port, BatchingWindowMilliseconds, and an array of ModelsToLoadOnStartup (each with a name and path).
  • The server must load these settings on startup.
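An illustrative shape for the `appsettings.json` section (key names here follow the requirement but the exact schema is an assumption):

```json
{
  "ServingSettings": {
    "Port": 5000,
    "BatchingWindowMilliseconds": 10,
    "ModelsToLoadOnStartup": [
      { "Name": "my-model", "Path": "models/my-model.bin" }
    ]
  }
}
```

This would typically be bound to a `ServingSettings` options class via `builder.Services.Configure<ServingSettings>(builder.Configuration.GetSection("ServingSettings"));`.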

AC 3.2: Integration Test (8 points)

Requirement: Create an end-to-end test that proves the server works and that batching is effective.

  • In a test project, use WebApplicationFactory to host the server in-memory.
  • Test Logic:
    1. Create and save a simple AiDotNet model that can be loaded by the server.
    2. Use an HttpClient to call the POST /models endpoint to load the model.
    3. Create a list of 10 concurrent tasks, where each task sends a unique request to the POST /predict/{model_name} endpoint.
    4. Run all 10 tasks concurrently using Task.WhenAll().
    5. Assert that all 10 tasks complete successfully and that each received its correct, corresponding response.
    6. (Advanced) Use a mock model to verify that its Forward method was called only once with a batch size of 10, proving that the dynamic batching worked correctly.
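Steps 2–5 above could be sketched as an xUnit test. This assumes the server's `Program` class is visible to the test project and that a test model file exists at the given path; the mock-model assertion in step 6 is omitted:

```csharp
using System.Linq;
using System.Net.Http.Json;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc.Testing;
using Xunit;

public class ServingTests : IClassFixture<WebApplicationFactory<Program>>
{
    private readonly System.Net.Http.HttpClient _client;

    public ServingTests(WebApplicationFactory<Program> factory) =>
        _client = factory.CreateClient();

    [Fact]
    public async Task ConcurrentPredictionsAllSucceed()
    {
        // Load the previously saved test model via the management API.
        var load = await _client.PostAsJsonAsync("/models",
            new { ModelName = "test-model", ModelPath = "test-model.bin" });
        load.EnsureSuccessStatusCode();

        // Fire 10 unique prediction requests concurrently.
        var tasks = Enumerable.Range(0, 10)
            .Select(i => _client.PostAsJsonAsync("/predict/test-model", new[] { (double)i }))
            .ToArray();
        var responses = await Task.WhenAll(tasks);

        // All 10 must succeed; correctness of each body would be asserted
        // against the known model output for its input.
        Assert.All(responses, r => r.EnsureSuccessStatusCode());
    }
}
```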

Definition of Done

  • All checklist items are complete.
  • A new AiDotNet.Serving project is created.
  • A user can start the server, load a model via a REST API call, and get predictions from it.
  • The server correctly batches concurrent requests into a single model execution.
  • Integration tests verify the end-to-end functionality.

⚠️ CRITICAL ARCHITECTURAL REQUIREMENTS

Before implementing this user story, you MUST review:

Mandatory Implementation Checklist

1. INumericOperations Usage (CRITICAL)

  • Include protected static readonly INumericOperations<T> NumOps = MathHelper.GetNumericOperations<T>(); in base class
  • NEVER hardcode double, float, or specific numeric types - use generic T
  • NEVER use default(T) - use NumOps.Zero instead
  • Use NumOps.Zero, NumOps.One, NumOps.FromDouble() for values
  • Use NumOps.Add(), NumOps.Multiply(), etc. for arithmetic
  • Use NumOps.LessThan(), NumOps.GreaterThan(), etc. for comparisons
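A small illustration of the required pattern. `INumericOperations<T>` and `MathHelper` come from the AiDotNet core library; the base-class name and `Sum` helper are hypothetical:

```csharp
public abstract class ServingComponentBase<T>
{
    // Required pattern: obtain numeric operations once, statically.
    protected static readonly INumericOperations<T> NumOps =
        MathHelper.GetNumericOperations<T>();

    protected T Sum(System.Collections.Generic.IEnumerable<T> values)
    {
        var total = NumOps.Zero;            // never default(T)
        foreach (var v in values)
            total = NumOps.Add(total, v);   // never '+' directly on T
        return total;
    }
}
```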

2. Inheritance Pattern (REQUIRED)

  • Create I{FeatureName}.cs in src/Interfaces/ (root level, NOT subfolders)
  • Create {FeatureName}Base.cs in src/{FeatureArea}/ inheriting from interface
  • Create concrete classes inheriting from Base class (NOT directly from interface)

3. PredictionModelBuilder Integration (REQUIRED)

  • Add private field: private I{FeatureName}<T>? _{featureName}; to PredictionModelBuilder.cs
  • Add Configure method taking ONLY interface (no parameters):
    public IPredictionModelBuilder<T, TInput, TOutput> Configure{FeatureName}(I{FeatureName}<T> {featureName})
    {
        _{featureName} = {featureName};
        return this;
    }
  • Use feature in Build() with default: var {featureName} = _{featureName} ?? new Default{FeatureName}<T>();
  • Verify feature is ACTUALLY USED in execution flow

4. Beginner-Friendly Defaults (REQUIRED)

  • Constructor parameters with defaults from research/industry standards
  • Document WHY each default was chosen (cite papers/standards)
  • Validate parameters and throw ArgumentException for invalid values

5. Property Initialization (CRITICAL)

  • NEVER use default! operator
  • String properties: = string.Empty;
  • Collections: = new List<T>(); or = new Vector<T>(0);
  • Numeric properties: appropriate default or NumOps.Zero

6. Class Organization (REQUIRED)

  • One class/enum/interface per file
  • ALL interfaces in src/Interfaces/ (root level)
  • Namespace mirrors folder structure (e.g., src/Regularization/namespace AiDotNet.Regularization)

7. Documentation (REQUIRED)

  • XML documentation for all public members
  • <b>For Beginners:</b> sections with analogies and examples
  • Document all <param>, <returns>, <exception> tags
  • Explain default value choices

8. Testing (REQUIRED)

  • Minimum 80% code coverage
  • Test with multiple numeric types (double, float)
  • Test default values are applied correctly
  • Test edge cases and exceptions
  • Integration tests for PredictionModelBuilder usage

⚠️ Failure to follow these requirements will result in repeated implementation mistakes and PR rejections.

See full details: .github/USER_STORY_ARCHITECTURAL_REQUIREMENTS.md
