Description
User Story
As a developer who has trained a model with AiDotNet, I want to easily deploy it as a high-performance, production-ready REST API, so that I can integrate its predictions into my applications without needing a complex external serving solution.
Phase 1: Core Server and Model Management
Goal: Create the foundational web server project and the API endpoints for dynamically loading and managing models.
AC 1.1: Create the Serving Project (2 points)
Requirement: Set up a new ASP.NET Core Web API project for the server.
- Create a new C# project: `src/Serving/AiDotNet.Serving`.
- The project must be configured as an ASP.NET Core Web API.
- Add the necessary project references to the core `AiDotNet` library.
AC 1.2: Implement Model Repository (3 points)
Requirement: Create a thread-safe, singleton service to store and manage loaded models.
- Create a new file: `src/Serving/ModelRepository.cs`.
- Define a `public class ModelRepository<T>` as a singleton service.
- It must contain a private `ConcurrentDictionary<string, IModel<T>>` that stores the loaded models, mapping a model name to the model instance.
- Implement methods: `AddModel(string name, IModel<T> model)`, `GetModel(string name)`, and `RemoveModel(string name)`.
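The repository described above could be sketched as follows. This is a minimal illustration, not the final implementation: `IModel<T>` is assumed to be AiDotNet's existing model interface, and the `ListModels` helper is a hypothetical addition to support the `GET /models` endpoint.

```csharp
using System.Collections.Concurrent;

// Thread-safe singleton store mapping model names to loaded model instances.
// IModel<T> is assumed from the core AiDotNet library.
public class ModelRepository<T>
{
    private readonly ConcurrentDictionary<string, IModel<T>> _models = new();

    public void AddModel(string name, IModel<T> model) => _models[name] = model;

    public IModel<T>? GetModel(string name) =>
        _models.TryGetValue(name, out var model) ? model : null;

    public bool RemoveModel(string name) => _models.TryRemove(name, out _);

    // Hypothetical helper so the management API can list loaded model names.
    public IReadOnlyCollection<string> ListModels() => _models.Keys.ToArray();
}
```

It would be registered as a singleton in `Program.cs`, e.g. `builder.Services.AddSingleton<ModelRepository<double>>();`.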
AC 1.3: Create Model Management API (5 points)
Requirement: Create an ASP.NET controller with endpoints to manage the models in the repository.
- Create a new controller: `src/Serving/Controllers/ModelsController.cs`.
- `POST /models` endpoint:
  - Accepts a request body containing a `model_name` and a `model_path`.
  - Loads a saved AiDotNet model from the `model_path` using the library's existing deserialization logic.
  - Adds the loaded model to the `ModelRepository` under the given name.
- `GET /models` endpoint:
  - Returns a list of the names of all currently loaded models in the `ModelRepository`.
- `DELETE /models/{model_name}` endpoint:
  - Removes the specified model from the `ModelRepository`.
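A sketch of what this controller might look like, assuming the repository sketch above; `ModelSerializer.Load` is a placeholder name for the library's existing deserialization entry point, and `ListModels` is a hypothetical repository helper:

```csharp
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("models")]
public class ModelsController : ControllerBase
{
    private readonly ModelRepository<double> _repository;

    public ModelsController(ModelRepository<double> repository) => _repository = repository;

    public record LoadModelRequest(string ModelName, string ModelPath);

    [HttpPost]
    public IActionResult Load([FromBody] LoadModelRequest request)
    {
        // Placeholder for AiDotNet's existing deserialization logic.
        var model = ModelSerializer.Load<double>(request.ModelPath);
        _repository.AddModel(request.ModelName, model);
        return Ok();
    }

    [HttpGet]
    public IActionResult List() => Ok(_repository.ListModels());

    [HttpDelete("{modelName}")]
    public IActionResult Delete(string modelName) =>
        _repository.RemoveModel(modelName) ? NoContent() : NotFound();
}
```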
Phase 2: High-Performance Inference Endpoint
Goal: Implement the /predict endpoint with dynamic request batching to maximize throughput.
AC 2.1: Implement Request Batching Service (13 points)
Requirement: Create the core service that collects, batches, and processes inference requests.
- Create a new file: `src/Serving/RequestBatcher.cs`.
- Define a `public class RequestBatcher<T>` as a singleton service.
- It will contain a `ConcurrentQueue` to hold incoming requests. Each item in the queue will be a tuple containing the request data and a `TaskCompletionSource` used to signal when the result is ready.
- Background worker: the `RequestBatcher` will start a background task (`Task.Run`) in its constructor that runs an infinite loop.
- Batching logic (inside the loop):
  - `await Task.Delay(10);` (the batching window; should be configurable).
  - Dequeue all currently pending requests from the queue.
  - If there are requests, collate their individual input data into a single, large batch tensor.
  - Run the batch tensor through the appropriate model (this requires passing the model to the batcher).
  - De-collate the model's output tensor back into individual results.
  - For each original request, use its `TaskCompletionSource` to set its result, which unblocks the waiting HTTP request.
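The loop above could be sketched as follows. To keep the sketch self-contained, the model is represented by a `Func<T[][], T[][]>` (batch in, batch out) rather than an AiDotNet tensor pipeline, and the `EnqueueAsync` method name is an assumption:

```csharp
using System.Collections.Concurrent;

// Minimal sketch of dynamic request batching. The real service would look
// models up in the ModelRepository and collate inputs into tensors.
public class RequestBatcher<T>
{
    private readonly ConcurrentQueue<(T[] Input, TaskCompletionSource<T[]> Tcs)> _queue = new();
    private readonly Func<T[][], T[][]> _model;
    private readonly int _windowMs;

    public RequestBatcher(Func<T[][], T[][]> model, int batchingWindowMs = 10)
    {
        _model = model;
        _windowMs = batchingWindowMs;
        Task.Run(ProcessLoopAsync); // background worker started in the constructor
    }

    public Task<T[]> EnqueueAsync(T[] input)
    {
        // RunContinuationsAsynchronously keeps HTTP continuations off the batching loop.
        var tcs = new TaskCompletionSource<T[]>(TaskCreationOptions.RunContinuationsAsynchronously);
        _queue.Enqueue((input, tcs));
        return tcs.Task;
    }

    private async Task ProcessLoopAsync()
    {
        while (true)
        {
            await Task.Delay(_windowMs); // the batching window

            var pending = new List<(T[] Input, TaskCompletionSource<T[]> Tcs)>();
            while (_queue.TryDequeue(out var item)) pending.Add(item);
            if (pending.Count == 0) continue;

            try
            {
                // Collate inputs, run the model once, then de-collate the outputs.
                var outputs = _model(pending.Select(p => p.Input).ToArray());
                for (var i = 0; i < pending.Count; i++)
                    pending[i].Tcs.SetResult(outputs[i]);
            }
            catch (Exception ex)
            {
                foreach (var p in pending) p.Tcs.TrySetException(ex);
            }
        }
    }
}
```

Note the try/catch: if a batch fails, every waiting request is faulted rather than left hanging, which the acceptance criteria do not spell out but a production batcher needs.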
AC 2.2: Create the /predict Endpoint (5 points)
Requirement: Create the public-facing API endpoint that users will call for inference.
- Create a new controller: `src/Serving/Controllers/InferenceController.cs`.
- Implement a `POST /predict/{model_name}` endpoint.
- Endpoint logic:
  - This method will not call the model directly.
  - It will create a `TaskCompletionSource`.
  - It will add the request data and the `TaskCompletionSource` to the `RequestBatcher`'s queue.
  - It will then `await` the `TaskCompletionSource.Task` to get the result.
  - Once the result is available (set by the batcher), it will be returned as the HTTP response.
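A sketch of the endpoint, assuming the batcher exposes a hypothetical `EnqueueAsync(double[])` method that wraps the create-TCS/enqueue/await steps; routing predictions to the correct model per `modelName` is elided here:

```csharp
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("predict")]
public class InferenceController : ControllerBase
{
    private readonly RequestBatcher<double> _batcher;

    public InferenceController(RequestBatcher<double> batcher) => _batcher = batcher;

    [HttpPost("{modelName}")]
    public async Task<IActionResult> Predict(string modelName, [FromBody] double[] input)
    {
        // The controller never invokes the model directly: it enqueues the
        // input and awaits the TaskCompletionSource completed by the batcher.
        var result = await _batcher.EnqueueAsync(input);
        return Ok(result);
    }
}
```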
Phase 3: Configuration and Testing
Goal: Make the server configurable and verify its functionality, especially the batching mechanism.
AC 3.1: Add Configuration (3 points)
Requirement: Allow server settings to be configured via appsettings.json.
- In the `appsettings.json` file, add a section for `ServingSettings`.
- Add configuration options for `Port`, `BatchingWindowMilliseconds`, and an array of `ModelsToLoadOnStartup` (each with a name and path).
- The server must load these settings on startup.
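The section might look like the following; the key names match the options above, while the example values and the `Name`/`Path` property names inside each startup entry are illustrative:

```json
{
  "ServingSettings": {
    "Port": 5000,
    "BatchingWindowMilliseconds": 10,
    "ModelsToLoadOnStartup": [
      { "Name": "my-model", "Path": "models/my-model.bin" }
    ]
  }
}
```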
AC 3.2: Integration Test (8 points)
Requirement: Create an end-to-end test that proves the server works and that batching is effective.
- In a test project, use `WebApplicationFactory` to host the server in-memory.
- Test logic:
  - Create and save a simple AiDotNet model that can be loaded by the server.
  - Use an `HttpClient` to call the `POST /models` endpoint to load the model.
  - Create a list of 10 concurrent tasks, where each task sends a unique request to the `POST /predict/{model_name}` endpoint.
  - Run all 10 tasks concurrently using `Task.WhenAll()`.
  - Assert that all 10 tasks complete successfully and that each received its correct, corresponding response.
  - (Advanced) Use a mock model to verify that its `Forward` method was called only once with a batch size of 10, proving that the dynamic batching worked correctly.
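A sketch of the test shape, using xUnit; the route payloads, model file name, and `Program` entry-point type are assumptions that would need to match the actual server:

```csharp
using System.Net.Http.Json;
using Microsoft.AspNetCore.Mvc.Testing;
using Xunit;

public class BatchingTests : IClassFixture<WebApplicationFactory<Program>>
{
    private readonly HttpClient _client;

    public BatchingTests(WebApplicationFactory<Program> factory) =>
        _client = factory.CreateClient();

    [Fact]
    public async Task ConcurrentRequests_AllSucceed_WithDistinctResults()
    {
        // Load a pre-saved test model via the management API.
        await _client.PostAsJsonAsync("/models",
            new { model_name = "test", model_path = "test-model.bin" });

        // Fire 10 unique requests concurrently so they land in one batch window.
        var tasks = Enumerable.Range(0, 10)
            .Select(i => _client.PostAsJsonAsync("/predict/test", new[] { (double)i }))
            .ToArray();
        var responses = await Task.WhenAll(tasks);

        foreach (var response in responses)
            response.EnsureSuccessStatusCode();

        // (Advanced) With a mock model injected through the factory, assert
        // its Forward method was called exactly once with a batch size of 10.
    }
}
```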
Definition of Done
- All checklist items are complete.
- A new `AiDotNet.Serving` project is created.
- A user can start the server, load a model via a REST API call, and get predictions from it.
- The server correctly batches concurrent requests into a single model execution.
- Integration tests verify the end-to-end functionality.
⚠️ CRITICAL ARCHITECTURAL REQUIREMENTS
Before implementing this user story, you MUST review:
- 📋 Full Requirements: `.github/USER_STORY_ARCHITECTURAL_REQUIREMENTS.md`
- 📐 Project Rules: `.github/PROJECT_RULES.md`
Mandatory Implementation Checklist
1. INumericOperations Usage (CRITICAL)
- Include `protected static readonly INumericOperations<T> NumOps = MathHelper.GetNumericOperations<T>();` in the base class.
- NEVER hardcode `double`, `float`, or specific numeric types - use the generic `T`.
- NEVER use `default(T)` - use `NumOps.Zero` instead.
- Use `NumOps.Zero`, `NumOps.One`, and `NumOps.FromDouble()` for values.
- Use `NumOps.Add()`, `NumOps.Multiply()`, etc. for arithmetic.
- Use `NumOps.LessThan()`, `NumOps.GreaterThan()`, etc. for comparisons.
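To illustrate the pattern, a hypothetical base class using only the `NumOps` calls listed above; the `Mean` method and `NumOps.Divide` are illustrative, not part of the checklist:

```csharp
// Hypothetical sketch of the generic-numeric pattern. MathHelper and
// INumericOperations<T> are assumed from the AiDotNet library.
public abstract class ServingMetricBase<T>
{
    protected static readonly INumericOperations<T> NumOps =
        MathHelper.GetNumericOperations<T>();

    public T Mean(IReadOnlyList<T> values)
    {
        if (values.Count == 0)
            throw new ArgumentException("At least one value is required.", nameof(values));

        var sum = NumOps.Zero;                // NumOps.Zero, never default(T)
        foreach (var v in values)
            sum = NumOps.Add(sum, v);         // NumOps arithmetic, not '+'
        return NumOps.Divide(sum, NumOps.FromDouble(values.Count));
    }
}
```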
2. Inheritance Pattern (REQUIRED)
- Create `I{FeatureName}.cs` in `src/Interfaces/` (root level, NOT subfolders).
- Create `{FeatureName}Base.cs` in `src/{FeatureArea}/`, inheriting from the interface.
- Create concrete classes inheriting from the Base class (NOT directly from the interface).
3. PredictionModelBuilder Integration (REQUIRED)
- Add a private field to `PredictionModelBuilder.cs`: `private I{FeatureName}<T>? _{featureName};`
- Add a Configure method taking ONLY the interface (no other parameters): `public IPredictionModelBuilder<T, TInput, TOutput> Configure{FeatureName}(I{FeatureName}<T> {featureName}) { _{featureName} = {featureName}; return this; }`
- Use the feature in `Build()` with a default: `var {featureName} = _{featureName} ?? new Default{FeatureName}<T>();`
- Verify the feature is ACTUALLY USED in the execution flow.
4. Beginner-Friendly Defaults (REQUIRED)
- Constructor parameters with defaults from research/industry standards
- Document WHY each default was chosen (cite papers/standards)
- Validate parameters and throw `ArgumentException` for invalid values.
5. Property Initialization (CRITICAL)
- NEVER use the `default!` operator.
- String properties: `= string.Empty;`
- Collections: `= new List<T>();` or `= new Vector<T>(0);`
- Numeric properties: an appropriate default or `NumOps.Zero`.
6. Class Organization (REQUIRED)
- One class/enum/interface per file
- ALL interfaces in `src/Interfaces/` (root level).
- Namespace mirrors folder structure (e.g., `src/Regularization/` → `namespace AiDotNet.Regularization`).
7. Documentation (REQUIRED)
- XML documentation for all public members
- `<b>For Beginners:</b>` sections with analogies and examples.
- Document all `<param>`, `<returns>`, and `<exception>` tags.
- Explain default value choices.
8. Testing (REQUIRED)
- Minimum 80% code coverage
- Test with multiple numeric types (double, float)
- Test default values are applied correctly
- Test edge cases and exceptions
- Integration tests for PredictionModelBuilder usage
See full details: .github/USER_STORY_ARCHITECTURAL_REQUIREMENTS.md