Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR implements a reverse proxy Gateway class that registers services and forwards requests through a centralized gateway. Key changes include:
- Introducing the Gateway class with enhanced registration, request forwarding, and proxy support.
- Adding comprehensive unit and integration tests to verify various scenarios including URL normalization, synchronous/asynchronous registration, and proxy functionality.
- Updating related types and sample scripts to demonstrate Gateway and ProxyConnectionManager usage.
Reviewed Changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/mindtrace/services/gateway/test_proxy_connection_manager.py | Added unit tests for initialization, property access, and method proxying via proxy connection manager. |
| tests/unit/mindtrace/services/gateway/test_gateway.py | Implements unit tests for Gateway initialization, route registration, and request forwarding. |
| tests/integration/mindtrace/services/test_gateway_integration.py | Provides integration tests covering complete Gateway workflows including proxy usage and error handling. |
| tests/integration/mindtrace/services/conftest.py | Adds fixtures for launching Gateway and EchoService to support integration testing. |
| samples/services/gateway.py | Updates the sample script to demonstrate both synchronous and asynchronous usage of Gateway and ProxyConnectionManager. |
| mindtrace/services/mindtrace/services/gateway/types.py | Introduces new type definitions (AppConfig, RegisterAppTaskSchema) for the Gateway service. |
| mindtrace/services/mindtrace/services/gateway/proxy_connection_manager.py | Implements the ProxyConnectionManager for transparent service routing via the Gateway. |
| mindtrace/services/mindtrace/services/gateway/gateway.py | Implements the Gateway class with enhanced connection manager registration and request forwarding. |
| mindtrace/services/mindtrace/services/init.py | Updates module exports with the newly introduced Gateway-related types and classes. |
Comments suppressed due to low confidence (2)
mindtrace/services/mindtrace/services/gateway/gateway.py:109
- Attaching proxy connection managers as attributes using the service name may lead to issues if the name is not a valid Python identifier. Consider sanitizing the attribute name or documenting this behavior to ensure reliable attribute access by consumers.
setattr(base_cm, name, proxy_cm)
mindtrace/services/mindtrace/services/gateway/gateway.py:71
- [nitpick] Converting the forwarded response directly to JSON without verifying its content type may lead to runtime errors if the response is not valid JSON. Consider adding error handling or fallback logic to manage non-JSON responses gracefully.
return JSONResponse(content=response.json(), status_code=response.status_code)
|
|
||
| # Detect whether the method should use GET or POST based on its signature | ||
| signature = inspect.signature(original_method) | ||
| requires_arguments = any(param.default == inspect.Parameter.empty for param in signature.parameters.values()) |
There was a problem hiding this comment.
[nitpick] The logic for determining if a method requires arguments may not explicitly handle *args or **kwargs, potentially causing unintended routing behavior. Revisiting this check to clearly account for variable-length parameters could improve robustness.
| requires_arguments = any(param.default == inspect.Parameter.empty for param in signature.parameters.values()) | |
| requires_arguments = any( | |
| param.default == inspect.Parameter.empty or | |
| param.kind in (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD) | |
| for param in signature.parameters.values() | |
| ) |
christopherfish
left a comment
There was a problem hiding this comment.
So this may not be the only failure, but it is one: if you create two ConnectionManagers for the same Gateway, then register an app with one, the other ConnectionManager doesn't know about the new endpoint.
(Due to the bug #79 if you run this as-is the second ConnectionManager does have an echo property but that echo property is a method that will never run. If you merge that in then the second ConnectionManager just doesn't have an echo property.)
#!/usr/bin/env python3
"""
Sample script to demonstrate issues with the Gateway class as it currently exists
"""
from mindtrace.services import Gateway
from mindtrace.services.sample.echo_service import EchoService
def proxy_connection_manager_failure_example():
"""Example demonstrating the ProxyConnectionManager functionality."""
print("\nStarting ProxyConnectionManager example...")
# Launch Gateway service on port 8097
gateway_cm = Gateway.launch(port=8097, wait_for_launch=True, timeout=15)
print("Gateway launched successfully!")
# Launch EchoService on port 8098
echo_cm = EchoService.launch(port=8098, wait_for_launch=True, timeout=15)
print("EchoService launched successfully!")
try:
# Register the EchoService with the Gateway INCLUDING the connection manager
# This enables the ProxyConnectionManager functionality
gateway_cm_second = Gateway.connect(url=gateway_cm.url, timeout=15)
result = gateway_cm.register_app(
name="echo",
url="http://localhost:8098/",
connection_manager=echo_cm # Provide the connection manager to the Gateway to enable ProxyConnectionManager functionality
)
print(f"EchoService registered with Gateway: {result}")
print(f"Registered apps: {gateway_cm.registered_apps}")
# The other ConnectionManager doesn't know about echo
# this gives a weird error if you haven't merged in the recent bugfix,
# or straightforwardly AttributeError: 'GatewayConnectionManager' object has no attribute 'echo' if you have
proxy_result = gateway_cm_second.echo.echo(message="Proxied call through Gateway")
print(f"Proxied call result: {proxy_result}")
finally:
# Clean up in reverse order
echo_cm.shutdown()
gateway_cm.shutdown()
print("Services shut down successfully!")
if __name__ == "__main__":
# Run ProxyConnectionManager example
proxy_connection_manager_failure_example()
|
An amendment: The ProxyConnectionManager class was broken before. It either didn't work at all, or would call the passed in connection manager without routing through the Gateway. This broken behavior has been fixed, and can be tested with the following: import logging
from mindtrace.services import Gateway, Service
from mindtrace.services.sample.echo_service import EchoService
logging.basicConfig(level=logging.DEBUG, format='%(name)s - %(levelname)s: %(message)s')
echo_url = "http://localhost:8080/"
gateway_url = "http://localhost:8081/"
echo_cm = EchoService.launch(url=echo_url)
gateway_cm = Gateway.launch(url=gateway_url)
gateway_cm.register_app("echoer", echo_url, echo_cm)
echo_cm.logger = logging.getLogger(echo_cm.name)
gateway_cm.logger = logging.getLogger(gateway_cm.name)
echo_cm.echo(message="Hello, world!") # httpx - INFO: HTTP Request: POST http://localhost:8080/echo "HTTP/1.1 200 OK"
gateway_cm.echoer.echo(message="Hello, world!") # urllib3.connectionpool - DEBUG: http://localhost:8081 "POST /echoer/echo HTTP/1.1" 200 26The debug statement from the final line should show the request going through I take it as a point of caution that there was 100% test coverage before, with tests testing this exact behavior (i.e. making sure the gateway was not bypassed), and yet nothing was flagged. I have deleted all of the unit tests from before and regenerated them, spending a little more time going through them as they are created. Still, there may be equally glaring bugs still around. So do double check the ProxyConnectionManager logic, and feel free to request clarity on anything that looks off. In either case, the sample and all tests should pass, with 100% coverage. |
Thanks for this note, I believe you are correct and this is a known issue. Shortly put, connection managers are not automatically updated if the service they point to adds endpoints. The "accepted" behavior is for the user of a connection manager to re-connect if they suspect they are failing in this case. We can discuss if we'd like more intelligent behavior, maybe a better |
| app_name: The registered app name. | ||
| original_cm: The original connection manager. | ||
| """ | ||
| object.__setattr__(self, "gateway_url", str(gateway_url).rstrip("/")) # Ensure no trailing slash |
There was a problem hiding this comment.
This could just be self.gateway_url = str(gateway_url).rstrip("/")), right? And there's similar elsewhere.
christopherfish
left a comment
There was a problem hiding this comment.
This has the API we want from the Gateway and seems to implement it correctly. The tests all pass for me and the integration tests all look sensible. There are things we could improve about the implementation of Service in general, see my inline comment, that would affect this, but I think that doesn't change the API so we can approve this and come back to it when we have time / it's needed.
| self._generate_proxy_methods() | ||
|
|
||
| def _extract_service_endpoints(self, original_cm: ConnectionManager) -> Dict[str, Any]: | ||
| """Extract service endpoints from the original connection manager. |
There was a problem hiding this comment.
We can also get the list of endpoints (though not currently the dict with the input/output schemas) by calling original_cm.endpoints(). And this is more reliably up-to-date than any other method.
In general I'm realising that it would be good to be able to connect() to Services where we don't have access to the code for the service itself - the example I'm thinking of is the Cluster wants to be able to query the status of a Worker service that could be in a different repository. I.e. we want to be able to do MyService.launch(url) in one repository and Service.connect(url) in another repository and have that ConnectionManager still work for the endpoints of MyService. (Currently I believe, though haven't actually tested, that Service.connect() will work but the endpoints won't exist in the created ConnectionManager?). We talked about doing this via overriding __getattr__ and that's very doable and will stay up-to-date but also we could use self.endpoints() when creating the ConnectionManager to get the list.
This does have the disadvantage of losing the schema validation.
There was a problem hiding this comment.
A very good point.
I've raised a new PR #83 that would address this situation, but it's a bit ugly in its current form. But basically I think that is the way-- the base Service class maintains both the endpoints and their TaskSchemas, along with which methods are considered "public". Then the base Service class will be able to provide a "decent" connection manager, even for children classes, even in the case that the caller doesn't have the code for the child service class.
Services Gateway Class
This PR implements a reverse proxy Gateway class and closes #71.
Gateway Class
The Gateway class is a reverse proxy service. To understand what it does, consider the following case.
First, consider launching and querying a service directly:
You may register this service with the Gateway class, querying it through the gateway:
ProxyConnectionManager Integration
If the registered service is a
Service, you may pass in its connection manager, which will result in the associated Gateway connection manager exposing the service through a transparent proxy attribute. The proxy cm attribute acts identically to the original, but will route all requests through the gateway.Note the important distinction: while both calls return the same result, the proxy call routes through the Gateway service:
If you check the Gateway service logs, you can see that it forwards all proxy requests through the actual Gateway service first, rather than going directly to the target service.
Documentation & Testing
A sample script has been included in
samples/services/gateway.py.Both unit and integration test suites have been added and should all pass. The unit test suite should have 100% coverage, with the integration test suite adding additional testing using an actual launched
Gatewayservice.