fix: (CDK) (HttpRequester) - Make the HttpRequester.path optional #370

Merged: 5 commits, Mar 4, 2025
12 changes: 10 additions & 2 deletions airbyte_cdk/sources/declarative/declarative_component_schema.yaml
@@ -1794,7 +1794,6 @@ definitions:
type: object
required:
- type
- path
- url_base
properties:
type:
@@ -1806,9 +1805,18 @@ definitions:
type: string
interpolation_context:
- config
- next_page_token
- stream_interval
- stream_partition
- stream_slice
- creation_response
- polling_response
- download_target
examples:
- "https://connect.squareup.com/v2"
- "{{ config['base_url'] or 'https://app.posthog.com'}}/api/"
- "{{ config['base_url'] or 'https://app.posthog.com'}}/api"
- "https://connect.squareup.com/v2/quotes/{{ stream_partition['id'] }}/quote_line_groups"
- "https://example.com/api/v1/resource/{{ next_page_token['id'] }}"
path:
title: URL Path
description: Path the specific API endpoint that this stream represents. Do not put sensitive information (e.g. API tokens) into this field - Use the Authentication component for this.
@@ -939,7 +939,7 @@ class MinMaxDatetime(BaseModel):
)
datetime_format: Optional[str] = Field(
"",
description='Format of the datetime value. Defaults to "%Y-%m-%dT%H:%M:%S.%f%z" if left empty. Use placeholders starting with "%" to describe the format the API is using. The following placeholders are available:\n * **%s**: Epoch unix timestamp - `1686218963`\n * **%s_as_float**: Epoch unix timestamp in seconds as float with microsecond precision - `1686218963.123456`\n * **%ms**: Epoch unix timestamp - `1686218963123`\n * **%a**: Weekday (abbreviated) - `Sun`\n * **%A**: Weekday (full) - `Sunday`\n * **%w**: Weekday (decimal) - `0` (Sunday), `6` (Saturday)\n * **%d**: Day of the month (zero-padded) - `01`, `02`, ..., `31`\n * **%b**: Month (abbreviated) - `Jan`\n * **%B**: Month (full) - `January`\n * **%m**: Month (zero-padded) - `01`, `02`, ..., `12`\n * **%y**: Year (without century, zero-padded) - `00`, `01`, ..., `99`\n * **%Y**: Year (with century) - `0001`, `0002`, ..., `9999`\n * **%H**: Hour (24-hour, zero-padded) - `00`, `01`, ..., `23`\n * **%I**: Hour (12-hour, zero-padded) - `01`, `02`, ..., `12`\n * **%p**: AM/PM indicator\n * **%M**: Minute (zero-padded) - `00`, `01`, ..., `59`\n * **%S**: Second (zero-padded) - `00`, `01`, ..., `59`\n * **%f**: Microsecond (zero-padded to 6 digits) - `000000`, `000001`, ..., `999999`\n * **%z**: UTC offset - `(empty)`, `+0000`, `-04:00`\n * **%Z**: Time zone name - `(empty)`, `UTC`, `GMT`\n * **%j**: Day of the year (zero-padded) - `001`, `002`, ..., `366`\n * **%U**: Week number of the year (Sunday as first day) - `00`, `01`, ..., `53`\n * **%W**: Week number of the year (Monday as first day) - `00`, `01`, ..., `53`\n * **%c**: Date and time representation - `Tue Aug 16 21:30:00 1988`\n * **%x**: Date representation - `08/16/1988`\n * **%X**: Time representation - `21:30:00`\n * **%%**: Literal \'%\' character\n\n Some placeholders depend on the locale of the underlying system - in most cases this locale is configured as en/US. 
For more information see the [Python documentation](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes).\n',
description='Format of the datetime value. Defaults to "%Y-%m-%dT%H:%M:%S.%f%z" if left empty. Use placeholders starting with "%" to describe the format the API is using. The following placeholders are available:\n * **%s**: Epoch unix timestamp - `1686218963`\n * **%s_as_float**: Epoch unix timestamp in seconds as float with microsecond precision - `1686218963.123456`\n * **%ms**: Epoch unix timestamp - `1686218963123`\n * **%a**: Weekday (abbreviated) - `Sun`\n * **%A**: Weekday (full) - `Sunday`\n * **%w**: Weekday (decimal) - `0` (Sunday), `6` (Saturday)\n * **%d**: Day of the month (zero-padded) - `01`, `02`, ..., `31`\n * **%b**: Month (abbreviated) - `Jan`\n * **%B**: Month (full) - `January`\n * **%m**: Month (zero-padded) - `01`, `02`, ..., `12`\n * **%y**: Year (without century, zero-padded) - `00`, `01`, ..., `99`\n * **%Y**: Year (with century) - `0001`, `0002`, ..., `9999`\n * **%H**: Hour (24-hour, zero-padded) - `00`, `01`, ..., `23`\n * **%I**: Hour (12-hour, zero-padded) - `01`, `02`, ..., `12`\n * **%p**: AM/PM indicator\n * **%M**: Minute (zero-padded) - `00`, `01`, ..., `59`\n * **%S**: Second (zero-padded) - `00`, `01`, ..., `59`\n * **%f**: Microsecond (zero-padded to 6 digits) - `000000`, `000001`, ..., `999999`\n * **%_ms**: Millisecond (zero-padded to 3 digits) - `000`, `001`, ..., `999`\n * **%z**: UTC offset - `(empty)`, `+0000`, `-04:00`\n * **%Z**: Time zone name - `(empty)`, `UTC`, `GMT`\n * **%j**: Day of the year (zero-padded) - `001`, `002`, ..., `366`\n * **%U**: Week number of the year (Sunday as first day) - `00`, `01`, ..., `53`\n * **%W**: Week number of the year (Monday as first day) - `00`, `01`, ..., `53`\n * **%c**: Date and time representation - `Tue Aug 16 21:30:00 1988`\n * **%x**: Date representation - `08/16/1988`\n * **%X**: Time representation - `21:30:00`\n * **%%**: Literal \'%\' character\n\n Some placeholders depend on the locale of the underlying system - in most cases this locale is configured as en/US. 
For more information see the [Python documentation](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes).\n',
examples=["%Y-%m-%dT%H:%M:%S.%f%z", "%Y-%m-%d", "%s"],
title="Datetime Format",
)
@@ -1545,7 +1545,7 @@ class DatetimeBasedCursor(BaseModel):
)
datetime_format: str = Field(
...,
description="The datetime format used to format the datetime values that are sent in outgoing requests to the API. Use placeholders starting with \"%\" to describe the format the API is using. The following placeholders are available:\n * **%s**: Epoch unix timestamp - `1686218963`\n * **%s_as_float**: Epoch unix timestamp in seconds as float with microsecond precision - `1686218963.123456`\n * **%ms**: Epoch unix timestamp (milliseconds) - `1686218963123`\n * **%a**: Weekday (abbreviated) - `Sun`\n * **%A**: Weekday (full) - `Sunday`\n * **%w**: Weekday (decimal) - `0` (Sunday), `6` (Saturday)\n * **%d**: Day of the month (zero-padded) - `01`, `02`, ..., `31`\n * **%b**: Month (abbreviated) - `Jan`\n * **%B**: Month (full) - `January`\n * **%m**: Month (zero-padded) - `01`, `02`, ..., `12`\n * **%y**: Year (without century, zero-padded) - `00`, `01`, ..., `99`\n * **%Y**: Year (with century) - `0001`, `0002`, ..., `9999`\n * **%H**: Hour (24-hour, zero-padded) - `00`, `01`, ..., `23`\n * **%I**: Hour (12-hour, zero-padded) - `01`, `02`, ..., `12`\n * **%p**: AM/PM indicator\n * **%M**: Minute (zero-padded) - `00`, `01`, ..., `59`\n * **%S**: Second (zero-padded) - `00`, `01`, ..., `59`\n * **%f**: Microsecond (zero-padded to 6 digits) - `000000`\n * **%z**: UTC offset - `(empty)`, `+0000`, `-04:00`\n * **%Z**: Time zone name - `(empty)`, `UTC`, `GMT`\n * **%j**: Day of the year (zero-padded) - `001`, `002`, ..., `366`\n * **%U**: Week number of the year (starting Sunday) - `00`, ..., `53`\n * **%W**: Week number of the year (starting Monday) - `00`, ..., `53`\n * **%c**: Date and time - `Tue Aug 16 21:30:00 1988`\n * **%x**: Date standard format - `08/16/1988`\n * **%X**: Time standard format - `21:30:00`\n * **%%**: Literal '%' character\n\n Some placeholders depend on the locale of the underlying system - in most cases this locale is configured as en/US. 
For more information see the [Python documentation](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes).\n",
description="The datetime format used to format the datetime values that are sent in outgoing requests to the API. Use placeholders starting with \"%\" to describe the format the API is using. The following placeholders are available:\n * **%s**: Epoch unix timestamp - `1686218963`\n * **%s_as_float**: Epoch unix timestamp in seconds as float with microsecond precision - `1686218963.123456`\n * **%ms**: Epoch unix timestamp (milliseconds) - `1686218963123`\n * **%a**: Weekday (abbreviated) - `Sun`\n * **%A**: Weekday (full) - `Sunday`\n * **%w**: Weekday (decimal) - `0` (Sunday), `6` (Saturday)\n * **%d**: Day of the month (zero-padded) - `01`, `02`, ..., `31`\n * **%b**: Month (abbreviated) - `Jan`\n * **%B**: Month (full) - `January`\n * **%m**: Month (zero-padded) - `01`, `02`, ..., `12`\n * **%y**: Year (without century, zero-padded) - `00`, `01`, ..., `99`\n * **%Y**: Year (with century) - `0001`, `0002`, ..., `9999`\n * **%H**: Hour (24-hour, zero-padded) - `00`, `01`, ..., `23`\n * **%I**: Hour (12-hour, zero-padded) - `01`, `02`, ..., `12`\n * **%p**: AM/PM indicator\n * **%M**: Minute (zero-padded) - `00`, `01`, ..., `59`\n * **%S**: Second (zero-padded) - `00`, `01`, ..., `59`\n * **%f**: Microsecond (zero-padded to 6 digits) - `000000`\n * **%_ms**: Millisecond (zero-padded to 3 digits) - `000`\n * **%z**: UTC offset - `(empty)`, `+0000`, `-04:00`\n * **%Z**: Time zone name - `(empty)`, `UTC`, `GMT`\n * **%j**: Day of the year (zero-padded) - `001`, `002`, ..., `366`\n * **%U**: Week number of the year (starting Sunday) - `00`, ..., `53`\n * **%W**: Week number of the year (starting Monday) - `00`, ..., `53`\n * **%c**: Date and time - `Tue Aug 16 21:30:00 1988`\n * **%x**: Date standard format - `08/16/1988`\n * **%X**: Time standard format - `21:30:00`\n * **%%**: Literal '%' character\n\n Some placeholders depend on the locale of the underlying system - in most cases this locale is configured as en/US. 
For more information see the [Python documentation](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes).\n",
examples=["%Y-%m-%dT%H:%M:%S.%f%z", "%Y-%m-%d", "%s", "%ms", "%s_as_float"],
title="Outgoing Datetime Format",
)
@@ -2072,12 +2072,14 @@ class HttpRequester(BaseModel):
description="Base URL of the API source. Do not put sensitive information (e.g. API tokens) into this field - Use the Authentication component for this.",
examples=[
"https://connect.squareup.com/v2",
"{{ config['base_url'] or 'https://app.posthog.com'}}/api/",
"{{ config['base_url'] or 'https://app.posthog.com'}}/api",
"https://connect.squareup.com/v2/quotes/{{ stream_partition['id'] }}/quote_line_groups",
"https://example.com/api/v1/resource/{{ next_page_token['id'] }}",
],
title="API Base URL",
)
path: str = Field(
...,
path: Optional[str] = Field(
None,
description="Path the specific API endpoint that this stream represents. Do not put sensitive information (e.g. API tokens) into this field - Use the Authentication component for this.",
examples=[
"/products",
70 changes: 48 additions & 22 deletions airbyte_cdk/sources/declarative/requesters/http_requester.py
@@ -25,8 +25,8 @@
from airbyte_cdk.sources.streams.call_rate import APIBudget
from airbyte_cdk.sources.streams.http import HttpClient
from airbyte_cdk.sources.streams.http.error_handlers import ErrorHandler
from airbyte_cdk.sources.types import Config, StreamSlice, StreamState
from airbyte_cdk.utils.mapping_helpers import combine_mappings
from airbyte_cdk.sources.types import Config, EmptyString, StreamSlice, StreamState
from airbyte_cdk.utils.mapping_helpers import combine_mappings, get_interpolation_context


@dataclass
@@ -49,9 +49,10 @@ class HttpRequester(Requester):

name: str
url_base: Union[InterpolatedString, str]
path: Union[InterpolatedString, str]
config: Config
parameters: InitVar[Mapping[str, Any]]

path: Optional[Union[InterpolatedString, str]] = None
authenticator: Optional[DeclarativeAuthenticator] = None
http_method: Union[str, HttpMethod] = HttpMethod.GET
request_options_provider: Optional[InterpolatedRequestOptionsProvider] = None
@@ -66,7 +67,9 @@ class HttpRequester(Requester):

def __post_init__(self, parameters: Mapping[str, Any]) -> None:
self._url_base = InterpolatedString.create(self.url_base, parameters=parameters)
self._path = InterpolatedString.create(self.path, parameters=parameters)
self._path = InterpolatedString.create(
self.path if self.path else EmptyString, parameters=parameters
)
if self.request_options_provider is None:
self._request_options_provider = InterpolatedRequestOptionsProvider(
config=self.config, parameters=parameters
@@ -112,27 +115,33 @@ def exit_on_rate_limit(self, value: bool) -> None:
def get_authenticator(self) -> DeclarativeAuthenticator:
return self._authenticator

def get_url_base(self) -> str:
return os.path.join(self._url_base.eval(self.config), "")
def get_url_base(
self,
*,
stream_state: Optional[StreamState] = None,
stream_slice: Optional[StreamSlice] = None,
next_page_token: Optional[Mapping[str, Any]] = None,
) -> str:
interpolation_context = get_interpolation_context(
stream_state=stream_state,
stream_slice=stream_slice,
next_page_token=next_page_token,
)
return os.path.join(self._url_base.eval(self.config, **interpolation_context), EmptyString)

def get_path(
self,
*,
stream_state: Optional[StreamState],
stream_slice: Optional[StreamSlice],
next_page_token: Optional[Mapping[str, Any]],
stream_state: Optional[StreamState] = None,
stream_slice: Optional[StreamSlice] = None,
next_page_token: Optional[Mapping[str, Any]] = None,
) -> str:
kwargs = {
"stream_slice": stream_slice,
"next_page_token": next_page_token,
# update the interpolation context with extra fields, if passed.
**(
stream_slice.extra_fields
if stream_slice is not None and hasattr(stream_slice, "extra_fields")
else {}
),
}
path = str(self._path.eval(self.config, **kwargs))
interpolation_context = get_interpolation_context(
stream_state=stream_state,
stream_slice=stream_slice,
next_page_token=next_page_token,
)
path = str(self._path.eval(self.config, **interpolation_context))
return path.lstrip("/")
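
The body of `get_interpolation_context` is not part of this diff, so the following is only a hypothetical sketch of what such a helper might do, based on the inline `kwargs` dict that `get_path` used before this change. The names `build_interpolation_context` and `SliceSketch` are illustrative and do not exist in the CDK; whether the real helper also exposes `stream_state` is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Mapping, Optional


@dataclass
class SliceSketch:
    """Hypothetical stand-in for StreamSlice; only what this sketch needs."""

    partition: Mapping[str, Any]
    extra_fields: Mapping[str, Any] = field(default_factory=dict)


def build_interpolation_context(
    stream_state: Optional[Mapping[str, Any]] = None,
    stream_slice: Optional[Any] = None,
    next_page_token: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Collect the values an interpolated url_base or path may reference."""
    # Slices without extra_fields (or no slice at all) contribute nothing extra.
    extra = getattr(stream_slice, "extra_fields", None) or {}
    return {
        "stream_state": stream_state,  # assumption: not in the removed kwargs dict
        "stream_slice": stream_slice,
        "next_page_token": next_page_token,
        # Extra fields become top-level names, as in the removed inline code.
        **extra,
    }


ctx = build_interpolation_context(
    stream_slice=SliceSketch({"id": 1}, {"creation_response": {"id": "job-1"}}),
    next_page_token={"next_page_token": "abc"},
)
```

Centralising this in one function means `get_url_base`, `get_path`, and the paginator's `path` all evaluate their templates against the same set of names.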

def get_method(self) -> HttpMethod:
@@ -330,7 +339,20 @@ def _request_body_json(

@classmethod
def _join_url(cls, url_base: str, path: str) -> str:
return urljoin(url_base, path)
"""
Joins a base URL with a given path and returns the resulting URL with any trailing slash removed.

This method ensures that there are no duplicate slashes when concatenating the base URL and the path,
which is useful when the full URL is provided from an interpolation context.

Args:
url_base (str): The base URL to which the path will be appended.
path (str): The path to join with the base URL.

Returns:
str: The concatenated URL with the trailing slash (if any) removed.
"""
return urljoin(url_base, path).rstrip("/")
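
To make the combined effect of `get_url_base`, `get_path`, and `_join_url` concrete, here is a minimal standalone sketch of the URL assembly once `path` is optional. `build_url` is a hypothetical name, not a CDK function, and it uses `posixpath.join` where the diff uses `os.path.join` so that the behaviour is the same on every platform.

```python
import posixpath
from typing import Optional
from urllib.parse import urljoin


def build_url(url_base: str, path: Optional[str] = None) -> str:
    """Hypothetical stand-in for the requester's URL assembly."""
    # Like get_url_base(): joining with "" makes the base end with a slash.
    base = posixpath.join(url_base, "")
    # Like get_path(): a missing path becomes "", and leading slashes are dropped
    # so that urljoin() appends to the base instead of replacing its path.
    relative = (path or "").lstrip("/")
    # Like the new _join_url(): drop the trailing slash from the final URL.
    return urljoin(base, relative).rstrip("/")


with_path = build_url("https://connect.squareup.com/v2", "/products")
without_path = build_url("https://example.com/api/v1/resource/42")
```

With a path, the result is the familiar base-plus-endpoint URL. Without one, the already complete `url_base` comes back unchanged apart from the stripped trailing slash, which is the case the added `rstrip("/")` appears to be there for.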

def send_request(
self,
@@ -347,7 +369,11 @@ def send_request(
request, response = self._http_client.send_request(
http_method=self.get_method().value,
url=self._join_url(
self.get_url_base(),
self.get_url_base(
stream_state=stream_state,
stream_slice=stream_slice,
next_page_token=next_page_token,
),
path
or self.get_path(
stream_state=stream_state,
@@ -25,6 +25,7 @@
from airbyte_cdk.sources.types import Config, Record, StreamSlice, StreamState
from airbyte_cdk.utils.mapping_helpers import (
_validate_component_request_option_paths,
get_interpolation_context,
)


@@ -150,11 +151,22 @@ def next_page_token(
else:
return None

def path(self, next_page_token: Optional[Mapping[str, Any]]) -> Optional[str]:
def path(
self,
next_page_token: Optional[Mapping[str, Any]],
stream_state: Optional[Mapping[str, Any]] = None,
stream_slice: Optional[StreamSlice] = None,
) -> Optional[str]:
token = next_page_token.get("next_page_token") if next_page_token else None
if token and self.page_token_option and isinstance(self.page_token_option, RequestPath):
# make additional interpolation context
interpolation_context = get_interpolation_context(
stream_state=stream_state,
stream_slice=stream_slice,
next_page_token=next_page_token,
)
# Replace url base to only return the path
return str(token).replace(self.url_base.eval(self.config), "") # type: ignore # url_base is casted to a InterpolatedString in __post_init__
return str(token).replace(self.url_base.eval(self.config, **interpolation_context), "") # type: ignore # url_base is casted to a InterpolatedString in __post_init__
else:
return None
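
The reason the paginator now needs the interpolation context can be shown with a small sketch: if `url_base` contains a placeholder, the next-page URL only has the base stripped off when the base is evaluated with the same values. The code below is an illustration under stated assumptions, not the CDK implementation. `path_from_token` is a hypothetical name, `str.format` stands in for the CDK's interpolation, and the `RequestPath` check from the real method is omitted.

```python
from typing import Any, Mapping, Optional


def path_from_token(
    token: Optional[str],
    url_base_template: str,
    context: Mapping[str, Any],
) -> Optional[str]:
    """Return the next-page URL with the evaluated base URL removed."""
    if not token:
        return None
    # Stand-in for url_base.eval(config, **interpolation_context).
    url_base = url_base_template.format(**context)
    # Replace the base so that only the path (and query) remains.
    return token.replace(url_base, "")


next_path = path_from_token(
    "https://example.com/api/v1/accounts/7/items?page=2",
    "https://example.com/api/v1/accounts/{account_id}/",
    {"account_id": "7"},
)
```

If the base were evaluated without `account_id`, it would not match the token and the full URL would be returned as the "path".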

@@ -258,8 +270,17 @@ def next_page_token(
response, last_page_size, last_record, last_page_token_value
)

def path(self, next_page_token: Optional[Mapping[str, Any]]) -> Optional[str]:
return self._decorated.path(next_page_token)
def path(
self,
next_page_token: Optional[Mapping[str, Any]],
stream_state: Optional[Mapping[str, Any]] = None,
stream_slice: Optional[StreamSlice] = None,
) -> Optional[str]:
return self._decorated.path(
next_page_token=next_page_token,
stream_state=stream_state,
stream_slice=stream_slice,
)

def get_request_params(
self,
@@ -19,7 +19,12 @@ class NoPagination(Paginator):

parameters: InitVar[Mapping[str, Any]]

def path(self, next_page_token: Optional[Mapping[str, Any]]) -> Optional[str]:
def path(
self,
next_page_token: Optional[Mapping[str, Any]],
stream_state: Optional[Mapping[str, Any]] = None,
stream_slice: Optional[StreamSlice] = None,
) -> Optional[str]:
return None

def get_request_params(
@@ -11,7 +11,7 @@
from airbyte_cdk.sources.declarative.requesters.request_options.request_options_provider import (
RequestOptionsProvider,
)
from airbyte_cdk.sources.types import Record
from airbyte_cdk.sources.types import Record, StreamSlice


@dataclass
@@ -49,7 +49,12 @@ def next_page_token(
pass

@abstractmethod
def path(self, next_page_token: Optional[Mapping[str, Any]]) -> Optional[str]:
def path(
self,
next_page_token: Optional[Mapping[str, Any]],
stream_state: Optional[Mapping[str, Any]] = None,
stream_slice: Optional[StreamSlice] = None,
) -> Optional[str]:
"""
Returns the URL path to hit to fetch the next page of records

8 changes: 7 additions & 1 deletion airbyte_cdk/sources/declarative/requesters/requester.py
@@ -35,7 +35,13 @@ def get_authenticator(self) -> DeclarativeAuthenticator:
pass

@abstractmethod
def get_url_base(self) -> str:
def get_url_base(
self,
*,
stream_state: Optional[StreamState],
stream_slice: Optional[StreamSlice],
next_page_token: Optional[Mapping[str, Any]],
) -> str:
"""
:return: URL base for the API endpoint e.g: if you wanted to hit https://myapi.com/v1/some_entity then this should return "https://myapi.com/v1/"
"""