-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(airbyte-cdk): Fix RequestOptionsProvider for PerPartitionWithGlobalCursor #254
Conversation
📝 WalkthroughWalkthroughThe pull request modifies the Changes
Possibly related PRs
Suggested reviewers
Hey there! 👋 I noticed the changes look like they're streamlining the stream slicer type checking. Curious about the motivation behind removing the 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms (4)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
2377-2377
: Consider adding a comment explaining the condition change.The condition was simplified to only check for
DatetimeBasedCursor
. Would it be helpful to add a comment explaining whyPerPartitionWithGlobalCursor
check was removed and how it affects therequest_options_provider
assignment? This would help future maintainers understand the reasoning behind this change, wdyt?+ # We only need to check for DatetimeBasedCursor as it has its own request_options_provider. + # All other stream slicers, including PerPartitionWithGlobalCursor, can act as request_options_provider. if ( not isinstance(stream_slicer, DatetimeBasedCursor) or type(stream_slicer) is not DatetimeBasedCursor ):
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Analyze (python)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)
2377-2379
: 🛠️ Refactor suggestionConsider simplifying the type check condition.
The current condition uses both
isinstance
andtype
checks with anor
operator. This might be redundant and potentially confusing. Would it be clearer to use justisinstance
? The currentor
condition means the block executes if either check passes, which might not be the intended behavior, wdyt?- if ( - not isinstance(stream_slicer, DatetimeBasedCursor) - or type(stream_slicer) is not DatetimeBasedCursor - ): + if not isinstance(stream_slicer, DatetimeBasedCursor):Likely invalid or redundant comment.
Line range hint
2380-2385
: Consider documenting the breaking change.The comment indicates that custom components implementing
DatetimeBasedCursor
will need to implement their ownRequestOptionsProvider
. This is a breaking change that might affect users. Would it be helpful to:
- Add a migration guide or documentation?
- Expand the comment to provide an example of how to implement a custom
RequestOptionsProvider
?- Add a deprecation warning for the old behavior?
Let's check for any custom components that might be affected:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
airbyte_cdk/sources/declarative/incremental/per_partition_cursor.py (2)
356-362
: Consider adding error handling for partition key validation?The new
create_cursor_for_partition
method nicely centralizes the cursor creation logic! However, should we add validation to ensure the partition key is not empty or None? This could prevent potential issues down the line, wdyt? 🤔def create_cursor_for_partition(self, partition_key: str) -> None: + if not partition_key: + raise ValueError("Partition key cannot be empty") partition_state = ( self._state_to_migrate_from if self._state_to_migrate_from else self._NO_CURSOR_STATE ) cursor = self._create_cursor(partition_state) self._cursor_per_partition[partition_key] = cursor
225-226
: Extract common partition check into a helper method?I notice we have the same partition check in all request-related methods. What do you think about extracting this into a helper method to make the code more DRY? Something like this perhaps? 🎯
+ def _ensure_partition_cursor_exists(self, partition: Mapping[str, Any]) -> None: + partition_key = self._to_partition_key(partition) + if partition_key not in self._cursor_per_partition: + self.create_cursor_for_partition(partition_key) def get_request_params(...): if stream_slice: - if self._to_partition_key(stream_slice.partition) not in self._cursor_per_partition: - self.create_cursor_for_partition(self._to_partition_key(stream_slice.partition)) + self._ensure_partition_cursor_exists(stream_slice.partition)Also applies to: 249-250, 273-274, 297-298
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte_cdk/sources/declarative/incremental/per_partition_cursor.py
(5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Analyze (python)
🔇 Additional comments (1)
airbyte_cdk/sources/declarative/incremental/per_partition_cursor.py (1)
352-362
: Great refactoring! 🎉The centralization of cursor creation logic and consistent handling across all methods makes the code more maintainable and robust. The changes look good to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pre-emptively approve
airbyte_cdk/sources/declarative/incremental/per_partition_cursor.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
unit_tests/sources/declarative/incremental/test_concurrent_perpartitioncursor.py (2)
2329-2602
: Consider refactoring manifest definitions to reduce duplicationThe
SUBSTREAM_REQUEST_OPTIONS_MANIFEST
shares many similarities with the existingSUBSTREAM_MANIFEST
. Would it be helpful to refactor common components into shared definitions or functions to avoid duplication and improve maintainability? Wdyt?
2605-2875
: Enhance test coverage by verifying request parametersIn the
test_incremental_substream_request_options_provider
function, we're verifying records and state transitions. Should we also include assertions to confirm that thepost_id
andcomment_id
parameters are correctly passed in the requests? This could help ensure that theRequestOptionsProvider
is functioning as expected. Wdyt?airbyte_cdk/sources/declarative/incremental/per_partition_cursor.py (2)
225-226
: Refactor repeated cursor initialization into a helper methodIn the methods
get_request_params
,get_request_headers
,get_request_body_data
, andget_request_body_json
, there's repeated logic for checking and creating cursors for partitions. Would it be beneficial to extract this logic into a helper method to reduce duplication and enhance code clarity? Wdyt?Also applies to: 249-250, 273-274, 297-298
356-380
: Consider adding unit tests for_create_cursor_for_partition
Since
_create_cursor_for_partition
plays a crucial role in initializing cursors, would adding unit tests for this method help ensure its correctness and prevent future regressions? Wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte_cdk/sources/declarative/incremental/per_partition_cursor.py
(5 hunks)unit_tests/sources/declarative/incremental/test_concurrent_perpartitioncursor.py
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: Analyze (python)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (1)
airbyte_cdk/sources/declarative/incremental/per_partition_cursor.py (1)
356-380
: Add a comment explaining the temporary nature of_create_cursor_for_partition
The
_create_cursor_for_partition
method is introduced to dynamically create cursors for partitions. Could we add a comment to explain why this method is necessary and mention that it's a temporary workaround until the cursor is decoupled from the concurrent cursor implementation? This would help future maintainers understand its purpose. Wdyt?
* remotes/airbyte/main: fix(airbyte-cdk): Fix RequestOptionsProvider for PerPartitionWithGlobalCursor (airbytehq#254) feat(low-code): add profile assertion flow to oauth authenticator component (airbytehq#236) feat(Low-Code Concurrent CDK): Add ConcurrentPerPartitionCursor (airbytehq#111) fix: don't mypy unit_tests (airbytehq#241) fix: handle backoff_strategies in CompositeErrorHandler (airbytehq#225) feat(concurrent cursor): attempt at clamping datetime (airbytehq#234) ci: use `ubuntu-24.04` explicitly (resolves CI warnings) (airbytehq#244) Fix(sdm): module ref issue in python components import (airbytehq#243) feat(source-declarative-manifest): add support for custom Python components from dynamic text input (airbytehq#174) chore(deps): bump avro from 1.11.3 to 1.12.0 (airbytehq#133) docs: comments on what the `Dockerfile` is for (airbytehq#240) chore: move ruff configuration to dedicated ruff.toml file (airbytehq#237)
Summary by CodeRabbit
New Features
Refactor
Tests