Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: enable handling of nested fields when injecting request_option in request body_json #201

Open
wants to merge 28 commits into
base: main
Choose a base branch
from

Conversation

ChristoGrab
Copy link
Collaborator

@ChristoGrab ChristoGrab commented Jan 6, 2025

What

To support GraphQL queries natively in the CDK, we need to add handling of nested structures when injecting a RequestOption into a request body (specifically with body_json injection).

How

  • The RequestOption class now includes an optional field_path param, and each instance must declare a field_path OR field_name.
  • The new inject_into_request method determines how to handle the injection based on the existense of field_path and the specific RequestOptionType. As stated above, only body_json injections will support nested fields.
  • The combine_mappings utility function now handles nested structures when checking for mapping conflicts.
  • The following components have been updated to call on `inject_into_request':
    1. ApiKeyAuthenticator
    2. DatetimeBasedCursor
    3. DatetimeBasedRequestOptionsProvider
    4. DefaultPaginator
    5. ListPartitionRouter
    6. SubstreamPartitionRouter

Suggested review order

  1. request_option.py: Has the main updated logic for validation and the new method inject_into_request
  2. mapping_helpers.py: added a helper method and extended the main function for catching conflicts in nested structures.
  3. declarative_component_schema.yaml and declarative_component_schema.py: Updated schema + model
  4. token.py, datetime_based_cursor.py, datetime_based_request_options_provider.py, default_paginator.py, list_partition_router.py and substream_partition_router.py all have minor updates to leverage the new method.
  5. Everything else is unit testing

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for nested field injection in request options.
    • Enhanced request parameter handling with more flexible configuration.
    • Introduced field_path for specifying nested structures in JSON requests.
  • Improvements

    • Refined request option injection logic.
    • Improved type annotations and type handling.
    • Updated mapping merge strategies, especially for JSON body requests.
    • Enhanced validation for request option configurations.
  • Bug Fixes

    • Resolved issues with request parameter and header injections.
    • Improved handling of configuration interpolation in request options.
  • Testing

    • Added comprehensive unit tests for request option and mapping functionality.
    • Expanded test coverage for nested field injection scenarios.

@github-actions github-actions bot added the enhancement New feature or request label Jan 6, 2025
@ChristoGrab ChristoGrab changed the title feat: handle nested feat: enable handling of nested field_paths in request_option injection Jan 6, 2025
@ChristoGrab ChristoGrab changed the title feat: enable handling of nested field_paths in request_option injection feat: enable handling of nested fields when injecting request_option in request body_json Jan 9, 2025
Copy link
Contributor

@maxi297 maxi297 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding some comments on this PR even if it is draft. Poke me when you want me to check it again

value: The value to inject
config: The config object to use for interpolation
"""
if self._is_field_path:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we simplify the logic by having in post_init:

        if self.field_name is not None:
            self.field_path = [InterpolatedString.create(self.field_name, parameters=parameters)]

This way, we would only have one logic to maintain and it would be the field_path one.

Copy link
Collaborator Author

@ChristoGrab ChristoGrab Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit skeptical that this could potentially lead to conflicts in the Builder if we started mixing the attributes, but I think I see what you're saying. If we just modify the internal logic here in the CDK to allow field_name, but only use it to update and use field_path instead, we simplify the backend handling while still allowing the Builder to define them as entirely separate fields, correct?

My only hangup is wondering if there isn't still some scenario where by allowing field_path to essentially override field_name, we could accidentally send those updated values back to the frontend? And it would be picked up as a manifest change? I don't yet know enough about the full lifecycle of the back and forth between the CDK and Builder components to know if that's a valid concern.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit skeptical that this could potentially lead to conflicts in the Builder if we started mixing the attributes

If this is the case, we haven't made a good job at designing our interfaces and we will lack the benefits of easy maintenance. In an ideal world, we should be able to replace the whole CDK implementation by something else and the Connector Builder should continue to work.

we simplify the backend handling

Yes! Instead of knowing that there are two cases, we all rely on the cases where we handle a list and that's it!

we could accidentally send those updated values back to the frontend?

My understand is that the only case we send back manifest information to the frontend is when we do resolve_manifest (or something like that) and it feels like this flow only work with the manifest directly and not with Python implementations so I would expect that we would be fine.

that string. Raise errors in the following cases:
* If there are duplicate keys across mappings
Combine multiple mappings into a single mapping, supporting nested structures.
If any of the mappings are a string, return that string. Raise errors in the following cases:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would precise that it is "If there only one mapping that is a string, it will return this mapping regardless of the other mappings".

That being said, I don't understand this behavior though. Maybe I'm wrong but before, I think it would fail if there was a string and a mapping, right?


# If any mapping is a string, return it
# Convert all mappings to flat (path, value) pairs for conflict detection
all_paths: List[List[tuple[List[str], Any]]] = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like the following logic is a bit complex. Would there be a way to simplify this? Right now, I see that we:

  • Extract all paths
  • Compare all paths to find conflicts
  • Create a new mapping that we can return

Would it be simpler to do just the last one and validate for conflicts while we do that? Something like:

def combine_mappings(
    mappings: List[Optional[Union[Mapping[str, Any], str]]],
) -> Union[Mapping[str, Any], str]:
    """
    Combine multiple mappings into a single mapping, supporting nested structures.
    If any of the mappings are a string, return that string. Raise errors in the following cases:
    * If there are conflicting paths across mappings (including nested conflicts)
    * If there are multiple string mappings
    * If there are multiple mappings containing keys and one of them is a string
    """
    if not mappings:
        return {}

    # Count how many string options we have, ignoring None values
    string_options = sum(isinstance(mapping, str) for mapping in mappings if mapping is not None)
    if string_options > 1:
        raise ValueError("Cannot combine multiple string options")

    # Filter out None values and empty mappings
    non_empty_mappings = [
        m for m in mappings if m is not None and not (isinstance(m, Mapping) and not m)
    ]

    # If there is only one string option, return it
    if string_options == 1:
        if len(non_empty_mappings) > 1:
            raise ValueError("Cannot combine multiple options if one is a string")
        return next(m for m in non_empty_mappings if isinstance(m, str))

    # Convert all mappings to flat (path, value) pairs for conflict detection
    for other in mappings[1:]:
        if other:
            merge(mappings[0], other)

    return mappings[0]


def merge(a: dict, b: dict, path=[]):
    """
    Blindly and shamelessly taken from https://stackoverflow.com/a/7205107
    """
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge(a[key], b[key], path + [str(key)])
            elif a[key] != b[key]:
                raise ValueError('Duplicate keys')
        else:
            a[key] = b[key]
    return a

If we are afraid of modifying the mappings in memory, we can create a deepcopy of it as well.

Copy link
Collaborator Author

@ChristoGrab ChristoGrab Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've modified the logic to be more in line with your suggestion. I removed the flattening and rebuilding of data in favour of a single helper method that recursively merges mappings directly, so it should hopefully be cleaner. Updated the tests as well.

Copy link
Collaborator Author

@ChristoGrab ChristoGrab Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: A number of unit tests went red due to the updated mapping logic being more permissive in allowing duplicate keys to exist. I made some more changes to handle body_json injections differently than other request types. So:

  1. For header, parameter, and body_data requests:

    • We still maintain strict validation where any duplicate key is considered a conflict
    • This preserves existing behavior and prevents unintended header/parameter collisions
  2. For body_json requests only:

    • We support nested structures
    • We allow duplicate keys if they have matching values
    • We only raise conflicts if same path has different values

This distinction is controlled by an allow_same_value_merge flag in combine_mappings which is only set to True for body_json requests. But I'm wondering if this is leading to the logic being overly contrived again. Another option could be to create two separate methods for handling either scenario?

@ChristoGrab ChristoGrab marked this pull request as ready for review January 22, 2025 01:32
@ChristoGrab ChristoGrab requested a review from maxi297 January 22, 2025 01:42
Copy link
Contributor

coderabbitai bot commented Jan 22, 2025

📝 Walkthrough

Walkthrough

The pull request introduces comprehensive modifications to the Airbyte CDK's declarative source components, focusing on enhancing request option handling, authentication, and schema flexibility. The changes primarily involve updating type annotations, refactoring request option injection methods, and introducing new capabilities like nested field path support for request configurations.

Changes

File Change Summary
airbyte_cdk/sources/declarative/auth/token.py Updated ApiKeyAuthenticator to remove parameters initialization and modify request option handling.
airbyte_cdk/sources/declarative/requesters/request_option.py Added field_path attribute, made field_name optional, and introduced new validation and injection methods.
airbyte_cdk/utils/mapping_helpers.py Added _merge_mappings function and updated combine_mappings to support more flexible mapping merging.
Multiple files Replaced Mapping with MutableMapping in type annotations and refactored request option injection methods.

Sequence Diagram

sequenceDiagram
    participant RequestOption
    participant Authenticator
    participant HttpRequester
    
    RequestOption->>Authenticator: Create with field_path/field_name
    Authenticator->>HttpRequester: Inject request options
    HttpRequester-->>Authenticator: Populate request parameters
    RequestOption->>RequestOption: Validate injection type
Loading

Possibly related PRs

Suggested Labels

bug

Suggested Reviewers

  • maxi297
  • lazebnyi
  • aldogonzalez8

Hey there! 👋 I noticed some interesting changes in how request options are being handled. The introduction of field_path and the more flexible request option injection looks promising. Wdyt about adding some additional documentation to explain these new capabilities?

Would you be interested in expanding the README or creating a migration guide to help users understand the new nested field injection feature? It could really help smooth the transition for existing declarative source implementations. 🚀

✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (10)
airbyte_cdk/sources/declarative/requesters/request_option.py (1)

43-49: Could we simplify the mutual exclusivity checks?

Hi! I noticed that we're checking for mutual exclusivity between field_name and field_path with separate conditions. Perhaps we could simplify this by ensuring exactly one is provided using a single condition:

if (self.field_name is None) == (self.field_path is None):
    raise ValueError("RequestOption requires exactly one of field_name or field_path")

This way, we handle both cases in one check. Wdyt?

unit_tests/utils/test_mapping_helpers.py (2)

19-44: Parameterizing string handling tests enhances clarity!

Consolidating the string handling tests with parameterization makes the intent of each test case clearer and reduces code duplication. This approach is very effective. Wdyt?


97-106: Consider providing more detailed error messages in tests?

In the test_body_json_requests, the expected error message is "Duplicate keys found". Do you think adding more context to the error message could help in identifying issues faster during debugging? For example, specifying which keys are duplicated. Wdyt?

airbyte_cdk/utils/mapping_helpers.py (2)

10-45: LGTM! Consider adding doctest examples?

The implementation is robust and handles nested structures well. Would you consider adding some doctest examples to demonstrate the different behaviors between allow_same_value_merge=True and False? This could help future maintainers understand the nuances quickly, wdyt?

Example:

def _merge_mappings(
    target: Dict[str, Any],
    source: Mapping[str, Any],
    path: Optional[List[str]] = None,
    allow_same_value_merge: bool = False,
) -> None:
    """
    Recursively merge two dictionaries, raising an error if there are any conflicts.
    
    >>> target = {"a": 1}
    >>> source = {"a": 1}
    >>> _merge_mappings(target, source, allow_same_value_merge=True)  # No error
    >>> target
    {'a': 1}
    
    >>> target = {"a": 1}
    >>> source = {"a": 1}
    >>> _merge_mappings(target, source, allow_same_value_merge=False)  # Raises ValueError
    Traceback (most recent call last):
        ...
    ValueError: Duplicate keys found: a
    """

47-104: LGTM! Consider enhancing type hints?

The implementation is clean and well-documented. Would you consider making the type hints more specific for the return type? Instead of Union[Mapping[str, Any], str], we could use a TypeVar to preserve the exact input type, wdyt?

from typing import TypeVar

T = TypeVar('T', bound=Union[Mapping[str, Any], str])

def combine_mappings(
    mappings: List[Optional[T]],
    allow_same_value_merge: bool = False,
) -> T:
unit_tests/sources/declarative/requesters/request_options/test_request_options.py (1)

51-93: LGTM! Consider adding edge cases?

Great test coverage! Would you consider adding a few edge cases to test_inject_into_request_cases, wdyt?

  1. Empty path segments: field_path=["data", "", "field"]
  2. Special characters in paths: field_path=["data", "field.with.dots"]
  3. Integer keys: field_path=["data", "0", "field"]
unit_tests/sources/declarative/auth/test_token_auth.py (1)

253-285: LGTM! Consider adding error cases?

Great test cases for the happy path! Would you consider adding some error scenarios to ensure proper validation, wdyt?

  1. Invalid field path (e.g., None in path)
  2. Circular interpolation in path
  3. Non-string interpolation results
airbyte_cdk/sources/declarative/auth/token.py (1)

58-60: Consider adding a docstring for clarity, wdyt?

The _get_request_options method has been updated to use MutableMapping and the new inject_into_request method. While the changes look good, adding a docstring would help explain the purpose of these changes, especially regarding nested field support.

 def _get_request_options(self, option_type: RequestOptionType) -> Mapping[str, Any]:
+    """
+    Get request options for the specified option type.
+    
+    Args:
+        option_type: The type of request option (header, parameter, body_data, or body_json)
+    
+    Returns:
+        A mapping containing the request options, supporting nested fields for body_json
+    """
     options: MutableMapping[str, Any] = {}
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

2819-2829: Consider adding mutual exclusivity validation?

The new field_path property looks great! Since field_name and field_path serve similar purposes but in different contexts, would it make sense to add validation to ensure they're not used simultaneously? This could help prevent confusion and potential conflicts, wdyt?

Example validation in the schema:

oneOf:
  - required: [field_name]
    not:
      required: [field_path]
  - required: [field_path]
    not:
      required: [field_name]
airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (1)

131-133: Consider adding a comment explaining the special handling of body_json.

The introduction of is_body_json flag affects mapping behavior. Would you consider adding a brief comment explaining why body_json requests need different handling for merging mappings with same values? This would help future maintainers understand the intent better.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1494458 and e36a06a.

📒 Files selected for processing (19)
  • airbyte_cdk/sources/declarative/auth/token.py (2 hunks)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1 hunks)
  • airbyte_cdk/sources/declarative/incremental/datetime_based_cursor.py (1 hunks)
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py (3 hunks)
  • airbyte_cdk/sources/declarative/partition_routers/list_partition_router.py (2 hunks)
  • airbyte_cdk/sources/declarative/partition_routers/substream_partition_router.py (3 hunks)
  • airbyte_cdk/sources/declarative/requesters/http_requester.py (2 hunks)
  • airbyte_cdk/sources/declarative/requesters/paginators/default_paginator.py (2 hunks)
  • airbyte_cdk/sources/declarative/requesters/request_option.py (2 hunks)
  • airbyte_cdk/sources/declarative/requesters/request_options/datetime_based_request_options_provider.py (1 hunks)
  • airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (2 hunks)
  • airbyte_cdk/utils/mapping_helpers.py (1 hunks)
  • unit_tests/sources/declarative/auth/test_token_auth.py (1 hunks)
  • unit_tests/sources/declarative/incremental/test_datetime_based_cursor.py (5 hunks)
  • unit_tests/sources/declarative/requesters/paginators/test_default_paginator.py (1 hunks)
  • unit_tests/sources/declarative/requesters/request_options/test_request_options.py (1 hunks)
  • unit_tests/sources/declarative/requesters/test_http_requester.py (1 hunks)
  • unit_tests/sources/declarative/retrievers/test_simple_retriever.py (6 hunks)
  • unit_tests/utils/test_mapping_helpers.py (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • unit_tests/sources/declarative/requesters/paginators/test_default_paginator.py
  • unit_tests/sources/declarative/retrievers/test_simple_retriever.py
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (20)
airbyte_cdk/sources/declarative/requesters/request_option.py (2)

58-65: Should we unify field handling by always using field_path?

I see that we're handling field_name and field_path separately. What do you think about converting field_name into a single-element field_path list during initialization? This might simplify the code by allowing us to work exclusively with field_path. For example:

if self.field_name is not None:
    self.field_path = [InterpolatedString.create(self.field_name, parameters=parameters)]

This could reduce conditional logic later on. Wdyt?


85-116: Handling inject_into_request logic elegantly!

The use of inject_into_request method to handle both field_name and field_path cases is clean and enhances readability. It effectively encapsulates the injection logic. Nice work!

airbyte_cdk/sources/declarative/requesters/request_options/datetime_based_request_options_provider.py (2)

85-87: Should we handle missing time values before injection?

Hi there! In the _get_request_options method, if start_time_value or end_time_value is missing from the stream_slice, we might end up injecting None into the request options. Do you think it would be good to check if these values are not None before injecting them? This could prevent potential issues with requests. Wdyt?

Also applies to: 89-91


83-91: Great refactoring to use inject_into_request method!

Switching to inject_into_request for injecting options enhances modularity and keeps the codebase consistent. This improves maintainability and readability. Well done!

unit_tests/utils/test_mapping_helpers.py (1)

6-17: Nice use of parameterization for basic functionality tests!

Refactoring the basic mapping tests into a parameterized function improves the test suite's structure and makes it easier to add new cases in the future. Great job!

airbyte_cdk/sources/declarative/partition_routers/list_partition_router.py (1)

103-105: LGTM! Clean integration with the new request option handling.

The changes properly utilize the new inject_into_request method while maintaining type safety with MutableMapping.

airbyte_cdk/sources/declarative/auth/token.py (1)

8-8: LGTM! Enhanced type imports.

The addition of MutableMapping to imports supports better type safety for request options handling.

airbyte_cdk/sources/declarative/requesters/paginators/default_paginator.py (1)

190-190: LGTM! Improved request option handling.

The changes look good:

  1. Using MutableMapping for better type safety
  2. Using inject_into_request for both page token and page size options
  3. Clear separation of concerns between token and page size injection

Also applies to: 199-199, 200-200, 206-208

airbyte_cdk/sources/declarative/requesters/http_requester.py (1)

202-204: Consider adding a test case for body_json merging, wdyt?

The addition of is_body_json flag and allow_same_value_merge parameter looks good for supporting nested fields in GraphQL queries. However, it would be beneficial to add test cases specifically for this scenario.

Run this script to check existing test coverage:

Also applies to: 214-215

airbyte_cdk/sources/declarative/partition_routers/substream_partition_router.py (1)

121-121: LGTM! Enhanced type safety and request option handling.

The changes improve the code by:

  1. Using MutableMapping for better type safety
  2. Using inject_into_request for cleaner parameter injection

Also applies to: 131-131

unit_tests/sources/declarative/requesters/test_http_requester.py (1)

280-281: Test cases updated to match new body_json behavior.

The test expectations now correctly reflect that same values can be merged in body_json context. This aligns with the changes in the implementation.

unit_tests/sources/declarative/incremental/test_datetime_based_cursor.py (2)

785-787: Great addition of nested field injection test case!

The new test case thoroughly validates the nested field injection capability for GraphQL queries. The test parameters are well structured and cover the key scenarios.

Also applies to: 827-845


872-874: Clean update to test helper code.

The test helper code has been updated to support both field_name and field_path parameters, maintaining backward compatibility while adding support for the new functionality.

Also applies to: 879-884

airbyte_cdk/sources/declarative/models/declarative_component_schema.py (3)

722-722: LGTM! Nice improvement in clarity.

Setting an explicit default value of None for the optional condition field makes the intent clearer.


1173-1178: Deprecation notice looks good.

The field description clearly indicates the plan to deprecate field_name in favor of field_path. This aligns with the previous discussion in PR comments.


1179-1183: LGTM! Well-documented new field for nested paths.

The new field_path field is well-documented with clear examples, and it aligns perfectly with the PR objective of supporting nested fields for GraphQL queries.

airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)

2801-2801: LGTM! Clear description update.

The updated description accurately reflects the addition of path-based field specification while maintaining backward compatibility.


2811-2812: LGTM! Well-documented deprecation notice.

The deprecation notice clearly explains that field_name will eventually be replaced by field_path while maintaining backward compatibility for the Builder. This helps users prepare for future changes.

airbyte_cdk/sources/declarative/incremental/datetime_based_cursor.py (1)

368-376: LGTM! Nice refactor of request option handling.

The changes improve maintainability by separating value extraction from injection logic. The use of inject_into_request method provides a more robust way to handle request options.

airbyte_cdk/sources/declarative/retrievers/simple_retriever.py (1)

149-149: Verify combine_mappings behavior with allow_same_value_merge.

Let's ensure the combine_mappings utility correctly handles the new parameter.

✅ Verification successful

combine_mappings implementation verified ✅

The utility correctly handles the allow_same_value_merge parameter with comprehensive test coverage for both body_json and non-body-json scenarios. The implementation in simple_retriever.py is using it correctly!

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check combine_mappings implementation for allow_same_value_merge parameter

# Search for combine_mappings implementation and its tests
rg -A 10 "def combine_mappings"

Length of output: 907


Script:

#!/bin/bash
# Get the full implementation and tests
rg -g "mapping_helpers*" -A 30 "def combine_mappings"

# Look for test files
fd "test.*mapping.*" --exec cat {}

Length of output: 9606

Copy link
Contributor

@maxi297 maxi297 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Can we just document that this is a "soft" breaking change meaning that Python code relying on RequestOption.field_name might stop working but the alternative should be pretty simple i.e. using the method inject_into_request?

value: The value to inject
config: The config object to use for interpolation
"""
if self._is_field_path:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit skeptical that this could potentially lead to conflicts in the Builder if we started mixing the attributes

If this is the case, we haven't made a good job at designing our interfaces and we will lack the benefits of easy maintenance. In an ideal world, we should be able to replace the whole CDK implementation by something else and the Connector Builder should continue to work.

we simplify the backend handling

Yes! Instead of knowing that there are two cases, we all rely on the cases where we handle a list and that's it!

we could accidentally send those updated values back to the frontend?

My understand is that the only case we send back manifest information to the frontend is when we do resolve_manifest (or something like that) and it feels like this flow only work with the manifest directly and not with Python implementations so I would expect that we would be fine.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

2077-2079: Consider passing the parameters from the model for better consistency
Right now, you are passing an empty dictionary as parameters. Perhaps we could pass model.parameters if needed for a consistent approach with the rest of the code. Wdyt?

- return RequestOption(
-     field_name=field_name, field_path=field_path, inject_into=inject_into, parameters={}
+ return RequestOption(
+     field_name=field_name,
+     field_path=field_path,
+     inject_into=inject_into,
+     parameters=model.parameters or {},
 )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e36a06a and fd919b1.

📒 Files selected for processing (1)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Analyze (python)
🔇 Additional comments (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

2075-2076: Add clarity on the type ignore directive?
It looks like you used # type: ignore to handle a potential mismatch in the declared type or the model.field_path. Maybe it's beneficial to refine the type annotation or handle the mismatch if there's a known reason behind it, wdyt?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants