Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Publish the response to process in the context of the record's transformation and filtering steps #193

Open
wants to merge 30 commits into
base: main
Choose a base branch
from

Conversation

rpopov
Copy link

@rpopov rpopov commented Dec 31, 2024

Feat: Publish the response to process in the context of the record's transformation and filtering steps
See:

In the declarative sources, after the record extraction, in the phase of record transformation and record filtering:

  • Bind the root object from which the record is extracted into the record itself to allow additional filed calculations.
  • The root object is:
    • the top-level JSON object, when the response is a single JSON object and the current record is nested in it;
    • the individual record object, when the response is a JSONL (file, CSV, etc) list or records
  • Bind the response to the $root ket, following the convention of $parameters object.
  • Establish the convention that any field name starting with "$" is a service field, visible only in the filtering and transformation phases, that should not be visible further in the process. Examples: $root, any other to be added in future, any added field using AddField transformation, holding local interim results)
  • Remove the service fields after the record transformation to avoid storing them.

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Enhanced JSON decoding with improved error handling and logging.
    • Added service key management for record extraction and transformation.
    • Improved record flattening with service key preservation.
    • Expanded incremental state handling for parent-child relationships in streams.
    • New test cases for validating manifest reading and caching behavior.
    • Enhanced test coverage for various scenarios related to service key management.
  • Bug Fixes

    • Fixed pagination strategy to handle empty decoder responses.
    • Improved record selector and extractor handling of nested records.
  • Tests

    • Expanded test coverage for JSON decoding, record extraction, and transformation.
    • Added comprehensive tests for service key management and edge cases.
    • Enhanced tests for incremental state management and caching behavior.
    • Updated tests for handling records and state management during pagination.
    • Added new tests for various scenarios in the SimpleRetriever and ManifestDeclarativeSource.
  • Documentation

    • Revised contributing guidelines for clarity and usability.
    • Updated release notes for consistency and improved readability.
  • Chores

    • Updated Prettier dependency to the latest version.

and avoid presenting the empty JSON as an array of an empty object
…lic structures

There is an internal Flatten fransformation - let it skip the service fields
processing other than dict types. Fixed the end page parsing, when it is
empty, the token read defaults to {}
* local for the filter and transform steps
* not visible to the next mapping and store steps (as they should be)
* not visible in the tests beyond the test_record_selector (as they should be)

This allows the record transformation and filtering logic to define its
"local vcariables" in the format of service fields to reuse some interim
calculations.
… the

record selection (filteriung and transformation), as these fields became
local for the steps above.
Copy link
Contributor

coderabbitai bot commented Dec 31, 2024

📝 Walkthrough

Walkthrough

The pull request introduces several enhancements to the Airbyte Declarative Source SDK, focusing on improving error handling, record extraction, and service key management. The changes span multiple files in the airbyte_cdk package, primarily addressing JSON decoding, record transformation, and extraction processes. Modifications include more robust handling of JSON responses, better service key management, and more flexible record processing across various components of the declarative source implementation.

Changes

File Change Summary
airbyte_cdk/sources/declarative/decoders/json_decoder.py Enhanced error handling in decode method with improved logging and JSON parsing. Simplified parse_body_json method.
airbyte_cdk/sources/declarative/extractors/dpath_extractor.py Added RECORD_ROOT_KEY constant and update_record function to incorporate root response information.
airbyte_cdk/sources/declarative/extractors/record_extractor.py Introduced utility functions for managing service keys, including exclude_service_keys, remove_service_keys, and others.
airbyte_cdk/sources/declarative/extractors/record_selector.py Added _remove_service_keys method to filter out service keys from records.
airbyte_cdk/sources/declarative/requesters/paginators/strategies/cursor_pagination_strategy.py Modified next_page_token to handle empty decoder responses gracefully.
airbyte_cdk/sources/declarative/transformations/flatten_fields.py Updated flatten_record method to preserve service keys during record flattening.
unit_tests/sources/declarative/decoders/test_json_decoder.py Updated tests for clarity and coverage regarding JSON decoding behavior.
unit_tests/sources/declarative/extractors/test_dpath_extractor.py Enhanced tests for DpathExtractor to cover various scenarios, including service key management.
unit_tests/sources/declarative/extractors/test_record_extractor.py Added tests for service key management functions.
unit_tests/sources/declarative/extractors/test_record_selector.py Modified tests to incorporate service key exclusion logic.
unit_tests/sources/declarative/partition_routers/test_parent_state_stream.py Expanded test coverage for incremental state handling in parent-child relationships.
unit_tests/sources/declarative/retrievers/test_simple_retriever.py Updated tests to validate state management during pagination.
unit_tests/sources/declarative/test_manifest_declarative_source.py Added tests for manifest reading and caching behavior.
unit_tests/sources/declarative/transformations/test_flatten_fields.py Enhanced tests for flattening functionality with service key scenarios.
docs/CONTRIBUTING.md Updated instructions for code formatting and running tests.
docs/RELEASES.md Improved formatting and clarity of release management instructions.
package.json Updated prettier dependency version from ^3.3.3 to ^3.4.2.

Possibly related PRs

  • feat: add download_decoder + download_extractor #50: The changes in the main PR regarding the JsonDecoder class and its methods are related to the addition of the download_decoder property in the AsyncRetriever, which includes the JsonDecoder as one of the decoder types.
  • feat(low-code cdk): add component resolver and http component resolver #88: The modifications in the main PR to enhance error handling in the JsonDecoder class are relevant to the changes in the ConcurrentDeclarativeSource, which now includes logic for handling streams that do not utilize incremental processing, potentially affecting how JSON responses are decoded.
  • feat(low-code cdk): add config component resolver #149: The updates to the JsonDecoder class in the main PR are connected to the introduction of the ConfigComponentsResolver, which may utilize the JsonDecoder for processing configurations dynamically.
  • docs: add SDM release troubleshooting info #157: The improvements in error handling and response parsing in the JsonDecoder class align with the documentation updates that clarify the interaction between the Connector Builder and the CDK, emphasizing the importance of robust error handling in the context of building connectors.
  • fix: fix sorting & __init__ imports #189: The changes in the main PR to improve the clarity of error logging in the JsonDecoder class are relevant to the overall restructuring of imports and organization in the codebase, which aims to enhance maintainability and readability.

Suggested labels

enhancement

Suggested reviewers

  • aaronsteers
  • maxi297

What do you think? Is there anything specific you'd like to dive deeper into? 😊

Finishing Touches

  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (31)
airbyte_cdk/sources/declarative/extractors/record_extractor.py (4)

11-16: Neat introduction of service key convention!
This clearly explains why a prefix is needed to segregate service keys. Would you consider adding a brief usage example in a docstring, wdyt?


22-26: In-place removal of service keys
This function effectively mutates the original dictionary. Would you consider highlighting the mutating nature in the function name, for instance remove_service_keys_in_place, wdyt?


28-30: Helper function is_service_key
The prefix check is straightforward. One minor note: using startswith(SERVICE_KEY_PREFIX) might be more expressive. wdyt?


32-33: Assertion for presence of service keys
This ensures that at least one service key remains in the mapping, which is helpful for validating transformations. Would you consider returning a more descriptive error message to clarify which key is missing, wdyt?

airbyte_cdk/sources/declarative/transformations/flatten_fields.py (2)

8-8: Usage of is_service_key
Nice import to preserve the logic in a single place. Would you consider re-exporting is_service_key from a central file so other modules also import it consistently, wdyt?


36-44: Preserving service keys
By skipping flattening for service keys, you efficiently maintain their structure. Could we add a comment to describe why we early-return for service keys, wdyt?

unit_tests/sources/declarative/transformations/test_flatten_fields.py (1)

50-58: Test preserving service key with a string value
This sequence effectively checks that the prefixed field remains unflattened. Maybe we can add an additional assertion verifying the shape of the resultant dictionary for completeness, wdyt?

unit_tests/sources/declarative/extractors/test_record_extractor.py (2)

1-12: New test file creation
Good to see a dedicated test suite for service key handling. Maybe consider adding docstrings describing the purpose behind each test function, wdyt?


30-44: Test remove_service_keys
Great approach for ensuring in-place key removal. Possibly add a scenario with nested dictionaries to confirm behavior, wdyt?

airbyte_cdk/sources/declarative/decoders/json_decoder.py (2)

38-41: Graceful handling of JSON decode failure
Catching JSONDecodeError and yielding from an empty list is a smooth approach to avoid breaking the flow. Would you be open to adding a short comment clarifying that we skip all records if parsing fails, wdyt?


47-50: Flexible handling of list vs. single object
Yielding directly from the object if it’s a list, or wrapping it into a list otherwise, streamlines the logic. Looks good. Would you consider adding a small docstring example clarifying the single-object vs. multiple-object scenario, wdyt?

airbyte_cdk/sources/declarative/requesters/paginators/strategies/cursor_pagination_strategy.py (1)

65-65: Fallback for decoder results
Using next(self.decoder.decode(response), {}) prevents StopIteration if nothing is yielded, which is handy. Do we want to log a warning if this default kicks in, wdyt?

airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (3)

19-21: Establishing a consistent root key
Defining RECORD_ROOT_KEY with SERVICE_KEY_PREFIX + "root" is a neat pattern. Any chance we might rename it to __root__ or something similarly distinctive, wdyt?


88-88: Preserving the complete body as root
Storing body in root_response is a convenient handle for later usage. Perhaps we might add a comment explaining its role in subsequent transformations, wdyt?


98-100: Adding root context to each record
Using update_record(record, root_response) for items in the extracted list is great for easy transformations. Would you consider a quick doc mention that each record now includes the $root data, wdyt?

unit_tests/sources/declarative/decoders/test_json_decoder.py (2)

15-18: Model and factory imports
Importing DeclarativeStream and ModelToComponentFactory might be part of a broader test suite. Should we add a small integration test showcasing these, wdyt?


22-27: Testing empty and non-empty JSON
The parameterization ensures coverage for empty and valid JSON, which is great. Maybe add a test case for malformed JSON to confirm we yield an empty list, wdyt?

airbyte_cdk/sources/declarative/extractors/record_selector.py (1)

114-115: Great approach filtering service keys before normalization!
By removing these keys early, you avoid persisting unnecessary or sensitive fields in subsequent transformations. Would you consider adding a small inline comment explaining that this step is purely for service key removal, so new contributors understand the rationale? Wdyt?

unit_tests/sources/declarative/extractors/test_dpath_extractor.py (5)

45-46: Well-structured nested array test coverage!
Testing with a single record containing nested entries is helpful. Wdyt about adding checks for uniquely typed nested fields?


47-51: Extended nested structure test ensures correctness.
It’s great for verifying deeper nesting. Should we also test a scenario with partial missing nested fields? Wdyt?


58-63: Valid approach for repeated testing of deeply nested arrays!
Everything is consistent with previous param sets. Would you like to verify any type conversions here, or is that out of scope? Wdyt?


66-71: Check for single-field nested extraction is well-defined!
No issues here. Possibly consider negative tests (e.g., absent ‘id’ key). Wdyt?


169-170: Ensuring nested extraction with JSONL lines is validated.
Are you comfortable with the coverage? Possibly add an invalid JSON line test? Wdyt?

unit_tests/sources/declarative/extractors/test_record_selector.py (4)

13-16: Adding exclude_service_keys and verify_service_keys_exist is consistent with the rest of the suite.
This promotes uniform logic. Wdyt about also testing default behavior (no service keys found)?


47-60: New test for returning all records with nested data.
It’s well-defined. Would you consider adding a partial-nesting scenario for completeness? Wdyt?


62-75: Testing {{True}} filter with nested data.
Covers a universal pass scenario effectively. Possibly try a trivial fail scenario too?


166-168: Applying exclude_service_keys after record selection is correct.
This ensures the final output is free of service fields. Maybe mention that in code comments, wdyt?

unit_tests/sources/declarative/retrievers/test_simple_retriever.py (1)

242-242: New assertion checking final state for full refresh sync completion.
This is excellent for clarity. Consider adding a note about possible partial state merges? Wdyt?

unit_tests/sources/declarative/test_manifest_declarative_source.py (1)

1764-1764: Add a descriptive message in the assertion? Since this test checks if the discovered records match the expected ones, do you want to include a clarifying message in the assert to help diagnose failures more quickly? For example, assert output_data == expected_records, f"Mismatch in {test_name}...". wdyt?

unit_tests/sources/declarative/partition_routers/test_parent_state_stream.py (2)

325-327: Use a more descriptive variable name for clarity?
Currently, records_from_state might benefit from a more explicit name that conveys its content and purpose, such as records_read_from_intermediate_state. Wdyt?


Line range hint 331-333: Consider an alternative deduplication approach for large data sets?
Here, converting records to a JSON string and using them as dictionary keys might be memory-intensive for large volumes of data. Would you consider using a more straightforward approach (e.g., hashing tuples of normalized content) to reduce overhead? Wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f8cb659 and e693c94.

📒 Files selected for processing (14)
  • airbyte_cdk/sources/declarative/decoders/json_decoder.py (1 hunks)
  • airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (3 hunks)
  • airbyte_cdk/sources/declarative/extractors/record_extractor.py (1 hunks)
  • airbyte_cdk/sources/declarative/extractors/record_selector.py (3 hunks)
  • airbyte_cdk/sources/declarative/requesters/paginators/strategies/cursor_pagination_strategy.py (1 hunks)
  • airbyte_cdk/sources/declarative/transformations/flatten_fields.py (2 hunks)
  • unit_tests/sources/declarative/decoders/test_json_decoder.py (1 hunks)
  • unit_tests/sources/declarative/extractors/test_dpath_extractor.py (5 hunks)
  • unit_tests/sources/declarative/extractors/test_record_extractor.py (1 hunks)
  • unit_tests/sources/declarative/extractors/test_record_selector.py (5 hunks)
  • unit_tests/sources/declarative/partition_routers/test_parent_state_stream.py (1 hunks)
  • unit_tests/sources/declarative/retrievers/test_simple_retriever.py (2 hunks)
  • unit_tests/sources/declarative/test_manifest_declarative_source.py (1 hunks)
  • unit_tests/sources/declarative/transformations/test_flatten_fields.py (2 hunks)
🔇 Additional comments (32)
airbyte_cdk/sources/declarative/extractors/record_extractor.py (1)

18-19: Function exclude_service_keys
Great approach to returning a filtered mapping that excludes service keys. Would you consider clarifying in the docstring that this function doesn't mutate the original mapping, wdyt?

unit_tests/sources/declarative/transformations/test_flatten_fields.py (2)

7-7: Import of SERVICE_KEY_PREFIX
Bringing in the constant aids test readability. Looks great!


59-67: Set usage with service key
Admirable to test sets alongside lists. Everything looks good. Perhaps you could add a case with nested service keys in the future, wdyt?

unit_tests/sources/declarative/extractors/test_record_extractor.py (4)

15-28: Test exclude_service_keys
Testing multiple scenarios is helpful. This thoroughness is great for coverage!


46-56: Check presence of service keys
The parameterized approach covers a variety of input shapes. Looks comprehensive!


58-67: Handling the absence of service keys
Clever usage of try/except to confirm the assertion is raised. No further issues here.


69-74: Function test_service_field and test_regular_field
Straightforward tests confirming the logic behind service and non-service fields. This is quite clear and concise!

airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (3)

12-15: Imports for service keys
These additions ensure we can properly reference the SERVICE_KEY_PREFIX. Everything looks in order. Feeling confident about it, wdyt?


23-30: Utility function for record injection
update_record effectively injects the root data when record is a dict. Would you consider logging or handling non-dict cases differently instead of passing them through unchanged, wdyt?


101-101: Single object extraction
Similarly, returning the updated single object ensures consistency. LGTM. Maybe just confirm there's no performance overhead from the extra dictionary copy, wdyt?

unit_tests/sources/declarative/decoders/test_json_decoder.py (4)

11-12: New imports for testing
Introducing YamlDeclarativeSource and SyncMode could indicate expanded scenarios. Are we using them in these tests, or do we plan to, wdyt?


30-33: Verifying empty string yields empty list
This aligns with the new JsonDecoder behavior. Kudos for thorough testing. Good to go!


40-41: jsonl decoder edge cases
Checking an empty object returns [{}] and an empty list returns [[]] is interesting. Is that the desired behavior, or should empty lists remain empty, wdyt?


48-54: Clearer parameter IDs
The new ids for parameterized tests clarifies each scenario well. This is helpful in test logs. Good job!

airbyte_cdk/sources/declarative/extractors/record_selector.py (2)

11-14: Imports from record_extractor look good!
Nice job importing exclude_service_keys here. This keeps validation logic centralized. Wdyt about also adding a short docstring to indicate why it’s being imported?


164-169: Helper method _remove_service_keys is concise and cohesive!
It clearly separates the logic for removing service keys from the main flow. The name is self-explanatory, which is great. Wdyt about possibly adding a quick docstring snippet describing usage?

unit_tests/sources/declarative/extractors/test_dpath_extractor.py (13)

18-21: LGTM on the new imports!
Bringing in exclude_service_keys and verify_service_keys_exist ensures consistent testing across your modules. Wdyt about clarifying in code comments how each function is used in these tests?


40-41: Empty string handling test looks good!
It’s valuable to confirm no records are returned for empty input. Everything seems fine here.


52-57: Multiple-record nested scenario coverage is clear!
No immediate issues. Maybe consider adding a note on expected behavior if partially malformed JSON arises?


64-65: Single field extraction from root object tested.
This single-case approach is straightforward. Looks good to me!


72-77: Similar patterned test for multiple nested fields.
Maintaining consistency is a good practice. Thanks for adding clarity.


78-83: Handling multiple dictionaries at the same depth.
All good. The param definitions are descriptive.


84-94: Thorough test with an even bigger nested structure.
Thanks for covering that scenario. Seems robust!


126-127: JSONL extraction tests extended for empty data.
That ensures no unexpected defaults appear. Good coverage!


156-160: Test names are descriptive!
Naming them like “test_extract_from_empty_string” is helpful. This improves maintainability.


161-164: Continuing the pattern with param-based test IDs.
All consistent. This is a neat approach.


167-168: Confirmation that JSONL extraction matches single array.
Looks great; no changes needed.


177-178: Handles multiline JSONL with text fields.
Yes, this is crucial to ensure newlines are handled properly.


196-200: Service key management verification is great!
This final step cements correctness. Encouraging thorough coverage.

unit_tests/sources/declarative/extractors/test_record_selector.py (1)

111-123: Verifies use of $root in filter conditions.
This is a valuable scenario, capturing references to the parent. Great job!

unit_tests/sources/declarative/retrievers/test_simple_retriever.py (1)

360-360: Ensuring __ab_full_refresh_sync_complete is set properly.
This line addresses the correct final state. LGTM!

unit_tests/sources/declarative/partition_routers/test_parent_state_stream.py (1)

Line range hint 335-341: Double-check your sorting logic for correctness?
Using a string serialization for sorting is perfectly acceptable in tests. However, if the order of keys changes or if floating-point precision is altered, it might trigger unexpected results. Did you consider any possible corner cases with sorting by JSON representation? Wdyt?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (1)

23-29: Consider adding type hints for better code clarity, wdyt?

The function looks good! Would you consider adding more specific type hints to improve code clarity? For example:

-def update_record(record: Any, root: Any) -> Any:
+def update_record(record: Union[dict, Any], root: Union[dict, list, Any]) -> Union[dict, Any]:
docs/CONTRIBUTING.md (1)

43-44: Consider rephrasing to avoid repetition, wdyt?

The instructions are clear, but there's repetition of "Run" at the start of consecutive lines. Maybe we could rephrase like this?

-Run `poetry run ruff format` to format your Python code.
-Run `poetry run ruff check .` to report the not fixed issues. Fix them manually.
+Use `poetry run ruff format` to format your Python code.
+Then `poetry run ruff check .` to report any remaining issues. Fix these manually.
🧰 Tools
🪛 LanguageTool

[style] ~44-~44: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...f formatto format your Python code. - Runpoetry run ruff check .` to report the...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

docs/RELEASES.md (2)

52-52: Consider creating a GitHub issue for the TODO.

This TODO seems important for improving the release process visibility. Would you like me to create a GitHub issue to track this improvement?


60-64: Consider improving the formatting of the instructions.

The instructions are clear but could be more readable with better formatting. What do you think about this structure?

-Once the publish pipeline has completed, choose a connector to test. Set the base_image in the connector's metadata to your pre-release version in Dockerhub (make sure to update the SHA as well).
-Next, build the pre-release image locally using `airbyte-ci connectors —name=<source> build`.
-You can now run connector interfaces against the built image using the pattern `docker run airbyte/<source-name>:dev <spec/check/discover/read>`.
-The connector's README should include a list of these commands, which can be copy/pasted and run from the connector's directory for quick testing against a local config.
-You can also run `airbyte-ci connectors —name=<source> test` to run the CI test suite against the dev image.
+1. Choose a connector to test after the publish pipeline completes
+2. Update the connector's metadata:
+   - Set base_image to your pre-release version in Dockerhub
+   - Update the SHA accordingly
+3. Build the pre-release image locally:
+   ```bash
+   airbyte-ci connectors —name=<source> build
+   ```
+4. Run connector interfaces against the built image:
+   ```bash
+   docker run airbyte/<source-name>:dev <spec/check/discover/read>
+   ```
+   Note: The connector's README includes these commands for quick testing.
+5. Optionally run the CI test suite:
+   ```bash
+   airbyte-ci connectors —name=<source> test
+   ```
🧰 Tools
🪛 LanguageTool

[grammar] ~60-~60: Did you mean the noun “publishing”?
Context: ...ting Manifest-Only connectors Once the publish pipeline has completed, choose a connec...

(PREPOSITION_VERB)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e693c94 and daaea38.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (6)
  • airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (3 hunks)
  • airbyte_cdk/sources/declarative/extractors/record_extractor.py (1 hunks)
  • docs/CONTRIBUTING.md (1 hunks)
  • docs/RELEASES.md (2 hunks)
  • package.json (1 hunks)
  • unit_tests/sources/declarative/extractors/test_record_extractor.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • package.json
🚧 Files skipped from review as they are similar to previous changes (2)
  • airbyte_cdk/sources/declarative/extractors/record_extractor.py
  • unit_tests/sources/declarative/extractors/test_record_extractor.py
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~44-~44: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...f formatto format your Python code. - Runpoetry run ruff check .` to report the...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

docs/RELEASES.md

[uncategorized] ~56-~56: Possible missing article found.
Context: ...lly test changes against a dev image of SDM before committing to a release, first u...

(AI_HYDRA_LEO_MISSING_THE)


[grammar] ~60-~60: Did you mean the noun “publishing”?
Context: ...ting Manifest-Only connectors Once the publish pipeline has completed, choose a connec...

(PREPOSITION_VERB)


[grammar] ~68-~68: Did you mean the noun “publishing”?
Context: ...ng Low-Code Python connectors Once the publish pipeline has completed, set the version...

(PREPOSITION_VERB)

🔇 Additional comments (2)
airbyte_cdk/sources/declarative/extractors/dpath_extractor.py (2)

19-20: LGTM! Clear and descriptive constant definition.

The constant follows the established naming convention and its purpose is well-documented in the comment.


Line range hint 88-101: Verify the memory impact of storing the root response.

The changes look good! Just want to ensure we've considered the memory implications:

  1. The root response is now stored in each record
  2. For large responses, this could increase memory usage significantly

Also, would you consider adding a comment explaining why we're storing the root response? This would help future maintainers understand the design decision. 🤔

✅ Verification successful

Let me search for test cases and actual usage patterns to better understand the memory implications. The previous command might have failed because the path pattern wasn't specific enough.


Memory impact looks reasonable based on test cases 🌟

Looking at the test cases, I can see that the responses being handled are relatively small (mostly containing simple records with IDs and nested objects). The update_record function creates a shallow copy of the dictionary and only stores a reference to the root response, not a deep copy.

A few observations that make me feel confident about the memory usage:

  • Most test cases work with small JSON payloads
  • The root response is stored by reference, not by copying
  • The pattern is used consistently across all test cases without any reported issues

Would you consider adding a quick comment explaining that we're storing a reference to the root response for downstream processing? Something like:

# Store reference to root response to allow downstream processors access to the full context
root_response = body

wdyt? 🤔

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for typical response sizes in tests
rg -A 5 "def test.*extract_records" unit_tests/

Length of output: 49


Script:

#!/bin/bash
# Let's check the test files specifically for the DPath extractor
fd "test.*dpath.*\.py" unit_tests/ --exec cat {}

# Also check the actual usage of update_record to understand the memory implications
rg -B 3 -A 3 "update_record" 

Length of output: 9434

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
airbyte_cdk/sources/declarative/extractors/record_extractor.py (3)

17-18: Consider adding type hints for better code clarity, wdyt?

The function looks good, but we could make it more explicit about the return type being a new mapping rather than modifying the input.

-def exclude_service_keys(mapping: Mapping[str, Any]) -> Mapping[str, Any]:
+def exclude_service_keys(mapping: Mapping[str, Any]) -> dict[str, Any]:

21-24: Consider adding return type annotation for consistency.

The function modifies the dict in place, which is clear from the implementation. Would you consider adding a return type annotation of None to make this explicit?

-def remove_service_keys(mapping: dict[str, Any]):  # type: ignore[no-untyped-def]
+def remove_service_keys(mapping: dict[str, Any]) -> None:

31-32: Fix typo in assertion message.

There's a small typo in the assertion message.

-    assert mapping != exclude_service_keys(mapping), "Expected service are present"
+    assert mapping != exclude_service_keys(mapping), "Expected service keys are present"
docs/CONTRIBUTING.md (1)

43-46: Consider restructuring commands for better readability, wdyt?

The commands are clear but start with repetitive "Run". Maybe we could make it more concise and readable?

-# Run `poetry run ruff format` to format your Python code.
-# Run `poetry run ruff check --fix` to fix the formatting issues.
-# Run `poetry run ruff check .` to report the not fixed issues. Fix them manually.
-# Run `poetry run mypy --config-file mypy.ini airbyte_cdk` to validate the code. Resolve the reported issues.
+Format and validate your code using these commands:
+
+1. `poetry run ruff format` - Format Python code
+2. `poetry run ruff check --fix` - Auto-fix formatting issues
+3. `poetry run ruff check .` - Report remaining issues (fix manually)
+4. `poetry run mypy --config-file mypy.ini airbyte_cdk` - Validate type annotations
🧰 Tools
🪛 LanguageTool

[style] ~44-~44: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...f formatto format your Python code. - Runpoetry run ruff check --fix` to fix th...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~45-~45: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... --fixto fix the formatting issues. - Runpoetry run ruff check .` to report the...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between daaea38 and 9a05c94.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/extractors/record_extractor.py (1 hunks)
  • docs/CONTRIBUTING.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~44-~44: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...f formatto format your Python code. - Runpoetry run ruff check --fix` to fix th...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~45-~45: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... --fixto fix the formatting issues. - Runpoetry run ruff check .` to report the...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🔇 Additional comments (1)
airbyte_cdk/sources/declarative/extractors/record_extractor.py (1)

10-14: LGTM! Clear documentation of service field conventions.

The documentation clearly explains the convention for service fields and their prefix. This aligns well with the PR objective of binding the response root object to records.

Minor edit to trigger the reevaluation of the integration tests.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
docs/RELEASES.md (2)

60-61: How about making the testing instructions more explicit? 🤔

The instructions are good, but what do you think about these small improvements to make them even clearer?

  1. Replace angle brackets with concrete examples:
-airbyte-ci connectors —name=<source> build
+airbyte-ci connectors —name=source-example build  # Replace 'source-example' with your connector name
  1. Add a note about configuration requirements:
 You can now run connector interfaces against the built image using the pattern `docker run airbyte/<source-name>:dev <spec/check/discover/read>`.
+# Make sure you have a valid config.json file in your current directory for check/discover/read commands

wdyt? 🙂

Also applies to: 68-68


52-52: Would you like me to help track this TODO? 🎯

This is a good suggestion about improving version inspection without requiring prod DB access. Would you like me to create a GitHub issue to track this improvement?

I can help create an issue with:

  • Title: "Improve SDM version inspection without prod DB access"
  • Description detailing the current limitations and desired outcome
  • Labels: documentation, enhancement

Let me know if you'd like me to proceed! 😊

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9a05c94 and 6587873.

📒 Files selected for processing (1)
  • docs/RELEASES.md (2 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/RELEASES.md

[uncategorized] ~56-~56: Possible missing article found.
Context: ...lly test changes against a dev image of SDM before committing to a release, first u...

(AI_HYDRA_LEO_MISSING_THE)


[grammar] ~68-~68: Did you mean the noun “publishing”?
Context: ...ng Low-Code Python connectors Once the publish pipeline has completed, set the version...

(PREPOSITION_VERB)

🔇 Additional comments (1)
docs/RELEASES.md (1)

14-14: LGTM! Nice formatting improvement! 👍

The consistent formatting with italics makes the notes section more readable.

@rpopov
Copy link
Author

rpopov commented Jan 1, 2025

The acceptance test failures are random and unrelated to the actual change.
What is the practice in this case?

Update CONTRIBUTING.md to clarify and make explicit the procedures and steps.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
docs/CONTRIBUTING.md (4)

17-18: The Poetry installation instructions could be more specific, wdyt?

Consider adding the recommended installation method for different operating systems to help new contributors get started quickly. For example:

1. Install Poetry:
   - macOS: `curl -sSL https://install.python-poetry.org | python3 -`
   - Windows: `(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | py -`
   - Linux: `curl -sSL https://install.python-poetry.org | python3 -`

26-28: Consider restructuring the test commands section for better readability, wdyt?

The commands are clear but could be organized better to avoid repetitive sentence starts. Here's a suggestion:

### Running Unit Tests

Use these commands to run tests:
- All tests: `poetry run pytest`
- Specific tests: `poetry run pytest -k <suite or test name>`
- Fast tests only (<5 min): `poetry run pytest-fast`
🧰 Tools
🪛 LanguageTool

[style] ~28-~28: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests which ar...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


59-74: Unify the list style in the assumptions and steps sections

The static analysis tool flagged inconsistent list markers (asterisks vs. dashes). Consider using dashes consistently throughout the document for better maintainability:

 **Assumptions:**
-* The connector to test with is in the root [Airbyte project](https://github.com/airbytehq/airbyte).
-* The [Airbyte project](https://github.com/airbytehq/airbyte) is checked out in `airbyte` directory.
+- The connector to test with is in the root [Airbyte project](https://github.com/airbytehq/airbyte).
+- The [Airbyte project](https://github.com/airbytehq/airbyte) is checked out in `airbyte` directory.
🧰 Tools
🪛 Markdownlint (0.37.0)

59-59: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


60-60: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


61-61: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


65-65: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


66-66: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


72-72: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


73-73: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


74-74: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


174-174: Consider adding a note about Docker networking

Since the section discusses running tests in Docker, it might be helpful to add a note about potential networking issues and how to resolve them, similar to the MockServer section that mentions using host.docker.internal. What do you think?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6587873 and a33f815.

📒 Files selected for processing (1)
  • docs/CONTRIBUTING.md (2 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~28-~28: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests which ar...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~35-~35: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. - `poetry run ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~36-~36: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...hon code. - poetry run poe format-fix to auto-fix formatting issues. ### Run Co...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

59-59: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


60-60: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


61-61: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


65-65: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


66-66: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


72-72: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


73-73: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


74-74: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


82-82: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)

docs/CONTRIBUTING.md Show resolved Hide resolved
Update CONTRIBUTING.md to state the actual testing steps.

The integration tests of the CDK fail due to missing `secrets/config.json` file for the Shopify source.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
docs/CONTRIBUTING.md (4)

26-28: Consider restructuring the test commands for better readability?

The test commands are accurate but could be more readable. What do you think about grouping them with bullet points and adding brief descriptions for each command? Something like:

- `poetry run pytest` to run all unit tests.
- `poetry run pytest -k <suite or test name>` to run specific unit tests.
- `poetry run pytest-fast` to run the subset of PyTest tests which are not flagged as `slow`. (Should take <5 min for fast tests only.)
+ - Run all tests: `poetry run pytest`
+ - Run specific tests: `poetry run pytest -k <suite or test name>`
+ - Run fast tests only (~5 min): `poetry run pytest-fast`

wdyt? 🤔

🧰 Tools
🪛 LanguageTool

[style] ~28-~28: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests which ar...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


58-61: Consider standardizing the bullet points style?

The unordered list uses asterisks (*) while other lists in the document use dashes (-). Would you like to standardize this for consistency?

-* The connector to test with is in the root [Airbyte project](https://github.com/airbytehq/airbyte).
-* The [Airbyte project](https://github.com/airbytehq/airbyte) is checked out in `airbyte` directory.
-* The CDK development project is checked out in the `aibyte/airbyte-python-cdk` subdirectory of the root [Airbyte project](https://github.com/airbytehq/airbyte).
+- The connector to test with is in the root [Airbyte project](https://github.com/airbytehq/airbyte).
+- The [Airbyte project](https://github.com/airbytehq/airbyte) is checked out in `airbyte` directory.
+- The CDK development project is checked out in the `aibyte/airbyte-python-cdk` subdirectory of the root [Airbyte project](https://github.com/airbytehq/airbyte).
🧰 Tools
🪛 Markdownlint (0.37.0)

59-59: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


60-60: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


61-61: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


74-79: Add language specifiers to code blocks?

The code blocks would be more readable with syntax highlighting. Should we add language specifiers? For example:

-```
+```bash
poetry run <connector name> spec
poetry run <connector name> check --config secrets/config.json
...

Also applies to: 81-86

🧰 Tools
🪛 Markdownlint (0.37.0)

74-74: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


187-187: Consider adding a note about Docker prerequisites?

The section about running connector acceptance tests assumes Docker knowledge. Would it be helpful to add a note about Docker requirements or link to Docker setup instructions? wdyt? 🤔

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a33f815 and 4bc243d.

📒 Files selected for processing (1)
  • docs/CONTRIBUTING.md (2 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~28-~28: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests which ar...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~35-~35: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. - `poetry run ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~36-~36: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...hon code. - poetry run poe format-fix to auto-fix formatting issues. ### Run Co...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

59-59: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


60-60: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


61-61: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


65-65: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


66-66: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


72-72: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


73-73: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


87-87: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


95-95: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


102-102: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


74-74: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


81-81: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (2)
docs/CONTRIBUTING.md (2)

17-18: LGTM! Clear installation instructions.

The setup instructions are clear and concise, properly directing users to install Poetry and its dependencies.


97-100: Fix the command in the example

The command example shows build but the context suggests it should be test:

-airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> build
+airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> test

Trying to collect and organize the knowledge on running the test locally.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (6)
docs/CONTRIBUTING.md (6)

26-28: Consider improving command descriptions for clarity

The commands look good, but we could make them more descriptive. What do you think about adding brief descriptions of what each command does? For example:

-`poetry run pytest` to run all unit tests.
-`poetry run pytest -k <suite or test name>` to run specific unit tests.
-`poetry run pytest-fast` to run the subset of PyTest tests which are not flagged as `slow`.
+`poetry run pytest` - Executes the complete test suite
+`poetry run pytest -k <suite or test name>` - Runs tests matching the specified pattern
+`poetry run pytest-fast` - Executes only fast tests (not marked as `slow`, completes in <5 min)
🧰 Tools
🪛 LanguageTool

[style] ~28-2828: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests which ar...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


58-82: Consider restructuring the installation instructions for better flow

The installation steps could be more streamlined. What do you think about this structure:

  1. Move the assumptions section after the preparation steps
  2. Combine the path example with the directory change instruction
  3. Add a note about why we're editing pyproject.toml

Also, the code block at line 68-71 seems out of place - should it be moved to a different section? wdyt?

🧰 Tools
🪛 Markdownlint (0.37.0)

59-5959: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


60-6060: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


61-6161: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


64-6464: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


65-6565: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


66-6666: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


73-7373: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


75-7575: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


76-7676: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


82-8282: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


63-6363: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


68-6868: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


Line range hint 213-224: Consolidate duplicate test instructions

This section appears to duplicate the information from lines 115-134. Should we combine these sections to avoid confusion? We could keep one comprehensive section about running tests with the local CDK. wdyt?


59-76: Fix markdown list formatting

The unordered lists are using mixed styles (asterisks and dashes). For consistency with the rest of the document, should we standardize on dashes? Example:

-* The connector to test with is in the root [Airbyte project](https://github.com/airbytehq/airbyte).
-* The [Airbyte project](https://github.com/airbytehq/airbyte) is checked out in `airbyte` directory.
+- The connector to test with is in the root [Airbyte project](https://github.com/airbytehq/airbyte).
+- The [Airbyte project](https://github.com/airbytehq/airbyte) is checked out in `airbyte` directory.
🧰 Tools
🪛 Markdownlint (0.37.0)

59-5959: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


60-6060: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


61-6161: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


64-6464: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


65-6565: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


66-6666: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


73-7373: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


75-7575: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


76-7676: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


63-6363: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


68-6868: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


68-68: Add language specifiers to code blocks

The fenced code blocks are missing language specifiers. Should we add them for better syntax highlighting? For example:

-```
+```bash

Also applies to: 87-87, 97-97, 107-107

🧰 Tools
🪛 Markdownlint (0.37.0)

68-6868: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


140-142: Fix typo in Poe tasks description

There's a small typo in the Poe tasks section:

-The Airbyte CDK uses [Poe the Poet](https://poethepoet.natn.io/) to define common development task.
+The Airbyte CDK uses [Poe the Poet](https://poethepoet.natn.io/) to define common development tasks.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4bc243d and 62cca5e.

📒 Files selected for processing (1)
  • docs/CONTRIBUTING.md (2 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~28-2828: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests which ar...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~35-3535: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. - `poetry run ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~36-3636: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...hon code. - poetry run poe format-fix to auto-fix formatting issues. ### Run Co...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

59-5959: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


60-6060: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


61-6161: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


64-6464: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


65-6565: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


66-6666: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


73-7373: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


75-7575: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


76-7676: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


82-8282: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


86-8686: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


106-106106: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


113-113113: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


121-121121: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


128-128128: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


63-6363: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


68-6868: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


87-8787: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


97-9797: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


107-107107: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (1)
docs/CONTRIBUTING.md (1)

124-126: Fix the command in the example

There seems to be a copy-paste error in the command example. The text mentions running tests, but the command shows build:

-airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> build
+airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> test

* to organize and present the setup steps explicitly
* to avoid misunderstandings and wasted eforts
* to organize and present the setup steps explicitly
* to avoid misunderstandings and wasted eforts
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
docs/CONTRIBUTING.md (5)

17-63: Setup instructions could be more streamlined, wdyt?

The setup instructions are comprehensive but could be more consistent. Consider:

  1. Consolidating OS-specific instructions under a collapsible section
  2. Using consistent code block language specifications (e.g., bash or shell)
  3. Adding version requirements for pip and pipx

Example structure:

- Fedora 41:
- ```
- sudo dnf install python3.11
- ```
+ <details>
+ <summary>Fedora 41 specific instructions</summary>
+ 
+ ```bash
+ sudo dnf install python3.11
+ ```
+ </details>
🧰 Tools
🪛 Markdownlint (0.37.0)

51-51: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


52-52: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


56-56: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


51-51: Expected: 0; Actual: 2
Unordered list indentation

(MD007, ul-indent)


52-52: Expected: 0; Actual: 2
Unordered list indentation

(MD007, ul-indent)


56-56: Expected: 0; Actual: 2
Unordered list indentation

(MD007, ul-indent)


19-19: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


24-24: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


29-29: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


33-33: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


37-37: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


42-42: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


47-47: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


59-59: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


70-96: Consider grouping related commands under collapsible sections for better readability?

The development commands section is well-organized but could be more scannable. What do you think about grouping related commands under expandable sections? This would help developers quickly find what they need while keeping the full details available.

Example structure:

<details>
<summary>🧪 Testing Commands</summary>

- `poetry run pytest` - Run all unit tests
- `poetry run pytest -k <suite>` - Run specific tests
...
</details>
🧰 Tools
🪛 LanguageTool

[style] ~74-~74: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests which ar...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~81-~81: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. - `poetry run ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~82-~82: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...hon code. - poetry run poe format-fix to auto-fix formatting issues. ### Run Co...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


98-209: Great testing instructions, but could use some clarification on prerequisites

The testing section is thorough but might benefit from:

  1. A clear prerequisites checklist at the start
  2. Visual separation between local and in-container testing sections
  3. Consistent formatting for command examples

Consider adding a prerequisites checklist:

### Prerequisites
- [ ] Airbyte repo cloned at `../airbyte`
- [ ] Docker installed and running
- [ ] Python 3.10 or 3.11 installed
🧰 Tools
🪛 LanguageTool

[uncategorized] ~104-~104: You might be missing the article “the” here.
Context: ...om/airbytehq/airbyte) is checked out in airbyte directory. * The [CDK developme...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~127-~127: Possible missing comma found.
Context: ...ile. * On the Settings / Domains page find the subdomain of myshopify.com a...

(AI_HYDRA_LEO_MISSING_COMMA)

🪛 Markdownlint (0.37.0)

103-103: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


104-104: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


105-105: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


108-108: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


113-113: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


118-118: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


123-123: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


125-125: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


126-126: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


127-127: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


144-144: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


145-145: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


146-146: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


150-150: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


170-170: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


185-185: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


191-191: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


195-195: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


202-202: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


107-107: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


189-189: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


109-109: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


119-119: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


129-129: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


134-134: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


151-151: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


161-161: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


171-171: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


177-177: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


214-214: Minor grammatical fixes needed

There's a small grammatical error in this line. The word "task" should be plural.

- to define common development task.
+ to define common development tasks.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~214-~214: Possible missing article found.
Context: ...(https://poethepoet.natn.io/) to define common development task. You can run `poetry r...

(AI_HYDRA_LEO_MISSING_A)


[uncategorized] ~214-~214: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ....natn.io/) to define common development task. You can run poetry run poe list to s...

(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)


287-287: Consider adding troubleshooting section?

The documentation is comprehensive but might benefit from a troubleshooting section covering common issues developers might encounter. Would you like me to help draft one?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 62cca5e and 26367a6.

📒 Files selected for processing (1)
  • docs/CONTRIBUTING.md (2 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~74-~74: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests which ar...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~81-~81: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. - `poetry run ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~82-~82: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...hon code. - poetry run poe format-fix to auto-fix formatting issues. ### Run Co...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[uncategorized] ~104-~104: You might be missing the article “the” here.
Context: ...om/airbytehq/airbyte) is checked out in airbyte directory. * The [CDK developme...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~127-~127: Possible missing comma found.
Context: ...ile. * On the Settings / Domains page find the subdomain of myshopify.com a...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~214-~214: Possible missing article found.
Context: ...(https://poethepoet.natn.io/) to define common development task. You can run `poetry r...

(AI_HYDRA_LEO_MISSING_A)


[uncategorized] ~214-~214: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ....natn.io/) to define common development task. You can run poetry run poe list to s...

(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

51-51: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


52-52: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


56-56: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


103-103: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


104-104: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


105-105: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


108-108: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


113-113: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


118-118: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


123-123: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


125-125: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


126-126: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


127-127: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


144-144: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


145-145: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


146-146: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


150-150: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


170-170: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


185-185: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


191-191: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


195-195: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


202-202: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


51-51: Expected: 0; Actual: 2
Unordered list indentation

(MD007, ul-indent)


52-52: Expected: 0; Actual: 2
Unordered list indentation

(MD007, ul-indent)


56-56: Expected: 0; Actual: 2
Unordered list indentation

(MD007, ul-indent)


107-107: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


189-189: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


19-19: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


24-24: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


29-29: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


33-33: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


37-37: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


42-42: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


47-47: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


59-59: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


109-109: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


119-119: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


129-129: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


134-134: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


151-151: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


161-161: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


171-171: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


177-177: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (6)
docs/CONTRIBUTING.md (6)

19-42: Consider reorganizing the installation steps for better flow, wdyt?

The Fedora-specific instructions are currently scattered throughout the setup steps. Consider grouping them together in a dedicated "OS-specific Instructions" subsection for better readability.

🧰 Tools
🪛 Markdownlint (0.37.0)

20-20: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


26-26: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


32-32: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


36-36: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


40-40: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


61-66: Consider moving Shopify-specific configuration to the testing section?

The Shopify configuration example seems more relevant to the testing section since it's specifically about connector testing rather than initial setup. What do you think about moving it to the "Test locally CDK Changes against Connectors" section?

🧰 Tools
🪛 Markdownlint (0.37.0)

62-62: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


75-78: Consider using a table format for the test commands, what do you think?

To avoid repetitive poetry run prefixes and improve readability, consider using a table format:

| Command | Description |
|---------|-------------|
| `pytest` | Run all unit tests |
| `pytest -k <suite or test name>` | Run specific unit tests |
| `pytest-fast` | Run fast tests only (<5 min) |
🧰 Tools
🪛 LanguageTool

[style] ~77-~77: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests which ar...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


217-217: Minor grammar fix needed

-to define common development task.
+to define common development tasks.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~217-~217: Possible missing article found.
Context: ...(https://poethepoet.natn.io/) to define common development task. You can run `poetry r...

(AI_HYDRA_LEO_MISSING_A)


105-108: Consider standardizing list markers for consistency

The document uses a mix of asterisks and dashes for unordered lists. Consider standardizing to use dashes throughout for better consistency. Would you like me to provide a diff for all instances?

Also applies to: 147-149

🧰 Tools
🪛 Markdownlint (0.37.0)

106-106: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


107-107: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


108-108: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


130-130: Minor grammar fix needed

-On the **Settings / Domains** page find
+On the **Settings / Domains** page, find
🧰 Tools
🪛 LanguageTool

[uncategorized] ~130-~130: Possible missing comma found.
Context: ...ile. * On the Settings / Domains page find the subdomain of myshopify.com a...

(AI_HYDRA_LEO_MISSING_COMMA)

🪛 Markdownlint (0.37.0)

130-130: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 26367a6 and e1c5a47.

📒 Files selected for processing (1)
  • docs/CONTRIBUTING.md (2 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~77-~77: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests which ar...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~84-~84: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. - `poetry run ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~85-~85: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...hon code. - poetry run poe format-fix to auto-fix formatting issues. ### Run Co...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[uncategorized] ~130-~130: Possible missing comma found.
Context: ...ile. * On the Settings / Domains page find the subdomain of myshopify.com a...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~217-~217: Possible missing article found.
Context: ...(https://poethepoet.natn.io/) to define common development task. You can run `poetry r...

(AI_HYDRA_LEO_MISSING_A)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

54-54: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


55-55: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


59-59: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


106-106: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


107-107: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


108-108: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


111-111: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


116-116: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


121-121: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


126-126: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


128-128: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


129-129: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


130-130: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


147-147: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


148-148: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


149-149: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


153-153: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


173-173: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


188-188: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


194-194: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


198-198: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


205-205: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


54-54: Expected: 0; Actual: 2
Unordered list indentation

(MD007, ul-indent)


55-55: Expected: 0; Actual: 2
Unordered list indentation

(MD007, ul-indent)


59-59: Expected: 0; Actual: 2
Unordered list indentation

(MD007, ul-indent)


110-110: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


192-192: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


20-20: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


26-26: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


32-32: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


36-36: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


40-40: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


45-45: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


50-50: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


62-62: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


112-112: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


122-122: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


132-132: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


137-137: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


154-154: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


164-164: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


174-174: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


180-180: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

Fix and polish
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (6)
docs/CONTRIBUTING.md (6)

19-42: Consider generalizing the installation instructions

The installation steps are currently Fedora 41-specific. Would it make sense to move these into a "Fedora-specific notes" subsection and add instructions for other common operating systems (Ubuntu, macOS) as well? This would make the guide more inclusive, wdyt?

🧰 Tools
🪛 Markdownlint (0.37.0)

20-20: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


26-26: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


32-32: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


36-36: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


40-40: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


65-70: Consider moving Shopify configuration to the integration testing section

The Shopify configuration example seems out of place in the setup section. Would it make more sense to move this to the "Test locally CDK Changes against Connectors" section where it's more contextually relevant?

🧰 Tools
🪛 Markdownlint (0.37.0)

66-66: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


79-82: Consider improving command documentation readability

The test commands are clear but could be more readable. How about structuring them as a table with columns for "Command", "Description", and "Example"? This would make it easier to scan and understand the different testing options, wdyt?

🧰 Tools
🪛 LanguageTool

[style] ~81-~81: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


121-124: Add language specifiers to code blocks

The code blocks would benefit from syntax highlighting. Consider adding language specifiers to the code blocks, for example:

  • bash for shell commands
  • toml for TOML configurations
  • json for JSON files
    This improves readability and helps with syntax highlighting, wdyt?

Also applies to: 131-134, 163-171, 183-187

🧰 Tools
🪛 Markdownlint (0.37.0)

121-121: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


226-230: Consider restructuring the Poe tasks section

The Poe tasks section could be more helpful with a table of common tasks and their descriptions. Would you consider adding something like:

| Task | Description | Common Use Case |
|------|-------------|----------------|
| `poe list` | List all available tasks | Getting started |
| `poe format-fix` | Auto-fix formatting issues | Before committing changes |
| ... | ... | ... |

8-9: Consider standardizing markdown list style

The document uses both asterisks (*) and dashes (-) for unordered lists. Would you consider standardizing to use dashes (-) throughout the document for consistency? This aligns with common markdown style guides, wdyt?

Also applies to: 108-110, 156-158

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e1c5a47 and 4af3c8a.

📒 Files selected for processing (1)
  • docs/CONTRIBUTING.md (5 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~81-~81: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~88-~88: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. - `poetry run ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~89-~89: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...hon code. - poetry run poe format-fix to auto-fix formatting issues. ### Run Co...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~234-~234: Consider a more expressive alternative.
Context: ...or, you may need to regenerate them. To do that, you can run: ```bash poetry run ...

(DO_ACHIEVE)


[locale-violation] ~252-~252: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

58-58: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


59-59: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


63-63: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


108-108: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


109-109: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


110-110: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


115-115: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


116-116: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


117-117: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


120-120: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


125-125: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


130-130: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


135-135: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


137-137: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


138-138: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


139-139: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


156-156: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


157-157: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


158-158: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


162-162: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


182-182: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


199-199: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


205-205: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


209-209: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


216-216: Expected: dash; Actual: asterisk
Unordered list style

(MD004, ul-style)


58-58: Expected: 0; Actual: 2
Unordered list indentation

(MD007, ul-indent)


59-59: Expected: 0; Actual: 2
Unordered list indentation

(MD007, ul-indent)


63-63: Expected: 0; Actual: 2
Unordered list indentation

(MD007, ul-indent)


119-119: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


203-203: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


20-20: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


26-26: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


32-32: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


36-36: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


40-40: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


45-45: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


50-50: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


66-66: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


121-121: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


131-131: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


141-141: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


146-146: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


163-163: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


173-173: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


183-183: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


189-189: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

@rpopov
Copy link
Author

rpopov commented Jan 5, 2025

The reported failed checks are due to #197

@rpopov rpopov changed the title Feat: Publish the response to process in the context of the record's transformation and filtering steps feat: Publish the response to process in the context of the record's transformation and filtering steps Jan 7, 2025
@marcosmarxm
Copy link
Member

marcosmarxm commented Jan 8, 2025

@airbytehq/dev-python Can someone take a look at this contribution?

renamed verify_* function to asser_* what it actually does.
Commended
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
unit_tests/sources/declarative/extractors/test_dpath_extractor.py (1)

196-200: Would you consider adding a quick comment about service keys? 📝

The service key validation and exclusion logic looks good! To help future maintainers, maybe we could add a brief comment explaining why we're doing these steps? Something like:

    actual_records = list(extractor.extract_records(response))

+   # Validate and remove service keys (fields starting with '$') that are only used
+   # during the transformation phase and shouldn't be part of the final output
    for record in actual_records:
        assert_service_keys_exist(record)

    actual_records = [exclude_service_keys(record) for record in actual_records]
unit_tests/sources/declarative/extractors/test_record_selector.py (1)

170-170: Clean handling of service keys! ✨

Good addition of exclude_service_keys before assertions. Would it be valuable to add a test case that explicitly verifies that service keys are properly excluded? This could help catch any regressions in the service key handling. wdyt?

Here's a potential test case to consider adding:

(
    "test_service_keys_are_excluded",
    ["data"],
    None,
    {
        "data": [
            {"id": 1, "$service_field": "hidden", "visible_field": "shown"}
        ]
    },
    [{"id": 1, "visible_field": "shown"}]
),

Also applies to: 244-244

🧰 Tools
🪛 GitHub Actions: Linters

[warning] File requires formatting. Run 'ruff format' to fix code formatting issues.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 62d2417 and 8c636f7.

📒 Files selected for processing (4)
  • airbyte_cdk/sources/declarative/extractors/record_extractor.py (1 hunks)
  • unit_tests/sources/declarative/extractors/test_dpath_extractor.py (5 hunks)
  • unit_tests/sources/declarative/extractors/test_record_extractor.py (1 hunks)
  • unit_tests/sources/declarative/extractors/test_record_selector.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • airbyte_cdk/sources/declarative/extractors/record_extractor.py
  • unit_tests/sources/declarative/extractors/test_record_extractor.py
🧰 Additional context used
🪛 GitHub Actions: Linters
unit_tests/sources/declarative/extractors/test_dpath_extractor.py

[error] 4-23: Import block is un-sorted or un-formatted

unit_tests/sources/declarative/extractors/test_record_selector.py

[warning] File requires formatting. Run 'ruff format' to fix code formatting issues.

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (5)
unit_tests/sources/declarative/extractors/test_dpath_extractor.py (2)

40-94: Awesome test coverage! 🎯

Really nice job on covering various edge cases and nested structures! The test cases are well-organized and comprehensive.


156-179: Love the descriptive test IDs! 👌

The test identifiers are very clear and help understand the purpose of each test case at a glance.

unit_tests/sources/declarative/extractors/test_record_selector.py (3)

50-79: Great test coverage for nested structures! 🎯

The new test cases thoroughly verify the handling of nested data structures. Have you considered adding a test case where the nested structure is invalid or malformed to verify error handling? wdyt?

🧰 Tools
🪛 GitHub Actions: Linters

[warning] File requires formatting. Run 'ruff format' to fix code formatting issues.


114-126: Nice test for root response access! 💡

Good test case demonstrating access to the original response. Would it be helpful to add a test case where we try to access an invalid path in the root response to verify error handling? wdyt?

🧰 Tools
🪛 GitHub Actions: Linters

[warning] File requires formatting. Run 'ruff format' to fix code formatting issues.


12-19: Run formatter to fix indentation.

The pipeline indicates formatting issues. Would you mind running ruff format to fix the indentation? 🛠️

🧰 Tools
🪛 GitHub Actions: Linters

[warning] File requires formatting. Run 'ruff format' to fix code formatting issues.

rpopov and others added 3 commits January 9, 2025 01:44
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* main:
  fix(airbyte-cdk): unable to create custom retriever (airbytehq#198)
  feat(low-code): added keys replace transformation (airbytehq#183)
  feat: add `min` macros (airbytehq#203)
  fix(low-code cdk pagination): Fix the offset strategy so that it resets back to 0 when a stream is an incremental data feed (airbytehq#202)
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (10)
unit_tests/sources/declarative/extractors/test_dpath_extractor.py (2)

40-94: Add docstrings to explain test scenarios?

The new test cases are comprehensive and well-structured! Would you consider adding docstrings to explain the purpose of each test scenario? This could help other developers understand the edge cases being tested, wdyt?

For example:

@pytest.mark.parametrize(
    "field_path, decoder, body, expected_records",
    [
        # Test empty input handling
        pytest.param([], decoder_json, b"", [], id="test_extract_from_empty_string"),
        # Test nested structure handling
        pytest.param([], decoder_json, [{"id": 1, "nested": {"id2": 2}}], [{"id": 1, "nested": {"id2": 2}}], id="test_extract_from_nonempty_array_with_nested_array"),
        # ... more test cases
    ]
)

Also applies to: 156-179


196-200: Add explicit test cases for service key handling?

The service key verification and exclusion steps are good additions! Would it be helpful to add test cases that explicitly verify:

  1. The presence of required service keys after extraction
  2. The successful removal of service keys after exclusion

Here's a possible test case structure:

def test_service_key_handling():
    extractor = DpathExtractor(field_path=[], config=config, decoder=decoder_json, parameters=parameters)
    response = create_response({"id": 1})
    records = list(extractor.extract_records(response))
    
    # Verify service keys exist
    assert "$root" in records[0]
    
    # Verify service keys are removed
    cleaned_record = exclude_service_keys(records[0])
    assert "$root" not in cleaned_record
docs/CONTRIBUTING.md (5)

71-72: Consider consolidating duplicate references to Issue #197

Hey there! 👋 I noticed we're referencing the same issue twice in close proximity. Maybe we could combine these into a single reference under the "See also" section? wdyt?

 See also:
 
 - [Dager-Podman Integration](https://blog.playgroundtech.io/introduction-to-dagger-6ab55ee28723)
 - [CDK Issue 197](https://github.com/airbytehq/airbyte-python-cdk/issues/197)
-
-7. Make sure Docker is installed locally, instead of Podman
-   See also:
-
-- [CDK Issue 197](https://github.com/airbytehq/airbyte-python-cdk/issues/197)
+7. Make sure Docker is installed locally, instead of Podman

Also applies to: 77-77


21-23: Add language specifiers to code blocks

Hey! 🚀 Would you mind adding language specifiers to the code blocks? This helps with syntax highlighting and makes the documentation more readable. For example:

-```
+```bash
sudo dnf install python3.11



Also applies to: 29-31, 37-39, 43-45, 49-51, 55-57, 61-67, 81-83, 140-143, 154-157, 189-197, 213-229

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 Markdownlint (0.37.0)</summary>

21-21: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</details>

</details>

---

`163-163`: **Consider adding a comma for better readability**

What do you think about adding a comma after "page" in this sentence? 
```diff
-- On the **Settings / Domains** page find the subdomain
++ On the **Settings / Domains** page, find the subdomain
🧰 Tools
🪛 LanguageTool

[uncategorized] ~163-~163: Possible missing comma found.
Context: ...file. - On the Settings / Domains page find the subdomain of myshopify.com a...

(AI_HYDRA_LEO_MISSING_COMMA)


283-283: Consider using American English for consistency

Hey! 👋 The phrase "have a look" is British English. Since the rest of the documentation uses American English, would you consider changing it to "take a look"? wdyt?

-Please have a look at the [Release Management]
+Please take a look at the [Release Management]
🧰 Tools
🪛 LanguageTool

[locale-violation] ~283-~283: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)


136-136: Consider using proper heading syntax

Instead of using bold text for these section headers, would you consider using proper markdown heading syntax? This would improve the document's structure and make it easier to navigate. For example:

-**Preparation steps**
+### Preparation steps

Also applies to: 235-235

🧰 Tools
🪛 Markdownlint (0.37.0)

136-136: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

unit_tests/sources/declarative/extractors/test_record_selector.py (3)

50-79: Great test coverage for nested structures! 🎯

The new test cases thoroughly cover nested data scenarios. What do you think about making the test names more Pythonic by using underscores instead of camelCase? For example:

  • test_no_record_filter_returns_all_records_with_nested (already good!)
  • test_true_record_filter_returns_all_records_with_nested (already good!)

114-126: Consider renaming this test case for consistency

What do you think about renaming test_the original response is available in filters and transformations to test_original_response_available_in_filters_and_transformations to match the Python naming convention used in other test cases? wdyt?


170-170: Consider adding explicit service key tests

I see you've added service key exclusion in the assertions. What do you think about adding specific test cases to verify the behavior of service keys? This could include:

  • Verifying that service keys are properly excluded
  • Checking edge cases with missing or malformed service keys

Example test case structure:

@pytest.mark.parametrize(
    "test_name, input_record, expected_output",
    [
        (
            "test_exclude_service_keys",
            {"id": 1, "$root": {"data": []}, "$parameters": {}},
            {"id": 1}
        ),
        # Add more test cases
    ]
)
def test_service_key_handling(test_name, input_record, expected_output):
    assert exclude_service_keys(input_record) == expected_output

Also applies to: 244-244

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8c636f7 and 118d785.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • docs/CONTRIBUTING.md (5 hunks)
  • unit_tests/sources/declarative/extractors/test_dpath_extractor.py (5 hunks)
  • unit_tests/sources/declarative/extractors/test_record_extractor.py (1 hunks)
  • unit_tests/sources/declarative/extractors/test_record_selector.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • unit_tests/sources/declarative/extractors/test_record_extractor.py
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~96-~96: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~103-~103: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. - `poetry run ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~104-~104: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...hon code. - poetry run poe format-fix to auto-fix formatting issues. ### Run Co...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[uncategorized] ~163-~163: Possible missing comma found.
Context: ...file. - On the Settings / Domains page find the subdomain of myshopify.com a...

(AI_HYDRA_LEO_MISSING_COMMA)


[style] ~265-~265: Consider a more expressive alternative.
Context: ...or, you may need to regenerate them. To do that, you can run: ```bash poetry run ...

(DO_ACHIEVE)


[locale-violation] ~283-~283: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

136-136: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


235-235: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


21-21: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


29-29: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


37-37: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


43-43: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


49-49: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


55-55: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


61-61: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


81-81: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


140-140: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


154-154: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


170-170: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


189-189: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


201-201: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


213-213: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


221-221: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (3)
unit_tests/sources/declarative/extractors/test_dpath_extractor.py (1)

18-21: LGTM! Clean import organization.

The new imports for service key management functions are well-organized and properly placed.

docs/CONTRIBUTING.md (1)

245-246: The command example needs updating

The text mentions testing, but the command shows build. This issue was previously identified in a past review.

unit_tests/sources/declarative/extractors/test_record_selector.py (1)

12-19: LGTM! Clean import organization.

The new imports are well-organized and properly scoped for the added functionality.

Trigger another another build.

The unit tests in this exactly version succeed under Fedora 41, while
one of them fails under Oracle Linux 8.7.
The integration and acceptance tests fail under specific OS and setup,
not related to the contents of this PR.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
docs/CONTRIBUTING.md (3)

19-56: Would you consider improving the formatting consistency of installation instructions?

The Fedora-specific installation steps could be better organized. Here's a suggestion, wdyt?

  1. Add language specifiers to code blocks (e.g., bash or shell)
  2. Consider using a consistent pattern for OS-specific instructions, like:
2. Make sure your Python version is 3.10 or 3.11

Installation commands:
- Generic:
```bash
# Your generic installation commands
  • Fedora 41:
sudo dnf install python3.11

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 Markdownlint (0.37.0)</summary>

21-21: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

29-29: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

37-37: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

43-43: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

49-49: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

55-55: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</details>

</details>

---

`99-102`: **Would you consider reorganizing the test commands for better clarity?**

The test commands could be organized by purpose. How about something like this, wdyt?

```markdown
### Run Unit Tests

Run all tests:
```bash
poetry run pytest

Run specific tests:

poetry run pytest -k <suite or test name>

Run fast tests only (<5 min):

poetry run pytest-fast

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 LanguageTool</summary>

[style] ~101-~101: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - `poetry run pytest-fast` to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

</details>

</details>

---

`175-184`: **Consider adding a note about sensitive information in the example**

The example configuration contains placeholders for sensitive information. Would you consider adding a note to remind users not to commit their actual credentials, wdyt?

```diff
 {
   "shop": "nw0ipt-vr",
   "start_date": "2020-11-01",
   "credentials": {
     "auth_method": "api_password",
     "api_password": "shpat_XXXXXXXXXXX"   # Replace with your actual API password
   }
 }
+
+Note: Never commit your actual credentials to version control. Make sure config.json is in .gitignore.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 118d785 and b49a98d.

📒 Files selected for processing (1)
  • docs/CONTRIBUTING.md (5 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~101-~101: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~108-~108: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. - `poetry run ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~109-~109: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...hon code. - poetry run poe format-fix to auto-fix formatting issues. ### Run Co...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~270-~270: Consider a more expressive alternative.
Context: ...or, you may need to regenerate them. To do that, you can run: ```bash poetry run ...

(DO_ACHIEVE)


[locale-violation] ~288-~288: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

136-136: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


235-235: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


21-21: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


29-29: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


37-37: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


43-43: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


49-49: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


55-55: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


61-61: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


81-81: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


140-140: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


154-154: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


170-170: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


189-189: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


201-201: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


213-213: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


221-221: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (2)
docs/CONTRIBUTING.md (2)

250-251: The command example needs to be fixed

The text mentions running tests, but the command shows build. This issue was previously identified.

-airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> build
+airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> test

Line range hint 270-294: Documentation generation instructions look good!

The instructions for generating API documentation and managing releases are clear and well-organized.

🧰 Tools
🪛 LanguageTool

[locale-violation] ~288-~288: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

Trigger another build, hoping the infrastructure will not fail it
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
docs/CONTRIBUTING.md (2)

21-23: Would you consider adding language specifiers to code blocks? 🤔

Adding language specifiers (e.g., bash, json) would enable syntax highlighting and improve readability. For example:

-```
+```bash
sudo dnf install python3.11

Also applies to: 29-31, 37-39, 43-45, 49-51, 54-56, 60-62, 66-72, 86-88, 144-147, 158-161, 174-183, 193-201, 205-213, 217-221, 225-233

🧰 Tools
🪛 Markdownlint (0.37.0)

21-21: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


287-287: Small style suggestions to enhance readability 📝

Would you consider these minor improvements?

  • Line 287: In American English, "take a look" is more commonly used than "have a look"
  • Lines 101, 108: Consider varying the sentence beginnings to avoid repetition

Just nitpicks to make the docs even better, wdyt? 😊

Also applies to: 101-101, 108-108

🧰 Tools
🪛 LanguageTool

[locale-violation] ~287-~287: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b49a98d and f7f19be.

📒 Files selected for processing (1)
  • docs/CONTRIBUTING.md (5 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~101-~101: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~108-~108: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. ### Run Code ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~269-~269: Consider a more expressive alternative.
Context: ...or, you may need to regenerate them. To do that, you can run: ```bash poetry run ...

(DO_ACHIEVE)


[locale-violation] ~287-~287: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

140-140: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


239-239: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


21-21: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


29-29: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


37-37: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


43-43: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


49-49: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


54-54: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


60-60: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


66-66: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


86-86: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


144-144: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


158-158: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


174-174: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


193-193: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


205-205: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


217-217: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


225-225: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (2)
docs/CONTRIBUTING.md (2)

8-11: Great job on improving the documentation structure! 🎉

The installation instructions are now much clearer with OS-specific commands and detailed steps. This will help contributors get started more easily.

Also applies to: 17-23, 25-56


249-250: The command example needs to be updated.

The text mentions running tests, but the command shows build.

-airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> build
+airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> test

Handled nitpick comments by rabbit
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
docs/CONTRIBUTING.md (4)

8-8: Enhance clarity of contribution types

The line mentions "bug fixes" but doesn't specify the scope. Would it be helpful to clarify the types of bug fixes (e.g., runtime bugs, documentation fixes, test fixes)? wdyt?

-You can contribute by reporting bugs, posting GitHub discussions, opening issues, improving docs, and submitting pull requests with bug fixes and new features.
+You can contribute by reporting bugs, posting GitHub discussions, opening issues, improving docs, and submitting pull requests with bug fixes (runtime, documentation, or test fixes) and new features.

19-23: Consider adding version verification command

Would it be helpful to add a command to verify the Python version after installation? wdyt?

sudo dnf install python3.11
+
+# Verify installation
+python3.11 --version

99-102: Consider restructuring test commands for better readability

The test commands could be more readable with additional context. What do you think about adding descriptions and examples? wdyt?

-poetry run pytest` to run all unit tests.
-poetry run pytest -k <suite or test name>` to run specific unit tests.
-poetry run pytest-fast` to run the subset of PyTest tests, which are not flagged as `slow`. (It should take <5 min for fast tests only.)
+# Run all unit tests
+poetry run pytest
+
+# Run specific test suite or test (e.g., 'test_record_extractor')
+poetry run pytest -k <suite or test name>
+
+# Run fast tests only (takes <5 min)
+poetry run pytest-fast
🧰 Tools
🪛 LanguageTool

[style] ~101-~101: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


167-171: Fix formatting in the configuration example

The configuration example has inconsistent formatting. Consider using a code block for better readability:

-    domain: nw0ipt-vr.myshopify.com
-    shop: nw0ipt-vr
+```yaml
+domain: nw0ipt-vr.myshopify.com
+shop: nw0ipt-vr
+```
🧰 Tools
🪛 LanguageTool

[uncategorized] ~167-~167: Possible missing comma found.
Context: ...file. - On the Settings / Domains page find the subdomain of myshopify.com a...

(AI_HYDRA_LEO_MISSING_COMMA)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f7f19be and e5b9e5c.

📒 Files selected for processing (1)
  • docs/CONTRIBUTING.md (5 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/CONTRIBUTING.md

[style] ~101-~101: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... unit tests. - poetry run pytest-fast to run the subset of PyTest tests, which a...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~108-~108: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ting issues. - poetry run ruff format to format your Python code. ### Run Code ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[uncategorized] ~137-~137: You might be missing the article “the” here.
Context: ...om/airbytehq/airbyte) is checked out in airbyte directory. - The [CDK developme...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~167-~167: Possible missing comma found.
Context: ...file. - On the Settings / Domains page find the subdomain of myshopify.com a...

(AI_HYDRA_LEO_MISSING_COMMA)


[style] ~269-~269: Consider a more expressive alternative.
Context: ...or, you may need to regenerate them. To do that, you can run: ```bash poetry run ...

(DO_ACHIEVE)


[locale-violation] ~287-~287: In American English, “take a look” is more commonly used.
Context: ...Actions. ## Release Management Please have a look at the [Release Management](./RELEASES....

(HAVE_A_LOOK)

🪛 Markdownlint (0.37.0)
docs/CONTRIBUTING.md

140-140: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


239-239: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


86-86: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (1)
docs/CONTRIBUTING.md (1)

249-250: Fix the command in the example

There seems to be a copy-paste error in the command example. The text mentions running tests, but the command shows build:

-airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> build
+airbyte-ci connectors --use-local-cdk --name=<CONNECTOR> test

docs/CONTRIBUTING.md Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
unit_tests/sources/declarative/test_manifest_declarative_source.py (2)

1771-1771: Consider adding a descriptive assertion message?

The assertion could be more helpful during test failures if we add a descriptive message. What do you think about:

-        assert output_data == expected_records
+        assert output_data == expected_records, f"Expected {expected_records} but got {output_data}"

Line range hint 1772-2051: Great test coverage for caching behavior! 🎉

The test thoroughly verifies the caching behavior between parent and child streams. The assertions effectively check both the stream creation and caching configuration.

Consider extracting common configurations to fixtures?

The test setup contains repeated stream configurations. What do you think about extracting common parts into fixtures to improve maintainability? For example:

@pytest.fixture
def base_stream_config():
    return {
        "paginator": {
            "type": "DefaultPaginator",
            "page_size": 10,
            "page_size_option": {
                "type": "RequestOption",
                "inject_into": "request_parameter",
                "field_name": "per_page",
            },
            # ... other common config
        }
    }
airbyte_cdk/sources/declarative/extractors/record_selector.py (1)

162-167: Consider adding a docstring for clarity.

The method looks good, but would you consider adding a docstring to explain what service keys are and why we're removing them? Something like:

def _remove_service_keys(
    self, records: Iterable[Mapping[str, Any]]
) -> Iterable[Mapping[str, Any]]:
    """
    Remove service keys (fields starting with '$') from each record.
    These keys are used during transformation but should not be part of the final output.

    Args:
        records: Iterator of records potentially containing service keys

    Returns:
        Iterator of records with service keys removed
    """

wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e5b9e5c and 638ed41.

📒 Files selected for processing (4)
  • airbyte_cdk/sources/declarative/extractors/record_selector.py (3 hunks)
  • airbyte_cdk/sources/declarative/requesters/paginators/strategies/cursor_pagination_strategy.py (1 hunks)
  • unit_tests/sources/declarative/retrievers/test_simple_retriever.py (2 hunks)
  • unit_tests/sources/declarative/test_manifest_declarative_source.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • airbyte_cdk/sources/declarative/requesters/paginators/strategies/cursor_pagination_strategy.py
  • unit_tests/sources/declarative/retrievers/test_simple_retriever.py
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/extractors/record_selector.py (2)

11-14: LGTM! Clean import addition.

The import of exclude_service_keys is well-placed alongside related imports from the same module.


112-113: Verify the sequence of operations.

The sequence looks good: filter -> transform -> remove service keys -> normalize. However, would it be more efficient to remove service keys during the transformation phase instead of having a separate pass through the data? wdyt?

Let's verify if any transformations depend on service keys:

✅ Verification successful

The separate service key removal step is necessary

The current sequence is correct. Service keys are actively used during transformations (e.g., in FlattenFields), so removing them earlier would break existing functionality. The separate pass keeps the code clean and maintainable by following single responsibility principle.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any transformations that might use service keys (fields starting with $)
rg -A 5 'transform.*\$' ./airbyte_cdk/sources/declarative/transformations/

Length of output: 76


Script:

#!/bin/bash
# Find all transformation implementations
echo "=== Transformation Implementations ==="
ast-grep --pattern 'class $_ implements TransformationInterface'

echo -e "\n=== Service Key Patterns ==="
# Look for service key patterns more broadly
rg -g '*.py' -A 5 'service.*key|SERVICE.*KEY' ./airbyte_cdk/sources/declarative/

Length of output: 7629

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants