Merge branch 'main' into fix/replace-slack-with-here-in-email-templates
Siddhanttimeline authored Jan 16, 2025
2 parents c27aaf7 + 4af9764 commit 054c7ed
Showing 1,348 changed files with 19,419 additions and 6,806 deletions.
8 changes: 8 additions & 0 deletions ingestion/src/metadata/ingestion/lineage/models.py
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,9 @@
from metadata.generated.schema.entity.services.connections.database.impalaConnection import (
ImpalaType,
)
from metadata.generated.schema.entity.services.connections.database.mariaDBConnection import (
MariaDBType,
)
from metadata.generated.schema.entity.services.connections.database.mssqlConnection import (
MssqlType,
)
Expand All @@ -58,6 +61,9 @@
from metadata.generated.schema.entity.services.connections.database.redshiftConnection import (
RedshiftType,
)
from metadata.generated.schema.entity.services.connections.database.singleStoreConnection import (
SingleStoreType,
)
from metadata.generated.schema.entity.services.connections.database.snowflakeConnection import (
SnowflakeType,
)
Expand Down Expand Up @@ -120,6 +126,8 @@ class Dialect(Enum):
str(MssqlType.Mssql.value): Dialect.TSQL,
str(AzureSQLType.AzureSQL.value): Dialect.TSQL,
str(TeradataType.Teradata.value): Dialect.TERADATA,
str(MariaDBType.MariaDB.value): Dialect.MYSQL,
str(SingleStoreType.SingleStore.value): Dialect.MYSQL,
}


Expand Down
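The hunk above maps two more service types to the MySQL dialect. A minimal sketch of that lookup pattern follows. It assumes plain strings in place of the generated `*Type` enums, and the `dialect_for` helper with its ANSI fallback is hypothetical, not something shown in the diff.

```python
from enum import Enum


class Dialect(Enum):
    """Illustrative subset of parser dialects; the string values are assumptions."""

    ANSI = "ansi"
    MYSQL = "mysql"
    TSQL = "tsql"


# Plain service-type strings stand in for the generated connection enums.
SERVICE_TYPE_TO_DIALECT = {
    "Mssql": Dialect.TSQL,
    "AzureSQL": Dialect.TSQL,
    "MariaDB": Dialect.MYSQL,
    "SingleStore": Dialect.MYSQL,
}


def dialect_for(service_type: str) -> Dialect:
    """Return the mapped dialect; the ANSI fallback is an assumption of this sketch."""
    return SERVICE_TYPE_TO_DIALECT.get(service_type, Dialect.ANSI)
```

Mapping MariaDB and SingleStore to the MySQL dialect lets lineage parsing reuse the MySQL grammar for both.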
11 changes: 10 additions & 1 deletion ingestion/src/metadata/ingestion/ometa/mixins/es_mixin.py
Original file line number Diff line number Diff line change
Expand Up @@ -403,7 +403,16 @@ def yield_es_view_def(
"bool": {
"must": [
{"term": {"service.name.keyword": service_name}},
{"term": {"tableType": TableType.View.value}},
{
"term": {
"tableType": [
TableType.View.value,
TableType.MaterializedView.value,
TableType.SecureView.value,
TableType.Dynamic.value,
]
}
},
{"term": {"deleted": False}},
{"exists": {"field": "schemaDefinition"}},
]
Expand Down
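The hunk above widens the view filter from one table type to four. In the Elasticsearch query DSL, a `term` clause matches a single value, while matching any value from a list is normally done with a `terms` clause. The sketch below builds the same filter with `terms`. The function name and the literal table type strings are assumptions for illustration, not code from this commit.

```python
from typing import Any, Dict, List


def build_view_definition_query(service_name: str, table_types: List[str]) -> Dict[str, Any]:
    """Build a bool query matching non-deleted tables of the given types
    that carry a schema definition."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"term": {"service.name.keyword": service_name}},
                    # "terms" matches documents whose tableType equals ANY listed value.
                    {"terms": {"tableType": table_types}},
                    {"term": {"deleted": False}},
                    {"exists": {"field": "schemaDefinition"}},
                ]
            }
        }
    }


# Table type names assumed to mirror the TableType enum values used in the diff.
query = build_view_definition_query(
    "my_service", ["View", "MaterializedView", "SecureView", "Dynamic"]
)
```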
Original file line number Diff line number Diff line change
Expand Up @@ -296,7 +296,7 @@
ARGUMENT_SIGNATURE AS signature,
COMMENT as comment,
'StoredProcedure' as procedure_type
FROM INFORMATION_SCHEMA.PROCEDURES
FROM SNOWFLAKE.ACCOUNT_USAGE.PROCEDURES
WHERE PROCEDURE_CATALOG = '{database_name}'
AND PROCEDURE_SCHEMA = '{schema_name}'
"""
Expand All @@ -312,7 +312,7 @@
ARGUMENT_SIGNATURE AS signature,
COMMENT as comment,
'UDF' as procedure_type
FROM INFORMATION_SCHEMA.FUNCTIONS
FROM SNOWFLAKE.ACCOUNT_USAGE.FUNCTIONS
WHERE FUNCTION_CATALOG = '{database_name}'
AND FUNCTION_SCHEMA = '{schema_name}'
"""
Expand Down
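The two queries above are string templates with `{database_name}` and `{schema_name}` placeholders. A minimal sketch of how such a template could be rendered is shown below. The column list is trimmed to the columns visible in the hunk, and the constant and function names are hypothetical.

```python
# Trimmed template: only the columns visible in the diff hunk are reproduced.
PROCEDURES_QUERY_TEMPLATE = """
SELECT
  ARGUMENT_SIGNATURE AS signature,
  COMMENT AS comment,
  'StoredProcedure' AS procedure_type
FROM SNOWFLAKE.ACCOUNT_USAGE.PROCEDURES
WHERE PROCEDURE_CATALOG = '{database_name}'
  AND PROCEDURE_SCHEMA = '{schema_name}'
"""


def render_procedures_query(database_name: str, schema_name: str) -> str:
    """Fill the placeholders; inputs are assumed to be trusted catalog names."""
    return PROCEDURES_QUERY_TEMPLATE.format(
        database_name=database_name, schema_name=schema_name
    )
```

Because the values are interpolated as text rather than bound as parameters, this approach only suits names that come from the catalog itself, not from user input.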
Original file line number Diff line number Diff line change
Expand Up @@ -6,4 +6,8 @@ The sourceConfig is defined [here](https://github.com/open-metadata/OpenMetadata

**markDeletedMlModels**: Set the Mark Deleted Ml Models toggle to flag ml models as soft-deleted if they are not present anymore in the source system.

{% /codeInfo %}
**mlModelFilterPattern**: Regex to only fetch MlModels with names matching the pattern.

**overrideMetadata**: Set the 'Override Metadata' toggle to control whether to override the existing metadata in the OpenMetadata server with the metadata fetched from the source. If the toggle is set to true, the metadata fetched from the source will override the existing metadata in the OpenMetadata server. If the toggle is set to false, the metadata fetched from the source will not override the existing metadata in the OpenMetadata server. This is applicable for fields like description, tags, owner and displayName.

{% /codeInfo %}
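The `overrideMetadata` behavior described above can be sketched as a small merge function. This is an illustration of the documented semantics, not the ingestion framework's code: `merge_metadata` is a hypothetical helper, and filling fields that are still empty when the toggle is off is an assumption of this sketch.

```python
from typing import Any, Dict

# Fields the documentation lists as affected by the toggle.
OVERRIDABLE_FIELDS = ("description", "tags", "owner", "displayName")


def merge_metadata(
    existing: Dict[str, Any], incoming: Dict[str, Any], override: bool
) -> Dict[str, Any]:
    """Merge source metadata into existing metadata.

    With override=True, source values replace existing ones.
    With override=False, existing values are kept; empty fields are
    filled from the source (an assumption of this sketch).
    """
    merged = dict(existing)
    for field in OVERRIDABLE_FIELDS:
        if field not in incoming:
            continue
        if override or merged.get(field) is None:
            merged[field] = incoming[field]
    return merged
```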
Original file line number Diff line number Diff line change
Expand Up @@ -3,4 +3,6 @@
config:
type: MlModelMetadata
# markDeletedMlModels: true
```
# mlModelFilterPattern: []
# overrideMetadata: false
```
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,14 @@

The sourceConfig is defined [here](https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-spec/src/main/resources/json/schema/metadataIngestion/messagingServiceMetadataPipeline.json):

**generateSampleData:** Option to turn on/off generating sample data during metadata extraction.
- **generateSampleData:** Option to turn on/off generating sample data during metadata extraction.

**topicFilterPattern:** Note that the `topicFilterPattern` supports regex as include or exclude.
- **topicFilterPattern:** Note that the `topicFilterPattern` supports regex as include or exclude.

- **generateSampleData:** Option to turn on/off generating sample data during metadata extraction. `generateSampleData` supports boolean value either `true` or `false`.

- **markDeletedTopics:** Optional configuration to soft delete topics in OpenMetadata if the source topics are deleted. Also, if the topic is deleted, all the associated entities like sample data, lineage, etc., with that topic will be deleted. `markDeletedTopics` supports boolean value either `true` or `false`.

- **overrideMetadata:** Set the 'Override Metadata' toggle to control whether to override the existing metadata in the OpenMetadata server with the metadata fetched from the source. If the toggle is set to true, the metadata fetched from the source will override the existing metadata in the OpenMetadata server. If the toggle is set to false, the metadata fetched from the source will not override the existing metadata in the OpenMetadata server. This is applicable for fields like description, tags, owner and displayName. `overrideMetadata` supports boolean value either `true` or `false`.

{% /codeInfo %}
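The include/exclude behavior of `topicFilterPattern` can be sketched as below. This assumes patterns are matched from the start of the topic name with `re.match` and that excludes win over includes. The `filter_topics` helper is hypothetical and may not match the framework's exact matching rules.

```python
import re
from typing import Iterable, List, Optional


def filter_topics(
    topics: Iterable[str],
    includes: Optional[List[str]] = None,
    excludes: Optional[List[str]] = None,
) -> List[str]:
    """Keep topics matching any include pattern (if given) and no exclude pattern."""
    kept = []
    for topic in topics:
        if includes and not any(re.match(pattern, topic) for pattern in includes):
            continue
        if excludes and any(re.match(pattern, topic) for pattern in excludes):
            continue
        kept.append(topic)
    return kept
```

For instance, including `^orders.*` and excluding `.*_dlq$` would keep an `orders` topic while dropping `orders_dlq` and `payments`.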
Original file line number Diff line number Diff line change
Expand Up @@ -8,4 +8,8 @@
# includes:
# - topic1
# generateSampleData: true
# generateSampleData: false # true
# markDeletedTopics: true # false
# overrideMetadata: false # true

```
Original file line number Diff line number Diff line change
Expand Up @@ -6,4 +6,8 @@ The sourceConfig is defined [here](https://github.com/open-metadata/OpenMetadata

**markDeletedMlModels**: Set the Mark Deleted Ml Models toggle to flag ml models as soft-deleted if they are not present anymore in the source system.

{% /codeInfo %}
**mlModelFilterPattern**: Regex to only fetch MlModels with names matching the pattern.

**overrideMetadata**: Set the 'Override Metadata' toggle to control whether to override the existing metadata in the OpenMetadata server with the metadata fetched from the source. If the toggle is set to true, the metadata fetched from the source will override the existing metadata in the OpenMetadata server. If the toggle is set to false, the metadata fetched from the source will not override the existing metadata in the OpenMetadata server. This is applicable for fields like description, tags, owner and displayName.

{% /codeInfo %}
Original file line number Diff line number Diff line change
Expand Up @@ -3,4 +3,6 @@
config:
type: MlModelMetadata
# markDeletedMlModels: true
```
# mlModelFilterPattern: []
# overrideMetadata: false
```
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,14 @@

The sourceConfig is defined [here](https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-spec/src/main/resources/json/schema/metadataIngestion/messagingServiceMetadataPipeline.json):

**generateSampleData:** Option to turn on/off generating sample data during metadata extraction.
- **generateSampleData:** Option to turn on/off generating sample data during metadata extraction.

**topicFilterPattern:** Note that the `topicFilterPattern` supports regex as include or exclude.
- **topicFilterPattern:** Note that the `topicFilterPattern` supports regex as include or exclude.

- **generateSampleData:** Option to turn on/off generating sample data during metadata extraction. `generateSampleData` supports boolean value either `true` or `false`.

- **markDeletedTopics:** Optional configuration to soft delete topics in OpenMetadata if the source topics are deleted. Also, if the topic is deleted, all the associated entities like sample data, lineage, etc., with that topic will be deleted. `markDeletedTopics` supports boolean value either `true` or `false`.

- **overrideMetadata:** Set the 'Override Metadata' toggle to control whether to override the existing metadata in the OpenMetadata server with the metadata fetched from the source. If the toggle is set to true, the metadata fetched from the source will override the existing metadata in the OpenMetadata server. If the toggle is set to false, the metadata fetched from the source will not override the existing metadata in the OpenMetadata server. This is applicable for fields like description, tags, owner and displayName. `overrideMetadata` supports boolean value either `true` or `false`.

{% /codeInfo %}
Original file line number Diff line number Diff line change
Expand Up @@ -8,4 +8,8 @@
# includes:
# - topic1
# generateSampleData: true
# generateSampleData: false # true
# markDeletedTopics: true # false
# overrideMetadata: false # true

```
Original file line number Diff line number Diff line change
Expand Up @@ -6,4 +6,8 @@ The sourceConfig is defined [here](https://github.com/open-metadata/OpenMetadata

**markDeletedMlModels**: Set the Mark Deleted Ml Models toggle to flag ml models as soft-deleted if they are not present anymore in the source system.

{% /codeInfo %}
**mlModelFilterPattern**: Regex to only fetch MlModels with names matching the pattern.

**overrideMetadata**: Set the 'Override Metadata' toggle to control whether to override the existing metadata in the OpenMetadata server with the metadata fetched from the source. If the toggle is set to true, the metadata fetched from the source will override the existing metadata in the OpenMetadata server. If the toggle is set to false, the metadata fetched from the source will not override the existing metadata in the OpenMetadata server. This is applicable for fields like description, tags, owner and displayName.

{% /codeInfo %}
Original file line number Diff line number Diff line change
Expand Up @@ -3,4 +3,6 @@
config:
type: MlModelMetadata
# markDeletedMlModels: true
```
# mlModelFilterPattern: []
# overrideMetadata: false
```
Original file line number Diff line number Diff line change
Expand Up @@ -183,7 +183,7 @@ source:
serviceName: local_sagemaker
serviceConnection:
config:
type: Sagemaker
type: SageMaker
awsConfig:
```
```yaml {% srNumber=1 %}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -85,7 +85,7 @@ If everything goes as planned, all the data would be displayed using the paramet
`/openmetadata/...` in your GCP Secret Manager console. The following image shows what it should look
like:

{% image src="/images/v1.5/deployment/secrets-manager/supported-implementations/gcp-secret-manager/gcp-secret-manager-console.png" alt="gcp-secret-manager-console" /%}
{% image src="/images/v1.5/deployment/secrets-manager/supported-implementations/gcp-secret-manager/gcp-secret-manager-console.png" alt="gcp-secret-manager-console" /%}

**Note:** If we want to change the starting path for our secrets names from `openmetadata` to a different one, we have
to change the property `clusterName` in our `openmetadata.yaml`. Also, if you inform the `prefix` value, it will be
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -59,13 +59,18 @@ GRANT SELECT ON ALL TABLES IN SCHEMA TEST_SCHEMA TO ROLE NEW_ROLE;
GRANT SELECT ON ALL EXTERNAL TABLES IN SCHEMA TEST_SCHEMA TO ROLE NEW_ROLE;
GRANT SELECT ON ALL VIEWS IN SCHEMA TEST_SCHEMA TO ROLE NEW_ROLE;
GRANT SELECT ON ALL DYNAMIC TABLES IN SCHEMA TEST_SCHEMA TO ROLE NEW_ROLE;

-- Grant IMPORTED PRIVILEGES on all Schemas of SNOWFLAKE DB to New Role.
-- Optional in general, but required for usage, lineage and stored procedure ingestion
GRANT IMPORTED PRIVILEGES ON ALL SCHEMAS IN DATABASE SNOWFLAKE TO ROLE NEW_ROLE;
```

{% note %}
If running any of:
- Incremental Extraction
- Ingesting Tags
- Usage Workflow
- Ingesting Stored Procedures
- Lineage & Usage Workflow

The following Grant is needed
{% /note %}
Expand All @@ -74,24 +79,11 @@ The following Grant is needed

- **Ingesting Tags**: OpenMetadata fetches the information by querying `snowflake.account_usage.tag_references`.

- **Usage Workflow**: Openmetadata fetches the query logs by querying `snowflake.account_usage.query_history` table. For this the snowflake user should be granted the `ACCOUNTADMIN` role or a role granted IMPORTED PRIVILEGES on the database `SNOWFLAKE`.

In order to be able to query those tables, the user should be either granted the `ACCOUNTADMIN` role or a role with the `IMPORTED PRIVILEGES` grant on the `SNOWFLAKE` database:

```sql
-- Grant IMPORTED PRIVILEGES on all Schemas of SNOWFLAKE DB to New Role
GRANT IMPORTED PRIVILEGES ON ALL SCHEMAS IN DATABASE SNOWFLAKE TO ROLE NEW_ROLE;
```
- **Lineage & Usage Workflow**: OpenMetadata fetches the query logs by querying the `snowflake.account_usage.query_history` table. For this, the Snowflake user should be granted the `ACCOUNTADMIN` role or a role granted `IMPORTED PRIVILEGES` on the `SNOWFLAKE` database.

You can find more information about the `account_usage` schema [here](https://docs.snowflake.com/en/sql-reference/account-usage).

Regarding Stored Procedures:
1. Snowflake only allows the grant of `USAGE` or `OWNERSHIP`
2. A user can only see the definition of the procedure in 2 situations:
1. If it has the `OWNERSHIP` grant,
2. If it has the `USAGE` grant and the procedure is created with `EXECUTE AS CALLER`.

Make sure to add the `GRANT <USAGE|OWNERSHIP> ON PROCEDURE <NAME>(<SIGNATURE>) to NEW_ROLE`, e.g., `GRANT USAGE ON PROCEDURE CLEAN_DATA(varchar, varchar) to NEW_ROLE`.
- **Ingesting Stored Procedures**: OpenMetadata fetches the information by querying `snowflake.account_usage.procedures` and `snowflake.account_usage.functions`.

## Metadata Ingestion

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -122,6 +122,22 @@ following [link](https://docs.confluent.io/platform/current/clients/confluent-ka

{% /codeInfo %}

{% codeInfo srNumber=9 %}
**securityProtocol**: The `security.protocol` consumer config property. It accepts `PLAINTEXT`, `SASL_PLAINTEXT`, `SASL_SSL`, or `SSL`.
{% /codeInfo %}

{% codeInfo srNumber=10 %}
**schemaRegistryTopicSuffixName**: Schema Registry Topic Suffix Name. The suffix to be appended to the topic name to get topic schema from registry.
{% /codeInfo %}

{% codeInfo srNumber=11 %}
**schemaRegistrySSL**: Schema Registry SSL Config. Configuration for enabling SSL for the Schema Registry connection.
{% /codeInfo %}

{% codeInfo srNumber=12 %}
**supportsMetadataExtraction**: Supports Metadata Extraction. `supportsMetadataExtraction` accepts a boolean value, either `true` or `false`.
{% /codeInfo %}
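The two connection options above can be illustrated with a short sketch: one helper checks `securityProtocol` against the four accepted values, the other appends `schemaRegistryTopicSuffixName` to a topic name to form the registry lookup name. Both helper names are hypothetical, and the `-value` default is taken from the sample YAML in this change rather than from the connector code.

```python
# The four values the documentation lists for securityProtocol.
VALID_SECURITY_PROTOCOLS = {"PLAINTEXT", "SASL_PLAINTEXT", "SASL_SSL", "SSL"}


def validate_security_protocol(value: str) -> str:
    """Return the value unchanged if it is an accepted protocol, else raise."""
    if value not in VALID_SECURITY_PROTOCOLS:
        accepted = ", ".join(sorted(VALID_SECURITY_PROTOCOLS))
        raise ValueError(f"Unsupported securityProtocol {value!r}; expected one of: {accepted}")
    return value


def schema_subject_for(topic: str, suffix: str = "-value") -> str:
    """Append the configured suffix to the topic name to build the registry lookup name."""
    return f"{topic}{suffix}"
```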

{% partial file="/v1.6/connectors/yaml/messaging/source-config-def.md" /%}

{% partial file="/v1.6/connectors/yaml/ingestion-sink-def.md" /%}
Expand Down Expand Up @@ -164,6 +180,18 @@ source:
```yaml {% srNumber=8 %}
schemaRegistryConfig: {}
```
```yaml {% srNumber=9 %}
# securityProtocol: PLAINTEXT
```
```yaml {% srNumber=10 %}
# schemaRegistryTopicSuffixName: -value
```
```yaml {% srNumber=11 %}
# schemaRegistrySSL: ""
```
```yaml {% srNumber=12 %}
# supportsMetadataExtraction: true
```

{% partial file="/v1.6/connectors/yaml/messaging/source-config.md" /%}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -117,6 +117,14 @@ following [link](https://docs.confluent.io/platform/current/clients/confluent-ka

{% /codeInfo %}

{% codeInfo srNumber=9 %}
**securityProtocol**: The `security.protocol` consumer config property. It accepts `PLAINTEXT`, `SASL_PLAINTEXT`, `SASL_SSL`, or `SSL`.
{% /codeInfo %}

{% codeInfo srNumber=10 %}
**supportsMetadataExtraction**: Supports Metadata Extraction. `supportsMetadataExtraction` accepts a boolean value, either `true` or `false`.
{% /codeInfo %}

{% partial file="/v1.6/connectors/yaml/messaging/source-config-def.md" /%}

{% partial file="/v1.6/connectors/yaml/ingestion-sink-def.md" /%}
Expand Down Expand Up @@ -159,6 +167,13 @@ source:
```yaml {% srNumber=8 %}
schemaRegistryConfig: {}
```
```yaml {% srNumber=9 %}
# securityProtocol: PLAINTEXT
```
```yaml {% srNumber=10 %}
# supportsMetadataExtraction: true
```


{% partial file="/v1.6/connectors/yaml/messaging/source-config.md" /%}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -183,7 +183,7 @@ source:
serviceName: local_sagemaker
serviceConnection:
config:
type: Sagemaker
type: SageMaker
awsConfig:
```
```yaml {% srNumber=1 %}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -85,7 +85,7 @@ If everything goes as planned, all the data would be displayed using the paramet
`/openmetadata/...` in your GCP Secret Manager console. The following image shows what it should look
like:

{% image src="/images/v1.6/deployment/secrets-manager/supported-implementations/gcp-secret-manager/gcp-secret-manager-console.png" alt="gcp-secret-manager-console" /%}
{% image src="/images/v1.6/deployment/secrets-manager/supported-implementations/gcp-secret-manager/gcp-secret-manager-console.png" alt="gcp-secret-manager-console" /%}

**Note:** If we want to change the starting path for our secrets names from `openmetadata` to a different one, we have
to change the property `clusterName` in our `openmetadata.yaml`. Also, if you inform the `prefix` value, it will be
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,11 +9,11 @@ slug: /main-concepts/metadata-standard/schemas/analytics/basic

## Definitions

- **`webAnalyticEventType`** *(string)*: event type. Must be one of: `['PageView', 'CustomEvent']`.
- **`webAnalyticEventType`** *(string)*: event type. Must be one of: `["PageView", "CustomEvent"]`.
- **`fullUrl`** *(string)*: complete URL of the page.
- **`url`** *(string)*: url part after the domain specification.
- **`hostname`** *(string)*: domain name.
- **`sessionId`**: Unique ID identifying a session. Refer to *../type/basic.json#/definitions/uuid*.
- **`sessionId`**: Unique ID identifying a session. Refer to *[../type/basic.json#/definitions/uuid](#/type/basic.json#/definitions/uuid)*.


Documentation file automatically generated at 2023-10-27 13:55:46.343512.
Documentation file automatically generated at 2025-01-15 09:05:25.266839+00:00.
Original file line number Diff line number Diff line change
Expand Up @@ -5,4 +5,14 @@ slug: /main-concepts/metadata-standard/schemas/analytics

# Analytics

Documentation file automatically generated at 2023-10-27 13:55:46.343512.
This folder contains the following items:

- [**ReportDataType**](/main-concepts/metadata-standard/schemas/analytics/reportdatatype)
- [**WebAnalyticEventType**](/main-concepts/metadata-standard/schemas/analytics/webanalyticeventtype)
- [**ReportData**](/main-concepts/metadata-standard/schemas/analytics/reportdata)
- [**WebAnalyticEventData**](/main-concepts/metadata-standard/schemas/analytics/webanalyticeventdata)
- [**WebAnalyticEvent**](/main-concepts/metadata-standard/schemas/analytics/webanalyticevent)
- [**Basic**](/main-concepts/metadata-standard/schemas/analytics/basic)


Documentation file automatically generated at 2025-01-15 09:05:25.266839+00:00.
Original file line number Diff line number Diff line change
Expand Up @@ -9,10 +9,16 @@ slug: /main-concepts/metadata-standard/schemas/analytics/reportdata

## Properties

- **`id`**: Unique identifier for a result. Refer to *../type/basic.json#/definitions/uuid*.
- **`timestamp`**: timestamp of a result ingestion. Refer to *../type/basic.json#/definitions/timestamp*.
- **`reportDataType`** *(string)*: Type of data. Must be one of: `['entityReportData', 'webAnalyticUserActivityReportData', 'webAnalyticEntityViewReportData', 'rawCostAnalysisReportData', 'aggregatedCostAnalysisReportData']`.
- **`id`**: Unique identifier for a result. Refer to *[../type/basic.json#/definitions/uuid](#/type/basic.json#/definitions/uuid)*.
- **`timestamp`**: timestamp of a result ingestion. Refer to *[../type/basic.json#/definitions/timestamp](#/type/basic.json#/definitions/timestamp)*.
- **`reportDataType`** *(string)*: Type of data. Must be one of: `["entityReportData", "webAnalyticUserActivityReportData", "webAnalyticEntityViewReportData", "rawCostAnalysisReportData", "aggregatedCostAnalysisReportData"]`.
- **`data`**: Data captured.
- **One of**
- : Refer to *[reportDataType/entityReportData.json](#portDataType/entityReportData.json)*.
- : Refer to *[reportDataType/webAnalyticUserActivityReportData.json](#portDataType/webAnalyticUserActivityReportData.json)*.
- : Refer to *[reportDataType/webAnalyticEntityViewReportData.json](#portDataType/webAnalyticEntityViewReportData.json)*.
- : Refer to *[reportDataType/rawCostAnalysisReportData.json](#portDataType/rawCostAnalysisReportData.json)*.
- : Refer to *[reportDataType/aggregatedCostAnalysisReportData.json](#portDataType/aggregatedCostAnalysisReportData.json)*.


Documentation file automatically generated at 2023-10-27 13:55:46.343512.
Documentation file automatically generated at 2025-01-15 09:05:25.266839+00:00.