Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(model 2.0): implement extractRelationshipsFromAspect util method #449

Merged
merged 5 commits into from
Oct 16, 2024
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions dao-impl/ebean-dao/build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ configurations {
dependencies {
compile project(':core-models-utils')
compile project(':dao-api')
compile project(':gradle-plugins:metadata-annotations-lib')
compile externalDependency.ebean
compile externalDependency.flywayCore
compile externalDependency.guava
Expand All @@ -20,6 +21,7 @@ dependencies {

annotationProcessor externalDependency.lombok

testCompile project(':gradle-plugins:metadata-annotations-test-models')
testCompile project(':testing:core-models-testing')
testCompile project(':testing:test-models')
testCompile externalDependency.mysql
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,11 @@
import com.linkedin.data.schema.RecordDataSchema;
import com.linkedin.data.template.DataTemplateUtil;
import com.linkedin.data.template.RecordTemplate;
import com.linkedin.metadata.annotations.AlwaysAllowList;
import com.linkedin.metadata.annotations.DeltaEntityAnnotationArrayMap;
import com.linkedin.metadata.annotations.GmaAnnotation;
import com.linkedin.metadata.annotations.GmaAnnotationParser;
import com.linkedin.metadata.annotations.ModelType;
import com.linkedin.metadata.aspect.AuditedAspect;
import com.linkedin.metadata.aspect.SoftDeletedAspect;
import com.linkedin.metadata.dao.EbeanMetadataAspect;
Expand All @@ -27,10 +31,13 @@
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;
import java.util.stream.Collectors;
import javax.annotation.Nonnull;
import javax.annotation.Nullable;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.lang.StringUtils;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
import org.json.simple.parser.ParseException;
Expand Down Expand Up @@ -366,4 +373,50 @@ public static LocalRelationshipCriterion buildRelationshipFieldCriterion(LocalRe
.setValue(localRelationshipValue)
.setCondition(condition);
}

/**
* Extract the non-null values from all top-level relationship fields of an aspect.
* @param aspect aspect (possibly with relationships) to be ingested
* @return a list of relationship arrays, with each array representing the relationship(s) present in each top-level relationship
* field in the aspect. An empty list means that there is no non-null relationship metadata attached to the given aspect.
*/
@Nonnull
public static <RELATIONSHIP extends RecordTemplate, ASPECT extends RecordTemplate> List<List<RELATIONSHIP>> extractRelationshipsFromAspect(ASPECT aspect) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this mean an Aspect can contain contain one relationship type?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few thoughts:

  1. have a cache for each Aspect type, and the relationship type and path, since these information are deterministic through the application life cycle. This cache maintains

    <Aspect.class, List<<Relationship.class, List<field_path>>>>

  2. the method could be List extractRelationshipFromAspect(Aspect aspect, Class relationshipType)

The implementation of 2 can be extended with recursive lookup or get List<field_path> for relationship type in an Aspect for fast access.

What's more important over the performance improvement is we can remove the 1:1 mapping from relationship to aspect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A cache is a good suggestion, although a caveat is that some of the relationship fields of an aspect can be not present when we allow an aspect to have multiple relationships. For example, aspect with model:

record AspectFoo {
    relationshipBars = optional array[RelationshipBar]
    relationshipFooBars = optional array[RelationshipFooBar]
}

On the first ingestion of this aspect, the relationshipBars is present but relationshipFooBars may not be present. so the cache will not register that field. So on the next ingestion of this aspect, we need to check every field again. I will leave this optimization for the future if we see that this logic has noticeable performance impact.

return aspect.schema().getFields().stream().filter(field -> !field.getType().isPrimitive()).map(field -> {
String fieldName = field.getName();
Class<RecordTemplate> clazz = (Class<RecordTemplate>) aspect.getClass();
try {
Method getMethod = clazz.getMethod("get" + StringUtils.capitalize(fieldName)); // getFieldName
Object obj = getMethod.invoke(aspect); // invoke getFieldName
// all relationship fields will be represented as Arrays so filter out any non-lists, empty lists, and lists that don't contain RecordTemplates
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a hard requirement? can't a relationship be a single value field in an aspect?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in recent PR to support single value field case: #452

if (!(obj instanceof List) || ((List) obj).isEmpty() || !(((List) obj).get(0) instanceof RecordTemplate)) {
return null;
}
List<RecordTemplate> relationshipsList = (List<RecordTemplate>) obj;
ModelType modelType = parseModelTypeFromGmaAnnotation(relationshipsList.get(0));
if (modelType == ModelType.RELATIONSHIP) {
hongliu2024 marked this conversation as resolved.
Show resolved Hide resolved
log.debug(String.format("Found {%d} relationships of type {%s} for field {%s} of aspect class {%s}.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: a more efficient way to write this log is

log.isDebugEnabled() {
log.debug(...)
}

saves the string formatter parsing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you clarify how this gets rid of the string formatting?

relationshipsList.size(), relationshipsList.get(0).getClass(), fieldName, clazz.getName()));
return (List<RELATIONSHIP>) relationshipsList;
}
} catch (ReflectiveOperationException e) {
throw new RuntimeException(e);
}
return null;
}).filter(Objects::nonNull).collect(Collectors.toList());
}

// Using the GmaAnnotationParser, extract the model type from the @gma.model annotation on any models.
private static ModelType parseModelTypeFromGmaAnnotation(RecordTemplate model) {
try {
final RecordDataSchema schema = (RecordDataSchema) DataTemplateUtil.getSchema(model.getClass());
final Optional<GmaAnnotation> gmaAnnotation = new GmaAnnotationParser(new AlwaysAllowList()).parse(schema);
if (!gmaAnnotation.isPresent() || !gmaAnnotation.get().hasModel()) {
return null;
}
return gmaAnnotation.get().getModel();
} catch (Exception e) {
throw new RuntimeException(String.format("Failed to parse the annotations for field %s", model.getClass().getCanonicalName()), e);
}
}
}
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
package com.linkedin.metadata.dao.utils;

import com.google.common.io.Resources;
import com.linkedin.data.template.IntegerArray;
import com.linkedin.data.template.RecordTemplate;
import com.linkedin.data.template.StringArray;
import com.linkedin.metadata.aspect.AuditedAspect;
import com.linkedin.metadata.dao.EbeanLocalAccess;
Expand All @@ -13,7 +15,15 @@
import com.linkedin.metadata.query.LocalRelationshipValue;
import com.linkedin.metadata.query.RelationshipField;
import com.linkedin.metadata.query.UrnField;
import com.linkedin.testing.AnnotatedAspectBarWithRelationshipFields;
import com.linkedin.testing.AnnotatedRelationshipBar;
import com.linkedin.testing.AnnotatedRelationshipBarArray;
import com.linkedin.testing.AnnotatedRelationshipFoo;
import com.linkedin.testing.AnnotatedRelationshipFooArray;
import com.linkedin.testing.AspectFoo;
import com.linkedin.testing.AnnotatedAspectFooWithRelationshipField;
import com.linkedin.testing.CommonAspect;
import com.linkedin.testing.CommonAspectArray;
import com.linkedin.testing.urn.BurgerUrn;
import com.linkedin.testing.urn.FooUrn;
import io.ebean.Ebean;
Expand Down Expand Up @@ -567,4 +577,62 @@ public void testBuildRelationshipFieldCriterionWithAspectField() {
relationshipField);
assertEquals(relationshipField, filterCriterion.getField().getRelationshipField());
}

@Test
public void testExtractRelationshipsFromAspect() {
// case 1: aspect model does not contain any relationship typed fields
// expected: return null
AspectFoo foo = new AspectFoo();
assertTrue(EBeanDAOUtils.extractRelationshipsFromAspect(foo).isEmpty());

// case 2: aspect model contains only a non-null relationship type field
// expected: return the non-null relationship
AnnotatedRelationshipFoo relationshipFoo = new AnnotatedRelationshipFoo();
AnnotatedRelationshipFooArray relationshipFoos = new AnnotatedRelationshipFooArray(relationshipFoo);
AnnotatedAspectFooWithRelationshipField fooWithRelationshipField = new AnnotatedAspectFooWithRelationshipField()
.setRelationshipFoo(relationshipFoos);

List<List<RecordTemplate>> results = EBeanDAOUtils.extractRelationshipsFromAspect(fooWithRelationshipField);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if EBeanDAOUtils.extractRelationshipsFromAspect returns List<List>, what's the point for the generic type RELATIONSHIP

assertEquals(1, results.size());
assertEquals(1, results.get(0).size());
assertEquals(relationshipFoo, results.get(0).get(0));

// case 3: aspect model contains only a null relationship type field
// expected: return null
AnnotatedAspectFooWithRelationshipField fooWithNullRelationshipField = new AnnotatedAspectFooWithRelationshipField();
assertTrue(EBeanDAOUtils.extractRelationshipsFromAspect(fooWithNullRelationshipField).isEmpty());

// case 4: aspect model contains multiple relationship fields, some null and some non-null, as well as array fields
// containing non-Relationship objects
// expected: return only the non-null relationships
relationshipFoos = new AnnotatedRelationshipFooArray(new AnnotatedRelationshipFoo(), new AnnotatedRelationshipFoo());
AnnotatedRelationshipBarArray relationshipBars = new AnnotatedRelationshipBarArray(new AnnotatedRelationshipBar());
// given:
// aspect = {
// value -> "abc"
// integers -> [1]
// nonRelationshipStructs -> [commonAspect1]
// relationshipFoos -> [foo1, foo2]
// relationshipBars -> [bar1]
// moreRelationshipFoos -> not present
// }
// expect:
// [[foo1, foo2], [bar1]]
AnnotatedAspectBarWithRelationshipFields barWithRelationshipFields = new AnnotatedAspectBarWithRelationshipFields()
.setValue("abc")
.setIntegers(new IntegerArray(1))
.setNonRelationshipStructs(new CommonAspectArray(new CommonAspect()))
.setRelationshipFoos(relationshipFoos)
.setRelationshipBars(relationshipBars); // don't set moreRelationshipFoos field

results = EBeanDAOUtils.extractRelationshipsFromAspect(barWithRelationshipFields);
assertEquals(2, results.size());
assertEquals(2, results.get(0).size());
assertEquals(1, results.get(1).size());
assertEquals(new AnnotatedRelationshipFoo(), results.get(0).get(0));
assertEquals(new AnnotatedRelationshipFoo(), results.get(0).get(1));
assertEquals(new AnnotatedRelationshipBar(), results.get(1).get(0));
}


}
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,15 @@ record GmaAnnotation {
*/
delta: optional DeltaAnnotation

/**
* The type of model
*/
model: optional enum ModelType {
ASSET,
ASPECT,
RELATIONSHIP
}

/**
* Information about GMA Search functionality.
*/
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
namespace com.linkedin.testing

/**
* For unit tests
*/
@gma.aspect.column.name = "annotatedaspectbarwithrelationshipfields"
@gma.model = "ASPECT"
record AnnotatedAspectBarWithRelationshipFields {
/**
* For unit tests
*/
value: string

/**
* For unit tests
*/
integers: optional array[int]

/**
* For unit tests
*/
nonRelationshipStructs: optional array[CommonAspect]

/**
* For unit tests
*/
relationshipFoos: optional array[AnnotatedRelationshipFoo]

/**
* For unit tests
*/
relationshipBars: optional array[AnnotatedRelationshipBar]

/**
* For unit tests
*/
moreRelationshipFoos: optional array[AnnotatedRelationshipFoo]
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
namespace com.linkedin.testing

/**
* For unit tests
*/
@gma.aspect.column.name = "annotatedaspectfoowithrelationshipfield"
@gma.model = "ASPECT"
record AnnotatedAspectFooWithRelationshipField {
/**
* For unit tests
*/
value: string

/**
* For unit tests
*/
relationshipFoo: optional array[AnnotatedRelationshipFoo]
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
namespace com.linkedin.testing

import com.linkedin.common.Urn

/**
* For unit tests
*/
@pairings = [ {
jsdonn marked this conversation as resolved.
Show resolved Hide resolved
"destination" : "com.linkedin.common.urn.Urn",
"source" : "com.linkedin.common.urn.Urn"
} ]
@gma.model = "RELATIONSHIP"
record AnnotatedRelationshipBar {

/**
* For unit tests
*/
source: Urn

/**
* For unit tests
*/
destination: Urn
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
namespace com.linkedin.testing

import com.linkedin.common.Urn

/**
* For unit tests
*/
@pairings = [ {
"destination" : "com.linkedin.common.urn.Urn",
"source" : "com.linkedin.common.urn.Urn"
} ]
@gma.model = "RELATIONSHIP"
record AnnotatedRelationshipFoo {

/**
* For unit tests
*/
source: Urn

/**
* For unit tests
*/
destination: Urn
}
Loading