Data Validation Framework for Spark Streaming, Batch and Java Entities.
This framework lets you test and validate your Spark datasets. It offers 3 modes:
- Inline Validator - Streaming applications, with in-memory data validation.
- Offline Validator - Batch applications.
- Entity Validator - Java entities in any application.
The Rule Engine: Data Validator is built on the Drools rule engine, which offers several levels of data validation and error handling.
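To give a feel for what rule-based validation of a Java entity looks like, here is a minimal sketch in plain Java. It does not use Drools and is not this framework's API: the `Customer` entity, the `Rule` class, and `EntityValidatorSketch` are all hypothetical names made up for illustration. Each rule pairs a failure condition with an error message, and the validator collects the messages of every rule the entity violates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical entity, used only for this illustration.
class Customer {
    final String name;
    final int age;

    Customer(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

// Hypothetical rule: a condition that signals a violation, plus the error to report.
class Rule<T> {
    final String message;
    final Predicate<T> violated;

    Rule(String message, Predicate<T> violated) {
        this.message = message;
        this.violated = violated;
    }
}

// Plain-Java stand-in for a rule engine; the real framework delegates this to Drools.
class EntityValidatorSketch {

    // Runs every rule against the entity and returns the messages of the violated ones.
    static <T> List<String> validate(T entity, List<Rule<T>> rules) {
        List<String> errors = new ArrayList<>();
        for (Rule<T> rule : rules) {
            if (rule.violated.test(entity)) {
                errors.add(rule.message);
            }
        }
        return errors;
    }

    // Example rule set for the hypothetical Customer entity.
    static List<Rule<Customer>> customerRules() {
        List<Rule<Customer>> rules = new ArrayList<>();
        rules.add(new Rule<>("name must not be empty",
                c -> c.name == null || c.name.isEmpty()));
        rules.add(new Rule<>("age must be between 0 and 150",
                c -> c.age < 0 || c.age > 150));
        return rules;
    }

    public static void main(String[] args) {
        // An invalid entity yields one message per violated rule.
        System.out.println(validate(new Customer("", 200), customerRules()));
        // A valid entity yields an empty list.
        System.out.println(validate(new Customer("Ada", 36), customerRules()));
    }
}
```

Returning a list of messages instead of throwing on the first failure means a caller sees every violation for an entity in one pass, which is one common way to handle validation errors.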
Note: The framework is still under development; the remaining capabilities will be added soon.
Test Locally:
Run com.data.validator.batch.ExecuteMain.main