Get formatted schema and anomalies to visualize #146
Comments
What do you mean by "visualizeable formatted" data? The schema and stats are protocol buffer [1] objects. They implemented [1] https://developers.google.com/protocol-buffers |
Thanks, @brills, and sorry, my writing was bad. I want to visualize the result like this on Kubeflow Pipeline. For that, I want to get the dataframe created in |
Thanks for the clarification. We noted it in our internal bug tracker. What you suggested makes sense to me. But I'll check w/ the Kubeflow team to understand what their UI is capable of displaying first. In the meantime, please keep using your "hack". As you can see, that piece of logic has been stable (and the part it extracts from the schema also has been stable).
I understand. Thanks!
A vote of support for this feature. I was trying to do exactly the same thing – DataFrames are much easier to work with than protos, especially for visualization in JS. I also ended up copying the |
I'm trying to run a TFDV process in Kubeflow Pipelines and visualize the results in the pipeline UI.
For statistics, I can easily visualize them using `get_statistics_html`.
However, for the schema and anomalies, I struggled. We have the `display_schema` and `display_anomalies` functions, but each one transforms the data and calls the IPython display internally. So there is no way to get the data in a format that can be visualized elsewhere.
In the end, I copied most of the display functions and changed them to return a DataFrame.
FYI, the code is like this.
Does anyone know a better way?
What do you think about splitting each display function into a transforming function and a visualizing function, as is already done for statistics?
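To make the proposed split concrete, here is a minimal sketch that does not use TFDV's real types. `AnomalyRecord`, `anomalies_to_rows`, and `rows_to_html` are hypothetical names invented for illustration, and the column labels are only examples. The point is that the transform step returns plain rows with no display side effects, and a separate render step turns those rows into HTML that a pipeline UI could embed.

```python
from dataclasses import dataclass
from html import escape
from typing import Dict, List


@dataclass
class AnomalyRecord:
    """Hypothetical stand-in for one per-feature anomaly entry (not a TFDV type)."""
    feature_name: str
    short_description: str
    long_description: str


def anomalies_to_rows(anomalies: List[AnomalyRecord]) -> List[Dict[str, str]]:
    """Transform step: data in, plain rows out, no display side effects."""
    return [
        {
            "Feature name": a.feature_name,
            "Anomaly short description": a.short_description,
            "Anomaly long description": a.long_description,
        }
        for a in anomalies
    ]


def rows_to_html(rows: List[Dict[str, str]]) -> str:
    """Render step: build an HTML table string from rows, escaping cell text."""
    if not rows:
        return "<p>No anomalies found.</p>"
    columns = list(rows[0].keys())
    header = "".join(f"<th>{escape(c)}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(row[c]))}</td>" for c in columns) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{header}</tr>{body}</table>"
```

With this shape, a notebook helper could call both steps and display the result, while a pipeline component could call only the transform step and pass the rows on (for example, to build a DataFrame from the list of dicts).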