Transcript
This transcript was generated automatically and may contain errors.
Here are five data validation libraries for running data quality checks on your Polars data frames.
1. Pandera. Schema-first validation with statistical tests. Pandera has a heavy emphasis on statistical validation, like hypothesis testing.
2. Patito. Pydantic-style data models for data frames. Patito emphasizes integration with existing Pydantic workflows and object modeling.
3. Pointblank. Pointblank is a comprehensive validation library that focuses on step-by-step validation and produces beautiful, interactive HTML reports that are perfect for sharing with stakeholders.
4. Validoopsie. Composable checks with smart failure handling. Validoopsie prioritizes flexibility and operational robustness.
5. Dataframely. Type-safe schema validation with advanced features. Dataframely emphasizes relational data integrity and type safety.
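To make the idea of step-by-step, composable checks concrete, here is a minimal sketch in plain Python. It does not use the API of any of the five libraries or of Polars. The names `Table`, `StepResult`, `check_column`, and `validate` are hypothetical, and a dictionary of lists stands in for a data frame. It only illustrates the general pattern of running every check and collecting per-step results rather than stopping at the first failure.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in for a data frame: column name -> list of values.
Table = dict[str, list]


@dataclass
class StepResult:
    """Outcome of one validation step."""
    name: str
    failed: int
    total: int

    @property
    def passed(self) -> bool:
        return self.failed == 0


def check_column(name: str, column: str, predicate: Callable[[object], bool]):
    """Build one validation step that tests every value in a column."""
    def run(table: Table) -> StepResult:
        values = table[column]
        failed = sum(1 for v in values if not predicate(v))
        return StepResult(name, failed, len(values))
    return run


def validate(table: Table, steps) -> list[StepResult]:
    """Run every step and collect the results, without stopping on failure."""
    return [step(table) for step in steps]


# Illustrative data only.
table = {"id": [1, 2, 3, 4], "price": [9.5, -1.0, 20.0, None]}

steps = [
    check_column("id is positive", "id", lambda v: v is not None and v > 0),
    check_column("price is not null", "price", lambda v: v is not None),
    check_column("price is non-negative", "price", lambda v: v is not None and v >= 0),
]

results = validate(table, steps)
for r in results:
    status = "PASS" if r.passed else "FAIL"
    print(f"{status} {r.name}: {r.failed}/{r.total} rows failed")
```

The libraries in the list layer much more on top of this pattern, such as schemas, typed models, failure thresholds, and reports, so check each library's own documentation for its real interface.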
If you like this list, check out Rich Iannone's original post, linked in the description, and subscribe for more updates from Posit.