Home
FabGuard is a Python library that simplifies input file verification. It is based on the data validation library Pandera and adapted for FabFlee. This documentation will guide you through the steps to use FabGuard for input file verification.
Before you get started with FabGuard, make sure you have the following prerequisites in place:
- The Pandas library installed.
- The Pandera library installed.
FabGuard is a plugin for FabFlee. The structure of the FabGuard folder is as follows:
- tests folder: Contains schemes (tests) for various input files. For example, the closure_scheme folder contains verification tests for the closure file.
- config.py: Contains configuration information, including the names of test files.
- error_messages.py: Contains error messages used in your verification checks.
- fab_guard.py: The main wrapper for Pandera tests. It defines decorators, such as fg.log for functions defining error messages and fg-check for functions that should be executed as part of the test suite. It also provides utility functions like load_files for reading a CSV file and returning a DataFrame, and transpose for transposing a CSV file.
Each scheme file contains a class that inherits from pa.DataFrameModel.
To avoid reading the same file repeatedly, each test file is loaded into memory only once. The singleton class FabGuard handles this. Load each file through the singleton instance, for example:
FabGuard.get_instance().load_file(config.routes)
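The load-once behaviour described above can be illustrated with a minimal sketch of a caching singleton. This is not FabGuard's actual implementation: the class name FileCache, the reads counter, and the use of the standard csv module instead of pandas are all assumptions made to keep the example dependency-free. Only the get_instance() and load_file() names mirror the call shown above.

```python
import csv
import os
import tempfile


class FileCache:
    """Hypothetical stand-in for the FabGuard singleton (illustration only)."""

    _instance = None

    def __init__(self):
        self._tables = {}  # maps file path -> parsed rows
        self.reads = 0     # counts how often a file was actually read from disk

    @classmethod
    def get_instance(cls):
        # Create the single shared instance on first use, then reuse it.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def load_file(self, path):
        # Read the file only if it has not been loaded before.
        if path not in self._tables:
            with open(path, newline="") as handle:
                self._tables[path] = list(csv.reader(handle))
            self.reads += 1
        return self._tables[path]


# Demonstration with a temporary routes file (made-up contents).
with tempfile.TemporaryDirectory() as tmp:
    routes_path = os.path.join(tmp, "routes.csv")
    with open(routes_path, "w", newline="") as handle:
        handle.write("name1,name2,distance\nA,B,100\n")

    first = FileCache.get_instance().load_file(routes_path)
    second = FileCache.get_instance().load_file(routes_path)
```

Both calls return the same cached object, and the file is read from disk only once, which is the point of routing every load through a single shared instance.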
In this guide, we will create tests for the locations.csv file as an example. Follow these steps to create your validation tests:
- Create a Python class that inherits from pa.DataFrameModel.
- In this class, define constraints for each column as fields of the class. For example, if you have a routes file with columns like name1, name2, distance, and forced_redirection, define the constraints as follows:
name1: Series[pa.String] = pa.Field(nullable=False, alias='#"name1"')
name2: Series[pa.String] = pa.Field(nullable=False)
distance: Series[int] = pa.Field(ge=0)
forced_redirection: Series[float] = pa.Field(isin=[0, 1, 2], nullable=True)