Building and Running Expressions
The xorq CLI provides two commands, build and run, that help you separate the definition of data transformations from their execution. This separation enables:
- Reproducibility: Define transformations once and execute them consistently across different environments.
- Serialization: Store complex queries as compiled artifacts that can be shared, versioned, and executed without the original code.
- Performance optimization: Pre-compile expressions to avoid repeated parsing and optimization at runtime.
Prerequisites
Before starting, make sure you have xorq installed:
Building Expressions
The build command compiles an Ibis expression into a reusable artifact that can be executed later.
Basic Usage
The basic syntax for the build command is:
Where:
- <script_path> is the path to your Python script containing the expression
- <expression_name> is the name of the variable holding the Ibis expression
- <builds-dir> is where the artifacts will be generated (defaults to “builds”)
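The three inputs described above can be assembled into a command line from Python. This is a minimal sketch, not part of xorq: the helper build_command is hypothetical, and the flag spellings -e and --builds-dir are assumptions that should be checked against the output of xorq build --help.

```python
import shlex


def build_command(script_path, expression_name, builds_dir="builds"):
    """Assemble an argument list for `xorq build`.

    The flag spellings `-e` and `--builds-dir` are assumptions,
    not confirmed by this document.
    """
    if not expression_name:
        # The build command needs the name of the variable to build.
        raise ValueError("an expression name is required")
    return [
        "xorq", "build", script_path,
        "-e", expression_name,
        "--builds-dir", builds_dir,
    ]


cmd = build_command("pipeline.py", "expr", builds_dir="artifacts")
print(shlex.join(cmd))
```

Passing the list to subprocess.run avoids shell quoting problems when a path contains spaces.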
Example
Let’s create a simple script that defines an Ibis expression:
Now, let’s build this expression using the CLI:
This command will:
- Load the pipeline.py script
- Find the expr variable
- Generate artifacts based on the expression
- Save them to the artifacts directory
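The first two steps, loading a script and looking up a named variable, can be illustrated with the standard library alone. This is a sketch of the general technique, not xorq's actual implementation; load_variable is a hypothetical helper, and the stand-in script assigns a plain string where a real pipeline.py would assign an Ibis expression.

```python
import runpy
import tempfile
from pathlib import Path


def load_variable(script_path, name):
    """Execute a Python script and return its top-level variable `name`."""
    path = Path(script_path)
    if not path.is_file():
        raise FileNotFoundError(f"script not found: {path}")
    # run_path executes the file and returns its global namespace as a dict.
    namespace = runpy.run_path(str(path))
    if name not in namespace:
        raise KeyError(f"variable {name!r} not found in {path}")
    return namespace[name]


# Demonstration with a stand-in script written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "pipeline.py"
    script.write_text("expr = 'placeholder for an Ibis expression'\n")
    loaded = load_variable(script, "expr")

print(loaded)
```

Note that this approach runs the whole script, so any side effects in it (such as opening connections) happen at load time.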
You should see output similar to:
Currently, the build command does not support in-memory tables, because it is not yet clear how best to serialize and deserialize DataFrames for build artifacts.
Running Expressions
Once you’ve built an expression, you can execute it with the run command.
Basic Usage
The basic syntax for the run command is:
Where:
- <build_path> is the path to the built expression
- The --output-path argument specifies where to write the results (defaults to os.devnull, which discards the output)
- The --format argument specifies the output format, which can be “csv”, “json”, or “parquet” (defaults to “parquet”)
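The arguments and defaults listed above can be captured in a small Python wrapper. This is a sketch only: run_command is a hypothetical helper, the positional placement of the build path is an assumption, and builds/my_build is a made-up path used for illustration.

```python
import os

# The three formats named in the argument description above.
SUPPORTED_FORMATS = ("csv", "json", "parquet")


def run_command(build_path, output_path=os.devnull, output_format="parquet"):
    """Assemble an argument list for `xorq run` using the documented defaults."""
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError(
            f"unsupported format {output_format!r}; "
            f"expected one of {SUPPORTED_FORMATS}"
        )
    return [
        "xorq", "run", build_path,  # positional build path is an assumption
        "--output-path", output_path,
        "--format", output_format,
    ]


# "builds/my_build" is a hypothetical build path.
csv_cmd = run_command("builds/my_build", "results.csv", "csv")
default_cmd = run_command("builds/my_build")
```

Checking the format before launching the CLI gives an early, local error instead of a failed subprocess.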
Example
To run the expression we built earlier and save the results to a parquet file:
To save the results as CSV instead:
Error Handling
The CLI will provide helpful error messages if:
- The script doesn’t exist
- The expression variable isn’t found in the script
- The variable isn’t an Ibis expression
- No expression name is provided for the build command
- The build path doesn’t exist or isn’t a valid xorq artifact
- The output format isn’t supported
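When the CLI is driven from a script, its error messages can be surfaced as Python exceptions. The sketch below assumes that xorq exits with a non-zero status and writes its message to stderr on failure, which this document does not confirm. The helper run_cli is hypothetical, and the demonstration uses the Python interpreter as a stand-in for the xorq executable.

```python
import subprocess
import sys


def run_cli(argv):
    """Run a command and raise RuntimeError carrying its stderr on failure."""
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        # Assumption: the tool reports its error message on stderr.
        message = result.stderr.strip() or f"exit status {result.returncode}"
        raise RuntimeError(message)
    return result.stdout


# Stand-in for a successful CLI invocation.
ok_output = run_cli([sys.executable, "-c", "print('ok')"])

# Stand-in for a failing invocation that reports an error on stderr.
try:
    run_cli([sys.executable, "-c", "import sys; sys.exit('expression not found')"])
except RuntimeError as err:
    error_message = str(err)
```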