The xorq CLI provides two powerful commands, build and run, that help you separate the definition of data transformations from their execution. This separation enables:

  • Reproducibility: Define transformations once and execute them consistently across different environments with guaranteed identical results.
  • Serialization: Store complex queries as compiled artifacts that can be shared, versioned, and executed without the original code
  • Performance optimization: Pre-compile expressions to avoid repeated parsing and optimization at runtime

Prerequisites

Before starting, make sure you have xorq installed:

pip install xorq

Building Expressions

The build command compiles an Ibis expression into a reusable artifact that can be executed later.

Basic Usage

The basic syntax for the build command is:

xorq build <script_path> -e <expression_name> --builds-dir <output_directory>

Where:

  • <script_path> is the path to your Python script containing the expression
  • <expression_name> is the name of the variable holding the Ibis expression
  • <builds-dir> is where the artifacts will be generated (defaults to “builds”)

Example

Let’s create a simple script that defines an Ibis expression:

import xorq as xo
from xorq.common.utils.defer_utils import deferred_read_parquet
from xorq.expr.relations import into_backend


pg = xo.postgres.connect_env()
db = xo.duckdb.connect()

batting = pg.table("batting")

integer = 1

backend = xo.duckdb.connect()
awards_players = deferred_read_parquet(
    backend,
    xo.config.options.pins.get_path("awards_players"),
    table_name="award_players",
)
left = batting.filter(batting.yearID == 2015)
right = awards_players.filter(awards_players.lgID == "NL").drop("yearID", "lgID")
expr = left.join(
    into_backend(right, pg, "pg-filtered-table"), ["playerID"], how="semi"
)[["yearID", "stint"]]

Now, let’s build this expression using the CLI:

xorq build pipeline.py -e expr --builds-dir artifacts

This command will:

  1. Load the pipeline.py script
  2. Find the expr variable
  3. Generate artifacts based on the expression
  4. Save them to the artifacts directory

You should see output similar to:

Building expr from pipeline.py
Written 'expr' to artifacts/3350466c8fcd

Currently, the build command does not support in-memory tables as it’s unclear what the best approach for serializing/deserializing DataFrames for build artifacts would be.

Running Expressions

Once you’ve built an expression, you can execute it with the run command.

Basic Usage

The basic syntax for the run command is:

xorq run <build_path> --output-path <output_file> --format <output_format>

Where:

  • <build_path> is the path to the built expression
  • --output-path argument specifies where to write the results (defaults to discarding output, writing to os.devnull)
  • --format argument specifies the output format, which can be “csv”, “json”, or “parquet” (defaults to “parquet”)

Example

To run the expression we built earlier and save the results to a parquet file:

xorq run artifacts/2a3b46d8d3d0 --output-path results.parquet

To save the results as CSV instead:

xorq run artifacts/2a3b46d8d3d0 --output-path results.csv --format csv

Error Handling

The CLI will provide helpful error messages if:

  • The script doesn’t exist
  • The expression variable isn’t found in the script
  • The variable isn’t an Ibis expression
  • No expression name is provided for the build command
  • The build path doesn’t exist or isn’t a valid xorq artifact
  • The output format isn’t supported