Python API

Core Operations

APIs for reading and returning data

connect Create a xorq backend.
execute Execute an expression against its backend if one exists.
memtable Construct an ibis table expression from in-memory data.
read_csv Lazily load a CSV or set of CSVs.
read_parquet Lazily load a Parquet file or set of Parquet files.
deferred_read_csv Create a deferred read operation for CSV files that will execute only when needed.
deferred_read_parquet Create a deferred read operation for Parquet files that will execute only when needed.
to_csv Write the results of executing the given expression to a CSV file.
to_json Write the results of expr to an NDJSON file.
to_parquet Write the results of executing the given expression to a Parquet file.
to_pyarrow Execute expression and return results as a pyarrow table.
to_pyarrow_batches Execute expression and return a RecordBatchReader.
to_sql Return the formatted SQL string for an expression.
register
get_plans
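
Several entries above (read_csv, deferred_read_csv, deferred_read_parquet) describe loading that is lazy or deferred until execution. As a rough illustration of that idea only, the following standard-library sketch wraps a CSV read in an object that performs no I/O until its execute() method is called. The class DeferredCsv and its attributes are hypothetical names invented for this example; they are not part of xorq, whose functions build table expressions rather than plain Python objects.

```python
import csv
import os
import tempfile


class DeferredCsv:
    """Hypothetical stand-in for a deferred read: stores the path, reads nothing yet."""

    def __init__(self, path):
        self.path = path
        self.reads = 0  # counts how many times the file was actually opened

    def execute(self):
        # The file is opened only here, when results are requested.
        self.reads += 1
        with open(self.path, newline="") as f:
            return list(csv.DictReader(f))


# Write a small CSV to a temporary directory for demonstration.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "example.csv")
with open(csv_path, "w", newline="") as f:
    csv.writer(f).writerows([["id", "value"], ["1", "a"], ["2", "b"]])

deferred = DeferredCsv(csv_path)  # no I/O has happened at this point
reads_before = deferred.reads
rows = deferred.execute()         # the read happens now
```

The point of deferring is that the description of a read can be passed around and combined with other operations cheaply, with the cost of touching the file paid once at execution time.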

Data Operations

Table An immutable and lazy dataframe.
GroupedTable An intermediate table expression to hold grouping information.
Value Base class for a data generating expression having a known type.
Scalar
Column
NumericColumn
IntegerColumn
FloatingColumn
StringValue
TimeValue
DateValue
DayOfWeek A namespace of methods for extracting day of week information.
TimestampValue
IntervalValue

Caching

ParquetCache Cache expressions as Parquet files using a snapshot invalidation strategy.
ParquetSnapshotCache Cache expressions as Parquet files using a snapshot invalidation strategy.
SourceCache
SourceSnapshotCache

Data Operations

Table, column, and value types

window Create a window clause for use with window functions.
selectors Convenient column selectors.

Machine Learning Operations

Machine Learning Functions and Helpers

train_test_splits Generates multiple train/test splits of an Ibis table for different test sizes.
Step A single step in a machine learning pipeline that wraps a scikit-learn estimator.
Pipeline A machine learning pipeline that chains multiple processing steps together.
FittedPipeline
deferred_fit_predict
deferred_fit_transform
calc_split_column
make_quickgrove_udf Create a UDF from a quickgrove model.
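
The split helpers listed above (train_test_splits, calc_split_column) assign rows to train and test partitions. One common way to make such an assignment reproducible is to hash a key for each row and compare the resulting bucket against the test size. The sketch below shows that general technique with the standard library only; the function split_bucket and its parameters (test_size, num_buckets, seed) are assumptions made for illustration and should not be read as xorq's implementation or signature.

```python
import hashlib


def split_bucket(key, test_size=0.25, num_buckets=10_000, seed=0):
    """Return 'test' or 'train' for a key; the same key always gets the same label."""
    # Hash the seed together with the key so different seeds give different splits.
    digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % num_buckets
    # Keys landing in the lowest test_size fraction of buckets go to the test set.
    return "test" if bucket < test_size * num_buckets else "train"


keys = range(1000)
assignment = {k: split_bucket(k) for k in keys}
test_fraction = sum(v == "test" for v in assignment.values()) / len(assignment)
```

Because the label depends only on the key and the seed, re-running the split on the same data gives the same partitions, and the observed test fraction approaches test_size as the number of distinct keys grows.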

Lineage

Data lineage tracking utilities

build_column_trees Builds a lineage tree for each column in the expression.
build_tree
print_tree

Flight Operations

Apache Arrow Flight server and client operations

FlightServer
FlightUrl
make_udxf

Catalog Operations

Compute catalog management

XorqCatalog Xorq Catalog container.
Build Build information.
Alias
CatalogMetadata Catalog metadata.

Type System

Data types and schemas

Data types Scalar and column data types
Schemas Table schemas

UDF System

Functions for creating UDFs

make_pandas_udf Create a scalar User-Defined Function (UDF) that operates on pandas DataFrames.
make_pandas_expr_udf Create an expression-based scalar UDF that incorporates pre-computed values.
pyarrow_udwf Create a User-Defined Window Function (UDWF) using PyArrow.
agg.pyarrow Decorator for creating PyArrow-based aggregation functions.
agg.pandas_df Create a pandas DataFrame-based aggregation function.
flight_udxf Create a User-Defined Exchange Function (UDXF) that executes a pandas DataFrame