Python API

Core Operations

APIs for reading and returning data

read_csv Lazily load a CSV or set of CSVs.
read_parquet Lazily load a parquet file or set of parquet files.
memtable Construct an ibis table expression from in-memory data.
to_sql Return the formatted SQL string for an expression.
execute Execute an expression against its backend if one exists.
to_pyarrow_batches Execute expression and return a RecordBatchReader.
to_pyarrow Execute expression and return results as a pyarrow table.
to_parquet Write the results of executing the given expression to a parquet file.
to_csv Write the results of executing the given expression to a CSV file.
to_json Write the results of executing the given expression to an NDJSON file.
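
For readers unfamiliar with the NDJSON format that to_json targets, the following is a minimal sketch of the format itself, using only the Python standard library. It does not call this library's to_json; the helper names write_ndjson and read_ndjson are hypothetical and exist only for illustration. NDJSON (newline-delimited JSON) stores one complete JSON object per line, so a file can be streamed record by record.

```python
import io
import json


def write_ndjson(rows, fh):
    """Write each row as one JSON object per line (hypothetical helper)."""
    for row in rows:
        fh.write(json.dumps(row) + "\n")


def read_ndjson(fh):
    """Parse one JSON object per non-empty line (hypothetical helper)."""
    return [json.loads(line) for line in fh if line.strip()]


# Illustrative rows; any list of JSON-serializable dicts would work.
rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

buf = io.StringIO()
write_ndjson(rows, buf)
ndjson_text = buf.getvalue()

# Rewind and read the records back line by line.
buf.seek(0)
round_tripped = read_ndjson(buf)
```

Because every line is an independent JSON document, a consumer can process such a file incrementally instead of parsing one large array.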

Data Operations

Table An immutable and lazy dataframe.
GroupedTable An intermediate table expression to hold grouping information.
Value Base class for a data generating expression having a known type.
Scalar
Column
NumericColumn
IntegerColumn
FloatingColumn
StringValue
TimeValue
DateValue
DayOfWeek A namespace of methods for extracting day of week information.
TimestampValue
IntervalValue

Caching

Caching Storage

ParquetStorage Storage that caches expressions as Parquet files using a modification time strategy.
ParquetSnapshotStorage Storage that caches expressions as Parquet files using a snapshot invalidation strategy.
SourceStorage Storage that caches expressions within the source backend using a modification time strategy.
SourceSnapshotStorage Storage that caches expressions within the source backend using a snapshot strategy.
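
The difference between a modification time strategy and a snapshot strategy can be sketched generically as a difference in how a cache key is derived. The code below is an assumption-laden illustration using only the standard library, not this library's implementation: the functions mtime_key and snapshot_key, the SQL string, and the file name are all hypothetical. It assumes that a modification-time key includes the source file's mtime (so the cache is invalidated when the source changes), while a snapshot key ignores it (so the cached result is kept until removed explicitly).

```python
import hashlib
import os
import tempfile


def mtime_key(expr_sql: str, source_path: str) -> str:
    """Hypothetical key that changes whenever the source file's mtime changes."""
    mtime_ns = os.stat(source_path).st_mtime_ns
    payload = f"{expr_sql}|{source_path}|{mtime_ns}"
    return hashlib.sha256(payload.encode()).hexdigest()


def snapshot_key(expr_sql: str, source_path: str) -> str:
    """Hypothetical key that depends only on the expression and source location."""
    payload = f"{expr_sql}|{source_path}"
    return hashlib.sha256(payload.encode()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.csv")
    with open(path, "w") as fh:
        fh.write("a\n1\n")

    sql = "SELECT a FROM data"  # stand-in for a compiled expression
    m1, s1 = mtime_key(sql, path), snapshot_key(sql, path)

    # Simulate an upstream change by moving the mtime forward one second.
    st = os.stat(path)
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))
    m2, s2 = mtime_key(sql, path), snapshot_key(sql, path)

mtime_key_changed = m1 != m2        # the cached entry would be recomputed
snapshot_key_changed = s1 != s2     # the cached entry would be reused
```

Under these assumptions, the modification-time variant trades extra recomputation for freshness, while the snapshot variant favors reproducibility of a previously computed result.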

Machine Learning Operations

Machine Learning Functions and Helpers

train_test_splits Generates multiple train/test splits of an Ibis table for different test sizes.
Step A single step in a machine learning pipeline that wraps a scikit-learn estimator.
Pipeline A machine learning pipeline that chains multiple processing steps together.

Type System

Data types and schemas

Data types Scalar and column data types
Schemas Table schemas

UDF System

Functions for creating UDFs

make_pandas_udf Create a scalar User-Defined Function (UDF) that operates on pandas DataFrames.
make_pandas_expr_udf Create an expression-based scalar UDF that incorporates pre-computed values.
pyarrow_udwf Create a User-Defined Window Function (UDWF) using PyArrow.
agg.pyarrow Decorator for creating PyArrow-based aggregation functions.
agg.pandas_df Create a pandas DataFrame-based aggregation function.
flight_udxf Create a User-Defined Exchange Function (UDXF) that executes a pandas DataFrame