Python API
Core Operations
APIs for reading and returning data
| read_csv | Lazily load a CSV or set of CSVs. |
| read_parquet | Lazily load a parquet file or set of parquet files. |
| memtable | Construct an ibis table expression from in-memory data. |
| to_sql | Return the formatted SQL string for an expression. |
| execute | Execute an expression against its backend if one exists. |
| to_pyarrow_batches | Execute expression and return a RecordBatchReader. |
| to_pyarrow | Execute expression and return results as a pyarrow table. |
| to_parquet | Write the results of executing the given expression to a parquet file. |
| to_csv | Write the results of executing the given expression to a CSV file. |
| to_json | Write the results of executing the given expression to an NDJSON file. |
Data Operations
| Table | An immutable and lazy dataframe. |
| GroupedTable | An intermediate table expression to hold grouping information. |
| Value | Base class for a data generating expression having a known type. |
| Scalar | |
| Column | |
| NumericColumn | |
| IntegerColumn | |
| FloatingColumn | |
| StringValue | |
| TimeValue | |
| DateValue | |
| DayOfWeek | A namespace of methods for extracting day of week information. |
| TimestampValue | |
| IntervalValue | |
Caching
Caching Storage
| ParquetStorage | Storage that caches expressions as Parquet files using a modification time strategy. |
| ParquetSnapshotStorage | Storage that caches expressions as Parquet files using a snapshot invalidation strategy. |
| SourceStorage | Storage that caches expressions within the source backend using a modification time strategy. |
| SourceSnapshotStorage | Storage that caches expressions within the source backend using a snapshot strategy. |
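The "modification time strategy" named above can be illustrated with a small standard-library sketch. It is not this library's implementation: the helper `mtime_cache_key` and the query string are hypothetical, and the sketch assumes the strategy means that the cache key incorporates the source file's modification time, so a changed source yields a new key and the stale entry is no longer looked up.

```python
import hashlib
import os
import tempfile


def mtime_cache_key(expr_text: str, source_path: str) -> str:
    """Hypothetical helper: hash the query text with the source path and mtime."""
    stat = os.stat(source_path)
    payload = f"{expr_text}|{os.path.abspath(source_path)}|{stat.st_mtime_ns}"
    return hashlib.sha256(payload.encode()).hexdigest()[:16]


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "data.csv")
    with open(source, "w") as f:
        f.write("a,b\n1,2\n")

    # Same query, unchanged source: the key is stable, so a cache hit is possible.
    key_before = mtime_cache_key("SELECT a FROM data", source)
    key_same = mtime_cache_key("SELECT a FROM data", source)

    # Simulate an update to the source by moving its modification time forward.
    stat = os.stat(source)
    os.utime(source, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))

    # The key now differs, so the previously cached result is not reused.
    key_after = mtime_cache_key("SELECT a FROM data", source)
```

In this sketch a cached Parquet file would be stored under a name derived from the key; invalidation then needs no explicit bookkeeping, because a modified source simply maps to a different file name.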
Machine Learning Operations
Machine Learning Functions and Helpers
| train_test_splits | Generates multiple train/test splits of an Ibis table for different test sizes. |
| Step | A single step in a machine learning pipeline that wraps a scikit-learn estimator. |
| Pipeline | A machine learning pipeline that chains multiple processing steps together. |
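One common way to produce train/test splits for several test sizes is to hash a unique row key into a bucket and compare the bucket against a threshold. The sketch below shows that idea in plain Python. It is an assumption about the general technique, not a description of how `train_test_splits` is implemented, and the helper names `bucket` and `split_by_test_size` are hypothetical.

```python
import hashlib


def bucket(key: str, num_buckets: int = 10_000) -> int:
    """Map a row key to a stable bucket in [0, num_buckets)."""
    digest = hashlib.sha256(key.encode()).hexdigest()
    return int(digest, 16) % num_buckets


def split_by_test_size(keys, test_size, num_buckets=10_000):
    """Rows whose bucket falls below the threshold go to test, the rest to train."""
    threshold = int(test_size * num_buckets)
    train, test = [], []
    for key in keys:
        (test if bucket(key, num_buckets) < threshold else train).append(key)
    return train, test


# Hypothetical row keys standing in for a table's unique key column.
keys = [f"row-{i}" for i in range(1000)]

# One (train, test) pair per requested test size.
splits = {size: split_by_test_size(keys, size) for size in (0.1, 0.2, 0.5)}
```

Because assignment depends only on the key, the same row lands in the same split on every run, and a larger test size only ever adds rows to the test set.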
Type System
Data types and schemas
| Data types | Scalar and column data types |
| Schemas | Table schemas |
UDF System
Functions for creating UDFs
| make_pandas_udf | Create a scalar User-Defined Function (UDF) that operates on pandas DataFrames. |
| make_pandas_expr_udf | Create an expression-based scalar UDF that incorporates pre-computed values. |
| pyarrow_udwf | Create a User-Defined Window Function (UDWF) using PyArrow. |
| agg.pyarrow | Decorator for creating PyArrow-based aggregation functions. |
| agg.pandas_df | Create a pandas DataFrame-based aggregation function. |
| flight_udxf | Create a User-Defined Exchange Function (UDXF) that executes a pandas DataFrame |