Core concepts for understanding the multi-engine system
Multi-Engine
xorq’s multi-engine system enables seamless data movement between different query engines, allowing you to leverage the strengths of each engine while maintaining a unified workflow.
The into_backend Operator
The core of xorq’s multi-engine capability is the into_backend operator, which enables:
Transparent data movement between engines
Zero-copy data transfer using Apache Arrow
Automatic optimization of data placement
    import xorq as xo
    from xorq.expr.relations import into_backend

    # Connect to different engines
    pg = xo.postgres.connect_env()
    db = xo.duckdb.connect()

    # Get tables from different sources
    batting = pg.table("batting")

    # Load awards_players into DuckDB
    awards_players = xo.examples.awards_players.fetch(backend=db)

    # Filter data in respective engines
    left = batting.filter(batting.yearID == 2015)
    right = awards_players.filter(awards_players.lgID == "NL").drop("yearID", "lgID")

    # Move right table into postgres for efficient join
    expr = left.join(
        into_backend(right, pg),
        ["playerID"],
        how="semi",
    )[["yearID", "stint"]]

    # Execute the multi-engine query
    result = expr.execute()
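To make the idea behind the example concrete, the following is a minimal conceptual sketch of the same pattern: filter on the engine that owns the data, move the small filtered result to the other engine, then run the semi join there. It is not how xorq implements into_backend. It uses two in-memory SQLite databases from the Python standard library as stand-ins for the two engines and copies plain row tuples rather than Arrow data. The helper move_into, the temporary table name awards_nl, and the sample rows are all hypothetical and exist only for illustration.

```python
import sqlite3

# Two independent in-memory SQLite databases stand in for two separate engines.
engine_a = sqlite3.connect(":memory:")  # plays the role of the Postgres side
engine_b = sqlite3.connect(":memory:")  # plays the role of the DuckDB side

# Made-up sample data (assumption: not the real batting / awards_players tables).
engine_a.execute("CREATE TABLE batting (playerID TEXT, yearID INTEGER, stint INTEGER)")
engine_a.executemany(
    "INSERT INTO batting VALUES (?, ?, ?)",
    [("p1", 2015, 1), ("p2", 2015, 1), ("p3", 2014, 1)],
)
engine_b.execute("CREATE TABLE awards_players (playerID TEXT, yearID INTEGER, lgID TEXT)")
engine_b.executemany(
    "INSERT INTO awards_players VALUES (?, ?, ?)",
    [("p1", 2015, "NL"), ("p2", 2015, "AL"), ("p3", 2014, "NL")],
)


def move_into(rows, target, table_name):
    """Hypothetical helper: load join-key rows into a temp table on the target engine."""
    target.execute(f"CREATE TEMP TABLE {table_name} (playerID TEXT)")
    target.executemany(f"INSERT INTO {table_name} VALUES (?)", rows)
    return table_name


# Step 1: filter on the engine that owns the data, keeping only the join key.
right_rows = engine_b.execute(
    "SELECT playerID FROM awards_players WHERE lgID = 'NL'"
).fetchall()

# Step 2: ship the small filtered result to the other engine.
moved = move_into(right_rows, engine_a, "awards_nl")

# Step 3: run the semi join entirely inside the target engine.
result = engine_a.execute(
    f"""
    SELECT yearID, stint
    FROM batting AS b
    WHERE b.yearID = 2015
      AND EXISTS (SELECT 1 FROM {moved} AS a WHERE a.playerID = b.playerID)
    """
).fetchall()

print(result)  # [(2015, 1)]
```

Filtering before the move keeps the transferred data small, which is the same reason the example above filters awards_players in DuckDB before handing it to into_backend.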
Supported Engines
xorq currently supports:
In-Process Engines
DuckDB
DataFusion
Pandas
Distributed Engines
Trino
Snowflake
BigQuery
Engine Selection Guidelines
Choose engines based on their strengths:
DuckDB: Local processing, AsOf joins, efficient file formats