User Guide
Welcome to xorq!
xorq is a deferred computational framework for building, running, and serving pandas groupby-apply style pipelines common in ML workflows. xorq is built on top of Ibis and Apache DataFusion.
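For readers unfamiliar with the term, a "groupby-apply style" step splits a table into groups, runs a function on each group, and combines the results. The sketch below illustrates that pattern in plain pandas only; it is not xorq's API, and the column names and the `center` helper are hypothetical.

```python
import pandas as pd

# Hypothetical toy data: one row per story, keyed by author.
df = pd.DataFrame(
    {
        "author": ["a", "a", "b", "b", "b"],
        "score": [10, 30, 5, 15, 40],
    }
)


def center(scores: pd.Series) -> pd.Series:
    """Subtract the group's mean from every score in the group."""
    return scores - scores.mean()


# Split by author, apply `center` to each group's scores, then align the
# results back onto the original rows by index.
df["centered"] = df.groupby("author", group_keys=False)["score"].apply(center)
```

In pandas this runs eagerly, as soon as the line executes. A deferred framework would instead record the step and run it only when the result is requested.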
Getting Started
Dive Deeper
A multi-part series on building an end-to-end ML pipeline using live data from the HackerNews API.
1. Data prep with LLM-Assisted Labeling
Learn to use UDXFs for preparing and labeling data with OpenAI.
2. Feature engineering with TF-IDF
Learn how to build fit-transform style deferred pipelines.
3. Model training with XGBoost
Learn how to build fit-predict style deferred pipelines.
4. Serving a trained model
Learn how to serve fit-transform trained models.
Why xorq?
xorq was developed to give Python developers a more ergonomic way to build, cache, and serve pipelines without getting locked into a single engine. The xorq computational framework aims to improve ML development by:
- Simplifying development - no more juggling separate SQL jobs, pandas scripts, and ML framework-specific transformations.
- Accelerating iteration - intelligent caching means no more waiting for full pipeline re-runs after every small change.
- Making deployment seamless - moving a working pipeline from local dev to production no longer requires rewriting.