Catalog
Catalog()
A git-backed registry for versioned build artifacts.
A catalog is a git repository containing serialized xorq expressions as content-addressed zip archives. When backed by git-annex, cloning downloads only metadata, and artifact content is fetched on demand. A plain-git backend stores archives as regular blobs.
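To illustrate what "content-addressed" means for an archive, here is a minimal sketch using only the standard library. It is not xorq's implementation: the choice of SHA-256, the `<digest>.zip` naming scheme, and the helper names `store_archive` and `make_zip` are all assumptions made for illustration.

```python
import hashlib
import io
import tempfile
import zipfile
from pathlib import Path
from typing import Dict


def make_zip(files: Dict[str, str]) -> bytes:
    """Build a zip in memory with fixed timestamps so equal inputs give equal bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in sorted(files):
            info = zipfile.ZipInfo(name, date_time=(1980, 1, 1, 0, 0, 0))
            zf.writestr(info, files[name])
    return buf.getvalue()


def store_archive(payload: bytes, store_dir: Path) -> Path:
    """Write payload under a name derived from its own hash (hypothetical scheme)."""
    digest = hashlib.sha256(payload).hexdigest()
    target = store_dir / f"{digest}.zip"
    if not target.exists():  # identical content always maps to the same file
        target.write_bytes(payload)
    return target


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    a = store_archive(make_zip({"expr.yaml": "select 1"}), store)
    b = store_archive(make_zip({"expr.yaml": "select 1"}), store)
    c = store_archive(make_zip({"expr.yaml": "select 2"}), store)
    same_content_same_path = a == b
    different_content_different_path = a != c
```

Because the name is a function of the bytes, storing the same build twice is a no-op, and two different builds can never collide on a name.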
Construct via the classmethods from_name, from_repo_path, from_default, clone_from, or the dispatch helper from_kwargs.
Attributes
| Name | Description |
|---|---|
| remote_config | The resolved remote config, or None. |
Methods
| Name | Description |
|---|---|
| add | Add a build to the catalog. |
| add_alias | Create an alias pointing at entry name. Overwrites if the alias already exists. |
| assert_consistency | Verify that catalog.yaml, entries, metadata, and aliases are all in agreement. |
| bind | Bind a source entry through one or more transform entries. |
| clone_from | Clone a catalog repo and optionally init git-annex. |
| contains | Return True if an entry with name exists in the catalog. |
| embed_readonly | Embed read-only credentials into the git-annex branch. |
| fetch | Fetch from all git remotes (excludes annex-only special remotes). |
| fetch_entries | Fetch annex content for the given entries in a single operation. |
| get_catalog_entry | Look up a CatalogEntry by name. Raises if not found. |
| get_zip | Export an entry’s archive to dir_path (default: cwd). Returns the output path. |
| list | Return the list of entry names in the catalog. |
| list_aliases | Return the list of alias names in the catalog. |
| load | Return a tagged RemoteTable expression for a catalog entry (by hash or alias). |
| pull | Pull from all git remotes. |
| push | Push to all git remotes after verifying consistency. |
| remove | Remove an entry (and its aliases) from the catalog by name. |
| set_remote_config | Update the git-annex special remote configuration. |
| sync | Pull then push — shorthand for a full round-trip synchronization. |
add
add(obj, sync=True, aliases=(), exist_ok=False, project_path=None)
Add a build to the catalog.
obj may be a Path to a zip archive, a Path to a build directory, or an xorq Expr. Returns the created CatalogEntry.
project_path is the directory containing the pyproject.toml used to build the wheel and requirements sidecars. If omitted, the packager walks upward from the current working directory to find one. Passing it explicitly is required when the caller’s cwd is not inside the project (e.g. Jupyter kernels started from /tmp). Ignored for zip inputs, which are already complete build archives.
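The upward search for pyproject.toml can be pictured with the sketch below. It is only an approximation of the behaviour described above: the function name `find_project_root` is hypothetical, and the real packager may apply additional rules.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_project_root(start: Path) -> Optional[Path]:
    """Return the nearest ancestor of start (inclusive) that holds a pyproject.toml."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / "pyproject.toml").is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    expected_root = base / "project"
    nested = expected_root / "src" / "pkg"
    nested.mkdir(parents=True)
    (expected_root / "pyproject.toml").write_text("[project]\nname = 'demo'\n")

    outside = base / "elsewhere"  # stands in for a cwd outside the project
    outside.mkdir()

    found_from_nested = find_project_root(nested)
    found_from_outside = find_project_root(outside)
```

Starting from `outside`, the walk never reaches the project directory, which is the situation where passing project_path explicitly is needed.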
add_alias
add_alias(name, alias, sync=True)
Create an alias pointing at entry name. Overwrites if the alias already exists.
assert_consistency
assert_consistency()
Verify that catalog.yaml, entries, metadata, and aliases are all in agreement.
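This reference does not list the exact checks, so the following is a sketch of what "in agreement" could mean, under stated assumptions: the four inputs (names listed in catalog.yaml, archive names, metadata names, and an alias-to-entry mapping) and the function name `consistency_problems` are hypothetical.

```python
from typing import Dict, List, Set


def consistency_problems(
    listed: Set[str],
    archives: Set[str],
    metadata: Set[str],
    aliases: Dict[str, str],
) -> List[str]:
    """Collect human-readable mismatches between the four views of a catalog."""
    problems = []
    for name in sorted(listed - archives):
        problems.append(f"{name}: listed but archive missing")
    for name in sorted(archives - listed):
        problems.append(f"{name}: archive present but not listed")
    for name in sorted(listed - metadata):
        problems.append(f"{name}: metadata missing")
    for alias, target in sorted(aliases.items()):
        if target not in listed:
            problems.append(f"alias {alias}: points at unknown entry {target}")
    return problems


clean = consistency_problems({"abc"}, {"abc"}, {"abc"}, {"latest": "abc"})
broken = consistency_problems({"abc"}, {"abc", "def"}, set(), {"latest": "zzz"})
```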
bind
bind(source_entry, *transforms, con=None)
Bind a source entry through one or more transform entries.
clone_from
clone_from(
url,
repo_path=None,
check_consistency=True,
annex=None,
git_config=None,
**remote_kwargs,
)
Clone a catalog repo and optionally init git-annex.
annex controls the backend:
- None (default): auto-detect. If the cloned repo has a git-annex branch, git-annex is initialised and the remote is enabled when credentials are available (embedded, env vars, or remote_kwargs). Otherwise falls back to plain git.
- False: force plain git, even if the repo has a git-annex branch.
- Any AnnexConfig instance: git-annex is initialised and the remote is enabled if remote.log has a special remote configured.
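The three cases for annex can be read as a small dispatch. The sketch below is illustrative only: `AnnexConfigStub` stands in for a real AnnexConfig instance, `choose_backend` is a hypothetical helper, and rejecting any other value with TypeError is an assumption, not documented behaviour.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class AnnexConfigStub:
    """Stand-in for an AnnexConfig instance; the real class is not reproduced here."""

    remote_name: str = "example-remote"


def choose_backend(
    annex: Union[None, bool, AnnexConfigStub],
    has_annex_branch: bool,
) -> str:
    """Return "annex" or "git" following the three documented cases."""
    if annex is None:
        # Auto-detect: use git-annex only when the clone has a git-annex branch.
        return "annex" if has_annex_branch else "git"
    if annex is False:
        # Explicit opt-out wins even when a git-annex branch exists.
        return "git"
    if isinstance(annex, AnnexConfigStub):
        return "annex"
    raise TypeError(f"unsupported annex value: {annex!r}")  # assumption


auto_with_branch = choose_backend(None, has_annex_branch=True)
auto_without_branch = choose_backend(None, has_annex_branch=False)
forced_plain = choose_backend(False, has_annex_branch=True)
explicit_config = choose_backend(AnnexConfigStub(), has_annex_branch=False)
```

The sketch only picks a backend; whether the special remote is then enabled still depends on credentials and on remote.log, as described in the list.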
Content is not fetched eagerly; it is retrieved on demand when entry.expr is accessed (via fetch_content). For S3 remotes without embedded credentials, the caller can supply credentials via remote_kwargs or environment variables (XORQ_CATALOG_S3_*).
Use git_config to set repo-local git config before annex init (e.g. {"annex.security.allowed-ip-addresses": "all"}).
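How clone_from applies git_config internally is not shown in this reference. As a rough mental model, each key/value pair corresponds to one `git config --local` invocation in the cloned repository; the helper name `git_config_commands` below is hypothetical.

```python
from typing import Dict, List


def git_config_commands(git_config: Dict[str, str]) -> List[List[str]]:
    """Translate a mapping into `git config --local <key> <value>` argument lists."""
    return [
        ["git", "config", "--local", key, str(value)]
        for key, value in git_config.items()
    ]


commands = git_config_commands({"annex.security.allowed-ip-addresses": "all"})
```

The argument lists are built but not executed here; they could be passed to `subprocess.run(cmd, cwd=repo_path, check=True)` to apply them to a repository by hand.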
contains
contains(name)
Return True if an entry with name exists in the catalog.
embed_readonly
embed_readonly(readonly_config)
Embed read-only credentials into the git-annex branch.
Verifies that readonly_config cannot write to the bucket, then sets embedcreds=yes and writes the config to remote.log.
Raises ValueError if the credentials have write access.
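The guard described above (refuse to embed anything that can write) follows a check-then-embed pattern, sketched here under assumptions: `ensure_readonly` and `fake_probe` are hypothetical names, and the probe is a stub, whereas a real check would attempt a write against the bucket.

```python
from typing import Callable, Dict, Mapping


def ensure_readonly(
    config: Mapping[str, str],
    can_write: Callable[[Mapping[str, str]], bool],
) -> Dict[str, str]:
    """Reject credentials that can write; otherwise return the config with embedcreds=yes."""
    if can_write(config):
        raise ValueError("credentials have write access; refusing to embed them")
    return {**config, "embedcreds": "yes"}


def fake_probe(config: Mapping[str, str]) -> bool:
    # Hypothetical stand-in for a real write test against the remote.
    return config.get("role") == "writer"


embedded = ensure_readonly({"bucket": "demo", "role": "reader"}, fake_probe)

try:
    ensure_readonly({"bucket": "demo", "role": "writer"}, fake_probe)
    rejected_writer = False
except ValueError:
    rejected_writer = True
```

Checking before embedding matters because embedded credentials become readable by anyone who can clone the repository.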
fetch
fetch()
Fetch from all git remotes (excludes annex-only special remotes).
fetch_entries
fetch_entries(*entries)
Fetch annex content for the given entries in a single operation.
Each element can be a CatalogEntry or a string (entry name). No-op for plain-git backends.
get_catalog_entry
get_catalog_entry(name, maybe_alias=False)
Look up a CatalogEntry by name. Raises if not found.
get_zip
get_zip(name, dir_path=None)
Export an entry’s archive to dir_path (default: cwd). Returns the output path.
list
list()
Return the list of entry names in the catalog.
list_aliases
list_aliases()
Return the list of alias names in the catalog.
load
load(name_or_alias, con=None)
Return a tagged RemoteTable expression for a catalog entry (by hash or alias).
pull
pull()
Pull from all git remotes.
push
push()
Push to all git remotes after verifying consistency.
remove
remove(name, sync=True)
Remove an entry (and its aliases) from the catalog by name.
set_remote_config
set_remote_config(remote_config)
Update the git-annex special remote configuration.
Calls enableremote to write the config to remote.log on the git-annex branch. Use catalog.remote_config to read it back.
sync
sync()
Pull then push; shorthand for a full round-trip synchronization.
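The ordering implied by pull, push, and sync can be sketched with a toy class that only records calls. `SyncSketch` is hypothetical and performs no git operations, and the exception type raised on inconsistency is an assumption.

```python
from typing import List


class SyncSketch:
    """Toy model of the call order; it records calls instead of touching git."""

    def __init__(self, consistent: bool = True) -> None:
        self.consistent = consistent
        self.calls: List[str] = []

    def assert_consistency(self) -> None:
        self.calls.append("assert_consistency")
        if not self.consistent:
            raise AssertionError("catalog is inconsistent")  # assumed exception type

    def pull(self) -> None:
        self.calls.append("pull")

    def push(self) -> None:
        # Push verifies consistency first, as the push description states.
        self.assert_consistency()
        self.calls.append("push")

    def sync(self) -> None:
        self.pull()
        self.push()


good = SyncSketch()
good.sync()

bad = SyncSketch(consistent=False)
try:
    bad.sync()
    push_blocked = False
except AssertionError:
    push_blocked = True
```

In this model an inconsistent catalog is still pulled but never pushed, so a local mistake does not propagate to the remotes.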