Catalog

Catalog()

A git-backed registry for versioned build artifacts.

A catalog is a git repository that stores serialized xorq expressions as content-addressed zip archives. When backed by git-annex, cloning downloads only metadata, and artifact content is fetched on demand. With a plain-git backend, archives are stored as regular blobs.

Construct via the classmethods from_name, from_repo_path, from_default, clone_from, or the dispatch helper from_kwargs.

Attributes

Name Description
remote_config The resolved remote config, or None.

Methods

Name Description
add Add a build to the catalog.
add_alias Create an alias pointing at entry name. Overwrites if the alias already exists.
assert_consistency Verify that catalog.yaml, entries, metadata, and aliases are all in agreement.
bind Bind a source entry through one or more transform entries.
clone_from Clone a catalog repo and optionally init git-annex.
contains Return True if an entry with name exists in the catalog.
embed_readonly Embed read-only credentials into the git-annex branch.
fetch Fetch from all git remotes (excludes annex-only special remotes).
fetch_entries Fetch annex content for the given entries in a single operation.
get_catalog_entry Look up a CatalogEntry by name. Raises if not found.
get_zip Export an entry’s archive to dir_path (default: cwd). Returns the output path.
list Return the list of entry names in the catalog.
list_aliases Return the list of alias names in the catalog.
load Return a tagged RemoteTable expression for a catalog entry (by hash or alias).
pull Pull from all git remotes.
push Push to all git remotes after verifying consistency.
remove Remove an entry (and its aliases) from the catalog by name.
set_remote_config Update the git-annex special remote configuration.
sync Pull then push — shorthand for a full round-trip synchronization.

add

add(obj, sync=True, aliases=(), exist_ok=False)

Add a build to the catalog.

obj may be a Path to a zip archive, a Path to a build directory, or an xorq Expr. Returns the created CatalogEntry.
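
As an illustration of the signature above, the following sketch registers a build and attaches aliases in a single call. The helper name register_build is hypothetical, and catalog is assumed to be an already-constructed Catalog instance; only the add parameters documented above are used.

```python
from pathlib import Path


def register_build(catalog, build_path, aliases=()):
    """Add a build directory or zip archive and return the created entry.

    `register_build` is a hypothetical helper; `catalog` is assumed to be
    an existing Catalog instance.
    """
    # add() accepts a Path to a zip archive or to a build directory.
    # sync, aliases, and exist_ok are the keyword parameters listed above,
    # passed here with their documented defaults made explicit.
    return catalog.add(
        Path(build_path),
        sync=True,
        aliases=tuple(aliases),
        exist_ok=False,
    )
```

A call such as register_build(catalog, "builds/my_build", aliases=["latest"]) would return the CatalogEntry created by add; the path and alias here are placeholders.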

add_alias

add_alias(name, alias, sync=True)

Create an alias pointing at entry name. Overwrites if the alias already exists.
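
Because add_alias overwrites an existing alias, a caller who wants to avoid silently repointing one can check list_aliases first. The sketch below is one way to do that; add_alias_if_free is a hypothetical helper, catalog is assumed to be an existing Catalog instance, and the check is not atomic with respect to other writers.

```python
def add_alias_if_free(catalog, name, alias):
    """Create `alias` for entry `name` unless the alias is already taken.

    Returns True if the alias was created, False if it already existed.
    `add_alias_if_free` is a hypothetical helper, not part of Catalog.
    """
    # add_alias() overwrites an existing alias, so look it up first.
    if alias in catalog.list_aliases():
        return False
    catalog.add_alias(name, alias)
    return True
```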

assert_consistency

assert_consistency()

Verify that catalog.yaml, entries, metadata, and aliases are all in agreement.

bind

bind(source_entry, *transforms, con=None)

Bind a source entry through one or more transform entries.

clone_from

clone_from(
    url,
    repo_path=None,
    check_consistency=True,
    annex=None,
    git_config=None,
    **remote_kwargs,
)

Clone a catalog repo and optionally init git-annex.

annex controls the backend:

  • None (default) — auto-detect. If the cloned repo has a git-annex branch, git-annex is initialised and the remote is enabled when credentials are available (embedded, env vars, or remote_kwargs). Otherwise falls back to plain git.
  • False — force plain git, even if the repo has a git-annex branch.
  • Any AnnexConfig instance — git-annex is initialised and the remote is enabled if remote.log has a special remote configured.

Content is not fetched eagerly; it is retrieved on demand when entry.expr is accessed (via fetch_content). For S3 remotes without embedded credentials, the caller can supply credentials via remote_kwargs or environment variables (XORQ_CATALOG_S3_*).

Use git_config to set repo-local git config before annex init (e.g. {"annex.security.allowed-ip-addresses": "all"}).
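
The options above can be assembled as keyword arguments before calling clone_from. The sketch below only builds the argument dictionary, so it runs without xorq installed; build_clone_kwargs is a hypothetical helper, the URL is a placeholder, and the import path for Catalog is not shown because this page does not state it.

```python
def build_clone_kwargs(url, repo_path=None, force_plain_git=False, git_config=None):
    """Assemble keyword arguments for Catalog.clone_from (hypothetical helper)."""
    kwargs = {
        "url": url,
        "repo_path": repo_path,
        "check_consistency": True,
        # None lets clone_from auto-detect the backend; False forces plain git.
        "annex": False if force_plain_git else None,
    }
    if git_config is not None:
        # Repo-local git config applied before annex init.
        kwargs["git_config"] = dict(git_config)
    return kwargs


kwargs = build_clone_kwargs(
    "https://example.com/org/catalog.git",  # placeholder URL
    git_config={"annex.security.allowed-ip-addresses": "all"},
)
# With xorq installed and Catalog imported, the call would then be:
#     catalog = Catalog.clone_from(**kwargs)
```

S3 credentials, when needed, would go in as additional keyword arguments through remote_kwargs; their names are not listed on this page, so none are shown here.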

contains

contains(name)

Return True if an entry with name exists in the catalog.

embed_readonly

embed_readonly(readonly_config)

Embed read-only credentials into the git-annex branch.

Verifies that readonly_config cannot write to the bucket, then sets embedcreds=yes and writes the config to remote.log.

Raises ValueError if the credentials have write access.
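
Since embed_readonly signals over-privileged credentials with ValueError, a caller can treat that case as a recoverable outcome. The following is a minimal sketch; try_embed_readonly is a hypothetical helper, and catalog and readonly_config are assumed to be objects the caller already holds.

```python
def try_embed_readonly(catalog, readonly_config):
    """Embed read-only credentials; return False if they can write.

    `try_embed_readonly` is a hypothetical helper, not part of Catalog.
    """
    try:
        catalog.embed_readonly(readonly_config)
    except ValueError:
        # Documented failure mode: the credentials have write access.
        return False
    return True
```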

fetch

fetch()

Fetch from all git remotes (excludes annex-only special remotes).

fetch_entries

fetch_entries(*entries)

Fetch annex content for the given entries in a single operation.

Each element can be a CatalogEntry or a string (entry name). No-op for plain-git backends.
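
To illustrate the single-operation behaviour, the sketch below collects the names that exist in the catalog and passes them to fetch_entries in one call rather than fetching one entry at a time. prefetch is a hypothetical helper and catalog is assumed to be an existing Catalog instance.

```python
def prefetch(catalog, names):
    """Fetch content for every name in `names` that exists in the catalog.

    Returns the list of names that were requested from fetch_entries.
    `prefetch` is a hypothetical helper, not part of Catalog.
    """
    # Skip names the catalog does not know about.
    present = [name for name in names if catalog.contains(name)]
    if present:
        # One call for all entries; strings (entry names) are accepted.
        catalog.fetch_entries(*present)
    return present
```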

get_catalog_entry

get_catalog_entry(name, maybe_alias=False)

Look up a CatalogEntry by name. Raises if not found.
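
This page does not name the exception raised for a missing entry, so one option is to guard the lookup with contains instead of catching it. A minimal sketch, where find_entry is a hypothetical helper and catalog is assumed to be an existing Catalog instance:

```python
def find_entry(catalog, name):
    """Return the CatalogEntry for `name`, or None if it is not present.

    `find_entry` is a hypothetical helper, not part of Catalog.
    """
    # contains() reports whether an entry with this name exists,
    # which avoids relying on the unspecified exception type.
    if not catalog.contains(name):
        return None
    return catalog.get_catalog_entry(name)
```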

get_zip

get_zip(name, dir_path=None)

Export an entry’s archive to dir_path (default: cwd). Returns the output path.
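
As a usage sketch, the helper below exports an archive into a chosen directory. It creates the directory first, because this page does not say whether get_zip does so itself; export_entry is a hypothetical helper, and passing dir_path as a pathlib.Path is an assumption.

```python
from pathlib import Path


def export_entry(catalog, name, out_dir):
    """Export the archive for `name` into `out_dir` and return the output path.

    `export_entry` is a hypothetical helper, not part of Catalog.
    """
    out_dir = Path(out_dir)
    # Make sure the target directory exists before exporting.
    out_dir.mkdir(parents=True, exist_ok=True)
    return catalog.get_zip(name, dir_path=out_dir)
```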

list

list()

Return the list of entry names in the catalog.

list_aliases

list_aliases()

Return the list of alias names in the catalog.

load

load(name_or_alias, con=None)

Return a tagged RemoteTable expression for a catalog entry (by hash or alias).

pull

pull()

Pull from all git remotes.

push

push()

Push to all git remotes after verifying consistency.

remove

remove(name, sync=True)

Remove an entry (and its aliases) from the catalog by name.

set_remote_config

set_remote_config(remote_config)

Update the git-annex special remote configuration.

Calls enableremote to write the config to remote.log on the git-annex branch. Use catalog.remote_config to read it back.

sync

sync()

Pull then push — shorthand for a full round-trip synchronization.
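
One possible use of sync is to batch several local changes and synchronize once at the end. The sketch below assumes that passing sync=False to add defers synchronization with the remotes, which this page does not state explicitly; add_many_then_sync is a hypothetical helper and catalog is assumed to be an existing Catalog instance.

```python
def add_many_then_sync(catalog, builds):
    """Add several builds locally, then run one pull-then-push round trip.

    `add_many_then_sync` is a hypothetical helper. It assumes sync=False
    on add() skips the per-call synchronization.
    """
    # Add each build without synchronizing after every call (assumption).
    entries = [catalog.add(build, sync=False) for build in builds]
    # A single sync() pulls and then pushes.
    catalog.sync()
    return entries
```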