>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.body_mass_g.approx_median()Column
Column(arg)Methods
| Name | Description |
|---|---|
| approx_median | Return an approximate of the median of self. |
| approx_nunique | Return the approximate number of distinct elements in self. |
| arbitrary | Select an arbitrary value in a column. |
| argmax | Return the value of self that maximizes key. |
| argmin | Return the value of self that minimizes key. |
| as_scalar | Inform ibis that the expression should be treated as a scalar. |
| as_table | Promote the expression to a Table. |
| count | Compute the number of rows in an expression. |
| cume_dist | Return the cumulative distribution over a window. |
| cummax | Return the cumulative max over a window. |
| cummin | Return the cumulative min over a window. |
| dense_rank | Position of first element within each group of equal values. |
| first | Return the first value of a column. |
| lag | Return the row located at offset rows before the current row. |
| last | Return the last value of a column. |
| lead | Return the row located at offset rows after the current row. |
| max | Return the maximum of a column. |
| median | Return the median of the column. |
| min | Return the minimum of a column. |
| mode | Return the mode of a column. |
| nth | Return the nth value (0-indexed) over a window. |
| ntile | Return the integer number of a partitioning of the column values. |
| nunique | Compute the number of distinct rows in an expression. |
| percent_rank | Return the relative rank of the values in the column. |
| preview | Print a subset as a single-column Rich Table. |
| quantile | Return value at the given quantile. |
| rank | Compute position of first element within each equal-value group in sorted order. |
| topk | Return a “top k” expression. |
| value_counts | Compute a frequency table. |
approx_median
approx_median(where=None)Return an approximate of the median of self.
Whether the result is an approximation depends on the backend.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| where | ir.BooleanValue | None | Filter in values when where is True |
None |
Returns
| Name | Type | Description |
|---|---|---|
| Scalar | An approximation of the median of self |
Examples
>>> t.body_mass_g.approx_median(where=t.species == "Chinstrap")approx_nunique
approx_nunique(where=None)Return the approximate number of distinct elements in self.
Whether the result is an approximation depends on the backend.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| where | ir.BooleanValue | None | Filter in values when where is True |
None |
Returns
| Name | Type | Description |
|---|---|---|
| Scalar | An approximate count of the distinct elements of self |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.body_mass_g.approx_nunique()>>> t.body_mass_g.approx_nunique(where=t.species == "Adelie")arbitrary
arbitrary(where=None, how=None)Select an arbitrary value in a column.
Returns an arbitrary (nondeterministic, backend-specific) value from the column. The value will be non-NULL, except if the column is empty or all values are NULL.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| where | ir.BooleanValue | None | A filter expression | None |
| how | Any | DEPRECATED | None |
Returns
| Name | Type | Description |
|---|---|---|
| Scalar | An expression |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": [1, 2, 2], "b": list("aaa"), "c": [4.0, 4.1, 4.2]})
>>> t>>> t.group_by("a").agg(arb=t.b.arbitrary(), c=t.c.sum()).order_by("a")argmax
argmax(key, where=None)Return the value of self that maximizes key.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| key | ir.Value | Key to use for max computation. |
required |
| where | ir.BooleanValue | None | Keep values when where is True |
None |
Returns
| Name | Type | Description |
|---|---|---|
| Scalar | The value of self that maximizes key |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.species.argmax(t.body_mass_g)>>> t.species.argmax(t.body_mass_g, where=t.island == "Dream")argmin
argmin(key, where=None)Return the value of self that minimizes key.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| key | ir.Value | Key to use for min computation. |
required |
| where | ir.BooleanValue | None | Keep values when where is True |
None |
Returns
| Name | Type | Description |
|---|---|---|
| Scalar | The value of self that minimizes key |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.species.argmin(t.body_mass_g)>>> t.species.argmin(t.body_mass_g, where=t.island == "Biscoe")as_scalar
as_scalar()Inform ibis that the expression should be treated as a scalar.
Creates a scalar subquery from the column expression. Since ibis cannot be sure that the column expression contains only one value, the column expression is wrapped in a scalar subquery and treated as a scalar.
Note that the execution of the scalar subquery will fail if the column expression contains more than one value.
Returns
| Name | Type | Description |
|---|---|---|
| Scalar | A scalar subquery |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> heavy_gentoo = t.filter(t.species == "Gentoo", t.body_mass_g > 6200)
>>> from_that_island = t.filter(t.island == heavy_gentoo.island.as_scalar())
>>> from_that_island.species.value_counts().order_by("species")as_table
as_table()Promote the expression to a Table.
Returns
| Name | Type | Description |
|---|---|---|
| Table | A table expression |
Examples
>>> import ibis
>>> t = ibis.table(dict(a="str"), name="t")
>>> expr = t.a.length().name("len").as_table()
>>> expected = t.select(len=t.a.length())
>>> expr.equals(expected)count
count(where=None)Compute the number of rows in an expression.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| where | ir.BooleanValue | None | Filter expression | None |
Returns
| Name | Type | Description |
|---|---|---|
| IntegerScalar | Number of elements in an expression |
cume_dist
cume_dist()Return the cumulative distribution over a window.
cummax
cummax(where=None, group_by=None, order_by=None)Return the cumulative max over a window.
cummin
cummin(where=None, group_by=None, order_by=None)Return the cumulative min over a window.
dense_rank
dense_rank()Position of first element within each group of equal values.
Values are returned in sorted order and duplicate values are ignored.
Equivalent to SQL’s DENSE_RANK().
Returns
| Name | Type | Description |
|---|---|---|
| IntegerColumn | The rank |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"values": [1, 2, 1, 2, 3, 2]})
>>> t.mutate(rank=t.values.dense_rank())first
first(where=None, order_by=None, include_null=False)Return the first value of a column.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| where | ir.BooleanValue | None | An optional filter expression. If provided, only rows where where is True will be included in the aggregate. |
None |
| order_by | Any | An ordering key (or keys) to use to order the rows before aggregating. If not provided, the meaning of first is undefined and will be backend specific. |
None |
| include_null | bool | Whether to include null values when performing this aggregation. Set to True to include nulls in the result. |
False |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"chars": ["a", "b", "c", "d"]})
>>> t>>> t.chars.first()>>> t.chars.first(where=t.chars != "a")lag
lag(offset=None, default=None)Return the row located at offset rows before the current row.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| offset | int | ir.IntegerValue | None | Index of row to select | None |
| default | Value | None | Value used if no row exists at offset |
None |
last
last(where=None, order_by=None, include_null=False)Return the last value of a column.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| where | ir.BooleanValue | None | An optional filter expression. If provided, only rows where where is True will be included in the aggregate. |
None |
| order_by | Any | An ordering key (or keys) to use to order the rows before aggregating. If not provided, the meaning of last is undefined and will be backend specific. |
None |
| include_null | bool | Whether to include null values when performing this aggregation. Set to True to include nulls in the result. |
False |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"chars": ["a", "b", "c", "d"]})
>>> t>>> t.chars.last()>>> t.chars.last(where=t.chars != "d")lead
lead(offset=None, default=None)Return the row located at offset rows after the current row.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| offset | int | ir.IntegerValue | None | Index of row to select | None |
| default | Value | None | Value used if no row exists at offset |
None |
max
max(where=None)Return the maximum of a column.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| where | ir.BooleanValue | None | Filter in values when where is True |
None |
Returns
| Name | Type | Description |
|---|---|---|
| Scalar | The maximum value in self |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.body_mass_g.max()>>> t.body_mass_g.max(where=t.species == "Chinstrap")median
median(where=None)Return the median of the column.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| where | ir.BooleanValue | None | Optional boolean expression. If given, only the values where where evaluates to true will be considered for the median. |
None |
Returns
| Name | Type | Description |
|---|---|---|
| Scalar | Median of the column |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()Compute the median of bill_depth_mm
>>> t.bill_depth_mm.median()>>> t.group_by(t.species).agg(median_bill_depth=t.bill_depth_mm.median()).order_by(
... ibis.desc("median_bill_depth")
... )In addition to numeric types, any orderable non-numeric types such as strings and dates work with median.
>>> t.group_by(t.island).agg(median_species=t.species.median()).order_by(
... ibis.desc("median_species")
... )min
min(where=None)Return the minimum of a column.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| where | ir.BooleanValue | None | Filter in values when where is True |
None |
Returns
| Name | Type | Description |
|---|---|---|
| Scalar | The minimum value in self |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.body_mass_g.min()>>> t.body_mass_g.min(where=t.species == "Adelie")mode
mode(where=None)Return the mode of a column.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| where | ir.BooleanValue | None | Filter in values when where is True |
None |
Returns
| Name | Type | Description |
|---|---|---|
| Scalar | The mode of self |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.body_mass_g.mode()>>> t.body_mass_g.mode(where=(t.species == "Gentoo") & (t.sex == "male"))nth
nth(n)Return the nth value (0-indexed) over a window.
.nth(0) is equivalent to .first(). Negative will result in NULL. If the value of n is greater than the number of rows in the window, NULL will be returned.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| n | int | ir.IntegerValue | Desired rank value | required |
Returns
| Name | Type | Description |
|---|---|---|
| Column | The nth value over a window |
ntile
ntile(buckets)Return the integer number of a partitioning of the column values.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| buckets | int | ir.IntegerValue | Number of buckets to partition into | required |
nunique
nunique(where=None)Compute the number of distinct rows in an expression.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| where | ir.BooleanValue | None | Filter expression | None |
Returns
| Name | Type | Description |
|---|---|---|
| IntegerScalar | Number of distinct elements in an expression |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t.body_mass_g.nunique()>>> t.body_mass_g.nunique(where=t.species == "Adelie")percent_rank
percent_rank()Return the relative rank of the values in the column.
preview
preview(
max_rows=None,
max_length=None,
max_string=None,
max_depth=None,
console_width=None,
)Print a subset as a single-column Rich Table.
This is an explicit version of what you get when you inspect this object in interactive mode, except with this version you can pass formatting options. The options are the same as those exposed in ibis.options.interactive.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| max_rows | int | None | Maximum number of rows to display | None |
| max_length | int | None | Maximum length for pretty-printed arrays and maps. | None |
| max_string | int | None | Maximum length for pretty-printed strings. | None |
| max_depth | int | None | Maximum depth for nested data types. | None |
| console_width | int | float | None | Width of the console in characters. If not specified, the width will be inferred from the console. | None |
Examples
>>> import ibis
>>> t = ibis.examples.penguins.fetch()
>>> t.island.preview(max_rows=3, max_string=5)quantile
quantile(quantile, where=None)Return value at the given quantile.
The output of this method is a continuous quantile if the input is numeric, otherwise the output is a discrete quantile.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| quantile | float | ir.NumericValue | Sequence[ir.NumericValue | float] | 0 <= quantile <= 1, or an array of such values indicating the quantile or quantiles to compute |
required |
| where | ir.BooleanValue | None | Boolean filter for input values | None |
Returns
| Name | Type | Description |
|---|---|---|
| Scalar | Quantile of the input |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()Compute the 99th percentile of bill_depth
>>> t.bill_depth_mm.quantile(0.99)>>> t.group_by(t.species).agg(p99_bill_depth=t.bill_depth_mm.quantile(0.99)).order_by(
... ibis.desc("p99_bill_depth")
... )In addition to numeric types, any orderable non-numeric types such as strings and dates work with quantile.
Let’s compute the 99th percentile of the species column
>>> t.group_by(t.island).agg(p99_species=t.species.quantile(0.99)).order_by(
... ibis.desc("p99_species")
... )rank
rank()Compute position of first element within each equal-value group in sorted order.
Equivalent to SQL’s RANK() window function.
Returns
| Name | Type | Description |
|---|---|---|
| Int64Column | The min rank |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"values": [1, 2, 1, 2, 3, 2]})
>>> t.mutate(rank=t.values.rank())topk
topk(k, by=None, *, name=None)Return a “top k” expression.
Computes a Table containing the top k values by a certain metric (defaults to count).
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| k | int | The number of rows to return. | required |
| by | ir.Value | None | The metric to compute “top” by. Defaults to count. |
None |
| name | str | None | The name to use for the metric column. A suitable name will be automatically generated if not provided. | None |
Returns
| Name | Type | Description |
|---|---|---|
| Table | The top k values. |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.diamonds.fetch()Compute the top 3 diamond colors by frequency:
>>> t.color.topk(3)Compute the top 3 diamond colors by mean price:
>>> t.color.topk(3, by=t.price.mean())Compute the top 2 diamond colors by max carat:
>>> t.color.topk(2, by=t.carat.max(), name="max_carat")value_counts
value_counts(name=None)Compute a frequency table.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | None | The name to use for the frequency column. A suitable name will be automatically generated if not provided. | None |
Returns
| Name | Type | Description |
|---|---|---|
| Table | The frequency table. |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"chars": ["a", "a", "b", "c", "c", "c", "d", "d", "d", "d"]})Compute the count of each unique value in “chars”, ordered by “chars”:
>>> t.chars.value_counts().order_by("chars")Compute the count of each unique value in “chars” as a column named “freq”, ordered by “freq”:
>>> t.chars.value_counts(name="freq").order_by("freq")