NumericColumn

NumericColumn(arg)

Methods

Name Description
approx_quantile Compute one or more approximate quantiles of a column.
bucket Compute a discrete binning of a numeric array.
corr Return the correlation of two numeric columns.
cov Return the covariance of two numeric columns.
cummean Return the cumulative mean of the input.
cumsum Return the cumulative sum of the input.
histogram Compute a histogram with fixed width bins.
mean Return the mean of a numeric column.
std Return the standard deviation of a numeric column.
sum Return the sum of a numeric column.
var Return the variance of a numeric column.

approx_quantile

approx_quantile(quantile, where=None)

Compute one or more approximate quantiles of a column.

The result may or may not be exact

Whether the result is an approximation depends on the backend.

Parameters

Name Type Description Default
quantile float | ir.NumericValue | Sequence[ir.NumericValue | float] 0 <= quantile <= 1, or an array of such values indicating the quantile or quantiles to compute required
where ir.BooleanValue | None Boolean filter for input values None

Returns

Name Type Description
Scalar Quantile of the input

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()

Compute the approximate 0.50 quantile of bill_depth_mm.

>>> t.bill_depth_mm.approx_quantile(0.50)

┌────────┐
│ 17.318 │
└────────┘

Compute multiple approximate quantiles in one call; in this case the result is an array.

>>> t.bill_depth_mm.approx_quantile([0.25, 0.75])

┌────────────────────────┐
│ [15.565625, 18.671875] │
└────────────────────────┘

bucket

bucket(
    buckets,
    closed='left',
    close_extreme=True,
    include_under=False,
    include_over=False,
)

Compute a discrete binning of a numeric array.

Parameters

Name Type Description Default
buckets Sequence[int] List of buckets required
closed Literal['left', 'right'] Which side of each interval is closed. For example, with buckets = [0, 100, 200]: closed = "left" puts 100 in the 2nd bucket, and closed = "right" puts 100 in the 1st bucket. 'left'
close_extreme bool Whether the extreme values fall in the last bucket True
include_over bool Include values greater than the last bucket in the last bucket False
include_under bool Include values less than the first bucket in the first bucket False

Returns

Name Type Description
IntegerColumn A categorical column expression
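
The closed rule from the parameter table can be illustrated without a backend. The sketch below is plain Python, not the ibis implementation: bucket_index is a hypothetical helper, and it models only closed and close_extreme, with include_under and include_over left at False so that out-of-range values map to None. How close_extreme treats the lowest edge when closed='right' is an assumption.

```python
from bisect import bisect_left, bisect_right


def bucket_index(value, edges, closed="left", close_extreme=True):
    """Return the 0-based bucket for value, or None if it falls outside.

    Hypothetical helper for illustration only; edges must be sorted.
    """
    lo, hi = edges[0], edges[-1]
    last = len(edges) - 2  # index of the final bucket
    if closed == "left":
        # Buckets are [a, b); the top edge belongs to the last bucket
        # only when close_extreme is set.
        if value == hi:
            return last if close_extreme else None
        if value < lo or value > hi:
            return None
        return bisect_right(edges, value) - 1
    if closed == "right":
        # Buckets are (a, b]; the bottom edge belongs to the first bucket
        # only when close_extreme is set (assumed behaviour).
        if value == lo:
            return 0 if close_extreme else None
        if value < lo or value > hi:
            return None
        return bisect_left(edges, value) - 1
    raise ValueError("closed must be 'left' or 'right'")


edges = [0, 100, 200]
print(bucket_index(100, edges, closed="left"))   # 1, the 2nd bucket
print(bucket_index(100, edges, closed="right"))  # 0, the 1st bucket
```

The two printed indices match the example in the closed row: the shared edge 100 moves from the 2nd bucket to the 1st when the closed side flips.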

corr

corr(right, where=None, how='sample')

Return the correlation of two numeric columns.

Parameters

Name Type Description Default
right NumericColumn Numeric column required
where ir.BooleanValue | None Filter None
how Literal['sample', 'pop'] Population or sample correlation 'sample'

Returns

Name Type Description
NumericScalar The correlation of self and right

cov

cov(right, where=None, how='sample')

Return the covariance of two numeric columns.

Parameters

Name Type Description Default
right NumericColumn Numeric column required
where ir.BooleanValue | None Filter None
how Literal['sample', 'pop'] Population or sample covariance 'sample'

Returns

Name Type Description
NumericScalar The covariance of self and right
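
The how parameter selects the divisor. As a minimal sketch, assuming the usual textbook definitions (n - 1 for 'sample', n for 'pop'), the difference looks like this in plain Python. The covariance function is a hypothetical helper written for this illustration, not an ibis call.

```python
def covariance(xs, ys, how="sample"):
    """Covariance of two equal-length sequences (illustrative helper)."""
    if how not in ("sample", "pop"):
        raise ValueError("how must be 'sample' or 'pop'")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean.
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sample covariance divides by n - 1, population covariance by n.
    return total / (n - 1 if how == "sample" else n)


xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
print(covariance(xs, ys, how="pop"))     # 2.5
print(covariance(xs, ys, how="sample"))  # 10 / 3
```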

cummean

cummean(where=None, group_by=None, order_by=None)

Return the cumulative mean of the input.
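
A cumulative mean is the running total divided by the number of rows seen so far. The plain-Python sketch below shows that arithmetic; the running_mean helper is hypothetical and ignores where, group_by, and order_by.

```python
from itertools import accumulate


def running_mean(values):
    """Mean of values[0..i] for every position i (illustrative helper)."""
    running_totals = accumulate(values)
    return [total / count for count, total in enumerate(running_totals, start=1)]


print(running_mean([2, 4, 9]))  # [2.0, 3.0, 5.0]
```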

cumsum

cumsum(where=None, group_by=None, order_by=None)

Return the cumulative sum of the input.
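
The group_by and order_by arguments suggest window-style evaluation: the running sum restarts for each group and follows the given ordering. The sketch below models that reading in plain Python. It is an assumption about the semantics rather than a description of any backend, and grouped_cumsum and the (group, order_key, value) row layout are hypothetical.

```python
from collections import defaultdict


def grouped_cumsum(rows):
    """Running sum of value per group, accumulated in order_key order.

    rows is a list of (group, order_key, value) tuples. The result is
    aligned with the input rows, not with the sorted order.
    """
    result = [None] * len(rows)
    totals = defaultdict(int)
    # Visit rows in order_key order; each group keeps its own total.
    for idx in sorted(range(len(rows)), key=lambda i: rows[i][1]):
        group, _, value = rows[idx]
        totals[group] += value
        result[idx] = totals[group]
    return result


rows = [("a", 2, 20), ("b", 1, 5), ("a", 1, 10), ("b", 2, 7)]
print(grouped_cumsum(rows))  # [30, 5, 10, 12]
```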

histogram

histogram(nbins=None, binwidth=None, base=None, eps=1e-13)

Compute a histogram with fixed width bins.

Parameters

Name Type Description Default
nbins int | None If supplied, will be used to compute the binwidth None
binwidth float | None If not supplied, computed from the data (actual max and min values) None
base float | None The value of the first histogram bin. Defaults to the minimum value of column. None
eps float Allowed floating point epsilon for histogram base 1e-13

Returns

Name Type Description
Column Bucketed column
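
To make the relationship between nbins, binwidth, and base concrete, here is a plain-Python sketch of fixed-width binning. It is not the ibis implementation: histogram_bins is a hypothetical helper, eps is not modelled, and clamping the maximum value into the last bin when nbins is given is an assumption of this sketch.

```python
import math


def histogram_bins(values, nbins=None, binwidth=None, base=None):
    """Return a 0-based fixed-width bin index for each value."""
    if base is None:
        base = min(values)  # first bin starts at the column minimum
    if binwidth is None:
        if nbins is None:
            raise ValueError("supply nbins or binwidth")
        # Derive the width from the observed range of the data.
        binwidth = (max(values) - base) / nbins
    bins = [math.floor((v - base) / binwidth) for v in values]
    if nbins is not None:
        # Assumption: keep the maximum value inside the last bin.
        bins = [min(b, nbins - 1) for b in bins]
    return bins


data = [0, 1, 4, 5, 9, 10]
print(histogram_bins(data, nbins=2))       # [0, 0, 0, 1, 1, 1]
print(histogram_bins(data, binwidth=2.5))  # [0, 0, 1, 2, 3, 4]
```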

mean

mean(where=None)

Return the mean of a numeric column.

Parameters

Name Type Description Default
where ir.BooleanValue | None Filter None

Returns

Name Type Description
NumericScalar The mean of the input expression
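
The where argument acts as a row filter applied before aggregating. A rough plain-Python equivalent, using a hypothetical filtered_mean helper and a list of booleans in place of an ir.BooleanValue, is sketched below. Returning None when no rows pass the filter mirrors SQL's NULL and is an assumption.

```python
def filtered_mean(values, where=None):
    """Mean of the values whose matching where flag is true."""
    if where is None:
        kept = list(values)
    else:
        kept = [v for v, keep in zip(values, where) if keep]
    # No surviving rows: return None rather than dividing by zero.
    return sum(kept) / len(kept) if kept else None


values = [1, 2, 3, 4]
print(filtered_mean(values))                                     # 2.5
print(filtered_mean(values, where=[True, False, True, False]))  # 2.0
```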

std

std(where=None, how='sample')

Return the standard deviation of a numeric column.

Parameters

Name Type Description Default
where ir.BooleanValue | None Filter None
how Literal['sample', 'pop'] Sample or population standard deviation 'sample'

Returns

Name Type Description
NumericScalar Standard deviation of arg
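
For reference, the two how options correspond to the sample and population formulas found in Python's statistics module. The sketch below uses that module rather than ibis, under the assumption that backends follow the same definitions; the std wrapper is hypothetical.

```python
import statistics


def std(values, how="sample"):
    """Standard deviation using the stdlib definitions (illustrative)."""
    if how == "sample":
        return statistics.stdev(values)   # divides by n - 1
    if how == "pop":
        return statistics.pstdev(values)  # divides by n
    raise ValueError("how must be 'sample' or 'pop'")


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(std(data, how="pop"))     # 2.0
print(std(data, how="sample"))  # sqrt(32 / 7), about 2.138
```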

sum

sum(where=None)

Return the sum of a numeric column.

Parameters

Name Type Description Default
where ir.BooleanValue | None Filter None

Returns

Name Type Description
NumericScalar The sum of the input expression

var

var(where=None, how='sample')

Return the variance of a numeric column.

Parameters

Name Type Description Default
where ir.BooleanValue | None Filter None
how Literal['sample', 'pop'] Sample or population variance 'sample'

Returns

Name Type Description
NumericScalar Variance of arg