>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
NumericColumn
NumericColumn(arg)
Methods
Name | Description |
---|---|
approx_quantile | Compute one or more approximate quantiles of a column. |
bucket | Compute a discrete binning of a numeric array. |
corr | Return the correlation of two numeric columns. |
cov | Return the covariance of two numeric columns. |
cummean | Return the cumulative mean of the input. |
cumsum | Return the cumulative sum of the input. |
histogram | Compute a histogram with fixed width bins. |
mean | Return the mean of a numeric column. |
std | Return the standard deviation of a numeric column. |
sum | Return the sum of a numeric column. |
var | Return the variance of a numeric column. |
approx_quantile
approx_quantile(quantile, where=None)
Compute one or more approximate quantiles of a column.
The result may or may not be exact
Whether the result is an approximation depends on the backend.
Parameters
Name | Type | Description | Default |
---|---|---|---|
quantile | float | ir.NumericValue | Sequence[ir.NumericValue | float] | 0 <= quantile <= 1, or an array of such values indicating the quantile or quantiles to compute | required |
where | ir.BooleanValue | None | Boolean filter for input values | None |
Returns
Name | Type | Description |
---|---|---|
Scalar | Quantile of the input |
Examples
Compute the approximate 0.50 quantile of bill_depth_mm.
>>> t.bill_depth_mm.approx_quantile(0.50)
┌────────┐
│ 17.318 │
└────────┘
Compute multiple approximate quantiles in one call - in this case the result is an array.
>>> t.bill_depth_mm.approx_quantile([0.25, 0.75])
┌────────────────────────┐
│ [15.565625, 18.671875] │
└────────────────────────┘
bucket
bucket(
    buckets,
    closed='left',
    close_extreme=True,
    include_under=False,
    include_over=False,
)
Compute a discrete binning of a numeric array.
Parameters
Name | Type | Description | Default |
---|---|---|---|
buckets | Sequence[int] | List of buckets | required |
closed | Literal['left', 'right'] | Which side of each interval is closed. For example, with buckets = [0, 100, 200], closed = "left" puts 100 in the 2nd bucket and closed = "right" puts it in the 1st bucket. | 'left' |
close_extreme | bool | Whether the extreme values fall in the last bucket | True |
include_over | bool | Include values greater than the last bucket in the last bucket | False |
include_under | bool | Include values less than the first bucket in the first bucket | False |
Returns
Name | Type | Description |
---|---|---|
IntegerColumn | A categorical column expression |
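The interaction of closed and close_extreme is easier to see in code. The following is a plain-Python sketch of the binning rule described above, not the ibis implementation: bucket_index is a hypothetical helper, the 0-based index and the None result for out-of-range values are assumptions, and so is the treatment of the lowest edge when closed='right'. The include_under and include_over options are left out.

```python
from bisect import bisect_left, bisect_right


def bucket_index(value, buckets, closed="left", close_extreme=True):
    """Return the 0-based bucket index of value, or None if it falls outside."""
    lowest, highest = buckets[0], buckets[-1]
    last = len(buckets) - 2  # index of the final bucket

    if value < lowest or value > highest:
        return None  # include_under / include_over are not modelled here

    if closed == "left":
        # Intervals are [edge_i, edge_i+1); the top edge needs close_extreme.
        if value == highest:
            return last if close_extreme else None
        return bisect_right(buckets, value) - 1

    # closed == "right": intervals are (edge_i, edge_i+1];
    # the bottom edge needs close_extreme (assumed symmetric behavior).
    if value == lowest:
        return 0 if close_extreme else None
    return bisect_left(buckets, value) - 1


edges = [0, 100, 200]
left_result = bucket_index(100, edges, closed="left")    # 2nd bucket
right_result = bucket_index(100, edges, closed="right")  # 1st bucket
```

With edges [0, 100, 200], the value 100 lands at index 1 for closed="left" and index 0 for closed="right", matching the 2nd bucket and 1st bucket wording in the parameter table.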
corr
corr(right, where=None, how='sample')
Return the correlation of two numeric columns.
Parameters
Name | Type | Description | Default |
---|---|---|---|
right | NumericColumn | Numeric column | required |
where | ir.BooleanValue | None | Filter | None |
how | Literal['sample', 'pop'] | Population or sample correlation | 'sample' |
Returns
Name | Type | Description |
---|---|---|
NumericScalar | The correlation of left and right |
cov
cov(right, where=None, how='sample')
Return the covariance of two numeric columns.
Parameters
Name | Type | Description | Default |
---|---|---|---|
right | NumericColumn | Numeric column | required |
where | ir.BooleanValue | None | Filter | None |
how | Literal['sample', 'pop'] | Population or sample covariance | 'sample' |
Returns
Name | Type | Description |
---|---|---|
NumericScalar | The covariance of self and right |
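To make the how option concrete, here is a plain-Python sketch of the textbook sample and population definitions. It is not how ibis computes the result (the expression is compiled for the backend); covariance and correlation are hypothetical helper names and the data is made up for illustration.

```python
from math import sqrt


def covariance(xs, ys, how="sample"):
    """Textbook covariance: divide by n - 1 for 'sample', by n for 'pop'."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = n - 1 if how == "sample" else n
    return total / denominator


def correlation(xs, ys, how="sample"):
    """Pearson correlation built from the covariance above."""
    return covariance(xs, ys, how) / sqrt(
        covariance(xs, xs, how) * covariance(ys, ys, how)
    )


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 9.0]

sample_cov = covariance(xs, ys, how="sample")  # 11.5 / 3
pop_cov = covariance(xs, ys, how="pop")        # 11.5 / 4
```

Under these definitions the n versus n - 1 factor cancels in the correlation, so the two how values give the same correlation up to rounding, while the covariance differs.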
cummean
cummean(where=None, group_by=None, order_by=None)
Return the cumulative mean of the input.
cumsum
cumsum(where=None, group_by=None, order_by=None)
Return the cumulative sum of the input.
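As a rough illustration of what cumulative sum and cumulative mean produce, here is a plain-Python sketch over a small list. It is not ibis code, the values are invented, and it ignores the where, group_by, and order_by options.

```python
from itertools import accumulate

values = [1, 2, 3, 4]

# Running total: each entry is the sum of all values up to and including it.
running_sum = list(accumulate(values))

# Running mean: the running total divided by how many values were seen so far.
running_mean = [total / count for count, total in enumerate(running_sum, start=1)]

print(running_sum)   # [1, 3, 6, 10]
print(running_mean)  # [1.0, 1.5, 2.0, 2.5]
```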
histogram
histogram(nbins=None, binwidth=None, base=None, eps=1e-13)
Compute a histogram with fixed width bins.
Parameters
Name | Type | Description | Default |
---|---|---|---|
nbins | int | None | If supplied, will be used to compute the binwidth | None |
binwidth | float | None | If not supplied, computed from the data (actual max and min values) | None |
base | float | None | The value of the first histogram bin. Defaults to the minimum value of the column. | None |
eps | float | Allowed floating point epsilon for histogram base | 1e-13 |
Returns
Name | Type | Description |
---|---|---|
Column | Bucketed column |
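The relationship between nbins, binwidth, and base can be sketched in plain Python. This is one reading of the parameter descriptions above, not the ibis implementation: histogram_bins is a hypothetical helper, requiring exactly one of nbins or binwidth is an assumption, clamping the maximum into the last bin is an assumption, and eps is omitted.

```python
from math import floor


def histogram_bins(values, nbins=None, binwidth=None, base=None):
    """Assign each value a 0-based fixed-width bin index."""
    if (nbins is None) == (binwidth is None):
        raise ValueError("supply exactly one of nbins or binwidth")
    if base is None:
        base = min(values)  # first bin starts at the column minimum
    if binwidth is None:
        binwidth = (max(values) - base) / nbins  # derived from the data range
    indices = [floor((v - base) / binwidth) for v in values]
    if nbins is not None:
        # Assumption: the maximum value is folded into the last bin.
        indices = [min(i, nbins - 1) for i in indices]
    return indices


data = [1.0, 2.5, 4.0, 9.0]
by_count = histogram_bins(data, nbins=4)       # bin width is (9 - 1) / 4 = 2.0
by_width = histogram_bins(data, binwidth=2.0)  # same width, no clamping
```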
mean
mean(where=None)
Return the mean of a numeric column.
Parameters
Name | Type | Description | Default |
---|---|---|---|
where | ir.BooleanValue | None | Filter | None |
Returns
Name | Type | Description |
---|---|---|
NumericScalar | The mean of the input expression |
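The where filter restricts the aggregate to rows for which the boolean expression is true. A plain-Python sketch of that idea, using invented values and a hypothetical mask (this is not ibis code, and null handling is not modelled):

```python
values = [10, 20, 30, 40]
mask = [True, False, True, False]  # stands in for a boolean column

# Keep only the values whose mask entry is true, then aggregate.
selected = [value for value, keep in zip(values, mask) if keep]

filtered_sum = sum(selected)
filtered_mean = filtered_sum / len(selected)

print(filtered_sum)   # 40
print(filtered_mean)  # 20.0
```

The same reading applies to the where parameter on the other aggregations in this section.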
std
std(where=None, how='sample')
Return the standard deviation of a numeric column.
Parameters
Name | Type | Description | Default |
---|---|---|---|
where | ir.BooleanValue | None | Filter | None |
how | Literal['sample', 'pop'] | Sample or population standard deviation | 'sample' |
Returns
Name | Type | Description |
---|---|---|
NumericScalar | Standard deviation of arg |
sum
sum(where=None)
Return the sum of a numeric column.
Parameters
Name | Type | Description | Default |
---|---|---|---|
where | ir.BooleanValue | None | Filter | None |
Returns
Name | Type | Description |
---|---|---|
NumericScalar | The sum of the input expression |
var
var(where=None, how='sample')
Return the variance of a numeric column.
Parameters
Name | Type | Description | Default |
---|---|---|---|
where | ir.BooleanValue | None | Filter | None |
how | Literal['sample', 'pop'] | Sample or population variance | 'sample' |
Returns
Name | Type | Description |
---|---|---|
NumericScalar | Variance of arg |
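For var and std, the how option selects between the sample and population definitions. Python's standard statistics module exposes both, so it makes a convenient reference point. This is a sketch of the definitions on made-up data, not of how ibis or any backend computes them.

```python
import statistics

values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# 'sample': squared deviations from the mean are divided by n - 1.
sample_var = statistics.variance(values)
sample_std = statistics.stdev(values)

# 'pop': squared deviations from the mean are divided by n.
pop_var = statistics.pvariance(values)
pop_std = statistics.pstdev(values)

print(sample_var, pop_var)
print(sample_std, pop_std)
```

For this data the squared deviations sum to 32, so the population variance is 32 / 8 and the sample variance is 32 / 7. The standard deviation is the square root of the corresponding variance.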