>>> import ibis
>>> from ibis import _
>>> ibis.options.interactive = True
>>> t = ibis.memtable(
... ["2020-04-15 08:04:00", 1),
... ("2020-04-15 08:06:00", 2),
... ("2020-04-15 08:09:00", 3),
... ("2020-04-15 08:11:00", 4),
... (
... ],=["ts", "val"],
... columns"ts": "timestamp"}) ... ).cast({
TimestampValue
TimestampValue(arg)
Attributes
Name | Description |
---|---|
add | Add an interval to a timestamp. |
radd | Add an interval to a timestamp. |
sub | Subtract a timestamp or an interval from a timestamp. |
Methods
Name | Description |
---|---|
bucket | Truncate the timestamp to buckets of a specified interval. |
date | Return the date component of the expression. |
delta | Compute the number of part s between two timestamps. |
strftime | Format a timestamp according to format_str . |
truncate | Truncate timestamp expression to units of unit . |
bucket
bucket(=None,
interval*,
=None,
years=None,
quarters=None,
months=None,
weeks=None,
days=None,
hours=None,
minutes=None,
seconds=None,
milliseconds=None,
microseconds=None,
nanoseconds=None,
offset )
Truncate the timestamp to buckets of a specified interval.
This is similar to truncate
, but supports truncating to arbitrary intervals rather than a single unit. Buckets are computed as fixed intervals starting from the UNIX epoch. This origin may be offset by specifying offset
.
Parameters
Name | Type | Description | Default |
---|---|---|---|
interval | Any | The bucket width as an interval. Alternatively may be specified via component keyword arguments. | None |
years | int | None | Number of years | None |
quarters | int | None | Number of quarters | None |
months | int | None | Number of months | None |
weeks | int | None | Number of weeks | None |
days | int | None | Number of days | None |
hours | int | None | Number of hours | None |
minutes | int | None | Number of minutes | None |
seconds | int | None | Number of seconds | None |
milliseconds | int | None | Number of milliseconds | None |
microseconds | int | None | Number of microseconds | None |
nanoseconds | int | None | Number of nanoseconds | None |
offset | Any | An interval to use to offset the start of the bucket. | None |
Returns
Name | Type | Description |
---|---|---|
TimestampValue | The start of the bucket as a timestamp. |
Examples
Bucket the data into 5 minute wide buckets:
>>> t.ts.bucket(minutes=5)
┏━━━━━━━━━━━━━━━━━━━━━━━━━┓ ┃ TimestampBucket(ts, 5m) ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━┩ │ timestamp │ ├─────────────────────────┤ │ 2020-04-15 08:00:00 │ │ 2020-04-15 08:05:00 │ │ 2020-04-15 08:05:00 │ │ 2020-04-15 08:10:00 │ └─────────────────────────┘
Bucket the data into 5 minute wide buckets, offset by 2 minutes:
>>> t.ts.bucket(minutes=5, offset=ibis.interval(minutes=2))
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ ┃ TimestampBucket(ts, 5m, 2m) ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │ timestamp │ ├─────────────────────────────┤ │ 2020-04-15 08:02:00 │ │ 2020-04-15 08:02:00 │ │ 2020-04-15 08:07:00 │ │ 2020-04-15 08:07:00 │ └─────────────────────────────┘
One common use of timestamp bucketing is computing statistics per bucket. Here we compute the mean of val
across 5 minute intervals:
>>> mean_by_bucket = (
=5).name("bucket"))
... t.group_by(t.ts.bucket(minutes=_.val.mean())
... .agg(mean"bucket")
... .order_by(
... )>>> mean_by_bucket
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓ ┃ bucket ┃ mean ┃ ┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩ │ timestamp │ float64 │ ├─────────────────────┼─────────┤ │ 2020-04-15 08:00:00 │ 1.0 │ │ 2020-04-15 08:05:00 │ 2.5 │ │ 2020-04-15 08:10:00 │ 4.0 │ └─────────────────────┴─────────┘
date
date()
Return the date component of the expression.
Returns
Name | Type | Description |
---|---|---|
DateValue | The date component of self |
delta
delta(other, part)
Compute the number of part
s between two timestamps.
The second argument is subtracted from the first.
Parameters
Name | Type | Description | Default |
---|---|---|---|
other | datetime.datetime | Value[dt.Timestamp] | A timestamp expression | required |
part | Literal['year', 'quarter', 'month', 'week', 'day', 'hour', 'minute', 'second', 'millisecond', 'microsecond', 'nanosecond'] | Value[dt.String] | The unit of time to compute the difference in | required |
Returns
Name | Type | Description |
---|---|---|
IntegerValue | The number of part s between self and other |
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> start = ibis.time("01:58:00")
>>> end = ibis.time("23:59:59")
>>> end.delta(start, unit="hour")
┌────┐
│ 22 │
└────┘
>>> data = '''tpep_pickup_datetime,tpep_dropoff_datetime
... 2016-02-01T00:23:56,2016-02-01T00:42:28
... 2016-02-01T00:12:14,2016-02-01T00:21:41
... 2016-02-01T00:43:24,2016-02-01T00:46:14
... 2016-02-01T00:55:11,2016-02-01T01:24:34
... 2016-02-01T00:11:13,2016-02-01T00:16:59'''
>>> with open("/tmp/triptimes.csv", "w") as f:
= f.write(data) # nbytes is unused
... nbytes >>> taxi = ibis.read_csv("/tmp/triptimes.csv")
>>> ride_duration = taxi.tpep_dropoff_datetime.delta(
="minute"
... taxi.tpep_pickup_datetime, unit"ride_minutes")
... ).name(>>> ride_duration
┏━━━━━━━━━━━━━━┓ ┃ ride_minutes ┃ ┡━━━━━━━━━━━━━━┩ │ int64 │ ├──────────────┤ │ 19 │ │ 9 │ │ 3 │ │ 29 │ │ 5 │ └──────────────┘
strftime
strftime(format_str)
Format a timestamp according to format_str
.
Format string may depend on the backend, but we try to conform to ANSI strftime
.
Parameters
Name | Type | Description | Default |
---|---|---|---|
format_str | str | strftime format string |
required |
Returns
Name | Type | Description |
---|---|---|
StringValue | Formatted version of arg |
truncate
truncate(unit)
Truncate timestamp expression to units of unit
.
Parameters
Name | Type | Description | Default |
---|---|---|---|
unit | Literal['Y', 'Q', 'M', 'W', 'D', 'h', 'm', 's', 'ms', 'us', 'ns'] | Unit to truncate to | required |
Returns
Name | Type | Description |
---|---|---|
TimestampValue | Truncated timestamp expression |