# interval

**Syntax**

interval(X, duration, fill, [step], [explicitOffset=false], [closed], [label], [origin])

**Details**

In SQL queries, group data into continuous intervals with the length of *duration*. For intervals without any data, fill the results using the interpolation method specified by *fill*. This function must be used in a SQL *group by* clause.

Note: Temporal data types in “where” conditions are automatically converted.

**Arguments**

X is a vector of integral or temporal type.

duration is of integral or duration type. The following time units are supported (case-sensitive): w, d, H, m, s, ms, us, ns. As time units *M* and *y* are not supported, to group *X* by year or month, convert the data format of *X* with function month or year.

fill indicates how to fill the missing values of the result. It can take the value of “prev”, “post”, “linear”, “null”, a specific numeric value and “none”.

“prev”: the previous value

“post”: the next value

“linear”: linear interpolation. For non-numerical data, linear interpolation cannot be used and the “prev” method will be used instead.

“null”: null value

a specific numeric value.

“none”: do not interpolate

step is of integral or duration type. It’s an optional parameter indicating the step size of the sliding window. It must be a number which can divide the duration. This parameter allows us to specify a sliding interval that is smaller than *duration*. The default value is the same as *duration*, which means the calculation window slides by the length of *duration*.

Note: If step is specified, the following aggregation calculations are not supported: atImax, atImin, difference, imax, imin, lastNot, mode, percentile.

explicitOffset is an optional BOOLEAN. When *explicitOffset* = true, the first interpolation window starts at the starting point specified by the SQL *where* clause. When *explicitOffset* = false (default), the first interpolation window starts at the nearest point that is divisible by step before the starting point specified by the SQL *where* clause.

closed is a string indicating which boundary of the interval is closed. It can be specified as ‘left’ or ‘right’.

label is a string indicating which boundary is used to label the interval with. It can be specified as ‘left’ or ‘right’.

origin is a string or a scalar of the same temporal type as *X*, indicating the timestamp when the intervals start. When origin is a string, it can be specified as:

‘epoch’: 1970-01-01;

‘start’: the first value of the timeseries;

‘start_day’: 00:00 of the first day of the timeseries;

‘end’: the last value of the timeseries;

‘end_day’: 24:00 of the last day of the timeseries.

Note that *origin* is specified only when *explicitOffset* = false.

Note: As of version 1.30.14, the *range* parameter has been removed.

The following figure uses duration = 2 * step as an example to further explain how the interpolation window slides and applies calculations. The windows whose results are NULL are filled using the method specified by *fill*.

The figure below explains how the starting interpolation window is determined based on *explicitOffset*. In this example, t1 is the starting point specified by the where condition, t0 is the nearest point before t1 that is divisible by *step*. When *explicitOffset* = true, the first interpolation window starts at t1. When *explicitOffset* = false, the first interpolation window starts at t0.

**Examples**

Fill with the previous value:

```
$ timestampv = temporalAdd(2012.01.01T00:00:00.000, [0, 1, 2, 4, 8, 9, 10, 11, 15, 20], "s")
$ symbolv = take(`A`B`C, 10)
$ a1v = double(1..10)
$ t = table(timestampv as timestamp, symbolv as symbol, a1v as a1)
$ select max(a1) from t group by interval(timestamp, 3000ms, "prev")
```

interval_timestamp |
max_a1 |
---|---|

2012.01.01T00:00:00.000 |
3 |

2012.01.01T00:00:03.000 |
4 |

2012.01.01T00:00:06.000 |
5 |

2012.01.01T00:00:09.000 |
8 |

2012.01.01T00:00:12.000 |
8 |

2012.01.01T00:00:15.000 |
9 |

2012.01.01T00:00:18.000 |
10 |

fill with a specific number:

```
$ select max(a1) from t group by interval(timestamp, 3000ms, 100)
```

interval_timestamp |
max_a1 |
---|---|

2012.01.01T00:00:00.000 |
3 |

2012.01.01T00:00:03.000 |
4 |

2012.01.01T00:00:06.000 |
5 |

2012.01.01T00:00:09.000 |
8 |

2012.01.01T00:00:12.000 |
100 |

2012.01.01T00:00:15.000 |
9 |

2012.01.01T00:00:18.000 |
10 |

In the following example, to group data by every 2 years, convert *X* with the `year`

function, and specify *duration* as an integer in `interval`

.

```
$ t=table(1 3 NULL NULL 5 as price,2016.10.12T00:00:00.500 join 2017.10.12T00:00:03.000 join 2018.10.12T00:00:03.000 join 2019.10.12T00:00:08.000 join 2020.10.12T00:00:08.000 as time )
$ select max(price) from t group by interval(X=year(time), duration=2, fill="prev")
```

interval |
max_price |
---|---|

2016 |
3 |

2018 |
3 |

2020 |
5 |

The following example demonstrates how the starting point of the first window is determined based on the value of *explicitOffset*.

```
$ symbol = `C`C`C`C`C`C`MS`MS`MS`MS
$ price= 29.6 29.46 29.52 30.02 30.17 30.23 50.76 50.32 51.29 53.23
$ volume = 2200 1900 2100 3200 6800 5400 1300 2500 8800 5800
$ date = [2012.01.01, 2012.01.01, 2012.01.01, 2012.01.03, 2012.01.05, 2012.01.06, 2012.01.02, 2012.01.02, 2012.01.04, 2012.01.06]
$ second = [09:34:07,09:34:42,09:34:51,09:34:59,09:35:47,09:35:26,09:35:36,09:36:26,09:37:12,10:00:00]
$ t = table(date, second, symbol, volume, price)
//When explicitOffset is set to true, the first interpolation window starts at 09:33:50 specified by where clause.
$ select max(volume) as max_volume, avg(price) as avg_price from t where second between 09:33:50:09:35:00 group by symbol, date, interval(X=second, duration=30, fill="prev", explicitOffset=true) as second
```

symbol |
date |
second |
max_volume |
avg_price |
---|---|---|---|---|

C |
2012.01.01 |
09:33:50 |
2200 |
29.6 |

C |
2012.01.01 |
09:34:20 |
1900 |
29.46 |

C |
2012.01.01 |
09:34:50 |
2100 |
29.52 |

C |
2012.01.03 |
09:33:50 |
2100 |
29.52 |

C |
2012.01.03 |
09:34:20 |
2100 |
29.52 |

C |
2012.01.03 |
09:34:50 |
3200 |
30.02 |

When *explicitOffset* is set to false, the starting point of the first window should be the first value before 09:33:50 (the starting point of the time range specified by the *where* condition) that is divisible by *step*. Therefore, the starting point of the first window should be second(09:33:50/30*30), which is 09:33:30.

```
$ select max(volume) as max_volume, avg(price) as avg_price from t where second between 09:33:50:09:35:00 group by symbol, date, interval(X=second, duration=30, fill="prev", explicitOffset=false) as second
```

symbol |
date |
second |
max_volume |
avg_price |
---|---|---|---|---|

C |
2012.01.01 |
09:33:30 |
||

C |
2012.01.01 |
09:34:00 |
2200 |
29.6 |

C |
2012.01.01 |
09:34:30 |
2100 |
29.49 |

C |
2012.01.01 |
09:35:00 |
2100 |
29.49 |

C |
2012.01.03 |
09:33:30 |
2100 |
29.49 |

C |
2012.01.03 |
09:34:00 |
2100 |
29.49 |

C |
2012.01.03 |
09:34:30 |
3200 |
30.02 |

C |
2012.01.03 |
09:35:00 |
3200 |
30.02 |

With explicitOffset set to false, the start time of the first window is 09:33:30, i.e., the nearest point that can be divided by step before the start time (09:33:50) specified in the *where* clause. In the following example, step is not specified, it takes the same value as duration, and the first window starts at: second(09:33:50/30*30).

```
$ select max(volume) as max_volume, avg(price) as avg_price from t where second between 09:33:50:09:35:00 group by symbol, date, interval(X=second, duration=30, fill="prev", explicitOffset=false) as second
```

symbol |
date |
second |
max_volume |
avg_price |
---|---|---|---|---|

C |
2012.01.01 |
09:33:30 |
||

C |
2012.01.01 |
09:34:00 |
2200 |
29.6 |

C |
2012.01.01 |
09:34:30 |
2100 |
29.49 |

C |
2012.01.01 |
09:35:00 |
2100 |
29.49 |

C |
2012.01.03 |
09:33:30 |
2100 |
29.49 |

C |
2012.01.03 |
09:34:00 |
2100 |
29.49 |

C |
2012.01.03 |
09:34:30 |
3200 |
30.02 |

C |
2012.01.03 |
09:35:00 |
3200 |
30.02 |

The following example sets *step* to 20s and *duration* to 60s. The calculation is performed by sliding forward every 20s.

```
$ select max(volume) as max_volume, avg(price) as avg_price from t where second between 09:33:50:09:35:00 group by symbol, date, interval(X=second, duration=60, fill=0, step=20, explicitOffset=false) as second
```

symbol |
date |
second |
max_volume |
avg_price |
---|---|---|---|---|

C |
2012.01.01 |
09:33:40 |
2200 |
29.6 |

C |
2012.01.01 |
09:34:00 |
2200 |
29.5267 |

C |
2012.01.01 |
09:34:20 |
2100 |
29.49 |

C |
2012.01.01 |
09:34:40 |
2100 |
29.49 |

C |
2012.01.01 |
09:35:00 |
0 |
0 |

C |
2012.01.03 |
09:33:40 |
0 |
0 |

C |
2012.01.03 |
09:34:00 |
3200 |
30.02 |

C |
2012.01.03 |
09:34:20 |
3200 |
30.02 |

C |
2012.01.03 |
09:34:40 |
3200 |
30.02 |

C |
2012.01.03 |
09:35:00 |
0 |
0 |