Time-Based Moving TopN Functions (tmTopN functions)

DolphinDB provides tmTopN functions to perform calculations on the top N elements in a time-based sliding window.

First Release 1.30.22/2.00.10

Introduction

  • A general template for tmTopN functions is:

$ tmTopN(T, X, S, window, top, [ascending=true], [tiesMethod='latest'])
$ tmTopN(T, X, Y, S, window, top, [ascending=true], [tiesMethod='latest'])

Parameters

T is a non-strictly increasing vector of temporal or integral type.

X (Y) is a numeric vector or matrix.

S is a numeric or temporal vector or matrix that determines the order in which the elements of X are sorted. NULL values in S are ignored.

window is an integer greater than 1, or a value of DURATION type, indicating the size of the sliding window. The window is measured in time, not in number of elements.

top is a positive integer or a floating-point number in (0,1) that specifies how many top-ranked elements of X, after sorting by S, are used in the calculation.

  • If top is an integer, the first top elements are obtained.

  • If top is a floating-point number, it represents the proportion of top-ranked elements. The number of elements selected is max(1, floor(n*top)), where n is the number of elements in the window: the product is rounded down, and at least one element is always selected.
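
The two rules for top can be sketched in Python as a small helper. This is an illustrative reference model, not DolphinDB code: the name top_count and its arguments are hypothetical, and capping an integer top at the number of elements in the window is an assumption based on the statement that windows with fewer elements use all of them.

```python
import math

def top_count(n_in_window, top):
    """Number of elements selected from a window holding n_in_window elements.

    Hypothetical helper; mirrors the documented rule for `top`.
    """
    if isinstance(top, int):
        # Integer top: take the first `top` elements, or all of them
        # if the window holds fewer (assumption, see lead-in).
        return min(top, n_in_window)
    # Fractional top in (0, 1): round the product down, but never below 1.
    return max(1, math.floor(n_in_window * top))

print(top_count(4, 3))     # integer top
print(top_count(4, 0.5))   # half of a 4-element window
print(top_count(3, 0.2))   # product rounds down to 0, so 1 is used
```

With a fractional top, the count changes from window to window because the number of elements in a time-based window is not fixed.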

ascending is a Boolean value indicating whether to sort S in ascending order. It is an optional parameter and the default value is true.

tiesMethod is a string that specifies how to select elements when, after sorting within a sliding window, more elements share the same value of S than there are spots left in the top N. It can be:

  • ‘oldest’: select elements starting from the earliest entry into the window;

  • ‘latest’: select elements starting from the latest entry into the window;

  • ‘all’: select all elements.

List of Functions

Functions that take X, with the template tmTopN(T, X, S, window, top, [ascending=true], [tiesMethod='latest']):

  • tmsumTopN, tmavgTopN, tmstdTopN, tmstdpTopN, tmvarTopN, tmvarpTopN, tmskewTopN, tmkurtosisTopN

Functions that take X and Y, with the template tmTopN(T, X, Y, S, window, top, [ascending=true], [tiesMethod='latest']):

  • tmbetaTopN, tmcorrTopN, tmcovarTopN, tmwsumTopN

Windowing Logic

For the tmTopN functions, window can be of integral or DURATION type. The window size is measured in time. For each element Ti in T, the window covers the left-open, right-closed interval (temporalAdd(Ti, -window), Ti].
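
To make the interval concrete, the following Python sketch lists which elements fall into each window. It is a reference model rather than DolphinDB code: window_indices is a hypothetical name, timestamps are represented as minutes since midnight, and the sketch assumes that only elements up to the current position are considered, which makes no difference when T is strictly increasing.

```python
def window_indices(T, i, window):
    """Indices j <= i whose timestamps lie in (T[i] - window, T[i]].

    Hypothetical helper; T holds plain integers (minutes since midnight).
    """
    lower = T[i] - window  # exclusive left boundary
    return [j for j in range(i + 1) if lower < T[j] <= T[i]]

# 13:30, 13:34, 13:36, 13:37, 13:38 expressed as minutes since midnight
T = [810, 814, 816, 817, 818]
for i in range(len(T)):
    print(i, window_indices(T, i, 4))
```

Because the left boundary is exclusive, the window ending at 13:34 does not include the element at 13:30, and the window ending at 13:38 does not include the element at 13:34.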

After sorting X (or X and Y) by S within a time-based sliding window, the function takes the first top elements for calculation. The sort is stable, and its direction is specified by ascending.

For the first top windows, all elements in the window are used in the calculation, so the figure illustrates the rules starting from the (top + 1)-th window.

The following example illustrates the calculation rules:

$ T=13:30m 13:34m 13:36m 13:37m 13:38m
$ S = 5 8 1 9 7
$ X = 2 1 5 3 4
$ tmsumTopN(T, X, S, window=4, top=3)
[2,1,6,9,12]

[Figure: tmTopN_1.png, illustrating the calculation rules for the example above]

The following examples show how the tiesMethod parameter is used:

$ T=2021.01.03+1..7
$ X = [2, 1, 4, 3, 4, 3, 1]
$ S = [5, 8, 1, 1, 1, 1, 3]
// In the second-to-last window, S contains four elements with value 1
// As tiesMethod is not specified, the default 'latest' is used, meaning the latest 3 occurrences of 1 (corresponding to 3, 4, 3 of X) are selected
$ tmsumTopN(T,X,S,4,3)
[2,3,7,9,11,10,10]
// As tiesMethod is set to 'oldest', the first 3 occurrences of 1 (corresponding to 4, 3, 4 of X) are selected
$ tmsumTopN(T,X,S,4,3,tiesMethod=`oldest)
[2,3,7,9,11,11,10]
// As tiesMethod is set to 'all', all the occurrences of 1 (corresponding to 4, 3, 4, 3 of X) are selected
$ tmsumTopN(T,X,S,4,3,tiesMethod=`all)
[2,3,7,9,11,14,10]
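
The rules above can be cross-checked with a small Python reference model of tmsumTopN. This is a sketch under stated assumptions, not DolphinDB's implementation: the function name tm_sum_top_n is hypothetical, T is given as plain integers (day numbers or minutes), S is assumed to contain no NULL values, and only elements up to the current position are considered for each window.

```python
import math

def tm_sum_top_n(T, X, S, window, top, ascending=True, ties_method="latest"):
    """Reference model of the documented tmsumTopN rules (illustrative only)."""
    sign = 1 if ascending else -1
    out = []
    for i in range(len(T)):
        # Elements whose timestamps lie in (T[i] - window, T[i]]
        idx = [j for j in range(i + 1) if T[i] - window < T[j] <= T[i]]
        # Number of elements to keep: integer top, or a fraction of the window
        n = top if isinstance(top, int) else max(1, math.floor(len(idx) * top))
        # Sort by S; among equal S values, 'latest' prefers the newest entry,
        # while 'oldest' and 'all' keep arrival order.
        if ties_method == "latest":
            ranked = sorted(idx, key=lambda j: (sign * S[j], -j))
        else:
            ranked = sorted(idx, key=lambda j: (sign * S[j], j))
        chosen = ranked[:n]
        if ties_method == "all":
            # Also keep every element tied with the last selected S value
            cutoff = sign * S[chosen[-1]]
            chosen = [j for j in ranked if sign * S[j] <= cutoff]
        out.append(sum(X[j] for j in chosen))
    return out

T = [1, 2, 3, 4, 5, 6, 7]          # consecutive days
X = [2, 1, 4, 3, 4, 3, 1]
S = [5, 8, 1, 1, 1, 1, 3]
print(tm_sum_top_n(T, X, S, 4, 3))                        # [2, 3, 7, 9, 11, 10, 10]
print(tm_sum_top_n(T, X, S, 4, 3, ties_method="oldest"))  # [2, 3, 7, 9, 11, 11, 10]
print(tm_sum_top_n(T, X, S, 4, 3, ties_method="all"))     # [2, 3, 7, 9, 11, 14, 10]
```

The three results differ only in the second-to-last window, where four elements share the S value 1 and only three spots are available.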