sql

Syntax

sql(select, from, [where], [groupBy], [groupFlag], [csort], [ascSort], [having], [orderBy], [ascOrder], [limit], [hint], [exec=false])

Arguments

select metacode indicating the columns to be selected. Each column is generated by either function sqlCol or sqlColAlias . Use a tuple to select multiple columns.

from a table object or table name.

where the “where” conditions. In case of multiple “where” conditions, use an ANY vector with each element corresponding to the metacode of a condition.

groupBy “group by” or “context by” column(s). In case of multiple “group by” columns, use an ANY vector with each element corresponding to the metacode of a column name.

groupFlag 1 means “group by”; 0 means “context by”; 2 means “pivot by”. The default value is 1.

csort “csort” column(s) to sort the rows within each “context by” group before conducting calculations in “select” statements. This parameter can only be specified for “context by” clause(when groupFlag=0). In case of multiple “csort” columns, use an ANY vector with each element corresponding to the metacode of a column name.

ascSort a scalar or vector indicating whether each “csort” column is sorted in ascending or descending order. 1 means sorting in ascending order; 0 means sorting in descending order. The default value is 1.

having the “having” conditions. In case of multiple “having” conditions, use a tuple with each element corresponding to the metacode of a condition.

orderBy “order by” column(s). In case of multiple “order by” columns, use a tuple with each element corresponding to the metacode of a column name.

ascOrder a scalar or vector indicating whether each “order by” column is sorted in ascending or descending order. 1 means sorting in ascending order; 0 means sorting in descending order. The default value is 1.

limit an integer indicating the number of rows to select from the result starting from the first row. If groupBy is specified and groupFlag=0, select limit rows from each group starting from the first row in each group. It corresponds to “top” clause in “select” statements.

hint a constant that can take the following values:

  • HINT_HASH: use Hashing algorithm to execute “group by” statements.

  • HINT_SNAPSHOT: query data from snapshot engine.

  • HINT_KEEPORDER: the records in the result after executing “context by” statements are in the same order as in the input data.

exec indicates whether to use the exec clause. The default value is false. If set to be true, a scalar or a vector will be generated. If the “pivot by” is used in the exec clause, a matrix can be generated.

Details

Create a SQL statement dynamically. To execute the generated SQL statement, use function eval.

Examples

$ symbol = take(`GE,6) join take(`MSFT,6) join take(`F,6)
$ date=take(take(2017.01.03,2) join take(2017.01.04,4), 18)
$ price=31.82 31.69 31.92 31.8  31.75 31.76 63.12 62.58 63.12 62.77 61.86 62.3 12.46 12.59 13.24 13.41 13.36 13.17
$ volume=2300 3500 3700 2100 1200 4600 1800 3800 6400 4200 2300 6800 4200 5600 8900 2300 6300 9600
$ t1 = table(symbol, date, price, volume);

$ t1;

symbol

date

price volume

GE

2017.01.03

31.82 2300

GE

2017.01.03

31.69 3500

GE

2017.01.04

31.92 3700

GE

2017.01.04

31.8 2100

GE

2017.01.04

31.75 1200

GE

2017.01.04

31.76 4600

MSFT

2017.01.03

63.12 1800

MSFT

2017.01.03

62.58 3800

MSFT

2017.01.04

63.12 6400

MSFT

2017.01.04

62.77 4200

MSFT

2017.01.04

61.86 2300

MSFT

2017.01.04

62.3 6800

F

2017.01.03

12.46 4200

F

2017.01.03

12.59 5600

F

2017.01.04

13.24 8900

F

2017.01.04

13.41 2300

F

2017.01.04

13.36 6300

F

2017.01.04

13.17 9600

$ x=5000
$ whereConditions = [<symbol=`MSFT>,<volume>x>]
$ havingCondition = <sum(volume)>200>;

$ sql(sqlCol("*"), t1);
< select * from t1 >

$ sql(sqlCol("*"), t1, whereConditions);
< select * from t1 where symbol == "MSFT",volume > x >

$ sql(select=sqlColAlias(<avg(price)>), from=t1, where=whereConditions, groupBy=sqlCol(`date));
< select avg(price) as avg_price from t1 where symbol == "MSFT",volume > x group by date >

$ sql(select=sqlColAlias(<avg(price)>), from=t1, groupBy=[sqlCol(`date),sqlCol(`symbol)]);
< select avg(price) as avg_price from t1 group by date,symbol >

$ sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, groupBy=sqlCol(`date`symbol), groupFlag=0);
< select symbol,date,cumsum(volume) as cumVol from t1 context by date,symbol >

$ sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0);
< select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date >

$ sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0, csort=sqlCol(`volume), ascSort=0);
< select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date csort volume desc >

$ sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0, having=havingCondition);
< select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date having sum(volume) > 200 >

$ sql(select=sqlCol("*"), from=t1, where=whereConditions, orderBy=sqlCol(`date), ascOrder=0);
< select * from t1 where symbol == "MSFT",volume > x order by date desc >

$ sql(select=sqlCol("*"), from=t1, limit=1);
< select top 1 * from t1 >

$ sql(select=sqlCol("*"), from=t1, groupBy=sqlCol(`symbol), groupFlag=0, limit=1);
< select top 1 * from t1 context by symbol >

$ sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, groupBy=sqlCol(`date`symbol), groupFlag=0, hint=HINT_KEEPORDER);
< select [128] symbol,date,cumsum(volume) as cumVol from t1 context by date,symbol >

$ whereConditions1 = <symbol=`MSFT or volume>x>
$ sql(select=sqlCol("*"), from=t1, where=whereConditions1, orderBy=sqlCol(`date), ascOrder=0);
< select * from t14059d76a00000000 where symbol == "MSFT" or volume > x order by date desc >

A convenient and flexible way to generate complicated queries dynamically is to define a function that calls function sql.

$ def f1(t, sym, x){
$ whereConditions=[<symbol=sym>,<volume>x>]
$ return sql(sqlCol("*"),t,whereConditions).eval()
$ };


$ f1(t1, `MSFT, 5000);

symbol

date

price volume

MSFT

2017.01.04

63.12 6400

MSFT

2017.01.04

62.3 6800

$ f1(t1, `F, 9000);

symbol

date

price volume

F

2017.01.04

13.17 9600

$ def f2(t, sym, colNames, filterColumn, filterValue){
$  whereConditions=[<symbol=sym>,expr(sqlCol(filterColumn),>,filterValue)]
$     return sql(sqlCol(colNames),t,whereConditions).eval()
$ };
$ f2(t1,`GE, `symbol`date`volume, `volume, 3000);

symbol

date

volume

GE

2017.01.03

3500

GE

2017.01.04

3700

GE

2017.01.04

4600

$ f2(t1,`F, `symbol`date`volume,`price,13.2);

symbol

date

volume

F

2017.01.04

8900

F

2017.01.04

2300

F

2017.01.04

6300

Set the parameter exec=true and use exec clause with the pivot by statement to generate a matrix:

$ date = 2020.09.21 + 0 0 0 0 1 1 1 1
$ sym = `MS`MS`GS`GS`MS`MS`GS`GS$SYMBOL
$ factorNum = 1 2 1 2 1 2 1 2
$ factorValue = 1.2 -3.4 -2.5 6.3 1.1 -3.2 -2.1 5.6
$ t = table(date, sym, factorNum, factorValue);
$ sql(select=sqlCol(`factorValue), from=t, groupBy=[sqlCol(`date), sqlCol(`sym)], groupFlag=2, exec=true)

< exec factorValue from t pivot by date,sym >