select
Access data in a table. The select statement always returns a table.
Syntax
Select column(s) from a table
$ select column1, column2, ... from t
$ select * from t //select all columns
where, column1, column2, … are the column names of table t.
Create a new column
$ select expression [as colName] from t
where expression can be an expression on a column of table t, or a constant. When colName is not specified:
If expression is an expression, combine the expression and the applied column name as the new column name.
If expression is a constant, use the constant value as the new column name. Note: When you select only one constant, it returns a table with one column and one row.
Generate a column with NULL values only
$ select NULL [as colName] from t
The generated column is of VOID type. It is used as a placeholder. It does not support any calculations or operations (including adding, inserting, or modifying). If you append a table containing a NULL column to another table (using append!
, insert
into or tableInsert
), the NULL column will automatically be converted to the type of the column being appended to.
Note: When you select
only one NULL value, it returns a table with one column and one row.
The above three usages can be used together, such as select *, expression as newCol1, NULL as newCol2 from t
.
If the result returned by a SQL select query is:
a table with one column, it is treated as a vector and can be used with the
in
keyword.a table with one column and one row, it is treated as a scalar and can be used with operators such as
gt
,ge
,lt
,le
,eq
, orne
.
Examples
Create a table t1 with columns timestamp, qty, price and sym.
$ sym = `C`MS`MS`MS`IBM`IBM`C`C`C$SYMBOL
$ price= 49.6 29.46 29.52 30.02 174.97 175.23 50.76 50.32 51.29
$ qty = 2200 1900 2100 3200 6800 5400 1300 2500 8800
$ timestamp = [09:34:07,09:36:42,09:36:51,09:36:59,09:32:47,09:35:26,09:34:16,09:34:26,09:38:12]
$ t1 = table(timestamp, sym, qty, price);
$ t1;
timestamp |
sym |
qty |
price |
---|---|---|---|
09:34:07 |
C |
2200 |
49.6 |
09:36:42 |
MS |
1900 |
29.46 |
09:36:51 |
MS |
2100 |
29.52 |
09:36:59 |
MS |
3200 |
30.02 |
09:32:47 |
IBM |
6800 |
174.97 |
09:35:26 |
IBM |
5400 |
175.23 |
09:34:16 |
C |
1300 |
50.76 |
09:34:26 |
C |
2500 |
50.32 |
09:38:12 |
C |
8800 |
51.29 |
select * from t1;
timestamp |
sym |
qty |
price |
---|---|---|---|
09:34:07 |
C |
2200 |
49.6 |
09:36:42 |
MS |
1900 |
29.46 |
09:36:51 |
MS |
2100 |
29.52 |
09:36:59 |
MS |
3200 |
30.02 |
09:32:47 |
IBM |
6800 |
174.97 |
09:35:26 |
IBM |
5400 |
175.23 |
09:34:16 |
C |
1300 |
50.76 |
09:34:26 |
C |
2500 |
50.32 |
09:38:12 |
C |
8800 |
51.29 |
The “selected” objects in the select statement don’t have to be table columns. They could be variables or functions. The system will check if the “selected” object exists in the table as a column. If the answer is yes, it is treated as a table column; otherwise it is treated as a variable column or a function.
recordNum=1..9;
select 1..9 as recordNum, sym from t1;
recordNum |
sym |
---|---|
1 |
C |
2 |
MS |
3 |
MS |
4 |
MS |
5 |
IBM |
6 |
IBM |
7 |
C |
8 |
C |
9 |
C |
$ select 3 as portfolio, sym from t1;
portfolio |
sym |
---|---|
3 |
C |
3 |
MS |
3 |
MS |
3 |
MS |
3 |
IBM |
3 |
IBM |
3 |
C |
3 |
C |
3 |
C |
In the example above, we use a scalar to represent a column with same values.
Use a function in select statement:
def f(a):a+100;
select f(qty) as newQty, sym from t1;
newQty |
sym |
---|---|
2300 |
C |
2000 |
MS |
2200 |
MS |
3300 |
MS |
6900 |
IBM |
5500 |
IBM |
1400 |
C |
2600 |
C |
8900 |
C |
select last price from t1 group by sym;
sym |
last_price |
---|---|
C |
51.29 |
MS |
30.02 |
IBM |
175.23 |
Generate a column with constants:
t = table(1..5 as id, `a`a`a`b`b as val);
select *, 1 as cst from t
id |
val |
cst |
---|---|---|
1 |
a |
1 |
2 |
a |
1 |
3 |
a |
1 |
4 |
b |
1 |
5 |
b |
1 |
select *, 1 from t
id |
val |
1 |
---|---|---|
1 |
a |
1 |
2 |
a |
1 |
3 |
a |
1 |
4 |
b |
1 |
5 |
b |
1 |
Combine query results with keyword in
:
$ t = table(`APPL`IBM`AMZN`IBM`APPL`AMZN as sym, 10.1 11.2 11.3 12 10.6 10.8 as val)
$ t;
sym |
val |
---|---|
APPL |
10.1 |
IBM |
11.2 |
AMZN |
11.3 |
IBM |
12 |
APPL |
10.6 |
AMZN |
10.8 |
$ stk = table(1 2 3 as id, `AAPL`IBM`FB as stock);
$ select count(*) from t where sym in select stock from stk;
2
The following example generates a table containing a NULL column
$ t1 = table(rand(10,3) as x)
$ tmp=select *, NULL from t1
$ typestr(tmp[`NULL])
VOID VECTOR
Then append the table tmp to t2
$ t2 = table(1..6 as x1, 1..6 as y1)
$ t2.append!(tmp)
$ t2
x1 y1
1 1
2 2
3 3
3
4
6
The top clause returns a specified number of records starting from the first record of the table. We can either specify a scalar or a range for the top clause. To be consistent within DolphinDB, the range index in DolphinDB SQL syntax starts from 0 instead of 1, and the ending value is exclusive.
select top 3 * from t1;
timestamp |
sym |
qty |
price |
---|---|---|---|
09:34:07 |
C |
2200 |
49.6 |
09:36:42 |
MS |
1900 |
29.46 |
09:36:51 |
MS |
2100 |
29.52 |
select top 2:4 * from t1;
timestamp |
sym |
qty |
price |
---|---|---|---|
09:36:51 |
MS |
2100 |
29.52 |
09:36:59 |
MS |
3200 |
30.02 |
The scalar or range in the top clause must be integer constants. It is not allowed to use variable or expressions.
x=2;
select top x * from t1;
Syntax Error: [line #2] integer constant expected after keyword top
select top (1+2) * from t1;
Syntax Error: [line #1] integer constant expected after keyword top
The top clause cannot be used together with group by clause or pivot by clause, but can be used with context by clause. Please refer to the section about context by.