select

Access data in a table. The select statement always returns a table.

Syntax

  1. Select column(s) from a table

$ select column1, column2, ... from t
$ select * from t //select all columns

where, column1, column2, … are the column names of table t.

  1. Create a new column

$ select expression [as colName] from t

where expression can be an expression on a column of table t, or a constant. When colName is not specified:

  • If expression is an expression, combine the expression and the applied column name as the new column name.

  • If expression is a constant, use the constant value as the new column name. Note: When you select only one constant, it returns a table with one column and one row.

  1. Generate a column with NULL values only

$ select NULL [as colName] from t

The generated column is of VOID type. It is used as a placeholder. It does not support any calculations or operations (including adding, inserting, or modifying). If you append a table containing a NULL column to another table (using append!, insert into or tableInsert), the NULL column will automatically be converted to the type of the column being appended to.

Note: When you select only one NULL value, it returns a table with one column and one row.

The above three usages can be used together, such as select *, expression as newCol1, NULL as newCol2 from t.

If the result returned by a SQL select query is:

  • a table with one column, it is treated as a vector and can be used with the in keyword.

  • a table with one column and one row, it is treated as a scalar and can be used with operators such as gt, ge, lt, le, eq, or ne.

Examples

Create a table t1 with columns timestamp, qty, price and sym.

$ sym = `C`MS`MS`MS`IBM`IBM`C`C`C$SYMBOL
$ price= 49.6 29.46 29.52 30.02 174.97 175.23 50.76 50.32 51.29
$ qty = 2200 1900 2100 3200 6800 5400 1300 2500 8800
$ timestamp = [09:34:07,09:36:42,09:36:51,09:36:59,09:32:47,09:35:26,09:34:16,09:34:26,09:38:12]
$ t1 = table(timestamp, sym, qty, price);
$ t1;

timestamp

sym

qty

price

09:34:07

C

2200

49.6

09:36:42

MS

1900

29.46

09:36:51

MS

2100

29.52

09:36:59

MS

3200

30.02

09:32:47

IBM

6800

174.97

09:35:26

IBM

5400

175.23

09:34:16

C

1300

50.76

09:34:26

C

2500

50.32

09:38:12

C

8800

51.29

select * from t1;

timestamp

sym

qty

price

09:34:07

C

2200

49.6

09:36:42

MS

1900

29.46

09:36:51

MS

2100

29.52

09:36:59

MS

3200

30.02

09:32:47

IBM

6800

174.97

09:35:26

IBM

5400

175.23

09:34:16

C

1300

50.76

09:34:26

C

2500

50.32

09:38:12

C

8800

51.29

The “selected” objects in the select statement don’t have to be table columns. They could be variables or functions. The system will check if the “selected” object exists in the table as a column. If the answer is yes, it is treated as a table column; otherwise it is treated as a variable column or a function.

recordNum=1..9;
select 1..9 as recordNum, sym from t1;

recordNum

sym

1

C

2

MS

3

MS

4

MS

5

IBM

6

IBM

7

C

8

C

9

C

$ select 3 as portfolio, sym from t1;

portfolio

sym

3

C

3

MS

3

MS

3

MS

3

IBM

3

IBM

3

C

3

C

3

C

In the example above, we use a scalar to represent a column with same values.

Use a function in select statement:

def f(a):a+100;
select f(qty) as newQty, sym from t1;

newQty

sym

2300

C

2000

MS

2200

MS

3300

MS

6900

IBM

5500

IBM

1400

C

2600

C

8900

C

select last price from t1 group by sym;

sym

last_price

C

51.29

MS

30.02

IBM

175.23

Generate a column with constants:

t = table(1..5 as id, `a`a`a`b`b as val);
select *, 1 as cst from t

id

val

cst

1

a

1

2

a

1

3

a

1

4

b

1

5

b

1

select *, 1 from t

id

val

1

1

a

1

2

a

1

3

a

1

4

b

1

5

b

1

Combine query results with keyword in:

$ t = table(`APPL`IBM`AMZN`IBM`APPL`AMZN as sym, 10.1 11.2 11.3 12 10.6 10.8 as val)
$ t;

sym

val

APPL

10.1

IBM

11.2

AMZN

11.3

IBM

12

APPL

10.6

AMZN

10.8

$ stk = table(1 2 3 as id, `AAPL`IBM`FB as stock);
$ select count(*) from t where sym in select stock from stk;
2

The following example generates a table containing a NULL column

$ t1 = table(rand(10,3) as x)
$ tmp=select *, NULL from t1
$ typestr(tmp[`NULL])
VOID VECTOR

Then append the table tmp to t2

$ t2 = table(1..6 as x1, 1..6 as y1)
$ t2.append!(tmp)
$ t2

x1  y1
1   1
2   2
3   3
3
4
6

The top clause returns a specified number of records starting from the first record of the table. We can either specify a scalar or a range for the top clause. To be consistent within DolphinDB, the range index in DolphinDB SQL syntax starts from 0 instead of 1, and the ending value is exclusive.

select top 3 * from t1;

timestamp

sym

qty

price

09:34:07

C

2200

49.6

09:36:42

MS

1900

29.46

09:36:51

MS

2100

29.52

select top 2:4 * from t1;

timestamp

sym

qty

price

09:36:51

MS

2100

29.52

09:36:59

MS

3200

30.02

The scalar or range in the top clause must be integer constants. It is not allowed to use variable or expressions.

x=2;
select top x * from t1;
Syntax Error: [line #2] integer constant expected after keyword top
select top (1+2) * from t1;
Syntax Error: [line #1] integer constant expected after keyword top

The top clause cannot be used together with group by clause or pivot by clause, but can be used with context by clause. Please refer to the section about context by.