Chapter 8: SQL Statements
This chapter covers how to use SQL to access, retrieve and manipulate data in DolphinDB. DolphinDB’s SQL syntax is very similar to the standard SQL language in a Relational Database Management System (RDBMS) such as MySQL, Oracle, SQL Server, etc.
Syntax
select [top_clause] column_expressions
from table_name | table_expression
[where filtering_conditions]
[grouping_clause [having_clause] | order_clause]
Common features with standard SQL
Support
select
,insert
,update
anddelete
statement for retrieving, inserting, updating and deleting records in a table, respectively. Since 1.30.17 and 2.00.5, DolphinDB supports the create statement to create a database (table), and the alter statement to add columns to a table.Support the
where
clause.Support
group by
andorder by
clauses.Support table join, including
inner join
,left join
,left semijoin
,full join
.
What are the differences?
In general, standard SQL statements are case insensitive, whereas the SQL statements in DolphinDB are case sensitive. All SQL keywords in DolphinDB must use lower case.
Most functions can be directly called in SQL queries.
Other differences as listed below.
Standard SQL syntax |
DolphinDB syntax |
Explanation |
---|---|---|
where sym=’IBM’ |
where sym=`IBM (“IBM”) or where sym==`IBM (“IBM”) |
To represent string in DolphinDB, we can use a backtick ` before a single word, or a pair of double quotes ” ” or single quotes ‘ ‘ around a single word or multiple words with spaces or symbols as string. |
where sym=’IBM’ and qty>2000 |
where sym==`IBM, qty>2000 where sym==`IBM and qty>2000 where sym=`IBM && qty>2000 |
All SQL keywords such as “select”, “where”, “and”, “or” must use lower case in DolphinDB. “,” is preferred to “and” with multiple “where” conditions. Each condition is tested on the filtered data set with previous conditions. |
where sym=”IBM” or qty>2000 |
where sym==`IBM or qty>2000 where sym==`IBM || qty>2000 |
|
where x not in (1,2) and y not in (22,23) |
where (not x in [1,2]) and (not y in (22,23)) |
DolphinDB does not support NOT IN operator. |
select avg(price), sym from Dataset group by sym |
select avg(price) from Dataset group by sym OR select avg(price), sym from Dataset group by sym |
If a GROUP BY column is not specified in the SELECT clause, it will be automatically included in the result. |
N/A |
context by |
context by is an innovation of DolphinDB. It simplifies processing time-series data within each group. With group by, each group returns a scalar; with context by, each group returns a vector of the same size as the group’s records. |
N/A |
pivot by |
pivot by transforms a vector into a matrix or table. |
N/A |
cgroup by |
Perform cumulative grouping calculations |
N/A |
map |
Execute the SQL statement on each partition separately, then merge the results. |
case …. when …. |
||
select column_name(s) from table1 left join table 2 on table1.column_name=table2.column_name |
select column_name(s) from lj(table1, table2, column_name) |
DolphinDB’s syntax is more succinct. |
LEFT JOIN |
left join, lj, left semijoin, lsj |
left join (lj) and left semijoin (lsj). Left join returns all records from the left table and the matched records from the right table. The result is NULL from the right table if there is no match. If there are more than one matched record in the right table, all the matched records in the right table are returned. Unlike left join, if there are more than one matched record in the right table, left semi join only returns the first record. |
INNER JOIN |
ej, sej, inner join |
ej and sej. Return only the rows that have equivalent values for the matching columns. Unlike ej, sej will sort the table in order by the join column. |
OUTER JOIN |
fj, full join |
Full join. Return all rows from both the left table and the right table, whether they have a matching row or not. |
N/A |
aj |
Asof join. It takes each record in the left table as a reference and checks if there is a match in the right table. If there is no match, the most recent observation will be chosen. If there are more than one match, the last one will be chosen. |
N/A |
wj, pwj |
Window join and prevailing window join. They are a generalization of asof join. For each row in the left table, window join applies aggregate functions on a window of rows in the right table. If the right table doesn’t have a matching value for the window, prevailing window join will fill it with the last value before the window and then apply the aggregate functions. |