delete

Delete existing records in a table.

It can be used not only for an in-memory table, but also for a dfs table(a distributed table or a dimension table). Please note that if the data in a dfs table needs to be deleted, the whole partition which includes the data will be updated after deletion; If the data in a dimension table needs to be updated, the whole table will be updated after deletion. Therefore, the operation to delete a dfs table is only suitable for low-frequency deletion tasks such as minute-level deletion tasks, and not suitable for high-frequency deletion tasks, such as millisecond-level deletion tasks.

The deletion takes a multi-version approach and supports transactions. The system will create a new version to store the new data. Before committing the transaction, other SQL statements still can access the data in the old version. If data in multiple partitions needs to be deleted, as long as the deletion on one partitions fails, the system will roll back all modifications on each partition.

The delete clause supports the map sub-clause in version 1.30.13 and above, which means that the deletion operation will be be executed in each partition separately. If functions such as isDuplicated, first, firstNot, etc. whose results are sensitive to the order of the rows can be used in the where clause of a SQL statement on multiple partitions only if the map clause is used.

Syntax

delete from table_name
    where condition(s);

If where conditions are not used, delete all records from the table.

Examples

t = table(1 1 1 2 2 2 3 3 3 3 as id, 1..10 as x);
t;

id	x
1	1
1	2
1	3
2	4
2	5
2	6
3	7
3	8
3	9
3	10

delete from t where id=1;
t;

id	x
2	4
2	5
2	6
3	7
3	8
3	9
3	10

delete from t where id=3, x>8;
t;

id	x
2	4
2	5
2	6
3	7
3	8

delete from t;
t;

id	x

Example 2. Delete the data in a dfs table

$ login(`admin, `123456)
$ n=1000000
$ ID=rand(10, n)
$ x=rand(1.0, n)
$ t=table(ID, x)
$ db=database("dfs://rangedb124", RANGE,  0 5 10)
$ pt=db.createPartitionedTable(t, `pt, `ID)$ pt.append!(t)
$ select count(*) from pt;1000000
$ delete from pt where ID=5;
$ select count(*) from pt;

Example 3. To delete data in distributed table, and the result of the function in where clause is sensitive to the order of the rows, the map clause must be used which mean the operation will be executed separately in the partition.

$ n=10000$ ID=take( 0..4, n)
$ date=take(2017.08.07..2017.11.11, n)
$ ts=rand(timestamp(1..n),n)
$ int=take(1..50,n)
$ str=rand(string(1..n),n)
$ sym=rand(symbol(string(1..n)),n)
$ x=rand(10.0, n)
$ t=table(ID, date,ts,int,str,sym, x)
$ if(existsDatabase("dfs://compoDB")){
$ dropDatabase("dfs://compoDB")
$ }
$ db = database("dfs://compoDB", VALUE,2017.08M..2017.10M)
$ pt = db.createPartitionedTable(t, `pt, `date)$ pt.append!(t);
$ delete from pt where isDuplicated([ID,int])=false map;