repartitionDS

Syntax

repartitionDS(query, [column], [partitionType], [partitionScheme], [local=true])

Arguments

query is metacode of SQL statements or a tuple of metacode of SQL statements.

column is a string indicating a column name in query. repartitionDS delimits the data sources based on this column.

partitionType is the partition type. It can be VALUE or RANGE.

partitionScheme is a vector indicating the partitioning scheme. For details please refer to Distributed Computing.

local is a Boolean value indicating whether to move the data sources to the local node for computing. The default value is true.

Details

Generate a tuple of data sources from a table with a new partitioning design.

If query is metacode of SQL statements, the parameter column must be specified. partitionType and partitionScheme can be unspecified for a partitioned table with a COMPO domain. In this case, the data sources will be determined based on the original partitionType and partitionScheme of column.

If query is a tuple of metacode of SQL statements, column, partitionType and partitionScheme should be left unspecified. The function returns a tuple with the same length as query. Each element of the result is a data source corresponding to a piece of metacode in query.

Examples

$ n=1000000
$ ID=rand(100, n)
$ dates=2017.08.07..2017.08.11
$ date=rand(dates, n)
$ x=rand(10.0, n)
$ t=table(ID, date, x)

$ dbDate = database(, VALUE, 2017.08.07..2017.08.11)
$ dbID = database(, RANGE, 0 50 100)
$ db = database("dfs://compoDB", COMPO, [dbDate, dbID])
$ pt = db.createPartitionedTable(t, `pt, `date`ID)
$ pt.append!(t);

Example 1. query is metacode of SQL statements. partitionType and partitionScheme are specified.

$ repartitionDS(<select * from pt>,`date,RANGE,2017.08.07 2017.08.09 2017.08.11);

[DataSource< select [4] * from pt where date >= 2017.08.07,date < 2017.08.09 >,DataSource< select [4] * from pt where date >= 2017.08.09,date < 2017.08.11 >]
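As the output shows, each RANGE bucket is closed on the left and open on the right, so with this scheme rows dated 2017.08.11 fall outside both data sources. A minimal sketch for checking how many data sources were produced, assuming the table pt built above (ds is an illustrative variable name, not part of the original example):

$ ds = repartitionDS(<select * from pt>,`date,RANGE,2017.08.07 2017.08.09 2017.08.11)
$ size(ds);

2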

Example 2. query is metacode of SQL statements. partitionType and partitionScheme are unspecified.

$ repartitionDS(<select * from pt>,`ID);

[DataSource< select [4] * from pt [partition = */0_50] >,DataSource< select [4] * from pt [partition = */50_100] >]

Example 3. query is a tuple of metacode of SQL statements.

$ repartitionDS([<select * from pt where id between 0:50>,<select * from pt where id between 51:100>]);

[DataSource< select [4] * from pt where id between 0 : 50 >,DataSource< select [4] * from pt where id between 51 : 100 >]
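The data sources returned by repartitionDS can then be passed to a distributed computing function. The following is a minimal sketch, assuming the built-in mr function and the table pt from above; ds and sumX are illustrative names that do not appear in the original examples. Each data source is mapped to the sum of its x column, and the partial sums are combined with add.

$ ds = repartitionDS(<select * from pt>,`ID)
$ def sumX(t): sum(t.x)
$ mr(ds, sumX, add);

Because the two data sources together cover the whole ID range of pt, the result should agree with select sum(x) from pt, up to floating-point rounding.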