enableTablePersistence

Syntax

enableTablePersistence(table, [asynWrite=true], [compress=true], [cacheSize=-1], [retentionMinutes=1440], [flushMode=0])

Arguments

table is an empty shared stream table.

asynWrite (optional) is a Boolean value indicating whether persistence is enabled in asynchronous mode. The default value is true, meaning asynchronous persistence is enabled. In this case, once data is written into memory, the write is deemed complete. The data stored in memory is then persisted to disk by another thread.

compress (optional) is a Boolean value indicating whether the table is compressed when it is saved to disk. The default value is true.

cacheSize (optional) is a LONG integer specifying the maximum number of rows of the table to keep in memory. If it is unspecified (the default value is -1) or set to 0, all rows are kept in memory. Otherwise, the minimum valid value is 100,000.

retentionMinutes (optional) is an integer indicating how long (in minutes) a log file is kept after its last update. The default value is 1440, i.e., 1 day.

flushMode (optional) is an integer indicating whether to enable synchronous disk flush. It can be 0 or 1. The persistence process first writes data from memory to the page cache, then flushes the cached data to disk. If flushMode is 0 (default), asynchronous disk flushing is enabled. In this case, once data is written from memory to the page cache, the flush is deemed complete and the next batch of data can be written to the table. If flushMode is set to 1, the current batch of data must be flushed to disk before the next batch can be written.

Details

This command enables a shared stream table to be persisted to disk.

For this command to work, the configuration parameter persistenceDir must be specified. The persistence location of the table is <PERSISTENCE_DIR>/<TABLE_NAME>. The directory contains 2 types of files: data files (named data0.log, data1.log, …) and an index file, index.log. Data that has been persisted to disk is loaded into memory after the system restarts.

The parameter asynWrite determines whether table persistence runs in asynchronous mode. In asynchronous mode, new data is pushed to a queue and a persistence worker (thread) writes it to disk later. In synchronous mode, the table append operation does not return until the new data has been persisted to disk. The default value is true (asynchronous mode). In general, asynchronous mode achieves higher throughput.

In asynchronous mode, each table is persisted by a single persistence worker (thread), and one persistence worker may handle multiple tables. Therefore, if there is only one table to be persisted, increasing the number of persistence workers does not improve performance.

By default, a stream table keeps all data in memory. If the stream table grows too large, the system may run out of memory. To avoid this problem, set a limit with the parameter cacheSize. When the number of rows of the stream table kept in memory reaches this limit, the first half of those rows is cleared from memory.

Note:

  • It is recommended to invoke command fflush to write data in the page cache to disk before you kill a DolphinDB process and restart it.

  • If asynchronous mode is enabled for data persistence (asynWrite=true) or for disk flush (flushMode=0), data may be lost if the server crashes.
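
As a minimal sketch of the trade-off described in the notes above, the following call turns off both asynchronous behaviors, so that an append waits until the data is persisted and each batch is flushed to disk before the next one is written. The table name stSync and the column layout are hypothetical and chosen only for illustration. This configuration is expected to lower throughput in exchange for a smaller window in which data can be lost.

```dolphindb
// Hypothetical table used only for illustration
t = streamTable(100:0, ["time","x"], ["timestamp","int"])
share t as stSync

// asynWrite=false: the append waits until the data is persisted
// flushMode=1: each batch is flushed to disk before the next batch is written
enableTablePersistence(table=stSync, asynWrite=false, flushMode=1)
```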

Examples

$ colName=["time","x"]
$ colType=["timestamp","int"]
$ t = streamTable(100:0, colName, colType);
$ share t as st

$ enableTablePersistence(table=st, cacheSize=1200000)
for(s in 0:200){
    n=10000
    time=2019.01.01T00:00:00.000+s*n+1..n
    x=rand(10.0, n)
    insert into st values(time, x)
}
$ getPersistenceMeta(st);

persistenceDir->/data/ssd/DolphinDBDemo/persistence3/st
retentionMinutes->1440
hashValue->0
asynWrite->true
diskOffset->0
sizeInMemory->800000
compress->1
memoryOffset->1200000
totalSize->2000000
sizeOnDisk->2000000
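
The values of sizeInMemory and memoryOffset above can be reproduced with simple arithmetic. The sketch below does not call any persistence function; it only simulates the row counts with plain variables. It assumes that the limit is checked after each batch of 10,000 rows and that half of the in-memory rows are cleared once the count reaches cacheSize. The exact trigger used by the server may differ.

```dolphindb
cacheSize = 1200000      // limit passed to enableTablePersistence in the example
batch = 10000            // rows inserted per loop iteration
inMemory = 0             // rows currently kept in memory
memoryOffset = 0         // rows already cleared from memory

for(s in 0:200){
    inMemory = inMemory + batch
    if(inMemory >= cacheSize){
        dropped = inMemory / 2          // the first half is cleared (integer division)
        memoryOffset = memoryOffset + dropped
        inMemory = inMemory - dropped
    }
}

print(inMemory)          // 800000, matching sizeInMemory
print(memoryOffset)      // 1200000, matching memoryOffset
```

Under these assumptions the limit is reached twice (after 1,200,000 and 1,800,000 rows in total), 600,000 rows are cleared each time, and the last 200,000 rows bring the in-memory count to 800,000.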

Note that this example shares the stream table before enabling persistence with the command enableTablePersistence. Both operations can be performed in a single step with the command enableTableShareAndPersistence.
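
A minimal sketch of the one-step alternative follows. It assumes that enableTableShareAndPersistence takes the table and the shared table name as its first two parameters and accepts the same optional parameters as enableTablePersistence; check the reference page of that command before relying on this. The shared name st2 is hypothetical.

```dolphindb
// Unshared stream table with the same schema as in the example above
t = streamTable(100:0, ["time","x"], ["timestamp","int"])

// Assumed signature: enableTableShareAndPersistence(table, tableName, ...)
// Shares t under the name st2 and enables persistence in one call
enableTableShareAndPersistence(table=t, tableName="st2", cacheSize=1200000)
```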

Related commands: disableTablePersistence, clearTablePersistence, enableTableShareAndPersistence