Cluster Mode
=====================================

**Configurations**

In cluster mode, DolphinDB configuration files are saved under the config directory.

* controller.cfg: the configuration file for controller nodes
* agent.cfg: the configuration file for agent nodes
* cluster.cfg: the configuration file for data nodes
* cluster.nodes: network information and node mode of each node in the cluster

Please note that the first line of each configuration file must not be blank.

For a high-availability cluster, the configuration files cluster.cfg and cluster.nodes are managed synchronously by the Raft group. After configuration parameters are modified through the web interface, they are automatically synchronized to the corresponding configuration files on all nodes in the cluster.
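
For reference, the sketch below shows what a cluster.nodes file could look like for a small cluster, assuming the usual localSite,mode header format. The host addresses, ports, and aliases are placeholders for illustration.

.. code:: console

   $ localSite,mode
   $ 192.168.1.10:8900:controller1,controller
   $ 192.168.1.30:8960:agent1,agent
   $ 192.168.1.30:8961:node1,datanode
   $ 192.168.1.30:8962:node2,datanode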

*Configuration Parameters for Controllers*

Configuration parameters for controllers can be specified in controller.cfg, or on the command line when starting a controller node. localSite and mode must be specified.

.. list-table::
   :header-rows: 1

   * - Configuration Parameters
     - Details
   * - localSite
     - IP address, port number and alias of the controller node.
   * - mode=controller
     - Node mode. Must be set to "controller" for a controller.
   * - clusterConfig=cluster.cfg
     - Configuration file for the cluster. The default value is cluster.cfg. It can only be specified on the command line.
   * - nodesFile=cluster.nodes
     - Specifies the IP address, port number, alias and node mode of all nodes in a cluster. It is loaded when the controller is started. The default value is cluster.nodes. It can only be specified on the command line.
   * - dataSync=0
     - Whether database logs are forced to persist to disk before a transaction is committed. If *dataSync* = 1, the database log (including the redo log, undo log and edit log of data nodes, and the edit log of controller nodes) must be written to disk before each transaction is committed, which guarantees that data written to the database is not lost in case of a system crash or power outage. If *dataSync* = 0, the log files are written to cache before a transaction is committed and the operating system writes them to disk at a later time, so data loss or database corruption may occur in case of a system crash or power outage. The default value is 0.
   * - dfsMetaDir=/home/DolphinDB/server
     - The directory to save the metadata of the distributed file system on the controller node. The default value is the DolphinDB home directory specified by the parameter *home*.
   * - PublicName
     - Internet IP or domain name of the controller node. It must be a domain name if enableHttps=true.
   * - datanodeRestartInterval
     - A non-negative integer (int type), in seconds. The default value is 0, indicating that the system does not automatically start data/compute nodes. Configure this parameter to implement the following features: (1) automatically start the data/compute nodes after the controller is started (the agent must be running); (2) automatically restart a data/compute node after it has been offline for longer than the specified value (for a non-graceful shutdown; the agent must be running).
   * - dfsSyncRecoveryChunkRowSize
     - A positive integer with a default value of 1000. By default, data recovery between nodes is asynchronous. When the difference between the number of records in the destination chunk to be recovered and the number of records in the chunk of the latest version is less than *dfsSyncRecoveryChunkRowSize*, synchronous recovery is enabled.
   * - dfsRecoveryConcurrency
     - The number of concurrent recovery tasks in a node recovery. The default value is twice the number of all nodes excluding agent nodes.
   * - dfsRebalanceConcurrency
     - The number of concurrent rebalance tasks in a node rebalance. The default value is twice the number of all nodes excluding agent nodes.
   * - dfsChunkNodeHeartBeatTimeout
     - The time interval (in seconds) for marking a data node as down, i.e., if the controller has not received a heartbeat message from a data node for longer than this interval, the data node is marked and treated as down.
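
To make this concrete, here is a minimal sketch of a controller.cfg using only parameters from the table above. The IP address, port, and alias are placeholders, not defaults; only localSite and mode are strictly required.

.. code:: console

   $ localSite=192.168.1.10:8900:controller1
   $ mode=controller
   $ dataSync=1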

.. dfsMetaLogFilename=DFSMetaLog: The edit log file for metadata of the distributed file system on the controller node. The default value is DFSMetaLog.

The following configuration parameters are for high availability and are specified in controller.cfg.

.. list-table::
   :header-rows: 1

   * - Configuration Parameters
     - Details
   * - dfsHAMode=Raft
     - Whether the controller nodes form a Raft group. It can take the value of either Raft or None. The default value is None.
   * - dfsReplicationFactor=2
     - The number of all replicas for each data chunk. The default value is 2.
   * - dfsReplicaReliabilityLevel=0
     - Whether multiple replicas can reside on the same server. 0: yes; 1: no; 2: replicas are allocated to multiple servers if possible. The default value is 0.
   * - dfsRecoveryWaitTime=0
     - Length of time (in milliseconds) the controller waits after a data node becomes unavailable before restoring the data to other data nodes. The default value is 0, indicating no automatic recovery.
   * - raftElectionTick=800
     - Specifies a time interval (in units of 10 ms): [raftElectionTick, 2*raftElectionTick]. After receiving the last heartbeat from the leader, if a follower does not receive the next heartbeat within a random waiting time in this interval, it sends a request for leader election. The default value is 800 (8 seconds), so the default interval is [8s, 16s]. Note: All controllers of the Raft group must have the same configuration for this parameter.

Note: The number of online data nodes must be no smaller than the value of *dfsReplicationFactor*, otherwise an exception will be thrown.
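
As an illustration (addresses, ports, and aliases are placeholders), each controller in a high-availability setup might add the Raft-related parameters above to its own controller.cfg, keeping raftElectionTick identical across the group:

.. code:: console

   $ localSite=192.168.1.11:8900:controller1
   $ mode=controller
   $ dfsHAMode=Raft
   $ dfsReplicationFactor=2
   $ raftElectionTick=800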

*Configuration Parameters for Agents*

Configuration parameters for agents can be specified in agent.cfg, or on the command line when starting an agent node. The first 3 parameters below (localSite, mode, and controllerSite) must be specified.

.. list-table::
   :header-rows: 1

   * - Configuration Parameters
     - Details
   * - localSite
     - Host address, port number and alias of the local node.
   * - mode=agent
     - Node mode. Must be set to "agent" for an agent.
   * - controllerSite
     - Host address, port number and alias of the controller site of the agent node. When an agent starts up, it uses this information to contact the controller. It must be identical to the localSite of one of the controllers in *controller.cfg*.
   * - lanCluster=1
     - Whether the cluster is within a LAN (local area network). *lanCluster* = 1: use UDP for heartbeats; *lanCluster* = 0: use TCP for heartbeats. Set *lanCluster* = 0 for cloud deployment. The default value is 1.
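
A minimal sketch of an agent.cfg, with placeholder addresses and aliases; note that controllerSite must match the localSite of a controller defined in controller.cfg:

.. code:: console

   $ localSite=192.168.1.30:8960:agent1
   $ mode=agent
   $ controllerSite=192.168.1.10:8900:controller1
   $ lanCluster=1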

*Configuration Parameters for Data Nodes*

All configuration parameters in the standalone mode can be used for data nodes in the cluster mode. They are specified in cluster.cfg. Please refer to :doc:`/DatabaseandDistributedComputing/Configuration/StandaloneMode` for more details. The following table lists the configuration parameters that apply to cluster mode only.

.. list-table::
   :header-rows: 1

   * - Configuration Parameters
     - Details
   * - lanCluster=1
     - Whether the cluster is within a LAN (local area network). *lanCluster* = 1: use UDP for heartbeats; *lanCluster* = 0: use TCP for heartbeats. Set *lanCluster* = 0 for cloud deployment. The default value is 1.
   * - allowVolumeCreation=1
     - Whether a volume can be created automatically if it does not exist. The value can be 0 or 1. The default value 1 indicates that volumes are created automatically. If the parameter is set to 0, the system reports an error if the volume does not exist.
   * - volumes=/hdd/hdd1/volumes,/hdd/hdd2/volumes,/hdd/hdd3/volumes,/hdd/hdd4/volumes
     - The directories of data files. If it is not specified, the default directory is \<*HomeDir*\>/\<*nodeAlias*\>/storage. Note: For an HA cluster, after adding directories in the cluster.cfg files on the machines where the controllers are located, restart all data nodes. Otherwise, if the leader is switched after restarting, the new configuration will be lost. If modified via the web interface, the change is automatically synchronized to all cluster.cfg files and takes effect after restarting the data nodes.
   * - recoveryWorkers=1
     - The number of workers that can be used to recover chunks synchronously in node recovery. The default value is 1.
   * - memLimitOfQueryResult
     - The memory limit for a query result. The default value is min(50% * maxMemSize, 8G), and it must be smaller than 80% * maxMemSize.
   * - memLimitOfTaskGroupResult
     - In the Map phase of MapReduce, a single query is divided into several tasks, among which the remote tasks are sent to remote nodes in batches (task groups). This parameter sets the memory limit of a task group sent from the current node. The default value is min(20% * maxMemSize, 2G), and it must be smaller than 50% * maxMemSize.
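
For illustration only (the directories and sizes below are placeholders), common data-node parameters could be set in cluster.cfg like this; maxMemSize is one of the standalone-mode parameters that also applies to data nodes:

.. code:: console

   $ volumes=/hdd/hdd1/volumes,/hdd/hdd2/volumes
   $ maxMemSize=32
   $ lanCluster=0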

Streaming-related configuration parameters are specified in cluster.cfg.

.. list-table::
   :header-rows: 1

   * - Configuration Parameters
     - Details
   * - streamingHADir=/home/DolphinDB/Data/NODE1/log/streamLog
     - The directory to keep streaming Raft log files. The default value is \<*HomeDir*\>/log/streamLog. Each data node should be configured with a different *streamingHADir*.
   * - streamingHAMode=raft
     - Enable high availability for streaming.
   * - streamingRaftGroups=2:NODE1:NODE2:NODE3,3:NODE3:NODE4:NODE5
     - Information about Raft groups. Each Raft group is represented by its group ID and the aliases of the data nodes in the group, separated with colons (:). The Raft group ID must be an integer greater than 1. Each Raft group has at least 3 data nodes. Use commas (,) to separate multiple Raft groups.
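
A sketch of streaming high-availability settings in cluster.cfg; the node aliases and directories are placeholders, and the alias-qualified entries (a mechanism described later in this section) are one way to give each node its own Raft log directory:

.. code:: console

   $ streamingHAMode=raft
   $ streamingRaftGroups=2:NODE1:NODE2:NODE3
   $ NODE1.streamingHADir=/home/DolphinDB/Data/NODE1/streamLog
   $ NODE2.streamingHADir=/home/DolphinDB/Data/NODE2/streamLog
   $ NODE3.streamingHADir=/home/DolphinDB/Data/NODE3/streamLog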

**Asynchronous Replication**

- Master Cluster

The parameters *clusterReplicationSlaveNum* and *clusterReplicationMode* must be configured.

.. list-table::
   :header-rows: 1

   * - Parameter
     - Default
     - Description
     - Node
   * - clusterReplicationSlaveNum
     - 2
     - Specifies the upper limit on the number of slave clusters.
     - controller
   * - clusterReplicationMode
     - \
     - Specifies whether the cluster is the master or a slave cluster.
     - data node
   * - clusterReplicationWorkDir
     - \<*HomeDir*\>/clusterReplication
     - Specifies the working directory where data of write tasks on the master cluster is stored.
     - data node
   * - clusterReplicationSyncPersistence
     - false
     - Specifies a Boolean value indicating whether to persist data of write tasks synchronously or asynchronously. Note: Synchronous persistence ensures data consistency but may affect transaction efficiency, while asynchronous persistence improves performance but may cause data loss in node crashes.
     - data node

- Slave Cluster

The parameters *clusterReplicationMasterCtl* and *clusterReplicationMode* must be configured.

.. list-table::
   :header-rows: 1

   * - Parameter
     - Default
     - Description
     - Node
   * - clusterReplicationMasterCtl
     - \
     - Specifies \<*ip:port*\> of the controller in the master cluster. For a high-availability (HA) cluster, you can specify any controller in the Raft group.
     - controller
   * - clusterReplicationMode
     - \
     - Specifies whether the cluster is the master or a slave cluster.
     - data node
   * - replicationExecutionUsername
     - admin
     - Specifies the username used to execute operations regarding cluster replication. The specified user must have the relevant privileges on transactions, otherwise the asynchronous tasks will fail.
     - data node
   * - replicationExecutionPassword
     - 123456
     - Specifies the user password for replicationExecutionUsername.
     - data node

In *cluster.cfg*, we can specify configuration parameter values in the following 4 ways:

\1. Node alias qualified configuration parameters. Node aliases are defined in cluster.nodes.

.. code:: console

   $ nodeA.volumes = /DFSRoot/nodeA
   $ nodeB.volumes = /DFSRoot/nodeB

\2. (Node alias + wildcard character ("?" or "%")) qualified configuration parameters. "?" represents a single character; "%" represents 0, 1 or multiple characters.

.. code:: console

   $ %8821.volumes = /DFSRoot/data8821
   $ %8822.volumes = /DFSRoot/data8822
   $ DFSnode?.maxMemSize=16

\3. Use the macro variable ``<ALIAS>`` for assignments with node aliases. For a cluster with 2 data nodes nodeA and nodeB:

.. code:: console

   $ volumes = /DFSRoot/<ALIAS>

is equivalent to

.. code:: console

   $ nodeA.volumes = /DFSRoot/nodeA
   $ nodeB.volumes = /DFSRoot/nodeB

Note: The macro variable cannot be used to specify the volume for a single node, otherwise the controller cannot be started. For example, do not specify a configuration parameter as ``nodeA.volumes = /DFSRoot/<ALIAS>``.

\4. If the parameters are not qualified by node aliases, they indicate common configuration parameter values for all the data nodes in the cluster.

.. code:: console

   // for Windows
   $ maxConnections=64
   // for Linux
   $ maxConnections=512
   $ maxMemSize=12

*Parameter Assignment Order*

A configuration parameter may appear on the command line or in multiple configuration files, and it may be assigned different values in different locations. DolphinDB checks for parameter values in the order below. If a parameter is assigned a value in one step, the assignments in all subsequent steps are ignored.

\1. command line

\2. cluster.nodes

\3. qualified by specific node aliases in cluster.cfg

\4. qualified by node aliases and wildcard characters in cluster.cfg

\5. common configuration in cluster.cfg
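
To illustrate this order with a hypothetical example, suppose cluster.cfg contains both an alias-qualified assignment and a common one:

.. code:: console

   $ nodeA.maxMemSize=16
   $ maxMemSize=8

Here nodeA starts with maxMemSize=16 (matched in step 3), while all other data nodes fall back to the common value 8 (step 5).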