Performance Monitoring

Enable/Disable Performance Monitoring

  • If each node is started individually, performance monitoring is disabled by default. To enable it, set perfMonitoring=true in the configuration file dolphindb.cfg.

  • If the cluster is started from the web interface, performance monitoring is enabled by default. To disable it on all the nodes started by a controller node, set perfMonitoring=false in the configuration file dolphindb.cfg for the controller node.
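
As a rough illustration of the setting described above, the sketch below edits the text of a key=value configuration file so that perfMonitoring has the desired value. The helper name set_option and the sample file contents are hypothetical and only for illustration; the only name taken from this page is perfMonitoring.

```python
def set_option(cfg_text: str, key: str, value: str) -> str:
    """Return cfg_text with `key=value` set, replacing an existing entry or appending one."""
    lines = cfg_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        name, sep, _ = line.partition("=")
        # Only rewrite lines of the form "key=..." whose key matches exactly.
        if sep and name.strip() == key:
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Hypothetical contents of a dolphindb.cfg file, for illustration only.
example_cfg = "maxConnections=64\nperfMonitoring=false\n"
updated_cfg = set_option(example_cfg, "perfMonitoring", "true")
print(updated_cfg)
```

The function works on text rather than on a file path, so the result can be reviewed before it is written back to dolphindb.cfg.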

The job log file records descriptive information about every query executed on each node. By default it is stored in the log folder under the name nodeAlias_job.log, where nodeAlias is the alias of the node. The path and name of the job log file can be changed with the jobLog parameter in the configuration file.
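
The naming rule above can be sketched as a small helper that returns where the job log for a node is expected to be. The function name job_log_path, the relative folder name "log", and the sample alias and custom path are assumptions made for illustration; the actual location depends on how the node was started and configured.

```python
import posixpath


def job_log_path(node_alias: str, log_dir: str = "log", job_log: str = "") -> str:
    """Return the configured jobLog value if given, else <log_dir>/<node_alias>_job.log."""
    if job_log:
        # An explicit jobLog setting overrides the default folder and name.
        return job_log
    return posixpath.join(log_dir, f"{node_alias}_job.log")


# "node1" and the custom path below are hypothetical examples.
default_path = job_log_path("node1")
custom_path = job_log_path("node1", job_log="/data/ddb/jobs.log")
print(default_path)
print(custom_path)
```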

Performance Monitoring Methods

DolphinDB provides the following three ways of performance monitoring:

  • With built-in functions. Performance monitoring must be enabled to run the following functions, which return various measures for performance monitoring:

    • getPerf : returns performance monitoring measures for the local node. It can be run on any node in a cluster.

    • getClusterPerf : returns performance monitoring measures for all the nodes in the cluster. It can only be run on the controller.

    • getJobStat : monitors the number of jobs and tasks that are running or in the job queue.

  • On the web interface. The following are some of the performance monitoring metrics displayed on the web interface:

    • memUsed: memory used on the node

    • memAlloc: size of the current memory pool for DolphinDB on the node

    • medLast10QueryTime: the median execution time of the previous 10 finished queries

    • maxLast10QueryTime: the maximum execution time of the previous 10 finished queries

    • medLast100QueryTime: the median execution time of the previous 100 finished queries

    • maxLast100QueryTime: the maximum execution time of the previous 100 finished queries

    • maxRunningQueryTime: the maximum elapsed time of the queries that are currently running

  • Through the API, with third-party systems such as Prometheus. Take the following steps:

    1. Download Prometheus following the instructions from: https://prometheus.io/docs/prometheus/latest/getting_started/

    2. Add the data nodes to be monitored to the configuration file prometheus.yml (in Linux):

    static_configs:
    - targets: ['192.168.1.27:8503','192.168.1.27:8504']
    

    3. Start Prometheus (in Linux):

    $ ./prometheus --config.file=prometheus.yml
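
In a complete prometheus.yml, the static_configs entry from step 2 normally sits inside a job under scrape_configs. The sketch below renders such a minimal file from a list of node addresses. The function name build_prometheus_config, the job name "dolphindb", and the 15s scrape interval are assumptions chosen for illustration, not values prescribed by DolphinDB.

```python
def build_prometheus_config(targets, job_name="dolphindb", scrape_interval="15s"):
    """Render a minimal prometheus.yml that scrapes the given host:port targets."""
    quoted = ", ".join(f"'{t}'" for t in targets)
    return (
        "global:\n"
        f"  scrape_interval: {scrape_interval}\n"  # assumed interval, adjust as needed
        "\n"
        "scrape_configs:\n"
        f"  - job_name: '{job_name}'\n"  # assumed job name
        "    static_configs:\n"
        f"      - targets: [{quoted}]\n"
    )


# The two data node addresses are the ones used in step 2.
config_text = build_prometheus_config(["192.168.1.27:8503", "192.168.1.27:8504"])
print(config_text)
```

The printed text can be saved as prometheus.yml and passed to the start command in step 3.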