NoSQL Database Performance Tuning
MangoNoSQL performs its writes in the background when Mango is under high load. This is done via Batch Write Behind tasks. Each task will run in parallel to the others and pull point values from a common pool and insert them into the data store. If you see any NoSQL Data Lost Events in your system it is recommended to adjust the performance settings.
Batch write behind spawn threshold
This setting determines how many point values must be waiting to be written before creating a new Batch Write Behind task to insert them in parallel.
Max batch write behind tasks
This setting determines the maximum number of tasks that will be used. New tasks are only created when the pool reaches the spawn threshold. Note that increasing this value will have an effect on the number of available High Priority threads, so ensure you have a large enough High Priority Thread pool.
Batch write inserts per task
This setting determines how many point values are pulled from the common pool when a batch write behind task is ready to insert more data.
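The interaction between the spawn threshold and the task limit can be sketched in a few lines of Python. This is only an illustrative reading of the settings described above, not Mango's actual scheduler: the function name and parameters are hypothetical, and the assumption that a single task handles any non-empty pool below the first threshold is ours.

```python
def target_task_count(pending, spawn_threshold, max_tasks):
    """Estimate how many Batch Write Behind tasks would be running.

    Hypothetical model: one more task is wanted for each full multiple
    of spawn_threshold waiting in the pool, capped at max_tasks.
    """
    if spawn_threshold <= 0 or max_tasks <= 0:
        raise ValueError("spawn_threshold and max_tasks must be positive")
    if pending <= 0:
        return 0  # nothing waiting, so no task is needed
    # Assumption: a non-empty pool always has at least one task draining it.
    wanted = max(1, pending // spawn_threshold)
    return min(wanted, max_tasks)


if __name__ == "__main__":
    for pending in (0, 5, 10, 25, 1000):
        print(pending, target_task_count(pending, spawn_threshold=10, max_tasks=4))
```

Under this model, raising the task limit only helps once the pool regularly grows past several multiples of the spawn threshold.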
Mango can automatically back up the MangoNoSql database to a zip file. The frequency and location of these backups can be set here. By default a backup is made daily and the 9 most recent backups are kept on disk.
Incremental backups store only the changes made since the previous backup, which reduces backup size. Each backup is a zip file of the changes between the previous backup and now. To restore them, use the restore tool or unzip the files over the existing database files in order.
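As a rough illustration of the manual restore path, the sketch below unzips a series of backups over a database directory one after another, so that files from later backups overwrite those from earlier ones. The function name, the idea of passing the archives as an already ordered list, and the assumption that Mango is stopped during the restore are ours; the restore tool remains the simpler option.

```python
import zipfile
from pathlib import Path


def restore_in_order(backup_zips, database_dir):
    """Extract each backup over database_dir, in the order given.

    backup_zips must already be ordered as the backups were taken;
    this sketch does not try to work out the order itself.
    """
    target = Path(database_dir)
    target.mkdir(parents=True, exist_ok=True)
    for zip_path in backup_zips:
        with zipfile.ZipFile(zip_path) as archive:
            # Later archives overwrite files written by earlier ones.
            archive.extractall(target)
```

Passing the archives in the wrong order would leave older data on top of newer data, which is why the order matters.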
Enable auto backup to schedule the backup of the MangoNoSql database to a file.
Last run is the time and date of the last backup. If no backup has ever been made, the date will be unknown.
The Backup directory should point to the directory where the backup files will be stored. By default this is set to MA_HOME/backups.
Backup every sets the frequency of backups as a number of periods. Backups are scheduled relative to the last backup time. If that time is unknown, the first backup will run at the next occurrence of the time entered below.
Backup time sets a preferred time of day. Choose the hour and minute of the backup; the hour is from 0 to 23 and the minute from 0 to 59.
The Versions to keep setting allows the user to choose how many backups to keep in the directory before they are deleted. If a number greater than 1 is entered, each backup will be named MangoNoSql-MMM-dd-yyyy_HHmmss.zip. If the number entered is 1, the file name will always remain MangoNoSql.zip and its contents will be replaced by the most recent backup.
The Backup now button allows the user to queue a job to back up the system now. This job will execute as soon as possible.
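The file naming rule for Versions to keep can be illustrated with a short Python sketch. The function names are hypothetical, the Java-style pattern MMM-dd-yyyy_HHmmss is rendered with the roughly equivalent strftime pattern (English month abbreviations in the default locale), and the assumption that the oldest backups are the ones deleted first is ours.

```python
from datetime import datetime

NAME_FORMAT = "MangoNoSql-%b-%d-%Y_%H%M%S.zip"


def backup_file_name(now, versions_to_keep):
    """Return the backup file name for the given time and retention setting."""
    if versions_to_keep > 1:
        return now.strftime(NAME_FORMAT)
    return "MangoNoSql.zip"  # a single file that is overwritten each time


def backups_to_delete(names, versions_to_keep):
    """Return the timestamped backups that exceed versions_to_keep.

    Assumption: the oldest backups, judged by the time in the file
    name, are removed first. Unrelated files are ignored.
    """
    stamped = []
    for name in names:
        try:
            stamped.append((datetime.strptime(name, NAME_FORMAT), name))
        except ValueError:
            continue  # not a timestamped MangoNoSql backup
    stamped.sort()
    excess = max(0, len(stamped) - versions_to_keep)
    return [name for _, name in stamped[:excess]]
```

The time is parsed from the name rather than sorted as text because month abbreviations do not sort chronologically.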
Mango stores all point values waiting to be written in an in-memory list that is only used by the NoSQL module. It then selects up to ‘Batch write behind inserts per task’ values from this list and starts a thread to insert them. As long as this list has at least one point value in it, Mango will be trying to insert data, so data is only held in this list as long as necessary. The way to tune how much data is stored in the list is to adjust the following settings:
- ‘Batch write behind spawn threshold’ determines how big the list must get before a new task to insert data is created. For example, a spawn threshold of 10 would create new insert tasks when the list gets to 10, 20, 30, etc.
- ‘Max batch write behind tasks’ determines the maximum number of tasks that will be run at the same time to insert data.
- ‘Batch write behind inserts per task’ determines the number of point values that the task will take from the list on each of its iterations. Tasks will continue to run, grabbing this number of values from the list until the list is empty.
This is how we throttle data onto the disk during times of high polling versus times of low polling. These settings are highly dependent on your disk I/O capacity. For example, it's not worth having 1000 tasks running if you can only save data as fast as 10 tasks can write. Conversely, one task can only write the values (in order) for one point at a time, but multiple tasks can write data for separate points at the same time (due to the time series nature of the data store), so having only one task running can create a backlog if it cannot write data as fast as it is coming in.
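The trade-off between task count and disk capacity can be made concrete with some simple arithmetic. The model below is a deliberate simplification with made-up numbers: it assumes each task writes at a fixed rate and that the disk imposes a hard ceiling on total throughput. None of these names or figures come from Mango itself.

```python
def effective_write_rate(tasks, per_task_rate, disk_limit):
    """Values written per second, capped by what the disk can absorb."""
    return min(tasks * per_task_rate, disk_limit)


def backlog_after(seconds, incoming_rate, tasks, per_task_rate, disk_limit):
    """Point values left waiting after `seconds` of steady polling."""
    drain = effective_write_rate(tasks, per_task_rate, disk_limit)
    return max(0, (incoming_rate - drain) * seconds)


if __name__ == "__main__":
    # Hypothetical load: 5000 values/s arriving, 1000 values/s per task,
    # and a disk that can absorb at most 10000 values/s.
    for tasks in (1, 10, 1000):
        rate = effective_write_rate(tasks, 1000, 10000)
        waiting = backlog_after(60, 5000, tasks, 1000, 10000)
        print(tasks, rate, waiting)
```

In this model one task leaves a growing backlog, ten tasks keep up, and a thousand tasks write no faster than ten because the disk is the limit.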