- 21 Nov 2024
- 22 Minutes to read
Project Based Data Processing
- Updated on 21 Nov 2024
Using a single Tegsoft installation for multiple projects or brands (multi-tenant usage) is widespread, so you may sometimes need to move data between databases, or to list, delete, or anonymize CDR data.
This article describes project-based data operations, and its content is technical. For project definitions and management, see the "Project Management" article.
Workspace and Environment
The “Tegsoft Touch Data Processing” service handles all the data management processes. This component can be executed as a docker container or from a command line interface. Both usages require the preparation of a config file.
When the docker container is active, data processing can be activated via the service interface.
Data Processing Config File
The config file is in JSON format and has various sections. The file can be loaded from an HTTP URL or from a local file.
A sample config file is shown below:
{
  "activePeriods": [
    {
      "timeBegin": 0,
      "timeEnd": 2359
    }
  ],
  "projectRules": [
    {
      "name": "Project Summary",
      "description": "Display the project summary",
      "source": {
        "dbUser": "tobe",
        "dbPassword": "a46116581cb8fec5",
        "dbDriver": "com.ibm.db2.jcc.DB2Driver",
        "dbUrl": "jdbc:db2://TEGSOFT_DATABASE_SERVER_IP:50000/tobe",
        "PBXID": "ebd37af5-260a-436f-9bdf-44bcc9b1b946",
        "UNITUID": "4a55c1e3-edd5-46ef-b66f-d74634e8469a"
      },
      "projectId": "a0696b51-2ea1-4566-be89-58df485cab2d",
      "projectRuleType": "projectSummary",
      "beginDate": "20191027",
      "beginTime": "00:00:00",
      "endDate": "20251128",
      "endTime": "00:00:00"
    },
    {
      "name": "List project CDR",
      "description": "List project CDR",
      "source": {
        "dbUser": "tobe",
        "dbPassword": "a46116581cb8fec5",
        "dbDriver": "com.ibm.db2.jcc.DB2Driver",
        "dbUrl": "jdbc:db2://YYYYYY:50000/tobe",
        "PBXID": "ebd37af5-260a-436f-9bdf-44bcc9b1b946",
        "UNITUID": "4a55c1e3-edd5-46ef-b66f-d74634e8469a"
      },
      "projectId": "a0696b51-2ea1-4566-be89-58df485cab2d",
      "projectRuleType": "listCDR",
      "beginDate": "20191027",
      "beginTime": "00:00:00",
      "endDate": "20251128",
      "endTime": "00:00:00",
      "initialFetchSize": 300,
      "periodDurationDays": 3
    }
  ]
}
Execution Periods
Execution of the process can be limited to one or more specific periods through the “activePeriods” definition. The definition is an array of allowed periods. Each period is a block of conditions marked with beginning and ending values. If a period has multiple conditions, such as a “time between” and a “date between” condition, all of them must match.
If any period matches, whether it has a single condition or several, execution can start. If none matches, the process will not execute. If no period is defined, the process will execute.
The “activePeriods” element can be used globally in the config file or under each rule individually. If the global conditions don’t match, nothing is executed; if the conditions under a rule don’t match, that rule will not execute.
Syntax:
"activePeriods":[
  { // Period 1
    Condition 1, Condition 2, ... Condition N
  },
  {
    // Period 2
  },
  ....
  {
    // Period n
  }
]
Both the begin and end values are inclusive.
timeBegin: The beginning time of the period in 24-hour format. The value must be a decimal number in HHMM form, without leading zeros. Examples:
11pm or 23:00 → 2300
8:30am or 08:30 → 830
One past midnight 00:01 → 1
timeEnd: The ending time of the period in 24-hour format. The value must be a decimal number in HHMM form, without leading zeros. Examples:
11pm or 23:00 → 2300
8:30am or 08:30 → 830
One past midnight 00:01 → 1
time: The exact time of the period in 24-hour format. The value must be a decimal number in HHMM form, without leading zeros. Examples:
11pm or 23:00 → 2300
8:30am or 08:30 → 830
One past midnight 00:01 → 1
The time period is compared with the current time.
For example, with timeBegin: 1 and timeEnd: 2359,
22:10 and 08:30 will match, but 00:00 will not.
dayOfMonthBegin: This is the starting point of the “day of the month” condition. This field takes integer values from 1 to 31.
dayOfMonthEnd: This is the ending point of the “day of the month” condition. This field takes integer values from 1 to 31.
dayOfMonth: This is the exact day of the month. This field takes integer values from 1 to 31.
dayOfWeekBegin: This is the starting point of the “day of the week” condition. This field takes integer values from 1 to 7. The week starts on Monday and ends on Sunday. Monday: 1 and Sunday: 7.
dayOfWeekEnd: This is the ending point of the “day of the week” condition. This field takes integer values from 1 to 7. The week starts on Monday and ends on Sunday. Monday: 1 and Sunday: 7.
dayOfWeek: This is the exact day of the week condition. This field takes integer values from 1 to 7. The week starts on Monday and ends on Sunday. Monday: 1 and Sunday: 7.
dateBegin: This is the starting point of the date condition. This field takes a string value in YYYYMMDD format; for example, 20230727 is 27 July 2023.
dateEnd: This is the ending point of the date condition. This field takes a string value in YYYYMMDD format; for example, 20230727 is 27 July 2023.
date: This is the exact date condition. This field takes a string value in YYYYMMDD format; for example, 20230727 is 27 July 2023.
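The matching rules described above can be sketched in Python. This is only an illustration of the documented behaviour, not the service's actual implementation. It assumes that a begin value without an end value (or the reverse) leaves that side open, and that ranges do not wrap around (for example, across midnight); the function names are hypothetical.

```python
from datetime import datetime

# (begin key, end key, exact key) for each condition family in "activePeriods".
CONDITIONS = {
    "time": ("timeBegin", "timeEnd", "time"),
    "dayOfMonth": ("dayOfMonthBegin", "dayOfMonthEnd", "dayOfMonth"),
    "dayOfWeek": ("dayOfWeekBegin", "dayOfWeekEnd", "dayOfWeek"),
    "date": ("dateBegin", "dateEnd", "date"),
}


def current_values(now):
    """Express `now` in the same encodings the config file uses."""
    return {
        "time": now.hour * 100 + now.minute,  # 08:30 -> 830, 00:01 -> 1
        "dayOfMonth": now.day,                # 1..31
        "dayOfWeek": now.isoweekday(),        # Monday=1 .. Sunday=7
        "date": now.strftime("%Y%m%d"),       # e.g. "20230727"
    }


def period_matches(period, now):
    """A period matches only if every condition it contains matches."""
    values = current_values(now)
    for family, (begin_key, end_key, exact_key) in CONDITIONS.items():
        value = values[family]
        if exact_key in period and value != period[exact_key]:
            return False
        # Begin and end values are inclusive.
        if begin_key in period and value < period[begin_key]:
            return False
        if end_key in period and value > period[end_key]:
            return False
    return True


def is_active(active_periods, now):
    """No periods defined -> execute; otherwise one matching period is enough."""
    if not active_periods:
        return True
    return any(period_matches(period, now) for period in active_periods)
```

With `[{"timeBegin": 1, "timeEnd": 2359}]`, `is_active` returns `True` for 22:10 and 08:30 and `False` for 00:00, which agrees with the example given for the time condition.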
Project Rules
This section defines a set of project-related actions to execute. This article mainly covers topics related to this section.
name: It is always good to have a rule name so that the process can be tracked, managed, and monitored. The name must be unique among all rules in the file.
description: An optional description area for taking notes and sharing more information about the rule.
source: Defines source database connection parameters including dbUser, dbPassword (encrypted), dbDriver, dbUrl, PBXID and UNITUID.
projectId: The unique ID of the project whose data the process will access.
projectRuleType: This field defines the process that will be executed. The field value can be one of the following: projectSummary, listCDR, listAudioFiles, copyAudioFiles, moveAudioFiles, deleteAudioFiles, deleteProjectCDR, annonymizeCDR. Each process is described in this article.
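Before handing a config file to the service, it can be useful to check the fields listed above. The following is a minimal sketch, not part of the product. It assumes that source, projectId, and projectRuleType are required in every rule and that all six source keys must be present; `check_project_rules` is a hypothetical helper name.

```python
REQUIRED_RULE_KEYS = ("source", "projectId", "projectRuleType")
REQUIRED_SOURCE_KEYS = ("dbUser", "dbPassword", "dbDriver", "dbUrl", "PBXID", "UNITUID")


def check_project_rules(config):
    """Return a list of human-readable problems found in config["projectRules"]."""
    problems = []
    seen_names = set()
    for index, rule in enumerate(config.get("projectRules", [])):
        name = rule.get("name")
        label = name if name is not None else f"rule #{index}"
        for key in REQUIRED_RULE_KEYS:
            if key not in rule:
                problems.append(f"{label}: missing '{key}'")
        for key in REQUIRED_SOURCE_KEYS:
            if key not in rule.get("source", {}):
                problems.append(f"{label}: missing 'source.{key}'")
        # Rule names are optional but must be unique when present.
        if name is not None:
            if name in seen_names:
                problems.append(f"{label}: duplicate rule name")
            seen_names.add(name)
    return problems
```

An empty list means that no problem was detected by these checks; it does not guarantee that the service will accept the file.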
Data Processing Docker Image
The Data Processing Service can be executed in a Docker container. The service is dockerized and published in public Docker repositories. You can start the container on any Docker manager solution with a command like the one below; for the exact syntax, please check your Docker manager’s technical documentation.
This kind of usage is well suited to continuous processes.
docker run \
-d --restart unless-stopped \
-p 8380:80 \
--env configUrl=https://CONFIG_SERVER_URL/repconfig.json \
--env TZ=DESIRED_TIMEZONE \
tegsoft/tegsofttouchdataprocessingserver:84
The configUrl environment variable must be set to a valid URL that points to a valid configuration.
The TZ parameter sets the data processing server's timezone; please check the link for the correct timezone value.
Command Line Execution
The Data Processing Service can also be executed from a command line interface. Before you start, the command line environment needs to be prepared. The environment can be prepared on a Tegsoft Compute Instance; some actions (projectSummary, listCDR, deleteProjectCDR, annonymizeCDR) may also be executed on a Tegsoft Database Instance.
mkdir -p /root/dataProcessingWorkspace
cd /root/dataProcessingWorkspace
bash <(curl -s https://setup.tegsoftcloud.com/resources/dataProcessingWorkspace/updateOnline.sh)
This kind of usage is good for one-time process execution, like moving data, listing, or deleting project recording files.
You can always run the command below for help.
/root/dataProcessingWorkspace/help.sh
Once you have prepared your config file, you can execute the process with the commands listed below (remember to change the config file name if needed).
export configUrl=/root/dataProcessingWorkspace/replicationConfig.json
nohup /root/dataProcessingWorkspace/runProcess.sh > /root/dataProcessingWorkspace/output.txt 2>&1 &
During execution, you can exit the shell. Execution will continue, and the logs will be accessible in the output.txt file.
If you want to skip the logs and see only the output, run:
grep 'START\|DATA_OUTPUT\|DATA_FORMAT\|PROJECT_COMPLETE\|DONE' /root/dataProcessingWorkspace/output.txt
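If the output file has been copied to a machine without grep, the same filtering can be done with a few lines of Python. This is a sketch that only mirrors the grep pattern above: it keeps any line containing one of the marker strings. `output_lines` is a hypothetical helper name.

```python
MARKERS = ("START", "DATA_OUTPUT", "DATA_FORMAT", "PROJECT_COMPLETE", "DONE")


def output_lines(log_text):
    """Keep only the lines that contain one of the marker strings."""
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line for marker in MARKERS)
    ]


def read_output(path):
    """Read a log file such as output.txt and return its marker lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return output_lines(handle.read())
```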
Display Project Summary
This action type is used to create a summary table for the project. The following parameters are valid.
projectRuleType: must be "projectSummary"
beginDate: The summary table is calculated from project data within the given period. This value is the beginning date of that period. The value must be in YYYYMMDD format; for example, 20230727 is 27 July 2023.
beginTime: An optional parameter that defines the starting time of the period. If the value is null, the “begin date” parameter is used without a time. The value must be in “HH:MM:SS” format; for example, "17:05:22" is 22 seconds past 5:05 pm.
endDate: This value is the upper limit date of the data period. The value must be in YYYYMMDD format; for example, 20230727 is 27 July 2023.
endTime: An optional parameter that defines the ending time of the period. If the value is null, the “end date” parameter is used without a time. The value must be in “HH:MM:SS” format; for example, "17:05:22" is 22 seconds past 5:05 pm.
initialFetchSize: This parameter is optional and 300 is the default value. Changing this value requires advanced technical skills.
periodDurationDays: This parameter is optional and 10 is the default value. Changing this value requires advanced technical skills.
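The date and time formats above can be checked before a rule is submitted. The sketch below parses the four period parameters and rejects a period whose beginning is later than its end. It assumes that a null time means the boundary falls at the start of the given date; the service may treat this case differently. Both function names are hypothetical.

```python
from datetime import datetime


def parse_boundary(date_value, time_value=None):
    """Combine a YYYYMMDD date and an optional HH:MM:SS time into a datetime."""
    if len(date_value) != 8 or not date_value.isdigit():
        raise ValueError(f"date must be in YYYYMMDD format: {date_value!r}")
    if time_value is None:
        # Assumption: without a time, the boundary is the date alone.
        return datetime.strptime(date_value, "%Y%m%d")
    return datetime.strptime(f"{date_value} {time_value}", "%Y%m%d %H:%M:%S")


def check_rule_period(rule):
    """Return (begin, end) or raise ValueError if the period is malformed."""
    begin = parse_boundary(rule["beginDate"], rule.get("beginTime"))
    end = parse_boundary(rule["endDate"], rule.get("endTime"))
    if begin > end:
        raise ValueError(f"period begins after it ends: {begin} > {end}")
    return begin, end
```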
Listing Project Call Details
This action type is used to list project call details in a given period. The following parameters are valid.
projectRuleType: must be "listCDR"
beginDate: The process accesses project data within the given period. This value is the beginning date of that period. The value must be in YYYYMMDD format; for example, 20230727 is 27 July 2023.
beginTime: An optional parameter that defines the starting time of the period. If the value is null, the “begin date” parameter is used without a time. The value must be in “HH:MM:SS” format; for example, "17:05:22" is 22 seconds past 5:05 pm.
endDate: This value is the upper limit date of the data period. The value must be in YYYYMMDD format; for example, 20230727 is 27 July 2023.
endTime: An optional parameter that defines the ending time of the period. If the value is null, the “end date” parameter is used without a time. The value must be in “HH:MM:SS” format; for example, "17:05:22" is 22 seconds past 5:05 pm.
initialFetchSize: This parameter is optional and 300 is the default value. Changing this value requires advanced technical skills.
periodDurationDays: This parameter is optional and 10 is the default value. Changing this value requires advanced technical skills.
listType: This parameter is optional. To export the rows to an Excel file, set this parameter to EXCEL.
destinationFileName: This parameter is mandatory if the “listType” parameter value is EXCEL. The value is the absolute path of the Excel file, such as “/parent/folder/of/the/excel/excelFileName.xlsx”.
sheetName: This parameter is optional and “dataExport” is the default value. You can use this parameter to customize the sheet name of the Excel sheet.
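The relationship between the three export parameters can be sketched as a small helper that adds them to an existing rule. This is an illustration only; `with_excel_export` is a hypothetical name, and the absolute-path check assumes POSIX-style paths like the one in the example above.

```python
import posixpath


def with_excel_export(rule, destination_file_name, sheet_name=None):
    """Return a copy of a list rule that exports its rows to an Excel file."""
    # destinationFileName is mandatory for EXCEL and must be absolute.
    if not destination_file_name or not posixpath.isabs(destination_file_name):
        raise ValueError("destinationFileName must be an absolute path")
    exported = dict(rule)
    exported["listType"] = "EXCEL"
    exported["destinationFileName"] = destination_file_name
    if sheet_name is not None:
        # When omitted, the default sheet name ("dataExport") applies.
        exported["sheetName"] = sheet_name
    return exported
```

The original rule is left unchanged, so the same base rule can be reused for both a plain listing and an Excel export.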
Listing Project Audio Files
This action type is used to list project audio files in a given period. The process will fetch project CDR data and will search audio files against that CDR list. Search will be performed on all archive destinations. The following parameters are valid.
projectRuleType: must be "listAudioFiles"
beginDate: The process accesses project data within the given period. This value is the beginning date of that period. The value must be in YYYYMMDD format; for example, 20230727 is 27 July 2023.
beginTime: An optional parameter that defines the starting time of the period. If the value is null, the “begin date” parameter is used without a time. The value must be in “HH:MM:SS” format; for example, "17:05:22" is 22 seconds past 5:05 pm.
endDate: This value is the upper limit date of the data period. The value must be in YYYYMMDD format; for example, 20230727 is 27 July 2023.
endTime: An optional parameter that defines the ending time of the period. If the value is null, the “end date” parameter is used without a time. The value must be in “HH:MM:SS” format; for example, "17:05:22" is 22 seconds past 5:05 pm.
initialFetchSize: This parameter is optional and 300 is the default value. Changing this value requires advanced technical skills.
periodDurationDays: This parameter is optional and 10 is the default value. Changing this value requires advanced technical skills.
listType: This parameter is optional. To export the rows to an Excel file, set this parameter to EXCEL.
destinationFileName: This parameter is mandatory if the “listType” parameter value is EXCEL. The value is the absolute path of the Excel file, such as “/parent/folder/of/the/excel/excelFileName.xlsx”.
sheetName: This parameter is optional and “dataExport” is the default value. You can use this parameter to customize the sheet name of the Excel sheet.
Copy Project Audio Files
This action type is used to copy project audio files in a given period to a new destination. The process will fetch project CDR data and will search for audio files against that CDR list. The search will be performed on all archive destinations, and the audio files found will be copied to the new destination. The following parameters are valid.
projectRuleType: must be "copyAudioFiles"
beginDate: The process accesses project data within the given period. This value is the beginning date of that period. The value must be in YYYYMMDD format; for example, 20230727 is 27 July 2023.
beginTime: An optional parameter that defines the starting time of the period. If the value is null, the “begin date” parameter is used without a time. The value must be in “HH:MM:SS” format; for example, "17:05:22" is 22 seconds past 5:05 pm.
endDate: This value is the upper limit date of the data period. The value must be in YYYYMMDD format; for example, 20230727 is 27 July 2023.
endTime: An optional parameter that defines the ending time of the period. If the value is null, the “end date” parameter is used without a time. The value must be in “HH:MM:SS” format; for example, "17:05:22" is 22 seconds past 5:05 pm.
initialFetchSize: This parameter is optional and 300 is the default value. Changing this value requires advanced technical skills.
periodDurationDays: This parameter is optional and 10 is the default value. Changing this value requires advanced technical skills.
destinationFolder: This is the new location where audio files will be copied. This parameter is mandatory. The destination can be a local file system, a mounted NFS storage area, or an FTP- or SFTP-like destination. Any kind of mountable destination can be used.
skipPngFiles: This parameter is optional and “false” is the default value. You can use this parameter to ignore (exclude) PNG files that are related to audio recordings.
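As an illustration, a copy rule can be assembled and serialized with Python. The sketch below also attaches a rule-level “activePeriods” element so that copying only runs at night. Several points are assumptions rather than documented facts: that skipPngFiles is a JSON boolean, that a time range cannot wrap across midnight (hence the two periods), and all example values such as the destination folder. `copy_audio_rule` is a hypothetical helper name.

```python
import json


def copy_audio_rule(source, project_id, begin_date, end_date, destination_folder):
    """Build a copyAudioFiles rule restricted to a 22:00-06:00 window."""
    if not destination_folder:
        raise ValueError("destinationFolder is mandatory for copyAudioFiles")
    return {
        "name": "Copy project audio files",
        "source": source,
        "projectId": project_id,
        "projectRuleType": "copyAudioFiles",
        "beginDate": begin_date,
        "endDate": end_date,
        "destinationFolder": destination_folder,
        "skipPngFiles": True,  # assumption: boolean value, default is false
        # Assumption: ranges do not wrap, so the night window is split in two.
        "activePeriods": [
            {"timeBegin": 2200, "timeEnd": 2359},
            {"timeBegin": 0, "timeEnd": 600},
        ],
    }


def to_config_json(rules):
    """Serialize a list of rules as a config file body."""
    return json.dumps({"projectRules": rules}, indent=2)
```

The same shape applies to a moveAudioFiles rule, with only the projectRuleType value changed.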
Move Project Audio Files
This action type is used to move project audio files in a given period to a new destination. The process will fetch project CDR data and will search for audio files against that CDR list. The search will be performed on all archive destinations, and the audio files found will be copied to the new destination. After a successful copy, the source files will be deleted. The following parameters are valid.
projectRuleType: must be "moveAudioFiles"
beginDate: The process accesses project data within the given period. This value is the beginning date of that period. The value must be in YYYYMMDD format; for example, 20230727 is 27 July 2023.
beginTime: An optional parameter that defines the starting time of the period. If the value is null, the “begin date” parameter is used without a time. The value must be in “HH:MM:SS” format; for example, "17:05:22" is 22 seconds past 5:05 pm.
endDate: This value is the upper limit date of the data period. The value must be in YYYYMMDD format; for example, 20230727 is 27 July 2023.
endTime: An optional parameter that defines the ending time of the period. If the value is null, the “end date” parameter is used without a time. The value must be in “HH:MM:SS” format; for example, "17:05:22" is 22 seconds past 5:05 pm.
initialFetchSize: This parameter is optional and 300 is the default value. Changing this value requires advanced technical skills.
periodDurationDays: This parameter is optional and 10 is the default value. Changing this value requires advanced technical skills.
destinationFolder: This is the new location where audio files will be moved. This parameter is mandatory. The destination can be a local file system, a mounted NFS storage area, or an FTP- or SFTP-like destination. Any kind of mountable destination can be used.
skipPngFiles: This parameter is optional and “false” is the default value. You can use this parameter to ignore (exclude) PNG files that are related to audio recordings.
Delete Project Audio Files
This action type is used to delete project audio files in a given period. The process will fetch project CDR data and will search for audio files against that CDR list. The search will be performed on all archive destinations, and the audio files found will be deleted. The following parameters are valid.
projectRuleType: must be "deleteAudioFiles"
beginDate: The process accesses project data within the given period. This value is the beginning date of that period. The value must be in YYYYMMDD format; for example, 20230727 is 27 July 2023.
beginTime: An optional parameter that defines the starting time of the period. If the value is null, the “begin date” parameter is used without a time. The value must be in “HH:MM:SS” format; for example, "17:05:22" is 22 seconds past 5:05 pm.
endDate: This value is the upper limit date of the data period. The value must be in YYYYMMDD format; for example, 20230727 is 27 July 2023.
endTime: An optional parameter that defines the ending time of the period. If the value is null, the “end date” parameter is used without a time. The value must be in “HH:MM:SS” format; for example, "17:05:22" is 22 seconds past 5:05 pm.
initialFetchSize: This parameter is optional and 300 is the default value. Changing this value requires advanced technical skills.
periodDurationDays: This parameter is optional and 10 is the default value. Changing this value requires advanced technical skills.
skipPngFiles: This parameter is optional and “false” is the default value. You can use this parameter to ignore (exclude) PNG files that are related to audio recordings.
Delete Project Call Detail Records
This action type is used to delete project call detail records (CDR) in a given period. The process will fetch project CDR data and delete the resulting data. The following parameters are valid.
projectRuleType: must be "deleteCDR"
beginDate: The process accesses project data within the given period. This value is the beginning date of that period. The value must be in YYYYMMDD format; for example, 20230727 is 27 July 2023.
beginTime: An optional parameter that defines the starting time of the period. If the value is null, the “begin date” parameter is used without a time. The value must be in “HH:MM:SS” format; for example, "17:05:22" is 22 seconds past 5:05 pm.
endDate: This value is the upper limit date of the data period. The value must be in YYYYMMDD format; for example, 20230727 is 27 July 2023.
endTime: An optional parameter that defines the ending time of the period. If the value is null, the “end date” parameter is used without a time. The value must be in “HH:MM:SS” format; for example, "17:05:22" is 22 seconds past 5:05 pm.
initialFetchSize: This parameter is optional and 300 is the default value. Changing this value requires advanced technical skills.
periodDurationDays: This parameter is optional and 10 is the default value. Changing this value requires advanced technical skills.
Anonymize Project Call Detail Records
This action type is used to anonymize project call detail records (CDR) in a given period. The process will fetch project CDR data and anonymize the resulting data. Most values will be replaced with 9999. The following parameters are valid.
projectRuleType: must be "anonymizeCDR"
beginDate: The process accesses project data within the given period. This value is the beginning date of that period. The value must be in YYYYMMDD format; for example, 20230727 is 27 July 2023.
beginTime: An optional parameter that defines the starting time of the period. If the value is null, the “begin date” parameter is used without a time. The value must be in “HH:MM:SS” format; for example, "17:05:22" is 22 seconds past 5:05 pm.
endDate: This value is the upper limit date of the data period. The value must be in YYYYMMDD format; for example, 20230727 is 27 July 2023.
endTime: An optional parameter that defines the ending time of the period. If the value is null, the “end date” parameter is used without a time. The value must be in “HH:MM:SS” format; for example, "17:05:22" is 22 seconds past 5:05 pm.
initialFetchSize: This parameter is optional and 300 is the default value. Changing this value requires advanced technical skills.
periodDurationDays: This parameter is optional and 10 is the default value. Changing this value requires advanced technical skills.
Service Interface
This interface allows the execution of all the data processing capabilities via the HTTP (/TegsoftTouch/v1/touch/DataProcessing) interface.
The interface is inactive by default. To activate it, set the “serviceSecurityToken” environment variable to any key or password you choose. Once the Docker container is started with this environment variable, the interface will start serving requests. The same value must be sent in the “serviceSecurityToken” header of each request.
Examples and details are shared below.
The following HTTP methods are supported:
GET: This can be used to query the status of the processes. The “service” query parameter must be one of the following values:
getReplicationRuleStatus: This service parameter will display the previously activated “ReplicationRule” processor status. Please check the topic “examples” below.
getProjectRuleStatus: This service parameter will display the previously activated “ProjectRule” processor status. Please check the topic “examples” below.
getBackupRuleStatus: This service parameter will display the previously activated “BackupRule” processor status. Please check the topic “examples” below.
getExportRuleStatus: This service parameter will display the previously activated “ExportRule” processor status. Please check the topic “examples” below.
getMaintenanceRuleStatus: This service parameter will display the previously activated “MaintenanceRule” processor status. Please check the topic “examples” below.
DELETE: This can be used to stop the execution of the processes. The “service” query parameter must be one of the following values:
cancelReplicationRule: This service parameter will cancel the previously activated “ReplicationRule” processor. Please check the topic “examples” below.
cancelProjectRule: This service parameter will cancel the previously activated “ProjectRule” processor. Please check the topic “examples” below.
cancelBackupRule: This service parameter will cancel the previously activated “BackupRule” processor. Please check the topic “examples” below.
cancelExportRule: This service parameter will cancel the previously activated “ExportRule” processor. Please check the topic “examples” below.
cancelMaintenanceRule: This service parameter will cancel the previously activated “MaintenanceRule” processor. Please check the topic “examples” below.
POST: This can be used to start the execution of the processes. The “service” query parameter must be one of the following values:
initiateReplicationRule: This service parameter will initiate a new “ReplicationRule” processor. Please check the topic “examples” below and the “Data Processing Config File” topic for the body rule definitions in JSON format.
initiateProjectRule: This service parameter will initiate a new “ProjectRule” processor. Please check the topic “examples” below and the “Data Processing Config File” topic for the body rule definitions in JSON format.
initiateBackupRule: This service parameter will initiate a new “BackupRule” processor. Please check the topic “examples” below and the “Data Processing Config File” topic for the body rule definitions in JSON format.
initiateExportRule: This service parameter will initiate a new “ExportRule” processor. Please check the topic “examples” below and the “Data Processing Config File” topic for the body rule definitions in JSON format.
initiateMaintenanceRule: This service parameter will initiate a new “MaintenanceRule” processor. Please check the topic “examples” below and the “Data Processing Config File” topic for the body rule definitions in JSON format.
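The naming pattern of the service parameters (get…Status, cancel…, initiate…) makes it straightforward to build requests programmatically. The sketch below uses only the Python standard library and builds the request without sending it. The base URL, the token value, and the JSON content type are assumptions for illustration, and `build_request` is a hypothetical helper name.

```python
import json
import urllib.request
from urllib.parse import urlencode

ENDPOINT_PATH = "/TegsoftTouch/v1/touch/DataProcessing"
SERVICE_PREFIX = {"GET": "get", "DELETE": "cancel", "POST": "initiate"}


def build_request(base_url, token, method, rule_kind, body=None):
    """Build (but do not send) a request for one of the rule processors.

    rule_kind is one of "ReplicationRule", "ProjectRule", "BackupRule",
    "ExportRule" or "MaintenanceRule".
    """
    service = SERVICE_PREFIX[method] + rule_kind
    if method == "GET":
        service += "Status"  # e.g. getProjectRuleStatus
    url = f"{base_url}{ENDPOINT_PATH}?{urlencode({'service': service})}"
    headers = {"serviceSecurityToken": token}
    data = None
    if method == "POST":
        # The rule definition travels in the body as JSON.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"  # assumption
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

A request built this way can be sent with `urllib.request.urlopen(request)`; for example, `build_request("http://localhost:8380", token, "GET", "ProjectRule")` targets a container published on port 8380.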
HTTP Interface Examples
The Interface is Disabled
The docker environment variable “serviceSecurityToken” is missing. The HTTP interface will not execute any data processing rule.
{
"result": null,
"duration": 2,
"status_code": 200,
"success": false,
"errorMessage": "Service requests are not enabled.",
"errorCode": "DataProcessing/NA304",
"endService": 1728122023001,
"beginService": 1728122022999,
"message": "Invalid Request",
"serviceName": "DataProcessing",
"serverDateTime": "05/10/2024 09:53"
}
Authentication header is missing
The docker environment variable “serviceSecurityToken” value must be transmitted via the “serviceSecurityToken” header.
{
"result": null,
"duration": 3,
"status_code": 200,
"success": false,
"errorMessage": "Authentication error.",
"errorCode": "DataProcessing/NA401",
"endService": 1728125368076,
"beginService": 1728125368073,
"message": "Invalid Request",
"serviceName": "DataProcessing",
"serverDateTime": "05/10/2024 10:49"
}
Invalid service parameter
Please use one of the valid service parameters.
{
"result": null,
"duration": 4,
"status_code": 200,
"success": false,
"errorMessage": "Invalid service parameter",
"errorCode": "DataProcessing/DP001",
"endService": 1728125769094,
"beginService": 1728125769090,
"message": "Invalid Request",
"serviceName": "DataProcessing",
"serverDateTime": "05/10/2024 10:56"
}
Invalid HTTP method
Please use one of the valid HTTP methods.
{
"errormessage": "Request method PATCH is not allowed.",
"duration": 4,
"success": false,
"endService": 1728125871991,
"beginService": 1728125871987,
"serviceName": "DataProcessing",
"serverDateTime": "05/10/2024 10:57"
}
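Note that the sample error responses above report "status_code": 200, so a client should inspect the “success” field rather than rely on a status code alone. The sketch below summarizes a response body. It assumes that a successful response carries "success": true, which is not shown in the samples above; `describe_response` is a hypothetical helper name.

```python
def describe_response(payload):
    """Summarize a DataProcessing response body as a short string."""
    if payload.get("success"):
        return "ok"
    # Most samples use "errorMessage"; the invalid-method response
    # uses the lowercase key "errormessage" and has no error code.
    message = (
        payload.get("errorMessage")
        or payload.get("errormessage")
        or "unknown error"
    )
    code = payload.get("errorCode")
    return f"{code}: {message}" if code else message
```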
Valid examples
Request | Response |
---|---|