Data Manager (DM) Utility
Data Manager (DM) is a utility that lets users prepare their data for upload into the TimelinePI application. With DM, a user can combine, transform, cleanse, and sanitize (de-sensitize) data on a local machine or network before uploading it to the cloud.
The utility is a local executable, so the user does not need to worry about exposing raw data over the internet. Both the data and the executable remain on the local machine.
The utility can perform the following operations on data:
1) Access multiple data sources, including CSV and XLSX files and relational databases
2) Merge (de-normalize) data from the same source or from multiple sources
3) Remove records that meet a specific condition, such as a missing or out-of-range value
4) Create additional compound fields by concatenating several other fields
5) Perform basic transformations within a field:
- Trim spaces
- Convert case
- Remove or replace specific substrings
6) Data sanitization:
- Apply a one-way hash to a sensitive field
- Replace a real name with a fake name
- Replace a string with its first X characters followed by a mask (for example, Elk***)
- Encode the string with a password *
7) Save the result to a CSV file for review and upload
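As a rough illustration of items 5 and 6, the field-level operations might look like the following Python sketch. The function names (clean_field, hash_field, mask_field), the choice of SHA-256, the optional salt, and the fixed three-asterisk mask are assumptions made for illustration; this document does not specify how DM implements them.

```python
import hashlib


def clean_field(value: str) -> str:
    """Basic in-field transformations: trim spaces, then convert to upper case."""
    return value.strip().upper()


def hash_field(value: str, salt: str = "") -> str:
    """One-way hash: SHA-256 hex digest of salt + value (algorithm is an assumption)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def mask_field(value: str, keep: int = 3, mask: str = "***") -> str:
    """Keep the first `keep` characters and append a fixed mask.

    A fixed-length mask is used here so the output does not reveal
    the length of the original value (a design assumption).
    """
    return value[:keep] + mask
```

With these definitions, mask_field("Elkhart") yields Elk***, and hash_field returns the same 64-character digest every time it is given the same input, so hashed fields can still serve as join keys.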
These reduce to the following key types of operations:
A) Extract (load) a dataset. This includes connecting to a DBMS and running a query, or loading a file.
B) Join several datasets into one via key fields
C) Filter a dataset by a field value
D) Create a compound field
E) Transform a field value
F) Generate a CSV file
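Operations B, C, and D can be sketched on in-memory rows as shown below. This is a minimal sketch, not DM's implementation: the row representation (a list of dicts), the function names, and the choice of an inner join are all assumptions, since the document does not state which join type DM uses.

```python
from typing import Callable

Row = dict[str, str]


def filter_rows(rows: list[Row], keep: Callable[[Row], bool]) -> list[Row]:
    """Operation C: keep only the rows for which the predicate returns True."""
    return [row for row in rows if keep(row)]


def add_compound_field(
    rows: list[Row], name: str, parts: list[str], sep: str = " "
) -> list[Row]:
    """Operation D: add field `name` by joining the `parts` fields with `sep`."""
    return [{**row, name: sep.join(row[p] for p in parts)} for row in rows]


def join_rows(
    left: list[Row], right: list[Row], left_key: str, right_key: str
) -> list[Row]:
    """Operation B: inner join of `left` and `right` on the given key fields.

    If both sides share a field name, the value from `right` wins.
    """
    # Index the right-hand dataset by key so each left row is matched quickly.
    index: dict[str, list[Row]] = {}
    for right_row in right:
        index.setdefault(right_row[right_key], []).append(right_row)
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in index.get(left_row[left_key], [])
    ]
```

Each function returns a new list rather than modifying its input, so the operations can be chained in any order, which matches the idea that the user may run them in any sequence.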
The user can perform these operations in any sequence, for example:
a) Connect to DBMS1 and execute a query “SELECT…” to get dataset A.
b) Load the file File.csv to get dataset B.
c) Trim spaces from field A.f1.
d) Create a new field A.f3 by concatenating A.f1, a space, and A.f2.
e) Join dataset A to B via the key fields A.f3 and B.f1.
f) Hash field A.f4
g) Mask field B.f2, keeping only its first three characters, like Elk***.
h) Produce a CSV file with fields A.f1, A.f2, …, B.f1, B.f2…
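The example sequence above could be sketched end to end as follows. This is an illustration under several assumptions, not a description of DM itself: SQLite stands in for DBMS1, dataset B is passed as CSV text rather than read from File.csv, the join is an inner join, SHA-256 is the hash, and the output column list is one possible choice because the source leaves the field list open. The function name run_pipeline is hypothetical.

```python
import csv
import hashlib
import io
import sqlite3


def run_pipeline(conn: sqlite3.Connection, query: str, csv_text: str) -> str:
    """Run steps a) to h) and return the resulting CSV as a string."""
    # a) Execute the query to get dataset A as a list of dicts.
    cursor = conn.execute(query)
    columns = [d[0] for d in cursor.description]
    a_rows = [dict(zip(columns, record)) for record in cursor]

    # b) Parse the CSV text to get dataset B.
    b_rows = list(csv.DictReader(io.StringIO(csv_text)))

    for a in a_rows:
        a["f1"] = a["f1"].strip()           # c) trim spaces from A.f1
        a["f3"] = a["f1"] + " " + a["f2"]   # d) compound field A.f3

    # e) Index B by its key field B.f1 for the join on A.f3 = B.f1.
    b_index: dict[str, list[dict]] = {}
    for b in b_rows:
        b_index.setdefault(b["f1"], []).append(b)

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["A.f1", "A.f2", "A.f4", "B.f1", "B.f2"])
    for a in a_rows:
        for b in b_index.get(a["f3"], []):
            # f) one-way hash of A.f4
            hashed_f4 = hashlib.sha256(a["f4"].encode("utf-8")).hexdigest()
            # g) keep the first three characters of B.f2, mask the rest
            masked_f2 = b["f2"][:3] + "***"
            # h) write the selected fields to the CSV output
            writer.writerow([a["f1"], a["f2"], hashed_f4, b["f1"], masked_f2])
    return out.getvalue()
```

To try it, create an in-memory database with sqlite3.connect(":memory:"), add a table with text columns f1, f2, and f4, and pass a query selecting those columns together with CSV text whose header row is f1,f2.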