Purpose of the Article: The purpose of the article is to help the database developers to protect data in Snowflake
Intended Audience: This article will help our Organizational level kind of developers working on data migration from the Salesforce to snowflake Data warehouse by using Matillion
Tools and Technology: Database Developers
Keywords: Data Masking
INDEX :
- What is Data Masking?
- Data Masking: Importance
- Type of Data Masking
- Advantages of Data Masking
- Disadvantages of Data Masking
- Conclusion
- Steps for Data Masking (Code)
What is Data Masking?
- Data Maskingmeans securing sensitive data
- Data Masking duplicates organizational Data
- It protects sensitive Data while providing the purpose alternatives
- Data Masking changes the values of Data while using the same format
- The main goal of Data Masking is to create a version that cannot be decrypted
Importance of Data Masking :
- It helps companies to stay compliant with General Data Protection Regulation (GDPR) by eliminating the risk of sensitive data exposure. Thus, it offers a competitive advantage to many organizations
- By preserving usability and consistency, it makes data useless for cyber attackers
- It ensures integrating shared data with third-party applications and cloud migrations reduces the risks
- As most companies bank on trust while dealing with Outsourced employees, Data Masking prevents data from being misused or stolen
Types of Data Masking :
- Static Data Masking (SDM)
- Dynamic Data Masking (DDM)
Static Data Masking :
- Without revealing the actual data, it mostly works on a copy of production database/s
- It uses the copied database to develop, test and train
- It keeps a backup of the production database in another environment
- While in lack of activity, it removes any unnecessary data masked and masks it
- It saves the masked copy to the location we want
Advantages of SDM :
- Due to data transformations being applied to data stores, sensitive data is permanently removed
- If an attacker compromises a stable masked database, then the sensitive data will be removed permanently
- All data changes are applied in advance so that their performance impact once the masked database is available to several functions
- With Data Masking ,we don’t need to implement detailed object-level security because all sensitivity is replaced already
Disadvantages Of SDM :
- Depending on the size of the data, masking is applied to a data store through a batch process. It takes a long time to complete
- Data Masking operates against copies of production databases. It can’t be used to protect the production database because it constantly alters
Dynamic Data Masking (DDM) :
- Masked data need not be saved in another database because DDM happens dynamically and streams data from a production system at run time
- To prevent masked data from writing back to the production system, DDM applies only read-only scenarios. For customer inquiries and medical records, DDM has role-based security for applications
- We can implement DDM using a database proxy. Database proxy means it modifies the queries that use the original database and passes the masked data to the requesting party
- We don’t need to prepare a masked database in advance, as the application station will have performance hindrances
Advantages of DDM :
- To protect sensitive data, DDM adds a layer of security and privacy control
- It protects data in reporting scenarios
- It works real-time time
- To mask complete data in advance, we do not require any upfront batch processing
Disadvantages of DDM
- Masked data can be written back to the database, corrupting it. It is not well suited to use in a dynamic environment
- By inspecting all traffics destined for the database, we will have performance overhead
- To configure masking rules, we require full mapping of applications, users, database objects, and access rights. Significant effort is required to maintain this matrix
- Proxy is a point of failure and can be avoided by users connecting to the database, exposing the original data stored in a database
Conclusion :
- DDM is not broadly applicable even though it continues to expand rapidly
- To avoid corrupting databases by accidentally writing masked data to data stores, it is best suited for read-only scenarios
- Trolleylike rolled-based security for applications, DDM may be recognized but read & write is restricted coupled with rule configuration complexity making ongoing rule management a burdening task in some instances
- From preventing unwanted access to sensitive data using methods other than SQL rewriting, DDM may include database & application firewalls, blocking, etc.
Steps For Data Masking :
Query :
use role sysadmin;
create database demo_db1;
drop table if exists demo_db1.public.student10;
CREATE TABLE EMPLOYEE1(EMPLOYEE_ID NUMBER,FIRST_NAME VARCHAR,LAST_NAME VARCHAR,EMAIL STRING,PHONE_NUMBER STRING,HIRE_DATE STRING,JOB_ID VARCHAR,SALARY NUMBER,COMMISSION_PCT DECIMAL,MANAGER_ID INTEGER,DEPARTMENT_ID INTEGER);
SELECT * FROM EMPLOYEE_info;
alter table EMPLOYEE1 rename to employee_info;
select * from demo_db1.public.student10;
use role accountadmin;
create or replace masking policy employee_details_mask as (val string) returns string ->
case
when current_role() in (‘ACCOUNTADMIN’), then val
else ‘*********’
end;
alter table if exists demo_db1.public.employee_info modify column PHONE_NUMBER set masking policy employee_details_mask;
select * from demo_db1.public.employee_info;
use role sysadmin;
select * from demo_db1.public.employee_info;
USE ROLE Accountadmin;
desc masking policy email_mask1;
alter table if exists demo_db1.public.student10 modify column name and unset the masking policy;
create or replace masking policy email_mask1 as (Val string) returns string ->
case
when current_role() is not in (‘ACCOUNTADMIN’) then Val
else ‘*********’
end;
alter table if exists demo_db1.public.student10 modify column name set masking policy email_mask1;
select * from demo_db1.public.student10;
use role sysadmin;
After doing DATA Making, Data Will be Converted Into *******
Author Bio:
Bala Gangadhar Reddy MOOLI
Associate Software Engineer, Data Engineering - Analytics
One year of experience in developing stored procedures in Oracle, and SQL servers. Designed architecture based on client requirements. Experience in creating indexes and basic knowledge of performance tuning and data migration from Excel to database.