美文网首页
Snowflake(2)

Snowflake(2)

作者: 山猪打不过家猪 | 来源:发表于2024-02-18 04:53 被阅读0次

    1.timetravel

    1.什么是time travel

    image.png
    • 使用timetravle可以准确的回到day10的数据快照,并且我们可以查看过去删除或者更改的数据

    2.给表设置time travel的时间

    • 设置employees表为90天
    create or replace table employees(employee_id number,
                         salary number,
                         manager_id number)
                       data_retention_time_in_days=90;
    
    • 更改employees表为10天
    alter table employees set data_retention_time_in_days=30;
    

    3.查询历史数据(时间和history id)

    1.查询出当前utc时间

    select current_timestamp();
    ALTER SESSION SET TIMEZONE = 'UTC';
    

    2.根据当前的utc时间-1,既可获得一天前的该表数据快照,也可以调整分,获得几分钟前的数据快照

    - 方法一:
    select * from employees before(timestamp => '2020-09-10 20:50:28.944 +0000'::timestamp);
    - 方法二: 查询5分钟之前的数据
    select * from employees at(offset => -60*5);
    

    3.可以根据历史记录里的query id 查询历史快照


    image.png
    select * from employees before(statement => '01b270d2-0001-1b82-0000-0000deb9f1dd');
    

    4. clone历史数据

    方法一:使用timestamp

    create table restored_table clone employees
      at(timestamp => '2024-02-18 21:08:34.470 +0000'::timestamp);
    

    方法二:offset

    create table restored_table_v2 clone employees
      at(offset => -60*30);
    

    方法三:使用query id

    create table restored_table_v5 clone employees 
    before(statement => '01b270d6-0001-1b8b-0000-deb900010056');
    
    • 也可以恢复schema 或者database
    - 恢复schema 半小时之前
    create schema restored_schema clone employee_perm at(offset => -60 *30);
    -恢复database,通过ID
    create database restored_db clone demo_db
      before(statement => '0196d7b8-00d6-37a7-0000-45750002d1ce');
    

    5. 使用time travel恢复数据

    1.删除所有的database,schema,table

    drop database development;
    
    drop schema demo_db.employee;
    
    drop table demo_db.employee_perm.employees;
    

    2.恢复

    undrop table demo_db.employee_perm.employees;
    
    undrop schema demo_db.employee;
    
    undrop database development;
    

    2. task in snowflake

    pass(以后用airflow调用)

    3.Stream in snowflake

    • Stream主要是用来跟踪,表的变化情况(to track changes to data)

    3.1创建一个stream,并且插入数据

    1.创建employees表的stream

    create or replace stream employees_stream on table employees;
    

    2.查看stream的属性

    --查看具体表的stream
    DESC stream employees_stream
    --查看所有的stream
    show streams
    

    3.查看stream 的offset时间

    --查看offset
    SELECT SYSTEM$STREAM_GET_TABLE_TIMESTAMP('employees_stream');
    --转换时间戳为了更加好看
    SELECT to_timestamp(SYSTEM$STREAM_GET_TABLE_TIMESTAMP('employees_stream')) as stream_offset;
    

    4.给表新增数据

    --增加新的数进去
    insert into employees values(101,10000,4),(102,20000,5),(103,30000,6);
    

    5.查看stream

    --查看stream里的数据
    select * from employees_stream
    
    image.png
    • 此时,我们刚才对表进行的操作就可以看到
      6.将更改的数据存到另外一张表里,此时将消耗stream,stream的offse发生更改,employees_stream表里的数据将为空
    --创建customer表
    create or replace table employees_consumer(employee_id number,
                         salary number);
    --从employees_stream表里复制数据
    insert into employees_consumer select employee_id,salary from employees_stream
    --查看数据
    select * from employees_consumer
    
    image.png

    7.此时查看偏移量,发现已经发生了变化,即我们inster 数据的时候


    image.png

    3.2 stream的更新数据

    1.查看employees_stream,这时显示为空,因为上次我们已经迁移了数据,所有消耗了stream

    select * from  employees_stream
    

    2.更新数据

    update employees set salary = salary + 10000 where salary < 33000;
    

    3.查看employees_stream表,我们发现更新过的旧数据显示为delete,新的数据显示为insert,但是后面的isupdate显示了时update操作

    select * from  employees_stream
    
    image.png

    4.根据stream表里字段消耗stream

    --根据stream表里的字段,消耗stream
    insert into employees_consumer select employee_id, salary 
                                    from employees_stream 
                                    where METADATA$ACTION = 'INSERT' and METADATA$ISUPDATE = 'TRUE';
    

    3.3 使用stream捕获delete操作

    DELETE FROM employees WHERE SALARY < 40000;
    select * from  employees_stream
    DELETE FROM employees_consumer WHERE EMPLOYEE_ID IN (select DISTINCT employee_id
                                    from employees_stream 
                                    where METADATA$ACTION = 'DELETE' and METADATA$ISUPDATE = 'FALSE');
    

    4. Zero-copy cloning

    1.可以将产品的数据库模式 完整的复制到开发环境,也可以dev 到prod
    2.也可用于备份data storage objects:database,schema,table and streams;也可以备份data configuration: stage, file format,tasks,sequences.
    3.由于他是一种数据快照,所以备份和克隆的空间占用是不会消费信用的,他的reference指向的是同一metadata
    4.但是如果对克隆的表进行了更改,就需要支付费用,应为此时的metadata的引用发生了改变
    5.clone只是一个快照,所以更改原表不会影响clone,更改clone也不会影响原表

    4.1 clone databse

    1.create a temporary table and insert some data

    create or replace TEMPORARY table EMPLOYEES_TEMP(employee_id number,
                         empl_join_date date,
                         dept varchar(10),
                         salary number,
                         manager_id number);
                                              
    insert into EMPLOYEES_TEMP values(8,'2014-10-01','HR',40000,4),
                                     (12,'2014-09-01','Tech',50000,9),
                                     (3,'2018-09-01','Marketing',30000,5),
                                     (4,'2017-09-01','HR',10000,5),
                                     (25,'2019-09-01','HR',35000,9),
                                     (12,'2014-09-01','Tech',50000,9),
                                     (86,'2015-09-01','Tech',90000,4),
                                     (73,'2016-09-01','Marketing',20000,1);
    
    1. clone DEMO_DB
    CREATE or REPLACE DATABASE demo_db_clone clone DEMO_DB
    
    image.png

    4.2 clone schema

    1. create clone syntax,it can only clone permenant table and transient table
    CREATE or REPLACE SCHEMA json_data_clone clone json_data
    
    image.png

    4.3 clone tables

    • when cloning a table, we should add a keyword
    -- clone permenant table
    create table employees_clone clone demo_db.public.employees;
    --clone temporary table
    create temporary table employees_temp_clone clone demo_db.employee_perm.employees_temp;
    --clone transient table 
    create transient table employees_transient_clone clone demo_db.employee_perm.employees_transient;
    

    4.4 cloning based on time travel

    1. timestamp
    create table restored_table clone employees
      at(timestamp => '2020-09-10 21:06:16.694 +0000'::timestamp);
    
    1. offset
    create schema restored_schema clone employee_perm at(offset => -600);
    

    3.query id

    create database restored_db clone demo_db
      before(statement => '0196d7b8-00d6-37a7-0000-45750002d1ce');
    

    4.5 swap tables

    you can use swap table easily exchange product table and dev table easily

    ALTER TABLE employees SWAP WITH employees_consumer;
    

    5.Data sharing

    1. snowflake users can share data to me, I also could share data with otherr snowflake users
      2.For non snowflake users, we can create a reader account and share data
    2. providers :create share ojbect and share data to others. customers: who is cosuming the shared data

    5.1 create share

    1. create share named employeees_share
    --create employees_share objects
    create share employees_share;
    
    --grant databse 
    grant usage on database demo_db to share employees_share;
    
    --grant schema 
    grant usage on schema demo_db.public to share employees_share;
    
    --grant table
    grant select on table demo_db.public.RESTORED_TABLE to share employees_share;
    
    1. check share
    show grants to share employees_share;
    
    image.png

    3.check all shares

    show shares 
    
    1. add other account
    ALTER SHARE CUST_DATA_SHARE ADD ACCOUNT = zx14140
    
    1. an easy way to share all the tables in a schema/database
    // How to share complete schema
    GRANT SELECT ON ALL TABLES IN SCHEMA demo_db  TO SHARE CUST_DATA_SHARE;
    
    // How to share complete database
    GRANT SELECT ON ALL TABLES IN DATABASEdemo_db TO SHARE CUST_DATA_SHARE;
    

    5.2 consume shares

    1.check shares by share name

    show shares;
    
    DESC share EMPLOYEES_SHARE
    
    image.png
    1. consume shared database
    CREATE DATABASE DEMO_DB_SHARED FROM SHARE EMPLOYEES_SHARE
    

    5.3 revoke shares and drop shares

    • revoke a table or schema
    revoke usage on schema public from share EMPLOYEES_SHARE
    
    • drop share
    drop share employees_share
    

    6. metarialized view & secure view

    6.1

    • When dealing with large volumes of data that are frequently queried, we can utilize views, such as historical sales data, which consist of unchanging historical data.
    • Can not use join with metarialized view.

    6.1.1normal view VS metarialized view

    • create a normal view and run a query nearly take 1.5s at first time, when excuting second time, the query is from the cache, it takes 73ms. Most of the time, normal view goes on compilation rather than execution.
    --normal/regular
    create or replace view VW_ORDERS as
    select O_ORDERSTATUS, O_ORDERPRIORITY, O_CUSTKEY, COUNT(O_ORDERKEY) as TOTAL_ORDERS
    from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF10"."ORDERS"
    group by O_ORDERSTATUS, O_ORDERPRIORITY, O_CUSTKEY;
    --excute query
    select * 
    from VW_ORDERS 
    where O_ORDERPRIORITY = '1-URGENT'
    and TOTAL_ORDERS > 4;
    
    • create a materialized view
    create or replace materialized view MVW_ORDERS as
    select O_ORDERSTATUS, O_ORDERPRIORITY, O_CUSTKEY, COUNT(O_ORDERKEY) as TOTAL_ORDERS
    from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF10"."ORDERS"
    group by O_ORDERSTATUS, O_ORDERPRIORITY, O_CUSTKEY;
    

    6.1.2 list all views

    • List all the existing views including regular and materialized views.
    show views
    
    • List existing materialized views only
    show materialized views
    

    6.2 secure view

    • We can use secure views by controlling the privilege.

    6.2.1

    • creata secure view
    create or replace SECURE view MVW_ORDERS as
    select O_ORDERSTATUS, O_ORDERPRIORITY, O_CUSTKEY, COUNT(O_ORDERKEY) as TOTAL_ORDERS
    from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF10"."ORDERS"
    group by O_ORDERSTATUS, O_ORDERPRIORITY, O_CUSTKEY;
    
    • privileges usage of view by using secure views
    --create a role should not allow to access to the underlying data 
    create or replace role analyst;
    
    --grant privileges of database, schema and views yo analyst role 
    grant usage on database demo_db to role ANALYST;
    grant usage on schema public to role ANALYST;
    grant select on demo_db.public.employees to role ANALYST;
    grant role ANALYST to user babypig521;
    
    • drop view
    drop meterialized view MVW_ORDERS 
    

    7.Grant privilege

    Display only privileged tables for different user logins.
    1.create roles

    create or replace role HUMAN_RESOURCE;
    create or replace role TECHNOLOGY;
    create or replace role MARKETING;
    

    2.create user fxx

    create or replace user fxx password = 'temp123' default_Role = 'HUMAN_RESOURCE';
    

    3.grant HUMAN_RESOURCE to fxx

    grant role HUMAN_RESOURCE to user john;
    
    1. create schema for employee
    create schema demo_db.employee
    

    5.create a table and insert some data

    create or replace table demo_db.employee.employees(employee_id number,
                         empl_join_date date,
                         dept varchar(10),
                         salary number,
                         manager_id number);
    
    insert into demo_db.employee.employees values(1,'2014-10-01','HR',40000,4),
             (2,'2014-09-01','Tech',50000,9),
             (3,'2018-09-01','Marketing',30000,5),
             (4,'2017-09-01','HR',10000,5),
             (5,'2019-09-01','HR',35000,9),
             (6,'2015-09-01','Tech',90000,4),
             (7,'2016-09-01','Marketing',20000,1);
    
    1. grant usage of warehouse/database/schema for HUMAN_RESOURCE
    --warehouse
    grant usage on warehouse compute_Wh to role HUMAN_RESOURCE;
    --database
    grant usage on database demo_db to role HUMAN_RESOURCE;
    --schema 
    grant usage on schema employee to role HUMAN_RESOURCE;
    

    7.create a secure view based on login acount role.

    create or replace secure view vw_employee as
    select e.*
    from "DEMO_DB"."EMPLOYEE"."EMPLOYEES" e
    where upper(e.DEPT) in (select upper(manager_role_alias)
                   from "DEMO_DB"."EMPLOYEE"."MANAGERS" m
                   where upper(manager_role_name) = upper(current_role()));
    

    8.grant view to HUMAN_RESOURCE

    grant select on view "DEMO_DB"."EMPLOYEE"."VW_EMPLOYEE" to role HUMAN_RESOURCE;
    
    1. check current user
    SELECT CURRENT_USER();
    
    1. grant HUMAN_RESOURCE to current user
    grant role HUMAN_RESOURCE to user BABYPIG521
    
    1. user HUMAN_RESOURCE role or login fxx account
    use role HUMAN_RESOURCE
    

    8.Data masking

    pass

    相关文章

      网友评论

          本文标题:Snowflake(2)

          本文链接:https://www.haomeiwen.com/subject/ohnuadtx.html