1.timetravel
1.什么是time travel
image.png- 使用timetravle可以准确的回到day10的数据快照,并且我们可以查看过去删除或者更改的数据
2.给表设置time travel的时间
- 设置employees表为90天
create or replace table employees(employee_id number,
salary number,
manager_id number)
data_retention_time_in_days=90;
- 更改employees表为10天
alter table employees set data_retention_time_in_days=30;
3.查询历史数据(时间和history id)
1.查询出当前utc时间
select current_timestamp();
ALTER SESSION SET TIMEZONE = 'UTC';
2.根据当前的utc时间-1,既可获得一天前的该表数据快照,也可以调整分,获得几分钟前的数据快照
- 方法一:
select * from employees before(timestamp => '2020-09-10 20:50:28.944 +0000'::timestamp);
- 方法二: 查询5分钟之前的数据
select * from employees at(offset => -60*5);
3.可以根据历史记录里的query id 查询历史快照
image.png
select * from employees before(statement => '01b270d2-0001-1b82-0000-0000deb9f1dd');
4. clone历史数据
方法一:使用timestamp
create table restored_table clone employees
at(timestamp => '2024-02-18 21:08:34.470 +0000'::timestamp);
方法二:offset
create table restored_table_v2 clone employees
at(offset => -60*30);
方法三:使用query id
create table restored_table_v5 clone employees
before(statement => '01b270d6-0001-1b8b-0000-deb900010056');
- 也可以恢复schema 或者database
- 恢复schema 半小时之前
create schema restored_schema clone employee_perm at(offset => -60 *30);
-恢复database,通过ID
create database restored_db clone demo_db
before(statement => '0196d7b8-00d6-37a7-0000-45750002d1ce');
5. 使用time travel恢复数据
1.删除所有的database,schema,table
drop database development;
drop schema demo_db.employee;
drop table demo_db.employee_perm.employees;
2.恢复
undrop table demo_db.employee_perm.employees;
undrop schema demo_db.employee;
undrop database development;
2. task in snowflake
pass(以后用airflow调用)
3.Stream in snowflake
- Stream主要是用来跟踪,表的变化情况(to track changes to data)
3.1创建一个stream,并且插入数据
1.创建employees表的stream
create or replace stream employees_stream on table employees;
2.查看stream的属性
--查看具体表的stream
DESC stream employees_stream
--查看所有的stream
show streams
3.查看stream 的offset时间
--查看offset
SELECT SYSTEM$STREAM_GET_TABLE_TIMESTAMP('employees_stream');
--转换时间戳为了更加好看
SELECT to_timestamp(SYSTEM$STREAM_GET_TABLE_TIMESTAMP('employees_stream')) as stream_offset;
4.给表新增数据
--增加新的数进去
insert into employees values(101,10000,4),(102,20000,5),(103,30000,6);
5.查看stream
--查看stream里的数据
select * from employees_stream
image.png
- 此时,我们刚才对表进行的操作就可以看到
6.将更改的数据存到另外一张表里,此时将消耗stream,stream的offse发生更改,employees_stream表里的数据将为空
--创建customer表
create or replace table employees_consumer(employee_id number,
salary number);
--从employees_stream表里复制数据
insert into employees_consumer select employee_id,salary from employees_stream
--查看数据
select * from employees_consumer
image.png
7.此时查看偏移量,发现已经发生了变化,即我们inster 数据的时候
image.png
3.2 stream的更新数据
1.查看employees_stream,这时显示为空,因为上次我们已经迁移了数据,所有消耗了stream
select * from employees_stream
2.更新数据
update employees set salary = salary + 10000 where salary < 33000;
3.查看employees_stream表,我们发现更新过的旧数据显示为delete,新的数据显示为insert,但是后面的isupdate显示了时update操作
select * from employees_stream
image.png
4.根据stream表里字段消耗stream
--根据stream表里的字段,消耗stream
insert into employees_consumer select employee_id, salary
from employees_stream
where METADATA$ACTION = 'INSERT' and METADATA$ISUPDATE = 'TRUE';
3.3 使用stream捕获delete操作
DELETE FROM employees WHERE SALARY < 40000;
select * from employees_stream
DELETE FROM employees_consumer WHERE EMPLOYEE_ID IN (select DISTINCT employee_id
from employees_stream
where METADATA$ACTION = 'DELETE' and METADATA$ISUPDATE = 'FALSE');
4. Zero-copy cloning
1.可以将产品的数据库模式 完整的复制到开发环境,也可以dev 到prod
2.也可用于备份data storage objects:database,schema,table and streams;也可以备份data configuration: stage, file format,tasks,sequences.
3.由于他是一种数据快照,所以备份和克隆的空间占用是不会消费信用的,他的reference指向的是同一metadata
4.但是如果对克隆的表进行了更改,就需要支付费用,应为此时的metadata的引用发生了改变
5.clone只是一个快照,所以更改原表不会影响clone,更改clone也不会影响原表
4.1 clone databse
1.create a temporary table and insert some data
create or replace TEMPORARY table EMPLOYEES_TEMP(employee_id number,
empl_join_date date,
dept varchar(10),
salary number,
manager_id number);
insert into EMPLOYEES_TEMP values(8,'2014-10-01','HR',40000,4),
(12,'2014-09-01','Tech',50000,9),
(3,'2018-09-01','Marketing',30000,5),
(4,'2017-09-01','HR',10000,5),
(25,'2019-09-01','HR',35000,9),
(12,'2014-09-01','Tech',50000,9),
(86,'2015-09-01','Tech',90000,4),
(73,'2016-09-01','Marketing',20000,1);
- clone DEMO_DB
CREATE or REPLACE DATABASE demo_db_clone clone DEMO_DB
image.png
4.2 clone schema
- create clone syntax,it can only clone permenant table and transient table
CREATE or REPLACE SCHEMA json_data_clone clone json_data
image.png
4.3 clone tables
- when cloning a table, we should add a keyword
-- clone permenant table
create table employees_clone clone demo_db.public.employees;
--clone temporary table
create temporary table employees_temp_clone clone demo_db.employee_perm.employees_temp;
--clone transient table
create transient table employees_transient_clone clone demo_db.employee_perm.employees_transient;
4.4 cloning based on time travel
- timestamp
create table restored_table clone employees
at(timestamp => '2020-09-10 21:06:16.694 +0000'::timestamp);
- offset
create schema restored_schema clone employee_perm at(offset => -600);
3.query id
create database restored_db clone demo_db
before(statement => '0196d7b8-00d6-37a7-0000-45750002d1ce');
4.5 swap tables
you can use swap table easily exchange product table and dev table easily
ALTER TABLE employees SWAP WITH employees_consumer;
5.Data sharing
- snowflake users can share data to me, I also could share data with otherr snowflake users
2.For non snowflake users, we can create a reader account and share data - providers :create share ojbect and share data to others. customers: who is cosuming the shared data
5.1 create share
- create share named
employeees_share
--create employees_share objects
create share employees_share;
--grant databse
grant usage on database demo_db to share employees_share;
--grant schema
grant usage on schema demo_db.public to share employees_share;
--grant table
grant select on table demo_db.public.RESTORED_TABLE to share employees_share;
- check share
show grants to share employees_share;
image.png
3.check all shares
show shares
- add other account
ALTER SHARE CUST_DATA_SHARE ADD ACCOUNT = zx14140
- an easy way to share all the tables in a schema/database
// How to share complete schema
GRANT SELECT ON ALL TABLES IN SCHEMA demo_db TO SHARE CUST_DATA_SHARE;
// How to share complete database
GRANT SELECT ON ALL TABLES IN DATABASEdemo_db TO SHARE CUST_DATA_SHARE;
5.2 consume shares
1.check shares by share name
show shares;
DESC share EMPLOYEES_SHARE
image.png
- consume shared database
CREATE DATABASE DEMO_DB_SHARED FROM SHARE EMPLOYEES_SHARE
5.3 revoke shares and drop shares
- revoke a table or schema
revoke usage on schema public from share EMPLOYEES_SHARE
- drop share
drop share employees_share
6. metarialized view & secure view
6.1
- When dealing with large volumes of data that are frequently queried, we can utilize views, such as historical sales data, which consist of unchanging historical data.
- Can not use join with metarialized view.
6.1.1normal view VS metarialized view
- create a normal view and run a query nearly take 1.5s at first time, when excuting second time, the query is from the cache, it takes 73ms. Most of the time, normal view goes on compilation rather than execution.
--normal/regular
create or replace view VW_ORDERS as
select O_ORDERSTATUS, O_ORDERPRIORITY, O_CUSTKEY, COUNT(O_ORDERKEY) as TOTAL_ORDERS
from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF10"."ORDERS"
group by O_ORDERSTATUS, O_ORDERPRIORITY, O_CUSTKEY;
--excute query
select *
from VW_ORDERS
where O_ORDERPRIORITY = '1-URGENT'
and TOTAL_ORDERS > 4;
- create a materialized view
create or replace materialized view MVW_ORDERS as
select O_ORDERSTATUS, O_ORDERPRIORITY, O_CUSTKEY, COUNT(O_ORDERKEY) as TOTAL_ORDERS
from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF10"."ORDERS"
group by O_ORDERSTATUS, O_ORDERPRIORITY, O_CUSTKEY;
6.1.2 list all views
- List all the existing views including regular and materialized views.
show views
- List existing materialized views only
show materialized views
6.2 secure view
- We can use secure views by controlling the privilege.
6.2.1
- creata secure view
create or replace SECURE view MVW_ORDERS as
select O_ORDERSTATUS, O_ORDERPRIORITY, O_CUSTKEY, COUNT(O_ORDERKEY) as TOTAL_ORDERS
from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF10"."ORDERS"
group by O_ORDERSTATUS, O_ORDERPRIORITY, O_CUSTKEY;
- privileges usage of view by using secure views
--create a role should not allow to access to the underlying data
create or replace role analyst;
--grant privileges of database, schema and views yo analyst role
grant usage on database demo_db to role ANALYST;
grant usage on schema public to role ANALYST;
grant select on demo_db.public.employees to role ANALYST;
grant role ANALYST to user babypig521;
- drop view
drop meterialized view MVW_ORDERS
7.Grant privilege
Display only privileged tables for different user logins.
1.create roles
create or replace role HUMAN_RESOURCE;
create or replace role TECHNOLOGY;
create or replace role MARKETING;
2.create user fxx
create or replace user fxx password = 'temp123' default_Role = 'HUMAN_RESOURCE';
3.grant HUMAN_RESOURCE to fxx
grant role HUMAN_RESOURCE to user john;
- create schema for employee
create schema demo_db.employee
5.create a table and insert some data
create or replace table demo_db.employee.employees(employee_id number,
empl_join_date date,
dept varchar(10),
salary number,
manager_id number);
insert into demo_db.employee.employees values(1,'2014-10-01','HR',40000,4),
(2,'2014-09-01','Tech',50000,9),
(3,'2018-09-01','Marketing',30000,5),
(4,'2017-09-01','HR',10000,5),
(5,'2019-09-01','HR',35000,9),
(6,'2015-09-01','Tech',90000,4),
(7,'2016-09-01','Marketing',20000,1);
- grant usage of warehouse/database/schema for HUMAN_RESOURCE
--warehouse
grant usage on warehouse compute_Wh to role HUMAN_RESOURCE;
--database
grant usage on database demo_db to role HUMAN_RESOURCE;
--schema
grant usage on schema employee to role HUMAN_RESOURCE;
7.create a secure view based on login acount role.
create or replace secure view vw_employee as
select e.*
from "DEMO_DB"."EMPLOYEE"."EMPLOYEES" e
where upper(e.DEPT) in (select upper(manager_role_alias)
from "DEMO_DB"."EMPLOYEE"."MANAGERS" m
where upper(manager_role_name) = upper(current_role()));
8.grant view to HUMAN_RESOURCE
grant select on view "DEMO_DB"."EMPLOYEE"."VW_EMPLOYEE" to role HUMAN_RESOURCE;
- check current user
SELECT CURRENT_USER();
- grant HUMAN_RESOURCE to current user
grant role HUMAN_RESOURCE to user BABYPIG521
- user HUMAN_RESOURCE role or login fxx account
use role HUMAN_RESOURCE
8.Data masking
pass
网友评论