Definition:
"Big Data" is data whose scale, distribution, diversity, and timeliness require the use of new technical architectures and analytics to unlock business insights.
Characteristics:
1. Data Volume
2. Processing Complexity
3. More Complex Data Structure
Data ranges from traditional structured data in databases, through semi-structured and "quasi"-structured data such as XML files and web pages, to unstructured data such as text, image, and video files.
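As a rough illustration of that range of data structures, the sketch below handles one tiny sample of each kind using only the Python standard library. All sample values and variable names are hypothetical, chosen only to show how much more work is needed to extract meaning as the structure loosens.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Structured: fixed columns, as in a database table (hypothetical rows).
structured = io.StringIO("id,name,amount\n1,alice,30\n2,bob,45\n")
rows = list(csv.DictReader(structured))
total_amount = sum(int(r["amount"]) for r in rows)

# Semi-structured: self-describing tags, but fields can vary per record.
semi_structured = (
    "<orders><order id='1'><item>book</item></order><order id='2'/></orders>"
)
root = ET.fromstring(semi_structured)
orders_with_items = [
    o.get("id") for o in root.findall("order") if o.find("item") is not None
]

# Unstructured: free text with no schema; any structure must be inferred.
unstructured = "Great book. The book arrived late, but support was great."
words = [w.strip(".,").lower() for w in unstructured.split()]
word_counts = Counter(words)

print(total_amount, orders_with_items, word_counts.most_common(2))
```

The structured sample can be summed directly by column name, the XML sample needs a check for missing fields, and the free text needs ad hoc cleaning before anything can be counted.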
Data Repositories:
Data Spreadmarts -> isolated data extraction
Warehouses -> BI and reporting; supports DBAs
Analytic Sandbox -> data replication, high performance
Business Intelligence vs. Data Science
• BI: What was...? Structured data and standard reporting
• DS: What will...? Why? Predicting and modeling
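The contrast between the two questions can be sketched on a single small series. The figures below are hypothetical, and the least-squares line is only one simple stand-in for "predicting and modeling", not a recommended forecasting method.

```python
# Hypothetical monthly sales figures (illustrative only).
monthly_sales = [100.0, 110.0, 120.0, 130.0]

# BI-style question, "what was...?": summarize what already happened.
total = sum(monthly_sales)
average = total / len(monthly_sales)

# DS-style question, "what will...?": fit a model and extrapolate.
# Here: an ordinary least-squares line through (month index, sales).
n = len(monthly_sales)
xs = list(range(n))
mean_x = sum(xs) / n
mean_y = average
slope = sum(
    (x - mean_x) * (y - mean_y) for x, y in zip(xs, monthly_sales)
) / sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

# Predicted sales for the next (not yet observed) month.
forecast_next = intercept + slope * n

print(f"BI report: total={total}, average={average}")
print(f"DS forecast for next month: {forecast_next}")
```

The first half only reports on existing records, while the second half makes a claim about data that does not exist yet, which is why it needs a model and some skepticism about its assumptions.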
Data Scientist Profile:
Quantitative; Technical; Skeptical; Curious & Creative; Communicative & Collaborative