Women in Data Science (WiDS) Conference

Author: Kayyyy | Published: 2017-03-08 12:17 | 119 views


Opening

Professor Alice Wong
Associate Dean of Science, HKU

WiDS 2017 is a collaboration among Stanford University, SAP, Google, Microsoft and Walmart Labs.
50th anniversary of the HKU Department of Statistics and Actuarial Science
The Big Data Research Cluster at HKU

Talk 1: Women in Data Science

Speaker:
Anita Varshney
Global Strategy Transformation Lead, SAP Hong Kong

WiDS

held by Stanford every February (March in Asia)
keynote speakers from various industries that are doing data science now
the largest attendance is actually in the Middle East

SAP

the world's largest provider of enterprise application software
HQ in Germany; founded in 1972
career suggestion: look for a good mentor
present in 26 industries
Real-time processes, prediction and simulation, great user experience, agility and TCO

SAP next-gen

Provides a platform for college students to present their ideas directly to business customers.
Technologies
- Machine learning
- IoT

*Amazing time management in the presentation

Talk 2: Big Data Decision Analysis

"Big data is something that breaks Microsoft Excel" (lol)

Research project - Machine Learning for Chinese Suicide Newspaper Articles Classification

Analyze how the media report suicide incidents, in order to figure out how to prevent suicide.

WiseNews database: over 220K search results for the keyword "suicide", containing 84 million terms
Big data challenges
- Noisy dataset: e.g. "suicide car bombing attack"
- Data classification
Supervised Machine Learning (use labeled articles to train)
Web interface for manual labeling
Article features extraction for ML
- Text Segmentation: Sentence -> Words -> N-grams
  - Tool: Jieba(结巴) - functionalities like MP & HMM(Hidden Markov Model)
    - State Transition Matrix: P(M|B) >> P(E|B)
- Document Representation
  - Word to Document Matrix (not very efficient)
  - Chosen approach - Word Embedding (Word2Vec)
    - each word is represented by a vector of fixed number of dimensions (usually 30-500d)
    - Neural network: used to learn the values along each dimension of the vectors; models: CBOW and Skip-Gram
    - Cosine similarity
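
The feature-extraction steps above (tokens -> n-grams, a word-to-document count vector, cosine similarity) can be sketched in a few lines. This is a minimal illustration, not the project's code: the token lists are hypothetical stand-ins for the output of a segmenter such as Jieba, and plain count vectors are used in place of trained Word2Vec embeddings.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_vector(tokens, vocabulary):
    """One row of a word-to-document matrix: term counts in vocabulary order."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical, already-segmented documents (assumed for illustration only).
doc_a = ["police", "report", "student", "suicide"]
doc_b = ["student", "suicide", "report", "published"]
doc_c = ["car", "bombing", "attack"]

# Shared vocabulary, so every document maps to a vector of the same length.
vocabulary = sorted(set(doc_a) | set(doc_b) | set(doc_c))
vec_a = count_vector(doc_a, vocabulary)
vec_b = count_vector(doc_b, vocabulary)
vec_c = count_vector(doc_c, vocabulary)

bigrams_a = ngrams(doc_a, 2)
sim_ab = cosine_similarity(vec_a, vec_b)  # documents sharing three terms
sim_ac = cosine_similarity(vec_a, vec_c)  # documents sharing no terms
```

The count vectors grow with the vocabulary, which is one way to read the "not very efficient" remark; word embeddings keep the dimension fixed regardless of vocabulary size.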
Classification (Training)
- labeled dataset: 70% for training and 30% for testing
- P(Suicide = Yes) 85.9% accuracy, P(Student = No), P(HK = Yes), ...
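
A rough sketch of the 70/30 split and the accuracy measurement, under stated assumptions: the labeled data here is synthetic, and the "classifier" is a placeholder majority-label baseline, not the model used in the talk.

```python
import random


def train_test_split(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of the items and cut it into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    if not true_labels:
        raise ValueError("accuracy is undefined for an empty test set")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Synthetic labeled set (assumption): (article_id, is_suicide_related)
labeled = [(i, i % 3 != 0) for i in range(100)]
train, test = train_test_split(labeled)

# Placeholder classifier: always predicts the majority label seen in training.
majority_label = sum(label for _, label in train) * 2 >= len(train)
predictions = [majority_label for _ in test]
test_accuracy = accuracy([label for _, label in test], predictions)
```

Accuracy is computed only on the held-out 30%, so it estimates performance on articles the classifier has not seen during training.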
Future work
- Identify any pattern of misclassification
- Increase dimensions of the word vectors
- Deep learning approach for other NLP tasks with this dataset
  - Predict the method used for suicide
  - Predict the reasons for suicide

Talk 3: Predictive Analytics

Vanessa Ko
Head of Presales, SAP Hong Kong

SAP HK

Customers: I.T., Cathay Pacific, Pizza Hut, etc.
Biggest competitor: none overall, only in some sub-areas.

Predictive Analytics

How to make use of digitized historical data
Case: Obama for America 2012
- Data source: Historical voting data, Census, Volunteer collected data, Facebook, etc;
- Segments of voters, fundraising prediction, who's persuadable?
- Data Modeling: VOTING RATE MODEL, SUPPORT RATE MODEL, Persuasive Rating, Overall score;
- Goal: Target voters, donors and volunteers -> especially swing voters (not too supportive or too opposing)
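
To make the idea of combining model outputs into a target list concrete, here is a toy sketch. Everything in it is an assumption for illustration: the `Voter` fields, the 0.35 to 0.65 "swing" band, and the `turnout * persuadability` score are hypothetical and are not the campaign's actual models or formula.

```python
from dataclasses import dataclass


@dataclass
class Voter:
    name: str
    turnout: float         # hypothetical voting-rate model output, 0..1
    support: float         # hypothetical support-rate model output, 0..1
    persuadability: float  # hypothetical persuasion rating, 0..1


def is_swing(voter, low=0.35, high=0.65):
    """Swing voter: neither too supportive nor too opposing (band is assumed)."""
    return low <= voter.support <= high


def overall_score(voter):
    """Assumed score: favors voters likely to vote and open to persuasion."""
    return voter.turnout * voter.persuadability


def target_list(voters, top_n):
    """Keep swing voters only, ranked by overall score, highest first."""
    swing = [v for v in voters if is_swing(v)]
    return sorted(swing, key=overall_score, reverse=True)[:top_n]


voters = [
    Voter("A", turnout=0.9, support=0.95, persuadability=0.1),  # firm supporter
    Voter("B", turnout=0.8, support=0.50, persuadability=0.7),  # swing
    Voter("C", turnout=0.4, support=0.45, persuadability=0.9),  # swing
    Voter("D", turnout=0.7, support=0.05, persuadability=0.2),  # firm opponent
]
targets = [v.name for v in target_list(voters, top_n=2)]
```

Filtering before ranking reflects the note above: firm supporters and firm opponents are left out regardless of how likely they are to vote.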

Article link: https://www.haomeiwen.com/subject/akdbgttx.html