Background
pyLDAvis is a Python module for visualizing the output of the LDA topic-model algorithm. The code in this article was adapted from a project on GitHub; many thanks to GitHub and to the generous authors of the original code!
import pandas as pd
df = pd.read_csv("C:\\Users\\Desktop\\neg.csv", encoding_errors='ignore')  # read_csv has no errors= argument; encoding_errors (pandas >= 1.3) skips undecodable bytes
print(df.head())
print(df.shape)
import jieba
jieba.load_userdict("C:\\Users\\Desktop\\中文分词词库整理\\中文分词词库整理\\百度分词词库.txt")  # load a custom user dictionary for segmentation
def chinese_word_cut(mytext):
    return " ".join(jieba.cut(mytext))  # segment the text and join the tokens with spaces
df["content_cutted"] = df.content.apply(chinese_word_cut)
print(df.content_cutted.head())
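To see what `chinese_word_cut` produces, here is a minimal sketch on an illustrative sentence (the sample text is hypothetical, not from the dataset):

```python
import jieba

sample = "我今天去北京大学参观"  # hypothetical sentence, for illustration only
print(" ".join(jieba.cut(sample)))
# jieba inserts spaces between recognized words, e.g. "我 今天 去 北京大学 参观"
```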
from sklearn.feature_extraction.text import CountVectorizer  # only the count vectorizer is used below
n_features = 1000
tf_vectorizer = CountVectorizer(strip_accents='unicode',
                                max_features=n_features,
                                stop_words='english',  # note: the built-in English list does not filter Chinese tokens; pass a custom list for Chinese stop words
                                max_df=1.0,
                                min_df=0.1)  # build the document-term (count) matrix
tf = tf_vectorizer.fit_transform(df.content_cutted)
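It can help to sanity-check the document-term matrix before fitting LDA; a small sketch using the variables defined above:

```python
# tf is a sparse documents-by-terms count matrix
print(tf.shape)                                    # (n_documents, n_kept_terms)
print(len(tf_vectorizer.get_feature_names_out()))  # vocabulary actually kept after max_df/min_df filtering
```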
from sklearn.decomposition import LatentDirichletAllocation
n_topics = 3
lda = LatentDirichletAllocation(n_components=n_topics,  # the parameter was renamed from n_topics in scikit-learn 0.19
                                max_iter=50,
                                learning_method='online',
                                learning_offset=50.,
                                random_state=0)  # train the LDA model
lda.fit(tf)
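Besides the topic-word distributions, the fitted model can assign a topic mixture to each document and report perplexity; a minimal sketch using the standard scikit-learn API:

```python
doc_topic = lda.transform(tf)   # shape (n_documents, n_topics); each row sums to 1
print(doc_topic[:5].round(3))   # topic mixture of the first five documents
print(lda.perplexity(tf))       # lower perplexity generally indicates a better fit
```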
def print_top_words(model, feature_names, n_top_words):  # print the top words for each topic
    for topic_idx, topic in enumerate(model.components_):
        print("Topic #%d:" % topic_idx)
        print(" ".join([feature_names[i]
                        for i in topic.argsort()[:-n_top_words - 1:-1]]))
        print()
n_top_words = 25
tf_feature_names = tf_vectorizer.get_feature_names_out()  # get_feature_names() was removed in scikit-learn 1.2
print_top_words(lda, tf_feature_names, n_top_words)
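The slice `topic.argsort()[:-n_top_words - 1:-1]` inside `print_top_words` takes the indices of the largest weights in descending order; a tiny NumPy example of the idiom:

```python
import numpy as np

weights = np.array([0.1, 0.7, 0.2, 0.5])
print(weights.argsort())           # ascending order of indices: [0 2 3 1]
print(weights.argsort()[:-3:-1])   # top 2 indices, descending: [1 3]
```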
import pyLDAvis  # the visualization module
import pyLDAvis.sklearn  # note: renamed to pyLDAvis.lda_model in pyLDAvis >= 3.4
data = pyLDAvis.sklearn.prepare(lda, tf, tf_vectorizer)
pyLDAvis.show(data)  # serve the interactive topic-model visualization in the browser
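`pyLDAvis.show` blocks while serving the page; to keep a shareable file instead, the library also offers `save_html` (the output filename here is just an example):

```python
pyLDAvis.save_html(data, 'lda_visualization.html')  # write the interactive visualization to a standalone HTML file
```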
![pyLDAvis topic visualization (screenshot 1)](https://img.haomeiwen.com/i1962647/f129595df3b577b3.png)
![pyLDAvis topic visualization (screenshot 2)](https://img.haomeiwen.com/i1962647/a88c01c3b7646d2b.png)