Visualize Text Data Using Word Clouds

This example shows how to visualize text data using word clouds.

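Text Analytics Toolbox extends the functionality of the MATLAB wordcloud function. It adds support for creating word clouds directly from string arrays, and for creating word clouds from bag-of-words models and LDA topics.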

Load the example data. The file weatherReports.csv contains weather reports, including a text description and categorical labels for each event.

filename = "weatherReports.csv";
T = readtable(filename,"TextType","string");

Extract the text data from the event_narrative column.

textData = T.event_narrative;
textData(1:10)

ans = 10×1 string array
"Large tree down between Plantersville and Nettleton."
"One to two feet of deep standing water developed on a street on the Winthrop University campus after more than an inch of rain fell in less than an hour. One vehicle was stalled in the water."
"NWS Columbia relayed a report of trees blown down along Tom Hall St."
"Media reported two trees blown down along I-40 in the Old Fort area."
""
"A few tree limbs greater than 6 inches down on HWY 18 in Roseland."
"Awning blown off a building on Lamar Avenue. Multiple trees down near the intersection of Winchester and Perkins."
"Quarter size hail near Rosemark."
"Tin roof ripped off house on Old Memphis Road near Billings Drive. Several large trees down in the area."
"Powerlines down at Walnut Grove and Cherry Lane roads."

Create a word cloud from all the weather reports.

figure
wordcloud(textData);
title("Weather Reports")
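The bag-of-words and LDA inputs mentioned in the introduction can be sketched in the same way. The following is a minimal illustration, not part of the original example: it reloads the same file so that it runs on its own, applies no preprocessing, and uses an arbitrarily chosen number of topics (numTopics = 4). It assumes Text Analytics Toolbox is installed.

```matlab
% Sketch under stated assumptions; reloads the example file so the block is self-contained.
filename = "weatherReports.csv";
T = readtable(filename,"TextType","string");
textData = T.event_narrative;

% Tokenize the reports and build a bag-of-words model (word counts per document).
documents = tokenizedDocument(textData);
bag = bagOfWords(documents);

% Drop documents with no words, such as the empty report shown above.
bag = removeEmptyDocuments(bag);

% Word cloud from the bag-of-words model.
figure
wordcloud(bag);
title("Weather Reports (Bag-of-Words)")

% Fit an LDA model. The number of topics is an arbitrary choice for illustration.
numTopics = 4;
mdl = fitlda(bag,numTopics,"Verbose",0);

% Word cloud of the first LDA topic.
figure
wordcloud(mdl,1);
title("LDA Topic 1")
```

Because this sketch skips steps such as stop-word removal, common words are likely to dominate both clouds. Preprocessing the tokenized documents first would probably give more informative results.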

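Compare the words in the reports labeled "Hail" with those labeled "Thunderstorm Wind". Create a word cloud of the reports for each label, and set the word color to blue and red, respectively.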

figure
labels = T.event_type;

subplot(1,2,1)
idx = labels == "Hail";
wordcloud(textData(idx),"Color","blue");
title("Hail")

subplot(1,2,2)
idx = labels == "Thunderstorm Wind";
wordcloud(textData(idx),"Color","red");
title("Thunderstorm Wind")

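Compare the words in the reports from the states of Florida, Kansas, and Alaska. Create a rectangular word cloud of the reports for each state and draw a border around each one.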

figure
state = T.state;

subplot(1,3,1)
idx = state == "FLORIDA";
wordcloud(textData(idx),"Shape","rectangle","Box","on");
title("Florida")

subplot(1,3,2)
idx = state == "KANSAS";
wordcloud(textData(idx),"Shape","rectangle","Box","on");
title("Kansas")

subplot(1,3,3)
idx = state == "ALASKA";
wordcloud(textData(idx),"Shape","rectangle","Box","on");
title("Alaska")

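Compare the words in the reports with property damage reported in thousands of dollars to those with damage reported in millions of dollars. Create a word cloud of the reports for each amount, with highlight colors blue and red, respectively.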

cost = T.damage_property;
idx = endsWith(cost,"K");
figure
wordcloud(textData(idx),"HighlightColor","blue");
title("Reported Damage in Thousands")

idx = endsWith(cost,"M");
figure
wordcloud(textData(idx),"HighlightColor","red");
title("Reported Damage in Millions")
