Visualize Text Data Using Word Clouds
This example shows how to visualize text data using word clouds.
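Text Analytics Toolbox extends the functionality of the wordcloud (MATLAB) function. It adds support for creating word clouds directly from string arrays, and for creating word clouds from bag-of-words models and LDA topics.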
Load the example data. The file factoryReports.csv contains factory reports, including a text description and categorical labels for each event.
filename = "factoryReports.csv";
tbl = readtable(filename,'TextType','string');
Extract the text data from the Description column.
textData = tbl.Description;
textData(1:10)
ans = 10x1 string
    "Items are occasionally getting stuck in the scanner spools."
    "Loud rattling and banging sounds are coming from assembler pistons."
    "There are cuts to the power when starting the plant."
    "Fried capacitors in the assembler."
    "Mixer tripped the fuses."
    "Burst pipe in the constructing agent is spraying coolant."
    "A fuse is blown in the mixer."
    "Things continue to tumble off of the belt."
    "Falling items from the conveyor belt."
    "The scanner reel is split, it will soon begin to curve."
Create a word cloud from the reports.
figure
wordcloud(textData);
title("Factory Reports")
Compare the words in the reports with labels "Leak" and "Mechanical Failure". Create word clouds of the reports for each of these labels. Specify the word colors to be blue and magenta for each word cloud respectively.
figure
labels = tbl.Category;

subplot(1,2,1)
idx = labels == "Leak";
wordcloud(textData(idx),'Color','blue');
title("Leak")

subplot(1,2,2)
idx = labels == "Mechanical Failure";
wordcloud(textData(idx),'Color','magenta');
title("Mechanical Failure")
Compare the words in the reports with urgency "Low", "Medium", and "High".
figure
urgency = tbl.Urgency;

subplot(1,3,1)
idx = urgency == "Low";
wordcloud(textData(idx));
title("Urgency: Low")

subplot(1,3,2)
idx = urgency == "Medium";
wordcloud(textData(idx));
title("Urgency: Medium")

subplot(1,3,3)
idx = urgency == "High";
wordcloud(textData(idx));
title("Urgency: High")
Compare the words in the reports with costs reported in hundreds of dollars to the words in the reports with costs reported in thousands of dollars. Create a word cloud of the reports for each of these amounts, with highlight colors blue and red respectively.
cost = tbl.Cost;
idx = cost > 100;
figure
wordcloud(textData(idx),'HighlightColor','blue');
title("Cost > $100")
idx = cost > 1000;
figure
wordcloud(textData(idx),'HighlightColor','red');
title("Cost > $1,000")