ratioSentimentScores

Sentiment scores with ratio rule

Description

Use ratioSentimentScores to evaluate sentiment in tokenized text with a ratio rule. By default, the ratioSentimentScores function uses the VADER sentiment lexicon.

compoundScores = ratioSentimentScores(documents) returns sentiment scores for tokenized documents based on the ratio of positive and negative tokens. For each document where the ratio of the positive score to the negative score is larger than 1, the function returns 1. For each document where the ratio of the negative score to the positive score is larger than 1, the function returns -1. Otherwise, the function returns 0.

[compoundScores,positiveScores,negativeScores] = ratioSentimentScores(documents) also returns the sums of the positive and negative token scores of the documents, respectively.
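As a brief illustration of the three-output syntax, the following sketch requests the positive and negative sums alongside the compound score. The input strings are invented for illustration, and the numeric results are not shown because they depend on the lexicon in use.

```matlab
% Illustrative input text (not taken from the examples on this page).
str = [
    "The food was great but the service was slow."
    "An awful, disappointing experience."];
documents = tokenizedDocument(str);

% Request all three outputs; each has one element per input document.
[compoundScores,positiveScores,negativeScores] = ratioSentimentScores(documents);

% Show the three score vectors side by side, one row per document.
scores = [compoundScores(:) positiveScores(:) negativeScores(:)]
```

Inspecting the positive and negative sums can help explain why a document with mixed wording receives a compound score of 0.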

___ = ratioSentimentScores(___,Name,Value) specifies additional options using one or more name-value pairs.
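For instance, a name-value pair can raise the ratio threshold so that a document is labeled only when one score clearly dominates the other. This is a minimal sketch with an invented input string; the output is omitted because it depends on the lexicon.

```matlab
% Illustrative document containing both positive and negative wording.
documents = tokenizedDocument("The plot was good, but the ending was bad.");

% Require the dominant score to be more than twice the other score
% before the document is labeled 1 or -1.
compoundScores = ratioSentimentScores(documents,'Threshold',2)
```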

Examples

Create a tokenized document.

str = [
    "The book was VERY good!!!!"
    "The book was terrible."];
documents = tokenizedDocument(str);

Evaluate the sentiment of the tokenized documents. A score of 1 indicates positive sentiment, a score of -1 indicates negative sentiment, and a score of 0 indicates neutral sentiment.

compoundScores = ratioSentimentScores(documents)
compoundScores = 2×1

     1
    -1

Sentiment analysis algorithms rely on annotated lists of words called sentiment lexicons. For example, the ratioSentimentScores function uses a sentiment lexicon with words annotated with a sentiment score ranging from -1 to 1, where scores close to 1 indicate strong positive sentiment, scores close to -1 indicate strong negative sentiment, and scores close to zero indicate neutral sentiment.

If the sentiment lexicon used by the ratioSentimentScores function does not suit the data you are analyzing, for example, if you have a domain-specific data set such as medical or engineering data, then you can use your own custom sentiment lexicon. For an example showing how to generate a domain-specific sentiment lexicon, see Generate Domain Specific Sentiment Lexicon.

Create a tokenized document array containing the text data to analyze.

textData = [
    "This company is showing extremely strong growth."
    "This other company is accused of misleading consumers."];
documents = tokenizedDocument(textData);

Load the example domain-specific lexicon for finance data.

filename = "financeSentimentLexicon.csv";
tbl = readtable(filename);
head(tbl)
ans=8×2 table
          Token          SentimentScore
    _________________    ______________

    {'opportunities'}       0.95633
    {'innovative'   }       0.89635
    {'success'      }       0.84362
    {'focused'      }       0.83768
    {'strong'       }       0.81042
    {'capabilities' }       0.79174
    {'innovation'   }       0.77698
    {'improved'     }       0.77176

Evaluate the sentiment using the ratioSentimentScores function and specify the custom sentiment lexicon using the 'SentimentLexicon' option. A score of 1 indicates positive sentiment, a score of -1 indicates negative sentiment, and a score of 0 indicates neutral sentiment.

compoundScores = ratioSentimentScores(documents,'SentimentLexicon',tbl)
compoundScores = 2×1

     1
    -1

Input Arguments

documents: Input documents, specified as a tokenizedDocument array.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'Threshold',0.5 sets the ratio threshold to 0.5.

SentimentLexicon: Sentiment lexicon, specified as a table with the following columns:

  • Token – Token, specified as a string scalar.

  • SentimentScore – Sentiment score of the token, specified as a numeric scalar.

The default sentiment lexicon is the VADER sentiment lexicon.

Data Types: table
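A small lexicon with the required columns can also be built directly in the workspace. The following is a minimal sketch: the tokens, scores, and input sentence are hypothetical values chosen only to show the table layout, and the resulting score is not shown.

```matlab
% Hypothetical tokens and scores, chosen only to illustrate the layout.
Token = ["growth"; "strong"; "misleading"; "accused"];
SentimentScore = [0.8; 0.7; -0.9; -0.6];

% The workspace variable names become the required column names.
lexicon = table(Token,SentimentScore);

% Score a document using the custom lexicon instead of the default one.
documents = tokenizedDocument("The company is accused of misleading investors.");
compoundScores = ratioSentimentScores(documents,'SentimentLexicon',lexicon)
```

Tokens that do not appear in the custom lexicon are presumably treated as carrying no sentiment, so a lexicon this small is suitable only for experimentation.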

Threshold: Ratio threshold, specified as a nonnegative scalar.

If the ratio of the positive score to the negative score of documents(i) is larger than Threshold, then compoundScores(i) is 1. If the ratio of the negative score to the positive score of documents(i) is larger than Threshold, then compoundScores(i) is -1. Otherwise, compoundScores(i) is 0.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
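The thresholded rule above can be sketched in plain MATLAB without the toolbox. This is not the toolbox implementation. It assumes that both sums are expressed as nonnegative magnitudes, that the positive condition is checked first (relevant only for thresholds below 1, where both conditions can hold), and that zero denominators are left to MATLAB's Inf and NaN arithmetic. All variable names and values are hypothetical.

```matlab
% Summed positive and negative token scores for four hypothetical documents,
% both expressed as nonnegative magnitudes (an assumption of this sketch).
pos = [1.2; 0.3; 0.5; 0];
neg = [0.4; 0.9; 0.5; 0];
threshold = 1;

% Division by zero yields Inf (or NaN for 0/0), so no special casing is needed:
% Inf exceeds any finite threshold, and comparisons with NaN are false.
isPositive = pos./neg > threshold;
isNegative = ~isPositive & (neg./pos > threshold);

% Map the two logical masks to scores of 1, -1, or 0.
compound = double(isPositive) - double(isNegative)
% compound is [1; -1; 0; 0]
```

The third document shows that equal sums give 0 when the threshold is 1, because the comparison is strict.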

Output Arguments

compoundScores: Compound sentiment scores, returned as a numeric vector. The function returns one score for each input document.

If the ratio of the positive score to the negative score of documents(i) is larger than Threshold, then compoundScores(i) is 1. If the ratio of the negative score to the positive score of documents(i) is larger than Threshold, then compoundScores(i) is -1. Otherwise, compoundScores(i) is 0.

positiveScores: Positive sentiment scores, returned as a numeric vector. The function returns one score for each input document. The value positiveScores(i) corresponds to the positive sentiment score of documents(i).

negativeScores: Negative sentiment scores, returned as a numeric vector. The function returns one score for each input document. The value negativeScores(i) corresponds to the negative sentiment score of documents(i).

Version History

Introduced in R2019b