Using Sentiment Multi-label Analysis for MARVEL Character Review

egalijatov1/nlp-sentiment-analysis

This project describes a model that predicts whether a line of movie text belongs to one or more emotion classes. The model is trained on one dataset of movie lines and then used for character analysis on a second dataset of MARVEL movie lines. The character analysis explores which emotions the most important characters experience over the course of a movie. The model uses features derived from word and character n-grams, part-of-speech tags, word embeddings, and the Opinion Lexicon.
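
To make the word and character n-gram features concrete, here is a minimal pure-Python sketch. The function names, the `w`/`c` feature prefixes, and the n-gram sizes are illustrative assumptions, not taken from the project; the actual feature extraction is described in the report.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return all word n-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Return all character n-grams (spaces included) of a lowercased text."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def ngram_features(text, word_n=(1, 2), char_n=(3,)):
    """Count word and character n-grams in one sparse feature dictionary.

    The sizes in word_n and char_n are assumptions chosen for illustration.
    """
    feats = Counter()
    for n in word_n:
        feats.update(f"w{n}:{g}" for g in word_ngrams(text, n))
    for n in char_n:
        feats.update(f"c{n}:{g}" for g in char_ngrams(text, n))
    return feats


features = ngram_features("I am Iron Man")
print(features["w2:iron man"], features["c3:iro"])
```

Character n-grams can capture spelling variants and word fragments that word-level n-grams miss, which is one common reason to combine the two.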

DATA

  • The XED dataset consists of emotion-annotated movie subtitles (data/en-annotated.tsv). Movie lines in this dataset have the following distribution: [image: distribution of movie lines in the XED dataset]

  • The Marvel Universe dataset is created from the transcripts of Marvel Universe movies (data/mcu.csv). This dataset contains lines from over 600 characters. In this project only the most important ones are considered: [image: the most important characters considered]

  • GloVe - Global Vectors for Word Representation
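
The character analysis on the Marvel dataset can be sketched as follows: select the main characters and count the emotions predicted for their lines. Everything in this sketch is hypothetical. The rows and lines are invented, the emotion names are placeholders, the column layout of data/mcu.csv is not assumed, and ranking characters by number of lines is only one possible reading of "most important".

```python
from collections import Counter, defaultdict

# Invented rows: (character, line, emotion labels predicted by the model).
rows = [
    ("TONY STARK", "I built this myself.", {"joy", "trust"}),
    ("TONY STARK", "Something is coming.", {"anticipation"}),
    ("THOR", "You dare threaten me?", {"anger", "disgust"}),
    ("TONY STARK", "I told you to stay out of it.", {"anger"}),
    ("THOR", "I stand with you.", {"trust"}),
    ("AGENT", "Sir?", set()),
]


def top_characters(rows, k):
    """Return the k characters with the most lines (an assumed criterion)."""
    counts = Counter(character for character, _, _ in rows)
    return [name for name, _ in counts.most_common(k)]


def emotion_profile(rows, characters):
    """Count how often each emotion is predicted for each selected character."""
    profile = defaultdict(Counter)
    for character, _, emotions in rows:
        if character in characters:
            profile[character].update(emotions)
    return {name: dict(counts) for name, counts in profile.items()}


main = top_characters(rows, k=2)
profiles = emotion_profile(rows, set(main))
print(main)
print(profiles)
```

Because a line can carry several labels, the counts for one character can add up to more than that character's number of lines.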

METHODS

Two classification algorithms are compared: LinearRegression and LinearSVC (Linear Support Vector Classifier). To extend these to the multi-label problem, OneVsRestClassifier is used. This estimator applies the binary relevance method, which trains one binary classifier independently for each label.
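
The binary relevance setup can be sketched with scikit-learn as below. This is a minimal sketch, not the project's code: the toy texts, the two emotion labels, and the plain TF-IDF features are invented for illustration. It also swaps in LogisticRegression for the first model, on the assumption that a classifier is needed there, because a regressor does not produce per-label class decisions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# Invented toy data: each line may carry one or more emotion labels.
texts = [
    "I am so happy today",
    "This is wonderful news",
    "I am furious with you",
    "Get out, I hate this",
    "I am happy but also angry",
    "What a wonderful, infuriating day",
]
labels = [{"joy"}, {"joy"}, {"anger"}, {"anger"},
          {"joy", "anger"}, {"joy", "anger"}]

# Turn label sets into a binary indicator matrix, one column per label.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

# One pipeline per compared algorithm. OneVsRestClassifier fits one
# independent binary classifier per column of Y (binary relevance).
models = {
    "logreg": make_pipeline(
        TfidfVectorizer(),
        OneVsRestClassifier(LogisticRegression(max_iter=1000)),  # assumed stand-in
    ),
    "svc": make_pipeline(
        TfidfVectorizer(),
        OneVsRestClassifier(LinearSVC()),
    ),
}

predictions = {}
for name, model in models.items():
    model.fit(texts, Y)
    predictions[name] = model.predict(texts)  # indicator matrix, same shape as Y
    print(name, mlb.inverse_transform(predictions[name]))
```

Because each label gets its own classifier, a line can be assigned several emotions at once or none at all.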

REPORT

The file Sentiment_multi_label_MARVEL.pdf contains a detailed project description, including preprocessing, feature extraction, and a presentation of the results.
