Data wrangling with Python and SQL: loading meme text, images, and 82 language sentiment lexicons into a relational MySQL database for language and sentiment analysis. Visualization with Tableau.
See this project’s GitHub repository and documentation.
Working with the Python langid library and sentiment lexicons in 81 languages developed by the SUNY Data Science Lab, my partner, Liz Seeley, and I created a relational database of 57,000 meme instances based on over 800 basememes. In the process, we probabilistically identified the language of each meme and analyzed its caption for positive and negative words, which we scored +1 and -1, respectively. With an SQL query, we then calculated an overall sentiment score for each meme and basememe, both within and across languages.
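To make the scoring and aggregation steps concrete, here is a minimal sketch rather than the project's actual code. It assumes each caption already carries a language code (in the project this came from langid), uses tiny made-up lexicons and captions, and swaps in Python's built-in sqlite3 module for MySQL so the example runs on its own. The table name `meme`, its columns, the basememe names, and all helper functions are hypothetical.

```python
import sqlite3

# Hypothetical miniature lexicons: language code -> (positive words, negative words).
# The real project used much larger lexicons covering many more languages.
LEXICONS = {
    "en": ({"love", "great"}, {"hate", "bad"}),
    "es": ({"amor", "bueno"}, {"odio", "malo"}),
}

# Hypothetical meme instances: (basememe, language code, caption).
MEMES = [
    ("grumpy_cat", "en", "i hate mondays"),
    ("grumpy_cat", "en", "bad day bad mood"),
    ("grumpy_cat", "es", "odio los lunes"),
    ("success_kid", "en", "great day love it"),
    ("success_kid", "es", "amor bueno"),
]


def score_caption(caption, lang):
    """Add +1 for each positive word and -1 for each negative word."""
    positive, negative = LEXICONS.get(lang, (set(), set()))
    words = caption.lower().split()
    return sum((w in positive) - (w in negative) for w in words)


def build_database(memes):
    """Load scored meme instances into an in-memory SQLite table (stand-in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE meme ("
        "id INTEGER PRIMARY KEY, basememe TEXT, lang TEXT, "
        "caption TEXT, score INTEGER)"
    )
    conn.executemany(
        "INSERT INTO meme (basememe, lang, caption, score) VALUES (?, ?, ?, ?)",
        [(b, l, c, score_caption(c, l)) for b, l, c in memes],
    )
    return conn


def basememe_scores(conn):
    """Overall sentiment per basememe, across all languages."""
    rows = conn.execute(
        "SELECT basememe, SUM(score) FROM meme GROUP BY basememe"
    ).fetchall()
    return dict(rows)


def basememe_scores_by_language(conn):
    """Overall sentiment per basememe within each language."""
    rows = conn.execute(
        "SELECT basememe, lang, SUM(score) FROM meme GROUP BY basememe, lang"
    ).fetchall()
    return {(b, l): s for b, l, s in rows}


if __name__ == "__main__":
    conn = build_database(MEMES)
    print(basememe_scores(conn))
    print(basememe_scores_by_language(conn))
```

Storing one score per meme instance keeps the aggregation in SQL: grouping by basememe alone gives the cross-language total, and adding the language column to the `GROUP BY` gives the within-language breakdown.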