Meme Generator Natural Language Processing and Sentiment Analysis

Using Python and SQL to wrangle meme text, images, and sentiment lexicons for 82 languages into a relational MySQL database for language and sentiment analysis. Visualization with Tableau.

See this project’s GitHub repository and documentation.

English, Spanish, and Russian memes



Project Summary

Working with the Python langid library and sentiment lexicons in 81 languages developed by the SUNY Data Science Lab, my partner, Liz Seeley, and I built a relational database of 57,000 meme instances based on more than 800 basememes. We probabilistically identified the language of each meme and scanned its caption for positive and negative words, scoring each positive word +1 and each negative word -1. A SQL query then calculated an overall sentiment score for each meme and basememe, both within and across languages.
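
To make the scoring step concrete, here is a minimal sketch of the idea, not the project's actual code. It assumes a tiny made-up lexicon and four invented captions, hard-codes each caption's language (in the project that label came from langid), and uses Python's built-in sqlite3 as a stand-in for MySQL. The table name, column names, and the `score_caption` helper are all hypothetical.

```python
import re
import sqlite3

# Hypothetical mini-lexicons standing in for the real sentiment lexicons.
LEXICON = {
    "en": {"positive": {"good", "love", "win"}, "negative": {"bad", "hate", "fail"}},
    "es": {"positive": {"bueno", "amor"}, "negative": {"malo", "odio"}},
}


def score_caption(caption, lang):
    """Sum +1 for each positive word and -1 for each negative word."""
    words = re.findall(r"\w+", caption.lower())
    lex = LEXICON.get(lang, {"positive": set(), "negative": set()})
    return sum((w in lex["positive"]) - (w in lex["negative"]) for w in words)


# Invented sample rows: (basememe, language code, caption).
MEMES = [
    ("success-kid", "en", "Good day, big win"),
    ("success-kid", "es", "Odio el lunes"),
    ("grumpy-cat", "en", "I hate this bad idea"),
    ("grumpy-cat", "en", "Love? Fail."),
]

# In-memory SQLite database as a stand-in for the MySQL schema (assumed layout).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE meme ("
    "id INTEGER PRIMARY KEY, basememe TEXT, lang TEXT, caption TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO meme (basememe, lang, caption, score) VALUES (?, ?, ?, ?)",
    [(b, l, c, score_caption(c, l)) for b, l, c in MEMES],
)

# Sentiment per basememe within each language.
by_basememe_lang = conn.execute(
    "SELECT basememe, lang, SUM(score) FROM meme "
    "GROUP BY basememe, lang ORDER BY basememe, lang"
).fetchall()

# Sentiment per basememe across all languages.
by_basememe = conn.execute(
    "SELECT basememe, SUM(score) FROM meme GROUP BY basememe ORDER BY basememe"
).fetchall()

print(by_basememe_lang)
print(by_basememe)
```

Storing one score per meme instance keeps the aggregation in SQL: the same `SUM(score)` serves both views, and only the `GROUP BY` columns change between the within-language and across-language totals.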

Database diagram

Data wrangling flow