Earth Day

One of the most interesting projects I worked on during my university years was about climate change. Data science is playing a growing role in the fight against climate change: alongside traditional climate models, machine-learning solutions have been developed to predict future climate behavior. This project, however, focused on data management and visualization techniques.

My team and I selected 36 countries with specific characteristics related to the environment and climate change. We chose the 24 countries that joined Annex II, a subcategory of the Kyoto Protocol; three countries from the BRICS (we had to discard Russia and China because of Twitter limitations); and 10 countries that are active in the fight against climate change.
First, we gathered data from different sources. From the World Bank website, we extracted information about the socioeconomic situation of these countries. From the official website of the Environmental Performance Index, we retrieved some of the major indicators concerning climate and the environment. We then merged everything into a single dataset.
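
As a rough illustration of that merge step, here is a minimal sketch in plain Python. It is not the project's actual code: the `country_code` key, the column names, and the numeric values are all placeholders I made up for the example, and the real notebook may well have used a dataframe library instead.

```python
def merge_by_country(socioeconomic, environmental):
    """Inner-join two lists of row dicts on the 'country_code' key."""
    # Index the second source by country code for constant-time lookups.
    env_by_code = {row["country_code"]: row for row in environmental}
    merged = []
    for row in socioeconomic:
        match = env_by_code.get(row["country_code"])
        if match is not None:
            # Keep only countries present in both sources.
            merged.append({**row, **match})
    return merged


# Placeholder rows: the values are invented for illustration only.
world_bank_rows = [
    {"country_code": "ITA", "gdp_per_capita": 1.0},
    {"country_code": "BRA", "gdp_per_capita": 2.0},
]
epi_rows = [
    {"country_code": "ITA", "epi_score": 10.0},
    {"country_code": "IND", "epi_score": 20.0},
]

dataset = merge_by_country(world_bank_rows, epi_rows)
print(dataset)
```

Joining on a shared country code rather than on country names avoids mismatches caused by spelling differences between sources.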

Then we streamed tweets from the selected countries during World Earth Day using Tweepy and Kafka. The stream was filtered by hashtag: we chose tags such as #earthday, #climatechange, and #plasticfree. These hashtags were translated into multiple languages to capture a greater number of tweets.
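
The hashtag filter can be sketched as follows. This example deliberately leaves out Tweepy and Kafka and only shows the matching logic; the Italian and Spanish hashtags are hypothetical translations I added for illustration, not the list the project actually used.

```python
import re

# Hypothetical hashtags per language; only the English ones come from the post.
TRACKED_HASHTAGS = {
    "en": ["#earthday", "#climatechange", "#plasticfree"],
    "it": ["#giornatadellaterra", "#cambiamentoclimatico"],
    "es": ["#diadelatierra", "#cambioclimatico"],
}

HASHTAG_PATTERN = re.compile(r"#\w+")


def build_track_list(hashtags_by_language):
    """Flatten per-language hashtags into one de-duplicated, lower-cased list."""
    track_list = []
    for tags in hashtags_by_language.values():
        for tag in tags:
            tag = tag.lower()
            if tag not in track_list:
                track_list.append(tag)
    return track_list


def matches_track_list(text, track_list):
    """Return True if the text contains at least one tracked hashtag."""
    found = {tag.lower() for tag in HASHTAG_PATTERN.findall(text)}
    return not found.isdisjoint(track_list)


track_list = build_track_list(TRACKED_HASHTAGS)
print(matches_track_list("Happy #EarthDay everyone!", track_list))
print(matches_track_list("Nothing relevant here", track_list))
```

Matching is case-insensitive and works on whole hashtags, so a longer tag that merely starts with a tracked one is not counted.
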
After some preprocessing for language translation and country identification, we applied sentiment analysis to gauge how each country's population felt. Finally, we integrated the results into the main dataset.
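
To make the integration step concrete, here is a small sketch that averages per-tweet sentiment by country and attaches it to the main dataset. It assumes each tweet has already been assigned a country code and a polarity score between -1 and 1 by some sentiment tool; the field names and the sample values are hypothetical, and the averaging itself is just one plausible way to aggregate.

```python
from collections import defaultdict
from statistics import mean


def mean_sentiment_by_country(scored_tweets):
    """Average the polarity of all tweets belonging to each country."""
    scores = defaultdict(list)
    for tweet in scored_tweets:
        scores[tweet["country_code"]].append(tweet["polarity"])
    return {code: mean(values) for code, values in scores.items()}


def add_sentiment(dataset, sentiment_by_country):
    """Return new rows with a 'mean_polarity' field (None if no tweets were collected)."""
    return [
        {**row, "mean_polarity": sentiment_by_country.get(row["country_code"])}
        for row in dataset
    ]


# Invented sample data for illustration only.
scored_tweets = [
    {"country_code": "ITA", "polarity": 0.5},
    {"country_code": "ITA", "polarity": -0.25},
    {"country_code": "BRA", "polarity": 1.0},
]
main_dataset = [
    {"country_code": "ITA"},
    {"country_code": "BRA"},
    {"country_code": "IND"},
]

enriched = add_sentiment(main_dataset, mean_sentiment_by_country(scored_tweets))
print(enriched)
```

Countries without any collected tweets keep a `None` value instead of being dropped, so the main dataset retains all of its rows.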

When the data was ready, we explored and analyzed it by building visual representations with Python and Tableau. The final dashboard is available here, while the notebook covering every step is posted on my GitHub page.