Phase 2 - Report: Alexander Rossa

During my second GSoC term I focused on finishing various parts of the Twitter Deescalation bot and on extending the Seed module.

Twitter bot:

  • Created a dataset of several thousand Tweets for both topic prediction (keyword-labeled and checked for correctness) and anger/participation classification (manually labeled)

  • Improved and tested the neural network models used on this dataset

  • Did some pretotyping work with the bot: I participated in online discussions while impersonating an ideal version of the bot, to see what the bot will have to deal with "in the real world". The logs are written from the bot's perspective and closely follow the actual execution of the bot

Seed module:

  • Transformed Seed into an NPM module

  • Wrote up some documentation for using Seed as an NPM module

  • Almost finished implementing conditional generation; I still need to do a bit of work on connecting all the outputs and to test the solution for correctness

The next focus for this project will be:

  • Finishing the conditional generation for Seed

  • Reworking the collected dataset a bit: it turned out that there were too many classes for too little data, so the test set accuracy plateaued at about 60% even with heavy regularization. I have collected more data for a smaller number of classes and am hand-labeling it right now

  • Testing and improving the bot in the real world

  • Retroactively rewriting the original Seed repository to use Seed as a Node module, and adding the ability to easily create Twitter bots from the Seed website https://seed.emrg.be/