PHP script that filters moods from Twitter

..how depressed is the web?

Posted by Christian Haschek on 2012-01-08

Some time ago I saw a TED Talk in which a programmer presented a Twitter feelings project.

He wrote a program that scanned Twitter for feelings and displayed them on his website with nice animations and some other eye candy.

Personally, I love data and graphing it, so I wanted to try this myself.

I wrote a PHP script that uses the twitter-php library to search Twitter for the phrase "I feel", then processes each post and strips everything but the posted feeling. In the end I want to get something like "I feel excited", but after watching the results I saw that many phrases would be much too complex to filter, like "I feel its time for a change".

So I made a "forbidden words" list: if one of these words is detected right after the "I feel" part, the script stops scanning that post.

In addition, I wrote a function that makes the script take one more word after modifiers such as "very", so that sentences like "I feel very good" are captured in full.
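
To make the two lists concrete, here is a minimal sketch of that logic as a standalone function. The name extractFeeling() and the shortened word lists are assumptions made for illustration only; the full script below uses longer lists and its own function names.

```php
<?php
// Illustrative sketch only: extractFeeling() and these shortened lists are
// hypothetical and not part of the original script.
function extractFeeling(string $text): ?string
{
    $forbidden = ['like', 'for', 'its', 'that', 'to']; // stop scanning the post
    $nextToo   = ['very', 'so', 'really', 'kinda'];    // take one more word

    // keep only letters, digits and whitespace, then split into lowercase words
    $clean = strtolower(preg_replace('/[^a-zA-Z0-9\s]/', '', $text));
    $words = preg_split('/\s+/', $clean, -1, PREG_SPLIT_NO_EMPTY);

    // look for an "i" that is directly followed by "feel" and one more word
    $count = count($words);
    for ($pos = 0; $pos + 2 < $count; $pos++) {
        if ($words[$pos] !== 'i' || $words[$pos + 1] !== 'feel') {
            continue;
        }
        $next = $pos + 2;
        if (in_array($words[$next], $forbidden, true) || is_numeric($words[$next])) {
            return null; // too complex to filter
        }
        $feeling = $words[$next];
        // while the last word taken is a modifier, append the word after it
        while (in_array($words[$next], $nextToo, true) && isset($words[$next + 1])) {
            $next++;
            $feeling .= ' ' . $words[$next];
        }
        return 'I feel ' . $feeling;
    }
    return null; // no "i feel" phrase found
}

echo extractFeeling('I feel very good today!'), "\n"; // I feel very good
```

A post such as "I feel its time for a change" returns null here, because "its" is on the forbidden list.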

So here's the script. Feel free to copy it, change it, or do whatever you like with it.

(Note: this is intended for CLI execution. If you want to run it from a web server with PHP, just remove the while(1) loop and the sleep(5) call so the search is done only once.)
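
If you would rather not edit the script for each environment, one option is to check PHP_SAPI, the built-in constant that names the interface PHP is running under. The sketch below is only an assumption about how that could look and is not part of the original script; pollRounds() is a hypothetical helper, and the callback stands in for the search-and-print step.

```php
<?php
// Hypothetical helper: runs $searchOnce repeatedly on the CLI, once elsewhere.
// $maxRounds exists only so the example can terminate; 0 means "loop forever".
function pollRounds(callable $searchOnce, string $sapi = PHP_SAPI, int $maxRounds = 0, int $delay = 5): int
{
    $rounds = 0;
    do {
        $searchOnce();
        $rounds++;
        $again = ($sapi === 'cli') && ($maxRounds === 0 || $rounds < $maxRounds);
        if ($again) {
            sleep($delay); // wait before checking Twitter again
        }
    } while ($again);
    return $rounds;
}

// Simulate a web server request: the search runs exactly once.
$rounds = pollRounds(function () { echo "searching...\n"; }, 'apache2handler');
echo $rounds, "\n"; // 1
```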

<?php
/*******************************************************
* Twitter feelings filter
* by Christian Haschek (2011/2012)
* Blog: http://futurelopment.blogspot.com/
* 
* for this to work you'll need the twitter-php class
* from http://code.google.com/p/twitter-php/
*******************************************************/

// load twitter class
include_once('twitter.class.php');

error_reporting(E_ALL ^ E_NOTICE);

//you'll have to enter your twitter api information here (as described in the twitter-php readme)
$consumerKey = "";
$consumerSecret = "";
$accessToken = "";
$accessTokenSecret = "";

$twitter = new Twitter($consumerKey, $consumerSecret, $accessToken, $accessTokenSecret);

//infinite loop that will check twitter every 5 seconds
$read = array(); //IDs of the posts that have already been handled
while(1)
{
 $results = $twitter->search('i feel'); //the search string
 foreach ($results as $result)
 {
  //id_str is unique per post, so it is a safe key for spotting duplicates
  if(!isset($read[$result->id_str]))
  {
   echo processPost($result);
   $read[$result->id_str]=1; //so that no post will be printed twice
  }
 }
 sleep(5);
}

function processPost($result)
{
 $text = $result->text;

 //post metadata, collected for later use (e.g. storing it); not needed for the output below
 $timestamp = $result->created_at;
 $user = $result->from_user;
 $realname = addslashes(utf8_encode($result->from_user_name));
 $lang = $result->iso_language_code;
 $postid = $result->id_str;
 $origtext = addslashes(utf8_encode($text));
 $image = $result->profile_image_url;

 //filter out special characters and split the text into lowercase words
 $text = preg_replace('/[^a-zA-Z0-9\s]/', '', $text);
 $text = strtolower($text);
 $arr = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

 //find the first "feel" that directly follows an "i"
 $feel = false;
 foreach (array_keys($arr, 'feel', true) as $pos)
 {
  if($pos > 0 && $arr[$pos-1] == 'i') { $feel = $pos; break; }
 }
 if($feel === false || !isset($arr[$feel+1])) return '';

 //the word after "i feel" must not be forbidden or a number
 $word = $arr[$feel+1];
 if(checkForbiddenWord($word) || is_numeric($word)) return '';

 //as long as the last word is in the "next word too" list, append the following word
 $plus = 1;
 while(checkNextToo($arr[$feel+$plus]) && !checkForbiddenWord($arr[$feel+$plus]) && isset($arr[$feel+$plus+1]))
 {
  $plus++;
  $word .= ' '.$arr[$feel+$plus];
 }

 return 'I feel '.$word."\n";
}

//I made this list by watching the results..
function checkForbiddenWord($word)
{
 //single characters are never a usable feeling
 if(strlen($word)<2) return true;

 $f = array(
  'like', 'for', 'its', 'ya', 'you', 'will', 'that', 'about', 'but', 'they',
  'is', 'at', 'to', 'dont', 'in', 'when', 'by', 'lmao', 'as', 'after',
 );

 return in_array($word,$f);
}

function checkNextToo($word)
{
 $f = array(
  'very', 'the', 'not', 'no', 'make', 'need', 'needed', 'kinda', 'kind', 'of',
  'so', 'soo', 'too', 'a', 'lot', 'less', 'alot', 'little', 'my', 'new',
  'bit', 'like', 'giving', 'really', 'your', 'much', 'i', 'because',
 );

 return in_array($word,$f);
}

The results are usually pretty depressing and look something like this:

I feel old

I feel so good

I feel sick

I feel bad

I feel so alone

I feel super

I feel sorry

I feel bad

After running it for a couple of hours, I printed out the feelings people had posted, with each word's font size in pixels equal to the number of times it was captured. So if "happy" was captured 10 times, it got a font size of 10px. This is my result ("bad" has a font size of 289px):

[Image: Twitter feeling result]
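
The sizing rule described above can be sketched like this. It is not the code that produced the image; renderFeelingCloud() and the sample input are hypothetical, and the sketch assumes one captured feeling per array entry.

```php
<?php
// Hypothetical sketch: count each captured feeling and print it with a
// font size (in px) equal to the number of times it was captured.
function renderFeelingCloud(array $feelings): string
{
    $counts = array_count_values($feelings); // feeling => number of captures
    arsort($counts);                         // most frequent first
    $html = '';
    foreach ($counts as $feeling => $count) {
        $html .= sprintf(
            '<span style="font-size:%dpx">%s</span>' . "\n",
            $count,
            htmlspecialchars((string) $feeling)
        );
    }
    return $html;
}

echo renderFeelingCloud(['bad', 'old', 'bad', 'so good', 'bad']);
```

With this sample input, "bad" comes first at 3px and the other two feelings are rendered at 1px each.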

