Oct 31 2008

Mood Adaptive Playlists

Whilst listening to my iPod on the train to work today, I had an idea, albeit probably not applicable to the iPod.

The basic idea is that most people like to listen to different genres of music at different times. Unless you have a restricted musical taste, you probably like at least a few genres, and usually you listen to them on different occasions. I think the genre and mood of the music you choose reflect what’s going on in your life and how you feel at that particular moment.

Because we express how we feel through facial expressions and our actions, a computer could theoretically detect how you’re feeling. I know this isn’t the whole story, but nothing’s black and white.

So my idea is that, using say a webcam (a Mac with an iSight would be ideal), we could combine facial recognition and emotion detection algorithms to form smart playlists that relate to your mood. This could even be combined with Genius in iTunes to form a set of songs that go well together and express your mood at the same time. I see the emotion detection algorithms being adaptive, relying on the image processing but also on user input, e.g. whether the user chooses to skip a given song.
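As a rough sketch of the adaptive part only: assume some emotion detector (not shown) hands us a mood label, and each song carries a weight per mood that drops whenever the user skips it. Everything here is made up for illustration, including the MoodPlaylist class, the mood labels and the halving penalty.

```php
<?php
// Hypothetical sketch: per-mood song weights that adapt to skips.
// The mood label would come from an emotion detector (not shown).
class MoodPlaylist
{
    private $weights = array(); // mood => (song => weight)

    // Register a song under each mood it is thought to suit.
    public function addSong($song, array $moods)
    {
        foreach ($moods as $mood) {
            $this->weights[$mood][$song] = 1.0;
        }
    }

    // The user skipped $song while in $mood: halve its weight for that mood.
    public function skipped($song, $mood)
    {
        if (isset($this->weights[$mood][$song])) {
            $this->weights[$mood][$song] *= 0.5;
        }
    }

    // Songs for the given mood, highest weight first.
    public function playlistFor($mood)
    {
        if (!isset($this->weights[$mood])) {
            return array();
        }
        $songs = $this->weights[$mood];
        arsort($songs); // sort by weight, descending, keeping song names as keys
        return array_keys($songs);
    }
}

$p = new MoodPlaylist();
$p->addSong("Song A", array("happy"));
$p->addSong("Song B", array("happy", "calm"));
$p->skipped("Song A", "happy");
print_r($p->playlistFor("happy")); // Song B now ranks above Song A
```

A real version would also need to decay old skips over time and cope with a detector that is often wrong, but the feedback loop itself is about this simple.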

There are obviously some technical and privacy issues. The first technical issue is that image processing is very processor intensive; however, with machines of the future having tens of cores, this may not be such a problem. Many users may also not want a computer detecting how they feel, or attempting to.

Now if only my iTunes COM interface worked properly I’d start coding it!


Sep 22 2008

Ep Guides Reminder

After regularly missing my favourite American TV shows, I decided to write a little script to remind me to watch them.

It scrapes data from epguides.com and forms an RSS feed based on what shows are going to be released in the next week (this can be changed in the parameters). Because the scraping is pretty slow and I didn’t want to use too much bandwidth, I set up a cron job and piped the output of the script to a file that I’ve added to my iGoogle homepage.

Now whenever I open my homepage I can instantly see what shows that I like are going to be aired in the coming weeks!

Screen scraping is never ideal, but it usually works. Rather conveniently, the format of epguides pages is very table-like. To save writing many regular expressions that may catch the wrong data when information is missing, I split the data into its rows and columns: I match the correct div element and then process each line by splitting it into substrings. This isn’t the most efficient way, but since it only runs daily in a cron job, it’s fine for me.
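The column-splitting idea on its own can be sketched like this. It is a simplified variant, not the script's exact method: it finds the column starts with preg_match_all and offsets rather than by summing widths, and the sample rows and the columnOffsets and splitRow helper names are invented for the example.

```php
<?php
// Sketch: split a fixed-width text table into columns, using its
// underline row ("___ _____ ...") to find where each column starts.
function columnOffsets($separator)
{
    // Each run of underscores marks one column: record (start, length).
    preg_match_all('/_+/', $separator, $m, PREG_OFFSET_CAPTURE);
    $cols = array();
    foreach ($m[0] as $run) {
        $cols[] = array($run[1], strlen($run[0]));
    }
    return $cols;
}

function splitRow($line, array $cols)
{
    $cells = array();
    $last = count($cols) - 1;
    foreach ($cols as $i => $col) {
        // The last column runs to the end of the line.
        $cell = ($i == $last)
            ? substr($line, $col[0])
            : substr($line, $col[0], $col[1]);
        $cells[] = trim((string) $cell); // cast: older PHP returns false for short lines
    }
    return $cells;
}

$separator = "___ _____ __________";
$row       = "1   1-01  Pilot";
print_r(splitRow($row, columnOffsets($separator)));
```

Because every cell is cut by position, a missing value just comes back as an empty string instead of shifting the remaining columns, which is the main advantage over one big regular expression.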

<?php
$urlPrefix = "http://epguides.com/";

// The shows for the feed: display name => epguides.com page name
$toFetch = array("Prison Break" => "PrisonBreak", "Dexter" => "Dexter", "Lost" => "Lost", "NUMB3RS" => "NUMB3RS", "House, M.D." => "House", "Family Guy" => "FamilyGuy", "American Dad" => "AmericanDad", "South Park" => "SouthPark");

$dateFormat = "l jS F Y";

$minAge = time() - 172800; // two days ago
$maxAge = time() + 604800; // one week from today

require("downloader.class.php");
header('Content-type: application/xml; charset="utf-8"', true);
$list = array();
$showUrls = array(); // scraped show title => page URL, used for the links in the feed

foreach ($toFetch as $page)
{
    $url = $urlPrefix . $page . '/';

    set_time_limit(20);
    $file = getFile($url);
    $arr = performMatch($file);
    $show = $arr[0];

    $list[$show] = filter_old($arr[1]);
    $showUrls[$show] = $url;
}

$dom = new DOMDocument;
$loaded = $dom->loadXML('<rss version="0.92">
<channel>
<title>Upcoming TV Shows</title>
<description>The latest TV shows from epguides.com</description>
<link>http://epguides.com</link>
</channel>
</rss>');
if (!$loaded)
{
    echo 'Error while parsing the document';
    exit;
}

$mainRoot = simplexml_import_dom($dom);
$root = $mainRoot->channel[0];

// Group the episodes of every show by air date
$byDate = array();
foreach ($list as $id => $show)
{
    foreach ($show as $ep)
    {
        $ep["show"] = $id;
        $byDate[$ep['air-date']][] = $ep;
    }
}

// One feed item per air date, listing every episode airing that day
foreach ($byDate as $id => $date)
{
    $showElement = $root->addChild('item');
    $showElement->addChild('title', date($dateFormat, $id));
    $showElement->addChild('pubDate', date(DATE_RFC822));

    $desStr = "";
    foreach ($date as $ep)
    {
        // Turn "2-5" into "S02E05"; anything else is shown as-is
        if (preg_match("/([0-9]*)\-([0-9]*)/", $ep['ep-season'], $epSeasonSplit))
        {
            $season = sprintf("%02s", $epSeasonSplit[1]);
            $episode = sprintf("%02s", $epSeasonSplit[2]);
            $seString = ' S'.$season.'E'.$episode;
            $desStr .= $seString . " - ";
        }
        else
            $desStr .= $ep['ep-season'] . " - ";

        $desStr .= '<a href="'.$showUrls[$ep['show']].'">'.$ep['show'].'</a> - <a href="'.$ep['link'].'">'.$ep['title']."</a><br />";
    }

    $showElement->addChild('description', $desStr);
}

echo($mainRoot->asXML());

function getFile($url)
{
    $downloader = new downloader();
    $downloader->clearCache($url);
    return $downloader->get($url);
}

// Keep only the episodes airing between $minAge and $maxAge
function filter_old($arr)
{
    global $maxAge, $minAge;
    $ret = array();

    foreach ($arr as $row)
        if (($row['air-date'] >= $minAge) && ($row['air-date'] <= $maxAge))
            $ret[] = $row;

    return $ret;
}

// Returns array(show title, list of episodes) scraped from one epguides page
function performMatch($file)
{
    $matches = array();
    $epTableRegex = '/<div id="eplist">.*<pre>(.*)<\/pre>.*<\/div>/isU';
    $hLinkRegex = '/<a target="[^"]*" href="([^"]*)">([^<]*)<\/a>/';
    $titleRegex = '/<h1><a href="([^"]*)">([^<]*)<\/a><\/h1>/';

    preg_match($titleRegex, $file, $titleMatches);
    preg_match($epTableRegex, $file, $matches);

    $showTitle = isset($titleMatches[2]) ? $titleMatches[2] : "";
    $episodes = array();
    if (!isset($matches[1]))
        return array($showTitle, $episodes); // no episode table found

    $ep_table = trim($matches[1]);
    $ep_arr = explode("\n", $ep_table); // split() was removed in PHP 7

    // Find the row of underscores that sits under the column headings
    $split_line = null;
    foreach ($ep_arr as $ep)
    {
        $e = str_replace(" ", "", trim($ep));
        if (($e != "") && (strlen($e) == substr_count($e, "_")))
        {
            $split_line = rtrim($ep);
            break;
        }
    }
    $col_lengths = array(0);
    if ($split_line != null)
    {
        // The width of each run of underscores is the width of a column
        $split_arr = explode(" ", $split_line);
        foreach ($split_arr as $split_len)
        {
            $len = strlen($split_len);
            if ($len > 0)
                $col_lengths[] = $len;
        }

        // Turn the widths into start offsets (columns are one space apart)
        for ($i = 1; $i < count($col_lengths); $i++)
        {
            $currentCol = $col_lengths[$i] + $col_lengths[$i-1];
            $col_lengths[$i] = $currentCol + 1;
        }

        $part_arr = array("ep-num", "ep-season", "prod-num", "air-date", "title");
        $table_arr = array();
        foreach ($ep_arr as $line)
        {
            $line_arr = array();
            for ($i = 1; $i < count($col_lengths); $i++)
            {
                $start = $col_lengths[$i-1];
                $length = ($col_lengths[$i] - $start);
                if ($i == (count($col_lengths) - 1))
                    $length = strlen($line) - $start; // last column runs to the end of the line

                $str = substr($line, $start, $length);
                $line_arr[$part_arr[$i-1]] = trim($str);
            }
            $table_arr[] = $line_arr;
        }

        foreach ($table_arr as $row)
        {
            $airdate = strtotime($row["air-date"]);
            if (preg_match($hLinkRegex, $row["title"], $matches))
            {
                $title = $matches[2];
                $link = $matches[1];
            }
            else
            {
                // No link in the title column: fall back to the plain text
                $title = $row["title"];
                $link = "";
            }
            $epNum = str_replace(".", "", $row["ep-num"]);
            $epSeasonNum = str_replace(" ", "", $row["ep-season"]);
            $prodNum = $row["prod-num"];

            // Heading and separator rows have no episode number or date, so they are skipped
            if (($epNum != null) && ($airdate != null))
                $episodes[] = array("ep-num" => $epNum, "ep-season" => $epSeasonNum, "prod-num" => $prodNum, "air-date" => $airdate, "title" => $title, "link" => $link);
        }
    }
    return array($showTitle, $episodes);
}
?>

The downloader class simply grabs and manages files for local scraping.
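The class itself isn't listed here, so the following is only a guess at its shape: a minimal stand-in that offers the two methods the script calls, get() and clearCache(), and caches pages on disk. The cache location, the md5 file naming and the optional constructor argument are all assumptions.

```php
<?php
// Hypothetical stand-in for downloader.class.php; the real class is not shown.
class downloader
{
    private $cacheDir;

    public function __construct($cacheDir = null)
    {
        // Assumption: cache pages in an "epguides-cache" folder under the temp dir.
        $this->cacheDir = ($cacheDir !== null)
            ? $cacheDir
            : sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'epguides-cache';
        if (!is_dir($this->cacheDir)) {
            mkdir($this->cacheDir, 0777, true);
        }
    }

    // One cache file per URL, named after the URL's md5 hash.
    private function cachePath($url)
    {
        return $this->cacheDir . DIRECTORY_SEPARATOR . md5($url) . '.html';
    }

    // Forget any cached copy so the next get() fetches a fresh page.
    public function clearCache($url)
    {
        $path = $this->cachePath($url);
        if (is_file($path)) {
            unlink($path);
        }
    }

    // Return the page body, from the cache if present, otherwise by fetching it.
    public function get($url)
    {
        $path = $this->cachePath($url);
        if (is_file($path)) {
            return file_get_contents($path);
        }
        $body = file_get_contents($url);
        if ($body === false) {
            return ''; // fetch failed: hand back an empty page
        }
        file_put_contents($path, $body);
        return $body;
    }
}
```

Note that getFile() in the script calls clearCache() before every get(), so each cron run fetches fresh pages; the cache only matters if you drop that call while testing the parser.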


Aug 19 2008

My Summer

Much has happened since my last post, and I didn’t keep up the blogging – no shock there.

I managed to get my hands on a mint condition iMac for a very good price; it’s the model just before they made them silver. Nice little thing, the iMac. It would be nice to ditch the PC, but it would be too expensive to replace it with the equivalent Mac.

The first day of getting the Mac up and running saw me install Synergy and the iPhone SDK, everything I shall ever need for the Mac! Synergy is a pretty handy app, albeit a little painful to set up and install. It allows me to keep my desk clear and use the keyboard on my Windows machine to develop on the Mac too.

Within a few days development had begun on MovieStar (check out my project page) – my native IMDb search tool. Prior to this point I had no experience with Cocoa or Objective-C, and I’d only used Macs occasionally. The first few weeks of getting to grips with the new language and development environment were hard, but the odd syntax gradually came to me.

The link between Interface Builder and the code was a little hard to fathom at first: if you accidentally connect the wrong components or outlets, the app just crashes with little clue as to why. This brings me on to Xcode’s error handling – what’s going on there? The majority of crashes give you NO error message at all; Xcode just starts up GDB and shrugs.

Xcode isn’t all bad though, if you look past the fact you can’t rename a project easily. The API lookup tool is pretty handy and I do like the code auto-completion, although it’s a little hard to get used to coming from NetBeans and similar tools.

Expect to see MovieStar on the App Store at the end of September or sometime in October – I hope!