Work with RSS feeds using PHP and cURL

cURL, or Client URL Library, is a very powerful tool, and it's something I recently had to use while working with two APIs, one for Unfuddle and one for HelpSpot.

PHP supports libcurl, a library created by Daniel Stenberg, that allows you to connect to and communicate with many different types of servers using many different types of protocols. libcurl currently supports the http, https, ftp, gopher, telnet, dict, file, and ldap protocols. libcurl also supports HTTPS certificates, HTTP POST, HTTP PUT, FTP uploading (this can also be done with PHP’s ftp extension), HTTP form based upload, proxies, cookies, and user+password authentication.

These functions were added in PHP 4.0.2.

You can interact with most APIs using cURL. At first it seems intimidating, but once you start using it you become comfortable with the setup.

Let's get started by parsing an RSS feed.

$ch = curl_init();

curl_init() initializes a new session and returns a cURL handle for use with the curl_setopt(), curl_exec(), and curl_close() functions. We assign the result of curl_init() to $ch to get things rolling.
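As a small aside, curl_init() also accepts the URL as an optional argument, and it is worth confirming the extension is loaded before relying on it. A minimal sketch, assuming a placeholder feed URL that is not from the original code:

```php
<?php
if (!function_exists('curl_init')) {
    // The cURL extension is not compiled in or not enabled in php.ini.
    throw new RuntimeException('The cURL extension is not available');
}

// Placeholder URL for illustration only.
$feed = 'http://example.com/feed.xml';

// Passing the URL here has the same effect as setting CURLOPT_URL afterwards.
$ch = curl_init($feed);

if ($ch === false) {
    throw new RuntimeException('Could not initialise a cURL session');
}
```

The rest of this walkthrough sets the URL separately with curl_setopt(), which works just as well.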

// $feed holds the feed URL and $useragent a user agent string; define both before this point
curl_setopt($ch, CURLOPT_URL, $feed);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);

Just like the name says, curl_setopt() sets an option on the given cURL session handle. There are far too many options to list here, so you are better off reading the PHP manual (linked under Further Reading). In our example we need to access a URL and bring back the data.

CURLOPT_URL – This will be the location of the RSS file.
CURLOPT_HEADER – We set this to 0 or false as we do not want to see the header information.
CURLOPT_RETURNTRANSFER – This will transfer the data back as a string instead of outputting directly to the browser.
CURLOPT_USERAGENT – Not strictly required. This sets the contents of the “User-Agent: ” header sent with the HTTP request.
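
The same four options can also be set in a single call with curl_setopt_array(), which takes an array keyed by option constant. A minimal sketch, assuming a placeholder feed URL and a made-up user agent string (neither comes from the original code):

```php
<?php
// Placeholder values for illustration only.
$feed      = 'http://example.com/feed.xml';
$useragent = 'MyFeedReader/1.0';

$ch = curl_init();

// curl_setopt_array() returns true only if every option was set successfully.
$ok = curl_setopt_array($ch, array(
    CURLOPT_URL            => $feed,
    CURLOPT_HEADER         => false,      // leave response headers out of the output
    CURLOPT_RETURNTRANSFER => true,       // return the body as a string
    CURLOPT_USERAGENT      => $useragent, // sent as the User-Agent header
));

if (!$ok) {
    echo "Could not set one of the cURL options\n";
}
```

Whether you prefer one call or four is mostly a matter of taste; the array form keeps the options together when the list grows.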

$rss = curl_exec($ch);
curl_close($ch);

Once our options are set, the next step is to execute the request using curl_exec(). We assign the return value of curl_exec() to $rss, giving the variable the contents of our lookup.

For good measure we close the connection using the curl_close() function.
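
One thing the snippet above does not do is check whether the request worked: curl_exec() returns false on failure. A hedged sketch of how that check might look, wrapped in a hypothetical helper called fetchFeed() (the name, the 10 second timeout, and the status check are my assumptions, not part of the original code):

```php
<?php
/**
 * Fetch a URL with cURL and return the body, or null if the transfer failed.
 * fetchFeed() is a hypothetical helper name used for illustration.
 */
function fetchFeed($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10); // assumed limit: give up after 10 seconds

    $body = curl_exec($ch);

    if ($body === false) {
        // curl_errno() and curl_error() describe the last failure on this handle.
        error_log('cURL error ' . curl_errno($ch) . ': ' . curl_error($ch));
        curl_close($ch);
        return null;
    }

    // For HTTP transfers, also treat 4xx and 5xx responses as failures.
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    return ($status >= 400) ? null : $body;
}
```

A caller would then test the result before parsing, for example `if (($rss = fetchFeed($feed)) !== null) { ... }`.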

Since we know RSS feeds are made of XML, we can go ahead and do some simple parsing of the XML.

// Manipulate string into object
$rss = simplexml_load_string($rss);

simplexml_load_string() is available in PHP 5 and later. It takes a well-formed XML string and returns it as an object (or false if the string cannot be parsed), which makes it easy for us to work with.
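
To see simplexml_load_string() on its own, without a network request, here is a minimal sketch that parses a tiny hand-written feed and handles the failure case. The sample XML is made up for illustration:

```php
<?php
// A tiny hand-written feed, used here in place of a real download.
$xml = '<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample Feed</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <description>Hello world</description>
    </item>
  </channel>
</rss>';

// Collect parse errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$rss = simplexml_load_string($xml);

if ($rss === false) {
    // Print each problem libxml found in the input.
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), "\n";
    }
} else {
    // Elements are SimpleXMLElement objects; cast to string to get the text.
    $feedTitle = (string) $rss->channel->title;
    echo $feedTitle, "\n";
}
```

Checking for false matters in practice because a failed download or an HTML error page will not parse as XML.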

I have talked about simplexml_load_string() in Google HTTP Geocoding.

Since my goal was to introduce you to cURL, I’m not going to go into much detail with the XML parsing. $rss now holds an object. I usually use print_r($rss) to see its structure before I work with it, which makes the element names easy to find.

$cnt = count($rss->channel->item);

for($i=0; $i<$cnt; $i++)
{
  $url = $rss->channel->item[$i]->link;
  $title = $rss->channel->item[$i]->title;
  $desc = $rss->channel->item[$i]->description;
  echo $title;
  echo $desc;
}

The above code counts the number of items and loops through them one at a time, printing the data to the screen. No real magic here.
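
The same loop can be written with foreach, which avoids the counter. This sketch also casts each element to a string and escapes it with htmlspecialchars() before building output. The sample feed is hand-written for illustration, and the escaping is my own suggestion; feeds that deliberately put HTML in the description would show the tags as text with this approach.

```php
<?php
// Hand-written sample feed standing in for the downloaded one.
$rss = simplexml_load_string(
    '<rss version="2.0"><channel>'
    . '<item><title>Tips &amp; tricks</title><link>http://example.com/a</link>'
    . '<description>First item</description></item>'
    . '<item><title>Second</title><link>http://example.com/b</link>'
    . '<description>Second item</description></item>'
    . '</channel></rss>'
);

$lines = array();

foreach ($rss->channel->item as $item) {
    // Cast to string, then escape so feed text cannot inject markup into the page.
    $url   = htmlspecialchars((string) $item->link);
    $title = htmlspecialchars((string) $item->title);
    $desc  = htmlspecialchars((string) $item->description);

    $lines[] = '<a href="' . $url . '">' . $title . '</a>: ' . $desc;
}

echo implode("\n", $lines), "\n";
```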

Below is the complete source code for your coding pleasure.

function feedMe($feed) {
  // Placeholder user agent string (an assumption); it must be defined before it is used below
  $useragent = "feedMe/1.0";

  // Use cURL to fetch text
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_URL, $feed);
  curl_setopt($ch, CURLOPT_HEADER, 0);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
  curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
  $rss = curl_exec($ch);
  curl_close($ch);

  // Manipulate string into object
  $rss = simplexml_load_string($rss);

  $siteTitle = $rss->channel->title;
  echo $siteTitle;


  $cnt = count($rss->channel->item);

  for($i=0; $i<$cnt; $i++) {
    $url = $rss->channel->item[$i]->link;
    $title = $rss->channel->item[$i]->title;
    $desc = $rss->channel->item[$i]->description;
    echo $title;
    echo $desc;
  }
}

feedMe("http://feeds.feedburner.com/adampatterson/");

cURL is a powerful tool. You can interact with APIs as well as with sites or services that don’t have an API, like Google Analytics. So play with it and make a Twitter feed parser.

Further Reading:
http://ca.php.net/curl

Tutorial: FTP Upload via cURL


