I’m a constant user of the social bookmarking service delicious. I use it to store bookmarks for all of the interesting web sites I see, and I use it to find interesting sites other people have found. It’s a very useful service and I’ve grown to rely on it. They have, unfortunately, had some problems with downtime, slow service, and other unreliability. I’m sure that this will be diminished greatly due to their purchase by Yahoo!, and I’m not really complaining, because it is a free service.

Because I rely on it so much, I decided that I needed a way to back up my bookmarks and still have access to them anywhere with an internet connection. Exporting to an HTML file or to my Firefox bookmarks didn’t really appeal to me. So my idea was to put together a simple PHP script to load all of my delicious links into a MySQL database and provide a simple interface to browse them. Delicious provides a REST API with a call that exports all of your posts as XML: if you load http://del.icio.us/api/posts/all in your web browser and provide your delicious username and password, you get the full XML. I also took this as an opportunity to play around with some technology that I haven’t used much yet: PHP5, SimpleXML, and AJAX.
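If you would rather script the download than go through the browser, something along these lines might work. It is only a sketch: it assumes the endpoint accepts HTTP Basic authentication (which is what the browser login prompt suggests), and the function names buildDeliciousContext and downloadDeliciousPosts are made up for illustration.

```php
<?php
// Hypothetical helper: builds a stream context that sends HTTP Basic credentials.
function buildDeliciousContext($username, $password) {
    $credentials = base64_encode($username . ':' . $password);
    return stream_context_create(array(
        'http' => array(
            'method'  => 'GET',
            'header'  => "Authorization: Basic $credentials\r\n",
            'timeout' => 30,
        ),
    ));
}

// Hypothetical helper: downloads the full export and saves it to $target.
// Returns true on success, false if the request or the write fails.
function downloadDeliciousPosts($username, $password, $target = 'all.xml') {
    $context = buildDeliciousContext($username, $password);
    $xml = @file_get_contents('http://del.icio.us/api/posts/all', false, $context);
    if ($xml === false) {
        return false;
    }
    return file_put_contents($target, $xml) !== false;
}
```

Calling downloadDeliciousPosts('yourname', 'yourpassword') would then leave the export in all.xml, ready for the import script.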

Some of this was spurred on by the fact that I got a new webhosting account at Dreamhost, which supports PHP5 and has a ton of other great features. As a side note, if you’re looking for a webhosting account, you can sign up for Dreamhost’s Level 1 account (20GB space, just about unlimited bandwidth, PHP5, PHP4, Ruby on Rails, SSH, unlimited domains/subdomains, and a free 1 year domain registration) for $30 if you use the code ‘GREGPHOTO’ at checkout and sign up for a 1 year prepaid plan. OK, enough of the advertising. On to how I made the app and a bunch of code examples!

Exporting delicious posts

As mentioned above, I opened the link http://del.icio.us/api/posts/all in Firefox to download the XML and saved it as all.xml. The format is very simple; an example is shown below:

<?xml version="1.0" standalone="yes"?>
<posts update="2006-01-07T17:55:47Z" user="gneustaetter">
  <post href="http://www.subzane.com/projects.details.php?ID=12" description="subzane.com - Free PHP Scripts and Classes" hash="05d908f23adbb89ef3e97e5d3a6ffd9f" tag="php" time="2006-01-06T07:50:49Z"></post>
  <post href="http://priyadi.net/archives/2005/09/27/wordpress-plugin-code-autoescape/" description="Priyadi’s Place » Blog Archive » WordPress Plugin: Code Autoescape" hash="502441799d404abcbe08a07186c32626" tag="php wordpress" time="2006-01-06T06:47:59Z"></post>
  <post href="http://www.huddletogether.com/projects/lightbox/" description="Lightbox JS: Fullsize Image Overlays" hash="08a5a446ff39aeb04c5fdbc50d674765" tag="javascript css design" time="2005-12-30T17:58:44Z"></post>
  <post href="http://www.kayak.com/h/buzz/flights" description="Kayak Buzz popular flight searches" hash="ea74f520caad38215201d839403fc61f" tag="travel maps" time="2005-12-21T17:37:50Z"></post>
  <post href="http://wsfinder.jot.com/WikiHome" description="WikiHome - wsfinder - JotSpot" hash="0e85912cf399cb7469a00e8633d13a77" tag="api reference tools" time="2005-12-21T17:31:16Z"></post>
</posts>

So basically there is a top-level ‘posts’ element containing ‘post’ elements, each of which records the link href, the description, the tags applied to the link, the time posted, and a hash. I’ll use all of the information except for the hash.
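To make that structure concrete, here is a small sketch that reads a few of those attributes with SimpleXML. It inlines two posts from the export so that it runs on its own; the variable names are just for illustration.

```php
<?php
// Two posts from the export, inlined so the sketch does not need all.xml.
$sample = <<<XML
<?xml version="1.0" standalone="yes"?>
<posts update="2006-01-07T17:55:47Z" user="gneustaetter">
  <post href="http://www.huddletogether.com/projects/lightbox/" description="Lightbox JS: Fullsize Image Overlays" hash="08a5a446ff39aeb04c5fdbc50d674765" tag="javascript css design" time="2005-12-30T17:58:44Z"></post>
  <post href="http://www.kayak.com/h/buzz/flights" description="Kayak Buzz popular flight searches" hash="ea74f520caad38215201d839403fc61f" tag="travel maps" time="2005-12-21T17:37:50Z"></post>
</posts>
XML;

$xml = simplexml_load_string($sample);

// Attributes of the top-level 'posts' element are read with array syntax.
$user = (string)$xml['user'];

// Child 'post' elements are reached as a property and can be counted or indexed.
$postCount = count($xml->post);
$firstHref = (string)$xml->post[0]['href'];
$firstTags = (string)$xml->post[0]['tag'];

echo "$user has $postCount posts\n";
echo "First link: $firstHref (tags: $firstTags)\n";
```

The (string) casts matter because SimpleXML hands back objects rather than plain strings for attribute values.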

Parsing the XML with PHP5’s SimpleXML

One of the greatest features in PHP5 is SimpleXML, which makes access to XML incredibly simple. For tasks such as parsing the above XML, it is a great thing to use: you basically get to iterate through the XML as if it were an array. Let’s take a look at the parseXML function in my delicious import class:

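function parseXML() {
	$xml = simplexml_load_file($this->xmlfile); // SimpleXMLElement for the root <posts> element
	foreach($xml->post as $post) { // one iteration per <post> child
		$tags = (string)$post['tag']; // attributes come back as objects, so cast to a string
		$tags = explode(' ', $tags);
		foreach($tags as $tag) {
			$tag = trim($tag);
			if(!isset($this->tags[$tag])) { // tags are keyed by name, so test the key; in_array() would compare against the values
				$this->tags[$tag]['name'] = $tag;
			}
		}
		$this->posts[] = array(
			'href' => (string)$post['href'],
			'description' => (string)$post['description'],
			'tags' => $tags,
			'time' => strtotime((string)$post['time']) // ISO 8601 timestamp to Unix time
		);
	}
}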
  1. The function takes no arguments and starts off (on line 2) by loading the all.xml file (path stored as $this->xmlfile) and creating a SimpleXML object called $xml
  2. SimpleXML supports iteration, so we can loop through the ‘post’ elements as if they were an array, using the foreach construct (line 3)
  3. On lines 4 and 5, we get the tags from the tag attribute of the XML and explode them into an array by splitting wherever there are spaces, i.e. the tag string “php5 simplexml” would turn into an array with two elements, “php5” and “simplexml”
  4. From here, we loop through the tags array I just created and add each tag to a class-level array if it isn’t already in there (lines 6-11)
  5. Lastly, on lines 12-17, we append the information about the post to a class-level array that stores the information about all the posts

So in this function we took the all.xml file that had all of the delicious links, loaded it into a SimpleXML object, and populated two arrays: one holding a unique list of all the tags, the other holding information about all of the posts.
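To try the function in isolation, it can be dropped into a minimal class and pointed at a small sample file. This is only a sketch: the class name DeliciousImport and its constructor are assumptions (only parseXML and the xmlfile, tags, and posts properties come from the code above), and this version swaps explode for preg_split so that repeated spaces or an empty tag attribute do not produce empty tags.

```php
<?php
// Hypothetical wrapper class; only parseXML() and the property names
// $xmlfile, $tags and $posts come from the post, the rest is assumed.
class DeliciousImport {
    public $xmlfile;
    public $tags = array();
    public $posts = array();

    function __construct($xmlfile) {
        $this->xmlfile = $xmlfile;
    }

    function parseXML() {
        $xml = simplexml_load_file($this->xmlfile);
        if ($xml === false) {
            return false; // unreadable or malformed XML
        }
        foreach ($xml->post as $post) {
            // Split on any run of whitespace and drop empty entries.
            $tags = preg_split('/\s+/', (string)$post['tag'], -1, PREG_SPLIT_NO_EMPTY);
            foreach ($tags as $tag) {
                if (!isset($this->tags[$tag])) {
                    $this->tags[$tag]['name'] = $tag;
                }
            }
            $this->posts[] = array(
                'href' => (string)$post['href'],
                'description' => (string)$post['description'],
                'tags' => $tags,
                'time' => strtotime((string)$post['time'])
            );
        }
        return true;
    }
}

// Write two posts from the export to a temporary file so the sketch runs without all.xml.
$file = tempnam(sys_get_temp_dir(), 'del');
file_put_contents($file, '<posts update="2006-01-07T17:55:47Z" user="gneustaetter">'
    . '<post href="http://wsfinder.jot.com/WikiHome" description="WikiHome - wsfinder - JotSpot" tag="api reference tools" time="2005-12-21T17:31:16Z"></post>'
    . '<post href="http://www.subzane.com/projects.details.php?ID=12" description="subzane.com - Free PHP Scripts and Classes" tag="php" time="2006-01-06T07:50:49Z"></post>'
    . '</posts>');

$import = new DeliciousImport($file);
$import->parseXML();
unlink($file);

print_r(array_keys($import->tags)); // the unique tag names
echo count($import->posts), " posts parsed\n";
```

Keying the tags array by tag name is what keeps the list unique, since assigning the same key twice simply overwrites the earlier entry.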
