Magpie and CaRP are two popular RSS parser scripts, both written in PHP, but which is better? I did a quick comparison, and this is what I came up with. Yes, I am the creator of CaRP, so read carefully and decide for yourself whether you can trust me! I'll try to be as objective as possible.

Executive Summary

Magpie is a lean, mean, parsing machine. It's faster than CaRP, so if you're processing a huge number of feeds very frequently, it's definitely worth a look. Plus it's free.

CaRP is slower because it does so much more for you. If you don't want to have to write a lot of PHP code to process the feed data once it's been parsed, CaRP may be the better choice. CaRP comes in two versions: a free version named CaRP SE, and a commercial version named CaRP Evolution. (I'll say "CaRP" when talking about both versions, and "CaRP SE" or "CaRP Evolution" when talking about only one or the other.)

Magpie

Essentially, Magpie will take a feed, chop it up into its parts, and give you an array containing the chopped-up data.
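
To make that concrete, here is a minimal sketch of what "an array of feed data" can look like. It uses PHP's built-in SimpleXML rather than Magpie itself, and the function name feed_to_array and the sample feed are made up for illustration; Magpie's own array layout may differ in detail.

```php
<?php
// Hypothetical helper (not part of Magpie): turn an RSS 2.0 string
// into a plain array of items.
function feed_to_array(string $xml): array
{
    $feed = simplexml_load_string($xml);
    if ($feed === false) {
        return []; // not well-formed XML
    }
    $items = [];
    foreach ($feed->channel->item as $item) {
        $items[] = [
            'title'       => (string) $item->title,
            'link'        => (string) $item->link,
            'description' => (string) $item->description,
        ];
    }
    return $items;
}

$sample = <<<XML
<rss version="2.0"><channel><title>Demo</title>
<item><title>First post</title><link>http://example.com/1</link><description>Hello</description></item>
<item><title>Second post</title><link>http://example.com/2</link><description>World</description></item>
</channel></rss>
XML;

$items = feed_to_array($sample);
foreach ($items as $item) {
    echo $item['title'], ' -> ', $item['link'], "\n";
}
```

Everything after this point (filtering, formatting, truncating, displaying) is code you write yourself.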

It can cache a local copy of the parsed data, which will speed it up significantly. But on most websites, the default caching scheme won't work automatically because Magpie won't have the necessary access permissions to create cache files. Creating a cache folder isn't too terribly difficult if you know how to work with server access permissions, but you'll have to do it manually.
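
If you'd rather check the permissions from PHP than from a shell, something along these lines can tell you whether a cache folder is usable. This is a sketch under assumptions: the helper name ensure_cache_dir and the folder path are invented for the example, and whether mkdir() succeeds depends entirely on how your server is set up.

```php
<?php
// Hypothetical helper: make sure a cache folder exists and is writable
// by the user PHP runs as. The path below is only an example.
function ensure_cache_dir(string $dir): bool
{
    if (!is_dir($dir) && !@mkdir($dir, 0775, true)) {
        return false; // PHP lacks permission to create the folder
    }
    return is_writable($dir);
}

$cacheDir = sys_get_temp_dir() . '/feed_cache_demo';
if (ensure_cache_dir($cacheDir)) {
    echo "Cache folder ready: $cacheDir\n";
} else {
    echo "Create $cacheDir by hand and make it writable by the web server user.\n";
}
```

You would then point Magpie at that folder. As far as I recall, that is done by defining a MAGPIE_CACHE_DIR constant before calling it, but check Magpie's documentation to be sure.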

Magpie can "transcode" feeds to and from different character sets, which is critical if (as is often the case) the feed doesn't use the same encoding as your webpage.
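
For anyone unsure what transcoding actually involves, here is a tiny standalone illustration using PHP's iconv() function. It is not Magpie's code; it only shows why the same text needs different bytes in different encodings.

```php
<?php
// "caf\xE9" is the word "café" encoded as ISO-8859-1 (one byte for é).
$latin1 = "caf\xE9";

// Convert to UTF-8 so it displays correctly on a UTF-8 webpage.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

echo bin2hex($latin1), "\n"; // 636166e9
echo bin2hex($utf8), "\n";   // 636166c3a9 (é is now the two bytes c3 a9)
```

Send the first string to a UTF-8 page without converting it and the é shows up as a broken character.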

Magpie supports both RSS and Atom feeds.

Magpie does a little bit of mapping of different RSS and Atom feed elements that contain essentially the same data (for example, atom:content and content:encoded or dc:date and pubDate). But the mappings are not configurable, and are fairly limited.
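
A fixed mapping of this kind boils down to a hard-coded fallback order. The sketch below shows the idea for dates; the function name item_timestamp and the array keys are assumptions made for the example, not Magpie's actual internals.

```php
<?php
// Hypothetical helper: find an item's date whichever element carried it.
function item_timestamp(array $item): ?int
{
    // Hard-coded fallback order: RSS's pubDate first, then Dublin Core's dc:date.
    $raw = $item['pubDate'] ?? $item['dc:date'] ?? null;
    if ($raw === null) {
        return null; // neither element was present
    }
    $timestamp = strtotime($raw); // understands both date formats used below
    return $timestamp === false ? null : $timestamp;
}

// Two items as a parser might return them (array keys are illustrative).
$rssItem  = ['title' => 'A', 'pubDate' => 'Tue, 03 Jun 2008 11:05:30 GMT'];
$atomItem = ['title' => 'B', 'dc:date' => '2008-06-03T11:05:30Z'];

foreach ([$rssItem, $atomItem] as $item) {
    echo $item['title'], ': ', gmdate('Y-m-d H:i', item_timestamp($item)), "\n";
}
```

Because the order is baked into the function, supporting one more date element means editing the code rather than changing a setting.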

Finally, Magpie is known as an RSS parser that gives you all of the data from the feed, so I was surprised to see that for the most part, it discards any data found in element attributes. For core RSS and Atom elements, it does a very good job, but for extension elements, most if not all of this data will be lost.
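
Here is why that matters. In the sample item below (invented for the example, and read with PHP's SimpleXML rather than Magpie), the enclosure element has no text content at all, so a parser that keeps only element text ends up with nothing.

```php
<?php
$xml = <<<XML
<rss version="2.0"><channel><item>
<title>Episode 1</title>
<enclosure url="http://example.com/ep1.mp3" type="audio/mpeg" length="12345" />
</item></channel></rss>
XML;

$item = simplexml_load_string($xml)->channel->item;

// The element's text content is empty...
$text = (string) $item->enclosure;
// ...because everything useful is stored in its attributes.
$url  = (string) $item->enclosure['url'];
$type = (string) $item->enclosure['type'];

var_dump($text); // string(0) ""
echo $url, ' (', $type, ")\n";
```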

Magpie was last updated in November 2005.

CaRP

The way CaRP is normally used, it will, like Magpie, chop up the feed. But instead of giving you an array of the feed data, it displays it for you. Before the data is displayed, it cleans up dangerous HTML code (or any HTML code that you don't want), formats it the way you specify, truncates data that's longer than what you want displayed, etc.
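
If you were to do the cleanup and truncation yourself, the simplest version might look like the sketch below. It is not CaRP's code, and the helper name clean_and_truncate is invented. Note that strip_tags() only removes the tags themselves, so treat this as a starting point rather than a complete defense against hostile HTML.

```php
<?php
// Hypothetical helper: strip all HTML tags, then shorten the text to
// roughly $max bytes (plus an ellipsis), cutting at a word boundary
// where possible. Use the mb_* functions instead for multibyte text.
function clean_and_truncate(string $html, int $max): string
{
    $text = trim(strip_tags($html));
    if (strlen($text) <= $max) {
        return $text;
    }
    $cut = substr($text, 0, $max);
    $lastSpace = strrpos($cut, ' ');
    if ($lastSpace !== false) {
        $cut = substr($cut, 0, $lastSpace); // drop the last, possibly partial, word
    }
    return $cut . '...';
}

$description = '<p>Big <b>news</b> today: <a href="http://example.com" onclick="evil()">read more</a> here.</p>';
echo clean_and_truncate($description, 20), "\n"; // Big news today:...
```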

CaRP Evolution also supports a few ways to access the feed data -- most notably, in version 4.0.5 (which I'll release later today), it includes a new plugin that will give you the data in a format similar (though not identical) to what you get from Magpie. However, the new plugin will give you just the data you want, including data taken from element attributes.

Plus, CaRP automatically maps various similar elements to common fields for you, and with CaRP Evolution, the mapping is completely customizable. For example, if you want access to images pointed to by the feed, with CaRP Evolution, you won't have to know whether the feed uses RSS's image element or enclosure element, or whether it uses "Media RSS" elements (nor will you have to know which of the myriad possible ways that Media RSS elements can point to the image is being used). This can save you a lot of feed-by-feed tweaking of your code.
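
To give a feel for what a customizable mapping means, here is a rough sketch in which the mapping lives in an array instead of in the code. This is not CaRP Evolution's configuration format; the "element@attribute" key names, the $mapping table, and the mapped_field function are all invented for illustration.

```php
<?php
// Hypothetical mapping table: each common field lists the places to look,
// in order of preference. Editing this array changes the mapping without
// touching the lookup code.
$mapping = [
    'image' => ['media:content@url', 'media:thumbnail@url', 'enclosure@url', 'image'],
];

function mapped_field(array $item, array $mapping, string $field): ?string
{
    foreach ($mapping[$field] ?? [] as $source) {
        if (!empty($item[$source])) {
            return $item[$source];
        }
    }
    return null; // none of the listed sources was present
}

// Items flattened to "element@attribute" keys by some earlier parsing step.
$podcastItem = ['title' => 'Show', 'enclosure@url' => 'http://example.com/cover.jpg'];
$photoItem   = ['title' => 'Pic',  'media:thumbnail@url' => 'http://example.com/t.jpg'];

echo mapped_field($podcastItem, $mapping, 'image'), "\n";
echo mapped_field($photoItem, $mapping, 'image'), "\n";
```

The calling code asks for "image" in both cases and never needs to know which element the feed actually used.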

Like Magpie, CaRP can cache feed data for you. CaRP's installation script sets up the caching for you (either in files or a MySQL database), which is a bit easier than getting Magpie's caching working. CaRP's default caching stores the raw RSS feed data, so every time the feed is displayed, CaRP has to re-parse the feed. In most cases, this is fast enough, but if you need better performance, you can manually specify a cache file name and have CaRP cache the formatted output. This enables CaRP to simply display the contents of the cache file, which is much faster.
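
The reason caching formatted output is faster is easier to see in code. The following is a generic sketch of the technique, not CaRP's implementation; the function name cached_output, the cache file name, and the 600-second lifetime are assumptions chosen for the example.

```php
<?php
// Hypothetical helper: return cached output if the cache file is fresh,
// otherwise call $render, store its result, and return that.
function cached_output(string $cacheFile, int $maxAgeSeconds, callable $render): string
{
    clearstatcache(true, $cacheFile);
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $maxAgeSeconds) {
        return file_get_contents($cacheFile); // fast path: no parsing at all
    }
    $html = $render(); // slow path: fetch, parse, and format the feed
    file_put_contents($cacheFile, $html, LOCK_EX);
    return $html;
}

$cacheFile = sys_get_temp_dir() . '/formatted_feed_demo.html';
@unlink($cacheFile); // start the demo with an empty cache

$renderCount = 0;
$render = function () use (&$renderCount): string {
    $renderCount++; // stands in for the expensive parse-and-format step
    return "<ul><li>First post</li></ul>\n";
};

$first  = cached_output($cacheFile, 600, $render);
$second = cached_output($cacheFile, 600, $render);
echo $second;
echo "Rendered $renderCount time(s)\n"; // Rendered 1 time(s)
```

With a raw-feed cache, the parse-and-format step would still run on every request; only the download would be skipped.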

CaRP Evolution can transcode feeds just like Magpie does. CaRP SE, however, supports only the character encodings supported by the "Expat" XML parser library (UTF-8, ISO-8859-1 and US-ASCII). This is enough to handle many English language feeds, but some English language feeds and many non-English feeds use other encodings. CaRP SE may or may not display those feeds correctly, depending on their content.
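
To find out ahead of time whether a particular feed falls outside those three encodings, you can look at the encoding declared on its first line. The check below is a simplified sketch with invented helper names; it trusts the XML declaration and ignores byte-order marks and HTTP headers, which a thorough check would also consider.

```php
<?php
// Encodings named in the text as supported by Expat.
const EXPAT_ENCODINGS = ['UTF-8', 'ISO-8859-1', 'US-ASCII'];

// Hypothetical helper: read the encoding declared in the XML prolog.
// XML with no declaration is treated as UTF-8 here.
function declared_encoding(string $xml): string
{
    if (preg_match('/^<\?xml[^>]*\bencoding=["\']([A-Za-z0-9._-]+)["\']/', $xml, $m)) {
        return strtoupper($m[1]);
    }
    return 'UTF-8';
}

// True when the feed declares an encoding outside the Expat list.
function needs_transcoding(string $xml): bool
{
    return !in_array(declared_encoding($xml), EXPAT_ENCODINGS, true);
}

var_dump(needs_transcoding('<?xml version="1.0" encoding="iso-8859-1"?><rss/>'));   // bool(false)
var_dump(needs_transcoding('<?xml version="1.0" encoding="windows-1252"?><rss/>')); // bool(true)
var_dump(needs_transcoding('<rss/>'));                                              // bool(false)
```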

CaRP SE supports RSS only, while CaRP Evolution supports both RSS and Atom.

CaRP Evolution comes with 15 plugins to perform filtering, clean up feed data, store data in a MySQL database, display YouTube videos, and a variety of other things, and new plugins are added from time to time.

Finally, CaRP's latest update will be released later today.

Wrap Up

If you just want the parsed data from the feed to use in your own PHP code, Magpie's speed may make it the tool of choice for you. But if you need the other features offered by CaRP (particularly CaRP Evolution), using it will save you from writing a lot of additional code. Plus, if you do write the additional code yourself, the time it takes to run your code might cancel out the speed difference between Magpie and CaRP.

Get CaRP SE or Evolution.