Convert XML to SQL
I thought it might be worth crossposting my reply to a recent perlmonks question asking for a generalised XML to SQL conversion.
You can transform your xml into sql using xslt. Stealing some bits of code from elsewhere on the net, I can use the following xslt to convert my del.icio.us bookmarks into a set of insert statements:
<?xml version="1.0"?> <!DOCTYPE rdf:RDF [ <!ENTITY rdf 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'> <!ENTITY rss 'http://purl.org/rss/1.0/'> ]> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:rdf="&rdf;" xmlns:rss="&rss;"> <xsl:output method="text"/> <!-- main template --> <xsl:template match="/rdf:RDF"> <xsl:apply-templates select="rss:item" /> </xsl:template> <xsl:template match="rss:item"> insert into links (url, title) values ( '<xsl:call-template name="sql-escape"><xsl:with-param name="text" select="rss:title"/></xsl:call-template>', '<xsl:call-template name="sql-escape"><xsl:with-param name="text" select="rss:link"/></xsl:call-template>' ); </xsl:template> <!-- utility functions --> <xsl:template name="sql-escape"> <xsl:param name="text"/> <xsl:variable name="tmp"> <xsl:call-template name="replace-substring"> <xsl:with-param name="from">'</xsl:with-param> <xsl:with-param name="to">''</xsl:with-param> <xsl:with-param name="value" select="$text"/> </xsl:call-template> </xsl:variable> <xsl:value-of select="$tmp"/> </xsl:template> <xsl:template name="replace-substring"> <xsl:param name="value" /> <xsl:param name="from" /> <xsl:param name="to" /> <xsl:choose> <xsl:when test="contains($value,$from)"> <xsl:value-of select="substring-before($value,$from)" /> <xsl:value-of select="$to" /> <xsl:call-template name="replace-substring"> <xsl:with-param name="value" select="substring-after($value,$from)" /> <xsl:with-param name="from" select="$from" /> <xsl:with-param name="to" select="$to" /> </xsl:call-template> </xsl:when> <xsl:otherwise> <xsl:value-of select="$value" /> </xsl:otherwise> </xsl:choose> </xsl:template> </xsl:stylesheet>If I save that to a file called del2sql.xsl I can insert the links in the rss response like so:
wget -O - http://feeds.delicious.com/rss/aufflick | xsltproc del2sql.xsl - | psql links_dbPretty neat huh! Unfortunately xslt is even less decipherable than golf perl, but thems the breaks. The command pipeline above uses the postgres psql command, but you can just as easily use this approach with mysql, oracle, sybase, sql server or whatever. Most database vendors/projects these days have some sort of xml functionality built in, but this puts all the processing load on the part of your system that is most difficult to scale. Better, I think, to do as much processing as possible on your application server which you can more easily scale. You might also be better using xslt to convert to CSV and then using some simple script to slurp the csv into your database - you could then handle insert/update errors gracefully in your script. The utility sql-escape and replace-substring templates can be placed in a common file and included into your xsl stylesheets via an xsl:import statement. If you have an xsl 2 compatible xslt engine you can use the x:replace() function directly instead of the recursive replace-substring template.
11:41 PM, 03 Mar 2008 by Mark Aufflick Permalink | Comments (0) |
Archive
November 2008 October 2008 September 2008 August 2008 July 2008 June 2008 May 2008 March 2008 February 2008 January 2008 December 2007 November 2007 October 2007 September 2007 August 2007 July 2007 June 2007 May 2007 April 2007 March 2007 February 2007 January 2007 December 2006 November 2006 October 2006 September 2006 August 2006 July 2006 June 2006 May 2006 April 2006 March 2006 February 2006 January 2006 December 2005 November 2005 October 2005 September 2005 August 2005 July 2005 June 2005 May 2005 April 2005 March 2005 February 2005 January 2005 December 2004 November 2004 October 2004 September 2004 August 2004 July 2004 June 2004 May 2004 April 2004 March 2004 February 2004 January 2004 December 2003 November 2003 October 2003 September 2003 August 2003 Notifications Request notifications
Recent Comments | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Request notifications