In my last post, SimpleXML and Namespace Quirks, I complained about how bad the namespace handling is in the SimpleXML framework in PHP.
Since then, I have searched all over the web for pre-made solutions to the problem of parsing RDF/XML that makes heavy use of namespaces, and found nothing that fits my needs. I did find some extensive RDF parsing frameworks written in PHP, but they were way too involved: I don't want to have to install a dozen class files when all I want to do is convert a simple RDF/XML string into triples. I also found some "simple" solutions that were inadequate, like replacing the ":" character in the string with an "_" character so that the code no longer has to deal with namespaces at all. (This is a terrible solution because a namespace prefix is just a "shortcut" for a URI, and different people can use different prefixes to represent the same namespace.)
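To make the prefix problem concrete, here is a minimal sketch that uses only the built-in SimpleXML functions. The two sample documents and the helper name expandedChildNames() are made up for illustration. Both documents describe the same rdfs:Class element but bind the RDF Schema namespace to different prefixes, so comparing prefixed tag names (or their underscore-mangled versions) would treat them as different, while comparing namespace URI plus local name treats them as the same.

```php
<?php
// Same data, different prefixes: the RDF Schema namespace is bound to
// "rdfs" in the first document and to "s" in the second.
$docA = '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
      . ' xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"><rdfs:Class/></rdf:RDF>';
$docB = '<r:RDF xmlns:r="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
      . ' xmlns:s="http://www.w3.org/2000/01/rdf-schema#"><s:Class/></r:RDF>';

// Hypothetical helper: expand every child of the root element to
// "namespace URI + local name", ignoring whatever prefix was used.
function expandedChildNames(string $xml): array
{
    $root  = simplexml_load_string($xml);
    $names = [];
    // getNamespaces(true) maps prefix => URI for every namespace used in the document.
    foreach (array_unique($root->getNamespaces(true)) as $uri) {
        // children($uri) selects the child elements that belong to that namespace.
        foreach ($root->children($uri) as $child) {
            $names[] = $uri . $child->getName(); // getName() is the local name
        }
    }
    return $names;
}

// Both calls produce the same single expanded name, even though the
// prefixed names in the source text ("rdfs:Class" and "s:Class") differ.
print_r(expandedChildNames($docA));
print_r(expandedChildNames($docB));
```

A ":" to "_" replacement would instead produce the tag names rdfs_Class and s_Class, which no longer say anything about the namespace they came from.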
So, having failed to find anything acceptable, I wrote my own solution.
Please enjoy SimpleRDFElement: a class that extends the SimpleXMLElement class and can therefore be used hand-in-hand with existing SimpleXML code.
The source code is one file: simplerdfelement.php (opens as a plain text file)
(Obviously, save it as a .php file and include it in your PHP script to use it.)
If you have your RDF-style XML stored as text in a string variable $xml, then you can create your SimpleRDFElement object this way:
$xmlobj = simplexml_load_string($xml,'SimpleRDFElement');
The resulting object, $xmlobj, acts just like a SimpleXMLElement object, except that you have a few more methods available:
- $xmlobj->getPrefix()
- Returns the namespace prefix of the root element of the object, based on the namespace declarations in the XML text.
- $xmlobj->getNamespace()
- Returns the full URI of the namespace of the root element of the object, based on the namespace declarations in the XML text.
- $xmlobj->getFullName()
- Returns the fully qualified name of the root element, using the prefix-colon-tagname format, e.g. rdfs:Class
- $xmlobj->getFullURI()
- Returns the full URI of the root element, using the expanded URI of the namespace followed by the element tag name, e.g. http://www.w3.org/2000/01/rdf-schema#Class
- $xmlobj->getChildNodes()
- Returns an array of all of the child elements (as SimpleRDFElement Objects) of the current top-level element. Unlike the built-in children() method, this returns all child elements regardless of namespace.
- $xmlobj->getAttributes()
- Returns an array of all of the attributes (as individual SimpleRDFElement Objects) of the current top-level element. Unlike the built-in attributes() method, this returns all attributes regardless of namespace.
- $xmlobj->getTriples()
- Returns an array of SimpleRDFTriple objects. SimpleRDFTriple is a simple helper class that defines an object with three properties: tripleSubject, triplePredicate, tripleObject. This method parses the top-level element and constructs triples based on that element, its attributes, and its immediate child elements. It is not recursive.
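For readers curious how methods like these can sit on top of SimpleXML, here is a rough sketch of a SimpleXMLElement subclass. It is not the actual simplerdfelement.php source: the class name SketchRDFElement, the sample XML, and the method bodies are assumptions, and the bodies are just one plausible way to get the behavior described above using dom_import_simplexml() and xpath(). getAttributes() and getTriples() are left out.

```php
<?php
// Hypothetical stand-in for illustration only, NOT the real SimpleRDFElement.
// Requires the DOM extension for dom_import_simplexml().
class SketchRDFElement extends SimpleXMLElement
{
    // Prefix of this element as written in the source text, e.g. "rdfs".
    public function getPrefix(): string
    {
        return (string) dom_import_simplexml($this)->prefix;
    }

    // Namespace URI that the element's prefix is bound to.
    public function getNamespace(): string
    {
        return (string) dom_import_simplexml($this)->namespaceURI;
    }

    // Prefix-colon-tagname form, e.g. "rdfs:Class".
    public function getFullName(): string
    {
        $prefix = $this->getPrefix();
        return $prefix === '' ? $this->getName() : $prefix . ':' . $this->getName();
    }

    // Namespace URI followed by the local tag name.
    public function getFullURI(): string
    {
        return $this->getNamespace() . $this->getName();
    }

    // All child elements regardless of namespace: the XPath "*" step
    // matches every element child of the context node.
    public function getChildNodes(): array
    {
        return $this->xpath('*') ?: [];
    }
}

// Sample input, made up for this sketch.
$xml = '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
     . ' xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">'
     . '<rdfs:Class rdf:about="http://example.org/Thing"/>'
     . '</rdf:RDF>';

// Passing a class name makes SimpleXML build its objects from that class.
$xmlobj = simplexml_load_string($xml, 'SketchRDFElement');

echo $xmlobj->getFullName(), "\n";
foreach ($xmlobj->getChildNodes() as $child) {
    echo $child->getFullName(), ' => ', $child->getFullURI(), "\n";
}
```

Because the class name is handed to simplexml_load_string(), the elements returned by xpath() are instances of the same subclass, so the extra methods remain available on child elements as well.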