Vamsi Pavan’s Place

When curiosity outbursts …..

xerces-c: C++ SAX2 Parser

July 21st, 2011 · 1 Comment · Articles, C/C++, Source Code

Basics of validating XML documents against a given schema in C++ with the xerces-c SAX2 parser.

1. Create parser instance.

SAX2XMLReader* parser = XMLReaderFactory::createXMLReader();

2. Set the required features on the parser instance as follows.

// Enable the parser’s schema support
parser->setFeature(XMLUni::fgXercesSchema, true);

// Schema validation requires namespace processing to be turned on.
parser->setFeature(XMLUni::fgSAX2CoreValidation, true);
parser->setFeature(XMLUni::fgSAX2CoreNameSpaces, true);

3. Set the schema location using the setProperty API call, with or without a namespace. To use the ‘ExternalSchemaLocation’ property, the value must be the namespace, followed by a space character, followed by the schema file path.

// Define the location of the schema.
XMLCh* schemaLocation = XMLString::transcode("/directory/path/myschema.xsd");
parser->setProperty(XMLUni::fgXercesSchemaExternalNoNameSpaceSchemaLocation,schemaLocation);
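For the namespace case, the ‘ExternalSchemaLocation’ value is a whitespace-separated list of namespace/location pairs. The helper below is only a sketch of how such a value could be assembled before transcoding it; the name buildSchemaLocation and the example namespace URIs are assumptions for illustration, not part of xerces-c.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: joins (namespace URI, schema location) pairs into
// one string of the form "ns1 loc1 ns2 loc2 ...".
std::string buildSchemaLocation(
    const std::vector<std::pair<std::string, std::string>>& pairs)
{
    std::string value;
    for (const auto& p : pairs) {
        // An empty entry would shift every following namespace/location pair.
        assert(!p.first.empty() && !p.second.empty());
        if (!value.empty()) {
            value += ' ';
        }
        value += p.first;   // target namespace of the schema
        value += ' ';
        value += p.second;  // where the schema file lives
    }
    return value;
}
```

The resulting std::string can then be passed through XMLString::transcode(result.c_str()) and handed to setProperty with XMLUni::fgXercesSchemaExternalSchemaLocation, in the same way as the no-namespace value above.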

The current parser version requires the path in the format below.

XMLCh* propertyValue = XMLString::transcode("myschema.xsd");
ArrayJanitor<XMLCh> janValue(propertyValue);
parser->setProperty(XMLUni::fgXercesSchemaExternalNoNameSpaceSchemaLocation,propertyValue);

Another important thing to remember: the file path should always be in “file:///” form. If you don’t follow this format, the parser won’t complain, but parsing and validation will not work correctly. Fortunately, if you are working in Java, the java.io.File class provides toURI() (and toURI().toURL()) to get the path in file-protocol form.
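C++ has no direct equivalent of that Java call in the standard library, so a small conversion helper is one option. The function below is a minimal sketch under simple assumptions: the input is already an absolute path, only backslashes and spaces are rewritten, and UNC paths are not handled. The name toFileUrl is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: turns an absolute filesystem path into a
// "file:///..." URL string. Backslashes become forward slashes and
// spaces are percent-encoded; other characters are copied unchanged.
std::string toFileUrl(const std::string& absolutePath)
{
    assert(!absolutePath.empty());
    std::string url = "file://";
    // POSIX paths already start with '/'; Windows drive paths ("C:\...")
    // need one added so the result has three slashes after "file:".
    if (absolutePath[0] != '/' && absolutePath[0] != '\\') {
        url += '/';
    }
    for (char c : absolutePath) {
        if (c == '\\') {
            url += '/';
        } else if (c == ' ') {
            url += "%20";
        } else {
            url += c;
        }
    }
    return url;
}
```

With this sketch, toFileUrl("/directory/path/myschema.xsd") gives "file:///directory/path/myschema.xsd", and toFileUrl("C:\\schemas\\my schema.xsd") gives "file:///C:/schemas/my%20schema.xsd".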

4. Now set the content handler as well as the error handler on the parser instance. Always use a custom handler that inherits from DefaultHandler when setting these. For the error handler, the inherited methods warning(), error() and fatalError() need to be overridden; otherwise parsing/validation errors can go unnoticed, since they are reported to the handler rather than surfacing as exceptions you catch.

parser->setContentHandler((ContentHandler*) myContentHandler);
parser->setErrorHandler((ErrorHandler*) myContentHandler);

5. Finally, call the parse API.

// Do the parse
parser->parse(*xmlInputSource);

6. Putting it together, the complete code would be:

    SAX2XMLReader* parser = XMLReaderFactory::createXMLReader();
    parser->setFeature(XMLUni::fgSAX2CoreValidation, true);
    parser->setFeature(XMLUni::fgSAX2CoreNameSpaces, true);
   
    //* Report namespace prefixes and treat validation errors as fatal (strict validation)
    parser->setFeature(XMLUni::fgSAX2CoreNameSpacePrefixes, true);
    parser->setFeature(XMLUni::fgXercesValidationErrorAsFatal, true);

    //* Enable the parser’s schema support
    parser->setFeature(XMLUni::fgXercesSchema, true);
    parser->setFeature(XMLUni::fgXercesSchemaFullChecking, true);
    parser->setFeature(XMLUni::fgXercesDynamic, false);

    XMLCh* propertyValue = XMLString::transcode(m_sDefSchema.getMBCSCopy());
    ArrayJanitor<XMLCh> janValue(propertyValue);

    //* Define the location of the XML schema.
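    if (isNS) {  // schema has a target namespace
        // Property name - http://apache.org/xml/properties/schema/external-schemaLocation
        // Value format: namespace, a space, then the schema path
        parser->setProperty(XMLUni::fgXercesSchemaExternalSchemaLocation, propertyValue);
    } else {     // schema has no target namespace
        // Property name - http://apache.org/xml/properties/schema/external-noNameSpaceSchemaLocation
        parser->setProperty(XMLUni::fgXercesSchemaExternalNoNameSpaceSchemaLocation, propertyValue);
    }

    //* Set the content and error handlers, then parse (steps 4 and 5)
    parser->setContentHandler((ContentHandler*) myContentHandler);
    parser->setErrorHandler((ErrorHandler*) myContentHandler);
    parser->parse(*xmlInputSource);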

Now let’s look at the common errors we face while developing validation. For most of them, first check whether or not the schema declares a target namespace, because the parser/validator behaves differently depending on that.

1. Character ‘<’ is grammatically unexpected
Cause: a required tag is missing.

2. cvc-elt.1: Cannot find the declaration of element
Cause: no namespace is found in the XML, or the parser’s schema location property doesn’t have the namespace attached, or the schema files are missing at the path given in the parser’s schema location property.

This list will be updated in the future.

