RelaxNG


    A simple Relax Schema

    <?xml version="1.0"?>
    <element name="foo"
             xmlns="http://relaxng.org/ns/structure/1.0"
             xmlns:a="http://relaxng.org/ns/annotation/1.0"
             xmlns:ex1="http://www.example.com/n1"
             xmlns:ex2="http://www.example.com/n2">
      <a:documentation>A foo element.</a:documentation>
      <element name="ex1:bar1">
        <empty/>
      </element>
      <element name="ex2:bar2">
        <empty/>
      </element>
    </element>
    

    An alternate Syntax

    Unlike W3C XML Schema (XSD), RelaxNG has a non-XML alternative syntax, the compact syntax, that is equivalent to the XML (tagged) syntax. It is useful when schemas are written and read by hand. In the words of the committee:

    The goals of this syntax are to:

    • maximize readability;

    • support all features of RELAX NG; it must be possible to translate a schema from the XML syntax to the compact syntax and back without losing significant information;

    • support separate translation; a RELAX NG schema may be spread amongst multiple files; it must be possible to represent each of the files separately in the compact syntax; the representation of each file must not depend on the other files.

    Alternate syntax example

    namespace local = ""
    namespace ex1 = "http://www.example.com/n1"
    namespace ex2 = "http://www.example.com/n2"

    ## A foo element.
    element foo {
      element ex1:bar1 { empty },
      element ex2:bar2 { empty }
    }

    Alternate syntax definition for relaxNG

    # RELAX NG XML syntax specified in compact syntax.

    default namespace rng = "http://relaxng.org/ns/structure/1.0"
    namespace local = ""
    datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"

    start = pattern

    pattern =
      element element { (nameQName | nameClass), (common & pattern+) }
      | element attribute { (nameQName | nameClass), (common & pattern?) }
      | element group|interleave|choice|optional
                |zeroOrMore|oneOrMore|list|mixed { common & pattern+ }
      | element ref|parentRef { nameNCName, common }
      | element empty|notAllowed|text { common }
      | element data { type, param*, (common & exceptPattern?) }
      | element value { commonAttributes, type?, xsd:string }
      | element externalRef { href, common }
      | element grammar { common & grammarContent* }

    param = element param { commonAttributes, nameNCName, xsd:string }

    exceptPattern = element except { common & pattern+ }

    grammarContent =
      definition
      | element div { common & grammarContent* }
      | element include { href, (common & includeContent*) }

    includeContent =
      definition
      | element div { common & includeContent* }

    definition =
      element start { combine?, (common & pattern+) }
      | element define { nameNCName, combine?, (common & pattern+) }

    combine = attribute combine { "choice" | "interleave" }

    nameClass =
      element name { commonAttributes, xsd:QName }
      | element anyName { common & exceptNameClass? }
      | element nsName { common & exceptNameClass? }
      | element choice { common & nameClass+ }

    exceptNameClass = element except { common & nameClass+ }

    nameQName = attribute name { xsd:QName }
    nameNCName = attribute name { xsd:NCName }
    href = attribute href { xsd:anyURI }
    type = attribute type { xsd:NCName }

    common = commonAttributes, foreignElement*

    commonAttributes =
      attribute ns { xsd:string }?,
      attribute datatypeLibrary { xsd:anyURI }?,
      foreignAttribute*

    foreignElement = element * - rng:* { (anyAttribute | text | anyElement)* }
    foreignAttribute = attribute * - (rng:*|local:*) { text }
    anyElement = element * { (anyAttribute | text | anyElement)* }
    anyAttribute = attribute * { text }

    Content models

      Operator elements

      group, interleave, choice, optional, zeroOrMore, oneOrMore, list, mixed

      Alternate syntax: a, b (group); a & b (interleave); a | b | c (choice); e? (optional); e* (zeroOrMore); e+ (oneOrMore); list { pattern }; mixed { pattern }

      Formal equivalences are used in the definition of RelaxNG. For instance,

      a* is formally defined as equivalent to (a+)?.

      The first stage of schema processing is to eliminate many constructs by rewriting them to more basic ones.
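
      As a rough illustration of such a rewrite, the three definitions below (in the compact syntax) should all accept the same content. The element and pattern names (shoppingList, item, itemsStar, and so on) are made up for this sketch.

      ```rnc
      # Only itemsStar is used; the other two show equivalent spellings.
      start = element shoppingList { itemsStar }

      # zeroOrMore, as an author would normally write it
      itemsStar    = element item { text }*

      # the formal equivalent: optional(oneOrMore)
      itemsOptPlus = (element item { text }+)?

      # optional is itself rewritten to a choice with empty
      itemsChoice  = element item { text }+ | empty
      ```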

      Combining

      A named pattern (a define element) can have more than one definition; if so, all but one _must_ have a combine attribute, specifying how the definitions are to be combined.

      The allowed values are choice and interleave.

      In the alternate syntax, a named pattern is defined like this:

      name = pattern

      Combining definitions are written like this (&= for interleave, |= for choice):

      name &= pattern

      name |= pattern
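
      A small sketch of combining in the compact syntax, using hypothetical names (doc, inline, commonAtts). Each named pattern has one plain definition and further definitions that are merged into it.

      ```rnc
      start = element doc { commonAtts, inline* }

      # Combined by choice: inline ends up meaning em | code | strong.
      inline  = element em { text }
      inline |= element code { text }
      inline |= element strong { text }

      # Combined by interleave: both optional attributes are allowed.
      commonAtts  = attribute id { text }?
      commonAtts &= attribute class { text }?
      ```

      All combining definitions of one name have to use the same operator, so inline cannot mix |= and &=.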

      "ambiguous content models" are not prohibited

      It is not an error for a schema to let an input document match in more than one way. When a document is accepted, it may be ambiguous which branch of a choice was taken. This differs from W3C XML Schema, which requires deterministic content models, and is the main reason that XML Schema is preferable for some kinds of data interchange applications.
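
      A minimal sketch of an ambiguous but legal schema (the names entry and key are invented for the example):

      ```rnc
      # <entry><key>k</key></entry> is valid, but the single key element
      # could be matched by either of the two optional patterns.
      start = element entry {
        element key { text }?,
        element key { text }?
      }
      ```

      A validator only has to answer valid or invalid, so it never needs to decide which of the two patterns was used.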

      Specific values

      The value element lets you require a specific value for an attribute or for the text content of an element.

      For attribute values, this is very like the facilities in DTDs.

      But it can also be used in element specifications, to fix the actual string content of an element.

      DTDs can't restrict the text contents of elements at all.

      RelaxNG can also use data types from an external datatype library, in particular XML Schema. These data types can be applied to element content, as well as to attribute values.
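
      The sketch below puts fixed values and XML Schema datatypes side by side. The vocabulary (task, priority, status, estimate) is hypothetical; the xsd prefix is predeclared in the compact syntax.

      ```rnc
      start = element task {
        # fixed values on an attribute, much like a DTD enumeration
        attribute priority { "low" | "high" },

        # fixed values as element content, which a DTD cannot express
        element status { "open" | "closed" },

        # an XML Schema datatype applied to element content
        element estimate { xsd:positiveInteger }
      }
      ```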

      content models are context sensitive

      This means that the content model of an element can vary with its parent element.

      This is a radical increase in power over DTDs.

      This is also a radical change in the parsing model.
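
      To make the idea concrete, here is a sketch in which the element name title has two different content models depending on its parent. All names are assumptions made for the example.

      ```rnc
      start = element library { book*, person* }

      # Inside book, title is free text.
      book = element book {
        element title { text }
      }

      # Inside person, title is restricted to a few fixed values.
      person = element person {
        element title { "Dr" | "Prof" | "Ms" | "Mr" },
        element name { text }
      }
      ```

      A DTD allows only one declaration per element name, so it cannot tell these two title elements apart.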

      recursion

      Patterns can reference themselves, but the recursive reference must sit inside a containing element pattern. Every cycle then has to consume an element, which prevents infinite recursion.
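
      A sketch of a legal recursive pattern, with hypothetical names (doc, section, para). The reference to section appears only inside the section element itself.

      ```rnc
      start = element doc { section+ }

      section = element section {
        element title { text },
        (element para { text } | section)*
      }

      # Not allowed, because the self-reference is not inside an element:
      #   bad = element para { text }, bad
      ```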

      How can you define a content model for footnotes in this case?

    Elements and attributes are as similar as possible

    Content models can be the same (except for nested elements).

    This includes the data type facilities (dates, numbers, etc.)

      Attribute models

      These look just like element models, only they are order-independent.

      This means that one can use optional, group, and choice to express complex restrictions on which attributes are present.

      These can also depend on the other parts of a content model.

      attributes can be empty, and the content model is

      & and , are equivalent when they connect attribute specifications, because attributes are unordered.
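
      A sketch of an attribute model built from ordinary pattern operators. The element img and its attributes are only an illustration, not taken from any real vocabulary definition.

      ```rnc
      start = element img {
        # always required
        attribute src { xsd:anyURI },

        # width and height must appear together or not at all
        (attribute width { xsd:positiveInteger },
         attribute height { xsd:positiveInteger })?,

        # exactly one of alt or title
        (attribute alt { text } | attribute title { text })
      }
      ```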

    Simplified XML syntax for RelaxNG

    See web site for official definition in Section 5.

    Name classes

    This is a way to declare sets of elements with similar properties, so that they can be used interchangeably.

    Each name class is a set of element and attribute names, or some "any" declaration (with potential exclusions).

    nameclasses are used in declarations, and can be referred to from elsewhere (via ref elements).

    In the compact syntax, refs can be inferred from context.
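
    The following sketch shows three kinds of name class in the compact syntax: a namespace wildcard, an "any" name with an exclusion, and a choice of names. The names doc, note, remark, and anyEx1 are hypothetical; the namespace URI is the one used in the examples above.

    ```rnc
    namespace ex1 = "http://www.example.com/n1"

    start = element doc { (anyEx1 | element (note | remark) { text })* }

    # ex1:* matches any element name in the ex1 namespace.
    anyEx1 = element ex1:* {
      # any attribute except those in the ex1 namespace
      attribute * - ex1:* { text }*,
      (text | anyEx1)*
    }
    ```

    An attribute pattern with a wildcard name class has to be repeatable and have text as its content, which is why it is written with { text }* here.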

    Modularity

      grammar elements

      A grammar is a set of definitions and patterns with a defined start symbol:

      in the compact syntax,

      start = pattern

      This means that this grammar begins with a particular pattern.

      You can add to the start symbol, as you'd expect, with the same combining operators used for other definitions, by writing

      start |= pattern

      start &= pattern

      grammarcontents:

      grammar {
        start = ...
        name = ...
        ...
      }
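
      Filled in with a hypothetical address book vocabulary, such a grammar might look like the sketch below.

      ```rnc
      grammar {
        start = addressBook

        addressBook = element addressBook { card* }

        card = element card {
          element fullName { text },
          element email { text }
        }
      }
      ```

      At the top level of a compact syntax file the grammar { ... } wrapper may be left out; a file that consists only of definitions is treated as a grammar.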

      External references

      You can pull in an external set of patterns (or a full grammar) with an external reference:

      external URI

      This is a straight copy, with no changes, just like an include file.
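
      A sketch of an external reference in use. The file names main.rnc and table.rnc and all element names are assumptions for the example; the second file is shown as a comment.

      ```rnc
      # main.rnc
      start = element report {
        element heading { text },
        external "table.rnc"
      }

      # table.rnc is a complete schema in its own right, for example:
      #   element table { element row { element cell { text }+ }+ }
      ```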

      Importing

      You can also use the "include" command to pull in an external grammar.

      This allows definitions that override ones defined in the grammar you included.

      For safety, elements that need to be redefined before use can be given the content pattern: notAllowed.

      include URIreference { grammarcontents }

      The definitions in the grammarcontents override the definitions in the grammar itself, and can customize it in context.
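
      A sketch of include with an override, assuming two hypothetical files, base.rnc and custom.rnc. The base file is shown as a comment.

      ```rnc
      # base.rnc:
      #   start = element memo { body }
      #   body = notAllowed    # placeholder, must be redefined before use
      #
      # custom.rnc:
      include "base.rnc" {
        # replaces the placeholder definition of body from base.rnc
        body = element body { element para { text }+ }
      }
      ```

      Used on its own, base.rnc accepts no document at all, which is the safety property described above.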

      Importing the other way

      Of course, when you write a grammar to be used as a module, you are enabling some other grammar to override some of your own declarations.

      You may want to include definitions from your users, rather than just making your elements notAllowed and depending on the users to redefine them.

      To do that you can use a parentRef:

      parent name

      declares that you are requiring a definition from your including grammar, so that it's an error if you are included, but that name is not defined.

      This is basically a "module parameter" for a Relax grammar fragment.
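
      A single-file sketch using a nested grammar, with hypothetical names (doc, tableModule, cellContent). The inner grammar does not define cellContent itself; it asks for it from the enclosing grammar with parent.

      ```rnc
      start = element doc { tableModule }

      # The "module parameter" supplied by the outer grammar.
      cellContent = (text | element em { text })*

      tableModule = grammar {
        start = element table { row+ }
        row = element row {
          element cell { parent cellContent }+
        }
      }
      ```

      If the outer grammar left out cellContent, the schema would be rejected as incorrect. The inner grammar could equally live in its own file and be pulled in with external.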

      Data types

      42 datatypes, not listed here.

      Facets are parameters that restrict a data type.

      In the compact syntax:

      xsd:string { pattern = "regexp" }

      The pattern facet is possibly one of the most important facets.
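
      A few facets in context, as a sketch. The order vocabulary is invented; the facet names (pattern, minInclusive, maxInclusive, maxLength) come from XML Schema datatypes.

      ```rnc
      start = element order {
        # exactly five digits
        attribute zip { xsd:string { pattern = "[0-9]{5}" } },

        # an integer between 1 and 100; facets are separated by whitespace
        element quantity { xsd:integer { minInclusive = "1" maxInclusive = "100" } },

        # at most eight characters after whitespace normalization
        element code { xsd:token { maxLength = "8" } }
      }
      ```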