XML Simply Explained

seeWorld Wide Web consortium site for specifications http://www.w3c.org/ and http://www.w3c.org/XML/

XML is a way of peeling the information in a tree of Element nodes out into a string of characters for transmission and then parsing the string back into a tree. The information sent is the structure of the tree and the content of the tree. At an element Node there may be "CharData? ((element | Reference | CDSect | PI | Comment) CharData?)* ". This is optional text followed by any number of elements, References, Character Data sections, Processing Instructions, Comments, all interspersed with more optional text. See the above World Wide Web consortium site for specifications.

Imagine two pieces of string attached to the node ElementTreeRoot in the tree to our right. As each end of the string is pulled the tree is split down the middle. Step by step we get

<ElementTreeRoot>

(the rest of the tree)

</ElementTreeRoot>

Further splitting of the tree of Elements produces

<ElementTreeRoot>

<Text></Text>

<Attributes></Attributes>

<Options>

(the rest of the tree)

</Options>

<File></File>

</ElementTreeRoot>

When the tree of Elements is fully pulled apart into a string of characters of Element Nodes we finish up with

<ElementTreeRoot>

<Text></Text>

<Attributes></Attributes>

<Options>

<ViewAsXMLEditor></ViewAsXMLEditor>

<ViewSource></ViewSourcer>

<RememberFileName></RememberFileName>

</Options>

<File></File>

</ElementTreeRoot>

 

When we add in the content at each node into this string of characters we finish up with

<ElementTreeRoot>

Content at ElementTreeRoot

<!-- any comments-->

<?ProcessingInstruction PIValue?>

<Text>Content at Text Element</Text>

<Attributes Attributename1=AttributeValue1 Name2=Value2>

Content at attributes ellement</Attributes>

<Options>Text Content of Options Element

<ViewAsXMLEditor>Content</ViewAsXMLEditor>

<ViewSource>Content</ViewSourcer>

<RememberFileName>Content</RememberFileName>

</Options>

<File></File>

</ElementTreeRoot>

 

The rules for a Document in XML defines a document as a prolog followed by an element followed by any number of Misc ([1] document ::= prolog element Misc* )

Where a proglog is none or one XML declaration (<?xml version=1?>) followed by any number of Misc and none or one Document type declaration followed by any number of Misc([22] prolog ::= XMLDecl? Misc* (doctypedecl Misc*)? )

Misc is a Comment or a processing Instruction or whitespace such as spaces, line feeds, carriage returns ([27] Misc ::= Comment | PI | S)

[39] element ::= EmptyElemTag | STag content ETag

[43] content ::= CharData? ((element | Reference | CDSect | PI | Comment) CharData?)*

If you are seriously Interested I suggest you read the details at the World Wide WEB Consortium's site http://www.w3c.org/

Best regards - Trevor Croll