Basic Concepts of XML

What is XML?
  • XML stands for eXtensible Markup Language. It is called extensible because it allows its users to define their own tags.
  • XML was developed to provide a universal format for describing structured documents and data.
  • XML has no fixed set of tags; any user can define their own. Although the tags look similar to HTML tags, they differ in how they are used.
  • Unlike HTML, which tags elements in Web pages for presentation by a browser, e.g. <b>Oracle</b>, XML tags elements as data, e.g. <company>Oracle</company>. In this example, HTML treats <b> as a command to display the enclosed data in bold. In XML, by contrast, company could be a column name in a database and Oracle the column value.
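
To make the "tags as data" idea concrete, here is a minimal Python sketch using the standard library's ElementTree parser. The one-element document is an assumption made for illustration: a hypothetical company element holding the value Oracle, read back as a column name and a column value.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-element document: the tag names the data, the text is the value.
element = ET.fromstring("<company>Oracle</company>")

# Treat the tag as a column name and the text as the column value.
row = {element.tag: element.text}
print(row)  # {'company': 'Oracle'}
```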
Why do we use XML?
  • As XML is a W3C (World Wide Web Consortium) standard, various software companies have openly accepted and implemented it in their operations.
  • It is an open standard that is free to use.
  • It is platform-independent, language-independent, textual data.
  • XML can be used with existing web protocols (such as HTTP and MIME) and mechanisms (such as URLs), and it does not impose any additional requirements.
  • XML can carry many kinds of information, in high volumes, especially over the Internet and the Web.
  • It is Unicode compatible, which means it can handle text in any language that Unicode encodings such as UTF-8 can represent.
  • It is used as an interface touch-point between many applications. XML is replacing the age-old flat-file approach to sending and receiving data between applications.
Building blocks of XML

XML documents are made up of the following building blocks:
  • Elements
  • Attributes
  • Entities
  • PCDATA
  • CDATA
What are Elements?

Elements are the main building blocks of XML documents.

In the following example, "my_body" and "message" are XML elements. Elements can contain text, other elements, or be empty.
<my_body>some text</my_body>
<message>some other text</message>
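
The three kinds of element content (text, other elements, or nothing) can be seen by parsing a small document. This is only a sketch: the letter root and the empty signature element are hypothetical names added here, since a well-formed XML document needs exactly one root element.

```python
import xml.etree.ElementTree as ET

# <letter> and <signature/> are hypothetical names added for this sketch;
# a well-formed document needs exactly one root element.
doc = """
<letter>
  <my_body>some text</my_body>
  <message>some other text</message>
  <signature/>
</letter>
"""

root = ET.fromstring(doc)

# An element that contains other elements: list its children by tag name.
child_tags = [child.tag for child in root]

# An element that contains text.
body_text = root.find("my_body").text

# An empty element has no text and no children.
signature = root.find("signature")
is_empty = signature.text is None and len(signature) == 0

print(child_tags)  # ['my_body', 'message', 'signature']
print(body_text)   # some text
print(is_empty)    # True
```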

What are Attributes?

Attributes provide extra information about elements.
Attributes are always placed inside the opening tag of an element. Attributes always come in name/value pairs. The following "images" element has additional information about a source file and its name:
Example:
<images location="computer.gif" name="some image name"/>

In the above example, images is the element, whereas location and name are attributes.
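
As a quick illustration, the same images element can be read with Python's standard ElementTree module, which exposes the name/value pairs as a dictionary. Looking up the attribute name "missing" is just an assumed example of a name that is not present.

```python
import xml.etree.ElementTree as ET

image = ET.fromstring('<images location="computer.gif" name="some image name"/>')

print(image.tag)              # images
print(image.attrib)           # {'location': 'computer.gif', 'name': 'some image name'}
print(image.get("location"))  # computer.gif
print(image.get("missing"))   # None (attribute not present)
```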

What are Entities?

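Some characters have a special meaning in XML, like the less-than sign (<), which marks the start of an XML tag. To use such a character as ordinary text, write it as an entity reference. The following entities are predefined in XML:
&lt; (less than, <)
&gt; (greater than, >)
&amp; (ampersand, &)
&quot; (double quote, ")
&apos; (apostrophe, ')

Each entity reference starts with an & mark and ends with a semicolon.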
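
In practice you rarely type these entity references by hand. As a small sketch, Python's standard xml.sax.saxutils module can convert the special characters to their predefined entities and back. The sample string and the quote mappings are assumptions chosen for this example.

```python
from xml.sax.saxutils import escape, unescape

raw = 'if a < b & b > c then say "hi"'

# escape() always replaces &, < and >; extra mappings are passed for quotes.
quote_entities = {'"': "&quot;", "'": "&apos;"}
escaped = escape(raw, quote_entities)
print(escaped)
# if a &lt; b &amp; b &gt; c then say &quot;hi&quot;

# unescape() reverses the mapping and gives back the original text.
restored = unescape(escaped, {"&quot;": '"', "&apos;": "'"})
print(restored == raw)  # True
```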

What is PCDATA?

PCDATA means parsed character data. Think of character data as the text found between the start tag and the end tag of an XML element.

PCDATA is text that WILL be parsed by a parser. The text will be examined by the parser for entities and markup. Tags inside the text will be treated as markup and entities will be expanded.

However, parsed character data must not contain literal & or < characters, and > is best avoided too; these need to be represented by the &amp;, &lt; and &gt; entities, respectively.
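
Here is a minimal sketch of what "parsed" means, using Python's standard ElementTree parser. The note_text and em element names are hypothetical, made up only for this example.

```python
import xml.etree.ElementTree as ET

# <note_text> and <em> are hypothetical element names used only for this sketch.
doc = "<note_text>1 &lt; 2 &amp; 3 &gt; 2 is <em>true</em></note_text>"
root = ET.fromstring(doc)

# Entities in parsed character data are expanded by the parser.
print(repr(root.text))  # '1 < 2 & 3 > 2 is '

# A tag inside the text is treated as markup: it becomes a child element.
print([child.tag for child in root])  # ['em']
print(root.find("em").text)           # true
```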

What is CDATA?

CDATA means character data. CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.
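
Inside a document, unparsed text is usually written as a CDATA section, which starts with <![CDATA[ and ends with ]]>. The sketch below, again with Python's standard ElementTree parser and a made-up body element, shows that tags and entity references inside the section come through exactly as typed.

```python
import xml.etree.ElementTree as ET

# The content of the CDATA section is an invented example string.
doc = "<body><![CDATA[Use <b>bold</b> & keep &lt; as typed]]></body>"
root = ET.fromstring(doc)

# The text is returned literally: <b> is not markup and &lt; is not expanded.
print(root.text)  # Use <b>bold</b> & keep &lt; as typed

# No child elements were created from the tags inside the CDATA section.
print(len(root))  # 0
```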

Sample XML
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>


In the above example XML file, the part from <!DOCTYPE note [ up to ]> is the DTD (Document Type Definition).
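
As a rough sketch, the sample note can be read with Python's standard ElementTree module. Note that this parser accepts the inline DTD but does not validate the document against it; it simply builds the element tree.

```python
import xml.etree.ElementTree as ET

sample = """<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>
"""

root = ET.fromstring(sample)

# Collect each child element as a tag -> text pair.
fields = {child.tag: child.text for child in root}

print(root.tag)        # note
print(fields["to"])    # Tove
print(fields["body"])  # Don't forget me this weekend
```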


What is DTD?

DTD (Document Type Definition) is a set of rules or grammar that we define to construct our own XML rules (also called a "vocabulary"). In other words, a DTD provides the rules that define the elements and structure of our new language.

This is comparable to defining table structures in Oracle for a new system. Just as we define the columns of a table, choose their datatypes, and decide whether each column allows NULL, the DTD defines the structure of the XML document.

A DTD can be declared inline inside an XML document (as in the sample above), or as an external reference (as in the example below).

Example of external DTD:
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Some data</body>
</note>

The contents of the note.dtd file are as follows:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

Why use a DTD?

With a DTD, each of your XML files can carry a description of its own format.

With a DTD, independent groups of people can agree to use a standard DTD for interchanging data.

Your application can use a standard DTD to verify that the data you receive from the outside world is valid. You can also use a DTD to verify your own data.
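
To give a feel for what this verification involves, here is a deliberately simplified Python sketch. It is not a real DTD validator: Python's standard ElementTree parser does not validate, and in practice a validating parser (for example, the third-party lxml library) would be used. The helper names sequence_models and children_match are hypothetical, and the sketch only understands plain comma-separated content models such as (to,from,heading,body).

```python
import re
import xml.etree.ElementTree as ET

NOTE_DTD = """
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
"""

def sequence_models(dtd_text):
    """Return {element: [child, ...]} for simple comma-separated content models only."""
    models = {}
    for name, content in re.findall(r"<!ELEMENT\s+(\w+)\s+\(([^)]*)\)>", dtd_text):
        if content.strip() != "#PCDATA":
            models[name] = [part.strip() for part in content.split(",")]
    return models

def children_match(xml_text, dtd_text):
    """Hypothetical helper: True if the root's children match its declared sequence."""
    root = ET.fromstring(xml_text)
    expected = sequence_models(dtd_text).get(root.tag)
    if expected is None:
        return False  # the root element has no sequence declared in this DTD
    return [child.tag for child in root] == expected

good = ("<note><to>Tove</to><from>Jani</from>"
        "<heading>Reminder</heading><body>Some data</body></note>")
bad = "<note><from>Jani</from><to>Tove</to></note>"  # wrong order, missing elements

print(children_match(good, NOTE_DTD))  # True
print(children_match(bad, NOTE_DTD))   # False
```

A real validator also checks occurrence indicators, attributes, nested content models and entity declarations, none of which this sketch attempts.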

8 comments:

  1. There is nothing new in this. Old wine in new bottle....

  2. XML is always XML. It cannot change by time unless for new releases. I have just used this post as an introduction to those who are willing to work with XML as starters.

    Thank you for reading and leaving comment.

  3. Hi,

    I have a query on XML which I am unfortunately not able to post either on your blog page or the oracle developer community. Would be greatly obliged if you could help me out. It is posted on Orkut forum - Oracle Beginners under XML tool (whitespace error) or if you could give me your email id, I could mail it you. I am a student. It is the most basic of examples but am not able to execute.

    Thanks and Regards,

    Mumtaz Sheikh

  4. You can mail me to tlananthu@gmail.com

  5. Hi!

    I'm new to XML. And, today after going through your XML doc - i am able to understand basics about it. It is very neat and compact, too.

    I'm a Oracle Developer working mainly in PL/SQL stuff. And, very few times i need to handle XML. Now i'll be able to handle XML with more ease.

    Thanks for sharing this with us.

  6. Hi Anantha,

    Thanks for this article on XML. It really clears the basics of XML on which the example in reports is made up of. Thank you and keep up the good work.

    Regards,

    Mumtaz

  7. hi..
    thanks for the short and quick notes..

    deepti gupta

  8. Well , what to say about this information.I've been looking for such information for quite a long time but couldn't find anywhere online.It clears the basic of XML.