Introduction to XML

Wednesday, October 8th, 2008

XML stands for Extensible Markup Language. Anyone who wants to use web technologies to distribute information across the Internet or an intranet can use XML.

XML is a way of marking up a document so that software, including a web browser, can process it. It is a powerful and effective tool for processing the contents of a document because it allows you to create your own tags. XML is a simple and flexible markup language that can be used on any operating system or environment.

XML is also platform independent: XML documents can be read and written on any operating system. XML describes the contents of a document rather than its appearance, which makes exchanging data on the web easier and more efficient. It does this by allowing users to develop their own Document Type Definitions (DTDs).

DTDs describe the sets of tags and attributes that a document may contain. The individual markup languages that are defined using DTDs are called XML vocabularies or XML applications.

Genealogical Markup Language (GedML) and Chemical Markup Language (CML) are examples of XML vocabularies. GedML describes ancestral data and CML describes chemical formulas and molecules. XML is an extensible language that is used not only for describing data but also for describing metadata, that is, data about data. In a nutshell, XML is an effective, platform-independent method of describing data and metadata.
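
As a small illustration of self-defined tags, the sketch below parses a document written in a made-up vocabulary. The tag names (`library`, `book`, `title`, `author`) are assumptions invented for this example and do not come from GedML, CML or any other standard. Treating child elements as data and attributes as metadata is only one common convention.

```python
import xml.etree.ElementTree as ET

# A made-up vocabulary: none of these tag names come from a standard.
DOCUMENT = """\
<library updated="2008-10-08">
  <book isbn="0-000-00000-0" language="en">
    <title>Learning XML</title>
    <author>A. Writer</author>
  </book>
</library>
"""

root = ET.fromstring(DOCUMENT)


def describe(library):
    """Return a (data, metadata) pair for every <book> in the library."""
    result = []
    for book in library.findall("book"):
        data = {child.tag: child.text for child in book}  # element content
        metadata = dict(book.attrib)                      # attributes
        result.append((data, metadata))
    return result


books = describe(root)
print(books)
```

The same parser handles any other vocabulary without changes, because the tag names are just data to it.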

XML is a subset of SGML. Its aim is to enable generic SGML to be served, received and processed on the web. XML has been designed for interoperability with both SGML and HTML.

XML documents are made up of storage units called entities. Some entities contain parsed data and others contain unparsed data. XML provides a mechanism for imposing constraints on the storage layout and logical structure.
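
A parsed entity can be sketched as follows. The entity name `company` and its replacement text are assumptions chosen for illustration. The parser in Python's standard library expands the internal entity while reading the document, alongside the predefined entity `&lt;`.

```python
import xml.etree.ElementTree as ET

# "company" is an internal parsed entity declared in the document itself.
DOCUMENT = """\
<!DOCTYPE memo [
  <!ENTITY company "Example Corp">
]>
<memo>
  <from>&company;</from>
  <body>Prices &lt; 10 are listed by &company;.</body>
</memo>
"""

memo = ET.fromstring(DOCUMENT)
sender = memo.findtext("from")   # entity reference replaced by its text
body = memo.findtext("body")
print(sender)
print(body)
```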

Is XML a database?
An XML document is a collection of data, and in that sense it is no different from any other file that stores data. As a database format, XML is self-describing, portable, and able to describe data in a tree or graph structure. Together with its surrounding tools, XML can be regarded as a sort of Database Management System (DBMS), but only in a loose sense.

XML provides storage, schemas, query languages, programming interfaces and so on. It lacks features that a real database provides, such as triggers, queries across multiple documents and multi-user access. The main advantages of XML are that the data is portable and that it allows you to have nested entries.

Storing data as XML allows you to preserve the physical document structure, support document-level transactions and execute queries in an XML query language.

Data is transferred between XML documents and a database by mapping the XML document schema to the database schema. Mappings between document schemas and database schemas are performed on element types, attributes and text. Two mappings are generally used to map an XML document schema to a database schema:

1.TABLE BASED MAPPING
2.OBJECT RELATIONAL MAPPING
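
A table-based mapping can be sketched as follows. The document layout (`database`, a table element, `row`, and one child element per column) and the sample data are assumptions made for this example, and SQLite merely stands in for whatever database is used.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Assumed layout: <database><table_name><row><column>value</column>...</row>...
DOCUMENT = """\
<database>
  <customers>
    <row><id>1</id><name>Ada</name><city>London</city></row>
    <row><id>2</id><name>Linus</name><city>Helsinki</city></row>
  </customers>
</database>
"""


def load_table(conn, table_elem):
    """Copy every <row> of one table element into a SQL table of the same name."""
    rows = table_elem.findall("row")
    columns = [col.tag for col in rows[0]]
    # The identifiers come from our own trusted sample. Never build SQL
    # from untrusted tag names like this.
    conn.execute(f"CREATE TABLE {table_elem.tag} ({', '.join(columns)})")
    placeholders = ", ".join("?" for _ in columns)
    for row in rows:
        values = [row.findtext(col) for col in columns]
        conn.execute(f"INSERT INTO {table_elem.tag} VALUES ({placeholders})", values)


conn = sqlite3.connect(":memory:")
for table in ET.fromstring(DOCUMENT):
    load_table(conn, table)

cities = conn.execute("SELECT name, city FROM customers ORDER BY id").fetchall()
print(cities)
```

This mapping only works for documents that already look like tables. Nested or mixed content needs a richer mapping, which is where the object-relational approach comes in.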

Native XML databases are designed especially to store XML documents. Storing data as XML documents in a native XML database is particularly useful when your data is semi-structured. Although this kind of data can be stored in object-oriented and hierarchical databases, a native XML database is usually a better fit. For whole documents, it can retrieve data faster than a relational database that has to reassemble them from several tables. One more reason to store data in a native XML database is to exploit XML-specific capabilities, such as executing XML queries.

Using Stylesheets in XML
XML is a means of exchanging data between applications. It allows developers to describe and structure their data in their own formats. Because XML emphasizes data rather than formatting, the data in an XML document has to be formatted separately, which can be done in two ways:

1. USING CSS
2. USING XSL

Cascading Style Sheets (CSS):

Initially, Cascading Style Sheets (CSS) were used for formatting the data in XML documents. CSS allows web developers to define formatting for the elements in XML, and the same style sheet can be applied to as many documents as you like. The advantages are:

1. It gives precise control over presentation
2. It is resolution independent
3. It downloads faster
4. It is easy to maintain

Despite these advantages, CSS also has the following disadvantages:

1. The order of elements for display cannot be changed
2. An element cannot be processed more than once.
3. Generated text cannot be added to the presentation

USING eXtensible Stylesheet Language (XSL):

The difficulties encountered with CSS are removed by using XSL. XSL is an application of XML. It allows you to create high-performance XML-based systems by integrating server-side XML processing. The need to transform data from one format to another led to XSL being split into two parts:

i) XSLT – It describes how to transform an XML document into other formats.
ii) XSL-FO – It describes the formatting details of each element in the XML document.

i) XSLT:

XSL Transformations (XSLT) is a mechanism for transforming XML documents from one form into another. An XSLT style sheet is a set of templates based on XPath expressions that tell the processor how to fetch particular nodes from the XML document. XSLT is a part of XSL, which is a style sheet language for XML. XSLT is widely used in website content management to convert XML into HTML pages. It uses XPath to define the parts of a document that match one or more templates. XPath is a query language that allows you to identify nodes, and it can select nodes in any direction. An XSLT processor transforms an XML document into other formats based on the given XSLT document.
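
The way XPath identifies nodes can be sketched with Python's standard library. Note the limits of this sketch: `xml.etree.ElementTree` implements only a subset of XPath and does not perform XSLT transformations. The `catalog` document is an assumption invented for the example.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """\
<catalog>
  <book category="web"><title>XSLT Basics</title><price>25</price></book>
  <book category="db"><title>XML Storage</title><price>40</price></book>
  <book category="web"><title>XPath in Use</title><price>30</price></book>
</catalog>
"""

catalog = ET.fromstring(DOCUMENT)

# Downward selection: titles of all books whose category attribute is "web".
web_titles = [t.text for t in catalog.findall("./book[@category='web']/title")]

# Upward selection: ".." steps from a matching <title> to its parent <book>.
parent = catalog.find(".//title[.='XML Storage']/..")

print(web_titles)
print(parent.get("category"))
```

An XSLT template uses expressions of the same kind in its `match` and `select` attributes to decide which nodes it applies to.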

ii) XSL-FO:

XSL-FO stands for XSL Formatting Objects. There are two different ways in which an XML document can be formatted. They are:

1. Layout Based formatting
2. Content Based formatting

In layout-based formatting, the limitations of the target medium may constrain the content or its appearance on the page, whereas in content-based formatting the target medium is generated to accommodate the information being formatted. XSL-FO allows you to apply formatting and styling options to your document.

Working with XML using ASP
The need for manipulating XML on the server increases as the use of XML increases. XML and ASP are an efficient and powerful combination. XML is a simple and powerful tool for web developers. XML was created to handle complex web documents. It allows you to define all your own tags, with rules for data description and data relationships. XML is used to remove the cumbersome problems that were faced with HTML, and information can be accessed more easily using XML. ASP and XML are powerful tools for creating dynamic web pages.

ASP uses XML for a simple reason: like HTML, XML data can be transmitted between dissimilar platforms, but XML also allows us to express complex structures. Moreover, it allows us to create our own tags with all sorts of rules. A new document instance can be created using MSXML. There are a number of ways to access XML data from an ASP page. The Document Object Model (DOM) plays an important role in retrieving XML data from ASP. In the DOM, a document is viewed as a tree of nodes, and every node of the tree can be accessed randomly. The main advantage is that it provides all functions in an object-based way. MSXML is a DOM-based XML parser from Microsoft; this component is used for accessing XML documents.

There are two groups of DOM programming interfaces. The first group defines the interfaces required for writing applications. The second group defines interfaces that assist developers. Once the DOM object is created on the server, your own XML document can be created. The XML document can easily be parsed on the server in an ASP page, and the results can then be sent to the client.
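
The DOM workflow described above (create a document object, build the tree, then parse it and reach nodes directly) can be sketched in Python. This uses `xml.dom.minidom` from Python's standard library rather than MSXML in an ASP page, so it shows the DOM idea only, not the ASP code. The `orders` document is an assumption invented for the example.

```python
from xml.dom.minidom import getDOMImplementation, parseString

# Create a new, empty document whose root element is <orders>.
dom = getDOMImplementation().createDocument(None, "orders", None)
root = dom.documentElement

# Build the tree node by node.
order = dom.createElement("order")
order.setAttribute("id", "1001")
item = dom.createElement("item")
item.appendChild(dom.createTextNode("Keyboard"))
order.appendChild(item)
root.appendChild(order)

xml_text = dom.toxml()

# Parse the text again and reach nodes directly through the tree.
parsed = parseString(xml_text)
first_item = parsed.getElementsByTagName("item")[0]
item_text = first_item.firstChild.data
order_id = first_item.parentNode.getAttribute("id")

print(xml_text)
print(item_text, order_id)
```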

Accessing XML using Java Technologies
The most important benefit of XML is its simplicity. Though it is simple, it is powerful enough to express complex data structures. Java is one of the most important programming languages used for building web applications. It is an object-oriented language that was originally intended for embedded systems in consumer devices. It later became more important for web pages that are dynamic in nature. Java applets and servlets are the main mechanisms for implementing this.

Another advantage of using Java is JavaBeans, a software component model for Java that allows rapid application development with a visual builder. DOM is one of the methods for accessing the structure of an XML document. An alternative is to use an event-driven API. SAX is a simple API designed for XML. In SAX 1, DocumentHandler is very important, since its methods are called every time an element is found; SAX 2 replaces it with ContentHandler. A DocumentHandler is used as follows:

Step 1: Import the parser interface.
Step 2: Create an instance of the SAX driver.
Step 3: Using this driver, create a parser object.
Step 4: Register an instance of class MyHandler as a DocumentHandler.
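
The four steps can be sketched with the SAX binding in Python's standard library instead of Java. This is a substitution for illustration: Python's `xml.sax` follows SAX 2, so the handler derives from `ContentHandler` rather than `DocumentHandler`, and `make_parser()` covers the driver and parser steps in one call. The class name `MyHandler` is taken from the steps above; the sample document is an assumption.

```python
import io
import xml.sax  # step 1: import the parser interface


class MyHandler(xml.sax.ContentHandler):
    """Records every start and end tag in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


DOCUMENT = b"<order><item>Keyboard</item><item>Mouse</item></order>"

handler = MyHandler()
parser = xml.sax.make_parser()      # steps 2 and 3: obtain a parser object
parser.setContentHandler(handler)   # step 4: register the handler
parser.parse(io.BytesIO(DOCUMENT))  # callbacks fire while the input is read

print(handler.events)
```

No tree is built here. The application sees each event once, which keeps memory use low for large documents.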

JOX is a set of Java libraries that allows you to transfer data between XML documents and JavaBeans. JOX matches the elements of an XML document to the fields of a bean, and it will use a DTD when writing an XML document if one is available. JOX, unlike some other libraries, allows you to use any form of XML document and any JavaBean without creating a separate schema to describe the mapping between Java and XML.
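
The idea of matching element names to bean fields without a separate mapping schema can be sketched in Python, with a dataclass standing in for a JavaBean. This is an analogy only and not the JOX API. The `Customer` class and the helper functions `from_xml` and `to_xml` are hypothetical names invented for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields


@dataclass
class Customer:
    """Stand-in for a bean: plain fields with default values."""
    name: str = ""
    city: str = ""
    age: int = 0


def from_xml(cls, xml_text):
    """Fill an object from child elements whose tags match its field names."""
    root = ET.fromstring(xml_text)
    obj = cls()
    for field in fields(cls):
        text = root.findtext(field.name)
        if text is not None:                  # missing elements keep defaults
            setattr(obj, field.name, field.type(text))
    return obj


def to_xml(obj):
    """Write one child element per field, named after the field."""
    root = ET.Element(type(obj).__name__)
    for field in fields(obj):
        ET.SubElement(root, field.name).text = str(getattr(obj, field.name))
    return ET.tostring(root, encoding="unicode")


customer = from_xml(Customer, "<Customer><name>Ada</name><age>36</age></Customer>")
round_trip = to_xml(customer)
print(customer)
print(round_trip)
```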

XP is an XML parser written in Java. The following are the advantages of XP:

1.XP is designed to be 100% conformant and correct
2.XP aims at High performance
3.Apart from the high level parser API, it also provides a low level API that supports the construction of different kinds of parser.

Breeze XML Binder is presented as one of the most complete Java/XML data binding solutions available. Breeze creates JavaBeans directly from XML structures.

XML Editor
An XML editor allows you to enter XML and save it to files. In its simplest form, an XML editor is a text editor used to create DTDs and XML documents. Some editors check whether XML documents conform to the rules of XML. The term XML editor refers to different types of tools, depending on the purpose for which they are used. There are a number of XML editors available in the market.

A few of the XML editors with their detailed descriptions are given below:

1.XMLmind XML Editor
The XMLmind XML Editor (XXE) offers a word-processor-like view that works without a tree view or visible tags, which lets the author concentrate on content creation. XXE also has an efficient and powerful tree view that allows us to open XML documents for which no CSS style sheet is available. XXE helps in editing XML data and documents by embedding standard controls in the word-processor-like view. XMLmind XML Editor is easy to deploy and highly extensible, a feature that sets it apart from a plain word processor.

2.Peter’s XML Editor
Version 2.0 of Peter’s XML Editor is a major development of this tool. It uses a new XML parser, which is more powerful than the one used previously. The new tree view is much faster and more powerful. The new source view allows better editing of Unicode files.

3.VIM as XML Editor
VIM is a great, extensible editor with many features that works well for XML. It is a highly efficient editor built for text editing. VIM is often called a “programmer’s editor”, and it is so useful for programming that many consider it an entire IDE. It is not just for programmers, though; it is well suited to all kinds of text editing.

4.oXygen XML Editor
This XML editor combines a simple and elegant interface with XML editing features, which has made it popular in both the corporate and academic worlds. The oXygen XML editor provides tools for document creation and presentation, and documents can be validated against any user-defined schema. Context-sensitive editing minimizes validation errors. The oXygen XML editor 4.0 provides a special layout when entering debugging mode.

5.Exchanger XML Editor
This is a Java-based XML editor that enables easy browsing, managing and editing. It offers extensive functionality to help XML authors, business analysts and software developers.

6.XML SPY EDITOR
Of all the XML editors, XML SPY is often ranked at the top. It offers graphical schema editing and a user interface that impresses the user with its versatility and power.

XML Parser
An XML parser is a software module that reads XML documents and provides access to their content. It generates a structured tree and returns the results to the calling application, such as a browser. An XML parser acts as a processor that determines the structure and properties of the data. The application can then use the parsed document to generate output, for example a display form.

There are a number of parsers available and a few of them are listed below:

1.The Xerces Java Parser
The main applications of the Xerces Java parser are building XML-savvy web servers and ensuring the integrity of e-business data expressed in XML.

2.expat XML parser
The expat XML parser is written in C and runs on UNIX and Win32. It is a non-validating processor: it checks that a document is well formed but does not validate it against a DTD. It is typically used as a library on which other XML parsers are built. This parser was contributed by James Clark.

3.XP and XT
XP is a non-validating XML parser and XT is an XSL processor. Both are written in Java. XP detects all documents that are not well formed. It gives high performance and aims to be the fastest conformant XML parser in Java. XT should not be confused with the unrelated toolset of the same name for building program transformation systems, whose tools include pretty-printing, bundling of systems, tree transformation and so on.

4.SAX
Simple API for XML (SAX) was developed by the members of a public mailing list (XML-DEV). It gives an event-based approach to XML parsing: instead of going from node to node, the parser goes from event to event. SAX is an event-driven interface. Events include encountering an XML tag, detecting an error and so on.

5.XML pull parser
It is optimal for applications that require a fast and small XML parser. It should be used when input elements have to be processed quickly and efficiently.

6.XML parser for Java
It runs on any platform that has a Java virtual machine. It is sometimes called XML4J. It has an interface that allows you to take a string of XML-formatted text, pick out the XML tags and use them to extract the tagged information.
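
The difference between checking well-formedness and validating, which matters for a parser such as expat, can be sketched with the expat binding in Python's standard library. The helper name `is_well_formed` and both sample documents are assumptions made for this example.

```python
import xml.parsers.expat


def is_well_formed(xml_text):
    """Return (True, None) if the text parses, else (False, error message)."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(xml_text, True)   # True marks the final chunk of input
    except xml.parsers.expat.ExpatError as err:
        return False, str(err)
    return True, None


# Breaks its own DTD (<note> may only contain <to>) but is well formed,
# so a non-validating parser accepts it.
INVALID_BUT_WELL_FORMED = """\
<!DOCTYPE note [ <!ELEMENT note (to)> <!ELEMENT to (#PCDATA)> ]>
<note><from>Ada</from></note>
"""

# The <to> element is never closed, so this is not well formed.
NOT_WELL_FORMED = "<note><to>Linus</note>"

ok_result = is_well_formed(INVALID_BUT_WELL_FORMED)
bad_result = is_well_formed(NOT_WELL_FORMED)
print(ok_result)
print(bad_result)
```

A validating parser would reject the first document as well, because it also compares the content against the DTD.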

XML RPC
XML RPC is remote procedure calling over the Internet. It is a network programming technique for making procedure calls on remote machines. Generally, XML RPC is used for developing web services. XML RPC messages are the requests and responses sent between clients and servers. XML RPC is platform independent.

An XML RPC message is an HTTP POST request. The body of the request is in XML, and the value that the procedure returns is also formatted as XML. It is designed to be as simple as possible, while allowing complex data structures to be transmitted, processed and returned.

XML RPC is a protocol that allows different languages on different machines to communicate with each other. Since procedure requests and responses are in XML, the two ends of the RPC connection do not have to be written in the same language. XML RPC is one of the simplest tools for integrating even the most uncommunicative systems. XML RPC can transport binary data as base64.

XML RPC has 8 datatypes. They are as follows:

1. INT
2. BOOLEAN
3. DOUBLE
4. STRING
5. BASE64
6. ARRAY
7. STRUCT
8. DATE/TIME
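
The request body and the eight data types can be sketched with `xmlrpc.client` from Python's standard library. Nothing is sent over the network here: the call is only encoded to XML and decoded again. The method name `example.echo` and the sample values are assumptions invented for the example.

```python
import datetime
import xmlrpc.client

# One sample value for each of the eight data types.
params = (
    41,                                      # int
    True,                                    # boolean
    3.5,                                     # double
    "hello",                                 # string
    xmlrpc.client.Binary(b"\x00\x01"),       # base64
    [1, 2, 3],                               # array
    {"name": "Ada", "age": 36},              # struct
    datetime.datetime(2008, 10, 8, 12, 0),   # date/time
)

# This string is what a client would place in the body of the HTTP POST.
request_body = xmlrpc.client.dumps(params, methodname="example.echo")

# A server decodes the same XML back into a method name and values.
decoded_params, decoded_method = xmlrpc.client.loads(request_body)

print(request_body)
print(decoded_method, decoded_params[5], decoded_params[6])
```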

An ARRAY is an indexed array and a STRUCT is a kind of associative array.

XML-RPC.NET is one of the libraries for XML RPC clients and services.

Listed below are few of its features:

1. Support for .NET on both the client and the server side
2. Interface-based definition of XML-RPC servers and clients
3. ASP.NET services that support both SOAP and XML RPC
4. Dynamic generation of a documentation page at the URL of the XML-RPC service
5. Services defined with XML-RPC.NET can run on Microsoft IIS servers

XML Schema
An XML Schema defines the elements, child elements and attributes that can appear in a document. It can also define the order and number of child elements and the data types of elements and attributes. In a DBMS, a schema is a description of the database structure: it defines internal structures such as the tables and fields that make up the data. Schemas are defined using constraints.

There are two types of constraints:

1. CONTENT CONSTRAINTS
2. DATATYPE CONSTRAINTS
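
The two kinds of constraint can be illustrated with a hand-written check. This is a toy sketch and not an XSD validator: the rule table `ORDER_RULES`, the function `check` and the `order` documents are all assumptions invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules for <order>: the allowed children, in order, each
# paired with a converter that the child's text must satisfy.
ORDER_RULES = [("id", int), ("customer", str), ("total", float)]


def check(elem, rules):
    """Return a list of constraint violations (empty if the element passes)."""
    errors = []
    expected = [name for name, _ in rules]
    actual = [child.tag for child in elem]
    if actual != expected:                        # content constraint
        errors.append(f"expected children {expected}, found {actual}")
        return errors
    for child, (name, convert) in zip(elem, rules):
        try:                                      # datatype constraint
            convert(child.text or "")
        except ValueError:
            errors.append(f"<{name}> is not a valid {convert.__name__}")
    return errors


good = ET.fromstring(
    "<order><id>7</id><customer>Ada</customer><total>19.90</total></order>")
bad_type = ET.fromstring(
    "<order><id>seven</id><customer>Ada</customer><total>19.90</total></order>")
bad_content = ET.fromstring("<order><customer>Ada</customer><id>7</id></order>")

print(check(good, ORDER_RULES))
print(check(bad_type, ORDER_RULES))
print(check(bad_content, ORDER_RULES))
```

A real schema expresses the same rules declaratively, so that any schema validator can apply them.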

An XML schema is a set of schema components. There are 13 kinds of component, grouped into 3 categories: primary, secondary and helper components.

The primary components include

  •  SIMPLE TYPE DEFINITIONS
  •  COMPLEX TYPE DEFINITIONS
  •  ATTRIBUTE DECLARATIONS
  •  ELEMENT DECLARATIONS

An XML schema consists of components such as type definitions and element declarations. They are used to assess the validity of well-formed element and attribute information items.
The purpose of an XML Schema: Structures schema is to describe a class of XML documents by using schema components. Schemas provide specifications and additional information such as normalized and default values for elements and attributes. Thus, XML schemas can be used to describe and catalogue XML documents. XML schema validators allow us to check whether an instance of a document meets the requirements. XML data can be described and validated using XSD (XML Schema Definition). An XSD is itself written in XML, so it can be read with an ordinary XML parser and needs no special parser of its own.

XML Tools
XML tools help developers work with XML in a wide variety of situations, such as comparing two documents, validating XML documents and checking whether a file is well formed. There are a lot of tools available, including tools for creating and editing XML and for building e-commerce applications.
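
Comparing two documents, one of the tasks mentioned above, can be sketched with Python's standard library (Python 3.8 or later is assumed for `canonicalize`). Both documents are reduced to canonical XML, which sorts attributes, and whitespace around text is stripped before the comparison. The function name `same_document` and the sample documents are assumptions invented for the example.

```python
import xml.etree.ElementTree as ET


def same_document(xml_a, xml_b):
    """Compare two XML strings, ignoring attribute order and surrounding whitespace."""
    canonical_a = ET.canonicalize(xml_a, strip_text=True)
    canonical_b = ET.canonicalize(xml_b, strip_text=True)
    return canonical_a == canonical_b


FIRST = '<item id="1" colour="red"><name>Pen</name></item>'

# Same content, different attribute order and indentation.
SECOND = """<item colour="red" id="1">
    <name>Pen</name>
</item>"""

# One attribute value really differs.
THIRD = '<item id="1" colour="blue"><name>Pen</name></item>'

print(same_document(FIRST, SECOND))
print(same_document(FIRST, THIRD))
```

A plain text comparison would report the first two documents as different, which is why XML-aware comparison tools exist.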

A few of the tools are listed below:

1.Microsoft XML Diff and Patch
It is a set of tools that allows you to compare two XML documents. It detects additions, deletions and other changes between the two documents. It ignores the order of attributes and insignificant white space, and it does not care about document encoding. Ordinary line-based diff tools lead to suboptimal results when used on structured data, since they cannot recognize the tree-based structure.

2.XSD schema validator
This allows the validation of XML documents against a W3C XML schema or an XML-Data Reduced (XDR) schema. It checks whether the given document is well formed and whether it is valid against the schema model.

3.Microsoft XSD Inference 1.0
Given a well-formed XML file, this tool generates an XML Schema Definition (XSD) that can validate that XML file.

4.XML CONTENT TOOLS
The XML content tools allow us to create, edit and publish XML. They were originally designed for publishing needs. XML Pro from Vervet Logic is another relatively cheap and worthwhile XML editing tool. Its user interface helps you work quickly.

5.XMetaL
XMetaL from SoftQuad is another affordable tool designed to make quick work of creating and editing files. It is a set of tools designed to simplify the implementation of XML applications. Two companies offer XML application tool kits. XMetaL enables you to deliver content to multiple channels in a short time and reduces the complexity and cost of content. It supports the DOM, for which extensive documentation is included. It also supports COM.

6.Breeze Commerce Studio
Bluestone Software offers Breeze Commerce Studio for the creation of XML and Java based applications. It imports schemas from XML DTDs, XML documents and JDBC/ODBC databases.
It allows us to add data types and constraints to the elements of the schema.

7.Standalone and Single User content tools
Standalone, single-user XML content tools are very cheap, since they require the least amount of development time and effort. If you have to handle XML creation, editing and publishing for print and the web, investigate the product lines of Arbortext and Interleaf. RAD XML persistence tools are used to process, display and share complex data across applications easily with minimum coding.

8.Altova STYLEVISION 2004
Altova STYLEVISION 2004 is a new XML tool for web developers that provides extensive utilities for migrations. It is a visual editing tool that allows us to create style sheets and forms easily, based on XML schemas or databases. It is a powerful database editing tool that allows you to generate reports directly from databases to XML.

9.Altova XML SPY2004
Altova XML SPY2004 Enterprise Edition is one of the most important XML tools for advanced application development. It has been used extensively for editing and working with XML technologies. It uses the Enhanced Grid View to manage the creation of elements and attributes.

10.XT
XT is a powerful tool that is easy to use. Understanding the tree structure is made easy using XT, since the file you want to turn into a result tree is named by the last parameter.
XT is easy to use if you are familiar with Java.

SOAP XML
Simple Object Access Protocol (SOAP) is an XML-based object invocation protocol that can be used for accessing web services. SOAP was developed so that distributed applications can communicate over HTTP and through firewalls. SOAP is platform independent, and it uses XML and HTTP to access services, servers and objects.

SOAP consists of 3 parts:

1.SOAP ENVELOPE: Defines a framework for expressing what is in a message and who should handle it. The SOAP envelope namespace defines header and body element names and the encoding style.

2.SOAP ENCODING RULES: Defines a mechanism for exchanging instances of application defined data types. An encoding style identifies the rules that describe how a specific piece of data is serialized.

3.SOAP RPC: Defines a convention for representing Remote Procedure Calls and responses. SOAP RPC uses a request/response model for message exchanges. The request that is sent to the endpoint is the call, and the response that the endpoint sends back represents the result of that call.
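
The three parts can be seen together in a minimal request message. The sketch below builds a SOAP 1.1 envelope with Python's standard library. The service namespace `urn:example:stock`, the method `GetPrice` and its `symbol` argument are assumptions invented for the example, and `build_call` is a hypothetical helper, not part of any SOAP library.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"   # SOAP 1.1 encoding rules
SERVICE_NS = "urn:example:stock"                         # hypothetical service


def build_call(method, **arguments):
    """Wrap a remote procedure call in a SOAP 1.1 envelope and return it as text."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    # The encodingStyle attribute names the encoding rules used in the message.
    envelope.set(f"{{{SOAP_ENV}}}encodingStyle", SOAP_ENC)
    ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # RPC convention: the element in the body is named after the method.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in arguments.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


request = build_call("GetPrice", symbol="XYZ")
parsed = ET.fromstring(request)
call_element = parsed.find(f"{{{SOAP_ENV}}}Body")[0]

print(request)
print(call_element.tag, call_element.findtext("symbol"))
```

A client would send this text as the body of an HTTP POST, and the response would come back in an envelope of the same shape.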

SOAP has the following features:

  • PROTOCOL independence
  • LANGUAGE independence
  • PLATFORM AND OS independence

SOAP is a way for programs to communicate with other programs using XML. SOAP uses XML to encapsulate data that needs to be sent to a remote subroutine. In simpler terms, SOAP is a way for objects on different platforms, for example Java objects and COM objects, to communicate with each other. A SOAP client is a program that creates an XML document containing the information required to invoke a method remotely in a distributed system.

A SOAP server is code that listens for SOAP messages and acts as a distributor and an interpreter. SOAP defines a base level of encoding rules. The encodings can be either

1.SIMPLE ENCODINGS
2.COMPOUND ENCODINGS

SIMPLE ENCODINGS cover simple types such as ints, floats and strings, or user defined types derived from them. They also include data types such as arrays of bytes and enumerations.

COMPOUND ENCODINGS include data types such as arrays and structures.

Voice XML
Voice XML is an XML-based language that allows web applications and content to be accessed through phones. Speech-based telephone applications can be developed using Voice XML.

A Voice XML system has three main components:

The document server receives requests from the Voice XML interpreter and responds with Voice XML documents.

The Voice XML interpreter interprets the Voice XML documents that it receives from the document server.

The implementation platform generates responses to user requests.

Voice XML depends greatly on the infrastructure of the Internet. Voice XML uses a voice browser for audio input and output. The voice browser runs on the voice gateway, which is connected to the Public Switched Telephone Network (PSTN).

Components of Voice XML:

1.DIALOGS:
There are two types of dialogs, Forms and Menus.

Forms collect input from the user just as in HTML where data is collected in the forms. Menus provide a list of options for the user to select from.

2.SESSION:
A Session begins once the user begins to interact with a voice XML document.

3.APPLICATION:
It is a collection of Voice XML Documents. All the documents in a particular application share the same root document.

4.GRAMMAR:
It specifies the set of allowable words and phrases that the user may say when interacting with the Voice XML document, for example the options of a menu.

5.EVENTS:
If there are any semantic errors in the Voice XML document, the Voice XML interpreter throws events.

6.LINKS:
A link specifies a transition that is common to all dialogs in its scope. If the user input matches the link’s grammar, control is transferred to the link’s destination.
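
The two dialog types can be sketched with a short Voice XML document, inspected here with Python's standard library. The document is deliberately abbreviated and is not a complete application: the prompts, the dialog ids and the field name are assumptions invented for the example, and the `#news` choice points at a form that is left out.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"   # Voice XML 2.0 namespace

DOCUMENT = f"""\
<vxml version="2.0" xmlns="{VXML_NS}">
  <menu id="main">
    <prompt>Say weather or news.</prompt>
    <choice next="#weather">weather</choice>
    <choice next="#news">news</choice>
  </menu>
  <form id="weather">
    <field name="city">
      <prompt>Which city?</prompt>
    </field>
  </form>
</vxml>
"""

root = ET.fromstring(DOCUMENT)
ns = {"v": VXML_NS}

# A menu offers a list of options for the user to select from.
menu_choices = [choice.text for choice in root.findall("v:menu/v:choice", ns)]

# A form collects input from the user, one field at a time.
form_fields = [field.get("name") for field in root.findall("v:form/v:field", ns)]

print(menu_choices)
print(form_fields)
```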

XML Spy 2004 Tool
DTDs and XML schemas are very important concepts in framing the content model of an XML document. XML SPY 2004 is a very important tool that integrates both DTDs and XML schemas into the editing of XML documents. This means that the definition of each element or attribute can be located using the “Go to Definition” command.

With XML SPY 2004 we can easily create XML schemas and documents and automatically generate code in multiple programming languages, thus saving time.

XML SPY supports both editing and Schema validation for the following Schema types:

1.Document Type Definitions (DTD)
2.Document Content Definitions (DCD)
3.XML Data Reduced (XDR)

XML SPY has four advanced views.

• Enhanced Grid View
• Database/Table view
• Text view
• Browser View

Enhanced Grid View is used for structure editing. Database/Table View displays repeated entries in a tabular fashion. Text View, with syntax coloring, is used for low-level work, and Browser View supports CSS and XSL style sheets.

One of the most important features of XML SPY is its integration with databases. This includes the import of table data and the import of table structure as XML schema. Other than MS-Access it also supports ODBC and OLEDB to access the other databases.

XSLT Designer is one of the attractive tools that come with XML SPY. The Designer is a separate program that loads XML files. It then uses drag and drop to build an HTML document, generating an XSLT file as it goes. The resulting document can be viewed in the integrated preview, and the style sheet can be saved as an XSLT file.
