First published by IBM developerWorks at http://www.ibm.com/developerWorks. Visit ibm.com/developerWorks for more tutorials on open standard technologies, IBM products, and more.
When used together, these technologies aim to create a smooth, cohesive Web experience
for the user by exchanging small amounts of data with the content servers rather than reloading and
re-rendering the entire page after some user action. You can construct Ajax engines for mashups
from various Ajax toolkits and libraries (such as Sajax or Zimbra), usually implemented in JavaScript.
The Google Maps API includes a proprietary Ajax engine, and the effect it has on the user experience
is powerful: it behaves like a truly local application in that there are no scrollbars to manipulate or
translation arrows that force page reloads.
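The difference between a full page reload and an Ajax-style partial update can be sketched as follows. This is a minimal illustration written in Python rather than browser JavaScript; the marker data and the function names (`render_full_page`, `render_fragment`, `apply_fragment`) are hypothetical and do not come from any Ajax toolkit or from the Google Maps API.

```python
import json

# Hypothetical state held by a content server: marker positions on a map page.
MARKERS = {
    "m1": {"lat": 40.71, "lon": -74.00},
    "m2": {"lat": 34.05, "lon": -118.24},
}


def render_full_page() -> str:
    """Classic round trip: the server re-renders the entire HTML page."""
    rows = "".join(
        f"<li id='{mid}'>{pos['lat']}, {pos['lon']}</li>"
        for mid, pos in MARKERS.items()
    )
    return f"<html><head><title>Map</title></head><body><ul>{rows}</ul></body></html>"


def render_fragment(marker_id: str) -> str:
    """Ajax-style round trip: the server returns only the data that changed."""
    return json.dumps({"id": marker_id, **MARKERS[marker_id]})


def apply_fragment(page_state: dict, fragment: str) -> dict:
    """Client side: merge the small update into the state already on screen."""
    update = json.loads(fragment)
    new_state = dict(page_state)
    new_state[update["id"]] = {"lat": update["lat"], "lon": update["lon"]}
    return new_state
```

In this sketch the fragment for one marker is much smaller than the full page, and the client keeps everything else it has already rendered.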
Web protocols: SOAP and REST
Both SOAP and REST are platform-neutral protocols for communicating with remote services. As
part of the service-oriented architecture paradigm, clients can use SOAP and REST to interact with
remote services without knowledge of their underlying platform implementation: the functionality of
a service is completely conveyed by the description of the messages that it accepts and
returns.
SOAP is a fundamental technology of the Web Services paradigm. Originally an acronym for Simple
Object Access Protocol, SOAP is sometimes re-termed Services-Oriented Access Protocol, and since
version 1.2 the specification treats SOAP as a plain name rather than an acronym, because its focus
has shifted from object-based systems towards the interoperability of
message exchange. There are two key components of the SOAP specification. The first is the use of
an XML message format for platform-agnostic encoding, and the second is the message structure,
which consists of a header and a body. The header is used to exchange contextual information that
is not specific to the application payload (the body), such as authentication information. The SOAP
message body encapsulates the application-specific payload. SOAP APIs for Web services are
described by WSDL documents, which specify what operations a service exposes, the
format of the messages that it accepts (using XML Schema), and how to address it. SOAP messages
are typically conveyed over HTTP transport, although other transports (such as JMS or e-mail) are
equally viable.
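The header and body structure described above can be sketched with Python's standard `xml.etree.ElementTree` module. This is an illustrative sketch only: the envelope namespace is the SOAP 1.1 one, while the application namespace, the `AuthToken` header entry, and the `GetEmployee` operation are hypothetical names invented for the example, not part of any real service or WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/employees"  # hypothetical application namespace


def build_envelope(token: str, employee_id: str) -> bytes:
    """Build a SOAP message: contextual data in the header, payload in the body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")

    # Header: information that is not specific to the application payload,
    # such as authentication data (element name is an assumption).
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, f"{{{APP_NS}}}AuthToken").text = token

    # Body: the application-specific payload (hypothetical operation).
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}GetEmployee")
    ET.SubElement(request, f"{{{APP_NS}}}EmployeeId").text = employee_id

    return ET.tostring(envelope, encoding="utf-8")
```

The resulting bytes would typically be sent as the body of an HTTP POST request; the transport step is left out here because it depends on the service being addressed.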
REST is an acronym for Representational State Transfer, a style of Web-based communication
that typically uses just HTTP and XML. Its simplicity and lack of rigorous profiles set it apart from SOAP and add
to its appeal. Unlike the typical verb-based interfaces that you find in modern programming
languages (which are composed of diverse methods such as getEmployee(), addEmployee(),
listEmployees(), and more), REST fundamentally supports only a few operations (that is, POST,
GET, PUT, and DELETE) that are applicable to all pieces of information. The emphasis in REST is on the
pieces of information themselves, called resources. For example, a resource record for an employee
is identified by a URI, retrieved through a GET operation, updated by a PUT operation, and so on. In
this way, REST is similar to the document-literal style of SOAP services.
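The uniform-interface idea can be sketched with a toy in-memory service in which every resource is addressed by a URI and manipulated through the same four operations. This is a simplified illustration rather than a real HTTP server: the `ResourceStore` class, its `handle` method, and the `/employees` URIs are hypothetical, and the status codes follow common HTTP conventions.

```python
class ResourceStore:
    """Toy in-memory stand-in for a REST service; URIs are plain strings."""

    def __init__(self):
        self._resources = {}
        self._next_id = 1

    def handle(self, method, uri, body=None):
        """Apply one uniform operation and return (status_code, representation)."""
        if method == "POST":
            # Create a new resource under a collection URI such as /employees.
            new_uri = f"{uri.rstrip('/')}/{self._next_id}"
            self._next_id += 1
            self._resources[new_uri] = dict(body or {})
            return 201, {"uri": new_uri}
        if method == "GET":
            # Retrieve the current representation of the resource.
            if uri in self._resources:
                return 200, self._resources[uri]
            return 404, None
        if method == "PUT":
            # Replace the resource at this URI, creating it if it is absent.
            status = 200 if uri in self._resources else 201
            self._resources[uri] = dict(body or {})
            return status, self._resources[uri]
        if method == "DELETE":
            if self._resources.pop(uri, None) is None:
                return 404, None
            return 204, None
        return 405, None  # any other method is not allowed


store = ResourceStore()
status, created = store.handle("POST", "/employees", {"name": "Ada"})
employee_uri = created["uri"]
store.handle("PUT", employee_uri, {"name": "Ada", "title": "Engineer"})
```

Note that there is no getEmployee() or addEmployee() here: the same `handle` call serves every kind of resource, and only the method and the URI change.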
Screen scraping
As mentioned earlier, the lack of APIs from content providers often forces mashup developers to resort
to screen scraping in order to retrieve the information they seek to mash. Scraping is the process
of using software tools to parse and analyze content that was originally written for human
consumption in order to extract semantic data structures that represent that information and
can be used and manipulated programmatically. A handful of mashups use screen scraping technology
for data acquisition, especially when pulling data from the public sector. For example, real-estate
mapping mashups can mash for-sale or rental listings with maps from a cartography provider and with
scraped “comp” data obtained from the county records office. Another mashup project that scrapes
data is XMLTV, a collection of tools that aggregates TV listings from all over the world.
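A minimal sketch of the scraping process, using Python's standard `html.parser` module, is shown here. The HTML snippet, its two-column layout, and the names `ListingScraper` and `scrape_listings` are all hypothetical; a real records page would have its own markup, and the parser would need to be adapted to it.

```python
from html.parser import HTMLParser

# Hypothetical page fragment written for human readers, not for programs.
SAMPLE = """<table>
<tr><th>Address</th><th>Sale price</th></tr>
<tr><td>12 Oak St</td><td>$250,000</td></tr>
<tr><td>9 Elm Ave</td><td>$310,500</td></tr>
</table>"""


class ListingScraper(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row being parsed, if any
        self._in_cell = False  # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th> cells and are skipped
                self.rows.append([cell.strip() for cell in self._row])
            self._row = None


def scrape_listings(html):
    """Turn the presentation markup into structured records."""
    parser = ListingScraper()
    parser.feed(html)
    parser.close()
    return [
        {"address": row[0], "price": int(row[1].replace("$", "").replace(",", ""))}
        for row in parser.rows
        if len(row) == 2
    ]


listings = scrape_listings(SAMPLE)
```

The extraction logic is tied to the page layout: if the provider renames a tag or adds a column, the code above silently returns different results or none at all.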
Screen scraping is often considered an inelegant solution, and for good reasons. It has two primary
inherent drawbacks. The first is that, unlike APIs with interfaces, scraping has no specific programmatic