Overview of Mashups – Concepts, Technologies and what it all means (my JavaOne presentation)

Sun has now uploaded the transcript and audio from my JavaOne presentation about Mashups earlier this month. The PDF of the presentation, the audio and the transcript can now all be found at the SDN’s JavaOne site. You need to be a member of the Sun Developer Network (SDN) to see them, but membership is free so don’t let that stop you.

The official title of the presentation is “Integration Gets All Mashed Up – Bridging Web 1.0 and Web 2.0 Applications”, and it is an overview of what mashups are, what technologies are involved and how the different components fit together inside a mashup. It was a lot of work getting the presentation together, but it was well worth it since I got a lot of good feedback from people (thanks to everybody who came and heard me speak!).

The presentation starts out with the standard (= Wikipedia) definition of mashups and then moves on to describing the enabling technologies such as REST, JSON, Atom and RSS. Of course all this is from a Java perspective, since it is after all a JavaOne presentation. Next are the components of a mashup, such as Mashup Builders, Widgets and Mashup Enablers, and how they all fit together.
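To give a feel for how little glue code these formats need, here is a minimal Python sketch. It is not taken from the presentation: the RSS feed, the JSON document and every name in it are made up for the example. It reads items from a feed and enriches each one with data from a JSON source, which is the basic "remix" step of a mashup.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data: an RSS 2.0 feed and a JSON lookup table.
RSS_FEED = """<rss version="2.0"><channel><title>Photos</title>
<item><title>Harbour</title><link>http://example.com/photos/1</link></item>
<item><title>Old town</title><link>http://example.com/photos/2</link></item>
</channel></rss>"""

RATINGS_JSON = '{"Harbour": 4, "Old town": 5}'


def mash_up(rss_text, json_text):
    """Combine feed items with ratings from a second, JSON-based source."""
    ratings = json.loads(json_text)
    root = ET.fromstring(rss_text)
    combined = []
    for item in root.iter("item"):
        title = item.findtext("title")
        combined.append({
            "title": title,
            "link": item.findtext("link"),
            # Items without a rating in the JSON source get None.
            "rating": ratings.get(title),
        })
    return combined


if __name__ == "__main__":
    for entry in mash_up(RSS_FEED, RATINGS_JSON):
        print(entry)
```

In a real mashup the two strings would come from HTTP requests to two different services; they are inlined here so the sketch runs on its own.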

Inside a Mashup

Next I describe the mashup programming model, a lightweight programming model where the focus is on assembling rather than coding. At the end I highlight what it means to be a developer in a mashup world.

If you are in too much of a hurry to go through the presentation, I can give you the summary right here:

  • Mashups are all about
    • Building a situational application ad hoc
    • Reuse and remix both data and functionality
  • Using Mashups means that
    • You can do more in less time
    • You don’t have to reinvent the wheel
    • You are not in total control of how your applications and data will be used

If you are interested in Mashups I hope that this presentation will give you a good overview. Please take a look and a listen and let me know what you think!

The Power of Platform Ecosystems

Today Facebook announced a very aggressive move to open up its system and become a Platform (read more on TechCrunch or ReadWriteWeb). At the same time several applications were announced on top of the new platform (read more on WebWare). This is the complete opposite of what MySpace has done in creating its walled garden. I think this is a great move by Facebook: it gives the 20 million Facebook users more of a reason to keep coming back, and at the same time it gives a lot of companies and developers access to those 20 million users. In the middle is Facebook, which, if the strategy succeeds, will establish itself as the platform in a living online Ecosystem.

The Web is full of Ecosystems

There are already plenty of Ecosystems out there. Amazon was one of the first companies that understood the power of this, and created an open API that developers all over the world are now using (thereby increasing Amazon’s book sales, plus the brand recognition of course). YouTube’s success is largely due to how easy it is to include and show videos on your own sites and blogs (the real success was of course that Google paid a ridiculous amount of dollars for them, but that is a completely different story). Flickr is doing the same for photos online. WordPress is an OK blog engine, but it would be nothing without all the plugins, so that is another successful Ecosystem. In the Enterprise world there is of course Salesforce, which now has more requests coming in via its API than via its web page.

There are also successful Ecosystems that are not based on web applications. Firefox (OK, that one is almost web based, just humor me please) is a great browser, but without all the add-ons it is not that useful (me love Firebug!). Apple has been very successful in creating an Ecosystem around the iPod; just think of how many companies make great money just by producing skins and other extensions for the iPod.

The Platform Wins

What is in it for the Platform is pretty evident: more loyal users that have more reasons to keep coming back. This is hard to create, and if System Providers build apps on your Platform then you are basically expanding your development department, with more brilliant minds trying to figure out the killer app for your users. It is also a great way to keep the competition at bay. Now that Facebook is becoming a Platform and not just a site, its main competitor MySpace will either be forced to do the same or be left behind.

The System Providers Win

The System Providers that write applications that run on the Platform (most likely called something fancy like widget, gadget, add-on, plugin or extension) get access to a whole lot of ready-made functionality and a huge existing user base. There is no longer any need to implement photo album functionality for your web app, just use Flickr. No need to build video capabilities, just use YouTube. If you build an app for Salesforce you can sell it on their AppExchange and make some money. So in short you get both functionality and a user base, and that is a pretty good deal.

The Users Win

The users can stay on one platform and take full advantage of the fact that the platform continuously gets more and more functionality added. Since the System Providers do not have to implement the most basic functionality (user handling, file uploading, photo albums etc.) they can concentrate on great new features, which of course benefits the end users. So more, cooler and more useful features, and that makes users happy.

Win Win Win and Lose

The winners are the Platform providers that are in the center of the Ecosystem, the System Providers that get features and a user base for free, and the users that get more features quicker. That is a pretty impressive Win-Win-Win scenario.

The losers are the companies that do not understand that Ecosystems are the new type of lightweight business partnerships. Either you try to be a Platform or you take advantage of whatever the Platforms around you offer. To build a new big web application and not provide an API is simply failing to make use of a huge opportunity. To build any web site and not take advantage of one or more existing Ecosystems is to do more work than you need to. It seems like Facebook really has realized that (but that is no good reason to say no to the $1 billion that Yahoo is rumoured to have offered; saying no to $1 billion is always a bad move in my humble opinion).

What I learned at JavaOne 2007

I am just back from a week in San Francisco at JavaOne, where I was one of 12,000 geeks that got together to learn more about the latest and greatest in Java development. Since it is almost 2 years since I worked full time with Java development, this almost felt like a trip back in time for me, back to the time when Java was equal to my professional life. What I have worked with over the last few years (Web 2.0, mashups, web scraping etc.) is still considered new and exotic in the Java world.

It was also very interesting for me to compare JavaOne with the Web 2.0 Expo that I attended a few weeks ago. JavaOne is almost strictly for developers that are using a mature technology that has not really taken any real leaps lately. The Web 2.0 Expo was for both techies and business people, and it was looking forward at new technologies and business opportunities.

Keynotes and General Sessions

Most of the Keynotes and the General Sessions were also about running Java on all kinds of devices (cell phones, ATMs etc) and generally about how great Java is, so nothing new there. But there were a few things that caught my eye:

  • JavaFX Script – Sun has decided to take up the fight with Microsoft’s Silverlight and Adobe’s Flash. Unfortunately they decided to give it a name that most people will confuse with “JavaScript” and Adobe’s “Flex” (just try saying JavaFX Script 10 times quickly). This is a very interesting move since it suddenly makes it easy for the whole Java community to develop Rich Internet Applications. This and Silverlight have the potential to make the web a much more interesting place.
  • Netbeans 6 – There was a quite impressive demo of Netbeans 6 and how it can be used for programming Ruby on Rails, much better than the RadRails I am using now. Of course it was a demo and I haven’t had time to test the Netbeans 6 preview yet, but as soon as I have to dig down into Rails again Netbeans is my choice.
  • Blu-ray – There were some semi-desperate pleas (= competitions) to get Java developers involved in the Blu-ray vs HD-DVD fight. Sun is squarely behind Blu-ray (they mentioned about 100 times that it runs Java). The Blu-ray demos were cool, but personally I just want one format to win quickly so I know what player to buy.

Java and Web 2.0

Most of the presentations were of course about hardcore Java stuff, and I skipped those. Instead I went to all the presentations about Mashups, RSS, Atom and REST (actually I held a presentation about Mashups myself, more about that in a later post). It is pretty clear that all the Web 2.0 technologies are viewed as some distant hype by most of the Java community.

The only really cool thing I saw in regards to Java and Mashups was a couple of demos of jMaki. It is a project developed by Sun, and it is basically a framework where Java developers can easily program Mashups. The great thing was what is called the “Glue”, which is an event bus that enables widgets from different providers like Yahoo and Dojo to talk to each other. jMaki has a great future if it ever moves into the Enterprise world, and it could be a real step forward for both Java and Web 2.0.

Another interesting Sun research project is Project Caroline, which enables Java developers to control all resources from code, i.e. create new server instances on the fly, set up new file systems etc. If this ever moves beyond being a half-implemented research project it could open up a lot of competition for Amazon’s S3 and EC2.

HTML is the world’s most common API

Most folks that are working with Mashups just assume that services and APIs will magically appear, but unfortunately there are not that many public APIs around today. Just check out programmableweb and you will see. More and more are added every day, but it will never reach the level where a majority of systems have an API, especially not if you think about systems behind the corporate firewalls. Simply put, there is a painful lack of APIs, and if that is not addressed it will stop the mashup wave in its tracks. Fortunately there are already smart people working on this, and one of the solutions is to start using HTML as if it were an API. That’s right, start using all the data and functionality that today is available in HTML to build new innovative mashups and solutions.

The potential of HTML

All new interesting applications (Skype being the exception that proves the rule) have an HTML interface. And this is true not just for the consumer-facing applications, but for Enterprise-level applications as well. So with the millions and millions of HTML pages in existence today it is not unlikely that HTML is one of the world’s most common data formats (I wonder how it compares to printed text and audio for example). The great thing with HTML is that it does not just contain data, it is also the interface to a whole lot of functionality (when you search Google you do that via HTML, don’t you?). What if we could use HTML as one big API? That would make HTML the world’s most widespread API, and that would give mashup developers and programmers access to more data and more functionality than ever before.
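A search form is a good example of HTML acting as an interface to functionality: the form already describes the endpoint and its parameters. The following Python sketch reads a form and builds the request URL a browser would send. The page markup, the host name search.example.com and the helper names are all hypothetical, and only simple GET forms with input fields are handled.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

# Hypothetical search page; a real one would be fetched over HTTP.
SEARCH_PAGE = """
<form action="/search" method="get">
  <input type="text" name="q">
  <input type="hidden" name="lang" value="en">
  <input type="submit" value="Search">
</form>
"""


class FormReader(HTMLParser):
    """Collects the form's action URL and its named input fields."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.action = attrs.get("action")
        elif tag == "input" and "name" in attrs:
            # Unnamed inputs (like the submit button here) are never sent.
            self.fields[attrs["name"]] = attrs.get("value", "")


def build_request_url(page_url, html, **overrides):
    """Treat the form as an API description and build a GET request URL."""
    reader = FormReader()
    reader.feed(html)
    reader.close()
    params = {**reader.fields, **overrides}
    return urljoin(page_url, reader.action) + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_request_url("http://search.example.com/", SEARCH_PAGE,
                            q="mashup tools"))
```

The form's default values play the role of default parameters, and the caller only overrides the fields it cares about, much like calling a function with keyword arguments.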

The problem with HTML

Almost no sites on the web today follow the HTML 4 standards. So today’s browsers have to be very good at interpreting the tag soup that most pages consist of (i.e. broken HTML, missing end tags etc.). Furthermore, HTML is used both to mark up data in a document, for example with the <title> tag, and to mark up how the data should be presented, for example with the <b> tag. All this together makes HTML documents unstructured documents (by implementation, not by nature) with data in very application-specific formats (microformats will help here, but it will be some time before they are widespread enough to be really useful).

Another problem is of course that there are fewer and fewer pages on the web that use pure (albeit broken) HTML, and there is more and more JavaScript around. Especially in the Web 2.0 applications, most of the really interesting functionality is available via AJAX. So it is not only HTML, but also JavaScript that has to be taken into consideration when one wants to get to a web application’s functionality.

Parsing

So we have huge amounts of data and functionality in HTML and we want to use it to make our latest funky Mashup. The good old approach is to try to parse the page in question using Perl, although nowadays it can be done pretty well using almost any modern programming language. There are several problems with parsing though:

  • It is damn complicated to get to work on serious web pages and once it is done it breaks easily
  • Good luck handling a real tag soup; that alone breaks most parsers (an XML parser, for example, simply stops at the first error it encounters)
  • It is boring to program those parsers (if you haven’t tried then lucky you)
  • You cannot access functionality that uses JavaScript and AJAX
  • It is hard to handle things like logging in to a web application (i.e. session handling) and navigating over several pages
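The tag soup point is easy to demonstrate. In this small Python sketch (the markup and the function names are made up for the example) a strict XML parser gives up on a page with missing end tags, while a tolerant HTML parser still gets the text out.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Typical tag soup: unclosed <p> and <b> elements.
TAG_SOUP = "<html><body><p>First<p>Second <b>bold</body></html>"


def strict_parse_ok(markup):
    """Return True only if the markup is well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # The XML parser stops at the first mismatched tag.
        return False


class TextCollector(HTMLParser):
    """Tolerant parser that just collects the non-empty text chunks."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def tolerant_text(markup):
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return parser.chunks


if __name__ == "__main__":
    print(strict_parse_ok(TAG_SOUP))
    print(tolerant_text(TAG_SOUP))
```

Note that the tolerant parser only solves the first problem in the list. It still knows nothing about JavaScript, sessions or navigation between pages.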

Still, this is a very common approach to get to data and functionality in HTML. But there is a much easier way…

Web Scraping

I bet that a fair portion of the people reading the term “Web Scraping” think of old mainframe terminals and “Screen Scraping” and frown. Don’t worry, technology has moved forward light years since the days of mainframes. Web Scraping is to interact with HTML (including JavaScript if it is a good scraper) and to either extract data from the HTML or repackage the functionality in the HTML. The data can be saved into a database or a file for example, and the functionality can be made available as a REST service, a programming language API or whatever else makes sense. Suddenly HTML is wide open. Just imagine that you wanted to get data from Digg for some reason (before the Digg API that came out a few weeks ago); without an API that would be hard. But using a web scraper you could for example build a REST service out of the search on Digg only by accessing the HTML. Web Scrapers are used more and more for doing things like collecting large amounts of job ads or flight information and then repackaging that data into sites that allow users to search for a job or a cheap flight.
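The core idea of extracting data from HTML and repackaging it can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any real scraper product works: the page markup, the "story" class name and all function names are invented for the example, nested divs are not handled, and no JavaScript is executed.

```python
import json
from html.parser import HTMLParser

# Hypothetical listing page (note the missing </div> at the end).
SAMPLE_PAGE = """
<html><body>
<div class="story"><a href="/story/1">Mashups explained</a></div>
<div class="ad"><a href="/buy">Buy now</a></div>
<div class="story"><a href="/story/2">HTML as an API</a>
</body></html>
"""


class StoryScraper(HTMLParser):
    """Extracts link URL and title from every <div class="story">."""

    def __init__(self):
        super().__init__()
        self.stories = []
        self._in_story = False
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            # Each new div decides whether we are inside a story block.
            self._in_story = attrs.get("class") == "story"
        elif tag == "a" and self._in_story:
            self._current = {"url": attrs.get("href"), "title": ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["title"] = self._current["title"].strip()
            self.stories.append(self._current)
            self._current = None


def scrape_stories(html):
    scraper = StoryScraper()
    scraper.feed(html)
    scraper.close()
    return scraper.stories


def stories_as_json(html):
    """Repackage the scraped data the way a small REST service might."""
    return json.dumps(scrape_stories(html))


if __name__ == "__main__":
    print(stories_as_json(SAMPLE_PAGE))
```

Putting stories_as_json behind an HTTP endpoint would turn the page into a simple REST-style service, which is the "repackaging" step described above.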

Openkapow

The web scraper of my choice is the one supplied on openkapow.com (disclaimer: I am working for Kapow Technologies, the company behind openkapow.com, but trust me that I am not plugging openkapow to make my boss happy, it really is a great product). Using openkapow one can get to data and functionality on any web page and access it as a REST service or an RSS/Atom feed. Of course JavaScript is handled automatically, and it is possible to navigate multiple pages, log in to restricted pages and have full control over the process flow with conditions and error handling. I recommend that anybody who is interested in how to use HTML as an API takes a look at openkapow.

An Eye Opener…

Thinking of HTML as an API does significantly expand your horizons as a developer. I have literally seen a light go on in fellow geeks’ eyes when they realize the potential. Suddenly the web is really yours to use in your programs and mashups. And once APIs and services are abundant, you can start using the other cool mashup tools around (Teqlo, jMaki etc.).

How vs. Why

Somebody recently mentioned that they were always more interested in the “Why” as compared to the “How” of any technical challenge. That made me think, and my (current) conclusion is that there are two major groups of people (at least in the tech world): people that primarily ask “How?” and people that primarily ask “Why?”

The “How?” Group

“How” is the focus of anybody working daily with solving problems through technology, and it is firmly rooted in us from all those Math and Programming courses at University. I am definitely in this group, even if I am slowly drifting over to the Why group. If there is any occupational hazard for computer geeks (except for destroying our hands on keyboards) it is probably that as soon as we hear about a problem we start to design the system to solve it; our subconscious is constantly working on solving some trivial or non-trivial problem. Things have to have a solution, and if we don’t know what it is, it is just because we haven’t thought it through properly.

The “Why?” Group

“Why” to do something is (in my humble opinion) not that important to most techies, and to the ones that do consider the “Why” important a pure development job is probably quite unfulfilling. This is the group that most entrepreneurs belong to. They ask questions like “Why should I have to use FTP to upload my photos to the web?” before they ask “How should we implement that?” (see Flickr for the answer). This group of people is much smaller than the “How” group, but everybody in the “How” group can really benefit from thinking more about the “Why”. The best programmers I have ever worked with ask “Why” all day long. And without “Why” we wouldn’t have most of the Web 2.0 apps of today.

More Why’s!

I am finding myself asking “why?” more and more, and “how?” less and less, and now that I am aware of this I will make sure to always ask “why?” before asking “how?”. This will be hard to do, since as a programmer asking “how?” has been my focus for many, many years.

If I could just expand this post to about 300 pages I could publish a management book! I also need to get a picture of myself in a suit and a tie for the cover, but that should be it…