Towards a more productive Java EE ecosystem
Date
2013-03-05
Abstract
Since the creation of the Java programming language in 1995, one of its most important application areas has been web application development. Java's popularity was due not only to language design features such as object orientation and a strict type system, but above all to platform independence and the abundance of standardized libraries, which brought web application development within reach of ordinary programmers. Ten years later the situation had changed considerably. Java was losing its leading position to newer, so-called dynamic languages such as PHP, Ruby and Python. The reason was not that these languages were themselves significantly better than Java, but that the evolution of the Java ecosystem was very conservative and slow.
In this context we began our research in 2005 with the aim of fixing the biggest problems in the Java ecosystem and bringing it at least level with the languages mentioned above. This dissertation presents the results of that research. It is based on four publications: three peer-reviewed research papers and one patent.
The first attempt was the creation of a new framework for integrating web frameworks, “Aranea”. At the time there were more than thirty actively developed web frameworks in Java, so we decided to focus on two key problems: ease of reuse and interoperability of frameworks. To this end we developed a novel component model that makes it possible to describe the hierarchical relationships between a system's service and user interface components, and we implemented adapters that fit different frameworks into the component model.
As the next problem we addressed the description of the data access layer. We started from the premise that SQL is the most widespread data description language for relational databases, and that an effective data access layer must make it easy to express SQL queries in Java while guaranteeing the syntactic correctness of the constructed queries. Existing solutions were generally based on constructing SQL queries programmatically as strings, which made checking their correctness difficult. As a solution we developed a domain-specific language (DSL) for expressing SQL queries that uses the facilities of the Java type system to validate their correctness at compile time. In the course of this work we identified general software design patterns that simplify the creation of similar type-safe DSLs, and used them to build two new experimental type-safe DSLs: one for creating and manipulating Java classes at run time, and one for parsing and generating XML.
As the third task we turned to one of the most important shortcomings of the Java platform compared to dynamic languages. Whereas in PHP or Ruby one can change program code directly and see the result immediately, Java application servers require the application to be built and deployed, which for large applications can take several or even tens of minutes. To solve the problem we developed a novel and practical method for reloading code on the Java platform, on the basis of which we developed and released the product “JRebel”. It uses load-time modification of Java bytecode together with a special redirection layer between call sites, methods and method bodies, which makes it possible to manage different versions of the code and to direct calls to the latest version at run time.
Today, more than seven years after the research began, we must acknowledge that our work on web frameworks did create a successful platform for exploring and testing various experimental ideas, but it has not found wide use in the software industry. The work on type-safe DSLs was more successful, as it directly influenced subsequent research on the topic and elements of it found their way into the latest JPA standard specification. The greatest impact on the software industry has come from our dynamic code reloading solution, which is now widely used in the Java community and is used daily by more than 3000 different organizations around the world.
Since Tim Berners-Lee's famous proposal for the World Wide Web in 1990 and the introduction of the Common Gateway Interface in 1993, the world of online web applications has been booming. In the nineties the Java language and platform became the first choice for web development in and out of the enterprise. But by the mid-aughts the platform was in crisis: newcomers like PHP, Ruby and Python had picked up the flag as the most productive platforms, with Java left for conservative enterprises. However, this was not because those languages and platforms were significantly better than Java. Rather, the issue was that innovation in the Java ecosystem was slow because of the way the platform was managed. Large vendors dominated the space, committees designed standards and the brightest minds were moving to other JVM languages like Scala, Groovy or JRuby. In this context we started our investigations in 2005. Our goals were to address some of the more gaping holes in the Java ecosystem and bring it on par with the languages touted as more productive.
The first effort was to design a better web framework, called “Aranea”. At that point in time Java had more than thirty actively developed web frameworks, and many of them were used simultaneously in the same projects. We decided to focus on two key issues: ease of reuse and framework interoperability. To solve the first issue we created a self-contained component model that allowed the construction of both simple and sophisticated systems using a simple object protocol and hierarchical aggregation in the style of the Composite design pattern. This allowed one to capture every aspect of reuse in a dedicated component, be it a part of framework functionality, a repeating UI component or a whole UI process backed by complex logic. Those components could be mixed and matched almost indiscriminately, subject to the rules expressed in the interfaces they implemented.
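A minimal sketch of what hierarchical aggregation in the style of the Composite pattern can look like in Java. All names here (`Component`, `Label`, `Container`, and the `init`/`render`/`destroy` protocol) are hypothetical and chosen for illustration only; they are not taken from the Aranea API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical object protocol shared by every component (not the Aranea API). */
interface Component {
    void init();
    String render();
    void destroy();
}

/** Leaf component: a small reusable piece of UI. */
final class Label implements Component {
    private final String text;
    private boolean alive;

    Label(String text) { this.text = text; }

    @Override public void init() { alive = true; }

    @Override public String render() {
        if (!alive) throw new IllegalStateException("render() called before init()");
        return text;
    }

    @Override public void destroy() { alive = false; }
}

/** Composite: aggregates named children and forwards the protocol to each of them. */
final class Container implements Component {
    private final Map<String, Component> children = new LinkedHashMap<>();

    Container add(String name, Component child) {
        children.put(name, child);
        return this;
    }

    @Override public void init() { children.values().forEach(Component::init); }

    @Override public String render() {
        StringBuilder out = new StringBuilder("[");
        for (Map.Entry<String, Component> e : children.entrySet()) {
            out.append(e.getKey()).append('=').append(e.getValue().render()).append(';');
        }
        return out.append(']').toString();
    }

    @Override public void destroy() { children.values().forEach(Component::destroy); }
}

final class ComponentSketch {
    public static void main(String[] args) {
        // A container can hold leaves and other containers interchangeably.
        Component page = new Container()
            .add("header", new Label("Orders"))
            .add("body", new Container().add("row", new Label("item 1")));
        page.init();
        System.out.println(page.render()); // prints [header=Orders;body=[row=item 1;];]
        page.destroy();
    }
}
```

Because leaves and containers share one interface, a whole subtree can be reused wherever a single component is expected, which is the property the component model relies on.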
To solve the second issue we proposed adapters between the component model and the various models of other frameworks. We later implemented some of those adapters in both a local and a remote fashion, allowing one to almost effortlessly capture and mix different web application components together, no matter what the underlying implementation might be.
The next issue that we focused on was the data access layer. At that point the most popular ways of accessing data in the Java community were either embedded SQL strings or the Object-Relational Mapping tool “Hibernate”. Both approaches had severe disadvantages. Using embedded SQL strings exposed developers to typographical errors, lack of abstraction, very late validation and the dangers of dynamic string concatenation. Using Hibernate/ORM introduced a layer of abstraction notorious for the misunderstandings and production performance issues it caused. We believed that SQL is the right way to access the data in a relational database, as it expresses exactly the data that is needed without much overhead. Instead of embedding it in strings, we decided to embed it using the constructs of the Java language, thus creating an embedded DSL. As one of the goals was to provide extensive compile-time validation, we made heavy use of Java Generics and code generation to provide the maximum possible static safety. We also built some basic SQL extensions into the language that provided a better interface between Java structures and relational queries, allowed effortless further extension and made abstraction easier. Our work on the SQL DSL made us believe that building type-safe embedded DSLs could be of great use to the Java community. We embarked on building two more experimental DSLs, one for generating and manipulating Java classes on the fly and the other for parsing and generating XML.
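To illustrate how Java Generics can move SQL errors to compile time, here is a small sketch of an embedded query DSL. The names (`Table`, `Column<T>`, `Condition`, `Select`, `From`, `Query`) are assumptions made for this example and do not reproduce the DSL described in the thesis; the table and column metadata is written by hand here, whereas the text above says such metadata was produced by code generation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hand-written table metadata (hypothetical; the thesis mentions code generation). */
final class Table {
    final String name;
    Table(String name) { this.name = name; }

    /** The Class token only fixes the column's Java type T for the compiler. */
    <T> Column<T> column(String columnName, Class<T> type) {
        return new Column<>(this, columnName);
    }
}

/** A column that remembers the Java type of its values. */
final class Column<T> {
    private final Table table;
    private final String name;
    Column(Table table, String name) { this.table = table; this.name = name; }

    String qualified() { return table.name + "." + name; }

    /** Only a value of the column's own type is accepted. */
    Condition eq(T value) { return new Condition(qualified() + " = ?", value); }
}

final class Condition {
    final String sql;
    final Object parameter;
    Condition(String sql, Object parameter) { this.sql = sql; this.parameter = parameter; }
}

final class Query {
    final String sql;
    final List<Object> parameters;
    Query(String sql, List<Object> parameters) { this.sql = sql; this.parameters = parameters; }
}

/** Staged builder: each step returns a type that offers only the clauses allowed next. */
final class Select {
    private final List<Column<?>> columns;
    private Select(List<Column<?>> columns) { this.columns = columns; }

    static Select select(Column<?>... columns) { return new Select(Arrays.asList(columns)); }

    From from(Table table) {
        StringBuilder sql = new StringBuilder("SELECT ");
        for (int i = 0; i < columns.size(); i++) {
            if (i > 0) sql.append(", ");
            sql.append(columns.get(i).qualified());
        }
        return new From(sql.append(" FROM ").append(table.name).toString());
    }
}

final class From {
    private final String sql;
    From(String sql) { this.sql = sql; }

    Query where(Condition condition) {
        return new Query(sql + " WHERE " + condition.sql,
                         Collections.singletonList(condition.parameter));
    }
}

final class SqlDslSketch {
    public static void main(String[] args) {
        Table person = new Table("person");
        Column<String> name = person.column("name", String.class);
        Column<Integer> age = person.column("age", Integer.class);

        Query q = Select.select(name, age).from(person).where(age.eq(18));
        System.out.println(q.sql); // prints SELECT person.name, person.age FROM person WHERE person.age = ?
        // age.eq("18") would be rejected by the compiler: a String is not an Integer.
    }
}
```

The value is passed as a bound parameter rather than concatenated into the string, and the staged `Select`/`From` types mean that a `where` clause cannot be written before `from`.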
The experiments with these DSLs exposed some common patterns, including restricting DSL syntax, collecting type-safe history and using type-safe metadata. Applying those patterns to different domains helps encode a truly type-safe DSL in the Java language.
Our final and largest effort concentrated on a major disadvantage of the Java platform compared to the dynamically typed language platforms. Namely, while in PHP or Ruby on Rails one could edit any line of code and see the result immediately, Java application servers would force one to “build” and “deploy”, which for larger applications could take minutes or even tens of minutes. Initial investigation revealed that the claims of fast code reloading were not quite solid across the board. Dynamically typed languages would typically destroy state and recreate the application, just like the Java application servers. The crucial difference was that they did it quickly, and that development productivity was a major concern for their language and framework designers. However, as we investigated the issue more deeply on the Java side, we came up with a novel and practical way of reloading code on the JVM, which we developed and released as the product “JRebel”. We made use of the fact that Java bytecode is a very high-level encoding of the Java language, which is easy to modify at load time. This allowed us to insert a layer of indirection between the call sites, methods and method bodies that was versatile enough to manage multiple versions of code and redirect calls to the latest version at runtime. Over the years there have been some basic developments along similar lines, but unlike them we engineered JRebel to run on the stock JVM and to have no visible impact on the application's functional or non-functional behaviour. The latter was the hardest part, as the layer of indirection both introduces numerous compatibility problems and adds performance overhead.
To overcome those limitations we had to integrate deeply with many levels of the JVM and use compiler techniques to remove the layer of indirection where possible.
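The idea of a layer of indirection between call sites and method bodies can be illustrated by hand in plain Java. This is a conceptual sketch only: the abstract describes inserting the indirection by modifying bytecode at load time, which this example does not attempt, and the names `MethodTable`, `Greeter` and the key format are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical registry that maps a method key to its most recent body. */
final class MethodTable {
    private static final Map<String, Function<Object[], Object>> LATEST = new ConcurrentHashMap<>();

    /** Installing a body under an existing key stands in for "reloading" that method. */
    static void install(String key, Function<Object[], Object> body) {
        LATEST.put(key, body);
    }

    static Object invoke(String key, Object... args) {
        Function<Object[], Object> body = LATEST.get(key);
        if (body == null) throw new IllegalStateException("no body installed for " + key);
        return body.apply(args);
    }
}

/** The class as callers see it: the method is a stable stub that only forwards. */
final class Greeter {
    static final String GREET = "Greeter.greet(String)";

    String greet(String name) {
        return (String) MethodTable.invoke(GREET, name);
    }
}

final class ReloadSketch {
    public static void main(String[] args) {
        Greeter greeter = new Greeter();

        MethodTable.install(Greeter.GREET, a -> "Hello, " + a[0]);
        System.out.println(greeter.greet("world")); // prints Hello, world

        // A new version of the body is installed; the existing object and its
        // call sites are untouched, yet the next call reaches the new code.
        MethodTable.install(Greeter.GREET, a -> "Hi, " + a[0] + "!");
        System.out.println(greeter.greet("world")); // prints Hi, world!
    }
}
```

The extra map lookup and the boxing of arguments on every call hint at the performance overhead mentioned above, which is why removing the indirection where possible matters.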
Description
The electronic version of the dissertation does not include the publications.
Keywords
Java (programming language), web frameworks, query languages, domain-specific languages, codes