Wednesday, September 27, 2006

[Process] Measuring "Agility"

"Agile Manifesto" Agile Trends in Software Engineering

The "Agile Manifesto", eXtreme Programming (Kent Beck), Scrum... Agile methods of software development can be found everywhere these days. And there are good reasons for this trend. Traditional software engineering apparently found it's limits. Particularly in software projects dealing with fast changing requirements, fast changing technologies and unclear user-demands.

Measuring Agility


Talking about agile processes, I have the impression that there is not yet a proper definition of what agility actually is; particularly not a definition that could serve as a foundation for a quantitative measurement of agility, or a measurement of whether a (software) team is able to react in an agile way to changing situations. Following internal discussions, I want to propose an idea on how we could approach this question.



The following figure illustrates the idea:

  • The x-axis shows the timeline
  • The y-axis actually shows two things: the left scale (the red line) shows the overall increase in "business value", e.g., of a software product (number of features, ...); the right scale indicates the ideal level of maximum fulfilment of the feature requests at a given time (iteration time). The "100%" level shifts upwards, as from iteration to iteration higher business value is demanded; or expressed otherwise: as the complexity of the environment continually increases (right scale), the business value of a static application continuously decreases
  • As the overall complexity increases, the "100% line" gets higher with time
  • However, this means that the actual quality of the product (in terms of business value) decreases over time as the scales shift. This decrease can be measured as d (deterioration).
  • The time needed between two iterations is the retention time r
  • The total gain in quality between two iterations (absolute, on the left scale) is the improvement i
Agility should then be a measure with the following structure (a tentative formal sketch follows after the list):

A system is more agile if
  • the ratio r : d is small (the faster the deterioration d takes place, the shorter r should become)
  • i1 ... in stays constant in relative terms (right scale), meaning the relative quality stays constant (or improves) over the iterations
  • (d + i) → 0; ideally the steps between iterations are small and a continuous improvement is observed
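As a first, purely tentative formal sketch (the per-iteration symbols and the averaging below are my own assumptions, not part of the proposal above):

```latex
% Tentative formalization; every definition here is an assumption, not taken from the post.
% r_k : retention time between iteration k-1 and iteration k
% d_k : deterioration of relative business value (right scale) during r_k
% i_k : improvement delivered by iteration k, expressed on the same relative scale
% q_k : relative quality after iteration k
q_k = q_{k-1} - d_k + i_k, \qquad
A = \frac{1}{n} \sum_{k=1}^{n} \frac{i_k - d_k}{r_k}
```

Read this way, a team would be more agile the larger A is: deterioration is compensated quickly (small r_k), the relative quality q_k does not decline over the iterations, and ideally the individual steps d_k and i_k stay small, so improvement happens continuously rather than in large jumps.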
This is the initial idea; I hope that discussion improves this concept and perhaps yields a better (formal) description than the one offered here.

Thursday, September 21, 2006

[Arch] Web 2.0 Patterns and the lacking Separation between Producers and Consumers

"Web 2.0" Patterns

Even though the term Web 2.0 is not really "brand-new" and the O'Reilly article dates from 2005, I only found it recently (shame on me), and I suspect there are others out there who have not read it yet.

This article tries to define what distinguishes these new "Web 2.0" applications from the web applications we have seen in the years before. It covers the "Web as Platform", new concepts of web services and applications, as well as the social phenomena emerging around these new services.

Besides the fact that new technologies like Ajax, Flash, OpenLaszlo, Rails and the like push the development of new and rich web interfaces (which is an SE topic in itself), the most interesting aspect for me at the moment is the discussion of the "End of the Software Release Cycle". Particularly the second assumption, "Users must be treated as co-developers", is (from my point of view) the most relevant and stunning new phenomenon in the new "Web 2.0" and "E-Commerce" concepts.

In and Out

The distinction between inside and outside the company, between company employee and customer, gets fuzzier every day. This was always a clear idea within the Open Source movement: the "customer"/user was invited to participate in the software development process not only by contributing code, but also by contributing to the process, e.g., by
  • reporting bugs
  • helping new users in the mailing list
  • providing information in the project Wiki
  • writing tutorials
However, in an Open Source project there is no business model behind the process; considering the business strategies of companies like Amazon or Flickr, things are apparently different:
  • Amazon partly builds upon the workforce of users commenting on books, CDs and so on: these comments come for free (from Amazon's perspective), and the customers provide work that until now had to be paid for.
  • Flickr also builds upon the idea that users tag pictures; they thereby provide the information the company needs to build efficient categories and search functionality.
  • Even if customers do not provide active work, their sheer activity on the website is used: for example Amazon evaluates the click streams and buying behaviour and creates additional functionality for their website: "What do customers ultimately buy after viewing this item?"
  • Google evaluates websites and the searches users perform, and analyses the work web authors do, for example the references they make to other pages. This information is a core necessity for its page-ranking algorithm. The data comes from customers.
  • Google provides a fundamental Maps framework; in the future the rich Map-applications will be provided by "customers"
I do not want to be misunderstood: I do not intend to say that all this is negative a priori, but it has a significant impact on the way web applications are designed and software is developed. Ten years ago, there was a clear distinction between the company providing the software and the user community. We now see this "wall" eroding. Where, ten years ago, a company hired a designer to create some visual templates, today company X probably asks the user community to provide templates (and they work for free).

Similar concepts can be seen in online gaming: "monolithic" games will, in my opinion, be replaced by platforms where users very actively contribute their own landscapes, characters, weapons, and so on.

This trend will demand significant changes from companies and developers who still stick to old software engineering paradigms.

It will more and more affect the relationship between customers and producers, the way software is developed and, eventually, the relationship between companies and their employees.

Monday, September 18, 2006

[Tech] BPEL with User Interaction

BPEL (Business Process Execution Language) is used for service orchestration in Service Oriented Architectures. The current BPEL standard is designed to automate business processes based on Web Services. However, traditional business processes often require human interaction, which is currently not supported by the BPEL standard.

For this purpose, IBM and SAP developed BPEL4People. This extension is layered on top of the BPEL language so that its features can be composed with the BPEL core features whenever needed. The most important new constructs in this extension are:

  • Generic Human Roles define what a person/group can do in a specific process activity.
  • People Links are used to represent the different groups of people who participate in the execution of the process (e.g. via a link to an LDAP directory)
  • A People Activity is a basic activity which is not implemented by a piece of software, but realized by an action performed by a human being (e.g. displaying a user interface where the user can perform a task).
To derive benefit from BPEL4People, a BPEL engine must process people activities differently from activities invoking Web Services. This paper (IBM and SAP) gives a detailed overview of BPEL4People.
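To make that difference a bit more tangible, here is a minimal sketch in plain Java; all class and method names are invented for illustration and are not part of BPEL4People or of any real engine API:

```java
// Hypothetical sketch of an engine dispatching two kinds of activities:
// an automated web-service invoke completes on its own, while a people
// activity creates a task for a human role and waits for its completion.
interface Activity {
    String execute(String input) throws InterruptedException;
}

/** Automated activity: calls a web service and returns immediately. */
class InvokeActivity implements Activity {
    public String execute(String input) {
        return callWebService(input);              // placeholder for a real SOAP call
    }
    private String callWebService(String request) {
        return "response for " + request;
    }
}

/** People activity: creates a task for a generic human role and blocks
 *  until the work-list / user-interface layer reports completion. */
class PeopleActivity implements Activity {
    private final String role;                     // generic human role, e.g. "approver"
    private String result;                         // set when the human finishes the task

    PeopleActivity(String role) { this.role = role; }

    public synchronized String execute(String input) throws InterruptedException {
        System.out.println("Task for role '" + role + "': " + input);
        while (result == null) {
            wait();                                // this branch of the process is suspended
        }
        return result;
    }

    /** Called by the user interface when the person has completed the task. */
    public synchronized void complete(String humanResult) {
        this.result = humanResult;
        notifyAll();
    }
}
```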

Wednesday, September 13, 2006

[Tech] Derby's next round

Since the "Open Source" move from IBMs Cloudescape to Apache Derby, there were quite some discussions about which would be the Java Database of choice. Currently mainly two systems are prominent and in discussion: hsqldb and Derby.
Java Databases?

The foremost question for many developers/users is why anyone would want a relational database written in Java. Actually there are several reasons, mostly interesting for Java-based projects:
  • Databases like Derby are meanwhile very mature and can compete with non-Java systems
  • Java-based databases integrate smoothly into Java projects, also when using O/R mapping tools like Castor or Hibernate: in this case, the complete application including the persistence stack is in Java.
  • Different operation modes (Server, Embedded)
  • In the embedded mode, no network communication takes place, with respective consequences for performance and security (no connection to database from outside the application is possible)
  • No need to install or configure the database on the customer's side (in embedded mode)!
This makes systems like Derby or hsqldb a very good choice for development, testing and easy installation of a Java application (e.g., potential users can easily download a completely configured web application including the database). This is helpful for customer-evaluation purposes. But systems like Derby are mature and powerful enough to also serve for many classes of production systems.
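As a small illustration of the embedded mode, this is roughly what it looks like from the application's point of view (the database and table names are of course arbitrary; only derby.jar has to be on the classpath):

```java
// Minimal sketch: an embedded Derby database opened directly inside the JVM.
// No server process and no network connection are involved; the database files
// are created in the directory "demoDB" on first use.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class DerbyEmbeddedDemo {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.derby.jdbc.EmbeddedDriver");   // load the embedded driver
        Connection con = DriverManager.getConnection("jdbc:derby:demoDB;create=true");

        Statement stmt = con.createStatement();
        stmt.executeUpdate("CREATE TABLE customers (id INT PRIMARY KEY, name VARCHAR(100))");
        stmt.executeUpdate("INSERT INTO customers VALUES (1, 'Alice')");

        ResultSet rs = stmt.executeQuery("SELECT name FROM customers");
        while (rs.next()) {
            System.out.println(rs.getString(1));
        }
        con.close();

        // Shut down the embedded engine; Derby signals a successful shutdown
        // with an SQLException, so this exception is expected here.
        try {
            DriverManager.getConnection("jdbc:derby:;shutdown=true");
        } catch (SQLException expected) {
        }
    }
}
```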

Performance

Performance considerations are always very difficult issues in evaluating (relational) databases. Simple tests typically give misleading results. Just an example:

hsqldb in default setup is compared to PostgreSQL (or Derby) in default setup, and indicates a far higher performance.

A second (and third) look shows that hsqldb, by default,
  • uses in-memory tables: all data is kept in memory
  • is often used embedded, hence no network overhead takes place
  • has a large write delay enabled, meaning that even committed transactions are not persisted immediately, but up to one minute later!
  • does not properly implement the ACID criteria, e.g., it allows dirty reads
This is just an indication of the issues. Of course a system can be very fast when I/O takes place only once a minute, transaction isolation is not implemented, all data is kept in memory and, finally, no network activity is involved. This is not to say that hsqldb is a bad choice; one just has to be aware of its limits.
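If you want a somewhat fairer comparison, hsqldb can at least be configured closer to the behaviour of the other systems. A sketch (the file, table and user names are arbitrary, and the exact defaults depend on the hsqldb version in use):

```java
// Sketch: configuring hsqldb (1.8.x) so that the comparison is less skewed.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HsqldbFairerSetup {
    public static void main(String[] args) throws Exception {
        Class.forName("org.hsqldb.jdbcDriver");
        // a file-based database instead of a purely in-memory one (jdbc:hsqldb:mem:...)
        Connection con = DriverManager.getConnection("jdbc:hsqldb:file:testdb", "sa", "");

        Statement stmt = con.createStatement();
        // persist committed transactions immediately instead of batching them for seconds
        stmt.execute("SET WRITE_DELAY FALSE");
        // CACHED tables keep only part of their data in memory, unlike the default MEMORY tables
        stmt.execute("CREATE CACHED TABLE orders (id INT PRIMARY KEY, amount DECIMAL(10,2))");

        con.close();
    }
}
```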

Additionally, the access frameworks have to be taken into consideration: nowadays developers hardly use JDBC directly, but rather technologies like iBatis, Hibernate or Castor, and this again has a significant impact on performance (using caches or not, ...).

Generally speaking: Java databases are very fast when used in embedded mode and even faster when in-memory tables are used (these are not yet available for Derby, though). As soon as these databases are used in server mode, they typically play in the same performance region as other well-known database systems. But significant differences can be observed depending on the specific usage scenario.

News in Derby 10.1.3.1

This release is mainly a maintenance release and offers increased stability and reliability, better query performance as well as updated documentation.

News in Derby 10.2

The upcoming version 10.2 will bring some significant improvements (from the Derby Wiki):
  • Scrollable Updatable Result Sets
  • JDBC4
  • Grant/Revoke
  • Online Backup
  • Stronger Network Authentication
Very important is the implementation of the previously missing GRANT/REVOKE statements, combined with better network authentication. This was a significant gap in recent Derby versions, particularly when using Derby in server mode.
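As a rough sketch of what this enables (the user and table names below are made up, and SQL authorization has to be switched on for the database, e.g. via the derby.database.sqlAuthorization property, before GRANT/REVOKE take effect):

```java
// Sketch: using the GRANT/REVOKE support introduced with Derby 10.2.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class DerbyGrantRevokeDemo {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
        // connect as the owner of the schema that contains the table
        Connection con = DriverManager.getConnection("jdbc:derby:demoDB", "appowner", "secret");

        Statement stmt = con.createStatement();
        stmt.execute("GRANT SELECT ON orders TO reporting_user");    // give read-only access
        stmt.execute("REVOKE SELECT ON orders FROM reporting_user"); // and take it back again

        con.close();
    }
}
```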

This release should still be out in September, so stay tuned!

References

To get information about Apache Derby, read the Website and the Wiki (!) for recent activities.

Check out the very interesting blog entry: David van Couvering's Blog: About ACIDity and Java Databases.

Additionally, I might refer to my recent article (in German) in the Swiss magazine Infoweek.ch discussing Apache Derby in some more detail, as well as one of my articles in the German iX magazine 2005/5, in which my colleague Marco Zapletal and I discuss and compare hsqldb with Apache Derby. However, it should be noted that several new versions of Derby have been released since then, and not all statements are up to date with respect to these versions.

Sunday, September 03, 2006

[Misc] Does it matter to know more than one language?

This weekend I read two interesting articles. The first one, by Joel Spolsky and titled "Language Wars", discusses the usefulness of the ever-raging discussions among software developers about the "right" programming language. He admits that this kind of discussion is definitely fun, but if you want to develop a production-quality system, the decision depends on the maturity of the language (and its tools) and the skills of your team.

The second article, by Luke Plant, is titled "Why learning Haskell/Python makes you a worse programmer". He describes his experience of using concepts from not-so-common languages in daily work. His conclusion is that, not surprisingly, you get very bad results if you use, for example, sophisticated functional-language idioms in code whose language does not support the underlying concepts.

In the real world, knowing several (natural) languages gives you a definite advantage. But, and this is my question, is the same true for programming languages?

Learning a language takes a lot of time and, most of all, practice. Most professional software developers use only one (or at most two) languages for their daily work, and they speak them fluently (that's why they are called "professionals" ;-). These days this means Java and sometimes C#; a decade ago it was mostly C/C++. The tides change, so it goes without saying that a developer learns more than one language in his life (at least I know one who knows someone who knows Fortran). Today there are a lot of new languages on the surface: some really new, such as Ruby, some rediscovered, such as Erlang, and some which have been around for a while but hardly noticed (Python). There are also a number of more academic languages, such as the upcoming Fortress and Scala. And there are all the rumours about Domain Specific Languages.

New languages may have features you want to use, features so sophisticated and elegant that you never want to see braces again, but does it make sense for your daily work, and for its quality, to learn one of them? There are two different views on this: for geeks and developers it makes absolute sense to do this, to learn new concepts, write little and shiny programs and have some fun explaining your knowledge to the "illiterate". But for your project and your company it can make no sense at all; it can, as Luke Plant describes it, be a waste of time. For example, Java and C# are easy-to-learn languages, but using them in a productive and professional way takes a lot of time, and besides, nobody will ever know all the features of their frameworks. So for your daily work it may be best to practise what you already know, develop useful idioms and patterns, read about new concepts in your language, such as Generics or LINQ, and learn to use them to build readable, testable and stable code. It may be that you are skilled enough to speak several languages fluently and to keep them in practice, but most of us developers are not (mostly because of lack of practice).

For me personally, it makes sense to learn other languages because they can teach you new ideas and concepts, and it is at least possible to practise if you take part in some open source projects or build small tools for your personal use. On the other hand, at work you need every bit of knowledge you have to develop a good piece of software, so every experiment can be harmful.