Thursday, June 18, 2009

[Misc] Resilient Services & Software Engineering

I recently read the interesting paper by Brad Allenby and Jonathan Fink, "Toward Inherently Secure and Resilient Societies", published in Science, August 2005, Vol. 309, and, surprisingly enough, free to download. This paper is apparently "inspired" by the attack on the World Trade Center, but it discusses the resilience of the important systems our modern societies depend on in a more general way. The authors' definition of resilience is:
"Resiliency is defined as the capability of a system to maintain its functions and structure in the face of internal and external change and to degrade gracefully when it must."
They further state that:
"[...] the critical infrastructure for many firms is shifting to a substantial degree from their physical assets, such as manufacturing facilities, to knowledge systems and networks and the underlying information and communications technology systems and infrastructure.

[...] the increased reliance on ICT systems and the Internet implied by this process can actually produce vulnerabilities, unless greater emphasis is placed on protecting information infrastructures, especially from deliberate physical or software attack to which they might be most vulnerable given their current structure."
The authors apparently have more physical infrastructure in mind (physical network backbones and the like). I am, however, a little more worried about the pace at which certain types of rather fragile IT services are becoming a foundation for our communication and even our business models.

I wrote in a recent blog post about my thoughts on Twitter, which has become even more important considering the latest political events in Iran and the use of this communication infrastructure in the conflict. Twitter is (as we know from the past) not only a rather fragile system; it is also proprietary and has no fallback solution in place in case of failure.

But Twitter is not the only example: many of the new "social networks" are proprietary and growing very fast, and one wonders how stable the underlying software, hardware and data-management strategies are. Resilience is apparently not a consideration in a fast-changing and highly competitive market. At least not until now.

But it is not only market forces that are troubling these days; political activities can also affect large numbers of systems. Consider the new "green dam" initiative, where Chinese authorities demand that each Windows PC have a piece of filter software pre-installed that is supposed to keep "pornography" away from children. This is of course the next level of Internet censorship, but that is not my point here. My point is that this software will probably be installed on millions of computers and poses a significant threat to the security of the Internet if it contains security holes.

Analyses of the green dam system have already revealed a number of serious issues. For example, Technology Review writes about potential zombie networks, and Wolchok et al. describe a series of vulnerabilities. And this is not the only attempt in that direction. Germany, for example, is discussing "official" computer worms that the authorities install on suspects' computers to analyse their activities. France and Germany want to implement Internet censorship via blocking lists of websites. The lists of blocked websites are not to be revealed, and it is questionable who controls the infrastructure. Similar issues can be raised here.

I believe that software engineering should also start dealing with the resilience of ICT services and describe best practices and test strategies that help engineers develop resilient systems and also allow them to assess the risks involved in deployed systems. I am afraid we are more and more building important systems on top of very fragile infrastructure, and this poses significant risks for our future society. This infrastructure might be fragile on many levels:
  • Usage of proprietary protocols and software that make migration or graceful degradation very difficult
  • Deployment of proprietary systems to a large number of computers, where these systems cannot be properly assessed in terms of security vulnerabilities or other potential misuse, instead of providing the option to deploy systems from different vendors for a specific purpose
  • Single points of failure: many of the new startups operate only very few datacenters, possibly even at a single location
  • Interdependence of services (e.g. one service uses one or more potentially fragile services)
  • Systems that can easily be influenced by pressure groups, e.g. to implement censorship (centralised infrastructure vs. p2p systems)
  • Weak architecture (e.g. systems that do not scale)
  • Missing fallback scenarios and graceful degradation.
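
To make the last points a bit more concrete, here is a minimal sketch in Python of what a fallback chain with graceful degradation could look like. All names in it (fetch_with_fallback, primary, mirror) are hypothetical and made up for illustration only; a real service would additionally need timeouts, health checks, monitoring and so on.

```python
from typing import Callable, Sequence, Tuple

# A provider is a (name, callable) pair; both are hypothetical examples.
Provider = Tuple[str, Callable[[], str]]


def fetch_with_fallback(providers: Sequence[Provider], degraded: str) -> Tuple[str, str]:
    """Try each provider in order and return (source, result).

    If every provider fails, return a degraded default instead of
    raising, so the caller can keep working with reduced functionality.
    """
    for name, fetch in providers:
        try:
            return name, fetch()
        except Exception:
            # This provider is unavailable; fall through to the next one.
            continue
    return "degraded", degraded


def primary() -> str:
    # Simulates a single datacenter that is currently down.
    raise ConnectionError("primary datacenter unreachable")


def mirror() -> str:
    # Simulates an independent second provider.
    return "latest messages (from mirror)"


if __name__ == "__main__":
    source, result = fetch_with_fallback(
        [("primary", primary), ("mirror", mirror)],
        degraded="cached messages (possibly stale)",
    )
    print(source, "->", result)
```

The point of the sketch is only that the caller never depends on a single provider and always gets some answer, even if it is a reduced one. This is of course much easier when open protocols allow several independent providers to exist in the first place.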


Andreas Hubmer said...

Thanks for this interesting article.
I was already aware of censorship of the Internet and that the neutrality of the Internet is in danger, but I had not seen proprietary software as part of this problem.
I agree that relying on proprietary communication software (Facebook, Twitter, ...) is not the best solution, but I also want to mention that a proprietary Facebook/Twitter is better than no Facebook/Twitter.

Two successful projects come to mind: Wikipedia and Jabber. Wikipedia is an open project par excellence, and Jabber is highly decentralized. Maybe it is possible to use them as inspiration for how to enhance other projects or create a better (open) social network.

Alexander Schatten said...

I think you are making an interesting point. Better Twitter/Facebook than nothing.


Is a crappy compass and map better than nothing? Since you have nothing else, and the stuff works more or less fine in the neighbourhood, you start relying on it for future expeditions. Then, when you are really in the wilderness and dependent on it, the stuff turns out not to work any longer (compass) or to be substantially wrong (map).

Plus: no one is able to fix it, because the knowledge/technology is not accessible.

I am not very happy with that idea. Yes, I am also using Twitter; at the same time I know that it is actually a pretty dumb idea...

However, I totally agree with your last paragraph. This was one of the intentions of my post.