Wednesday, April 21, 2010

[Arch] Build scalable systems that handle failure without losing data

I found a very interesting article on the MSDN Architecture Center that illustrates a real-life use case for scalable systems. Designing and building scalable systems is one of the major challenges for software engineers. Many best practices and patterns on the web address the problem, but the specific design and implementation differ from project to project. This article describes a real-life example of such a system and the essential steps taken to build a scalable system that also handles failure without losing data. The following topics are covered:

  • HTTP and Message Loss
  • Durable Messaging
  • Systems Consistency
  • Transactional Messaging
  • Transient Conditions
  • Deserialization Errors
  • Messages in the Error Queue
  • Time and Message Loss
  • TimeToBeReceived
  • Call Stack Problems
  • Large Messages
  • Small Messages from Large
  • Idempotent Messaging
  • Long-Running Processes
  • Learning from Mistakes
