Architecture


If you want to see an excellent way to represent complex concepts graphically, look at the work of Bryce on Flickr. This diagram represents, after much gobbledygook, the concepts introduced during a project I was leading. I must say that this work really helped communicate all the different objects and concepts.


Bryce’s version of the Flickr user model, which you can read here, is even better.

I’ve really appreciated working with you, Bryce.

Today, Microsoft and other vendors announced that they have published, or intend to publish later, a new language for modeling data center resources and their management in XML. This language is called the Service Modeling Language, and it is based on Schematron.

As a result of collaboration, the open, industry-wide specification defines a common language for expressing information about IT resources and services. Called the Service Modeling Language (SML), the specification enables a hierarchy of IT resource models to be created from reusable building blocks rather than requiring custom descriptions of every service, thus reducing costs and system complexity for customers. The group plans to submit the draft specification to an industry standards organization later this year.

This language is intended to replace Microsoft’s System Definition Model.

Some more information is available at SearchWebServices.com.

Here is an interesting piece of information from this post on PushToTest:

Caucho told developers that they are seeing an astonishing 4 to 6 times performance improvement over the C-version of PHP. Quercus runs with JVM thread safety – something not available to PHP developers today – to enable things like database connection pooling in a threaded environment. Quercus is expected to ship by December 2006.

Here is another mention of this information on ServerSide :

Apparently, the PHP pages are compiled in the background to byte-code, and the resulting performance is six times that of Apache mod_php!

And this post is generating quite a few comments.

A more definitive source of information is available on the Caucho forum, where some actual numbers have been posted:

| Test Name | Resin/Quercus | Apache 2.0/PHP 5.0 |
|---|---|---|
| file_1k | 6341 ops | 3255 ops |
| file_1k (10 clients, 16 keepalive) | 13186 ops | 6154 ops |
| file_64k | 857 ops | 841 ops |
| file_64k (10 clients, 16 keepalive) | 1019 ops | 995 ops |
| file_7m | 10.7 ops | 11.8 ops |
| jsp_1k | 7070 ops | n/a |
| gzip_1k | 2570 ops | n/a |
| gzip_1k (cache) | 6529 ops | n/a |
| gzip_64k | 343 ops | n/a |
| gzip_64k (cache) | 6220 ops | n/a |
| ssl_1k | 173 ops | n/a |
| ssl_1k (10 clients, 16 keepalive) | 1795 ops | n/a |
| ssl_64k | 85 ops | n/a |
| ssl_64k (10 clients, 16 keepalive) | 155 ops | n/a |
| php_1k | 4194 ops | 1151 ops |
| php_1k (10 clients, 16 keepalive) | 7806 ops | 1508 ops |
| mediawiki | 17 ops | 5 ops |
| mediawiki (10 clients, 16 keepalive) | 17 ops | 5 ops |
| mediawiki (proxy cache) | 3546 ops | 5 ops |
| drupal | 33 ops | 10 ops |
| drupal (10 clients, 16 keepalive) | 30 ops | 10 ops |

and, with PHP acceleration :

| Test | Resin/Quercus | Apache/PHP/eaccelerator |
|---|---|---|
| drupal | 46 ops | 43 ops |
| wiki | 30 ops | 17 ops |

With more tuning, PHP performance might match that of Resin/Quercus, but this should be a wake-up call to all the Java detractors. Java IS NOT SLOW!
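To put the "4 to 6 times" claim next to the posted numbers, the ratios can be computed directly. This is only a small sketch: the figures are copied from the tables above, while the dictionary layout and the `speedup` helper are my own illustration and not part of the Caucho post.

```python
# Throughput figures (ops) copied from the benchmark tables above.
# Each entry: test name -> (Resin/Quercus, Apache/PHP)
PLAIN_PHP = {
    "php_1k": (4194, 1151),
    "php_1k (10 clients, 16 keepalive)": (7806, 1508),
    "mediawiki": (17, 5),
    "drupal": (33, 10),
}

# Same idea for the run with eaccelerator enabled on the PHP side.
ACCELERATED_PHP = {
    "drupal": (46, 43),
    "wiki": (30, 17),
}


def speedup(quercus_ops, php_ops):
    """Return how many times more ops Quercus served than PHP for one test."""
    return quercus_ops / php_ops


def speedups(results):
    """Compute the Quercus/PHP ratio for every test, rounded to two decimals."""
    return {name: round(speedup(q, p), 2) for name, (q, p) in results.items()}


PLAIN_SPEEDUPS = speedups(PLAIN_PHP)
ACCELERATED_SPEEDUPS = speedups(ACCELERATED_PHP)

if __name__ == "__main__":
    for label, table in (("plain PHP", PLAIN_SPEEDUPS),
                         ("with eaccelerator", ACCELERATED_SPEEDUPS)):
        for name, ratio in table.items():
            print(f"{label:18} {name}: {ratio}x")
```

On these particular rows the plain PHP ratios land between roughly 3x and 5x, and the gap narrows considerably once the PHP accelerator is enabled.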

In a previous post, I tried to capture the functions required to manage datacenters and services across their lifecycle. In a subsequent one, I tried to map the functions of VMWare Virtual Infrastructure 3 onto them. Obviously, the granularity was not right. Below is another attempt to capture what a holistic management offering should be:

[Figure 200606261336: breakdown of the management functions]

Hardware Management

This is not specific to the virtualization domain, but it is critical for an end-to-end solution. Usually, this part is implemented with solutions like IBM Director, N1 System Manager, or HP Insight Manager. However, one of the many issues in a virtualized setup is controlling virtual machines in a way similar to physical machines (with a virtual IPMI implementation, for example).

  • HW Discovery: Discovery of bare metal systems using IPMI or other vendor-specific baseboard interfaces. The machine does not need to have an OS or a Domain 0 to be discovered.
  • FW Update: Update of the BIOS or other hardware-specific software. Usually requires tight integration with hardware management interfaces.
  • HW Monitoring: Monitoring of environmental information like fan speed, CPU temperature, …
  • Hardware Abstraction Layer: A kind of driver layer that decouples the hardware-specific functions from the various protocols or vendor interfaces. Critical to be able to, for example, configure the machine to boot using PXE.

Hypervisor

The central piece of the virtualization offering; this could be Xen or VMWare ESX, for example. This layer is now a commodity and is offered for free as part of the OS.

Virtual Server Management

Functions usually implemented in Domain 0 (aka Host OS, or Service Partition).

  • Virtual Device Management: Management of devices exported to virtual machines. This includes network, disks, PCI devices, consoles, … This function is usually tightly tied to the hypervisor and shares its implementation with the virtual server management. Can include fault management of the various devices.
  • VM Management: Life cycle management of the virtual machines instantiated on a given host. Essentially provides methods to boot VMs from local disk, CD-ROM, or network (e.g. through PXE).
  • Volume Management: Provides a logical volume manager, as well as various storage-related functions.
  • Resource Management: Allocation of physical resources to the VMs hosted on a given server. Includes the scheduler configuration.
  • Monitoring: Monitoring of resources used by the various VMs. Could be an aggregation point for information collected in the various guest OSes.
  • Migration: Live migration of VMs from a server to a given target server. Could be more or less stateful or real time depending on the virtualization layer capabilities.

OS Management

This mainly targets the management of the guest OSes and is therefore not specific to the virtualization domain, but it is critical for an end-to-end solution. It could be implemented by solutions like IBM Director, N1 System Manager, or HP Insight Manager, but this would bring the additional complexities introduced by the management of virtual machines.

  • OS Provisioning: Infrastructure required to implement NAS boot or SAN boot, as well as building OS profiles.
  • OS Monitoring: Monitors OS parameters such as CPU utilization, directly from the guest OSes.
  • OS Patching: Update of OSes with patches. Additional complexity arises when VMs that are not currently deployed must be updated.

Datacenter Management

  • Datacenter Resource Management: Grouping of server, storage, and network resources and their allocation to various services. Basis of multi-tenant aware solutions, or hierarchical resource management capabilities. Can implement some level of workload management.
  • Failover: Migration of virtual machines based on availability-related policies.
  • Configuration Management: Repository of various information about virtual machine parameters or OS configurations. Could also be a repository for OS profiles, OS versions, …
  • Policy Management: More generic policies driving operations like migration or provisioning. Can be linked to performance goals, cost/utilization goals, or schedules (calendar events).
  • Automation: Definition of workflows on top of basic operations, allowing the implementation of complex operations.

The next step is to try to map the various vendors to this breakdown.
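As a starting point for that mapping, the breakdown above can be written down as a simple data structure and compared against what a given product offers. This is only a sketch: the function names come from the lists above, while `coverage` and the example offering are hypothetical and do not describe any real vendor.

```python
# The breakdown from this post: layer -> management functions.
# The Hypervisor layer is listed without sub-functions in the post.
BREAKDOWN = {
    "Hardware Management": [
        "HW Discovery", "FW Update", "HW Monitoring",
        "Hardware Abstraction Layer",
    ],
    "Hypervisor": [],
    "Virtual Server Management": [
        "Virtual Device Management", "VM Management", "Volume Management",
        "Resource Management", "Monitoring", "Migration",
    ],
    "OS Management": ["OS Provisioning", "OS Monitoring", "OS Patching"],
    "Datacenter Management": [
        "Datacenter Resource Management", "Failover",
        "Configuration Management", "Policy Management", "Automation",
    ],
}


def coverage(offered):
    """Split each layer's functions into those an offering covers and those it misses."""
    offered = set(offered)
    report = {}
    for layer, functions in BREAKDOWN.items():
        report[layer] = {
            "covered": [f for f in functions if f in offered],
            "missing": [f for f in functions if f not in offered],
        }
    return report


# Hypothetical offering, invented purely for illustration.
EXAMPLE_OFFERING = ["VM Management", "Migration", "Resource Management", "Failover"]
REPORT = coverage(EXAMPLE_OFFERING)

if __name__ == "__main__":
    for layer, result in REPORT.items():
        print(layer, "covered:", result["covered"], "missing:", result["missing"])
```

Running one such report per vendor would give a quick side-by-side view of which layers each one addresses.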

With the increased popularity of SOA (service-oriented architecture), enterprises are faced with new challenges:

  • Applications are decomposed into service units, distributed across multiple architectures, multiple management domains, or even different governance models. Think about the difference in complexity between a single application deployed on a mainframe and a composite application with components running on Solaris and Windows, one in the finance department, another in the HR department, and a third in that newly acquired company. It is clear that new management paradigms or infrastructure models are required to address this new type of application.
  • Applications have dependencies on many parts and components. Deploying such an application is more complex than just installing a binary on a single machine. Starting this application is no less challenging. How do you ensure that all the required components are up and running?
  • The definition of an application can evolve dynamically. Think about the possibilities that BPEL-based orchestration engines bring to the table. New workflows, with new dependencies, can be created and deployed dynamically by business analysts, who may not be all that plugged into the operations side of the datacenter. How do you deal with that dynamic environment?
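The start-up question in the second point is essentially a dependency-ordering problem. Here is a minimal sketch of it, assuming the dependencies are known up front; the component names are made up for illustration, and the ordering relies on `graphlib.TopologicalSorter` from the Python standard library (3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical composite application: component -> components it depends on.
DEPENDENCIES = {
    "database": [],
    "directory": [],
    "payroll-service": ["database", "directory"],
    "hr-service": ["database", "directory"],
    "portal": ["payroll-service", "hr-service"],
}


def start_order(dependencies):
    """Return an order in which every component starts after its dependencies.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


def missing_dependencies(component, running, dependencies):
    """List the dependencies of a component that are not running yet."""
    return [dep for dep in dependencies[component] if dep not in running]


ORDER = start_order(DEPENDENCIES)

if __name__ == "__main__":
    print("start order:", ORDER)
    print("portal is waiting for:",
          missing_dependencies("portal", {"database", "payroll-service"}, DEPENDENCIES))
```

In a dynamic environment, where analysts deploy new workflows at any time, the dependency map itself would have to be rebuilt or discovered continuously, which is the harder part of the problem.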

It seems clear that the datacenter and its management have to evolve into something that is more service oriented, or service focused. My focus being mainly the management side, this translates into the following problem: how to deploy and run these new applications in a way that integrates tightly with the execution infrastructure and with the definition of the services.

The first step was to try to define an operational model that would follow some standards or best practices.

The Telco view.

This problem has been solved by Telcos for decades now. Their operations are highly service focused. They have even defined an operations map capturing the business processes required to deliver and manage services. The latest version, eTOM (the enhanced Telecom Operations Map), has become even less Telco specific (e.g. the term network has been replaced by the term resource).

The IT view.

In parallel, ITIL (the IT Infrastructure Library) was developed, but it took a more inward-focused point of view. There is little mention of customers in the ITIL books (OK, except maybe in the Service Desk chapter). Most of the best practices address problems that an IT department would face in its quest to improve its maturity. The two main books in this library are the Service Delivery book and the Service Support book.

The synthesis.

It is obvious that those two points of view had to align and come together in some way or, at least, be able to relate to each other. A document published by the TMF (TeleManagement Forum), GB921V (absorbing the content of GB921L), helps practitioners make sense of those two models. Simplifying a lot, we can say that ITIL processes can be built by using eTOM process elements.

A proposal for a service-centric functional model

My primary focus was to define a model that would align nicely with ITIL, either by not forcing an incompatible operational model or by being able to support and automate the customers’ business processes. It’s a mix-and-match, simplified melting pot of ITIL, eTOM, and our N1 concepts. The result can be seen in the following picture:

[Figure 200606042205: service-centric functional model]

Definitions

  • Resource Provisioning: In the case of compute resources, deployment of OS images on systems and groups of systems. Includes the setup of images, boot servers, and the network elements required to support this function (DHCP, DNS/NIS, NFS servers).
  • Resource Update: In the case of compute resources, deployment of one or more OS or application patches on systems. Includes patch analysis, dependency analysis, and analysis of compliance with a defined baseline.
  • Failover : Reprovisioning of compute elements in the event of an unrecoverable hardware error. All appropriate software elements are deployed on the new compute element, and all storage and network elements are reattached to it.
  • Resource Accounting: Recording of all events that can be associated with the usage of a compute resource. It can also include specific metering to provide a finer accounting of resource usage.
  • Advanced Monitoring: In-depth monitoring of compute and software elements, including display of specific field replaceable units (FRUs) and isolation of failed FRUs for replacement.
  • Discovery : Automated dynamic discovery of compute, storage, network and software elements.
  • Firmware update : Firmware upgrade of compute elements. Includes all the operations required to perform the update, like driving the boot sequence, the distribution of the update to the machine, the tracking of firmware levels, the analysis, and comparison with a baseline.
  • Hardware Abstraction : Provides a level of management operation abstraction on top of hardware element specific management operations.
  • Service Design: Operations required to capture the business service requirements, composition, and specific operational model (how to start/stop it, deploy or upgrade it, …). Includes the definition, management, or use of patterns, reference models, or templates.
  • Service Provisioning: Deployment of services on a set of possible resources. Includes the deployment of the required software elements and the configuration of specific software, network, or storage elements. Also includes the capture of existing deployments for duplication and the update of the configuration management database.
  • Application Deployment: Deploys software applications and their components, taking into account all required application realization conditions and interdependencies. Includes the definition and capture of application models and configuration, as well as the abstraction of application parameters in order to deploy applications across various datacenters or life cycle phases.
  • Configuration Management: Maintains the configuration management database. Includes entering new hardware and software elements either manually or through discovery, updating the configuration by synchronizing it with the deployed services and elements, and auditing the configured services and elements for compliance with their expected state.
  • Service Update: Update of complex services through multi-step processes. Includes delivering the updates to the various elements, orchestrating the upgrade by executing the individual upgrade operations while minimizing downtime, and updating the configuration management database with up-to-date information.
  • Performance Management : Enforce performance related Service Level Objectives by monitoring the services and their elements for compliance with key performance indicators, report any violation, and initiate remediation operations in order to maintain SLOs.
  • Availability Management: Enforces availability-related Service Level Objectives by monitoring the services and their elements for availability, reporting any violation, and initiating remediation operations in order to maintain SLOs. Includes multiple strategies for remediation, depending on the capabilities of the infrastructure and local policies.
  • Workload Management: Enforces datacenter-wide resource allocation policies, including associating the right resources with services and optimizing the resources across the datacenter in order to achieve specified objectives (lower cost, increased utilization, increased throughput, …). Includes matchmaking of resources with specified service constraints and SLOs, maintaining the current resource allocation in the CMDB, optimizing the allocation, and generating accounting information.
  • Service Accounting: Provides service-level accounting by aggregating the resource usage of all the service elements. Includes the collection of usage events, their normalization to take into account the dynamics of resource allocation, the composition and aggregation into per-service resource usage records, and the reporting to the relevant billing entity.
  • Orchestration: Orchestrates and automates the various operations required across the life cycle of a service. Includes the design of service-specific workflows, the execution of these workflows at predefined life cycle events, the monitoring of the execution, and the remediation in case of failure of the workflows. The orchestration implements the specific operational models for the services.
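To make one of these definitions more concrete, here is a minimal sketch of the aggregation step described under Service Accounting: per-resource usage events are rolled up into per-service usage records. The event fields, metric names, and sample values are assumptions made for illustration; collection, normalization, and reporting to a billing entity are left out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    """One usage measurement, tagged with the service using the resource (hypothetical shape)."""
    resource: str
    service: str
    metric: str
    amount: float


def aggregate_by_service(events):
    """Sum usage per service and per metric across all resources."""
    totals = defaultdict(lambda: defaultdict(float))
    for event in events:
        totals[event.service][event.metric] += event.amount
    # Convert nested defaultdicts to plain dicts for easier reporting.
    return {service: dict(metrics) for service, metrics in totals.items()}


# Sample events, invented for illustration. The same server appears under two
# services because resources can be reallocated between services over time.
EVENTS = [
    UsageEvent("server-1", "payroll", "cpu_hours", 2.0),
    UsageEvent("server-2", "payroll", "cpu_hours", 1.5),
    UsageEvent("server-2", "hr-portal", "cpu_hours", 0.5),
    UsageEvent("san-1", "payroll", "gb_stored", 40.0),
]
RECORDS = aggregate_by_service(EVENTS)

if __name__ == "__main__":
    for service, metrics in RECORDS.items():
        print(service, metrics)
```

Tagging each event with the service at the time of use is one simple way to cope with dynamic allocation; a real implementation would more likely join raw resource events against the allocation history kept in the CMDB.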

The EJB 3.0 specification really simplifies the development of entity persistence. Rahul and Ed detail the differences in this article. Annotations really simplify the developer’s job, while still allowing customization. The same holds true for JAX-WS 2.0.

And all of that is open sourced in GlassFish, with performance improvements over the previous Application Server benchmarks … look at the results posted by Scott Oaks.

Excellent … 167% improvement above previously posted results …

This book by David A. Chappell seems very interesting. I have to add it to my to-read list.

It seems to be very relevant to efforts like Open ESB and to how ESBs can be applied in service-oriented architectures.