2010/10/13

Why new Secure Internet solutions are technically Hard

Information Security is both very hard and very easy at the same time.

Internet Nasties are not just a nuisance, or worse: they prevent new, useful Applications and Networks like e-Commerce, i-EDI, e-Health, e-Banking, e-Government and other business/commercial transaction systems.

Perfect Security isn't possible: ask any bank.

Defenders need to be 100.00% correct, every minute of every day.
Attackers need just one weakness for a moment to get in.

Not all compromises/breaches are equal: they range from nothing of consequence up to an attacker being in full control without the system owners being aware of it.

All 'Security Systems' can only be "good enough" for their role, which depends on many factors.
How long do you need to keep your secrets? Minutes or Decades?


2010/09/20

Quality and Excellence: Two sides of the same coin

Quality is predicated on Caring.
High Performance, also called "Excellence",  first requires people to Care about their results.

They are related through the Feedback Loop of Continuous Improvement, also known as O-O-D-A (Observe, Orient, Decide, Act) and Plan-Do-Check-Act (from W. Edwards Deming).

The Military take OODA to another level with After-Action-Reviews or After-Action-Reports (AARs), a structured approach to acquiring "Lessons Learned".

High Performance has two aspects: work-rate and consistency.
It's not enough to produce identical/consistent goods or results every time; you have to do it with speed as well.

There's an inviolable Quality Dictum:
You can't Check your own work.

For Organisations, this Dictum becomes:
 Objective assessment requires an Independent Expert Body.

From which follows the necessity for an External Auditor:
  Only Independent persons/bodies can check an Organisation and its people/processes for compliance and performance.

For around 80 years, Aviation has separated the roles of Investigation, or Root Cause Analysis, from Regulation, Compliance and Consequences. In the USA the NTSB Investigates and the FAA Regulates. This has led to consistent, demonstrable improvement in both Safety and Performance. Profitability is linked to Marketing, Financial Management and Administration, not just Performance.

All of which leads to the basic Professional Test for individuals:
 "Never Repeat, or allow to be repeated, Known Errors, Faults and Failures".

And the Raison d'être of Professional Associations or Bodies:
 To collect, preserve and disseminate Professional Learnings of Successes, Failures, Discovery and Invention.

Barry Boehm neatly summarises the importance of the Historical Perspective as:
Santayana half-truth: “Those who cannot remember the past are condemned to repeat it”

Don’t remember failures?
  • Likely to repeat them
Don’t remember successes?
  • Not likely to repeat them

All these statements are about Organisations as Adaptive Control Systems.

To effect change/improvement, there have to be reliable, objective measures of outputs and the means to effect change: Authority, the Right to Direct and Control, the ability to adjust Inputs or Direct work.

Which points the way as to why Outsourcing is often problematic:
  The Feedback Loop is broken because the hirer gives up Control of the Process.

Most Organisations that Outsource critical functions, like I.T., completely divest themselves of all technical capability and, from a multitude of stories, don't contract for effective Quality, Performance or Improvement processes.

They give up both the capability to properly assess Outputs and Processes, and the Control mechanisms to effect change. Monthly "management reports" aren't quite enough...

2010/09/12

Business Metrics and "I.T. Event Horizons"

Is there any reason the "Public Service", as we call paid Government Administration in Australia, isn't the benchmark for good Management and Governance??

Summary: This piece proposes 5 simple metrics that reflect management effectiveness and competence, but are not in themselves pay or performance measures:
  • Meeting efficiency and effectiveness,
  • Time Planning/Use and Task Prioritisation,
  • Typing Speed,
  • Tool/I.T. Competence: speed and skill in basic PC, Office Tools and Internet tools and tasks, and
  • E-mail use (sent, read, completed, in-progress, pending, never resolved, personal, social, other).


As a taxpayer, in a world of White Collar Desktop Automation, I'd expect some quantitative metrics for "efficiency" and "effectiveness" as required of Agency heads in s44 of the FMAA (Financial Management and Accountability Act, 1997), not just some hand-waving, bland reassurance by those Heads and the Audit Office that "we're World's Best Practice, Trust Us".

We know that "what you measure is what you get" (or is maximised) and that career bureaucrats are:
  • risk averse (C.Y.A. at all times),
  • "very exact", they follow black-letter rules/regulations to the precise letter, and
  • very adept at re-interpreting rules to their advantage.
Unanticipated outcomes abound, not the least of which is using reasonable rules to fire or move-on challenging and difficult people, such as "whistle-blowers", innovators, high-performers (showing up others is "career suicide") or those naively enquiring "why is this so?".

These "challenging" behaviours are exactly those required under s44 of the FMAA to achieve:
 "Efficient, Effective and Ethical use of Commonwealth Resources",
yet they are almost universally considered anathema by successful bureaucrats.

This bureaucratic behaviour also extinguishes and punishes exactly the elements of "Star Performers" identified in the research of Robert E. Kelley.
Kelley's research was done in the mid-90's; its lack of take-up, or penetration, in the Public Service leads to another question: Why not?

In the Bureaucratic world, asking for one thing gets precisely the opposite result.
Something is missing, wrong or perverted... But this has been the essential nature of large bureaucracies since the Roman Empire.

There's a double- or triple-failure going on here:
  • The desired outcomes are not being achieved,
  • This isn't being detected by the Reporting and Review mechanisms: Agency Annual Reports or Audit Office reports, and
  • a culture of non-performance is built, reinforced and locked-in.
An aside:
  If any Agency is truly managed to the minimum standards required by the FMAA, the three-E's, how could there ever be any whistle-blowing??

That there are whistle-blowers such as Andrew Wilkie is proof of systemic, perhaps systematic, failures, and worse, at many levels.

Simple-mindedly imposing minimum "standards" across-the-board would not only be a waste of time, but would be massively counter-productive within this environment.

So what might work??

"Events Horizons" in the world of Information Technology may point the way.

Between 1990 and 1995, using {Intel 486, 256Mbit RAM chips, twisted-pair Ethernet LANs, and Windows Desktop plus File and Print Servers}, PC Desktops made their way onto the bulk of clerical desktops. Usage was mainly "Productivity Applications", mainframe access and a little in-house messaging. Cheaper and Faster paper-and-pencils plus zero-delay transfer-to-next-in-process:  Automated Manual processing, with simple fall-back to actual manual processes.

From 1995 to 2000, Internet use took off and Office Desktops went from being expensive/Leading Edge items to low-end commodity units.  Usage focused more on e-mail, Intranets and some office tools. "Client-Server" became the buzz-word. New processes arose that weren't just Automated Manual processing.

After 2000, and the forced upgrades for Y2K, there was a 3-5 year technology plateau/recovery followed by an up-tick in penetration, usage and integration of I.T. tools and facilities.
File/Print and Office Tools "on the Network" are now taken "as a given", as are high-speed Internet links, good Security, "Standard PC images" and centralised I.T. purchasing, admin and support.
E-mail and Web access are ubiquitous and inter-system compatibility is a necessity.

From 1990 to the present, Government Agencies have moved from having all backend processing automated, to the majority of front-end processes and work tasks being dependent on I.T. Automation:  Desktops, Networks and Services.

Telephony systems are increasingly being moved to the common network, becoming less robust and less reliable in the process. We are yet to see the full impact of this trend and its reversal for critical services.

Now, when a large Agency has a major computer room malfunction or upgrade glitch, most or all office staff are sent home until all critical systems are restored.

This didn't happen only 10 years ago:
 the loss or slowing of back-end systems didn't halt all Agency work,  an effect unremarked and unreported by both Agencies and their oversight organisations, Finance and the Audit Office.

There are real End Service-Delivery implications of this current Event Horizon and they aren't being addressed or even acknowledged. Nor do these avoidable costs and employee time losses constitute efficient or effective management.

We've passed the Event Horizon of Dependence of Front Office Operations on I.T. [The next is complete dependence, thereafter "invisibly dependent", like water, gas and electricity.]

The bell can't be "unrung", we now have to manage our efforts in this context. Wanting to go back to "simpler times" is not an option for many and complex reasons. Even if it were possible or desirable...

Can we use this insight to define some universal metrics related to the use ("efficiency") and outputs ("effectiveness") of the technology and whose measurement has insignificant staff impact?

An aside:
  • Measuring values and analysing data, "instrumentation", always costs: money, time, complexity.
    This process/proposition is not free or trivial.
    Nor is the valid interpretation of read-outs always simple and obvious.
  • You wouldn't think of flying a 747 without instrumentation, and that wall of switches-and-displays is the minimum needed for "Safe and Effective" aviation (we know this from the constantly improving safety/performance metrics from the ATSB etc).
    Why do Managers and Boards think the larger, more complex, more costly machines we call "organisations" need little or no instrumentation?

Some Metrics

Business tasks, and perhaps also the stock-in-trade of Managers, "decisions", have four major internal dimensions:
  • Degree-of-Difficulty
  • Size
  • Urgency
  • Importance (and/or Security Classification)
And four related external dimensions:
  • Time: Deadline, timeliness or elapsed time
  • Minimum Acceptable "Quality"
  • Input Effort (work-time or staff-days)
  • Cost
All tasks, projects and processes have Inputs as well:
 Resources, Plant, Equipment, Tools, Information, Energy, Licences, etc
and, for well-defined tasks/projects, defined measurable Outputs.

When measuring Inputs vs tasks completed, actions taken or messages communicated, classification by Internal dimensions is necessary for "Like-with-Like" ("really-same", not "about-same") comparisons.
  • Are Urgent tasks/enquiries dealt with as appropriate?
  • Are Important tasks/projects completed? On time, On Budget, To Spec?
  • How many tasks of what size/difficulty are reasonable to expect?
  • Do staff find the work rewarding and motivating or not?
    Are they engaged, stimulated and developed through the work, or demotivated, unproductive and either leaving or Time-Serving until they can leave (the "golden handcuffs" of superannuation or other benefits)?
With E-mail, both the inputs and outputs are available to be analysed.
Individual and Group Outputs can be assessed according to the External dimensions.
  • Were budgets (costs, deadline, effort, resources) met for a matrix of different task classes (urgency, importance, size, difficulty)?
  • Were Quality targets met per task matrix?
  • Where were Staff Effort and Resources expended?
    Was this optimal or desired?
Measuring Realised Benefits and "Expectations vs Outcomes", the very heart of the matter, is beyond the scope of this piece.
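
A minimal sketch of how the classification above might be recorded so that "Like-with-Like" comparisons are possible; the field names and categories below are illustrative assumptions, not a prescription:

    from dataclasses import dataclass

    @dataclass
    class Task:
        # Internal dimensions - used to group "really-same" work
        difficulty: str        # e.g. "low" / "medium" / "high"
        size: str              # e.g. "small" / "medium" / "large"
        urgency: str           # e.g. "routine" / "urgent"
        importance: str        # e.g. "minor" / "major" / "critical"
        # External dimensions - the budgets and outcomes to compare
        deadline_days: float
        quality_target: str
        effort_staff_days: float
        cost: float

    def like_with_like_key(task: Task) -> tuple:
        """Group tasks by their Internal dimensions only, so comparisons
        are made within a class of really-same work."""
        return (task.difficulty, task.size, task.urgency, task.importance)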

Having generic I.T. tools available on every desktop and used for every task implies three metrics related to mastery of basic tools and skills:
  • Typing Speed,
  • Tool/I.T. Competence: speed and skill in basic PC, Office Tools and Internet tools and tasks, and
  • E-mail use (sent, read, completed, in-progress, pending, never resolved, personal, social, other).
There are two basic work-practice skills, where the I.T. systems gather data necessary for analysis:
  • Meeting efficiency and effectiveness
  • Time Planning/Use and Task Prioritisation
After 5 decades of definitive solutions to these basic and critical Business Processes being available, and of widespread training and consulting firms offering services in them, there is no excuse for poor meeting skills or undue waste of staff-time in meetings.

Nor for incompetent time management/task prioritisation and the associated waste of staff-time, idleness and under- or non-achievement of goals.

"Meetings, The Practical Alternative to Work" (or now, "Email, T-P-A-t-W"), is not "that's how it is here" or just amusing, it is an indictment of poor and ineffective management and failed governance systems.

So we hit an inherent contradiction:
 We need to measure basic performance metrics to effect improvement, but if we tie those metrics to rewards and punishments to force improvement, we can only create the opposite effect.


If tying Performance Metrics to pay, bonuses or "Consequences" isn't useful, why measure and report?

Jerry Landsbaum in his 1992 book, "Measuring and Motivating Maintenance Programmers" definitively answered this question.

Just by measuring and publicly posting individual resource usage, he was able to achieve radical changes in habits (and costs) without imposing penalties or instituting any formal process.

Reasonable people given timely, objective feedback will modify their behaviour appropriately.
Landsbaum went on to provide a suite of tools to his staff providing various code metrics.
Without direction, they consistently and deliberately improved the measured "quality" of their code.

As a side-effect, Landsbaum was able to quantify for his management considerable savings and productivity improvements. Most importantly, in a language and form they could understand:
 an Annual Report with year-on-year financial comparisons.

This approach is backed up by the work of Dr. Brent James, Executive Director of Intermountain Health Care in Salt Lake City, Utah, described in "Minimising Harm to Patients in Hospitals", broadcast in October 2000.

Dr James and his team spent time discovering exactly what caused the most harm to patients under their care, then prioritising and addressing those areas.

The major cause of "adverse events" (harm to patients) wasn't Human Error, but injuries due to Systems failures, by a factor of 80:1 (eighty times).

Charles Perrow calls these "Normal Accidents", whilst James T. Reason, author of "Human Error" and the "Swiss Cheese Model" of accidents, calls them "Organisational Accidents".

Perrow and Reason's work is the underpinning of the last 5 decades' improvement in Aviation and Nuclear safety. It's a sound theory that works in practice, backed by real, verifiable evidence.

Dr James said the approach: "could save as much as 15% to 25% of our total cost of operations" whilst delivering much reduced Adverse Events and better, more consistent, patient outcomes.

An unanticipated benefit of Dr James' work was identifying the occasional "bad physicians and nurses":
"If we see active criminal behaviour, if we see patterns of negligence or malfeasance, we will react."

(Because there was) "less noise in the system. It’s easier to see them.
And I have to tell you that was startling when we first encountered that.
We knew we needed to go after the 95% of the system’s failures
but as we started to take down those rates we also knew that there were some bad physicians,
it was just hard to find them,
and suddenly, there they were,
and we were able to take appropriate action."

Bundaberg 2005: Dr Jayant Patel

In Dr James' hospitals, Jayant Patel would have been quickly noticed and dealt with.

If you lived in Bundaberg, you might be asking why their systems didn't detect Patel, 5 years after Dr James' public broadcast in Australia.
Is there any excuse for the management at Bundaberg ignoring Dr James' proven, effective methods?
Especially as he'd documented substantial savings as well as fewer injuries and better patient outcomes from his approach: all the self-described goals of every Health system in the country.

Queensland Health's performance fails the basic Professional Test:
 "Repeat, or allow to be repeated, Known Errors, Faults and Failures".
And seemingly without timely, direct or personal consequences to anyone.

Whilst Patel is seen to be "the one bad apple", his being charged and held to account is not timely nor will it improve the standard of care for others, or cause lasting change where "it can't happen again".

Just what did those lives lost or needlessly destroyed, and the ruining of Patel, buy the community?
Seemingly, very little.
Retribution leaves ashes in the mouth, and "playing the Blame Game" only increases workplace fear and risk-averse management decisions. None of which drives useful or lasting organisational change.

In Bundaberg, the culprit is "The System" that let Patel firstly practise at all, then get away with bad performances for an extended period. I won't go into the poor treatment of the nurses that tried to address the situation and who eventually managed to get media attention.
It's all the same Organisational Failure.

Other 2005 events: Lockhart River aircrash and the sinking of the Malu Sara.

Where is the routine investigation and transparent, public reporting by an independent expert body akin to the ATSB/NTSB, as for the 2005 Lockhart River crash that killed 15 and led to the demise or deregistration of two companies? This same crash led to a Senate Inquiry critical of the oversight bodies: the coronial inquest and CASA. "Who watches the Watchers?" The Senate, for one.

Patel was linked to 87 deaths, six times more than the crash, though only convicted of the manslaughter of 3. In spite of the size of this "oops" and the overwhelming evidence of the power and effectiveness of the NTSB/FAA system in Aviation, there are no moves to effect this sort of change.

This isn't an isolated organisational condition or limited to any one level of Government or area of practice.

Consider the 2005 deaths of all five on board the "Malu Sara", a vessel specified, purchased and operated by the Department of Immigration. It sank 6 weeks after going into service.
The 2008 Coroner's Report is cited by a 2010 SBS programme, around 12 months after the Coronial Inquest, as saying:
... Queensland's coroner ruled it was a totally avoidable disaster, caused by the incompetence and indolence of senior Immigration official ...
The SBS programme claims:
  • "No charges were laid after a 2007 police investigation."
  • The senior Immigration official "avoided departmental disciplinary proceedings by retiring from immigration - with his full entitlements."
  • "The federal work place regulator ... is prosecuting Immigration over the deaths.
    The maximum penalty - a $240,000 fine."
  • "So far, all the families have received from authorities is an apology from Immigration and, in January, the department named two rooms in its Canberra headquarters after the deceased (departmental) officers."
The formal words of the Coroner, "incompetence and indolence",  should alarm anyone reading them, especially those with oversight of the Department or responsible for Good Governance.

This behaviour is never justifiable in a well managed organisation and is completely inconsistent with the Three-E's required of Agency Heads. That one senior officer failed their basic performance requirements, and was either undetected or known and allowed to continue, is a failure of Governance and oversight.

One major event like this is complete proof of failing under s44 of the FMAA.

The Audit Office has not investigated the matter, nor has Finance, the administrators of the FMAA, taken an interest.

In July/August 2006, Senate questions were asked in relation to AusSAR. A further Senate question was asked in May 2007 about the investigation:
Is the Department’s report on the Malu Sara incident a document that the Committee can have access to? If not, why?
Answer:
The Department’s report was provided to the Coroner at the directions hearing on 15 February 2007. The Coroner, on his own motion, made an order that prohibits the publication of the report other than to the formal parties to the proceeding.
There was an independent ATSB inquiry and report released in May 2006 (No 222) and a supplemental report (MO-2009-007) released in September 2009, reopening the investigation after the Coroner's Report.

In late 2009, ComCare issued a media release saying they would be launching court action against the Department and the boat builder.

The Head of Immigration in the 2008-9 Annual Report commented:
The department has since made changes and improvements to its procedures to ensure that such a tragedy could never occur again,...
It seems all Agencies involved in the matter are unaware of the Quality dictum:
 You cannot check your own work.

Organisationally, this equates to:
  Performance and Compliance can only be assessed by an Independent Expert Body.

Organisations can't investigate their own problems, nor categorically state, as Immigration has, "We fixed it, it can't happen again. Trust Us".

Since its formation in the US in 1926, the dual-body model used in Aviation, with one body to investigate causes (NTSB) and another to form and enforce regulations and issue non-compliance penalties (FAA), has shown itself to be an effective, possibly definitive, solution to Organisational Safety and Quality improvement.

From the steady improvement in aviation performance figures, an unintended effect of the dual-body system is that it may also improve Performance and Efficiency/Effectiveness for free.

Why is there this blindspot, especially as the NTSB/FAA model is so well known and respected throughout the commercial and public sectors, and in political and judicial circles?


Wrapping up Performance Metrics

Putting real numbers out in Public enables good people to lift their game while exposing poor performers and worse (malfeasance, misfeasance, nonfeasance, negligence, incompetence, indolence, ...)

Formalising the measurement of basic management outcome metrics and tying them to rewards and punishments can only result in disaster. Managers will devote themselves to doing whatever it takes to get promotion or reward, not to achieving their mission: good taxpayer outcomes and good governance.

Providing good data to taxpayers and their proxies, journalists and researchers, will provide enough leverage to see real, lasting change.

But this approach is not "sexy", big or expensive - and certainly can't be used as a punitive political weapon, either for career bureaucrats or their political masters.

Why wouldn't you do this if the whole Output of Government was dependent on the use of I.T. and you cared about "the efficient, effective and ethical use and management of public money and public property"?

So who will champion this idea?

Who has something to gain from real change? [Not the major Political Parties, not incumbent Bureaucrats, not existing oversight bodies: the status quo works for all of them.]

Who has the Motivation, Authority and Will to make real change happen?
That's the real question here...

2010/08/29

Top Computing Problems

The 7 Millennium Prize Problems don't resonate for me...

These are the areas that do engage me:
The piece for the second item, "Multi-level memory" is old and not specifically written for this set of questions. Expect it to be updated at some time.

    Internetworking protocols

    Placemarker for a piece on Internetworking protocols and problems with IPv4 (security and facilities) and IPv6 (overheads, availability).

    "The Internet changes everything" - the Web 2.0 world we have is very different to where we started in 1996, the break-through year of 'The Internet' with IPv4.

    But it is creaking and groaning.
    Around 90% of all email sent is SPAM (Symantec quarterly intelligence report).

    And since 2004 when the "Hackers Turned Pro", Organised Crime makes the Internet a very dangerous place for most people.

    IPv6 protocols have been around for some time but, like Group 4 Fax before them, are a Great Idea that nobody is interested in...

    What are the problems?
    What shape could solutions have?
    Are there (general) solutions to all problems?

    Systems Design

    Are these new sorts of systems possible with current commercial or FOSS systems?
    What Design and Implementation changes might be needed?

    How do they interact with the other 'Computing Challenges' in this series?

    Flexible, Adaptable Hardware Organisations

    Placemarker for a piece on flexible hardware designs.

    I'd like to be able to buy a CPU 'brick' at home for on-demand compute-intensive work, like Spreadsheets.
    I'd like to be able to easily transfer an application, then bring it back again.

    Secondly, if my laptop has enough CPU grunt, it won't have the Graphics processing or Displays (type, size, number) needed for some work... I'd like to be able to 'dock' my laptop and happily get on with it.
    The current regime is to transfer files and have separate environments that operate independently and I have to go through that long login-start-apps-setup-environment cycle.

    I prefer KDE (and other X-11 Windows Managers) to Aqua on Snow Leopard (OS/X 10.6) because they remember what was running in a login 'session', and recreate it when I login again.

    In 1995, I first used HP's CDE (IIRC) on X-11, that provided multiple work-spaces. This was mature technology then.

    It was only this year, 15 years on, that Apple provided "Spaces" for their users.
    Huh??

    We already have good flexible storage options for most types of sites.
    Cheap NAS appliances are available for home use, up to high-end SAN solutions for large Enterprises.

    For micro- and portable-devices, the main uses are "transactional" web-based.
    These scale well already, and little, if anything, can be done to improve this.

    Systems Design

    What flows from this 'wish list' is that no current Operating System design will support it well.
    The closest, "Plan 9", developed around 1990, allows for users to connect different elements to a common network and Authentication Domain:
    • (graphic) Terminals
    • Storage
    • CPU
    The design doesn't support the live migration of applications.

    Neither do the current designs of Virtual Machines (migrate the whole machine) or 'threads' and multi-processors.

    Datacentre Hardware organisation

    Senior Google staffers wrote The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, which I thought showed break-through thinking.

    The importance of this piece is it wasn't theoretical, but a report of What Works in practice, particularly at 'scale'.

    Anything that Google, one of the star performers of the Internet Revolution, does differently is worthy of close examination.  What do they know that the rest of us don't get?

    While the book is an extraordinary blueprint, I couldn't help but ask a few questions:
    • Why do they stick with generic-design 1RU servers when they buy enough for custom designs?
    • How could 19-inch racks, designed for mechanical telephone exchanges a century ago, still be a good, let alone best, packaging choice when you build Warehouse-sized datacentres?
    • Telecommunications sites use DC power and batteries. Why take AC, convert to DC, back to AC, distribute AC to every server with inefficient, over-dimensioned power-supplies?
    Part of the management problem with datacentres is minimising input costs whilst maximising 'performance' (throughput and latency).

    There are three major costs, in my naive view:
    • real-estate and construction costs
    • power costs - direct and ancillary/support (especially HVAC).
    • server and related hardware costs
    Software, Licensing and "Operation, Administration and Maintenance" (OAM) costs may also be 'material' in the Accounting sense, I don't have that information or sources describing them.
    [HVAC: Heating, Ventilation, Air Conditioning]

    The usual figure-of-merit for Datacentre Power Costs is "Power Usage Effectiveness" (PUE), the ratio of total facility power to the power that actually ends up being consumed by the Data Processing (DP) systems.
    Google and specialist hosting firms get very close to the Green IT "Holy Grail" of a PUE of One. Most commercial small-medium Datacentres have PUE's of 3-5, according to web sources.
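
    As a rough worked example of what those figures mean (the 1MW DP load below is an assumed number; PUE is taken as total facility power divided by DP power):

        dp_load_mw = 1.0                          # assumed power drawn by the DP gear itself
        for pue in (1.2, 3.0, 5.0):
            facility_mw = dp_load_mw * pue        # total power the site must buy
            overhead_mw = facility_mw - dp_load_mw
            print(pue, facility_mw, round(overhead_mw, 1))
        # PUE 1.2 -> buy 1.2MW (0.2MW overhead); PUE 5.0 -> buy 5.0MW (4.0MW overhead)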

    IBM Zurich, in partnership with ETH, has done work on water-cooling servers that effectively beats a PUE of one...
    2007, 2008 and 2009 press releases on their Zero Emission Datacentre.

    Their critical insight is that the cooling fluid can start hot, so the waste ('rejected') heat can be used elsewhere. Minimally for hot water, maybe heating buildings in cold climates. This approach depends on nearby consumers for low-grade heat, either residential or commercial/manufacturing demands.

    Water can carry around 3,500 times more heat per unit volume than air.  It's much more efficient to use water rather than air as the working fluid in heat exchangers. The cost (plus noise) of moving tons of air around, in capital costs, space and operation/maintenance, is very high. Fans are a major contributor to wasted energy, consuming around 30% of total input power (according to web sources).
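
    A rough back-of-envelope check of that ~3,500 figure, using assumed room-temperature properties for water and air:

        water_kj_per_m3_k = 997 * 4.18     # density (kg/m3) x specific heat (kJ/kg.K)
        air_kj_per_m3_k   = 1.2 * 1.005    # ditto for air
        print(round(water_kj_per_m3_k / air_kj_per_m3_k))   # 3456 -> roughly the "3,500 times" figure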

    But in all that, where's the measure of:
    • DP Power productively applied (% effectiveness, Η (capital Eta)), and
    • DP Power used vs needed to service demand (% efficiency, η).
    %-Effective, Η, is about power-use of hardware unavailable for production, poor architecture, poor OAM practices or duplicated-but-not-contributing 'mirrors' as in traditional active/passive redundancy. Network Appliance (rightly) make a lot of their active/active configuration.

    Pay for two, use two not one. A better than 50% reduction in capital and operational costs right there, because these things scale super-linearly.

    %-Efficiency, η, is the proportion of production DP power on-line used to serve the demand. Not dissimilar to "% CPU" for a single system, but needs to include power for storage and networking components, as well as CPU+RAM.
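
    A minimal sketch of how the two proposed ratios could be computed from measured powers; all names and figures are assumed, purely for illustration:

        def effectiveness(production_dp_kw, all_dp_kw):
            """Capital Eta: share of DP power feeding hardware that is
            actually available for production work."""
            return production_dp_kw / all_dp_kw

        def efficiency(demand_serving_kw, production_dp_kw):
            """Lower-case eta: share of on-line production DP power
            (CPU+RAM, storage, networking) actually serving demand."""
            return demand_serving_kw / production_dp_kw

        # Assumed example: 800kW of DP gear powered on, 600kW of it production-available,
        # and only 300kW of that needed to serve the current demand.
        print(effectiveness(600, 800))   # 0.75 -> 75% effective
        print(efficiency(300, 600))      # 0.5  -> 50% efficient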

    One of the critical deficiencies of PUE is it doesn't account for power-provision losses.
    Of each 10MW that's charged at the sub-station, what proportion arrives at the DP-element connector?
    [Motherboard, Disk Drive, switch, ...] There are many other issues with Datacentre design, here's a 2010 Top Ten.


    When is a Warehouse not a Warehouse?

    Google coined the term, Warehouse Scale Computing, so why aren't modern warehousing techniques being applied, especially in "Greenfield" sites?

    Why would anyone use raised flooring (vs solid or concrete) in a new datacentre?
    One of the unwanted side-effects of raised floors is "zinc whiskers". Over time, the soft metal edging on the tiles suffers tiny growths, 'whiskers'. These get dislodged and when they end up in power supplies, create serious problems.
    Raised floors make very good sense for small server rooms in office buildings, not warehouse-sized Datacentres.

    Automated warehousing technologies with tall racking and robotic "pick and place", with small aisles, seem to be quite applicable to Datacentres. Or at least trundling large sub-assemblies around with fork-lifts. These sub-assemblies don't need to be bolted in, saving further install and remove/refit time.

    Which goes to the question of racking systems.
    Servers are built to an archaic standard - both in width and height.
    Modern industrial and commercial racking systems are very different - and because they are in mass-production, available immediately.

    Which doesn't exactly go against the trend to use shipping containers to house DP elements, but suggests that real Warehouses might not choose container-based solutions.

    As the cost of Power, and even its availability, changes with economic and political decisions, its efficient and effective management and use in Datacentres will increase in importance.
    • Can a Datacentre store power at off-peak times and, like other very large power consumers, go off-grid, or at least reduce power-use in these days of the "smart grid"?
    • Could those "standby" generators be economically used during times of excess power demand? It creates a way to turn a dead investment into an income producing asset.
    A useful result of treating a Warehouse Datacentre as a true warehouse is that, modulo the additional power and cooling, they are conventional warehouses. This allows either existing buildings to be bought and used with minimal conversion, or old/surplus Datacentre buildings to be sold and used "as-is".

      Datacentre Cooling

      Whilst "Free Cooling" (using environmental air directly, without any cooling plant) is a very useful and under-utilised Datacentre technique applicable to all scales of installation, it's not the only technique available.

      One of the HVAC technologies that's common in big shopping malls, but never mentioned in relation to Datacentres is "Thermal Energy Storage", or Off-peak Ice production and Storage.

      Overnight, when power is cheap and conditions favour it, you make ice.
      During your peak demand period (afternoon), your A/C plant mainly uses the stored ice.
      This has the dual benefit of using cheap power and of significantly reducing the size of the cooling plant required.
      In hot climates requiring very large ice storage, they don't even have to insulate the tanks. The ratio of volume to surface area for large tanks means the losses for daily use are small.
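
      Some rough, hedged arithmetic on the ice volumes involved (the cooling load and peak period are assumed figures):

        LATENT_HEAT_OF_FUSION_KJ_PER_KG = 334      # melting ice absorbs ~334kJ per kg
        cooling_load_kw = 1000                     # assumed 1MW of heat to remove
        peak_hours = 6                             # assumed afternoon peak period
        heat_kj = cooling_load_kw * peak_hours * 3600
        ice_tonnes = heat_kj / LATENT_HEAT_OF_FUSION_KJ_PER_KG / 1000
        print(round(ice_tonnes))                   # ~65 tonnes of ice for the afternoon peak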

      "Hot Aisle" and related techniques are well known and in common use. Their use is assumed here.



      Datacentre Power Provisioning

      Reticulating 110-240V AC to each server requires multiple levels of inefficient redundancy:
      • UPS's are run in N+1 configuration, to allow a single failure. Additional units are installed, but off-line, to cover failures and increased demand.
      • Ditto for backup Generators.
      • Every critical DP-element (not all servers are 'critical', but all routers and Storage Arrays are), requires redundant power-supplies to cater for either PSU or supply failure. You need to draw a line once between what's protected and what isn't, a major gamble or risk. Retooling PSU's in installed servers isn't economic, and mostly impossible.
      That's three levels of N+1 redundancy, plus UPS's and generators that must be regularly live-load tested, which in itself is hugely problematic.

      Plus you've got a complex AC power distribution system, with at least dual-feeds to each rack, to run, test and maintain. Especially interesting if you need to upgrade capacity. I've never seen a large installation come back without significant problems after major AC power maintenance.

      The Telecommunications standard is dual 48V DC supplies to each rack and a number of battery rooms sized for extended outages, with backup generators to charge the batteries.
      Low voltage DC has problems, significantly the power-loss (voltage drop) in long bus-bars. Minimising power-losses without incurring huge copper conductor costs is an optimisation challenge.

      Lead-acid batteries can deliver twice the power for a little under half the time (they 'derate' at higher loads), so maintenance activities are simple affairs with little installed excess capacity (implied over-capitalisation waste) because a single battery bank can easily supply multiples of its charging current.
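
      A hedged sketch of why "twice the power for a little under half the time" holds, using Peukert's law with an assumed exponent of about 1.2 for lead-acid cells:

        k = 1.2                          # assumed Peukert exponent for lead-acid
        runtime_ratio = 2 ** -k          # runtime at double the discharge current
        print(round(runtime_ratio, 2))   # ~0.44, i.e. a little under half the time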

      There are three wins from delivering low-voltage DC to servers:
      • Power feeds can be joined with simple diodes and used to feed redundant PSU's. Cheap, simple and very reliable. Operational cost is a 0.7V or 1.4V supply-drop through the diode, so higher voltages are more efficient (see the arithmetic after this list).
      • Internal fans are not needed, less heat is dissipated (according to web sources). I'm unsure of the efficiency of DC-DC converters at well below optimal load, if you over-specify PSU's.
        This suggests sharing redundant PSU's between multiple servers and matching loads to PSUs.
      • the inherent "headroom" of batteries means high start-up currents, or short-term "excessive" power demand are easily covered.
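
      The arithmetic on that diode drop, with assumed bus voltages, shows why higher DC voltages help:

        def diode_loss_fraction(bus_voltage_v, diode_drop_v=0.7):
            """Fraction of delivered power lost across the OR-ing diode."""
            return diode_drop_v / bus_voltage_v

        print(round(diode_loss_fraction(12), 3))   # ~0.058 -> ~5.8% lost on a 12V bus
        print(round(diode_loss_fraction(48), 3))   # ~0.015 -> ~1.5% lost on a 48V bus
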
      The biggest win for a DC Datacentre supply is the integrated power control we see in every laptop:
      the battery, PSU and HVAC systems can dynamically interact with the DP-elements to either minimise power use (and implied heat production) or maximise power to productive DP-elements.

      The 3 dimensional optimisation problem of Input Power, Load-based Resource Control and Cooling Capacity can be properly solved. That's got to be worth a lot...


      The "standard" 1RU Server

      The problems I've got with typical 1RU servers (and by extension, with the usual 'blades') are:
      • There are 6 different types of heat load/producers in a server. Providing a single airflow for cooling means over-servicing lower-power devices and critical placement modelling/design:
        • CPU, ~100W per unit. Actively cooled.
        • RAM, 1-10W per unit. Some need heatsinks
        • Hard Disks, 2-20W. Some active cooling required.
        • motherboard 'glue' chips, low-power, passive cooling, no heat sinks.
        • GPU's, network and PCI cards. Some active cooling required.
        • PSU. Own active cooling, trickier with isolation cage.
      • 1.75", the Rack Unit, is either too tall or too short for everything.
        It's not close to optimal packing density.
      • Neither is the "standard height rack" of 42RU optimised for large, single-floor Warehouse buildings. It's also too tall for standard shipping containers with raised flooring.
      • The fans and heat-sinks needed to cool high-performance CPU's within the 1.75" available are tricky to design and difficult to provide redundantly. Blowers are less efficient at the higher air speed caused by the restricted form-factor. Maintenance and wear is increased as well.
      • 1Gbps ethernet has been around for a decade or more and is far from the fastest commodity networking option available.  SAS cards are cheap and available, using standard SATA chips running at 3 or 6Gbps down each of 4 'lanes' with the common SFF 8470 external connector. At $50 extra for 12-24Gbps, why aren't these in common use? At least until 10Gbps copper Ethernet becomes affordable for rack-scale switching.
      • HDDs, (disks) are a very bad fit for the 1RU form factor, both for space-efficiency and cooling.
        3.5" drives are 1" thick. They need to lie flat, unstacked, but then take considerable internal real-estate, especially if more drive bays are allocated than used.
        2.5" drives still have to lie flat, but at 9.5-12.5mm thick, 2-3 can be stacked.
        3.5" drives running at 10-15,000 RPM consume ~20W. They need good airflow or will fail.
        Low-power 2.5" drives (2-5W) resolve some of those issues, but need to be away from the hot CPU exhaust.
      One technology solution that starts to address the "blade cooling problem" is the sealed immersed-board liquid-coolant system from the UK startup, "Iceotope", also in this referring article.

      Sealing a motherboard puts all the liquid-tolerant components in the same environment, providing you've chosen your "liquid" well.
      The IBM "ZED" approach above is directed at just the major heat load, the CPU's and uses plain water.
      IBM is taking the stance that simple coolant loops, air-cooling of low-power devices and simple field-maintenance are important factors.

      Which approach is better for Warehouse-scale computing? Real-life production data is needed.
      As both are new, we have to wait 3-5 years.

      The obvious component separations are:
      • PSU's, separated from the motherboard and shared between multiple motherboards. PSU capacity can be very closely matched to their optimal load and cooling capacity and all DP-elements can benefit from full-path redundancy.
      • Disk drives can be housed in space- and cooling-optimised drawers.
      There are already commercial solutions that pack 42 * 3.5" drives into 4RU drawers, either SATA or SAS. It's cheap and simple to provide a bulk connector, then run cabling to adjacent motherboards.

      This shows my antipathy to the usual commercial architecture of large, shared storage-arrays with expensive and fragile Fibre Channel SAN's.  They are expensive to buy and run, suffer multiple reliability and performance limitations, can't be optimised for any one service and can't be selectively powered down.
      Google chooses not to use them, a very big hint for dynamic, scalable loads.


      What could Google Do?

      Very specifically, I'm suggesting Warehouse-scale Datacentres like Google's with dynamic, scalable loads could achieve significant PUE and η gains (especially in a Greenfield site) by:
      • purpose designed tall racking, possibly robot-only aisles/access
      • DC power distribution
      • shared in-rack redundant DC-DC PSU's
      • liquid-cooled motherboards
      • packed-drawer disks, and
      • maybe non-ethernet interconnection fabric.
      I suspect the total power demand for a notionally 20MW site might be reduced 25-50%.

    Some other relevant posts can be found on 'The Hot Aisle' blog by Steve O'Donnell.

        2010/07/30

        What can you learn from a self-proclaimed "World's Greatest"?

        Note: This document is copyright© Steve Jenkin 1998-2010. It may not be
        reproduced, modified or distributed in any way without the explicit
        permission of the author. [Which you can expect to be given.]

        Lessons from the Worlds' Greatest Sys Admin - July 1998
        Presented at SAGE-AU Conference, July 1998
        Contents
        Introduction
        Background
        Principles of System Admin
        Some WGSA Attributes
        About The WGSA
        Sayings of the WGSA.
        Some Sound Management Laws
        So What?
        How do you work with a "World's Greatest ..."
        Some "Good Stuff" I learnt from friends.
        Some of the WGSA's work
        Summary


        Introduction


        Why


        (2005) The most frequent comment I received on the 'WGSA' talk was:
        "So you think you are the World's Best Sys Admin?"

        Answer: No, I am not the "WGSA".


        This paper is about someone, who is really an amalgam of a number of people, who regarded themselves as "The World's Best Sys Admin". They never verbalised this opinion - they just lived it.

        Unfortunately, as is the case with all self-appointed 'gurus' I've met, they had limited raw talent and an arrogance that prevented them from admitting less-than-perfect performance, taking on-board any useful criticism or correction, or learning new tools, techniques, processes, and organisations from others they didn't consider an authority.


        I apologise in advance to the reader that the paper is mostly about "negative" learning,
        or What Not To Do...

        I included a section on Good Things I've Learned to show that I wasn't totally preoccupied with the negative :-) But there were just too many good stories, and I really had to let off steam over this...

        Do you think you know the identity of "The WGSA"?

        You don't. For those that may even think it's them - No, it is not you.

        The observations and opinions here came over a considerable period of time. It's not a single person - and it's not just Administrators. I've met "WG Programmer", Architect, Designer, Tester, Integrator, Networker, Technical Manager and CIO.

        So - onto the main game - the paper as presented to SAGE-AU in Old Parliament House.
        If you were there - did you catch any of the lollies I threw, or even a chocolate egg?

        Feedback is something I'm interested in.

        Drop me a line if you have your own stories, can add more useful models, or if you're late to this and found it useful. Suggestions for improvement gratefully accepted and acknowledged.


        Why?

        To codify and inform.

        Once a problem is recognised and named, you can start to understand and address it.

        Audience

        Junior Sys Admins
        - If you work for one.
        Senior Sys Admins
        - If you work with one.
        Managers
        - If you have one working for you.

        Format

        • Talk
        • War stories
        • Feedback
        • And lots of Opinion.

        The BIG Questions

        • So What?
        • How do you work with one?

        Background

        I spent a year in 96/97 contracting in Sydney for what should've been a large, prosperous Australian multinational. They hadn't paid a dividend since 1990 and were taken over by a Dutch company at the end of 1996.

        The I.T. group was appalling.
        Staff turnover in the Unix Support and Networking areas was high - close to 100% in 12-18 months! The company spent under 1% of turnover on I.T. - against the industry average of 5+%.

        It felt like we were doing the impossible - and we were.

        They'd outsourced their mainframe, embraced 'open systems', installed a large scale WAN, gone client-server, were developing GUI and O-O applications and had an Internet presence.

        They'd also radically downsized in two steps:- from 200+ to 30-40 staff in 18 months.

        They were fully Buzzword Compliant, but were going nowhere.

        I was privileged to meet two people - both ex-telecoms technicians who had moved into computing. One, the self-proclaimed WGSA, had been responsible for setting up the Unix environment, and its associated X.25 network, and been the Unix support manager for a couple of years, until finally taking a job in 'Technology Planning' - but just doing more of the same.

        The other ex-tech I've remained good friends with. He moved into Networking after a career in Civil Aviation, then a TAFE. He had enough PC, Unix, and Internet knowledge 'to be dangerous'. He'd left behind at the TAFE an environment where just 2 of them supported and ran the
        whole state TAFE network, +1 for Unix, 1 for printers and passwords, and 2 on the HelpDesk. When the lot was outsourced and a crack systems company took over - they boast they can cut 10%-20% from any operation - they ended up having to spend more.

        The contrast was stark and savage - one had left behind a legacy of chaos and disorder, the other was undoing the damage and providing real business productivity.

        This talk is about that experience and what I've learned.


        Principles of System Admin

        These are my values and principles. Your mileage may vary.
        • Know why you're there - To satisfy others' business
          needs.
        • Know what you Know, Know what you Don't Know,   and don't be afraid to get assistance.
        • Obey Sound Management Laws.
        • Learn, Develop, and Stay Current.
          We learn through Invention, Discovery, and Failure
          "That which isn't growing, is dying"
        • Give Value for Money.
          • Actively seek ways to put yourself out of work.
          • Minimise recurrent costs - wages, maintenance/support/rental charges
          • Maximise Reusability, Flexibility, Functionality, Reliability/Robustness.
        • Provide what's needed, not apparently wanted.
        • Listen and communicate with your users.
          • Provide Solutions.
          • Focus on Outcomes.

        Some WGSA Attributes

        • They don't exist outside fertile ground. They have to be allowed
          and encouraged by management and peers.
        • Only Dysfunctional people thrive and rise in dysfunctional
          environments.
          Good people leave broken places - possibly after fighting for a time.
          The only other alternative is to withdraw and retreat into minimal
          performance.
        • People are the ONLY asset of I.T. Organisations


          • Hardware: $1M to zero in 3 years
          • Software: $100k to zero in 3 minutes
          • Network: $250/point to zero in 3 seconds
        • Indicators:
          • High staff turnover
          • High contractor ratio
          • N.I.H. - Resistance to Change
          • Lack of "Professional" work habits - Defined Processes,
            Designated Responsibilities, Delegated Authority
          • Lack of History, Documentation, Policy, Procedures, Config Mgt,
            Version Control, Handovers, Induction
          • Chaos and Frenzy. Apparently understaffed and overworked - definitely unorganised
            "No time to fix problems, too busy fixing faults."
          • Nobody tasked with automating jobs or passing work back to level 1 support.
          • Every install project goes into crash mode.
            No standard, fast, system builds.
          • Maintenance frenzy - never seems to get any better.
          • Lack of fault analysis, reviews, Post Mortems, Post Implementation Reviews, capacity planning.
          • Single source of innovation and improvement - The "Guru"
          • The "BIG BANG FIX" is coming [Or the"Silver Bullet"]
            Nothing can be done, because "someone" [WGSA or friend] is creating "the Solution to All Our Problems".
          • (Senior) Management "Swooping" is allowed and tolerated.
          • " Don't show me problems, Show me solutions"
          • Mentoring and skills transfer absent
          • Constant reactive, not proactive, administration.
          • No organisation accountability - fail to do any task - routine or project - with impunity.
          • Blaming and Recrimination normal. No attempt to perform 'root cause analysis' and rectify faults.
          • No recognition or rewards for work well done.
          • Few Diagnostic, debugging, or troubleshooting Tools - even for common failure modes.
          • No Communication - up or down.
          • No Performance Indicators or Measurement/Assessment


        About The WGSA

        Of course he was the best. He had read every single 'white paper' from the vendor, and with his photographic memory, could recite it all back. All he needed to know was in those papers, and the manuals he'd read.

        He didn't need to meet and talk with his peers, he had none anyway! He had no need of professional organisations or finding out what had worked, or not, for other people.

        If he didn't have time to do something himself, he would get in a contractor, create a project, or hire a consultant. Funnily, these people were always only of very modest ability. The projects mostly ran
        out of money in "phase 1", when only the basic work was being done and well before the real benefits were to accrue.

        He'd written 25,000 lines of shell script to provide a "common" menuing and execution environment. It was a most flexible, adaptable, and configurable environment - and surprisingly similar to that run by
        his previous employer. Just the thing to control 12 machines... It was a real engineering triumph - for 1982! He'd built and deployed all this with no version control, configuration management, or documented release and maintenance procedures - and certainly no review.

        His crowning glory, "Xferutility", 7,000 lines in a single script, heavily utilised 'comes from' control files [they just appeared in places, with no trace of whence they came], and could use 'rcp', 'ftp',
        and e-mail to achieve the functionality of uucp. Plus, it was the transfer mechanism, the interactive menu, the scheduler, and the status reporter. All things to all people bar those left to maintain it.

        Having not apparently done "Programming 1A", he'd not been introduced to the concepts of "coupling and cohesion" - put together everything that belongs together, separate unrelated concerns -
        and least necessary complexity.

        To go from the login prompt to the first displayed menu, over a dozen files or scripts were executed - often in perverse order. The system drive defaults would overwrite the local definitions!

        He also seemed unaware of basic capacity planning issues - like tracking the number of systems in the machine room and providing adequate rack space and cabling. Backups were another story entirely.

        The I.T. department policy was to have separate small systems for every division, no two the same. In 12 months it went from 12 systems in the machine room, to 23. And then to 35+ in the next 6 months.

        Having labelled me "a cannonball contractor who won't be around in the long term", he resigned the week he penned it, took an overseas holiday [run in the same flexible fashion], and rejoined his previous employer, through a services company, performing Network Management.



        Sayings of the WGSA.

        A few of these are paraphrases.

        What I find myself saying often is :-
        Why would you want it any other way?, and
        (2005)Would you expect any less?.

        The answers to these questions are usually: Yes, any other way, and "NO!".

        Sayings and tactics of WGSA and friends:
        • A basic tactic: Plan, Plan, Plan - and produce massive documents everyone else has to review.
          Nothing will actually get done.
        • Another basic tactic: Reality is at Fault, Adjust your Perceptions.
        • "It's worked that way for 3 years - it couldn't be broken now." [A basic tactic. You obviously have got the nature of the fault wrong. Ignorance and Rigidity are a powerful combination.]
        • Another basic tactic: Concentrate on the trivial, the Big Issues will fix themselves.
        • "You don't understand the full range of issues or complexities." [I know, you don't.]
        • "It works/worked fine for me..." [Hasn't told you or Reality is at fault.]
        • "Read the documentation I wrote." [But hasn't told you about.]
        • "You have to fully document that." [An attempt to divert, stall, or put you off.]
        • "The client doesn't want that." [Were they ever asked? Were they ever given options?]
        • "They [the clients] never asked." [Deflection. Clients are expected to be technical experts.]
        • "It's UnAuthorised." [But where is the Policy on that?]
        • "It's not Standard" [It's free. We have to pay heaps or the other boys will think we're not cool.]
        • "We can't afford that." [May be true, but unlikely based on the money chucked around on junkets/trinkets for the favoured few.]
        • "It's freeware. It's not supported." [Often said without a hint of irony in response to 'costs too much'.]
        • "If you can Cost Justify that..." [A stalling tactic. Nothing you put up will ever get approved.]
        • "You Just ..." [Makes you out to be a fool/incompetent, even though there is no way you could've known.]
        • "Why haven't you ... <;said angrily>" [So how would you know to do that, when you haven't been told about it?]
        • "It's really flexible/efficient/configurable/Easy when you use it... " [Defending a wildly over-complicated script]
        • "We need it because ...or We have to do it that way." [Of course there is nothing written to back it up. The WGSA wrote it, so it's going to stay.]
        • "We won't discuss that [now]." [No argument if there is no discussion.]
        • "That's not the way we do it around here." [No change is possible. Of course, nothing is written down and there is no Policy to back that up.]
        • "You can't do/say that." [Controlling.]
        • "What is the Vendor's policy on replacing that?" [Deflect and control. Of course the vendor doesn't have a written policy on when something is broken.]
        • "The Vendor's White Paper/Documentation says ..." [Appeal to another Authority. Stifle argument. Don't let facts or prior experience get in the way.]
        • "The Consultant's Report says ..." [Appeal to another Authority.]
        Remember, there are rules for him and another set for you.
        He will ignore e-mail, talk about you behind your back, set impossible deadlines [for you], and not keep his promises. Don't expect to be told about important stuff that impacts you, or that you happen to be expert in. You won't get invited to meetings, see reports, or be involved in
        the 'discussions' held before major decisions are announced.

        Rumour, disinformation, and 'Need to Know' are powerful tools for the WGSA.

        He will casually drop bombshells, regularly spring 'surprises' on you, and practices 'Divide and Conquer' extremely well. He allocates work, but will never help or clarify what he wants. And of course, won't follow up on it. He may fly into a 'justifiable rage' if he comes back in a month and something hasn't been done to his satisfaction... It's not easy being so perfect and all-knowing all the time.

        Rational argument won't work with the WGSA. What matters is that he thought it up, he's important, and the bosses [his mates], think he is an absolute Guru on everything.

        And if you ever get close to criticising him or winning an argument - slander and libel work just fine for him.


        Some Sound Management Laws

        (2005) Note: I don't try to come up with any principles or Laws that the "WGSA" follows.
        There is probably only one: "Seize Every Opportunity". Which isn't a bad dictum, if it respects other people, fulfills your business's needs and goals and isn't only about advancing your personal agenda.


        My version of "Sound Management Laws" is presented for you to consider and understand where I am "Coming From":
        • Delegate Authority with Responsibility and Accountability.
        • Follow up, Follow through, Be Consistent.
        • Value and Empower your staff: People are your only asset.
        • Do It NOW!
        • Follow The Quality Circle: Plan, Act, Evaluate.
        • Encourage and Reward Professional Behaviour, deal quickly with repeats of poor behaviour.
        • Lead by Example.
        • Forge, maintain, and support Teams.
        In I.T. there are special management considerations:
        • Users come first
          • Satisfy Business Needs
          • Actively sell your successes and services to your users.
          • Constantly set and manage users' expectations.
          • Inform, advise, consult
          • Be Honest and forthright - especially about your mistakes and failures.
            Take care to explain Why it won't happen again.
          • Be Proactive. You get to drive the technology, they drive the business operations.
        • Know Yourself, Your Staff, Your Tools.
        • Never take on a job you cannot do.
        • Don't give others jobs they can't do.
        • Risk Management, Reviews, and 'Performance Audits' are your chief tools in establishing a Learning organisation.
        Good working relationships between management and staff take time and effort to develop. They proceed through the following stages and are fragile. The whole lot, years of work, can be destroyed in an instant with a lie.

        What management want are people they can trust, who work very hard and consistently produce quality work. People who hold the company's best interests at heart.

        Development Stages of People and Teams:
        • Honesty, Integrity, Openness, Frankness, Consistency
        • TRUST
        • RESPECT
        • LOYALTY
        • COMMITMENT, CARING


        So What?

        Since the advent of the 486 in ~91, cheap LAN's in ~94, and the Net in ~96, I.T. systems and infrastructure have become essential and critical for all business operations. Systems Administration, Networking, Help Desk, and Database Admin are the glue that holds it all together from
        day to day.

        There is a myth that software doesn't wear out like machinery.
        The bits don't change, so it must be OK! By implication, you don't need to "maintain" systems and software, like you do machines.

        So why aren't we all running 286's and DOS 3.3?

        It's called 'bit rot'. The software doesn't change, but the environment does - which gives the same net effect. Year 2000 isn't a problem until your clock says 01/01/00.

        My argument is that, leaving aside management and leadership issues, company profitability is related directly to staff efficiency [$ cost / $ sales] and new product evolution. These are driven directly by I.T.
        capability, which requires systems be constantly upgraded and enhanced - just to stay where you are! Similarly, I.T. operations staff must be continually increasing their own efficiency just to keep up.

        (2005) See the 2003 Harvard Business Review article
        "I.T. Doesn't Matter" by Nicholas G. Carr.

        Effective Systems Administration is the single greatest point of leverage in the I.T. infrastructure - which is itself the single greatest point of leverage in an organisation. It amplifies and extends the
        thinking, analysis, and decision making ability of the people in the organisation. Even sometimes the managers. It can even provide some corporate memory - a prerequisite for Knowledge.

        It's obvious the software in airplanes, spaceships, nuclear reactors, medical instruments, weapons systems, banks, and ATM's has to be correct, robust, and dependable or there are disastrous, often
        immediate, consequences. People die or billions go missing. [Roll on NT - reactor control!]

        What's not obvious is the long, lingering decline and demise of businesses - large and small.

        The cost to Australia of losing a multi-billion dollar multinational company is incalculable. Well managed and well lead, it could still be a potent force on the global stage. Instead we have lost profits,
        destroyed assets, and put a few thousand people out of work.

        (2005) On May 28 2001, Australia's fourth largest telco, One.Tel, ceased trading on the ASX.
        The Packer and Murdoch families, who control the media conglomerates PBL and News Corporation, lost about A$1Billion in the debacle. A major factor in the failure was uncollected "receivables". The computer billing system was faulty.

        One.Tel closely followed the failure of HIH Insurance and Impulse Airlines.

        That's a disaster 10 times bigger than TWA-800 going down outside New York just after take-off in 96, and they are still fishing out pieces. Just because it is in glorious slow motion - taking a decade, not a minute to unfold - doesn't mean we shouldn't still be as concerned with businesses going down as with aircraft crashes. People's lives are destroyed and assets lost just as thoroughly in both types of crashes.

        The government and professional bodies should be just as concerned with these outcomes and ensuring they can never happen again.



        How do you work with a "World's Greatest ..."

        I don't have an answer.

        My style has been described as "Straight Up the Middle, with lots of smoke and noise."

        My only response is to recognise an intractable situation early and leave as quickly as you can. A luxury I can afford, having no dependants and a low level of debt.

        I need an answer and would like your feedback.



        Some "Good Stuff" I learnt from friends.

        • Know what's important. Focus on that, ignore the trivia.
        • Practice - Order, Discipline, Rigour.
        • The job isn't done until your records are up to date.
        • Professionals do for $100k what anyone can do for $1M.
        • Remember Good Ways to do things when you see them.
        • ASK other people - what works? What doesn't?
        • STANDARDISE. Make it so there is just one way things happen.
        • Be prepared to work odd hours to not impact your users.
        • Hit your deadlines.
        • Clean up as you go.
        • The details are important.
        • DON'T accept a job you can't do.
        • Be Proactive, not Reactive.
        • Practice 'Root Cause Analysis' - fix faults and processes, not just
          symptoms.
        • You have to stay on the leading edge. This takes lots of time and experimentation.
        • There is NO substitute for ability, experience, and general knowledge.
        • Aim for 100% reliability. Know what you have to do to achieve it.
        • If you make Rules, apply them without exception.
          You may get called The Network Nazi, but it will all work and you will be respected.
        • Be personally flexible when dealing with users. Meet their needs, not just their expressed desire. This may involve some education.
        • Let users know what's happening.
        • Protect your staff from the vicissitudes of Management.
        • Freeware is FINE. If it meets the need, use it.
        • Know and Explore your tools.


        Some of the WGSA's work

        Here is a [longish] list of some of the wonderful technical and process problems I came across. Remember this was a largish, not huge, enterprise. There were only 75 Unix hosts, a thousand or so users [total], and a network that went to less than 100 sites.

        Many of the systems were front-ends to the mainframe or a production system for the business.
        The Unix support team was mostly 3 people, sometimes with a manager, sometimes with people doing performance analysis/reporting, or 'implementations' - such as HP Openview [I.T. Operations].

        • Common Environment: 25,000 lines of Shell Script. A good technology for 1982, not 1997.
          Very poorly written. Basic programming rules of 'Coupling and Cohesion' violated.
        • All actions implemented as shell functions, but merged with interactive menu system. Extremely heavy reliance on Environment variables, with perverse re-mapping of names.
        • 'Standard Operating Environment'. More shell script! No concept of standard builds, current patch levels, consistent program versions, or automatic software updates. 12 or more months of wasted effort. [Sold to the management initially on the great results from HP's internal network.
          With 100,000 PC's and 23,000 Unix hosts spread over 660 sites, they saved US$200M/year in support costs alone by adopting a 'Common Operating Environment'. That was based on keeping all systems up to the same versions of software and config files.]
        • Xferutility: 7,000 lines of shell script, doing a subset of uucp's functionality.
          Insidious bugs like:-
          • Using the (local) return code of 'rsh' and thinking it was all working.
          • Using rcp and not checking for a previous aborted transfer.
            [Destination file ends up with zero modes. Not writable by owner. Copy aborts, but script keeps chugging along.]
        • HP-UX 10 'bug'. #!/bin/ksh missing. Default '/sbin/sh' used with surprising results - 'exit' doesn't work.
        • /usr/local/bin banned. All executables and tools to reside in admin's home directories.
        • Common Admin logons banned. But 'essential utilities', like Xferutility, used a common account with .rhosts trusted all over the place, and even privileged access possible with sudo.
        • Common User Home directories basic to functioning of 'Common Environment' scripts. Ran ~/.profile to start menu, which [eventually] ran ~/$LOGNAME.profile.
        • NO master passwd file. No unique UID's, but notionally unique LOGNAME's.
        • NO mechanism to add or remove users from multiple machines.
        • NO shadow password files.
        • NO password aging.
        • NO retiring of unused accounts. No checking for intrusions.
        • Default password of LOGNAME. Never checked and never reset.
        • Help Desk's 'Password reset' function broken on most machines. No corrective action taken.
        • Crack broke 80% of the passwords on the central admin hub. [Including that of WGSA]. Nothing was done.
        • WGSA login setup on all systems, with .rhosts back to the admin hub, and 'sudo' access to 'mv' and 'cp'. WGSA had two passwords, family member names + digit. These were well publicised to all admins, and others.
        • NO definitive list of managed hosts.
        • DNS control files rebuilt every time from a 'hosts' file with 'host_to_named'.
        • NO alternate DNS primary.
          A single central machine contained all the network services - DNS, e-mail, dial-in access, administration, master copies of scripts and system config files, root
          passwords for all machines. This 'admin hub' was trusted, and could access all other systems. There was no fail-over system or contingency plan for massive failure.
        • Crippled DNS secondaries.
          This was for 'security'. There was NO IP access control in the network. A user with only a little knowledge could navigate the entire network. There was an IP path back to the central DNS, and the IP numbers were allocated in an orderly fashion.
        • Internal domain left at: XXXXX.com. Even where a firewall was installed with the domain of XXXXX.com.au!
        • Even with over 2000 device entries in the DNS, and a strong numbering plan initiated by Networks, running sub-domains was firmly and frequently rejected.
        • Win-NT and DHCP posed no problem for the DNS. Permanent number leases were granted.
        • 10 or more IP address ranges in use. Including a Class-B [which the company owned], and other cute addresses like 150.150.x.x [Wells Fargo's!]
        • IP over X.25 was chosen in 1994.
          Routers were 'too expensive'. By March 1996 there were massive network failures - morning and afternoon - due to overload of the $250k X.25 switches.
          Expensive terminal servers were deployed widely, 'because they handle IP over X.25'. Most production support problems related to config mgt, Network, printers, or terminal servers.
        • Untested backup tapes. In spite of a failure resulting in almost total loss of backup tapes for a system, no testing of readability of backups was performed.
        • Configuration Management consisted of copies of scripts in the WGSA's home directory.
          NO mechanism for rolling out fixes to faults as found.
        • Version Control consisted of block comments at the start of the scripts.
        • Common Code duplicated across 'menus'.
        • Hard coded 'user types' in Common Environment scripts.
        • File names not distinguished by hostname. All called 'AdminMenu' for the 'Admin' user.
        • nonStandard capitalisation of file Names and environment variables.
        • Very early version of 'sudo' used and modified. Non-standard config files. No repository of config files. No version control. [And WGSA didn't believe me when I found a long standing bug in his code.]
        • No reviews of code, scripts, systems.
        • Little testing of new code. Try it live!
        • No documented procedures for standard tasks.
        • No records of faults fixed.
        • No regular analysis or reporting of production faults.
        • No running sheets on production faults.
        • No weekly section meeting. No dissemination of information, plans.
        • No standard machine builds. [Complex and long procedures to build the production systems - with many variants.]
        • No capability to track or report critical file changes on production systems.
        • Network Naming Standard defined [but not for Printers and print queues]:
          ux div 2 loc nr : 11 chars. Accepted by hostname, not by uname

          • ux = Unix,
          • div = Division 3 letter code,
          • 2 = 1st digit of state postcode,
          • loc = 3 letter code for town/suburb, arbitrarily assigned,
          • nr = 2 digit machine number
          WGSA Response: Set hostname to the long name, and uname to the old short
          name!
          [So what's the standard??]
        • X.400 was chosen as the 'Standard external E-mail system'.
        • HP Openview [@$100k?] was chosen as the corporate mail system - 'because it could make an address a program'.
        • External E-mail addresses were:- Firstname_Lastname@XXXXX.com.au
          It took a long and bloody fight to get a script into production that used the Net standard of 'First.Last@XXXXX.com.au', plus generate all the usual abbreviations, and allow specific people to be included/excluded. This of course was removed a week or so after I left... [Only for them to hurriedly fall back to a manual list once they found a mail-loop problem.]
        • There were over 10 printing mechanisms, no map of network printers, and no naming standard for printers. [There was a printer called 'printer', and more than one called 'laser'.] Of course, nothing was documented on how it all worked, what got changed, or subtle faults found.
        • No disaster recovery or contingency plans existed. Hardware in the old AIX boxes occasionally died and caused not inconsiderable panic to the new admins.
        • The machine room had no sensible layout - even though it was newly installed in 96. There was a single ethernet for all the production, development, accounting, and maintenance systems.
        • Disk Layouts were recorded nowhere.
        • There was no consistency or standard way to lay out Logical Volumes on disks.
        • The Journalling Filesystem [Veritas], was supposedly 'banned' from all HP-UX 10 systems. The defrag and on-the-fly extend utilities were an extra [pay for] package, so the 'free' part couldn't be used.
        The watchword for the I.T. branch was 'CHEAP'.
        [Do you think that was in any way related to the company dying?]



        Summary

        There are some people out there that don't just think, but know, they are the best.

        They are dangerous.

        Left unchecked they will not only make life a misery for everyone around, they help bring companies,
        even very large ones, down.

        What singles them out is their inability to take input from others.

        Typical behaviours are:
        • Rigidity. Nothing can be changed.
        • Control. They have to say how everything is done.
        • Fixation. Things have to be done their way or not at all.
        • Discipline, rigour, defined processes. Usually absent. Always perverted.
        • Favoured Few. There is always an inner sanctum who control everything.
        If they are well settled and well regarded, the organisation is dysfunctional. Staying will be soul-destroying.

        The only defence I know against them, once entrenched, is to leave.

        And thank you all for your patience. I hope you have taken something away from all this...

        Questions and Comments, please.


        Page Last Updated:
        Fri 30 Jul 2010 09:19:47 EST (to blogger)
        Wed Feb 1 19:17:47 EST 2006
        02-Jul-98  (first version)

        2010/05/09

        Microsoft Troubles - IX, the story unfolds with Apple closing in on Microsoft's size.

        Three pieces in the trade press showing how things are unfolding.

        Om Malik points out that Intel's and Microsoft's fortunes are closely intertwined.
        Jean-Louis Gassée suggests that "Personal Computing" (on those pesky Personal Computers) is downsizing and changing.
        Joe Wilcox analyses Microsoft's latest results and contrasts them a little with Apple's.

        2010/05/03

        Everything Old is New Again: Cray's CPU design

        I found myself writing, during a commentary on the evolution of SSD's in servers, that large, slow memory of the kind Seymour Cray used (not cache) would affect the design of Operating Systems. The new scheduling paradigm:
        Allocate a thread to a core, let it run until it finishes and waits for (network) input, or it needs to read/write to the network.
        This leads into how Seymour Cray dealt with Multi-Processing: he used multi-level CPU's:
        • Application Processors (AP's): many bits, many complex features like Floating Point and other fancy stuff, but no kernel-mode features or access to protected regions of hardware or memory, and
        • Peripheral Processors (PP's): really a single very simple, very high-speed processor, multiplexed to look like 10 small, slower processors, which performed all kernel functions and controlled the operation of the AP's.
        Not only did this organisation result in very fast systems (Cray's designs were the fastest in the world for around 2 decades), but also very robust and secure ones: the NSA and other TLA's used them extensively.
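
        As a rough illustration of the PP idea (a toy model, not the real hardware), here is a sketch of one fast engine multiplexed into 10 logical Peripheral Processors, each taking one small kernel step per turn of the "barrel" while the Application Processors only queue requests:

          # A toy model of the "barrel" Peripheral Processor idea: one fast
          # physical engine is multiplexed into 10 logical PP's, each holding
          # its own state and doing one small step per turn of the barrel.
          # Names and numbers are illustrative, not the real hardware.

          from collections import deque

          NUM_LOGICAL_PPS = 10

          class LogicalPP:
              def __init__(self, pp_id):
                  self.pp_id = pp_id
                  self.work = deque()          # pending kernel requests from AP's

              def step(self):
                  """Do one small kernel action, then give the barrel back."""
                  if self.work:
                      print(f"PP{self.pp_id}: handling {self.work.popleft()}")

          def run_barrel(pps, turns):
              """The single physical PP visits each logical PP once per turn."""
              for _ in range(turns):
                  for pp in pps:
                      pp.step()

          pps = [LogicalPP(i) for i in range(NUM_LOGICAL_PPS)]
          # Application Processors never enter kernel mode; they only queue requests.
          pps[0].work.append("start I/O for AP3")
          pps[1].work.append("deliver network packet to AP7")
          run_barrel(pps, turns=3)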

        The common received wisdom is that interrupt-handling is the definitive way to interface unpredictable hardware events with the O/S and the rest of the system, and that polling devices, the old way, is inefficient and expensive.

        Creating a fixed overhead scheme is more expensive in compute cycles than an on-demand, or queuing, system, until the utilisation rate is very high. Then the cost of all the flexibility (or Variety in W. Ross Ashby's Cybernetics term) comes home to roost.

        Piers Lauder of Sydney University and Bell Labs improved total system throughput of a VAX-11/780 running Unix V8 under continuous full (student/teaching) load by 30% by changing the serial-line device driver from 'interrupt handling' to polling.

        All those expensive context-switches went away, to be replaced by a predictable, fixed overhead.
        Yes, when the system was idle or low-load, it spent a little more time polling, but only marginally more.
        And if the system isn't flat-out, what's the meaning of an efficiency metric?
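
        A back-of-the-envelope comparison makes the point. The per-event costs below are assumed, illustrative numbers (not measurements from that VAX): interrupt overhead scales with every character that arrives, while polling overhead is fixed per tick.

          # Toy cost model: per-character interrupts versus fixed-rate polling.
          # All costs are assumed, illustrative numbers, not measurements.

          INTERRUPT_COST_US = 50     # context switch in/out per interrupt (assumed)
          POLL_COST_US      = 5      # one cheap scan of the serial buffers (assumed)
          POLLS_PER_SECOND  = 1000   # e.g. poll on every clock tick

          def overhead_us_per_second(chars_per_second):
              interrupt_model = chars_per_second * INTERRUPT_COST_US
              polling_model   = POLLS_PER_SECOND * POLL_COST_US
              return interrupt_model, polling_model

          for load in (10, 100, 1000, 10000):          # characters/second arriving
              irq, poll = overhead_us_per_second(load)
              print(f"{load:>6} cps: interrupts {irq:>8} us/s, polling {poll:>6} us/s")

          # At low load interrupts are cheaper; past the crossover (here 100 cps)
          # the fixed polling overhead is far below the per-event interrupt cost.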

        Dr Neil J Gunther has written about this effect extensively with his Universal Scalability Law and other articles showing the equivalence of the seemingly disparate approaches of Vector Processing and SMP systems in the limit of their performance.

        My comment about big, slow memory changing Operating System scheduling can be combined with the Cray PP/AP organisation.

        In the modern world of CMOS, micro-electronics and multi-core chips, we are still facing the same Engineering problem Seymour Cray was attempting to solve:
        For a given technology, how do you balance maximum performance with the Power/Heat Wall?
        More power gives you more speed, which creates more Heat, which results in self-destruction, the "Halt and Catch Fire" problem. Silicon junctions/transistors are subject to thermal run-away: as they get hotter, they consume more power and get hotter still. At some point that becomes a vicious cycle (positive feedback loop) and it's game over. Good chip/system designs balance on just the right side of this knife edge.
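
        A toy feedback-loop model shows the knife edge. Leakage power grows roughly exponentially with temperature and temperature rises with total power; all the constants below are assumed, illustrative values, not real chip data.

          # Toy model of thermal run-away as a positive feedback loop.
          AMBIENT_C     = 40.0
          LEAK_W_AT_25C = 5.0
          LEAK_DOUBLE_C = 25.0      # assume leakage roughly doubles every 25 C

          def leakage_w(temp_c):
              return LEAK_W_AT_25C * 2 ** ((temp_c - 25.0) / LEAK_DOUBLE_C)

          def settle(dynamic_w, thermal_r_c_per_w, max_temp_c=150.0):
              """Iterate the feedback loop; report equilibrium or run-away."""
              temp = AMBIENT_C
              for _ in range(200):
                  power = dynamic_w + leakage_w(temp)
                  new_temp = AMBIENT_C + thermal_r_c_per_w * power
                  if new_temp > max_temp_c:
                      return f"RUN-AWAY past {max_temp_c:.0f} C - Halt and Catch Fire"
                  if abs(new_temp - temp) < 0.01:
                      return f"settles at {new_temp:.1f} C ({power:.0f} W)"
                  temp = new_temp
              return "still climbing"

          print("good heatsink:", settle(dynamic_w=90, thermal_r_c_per_w=0.3))
          print("poor heatsink:", settle(dynamic_w=90, thermal_r_c_per_w=0.6))

        Same chip, same clock; only the thermal resistance changes, and one case settles quietly while the other runs away.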

        How could the Cray PP/AP organisation be applied to current multi-core chip designs?
        1. Separate the CPU designs for kernel-mode and Application Processors.
          A single chip needs only have a single kernel-mode CPU controlling a number of Application CPU's. With its constant overhead cost already "paid for", scaling of Application performance is going to be very close to linear right up until the limit.
        2. Application CPU's don't have forced context switches. They roar along as fast as they can for as long as they can, or the kernel scheduler decides they've had their fair share.
        3. System Performance and Security both improve by using different instruction sets and processor architectures for the kernel and the Applications. While a virus/malware might be able to compromise an Application, it can't migrate into the kernel unless the kernel itself is buggy. The Security Boundary and Partitioning Model is very strong.
        4. There doesn't have to be competition between the kernel-mode CPU and the AP's for cache memory 'lines'. In fact, the same memory cell designs/organisations used for L1/L2 cache can be provided as small (1-2MB) amounts of very fast direct access memory. The modern equivalent of "all register" memory.
        5. Because the kernel-mode CPU and AP's don't contend for cache lines, each will benefit hugely in raw performance.
          Another, more subtle, benefit is the kernel can avoid both the 'snoopy cache' (shared between all CPU's) and VM systems. It means a much simpler, much faster and smaller (= cooler) design.
        6. The instruction set for the kernel-mode CPU will be optimised for speed, simplicity and minimal transistor count. You can forget about speculative execution and other really heavy-weight solutions necessary in the AP world.
        7. The AP instruction set must be fixed and well-known, while the kernel-mode CPU instruction set can be tweaked or entirely changed for each hardware/fabrication iteration. The kernel-mode CPU runs what we'd now call either a hypervisor or a micro-kernel. Very small, very fast and with just enough capability. A side effect is that the chip manufacturers can do what they do best - fiddle with the internals - and provide a standard hypervisor for other O/S vendors to build upon.
        Cheaper, Faster, Cooler, more robust and Secure and able to scale better.
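
        To make the Security Boundary claim concrete, here's a minimal sketch (class and method names are mine, purely illustrative) of the partitioning model: Application Processors can only ask the kernel-mode CPU to do things through a narrow request queue, and anything outside the small fixed set of operations is simply refused.

          # Sketch of the partitioning model: AP's share no instruction set and
          # no privileged state with the kernel-mode CPU; they can only queue
          # requests drawn from a small, fixed set. Illustrative names only.

          from queue import Queue

          class KernelCPU:
              """Owns all privileged state; services a small, fixed set of requests."""
              def __init__(self):
                  self.requests = Queue()
                  self.handlers = {
                      "read":  lambda args: f"read {args} bytes",
                      "write": lambda args: f"wrote {args} bytes",
                      "yield": lambda args: "rescheduled",
                  }

              def service(self):
                  while not self.requests.empty():
                      op, args, reply = self.requests.get()
                      # Unknown operations are refused - there is no way for an AP
                      # to run arbitrary code on the kernel-mode CPU.
                      reply.append(self.handlers.get(op, lambda a: "refused")(args))

          class ApplicationCPU:
              """Runs application code flat-out; no kernel mode, no privileged access."""
              def __init__(self, kernel):
                  self.kernel = kernel

              def syscall(self, op, args):
                  reply = []
                  self.kernel.requests.put((op, args, reply))
                  self.kernel.service()        # in hardware this would be asynchronous
                  return reply[0]

          kernel = KernelCPU()
          ap = ApplicationCPU(kernel)
          print(ap.syscall("read", 4096))      # -> 'read 4096 bytes'
          print(ap.syscall("poke_kernel", 0))  # -> 'refused'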

        What's not to like in this organisation?

        A Good Question: When will Computer Design 'stabilise'?

        The other night I was talking to my non-Geek friend about computers and he formulated what I thought was A Good Question:
        When will they stop changing??
        This was in reaction to me talking about my experience in suggesting a Network Appliance, a high-end Enterprise Storage device, as shared storage for a website used by a small research group.
        It comes with a 5 year warranty, which leads to the obvious question:
        will it be useful, relevant or 'what we usually do' in 5 years?
        I think most of the elements in current systems are here to stay, at least for the evolution of Silicon/Magnetic recording. We are staring at 'the final countdown', i.e. hitting physical limits of these technologies, not necessarily their design limits. Engineers can be very clever.

        The server market has already fractured into "budget", "value" and "premium" species.
        The desktop/laptop market continues to redefine itself - and more 'other' devices arise. The 100M+ iPhones already out there, in particular, demonstrate this.

        There's a new major step in server evolution just breaking:
        Flash memory for large-volume working and/or persistent storage.
        What now may be called internal or local disk.
        This implies a major re-organisation of even low-end server installations:
        Fast local storage and large slow network storage - shared and reliable.
        When the working set of Application data in databases and/or files will fit on (affordable) local flash memory, response times improve dramatically because all that latency is removed. By definition, data outside the working set isn't a rate limiting step, so its latency only slightly affects system response time. However, throughput, the other side of the Performance Coin, has to match or beat that of the local storage, or it will become the system bottleneck.
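
        A simple weighted average (with assumed, round-number latencies) shows how strongly the working set dominates response time:

          # Average I/O service time when a fraction of requests hit local flash
          # and the rest go over the network to slow shared storage.
          # Latencies are assumed round numbers, for illustration only.

          FLASH_LATENCY_MS   = 0.1   # local flash read (assumed)
          NETWORK_LATENCY_MS = 8.0   # networked disk array read (assumed)

          def average_latency_ms(working_set_hit_rate):
              return (working_set_hit_rate * FLASH_LATENCY_MS
                      + (1.0 - working_set_hit_rate) * NETWORK_LATENCY_MS)

          for hit_rate in (0.0, 0.5, 0.9, 0.99):
              print(f"working-set hit rate {hit_rate:4.0%}: "
                    f"average I/O latency {average_latency_ms(hit_rate):.2f} ms")

        Serving 99% of I/O from flash cuts the average from 8 ms to under 0.2 ms; the stragglers outside the working set barely move it, but the shared array still has to keep up on throughput or it becomes the bottleneck.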

        An interesting side question:
        How will Near-Zero-Latency local storage impact system 'performance', both response times (a.k.a. latency) and throughput?

        I conjecture that both system latency and throughput will improve markedly, possibly super-linearly, because one of the bugbears of Operating Systems, the context switch, will be removed. Systems have to expend significant effort/overhead in 'saving their place', deciding what to do next, then, when the data is finally ready/available, stopping what they were doing and starting again where they left off.

        The new processing model, especially for multi-core CPU's, will be:
        Allocate a thread to a core, let it run until it finishes and waits for (network) input, or it needs to read/write to the network.
        Near zero-latency storage removes the need for complex scheduling algorithms and associated queuing. It improves both latency and throughput by removing a bottleneck.
        It would seem that Operating Systems might benefit from significant redesign to exploit this effect, in much the same way that RAM is now large and cheap enough that system 'swap space' is now either an anachronism or unused.
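
        A back-of-the-envelope calculation (assumed, illustrative costs) shows why the context switch stops earning its keep:

          # What fraction of each blocking I/O is spent on the context-switch
          # machinery itself? Assumed, illustrative costs only.

          CONTEXT_SWITCH_US = 5.0          # one switch out plus one back in = 10 us

          def switch_overhead_fraction(io_latency_us):
              overhead = 2 * CONTEXT_SWITCH_US
              return overhead / (io_latency_us + overhead)

          for name, latency_us in (("15K disk", 5000.0),
                                   ("local flash", 50.0),
                                   ("near-zero-latency flash", 5.0)):
              frac = switch_overhead_fraction(latency_us)
              print(f"{name:>24}: {frac:6.1%} of the wait is switching overhead")

        Roughly 0.2% for disk, ~17% for flash, and two-thirds once latency approaches the switch cost itself - at which point blocking and rescheduling a thread makes little sense, and the run-to-completion model above looks much more attractive.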

        The evolution of USB flash drives saw prices/Gb halving every year. I've recently seen 4Gb SDHC cards at the supermarket for ~$15, whereas in 2008, I paid ~$60 for USB 4Gb.

        Rough server pricing for RAM in 2010 is A$65/Gb ±$15.
        List prices by Tier 1/2 vendors for a 64Gb SSD are $750-$1000 (around 2-4 times cheaper from 'white box' suppliers).
        I've seen these firmware-limited to 50Gb to improve performance and give reliability comparable to current production HDD specs.
        This is $12-$20/Gb, depending on what base size and prices are used.

        Disk drives are ~A$125 for 7200rpm SATA and $275-$450 for 15K SAS drives,
        with 2.5" drives priced in-between.
        I.e. $0.125/Gb for 'big slow' disks and ~$1/Gb for fast SAS disks.

        Roll forward 5 years to 2015 and 'SSD' might've doubled in size three times, plus seen the unit price drop. Hard disks will likely follow the same trend of 2-3 doublings.
        Say SSD 400Gb for $300: $0.75/Gb
        2.5" drives might be up to 2-4Tb in 2015 (from 500Gb in 2010) and cost $200: $0.05-0.10/Gb
        RAM might be down to $15-$30/Gb.
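
        The $/Gb figures above are just arithmetic on the quoted prices; this sketch reproduces them (the drive capacities marked 'assumed' are my assumptions where the text doesn't state them):

          # Rough $/Gb from the quoted 2010 prices and the 2015 guesses.
          # Capacities marked 'assumed' are assumptions, not quoted figures.

          def per_gb(price_dollars, capacity_gb):
              return price_dollars / capacity_gb

          lines = [
              ("RAM 2010 (quoted directly)",          65.00,             65.00),
              ("SSD 64Gb @ $750-1000 (50Gb usable)",  per_gb(750, 64),   per_gb(1000, 50)),
              ("SATA 7200rpm, assumed 1Tb @ $125",    per_gb(125, 1000), per_gb(125, 1000)),
              ("SAS 15K, assumed 300-450Gb",          per_gb(275, 300),  per_gb(450, 450)),
              ("SSD 2015 guess, 400Gb @ $300",        per_gb(300, 400),  per_gb(300, 400)),
              ("2.5in 2015 guess, 2-4Tb @ $200",      per_gb(200, 4000), per_gb(200, 2000)),
          ]
          for name, low, high in lines:
              print(f"{name:38s} ${low:6.2f} - ${high:6.2f} per Gb")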

        A caveat with disk storage pricing: 10 years ago RAID 5 became necessary for production servers to avoid permanent data loss.
        We've now passed another event horizon: Dual-parity, as a minimum, is required on production RAID sets.

        On production servers, price of storage has to factor in the multiple overheads of building high-reliability storage (redundant {disks, controllers, connections}, parity and hot-swap disks and even fully mirrored RAID volumes plus software, licenses and their Operations, Admin and Maintenance) from unreliable parts. A problem solved by electronics engineers 50+ years ago with N+1 redundancy.

        Multiple Parity is now needed because in the time taken to recreate a failed drive, there's a significant chance of a second drive failure and total data loss. [Something NetApp has been pointing out and addressing for some years.] The reason is simple: the time to read or write a whole drive has steadily increased since ~1980. Capacity grows with areal density, i.e. recording density (bits per inch) times track density (tracks per inch), while sustained transfer rate grows only as roughly recording density times rotational speed; track density has grown far faster than rotational speed.
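
        Rough numbers make the rebuild-window problem obvious. The capacities and sustained transfer rates below are assumed, typical-of-era round figures, not measurements:

          # How long does one full read or write pass of a drive take, flat-out?
          # Capacities and sustained rates are assumed, typical-of-era figures.

          def full_pass_hours(capacity_gb, mb_per_sec):
              return (capacity_gb * 1000.0 / mb_per_sec) / 3600.0

          eras = [
              ("~1990:  1Gb at   2 MB/s",    1,    2),
              ("~2000: 40Gb at  25 MB/s",   40,   25),
              ("~2010:  2Tb at 100 MB/s", 2000,  100),
          ]
          for label, gb, rate in eras:
              print(f"{label}: {full_pass_hours(gb, rate):5.1f} hours for one full pass")

        Minutes in 1990, over five hours flat-out in 2010 - and a real RAID rebuild competes with production I/O, so the window in which a second failure is fatal keeps growing.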

        Which makes running triple-mirrors a much easier entry point, or some bright spark has to invent a cheap-and-cheerful N-way data replication system. Like a general use Google File System.

        Another issue is that current SSD offerings don't impress me.

        They make great local disks or non-volatile buffers in storage arrays, but are not yet, in my opinion, quite ready for 'prime time'.

        I'd like to see 2 things changed:
        • RAID-3 organisation with field-replaceable mini-drives. Hot-swap preferred.
        • PCI, not SAS or SATA connection. I.e. they appear as directly addressable memory.

        This way the hardware can access flash as large, slow memory and the Operating System can fabricate that into a filesystem if it chooses - plus if it has some knowledge of the on-chip flash memory controller, it can work much better with it. It saves multiple sets of interfaces and protocol conversions.

        Direct access flash memory will always be cheaper and faster than SATA or SAS pseudo-drives.
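
        A hint of what "directly addressable" buys: in the sketch below an ordinary file stands in for a PCI-attached flash region (the path is purely a stand-in), and the software simply maps it and reads/writes bytes, with no SCSI/SATA command set or protocol conversion in the way.

          # Flash presented as plain addressable memory: an ordinary file stands
          # in for a PCI-attached flash region, purely for illustration.

          import mmap, os

          PATH = "/tmp/fake_flash_region"     # hypothetical stand-in, not a real device
          SIZE = 1 << 20                      # pretend 1Mb flash window

          with open(PATH, "wb") as f:         # create the backing "region" once
              f.truncate(SIZE)

          with open(PATH, "r+b") as f:
              flash = mmap.mmap(f.fileno(), SIZE)
              flash[0:16] = b"superblock v1..."    # write by simple byte addressing
              print(flash[0:16])                   # read the same way
              flash.flush()                        # make it persistent
              flash.close()

          os.remove(PATH)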

        We would then see the following hierarchy of memory in servers:

        • Internal to server
          • L1/2/3 cache on-chip
          • RAM
          • Flash persistent storage
          • optional local disk (RAID-dual parity or triple mirrored)
        • External and site-local
          • network connected storage array, optimised for size, reliability, streaming IO rate and price not IO/sec. Hot swap disks and in-place/live expansion with extra controllers or shelves are taken as a given.
          • network connected near-line archival storage (MAID - Massive Array of Idle Disks)
        • External and off-site
          • off-site snapshots, backups and archives.
            Which implies a new type of business similar to Amazon's Storage Cloud.
        The local network/LAN is going to be ethernet (1Gbps or 10Gbps Ethernet, a.k.a 10GE), or Infiniband if 10GE remains very expensive. Infiniband delivers 3-6Gbps over short distances on copper; external SAS currently uses the "multi-lane" connector to deliver four channels per cable. This is exactly right for use in a single rack.

        I can't see a role for Fibre Channel outside storage arrays, and these will go if Infiniband speed and pricing continues to drop. Storage Arrays have used SCSI/SAS drives with internal copper wiring and external Fibre interfaces for a decade or more.
        Already the premium network vendors, like CISCO, are selling "Fibre Channel over Ethernet" switches (FCoE using 10GE).

        Nary a tape to be seen. (Hooray!)

        Servers should tend to be 1RU either full-width or half-width, though there will still be 3-4 styles of servers:
        • budget: mostly 1-chip
        • value: 1 and 2-chip systems
        • lower power value systems: 65W/CPU-chip, not 80-90W.
        • premium SMP: fast CPU's, large RAM and many CPU's (90-130W ea)
        If you want removable backups, stick 3+ drives in a RAID enclosure and choose between USB, firewire/IEEE 1394, e-SATA or SAS.

        Being normally powered down, you'd expect extended lifetimes for disks and electronics.
        But they'll need regular (3-6-12 months) read/check/rewrite cycling or the data will degrade and be permanently lost. Random 'bit-flipping' due to thermal activity, cosmic rays/particles and stray magnetic fields is the price we pay for very high density on magnetic media.
        Which is easy to do if they are kept in a remote access device, not unlike "tape robots" of old.
        Keeping archival storage "on a shelf" implies manual processes for data checking/refresh, and that is problematic to say the least.
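
        The read/check cycle itself can be as simple as a checksum manifest that is rebuilt and re-verified on a schedule. A minimal sketch (paths and the manifest name are illustrative, not a real tool):

          # Minimal read-and-verify pass for an archive disk pack: keep a manifest
          # of checksums, re-read everything on a schedule and flag any mismatch.

          import hashlib, json, os, sys

          MANIFEST = "manifest.sha256.json"

          def checksum(path, chunk=1 << 20):
              h = hashlib.sha256()
              with open(path, "rb") as f:
                  for block in iter(lambda: f.read(chunk), b""):
                      h.update(block)
              return h.hexdigest()

          def files_under(root):
              for dirpath, _, names in os.walk(root):
                  for name in names:
                      if name != MANIFEST:
                          yield os.path.join(dirpath, name)

          def build(root):
              manifest = {p: checksum(p) for p in files_under(root)}
              with open(os.path.join(root, MANIFEST), "w") as f:
                  json.dump(manifest, f, indent=1)

          def verify(root):
              with open(os.path.join(root, MANIFEST)) as f:
                  manifest = json.load(f)
              bad = [p for p, digest in manifest.items()
                     if not os.path.exists(p) or checksum(p) != digest]
              for p in bad:
                  print("DEGRADED or MISSING:", p)
              return not bad

          if __name__ == "__main__":
              root, action = sys.argv[1], sys.argv[2]   # e.g. /archive/pack01 build|verify
              build(root) if action == "build" else sys.exit(0 if verify(root) else 1)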

        3-5 2.5" drives will make a nice 'brick' for these removable backup packs.
        Hopefully commodity vendors like Vantec will start selling multiple-interface RAID devices in the near future. Using current commodity interfaces should ensure they are readable at least a decade into the future. I'm not a fan of hardware RAID controllers in this application because if it breaks, you need to find a replacement - which may be impossible at a future date. (fails 'single point of failure' test).

        Which presents another question using a software RAID and filesystem layout: Will it still be available in your O/S of the future?
        You're keeping copies of your applications, O/S, licences and hardware to recover/access archived data, aren't you? So this won't be a question... If you don't intend to keep the environment and infrastructure necessary to access archived data, you need to rethink what you're doing.

        These enclosures won't be expensive, but shan't be cheap and cheerful:
        Just what is your data worth to you?
        If it has little value, then why are you spending money on keeping it?
        If it is a valuable asset, potentially irreplaceable, then you must be prepared to pay for its upkeep in time, space and dollars. Just as packing old files into archive boxes and shipping them to a safe off-site facility costs money, it isn't over once they are out of your sight.

        Electronic storage is mostly cheaper than paper, but it isn't free and comes with its own limits and problems.

        Summary:
        • SSD's are best suited and positioned as local or internal 'disks', not in storage arrays.
        • Flash memory is better presented to an Operating System as directly accessible memory.
        • Like disk arrays and RAM, flash memory needs to seamlessly cater for failure of bits and whole devices.
        • Hard disks have evolved to need multiple parity drives to keep the risk of total data loss acceptably low in production environments.
        • Throughput of storage arrays, not latency, will become their defining performance metric.
          New 'figures of merit' will be:
          • Volumetric: Gb per cubic-inch
          • Power: Watts per Gb
          • Throughput: Gb per second per read/write-stream
          • Bandwidth: Total Gb per second
          • Connections: Number of simultaneous connections.
          • Price: $ per Gb available and $ per Gb/sec per server and total
          • Reliability: probability of 1 byte lost per year per Gb
          • Archive and Recovery features: snapshots, backups, archives and Mean-Time-to-Restore
          • Expansion and Scalability: maximum size (Gb, controllers, units, I/O rate) and incremental pricing
          • Off-site and removable storage: RAID-5 disk-packs with multiple interfaces are needed.
        • Near Zero-latency storage implies reorganising and simplifying Operating Systems and their scheduling/multi-processing algorithms. Special CPU support may be needed, like for Virtualisation.
        • Separating networks {external access, storage/database, admin, backups} becomes mandatory for performance, reliability, scaling and security.
        • Pushing large-scale persistent storage onto the network requires a commodity network faster than 1Gbps ethernet. This will either be 10Gbps ethernet or multi-lane 3-6Gbps Infiniband.
        Which leads to another question:
        What might Desktops look like in 5 years?

        Other Reading:
        For a definitive theoretical treatment of aspects of storage hierarchies, Dr. Neil J Gunther, ex-Xerox PARC, now Performance Dynamics, has been writing about "The Virtualization Spectrum" for some time.

        Footnote 1:
        Is this idea of multi-speed memory (small/fast and big/slow) new or original?
        No: Seymour Cray, the designer of the world's fastest computers for ~2 decades, based his designs on it. It appears to me to be an old idea whose time has come again.

        From a 1995 interview with the Smithsonian:
        SC: Memory was the dominant consideration. How to use new memory parts as they appeared at that point in time. There were, as there are today large dynamic memory parts and relatively slow and much faster smaller static parts. The compromise between using those types of memory remains the challenge today to equipment designers. There's a factor of four in terms of memory size between the slower part and the faster part. Its not at all obvious which is the better choice until one talks about specific applications. As you design a machine you're generally not able to talk about specific applications because you don't know enough about how the machine will be used to do that.
        There is also a great PPT presentation on Seymour Cray by Gordon Bell entitled "A Seymour Cray Perspective", probably written as a tribute after Cray's untimely death in an auto accident.

        Footnote 2:
        The notion of "all files on the network" and invisible multi-level caches was built in 1990 at Bell Labs in their Unix successor, "Plan 9" (named for one of the worst movies of all time).
        Wikipedia has a useful intro/commentary, though the original on-line docs are pretty accessible.

        Ken Thompson and co built Plan 9 around 3 elements:
        • A single protocol (9P) of around 14 elements (read, write, seek, close, clone, cd, ...)
        • The Network connects everything.
        • Four types of device: terminals, CPU servers, Storage servers and the Authentication server.
        Ken's original storage server had 3 levels of transparent storage (in sizes unheard of at the time):
        • 1Gb of RAM (more?)
        • 100Gb of disk (in an age where 1Gb drives were very large and exotic)
        • 1Tb of WORM storage (write-once optical disk. Unheard of in a single device)
        The usual comment was, "you can go away for the weekend and all your files are still in either memory or disk cache".

        They also pioneered permanent point-in-time archives on disk in something appearing to the user as similar to NetApp's 'snapshots' (though they didn't replicate inode tables and super-blocks).

         My observations in this piece can be paraphrased as:
        • re-embrace Cray's multiple-memory model, and
        • embrace commercially the Plan 9 "network storage" model.