2005-07-31

NASA keeps breaking a core rule of engineering

Well, in case you haven't heard, foam continues to be a problem for the Space Transportation System (STS). While the 100,000 part STS was designed with inherent risks, including the breakage of insulation foam during lift-off (and even o-rings before that), these risks were mitigated in design, component test and regression test, and the design flew in integration many times before failure.

So what's the problem? Not building every STS like the one before.

Various supplier and even NASA policy changes have dictated material changes during the lifetime of the STS. First it was a materials change for environmental reasons by the supplier of a support agent used around the o-rings, not even a dozen months before the Challenger disaster. Coincidentally, and unluckily for 7 brave astronauts, Challenger was on the pad during one of the colder winters of central Florida after many delays. Although it is true that the o-ring design was not ideal, it was a mitigated risk in the STS design. It did not become an issue until some support material was changed that radically increased the risk -- especially when certain factors (such as weather) were not considered by management.

Secondly and more recently was the fate of Columbia due to a US-wide environmental policy change in 1987 that started taking effect for everyone, including the military**, by 1997. Foam insulation had always broken off in every shuttle launch, and tile damage was an accepted, post-launch repair. This is how the STS was designed, largely because foam insulation loss was minimal, did not adversely affect tile wear (i.e., did not impact very deep if it did) and most tile damage (around 40/mission) on the returning shuttle orbiter was due to heat. The problem was the change to CFC-less foam in 1997, which had a tensile strength 11x worse -- OVER AN ORDER OF MAGNITUDE WORSE! -- than the original CFC-based foam. Now tile damage was in the hundreds, with several impacts causing CUTS IN TILE AS DEEP AS 75% ON A REGULAR BASIS! Again, was the use of insulation foam that could break off and damage tiles on the shuttle orbiter ideal? No. But was it a mitigated risk in the original design? Yes. It just wasn't an issue until the material change.

[**NOTE: I have personally seen (via video of a White Sands launch in 1998 on a defense launch system I worked on prior) how poorly the CFC-less insulation performs when defense systems are designed to rely on its tensile strength as if it were CFC-based. What is an "equivalent replacement" from an insulator standpoint is usually based on its actual insulation properties. The problem is that any material changes that seem "equivalent" in "core functionality" might impact other attributes. E.g., there were early CFC-free refrigerant replacements for A/C in cars that EXPLODED when there was a leak (after all, they ALWAYS leak!) and someone was smoking. The same is true with CFC-less insulation: the tensile strength is dismal, and a major issue for not only NASA, but our defense arsenal -- which is even more scary! Especially when everything from ballistic to hand-held rockets is often dependent on foam "breaking away whole" and not "fragmenting" during launch. ]

The fact that NASA has designed, tested, re-tested and regression tested components in the STS over and over, taking on risk that they believe they have mitigated, is a recurring theme and part of the reason why NASA first responds with less than enthusiasm when it is criticized by the public. Even I as an engineer can understand why NASA's initial reactions are not critical enough of itself. After all, NASA is constantly bombarded with criticism by people who do not understand engineering principles (let alone those arguments that are not scientific in nature -- e.g., people who complain about the use of radioisotope thermoelectric generators, RTGs). In NASA's mind, at least at the onset of criticism, the public is not privy to all the testing they have done, which is why they downplayed things like the o-rings and, now, foam. Because, after all, they did test the heck out of the original o-ring design with the ORIGINAL materials, just like the original foam insulation design with the ORIGINAL CFC-based foam.

Now the PROBLEM WITH NASA, or at least NASA's management of the STS, is that their management refuses to accept the fact that they are consistently VIOLATING ENGINEERING PRINCIPLES. A core rule of engineering is that when you engineer a design to meet specifications that have been tested, and assume a risk that has been mitigated, ANY CHANGE IN THAT DESIGN -- even the smallest of changes -- can have the largest of impacts! This is far more true in the age of the STS, a 20+ year program, than in the prior, short-lived, manned programs of Mercury, Gemini and Apollo that were under 5 years and a dozen manned launches. The STS was designed and tested to exacting specifications -- most importantly, the materials used -- and over time, those materials might come under change.

Some of it cannot be solely blamed on NASA because A) there are over 100,000 parts in the STS and B) a lot of materials come from private industry suppliers, suppliers who also sell commercial products and, therefore, are forced into product changes by various regulations. Even when NASA is granted exceptions for certain material use that is forbidden in commodity, consumer usage (such as the EPA did for CFCs), NASA may not have the resources or will to deal with that overhead, especially in dealing with political fall-outs. It's almost ironic that the European Space Agency (ESA), let alone the Russians or Chinese (the latter don't even use flight termination, and their space agency has been responsible for the deaths of thousands of their own country's citizens), are not held up to the same standards.

E.g., RTGs are a perfect example where NASA continues to almost "apologize" to people who have absolutely no idea how RTGs work. There is absolutely no chance of harm that could come to you and you're more likely to be killed by debris than a RTG landing in your backyard (which still wouldn't be a danger)! Same deal with use of CFC foam. I have met countless environmentalists who not only don't put the limited amounts of CFC foam previously used in the STS into perspective, but also ignore the realities of the eventual fate of CFC-based insulation in the STS. The amount of CFC-based materials used in the STS over its entire history is only about 1 / 10,000,000 -- one ten-millionth -- of the CFC products in prior, US consumer usage in just a matter of weeks! And unlike consumer CFC products, which are still decaying over years in our landfills and in the open environment, which is what causes the (alleged, and I'll even agree it does) damage to the ozone layer, the majority of CFC foam used in the STS doesn't even have the chance to decay -- as it is disintegrated at extremely high temperatures, which often changes its composition radically (and beyond the ozone layer!).

So, with all that said, if I was director at NASA for all STS operations, I would hand down the following law (in the following order of priority):

1) All suppliers of all 100,000 STS components must now account for a full and complete history of all materials delivered for STS flights, including how any materials may have changed since the original design and early launches of STS.

2) While #1 is going on, any changes that have been knowingly made during the STS program must now undergo peer review and "all ears are open" discussions. It's time to have an "open house" in "peer review" fashion where people who are just "bitching" are quickly dismissed while people who have legitimate concerns that were "swept under the carpet" by managers or other political non-sense can finally get them exposed. There should be an edict that no one will be fired for what might be called "mis-management" or for "blowing the whistle," because it's far more important for people to expose everything at this point in the STS program.

3) Lastly, and with far more difficulty, all former engineers with NASA, the United Space Alliance (USA) and other organizations working on STS will be recalled for a 2 week conference at various times over a period of a few months, to discuss any known changes made during their tenure at NASA on the STS. This will attempt to address the "loss in mindshare" that has plagued STS unlike all other manned programs prior. Normally this is a tall order from both a logistics and "it ain't my bitch anymore" attitude, but with the correct appropriation of funds and effort from a public organization that wants to guarantee the public trust, I think many organizations and their engineers who were formerly on STS would be willing to participate if they honestly felt it would render much good.

I point out #3 because STS is far different than Mercury, Gemini and Apollo. In fact, given the duration of the program, STS is really NASA's greatest triumph of engineering. I know many won't see that, but the reality is that had Saturn-Apollo designs been used over and over and over again, they would have likely faced even more failures and issues -- let alone under the budgets of STS in comparison to the Moon shot. Ironically, the manned missions of the few years leading up to the Moon shot were a small order in comparison to the logistical nightmare of keeping suppliers from changing components, engineers from leaving (or just dying over the duration of the program for that matter ;-) and other details that short-lived programs never had to deal with. The public doesn't realize, but I'm sure many NASA managers do, that the STS is a product, not a project, and more like running a corporation from a logistics, manpower and other issue standpoint, than any other NASA endeavor in its entire history.

2005-07-30

Dissecting Virtual Tape Libraries (VTLs)

Although it's not on the front page of CMP "Sys Admin" magazine's site yet (probably come Monday), it appears that for the 2005 September issue, my next article in my Dissecting [technology] series is going to be the featured web article:
Dissecting Virtual Tape Libraries (VTLs)
- http://www.samag.com/documents/sam0509a/

If you don't have time to read all my verbiage, hit these 3 figures and then just skim through the article:
- http://www.samag.com/documents/sam0509a/0509a_f1.htm
- http://www.samag.com/documents/sam0509a/0509a_f2.htm
- http://www.samag.com/documents/sam0509a/0509a_f3.htm

In a nutshell, although there are a lot of network and storage articles on VTLs and near-line storage, I thought there was a much needed article to explain how they work to system administrators -- especially system administrators who still do "end-system" to "end-tape" over the network in real-time, and even more so for those that have forgone tape, which is still the most reliable, long-lasting off-line medium for disaster recovery. There is also a sidebar on "near-line" disk and how it differs from "commodity" and "enterprise" disk.

This article complements my prior "Dissecting" [technology] series as published in Sys Admin magazine, including:
Dissecting PC Server Performance (2004 November, on-line)
- http://www.samag.com/documents/sam0411b/
Dissecting ATA RAID Options (2004 April, not on-line)
- http://www.samag.com/articles/2004/0404/

Linux Distributions: Packages v. Ports

Outside of the Windows world (which is slowly moving towards very simple "packages" with Microsoft Installer, ".msi", files), the majority of operating system (OS) distributions (i.e., various, useful software as a collective whole) come in two forms:
- Ports
- Packages

EARLY CONTEMPORARY SOFTWARE DEVELOPMENT

In the golden days of community software development on the original ARPANET, programmer source code -- typically C code from the '70s onward (through today, long story) -- was always used. One of the hallmarks of the UNIX from AT&T's UNIX System Laboratories (USL) and, subsequently, due to the fact that the US government would not let AT&T (who had a monopoly on the telephone) enter the computer/OS market, of the UNIX that came from the involvement of academia and research (such as the University of California at Berkeley, UCB, with its Berkeley Software Distribution, BSD), was freely available C source code. Most everyone used the same C compiler (compilers turn source code, like C .h header and .c code, into .o object code aka "binary code" for specific computers so they can run -- NOTE: that is a mega-oversimplification) and other development tools, so "building" software for a particular UNIX platform was not too terribly difficult.

While AT&T was left outside the market, various commercial entities started cornering the commercial software market. Most well known was Bill Gates and Microsoft, who had ported Digital Basic to most fledgling consumer systems (e.g., Altair, Apple, etc...) without a license. Back in the early days of computing, code was still fairly simple across platforms, so coders shared source code freely. Gates was the first one, in 1976, to suggest no one would write good software if they weren't paid for it. Ironically and hypocritically enough, Gates' "anti-piracy letter" was actually written after he had swiped code from someone else, then modified it, because the original developers swiped his changes back.

Software quickly turned from having available source code to virtually no source code in the late '70s and early '80s. At first, this was manageable. Most utilities were simple, standalone and relied on little more than a kernel (typically the first, main and single, monolithic program that controls everything and is always loaded -- NOTE: again, that's an oversimplification) and maybe a C library (a C library is typically a set of object code that other programs can use, either by "linking dynamically" at run-time, or by "linking statically" into the program itself, no longer requiring the library -- long story). They were usually one program and maybe a few support files other than the data files created by the program.

INSTALLERS

Once programs became more complex, with lots of different support files, installing software was not so simple. To add more issues, some programs only worked on specific versions of an OS. This is because unlike software that is released as source code, and can typically be built against different components and versions of an OS (with little or no effort), object code is linked with other object code into an OS-specific version on a specific hardware platform. Even the same hardware platform and OS, like the Disk Operating System (DOS) developed by Seattle Computer Products as an unlicensed version of CP/M (which Microsoft bought all rights to for $50,000 and IBM would later settle out of court with CP/M's creator for $800,000), had incompatibilities between versions.

One quick fix in the MS-DOS world was to use an "installer," which was a specific program that was only used to test the system to see how to install the program. This was done because DOS itself did not provide any formal mechanisms to provide any concise, relevant information to the program. Installing programs was arbitrary, and the details (and headaches) were left to any installer (or lack thereof), with shortcomings being left as "an exercise for the user." Microsoft provided virtually no mechanisms for formal archiving (files inside of one file), tracking of installed files on a system, modification of core system resources, etc... until the mid-to-late '90s, and only then after another company solved it for them.

Today, Windows still suffers from "conflicts" in software, largely because it does not allow multiple files of the same use, but different versions, to be installed or accommodated easily. This is known as "DLL Hell." The only mitigation that Windows continues to offer is "protection" of some core system files, like core Windows DLLs. Microsoft has been working with InstallShield and proliferating their Microsoft Installer (.msi) "archive/package," which most installers are now based on, but it still has massive and inherent weaknesses compared to most "real" package systems as we'll see. In fact, one of the major reasons for the .msi format is not to combat configuration management issues (more on that later), but to prevent the installation of trojan horses, which commonly come as .exe or other executables but appear to be harmless installers.

PACKAGED DISTRIBUTION

Unlike the DOS world, UNIX included a wealth of utilities developed by USL, BSD and the academia/enthusiast community in the '70s through early '80s. Two of the most well-known and used are the tape archiver (tar) and copy I/O (cpio) programs. Many systems came with software in tar or cpio formats. Although tar may be more well-known to most Linux users today, cpio has explicit functions like "-i" (literally "copy-in," which extracts files from an archive onto the system) and is still popular with some UNIX flavors. After the shackles were lifted off in the AT&T breakup after 1984, AT&T began a standardization effort on their new System V (often abbreviated as SysV or SV) Version 1 UNIX. The tar and cpio formats are distinct, but both are simple streaming formats (the standardized tar format is known as "USTAR"; tar uses a 10KiB blocking by default, cpio uses a 5KiB blocking with its "-B" option -- NOTE: POSIX2001+/SUSv3+ augment the streaming formats and introduce a replacement known as "pax").
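
The 10KiB tar blocking is easy to see for yourself. Below is a minimal sketch using Python's standard tarfile module (the file name "hello.txt" and the helper function are just illustrations, not anything from a real distribution). It writes one tiny file into an uncompressed, in-memory tar stream and reports the resulting size, which gets padded out to a whole record.

```python
import io
import tarfile


def archive_size_for(payload: bytes) -> int:
    """Write one small file into an uncompressed in-memory tar; return the archive size."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(name="hello.txt")  # hypothetical file name
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    # The tar stream is finished once the "with" block exits.
    return len(buffer.getvalue())


size = archive_size_for(b"hello, tape\n")

# tarfile.BLOCKSIZE is the 512-byte tar block; tarfile.RECORDSIZE is 20 blocks.
print("block size :", tarfile.BLOCKSIZE)
print("record size:", tarfile.RECORDSIZE)
print("archive    :", size, "bytes,", size // tarfile.RECORDSIZE, "record(s)")
```

Even a 12-byte file ends up as a full record, which is exactly the behavior you would want when streaming to a real tape drive.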

UNIX, unlike DOS until 1993 with Windows NT (a true, Protected386 operating system with its own NTFS and VFAT filesystem extensions for FAT12/16 -- although VFAT was not popular until DOS7.x/Windows4.x in Windows 95/98/Me), typically has very generous filename length limits. 31 characters were a typical minimum, and 255+ was almost standard in most implementations -- especially after the Institute of Electrical and Electronics Engineers (IEEE) began their Portable Operating System Interface (POSIX) committee to document UNIX standards. So it was commonplace to put version numbers on libraries and other system utilities, removing the possible conflicts that some programs could have -- something Windows still does not do today (with many ill effects). POSIX systems also have filesystems with symbolic links (symlinks), which allow one file to be referenced by another name, so a version like mylibrary-1.1.1.so (Shared Object, .so -- the common System-V format for libraries in the UNIX world -- somewhat of the equivalent to a "DLL" in Windows) could also appear as mylibrary-1.so or even mylibrary.so.
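
To make the versioned-library-plus-symlink idea concrete, here is a small sketch in Python. It assumes a POSIX system (symlinks may need extra privileges elsewhere), works in a throwaway temporary directory, and uses the mylibrary naming from the paragraph above; the install_versioned_library helper is hypothetical and writes placeholder bytes, not real object code.

```python
import os
import tempfile


def install_versioned_library(libdir: str, base: str, version: str) -> str:
    """Create base-<version>.so and point the major-version and bare names at it."""
    major = version.split(".")[0]
    real_name = f"{base}-{version}.so"
    with open(os.path.join(libdir, real_name), "wb") as handle:
        handle.write(b"placeholder for real object code")
    for alias in (f"{base}-{major}.so", f"{base}.so"):
        alias_path = os.path.join(libdir, alias)
        if os.path.lexists(alias_path):
            os.remove(alias_path)  # re-point an existing alias at the newer version
        os.symlink(real_name, alias_path)
    return real_name


with tempfile.TemporaryDirectory() as libdir:
    install_versioned_library(libdir, "mylibrary", "1.1.1")
    install_versioned_library(libdir, "mylibrary", "1.2.0")
    resolved = os.readlink(os.path.join(libdir, "mylibrary.so"))
    names = sorted(os.listdir(libdir))

print("mylibrary.so ->", resolved)
print(names)
```

Both versioned files stay on disk side by side, so a program that insists on 1.1.1 can still ask for it by its full name while everything else follows the symlink to the newest version.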

As software grew increasingly complex, UNIX systems found themselves having packages that conflicted -- some libraries not working with others, or requiring a different (typically newer) standard C library (the foundation of "binary compatibility" in any POSIX/UNIX system). This gave rise to new packages with dependency checking -- packaging systems that would check to see if any other packages are required or could conflict. Although some installers could do the same, their arbitrary nature meant there were not only no guarantees, but older installers were often totally unaware of newer changes. Packaged distribution moves the "logic" of package management from the installer/program itself to the operating system, removing the issue of older installers with "legacy" information that may be ignorant of new programs and interfaces.
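
As a rough illustration of what "dependency checking" means, here is a toy sketch in Python. It is not how DPKG or RPM is actually implemented; the Package fields and every package name below are made up, and real systems also compare version ranges, virtual "provides" and more.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    name: str
    version: str
    requires: list = field(default_factory=list)   # names that must already be installed
    conflicts: list = field(default_factory=list)  # names that must NOT be installed


def check_install(candidate: Package, installed: dict) -> list:
    """Return a list of problems; an empty list means the install may proceed."""
    problems = []
    for needed in candidate.requires:
        if needed not in installed:
            problems.append(f"missing dependency: {needed}")
    for unwanted in candidate.conflicts:
        if unwanted in installed:
            problems.append(f"conflicts with installed package: {unwanted}")
    # The system, not the installer, also knows what already-installed packages object to.
    for other in installed.values():
        if candidate.name in other.conflicts:
            problems.append(f"installed package {other.name} conflicts with {candidate.name}")
    return problems


installed = {
    "libc": Package("libc", "2.3"),
    "oldmail": Package("oldmail", "1.0", conflicts=["newmail"]),
}
candidate = Package("newmail", "2.0", requires=["libc", "libssl"], conflicts=["oldmail"])

for problem in check_install(candidate, installed):
    print(problem)
```

The last loop is the point of moving the logic into the operating system: an old package that was installed years ago still gets a say when something new shows up.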

If you've ever installed an older Windows program only to completely trash a newer Windows program or vice-versa -- Microsoft's own Office product is a regular offender of this -- this is a major advantage of packaged OSes.

CONFIGURATION MANAGEMENT

The typical goal of packaged distribution, at least in an enterprise, is configuration management. The idea is that if you can test software against other software on specific systems, and package it in a way that it can be pushed out to systems, installed and configured, with any conflicts resolved automatically (possibly by uninstalling or interrupting its own install), you get a major reduction in IT overhead and cost. Even if software is stable and reliable on its own, there is no guarantee there will not be an issue once the software is integrated with other software on a system -- and systems can vary. Such "integration testing" is a crucial part of engineering, and information technology (IT) continues to be a practical extension of engineering principles (at least when addressed).

This is especially the case in the Windows world, where Windows itself lacks a good package management system, but there are mitigating systems. Microsoft has its own Systems Management Server (SMS) product, and has included elementary package management starting with Windows [NT 5.0] (2000). Systems can have software "pushed down" to the unit and automatically installed without operator intervention, with conflicts being handled gracefully. After the SQL Slammer worm of January 2003, Microsoft finally started taking security seriously (despite patronizing to the contrary 10 months earlier), and released more patch management.

There are countless patch and package management systems for Windows, including some enterprise quality solutions that Microsoft itself relies on more than their own -- such as Altiris' line of solutions. But in the Linux world, package and configuration management has been an all-in-one solution for some time, including cryptographic signatures on packages to avoid trojan horse software.
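
To give a feel for why this stops trojan horses, here is a deliberately simplified sketch in Python. Real package systems verify public-key signatures made by the packager; the example below only shows the digest-comparison half of that idea using the standard hashlib and hmac modules, and the "package" contents are made up.

```python
import hashlib
import hmac


def sha256_hex(payload: bytes) -> str:
    """Return the SHA-256 digest of the payload as a hex string."""
    return hashlib.sha256(payload).hexdigest()


def is_untampered(payload: bytes, published_digest: str) -> bool:
    """Compare the payload's digest with the digest published by the packager."""
    return hmac.compare_digest(sha256_hex(payload), published_digest)


# Assumption for illustration: the digest reached us over a trusted channel.
package_bytes = b"pretend this is the contents of a package file"
published = sha256_hex(package_bytes)

print(is_untampered(package_bytes, published))
print(is_untampered(package_bytes + b" plus a trojan", published))
```

A signature adds the missing piece: proof that the published digest itself came from the packager and not from whoever tampered with the file.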

LINUX PACKAGE MANAGEMENT SYSTEMS

The evolution of a sprawling GNU platform known as Linux (GNU = GNU's Not UNIX, a "clean room," UNIX-like project founded after AT&T's USL started asserting copyrights and closing up source code post-AT&T breakup) brought forth some extremely innovative approaches in package management. One of the earliest distributions of Linux, Slackware, uses its own Tar format, although it has added package management in more recent versions (NOTE: it's actually .tar.gz, a LZ77 compressed aka "Gzip" tar stream archive -- for those unfamiliar, PKZip is actually both a compressing LZ77 and block archiver in one).

Two of the most well-known among Linux users since the mid-'90s have been the Debian Package (DPKG) and the RPM Package Manager (RPM, fka Red Hat Package Manager) systems. A little known fact is that both formats wrap standard UNIX archives internally: an RPM carries its files as a compressed cpio archive, while a Debian package is an "ar" archive containing compressed tar archives. Each adds its own set of rich meta-data, which is a hallmark of their respective debates.

Debian's DPKG was actually a very capable implementation from initial creation, with RPM gaining more features as Red Hat developed the system. A hallmark of Debian's DPKG system was founder Ian Murdock's strong attention to proper software packaging in general, avoidance of unnecessary dependencies (e.g., not allowing packagers to require little used programs, like scripting languages, just to install) and other "standards" in Debian's formal guidelines. Red Hat, on the other hand, developed RPM for more of an immediate need and has added more functions since.

Red Hat Linux v8 included RPM v4, which is considered to be the foundation of Linux Standard Base (LSB) package management, and is very similar to DPKG in capabilities including use of alternatives, multiple versions and other LSB-compliant features. Unfortunately for package management systems, especially the ever-popular Red Hat, the wealth of "protection" in dependency checking gives rise to countless situations where the system will prevent you from installing software. This quickly came to be known as "dependency hell" -- often personalized as "RPM hell" in reference to Red Hat (or other Linux distributions that used RPM).

Debian's early proliferation came about due to a set of features that allowed it to avoid "dependency hell." In addition to good packaging standards combined with a "maintainer Democracy," which helped avoid unnecessary dependencies, Debian had the Advanced Package Tool (APT).

PACKAGE MANAGEMENT FRONT-ENDS

APT is a package management front-end designed for the distributed development nature of the Debian Project. It automates package (as well as package source code, if a user wishes to rebuild) fetching from Debian's extensive repository of packages, dependency resolution (including automatic fetching of any additional packages required) and interactive (or even automated "fix") resolution. Literally overnight many people found themselves pleased with the avoidance of "dependency hell."
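
The core trick of such a front-end, working out everything else that has to be fetched and in what order, can be sketched in a few lines of Python. This is only a toy depth-first resolver under my own assumptions, not APT's actual algorithm; the repository contents and package names are invented.

```python
def resolve_install_order(target, repository, installed=frozenset()):
    """Return packages to fetch and install, dependencies first, skipping installed ones."""
    order = []
    visiting = set()
    done = set(installed)

    def visit(name):
        if name in done:
            return
        if name in visiting:
            raise ValueError(f"dependency cycle involving {name}")
        if name not in repository:
            raise KeyError(f"{name} is not in the repository")
        visiting.add(name)
        for dependency in repository[name]:
            visit(dependency)
        visiting.discard(name)
        done.add(name)
        order.append(name)  # appended only after all of its dependencies

    visit(target)
    return order


# Hypothetical repository index: package name -> names it depends on.
REPOSITORY = {
    "editor": ["libgui", "libc"],
    "libgui": ["libc", "libpng"],
    "libpng": ["libc"],
    "libc": [],
}

print(resolve_install_order("editor", REPOSITORY))
print(resolve_install_order("editor", REPOSITORY, installed={"libc"}))
```

The user asks for one package and the front-end quietly works out the rest, which is the whole difference between "dependency hell" and a one-line install.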

Most RPM distributions, to the contrary, merely focused on their own service issues. I.e., they introduced package management systems that would only update their own packages and updates for them. The Red Hat Network (RHN) developed the Up2Date service and client software, SuSE created its own services and clients for its YaST (Yet another System Tool) and other RPM distributions did likewise. Although they all used RPM for the actual package management back-end, the front-ends were vendor-specific and some were subscription-based.

Contrary to popular opinion, while APT was designed for DPKG and the Debian repositories, including source code fetching, the approach is applicable to any package management system "back-end." E.g., slapt-get is an APT-like implementation for Slackware. Conectiva, a distribution popular with Spanish/Portuguese-speaking (South American) users, ported the APT system to RPM and used it as its base for updates around the turn of the millennium. By Red Hat Linux 7, many independent repositories had sprung up and started to use this implementation, often referred to as APT-RPM, to download updates and new software, as well as base software not installed.

SIDE-BAR: FROM RED HAT LINUX TO FEDORA CORE

Probably the most significant repository was the University of Hawaii's Fedora Project, started in 2002, whose APT-RPM repository showed that a completely Red Hat aligned repository and RPM-specific modifications could solve community distribution issues -- both for the core OS distribution as well as add-on software. Facing increased scrutiny that its trademark was public domain due to lack of past enforcement, as well as the introduction of its new, almost competing "enterprise" product after its prior attempt to bundle Service Level Agreements (SLAs) with its community-released Red Hat Linux 6.2 "E" product had sold poorly (at least in comparison to sales of SuSE Linux Enterprise Server, SLES, at the time), Red Hat decided to "unproduct" its Red Hat Linux into a community project. The resulting Fedora Project, taking the name from U of Hawaii, now consisted of Fedora Core (fka Red Hat Linux), Fedora Extras (community add-ons) and Fedora Legacy (post-end-of-life support -- including all the way back to Red Hat Linux 7.2).

One of the first and underappreciated moves was the fact that Red Hat halted development on Red Hat Linux 10, now Fedora Core 1, and started addressing a lot of "inter-dependency" among other non-sense (e.g., setuid root on too many binaries) in its core packages over the span of 2 months. In a nutshell, since Red Hat Linux, now Fedora Core, was no longer a "product," there was no longer the attitude of "what we ship we have to support" that had plagued Red Hat Linux with criticism for so long -- let alone that it was redundant with Red Hat Enterprise Linux (RHEL) existing (which is far more anal, because Service Level Agreements are sold on it, something that failed to sell as the unified Red Hat Linux 6.2 "E" offering prior).

The second move for Red Hat was to integrate the two most popular front-end systems, APT and Yellow Dog (a Red Hat Linux fork for PowerPC systems) Updater, Modified (YUM), into its Up2Date tools. Although Red Hat now provides formerly subscription Red Hat Network (RHN) access for free for Fedora Core, Extras, etc..., use of APT and YUM is encouraged. Although the formal Fedora Core project bundles only YUM and downplays the role of APT support in Up2Date itself, many believed APT-RPM to be superior due to its mature logic and availability of a GUI in Synaptic. But as of RPM v4.3 and YUM in Fedora Core 3, additional capabilities have been added with support for multiple architectures (e.g., running x86-64 and i386 binaries simultaneously on x86-64 systems -- NOTE: Debian solves this by using a chroot environment for i386 under x86-64, an innovative approach), 3-tier packaging sets (which the Anaconda installer for Red Hat has had for a long, long time, but not in the package system itself) and other capabilities.

YUM still seems to have some growing pains, as GUIs are not forthcoming. E.g., YUM Extender (yumex) is not developed by and for Fedora, and it has caused major issues in the past. Fortunately, a new front-end solution from the creator of APT-RPM is on the horizon -- SmartPM which we'll discuss a bit later.

TAKING A BREATH ...

Now at this point, you're probably wondering what distribution I think is best. In reality, I don't think any are "best." What distribution is for you? It really has nothing to do with brand name (unless there are some "commercial misconceptions," which I'll get to later), and I'm only providing this as a "foundation" of "technologies." I know a lot of people talk about Linux and "choice," but many times they find themselves arguing about "brand name choice," instead of actual "technical choice."

The reality is that if something works well in one distribution, it is typically adopted by another. In any democracy of infinite choice, most people will standardize around 2-3 implementations -- something that the natural laws of sigma statistics (1 ~ 68%, 2 ~ 95%, 3 ~ 99+%, etc...) seem to support. While some might argue that "foundations" of one distro might be better than another -- and I'll be the first to admit, as an engineer, I agree that the Debian Project as a community, with Ian Murdock's Progeny endeavor as a commercial services company, is most ideal (configuration management is a corporate-specific detail, and you can't get that out of a "shrink wrapped" box or even fixed product SLAs that may not address all your needs) -- all distributions have offered many other things to others, and they all leech on each other. After all, Linux is about a choice of "technologies," not "brands," although our history and exposure to "marketing" in the commercial world still tends to distract our focus.
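
For the curious, the sigma figures quoted above are just the share of a normal distribution that falls within 1, 2 and 3 standard deviations of the mean, and they can be checked with nothing more than Python's standard math module (the within_sigma helper name is my own):

```python
import math


def within_sigma(k: float) -> float:
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"{k} sigma: {within_sigma(k):.2%}")
```

Whether Linux distribution choice really follows a normal distribution is, of course, my own loose analogy and not something the math proves.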

Which brings me to our next focus, "Ports" ... that's right, all we've talked about to this point is OS distributions that use "Packages."

GETTING BACK TO BASICS: PORTS

The aftermath of the original 4.3BSD-based 386BSD project, and the resulting 4.4BSDLite codebase agreed by both AT&T USL and UCB to be the copyright/ownership of UCB, would eventually see 3 new, community BSD UNIX flavors in FreeBSD, NetBSD and, later from NetBSD, OpenBSD, at about the same time as the founding of the Debian Project and Red Hat, Inc. Instead of focusing on the traditional realm of package management to solve system configuration management, most community BSD implementations have stuck with the basic premise that UNIX is C source code that can be built against any OS implementation it has been "ported" to. GNU system tools like Autoconf are designed so that software can be easily ported and built on different GNU, POSIX-like platforms, depending on how well the software developer wrote the software (including using Autoconf and other tools, as well as following GNU coding or other portability guidelines).

To start with a specific example, FreeBSD's "ports" system is a different approach to distributed software installation, updates and other software. Instead of a "package maintainer" taking software, building a package configuration file (e.g., a "SPEC" file in the case of RPM), and building it from the "source package" (e.g., a .src.rpm aka SRPM) on one or more systems for one or more "packages" distro releases, the "ports maintainer" merely includes the small configuration so the "ports system" can fetch the software, as well as any additional developer files (e.g., a modified Makefile, Autoconf configuration output, etc..) needed to build the software. That way, instead of having to make packages for different system configurations -- a redundant and disk space bloating option -- the end system fetches all the files it needs to build the software for just itself.

In other words, "ports" distributions are a front-end for automating software building from source code. "Ports" distributions maintain a centralized repository where "ports maintainers" can collaborate and release new support files (if any are required) for new software releases. In many cases of various ports repositories, the software itself is actually not stored in the "ports tree/repository," but is fetched by the client. This is quite unlike "packages" distros, which include not only the software in the binary/usable form, but the source packages as well.
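
To make the idea concrete, here is a rough sketch in Python of what a "ports" entry boils down to. It is not how FreeBSD ports or Gentoo's Portage are actually implemented; the Port class, its fields, the example URL and the build_plan helper are all hypothetical, made up purely for illustration. The point is that the tree holds only metadata and support files, and the client works out the fetch/verify/build steps for itself.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Port:
    """Hypothetical ports-tree entry: metadata only, never the upstream source."""
    name: str
    version: str
    source_url: str                      # where the client fetches the tarball
    sha256: str                          # checksum recorded by the ports maintainer
    patches: list = field(default_factory=list)         # support files kept in the tree
    configure_args: list = field(default_factory=list)  # local build options


def verify(tarball: bytes, port: Port) -> bool:
    """Check the fetched tarball against the checksum stored in the tree."""
    return hashlib.sha256(tarball).hexdigest() == port.sha256


def build_plan(port: Port, prefix: str = "/usr/local") -> list:
    """Return the shell-style steps an end system would run, in order."""
    src = f"{port.name}-{port.version}"
    configure = " ".join(["./configure", f"--prefix={prefix}", *port.configure_args])
    steps = [f"fetch {port.source_url}", f"tar xzf {src}.tar.gz"]
    steps += [f"patch -d {src} -p1 < {p}" for p in port.patches]
    steps += [f"cd {src} && {configure}", f"cd {src} && make", f"cd {src} && make install"]
    return steps


# Illustrative entry; the URL is a placeholder and the "tarball" is fake bytes.
demo = Port(
    name="hello",
    version="1.0",
    source_url="http://example.org/hello-1.0.tar.gz",
    sha256=hashlib.sha256(b"pretend tarball").hexdigest(),
    patches=["files/patch-Makefile"],
    configure_args=["--disable-nls"],
)

for step in build_plan(demo):
    print(step)
```

Note that nothing in the Port entry is the software itself, which is why the same small entry can serve any number of end-system configurations.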

The Linux world was without a good "ports" distribution for a long time until Daniel Robbins created the Gentoo Project.

PORTS V. PACKAGES: ADVANTAGES

Although opinions vary, especially those who focus on "branding" instead of "technology," here are some key and distinct advantages/disadvantages of Ports v. Packages.

- Always Current -- ADVANTAGE: PORTS

Ports are almost always current. Unless a software release requires a significant change to the build process, ports trees/repositories are updated almost immediately. There may be some testing done by the "ports maintainer" for the software, but he/she has a lot less effort to deal with than a "packages maintainer," which brings us to the next advantage ...

- Distribution Release Effort -- ADVANTAGE: PORTS

By far, the build effort on the distribution maintainers themselves is greatly reduced in a ports distro. Other than making sure newer versions can be built with the existing support files in the ports tree/repository, the tree remains "current" without much effort. This, of course, means that end-user systems take on more effort in actually building the software (which can introduce delays). But with a good configuration management roll-out with distributed builds and binary distribution, this can be mitigated -- even in an SMB organization.

The effort at the packages distribution gets heavy, especially when the distributor is maintaining several different, simultaneous versions (which is what Red Hat was doing when Red Hat Linux 9 was released -- supporting 6-7 community versions, not including enterprise!). This is why most packages distros don't support more than 2-3 releases simultaneously, or don't guarantee timely or well tested updates for older releases (e.g., Fedora Legacy). The only main advantage is that once the software gets to the end-user system, assuming the version is supported, the effort (and assuming compatibility -- see the next point) is next to nothing.

- Update/software compatibility/availability/customization -- ADVANTAGE: PORTS

Now this is where Ports really shine. Because the software is built at the end system (even if distributed in binary form to all other systems of the same configuration), it can be built by and for a significant number of different configurations. Packages distros often only have 2-3 maintained releases (going back to the "release effort" required), and while the dependency resolution is typically sound on major distributions, and the front-ends very accommodating, it can get very messy when "package maintainers" of the official, distribution's packages do not match their build processes/assumptions with the "package maintainers" of other, 3rd party packages. In a "ports" distribution, the build processes/assumptions are unified at the end building system.

Compatibility of updates and software availability are also extensive, and "ports" distributions like Gentoo outstrip Debian in sheer number of available packages (which goes back to the "release effort"). Small changes do not have a significant impact on "ports" distros, or they can be accommodated by supporting different, core system "profiles" (using the Gentoo term) whereas many "packages" distros don't. I.e., a "packages" version may dictate whether /dev has actual files in it, or uses a virtual /dev filesystem like devfs or udev -- and an "older" version is left behind in available software. A "ports" distribution could be built for either, and still run and update the same, latest user-space software.

Now you _could_ use the source code fetching and rebuilding features of a "packages" distribution to rebuild the entire distribution from source, solving many issues. But although both Debian and, now that it is completely open, Fedora have extensive build frameworks for rebuilding each system with many, many options that rival "ports" distros, they still make a lot of assumptions and do not offer many inherent features of ports. E.g., referring back to the previous statements, the "profiles" tend to be pre-determined for each "packages" distribution (although this is clearly less of an issue with Debian than with Red Hat, more for reasons of approach than anything, each having merit).

- System footprint -- ADVANTAGE: PORTS

Now this one is very misunderstood.

First off, "ports" distros are very nice when disk space is limited. "Ports" distros make it the easiest to build a system with just what the system needs in software, whereas "packages" distros take a set of software and often build in the whole kitchen sink in case different systems need to use different parts of the program. This makes "ports" distros the most ideal for application-specific systems such as appliances and other solutions where disk space is at a premium. Although some embedded work will require a more "elementary" level than the assumptions the "ports maintainers" make, "ports" distros still make a far better "starting point" than common "build from scratch" approaches. Only when you reach a level where support tools are necessary for the development/targeting of the platform do "ports" distros fall short (e.g., targets where something like the embedded/loader/tool support in MontaVista Hard Hat Linux might be more useful).

Secondly, but differing, there is a common belief that "ports" distros are significantly faster on newer hardware than "packages" distros. It really depends on how the "packages" are optimized, but leading edge "packages" distros tend to build all software as optimized for the most common platform. In the case of "extensions," such optimizations are not typically a compile-time function, but the design of the software itself. E.g., using SSE units in a processor instead of the ALU or FPU is a decision made by the software, not the compiler, because SSE is not as precise and could significantly and adversely affect calculations (negligible for games, detrimental for sci/eng and sometimes system calculations). "Packages maintainers" and "ports maintainers" are in the same boat -- without re-writing the software, the former typically find themselves building the packages configuration to support _all_ extensions and optimal performance, while the latter allow end-systems to build as optimally as possible for one system.

[ TECHNICAL NOTE: I recommend against throwing the -O3 switch system wide. On Intel P3/P4 with 1/2 SSE pipes, respectively, you will see significant ALU/FPU calculation errors. And even on the AMD Athlon-Opteron, which uses its precise, pipe-abundant 3+3 issue ALU/FPU to do SSE, -O3 often causes risky optimizations that interfere with its built-in out-of-order execution and register renaming. GCC defines -O3 to explicitly try optimizations that are "risky." -O2 is the "safest, recommended" setting. As much as "ports" advocates might claim that a ports distro has an advantage by using -O3, they are not only incorrect on the advantage due to lost stability, but _all_ major "packages" distros can be completely rebuilt with -O3 (let alone different -march/-mtune settings) with their respective build systems (e.g., the dist-tools for Fedora). ]

But reversing that discussion, the previous assumes you are talking popular, modern systems. If you are talking early hardware platforms that are not optimized like today's systems, definitely, "ports" distributions will perform much better. E.g., Red Hat currently builds its Fedora Core 2+ x86 releases for i486 ISA (instruction set architecture, essentially i386 + FPU + TLB -- the TLB being the major boost over i386), optimized (scheduler, registers, units, etc...) for Pentium 4 (which also is optimized for Athlon), and these will perform less than optimally on less than a Pentium 4 or Athlon (although they will still boot). Even Red Hat Linux 8 (7.3 too?) was built for i486 ISA and prior was i686 (Pentium Pro and K6, but _not_ Pentium) optimized.

This is especially true if you have a true Pentium or Pentium MMX (and not Pentium Pro/II/III/4, K6, Athlon, etc...), which has optimizations that are actually "de-optimizations" on other processors, even i686+. A "ports" distro like Gentoo is probably the highest performing distribution for true i586 platforms. Same deal with an "embedded," 500MHz Pentium II/III "class" processor that only has an i486 ISA and does not support the i686 ISA, or is at least not designed to be i686-class superscalar. AMD SCLAN, Cyrix 6x86/M1, IDT Centaur WinChip/2, SGS-Thomson, and several other processors, available in even 500MHz+, are actually only i486 ISA. And while even the AMD/Cyrix/NS M2/Geode, IDT Centaur WinChip4 and the Cyrix-Centaur evolution into the VIA C3 might support the i686 ISA, they are clearly not i686 class superscalar designs like the Intel Pentium Pro-P3 or AMD Athlon-Opteron that will do well with -march=i686 (let alone -mtune=pentium4/athlon).

These latter points with older/embedded hardware are clearly cases where compile-time ISA/optimizations will have a significant impact on performance, and "ports" _do_ offer advantages -- at least advantages that are inherent to the process (whereas a "packages" distro requires their build tools, and might "break" if i686 is required for some things).

- Legal Redistribution -- ADVANTAGE: PORTS

Another obvious advantage for ports is that some legal issues are bypassed. For example, while some software is free, much of it is not freely redistributable. Packages maintainers may not have permission to repackage and redistribute software, and by providing another entity's software in a package repository, that repository could be liable. Not so with ports trees, because in the same capacity that an individual ports distribution system can fetch the build information from the ports tree/repository, it can also fetch the software from the vendor's site. With the exception of any click-thru agreement requirements, there is no redistribution involved, other than the support files (which are typically not from the vendor anyway).

Now it may seem there are no negatives to "ports" distros other than the build-time effort. Depending on your viewpoint, this may be correct. But let's re-visit an old friend of engineering and key to configuration management, "integration testing."

PORTS v. PACKAGES: INTEGRATION AND REGRESSION TESTING

Many Open Source advocates claim that all bugs are shallow with the availability of code, and this results in better code. Furthermore, they claim that the availability of code makes it easier to test software with other software, and change either software to accommodate each other. For the sake of this article, even if you don't agree, assume these are true facts about Open Source. Why do I ask this? Because Open Source projects on their own still don't solve the final detail of building an OS distribution: integration testing.

Integration testing is usually the primary reason why engineering projects slip. This is no different in software than in anything else. You could have teams designing the best software components for use in a system, and even sharing interfaces and standards with each other (which certainly helps), but until you start actually integrating the different components as a single unit, there is no guarantee they work with each other. While one project may involve many others, the thousands of packages in a single distribution each bring their own assumptions to the table, and differ on them, so integration testing is necessary.

There is absolutely no guarantee that a "packages" distro will offer any integration testing. Packages distros can just plunk out packages from different Open Source projects, hoping the developers have built their software to work together with other software the "packages" distribution has chosen -- possibly something the original software developer didn't think of. There are countless "packages" distributions out there, many forks of others, or partially based on the prior works of others. Many are innovative, many are offering new packages as standard, but they all fall under the same need for integration testing.

Some "packages" distros do more than others. Some "packages" distros are known not only for their integration testing, but for their integration plus regression testing approach. For example, both Debian and Red Hat have a 3-tier development approach for every new distribution. Now Debian and Red Hat's cycles differ greatly, but they use the same set of package regression, distribution integration and distribution regression testing ...

Package Regression: Done by maintainers, then put to ...
Debian Unstable, Fedora Development (fka Red Hat Rawhide), etc...

"Package maintainers" do their own package-level regression testing. Sometimes they do it privately or as part of their team, other times they release it into the "new package repository" of their respective distros where others can regression test along with them. Many times they are doing regression testing and building upon prior regression testing on each revision or patch level of software. E.g., Red Hat regularly keeps up to date with kernel developments and keeps applying patches to kernels, and releasing their continuing regression tests of their patched kernels, typically within days after kernels are released from kernel.org.

Distribution Integration: Done with "new package repository," eventually becoming ...
Debian Testing and Fedora Test (fka Red Hat Beta)

As packages are dropped into the new package area, both the distro and the users who are running those packages as a whole (e.g., Debian Unstable or Fedora Development) will quickly, and most definitely, run into integration issues between packages. As such, integration testing will occur, the changes that need to be made to those packages are made, and the packages are put back out in the "new package area" to be downloaded, tested, etc... again. Eventually, at some point there is a freeze on new package submissions as most integration issues have been worked out. And thus a formal integration test begins, which is what Debian Testing and Fedora Test are. Whether the changes and releases are formally iterated or just done over time, the integration test quickly turns into a series of ...

Distribution Regression: Done by distribution release and community testers

How well this is done, I am not here to tell you. I am merely pointing out that when a "packages" distribution goes to build a release of end-user usable binary packages, this is the model they use. Many people debate whether or not Fedora/Red Hat's 2-2-2 (Development/Rawhide-Test/Beta-Release) @ 6-6-6 (0-1-2 -> Enterprise) month model is better than Debian's more direct 6-6-6 model. But it is clear that one of the reasons why Red Hat "pushed adoption" of things like GLibC 2, GCC 3 (and now GCC 4), kernel 2.4, 2.6 (and backports) is because they push out community revisions every 6 months, with the first of a new 18 month cycle being a ".0" release (in old Red Hat Linux speak) with a lot of things changed.

In most "ports" distributions, there are typically no formal binary releases, because most are an "always current" release. Now cases can be made to show that many "ports maintainers" take the time to regression test a software release before adding it into the ports tree/repository, and there can be, and often is, ports distro-wide, formalized integration testing at times -- especially for major changes. And some may argue that with small changes, the 3-phase approach of the more popular "packages" distros is overkill and inefficient, and this holds true in the eyes of many.

And lastly, some would argue that "packages" distros are trying to solve a problem that was introduced by the commercial software model, whereas "ports" are just a return to the foundation of UNIX ... source code.

ENTERPRISE CONFIGURATION MANAGEMENT

SuSE was the first vendor to introduce an "enterprise" specific distribution. Before SuSE Linux Enterprise Server (SLES), Red Hat was the first to offer a Service Level Agreement (SLA) with Red Hat Linux 6.2 "E" -- the last 6 month revision (a total of 18 months of releases) in the 6.x series. The corporate Linux world responded by awarding SuSE with sales for its "separate" product, and Red Hat was left scrambling to introduce the same (which they did with Red Hat Linux Advanced Server 2.1 based on Red Hat Linux 7.2, and refreshes of different products with Red Hat Enterprise Linux 2.1 based on Red Hat Linux 7.2/7.3).

In a typical enterprise, there is this common tendency to trust something that is fixed and shrink-wrapped. After all, with any arbitrary Linux, how do I know if it will run Oracle -- at least without doing much research? But despite all the belief in how "out-of-the-box usable" supposedly fixed and shrink-wrapped software is, organizations still do formal configuration management whereby they install software, test it for their applications, etc... before rolling it out. So not only is a "packages" (or other binary) distribution doing their own configuration management prior to and after release, but the end user is as well.

So at what point are the duplications not worth making?

This is really a question left to organizations who believe they can answer it best for themselves, and this is probably an understatement. Organizations who believe they can do a better, more relevant and custom job of assembling, building and supporting Linux will see little value in an "enterprise" Linux. In fact, there is much proof of this in the fact that BSD UNIX is still far more popular than people believe, and it's only the marketing and product availability and resulting perceptions that say otherwise -- just like more community endeavors including Debian and Gentoo versus Novell and Red Hat.

But what I can tell you is the strategy of each project or organization, which then leaves the decision up to you. First off are Novell and Red Hat, who have seemingly taken turns mirroring each other.

SPLIT PACKAGE DISTRIBUTIONS: SuSE and Red Hat

Both SuSE and Red Hat's core models have been more similar than different. Each maintains 2-4 "revisions" of a largely "binary compatible" series over 18-24 months. The last release is typically the most compatible with the ones before it, and the most used and tested and therefore stable. Both "push the envelope" in their ".0" releases, purposely changing the core kernel, GLibC, GCC and/or other components, sometimes even adopting "beta/pre-release" or "backporting" software for the first revision, because the release version of the software will most likely be out for later revisions.

As mentioned previously, Red Hat attempted to keep its product unified by offering Service Level Agreements (SLAs) on its Red Hat Linux 6.2 release, known as 6.2 "E" for enterprise. Red Hat has always maintained the status that "what we ship, we support" and would not entertain extra packages.

SuSE, on the other hand, often included the kitchen sink, whether they supported it or not. Starting with SuSE Linux 7, SuSE released a subset package release built for enterprises and offered SLAs with its contents and, unlike Red Hat, called it a different distribution: SuSE Linux Enterprise Server (SLES).

Red Hat quickly followed SuSE's lead as the corporate world was willing to believe that a separate enterprise product was better. Red Hat still maintains its 6-6-6 release model in Red Hat Enterprise Linux, but they hide it more. This release model, and the 2-2-2 dev-test-release sub-model, has not changed. And the same developers who work on Red Hat Enterprise Linux will work on Fedora Core, because there is a 1:1 package relationship, just like with Red Hat Linux prior.

Despite belief to the contrary, Red Hat never supported releases long term (typically only the last ".2" release and the current 1-2 -- about 18 months), and only when Red Hat Linux 6.2-7.1 became popular did Red Hat indulge in supporting up to 6-7 revisions simultaneously -- something they eventually found was a great waste of resources. As of Red Hat Linux 8, Red Hat officially declared they would only support revisions for 1 year (basically only 2 back), which was really just a clarification of their past model since Red Hat Linux 4.2 (4.x/5.x were the first "modern" releases under the 2-2-2/6-6-6 approach -- "Rawhide" being formally introduced around the 5.x release cycle). Red Hat never offered Service Level Agreements for Red Hat Linux (except 6.2 "E", their "RHELv1" retroactively if you will), and RHN access wasn't dropped; direct updates via up2date were just opened up for free.

COMMUNITY AND SERVICE: DEBIAN AND PROGENY

I have already discussed the many advantages of Ian Murdock's Debian Project, many of which continue today. About the only note I can make from a political standpoint is that some of the same "advantages" that were argued by many Debian proponents against RPM distributions (more packages, better/automatic updates, no dependency hell, etc...) are now the same arguments that Gentoo users are making (thanx to its "ports" model). In reality, as always, technologies are key to understanding differences in Linux, and they should not be made into a marketing stick, because many technologies do overlap distributions.

E.g., there are even some "packages" distributions that are starting to use some collections of software via "ports" (e.g., Perl, Java, etc...) repositories where formal configuration management is less of an issue (e.g., especially during development, hence development software), as well as more "pre-packaged" binaries for "ports" distributions that are good for specific (e.g., modern) hardware to reduce build-times.

Debian is still a single project and set of releases. Although many independent, commercial distributions are based on it -- from Xandros (fka Corel) to Lindows (actually based on Xandros), Ian Murdock did not create a commercial distribution from Debian, but a commercial endeavor, Progeny. Progeny markets itself as the "Linux Platform Company," and clearly sees Linux configuration management as a service it can provide to organizations to help themselves best. After all, even when most organizations buy "shrink-wrapped" or "enterprise" versions of software, they still tend to do their own integration testing (if not regression testing as well).

So it's not surprising that larger organizations like the City of Munich picked Debian after an initial evaluation of Windows versus Linux, in which Novell-SuSE with IBM Global Services was the Linux option. Now even Murdock has confirmed in his blog that, between the initial OS choice and the final decision to go with Debian, the city was swayed by the fact that SuSE was no longer SuSE AG, its own German corporation, but now a division of Novell, Inc., an American corporation. But that on its own would not and could not sustain the City of Munich's endeavor (with ~70,000 desktops) with Debian.

The reality is that many believe that you do not need an "enterprise" distribution to roll out Linux, because you're often going to be doing "configuration management" anyway that involves integration testing. After all, even if the "enterprise" distribution goes through a lot of regression testing and offers formal Service Level Agreements (SLAs) down to 4 hours (and even lower), they still only offer this on their fixed, limited set of software, and most corporations run more than just what the vendor supports.

[ SIDE NOTE: Which is why I often say if you choose Microsoft solutions, stick with only Microsoft software -- across-the-board. But if you find it is limiting, you shouldn't just limit yourself to software that only runs on Windows, and consider a more "portable/open" future. ]

At the same time, people still do purchase SLES and RHEL because it runs the few, certified applications. Which is why it is not surprising that one of the new endeavors for Murdock's Progeny is to ensure Debian Linux is binary compatible with Oracle and other commercial software that normally runs on RPM. The idea here is not to create an "enterprise" distribution of Debian, but a Debian configuration that is ready for select applications that are certified against select, marketed "enterprise" releases like SLES and RHEL.

As I have said repeatedly, Ian Murdock is ahead of his time and the engineer in myself completely appreciates the unleashing of his foresight onto the corporate Linux world.

FOOTNOTE: SMARTPM AND EMERGE FOR EVERYONE?

This blog ended up being a lot longer than just a simple discussion of packages v. ports, eh?

The reality is that with any innovative, technical approach, everyone has to accommodate. "Packages" distros are already starting to use "ports" mechanisms for some portions of their distribution (e.g., development software like Java, Perl, etc...). Gentoo's excellent "emerge" is commonly used now for such, even atop Debian and Fedora (and even found in repositories for both now!). Likewise, "ports" distributions, including Gentoo, are starting to offer good portions of their systems in "binary" form that is tested to various levels.

With that said, even the "packages" world is undergoing a unification of sorts. Progeny already has its "Componentized Linux" which makes support of Debian and Fedora systems uniform (even if package exchange/support is not quite). And many endeavors to port APT-RPM to other platforms, like Solaris, for unified configuration management have actual, major usage at universities like Rutgers. And even the "front-end" for package management systems is starting to unify if Mandriva has anything to say about it.

Mandriva, formerly RPM-based Mandrake, gobbled up both RPM-based Conectiva and DPKG-based Lycoris. The same people at Conectiva who first adopted APT-RPM have now introduced SmartPM -- a near-universal front-end for package management systems (e.g., DPKG, RPM, Slack, etc...) that set out to solve at least three major issues from the get-go:
- Advanced dependency resolution (improvements over both APT and YUM)
- Multiple repositories (beyond what APT's pinning can do, and can even support multiple repository formats) and
- Comes with a variety of interfaces as standard (CLI, CLI+shell, GUI like APT's Synaptic, etc...).

It's still in testing but many people are switching, including DAG which is one of the largest APT/YUM-RPM repositories for Red Hat Linux, Fedora Core and Red Hat Enterprise Linux (as well as CentOS). The FAQ is extremely enlightening, including more information on all the support, as well as case studies on how it approaches issues with the APT and YUM systems differently (and more completely) while supporting their repository formats so it can be used today without any change:
http://www.smartpm.org/

2005-07-26

'Ritters: Because most NAT/PAT devices are NOT Routers!

As you may or may not know, I've been cringing at the term "Routers" applied to NAT/PAT devices that do only basic IP forwarding at most. Several months ago I ran into a potential client who couldn't understand a thing about _real_ "routing," routing tables, their distribution, etc... when it was clear they needed routing on a multi-subnet network (let alone with their VPN). Since then I've been trying to evangelize, even if only from my closet.

But at least someone at Wikipedia knows my plight (among countless other professionals) as someone just pointed out to me the other day:
http://en.wikipedia.org/wiki/Router

"... These are not "routers" in the true sense, but the terminology has been confused with network address translation."

I had suggested "NFR" -- for "NAT-Forwarding Router" and, slangwise, "Not a Fscking Router." But acronyms aren't always the most ideal. E.g., I hate the terms FOSS, FLOSS, etc... that try to reconcile the Stallman insistence on "Free Software" ("free" not being appropriate even in the mid-'80s IMHO) with the "Open Source" movement. Frankly, I think the term "Freedomware" is far more appropriate, since it clearly tells many people that it does "cost" in some way, but it's typically worth the vigilance. ;->

Last week I finally thought of one for these NAT/PAT devices, although I forgot about it until someone brought it up on a list. I just slurred my speech into a stereotypical southern drawl and came up with "'Ritters."

Like [c]'Ritters to a house, one is fine, but many are a nuisance. They are something you want to keep out of your house, and definitely off your enterprise network. More than one is an infestation you can't control as they mass produce off-spring (unnecessary IP issues). The few professionals I used it in front of last week really liked it.

And when I hit Wikipedia (after someone pointed out the NAT and Router references), sure enough, 'Ritters has another meaning. It is the lowest title for lower nobility in German history. That would also apply, as translation is the lowest and most uncontrolled form of packet/transport (Layer-3/4) forwarding. Wikipedia claims it is the equivalent of a Knight, which would also make sense. One Knight alone is fine in a house, but two loners without any alliance or way to communicate (no actual routing protocols) will cause the same issues. Which is why you need a protocol, like a routing protocol (RIP, OSPF, etc...) as a "round table" for Knights who work together.

And the difference between a Router and a 'Ritter? There are $200-300 Routers out there, occasionally even at the Superstore, that also have RIP and, gasp, even OSPF setup and support. But most of them are 'Ritters, with absolutely no support for multiple subnets, distribution of routes, etc... which are a PITA with SMB networks, VPNs, etc... 'Ritters are ultra-simple Layer-3/4 NAT/PAT devices that can only act as a "default route" for *1* subnet without causing all sorts of network headaches (e.g., useless ARP, added overhead, etc...).
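
For anyone who hasn't seen what a routing table actually buys you, here is a minimal sketch in Python of longest-prefix-match forwarding. The addresses, table contents and the next_hop function are all made up for illustration (no real device works off a Python list), but the lookup rule is the standard one: the most specific matching route wins.

```python
import ipaddress


def next_hop(routes, destination):
    """Pick the next hop using longest-prefix match; None if nothing matches."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if dest in net]
    if not matches:
        return None
    # The route with the longest prefix (most specific network) wins.
    _, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop


# A real router knows about every subnet (hypothetical example addresses).
ROUTER_TABLE = [
    (ipaddress.ip_network("0.0.0.0/0"), "203.0.113.1"),         # default route to the ISP
    (ipaddress.ip_network("192.168.1.0/24"), "direct"),         # local LAN
    (ipaddress.ip_network("192.168.2.0/24"), "192.168.1.254"),  # second subnet
    (ipaddress.ip_network("10.8.0.0/24"), "192.168.1.253"),     # VPN subnet
]

# A 'Ritter only knows its one LAN and the default route.
RITTER_TABLE = ROUTER_TABLE[:2]

print(next_hop(ROUTER_TABLE, "192.168.2.10"))  # 192.168.1.254
print(next_hop(RITTER_TABLE, "192.168.2.10"))  # 203.0.113.1
```

With only a default route, traffic for the second subnet gets thrown at the ISP instead of the internal gateway, which is exactly the multi-subnet/VPN headache described above.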

NT 6.0 Longhorn "Technology" Hasta La Vista: The Vaporware Sequel to NT 4.0 Cairo

Well, last week the rumors came out and Microsoft finally announced NT 6.0 "Longhorn" is going to be named "Windows Vista."

Here's the "new view" that Vista de la Windows delivers:

1. INDIGO Services
... cricket, cricket, cricket ...
[ Yes, that means _nothing_ else! ]

- What is INDIGO?

Indigo is a sandbox of .NET services atop of aging and "Chicago" polluted Win32. In other words, XML/XSL and other content/style and related services over HTTP and other Internet protocols. Win32 remains unchanged from its evolution from ideal, single user security to "Chicago"-Internet Explorer infested DLLs with no security.

The idea here is that Microsoft will sandbox all these Internet services away from the normal "LAN" services we already have had since NT3.x, 4.0 and 5.0/2000, as Win32 with Access/MSDE still remains the foundation for ADS, MS-RPC, SMB and even Exchange (its ESMTP service, for Internet e-mail, remains a wildcard -- see MONAD).

Now how is this any different than a Java sandbox of services atop of Win32? It isn't. But Microsoft couldn't call it Java(R), as they lost the rights, but not the code (Java 1.1). And Microsoft has re-licensed Java from Sun, in their new alliance (or didn't you know what that was all about and not some anti-Linux thing?).

And best of all, that's only in "Longhorn Server" now "Windows Vista Server," due in 2007.

Now, with all due respect to Governor Aaarnold, Longhorn is turning into one of Microsoft's best sequels ever!

The original movie,
- NT (New Terminator) 4.0 "Cairo": First Vaporware

Totally had the world fooled. The supposed "Consumer NT" was totally over-promised as "Cairo." Its features eventually became "Cairo" technologies and eventually VAPORWARE! The CairoFS that was supposed to fix design flaws in NTFS died, along with the promise of a "real" OpenGL desktop, as DirectX had taken over.

Now we have the new blockbuster ...
- NT II -- 6.0 "Longhorn": Windows Hasta La Vista

Let's review those Longhorn, now WinFX, technologies that NT has said "Hasta La Vista" to ...

2. WINFS, fka CairoFS
3. AVALON Desktop with DirectX 10, now known as WGF2.0
4. MONAD Shell

Will these former Longhorn features, now "WinFX Technologies," ever see the light-of-day? Hmmm, history lesson ...

Win32 was 1991, NT 3.1 was 1993, and Gates gave the go-ahead to "Chicago" (DOS/Win), not NT/Win, in 1994, which resulted in NT 4.0 "Cairo," as promised, _dying_, because people were still developing for "Chicago." Win32 died as an API, infested with "Chicago," and NT 4.0, released in 1996, was rather pathetic compared to its promises. All the separated "Cairo Technologies" died off into VAPORWARE.

.NET came about in 2001, .NET Server nka Windows Server 2003 came out in 2003 and was _nothing_ of .NET, and by 2004, the MS Office and Internet Explorer teams were successful in keeping the Win32 codebase and _preventing_ .NET security and other APIs from going into the _core_ of Visual Studio .NET (Microsoft's own applications division, not Independent Software Vendors, ISV, is the "core problem" with the lack of new API adoption). And just as with "Cairo," all of the so-called NT 6.0 "Longhorn" promises became separate "technologies" -- basically all but "Indigo," which Microsoft already had from its Java 1.1 codebase now C#.NET (although many things were missing, hence their Java re-license recently).

But now is also the time of the Internet and Open Source. From the looks of it, Microsoft is releasing many of these technologies in Betas and to MSDN developers. It's clear that WINFS and MONAD are _CLEARLY_BROKEN_ and they _BREAK_COMPATIBILITY_ in the OS rather significantly. So maybe Microsoft is hoping that others will test, modify and fix this?

Maybe something that is "released, known, but not supported," like the utilities and services you'd find on your "Resource Kit" CD or other downloads, even from Microsoft. It will be interesting to say the least.

But until then, what are these technologies, at least as promised???

- What is WINFS?

Most people don't know that WinFS isn't really about giving you a cool database in the filesystem. It's largely about fixing bugs inherent in the design of NTFS, without breaking applications. Where NT 4.0 Cairo, which became the separate "Cairo Technology" known as CairoFS, failed and turned into vaporware, NT 6.0 Longhorn, which became the separate "WinFX Technology" known as WinFS, doesn't look much better. Why?

It breaks all sorts of compatibility. NTFS has turned into a total support fiasco with hacks upon hacks, all because Microsoft made a serious boo-boo in design approach. It's a major reason why Windows domains exist, because each NTFS installation is tied to the Security Accounts Manager (SAM) of the registry of the system that created it. Domain Controllers (DCs) are merely systems that make their registry-SAM information available network-wide, which doesn't totally solve the problem.

For more on this history, see my recent SLUUG presentation on "Low-Level Interoperability" Part 3 (temporary location):
http://www.geocities.com/thebs413/SLUUG_LowLevelInterop_Part3.pdf

Conclusion: WinFS is VAPORWARE like CairoFS before it

- What is AVALON?

Well, it's really 2 things.

One, the concept of Avalon really breaks a lot of legacy and (yet more poor) NT design approaches. One is the insistence by Gates himself that NT not be bootable into a Command Line Interface (CLI) mode, even though it uses CMD.EXE from OS/2. This gave rise to the Graphics Device Interface (GDI). Although Microsoft has added some modes in NT5.0/2000+ like the "Command Console," 99.999% of Windows programs rely on the GDI, and won't work except under a full GDI (including the _core_ NT kernel/DLLs like the registry and other access libraries). Avalon is the first attempt to transition new programs off of the GDI requirement.

Two, Avalon is also Microsoft's attempt to produce a desktop framework that leverages the Graphics Processing Unit (GPU). For those who have experienced MacOS X's QuartzExtreme, or maybe even FreeDesktop.org's Cairo on Linux (not to be confused with NT 4.0's codename) or maybe Sun's Looking Glass (which is far more than just video/display -- a quantum leap in total 3D computing, long story), you have seen what using the GPU and storing windows as "bitmap planes" can do. It's not about "eye candy," but about moving the load of overlapping windows from main memory maps and the CPU to the GPU, which is explicitly designed for it.

It is far more efficient to represent a 2D window as a bitmap on a 3D surface inside of the GPU, using those built-in geometry functions of your video card, instead of using main memory and the CPU interconnect/processing rather "dumbly." It massively increases speed, and allows far more tricks to be had for a _reduced_ memory and CPU footprint. Let me say that again, it _reduces_ the memory and CPU footprint for menial tasks, and basically adds "eye candy" for free.

Now we come to AVALON. As you will read in the near future, Microsoft _is_ supposedly shipping Avalon in NT 6.0 "Longhorn" client, now Windows Vista (Client? Home? Pro?) in 2006. What Microsoft doesn't tell you is that it will be coming with what they now call Windows Graphics Foundations (WGF) 1.0 or 1.1. So what is WGF? WGF 1.x? WGF 2.0? More on that in a bit ...

MacOS X's QuartzExtreme, FreeDesktop's Cairo and Sun's Looking Glass all use OpenGL, typically either 1.3 or the newer 2.0. OpenGL is a true and full geometry setup, designed for professional graphics from the outset. Even Microsoft standardized on OpenGL _until_ they couldn't get it to run on "Chicago" and its 386Enhanced mode (switching constantly between Real86 and Protected386 -- yes, Windows 95/98/Me _still_ do this!), so that's where the attempts with WinG, then DOS Direct Memory Map aka DirectMM aka Direct2D, and eventually DirectX came from. DirectX came from the other direction: "What 3D do we need for games now? We'll add professional graphics later."

Microsoft has done an excellent job swindling IP out of major OpenGL holders in the last few years (so much so that Microsoft's #1 IP strike at Freedomware will be in the 3D space), and DirectX 7, 8 and 9 have gained more and more geometry, T&L and other OpenGL capabilities. But DirectX 9 still isn't capable of many features required for Avalon. So what is WGF 1.x?

WGF 1.x is _still_ DirectX 9, as it can handle the new Avalon system as best as it can. It's largely a combination of some hardware GPU functions with a lot of software "back-fill." The performance of Avalon on full eye-candy on WGF 1.x that will be shipping in Windows Vista in 2006 is rather _pathetic_. Which is why there are 3 options, and most people will want to choose either the legacy Explorer or middle (some Avalon features) settings.

WGF 2.0 is basically what DirectX 10 was going to be, with the full support for Avalon off-loading to the GPU and its 3D framebuffer. It will _not_ ship with Windows Vista and is projected for release some 18 months later, probably after Windows Vista Server, circa late 2007.

Conclusion: Avalon will _suck_ until WGF 2.0 comes out. But WGF 2.0 will come out and won't be vaporware. Microsoft has assured that applications will be written for Avalon by including WGF 1.x in the initial Windows Vista, although tools in the current Visual Studio suite still remain an issue.

Further Predictions: It is my hope that Avalon will eventually make it possible to run Windows software much "lighter" than the current Remote Desktop Protocol (RDP) and the Independent Computing Architecture (ICA) invented by Citrix. Long story short, RDP/ICA rely on MultiWin, first used by Citrix in NT 3.51 "Daytona" and NT 4.0 Terminal Server, then standard in NT5.0+. MultiWin virtualizes multiple GDI instances, which allows Windows programs (which normally require the single, _physical_ GDI hardware) to run -- and then be remotely displayed.

Of course, as with anything Microsoft, the RAD tools in Visual Studio will abuse the WGF/Avalon APIs, and probably render this improbable and introduce a new set of compatibility issues with remote display (which companies like Citrix will yet again fix ;-).

- What is the MONAD Shell?

One thing NT continues to be plagued by is the lack of _good_ shell scripting tools and communities built around them for _system_ level functionality. Yeah, Microsoft has turned Basic into a scripting language for users with virtually _no_ security, so when turned into forms like Windows Scripting Host (WSH) at the shell, or ActiveX for the web, security was a _joke_.

Now there is ActiveState, which Microsoft quickly funded after their Perl::Win32 distribution became popular, but it's clearly not native. Various POSIX/UNIX environments like Cygwin (complete POSIX emulation layer) and MinGW (Win32-POSIX conversion libraries) and even Microsoft's own licenses of MKS' and Interix toolkits are rather non-native and incomplete. It was clear Microsoft needed a new shell environment, one that was ADS integrated, full .NET API/security-based, powerful and maybe could even challenge traditional UNIX mindset and features.

MONAD was the project. Unfortunately, like Cairo Technologies in NT 4.0 before it, it has been segmented as a WinFX Technology and not part of Longhorn nka Windows Vista.

MONAD could serve some serious needs for Microsoft.

#1 is my personal favorite, the ESMTP parsing logic in Microsoft Exchange. Back in 1999, I found serious RFC821/822 non-compliance issues in the ESMTP service of Exchange 5.5. In a nutshell, a small, remote office of my company (supported by a Microsoft Solutions Provider, who would quickly become a source of great disgust) had a Windows problem tracking system that was spitting out malformed (and even correctly formed) SMTP. Since I controlled the DNS and MX records for the domain, I regularly got the finger pointed at me, but I eventually started throwing raw, properly formed and malformed SMTP strings at it, based on what the problem tracking program was trying to send it on the local host.

I was able to _crash_ the NT server, and _reproduce_ it.

I got _no_where_ with Microsoft support, the MSP staff, and I tried to escalate it to report it as a serious bug. No dice. Eventually, when I wanted to take it to CERT, I was basically told not to. So I left it.

Apparently the parser is still at the heart of Exchange 2000 and even 2003, allowing even a full and remote root exploit in 5.5 and 2000, and crashing of Exchange 2003. As much as Microsoft has now contacted me (ever since they _really_ got "serious" after the SQL Slammer worm of January 2003 -- almost a good year after they got "fake serious/marketing gimmick" about security), I have lost all my notes.

I'm sure one benefit of MONAD was to create a _secure_, .NET shell environment for Exchange, among other services. Even if Exchange's non-ESMTP services are still Win32, and not .NET, many of the Internet-facing components, and add-ons, could use the environment and be better "sandboxed." Because right now, Microsoft has virtually _no_ formal environment for secure scripting, and the best way to get that would be to build a .NET environment (which is very UNIX-like in security, as Miguel de Icaza and other Ximian developers behind the Freedomware, .NET-based Mono development environment feel) for scripting.

Conclusion: Apparently MONAD is still in development and while it might _never_ be an official Microsoft subsystem, it might be limitedly used without warranty/support. And it wouldn't surprise me if it is quite unknowingly in various Microsoft Internet services, like the ESMTP service of the next release of Exchange (2006?).

As always, as an original NT 3.1 beta tester myself, I have to admit Microsoft has come up with some excellent APIs and technologies in their history -- Win32 and .NET are two of them. But at the same time, not only were they not original, but more importantly, Microsoft _never_ adopted them themselves! They are the king of reuse, and .NET has died, just like Win32 has died, at the hands of security-ignorant, DOS/Win "Chicago" code. That code caused Microsoft to abandon even the basic Win32 security model. It is why you must run as Power User or Administrator, and why NT5.1/Windows XP was "hacked" so programs could run that didn't under NT5.0/Windows 2000, then "unhacked" back to NT5.0/Windows 2000-like settings in Service Pack 2 (SP2). And the Chicago-designed Internet Explorer, implanted into Visual Studio in the mid-'90s (and anything built upon it), is why you will _never_ rid Win32 of _core_ DLL security issues.

At this point, Microsoft is only offering a "sandbox" .NET environment in Indigo. And it's way behind even Linux/Sun in desktop technologies, let alone well and heavily behind Apple.

No wonder Marc and other long-time Microsoft architects have left for Google out of _pure_frustration_!

2005-07-25

BS Acronym of the Month: F'LOST-ME

I rather tire of the "community software" world trying to invent and re-invent terminology. It's the one thing that defies everything that is the Internet, UNIX and other details ... perpetuality. If I wanted terminology that changes every 5 years or so, I'd form my own software company, start buying and marketing the 3rd best product as the one you need, and change names and terms every so often to drive sales, training partnerships and general industry awe (along with the occasional, bothersome government agency).

So far, the "community software" world has come up with the following names ...
- "Free Software," which wasn't even viable in the early '80s when Stallman came up with it, because it connotes free in cost, and not free as in freedom, which it really is
- "Open Source," which was a great idea from several sources, and has now become the greatest marketing gimmick for commercial companies who want to say, "no, we aren't proprietary"
- "Free and Open Source Software (FOSS)" -- great, another acronym to learn!
- "Free, Libre and Open Source Software (FLOSS)" -- the nitpickers have won?

Which is why I refer to this sprawling mess as ...
"Free, Libre and Open Source Technology, Mucho Expresiones (F'LOST-ME)"

Now I'm open to anything, but I have, for a long time, been stressing "Freedom Software" based on Stallman's original intent -- to think of "free" as in "freedom" instead of "cost." But I just added the "dom" so I'm not like Stallman, arguing for no reason. Kinda like adding "White Hat" in front of "Hacker," instead of insisting that all "Hackers" are good like some still do. Take the extra 1-2 syllables to save yourself 1-2 minutes of explaining. And who knows, you might actually coin a term, eh?

Over the last year, I have shortened it to "Freedomware," which is much better than "Freeware," which is what more and more people call Linux (or they still don't know the difference from "Shareware"). At least "Freedom" in any context tells someone (at least in the West/English-centric) that:
1. It does have "cost"
2. It probably is worth doing, despite any sacrifices
3. It requires "vigilance" (against vendors) to keep

I've also heard the term "Libertyware" as a viable name, and I'm open to it as well, especially if it translates better to other Latin (and even more so non-Latin) languages.

I went into further detail in my 2-part 2005 January and February articles in Sys Admin magazine:
http://www.samag.com
Please note that both were poorly edited (largely done at the last second request of the editor for copy):
http://www.samag.com/articles/2005/0501/
http://www.samag.com/articles/2005/0502/

The article focuses on _Risks_ of software licenses and their adoption, and not some arbitrary political or religious (socialist?) context. People get so caught up on Open Source, Open Standards, etc..., often in a marketing context. Heck, the original concept of Open Systems is even more baffling to some. What is "open" and what is "proprietary" -- especially from a standpoint of mitigating risk to my data?

I quickly came to the conclusion several years ago that "open" and "proprietary" for software was about as useful as "liberal" and "conservative" in American politics. And trying to use some variants like "fiscally conservative" and "liberal freedom" was confusing, so a good 2D (4 extreme) model would do far better. Ironically, software licenses are more about social contracts, and it seems some common, natural law exists in a grand social theory.

In a nutshell, I apply a 2D axis map to "Standards" and "Source" of values "Open" and "Proprietary" to come up with a 4 extreme model of:
- Freedomware (Open Standard, Open Source)
- Standardware (Open Standard, Proprietary Source)
- Sourceware (Proprietary Standard, Open Source) -- very rare (or IP-inundated Freedomware)
- Commerceware (Proprietary Standard, Proprietary Source)
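The four-corner model above can be written down as a simple lookup table. This is only an illustrative sketch: the `classify` function and the lowercase "open"/"proprietary" strings are my own hypothetical encoding, and it reads the first corner as Open Standard plus Open Source.

```python
# Corners of the (standard, source) map, keyed by (standard, source).
CATEGORIES = {
    ("open", "open"): "Freedomware",
    ("open", "proprietary"): "Standardware",
    ("proprietary", "open"): "Sourceware",
    ("proprietary", "proprietary"): "Commerceware",
}


def classify(standard, source):
    """Map a (standard, source) pair to its corner of the 2D model."""
    key = (standard.lower(), source.lower())
    if key not in CATEGORIES:
        raise ValueError(f"expected 'open' or 'proprietary', got {key!r}")
    return CATEGORIES[key]
```

For example, `classify("open", "proprietary")` returns "Standardware": a product that speaks an open standard but keeps its source closed.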

Even stranger is what happens if you take a 2D map of a social construct, like software licenses or political parties, and map it onto a 3D sphere as its surface area. I call software at the converging point on the "dark side" of the sphere, regardless of whether it wraps around longitudinally or latitudinally, "unmaintained standard" or "unmaintained source." That is when software has reached a point of uselessness and your data becomes "hostage."
- Hostageware (Unmaintainable Standard, Unmaintainable Source)

I actually wasn't the first person to suggest "Hostageware," but I don't consider all proprietary software to be "Hostageware" from a risk standpoint (like the first person that suggested it), if the owner values the software and values you.

Kinda like how too much freedom, or too much capitalism, can actually result in "anarchy," the "dark side" of the sphere in the political model, as much as too much totalitarianism or too much socialism. And if I really want to get "nuts" in some grand unified theory -- beyond social tendencies, but into the natural, physical laws of the universe -- isn't it ironic that gravity is a singularity with a center, and all large objects form into a sphere, and we then live atop its 2D surface area? Okay, okay, enough stretching. ;->

I also came up with a variant of Freedomware:
- Commuware (Forced Open Standard, Forced Open Source)

Now I'm all for mandated open standards, but I still believe the best product can win. All too often people think "anything but Microsoft," at the expense of the good companies in between community software and commercial software. You don't offset the tactics of one entity by giving a preference to another, because not only do a lot of innocent bystanders get the fallout, but you actually _limit_ choice. Choice is freedom.

The reason why Linux and other Freedomware, a community-developed, public commons, works in a capital/free market society is because we choose _individually_. People choose to work together _individually_, and people then choose to use it as part of the community by _individual_ choice. Now while I do believe in mandating "open standards," I will _never_ agree to a mandate of "open source" -- which is when a "public commons" turns into "communism."

A classic example is one of the first "public goods" we had in this country (USA), the labor union (read this _completely_ before forming opinions -- I show why they went from "good to bad" ;-). It was workers who banded together, _by_choice_, to challenge abusive employers. Because people "choose individually," the good employers didn't see much union representation, but the abusive ones saw heavy resistance -- a "public good" balanced. Unfortunately, they became mandated. At first, this hurt the employers, because the balance was artificial. Then as employers went out of business, they lobbied for counter-laws to limit the power of mandated unions.

So now we just have a bunch of laws and counter-laws between companies and unions that ruin it for everyone in "closed shop" states. There is _no_ "free market balance" (other than closing shop and moving to another state/country). If Linux and other Freedomware is mandated, the same will happen. You will be inundated with laws and counter-laws that will destroy its autonomy and freedom to digitally assemble. That's why I very much dislike the "Commuware" movement, which is only a portion of the "Freedomware" movement. Linux will also stall without competition -- yes, even the community. Because it is forced, those mandates will drive things, not actual, individual choice and merit.

I'm here

I created this page.