20 August 2006

Intro to Client/Server Computing-3

The OSI Reference Model for Networks

(Part 1 & 2)

The Open Systems Interconnection (OSI) Reference Model specifies how data from an application in one computer moves through a network medium to an application running on another computer. The 7 layers are organized as media (1-physical, 2-data link, 3-network) and host (4-transport, 5-session, 6-presentation, 7-application). Media, here, refers to the mode by which data is transmitted over distance; e.g., over microwaves, coaxial cable, and so on. Host, here, refers to any computer on a network that is capable of running applications. All devices on a network, regardless of capability or description, are called nodes. (Nowadays, nodes that can't run a program are rare; they include dumb terminals.) So the "host layers" are responsible for the final communications between the network and the application (such as a web browser).

There are several fundamentally different networking formats described by the OSI reference model; the different formats are called suites, and TCP/IP (the Internet protocol suite) represents one such suite of 7 mutually compatible layers. Another, technically analogous suite is the UMTS cell phone standard, which also has seven layers--although not all of those layers are populated. Within each layer of each suite, there may be several alternative standards, which serve distinct purposes; for example, within the TCP/IP suite, the network layer (layer 3) includes both the Internet Protocol (IP) and the Internet Control Message Protocol (ICMP). Both are essential for the network layer to function.

These layers can also be called protocol stacks, rather like a seven-story building with several different offices on each floor. Since the division between individual floors is somewhat subjective, it's conceivable that empty floors might not be counted. Still, at some future date, they might be.

Bear in mind that the purpose of the OSI reference model is to furnish a typology of all possible types of data networks. It is not always the case that each type of data network will "see itself" in the same 7-layer model. For example, the TCP/IP model was developed separately from the OSI reference model; it lumps together the application (7), presentation (6), and session (5) layers; moreover, it refers to layer 3 as the "Internet layer" rather than the "network layer." Still, it is not hard to find the analogous layers in the simpler 5-tier model.

Two components of two different networks that occupy the same level in the OSI model are known as peers. Each layer in the OSI model interacts directly with the layer directly underneath, the one directly overhead, and (sometimes) with peers. So, for example, the data link layer (2) in a network interacts with the physical layer (1), the network layer (3), and other networks' implementations of the data link layer.
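
The layering described above can be sketched as a toy model in Python. This is only an illustration: the bracketed "headers" are made-up strings, not real protocol headers, and the function names are hypothetical. It shows each layer wrapping the data on the way down, and the peer layer on the receiving side unwrapping it on the way up.

```python
# Toy model of layering: each layer adds its own header on the way down,
# and its peer strips that header on the way up. Header strings are made up.
LAYERS = ["application", "presentation", "session",
          "transport", "network", "data link", "physical"]

def send_down(payload: str) -> str:
    """Wrap the payload once per layer, from layer 7 down to layer 1."""
    for name in LAYERS:
        payload = f"[{name}]{payload}"
    return payload

def send_up(wire: str) -> str:
    """Unwrap in the opposite order, from layer 1 up to layer 7."""
    for name in reversed(LAYERS):
        header = f"[{name}]"
        if not wire.startswith(header):
            raise ValueError(f"expected {header} header")
        wire = wire[len(header):]
    return wire

wire = send_down("GET /index.html")
print(wire)           # outermost header belongs to the physical layer
print(send_up(wire))  # the receiving application sees the original request
```

Each layer only inspects its own header, which is the sense in which a layer talks to its peer while handing data to the layers directly above and below it.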
Layer 1: Physical
The physical layer includes the specs for cables and connectors. All of the hardware and its technical requirements constitute layer one. Examples include serial connections, fiber optics (and connectors), coaxial connectors, DSL, and other tangible cable standards; also, standards for wireless transmission.
Layer 2: Data Link
The data link layer consists of the precise format for encoding data for transmission. Data needs to be transmitted in a particular format so that higher-order patterns can be communicated to the machines in the system. In the familiar TCP/IP suite, data is transmitted in packets corresponding to very specific parameters; at the data link layer these units are called frames. One of the more famous examples of this is Ethernet, which has very detailed instructions for the framing of packets. These packets can be likened to cars on a highway; in contrast, the continuous stream of data over a telephone line during a telephone conversation would be more comparable to a train, which occupies that length of railroad for the entire duration of the conversation. For data to be transmitted this way coherently, protocols must exist to define what indicates the beginning and length of a packet. Other protocols must exist to ascertain whether the packet is valid or corrupted.
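
To make the idea of framing concrete, here is a minimal sketch in Python. The frame layout (a one-byte start marker, a two-byte length, the payload, and a four-byte checksum) is invented for illustration and is not the Ethernet frame format, although Ethernet does use a CRC-32 check of its own. The names `make_frame` and `read_frame` are hypothetical.

```python
import struct
import zlib

START = b"\x7e"  # hypothetical start-of-frame marker, not a real Ethernet preamble

def make_frame(payload: bytes) -> bytes:
    """Frame = start marker + 2-byte length + payload + 4-byte CRC-32."""
    header = START + struct.pack("!H", len(payload))
    checksum = struct.pack("!I", zlib.crc32(payload))
    return header + payload + checksum

def read_frame(frame: bytes) -> bytes:
    """Return the payload, or raise ValueError if the frame is malformed or corrupted."""
    if len(frame) < 7 or frame[:1] != START:
        raise ValueError("missing start marker")
    (length,) = struct.unpack("!H", frame[1:3])
    payload = frame[3:3 + length]
    trailer = frame[3 + length:3 + length + 4]
    if len(payload) != length or len(trailer) != 4:
        raise ValueError("truncated frame")
    (expected,) = struct.unpack("!I", trailer)
    if zlib.crc32(payload) != expected:
        raise ValueError("checksum mismatch: frame is corrupted")
    return payload

frame = make_frame(b"hello, subnet")
print(read_frame(frame))
```

The start marker and length answer the "where does the packet begin and how long is it" question; the checksum answers the "is it valid or corrupted" question.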

Examples of data link protocols used with the TCP/IP suite include, as mentioned, Ethernet (IEEE 802.3); WiFi (IEEE 802.11); Asynchronous Transfer Mode (ATM); Frame Relay; and Token Ring. Please observe this list is not exhaustive.
Layer 3: Network
This is the format that describes the actual network structure. IP is the component of the Internet suite that resides at the network layer (3); it reads the destination IP address and uses it to route the packet in accordance with the network architecture. Part of the technical difficulty posed by IP is that it addresses a network of networks. Hence, a packet is likely to pass through many routers as it goes from network to network. The part of the network that an individual router is connected to is called a subnet; each node in the network is connected to a router within the subnet. The layer 3 IP address directs the packet to the correct subnet, but the address resolution protocol (ARP) is required to convert the IP address to the correct (layer 2) MAC address. ARP requests are broadcast to all nodes on the subnet, which therefore imposes some technical constraints on the size of the subnet attached to a router. If the router addresses too many nodes, ARP requests will constantly be tying up the NICs on the subnet.
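
The division of labor between the IP address and the MAC address can be sketched with Python's standard `ipaddress` module. All addresses below are made up, and the dictionary standing in for an ARP cache is a simplification: a real host fills its cache by broadcasting ARP requests on the subnet.

```python
import ipaddress

# Hypothetical addresses for illustration only.
LOCAL_SUBNET = ipaddress.ip_network("192.168.1.0/24")
GATEWAY = ipaddress.ip_address("192.168.1.1")

# A toy ARP cache: IP address -> MAC address learned from earlier replies.
arp_cache = {
    ipaddress.ip_address("192.168.1.1"): "00:11:22:33:44:01",
    ipaddress.ip_address("192.168.1.20"): "00:11:22:33:44:14",
}

def next_hop(destination: str):
    """Deliver directly on the local subnet, otherwise hand off to the router."""
    dest = ipaddress.ip_address(destination)
    return dest if dest in LOCAL_SUBNET else GATEWAY

def resolve_mac(destination: str) -> str:
    """Return the layer 2 address the frame should carry for this destination."""
    hop = next_hop(destination)
    if hop not in arp_cache:
        # A real host would broadcast an ARP request to the whole subnet here.
        raise LookupError(f"no ARP entry for {hop}; a broadcast would be needed")
    return arp_cache[hop]

print(resolve_mac("192.168.1.20"))  # neighbor on the same subnet
print(resolve_mac("8.8.8.8"))       # off-subnet traffic goes to the router's MAC
```

Every cache miss costs a broadcast that every node on the subnet must examine, which is why a very large subnet becomes noisy.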

Another component of the TCP/IP suite that is necessary for normal functioning is the Internet Control Message Protocol, which is used for error messages between networked computers (partial list of messages).
Layer 4: Transport
The transmission control protocol (TCP) is the other part of the TCP/IP pairing; it resides at the transport layer (4). TCP is responsible for converting packets into the steady stream of data that a local application requires; or, when data is flowing down the protocol stack, the transport layer is responsible for splitting the datastream into packets.
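
A rough sketch of the splitting and rebuilding described above, in Python. It borrows one real idea from TCP (sequence numbers count bytes in the stream) and ignores everything else, such as acknowledgements, retransmission, and windows. The function names and the 8-byte segment size are arbitrary choices for the example.

```python
import random

def segment(stream: bytes, size: int = 8):
    """Split a byte stream into (sequence_number, chunk) pairs."""
    return [(offset, stream[offset:offset + size])
            for offset in range(0, len(stream), size)]

def reassemble(segments):
    """Rebuild the stream from segments that may arrive out of order."""
    ordered = sorted(segments, key=lambda pair: pair[0])
    expected = 0
    parts = []
    for seq, chunk in ordered:
        if seq != expected:
            raise ValueError(f"gap in stream at byte {expected}")
        parts.append(chunk)
        expected += len(chunk)
    return b"".join(parts)

message = b"The quick brown fox jumps over the lazy dog"
segments = segment(message)
random.shuffle(segments)  # simulate out-of-order delivery by the network layer
assert reassemble(segments) == message
```

The application at the top of the stack never sees the segments; it only sees the reassembled stream.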

An alternative protocol to TCP is Stream Control Transmission Protocol (SCTP), a somewhat more advanced data transmission standard that allows for networking along multiple transmission modes (like Ethernet & WiFi) at the same time. In network parlance, this is known as multi-homing. The link above explains the application possibilities in greater detail, but the basic concept is that a network with widespread multi-homing would enable a higher order of parallelism.
Layer 5: Session
The session layer is the set of standards for coordinating and managing a period in which the host processor is connected to the network. In a client/server network, the session manager coordinates requests and responses between the local and remote applications; it manages the opening and closing of sessions. Session protocols determine whether a connection between client and server is full-duplex (information can flow both directions simultaneously) or half-duplex (information can flow only one direction at a time).

In the TCP/IP suite, this is also managed by TCP (for the most part).
Layer 6: Presentation
Translates data into formats recognizable to the application layer. Also allows data security at this layer [*].

Multipurpose Internet Mail Extensions (MIME) is an example of a universally used presentation layer protocol. What it does is allow the transmission of non-ASCII data, such as graphics or non-English characters (e.g., Æ, č, љ, ζ), through applications such as SMTP, which can only recognize ASCII characters.
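
As a small illustration of what MIME does, the sketch below uses Python's standard `email` package to wrap the sample characters in a message whose bytes are entirely ASCII, then decodes them on the receiving side. Choosing base64 as the transfer encoding is an assumption made for the example; quoted-printable is another common choice.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

text = "Æ, č, љ, ζ"

msg = EmailMessage()
msg["Subject"] = "Non-ASCII sample"
# Ask for base64 so the body is plain ASCII on the wire.
msg.set_content(text, charset="utf-8", cte="base64")

raw = msg.as_bytes()
wire_is_ascii = all(byte < 128 for byte in raw)

# The receiving side reads the MIME headers and decodes the body again.
received = message_from_bytes(raw, policy=policy.default)
decoded = received.get_content().strip()

print(wire_is_ascii)
print(decoded)
```

The Content-Type and Content-Transfer-Encoding headers are what tell the receiver how to turn the ASCII payload back into the original characters.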
Layer 7: Application
This layer is responsible for managing applications and the application interface with the rest of the network. An example is the Domain Name System (DNS), which correlates domain names (like news.bbc.co.uk) with an IP address. This then allows browsers to navigate the internet. Another important application-layer protocol is the already-mentioned SMTP, which manages email; another is HTTP, which delivers the HTML pages that a browser turns into a graphical display of Internet data.
ADDITIONAL READING & SOURCES: Cisco Systems, "Internetworking Basics"; RouterGod, "CCNA Bootcamp: the OSI Model";

Wikipedia: OSI Reference Model;


18 August 2006

Gnote on GNU

In the mid 1980's, some enthusiasts of the Unix operating system decided the world needed a new operating system that was like Unix, but wasn't actually Unix.

The reason for this was that large parts of the code for the software were owned by AT&T's Bell Labs (or by its collaborator, Sun Microsystems); and developing new features was extremely expensive for individual developers.

Moreover, there was an open-source movement which held that there was something inherently wrong about the proprietary software development model, in which a single firm owns software and charges whatever it expects the market will bear. Such a model has, in recent years, posed an immense diplomatic and law-enforcement obstacle as Western governments (most notably the US government) have sought to both expand monopoly powers over intellectual property, and wage a losing battle against software "piracy." At the same time, the quality of proprietary software has nose-dived, so that the much-vaunted progress in computer technology is more than offset by software bloat, malware, and stupid design.

The GNU Project is an organization that developed free versions of many Unix-related utilities, libraries, shells, compilers, and the most commonly used version of Emacs. Someone unaffiliated with GNU, a Linus Torvalds, wrote a kernel that worked well with GNU software, and this became Linux. In the great majority of cases, installations of the Linux kernel employ mostly GNU system software for the rest of the OS. However, there also exists a GNU kernel called Hurd, which is quite interesting. It is to be hoped that I'll have the opportunity to post about it in depth someday. But it's an odd peculiarity of fate that the GNU Project developed a complete OS, and the open-source movement embraced all of it except the kernel.

When I began reading about GNU, I was plagued by the question, All right, I understand the technical imperative for a Unix-like OS that is open-source, but didn't something like this exist already? I was thinking of BSD 4.4-Lite, a variant of Unix that achieved "free software" status in 1994. Indeed, the GNU Project itself was very closely tied to the BSD branch of the Unix development community. It turns out the expressly BSD tendencies of the "free software" movement mostly split off from the GNU Project in the early 1990's over the precise understanding of "open-source." The GNU General Public License was a legal formulation of the "copyleft" principle, under which any derivative of GNU-GPL-governed material was itself obligated to be covered by the GNU-GPL. It was legal and welcome that a developer should create a fork of GNU software, but any such fork would be as open-source as the original GNU source code. Hence, it was illegal for any program using GNU source code to be proprietary.

Hence, OpenBSD, NetBSD, FreeBSD, and the two dozen other non-GNU variants of BSD Unix are often covered by the BSD licenses, which allow forks to be copyrighted. The source code remains public domain, however. The developers of these other variants of BSD were obligated to fork off their own versions of BSD 4.4-Lite so that their version could be protected by BSD license rather than GNU-GPL.

The commercial Unix world has several proprietary GUI's; these include the Mac OS X, Sun Microsystems Solaris, CDE, and so forth. KDE was the first GNU-compatible GUI, but had been developed with a proprietary software called Qt. The GNU project therefore developed GNOME, which has suffered from much of the controversy plaguing all Unix operating environments. For the record, Unix is not a very easy OS to write GUI's for, since any such thing has to cram an exceptionally large number of functions into the interface. Attempts to economize on what controls ought to be replicated through the GUI will inevitably spawn a lot of aggravation [*]. In addition to KDE and GNOME, there are Rox and XFCE. In many distributions, such as Red Hat and Novell (SuSE), both KDE and GNOME are supported.

Both GNOME and GNU/Linux include an enormous number of components that must be assembled by specialists into a distribution that an ordinary user can deploy. This explains the wide variance in distributions, which specialize in a particular mission.

As of this writing, GNU does not have a distribution that uses a non-Linux kernel. Hence, any discussion of the GNU OS has to be about GNU/Linux.

copyleft: a legal covenant binding on certain materials. Copylefted material may be distributed freely but no one is permitted to restrict the copylefted material. Unlike material in the public domain, derivatives of copylefted material may never be themselves copyrighted.

NOTE: RE other products from the GNU Project: in addition to the OS components that constitute most of GNU/Linux distributions, there are many other programs created by the GNU Project. Unfortunately, their website does not list them all in one place so I have included a few links that do:

GNU Project is part of the Free Software Foundation (FSF) and engages in many valuable or constructive political campaigns.

SOURCES & ADDITIONAL READING: "SCO, GNU, and Linux," Richard Stallman-Free Software Foundation (2003); "About the GNU Project," GNU website; GNU Hurd homepage;

Wikipedia, GNU & GNU Project, GNU Hurd, Debian, Emacs, GNU General Public License, GNOME, GTK+; GNOME Human Interface guidelines;


15 August 2006

Intro to Client/Server Computing-2

(Part 1)

The first part very briefly outlined the development of major computer architectures responsible for client/server computing, organized somewhat thematically. However, I'm going to be a little more structured hereafter. In what follows, I'm going to be referring to networks that do not follow the client/server model, and for which "client" and "server" don't really apply. So for networks (whether C/S or not), all of the networked machines are called nodes; those nodes that are computers are called hosts. Elsewhere I've used the word "host" to refer to web servers that host a website. Here, I'm greatly expanding the sense to include any computer-like device with a microprocessor. All hosts in a network are nodes; but not all nodes are necessarily hosts: "dumb terminals," passive terminal emulator sessions, and some industrial machine tools, may be nodes that can't, or don't, run programs within the network.

The client/server model of data organization is actually one of several network formats that can be used.
  • The mainframe model
    In the mainframe model, the terminal transmits keystrokes to the mainframe and receives a screen image back. This is the most extreme form of data centralization. Since genuinely dumb terminals are comparatively rare nowadays, it's more likely one would be using a terminal emulator program.
  • File sharing
    At the opposite extreme is the file-sharing architecture, in which the server merely hosts files, like a library; the processing functions remain on the client.
  • Peer-to-Peer
    Any type of network configuration in which nodes on the network are responsible for maintaining themselves on the network. This requires a network architecture that is highly decentralized. All nodes are on an equal footing.
  • Client Server
    Under this system, there is an integral relationship between the application running on the client, and that running on the server. Specifically, the server maintains a database backend, and the client maintains a database frontend. Several standardized languages were developed for communication between client and server, allowing them to run separate parts of a common program.
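
The client/server relationship in the last item can be sketched in a few lines of Python. This is a toy: the "database backend" is a dictionary, the one-line `GET <key>` request format is invented for the example, and a connected socket pair inside one process stands in for a real network connection.

```python
import socket
import threading

# Hypothetical "database backend" held by the server.
INVENTORY = {"widgets": "42", "gadgets": "7"}

def serve(conn: socket.socket) -> None:
    """Server side: answer one-line 'GET <key>' requests until the client hangs up."""
    with conn, conn.makefile("r", encoding="utf-8") as reader, \
            conn.makefile("w", encoding="utf-8") as writer:
        for line in reader:
            parts = line.split()
            if len(parts) == 2 and parts[0] == "GET":
                reply = INVENTORY.get(parts[1], "NOT FOUND")
            else:
                reply = "BAD REQUEST"
            writer.write(reply + "\n")
            writer.flush()

def ask(reader, writer, key: str) -> str:
    """Client side (the front end): send a request and wait for the reply."""
    writer.write(f"GET {key}\n")
    writer.flush()
    return reader.readline().strip()

client_sock, server_sock = socket.socketpair()
server = threading.Thread(target=serve, args=(server_sock,), daemon=True)
server.start()

with client_sock, client_sock.makefile("r", encoding="utf-8") as reader, \
        client_sock.makefile("w", encoding="utf-8") as writer:
    widgets = ask(reader, writer, "widgets")
    missing = ask(reader, writer, "sprockets")

server.join(timeout=5)
print(widgets, missing)
```

The client never touches the data directly; it only knows the request language, which is the "integral relationship" between the two halves of the program.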
Of the C/S formats, there are a few basic standards:
  • Two tier architectures
    The most obvious client-server relationship. The client is the lower tier, while the server is the upper tier. This sharply enhances the power of a computer system, by dispersing but coordinating the functions of data storage and data presentation. However, two-tiered architectures have limited scalability because of limitations imposed by the address resolution protocol (ARP). For large numbers of subnet nodes, ARP queries seriously reduce the efficiency of the network.

    Another problem is that, with a two-tier architecture, if the front end resides entirely on the user networks, this requires the system administrator to install updates on all of the client computers.

  • Three tier architectures
    With a third tier, the ARP can readily address a far larger number of end-user nodes. Moreover, the intermediate tier was given additional features, such as queuing and application execution. The three tiers are known as the user interface tier (which resides on the user nodes), the business logic (intermediate) tier, and the database access tier.

    The user interface tier merely makes requests of the business logic tier. In some cases, it interfaces with the database through a web browser (e.g., Citrix Presentation Server); updates to the front end are made to the server with the business logic tier installed, and do not require that the system administrator ensure that the software is installed properly on the desktops of many employees. The database access tier, likewise, is responsible solely for storing data. The business logic tier is responsible for handling queries, forms, generating output, and other functions of the database front end. Of course, the database access and business logic tiers may well be stored on the same physical computer.

    Another way of describing this architecture (explained later) is "DBMS independence." Ideally, the intermediate tier could house multiple database management systems, such as Oracle and Ingres, which both access the same database. This is made possible by a standard SQL format for relational databases.

  • Three tier architectures with Transaction Processing Monitor technology
    When operations are performed on the data in a database, or when a DBMS performs a query on a database, the operation typically consists of several steps that must be executed atomically. For example, if money is transferred from a checking account to a savings account at an ATM, then the checking account must be debited and the savings account must be credited the amount. These two operations must occur seamlessly; it is not acceptable for the database to have an intermediate state in which one step has been executed and the other hasn't. We call such operations atomic because they are indivisible, and the intermediate state is unknown to the process; they are also known as transactions.

    Transactions can be bundled and processed, while remaining atomic (i.e., with no intermediate state data available to the running software process), using Transaction Processing (TP) Monitors [*], which manage transactions from their point of origin. The TP Monitor does this by dividing many requests from users into separate threads, and allowing the processor to divide its time among the threads one transaction at a time. As the system resources grow, the architecture can be scaled upward in size to support more processors (thereby allowing many threads to be run at the same time).

  • Three tier architectures with message server
    Here, the intermediate tier is implemented as a message server. Transactions take the form of messages between database server and user/client. The difference between TP monitor technology (above) and message server is that the message server architecture treats messages as intelligent (i.e., containing direction and purpose) whereas the TP Monitor environment has the intelligence in the monitor, and treats transactions as raw data packets.

  • Distributed/collaborative enterprise architecture
    The crucial element here is that the database (the highest tier) is distributed, possibly over a large area and probably among multiple machines. As a term of art, the distributed/collaborative enterprise architecture incorporates an object request broker (ORB), or software component that manages communications among data objects. ORB's interpret user requests, then logically establish the resources within the distributed system and invoke them [*]. The ORB contains the communications infrastructure that allows applications to access DB services.
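
The three tiers and the atomic account transfer described in the list above can be sketched together in Python, using the standard `sqlite3` module as a stand-in for a real database server. All three "tiers" here live in one process purely for illustration; the table layout, account names, and function names are hypothetical.

```python
import sqlite3

# --- Database access tier: stores data, knows nothing about business rules ---
def open_database() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [("checking", 100), ("savings", 50)])
    conn.commit()
    return conn

def balance(conn: sqlite3.Connection, name: str) -> int:
    row = conn.execute(
        "SELECT balance FROM accounts WHERE name = ?", (name,)).fetchone()
    return row[0]

# --- Business logic tier: enforces the rules and the transaction boundary ---
def transfer(conn: sqlite3.Connection, source: str, target: str, amount: int) -> None:
    """Debit source and credit target as one atomic transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                     (amount, source))
        if balance(conn, source) < 0:
            raise ValueError("insufficient funds")
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                     (amount, target))

# --- User interface tier: only makes requests of the business logic tier ---
def request_transfer(conn, source: str, target: str, amount: int) -> str:
    try:
        transfer(conn, source, target, amount)
        return "ok"
    except ValueError as error:
        return f"refused: {error}"

db = open_database()
first = request_transfer(db, "checking", "savings", 30)
second = request_transfer(db, "checking", "savings", 500)
print(first, second, balance(db, "checking"), balance(db, "savings"))
```

In the refused transfer, the debit has already been applied when the rule check fails, and the rollback undoes it; no other reader of the database ever sees the half-finished state.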


  • International Business Machines (IBM)
    IBM's contributions include SNA and the token ring. Neither was a prolonged commercial success. The token ring was initially adopted by many of IBM's clients, but was soon afterwards overwhelmed by Ethernet, which made TCP/IP networks more capable. (Ethernet is not a competing network architecture, but it did compete with the data link level standard for IBM token ring.)

    Systems Network Architecture (SNA) is a complete set of protocols defining all levels of a network, from cables to applications. It was created in 1974 and remains in use in some financial settings. The SNA is interesting insofar as it provides an illustration of a completely distinct network ecosystem, entirely separate from the Internet. (MORE INFO: Cisco Documentation, IBM Systems Network Architecture Protocols)

  • Digital Equipment Corporation (DEC)
    The network system DECnet (AKA Digital Network Architecture, or DNA) was introduced in 1975 to link PDP-11 computers. Instead of following the then-familiar concept of client/server networks, or the hierarchical structure used by SNA, DECnet was based on the concept of peer-to-peer (P2P) networks.

    DEC was bought by Compaq in 1998; Compaq, in turn, merged with Hewlett-Packard in 2001. DECnet was by then slated for obsolescence. Even before 1996, the DECnet architecture was drastically revised into DECnet-Plus, and made compatible with TCP/IP data, network, and transport layers [*]. As DNA-specific protocols were made optional, they were phased out. This was a strategic decision made by Compaq and HP management; the payoff was continued viability of the OpenVMS operating system. However, it is worth noting that DECnet-Plus V is still supported in open source, and is available for Linux [*].
    (MORE INFO: HP Documentation, OpenVMS Systems Documentation & esp. table 1-2; for a beginner's guide, see DECnet-Plus for OpenVMS Introduction and User's Guide)

  • AppleTalk
    AppleTalk was a network protocol suite that was introduced in 1984 as part of the Macintosh. It was carefully designed to follow the OSI reference model, and compensate for the inadequacy of TCP/IP network architecture for small, peer-to-peer computer networks. Soon after the release of MS Windows, Apple moved away from AppleTalk.

  • Common Object Request Broker Architecture (CORBA)
    An industry standard for object request brokers (ORB's). The goal was to develop a standard by which databases distributed over many servers (perhaps in many different locations) could be accessed by diverse DBMS's; the different DB-locales would be invoked as software objects, and the results returned to the requestor (client) transparently.[*]

  • Distributed Computing Environment (DCE)
    A competing standard for distributed client/server architecture developed by the Open Group.[*] Like CORBA, DCE is a set of interface standards that define how applications ought to be designed. However, DCE also includes a reference implementation, or source code on which DCE products are based. The goal is for DCE-compliant applications to be interoperable (able to exchange information with each other, that can then be used by the applications rather than merely parsed) and flexible.

passive terminal emulator sessions: if you access a session on a mainframe on a modern Windows PC, you will do so through an emulator session. While the PC is capable of being a host, and is (at the hardware layer), at the application layer, all it is doing is transmitting keystrokes and receiving a screen signal. Terminal emulators are programs that communicate with the mainframe in signals identical to those of the original terminal, even though one's terminal may be any type of computer.
ADDITIONAL READING & SOURCES: Wikipedia entries for types of client/server network architecture: client/SOA; SNA, token ring (IBM); DECnet (DEC);

Carnegie-Mellon University, "Client/Server Software Architectures—An Overview," w/links to related topics; CodeWhore.com, "Networking: Broadcast, Client/Server, Token Ring"; IONA Technologies, "Two- " and "Three-Tier Client-Server Architecture,"; Brian E. Hoffmann, "Online defect management via a client/server DBMS" (1993); "A perspective on Advanced Peer-to-Peer Networking" (PDF), by P. E. Green, R. J. Chappuis, J. D. Fisher, P. S. Frosch, & C. E. Wood, IBM (1987);

BOOKS: An Introduction to Database Systems-7th Edition, C.J. Date, Addison-Wesley (2000)


14 August 2006

Intro to Client/Server Computing-1

The IBM Stretch 7030 with a teletype for interface; the operator console on the extreme left of the upper photo was for maintenance & service. IBM built eight of these.

It's difficult to sort through the colossal variety of client/server computing formats, which all seem like a fairly old idea with many hairline distinctions. The basic concept has been with us from the beginning of a serious commercial computer industry: people didn't exactly have a Stretch sitting on their desks!

Readers are reminded ahead of time that this blog is autodidactic and entries are likely to have errors. Why waste time reading it then? Well, I'll be linking to information sources that are of much greater reliability than this.

First, let's discuss a few of the motives for client/server computing systems. From the beginning engineers have sought to build a reliable and safe way for multiple users to use the same processor at the same time. How could this be dangerous? One risk is that the computer, attempting to run multiple jobs at the same time, gets the threads for any two of them mixed up. Computer processors typically economize on register space, so designers have to ensure that the kernel--the program responsible for allocating register space, among other things--does not run the risk of having data from one user thread get inserted into that of another. Overcoming this challenge was to become a major preoccupation with the various computer manufacturers, most famously IBM. IBM's System/360 was supposed to have been introduced with a multi-user interface, but the software division failed to create one. Fortunately, the University of Michigan developed MTS. Another multi-user operating system for mainframes was Multics, which was influential in the development of Unix.

The great motivator to overcome these obstacles was the expense of computers; most computers were leased to large organizations, and lease rates were typically so high that only a few large organizations could afford one. Moreover, once an organization (such as a university) had a computer available, there was the problem of training personnel to use it. Writing programs required lots of time spent unproductively. Finally, processing speed was a major concern for engineers. Once a very fast computer was developed, it made sense to ensure its capacity was constantly utilized, which suggested that some overlap in user access would be necessary.

The concept of multiple access developed along competing lines. One line was MVS, actually a latecomer to the technology since it was introduced in 1974, long after MTS and Multics. MVS is actually a dynasty of successive operating systems, like Windows; in recent years it has been known under different trade names, such as z/OS. The essence of the MVS approach to supporting multiple users was to introduce each program-user (instance) as a batch process. This minimized any application interference with the operating system, or with its close collusion with the IBM architecture.

Alternative to MVS was VM/CMS, another OS created for the IBM System/360. The name comes from two separate operating systems that were integrated to handle separate functions. The virtual machine (VM) was responsible for allocating data among instances by providing them with a "virtual" (i.e., software-simulated) mainframe. VM was so elegantly designed that it was possible to run multiple instances of VM on a single instance of VM, without any penalty in performance. In order to achieve this feat, the virtual machine had to be so convincing it could load and run itself exactly as if it were a tangible computer. This is quite useful because it makes the computer running VM far more difficult to crash.

(VM is used in this sense as a product name; however, VM also refers to a closely related generic concept, virtual memory.)

VM was developed with its other half, the Conversational Monitor System (CMS), which was actually responsible for the time-sharing part of the system. VM/CMS was among the first of the operating systems to have a life [mostly] independent of the vendor's wishes, and it is in common use today. VM/CMS differs from MVS in the sense that VM/CMS (and its successors, VM/ESA and z/VM) has a federal division of the entire computer's resources, whereas MVS (now z/OS) is more of a unitary state.

Digital Equipment Corp. (DEC) differed from IBM in that it was "politically" aligned with the large population of "dwarves" that resisted IBM's attempts to rule by decree. Its fleet of minicomputers was fundamentally different from IBM's mainframes, not merely in size, but also in architecture and usage. Digital machines generally were closer in conception to the modern home computer, in which the organization of computer processes assumed multiple users or multiple programs (instances), and response to user "events" such as the typing of a command. With the VAX/VMS series, Digital introduced virtual addresses to accommodate 32-bit words, an innovation that swept away its competition. One feature that was especially consequential was "clustering," or distributed computing. This allowed a network of linked VAX machines to distribute a processing job among themselves.

(Part 2)
ADDITIONAL READING & SOURCES: "Introduction and Overview of the Multics System"; "The Creation of the UNIX* Operating System"; "What is MVS?," Pam McCann and David Migliore; PDP-8 FAQ;

"The Elements of Operating-System Style" & "Operating-System Comparisons," The Art of Unix Programming (online book) by Eric Steven Raymond;
