A wide variety of network communities exist today, supported by many different computational platforms. As the need for new architectures for these platforms arises, so does the need to articulate what exactly these platforms should provide, abstracted from the many (and somewhat diverse and divergent) concrete realizations of these ideas (e.g. in MOO, MUSH, MUD etc.).
This note presents my analysis of the desiderata for network spaces, my generic term for the computational platforms underlying network communities. My analysis is not driven by any attempt to understand a "least common denominator" for these different approaches. Rather it is driven by my experience starting, administering and participating in several such communities since 1994 and by my desire to find a coherent and consistent conceptual framework (e.g. one that resolves issues of objects, persistence, identity, change etc.) within which system development may proceed interlinked with, and yet somewhat decoupled from, the diversity of network communities that may arise atop such spaces.
I believe this task (of articulating the desiderata of network communities) is of some urgency. Conditions are now ripe for an appropriately designed architecture and implementation to provide the basis for the development of tens of thousands of interlinked network communities all over the globe. On the side of social sciences research, the extraordinary interest of these spaces as both a synthetic and analytic tool for the study of communities is now rapidly becoming evident. On the computational side, the development of MOO as a basis for such spaces has come to a halt with the disintegration of groups working on this technology at PARC. On the other hand, the rapid maturation of Java and CORBA technology, and the widespread deployment of networked personal computers, is finally(!) providing the ubiquitous basis on which large-scale end-user populated distributed systems may be realized. Therefore this task is both timely and important.
In the following I elaborate on each of these ideas in turn.
A network space concretely consists of a collection of objects representing "world" objects (e.g., doors, chairs, people) as well as system objects (e.g., event-dispatcher), services (e.g. dictionary services, news-feeds) and the processes that animate them. For this discussion, an "object" is just a unitary bundle of state (values for variables) and code (response of the object to various stimuli from the environment). Objects may be implemented in any of a variety of programming languages with any of a variety of different detailed computational semantics. For instance, objects may be classified according to a hierarchy that allows code-sharing, thus obviating the need for every object in the system to have its own separate program. (Note that in languages like Java what I call objects here actually fall into two categories, classes and "objects". There is no particular need in this discussion to make that distinction, so I shall use the term objects in the more inclusive sense.)
A network space must be ready to accept connections from the outside world in a variety of "transport" protocols (http, RMI, IIOP, SMTP, netnews, a telnet-style raw line-at-a-time interface, etc.). People (or their representative computational agents) may connect to such a space and perform operations which may result in very long-lived computations being set up, or may side-effect the collection of objects, or may side-effect the connections opened by other players to the space (e.g. by sending lines of text down them, for instance in response to a page message from a player). Connections may stay open for extended periods of time (e.g., forever).
The space may itself open connections to the outside world (e.g. http/smtp/ftp/telnet) on behalf of a user of the space.
Connections may be opened to the space from anywhere on the net by different people from different organizations and affiliations -- this is one of the simplest and most profound attributes of a network space. Suddenly it opens up the possibility of having groups of people across organizations, institutions and walks of life congregate and play and work together.
It is convenient to represent the person opening a connection to the network space as a "player" object in the network space. (Most connections are opened by people, but a network space may itself provide services to the network via incoming connections that are not necessarily associated with a player object.)
The notion of a player in a network space is a central conception. As in the real world (and in real world organizations), player objects are the source of agency and unit of accountability (and administration) in the network space.
This has several consequences.
First, it means that the network space must support authenticated player connections. (Note that this is not to imply that a network space must necessarily maintain a connection between a player object and a verifiable real-world entity, e.g. person; network spaces may support anonymous, but authenticated participants.) The vast majority of players in a network space are likely to be associated with real world individuals (as opposed to computational agents or collectives of humans).
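As a minimal sketch of "anonymous, but authenticated" participation, the following Python fragment registers a player under a handle and a password and records nothing about any real-world identity. The `Registry` class, its method names and the iteration count are assumptions made for this illustration; they are not drawn from any existing network space.

```python
import hashlib
import hmac
import os


class Registry:
    """Hypothetical player registry: authenticates handles, stores no real-world identity."""

    def __init__(self):
        self._records = {}  # handle -> (salt, derived key)

    def register(self, handle, password):
        salt = os.urandom(16)
        self._records[handle] = (salt, self._derive(password, salt))

    def authenticate(self, handle, password):
        record = self._records.get(handle)
        if record is None:
            return False
        salt, expected = record
        # Constant-time comparison avoids leaking how many bytes matched.
        return hmac.compare_digest(expected, self._derive(password, salt))

    @staticmethod
    def _derive(password, salt):
        # Iteration count is an arbitrary choice for the sketch.
        return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
```

The space can then hold the handle accountable for its actions across sessions, even though it cannot say who stands behind it.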
Footnote: One can imagine tremendous variations on this theme though, with scope for artistic exploration. Is there mere chaos beyond the notion of individuality? May not a group of people be able to project a cohesive identity (cf. Bourbaki)? The question one must ask is what context of ongoing interactions the various participants in the group have access to. This question remains unasked in the case of a single person; it is left to the individual elements in the person's psyche/brain to sort out the interactions with the environment and carry within them impressions of past interactions to inform future interactions. Extraordinary space for social interaction research here!
Second, it becomes important to recognize the special function of administrators for the space. Administrators are people who understand the principles of operation of the network space, and provide underlying management and implementational skills necessary to administer to fluctuating user populations and needs.
Footnote: In this, they are akin to administrators of computational systems anywhere; the differences, if any, arise in that administrators are often also participants in the network community itself, and so may have privileged positions of authority and responsibility in the space. They usually also serve as the interface and implementation agents between the user population of the network space and whatever organization is hosting the computers on which the space runs.
Computationally, it must be possible for administrators to monitor system usage, take down the space in an orderly fashion if necessary, restore it from a past state (see below), enforce policies on system resource control and use (e.g. aborting runaway user-controlled processes, administering "quota"), fix system errors and set up or take down systems services (e.g. IIOP bridge, federation connectivity) and log system events in a persistent database. Administrators must also have the ability to create new instances of protected resources (e.g., new players), have the ability to abruptly terminate network connections (e.g. initiated by a "rogue" player) in accordance with the policies of the community, and to restore past state of user objects and processes. In some network space implementations (e.g., MOO), administrators may have privileged access to the state of user objects, though this is clearly not desirable. Administrators must manage the off-line processes of backing up system state and controlling access to the backups.
One of the primary player activities is construction. This means that there must be some mechanism for associating objects with the players who constructed them, for example for purposes of accounting. This is usually done through the notion of ownership.
Ownership means several things. The owner is charged with the resources (memory, threads, network access points) consumed by their computational elements. Owners must have the ability to dispose of their computational elements (e.g. destroying objects, threads and network connections, ceasing to listen on ports). Conversely, it should be the case that a player who does not own a computational element cannot dispose of it. Owners may also have special rights on their owned objects --- for instance, an owner is usually able to name an owned object, further re(de)fine it (e.g. by changing its parent class, or adding/deleting properties/methods), read the state and code associated with it (including network connections, if any) and "move" it in the network space (assuming that the network space has a notion of "locations", see below). In particular, ownership may imply possession of the right to transfer ownership, to "gift" the object to some other player.
Finally, for purposes of accountability, actions taken by computational elements owned by a player are presumed to have been taken on behalf of the player. For instance, the contributions of a puppet owned by P to a conversation are presumed to have emanated from the owner of P, since only the owner of P typically has the capability to manipulate the puppet in that way. (Typically, the owner of a player is itself.)
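The ownership rules just described (charging, owner-only disposal, transfer, accountability) can be sketched in a few lines of Python. All names here (`Player`, `OwnedObject`, `NotOwner`, the `cost` unit) are hypothetical and chosen only for illustration.

```python
class NotOwner(Exception):
    """Raised when a non-owner attempts an owner-only operation."""


class Player:
    def __init__(self, name):
        self.name = name
        self.charged = 0  # hypothetical resource units billed to this player


class OwnedObject:
    def __init__(self, name, owner, cost=1):
        self.name = name
        self.owner = owner
        self.cost = cost
        self.alive = True
        owner.charged += cost  # the owner is charged for what they create

    def _require_owner(self, actor):
        if actor is not self.owner:
            raise NotOwner(f"{actor.name} does not own {self.name}")

    def rename(self, actor, new_name):
        self._require_owner(actor)
        self.name = new_name

    def transfer(self, actor, new_owner):
        """'Gift' the object: the charge moves with the ownership."""
        self._require_owner(actor)
        self.owner.charged -= self.cost
        new_owner.charged += self.cost
        self.owner = new_owner

    def destroy(self, actor):
        self._require_owner(actor)
        self.owner.charged -= self.cost
        self.alive = False

    def accountable_party(self):
        # Actions of the object are presumed taken on behalf of its owner.
        return self.owner
```

Passing the acting player explicitly keeps the sketch short; a real space would presumably derive the actor from the authenticated connection or the calling thread.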
Footnote: I must say that placing player-ownership at the heart of the concept of network space is somewhat troubling because ownership is a very complex notion. For instance, the notion of public goods (commons) or even natural resources (the sky, the meadows) does not naturally lend itself to a usable metaphor based on personal ownership. A crisper analysis which allows for public goods while requiring accountability for actions would be desirable.
A network space must support some policy for access to resources (e.g. processor time, space, net connectivity) that is equitable across participants. This implies that the network space must support some notion of multiple threads of execution, so that if one thread becomes blocked (e.g. due to I/O) some other available user-process may be run. It must also be able to do pre-emptive scheduling, pre-empting long-running threads in favor of runnable, long-waiting threads.
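The time-slicing policy can be illustrated with a small round-robin scheduler. This is only a sketch: Python generators are cooperative, so each `next()` call stands in for one unit of work that a real implementation would interrupt with runtime support. The names `run`, `work` and `quantum` are assumptions of the example.

```python
from collections import deque


def run(tasks, quantum=2):
    """Round-robin over generator-based tasks.

    Each task is a (name, generator) pair; one next() call is one unit of work.
    A task still running after `quantum` units goes to the back of the queue.
    Returns the order in which units of work were executed.
    """
    ready = deque(tasks)
    trace = []
    while ready:
        name, task = ready.popleft()
        for _ in range(quantum):
            try:
                next(task)
            except StopIteration:
                break  # task finished; do not requeue it
            trace.append(name)
        else:
            ready.append((name, task))  # quantum used up: requeue behind the waiters
    return trace


def work(units):
    """A task that needs `units` units of work."""
    for _ in range(units):
        yield
```

With a long task of five units queued ahead of a short task of one unit, the short task runs after the long task's first two units rather than after all five.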
It should be possible for a participant to construct state (classes, objects, code, data) that is not visible to other participants, or that can be shared between a select group of participants only.
This places a requirement that the object model be encapsulated (opaque). Encapsulation means that there may be a gap between the external interface maintained by the object and internal structures used to implement that interface, and this gap is respected by the other objects in the environments, i.e. they can access only the external interface, not the internal representation. Encapsulation is essential to maintaining the integrity of interacting, separately designed, independently evolving object systems.
A password is a typical example of administrator-maintained state on user-objects that is required to be not visible to non-administrators. Similarly, it must be possible for a participant A to keep a diary that can only be read by A, or other participants explicitly authorized by A -- indeed its very existence may not be knowable to unauthorized players. Or to create a particular kind of transportation device that can only be instantiated by a particular collection of (groups of) participants. Or a communication channel that can only be used by fans of a particular football team. Or to build a house such that there is only one way to get in: participants must ring the doorbell, be observable through the eye-hole, and be explicitly permitted to enter.
Further, access properties should be dynamically changeable ... "run-time is all that there is".
Here are three simple "litmus" tests for the kind of participant-specifiable access control that must be supported by network spaces.
It should be possible for an arbitrary designer to implement (generic) vehicles. A vehicle can be used to transport passengers from one place to another, while travelling through several other places. A vehicle A may crash for a variety of reasons --- in the event of a crash, every participant B not wearing a seat belt is ejected into the current place the vehicle is in. This means that on entering A, it must be possible for B to communicate to A an ability K to eject B from A (to the location of A). Furthermore, K must be usable by A only as long as B is in A; any other attempt by A (or any other object to which A may have communicated K) to use K should fail.
It should be possible for an arbitrary designer to implement (generic) vending machines. A vending machine A can be loaded by a vendor with particular templates (e.g. coke cans, candy). A participant B should be able to invoke an operation on A that allows it to create an instance of the selected item in B's name. This means that it must be possible for B to communicate to A an ability K to create an instance of the given item for B, and to communicate it to B. Furthermore, K is usable by A exactly once, and any attempt by A to use it differently, for instance to create an instance of another item in B's name, should fail.
It should be possible for an arbitrary designer to implement currency notes (money). Money is an unforgeable object that can be issued by an agency (bank). It should be possible for a currency note with value A to accept another currency note with value B exactly when B is issued by the same bank as A, and atomically destroy B, while raising its value to A+B.
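One way to read the three tests is as requirements on capabilities: an ability K is an action bundled with the conditions under which its holder may use it. The Python sketch below is an illustration under that reading, not a secure implementation: Python does not enforce encapsulation, so the underscore-prefixed fields merely mark what a real network space would have to make inaccessible. All class and method names are hypothetical.

```python
class Refused(Exception):
    """Raised when a capability or note is used outside its granted terms."""


class Capability:
    """An action plus the conditions under which the holder may invoke it."""

    def __init__(self, action, guard=lambda: True, max_uses=None):
        self._action = action
        self._guard = guard         # must return True at the moment of use
        self._uses_left = max_uses  # None means unlimited

    def invoke(self, *args):
        if self._uses_left == 0 or not self._guard():
            raise Refused("capability not usable now")
        if self._uses_left is not None:
            self._uses_left -= 1
        return self._action(*args)


class Participant:
    def __init__(self, name):
        self.name = name
        self.location = None
        self.inventory = []

    def board(self, vehicle):
        """Test 1: return a K that ejects us, valid only while we are aboard."""
        self.location = vehicle
        return Capability(self._move_to, guard=lambda: self.location is vehicle)

    def _move_to(self, place):
        self.location = place

    def order(self, item):
        """Test 2: return a K good for creating exactly one `item` for us."""
        def create(requested):
            if requested != item:
                raise Refused("K covers a different item")
            self.inventory.append(requested)
        return Capability(create, max_uses=1)


class Bank:
    """Test 3: only the issuing bank holds the seal that makes a note valid."""

    def __init__(self):
        self._seal = object()

    def issue(self, value):
        return Note(self, self._seal, value)


class Note:
    def __init__(self, bank, seal, value):
        if seal is not bank._seal:
            raise Refused("forged note")
        self.bank, self.value, self.spent = bank, value, False

    def absorb(self, other):
        """Add `other`'s value to this note and destroy `other`."""
        if other is self or self.spent or other.spent or other.bank is not self.bank:
            raise Refused("notes cannot be merged")
        self.value += other.value
        other.value, other.spent = 0, True
```

In this single-threaded sketch the merge in `absorb` is trivially atomic; a concurrent network space would need to guarantee that no other thread can observe or spend either note part-way through.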
The space must continue to exist, even if no player is connected. The computational elements created by a player must continue to exist until (if ever) they are destroyed (typically by, or on behalf of, their owner). Computationally this may be achieved by periodically checkpointing the state of the system onto stable storage, or implementing a more fine-grained, incremental scheme for moving "dirty" objects out to stable storage. It must be possible to recreate an identical copy of the running system from the image on stable storage.
The flip-side of persistence is mutability (change) and identity. Any computational element created in a network space must be mutable because any decision made about its structure may potentially need to change in the light of changed designs or circumstances. For instance, there must be ways to (1) change the state of objects, (2) change (add/delete/modify) the set of properties and methods associated with an object, together with access rights, if any, (3) change the parent of an object, and (4) perform such transformations on entire classes (e.g. changing their inheritance structure), thereby affecting entire subgraphs of the inheritance and instantiation tree.
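A minimal sketch of the periodic-checkpoint approach, using Python's standard `pickle` module as a stand-in for whatever serialization a real network space would use. The function names `checkpoint` and `restore` are assumptions of the example. The image is written to a temporary file and renamed into place, so a crash during the write leaves the previous image intact.

```python
import os
import pickle
import tempfile


def checkpoint(state, path):
    """Write `state` to `path` without ever exposing a half-written image."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to stable storage before renaming
    os.replace(tmp, path)     # rename over the old image (atomic on POSIX)


def restore(path):
    """Recreate the object graph from the image on stable storage."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Because the whole graph is dumped in one call, references between objects (a player's location, a room's contents) are preserved as shared references in the restored copy. An incremental "dirty object" scheme would need to maintain that property itself.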
The notion of identity of objects in network spaces deserves a deeper analysis. For the moment, suffice to say that the most convenient notion seems to be that two objects are the same if state-change operations on one affect the other identically. This makes sense in a world-view in which objects have references (handles) on other objects which must continue to work through changes to the referred object.
(When you create your house, you create one object, which may have many subparts, and over time you may give out references to that object, or other participants may discover references to it, e.g. from white pages. Any changes you make to your house will continue to allow other participants to refer to your house via their original references, even though operations they may have performed before on your house now cease to be applicable. For instance you may have removed the doorbell and instead installed an "electronic" access system that lets people automatically in the door as long as they are carrying a pass key. An attempt to use the doorbell must fail appropriately because there is no longer any such thing.)
Therefore, in a network space, changes to objects must be made in situ. This kind of flexibility is quite familiar to people who have worked in incremental program development environments (e.g. with Smalltalk or Symbolics Lisp). The primary technical interest of network spaces is that they offer these ideas in a multi-person networked context to end-users.
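The house example can be made concrete with an object whose methods live in a per-object table that may be edited while references to the object are outstanding. This is a sketch only; `WorldObject`, `define`, `remove` and `call` are hypothetical names.

```python
class WorldObject:
    """An object whose methods can be added and removed at run time."""

    def __init__(self, name):
        self.name = name
        self._methods = {}

    def define(self, verb, fn):
        self._methods[verb] = fn

    def remove(self, verb):
        del self._methods[verb]

    def call(self, verb, *args):
        try:
            fn = self._methods[verb]
        except KeyError:
            # Fail "appropriately": the reference is still good, the verb is gone.
            raise LookupError(f"{self.name} has no '{verb}'") from None
        return fn(self, *args)
```

A visitor holding an old reference to the house keeps a valid reference after the owner swaps the doorbell for a pass-key system; only the call to the removed verb fails, with an error that says so.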
Run-time is all that there is.
Two levels of programming interfaces can be distinguished: systems programming and participant-modeling. The first is concerned with implementing the core abstractions of the network space: players, students, teachers, administrators, rooms, exits (doors, windows...), movement, perception (who-lists, descriptions), generic containers, generic classrooms, furniture, notes, user-input parsing, utilities... No such list can be complete for an evolving network community --- which is likely to need ongoing systems support. So the network space must support the ongoing construction, installation and maintenance of systems services and core abstractions in the context of a running world.
A desirable attribute of a network space is that it allows the construction of such core abstractions and systems services in any programming language whatsoever (C, C++, Visual Basic...) with appropriate encapsulation. This may be accomplished in many ways, e.g. via an interoperability strategy based on CORBA.
From these abstractions, participants should be able to assemble a vast variety of idiosyncratic objects, behaviors and experiences via primarily non-procedural construction --- homes, bathrooms, the sinking Titanic, the holodeck of the Starship Enterprise, radio-stations and radios, cars, pets, jacuzzis, swimming-pools, caves for spelunking, mountain bluffs for rock-climbing. Network spaces are likely to attract significant participation by people and become the incubators of network communities only when they offer significant affordances for self-expression. Richness of construction is one such affordance.
In many cases, significantly sophisticated "experiences" may be constructed by assembling and customizing a variety of system defined components and services. An agenda-manager can be combined with an object recycler to produce a time-bomb: at a pre-set future point in time it explodes, with appropriately noisy effects (rendered in text or other output modalities available at the client end), destroying all the objects owned by the owner of the bomb in the current room. A museum is a series of rooms with great artifacts (and their supporting material) on display, through which participants can be taken on a tour by a (computational) guide.
Crucial to supporting such rich constructions, then, are sophisticated environments for component assembly. Three areas of programming language design likely to be quite successful here are constraint programming, visual programming, and programming-by-example. The strength of (timed, concurrent) constraint programming is in allowing declarative assembly of simple reactive, modular elements (e.g. power sources, lasers, wires, switches, sensors, lamps) that can exhibit intricate behaviors when assembled into interacting systems (e.g. burglar alarms), and that can explain their behavior by keeping appropriate track of the inferences that underlie the composite behavior. The strength of visual programming (e.g. as in Toon Talk) is in presenting a powerful computational model via direct manipulation of pictures and animations and intuitive operations on them. As "component architectures" (COM, Java Beans) become more sophisticated, more and more widely usable visual programming environments are becoming available that allow end-users to construct sophisticated applications via iconic manipulations. The strength of programming by example is in letting a participant walk through a scenario of usage and allowing the system to construct a skeletal program in an appropriate high-level language which merely needs to be verified or fleshed out by the participant.
I expect the area of participant modeling paradigms to be very rich, interesting, and open-ended in the coming years. Network space designs must leave room for experimentation along this dimension. This can be accomplished by supporting an appropriate object inter-operability model (e.g. CORBA). (For instance the model must support the notion of secure access to the persistent, dynamically varying collection of objects and services that constitute a network space.)
Over the last decade, network spaces have largely been connected to via line-at-a-time text-based command-line interfaces (e.g., Mud-dweller, tinyfugue). These have proven to be surprisingly rich and robust -- in many ways, the task of creating an illusion about a virtual world is considerably simplified when text is the only medium of interaction. Perhaps because of their exposure to novels and other traditions of literature, people seem to be quite willing to suspend disbelief when reading text, and allow themselves to actually believe that they are in a room when they read a (well-written) description of a room, and to actually believe that someone has entered the room when they see a line of text across the screen which says so. Producing imaginative visuals and animations with a similar sense of illusion seems much more difficult given today's tools and people's training.
Footnote: On a technical note, supporting plain text-based interfaces, with no client-side processing, is actually considerably complicated for the network space because a natural-language style interface must be supported for referring to objects in the environment, and performing actions on them. MOO, for instance, supports a slow and intricate scheme for user input parsing which still leaves considerable room for improvement. For instance, an utterance such as "put black cherry in fruit bowl" must be parsed in the context of the objects available in the current environment, and the operations on them. This can become fairly complex, for instance if there are many black cherries on the table, and if the fruit bowl is named "Aunt Martha's wedding gift" and is recognized as a fruit bowl only because it is an instance of one. A direct manipulation interface, on the other hand, would have iconic representations for the various objects on the client screen, allowing a participant to click on the appropriate black ball and drag it into the object that appears like a red bowl on the screen.
Nevertheless, advances in client design and network protocols have made possible a significantly wider array of input/output modalities. As the Jupiter UI at PARC showed, it is straightforward to build a point-and-click GUI for a network space, supporting multiple windows, local and input editing, cut-and-paste, user-specified shared UIs (e.g. shared white-boards, tic-tac-toe games, post-its) etc. Extending these protocols to allow for client-side drag and drop operations is not difficult. Similarly, advances in VRML and network games have made possible the development of richly described (though generally static) visual 3D spaces with client-side rendering and real-time multi-player interactions mediated through a server. Shared audio and video running through separate net connections but controlled via participant connections to a network space have also been demonstrated. Separately, network telephony systems are now available; their integration into network spaces would be quite intriguing.
Network spaces must support experimentation with this wide variety of user interface modalities. Primarily what this calls for is a simple, reliable mechanism for "out-of-band" communication between a client and server, that is, communication from the client to the server which can be clearly distinguished from the user input to the server (and the response from the user) flowing down the same connection. Alternatively, RMI and CORBA-style interfaces can be used to accomplish the same effect.
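To make the parsing difficulty concrete, here is a deliberately small sketch that resolves "put <object> in <container>" against the objects present, matching each phrase by name or by class. It is not MOO's parser; `Thing`, `resolve`, `parse_put` and the `kind` label are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Thing:
    name: str
    kind: str  # hypothetical class label, e.g. "fruit bowl"


class Ambiguous(Exception):
    """Raised when a phrase matches no object or more than one."""


def resolve(phrase, things):
    """Return the single thing whose name or kind equals `phrase`."""
    phrase = phrase.lower()
    matches = [t for t in things if phrase in (t.name.lower(), t.kind.lower())]
    if len(matches) != 1:
        raise Ambiguous(f"{len(matches)} candidates for '{phrase}'")
    return matches[0]


def parse_put(command, things):
    """Parse 'put <object> in <container>' in the context of `things`."""
    words = command.split()
    if len(words) < 4 or words[0].lower() != "put" or "in" not in words[2:]:
        raise ValueError("expected: put <object> in <container>")
    split = words.index("in", 2)
    target = resolve(" ".join(words[1:split]), things)
    container = resolve(" ".join(words[split + 1:]), things)
    return target, container
```

With one black cherry on the table and a bowl named "Aunt Martha's wedding gift" whose kind is "fruit bowl", the utterance resolves. With two black cherries it raises `Ambiguous`, which is where a real parser would have to start asking the participant questions.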
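One simple way to get out-of-band communication on a line-oriented connection is to reserve a marker prefix for protocol lines and escape any ordinary line that happens to start with it. The sketch below assumes the markers `#$#` and `#$"`; both are choices made for this example, as are the function names.

```python
OOB_PREFIX = "#$#"    # assumed marker for out-of-band lines
QUOTE_PREFIX = '#$"'  # assumed escape for in-band lines that look like markers


def encode_in_band(text):
    """Prepare ordinary user text so it can never be mistaken for a protocol line."""
    if text.startswith((OOB_PREFIX, QUOTE_PREFIX)):
        return QUOTE_PREFIX + text
    return text


def encode_out_of_band(message):
    """Prepare a client-to-server protocol message."""
    return OOB_PREFIX + message


def classify(line):
    """Split a received line into (channel, payload)."""
    if line.startswith(OOB_PREFIX):
        return ("oob", line[len(OOB_PREFIX):])
    if line.startswith(QUOTE_PREFIX):
        return ("in-band", line[len(QUOTE_PREFIX):])
    return ("in-band", line)
```

A drag-and-drop client could then send something like `encode_out_of_band("drag obj1 obj2")` on the same connection that carries typed commands, and the server would route the two kinds of line to different handlers.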
The computational universe will consist of internetworked spaces --- a federated network of network spaces. Each subnetwork of spaces (federation) may be separately administered and hence may support different internal computational structures and administration policies (for instance, policies on player creation, "themes", appropriateness of construction, user charges etc).
In such a view it is natural to assume that each network space is responsible for the implementation of (a portion of) a virtual world. To participate in a conference on a room in a virtual world hosted by network space A, the participant must connect to A. This design decision allows regions of synchrony (which guarantee that a collection of actions initiated by multiple participants appears to each participant to occur in the same order; a room is an example of a region of synchrony) to be localized to single network spaces, avoiding computationally (and metaphorically) complicated issues arising with distributed transactions. A network space may implement some transparent master/slave replication scheme in order to support regions of synchrony that must handle a number of participants (say, thousands) larger than can be handled by a single network space implementation.
At the minimum, each federation must provide for authenticated connection of participants from other federations. A federation must also provide seamless transfer of some kind of computational traffic between the spaces (e.g. paging, email, bboard, http requests, etc). This implies the design and implementation of an application-level DNS-like naming and location service (for participants), and of associated messaging service providing varying levels (synchronous/asynchronous, secure/unsecure,...) of participant to participant communication. In particular, the messaging service should be able to identify whether a participant is connected to the network of federations, and if so where, and, if necessary, be able to continue to deliver messages even as the target moves from world to world.
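The naming, location and messaging requirements can be sketched as a single in-memory directory that knows where each participant is connected, holds messages for those who are offline, and follows a participant who moves between spaces. A real federation would distribute this service; the `Directory` class and its methods are hypothetical.

```python
class Directory:
    """Hypothetical federation-wide location and messaging service."""

    def __init__(self):
        self._where = {}      # participant -> space, while connected
        self._mailboxes = {}  # space -> list of (recipient, text) delivered there
        self._pending = {}    # participant -> texts held while offline

    def connect(self, who, space):
        """Record the participant's current space and flush held messages."""
        self._where[who] = space
        for text in self._pending.pop(who, []):
            self._deliver(who, space, text)

    def disconnect(self, who):
        self._where.pop(who, None)

    def locate(self, who):
        """Return the space the participant is connected to, or None."""
        return self._where.get(who)

    def send(self, who, text):
        space = self.locate(who)
        if space is None:
            self._pending.setdefault(who, []).append(text)
        else:
            self._deliver(who, space, text)

    def _deliver(self, who, space, text):
        self._mailboxes.setdefault(space, []).append((who, text))

    def delivered(self, space):
        return list(self._mailboxes.get(space, []))
```

The sketch covers only asynchronous, unauthenticated delivery; the varying levels of service mentioned above (synchronous, secure) would be layered on the same lookup.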
Additional higher levels of communication between federations may also be supported, e.g. supporting migration of code from federation to federation. It should be possible, for instance, for a participant to develop a particular kind of flower on her home network space and offer it for sale to customers from different network spaces. Instances of the flower should work on other network spaces as long as their system requirements, if any, are met (e.g., existence of particular abstractions or interfaces on the target spaces).
The combination of multiple people, objects and persistence means that the ingredients for a notion of time, space and experience shared among multiple participants exist. This is the basis for the construction of shared virtual worlds.
A network space must support the notion of dynamic contexts, or locales. A locale is the unit of space in a network space (e.g. room in MOO): it is an object that can serve as a "host" for a collection of players and other objects. The mere act of "colocation" of this collection can offer to the members of this collection certain computational facilities. For instance, a participant can directly name the objects in the current context and invoke "world level" operations such as "look" which display the description associated with the object. One participant may address another participant, secure in the knowledge that the discussion is "public", is a shared, common experience, that is, is visible to the other participants in that locale. Something uttered in "public" discourse in such a locale will be guaranteed by the network space to have been communicated to the players in that locale. Therefore, the participants have a basis on which they can agree on the "facts of the matter" in an episode --- the bare chronology and sequence of events (who was there, who said what to whom, who did what) --- even though, of course, they may not agree on the interpretations of what happened. Indeed, in some cases it may make sense to record the events happening in a particular locale over a particular period of time (e.g., town-meeting), thereby opening up the common experience to subsequent study and examination by other participants ("archeologists", reporters).
The context provided by a locale is said to be dynamic because each of the participants in the locale is an independent source of action and change in the locale. For instance, each participant could autonomously decide to stay in that locale, or say something, or perform some action on some object in the locale (open a book, read out a passage, fidget with her dress, express attitudes through body language, hand a document over to another person) --- very much in the same way as a group of people in a room may perform independent actions affecting the context and structure of their interactions.
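A locale's guarantee of a shared chronology can be sketched as a single sequencer: every event is numbered once by the locale, appended to a log, and handed to every current occupant in that order. The `Locale` class below is a hypothetical illustration that represents each occupant simply as the list of events they have seen.

```python
class Locale:
    """Hypothetical room: orders events and shows them to all occupants."""

    def __init__(self, name):
        self.name = name
        self.occupants = {}  # player name -> list of events that player has seen
        self.log = []        # shared chronology: (sequence number, actor, text)

    def enter(self, player):
        self.occupants[player] = []
        self.emit(player, "arrives")

    def leave(self, player):
        self.emit(player, "leaves")
        del self.occupants[player]

    def emit(self, actor, text):
        """Record one public event and deliver it to everyone present."""
        event = (len(self.log) + 1, actor, text)
        self.log.append(event)
        for seen in self.occupants.values():
            seen.append(event)
```

Two occupants may have seen different stretches of the log, depending on when they arrived, but any events they have both seen appear in the same order for each. Keeping `log` is also what makes the town-meeting record available to later "archeologists".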
This notion of a common shared experience between participants is one of the most striking affordances of a network space. It is what considerably strengthens the claim that such network spaces can richly support interactive, public virtual worlds.
The notion of network spaces offers a vision of how large-scale people-oriented computational systems can be designed. What we need now are concrete architectures and implementations supporting such a vision.
[This is intended to be a more succinct and carefully worded check-list of the desiderata from the main text. It should be of use to architects evaluating their proposals for network space design.]