William Kent, "Carriers And Cargos: A General Paradigm For Modeling Collections, Tables, Multimeda, Spatial Constructs, And Other Intensional/Extionsional Objects", Oct 1993. [20 pp]

## Carriers And Cargos: A General Paradigm For Modeling Collections, Tables, Multimeda, Spatial Constructs, And Other Intensional/Extionsional Objects

William Kent
Oct 1993

CONTENTS:
> 1 INTRODUCTION/ABSTRACT . . . 2
> 2 SETS . . . 2
>> 2.1 A Question . . . 3
>> 2.2 Extensional Sets . . . 3
>>> 2.2.1 Identity . . . 3
>>> 2.2.2 Immutability . . . 3
>>> 2.2.3 Existence . . . 3
>>> 2.2.4 Operations on Extensional Sets . . . 4
>> 2.3 Intensional Sets . . . 4
>>> 2.3.1 Identity . . . 4
>>> 2.3.2 Mutability . . . 5
>>> 2.3.3 Existence . . . 6
>>> 2.3.4 Operations on Intensional Sets . . . 6
>> 2.4 A Terminological Digression . . . 6
>>> 2.4.1 Extensional and Intensional Specification . . . 7
>>> 2.4.2 What's an Object? . . . 7
>>> 2.4.3 Casting . . . 7
>>> 2.4.4 Parameterized Types . . . 8
>> 2.5 Set Types . . . 9
> 3 CARRIERS AND CARGOS . . . 10
>> 3.1 Mutation Operators . . . 11
>> 3.2 Constraints on Mutation . . . 12
>> 3.3 Intensional Set Types . . . 13
>> 3.4 The Answer . . . 14
> 4 OTHER USES FOR CARRIERS AND CARGOS . . . 14
>> 4.1 Multisets, Lists, and Other Aggregates . . . 14
>> 4.2 Tuples and Records . . . 14
>> 4.3 Spatial and Geometric Constructs . . . 15
>> 4.4 Documents, Files, Multimedia, and Other Containers . . . 16
>> 4.5 Large Objects . . . 17
>> 4.6 Events . . . 17
>> 4.7 Relational Tables . . . 17
>> 4.8 A Comparison of Carriers and Variables . . . 19
> 5 SUMMARY/CONCLUSIONS . . . 19
> 6 REFERENCES . . . 20

## 1 INTRODUCTION/ABSTRACT

Confusion over identity, existence, and mutability of sets can be sorted out by distinguishing between intensional and extensional sets. The same applies to other collections, of course. A similar pattern also arises in distinguishing between a relational table and its collection of tuples, a team and its members, a document and its text, a file and its contents, a circular object and a geometric circle, a movable point and a point in geometric space, a photograph and the image it contains, an event and the time at which it occurs, a counter and its value, and probably other things as well.

The essential paradigm for dealing with all these phenomena is to treat them as pairs of associated but distinct objects, which we can generically call carriers and cargos. Carriers are created objects which carry other objects (often literals) as their cargos. The cargo of a given carrier is constrained to be of a specified cargo type. Thus the cargo of an intensional set is an extensional set, the cargo of a team is a set of persons, the cargo of a document is its text, and so on. A carrier will often "inherit" operations defined on its cargo, so that the union of two intensional sets is the union of their extensions, and printing a document means printing its character string text. In effect, a carrier can be "cast" into a role normally played by its cargo. At the same time, a carrier has unique operations of its own, such as recording the author of a document or the owner of a team.

This paradigm can be realized using familiar language facilities such as casting and parameterized types.

## 2 SETS

### 2.1 A Question

Suppose the set of Bowlers is {Dick,Jane}, and the set of Golfers also happens to be {Dick,Jane}. (Let's simplify one thing by pretending that people are uniquely identified by their first names.) Are Bowlers and Golfers the same? Well, yes and no. A precise answer requires a careful distinction between extensional and intensional sets.

### 2.2 Extensional Sets

#### 2.2.1 Identity

Extensional sets are the basic mathematical constructs we learned about in school. The essential catch phrase was "a set is determined by its extension" (i.e., by its members), which implies an essential principle of identity. If a set contains just Dick and Jane, that uniquely identifies the set. There is no other set which contains just Dick and Jane. It's even a mistake to say that two sets which contain just Dick and Jane are the same, because they weren't two sets to begin with. It's like saying two numbers which are both five are the same number. There is only one number five. There is only one set {Dick,Jane} which contains just Dick and Jane.

The nature and identity of an extensional set is not affected by type considerations. If Dick and Jane are both students and teachers, then the set of teachers {Dick,Jane} and the set of students {Dick,Jane} are one and the same set.

#### 2.2.3 Existence

There's also an important existence principle. If anything exists, then the set containing it exists. That's the foundation of the set-theoretic definition of numbers: the empty set exists (corresponding to 0), and the set containing an existing set also exists (thus establishing the notion of successor). Furthermore, the union of any sets is an existing set. The net effect is that for any bunch of things which exist, the set containing those things exists. It's somewhat analogous to saying that all the numbers exist.

All sets of existing things exist, including such bizarre combinations as {Dick,5,Chicago,Jupiter,red,{}}. Such sets exist whether or not they have ever been mentioned before, and whether or not they are currently recorded in some data item. The number 49786554313989 exists, whether or not anyone has ever mentioned it before, and so does that bizarre set.

It follows that extensional sets cannot be explicitly created or destroyed. They exist or don't exist automatically in correspondence with the existence of their members. At the moment Dick comes into existence, all the sets containing Dick (with zero or more other existing members) come into existence, together with all the sets containing such sets, ad nauseam. The moment Dick ceases to exist, all such sets cease to exist.

We can call this an ephemeral mode of existence. Such things do not exist forever, nor are they overtly created or destroyed by some explicit operation. Their existence is determined by some rule which might be satisfied at some times but not at other times; they exist only while the rule holds. Things which always exist, like the numbers and the empty set, are said to have an eternal mode of existence. Things which are explicitly created and destroyed, such as people or departments or tables, can be said to have a mortal mode of existence. Some sets are in fact eternal, namely sets of eternal things, such as the sets of numbers. Thus the set {1,2,49786554313989} is eternal, always existing.

Calling the expression {Dick,Jane} a "constructor" is another misnomer to the extent that it implies that something new is being created. It's simply an expression that refers to an existing thing, just as 1+1 refers to an existing number. We will instead call such an expression a designator, to avoid the unintended connotation. (The term has been adopted in SQL3 [reference].)

Sets have no ordering, so that the expressions {Dick,Jane} and {Jane,Dick} designate the same set, just as 5+1 and 1+5 designate the same number. Thus different designators may designate the same set.

#### 2.2.4 Operations on Extensional Sets

The usual operations of union, intersection, and difference of sets are well known. Card(s) gives the cardinality of a set s, i.e., the number of members it has. (We are not much concerned about infinite sets.) The predicate Occurs(x,s), often written x IN s, is true if and only if x is a member of the set s.

Equality of set-valued expressions, in the sense of being identical (one and the same), is defined as

s1=s2 <=> FORALL x, x IN s1=x IN s2.

### 2.3 Intensional Sets

#### 2.3.1 Identity

Bowlers and Golfers are different teams, regardless of their membership.

If Tom becomes one of the Bowlers, the team is still the Bowlers, whose membership has changed to {Tom,Dick, Jane}. Tom might also become one of the Golfers, making its membership {Tom,Dick,Jane} as well, but Bowlers remains a different team from Golfers.

By analogy, Symbols and Procedures are different counters in the compiler, even if they happen to have the same value.

#### 2.3.2 Mutability

Intensional sets are mutable, via a replacement process. An intensional set always has an associated extensional set (possibly empty), sometimes called the "value" of the intensional set. We can think of the intensional set as a carrier, having an extensional set as its cargo, being the value of a Cargo function applied to the carrier.

An intensional set (i.e., a carrier) is mutated by replacing its cargo with a different cargo. It's crucial to distinguish the two distinct steps involved:

1. Compute the new set. This does not alter the old cargo, nor does it yet affect the carrier.
2. Make this result the new cargo of the carrier. It replaces the previous cargo, without altering the previous cargo. Now we consider the carrier to be mutated.

Thus, Tom becomes one of the Bowlers in these two steps:

1. Compute the new cargo value: Cargo(Bowlers) UNION {Tom} = {Dick,Jane} UNION {Tom} = {Dick,Jane,Tom}. This does not alter the old cargo, which is the extensional set {Dick,Jane}, nor does it yet affect the carrier Bowlers.
2. Make this result the cargo of the carrier: Cargo(Bowlers)<-{Dick,Jane,Tom}. It replaces the previous cargo, without altering the previous cargo. Now we consider the carrier to be mutated.

Analogously, incrementing the Symbols counter occurs in the same two steps:

1. Compute the new value, e.g., Symbols+1 = 5+1 = 6.
2. Assign that as the new value: Symbols<-6.

Separating the two steps clarifies the significance of constraints (including type declarations). Constraints typically apply to carriers rather than directly to their cargos. For example, suppose that teams such as Bowlers and Golfers may only have sets of people as their cargos. It's meaningless to think of this as a constraint on the set {Dick,Jane}; there's no point in constraining it - it is a set of people, and it will always be one so long as Dick and Jane are people. The constraint is on the team, restricting what sorts of cargos it may carry.

Thus, if we try to add Spot the dog to Bowlers, the first step is still valid: it is perfectly legal to compute the set {Dick,Jane} UNION {Spot}={Dick,Jane,Spot}. That is, the union operation is legal. It is only the second step which is illegal, namely trying to make this set the cargo of Bowlers.

Similarly, if counters are restricted to non-negative integers, that does not constrain the number 5; it simply is a non-negative integer. Subtracting 10 from the current value of Symbols is perfectly legal arithmetic, though it would not be legal to decrement the Symbols counter by 10. The subtraction Symbols-10 is legal, while the assignment is not.

#### 2.3.4 Operations on Intensional Sets

We'd often like to perform operations like union, intersection, difference, cardinality, and occurrence on things like Bowlers and Golfers. These operations are defined on sets, but Bowlers and Golfers are teams, not sets.

It is thus useful to treat Cargo as a casting function [Section 2.4.3], which allows operations defined on cargos to be applied to the carriers of such cargos. We'd like Bowlers UNION Golfers to be interpreted as Cargo(Bowlers) UNION Cargo(Golfers). This isn't true for all operations, equality being one of the notable exceptions. Bowlers=Golfers is very different from Cargo(Bowlers)=Cargo(Golfers). Casting is only invoked for operations which are not otherwise defined on the carriers.

Carriers often have properties of their own apart from the properties of their cargos. A team might belong to a league, and have an owner, manager, schedule, and record. Thus Bowlers and Golfers would be instances of the type Team, on which such properties are defined.

Another general class of operations defined on carriers but not on their cargos are the mutators [Section 3.1], which change the cargos of carriers. Thus, for example, the union operation Bowlers UNION {Harry} may yield a different set, but it does not alter the membership of Bowlers. An Insert operation would be a different thing, assigning the result of such a union to be the new cargo of Bowlers. The Insert operation would be defined as

Insert(x,c) ::= Cargo(c)<-Cargo(c) UNION {x},

incorporating the two steps described in Section 2.3.2.

### 2.4 A Terminological Digression

#### 2.4.1 Extensional and Intensional Specification

If you find this section confusing, ignore it. It only matters if you've seen the terms used otherwise.

The terms "extensional" and "intensional" are sometimes used in a different sense to describe the form of specification of a set. An extensional specification enumerates the members, while an intensional specification provides a membership rule. The two usages are orthogonal, since we can find examples of extensional and intensional sets specified either extensionally or intensionally.

Here's an extensional specification of an extensional set: {1,2,3,4,5}. The members are simply enumerated in the specification. The same set can be specified intensionally by the membership rule "the integers between one and five, inclusive". Though expressed in the form of a membership rule, it defines an extensional set determined by its members. In fact, it is the same as the set specified by the rule "the prime factors of 120, including one".

#### 2.4.2 What's an Object?

The terminology of object technology is in great disarray, such that any choice of terminology is bound to alienate a sizable portion of readers. The best we can do is to make clear how terms are being used, hoping that the reader is willing and able to translate into his own favorite set of definitions.

The main issue with the word "object" itself is whether it denotes only mortal things [Section 2.2.3], or all things whether mortal, eternal, or ephemeral. We use it here in the latter broad sense, using the term "surrogate" for the narrower concept. We thus have the following taxonomy, with some alternatives shown in brackets:

• Object [abstract object, anything]
• Surrogate [object, mortal thing]
• Literal [value, eternal thing]

#### 2.4.3 Casting

Casting is a general mechanism which replaces an invalid argument to an operation with a related valid argument, selected by a casting function. For example, the Factorial function is defined on integers. Without casting, applying it to a rational, as in Factorial(2.5), would be a type violation. However, if Truncate is defined to be the casting function from rationals to integers, then Factorial(2.5) would be evaluated as Factorial(Truncate(2.5)), i.e., Factorial(2).

A casting function g is defined between a pair of types T1 and T2, with signature g:T1->T2. In effect it overloads the definitions of all functions which are defined on T2 but not on T1. If a function f:T2->T3 is not also defined on T1, then it implicitly acquires another definition on T1 such that f(x)=f(g(x)) if x is an instance of T1.

Casting may not be essential, since it can always be written explicitly, but it is a commonly provided convenience in many languages. It does help to simplify programming, and enhance readability of programs.

Languages place various sorts of constraints on casting functions. The following restrictions are adequate for our purposes:

• There is at most one casting function from a given type to a given type.
• There may be any number of casting functions from a given type.
• Casting is only permitted between disjoint types.

There are several mechanisms by which a casting function may be identified:

• Use a reserved name, such as Cast. It can be overloaded with multiple definitions for different argument and result types. Some languages might not allow overloading with more than one definition for a given argument type, thereby disallowing such casting to more than one target type.
• A casting function can be defined with its own user-specified name, with a keyword flagging it as a casting function. This is somewhat similar to the previous case, if we consider the keyword as providing a common alias for the functions.

#### 2.4.4 Parameterized Types

A family of types such as "Set of Persons", "Set of Integers", etc., can be described as "parameterized types" of the form SET(Person), SET(Integer), etc. SET(Person) designates a type whose instances are all the sets having only people as members (its extension is the power set of the set of all persons).

SET itself is not a type, but a type generator. It is a function which maps a type such as Person into another type, SET(Person), which is a generated type. The term parameterized type should be taken precisely to mean a generated type, not the generator itself.

Such generated types are themselves ephemeral objects. Whenever the type T exists, the type SET(T) is also an existing object, again recursively ad nauseam. Thus the type whose instances are sets whose members are sets of sets of integers exists, whether anyone cares or not. To be really precise, we should also avoid the term "generator", since it too implies that something new is being created. However, we will continue to use the terms "type generator" and "generated type" rather than the more precise "type designator" and "designated type".

Generated types, being ephemeral, are a distinct sub-category of types, distinguished from the more familiar sub-category of types which are explicitly created.

A type template is a description of a type generator. It is analogous to a function body (procedure); when the parameters in the body are instantiated, the template then serves as the definition of a specific generated type.

A simple type generator takes a single type as its only parameter. Such a generator is a function mapping any type into a generated type. Tuple types [Section 4.2] provide an example of non-simple type generators.

An interesting subtype relationship exists among simply generated types. Let's use T2 SUBSET T1 to denote subtyping. If T2 SUBSET T1, then G(T2) SUBSET G(T1) for any simple type generator G. Thus SET(Student) SUBSET SET(Person) - a set of students is also a set of persons.

A simple type generator G also has a maximal generated type G(Object) which is a supertype of any other type G(T) generated by G. Thus although SET itself is not a type, SET(Object) serves the same purpose, being the type whose instances are all the existing sets.

There is also a parallel notion of sub-generators: if G2 SUBSET G1, then G2(T) SUBSET G1(T) for any type T. If we wanted Set to be a subtype of Multiset, we might define SET as a sub-generator of MULTISET. Then SET(Person) would be a subtype of MULTISET(Person).

### 2.5 Set Types

We could say that ExtensionalSet is a single type, having all the extensional sets shown in earlier examples among its instances. However, since types are primarily used to constrain the permissible values of variables and functions, it's useful to have more specific set types, such as "Set of Persons" or "Set of Integers". These are conveniently described in terms of generated types, as just discussed.

Let SET be the type generator for extensional set types. (The unqualified term "set" will always mean an extensional set.) It is a function which takes a type as a parameter and returns the corresponding extensional set type. Thus the two types we just mentioned would be designated as SET(Person) and SET(Integer). The type whose instances include all extensional sets is given by SET(Object).

If Dick and Jane are students, then {Dick,Jane} is an instance of both SET(Student) and SET(Person). If an object can be an instance of several types which are not related as subtypes, then a set can be an instance of set types which are not related as subtypes. Thus if Dick and Jane are also teachers, then {Dick,Jane} is also an instance of SET(Teacher), which is neither a subtype nor supertype of SET(Student). In general, a set is an instance of SET(T) whenever all its members are instances of T.

If objects can change types, then sets containing those objects may also change types. If Dick alone becomes an Employee, then {Dick,Jane} is not yet an instance of SET(Employee) - but it will be as soon as Jane also becomes an Employee. This membership ceases as soon as either person stops being an Employee. If s1 and s2 are instances of SET(T1) and SET(T2), respectively, then the result of s1 UNION s2 is an instance of SET(Ti), where Ti is any type having both T1 and T2 as subtypes, i.e., any common supertype of T1 and T2. Such a common supertype is not necessarily unique. This is illustrated by the following example, under which the union of a set of interns with a set of trainees yields something which is both a set of students and a set of teachers.

```        ..........
: Person :
..........
/\
/  \
/    \
/      \
...........  ...........
: Student :  : Teacher :
...........  ...........
|    \      /    |
|     \    /     |
|      \  /      |
|       \/       |
|       /\       |
|      /  \      |
|     /    \     |
|    /      \    |
...........  ...........
: Intern  :  : Trainee :
...........  ...........
```

We'll deal with intensional set types in Section 3.3, after a general introduction to carriers and cargos.

## 3 CARRIERS AND CARGOS

We have defined Cargo to be a function which serves as a casting function. It can be redefined on various argument and result types. A specific Cargo function is a definition of Cargo on a particular argument type T, which will be denoted as T.Cargo.

A carrier type is a type which is the argument type of a Cargo definition, and a carrier is an instance of a carrier type. The cargo of a carrier c is the value of Cargo(c).

The following restrictions are useful:

• Argument types of Cargo definitions should be disjoint, so that an object is an instance of at most one carrier type, and at most one specific Cargo function applies to any object.
• Argument and result types for any specific Cargo function should be disjoint.

### 3.1 Mutation Operators

Although it might be commonly assumed that casting functions are not updatable, the Cargo function is updatable. Mutation of carriers is defined as update of the Cargo function. The basic form is direct assignment:

Cargo(x)<-y,

as in

Cargo(Bowlers)<-{Dick,Jane},
Cargo(Symbols)<-5.

Many operations which modify the current cargo can be described in the general form:

Mutate(x,op,y) ::= Cargo(x)<-op(Cargo(x),y).

For example, a counter might be incremented by

Increment(Symbols,2) ::= Mutate(Symbols,Plus,2),

i.e.,

Cargo(Symbols)<-Plus(Cargo(Symbols),2).

Someone might be added to a team by

Insert(Bowlers,Tom) ::= Mutate(Bowlers,Append,Tom),

i.e.,

Cargo(Bowlers)<-Append(Cargo(Bowlers),Tom),

where

Append(x,y) ::= x UNION {y}.

It might also be useful to consider a form of reverse casting on assignment. If there is a Cargo definition such that x is an instance of the argument type (i.e., the carrier type) and y is an instance of the result type, then

x<-y

can be interpreted as

Cargo(x)<-y.

Then

Bowlers<-{Dick,Jane}

would be interpreted as

Cargo(Bowlers)<-{Dick,Jane}.

### 3.2 Constraints on Mutation

Constraints on mutation are defined by the applicable definition of the Cargo function. For the types Team and Counter, the Cargo function would be defined with signatures

Cargo:Team->SET(Person),
Cargo:Counter->Integer.

Thus any mutation of Bowlers or Golfers must leave their cargo as a set of persons, while any mutation of Symbols or Procedures must leave their cargo as an integer.

The type of cargo which a carrier can carry is the result type of the specific Cargo function which is applicable to that carrier. All instances of this carrier type have the same cargo type.

It is sometimes necessary to further constraint the cargo type for individual instances of a carrier type. For example, one team might have to be a team of doctors, another a team of pilots, etc. (This becomes relevant for relational tables [Section 4.7].) This essentially requires a CargoType function with signature

CargoType:T->Type.

Mutation of carriers is now further constrained to satisfy

Cargo(x) IN CargoType(x).

Given a carrier type T1 with the two functions

Cargo:T1->T2,
CargoType:T1->Type,

let CargoType(x)=T3 for some instance x of T1.

Cargo(x) must now be an instance of both T2 and T3, which only makes sense of they are non-disjoint. In general, it seems reasonable to require that T3 be a subtype of T2. In effect, this makes the cargo type of a carrier a subset of the cargo type defined for the carrier type. Thus we might have Medics as a team, with

CargoType(Medics)=SET(Doctor).

Type checking for carrier mutation now has to go beyond signature-based constraints implied by the Cargo function, and enforce the specific CargoType constraint as well.

One plausible approach to all this is to define Carrier as a type, with the following functions defined on it:

Cargo:Carrier->Object,
CargoType:Carrier->Type.

The Cargo function could be redefined on any subtype of Carrier. If Cargo is redefined on a subtype T of Carrier, then the following rules apply for any instance x of T:

• The initial (default) value of CargoType(x) is the result type of T.Cargo.
• CargoType(x) must be a subtype of the result type of T.Cargo.

Any subtype of Carrier could be considered a carrier type.

## 4 OTHER USES FOR CARRIERS AND CARGOS

Most applications of the carrier/cargo concept take the following form:

• Define a Cargo function on a type T, making it a carrier type.
• Define additional operations on the type T.

### 4.1 Multisets, Lists, and Other Aggregates

The concepts extend trivially to multisets and lists. Details are not too important for the present paper. MULTISET and LIST can be defined as generators. MULTISET may or may not be a sub-generator of SET. There could be an abstract (non-instantiable) COLLECTION type which is the supertype of all the SET, MULTISET, and LIST types. Designator expressions are evaluated differently. Evaluating the multiset designator [Dick, Jane, Dick] does not eliminate duplicates, while evaluating the set designator {Dick, Jane, Dick} does. Similarly, the list designators <Dick,Jane> and <Jane,Dick> evaluate to different lists, while {Dick,Jane} and {Jane,Dick} evaluate to the same set.

### 4.2 Tuples and Records

A tuple type is defined by a sequence (list) of one or more types. There are two main variants on the tuple concept:

• The tuple type is simply defined by a sequence of types. The elements can be accessed by numerical index, just like a list. Tuples of the same length are equal if corresponding elements are equal. Column names in a table correspond to ordinal positions within its tuples.

Within this variant there are two ways to relate tuples and lists:

• They are always different things. A list of two people is not an instance of TUPLE(Person,Person).
• Lists can be instances of tuples. A list of two people is an instance of TUPLE(Person,Person). More generally, a list of n elements is an instance of TUPLE(T1,...,Tn) if the i-th element of the list is an instance of Ti for each i.
• The tuple type is defined by a sequence of name:type pairs, where the name serves as an "attribute name" or "field name". The generator then takes the form TUPLE(a1:T1,...,an:Tn).This variant is sometimes also called a "record type". Tuple equality depends on matching attribute names as well as element values. The column names of a table are the attribute names of its tuples.

The TUPLE type generator is not a simple generator [Section 2.4.4], since it takes a list of types rather than a single type as its parameter. It has no corresponding maximal type. The type which contains all tuples has to be defined separately as the supertype of TUPLE(Object), TUPLE(Object,Object), TUPLE(Object, Object,Object),....

As a notational convenience, we allow G() to designate the union of all types generated by the generator G, allowing us to use TUPLE() to denote the type whose instances are all tuples.

### 4.3 Spatial and Geometric Constructs

A point in space is not the same thing as a point that can move in space. Points in a geometric space are much like literals. Once the space has been defined, all the points exist. They can't be created and destroyed, and they have immutable properties. Thus, we might define GeometricPoint as a two-dimensional Euclidean space. The points in the space all exist, having rectangular and polar coordinates as immutable properties. Such points might be designated via either sort of coordinates, e.g., RectPoint(10,20) or PolarPoint(10,30). Immutability of properties means that it makes no sense to move these points; their coordinates can't be updated. The identity of a geometric point is inseparable from its location; its location is its identity.

The definition of such a type might take the form

CREATE LITERAL TYPE GeometricPoint p
FUNCTIONS

Xcoord->Real;
Ycoord->Real;
Magnitude->Real;
Angle->Real;

CONSTRAINTS

Magnitude(p)=Sqrt(Xcoord(p)² + Ycoord(p)²);
Ycoord(p)=Xcoord(p) * Tan(Angle(p));

DESIGNATORS

RectPoint(x Real, y Real): Xcoord(p)=x, YCoord(p)=y;
PolarPoint(m Real, a Real): Magnitude(p)=m, Angle(p)=a;

EQUAL(RectPoint(x1,y1),RectPoint(x2,y2)): x1=x2, y1=y2;

EQUAL(PolarPoint(m1,a1),PolarPoint(m2,a2)): m1=m2, a1=a2;

EQUAL(RectPoint(x,y),PolarPoint(m,a)): m=Sqrt(x²+y²), y=x*Tan(a);

On the other hand, suppose we want to talk about a line between PointA and PointB, such that the line can be moved by moving these end points. These points are a different sort of thing, since they have to be explicitly created and destroyed, and they can be moved (i.e., their coordinates can change). Their identities are independent of their locations. They can change location without changing identity, and two of them can be in the same place without being the same point.

For these we need a different type, MovablePoint, which is a carrier type having GeometricPoint as its cargo:

CREATE TYPE MovablePoint
FUNCTIONS

Cargo->GeometricPoint;
Color->Colors;
OccursOn->SET(Line);
Location ALIAS Cargo;

PointA, an instance of MovablePoint, has an instance of GeometricPoint as its cargo. A property such as Xcoord(PointA) is defined by cargo casting, being evaluated as Xcoord(Cargo(PointA)); aliasing makes this equivalent to Xcoord(Location(PointA)). Mutators could be defined on MovablePoint such that, for example, Xcoord(PointA)<-25 means

Cargo(PointA)<-RectPoint(25,Ycoord(PointA)).

The same reasoning extends to other geometric constructs. A circle is not the same thing as a circular object. Circles are abstractions like literals, in the sense that all circles exist, and there is only one circle with a one-inch diameter. The identity of a circle is inseparable from its diameter (and its radius and circumference as well).

### 4.4 Documents, Files, Multimedia, and Other Containers

A river is not the same thing as the water in it. A document is not the same thing as the text in it. A file is not the same thing as the bit-string or byte-string it contains. A container is not the same as its contents. All of these can be modeled as carriers and cargos.

Properties of document objects include such things as authors and publication schedules, which are not properties of character strings or structures of character strings. Operations such as concatenation, or finding substrings, are defined on character strings, not documents. Such operations can be performed on documents if they are defined as carriers whose cargos are character strings or structures of character strings (e.g., chapters). Documents are edited by effectively replacing one cargo with another.

Properties of files include such things as owners, creation dates, privileges, positions in directories, etc. Operations on bit strings are much the same as operations on character strings. Files can thus be modeled as carriers of bit string cargos.

A photograph is not the same thing as the image in it. Different photos may have the same image, and the image in a photo can be changed by retouching or other forms of enhancement. A photograph is thus a carrier having an image as its cargo.

### 4.5 Large Objects

Large objects [4,1] may also fit the paradigm of carriers and cargos. The distinction between a very long string and its "handle" is visible to the user to the extent that he can control which is to be moved or copied or otherwise operated on. This parallels the carrier/cargo distinction. At the same time, operations defined on strings are to be applied to their handles, with the intent of having them performed on the corresponding strings. This parallels the casting semantics of the Cargo function.

Different sorts of large objects (e.g., bit, byte, and character strings) can be distinguished by different result types on specific Cargo functions, or by different values of CargoType.

### 4.7 Relational Tables

For tables, the signature of the Cargo function defines the form of relational model, while the individual cargo types describe the row types for individual tables.

In its basic form, a relational table is a named, created object with an associated varying set of tuples of atomic elements, very much like a team whose members happen to be tuples rather than people. Thus a table is a carrier whose cargo type is SET(TUPLE(AtomicType,...,AtomicType)). This form of relational table is described by Cargo function signatures of the form:

Cargo: Table->SET(TUPLE(AtomicType,...,AtomicType)).

Cargo: Table->MULTISET(TUPLE()).

Ordered tables are admitted by allowing arbitrary collections. If the COLLECTION generator has SET, MULTISET, and LIST as sub-generators [Section 2.4.4], then the signature takes the form

Cargo: Table->COLLECTION(TUPLE()).

To go even further and allow a table to be a collection of arbitrary objects, the signature can be generalized to

Cargo: Table->COLLECTION(Object).

Generalized relations of this form have been considered for SQL [2,3].

The form of a particular table is given by the CargoType function. A Scores table might be created as an instance of Table with its CargoType initialized as

CargoType(Scores)<-SET(TUPLE(Person,Integer)).

In general, CargoType for a table has the form COLLECTION(RT), where RT is the row type of the table.

Overall, this approach differentiates between a table object and the collection it carries as its cargo. This allows queries to be described more precisely as closed operations on collections rather than on tables. The semantics are the same. When a query is applied to a table, it is naturally cast into a query on the associated collection. Queries on multiple tables can be defined as operating on the cross-products of their associated collections.

Letting the results of queries be collections rather than tables simplifies certain existence, identity, and typing issues. If results of queries are table objects, then executing the same query twice necessarily yields unequal results. Treating the results as extensional collections allows them to be equal, in the sense that {Tom} UNION {Dick} and {Tom} UNION {Dick} yield the same result.

Collections are instances of generated types, which are ephemeral and hence don't need explicit creation [Section 2.4.4]. Hence result types do not have to be created for the results of arbitrary queries. Result types can be determined from the form of the SELECT clause. For example, the result of a query to SELECT x.Name, x.Age FROM... is an instance of MULTISET(TUPLE(Character,Integer)). This type always exists, so long as the types Character and Integer exist, and does not have to be fabricated for any particular query.

If tables are desired as query results, then the collection yielded by a query can be assigned as the cargo of a table object, so long as it satisfies the CargoType constraint of the table.

As with carriers in general, tables have other properties in addition to the properties of their cargos, such as names, the schemas they belong to, and authorization constraints. Depending on how tuples are defined [Section 4.2], the list of column names could also be a property of the table.

### 4.8 A Comparison of Carriers and Variables

Even a variable is not quite the same thing as its value. A carrier and its cargo bear some resemblance to a variable and its value, with the cargo type of the carrier being analogous to the declared type of the variable. The Cargo function is somewhat analogous to an expression evaluator - for a carrier c, f(c) sometimes behaves like f(Cargo(c)), while for a variable x, f(x) behaves like f(Eval(x)). In other words, a counter such as Symbols sometimes behaves like an integer-valued variable.

## 5 SUMMARY/CONCLUSIONS

Carriers and cargos provide a general mechanism for modeling many sorts of things which come in natural pairs, usually having an intensional/extensional or container/content flavor.

The Cargo and CargoType functions are defined on all carriers, imposing the constraint

Cargo(x) IN CargoType(x).

A carrier type is the argument type of a specific Cargo function. The default CargoType for a carrier is the result type of the specific Cargo function defined on its carrier type. The CargoType can be explicitly specified for individual carriers, so long as it remains a subtype of the default.

The essential characteristics of carriers and cargos:

• They are distinct objects, each having properties of their own.
• The Cargo function serves as a casting function, so that a carrier can be replaced by its cargo in operations which are defined on the cargo but not on the carrier. Thus in many respects a carrier behaves as though it was its cargo.

This approach seems useful for such things as intensional sets and other collections, documents, files, geometric and spatial constructs, relational tables, and many others.

This approach is closely related to identity vs. equality in some object models, with identity referring to objects and equality referring to their "values" [6]. Such values are in fact the cargos for objects which are their carriers. The difference in the present approach is to recognize such objects and their values as being distinct things, with an explicit mapping between them.

Although it seems quite promising, this approach still needs to be validated by prototyping in a real language. Areas needing further investigation include:

• Interaction with other forms of inheritance and casting.
• Type checking mechanisms for enforcing CargoType constraints.
• Complex cargos. A movie or video might contain several concurrent frame sequences and sound tracks, with synchronization controls.

## 6 REFERENCES

1. ANSI X3H2-93-359R/ISO DBL MUN-003, (ISO-ANSI Working Draft) Database Language SQL (SQL3), Jim Melton (ed.), August 1993.
2. ANSI X3H2-93-231, "Tables of ADTs", May 1, 1993, by Phil Shaw.
3. ANSI X3H2-93-302, "Removal of table of ADT", March 8, 1993.
4. ANSI X3H2-93-341r1, "Large Object Strings", July 11, 1993, by Paul Cotton, Toby Lehman, Nelson Mattos and Frank Pellow.
5. ANSI X3H2-93-425, "Another Approach to BLOBs", Sept. 10, 1993, by Jim Melton.
6. Setrag Khoshafian and George Copeland, "Object Identity", Proc. Conf. on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA), Portland, Oregon, 1986.