The C[] Language Specification
 

Table of Contents

  1. Arrays
  2. Pointers
  3. Access to the Elements of an Array
  4. Vectors
  5. Access to Subarrays
  6. Scalar Operators
  7. Vector Operators
  8. Lvectors
  9. Reduction Operators
  10. Array Segments
  11. Irregular Segments
  12. Determining Undefined Vector Size from Context

Arrays

In the C language an array comprises "a contiguously allocated set of elements of any one type of object". In the C[] language an array comprises sequentially allocated elements (with a positive 'step') of any one type of object. Thus, in the C[] language an array has at least three attributes, namely, the type of its elements, the number of elements, and the allocation step. In the C[] language, the array declarator syntax differs from the standard in the following way. The rule

     direct-declarator:
         direct-declarator [ expression(opt)]


is replaced with the rules

     direct-declarator:
         direct-declarator [ expression(opt) step(opt)]
     step: ':' expression

If step is not specified, then it is equal to 1. The step should be a positive integral value.

Unlike the C language, C[] allows arrays to have variable length, i.e. an array length may be any expression of an integral type. Similarly, a variable array step is also permitted. Arrays with non-constant steps or lengths must have automatic storage duration, i.e. they must be declared either with the auto storage class specifier or within a function body without storage class specifiers.

Examples:

The declarations int a[3:1] and int a[3] both define an array of the form

a[0] a[1] a[2]


The size of the slot between elements of the array is equal to zero.

The declaration int a[3:3]; defines an array of the form

a[0] ---- ---- a[1] ---- ---- a[2]    (---- is unused storage the size of one int)


The size of the slot between array elements is equal to 2*sizeof(int) bytes.
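
The layout rule can be illustrated in standard C. The following is only a sketch, not part of C[]: the helper names stepped_offset and stepped_gap are hypothetical, and the model assumes that element i of an array of ints declared with step s is placed i*s int-sized positions from the start of the array.

```c
#include <stddef.h>

/* Byte offset of element i in an array of ints declared with the given step
   (hypothetical helper modelling the C[] declaration "int a[n:step]"). */
size_t stepped_offset(size_t i, size_t step)
{
    return i * step * sizeof(int);
}

/* Size in bytes of the unused storage between two neighbouring elements. */
size_t stepped_gap(size_t step)
{
    return (step - 1) * sizeof(int);
}
```

Under this model stepped_gap(1) is zero, as for int a[3], and stepped_gap(3) is 2*sizeof(int), as for int a[3:3].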

In the following example the array A of N+M elements is declared:

   int N,M;
   ...
   f(){
     int A[N+M];
     ...
   }



Pointers

In the C language a pointer has only one attribute, namely, the type of object it points to. This attribute is necessary for the correct interpretation of the values of the objects it points to, as well as of the address operators + and -. These operators are correct only if the pointer operands and the pointer results point to elements of the same array object.

The same rule is valid for the C[] language. Therefore, to support the correct interpretation of the address operators, one additional pointer attribute is introduced: the step.

In the C language Standard, "when an expression that has integral type is added to or subtracted from a pointer, the integral value is first multiplied by the size of the object pointed to". In the C[] language, the multiplier is equal to the product of the pointer step and the size of the object pointed to. In the C language, "when two pointers to elements of the same array object are subtracted, the difference is divided by the size of an element". In the C[] language, the divisor is equal to the product of the pointer step and the size of an element.

In the C[] language, the pointer declarator is defined as follows:

   pointer:
     * step(opt) type-specifier-list(opt)
     * step(opt) type-specifier-list(opt) pointer


If step is not specified, it is equal to 1. The step may be any expression of an integral type. Pointers with non-constant steps must have automatic storage duration, i.e. they must have either the auto or the register storage class specifier or must be declared within a function body without storage class specifiers.

Example:  The declaration int a[]={0,1,2,3,4} defines an array of the form

a[0] a[1] a[2] a[3] a[4]


The pointer declaration int *:2 p1=(void*)a forms the following structure of storage

a[0] a[1] a[2] a[3] a[4]
           |
          p1+1


and the address expression (p1+1) points to the element a[2] of the array a.
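
The arithmetic rules for stepped pointers can be modelled in standard C. The sketch below is an illustration only: the type stepped_ptr and the functions sp_add and sp_diff are hypothetical names, and the model is restricted to pointers to int.

```c
#include <stddef.h>

/* Hypothetical model of the C[] pointer "int *:step p" in plain C. */
typedef struct {
    int      *base;  /* address currently held by the pointer */
    ptrdiff_t step;  /* step attribute, counted in elements */
} stepped_ptr;

/* p + k: the integer is multiplied by the step (the multiplication by
   sizeof(int) is done by ordinary C pointer arithmetic). */
stepped_ptr sp_add(stepped_ptr p, ptrdiff_t k)
{
    p.base += k * p.step;
    return p;
}

/* p - q: the element distance is divided by the step. */
ptrdiff_t sp_diff(stepped_ptr p, stepped_ptr q)
{
    return (p.base - q.base) / p.step;
}
```

With int a[]={0,1,2,3,4} and a stepped_ptr holding the address of a[0] with step 2, sp_add with k equal to 1 yields the address of a[2], matching the example above.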

Access to the Elements of an Array

In the C[] language, access to the e2-th element of an array object e1 is obtained using either of the expressions e1[e2] or (e2)[e1]. Both are identical to (*(e1+(e2))). Here, e2 is an integral expression and e1 is an lvalue that has the type "array of type". This lvalue is converted to an expression that has the type "pointer to type" and that points to the initial element of the array object (the step attribute of this pointer is identical to the step attribute of the array object).

Vectors

The basic new notion of the C[] language is the notion of a vector. A vector is defined as an ordered sequence of elements of the same valid vector element type. The number of vector elements is called the vector size. A valid vector element type is any type except a function type, the void type, or any incomplete C type (recall that an incomplete type is an array of unknown size or a structure or union of unknown content).

The C[] language introduces a new kind of derived type: the vector type. A vector type describes a set of objects or values with a particular member object type, called the element type. The element type must be a valid vector element type. A vector type is characterized by its element type T and by the number N of elements in the vector. A vector type is said to be derived from its element type, and if its element type is T, the vector type is called vector of N elements of type T or simply vector of T. Unlike any other type, a vector type cannot be explicitly specified and hence cannot appear in declarations. But C[] expressions may have a vector type.

The simplest way to construct an expression of a vector type is to apply a special blocking postfix operator [] to an expression of an array type. If the expression e designates an array of N elements of a non-array valid vector element type, then the expression e[] designates a vector of N elements and the i-th element of that vector is just the i-th element of the array e, namely, e[i]. If the expression e designates an array of N elements of an array type, then the expression e[] designates a vector of N elements and the i-th element of that vector is the result of applying the blocking operator to the array designated by e[i].

In this document we often use the term "vector a[]" instead of "vector designated by a[]".

Example: Let the array a be defined by the declaration

    int a[3][2];

Then the expression a[] has the type "vector consisting of three vectors, each of which consists of two integers" with elements {{a[0][0], a[0][1]}, {a[1][0], a[1][1]}, {a[2][0], a[2][1]}} .

Vectors and arrays are similar in many respects, but there is one principal difference: in expressions, arrays are converted to pointers whereas vectors are not. For example, if array a is declared as int a[8], then the expression a+1 has type "pointer to int" and points to the element a[1] of the array a. At the same time, the expression a[]+1 has type "vector of 8 ints", and the i-th element of that vector is equal to a[i]+1. See the section Vector Operators for a comprehensive explanation of vector operators.

The blocking operator [] is also applicable to pointers. If e is an expression of type "pointer to a non-array valid vector element type", then the expression e[] designates a vector of N elements (N is determined from the context according to rules explained later in the document), and the i-th element of that vector is e[i]. If e is a pointer to array type T, and T is also a valid vector element type, then the expression e[] designates a vector of N elements (N is determined from the context), and the i-th element of that vector is the result of applying the blocking operator to the array designated by e[i].

The blocking operator [] is applied to an expression designating an array of unspecified length in the same manner as it is applied to a pointer.

Here are more intricate examples of use of the blocking operator.

Example:

   int a[10];
   int b[10];
   int *p;
   a[]=b[]+p[];

Here the expression p[] is a vector of 10 elements. The vector's length is determined from the context.

Example: Let pointers p1 and p2 be declared as follows:

   int (*p1)[10];
   int **p2;

Then the expression p1[] has the type "vector of N vectors of 10 elements of type int", where N depends on the context. The expression p2[] has the type "vector of M pointers to int", where M depends on the context.

Access to Subarrays

In this section, we describe how the blocking operator is used to access subarrays.

By definition, a (data) object belongs to an array if it is an element of the array or belongs to an element of the array. Any set of objects belonging to the same array is called a subarray if and only if this set can be described as an array (using the bounds and step attributes as defined above). In addition, any subarray can be referred to as an object belonging to its superarray.

In principle, the facilities introduced are sufficient to access subarrays. For example, if the array object a is defined by the declaration

     int a[5][5];     

then the expression

     (*(int(*)[5:6])a)[]      (1)

designates a vector of five ints that contains the main diagonal of the matrix a , and the expressions

     (*(int(*)[4:6])(a[0]+1))[]      (2)

and

    (*(int(*)[4:6])(&a[0][1]))[]     (3)

designate a vector of four ints that contains the diagonal of the matrix a which is placed above the main diagonal.

A more compact notation results if variables of type "pointer to array" are used. So, if the pointer objects p1 and p2 are defined by the declarations

    int (*p1)[5:6]=(void*)a;
    int (*p2)[4:6]=(void*)(a[0]+1);

then the expression (*p1)[] can be used instead of (1) and the expression (*p2)[] can be used instead of (2) and (3).
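
The effect of these stepped views can be emulated in standard C. The following sketch is an illustration under stated assumptions: the helper gather_stepped is a hypothetical name, and the 5x5 matrix is modelled as a flat buffer of 25 ints stored row by row, so that element [i][j] lives at index 5*i+j.

```c
#include <stddef.h>

/* Copy n ints starting at src, taking every step-th one
   (hypothetical helper emulating the vector (*(int(*)[n:step])src)[]). */
void gather_stepped(const int *src, size_t n, size_t step, int *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = src[i * step];
}
```

Calling gather_stepped on the start of the buffer with n equal to 5 and step equal to 6 collects the main diagonal, as expression (1) does; starting one element later with n equal to 4 collects the diagonal above it, as expressions (2) and (3) do.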

Scalar Operators

Two binary scalar operators are added in the C[] language, namely, the operators ?> and ?<, which calculate the maximum and the minimum respectively, together with their corresponding compound assignments.

Vector Operators

The operand of the unary &, *, +, -, ~, ?, %, !, ++ (postfix and prefix form), and -- (postfix and prefix form) operators and of scalar cast operators may have a vector type. In that case, the result of such an operator is a vector whose elements result from applying the corresponding operator to the elements of its operand.

A vector type name is forbidden in a cast operator.

Example:

   int a[3];
   int b[3];
   a[]=-b[];

In this example, the i-th element of the vector a[] is set to -b[i] for i=0, 1, 2.

One or both operands of the binary =, *, /, %, ?<, ?>, +, -, <<, >>, <, >, <=, >=, ==, !=, &, ^, |, &&, ||, *=, /=, %=, +=, -=, <<=, >>=, &=, ^=, and |= operators may have a vector type.

If both operands are vectors of the same length, then the result is a vector, the elements of which are the results of application of the corresponding operator to the elements of the operands. If one of the operands is a scalar, then it is converted to a vector of the same length as the vector operand.

In the following C[] code portion

         int a[10], b[10], c;

         a[]=b[]*c;

b[i]*c is assigned to the i-th element of array a for all i (0 <= i < 10).

If vector operands of a binary operator have different lengths, the behavior is undefined.

However, a binary operator is applicable to vector operands with different numbers of dimensions. In the following example

         double a[10], B[10][20];
         B[] *= a[];

each element of the i-th row of B is multiplied by a[i].
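
The pairing of a[i] with the i-th row can be spelled out in standard C. This is a minimal sketch, not C[] itself: the function name scale_rows is hypothetical, and the matrix is assumed to be passed as a flat buffer stored row by row.

```c
#include <stddef.h>

/* Emulates B[] *= a[] for a[rows] and a rows x cols matrix B:
   a[i] is paired with row i and then applied to every element of that row. */
void scale_rows(size_t rows, size_t cols, double *B, const double *a)
{
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            B[i * cols + j] *= a[i];
}
```

For the declarations above the call would use rows equal to 10 and cols equal to 20.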

A conditional operator may also have vectors as its operands. If the first operand of the conditional operator is a scalar and the second or third operand (or both) is of vector type, then the result has the same vector type as for the binary operators discussed above.

The first operand of a conditional operator may also have vector type. In that case the second or the third operand, but not both, may be omitted. If no operand is omitted then, unlike in the C language, all three operands are evaluated. If all three operands are vectors of the same length, then the result is produced by elementwise application of the operator. If vector operands of a conditional operator have different lengths, the behavior is undefined. If the second or the third operand is not a vector, then that operand is converted to a vector of the same length as the vector operands.

If the first and the second (or the third) operands are vectors, the third (the second) operand is omitted, and the elements of the first operand have scalar type, then the result is a vector of the same type as the second (the third) operand; the i-th element of the result is equal to the k(i)-th element of the second (the third) operand, where k(i) is the index of the i-th non-zero (zero) element of the first operand. The other elements of the result have indefinite values.

For example, execution of

         int a[5]={1,2,3,4,5};
         int b[5]={3,3,3,2,6};
         int c[5];

         c[]=a[] < b[] ? a[]:;

results in the vector c[] equal to {1,2,5,w,w}, where w denotes an undefined value.
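
The packing behaviour of a conditional operator with an omitted third operand can be emulated in standard C. The sketch below is illustrative only: pack_where is a hypothetical name, and the condition vector is assumed to have been evaluated into an int array beforehand.

```c
#include <stddef.h>

/* Emulates c[] = mask[] ? src[] : ; (third operand omitted).
   Elements of src whose mask element is non-zero are packed to the front
   of dst, and the number of elements written is returned. The remaining
   elements of dst are left untouched, mirroring the indefinite values. */
size_t pack_where(const int *mask, const int *src, size_t n, int *dst)
{
    size_t k = 0;
    for (size_t i = 0; i < n; i++)
        if (mask[i])
            dst[k++] = src[i];
    return k;
}
```

For the data above the mask a[i] < b[i] is {1,1,0,0,1}, so three elements are written and the first three elements of c become 1, 2, and 5.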

If the first and the second (or the third) operands are vectors, the third (the second) operand is omitted, and the elements of the first operand have vector type, then the result is obtained by elementwise application of the operator.

The subscript operator also allows vector operands. Recall that the C language expression e[f] is defined as *(e+f). In C[] the subscript operator is treated in exactly the same way.

Example: In the following C[] program the last column of the array A is set to 1:

   int A[2][3];
   int* p[2];
   p[0]=&A[0][0];
   p[1]=&A[1][0];
   p[][2]=1;

Indeed, the expression p[][2] has an equivalent form, *(p[]+2), in which p[]+2 is clearly a vector consisting of pointers to the elements of the last column of the matrix A.
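
The same assignment can be written out as a loop in standard C. This is a sketch for illustration: store_through_each is a hypothetical name, and the vector of pointers is modelled as an ordinary array of pointers to int.

```c
#include <stddef.h>

/* Emulates p[][k] = value for an array p of n pointers to int:
   the subscript *(p[i] + k) is applied to every pointer in turn. */
void store_through_each(int *const *p, size_t n, size_t k, int value)
{
    for (size_t i = 0; i < n; i++)
        *(p[i] + k) = value;
}
```

With p[0] and p[1] pointing to the starts of the two rows of A, a call with n equal to 2, k equal to 2, and value equal to 1 sets A[0][2] and A[1][2] and leaves the other elements unchanged.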

The blocking operator is applicable to vectors in elementwise fashion. See section Array Segments for details.

Lvectors

A vector comprising modifiable lvalues is called an lvector. In C[], the left operand of a simple or compound assignment operator or the operand of the postfix/prefix increment/decrement operators must be either a modifiable lvalue or an lvector.

Example: Expressions a[] and b[] are lvectors, but the expression a[]+b[] is not.

Note that the result of applying the blocking operator is always an lvector. Lvectors may also be the result of applying operators other than the blocking one. For example, if array p is declared as int* p[3], then the expression *(p[]) is an lvector, and, hence, the expression *(p[])=1 is correct.

Reduction Operators

The unary reduction operators [*], [?<], [?>], [+], [&], [^], and [|] correspond to the binary operators *, ?<, ?>, +, &, ^, and |. These operators are applicable only to vector operands. Let v[0], v[1], ..., v[N] denote the elements of the vector operand v. Then the expression [op] v[] has the same semantics as an expression of the form (...((v[0] op v[1]) op v[2]) op ... op v[N]).

Example: In the following code portion

         int  a[]={0,1,2,3,4}, sum;
         sum=[+]a[];

the value of sum is equal to the value of the expression

        ((((a[0]+a[1])+a[2])+a[3])+a[4])

which is equal to 10.

Example:

   double A[2][3];
   double s[3];
   double sum;
   s[]=[+]A[];
   sum=[+][+]A[];

Here the elementwise sum of the rows of the array A is assigned to the vector s[], and the sum of all elements of the array A is assigned to sum.

As mentioned above, C[] has the binary maximum ?> and minimum ?< operators. The corresponding [?>] and [?<] reduction operators are used for evaluating the maximum and minimum values among array elements.

Example:  The following C[] code portion is aimed at evaluating the maximum among elements of matrix A:

   int A[2][3];
   int max;
   max=[?>][?>]A[];
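
The reductions in this section behave like left-to-right folds, which can be sketched in standard C. The code below is an illustration only: the names reduce_add, reduce_max, and reduce_add_rows are hypothetical, int elements are used throughout for brevity (the example with s[] above uses double), and a matrix is assumed to be stored row by row in a flat buffer.

```c
#include <stddef.h>

/* Emulates [+] v[]: the fold (...((v[0] + v[1]) + v[2]) ...). Requires n >= 1. */
int reduce_add(const int *v, size_t n)
{
    int acc = v[0];
    for (size_t i = 1; i < n; i++)
        acc = acc + v[i];
    return acc;
}

/* Emulates [?>] v[]: the same fold with the maximum operator. Requires n >= 1. */
int reduce_max(const int *v, size_t n)
{
    int acc = v[0];
    for (size_t i = 1; i < n; i++)
        acc = (acc > v[i]) ? acc : v[i];
    return acc;
}

/* Emulates s[] = [+] A[] for a rows x cols matrix: the rows are added
   elementwise, so s receives cols values. Requires rows >= 1. */
void reduce_add_rows(const int *A, size_t rows, size_t cols, int *s)
{
    for (size_t j = 0; j < cols; j++) {
        s[j] = A[j];
        for (size_t i = 1; i < rows; i++)
            s[j] = s[j] + A[i * cols + j];
    }
}
```

A double reduction such as [+][+]A[] then corresponds to calling reduce_add on the result of reduce_add_rows.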

Array Segments

Not every regular set of objects belonging to an array is a subarray. For example, the rectangular segment of the array A represented in Fig. 1 is not a subarray.

 .  .  .  .  .
 .  X  X  X  .
 .  X  X  X  .
 .  .  .  .  .

Figure 1: Rectangular 2x3 segment of array A (X marks an element of the segment)

Access to such array segments is provided by the so-called grid operator [:], the only quaternary operator in C[]. The general notation for the grid operator is:
     e [ l : r : s ],
where expression e either is of a pointer type or designates an array, and expressions l, r, s are of any integral type. Here l, r, and s denote the left bound, the right bound, and the step respectively, and e [ l : r : s ] designates a vector of (r-l)/s + 1 elements whose i-th element is e[l+i*s].

Example: If array a is declared as int a[5], then the expression a[2:4:2] designates a two-element vector comprising a[2] and a[4] .
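
The element count and the selection rule of the grid operator can be checked with a small model in standard C. This is a sketch under stated assumptions: grid_count and grid_gather are hypothetical names, the operand is an array of int, and the bounds satisfy l <= r with a positive step s.

```c
#include <stddef.h>

/* Number of elements selected by e[l:r:s], following (r-l)/s + 1.
   Assumes l <= r and s > 0. */
size_t grid_count(size_t l, size_t r, size_t s)
{
    return (r - l) / s + 1;
}

/* Copies e[l], e[l+s], e[l+2*s], ... into out and returns the count. */
size_t grid_gather(const int *e, size_t l, size_t r, size_t s, int *out)
{
    size_t n = grid_count(l, r, s);
    for (size_t i = 0; i < n; i++)
        out[i] = e[l + i * s];
    return n;
}
```

For an array of five ints, grid_gather with l equal to 2, r equal to 4, and s equal to 2 copies two elements, those at indices 2 and 4.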

The step operand may be omitted, and in that case the second colon in the grid notation is optional. One or both bounds may also be omitted. The omitted left bound is replaced by 0, while the omitted right bound is replaced by N-1, where N is the number of array elements if the first operand designates an array, or is determined from the context if the operand is a pointer. Fig. 2 gives some examples of grid expressions with various combinations of omitted operands.

Let the array a be declared as int a[5].

a[1:3:2]

No operands are omitted.

a[0] a[1] a[2] a[3] a[4]
     ^^^^      ^^^^

a[1:3]

The step is omitted.

a[0] a[1] a[2] a[3] a[4]
     ^^^^ ^^^^ ^^^^

a[::2]

Bounds are omitted.

a[0] a[1] a[2] a[3] a[4]
^^^^      ^^^^      ^^^^

a[:]

Bounds and a step are omitted.

a[0] a[1] a[2] a[3] a[4]
^^^^ ^^^^ ^^^^ ^^^^ ^^^^


Figure 2. Various combinations of omitted values in grid expressions (^^^^ marks a selected element).

The first operand of the grid operator may have a vector type. In that case the operator is applied elementally. Consider the array A declared as int A[4][5]. The expression A[1:2] is a vector of 2 arrays corresponding to the second and third rows of A. In the expression A[1:2][1:3], the second grid operator ( [1:3] ) is applied to each of the arrays, selecting their second, third, and fourth elements (Fig. 1). Successive grid operators are very convenient for accessing segments of a multi-dimensional array.

The operand of the blocking operator [] may also have vector type. In that case, the operator is applied elementally. If the array A is declared as int A[5][5][5], then the expression A[1:3] has type "vector of 3 arrays". In the expression A[1:3][], the blocking operator is applied to each of the arrays. Thus, the expression A[1:3][] designates the 3x5x5 array segment.

One can see that the expression A[1:3][] has an equivalent representation, A[1:3][:][:]. Thus, the blocking operator can be considered a more compact notation for successive grid operators with omitted steps and bounds.

Irregular Segments

As we have mentioned, operands of the subscript operator may have vector types. Any subset of array elements can be accessed by means of some subscript expression, whose right operand is of vector type (so-called vector subscripting ). In other words, access to the vector consisting of i1-th, i2-th, ..., in-th elements of array a is provided by the expression a[i[]], where i[] is the n-element vector whose elements are i1, i2, ..., in.

Example: If the array a is declared as int a[5] and the array i is declared as int i[4]={0,1,3,4}, the expression a[i[]] designates a vector of a[0], a[1], a[3], a[4].
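
One-dimensional vector subscripting amounts to a gather through an index array, which can be sketched in standard C. The helper name gather_indexed is hypothetical, and the index vector is modelled as an array of size_t rather than int.

```c
#include <stddef.h>

/* Emulates the vector a[idx[]]: out[k] receives a[idx[k]] for k = 0..n-1. */
void gather_indexed(const int *a, const size_t *idx, size_t n, int *out)
{
    for (size_t k = 0; k < n; k++)
        out[k] = a[idx[k]];
}
```

With the index values 0, 1, 3, and 4 the output holds a[0], a[1], a[3], and a[4] in that order.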

Example: In the following C[] code portion

   int A[5][5];
   int i[4]={0,1,3,4};
   A[ i[] ][ i[] ]=1;

the value 1 is assigned to all elements of the irregular region of the array A depicted in Fig. 3.

Figure 3: Irregular segment of the array A

In fact, the grid operator provides a more convenient but less flexible way to access subsets of array elements than vector subscripting. If access to regular segments is required, the grid operator is preferable. Of course, any grid operator may be replaced by some equivalent subscript operator. Indeed, consider the array a declared as int a[5]. The expressions a[1:3:2] and a[i[]], where i[] is a two-element integral vector whose elements are 1 and 3 respectively, designate the same vector. Similarly, any combination of successive grid operators can be expressed by means of appropriate vector subscripting.

Determining Undefined Vector Size from Context

If one of the operands of a binary operator is a vector of undefined size and the other is a vector of a definite size N, then the size of the operand of undefined size is assumed to be equal to N. Similarly, if one of the operands of a ternary operator is a vector of definite size N, then the size of a vector operand of undefined size is assumed to be equal to N.

Example:

   int a[3];
   int b[3];
   int *p;
   a[]=b[]+p[];

Here, the size of vector p[] is 3, because the size of the left operand of the binary + operator (b[]) in the expression b[]+p[] is 3.

Example: If the pointer p is declared as int **p and the array a is declared as int a[], then the size of the elements of the vector p[][] in the expression a[]+p[][] cannot be determined from the context, regardless of whether the size of the array a is definite.