IOI'95 - Examples of New Problems for the IOI

# Examples of New Problems for the IOI

We give two examples of informatics problems that are not programming-contest problems and that might have been posed at a future IOI (if we had not revealed them here). The first example can be classified as a theoretical problem (not requiring the use of a computer), the second as an empirical problem (requiring a computer).

## Problem 1 (Theoretical)

A high-tech project involves 100 scientists, who are proud of their capabilities for mental arithmetic. In a noisy machine room, the number zero has been written on a blackboard. While doing other work, each scientist goes through the following routine 100 times:

1. Enter the machine room.
2. Read and memorize the number on the blackboard.
3. Exit the machine room.
4. Mentally increment the memorized number by 37.
5. Enter the machine room.
6. Erase the number on blackboard.
7. Write the incremented number from memory on the blackboard.
8. Exit the machine room.
The scientists work at their own pace independent of each other. The machine room, however, is so small that at each moment it can accommodate at most one scientist. The largest possible value remaining on the blackboard after all scientists have finished their routines is 370,000. What is the smallest final value possible?

NOTE: This is an informatics problem because it deals with a parallel computation involving a number of communicating processors (the scientists).

## Problem 2 (Empirical)

Finding an element in a sorted sequence is a common task. Think of looking up a word in a dictionary or a name in a telephone directory. In this problem, two search algorithms are presented. You are requested to set up and carry out some experiments for comparing their efficiency.

Suppose S is a sequence of N , N> 0 , elements. The elements of S are denoted by S[i] with 0<=i<N , and are sorted in increasing order: S[i-1]<S[i] for 0<i<N .

Both search algorithms have as input a value x occurring in S , and as output the value i such that 0<=i<N and S[i]=x . Both algorithms have local integer variables j and h . Here is algorithm A :

1. Initialize i at 0 , and j at N .
2. While i differs from j-1 repeat steps 3a and 3b.
3. (a)
If i+j is even then h becomes (i+j)/2
else h becomes (i+j-1)/2 .
(b)
If x < S[h] then j becomes h else i becomes h .

The idea behind algorithm A is that invariantly S[i]<=x<=S[j-1] holds, that is, x occurs in the segment from S[i] to S[j-1] . If this segment consists of one element ( i=j-1 ), then the search terminates (see step 2). Otherwise, the segment is split into two parts, by choosing h halfway between i and j (see step 3a). Depending on whether x lies in the left half or the right half, either j or i is adjusted (see step 3b).

Algorithm B is obtained from algorithm A by expanding step 3b into

3. (b)
If S[h] = x then i becomes h , and j becomes h+1
else if x < S[h] then j becomes h else i becomes h .

The idea behind algorithm B is that the search can be terminated immediately when the middle element S[h] equals x .

There are two questions concerning algorithms A and B that you should investigate.

1. What is the difference in the mean number of repetitions of step 3?
2. What is the difference in the mean search time, expressed in the number of elementary operations?
You should consider various sequence lengths. You may assume that the values of x are uniformly distributed over the sequence, that is, each value S[i] with 0<=i<N occurs equally likely as value for x . Assignments and comparisons are the elementary operations. Because most machines have a special (shift) instruction for it, step 3a is considered a single elementary operation. Thus, for the execution of algorithm A , step 1 involves two elementary operations (two asignments); step 2 involves one elementary operation (a comparison); step 3a involves one elementary operation (a special assignment); step 3b involves two elementary operations (a comparison and an assignment).

## Analysis and Solution of Problem 1

This problem illustrates one of the complications with so-called parallel computations, where a number of processors act on the same data concurrently. Each scientist represents a processor; the number on the blackboard is the data. The blackboard can be viewed as a shared variable x , because it is accessible to all processors for inspection and modification. The program that each processor executes consists of one hundred repetitions of the assignment statement incrementing x by 37. The programs of all processors together is called a parallel program. Execution of this parallel program involves 10,000 increments of x .

In a parallel program, access to shared variables must be controlled. For instance, what happens when two processors write different values to the same shared variable at the same moment? The result could be garbage. In our problem, this is avoided because the small machine room guarantees mutual exclusion: only one scientist can inspect or modify the number on the blackboard.

In a sequential program (executed by one processor), 10,000 increments by 37 starting from zero would result in the value 370,000. Parallel execution of these same increments, however, may result in different (smaller) values. The reason is that the increments are split into three actions (read, increment local copy, write), which may be executed in an interleaved fashion. For instance, scientist S reads zero, scientist S also reads zero, both increment their zero to 37, S writes back 37, S writes back 37. This illustrates that two such increments may have the same effect as one. Obviously, one hundred of them executed in parallel may also act as a single increment. Therefore, another possible final value after the 10,000 increments is 3700.

This is not yet the whole story, however. A careful analysis of the possiblities reveals that any multiple of 37 in the range from 74 to 370,000 can be obtained. Here is an interleaving yielding 74:

1. All scientists read and memorize zero.
2. Scientist S completes 99 increments, leaving 3673 on the blackboard.
3. Scientists S through S complete all their 100 increments. (For instance, one after the other. This leaves 3700 on the blackboard, since each starts from a memorized zero.)
4. Scientist S now completes the first increment (having memorized zero), and thus puts 37 on the blackboard.
5. reads and memorizes this 37.
6. does the remaining 99 increments, leaving 3700.
7. mentally increments 37 to 74 and completes the final increment by writing this value on the blackboard.

## Analysis and Solution of Problem 2

Algorithm A is the classical BINARY SEARCH. Algorithm B is a variant. In algorithm A the number of repetitions is almost independent of the value of x : roughly log_2 N (the base-two logarithm of N ). In algorithm B this number can be much smaller. It is, however, (too) little known that the mean advantage of algorithm B , in terms of repetitions saved, is marginal. When the sequence length is increased, the mean advantage converges (from below) to one single repetition saved, no more. The penalty for this `improvement' is an additional comparison in each repetition. The net result is a search algorithm that, in fact, turns out to be slower on average.

This can be confirmed by appropriate experiments. (An analytic treatment is possible but tedious. Sequence lengths that are a power of two are easy to deal with; the complications arise for other lengths). There are several ways to set up an experiment. A naive way is to repeat the following experiment a number of times. Choose a random increasing sequence of strings (also its length is chosen randomly), apply the search algorithms to a (random) number of random input values x and measure the actual search times. Collect enough data and summarize the statistics. A disadvantage of this approach is that it is unclear how much data must be collected for a reliable result. It also difficult to summarize the results.

A more controlled experiment is this. Note that attention may be restricted to INTEGER sequences with S[i] = i , because it is not interesting WHAT is searched, only HOW it is done. Considering various sequence lengths, invoke the algorithms ONCE for EACH allowed input value x , keeping track of the total number of repetitions and elementary operations (that is, comparisons and assignments). When all input values have been searched, divide the total counts by the sequence length to obtain the mean counts. A broad range of sequence lengths can be covered in a relatively small number of experiments by taking some powers of ten, say from one to one million. (Because of the nature of the algorithms, sequence lengths that are a power of two are a special case, and these would not be representative.) Table 1 below illustrates the results.

```              Algorithm A      Algorithm B
Sequence     Mean     Mean    Mean     Mean
Length       # Rep    # Op    # Rep    # Op
-------------------------------------------
1             0,00    3,00     0,00    3,00
10            3,40   16,60     2,80   17,00
100           6,72   29,88     5,79   31,95
1000          9,98   42,90     8,99   47,93
10000        13,36   56,45    12,36   64,81
100000       16,69   69,76    15,69   81,45
1000000      19,95   82,81    18,95   97,76```
Table 1: Mean repetition and operation counts for the two algorithms

You can find below a Turbo Pascal program for these experiments. Note that in Pascal, step 3a can be programmed as h := (i+j) div 2 .

```program Problem2 (input, output) ;
{ Comparing two search algorithms }
const MaxE = 6 ; MaxN = 1000000 ; { MaxN = 10 ^ MaxE }
type  index = 0..MaxN ;
var   N: index ;
{ the sequence consists of S[i] = i for 0 <= i <  N }

procedure AlgorithmA (x: longint; var i: index;
var repcount, opcount: longint);
var j, h: index ;
begin
i := 0 ; j := N ; opcount := opcount+3 ;
while i < >  pred(j) do begin
h := (i+j) div 2 ;
if x <  h { x <  S[h] } then j := h else i := h ;
repcount := repcount+1 ; opcount := opcount+4
end { while }
end { AlgorithmA } ;

procedure AlgorithmB (x: longint; var i: index;
var repcount, opcount: longint) ;
var j, h: index ;
begin
i := 0 ; j := N ; opcount := opcount+3 ;
while i < >  pred(j) do begin
h := (i+j) div 2 ;
if h = x { S[h] = x } then begin
i := h ;
j := succ(h)
end { if h = x }
else if x <  h { x <  S[h] } then
j := h
else { x > S[h] }
i := h ;
repcount := repcount+1 ; opcount := opcount+5
end { while }
end { AlgorithmB } ;

procedure Experiment ;
var k, r: index ; rcA, ocA, rcB, ocB: longint ;
begin
write(N: 7, ' |') ;
rcA := 0 ; ocA := 0 ; rcB := 0 ; ocB := 0 ;
for k := 0 to pred(N) do
AlgorithmA(k {S[k]}, r, rcA, ocA) { assert(r=k) } ;
write(rcA/N:7:2, ocA/N:7:2, ' |') ;
for k := 0 to pred(N) do
AlgorithmB(k {S[k]}, r, rcB, ocB) { assert(r=k) } ;
writeln(rcB/N:7:2, ocB/N:7:2)
end { Experiment } ;

procedure DoExperiments ;
var i: index ; e: integer ;
begin
writeln('Experiments for comparing two search algorithms') ;
writeln ;
{ for i := 0 to MaxN do S[i] := i ;}
writeln('        |  Algorithm A  |  Algorithm B') ;
writeln('  Seq.  |  Mean   Mean  |  Mean   Mean') ;
writeln(' Length | # Rep.  # Op. | # Rep.  # Op.') ;
writeln(' ------ | ------ ------ | ------ ------') ;
for e := 0 to MaxE do begin
if e = 0 then N := 1 else N := 10*N ; { N = 10^e }
Experiment
end { for e } ;
end { DoExperiments } ;

begin
DoExperiments
end.
```

Last updated on JUN-23-95, by Richard Verhoeven.
IOI 95