Comments

1. After a long analysis, I slightly decreased the original MIX
program: removed the conditional jump from the beginning of the main
cycle. I can explain this step the following way: in the above
algorithm in both cases - when no elements were swapped and when the
last element was swap the value of variable t=0, so both conditions
can be combined to one.

2. I supposed that sort data is wydes (2-byte integer). To make
changes easier, I used Nb parameter in all the instructions where it"s
essential. To modify program for tetras or octas you just go to set
the new value of this parameter and change wyde instructions to
correspondent ones in lines 5,6,9,10. I have tested these variants
too.

3. As data exchange with memory (1mem+1oop) is longer than ordinary
operations (1oop), I wrote some additional commands to store one of
two compared elements (the last one) in MMIX register.  So on any step
except the first the discussing algorithm need to read only one (new)
element from array, using the previous one from $4.  Please note that
when elements are swapped, we must skip moving $5 to $4, because the
number from $4 becomes the last element in this case.

4. Note that $1 and $2 are used for absolute memory addresses, but $3
is the relative address of the elements in array. It is important for
comment 1 realization.

I"ll be thankful for any discussion of my program.  Please note only
that English is not my native language.

Regards
Evgeny