Author: Martin Ruckert
Situation:
The old specification: $Y+$Z contains a 64-bit translation key, where the last 3 bits have the following meaning:
- if equal to 000, delete the key/value pair from all translation caches that hold it
- otherwise, replace the 3-bit protection code in all caches where the translation is present by the given 3 bits.
The result $X will be
- set to 0 if the key was not present in any translation cache
- set to 1 if the key was present in the instruction translation cache
- set to 2 if the key was present in the data translation cache
- set to 3 if the key was present in both caches
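The old behaviour can be sketched in C on a toy model of the two translation caches. This is only an illustration under stated assumptions: the types `vt_entry` and `vt_cache`, the cache size `VT_ENTRIES`, and the function names are hypothetical, and the translation key is treated as an opaque value whose 3 low bits are cleared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VT_ENTRIES 4   /* hypothetical cache size, for illustration only */

typedef struct {
    int      valid;
    uint64_t key;      /* translation key with the 3 low bits cleared */
    unsigned prot;     /* 3-bit protection code */
} vt_entry;

typedef struct { vt_entry e[VT_ENTRIES]; } vt_cache;

/* Apply the old LDVTS action to one cache; return 1 if the key was present. */
static int old_ldvts_one(vt_cache *c, uint64_t key, unsigned bits)
{
    assert(bits < 8);
    for (size_t i = 0; i < VT_ENTRIES; i++) {
        if (c->e[i].valid && c->e[i].key == key) {
            if (bits == 0)
                c->e[i].valid = 0;     /* 000: delete the key/value pair */
            else
                c->e[i].prot = bits;   /* otherwise: overwrite the protection code */
            return 1;
        }
    }
    return 0;
}

/* Old specification: yz stands for $Y+$Z; the result is the value of $X. */
uint64_t old_ldvts(vt_cache *icache, vt_cache *dcache, uint64_t yz)
{
    uint64_t key  = yz & ~(uint64_t)7;
    unsigned bits = (unsigned)(yz & 7);
    uint64_t x = 0;
    if (old_ldvts_one(icache, key, bits)) x |= 1;  /* instruction translation cache */
    if (old_ldvts_one(dcache, key, bits)) x |= 2;  /* data translation cache */
    return x;
}
```

Note that in this model a nonzero protection code is written into the cache without consulting the page tables at all.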
The purpose of this instruction is, according to mmix-doc.w: The operating system needs a way to keep such caches up to date when pages are being allocated, moved, swapped, or recycled. The operating system also likes to know which pages have been recently used. The LDVTS instructions facilitate such operations.
For allocated, moved, swapped, or recycled pages, the single case with three zero bits would be sufficient. It deletes an invalid translation from the cache, and a correct translation is reloaded from memory if and when it is needed.
With regard to cache consistency, the other cases, where these bits are not zero, make the instruction efficient but dangerous. They permit the creation of a cache that is inconsistent with the page tables kept in memory, by writing one value to memory and inserting another into the cache.
The return value of 0, 1, 2, or 3 gives only a limited amount of information about cache consistency. In particular, it does not meet the operating system's need to convert virtual addresses to physical addresses. This need arises, for example, when the operating system has to access user space data, given by a virtual address, with functions or devices that require physical addresses.
Proposal:
The original LDVTS (Load Virtual Translation Status) instruction should be enhanced to enable an operating system to compute a translation from virtual to physical addresses in a fast and convenient way.
Access to the translation would be beneficial in the following case:
Consider implementing a TRAP instruction for the Fread function using a DMA-capable disk drive. We proceed as follows: We determine from the file handle the sector number on disk and write this number to the corresponding data register of the disk. The operating system maintains file buffers for the user program in the user's pool section, such that file buffers are page aligned. For the target buffer, owned by the running process, we know only the virtual address, and we have to translate it to a physical address to be stored in the corresponding DMA register of the disk drive. At this point, even if hardware support for the translation is available, we have to resort to slow, software-based translation using the page tables, unless we have an instruction that allows the use of hardware-based lookup (and the use of a translation cache).
We therefore propose an augmented return value for the LDVTS instruction. The following key considerations guide the proposal:
1. We want a plain lookup of the translation, either with or without having the value in the cache.
2. We want to retain the ability to delete a key/value pair from the cache.
3. We want the ability to force a reread of the page tables, either deleting or redefining the key/value pair.
4. We do not want the ability to set/modify the cache value directly.
Direct modification is efficient but dangerous. A reread from the page tables immediately after changing the tables in memory should be efficient too, since the necessary data is still in the cache. Alternatively, just deleting the pair from the cache is fast, and the lookup will be done if and when needed. Changing the cache without updating the memory (or having the values in the data cache) is dangerous, since different processors may need synchronized caches.
5. We want information about the contents of the caches regarding the given value.
To accomplish 1, 2, and 3, we need to pass additional parameter information to the instruction:
- We need to add this information to the parameter $Y+$Z.
- We could use the sign bit, but it is only one bit.
- We could use the s-13 bits, but s may be 13, so none may be left.
- So we are left with the 3 low bits.
- These can have the values from 0 to 7.
- We choose the behaviour outlined below.
To accomplish 5, we need to encode the information in the return value $X:
- There is enough room in the register $X. Only 48-s bits are used.
- Either the upper 16 bits or the lowest 13 bits could be used.
- We continue to use the lowest 2 bits, compatible with the old definition.
The lowest 3 bits of $Y+$Z are used as a function code and determine the operation performed by the LDVTS instruction as follows:
Function Code | Operation |
0 | Delete from both caches |
1 | Check data VT cache, provide translation if present |
2 | Check instruction VT cache, provide translation if present |
3 | Lazy update both caches: If a matching entry is present in one of the caches, reread the translation from memory and replace matching entries in both caches. |
4 | Forced update of data VT cache. Reread translation from memory and place into data VT cache. Replace an existing instruction VT cache entry. |
5 | Forced update of instruction VT cache. Reread translation from memory and place into instruction VT cache. Replace an existing data VT cache entry. |
6 | Read data translation: Read a translation from the data cache, if not present read the translation from memory and keep the result in the data VT cache |
7 | Read instruction translation: Read a translation from the instruction cache, if not present, read the translation from memory and keep the result in the instruction VT cache |
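The eight function codes can be made concrete with a small simulation in C. This is a sketch of one possible reading of the table, not a definitive implementation: the toy caches (`vt_cache`, `VT_ENTRIES`, round-robin replacement), the page table walk callback `walk_fn`, the sample `demo_walk`, and the choice to delete an entry when the reread yields no valid translation are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VT_ENTRIES 4                    /* hypothetical cache size */
#define NO_TRANSLATION (~(uint64_t)0)   /* the result -1 */

typedef struct { int valid; uint64_t key, value; } vt_entry;
typedef struct { vt_entry e[VT_ENTRIES]; size_t next; } vt_cache;

/* Hypothetical stand-in for the page table walk in memory. */
typedef uint64_t (*walk_fn)(uint64_t key);

/* Demo walk: keys below 0x10000 map to key + 0x100000 with protection 7. */
uint64_t demo_walk(uint64_t key)
{
    return key < 0x10000 ? ((key + 0x100000) | 7) : NO_TRANSLATION;
}

static vt_entry *find(vt_cache *c, uint64_t key)
{
    for (size_t i = 0; i < VT_ENTRIES; i++)
        if (c->e[i].valid && c->e[i].key == key) return &c->e[i];
    return NULL;
}

/* Store v for key. An invalid v deletes the entry instead. A missing
   entry is only created when create is nonzero (round-robin victim). */
static void update(vt_cache *c, uint64_t key, uint64_t v, int create)
{
    vt_entry *p = find(c, key);
    if (v == NO_TRANSLATION) { if (p) p->valid = 0; return; }
    if (!p) {
        if (!create) return;
        p = &c->e[c->next];
        c->next = (c->next + 1) % VT_ENTRIES;
    }
    p->valid = 1; p->key = key; p->value = v;
}

/* yz stands for $Y+$Z; the return value is the new content of $X. */
uint64_t ldvts(vt_cache *ic, vt_cache *dc, walk_fn walk, uint64_t yz)
{
    uint64_t key = yz & ~(uint64_t)7, v;
    unsigned fn = (unsigned)(yz & 7);
    vt_cache *own = (fn == 1 || fn == 4 || fn == 6) ? dc : ic;
    vt_cache *other = (own == dc) ? ic : dc;
    vt_entry *p = find(own, key);

    assert(walk != NULL);
    switch (fn) {
    case 0:                     /* delete from both caches */
        update(ic, key, NO_TRANSLATION, 0);
        update(dc, key, NO_TRANSLATION, 0);
        return NO_TRANSLATION;
    case 1: case 2:             /* probe one cache, no memory access */
        return p ? p->value : NO_TRANSLATION;
    case 3:                     /* lazy update of existing entries only */
        if (!p && !find(other, key)) return NO_TRANSLATION;
        v = walk(key);
        update(ic, key, v, 0);
        update(dc, key, v, 0);
        return v;
    case 4: case 5:             /* forced update of the named cache */
        v = walk(key);
        update(own, key, v, 1);
        update(other, key, v, 0);
        return v;
    default:                    /* 6, 7: read, fill the named cache on a miss */
        if (p) return p->value;
        v = walk(key);
        update(own, key, v, 1);
        return v;
    }
}
```

In this model, function codes 1 and 2 never call `walk`, and function code 3 calls it only when at least one cache already holds the key.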
The result of the instruction will be in register $X.
- Register $X is set to -1 if no valid translation was obtained: either the entries in the cache were deleted, or the entry was not present and the function code did not call for a read from memory, or the read from memory did not produce a valid translation.
- Otherwise, register $X contains the translated address in this form:
[16 zero bits][a: 48-s bits][s-3 zero bits][3 protection bits]
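As an illustration of how an operating system might consume this result, for example to fill the DMA register in the Fread case, the following C sketch decodes $X for a page size of 2^s bytes. The names `ldvts_decode` and `LDVTS_FAILED` are hypothetical; the sketch assumes that the field a is the physical page address and that the offset within the page is taken from the virtual address.

```c
#include <assert.h>
#include <stdint.h>

#define LDVTS_FAILED (~(uint64_t)0)   /* result -1: no valid translation */

/* Decode an LDVTS result x for pages of 2^s bytes.
   Returns 0 if x signals failure; otherwise returns 1 and stores the
   physical address of vaddr in *paddr and the protection bits in *prot. */
int ldvts_decode(uint64_t x, uint64_t vaddr, unsigned s,
                 uint64_t *paddr, unsigned *prot)
{
    assert(s >= 13 && s <= 48);
    if (x == LDVTS_FAILED) return 0;
    uint64_t offset_mask = ((uint64_t)1 << s) - 1;
    *prot  = (unsigned)(x & 7);                           /* 3 low bits */
    *paddr = (x & ~offset_mask) | (vaddr & offset_mask);  /* page | offset */
    return 1;
}
```

For a page-aligned file buffer the offset part is zero, so the physical address is simply the result with its 3 low bits cleared.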
Further Considerations
These are typical usage patterns for the instruction:
- After a page has been swapped out, use function code 0.
The key is deleted from both VT caches. A new translation is not cached, since it probably won't be needed in the near future.
- After a new page with instructions has been assigned, use function code 5.
The translation of the given key will be determined from memory. The needed memory locations are in the data cache if this instruction is executed just after updating the page tables. The new value is placed in the instruction VT cache. If there was an (old) entry for the key in the data VT cache, this entry is replaced. Similarly, if a new page of data is assigned, use function code 4.
- If the physical address of a data buffer is needed, use function code 6.
This will use a cached translation if present. If not present, the translation will be read from memory. This is the normal behaviour of an LDO or STO instruction, but without the loading or storing. If the physical address of some instruction is needed, use function code 7.
- If a page that was used as a data page is changed into an instruction page (when modifying code, be sure to use the SYNCID instruction), use LDVTS with function code 5 if the instructions are likely to be executed soon.
- If a page is moved in memory, and the operating system cares neither whether it is an instruction or a data page nor whether the translation is currently cached (since it is unclear whether the program is active or sleeping), use function code 3. This will update translations that are already cached but will not create new cache entries. It does not read memory unless needed.
- If the operating system wants to check the content of the caches without accessing memory, it can use function code 1 or 2.
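The usage patterns above can be summarized as a small lookup in C. The enumeration names, the event list, and the helper functions are hypothetical and exist only to tie each situation to its function code; the sketch assumes the key passed in already has its 3 low bits cleared.

```c
#include <assert.h>
#include <stdint.h>

/* Function codes of the proposed LDVTS; the names are hypothetical. */
enum ldvts_fn {
    LDVTS_DELETE      = 0,
    LDVTS_PROBE_DATA  = 1,
    LDVTS_PROBE_INSTR = 2,
    LDVTS_LAZY_UPDATE = 3,
    LDVTS_FORCE_DATA  = 4,
    LDVTS_FORCE_INSTR = 5,
    LDVTS_READ_DATA   = 6,
    LDVTS_READ_INSTR  = 7
};

/* Hypothetical names for the operating system situations listed above. */
enum os_event {
    PAGE_SWAPPED_OUT, NEW_INSTR_PAGE, NEW_DATA_PAGE,
    NEED_DATA_PADDR, NEED_INSTR_PADDR,
    DATA_PAGE_BECOMES_CODE, PAGE_MOVED
};

/* Pick the function code suggested for each situation. */
enum ldvts_fn ldvts_fn_for(enum os_event ev)
{
    switch (ev) {
    case PAGE_SWAPPED_OUT:       return LDVTS_DELETE;
    case NEW_INSTR_PAGE:         return LDVTS_FORCE_INSTR;
    case NEW_DATA_PAGE:          return LDVTS_FORCE_DATA;
    case NEED_DATA_PADDR:        return LDVTS_READ_DATA;
    case NEED_INSTR_PADDR:       return LDVTS_READ_INSTR;
    case DATA_PAGE_BECOMES_CODE: return LDVTS_FORCE_INSTR;
    case PAGE_MOVED:             return LDVTS_LAZY_UPDATE;
    }
    return LDVTS_DELETE;  /* conservative fallback for unknown events */
}

/* Combine a translation key with a function code to form $Y+$Z. */
uint64_t ldvts_operand(uint64_t key, enum ldvts_fn fn)
{
    assert((key & 7) == 0);   /* the 3 low bits are reserved for the code */
    return key | (uint64_t)fn;
}
```

The probe codes 1 and 2 are not tied to an event here, since they only inspect the caches.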
Discussion
mail comments to ruckert@cs.hm.edu