MMIX Bug Report SYNCD |
||
Content
|
MMIXware Versionmmix-20110831Bug ReportedInitial: 09/20/2011Update: 11/6/2011 AuthorEiji YoshiyaDescriptionThe syncd instruction may lead to the following situation:
000000000000: e0018000e6010001 seth $1,0x8000 incml $1,0x0001 f200003efd000000 pushj $0, 0x3e*4 swym 0,0,0 e0008000e5000001 seth $0,0x8000 incmh $0,0x0001 e3010110af000000 setl $1,0x0110 stou $0,$0,0 2300002027010101 lda $0,$0,0x20 subu $1,$1,1 4b01fffdfd000000 bnz $1,@-0x3*4 swym 0,0,0 e3030030fd000000 setl $3,0x0030 swym 0,0,0 270303014b03fffe subu $3,$3,1 bnz $3,@-0x2*4 e0018000e6010002 seth $1,0x8000 incml $1,0x0002 f200002ef0000000 pushj $0, 0x2e*4 jmp @ 000000000100: e0028000e5020001 seth $2,0x8000 incmh $2,0x0001 e7020300fd000000 incl $2,0x0300 swym 0,0,0 8f000000b9070200 ldou $0,$0,0 syncd 7,$2,0 f8000000fd000000 pop $0,0 swym 0,0,0 000000010000: 0000000000000000 000000020000: 0000000000000000 000100000000: 0000000000000000 Then start the program in kernel mode: mmmix> k mmmix> 10000 mmmix> v1ff mmmix> 1 ... mmmix> 1 ... The syncd instruction never completes. DetailsCycle 8449LSU1:2 executes ld instruction. Data is not in Dcache and Dcache is full. So LSU1:2 starts Dflusher, gives Scache lock to Dfiller, and starts Dfiller. (mmix-pipe.w, line 4982-4989) Cycle 8450 * Dfluhser waits for Scache lock. (mmix-pipe.w, line 3860) Cycle 8451 Dfiller starts Sfiller. (mmix-pipe.w, line 4035) Cycle 8454 Sfiller reads data from memory, passes data to Dfiller, and schedules Dfiller in 10 cycles. (mmix-pipe.w, line 3951-3955) Cycle 8464 Dfiller receives data from Sfiller, passes data to LSU1:2 and schedules LSU1:2 in 2 cycles. (mmix-pipe.w, line 3999-4003) Cycle 8466 LSU1:2 receives data from Dfiller and then LSU1:2 completes ld instruction. (mmix-pipe.w, line 2674-2683) Cycle 8468 Now syncd instruction reaches the hot seat; LSU2:2 executes syncd instruction and starts Clean coroutine. (mmix-pipe.w, line 6428-6435) Cycle 8469 * Clean gets Dcache lock and waits for Dflusher to complete. (mmix-pipe.w, line 4154-4160, 4130-4133) Cycle 8494 Sfiller completes reading cache block and schedules Dfiller at next cycle. (mmix-pipe.w, line 3964) Cycle 8497 To copy data from Scache to Dcache, * Dfiller waits for Dcache lock holding the Scache lock. (mmix-pipe.w, line 4011) Lines with a leading * show that the meta-simulator goes into deadlock. Proposed patchLocking the Dcache in "clean" is dangerous. In this case, waiting for Dflusher while holding the Dcache lock causes deadlock. So, I changed mmix-pipe.w to release the Dcache lock before waiting for Dflusher. The following is my patch:--- ../mmix-20110831/mmix-pipe.w 2011-09-01 03:23:52.000000000 +0900 +++ mmix-pipe.w 2011-10-30 00:06:11.562500000 +0900 @@ -4130,7 +4130,7 @@ Dclean: data->state=1;@+ data->ptr_b=(void*)p;@+ wait(Dcache->access_time); -case 1:@+if (Dcache->flusher.next) wait(1); +case 1:@+if (Dcache->flusher.next) goto Dclean_retry; flush_cache(Dcache,p,data->x.o.h==0); p->tag.h|=data->x.o.h; release_lock(self,Dcache->lock); @@ -4161,6 +4161,16 @@ } data->state=9;@+ wait(Dcache->access_time); +Dclean_retry: + release_lock(self,Dcache->lock); + if (data->i==sync) data->state=100; + else data->state=4; + wait(1); +case 100:@+ if (Dcache->lock || (j=get_reader(Dcache)<0)) wait(1); + startup(&Dcache->reader[j],Dcache->access_time); + set_lock(self,Dcache->lock); + i=data->y.o.h, j=data->y.o.l; + goto Dclean_loop; @ @<Cases 5 through 9...@>= case 5:@+ if (self->lockloc) *(self->lockloc)=NULL, self->lockloc=NULL; DiscussionThis bug was fixed.I did an extensive study of the source code to see what locks coroutines do hold and do wait for. It is probably best to bring the locks/coroutines in an partial order so that a owning lock A, a coroutine is allowed to wait for lock B only if B comes later in the partial order. (coroutines are like locks because you can wait on coroutine->next) I established the following order: (with Scache present)
case 3: @It should release the Scache->lock before waiting, similar to the fill_from_mem routine. So the code should be: case 3: @compare this to fill_from_mem: case 1: release_lock(self,mem_lock); data->state=2; case 2:@+if (c!=Scache) { if (c->lock) wait(1); set_lock(self,c->lock); } if (cc) awaken(cc,c->copy_in_time); /* the second wakeup call */ load_cache(c,(cacheblock*)data->ptr_b); data->state=3;@+ wait(c->copy_in_time); case 3: goto terminate;To release the Dcache lock in clean while waiting for the Dcache flusher as proposed by Eiji Yoshiya would eliminate the Dcache lock as a precondition to the Dcache flusher. Even without actually taking that lock, for instance write_from_wbuf in lines 4660 to 4675 assumes, that there was a wait on the Dcache lock in line 4601. So the lock should remain a precondition for the Dcache flusher. (@ |
Please help to keep this site up to date! If you want to point out important material or projects that are not listed here, if you find errors or want to suggest improvements, please send email to