Tom, it's hard to tell whether we are focusing on different details, but if we remove the information capacity constraint, then the story is different, but...

tom.stoer said:
	It acts on an infinite strip and it can store both data and program code on that strip.

My point is that abstractions that consider infinite (or even unbounded) tapes simply don't correspond to a physical system. My perspective is that any given observing system, encoding a theory and processing information about its environment, has a given finite information capacity. This can grow and shrink, presumably related to the process that is responsible for the origin of inertia and mass, but it doesn't change arbitrarily.
So IMO, what we have here is a "natural truncation". This truncation is also what limits decidability.
If we consider a larger theory, then it's possible, I agree, but this also corresponds to a different (more complex) observer. This is IMO somewhat analogous to the Gödel expansion. But this expansion is IMO a physical process and is constrained by inertia. That's why we have saturation and overflow of data. I think all these things have interesting connections to physical interactions, radiation etc. But now we get into more speculation. I mainly wanted to add "my line of association" to Gödel's theorem. I do not find abstractions of information and computability that make use of unbounded tapes or infinite computation times to be physically useful.
As I see it, expectations are produced analogously to computations, and what's interesting IMO is that the computation is completed/halted at a rate that is on par with the rate of new input. This suggests that the constraints of information capacity and computing power determine what the optimum algorithm is. A simple mind needs simple rules. A complex mind will use more complex rules.
As I see it there is a highly dynamical interaction here that involves evolution of algorithms, and from the point of view of a given observer there is a limit as to what's decidable. And this can ideally be exploited by a larger observer to predict how two smaller systems interact, by revealing "their logic".
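As a toy illustration of the "larger observer" idea (a sketch under my own assumptions, not a model of any physical system): for a deterministic machine whose tape is truncated to a fixed number of cells, the number of distinct configurations is finite, so an outside observer with enough memory to record every configuration can decide whether the machine halts, simply by watching for a repeat. The names `decide_halting`, `HALTS` and `LOOPS`, the binary alphabet, and the ring-shaped tape are all illustrative choices.

```python
from typing import Dict, Tuple

# Transition table: (state, symbol) -> (new_state, written_symbol, move).
# move is -1 (left) or +1 (right). A missing entry means the machine halts.
Rules = Dict[Tuple[str, int], Tuple[str, int, int]]


def decide_halting(rules: Rules, tape_len: int, start: str = "A") -> bool:
    """Return True if the bounded machine halts, False if it loops forever.

    Assumption: the tape is a ring of `tape_len` binary cells, so the head
    can never leave the finite memory.
    """
    tape = [0] * tape_len
    state, head = start, 0
    seen = set()  # the larger observer's memory: every configuration so far
    while True:
        config = (state, head, tuple(tape))
        if config in seen:
            # Deterministic machine + repeated configuration = endless loop.
            return False
        seen.add(config)
        key = (state, tape[head])
        if key not in rules:
            return True  # no applicable rule: the machine halts
        state, tape[head], move = rules[key]
        head = (head + move) % tape_len


# Hypothetical example machines, chosen only for illustration.
HALTS: Rules = {("A", 0): ("B", 1, +1), ("B", 0): ("A", 1, +1)}  # fills the tape, then stops
LOOPS: Rules = {("A", 0): ("A", 0, +1)}  # circles the ring without changing anything
```

Note that `seen` can grow to roughly (number of states) × `tape_len` × 2^`tape_len` entries, so in this sketch the deciding system needs far more memory than the system it decides about.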
But any given observer has a decidability limit, and this, IMHO at least, is very likely to show up (as observable effects) in the ACTION of this observer, as this is in effect a natural cutoff that is system dependent.
 /Fredrik