[TUHS] Introduction

Jose R. Valverde jrvalverde at cnb.csic.es
Fri Jun 27 22:24:30 AEST 2008


Dear Oliver,

	well, the fact that prf.c spells out the AND explicitly means that
it was actually written as such directly in the source. But...

On Thu, 26 Jun 2008 16:52:46 +0200
Oliver Lehmann <lehmann at ans-netz.de> wrote:
> Hi Jose,
> 
> first - thanks for taking the time helping me here on this issue.
> 
> 		s=(char *)(*(long *)adx & 0x7F00FFFF);
> 
> in prf.c compiles to:
>
Let me test if I understand this:
 
>         ldl     rr2,|_stkseg+~L1+8|(fp)

	rr2 = adx

>         ldl     rr4,@rr2

	rr4 = *adx

>         and     r4,#32512	
>         ldl     |_stkseg+~L1+12|(fp),rr4
	s = rr4
> 


Which is equivalent to the code you describe in your problems page,
except for the @.

It also looks as if what it is doing is

	s = (char *) ((*(long *)adx) & 0x7F00FFFF);

reflecting how the compiler has read the line (giving higher precedence
to the * than to the &). Am I mistaken? Hence the & is applied to the value
pointed to by adx, not to adx itself before indirecting it.

That means that adx contains a pointer to a pointer instead of a pointer
to unsigned int as declared, and would explain the need for the
first cast (long *), so that the value pointed to by adx is read as a long,
not an unsigned int. I wonder why adx was not declared as
unsigned long * directly...

Then that long (which is actually a pointer) is ANDed so that it falls within
the stack, and finally coerced to be interpreted as a char *.
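
To make that reading concrete, here is a minimal sketch in portable C. It
is only an illustration: uint32_t stands in for the Z8000 long, and the
names STACK_MASK, masked_arg and check_masked_arg are mine, not anything
from prf.c. The slot is read first and the mask is applied to the value
that was read, never to adx itself.

```c
#include <assert.h>
#include <stdint.h>

/* Mask taken from the prf.c line quoted above. */
#define STACK_MASK 0x7F00FFFFUL

/* Hypothetical helper: adx points at a 32-bit slot that holds an
 * address. The slot is dereferenced first, then the value is masked. */
uint32_t masked_arg(const uint32_t *adx)
{
    uint32_t value = *adx;                  /* like: ldl rr4,@rr2   */
    return (uint32_t)(value & STACK_MASK);  /* like: and r4,#32512  */
}

/* Small self-check: only the fetched value changes, the slot does not. */
void check_masked_arg(void)
{
    uint32_t slot = 0xFF12ABCDu;
    assert(masked_arg(&slot) == 0x7F00ABCDu);
    assert(slot == 0xFF12ABCDu);
}
```

Note that 32512 is 0x7F00, so the and only touches the high word; the
0xFFFF half of the mask leaves the low word as it was.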

>  
> So the only thing in sys2.c's link() you wanted me to change in your
> previous mail was:
> 
> 	u.u_dirp.l = (caddr_t) ((long *) uap->linkname & 0x7F00FFFF);
> 
> right? Tried this, and got:
> 
> "sys2.c":305: operands of "&" have incompatible types 
> Error in file sys2.c: Error.  No assembly.
> 
My take is that, whatever the original source was, it must have been very
close to my suggestion. If we assume the former interpretation
then, of course, in prf.c it compiles (as it is ANDing a pointer
coerced to long with the constant) and here it doesn't (as it
would be ANDing a long *).
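
The difference between the two casts can be sketched in present-day C.
This is an assumption-laden illustration, not the original source:
uintptr_t plays the role of the old long, and mask_pointer_bits is a name
I made up. Casting to an integer type gives & something it accepts, while
casting to long * still leaves a pointer operand.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_MASK 0x7F00FFFFUL

/* Hypothetical helper: convert the pointer to an integer, then mask.
 * Writing "(long *)p & STACK_MASK" instead is rejected by the compiler,
 * because the left operand of & would still be a pointer. */
uintptr_t mask_pointer_bits(const void *p)
{
    return (uintptr_t)p & STACK_MASK;
}

/* Self-check with a made-up address that is never dereferenced. */
void check_mask_pointer_bits(void)
{
    const void *fake = (const void *)(uintptr_t)0xFF12ABCDUL;
    assert(mask_pointer_bits(fake) == 0x7F00ABCDUL);
}
```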

Why don't you try to split the assignment into several statements
to reproduce the assembly and then recombine them? Like, e.g.

1:	r2 = uap->linkname;		/* ldl rr2,rr8(#4) */
2:	r4 = (long) r2;			/* ldl rr4,rr2 */
3:	r4 &= 0x7F00FFFF;		/* and rr4,#32512 */
4:	u.u_dirp.l = (caddr_t) r4;	/* ldl _u+78, rr4 */

If you can get it by parts, then you can work your way back,
recombining with parentheses. I suspect line (2) above will give the
clue.
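
As a sanity check on the idea, here is a small sketch showing that the
stepwise version and the recombined one-liner compute the same value. It
assumes plain 32-bit integers in place of the kernel types (uap, u.u_dirp
and caddr_t are left out), and mask_by_steps, mask_combined and
check_recombination are names invented for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_MASK 0x7F00FFFFUL

/* The four steps as separate statements; linkname stands in for the
 * value of uap->linkname. */
uint32_t mask_by_steps(uint32_t linkname)
{
    uint32_t r2 = linkname;        /* 1: load the argument          */
    uint32_t r4 = r2;              /* 2: copy it, treated as a long */
    r4 &= (uint32_t)STACK_MASK;    /* 3: apply the mask             */
    return r4;                     /* 4: value to store in u_dirp   */
}

/* The same work recombined into a single expression. */
uint32_t mask_combined(uint32_t linkname)
{
    return (uint32_t)(linkname & STACK_MASK);
}

/* Both forms must agree. */
void check_recombination(void)
{
    assert(mask_by_steps(0x7F12FFF0u) == mask_combined(0x7F12FFF0u));
    assert(mask_combined(0x7F12FFF0u) == 0x7F00FFF0u);
}
```

Whether the real compiler emits the extra register copy for one form and
not the other is exactly what you would have to find out by trying.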

Another possibility is that some other conversion was used. I notice similar
code in rdwr(), but there it is of the kind
	ldl rrX, something
	ldl rr4,rrX(#some offset)
	and r4,#32512 (or some other value, like 61440) 

So, what if it was written in such a way that the compiler decided to assign
the value of rr2 to an auxiliary register, believing there was an offset,
but the offset was zero?

	u.u_dirp.l => ((saddr_t) (uap->linkname)).l

Maybe they first cast uap->linkname into a segmented address (as it points to
user data), leading to

		((saddr_t) uap->linkname).l

to get the segmented stack pointer that was to be fixed by the AND and then you
cast it to long for the AND

		(long) ((saddr_t) uap->linkname).l
giving
	u.u_dirp.l = (caddr_t) ((long) (((saddr_t) uap->linkname).l) & 0x7F00FFFF);
	// this might force use of an aux. variable for the 0 offset and then anding it

or maybe the simpler form implicitly forces the code

	u.u_dirp = (saddr_t) (((long) uap->linkname) & 0x7F00FFFF);

Another possibility is that it was coded by hand in assembler, working from
assembly listings generated by the compiler: during development, prf.c was
probably coded early on, and then maybe they hand coded that part using prf.c
as a template (reproducing the verbose, now unneeded ldl rr4,rr2 line).

> 
> > I notice that nsseg in mch.s may return %7F00 on some cases and is used
> > in machdep.c as stseg = nsseg(u_state->s_sp); so it seems the stack uses
> > segment 0x7F00. Then may be the & is shorthand to make sure the address
> > pointed by the ANDed pointer falls within the stack. It would probably
> > imply user programs have a maximum stack size of 65536 bytes as well.
> > 
> > That may explain why some pointers are ANDed and others not. I haven't
> > had a thorough look, but if the &0x7F00FFFF usage is consistent, then
> > that is an explanation that may guide source reconstruction.
> 
> A memory segment is 64Kbyte of size. The hardware is a bit special here.
> The CPU can access the memory in a segmented and a nonsegmented mode. For
> this purpose 3(!) MMUs are existing. A special MMU control logic is
> implemented which can handle 3 states:
> 1: segmented OS (CPU works in system mode)
> 	The segments Code, Data and Stack are managed by MMU1. MMU2 and
> 	MMU3 are not active
> 2: userprocess not segmented (CPU works in normal-mode, segmentnumber 63)
> 	The segments Code, Data and Stack are managed by MMU1. MMU2 and
> 	MMU3 are not active. This is done by a special break register
> 3: userprocess segmented (CPU works in normal-mode)
> 	MMU2 and MMU3 are used to process the 128 possible memory segments
> 	which can be Code-, Data- or Stack-Segments. MMU2 manages the
> 	segments 0-63 and MMU3 manages the segments 64-127. The switching
> 	between both MMUs is hardware controlled, depending on the
> 	segment line. Both MMUs are programmed for segment 0..63.
> 
> A colleague of mine wrote about this:
> >>>>>
> I've looked at your problems site and think I can imagine why the AND 0x7f00ffff
> is there. Remember, the Z8000 segmentation concept is flawed in that a segment
> address can wrap around without warning. Now, at a higher level, UNIX uses a flat
> address space and somewhere this logical address needs to be translated into
> physical addresses. This is done by the MMU. However, if an overflow beyond the
> 64k boundary happens during pointer arithmetic, it can spill over into the segment
> number, which is, you might remember, at bits [30:24]. So as soon as a pointer is
> created by the compiler, it is ANDed with 0x7f00 for the upper 16 bits to extract the
> segment number and with 0xffff for the lower 16 bits to obtain the real logical address PC.
> 
> It would be really interesting to look at the implementation of malloc for memory blocks
> greater than 64K byte. My assumption is that the compiler inserts this AND on its own for
> any pointer arithmetic.
> <<<<<
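
To see what the mask does to an address that has overflowed, here is a
small sketch. It assumes the layout described in the quote (segment number
in bits 30..24, 16-bit offset in bits 15..0) and does the arithmetic on
plain 32-bit integers; segment_of, offset_of and add_and_mask are names I
invented, not anything from the kernel or the compiler.

```c
#include <assert.h>
#include <stdint.h>

#define SEG_MASK  0x7F000000UL            /* segment number, bits 30..24 */
#define OFF_MASK  0x0000FFFFUL            /* offset, bits 15..0          */
#define ADDR_MASK (SEG_MASK | OFF_MASK)   /* 0x7F00FFFF                  */

unsigned segment_of(uint32_t addr)
{
    return (unsigned)((addr & SEG_MASK) >> 24);
}

unsigned offset_of(uint32_t addr)
{
    return (unsigned)(addr & OFF_MASK);
}

/* Plain 32-bit add followed by the mask. A carry out of the 16-bit
 * offset lands in bits 23..16, which the mask clears again. */
uint32_t add_and_mask(uint32_t addr, uint32_t delta)
{
    return (uint32_t)((addr + delta) & ADDR_MASK);
}

/* Stepping one byte past the end of segment 0x7F wraps to offset 0
 * of the same segment once the mask has been applied. */
void check_wraparound(void)
{
    uint32_t next = add_and_mask(0x7F00FFFFu, 1u);
    assert(next == 0x7F000000u);
    assert(segment_of(next) == 0x7Fu);
    assert(offset_of(next) == 0u);
}
```

So with the mask in place the offset silently wraps inside the segment
instead of leaving stray bits between the offset and the segment number.
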
> 
> Maybe this helps...
> 
>    Greetings, Oliver
> 
> -- 
>  Oliver Lehmann
>   http://www.pofo.de/
>   http://wishlist.ans-netz.de/


-- 
	These opinions are mine and only mine. Hey man, I saw them first!

			    José R. Valverde

	Artificial Intelligence is of no use when the Natural kind is lacking