020. Pre-modern-UNIX sizeof division — blognꞌt


Sun, 28 Sep 2025 01:02:48 +0200 / Sun, 28 Sep 2025 19:54:03 +0200

I've been wondering why the _Countof() operator was not implemented back in the times of K&R. That was information certainly available in the compiler, and division was quite expensive, so you wouldn't want to perform a run-time division of sizeof. […] what does cc(1) produce for a sizeof division?
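For reference, the idiom in question, as a minimal present-day sketch. The COUNTOF macro and the gaming_count function are names made up here for illustration, not anything from K&R or the compilers examined below. Both sizeof operands are compile-time constants, so a compiler that folds constant divisions never has to emit a divide for it.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */

/* Hypothetical helper macro: element count of a true array (not a pointer). */
#define COUNTOF(a) (sizeof(a) / sizeof((a)[0]))

static int gaming[20];

/* The quotient is an integer constant expression, so it can be checked
   at compile time; the build fails if it is not 20. */
static_assert(COUNTOF(gaming) == 20, "sizeof division is a constant expression");

/* Returns the element count of gaming. */
size_t gaming_count(void)
{
	return COUNTOF(gaming);
}
```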

# UNIX Programmer's Manual

doesn't have C.

# UNIX Programmer's Manual, Second Edition

does, and we can observe it directly: the distributors of unix72, which is for most intents and purposes a V1 system, include the "last1120c" cc, which, according to dmr's "Very early C compilers and language" (or, more tellingly, primevalC.html), "is a saved copy of the compiler preserved just as we were abandoning the PDP-11/20", which puts it in the V2 era. Thus:

$ pdp11 simh.cfg
PDP-11 simulator V3.8-1
Disabling CR
Disabling XQ
RF: buffering file in memory
TC0: 16b format, buffering file in memory
:login: root
root
# chdir /tmp
# cat >a.c
int gaming[20];
size() { return sizeof(gaming) / 4#2; }
# cc -c a.c
1: External definition syntax
1: _gaming undefined
2: Statement syntax
2: _sizeof undefined
2: _gaming undefined
I
II

Primeval is right!

# cat >a.c
gaming[20];
size() { return 012345; }
# cc -c a.c
2: Statement syntax
I
II

So there is a reason the return () style persists!

# ed a.c
38
s/012345/(&)/
w
40
# cc -c a.c
I
II
# ls -l a.o
total    1
134 s-rwrw  1 root     92 Jan  1 00:00:00 a.o
# od a.o
00000  00407  00020  00004  00050  00044  00000  00000  00000
00020  10546  10605  12700  12345  00167 -77764  00167 -77760
00040  40024  40000  00000  00000  00000  00000  00000  00051

(I am sparing you from the rest of the object file and the remaining 420 bytes of this output; do note, however, how this signed output style is not available in present-day od).

With our epistemic method honed, we can attack the problem directly by spelling return 1783; as

# cat >b.c
size() { return(012345 / 3); }
# cc -c b.c
I
II
# ls -l b.o
total    1
133 s-rwrw  1 root     92 Jan  1 00:00:00 b.o
# od b.o
00000  00407  00030  00002  00000  00030  00000  00000  00000
00020  10546  10605  12714  12345  12767  00003 -37262  11400
00040  00167 -77754  00167 -77750  40000  00000  00000  00000

Shocking no-one, this is clearly a run-time division: both 12345 and 00003 survive as separate words in the object. I also couldn't find anything close to an optimiser in the source.
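The arithmetic behind "return 1783;", spelled out as a small present-day check. The function name size_div3 is made up for this sketch. A leading 0 makes the literal octal, which is why 12345 shows up verbatim in the octal dump.

```c
#include <assert.h>   /* static_assert (C11) */

/* 012345 octal = 1*8^4 + 2*8^3 + 3*8^2 + 4*8 + 5 = 5349 decimal. */
static_assert(012345 == 5349, "octal literal value");

/* 5349 / 3 = 1783 exactly, so a folding compiler could emit a single constant. */
static_assert(012345 / 3 == 1783, "quotient a folding compiler would emit");

/* Same body as b.c above, in present-day spelling. */
int size_div3(void)
{
	return (012345 / 3);
}
```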

"A PDP-11 disassembler exists." Good news if we are to dead-confirm this, but I don't think ken would appreciate it if I were to "Contact the author for more information." (I have, in fact, just quoted the entire das (VI) manual from V1, V2, and V3 (./manx/das.6); it goes away in V4.) It's also not part of unix72 or any dump.

But others must exist, and indeed pdp11dasm (from a re-re-re-host, as is tradition) works, yielding

000020: 010546                  mov     r5,-(r6)
000022: 010605                  mov     r6,r5
000024: 012714 012345           mov     #12345,(r4)
000030: 012767 000003 140516    mov     #3,140554
000036: 011400                  mov     (r4),r0
000040: 000167 100024           jmp     100070

after deworming + truncating the od output and echo d 0+20 > b.o.ctl.
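The signed words in the od output and the unsigned words in the listing are the same 16 bits, and the operand addresses in the listing are PC-relative. A minimal sketch of both conversions follows; the function names od_word and pc_rel_target are made up here, and this is not necessarily how the dump was actually massaged.

```c
#include <assert.h>
#include <stdint.h>

/* Map one word as printed by the old od (signed octal, e.g. -77754)
   back to its unsigned 16-bit value. Conversion to an unsigned type is
   defined modulo 2^16, which is exactly the two's-complement reading. */
uint16_t od_word(long printed)
{
	assert(printed >= -0100000L && printed <= 0177777L);
	return (uint16_t)printed;
}

/* Target of a PC-relative operand: the offset word is added to the PC
   after that word has been fetched (its own address + 2), modulo 2^16. */
uint16_t pc_rel_target(uint16_t offset_word_addr, uint16_t offset)
{
	return (uint16_t)(offset_word_addr + 2u + offset);
}
```

With these, -77754 reads as 100024 and -37262 as 140516, and the offset words at 000042 and 000034 land on the 100070 and 140554 that the disassembler prints.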

God knows how the jump achieves division, but it obviously does. V2 verdict: no sizeof, no int/int folding.

# prestruct-c

is a copy of the compiler just before I started changing it to use structures itself (imagine!). It's a bit hard to get really accurate dates for these compilers, except that they are certainly 1972-73; by the time of the second, we had a PDP-11 that did provide mapping. This roughly lines up with the metatext of PDP-11/45 mentions in UNIX Programmer's Manual, Third Edition, and definitely lines up with the February, 1973 dating.

Still no sizeof (proof by grep -r), but there is an optimiser! It is part of the first pass, and I present it to the dear reader in its entirety (though I must warn that the experience is not unlike reading Perl for the first time; also, int name[] is the spelling-du-jour of int *name):

optim(p)
int p[];
{
	int p1[], p2[], t;

	if (*p != 40)				/* + */
		return(p);
	p1 = p[3];
	p2 = p[4];
	if (*p1==21) {				/* const */
		t = p1;
		p1 = p2;
		p2 = t;
	}
	if (*p2 != 21)				/* const */
		return(p);
	if ((t=p2[3]) == 0)			/* const 0 */
		return(p1);
	if (*p1!=35 & *p1!=29)			/* not & */
		return(p);
	p2 = p1[3];
	if (*p2!=20) {				/* name? */
		error("C error (optim)");
		return(p);
	}
	p2[4] =+ t;
	return(p1);
}

This folds additions where either (one operand is a constant 0) or (one operand is a constant and the other is a &var expression). One has to assume &var + 3.5 gets promoted before this, but idk if this works for 3.5 + 0 or 3.5 + 0.0. Either way, V3 verdict: no sizeof, no int/int folding.
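The same two rules, as a sketch in present-day C. Everything named here (v3_op, v3_node, v3_optim) is invented for illustration: the original works on int arrays and bare operator numbers, tests two operator codes where this sketch has a single V3_AMPER, and reports an error where this sketch asserts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds; the 1973 source compares against bare numbers. */
enum v3_op { V3_PLUS, V3_CONST, V3_NAME, V3_AMPER, V3_OTHER };

struct v3_node {
	enum v3_op op;
	struct v3_node *left, *right; /* operands; V3_AMPER uses left only */
	int value;                    /* V3_CONST: the constant; V3_NAME: offset */
};

/* Returns the (possibly simplified) tree for p, mutating it in place. */
struct v3_node *v3_optim(struct v3_node *p)
{
	struct v3_node *p1, *p2, *t;

	if (p->op != V3_PLUS)
		return p;
	p1 = p->left;
	p2 = p->right;
	if (p1->op == V3_CONST) {       /* put the constant on the right */
		t = p1;
		p1 = p2;
		p2 = t;
	}
	if (p2->op != V3_CONST)
		return p;
	if (p2->value == 0)             /* x + 0 becomes x */
		return p1;
	if (p1->op != V3_AMPER)         /* otherwise only &name + c is folded */
		return p;
	assert(p1->left != NULL && p1->left->op == V3_NAME);
	p1->left->value += p2->value;   /* fold c into the name's offset */
	return p1;
}
```

Note that nothing here touches a division, or even an addition of two non-zero constants: const + const falls through the "not &" test and comes back unchanged.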

V4 has a V3-style cc (I), and, likely, compiler. No way to know.

# UNIX Programmer's Manual, Fifth Edition

−O

Invoke the experimental object-code optimizer.

but internally this is the C object code improver! (V5, ./usr/source/s1/c2?.c) as a new third pass (and actually parses and optimises the assembly output by the second pass, not the object, but same difference).

The actual optimiser moves to the second pass to take up a whole C compiler part 2 -- expression optimizer file (ibid., ./usr/source/s1/c12.c), and is no longer just a peep-hole at the two addition cases, though they do still feature, but in 4K this time:

optim(atree)
struct tnode *atree;
{
	// ...
	dope = opdope[op];
	if ((dope&LEAF) != 0)
		return(tree);
	if ((dope&BINARY) == 0)
		return(unoptim(tree));
	/* is known to be binary */
	if ((dope&COMMUTE)!=0) {
	acomm:	d1 = tree->type;
		tree = acommute(tree);
		tree->type = d1;
		return(tree);
	}
	// ...
}

acommute(atree)
{
	// ...
	flt = isfloat(tree);
	// ...
	if (!flt) {
		/* put constants together */
		for (i=acl.nextl;i>0&&t2[0]->op==CON&&t2[-1]->op==CON;i--) {
			acl.nextl--;
			t2--;
			const(op, &t2[0]->value, t2[1]->value);
		}
	}
	if (op==PLUS) {
		/* toss out "+0" */
		if (acl.nextl>0 && ((*t2)->op==CON || (*t2)->op==SFCON)
		 && (*t2)->value==0) {
			acl.nextl--;
			t2--;
		}
		if (acl.nextl <= 0)
			return(*t2);
		/* subsume constant in "&x+c" */
		if (t2[0]->op==CON && t2[-1]->op==AMPER) {
			t2--;
			t2[0]->tr1->offset =+ t2[1]->value;
			acl.nextl--;
		}
	} else if (op==TIMES) {
		// ...
	}
	// ...
}

but also with a full constant folder:

const(op, vp, av)
int *vp;
{
	register int v;

	v = av;
	switch (op) {

	case PLUS:
		*vp =+ v;
		return;

	case TIMES:
		*vp =* v;
		return;

	case AND:
		*vp =& v;
		return;

	case OR:
		*vp =| v;
		return;

	case EXOR:
		*vp =^ v;
		return;

	case DIVIDE:
	case MOD:
		if (v==0)
			error("Divide check");
		else
			if (op==DIVIDE)
				*vp =/ v;
			else
				*vp =% v;
		return;

	case RSHIFT:
		*vp =>> v;
		return;

	case LSHIFT:
		*vp =<< v;
		return;
	}
	error("C error: const");
}
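A rough present-day rendering of that folder, for readers who don't speak =/ and =%. The names fold_op and fold_const are made up for this sketch, and it returns a flag where the original calls error(). Unlike on the PDP-11, signed overflow and out-of-range shift counts are undefined in modern C, so treat this as an illustration rather than a drop-in replacement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operator codes mirroring the cases of the V5 switch. */
enum fold_op { F_PLUS, F_TIMES, F_AND, F_OR, F_EXOR,
               F_DIVIDE, F_MOD, F_RSHIFT, F_LSHIFT, F_OTHER };

/* Fold "*vp op v" into *vp. Returns false and leaves *vp alone where the
   original would report "Divide check" or "C error: const". */
bool fold_const(enum fold_op op, int *vp, int v)
{
	assert(vp != NULL);
	switch (op) {
	case F_PLUS:   *vp += v;  return true;
	case F_TIMES:  *vp *= v;  return true;
	case F_AND:    *vp &= v;  return true;
	case F_OR:     *vp |= v;  return true;
	case F_EXOR:   *vp ^= v;  return true;
	case F_DIVIDE:
	case F_MOD:
		if (v == 0)
			return false;     /* "Divide check" */
		if (op == F_DIVIDE)
			*vp /= v;
		else
			*vp %= v;
		return true;
	case F_RSHIFT: *vp >>= v; return true;
	case F_LSHIFT: *vp <<= v; return true;
	default:
		return false;         /* "C error: const" */
	}
}
```

Fed the operands from b.c, fold_const(F_DIVIDE, &v, 3) with v = 012345 leaves v at 1783, which is the single constant the V2 compiler did not produce.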

V5 verdict: yes sizeof (new: ibid., ./usr/source/s1/c0[01].c), yes int/int folding.

# &c.

V6 (./c/)'s cc doesn't change for the purposes of this analysis. By following Computer History Wiki's Installing UNIX v6 (PDP-11) on SIMH we can observe this directly:

$ pdp11 boot.ini
PDP-11 simulator V3.8-1
Disabling XQ
@unix
login: root
# chdir /tmp
# cat >a.c
gaming[012345];
size() { return(sizeof(gaming) / sizeof(*gaming)); }
# cc -c a.c
# od a.o
0000000 000407 000016 000000 000000 000110 000000 000000 000000
0000020 004567 177774 012700 012345 000400 000167 177762 000000
0000040 000071 000000 000000 000000 000000 000131 063537 066541

and naturally

000020: 004567 177774           jsr     r5,20
000024: 012700 012345           mov     #12345,r0
000030: 000400                  br      32

V7 (usr/src/cmd/c/)'s cc doesn't change for the purposes of this analysis.

Unix User's Manual, Release 3.0 ships both a classic V6 compiler and, on the PDP-11, a "portable" pcc. What it actually means by this is that it ships pcc on both the PDP-11 and VAX, but calls it /bin/pcc on the PDP-11 and /bin/cc on the VAX. The PDP-11 classic compiler grows unsigned and, with it, 8 branches in const(), but does not otherwise change for the purposes of this analysis (SysIII, src/cmd/cc/pdp11/c12.c; with thanks to my good friend Vetus).

pcc (both pccs, since, despite having a machine-independent and machine-dependent half, both are twice in the source, and the machine-independent bits are different (ibid., src/cmd/cc/pcc/* & src/cmd/cc/vax/{cc.*,pcc/,mip/})), also supports sizeof and also folds int/int divisions (but don't be fooled: the latter happens during AST construction, not in the optimiser (ibid., src/cmd/cc/pcc/mip/trees.c & src/cmd/cc/vax/mip/trees.c (these are actually the same))):

NODE *
buildtree( o, l, r ) register NODE *l, *r; {
	// ...
	opty = optype(o);

	/* check for constants */

	if( opty == UTYPE && l->in.op == ICON ){
		// ...
	else if( o==UNARY MINUS && l->in.op==FCON ){
		// ...
	else if( o==QUEST && l->in.op==ICON ) {
		// ...
	else if( (o==ANDAND || o==OROR) && (l->in.op==ICON||r->in.op==ICON) ) goto ccwarn;

	else if( opty == BITYPE && l->in.op == ICON && r->in.op == ICON ){
		// ...
		}

	else if( opty == BITYPE && (l->in.op==FCON||l->in.op==ICON) &&
		(r->in.op==FCON||r->in.op==ICON) ){
		switch(o){
		case PLUS:
		case MINUS:
		case MUL:
		case DIV:
			// ...
			switch(o){
			case PLUS:
				l->fpn.dval += r->fpn.dval;
				return(l);
			case MINUS:
				l->fpn.dval -= r->fpn.dval;
				return(l);
			case MUL:
				l->fpn.dval *= r->fpn.dval;
				return(l);
			case DIV:
				if( r->fpn.dval == 0 ) uerror( "division by 0." );
				else l->fpn.dval /= r->fpn.dval;
				return(l);
				}
			}
		}

	/* its real; we must make a new node */

	p = block( o, l, r, INT, 0, INT );

	// ...

	}
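The shape of "fold while building the tree" can be sketched as follows. This is a toy, not pcc: bt_op, bt_node, bt_new, bt_icon, bt_name and bt_build are all names invented here, only integer constants are handled, and a constant division by zero is simply left unfolded rather than diagnosed.

```c
#include <assert.h>
#include <stdlib.h>

enum bt_op { BT_ICON, BT_NAME, BT_PLUS, BT_MINUS, BT_MUL, BT_DIV };

struct bt_node {
	enum bt_op op;
	struct bt_node *left, *right; /* NULL for leaves */
	long lval;                    /* meaningful for BT_ICON only */
};

/* Allocate a node; aborts on allocation failure to keep the sketch short. */
static struct bt_node *bt_new(enum bt_op op, struct bt_node *l,
                              struct bt_node *r, long lval)
{
	struct bt_node *p = malloc(sizeof *p);
	if (p == NULL)
		abort();
	p->op = op;
	p->left = l;
	p->right = r;
	p->lval = lval;
	return p;
}

struct bt_node *bt_icon(long v) { return bt_new(BT_ICON, NULL, NULL, v); }
struct bt_node *bt_name(void)   { return bt_new(BT_NAME, NULL, NULL, 0); }

/* Build the node for "l o r"; if both operands are integer constants,
   reuse l as the folded constant instead of making a new node. */
struct bt_node *bt_build(enum bt_op o, struct bt_node *l, struct bt_node *r)
{
	assert(l != NULL && r != NULL);
	if (l->op == BT_ICON && r->op == BT_ICON) {
		int folded = 1;
		switch (o) {
		case BT_PLUS:  l->lval += r->lval; break;
		case BT_MINUS: l->lval -= r->lval; break;
		case BT_MUL:   l->lval *= r->lval; break;
		case BT_DIV:
			if (r->lval == 0)
				folded = 0;   /* do not fold a division by zero */
			else
				l->lval /= r->lval;
			break;
		default:
			folded = 0;
			break;
		}
		if (folded) {
			free(r);
			return l;
		}
	}
	/* it's real; we must make a new node */
	return bt_new(o, l, r, 0);
}
```

Because the fold happens in the constructor, a sizeof division never exists as a division node at all: bt_build(BT_DIV, bt_icon(40), bt_icon(2)) hands back a single constant 20.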

Also, VAX SysIII separately includes the improver (ibid., src/cmd/cc/vax/c2/) and the front-end does run it there.

SysVr1, sysv-pdp11_usr-src/cmd/cc/cc.mk elucidates(?) this:

#       The directory structure (global multi-machine view) assumed is:
#
#                       /cpp --- cpp.mk
#                       /
#                      /     /c2 --- c2.mk      (only for vax)
#                     /       /
#                    /       /  /cc --- cc.mk   (Ritchie's pdp11 C compiler)
#                   /       /   /
#       /usr/src/cmd --- /cc ------- cc.mk      (this makefile)
#                           \   \
#                            \  cc.c
#                             \      /mip
#                              \      /
#                               \    /   /vax --- pcc.mk
#                                \  /    /
#                               /pcc ------- pcc.mk
#                                   \
#                                    \
#                                   /pdp11 --- pcc.mk

But doesn't otherwise change for the purposes of this analysis (ibid., sysv-pdp11_usr-src/cmd/cc/pcc/mip/trees.c).

SysVr2 (usr/src/cmd/c/)'s ccs don't change for the purposes of this analysis.

SysVr3 (ATT-SYSVr3/{301,31}/usr/src/scripts/cc/) comes with just pcc and I don't see source for it but there's no reason to suspect it'd've lost the optimisation.

SysVr4 (ATT-SYSVr3/{301,31}/usr/src/scripts/cc/) comes with some compiler? But UNIX® System V Release 4 Programmer's Reference manual's cc(1) doesn't change from UNIX System V Programmer's Reference Manual's cc(1), so it's probably still the same pcc.

It also includes the all-time-funniest AT&T program as /usr/ucb/cc.

1BSD doesn't include a different C compiler (CD-ROM 1: Berkeley Systems 1978-1986, ./1bsd/).

2BSD includes some cc patches which do not affect related functionality we care about (ibid., ./2bsd/upgrade/c/).

3BSD ships cc as pcc + the extracted c2 improver (ibid., ./3bsd/usr/src/cmd/{cc.c,pcc/,mip/,c2/}). The relevant segment of ./3bsd/usr/src/cmd/mip/trees.c is the same as above.

4BSD's cc doesn't change for the purposes of this analysis (ibid., ./4.0/usr/src/cmd/mip/trees.c).

4.1BSD's cc doesn't change for the purposes of this analysis (ibid., ./4.1/usr/src/cmd/mip/trees.c).

4.2BSD's cc doesn't change for the purposes of this analysis (ibid., ./4.1/usr/src/lib/mip/trees.c).

4.3BSD's cc doesn't change for the purposes of this analysis (ibid., ./4.1/usr/src/lib/mip/trees.c).

4.3BSD-Tahoe's cc doesn't change for the purposes of this analysis (CD-ROM 2: Berkeley Systems 1987-1993, ./4.0/usr/src/cmd/mip/trees.c). It actually ships a second one in ./4.3tahoe/usr/src/lib/old_compiler/, but the purpose of this is unclear to me. The only reference (I think?) to this situation is in ./4.3tahoe/usr/doc/smm/12.uchanges/1.t, quoth:

Some effort has been made to improve error reporting for program errors and to handle exceptional conditions in which the old compiler used to punt.

4.3BSD-Reno's cc doesn't change for the purposes of this analysis (ibid., ./4.3reno/usr/src/libexec/pcc/mip/trees.c).

The 4.4BSD-Lites ship with GCC 2.3.3 (ibid., ./4.4BSD-Lite1; CD-ROM 3: Final Berkeley Releases, ./4.4BSD-Lite2).

It's difficult to come to a

# conclusion

other than "there has never been a compiler with sizeof but no int/int folding".

Perhaps for a hair? in 1973 around our V4 epistemic hole, but.
