[IAEP] etoys now available in Debian's non-free repository

Yoshiki Ohshima yoshiki at vpri.org
Tue Jun 24 23:36:37 EDT 2008


  Thank you, Jim!

  I missed the earlier conversation on this one, so this is probably
redundant, but here is some additional information:

> > We then make sure that the stage2 and stage3 binaries are identical.
> > (This check has caught hundreds of bugs in gcc, binutils, and in
> > vendor compilers.)
> > 
> 
> Ah, yes, I remember this.  I've even struggled to do this once or twice,
> but that was about 15 years ago...

  Yeah, I used to just try it, too.  (Incidentally, I needed to test
the GCC 2.95.2 compiler in this way a few months ago.  I had to edit
a few header files to make it work on FC7.)

> > This is quite different from the eToys situation, in which there is a
> > single binary implementation of the language;

  It is not a binary implementation of the language.

> > and the sources, where
> > present, are all mixed into a binary blob that's only readable by the
> > single implementation.

  Not true.  Just open the .sources file and the .changes file in
/usr/share/etoys with your favorite editor that understands UTF-8 and
can handle big files.  The remaining stuff that some people might have
concerns about is the pre-loaded content.  But that is content, and no
different from a .sh shell script for a shell or a .py program for Python.

> > I have the same concerns that Debian does.  Is
> > there even a tool internal to eToys that confirms that everything in a
> > blob includes the matching source?

  Of course there is.  Why do you imagine that there may not be?

> >  Let alone a tool that would
> > extract that source and rebuild the blob from scratch, using a
> > virgin binary environment.

  The source is the .sources file and the .changes file.  You don't
have to extract it.  What you can extract from the .image file alone is
the decompiled string.
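
  To see the difference, here is a minimal sketch for a workspace.  It
assumes that sourceCodeAt: and decompileString are available in your
image; Object>>printOn: is only an arbitrary example method.  Print
each expression separately:

```smalltalk
"The source as the programmer wrote it, read from the .sources or
 .changes file."
Object sourceCodeAt: #printOn:.

"The decompiled text, reconstructed from the bytecode in the image alone."
(Object compiledMethodAt: #printOn:) decompileString.
```

  The second result should be equivalent code, but without the original
comments and formatting.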
  
> > We could've bootstrapped GCC once, and limped along ever afterward
> > with binaries built from that one original GCC binary.  (In a sense,
> > the entire C compiler market has done this.  Bell Labs' original C
> > compiler was bootstrapped from a BCPL compiler, and every other C
> > compiler probably bootstrapped from Bell's C compilers.)  Instead, the
> > GCC maintainers built lots of infrastructure to allow GCC to be
> > bootstrapped anytime somebody wants to.  And to test it regularly.
> > That's the part that eToys hasn't done.

  Start Etoys, open a workspace, and type something like:

  SystemNavigation current allBehaviorsDo: [:cls |
	 cls name displayAt: 0@200. cls compileAll].

Then select it and press ctrl-D.  The system recompiles all method
definitions from the source.

  Or, you can compare the previous result and new result by something
like:

	old := Object compiledMethodAt: #printOn:.
	Object recompile: #printOn:.
	(Object compiledMethodAt: #printOn:) = old.  (and it will return "true".)

  I sometimes run this test for all methods when I change something
deep in the Compiler.  And yes, I do it multiple times, so that the
Compiler compiled by my Compiler generates the same code for the
Compiler.
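
  A rough sketch of what running that comparison over all methods might
look like, assuming that selectors and recompile: behave in your image
the same way as in the single-method case above.  The mismatches
collection is just an illustrative name, and methods whose source is
unavailable may need to be skipped:

```smalltalk
"Sketch: recompile every method and collect the ones whose compiled
 form differs from what was in the image before."
| mismatches |
mismatches := OrderedCollection new.
SystemNavigation current allBehaviorsDo: [:cls |
	cls selectors do: [:sel | | old |
		old := cls compiledMethodAt: sel.
		cls recompile: sel.
		(cls compiledMethodAt: sel) = old
			ifFalse: [mismatches add: cls name , '>>' , sel]]].
mismatches
```

  Print the last expression; an empty collection means every recompiled
method matched the one it replaced.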

> The VM (virtual machine), which is compiled using a C compiler and
> exquisitely examined regularly for performance reasons, and recompiled
> with some regularity with your favorite C compiler.  As I understand it,
> Squeak generates this C code itself.

  Yes, and the debugging can even be done in Smalltalk.

> This VM interprets the image file, and so this C code of the VM can and
> is regularly examined, as Yoshiki points out, and for which the code can
> be decompiled by tools and examined.  In fact, the binary image is
> routinely decompiled whenever debugging is done in Squeak.

  It is decompiled, and then the corresponding portion of the .sources
file or .changes file is brought in and shown in the debugger, so that
the user can see the actual source (exactly as the programmer wrote
it, with comments and proper indentation).

> So as Yoshiki points out, it is actually feasible to complete this loop
> and verify the binary in the image file has the same result; external
> programs (have) exist(ed) to do so, in Yoshiki's example, in Squeak.  In
> this case, the Thompson attack seems unlikely; having Squeak able to
> recognize you are compiling a program intended to decompile an image
> seems pretty far-fetched to me (it isn't the same as a compiler
> recognizing it is compiling itself).

  Again, nothing would be gained from doing it externally, but it is
highly unlikely that there is something hidden around the program part
of the image.  Nobody can prove that something doesn't exist, so in
principle some other content part could have some malicious thing in
it.  But it would be really hard to get the machine instruction pointer
to point to a portion of a ByteArray in the image and run it (for
example).

-- Yoshiki


