Introduction to the Ruby Language
Ruby is an "Open Source" program. It is supported by
a large community of developers and users.
Distribution of Ruby
1. The Ruby Source Code can be distributed
2. The Ruby Source Code can be modified
3. The Altered Source Code can be distributed
In each case no special permission or fee is required.
Ruby is Conservative
Ruby's features are already in wide use in other languages. Special
or experimental features are not included. Syntactic punctuation
such as parentheses and semicolons is not necessary in most
cases, but can be added for clarity.
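As a small illustration of the optional punctuation described above, the sketch below writes the same method twice, once without and once with parentheses and semicolons. The method names are made up for this example.

```ruby
# Terse form: no parentheses around the parameters, no semicolons.
def add_terse a, b
  a + b
end

# Explicit form: the optional punctuation is written out.
def add_explicit(a, b);
  return(a + b);
end

puts add_terse 1, 2        # parentheses omitted on the call
puts(add_explicit(1, 2));  # parentheses and semicolon written out
```

Both calls print the same value; which form reads better is a matter of taste and context.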
Ruby is an Object-Oriented Language
The notion of an object cannot be separated from Ruby.
The nature of object orientation, as far as Ruby is concerned,
will be explained later.
Ruby is a Script Language
Introducing Ruby as an "Object-Oriented Script Language"
will not please everyone. But Ruby is a true programming
language with strong theoretical roots. It can be used for
almost any programming job, including GUI programs and even
web programming.
Ruby is an Interpreter
Ruby is an interpreter; that is a fact. Because it is an
interpreter, it can be used to solve problems quickly.
Ruby Portability
Ruby is a UNIX-centered language. But the fact that it rarely uses
assembly code makes porting the language to other systems fairly
easy. The following is a partial list of the systems and OSes that
currently host the Ruby interpreter.
- Linux
- Win32 (Windows 95 and 98, Me and NT, 2000, XP)
- Cygwin
- Djgpp
- FreeBSD
- NetBSD
- OpenBSD
- BSD/OS
- Mac OS X
- Solaris
- Tru64 UNIX
- HP-UX
- AIX
- VMS
- UX/4800
- BeOS
- OS/2 (emx)
- Psion
Automatic Memory Management
Ruby implements automatic garbage collection. The programmer
does not have to worry about malloc/free sequences as in C
and C++. When an object is no longer being used, it is
automatically destroyed and its memory is returned to the pool.
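The following is a minimal sketch of what this looks like from the programmer's side: objects are allocated freely and never released by hand. It assumes the standard `GC` module of the reference interpreter; the method name `churn` is invented for the example, and the explicit `GC.start` is there only to make a collection observable.

```ruby
# Allocate many short-lived strings; no explicit free is ever written.
def churn(n)
  n.times { |i| "temporary string #{i}" }
  nil
end

RUNS_BEFORE = GC.count   # number of collections performed so far
churn(100_000)
GC.start                 # request a collection (normally unnecessary)
RUNS_AFTER = GC.count

puts "GC runs: #{RUNS_BEFORE} -> #{RUNS_AFTER}"
```

In ordinary programs there is no need to call `GC.start`; the collector runs on its own when memory is needed.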
Variables Have No 'Type'
Variables have no fixed type. This is one of the most powerful
weapons of the Ruby language. It means, for example, that an
array can contain several different data types.
Ruby is written in ANSI 'C'
Nowadays, a program written in ANSI C can be ported to an almost
unlimited number of systems and OS environments. The fact that
Ruby is written in C is a major feature. The original Ruby code
was written in K&R-style C and its influence is still visible,
though it is now ANSI C compatible. It is compiled with gcc
in the Linux environment.
Extension Libraries
The Ruby language can be augmented with extension functions written
in C. The functions provided mirror the language grammar, so
features written in C closely resemble Ruby code.
# method call
obj.method(arg)                                        # Ruby
rb_funcall(obj, rb_intern("method"), 1, arg);          # C
# block call
yield arg                                              # Ruby
rb_yield(arg);                                         # C
# raising an exception
raise ArgumentError, 'wrong number of arguments'       # Ruby
rb_raise(rb_eArgError, "wrong number of arguments");   # C
# creating an object
arr = Array.new                                        # Ruby
VALUE arr = rb_ary_new();                              # C
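The C call in the first pair looks a method up by name at run time. As a loose analogy only, and not a description of how the C API is implemented, the same idea can be sketched in pure Ruby with `__send__`, which takes the method name as a Symbol followed by the arguments. The helper `call_by_name` is hypothetical.

```ruby
# Hypothetical helper: call a method whose name is given as a string,
# roughly as the C side passes an interned name plus its arguments.
def call_by_name(obj, name, *args)
  obj.__send__(name.to_sym, *args)
end

text = "ruby"
puts text.upcase                    # ordinary method call
puts call_by_name(text, "upcase")   # same method, looked up by name
p call_by_name([1, 2], "push", 3)   # arguments are passed through
```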
Thread
Ruby implements threads within Ruby itself. They are entirely in-process, implemented inside the Ruby interpreter. That makes Ruby threads completely portable.
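The sketch below shows only the Ruby-level thread API, not the implementation inside the interpreter, which is what the paragraph above describes and which may differ between interpreter versions. The method name `parallel_squares` is invented for the example.

```ruby
# Start one thread per number and collect each thread's result.
def parallel_squares(numbers)
  threads = numbers.map { |n| Thread.new(n) { |x| x * x } }
  threads.map(&:value)   # value waits for the thread and returns its result
end

p parallel_squares([1, 2, 3])
```

Passing `n` as an argument to `Thread.new` gives each thread its own copy of the reference, which avoids sharing a loop variable between threads.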
Techniques for reading source code
This section discusses information and techniques that make reading the Ruby source code easier.
Learning Ruby Internals
Techniques of analysis
To analyze the source code, there are roughly two techniques:
1. Static Techniques
2. Dynamic Techniques
Dynamic analysis
Using the target program
The target program is run with a specific usage in mind and the
results are observed, using a debugger, a tracer, and/or debug
code inserted into the program being analyzed.
Following the execution with a debugger
The user can use a debugger to watch the execution flow and to see
what data is loaded into the data structures. There are tools, like
dia, that help draw pictures of data structures, which can make them
easier to visualize. Also see the graphviz program.
Tracer
Tracers can also capture information about program flow, for example Ctrace
{http://www.Vicente- .Org/ctrace}. To trace system calls there are Strace
{Http://www.Wi.Leidenuniv.Nl/ - Wichert/strace/} and the Ktrace tool. There is also the tool IDBG (www.hawthorne-press.com), which can be used with Ctrace.
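The tools above trace C programs. To show the general idea of a tracer in a small, runnable form, here is a sketch that records method calls in a Ruby program using the standard `TracePoint` class. It is an illustration of the technique, not a substitute for the tools named above; `helper`, `work`, and `trace_calls` are hypothetical names.

```ruby
def helper(x)
  x + 1
end

def work(x)
  helper(x) * 2
end

# Record the name of every Ruby-level method called while the block runs.
def trace_calls
  calls = []
  tracer = TracePoint.new(:call) { |tp| calls << tp.method_id }
  tracer.enable { yield }   # tracing is active only inside this block
  calls
end

p trace_calls { work(1) }
```

The recorded list gives the order in which methods were entered, which is the same kind of information a flow tracer provides for C code.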
Printf Tracing
Conditional tracing statements are embedded in the code being examined. Again, the IDBG program comes with Ruby tracing support and the ability to automatically insert and remove print statements and calls to support routines.
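A minimal sketch of conditional trace statements, written in Ruby for brevity. It assumes an environment variable named `TRACE` as the on/off switch; that name, the `trace` helper, and the `gcd` function being examined are all invented for this example and are unrelated to IDBG.

```ruby
# Tracing is off unless the (assumed) TRACE environment variable is "1".
TRACE_ENABLED = ENV["TRACE"] == "1"

# Print a trace line only when tracing is enabled.
def trace(message, out = $stderr, enabled: TRACE_ENABLED)
  out.puts "[trace] #{message}" if enabled
end

# The function under examination, with a trace statement embedded.
def gcd(a, b)
  trace "gcd(#{a}, #{b})"
  b.zero? ? a : gcd(b, a % b)
end

puts gcd(48, 18)
```

Because the trace lines go to standard error and are switched off by default, they can stay in the code while it is being studied without disturbing its normal output.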
Modify the code and run it
Also, if a function is difficult to understand, try changing its
parameters or code slightly and look at the result. The change can
often reveal what the function is doing.
Cflow, Cflow2dot, and Dot
The program 'cflow', or the programs 'prcc' and 'prcg' it uses, can
be used with 'cflow2dot' and 'dot' to produce process flow diagrams.
Static analysis
The importance of names
When doing static analysis of a program, the names of functions, variables,
and constants can often be good clues to their usage. This is especially
true if the original programmer followed good naming conventions and practices.
Read the documentation
Sometimes a document explaining the internal structure is available,
or the internal comments are extensive enough to explain the
code.
Investigation of abbreviations
If there are abbreviations in the code (say, GC), determine what they
stand for: is it 'Graphic Context' or 'Garbage Collection'?
Call Relationships of a function
Using a program suite like 'cflow/cflow2dot/dot' to generate a call
graph of a program or section of a program is very helpful.    It is easier
to grasp the process relationships visually.
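To make the last step of that tool chain concrete, the sketch below turns a table of call relationships into text in the DOT language, which the 'dot' program can render as a graph. The call table is written by hand for illustration and is hypothetical; in practice it would come from a tool such as 'cflow'.

```ruby
# Hypothetical call relationships, written by hand for illustration.
CALLS = {
  "main"      => ["ruby_init", "ruby_run"],
  "ruby_init" => ["rb_call_inits"],
  "ruby_run"  => ["eval_node"],
}

# Emit one DOT edge per caller/callee pair.
def to_dot(calls)
  lines = ["digraph calls {"]
  calls.each do |from, callees|
    callees.each { |to| lines << "  \"#{from}\" -> \"#{to}\";" }
  end
  lines << "}"
  lines.join("\n")
end

puts to_dot(CALLS)
```

Saving the output to a file and running it through 'dot' would produce the picture; the Ruby code itself only generates the text.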
Read the function code
Read the function carefully. Try to describe its purpose in one or two
words. If it is hard to read because of the coding style, then use
'indent' to convert the C style to a form you are comfortable with.
Try rewriting the function to your taste
Sometimes rewriting a function and showing that it produces the same result can lead to a much better understanding of the function. However, leave the original
code intact: if you find halfway through that things are not the same,
having the original allows you to find out where you may have gone wrong.
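A small sketch of this practice, in Ruby for brevity: the original function is kept untouched, a rewrite is placed beside it, and both are run on the same inputs. The functions and sample inputs are invented for the example, and agreement on a handful of inputs is evidence, not proof, that the two are equivalent.

```ruby
# "Original" function, kept exactly as found.
def sum_of_squares_original(numbers)
  total = 0
  i = 0
  while i < numbers.length
    total += numbers[i] * numbers[i]
    i += 1
  end
  total
end

# Rewrite in a style that is easier to read.
def sum_of_squares_rewritten(numbers)
  numbers.sum { |n| n * n }
end

# Compare the two on the same inputs, including edge cases.
SAMPLES = [[], [1], [1, 2, 3], [-4, 0, 4]]
SAMPLES.each do |input|
  a = sum_of_squares_original(input)
  b = sum_of_squares_rewritten(input)
  raise "mismatch for #{input.inspect}" unless a == b
end
puts "rewrite agrees on #{SAMPLES.size} inputs"
```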
Reading program history and change logs
A lot of information about a program can usually be found in its change logs, whether
attached to the program or not. CVS logs and annotations of various sorts are also useful. The mailing lists where the development community discusses changes, for example ruby-core,
contain a lot of information about the history of changes.
Tools for static analysis
Using 'etags' to generate a 'TAGS' file makes a lot of information available, for example a list of all functions called by a particular file. Using 'ctags', a cross reference can be generated.
Building Ruby
Building on a UNIX platform is divided into three parts:
1. Configure
2. Make
3. Make Install
Configure
Configure is a script that tries to determine whether everything needed by the build process is present. Its method of investigation is surprisingly simple.
The file 'Makefile.in' contains parameterized code that is converted
into the final Makefile based on the results of configure.
Makefile.in:  CFLAGS = @CFLAGS@
Makefile:     CFLAGS = -g -O2
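The substitution shown above can be imitated in a few lines. This is only a sketch of the idea of replacing `@NAME@` placeholders with detected values; it is not how the configure script itself is implemented, and the method name `substitute` is invented for the example.

```ruby
# Replace every @NAME@ placeholder with its value; unknown names are left as-is.
def substitute(template, values)
  template.gsub(/@(\w+)@/) do
    name = Regexp.last_match(1)
    values.fetch(name) { "@#{name}@" }
  end
end

puts substitute("CFLAGS = @CFLAGS@", "CFLAGS" => "-g -O2")
```

Leaving unknown placeholders untouched makes a missing value easy to spot in the generated file.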
After the configure script is executed,  the file 'config.h' is created.  
This file contains some of the results of configure execution.
:
:
#define HAVE_SYS_STAT_H 1
#define HAVE_STDLIB_H 1
#define HAVE_STRING_H 1
#define HAVE_MEMORY_H 1
#define HAVE_STRINGS_H 1
#define HAVE_INTTYPES_H 1
#define HAVE_STDINT_H 1
#define HAVE_UNISTD_H 1
#define _FILE_OFFSET_BITS 64
#define HAVE_LONG_LONG 1
#define HAVE_OFF_T 1
#define SIZEOF_INT 4
#define SIZEOF_SHORT 2
:
These values can be used by the programmer to determine if certain items are available.   For example, this is from 'ruby.h'
#ifdef HAVE_STDLIB_H
# include <stdlib.h>
#endif
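When exploring a build, it can help to load these definitions into a table and query them. The sketch below parses `#define NAME VALUE` lines from a short sample copied from the listing above; the method name `parse_defines` is invented, and real header files contain forms (comments, multi-line macros) that this simple pattern does not handle.

```ruby
# A few lines in the style of the generated config.h shown above.
CONFIG_H_SAMPLE = <<~HEADER
  #define HAVE_STDLIB_H 1
  #define HAVE_LONG_LONG 1
  #define SIZEOF_INT 4
HEADER

# Collect simple "#define NAME VALUE" lines into a Hash of strings.
def parse_defines(text)
  text.scan(/^#define\s+(\w+)\s+(\S+)/).to_h
end

DEFINES = parse_defines(CONFIG_H_SAMPLE)
puts "stdlib.h is available" if DEFINES.key?("HAVE_STDLIB_H")
puts "int is #{DEFINES["SIZEOF_INT"]} bytes"
```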
Autoconf
The use of Autoconf, Automake, and friends is described in full in GNU documents available at 'gnu.org'
Figure 1: Makefile Construction
Make
This second stage, Make, proceeds as follows:
1. The Ruby source code is compiled
2. The static library is built
3. Miniruby is statically linked
4. If --enable-shared is specified, the shared library libruby.so is built
5. Using Miniruby, the extension libraries are compiled
6. Lastly, the real Ruby is linked
CVS
CVS is a source management system that allows not only the current version
to be compiled, but also any previous version of the program that was entered
into the CVS system.
Ruby Construction
Physical Structure
As Ruby has gotten larger,  the program sources have been divided into a number of sub-directories:
1. Documents
2. Ruby Source Code
3. Ruby Tools for Building
4. Standard Extended Library
5. Standard Ruby Library
6. Translations and Additions
Classification of source code
The Ruby source code itself is divided into several parts:
Core of the Ruby language
class.c      Class-related API
error.c      Exception-related API
eval.c       Evaluator
gc.c         Garbage collector
lex.c        Reserved word table
object.c     Object system
parse.y      Parser
variable.c   Constants, global variables, and class variables
ruby.h       Main macros and prototypes of Ruby
intern.h     Prototypes of the Ruby C API. "intern" is presumably
             an abbreviation of "internal", but nothing prevents
             the functions declared here from being used by
             extension libraries.
rubysig.h    Header file that supplies the signal-related macros
node.h       Definitions related to syntactic tree nodes
env.h        Definitions of the structures that express the
             context of the evaluator
Utility
dln.c        Dynamic loader
regex.c      Regular expression engine
st.c         Hash table
util.c       Library for radix conversion, sorting, and so on
Ruby Initialization and Loading Routines
dmyext.c     Dummy of the extension library initialization routine
             (DumMY EXTension)
inits.c      Entry point of the initialization routines of the core
             and libraries
main.c       Entry point of the command
             (not needed for libruby)
ruby.c       Main part of the ruby command
             (also needed for libruby)
version.c    Ruby version
Class library
array.c      Class Array
bignum.c     Class Bignum
compar.c     Module Comparable
dir.c        Class Dir
enum.c       Module Enumerable
file.c       Class File
hash.c       Class Hash (see also st.c)
io.c         Class IO
marshal.c    Module Marshal
math.c       Module Math
numeric.c    Class Numeric (Integer, Fixnum, and Float)
pack.c       Array#pack and String#unpack
prec.c       Module Precision
process.c    Module Process
random.c     Kernel#srand() and rand()
range.c      Class Range
re.c         Class Regexp (see also regex.c)
signal.c     Module Signal
sprintf.c    sprintf() for the exclusive use of Ruby
string.c     Class String
struct.c     Class Struct
time.c       Class Time
Platform-dependent files
bcc32/       Borland C++ (Win32)
beos/        BeOS
cygwin/      Cygwin (the UNIX emulation layer on Win32)
djgpp/       Djgpp (the free software development
             environment for DOS)
vms/         VMS (an OS formerly released by DEC)
win32/       Visual C++ (Win32)
x68/         Sharp X680x0 series (the OS is Human68k)
Logical structure
The Ruby core group of files is divided into three parts.
The first creates the object world of Ruby (the Object Space). The
second is the Parser, which creates the internal representation of the Ruby program. The last is the Evaluator, which drives the program.
Object Space
The object space is the memory that holds the objects created and operated on by the evaluator.   This is explained in chapters 2 through 7.   The following node chart represents the Nodes in Object Space for a simple ruby program consisting of only a minimal class called TestCase.   The unshaded nodes are created by Ruby before reading the user's program.
Figure 2: Object Space Node Chart   (Created with the Graphviz DOT program)
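The object space can also be inspected from within a running program. The sketch below assumes the `ObjectSpace` module of the reference interpreter and a minimal `TestCase` class like the one in the figure; it lists the classes that currently exist, which include both the user's class and the ones Ruby created before reading the program.

```ruby
# A minimal user-defined class, as in the node chart.
class TestCase
end

# Names of all classes currently in the object space
# (anonymous classes have no name and are skipped).
CLASS_NAMES = ObjectSpace.each_object(Class).map(&:name).compact

puts "TestCase present: #{CLASS_NAMES.include?("TestCase")}"
puts "Object present:   #{CLASS_NAMES.include?("Object")}"
puts "classes in total: #{CLASS_NAMES.size}"
```

The total depends on the interpreter version and on which libraries have been loaded, so no particular number should be expected.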
Parser
The Ruby Parser converts a Ruby program into an internal representation called a "Syntactic Tree".   This representation is processed by the evaluator when executing the program.   The following ruby statements are converted into the "Syntactic Tree" as shown below.
Figure 3: Syntactic Tree for example statements
The Parser and Syntactic Trees are discussed in chapters 8 through 12.
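For a quick look at a tree form of a Ruby program, the standard library `ripper` can be used, assuming an interpreter that ships it. Note the assumption: the nested arrays it returns are Ripper's own representation, not the internal node structures this book describes, but they show the same idea of source text becoming a tree. The method name `syntax_tree` is invented for the example.

```ruby
require "ripper"

# Return Ripper's S-expression (nested arrays) for a piece of source code.
def syntax_tree(source)
  Ripper.sexp(source)
end

p syntax_tree("a = 1 + 2")
```

The outermost element of the result is the symbol `:program`, with the statements of the program nested inside it.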
Evaluator
The Evaluator is where a Ruby program is actually executed. It is the subject of the third part of this book (Evaluation) and is covered in chapters 13 through 17.
The original work is Copyright © 2002 - 2004 Minero AOKI.
Translated by Vincent ISAMBART
Translations and Additions by C.E. Thornton
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.