At the Ruby level, there are two procedures that can be used for loading: require and load.

require 'uri'            # load the uri library
load '/home/foo/.myrc'   # read a resource file

They are both normal methods, compiled and evaluated exactly like any other code. This means that loading occurs after compilation has handed control over to the evaluation stage.

These two functions each have their own use. require is for loading libraries, and load is for loading an arbitrary file. Let’s look at this in more detail.


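require has four features:

  1. the file is searched for in the load path
  2. it can load extension libraries
  3. the .rb/.so extension can be omitted
  4. a given file is never loaded more than once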

Ruby’s load path is in the global variable $: that contains an array of strings. For example, displaying the content of the $: in the environment I usually use would show:

% ruby -e 'puts $:'

Calling puts on an array displays one element per line, so it’s easy to read.

As I ran configure using --prefix=/usr, the library path is /usr/lib/ruby and below, but if you compile it normally from the source code, the libraries will be in /usr/local/lib/ruby and below. In a Windows environment, there will also be a drive letter.

Then, let’s try to require the standard library from the load path.

require 'nkf'

If the required name has no extension, require silently compensates. First, it tries with .rb, then with .so. On some platforms it also tries the platform’s specific extension for extension libraries, for example .dll in a Windows environment or .bundle on Mac OS X.

Let’s do a simulation on my environment. ruby checks the following paths in sequential order.

/usr/lib/ruby/1.7/i686-linux/    found!

The file has been found in /usr/lib/ruby/1.7/i686-linux. Once the file has been found, require's last feature (not loading the file more than once) locks it. The locks are strings put in the global variable $". In our case the string "" has been put there. Even if the extension has been omitted when calling require, the file name in $" has the extension.

require 'nkf'   # after loading nkf...
p $"            # [""]  the file is locked

require 'nkf'   # nothing happens if we require it again
p $"            # [""]  the content of the lock array 
                # does not change

There are two reasons for adding the missing extension. The first one is to avoid loading the file twice if it is later required with its extension. The second one is to be able to load both nkf.rb and an extension library of the same name. In fact the extensions differ (.so, .dll, .bundle, etc.) depending on the platform, but at locking time they all become .so. That’s why, when writing a Ruby program, you can ignore the differences between extensions and assume it is always .so. So you can say that ruby is quite UNIX oriented.

By the way, $" can be freely modified even at the Ruby level so we cannot say it’s a strong lock. You can for example load an extension library multiple times if you clear $".
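The weakness of this lock can be observed from a script. The following is only a sketch: the feature name rhg_lock_demo and the counter variable are made up for the demonstration, and it assumes a current ruby, which records the full path of the file in $" rather than the short name shown above.

```ruby
require 'tmpdir'

# Hypothetical feature name, used only for this demonstration.
DEMO_FEATURE = 'rhg_lock_demo'

# Requires a throwaway library three times and returns the three results
# followed by the number of times the file body was evaluated.
def lock_demo
  $rhg_lock_demo_loads = 0
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, "#{DEMO_FEATURE}.rb"),
               "$rhg_lock_demo_loads += 1\n")
    $:.unshift(dir)
    begin
      first  = require(DEMO_FEATURE)   # loads the file and locks it in $"
      second = require(DEMO_FEATURE)   # already locked: nothing happens
      # Remove the lock by hand.
      $".delete_if { |f| File.basename(f) == "#{DEMO_FEATURE}.rb" }
      third  = require(DEMO_FEATURE)   # the file is evaluated again
      [first, second, third, $rhg_lock_demo_loads]
    ensure
      # Leave the interpreter state as we found it.
      $:.delete(dir)
      $".delete_if { |f| File.basename(f) == "#{DEMO_FEATURE}.rb" }
    end
  end
end

p lock_demo   # prints [true, false, true, 2]
```

The second require returns false because the lock is found, and the third one loads the file again because the lock was removed by hand.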


load is a lot easier than require. Like require, it searches the file in $:. But it can only load Ruby programs. Furthermore, the extension cannot be omitted: the complete file name must always be given.

load 'uri.rb'   # load the URI library that is part of
                # the standard library

In this simple example we try to load a library, but the proper way to use load is for example to load a resource file giving its full path.
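To make the contrast with require concrete, here is a small sketch that loads a resource file by its full path twice. The file content and the counter variable are invented for the example; the only point is that load keeps no lock, so the file is evaluated on every call.

```ruby
require 'tmpdir'

# Loads a small resource file twice by its full path and returns how many
# times its body ran. The counter variable is made up for this sketch.
def load_twice_demo
  $rhg_rc_runs = 0
  Dir.mktmpdir do |dir|
    rc = File.join(dir, '.myrc')
    File.write(rc, "$rhg_rc_runs += 1\n")
    load rc          # evaluated
    load rc          # evaluated again: load keeps no lock
    $rhg_rc_runs
  end
end

p load_twice_demo   # prints 2
```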

Flow of the whole process

Roughly speaking, “loading a file” can be split into:

The only difference between require and load is how to find the file. The rest is the same in both.

We will develop the last evaluation part a little more. Loaded Ruby programs are basically evaluated at the top-level. It means the defined constants will be top-level constants and the defined methods will be function-style methods.

### mylib.rb
MY_OBJECT = Object.new
def my_p(obj)
  p obj
end

### first.rb
require 'mylib'
my_p MY_OBJECT   # we can use the constants and methods defined 
                 # in another file

Only the local variable scope of the top-level changes when the file changes. In other words, local variables cannot be shared between different files. You can of course share them using for example Proc but this has nothing to do with the load mechanism.
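The following sketch checks these three claims in one go. The file name and the identifiers inside it are hypothetical. It assumes a current ruby, where TOPLEVEL_BINDING gives access to the local variables of the main program.

```ruby
require 'tmpdir'

# Writes a hypothetical library file, loads it, and reports what is
# visible from the loading side afterwards.
def toplevel_demo
  Dir.mktmpdir do |dir|
    lib = File.join(dir, 'rhg_scope_demo.rb')
    File.write(lib, <<~RUBY)
      RHG_SCOPE_CONST = :shared unless defined?(RHG_SCOPE_CONST)
      def rhg_scope_method; :callable; end
      rhg_scope_local = :hidden
    RUBY
    load lib
    {
      # the constant became a top-level constant
      constant: Object.const_defined?(:RHG_SCOPE_CONST),
      # the method became a function-style (private) method of Object
      method:   Object.private_method_defined?(:rhg_scope_method),
      # the local variable stayed inside the loaded file
      local:    TOPLEVEL_BINDING.local_variable_defined?(:rhg_scope_local)
    }
  end
end

p toplevel_demo
```

The constant and method entries come out true, while the local entry is false.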

Some people also misunderstand the loading mechanism. Whatever the class you are in when you call load, it does not change anything. Even if, like in the following example, you load a file in the module statement, it does not serve any purpose, as everything that is at the top-level of the loaded file is put at the Ruby top-level.

require 'mylib'     # wherever you require from, be it 
                    # at the top-level
module SandBox
  require 'mylib'   # or in a module, the result is the same
end
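This can be verified with a short sketch. The module RhgSandBox and the constant defined by the loaded file are names invented for the example, and load is used instead of require so that the check can be repeated.

```ruby
require 'tmpdir'

# Hypothetical module standing in for SandBox.
module RhgSandBox
end

# Loads a file from inside the module and reports where its constant ended up:
# [defined directly on Object?, defined directly on RhgSandBox?]
def sandbox_demo
  Dir.mktmpdir do |dir|
    lib = File.join(dir, 'rhg_sandbox_lib.rb')
    File.write(lib, "RHG_SANDBOX_CONST = 1 unless defined?(RHG_SANDBOX_CONST)\n")
    RhgSandBox.module_eval { load lib }   # load while inside the module
    [Object.const_defined?(:RHG_SANDBOX_CONST, false),
     RhgSandBox.const_defined?(:RHG_SANDBOX_CONST, false)]
  end
end

p sandbox_demo   # prints [true, false]
```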

Highlights of this chapter

The mechanism here is mostly a matter of details, so it is a little difficult to summarise simply. That’s why we will work a little differently, and narrow the target down to 3 points:

Regarding the first point, you will understand it when you see it.

For the second point, the functions that appear in this chapter come from 4 different files: eval.c, ruby.c, file.c and dln.c. We’ll look at the reason they are spread over different places.

The third point is just what its name says. We will see how the currently popular technique of execution-time loading, more commonly referred to as plug-ins, works. This is the most important part of this chapter so I’d like to use as many pages as possible to talk about it.

Searching the library


The body of require is rb_f_require. First, we will only look at the part concerning the file search. Having many different cases is bothersome so we will limit ourselves to the case when no file extension is given.

rb_f_require() (simplified version)
VALUE
rb_f_require(obj, fname)
    VALUE obj, fname;
{
    VALUE feature, tmp;
    char *ext, *ftptr; /* OK */
    int state;
    volatile int safe = ruby_safe_level;

    SafeStringValue(fname);
    ext = strrchr(RSTRING(fname)->ptr, '.');
    if (ext) {
        /* ...if the file extension has been given... */
    }
    tmp = fname;
    switch (rb_find_file_ext(&tmp, loadable_ext)) {
      case 0:
        break;

      case 1:
        feature = fname = tmp;
        goto load_rb;

      default:
        feature = tmp;
        fname = rb_find_file(tmp);
        goto load_dyna;
    }
    if (rb_feature_p(RSTRING(fname)->ptr, Qfalse))
        return Qfalse;
    rb_raise(rb_eLoadError, "No such file to load -- %s",
             RSTRING(fname)->ptr);

  load_dyna:
    /* ...load an extension library... */
    return Qtrue;

  load_rb:
    /* ...load a Ruby program... */
    return Qtrue;
}

static const char *const loadable_ext[] = {
    ".rb", DLEXT,    /* DLEXT=".so", ".dll", ".bundle"... */
#ifdef DLEXT2
    DLEXT2,          /* DLEXT2=".dll" on Cygwin, MinGW */
#endif
    0
};


In this function the goto labels load_rb and load_dyna are actually like subroutines, and the two variables feature and fname are more or less their parameters. These variables have the following meaning.

variable   meaning                                         example
feature    the library file name that will be put in $"
fname      the full path to the library                    /usr/lib/ruby/1.7/uri.rb

The name feature can be found in the function rb_feature_p(). This function checks if a file has been locked (we will look at it just after).

The functions actually searching for the library are rb_find_file() and rb_find_file_ext(). rb_find_file() searches for a file in the load path $:. rb_find_file_ext() does the same, but the difference is that it takes as a second parameter a list of extensions (i.e. loadable_ext) and tries them in sequential order.
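The search with a list of extensions can be sketched in Ruby as follows. This is an illustration, not the actual implementation: it assumes that each extension is tried against the whole load path before the next extension is considered, ".so" stands in for DLEXT, and the method names and the exist: parameter are invented for the example. The return value imitates the one used in the switch statement above: 0 when nothing is found, 1 for a Ruby program, and a larger number for an extension library.

```ruby
# Extensions tried when none is given; ".so" stands in for DLEXT here.
LOADABLE_EXT = ['.rb', '.so'].freeze

# Returns the candidate paths in the order they would be probed, assuming
# every directory is tried with one extension before the next extension.
def candidate_paths(name, load_path, exts = LOADABLE_EXT)
  exts.flat_map do |ext|
    load_path.map { |dir| File.join(dir, name + ext) }
  end
end

# Returns [position of the matching extension (1-based), path] for the
# first existing candidate, or [0, nil] when nothing is found.
def find_file_ext(name, load_path, exts = LOADABLE_EXT, exist: File.method(:file?))
  exts.each_with_index do |ext, i|
    load_path.each do |dir|
      path = File.join(dir, name + ext)
      return [i + 1, path] if exist.call(path)
    end
  end
  [0, nil]
end

dirs = ['/usr/lib/ruby/1.7', '/usr/lib/ruby/1.7/i686-linux']
puts candidate_paths('nkf', dirs)
```

The exist: parameter only exists so that the search can be tried without touching the real file system.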

Below we will first look entirely at the file searching code, then we will look at the code of the require lock in load_rb.


The file search continues in rb_find_file(). This function searches for the file in the global load path $: (rb_load_path). The string contamination check is tiresome so we’ll only look at the main part.

rb_find_file() (simplified version)
VALUE
rb_find_file(path)
    VALUE path;
{
    VALUE tmp;
    char *f = RSTRING(path)->ptr;
    char *lpath;

    if (rb_load_path) {
        long i;

        Check_Type(rb_load_path, T_ARRAY);
        tmp = rb_ary_new();
        for (i=0;i<RARRAY(rb_load_path)->len;i++) {
            VALUE str = RARRAY(rb_load_path)->ptr[i];
            SafeStringValue(str);
            if (RSTRING(str)->len > 0) {
                rb_ary_push(tmp, str);
            }
        }
        tmp = rb_ary_join(tmp, rb_str_new2(PATH_SEP));
        if (RSTRING(tmp)->len == 0) {
            lpath = 0;
        }
        else {
            lpath = RSTRING(tmp)->ptr;
        }
    }

    f = dln_find_file(f, lpath);
    if (file_load_ok(f)) {
        return rb_str_new2(f);
    }
    return 0;
}


If we write what happens in Ruby we get the following:

tmp = []                     # make an array
$:.each do |path|            # repeat on each element of the load path
  tmp.push path if path.length > 0 # check the path and push it
end
lpath = tmp.join(PATH_SEP)   # concatenate all elements in one 
                             # string separated by PATH_SEP

dln_find_file(f, lpath)      # main processing

PATH_SEP is the path separator: ':' under UNIX, ';' under Windows. rb_ary_join() creates a string by putting this separator between the elements. In other words, the load path that had become an array is turned back into a string with a separator.

Why? Only because dln_find_file() takes the paths as a string with PATH_SEP as a separator. But why is dln_find_file() implemented like that? Simply because dln.c is not a library for ruby. Even though it was written by the same author, it is a general purpose library. It is precisely for this reason that, when I sorted the files by category in the Introduction, I put this file in the Utility category. General purpose libraries cannot receive Ruby objects as parameters or read ruby global variables.

dln_find_file() also expands for example ~ to the home directory, but in fact this is already done in the omitted part of rb_find_file(). So in ruby’s case it’s not necessary.
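To illustrate what a search looks like when the directories arrive as one separator-joined string, here is a minimal sketch in Ruby. It is not a port of dln_find_file(): the method names are invented, ~ expansion is left out, and File::PATH_SEPARATOR plays the role of PATH_SEP.

```ruby
require 'tmpdir'

# Searches for fname in the directories of lpath, a single string in which
# the directories are separated by the platform's path separator.
def find_in_path_string(fname, lpath, sep = File::PATH_SEPARATOR)
  lpath.split(sep).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, fname)
    return candidate if File.file?(candidate)
  end
  nil
end

# Builds the separator-joined string from an array, skipping empty entries.
def join_load_path(load_path, sep = File::PATH_SEPARATOR)
  load_path.reject(&:empty?).join(sep)
end

# Creates a throwaway file and checks that the search finds it.
def path_string_demo
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, 'rhg_found.rb'), '')
    lpath = join_load_path(['', File.join(dir, 'missing'), dir])
    find_in_path_string('rhg_found.rb', lpath) == File.join(dir, 'rhg_found.rb')
  end
end

p path_string_demo   # prints true
```

Because the function only receives strings, it could be called from a program that knows nothing about Ruby objects, which is the property the text attributes to dln.c.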

Loading wait

Here the file search is finished. Next comes the loading code. Or, more accurately, the code “up to just before the load”. The code of rb_f_require()’s load_rb is shown below.

  load_rb:
    if (rb_feature_p(RSTRING(feature)->ptr, Qtrue))
        return Qfalse;
    ruby_safe_level = 0;
    rb_provide_feature(feature);
    /* the loading of Ruby programs is serialised */
    if (!loading_tbl) {
        loading_tbl = st_init_strtable();
    }
    /* partial state */
    ftptr = ruby_strdup(RSTRING(feature)->ptr);
    st_insert(loading_tbl, ftptr, curr_thread);
    /* ...load the Ruby program and evaluate it... */
    st_delete(loading_tbl, &ftptr, 0); /* loading done */
    free(ftptr);
    ruby_safe_level = safe;


As mentioned above, rb_feature_p() checks whether a lock has been put in $". And rb_provide_feature() pushes a string onto $", in other words it locks the file.

The problem comes after that. As the comment says, “the loading of Ruby programs is serialised”. In other words, a file can only be loaded from one thread, and if another thread tries to load the same file during the loading, that thread waits for the first loading to finish. If this were not the case:

Thread.fork {
    require 'foo'   # At the beginning of require, foo.rb is 
                    # added to $" 
}                   # However the thread changes during the evaluation 
                    # of foo.rb
require 'foo'       # foo.rb is already in $" so the function returns 
                    # immediately
# (A) the classes of foo are used...

By doing something like this, even though the foo library is not really loaded, the code at (A) ends up being executed.

The mechanism to enter the waiting state is very simple. A st_table is created in the loading_tbl global variable, and a “feature=>loading thread” association is created. curr_thread is a variable from eval.c, and its value is the currently running thread. That makes an exclusive lock. And in rb_feature_p(), we wait for the loading thread to end like the following.

rb_feature_p() (second half)
rb_thread_t th;

while (st_lookup(loading_tbl, f, &th)) {
    if (th == curr_thread) {
        return Qtrue;
    }
    CHECK_INTS;
    rb_thread_schedule();
}


When rb_thread_schedule() is called, control is transferred to another thread, and the function only returns once control has come back to the thread that called it. When the file name disappears from loading_tbl, the loading is finished, so the function can end. The curr_thread check is there so that a thread does not wait on itself (figure 1).
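The waiting scheme can be imitated in Ruby. The sketch below is an analogy rather than a transcription: the names are invented, a Hash and an Array stand in for loading_tbl and $", Thread.pass plays the role of rb_thread_schedule(), and a Mutex is added because this sketch cannot rely on the interpreter's internal guarantees.

```ruby
LOADING_TBL = {}          # feature => thread that is loading it
PROVIDED    = []          # stand-in for $"
TBL_GUARD   = Mutex.new   # protects the two structures above

# Runs the block (the "loading") at most once per feature. A second
# caller waits until the first one has finished, then returns false.
def serialised_require(feature)
  claimed = TBL_GUARD.synchronize do
    next false if PROVIDED.include?(feature)
    PROVIDED << feature
    LOADING_TBL[feature] = Thread.current
    true
  end

  unless claimed
    loop do
      owner = TBL_GUARD.synchronize { LOADING_TBL[feature] }
      # Finished, or we are the loading thread ourselves: do not wait.
      break if owner.nil? || owner == Thread.current
      Thread.pass
    end
    return false
  end

  begin
    yield
  ensure
    TBL_GUARD.synchronize { LOADING_TBL.delete(feature) }
  end
  true
end

# One thread starts loading 'foo'; the main thread requires it meanwhile.
# Returns [result of the second require, whether 'foo' was ready by then].
def serialisation_demo
  PROVIDED.delete('foo')
  ready   = false
  started = Queue.new
  loader = Thread.new do
    serialised_require('foo') do
      started << true
      sleep 0.05          # pretend the evaluation takes a while
      ready = true
    end
  end
  started.pop             # the other thread is now in the middle of loading
  second = serialised_require('foo') { raise 'loaded twice' }
  seen_ready = ready      # observed right after the second require returned
  loader.join
  [second, seen_ready]
end

p serialisation_demo   # prints [false, true]
```

The second call returns false, as a repeated require does, but only after the first load has completed, so code corresponding to (A) can safely use what foo defines.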

Figure 1: Serialisation of loads

Loading of Ruby programs


We will now look at the loading process itself. Let’s start with the part of rb_f_require()’s load_rb that loads Ruby programs.

rb_f_require()-load_rb- loading
    PUSH_TAG(PROT_NONE);
    if ((state = EXEC_TAG()) == 0) {
        rb_load(fname, 0);
    }
    POP_TAG();


The rb_load() called here is in fact the actual implementation of the Ruby-level load.

rb_load() (simplified edition)
void
rb_load(fname, /* wrap=0 */)
    VALUE fname;
{
    int state;
    volatile ID last_func;
    volatile VALUE wrapper = 0;
    volatile VALUE self = ruby_top_self;
    NODE *saved_cref = ruby_cref;

    PUSH_VARS();
    PUSH_CLASS();
    ruby_class = rb_cObject;
    ruby_cref = top_cref;           /* (A-1) change CREF */
    wrapper = ruby_wrapper;
    ruby_wrapper = 0;
    PUSH_FRAME();
    ruby_frame->last_func = 0;
    ruby_frame->last_class = 0;
    ruby_frame->self = self;        /* (A-2) change ruby_frame->cbase */
    ruby_frame->cbase = (VALUE)rb_node_newnode(NODE_CREF,ruby_class,0,0);
    PUSH_SCOPE();
    /* at the top-level the visibility is private by default */
    SCOPE_SET(SCOPE_PRIVATE);
    PUSH_TAG(PROT_NONE);
    ruby_errinfo = Qnil;  /* make sure it's nil */
    state = EXEC_TAG();
    last_func = ruby_frame->last_func;
    if (state == 0) {
        NODE *node;

        /* (B) treated the same way as eval, for some reason */
        ruby_in_eval++;
        rb_load_file(RSTRING(fname)->ptr);
        ruby_in_eval--;
        node = ruby_eval_tree;
        if (ruby_nerrs == 0) {   /* no parse error occurred */
            eval_node(self, node);
        }
    }
    ruby_frame->last_func = last_func;
    POP_TAG();
    ruby_cref = saved_cref;
    POP_SCOPE();
    POP_FRAME();
    POP_CLASS();
    POP_VARS();
    ruby_wrapper = wrapper;
    if (ruby_nerrs > 0) {   /* a parse error occurred */
        ruby_nerrs = 0;
        rb_exc_raise(ruby_errinfo);
    }
    if (state) jump_tag_but_local_jump(state);
    if (!NIL_P(ruby_errinfo))   /* an exception was raised during */
        rb_exc_raise(ruby_errinfo); /* the loading */
}

At this point the source file suddenly changes to ruby.c, where rb_load_file() is defined.

void
rb_load_file(fname)
    char *fname;
{
    load_file(fname, 0);
}


load_file() (simplified edition)
static void
load_file(fname, /* script=0 */)
    char *fname;
{
    VALUE f;
    {
        FILE *fp = fopen(fname, "r");   /* (A) */
        if (fp == NULL) {
            rb_load_fail(fname);
        }
        fclose(fp);
    }
    f = rb_file_open(fname, "r");       /* (B) */
    rb_compile_file(fname, f, 1);       /* (C) */
    rb_io_close(f);
}

(A) In practice, the call to fopen() only checks whether the file can be opened. If there is no problem, it is immediately closed. It may seem a little wasteful, but it is an extremely simple and yet highly portable and reliable way to do it.

(B) The file is opened once again, this time using the Ruby-level library. The Ruby-level library was not used from the beginning so that no Ruby exception is raised if the file cannot be opened. If any exception occurs here we would like it to be a loading error, but getting the errors related to open, for example Errno::ENOENT or Errno::EACCES, would be problematic. We are in ruby.c, so we cannot stop a tag jump.

(C) Using the parser interface rb_compile_file(), the program is read from an IO object and compiled into a syntax tree. The syntax tree is stored in ruby_eval_tree, so there is no need to take the return value.
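The same sequence of check, open, compile and evaluate can be imitated at the Ruby level. This is a rough analogy under several assumptions: sketch_load is an invented name, RubyVM::InstructionSequence is specific to current versions of CRuby and produces bytecode rather than the syntax tree discussed here, and the probe in step (A) rescues an exception where the C code simply tests the result of fopen().

```ruby
require 'tmpdir'

# Sketch of the load pipeline: check, open, compile, evaluate.
def sketch_load(path)
  # (A) Probe with a plain open so that a failure becomes a LoadError
  #     instead of an Errno::* exception.
  begin
    File.open(path, 'r') { }
  rescue SystemCallError
    raise LoadError, "No such file to load -- #{path}"
  end
  # (B) Open the file for real and read the program text.
  source = File.read(path)
  # (C) Compile it (CRuby only).
  iseq = RubyVM::InstructionSequence.compile(source, path)
  # Evaluate the compiled program at the top level.
  iseq.eval
end

# Loads a one-line program, then tries a missing file.
def pipeline_demo
  Dir.mktmpdir do |dir|
    prog = File.join(dir, 'rhg_prog.rb')
    File.write(prog, "6 * 7\n")
    value = sketch_load(prog)
    missing = begin
      sketch_load(File.join(dir, 'absent.rb'))
    rescue LoadError => e
      e.class
    end
    [value, missing]
  end
end

p pipeline_demo   # prints [42, LoadError]
```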

That’s all for the loading code. The calls ended up quite deep, so let’s look at the call graph of rb_f_require() below.

rb_f_require           ....eval.c
    rb_find_file            ....file.c
        dln_find_file           ....dln.c
    rb_load
        rb_load_file            ....ruby.c
            load_file
                rb_compile_file     ....parse.y
        eval_node

We have seen many call graphs by now, so reading them should be second nature.

The number of open required for loading

As we have seen, open is sometimes used just to check whether a file can be opened, and during the loading process other functions such as rb_find_file_ext() also perform checks using open. How many times is open() called in the whole process?

Well, as my main environment is Linux, I looked using strace. The output goes to stderr, so it was redirected using 2>&1.

% strace ruby -e 'require "rational"' 2>&1 | grep '^open'
open("/etc/", O_RDONLY)    = -1 ENOENT
open("/etc/", O_RDONLY)      = 3
open("/usr/lib/", O_RDONLY) = 3
open("/lib/", O_RDONLY)       = 3
open("/lib/", O_RDONLY)    = 3
open("/lib/", O_RDONLY)        = 3
open("/usr/lib/ruby/1.7/rational.rb", O_RDONLY|O_LARGEFILE) = 3
open("/usr/lib/ruby/1.7/rational.rb", O_RDONLY|O_LARGEFILE) = 3
open("/usr/lib/ruby/1.7/rational.rb", O_RDONLY|O_LARGEFILE) = 3
open("/usr/lib/ruby/1.7/rational.rb", O_RDONLY|O_LARGEFILE) = 3

Loading of extension libraries


This time we will see the loading of extension libraries. We will start with rb_f_require()’s load_dyna. However, we do not need the part about locking anymore so it was removed.

{
    int volatile old_vmode = scope_vmode;

    PUSH_TAG(PROT_NONE);
    if ((state = EXEC_TAG()) == 0) {
        void *handle;

        SCOPE_SET(SCOPE_PUBLIC);
        handle = dln_load(RSTRING(fname)->ptr);
        rb_ary_push(ruby_dln_librefs, LONG2NUM((long)handle));
    }
    POP_TAG();
    SCOPE_SET(old_vmode);
}
if (state) JUMP_TAG(state);


Needless to say, I think you already know how to compile a C program. As I am using gcc on Linux, I can build a runnable program in the following manner.

% gcc hello.c

Judging from the file name, this is probably a “Hello, World!” program. On UNIX, gcc by default writes the program to a file called a.out, so it can then be run as follows.

% ./a.out
Hello, World!

It was built properly.

Linking means gathering multiple object files, checking that the set of names they provide contains every name in the set of names they need, and connecting them to each other. In other words, if we draw a line from each “needed name”, every line must end at a “provided name” of some object file (Figure 2). In technical terms, this is called resolving undefined symbols.

Figure 2: Object file and link

Really dynamic link

That is the end of the explanation. What remains is how to actually perform dynamic loading. This is not difficult either: the system usually provides a dedicated API, so all we have to do is call it.

For example, an API called dlopen is relatively widespread on UNIX. However, that does not mean it is always available on UNIX. For example, HP-UX used to have a completely different interface, and Mac OS X uses a NeXT-style API. And even for the same dlopen, it is part of libc on BSD-derived systems but is attached externally as libdl on Linux, and so on: there is no portability at all. If things differ this much among systems that all claim to be UNIX, it is natural that other operating systems are completely different. Using one and the same API everywhere is simply impossible.

So what does ruby do? To absorb these completely different interfaces, it provides a file called dln.c. dln is probably short for “dynamic link”. dln_load() is one of the functions in dln.c.

Although the APIs are completely different, the way they are used follows the same three steps:

  1. Map the library into the address space of the process
  2. Get pointers to the functions contained in the library
  3. Unmap the library

With the dlopen family of APIs, these correspond to:

  1. dlopen
  2. dlsym
  3. dlclose

With the Win32 API, they correspond to:

  1. LoadLibrary (or LoadLibraryEx)
  2. GetProcAddress
  3. FreeLibrary
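The three steps can also be tried from Ruby through the Fiddle library, which wraps these platform APIs. The sketch below assumes a platform that provides dlopen and a ruby built with Fiddle. Looking up strlen in the running process is only an example chosen because that function is always present; it has nothing to do with how ruby loads its own extension libraries.

```ruby
require 'fiddle'

# Performs the three steps and returns the result of calling the C
# function that was looked up.
def three_steps_demo
  handle  = Fiddle.dlopen(nil)      # 1. map: nil opens the running process itself
  address = handle['strlen']        # 2. get a function pointer (dlsym)
  strlen  = Fiddle::Function.new(address,
                                 [Fiddle::TYPE_VOIDP],
                                 Fiddle::TYPE_SIZE_T)
  length  = strlen.call('Hello, World!')
  handle.close                      # 3. unmap (dlclose)
  length
end

p three_steps_demo   # prints 13
```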

Finally, we have arrived at the content of dln_load(). dln_load() is also a long function, but for certain reasons its structure is simple. Let’s look at the outline first.

dln_load() (outline)
dln_load(file)
    const char *file;
{
#if defined _WIN32 && !defined __CYGWIN__
    /* load with the Win32 API */
#else
    /* platform-independent initialisation */
#ifdef (each platform)
    /* ...routines for each platform... */
#endif
#endif
#if !defined(_AIX) && !defined(NeXT)
    rb_loaderror("%s - %s", error, file);
#endif
    return 0;                   /* dummy return */
}

First, let’s start with the code for the dlopen family of APIs.

void*
dln_load(file)
    const char *file;
{
    const char *error = 0;
#define DLN_ERROR() (error = dln_strerror(),\
                strcpy(ALLOCA_N(char, strlen(error) + 1), error))
    char *buf;
    /* write the string "Init_xxxx" into buf */
    /* (the memory is allocated with alloca) */
    init_funcname(&buf, file);

    {
        void *handle;
        void (*init_fct)();

#ifndef RTLD_LAZY
# define RTLD_LAZY 1
#endif
#ifndef RTLD_GLOBAL
# define RTLD_GLOBAL 0
#endif

        /* (A) load the library */
        if ((handle=(void*)dlopen(file, RTLD_LAZY | RTLD_GLOBAL))
                                                     == NULL) {
            error = dln_strerror();
            goto failed;
        }
        /* (B) get the pointer to Init_xxxx() */
        init_fct = (void(*)())dlsym(handle, buf);
        if (init_fct == NULL) {
            error = DLN_ERROR();
            dlclose(handle);
            goto failed;
        }
        /* (C) call Init_xxxx() */
        (*init_fct)();

        return handle;
    }

  failed:
    rb_loaderror("%s - %s", error, file);
}


(B) dlsym() gets a function pointer from the library designated by the handle. If the return value is NULL, the call has failed. Here we get the pointer to Init_xxxx() and then call it.
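As an illustration of what the string written into buf might look like, here is a small sketch. It rests on an assumption that is not spelled out in this section: that the xxxx part is the base name of the library file up to its first dot. The Ruby method below only borrows the name init_funcname for readability and is not the C function.

```ruby
# Derives the name of the initialisation function from a library path,
# assuming the convention "Init_" + base name up to the first dot.
def init_funcname(path)
  base = File.basename(path)
  "Init_" + base[/\A[^.]*/]
end

p init_funcname('/usr/lib/ruby/1.7/i686-linux/nkf.so')   # prints "Init_nkf"
```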

On Win32, LoadLibrary() and GetProcAddress() are used. These are very common Win32 APIs that are documented on MSDN.

void*
dln_load(file)
    const char *file;
{

    HINSTANCE handle;
    char winfile[MAXPATHLEN];
    void (*init_fct)();
    char *buf;

    if (strlen(file) >= MAXPATHLEN) rb_loaderror("filename too long");

    /* write the string "Init_xxxx" into buf */
    /* (the memory is allocated with alloca) */
    init_funcname(&buf, file);

    strcpy(winfile, file);

    /* load the library */
    if ((handle = LoadLibrary(winfile)) == NULL) {
        error = dln_strerror();
        goto failed;
    }

    if ((init_fct = (void(*)())GetProcAddress(handle, buf)) == NULL) {
        rb_loaderror("%s - %s\n%s", dln_strerror(), buf, file);
    }

    /* call Init_xxxx() */
    (*init_fct)();
    return handle;

  failed:
    rb_loaderror("%s - %s", error, file);
}


