Dynamic semantics of the JVMD - JVM

The speci_cation of dynamic loading and linking in the dynamic VM proceeds in three steps. First, we re_ne the run-time state. Second, we de_ne loading of references. Third, we adapt the rules of the previous JVM models.

Re_nement of the run-time state

Class loader instances are references. We use a universe Ld|a subset of the universe of references|for class loader instances. The system class loader sysLd is a distinguished element of this subset.

Since each class loader has its own name space, a single JVM program can load multiple classes which have the same name. If classes with the same name have been loaded into di_erent name spaces, the identity of the class loader that de_ned the class is needed to uniquely identify the class. As a consequence, we re_ne the notion of classes (and thereby of their _elds and methods) by adding a loader component to their names. We re_ne the universe Class to include the loader information. type Class = (Ld; Name)

The de_ning class loader (which incorporates the class into the JVM environment) is used as the loader component in all entries where the loader information is needed in a class _le. For example, if class C is de_ned by loader L and C has syntactically a super class D, then the class _le of C has an entry (L;D) for its super class. The loader L will be the initiating loader for D. Sometimes we write also DL instead of (L;D). The same applies to all other entries in the class _le. Later, we change the loader component of these entries if the de_ning loader di_ers from the initiating loader. From now on, we work only with classes which have a loader component and a class name.

State. Each class loader has its own name space. Elements of this name space are the classes the class loader has loaded. The dynamic function ldEnv keeps track of the loaded class object, which has been loaded under the given class name by the corresponding (initiating or de_ning) class loader instance.

The name of the loaded class object (classOf) is always Class. A dynamic function cOf stores for a class instance the real name of the loaded class, i.e., the class name together with its de_ning loader.

Initial values of dynamic functions needed for dynamic class loading are given at the end of this section. The following example illustrates loading of a class by di_erent loaders.

Example. Let n be a class name, let L1 and L2 be initiating loaders for n, let L be the de_ning loader for n. Let r be the reference of the corresponding class object. Then, the dynamic functions ldEnv, cOf , and classOf have the following values:

Since we have class names with the initiating loader and de_ning loader as the loader component, we have to transform a class name to a name with

Ordering of class states

Ordering of class states

the de_ning loader as the loader component. As can be seen in the above example, for a class C we get the class which contains the de_ning loader, instead of the initiating loader, as its loader component by applying the following function liftClass:
liftClass(c) = cOf (ldEnv(c))

The class state of a class could be Referenced, Linked, Initialized, and Unusable. In the dynamic machine we introduce two new states Loaded and SupersLoaded.

The state Loaded means that the class is loaded. If a class has this state, then we know nothing about the class state of the super classes and of the referenced classes.

The state SupersLoaded is used if all super classes are loaded and the class state of all super classes is at least SupersLoaded. A class is in state Referenced if all super classes have at least state Referenced and all referenced classes have at least state SupersLoaded. Fig. below shows the ordering of the di_erent class states.

Environment. When a class _le is loaded, the class environment changes. The function cEnv: Class ! ClassFile becomes dynamic. An external function load is used as a source for new class _les on the local _le system: load: (ClassPath; Class) ! Content

Types. Changing the class environment changes all environment based derived functions, in particular all typing conditions. Since already loaded and veri_ed classes should not become invalid, the class environment is only allowed to increase.

Type compatibility has to take di_erent name spaces into account. Using the extension of a class to a class name with its de_ning loader, we de_ne

Re_nement of the switch machine

Re_nement of the switch machine

as mentioned above two classes to be equal i_they have the same name and the same de_ning loader. This guarantees that type spoo_ng problems as detected by Saraswat do not occur in our model.

Loading references

In the dynamic machine a class can be in a class state less than Referenced. When such a class has to be initialized, then we _rst have to reference the class to reference all its super classes and to load all referenced classes. This is done by extending the switch machine switchVME to switchVMD where a class will be referenced if the state is less than Referenced (either Loaded or SupersLoaded).

If the class Object has to be referenced, then we set the class state immediately to Referenced, because Object1 has no super class. If a class c di_erent from Object has to be referenced, then we distinguish between the class states SupersLoaded and Loaded. If the state is SupersLoaded|meaning that at least all super classes have state SupersLoaded|we reference each super class which has not already been referenced. The invoked submachine referenceClass terminates its recursive calls because the inheritance hierarchy is _nite. If there is no unreferenced super class, we load all references of the class c, as de_ned by the submachine loadReferences. If the state of c is Loaded, then we _rst have to load the super classes of c, which is done by the machine loadSuperClasses described below.

This algorithm of referencing guarantees that the class state of a class becomes referenced only if all super classes are at least referenced and all referenced classes are at least in state SupersLoaded. Hence, this ensures that the class hierarchy is loaded completely with respect to all types which may occur during execution of the class. The following example illustrates why the class state Loaded is not su_cient for the referenced classes in our diligent machine.

Example. Let I be an interface. Let A be a class which implements I and let A be the super class of B. In the following code fragment the classes B and I are referenced, but A (the super class of B) is needed during veri_cation to determine whether B is compatible to I .

Fig. below speci_es loading of super classes and references. The two machines loadSuperClasses and loadReferences in that _gure are based on the scheme loadClasses. The scheme gets as its _rst argument a set of classes to load and the second argument is a machine which is executed if all these classes have at least state SupersLoaded. If there is any class which has not been loaded, then we load the class using the machine callLoad. This machine updates the switch function to call the method <cload> of the loader ld (the loader component of the class) and with the class name as argument.

The method <cload>(String) is not a Java API method; it is a wrapper in our machine to call the

loadClass method. The code of the method is as follows:

The method <cload> is similar to the method <clinit> which is also called only by the virtual machine. When the <cload> method returns, it has to continue at the same instruction it was invoked from. Therefore we extend the de_nition of implicitCall which is used in switchVMC .

implicitCall (m)=methNm(m) 2 f "<clinit>";"<cload>" g

Loading super classes and references

Loading super classes and references

Since we do not know and therefore cannot specify the possibly user de_ned body for the loadClass method, our model can only reect the result of its execution, namely by specifying what happens when a <cload> method returns.

If it returns normally, it has extended the loader environment. Therefore, the dynamic switch machine switchVMD sets|when the method returns|the dynamic function ldEnv for the current loader instance (in register 0) and the name of the class to load (in register 1) with the reference of the created class object (returned in res(0)). Note that the function stringOf returns the string content of the corresponding reference.

The question arises whether we really need the wrapper function. Why not simply call loadClass directly? We do not know anything about user de_ned loaders. A loader may delegate the loading by invoking the loadClass method of another loader. This loader may delegate again. However, the chain of delegation is of no interest and we are only interested in the _rst (initiating) loader and the last (de_ning)loader.

When a class is de_ned by a loader L, each class entry in the class _le contains L as the loader component. The loader L is the initiating loader for all these references. At the time when the reference is loaded by L, we can replace the loader component by Ld using the not furthermore speci_ed submachine setDe_ningLoaders, if Ld is the de_ning loader for the reference. This is what the machines in Fig. below are doing, as we are now going to explain in more detail.

The machine loadSuperClasses in Fig. above loads all super classes of the class c. If the class state of all super classes is at least SupersLoaded, then the machine setSupersLoaded sets the class state to SupersLoaded and the not furthermore speci_ed machine setDe_ningLoadersForSupers replaces the loader component for the super class entries (super

class and implemented interfaces) in the class _le with the de_ning loader. All other references are untouched, because at this time they are not necessarily loaded and the de_ning loader might be unknown.

Indirect references. An indirect reference is a class which appears only in the context of another reference. More precisely, classes in argument types of methods, and classes in return types of methods and _elds are indirect references. The other references are called direct. Let us consider the following instruction in a class D: GetStatic(A;C=getA(B))

The class C is a direct reference which has to be loaded by the same loader as D. The classes in the types A and B have to be handled in the same way as in the class C and are called indirect references. Thus we load the classes in A and B with the de_ning loader of C and not with the loader of D. This is the reason why we have to load the direct references before we load the indirect references. Otherwise we do not know the loader to load the indirect references. Note that the machine setDe_ningLoaders takes also care of the di_erence between direct and indirect references when substituting the loader component.

Loading the references is done by loadReferences which _rst loads all direct references. If all direct references are loaded and the class state for all these is at least SupersLoaded, then the indirect references are loaded in the same way. Finally the class state is set to Referenced and the not furthermore speci_ed machine setDe_ningLoaders substitutes all loader components in the class _le with the de_ning loader.

Example (; CD). The following example illustrates why we distinguish between direct and indirect references. Assume we have the following classes (for readability we extend class names by their

There is an error in the Java program, because super.getA() in the method n returns the class ALA and this class has no method n. The bytecode of the method n in class AL is of special interest:

There seems to be no error in the bytecode. The class B has a method getA which returns A and A has a method n. The problem is that B=getA returns ALA and this class is not compatible to AL.

If we load the references as above and take care of direct and indirect references, then our veri_er has to verify the following annotated program:

Now, the veri_er detects that ALA is not compatible to AL and rejects the program at the InvokeVirtual instruction.

Remark. The veri_er in JDK 1.3 from Sun accepts the bytecode for the method n in class AL and throws a LinkageError before calling the method B=getA(), although all classes have been already loaded and initialized.

The trustful JVMD

JVMD extends JVMN by the native methods for loading and resolving classes and by the native method newInstance which creates an instance for a given class object.

The trustfulVM for JVMD is the one for JVMN, extended with the new execVMD rule appearing in Fig. below and the new switch machine switchVMD appearing

Trustful execution of JVMD instructions

Trustful execution of JVMD instructions

Rules for the _nal methods of class ClassLoader. the new rules of the dynamic VM for the _nal methods of class ClassLoader.

The method findLoadedClass is required to return the class object for a class previously loaded by the called class loader. If the class is not found,a null reference is returned.

The method findSystemClass attempts to _nd the named class by using the internal class loader sysLd. If the system class is not found, a ClassNotFoundException is thrown (by loadClass). Otherwise the class is referenced and linked. Finally, the method returns the loaded class. Loading and linking is speci_ed

The defineClass method tests whether the class name (argument 1) is already in the name space of the loader (argument 0). In such a case the exception ClassFormatError is thrown3. Otherwise the class|speci_ed by the array (arguments 2, 3, and 4)|is de_ned and the machine de_neClass returns the newly created class object.

The resolveClass method references and links the class speci_ed by the reference of the class object. It throws a NullPointerException if the reference is null .

Note that there is no need to explicitly call resolveClass, because the virtual machine references and links the class|if not already done|before the class will be initialized.

The details on loading, de_ning, and linking are speci_ed in Fig. below. The loadClass machine tests whether the speci_ed class exists on the local _le system and either de_nes that class or throws a ClassNotFoundException.

The de_neClass machine takes three arguments. The _rst argument is the binary representation of the class name (the second argument) to be de_ned.

If the last argument is True, the new created class is returned by de_neClass. If the class name coincides with the expected class name, a new class object

Execution of final class loader methods

Execution of final class loader methods

is created and the corresponding dynamic functions are initialized properly. Otherwise a ClassFormatError is thrown.

The linking machine linkClass ensures that the class and its super classes are prepared. Note that the diligent VM has its own machine for linking, but in the trustful VM the class states Referenced and Linked di_er only in preparation.

Rules for the _nal methods of class Class. If a program uses class loaders to load and de_ne new classes, then the loadClass method returns the class object of the loaded class. New instances of the loaded classes can be created

Loading and linking machines

Loading and linking machines

The newInstance method has the same semantics as if in Java a new instance creation expression with an empty argument list is called. The method newInstance is called on the reference r where r is the class object. For such a given reference, the dynamic function cOf returns the real class name.

We model newInstance as a native method where we update the dynamic function meth to cOf (r )=<newInstance>() with the e_ect, that this method will be executed in the next step. Without loss of generality we assume that each loaded class C contains the following code for that method:

We update the dynamic function meth instead of setting switch to call <newInstance>, because otherwise we would create a new frame for the method <newInstance>. After returning from the method <newInstance> the native method newInstance would call <newInstance> again and again, because the state after returning from <newInstance> would be the same as before invoking <newInstance>.

Initialization. Dynamic loading works correctly under the assumption that the classes Object, ClassLoader, Class, and String are pre-linked and that an instance of the class ClassLoader, namely sysLd, is created. Furthermore, there has to be a class object for each loaded class.

Assume that initially the dynamic function heap is de_ned only for sysLd and heap(sysLd) is an instance of ClassLoader. Additionally, the class state for the classes Object, Class, ClassLoader, and String is Linked and cEnv is de_ned for these classes. Then the initial class objects and the initial values for the corresponding dynamic functions can be obtained by applying the following rule:

The dynamic VM starts execution by interpreting the instructions of the internal method <entrypoint>. The method <entrypoint>, which calls the main method of the class to execute, is de_ned in class Object.

The initial attempt to call main of class Main discovers that the class state of Main is loaded. The virtual machine tries to initialize the class and transitively its super classes using sysLd (the system loader) as the current loader. Afterwards, the class and transitively its super classes are initialized, that is veri_ed, prepared, linked, and initialized in the strict sense. Now, the method main can be called.

The defensive JVMD

The defensive JVMD extends the defensive JVMN by the native methods for dynamic loading.

The diligent JVMD

The diligent JVMD extends the diligent JVMN by the new check function checkD and the trustful machine trustfulVMD. diligentVMD = diligentScheme(verifyVM ; trustfulVMD) where verifyVM = verifySchemeN (checkD).

All rights reserved © 2020 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

JVM Topics