Dynamic semantics of the JVMC - JVM

JVMC extends JVMI by instructions to read and write class _elds, to invoke and to return from class methods and to initialize classes. Methods can either return nothing (i.e., they return void) or they return real values like integers.

We de_ne in execVMC a rule for each of the new instructions. We also introduce a submachine switchVMC which takes care of the context switch upon calling or returning from a method. This separation of method call and return from the execution of instructions in the method body is extended in below to separate the instruction execution proper from checking concerns in the re_nement of the trustful to a defensive machine and to the diligent machine coming with a bytecode veri_er.

As long as we only consider statically linked programs, classes are denoted by their class name. Field and method references are pairs of class and _eld or method names.

type MSig = (Meth;Type_)

The abstract universes Field and Meth stand for _eld and method identi_ers. Type denotes the types supported by the JVM. The JVM supports overloading. Fields can have the same identi_er, as long as the classes they are declared in are di_erent. Methods within the same class can also be overloaded, as long as the argument types di_er.

We use selector functions classNm, _eldNm, methSig, methNm and argTypes and overload selectors, as long as their meaning is not ambiguous.

For instance classNm is used to select the class name of a _eld reference as well as of a method reference. Similarly we use methNm to get the method identi_er of a method reference. We write tuples of _eld and method references in the form c=f and c=m, where c; f and m are elements of the universes Class; Field and MSig, respectively. Likewise we write method signatures as m(ts), where m and ts denote a method identi_er and a type sequence, respectively.

Environment. JVMC programs are executed with respect to a static environment cEnv, which is a mapping of classes and interfaces to class _les. (The following description does not distinguish interfaces and classes, unless explicitly stated.) Each class _le provides for every class its name, its kind (whether it is a class or an interface), its superclass (if there is any, otherwise super is unde_ned), the set of the interfaces the class implements, a table for _elds, and a table for methods. In JVMC every member of a class _le is static; this is extended in JVMO, which also supports instances.

Class _les do not include de_nitions for _elds or methods provided by any superclass, unless the de_nitions are overwritten.

Field declarations have a type and a set of modi_ers. The type of a method is its result type, which can be void. If a method is implemented in the class, the function code de_nes a nonempty sequence of instructions, otherwise the sequence is empty. An implemented method can include an exception table. Methods can have constraints on the maximum depth of their operand stacks and the number of registers they use (maxOpd and maxReg). This information is only exploited in the secure JVM.

We assume that for any class _le selector there is a derived function having the same name, that suppresses the class environment cEnv and abbreviates the data path to select the corresponding component. For instance, the overloaded function super : Class ! Class returns the direct superclass of the speci_ed class, that is super(c) = super(cEnv(c)). Another example is the derived function code(c=m) that returns the code of the given method, that is code(c=m) = code(methods(cEnv(c))(m)).

State. JVMC adds procedural abstraction in the form of class methods. Due to their presence we have not only a single JVMI frame (pc; reg; opd) but many of them of type

type Frame = (Pc; Map(RegNo; Word); Word_; Class=MSig)

which are stored in a stack. Note that the frame information is enriched by a fourth component, that always describes the method to which the frame belongs. The dynamic function stack holds the stack of active method incarnations.

The dynamic function meth holds the currently active method.

where Main is a not furthermore speci_ed class (in ASM terminology: a static 0-ary function) given by the invoker of the application.

Method invocation and return as well as implicit class initialization (see below) change the current frame (the context). In order to obtain a uniform treatment of checking and verifying of methods, we explicitly separate here method transfer from the execution of instructions within method bodies. For this purpose we introduce a new universe

that comes together with a dynamic function switch characterizing the context switch the machine is currently performing. The machine either performs no switch (Noswitch{the initial value) or it calls (Call) or returns (Result) from a method, or it initializes (InitClass) a class.

The JVM uses symbolic _eld and method references to support binary compatibility, see also. As a consequence, the calculation of _eld o_sets and of method o_sets is implementation dependent. Therefore, we keep the class _eld access as abstract as in Java and de_ne the storage function for class _elds by the same abstract function globals which initially holds the default value for each static _eld. The environment based function staticFields(c) returns the static _elds of a class c, the function defaultVal (f) returns null for _elds holding references, and 0 otherwise. (Depending on the type of the _eld di_erent representations of 0 are chosen.)

Before a class can be used it must be loaded, linked, and its class initializers must be executed. Loading and linking is introduced later. At the JVM level class initializers appear as class methods with the special name <clinit>. In JVMC a class can either be linked or initialized, similarly to the Java model which also keeps track of the initialization state of a class so that we can repeat the de_nition here:

data ClassState = Linked j Initialized

The JVM also comes with a dynamic function classState, which records the current class state. Initially all classes, except the root class Object, are Linked. The class Object is initially initialized.

Trustful execution of JVMC instructions

Trustful execution of JVMC instructions

A class is initialized, if its class state is Initialized. Formally:

initialized(c) = (classState(c) = Initialized)

Rules. Figure above de_nes trustfulSchemeC . The scheme can be read as a macro with two parameters execVM and switchVM . For JVMC we instantiate the parameters with execVMC and switchVMC .

trustfulVMC = trustfulSchemeC (execVMC ; switchVMC )

As long as a context switch is requested, switchVMC rules _re. Otherwise execVMC rules are executed.

We _rst explain the new execVMC rules for the case that the referenced class of the _eld or method is already initialized. A GetStatic instruction loads, on top of the operand stack, the value (one or two words) which is

Trustful switch machine

Trustful switch machine

stored under the _eld in the global environment. A PutStatic instruction stores the top (two) word(s) of the operand stack in the global environment at the given _eld. An InvokeStatic instruction pops the arguments from the operand stack and triggers a context switch. The switchVMC machine in Figure above then stores the current frame in the frame stack and places the arguments of the invoked method in the registers of the new frame, and execution continues at the _rst instruction of the new method. When execVMC interprets Return, it takes the required number of words (0, 1 or 2) from the top of the operand stack and triggers a context switch. Upon returning from a class initializer (implicitCall ), no return value is passed and switchVMC makes the execution continue with the frame and the instruction which triggered the class initialization; otherwise switchVMC discards the topmost frame, pushes the returned words onto the invoker's operand stack and increments the invoker's pc. We therefore de_ne (see the re_nement for class loading method <cload>):

implicitCall (m) = methNm(m) = "<clinit>"

Pushing and popping frames, as required by the switchVMC , are de_ned by the following rules:

Compilation of JavaC expressions/statements

Compilation of JavaC expressions/statements

The function makeRegs converts an argument sequence to a _nite mapping:

makeRegs(args) = f(l ; v) j (l ; v) 2 zip([0::length(args)1]; args)g

Initialization starts, that is a class initialization method is implicitly called by triggering a context switch to InitClass(c), when the class referred to in a Get-, Put- or InvokeStatic instruction is not initialized. Before a class is initialized, its superclass has to be initialized, if there is any. Interfaces can be initialized at this time, although this is not speci_ed in the Java language reference manual. This strategy is reected by the InitClass(c) rule in switchVMC. It pushes <clinit> frames on the stack and at the same time records that the class is initialized. The machine switchVMC executes this rule until in the inheritance hierarchy either the class Object or an initialized superclass is reached. Since the class state is set to initialized as soon as a frame is pushed, the _niteness of the inheritance hierarchy implies that this recursive initialization always terminates.

All rights reserved © 2020 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

JVM Topics