Library Program

Instructions and structures representing programs. @author Patryk Czarnik

Require Import Relations.
Require Import Common.
Require Import ArithOps.
Require Import ValuesNTypes.
Require Import ProgramStructures.
Require Import List.

Bytecode program

Module of this type provides structures representing bytecode programs.
Module Type PROGRAM.
  Open Local Scope type_scope.

Included modules

  Declare Module M_ProgramStructures : PROGRAM_STRUCTURES.
  Module M_ProgramStructuresFacts := ProgramStructuresFacts M_ProgramStructures.

  Import M_ProgramStructures.M_ValuesAndTypes.
  Import M_ProgramStructures.
  Import M_ProgramStructuresFacts.

Instructions

Frame-level instructions

Generic stack operation. They operate regardless the type and value of stack contents. However, the space occupied by values (cat1 vs cat2 values) make a difference.
  Inductive TGenericOp: Set :=
  | Op_nop
  | Op_dup
  | Op_dup_x1
  | Op_dup_x2
  | Op_dup2
  | Op_dup2_x1
  | Op_dup2_x2
  | Op_pop
  | Op_pop2
  | Op_swap.

Subsequent stack instructions. Most of them are still super-instructions repersenting classes of JVM instructions, e.g. all binary arithemtic operations are represented here as Op_binop.
  Inductive TStackInstr: Set :=
  
Generic stack operation.
  | SI_Generic (op: TGenericOp)
  
Constant insertion operation.
  | SI_Const (k: TKind) (v: TValue)
  
Unary arithemtic operation. In fact, only one operator neg is handled here.
  | SI_Unop (k:TKind) (unop: ArithmeticOperators.TUnOp)
  
Binary arithemtic operation.
  | SI_Binop (k:TKind) (binop: ArithmeticOperators.TBinOp)
  
Arithmetic cast operation.
  | SI_Cast (k:TKind) (castop: ArithmeticOperators.TCastOp)
  
Comparison returning integer.
Instructions making use of local variables array.
  Inductive TVarInstr: Set :=
    
Load value from a variable onto stack.
  | VI_Load (k: TVarIKind) (var: TVar)
    
Store value from stack in a variable.
  | VI_Store (k: TVarIKind) (var: TVar)
    
Increment value of a variable.
  | VI_Inc (var: TVar) (c: M_Numeric_Int.t).
Instructions that may affect control flow within a frame – conditional and unconditional jumps.
  Inductive TCondInstr: Set :=
    
Jump unconditionally.
  | CI_Goto (off: TOffset)
    
Compare to zero.
  | CI_If (k:TKind) (cmp: ArithmeticOperators.TCmpOp) (off: TOffset)
    
Compare two values.
  | CI_Cmp (k:TKind) (cmp: ArithmeticOperators.TCmpOp) (off: TOffset)
    
tableswitch instruction. The high value can be computed as low + (length table) - 1.
  | CI_Tableswitch (low: M_Numeric_Int.t) (default: TOffset) (table: list TOffset)
    
lookupswitch instruction.
  | CI_Lookupswitch (default: TOffset) (table: list (M_Numeric_Int.t × TOffset)).

Instructions that operate within the top most frame of a thread.
  Inductive TFrameInstr: Set :=
    
Stack-only operation.
  | FI_Stackop (si: TStackInstr)
    
Local variable operation.
  | FI_Var (vi: TVarInstr)
    
Conditional and unconditional jumps.
  | FI_Cond (ci: TCondInstr)

    
Jump to a subroutine. This is a different case than FI-Cond, because the current PC value is used in the result operand stack. Knowledge of PC is not required in FI_Cond semantics.
  | FI_Jsr (off: TOffset)

    
Return from a subroutine. This is a separate case of frame instruction, because it makes use of local variables array and at the same time it changes PC.
  | FI_Ret (var: TVar).

Heap access operations

Specific operations requiring reading of the heap.
  Inductive TGetOp: Set :=
  | Op_getfield (cn: TClassName) (field: TShortName)
  | Op_getstatic (cn: TClassName) (field: TShortName)
  | Op_getarray (k: TArrayIKind)
  | Op_array_length
  | Op_checkcast (rt: TReferenceType)
  | Op_instanceof (rt: TReferenceType).

Specific operations requiring writing the heap.
  Inductive TPutOp: Set :=
  | Op_putfield (cn: TClassName) (field: TShortName)
  | Op_putstatic (cn: TClassName) (field: TShortName)
  | Op_putarray (k: TArrayIKind).

Specific operations requiring writing the heap with allocation of new locations.
  Inductive TNewOp: Set :=
  | Op_newobject (cn: TClassName)
  | Op_newarray (bt: TBaseType)
  | Op_newarray_ref (rt: TReferenceType)
  | Op_newmultiarray (rt: TReferenceType) (dimensions: nat).

Monitor operations.
  Inductive TMonitorOp: Set :=
  | Op_Unlock
  | Op_Lock.

Heap instructions: reading, writing, or creating new locations.
  Inductive THeapInstr: Set :=
  | HI_Get (get_op: TGetOp)
  | HI_Put (put_op: TPutOp)
  | HI_New (new_op: TNewOp)
  | HI_Monitor (mon_op: TMonitorOp).

Other operations and ulimate definition of CoJaq instruction

Invocation modes for invoke_XXX instructions.
  Inductive TInvocationMode: Set :=
  | InvokeStatic
  | InvokeVirtual
  | InvokeSpecial
  | InvokeInterface.

The root of instructions hierarchy.
  Inductive TInstruction: Set :=
  
In-frame instructions
  | I_Frame (finstr: TFrameInstr)
  
Heap access instructions
  | I_Heap (hinstr: THeapInstr)
  
Invocation of a method
  | I_Invoke (mode: TInvocationMode) (cn: TClassName) (mn: TSignature)
  
Return from a method
  | I_Return (k: option TKind)
  
Throwing an exception
  | I_Throw.


Program and its components

Subsequent program components in bottom-up order.
Design decisions:
  • Program components are declared and specified by axioms here. It allows for different implementations of the module.
  • Program and its components, cosidered here as tangible objects, are accessible with deterministic functions rather than relations. Partial functions have option result type.
  Inductive TVisibility: Set :=
  | Private
  | Protected
  | Package
  | Public.

Methods

A block of code --- method body.
  Parameter TCode: Set.

Gets the instruction that is pointed by the given PC.
  Parameter getInstruction: TCodeTPCoption TInstruction.

Returns the next PC after the given one within the given program. The returned PC may not be a valid one (i.e. getInstruction may be equal to None
  Parameter next: TCodeTPCTPC.

instr is pointed by pc in code.
  Definition currentInstruction code pc instr : Prop :=
    getInstruction code pc = Some instr.

Complete JVM method.
  Parameter TMethod: Set.

Returns signature of the given method.
  Parameter getMethodSignature: TMethodTSignature.

Returns declared visibility level of the given method.
Is the method static?
  Parameter isMethodStatic: TMethodbool.

Is the method abstract?
  Parameter isMethodAbstract: TMethodbool.

Is the method synchonized?
  Parameter isMethodSynchronized: TMethodbool.

Is the method an instance initialisation method? ("constructor")
  Parameter isConstructor: TMethodbool.

Is the method a class initialisation method? ("static block")
  Parameter isClassInit: TMethodbool.

Returns code of the given method. Some methods may have no body (abstract and native ones); in such case None is returned.
  Parameter getMethodCode: TMethodoption TCode.

Abstract method has no body.
Returns the pc pointing to the first instruction to be executed in a method.
  Parameter getMethodBeginning: TMethodTPC.

Exception handler declared within a method.
  Parameter THandler: Set.

All exception handlers declared for a method.
  Parameter methodExnHandlers: TMethodlist THandler.

Is the given PC within the scope of the given exception handler?
  Parameter handlerInScope: THandlerTPCbool.

The name of the class of exceptions caught by the given exception handler.
  Parameter handlerType: THandlerTClassName.

Beginning of handler block.
  Parameter handlerTarget: THandlerTPC.

Fields

A field in a class.
  Parameter TField : Set.

Returns the declared type of a field.
  Parameter getFieldType : TFieldTType.

Returns the name of a field.
  Parameter getFieldName : TFieldTShortName.

Returns the declared visibility level of a field.
  Parameter getFieldVisibility : TFieldTVisibility.

States whether a field is declared final.
  Parameter isFieldFinal : TFieldbool.

States whether a field is static.
  Parameter isFieldStatic : TFieldbool.

Classes and inheritance

A class.
  Parameter TClass: Set.

Returns the qualified name of the class.
  Parameter getClassName: TClassTClassName.

Is class abstract? True for abstract classes and interfaces.
  Parameter isAbstractClass: TClassbool.

Is this an interface in fact?
  Parameter isInterfaceClass: TClassbool.

States if the ACC_SUPER flag is set in this class.
  Parameter hasSuperFlag: TClassbool.

Returns a method from a class by the signature of the method.
  Parameter getMethod: TSignatureTClassoption TMethod.


Returns the field from the class and its superclasses.
  Parameter getField: TShortNameTClassoption TField.

Returns the list of dynamic fields from the class and its superclasses.
  Parameter getDynamicFields: TClasslist TField.

Returns the list of dynamic fields with initial values from the class and its superclasses.
  Parameter getEmptyObject: TClasslist (TShortName × TValue).

The directs superclass of a class. For Object it is None.
  Parameter parentClass: TClassoption TClass.

Returns a list of all direct (explicitly declared) interfaces implemented by this class or a list of direct superinterfaces if this class is in fact an interface.
  Parameter getClassInterfaces: TClasslist TClass.
The Object class.
  Parameter class_Object: TClass.

The Throwable class.
  Parameter class_Throwable: TClass.

Object is the root of class hierarchy --- it does not have a parent.
Any class different from Object has a parent class.
  Axiom PparentClass: c, parentClass c = Nonec = class_Object.

  Axiom PparentClassDifferent: c1 c2, parentClass c1 = Some c2c1 c2.

isParentClass c1 c2 holds iff c1 is the direct parent class of c2.
  Definition isParentClass (c1: TClass) (c2: TClass): Prop :=
    parentClass c2 = Some c1.

isAncestorClass c1 c2 holds iff c1 is an acestor of c2 in the class hierarchy.
  Definition isAncestorClass: TClassTClassProp :=
    clos_trans TClass isParentClass.

isAncestorClassOrSelf c1 c2 holds iff c1 is equal to c2 or c1 is an acestor of c2 in the class hierarchy.
  Definition isAncestorClassOrSelf: TClassTClassProp :=
    clos_refl_trans TClass isParentClass.

  Definition isDirectSuperTypeClass (c1 c2: TClass) :=
       parentClass c2 = Some c1
     In c1 (getClassInterfaces c2).

  Definition isSuperTypeClass: TClassTClassProp :=
    clos_trans TClass isDirectSuperTypeClass.

  Definition isSuperTypeClassOrSelf: TClassTClassProp :=
    clos_refl_trans TClass isDirectSuperTypeClass.

The class hierarchy is well-founded.

Program

A whole JVM program in "post-link view".
  Parameter TProgram: Set.

Returns a class for the given name.
  Parameter getClass: TProgramTClassNameoption TClass.

Returns a method for the given qualified name.
  Definition getMethodFromProgram (p: TProgram) (qmn: TQualifiedMethodName): option TMethod :=
  match qmn with (cn, mn)
    match getClass p cn with
    | NoneNone
    | Some c
      match getMethod mn c with
      | NoneNone
      | Some mSome m
      end
    end
  end.

Returns code block of a method for the given qualified name.
  Parameter getMethodBodyFromProgram: TProgramTQualifiedMethodNameoption TCode.

  Axiom PgetMethodBodyFromProgram: p c m cn msig code,
    getMethodBodyFromProgram p (cn, msig) = Some code
     (getClass p cn = Some c
      getMethod msig c = Some m
      getMethodCode m = Some code).

Subtype relation

Subtype relation for reference types, as defined for instanceof and checkcast instructions.
  Fixpoint subtypeRef (p: TProgram) (rt1: TReferenceType) (rt2: TReferenceType) {struct rt1}: Prop :=
  match rt1, rt2 with
  | RTObject cn1, RTObject cn2
    match getClass p cn1, getClass p cn2 with
    | None, NoneFalse
    | None, Some _False
    | Some _, NoneFalse
    | Some c1, Some c2
      isInterfaceClass c1 = true c2 = class_Object
       isSuperTypeClassOrSelf c2 c1
    end
  | RTObject cn1, RTArray ct2False
  | RTArray ct1, RTObject cn2
    In cn2 (StandardNames.Object :: StandardNames.arrayInterfaces)
  | RTArray ct1, RTArray ct2
    match ct1, ct2 with
    | BaseType bt1, BaseType bt2
      bt1 = bt2
    | BaseType _, ReferenceType _False
    | ReferenceType _, BaseType _False
    | ReferenceType crt1, ReferenceType crt2
       subtypeRef p crt1 crt2
    end
  end.

  Inductive validHandlerInd (p: TProgram) (pc: TPC) (exn_cn: TClassName): list THandleroption THandlerProp :=
  | ValidHandler_Nil: validHandlerInd p pc exn_cn nil None

  | ValidHandler_Found: handler tl caught_class thrown_class,
    getClass p (handlerType handler) = Some caught_class
    getClass p exn_cn = Some thrown_class
    isAncestorClassOrSelf caught_class thrown_class
    handlerInScope handler pc = true
    validHandlerInd p pc exn_cn (cons handler tl) (Some handler)

  | ValidHandler_Cons: bad_handler valid_handler tl caught_class thrown_class,
    getClass p (handlerType bad_handler) = Some caught_class
    getClass p exn_cn = Some thrown_class
    ( ¬ isAncestorClassOrSelf caught_class thrown_class
      ¬ handlerInScope bad_handler pc = true) →
    validHandlerInd p pc exn_cn tl (Some valid_handler) →
    validHandlerInd p pc exn_cn (cons bad_handler tl) (Some valid_handler).

Points the valid exception handler for a given actual class of thrown exception.
  Definition validExceptionHandler (p: TProgram) (m: TMethod) (pc: TPC) (exn_cn: TClassName) (maybe_handler: option THandler): Prop :=
    validHandlerInd p pc exn_cn (methodExnHandlers m) maybe_handler.

End PROGRAM.

Program with constructors

Extends the above signature with so called constructors enabling to create concrete methods, programs and structures. This constructors set is designed to be used for writing example programs in Coq. It can be used as a target for translation of real JVM programs, but for the latter purpose a separate set of constructors could be more convenient...
Module Type PROGRAM_W_CONSTRUCTORS.

Included and required modules

  Include PROGRAM.
  Require Import List.
  Import M_ProgramStructures.M_ValuesAndTypes.
  Import M_ProgramStructures.

  Open Local Scope type_scope.

PC and offsets

A decision is made here for the meaning of PCs and offsets.
  • an "offset" is here represented with a natural number equal to the position of the target instruction.
Creates a PC basing on a natural number --- instruction position.
  Parameter pcFromPosition: natTPC.

Returns the position represented by the given PC.
  Parameter pcToPosition: TPCnat.

  Axiom PpcPosition: pc,
    pcFromPosition (pcToPosition pc) = pc.

  Axiom PpositionPc: n,
    pcToPosition (pcFromPosition n) = n.

Creates an offset basing on a natural number being the target of a jump.
  Parameter offsetFromPosition: natTOffset.

The offset always leads to the given position (the starting point of a jump is not meaningful!).
  Axiom PmakeOffset: (n:nat) (pc:TPC),
    pcToPosition (jump (offsetFromPosition n) pc) = n.

  Axiom PmakeOffset_pc: (n:nat) (pc:TPC),
    jump (offsetFromPosition n) pc = pcFromPosition n.

Creates a code block containing the given instructions in the given order.
  Parameter codeFromList: list TInstructionTCode.

PC relates directly to the position of instruction on the list.
  Axiom PgetInstructionIffNth: linstr pc,
    getInstruction (codeFromList linstr) pc =
    nth_error linstr (pcToPosition pc).

Methods

Makes a method with the given signature (name and parameters) and body.
  Parameter makeMethod: TSignatureTCodeTMethod.

makeMethod sets the signature, which is then returned by getSignature.
makeMethod sets the method code, which is then returned by getCode.
Methods created from a list begins at position 0.
For methods created from a list, next moves to the next position.
  Axiom PcodeFromList_next: pc linstr,
    pcToPosition (next (codeFromList linstr) pc) = S (pcToPosition pc).

For methods created from a list, next moves to the next position.
  Axiom PcodeFromList_next_pc: n linstr,
    next (codeFromList linstr) (pcFromPosition n) = pcFromPosition (S n).

Classes and programs

TODO: fields, require unique names
Makes a class with the given name, parent and methods.
  Parameter makeClass: TClassNamelist TMethodTClass.

makeClass sets the class name, which is then returned by getClassName.
  Axiom PmakeClass_getClassName: cn methods, getClassName (makeClass cn methods) = cn.

If a method identified by its signature existed in the methods list, then, and only then, getMethod obtains that method from class created as makeClass name methods.
  Axiom PgetMethodIffIn: cn methods m,
    In m methods getMethod (getMethodSignature m) (makeClass cn methods) = Some m.

Creates a program from a set of classes.
  Parameter makeProgram: list TClassTProgram.

If a class identified by its name existed in the classes list, then, and only then, getClass obtains that class from program created as makeProgram classes.