Metadata API

Common Language Runtime

This document specifies the API for emitting and importing metadata for the Common Language Runtime (CLR). This API is unmanaged and intended for use by compilers and loaders – low-level tools that require fast access to metadata with a minimum of assistance for traversing relationships (such as the class hierarchy) or for manipulating collections (such as members on a class).

Browsers and other tools that need a higher-level API may instead use the managed Reflection interfaces, specified separately.

This is preliminary documentation and subject to change.

Last revised: 08 September 2000


1 Overview of the Metadata API

1.1 Metadata APIs

1.2 Metadata Abstractions

1.3 Using the APIs and Metadata Tokens

1.3.1 The Compile/Link Style of Interaction

1.3.2 The RAD Tool Style of Interaction

1.3.3 IMapToken

1.3.4 IMetaDataError

1.4 Related Specifications

1.5 Coding Conventions

1.5.1 Handling String Parameters

1.5.2 Optional Return Parameters

1.5.3 Storing Default Values

1.5.4 Null Pointers for Return Parameters

1.5.5 “Ignore This Argument”

1.5.6 Error Returns

2 IMetadataDispenserEx

2.1 DefineScope

2.2 OpenScope

2.3 OpenScopeOnMemory

2.4 SetOption

2.5 GetOption

3 IMetaDataEmit

3.1 Defining, Saving, and Merging Metadata

3.1.1 SetModuleProps

3.1.2 Save

3.1.3 SaveToStream

3.1.4 SaveToMemory

3.1.5 GetSaveSize

3.1.6 MergeEx

3.1.7 MergeEndEx

3.1.8 SetHandler

3.2 Custom Attributes and Custom Values

3.2.1 Using Custom Attributes

3.2.2 Using Custom Values

3.2.3 DefineCustomAttribute

3.2.4 SetCustomAttributeValue

3.3 Building Type Definitions

3.3.1 DefineTypeDef

3.3.2 SetTypeDefProps

3.4 Declaring and Defining Members

3.4.1 DefineMethod

3.4.2 SetMethodProps

3.4.3 DefineField

3.4.4 SetFieldProps

3.4.5 DefineNestedType

3.4.6 DefineParam

3.4.7 SetParamProps

3.4.8 DefineMethodImpl

3.4.9 SetRVA

3.4.10 SetFieldRVA

3.4.11 DefinePinvokeMap

3.4.12 SetPinvokeMap

3.4.13 SetFieldMarshal

3.5 Building Type and Member References

3.5.1 DefineTypeRefByName

3.5.2 DefineImportType

3.5.3 DefineMemberRef

3.5.4 DefineImportMember

3.5.5 DefineModuleRef

3.5.6 SetParent

3.6 Declaring Events and Properties

3.6.1 DefineProperty

3.6.2 SetPropertyProps

3.6.3 DefineEvent

3.6.4 SetEventProps

3.7 Specifying Layout Information for a Class

3.7.1 SetClassLayout

3.8 Miscellaneous

3.8.1 GetTokenFromSig

3.8.2 GetTokenFromTypeSpec

3.8.3 DefineUserString

3.8.4 DeleteToken

3.9 Order of Emission

4 MetaDataImport

4.1 Enumerating Collections

4.1.1 CloseEnum Method

4.1.2 CountEnum Method

4.1.3 ResetEnum

4.1.4 IsValidToken

4.1.5 EnumTypeDefs

4.1.6 EnumInterfaceImpls

4.1.7 EnumMembers

4.1.8 EnumMembersWithName

4.1.9 EnumMethods

4.1.10 EnumMethodsWithName

4.1.11 EnumUnresolvedMethods

4.1.12 EnumMethodSemantics

4.1.13 EnumFields

4.1.14 EnumFieldsWithName

4.1.15 EnumParams

4.1.16 EnumMethodImpls

4.1.17 EnumProperties

4.1.18 EnumEvents

4.1.19 EnumTypeRefs

4.1.20 EnumMemberRefs

4.1.21 EnumModuleRefs

4.1.22 EnumCustomAttributes

4.1.23 EnumSignatures

4.1.24 EnumTypeSpecs

4.1.25 EnumUserStrings

4.2 Finding a Specific Item in Metadata

4.2.1 FindTypeDefByName

4.2.2 FindMember

4.2.3 FindMethod

4.2.4 FindField

4.2.5 FindMemberRef

4.2.6 FindTypeRef

4.3 Obtaining Properties of a Specified Object

4.3.1 GetScopeProps

4.3.2 GetModuleFromScope

4.3.3 GetTypeDefProps

4.3.4 GetNestedClassProps

4.3.5 GetInterfaceImplProps

4.3.6 GetCustomAttributeProps

4.3.7 GetCustomAttributeByName

4.3.8 GetMemberProps

4.3.9 GetMethodProps

4.3.10 GetFieldProps

4.3.11 GetParamProps

4.3.12 GetParamForMethodIndex

4.3.13 GetPinvokeMap

4.3.14 GetFieldMarshal

4.3.15 GetRVA

4.3.16 GetTypeRefProps

4.3.17 GetMemberRefProps

4.3.18 GetModuleRefProps

4.3.19 GetPropertyProps

4.3.20 GetEventProps

4.3.21 GetMethodSemantics

4.3.22 GetClassLayout

4.3.23 GetSigFromToken

4.3.24 GetTypeSpecFromToken

4.3.25 GetUserString

4.3.26 GetNameFromToken

4.3.27 ResolveTypeRef

5 IMetaDataTables

6 MethodImpls

6.1 Intro

6.2 Details

6.3 ReNaming Recommendations

6.4 Notes

7 NestedTypes

7.1 Introduction

7.2 Definition

7.3 Supported Features

7.4 Visibility, Subclassing, and Member Access

7.5 Naming

7.6 Naked Instances

7.7 C++ “Member Classes”

7.8 C++ “Friends”

7.9 Example - Simple

7.10 Example – Less Simple

8 Distinguished Custom Attributes

8.1 Pseudo Custom Attributes (PCAs)

8.2 CAs that affect Runtime

9 Bitmasks

9.1 Token Types [CorTokenType]

9.2 Scope Open Flags [CorOpenFlags]

9.3 Options for Size Calculation [CorSaveSize]

9.4 Flags for Types [CorTypeAttr]

9.5 Flags for Fields [CorFieldAttr]

9.6 Flags for Methods [CorMethodAttr]

9.7 Flags for Method Parameters [CorParamAttr]

9.8 Flags for Properties [CorPropertyAttr]

9.9 Flags for Events [CorEventAttr]

9.10 Flags for MethodSemantics [CorMethodSemanticsAttr]

9.11 Flags for Method Implementations [CorMethodImpl]

9.12 Flags for Security [CorDeclSecurity]

9.13 Struct for Field Offsets [COR_FIELD_OFFSET]

9.14 Typedef for Signatures [PCOR_SIGNATURE]

9.15 Flags for PInvoke Interop [CorPinvokeMap]

9.16 SetOptions: Duplicate Checking [CorCheckDuplicatesFor]

9.17 SetOptions: Ref-to-Def Optimizations [CorRefToDefCheck]

9.18 SetOptions: Token Remap Notification [CorNotificationForTokenMovement]

9.19 SetOptions: Edit & Continue [CorSetENC]

9.20 SetOptions: Out-of-Order Errors [CorErrorIfEmitOutOfOrder]

9.21 SetOptions: Hide Deleted Tokens [CorImportOptions]

9.22 Flags for Assemblies [CorAssemblyFlags]

9.23 Flags for Assembly Reference [CorAssemblyRefFlags]

9.24 Flags for Manifest Resources [CorManifestResourceFlags]

9.25 Flags for Files [CorFileFlags]

9.26 Element Types in the runtime [CorElementType]

9.27 Calling Conventions [CorCallingConvention]

9.28 Unmanaged Calling Conventions [CorUnmanagedCallingConvention]

9.29 Argument Types [CorArgType]

9.30 Native Types [CorNativeType]

10 Signatures

10.1 MethodDefSig

10.2 MethodRefSig

10.3 StandAloneMethodSig

10.4 FieldSig

10.5 PropertySig

10.6 LocalVarSig

10.7 CustomMod

10.8 TypeDefEncoded and TypeRefEncoded

10.9 Constraint

10.10 Param

10.11 RetType

10.12 Type

10.12.1 Intrinsic

10.12.2 ARRAY Type ArrayShape

10.12.3 SZARRAY CustomMod* Type

10.13 ArrayShape

10.14 Short Form Signatures

11 Custom Attributes

11.1 Using Custom Attributes

11.2 Persisted Format of an Attribute-Object

11.3 Prolog

11.4 Constructor Arguments

11.5 Constructor Arguments – Example 1

11.6 Constructor Arguments – Example 2

11.7 Constructor Arguments – Example 3

11.8 Named Fields and Properties

11.9 Named Field – Example

11.10 General Case

11.11 SERIALIZATION_TYPE_ enum

12 CustomAttributes – Syntax

13 Marshalling Descriptor

14 Metadata Specific to PInvoke

14.1 Overview of PInvoke Metadata

14.2 PInvoke Metadata for Methods

14.3 DefineMethod for PInvoke

14.4 DefineMethodImpl for PInvoke

14.5 DefinePinvokeMap for PInvoke

14.6 SetPinvokeMap for PInvoke

14.7 Method Signatures for PInvoke

14.8 PInvoke Metadata for Function Parameters

14.9 DefineParam for PInvoke

14.10 SetParamProps for PInvoke

14.11 PInvoke Metadata for Struct Arguments

14.12 DefineTypeDef for PInvoke

14.13 DefineField for PInvoke

14.14 SetClassLayout for PInvoke (Sequential)

14.15 SetClassLayout for PInvoke (Explicit)

14.16 PInvoke Metadata for Explicit Marshalling

14.17 SetFieldMarshal for PInvoke

14.18 PInvoke Custom Attributes

1 Overview of the Metadata API

This document defines a set of APIs for emitting and importing metadata. It explains what metadata is, and how it is used. It describes all of the data structures that are passed through this API: bitmasks, signatures, custom attributes and marshalling specifiers.

Metadata is used to describe, on the one hand, runtime types (classes, interfaces and valuetypes), fields and methods, and, on the other hand, internal implementation and layout information that is used by the runtime to JIT-compile IL, load classes, execute code, and interoperate with the COM classic or native world. This information is included with every CLR component, and is available to the runtime, tools, and services.

Compilers and tools emit metadata by calling the emit APIs during compilation and link or, with RAD tools, as a part of building components or applications. The APIs write to and read from in-memory data structures. At save time, these in-memory structures are compressed and persisted in binary format into the target compilation unit (.obj file), executable file, or stand-alone metadata binary file. When multiple compilation units are linked to form an .EXE or .DLL, the emit APIs provide a method for merging the metadata sections from each compilation unit into a single integrated metadata binary.

The loader and other runtime tools and services import metadata to obtain information about components so that tasks such as loading and activation can be completed.

All manipulation of metadata is performed through the metadata APIs, insulating tools from the underlying data structures and enabling a pluggable persistence format architecture that allows runtime binary representations, COM classic type libraries, and other formats to be imported into or from memory transparently.

To learn more about the Runtime file format in general, of which the metadata binary is a part, see the “PE File Format Extensions” spec. For a description of the Runtime type model, refer to the “Virtual Object System” spec. To learn more about interoperability with COM, refer to the “COM Integration” spec. To learn more about interoperability with native platform APIs, refer to the “Platform Invoke Metadata Guide”. To learn more about Assemblies, and their metadata APIs, see the “Assembly Metadata API” spec.

In order to emit and import metadata at the low level described in this spec, you need to know two things:

· Each method, its arguments and return type – the API. That’s what this document describes.

· Any data structures you must supply as arguments. There are four: bitmasks, signatures, custom attributes and marshalling descriptors. This information is gathered together into the companion spec, “Metadata Structures”.

1.1 Metadata APIs

At any time you might have several distinct areas of in-memory metadata. For example, you may have one area that maps all of the metadata from an existing module, held in a file on disk. At the same time, you may be emitting metadata into a distinct area, which you will afterwards save as a module into a new on-disk file. (We use the word “module” to mean a file that contains metadata; typically it will be an .OBJ, .EXE or .DLL file that also contains MSIL code, but it can also be a file containing only metadata.)

We call each separate area of metadata a scope. Each scope corresponds to a module. Usually that module has been saved, or will be saved, to an on-disk file. But there’s no need to do so: scripting tools frequently generate in-memory metadata that is never persisted into a file. We use the term scope because it represents the scope within which metadata tokens are defined. That’s to say, a metadata token with value N completely identifies an in-memory structure (for example, holding details of a class definition) within a given scope. But that same value N may correspond to a completely different in-memory structure for a different scope.
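The scope-relative nature of tokens can be illustrated with a toy model. The sketch below is not the metadata engine: `Scope`, `Define`, and `Lookup` are hypothetical names introduced only for illustration, and each scope holds a single table of type names. It shows that the same token value N can identify different structures in different scopes.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a metadata token; the real token layout is
// described later in this document.
using Token = std::uint32_t;

// Toy model of one metadata scope: a single table of type names.
// Row numbering starts at 1, so row 1 is stored at typeNames[0].
struct Scope {
    std::vector<std::string> typeNames;

    // Adds a row and returns the token (here simply the row number).
    Token Define(const std::string& name) {
        typeNames.push_back(name);
        return static_cast<Token>(typeNames.size());
    }

    // Resolves a token, but only within this scope.
    const std::string& Lookup(Token tk) const {
        assert(tk >= 1 && tk <= typeNames.size());
        return typeNames[tk - 1];
    }
};
```

Defining one type in each of two `Scope` objects yields the same token value from both, yet `Lookup` returns a different name depending on which scope is asked. A token is therefore only meaningful together with the scope that issued it.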

To establish an in-memory metadata scope, use CoCreateInstance to obtain an IMetaDataDispenserEx, then call DefineScope to create a new scope, or OpenScope or OpenScopeOnMemory to open an existing set of metadata data structures from a file or memory location. With each Define or Open, the caller specifies which API to receive. The emit API, used to write to a metadata scope, is IMetaDataEmit. The import API, which allows tools to read from a metadata scope, is IMetaDataImport.

The metadata APIs described in this specification allow a component's metadata to be accessed without the class being loaded by the runtime. The primary design goals for this API include maximizing performance and minimizing overhead – the metadata engine stops just short of providing direct access to the in-memory data structures. On the other hand, when a class is loaded at runtime, the loader imports the metadata into its own data structures, which can be browsed via the Runtime Reflection services, documented as a separate specification. The Reflection services do much more work for the client than the metadata APIs do, such as automatically walking the inheritance hierarchy to obtain information about inherited methods and fields; the metadata APIs return only the direct member declarations for a given class and expect the API client to make additional calls to walk the hierarchy and enumerate inherited methods. The former approach exposes a higher-level view of metadata, where the latter approach puts the API client in complete control of walking the data structures.
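The division of labour described above, where the API returns only direct member declarations and the client walks the hierarchy, can be sketched as follows. This is a simplified in-memory model, not the import interface: `TypeInfo`, `baseIndex`, and `AllMethods` are hypothetical names, and base types are identified by a plain array index rather than a metadata token.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified view of what a low-level client learns about
// one type: its name, its base type, and only its directly declared methods.
struct TypeInfo {
    std::string name;
    int baseIndex;                     // index of the base type, or -1 if none
    std::vector<std::string> methods;  // methods declared directly on this type
};

// Collects declared plus inherited method names. The walk up the base
// links is done here, by the client, one type at a time.
std::vector<std::string> AllMethods(const std::vector<TypeInfo>& types, int index) {
    std::vector<std::string> result;
    while (index >= 0) {
        assert(static_cast<std::size_t>(index) < types.size());
        const TypeInfo& t = types[static_cast<std::size_t>(index)];
        result.insert(result.end(), t.methods.begin(), t.methods.end());
        index = t.baseIndex;
    }
    return result;
}
```

A higher-level service such as Reflection would perform the equivalent of `AllMethods` on the caller's behalf; with the metadata APIs, the loop is the caller's responsibility.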

Consistent with the primary design goals, the metadata APIs perform a minimum of semantic error checking. These methods assume that the tools and services that emit metadata are enforcing the object system rules outlined in the common type system and that any additional checking on the part of the metadata engine during development time is superfluous. Specific comments about what checks are being performed accompany the specification of each method in this document.

1.2 Metadata Abstractions

Metadata stores declarative information about runtime types (classes, value types, and interfaces), global functions, and global variables. Each such abstraction in a given metadata scope carries an identity as an mdToken (metadata token), which the metadata engine uses to index into a specific metadata table in that scope. The metadata APIs return a token from each Define method, and it is this token that, when passed into the appropriate Get method, is used to obtain its associated attributes. Note that an mdToken is not an immutable metadata object identifier: when two scopes are merged, tokens from the import scope are remapped into tokens in the emit scope. Likewise, when a metadata scope is saved, various format optimizations can result in token remaps. Managing tokens is discussed further in the next section.
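To see why a token cannot be treated as a permanent identifier, consider a deliberately naive model of a merge. The sketch below is an assumption made for illustration and does not describe how the metadata engine merges scopes: `MergeRows` and `Rid` are hypothetical names, and each table is just a list of names with rows numbered from 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical row identifier; row numbering starts at 1.
using Rid = std::uint32_t;

// Appends every row of importRows to emitRows and returns a map from
// each row's old number (in the import table) to its new number
// (in the emit table).
std::map<Rid, Rid> MergeRows(std::vector<std::string>& emitRows,
                             const std::vector<std::string>& importRows) {
    std::map<Rid, Rid> remap;
    for (std::size_t i = 0; i < importRows.size(); ++i) {
        emitRows.push_back(importRows[i]);
        remap[static_cast<Rid>(i + 1)] = static_cast<Rid>(emitRows.size());
    }
    return remap;
}
```

If the emit table already holds two rows, row 1 of the import table becomes row 3 after the merge. Any value the caller saved before the merge must be translated through the returned map before it is used against the emit scope.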

To be more concrete: a metadata token is a 4-byte value. The most-significant byte specifies what type of token this is. For example, a value of 2 means it’s a TypeDef token, whilst a value of 4 means it’s a FieldDef token. (For the full list, with their values, see the CorTokenType enumeration in CorHdr.h.) The lower 3 bytes give the index of the row, within a metadata table, that the token refers to. We call those lower 3 bytes the RID, or Record IDentifier. So, for example, the metadata token with value 0x02000007 is a ‘shorthand’ way to refer to row number 7 in the TypeDef table, in the current scope. Similarly, token 0x0400001A refers to row number 26 (decimal) in the FieldDef table in the current scope. We never store anything in row zero of a metadata table, so a metadata token whose RID is zero is called a “nil” token. The metadata API defines a host of such nil tokens – one for each token type (for example, mdTypeDefNil, with value 0x02000000).

[The above explanation of RIDs is conceptually correct – however, in reality, the physical layout of data is much more complicated. Moreover, string tokens mdString are slightly different: their lower 3 bytes are not a record identifier, but an offset to their start location in the metadata string pool]
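The conceptual token layout described above (type in the most-significant byte, RID in the lower 3 bytes) can be sketched with a few bit operations. The helper names below (`TokenType`, `TokenRid`, `MakeToken`, `IsNil`) and the constant names are hypothetical and chosen only for this illustration. The sketch uses the FieldDef example from the text and does not model the string-token exception noted above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the 4-byte metadata token.
using Token = std::uint32_t;

constexpr Token kTypeMask = 0xFF000000u;  // most-significant byte: token type
constexpr Token kRidMask  = 0x00FFFFFFu;  // lower 3 bytes: row index (RID)

// Token type quoted in the text: a top byte of 4 means FieldDef.
constexpr Token kFieldDefType = 0x04000000u;

// Extracts the token type (still positioned in the top byte).
constexpr Token TokenType(Token tk) { return tk & kTypeMask; }

// Extracts the RID, the row number within the table.
constexpr Token TokenRid(Token tk) { return tk & kRidMask; }

// A token whose RID is zero is a "nil" token, since row zero is never used.
constexpr bool IsNil(Token tk) { return TokenRid(tk) == 0; }

// Builds a token from a type value and a RID; the RID must fit in 3 bytes.
inline Token MakeToken(Token type, Token rid) {
    assert((rid & ~kRidMask) == 0 && "RID must fit in the lower 3 bytes");
    return type | rid;
}

// The worked example from the text: 0x0400001A is row 26 of the FieldDef table.
static_assert(TokenType(0x0400001Au) == kFieldDefType, "type byte is 0x04");
static_assert(TokenRid(0x0400001Au) == 26u, "RID is 26 decimal");
static_assert(IsNil(kFieldDefType), "RID of zero means nil");
```

Keeping the type in place in the top byte, rather than shifting it down, means the result of `TokenType` can be compared directly against a constant such as `kFieldDefType`.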

The following abstractions and corresponding mdToken types will be encountered in the metadata APIs. More details on these abstractions are provided in the externalization section of the common type system and, to some extent, with the appropriate Define method in this API specification.

· Module (mdModule): The metadata in a given scope describes a compilation unit, executable, or other development-, deployment-, or run-time unit, referred to in this documentation generally as a module. It is possible, although not required, to declare a name, GUID identifier, custom attributes, and so on for the module as a whole.

· Module references (mdModuleRef): Compile-time references to modules, recording the source for type and member imports.

· Type declarations (mdTypeDef): Declarations of runtime reference types (classes and interfaces) and of value types.

· Type references (mdTypeRef): References to runtime reference types and value types, such as may occur when declaring variables as runtime reference or value types or in declaring inheritance or implementation hierarchies. In a very real sense, the collection of type references in a module is the collection of compile-time import dependencies.