Name MESA_shader_integer_functions Name Strings GL_MESA_shader_integer_functions Contact Ian Romanick Contributors All the contributors of GL_ARB_gpu_shader5 Status Supported by all GLSL 1.30 capable drivers in Mesa 12.1 and later Version Version 3, March 31, 2017 Number OpenGL Extension #495 Dependencies This extension is written against the OpenGL 3.2 (Compatibility Profile) Specification. This extension is written against Version 1.50 (Revision 09) of the OpenGL Shading Language Specification. GLSL 1.30 (OpenGL) or GLSL ES 3.00 (OpenGL ES) is required. This extension interacts with ARB_gpu_shader5. This extension interacts with ARB_gpu_shader_fp64. This extension interacts with NV_gpu_shader5. Overview GL_ARB_gpu_shader5 extends GLSL in a number of useful ways. Much of this added functionality requires significant hardware support. There are many aspects, however, that can be easily implemented on any GPU with "real" integer support (as opposed to simulating integers using floating point calculations). This extension provides a set of new features to the OpenGL Shading Language to support capabilities of these GPUs, extending the capabilities of version 1.30 of the OpenGL Shading Language and version 3.00 of the OpenGL ES Shading Language. Shaders using the new functionality provided by this extension should enable this functionality via the construct #extension GL_MESA_shader_integer_functions : require (or enable) This extension provides a variety of new features for all shader types, including: * support for implicitly converting signed integer types to unsigned types, as well as more general implicit conversion and function overloading infrastructure to support new data types introduced by other extensions; * new built-in functions supporting: * splitting a floating-point number into a significand and exponent (frexp), or building a floating-point number from a significand and exponent (ldexp); * integer bitfield manipulation, including functions to find the position of the most or least significant set bit, count the number of one bits, and bitfield insertion, extraction, and reversal; * extended integer precision math, including add with carry, subtract with borrow, and extenended multiplication; The resulting extension is a strict subset of GL_ARB_gpu_shader5. IP Status No known IP claims. New Procedures and Functions None New Tokens None Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification (OpenGL Operation) None. Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification (Rasterization) None. Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification (Per-Fragment Operations and the Frame Buffer) None. Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification (Special Functions) None. Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification (State and State Requests) None. Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile) Specification (Invariance) None. Additions to the AGL/GLX/WGL Specifications None. Modifications to The OpenGL Shading Language Specification, Version 1.50 (Revision 09) Including the following line in a shader can be used to control the language features described in this extension: #extension GL_MESA_shader_integer_functions : where is as specified in section 3.3. New preprocessor #defines are added to the OpenGL Shading Language: #define GL_MESA_shader_integer_functions 1 Modify Section 4.1.10, Implicit Conversions, p. 27 (modify table of implicit conversions) Can be implicitly Type of expression converted to --------------------- ----------------- int uint, float ivec2 uvec2, vec2 ivec3 uvec3, vec3 ivec4 uvec4, vec4 uint float uvec2 vec2 uvec3 vec3 uvec4 vec4 (modify second paragraph of the section) No implicit conversions are provided to convert from unsigned to signed integer types or from floating-point to integer types. There are no implicit array or structure conversions. (insert before the final paragraph of the section) When performing implicit conversion for binary operators, there may be multiple data types to which the two operands can be converted. For example, when adding an int value to a uint value, both values can be implicitly converted to uint and float. In such cases, a floating-point type is chosen if either operand has a floating-point type. Otherwise, an unsigned integer type is chosen if either operand has an unsigned integer type. Otherwise, a signed integer type is chosen. Modify Section 5.9, Expressions, p. 57 (modify bulleted list as follows, adding support for implicit conversion between signed and unsigned types) Expressions in the shading language are built from the following: * Constants of type bool, int, int64_t, uint, uint64_t, float, all vector types, and all matrix types. ... * The operator modulus (%) operates on signed or unsigned integer scalars or vectors. If the fundamental types of the operands do not match, the conversions from Section 4.1.10 "Implicit Conversions" are applied to produce matching types. ... Modify Section 6.1, Function Definitions, p. 63 (modify description of overloading, beginning at the top of p. 64) Function names can be overloaded. The same function name can be used for multiple functions, as long as the parameter types differ. If a function name is declared twice with the same parameter types, then the return types and all qualifiers must also match, and it is the same function being declared. For example, vec4 f(in vec4 x, out vec4 y); // (A) vec4 f(in vec4 x, out uvec4 y); // (B) okay, different argument type vec4 f(in ivec4 x, out uvec4 y); // (C) okay, different argument type int f(in vec4 x, out ivec4 y); // error, only return type differs vec4 f(in vec4 x, in vec4 y); // error, only qualifier differs vec4 f(const in vec4 x, out vec4 y); // error, only qualifier differs When function calls are resolved, an exact type match for all the arguments is sought. If an exact match is found, all other functions are ignored, and the exact match is used. If no exact match is found, then the implicit conversions in Section 4.1.10 (Implicit Conversions) will be applied to find a match. Mismatched types on input parameters (in or inout or default) must have a conversion from the calling argument type to the formal parameter type. Mismatched types on output parameters (out or inout) must have a conversion from the formal parameter type to the calling argument type. If implicit conversions can be used to find more than one matching function, a single best-matching function is sought. To determine a best match, the conversions between calling argument and formal parameter types are compared for each function argument and pair of matching functions. After these comparisons are performed, each pair of matching functions are compared. A function definition A is considered a better match than function definition B if: * for at least one function argument, the conversion for that argument in A is better than the corresponding conversion in B; and * there is no function argument for which the conversion in B is better than the corresponding conversion in A. If a single function definition is considered a better match than every other matching function definition, it will be used. Otherwise, a semantic error occurs and the shader will fail to compile. To determine whether the conversion for a single argument in one match is better than that for another match, the following rules are applied, in order: 1. An exact match is better than a match involving any implicit conversion. 2. A match involving an implicit conversion from float to double is better than a match involving any other implicit conversion. 3. A match involving an implicit conversion from either int or uint to float is better than a match involving an implicit conversion from either int or uint to double. If none of the rules above apply to a particular pair of conversions, neither conversion is considered better than the other. For the function prototypes (A), (B), and (C) above, the following examples show how the rules apply to different sets of calling argument types: f(vec4, vec4); // exact match of vec4 f(in vec4 x, out vec4 y) f(vec4, uvec4); // exact match of vec4 f(in vec4 x, out ivec4 y) f(vec4, ivec4); // matched to vec4 f(in vec4 x, out vec4 y) // (C) not relevant, can't convert vec4 to // ivec4. (A) better than (B) for 2nd // argument (rule 2), same on first argument. f(ivec4, vec4); // NOT matched. All three match by implicit // conversion. (C) is better than (A) and (B) // on the first argument. (A) is better than // (B) and (C). Modify Section 8.3, Common Functions, p. 84 (add support for single-precision frexp and ldexp functions) Syntax: genType frexp(genType x, out genIType exp); genType ldexp(genType x, in genIType exp); The function frexp() splits each single-precision floating-point number in into a binary significand, a floating-point number in the range [0.5, 1.0), and an integral exponent of two, such that: x = significand * 2 ^ exponent The significand is returned by the function; the exponent is returned in the parameter . For a floating-point value of zero, the significant and exponent are both zero. For a floating-point value that is an infinity or is not a number, the results of frexp() are undefined. If the input is a vector, this operation is performed in a component-wise manner; the value returned by the function and the value written to are vectors with the same number of components as . The function ldexp() builds a single-precision floating-point number from each significand component in and the corresponding integral exponent of two in , returning: significand * 2 ^ exponent If this product is too large to be represented as a single-precision floating-point value, the result is considered undefined. If the input is a vector, this operation is performed in a component-wise manner; the value passed in and returned by the function are vectors with the same number of components as . (add support for new integer built-in functions) Syntax: genIType bitfieldExtract(genIType value, int offset, int bits); genUType bitfieldExtract(genUType value, int offset, int bits); genIType bitfieldInsert(genIType base, genIType insert, int offset, int bits); genUType bitfieldInsert(genUType base, genUType insert, int offset, int bits); genIType bitfieldReverse(genIType value); genUType bitfieldReverse(genUType value); genIType bitCount(genIType value); genIType bitCount(genUType value); genIType findLSB(genIType value); genIType findLSB(genUType value); genIType findMSB(genIType value); genIType findMSB(genUType value); The function bitfieldExtract() extracts bits through +-1 from each component in , returning them in the least significant bits of corresponding component of the result. For unsigned data types, the most significant bits of the result will be set to zero. For signed data types, the most significant bits will be set to the value of bit +-1. If is zero, the result will be zero. The result will be undefined if or is negative, or if the sum of and is greater than the number of bits used to store the operand. Note that for vector versions of bitfieldExtract(), a single pair of and values is shared for all components. The function bitfieldInsert() inserts the least significant bits of each component of into the corresponding component of . The result will have bits numbered through +-1 taken from bits 0 through -1 of , and all other bits taken directly from the corresponding bits of . If is zero, the result will simply be . The result will be undefined if or is negative, or if the sum of and is greater than the number of bits used to store the operand. Note that for vector versions of bitfieldInsert(), a single pair of and values is shared for all components. The function bitfieldReverse() reverses the bits of . The bit numbered of the result will be taken from bit (-1)- of , where is the total number of bits used to represent . The function bitCount() returns the number of one bits in the binary representation of . The function findLSB() returns the bit number of the least significant one bit in the binary representation of . If is zero, -1 will be returned. The function findMSB() returns the bit number of the most significant bit in the binary representation of . For positive integers, the result will be the bit number of the most significant one bit. For negative integers, the result will be the bit number of the most significant zero bit. For a of zero or negative one, -1 will be returned. (support for unsigned integer add/subtract with carry-out) Syntax: genUType uaddCarry(genUType x, genUType y, out genUType carry); genUType usubBorrow(genUType x, genUType y, out genUType borrow); The function uaddCarry() adds 32-bit unsigned integers or vectors and , returning the sum modulo 2^32. The value is set to zero if the sum was less than 2^32, or one otherwise. The function usubBorrow() subtracts the 32-bit unsigned integer or vector from , returning the difference if non-negative or 2^32 plus the difference, otherwise. The value is set to zero if x >= y, or one otherwise. (support for signed and unsigned multiplies, with 32-bit inputs and a 64-bit result spanning two 32-bit outputs) Syntax: void umulExtended(genUType x, genUType y, out genUType msb, out genUType lsb); void imulExtended(genIType x, genIType y, out genIType msb, out genIType lsb); The functions umulExtended() and imulExtended() multiply 32-bit unsigned or signed integers or vectors and , producing a 64-bit result. The 32 least significant bits are returned in ; the 32 most significant bits are returned in . GLX Protocol None. Dependencies on ARB_gpu_shader_fp64 This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set of implicit conversions supported in the OpenGL Shading Language. If more than one of these extensions is supported, an expression of one type may be converted to another type if that conversion is allowed by any of these specifications. If ARB_gpu_shader_fp64 or a similar extension introducing new data types is not supported, the function overloading rule in the GLSL specification preferring promotion an input parameters to smaller type to a larger type is never applicable, as all data types are of the same size. That rule and the example referring to "double" should be removed. Dependencies on NV_gpu_shader5 This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set of implicit conversions supported in the OpenGL Shading Language. If more than one of these extensions is supported, an expression of one type may be converted to another type if that conversion is allowed by any of these specifications. If NV_gpu_shader5 is supported, integer data types are supported with four different precisions (8-, 16, 32-, and 64-bit) and floating-point data types are supported with three different precisions (16-, 32-, and 64-bit). The extension adds the following rule for output parameters, which is similar to the one present in this extension for input parameters: 5. If the formal parameters in both matches are output parameters, a conversion from a type with a larger number of bits per component is better than a conversion from a type with a smaller number of bits per component. For example, a conversion from an "int16_t" formal parameter type to "int" is better than one from an "int8_t" formal parameter type to "int". Such a rule is not provided in this extension because there is no combination of types in this extension and ARB_gpu_shader_fp64 where this rule has any effect. Errors None New State None New Implementation Dependent State None Issues (1) What should this extension be called? UNRESOLVED. This extension borrows from GL_ARB_gpu_shader5, so creating some sort of a play on that name would be viable. However, nothing in this extension should require SM5 hardware, so such a name would be a little misleading and weird. Since the primary purpose is to add integer related functions from GL_ARB_gpu_shader5, call this extension GL_MESA_shader_integer_functions for now. (2) Why is some of the formatting in this extension weird? RESOLVED: This extension is formatted to minimize the differences (as reported by 'diff --side-by-side -W180') with the GL_ARB_gpu_shader5 specification. (3) Should ldexp and frexp be included? RESOLVED: Yes. Few GPUs have native instructions to implement these functions. These are generally implemented using existing GLSL built-in functions and the other functions provided by this extension. (4) Should umulExtended and imulExtended be included? RESOLVED: Yes. These functions should be implementable on any GPU that can support the rest of this extension, but the implementation may be complex. The implementation on a GPU that only supports 32bit x 32bit = 32bit multiplication would be quite expensive. However, many GPUs (including OpenGL 4.0 GPUs that already support this function) have a 32bit x 16bit = 48bit multiplier. The implementation there is only trivially more expensive than regular 32bit multiplication. (5) Should the pack and unpack functions be included? RESOLVED: No. These functions are already available via GL_ARB_shading_language_packing. (6) Should the "BitsTo" functions be included? RESOLVED: No. These functions are already available via GL_ARB_shader_bit_encoding. Revision History Rev. Date Author Changes ---- ----------- -------- ----------------------------------------- 3 31-Mar-2017 Jon Leech Add ES support (OpenGL-Registry/issues/3) 2 7-Jul-2016 idr Fix typo in #extension line 1 20-Jun-2016 idr Initial version based on GL_ARB_gpu_shader5.