
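Note: This document is not a beginner tutorial. It assumes you already know the basic concepts. Beginners should read it alongside [ Vulkan Tutorial ] or [ Vulkan Samples Tutorial ].

Warning: I did not write this with a complete understanding of Vulkan. These are notes taken while studying, so they may contain errors. If something looks off, check the references.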

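AMD originally had a low-level graphics API called Mantle.

I remember an AMD employee visiting a company I used to work at to promote it. It was tempting at the time, but because it was supported only on specific (the latest) AMD hardware, my reaction went no further than "wow, impressive."

As expected, being tied to specific hardware seems to have kept it from going mainstream. Only a handful of video games shipped with Mantle support. Still, I think the dramatic performance it demonstrated is significant for having opened the era of low-level APIs such as Vulkan and Direct3D 12.

Vulkan in particular can be called a descendant of Mantle, because AMD donated Mantle to the Khronos Group and Vulkan was built on top of it. True to its role as a home of open standards, Khronos made it a cross-platform API. It also consumes SPIR-V binaries, which are not tied to any particular source language, so shaders stay portable across languages. As a result, you can write shaders in either GLSL or HLSL.

Interestingly, it runs on hardware that supports OpenGL ES 3.1 or OpenGL 4.x and above. Considering that Direct3D 12 supports only Windows and seems to run reliably only on GeForce GTX-class cards, I suspect Vulkan will become the mainstream choice in today's cross-platform era.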

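Performance ultimately depends on the implementation, but the results published by benchmarking vendors are telling. The images below are taken from [ Quick Look: Comparing Vulkan & DX12 API Overhead on 3DMark ].

Interestingly, Vulkan performs worse on cards from AMD, the company that created Mantle. That is a strange result. It seems AMD has not been a strong force in graphics cards since it acquired ATI, the original maker of Radeon.

Either way, my prediction is that Vulkan will become mainstream in cross-platform environments. In particular, because extensions and layers leave the door open to extending and hardening the API, I expect to see a lot of work on improving its performance and features.

The problem is that Vulkan is very hard to learn. Perhaps because of the limits of open source, what is available is mostly the specification and API-usage tutorials. On top of that, it is a C API, which makes it hard to learn from an abstracted point of view.

For over a month (whenever I had time) I studied hard and tried to put a renderer together, but I hit a wall with abstraction, glslang, and reflection, and could not render even a single triangle. Compared with implementing renderers in D3D9 or D3D11, it is really difficult.

So I decided to start the [ Vulkan 연구 ] ("Vulkan Study") series, hoping to improve my own skills and to be of at least some help to people getting started with Vulkan.

Since I do not yet know the material well, I have not fixed a table of contents. I intend to follow the order of the Vulkan Specification, but the series will most likely drift toward digging into whatever makes me curious as I read the spec.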

License: Attribution, NonCommercial, NoDerivatives

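Note: This translation was made without permission, so it may be taken down at any time.

Note: It is based on the spec as of 2018-10-28, so it may not reflect the latest version.

Note: The translation is rough, so consult the original if anything looks wrong.

Original: https://raw.githubusercontent.com/KhronosGroup/GLSL/master/extensions/khr/GL_KHR_vulkan_glsl.txt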

Name

KHR_vulkan_glsl

Name Strings

GL_KHR_vulkan_glsl

Contact

John Kessenich (johnkessenich 'at' google.com), Google

Contributors

Jeff Bolz, NVIDIA

Kerch Holt, NVIDIA

Kenneth Benzie, Codeplay

Neil Henning, Codeplay

Neil Hickey, ARM

Daniel Koch, NVIDIA

Timothy Lottes, Epic Games

David Neto, Google

Notice

http://www.khronos.org/registry/speccopyright.html

Status

Approved by Vulkan working group 03-Dec-2015.

Ratified by the Khronos Board of Promoters 15-Jan-2016.

Version

Last Modified Date: 25-Jul-2018

Revision: 46

Number

TBD.

Dependencies

This extension can be applied to OpenGL GLSL versions 1.40 (#version 140) and higher.

This extension can be applied to OpenGL ES ESSL versions 3.10 (#version 310) and higher.

All these versions map GLSL/ESSL semantics to the same SPIR-V 1.0 semantics (approximating the most recent versions of GLSL/ESSL).

Overview

This is version 100 of the GL_KHR_vulkan_glsl extension.

This extension modifies GLSL so that it can be used as a high-level language for the Vulkan API. GLSL is compiled down to SPIR-V, which the Vulkan API consumes.

The following features are removed:

* default uniforms (uniform variables not inside a uniform block), except for opaque types (translator's note: see [ Vulkan Opaque Type ])

* atomic counters (those based on atomic_uint)

* subroutines

* shared and packed block layouts

* the already deprecated texturing functions (e.g., texture2D())

* the already deprecated noise functions (e.g., noise1())

* compatibility-profile features

* gl_DepthRangeParameters and gl_NumSamples

* gl_VertexID and gl_InstanceID

The following features are added:

* push-constant buffers

* shader-combining of separate textures and samplers

* descriptor sets

* specialization constants

* gl_VertexIndex and gl_InstanceIndex

* subpass inputs

* offset and align layout qualifiers for uniform/buffer blocks, for versions that did not support them

The following features are changed:

* precision qualifiers (mediump and lowp) are respected for all versions and are not dropped for desktop versions (the default precision for desktop versions is highp for all types)

* gl_FragColor no longer indicates an implicit broadcast

* arrays of uniform/buffer blocks take only one binding number for the entire object, not one per array element

* the default origin is origin_upper_left instead of origin_lower_left

Each of these is discussed in more detail below.

Enabling These Features

This extension is not enabled with #extension as other extensions are. It is also not enabled through a profile or #version. The intended level of GLSL/ESSL features, independent of Vulkan-specific usage, comes from the traditional use of #version, profile, and #extension.

Instead, use of this extension is an effect of running a GLSL front-end in a mode that makes it generate SPIR-V for Vulkan. Such tool use is outside the scope of the Vulkan API and outside the definition of GLSL and of this extension. See the compiler's documentation for how to request generation of SPIR-V for Vulkan (translator's note: the Khronos reference front-end is called "glslang"; see [ GitHub ]. It ships with the LunarG Vulkan SDK, so no separate installation is needed).

When a front-end is used to accept this extension, it must error check and reject shaders that do not adhere to this specification, and accept those that do. Implementation-dependent maximums and capabilities are supplied to, or are part of, the front-end, so that it can check for errors against them.

A shader can query the level of Vulkan support available using the predefined

#define VULKAN 100

This allows shader code to say, for example,

#ifdef VULKAN
    layout(set = 1, binding = 0) uniform sampler s;
    layout(set = 1, binding = 1) uniform texture2D t;
    #if VULKAN > 100
        ...
    #endif
#else
    layout(binding = 0) uniform sampler2D ts;
#endif
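As a rough sketch of the "see the compiler's documentation" step: with glslang's command-line front-end, glslangValidator, the -V option requests SPIR-V under Vulkan semantics and -o names the output file. The helper name vulkan_spirv_command and the file names are hypothetical, and the function only builds the argument list; it does not run the tool.

```python
from typing import List


def vulkan_spirv_command(source: str, output: str) -> List[str]:
    """Build a glslangValidator command line that compiles GLSL to SPIR-V for Vulkan.

    -V : create a SPIR-V binary under Vulkan semantics
    -o : name of the output file
    """
    return ["glslangValidator", "-V", source, "-o", output]


if __name__ == "__main__":
    # Hypothetical file names; pass the list to subprocess.run(...) if the
    # tool is installed and on PATH.
    print(" ".join(vulkan_spirv_command("shader.frag", "shader.frag.spv")))
```

Compiling in this mode is what makes the VULKAN macro visible to the shader source.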

Specialization Constants

SPIR-V specialization constants, which can be set later by the client API, are declared using layout(constant_id=...). For example, to make a specialization constant with a default value of 12:

layout(constant_id = 17) const int arraySize = 12;

Above, 17 is the ID by which the API or other tools can later refer to this specific specialization constant. The API or an intermediate tool can then change its value to another constant integer before it is fully lowered to executable code. If it is never changed before final lowering, it retains the value 12.

Specialization constants have const semantics, except that they do not fold (translator's note: constant folding is the compiler optimization that evaluates constant expressions at compile time, collapsing several constants into one; see [ constant folding ]). Hence, an array can be declared with 'arraySize' from above:

vec4 data[arraySize]; // legal, even though arraySize might change

Specialization constants can be used in expressions:

vec4 data2[arraySize + 2];

This makes data2 sized by 2 more than whatever constant value 'arraySize' has when it is time to lower the shader to executable code.

An expression formed with specialization constants also behaves in the shader like a specialization constant, not like an ordinary constant.

arraySize + 2 // a specialization constant (with no constant_id)

Such expressions can be used in the same places as a constant.
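To make the "resolved at lowering time, not at compile time" idea concrete, here is a small Python model. It is only an illustration under my own assumptions: SpecConstant, lower, data_len, and data2_len are hypothetical names, and a real driver works on SPIR-V, not on Python closures.

```python
from typing import Callable, Dict, Optional


class SpecConstant:
    """A scalar constant with an ID and a default value, resolved late."""

    def __init__(self, constant_id: int, default: int) -> None:
        self.constant_id = constant_id
        self.default = default

    def resolve(self, overrides: Dict[int, int]) -> int:
        # If the API never supplies a value for this ID, the default survives.
        return overrides.get(self.constant_id, self.default)


def lower(expr: Callable[[Dict[int, int]], int],
          overrides: Optional[Dict[int, int]] = None) -> int:
    """Evaluate a deferred expression at 'lowering' time."""
    return expr(overrides or {})


# layout(constant_id = 17) const int arraySize = 12;
array_size = SpecConstant(constant_id=17, default=12)

# vec4 data[arraySize];      -> the size stays symbolic until lowering
data_len = lambda overrides: array_size.resolve(overrides)
# vec4 data2[arraySize + 2]; -> also a specialization-constant expression
data2_len = lambda overrides: array_size.resolve(overrides) + 2

if __name__ == "__main__":
    print(lower(data_len), lower(data2_len))                      # defaults kept
    print(lower(data_len, {17: 20}), lower(data2_len, {17: 20}))  # ID 17 overridden
```

The expressions are stored unevaluated, which mirrors the rule that a front-end must not fold them.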

constant_id can only be applied to a scalar int, a scalar float, or a scalar bool.

Only basic operators and constructors can be applied to a specialization constant and still yield a specialization constant:

layout(constant_id = 17) const int arraySize = 12;
sin(float(arraySize)); // result is not a specialization constant

While SPIR-V specialization constants exist only for scalars, a vector can be built by operations on scalars:

layout(constant_id = 18) const int scX = 1;
layout(constant_id = 19) const int scZ = 1;
const vec3 scVec = vec3(scX, 1, scZ); // partially specialized vector

A built-in variable can have a constant_id attached to it:

layout(constant_id = 18) gl_MaxImageUnits;

This makes it behave as a specialization constant. It is not a full redeclaration; all other characteristics remain intact from the original built-in declaration.

The built-in vector gl_WorkGroupSize can be specialized using the special layout qualifiers local_size_{xyz}_id, applied to the in qualifier. For example:

layout(local_size_x_id = 18, local_size_z_id = 19) in;

This leaves gl_WorkGroupSize.y as a non-specialization constant, with gl_WorkGroupSize being a partially specialized vector. Its x and z components can later be specialized using the IDs 18 and 19.

Push Constants

Push constants reside in a uniform block declared using the new layout-qualifier-id push_constant, applied to a uniform-block declaration. The API writes a set of constants to a push-constant buffer, and the shader reads them from a push_constant block:

layout(push_constant) uniform BlockName {
    int member1;
    float member2;
    ...
} InstanceName; // optional instance name
... = InstanceName.member2; // read a push constant

The memory accounting used for the push_constant uniform block differs from that of other uniform blocks: there is a separate small pool of memory it must fit within. By default, a push_constant buffer follows the std430 packing rules (translator's note: see "Memory layout" in [ Interface Block (GLSL) ] and [ ARB_shader_storage_buffer_object ]).
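Since the host has to write bytes that match the block layout, it helps to see how std430 places members. The sketch below covers only a few 32-bit scalar and vector types (no arrays, matrices, or nested structs); the table STD430 and the function std430_offsets are hypothetical helpers of mine, not part of any API.

```python
from typing import Dict, List, Tuple

# (alignment, size) in bytes. vec3 aligns like vec4 but occupies only 12 bytes.
STD430: Dict[str, Tuple[int, int]] = {
    "int": (4, 4), "uint": (4, 4), "float": (4, 4),
    "vec2": (8, 8), "vec3": (16, 12), "vec4": (16, 16),
}


def std430_offsets(members: List[Tuple[str, str]]) -> Tuple[Dict[str, int], int]:
    """Return ({member name: byte offset}, total block size in bytes)."""
    offsets: Dict[str, int] = {}
    cursor = 0
    max_align = 1
    for type_name, name in members:
        align, size = STD430[type_name]
        cursor = (cursor + align - 1) // align * align  # round up to the alignment
        offsets[name] = cursor
        cursor += size
        max_align = max(max_align, align)
    # The block as a whole is padded to its largest member alignment.
    total = (cursor + max_align - 1) // max_align * max_align
    return offsets, total


if __name__ == "__main__":
    # The block from the text: int member1; float member2;
    print(std430_offsets([("int", "member1"), ("float", "member2")]))
```

The total matters because the push-constant pool is small; the Vulkan specification only guarantees 128 bytes, so it is worth checking the computed size against the device limit.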

Descriptor Sets

Each shader resource in a descriptor set is assigned a tuple of (set number, binding number, array element) that defines its location within a descriptor set layout.

In GLSL, the set number and binding number are assigned via the set and binding layout qualifiers, and the array element is implicitly assigned consecutively, starting at index zero for the first element of an array (the array element is zero for non-array variables):

// Assign set number = M, binding number = N, array element = 0
layout (set=M, binding=N) uniform sampler2D variableName;
// Assign set number = M, binding number = N for all array elements,
// and array element = i for the ith member of an array of size I.
layout (set=M, binding=N) uniform sampler2D variableNameArray[I];

For example, two combined texture/sampler objects can be declared in two different descriptor sets as follows:

layout(set = 0, binding = 0) uniform sampler2D ts3;
layout(set = 1, binding = 0) uniform sampler2D ts4;

See the API documentation for more detail on the operation model of descriptor sets (translator's note: see "13.2. Descriptor Sets" in [ Vulkan spec ]).
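The tuple assignment described in this section can be written down in a few lines. This is a sketch with hypothetical names (Slot, descriptor_slots); it only enumerates the tuples and says nothing about how a driver stores them.

```python
from typing import List, Tuple

Slot = Tuple[int, int, int]  # (set number, binding number, array element)


def descriptor_slots(set_number: int, binding: int, array_size: int = 0) -> List[Slot]:
    """List the tuples for one declaration; array_size=0 means a non-array variable."""
    count = max(array_size, 1)
    return [(set_number, binding, element) for element in range(count)]


if __name__ == "__main__":
    print(descriptor_slots(0, 0))     # layout(set = 0, binding = 0) uniform sampler2D ts3;
    print(descriptor_slots(1, 2, 3))  # a hypothetical array of 3 at set 1, binding 2
```

Every element of an array shares the set and binding numbers, and only the third component changes.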

Storage Images

In GLSL shader source, storage images are declared using uniform image variables of the appropriate dimensionality, together with a format layout qualifier (if necessary):

layout (set=m, binding=n, r32f) uniform image2D myStorageImage;

This maps to the following SPIR-V:

...
%1 = OpExtInstImport "GLSL.std.450"
...
OpName %9 "myStorageImage"
OpDecorate %9 DescriptorSet m
OpDecorate %9 Binding n
%2 = OpTypeVoid
%3 = OpTypeFunction %2
%6 = OpTypeFloat 32
%7 = OpTypeImage %6 2D 0 0 0 2 R32f
%8 = OpTypePointer UniformConstant %7
%9 = OpVariable %8 UniformConstant
...

Samplers

In GLSL shader source, samplers are declared using uniform sampler variables, where the sampler type has no texture dimensionality:

layout (set=m, binding=n) uniform sampler mySampler;

This maps to the following SPIR-V:

...
%1 = OpExtInstImport "GLSL.std.450"
...
OpName %8 "mySampler"
OpDecorate %8 DescriptorSet m
OpDecorate %8 Binding n
%2 = OpTypeVoid
%3 = OpTypeFunction %2
%6 = OpTypeSampler
%7 = OpTypePointer UniformConstant %6
%8 = OpVariable %7 UniformConstant
...

Textures ( Sampled Images )

In GLSL shader source, textures are declared using uniform texture variables of the appropriate dimensionality:

layout (set=m, binding=n) uniform texture2D mySampledImage;

This maps to the following SPIR-V:

...
%1 = OpExtInstImport "GLSL.std.450"
...
OpName %9 "mySampledImage"
OpDecorate %9 DescriptorSet m
OpDecorate %9 Binding n
%2 = OpTypeVoid
%3 = OpTypeFunction %2
%6 = OpTypeFloat 32
%7 = OpTypeImage %6 2D 0 0 0 1 Unknown
%8 = OpTypePointer UniformConstant %7
%9 = OpVariable %8 UniformConstant
...
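The storage-image and texture listings differ only in the trailing operands of OpTypeImage. As I read the SPIR-V specification, the operands after the result id are Sampled Type, Dim, Depth, Arrayed, MS, Sampled, and Image Format, and Sampled is 2 for a storage image and 1 for an image used with a sampler. The decoder below is a hypothetical helper for reading such lines; it ignores the optional access qualifier operand.

```python
from typing import Dict

SAMPLED_MEANING = {
    "0": "known only at run time",
    "1": "used with a sampler (sampled image)",
    "2": "used without a sampler (storage image)",
}


def decode_op_type_image(line: str) -> Dict[str, str]:
    """Split a line such as '%7 = OpTypeImage %6 2D 0 0 0 2 R32f' into named operands."""
    result_id, _, opcode, *operands = line.split()
    if opcode != "OpTypeImage" or len(operands) != 7:
        raise ValueError("expected OpTypeImage with 7 operands")
    names = ["sampled_type", "dim", "depth", "arrayed", "ms", "sampled", "format"]
    decoded = dict(zip(names, operands))
    decoded["result"] = result_id
    decoded["sampled_meaning"] = SAMPLED_MEANING[decoded["sampled"]]
    return decoded


if __name__ == "__main__":
    print(decode_op_type_image("%7 = OpTypeImage %6 2D 0 0 0 2 R32f"))
    print(decode_op_type_image("%7 = OpTypeImage %6 2D 0 0 0 1 Unknown"))
```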

Combined Texture + Samplers

In GLSL shader source, combined textures and samplers are declared using uniform combined-sampler variables of the appropriate dimensionality (for example, sampler2D):

layout (set=m, binding=n) uniform sampler2D myCombinedImageSampler;

This maps to the following SPIR-V:

...
%1 = OpExtInstImport "GLSL.std.450"
...
OpName %10 "myCombinedImageSampler"
OpDecorate %10 DescriptorSet m
OpDecorate %10 Binding n
%2 = OpTypeVoid
%3 = OpTypeFunction %2
%6 = OpTypeFloat 32
%7 = OpTypeImage %6 2D 0 0 0 1 Unknown
%8 = OpTypeSampledImage %7
%9 = OpTypePointer UniformConstant %8
%10 = OpVariable %9 UniformConstant
...

Note that a combined image sampler descriptor can also be referred to in the shader as just an image or just a sampler, as in the sections above.

Combining separate samplers and textures

A sampler, declared with the keyword sampler, contains just filtering information, and contains neither a texture nor an image:

uniform sampler s; // a handle to filtering information

A texture, declared with keywords like texture2D, contains just image information, not filtering information:

uniform texture2D t; // a handle to a texture (an image in SPIR-V)

Constructors can then be used to combine a sampler and a texture at the point of making a texture lookup call:

texture(sampler2D(t, s), ...);
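The split between "image data" and "filtering state" can be modelled on the CPU. The toy below is my own illustration, not how a GPU samples: Texture2D, Sampler, and sample are hypothetical names, only single-channel texels are handled, and addressing is clamp-to-edge.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Texture2D:
    """Image data only: rows of texel values, no filtering state."""
    texels: List[List[float]]


@dataclass
class Sampler:
    """Filtering state only: "nearest" or "linear"."""
    filter_mode: str


def _texel(texture: Texture2D, x: int, y: int) -> float:
    height, width = len(texture.texels), len(texture.texels[0])
    x = min(max(x, 0), width - 1)   # clamp to edge
    y = min(max(y, 0), height - 1)
    return texture.texels[y][x]


def sample(texture: Texture2D, sampler: Sampler, u: float, v: float) -> float:
    """Model of texture(sampler2D(t, s), uv): the two halves meet only at lookup time."""
    height, width = len(texture.texels), len(texture.texels[0])
    # Texel centers sit at (i + 0.5) / size in normalized coordinates.
    x, y = u * width - 0.5, v * height - 0.5
    if sampler.filter_mode == "nearest":
        return _texel(texture, math.floor(x + 0.5), math.floor(y + 0.5))
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    top = _texel(texture, x0, y0) * (1 - fx) + _texel(texture, x0 + 1, y0) * fx
    bottom = _texel(texture, x0, y0 + 1) * (1 - fx) + _texel(texture, x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + bottom * fy
```

The same Texture2D can be paired with different Sampler objects, which is the flexibility that separate samplers and textures buy.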

Texture Buffers ( Uniform Texel Buffers )

In GLSL shader source, texture buffers are declared using uniform textureBuffer variables:

layout (set=m, binding=n) uniform textureBuffer myUniformTexelBuffer;

This maps to the following SPIR-V:

...
%1 = OpExtInstImport "GLSL.std.450"
...
OpName %9 "myUniformTexelBuffer"
OpDecorate %9 DescriptorSet m
OpDecorate %9 Binding n
%2 = OpTypeVoid
%3 = OpTypeFunction %2
%6 = OpTypeFloat 32
%7 = OpTypeImage %6 Buffer 0 0 0 1 Unknown
%8 = OpTypePointer UniformConstant %7
%9 = OpVariable %8 UniformConstant
...

Image Buffers ( Storage Texel Buffers )

In GLSL shader source, image buffers are declared using uniform imageBuffer variables:

layout (set=m, binding=n, r32f) uniform imageBuffer myStorageTexelBuffer;

This maps to the following SPIR-V:

...
%1 = OpExtInstImport "GLSL.std.450"
...
OpName %9 "myStorageTexelBuffer"
OpDecorate %9 DescriptorSet m
OpDecorate %9 Binding n
%2 = OpTypeVoid
%3 = OpTypeFunction %2
%6 = OpTypeFloat 32
%7 = OpTypeImage %6 Buffer 0 0 0 2 R32f
%8 = OpTypePointer UniformConstant %7
%9 = OpVariable %8 UniformConstant
...

Storage Buffers

In GLSL shader source, storage buffers are declared using the buffer storage qualifier and block syntax:

layout (set=m, binding=n) buffer myStorageBuffer
{
    vec4 myElement[];
};

This maps to the following SPIR-V:

...
%1 = OpExtInstImport "GLSL.std.450"
...
OpName %9 "myStorageBuffer"
OpMemberName %9 0 "myElement"
OpName %11 ""
OpDecorate %8 ArrayStride 16
OpMemberDecorate %9 0 Offset 0
OpDecorate %9 BufferBlock
OpDecorate %11 DescriptorSet m
OpDecorate %11 Binding n
%2 = OpTypeVoid
%3 = OpTypeFunction %2
%6 = OpTypeFloat 32
%7 = OpTypeVector %6 4
%8 = OpTypeRuntimeArray %7
%9 = OpTypeStruct %8
%10 = OpTypePointer Uniform %9
%11 = OpVariable %10 Uniform
...

Uniform Buffers

In GLSL shader source, uniform buffers are declared using the uniform storage qualifier and block syntax:

layout (set=m, binding=n) uniform myUniformBuffer
{
    vec4 myElement[32];
};

This maps to the following SPIR-V:

...
%1 = OpExtInstImport "GLSL.std.450"
...
OpName %11 "myUniformBuffer"
OpMemberName %11 0 "myElement"
OpName %13 ""
OpDecorate %10 ArrayStride 16
OpMemberDecorate %11 0 Offset 0
OpDecorate %11 Block
OpDecorate %13 DescriptorSet m
OpDecorate %13 Binding n
%2 = OpTypeVoid
%3 = OpTypeFunction %2
%6 = OpTypeFloat 32
%7 = OpTypeVector %6 4
%8 = OpTypeInt 32 0
%9 = OpConstant %8 32
%10 = OpTypeArray %7 %9
%11 = OpTypeStruct %10
%12 = OpTypePointer Uniform %11
%13 = OpVariable %12 Uniform
...

Subpass Inputs

Within a rendering pass, a subpass can write results to an output target, which the next subpass can then read as an input subpass. The "Subpass Input" feature concerns the ability to read an output target.

Subpass inputs are read through a new set of types, available only to fragment shaders:

subpassInput

subpassInputMS

isubpassInput

isubpassInputMS

usubpassInput

usubpassInputMS

Unlike sampler and image objects, subpass inputs are implicitly addressed by the fragment's (x, y, layer) coordinate.

Input attachments are decorated with their input attachment index in addition to descriptor set and binding numbers:

layout (input_attachment_index=i, set=m, binding=n) uniform subpassInput myInputAttachment;

This maps to the following SPIR-V:

...
%1 = OpExtInstImport "GLSL.std.450"
...
OpName %9 "myInputAttachment"
OpDecorate %9 DescriptorSet m
OpDecorate %9 Binding n
OpDecorate %9 InputAttachmentIndex i
%2 = OpTypeVoid
%3 = OpTypeFunction %2
%6 = OpTypeFloat 32
%7 = OpTypeImage %6 SubpassData 0 0 0 2 Unknown
%8 = OpTypePointer UniformConstant %7
%9 = OpVariable %8 UniformConstant
...

An input_attachment_index of i selects the ith entry in the input pass list. (See the API specification for more information.)

These objects support reading the subpass input through the following functions:

gvec4 subpassLoad(gsubpassInput   subpass);
gvec4 subpassLoad(gsubpassInputMS subpass, int sample);
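The key property of subpassLoad is that the shader passes no coordinate: each fragment can read only its own pixel of the attachment. A tiny CPU model makes that visible. The names subpass_load and run_second_subpass are hypothetical, the layer coordinate is ignored, and the "invert the color" pass is just an example workload.

```python
from typing import List, Tuple

Color = Tuple[float, float, float, float]


def subpass_load(attachment: List[List[Color]], frag_x: int, frag_y: int) -> Color:
    """Read the input attachment at the current fragment's own pixel.

    frag_x and frag_y stand in for the fragment's implicit (x, y) position;
    in GLSL the shader never supplies them.
    """
    return attachment[frag_y][frag_x]


def run_second_subpass(first_pass_output: List[List[Color]]) -> List[List[Color]]:
    """Invert every pixel written by the first subpass, one fragment per pixel."""
    return [
        [tuple(1.0 - channel for channel in subpass_load(first_pass_output, x, y))
         for x in range(len(row))]
        for y, row in enumerate(first_pass_output)
    ]
```

Anything that needs neighbouring pixels (a blur, for instance) cannot be expressed this way and needs a regular texture read instead.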

gl_FragColor

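The fragment-stage built-in gl_FragColor, which implies a broadcast to all outputs, is not present in SPIR-V. Shaders where writing to gl_FragColor is allowed can still write to it, but doing so only means writing to an output that is:

- of the same type as gl_FragColor

- decorated with location 0

- not decorated as a built-in variable

There is no implicit broadcast.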
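The difference is easy to state as data. In this sketch (function names are hypothetical), each function returns a map from color output location to the value written.

```python
from typing import Dict, Tuple

Color = Tuple[float, float, float, float]


def write_frag_color_broadcast(color: Color, num_outputs: int) -> Dict[int, Color]:
    """Legacy behaviour: one write is broadcast to every color output."""
    return {location: color for location in range(num_outputs)}


def write_frag_color_vulkan(color: Color, num_outputs: int) -> Dict[int, Color]:
    """Under this extension: an ordinary output at location 0 and nothing else.

    num_outputs is kept only so both functions share a signature.
    """
    return {0: color}


if __name__ == "__main__":
    red = (1.0, 0.0, 0.0, 1.0)
    print(write_frag_color_broadcast(red, 3))
    print(write_frag_color_vulkan(red, 3))
```

A shader that relied on the broadcast to fill several render targets therefore has to declare and write each output explicitly.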

gl_VertexIndex and gl_InstanceIndex

Two new built-in variables, gl_VertexIndex and gl_InstanceIndex, are added to replace the existing built-in variables gl_VertexID and gl_InstanceID.

In situations where the indexing is relative to some base offset, these built-in variables are defined, for Vulkan, to take on the following values:

gl_VertexIndex      base, base+1, base+2, ...
gl_InstanceIndex    base, base+1, base+2, ...

What the base actually is depends on the situation.
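As one concrete situation, and this is my reading of the Vulkan API specification rather than something this extension states: for a non-indexed vkCmdDraw, the base of gl_VertexIndex is firstVertex and the base of gl_InstanceIndex is firstInstance. The function draw below is a hypothetical model that lists the pairs a vertex shader would see.

```python
from typing import List, Tuple


def draw(vertex_count: int, instance_count: int,
         first_vertex: int, first_instance: int) -> List[Tuple[int, int]]:
    """Return (gl_InstanceIndex, gl_VertexIndex) for every vertex shader invocation."""
    return [
        (first_instance + instance, first_vertex + vertex)
        for instance in range(instance_count)
        for vertex in range(vertex_count)
    ]


if __name__ == "__main__":
    for pair in draw(vertex_count=2, instance_count=2, first_vertex=10, first_instance=5):
        print(pair)
```

The ordering of the returned list is only for readability; actual invocation order is not something a shader can rely on.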

Mapping to SPIR-V

For informational purposes (non-specification), the following shows the expected way for an implementation to map GLSL constructs to SPIR-V constructs:

Mapping of storage classes:

uniform sampler2D...;        -> UniformConstant
uniform blockN { ... } ...;  -> Uniform, with Block decoration
in / out variable            -> Input/Output, possibly with block (below)
in / out block...            -> Input/Output, with Block decoration
buffer  blockN { ... } ...;  -> Uniform, with BufferBlock decoration, or
                                StorageBuffer, when requested
N/A                          -> AtomicCounter
shared                       -> Workgroup
<normal global>              -> Private

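Mapping of input/output blocks or variables is the same for all versions of GLSL or ESSL. To the extent that variables or members are available in a version, their location is as follows:

These are mapped to individual SPIR-V variables, with similarly spelled built-in decorations (except as noted) (translator's note: a decoration is metadata attached to a SPIR-V id, for example with OpDecorate):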

Any stage :

in gl_NumWorkGroups

in gl_WorkGroupSize

in gl_WorkGroupID

in gl_LocalInvocationID

in gl_GlobalInvocationID

in gl_LocalInvocationIndex

in gl_VertexIndex

in gl_InstanceIndex

in gl_InvocationID

in gl_PatchVerticesIn (PatchVertices)

in gl_PrimitiveIDIn (PrimitiveID)

in/out gl_PrimitiveID (in/out based only on storage qualifier)

in gl_TessCoord

in/out gl_Layer

in/out gl_ViewportIndex

patch in/out gl_TessLevelOuter (uses Patch decoration)

patch in/out gl_TessLevelInner (uses Patch decoration)

Fragment stage only:

in gl_FragCoord

in gl_FrontFacing

in gl_ClipDistance

in gl_CullDistance

in gl_PointCoord

in gl_SampleID

in gl_SamplePosition

in gl_HelperInvocation

out gl_FragDepth

in gl_SampleMaskIn (SampleMask)

out gl_SampleMask (in/out based only on storage qualifier)

These are mapped to SPIR-V blocks, as implied by the pseudo code, with the members decorated with similarly spelled built-in decorations:

Non-fragment stage:

in/out gl_PerVertex {   // some subset of these members will be used
    gl_Position
    gl_PointSize
    gl_ClipDistance
    gl_CullDistance
}                       // name of block is for debug only

There is at most one input block and one output block per stage in SPIR-V. The subset and order of members will match between stages sharing an interface.

Mapping of precision qualifiers:

lowp -> RelaxedPrecision, on storage variable and operation

mediump -> RelaxedPrecision, on storage variable and operation

highp -> 32-bit, same as int or float

portability tool/mode -> OpQuantizeToF16
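RelaxedPrecision allows, but does not force, an implementation to compute at reduced precision, which is why a portability tool would quantize explicitly to see the worst case. A rough CPU equivalent of quantizing to 16-bit floats can be written with Python's struct module and its half-precision format code "e". The function name quantize_to_f16 is hypothetical, and unlike the SPIR-V instruction this sketch raises OverflowError for values too large for half precision.

```python
import struct


def quantize_to_f16(value: float) -> float:
    """Round a float to the nearest IEEE 754 half-precision value and widen it back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


if __name__ == "__main__":
    for x in (1.0, 0.1, 1024.5):
        print(x, "->", quantize_to_f16(x))
```

Values such as 1.0 survive unchanged, while 0.1 picks up a small error, which is the kind of difference a mediump shader may show on hardware that really uses 16 bits.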

Mapping of precise:

precise -> NoContraction

Mapping of images:

subpassInput -> OpTypeImage with 'Dim' of SubpassData

subpassLoad() -> OpImageRead

imageLoad() -> OpImageRead

imageStore() -> OpImageWrite

texelFetch() -> OpImageFetch

imageAtomicXXX(params, data) -> %ptr = OpImageTexelPointer params
                                OpAtomicXXX %ptr, data

XXXQueryXXX(combined)        -> %image = OpImage combined
                                OpXXXQueryXXX %image

Mapping of layouts:

std140/std430 -> explicit offsets/strides on struct

shared/packed -> not allowed

<default> -> not shared, but std140 or std430

max_vertices -> OutputVertices

Mapping of barriers:

barrier() (compute)      -> OpControlBarrier(/*Execution*/Workgroup,
                                             /*Memory*/Workgroup,
                                             /*Semantics*/AcquireRelease |
                                                          WorkgroupMemory)

barrier() (tess control) -> OpControlBarrier(/*Execution*/Workgroup,
                                             /*Memory*/Invocation,
                                             /*Semantics*/None)

memoryBarrier()          -> OpMemoryBarrier(/*Memory*/Device,
                                            /*Semantics*/AcquireRelease |
                                                         UniformMemory |
                                                         WorkgroupMemory |
                                                         ImageMemory)

memoryBarrierBuffer()    -> OpMemoryBarrier(/*Memory*/Device,
                                            /*Semantics*/AcquireRelease |
                                                         UniformMemory)

memoryBarrierShared()    -> OpMemoryBarrier(/*Memory*/Device,
                                            /*Semantics*/AcquireRelease |
                                                         WorkgroupMemory)

memoryBarrierImage()     -> OpMemoryBarrier(/*Memory*/Device,
                                            /*Semantics*/AcquireRelease |
                                                         ImageMemory)

groupMemoryBarrier()     -> OpMemoryBarrier(/*Memory*/Workgroup,
                                            /*Semantics*/AcquireRelease |
                                                         UniformMemory |
                                                         WorkgroupMemory |
                                                         ImageMemory)
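In the SPIR-V binary, the scope and semantics operands are integer constants: the scope is an enum value and the semantics are a bit mask. The sketch below encodes the memory-barrier rows of the table using the numeric values from the SPIR-V specification as I understand them (Device = 1, Workgroup = 2, AcquireRelease = 0x8, UniformMemory = 0x40, WorkgroupMemory = 0x100, ImageMemory = 0x800). The Python names are my own.

```python
from enum import IntEnum, IntFlag
from typing import Dict, Tuple


class Scope(IntEnum):
    CROSS_DEVICE = 0
    DEVICE = 1
    WORKGROUP = 2
    SUBGROUP = 3
    INVOCATION = 4


class Semantics(IntFlag):
    NONE = 0x0
    ACQUIRE_RELEASE = 0x8
    UNIFORM_MEMORY = 0x40
    WORKGROUP_MEMORY = 0x100
    IMAGE_MEMORY = 0x800


_AR = Semantics.ACQUIRE_RELEASE
_ALL = _AR | Semantics.UNIFORM_MEMORY | Semantics.WORKGROUP_MEMORY | Semantics.IMAGE_MEMORY

# GLSL built-in -> (memory scope, memory semantics), following the table above.
MEMORY_BARRIERS: Dict[str, Tuple[Scope, Semantics]] = {
    "memoryBarrier": (Scope.DEVICE, _ALL),
    "memoryBarrierBuffer": (Scope.DEVICE, _AR | Semantics.UNIFORM_MEMORY),
    "memoryBarrierShared": (Scope.DEVICE, _AR | Semantics.WORKGROUP_MEMORY),
    "memoryBarrierImage": (Scope.DEVICE, _AR | Semantics.IMAGE_MEMORY),
    "groupMemoryBarrier": (Scope.WORKGROUP, _ALL),
}


def encode(name: str) -> Tuple[int, int]:
    """Return the two integer operands of OpMemoryBarrier for a GLSL built-in."""
    scope, semantics = MEMORY_BARRIERS[name]
    return int(scope), int(semantics)
```

Seeing the masks as numbers helps when reading disassembly, where only the constants appear.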

Mapping of atomics

all atomic builtin functions -> Semantics = None(Relaxed)

atomicExchange() -> OpAtomicExchange

imageAtomicExchange() -> OpAtomicExchange

atomicCompSwap() -> OpAtomicCompareExchange

imageAtomicCompSwap() -> OpAtomicCompareExchange

N/A -> OpAtomicCompareExchangeWeak

Mapping of other instructions:

% -> OpUMod/OpSMod

mod() -> OpFMod

N/A -> OpSRem/OpFRem
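The Mod and Rem instruction families differ only in which operand the sign of a non-zero result follows: the divisor for OpSMod/OpFMod, the dividend for OpSRem/OpFRem. That is my reading of the SPIR-V specification, and the Python functions below are hypothetical stand-ins that reproduce those sign rules on the CPU.

```python
import math


def s_mod(a: int, b: int) -> int:
    """OpSMod-style: a non-zero result takes the sign of the divisor b."""
    return a % b  # Python's % on ints already follows the divisor


def s_rem(a: int, b: int) -> int:
    """OpSRem-style: a non-zero result takes the sign of the dividend a."""
    magnitude = abs(a) % abs(b)
    return -magnitude if a < 0 else magnitude


def f_mod(x: float, y: float) -> float:
    """GLSL mod(): x - y * floor(x / y), so the sign follows y (OpFMod-style)."""
    return x - y * math.floor(x / y)


def f_rem(x: float, y: float) -> float:
    """OpFRem-style: the sign follows x, like C's fmod."""
    return math.fmod(x, y)


if __name__ == "__main__":
    print(s_mod(-7, 3), s_rem(-7, 3))
    print(f_mod(-7.0, 3.0), f_rem(-7.0, 3.0))
```

The "N/A" row says that GLSL has no spelling for the Rem variants, so a GLSL front-end has no reason to emit them.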

===== What follows concerns changes to the [ OpenGL Shading Language Specification ], so I have not translated it.

It does contain quite a bit of information that is not mentioned in the body above, so it is worth reading. =====

Changes to Chapter 1 of the OpenGL Shading Language Specification

Change the last paragraph of "1.3 Overview": "The OpenGL Graphics System

Specification will specify the OpenGL entry points used to manipulate and

communicate with GLSL programs and GLSL shaders."

Add a paragraph: "The Vulkan API will specify the Vulkan entry points used

to manipulate SPIR-V shaders. Independent offline tool chains will compile

GLSL down to the SPIR-V intermediate language. Vulkan use is not enabled

with a #extension, #version, or a profile. Instead, use of GLSL for Vulkan

is determined by offline tool-chain use. See the documentation of such

tools to see how to request generation of SPIR-V for Vulkan."

"GLSL -> SPIR-V compilers must be directed as to what SPIR-V *Capabilities*

are legal at run-time and give errors for GLSL feature use outside those

capabilities. This is also true for implementation-dependent limits that

can be error checked by the front-end against constants present in the

GLSL source: the front-end can be informed of such limits, and report

errors when they are exceeded."

Changes to Chapter 2 of the OpenGL Shading Language Specification

Change the name from

"2 Overview of OpenGL Shading"

to

"2 Overview of OpenGL and Vulkan Shading"

Remove the word "OpenGL" from three introductory paragraphs.

Changes to Chapter 3 of the OpenGL Shading Language Specification

Add a new paragraph at the end of section "3.3 Preprocessor": "When

shaders are compiled for Vulkan, the following predefined macro is

available:

#define VULKAN 100

Add the following keywords to section 3.6 Keywords:

texture1D texture2D texture3D

textureCube texture2DRect texture1DArray

texture2DArray textureBuffer texture2DMS

texture2DMSArray textureCubeArray

itexture1D itexture2D itexture3D

itextureCube itexture2DRect itexture1DArray

itexture2DArray itextureBuffer

itexture2DMS itexture2DMSArray

itextureCubeArray

utexture1D utexture2D utexture3D

utextureCube utexture2DRect utexture1DArray

utexture2DArray utextureBuffer utexture2DMS

utexture2DMSArray utextureCubeArray

sampler samplerShadow

subpassInput isubpassInput usubpassInput

subpassInputMS isubpassInputMS usubpassInputMS

Move the following keywords in section 3.6 Keywords to the reserved

section:

atomic_uint

subroutine

Changes to Chapter 4 of the OpenGL Shading Language Specification

Add into the tables in section 4.1 Basic Types, interleaved with the

existing types, using the existing descriptions (when not supplied

below):

Floating-Point Opaque Types

texture1D

texture2D

texture3D

textureCube

texture2DRect

texture1DArray

texture2DArray

textureBuffer

texture2DMS

texture2DMSArray

textureCubeArray

subpassInput | a handle for accessing a floating-point

| subpass input

subpassInputMS | a handle for accessing a multi-sampled

| floating-point subpass input

Signed Integer Opaque Types

itexture1D

itexture2D

itexture3D

itextureCube

itexture2DRect

itexture1DArray

itexture2DArray

itextureBuffer

itexture2DMS

itexture2DMSArray

itextureCubeArray

isubpassInput | a handle for accessing an integer subpass input

isubpassInputMS | a handle for accessing a multi-sampled integer

| subpass input

Unsigned Integer Opaque Types

utexture1D

utexture2D

utexture3D

utextureCube

utexture2DRect

utexture1DArray

utexture2DArray

utextureBuffer

utexture2DMS

utexture2DMSArray

utextureCubeArray

usubpassInput | a handle for accessing an unsigned integer

| subpass input

usubpassInputMS | a handle for accessing a multi-sampled unsigned

| integer subpass input

Remove the entry from the table in section 4.1 Basic Types:

atomic_uint

Add a new category in this section

"Sampler Opaque Types

sampler | a handle for accessing state describing how to

| sample a texture"

---------------------------------------------------------------------

samplerShadow | a handle for accessing state describing how to

| sample a depth texture with comparison"

Remove "structure member selection" from 4.1.7 and instead add a sentence

"Opaque types cannot be declared or nested in a structure (struct)."

Modify subsection 4.1.3 Integers, for desktop versions of GLSL, to say:

"Highp unsigned integers have exactly 32 bits of precision. Highp

signed integers use 32 bits, including a sign bit, in two's complement

form. Mediump and lowp integers are as defined by the RelaxedPrecision

decoration in SPIR-V."

Add a subsection to 4.1.7 Opaque Types:

"4.1.7.x Texture, *sampler*, and *samplerShadow* Types

"Texture (e.g., *texture2D*), *sampler*, and *samplerShadow* types are opaque

types, declared and behaving as described above for opaque types. When

aggregated into arrays within a shader, these types can only be indexed

with a dynamically uniform expression, or texture lookup will result in

undefined values. Texture variables are handles to one-, two-, and

three-dimensional textures, cube maps, etc., as enumerated in the basic

types tables. There are distinct

texture types for each texture target, and for each of float, integer,

and unsigned integer data types. Textures can be combined with a

variable of type *sampler* or *samplerShadow* to create a sampler type

(e.g., sampler2D, or sampler2DShadow). This is done with a constructor,

e.g., sampler2D(texture2D, sampler),

sampler2DShadow(texture2D, sampler),

sampler2DShadow(texture2D, samplerShadow), or

sampler2D(texture2D, samplerShadow)

and is described in more detail in section 5.4 "Constructors"."

"4.1.7.x Subpass Inputs

"Subpass input types (e.g., subpassInput) are opaque types, declared

and behaving as described above for opaque types. When aggregated into

arrays within a shader, they can only be indexed with a dynamically

uniform integral expression, otherwise results are undefined.

"Subpass input types are handles to two-dimensional single sampled or

multi-sampled images, with distinct types for each of float, integer,

and unsigned integer data types.

"Subpass input types are only available in fragment shaders. It is a

compile-time error to use them in any other stage."

Remove the section 4.1.7.3 Atomic Counters

Change section 4.3.3 Constant Expressions:

Add a new very first sentence to this section:

"SPIR-V specialization constants are expressed in GLSL as const, with

a layout qualifier identifier of constant_id, as described in section

4.4.x Specialization-Constant Qualifier."

Add to this sentence:

"A constant expression is one of...

* a variable declared with the const qualifier and an initializer,

where the initializer is a constant expression"

To make it say:

"A constant expression is one of...

* a variable declared with the const qualifier and an initializer,

where the initializer is a constant expression; this includes both

const declared with a specialization-constant layout qualifier,

e.g., 'layout(constant_id = ...)' and those declared without a

specialization-constant layout qualifier"

Add to "including getting an element of a constant array," that

"an array access with a specialization constant as an index does

not result in a constant expression"

Add to this sentence:

"A constant expression is one of...

* the value returned by a built-in function..."

To make it say:

"A constant expression is one of...

* for non-specialization-constants only: the value returned by a

built-in function... (when any function is called with an argument

that is a specialization constant, the result is not a constant

expression)"

Rewrite the last half of the last paragraph to be its own paragraph

saying:

"Non-specialization constant expressions may be evaluated by the

compiler's host platform, and are therefore not required ...

[rest of paragraph stays the same]"

Add a paragraph

"Specialization constant expressions are never evaluated by the

front-end, but instead retain the operations needed to evaluate them

later on the host."

Add to the table in section 4.4 Layout Qualifiers:

| Individual Variable | Block | Allowed Interface

------------------------------------------------------------------------

constant_id = | scalar only | | const

------------------------------------------------------------------------

push_constant | | X | uniform

------------------------------------------------------------------------

set = | opaque only | X | uniform

------------------------------------------------------------------------

input_attachment_index | subpass types only | | uniform

(The other columns remain blank.)

Also add to this table:

| Qualifier Only | Allowed Interface

-------------------------------------------------------

local_size_x_id = | X | in

local_size_y_id = | X | in

local_size_z_id = | X | in

(The other columns remain blank.)

Expand this sentence in section 4.4.1 Input Layout Qualifiers:

"Where integral-constant-expression is defined in section 4.3.3 Constant

Expressions as 'integral constant expression'"

To include the following:

", with it being a compile-time error for integer-constant-expression to

be a specialization constant: The constant used to set a layout

identifier X in layout(layout-qualifier-name = X) must evaluate to a

front-end constant containing no specialization constants."

Change the rules about locations and inputs for doubles, by removing

"If a vertex shader input is any scalar or vector type, it will consume

a single location. If a non-vertex shader input is a scalar or vector

type other than dvec3 or dvec4..."

Replacing the above with

"If an input is a scalar or vector type other than dvec3 or dvec4..."

(Making all stages have the same rule that dvec3 takes two locations...)
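The unified rule is simple enough to check by hand. Here is a sketch that assigns consecutive locations to a list of scalar or vector inputs; the names locations_consumed and assign_locations are hypothetical, and arrays, matrices, and structs are deliberately out of scope.

```python
from typing import Dict, List, Tuple

TWO_LOCATION_TYPES = {"dvec3", "dvec4"}


def locations_consumed(type_name: str) -> int:
    """dvec3 and dvec4 take two locations; other scalars and vectors take one."""
    return 2 if type_name in TWO_LOCATION_TYPES else 1


def assign_locations(inputs: List[Tuple[str, str]], first: int = 0) -> Dict[str, int]:
    """Give each (type, name) input the next free location, in declaration order."""
    locations: Dict[str, int] = {}
    cursor = first
    for type_name, name in inputs:
        locations[name] = cursor
        cursor += locations_consumed(type_name)
    return locations


if __name__ == "__main__":
    print(assign_locations([("vec4", "position"), ("dvec3", "precise_pos"), ("float", "weight")]))
```

Because every in and out variable must carry an explicit location when generating SPIR-V, a helper like this is handy for spotting overlaps before the compiler does.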

At the end of the paragraphs describing the *location* rules, add this

paragraph:

"When generating SPIR-V, all *in* and *out* qualified user-declared

(non built-in) variables and blocks (or all their members) must have a

shader-specified *location*. Otherwise, a compile-time error is

generated."

[Note that an earlier existing rule just above this says "If a block has

no block-level *location* layout qualifier, it is required that either all

or none of its members have a *location* layout qualifier, or a compile-

time error results."]

Change section 4.4.1.3 "Fragment Shader Inputs" from

"By default, gl_FragCoord assumes a lower-left origin for window

coordinates ... For example, the (x, y) location (0.5, 0.5) is

returned for the lowerleft-most pixel in a window. The origin can be

changed by redeclaring gl_FragCoord with the

origin_upper_left identifier."

To

"The gl_FragCoord built-in variable assumes an upper-left origin for

window coordinates ... For example, the (x, y) location (0.5, 0.5) is

returned for the upper-left-most pixel in a window. The origin can be

explicitly set by redeclaring gl_FragCoord with the origin_upper_left

identifier. It is a compile-time error to change it to

origin_lower_left."
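When porting a shader that assumed the OpenGL default, the only change is a flip of the y coordinate. The conversion below is a sketch with a hypothetical function name; window_height is the framebuffer height in pixels, which the shader would have to receive some other way (a push constant, for example).

```python
from typing import Tuple


def flip_frag_coord_y(x: float, y: float, window_height: float) -> Tuple[float, float]:
    """Convert gl_FragCoord.xy between lower-left and upper-left origins.

    The mapping is its own inverse, so one function serves both directions.
    """
    return (x, window_height - y)


if __name__ == "__main__":
    # The upper-left-most pixel of a 600-pixel-high window.
    print(flip_frag_coord_y(0.5, 0.5, 600.0))
```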

Add to the end of section 4.4.3 Uniform Variable Layout Qualifiers:

"The /push_constant/ identifier is used to declare an entire block, and

represents a set of "push constants", as defined by the API. It is a

compile-time error to apply this to anything other than a uniform block

declaration. The values in the block will be initialized through the

API, as per the Vulkan API specification. A block declared with

layout(push_constant) may optionally include an /instance-name/.

There can be only one push_constant

block per stage, or a compile-time or link-time error will result. A

push-constant array can only be indexed with dynamically uniform indexes.

Uniform blocks declared with push_constant use different resources

than those without; and are accounted for separately. See the API

specification for more detail."

After the paragraphs about binding ("The binding identifier..."), add

"The /set/ identifier specifies the descriptor set this object belongs to.

It is a compile-time error to apply /set/ to a standalone qualifier or to

a member of a block. It is a compile-time error to apply /set/ to a block

qualified as a push_constant. By default, any non-push_constant uniform

or shader storage block declared without a /set/ identifier is assigned to

descriptor set 0. Similarly, any sampler, texture, or subpass input type

declared as a uniform, but without a /set/ identifier is also assigned

to descriptor set 0.

"If applied to an object declared as an array, all elements of the array

belong to the specified /set/.

"It is a compile-time error for either the /set/ or /binding/ value

to exceed a front-end-configuration supplied maximum value."

Remove mention of subroutine throughout section 4.4 Layout Qualifiers,

including removal of section 4.4.4 Subroutine Function Layout Qualifiers.

Change section 4.4.5 Uniform and Shader Storage Block Layout Qualifiers:

Change

"If the binding identifier is used with a uniform or shader storage block

instanced as an array, the first element of the array takes the specified

block binding and each subsequent element takes the next consecutive

uniform block binding point. For an array of arrays, each element (e.g.,

6 elements for a[2][3]) gets a binding point, and they are ordered per the

array-of-array ordering described in section 4.1.9 'Arrays.'"

To

"If the binding identifier is used with a uniform block or buffer block

instanced as an array, the entire array takes only the provided binding

number. The next consecutive binding number is available for a different

object. For an array of arrays, descriptor set array element numbers used

in descriptor set accesses are ordered per the array-of-array ordering

described in section 4.1.9 'Arrays.'"

Change section 4.4.6 Opaque-Uniform Layout Qualifiers:

Change

"If the binding identifier is used with an array, the first element of

the array takes the specified unit and each subsequent element takes the

next consecutive unit."

To

"If the binding identifier is used with an array, the entire array

takes only the provided binding number. The next consecutive binding

number is available for a different object."

Remove section 4.4.6.1 Atomic Counter Layout Qualifiers

Add a new subsection at the end of section 4.4:

"4.4.x Specialization-Constant Qualifier

"Specialization constants are declared using "layout(constant_id=...)".

For example:

layout(constant_id = 17) const int arraySize = 12;

"The above makes a specialization constant with a default value of 12.

17 is the ID by which the API or other tools can later refer to

this specific specialization constant. If it is never changed before

final lowering, it will retain the value of 12. It is a compile-time

error to use the constant_id qualifier on anything but a scalar bool,

int, uint, float, or double.

"Built-in constants can be declared to be specialization constants.

For example,

layout(constant_id = 31) gl_MaxClipDistances; // add specialization id

"The declaration uses just the name of the previously declared built-in

variable, with a constant_id layout declaration. It is a compile-time

error to do this after the constant has been used: Constants are strictly

either non-specialization constants or specialization constants, not

both.

"The built-in constant vector gl_WorkGroupSize can be specialized using

the local_size_{xyz}_id qualifiers, to individually give the components

an id. For example:

layout(local_size_x_id = 18, local_size_z_id = 19) in;

"This leaves gl_WorkGroupSize.y as a non-specialization constant, with

gl_WorkGroupSize being a partially specialized vector. Its x and z

components can be later specialized using the ids 18 and 19. These ids

are declared independently from declaring the work-group size:

layout(local_size_x = 32, local_size_y = 32) in; // size is (32,32,1)

layout(local_size_x_id = 18) in; // constant_id for x

layout(local_size_z_id = 19) in; // constant_id for z

"Existing rules for declaring local_size_x, local_size_y, and

local_size_z are not changed by this extension. For the local-size ids,

it is a compile-time error to provide different id values for the same

local-size id, or to provide them after any use. Otherwise, order,

placement, number of statements, and replication do not cause errors.

"Two arrays sized with specialization constants are the same type only if

sized with the same symbol, involving no operations.

layout(constant_id = 51) const int aSize = 20;

const int pad = 2;

const int total = aSize + pad; // specialization constant

int a[total], b[total]; // a and b have the same type

int c[22]; // different type than a or b

int d[aSize + pad]; // different type than a, b, or c

int e[aSize + 2]; // different type than a, b, c, or d

"Types containing arrays sized with a specialization constant cannot be

compared, assigned as aggregates, declared with an initializer, or used

as an initializer. They can, however, be passed as arguments to

functions having formal parameters of the same type.

"Arrays inside a block may be sized with a specialization constant, but

the block will have a static layout. Changing the specialized size will

not re-layout the block. In the absence of explicit offsets, the layout

will be based on the default size of the array."
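To make the "retains its default unless changed before final lowering" rule concrete, here is a minimal C sketch of how a tool might resolve the final value of an integer specialization constant. The `SpecOverride` struct and `resolve_spec_int` function are hypothetical names made up for this illustration; they are not part of Vulkan, glslang, or the extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical (id, value) pair supplied by the application;
   this is NOT a real Vulkan structure. */
typedef struct SpecOverride {
    uint32_t constant_id;
    int32_t  value;
} SpecOverride;

/* Returns the override registered for constant_id, or default_value
   when no override exists (the constant keeps its shader default). */
int32_t resolve_spec_int(uint32_t constant_id, int32_t default_value,
                         const SpecOverride *overrides, size_t count)
{
    assert(overrides != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (overrides[i].constant_id == constant_id)
            return overrides[i].value;
    }
    return default_value;
}
```

With the declaration `layout(constant_id = 17) const int arraySize = 12;` from the text, calling `resolve_spec_int(17, 12, ...)` with no matching override would give back the default of 12.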

Add a new subsection at the end of section 4.4:

"4.4.y Subpass Qualifier

"Subpasses are declared with the basic 'subpassInput' types. However,

they must have the layout qualifier "input_attachment_index" declared

with them, or a compile-time error results. For example:

layout(input_attachment_index = 2, ...) uniform subpassInput t;

This selects which subpass input is being read from. The value assigned

to 'input_attachment_index', say i (input_attachment_index = i), selects

that entry (ith entry) in the input list for the pass. See the API

documentation for more detail about passes and the input list.

"If an array of size N is declared, it consumes N consecutive

input_attachment_index values, starting with the one provided.

"It is a compile-time or link-time error to have different variables

declared with the same input_attachment_index. This includes any overlap

in the implicit input_attachment_index consumed by array declarations.

"It is a compile-time error if the value assigned to an

input_attachment_index is greater than or equal to

gl_MaxInputAttachments."
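The array and overlap rules for input_attachment_index can be sketched as a small range check in C. The `SubpassInputDecl` struct and `subpass_inputs_valid` function are hypothetical names for illustration only. The sketch also assumes that every index consumed by an array, not just the first one, must stay below the maximum; that is my reading rather than something the text states outright.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one subpassInput declaration. */
typedef struct SubpassInputDecl {
    uint32_t first_index;  /* value given to input_attachment_index */
    uint32_t array_size;   /* 1 for a non-array variable */
} SubpassInputDecl;

/* True when every consumed index is below max_attachments and no two
   declarations consume the same index. */
bool subpass_inputs_valid(const SubpassInputDecl *decls, size_t count,
                          uint32_t max_attachments)
{
    assert(decls != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        /* one past the last index consumed; 64-bit to avoid overflow */
        uint64_t end_i = (uint64_t)decls[i].first_index + decls[i].array_size;
        if (decls[i].array_size == 0 || end_i > max_attachments)
            return false;
        for (size_t j = i + 1; j < count; ++j) {
            uint64_t end_j = (uint64_t)decls[j].first_index + decls[j].array_size;
            /* half-open ranges [first, end) overlap when each starts
               before the other ends */
            if (decls[i].first_index < end_j && decls[j].first_index < end_i)
                return false;
        }
    }
    return true;
}
```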

Remove all mention of the 'shared' and 'packed' layout qualifiers.

Change section 4.4.5 Uniform and Shader Storage Block Layout Qualifiers

"The initial state of compilation is as if the following were declared:

layout(std140, column_major) uniform; // without push_constant

layout(std430, column_major) buffer;

"However, when push_constant is declared, the default layout of the

buffer will be std430. There is no method to globally set this default."

Add to this statement:

"The std430 qualifier is supported only for shader storage blocks; using

std430 on a uniform block will result in a compile-time error"

the following phrase:

"unless it is also declared with push_constant"

Add to section 4.4.5 Uniform and Shader Storage Block Layout Qualifiers,

for versions not having 'offset' and 'align' description language,

or replace with the following for versions that do have 'offset' and

'align' description language:

"The 'offset' qualifier can only be used on block members of 'uniform' or

'buffer' blocks. The 'offset' qualifier forces the qualified member to

start at or after the specified integral-constant-expression, which will

be its byte offset from the beginning of the buffer. It is a compile-time

error to have any offset, explicit or assigned, that lies within another

member of the block. Two blocks linked together in the same program with

the same block name must have the exact same set of members qualified

with 'offset' and their integral-constant-expression values must be the

same, or a link-time error results. The specified 'offset' must be a

multiple of the base alignment of the type of the block member it

qualifies, or a compile-time error results.

"The 'align' qualifier can only be used on block members of 'uniform' or

'buffer' blocks. The 'align' qualifier makes the start of each block

buffer have a minimum byte alignment. It does not affect the internal

layout within each member, which will still follow the std140 or std430

rules. The specified alignment must be greater than 0 and a power of 2,

or a compile-time error results.

"The actual alignment of a member will be the greater of the specified

'align' alignment and the standard (e.g., std140) base alignment for the

member's type. The actual offset of a member is computed as follows:

If 'offset' was declared, start with that offset, otherwise start with

the offset immediately following the preceding member (in declaration

order). If the resulting offset is not a multiple of the actual

alignment, increase it to the first offset that is a multiple of the

actual alignment. This results in the actual offset the member will have.

"When 'align' is applied to an array, it affects only the start of the

array, not the array's internal stride. Both an 'offset' and an 'align'

qualifier can be specified on a declaration.

"The 'align' qualifier, when used on a block, has the same effect as

qualifying each member with the same 'align' value as declared on the

block, and gets the same compile-time results and errors as if this had

been done. As described in general earlier, an individual member can

specify its own 'align', which overrides the block-level 'align', but

just for that member."
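The "actual offset" computation described above can be written as a short C function. This is only a sketch of the stated rule; `actual_member_offset`, `round_up`, and all parameter names are hypothetical. The caller is assumed to supply the member's standard (std140 or std430) base alignment and the offset immediately following the preceding member.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rounds value up to the next multiple of alignment (alignment > 0). */
static uint32_t round_up(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1u) / alignment * alignment;
}

/* next_free_offset: offset immediately following the preceding member.
   base_alignment:   standard base alignment of the member's type.     */
uint32_t actual_member_offset(uint32_t next_free_offset,
                              bool has_offset, uint32_t declared_offset,
                              bool has_align, uint32_t declared_align,
                              uint32_t base_alignment)
{
    assert(base_alignment > 0u);

    /* Actual alignment: the greater of 'align' and the base alignment. */
    uint32_t alignment = base_alignment;
    if (has_align && declared_align > alignment)
        alignment = declared_align;

    /* Start from 'offset' if declared, else right after the previous member. */
    uint32_t start = has_offset ? declared_offset : next_free_offset;

    /* Bump to the first multiple of the actual alignment. */
    return round_up(start, alignment);
}
```

For example, a member with base alignment 16 that follows a 4-byte member at offset 0 would land at offset 16; adding `align = 64` would push it to 64.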

Remove the following preamble from section 4.7, which exists for desktop

versions, but not ES versions. Removal:

"Precision qualifiers are added for code portability with OpenGL ES, not

for functionality. They have the same syntax as in OpenGL ES, as

described below, but they have no semantic meaning, which includes no

effect on the precision used to store or operate on variables.

"If an extension adds in the same semantics and functionality in the

OpenGL ES 2.0 specification for precision qualifiers, then the extension

is allowed to reuse the keywords below for that purpose.

"For the purposes of determining if an output from one shader stage

matches an input of the next stage, the precision qualifier need not

match."

Add:

"For interface matching, uniform variables and uniform and buffer block

members must have the same precision qualification. For matching *out*

variables or block members to *in* variables and block members, the

precision qualification does not have to match.

"Global variables declared in different compilation units linked into the

same shader stage must be declared with the same precision qualification."

More generally, all versions will follow OpenGL ES semantic rules for

precision qualifiers.

Section 4.7.2 Precision Qualifiers (desktop only)

Replace the table saying "none" for all precisions with this statement:

"Mediump and lowp floating-point values have the precision defined by

the RelaxedPrecision decoration in SPIR-V."

Section 4.7.4 Default Precision Qualifiers:

For desktop versions, replace the last three paragraphs that state the

default precisions with the following instead:

"All stages have default precision qualification of highp for all types

that accept precision qualifiers."

Changes to Chapter 5 of the OpenGL Shading Language Specification

Add a new subsection at the end of section 5.4 "Constructors":

"5.4.x Sampler Constructors

"Sampler types, like *sampler2D* can be declared with an initializer

that is a constructor of the same type, and consuming a texture and a

sampler. For example:

layout(...) uniform sampler s; // handle to filtering information

layout(...) uniform texture2D t; // handle to a texture

layout(...) in vec2 tCoord;

...

texture(sampler2D(t, s), tCoord);

The result of a sampler constructor cannot be assigned to a variable:

... sampler2D sConstruct = sampler2D(t, s); // ERROR

Sampler constructors can only be consumed by a function parameter.

Sampler constructors of arrays are illegal:

layout(...) uniform texture2D tArray[6];

...

... sampler2D[](tArray, s) ... // ERROR

Formally:

* every sampler type can be used as a constructor

* the type of the constructor must match the type of the

variable being declared

* the constructor's first argument must be a texture type

* the constructor's second argument must be a scalar of type

*sampler* or *samplerShadow*

* the dimensionality (1D, 2D, 3D, Cube, Rect, Buffer, MS, and Array)

of the texture type must match that of the constructed sampler type

(that is, the suffixes of the type of the first argument and the

type of the constructor will be spelled the same way)

* there is no control flow construct (e.g., "?:") that consumes any

sampler type

Note: Shadow mismatches are allowed between constructors and the

second argument. Non-shadow samplers can be constructed from

*samplerShadow* and shadow samplers can be constructed from *sampler*.

Change section 5.9 Expressions

Add under "The sequence (,) operator..."

"Texture and sampler types cannot be used with the sequence (,)

operator."

Change under "The ternary selection operator (?:)..."

"The second and third expressions can be any type, as long their types

match."

To

"The second and third expressions can be any type, as long their types

match, except for texture and sampler types, which result in a

compile-time error."

Add a section at the end of section 5

"5.x Specialization Constant Operations"

Only some operations discussed in this section may be applied to a

specialization constant and still yield a result that is a

specialization constant. The operations allowed are listed below.

When a specialization constant is operated on with one of these

operators and with another constant or specialization constant, the

result is implicitly a specialization constant.

- int(), uint(), and bool() constructors for type conversions

from any of the following types to any of the following types:

* int

* uint

* bool

- vector versions of the above conversion constructors

- allowed implicit conversions of the above

- swizzles (e.g., foo.yx)

- The following when applied to integer or unsigned integer types:

* unary negative ( - )

* binary operations ( + , - , * , / , % )

* shift ( <<, >> )

* bitwise operations ( & , | , ^ )

- The following when applied to integer or unsigned integer scalar types:

* comparison ( == , != , > , >= , < , <= )

- The following when applied to the Boolean scalar type:

* not ( ! )

* logical operations ( && , || , ^^ )

* comparison ( == , != )

- The ternary operator ( ? : )

Changes to Chapter 6 of the OpenGL Shading Language Specification

Remove mention of subroutine throughout, including removal of

section 6.1.2 Subroutines.

Changes to Chapter 7 of the OpenGL Shading Language Specification

Changes to section 7.1 Built-In Language Variables

Replace gl_VertexID and gl_InstanceID, for non-ES with:

"in int gl_VertexIndex;"

"in int gl_InstanceIndex;"

For ES, add:

"in highp int gl_VertexIndex;"

"in highp int gl_InstanceIndex;"

The following definition for gl_VertexIndex should replace the definition

for gl_VertexID:

"The variable gl_VertexIndex is a vertex language input variable that

holds an integer index for the vertex, [See issue 7 regarding which

name goes with which semantics] relative to a base. While the

variable gl_VertexIndex is always present, its value is not always

defined. See XXX in the API specification."

The following definition for gl_InstanceIndex should replace the definition

for gl_InstanceID:

"The variable gl_InstanceIndex is a vertex language input variable that

holds the instance number of the current primitive in an instanced draw

call, relative to a base. If the current primitive does not come from

an instanced draw call, the value of gl_InstanceIndex is zero."

[See issue 7 regarding which name goes with which semantics]

Changes to section 7.3 Built-In Constants

Add

"const int gl_MaxInputAttachments = 1;"

Remove section 7.4 Built-In Uniform State (there is none in Vulkan).

Changes to Chapter 8 of the OpenGL Shading Language Specification

Add the following ES language to desktop versions of the specification:

"The operation of a built-in function can have a different precision

qualification than the precision qualification of the resulting value.

These two precision qualifications are established as follows.

"The precision qualification of the operation of a built-in function is

based on the precision qualification of its input arguments and formal

parameters: When a formal parameter specifies a precision qualifier,

that is used, otherwise, the precision qualification of the calling

argument is used. The highest precision of these will be the precision

qualification of the operation of the built-in function. Generally,

this is applied across all arguments to a built-in function, with the

exceptions being:

- bitfieldExtract and bitfieldInsert ignore the 'offset' and 'bits'

arguments.

- interpolateAt* functions only look at the 'interpolant' argument.

"The precision qualification of the result of a built-in function is

determined in one of the following ways:

- For the texture sampling, image load, and image store functions,

the precision of the return type matches the precision of the

sampler type:

uniform lowp sampler2D sampler;

highp vec2 coord;

...

lowp vec4 col = texture (sampler, coord); // texture() returns lowp

Otherwise:

- For prototypes that do not specify a resulting precision qualifier,

the precision will be the same as the precision of the operation.

(As defined earlier.)

- For prototypes that do specify a resulting precision qualifier,

the specified precision qualifier is the precision qualification of

the result."
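The rule for the precision of a built-in function's operation (take the formal parameter's qualifier when there is one, otherwise the argument's, then use the highest) can be sketched in C as follows. The `Precision` enum, `ArgPrecision` struct, and `operation_precision` function are hypothetical. The caller is assumed to leave out the arguments that the exceptions tell you to ignore, such as 'offset' and 'bits' of bitfieldExtract.

```c
#include <assert.h>
#include <stddef.h>

/* Ordered so that a larger value means higher precision. */
typedef enum Precision {
    PREC_NONE = 0, PREC_LOWP, PREC_MEDIUMP, PREC_HIGHP
} Precision;

typedef struct ArgPrecision {
    Precision formal;  /* PREC_NONE when the prototype specifies none */
    Precision actual;  /* precision of the calling argument */
} ArgPrecision;

/* Highest precision across all considered arguments. */
Precision operation_precision(const ArgPrecision *args, size_t count)
{
    assert(args != NULL || count == 0);
    Precision result = PREC_NONE;
    for (size_t i = 0; i < count; ++i) {
        /* A qualifier on the formal parameter wins over the argument's. */
        Precision p = (args[i].formal != PREC_NONE) ? args[i].formal
                                                    : args[i].actual;
        if (p > result)
            result = p;
    }
    return result;
}
```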

Add precision qualifiers to the following in desktop versions:

genIType floatBitsToInt (highp genFType value)

genUType floatBitsToUint(highp genFType value)

genFType intBitsToFloat (highp genIType value)

genFType uintBitsToFloat(highp genUType value)

genFType frexp(highp genFType x, out highp genIType exp)

genFType ldexp(highp genFType x, in highp genIType exp)

highp uint packSnorm2x16(vec2 v)

vec2 unpackSnorm2x16(highp uint p)

highp uint packUnorm2x16(vec2 v)

vec2 unpackUnorm2x16(highp uint p)

vec2 unpackHalf2x16(highp uint v)

vec4 unpackUnorm4x8(highp uint v)

vec4 unpackSnorm4x8(highp uint v)

genIType bitfieldReverse(highp genIType value)

genUType bitfieldReverse(highp genUType value)

genIType findMSB(highp genIType value)

genIType findMSB(highp genUType value)

genUType uaddCarry(highp genUType x, highp genUType y,

out lowp genUType carry)

genUType usubBorrow(highp genUType x, highp genUType y,

out lowp genUType borrow)

void umulExtended(highp genUType x, highp genUType y,

out highp genUType msb, out highp genUType lsb)

void imulExtended(highp genIType x, highp genIType y,

out highp genIType msb, out highp genIType lsb)

Remove section 8.10 Atomic-Counter Functions

Remove section 8.14 Noise Functions

Add a section

"8.X Subpass Functions

"Subpass functions are only available in a fragment shader.

"Subpass inputs are read through the built-in functions below. The gvec...

and gsubpass... are matched, where they must both be the same floating

point, integer, or unsigned integer variants.

Add a table with these two entries (in the same cell):

"gvec4 subpassLoad(gsubpassInput subpass)

gvec4 subpassLoad(gsubpassInputMS subpass, int sample)"

With the description:

"Read from a subpass input, from the implicit location (x, y, layer)

of the current fragment coordinate."

Changes to the grammar

Arrays can no longer require the size to be a compile-time folded constant

expression. Change

| LEFT_BRACKET constant_expression RIGHT_BRACKET

to

| LEFT_BRACKET conditional_expression RIGHT_BRACKET

and change

| array_specifier LEFT_BRACKET constant_expression RIGHT_BRACKET

to

| array_specifier LEFT_BRACKET conditional_expression RIGHT_BRACKET

Remove the ATOMIC_UINT type_specifier_nonarray.

Remove all instances of the SUBROUTINE keyword.

Issues

1. Can we have specialization sizes in an array in a block? That prevents

putting known offsets on subsequent members.

RESOLUTION: Yes, but it does not affect offsets.

2. Can a specialization-sized array be passed by value?

RESOLUTION: Yes, if they are sized with the same specialization constant.

3. Can a texture array be variably indexed? Dynamically uniform?

Resolution (bug 14683): Dynamically uniform indexing.

4. Are arrays of a descriptor set all under the same set number, or does, say,

an array of size 4 use up 4 descriptor sets?

RESOLUTION: There is no array of descriptor sets. Arrays of resources

are in a single descriptor set and consume a single binding number.

5. Which descriptor set arrays can be variably or non-uniformly indexed?

RESOLUTION: There is no array of descriptor sets.

6. Do we want an alternate way of doing composite member specialization

constants? For example,

layout(constant_id = 18) gl_WorkGroupSize.y;

Or

layout(constant_id = 18, local_size_y = 16) in;

Or

layout(constant_id = 18) wgy = 16;

const ivec3 gl_WorkGroupSize = ivec3(1, wgy, 1);

RESOLUTION: No. Use local_size_x_id etc. for workgroup size, and

defer any more generalized way of doing this for composites.

7. What names do we really want to use for

gl_VertexIndex base, base+1, base+2, ...

gl_InstanceIndex base, base+1, base+2, ...

RESOLUTION: Use the names above.

Note that gl_VertexIndex is equivalent to OpenGL's gl_VertexID in that

it includes the value of the baseVertex parameter. gl_InstanceIndex is

NOT equivalent to OpenGL's gl_InstanceID because gl_InstanceID does NOT

include the baseInstance parameter.
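The difference noted in issue 7 can be shown with a tiny C model. The function names are hypothetical, and the model covers only the numbering (base, base+1, base+2, ...), not how a real draw call produces it.

```c
#include <assert.h>
#include <stdint.h>

/* Vulkan-style gl_InstanceIndex: counts from the draw's base instance. */
uint32_t instance_index(uint32_t base_instance, uint32_t instance_in_draw)
{
    return base_instance + instance_in_draw;
}

/* OpenGL-style gl_InstanceID: ignores the base instance. */
uint32_t opengl_instance_id(uint32_t base_instance, uint32_t instance_in_draw)
{
    (void)base_instance;
    return instance_in_draw;
}

/* The two differ by exactly the base instance. */
void check_instance_numbering(uint32_t base_instance, uint32_t instance_in_draw)
{
    assert(instance_index(base_instance, instance_in_draw)
           == opengl_instance_id(base_instance, instance_in_draw) + base_instance);
}
```

So a shader ported from OpenGL that relied on gl_InstanceID starting at zero would need to subtract the base instance when switching to gl_InstanceIndex.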

8. What should "input subpasses" really be called?

RESOLVED: subpassInput.

9. The spec currently does not restrict where sampler constructors can go,

but should it? E.g., can the user write a shader like the following:

uniform texture2D t[MAX_TEXTURES];
uniform sampler s[2];
uniform int textureCount;
uniform int sampleCount;
uniform bool samplerCond;

float ShadowLookup(bool pcf, vec2 tcBase[MAX_TEXTURES])
{
    float result = 0;

    for (int textureIndex = 0; textureIndex < textureCount; ++textureIndex)
    {
        for (int sampleIndex = 0; sampleIndex < sampleCount; ++sampleIndex)
        {
            vec2 tc = tcBase[textureIndex] + offsets[sampleIndex];
            if (samplerCond)
                result += texture(sampler2D(t[textureIndex], s[0]), tc).r;
            else
                result += texture(sampler2D(t[textureIndex], s[1]), tc).r;
        }

Or, like this?

uniform texture2D t[MAX_TEXTURES];
uniform sampler s[2];
uniform int textureCount;
uniform int sampleCount;
uniform bool samplerCond;

sampler2D combined0[MAX_TEXTURES] = sampler2D(t, s[0]);
sampler2D combined1[MAX_TEXTURES] = sampler2D(t, s[1]);

float ShadowLookup(bool pcf, vec2 tcBase[MAX_TEXTURES])
{
    for (int textureIndex = 0; textureIndex < textureCount; ++textureIndex) {
        for (int sampleIndex = 0; sampleIndex < sampleCount; ++sampleIndex) {
            vec2 tc = tcBase[textureIndex] + offsets[sampleIndex];
            if (samplerCond)
                result += texture(combined0[textureIndex], tc).r;
            else
                result += texture(combined1[textureIndex], tc).r;
        }
        ...

RESOLUTION (bug 14683): Only constructed at the point of use, where passed

as an argument to a function parameter.

Revision History

Rev. Date Author Changes

---- ----------- ------- --------------------------------------------

46 25-Jul-2018 JohnK No longer require sampler constructors to

check shadow matches: mismatches are allowed

45 15-Dec-2017 TobiasH moved resource binding examples in from Vulkan API spec

44 12-Dec-2017 jbolz Document mapping of barrier/atomic ops to

SPIR-V

43 25-Oct-2017 JohnK remove the already deprecated noise functions

42 07-Jul-2017 JohnK arrays of buffers consume only one binding

41 05-Jul-2017 JohnK allow std430 on push_constant declarations

40 21-May-2017 JohnK Require in/out explicit locations

39 14-Apr-2017 JohnK Update overview for StorageBuffer storage

class.

38 14-Apr-2017 JohnK Fix Vulkan public issue #466: texture2D typo.

37 26-Mar-2017 JohnK Fix glslang issue #369: remove gl_NumSamples.

36 13-Feb-2017 JohnK Fix public bug 428: allow anonymous

push_constant blocks.

35 07-Feb-2017 JohnK Add 'offset' and 'align' to all versions

34 26-Jan-2017 JohnK Allow the ternary operator to result in a

specialization constant

33 30-Aug-2016 JohnK Allow out-of-order offsets in a block

32 1-Aug-2016 JohnK Remove atomic_uint and more fully subroutine

31 20-Jul-2016 JohnK Have desktop versions respect mediump/lowp

30 12-Apr-2016 JohnK Restrict spec-const operations to non-float

29 5-Apr-2016 JohnK Clarify disallowance of spec-const arrays in

initializers

28 7-Mar-2016 JohnK Make push_constants not have sets

27 28-Feb-2016 JohnK Make the default be origin_upper_left

26 17-Feb-2016 JohnK Expand specialized array semantics

25 10-Feb-2016 JohnK Incorporate resolutions from the face to face

24 28-Jan-2016 JohnK Update the resolutions from the face to face

23 6-Jan-2016 Piers Remove support for gl_VertexID and

gl_InstanceID since they aren't supported by

Vulkan.

22 29-Dec-2015 JohnK support old versions and add semantic mapping

21 09-Dec-2015 JohnK change spelling *subpass* -> *subpassInput* and

include this and other texture/sample types in

the descriptor-set-0 default scheme

20 01-Dec-2015 JohnK push_constant default to std430, opaque types

can only aggregate as arrays

19 25-Nov-2015 JohnK Move "Shadow" from texture types to samplerShadow

18 23-Nov-2015 JohnK Bug 15206 - Indexing of push constant arrays

17 18-Nov-2015 JohnK Bug 15066: std140/std430 defaults

16 18-Nov-2015 JohnK Bug 15173: subpass inputs as arrays

15 07-Nov-2015 JohnK Bug 14683: new rules for separate texture/sampler

14 07-Nov-2015 JohnK Add specialization operators, local_size_*_id

rules, and input dvec3/dvec4 always use two

locations

13 29-Oct-2015 JohnK Rules for input att. numbers, constant_id,

and no subpassLoadMS()

12 29-Oct-2015 JohnK Explain how gl_FragColor is handled

11 9-Oct-2015 JohnK Add issue: where can sampler constructors be

10 7-Sep-2015 JohnK Add first draft specification language

9 5-Sep-2015 JohnK - make specialization id's scalar only, and

add local_size_x_id... for component-level

workgroup size setting

- address several review comments

8 2-Sep-2015 JohnK switch to using the *target* style of target

types (bug 14304)

7 15-Aug-2015 JohnK add overview for input targets

6 12-Aug-2015 JohnK document gl_VertexIndex and gl_InstanceIndex

5 16-Jul-2015 JohnK push_constant is a layout qualifier

VULKAN is the only versioning macro

constantID -> constant_id

4 12-Jul-2015 JohnK Rewrite for clarity, with proper overview,

and prepare to add full semantics

3 14-May-2015 JohnK Minor changes from meeting discussion

2 26-Apr-2015 JohnK Add controlling features/capabilities

1 26-Mar-2015 JohnK Initial revision

License: Attribution, NonCommercial, NoDerivatives (CC BY-NC-ND)


While reading the Vulkan specification, you run into the word "opaque" quite often.

Command pools are opaque objects that command buffer memory is allocated from, and which allow the implementation to amortize the cost of resource creation across multiple command buffers.

- Source: 5.2. Command Pools, Vulkan 1.1.83 Specification.

I was curious what this meant, so I looked it up and found that Wikipedia has an entry on the "opaque data type".

In computer science, an opaque data type is a data type whose concrete data structure is not defined in an interface. This enforces information hiding, since its values can only be manipulated by calling subroutines that have access to the missing information.

- Source: Opaque data type, Wikipedia.

Since "opaque" means "not transparent" or "hard to understand", the term seems apt. A "concrete data structure" appears to be the counterpart of an "abstract data structure".

I have been using C/C++ for more than 15 years, so it is a little embarrassing that I had not properly understood such a basic concept. Anyway, in C++ terms, you can think of an "abstract data type" as an interface (abstract class) and a "concrete data type" as an actual class.

If you look up the definitions of particular Vulkan objects, you will find macros like the following. VK_DEFINE_HANDLE denotes a dispatchable object, and VK_DEFINE_NON_DISPATCHABLE_HANDLE denotes a non-dispatchable object. Explaining the difference is beyond the scope of this post, so I will not go into detail here and will cover it separately later.

VK_DEFINE_HANDLE(VkInstance)
VK_DEFINE_HANDLE(VkPhysicalDevice)
VK_DEFINE_HANDLE(VkDevice)
VK_DEFINE_HANDLE(VkQueue)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkSemaphore)
VK_DEFINE_HANDLE(VkCommandBuffer)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkFence)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDeviceMemory)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkBuffer)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkImage)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkEvent)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkQueryPool)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkBufferView)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkImageView)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkShaderModule)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkPipelineCache)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkPipelineLayout)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkRenderPass)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkPipeline)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDescriptorSetLayout)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkSampler)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDescriptorPool)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkDescriptorSet)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkFramebuffer)
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkCommandPool)

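Looking at how these macros are defined, there is no concrete definition of the type; they simply declare a pointer. The actual contents are filled in by whoever implements the library. This declaration puzzled me at first, but it makes sense once you see it as a way of implementing an opaque object.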
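As far as I recall, vulkan_core.h defines `VK_DEFINE_HANDLE(object)` as `typedef struct object##_T* object;`. The same opaque-pointer pattern can be reproduced in a few lines of plain C. In this sketch the "header" part is all an application would see, and the "implementation" part would normally live inside the library. Every `My...` and `my_...` name is hypothetical, made up for this illustration and unrelated to the real Vulkan headers.

```c
#include <assert.h>
#include <stdlib.h>

/* --- "Header" side: what the application sees ------------------- */
#define MY_DEFINE_HANDLE(object) typedef struct object##_T *object;

MY_DEFINE_HANDLE(MyCounter)  /* pointer to an incomplete (opaque) struct */

MyCounter my_create_counter(int start);
int       my_counter_next(MyCounter counter);
void      my_destroy_counter(MyCounter counter);

/* --- "Implementation" side: hidden inside the library ----------- */
struct MyCounter_T {
    int value;
};

MyCounter my_create_counter(int start)
{
    MyCounter counter = malloc(sizeof *counter);
    if (counter != NULL)
        counter->value = start;
    return counter;
}

/* Returns the current value, then advances the counter. */
int my_counter_next(MyCounter counter)
{
    assert(counter != NULL);
    return counter->value++;
}

void my_destroy_counter(MyCounter counter)
{
    free(counter);
}
```

Code that only sees the header side can hold and pass around a `MyCounter`, but it cannot touch `value` directly. It has to go through the functions, which is exactly the "opaque" property described above.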

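As long as the name is the same, the implementation is free to put anything inside. In Vulkan, you create an object by calling a vkCreateXXX() function and pass the required information through a VkXXXCreateInfo structure. As described above, the internals of the object are unknown to you, but you can work with it through specific functions.

As you can see, the handles are declared as pointer types. Strictly speaking, as far as I know, non-dispatchable handles are pointers only on 64-bit builds and are declared as 64-bit integers elsewhere. It is easy to forget that the type itself is already a pointer when declaring variables of these types, which is one of the things to watch out for when using Vulkan.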


Motive

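I recently became interested in Vulkan and have been implementing things while following a tutorial. But abstracting it into C++ classes has become increasingly stressful.

Vulkan is fundamentally a C API, so its objects are all structures and there are no member functions. As a result there are a great many functions that you have to look up one by one, and you have to fill in sType for every structure by hand. Since there are no type traits (which rely on C++ templates), I suppose this is unavoidable if structures are to be validated.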
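To show why a C API might want a type tag in every structure, here is a minimal sketch of sType-style validation. All names (`MyStructureType`, `MyBufferCreateInfo`, `my_validate_buffer_create_info`) are hypothetical stand-ins; the real Vulkan enums and structures are larger and are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type tags, in the spirit of VkStructureType. */
typedef enum MyStructureType {
    MY_STRUCTURE_TYPE_BUFFER_CREATE_INFO = 1,
    MY_STRUCTURE_TYPE_IMAGE_CREATE_INFO  = 2
} MyStructureType;

/* Hypothetical create-info structure with a leading type tag. */
typedef struct MyBufferCreateInfo {
    MyStructureType sType;  /* must be MY_STRUCTURE_TYPE_BUFFER_CREATE_INFO */
    const void     *pNext;  /* optional extension chain */
    size_t          size;   /* requested size in bytes */
} MyBufferCreateInfo;

/* Without templates or type traits, the callee can only check the tag
   at run time to confirm it was handed the structure it expects. */
bool my_validate_buffer_create_info(const MyBufferCreateInfo *info)
{
    assert(info != NULL);
    return info->sType == MY_STRUCTURE_TYPE_BUFFER_CREATE_INFO
        && info->size > 0;
}
```

In C++ the compiler could deduce the tag from the static type; in C the caller has to write it out, which is the chore complained about above.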

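On top of that, calls take many parameters, and it is hard to tell which object the thing you are creating belongs to.

Wouldn't it be nice if the functions related to VkPhysicalDevice were gathered in a class, as in D3D?

As a developer used to C++, I find Vulkan more scattered and harder to follow than D3D.

So I talked with a colleague at work about the question "Why was Vulkan written in C?" Several opinions came up, such as "the developers dislike C++", "there is a performance difference", "they want to keep API compatibility with OpenGL", and "they have no intention of making our lives easier", but we did not find a convincing answer.

I could not figure out why C was used rather than C++. Then, while reading the "Vulkan Specification" [ 1 ], I found a clue.

Section 2.4 says the following.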

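The mechanism by which Vulkan is made available to applications is platform- or implementation-defined. On many platforms the C interface described in this Specification is provided by a shared library. Since shared libraries can be changed independently of the applications that use them, they present particular compatibility challenges, and this Specification places some requirements on them.

Shared library implementations must use the default Application Binary Interface ( ABI ) of the standard C compiler for the platform, or provide customized API headers that cause application code to use the implementation's non-default ABI. An ABI in this context means the size, alignment, and layout of C data types; the procedure calling convention; and the naming convention for shared library symbols corresponding to C functions. Customizing the calling convention for a platform is usually accomplished by defining calling convention macros appropriately in vk_platform.h.

On platforms where Vulkan is provided as a shared library, library symbols beginning with "vk" and followed by a digit or uppercase letter are reserved for use by the implementation. Applications which use Vulkan must not provide definitions of these symbols. This allows the Vulkan shared library to be updated with additional symbols for new API versions or extensions without causing symbol conflicts with existing applications.

Source : [ 1 ].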

In other words, C is used so that everyone shares the same ABI. But C++ has ABIs too, so I wondered why it had to be C in particular.
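The three ingredients the quoted passage lists ( data layout, calling convention, symbol naming ) can be made concrete with a short sketch. DEMO_API_CALL, DemoExtent, and demoArea are hypothetical names; the macro only imitates the kind of calling-convention macro that the specification says lives in vk_platform.h, and the layout checks assume a mainstream platform where a 32-bit integer is 4-byte aligned.

```cpp
#include <cstddef>
#include <cstdint>

// Calling convention: chosen per platform through a macro (hypothetical name).
#if defined(_WIN32)
    #define DEMO_API_CALL __stdcall
#else
    #define DEMO_API_CALL
#endif

// Data layout: a plain C-compatible structure.
struct DemoExtent {
    std::uint32_t width;
    std::uint32_t height;
};

// Symbol naming: extern "C" keeps the exported name free of C++ mangling.
extern "C" std::uint32_t DEMO_API_CALL demoArea(const DemoExtent* extent)
{
    return extent->width * extent->height;
}

// Facts about size, alignment, and layout that both sides of the binary
// boundary have to agree on.
static_assert(sizeof(DemoExtent) == 8, "two 32-bit fields without padding");
static_assert(alignof(DemoExtent) == 4, "aligned like a 32-bit integer");
static_assert(offsetof(DemoExtent, height) == 4, "height directly follows width");
```

If an application and a library disagreed on any of these points, the call would still link but would read the wrong bytes or corrupt the stack, which is why the specification pins them to the platform's default C ABI.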

C ABI vs C++ ABI

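The Application Binary Interface became widely known as Android development made cross-platform development mainstream. Until then I had only targeted the Windows platform myself, so I did not even know ABIs existed.

To be honest, I had little interest in mobile development, so I thought ABI was a concept specific to Android. Names like "armeabi", "armeabi-v7a", and "arm64-v8a" did not mean much to me [ 3 ]. Engines such as UE4 and Unity 3D automate the build, so ticking a few checkboxes was usually enough. But once I started customizing builds with tools like Gradle, it became hard to get work done without understanding these concepts. I now feel that "if I stay ignorant of these concepts, I will be out of a job".

Content developers may get by without it, but I think engine developers should at least grasp the basics whenever an opportunity ( or a need ) to study such a topic comes up. So I decided to take this chance to find out what an ABI is.

Wikipedia defines ABI as follows.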

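In computer software, an application binary interface ( ABI ) is an interface between two binary program modules; often, one of these modules is a library or operating system facility, and the other is a program that is being run by a user.

An ABI defines how data structures or computational routines are accessed in machine code, which is a low-level, hardware-dependent format; in contrast, an API defines this access in source code, which is a relatively high-level, hardware-independent, often human-readable format. A common aspect of an ABI is the calling convention, which determines how data is provided as input to, or read as output from, computational routines; examples are the x86 calling conventions.

Source : [ 4 ].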

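The problem, however, is that compatibility is hard to guarantee for C++ ABIs.

There are ABIs that standardize details such as C++ name mangling, exception propagation, and calling conventions between compilers on the same platform, but they do not require cross-platform compatibility.

C++ supports function overloading, so several functions with the same name can exist in the same class. For that reason a C++ compiler performs what is called name mangling, and the scheme differs from compiler to compiler. The image below is taken from [ 2 ], which builds several examples and explains in detail how symbol names are generated in C and C++.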
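Independently of the figure from [ 2 ], the effect of overloading on symbol names can be sketched in a few lines. The function names are made up for this illustration; the mangled names in the comments are what the Itanium C++ ABI ( used by GCC and Clang ) and MSVC are expected to produce for these exact global signatures.

```cpp
// Two overloads share one source-level name, so the compiler has to encode
// the parameter types into each symbol to keep them apart.
//   Itanium C++ ABI (GCC, Clang): _Z3addii      and _Z3adddd
//   MSVC:                         ?add@@YAHHH@Z and ?add@@YANNN@Z
int add(int a, int b) { return a + b; }
double add(double a, double b) { return a + b; }

// extern "C" switches mangling off: the symbol is the plain name, possibly
// with a platform-specific leading underscore. The price is that this name
// can no longer be overloaded.
extern "C" int demo_add_int(int a, int b) { return a + b; }
```

On a typical Linux toolchain the symbols can be inspected by compiling to an object file and listing it with a tool such as nm.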

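As you can see, the names differ so much between compilers that looking something up by symbol name is bound to give different results. For example, to create a monitoring object for the validation layer in Vulkan, you have to do the following.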

bool Renderer::_SetupDebugCallback()
{
	if( _bUseValidationLayer && IsInstanceLayerSupported( "VK_LAYER_LUNARG_standard_validation" ) )
	{
		// The messenger functions come from an extension, so the entry point is looked up by name at run time.
		auto func = reinterpret_cast< PFN_vkCreateDebugUtilsMessengerEXT >(
				vkGetInstanceProcAddr( _VkInstance, "vkCreateDebugUtilsMessengerEXT" ) );
		if( nullptr == func )
		{
			// The lookup yields null when the entry point is not available.
			return false;
		}

		// Select which severities and message types are forwarded to the callback.
		VkDebugUtilsMessengerCreateInfoEXT createInfo = {};
		createInfo.sType = VK_STRUCTURE_TYPE_DEBUG_UTILS_MESSENGER_CREATE_INFO_EXT;
		createInfo.messageSeverity = VK_DEBUG_UTILS_MESSAGE_SEVERITY_VERBOSE_BIT_EXT |
				VK_DEBUG_UTILS_MESSAGE_SEVERITY_WARNING_BIT_EXT |
				VK_DEBUG_UTILS_MESSAGE_SEVERITY_ERROR_BIT_EXT;
		createInfo.messageType = VK_DEBUG_UTILS_MESSAGE_TYPE_GENERAL_BIT_EXT |
				VK_DEBUG_UTILS_MESSAGE_TYPE_VALIDATION_BIT_EXT |
				VK_DEBUG_UTILS_MESSAGE_TYPE_PERFORMANCE_BIT_EXT;
		createInfo.pfnUserCallback = StaticDebugCallback;

		VkResult result = func( _VkInstance, &createInfo, nullptr, &_VkDebugMessenger );
		return ( VK_SUCCESS == result );
	}

	// Nothing to set up when validation is disabled or the layer is missing.
	return true;
}

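If the "vkCreateDebugUtilsMessengerEXT" function followed C++ mangling rules, there would be no way of knowing which rule to use to find its name.

C, on the other hand, has no mangling rules to speak of. Depending on the platform, the symbol is either the function name as it is or the name with a single "_" prepended. See [ 2 ] if you want to know more about this.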
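Name-based lookup of this kind can be imitated without Vulkan. The sketch below is a toy under stated assumptions: demoGetProcAddr plays the role that vkGetInstanceProcAddr plays in the code above, and every demo* function and type is hypothetical.

```cpp
#include <cstring>

extern "C" {
    // Generic function pointer type used as the return type of the lookup.
    typedef void (*DemoVoidFunction)(void);

    // Typed function pointer, in the spirit of the PFN_* typedefs.
    typedef int (*PFN_demoCreateMessenger)(int severity);

    // Two entry points with plain C names.
    int demoCreateMessenger(int severity) { return severity > 0 ? 0 : -1; }
    int demoDestroyMessenger(int) { return 0; }
}

// Returns the entry point registered under the given plain name, or nullptr.
// This works only because the caller and the library agree on the exact
// spelling of the name, which is what an unmangled C symbol provides.
DemoVoidFunction demoGetProcAddr(const char* name)
{
    struct Entry { const char* name; DemoVoidFunction function; };
    static const Entry table[] = {
        { "demoCreateMessenger",  reinterpret_cast<DemoVoidFunction>(&demoCreateMessenger) },
        { "demoDestroyMessenger", reinterpret_cast<DemoVoidFunction>(&demoDestroyMessenger) },
    };
    for (const Entry& entry : table) {
        if (std::strcmp(entry.name, name) == 0) {
            return entry.function;
        }
    }
    return nullptr;
}
```

The caller casts the generic pointer back to the typed one before calling it, just as the renderer code casts the result to PFN_vkCreateDebugUtilsMessengerEXT.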

Vulkan.hpp

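Using C for the sake of compatibility is fine, but it is awfully inconvenient.

So Khronos supports C++ through the "Vulkan.hpp" header. You include it after optionally defining VULKAN_HPP_NAMESPACE.

If VULKAN_HPP_NAMESPACE is not defined, it defaults to "vk". That did not look very stylish to me, so I went with "Vulkan" instead.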
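The defaulting behaviour can be mimicked without the SDK. In the sketch below only VULKAN_HPP_NAMESPACE and the header path in the comment refer to the real library; DEMO_HPP_NAMESPACE and apiVersionMajor are hypothetical stand-ins so that the block compiles on its own.

```cpp
// In a real project the two lines would presumably be:
//   #define VULKAN_HPP_NAMESPACE Vulkan
//   #include <vulkan/vulkan.hpp>
// Everything below only imitates the header's defaulting logic.

#define DEMO_HPP_NAMESPACE Vulkan   // remove this line to fall back to "vk"

// What the header does internally: supply a default if nothing was chosen.
#if !defined(DEMO_HPP_NAMESPACE)
#define DEMO_HPP_NAMESPACE vk
#endif

// All wrapper declarations are placed in the selected namespace.
namespace DEMO_HPP_NAMESPACE {
    inline int apiVersionMajor() { return 1; }
}
```

The define has to appear before the include in every translation unit, so a single shared project header or a compiler flag is the usual place for it.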

The functions are encapsulated as class methods, which is very convenient.

The enumerations have also been wrapped, so their names are shorter.
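What such a wrapper does can be sketched over a toy C-style API. None of the names below come from Vulkan.hpp: the Demo* C API and the demo namespace are hypothetical, and only the style ( a class holding the handle, a scoped enumeration with short e-prefixed values ) follows what the real header offers, for example vk::PhysicalDevice::getProperties() and vk::PhysicalDeviceType::eDiscreteGpu.

```cpp
#include <cstdint>

// --- Toy C-style API (hypothetical) ---
typedef struct DemoDevice_T* DemoDevice;

enum DemoDeviceType {
    DEMO_DEVICE_TYPE_INTEGRATED_GPU = 1,
    DEMO_DEVICE_TYPE_DISCRETE_GPU   = 2
};

struct DemoDeviceProperties {
    DemoDeviceType deviceType;
    std::uint32_t  apiVersion;
};

struct DemoDevice_T { DemoDeviceProperties properties; };

void demoGetDeviceProperties(DemoDevice device, DemoDeviceProperties* out)
{
    *out = device->properties;
}

// --- C++ wrapper in the style of Vulkan.hpp ---
namespace demo {

// Scoped enumeration: the long C prefix becomes the enum's type name.
enum class DeviceType {
    eIntegratedGpu = DEMO_DEVICE_TYPE_INTEGRATED_GPU,
    eDiscreteGpu   = DEMO_DEVICE_TYPE_DISCRETE_GPU
};

// The class stores the C handle and exposes the related functions as methods.
class Device {
public:
    explicit Device(DemoDevice handle) : m_handle(handle) {}

    DeviceType getType() const
    {
        DemoDeviceProperties properties = {};
        demoGetDeviceProperties(m_handle, &properties);
        return static_cast<DeviceType>(properties.deviceType);
    }

private:
    DemoDevice m_handle;
};

} // namespace demo
```

The wrapper adds no state beyond the handle, so it stays a thin layer over the C ABI rather than a replacement for it.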

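It should be quite comfortable to play with.

However, using extensions sometimes seems to produce unresolved external symbol link errors. It looked to me as if the C++ side had not been taken care of there, although a likely cause is that the loader library does not export extension entry points, so they have to be fetched at run time with vkGetInstanceProcAddr. So when I used something like the DebugUtilsMessenger, I simply worked in C style.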

Conclusion

Because maintaining compatibility is crucial for cross-platform development, I concluded that Vulkan uses the C ABI of the platform's default compiler. In addition, a C++ wrapper is provided for convenience, so those who prefer C++ can use it. The one drawback is that almost all examples use the C API.

References

[ 1 ] Vulkan 1.1.86 Specification, Khronos.

[ 2 ] On name mangling in C++ ( C++ 상에서 발생하는 name mangling 에 관한 내용 ), 미카 프로젝트.

[ 3 ] ABI Management ( ABI 관리 ), Android Developers.

[ 4 ] Application Binary Interface, Wikipedia.
