Junior Graphics Engineer Interview Questions

Junior graphics engineer interviews require a lot more domain-specific knowledge than typical junior software engineer interviews. I've failed many interviews because I didn't fully understand common graphics interview topics, so I made this page to collect the topics I encountered during my interview process. I'm not going to go super in depth on every topic on this page, but I will link to other resources that do go in depth. Not all of these topics will come up in any given interview, but it's good to have familiarity with all of them. If you're interested in my experience applying for graphics jobs, you can read about that here.


Powers of 2

Yes, really. I've been directly asked about them in at least 21 interviews.

I just know that 2^4 = 16, 2^8 = 256, and that I can multiply or divide by 2 to get whatever specific power of 2 I get asked to compute.


Hexadecimal

Like binary, but with 16 digits (0-9, A-F); each hex digit represents 4 bits.

Two's Complement

To create a negative number in two's complement, determine the binary representation of the positive number, flip the bits, and add 1.


Rotation matrix properties

Rotation matrices are orthogonal with determinant 1: the rows and columns are orthonormal, and the inverse is just the transpose.

Transformation Matrices

Model matrix is model -> world space

View matrix is world -> camera space

Projection matrix maps the view frustum into clip space, applying the perspective (or orthographic) projection

Multiplication order matters: typically we scale first, then rotate, then translate, so the combined matrix is written translation * rotation * scale (the rightmost matrix is applied to the vertex first).
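Written out (assuming column vectors, as in OpenGL-style conventions):

```latex
M_{model} = T \, R \, S
\qquad
p_{clip} = P \, V \, M_{model} \, p_{model}
```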

Ray Intersections

A ray is defined as x = o + dt, where x is position, o is the origin, d is the direction, and t is time/distance along the ray. d is normalized (o is just a point).



Dot/Cross products

Dot product

a · b = |a||b|cos θ. Measures how aligned two vectors are: positive when they point the same way, zero when perpendicular, negative when opposed.

Cross product

a × b is a vector perpendicular to both a and b, with magnitude |a||b|sin θ. Commonly used to compute surface normals.


Quaternions

Way to represent rotation with 4D numbers.

Euler Angles

Simple way to represent rotation as three angles (pitch/yaw/roll). Can run into gimbal lock; often better to use quaternions.

Barycentric Coordinates

Coordinate system where you represent a point on a triangle by the weighted sum of the triangle's vertices.

Rendering Equation

For each input direction wi, calculate the light arriving at the surface (irradiance) along wi (Li * wi·n), then multiply that value by the fraction of that light that exits the surface (radiance) in the direction of wo (fr). Then, if the surface itself emits light, calculate the amount of radiance that leaves the surface along wo (Le) and add it to the previous sum. This total is the amount of radiance that leaves the surface along wo (Lo).
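In symbols, with the same terms as above, integrating incoming directions over the hemisphere Ω around the surface normal n:

```latex
L_o(x, \omega_o) = L_e(x, \omega_o)
  + \int_{\Omega} f_r(x, \omega_i, \omega_o)\, L_i(x, \omega_i)\,
    (\omega_i \cdot n)\, d\omega_i
```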

Why is irradiance = Li * wi·n? Irradiance is flux per unit area, and the radiance (Li) plays the role of the flux. To understand how the dot product gives us 'per unit area', let's think about a flashlight. If you hold a flashlight directly above a table, it will form a circle of light on the table. As you tilt the flashlight so that it is at a 45 degree angle to the table, the circle will turn into an ellipse. The radiance in these two cases is the same, but the area covered by the light is larger in the 45 degree case, so the irradiance is smaller. This effect is modeled by the dot product term.

The rendering equation is impossible to solve exactly, so it has to be approximated. The basic rendering equation also does not capture some rendering effects like subsurface scattering, transmission, and volumetric effects. However, it can be extended to accommodate some of these, for example by integrating over the whole sphere instead of the hemisphere to handle transmission.


BXDFs

Bidirectional X Distribution Function, where X is Reflectance (BRDF), Transmittance (BTDF), or Scattering (BSDF, which covers both).

Takes in two inputs: the light direction into a surface (wi) and the light direction out of the surface (wo). The output of a BXDF is the ratio of outgoing radiance to incoming irradiance for the surface. You can also think of it as the amount of light arriving along wi that is reflected out along wo, or the probability that a ray coming in along wi will reflect out along wo.

Properties of BXDFs

Reciprocity - swapping wi and wo gives the same value

Energy conservation - a surface cannot reflect more light than it receives

Non-negativity - the function is never negative

Types of BXDFs

Diffuse/Lambertian - light scatters equally in all directions

Perfect specular - mirror reflection along a single direction

Glossy/microfacet - reflection spread around the mirror direction, controlled by roughness


Sampling

In graphics, we have to approximate things like the rendering equation, and we use various sampling techniques to calculate those approximations. Good sampling also helps avoid aliasing/jagged edges in the images we render.

Importance Sampling

The reason that importance sampling is so important is due to the following observation: whether a given sample contributes 50% or 0.1% to the final sum, the time it takes to evaluate the function at the two sample points is the same. Because of this, we should spend more time evaluating areas of the interval that contribute significantly to the final sum, and less time on areas that do not contribute much.

Graphics Pipeline

Typical stages: vertex shader -> primitive assembly -> rasterization -> fragment shader -> depth test/blending into the framebuffer.



Tile Based Deferred Rendering

Primitives are binned into screen-space tiles, then each tile is rendered separately in fast on-chip memory, which reduces memory bandwidth between pipeline stages. Common on mobile GPUs.

CPU/GPU Architecture

Stack vs Heap

The stack holds local variables and function call frames; allocation is automatic and fast, but the stack is small. The heap is for dynamic allocation (new/malloc); it is much larger and allocations live until freed, but allocation is slower and the heap can fragment.



Cache, Memory, Cache Line

When executing a function, the assembly instructions and data for that function are usually stored sequentially. To speed up execution, the CPU will prefetch a section of instructions and data surrounding the current instruction location and store it in the cache, which is a very small, very fast bit of memory that is close to the CPU. While executing the function, the CPU will check the cache to see if the next instruction/data is in the cache. If it is, there is no need to go back to main memory to fetch the data. However, if it is not in the cache, this is a cache miss, and the data must be fetched from a lower cache level or main memory. The cache line is the amount of data that is read into the cache at once. Since the cache relies on things being sequential, data structures that aren't contiguous can have many cache misses: graphs, trees, linked lists. Arrays have relatively fewer cache misses, since their data is stored sequentially. There are multiple levels of caches; L1 is the fastest.

GPU Architecture

Thread - single invocation of a shader

SIMD group/Warp - a group of threads executed together; the size is fixed by the GPU (typically 32; on AMD hardware it is called a wavefront and is 32 or 64). In a SIMD group, all threads execute in lockstep on a single GPU core.

Threadgroup/Thread block - A group of threads that are dispatched to the GPU together and can share fast threadgroup memory. The threadgroup size is typically set by the programmer.

Thread masking - GPUs execute SIMD groups in lockstep. As an example, if there is an add instruction in a shader, the GPU will fetch the inputs to the add instruction all at once, execute the add all at once, and write the results all at once. This is very efficient, but it causes an issue for shaders with if statements and loops (divergent threads), because some threads need to take one branch while others take the other. To solve this, the GPU will execute both branches of the if statement and use thread masking to ignore the register writes for the threads that are not supposed to be executing the current branch. This is why divergent branching should generally be minimized in GPU code.

Command Buffer - Instructions for a GPU draw call are stored in a command buffer. Stuff like pipeline state, buffer bindings, rendering mode (triangles, points, lines). OpenGL/WebGL do not expose command buffers through their APIs, but they are first class citizens in Metal/Vulkan/other modern APIs.

Command Queue - Command buffers are submitted to the command queue, which holds the command buffers for the draw calls.

Rendering Techniques

Physically Based Rendering

Physically Based Rendering, a way to realistically compute the way light interacts with materials. PBR uses BRDFs to approximate the rendering equation, and is generally photorealistic. PBR materials typically have diffuse/albedo, specular/gloss, roughness, metalness, and emission components. Not all of the components correspond to exact physical parameters, but they are meant to be tweakable by artists.

Forward Rendering

Typical way that the graphics pipeline works. Have some triangles, put them through the vertex shader, compute lighting in the fragment shader, then write to the framebuffer. You might use a depth buffer to discard fragments that are behind geometry that has already been drawn.

Deferred Rendering

While forward rendering works ok, it does not prevent the case where you compute lighting on a fragment that will later be overwritten by another fragment that is closer to the camera. This means that your lighting calculations for the first fragment are wasted (overdraw), which is inefficient. Deferred rendering tries to solve this by deferring lighting to a second compute pass after the triangles have all been rasterized. In deferred rendering, during the fragment shader, the shader writes the various material components to a G-buffer, which is a group of textures containing diffuse/specular/normal/depth information about the triangles. After rasterization, a second compute pass takes the G-buffer and computes lighting. This way lighting is only calculated on pixels that appear on screen.

Deferred rendering has some drawbacks: it uses significantly more texture memory than forward rendering, it can't handle transparent materials well, and it can be difficult to do MSAA with it.

Visibility Buffer Rendering

Visibility buffer rendering solves some issues with deferred rendering. In visibility buffer rendering, the fragment shader writes the triangle/primitive id and draw call id to the visibility buffer. This means you only need a depth + visibility buffer, no G-buffer. Then, in the lighting compute pass, the primitive id and draw call id are used to fetch the MVP matrices and material information from a material buffer, which is then used for lighting.

Shadow Mapping

Shadow mapping is a real-time shadow algorithm. A depth buffer is rendered from the light's point of view (the shadow map). Then, when shading from the camera's point of view, each fragment is projected into light space and its depth is compared against the stored shadow map depth to determine if it should be in light or shadow. Cascaded shadow mapping creates multiple shadow maps for different depth ranges away from the camera to improve the resolution of the shadows.


Mipmapping

Mipmapping is the process of creating lower resolution versions of textures. With each mip level, the width and height are cut in half (pixel count cut to a quarter). The full mip chain takes up about 33% more memory than the original texture. Mipmaps reduce Moiré patterns, and can save on texture memory by loading only a lower resolution version of a texture when it appears far away from the camera.

Tone Mapping

A way to approximate HDR content on a low dynamic range screen.


Bloom

A way to approximate the washed out/glow effect that bright lights cause on camera sensors.

Ray Tracing

A technique to create images by tracing light paths through a scene.

C++/OS stuff

Virtual Functions

Polymorphic classes are allowed to have virtual functions, which allow child classes to implement different versions of the function. The program determines which version to execute at runtime based on the object's dynamic type. Classes with virtual functions get an additional vptr member (increasing the object size) pointing to the vtable, which is essentially a list of the class's virtual function addresses. Pure virtual functions are marked with = 0 and must be implemented by the subclass.

Pointer vs Reference

A pointer can be null, can be reseated to point at a different object, and supports pointer arithmetic. A reference must be bound to an object when it is created and cannot be reseated; prefer references when null is not a valid state.



Templated Classes

Templated classes allow you to write generic code/container classes. However, each new use of the class needs to be compiled individually, which can increase compile time and binary size. They can also be annoying to debug on old versions of C++.


Static Variables

A static variable in a class is shared between all objects of that class; they can all access and modify it.

Size, Padding, Alignment

The size of an object is important, and there is a difference between the padding, alignment, and size of an object. An empty object without a virtual function has a size of 1 byte; with a virtual function it is the size of a pointer (typically 8 bytes on a 64-bit system), since it must hold a vptr.

Mutex and Semaphore

Used for multithreaded programming and locking to prevent race conditions. Mutexes generally allow one thread to access a resource, while a semaphore can allow multiple threads to access a resource.


Paging

Paging allows you to have 'contiguous' virtual memory that is not actually contiguous in physical memory. This is nice for heap memory, which can become fragmented.

Data Structures/Algorithms


DFS/BFS

Depth First Search and Breadth First Search. The DFS/BFS algorithms are pretty simple, and they come up occasionally. The iterative versions are very similar to each other. Iterative DFS is usually better than recursive, since a deep tree could cause the program to run out of stack memory.

 //The main difference between dfsIterative and bfsIterative is one
 //function uses a stack while the other uses a queue.
 void dfsIterative(TreeNode* root) {
     if (root == NULL) return;
     std::stack<TreeNode*> s;
     s.push(root);
     while (!s.empty()) {
         TreeNode* n = s.top();
         s.pop();
         // process n here
         for (TreeNode* child : n->children) {
             if (child != NULL) s.push(child);
         }
     }
 }

 void bfsIterative(TreeNode* root) {
     if (root == NULL) return;
     std::queue<TreeNode*> q;
     q.push(root);
     while (!q.empty()) {
         TreeNode* n = q.front();
         q.pop();
         // process n here
         for (TreeNode* child : n->children) {
             if (child != NULL) q.push(child);
         }
     }
 }

 //DFS can also be done recursively
 void dfsRecursive(TreeNode* root) {
     if (root == NULL) return;
     // process root here
     for (TreeNode* child : root->children) {
         dfsRecursive(child);
     }
 }

Reverse Linked List

Been asked this question twice; it's pretty simple to do in linear time.

 ListNode* reverse(ListNode* head) {
     if (head == NULL) return NULL;
     ListNode* currentNode = head;
     ListNode* nextNode = head->next;
     currentNode->next = NULL;
     while (nextNode != NULL) {
         ListNode* newNextNode = nextNode->next;
         nextNode->next = currentNode;
         currentNode = nextNode;
         nextNode = newNextNode;
     }
     //new head of the list
     return currentNode;
 }

Spatial Data Structures

Used to query spatial data in sub-linear time. Still takes at least linear time to build spatial data structures.

k-d tree - partition data along alternating dimensions

Bounding Volume Hierarchy - Create bounding boxes around objects, then more bounding boxes around groups of objects

Binary Space Partition - Partition data using planes

Octree - recursively splits 3D space into octants; each node has 8 children

Linked List vs Arrays

Linked List - O(1) insertion/deletion given a node, O(n) random access, nodes scattered in memory (many cache misses)

Array - O(1) random access, contiguous in memory (cache friendly), but O(n) insertion/deletion in the middle


Other stuff

Cloth Simulation

Common cloth simulation methods use a mass/spring model. A bunch of cloth vertices are connected by springs that keep the vertices in place. Structural springs connect adjacent vertices in the grid, shear springs connect diagonal neighbors in a face, and flexion/bend springs connect vertices 'two hops' away.

Subdivision Surfaces

Process of smoothing out a surface and adding more geometry. Most common technique is Catmull-Clark. Typically we use half-edge data structures to represent meshes that we want to subdivide.

Fluid Simulation

Fluid simulation involves solving the Navier-Stokes equations, which model forces like gravity, pressure, viscosity, etc. Two main types of fluid simulation: Eulerian and Lagrangian. Lagrangian is about tracking individual particles in the simulation, while Eulerian is about tracking the flow through fixed locations, rather than individual particles. Lagrangian is a particle based approach (mesh free), while Eulerian is a grid/mesh based approach.

Integration Methods

Explicit Euler, Implicit Euler, and Runge-Kutta (RK4) are common integration methods. Explicit Euler is typically unstable for stiff forces that vary quickly with position (like springs). RK4 is more stable, but it's more complicated.

Apple Silicon

If you want to work for Apple (or are just interested), you'll want to be familiar with Apple Silicon.

Node-Based Graphs

Lots of rendering/VFX stuff uses node graphs to represent shaders/render passes/materials/etc.