T-4: Adapters & Physical/Logical Devices & Device Queues

29 May, 2020

Get Physical Device from Backend to Logical Devices

What is a Device

Details discussed later.

Instantiating Adapters, Logical Device, and Device Queue Group

gfx-hal does not hand us a Physical Device directly; instead, it provides an API to get Adapter instances, each of which holds a Physical Device instance and the details of its Queue Families.

I hope you have already read details on how to get a reference to gfx-hal Instance and its Surface. If not, please read it once before continuing.

We get the list of adapters from the instance.

We need to update our Renderer struct and its implementation a bit for this.

struct Renderer<B: Backend> {
    ...
    // Device Adapter, containing Physical Device and Queue Family details
    adapter: Adapter<B>,
    // Logical Device object
    device: B::Device,
    // Queue Group for rendering reference
    queue_group: family::QueueGroup<B>,
}

impl<B: Backend> Renderer<B> {
    fn new(instance: B::Instance, surface: B::Surface) -> Self {
        let mut adapters = instance.enumerate_adapters();
        let (memory_types, limits, adapter) = {
            let adapter = adapters.remove(0);
            (
                adapter.physical_device.memory_properties().memory_types,
                adapter.physical_device.limits(),
                adapter,
            )
        };

        let (device, queue_group, supported_family) = {
            let supported_family = adapter.queue_families.iter()
                .find(|family| {
                    surface.supports_queue_family(family) && family.queue_type().supports_graphics()
                })
                .unwrap();

            let mut gpu = unsafe {
                adapter.physical_device
                    .open(&[(supported_family, &[1.0])], Features::empty())
                    .unwrap()
            };

            (
                gpu.device,
                gpu.queue_groups.pop().unwrap(),
                supported_family,
            )
        };

        Renderer {
            ...
            adapter,
            device,
            queue_group,
        }
    }
}
...
Code Breakdown

What details does an Adapter have

The following is not a complete list of Adapter properties. It mostly covers the properties we already used in the code above; here we look at them in more detail.
// `adapter.info`: Adapter Info
{
  name: "GeForce GTX 1060 6GB",
  vendor: 4310,
  device: 7114,
  // Enum - { Other = 0, IntegratedGpu = 1, DiscreteGpu = 2, VirtualGpu = 3, Cpu = 4 }
  device_type: DiscreteGpu,
}

The adapter info is quite clear. Getting it is simple: adapter.info. It gives us details about the GPU hardware.
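
Renderer::new simply takes the first adapter in the list. As a minimal sketch of how the info above could guide that choice instead, the snippet below prefers a discrete GPU and falls back to the first adapter. DeviceType and AdapterInfo here are hypothetical stand-ins that only mirror the fields shown in the dump (they are not the gfx-hal types), and pick_adapter and sample_infos are names made up for this illustration.

```rust
// Hypothetical stand-ins for gfx-hal's adapter info types, so that this
// sketch compiles on its own. The fields mirror the dump above.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DeviceType {
    Other,
    IntegratedGpu,
    DiscreteGpu,
    VirtualGpu,
    Cpu,
}

#[allow(dead_code)]
#[derive(Debug, Clone)]
struct AdapterInfo {
    name: String,
    vendor: usize,
    device: usize,
    device_type: DeviceType,
}

/// Returns the index of the first discrete GPU in `infos`.
/// Falls back to index 0 when no discrete GPU is present,
/// and returns `None` only when the list is empty.
#[allow(dead_code)]
fn pick_adapter(infos: &[AdapterInfo]) -> Option<usize> {
    if infos.is_empty() {
        return None;
    }
    let discrete = infos
        .iter()
        .position(|info| info.device_type == DeviceType::DiscreteGpu);
    Some(discrete.unwrap_or(0))
}

/// Sample data: a made-up integrated GPU, followed by the adapter from the dump.
#[allow(dead_code)]
fn sample_infos() -> Vec<AdapterInfo> {
    vec![
        AdapterInfo {
            name: String::from("Example Integrated GPU"),
            vendor: 0,
            device: 0,
            device_type: DeviceType::IntegratedGpu,
        },
        AdapterInfo {
            name: String::from("GeForce GTX 1060 6GB"),
            vendor: 4310,
            device: 7114,
            device_type: DeviceType::DiscreteGpu,
        },
    ]
}
```

With the sample data, pick_adapter(&sample_infos()) returns Some(1), the index of the GTX 1060. In the real Renderer, the same idea would mean choosing which index to pass to adapters.remove(..) instead of always using 0.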

// `adapter.physical_device.limits()`: Physical Device Limits
{
  max_image_1d_size: 16384,
  max_image_2d_size: 16384,
  max_image_3d_size: 2048,
  max_image_cube_size: 16384,
  max_image_array_layers: 2048,
  max_texel_elements: 268435456,
  // ...and more
}

GPU limits are also mostly self-explanatory. limits() returns a struct describing the upper bounds of the hardware, such as the maximum image dimensions shown above, along with memory and concurrency limits.

// Memory Types: `adapter.physical_device.memory_properties().memory_types`
[
    MemoryType {
        properties: DEVICE_LOCAL,
        heap_index: 0,
    },
    MemoryType {
        properties: CPU_VISIBLE | COHERENT,
        heap_index: 1,
    },
    //...more
]

I won't comment too much on MemoryTypes, as the details of DEVICE_LOCAL and CPU_VISIBLE memory are still unknown to me at this point.

Direct Quote from gfx-hal examples

Using CPU_VISIBLE memory is convenient because it can be directly memory-mapped and easily updated by the CPU, but it is very slow and so should only be used for small pieces of data that need to be updated very frequently. For something like a vertex buffer that may be much larger and should not change frequently, you should instead use a DEVICE_LOCAL buffer that gets filled by copying data from a CPU_VISIBLE staging buffer.

From the above quote, I gather that memory_types are used to create specific kinds of buffers, each of which is efficient for some uses and not for others.
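
To make that a bit more concrete, here is a minimal sketch of how a memory type could be picked from such a list. It assumes the usual pattern where a resource reports a bit mask of acceptable memory types (bit i set means the type at index i is allowed), and we look for the first allowed type that has the properties we want. Properties and MemoryType below are hypothetical stand-ins that mirror the dump above, not the gfx-hal types, and the bit values, find_memory_type and sample_memory_types are made up for this illustration.

```rust
// Hypothetical stand-in for a memory property bit set.
// The bit values are arbitrary and chosen only for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Properties(u32);

#[allow(dead_code)]
impl Properties {
    const DEVICE_LOCAL: Properties = Properties(0b0001);
    const CPU_VISIBLE: Properties = Properties(0b0010);
    const COHERENT: Properties = Properties(0b0100);

    /// Combines two property sets.
    fn union(self, other: Properties) -> Properties {
        Properties(self.0 | other.0)
    }

    /// True when every bit of `other` is also set in `self`.
    fn contains(self, other: Properties) -> bool {
        (self.0 & other.0) == other.0
    }
}

// Hypothetical stand-in mirroring the entries of the dump above.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
struct MemoryType {
    properties: Properties,
    heap_index: usize,
}

/// Returns the index of the first memory type that is allowed by
/// `type_mask` (bit `i` set means index `i` is allowed) and that has
/// all of the `wanted` properties.
#[allow(dead_code)]
fn find_memory_type(
    memory_types: &[MemoryType],
    type_mask: u64,
    wanted: Properties,
) -> Option<usize> {
    memory_types.iter().enumerate().position(|(id, memory_type)| {
        (type_mask & (1u64 << id)) != 0 && memory_type.properties.contains(wanted)
    })
}

/// The two memory types listed in the dump above.
#[allow(dead_code)]
fn sample_memory_types() -> Vec<MemoryType> {
    vec![
        MemoryType {
            properties: Properties::DEVICE_LOCAL,
            heap_index: 0,
        },
        MemoryType {
            properties: Properties::CPU_VISIBLE.union(Properties::COHERENT),
            heap_index: 1,
        },
    ]
}
```

With both types allowed (a mask of 0b11), asking for DEVICE_LOCAL gives index 0 and asking for CPU_VISIBLE gives index 1. This matches the split described in the quote: a CPU_VISIBLE staging buffer on one memory type, and a DEVICE_LOCAL buffer on the other.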

I can understand if things are getting too intense. Be patient and force yourself to complete the whole tutorial. Ultimately the results will be excellent. Once we are done showing graphics on the Window, everything here will make sense.


Logical Device

How to get Logical Devices from Physical Device

As you can see from the above image, a logical device is nothing but a representation of an actual physical device.

A Physical Device (like an NVIDIA GPU) can be used for many things: games, graphics rendering, data mining, machine learning, and more. This range of use cases is possible because a GPU supports general-purpose compute tasks (a single operation applied to lots of data, which benefits from the GPU's large number of cores working in parallel) as well as graphics tasks. For now, we are looking for a device capability specific to graphics-intensive work.

Thus a Logical Device is a representation of a Physical Device, opened with the specific capabilities we asked for (which is why we used supported_family to open it), that our app works with for as long as it is running.

Logical Devices are used to create and manage different resources, like buffers, shader programs, and textures.

Device Queues & Queue Families

What are Device Queues anyway? As the name suggests, a Device Queue is just a queue. Every GPU driver exposes queues bound to its hardware, which take commands from our application and process them. Work submitted to different queues can potentially run in parallel, so queues are how we feed graphics commands to the GPU.

What are Queue Families anyway? A Queue Family is a group of queues that share the same capabilities. It tells us what kind of work those queues can handle: compute operations, transfer (copy) operations, graphics/render operations, and more. If our GPU exposes several families, we need to choose the one that supports the operations we want to perform through our Logical Device.

Cool! Let's now discuss our above Code example.

Getting a supported_family was crucial because it helped us get a specific Logical Device (gpu.device) and Queue groups (gpu.queue_groups). Supported Family defines what kind of operation we want to perform using the device and queues.

A Queue Group holds the queues that were opened for a family. The commands we record into a Command Buffer are later submitted to these queues.

Note: Opening a Physical Device to get a Logical Device is an unsafe operation in gfx-hal, which is why that call is wrapped in an unsafe {} block.

To get details on unsafe usage, read the Rustonomicon.

How do they look internally:

Logical Device representation is quite complex; thus, I won't describe it here. Better to read Device Docs, and understand how to use its APIs.

Each Queue Family describes one set of queues supported by the GPU; adapter.queue_families lists all of them.

(Since I was using macOS) I got supported_family as shown below:

2020-05-16T19:22:41.155852+05:30 DEBUG enumerate_devices - >>>>>>> Queue Family Type:: General
2020-05-16T19:22:41.155987+05:30 DEBUG enumerate_devices - >>>>>>> Queue Max Queues:: 1
2020-05-16T19:22:41.156018+05:30 DEBUG enumerate_devices - >>>>>>> Queue Id:: QueueFamilyId(0)

If you look at the family id, it matches the first Queue Group in gpu.queue_groups, which contains the queues opened from that family.
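
As a small sketch of that relationship, the snippet below looks up a Queue Group by its family id. QueueFamilyId and QueueGroup here are hypothetical stand-ins (I am assuming each group records the id of the family it was opened from), the queues are represented by plain strings, and group_for_family and sample_groups are names made up for this illustration.

```rust
// Hypothetical stand-in for the family id printed in the log above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct QueueFamilyId(usize);

// Hypothetical stand-in for a queue group. Assumption: each group records
// the id of the family it was opened from. Real command queues are
// replaced by plain strings, since only the lookup matters here.
#[derive(Debug)]
struct QueueGroup {
    family: QueueFamilyId,
    queues: Vec<&'static str>,
}

/// Returns the queue group that was opened from the family with the given id.
#[allow(dead_code)]
fn group_for_family(groups: &[QueueGroup], id: QueueFamilyId) -> Option<&QueueGroup> {
    groups.iter().find(|group| group.family == id)
}

/// One group with a single queue, matching the macOS log above
/// (family id 0, max queues 1).
#[allow(dead_code)]
fn sample_groups() -> Vec<QueueGroup> {
    vec![QueueGroup {
        family: QueueFamilyId(0),
        queues: vec!["queue 0"],
    }]
}
```

Since we opened only one family in Renderer::new, there is only one group to find, which is why gpu.queue_groups.pop() is enough in our code.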

The above representation is for the Metal backend on macOS, which looks very different from the Vulkan Queue Families you would see on Linux.

Please do not get confused by the above log, as it differs from system to system. Following is a representation of all adapter.queue_families (on Linux for the Vulkan backend).

// Queue Families
[
  {
    properties: {
      queue_flags: GRAPHICS | COMPUTE | TRANSFER | SPARSE_BINDING,
      queue_count: 16,
      timestamp_valid_bits: 64,
      min_image_transfer_granularity: {
        width: 1,
        height: 1,
        depth: 1,
      },
    },
    device: 0x00005622e6d7d271,
    index: 0,
  },
  {
    properties: {
      queue_flags: TRANSFER,
      queue_count: 1,
      timestamp_valid_bits: 64,
      min_image_transfer_granularity: {
        width: 1,
        height: 1,
        depth: 1,
      },
    },
    device: 0x00005622e6d7d270,
    index: 1,
  }
]
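
To tie this back to Renderer::new, here is a minimal sketch of the family search run against the two families listed above. QueueFlags and QueueFamily are hypothetical stand-ins that keep only the fields needed for the search, and find_family and sample_families are names made up for this illustration. The real code also checks surface.supports_queue_family(family), which is left out here because it needs an actual surface.

```rust
// Hypothetical stand-in for the queue capability flags in the dump above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct QueueFlags(u32);

#[allow(dead_code)]
impl QueueFlags {
    const GRAPHICS: QueueFlags = QueueFlags(0b0001);
    const COMPUTE: QueueFlags = QueueFlags(0b0010);
    const TRANSFER: QueueFlags = QueueFlags(0b0100);
    const SPARSE_BINDING: QueueFlags = QueueFlags(0b1000);

    /// Combines two flag sets.
    fn union(self, other: QueueFlags) -> QueueFlags {
        QueueFlags(self.0 | other.0)
    }

    /// True when every bit of `other` is also set in `self`.
    fn contains(self, other: QueueFlags) -> bool {
        (self.0 & other.0) == other.0
    }
}

// Hypothetical stand-in keeping only the fields this sketch needs.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
struct QueueFamily {
    index: usize,
    queue_flags: QueueFlags,
    queue_count: usize,
}

/// Returns the first family whose flags include everything in `wanted`.
#[allow(dead_code)]
fn find_family(families: &[QueueFamily], wanted: QueueFlags) -> Option<&QueueFamily> {
    families
        .iter()
        .find(|family| family.queue_flags.contains(wanted))
}

/// The two families from the Linux/Vulkan dump above.
#[allow(dead_code)]
fn sample_families() -> Vec<QueueFamily> {
    vec![
        QueueFamily {
            index: 0,
            queue_flags: QueueFlags::GRAPHICS
                .union(QueueFlags::COMPUTE)
                .union(QueueFlags::TRANSFER)
                .union(QueueFlags::SPARSE_BINDING),
            queue_count: 16,
        },
        QueueFamily {
            index: 1,
            queue_flags: QueueFlags::TRANSFER,
            queue_count: 1,
        },
    ]
}
```

Searching for GRAPHICS picks the family at index 0, since the second family only supports TRANSFER. Searching for TRANSFER also returns index 0, because find stops at the first match.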

For a detailed explanation of Queue Families, see this Stack Overflow thread.


Code

You can find the full code for this chapter here: 002-enumerate_devices

To run that code:

cargo run --bin enumerate_devices --features=metal

We don't have any change in the output in this Chapter.

© Copyright 2020 Subroto Biswas
