Video for Linux Two - Devices

Bill Dirks - September 12, 1999

Video for Linux Two (V4L2) is a set of APIs and standards for handling video devices on Linux. Video for Linux Two is a replacement for the Video for Linux API that comes with the kernel.

This is the top-level V4L2 device driver API document. This document describes how to set up V4L2 device nodes, and also specifies some data structures and ioctl codes common to different device types. Supplementing this file are specification documents for each of the different V4L2 device types. Links to these documents are in the table in the next section.
 


Device Types

Video for Linux Two is a suite of related driver specifications for different types of video devices and video-related data. The type of device determines what kind of data is passed via read() and write(), and/or the set of ioctl commands the driver supports. Several device types have been defined or are planned.

V4L2 devices are all Unix char type devices, with a major device number of 81. V4L2 devices have the following names in the /dev tree. (More device types can be added as needed.)
 
Device Name     Type of Device
/dev/video     Video capture interface
/dev/vfx     Video effects interface
/dev/codec     Video codec interface
/dev/vout     Video output interface
/dev/radio     AM/FM radio devices
/dev/vtx     Teletext interface chips
/dev/vbi     Data services interface

Device Minor Numbers

Because V4L2 is a catch-all for a wide variety of devices on different kinds of busses, there is no good way for it to automatically assign device minor numbers in a logical and consistent way. Therefore, minor numbers have to be given to V4L2 device drivers on the insmod command line. A single V4L2 device driver module can potentially support multiple devices of a type, and multiple types of devices. For example, a driver may support capture, vbi, and codec device types, in which case minor numbers will need to be specified for all three types of devices. The command line parameters to set minor numbers are the same as the device node name prefixed with "unit_". For example "unit_video" for /dev/video (video capture) devices, or "unit_codec" for codec devices. Valid minor numbers are 0 to 255. For multiple devices driven by a single driver, separate the minor numbers with commas.

For example, a hypothetical driver for a video capture/compression card might export both video capture and video codec interfaces. If you had two such boards in the system, you would need two minors for the capture devices, and two for the codec devices. The parameters might be:

# insmod grabber.o unit_video=0,1 unit_codec=10,11

The device nodes might be:
 
node   major, minor
/dev/video0   81, 0
/dev/video1   81, 1
/dev/codec0   81, 10
/dev/codec1   81, 11

There is no specification for what minor numbers to use with what devices. It is entirely up to the administrator of a system. Obviously, all minor numbers for all devices for all V4L2 drivers must be unique. It is not specified how a driver will handle the case where a minor number is already in use. It may fail the module load. It is expected that driver writers will handle the failure gracefully, and only the offending device will be unavailable.
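
As an illustration of the rules above, the following is a minimal sketch of how a comma-separated minor list such as the value of unit_video could be validated and checked for conflicts. The function names parse_minors and minors_conflict are hypothetical and are not part of V4L2; a real driver would receive these values through the module parameter mechanism rather than parse strings itself.

```c
#include <stdlib.h>

#define MAX_MINOR 255   /* the text above gives 0 to 255 as the valid range */

/* Parse a list such as "0,1" into out[]. Returns the number of minors
 * parsed, or -1 if an entry is malformed, out of range, or does not fit. */
int parse_minors(const char *list, int *out, int max_out)
{
    int n = 0;

    if (*list == '\0')
        return 0;
    for (;;) {
        char *end;
        long v = strtol(list, &end, 10);

        if (end == list || v < 0 || v > MAX_MINOR || n >= max_out)
            return -1;
        out[n++] = (int)v;
        if (*end == '\0')
            return n;
        if (*end != ',')
            return -1;
        list = end + 1;   /* skip the comma and parse the next entry */
    }
}

/* Returns 1 if any minor number appears more than once across both lists. */
int minors_conflict(const int *a, int na, const int *b, int nb)
{
    unsigned char seen[MAX_MINOR + 1] = {0};
    int i;

    for (i = 0; i < na; i++) {
        if (seen[a[i]])
            return 1;
        seen[a[i]] = 1;
    }
    for (i = 0; i < nb; i++) {
        if (seen[b[i]])
            return 1;
        seen[b[i]] = 1;
    }
    return 0;
}
```

With the example command line above, "0,1" and "10,11" parse to two minors each and do not conflict.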

It is not specified what happens when a needed minor number is not on the command line. It is expected that driver writers will include sensible default values so if you have only one V4L2 driver, it will work without manually assigned minor numbers. However, administrators really should specify the minor numbers. In particular, if you have two V4L2 drivers, you will need to specify minors to guarantee that there are no conflicts.

Video for Linux Two applications open and scan the devices to find what they are looking for. Capability queries define what each interface supports.
 


Common V4L2 API Elements

Some concepts, data structures, and ioctl commands are of general use to all or most V4L2 device types. These common API elements are defined below.
 
 

Query Capabilities - VIDIOC_QUERYCAP

All V4L2 device types support this ioctl. This ioctl call is used to obtain the capability information for a device. The driver will fill in a struct v4l2_capability object.
 

struct v4l2_capability
char name[32]   Canonical name for this device. This name is descriptive, but it is also unique for each device. It can be used to build a menu of available devices for a device-select user interface.
int type   Device type
int inputs   Number of video inputs that can be selected
int outputs   Number of video outputs that can be selected
int audios   Number of audio inputs that can be selected
int maxwidth   Best case maximum image width in pixels
int maxheight   Best case maximum image height in pixels
int minwidth   Minimum width in pixels
int minheight   Minimum height in pixels
int maxframerate   Maximum frame rate
__u32 flags   Device capability flags
__u32 reserved[4]   reserved for future capabilities
     
Values for the type field:
V4L2_TYPE_CAPTURE   Is a capture device
V4L2_TYPE_CODEC   Is a CODEC device (has draft specification)
V4L2_TYPE_OUTPUT   Is a video output device (not a graphics display) (has draft specification)
V4L2_TYPE_FX   Is an effects or video filter device (has draft specification)
V4L2_TYPE_VTR   Is a video tape recorder controller device (not spec'ed)
V4L2_TYPE_VBI   Is a VBI device (has draft specification)
V4L2_TYPE_RADIO   Is a radio device (audio and tuning ioctls in capture spec)
     
Capability flags used in the flags field:
V4L2_FLAG_READ   Can capture frames or data via the read() call
V4L2_FLAG_WRITE   Can accept frames or data via the write() call
V4L2_FLAG_STREAMING   Can transfer frames or data asynchronously via pre-allocated buffers
V4L2_FLAG_PREVIEW   Supports automatic video preview
V4L2_FLAG_SELECT   Supports the select() call
V4L2_FLAG_TUNER   Has a tuner of some form
V4L2_FLAG_MONOCHROME   Image capture is grey scale only
V4L2_FLAG_DATA_SERVICE   Does Teletext

The exact meaning of the flags will vary slightly depending on the type of device.

Note that the minimum and maximum image capture dimensions are for comparison purposes only. The actual maximum size you can capture may depend on the capture parameters, including the pixel format, compression (if any), the video standard (PAL is higher resolution than NTSC), and possibly other parameters such as the amount of on-board memory or bus bandwidth on your system. The same applies to the maximum frame rate. The minimum and maximum sizes do not imply that all combinations of height/width within the range are possible.
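
The following sketch shows one way an application might inspect a filled-in struct v4l2_capability. The structure is mirrored locally from the table above so the example compiles on its own, with uint32_t standing in for __u32. The numeric values of the V4L2_TYPE_* and V4L2_FLAG_* symbols are not given in this document, so the values below are placeholders, and the two helper functions are hypothetical.

```c
#include <stdint.h>

/* Placeholder values: assumptions for illustration only. */
#define V4L2_TYPE_CAPTURE    0
#define V4L2_FLAG_READ       0x0001
#define V4L2_FLAG_STREAMING  0x0004
#define V4L2_FLAG_SELECT     0x0010

/* Local mirror of the structure described in the table above. */
struct v4l2_capability {
    char     name[32];
    int      type;
    int      inputs;
    int      outputs;
    int      audios;
    int      maxwidth;
    int      maxheight;
    int      minwidth;
    int      minheight;
    int      maxframerate;
    uint32_t flags;
    uint32_t reserved[4];
};

/* Returns 1 if cap describes a capture device with every flag in `needed`. */
int capture_device_supports(const struct v4l2_capability *cap, uint32_t needed)
{
    return cap->type == V4L2_TYPE_CAPTURE && (cap->flags & needed) == needed;
}

/* Returns 1 if the size lies inside the advertised range. This is only a
 * necessary condition: the range does not promise every size works. */
int size_within_advertised_range(const struct v4l2_capability *cap,
                                 int width, int height)
{
    return width  >= cap->minwidth  && width  <= cap->maxwidth &&
           height >= cap->minheight && height <= cap->maxheight;
}
```

Because the advertised dimensions are for comparison only, a size that passes this check may still be refused when the format is actually set.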
 
 

Multiple Opens per Device

In general, V4L2 devices can be opened more than once simultaneously. Some devices may have limitations in that regard. See the documentation for the particular device type. Specifically, some devices will only be able to support I/O operations on one open at a time.

No-I/O Opens

Even when a device can support I/O on only one open at a time, it is still desirable to support additional simultaneous opens for control or administration purposes. For this, V4L2 has the concept of no-I/O opens. No-I/O opens can do many 'get' type ioctls, change UI controls, and do certain other non-disruptive ioctls depending on the type of device. The purpose is to be able to create standard video control panels that can run concurrently alongside an application doing video input or output. Uses for no-I/O opens include UI control panels, changing channels, and performance monitoring. The application indicates an open will not be used for I/O by passing the O_NOIO flag to open().
 
 

The Stream Data Format Structure - struct v4l2_format

V4L2 devices handle multimedia data in the form of streams consisting of buffers containing formatted data. One device can have multiple simultaneous streams. Most streams are video images, but other types are possible. The v4l2_format structure incorporates a union to handle different format structures, and so that more can be added later. The application must always set the type field to indicate which type of format is being used, and which stream it applies to.
 

struct v4l2_format
__u32 type   Set to one of the V4L2_BUF_TYPE_* symbols. Indicates which member of the union is used, and which stream this applies to.
struct v4l2_pix_format fmt.pix   Format structure for video images.
struct v4l2_vbi_format fmt.vbi   Format structure for VBI data.
__u8 fmt.raw_data[200]   Reserves a fixed amount of space. Can also be cast to a custom format structure.

 

Video Image Format Structure - struct v4l2_pix_format

This structure completely defines the layout and format of an image or image buffer, including width, height, depth, pixel format, stride, and total size.
 

struct v4l2_pix_format
__u32 width   Width in pixels
__u32 height   Height in pixels
__u32 depth   Average number of bits allocated per pixel. Does not apply to compressed images.
__u32 pixelformat   The pixel format or type of compression
__u32 flags   Format flags
__u32 bytesperline   Stride from one line to the next, in bytes. Only applies if the V4L2_FMT_FLAG_BYTESPERLINE flag is set.
__u32 sizeimage   Total size of the buffer to hold a complete image, in bytes
__u32 priv   For compression-specific data for user-defined formats. Meaning depends on pixelformat. Set to zero when not used.

The depth is the amount of space allocated in the buffer per pixel, in bits. Each pixel format has a single corresponding depth value.

The pixel information may not fill all bits allocated, e.g. RGB555 and RGB32. Leftover bits are undefined. For planar YUV formats the depth is the average number of bits per pixel. For example, YUV420 is eight bits per component, but the U and V planes are 1/4 the size of the Y plane so the average bits per pixel is 12. The standard pixelformat values are listed in the table below. See the standard pixel format specification for more detailed information concerning pixel formats. Some drivers may support formats not listed here.

Bytesperline is the number of bytes of memory between two adjacent lines. Since most of the time it's not needed, bytesperline only applies if the V4L2_FMT_FLAG_BYTESPERLINE flag is set. Otherwise the field is undefined and must be ignored. For YUV planar formats, it's the stride of the Y plane. When there is line padding, data begins at the start of the buffer, and pad bytes are at the end of lines. The values of pad bytes are undefined.

Sizeimage is usually (width*height*depth)/8 for uncompressed images, but it's different if bytesperline is used since there could be some padding between lines.
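
The size rule above can be sketched as follows. The structure is mirrored locally from the table earlier in this section, with uint32_t standing in for __u32, and the flag values are placeholders because this document does not give them. The helper expected_sizeimage is hypothetical and covers only uncompressed packed (non-planar) formats when bytesperline is in use; the sizeimage value reported by the driver remains authoritative.

```c
#include <stdint.h>

/* Placeholder flag values: assumptions for illustration only. */
#define V4L2_FMT_FLAG_BYTESPERLINE  0x0001
#define V4L2_FMT_FLAG_COMPRESSED    0x0002

/* Local mirror of struct v4l2_pix_format as described in the table above. */
struct v4l2_pix_format {
    uint32_t width;
    uint32_t height;
    uint32_t depth;
    uint32_t pixelformat;
    uint32_t flags;
    uint32_t bytesperline;
    uint32_t sizeimage;
    uint32_t priv;
};

/* Estimate the buffer size in bytes for an uncompressed image.
 * Returns 0 for compressed formats, where depth does not apply. */
uint32_t expected_sizeimage(const struct v4l2_pix_format *f)
{
    if (f->flags & V4L2_FMT_FLAG_COMPRESSED)
        return 0;
    if (f->flags & V4L2_FMT_FLAG_BYTESPERLINE)
        /* Padded lines: each line occupies bytesperline bytes. */
        return (uint32_t)((uint64_t)f->bytesperline * f->height);
    /* No padding: (width * height * depth) / 8 */
    return (uint32_t)(((uint64_t)f->width * f->height * f->depth) / 8);
}
```

For example, a 640x480 image at depth 12 (YUV420) works out to 460800 bytes, and at depth 16 to 614400 bytes.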
 

Symbols for the pixelformat field, and their corresponding depth values
pixelformat   depth   Description (see Image Formats for more information)
V4L2_PIX_FMT_RGB332   8   RGB-3-3-2, one byte-per-pixel RGB
V4L2_PIX_FMT_RGB555   16   RGB-5-5-5 packed RGB format. Extra bit is undefined
V4L2_PIX_FMT_RGB565   16   RGB-5-6-5 packed RGB format
V4L2_PIX_FMT_BGR24   24   RGB-8-8-8 packed into 24-bit words. B is at byte address 0.
V4L2_PIX_FMT_RGB24   24   RGB-8-8-8 packed into 24-bit words. R is at byte address 0.
V4L2_PIX_FMT_BGR32   32   RGB-8-8-8 into 32-bit words. B is at byte address 0. Byte3 is undefined.
V4L2_PIX_FMT_RGB32   32   RGB-8-8-8 into 32-bit words. R is at byte address 0. Byte3 is undefined.
V4L2_PIX_FMT_GREY   8   Linear grey scale. Greater values are brighter.
V4L2_PIX_FMT_YUV410   9   YUV 4:1:0, planar, 8 bits/component. Y plane, 1/16-size U plane, 1/16-size V plane.
V4L2_PIX_FMT_YUV420   12   YUV 4:2:0, planar, 8-bits per component. Y plane, 1/4-size U plane, 1/4-size V plane.
V4L2_PIX_FMT_YUYV   16   YUV 4:2:2, 8 bits/component. Byte0 = Y0, Byte1 = U01, Byte2 = Y1, Byte3 = V01, etc.
V4L2_PIX_FMT_UYVY   16   Same as YUYV, except U-Y-V-Y byte order
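
To make the YUYV and UYVY byte orders in the table concrete, here is a small sketch that fetches the Y, U and V samples for one pixel of a packed 4:2:2 line. The struct yuv type and both function names are hypothetical helpers, not part of V4L2. Each group of four bytes carries two pixels that share one U and one V sample.

```c
#include <stddef.h>
#include <stdint.h>

struct yuv {
    uint8_t y, u, v;
};

/* YUYV: Byte0 = Y0, Byte1 = U01, Byte2 = Y1, Byte3 = V01 */
struct yuv yuyv_pixel(const uint8_t *line, size_t x)
{
    const uint8_t *pair = line + (x / 2) * 4;   /* 4 bytes per 2 pixels */
    struct yuv p;

    p.y = pair[(x % 2) * 2];   /* Y0 at byte 0, Y1 at byte 2 */
    p.u = pair[1];
    p.v = pair[3];
    return p;
}

/* UYVY: the same samples in U-Y-V-Y byte order */
struct yuv uyvy_pixel(const uint8_t *line, size_t x)
{
    const uint8_t *pair = line + (x / 2) * 4;
    struct yuv p;

    p.y = pair[(x % 2) * 2 + 1];   /* Y0 at byte 1, Y1 at byte 3 */
    p.u = pair[0];
    p.v = pair[2];
    return p;
}
```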
         
Flags defined for the v4l2_pix_format flags field
V4L2_FMT_FLAG_BYTESPERLINE   The bytesperline field is valid
V4L2_FMT_FLAG_COMPRESSED   The image is compressed. The depth and bytesperline fields do not apply.
V4L2_FMT_FLAG_INTERLACED   The image consists of two interlaced fields
V4L2_FMT_FLAG_TOPFIELD   The image is the top field of a two-field frame
V4L2_FMT_FLAG_BOTFIELD   The image is the bottom field of a two-field frame
V4L2_FMT_CS_field   Mask for color space field
V4L2_FMT_CS_601YUV   YUV data uses ITU-R601/656 encoding
     

 

Memory-Mapping Device Buffers - VIDIOC_REQBUFS, VIDIOC_QUERYBUF

These ioctls implement a general-purpose protocol for using mmap() to make driver buffers or device memory buffers accessible to the application for I/O.

To map buffers, the application first calls VIDIOC_REQBUFS with a struct v4l2_requestbuffers filled in with the number and type of the buffers that it wants. Upon return the driver will fill in how many buffers it will allow to be mapped, and possibly modify the type field as well. The application should make sure the granted count and type are ok. Note that it is possible that the driver may have a lower limit on the number of buffers required for streaming data. If the driver returns a count value greater than the requested number then that is a lower limit and the application needs to allocate at least that many buffers.

In general, drivers can support many sets of buffers. Each set of buffers is identified by a unique buffer type value. The sets are independent and each set can hold a different type of data.

To map the buffers call VIDIOC_QUERYBUF for each buffer to get the details about the buffer, and call mmap() to map it. VIDIOC_QUERYBUF takes a struct v4l2_buffer object with the index and type fields filled in to indicate which buffer is being queried. The type field must be filled in with the value from the struct v4l2_requestbuffers object. Valid index values range from 0 to count - 1, inclusive. Upon return, the offset and length fields will be filled in with the values that must be passed to mmap() to map the buffer. Only pass offset and length values received from the VIDIOC_QUERYBUF ioctl; the driver needs the exact values it indicated to correctly map the requested buffer. Use MAP_SHARED if the buffer will be used by a forked process. When the buffer is no longer needed the application must call munmap(). The application must munmap() before closing the driver.

For buffers located in system memory, the driver will allocate them on the mmap() call and free them on the munmap() call, if possible. These buffers are allocated in physical memory, so applications should always unmap buffers that are no longer needed to free up system resources.

A common use for memory-mapped buffers is for streaming data to and from drivers with minimum overhead. Drivers will maintain internal queues of buffers and process them asynchronously. The ioctl commands for doing that are described in the device-specific documents. Memory-mapping can also be used for a variety of other purposes. Drivers can define hardware-specific memory buffer types if needed. Use V4L2_BUF_TYPE_PRIVATE_BASE and greater values for such buffer types.

The size of the buffers often depends on the format of the data. For example, the size of capture buffers is determined by the capture image format set by the VIDIOC_S_FMT ioctl. Therefore, once buffers are mapped, that can place a constraint on how the format can be changed. It is recommended to unmap related buffers before changing the format, and to re-request buffers afterward.
 
struct v4l2_requestbuffers
int count   The number of buffers requested or granted
__u32 type   The requested/granted buffer type
__u32 reserved[2]   reserved
     
struct v4l2_buffer
int index   Which buffer number this is or which to query
__u32 type   The buffer type
__u32 offset   Offset parameter to pass to mmap() to allocate this buffer
__u32 length   Length parameter to pass to mmap() to allocate this buffer and the physical length of the buffer
__u32 bytesused   The number of bytes of data in the buffer
__u32 flags   Flags concerning current status of the buffer or the data in the buffer
__s64 timestamp   Timestamp for the frame
struct v4l2_timecode timecode   The timecode for this frame, if any
__u32 sequence   The number of this frame in the sequence. Usually set only by capture drivers.
__u32 reserved[4]   reserved
     
Buffer type codes and flags for the type field of struct v4l2_buffer
V4L2_BUF_TYPE_field   A bitmask to isolate the buffer type field
V4L2_BUF_TYPE_CAPTURE   The buffer is for streaming capture
V4L2_BUF_TYPE_CODECIN   The buffer is an input buffer for an image transform operation
V4L2_BUF_TYPE_CODECOUT   The buffer is an output buffer for an image transform operation
V4L2_BUF_TYPE_EFFECTSIN   Input for an effects device operation
V4L2_BUF_TYPE_EFFECTSIN2   Second input for an effects device operation
V4L2_BUF_TYPE_EFFECTSOUT   Output for an effects device operation
V4L2_BUF_TYPE_VIDEOOUT   The buffer is for streaming video output
V4L2_BUF_TYPE_PRIVATE_BASE   Starting value for driver private buffer types.
     
V4L2_BUF_ATTR_DEVICEMEM   (flag) The buffer is physically located in the device's on-board memory
     
Flags for the flags field of struct v4l2_buffer
V4L2_BUF_ATTR_MAPPED   The buffer is currently memory-mapped
V4L2_BUF_FLAG_QUEUED   The buffer is queued for processing (set by the driver on VIDIOC_QBUF)
V4L2_BUF_FLAG_DONE   The buffer has data in it (set by the driver when the frame is processed, cleared by the driver on VIDIOC_QBUF)
V4L2_BUF_FLAG_KEYFRAME   This frame is a keyframe or I frame (always set for uncompressed)
V4L2_BUF_FLAG_PFRAME   This frame is a predicted frame (only for some compressions)
V4L2_BUF_FLAG_BFRAME   This frame is a bidirectionally predicted frame (only for some compressions)
V4L2_BUF_FLAG_TOPFIELD   This image is a top (odd) field in a field-alternating stream
V4L2_BUF_FLAG_BOTFIELD   This image is a bottom (even) field in a field-alternating stream
V4L2_BUF_FLAG_TIMECODE   The timecode field is valid.
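
The request-then-query sequence described in this section can be sketched as below. So that the example runs without hardware, the two ioctls are replaced by stand-in functions (fake_reqbufs and fake_querybuf) that imitate a driver granting between 2 and 4 buffers; these stand-ins, the helper plan_mappings, the buffer size, and the value of V4L2_BUF_TYPE_CAPTURE are all assumptions. Only the first four fields of struct v4l2_buffer are mirrored here.

```c
#include <stdint.h>
#include <string.h>

#define V4L2_BUF_TYPE_CAPTURE 1   /* placeholder value, an assumption */

struct v4l2_requestbuffers {
    int      count;
    uint32_t type;
    uint32_t reserved[2];
};

struct v4l2_buffer {              /* remaining fields omitted in this sketch */
    int      index;
    uint32_t type;
    uint32_t offset;
    uint32_t length;
};

/* Stand-in for VIDIOC_REQBUFS: grants at least 2 and at most 4 buffers. */
static int fake_reqbufs(struct v4l2_requestbuffers *req)
{
    if (req->count < 2) req->count = 2;
    if (req->count > 4) req->count = 4;
    return 0;
}

/* Stand-in for VIDIOC_QUERYBUF: fills in offset and length for one index. */
static int fake_querybuf(struct v4l2_buffer *b)
{
    if (b->index < 0 || b->index >= 4)
        return -1;
    b->length = 614400;
    b->offset = (uint32_t)b->index * b->length;
    return 0;
}

/* Request `wanted` buffers and collect the mmap() parameters for each one.
 * Returns the granted count, or -1 on failure or if out[] is too small. */
int plan_mappings(int wanted, struct v4l2_buffer *out, int max_out)
{
    struct v4l2_requestbuffers req;
    int i;

    memset(&req, 0, sizeof req);
    req.count = wanted;
    req.type = V4L2_BUF_TYPE_CAPTURE;
    if (fake_reqbufs(&req) != 0 || req.count > max_out)
        return -1;   /* the granted count may be higher than requested */

    for (i = 0; i < req.count; i++) {
        memset(&out[i], 0, sizeof out[i]);
        out[i].index = i;
        out[i].type = req.type;   /* type as returned by the request call */
        if (fake_querybuf(&out[i]) != 0)
            return -1;
        /* A real application would now call mmap() with exactly
         * out[i].length and out[i].offset, and munmap() when done. */
    }
    return req.count;
}
```

Asking for one buffer here yields two, which illustrates the case where the driver's minimum exceeds the requested count.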
     

 

Timecodes

The struct v4l2_timecode structure is designed to hold an SMPTE timecode, or similar timecode. The hours, minutes, seconds and frames fields are normal binary values, not BCD.
 

struct v4l2_timecode
__u8 frames   Frame count, 0...23/24/29, depending on type of timecode
__u8 seconds   Seconds count, 0...59
__u8 minutes   Minutes count, 0...59
__u8 hours   Hours count, 0...29
__u8 userbits[4]   The "user group" bits from the timecode
__u32 flags   Other timecode flags
__u32 type   Frame rate the timecodes are based on
     
Values for the type field
V4L2_TC_TYPE_24FPS   24 frames per second, i.e. film
V4L2_TC_TYPE_25FPS   25 frames per second, i.e. PAL or SECAM video
V4L2_TC_TYPE_30FPS   30 frames per second, i.e. NTSC video
     
Flags for the flags field
V4L2_TC_FLAG_DROPFRAME   Indicates "drop frame" semantics for counting frames in 29.97 fps material.
V4L2_TC_FLAG_COLORFRAME   The "color frame" flag.
V4L2_TC_USERBITS_field   Field mask for the "binary group flags"
V4L2_TC_USERBITS_USERDEFINED   Unspecified format
V4L2_TC_USERBITS_8BITCHARS   8-bit ISO characters
     

 

Controls - VIDIOC_QUERYCTRL, VIDIOC_QUERYMENU, VIDIOC_G_CTRL, VIDIOC_S_CTRL

Devices typically have a number of user-settable controls such as brightness, saturation and so on, which would be presented to the user on a graphical user interface (GUI). But, different devices will have different controls available, and furthermore, the range of possible values, and the default value will vary from device to device. These ioctls provide the information and mechanism to create a nice user interface for these controls that will work correctly with any device.

All controls are accessed using an ID value. V4L2 defines a standard set of control IDs. Drivers can also implement their own device-specific controls using V4L2_CID_PRIVATE_BASE and increasing values. The pre-defined control IDs have the prefix V4L2_CID_, and are listed below. The ID is used when querying the properties of a control, and when getting or setting the current value.

The VIDIOC_QUERYCTRL ioctl fills in a struct v4l2_queryctrl object which describes the parameters of a control. The application must fill in the id field before making the call, and the ioctl will fill in the rest of the structure. If the specified control is not supported the ioctl returns the EINVAL error code. This interface allows for four types of controls: integer-valued, boolean-valued, menus, and buttons. An integer-valued control is usually represented by a slider or thumbwheel GUI element. A boolean-valued control is usually represented by a checkbox or toggle GUI element. A menu control is usually represented by a drop-down menu or radio buttons. A button control is represented by a button which performs some action when clicked. The type field indicates the data type of the control.

To enumerate the controls, call VIDIOC_QUERYCTRL with successive id values starting from V4L2_CID_BASE, and stop when the driver returns the EINVAL error code. After each call to VIDIOC_QUERYCTRL, check the flags field. If the control is supported by the driver then the V4L2_CTRL_FLAG_DISABLED bit will be zero. To enumerate the driver-private controls, use the same algorithm, but use successive id values starting from V4L2_CID_PRIVATE_BASE. On any control, if the V4L2_CTRL_FLAG_GRABBED flag is set, then the value of the control cannot be changed at this time, or using this file descriptor.
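
The enumeration loop just described might look like the sketch below. To keep it runnable without a device, the VIDIOC_QUERYCTRL ioctl is replaced by a stand-in function, fake_queryctrl, which pretends the driver has three controls with the second one disabled. That stand-in, the helper count_enabled_controls, the numeric values of the V4L2_* symbols, and the trimmed-down structure are all assumptions made for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Placeholder values: assumptions for illustration only. */
#define V4L2_CID_BASE            0x00980900u
#define V4L2_CID_PRIVATE_BASE    0x08000000u
#define V4L2_CTRL_FLAG_DISABLED  0x0001u

struct v4l2_queryctrl {           /* trailing fields omitted in this sketch */
    uint32_t id;
    char     name[32];
    int      minimum;
    int      maximum;
    int      step;
    int      default_value;
    uint32_t type;
    uint32_t flags;
};

/* Stand-in for VIDIOC_QUERYCTRL: three controls, the second one disabled. */
static int fake_queryctrl(struct v4l2_queryctrl *qc)
{
    static const char *names[] = { "Brightness", "Contrast", "Saturation" };
    uint32_t n;

    if (qc->id < V4L2_CID_BASE || (n = qc->id - V4L2_CID_BASE) >= 3) {
        errno = EINVAL;           /* unsupported id */
        return -1;
    }
    snprintf(qc->name, sizeof qc->name, "%s", names[n]);
    qc->minimum = 0;
    qc->maximum = 255;
    qc->step = 1;
    qc->default_value = 128;
    qc->flags = (n == 1) ? V4L2_CTRL_FLAG_DISABLED : 0;
    return 0;
}

/* Walk successive ids from first_id until the query fails, counting the
 * controls whose DISABLED bit is clear. */
int count_enabled_controls(uint32_t first_id)
{
    struct v4l2_queryctrl qc;
    uint32_t id;
    int enabled = 0;

    for (id = first_id; ; id++) {
        memset(&qc, 0, sizeof qc);
        qc.id = id;               /* the application fills in id only */
        if (fake_queryctrl(&qc) != 0)
            break;                /* EINVAL ends the scan */
        if (!(qc.flags & V4L2_CTRL_FLAG_DISABLED))
            enabled++;
    }
    return enabled;
}
```

Calling the same function with V4L2_CID_PRIVATE_BASE as the starting id enumerates the driver-private controls, of which the stand-in has none.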

Control types:
 

type   minimum   maximum   step   Discussion
INTEGER   low value   high value   increment (positive)   A numerical-valued setting ranging from minimum to maximum inclusive. The step value indicates the increment between values which are actually different on the hardware.
BOOLEAN   0   1   n/a   A binary-valued setting or mode control. Zero corresponds to 'off' or 'disabled', and one means 'on' or 'enabled'.
MENU   0   N - 1   n/a   A selection among N different choices. A menu with N items on it has N possible values from 0 to N-1, inclusive. Use VIDIOC_QUERYMENU to get the menu item strings.
BUTTON   0   0   n/a   Performs some action when VIDIOC_S_CTRL is called. The value field is ignored.
                 

To get and set the current value of a control, call VIDIOC_G_CTRL or VIDIOC_S_CTRL. These ioctls use a struct v4l2_control. Fill in the id field, and, if setting a value, the value field. All controls are changeable by a non-capturing open. If the id is not a supported control the driver returns EINVAL. If the value is not a legal value the driver may use the nearest legal value, or return ERANGE. If the control is read-only for some reason the driver will return EBUSY on a set attempt.
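
One possible reading of "nearest legal value" for an integer control is sketched below, using the minimum, maximum and step values that VIDIOC_QUERYCTRL reports. The function nearest_legal_value is hypothetical; it assumes legal values start at minimum and advance by step, and a given driver may round differently or return ERANGE instead.

```c
/* Snap `value` to the closest value on the grid minimum, minimum + step, ...
 * and clamp it to the range minimum..maximum inclusive. */
int nearest_legal_value(int value, int minimum, int maximum, int step)
{
    long offset, snapped;

    if (value <= minimum)
        return minimum;
    if (value >= maximum)
        return maximum;
    if (step <= 1)
        return value;             /* every integer in range is distinct */

    offset = (long)value - minimum;
    snapped = minimum + ((offset + step / 2) / step) * step;
    return snapped > maximum ? maximum : (int)snapped;
}
```

An application can apply the same rounding before calling VIDIOC_S_CTRL so that a slider position always corresponds to a value the hardware can actually distinguish.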
 

struct v4l2_queryctrl
__u32 id   A V4L2_CID_* value or driver-defined ID
char name[32]   A suggested label for this control
int minimum   Minimum value
int maximum   Maximum value
int step   The increment between values of an integer control that are distinct on the hardware
int default_value   Driver default value
__u32 type   Control type. One of the V4L2_CTRL_TYPE_* symbols.
__u32 flags   Control flags. V4L2_CTRL_FLAG_* symbols
__u32 category   Control category code, useful for separating controls by function. V4L2_CTRL_CAT_*
char group[32]   A suggested label string for the control group
__u32 reserved[2]   reserved
     
Values for the type field of struct v4l2_queryctrl
V4L2_CTRL_TYPE_INTEGER   An integer-valued control
V4L2_CTRL_TYPE_BOOLEAN   A boolean-valued control
V4L2_CTRL_TYPE_MENU   The control has a menu of choices
V4L2_CTRL_TYPE_BUTTON   A button which performs an action when clicked
     
struct v4l2_querymenu
__u32 id   The control V4L2_CID_* value
int index   The index of the menu item, 0...maximum-1
char item[32]   The menu item string
int reserved   reserved
     
struct v4l2_control
__u32 id   A V4L2_CID_* value or driver-defined ID
int value   The current value, or new value
     
Values for the control id field
V4L2_CID_BRIGHTNESS   Brightness or black level   integer
V4L2_CID_CONTRAST   Contrast or luma gain   integer
V4L2_CID_SATURATION   Color saturation or chroma gain   integer
V4L2_CID_HUE   Hue or color balance   integer
V4L2_CID_WHITENESS   Whiteness for grayscale devices   integer
V4L2_CID_BLACK_LEVEL   Alternate for brightness   integer
V4L2_CID_AUTO_WHITE_BALANCE   Automatic white balance   boolean
V4L2_CID_DO_WHITE_BALANCE   Do a white balance and hold it   button
V4L2_CID_RED_BALANCE   Red chroma balance   integer
V4L2_CID_BLUE_BALANCE   Blue chroma balance   integer
V4L2_CID_GAMMA   Gamma adjust   integer
V4L2_CID_EXPOSURE   Exposure   integer
V4L2_CID_AUTOGAIN   Automatic gain/exposure control   boolean
V4L2_CID_GAIN   Gain control   integer
V4L2_CID_HCENTER   Horizontal image centering   integer
V4L2_CID_VCENTER   Vertical image centering   integer
V4L2_CID_HFLIP   Flip image horizontally   boolean
V4L2_CID_VFLIP   Flip image vertically   boolean
         
V4L2_CID_AUDIO_VOLUME   Audio volume   integer
V4L2_CID_AUDIO_MUTE   Mute audio   boolean
V4L2_CID_AUDIO_BALANCE   Audio stereo balance   integer
V4L2_CID_AUDIO_BASS   Audio bass adjustment   integer
V4L2_CID_AUDIO_TREBLE   Audio treble adjustment   integer
V4L2_CID_AUDIO_LOUDNESS   Audio Loudness mode   boolean
         
V4L2_CID_BASE   Beginning of pre-defined ID values    
V4L2_CID_PRIVATE_BASE   Beginning of driver-defined controls    
         
Values for the control category field
V4L2_CTRL_CAT_VIDEO   Video controls.
V4L2_CTRL_CAT_AUDIO   Audio controls.
V4L2_CTRL_CAT_EFFECT   Effect controls.

 

Device Performance - VIDIOC_G_PERF

The VIDIOC_G_PERF ioctl fills in a struct v4l2_performance object. Drivers will keep running tallies of the number of frames processed or dropped. Applications can compute derived values such as frames/second by polling VIDIOC_G_PERF at periodic intervals.
 

struct v4l2_performance
int frames   Total frames successfully processed since streaming was last turned on or off
int framesdropped   Total frames dropped since streaming was turned on, 0 when streaming is off
__u64 bytesin   Total bytes sent into driver since streaming was last turned on or off
__u64 bytesout   Total bytes sent out of driver since streaming was last turned on or off
__u32 reserved[4]   reserved
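
Deriving frames/second from two polls of VIDIOC_G_PERF could be done along the lines of this sketch. The structure is mirrored locally from the table above, with uint32_t and uint64_t standing in for __u32 and __u64. The two helper functions are hypothetical, and the caller is assumed to measure the elapsed time between polls itself.

```c
#include <stdint.h>

/* Local mirror of the structure described in the table above. */
struct v4l2_performance {
    int      frames;
    int      framesdropped;
    uint64_t bytesin;
    uint64_t bytesout;
    uint32_t reserved[4];
};

/* Average frame rate between two samples taken elapsed_sec apart.
 * Returns -1.0 if the interval is not positive or if the counter went
 * backwards, which suggests streaming was restarted between the polls. */
double frames_per_second(const struct v4l2_performance *earlier,
                         const struct v4l2_performance *later,
                         double elapsed_sec)
{
    if (elapsed_sec <= 0.0 || later->frames < earlier->frames)
        return -1.0;
    return (later->frames - earlier->frames) / elapsed_sec;
}

/* Fraction of frames dropped so far, or 0.0 if nothing has been counted. */
double dropped_fraction(const struct v4l2_performance *p)
{
    double total = (double)p->frames + (double)p->framesdropped;

    return total > 0.0 ? p->framesdropped / total : 0.0;
}
```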