Add interface to query the display color volume and linear display light encoding #106
Conversation
My comments here are food for thought rather than concrete review comments or suggested changes. Maybe you are aware of all of it, or maybe something should be considered; I'm not sure. Hope it helps.
`hdr_html_canvas_element.md` (outdated):

> If present, the `displayColorVolume` attribute specifies the set of colors that
> the screen of the output device can reproduce without significant color volume
> mapping.
This is really good to keep optional. I want to give a little bit of background information.
This is what we want to deliver on Wayland as well, but there is a problem: it is difficult to find out what the screen can emit.
EDID is supposed to tell us:
- if we use default RGB (SDR) signal colorimetry, what the effective primaries and factory default white point are;
- what additional signal colorimetry standards are supported.
HDR video modes so far fall under standard signal colorimetries, which essentially means BT.2020 primaries and white point. The primaries of BT.2020 are unique wavelengths, in other words lasers. Consumer displays tend to use less saturated primaries, so they cannot emit even nearly the full BT.2020 gamut. The problem is, I am not aware of anything in EDID or DisplayID that would tell us what the display can actually emit.
There is a similar problem with dynamic range. CTA block in EDID can tell us the desired luminance range of the display, but that is probably not the emitted luminance range. OTOH, does it matter? If we prepare content to meet the desired luminance range, we should be able to expect that it will be displayed in the best possible way.
I have a hope that Source Based Tone Mapping (SBTM) would fix most of this in the future.
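For what it's worth, the CTA-861.3 static metadata block encodes those desired luminance values as single code bytes; here is a sketch of the commonly described decoding (the function names are mine):

```ts
// Decode the "desired" luminance values of a CTA-861.3 HDR static
// metadata block in EDID. Each input is a code byte in [0, 255]. Note
// these are desired/mastering targets, not measured emitted luminance.

// Desired content max luminance in cd/m^2.
function desiredMaxLuminance(code: number): number {
  return 50 * Math.pow(2, code / 32);
}

// Desired content min luminance in cd/m^2; scaled by the max value.
function desiredMinLuminance(code: number, maxLuminance: number): number {
  return (maxLuminance * Math.pow(code / 255, 2)) / 100;
}

console.log(desiredMaxLuminance(115)); // ≈ 604 cd/m^2
```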
Yes, I think `displayColorVolume` should be used only for high-performance applications that require precise and accurate knowledge of the display capabilities; for everything else, why not use the `dynamic-range` and `color-gamut` media queries?
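For instance, an application that does not need precise capability data could just branch on those queries (a sketch; the chosen rendering paths are only examples):

```ts
// Coarse capability detection via media queries, instead of precise
// displayColorVolume data, for applications that don't need precision.
const hasHdr = window.matchMedia("(dynamic-range: high)").matches;
const hasWideGamut = window.matchMedia("(color-gamut: p3)").matches;

// Choose a rendering path accordingly (the path names are illustrative).
const renderingPath = hasHdr && hasWideGamut ? "hdr-p3" : "sdr-srgb";
```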
... there is also a much larger issue, which was brought up in the past: knowing the gamut and luminance range of a display is not really useful without also knowing the viewing environment (ambient light level, color, etc.). So it begs the question of how useful `displayColorVolume` is without also a means of parameterizing the viewing environment.
Specifying a complete output color system, e.g. Rec. ITU-R BT.2100 PQ, addresses this, since such a color system includes a viewing environment. Note that the signaled viewing environment (dark surround in the case of Rec. ITU-R BT.2100 PQ) might not match the current viewing environment (broad daylight), leaving the task of adapting the image to the platform/display.
Do we need to go into this level of detail for our initial pass at an HDR Canvas API?
Good question.
What if you just took a complete color system as a whole, like BT.2100 PQ and HLG? Leave room to have a custom color system as well, if it turns out that parameterising everything and targeting actual light becomes a use case.
That's what HDMI and DisplayPort essentially do.
As Pierre says, the viewing environment is built into the signal for PQ and needs to be adapted; for HLG, the equations for adaptation are given within the ITU document.
The main use case for this metadata is for when the user/operating system adjusts the diffuse white level. You can't just scale the image, as that may clip the highlights and will change the perception of the shadows and midtones. For example, if you have a 1000 nit monitor, an HLG 75% signal level would normally appear at 203 nits. If you increase the diffuse white to 350 nits and expect the HDR image or video to follow, you would need either a brighter display or to tone-map the highlights to prevent clipping. (I'm working on some example code for this.)
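In the meantime, here is a minimal sketch of the arithmetic behind that 203 nit figure, assuming an achromatic signal and the BT.2100 HLG constants (illustrative only, not the example code mentioned above):

```ts
// BT.2100 HLG: displayed luminance of an achromatic signal level, given
// the display's nominal peak luminance. Shows why a 75% HLG signal lands
// at about 203 cd/m^2 on a 1000 cd/m^2 display.
const a = 0.17883277;
const b = 1 - 4 * a;                 // 0.28466892
const c = 0.5 - a * Math.log(4 * a); // 0.55991073

// HLG inverse OETF: non-linear signal E' -> scene-linear E, both in [0, 1].
function hlgInverseOetf(ePrime: number): number {
  return ePrime <= 0.5
    ? (ePrime * ePrime) / 3
    : (Math.exp((ePrime - c) / a) + b) / 12;
}

// Displayed luminance in cd/m^2 for an achromatic HLG signal level.
function hlgDisplayLuminance(signal: number, peakLuminance: number): number {
  const sceneLinear = hlgInverseOetf(signal);
  // HLG system gamma as a function of nominal peak luminance (BT.2100).
  const gamma = 1.2 + 0.42 * Math.log10(peakLuminance / 1000);
  // OOTF for an achromatic pixel: Ld = Lw * E^gamma.
  return peakLuminance * Math.pow(sceneLinear, gamma);
}

console.log(hlgDisplayLuminance(0.75, 1000)); // ≈ 203 cd/m^2
```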
Right, but where can we find this `displayColorVolume` / `ScreenColorInfo` metadata?
A browser could expect to get it from the OS, but where would the OS get it from?
> As specified below, the platform does not generally apply color volume mapping
> if the color volume of the image is smaller than that of the display.
On one hand this makes sense: you want to present the content as-is when possible.
OTOH, a faithful absolute colorimetric presentation does not produce the intended appearance if the viewing environment differs.
What are "platform" and "display" in this context? Do they include the OS window system?
I can think of three different viewing environments in the pipeline:
- the content colorimetry has been authored for a mastering viewing environment A
- a monitor expects a video signal to be tuned for a reference viewing environment B
- the monitor adjusts the image, based on e.g. the end user brightness and contrast knobs, to look as intended in the actual viewing environment C
BT.2100 seems to have a common reference viewing environment for both PQ and HLG, so if both content and video signal follow either of them, a simple format conversion suffices. Otherwise, something more should probably happen between A and B. That you have covered by implying the viewing environment via `PredefinedColorSpace`, I guess.
From B to C may still be a mapping.
It's also possible that a monitor would not do image adjustment, which means the video signal would need to be tuned for viewing environment C directly. I'm not quite sure how PQ-mode monitors/TVs behave. Maybe that even depends on the vendor and model.
All in all, I would be a bit wary of promising "generally no mapping".
This has been touched on in the context of the HDR headroom: if the headroom is not explicit then the platform still might have to map it depending on the actual available headroom in the current viewing environment.
The same is true for any other viewing environment property the platform cares about (illuminant, etc).
I see multiple use cases:
1. the application provides an HLG or PQ image as-is and the platform performs additional adjustments if needed (the platform might actually perform no adjustments if, for example, the image is full screen and the display supports HLG or PQ natively)
2. (a) the platform provides to the application the nominal min/max luminance of the display, i.e. headroom, (b) the application performs tone mapping, and (c) the platform and the display may or may not perform additional adjustments (ambient light color, user preference, etc.)
3. (a) the platform provides the min/max luminance of the display, ambient light level and color, etc., (b) the application performs all adjustments, and (c) the platform performs no adjustments (the display might still perform adjustments unbeknownst to the platform)
I would think that (1) should be supported no matter what.
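To make step (b) of case 2 concrete, a hedged sketch (the dictionary mirrors this proposal's `ScreenColorInfo`; the roll-off curve itself is a placeholder, not something the proposal specifies):

```ts
// Case 2: application-side tone mapping against the nominal headroom
// reported by the platform. Field names follow the proposal's
// ScreenColorInfo; the curve below is only a placeholder.
interface ScreenColorInfo {
  colorVolume?: unknown;
  referenceWhiteLuminance?: number; // cd/m^2
  minimumLuminance?: number;        // cd/m^2
  maximumLuminance?: number;        // cd/m^2
}

// Input luminance is linear, in multiples of reference white (1.0 = white).
function toneMap(linear: number, info: ScreenColorInfo): number {
  const refWhite = info.referenceWhiteLuminance ?? 203; // BT.2408 nominal
  const peak = info.maximumLuminance ?? refWhite;
  const headroom = peak / refWhite; // e.g. 1000 / 203 ≈ 4.9
  if (headroom <= 1) return Math.min(linear, 1); // no headroom: clip to white
  // Simple exponential shoulder above reference white (placeholder only);
  // output approaches the headroom asymptotically instead of clipping.
  return linear <= 1
    ? linear
    : 1 + (headroom - 1) * (1 - Math.exp(-(linear - 1) / (headroom - 1)));
}
```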
I don't think cases 1, 2 and 3 are really all that separate. The common thing is that the image source (application) describes the conditions of the image it has, and then the operating and window system do the best they can with it. If the application got some information in order to do a better image mapping, then the used information is part of the described image conditions. Of course, I can only speak for Wayland.
We're not really disagreeing here. All I'm saying is that "the platform does not generally apply color volume mapping if the color volume of the image is smaller than that of the display" is a bit much to promise, and also a bit vague. If people start depending on it, you have to strike the word "generally". Otherwise the promise is moot, because it might not always hold.
We won't have that promise on Wayland. The only thing we would promise is that if you tag your image with the output's image description (color space, etc.) exactly, then the image would be sent out in the video signal without adjustments, provided it is not blended with any other window and not subject to compositor effects. (One reason for this is to allow applications to do their own complete color management; another use case is display profiling.)
> All I'm saying is that "the platform does not generally apply color volume mapping if the color volume of the image is smaller than that of the display" is a bit much to promise, and also a bit vague. If people start depending on it, you have to strike the word "generally". Otherwise the promise is moot, because it might not always hold.
I agree that the wording is vague.
AFAIK the idea is that, to the extent that the platform provides information on display capabilities, the platform should be encouraged not to alter colors that fit within the display's capabilities -- unless instructed otherwise by the user, etc.
Is your point that this is not testable and that there are too many scenarios?
> The only thing we would promise is that if you tag your image with the output's image description (color space, etc.) exactly, then the image would be sent out in the video signal without adjustments, provided it is not blended with any other window and not subject to compositor effects.
Could this requirement (i.e. commit to a specific output color system and not modify images that match this color system) be reasonably imposed on the Web Platform in narrow scenarios, e.g., in full screen mode?
Maybe this is too narrow a use case to care about?
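If it were imposed, the script side might look roughly like this (a sketch; the `rec2100-pq` canvas color space is this proposal's, not a shipped API):

```ts
// Hypothetical: commit to a single output color system and go full
// screen, so the UA could plausibly pass pixels through unmodified.
async function enterPqFullscreen(canvas: HTMLCanvasElement) {
  // "rec2100-pq" is the predefined color space proposed here; current
  // browsers only accept values like "srgb" or "display-p3", hence the cast.
  const ctx = canvas.getContext("2d", { colorSpace: "rec2100-pq" } as any);
  await canvas.requestFullscreen();
  return ctx; // draw PQ-encoded content; passthrough is plausible here
}
```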
> Is your point that this is not testable and that there are too many scenarios?
I guess that depends on how strict you want to be. If you want to be lax, then maybe explain what the goal or intention is rather than trying to set rules. It is difficult to infer intention from rules, but a well communicated intention should get everyone on the same page.
Another thing is that color volume includes luminance. I'm not sure if this means absolute or relative luminance. I'm worried that this might be recommending presentation that is not adapted to the viewing conditions.
> Could this requirement (i.e. commit to a specific output color system and not modify images that match this color system) be reasonably imposed on the Web Platform in narrow scenarios, e.g., in full screen mode?
I'm not sure I understood the question, so pardon the wall of text.
Full screen mode is necessary, if we want to relay video HDR metadata straight to the display, because the metadata affects all of the screen. For Wayland desktop compositors that is annoying to implement, because changes in display driving mode or even HDR metadata (EOTF particularly) may take a while to sink in (ha!) or even blank the display for a moment.
Hence it is much more likely that a Wayland compositor chooses one mode (and static metadata set) for driving a display, and converts everything from applications to that. A Wayland compositor may be able to use the display controller hardware to do some or all of the conversion, so it may be possible to avoid GPU work in video playback.
This also means that full screen becomes irrelevant. A compositor converts all contents from all applications to the same, um, compositing color space that is tuned for each display. Even windowed video etc. should look fine and consistent.
If an application like a browser tags its window (or a video plane) with the output's image description, it only means that the Wayland compositor can skip most of the conversions and all automatic perceptual adjustments. Composition still happens in the compositing color space, but round-tripping to that is a no-op, and may even be skipped completely. But now the burden is on the application to follow the compositor's output image description, and there are only two good reasons to do that:
- the application is generating content (e.g. a game) and can choose an arbitrary image description to target for no or little extra cost, or
- the application wants to avoid the automatic conversions of a compositor for whatever reason.
For video playback performance, using the output image description for the video player's window content is detrimental, because it likely forces the video player to use a GPU rendering pass for color conversion, which means there is no opportunity to off-load that conversion to the display controller hardware. Hence, one should pass video frames as-is from a hardware video decoder all the way to the window system, tagging them with the video's image description.
Display controller hardware color conversions are fixed-function and highly optimized for performance and low power consumption, which makes them preferable to the power hungry GPU when at all possible.
Referring to the diagram in https://gitlab.freedesktop.org/pq/color-and-hdr/-/merge_requests/35 , we often cannot and might not even care to control the emitted light directly, exactly because then we also need to take viewing conditions into account ourselves. As much as it pains me to leave the actual light undefined, it makes things much simpler if we can trust that whoever receives our standardized signal also takes care of adapting it appropriately.
This dilemma is at the core of trying to support both entertainment and ICC workflows in the same framework: display-referred "do exactly as I say" vs. scene-referred and display-referred automatically adapted to the display and viewing conditions at hand. In Wayland it seems we have actually let go of even trying to guarantee the actual light at the protocol level. If an end user wants that level of control for their applications, they are required to profile (measure) their monitors and use a particular workflow.
Uhh... am I even on topic here?
Per 2023-09-06 meeting: add support for enumerated values of the color volume, …accurate
Provide reference to dynamic-range media query
> * `referenceWhiteLuminance` specifies the luminance of reference white as
> reproduced by the screen of the output device, and reflects viewing environment
> and user settings.
From a whole-system kind-of view there are a few different ways to implement this...
- The source targets a reference display and a reference viewing environment. The compositor works with the reference conditions. The sink itself is responsible for converting to the actual conditions, possibly with the help of a human changing settings on the sink.
- The source targets a reference display and a reference viewing environment. The compositor is responsible for converting from the reference conditions to the actual conditions. The sink shows the signal as-is (ideally).
- The source targets the actual conditions directly. The compositor does not change the signal. The sink shows the signal as-is.
Your sentence works just fine for all cases. In the first two, all the `ScreenColorInfo` values would be fixed all the time and the display/user will do the adjustment. In the third case the values will be dynamic.
While this probably shouldn't be part of the spec, we really should make sure that it's understood that those values do not necessarily imply real-world luminances and real-world display color volume.
Not making it clear has already caused a generation of desktop displays not having adjustable HDR modes.
> While this probably shouldn't be part of the spec, we really should make sure that it's understood that those values do not necessarily imply real-world luminances and real-world display color volume.
I have tried writing such a sentence a couple of times, and have not found the right words. Maybe something as informal and short as you suggest above would work.
Rename `colorVolumeMetadata` to `contentColorVolume`
Agree with the code updates; tone mapping needs to be added at a later date.
This looks like a good step forward.
In general I would prefer to avoid attempting to expose min/max/white luminance, and instead specify HDR headroom, but we can work that separately.
I'd also argue for using the conversion formulation in #111
`hdr_html_canvas_element.md` (outdated):

> @@ -81,28 +88,19 @@ include the following color spaces.
> "rec2100-hlg",
> "rec2100-pq",
> "rec2100-linear-display",
`rec2100-display-linear`
> below.
> When drawing an image into a Canvas, the image will be transformed unless the
> color spaces of the image and the Canvas match. Annex A specifies transformation
> to and from `rec2100-pq` and `rec2100-hlg`.
`rec2100-pq`, `rec2100-hlg`, and `rec2100-display-linear`.
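As background on what those Annex A conversions involve, the PQ leg rests on the BT.2100 PQ EOTF; a minimal sketch with the standard SMPTE ST 2084 constants (the function name is mine):

```ts
// BT.2100 / SMPTE ST 2084 PQ EOTF: non-linear signal E' in [0, 1] ->
// absolute luminance in cd/m^2 (0 to 10000).
const m1 = 2610 / 16384;        // 0.1593017578125
const m2 = (2523 / 4096) * 128; // 78.84375
const c1 = 3424 / 4096;         // 0.8359375
const c2 = (2413 / 4096) * 32;  // 18.8515625
const c3 = (2392 / 4096) * 32;  // 18.6875

function pqEotf(ePrime: number): number {
  const p = Math.pow(ePrime, 1 / m2);
  return 10000 * Math.pow(Math.max(p - c1, 0) / (c2 - c3 * p), 1 / m1);
}

console.log(pqEotf(0.58)); // ≈ 202 cd/m^2, near the 203 cd/m^2 BT.2408 reference white
```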
`hdr_html_canvas_element.md` (outdated):

> @@ -119,6 +117,12 @@ specified in Rec. ITU-R BT.2100.
> _NOTE: {R', G', B'} are in the range [0, 1], i.e. they are not expressed in
> cd/m<sup>2</sup>_
>
> ### rec2100-linear-display
also `rec2100-display-linear`
```idl
dictionary ScreenColorInfo {
  optional ColorVolumeInfo colorVolume;
  optional double referenceWhiteLuminance;
```
I'd generally prefer an "HDR headroom" formulation, without any reference to `minimumLuminance`, `maximumLuminance`, and `referenceWhiteLuminance`, because those are not generally available and usable.
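A sketch of what that alternative shape could look like (the dictionary is hypothetical, not part of this PR):

```ts
// Hypothetical alternative to absolute luminance fields: expose only the
// ratio of peak luminance to reference white.
interface ScreenHdrInfo {
  // maximumLuminance / referenceWhiteLuminance; 1.0 means no HDR headroom.
  headroom: number;
}

// Relationship to the absolute formulation, where both are available:
//   headroom = maximumLuminance / referenceWhiteLuminance
```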
> illustrated by (c) below.
>
> ![Color Volume Mapping Scenarios](./tone-mapping-scenarios.png)
>
> ## Annex A: Color space conversions
I would prefer to use the formulation outlined in #111.
Closes #110
Closes #109
Closes #103
Closes #102
Closes #100
Closes #99
Closes #92
Closes #50
Closes #42
Closes #39
Closes #36