How do we want to specify formats? #4
From @cubanismo: Good summary, Rob. I personally think where dataformat will shine is in advanced use cases, probably many of which we aren't seeing or haven't thought of yet because the ability to share complicated surface formats isn't exposed anywhere yet. The dataformat spec was designed to fill in the gaps of fourcc, and I feel safer going with it as the basis of the API because of that. I agree some helper libs to actually make dataformat usable as an API interface are highly desirable, and I've discussed that a bit with Andrew Garrard, the spec author. I believe we're all in agreement that the data format spec by itself is not palatable for direct use in an API without some helper libs or headers defining some conversions and further data structures.
From @robclark: Well, it looks like dataformat is serializable, which is my big concern (ie. keep in mind that not all processes can open all devices), so if we get some helper libs to make dataformat easier to deal with for the common cases, then it sounds good to me.
From Albert Freeman: What about HDR and beyond (e.g. full CIE XYZ/Luv/[or similar])?
Personally, I'd rather avoid the data format specification, as it's way more verbose than we are likely to need. However, if we're going to have a centralized allocator, we need some way to tell it the format so that it can perform the allocation. For most things that can be shared between vendors, fourcc is sufficient. Where it starts to run out is if I have some crazy non-scanout format that I want to share between Vulkan and CL, for instance. If we just had a vkGiveMeWhatINeedToShoveIntoGBM2 call, then we could allow some of these things to be vendor-specific.
What's wrong with verbosity if you have a function call like:
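For instance, a hypothetical lookup along these lines (the name and signature are made up for illustration, not a real Vulkan entry point):

```c
#include <stdint.h>
#include <vulkan/vulkan.h>

/* Hypothetical: return a pointer to a static, compile-time Khronos data
 * format descriptor (a block of 32-bit words) for a given VkFormat. */
const uint32_t *vkuGetDataFormatDescriptor(VkFormat format);
```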
and similar for other APIs? That seems much better to me than falling back to opaque formats when we go beyond what fourcc can do. Most apps won't care about the fancy stuff the data format spec can do, and those apps shouldn't need to think about it. They can use helper functions and tables, which we could probably include in the same package with little effort. But for the 1% that do care, it would be great to have it available. Driver vendors could use inverse helpers and ignore formats with bits they don't support. Maybe some driver vendors don't want to worry about anything advanced, but with HDR coming, perhaps some will soon care about things like defining an arbitrary white point for their YUV surface via its format.
I guess my biggest concern is that dataformat seems like way overkill for something like a v4l camera or video decoder. If we had some helpers that can map generic fourcc to dataformat and back (and return an error if passed a dataformat that doesn't map to a normal fourcc, perhaps), maybe it could work. Or maybe it could work the other way around, using vendor capabilities to pass the extra information, since anything "fancy" would have to be supported by both sides of the sharing anyway. Unrelated: our header format is almost, but not exactly, the same as the dataformat descriptor block header.
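A minimal sketch of what such round-trip helpers might look like (hypothetical names; the descriptor treated as a block of 32-bit words, as in the Khronos spec):

```c
#include <stdint.h>

/* Hypothetical: map a generic fourcc to a pointer to a static data format
 * descriptor, or return NULL if the fourcc is unknown. */
const uint32_t *dataformat_from_fourcc(uint32_t fourcc);

/* The lossy direction: returns 0 and writes *fourcc on success, or -1 if
 * the descriptor expresses something no normal fourcc can. */
int fourcc_from_dataformat(const uint32_t *descriptor, uint32_t *fourcc);
```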
@cubanismo For one thing, the line of code you cited doesn't work because Khronos data format specifiers are variable-length. This makes them way more painful to work with than an enum. Also, at least when I read it, it looked like VkFormat may not be enough information to actually construct a full data format (I may not understand it quite right, though). For YUV, at least, the data format includes a bunch of information that seems to more-or-less equate to tiling and/or plane locations. That seems like something we may want decoupled.
@jekstrand Yes, in general the blocks are variable-sized, but for a fixed set of formats, the function could return pointers to static const data built at compile time. Andrew at one point had a header that generated such a table for GL or Vulkan as an example, I believe. When converting from a less-detailed format description to a more-detailed one, yes, there's no exact 1:1 mapping. However, generally when going from less to more, there are some reasonable defaults that can be chosen. Ideally the Vulkan working group would create a definitive mapping and publish it. Going from more to less (data format -> fourcc) would be lossy, but that's inevitable. I don't view the fact that dataformat is more expressive as a negative; I view it as future-proofing. I believe the additional YUV data you're referring to is sample positions and which plane a given component lives in? It's unclear whether we want to allocate planar surfaces as one allocation or two in allocator library terms, but I don't think there's anything in the data format descriptors that doesn't belong in a format, and I think fourcc also describes some of the same properties you're concerned about.
@cubanismo If it's just a pointer to a const table, that's probably not substantially worse than the rest of the variable-sized blobs we've been talking about. I'm still a bit conflicted about whether the extra information is future-proofing or just added confusion and pain. But my mind isn't 100% made up yet.
@jekstrand Yes, the intent is that most APIs will effectively have an enum list of formats that they use, which can be used as an index into an array of pointers to data format descriptors. What you'd normally pass around is the enum - data format descriptors are intended to be short enough for fast processing, but they can't be sufficiently general without being substantially less efficient than a simple enum.
I do have a version of this mapping for Vulkan and partly GL; I'm revising it for correctness since some formats changed after it appeared. It will also definitely be only an example mapping: Vulkan and GL make no statements about whether their R, G and B channels contain red, green and blue data or something entirely different. To talk to an external API and make sense, someone probably needs to have said something about this interpretation; to be useful, mappings probably need to describe the minimum common case. Some APIs might like to fill in their standard formats and allow the table to be extended by users importing external content or specifying their own more detailed approaches. It was never the intent that hardware has to directly interpret the data format descriptor, and ideally software should be doing so only infrequently.
The idea of having these mappings available is that you can then, for example, use them to determine how you want to do inter-API mapping. There are legitimate reasons that the user might consider all 32-bit (total) formats to be equivalent, or to believe that all RGB formats are equivalent irrespective of size. There are certainly APIs which will want to ignore the colour space aspect of the format. Rather than saying two things have to be the same, I think it's better to define what we mean by "the same".
The data format specification is intended to be FourCC on steroids. Two images may be the same "format" but store planes in different places, have different strides or be different sizes - as such, that's assumed to be "outside the format". Interpreting a texel block (repeating pixel pattern), however, is part of the format - otherwise block compression wouldn't be considered part of the format. The data format spec therefore does specifically include sample position information, which can be used both for YUV chroma channel sample placement and for things like Bayer samples.
My experience is that image metadata tends to go missing when formats are transferred between APIs, especially if there's a helper library in between that doesn't know about the metadata. There are limits to what can sensibly be embedded in a general-purpose data format descriptor (the layout is extensible, so you could include a full ICC profile, but that's unlikely to be the general case) - but a best effort has been made to pack in as much as possible. There's also an effort to pack in things that you probably should have specified but might not have done - because there are a lot of home-rolled attempts to define this kind of thing in a way that requires incompatible extension. FourCC's belated addition of BT.709 and lack of discussion of full- and narrow-range YUV content is an example.
The data format spec currently needs updating to support the newest HDR TV standards. I'm on it. :-) It's likely that some additional metadata will always be appropriate in some environments - particularly, some HDR video formats have frame-specific metadata which is probably not best encoded in the "format", although it could be held in an extension block if it was really decided that was the best approach. In most environments I wouldn't expect it to be sensible for the "format" to change this frequently, though.
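A minimal sketch of the enum-as-index arrangement described here, with illustrative names and formats (what gets passed around day-to-day is the small enum; the descriptor is only consulted when the detailed interpretation matters):

```c
#include <stdint.h>

/* Illustrative API-side format enum. */
enum my_api_format {
    MY_FORMAT_R8G8B8A8_UNORM,
    MY_FORMAT_B8G8R8A8_UNORM,
    MY_FORMAT_COUNT,
};

/* Descriptors built as static const data at compile time. */
extern const uint32_t dfd_r8g8b8a8_unorm[];
extern const uint32_t dfd_b8g8r8a8_unorm[];

/* The enum indexes an array of pointers to the descriptors. */
static const uint32_t *const format_descriptors[MY_FORMAT_COUNT] = {
    [MY_FORMAT_R8G8B8A8_UNORM] = dfd_r8g8b8a8_unorm,
    [MY_FORMAT_B8G8R8A8_UNORM] = dfd_b8g8r8a8_unorm,
};
```

An API importing external content could append entries at runtime rather than keeping the table fixed, which matches the extension use case described above.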
@fluppeteer It's entirely possible that my desire for simplification comes from not having had to deal with the details of TV standards. I'm not going to say I wish I had :-) but I'll grant that my knowledge is incomplete.
@jekstrand / @cubanismo / @fluppeteer, so, a random thought, and something I mentioned in d3c96b6: it seems like capability blocks are kinda the same thing as dataformat descriptors.. ie. if I understand dataformat properly, it should describe all the combinations of things we are talking about with capability blocks/descriptors with respect to possible tiled/compressed/etc layouts.. So maybe the question about fourcc vs dataformat is looking at things the wrong way around. Maybe the user asks for a fourcc and gets back a dataformat descriptor describing what was allocated. I'm still not sure how the user would ask for a particular format + 3d-array-mipmap.. maybe fourcc isn't the right input there either.
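A rough sketch of the "simple in, precise out" flow being floated here; all names are hypothetical ("liballoc" is just the placeholder name used later in this thread):

```c
#include <stdint.h>

struct alloc_request {
    uint32_t fourcc;             /* what the client knows how to ask for */
    uint32_t width, height;
    uint64_t usage_flags;
};

struct alloc_result {
    const uint32_t *dataformat;  /* precise descriptor of the chosen layout */
    /* ...capability/constraint blocks, strides, etc... */
};

/* The client asks in fourcc terms; the allocator answers with the exact
 * data format descriptor for the buffer it would actually allocate. */
int liballoc_query(const struct alloc_request *req, struct alloc_result *out);
```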
@robclark, Yes, they do look similar, but I'm not convinced they're the same. We could, in theory, base our final generated metadata on the data format spec, but I don't think capabilities is a good match. Note: I'm not suggesting that we do. I would rather have something Intel-specific with an Intel format enum when it comes to the final serializable image metadata blob.
@jekstrand fwiw, I'm assuming we don't have to exchange this over the wire every frame, so the conversion between dataformat and an Intel format enum is inconsequential. Admittedly I'm still at the "open mind" phase with this, so I'm just exploring other possibilities, but I don't think "what enum we plug into the sampler state" is too important for how the "liballoc" (or whatever it ends up being called) API works.. My rough thoughts leading to that idea were mostly "dataformat looks a lot like the constraint block/descriptor thing" and "dataformat seems too specific for a client to ask for" (ie. more appropriate as a driver response to what the client asks for, or as a way to communicate a description of some existing buffer between the different APIs a client is using). (And PS, I've only started digging more into dataformat today, so I could be reading more into how it is intended to be used than I should.. but my impression so far is that it is too precise for what a client would want to ask for when creating a new buffer.)
@robclark, just to be clear, I didn't mean that we shouldn't use data format as part of the "what to send to the allocator" package. I just meant that we may not want to shove everything in as data format extensions. In the end, that distinction may be moot, as capability and constraint descriptions are looking a lot like data formats.
@jekstrand I guess an interesting question is: what might we want in a "capabilities block" that doesn't fit in dataformat? Everything I can think of that wouldn't fit belongs in the "constraint block" instead. Anyways, I'm still in the "open mind" phase, so I figured it was a good time to ask this question.. we probably would need to spiff things out either way.
I'd asked @fluppeteer about using data format descriptors to describe things like vendor tiling in the past, and he mentioned they weren't intended to work at this level. I think the specific point he made was that the format is intended to encapsulate only things that describe the layout of a single pixel/texel, rather than the arrangement of texels relative to each other. However, while it might be an abuse of the spec from a design standpoint, I think it's technically feasible to describe things like tiling/swizzle/compression in a data format descriptor. What @robclark said about looking at it another way (app gives fourcc or whatever, and gets back a data format block) is an interesting perspective. I think I would still want the option of passing in a data format block too, perhaps describing it as a "template" data format that the library could munge to some defined extent and return, but allowing the use of other, simpler formats directly in the API would probably be convenient.
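As a sketch of the "template" idea, under entirely hypothetical names: the client hands in a partially specified descriptor, and the library returns the fully specified one it actually chose.

```c
#include <stdint.h>

/* Hypothetical "template" entry point: the library may adjust the
 * client's partial descriptor only in defined ways, and returns the
 * fully specified descriptor for the allocation it would make. */
int liballoc_resolve_format(const uint32_t *template_dfd,
                            const uint32_t **resolved_dfd);
```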
@cubanismo, I think that's a good sum-up of my reservations about stuffing everything in a dataformat.
Hmm, dataformat covering (standard) compressed formats does blur the lines a bit. My assumption was that vendor IDs and types were intended to cover vendor tiling and compression, similar to how we are talking about the capabilities block, without it being part of the core spec. (Since I would expect you want that sort of information when exchanging between different APIs from the same vendor.) But I could be reading more into what dataformat was intended to cover than I should. If people actively think those sorts of vendor extensions should not be in dataformat, then at least sharing the same header might be a reasonable idea. And, yeah, having a simplified dataformat (without vendor stuff) as an input template, to which the driver / allocator backend tacks on extra stuff, might also be an option.
I think my position was that there's a line between "how do you interpret this bit of data" (which I would reasonably call a "format") and "how do I address these pixels" (which, in the sense that stride and resolution aren't really "format", I thought of as different). Compressed formats are special because you need the whole compressed texel block to make sense of any given pixel. Bayer formats are special because you need the surrounding pixels to interpret your own - although strictly speaking you probably need more than the pixels just within the texel block to do that; the same is true for PVRTC2, of course.
The basic descriptor block (i.e. not using the extension mechanism in the data format descriptor) is intended to describe anything up to 128 pixels per dimension. More specifically, it can handle four dimensions (because ASTC supports 3D compression, and if you're going to do three, you may as well do four...) and gives a byte to each dimension for describing a sample position. To be friendly to YUV formats with midpoint UV, the byte values are interpreted in halves of a (nominal) pixel size (if you need more precision than that for any reason, you're probably looking at multiplying up your coordinates), so you've got up to 127.5 as a position coordinate for each sample.
You absolutely could describe a tiled pixel layout in this way, except for the few tiles that are larger than 128 pixels across (and I know there are some layouts which do that). To do that, you'd describe every channel in every pixel in the tile separately, in terms of its bit offset from the start of the tile. That's 16384 samples for a 64x64 tile of ARGB, or a format descriptor of about 256KB. I'd suggest that's not really a sensible "format", though, and that tiling should be metadata. :-)
The data format descriptor certainly doesn't solve all problems. It's intended to be just for the kind of thing most people think of as a "format", and that includes actual information about interpreting the content, preferably defined in some "standard" way. Note that there are several "undefined" enums for things like colour primaries - the intent is that, for example, format conversion functions know to give up in this case, but for other cases they may know how to perform the conversion, so you can still hand-wave if something isn't defined in your API, but there's a place to keep the information if it is defined. Describing proprietary tiling in an extension block as a simple enum is another matter - I don't see a problem with that, but it could equally well be done outside the format. I don't really have a clear thought on how to handle things like hierarchical Z or lossless compression (AFBC-style) in this kind of scheme, but thus far all of this has tended to be proprietary and not used for interchange, so I've been able to ignore it.
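For reference, a back-of-envelope check of those numbers; the 16 bytes per sample is what the quoted totals imply (and matches the size of a sample entry in the basic descriptor block):

```c
/* The tile-as-format arithmetic above, assuming 16 bytes per sample. */
enum {
    TILE_W       = 64,
    TILE_H       = 64,
    CHANNELS     = 4,                          /* A, R, G, B            */
    SAMPLES      = TILE_W * TILE_H * CHANNELS, /* 16384 samples         */
    SAMPLE_BYTES = 16,
    DESC_BYTES   = SAMPLES * SAMPLE_BYTES,     /* 262144 bytes = 256 KiB */
};
```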
There are a few possibilities here. (Thanks to @robclark for the write-up. I'm mostly just moving it to GitHub.):
Straw man suggestions: