Commit 979e5f9

feat(iba): IBA::perpixel_op (#4299)
Inspired by a question by Vlad Erium, I have added a simpler way for C++ users of OIIO to construct IBA-like functions for simple unary and binary operations on ImageBufs where each pixel is independent and based only on the corresponding pixel of the input(s).

The user only needs to supply the contents of the inner loop, i.e. just doing one pixel's work, and only needs to work for float values. All format conversion, sizing and allocation of the destination buffer, looping over pixels, and multithreading is automatic. If the actual buffers in question are not float-based, conversions will happen automatically, at about a 2x slowdown compared to everything being in float all along, which seems reasonable for the extreme simplicity, especially for use cases where the buffers are fairly likely to be float anyway.

What you pass is a function or lambda that takes spans for the output and input pixel values. Here's an example that adds two images channel by channel, producing a sum image:

    // Assume ImageBuf A, B are the inputs, ImageBuf R is the output
    R = ImageBufAlgo::perpixel_op(A, B,
            [](span<float> r, cspan<float> a, cspan<float> b) {
                for (size_t c = 0, nc = size_t(r.size()); c < nc; ++c)
                    r[c] = a[c] + b[c];
                return true;
            });

This is exactly equivalent to calling

    R = ImageBufAlgo::add(A, B);

and for float IB's, it's just as fast.

To make the not-float case fast and not require the DISPATCH macro magic, I needed to change the ImageBuf::Iterator just a bit to add store() and load() method templates to the iterators, and add a field that holds the buffer type. That might make a slight ABI tweak, so I am thinking that I will make this for the upcoming OIIO 3.0, and not backport to the release branch.

I think this is ready to introduce at this time, but I'm also studying whether more varieties of this approach are needed, whether the non-float case can be sped up even more, and whether some of the existing IBA functions should switch to using this internally (good candidates would be those that are almost always performed on float buffers, but for which the heavy template expansion of the DISPATCH approach to handling the full type zoo currently makes them very bloated and expensive to compile, for very little real-world gain).

We should probably consider this to be experimental for a little while, just in case the function signature for this changes as I think about it more or add functionality.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
1 parent 5cdcdb3 commit 979e5f9

9 files changed: +759 -5 lines changed

src/doc/imagebuf.rst

Lines changed: 77 additions & 1 deletion
@@ -204,7 +204,7 @@ Deep data in an ImageBuf
 Error Handling
 ==============

-.. doxygenfunction:: OIIO::ImageBuf::errorf
+.. doxygenfunction:: OIIO::ImageBuf::errorfmt
 .. doxygenfunction:: OIIO::ImageBuf::has_error
 .. doxygenfunction:: OIIO::ImageBuf::geterror

@@ -239,6 +239,82 @@ Miscellaneous



+Writing your own image processing functions
+===========================================
+
+In this section, we will discuss how to write functions that operate
+pixel by pixel on an ImageBuf. There are several different approaches
+to this, with different trade-offs in terms of speed, flexibility, and
+simplicity of implementation.
+
+Simple pixel-by-pixel access with `ImageBufAlgo::perpixel_op()`
+---------------------------------------------------------------
+
+Pros:
+
+* You only need to supply the inner loop body, the part that does the work
+  for a single pixel.
+* You can assume that all pixel data are float values.
+
+Cons/Limitations:
+
+* The operation must be one where each output pixel depends only on the
+  corresponding pixel of the input images.
+* Currently, the operation must be unary (one input image to produce one
+  output image), or binary (two input images, one output image). At this time,
+  there are no options to operate on a single image in-place, or to have more
+  than two input images, but this may be extended in the future.
+* Operating on `float`-based images is "full speed," but if the input images
+  are not `float`, the automatic conversions will add some expense. In
+  practice, we find working on non-float images to be about half the speed of
+  float images, but this may be acceptable in exchange for the simplicity of
+  this approach, especially for operations where you expect inputs to be float
+  typically.
+
+.. doxygenfunction:: perpixel_op(const ImageBuf &src, bool (*op)(span<float>, cspan<float>), int prepflags = ImageBufAlgo::IBAprep_DEFAULT, int nthreads = 0)
+
+.. doxygenfunction:: perpixel_op(const ImageBuf &srcA, const ImageBuf &srcB, bool (*op)(span<float>, cspan<float>, cspan<float>), int prepflags = ImageBufAlgo::IBAprep_DEFAULT, int nthreads = 0)
+
+Examples:
+
+.. code-block:: cpp
+
+    // Assume ImageBuf A, B are the inputs, ImageBuf R is the output
+
+    /////////////////////////////////////////////////////////////////
+    // Approach 1: using a standalone function to add two images
+    bool my_add (span<float> r, cspan<float> a, cspan<float> b) {
+        for (size_t c = 0, nc = size_t(r.size()); c < nc; ++c)
+            r[c] = a[c] + b[c];
+        return true;
+    }
+
+    R = ImageBufAlgo::perpixel_op(A, B, my_add);
+
+    /////////////////////////////////////////////////////////////////
+    // Approach 2: using a "functor" class to add two images
+    struct Adder {
+        bool operator() (span<float> r, cspan<float> a, cspan<float> b) {
+            for (size_t c = 0, nc = size_t(r.size()); c < nc; ++c)
+                r[c] = a[c] + b[c];
+            return true;
+        }
+    };
+
+    Adder adder;
+    R = ImageBufAlgo::perpixel_op(A, B, adder);
+
+    /////////////////////////////////////////////////////////////////
+    // Approach 3: using a lambda to add two images
+    R = ImageBufAlgo::perpixel_op(A, B,
+            [](span<float> r, cspan<float> a, cspan<float> b) {
+                for (size_t c = 0, nc = size_t(r.size()); c < nc; ++c)
+                    r[c] = a[c] + b[c];
+                return true;
+            });
+
+
+
 Iterators -- the fast way of accessing individual pixels
 ========================================================

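The documentation added to imagebuf.rst shows only binary examples, and says the caller supplies just the per-pixel loop body. As a rough illustration of that division of labor, here is a self-contained toy that does not use OIIO at all. `Span`, `toy_perpixel_op`, and `invert` are hypothetical names invented for this sketch, the "image" is a flat float array, and returning an empty vector on failure is a choice of this sketch, not something the commit specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for OIIO's span<T>/cspan<T>: a pointer plus a length.
template<typename T> struct Span {
    T* ptr;
    std::size_t len;
    T& operator[](std::size_t i) const { return ptr[i]; }
    std::size_t size() const { return len; }
};

// Per-pixel callback: destination channels first, then source channels.
using UnaryOp = bool (*)(Span<float>, Span<const float>);

// Toy driver (not OIIO code): allocate the result, visit each pixel, and
// call `op` on that pixel's channel values.
std::vector<float> toy_perpixel_op(const std::vector<float>& src,
                                   std::size_t nchannels, UnaryOp op)
{
    assert(nchannels > 0 && src.size() % nchannels == 0);
    std::vector<float> dst(src.size(), 0.0f);
    for (std::size_t p = 0; p < src.size(); p += nchannels) {
        Span<float> r { dst.data() + p, nchannels };
        Span<const float> a { src.data() + p, nchannels };
        if (!op(r, a))
            return {};  // this sketch reports failure as an empty result
    }
    return dst;
}

// The only part a caller writes: one pixel's worth of work (unary case).
bool invert(Span<float> r, Span<const float> a)
{
    for (std::size_t c = 0, nc = r.size(); c < nc; ++c)
        r[c] = 1.0f - a[c];
    return true;
}
```

Calling `toy_perpixel_op(pixels, 3, invert)` on a flat RGB array returns a new array of the same size with every channel inverted. In the real API, sizing, type conversion, and threading would also be handled by the library rather than by a loop like this one.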
src/doc/imageioapi.rst

Lines changed: 2 additions & 0 deletions
@@ -286,6 +286,8 @@ just exist in the OIIO namespace as general utilities. (See

 .. doxygenfunction:: get_extension_map

+|
+
 .. _sec-startupshutdown:

 Startup and Shutdown

src/include/OpenImageIO/imagebuf.h

Lines changed: 22 additions & 0 deletions
@@ -1316,6 +1316,16 @@ class OIIO_API ImageBuf {
         // Clear the error flag
         void clear_error() { m_readerror = false; }

+        // Store into `span<T> dest` the channel values of the pixel the
+        // iterator points to.
+        template<typename T = float> void store(span<T> dest) const
+        {
+            OIIO_DASSERT(dest.size() >= oiio_span_size_type(m_nchannels));
+            convert_pixel_values(TypeDesc::BASETYPE(m_pixeltype), m_proxydata,
+                                 TypeDescFromC<T>::value(), dest.data(),
+                                 m_nchannels);
+        }
+
     protected:
         friend class ImageBuf;
         friend class ImageBufImpl;
@@ -1338,6 +1348,7 @@ class OIIO_API ImageBuf {
         char* m_proxydata = nullptr;
         WrapMode m_wrap   = WrapBlack;
         bool m_readerror  = false;
+        unsigned char m_pixeltype;

         // Helper called by ctrs -- set up some locally cached values
         // that are copied or derived from the ImageBuf.
@@ -1500,6 +1511,17 @@ class OIIO_API ImageBuf {

         void* rawptr() const { return m_proxydata; }

+        // Load values from `span<T> src` into the pixel the iterator refers
+        // to, doing any conversions necessary.
+        template<typename T = float> void load(cspan<T> src)
+        {
+            OIIO_DASSERT(src.size() >= oiio_span_size_type(m_nchannels));
+            ensure_writable();
+            convert_pixel_values(TypeDescFromC<T>::value(), src.data(),
+                                 TypeDesc::BASETYPE(m_pixeltype), m_proxydata,
+                                 m_nchannels);
+        }
+
         /// Set the number of deep data samples at this pixel. (Only use
         /// this if deep_alloc() has not yet been called on the buffer.)
         void set_deep_samples(int n)
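The new `store()` and `load()` methods are named from the caller's buffer's point of view: `store()` copies the current pixel out into a span, and `load()` copies a span back into the pixel, converting types both ways. The following toy, which uses no OIIO code, sketches that round trip for a `uint8` buffer. The names `toy_store`, `toy_load`, and `toy_add_u8` are hypothetical, and the sketch assumes the common convention that integer codes 0..255 map onto 0.0..1.0 with clamping and rounding on the way back.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy analogue of Iterator::store(): read one uint8 pixel out as floats,
// assuming codes 0..255 map onto 0.0..1.0.
void toy_store(const std::uint8_t* pixel, std::size_t nchannels, float* dest)
{
    for (std::size_t c = 0; c < nchannels; ++c)
        dest[c] = pixel[c] / 255.0f;
}

// Toy analogue of Iterator::load(): write floats back into a uint8 pixel,
// clamping to [0,1] and rounding to the nearest integer code.
void toy_load(const float* src, std::size_t nchannels, std::uint8_t* pixel)
{
    for (std::size_t c = 0; c < nchannels; ++c) {
        float v  = std::min(1.0f, std::max(0.0f, src[c]));
        pixel[c] = static_cast<std::uint8_t>(std::lround(v * 255.0f));
    }
}

// Add two uint8 pixels through float scratch space, which is how a
// float-only per-pixel op would see them.
std::vector<std::uint8_t> toy_add_u8(const std::vector<std::uint8_t>& a,
                                     const std::vector<std::uint8_t>& b)
{
    assert(a.size() == b.size());
    std::size_t n = a.size();
    std::vector<float> fa(n), fb(n), fr(n);
    toy_store(a.data(), n, fa.data());
    toy_store(b.data(), n, fb.data());
    for (std::size_t c = 0; c < n; ++c)
        fr[c] = fa[c] + fb[c];
    std::vector<std::uint8_t> r(n);
    toy_load(fr.data(), n, r.data());
    return r;
}
```

The two conversions per pixel are the extra work that the commit message estimates at roughly a 2x slowdown for non-float buffers. Sums that exceed the representable range saturate at 255 in this sketch because of the clamp.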

src/include/OpenImageIO/imagebufalgo_util.h

Lines changed: 157 additions & 0 deletions
@@ -90,6 +90,102 @@ parallel_image(ROI roi, std::function<void(ROI)> f)



+/// Common preparation for IBA functions (or work-alikes): Given an ROI (which
+/// may or may not be the default ROI::All()), destination image (which may or
+/// may not yet be allocated), and optional input images (presented as a span
+/// of pointers to ImageBufs), adjust `roi` if necessary and allocate pixels
+/// for `dst` if necessary. If `dst` is already initialized, it will keep its
+/// "full" (aka display) window, otherwise its full/display window will be set
+/// to the union of inputs' full/display windows. If `dst` is uninitialized
+/// and `force_spec` is not nullptr, use `*force_spec` as `dst`'s new spec
+/// rather than using the first input image. Also, if any inputs are
+/// specified but not initialized or are broken, it's an error, so return
+/// false. If all is ok, return true.
+///
+/// The `options` list contains optional ParamValue's that control the
+/// behavior, including what input configurations are considered errors, and
+/// policies for how an uninitialized output is constructed from knowledge of
+/// the input images. The following options are recognized:
+///
+/// - "require_alpha" : int (default: 0)
+///
+///   If nonzero, require all inputs and output to have an alpha channel.
+///
+/// - "require_z" : int (default: 0)
+///
+///   If nonzero, require all inputs and output to have a z channel.
+///
+/// - "require_same_nchannels" : int (default: 0)
+///
+///   If nonzero, require all inputs and output to have the same number of
+///   channels.
+///
+/// - "copy_roi_full" : int (default: 1)
+///
+///   Copy the src's roi_full. This is the default behavior. Set to 0 to
+///   disable copying roi_full from src to dst.
+///
+/// - "support_volume" : int (default: 1)
+///
+///   Support volumetric (3D) images. This is the default behavior. Set to 0
+///   to disable support for 3D images.
+///
+/// - "copy_metadata" : string (default: "true")
+///
+///   If set to "true-like" value, copy most "safe" metadata from the first
+///   input image to the destination image. If set to "all", copy all
+///   metadata from the first input image to the destination image, even
+///   dubious things. If set to a "false-like" value, do not copy any
+///   metadata from the input images to the destination image.
+///
+/// - "clamp_mutual_nchannels" : int (default: 0)
+///
+///   If nonzero, clamp roi.chend to the minimum number of channels of any
+///   of the input images.
+///
+/// - "support_deep" : string (default: "false")
+///
+///   If "false-like" (the default), deep images (having multiple depth
+///   values per pixel) are not supported. If set to a true-like value
+///   (e.g., "1", "on", "true", "yes"), deep images are allowed, but not
+///   required, and if any input or output image is deep, they all must be
+///   deep. If set to "mixed", any mixture of deep and non-deep images may
+///   be supplied. If set to "required", all input and output images must be
+///   deep.
+///
+/// - "dst_float_pixels" : int (default: 0)
+///
+///   If nonzero and dst is uninitialized, then initialize it to float
+///   regardless of the pixel types of the input images.
+///
+/// - "minimize_nchannels" : int (default: 0)
+///
+///   If nonzero and dst is uninitialized and the multiple input images do
+///   not all have the same number of channels, initialize `dst` to have the
+///   smallest number of channels of any input. (If 0, the default, an
+///   uninitialized `dst` will be given the maximum of the number of
+///   channels of all input images.)
+///
+/// - "require_matching_channels" : int (default: 0)
+///
+///   If nonzero, require all input images to have the same channel *names*,
+///   in the same order.
+///
+/// - "merge_metadata" : int (default: 0)
+///
+///   If nonzero, merge all inputs' metadata into the `dst` image's
+///   metadata.
+///
+/// - "fill_zero_alloc" : int (default: 0)
+///
+///   If nonzero and `dst` is uninitialized, fill `dst` with 0 values if we
+///   allocate space for it.
+///
+bool
+IBAprep(ROI& roi, ImageBuf& dst, cspan<const ImageBuf*> srcs = {},
+        KWArgs options = {}, ImageSpec* force_spec = nullptr);
+
+
 /// Common preparation for IBA functions: Given an ROI (which may or may not
 /// be the default ROI::All()), destination image (which may or may not yet
 /// be allocated), and optional input images, adjust roi if necessary and
@@ -506,6 +602,67 @@ inline TypeDesc type_merge (TypeDesc a, TypeDesc b, TypeDesc c)
     IBA_FIX_PERCHAN_LEN (av, len, 0.0f, av.size() ? av.back() : 0.0f);


+
+/// Simple image per-pixel unary operation: Given a source image `src`, return
+/// an image of the same dimensions (and same data type, unless `options`
+/// includes the "dst_float_pixels" hint turned on, which will result in a
+/// float pixel result image) where each pixel is the result of running the
+/// caller-supplied function `op` on the corresponding pixel values of `src`.
+/// The `op` function should take two `span<float>` arguments, the first
+/// referencing a destination pixel, and the second being a reference to the
+/// corresponding source pixel. The `op` function should return `true` if the
+/// operation was successful, or `false` if there was an error.
+///
+/// The `perpixel_op` function is thread-safe and will parallelize the
+/// operation across multiple threads if `nthreads` is not equal to 1
+/// (following the usual ImageBufAlgo `nthreads` rules), and also takes care
+/// of all the pixel loops and conversions to and from `float` values.
+///
+/// The `options` keyword/value list contains additional controls. It supports
+/// all hints described by `IBAprep()` as well as the following:
+///
+/// - "nthreads" : int (default: 0)
+///
+///   Controls the number of threads (0 signalling to use all available
+///   threads in the pool).
+///
+/// An example (using the binary op version) of how to implement a simple
+/// pixel-by-pixel `add()` operation that is the equivalent of
+/// `ImageBufAlgo::add()`:
+///
+/// ```
+///    // Assume ImageBuf A, B are the inputs, ImageBuf R is the output
+///    R = ImageBufAlgo::perpixel_op(A, B,
+///            [](span<float> r, cspan<float> a, cspan<float> b) {
+///                for (size_t c = 0, nc = size_t(r.size()); c < nc; ++c)
+///                    r[c] = a[c] + b[c];
+///                return true;
+///            });
+/// ```
+///
+/// Caveats:
+/// * The operation must be one that can be applied independently to each
+///   pixel.
+/// * If the input image is not `float`-valued pixels, there may be some
+///   inefficiency due to the need to convert the pixels to `float` and back,
+///   since there is no type templating and thus no opportunity to supply a
+///   version of the operation that allows specialization to any other pixel
+///   data types.
+//
+OIIO_NODISCARD OIIO_API
+ImageBuf
+perpixel_op(const ImageBuf& src, bool(*op)(span<float>, cspan<float>),
+            KWArgs options = {});
+
+/// A version of perpixel_op that performs a binary operation, taking two
+/// source images and a 3-argument `op` function that receives a destination
+/// and two source pixels.
+OIIO_NODISCARD OIIO_API
+ImageBuf
+perpixel_op(const ImageBuf& srcA, const ImageBuf& srcB,
+            bool(*op)(span<float>, cspan<float>, cspan<float>),
+            KWArgs options = {});
+
 } // end namespace ImageBufAlgo

 // clang-format on
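The `minimize_nchannels` option documented for `IBAprep` in this header decides how many channels an uninitialized destination gets when the inputs disagree. A tiny stand-alone model of just that rule may make the default easier to remember. `toy_dst_nchannels` is a hypothetical helper written for this illustration and is not part of OIIO.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model (not OIIO code) of the channel-count rule for an uninitialized
// destination: the maximum over all inputs by default, or the minimum when
// "minimize_nchannels" is nonzero.
int toy_dst_nchannels(const std::vector<int>& src_nchannels, bool minimize)
{
    assert(!src_nchannels.empty());
    auto mm = std::minmax_element(src_nchannels.begin(), src_nchannels.end());
    return minimize ? *mm.first : *mm.second;
}
```

For an RGB input combined with an RGBA input, this gives 4 channels by default and 3 when minimizing.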

src/libOpenImageIO/CMakeLists.txt

Lines changed: 3 additions & 1 deletion
@@ -254,7 +254,9 @@ if (OIIO_BUILD_TESTS AND BUILD_TESTING)
     add_test (unit_imagecache ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/imagecache_test)

     fancy_add_executable (NAME imagebufalgo_test SRC imagebufalgo_test.cpp
-                          LINK_LIBRARIES OpenImageIO ${OpenCV_LIBRARIES}
+                          LINK_LIBRARIES OpenImageIO
+                                         ${OpenCV_LIBRARIES}
+                                         ${OPENIMAGEIO_IMATH_TARGETS}
                           FOLDER "Unit Tests" NO_INSTALL)
     add_test (unit_imagebufalgo ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/imagebufalgo_test)


src/libOpenImageIO/imagebuf.cpp

Lines changed: 1 addition & 0 deletions
@@ -3163,6 +3163,7 @@ ImageBuf::IteratorBase::init_ib(WrapMode wrap, bool write)
     m_y = 1 << 31;
     m_z = 1 << 31;
     m_wrap = (wrap == WrapDefault ? WrapBlack : wrap);
+    m_pixeltype = spec.format.basetype;
 }

