Assignment 1: Poisson Image Editing

Advanced Computer Graphics (COS 526)



Overview

In this assignment, we explore gradient-domain processing through a technique called Poisson image editing, which is used to seamlessly blend a region of a source image into a destination image using a mask.

This is motivated by standard non-seamless cloning, which is effectively a simple copy-and-paste of the source pixels into a destination, without any modification. The results are jarring and immediately stand out to the human eye, since this often leaves abrupt gradients at the transition between the new source pixels and the neighboring destination pixels.
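As a point of reference, non-seamless cloning can be sketched in a few lines. This is a minimal illustration rather than the code used for this assignment: the `Image` struct and the `cloneNaive` function are hypothetical names, and a single float channel with a mask threshold of 0.5 is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-channel image: row-major float pixels.
struct Image {
    int width = 0;
    int height = 0;
    std::vector<float> pixels;  // size == width * height

    Image(int w, int h, float fill = 0.0f)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, fill) {}

    float& at(int x, int y) { return pixels[static_cast<std::size_t>(y) * width + x]; }
    float at(int x, int y) const { return pixels[static_cast<std::size_t>(y) * width + x]; }
};

// Non-seamless cloning: wherever the mask is set, overwrite the destination
// pixel with the source pixel; every other pixel is left untouched.
// Assumes all three images have the same dimensions.
Image cloneNaive(const Image& src, const Image& dst, const Image& mask) {
    assert(src.width == dst.width && src.height == dst.height);
    assert(mask.width == dst.width && mask.height == dst.height);
    Image out = dst;
    for (int y = 0; y < dst.height; ++y)
        for (int x = 0; x < dst.width; ++x)
            if (mask.at(x, y) > 0.5f) out.at(x, y) = src.at(x, y);
    return out;
}
```

Because no pixel value is adjusted, any difference in brightness or color between the two images shows up directly as an edge along the mask boundary.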

Poisson Blending

It turns out that human perception tends to be more sensitive to gradients than to absolute intensity values in an image. We can therefore pose blending as an optimization problem: solve for new pixel values in the masked region of the destination (where the source pixels would be placed) such that their gradients match the source gradients as closely as possible, while the values at the edge of the region agree with the surrounding destination pixels.

This Poisson problem can be discretized into a linear system of equations, shown below. Here, f denotes the pixel values that we are solving for, f* denotes the existing pixel values in the destination image, and v denotes the discrete gradient of the source image (i.e., the difference between neighboring pixel values).

Linear equations in the discrete Poisson solution
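For reference, the discrete system as it appears in Perez et al. can be written as follows. The symbols N_p (the 4-neighborhood of pixel p), Ω (the masked region), ∂Ω (the destination pixels bordering the masked region), and g (the source image) follow the paper's notation and are not otherwise used in this write-up.

```latex
\[
|N_p|\, f_p \;-\; \sum_{q \in N_p \cap \Omega} f_q
\;=\; \sum_{q \in N_p \cap \partial\Omega} f^{*}_q
\;+\; \sum_{q \in N_p} v_{pq}
\qquad \text{for all } p \in \Omega,
\]
\[
v_{pq} = g_p - g_q .
\]
```

The first sum on the right-hand side is non-empty only for pixels next to the boundary, which is where the known destination values f* enter the system.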

For each pixel p in the masked region, we examine its 4 neighboring pixels and add an equation to the linear system that constrains the differences between p and its neighbors to match the corresponding differences in the source image. When a neighbor lies outside the mask, its known destination value enters the equation as a boundary constraint. This pulls a boundary pixel p' closer to the value of the neighboring destination pixels, which results in a solution that is more "seamless" at the edge of the masked region.

The resulting linear system has one equation for each pixel in the mask. The system is very large but also very sparse, since each equation involves only a pixel and its 4 neighbors, which allows us to quickly compute a solution using an iterative least-squares solver for sparse matrices. Solving the system produces a vector of values representing the function f above, which are the new pixel values to be placed into the masked region of the destination image.
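The structure of the system can be illustrated with a small, dependency-free sketch. Note that this swaps in plain Gauss-Seidel sweeps for the GSL sparse solver used in the actual program, and that the `Gray` struct, the `poissonBlend` function, and the fixed sweep count are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-channel image stored row-major.
struct Gray {
    int w, h;
    std::vector<double> px;
    Gray(int w_, int h_, double fill = 0.0)
        : w(w_), h(h_), px(static_cast<std::size_t>(w_) * h_, fill) {}
    double& at(int x, int y) { return px[static_cast<std::size_t>(y) * w + x]; }
    double at(int x, int y) const { return px[static_cast<std::size_t>(y) * w + x]; }
};

// Solve the discrete Poisson equations with Gauss-Seidel sweeps.
// For each masked pixel p with in-image neighbors q:
//   |N_p| f_p - sum_q f_q = sum_q (g_p - g_q)
// where f_q is the current estimate if q is masked, or the fixed
// destination value f*_q if it is not.
Gray poissonBlend(const Gray& src, const Gray& dst, const Gray& mask, int sweeps) {
    assert(src.w == dst.w && src.h == dst.h);
    assert(mask.w == dst.w && mask.h == dst.h);
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    Gray f = dst;  // unmasked pixels are never modified, so they hold f*
    for (int it = 0; it < sweeps; ++it) {
        for (int y = 0; y < f.h; ++y) {
            for (int x = 0; x < f.w; ++x) {
                if (mask.at(x, y) <= 0.5) continue;
                double rhs = 0.0;
                int count = 0;
                for (int k = 0; k < 4; ++k) {
                    const int nx = x + dx[k], ny = y + dy[k];
                    if (nx < 0 || ny < 0 || nx >= f.w || ny >= f.h) continue;
                    ++count;
                    rhs += f.at(nx, ny);                     // neighbor value
                    rhs += src.at(x, y) - src.at(nx, ny);    // source gradient v_pq
                }
                if (count > 0) f.at(x, y) = rhs / count;
            }
        }
    }
    return f;
}
```

On a 5-pixel row with a flat destination of 10, a source that is 0 everywhere except for a value of 5 in the middle, and the three middle pixels masked, this converges to 10, 10, 15, 10, 10: the bump from the source is preserved, but it sits on the destination's intensity level. A dedicated sparse solver is preferable for real images, since Gauss-Seidel converges slowly on large masks.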

Mapping the results of f back into pixel coordinates, we replace the corresponding pixels in the destination image and obtain our final result.
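One way to organize this mapping is a pair of lookup tables between pixel coordinates and unknown indices. The sketch below is an assumption about how such a structure could look, not the data structure from imageblend.h; `Coord`, `EquationMap`, and `writeBack` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Coord { int x; int y; };

// Hypothetical bidirectional map between masked pixels and unknowns.
struct EquationMap {
    int width;
    std::vector<int> indexOf;    // per pixel: unknown index, or -1 if unmasked
    std::vector<Coord> coordOf;  // per unknown: pixel coordinate

    EquationMap(const std::vector<bool>& mask, int w, int h)
        : width(w), indexOf(static_cast<std::size_t>(w) * h, -1) {
        assert(mask.size() == indexOf.size());
        for (int y = 0; y < h; ++y) {
            for (int x = 0; x < w; ++x) {
                const std::size_t i = static_cast<std::size_t>(y) * w + x;
                if (!mask[i]) continue;
                indexOf[i] = static_cast<int>(coordOf.size());
                coordOf.push_back({x, y});
            }
        }
    }

    int unknowns() const { return static_cast<int>(coordOf.size()); }
    int index(int x, int y) const {
        return indexOf[static_cast<std::size_t>(y) * width + x];
    }
};

// Write the solved values f back into one channel of the destination.
void writeBack(const EquationMap& map, const std::vector<double>& f,
               std::vector<double>& channel) {
    assert(static_cast<int>(f.size()) == map.unknowns());
    for (int i = 0; i < map.unknowns(); ++i) {
        const Coord c = map.coordOf[i];
        channel[static_cast<std::size_t>(c.y) * map.width + c.x] = f[i];
    }
}
```

The forward table (`indexOf`) is what the system assembly needs to decide whether a neighbor is an unknown or a boundary value, while the reverse table (`coordOf`) is what the final write-back needs.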

Note that color images have 3 color channels: R, G, and B. We therefore perform the above computation independently for each color channel and then combine the results into RGB pixels before producing the final output.


Results

Below, I reproduce some of the results found in the Perez et al. paper.

Source image
Destination image
Mask
Seamless cloning result
Non-seamless cloning result



Source image 1
Source image 2
Source image 3
Mask 1
Mask 2
Mask 3
Destination image
Seamless cloning result 1
Seamless cloning result 2
Seamless cloning result 3
Non-seamless cloning result 1
Non-seamless cloning result 2
Non-seamless cloning result 3



Source image
Destination image
Mask
Seamless cloning result
Non-seamless cloning result

Extensions

After implementing the standard solving method, I decided to pursue some extensions in the form of mixed gradients and monochrome transfer.

Mixed Gradients

In the original method, the solver attempts to preserve the source image's gradients when blending it into the destination region. However, it is sometimes desirable to combine gradient properties of the destination region with those of the source region to produce a more realistic result. A good example of this is when the source region has holes in it, and we would like to retain the destination image's background texture.

In the example below, we want to blend the written characters onto the wall. If we enforce the source gradient alone (which is nearly zero around the characters), we can see a noticeable, flat background patch behind the characters, which comes from the source region's flat gradient. Instead, we can mix gradients when setting up the linear system of equations, taking whichever gradient is larger in magnitude between the destination image and the source image at each pixel. Doing so retains the destination's background gradient while keeping the source region's higher-frequency content intact.
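The selection rule can be sketched as a small function that replaces the source-only gradient when the right-hand side of the system is assembled. The names `mixedGradient` and `mixedRowGuidance` are hypothetical, and the row-wise helper only exists to make the example easy to try on a 1D strip of pixels.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Guidance value v_pq for one pixel/neighbor pair under mixed gradients:
// keep whichever of the two differences has the larger magnitude.
double mixedGradient(double srcP, double srcQ, double dstP, double dstQ) {
    const double gs = srcP - srcQ;  // source gradient
    const double gd = dstP - dstQ;  // destination gradient
    return std::fabs(gd) > std::fabs(gs) ? gd : gs;
}

// Guidance values between horizontally adjacent pixels of one row.
std::vector<double> mixedRowGuidance(const std::vector<double>& srcRow,
                                     const std::vector<double>& dstRow) {
    assert(srcRow.size() == dstRow.size());
    std::vector<double> v;
    for (std::size_t i = 0; i + 1 < srcRow.size(); ++i)
        v.push_back(mixedGradient(srcRow[i], srcRow[i + 1], dstRow[i], dstRow[i + 1]));
    return v;
}
```

For a source row of 5, 5, 0, 5 (flat background with one dark stroke) and a destination row of 1, 4, 4, 4 (texture at the left), the guidance values are -3, 5, -5: the destination's texture wins where the source is flat, and the stroke's edges win where the destination is flat.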

Source image
Destination image
Mask
Source gradients only
Using mixed gradients

Monochrome Transfer

This method of image blending can also be used to perform texture transfer. The paper provides an example of blending an apple with a pear, where the result is a pear but with the texture of an apple. However, performing this blend will sometimes produce a result that retains some of the original source color, which we do not want, since we are only interested in the texture.

To avoid this, we can first convert the source image to monochrome. I map each pixel's RGB values to a single grayscale value using the following formula, which is the one used in MATLAB's color library, and write that value into all 3 color channels.

Gray = 0.2989 * R + 0.5870 * G + 0.1140 * B
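Applied per pixel, the formula above amounts to the following. This is a minimal sketch; the `RGB` struct and the `toMonochrome` function are hypothetical names and not part of the provided ImageIO library.

```cpp
#include <cassert>

// Hypothetical pixel type with one value per color channel.
struct RGB { double r, g, b; };

// Replace all three channels with the weighted grayscale value.
RGB toMonochrome(const RGB& c) {
    const double gray = 0.2989 * c.r + 0.5870 * c.g + 0.1140 * c.b;
    return {gray, gray, gray};
}
```

Because the result still has 3 identical channels, the rest of the blending pipeline can run unchanged on the converted source image.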

Using the monochrome version of the source image, we can run the same blending procedure to perform a texture transfer. In the example below, I do this with both the standard gradients and the mixed gradients. Note that using mixed gradients creates a more appealing result, with more convincing detail just underneath the animal in the water's reflection.

Monochrome transfer with standard gradients
Monochrome transfer with mixed gradients

I also applied monochrome transfer with mixed gradients to the previous example of blending the characters onto a wall. The result is the same, except that the characters have lost their blue color and are now grayscale.

Chromatic transfer with mixed gradients
Monochrome transfer with mixed gradients

Implementation

This program is written in C++ using the provided ImageIO library from Prof. Rusinkiewicz along with the GNU Scientific Library for sparse linear algebra computation.

My contributions augment the provided ImageIO library with additional data structures for representing 2D coordinates and the neighbors of a pixel, for improved logical clarity. Further, I create an ImageBlend module consisting of imageblend.cpp and imageblend.h, which introduce a data structure for mapping between pixels and equations in the linear system. The module also provides functions that perform non-seamless cloning as well as seamless cloning, along with its own main function that can be invoked from the command line.

Supporting Different Image Sizes

We relax the assumption that all 3 images have the same size and allow the source image to differ in size from the other two. We then ask the user to provide an (x, y) offset to translate the source image into the desired location.

This is implemented in the program by creating a new image with the same dimensions as the destination/mask image, and then copying over the pixels from the provided source image, each offset by (x, y). This effectively produces a translated version of the original source image, padded all around by black pixels.
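The translation step can be sketched as follows. This is an illustration under assumptions: `Gray` and `translateOnto` are hypothetical names, a single channel is shown, and source pixels that land outside the canvas are simply dropped, which is one plausible way to handle large offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-channel image stored row-major.
struct Gray {
    int w, h;
    std::vector<double> px;
    Gray(int w_, int h_, double fill = 0.0)
        : w(w_), h(h_), px(static_cast<std::size_t>(w_) * h_, fill) {}
    double& at(int x, int y) { return px[static_cast<std::size_t>(y) * w + x]; }
    double at(int x, int y) const { return px[static_cast<std::size_t>(y) * w + x]; }
};

// Copy src onto a black canvas of the destination's size, shifted by
// (offX, offY). Pixels that fall outside the canvas are skipped.
Gray translateOnto(const Gray& src, int canvasW, int canvasH, int offX, int offY) {
    assert(canvasW > 0 && canvasH > 0);
    Gray out(canvasW, canvasH, 0.0);  // 0.0 == black padding
    for (int y = 0; y < src.h; ++y) {
        for (int x = 0; x < src.w; ++x) {
            const int tx = x + offX;
            const int ty = y + offY;
            if (tx < 0 || ty < 0 || tx >= canvasW || ty >= canvasH) continue;
            out.at(tx, ty) = src.at(x, y);
        }
    }
    return out;
}
```

After this step the translated source has the same dimensions as the destination and mask, so the blending code does not need to know about the offset at all.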

Using the original version of the Figure 3b source from Perez et al., I was able to obtain a similar result by providing a translation of (30, 20) pixels. A translation of (60, 20) pixels shifts the animal further to the right, without any noticeable clipping even though the mask remains in the same place. With a translation of (120, 20) pixels, the mask and the source region are significantly misaligned, and we can observe clear clipping of the animal along the horizontal axis.

Result using provided size-matched source
Result using original source translated (30, 20)
Result using original source translated (60, 20)
Result using original source translated (120, 20)

Building and Running ImageBlend

From the root directory, use the provided Makefile to build the program. ImageBlend requires that the GSL, libjpeg, and libpng libraries are installed.

$ make

The program is then run from the command line with 9 additional arguments:

$ ./imageblend SRC DST MASK SEAMLESS_OUT SEAMED_OUT mixed/poisson gray/normal x-offset y-offset

The first 5 arguments are all file paths. The sixth argument is either "mixed" or "poisson" and selects how gradients are chosen during solving. The seventh is either "gray" or "normal" and selects grayscale or normal transfer of the source image. The last two arguments are the x-offset and y-offset for the source image's translation. To skip translation, enter 0 for both offsets.

During execution, the program provides debugging output to stderr regarding the least-squares solving. Upon completion, the program writes out an output image using seamless cloning to SEAMLESS_OUT and a non-seamless version to SEAMED_OUT.

Here is an example to reproduce the first set of results using the original Poisson blending technique:

$ ./imageblend cos526-img/perez-fig3a-src.png cos526-img/perez-fig3a-dst.png cos526-img/perez-fig3a-mask.png out.png out2.png poisson normal 0 0

References