To get started: Clone this repository and all its submodule dependencies using:
git clone --recursive https://github.com/dilevin/computer-graphics-raster-images.git
Do not fork: Clicking "Fork" will create a public repository. If you'd like to use GitHub while you work on your assignment, then mirror this repo as a new private repository: https://stackoverflow.com/questions/10065526/github-how-to-make-a-fork-of-public-repository-private
Welcome to Computer Graphics! The main purpose of this assignment will be to get you up and running with C++ and the cmake build setup used for our assignments.
On all platforms, we will assume you have installed cmake and a modern C++ compiler on Mac OS X¹, Linux², or Windows³.
We also assume that you have cloned this repository using the --recursive
flag (if not then issue git submodule update --init --recursive
).
All assignments will have a similar directory and file layout:
README.md
CMakeLists.txt
main.cpp
include/
function1.h
function2.h
...
src/
function1.cpp
function2.cpp
...
data/
...
...
The README.md
file will describe the background, contents and tasks of the
assignment.
The CMakeLists.txt
file sets up the cmake build routine for this
assignment.
The main.cpp
file will include the headers in the include/
directory and
link to the functions compiled in the src/
directory. This file contains the
main
function that is executed when the program is run from the command line.
The include/
directory contains one file for each function that you will
implement as part of the assignment. Do not change these files.
The src/
directory contains empty implementations of the functions
specified in the include/
directory. This is where you will implement the
parts of the assignment.
The data/
directory contains sample input data for your program. Keep in
mind you should create your own test data to verify your program as you write
it. It is not necessarily sufficient that your program only works on the given
sample data.
This and all following assignments will follow a typical cmake/make build routine. Starting in this directory, issue:
mkdir build
cd build
cmake ..
If you are using Mac or Linux, then issue:
make
If you are using Windows, then running cmake ..
should have created a Visual Studio solution file
called raster.sln
that you can open and build from there. Building the raster project will generate an .exe file.
Why don't you try this right now?
Once built, you can execute the assignment from inside the build/
directory using
./raster
Every assignment, including this one, will start with a Background section. This will cite a chapter of the book to read or review the math and algorithms behind the task in the assignment. Students following the lectures should already be familiar with this material.
The most common digital representation of a color image is a 2D array of
red/green/blue intensities at pixels. Since each entry in the array is actually
a 3-vector of color values, we can interpret an image as a 3-tensor or 3D array.
Memory on the computer is addressed linearly, so an RGB image with a certain
width
and height
will be represented as width*height*3
numbers. How these
numbers are ordered is a matter of convention. In our assignment we use the
convention that the red value of the pixel in the top-left corner comes first, then
its green value, then its blue value, then the rgb values of its neighbor to
the right, and so on across the row of pixels, before moving down to the next
row.
Q: Suppose you have a 767×772 rgb image stored in an array called
data
. How would you access the green value at the pixel on the 36th row and 89th column?
A:
data[1 + 3*(88+767*35)]
(Remember that C++ starts counting at 0.)
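The indexing convention above can be written as a small helper. This is only a sketch: the name rgb_index and its parameter order are hypothetical and not part of the assignment's headers, and it assumes zero-based row and column numbers.

```cpp
#include <cstddef>

// Hypothetical helper (not from the assignment): offset of one color value in
// a row-major, interleaved rgb array.
//   width   - number of pixels per row
//   row,col - zero-based pixel position (row 0 is the top row)
//   channel - 0 for red, 1 for green, 2 for blue
inline std::size_t rgb_index(
  std::size_t width, std::size_t row, std::size_t col, std::size_t channel)
{
  // Skip `row` full rows, then `col` pixels; each pixel holds 3 values.
  return channel + 3 * (col + width * row);
}
```

With this helper, rgb_index(767, 35, 88, 1) is the same offset as the expression in the answer above.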
Natural images (e.g., photographs) only require color information, but to
manipulate images it is often useful to also store a value representing how much
of a pixel is "covered" by the given color. Intuitively this value (called alpha
or $\alpha$) can be thought of as the opacity of the pixel.
.png files can store rgba images, whereas our simpler .ppm file format only stores grayscale or rgb images.
We'll use a very basic uncompressed image file format to write out the results of our tasks: the .ppm.
Like many image file formats, .ppm uses 8 bits per color value. Color
intensities are represented as an integer between 0
(0% intensity) and 255
(100% intensity). In our programs we will use unsigned char
to represent these
values when reading, writing and doing simple operations. For numerically
sensitive computations (e.g., conversion between rgb and hsv), it is convenient
to convert values to decimal representations using double precision floating
point numbers, so that 0 is converted to 0.0 and 255 to 1.0.
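A minimal sketch of this conversion follows. The function names to_unit and to_byte are hypothetical, and the clamping and round-to-nearest behavior in to_byte are assumptions; the assignment may expect something different.

```cpp
#include <cmath>

// Hypothetical helper: map an 8-bit intensity (0..255) to a double in [0, 1].
inline double to_unit(unsigned char v)
{
  return v / 255.0;
}

// Hypothetical helper: map a double back to an 8-bit intensity.
// Assumption: out-of-range inputs are clamped, and the result is rounded to
// the nearest integer.
inline unsigned char to_byte(double x)
{
  if (x < 0.0) x = 0.0;
  if (x > 1.0) x = 1.0;
  return static_cast<unsigned char>(std::lround(x * 255.0));
}
```

Rounding to the nearest integer, rather than truncating, is what lets a value survive a round trip through to_unit and to_byte unchanged.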
Surprisingly, there are many acceptable and reasonable ways to convert a color image into a grayscale ("black and white") image. The complexity of each method scales with how much that method accounts for human perception. For example, a very naive method is to average the red, green and blue intensities. A slightly better (and very popular) method is to take a weighted average that gives higher priority to green.
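As an illustration of such a weighted average, the sketch below uses the Rec. 709 luma coefficients. These particular weights are an assumption made for this example, and the helper name gray_from_rgb is hypothetical; the assignment may intend different values.

```cpp
#include <cmath>

// Hypothetical helper: weighted average of one pixel's rgb intensities.
// Assumption: the weights are the Rec. 709 luma coefficients, which sum to 1
// and give green the largest share.
inline unsigned char gray_from_rgb(
  unsigned char r, unsigned char g, unsigned char b)
{
  const double i = 0.2126 * r + 0.7152 * g + 0.0722 * b;
  // Round to the nearest integer before storing as an 8-bit value.
  return static_cast<unsigned char>(std::lround(i));
}
```

Under these weights a pure green pixel maps to a much brighter gray than a pure red or pure blue pixel of the same intensity.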
Q: Why are humans more sensitive to green?
Hint: 🐒
The raw color measurements made by modern digital cameras are typically stored with a single color channel per pixel. This information is stored as a seemingly 1-channel image, but with an understood convention for interpreting each pixel as the red, green or blue intensity value given some pattern. The most common is the Bayer pattern. In this assignment, we'll assume the top-left pixel is green, its right neighbor is blue, its neighbor below is red, and its kitty-corner neighbor is also green.
Q: Why are more sensors devoted to green?
Hint: 🐒
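The pattern described above repeats every two rows and two columns, so the color measured at a pixel depends only on whether its row and column are even or odd. A sketch of that lookup follows; the Channel enum and the bayer_channel name are hypothetical and are not taken from the assignment's headers.

```cpp
// Hypothetical names, used only for this illustration.
enum class Channel { Red = 0, Green = 1, Blue = 2 };

// Which color the mosaic stores at zero-based (row, col), assuming the layout
// described above: green at the top-left, blue to its right, red below it,
// and green again on the diagonal.
inline Channel bayer_channel(int row, int col)
{
  const bool even_row = (row % 2 == 0);
  const bool even_col = (col % 2 == 0);
  if (even_row == even_col) return Channel::Green;
  // Even row and odd column is blue; odd row and even column is red.
  return even_row ? Channel::Blue : Channel::Red;
}
```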
To demosaic an image, we would like to create a full rgb image without downsampling the image resolution. So for each pixel, we'll use the exact color sample when it's available and average available neighbors (in all 8 directions) to fill in missing colors. This simple linear interpolation-based method has some blurring artifacts and can be improved with more complex methods.
RGB is just one way to represent a color. Another useful representation is to store
the hue, saturation, and value of a
color. This "hsv" representation also has 3 channels: typically, the
hue or h channel is stored in degrees
(i.e., on a periodic scale) in the range $[0^\circ, 360^\circ]$, and the
saturation s and value v are given as absolute
values in $[0, 1]$.
Converting between rgb and hsv is straightforward and makes it easy to implement certain image changes such as shifting the hue of an image (e.g., Instagram's "warmth" filter) and the saturation of an image (e.g., Instagram's "saturation" filter).
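Because hue lives on a periodic scale, shifting it means adding an angle and wrapping the result back into range. The sketch below shows only that wrapping step, assuming hue is measured in degrees and kept in [0, 360); the function name wrap_hue is hypothetical.

```cpp
#include <cmath>

// Hypothetical helper: add `shift` degrees to hue `h` and wrap the result
// into [0, 360). Assumes both arguments are finite.
inline double wrap_hue(double h, double shift)
{
  // std::fmod keeps the sign of its first argument, so negative results
  // need one extra turn added.
  double shifted = std::fmod(h + shift, 360.0);
  if (shifted < 0.0) shifted += 360.0;
  // Guard against rounding pushing a tiny negative value up to exactly 360.
  if (shifted >= 360.0) shifted = 0.0;
  return shifted;
}
```

A full hue shift of an image would apply this to the h channel of every pixel after converting to hsv, then convert back to rgb.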
Every assignment, including this one, will contain a Tasks section. This will enumerate all of the tasks a student will need to complete for this assignment. These tasks will match the header/implementation pairs in the
include/ and src/ directories.
Implementations of nearly any task you're asked to implement in this course can be found online. Do not copy these and avoid googling for code; instead, search the internet for explanations. Many topics have relevant Wikipedia articles. Use these as references. Always remember to cite any references in your comments.
You are free, and encouraged, to use standard library functions from #include <algorithm>
and #include <cmath>
such as std::fmod
and std::fabs
.
Extract the 3-channel rgb data from a 4-channel rgba image.
Write an rgb or grayscale image to a .ppm file.
At this point, you should start seeing output files:
bayer.ppm
composite.ppm
demosaicked.ppm
desaturated.ppm
gray.ppm
reflected.ppm
rgb.ppm
rotated.ppm
shifted.ppm
Horizontally reflect an image (like a mirror).
Rotate an image 90° counter-clockwise.
Convert a 3-channel RGB image to a 1-channel grayscale image.
Simulate an image acquired from the Bayer mosaic by taking a 3-channel rgb image and creating a single channel grayscale image composed of interleaved red/green/blue channels. The output image should be the same size as the input but only one channel.
Given a mosaiced image (interleaved GBRG colors in a single channel), create a 3-channel rgb image.
Convert a color represented by red, green and blue intensities to its representation using hue, saturation and value.
Convert a color represented by hue, saturation and value to its representation using red, green and blue intensities.
Shift the hue of a color rgb image.
Hint: Use your rgb_to_hsv
and hsv_to_rgb
functions.
Desaturate a given rgb color image by a given factor.
Hint: Use your rgb_to_hsv
and hsv_to_rgb
functions.
Compute C = A Over B, where A and B are semi-transparent rgba images and "Over" is the Porter-Duff Over operator.
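For reference, the Over operator on a single pixel can be sketched as below. This assumes non-premultiplied colors with every channel, including alpha, already converted to a double in [0, 1]; the Rgba alias and the over_pixel name are hypothetical, and how the assignment expects 8-bit values to be handled is not covered here.

```cpp
#include <array>

// Hypothetical pixel type: {red, green, blue, alpha}, each in [0, 1].
using Rgba = std::array<double, 4>;

// Porter-Duff Over for non-premultiplied colors: A composited on top of B.
inline Rgba over_pixel(const Rgba & a, const Rgba & b)
{
  const double alpha_a = a[3];
  const double alpha_b = b[3];
  // B only shows through the part of the pixel that A does not cover.
  const double alpha_c = alpha_a + alpha_b * (1.0 - alpha_a);
  Rgba c{{0.0, 0.0, 0.0, alpha_c}};
  if (alpha_c > 0.0)
  {
    for (int k = 0; k < 3; ++k)
    {
      c[k] = (a[k] * alpha_a + b[k] * alpha_b * (1.0 - alpha_a)) / alpha_c;
    }
  }
  // If both inputs are fully transparent, the color is left as black.
  return c;
}
```

An opaque A hides B entirely, while a fully transparent A leaves B unchanged.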
Submit your completed homework on MarkUs. Open the MarkUs course
page and submit all the .cpp
files in your src/
directory under
Assignment 1: Raster Images in the raster-images
repository.
Direct your questions to the Issues page of this repository.
Help your fellow students by answering questions or posting helpful tips on the Issues page of this repository.
You will need to install Xcode if you haven't already.
Many linux distributions do not include gcc and the basic development tools in their default installation. On Ubuntu, you need to install the following packages (more than needed for this assignment but should cover the whole course):
sudo apt-get install git
sudo apt-get install build-essential
sudo apt-get install cmake
sudo apt-get install libx11-dev
sudo apt-get install mesa-common-dev libgl1-mesa-dev libglu1-mesa-dev
sudo apt-get install libxrandr-dev
sudo apt-get install libxi-dev
sudo apt-get install libxmu-dev
sudo apt-get install libblas-dev
Our assignments only support the Microsoft Visual Studio 2015 compiler in 64-bit mode. They will not work with a 32-bit build or with older versions of Visual Studio.
This markdown document, and those for all other assignments, contains
$\LaTeX$ math. GitHub just shows the un-evaluated LaTeX code, but other markdown browsers may show the typeset math. Alternatively, open the README.html
(must be online) to view the equations.
For reference, you can generate
README.html
from the README.md
using multimarkdown:
cat markdown/header.md README.md | multimarkdown --process-html -o README.html