CS559
Fall 2005



Version 2 - 9/27/2005 - A few hints are given, as well as some ideas as to what the expectations are for parts 4 and 5.

Version 1 - 9/26/2005 - There are some parts that aren't filled in yet, but these shouldn't prevent you from getting started.

Project 1 : Image Manipulations

Due Tuesday, October 11th 2005, 9:30am

The purpose of this project is to have you implement some image manipulations. This will let you demonstrate that you understand image manipulation, and give you an opportunity to experience some of the pragmatic issues in doing it.

This project consists of several independent pieces. You can think of it as five separate projects. You can write five separate programs. However, since these five parts all have lots in common, you are probably better off thinking about them together so you can share code between them. You could write one program that has some interface for doing all of the options.

  1. Resize: reduce
  2. Resize: enlarge
  3. Convert to Black and White
  4. Impressionist Paint
  5. Red Eye Reduction

(in case you're wondering, this is similar to last year's except we split resize into two separate operations to simplify grading).

Late Bonus:
Because this project was assigned late, we will simplify it as follows:
We will drop the lower of your grades on parts 4 and 5 (so your grade will be the average of parts 1,2,3 and whichever of parts 4 and 5 you do best on).
[[ yes, this means you can just do 4 or 5, but we really hope that you try to do both, since you will learn different things by doing them ]]

For all Parts:

Each of these operations should read in a Targa file and write back another Targa file. You may prompt the user (for example, using an FLTK file dialog box), or you may take the file names as command line arguments. Remember that we define a Targa file to be an image that can be read and written with the Libtarga library, so we recommend that you use that library.

Only the last part (interactive photo fix) requires a GUI. Note that this part of your assignment will probably fulfill the requirements for Programming Assignment 3, so when we grade your project we will check for that. If you choose to do the Interactive Photo Fix in a way that does not meet the Programming Assignment 3 requirements, you'll need to show off something else that does.

We do not care what the user interface is for performing the operations. You might have five different command line programs, or one program with a GUI, or something in between. You must document your program/programs well enough that we can use them if we need to. Most importantly, you must be able to use your program.

Grading Difficulty Options

For several of the parts of this project, you will have options as to the "difficulty level" of what you try to do. For example, there might be an "easy" option and a "difficult" option. Each option will be assigned a letter grade that is the highest grade you can get for taking this option. For example, if you take the "B" option on some part and do it really well, you'll get a B for that part. If you take the "A" option and your program doesn't get a correct answer, you might get a "C".

You must tell us which option you chose to do before we do the grading (you must specify it in your README file).

The grading standards are designed such that if you have a correct solution to a lower difficulty option and hand it in as a partial solution to a higher option, you will get a lower grade than if you had handed it in as the lower option.

You may choose a different option for each of the parts of the assignment.

For some of the parts of this project, there are "Bonus" or "Extra Credit" options. Remember that it is course policy that you cannot use extra credit to raise your grade. If you choose to do extra credit, you must also do the normal graded part.

Grading

Each of the five parts of the assignment will be graded. Your total grade will be the average of them (all weighted equally).

You will be evaluated on:

  • The correctness of your program, as evaluated by your demonstration of it.
  • The quality of your code and documentation (remember your README).
  • The answers that you turn in to the questions that go along with the various parts of the assignment.
  • Correctly following directions.

Note: we will not evaluate the efficiency of your program (within reason). Concentrate on writing programs that get correct answers and are clean and readable code. If your program takes more than 10-20 seconds to process a 600x400 image, then maybe there's something wrong - but don't worry about trying to get into a race with Photoshop.

More specific information on grading will be added later.

Demonstrations:

The main testing of your program will happen at a scheduled demonstration in B240. You will compile and run your program for us - we'll bring the test data.

 

The Five Image Processing Operations:

1. Resize: Reduce

Your program must take a TGA image and produce a smaller version of it. The hints might help you!

There are 4 difficulty options:

"C" Option:
Implement downsizing by integer multiples (e.g. half, third, quarter). Do this by "pixel dropping" (i.e., no reconstruction filter and no sampling filter).
"B" Option number 1:
Implement downsizing by integer multiples using a proper sampling filter to avoid aliasing artifacts.
"B" Option number 2:
Implement downsizing by an arbitrary fraction using a variable reconstruction filter to provide linear or cubic interpolation between samples.
"A" Option:
Implement downsizing by an arbitrary fraction using a correct reconstruction filter (to provide cubic interpolation) and proper sampling to avoid aliasing.
Bonus Idea:
Implement some image transformation functions other than scaling such as swirling the image or rotating it.

Written Questions: (for B and A options only):

If you did proper sampling (B1 or A) -

  1. What sampling filter did you choose? How do you compute the filter given the resize amount?
  2. Describe an image that we could use to test to make sure that you really are doing proper filtering. Explain what it should look like when things work correctly, and what it would look like if you weren't doing sampling correctly.

If you did a variable reconstruction filter (B2 or A) -

  1. What reconstruction filter did you choose? How do you compute the filter given the resize amount and the pixel location?
  2. Describe an image that we could use to test to make sure that your reconstruction filter worked. Explain how the result would be different than if you were using nearest neighbor.

2: Resize: Enlarge

This is similar to reduce, except that you should make the image bigger.

There are Difficulty Options:

"C" Option:
Implement pixel replication to scale the image by integer multiples.
"B" Option:
Allow the user to scale the image by integer multiples. Allow the user to select between nearest neighbor, linear, and cubic reconstruction kernels.
"A" Option:
Allow the user to scale the image by any fraction greater than 1. Allow the user to select between nearest neighbor, linear, and cubic reconstruction kernels.
Recommended: try different cubic kernels to see how they are different.
Bonus Option:
Support non-uniform scaling (different amounts for X and Y)

Written Questions

If you did "B" or "A":

Describe the difference between the different kernels that you implemented in terms of how the images appear. On what images can you see a difference? Which kernels look better for which images?

3: Convert to Black and White

Your program must be able to convert a color image to a black and white one. That is, the result of this command must be an image where each pixel is either black or white (0,0,0 or 255,255,255).

In making the conversion, your program should account for the perceptual sensitivities of the human eye. (e.g. we are more sensitive to green than to red, and more sensitive to red than to blue).

Because our eyes are more sensitive to green, equal amounts of green, blue, and red light will not appear equally bright. This means that if you are converting between color and grayscale, the color red (255,0,0) should not turn into the same brightness as the color green (0,255,0).

The exact ratios vary. One simple rule of thumb says green is twice as sensitive as red, which in turn is twice as sensitive as blue; that gives the R,G,B ratios 2/7, 4/7, 1/7, or about .28, .57, .14. I have also seen systems that use r = 0.212671, g = 0.715160, b = 0.072169.

The closest thing to an "official" standard is what is used in NTSC (the North American video standard) to compute brightness from colors. That's: Y = 0.30R + 0.59G + 0.11B.

There are many possible quantization algorithms. Several were discussed in class, or in the readings. You should pick the best one that you can implement. However, it is better to have a correctly working simple threshold than an incorrectly working error diffusion method.

The basic "signs of life" required for this part is that your program successfully creates a valid black and white image, and that the image resembles the input. Your program's ability to capture gradations in tone will get you a better grade.

Difficulty Options:

"CD" Option:
Implement simple thresholding
"B" Option:
Implement some form of ordered dithering
"A" Option:
Implement Floyd-Steinberg or some other error diffusion method. If you choose an algorithm other than Floyd-Steinberg (or Ostromoukhov's algorithm), please check with us first.
"Bonus" Option 1:
Allow your program to take a list of colors (so black and white is a special case) and do error diffusion to map the image into these colors.
"Bonus" Option 2:
Use a more sophisticated error diffusion method than Floyd-Steinberg. For example, there are "artistic" halftoning methods in the literature. Or, here is another algorithm that isn't much more complicated to implement than Floyd-Steinberg. Note: you must actually implement the algorithm (since there is code for Ostromoukhov's algorithm on the web).
"Bonus" Option 3:
Implement Floyd-Steinberg and Ostromoukhov's algorithm. (If you implement FS, you can adapt one of the implementations of Ostromoukhov's that is available on the web). Provide a comparison. Find images that look better with one algorithm or the other.

Written Questions:

If you did the "B" option:
Describe the masks that you used and explain how many different gray levels your algorithm can display effectively.
If you did the "A" or "Bonus" option:
Describe the algorithm that you used. If you did use Floyd-Steinberg, explain how you deal with edge overflow. If you used a different algorithm compare it to Floyd-Steinberg.

4: Impressionist Painting

Your program must take a color image and produce a new color image that is a "painted" version of the original.

The basic idea is that you sample the original image, and for each sample you place a "brush stroke" in the destination image. You need to randomize the order of the strokes a little. This is a case where undersampling/aliasing actually creates some nice effects.

Here's an example (click on an image to get the bigger versions):


original

straight strokes

curly strokes

 

One possibility is to have the user specify where the brush strokes go (have them click on the image and brush strokes appear there). The original version of the technique did it this way. You can see a Java implementation of it here. Paul Haeberli, the inventor of the technique, is a UW alum!

Implementing the interactive version is fun, and could probably serve as your solution to Programming Assignment 3. However, you must also have a "batch mode" version that processes a whole picture without any user intervention.

We've seen some pretty creative solutions to this assignment in past years.

Here are some ideas:

  • The simplest thing to do is to pick a simple brush shape (a small circle or square?) that you can easily draw into the image, and make all the strokes be the same. Get this to work first. The straight and curvy strokes above both do this (for more complicated brush shapes).
  • Try to implement different shapes for the strokes (as the example, or the impressionist demo, shows). My sample implementation read in this image that contained many brush shapes.
  • Adding a little bit of randomness or variety can make your pictures more "painting like". Try adding a little bit of randomness to the color or the position of the strokes. Try adding some randomness to the direction or size of the strokes. My program would randomly choose strokes from a row of the image above.
  • Try adding highlights or shadows to your strokes. The stroke image above shows how my program could mark certain pixels of the stroke to be darker or lighter.
  • (Bonus only) Try adapting the strokes based on the content of the image. For example, where the image has more detail, you might use a larger number of smaller strokes. You might orient the strokes based on the direction of lines in the image. You might cut strokes off if they cross edges in the image.
    Of course, this would require you to know how to analyze images to determine what's in them, and that's the subject of another course (Computer Vision).
    We can provide you with some hints if you are motivated.

Making good paintings from images is a hard problem. There is actually a lot of research out there. Haeberli's original paper is easy (and fun) to read. Here is another old (but good) source of lots of ideas. The version of this project that was assigned at U. Texas has even more ideas. (Note: the Texas project has students use the OpenGL graphics library to render the brush strokes. You must render the brush strokes yourself, actually writing the pixels into the image.)

A note on expectations:
We don't expect you to do cutting edge research - this is part of a project for an undergraduate graphics class.
Many of the techniques that adapt the strokes to the image require a tiny bit of computer vision know-how. (OK - easy version - a high-pass filter can tell you where there's detail in an image, and that you should use more small strokes there.)
What's great about impressionist painting, however, is how good you can make it look just doing the basic stuff. Most people find it really fun.

Difficulty Options:

"B" Option:
Implement the "basic idea" using a single brush shape. To get full credit, it should be clear from the results that your resulting image is "painted" (made of many brush strokes).
"A" Option:
Do something more creative to make better looking paintings. Implement some of the suggestions above.
Bonus Option:
You can go totally crazy with this, trying to recreate different artistic styles, automatically choosing different brush strokes over the image, ... - There's a whole literature of artistic drawing techniques that you could implement!

Written Question:

Describe the algorithm that you implemented. One specific detail that we will want to know: how do you choose where to sample/place brush strokes? If you do anything fancy, be sure to describe it.

Hand in Note: please turn in 1-3 small images produced with your painting algorithm that shows off how well it works.

 

5: Red-Eye Reduction

When you take a picture of a person with a flash that is too close to the lens, the light bounces off their retina (which is red from all the blood vessels) and back into the camera, causing their eyes to glow red. There are various ways to avoid this, such as moving the flash away from the lens (which is hard on a small camera) or making a few pre-flashes to cause the person's pupils to close so no light gets inside to the retina. (This latter trick doesn't work on babies, who don't have that reflex, or on drunk or stoned people, who lose it.)

(Note: we will soon provide you with lots of sample images, as well as the "ZoomerWindow" example program. What makes this program interesting is that it lets you zoom in on a part of the image and paint on it accurately - very useful when you're trying to paint on something small like the eyes).

Needless to say, sometimes you get pictures where the person's eyes are glowing red. For this part of the project, you must make a tool that fixes the problem.

There are two parts to fixing red-eye:

  1. Determining which pixels to change
  2. Determining what color to change them to

Each part is surprisingly hard. The reason we're giving you this project is that we want you to experiment and realize how hard it is to identify colors reliably, among other things.

A simple solution would require the user to specify each pixel by painting (now you know why I gave you the ZoomerWindow example), and paint it black. Not only is this a lot of work for the user, it also looks bad.

The ultimate solution would automatically find the red eyes and change the pixels in a realistic manner so you couldn't tell the image had been manipulated. This is really hard. Even commercial software doesn't do it perfectly.

Realistic, good solutions require the user to point to where the eye is, have the system figure out the problematic pixels near where the user points, and apply some set of color corrections to get the red out without making the eyes look "wrong" (hint: black eyes look almost as weird as red eyes).

Your solution will be evaluated by how much effort it is for the user, and how good the eyes look on some test cases. So, for example:

  • The ZoomerDemo is a lot of work for the user (they have to paint each pixel), and it doesn't look too good (since it makes the eyes green), so it qualifies for the "signs of life" level, but not much more.
  • The 2003 TA's sample solution allowed the user to draw a box around the eyes. It found all of the "bright red" pixels in the box, and dimmed them. This was easy for the user, and gave OK results.
  • Another sample solution had the user click once somewhere inside the eye (very easy for the user). It searches outward to figure out the region of the eye, and does some things to get rid of the red but keep the shiny "glint". This would be an excellent solution, except that it still has lots of bugs, so it doesn't really work. (I wanted to write up the assignment first - I'll fix the program later.) So the basic signs of life of the ZoomerWindow would actually be worth a better grade.

Note: we will test your program by having you give a demo of it. We will provide some test images for you to practice on - check the class announcements web page for details.

A note on expectations:
Doing good red eye reduction is hard. Doing it really well requires a lot of thought, and some technology beyond 559. That's not what we expect. What we do expect is that you experiment a bit to realize how hard it is to identify what is "red", and to experiment a little bit of how to change images in ways that don't look horrible.

The difficulty levels:

"C" Option:
Provide a painting interface so that the user can paint the red pixels.
"B" Option:
Provide some automation such that the user doesn't need to specify all of the pixels accurately. (for example, if the user paints a non-red pixel, the system leaves it alone, or the user draws a box around a region and the system finds red pixels).
"A" Option:
Provide some automation and avoid making the eyes be a constant color, without making things look too "jaggy".

Written Question

Describe your algorithm.

Handing in your assignment

Your assignment will be handed in by placing files into your handin directory (P:/course/cs559-gleicher/handin/LOGIN/p1). The "effective date" of your handin will be the timestamp of the last file put into this directory.

Your handin must include:

  • a README file that describes each file that you handed in. It must also say which grading option you have chosen for each part. Note: this may be a text file or html. Be sure to explain any code that you've taken from other sources.
  • an INSTRUCTIONS file that tells us how to use your program. This may be a text file or html.
  • all source code (.cpp, .h, ...). You should include libtarga.c and libtarga.h.
  • The Visual Studio solution and project files (.sln, .vcproj) required to build your program.
  • Your sample painted results.

You should not include any of the intermediate files or an executable.

Note: when we test your program, we will copy your handin directory to the local disk of a STORM machine, load the solution file into Visual Studio, and hit compile. You should check to make sure this will work.

Some Implementation Hints

Some hints to make your life easier:

  1. The resizing operations are separable. (see page 104 of the text)
  2. On a modern processor (like the Pentium 4s in the Storm lab), floating point is really fast. Doing all of your computations in floating point may make things easier (especially for the resampling operations). The thing that is NOT fast is converting between integers and floats. If you choose to do floating point computations, you should convert the image, do the computation, and convert it back at the end.
  3. One thing missing from our (intentionally hand-wavy) discussion of image processing was the connection between frequency and the actual kernel widths. Qualitatively, the radius of the resampling kernel (or at least the main hump of it) should be the same as the distance between the samples.
  4. If #3 wasn't concrete enough for you, here's a specific thing:
    For halving an image (taking every other pixel) the 2nd binomial filter 1/4 (1 2 1) seems about right. For sampling every third pixel the 4th one 1/16 (1 4 6 4 1) is OK. Notice how these kernels overlap just enough that all pixels get an even weight. You can generate kernels in between by interpolating the kernels (so, if you want to divide by 2.5, you might take 1/2 of filter 2 and 1/2 of filter 4 - remember that 1 2 1 is 0 1 2 1 0)
  5. If you really want to understand the ramifications of #4, what you should do is try different filters and see what happens. Pick one that's a little too narrow and see the aliasing. Pick one that's a little too wide and see the excess blurring.