Friday, November 15, 2013

Sonja Kovalevsky Days 2013

I was invited to give a talk at the Sonja Kovalevsky Days to show some bright high school mathematics stars what working with mathematics can be like. The presentation titled "Image recognition and mathematics" is available here.

I covered two main tools, support vector machines and principal component analysis, and showed how they can be used in face detection and recognition.

The code for 2D PCA visualization is available here. The Eigenfaces example is from scikit-learn here. The SVM plots are from this, this, and this example.
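
For readers who want to experiment without the linked files, here is a minimal sketch of 2D PCA using NumPy's SVD. It is not the code from the talk: the synthetic data, the random seed and the function name pca_2d are made up for illustration.

```python
import numpy as np

def pca_2d(X):
    """Return the mean, principal directions (as rows) and variances of X.

    X is an (n_samples, n_features) array. This is an illustrative helper,
    not the code used in the presentation.
    """
    mean = X.mean(axis=0)
    Xc = X - mean  # center the data
    # rows of Vt are the principal directions, sorted by singular value
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = S**2 / (X.shape[0] - 1)
    return mean, Vt, variances

# synthetic, strongly correlated 2D points (assumed example data)
rng = np.random.RandomState(0)
t = rng.normal(size=200)
X = np.column_stack([t, 0.5*t + 0.1*rng.normal(size=200)])

mean, components, variances = pca_2d(X)

# project every point onto the first principal direction
projected = (X - mean).dot(components[0])
```

The first row of components points along the direction of largest variance, which is what a 2D PCA plot usually draws as the long axis through the point cloud.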

Monday, August 19, 2013

Anaconda for Scientific Computing and Computer Vision

I recently came across the Anaconda Python distribution. It comes with Python, NumPy, SciPy, Pandas, Matplotlib, Numba and many other packages, and you can set up virtual environments for any combination of Python and package versions. Since it uses virtual environments, it installs in its own folder and doesn't mess with your other Python installations. A list of included packages is here.

Using Anaconda

Here's the quick version of how to get started. Open a terminal and create an environment. For example, to create a new named environment with Python 2.7:
$ conda create -n py27 python=2.7 anaconda
To activate this environment, use:
$ source activate py27
To deactivate this environment, use:
$ source deactivate
Activation and deactivation work the same way as with Python's virtualenv (as far as I can tell).

Try an example with images, machine learning, and plotting

Download the scikit image quantization example. Activate your environment:
$ source activate py27
Notice that the prompt shows you are in the environment. Start an IPython session:
(py27)$ ipython --pylab
Run the file you downloaded:
In [1]: execfile("")
You should get three plots like the ones described here. Here's one of them

Monday, April 22, 2013

Watermarking images with OpenCV and NumPy

Here's a simple Python example of adding a watermark to an image using NumPy and OpenCV. I created a black/white version of the OpenCV logo for this example but feel free to use any image you like. In this case I wanted to have a white watermark image, white text, and the rest of the image unchanged. If you have watermark images in color or grayscale the same process should work. Here's the code using the OpenCV license plate sample image:
import numpy as np
import cv2

# read images
original = cv2.imread('data/licenseplate_motion.jpg')
mark = cv2.imread('logo.png') 

m,n = original.shape[:2]

# create overlay image with mark at the upper left corner, use uint16 to hold sum
overlay = np.zeros_like(original, "uint16")
overlay[:mark.shape[0],:mark.shape[1]] = mark

# add the images and clip (to avoid uint8 wrapping)
watermarked = np.array(np.clip(original+overlay, 0, 255), "uint8")

# add some text 5 pixels in from the bottom left
cv2.putText(watermarked, "Watermarking with OpenCV", (5,m-5), cv2.FONT_HERSHEY_PLAIN, fontScale=1.0, color=(255,255,255), thickness=1)

cv2.imshow("original", original)
cv2.imshow("watermarked", watermarked)

# wait for a key press so the windows stay open
cv2.waitKey(0)
Note that I created an array of type "uint16" to hold the sum of the two images before clipping with NumPy's clip function. The text is added with OpenCV's putText function which will need position, font (out of the available OpenCV fonts) and text scale and a color tuple (here white). The result looks like this:
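
To see why the sum needs the wider type, here is a small NumPy-only sketch. The 2x2 arrays are made-up stand-ins for the photo and the logo, not the images used above.

```python
import numpy as np

# tiny stand-ins for the image and the watermark (assumed example values)
original = np.array([[200, 100], [50, 255]], dtype="uint8")
mark = np.array([[100, 0], [0, 255]], dtype="uint8")

# adding two uint8 arrays wraps around modulo 256
wrapped = original + mark
# wrapped[0,0] is 44 (300 - 256), not a saturated 255

# holding the mark in uint16 makes the sum uint16, so it can be clipped
overlay = mark.astype("uint16")
clipped = np.clip(original + overlay, 0, 255).astype("uint8")
# clipped[0,0] is 255, and pixels where the mark is 0 are unchanged
```

With the wider type, bright pixels under the watermark saturate to white instead of wrapping around to dark values.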

Tuesday, September 18, 2012

Installing PIL on OS X Mountain Lion

Installing PIL, the Python Imaging Library, on OS X sometimes gives problems with missing JPEG support. I have run into this problem before but could not find my solution so I'm adding it here as a note to self with "PIL" and "install" tags so I can locate the trick again in the future.
Here's the procedure.

1. Get libjpeg from Then in the unpacked folder:
sudo make install

2. Download PIL from (Remove PIL first if installed already.)

3. In the unpacked folder:
python build --force
sudo python install
The magic part for me was "--force". Without it it doesn't work. If you still don't get "JPEG support available" in the console when you run the first build command, then you can try adding the path to libjpeg in (basically just replace the "None" with your path). For example like this:
JPEG_ROOT = "/usr/local/lib"

Sunday, August 12, 2012

Reading Gauges - Detecting Lines and Circles

I received a question from a reader on how I would approach reading a simple gauge with one needle on a good frontal image of a circular gauge meter. This makes a good example to introduce Hough transforms. Detecting circles or lines using OpenCV and Python is conceptually simple (each particular use-case requires some parameter tuning though). Below is a simple example using the OpenCV Python interface for detecting lines, line segments and circles. The documentation for the three relevant functions is here. You can also find more on using the Python interface and the plotting commands in Chapter 10 of my book.
import numpy as np
import cv2

"""
Script using OpenCV's Hough transforms for reading images of 
simple dials.
"""

# load image (OpenCV reads color images in BGR order) and convert to grayscale
im = cv2.imread("gauge1.jpg")
gray_im = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)

# create version to draw on and blurred version
draw_im = cv2.cvtColor(gray_im, cv2.COLOR_GRAY2BGR)
blur = cv2.GaussianBlur(gray_im, (0,0), 5)

m,n = gray_im.shape

# Hough transform for circles
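# assumes the Hough gradient method (cv2.cv.CV_HOUGH_GRADIENT in OpenCV 2.x)
circles = cv2.HoughCircles(gray_im, cv2.cv.CV_HOUGH_GRADIENT, 2, 10, np.array([]), 20, 60, m/10)[0]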

# Hough transform for lines (regular and probabilistic)
edges = cv2.Canny(blur, 20, 60)
lines = cv2.HoughLines(edges, 2, np.pi/90, 40)[0]
plines = cv2.HoughLinesP(edges, 1, np.pi/180, 20, np.array([]), 10)[0]

# draw 
for c in circles[:3]:
 # green for circles (only draw the 3 strongest)
 cv2.circle(draw_im, (int(c[0]),int(c[1])), int(c[2]), (0,255,0), 2) 

for (rho, theta) in lines[:5]:
 # blue for infinite lines (only draw the 5 strongest)
 x0 = np.cos(theta)*rho 
 y0 = np.sin(theta)*rho
 pt1 = ( int(x0 + (m+n)*(-np.sin(theta))), int(y0 + (m+n)*np.cos(theta)) )
 pt2 = ( int(x0 - (m+n)*(-np.sin(theta))), int(y0 - (m+n)*np.cos(theta)) )
 cv2.line(draw_im, pt1, pt2, (255,0,0), 2) 

for l in plines:
 # red for line segments
 cv2.line(draw_im, (l[0],l[1]), (l[2],l[3]), (0,0,255), 2)

# save the resulting image
In turn, this will read an image, create a graylevel version for the detectors, detect circles using HoughCircles(), run edge detection using Canny(), detect lines with HoughLines(), detect line segments with HoughLinesP(), draw the result (green circles, blue lines, red line segments), show the result and save an image. The result can look like this:

From these features, you should be able to get an estimate of the gauge reading. If you have large images, you should probably scale them down first. If the images are noisy, you should adjust the blurring for the edge detection. There are also threshold parameters to play with; check the documentation for what they mean. Good luck.
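
As a rough sketch of that last step, one way to go from a detected circle center and a needle segment to a reading is to convert the needle direction to an angle and interpolate between two calibration angles. The function names, the calibration angles (135 and 45 degrees) and the 0 to 100 scale below are all assumptions for illustration; a real gauge needs its own calibration, and the sweep between the two angles must be nonzero.

```python
import math

def needle_angle(center, segment):
    """Angle in degrees (0-360) of the needle tip seen from the dial center.

    Uses image coordinates (y grows downward), so the angle grows clockwise
    on screen, with 0 pointing right. Hypothetical helper.
    """
    cx, cy = center
    x1, y1, x2, y2 = segment
    # take the endpoint farthest from the center as the needle tip
    d1 = math.hypot(x1 - cx, y1 - cy)
    d2 = math.hypot(x2 - cx, y2 - cy)
    tx, ty = (x1, y1) if d1 > d2 else (x2, y2)
    return math.degrees(math.atan2(ty - cy, tx - cx)) % 360.0

def gauge_reading(angle, min_angle, max_angle, min_value, max_value):
    """Linearly interpolate a reading, sweeping clockwise from min_angle."""
    sweep = (max_angle - min_angle) % 360.0
    fraction = ((angle - min_angle) % 360.0) / sweep
    return min_value + fraction * (max_value - min_value)

# made-up example: dial centered at (100,100), needle pointing straight up,
# scale running clockwise from lower left (135 deg) to lower right (45 deg)
angle = needle_angle((100, 100), (100, 100, 100, 50))
reading = gauge_reading(angle, 135.0, 45.0, 0.0, 100.0)
```

In practice the center would come from one of the HoughCircles() results and the segment from one of the HoughLinesP() results, picking the segment that passes closest to the center.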

Monday, June 25, 2012

Book: Programming Computer Vision with Python

Finally, after many nights and weekends, my O'Reilly book is out! You can buy it from O'Reilly here.

Thanks to everyone who helped with feedback and comments on the draft versions I put online. The code and datasets are available from

Sunday, June 24, 2012

Arnold's Cat Map

Arnold's cat map is a fun mapping to randomize images by shuffling the pixels around. It distorts the image by shearing and then moves the pieces outside the original image region back using the integer mod function. Applied iteratively, this results in randomizing the image in a way that eventually returns the original. Here's a code example that iteratively applies the mapping and saves intermediate images:
import Image
from numpy import *
from scipy.misc import lena,imsave

# load image
im = array(Image.open("cat.jpg"))
N = im.shape[0]

# create x and y components of Arnold's cat mapping
x,y = meshgrid(range(N),range(N))
xmap = (2*x+y) % N
ymap = (x+y) % N

for i in xrange(N+1):
 im = im[xmap,ymap]
Tradition has it that this mapping should be applied to pictures of cats. (The name comes from Vladimir Arnold, who demonstrated the mapping on an image of a cat.)
Cat image (128*128) at iteration 0, 1, 64, 126, 127, 128. (original image here, credit CC Flickr @outlier*)
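
If you want to check the "eventually returns the original" claim without an image file, here is a small sketch that uses the same index maps as the code above on an array of unique pixel labels and counts iterations until the array comes back. The function names and the max_iter cutoff are made up for illustration.

```python
import numpy as np

def cat_map_indices(N):
    """Index arrays for the mapping, built the same way as in the post."""
    x, y = np.meshgrid(np.arange(N), np.arange(N))
    return (2*x + y) % N, (x + y) % N

def iterations_until_return(N, max_iter=10000):
    """Apply the mapping until the array equals the original again.

    Returns the iteration count, or None if max_iter is reached first.
    """
    xmap, ymap = cat_map_indices(N)
    # every pixel gets a unique label, so equality means every pixel is back
    original = np.arange(N*N).reshape(N, N)
    im = original
    for i in range(1, max_iter + 1):
        im = im[xmap, ymap]
        if np.array_equal(im, original):
            return i
    return None

period = iterations_until_return(8)
```

Because the mapping only permutes pixel positions, some number of iterations always restores the image; how many depends on the image size N.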