

Posts with tag "numpy":

Making pictures using triangles (an exercise in stochastic hill-climbing algorithms) – Part 1

posted on: Wednesday, December 4, 2013 (4:02 am) by Chase Stevens

In this blog post, I'd like to introduce a project I've been tinkering with, on and off (mostly off), for the past few months. The idea behind the project was pretty simple: to write a program which, given a set number of triangles and a blank canvas, adjusts the coordinates, color, and transparency of the triangles over time in an attempt to replicate a given image. There were two primary motivations behind my project, at least initially:

  • to create (lossily) compressed versions of the source images that still look decent (and might even reduce image noise), and
  • to have those compressed versions themselves be scalable to any size.
While I found that my program met these goals fairly adequately, I was both surprised and delighted by its aesthetically pleasing and almost artful representations of the source images. Therefore, before delving into the more technical aspects of the implementation, I'd like to present a few samples of the program's results. The first example uses 50 triangles, while the other two use 100.

Example 1

  • Output at iteration 9,500 (image size: 3.38KB; compressed SVG size: 0.88KB)
  • Output at iteration 50,000 (image size: 2.95KB; compressed SVG size: 0.88KB)
  • Output at iteration 1,293,000 (image size: 2.74KB; compressed SVG size: 0.88KB)
  • Original image (size: 6.53KB)

This was one of the very first tests of my program, and is in my opinion also one of its most successful and impressive recreations. As you might infer from the stark difference between iterations 9,500 and 50,000 and the moderate difference between iterations 50,000 and 1,293,000, the program's output approaches the target image roughly logarithmically with the number of iterations; this trend is plotted in the graph below.
Graph of recreation error over iterations
The corresponding vector (SVG) version of the final result (below) is actually noticeably less accurate than the JPEG posted above, because the program has exploited JPEG artifacts, as well as the specific dimensions of the target, in creating its reproduction. When the triangles are rendered as an SVG, these artifacts are no longer present and the image becomes less faithful.
One of the most noticeable issues with the program's image is that it entirely fails to capture the eyes in the source image. I explored various methods of solving this issue, which I will discuss in a later post on implementation. Overall, though, I thought that the program produced a more pleasing (if less accurate) image, kilobyte-for-kilobyte, than comparable "traditional" compression techniques (as pictured below), especially when enlarged.
  • Program's final output (image size: 2.74KB; compressed SVG size: 0.88KB)
  • Original image (size: 6.53KB)
  • Image compressed with Paint.NET (size: 1.91KB; compressed size: 1.58KB)
To finish off this example, here is a video of the program's progress, played back at 30 frames per second with a frame captured every 500th iteration. The "jumps" in activity visible at 0:23 and 0:56 are instances where I made updates to the program.

Example 2

  • Output at iteration 737,500 (image size: 4.03KB; compressed SVG size: 1.54KB)
  • Original image (size: 3.51KB)

This example again highlighted an issue with the image similarity evaluation methods the program employs. While many aspects of the source image are replicated with high fidelity, Christian Bale's facial features are not. Below, again, is a video of the program's progress over time.

Example 3

  • Output at iteration 769,500 (image size: 3.27KB; compressed SVG size: 1.54KB)
  • Original image (size: 12.9KB)

In this instance, I experimented with another evaluation method that incorporated a saliency comparison between the target image and the output. Although the results presented here seem promising, other tests using the method were lacklustre, especially given that the saliency map calculations added a lot of processing time to each iteration.
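
For the curious, one way such a comparison could be scored is sketched below. This is purely illustrative and an assumption on my part, not the program's actual evaluation code; saliency_map refers to the function from the GPU post further down this page, and candidate/target are PIL images.

import numpy as np

def saliency_comparison_score(candidate, target):
    # compare the saliency maps of the output and the target image;
    # lower scores mean the two images draw attention to similar regions
    cand_sal = np.asarray(saliency_map(candidate), dtype=np.float32)
    targ_sal = np.asarray(saliency_map(target), dtype=np.float32)
    return float(((cand_sal - targ_sal) ** 2).sum())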


In my next post, I'll detail the specifics of the program. In the meantime, the program itself can be downloaded here; it requires numpy and PIL to run.
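
Until that post is up, the sketch below gives a rough feel for the stochastic hill-climbing loop described in the introduction. It is not the actual program: the names (random_triangle, mutate, score, N_TRIANGLES) and the specific mutation scheme are illustrative assumptions only.

import random
import numpy as np
from PIL import Image, ImageDraw

N_TRIANGLES = 50  # illustrative; the first example above used 50 triangles

def random_triangle(width, height):
    # three random vertices plus a random RGBA fill (the alpha channel gives transparency)
    points = [(random.randrange(width), random.randrange(height)) for _ in range(3)]
    color = tuple(random.randrange(256) for _ in range(4))
    return (points, color)

def render(triangles, size):
    # composite every triangle, in order, onto a blank white canvas
    canvas = Image.new("RGBA", size, (255, 255, 255, 255))
    for points, color in triangles:
        layer = Image.new("RGBA", size, (0, 0, 0, 0))
        ImageDraw.Draw(layer).polygon(points, fill=color)
        canvas = Image.alpha_composite(canvas, layer)
    return canvas.convert("RGB")

def score(candidate, target):
    # sum of squared per-pixel differences; lower means a closer match
    a = np.asarray(candidate, dtype=np.float32)
    b = np.asarray(target, dtype=np.float32)
    return float(((a - b) ** 2).sum())

def mutate(triangles, size):
    # nudge one vertex and one color channel of a randomly chosen triangle
    width, height = size
    new = list(triangles)
    idx = random.randrange(len(new))
    points, color = new[idx]
    points = list(points)
    v = random.randrange(3)
    x, y = points[v]
    points[v] = (min(max(x + random.randint(-15, 15), 0), width - 1),
                 min(max(y + random.randint(-15, 15), 0), height - 1))
    color = list(color)
    c = random.randrange(4)
    color[c] = min(max(color[c] + random.randint(-15, 15), 0), 255)
    new[idx] = (points, tuple(color))
    return new

def hill_climb(target, iterations=10000):
    # stochastic hill-climbing: keep a random mutation only if it improves the score
    size = target.size
    triangles = [random_triangle(*size) for _ in range(N_TRIANGLES)]
    best = score(render(triangles, size), target)
    for _ in range(iterations):
        candidate = mutate(triangles, size)
        candidate_score = score(render(candidate, size), target)
        if candidate_score < best:
            triangles, best = candidate, candidate_score
    return triangles

A real implementation would avoid re-rendering every triangle on each iteration (for example, by only re-compositing the changed triangle), since rendering dominates the running time of the naive loop above.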

Tags: image processing, numpy, programming, python

Generating Contrast and Saliency Maps on the GPU (Using Python and OpenCL)

posted on: Saturday, April 13, 2013 (8:21 am) by Chase Stevens

In the field of Computational Neuroscience, saliency maps are a means of graphically representing the areas of any visual scene presenting the most "bottom-up" saliency to a human observer (i.e. those most likely to draw the viewer's attention). Although the generation of these maps is not particularly difficult on a conceptual level, doing so is quite computationally expensive if using a serial approach. Below, I provide code for quickly generating the component contrast maps needed to build a saliency map by parallelizing the task on the GPU, as adapted from MATLAB code provided by Vicente Ordonez of SUNY. To run this, you'll need pyopencl v0.92, numpy, and PIL.

Original image and corresponding saliency map

from PIL import Image
import itertools
import numpy as np
import pyopencl as cl
import pyopencl.array as cl_array
from pyopencl.elementwise import ElementwiseKernel
import math

## initialization of GPU context, queue, kernel
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
kernel_args = "int width, int len_min_width, float *image, float *contrast"
kernel_code = '''
contrast[i/3] = contrast[i/3] +
 (!(
     (i < width) ||
     (i > len_min_width) || 
     (i % width == 0) ||
     ((i+1) % width == 0) ||
     ((i+2) % width == 0) ||
     ((i+3) % width == 0) ||
     ((i-1) % width == 0) ||
     ((i-2) % width == 0)
   )
 )?
 (
     pown((image[i] - image[i-width]), 2) +
     pown((image[i] - image[i-width-3]), 2) +
     pown((image[i] - image[i-width+3]), 2) +
     pown((image[i] - image[i-3]), 2) +
     pown((image[i] - image[i+3]), 2) +
     pown((image[i] - image[i+width]), 2) +
     pown((image[i] - image[i+width-3]), 2) +
     pown((image[i] - image[i+width+3]), 2)
 ) : 0''' # for each rgb value in image, if not part of a pixel on the image
          # border, sum the squares of the differences between it and the
          # corresponding color values of each surrounding pixel, then add
          # this to the corresponding output pixel's value
contrast_kernel = ElementwiseKernel(ctx,
                                    kernel_args,
                                    kernel_code,
                                    "contrast")

flatten = lambda l: list(itertools.chain(*l)) # flatten a list of per-pixel (r, g, b) tuples into one flat list

def contrast_map(image):
    
    width = image.size[0]
    height = image.size[1]
    
    ## creating numpy arrays
    image = np.array(flatten(list(image.getdata()))).astype(np.float32) #image array
    contrast_map = np.zeros((height*width)).astype(np.float32) #blank write array
    
    ## send arrays to the gpu:
    image_gpu = cl_array.to_device(ctx,queue,image) 
    contrast_map_gpu = cl_array.to_device(ctx,queue,contrast_map) 
    
    contrast_kernel(width*3,(image.size - width - 1),image_gpu,contrast_map_gpu) #executing kernel
    contrast_map = contrast_map_gpu.get().astype(np.float32) #retrieving contrast map from gpu
    contrast_map = np.nan_to_num(contrast_map) #conversion of NaN values to zero
    
    ## normalization:
    contrast_map += max(contrast_map.min(),0)
    contrast_map /= contrast_map.max()
    contrast_map *= 255
    return contrast_map.astype(np.uint8)

def saliency_map(image):
    
    width = image.size[0]
    height = image.size[1]
    
    resizes = int(math.floor(math.log(min(width,height),2)) - 3) #calculating necessary number of
                                                                 #images for gaussian pyramid
    resized_images = [image.resize((width//factor, height//factor), Image.BICUBIC)
                      for factor in [2**x for x in range(resizes)]] #resizing images
    contrast_resized_images = [contrast_map(im) for im in resized_images] #generating contrast maps

    ## resizing contrast maps back to original image size:
    images = list()
    for i in range(len(contrast_resized_images)):
        image = Image.new("L",resized_images[i].size)
        image.putdata(list(contrast_resized_images[i]))
        images.append(np.array(list(image.resize((width,height)).getdata())))

    ## combining images:    
    result = np.zeros_like(images[0]).astype(np.float32)
    for image in images:
        result += image.astype(np.float32)
        
    ## normalization:
    result += max(result.min(),0)
    result /= result.max()
    result *= 255
    output = Image.new("L",(width,height))
    output.putdata(list(result.astype(np.uint8)))
    return output
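
For completeness, here is a minimal usage sketch; the filenames are placeholders, not part of the original script.

if __name__ == '__main__':
    source = Image.open("input.jpg").convert("RGB")  # placeholder filename; any RGB image works
    saliency_map(source).save("input_saliency.png")  # placeholder output filename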

Tags: code, cognitive science, gpgpu, image processing, numpy, opencl, programming, python