Recognizing Cards - Effective Comparisons with Hashing

In the previous post we got as far as isolating and pre-processing the art from a card placed in front of the camera; now we come to the problem of effectively comparing it with all the possible matches. Given the possible "attacks" against the image we're trying to match, e.g. rotation, color balance, and blur, it's important to choose a comparison method that will be insensitive to the ones we can't control without losing the ability to clearly identify the correct match among thousands of impostors. A bit of googling led me to phash, a perceptual hashing algorithm that seemed ideal for my application. A good explanation of how the algorithm works can be found here, and illustrates how small attacks on the image can be neglected. I've illustrated the algorithm steps below using one of the cards from my testing group, Snowfall.

Illustration of the phash algorithm from left to right. DCT is the discrete cosine transform. Click for full-size.

The basic identification scheme is simple: calculate and store a hash for each possible card, then calculate the hash for the art we're identifying. The hashes are converted to ASCII strings for storage. For each hash in the collection, calculate the Hamming distance to the captured hash (essentially how many positions in the two hashes differ); that number describes how different the two images are, and the smallest distance marks the best match. The process of searching through a collection of hashes to find the best match in a reasonable amount of time will be the subject of the next post in this series (hint: it involves VP trees). Obtaining hashes for all the possible card arts is an exercise in web scraping and loops, and isn't something I need to dive into here.
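
A minimal sketch of that comparison, assuming the hashes are stored as equal-length hex strings and compared bit by bit. The hash values and the name "Card B" are made up for illustration, and the linear scan is only a stand-in for the faster search covered in the next post.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the differing bits between two equal-length hex hash strings."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must be the same length")
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def best_match(query: str, known: dict[str, str]) -> tuple[str, int]:
    """Linear scan: return (card name, distance) for the closest stored hash."""
    name = min(known, key=lambda card: hamming_distance(query, known[card]))
    return name, hamming_distance(query, known[name])


# Hypothetical 64-bit hashes, invented purely for this example.
known_hashes = {
    "Snowfall": "f0e1d2c3b4a59687",
    "Card B": "0123456789abcdef",
}
captured = "f0e1d2c3b4a59686"  # one bit away from the stored Snowfall hash

print(best_match(captured, known_hashes))  # ('Snowfall', 1)
```

A smaller distance means a closer match, so a distance of zero would be a bit-for-bit identical hash.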

One of my first concerns upon seeing the algorithm spelled out was the discarding of color. The fantasy art we're dealing with is, in general, more colorful than most test image sets, so we might be discarding more information for less of a performance gain than usual. To that end, I decided to try a very simple approach, referred to as phash_color below: get a phash from each of the color channels and simply append them end-to-end. While it takes proportionally longer to calculate, I felt it should provide better discrimination. This expanded algorithm is illustrated below. While it is true that the results (far right column) appear highly similar across color channels, distinct improvements to identification were found across the entire corpus of images compared to the simpler (and faster) approach.

The color-aware extension of the phash algorithm. The rows correspond to individual color channels. Click for full-size.
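
The channel-by-channel idea can be sketched without any dependencies. This is a simplified stand-in written for illustration, not the imagehash implementation: it follows the same outline (shrink, DCT, keep the low frequencies, compare against the median) but uses plain block averaging for the resize, and the names `shrink`, `phash_bits`, and `phash_color` are my own.

```python
import math
from statistics import median


def shrink(channel, size):
    """Block-average a 2-D grid down to size x size (grid must be at least that big)."""
    rows, cols = len(channel), len(channel[0])
    out = []
    for r in range(size):
        r0, r1 = r * rows // size, (r + 1) * rows // size
        row = []
        for c in range(size):
            c0, c1 = c * cols // size, (c + 1) * cols // size
            block = [channel[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dct_1d(values, keep):
    """First `keep` coefficients of the unnormalised DCT-II of a sequence."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(keep)
    ]


def dct_2d(grid, keep):
    """Separable 2-D DCT, returning only the keep x keep low-frequency corner."""
    by_rows = [dct_1d(row, keep) for row in grid]
    by_cols = [dct_1d(col, keep) for col in zip(*by_rows)]
    return [list(row) for row in zip(*by_cols)]


def phash_bits(channel, hash_size=8, highfreq_factor=4):
    """Perceptual hash of one channel as a string of '0'/'1' characters."""
    small = shrink(channel, hash_size * highfreq_factor)
    flat = [c for row in dct_2d(small, hash_size) for c in row]
    cutoff = median(flat)
    return "".join("1" if c > cutoff else "0" for c in flat)


def phash_color(pixels, hash_size=8):
    """pixels: rows of (r, g, b) tuples. Appends the three channel hashes end-to-end."""
    channels = [[[px[i] for px in row] for row in pixels] for i in range(3)]
    return "".join(phash_bits(ch, hash_size) for ch in channels)


# Synthetic 64x64 image (invented for this example): a different gradient per channel.
image = [[(x * 4, y * 4, (x + y) * 2) for x in range(64)] for y in range(64)]
print(len(phash_color(image)))  # 192 bits: 3 channels x 8 x 8
```

Because each channel is compared against its own median, a uniform change in brightness leaves the hash unchanged, which is the kind of insensitivity we want from the comparison.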

I decided to test this systematically, so I chose four cards from my old box and captured images of them, shown below. Some small attempt was made to vary the color content and level of detail across the test images.

The four captured arts for testing the hashing algorithms. The art itself is the property of Wizards of the Coast.

For several combinations of hash and pre-processing I found what I'm calling the SNR, after 'signal-to-noise ratio'. This SNR is essentially how well the hash matches the image it should, divided by the quality of the next best match. The ideal hash size was found to be 16 by a good deal of trial and error. A gallery showing the matching strength for the four combinations (the original phash and the color version, each with and without histogram equalization) is shown below, but the general take-away is that histogram equalization makes matching easier, and including color provides additional protection against false positives.
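
The exact formula isn't spelled out above, so the following is only one plausible reading of that ratio. It assumes "match strength" means one minus the normalized Hamming distance; the function names and the distances are invented for illustration.

```python
import math


def match_strength(distance, n_bits):
    """Map a Hamming distance to a 0..1 similarity (1.0 means identical hashes)."""
    return 1.0 - distance / n_bits


def snr(distances, correct, n_bits):
    """Strength of the correct match divided by the strength of the best impostor."""
    signal = match_strength(distances[correct], n_bits)
    noise = max(
        match_strength(d, n_bits) for name, d in distances.items() if name != correct
    )
    return signal / noise if noise > 0 else math.inf


# Hypothetical distances from one captured art to three stored 256-bit hashes.
distances = {"Snowfall": 32, "Card B": 120, "Card C": 128}
print(snr(distances, correct="Snowfall", n_bits=256))
```

Under this definition a value well above 1 means the correct card stands clearly apart from the next best candidate, while a value near 1 means the match is ambiguous.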

If there is interest I can post the code for the color-aware phash function, but it really is as simple as breaking the image into three greyscale layers and using the phash function provided by the imagehash package. Up next: VP trees and quickly determining which card it is we're looking at!

Recognizing Cards - Image Capture

Back in October I posted a short blurb on my first attempts at recognizing Magic cards through webcam imagery. A handful of factors have brought me back around to it, not the least of which is a still unsorted collection. Also, it happened to be a good excuse to dig into image processing and search trees, things I’ve heard a lot about but never really dug into. Probably the biggest push to get back on this project was a snippet of Python I found for live display of the pre- and post-processed webcam frames in real time, here. There is real novelty in seeing your code in action in a very immediate way, and it also eliminated all of the frustration I was having with convincing the camera to stay in focus between captures. At present, the program appears to behave well and recognize cards reliably!

I plan to break my thoughts on this project into a few smaller posts focusing on the specific tasks and problems that came up along the way, so I can devote enough space to the topics I found most interesting.

  • Image Pre-Processing
  • Recognizing Blurry Images: Hashing and Performance
  • Finding Matches: Fancy Binary Trees

I should note here: a lot of the ideas used in this project were taken from code others posted online. Any time I directly used (or was heavily inspired by) a chunk of code, I’ll link out to the original source as well as include a listing at the bottom of each post in this series.

Pre-Processing

The goal here was to take the camera imagery and produce an image that was most likely to be recognized as "similar" by our hashing algorithm. First and foremost, we need to deal with two facts: (1) our camera is not perfect, so the white balance, saturation, and focus of our acquired image may all differ from those of the image we're comparing with, and (2) the camera captures a lot more than the card alone. Let's focus on the latter problem first: isolating the card from the background.

The method I described in the previous post works sometimes, but not particularly well. It required exactly ideal lighting and a perfectly flat background. The algorithm I ended up settling on is:

  1. Convert a copy of the frame to grey-scale
  2. Store the absolute difference between that frame and the background (more on that later)
  3. Threshold that difference-image to a binary image
  4. Find the contours present using cv2.findContours()
  5. Only look at the contours with a bounded area greater than 10k pixels (based on my camera)
  6. Find a bounding box for each of these contours and compute the aspect ratio.
  7. Throw out contours with a bounding box aspect ratio less than 0.65 or greater than 1.0
  8. If we've got exactly one contour left in the set, that's our card!
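
The differencing and filtering steps above can be sketched in plain Python. In the real pipeline the areas and boxes would come from OpenCV (for example `cv2.contourArea` and `cv2.boundingRect` applied to the output of `cv2.findContours`); here they are passed in directly. The threshold of 30 and the definition of aspect ratio as width over height are my assumptions, and the sample numbers are invented.

```python
MIN_AREA = 10_000                   # pixels, tuned to the camera (step 5)
MIN_ASPECT, MAX_ASPECT = 0.65, 1.0  # allowed bounding-box aspect ratios (step 7)


def difference_mask(frame, background, threshold=30):
    """Steps 2-3: absolute difference against the background, then binarise."""
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def find_card(candidates):
    """Steps 5-8: candidates are (area, (x, y, w, h)) pairs, one per contour.

    Returns the bounding box of the card, or None unless exactly one survives.
    """
    kept = []
    for area, (x, y, w, h) in candidates:
        if area <= MIN_AREA:
            continue  # too small to be the card
        if MIN_ASPECT <= w / h <= MAX_ASPECT:
            kept.append((x, y, w, h))
    return kept[0] if len(kept) == 1 else None


# Hypothetical contours: a card, a speck of noise, and a wide strip of shadow.
contours = [
    (52_000, (100, 80, 200, 280)),
    (300, (5, 5, 20, 15)),
    (15_000, (400, 50, 300, 60)),
]
print(find_card(contours))  # (100, 80, 200, 280)
```

Returning nothing when zero or several candidates survive keeps the later stages from running on an ambiguous frame.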

The next problem to tackle is that of perspective and rotation, which thankfully we can tackle simultaneously. In the previous steps we were able to find the contour of the card and the bounding rectangle for that contour, and we can use these.

  • Find the approximate bounding polygon for our contour using cv2.approxPolyDP().
  • If the result has more than four corners, we need to trim out the spurious corners by finding the ones closest to any other corner. These might result from a hand holding the card, for example.
  • Using the width of the bounding box, known aspect ratio of a real card, and the corners of the trapezoid bounding the card, we can construct the perspective transformation matrix.
  • Apply the perspective transform.
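
To make the last two steps concrete, here is a dependency-free sketch of building and applying a perspective matrix from four corner correspondences. OpenCV provides `cv2.getPerspectiveTransform` and `cv2.warpPerspective` for this job; the code below only shows the underlying arithmetic. The corner coordinates are invented, and the 63 mm by 88 mm card size is an assumption about the "known aspect ratio".

```python
def solve(matrix, rhs):
    """Solve a square linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def perspective_matrix(src, dst):
    """3x3 matrix mapping four (x, y) source corners onto four (u, v) targets."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -x * u, -y * u])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -x * v, -y * v])
        rhs.append(v)
    a, b, c, d, e, f, g, h = solve(rows, rhs)
    return [[a, b, c], [d, e, f], [g, h, 1.0]]


def warp_point(m, x, y):
    """Apply the matrix to one point, including the perspective divide."""
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return (
        (m[0][0] * x + m[0][1] * y + m[0][2]) / w,
        (m[1][0] * x + m[1][1] * y + m[1][2]) / w,
    )


# Hypothetical trapezoid corners: top-left, top-right, bottom-right, bottom-left.
corners = [(120, 90), (330, 100), (350, 380), (100, 370)]
width = 240
height = round(width * 88 / 63)  # assumed card proportions
target = [(0, 0), (width, 0), (width, height), (0, height)]

matrix = perspective_matrix(corners, target)
print(warp_point(matrix, 120, 90))  # approximately (0.0, 0.0)
```

Warping a whole image amounts to applying the inverse of this mapping to every output pixel, which is what the library routine takes care of.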

Camera input image. Card contour is shown in red, bounding rectangle is shown in green. The text labels are the result of the look-up process I'll explain in the coming posts.

The isolated and perspective-corrected card image.

Lastly, to isolate the art we simply rely on the consistency of the printed cards. By measuring the cards it was fairly easy to pick out the fractional width and height bounds for the art, and simply crop to those fractions. Now we're left with the first problem: the imperfect camera. Due to the way we're hashing images, which will be discussed in the next post in this series, we're not terribly worried about image sharpness, as the method does not preserve high frequencies. Contrast, however, is a big concern. After much experimentation I settled on a very simple histogram equalization: essentially, modifying the image such that the brightest color is white and the darkest color is black, without disrupting the ordering of the values in between. An example of this is given below.

Sample image showing (clockwise) the camera capture, the target image, the result of histogram equalizing the input, and the result of equalizing the target.
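
For reference, the equalization step can be sketched on a plain grid of 8-bit grey values. OpenCV offers `cv2.equalizeHist` for this; the version below is a dependency-free illustration of the standard cumulative-histogram mapping, and the tiny sample image is invented.

```python
def equalize(gray):
    """Histogram-equalise a 2-D grid of integer grey values in the range 0..255."""
    flat = [v for row in gray for v in row]
    hist = [0] * 256
    for v in flat:
        hist[v] += 1

    # Cumulative distribution: how many pixels are at or below each level.
    cdf, running = [0] * 256, 0
    for level, count in enumerate(hist):
        running += count
        cdf[level] = running

    cdf_min = next(c for c in cdf if c > 0)
    span = len(flat) - cdf_min
    if span == 0:  # perfectly flat image: nothing to stretch
        return [row[:] for row in gray]

    # Darkest level present maps to 0, brightest to 255, order is preserved.
    lut = [max(0, round((c - cdf_min) / span * 255)) for c in cdf]
    return [[lut[v] for v in row] for row in gray]


washed_out = [[100, 110], [120, 130]]  # low-contrast sample values
print(equalize(washed_out))  # [[0, 85], [170, 255]]
```

Applying the same mapping to both the captured art and the stored reference art puts the two on a comparable footing before hashing.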

So now we're at the point where we can capture convincing versions of the card art reliably from the webcam. In the next post I'll go over how I chose the hashing algorithm to compare each captured image against all the potential candidates, so we can tell which card we've actually got!

Fungiculture: Oyster Mushrooms

It has been a while since I’ve been able to post an actual update, having gotten a job, moved, and settled in during the interim. Having met with some small success growing a few basil plants indoors, I decided to branch out into mycoculture, or mushroom growing, as the requirements are a bit stricter, supplying a bit of an engineering challenge in getting it right. It’s totally true that, given my choice of oyster mushrooms (Pleurotus ostreatus) for my first attempt, I could have just as well used a plastic bag and a spray bottle; however, that would not have scratched my data-gathering/total automation itch. Now let us get to the admittedly overkill list of parts I ended up using.

During this first week-long time lapse we kept the aquarium light on continuously to provide consistent illumination for the camera, and realized that consistent (and blue) light strongly inhibits mushroom growth, which turns out to be a well-established fact, so for a week we saw very little happen aside from a moderate whitening of the surface of the mycelium-log. After a week of watching, we folded and decided to shut it down for the evening and go to bed. Of course, finally given a break from the light, mushrooms immediately appeared overnight, so we excitedly resumed the time-lapse capture the following morning, resulting in the second video.

First week, under constant illumination (not much change)


Second week, without night illumination

By the end of that week we had a full flush of mushrooms to harvest, as shown in the photos below. I probably waited about a half-day too long, given that the cap-edges were just slightly beginning to droop. After cutting them from the base with a kitchen knife, the heights and cap-widths were measured for later comparison, and the rinsed mushrooms were stir-fried with green onion and garlic! They were pretty tasty! Given that I don’t usually eat mushrooms I was surprised by how much I liked them, but the inevitable bias towards something I made myself can only help. I also managed to log the conditions over the first 12 days (missing the last bit of the fruiting stage before harvesting); those data are shown below. The placement of the two sensors probably accounts for the fact that the controller was reporting ~70% RH while the logger reported values around 60%, but there are some relatively easy calibration tests that can be done.


Plots of the temperature and humidity as a function of time over the first 12 days.

Without further delay, I should show off the fruits (kinda) of my labor, the mushrooms!

Rather than use this as a one-off experiment, I’ve already got the tank back under humidity control and monitoring, and the beginnings of a new flush of mushrooms are just now showing themselves. Moving forward I might try to simplify the setup, as I’ve read that environmental monitoring and power control are super-easy with cheap single-board computers (e.g. Raspberry Pi). It’s only coincidence that I’ve been playing with those lately for another project, just 4 years behind the curve!

Publication: Designing spectrum-splitting dichroic filters to optimize current-matched photovoltaics

After an incredibly long wait, and with an incredibly long title, the paper covering my work on thin film filters for solar energy collection is finally in print! You can find it here as part of OSA's Applied Optics journal.

If there's interest in a plain-English explanation of what we did and why, I'd definitely take the time to write one up! Just drop a comment below.

Wrapping up

It's been a while since I've updated here, and rather than ambiguously stating that "life's been crazy" I can concretely sum it up as "was busy graduating". Writing the dissertation, handing off projects that are still active, and generally cleaning up the path behind me have taken a sizable chunk of time. None of this is to say that I was particularly itching to dive into new projects after passing my defense. I'm finally moving out to the Bay Area, joining a handful of friends already out there. Hopefully this new chapter will present just as many (if not more) opportunities for projects worth writing about. For the time being I'll be focusing on the logistics of uprooting myself from Tucson, and minimizing the transplantation shock of a new city and job.

Imaging: Programmatically Recognizing Art

Sorting images has been a problem on my mind for years, but I never had a really good reason to sink time into it. Just recently, I finally found a reason. It occurred to me that it would be useful to have my webcam recognize Magic cards by their art, and add them to a database for collection tracking. I've since learned that this wasn't as new an idea as I'd thought, and several complete software packages for this specific application have become available in the last 6 months. Nevertheless, it was an interesting excursion into image processing, admittedly not my home turf. It worked much better than planned too.

To my mind, the problem had three main tasks that I'd not undertaken before, in order of drastically increasing difficulty:

  1. Grabbing camera input from Python
  2. Isolating the card from the background
  3. Usefully comparing images

The first part was mostly just an exercise in googling it and adapting the code to my webcam. Short version: use OpenCV's VideoCapture class (cv2.VideoCapture), and be sure to chuck out a few frames while the camera is auto-focusing and tuning the white balance. The second bit proved to be slightly more interesting. Given a photo of a card on a plain white background, Canny edge detection can reliably pull out the edge (though I haven't tried this with white-bordered cards yet). The card-background contour is easily picked out; it has the largest area. Once we've cropped the image down to that, we can isolate the art by knowing the ratios used to lay out the cards. This process is shown below.

Original image

Contours

Isolated card

Isolated art
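
The "largest area wins" rule for picking the card contour can be illustrated with a small dependency-free sketch. In OpenCV the areas would typically come from `cv2.contourArea`; here each contour is just a list of (x, y) vertices, the area is computed with the shoelace formula, and the sample outlines are invented.

```python
def polygon_area(points):
    """Shoelace formula: area of a closed polygon given as (x, y) vertices."""
    total = 0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2


def largest_contour(contours):
    """Pick the contour enclosing the most area, assumed to be the card edge."""
    return max(contours, key=polygon_area)


# Hypothetical contours from an edge-detected frame.
card_outline = [(100, 80), (300, 80), (300, 360), (100, 360)]
text_box = [(120, 250), (280, 250), (280, 340), (120, 340)]
speck = [(10, 10), (14, 10), (12, 13)]

print(polygon_area(card_outline))  # 56000.0
print(largest_contour([speck, text_box, card_outline]) is card_outline)  # True
```

This only works when nothing larger than the card is in view, which is why the plain background matters so much for this approach.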

The third and most difficult step, effectively comparing the images, will have to wait until I've got more time to write. Soon!

Music Video Exchange

The CD-exchange group I've been meeting with for several weeks now decided to have a session entirely dedicated to sharing music videos. Given the ease of distribution, we settled on videos available freely online. I've compiled the lists into one long YouTube playlist, which only misses out on a single track (Royksopp's "Remind Me", which is on Vimeo). Clocking in at just about two hours, the entire list makes for an interesting and discussion-sparking evening. While Beach House's "Wishes" was the stand-out video, I thoroughly enjoyed all the selections. Feel free to share your thoughts in the comments, or link to your favorite music videos.

 

Most Likely to Rock - A Metal Overview

Some context is necessary here. Several of my friends have formed a sort of bi-weekly disc exchange program in the spirit of old painfully 90s mix-tapes. Each week we're either exchanging CDs or meeting to discuss the selections we've listened to. It's been fun so far, providing an excuse to discover new music. In addition it's required us to engage in focused listening and critically evaluate what we're hearing before the discussion. Typically they're curated around a broad theme like "travel", but this previous round turned into a genre-focused overview, with each member tackling a genre that they had more familiarity with. I'd gotten it into my head that I knew a decent amount about metal music, so I went with that idea. Various factors collided to delay the next meeting, meaning I had something like six weeks rather than two to compose my disc. This is good, because I discovered that I actually knew very little about metal going into this.

I ended up spending a solid chunk of time just determining when and how various sub-genres of metal evolved, as well as what distinguished them from their siblings. Sites like Map of Metal and Every Noise at Once provided an amazing starting point as well as a shot of motivation for the task. While this did result in a playlist, the more interesting (read: time intensive) products were the liner notes. For each track I wrote up a short paragraph or two about the genre, threw in the album cover, lyrics, and a list of other artists that fall into the same camp. It's worth noting that chronology was used as the organizing principle behind the playlist, so don't expect smooth transitions or consistent tempos. The tracks were selected with three goals: provide exemplary tracks for the genres, be accessible for an audience who had never been interested in metal before, and above all fit within the 80 minute limit of a CD-R. I dropped a lot of amazing tracks to stay under the limit, and they were painful omissions, but the various metalheads I consulted with eventually agreed that it was a reasonable introduction.

Enough words, here's the goods.

Album Cover

Playlist (Google Docs)
Liner Notes (PDF)

Metal working: Quick thoughts

A few successful furnace cycles have been run, and a handful of lessons have resulted. I was given a whole bunch of machine shop turnings to melt down (13.6 lbs of them) and learned that it takes a little more care than melting cans. Turnings (also called chip) have very large surface area compared to their volume, and their surfaces are saturated with crystal structure defects due to the machining. Taken together this means that they very readily react and oxidize at elevated temperatures. Not thinking about any of that, I went ahead and tossed a bunch of chip into a fresh steel can and fired up the furnace.

Failed Crucible

Steel crucible failed catastrophically

As evidenced in the above photo, this didn't work out great. A second attempt was made later with three thoughts in mind. 1) Allowing the crucible to reach temperature would thicken the protective oxides on the steel that prevent reaction with aluminum, 2) an existing bath of liquid aluminum would prevent rapid oxidation of the aluminum turnings if they could be effectively submerged, and 3) the residual machining oil on the chip might be a contributor to the reaction. After allowing a new crucible to reach temperature, and observing that a foil-and-salt flux packet readily melted into a bead, cans were fed in until a good layer of liquid aluminum was apparent. At that point I was able to feed in foil-wrapped packets of turnings which seemed to melt readily without adverse reaction. I haven't come up with an easy way to remove all the machining oil, but in the short term it doesn't seem to be an issue. It will take me quite a while to melt down all the turnings, as it doesn't look like I can just feed them in non-stop, but there is no shortage of cans.


Metal working: First firings

Before I dive into my weekend's efforts, a quick note on nomenclature (so exciting, I know). I've been misappropriating a few terms as I've gone along, and should clarify. Smelting is extracting metal from ore, and inasmuch as soda cans aren't really "aluminum ore", I'm really just melting things. Also, I may have mistakenly conflated furnace (for melting) with forge (for heating to working temperature).

The first time we fired it up, starting the coals in a borrowed coal starter, the crucible seemed like it just couldn't reach a high enough temperature. After cooling and cleaning I noticed that the foil we'd thrown in as a test had melted, but only partially. Given the amount of time (and fuel) we gave it, this seemed strange. The body and lid of the furnace visibly darkened in color during use, leading me to believe that a good deal of the heat went into driving off residual water in the plaster. The body did have a week to set, but it was very thick and the lid had only set for a few days. A second reason that the aluminum did not melt may have been due to the crucible. As it is made of a silica-based ceramic, it doesn't conduct heat well and is likely better suited to a less directional electrical-element based furnace. Hoping that the initial firing had burned off any residual water, I decided to work on finding a steel crucible as well.

A second (better documented) attempt was made the following day. I thought for a while about where to find a serviceable steel crucible at short notice, and eventually decided to buy a large-mouth can of soup. The soup was tasty, and the can was steel. I've been calling them "tin cans" my entire life, but the magnet doesn't lie. While it looked bad enough after one use for me to throw it out, at a dollar or so a firing it's a reasonable approach in the short term.

Steel "Crucible"

I gathered together all of the supplies and got a friend over just in case of emergency, then started in.

Casting Supplies. Left to right: Bag of scrap aluminum, face shield and gloves, crucibles, long steel spoon, tongs, furnace and blower, steel muffin pan, coal starter, fireplace matches, and emergency water.

After dumping the started charcoal into the furnace, placing the crucible and replacing the lid, we turned the blower on and added the flux. For flux I obtained some Morton's Lite Salt, which is half KCl and half NaCl, which lines up well with the recommended flux for aluminum. Rather than pouring it in (the airflow kept tossing it back out), a teaspoon of the salts was folded into a pouch of aluminum foil and dropped in as a packet. After just a few minutes a metallic bead of aluminum was apparent at the bottom of the crucible, so we commenced loading in the scrap aluminum. A few charcoal briquettes were added when the melting slowed down, but far less than were burnt through the first time.



The air exits the furnace vent fast enough to juggle bits of aluminum!

Using discarded cans as the sole source of aluminum did generate a lot of dross, as seen piled on the brick in the photo below, though a bit of additional flux tossed in at the end did seem to free it from the liquid. The two lumps of reclaimed aluminum are visible in the muffin tin. They were allowed to cool for about 20 minutes while my friend ran to the grocery for hot dogs and marshmallows, as the coals still had a good deal of heat left in them.

Dross and Ingots

I'm planning on collecting all the supplies and their costs into a table for reference, just in case anyone else is thinking about taking a stab at this but is being held back by cost concerns. The next steps will be to build a mold flask for green sand casting, mix up some green sand, and pick a few good objects to cast. I've got some ideas already!