jdigittl: August 2006

Thursday, August 31, 2006

Cubes suck

Until recently I had been stationed in a cube at my office. The only good thing about this was that I could peer over and chat with my neighbors if I had something to say. The bad thing (as Joel Spolsky talks about) is that thinking for long periods of time is difficult. Rands talks about his den and how he can lock himself in there and either work or play. I posit that sometimes both happen at once.

Just before I left on vacation, I moved into an office. I have never worked in an office alone before, always either being in a cube or sharing an office with a coworker. My primary concern about moving into an office alone was that I would get sucked into a whole lot of nothing, involving refreshing bloglines, reading news and twiddling thumbs.

Over the past two days, to the casual observer, this is exactly what I was doing.

In the back of my mind I knew I should be working on finding an efficient way to find the distance between a point and the surface of a cube in non-Euclidean N-space. But that sounded scary. I'd come into work, clear out my emails, catch up with the news, then draw a cube on my glass wall and stare for a bit. Staring wouldn't last longer than a few minutes. If I was feeling particularly enthusiastic I'd erase my cube and draw a new one, from a different angle - hoping to grok the mysteries of 5 dimensional cubes. Then, back to slashdot. Or YouTube. Or some partially functioning Java applet written in 98 that enumerates measure polytopes while crashing Firefox.

Every now and then, out of nowhere, would come some piece of insight. I'd scurry across to my old cube neighbor and lay my new knowledge on him. He'd point out why I was wrong, and then I'd head back to the office for more YouTube.

But then today, for no particular reason, it all made sense and was blindingly simple. By doing nothing I managed to get what I estimated would have been a few weeks of work done in about an hour. It turns out (as usual) I was making the problem far more complex than it really was. The sad thing is, if I were still in a cube, with the constant self-applied pressure to not look like a slacker, it probably would have taken me 3 weeks to see the light and the solution would have been far more complex than it needed to be.

Tuesday, August 29, 2006

sony vx-2000e/canon ae-1: a dirty tryst

For those who can't be bothered reading my drunken ramblings: About 8 months ago I decided to try and mount an 'old-school' lens system from a classic Canon 35mm still camera on a fairly swank Sony digital video camera. It worked.

'twas the night before the night before christmas, and josh had too much rum.
he wanted to test a hypothesis:
that the sony vx2000 has an easily replaceable lens,
contrary to what the manual says.

after about 3 hours of careful screwdrivering, my lens accidentally fell off in a kinda un-accidental way....

oh yeah, and i was kinda surprised when it still worked.

first test movie:

and here is the interesting part

the ccd fits!

it's a sonon!

the next step was to go visit mike at eyebeam, who kindly let me use their workspace and laser cutter. I used this to make some templates from mat board, then plexiglass.

using laser cutter:

testing various shapes as lens mounts:

this is what my workbench looked like... just before I used the hacksaw to void my warranty by cutting up the old lens internals to make the backing of the FD mount.

so, after i made the matching plate (the white plastic thing) i attached the FD mount to the hacked CCD mount. i also hacksawed off the front part of the metal side of the case (where the original lens used to be), this is so i can actually reach the mount point so i can easily change lenses and reach the aperture ring.

here is the view of the assembly from the front.

and the grand finale - fully assembled. you can see here i threw in another white plate to increase the lens-ccd distance as focussing was a bit tricky as above. this was the hardest part -- it really should be millimeter accurate - but it's not. oh well, it seems to do the trick!

today i just opened up the ccd mount, and made some shims from aluminium sheet. the lens-ccd distance is now within 1mm -- perfect. now it is time to buy some fancy lenses. (i have my eye on a full frame 8mm....)

Ok. I think I need to make a relay system. There is a slight problem - 35mm lenses are for 35mm film, whereas my CCD is much smaller. Hence, the image formed at the focal plane is too large. I once studied optics! I can fix this! Here is my initial sketch of how it should work:

of course, this will make the image upside-down, but i think i can tweak the LCD display to always display upside down. It already flips the image when you turn it around, so I just have to find the sensor that does that, and reverse the sense. Note, I'm also planning to switch to Nikon F-mounts as they seem to be easier to find (at least at my local camera store). As for the prime, I'm looking at the Peleng 8mm- it's pretty cheap and looks like it will do the trick.

yep. that was easy - image is now the right side up in the viewfinder (but left and right are now swapped... lesser of two evils?)

ok, after a trip to B&H (and some very strange looks) I got all the bits that I needed. But they forgot to put them in my bag, so I'm going to have to go back and get the tube. As you can see below, the system lets in a little too much light on the sides :P.

After I pick up an extension tube & make a coupler, I'll have to make an extension arm from the tripod to support the extra weight. And Then, I'll be rocking a truly tricked out camera.

ok. I couldn't find a full set of Nikon K rings, or a BR-3. So I made the missing K rings by buying 7 split lens filters, popping out the glass to make an empty tube. I made a BR-3 by taking apart a K1 and glueing it to a filter. Luckily the inside of the K1 has a 52mm flange that was a tight fit on a male threaded filter. Anyway, she is done. Well, done in the sense that all I need now is the prime lens. Ebay, here I come.

tricked out!

Update: So, I got the Peleng 8mm lens and it's bloody marvelous. I ended up taking it to Australia in January, where I managed to drop it off a 4th floor balcony onto the road and under a truck. The lens now has a hairline crack in the prime glass, but generally works OK.

Below is a quick movie I edited that chronicles building the camera and the patience of my sweet one whilst putting up with me :)

Some useful links:

Peleng 8mm

Mounting 8mm on Nikon Digitals

Camera mounts & registers

Extension tubes

And a big thank you to the people in the second hand department at B&H camera who put up with me asking for rare parts to be used for bizarre purposes.

Monday, August 07, 2006

Ethics & AOL

How many people have access to this database? (I do)

Who is concerned by the breach of privacy? (I am)

Who, despite their concerns about privacy, spent a good portion of tonight browsing other people's searches? (I did)

How many ways can you use this data to make $? (I can think of a few)

Is it legal to use this data to make money through SEM? (Probably)

Is it legal to use this data to make money through identity theft? (No)

Is it ethical?

Tsk tsk AOL..

For Postgres users (not the best way of doing things, but it works):


cat user-ct-test-collection-*.txt | grep -v "AnonID" | grep -v "\\\." > silly.txt
createdb aol
psql aol
aol=# create table tmp (anonid varchar(16), query varchar(1024), querytime varchar(32), 
       itemrank varchar(5), clickurl varchar(1024));
aol=# copy tmp from '/Users/josh/AOL-data/silly.txt';
aol=# create table aol (anonid integer, query varchar(1024), querytime timestamp, 
       itemrank integer, clickurl varchar(1024));
aol=# insert into aol select anonid::integer, query, querytime::timestamp, case when    
       itemrank='' then NULL else itemrank::integer end, clickurl from tmp;
aol=# create index aol_id on aol (anonid);

Funky stitching: part II

I really should be posting about the privacy nightmare / SEM dream of AOL releasing silly amounts of data last night. But precisely at the time that was happening, I was walking through the West Village trying to find St. Vincent's Hospital.

It would be far less embarrassing if it happened 15 years ago, but given that it didn't, it was bound to happen sooner or later, especially after my girlfriend told me not to use the paring knife as a screwdriver. Last week she went overseas for a holiday, so I had a chance to catch up on dorking out and fixed my computer and sliced my finger open with a paring knife.

My first official event at Carnegie Mellon was a 'what to do in an emergency' lecture, with extra emphasis on how expensive ER visits are. It went in one ear and out the other. Likewise, when I transferred over to my workplace medical plan I was told what I needed to do before making a claim. I didn't do any of it, because I had no plans to actually use it. You know you are in America when the first thing you think of when a medical emergency is upon you is 'Where did I put my insurance card?'. It is surprisingly difficult to remove a card that is stuck to a piece of paper when one hand isn't quite working right.

After sorting that out I had a flash of my medical training and found some sterile gauze and a bandage and wrapped myself up. This was quickly followed by a flashback to an old Bill Cosby comedy routine where he joked about his mother always harassing him to wear clean underwear, in case of an emergency. For some reason this felt like important advice, so I got dressed. Not to say that my underwear weren't clean, but I wasn't looking my best, so I got dressed to impress.

That wasn't particularly rational, but I must admit that I wasn't at my finest at that point. It is also very difficult to tie shoelaces with one hand.

I found the hospital and was impressed that I was on the 'fast-track'. I wasn't impressed by the lengthy interview (which was only lengthy because the registration admin kept on pausing to continue gossiping with her friend) and requests to sign documents before I had read them. The ER ward was fairly empty; I later found out that they were closing that section, and all the hot action was on the other side of the building. A range of nurses and doctors came by, none of whom introduced themselves (contrary to the Patients Bill of Rights document that I signed and read). And none of them brought me a glass of water, even though they all said they would.

When I finally worked out who my doctor was I told her that I used to be in her position and that seemed to do the trick. Trying to remember words and phrases from med school, but without sounding like someone who picked stuff up from ER, I managed to get her to actually talk to me, which made me feel much better. As did the nerve block.

The six stitches required to get me back together didn't take too long, and I made it out by midnight. On the walk home I felt the lignocaine wear off and figured that it was about to hurt like hell, so I self medicated with some scotch.

This morning I feel worse off from the scotch than the finger, although typing is a bitch.

Saturday, August 05, 2006

i2pi

Probably quite irrelevant to my current readers, but I finally came to terms with the fact that I needed a new power supply and motherboard, and now i2pi.com is back. This means that when I come to terms with not really wanting to be hosted on blogspot, I'll move this blog over. I really want to take control of image aliasing again; blogspot does a terrible job at it, or I just don't know how to use it. Either way, I prefer to control my web presence.

Friday, August 04, 2006

Funky stitching

Either New York has some really funky architecture, or Google maps has some funky image stitching technology.

Wednesday, August 02, 2006

Liquidity premium = Insurance?

Greg raises the point that some large lead buyers actually get discounts, contrary to my previous post. I still haven't really thought through his statement that the mortgage vertical has plenty of liquidity, but I do have a possible explanation for the volume discounts. One way to paraphrase my initial wordy post is that the premium is a form of insurance: insurance against the cost of rebalancing the relationships to maintain an orderly market. If a large buyer is coming into your exchange, and his accommodation will require the outlay of expense not only to develop his relationship, but that of the suppliers to fill his order, then you need to protect that investment with insurance. If the buyer is large and reputable then the charge will fall away. And if they provide depth to your buy side that actually encourages further supply, then they may get advantageous pricing compared to smaller buyers.

This argument becomes clearer if we make the (false) assumption that exchanges will buy all supply upfront, and then take fulfillment and counterparty risk whilst trying to sell held inventory. For the most part this does not happen, but if we replace the concept of 'lead inventory' with 'relationship inventory' then it all follows through.

Spot lead pricing II: The fishy distribution

When I worked in media analytics / campaign management, whenever a statistic was to be reported the standing assumption was that 'things' were drawn from the Normal distribution. The general arm-wavy argument was along the lines of "... mumble mumble law of large numbers mumble burp ...". Of course, what they really intended to invoke was the central limit theorem. But hey, I too went to business school and understand that MBA level probability & statistics is dull and arm-wavy and hence was a great time to catch up on sleep, so I usually let that slide. In lay terms the argument is that we don't really need to know the underlying distribution because with enough samples, things look normal. In the world of media analytics, where we had billions of ad impressions and millions of clicks, the 'enough samples' part usually held. But in the world of lead-gen, where a single supplier may only provide 5 to 25 leads per day, this doesn't hold.

What distribution do I use to model leads arriving into an exchange? The Poisson distribution. If anyone remembers their probability classes from school, they will remember countless examples which invariably involved people arriving to a queue at a bank teller. If you happened to take computer science, the example might be expressed as jobs arriving at a CPU, or something like that. Either way one of the key measures to describe these processes is to state the average time between successive arrivals. If you can make the assumption that the process is memoryless, i.e., the time of arrival of the next person does not depend on the time of arrival of previous persons, then you can model the time between arrivals as an Exponential distribution. And if you do this the total number of people arriving over a time interval T is distributed as the Poisson distribution.

In the chart above we see the distribution of the number of expected arrivals in one day, when the average time between arrivals is 4.8 hours (one fifth of a day). We can see that we expect about 5 arrivals in the day, which should be blindingly obvious.
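If you'd rather not take the exponential-gaps-give-Poisson-counts claim on faith, it is easy to check by simulation. The sketch below is in Python rather than R and uses only the standard library. The function names (simulate_daily_counts, poisson_pmf) and the choice of 20,000 simulated days are my own illustrative assumptions; the only things taken from the post are the memoryless arrivals and the mean gap of one fifth of a day.

```python
import math
import random


def simulate_daily_counts(mean_gap_days, days, seed=0):
    """Count arrivals per day when gaps between arrivals are exponential.

    mean_gap_days is the average time between arrivals, in days
    (0.2 days = 4.8 hours). Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)
    counts = [0] * days
    # expovariate takes the rate (1 / mean gap), not the mean itself.
    t = rng.expovariate(1.0 / mean_gap_days)
    while t < days:
        counts[int(t)] += 1  # int(t) is the index of the day the arrival lands in
        t += rng.expovariate(1.0 / mean_gap_days)
    return counts


def poisson_pmf(k, lam):
    """Probability of exactly k arrivals when the mean number is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


counts = simulate_daily_counts(mean_gap_days=0.2, days=20000)
mean_count = sum(counts) / len(counts)
print("average arrivals per day:", mean_count)

# Compare the simulated share of days with k arrivals against the Poisson formula.
for k in range(11):
    observed = counts.count(k) / len(counts)
    print(k, round(observed, 4), round(poisson_pmf(k, 5.0), 4))
```

The two columns should track each other closely, and the average should land near 5 arrivals per day, i.e. 24 hours divided by the 4.8 hour mean gap.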

I'm very fond of the R statistical programming environment. It managed to get me through my statistical arbitrage course, while I profited from the arbitrage between S-Plus ($$$) and R (FREE!). To my untrained eyes, they are pretty similar.

To produce the chart above in R:
plot(dpois(0:50, 5), type='s')

If you make arrival times more frequent, we end up with a distribution that looks like a discrete version of the Normal distribution:

The big difference is that unlike a normal distribution, a Poisson will have P(X < 0) = 0. In other words the chance of having a negative number of arrivals is zero. So, if you permit my own arm-waviness, the Poisson distribution is somewhat like the discrete analog of the LogNormal.
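To put rough numbers on both halves of that claim, here is a small Python sketch (standard library only, not R). It compares the Poisson probabilities against a Normal density with the same mean and variance, and computes how much probability that Normal puts below zero. The helper names (max_gap, normal_mass_below_zero) are hypothetical, and matching the Normal's variance to the Poisson mean is my assumption about which Normal to compare against.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of k arrivals, computed in log space to avoid overflow."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def normal_pdf(x, mean, var):
    """Density of a Normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def max_gap(lam):
    """Largest pointwise gap between Poisson(lam) and a Normal with mean = var = lam."""
    upper = int(lam + 10 * math.sqrt(lam)) + 1
    return max(
        abs(poisson_pmf(k, lam) - normal_pdf(k, lam, lam)) for k in range(upper)
    )


def normal_mass_below_zero(lam):
    """P(Z < 0) for Z ~ Normal(mean=lam, var=lam); the Poisson equivalent is exactly 0."""
    return 0.5 * math.erfc(math.sqrt(lam / 2))


for lam in (5, 25, 100):
    print(lam, max_gap(lam), normal_mass_below_zero(lam))
```

The gap shrinks as the mean grows, which is the "looks like a discrete Normal" part. The last column is the probability the matched Normal would assign to a negative number of arrivals, which only becomes negligible at higher volumes.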

In my previous post I made the statement that a supplier providing a large number of leads was unlikely to supply a dramatically lesser number in the future. Let's examine the chance of a supplier providing exactly zero leads if they previously provided X leads per day. In R, we express this as

plot(dpois(0, 0:50), type='s')

I won't spoil the surprise ending by including the chart, but needless to say if you provide fewer leads you are more likely to provide zero leads. Mathematics is wonderful for stating the obvious, but humor me here.

(I wish I could embed LaTeX in this blog...)

The Poisson probability distribution function is P(x) = l^x exp(-l) / x!, where l is the mean number of arrivals and x is the number of arrivals that we want to know the probability of. If we set x=0 we get P(0) = l^0 exp(-l) / 0! = exp(-l). So the chance of getting zero leads decays exponentially with l, and we are right back where we started this detour into the fishy distribution.
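For anyone who wants to poke at the formula directly, here is a minimal Python sketch (standard library only, not R) of the Poisson probability of x arrivals given a mean of l, in its standard form l^x exp(-l) / x!. The function name poisson_pmf and the sample means are my own choices for illustration.

```python
import math


def poisson_pmf(x, mean):
    """Probability of exactly x arrivals when the mean number of arrivals is `mean`."""
    return mean ** x * math.exp(-mean) / math.factorial(x)


# With x = 0 the formula collapses to exp(-mean), so the two columns agree.
for mean in (1, 5, 25):
    print(mean, poisson_pmf(0, mean), math.exp(-mean))
```

A supplier averaging 1 lead a day has a sizeable chance of a zero day, while one averaging 25 a day essentially never does.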

And as an exercise to the reader (don't you hate it when people do this..), what does the following mean in the context of lead gen?

a=2:100
plot(ppois(a,a),type='l')

Tuesday, August 01, 2006

How little things have changed...

From the minutes of the 1990 DataServe Annual General Meeting:

Joshua Reich's Report

Joshua's report was unavailable at the time of printing the Annual Report, however it was read out and distributed to the attendees of the Annual General Meeting. Those who didn't attend and wish to obtain a copy may contact Joshua Reich on xxx xxxx. The following points were outlined in the report.

Joshua said that he had sold an inventory program to Video Classrooms Australia for $4.95. He also indicated he had written Graphar 9.5, an update to his simple, high-level mathematicians' language. An unidentified attendee noted that he had 'been sucked into chaos [theory]'. Luke pointed out that Joshua therefore must be the suckee. Joshua also told the meeting that he had produced null-modem cables and plans to make a robot.
