Bitwise Evolution

Musings of a Seattle-area hacker with a bent on improving digital lifestyles.

Recovered Puzzles

I used to collect puzzles–back when I had a wiki to post them to–but that content was lost to me a few years ago when the system hosting my personal content had a slew of hard drive issues. I was lamenting that loss last week when a coworker suggested that I could possibly find the content on the way-back machine at archive.org, and indeed, I did!

Without further ado, here’s a short list of brainteasers (none are of my creation, and I do not have citations–if you know who to credit for any of these, please let me know and I will add proper attribution info.)

Calendar Cubes

A man has two cubes on his desk. Each face of each cube has a single-digit number wirtten on it. With these two cubes, the man is able to enumerate all the days in any month, and each morning he arranges the cubes so that the number of the current day is on top, always using both cubes. How are the numbers distributed on the cubes?

Pennies 1

You are blindfolded in a room with 100 pennies. 30 of the pennies are heads-up, the remainder are tails-up. You can interact with the pennies in any way, but your fingers are not dexterous enough to feel the contours of the coins (so you can’t feel one to see which side is heads, or tails). Since you are blindfolded, you can’t see them either. Your task is to manipulate the coins such that there are two sets and each set has an equal number of coins that are heads-up. (The sets must be disjoint, non-empty, and all pennies must be in one of the two sets.)

Pennies 2

Given N pennies, one of which is counterfiet (and therefore is of different weight from the remainder) and a balance, how can you find the counterfeit coin in less three weighings on the balance.

Eggs

You are in a 100-floor building on a planet with oddly low gravity and/or surprisingly durable eggs. You happen to have two of these eggs (unfertilized, I assure you). Your task is to find the highest floor from which you can drop an egg and have it remain intact.

Numbers

Given 99 unique integers between 1 and 100, provide an optimal algorithm to find the remaining integer in that range that is not in the set.

Hint: Bcgvzny gnxrf yvarne gvzr naq pbafgnag fcnpr.

Prisoners

50 people are inprisioned, and during their imprisonment the captor will invite people randomly in to visit with her. All visits are one-on-one, and each prisoner has a unique tunnel from their cell to the captor’s office (so you can’t look out your cell and see who is going in). In the captor’s room is a bowl that the prisoners can optionally turn over, or turn right-side up during their visit(s). The initial state of the bowl is known to everyone.

The imprisonment may last for an infinite period of time, during which each prisoner will be invited into the captor’s office many, many times (essentially infinite, but it needs not be infinite, it could just be a reasonably small number in the optimal case). The imprisonment ends when one prisoner says: “Everyone has been in to see the captor at least once.” If a prisoner says this and they are wrong, all prisoners are killed immediately. Because the captor may decide not to visit anyone for a while, it is as if the prisoners have no concept of time, so they can’t bound the number of people seen based on the passage of time.

To give the prisoners a chance, they are allowed to convene briefly before their imprisonment, during which time they can plan a strategy. How do they do it?

Sequences What is the next line in this sequence?

1
1 1
2 1
1 1 1 2
3 1 1 2
2 1 1 2 1 3

Test-Driven XML Schema dev with xmlstarlet

Just to document how I do this:

Problem: I need a schema for FooTask

Solution:

  • Create a ‘tests’ directory.
  • populate said directory with simple example xml files.
  • Name those files valid-foo.xml or invalid-bar.xml (I use numbers for foo and bar).
  • Create your xsd file in the same directory as ‘tests’. Lets call it foo.xsd
  • Copy the following Makefile into the same location.
XSD=foo.xsd
# run xmlstarlet with -e to see verbose error.

test:
    @for file in `ls -1 tests/valid*.xml`; do if xmlstarlet val -q --xsd ${XSD} $${file}; then echo "pass"; else echo "fail: $${file}"; fi; done
    @for file in `ls -1 tests/invalid*.xml`; do if ! xmlstarlet val -q --xsd ${XSD} $${file}; then echo "pass"; else echo "fail: $${file}"; fi; done

Now, run make, and if anything fails you can manually run xmlstarlet val -e --xsd foo.xsh [failing file.xml] to see the details.

Implications: Coding for Homomorphicly Encrypted Input

Craig Gentry (Stanford / IBM) recently published a paper that proves the existence of fully homomorphic crypto systems. This has caused quite a stir, since such a system would allow an untrusted party to perform computations on encrypted data, returning an encrypted result, without ever knowing anything about the input or output. I’m not going to explain what homomorphic encryption is (at least, not in great detail). Bruce Schneier’s blog has a great post about it, and the comments there are extremely helpful in understanding how it works, and what this means for cloud computing.

I make no claims to being a cryptographer, but I did have a number of questions about the practical viability of this approach. Now, there are many questions in that vein that are directed at the performance characteristics of Gentry’s approach (which are abysmal, but not asymptotically so). I was curious about The use of side effects to discern information about the encrypted content.

For example, anyone who has used a debugger knows that you can monitor the flow of a program that has been instrumented with debugging symbols, and you can learn a great deal about the input even without examining the content of variables. If a given conditional branch directs execution one way, then you know the predicate evaluated to a specific value. I set out to determine why this sort of attack is not a problem, and I ended up learning a lot about the way programs that run on encrypted data must operate.

Homomorphisms

Let’s take a moment to quickly discuss homomorphisms, and homomorphic encryption.

a homomorphism is a structure-preserving map between two algebraic structures –Wikipedia

In this case (encryption) the homomorphism is a mapping from the clear text and the cypher-text. Fully homomorphic encryption, as Gentry discovered, preserves addition and multiplication–meaning that you can add and multiply cyphertext, and the result can be decrypted to reveal clear text that has been added and multiplied in the same way. Addition and Multiplication provide the operations necessary to implement boolean logic, and therefore, are sufficient to program very complex transformations (I’m not certain that it is safe to say “arbitrarily complex”).

It’s important to realize that every addition or multiplication operation results in a value that is encrypted. The running program can not know the intermediate results, and indeed it does not.

So, how do conditionals work?

Edward Kmett posted the conversion from if/then/else to addition/multiplication on Schneier’s blog:

if (condition) {
   return then_clause;
} else {
   return else_clause;
}

becomes:

return condition * then_clause + (1-condition) * else_clause;

Here’s a “real” example (it compiles, at least) using both approaches. This is just meant to be used for explanation — compilers could easily do the translation from the code in is0_clear() to is0_enc(). I’ve written them out separately here so we can look at the generated bytecode.

public class Test {
   public int is0_clear(int input) {
      if (0==input){
         return 2;
      } else {
         return 3;
      }
   }

   public int is0_enc(int input) {
      // I'm cheating a bit to keep this simple -- calculate
      // the conditional to be either 0 or 1:
      int cond = 0==input ? 1 : 0;
      return cond * 2 + (1-cond) * 3;
   }
}

And here’s the bytecode (generated by sun-java-6, and output with javap -verbose).

public int is0_clear(int);
  Code:
   Stack=2, Locals=2, Args_size=2
   0:   iconst_0
   1:   iload_1
   2:   if_icmpne  7  // Conditional Jump!!
   5:   iconst_2
   6:   ireturn        // return a constant 2
   7:   iconst_3
   8:   ireturn        // return a constant 3

public int is0_enc(int);
  Code:
   Stack=3, Locals=3, Args_size=2
   0:   iconst_0      // lines 0-9 here are for the "cheating" part
   1:   iload_1        // just ignore them -- the arithmetic to accomplish
   2:   if_icmpne  9   // the same thing is complex, and not important.
   5:   iconst_1
   6:   goto   10
   9:   iconst_0
   10:  istore_2   // note that there are no conditional jumps below here:
   11:  iload_2   
   12:  iconst_2
   13imul
   14:  iconst_1
   15:  iload_2
   16:  isub
   17:  iconst_3
   18imul
   19:  iadd
   20:  ireturn   // return the result of the calculated expression.

Since every operation results in an unknown value, no conditional branches can be taken! Every branch has to be evaluated, and the correct result of the ‘correct’ branch is selected by multiplying by a binary value, that is itself, encrypted! This means that things like run-time short-circuit evaluation are not possible, monitoring progam flow is meaningless, (possibly?) every input will result in the same run-time, and all side-effects will happen regardless of the input.

Implications?

Going further down this rabbit hole, caching is impossible, and global state (if even posible) is likely to be extremely dangerous. I shudder to think of how Python’s concept of scoping would interact with a compiler that generates code for homomorphicly encrypted input.

Aside from the pure overhead of dealing with encrypted data, and the “refreshing” required with Gentry’s algorithm, I think that there are going to be some serious performance and development concerns once homomorphic encryption becomes a reality. The programming practices that are common in languages like Java and Python now are not likely to hold up. I expect that the APIs that enable operation on encrypted data will be based on total functions, and I have only begun to think about the implications for testing, code coverage, and quality assurance.

Cleaning up my browser.

opera-clean

I’m done with firefox — Opera 10 now plays flash well, has adblock via. urifilters, a cleaner UI (no menubar, a menu button!) vertical tabs are supported natively, etc… I don’t really like the widget toolkit used in the file open/save dialog, but that’s much better than the horrid performance/stability/bizarre bugs of Firefox.

The minimal UI possible with Opera is also a major win in my book.

Maven deployment issues

I’ve been building / porting various projects to maven lately, and pushing them to our in-house maven server. For a while, I was doing this from my laptop at home. However, at work, I’m pushing to localhost (it’s a temporary thing while we determine if maven will actually work long-term.)

The following error had me stumped for a few days:

[INFO] Error retrieving previous build number for artifact 'de.balokb:libreadline-java-i386-Linux-c23cxx6:jar': repository metadata for: 'snapshot de.balokb:libreadline-java-i386-Linux-c23cxx6:1.0-SNAPSHOT' could not be retrieved from repository: inhouse_snapshot due to an error: Exit code: 1 - Host key verification failed.

All the googling I did turned up people stumped with ssh public key problems, or users who had specified ssh: instead of extssh: … etc. It was fairly quick to elleminate those issues, or so I thought. (ssh localhost right? No problem.)

I happened to look in more detail at my pom.xml:

<repository>
      <id>inhouse</id>
      <name>Inhouse Internal Release Repository</name>
      <url>scpexe://10.0.0.26/var/www/maven/inhouse</url>
    </repository>

hm… 10.0.0.26 I wonder…

$ ssh 10.0.0.26 The authenticity of host '10.0.0.26 (10.0.0.26)' can't be established. RSA key fingerprint is a7:bf:36:4c:b9:c7:c2:f9:03:9a:3a:a7:4f:10:e5:ba. Are you sure you want to continue connecting (yes/no)?

Ah ha! I clearly can’t use a pom.xml that lists “localhost” in the server section — I’d only be able to push from one place. However, since I’d never ssh’d to 10.0.0.26 from localhost, the fingerprint was unknown, and that was causing maven to error out with the problem I saw initially.

“Fingerprint ID failed” would have been a nicer error message, but I don’tk now that that is possible.

Bitten by dependency management

dependencies-smallI’ve started using Maven to manage my java projects, and overall I’m very happy with it. It seems to be more mature than ivy, with better documentation, and the vast majority of tasks that I need “just work” (just don’t ask me about jni–that’s another post).

Today, (and yesterday, and a good portion of the night in-between) I ran into a nasty bug in a library that I didn’t know my code depended on. It isn’t particularly important what I was working on, but just for context: I needed to strip a lot of text content out of nodes in the complete wikipedia revision history dump, so I was using Sax to parse the xml stream, filter out the stuff I wanted filtered out, and save the stuff that, well, I wanted saved. Being that the input was all of wikipedia, there were a fair number of unicode characters in there. As it turns out, the 2.6.2 xercesImpl has some sort of bug that allows xml with certain characters to be read without throwing exceptions, but when you try to write the chars that were actually read, you end up trying to write characters that aren’t valid in xml. Even if I’d known that in advance, my response would have been something like “ok, so what? I’m not using xercesImpl, and certainly not a version that old”.

Well.

You see, in addition to using Maven, I’ve also been using the Google Collections and JSR305 libraries, so I just drop those <dependency> entries into the pom for all my new projects–I just assume that I’ll need them, and I usually do.

Unfortunately, JSR305 1.3.8 depends on jaxen 1.1.1, which depends on xercesImpl 2.6.2 (jaxen also needs this dependency via xom 1.0, for what that’s worth).

Because that dependency was already present in my build path (via mvn eclipse:eclipse) and in the generated jar (via <addClasspath> and <classpathPrefix> in the maven-jar-plugin configuration section), I never realized that my sax code actually had a direct dependency on xerces as well. This all came to a head when, 3.53gb into my 2.8tb run, these rather unhelpful exceptions started popping up:

java.io.IOException: The character '?' is an invalid XML character
       at org.apache.xml.serialize.BaseMarkupSerializer.characters(Unknown
Source)
       at com.stottlerhenke.tools.wikiparse.ContentStripper.characters(ContentStripper.java:195)
       at org.apache.xerces.parsers.AbstractSAXParser.characters(Unknown
Source)
       at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown
Source)
       at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown
Source)
       at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
       at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
       at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
       at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
       at com.stottlerhenke.tools.wikiparse.ContentStripper.parse(ContentStripper.java:96)
       at com.stottlerhenke.tools.wikiparse.ContentStripper.main(ContentStripper.java:379)

<rant> “?” is not unicode — it fits just fine in asci tables everywhere — so please don’t tell me that it’s an invalid unicode character :) (0xd800 is an invalid unicode character, and that would have been much more helpful) </rant>

Many hours later I was able to find a sample of the actual input that was causing these problems, and I was able to reproduce the issue with an input slightly smaller than 2.8tb. Once that was done, I set out to make a minimal test case. Rather than bother with a new maven project, I just hacked it out in emacs (not using google collections, etc. because, clearly, I wanted it minimal). To my surprise, everything worked, and worked fantastically! But how? I didn’t even supply an xml api on the classpath, yet it ran just fine!

In truth, I did supply an xml api — xercesImpl.jar, and many other libraries — via my environment’s $CLASSPATH. (Figuring that out was another adventure, but I digress.) Once it became clear that I was indeed using a broken library it was simply a matter of explicitly specifying the dependency on a new version of xercesImpl, and rebuilding.

The moral?

Know your dependencies! This should come along with knowing your language’s built-in APIs well. It wasn’t clear to me that the SAX packages I was using were not part of the core java API, so it didn’t strike me as odd that I didn’t need to specify a classpath entry or a pom dependency before I could use sax.

If you suspect something strange, you can see the dependency tree in the generated html documentation you get when running mvn site.

Fixing the key repeat in Ubuntu 9.04

I just upgraded my workstation to Jaunty (Ubuntu 9.04) and the key repeat delay and speed dropped to a frustrating level.

gnome-control-center can be used to fix this, but it requires that the gnome-settings-daemon be running, which forces it’s opinions on many other aspects of my environment (I run Enlightenment dr17).

Poking around a bit, and help from #e on freenode, revealed that xset can be used to fix the key repeat settings.

# Look at the current settings:
$ xset q
Keyboard Control:
  auto repeat:  on    key click percent:  0    LED mask:  00000000
  auto repeat delay:  660    repeat rate:  25
  auto repeating keys:  00ffffffdffffbbf
                        fadfffefffedffff
                        9fffffffffffffff
                        fff7ffffffffffff
# lets speed things up a bit...
$ xset r rate 250 40

Things are a little messy…

I’ve had some minor upgrade issues with the blog lately, and I am only about halfway through updating everything. In the meantime, I’m afraid things will look a bit messy. (Syntax highlighting is currently broken, and there are probably other formatting / data issues as well. I think I have to restore the uploads directory, for one, so there probably won’t be any images in the posts for a while.)

Like food?

I started writing for another blog this week:

Brewed, Bottled, Cultured and Sweetened is a blog about beer, coffee, wine, cheese, chocolate, etc… that I’m writing with an old friend from Dagit.o

The Linux Tablet: Wacom rotations – waking up on the wrong side

Update: updated the script with improved (functional) error output. added notes about xhost.

There is an annoying bug in the sequence of code that manages the wacom rotation / sleep / resume and stylus calibration right now. (Where “right now” is Ubuntu Intrepid, with the 0.8.2-1 wacom drivers.)

This is a document bug over at the ubuntu launchpad, and the poster there does a fine job of describing the intricacies of reproducing the bug, so I’ll only give a brief explanation here to help get indexed.

If you rotate the screen any amount, even returning to the original rotation, and then sleep the machine, when it wakes up, the stylus will not be calibrated properly — the cursor will be off to the side of the stylus point. It doesn’t seem to matter how it was calibrated when the machine slept, nor does it matter what rotation you’re in when you put the machine to sleep.

There is one straightforward workaround: When you wake the machine, run wacomcpl, click on stylus, click calibrate (the mouse should now be under the stylus again), and exit wacomcpl. This is incredibly cumbersome, but at least it’s better than restarting X, which is what I have been doing.

Further inspection (based largely on the thread of comments on that launchpad bug) reveals that the problem is actually related to bad values for the TopX, TopY, BottomX and BottomY settings on the wacom devices after a resume. By resetting these to their proper values for the current rotation, we can reestablish the proper calibration. First off, we need to know the proper values, and the easiest way to get them is with xsetwacom:

#!/bin/bash
# wacomSettings

echo "TopX=" `xsetwacom get stylus TopX`
echo "TopY=" `xsetwacom get stylus TopY`
echo "BottomX=" `xsetwacom get stylus BottomX`
echo "BottomY=" `xsetwacom get stylus BottomY`

Now, we’ll run this for each rotation, and save the results. You should end up with something like the following:

|rogue on bach |AC 70% | @ 00:02:26 ~|
 $ xrotate 1 && wacomSettings
xrandr to left, xsetwacom to 2
TopX= -46
TopY= -3
BottomX= 18605
BottomY= 24518
 |rogue on bach |AC 70% | @ 00:02:28 ~|
 $ xrotate 2 && wacomSettings
xrandr to inverted, xsetwacom to 3
TopX= 58
TopY= -46
BottomX= 24579
BottomY= 18605
 |rogue on bach |AC 70% | @ 00:02:35 ~|
 $ xrotate 3 && wacomSettings
xrandr to right, xsetwacom to 1
TopX= -173
TopY= 58
BottomX= 18478
BottomY= 24579
 |rogue on bach |AC 70% | @ 00:02:41 ~|
 $ xrotate 0 && wacomSettings
xrandr to normal, xsetwacom to 0
TopX= -3
TopY= -173
BottomX= 24518
BottomY= 18478

(Note that my bash prompt looks like & command lines above are indented, and the output is left-aligned)

That gives us enough information to script the calibration when we resume. For example, when resuming to a “normal” rotation, I need to run:

xsetwacom set stylus TopX -3
xsetwacom set stylus TopY -173
xsetwacom set stylus BottomX 24518
xsetwacom set stylus BottomY 18478

(Wrap that in a bash script and give it a shot!)

Here’s the full script that gets the current orientation and then calibrates the common wacom devices:

#!/bin/bash
#
# waCalibrate.sh: recalibrates the wacom stylus
#
# Author: Rogan Creswick
# License: just be nice

# Set LOG to something reasonable:
# (The file does not need to exist, but the *directory* does)
LOG=/home/rogue/calibration.out
XSETWACOM=/usr/local/bin/xsetwacom


#
# Calibrates the wacom devices {stylus, eraser, cursor} with the
# given offsets:
#
#  Usage:
#     calibrate <topx> <topy> <bottomx> <bottomy>
#
function calibrate {
    ${XSETWACOM} --display :0.0 set stylus TopX $1 >> ${LOG} 2>&1
    ${XSETWACOM} --display :0.0 set stylus TopY $2 >> ${LOG} 2>&1
    ${XSETWACOM} --display :0.0 set stylus BottomX $3 >> ${LOG} 2>&1
    ${XSETWACOM} --display :0.0 set stylus BottomY $4 >> ${LOG} 2>&1

    ${XSETWACOM} --display :0.0 set eraser TopX $1 >> ${LOG} 2>&1
    ${XSETWACOM} --display :0.0 set eraser TopY $2 >> ${LOG} 2>&1
    ${XSETWACOM} --display :0.0 set eraser BottomX $3 >> ${LOG} 2>&1
    ${XSETWACOM} --display :0.0 set eraser BottomY $4 >> ${LOG} 2>&1

    ${XSETWACOM} --display :0.0 set cursor TopX $1 >> ${LOG} 2>&1
    ${XSETWACOM} --display :0.0 set cursor TopY $2 >> ${LOG} 2>&1
    ${XSETWACOM} --display :0.0 set cursor BottomX $3 >> ${LOG} 2>&1
    ${XSETWACOM} --display :0.0 set cursor BottomY $4 >> ${LOG} 2>&1
}


function fixCalibration {
    # get the current orientation:
    ORIENTATION=`xrandr --verbose --query | grep " connected" | awk '{print $5}'`
    echo "Orientation: ${ORIENTATION}" >> ${LOG}
   
    case "${ORIENTATION}" in
    normal)
        calibrate -3 -173 24518 18478   
        ;;
    left)
        calibrate -46 -3 18605 24518
        ;;
    right)
        calibrate -173 58 18478 24579
        ;;
    inverted)
        calibrate 58 -46 24579 18605
        ;;
    *)
        calibrate -3 -173 24518 18478
        echo "ERROR!! unknown orientation! ${ORIENTATION}" >> ${LOG}
        ;;
    esac
}

case "$1" in
    resume|thaw)
    date >> ${LOG}
    fixCalibration
    whoami >> ${LOG}
        ;;
    *)
    echo "not a resum|thaw event: $1" >> ${LOG}
        ;;
esac

Stick that in /etc/pm/sleep.d/40wacomCalibrate (or some similarly named file), make it executable by all (chmod a+x /etc/pm/sleep.d/40wacomCalibrate) and it should be run when the system resumes.

Update: I found that the logging of the old script didn’t work, so I’ve updated the script to reflect that. There were also some problems with how I was testing the first script, and the actions I was taking didn’t actually trigger the bug. (The bug seems to be quite state-dependent, and markovian assumption was wrong.) To get this to work, root will need to have access to the display that xsetwacom uses. The simplest way to do this is to add xhost + to you x startup. (I put it in my ~/.xsession just before exec enlightenment-start).