<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Bitwise Evolution &#187; java</title>
	<atom:link href="http://blog.ciscavate.org/category/java/feed" rel="self" type="application/rss+xml" />
	<link>http://blog.ciscavate.org</link>
	<description>Musings of a Seattle-area hacker with a bent on improving digital lifestyles.</description>
	<lastBuildDate>Sun, 25 Jul 2010 23:21:56 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Bitten by dependency management</title>
		<link>http://blog.ciscavate.org/2009/05/bitten-by-dependency-management.html</link>
		<comments>http://blog.ciscavate.org/2009/05/bitten-by-dependency-management.html#comments</comments>
		<pubDate>Fri, 22 May 2009 23:31:45 +0000</pubDate>
		<dc:creator>rcreswick</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[tech]]></category>

		<guid isPermaLink="false">http://blog.ciscavate.org/?p=103</guid>
		<description><![CDATA[I&#8217;ve started using Maven to manage my Java projects, and overall I&#8217;m very happy with it.  It seems to be more mature than Ivy, with better documentation, and the vast majority of tasks that I need &#8220;just work&#8221; (just don&#8217;t ask me about JNI&#8211;that&#8217;s another post).

Today, (and yesterday, and a good portion of the [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://blog.ciscavate.org/wp-content/2009/05/dependencies.png"><img src="http://blog.ciscavate.org/wp-content/2009/05/dependencies-small.png" alt="dependencies-small" title="dependencies-small" width="145" height="200" class="alignright size-full wp-image-106" /></a>I&#8217;ve started using Maven to manage my Java projects, and overall I&#8217;m very happy with it.  It seems to be more mature than Ivy, with better documentation, and the vast majority of tasks that I need &#8220;just work&#8221; (just don&#8217;t ask me about JNI&#8211;that&#8217;s another post).</p>

<p>Today (and yesterday, and a good portion of the night in between), I ran into a nasty bug in a library that I didn&#8217;t know my code depended on.  It isn&#8217;t particularly important <em>what</em> I was working on, but just for context: I needed to strip a lot of text content out of nodes in the complete Wikipedia revision history dump, so I was using SAX to parse the XML stream, filter out the stuff I wanted filtered out, and save the stuff that, well, I wanted saved.  Since the input was all of Wikipedia, there were a fair number of non-ASCII Unicode characters in there.  As it turns out, xercesImpl 2.6.2 has some sort of bug that lets XML containing certain characters be read without throwing exceptions, but when you try to write out the characters that were actually read, you end up trying to write characters that aren&#8217;t valid in XML.  Even if I&#8217;d known that in advance, my response would have been something like &#8220;ok, so what? I&#8217;m not using xercesImpl, and certainly not a version <em>that</em> old&#8221;.</p>

<p>Well.</p>

<p>You see, in addition to using Maven, I&#8217;ve also been using the <a href="http://code.google.com/p/google-collections/">Google Collections</a> and <a href="http://code.google.com/p/jsr-305/">JSR305</a> libraries, so I just drop those <code>&lt;dependency&gt;</code> entries into the pom for all my new projects&#8211;I just assume that I&#8217;ll need them, and I usually do.</p>

<p>Unfortunately, JSR305 1.3.8 depends on jaxen 1.1.1, which depends on xercesImpl 2.6.2 (jaxen also needs this dependency via xom 1.0, for what that&#8217;s worth).</p>

<p>Because that dependency was already present in my build path (via <code>mvn eclipse:eclipse</code>) and in the generated jar (via <code>&lt;addClasspath&gt;</code> and <code>&lt;classpathPrefix&gt;</code> in the <code>maven-jar-plugin</code> configuration section), I never realized that my SAX code actually had a <em>direct</em> dependency on Xerces as well.  This all came to a head when, 3.53 GB into my 2.8 TB run, these rather unhelpful exceptions started popping up:</p>

<div class="codecolorer-container bash"><div class="codecolorer" style="font-family: monospace;">java.io.IOException: The character <span class="st0">'?'</span> is an invalid XML character<br />
&nbsp; &nbsp; &nbsp; &nbsp;at org.apache.xml.serialize.BaseMarkupSerializer.characters<span class="br0">&#40;</span>Unknown<br />
Source<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;at com.stottlerhenke.tools.wikiparse.ContentStripper.characters<span class="br0">&#40;</span>ContentStripper.java:<span class="nu0">195</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;at org.apache.xerces.parsers.AbstractSAXParser.characters<span class="br0">&#40;</span>Unknown<br />
Source<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl<span class="re1">$FragmentContentDispatcher</span>.dispatch<span class="br0">&#40;</span>Unknown<br />
Source<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument<span class="br0">&#40;</span>Unknown<br />
Source<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;at org.apache.xerces.parsers.XML11Configuration.parse<span class="br0">&#40;</span>Unknown Source<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;at org.apache.xerces.parsers.XML11Configuration.parse<span class="br0">&#40;</span>Unknown Source<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;at org.apache.xerces.parsers.XMLParser.parse<span class="br0">&#40;</span>Unknown Source<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;at org.apache.xerces.parsers.AbstractSAXParser.parse<span class="br0">&#40;</span>Unknown Source<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;at com.stottlerhenke.tools.wikiparse.ContentStripper.parse<span class="br0">&#40;</span>ContentStripper.java:<span class="nu0">96</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;at com.stottlerhenke.tools.wikiparse.ContentStripper.main<span class="br0">&#40;</span>ContentStripper.java:<span class="nu0">379</span><span class="br0">&#41;</span></div></div>

<p><code>&lt;rant&gt;</code> &#8220;?&#8221; is not unicode (it fits just fine in ASCII tables everywhere), so please don&#8217;t tell me that it&#8217;s an invalid XML character :) (0xd800 <em>is</em> an invalid unicode character, and that would have been <em>much</em> more helpful) <code>&lt;/rant&gt;</code></p>

<p>Many hours later I was able to find a sample of the actual input that was causing these problems, and I was able to reproduce the issue with an input slightly smaller than 2.8 TB.  Once that was done, I set out to make a minimal test case.  Rather than bother with a new Maven project, I just hacked it out in Emacs (not using Google Collections, etc. because, clearly, I wanted it minimal).  To my surprise, everything worked, and worked fantastically! But how? I didn&#8217;t even supply an XML API on the classpath, yet it ran just fine!</p>

<p>In truth, I <em>did</em> supply an XML API (xercesImpl.jar, and many other libraries) via my environment&#8217;s <code>$CLASSPATH</code>.  (Figuring that out was another adventure, but I digress.)  Once it became clear that I was indeed using a broken library, it was simply a matter of explicitly specifying the dependency on a newer version of xercesImpl and rebuilding.</p>

<p>The moral?</p>

<p>Know your dependencies!  This should come along with knowing your language&#8217;s built-in APIs well.  It wasn&#8217;t clear to me that the SAX packages I was using were not part of the core Java API, so it didn&#8217;t strike me as odd that I didn&#8217;t need to specify a classpath entry or a POM dependency before I could use SAX.</p>

<p>If you suspect something strange, you can see the dependency tree in the generated HTML documentation you get when running <code>mvn site</code>, or print it to the console with <code>mvn dependency:tree</code>.</p>
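
<p>Another way to check where a class actually came from is to ask the JVM at runtime. The sketch below is an illustration, not code from the project described above: <code>WhichJar</code> and <code>locationOf</code> are made-up names, and it relies only on the standard <code>Class.getProtectionDomain()</code> and <code>CodeSource.getLocation()</code> calls.</p>

```java
import java.security.CodeSource;

/** Reports where the JVM loaded a class from (hypothetical helper, not from the post). */
final class WhichJar {

    /** Returns the jar or directory that supplied the class, or a note for JDK classes. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Classes loaded by the bootstrap loader usually have no code source.
            return "JDK runtime (no code source)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name; the default is a SAX class that ships with the JDK.
        String name = args.length > 0 ? args[0] : "org.xml.sax.helpers.DefaultHandler";
        System.out.println(name + " came from " + locationOf(Class.forName(name)));
    }
}
```

<p>Running it on the suspect classpath with a class name taken from the stack trace (for example <code>org.apache.xerces.parsers.AbstractSAXParser</code>) should print the jar that supplied it, assuming that class is on the classpath at all; otherwise <code>Class.forName</code> throws a <code>ClassNotFoundException</code>.</p>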
]]></content:encoded>
			<wfw:commentRss>http://blog.ciscavate.org/2009/05/bitten-by-dependency-management.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Auto-documenting OSGi CommandProviders</title>
		<link>http://blog.ciscavate.org/2008/10/auto-documenting-osgi-commandproviders.html</link>
		<comments>http://blog.ciscavate.org/2008/10/auto-documenting-osgi-commandproviders.html#comments</comments>
		<pubDate>Mon, 20 Oct 2008 23:08:54 +0000</pubDate>
		<dc:creator>rcreswick</dc:creator>
				<category><![CDATA[OSGi]]></category>
		<category><![CDATA[eclipse]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[tech]]></category>

		<guid isPermaLink="false">http://blog.ciscavate.org/2008/10/auto-documenting-osgi-commandproviders.html</guid>
		<description><![CDATA[(Edit: If you&#8217;re reading this after OSGi R4.2, then there is almost certainly a better way to accomplish the same thing)

I&#8217;ve been digging into OSGi a bit over the last week or so in order to
create some Eclipse plugins that will automatically discover
each other, and I&#8217;ve been impressed with the framework on the
whole.  The documentation [...]]]></description>
			<content:encoded><![CDATA[<p>(<strong>Edit:</strong> If you&#8217;re reading this after OSGi R4.2, then there is almost certainly a better way to accomplish the same thing)</p>

<p>I&#8217;ve been digging into OSGi a bit over the last week or so in order to
create some Eclipse plugins that will automatically discover
each other, and I&#8217;ve been impressed with the framework on the
whole.  The documentation is a bit lacking, but there are some good
blog posts about it.  (Specifically <a href="http://neilbartlett.name/blog/osgi-articles/">Neil Bartlett&#8217;s introduction to
OSGi</a>.)</p>

<p>One thing that bugged me is the repetition needed when you implement
the CommandProvider interface to add commands to the OSGi console.
CommandProvider defines one method you must supply:</p>

<div class="codecolorer-container java" style="height:35px;"><div class="codecolorer" style="font-family: monospace;"><span class="kw2">public</span> <a href="http://www.google.com/search?hl=en&amp;q=allinurl%3AString+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">String</span></a> getHelp<span class="br0">&#40;</span><span class="br0">&#41;</span></div></div>

<p>OSGi then uses reflection to extract each of the methods that starts
with an &#8220;_&#8221;, and supplies those methods to the command environment as
new commands.  (The &#8220;_&#8221; is trimmed, and the rest of the method name becomes
the command name.)  General practice is to include the name of the
command in the return value of <code>getHelp()</code>, along with a description of
what it does, e.g.:</p>

<div class="codecolorer-container java"><div class="codecolorer" style="font-family: monospace;"><span class="kw2">public</span> <span class="kw2">class</span> SampleCommandProvider <span class="kw2">implements</span> CommandProvider <span class="br0">&#123;</span><br />
<br />
&nbsp; &nbsp;<span class="kw2">public</span> <span class="kw2">synchronized</span> <span class="kw4">void</span> _run<span class="br0">&#40;</span>CommandInterpreter ci<span class="br0">&#41;</span> <span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; <span class="co1">// do stuff.</span><br />
&nbsp; &nbsp;<span class="br0">&#125;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp;<span class="kw2">public</span> <a href="http://www.google.com/search?hl=en&amp;q=allinurl%3AString+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">String</span></a> getHelp<span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; <span class="kw2">return</span> <span class="st0">&quot;<span class="es0">\t</span>run - execute a Runnable service&quot;</span>;<br />
&nbsp; &nbsp;<span class="br0">&#125;</span><br />
<span class="br0">&#125;</span></div></div>

<p>This seems like a pain to maintain, so I took a quick look at
annotations, and propose a new syntax:</p>

<div class="codecolorer-container java"><div class="codecolorer" style="font-family: monospace;"><span class="kw2">public</span> <span class="kw2">class</span> SampleCommandProvider <span class="kw2">extends</span><br />
&nbsp; &nbsp;DescriptiveCommandProvider <span class="br0">&#123;</span><br />
<br />
&nbsp; &nbsp;@CmdDescr<span class="br0">&#40;</span>description=<span class="st0">&quot;execute a Runnable service&quot;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp;<span class="kw2">public</span> <span class="kw2">synchronized</span> <span class="kw4">void</span> _run<span class="br0">&#40;</span>CommandInterpreter ci<span class="br0">&#41;</span> <span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; <span class="co1">// do stuff.</span><br />
&nbsp; &nbsp;<span class="br0">&#125;</span><br />
<span class="br0">&#125;</span></div></div>

<p>Here we&#8217;ve extracted the <code>getHelp()</code> method into a new superclass, so
our SampleCommandProvider now extends an abstract class instead of
implementing an interface.  It also makes use of an Annotation, which
we need to define:</p>

<div class="codecolorer-container java"><div class="codecolorer" style="font-family: monospace;"><span class="co2">import java.lang.annotation.ElementType;</span><br />
<span class="co2">import java.lang.annotation.Retention;</span><br />
<span class="co2">import java.lang.annotation.RetentionPolicy;</span><br />
<span class="co2">import java.lang.annotation.Target;</span><br />
&nbsp; &nbsp; <br />
@Retention<span class="br0">&#40;</span>RetentionPolicy.<span class="me1">RUNTIME</span><span class="br0">&#41;</span><br />
@Target<span class="br0">&#40;</span>ElementType.<span class="me1">METHOD</span><span class="br0">&#41;</span><br />
<span class="kw2">public</span> @<span class="kw2">interface</span> CmdDescr <span class="br0">&#123;</span><br />
&nbsp; &nbsp;<a href="http://www.google.com/search?hl=en&amp;q=allinurl%3AString+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">String</span></a> description<span class="br0">&#40;</span><span class="br0">&#41;</span>;<br />
<span class="br0">&#125;</span></div></div>

<p>Finally, we just need to define the superclass that implements
<code>getHelp()</code>:</p>

<div class="codecolorer-container java" style="height:280px;"><div class="codecolorer" style="font-family: monospace;"><span class="co2">import java.lang.reflect.Method;</span><br />
<span class="co2">import java.util.regex.Matcher;</span><br />
<span class="co2">import java.util.regex.Pattern;</span><br />
<br />
<span class="co2">import org.eclipse.osgi.framework.console.CommandProvider;</span><br />
<br />
<span class="kw2">public</span> <span class="kw2">abstract</span> <span class="kw2">class</span> DescriptiveCommandProvider <span class="kw2">implements</span> CommandProvider <span class="br0">&#123;</span><br />
&nbsp; &nbsp;<br />
&nbsp; &nbsp;<span class="kw2">private</span> <span class="kw2">static</span> <span class="kw2">final</span> Pattern CMD_PATTERN = Pattern.<span class="me1">compile</span><span class="br0">&#40;</span><span class="st0">&quot;^_(.*)&quot;</span><span class="br0">&#41;</span>;<br />
&nbsp; &nbsp;<span class="kw2">private</span> <a href="http://www.google.com/search?hl=en&amp;q=allinurl%3AString+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">String</span></a> help = <span class="kw2">null</span>;<br />
&nbsp; &nbsp;<br />
&nbsp; &nbsp;<span class="kw2">public</span> <a href="http://www.google.com/search?hl=en&amp;q=allinurl%3AString+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">String</span></a> getHelp<span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span><span class="kw2">null</span> == help<span class="br0">&#41;</span><span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;help = buildHelp<span class="br0">&#40;</span><span class="br0">&#41;</span>;<br />
&nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span><br />
&nbsp; &nbsp; &nbsp; <span class="kw2">return</span> help;<br />
&nbsp; &nbsp;<span class="br0">&#125;</span><br />
<br />
&nbsp; &nbsp;<span class="kw2">private</span> <a href="http://www.google.com/search?hl=en&amp;q=allinurl%3AString+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">String</span></a> buildHelp<span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; StringBuilder helpBuff = <span class="kw2">new</span> StringBuilder<span class="br0">&#40;</span><span class="br0">&#41;</span>;<br />
&nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; <span class="kw1">for</span> <span class="br0">&#40;</span><a href="http://www.google.com/search?hl=en&amp;q=allinurl%3AMethod+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">Method</span></a> m : <span class="kw2">this</span>.<span class="me1">getClass</span><span class="br0">&#40;</span><span class="br0">&#41;</span>.<span class="me1">getMethods</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">if</span> <span class="br0">&#40;</span>methodIsCmd<span class="br0">&#40;</span>m<span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#123;</span>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span><span class="nu0">0</span> != helpBuff.<span class="me1">length</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;helpBuff.<span class="me1">append</span><span class="br0">&#40;</span><span class="st0">&quot;<span class="es0">\n</span>&quot;</span><span class="br0">&#41;</span>;<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; helpBuff.<span class="me1">append</span><span class="br0">&#40;</span>getDocumentation<span class="br0">&#40;</span>m<span class="br0">&#41;</span><span class="br0">&#41;</span>;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="br0">&#125;</span><br />
&nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span><br />
&nbsp; &nbsp; &nbsp; <span class="kw2">return</span> helpBuff.<span class="me1">toString</span><span class="br0">&#40;</span><span class="br0">&#41;</span>;<br />
&nbsp; &nbsp;<span class="br0">&#125;</span><br />
<br />
&nbsp; &nbsp;<span class="kw2">private</span> <span class="kw4">boolean</span> methodIsCmd<span class="br0">&#40;</span><a href="http://www.google.com/search?hl=en&amp;q=allinurl%3AMethod+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">Method</span></a> m<span class="br0">&#41;</span> <span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; <span class="kw2">return</span> CMD_PATTERN.<span class="me1">matcher</span><span class="br0">&#40;</span>m.<span class="me1">getName</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>.<span class="me1">matches</span><span class="br0">&#40;</span><span class="br0">&#41;</span>;<br />
&nbsp; &nbsp;<span class="br0">&#125;</span><br />
<br />
&nbsp; &nbsp;<span class="kw2">private</span> <a href="http://www.google.com/search?hl=en&amp;q=allinurl%3AString+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">String</span></a> getDocumentation<span class="br0">&#40;</span><a href="http://www.google.com/search?hl=en&amp;q=allinurl%3AMethod+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">Method</span></a> m<span class="br0">&#41;</span> <span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; StringBuilder methodHelp = <span class="kw2">new</span> StringBuilder<span class="br0">&#40;</span><span class="br0">&#41;</span>;<br />
&nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; Matcher matcher = CMD_PATTERN.<span class="me1">matcher</span><span class="br0">&#40;</span>m.<span class="me1">getName</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>;<br />
&nbsp; &nbsp; &nbsp; <span class="kw1">if</span><span class="br0">&#40;</span>matcher.<span class="me1">matches</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;methodHelp.<span class="me1">append</span><span class="br0">&#40;</span><span class="st0">&quot;<span class="es0">\t</span>&quot;</span>+matcher.<span class="me1">group</span><span class="br0">&#40;</span><span class="nu0">1</span><span class="br0">&#41;</span><span class="br0">&#41;</span>;<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;CmdDescr description = m.<span class="me1">getAnnotation</span><span class="br0">&#40;</span>CmdDescr.<span class="kw2">class</span><span class="br0">&#41;</span>;<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">if</span> <span class="br0">&#40;</span><span class="kw2">null</span> != description<span class="br0">&#41;</span><span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; methodHelp.<span class="me1">append</span><span class="br0">&#40;</span><span class="st0">&quot; - &quot;</span>+description.<span class="me1">description</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>;<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="br0">&#125;</span><br />
&nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span><br />
&nbsp; &nbsp; &nbsp; <span class="kw2">return</span> methodHelp.<span class="me1">toString</span><span class="br0">&#40;</span><span class="br0">&#41;</span>;<br />
&nbsp; &nbsp;<span class="br0">&#125;</span><br />
<span class="br0">&#125;</span></div></div>

<p>Note that the actual reflection on the class only happens once &#8212; all
subsequent calls to <code>getHelp()</code> use a cached copy of the documentation.</p>
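
<p>One detail worth knowing about this approach: <code>Class.getMethods()</code> returns methods in no particular order, so the help lines may come out in a different order from one JVM to the next. The sketch below is one possible variation, not part of the class above, that sorts the commands by name. To stay runnable without OSGi on the classpath it uses stand-ins: <code>Descr</code> plays the role of <code>CmdDescr</code>, and <code>Commands</code> and <code>SortedHelpSketch</code> are hypothetical names.</p>

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

final class SortedHelpSketch {

    /** Local stand-in for the CmdDescr annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Descr {
        String description();
    }

    /** Hypothetical provider; it uses no OSGi types so the sketch runs anywhere. */
    public static class Commands {
        @Descr(description = "stop the service")
        public void _stop() { }

        @Descr(description = "execute a Runnable service")
        public void _run() { }
    }

    /** Builds one help line per "_" method, ordered by command name. */
    static String buildHelp(Class<?> provider) {
        // TreeMap keeps commands sorted; overloaded commands would collapse into one entry.
        Map<String, String> lines = new TreeMap<>();
        for (Method m : provider.getMethods()) {
            String name = m.getName();
            if (!name.startsWith("_")) {
                continue;
            }
            String cmd = name.substring(1);
            Descr d = m.getAnnotation(Descr.class);
            lines.put(cmd, "\t" + cmd + (d == null ? "" : " - " + d.description()));
        }
        StringBuilder help = new StringBuilder();
        for (String line : lines.values()) {
            if (help.length() != 0) {
                help.append("\n");
            }
            help.append(line);
        }
        return help.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildHelp(Commands.class));
    }
}
```

<p>With the two sample commands, <code>run</code> is listed before <code>stop</code> regardless of the order in which reflection reports them.</p>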

<p>Neil Bartlett&#8217;s OSGi articles: <a href="http://neilbartlett.name/blog/osgi-articles/">http://neilbartlett.name/blog/osgi-articles/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ciscavate.org/2008/10/auto-documenting-osgi-commandproviders.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Creating Wizards in Java</title>
		<link>http://blog.ciscavate.org/2008/07/creating-wizards-in-java.html</link>
		<comments>http://blog.ciscavate.org/2008/07/creating-wizards-in-java.html#comments</comments>
		<pubDate>Tue, 01 Jul 2008 19:54:24 +0000</pubDate>
		<dc:creator>rcreswick</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[tech]]></category>

		<guid isPermaLink="false">http://blog.ciscavate.org/?p=35</guid>
		<description><![CDATA[A recent project at work required building a multi-step dialog to manage the interface between a user and an expert system (and some fairly advanced NLP to boot).  On the surface this looked like a fairly standard Wizard problem &#8212; design a bunch of screens with questions, and then collect the answers as the [...]]]></description>
			<content:encoded><![CDATA[<p>A recent project at work required building a multi-step dialog to manage the interface between a user and an expert system (and some fairly advanced NLP to boot).  On the surface this looked like a fairly standard Wizard problem &#8212; design a bunch of screens with questions, and then collect the answers as the user proceeded through the dialogs.  However, the Wizard APIs I found were either not very mature or (in the case of the <a href="http://wizard.dev.java.net">Java.net Wizards</a>) it was very difficult to create complex branching behaviors, and those branches were extremely resistant to change.  Both things are essentially show-stoppers when working with prototypes that frequently need modification.</p>

<p>In the end, I spent a weekend and a couple of evenings building a new Wizard API for Java, called <a href="http://code.google.com/p/cjwizard">CJWizard</a>.  The library is released under the Apache License, Version 2.0, so it should work for just about anything you want to use it for.  I would like to know if you&#8217;re using it, and what you&#8217;re using it for, just to sate my own curiosity :).  The project is hosted on code.google.com, so please submit issues, and feel free to contribute to the project.</p>

<p><a href="http://code.google.com/p/cjwizard">CJWizard</a> provides the structure needed to quickly create simple dialogs by implementing an abstract class (WizardPage) for each page of the dialog, and adding them to a PageFactory that can generate pages on-demand, as they are required.  This puts the programmer in full control of how the wizard proceeds.  The CJWizard architecture also makes it easy to add a wizard to an existing application (either via an additional JDialog, or embedding in some other component), and/or insert custom wrapper widgets around the dialog pages&#8211;meaning that you can quickly add customized navigational controls aside from the standard Previous/Next/Finish/Cancel buttons.</p>
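
<p>To make the &#8220;pages on demand&#8221; idea concrete, here is a rough sketch of a factory that picks the next page from the path taken so far and the answers collected so far. It does <em>not</em> use the CJWizard API: <code>NextPage</code>, <code>walk</code>, the <code>expertMode</code> setting, and the page titles are all invented for illustration, and pages are reduced to plain strings.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class WizardFlowSketch {

    /** Hypothetical factory: chooses the next page from the path and the collected answers. */
    interface NextPage {
        /** Returns the next page's title, or null when the wizard is finished. */
        String create(List<String> path, Map<String, Object> settings);
    }

    /** Branches on an earlier answer, so the route is decided page by page. */
    static final NextPage BRANCHING = new NextPage() {
        @Override
        public String create(List<String> path, Map<String, Object> settings) {
            if (path.isEmpty()) {
                return "Welcome";
            }
            String last = path.get(path.size() - 1);
            if (last.equals("Welcome")) {
                boolean expert = Boolean.TRUE.equals(settings.get("expertMode"));
                return expert ? "Advanced options" : "Basic options";
            }
            if (last.endsWith("options")) {
                return "Summary";
            }
            return null; // nothing follows the summary page
        }
    };

    /** Asks the factory for pages until it returns null; returns the titles visited. */
    static List<String> walk(NextPage factory, Map<String, Object> settings) {
        List<String> path = new ArrayList<>();
        String next;
        while ((next = factory.create(path, settings)) != null) {
            path.add(next);
        }
        return path;
    }

    public static void main(String[] args) {
        Map<String, Object> settings = new HashMap<>();
        settings.put("expertMode", Boolean.TRUE);
        System.out.println(walk(BRANCHING, settings));
    }
}
```

<p>Because the branch is taken inside the factory, changing the flow means editing one method rather than rewiring a fixed list of pages, which is the property that matters when prototyping.</p>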

<p>Some aspects were taken from the Java.Net wizard API, such as auto-detecting named components, and automatically collecting the values from them, but CJWizard takes a much simpler approach (and in some ways, a less powerful one &#8212; CJWizard does not listen to every key event, only collecting values when the user navigates away from a WizardPage). In most cases, you only need to name widgets prior to adding them to the WizardPage, and their values will be collected in a settings map automatically.</p>

<p>CJWizard was meant to provide a flexible way to generate professional-looking multi-step dialogs very quickly.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ciscavate.org/2008/07/creating-wizards-in-java.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Day to day Memoization</title>
		<link>http://blog.ciscavate.org/2008/01/day-to-day-memoization.html</link>
		<comments>http://blog.ciscavate.org/2008/01/day-to-day-memoization.html#comments</comments>
		<pubDate>Mon, 28 Jan 2008 11:02:08 +0000</pubDate>
		<dc:creator>rcreswick</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[tech]]></category>

		<guid isPermaLink="false">http://blog.ciscavate.org/?p=33</guid>
		<description><![CDATA[Memoization (not memorization) is the process of remembering the
results of a computation for use later.  (I think of it as &#8220;making a
memo&#8221; to look back on later.)  Memoization is at the core of any dynamic
programming implementation, and allows many simple algorithms to run
in linear or polynomial time when they would otherwise take an
exponential number [...]]]></description>
			<content:encoded><![CDATA[<p>Memoization (not <strong>memorization</strong>) is the process of remembering the
results of a computation for use later.  (I think of it as &#8220;making a
memo&#8221; to look back on later.)  Memoization is at the core of any dynamic
programming implementation, and allows many simple algorithms to run
in linear or polynomial time when they would otherwise take an
exponential number of operations to complete.  This is most obvious in
the typical recursive Fibonacci example.  Consider the code:</p>

<div class="codecolorer-container java"><div class="codecolorer" style="font-family: monospace;"><span class="kw2">public</span> <span class="kw2">class</span> Fib<span class="br0">&#123;</span><br />
&nbsp; &nbsp;<span class="kw2">public</span> <span class="kw2">static</span> <span class="kw4">void</span> main<span class="br0">&#40;</span><a href="http://www.google.com/search?hl=en&amp;q=allinurl%3AString+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">String</span></a><span class="br0">&#91;</span><span class="br0">&#93;</span> args<span class="br0">&#41;</span><span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; <a href="http://www.google.com/search?hl=en&amp;q=allinurl%3ASystem+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">System</span></a>.<span class="me1">out</span>.<span class="me1">println</span><span class="br0">&#40;</span><span class="st0">&quot;done: fib of &quot;</span>+args<span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span>+<span class="st0">&quot;=&quot;</span>+<br />
&nbsp; &nbsp; &nbsp; fib<span class="br0">&#40;</span><a href="http://www.google.com/search?hl=en&amp;q=allinurl%3AInteger+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">Integer</span></a>.<span class="me1">parseInt</span><span class="br0">&#40;</span>args<span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>;<br />
&nbsp; &nbsp;<span class="br0">&#125;</span><br />
<br />
&nbsp; &nbsp;<span class="kw2">public</span> <span class="kw2">static</span> <span class="kw4">int</span> fib<span class="br0">&#40;</span><span class="kw4">int</span> n<span class="br0">&#41;</span><span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; <span class="kw4">int</span> rval = <span class="nu0">1</span>;<br />
&nbsp; &nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span>n &gt;= <span class="nu0">2</span><span class="br0">&#41;</span><span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;rval = fib<span class="br0">&#40;</span>n - <span class="nu0">1</span><span class="br0">&#41;</span> + fib<span class="br0">&#40;</span>n - <span class="nu0">2</span><span class="br0">&#41;</span>;<br />
&nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span><br />
&nbsp; &nbsp; &nbsp; <a href="http://www.google.com/search?hl=en&amp;q=allinurl%3ASystem+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">System</span></a>.<span class="me1">out</span>.<span class="me1">println</span><span class="br0">&#40;</span><span class="st0">&quot;fib(&quot;</span>+n+<span class="st0">&quot;) = &quot;</span>+rval<span class="br0">&#41;</span>;<br />
&nbsp; &nbsp; &nbsp; <span class="kw2">return</span> rval;<br />
&nbsp; &nbsp;<span class="br0">&#125;</span><br />
<span class="br0">&#125;</span></div></div>

<p>This is a straightforward recursive implementation of fib.  When run
with <code>n=4</code>, we see this:</p>

<div class="codecolorer-container bash"><div class="codecolorer" style="font-family: monospace;">$ javac Fib.java &amp;&amp; java Fib <span class="nu0">4</span><br />
fib<span class="br0">&#40;</span><span class="nu0">1</span><span class="br0">&#41;</span> = <span class="nu0">1</span><br />
fib<span class="br0">&#40;</span><span class="nu0">0</span><span class="br0">&#41;</span> = <span class="nu0">1</span><br />
fib<span class="br0">&#40;</span><span class="nu0">2</span><span class="br0">&#41;</span> = <span class="nu0">2</span><br />
fib<span class="br0">&#40;</span><span class="nu0">1</span><span class="br0">&#41;</span> = <span class="nu0">1</span><br />
fib<span class="br0">&#40;</span><span class="nu0">3</span><span class="br0">&#41;</span> = <span class="nu0">3</span><br />
fib<span class="br0">&#40;</span><span class="nu0">1</span><span class="br0">&#41;</span> = <span class="nu0">1</span><br />
fib<span class="br0">&#40;</span><span class="nu0">0</span><span class="br0">&#41;</span> = <span class="nu0">1</span><br />
fib<span class="br0">&#40;</span><span class="nu0">2</span><span class="br0">&#41;</span> = <span class="nu0">2</span><br />
fib<span class="br0">&#40;</span><span class="nu0">4</span><span class="br0">&#41;</span> = <span class="nu0">5</span><br />
<span class="kw1">done</span>: fib of <span class="nu0">4</span>=<span class="nu0">5</span></div></div>

<p><strong>9</strong> invocations of <code>fib(n)</code>, but only 5 <strong>unique</strong> arguments.  Let&#8217;s
memoize the results, and try this again:</p>

<div class="codecolorer-container bash"><div class="codecolorer" style="font-family: monospace;">$ javac Fib.java &amp;&amp; java Fib <span class="nu0">4</span><br />
fib<span class="br0">&#40;</span><span class="nu0">1</span><span class="br0">&#41;</span> = <span class="nu0">1</span><br />
fib<span class="br0">&#40;</span><span class="nu0">0</span><span class="br0">&#41;</span> = <span class="nu0">1</span><br />
fib<span class="br0">&#40;</span><span class="nu0">2</span><span class="br0">&#41;</span> = <span class="nu0">2</span><br />
fib<span class="br0">&#40;</span><span class="nu0">3</span><span class="br0">&#41;</span> = <span class="nu0">3</span><br />
fib<span class="br0">&#40;</span><span class="nu0">4</span><span class="br0">&#41;</span> = <span class="nu0">5</span><br />
<span class="kw1">done</span>: fib of <span class="nu0">4</span>=<span class="nu0">5</span></div></div>

<p><strong>Much</strong> better &#8212; 5 invocations, 5 unique sets of parameters.</p>

<p>Here&#8217;s the source with memoization:</p>

<div class="codecolorer-container java" style="height:280px;"><div class="codecolorer" style="font-family: monospace;"><span class="kw2">import</span> java.util.HashMap;<br />
<span class="kw2">import</span> java.util.Map;<br />
<br />
<span class="kw2">public</span> <span class="kw2">class</span> Fib<span class="br0">&#123;</span><br />
&nbsp; &nbsp;<span class="kw2">static</span> Map&lt;Integer, Integer&gt; memos = <span class="kw2">new</span> <a href="http://www.google.com/search?hl=en&amp;q=allinurl%3AHashMap+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">HashMap</span></a>&lt;Integer, Integer&gt;<span class="br0">&#40;</span><span class="br0">&#41;</span>; <span class="co1">// new</span><br />
<br />
&nbsp; &nbsp;<span class="kw2">public</span> <span class="kw2">static</span> <span class="kw4">void</span> main<span class="br0">&#40;</span><a href="http://www.google.com/search?hl=en&amp;q=allinurl%3AString+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">String</span></a><span class="br0">&#91;</span><span class="br0">&#93;</span> args<span class="br0">&#41;</span><span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; <a href="http://www.google.com/search?hl=en&amp;q=allinurl%3ASystem+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">System</span></a>.<span class="me1">out</span>.<span class="me1">println</span><span class="br0">&#40;</span><span class="st0">&quot;done: fib of &quot;</span>+args<span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span>+<span class="st0">&quot;=&quot;</span>+<br />
&nbsp; &nbsp; &nbsp; fib<span class="br0">&#40;</span><a href="http://www.google.com/search?hl=en&amp;q=allinurl%3AInteger+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">Integer</span></a>.<span class="me1">parseInt</span><span class="br0">&#40;</span>args<span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>;<br />
&nbsp; &nbsp;<span class="br0">&#125;</span><br />
<br />
&nbsp; &nbsp;<span class="kw2">public</span> <span class="kw2">static</span> <span class="kw4">int</span> fib<span class="br0">&#40;</span><span class="kw4">int</span> n<span class="br0">&#41;</span><span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span>memos.<span class="me1">containsKey</span><span class="br0">&#40;</span>n<span class="br0">&#41;</span><span class="br0">&#41;</span> <span class="co1">// new</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="kw2">return</span> memos.<span class="me1">get</span><span class="br0">&#40;</span>n<span class="br0">&#41;</span>;&nbsp; <span class="co1">// new</span><br />
<br />
&nbsp; &nbsp; &nbsp; <span class="kw4">int</span> rval = <span class="nu0">1</span>;<br />
&nbsp; &nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span>n &amp;gt;= <span class="nu0">2</span><span class="br0">&#41;</span> <span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;rval = fib<span class="br0">&#40;</span>n - <span class="nu0">1</span><span class="br0">&#41;</span> + fib<span class="br0">&#40;</span>n - <span class="nu0">2</span><span class="br0">&#41;</span>;<br />
&nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span><br />
&nbsp; &nbsp; &nbsp; <a href="http://www.google.com/search?hl=en&amp;q=allinurl%3ASystem+java.sun.com&amp;bntI=I%27m%20Feeling%20Lucky"><span class="kw3">System</span></a>.<span class="me1">out</span>.<span class="me1">println</span><span class="br0">&#40;</span><span class="st0">&quot;fib(&quot;</span>+n+<span class="st0">&quot;) = &quot;</span>+rval<span class="br0">&#41;</span>;<br />
&nbsp; &nbsp; &nbsp; memos.<span class="me1">put</span><span class="br0">&#40;</span>n, rval<span class="br0">&#41;</span>;&nbsp; &nbsp; &nbsp; &nbsp;<span class="co1">// new</span><br />
&nbsp; &nbsp; &nbsp; <span class="kw2">return</span> rval;<br />
&nbsp; &nbsp;<span class="br0">&#125;</span><br />
<span class="br0">&#125;</span></div></div>

<p>Notice that we only needed to add 4 new lines of code (plus the
<code>java.util</code> imports for <code>Map</code> and <code>HashMap</code>) in order to
memoize the results.  When <code>fib(n)</code> is called, it first checks
whether it has previously been called with <code>n</code>, and if so, the stored
result is returned immediately.  If the parameter has never been seen before, the method
continues as normal, storing the computed result before returning.
Memoization turns this naive (exponential-time) implementation of <code>fib(n)</code>
into an efficient (linear-time) operation, because each distinct value of
<code>n</code> is computed only once.</p>
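<p>To make the exponential-versus-linear claim concrete, here is a minimal sketch that counts how often the body of <code>fib</code> actually runs.  The class name <code>FibCallCounter</code> and its counters are hypothetical additions for illustration; the recursion itself is the same as above, with <code>fib(0) = fib(1) = 1</code>.</p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of the original Fib class): counts how many
// times the body of fib runs, with and without memos.
class FibCallCounter {
    static long naiveCalls = 0;
    static long memoCalls = 0;
    static final Map<Integer, Integer> memos = new HashMap<>();

    // Same recursion as the post, no memoization: every call is counted.
    static int naiveFib(int n) {
        naiveCalls++;
        if (n >= 2) {
            return naiveFib(n - 1) + naiveFib(n - 2);
        }
        return 1;
    }

    // Memoized version: the counter only ticks on a cache miss.
    static int memoFib(int n) {
        Integer cached = memos.get(n);
        if (cached != null) {
            return cached;
        }
        memoCalls++;
        int rval = 1;
        if (n >= 2) {
            rval = memoFib(n - 1) + memoFib(n - 2);
        }
        memos.put(n, rval);
        return rval;
    }

    public static void main(String[] args) {
        int n = 20;
        int naive = naiveFib(n);
        int memo = memoFib(n);
        System.out.println("naive: fib(" + n + ") = " + naive + " in " + naiveCalls + " calls");
        System.out.println("memo:  fib(" + n + ") = " + memo + " in " + memoCalls + " computations");
    }
}
```

<p>For <code>n=4</code> the counters come out as 9 and 5, matching the two runs shown earlier.  For <code>n=20</code> the naive version makes 21891 calls, while the memoized version computes only 21 distinct values.</p>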

<h2>Memoization in the real world</h2>

<p>So, (un?)fortunately we don&#8217;t spend all day implementing cool new ways
of computing ever-increasing entries of the Fibonacci sequence, so how
can memoization be put to use? After all, many algorithms are already
implemented in some fairly optimal fashion by the language APIs, and
you&#8217;d be a fool not to use those implementations.  What opportunity
will you have to memoize functions?</p>

<p>It turns out that you can memoize <em>anything</em>, as long as the function
is <em>pure</em> with respect to the memos (meaning: the function doesn&#8217;t depend on anything that is not used to key the hash of memos).  If the function is not pure, you can still use memoization, but the memo hash must then key on all the state and parameters that can affect the result of the function.  Alternatively, if the function depends on some state that changes very rarely, it may make more sense to simply discard all the stored memos each time that state is altered.</p>
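<p>The discard-on-change approach might look roughly like the sketch below.  Everything in it is hypothetical: the class <code>ShippingQuotes</code>, its <code>ratePerKg</code> field, and the multiplication standing in for an expensive calculation are invented purely to show the shape of the idea.</p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: the result depends on the argument plus one piece of
// rarely-changing state (ratePerKg), so the memos are keyed on the argument
// alone and thrown away whenever that state changes.
class ShippingQuotes {
    private double ratePerKg = 2.0;
    private final Map<Integer, Double> memos = new HashMap<>();
    int computations = 0; // only here to make cache hits and misses visible

    // Changing the state makes every stored memo stale, so drop them all.
    void setRatePerKg(double newRate) {
        ratePerKg = newRate;
        memos.clear();
    }

    double quote(int weightKg) {
        Double cached = memos.get(weightKg);
        if (cached != null) {
            return cached;
        }
        computations++;
        double result = weightKg * ratePerKg; // stand-in for an expensive calculation
        memos.put(weightKg, result);
        return result;
    }

    public static void main(String[] args) {
        ShippingQuotes quotes = new ShippingQuotes();
        System.out.println(quotes.quote(10)); // computed
        System.out.println(quotes.quote(10)); // served from the memos
        quotes.setRatePerKg(3.0);             // state changed: memos discarded
        System.out.println(quotes.quote(10)); // computed again with the new rate
        System.out.println("computations: " + quotes.computations);
    }
}
```

<p>The important part is that the only way to change <code>ratePerKg</code> is through the setter that also clears the memos.  If the state can be changed behind the cache&#8217;s back, stale results will leak out.</p>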

<p>Memoization is extremely handy when you have operations that are both
very common and fairly expensive.  I recently needed to optimize some code that
compares strings based on the case-insensitive stems of their words,
with stopwords removed.  So the strings &#8220;he wanted an apple&#8221; and &#8220;he
wants apples&#8221; should be equal (&#8220;an&#8221; is a stopword, and is ignored).</p>

<p>This meant doing many, many calls to a string stemmer, each of which
is a fairly expensive operation.  Fortunately, hashing strings is
extremely cheap (on the order of a quarter of the time it took to stem a
string of the same length), and I had plenty of memory to store the
parameters and the results in a <code>Map</code>.  Adding memos to
the two primary time-hoggers (the stemmer and a tokenizer) cut the
execution time of the application down from 2 hours to just over 7
minutes.</p>
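<p>A rough sketch of that kind of wrapper follows.  It is not the code from the project: the <code>Memoized</code> class and the <code>toyStem</code> method (which just lower-cases a word and strips one trailing &#8220;s&#8221;) are hypothetical stand-ins, since the real stemmer and its API are not shown here.</p>

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical generic wrapper: remembers the result for every key it has seen.
class Memoized<K, V> implements Function<K, V> {
    private final Function<K, V> expensive;
    private final Map<K, V> memos = new HashMap<>();

    Memoized(Function<K, V> expensive) {
        this.expensive = expensive;
    }

    @Override
    public V apply(K key) {
        // computeIfAbsent is fine here because the wrapped function does not
        // call back into this same map (unlike the recursive fib example).
        return memos.computeIfAbsent(key, expensive);
    }
}

class StemDemo {
    static int stemCalls = 0;

    // Toy stand-in for a real stemmer: lower-cases and strips one trailing "s".
    static String toyStem(String word) {
        stemCalls++;
        String lower = word.toLowerCase();
        return lower.endsWith("s") ? lower.substring(0, lower.length() - 1) : lower;
    }

    public static void main(String[] args) {
        Function<String, String> stem = new Memoized<String, String>(StemDemo::toyStem);
        for (String word : "Apples apples APPLE apples".split(" ")) {
            System.out.println(word + " -> " + stem.apply(word));
        }
        System.out.println("stemmer ran " + stemCalls + " times for 4 words");
    }
}
```

<p>Note that the memo is keyed on the raw input, so &#8220;Apples&#8221; and &#8220;apples&#8221; are stored as separate entries.  Only exact repeats of a word skip the stemmer.</p>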

<h2>Summary</h2>

<p>You can memoize any function that only depends on its parameters and
constant state (or near-constant state, as long as you don&#8217;t forget to discard your
memos when the state changes!).  If the function is invoked multiple
times with the same arguments, you will probably see a performance improvement.</p>

<p>If you need to memoize a function with multiple arguments, then you
just need to nest Maps, or create a unique key by combining the
parameters in some way.</p>
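<p>As a small illustration of the combined-key option, the sketch below packs two <code>int</code> arguments into a <code>List</code>, which already has value-based <code>equals</code> and <code>hashCode</code>.  The <code>GridPaths</code> class and its lattice-path function are hypothetical; any two-argument pure function would do.</p>

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical two-argument example: counts the monotone paths across a
// rows x cols grid, memoized on the (rows, cols) pair.
class GridPaths {
    // The key is the argument pair packed into a List; two Lists with the
    // same elements in the same order are equal and hash identically.
    private static final Map<List<Integer>, Long> memos = new HashMap<>();

    static long paths(int rows, int cols) {
        if (rows == 0 || cols == 0) {
            return 1L; // a single straight line
        }
        List<Integer> key = Arrays.asList(rows, cols);
        Long cached = memos.get(key);
        if (cached != null) {
            return cached;
        }
        long result = paths(rows - 1, cols) + paths(rows, cols - 1);
        memos.put(key, result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println("paths(16, 16) = " + paths(16, 16));
    }
}
```

<p>Order matters in the key: <code>(2, 3)</code> and <code>(3, 2)</code> are stored separately, which is what you want unless the function happens to be symmetric in its arguments.</p>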

<p>Memoization is an extremely easy way to improve performance under
certain circumstances, particularly if you have a solid grasp on when
state changes outside of your methods / functions, or program in a
functional style.  It can be memory intensive, however.  If the
results of your functions are large, or maintain references to large
objects, then memoization may <strong>penalize</strong> performance if you run out of
memory and have to make use of swap space.</p>
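<p>One way to keep the memory cost in check is to cap the number of stored memos.  The sketch below is only one possible approach, and the class name <code>BoundedMemos</code> is made up: it builds on <code>LinkedHashMap</code> in access order and overrides <code>removeEldestEntry</code> so that the least recently used entry is evicted once the cap is exceeded.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded memo store: holds at most maxEntries results and
// evicts the least recently used one when a new result would exceed the cap.
class BoundedMemos<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedMemos(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, oldest first
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true drops the
    // eldest (least recently accessed) entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        Map<String, Integer> memos = new BoundedMemos<String, Integer>(2);
        memos.put("a", 1);
        memos.put("b", 2);
        memos.get("a");    // touch "a" so that "b" becomes the eldest entry
        memos.put("c", 3); // exceeds the cap: "b" is evicted
        System.out.println(memos.keySet());
    }
}
```

<p>An evicted result simply gets recomputed the next time it is needed, so this trades some repeated work for a hard limit on memory use.</p>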
]]></content:encoded>
			<wfw:commentRss>http://blog.ciscavate.org/2008/01/day-to-day-memoization.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
