Infrequently Noted

Alex Russell on browsers, standards, and the process of progress.


As usual, Glen is ahead of everyone on this, but yesterday I got an overview of the Processing environment from the guys who wrote it, and it's so freaking HAWT. It runs on the JVM, but it fixes most of the annoying issues of "where's that jar file?" and "why do I need a class for the main loop?" by constraining the environment to a specific problem domain. And the results are both literally and figuratively beautiful. It feels to me like the kind of thing that the Chumby guys should have been using instead of Flash.


In the course of life, there are some moments where you are just so damned thankful to be alive that you almost feel guilty for being in your own shoes. Foo Camp was one of those times.

I'm still exhausted from the whole thing, and my brain is full. Entirely full. It's going to take some time to digest all the great stuff I learned, but a couple of things stood out such that I'm terrified of forgetting any of them. First, Avi Bryant's discussion about "pipes for the web" was a discussion of how we build the small, chainable pieces for the current and next generation of things that we're all hacking on. The discussion of feeds as a generic transport type between processing systems (i.e., the Unix pipe) was amazing. With that back-to-back with Tom Coates' talk on "Dirty Semantics", I got the feeling that we're finally organizing an answer to all the things that have bugged me about the semantic web vision of the future. By acknowledging that the web is dirty, and that it's OK, Tom presented a vision of the kinds of apps that I work on that doesn't have an undercurrent of academic condescension about how you should be doing things. It buckets things into "better because the market will say so" and "worse, because the market will ignore it", and those are the kinds of quality metrics I can get behind.

I also got to meet Ed Loper, the guy who did Epydoc, and we got into a discussion about how computational linguistics and machine translation people can fix the problem of having artificial test sets that cause algorithm mutation towards solutions that might not actually be desirable in the real world. What if, instead of some test suite that has a non-human testing the various algorithms, the system were a front-end to Babel Fish that would allow researchers to submit a web-services call into a queue of potential translators? The system would shunt off some percentage of the overall traffic to each registered system and collect "good" or "bad" rankings (the UI is tricky here) for various translations. By using the scope of the system to test quality and then to perhaps create a leader-board so research teams can compete, it would allow translation research teams to both provide results to sponsors that are trustable enough to fund ongoing work and, eventually, to provide data to support adoption of the resulting systems either through Open Source or commercialization.
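To make the traffic-splitting idea a bit more concrete, here's a rough Python sketch of how the routing and vote-tallying part might look. Everything in it is made up for illustration: the `TranslationArena` class, its method names, and the per-system weights are my assumptions, not anything Ed and I actually built, and the "translators" are just stand-in functions.

```python
import random
from collections import defaultdict


class TranslationArena:
    """Hypothetical router: sends each request to one registered
    translation system and tallies good/bad votes per system."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._systems = {}  # name -> (translate callable, traffic weight)
        self._votes = defaultdict(lambda: {"good": 0, "bad": 0})

    def register(self, name, translate, weight=1.0):
        """Add a research system; weight sets its share of the traffic."""
        if weight <= 0:
            raise ValueError("weight must be positive")
        self._systems[name] = (translate, weight)

    def translate(self, text):
        """Pick a system in proportion to its weight; return (name, output)."""
        if not self._systems:
            raise RuntimeError("no systems registered")
        names = list(self._systems)
        weights = [self._systems[n][1] for n in names]
        name = self._rng.choices(names, weights=weights, k=1)[0]
        return name, self._systems[name][0](text)

    def vote(self, name, good):
        """Record one user's good/bad ranking for a system's translation."""
        if name not in self._systems:
            raise KeyError(name)
        self._votes[name]["good" if good else "bad"] += 1

    def leaderboard(self):
        """Systems with at least one vote, best share of 'good' votes first."""
        rows = []
        for name, v in self._votes.items():
            total = v["good"] + v["bad"]
            rows.append((name, v["good"] / total, total))
        return sorted(rows, key=lambda row: row[1], reverse=True)


# Stand-in "translators"; a real system would call out to each team's service.
arena = TranslationArena(seed=0)
arena.register("team_a", str.upper, weight=3)
arena.register("team_b", str.lower, weight=1)
chosen, output = arena.translate("Hello")
arena.vote(chosen, good=True)
print(arena.leaderboard())
```

The weighted random pick is only one way to "shunt off some percentage" of requests; a real deployment would also need to worry about vote spam and about systems that have collected too few rankings to compare fairly.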

Thanks to Foo, I've got a hundred other things rattling around my head right now, and the worst bit of it is that there were so many people that I wanted to meet and things I wanted to see but couldn't. I've never experienced that depth and breadth of experience in one place before. Yesterday morning, I woke up at 9:30 after having gone to sleep at about 4:30 and I was kicking myself for having not been up at 7 because I could have been talking to people instead of sleeping.

It was that awesome.

CRM 114 on OS X

A quick note to my future self on getting CRM 114 to build and install on OS X.

First, download the latest tarball to a suitable location (/tmp will do). Explode the tarball and cd into the TRE library directory inside of it, currently tre-0.7.4. Next, run:

./configure --enable-static && make && sudo make install

Once TRE is installed, run man agrep and marvel at the wonder that is agrep. Holy crap is that cool.

Next, edit the main CRM 114 Makefile. Comment out the line that reads:

LDFLAGS += -static

On OS X, dynamic library lookup is preferred and I wasn't able to get static linking working anyway. Next, uncomment these lines:

CFLAGS += -I/usr/local/include
LDFLAGS += -L/usr/local/lib

But make sure that this line is still commented out:

#LIBS += -lintl -liconv

Otherwise you'll be on a wild goose chase to find a package that includes a dynamic library for GNU gettext. Luckily, the PHP packages have such a beast, but to avoid more build path mucking than is absolutely necessary, just make sure that -lintl isn't in your GCC calls.

The last change is to modify the line that reads:

-lm -ltre -o crm114_tre

to omit the "-lm" flag. It should then read simply:

-ltre -o crm114_tre

At this point, it's safe to build with:

make clean && make && sudo make install
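For my future self, the Makefile edits above could be scripted roughly like this. It's a sketch in Python, not something I've run against every CRM 114 release: the `patch_makefile` function is hypothetical, and the patterns assume the Makefile lines look exactly like the ones quoted in this post (spacing included), so check the result by eye before building.

```python
import re


def patch_makefile(text):
    """Apply the Makefile edits described in this post to Makefile text.

    Hypothetical helper: assumes the relevant lines match the snippets
    quoted above. Lines that don't match are passed through untouched.
    """
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        if re.fullmatch(r"LDFLAGS\s*\+=\s*-static", stripped):
            # 1. Comment out static linking.
            line = "#" + line
        elif re.fullmatch(
            r"#\s*(CFLAGS\s*\+=\s*-I/usr/local/include"
            r"|LDFLAGS\s*\+=\s*-L/usr/local/lib)",
            stripped,
        ):
            # 2. Uncomment the /usr/local include and library paths.
            line = stripped.lstrip("#").lstrip()
        elif re.fullmatch(r"LIBS\s*\+=.*-lintl.*", stripped):
            # 3. Make sure the gettext/iconv libraries stay commented out.
            line = "#" + line
        elif "-ltre -o crm114_tre" in line:
            # 4. Drop -lm from the crm114_tre link line.
            line = re.sub(r"-lm\s+(?=-ltre -o crm114_tre)", "", line)
        out.append(line)
    return "\n".join(out) + "\n"


# Tiny in-memory demo; the sample lines are stand-ins, not the real Makefile.
sample = (
    "LDFLAGS += -static\n"
    "#CFLAGS += -I/usr/local/include\n"
    "#LDFLAGS += -L/usr/local/lib\n"
    "\t$(CC) $(LDFLAGS) crm114.o -lm -ltre -o crm114_tre\n"
)
print(patch_makefile(sample))
```

To use it for real, read the Makefile into a string, pass it through `patch_makefile`, and write the result back (after keeping a copy of the original).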


Update: A couple of final snags (aside from the various setup bits and bobs that aren't automated). In order to actually process my spam/ham folders, it was necessary to patch crm114_config.h and rebuild. The substitution was:

//   default size of the data window: 8 megabytes.
// #define DEFAULT_DATA_WINDOW  8388608
#define DEFAULT_DATA_WINDOW 16777216

which doubles the processing window for messages, from 8 to 16 megabytes. Also, it was necessary, as per the comment in the file, to split up the first line of mailreaver.crm into 2 lines, like this:

#    -( spam good cache dontstore stats_only outbound undo verbose maxprio minprio delprio)

Reason #39 That Nothing Gets Written For Phones (Yet)

Moore's Law can imply a lot of different things depending on what you place as your primary priority. If it's raw compute power with no concern for power use or heat dissipation, you get huge boosts every couple of years. If you're after more compute on the same power budget, just wait a while. Predicting what we'll be using in a couple of years is a related-rates problem where the backstop conditions are "any trend that can't continue won't" and whatever market forces drive the non-CPU constraints. Datacenters hit a wall in terms of power use and heat dissipation, and the result is that Intel, AMD, and Sun now are plastering the Financial District with competing billboards touting their FLOPS-per-watt in lieu of GHz ratings.

I suspect that we're in for some similar reckonings in the mobile device world, but the constraints are very different from what the PC world has been up against. For one, the overall power density in phone/PDA batteries is increasing, but only at somewhere between 9 and 18 percent a year depending on who you read. Competing for this meager power budget are:

So the assumption that we'll get more raw CPU power for PIM-style apps and web browsing in next year's model is related as much to the design goals of the industry as it is to any particular technology trend. But the big barrier for the use of the mobile web is, and has been, the UI. And I don't just mean the layout of mobile browsers or the way pages get formatted. No, I mean the way people actually punch alphanumeric characters into these things. The available options all revolve around form factor: clamshell == numberpad + 3tap, PDA-ish == thumb-board, widescreen == on-screen keyboard. All of these options put at least one set of characters at a tremendous speed disadvantage. The biggest common (fast) input element is some sort of directional input system with a center-click button. It seems like a minor quibble in the pantheon of problems with getting the web onto phones in a meaningful way, but I think it creates a real problem for the "flatness" of the internet. In the same way that there's huge pressure to get a short domain name for startups, I can easily envision a scenario where there's a bidding war to get in the top page of links on the carrier's (non-resettable?) mobile browser homepage. And carriers don't leave money on the table. They hate open markets almost as much as they hate you.

So where does that put us? If there's good news, it's that eventually carriers will learn to sell bandwidth at a reasonable price ('cause that's how they'll keep customers) and beef up their minimum-spec processor and browser requirements, but I don't think it'll happen for a couple of years. First, the UI issues are going to require something like ubiquitous thumbpads or something more drastic (chording?) to turn these devices into anything but low-end digital cameras with super-slow upload facilities. Secondly, screen density improvements and the attendant bandwidth chewing properties of bitmap graphics are likely to keep CPU/SOC spare power in check until most phones sprout dedicated graphics co-processors (and if they already have, then they're just competing for power). Once screen density, contrast, color depth, and touch sensitivity start to settle down, we'll probably be left with a sudden improvement in processing budget, but it's unclear how long that'll get soaked up by piggier OSes before trickling down to applications.

The last hurdle for making the mobile web a reality is bandwidth and latency. Luckily, bandwidth will probably get figured out. Latency, OTOH, won't. We'll just have to learn to live with that one. Handset turnover rates combined with improvements in carrier networks give me some hope that by the time we have realistic browsers on phones we'll be sucking down content over relatively wide pipes. Of course, when the carriers will have their "a ha!" moment around fixed-price mobile data is the big unknown. There's more money to be made if they give up a little control, but that goes against everything they know and believe.

Guess I should wrap this up, but next time, I'll detail my experience with getting smartphone emulators installed, the SNAFU that started me down this entire train of thought in the first place.

Something's Always Wrong

Like a sled dog who would gladly run himself to death, I seem not to be able to kick the habit of over-working, especially not on things that I find interesting. If my interests and efforts were mostly aligned at Jot, they're now identical vectors at SitePen. My full-time job is now what I once did as a hobby, and my hobbies are all the interesting little technical pursuits I've put off in the last couple of years. Add a large dose of Lutheran guilt about doing the right things the right way and a seeming inability to say "no" to people, and sleep seems like a particularly quaint anachronism.

Were it not for Jennifer, the woman I love madly, I think I'd be clinically insane by now. I'm good at last-minute panic. Things like "did we turn off the stove?" and "maybe I should double-check when that flight is leaving...". Jennifer, on the other hand, is the organizer. Instead of leaving important details for the last minute, she ensures that big things happen at all. Months in advance, she'll be asking "so [insert name of band] is coming, do you want to see them?", and thanks to that (and a little "where did we put the tickets?" from me) we've been able to see Buddy Guy, BB King, and Toad The Wet Sprocket this summer.

Toad was a special treat since they've been broken up for years. Despite it carbon-dating my high school years, I've got something of a soft spot in my iTunes playlist for old Toad stuff. Even new "Ok Go" doesn't make me as happy cycling through my playlist during these endless hours at the terminal. While their set the other night was pretty heavily scripted, they still sounded as good as any of the studio recordings. Thanks to Jennifer, I got out of the house for some celebration after turning my brain to mush in preparation for a talk at LinuxWorld and I finally got to see a band I've been listening to heavily for nearly a decade.

No matter what you may think of the music, it's hard to imagine a better gift.
