Pipelines in Erlang

Showing again why Erlang's super-lightweight concurrency rocks, David King posted code to simplify writing pipelined code in Erlang.

You already think in terms of pipelines - how about "gzcat foo.tar.gz | tar xf -"? You may not have known it, but the shell is running the unzip and untar in parallel - the stdin read in tar just blocks until data is sent to stdout by gzcat.
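
To make that blocking behaviour concrete, here is a rough C sketch of what the shell sets up for a two-stage pipeline, using the POSIX pipe(), fork(), read() and write() calls. The pipe_demo function and its one-second delay are made up for illustration; a real shell would also exec the two programs and wire the pipe to their stdout and stdin.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the post): fork a "producer" child that
 * writes msg into a pipe after a delay, while the parent plays the
 * "consumer" and reads from the other end. Returns the number of bytes
 * the consumer received, or -1 on error. */
ssize_t pipe_demo(const char *msg, char *out, size_t out_size)
{
    int fds[2];                       /* fds[0] = read end, fds[1] = write end */
    if (out_size == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                   /* child: the producer stage */
        close(fds[0]);
        sleep(1);                     /* pretend to do a second of work first */
        ssize_t written = write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(written < 0 ? 1 : 0);
    }

    /* parent: the consumer stage */
    close(fds[1]);                    /* so read() sees EOF once the child is done */
    size_t total = 0;
    ssize_t n;
    /* read() blocks here until the producer writes something or closes its end */
    while (total < out_size - 1 &&
           (n = read(fds[0], out + total, out_size - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);            /* reap the child */
    return (ssize_t)total;
}
```

Calling pipe_demo("hello", buf, sizeof buf) fills buf with "hello" about a second later: the parent sits in read() the whole time the child is "working", just like tar waiting on gzcat.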

Well, a lot of tasks can be expressed in terms of pipelines, and if you can do that, then getting some level of parallelisation is simple with David's helper code (even across Erlang nodes, i.e. machines):

              {filter,fun some_filter/1},
              {generic,fun some_complex_function/2},
              fun some_more_complicated_function/1,
              fun pipeline:collect/1]).
So basically what he's doing here is making a list of the steps - each step being implemented in a fun that accepts as input whatever the previous step outputs (the funs can even be defined inline of course). Go check out David's blog entry for the code and more detailed explanation.

Update: Changed link to Wayback machine link since the original url went away.

05:30 AM, 27 Jan 2008 by Mark Aufflick Permalink | Comments (0)


We can't solve problems by using the same kind of thinking we used when we created them.
-- Albert Einstein

Via swiss miss.

07:31 PM, 23 Jan 2008 by Mark Aufflick Permalink | Comments (0)

So small yet so big

MacBook Air

Well, now I don't need to wonder what I'm going to do with my bonus...

Finally a worthy replacement for my 12" PowerBook - but perhaps I should wait for the next model, when Apple can incorporate the recently released 128Gig solid state drives. 64Gig just isn't going to cut it, and a 4200rpm drive is somewhat unpalatable.

Still not as cool as a PowerBook Duo though.

08:52 PM, 15 Jan 2008 by Mark Aufflick Permalink | Comments (2)

Hosting outage

So much for "Between the hours of 7am and 10am you may experience some latency for a brief few minute period." for some "re-patching work" today...

Apparently the "AS/NZS7799 Information Security Management Systems certification" standard that the datacenter runs to somehow allows cables to be patched the wrong way and not tested for 3 hours :(

09:15 PM, 12 Jan 2008 by Mark Aufflick Permalink | Comments (0)

Optimising Perl with Inline::C

I was talking with someone today about a time I used Inline::C to massively speed up an inner loop in a Perl program. Thing is, in that case the real speedup wasn't down to any super smart C programming on my part, it was just making use of a very optimised vendor library that you could only access from C.

So I got to thinking - in normal everyday code, is there any real speed benefit to be had by writing your inner loops in C?

I found an old web page by Mitchell Charity discussing Inline and, interpreting his (slightly pathological) example a little, I got a surprising result:



cmpthese( 50,
             perl_method => sub {
                 my $object = new Foo;

                 for (my $i = 0; $i < 1_000_000; $i++) {

             all_in_one_c => sub {
                 my $object = new Foo;


sub new { my $self = " " x 1_000_000; return bless \$self, 'Foo' }

sub set_element {
  my ($self, $n, $value) = @_;
  substr($$self, $n, 1) = pack("C", $value);


#define USING(object) unsigned char *ptr = SvPVX(SvRV(object))

void set_all_with_c (SV* object) {
    for (i = 0; i < 1000000; i++) { SET_ELEMENT(i, 67); }
               s/iter  perl_method all_in_one_c
perl_method      4.16           --        -100%
all_in_one_c 8.60e-03       48321%           --
i.e. the C code was 48321% faster, or roughly 480 times as fast (4.16 s versus 8.60e-03 s per iteration).

But it's not really comparing apples with apples - the Perl code is doing a method call on each iteration, the C code is operating on the value directly. In addition, the C code (by way of Inline::C's magic) is basically copying the string into a temporary variable, operating directly on that, and copying back - so the dereferencing is not happening on each loop. We can make those changes in Perl too, and see how that compares.


sub set_all_with_perl {
    my $tmp_str = ${ $_[0] };
    # 0 .. 999_999 gives the same 1,000,000 iterations as the C loop
    substr($tmp_str, $_, 1) = pack("C", 67) for 0 .. 999_999;
    ${ $_[0] } = $tmp_str;
}

                  s/iter     perl_method all_in_one_perl    all_in_one_c
perl_method         4.21              --            -62%           -100%
all_in_one_perl     1.61            161%              --            -99%
all_in_one_c    8.60e-03          48821%          18626%              --
So eliminating the method dispatch and dereference in the loop made our Perl code much faster, but the C code is still way faster. Obviously it's a contrived example, and 1 million iterations is one heck of an inner loop, but I am still surprised by how much difference it made.

09:38 AM, 10 Jan 2008 by Mark Aufflick Permalink | Comments (0)

Compiling subversion for performance on Solaris

More for my future reference than anything else: if you want decent svn client performance on Solaris 8 (and probably other versions), you should compile Apache APR with the following arguments:

  --with-devrandom=/dev/urandom   Use non-blocking pseudo-random number device
  --enable-nonportable-atomics    Use optimized atomic code which may produce nonportable binaries

The nonportable-atomics option uses an atomic UltraSparc microcode instruction to replace an entire mutex algorithm (the same option works on modern Intel processors also).
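
To show the general idea (not APR's actual code), here is a small C sketch that puts a mutex-protected counter next to an atomic one. It uses C11 <stdatomic.h> rather than APR's own atomics, and all the names in it are invented for this example. On most modern targets the compiler can turn the atomic increment into a single instruction or a short compare-and-swap loop, where the mutex version needs a lock and an unlock around the update.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical counters for illustration; these names are not APR's. */
static long locked_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_long atomic_counter = 0;

/* Mutex version: take the lock, update, release. Returns the new value. */
long increment_with_mutex(void)
{
    pthread_mutex_lock(&counter_lock);
    long value = ++locked_counter;
    pthread_mutex_unlock(&counter_lock);
    return value;
}

/* Atomic version: one indivisible read-modify-write, no lock involved.
 * atomic_fetch_add returns the old value, so add 1 to get the new one. */
long increment_atomically(void)
{
    return atomic_fetch_add(&atomic_counter, 1) + 1;
}

/* Call the given increment function n times and return the last value seen.
 * This driver is single-threaded; both increment functions are also safe
 * to call from several threads at once. */
long count_to(long n, long (*increment)(void))
{
    long last = 0;
    for (long i = 0; i < n; i++)
        last = increment();
    return last;
}
```

Starting from fresh counters, count_to(1000, increment_with_mutex) and count_to(1000, increment_atomically) both end at 1000; the difference is only in how much work each increment does under the hood.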

Update: Although, according to the APR-0.9 change notes, the Solaris-specific atomic code was removed due to licensing concerns... Might try to track that down and reapply locally (it said it was licensed MPL 1.0, so no problem using it locally AFAICT).

So here it is in the 0.9.3 tag, but the build process has radically changed since then, so I'll need to figure out how to port the assembler source into inline gcc asm for the unified unix/apr_atomic.c in current versions.

I'll post when/if I get success!

Update 2: Here's a patch for ISA-independent Solaris 10 atomics, but that doesn't help me since I'm on Solaris 8.

And here's another patch, this time for the x86 specific atomics for Solaris x86, which also doesn't help me since I'm on UltraSparc :(

01:37 AM, 07 Jan 2008 by Mark Aufflick Permalink | Comments (0)

The circularity of IDEs and code proliferation

Use of text code* (e.g. Java, C++) IDEs encourages non-dynamic code. Steve Yegge puts it perfectly:

The second difficulty with the IDE perspective is that Java-style IDEs intrinsically create a circular problem. The circularity stems from the nature of programming languages: the "game piece" shapes are determined by the language's static type system. Java's game pieces don't permit code elimination because Java's static type system doesn't have any compression facilities - no macros, no lambdas, no declarative data structures, no templates, nothing that would permit the removal of the copy-and-paste duplication patterns that Java programmers think of as "inevitable boilerplate", but which are in fact easily factored out in dynamic languages.

Completing the circle, dynamic features make it more difficult for IDEs to work their static code-base-management magic. IDEs don't work as well with dynamic code features, so IDEs are responsible for encouraging the use of languages that require... IDEs. Ouch.

* I say text code IDEs specifically to exclude the Smalltalk environment where perfect automated refactoring is entirely possible with very dynamic code.

07:46 AM, 02 Jan 2008 by Mark Aufflick Permalink | Comments (0)

