Archive for the ‘Programming’ Category

Everybody hates Java. Jax 2012.

Wednesday, May 2nd, 2012

Here are the slides for my talk at Jax. I’ve not written anything up yet, so here’s a link to the interview the JAX team did with me before the conference: interview (German). The slides probably don’t make much sense without me clowning around in front of them, sorry.

Tim’s Guide to JVM Languages.

Thursday, December 1st, 2011

JRuby: for people who realize the Ruby interpreter isn’t that hot performance-wise or –more likely– whose sysadmins refuse to deploy anything but jar files. Vaguely useful in case you absolutely require a Java library for a script that involves opening a file, parsing XML, printing to the screen, adding two `byte`s or something similarly impossible to do in Java without becoming so enraged that you end up twisting some cute little animal’s head off.

Jython: see JRuby

Groovy: for hardcore Java nerds who don’t want to admit to themselves that Java isn’t the be-all and end-all of programming by resorting to JRuby or Jython. Because for unknown reasons, Groovy is somehow „more“ Java.

Scala: for people who feel Java isn’t special enough for them, because they’re very special. Yet they’re too limp dicked to use haskell or erlang. In all honesty, they would prefer to use ocaml, but the JVM handles cache line optimization in Intel’s upcoming Larrabee architecture better and they need to squeeze every last bit of performance out of their „boxen“. They also enjoy using important words like „contravariant“ that no one, including themselves, understands. This makes them feel even more special.

Fantom: see Scala, add: for those for whom Scala is too mainstream because Twitter and one other company allegedly used it at some point.

Clojure: see Scala, but switch „scheme or lisp“ in for „haskell or erlang“. Feels slightly less absurd than Scala to me.

JavaScript: oh-my-god just go ahead and scratch my eyes out, why in hell would anyone … oh yeah, it ships with the JVM (no joke). The „embedded language“ of choice in case you need to embed some language into your Java desktop software.

JavaFX: for sadomasochists with a serious Sun Microsystems fetish who have wet dreams of Duke™ (the little Java dude) gnawing their balls off. They also really hate Flash and want to stick it to Adobe. But they haven’t heard that Adobe discontinued it or that you can do „mouseover“ effects in HTML5, thus enabling Rich Internet Applications™ without Applets™.

All of the other JVM languages are either someone’s uni dissertation or total bullshit. Except for Frink which is pretty awesome but not really a general purpose programming language.

That said, Java is a really annoying language, but so are all other computer languages that don’t live in the JVM to some degree. It’s perfectly possible to write solid and useful code with it.

(this rant originally posted by me here. I updated it with a link to cute animals and frink)

Determining memory usage in process on OSX

Friday, January 30th, 2009

My strenuous journey started with a seemingly simple task: I wanted to obtain a rough and tumble estimate of the amount of memory instantiating a rather opaque data structure from a third party library would consume.

Naively, I started writing a loop that created a new instance sleeping and then … I poked around Single Unix for a system call that could help and ended up coming up with `getrusage`. OSX man pages stated it fills in this:

struct rusage {
  struct timeval ru_utime; /* user time used */
  struct timeval ru_stime; /* system time used */
  long ru_maxrss;          /* integral max resident set size */
  long ru_ixrss;           /* integral shared text memory size */
  long ru_idrss;           /* integral unshared data size */
  long ru_isrss;           /* integral unshared stack size */
  long ru_minflt;          /* page reclaims */
  long ru_majflt;          /* page faults */
  long ru_nswap;           /* swaps */
  long ru_inblock;         /* block input operations */
  long ru_oublock;         /* block output operations */
  long ru_msgsnd;          /* messages sent */
  long ru_msgrcv;          /* messages received */
  long ru_nsignals;        /* signals received */
  long ru_nvcsw;           /* voluntary context switches */
  long ru_nivcsw;          /* involuntary context switches */
};

Unfortunately, all the fields apart from user and system time remain zeroed. `grepping` through the xnu (Darwin kernel, apparently the only available information about what does and doesn’t work under OSX) sources, I ended up finding this comment in the declaration of `struct rusage` which finally convinced me that I’m not too stupid to make a simple call:

struct    rusage {
    struct timeval ru_utime;    /* user time used (PL) */
    struct timeval ru_stime;    /* system time used (PL) */
#if defined(_POSIX_C_SOURCE) && !defined(_DARWIN_C_SOURCE)
    long    ru_opaque[14];        /* implementation defined */
#else    /* (!_POSIX_C_SOURCE || _DARWIN_C_SOURCE) */
    /*
     * Informational aliases for source compatibility with programs
     * that need more information than that provided by standards,
     * and which do not mind being OS-dependent.
     */
    long    ru_maxrss;        /* max resident set size (PL) */
#define    ru_first    ru_ixrss    /* internal: ruadd() range start */
    long    ru_ixrss;        /* integral shared memory size (NU) */
    (...)
};

At least the good folks over at Sun are kind enough to mention in their man pages that all the fields are dummies.

Then I found Michael Knight’s (not that one) blog, which used the underlying mach function `task_info`. Unfortunately, Apple doesn’t document the mach API at all and the sole reference they supply points directly to nowhere.

Well, if `ps` can determine memory usage, surely it should be able to tell me how. Finding the source of OSX `ps` was another story. Hint: it’s not located in the `basic_cmds`, `misc_cmds`, `shell_cmds`, or `system_cmds` package. It’s in the `adv_cmds` (advice?, advanced?, adventure?).

`ps` ended up using a bunch of equally undocumented (non-mach) kernel functions. At this point I remembered that `macfuse` contains a `procfs` for OSX. To me, using `proc` seems to be the obvious way to get memory usage under Linux, so I dug through that and saw macfuse uses `task_info` as well.

I finally found documentation for the mach API within the xnu sources under the `osfmk/man` directory or online here and was able to write a simplified version of Michael’s original.

Voila:

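#include <mach/mach.h>   /* mach_task_self(), task_info(), TASK_BASIC_INFO */

/* Stores the resident and virtual size (in bytes) of the calling process.
 * Returns 0 on success, -1 on failure. Sizes that don't fit into an
 * unsigned int are truncated. */
int getmem (unsigned int *rss, unsigned int *vs)
{
    struct task_basic_info t_info;
    mach_msg_type_number_t t_info_count = TASK_BASIC_INFO_COUNT;

    if (KERN_SUCCESS != task_info(mach_task_self(),
       TASK_BASIC_INFO, (task_info_t)&t_info, &t_info_count))
    {
        return -1;
    }
    *rss = t_info.resident_size;
    *vs  = t_info.virtual_size;
    return 0;
}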

In case anyone knows of an even remotely portable way to obtain similar information, please let me know.

No loopback interface on Windows (XP)

Wednesday, October 22nd, 2008

Learned today that Windows doesn’t support a loopback interface for localhost. As a consequence, network packets destined for the local machine are never passed to any interface and therefore can’t be captured by a packet sniffer. Unfortunately, looking at the network traffic is my preferred way of diagnosing network problems, so this behavior gets in the way. An easy workaround is to route packets addressed to the local machine via the standard gateway instead. The gateway then sends the packets back to the local machine. This is a bit of a detour, but at least the traffic shows up. It also dramatically changes how the packets are being moved around, so it might not help… But just in case:

  1. grab your IP address and Gateway using ipconfig in a DOS box.
  2. route add $LOCAL_IP mask 255.255.255.255 $GATEWAY_IP metric 1
  3. when you’re done, use route delete $LOCAL_IP to get things back to normal

Visualizing Ant Redux.

Tuesday, October 14th, 2008

I’ve written a small update to my ant-file visualization tool. The only visible change is that the default task is now marked in the output.

You can either download the jar containing everything you need, or build it yourself from the source available via:

svn co http://a2800276.googlecode.com/svn/branches/antvis

If Antvis is run from the command line like so:

$ java -jar antvis.jar
usage: [jre] antvis.AntVis -f inputFile [-t format] [-o outfile]
	format: format supported by dot. Default: `dot`
	outfile: Default stdout
call [jre] antvis.AntVis -l for a list of supported formats

It prints out the available options. If it’s called correctly:

$ java -jar antvis.jar -f build.xml -t png -o self.png

It will produce graphical representations of the provided build.xml file like this one

self.png

for Antvis’s own build.xml or this one

jpos_ant.png

The above is an example of a more complicated build.xml script; it ships with jpos.

Backslashes in C includes…

Saturday, September 27th, 2008

Who’d have thought:

  1. that DOS backslashes in C include paths aren’t only ugly and a pain, but also not legal* C:

    If the characters ', \, ", //, or /* occur in the sequence between the < and > delimiters, the behavior is undefined. Similarly, if the characters ', \, //, or /* occur in the sequence between the " delimiters, the behavior is undefined. A header name preprocessing token is recognized only within a #include preprocessing directive.

    (C99 6.4.7, paragraph 3)

  2. … the C99 Standard is available for free online. This links directly to the pdf containing the current standard, which lives here.
  3. It’s easy to fix:

    find . -name '*.[ch]' -print0 | xargs -0 \
       ruby -i.bak -pe '$_.gsub!(/\\/, "/") if $_ =~ /^\s*#\s*include/'
    
  * yeah, I know, it’s legal, just undefined.
  ** this post was inspired by this.
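
For anyone who would rather not trust a one-liner with their source tree, the same rewrite can be sketched as a plain Ruby function and tried on a string first. This is only a sketch: `fix_include_slashes` is a name made up for this example, and it assumes each `#include` directive sits on a single line.

```ruby
# Hypothetical helper for illustration: replaces backslashes with forward
# slashes, but only on lines that are #include directives.
def fix_include_slashes(source)
  source.each_line.map do |line|
    line =~ /^\s*#\s*include\b/ ? line.gsub("\\", "/") : line
  end.join
end

# The quoted heredoc keeps the backslashes exactly as written.
sample = <<'SRC'
#include "sys\types.h"
const char *dir = "C:\\temp";
SRC

puts fix_include_slashes(sample)
```

Only the `#include` line changes; backslashes in ordinary string literals are left alone.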

„Ruby Cryptography: TINFM“

Tuesday, July 22nd, 2008

TINFM meaning „there is no fine manual“, of course. Just as with digest, Ruby’s openssl documentation is missing just the bits you’ll need to get started. In order to encrypt or decrypt something, you’ll first need to instantiate the appropriate cipher:

	require 'openssl'
	# older versions of the openssl library spelled this OpenSSL::Cipher::Cipher.new
	cipher = OpenSSL::Cipher.new NAME_OF_CIPHER

So how do I know the name of the cipher if it’s not documented? You’ll need to refer to the OpenSSL documentation, or refer to this handy list:

base64 Base 64
bf-cbc Blowfish in CBC mode
bf Alias for bf-cbc
bf-cfb Blowfish in CFB mode
bf-ecb Blowfish in ECB mode
bf-ofb Blowfish in OFB mode
cast-cbc CAST in CBC mode
cast Alias for cast-cbc
cast5-cbc CAST5 in CBC mode
cast5-cfb CAST5 in CFB mode
cast5-ecb CAST5 in ECB mode
cast5-ofb CAST5 in OFB mode
des-cbc DES in CBC mode
des Alias for des-cbc
des-cfb DES in CFB mode
des-ofb DES in OFB mode
des-ecb DES in ECB mode
des-ede-cbc Two key triple DES EDE in CBC mode
des-ede Two key triple DES EDE in ECB mode
des-ede-cfb Two key triple DES EDE in CFB mode
des-ede-ofb Two key triple DES EDE in OFB mode
des-ede3-cbc Three key triple DES EDE in CBC mode
des-ede3 Three key triple DES EDE in ECB mode
des3 Alias for des-ede3-cbc
des-ede3-cfb Three key triple DES EDE in CFB mode
des-ede3-ofb Three key triple DES EDE in OFB mode
desx DESX algorithm.
idea-cbc IDEA algorithm in CBC mode
idea Alias for idea-cbc
idea-cfb IDEA in CFB mode
idea-ecb IDEA in ECB mode
idea-ofb IDEA in OFB mode
rc2-cbc 128 bit RC2 in CBC mode
rc2 Alias for rc2-cbc
rc2-cfb 128 bit RC2 in CFB mode
rc2-ecb 128 bit RC2 in ECB mode
rc2-ofb 128 bit RC2 in OFB mode
rc2-64-cbc 64 bit RC2 in CBC mode
rc2-40-cbc 40 bit RC2 in CBC mode
rc4 128 bit RC4
rc4-64 64 bit RC4
rc4-40 40 bit RC4
rc5-cbc RC5 cipher in CBC mode
rc5 Alias for rc5-cbc
rc5-cfb RC5 cipher in CFB mode
rc5-ecb RC5 cipher in ECB mode
rc5-ofb RC5 cipher in OFB mode
aes-[128|192|256]-cbc 128/192/256 bit AES in CBC mode
aes-[128|192|256] Alias for aes-[128|192|256]-cbc
aes-[128|192|256]-cfb 128/192/256 bit AES in 128 bit CFB mode
aes-[128|192|256]-cfb1 128/192/256 bit AES in 1 bit CFB mode
aes-[128|192|256]-cfb8 128/192/256 bit AES in 8 bit CFB mode
aes-[128|192|256]-ecb 128/192/256 bit AES in ECB mode
aes-[128|192|256]-ofb 128/192/256 bit AES in OFB mode

A list of the currently supported cipher strings, without the explanations, can also be produced by calling OpenSSL::Cipher.ciphers.
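
As a quick illustration, here is a small sketch that narrows that list down to the AES variants. The filter and the variable name `aes` are my own choices for the example; which names actually show up depends on the OpenSSL build Ruby is linked against.

```ruby
require "openssl"

# Cipher names may be reported in upper or lower case depending on the
# OpenSSL build, so normalize them before filtering.
aes = OpenSSL::Cipher.ciphers.map(&:downcase).uniq.grep(/\Aaes-/).sort
puts aes
```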

After you’ve instantiated the proper cipher, you tell it to either encrypt or decrypt, give it the key to use (and possibly an IV) and then pass in data using update:

	cipher.encrypt
	cipher.key = KEY_DATA
	# final flushes the last (padded) block
	ciphertext = cipher.update(plaintext) + cipher.final
	
	cipher.reset
	cipher.decrypt
	cipher.key = KEY_DATA
	plaintext = cipher.update(ciphertext) + cipher.final

That’s all. Not really difficult, once you’ve pieced everything together.
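
Pieced together, a complete round trip might look like the sketch below. The choice of `aes-256-cbc`, the random key and IV, and the sample message are assumptions made for the example; a real application has to store or exchange the key and IV somehow.

```ruby
require "openssl"

# Assumed for illustration: AES-256 in CBC mode.
cipher = OpenSSL::Cipher.new("aes-256-cbc")
cipher.encrypt                 # pick the direction before setting key and IV
key = cipher.random_key        # generates a key of the right length and sets it
iv  = cipher.random_iv         # same for the initialization vector

message    = "attack at dawn"
ciphertext = cipher.update(message) + cipher.final

# Decrypt with a fresh cipher object, the same key and the same IV.
decipher = OpenSSL::Cipher.new("aes-256-cbc")
decipher.decrypt
decipher.key = key
decipher.iv  = iv
decrypted = decipher.update(ciphertext) + decipher.final

puts decrypted == message
```

Calling `final` matters for block ciphers: `update` only returns complete blocks, and the padded remainder comes out of `final`.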

Bit-twiddling with Ruby

Tuesday, May 6th, 2008

I’ve always wanted to write some routines that help out with bit twiddling. Since I’ve been working on some byte-level stuff recently (Smartcards, ISO7816 to be precise), I’ve finally gotten around to writing an API to make handling bytes easier and self-documenting. Basically, it’s a –attention buzzword– DSL for bitfield description. I haven’t gotten very far yet, but this is how it looks up to now: if you’ve got a byte composed of bits with the following semantics:

   |8|7|6|5|4|3|2|1|Desc
   ==================================
   |1|-|-|-|-|-|-|-|Channel Encrypted
   |-|0|0|0|-|-|-|-|Method A
   |-|1|0|0|-|-|-|-|Method B
   |-|0|0|1|-|-|-|-|Method C
   |-|-|-|-|-|X|X|X|Channel Number

I can use the following ruby code to represent it:

   require "bytes"
   b = Bytes::Byte.new "1......." => :enc,
                       ".000...." => :a,
                       ".100...." => :b,
                       ".001...." => :c,
                       ".....vvv" => :channel
   b.value = 0xff
   b.enc?        # true
   b.b?          # false
   b.b           # `b.value` is now 0xCF / "11001111"
   b.b?          # true
   b.channel     # 7
   b.channel = 0 # `b.value` is now 0xC8 / "11001000"
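
To make the pattern strings a bit more concrete, here is a minimal sketch of how a pattern of `0`, `1` and `.` could be turned into a mask and a value. This is not the code of the Bytes library; `pattern_to_mask`, `matches?` and `apply` are hypothetical names made up for this example, and value fields like `vvv` are left out.

```ruby
# Hypothetical helper: converts a pattern such as ".100...." into
# [mask, value]. The mask marks the positions the pattern cares about,
# the value holds the bits required at those positions.
def pattern_to_mask(pattern)
  bits = pattern.delete(" ")
  raise ArgumentError, "expected 8 positions of 0, 1 or ." unless bits =~ /\A[01.]{8}\z/
  mask  = bits.tr("01.", "110").to_i(2)
  value = bits.tr(".", "0").to_i(2)
  [mask, value]
end

# True if the byte has exactly the bits the pattern asks for.
def matches?(byte, pattern)
  mask, value = pattern_to_mask(pattern)
  (byte & mask) == value
end

# Returns the byte with the pattern's bits forced to the requested values.
def apply(byte, pattern)
  mask, value = pattern_to_mask(pattern)
  (byte & ~mask & 0xff) | value
end

byte = 0xff
puts matches?(byte, "1.......")   # bit 8 is set
puts matches?(byte, ".100....")   # bits 7 to 5 are 111, not 100
byte = apply(byte, ".100....")
puts format("0x%02X", byte)
```

Starting from 0xff and applying `.100....` gives 0xCF, the same value as in the `b.b` step above.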

Instead of using the Byte class and instantiating it with the byte’s pattern, it’s also possible to include the module Bytes, which adds an attr-like class method (called byte_accessor) that gives classes the same sort of functionality. Take this –vaguely contrived– implementation of the first two bytes of an IP packet:

require "bytes_ng"
class IPPacket
  include Bytes
  byte_accessor :ver_ihl, "vvvv ...." => :version,
                          ".... vvvv" => :ihl

  byte_accessor :tos, "111. .... | Precedence" => :network_control,
                      "110. ...."              => :inet_control,
                      "101. ...."              => :critic_epc,
                      "100. ...."              => :flash_override,
                      "011. ...."              => :flash,
                      "010. ...."              => :immediate,
                      "001. ...."              => :priority,
                      "000. ...."              => :routine,
                      "...0 .... | Delay"      => :normal_delay,
                      "...1 ...."              => :low_delay,
                      ".... 0... | Throughput" => :normal_throughput,
                      ".... 1..."              => :high_throughput,
                      ".... .0.. | Reliability"=> :low_reliability,
                      ".... .1.."              => :high_reliability,
                      ".... ..1. | RFU"        => :rfu_err_1,
                      ".... ...1 | RFU"        => :rfu_err_2
end

This adds two instance variables (and their respective accessors) named ver_ihl and tos to the class IPPacket. These contain the actual byte value. It also adds a bunch of methods (like in the example above) that can be used to query and set the individual bits.

I’ve not gotten around to properly releasing it yet, but it works quite well so far. In case you’re interested, you can currently get it here.

Future plans are to package it and (maybe) add multi-byte functionality.

Visualizing Ant Scripts.

Tuesday, April 8th, 2008

XML is generally not only tedious to write, but also hideous to look at, yet sometimes you gotta bite the bullet and use an ant build script.

I’ve written a little tool that renders a dependency graph of all the tasks in an ant build.xml file. The result looks like this:

ant_deps

The above was generated from this xml file which is too long and ugly to include here.

In case you’d also like to generate nifty little pictures like the above, to beef up skimpy documentation, for example, you can download the tool here. Just call:

java -jar antvis.jar

And all the rest should be self-explanatory. You’ll need to have a copy of Graphviz installed in order to render the pictures. In case you are interested in the source, you can grab a copy using subversion here:

svn co http://a2800276.googlecode.com/svn/branches/antvis
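
The idea behind the graph is simple enough to sketch in a few lines of Ruby. Note that this is not how antvis itself is implemented (antvis is a Java tool); `ant_to_dot` and the sample build file are made up for this example, and the sketch only looks at `target` elements and their `depends` attribute.

```ruby
require "rexml/document"

# Hypothetical sketch: emits a Graphviz "dot" graph with one node per ant
# target and one edge per entry in the target's depends attribute.
def ant_to_dot(xml)
  doc = REXML::Document.new(xml)
  lines = ["digraph ant {"]
  doc.elements.each("project/target") do |target|
    name = target.attributes["name"]
    lines << %(  "#{name}";)
    deps = (target.attributes["depends"] || "").split(",").map(&:strip)
    deps.each { |dep| lines << %(  "#{name}" -> "#{dep}";) }
  end
  lines << "}"
  lines.join("\n")
end

# Made-up build file for illustration.
build_xml = <<~XML
  <project name="demo" default="jar">
    <target name="init"/>
    <target name="compile" depends="init"/>
    <target name="jar" depends="compile, init"/>
  </project>
XML

puts ant_to_dot(build_xml)
```

The output can be piped into Graphviz, for example `dot -Tpng`, to render a picture.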

EURUKO 2008: Native Extensions

Sunday, March 30th, 2008

fish.png

You were expecting cats?

Here are the code examples in case you want to try them out yourself.

Here is a copy of the slides. Unfortunately, they’re quite huge at the moment; I’ll try to get a smaller version up as soon as I figure out Keynote…

Suggestions from the audience

  • Ruby Hacking Guide : more detailed information about Ruby internals.
  • RubyInline : allows you to write C code in the middle of Ruby code
  • Somebody said the videos of the talks would be put up here
  • Dr. Nic was kind enough to write a generator to handle native extensions for his newgem tool. newgem is definitely worth trying out, it generates a LOT of boiler plate, which is sort of overwhelming. But it’s more fun to figure out what his tool does than to type boilerplate code.