Archive for the ‘Programming’ Category
Everybody hates Java. Jax 2012.
Wednesday, May 2nd, 2012
Here are the slides for my talk at Jax. I’ve not written anything up yet, so here’s a link to the interview the JAX team did with me before the conference: interview (German). The slides probably don’t make much sense without me clowning around in front of them, sorry.
Tim’s Guide to JVM Languages.
Thursday, December 1st, 2011
JRuby: for people who realize the Ruby interpreter isn’t that hot performancewise or –more likely– whose sysadmins refuse to deploy anything but jar files. Vaguely useful in case you absolutely require a Java library for a script that involves opening a file, parsing XML, printing to the screen, adding two `byte`s or something similarly impossible to do in Java without becoming so enraged that you end up twisting some cute little animal’s head off.
Jython: see JRuby
Groovy: for hardcore Java nerds who don’t want to admit to themselves that Java isn’t the be-all end-all of programming by resorting to JRuby or Jython. Because for unknown reasons, Groovy is somehow „more“ Java.
Scala: for people who feel Java isn’t special enough for them, because they’re very special. Yet they’re too limp dicked to use Haskell or Erlang. In all honesty, they would prefer to use OCaml, but the JVM handles cache line optimization in Intel’s upcoming Larrabee architecture better and they need to squeeze every last bit of performance out of their „boxen“. They also enjoy using important words like „contravariant“ that no one, including themselves, understands. This makes them feel even more special.
Fantom: see Scala, add: for those for whom Scala is too mainstream because Twitter and one other company allegedly used it at some point.
Clojure: see Scala, but switch „Scheme or Lisp“ in for „Haskell or Erlang“. Feels slightly less absurd than Scala to me.
JavaScript: oh-my-god just go ahead and scratch my eyes out, why in hell would anyone … oh yeah, it ships with the JVM (no joke). The „embedded language“ of choice in case you need to embed some language into your Java desktop software.
JavaFX: sadomasochists with a serious Sun Microsystems fetish who have wet dreams of Duke™ (the little Java dude) gnawing their balls off. They also really hate Flash and want to stick it to Adobe. But they haven’t heard that Adobe discontinued it or that you can do „mouseover“ effects in HTML5 thus enabling Rich Internet Applications™ without Applets™.
All of the other JVM languages are either someone’s uni dissertation or total bullshit. Except for Frink which is pretty awesome but not really a general purpose programming language.
That said, Java is a really annoying language, but so are all other computer languages that don’t live in the JVM to some degree. It’s perfectly possible to write solid and useful code with it.
(this rant originally posted by me here. I updated it with a link to cute animals and frink)
Determining memory usage in process on OSX
Friday, January 30th, 2009
My strenuous journey started with a seemingly simple task: I wanted to obtain a rough-and-ready estimate of the amount of memory that instantiating a rather opaque data structure from a third party library would consume.
Naively, I started writing a loop that created a new instance, sleeping and then … I poked around Single Unix for a system call that could help and ended up with `getrusage`. The OSX man page states that it fills in this:
    struct rusage {
        struct timeval ru_utime; /* user time used */
        struct timeval ru_stime; /* system time used */
        long ru_maxrss;          /* integral max resident set size */
        long ru_ixrss;           /* integral shared text memory size */
        long ru_idrss;           /* integral unshared data size */
        long ru_isrss;           /* integral unshared stack size */
        long ru_minflt;          /* page reclaims */
        long ru_majflt;          /* page faults */
        long ru_nswap;           /* swaps */
        long ru_inblock;         /* block input operations */
        long ru_oublock;         /* block output operations */
        long ru_msgsnd;          /* messages sent */
        long ru_msgrcv;          /* messages received */
        long ru_nsignals;        /* signals received */
        long ru_nvcsw;           /* voluntary context switches */
        long ru_nivcsw;          /* involuntary context switches */
    };
Unfortunately, all the fields apart from user and system time remain zeroed. Grepping through the xnu sources (xnu is the Darwin kernel, and its source is apparently the only available information about what does and doesn’t work under OSX), I ended up finding this comment in the declaration of `struct rusage`, which finally convinced me that I’m not too stupid to make a simple call:
    struct rusage {
        struct timeval ru_utime; /* user time used (PL) */
        struct timeval ru_stime; /* system time used (PL) */
    #if defined(_POSIX_C_SOURCE) && !defined(_DARWIN_C_SOURCE)
        long ru_opaque[14]; /* implementation defined */
    #else /* (!_POSIX_C_SOURCE || _DARWIN_C_SOURCE) */
        /*
         * Informational aliases for source compatibility with programs
         * that need more information than that provided by standards,
         * and which do not mind being OS-dependent.
         */
        long ru_maxrss; /* max resident set size (PL) */
    #define ru_first ru_ixrss /* internal: ruadd() range start */
        long ru_ixrss; /* integral shared memory size (NU) */
        (...)
    };
At least the good folks over at Sun are kind enough to mention in their man pages that all the fields are dummies.
Then I found Michael Knight’s (not that one) blog, which used the underlying mach function `task_info`. Unfortunately, Apple doesn’t document the mach API at all and the sole reference they supply points directly to nowhere.
Well, if `ps` can determine memory usage, surely it should be able to tell me how. Finding the source of OSX `ps` was another story. Hint: it’s not located in the `basic_cmds`, `misc_cmds`, `shell_cmds`, or `system_cmds` package. It’s in the `adv_cmds` (advice?, advanced?, adventure?).
`ps` ended up using a bunch of equally undocumented (non-mach) kernel functions. At this point I remembered that `macfuse` contains a `procfs` for OSX. To me, using `proc` seems to be the obvious way to get memory usage under Linux, so I dug through that and saw macfuse uses `task_info` as well.
I finally found documentation for the mach API within the xnu sources under the `osfmk/man` directory or online here and was able to write a simplified version of Michael’s original.
Voila:
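    #include <mach/mach.h> /* task_info(), mach_task_self(), struct task_basic_info */

    /* Stores the resident and virtual size (in bytes) of the calling process.
     * Returns 0 on success, -1 on failure. */
    int getmem (unsigned int *rss, unsigned int *vs)
    {
        struct task_basic_info t_info;
        mach_msg_type_number_t t_info_count = TASK_BASIC_INFO_COUNT;

        if (KERN_SUCCESS != task_info(mach_task_self(), TASK_BASIC_INFO,
                                      (task_info_t)&t_info, &t_info_count)) {
            return -1;
        }
        /* The sizes are vm_size_t; values above UINT_MAX are truncated here. */
        *rss = t_info.resident_size;
        *vs = t_info.virtual_size;
        return 0;
    }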
In case anyone knows of an even remotely portable way to obtain similar information, please let me know.
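One loosely portable option is to ask `ps` for the numbers instead of calling into the kernel directly. The following is a minimal Ruby sketch, not a drop-in replacement for `getmem`: the helper names `parse_ps_memory` and `process_memory` are made up for illustration, and it assumes a `ps` that accepts `-o rss= -o vsz=` and reports both values in units of 1024 bytes, which should be checked against the local man page.

```ruby
# Hypothetical helpers; they assume `ps -o rss= -o vsz= -p PID` prints two
# integers (resident and virtual size in KiB) without a header line.

# Turns one line of ps output into byte counts.
def parse_ps_memory(line)
  rss_kib, vsz_kib = line.split.map { |field| Integer(field) }
  raise ArgumentError, "unexpected ps output: #{line.inspect}" if vsz_kib.nil?

  { rss: rss_kib * 1024, vsize: vsz_kib * 1024 }
end

# Runs ps for the given process (the current one by default).
def process_memory(pid = Process.pid)
  output = IO.popen(['ps', '-o', 'rss=', '-o', 'vsz=', '-p', pid.to_s], &:read)
  parse_ps_memory(output)
end
```

Calling `process_memory` before and after instantiating the data structure gives a rough difference, with all the usual caveats about allocators and garbage collection.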
No loopback interface on Windows (XP)
Wednesday, October 22nd, 2008
Learned today that Windows doesn’t support a loopback interface for localhost. In consequence, network packets destined for the local machine are never passed to any interface and therefore can’t be captured by a packet sniffer. Unfortunately, looking at the network is my preferred way of diagnosing network problems, so this behavior gets in the way. An easy workaround is to route packets destined for the local address via the standard gateway instead. The gateway then sends the packets back to the local machine. This is a bit of a detour, but at least the traffic shows up. It also dramatically changes how the packets are being moved around, so it might not help… But just in case:
- grab your IP address and gateway by running `ipconfig` in a DOS box
- add a host route for your own address via the gateway: `route add $LOCAL_IP mask 255.255.255.255 $GATEWAY_IP metric 1`
- when you’re done, use `route delete $LOCAL_IP` to get things back to normal
Visualizing Ant Redux.
Tuesday, October 14th, 2008
I’ve written a small update to my ant-file visualization tool. The only visible change is that the default task is now marked in the output.
You can either download the jar containing everything you need, or build it yourself from the source available via:
svn co http://a2800276.googlecode.com/svn/branches/antvis
If AntVis is run from the command line without any arguments, it prints out the available options:
    $ java -jar antvis.jar
    usage: [jre] antvis.AntVis -f inputFile [-t format] [-o outfile]
      format: format supported by dot. Default: `dot`
      outfile: Default stdout
    call [jre] antvis.AntVis -l for a list of supported formats
If it’s called correctly:
$ java -jar antvis.jar -f build.xml -t png -o self.png
It will produce graphical representations of the provided `build.xml` file, like this one for AntVis’s own `build.xml`:
[image]
or this one:
[image]
The above is an example of a more complicated `build.xml` script; it ships with jPOS.
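The idea behind such a graph can be sketched in a few lines of Ruby. This is not how AntVis itself is implemented (AntVis is a Java tool); it is a hypothetical illustration, with a made-up function name `ant_to_dot`, that reads the `name` and `depends` attributes of each `<target>` and emits DOT source that Graphviz can render. It assumes the REXML library is available and that marking the default target with a box shape is an acceptable stand-in for whatever marking AntVis uses.

```ruby
require 'rexml/document'

# Hypothetical sketch, not AntVis: converts the <target> elements of an Ant
# build file into Graphviz DOT source. Edges point from a target to the
# targets it depends on.
def ant_to_dot(xml)
  project = REXML::Document.new(xml).root
  default = project.attributes['default']
  lines = ['digraph ant {']

  project.elements.each('target') do |target|
    name = target.attributes['name']
    # Draw the default target as a box so it stands out.
    lines << (name == default ? %(  "#{name}" [shape=box];) : %(  "#{name}";))

    # "depends" is a comma-separated list of target names.
    (target.attributes['depends'] || '').split(',').map(&:strip).each do |dep|
      lines << %(  "#{name}" -> "#{dep}";)
    end
  end

  lines << '}'
  lines.join("\n")
end
```

The resulting string can be written to a file and passed to `dot -Tpng` to produce a picture.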
Backslashes in C includes…
Saturday, September 27th, 2008
Who’d have thought:
- that DOS backslashes in C include paths aren’t only ugly and a pain, but also not legal* C:
If the characters ', \, ", //, or /* occur in the sequence between the < and > delimiters, the behavior is undefined. Similarly, if the characters ', \, //, or /* occur in the sequence between the " delimiters, the behavior is undefined. A header name preprocessing token is recognized only within a #include preprocessing directive.
(C99 6.4.7, paragraph 3)
- … the C99 Standard is available for free online. This links directly to the PDF containing the current standard, which lives here.
It’s easy to fix:
    find . -name '*.[ch]' -print0 | xargs -0 \
      ruby -i.bak -pe '$_.gsub!(/\\/, "/") if $_ =~ /^\s*#\s*include/'
* yeah, I know, it’s legal just undefined.
** this post inspired by this.
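The same fix can be written as a small Ruby function, which is easier to try out on a few sample lines before letting it loose on a source tree. This is a sketch under assumptions: the name `fix_include` and the regular expression are made up for illustration, and it only touches the header name itself, so backslashes elsewhere on the line (in a trailing comment, say) are left alone.

```ruby
# Hypothetical helper: rewrites backslashes only inside the header name of
# an #include directive and leaves every other line untouched.
INCLUDE_LINE = /\A(\s*#\s*include\s*)(<[^>\n]*>|"[^"\n]*")/

def fix_include(line)
  line.sub(INCLUDE_LINE) { $1 + $2.tr('\\', '/') }
end

puts fix_include('#include <sys\types.h>')
puts fix_include('#include "win\io.h" /* was a\b */')
puts fix_include('printf("a\n");')
```

Applied to a whole file, it could be mapped over `File.readlines` and the result written back, ideally after making a backup the way `-i.bak` does.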
„Ruby Cryptography: TINFM“
Tuesday, July 22nd, 2008
TINFM meaning „there is no fine manual“, of course. Just as with `digest`, Ruby’s openssl documentation is missing just the bits you’ll need to get started. In order to encrypt or decrypt something, you’ll first need to instantiate the appropriate cipher:
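    require 'openssl'
    # OpenSSL::Cipher::Cipher is a legacy alias that newer versions of Ruby's
    # openssl library deprecate or no longer provide; OpenSSL::Cipher works.
    cipher = OpenSSL::Cipher.new NAME_OF_CIPHER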
So how do I know the name of the cipher if it’s not documented? You’ll need to refer to the OpenSSL documentation, or refer to this handy list:
base64 | Base 64 (an encoding accepted by the `openssl enc` command, not a cipher name for `OpenSSL::Cipher`) |
bf-cbc | Blowfish in CBC mode |
bf | Alias for bf-cbc |
bf-cfb | Blowfish in CFB mode |
bf-ecb | Blowfish in ECB mode |
bf-ofb | Blowfish in OFB mode |
cast-cbc | CAST in CBC mode |
cast | Alias for cast-cbc |
cast5-cbc | CAST5 in CBC mode |
cast5-cfb | CAST5 in CFB mode |
cast5-ecb | CAST5 in ECB mode |
cast5-ofb | CAST5 in OFB mode |
des-cbc | DES in CBC mode |
des | Alias for des-cbc |
des-cfb | DES in CFB mode |
des-ofb | DES in OFB mode |
des-ecb | DES in ECB mode |
des-ede-cbc | Two key triple DES EDE in CBC mode |
des-ede | Two key triple DES EDE in ECB mode |
des-ede-cfb | Two key triple DES EDE in CFB mode |
des-ede-ofb | Two key triple DES EDE in OFB mode |
des-ede3-cbc | Three key triple DES EDE in CBC mode |
des-ede3 | Three key triple DES EDE in ECB mode |
des3 | Alias for des-ede3-cbc |
des-ede3-cfb | Three key triple DES EDE in CFB mode |
des-ede3-ofb | Three key triple DES EDE in OFB mode |
desx | DESX algorithm. |
idea-cbc | IDEA algorithm in CBC mode |
idea | Alias for idea-cbc |
idea-cfb | IDEA in CFB mode |
idea-ecb | IDEA in ECB mode |
idea-ofb | IDEA in OFB mode |
rc2-cbc | 128 bit RC2 in CBC mode |
rc2 | Alias for rc2-cbc |
rc2-cfb | 128 bit RC2 in CFB mode |
rc2-ecb | 128 bit RC2 in ECB mode |
rc2-ofb | 128 bit RC2 in OFB mode |
rc2-64-cbc | 64 bit RC2 in CBC mode |
rc2-40-cbc | 40 bit RC2 in CBC mode |
rc4 | 128 bit RC4 |
rc4-64 | 64 bit RC4 |
rc4-40 | 40 bit RC4 |
rc5-cbc | RC5 cipher in CBC mode |
rc5 | Alias for rc5-cbc |
rc5-cfb | RC5 cipher in CFB mode |
rc5-ecb | RC5 cipher in ECB mode |
rc5-ofb | RC5 cipher in OFB mode |
aes-[128|192|256]-cbc | 128/192/256 bit AES in CBC mode |
aes-[128|192|256] | Alias for aes-[128|192|256]-cbc |
aes-[128|192|256]-cfb | 128/192/256 bit AES in 128 bit CFB mode |
aes-[128|192|256]-cfb1 | 128/192/256 bit AES in 1 bit CFB mode |
aes-[128|192|256]-cfb8 | 128/192/256 bit AES in 8 bit CFB mode |
aes-[128|192|256]-ecb | 128/192/256 bit AES in ECB mode |
aes-[128|192|256]-ofb | 128/192/256 bit AES in OFB mode |
A list of the currently supported cipher names, without the explanations, can also be produced by calling `OpenSSL::Cipher.ciphers`.
After you’ve instantiated the proper cipher, you tell it to either encrypt or decrypt, give it the key to use (and possibly an IV), pass in data using `update` and collect whatever is still buffered with `final`:
    cipher.encrypt
    cipher.key = KEY_DATA
    # update may hold back a partial block; final flushes it and adds padding
    ciphertext = cipher.update(plaintext) + cipher.final

    cipher.reset
    cipher.decrypt
    cipher.key = KEY_DATA
    plaintext = cipher.update(ciphertext) + cipher.final
That’s all. Not really difficult, once you’ve pieced everything together.
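Pieced together, a complete round trip might look like the following. It is a minimal sketch rather than a recommendation: the helper name `crypt` is made up, `aes-256-cbc` is just one of the names from the list above, and the key and IV are random throwaway values that a real program would have to store or derive somewhere.

```ruby
require 'openssl'

# Hypothetical helper: encrypts or decrypts data in one go.
# mode is :encrypt or :decrypt, name is a cipher name such as 'aes-256-cbc'.
def crypt(mode, name, key, iv, data)
  cipher = OpenSSL::Cipher.new(name)
  # The direction has to be chosen before the key and IV are set.
  if mode == :encrypt
    cipher.encrypt
  else
    cipher.decrypt
  end
  cipher.key = key
  cipher.iv = iv
  # update returns the blocks processed so far, final flushes the rest.
  cipher.update(data) + cipher.final
end

name  = 'aes-256-cbc'
probe = OpenSSL::Cipher.new(name)
key   = OpenSSL::Random.random_bytes(probe.key_len) # throwaway key
iv    = OpenSSL::Random.random_bytes(probe.iv_len)  # throwaway IV

ciphertext = crypt(:encrypt, name, key, iv, 'attack at dawn')
plaintext  = crypt(:decrypt, name, key, iv, ciphertext)
puts plaintext
```

Creating a fresh cipher object per operation sidesteps the `reset` call; reusing one object as in the snippet above works too.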
Bit-twiddling with Ruby
Tuesday, May 6th, 2008
I’ve always wanted to write some routines that help out with bit twiddling. Since I’ve been working on some byte level stuff recently (Smartcards, ISO7816 to be precise), I’ve finally gotten around to writing an API to make handling bytes easier and self-documenting. Basically, it’s a –attention buzzword– DSL for bitfield description. I haven’t gotten very far yet, but this is how it looks up to now: if you’ve got a byte composed of bits with the following semantics:
    |8|7|6|5|4|3|2|1|Desc
    ==================================
    |1|-|-|-|-|-|-|-|Channel Encrypted
    |-|0|0|0|-|-|-|-|Method A
    |-|1|0|0|-|-|-|-|Method B
    |-|0|0|1|-|-|-|-|Method C
    |-|-|-|-|-|X|X|X|Channel Number
I can use the following Ruby code to represent it:
    require "bytes"

    b = Bytes::Byte.new "1......." => :enc,
                        ".000...." => :a,
                        ".100...." => :b,
                        ".001...." => :c,
                        ".....vvv" => :channel

    b.value = 0xff
    b.enc?         # true
    b.b?           # false
    b.b            # `b.value` is now 0xCF / "11001111"
    b.b?           # true
    b.channel      # 7
    b.channel = 0  # `b.value` is now 0xC8 / "11001000"
Instead of using the `Byte` class and instantiating it with the byte’s pattern, it’s also possible to include the module `Bytes`, which adds an `attr`-like class function (called `byte_accessor`) that adds the same sort of functionality to classes. Take this –vaguely contrived– implementation of the first two bytes of an IP packet:
    require "bytes_ng"

    class IPPacket
      include Bytes

      byte_accessor :ver_ihl,
        "vvvv ...." => :version,
        ".... vvvv" => :ihl

      byte_accessor :tos,
        "111. .... | Precedence"  => :network_control,
        "110. ...."               => :inet_control,
        "101. ...."               => :critic_epc,
        "100. ...."               => :flash_override,
        "011. ...."               => :flash,
        "010. ...."               => :immediate,
        "001. ...."               => :priority,
        "000. ...."               => :routine,
        "...0 .... | Delay"       => :normal_delay,
        "...1 ...."               => :low_delay,
        ".... 0... | Throughput"  => :normal_throughput,
        ".... 1..."               => :high_throughput,
        ".... .0.. | Reliability" => :low_reliability,
        ".... .1.."               => :high_reliability,
        ".... ..1. | RFU"         => :rfu_err_1,
        ".... ...1 | RFU"         => :rfu_err_2
    end
This adds two instance variables (and their respective accessors) named `ver_ihl` and `tos` to the class `IPPacket`. These contain the actual byte value. It also adds a bunch of methods (like in the example above) that can be used to query and set the individual bits.
I’ve not gotten around to properly releasing it yet, but it works quite well so far. In case you’re interested, you can currently get it here.
Future plans are to package it and (maybe) add multi-byte functionality.
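For readers wondering how such patterns can be turned into masks, here is a minimal sketch. It is not the code of the library described above: the function names (`parse_pattern`, `flag_set?`, `set_flag`, `read_value`, `write_value`) are hypothetical, and it assumes that `0` and `1` mark fixed bits, `v` marks value bits, `.` marks bits that belong to other fields, and spaces are ignored.

```ruby
# Hypothetical sketch of pattern-driven bitfields, not the Bytes library.
def parse_pattern(pattern)
  chars = pattern.delete(' ')
  raise ArgumentError, "expected 8 bits, got #{chars.length}" unless chars.length == 8

  {
    mask:  chars.tr('01v.', '1110').to_i(2), # 1 wherever the field owns a bit
    bits:  chars.tr('v.', '00').to_i(2),     # the fixed 0/1 bits of a flag
    shift: chars.reverse.index(/[^.]/) || 0  # position of the lowest owned bit
  }
end

# Query a flag, like `b.enc?`.
def flag_set?(byte, field)
  (byte & field[:mask]) == field[:bits]
end

# Force the flag's bit pattern into the byte, like `b.b`.
def set_flag(byte, field)
  (byte & ~field[:mask] & 0xff) | field[:bits]
end

# Read a value field, like `b.channel`.
def read_value(byte, field)
  (byte & field[:mask]) >> field[:shift]
end

# Write a value field, like `b.channel = 0`.
def write_value(byte, field, value)
  (byte & ~field[:mask] & 0xff) | ((value << field[:shift]) & field[:mask])
end

method_b = parse_pattern('.100....')
channel  = parse_pattern('.....vvv')
byte = set_flag(0xff, method_b)          # same step as `b.b` above
byte = write_value(byte, channel, 0)     # same step as `b.channel = 0` above
puts format('%08b', byte)
```

Keeping the parsed pattern as plain data makes it straightforward to generate accessor methods from it with `define_method`, which is presumably close to what an `attr`-like class function would do.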
Visualizing Ant Scripts.
Tuesday, April 8th, 2008
XML is generally not only tedious to write, but also hideous to look at, yet sometimes you gotta bite the bullet and use an `ant` build script.
I’ve written a little tool that renders a dependency graph of all the tasks in an ant `build.xml` file. The result looks like this:
The above was generated from this xml file which is too long and ugly to include here.
In case you’d also like to generate nifty little pictures like the above, to beef up skimpy documentation, for example, you can download the tool here. Just call:
java -jar antvis.jar
And all the rest should be self-explanatory. You’ll need to have a copy of Graphviz installed in order to render the pictures. In case you are interested in the source, you can grab a copy using subversion here:
EURUKO 2008: Native Extensions
Sunday, March 30th, 2008
You were expecting cats?
Here are the code examples in case you want to try them out yourself.
Here is a copy of the slides. Unfortunately, they’re quite huge at the moment; I’ll try to get a smaller version up as soon as I figure out Keynote…
Suggestions from the audience
- Ruby Hacking Guide : more detailed information about Ruby internals.
- RubyInline : allows you to write C code in the middle of Ruby code
- Somebody said the videos of the talks would be put up here
- Dr. Nic was kind enough to write a generator to handle native extensions for his newgem tool. newgem is definitely worth trying out. It generates a LOT of boilerplate, which is sort of overwhelming, but it’s more fun to figure out what his tool does than to type boilerplate code.