Bookmarks (2) Theme: Programming languages

  • The design of elegant languages - “What I set out to do here next, is to look at programming languages from a conception of the programmer’s task and deal with some aspects in the evolution of programming languages viewed, specifically, as languages in which to compose programs. The treatment reflects largely my personal experience and taste in programming, and as such will not at all be comprehensive. In doing this I shall pay particular attention to ALGOL68. It is, however, not my aim to give a “critical but balanced” assessment of this language. Also, I will freely ascribe “innovations” to language B, even though it may be argued that the essence of the idea existed before in language A, if B was the first to do it right, or with sufficient generality.”
  • Hints on programming language design - Hoare - “This paper presents the view that a programming language is a tool which should assist the programmer in the most difficult aspects of his art, namely program design, documentation, and debugging. It discusses the objective criteria for evaluating a language design, and illustrates them by application to language features of both high level languages and machine code programming. It concludes with an annotated reading list, recommended for all intending language designers.”
  • Growing a language
  • Scalable computer programming languages - “A program that can’t easily be extended to cope with new requirements is called brittle, which is the opposite of scalable. Since effectively all programs that are successful grow new parts, a non-scalable program usually will require a total rewrite when the requirements change, which is very wasteful. The usual reason for this kind of non-scalability is poor design abstraction; too many fundamental design decisions have been hard-wired into the code in so many places that it is difficult to change them all without introducing lots of bugs.”
  • Liskov substitution principle - “The importance of this principle becomes obvious when you consider the consequences of violating it. If there is a function which does not conform to the LSP, then that function uses a pointer or reference to a base class, but must know about all the derivatives of that base class. Such a function violates the Open-Closed principle because it must be modified whenever a new derivative of the base class is created.”
  • Computer language benchmarks - “This box plot shows how many times slower, the fastest benchmark programs for selected programming language implementations were, compared to the fastest programs.”
  • Why Python, Ruby and Javascript are slow? - A good look at the design of these dynamic programming languages, which forces these languages to be comparatively slower.

Bookmarks (1)

I am pressed for time, so instead of doing long posts, I am posting some interesting links.

  • Hash Functions - comparative analysis of performance - “The question is whether using complex functions gives you a faster program. The complex functions require more operations per one key, so they can be slower. Is the price of collisions high enough to justify the additional operations?”
  • Hash Tables - “Designing a hash function is more trial and error with a touch of theory than any well defined procedure. For example, beyond making the connection between random numbers and the desirable random distribution of hash values, the better part of the design for my JSW hash involved playing around with constants to see what would work best.”
  • Ukkonen’s suffix tree building algorithm - “The following is an attempt to describe the Ukkonen algorithm by first showing what it does when the string is simple (i.e. does not contain any repeated characters), and then extending it to the full algorithm.”
  • Rolling Hash - “A rolling hash is a hash function where the input is hashed in a window that moves through the input. A few hash functions allow a rolling hash to be computed very quickly—the new hash value is rapidly calculated given only the old hash value, the old value removed from the window, and the new value added to the window—similar to the way a moving average function can be computed much more quickly than other low-pass filters.”
  • Matching substrings from a pattern in another string - “I’ve got a long text (about 5 MB filesize) and another text called pattern (around 2000 characters). The task is to find matching parts from a genom-pattern which are 15 characters or longer in the long text.”
  • Balanced Binary Trees - “libavl is a library in ANSI C for manipulation of various types of binary trees. This book provides an introduction to binary tree techniques and presents all of libavl’s source code, along with annotations and exercises for the reader. It also includes practical information on how to use libavl in your programs and discussion of the larger issues of how to choose efficient data structures and libraries. The book concludes with suggestions for further reading, answers to all the exercises, glossary, and index.”
  • Dynamic Programming Problems - “This site contains a collection of practice dynamic programming problems and their solutions.”
  • lcc: A retargetable compiler for C - “This book describes lcc, a retargetable compiler for ANSI C; it focuses on the implementation. Most compiler texts survey compiling algorithms, which leaves room for only a toy compiler. This book leaves the survey to others. It tours most of a practical compiler for full ANSI C, including code generators for three target machines. It gives only enough compiling theory to explain the methods that it uses.”

What to expect from the Ruby expect library?

A little background first: expect is a library for interacting with programs from Ruby. Conceptually it is based on the original UNIX expect program, which is commonly used to automate UNIX administration. The expect library provides an API that takes a regex to match against the program’s expected output and, optionally, an action to take on a match. I have been experimenting with the expect library to automate a gdb session. Expect is an undocumented module and ri does not help. So just in case you are not able to get it to work for you, read on.

Spawning interactive programs

Ruby has several ways to spawn programs. For a nice treatment of the subject check out Avdi’s excellent blog posts on the subject. We’ll use Ruby’s PTY library to spawn the program, and interact with it using expect. Using PTY library prevents IO buffering, which can cause problems during interaction with a spawned process. Let’s see how to make gdb print out a help message, using two methods.

Method 1 - Single interaction
This Ruby script takes a string as an argument and prints the gdb help for it. We’ll spawn a gdb process, send it the gdb help command and read the results from its stdout. Listing 1 shows the script. Notice that PTY.spawn is passed a Ruby block, which is executed immediately after the gdb process is spawned. When the block ends, the gdb process is terminated. PTY.spawn passes the input/output File objects for the child gdb process, as well as its pid, to the block. The till_prompt() method reads all output from gdb until the next gdb prompt is seen, and returns the data read as a string. Notice the use of the IO#getc method. We don’t use IO#gets because when gdb prints its prompt, it waits for input from the user and therefore does not print a newline after the prompt. If IO#gets were used, it would stall waiting for a newline after gdb prints the prompt.

Sample output after executing this script is shown in Listing 2. This is a deliberately minimal program, designed to demonstrate concepts, and does not attempt to do any error handling.

#!/usr/bin/env ruby
require 'pty'

# Read gdb's output one character at a time until the next "(gdb)" prompt.
def till_prompt(cout)
    buffer = ""
    loop { buffer << cout.getc.chr; break if buffer =~ /\(gdb\)/ }
    return buffer
end

PTY.spawn("gdb") do |gdb_out, gdb_in, pid|
    print till_prompt(gdb_out)          # startup banner, up to the first prompt
    gdb_in.puts("help #{ARGV[0]}")      # send the help command
    puts till_prompt(gdb_out)           # echoed command plus the help text
end

Listing 1 - gdb.rb

[sudhanshu@sudhanshu-desktop]$ ./gdb.rb break
help break
Set breakpoint at specified line or function.
LOCATION may be a line number, function name, or "*" and an address.
If a line number is specified, break at start of code for that line.
If a function is specified, break at start of code for that function.
If an address is specified, break at that exact address.
With no LOCATION, uses current execution address of selected stack frame.
This is useful for breaking on return to a stack frame.

THREADNUM is the number from "info threads".
CONDITION is a boolean expression.

Multiple breakpoints at one place are permitted, and useful if conditional.

Do "help breakpoints" for info on other commands dealing with breakpoints.

Listing 2 - gdb.rb output

Notice the one-shot usage of this program. It exits immediately, which is why passing PTY.spawn a block makes sense here. In the next method of spawning interactive processes, we’ll see why we might not want to pass a block.
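
The reason for reading one character at a time can be checked without gdb. The following is a minimal sketch, assuming an in-process IO.pipe as a stand-in for the gdb terminal (the pipe and the sample banner text are illustrative and not part of the original script). The prompt is written without a trailing newline, and till_prompt() still returns as soon as the prompt has been read.

```ruby
# Same helper as in Listing 1.
def till_prompt(cout)
    buffer = ""
    loop { buffer << cout.getc.chr; break if buffer =~ /\(gdb\)/ }
    return buffer
end

# Assumption: a pipe plays the role of gdb's output stream.
reader, writer = IO.pipe
writer.write("GNU gdb banner\n(gdb) ")   # note: no newline after the prompt
writer.flush

# Returns once "(gdb)" has been read, without waiting for a newline.
output = till_prompt(reader)
puts output
```

A line-oriented read of the same stream would consume the banner line and then block on the prompt, since no newline ever follows it while the writer stays open.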

Method 2 - Multiple interactions
This method is useful when you’d like to save the input/output objects returned by PTY.spawn and interact with the process multiple times using these objects. Let’s rewrite the gdb.rb program in Listing 1 to be used as a loadable library in irb, instead of a one-shot program. Listing 3 shows this program. Notice the new gdb() method, which is provided as an API to run arbitrary gdb commands. Also notice that this time we store a reference to the input/output objects returned by PTY.spawn. Listing 4 shows how the user can work with this version more interactively.

#!/usr/bin/env ruby
require 'pty'

def till_prompt(cout)
    buffer = ""
    loop { buffer << cout.getc.chr; break if buffer =~ /\(gdb\)/ }
    return buffer
end

# Send one command to gdb and print everything up to the next prompt.
def gdb(string)
    @gdb_in.puts(string)
    puts till_prompt(@gdb_out)
end

# No block this time: keep the IO objects around for later calls to gdb().
@gdb_out, @gdb_in, @pid = PTY.spawn("gdb")
print till_prompt(@gdb_out)

Listing 3 - gdb_irb.rb

[sudhanshu@sudhanshu-desktop]$ irb
>> require 'gdb_irb.rb'
GNU gdb (GDB) 7.1-ubuntu
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see:
(gdb)=> true
>> gdb "help break"
 help break
Set breakpoint at specified line or function.
LOCATION may be a line number, function name, or "*" and an address.
If a line number is specified, break at start of code for that line.
If a function is specified, break at start of code for that function.
If an address is specified, break at that exact address.
With no LOCATION, uses current execution address of selected stack frame.
This is useful for breaking on return to a stack frame.

THREADNUM is the number from "info threads".
CONDITION is a boolean expression.

Multiple breakpoints at one place are permitted, and useful if conditional.

Do "help breakpoints" for info on other commands dealing with breakpoints.
=> nil
>> gdb "file /bin/date"
 file /bin/date
Reading symbols from /bin/date...(no debugging symbols found)...done.
=> nil
>> gdb "r"
Starting program: /bin/date
[Thread debugging using libthread_db enabled]
Tue Aug 10 05:37:22 IST 2010

Program exited normally.
=> nil
>> @gdb_in.inspect
=> "#<File:/dev/pts/4>"
?> @gdb_out.inspect
=> "#<File:/dev/pts/4>"

Listing 4 - Interactive GDB in irb

Notice how irb “wraps” around gdb and creates a much more powerful debugging environment on top of it. We could, for instance, read large amounts of data from the process being debugged into Ruby variables/classes and analyze it using the far more powerful facilities provided by the Ruby programming environment, as compared to gdb macros/scripts.
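
As a rough sketch of that idea, assume a variant of gdb() that returns the text instead of printing it. The captured text can then be parsed into ordinary Ruby data. The sample text below only imitates the shape of gdb’s "info registers" output (the values are made up), and parse_registers is a hypothetical helper, not part of gdb or of the listings above.

```ruby
# Assumption: text captured from gdb, in the "name  hex  decimal" shape
# printed by "info registers". The values here are invented for illustration.
sample = "eax            0x1c     28\r\n" \
         "ebx            0x0      0\r\n" \
         "(gdb)"

# Hypothetical helper: turn register rows into a Hash of name => Integer.
def parse_registers(text)
  text.each_line.with_object({}) do |line, regs|
    name, hex = line.split
    # Skip lines such as the prompt that do not carry a hex value.
    regs[name] = Integer(hex, 16) if hex && hex.start_with?("0x")
  end
end

regs = parse_registers(sample)
puts regs.inspect
```

Once the data is in a Hash, the usual Ruby tools (sorting, filtering, comparing snapshots between commands) apply directly.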

Expect library

Now that we have seen how to spawn processes in Ruby, and interact with them with custom code, let’s see how to use the expect library to interact with them. The expect library adds an expect() method to the IO class, which is the basis of all input output in Ruby. The expect() method is just a beefed up version of the till_prompt() method that we saw above.

While the till_prompt() method used a fixed pattern to match the next gdb prompt, the expect() method takes a Ruby String or a regular expression object of type Regexp as a pattern to match against program output.

The till_prompt() method simply returned the whole buffer after matching the fixed pattern. The expect() method, however, can optionally take a Ruby block to execute as soon as the pattern matches. This block is passed the array containing the result of the match. Alternatively, if a block is not given, expect() returns the result array: the buffer against which the pattern was matched, followed by the capture groups from the MatchData object returned by Regexp#match().

Apart from these the expect() method can optionally take a timeout value in seconds as its second argument. If no match is found within the given time limit, it returns nil. Thus the API can be summarized in the following way:

result = io.expect("pattern" | /pattern/ [, timeout in secs]) [ { |array| ... } ]

Note that if a block is passed into expect(), the result array is yielded to the block; in current Ruby releases expect() itself then returns nil, so anything you need from the match should be captured inside the block. Now let’s see the implementation of the above two programs using the expect() method.

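#!/usr/bin/env ruby
require 'pty'
require 'expect'

PTY.spawn("gdb") do |gdb_out, gdb_in, pid|
    # Wait for the first prompt, then send the help command from inside the block.
    gdb_out.expect(/\(gdb\)/) { |r| gdb_in.puts("help #{ARGV[0]}") }
    # Element 0 of the result is everything read up to and including the next prompt.
    puts gdb_out.expect(/\(gdb\)/)[0]
end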

Listing 5 - gdb_expect.rb

#!/usr/bin/env ruby
require 'pty'
require 'expect'

# Send one command to gdb and print everything up to the next prompt.
def gdb(string)
    @gdb_in.puts(string)
    puts @gdb_out.expect(/\(gdb\)/)[0]
end

@gdb_out, @gdb_in, @pid = PTY.spawn("gdb")
puts @gdb_out.expect(/\(gdb\)/)[0]

Listing 6 - gdb_irb_expect.rb

Beginners with the Ruby expect library can get into trouble if they assume that the pattern being passed in to the expect() method will be matched against each line output by the spawned program. This assumption is incorrect, because as we saw the pattern is actually matched against all the characters read into a buffer which includes newline characters. It’s also worth mentioning that the newlines seen by expect() are “\r\n” and not “\n”.
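
The buffer-wide matching and the result array can be tried out without gdb. This is a minimal sketch, assuming an in-process IO.pipe as a stand-in for the spawned program; the text written to the pipe is invented sample data that merely uses “\r\n” line endings like a pseudo-terminal would.

```ruby
require 'expect'

# Assumption: a pipe plays the role of the spawned program's output.
reader, writer = IO.pipe
writer.write("Reading symbols\r\ndone.\r\n(gdb) ")
writer.flush

# The pattern spans a line break; expect() matches against the whole buffer,
# not line by line. The capture group ends up after the buffer in the result.
result = reader.expect(/(\w+)\.\r\n\(gdb\) /, 1)
puts result.inspect

# Nothing more arrives, so a short timeout makes expect() give up with nil.
timed_out = reader.expect(/never printed/, 0.2)
puts timed_out.inspect

# With a block, the same kind of result array is yielded to the block.
writer.write("(gdb) ")
writer.flush
yielded = nil
reader.expect("(gdb)", 1) { |r| yielded = r }
puts yielded.inspect
```

A String pattern is matched literally, which is why "(gdb)" needs no escaping in the last call while the Regexp form does.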

As a debugging mechanism there’s a global variable called $expect_verbose, provided by the expect library. Set this variable to true in your program, and expect() method will print every character read at each intermediate step on stdout. This is an extremely useful tool for debugging expect programs.
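
A short sketch of the verbose switch in use, again assuming an IO.pipe with made-up text in place of a real gdb session:

```ruby
require 'expect'

# Assumption: a pipe with sample text stands in for gdb's output.
reader, writer = IO.pipe
writer.write("hello\r\n(gdb) ")
writer.flush

$expect_verbose = true                     # echo each character as it is read
verbose_result = reader.expect(/\(gdb\)/, 1)
$expect_verbose = false                    # switch the echo off again

puts
puts verbose_result.inspect
```

The characters echoed while the flag is on show exactly what the buffer contained at the moment the pattern matched, which helps when a pattern fails because of an unexpected “\r” or a missing space.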


Ubuntu Intrepid 8.10 Quickstart

This article is a quick rundown of everything that I needed to do, before I became reasonably productive, after a reinstall of Ubuntu. A recent virus attack on my Windows desktop left me fuming - the loss of productivity for two days and the amount of damage done, was enough to propel me back to my Linux desktop.

I got the time and opportunity recently to get my desktop set for its second-innings with Linux. This page serves as a holder for all that I learned in the process and to prevent me from trying to relearn all of this, if I have to do this again. Hopefully, it’ll help anyone else trying to go through the same process.
My wife likes to work on Windows, so it has been my primary OS at home, so far. I didn’t want another outage due to a virus attack, so I decided to install Windows under VMWare for her, while I have some peace of mind with Linux on my desktop. I had the following goals for my new setup:

  1. Dual-boot between Ubuntu and Windows XP from my first hard-disk
  2. Find suitable Ubuntu replacements for all the goodies I have been using in Windows
  3. Share files easily between the dual-booted Windows XP and Ubuntu
  4. Hibernate Ubuntu automatically when the power goes off
  5. Restart system automatically when power comes back up
  6. Run Windows XP as guest in Ubuntu VMware host
  7. Let Windows XP guest access Internet and share files with host Ubuntu OS

I am going to gloss over details in most of the steps below, because this post is not really meant to explain the why or how of things. However, in case someone wants details, I have provided links covering the same in-depth. I have leveraged a lot of work done by other people and my gratitude goes out to them for their time and effort.

My planned dual-boot setup looks like this:


So here’s a quick walkthrough of steps needed to reach the above goals:

Install Windows XP first

No-brainer. This is insurance against something not working in the VMware Windows XP guest OS. I create an extra common partition for use by both Windows and Ubuntu. In previous configurations it’s been FAT32, since both Linux and Windows could write to it easily, but this time I created it as NTFS, since Ubuntu now has read-write support for NTFS partitions and it has been working fine for some time. Yeah, I am feeling adventurous! But just in case, I won’t be using the partition for any mission-critical stuff that I can’t afford to lose. If something goes wrong, this partition is going back to being FAT32.

Install Ubuntu Intrepid 8.10

I created a big /var partition so that I can put my virtual machines in that partition. I also want to play with sharing virtual machines between Windows and Linux host OS. Such virtual machines would be copies of guest VMs in /var/vmware and kept in the common partition.

Update packages in Ubuntu 8.10

The update manager in Ubuntu will be prompting you for updates by now, which may run into hundreds of MB. You can start the updates and leave them running overnight, or complete the steps below and update later. While updates are being downloaded, you won’t be able to install anything else.

Update: On a fresh Ubuntu 8.10 system that hasn’t been updated, you may not be able to install Samba because of some packaging errors, so you may want to complete any updates first before going ahead.

Install applications in Ubuntu 8.10

The next step is to install all needed applications, replacements for what you use in Windows etc. Before doing this just add Google’s repositories to your system. Here you go:


sudo apt-get -y install msttcorefonts
sudo apt-get -y install libdvdread3 regionset
sudo apt-get -y install samba smbclient smbfs
sudo apt-get -y install vpnc
sudo apt-get -y install thunderbird thunderbird-gnome-support latex-xft-fonts
sudo apt-get -y install picasa
sudo apt-get -y install inkscape
sudo apt-get -y install amarok
sudo apt-get -y install vlc
sudo apt-get -y install conky iotop htop powertop
sudo apt-get -y install adobe-flashplugin
sudo apt-get -y install sysinfo
sudo apt-get -y install gworldclock
sudo apt-get -y install rar p7zip-full unrar
sudo apt-get -y install gnome-do
sudo apt-get -y install nmap
sudo apt-get -y install wine
sudo apt-get -y install compizconfig-settings-manager
sudo apt-get -y install sun-java6-plugin sun-java6-jre
sudo apt-get -y install chkconfig
sudo apt-get -y install sensors-applet
sudo apt-get -y install nautilus-actions
sudo apt-get -y install nautilus-open-terminal
sudo apt-get -y install nautilus-script-manager
sudo apt-get -y install nautilus-filename-repairer
sudo apt-get -y install nautilus-script-debug
sudo apt-get -y install nautilus-image-converter
sudo apt-get -y install nautilus-script-collection-svn
sudo apt-get -y install diff-ext
sudo apt-get -y install gnome-themes-extras
sudo apt-get -y install gnochm
sudo apt-get -y install gftp
sudo apt-get -y install apcupsd
sudo apt-get -y install gapcmon
sudo apt-get -y install screenlets
sudo apt-get -y install terminator
sudo apt-get -y install electricsheep
sudo apt-get -y install guayadeque
sudo apt-get -y install xbindkeys xbindkeys-config


sudo apt-get -y install vim-full vim-gnome
sudo apt-get -y install subversion subversion-tools db4.6-util
sudo apt-get -y install manpages-dev
sudo apt-get -y install rapidsvn
sudo apt-get -y install meld
sudo apt-get -y install systemtap
sudo apt-get -y install g++
sudo apt-get -y install libncurses5-dev
sudo apt-get -y install gettext
sudo apt-get -y install kernel-patch-scripts
sudo apt-get -y install linux-source-x.x.x
sudo apt-get -y install linux-crashdump
sudo apt-get -y install linux-doc
sudo apt-get -y install linux-headers-`uname -r` build-essential xinetd
sudo apt-get -y install kernel-package fakeroot wget bzip2
sudo apt-get -y install kerneltop
sudo apt-get -y install oprofile
sudo apt-get -y install crash
sudo apt-get -y install cscope
sudo apt-get -y install ctags
sudo apt-get -y install ack-grep

Download and install VMWare Server 2.0 for Linux

There are several flavors of VMWare, VMWare Server being the free one for both Windows and Linux. Download the server from here. Install it as described here. You may run into a problem when configuring VMware server 2.0.1 in Ubuntu 8.10. It tries to build the VSOCK module and load it in the kernel, but fails. This can be solved by patching the file with a patch provided by an enterprising user. Download it here.
Before the patch:

Unable to make a vsock module that can be loaded in the running kernel:
insmod: error inserting '/tmp/vmware-config0/vsock.o': -1 Unknown symbol in module
There is probably a slight difference in the kernel configuration between the
set of C header files you specified and your running kernel. You may want to
rebuild a kernel based on that directory, or specify another directory.
The VM communication interface socket family is used in conjunction with the VM
communication interface to provide a new communication path among guests and
host. The rest of this software provided by VMware Server is designed to work
independently of this feature. If you wish to have the VSOCK feature you can
install the driver by running again after making sure that
gcc, binutils, make and the kernel sources for your running kernel are
installed on your machine. These packages are available on your distribution's
installation CD.
[ Press the Enter key to continue.]

After the patch:

VMWare config patch VSOCK!
`/tmp/vmware-config1/../Module.symvers' -> `/tmp/vmware-config1/vsock-only/Module.symvers'
Building the vsock module.
Using 2.6.x kernel build system.
make: Entering directory `/tmp/vmware-config1/vsock-only'
make -C /lib/modules/2.6.27-7-generic/build/include/.. SUBDIRS=$PWD SRCROOT=$PWD/. modules
make[1]: Entering directory `/usr/src/linux-headers-2.6.27-7-generic'
CC [M] /tmp/vmware-config1/vsock-only/linux/af_vsock.o
CC [M] /tmp/vmware-config1/vsock-only/linux/driverLog.o
CC [M] /tmp/vmware-config1/vsock-only/linux/util.o
/tmp/vmware-config1/vsock-only/linux/util.c: In function ‘VSockVmciLogPkt’:
/tmp/vmware-config1/vsock-only/linux/util.c:157: warning: format not a string literal and no format arguments
CC [M] /tmp/vmware-config1/vsock-only/linux/vsockAddr.o
LD [M] /tmp/vmware-config1/vsock-only/vsock.o
Building modules, stage 2.
MODPOST 1 modules
CC /tmp/vmware-config1/vsock-only/vsock.mod.o
LD [M] /tmp/vmware-config1/vsock-only/vsock.ko
make[1]: Leaving directory `/usr/src/linux-headers-2.6.27-7-generic'
cp -f vsock.ko ./../vsock.o
make: Leaving directory `/tmp/vmware-config1/vsock-only'
The vsock module loads perfectly into the running kernel.

Disable KVM

VMWare Server won’t be able to execute your guest OS until you unload/turn-off the built-in KVM virtualization engine in the Linux kernel. Do this:

> sudo rmmod kvm_intel
> sudo rmmod kvm
> sudo chkconfig -e kvm

chkconfig will open an editor. Just replace on with off.

Install Windows XP as guest OS

Access VMWare Server using your web-browser at this link: http://localhost:8222. Add a datastore. I use /var/vmware (that’s the reason behind /var partition being so big). Create a virtual machine and install the guest OS. Windows XP in VMWare server 2.0.1 can only run on IDE disks. I am not sure if you’d face this problem during a fresh install, but I faced this when migrating a Windows XP guest VM to Ubuntu, which was created in Windows XP itself as host OS on a virtual SCSI disk. So I had to use Acronis TrueImage first, to clone the virtual SCSI disk to a new virtual IDE drive within the guest VM and then migrate it to Linux host. Some users tried out Norton Ghost for the same but reported having problems with it (although I have seen others using it successfully, with some hoops and twists).

Next, add a new disk to your virtual machine. Start the Windows XP guest OS and format this new drive. Then turn on file sharing and give read-write access to this particular drive. I like to give the host read-write access to the guest’s drive, so that it’s easy to transfer data between the host and guest using this shared drive. The guest also stores all its data on this drive. The advantage of a separate drive for your data is that this disk can easily be moved to another VM, in case your current VM goes kaput.

Access the shared folders in guest VMs from Places > Network. You could set up this folder to be mounted easily by tweaking instructions provided here. You can also create a shortcut for your guest OS from the VMware Server browser console, for easy access.

Install the VMWare tools in the Windows XP guest OS from the VMWare browser console. You’d need these for seamless integration of the guest OS with the host. These tools make life a lot easier, when dealing with display settings, mouse cursor etc.

Update: The key mappings for some of the keys in your guest Windows XP OS may be wrong at this time. Some keys such as the Del, Arrow keys etc. might not work for you in the guest OS. To fix this problem add the following line to your /etc/vmware/config file. This fixed the problem for me. For more details, see this thread.

xkeymap.nokeycodeMap = true

Set up VMWare networking

You might want to read this first. You’d want to configure your guest OS to use NAT VMWare networking option in order to let it access the Internet, as well as communicate with the host OS. VMWare will use the vmnet8 interface for this purpose. Find out the subnet configured by default on the vmnet8 interface, by VMWare server. On my machine this is The default gateway and the DNS server for your guest OS in that case is You can use DHCP within your guest OS to configure its network and VMWare’s DHCP server will automatically assign IP addresses to your guest VMs. I use static addresses though, which allows me to set up permissions easily using IP addresses in various other configuration files such as Samba etc.

> sudo ifconfig vmnet8
vmnet8 Link encap:Ethernet HWaddr 00:50:56:c0:00:08
inet addr: Bcast: Mask:
inet6 addr: fe80::250:56ff:fec0:8/64 Scope:Link
RX packets:1223 errors:0 dropped:0 overruns:0 frame:0
TX packets:164 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

We could now install other operating systems and test them out too. One of the things that I have been planning to do is to give a NAS distribution such as OpenFiler or FreeNAS direct access to the other 3 physical disks and see which one I can use for managing my disks under RAID protection. But more on that in some later post. For now our setup should begin to look like this:

Virtual Machines

Download and install other apps

Set up your fonts

Go to System > Preferences > Appearance > Fonts. Turn on Sub-pixel smoothing.

Configure your UPS

I have an APC Back UPS RS 800 (as written on the front panel), which the driver identifies as Back UPS BR 800. I tried installing NUT for monitoring my UPS but it did not work. Finally, after much mucking around, I found apcupsd, which was able to monitor my UPS and shut down/hibernate the machine when the power goes off. A detailed guide on configuring it is present here. You can find a shorter one here. The program gapcmon is a great way to keep tabs on the UPS, including past history. My /etc/apcupsd/apcupsd.conf file is shown below, followed by some troubleshooting tips.

> cat /etc/apcupsd/apcupsd.conf | grep -v "#" | grep -v "^$"
LOCKFILE /var/lock
SCRIPTDIR /etc/apcupsd
PWRFAILDIR /etc/apcupsd
NOLOGON disable
EVENTSFILE /var/log/
UPSCLASS standalone
UPSMODE disable
STATFILE /var/log/apcupsd.status

You may have to create the device nodes, which are used by apcupsd to monitor the UPS. To create those devices, use this script:

> sudo /usr/share/doc/apcupsd/examples/make-hiddev
> ls -l /dev/usb/hid
total 0
crw-r--r-- 1 root root 180, 96 2009-04-22 15:33 hiddev0
crw-r--r-- 1 root root 180, 97 2009-04-22 15:33 hiddev1
crw-r--r-- 1 root root 180, 106 2009-04-22 15:33 hiddev10
crw-r--r-- 1 root root 180, 107 2009-04-22 15:33 hiddev11
crw-r--r-- 1 root root 180, 108 2009-04-22 15:33 hiddev12
crw-r--r-- 1 root root 180, 109 2009-04-22 15:33 hiddev13
crw-r--r-- 1 root root 180, 110 2009-04-22 15:33 hiddev14
crw-r--r-- 1 root root 180, 111 2009-04-22 15:33 hiddev15
crw-r--r-- 1 root root 180, 98 2009-04-22 15:33 hiddev2
crw-r--r-- 1 root root 180, 99 2009-04-22 15:33 hiddev3
crw-r--r-- 1 root root 180, 100 2009-04-22 15:33 hiddev4
crw-r--r-- 1 root root 180, 101 2009-04-22 15:33 hiddev5
crw-r--r-- 1 root root 180, 102 2009-04-22 15:33 hiddev6
crw-r--r-- 1 root root 180, 103 2009-04-22 15:33 hiddev7
crw-r--r-- 1 root root 180, 104 2009-04-22 15:33 hiddev8
crw-r--r-- 1 root root 180, 105 2009-04-22 15:33 hiddev9

Update: With Ubuntu 10.04 editing fstab to mount usbfs was not found to be necessary.

Add this line to /etc/fstab:

none /proc/bus/usb usbfs defaults 0 0

Update: With Ubuntu 10.04 it was necessary to edit the file /etc/default/apcupsd and replace ISCONFIGURED=no with ISCONFIGURED=yes, before starting the apcupsd daemon.

Start the UPS monitoring daemon:

> sudo mount -a
> cat /proc/bus/usb/devices
T: Bus=01 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#= 6 Spd=1.5 MxCh= 0
D: Ver= 1.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
P: Vendor=051d ProdID=0002 Rev= 1.06
S: Manufacturer=American Power Conversion
S: Product=Back-UPS BR 800 FW:9.o4 .I USB FW:o4
S: SerialNumber=xxxxxxxxxxxx
C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr= 24mA
I:* If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=00 Driver=(none)
E: Ad=81(I) Atr=03(Int.) MxPS= 6 Ivl=10ms
> sudo /etc/init.d/apcupsd start
> sudo cat /var/log/daemon.log
Apr 25 16:21:01 Zork apcupsd[6015]: NIS server startup succeeded
Apr 25 16:21:01 Zork apcupsd[6015]: apcupsd 3.14.4 (18 May 2008) debian startup succeeded
> ps aux | grep ups
root 5826 0.0 0.0 20824 1024 ? Ssl Apr28 0:01 /sbin/apcupsd
root 12064 0.0 0.0 2964 776 ? S Apr28 0:00 hald-addon-hid-ups: listening on /dev/usb/hiddev0

You might want to reboot once, if it does not start the first time. That should start the hald-addon-hid-ups process, which is necessary for apcupsd to monitor the UPS.

Since a lot of us, unlike the developed world, don’t have the luxury of continuous power, we’d want to shut down or hibernate the machine when the power goes off. I prefer hibernation to shutting down, because any running virtual machines don’t have to be shut down as well when the power goes off. Apart from that, when the power returns, we get our desktop back with any virtual machines running in exactly the state they were in. We could also configure the BIOS to power on the machine automatically when the power returns (if your BIOS has that option). Just follow these ArchLinux instructions to configure UPS-triggered hibernation. These worked for me on Ubuntu 8.10 as well.

Configure your apps

For me this involves downloading the configuration files such as .vimrc, .bashrc etc. that I maintain in a subversion repository online. Other things to do at this point are adding icons for the most frequently accessed apps to the panel and adding any useful applets (especially the hardware temperature monitoring applet, which once saved my graphics card when I found it running at more than 85 degrees). Another thing I like to do is to turn on Compiz from System > Preferences > Appearance > Visual Effects. Then I fire up System > Preferences > CompizConfig Settings Manager and turn on Blur Windows and Window Previews. Also, to turn on Nautilus’s subversion integration scripts, run this command:

> nautilus-script-manager enable Subversion

Update: The vpnc client for Ubuntu 8.10 keeps dropping connections. To fix this, add the following line to the /etc/vpnc/default.conf file:

DPD idle timeout (our side) 0

Setup screen rotation

When working with source code, I like to rotate my screen by 90 degrees, so that I can see more of the source code on my Dell monitor in one go. My Nvidia GeForce 8800 GTS card allows me to rotate the screen in Ubuntu. It’s a shame that this support does not come built into Ubuntu’s display preferences dialog box - yeah, it says it can rotate the display, but it won’t. To add this support, edit the /etc/X11/xorg.conf file and add this line to the Screen section:

Option "RandRRotation" "true"

Then rotate the screen:

> /etc/init.d/gdm stop
> /etc/init.d/gdm start
> xrandr -o left

Setup Firefox

I like to install these add-ons for Firefox:

Install applications in Windows XP Guest

Here is my list of essentials for Windows XP:

Now that you have installed all the basic apps, you’d want to back up your Windows XP VM so that you never need to repeat the Windows XP installation. Copy the contents of the directory containing your virtual machines. Compress this copy and keep it safe somewhere - I am sure you’ll need it someday, Windows being what it is!

Configure Samba in Ubuntu Intrepid 8.10

To share files on the Ubuntu host with the Windows XP guest, you’d want to set up Samba in Ubuntu so that it shares the data folders with the guest. A typical Samba setup is described here. If you need a simpler no-frills setup, find it here. The first link worked for me within a couple of minutes.

Phew! That was a long task list. If you want to find out more about other stuff you could tweak, take a look at these pages as well. This page may already be outdated (Ubuntu 9.04 just got released), but hopefully I won’t be off by a huge margin when I jump to the next release of Ubuntu. For now I am not moving to the next Ubuntu release unless there’s a really compelling reason to do so.

I spent the most time in Ubuntu on setting up the UPS and troubleshooting the installation of VMWare server. That stuff just works in Windows. Add a bunch of utilities on top of that and you are ready to go! I wish it was the same with Ubuntu…


I am travelling. So there may not be any updates for 3 weeks. I may post photographs from my trip to my Flickr hangout though. Just in case you are interested in photography, please do drop in and say hi there.

Plane | Over The Ocean

Singapore | Leaving Hongkong

Recovering photographs from a memory card

I am an avid photographer and generally take a lot of photographs on my outdoor excursions. On my last trip the memory card in my camera became inaccessible while the camera was trying to save a photograph to it. Dozens of photographs taken earlier also vanished without a trace.

The camera’s image viewer simply refused to display any of the earlier images. My heart sank. Not only could I not take any more photographs for the rest of the trip, I had also lost the shots I had already taken. If it wasn’t for a good photograph recovery program for Windows, I’d have lost these photographs.

Recovered Photographs

I had been planning for a while to recover photographs from that SD card. The photograph recovery software I found is one added reason to keep on coming back to Windows, despite all of its flaws. Some programs just work, without too much tinkering, when you have little time to satisfy your inner geek.

Most SD cards use the FAT filesystem, because it’s simple, well understood, and free or embedded driver implementations for it are easily available. I had been trying to find time to tinker with the details of the FAT filesystem, though after finding Zero Assumption Recovery I feel no need to do so.

To cut a long story short, this software recovered almost all of the photographs I had taken. A few could only be partially recovered, but these were too few and unimportant for me to worry about. The program is shareware, but its image recovery portion is free to use.

Apart from the program, you may need a card reader. Your camera connected to the PC with a data cable may also work. I have been using a TECH-COM multiple card reader to access my SD cards ever since I lost my camera’s data cable. The SD card I recovered the data from was a 2GB Transcend card, bought about a year ago. The card had seen very little use; this was only the second time I was using it in my camera. That is a surprisingly short life for an SD card, as far as I know. Two of my other SD cards, a 1 GB card and a 512 MB card, both also from Transcend, continue to work perfectly after more than a year of frequent use.



This is not a complete review of this software. Although it can recover files from NTFS and ext2/ext3 partitions, I did not test drive these features. The program has several other configurable options, but I only evaluated the photograph recovery feature. This is the only free feature of the program; the rest of the features are severely limited.

With the default options the program took 58 minutes to analyze the 2GB SD card in two phases. In the first phase it looks for filesystem metadata blocks; in the second it recovers the data blocks of each photograph file. Out of a total of 592 photographs that could be recovered, 14 were incompletely recovered. This could be because the data blocks of those 14 photographs had been overwritten. Even so, a recovery rate of 98% is remarkable.
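As a quick sanity check on that figure, the rate follows directly from the two counts quoted above. This is only a small sketch; the function name is my own and has nothing to do with the recovery program itself.

```python
def recovery_rate(total, incomplete):
    """Return the percentage of files that were recovered completely."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * (total - incomplete) / total


# Counts taken from the paragraph above: 592 photographs, 14 incomplete.
rate = recovery_rate(592, 14)
print(f"{rate:.1f}% of the photographs were fully recovered")
```

The exact value is a little under 98%, which is where the rounded figure comes from.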

Incomplete & Completely Recovered Photographs

There are, however, several image recovery options for detecting end-of-file, and they determine the speed of data recovery. The dialog box below lets you choose among them. Image analysis is the default. I chose the None option and this cut the recovery time from 58 minutes to 5 minutes. The software was still able to recover all the photographs. Maybe in some corner cases image analysis is the better option, though.

End-of-file detection options for image recovery

The program seems to understand several file formats, including image formats, which means it can be more reliable when recovering these files. The screenshot below shows the dialog box where you can select the file formats which the software tries to validate while recovering the data.

File formats validated

The screenshot below shows the photograph selection dialog after the program had finished analyzing the SD card. Whenever you click on a photograph in this dialog, it shows a preview of that photograph in a separate window, as shown above. One annoying behaviour of this dialog is that it only shows photographs in incremental batches of 100. When you click on the link at the bottom of the tree to show the next 100, it expands the tree and scrolls back to the top. You then have to scroll back down to get to the next photograph. If you don’t keep track of which photograph you were viewing last, you’ll lose track of where to start again.

List of photographs recovered

The program is not without other flaws, however. It crashed after completing the two recovery phases, i.e. after making me wait for more than an hour. It crashed again immediately, with a memory access violation error, as soon as I ran it the second time. I was finally able to get it to recover my files on the third attempt. Three hours of labour, but finally I had the photographs I needed. Had I known about the end-of-file detection option earlier, I’d have been able to recover the photographs much faster.

For a complete walkthrough with screenshots for each step of recovery, check out this tutorial.

Mom, how do I become more creative?

Do you sometimes wonder why some people seem to be more creative than others? How do the creative ones keep coming up with ideas? This post looks at these questions and is biased towards people involved in technological development. After all, as I said before, this blog is about software development.

Getting ideas is the easy part. The difficult part is to execute those ideas. Yet, it’s so difficult to get into the idea-generation mode itself for a lot of people. This post will look at the easy part - how to generate ideas.


“The cure for boredom is curiosity. There is no cure for curiosity.” - Dorothy Parker

The most important pre-requisite for generating ideas is motivation. The various sources of motivation have been studied in different contexts. Broadly speaking there are two sources of motivation - intrinsic satisfaction and external rewards.

Intrinsic satisfaction can come from satisfying your curiosity, taking action to improve yourself in an area, generating new insights from existing knowledge, solving difficult problems, etc.

Similarly external factors such as financial rewards, recognition of achievements, promotions and so on are also instrumental in motivating people.

External Motivation

Intrinsic satisfaction though, is the best form of motivation, since it’ll most likely help you play the creative game longer, before you need a reboot. Nevertheless, whenever you feel you are losing motivation, help is at hand. If you try avoiding these phrases, the motivation will last longer.

Dilbert Motivation - I

Dilbert Motivation - II

Temporary Motivation

Dissatisfaction is a type of intrinsic motivation. It can emanate from the poor quality of a service, the inefficiency with which we are forced to work, financial issues, intellectually unsatisfying work, the failure to identify one’s purpose in life, etc. Whatever the reason, it makes people take action, or at least urges them to.

Curiosity, though, takes the cake as an infinite source of motivation. People who are curious are constantly exploring things, asking themselves questions, looking for stimuli and so on. All of that leads to learning new things which, when combined with existing knowledge or ideas, lead to new ideas or concepts.



In the video below Richard Feynman talks about his early experiences as a boy and it’s striking how curious he was as a child. More interviews and interesting discussions with Richard Feynman are captured in the book The Pleasure of Finding Things Out. Look at what his curiosity got translated into.

The Pleasure of Finding Things Out

In his book Lateral Thinking, Edward de Bono talks about lateral thinking as a way of using information to bring about creativity and insight restructuring. Lateral thinking is an alternative to vertical thinking. It helps to put together or structure information in new patterns. While vertical thinking seeks to select a possible path from several, lateral thinking seeks to generate new paths. The result - new insights. It’s a wonderful book and really echoed a lot of what I have learned along the way, about generating ideas. Highly recommended.

Avoid Distraction

The second pre-requisite for being more creative is a distraction-less zone. I am sure you won’t find yourself going very far with creativity if your wife’s nagging you to get the kitchen sink pipe fixed or change your kid’s nappies. Get that done right now, and find yourself a quiet corner to sit in later. I am sure your television won’t help either. That’s one of the reasons why I haven’t got a television at home.

If you are losing money in the stock market, that won’t help either. Finding a good mutual fund to park that money in can let you forget about the day-to-day vicissitudes of the market. There are other ways to get rich as well.

Automate your bill payments. My cell-phone bill gets paid automatically by ECS through my credit card. One less thing to worry about.

I recently found out that I am losing nearly 5000 bucks every year due to late payments. It’s because I pay by cheque which takes 3-4 days for payment realization. Since I tend to remember the due dates for payments, I end up dropping cheques on the last day, which results in late payments. So I have started getting rid of all my credit cards, which cannot be paid through netbanking. One more thing to worry less about every month. Phew.

The whole point of all that is to get organized and cut down on distractions. The less you worry about the mundane things, the more you can think freely about wonderful new things.

Now after looking at the two essential pre-requisites, let’s take a look at the two essential ingredients for creativity.


Now when you are both motivated and in a distraction-less zone, what do you do? Load yourself with information. Go ahead and read about all the new stuff people are building, discovering, inventing and fooling around with. You need stimulation - the first ingredient to creative output.

Reading is just one way of stimulating yourself. There are several other ways to do so. Take a look at this page by Hugh MacLeod, where he talks about the various inspirations for his early back of the business card cartoons. You may also like to read this article he wrote about how to be creative. The widget below shows off Hugh’s latest adventure with cartooning.

Attend conferences. Be a member of organizations like ACM, IEEE and keep track of all the research being done in your area of interest. Added benefits of membership are access to online journals and sometimes books.

Be on the cutting edge. Look at what’s getting popular on the technology front. Have a list of resources to check out when you are looking for ideas. Know how to find out more information on a topic.

Occasionally, you will find yourself in the midst of information overload. When it’s getting too much to handle, follow the advice in these articles to get the zing back in your life.

Hack Mode

The second ingredient for creative output is getting into something akin to hack mode. Mihaly Csikszentmihalyi (no I don’t know how to pronounce that) talks about the flow state of consciousness in his book Flow: The Psychology of Optimal Experience. Everyone has their own method of getting into hack mode. It’s usually realized when you have managed to achieve the previous three requirements - motivation, a distraction-less zone and some stimuli (a problem you are trying to solve?).

In this state you become oblivious to everything outside the sphere of the problem you are trying to solve. Your mind is awash with images, flow diagrams - a plasma of information, ideas and imagination. It’s almost like a drug-induced state, which turns into pain and distress, if something unexpectedly yanks you out of it.

I tend to get into hack mode during the night. As examples, I have written about 1000 lines of multi-threaded chat server code, the virtual memory implementation of a toy operating system, and a Levenshtein implementation that goes from input to result in 150ms against a database of 2.88 million strings (a week-long hack-mode session), each in a single hack-mode session. Although these did not involve a huge amount of work, considering what some people are able to achieve in a day, they are probably good examples of what is achievable in hack mode.

All that hacking however is preceded by a hack mode for generating or putting together ideas for what needs to be done. It’s similar except that I am not implementing anything. Just thinking, juggling, combining, and toying with different ideas in my mind.

These are the 4 pillars of a creative jumpstart. There are several articles (linked from this post itself) which talk about creativity, being creative, getting into the zone, etc. I have found that almost all the points they mention belong to MASH (motivation, avoid distraction, stimuli and hack mode) in some way or the other.

Just remember MASH and apply it in your own life. I am sure each one of you will take a different approach to achieve it. Find your own way out through the maze. There are several exits.


20 high performance program design related links

This is a list of links to various articles, papers etc. related to high performance program design.

  1. What every programmer should know about memory
  2. There is a lot more to hash functions than they teach you at school.
  3. How to search for the word pen1s in 185 emails every second.
  4. Regular expression matching can be simple and fast
  5. A scalable concurrent malloc implementation for FreeBSD
  6. Eventual consistency
  7. Latency lags bandwidth
  8. High Performance Server Design
  9. Michael Abrash’s Graphics Programming Black Book
  10. Virtual Machine Showdown: Stack vs Registers
  11. The C10k problem
  12. Judy Arrays
  13. Map-Reduce Framework
  14. Amdahl’s Law
  15. Pipelining: An overview - Part I
  16. Pipelining: An overview - Part II
  17. Wikipedia: CPU Cache
  18. Pentium: An Architectural History - Part I
  19. Pentium: An Architectural History - Part II

Ok, so you might be wondering, where’s the 20th link? I lied. You just get 19. In case anyone has more interesting links related to high performance program design, please do leave them in the comments!

Add 42klines search engine to Firefox’s search bar

I wanted an easier way to get to the 42klines search engine I created for UNIX programmers. So I found a few ways to put it on the Firefox search bar (IE7 is supported as well). There are easy ways and one slightly more hands-on, roll-up-your-sleeves, grab-the-tools-and-get-to-work way.

The 3-step Install
First the easy one. Mozilla provides a Firefox extension called Add to Search Bar, which can apparently add any search bar on any website to the Firefox search bar. These are the steps:

  • Install the extension.
  • Right click on the 42klines search bar and select the “Add to Search Bar…” option
  • Change the name of the engine in the dialog box that pops up, if you like and you are done.

More details here. So that was the easy way. You can use this method to add a search bar from any web page to your browser’s search bar. If that worked for you and you are not interested in knowing any more, skip the rest of the post.

Generate Plugin & Install
There is another way to add a search engine to your Firefox search bar: an online search engine plugin generator is available for generating search plugins. Here are the steps to follow:

  1. Register on the website
  2. Go to the search engine page and make a search for the word TEST.
  3. When the search results appear, copy the URL in the browser’s location bar. For the 42klines search engine this is the URL.
  4. Go to the plugin generator page and paste the URL in the “Search URL:” field of the form.
  5. Fill up the rest of the form and create the search engine plugin.
  6. You will be given the option to add the search engine plugin to the search bar. Click on the link and you are done.
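My guess at why you search for a recognizable word like TEST is that it lets the generator spot where the query sits in the URL. Here is a minimal sketch of that idea. The sample URL and both function names are made up for illustration, and I am assuming the marker gets swapped for the OpenSearch `{searchTerms}` placeholder; I have not seen how the generator actually does it.

```python
from urllib.parse import quote_plus

PLACEHOLDER = "{searchTerms}"  # placeholder name used by the OpenSearch format


def to_search_template(sample_url, marker="TEST"):
    """Turn a results URL for the marker word into a reusable template."""
    if marker not in sample_url:
        raise ValueError("marker word not found in the sample URL")
    return sample_url.replace(marker, PLACEHOLDER)


def build_search_url(template, query):
    """Fill the template with a URL-encoded query string."""
    return template.replace(PLACEHOLDER, quote_plus(query))


# Hypothetical results URL, copied after searching for the word TEST.
sample = "http://search.example.com/results?q=TEST&cx=demo"
template = to_search_template(sample)
print(template)
print(build_search_url(template, "signals thread"))
```

Any word works as the marker as long as it does not appear elsewhere in the URL.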

This method is useful for people who provide a search engine of their own. Their search engine will now be listed among all the other search engines on the site, if they made it public. You can also grab the generated plugin’s source file in the OpenSearch XML format from your account on this site. You can then use the source file to provide a link on your own website which triggers JavaScript that installs the search engine in a user’s browser search bar.

Getting your hands dirty
Now how about getting down to brass tacks and doing it the hard way? Mozilla provides a way to easily add search engines from their search-engine add page. But what if the search engine you want to add is not listed among those? Or what if you have created your own search engine and want to allow people to add it to their browser’s search bar through a link on your web page? Read on.

To allow adding your search engine to a user’s Firefox browser search bar from your web page, you need to follow these steps:

  1. Create an icon for your search engine and encode it in BASE64.
  2. Create a search engine descriptor file.
  3. Provide a link on your website which installs the search engine through JavaScript.
  4. Optionally add search engine plugin auto-discovery support on your website.

Older browsers did not support search bars, so you need not worry about them. Firefox 1.5 used another format called Sherlock for adding a search engine to the browser search bar. Newer browsers (Firefox 2.0+ and IE7) use a format called OpenSearch. The above links describe creating a search engine plugin using OpenSearch. OpenSearch allows a lot more than is described in the links in the above steps. Read through its documentation if you are interested in knowing more.
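To give a feel for the first two steps, here is a rough sketch that Base64-encodes an icon and wraps it in a minimal OpenSearch description document. The element names follow the OpenSearch 1.1 format as I understand it. The function name, the search URL and the icon bytes are placeholders of my own, so treat this as an illustration and not as the actual 42klines plugin.

```python
import base64
import xml.etree.ElementTree as ET

OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def build_descriptor(short_name, description, search_template, icon_bytes):
    """Return a minimal OpenSearch description document as a string."""
    root = ET.Element("OpenSearchDescription", {"xmlns": OPENSEARCH_NS})
    ET.SubElement(root, "ShortName").text = short_name
    ET.SubElement(root, "Description").text = description

    # Step 1: the icon is embedded as a Base64-encoded data URI.
    image = ET.SubElement(
        root, "Image", {"width": "16", "height": "16", "type": "image/x-icon"}
    )
    encoded = base64.b64encode(icon_bytes).decode("ascii")
    image.text = "data:image/x-icon;base64," + encoded

    # Step 2: the Url element tells the browser where to send the query.
    ET.SubElement(root, "Url", {"type": "text/html", "template": search_template})
    return ET.tostring(root, encoding="unicode")


# Placeholder values: hypothetical URL and dummy bytes, not a real icon.
PLACEHOLDER_ICON = b"\x00\x00\x01\x00"
SEARCH_TEMPLATE = "http://search.example.com/results?q={searchTerms}"

descriptor = build_descriptor(
    "Example Search", "A hypothetical search engine", SEARCH_TEMPLATE, PLACEHOLDER_ICON
)
print(descriptor)
```

In practice you would read the icon bytes from your real favicon file and host the resulting XML on your own server.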

Mozilla Mycroft Project: Search engine plugins for Firefox & IE7
Mycroft Plugin Generator: Advanced search engine plugin generator
Mycroft Search Engine Submission: Submit plugin to the Mycroft directory (OpenSearch format)
Mycroft Sherlock Submission: Mycroft page for submitting legacy Sherlock plugins
Mozilla Extensions: Create a Firefox extension if you like.
Encode data in Base64: Online tool you can use for encoding the search engine image in Base64.

42klines: A Search Engine For UNIX Programmers

Are you a UNIX programmer? Then this may be very useful to you.

Google offers the ability to create a customized search engine (CSE) which searches a list of sites given by you. I decided to take it for a test drive and ended up with a surprisingly useful search engine customized for UNIX programmers. It currently searches more than 400 websites which are useful for UNIX programmers. You will find its search box, which looks like this, at the top of this blog.

Unix Programmer's Search Engine

Why is it useful?

If you do a Google web search, the search engine cannot immediately identify the context in which you are searching. A keyword such as “signals” can mean different things (traffic signals, hand signals, UNIX signals?). To be useful to everyone, Google returns results from several contexts, where applicable. This means a Google web search can end up wasting your time (you’ll have to filter results manually) while reducing the relevance of results in your context. A CSE, however, searches only the websites you list, so its results stay within your context.

I realized the usefulness of this recently while discussing with a colleague the semantics of signal handling by multi-threaded processes in Linux. The problem we were facing was related to the way gdb was handling signals received by a multi-threaded process we were tracing. We were not sure about the current Linux semantics, so we decided to search. Coincidentally, I had added around 200 websites related to UNIX programming to this custom search engine a few days earlier, so I decided to give it a test drive. I searched for “signals thread”. The top 5 results from the CSE gave me more than I needed to know about Linux signal handling in multi-threaded processes. I compared the results with Google web search and found that a very good article on this topic was not present at all in the first few pages of the web search results! Moreover, almost all of the CSE’s results on the first page were directly relevant to what I wanted to know, while the quality of the web search results wasn’t that high.

The results on the web weren’t that bad, but they were not the best either. Google has done a good job with the custom search engine offering. Take a look at the results from the first page of web search below. Then try the 42klines search. Do you see the difference?

Results of Web Search

No! I am not interested in girls who give me mixed signals.

What websites does 42klines search?

For a start I have seeded the engine with more than 400 websites which can be useful to UNIX programmers. They loosely fall in the following categories:

  1. Research organizations (IEEE, ACM, Citeseer etc.)
  2. UNIX/Programming Magazines (DDJ, Linux Journal, LWN, KernelTrap etc.)
  3. Forums (Interesting Google groups etc.)
  4. OS development resources (NonDot, Sandpile, x86 etc)
  5. Bookmarking sites (Reddit,
  6. Free web hosted books (Linux Device Drivers, OpenBookProject etc.)
  7. Document hosting sites (Scribd, Wikipedia, Linux HOWTOs etc.)
  8. Blogs and personal websites hosting useful programming information (Robert Love, Ulrich Drepper etc.)
  9. University courses available online and useful for UNIX programmers (MIT Open Courseware etc.)
  10. Application hosting/indexing websites (Sourceforge, FSF etc.)
  11. Conferences (USENIX, Linux Conferences etc.)
  12. Miscellaneous pages

Can I put this search engine on my own website?

Yes, you can easily do that. The search engine hosted on this website is a linked CSE. Another flavor of it, called the stored CSE, is hosted in Google’s databases. The differences between the two flavors are detailed later in the post. You can easily add the stored CSE flavor to your iGoogle page as a gadget. You can download this code to put the 42klines search engine on your blog or website, and customize the look in whatever way you want. The search results are hosted on a page on this website, because that page requires another snippet of code from Google. If you want to host the results on your own website, let me know and I’ll provide the code necessary to do so. You can skip the rest of the post if you are not interested in how the search engine works. If you want to add your own bookmarks useful for UNIX programmers to the 42klines search engine, read on. A few useful resources are listed at the end of this post.

Can I add my own bookmarks to 42klines?

Whenever I find good links or websites useful to me as a UNIX programmer, I plan to add them to this search engine for everyone’s benefit. The list of websites currently indexed is given to Google in an annotation file in XML format. The annotation file for the 42klines search engine is hosted in a Subversion repository on Assembla, which hosts subversion repositories for projects. If you are interested in adding more links to 42klines, send a mail to me at sudhanshu.goswami at 42klines dot com. I’ll send you an invite from Assembla. Check out the 42klines search engine’s websites list by running this command:

svn checkout

If you prefer GUIs, you can also use RapidSVN on Linux to do the same. The 42klines search engine on this website is a linked CSE. It has a stored CSE flavor as well. The differences between the two flavors are detailed in the next section. The list of websites to be searched is maintained in a different way for each flavor. Going forward, I plan to update the linked CSE first and periodically bring the stored CSE in sync with it. I maintain two flavors because it is easy to add the stored CSE to iGoogle as a gadget.

Custom Search Engine Flavors

The table below describes the differences between a linked CSE and a stored CSE.

Stored Custom Search Engine | Linked Custom Search Engine
Can be built using wizards hosted here. | Metafiles can only be created manually.
Websites searched are stored in Google's database. | Websites searched are stored in an annotation file hosted on your server.
Websites added to search engine database get immediately reflected in the search results. | Websites added to annotation files will get reflected in the search results on the next refresh by Google. To immediately refresh or test annotation file, you can use this tool.
Maximum number of sites = 5000. | Multiple annotation files allowed. Each file's max size = 3MB. Total file sizes <= 10 MB.
Get their own Google hosted web pages like this. | No home page for a linked CSE created on Google. You can create your own home page for it.
People can volunteer to contribute from a stored CSE's home page. | This option is not available for a linked CSE.
Restricted in number of things possible. | Be creative. You can customize your annotation files on the fly. How? You can switch from a stored CSE to a linked CSE like this.
Google provides links to add this kind of an engine easily to your blog or iGoogle home page. E.g. use this to add it to your iGoogle page. | Linked CSE has to be manually added to a website. E.g. Linked CSE flavor of 42klines search engine can be added by downloading and adding this piece of code to your website.

Getting your hands dirty

This section is just a blurb about things to know, while working with Google’s custom search engine. I’ll list them down pointwise.

  1. Opera’s latest version does not seem to be supported. Some features like saving options for the search engine worked, but the “Save” button got permanently disabled after saving. These kinds of problems may occur if you are using uncommon browsers. YMMV.
  2. I tried to replace the context file of the stored search engine with that of the linked search engine using the Advanced tab of the search engine’s wizard interface, however it did not work. So, no home page for the linked CSE could be created on Google.
  3. If you are not trying to customize a search engine in non-traditional ways, and just want a search box for your blog/homepage, you are better off sticking to a Stored custom search engine. However, if you have got special needs or have more than 5000 websites to search, you’ll have to use a linked search engine.
  4. Google’s custom search engines can be customized to a great extent to give highly targeted results. This can be achieved by assigning topics to websites and labeling them. Labels can be used to tweak the search results in the favor of websites stamped with a particular label or completely provide search results only from websites stamped with that label. Further a boost factor can be associated with websites to boost search results from them. You can refer to this CSE glossary, if you are having trouble following these terms.
  5. Google’s management interface for stored CSEs does not provide the ability to assign labels, set boost strengths for some websites, add filters, create nested search engines etc. You can do all of these with stored CSEs, but you will have to first download the annotation file for the stored websites and the context file for your stored search engine. Then you will have to edit them manually and upload them. This can be done from the Advanced tab of the management interface.
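If you would rather generate an annotation file than edit one by hand, something along these lines could work. This is a sketch under assumptions: the element and attribute names (Annotations, Annotation, about, score, Label) reflect my understanding of Google's annotation format and should be checked against the CSE documentation, while the function name, the label name and the site patterns are made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_annotations(sites, label):
    """Build an annotation document from (url_pattern, score) pairs.

    A score of None means no boost attribute is written for that site.
    """
    root = ET.Element("Annotations")
    for pattern, score in sites:
        attrs = {"about": pattern}
        if score is not None:
            attrs["score"] = str(score)  # assumed boost factor attribute
        annotation = ET.SubElement(root, "Annotation", attrs)
        # The label ties the site to a particular search engine.
        ET.SubElement(annotation, "Label", {"name": label})
    return ET.tostring(root, encoding="unicode")


# Hypothetical sites and label, purely for illustration.
sites = [
    ("www.example.org/*", 1),       # boosted site
    ("docs.example.com/*", None),   # no boost
]
xml_text = build_annotations(sites, "_cse_example")
print(xml_text)
```

The output can be saved to a file and kept under version control alongside the rest of the websites list.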


42klines CSE: Download code to put on your website here
42klines iGoogle gadget: Add this search engine to your iGoogle page
42klines subversion repository
Coopdir: Directory of custom search engines.
GooglePicks: Picked custom search engines by Google.
RubyCorner: A custom search engine for Ruby programmers.
Python CSE: A custom search engine for Python programmers.
Linux: A custom search engine for linux users created by a sysadmin.

Update: Some cleanups done to the post. Added a table of contents, but unfortunately the anchor links did not work as expected. Still trying to figure out how to fix this. [Mar 2: Fixed. At the cost of breaking previous permalinks. Please update any bookmarks to permanent links. This site is going through some initial growing pains.]