
Saturday, 26 December 2009

A Ruby script to search bookstores online

I started dabbling in Ruby some weeks back. The initial interest was sparked after reading "Treating Code as an Essay" (Yukihiro Matsumoto) - one of the chapters in Beautiful Code. So I started doing these bootstrapping exercises in Ruby. Some of the exercises are good - but nothing beats doing a small project to learn a new language.

I buy a lot of books, mostly online. There are a few good online bookstores in India, notably Flipkart.com, Infibeam.com and Indiaplaza.in (Sadly, Amazon does not have full-fledged shipping to India yet). The way I usually search for a book in online bookstores is (was, till now)

  1. Go to books.google.com and enter the book title
  2. Click on the best match
  3. Click on 'All Sellers' on the left of the page
  4. The Indian bookstores are usually listed towards the bottom. It does not include all stores, and sometimes the prices are not listed. I have to go to each individual site and check them out.

I wanted to collapse these steps into one - a simple script that would accept the name of the book and show results from all these bookstores, with comparative pricing. And the result was this

http://github.com/talonx/book-search

It's in Ruby, runs from the command line and writes the output to an HTML file called 'search.html' in the same directory. Much needs to be done, like

  • Price based listing with the lowest on top
  • A web interface for the search
  • Add more bookstores - it's only Flipkart.com, Infibeam.com, Indiaplaza and Bookadda.com right now.

To run the script, type this (you need Ruby 1.8.x, available from http://www.ruby-lang.org/en/downloads/ and the Hpricot HTML parser library, available from http://github.com/whymirror/hpricot)
ruby lib\book-search.rb "<book title (in quotes if it has spaces)>"
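
The first item on the to-do list, price based listing, could look roughly like the sketch below. This is not the code in the repository: the Offer struct, the sort_by_price and render_html helpers, the sample prices and the example.com URLs are all made up for illustration, and it assumes a newer Ruby than 1.8.x.

```ruby
require "erb"

# Hypothetical record for one store's offer; not taken from book-search itself.
Offer = Struct.new(:store, :title, :price, :url)

# Cheapest offer first; offers with no listed price go to the end.
def sort_by_price(offers)
  priced, unpriced = offers.partition { |o| o.price }
  priced.sort_by { |o| o.price } + unpriced
end

# Render the offers as a minimal HTML table, escaping all text fields.
def render_html(offers)
  esc = ->(s) { ERB::Util.html_escape(s) }
  rows = offers.map do |o|
    price = o.price ? format("Rs. %.2f", o.price) : "n/a"
    "<tr><td>#{esc.call(o.store)}</td>" \
      "<td><a href=\"#{esc.call(o.url)}\">#{esc.call(o.title)}</a></td>" \
      "<td>#{price}</td></tr>"
  end
  "<html><body><table>\n#{rows.join("\n")}\n</table></body></html>\n"
end

if __FILE__ == $PROGRAM_NAME
  # Invented sample data, only to show the ordering.
  offers = [
    Offer.new("Flipkart.com", "Beautiful Code", 525.0, "http://example.com/a"),
    Offer.new("Infibeam.com", "Beautiful Code", nil, "http://example.com/b"),
    Offer.new("Bookadda.com", "Beautiful Code", 499.0, "http://example.com/c")
  ]
  puts render_html(sort_by_price(offers))
end
```

Putting unpriced results last keeps them visible without letting a missing price pose as the cheapest one.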

Saturday, 12 December 2009

Do that side project

Do that side project.

How many times have you told yourself

  • I'll start that open source project I've been thinking of
  • I'll write that utility which will make my job easier
  • I'll enroll for that course on Artificial Intelligence and write that amazing recommendation system

and then did nothing?

Well, guess what. Time passes. Yes, really.

Annie Dillard said

"How we spend our days is, of course, how we spend our lives."
Think about that for a moment.

Don't waste time on thinking about when to think about planning to think about thinking about when to start thinking about doing it. Do it now.

Here are some more resources on the subject -

  1. Shut up and Hack - http://www.slideshare.net/bluesmoon/shut-up-and-hack
  2. Do it Now - http://www.stevepavlina.com/articles/do-it-now.htm
  3. Do it Fucking now - http://seoblackhat.com/2007/01/29/do-it-fucking-now/
  4. Chris Wanstrath's keynote - http://gist.github.com/6443

Monday, 7 December 2009

Adding MySQL server instances using mysqlmanager

The MySQL instance manager - mysqlmanager - provides a way to manage multiple MySQL server instances on the same installation. All these instances use a common my.cnf file - but each can be configured individually (using the same file). mysqlmanager itself provides a command line interface to control the individual instances.

Part of a sample my.cnf with multiple MySQL instances

[mysqld1]
user = mysql
datadir = /data/mysql-1
socket = /tmp/mysql-1.sock
port = 3306

[mysqld2]
user = mysql
datadir = /data/mysql-2
socket = /tmp/mysql-2.sock
port = 3307

The ability to set up multiple database servers quickly is particularly useful on development boxes where fresh DBs need to be created often, as is the case in my team. Every time a new DB has to be set up, we have to go through the steps of creating a datadir, installing the system tables, adding a root password, adding the entries to the my.cnf file and starting the instance using the mysqlmanager shell.

So I whipped up a small Linux shell script which automates this process.

Here it is.

It's still quite primitive - but it works!

Usage is simple -
add-mysql-instance.sh mysql config-file-location datadir groupname username password instance port instance-name mysqlmanager-user mysqlmanager-password mysqlmanager-socket-file

Of course, mysqlmanager has to be running for this to work.

I'll be adding improvements to this script - like the ability to generate a mysql instance name based on existing instances (instance names are usually mysqld1, mysqld2 etc), picking up the user name from the file itself etc.
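
The instance-name improvement could be sketched like this. It is written in Ruby rather than shell and is not part of add-mysql-instance.sh; next_instance and instance_section are hypothetical helpers, and the datadir and socket paths simply follow the pattern of the sample my.cnf above.

```ruby
# Given the text of a my.cnf that uses [mysqldN] section names, return the
# next free instance name and port. Assumption: every "port = N" line in the
# file belongs to a server instance, so the next port is the highest one plus 1.
def next_instance(cnf_text, base_port = 3306)
  numbers = cnf_text.scan(/^\[mysqld(\d+)\]/).flatten.map(&:to_i)
  ports = cnf_text.scan(/^\s*port\s*=\s*(\d+)/).flatten.map(&:to_i)
  name = "mysqld#{(numbers.max || 0) + 1}"
  port = ports.empty? ? base_port : ports.max + 1
  [name, port]
end

# Build the config section for a new instance, following the layout of the
# sample file (datadir under /data, socket under /tmp).
def instance_section(name, port, user = "mysql")
  number = name[/\d+\z/]
  <<~CNF
    [#{name}]
    user = #{user}
    datadir = /data/mysql-#{number}
    socket = /tmp/mysql-#{number}.sock
    port = #{port}
  CNF
end

if __FILE__ == $PROGRAM_NAME
  sample = "[mysqld1]\nport = 3306\n\n[mysqld2]\nport = 3307\n"
  name, port = next_instance(sample)
  puts instance_section(name, port)
end
```

A real version would also need to check that the chosen port and datadir are not already in use outside the config file.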

Friday, 11 September 2009

India Needs an AntiSpam Law

The Problem
I dread it whenever I have to enter my email address at an Indian ecommerce site. It's mandatory if I am buying something, and I do it reluctantly. After the product is bought, I go to the My Account link, if the site has one, and unsubscribe from all marketing notifications (because most of the time they do not bother to tell you, when you register or enter your email address, that you have been autosubscribed to such mails).
Note that I do not mind receiving notifications from system administrators and mails related to the delivery of the product I bought. But I do not want to keep on receiving general mails about things I am not interested in.

The inevitable happens after a couple of weeks. I get emails from the site offering me discounts on new products, new deals; in short, commercial email. Unsolicited – because I did not opt in. And in some cases I opted out explicitly. In other words, Spam. Some of these mails have an Unsubscribe link at the bottom. After you have apparently 'Unsubscribed' using the said link, one of the following things happens -

1. Similar mails keep coming, with the same Unsubscribe link. Most of these links are just mailto: links as opposed to http: links. An http: link usually means mailing list manager software is in use, which actually works. But a mailto: link more often than not means that somebody has to do the removal manually. Which does not happen.

2. The Unsubscribe mail bounces. Either because the Unsubscribe mailbox does not exist (Surprise!) or it has exceeded its quota because people keep on Unsubscribing and nobody reads or deletes them (Surprise!)

Here are some sites that do not have a working Unsubscribe link in their emails. All my efforts to Unsubscribe from their unwanted mails have failed. Most of these are commercial sites I use regularly.

http://www.sulekha.com
http://www.pvrcinemas.com
http://www.citibank.co.in (These guys take the cake as far as repeated requests to remove my address, repeated responses that they have done so, and the sorry-sir-it-won't-happen-again routine are concerned)
http://www.siliconindia.com
http://www.indiaplaza.in
http://www.bookmyshow.com

At this point I would distinguish between two kinds of spamming -

1. The kind I describe above. You cannot mark these as spam, since you might get legitimate mails from the same address in future (such as a confirmation when you buy another product) and cannot afford to miss them.

2. The 'normal' spam that lands in your junk mail folder every day. All mail providers detect and mark it as spam automatically. It is sent by people whose only job is to spam others, usually sitting in a country whose laws are lenient enough to allow it.

To start with, ecommerce sites need to understand that giving my email address for a necessary purpose does not entitle them to send any email to that address.

My email address has a privacy status similar to my telephone number.

It’s like calling up someone every week with irrelevant news just because you happen to have their phone number. (On a related note, the Indian NDNC – National Do Not Call Registry – is a step in the right direction as far as controlling whom telemarketers in India can call is concerned).

How do other countries deal with this?

Almost all progressive countries have laws and directives dealing with this explicitly.

EU : http://en.wikipedia.org/wiki/Directive_on_Privacy_and_Electronic_Communications
Aus : http://www.dbcde.gov.au/online_safety_and_security/spam
NZ : http://www.dia.govt.nz/DIAwebsite.nsf/wpg_URL/Services-Anti-Spam-Index
US : http://en.wikipedia.org/wiki/CAN-SPAM_Act_of_2003

Here is a more comprehensive list maintained by SpamLinks.
http://spamlinks.net/legal-laws.htm#country

More…

Then there are the ISPs (Internet Service Providers).

I have a Tata Indicom broadband connection. From time to time, these guys feel I need to know about their latest antivirus offerings, or some cool deal they have for the festive season. These mails don't even have an Unsubscribe option. When I call them up and ask to be removed from receiving these mails, the customer service people are initially clueless, and on further pressing inform me that these mails are to keep me informed. Er, what? And what if I don’t want to receive them? They say they cannot remove my email.

India needs an enforceable AntiSpam law, and now.

The Indian IT Act of 2000 and its 2008 Amendment:

Disclaimer: I am not a lawyer nor do I claim to understand law well. The views below are based on a reading and an attempt to understand publicly available documents.

The only section in the Indian IT Act – the only law in the country that deals with cyber offences – that I could find dealing with unwanted email is Section 66(A).
"any electronic mail or electronic mail message for the purpose of causing annoyance or inconvenience or to deceive or to mislead the addressee or recipient about the origin of such messages"

Section 66(A) does not even begin to address the spam problems I describe above.

Either the existing law needs to include sections for dealing more specifically with spam or we need a standalone set of laws for making this kind of unsolicited email criminally prosecutable.

Sunday, 30 August 2009

Wondering about the state of Java Developers

A friend of mine forwarded this article by Yakov Fain on sys-con.com -

http://in.sys-con.com/node/1040135

The essence of the article is this

The author interviewed a lot of people for developer positions, and most of those who call themselves Java developers and cite extensive experience in J2EE lack basic knowledge of core Java.

This might sound suspiciously like a gross generalization, but I believe that's not the case. I had a similar experience when I interviewed people for developer positions on my team last month. The position called for both Java and Javascript experience. These are the things I encountered -

  • Most people who have worked solely on services (read outsourced) projects list all J* technologies on their resume, but know very little about Java programming in depth.
  • There are people who lack any kind of programmer mentality or skills at all and put their current role as something like Programmer Analyst, and this fact cannot be ascertained from their resume alone. They often try to highlight other (non-software development) achievements.
  • SCJP certification is no guarantee that a person can code in Java (Surprise? Not at all)
  • There are people who have 3.5 years of experience, with multiple services projects under their belts, and familiarity with a host of technologies, who cannot write a Java class which will print out the prime numbers between 0 and 100.
  • Most core CS concepts are forgotten after 2-3 years of working in services projects.

Please note that I am not generalizing, but these facts do indicate a problem somewhere. These developers actually represent a very small, distinct sample of the worldwide developer community, since all my interviews were done in India (both face to face in my Hyderabad office and over the phone).

Another interesting point I noted was that most non-Javascript developers think that Javascript is used only for form validation. Such usage also qualifies as 'extensive Javascript knowledge' in their resumes.
What should I conclude from this? Is this malaise widespread in other parts of the world as well? Is it specific to developers in India working on outsourced projects? (No, as the article by Yakov Fain shows) Is it a result of outsourcing, leading to a lack of innovation? Or is the innovation there, but the signal-to-noise ratio too low?