Saturday, May 24, 2008

Twitter and TinyURL

Twitter is now probably as ubiquitous as blogging. It’s being used almost as a real-time blogging platform, which is cool – I love Twitter.

If Twitter is ubiquitous – so is TinyURL. TinyURL is a service that will shorten long URLs and make them look good. If you use Twitter, you will know how common it is to find http://tinyurl.com/64f3j9 kind of URLs. TinyURL uses a 6-character base-36 code to represent a URL, which means it can shorten as many as 36^6 = 2,176,782,336 URLs – they claim 74 million shortened URLs at the moment.
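The keyspace figure is easy to check. Here is a minimal Python sketch, assuming the codes use the digits 0-9 plus lowercase a-z and that a code is simply a counter written in base 36; TinyURL's actual ID scheme is not documented here, so the function names and the sequential-ID idea are my own illustration.

```python
import string

# Assumed alphabet: 0-9 followed by a-z, 36 symbols in total.
ALPHABET = string.digits + string.ascii_lowercase


def keyspace(length: int, base: int = len(ALPHABET)) -> int:
    """Number of distinct codes that are exactly `length` symbols long."""
    return base ** length


def encode_base36(n: int) -> str:
    """Write a non-negative integer as a base-36 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 36)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def decode_base36(code: str) -> int:
    """Inverse of encode_base36; int() already understands base 36."""
    return int(code, 36)


print(keyspace(6))                                 # 2176782336
print(encode_base36(74_000_000))                   # code for ID 74,000,000
print(f"{74_000_000 / keyspace(6):.1%} of the 6-character space used")
```

Under these assumptions, 74 million URLs would occupy only a few percent of the 6-character space.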

I did some quick research on how many times TinyURL is used on Twitter, and compared it with 12 of its competitors. The following chart compares the number of times each URL-shortening service is used on Twitter (it's a logarithmic chart):


I was wondering – How is TinyURL making money?

I see that there are ads on the page, but the ads don’t seem very relevant. In fact on the right side (below the fold) there is a Google-link-unit that shows competitors’ ads, such as "Cloak Urls", "URL Redirecting" and "Smaller URL".

1. It would be cool if TinyURL.com showed ads relevant to the previous page the user came from. Typically, I browse a page, find the content useful, and then go to tinyurl.com with the intention of making the URL shorter. So, if TinyURL showed ads related to that previous page, they would be a lot more relevant, and I would be more likely to click on them. From a technical standpoint – this is not going to be very easy, but it should be possible.

2. Also, it might be better if TinyURL put the “donate” links in more relevant places, and maybe added some logic such as: if this is a repeat user, be more proactive in asking for a donation (modified screenshot attached). If I were using TinyURL for the 4th time in a day, I would definitely like to donate some money to TinyURL. In the screenshot below, when somebody clicks on "Donate money & Make TinyURL", TinyURL gives the options of PayPal and Amazon payments:


Great service – TinyURL + Twitter.


Thursday, May 08, 2008

Storage shipments - 1 Yottabyte a year!

Stephen Lawson reports on Computerworld - Demand for storage is doubling every 18 to 24 months, and within five years, Roberson expects to see a "yottabyte year" when the industry as a whole ships 1 yottabyte (a quadrillion gigabytes), or 1,000 zettabytes, of storage capacity.
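The unit prefixes are easy to mix up, so here is a small Python check using decimal (SI) units. The table and the helper name `convert` are my own. The point to note is that a billion gigabytes is one exabyte, while one yottabyte is 1,000 zettabytes, or 10^15 gigabytes.

```python
# Decimal (SI) storage units, expressed in bytes.
UNITS = {
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
    "EB": 10**18,
    "ZB": 10**21,
    "YB": 10**24,
}


def convert(value, src, dst):
    """Convert `value` from unit `src` to unit `dst`."""
    return value * UNITS[src] / UNITS[dst]


print(convert(1, "YB", "ZB"))      # yottabyte expressed in zettabytes
print(convert(1, "YB", "GB"))      # yottabyte expressed in gigabytes
print(convert(10**9, "GB", "EB"))  # a billion gigabytes expressed in exabytes
```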

HP’s new HP StorageWorks 9100 Extreme Data Storage System (ExDS9100) has a base configuration that will consist of four blade servers and three storage blocks, with 246TB of storage. Customers will be able to add either type of capacity independently of the other. With two racks, a system can have as much as 820TB of storage capacity.

The ExDS9100 is scheduled to ship in the fourth quarter. HP predicted that it will cost less than $2 per gigabyte in a typical configuration.
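To put "less than $2 per gigabyte" in perspective, here is a rough upper bound on hardware cost for the two capacities mentioned. It assumes decimal terabytes and that the $2/GB ceiling applies to both configurations, which HP only stated for "a typical configuration"; the function name is mine.

```python
PRICE_CEILING_PER_GB = 2   # dollars; HP's predicted upper bound
GB_PER_TB = 1000           # assumption: decimal terabytes


def cost_ceiling(capacity_tb):
    """Upper bound on cost in dollars for a system of `capacity_tb` terabytes."""
    return capacity_tb * GB_PER_TB * PRICE_CEILING_PER_GB


for label, tb in [("base configuration", 246), ("two racks", 820)]:
    print(f"{label}: {tb} TB -> under ${cost_ceiling(tb):,}")
```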


Wednesday, May 07, 2008

How to - Convert Word Docs to Web Pages

Useful ...

Convert Word Docs to Web Pages

http://howto.wired.com/wiki/Convert_Word_Docs_to_Web_Pages

Microsoft Word has its place, but that place isn't the web. If you've ever tried to convert a Word document to an HTML document, you know that Word's built-in tools can have disastrous results -- bloated files, proprietary markup and exposed personal information are among the gems you'll get with Word's "Convert to HTML" function.

To get to a semi-sane starting point, try using Word's "Save As: Web Page, Filtered" rather than the regular web page option. This will strip out many of the proprietary tags and won't include potentially personal and revealing info contained in the File Properties dialog.

TinyMCE

Another viable option is TinyMCE, a JavaScript Rich Text Editor that offers a "Paste from Word" option. Paste From Word is intended to be used by those who would like to just "Select All" in Word and paste the content into TinyMCE. Depending on the complexity of your document, TinyMCE may be able to fix some of Word's styling quirks and output usable HTML.

Textism

The good folks over at Textism have a tool that will, to quote the Textism website, "strip Microsoft's proprietary tags and other superfluous noise from Word-generated HTML documents." The results are not only much closer to standards compliant web markup, they also create much much smaller, quickly loading pages.

I used Textism to convert this document from Word to clean HTML. I think it did a pretty good job.


Tuesday, April 29, 2008

Performance Showdown - SSDs vs. HDDs

Very interesting data posted on Slashdot:

"Computerworld compared four disks, two popular solid state drives and two Seagate mechanical drives, for read/write performance, bootup speed, CPU utilization and other metrics. The question asked by the reviewer is whether it's worth spending an additional $550 for a SSD in your PC/laptop or to plunk down the extra $1,300 for an SSD-equipped MacBook Air? The answer is a resounding No."

Surprising performance results

“I used HD Tach to test the drives' performance -- and got some interesting results. It was the mechanical Momentus drive (non-SSD) that scored the highest burst speed at 214.3MB/sec. The Crucial SSD came in second at 137.3MB/sec., but the desktop Barracuda (non-SSD) and its 135MB/sec. clung to its heels. Advanced Media's Ridata drive trailed the pack at a leisurely 71.2MB/sec. While the two mechanical drives and the Ridata SSD posted average reads in the 54MB-to-55MB/sec. range, Crucial forged ahead at 120.7MB/sec.

SSDs are highly praised for their boot speed, so I would have been remiss had I relied solely on a standardized test. The results were a bit surprising. Crucial's SSD and the two Seagate devices all required 39 to 40 seconds to cold boot to the desktop. (There are a few minutes of behind-the-scenes activity during a Vista boot, but I determined that the boot was complete once the Windows sidebar appeared.) Ridata did best of them all, with a boot time of 32.1 seconds, although that's hardly the blazing speed you might expect from an electronic versus a mechanical device.”

Moving data

“Finally, because these SSDs have a comparatively small capacity, it's most likely that you will be transferring data from your laptop after a day's work. So I took 4,666 files and folders (a total of 8.05GB) and copied them to the drives and then copied them from those drives. I used the same secondary drive as source and destination in all cases.

Neither of the SSDs fared very well when having data copied to them. Crucial needed 243 seconds and Ridata took 264.5 seconds. That's over four minutes. The Momentus and Barracuda hard drives shaved nearly a full minute from those times at 185 seconds. In the other direction, copying the data from the drives, Crucial sprinted ahead at 130.7 seconds, but the mechanical Momentus drive wasn't far behind at 144.7 seconds. Ridata and the Barracuda were third and fourth at 156.8 and 166 seconds, respectively.”
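The copy times are easier to compare as throughput. Here is a small Python sketch using the figures quoted above; it assumes 1 GB = 1024 MB (the review does not say which convention it used), and the function name is mine.

```python
DATA_GB = 8.05  # size of the copied file set, from the review

# Seconds reported in the review: (copy to the drive, copy from the drive)
COPY_TIMES = {
    "Crucial SSD": (243.0, 130.7),
    "Ridata SSD": (264.5, 156.8),
    "Momentus HDD": (185.0, 144.7),
    "Barracuda HDD": (185.0, 166.0),
}


def throughput_mb_s(gigabytes, seconds, mb_per_gb=1024):
    """Average transfer rate in MB/s for `gigabytes` moved in `seconds`."""
    return gigabytes * mb_per_gb / seconds


for drive, (to_s, from_s) in COPY_TIMES.items():
    to_rate = throughput_mb_s(DATA_GB, to_s)
    from_rate = throughput_mb_s(DATA_GB, from_s)
    print(f"{drive:14s} write {to_rate:5.1f} MB/s   read {from_rate:5.1f} MB/s")
```

With thousands of small files, these rates reflect per-file overhead as much as raw drive speed, which is why they sit well below the HD Tach numbers.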


More details here.


Sunday, April 27, 2008

FaceBook server infrastructure

Om Malik reported on Gigaom.com:
The company is running around 10,000 servers, according to Data Center Knowledge, citing comments made by Facebook VP of technology, Jeff Rothschild, at a recent MySQL user conference. (See video of the panel.) Of the 10,000 servers, 1,800 are from MySQL and around 805 of them are memcached servers. In order to house its sprawling infrastructure, Facebook has leased data center space from DuPont Fabros in Ashburn, Va., and Digital Realty Trust in Santa Clara, Calif., DCK reports.

How much is Facebook spending on its infrastructure? The company isn’t going to tell us, but there are clues. Server and storage company Rackable today reported first-quarter 2008 sales of around $69 million. Facebook is one of its largest customers, accounting for around 10 percent of Rackable’s sales (that number could be higher, but we’ll have to wait for Rackable’s 10-Q to get a clearer picture), so some quick, back-of-the-envelope math reveals $7 million in spending by the social networking company. A well placed source of mine just let me know that Facebook is going to spend over $9 million more on servers this year. That should be good news for Rackable. Next on my list is an estimate of Facebook’s bandwidth and data center costs.

Doing a little more calculation: $7M for 10,000 servers means about $700 per server (a rough figure, since the $7M is a single quarter's spend); assuming 10% of the cost goes to F5, firewalls, etc., we are at $630 per server. That's a pretty heavy-duty server, I think.
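The back-of-the-envelope math can be written out as a few lines of Python. All inputs are the figures quoted above; spreading one quarter's spend over the whole fleet is this post's simplification and not something Facebook or Rackable reported.

```python
RACKABLE_Q1_SALES = 69_000_000   # dollars, first-quarter 2008
FACEBOOK_SHARE = 0.10            # roughly 10 percent of Rackable's sales
SERVER_COUNT = 10_000
NETWORK_OVERHEAD = 0.10          # assumed share spent on F5, firewalls, etc.


def cost_per_server(total_spend, servers, overhead):
    """Spend per server after setting aside a fraction for network gear."""
    return total_spend / servers * (1 - overhead)


estimated_spend = RACKABLE_Q1_SALES * FACEBOOK_SHARE
print(f"Estimated Facebook spend: ${estimated_spend:,.0f}")  # rounded to $7M in the post
print(f"Per server: ${cost_per_server(7_000_000, SERVER_COUNT, NETWORK_OVERHEAD):,.0f}")
```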


Thursday, April 03, 2008

Komli to represent eBay India for all their ad sales worldwide

Komli and eBay India have entered into an exclusive partnership whereby Komli will represent eBay India for all their ad sales worldwide. In addition, Komli's ad network optimization technology PubMatic will optimize eBay's unsold ad space for maximization of revenue.

This is very exciting news for a couple of reasons:
1. A global internet giant has chosen to partner with an Indian startup for its superior understanding of online advertising and online advertising technology,
2. This bodes well for the growth of online advertising in India -- large portals, which in the past have not looked at online advertising as a key revenue driver, are starting to do that now.

For details see official news release at - http://www.komli.com/news/ebaypress.php .


Sunday, March 30, 2008

Iframe ad-tag vs. Script ad-tag: Online advertising tag type comparison

This is a list that I have discussed many times with friends; however, I never found these points in a single place, so here you go ...

Differences between iframe tag and script tag:
  1. Iframe tag does not delay the loading of the web-page elements: Iframes usually load in parallel. For example, if a page has several elements such as images, CSS, JavaScript files and HTML tags, and the ad-tag is embedded as an iframe, the iframe loads in parallel and does not slow down the page load. So, if you want the page to load faster, use iframe tags.
  2. Script tag does not change the “referrer” property of your ad-tag: If your ad-tag is served from inside an iframe, the ad-network that serves the ad will see a referrer different from your page URL/domain. If you use a script tag, on the other hand, the referrer URL remains your page URL, and therefore your domain name. Some ad-networks require that an ad be served from the same domain it was created for, so they will not work with iframe tags (they will not serve ads). Most ad-networks, however, allow you to set a “site-alias”, an additional domain from which the ad may be served. Read more about the referrer property here.
  3. Script tag works better for ad-networks that do contextual analysis of the content of the page: If you use iframe tags, ad-networks cannot look outside the iframe, so they cannot do on-the-fly contextual analysis of the page contents and may serve irrelevant ads. Read more about contextual analysis here.
  4. If there is more than one ad from the same ad-network, and you are using iframe tags, these ads may not be able to communicate amongst themselves since the scope of the JavaScript variables is within an iframe. Therefore if an ad-tag sets a JavaScript variable, which the other ad-tag on the same page is expected to read, this will break if you use iframe tags.
  5. Since JavaScript variables have their scope only within that iframe, they don’t contaminate the namespace of the JavaScript variables of your web-page, neither do they get affected by the JavaScript variables of your web-page.
  6. Iframe tags are easier for inclusion inside a web-page, since you can save an ad-tag in a file, and load it as an iframe into your web page. This will also allow parallel load of the ad-tag iframe. For example if your web-page is:

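<html>
<head>
<!-- page scripts stay in the head -->
<script type="" …>
</script>
</head>
<body>
<!-- the ad-tag lives in its own file and loads in parallel as an iframe -->
<iframe src="ad-tag.html"></iframe>
</body>
</html>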
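To make point 2 concrete, here is a hypothetical sketch in Python of the kind of referrer check an ad-network might run on its side. The function names, the `site_aliases` parameter and the example domains are all assumptions for illustration, not any real ad-network's API.

```python
from urllib.parse import urlparse


def referrer_host(referrer_url):
    """Lower-cased host name of the referrer, or '' if there is none."""
    return (urlparse(referrer_url).hostname or "").lower()


def is_allowed(referrer_url, registered_domain, site_aliases=()):
    """True if the referrer belongs to the registered domain or one of its aliases."""
    host = referrer_host(referrer_url)
    allowed = {registered_domain.lower(), *(alias.lower() for alias in site_aliases)}
    return any(host == domain or host.endswith("." + domain) for domain in allowed)


# Script tag: the referrer is the publisher's own page.
print(is_allowed("http://www.example.com/article.html", "example.com"))

# Iframe tag: the referrer is the iframe's URL, hosted elsewhere.
print(is_allowed("http://ads.example.net/ad-tag.html", "example.com"))

# Same iframe, after registering the second domain as a site-alias.
print(is_allowed("http://ads.example.net/ad-tag.html", "example.com", ["example.net"]))
```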

More questions? Drop me an email.

Update: For #3 ("Script tag works better for ad-networks that do contextual analysis"), Google AdSense says the following in its help section "Why aren't my ads relevant?":
The AdSense code was placed within an IFRAME.
Our targeting technology is not optimized to serve ads within a separate IFRAME. If you placed the AdSense code in a separate IFRAME, your site may display less targeted ads or public service ads. For better results, please implement our ad code directly into the source of your webpage. Once you make these changes, relevant ads may not appear immediately. Until we are able to re-crawl your site, which may take up to 48 hours or more, your page may continue to display untargeted or public service ads.


Thursday, March 20, 2008

Disk storage - where are we headed?

Some insightful articles and some of my own thoughts on the trends in data storage:

THE BACKGROUND:
Disk capacities are going up and costs are going down; however, the effective transfer bandwidth (ETB) per byte of capacity has come down tremendously. Despite capacities and transfer rates increasing by factors of 10,000 and 100 respectively, typical drive ETB has actually decreased by a factor of 100. As Jim Gray said, "Disks have become tapes." (Link to source).

Consider, for example, a 10 TB database. Ten years ago, this database would have occupied two thousand 5 GB drives - a common size at the time. With a 3 MB/second transfer rate, the aggregate bandwidth of these 2,000 drives would have been 6 GB/second, enabling the entire database to be scanned in about 30 minutes. Today, only about 20 higher-capacity drives would be needed to hold this same database. Those 20 drives would have an aggregate bandwidth of 1.2 GB/second, increasing the time required to scan the entire database to 150 minutes - an increase of two hours.
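The scan-time comparison can be reproduced in a few lines of Python. The per-drive figures for "today" (500 GB and 60 MB/second) are inferred from the totals quoted above rather than stated in the source, and decimal units are assumed throughout. With these inputs, the modern case comes out a little under the quoted 150 minutes.

```python
DB_BYTES = 10 * 10**12  # the 10 TB database


def scan_minutes(db_bytes, drive_count, mb_per_second_per_drive):
    """Minutes to read the whole database with all drives streaming in parallel."""
    aggregate_bytes_per_second = drive_count * mb_per_second_per_drive * 10**6
    return db_bytes / aggregate_bytes_per_second / 60


ten_years_ago = scan_minutes(DB_BYTES, drive_count=2000, mb_per_second_per_drive=3)
today = scan_minutes(DB_BYTES, drive_count=20, mb_per_second_per_drive=60)

print(f"2,000 x 5 GB drives:  {ten_years_ago:.0f} minutes")
print(f"20 x 500 GB drives:   {today:.0f} minutes")
print(f"Slowdown factor:      {today / ten_years_ago:.1f}x")
```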

DISKS ARE BECOMING A SEQUENTIAL ACCESS DEVICE RATHER THAN A RANDOM ACCESS DEVICE
Jim Gray points out - We have to convert from random disk access to sequential access patterns. Disks will give you 200 accesses per second, so if you read a few kilobytes in each access, you're in the megabyte-per-second realm, and it will take a year to read a 20-terabyte disk. If you go to sequential access of larger chunks of the disk, you will get 500 times more bandwidth—you can read or write the disk in a day. So programmers have to start thinking of the disk as a sequential device rather than a random access device.
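Jim Gray's numbers can be checked with a short Python calculation. The one assumption of mine is reading "a few kilobytes" as 4 KB per access; the other figures come straight from the quote.

```python
DISK_BYTES = 20 * 10**12        # the 20-terabyte disk from the quote
ACCESSES_PER_SECOND = 200       # random accesses per second, from the quote
BYTES_PER_ACCESS = 4 * 1024     # assumption: "a few kilobytes" taken as 4 KB
SEQUENTIAL_SPEEDUP = 500        # "500 times more bandwidth", from the quote
SECONDS_PER_DAY = 86_400


def days_to_read(disk_bytes, bytes_per_second):
    """Days needed to read the whole disk at a steady transfer rate."""
    return disk_bytes / bytes_per_second / SECONDS_PER_DAY


random_bandwidth = ACCESSES_PER_SECOND * BYTES_PER_ACCESS
sequential_bandwidth = random_bandwidth * SEQUENTIAL_SPEEDUP

random_days = days_to_read(DISK_BYTES, random_bandwidth)
sequential_days = days_to_read(DISK_BYTES, sequential_bandwidth)

print(f"Random access:     {random_bandwidth / 10**6:.2f} MB/s, {random_days:.0f} days")
print(f"Sequential access: {sequential_bandwidth / 10**6:.0f} MB/s, {sequential_days:.2f} days")
```

Under that assumption, random access takes most of a year and sequential access takes under a day, which matches the quote's orders of magnitude.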

Tom White later says that - "MapReduce is a programming model for processing vast amounts of data. One of the reasons that it works so well is because it exploits a sweet spot of modern disk drive technology trends. In essence MapReduce works by repeatedly sorting and merging data that is streamed to and from disk at the transfer rate of the disk. Contrast this to accessing data from a relational database that operates at the seek rate of the disk (seeking is the process of moving the disk's head to a particular place on the disk to read or write data)." Read more here.
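The "sort and merge" idea can be shown with a toy, in-memory Python sketch. This is not Hadoop or any real MapReduce framework, and nothing here streams to disk; the function names are mine. It only illustrates why sorting by key lets the reduce step work in one sequential pass instead of seeking around for each key.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(records, mapper):
    """Apply the mapper to each record and stream out (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def reduce_phase(pairs, reducer):
    """Sort pairs by key, then merge each run of equal keys in a single pass."""
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, reducer(key, (value for _, value in group))


def word_mapper(line):
    """Emit (word, 1) for every word in a line of text."""
    for word in line.split():
        yield word.lower(), 1


def sum_reducer(key, values):
    """Add up all the counts seen for one key."""
    return sum(values)


lines = ["disks have become tapes", "tapes are sequential", "disks are cheap"]
counts = dict(reduce_phase(map_phase(lines, word_mapper), sum_reducer))
print(counts)
```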

My take is that SSDs are going to take a while to become an economically viable alternative to disks. Flash disks cost approximately $10/GB, and good OEM flash drives cost about $60/GB or more (source here). Compare this with the cost of disk, which is about $0.20/GB. So we are looking at about a 300x price difference at the high end. I think it's going to take a while before SSDs become a reality for storing terabytes of data. Until that time, we will have to keep disks 50-70% empty to enhance striping performance. If we keep disks 50% empty, the cost of disk doubles for storing the same amount of data.
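Here is the price comparison as a quick Python calculation, using the per-GB prices quoted above. The helper name and the "fill fraction" framing are mine: a disk kept 50% empty has a fill fraction of 0.5.

```python
FLASH_RETAIL_PER_GB = 10.00   # dollars
FLASH_OEM_PER_GB = 60.00      # dollars, "good" OEM flash drives
DISK_PER_GB = 0.20            # dollars


def effective_cost_per_gb(raw_cost_per_gb, fill_fraction):
    """Cost per GB of data actually stored when only part of each drive is used."""
    return raw_cost_per_gb / fill_fraction


print(f"OEM flash vs. disk:    {FLASH_OEM_PER_GB / DISK_PER_GB:.0f}x")
print(f"Retail flash vs. disk: {FLASH_RETAIL_PER_GB / DISK_PER_GB:.0f}x")

for empty in (0.5, 0.7):
    cost = effective_cost_per_gb(DISK_PER_GB, 1 - empty)
    print(f"Disk kept {empty:.0%} empty: ${cost:.2f} per GB stored")
```

Even with disks kept 70% empty, the effective disk cost stays far below either flash price.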


MySQL - Is there a theoretical limit?

Guy Kawasaki interviewed Marten Mickos. Marten was the CEO of MySQL, now he is the senior vice president of the database group within Sun.

Interesting question about MySQL scalability:
Guy: Is there a theoretical limit of MySQL in case a small business uses it and sales/transactions/whatever explode?
Marten: Every software product has its limits, but I think we have shown that MySQL can scale enormously. Google runs its entire ad system on MySQL. Nokia runs mobile phone networks on MySQL. Booking.com runs all their business transactions on MySQL. If a small business reaches those limits, it is not a small business any longer--it is an enormous global player.
I am a fan of MySQL myself.

Good to know that FaceBook also uses MySQL (with its 65 million users), YouTube used MySQL, and Twitter uses MySQL (Scaling Twitter: Making Twitter 10000 Percent Faster).


Wednesday, March 19, 2008

TaffyDB - A JavaScript DB worth trying out

I recently read about TaffyDB and tried it today. It seems like a handy tool, and I would like to use it. TaffyDB is a JavaScript database that, in my opinion, can be used for offline data processing. For example, one relevant use case: I would like to cache a large report on the browser side and present different views by querying TaffyDB, without making server-side calls.

It seems that previous attempts have been made at a JavaScript database; a few examples are JavaScript SQL Database with Permanent Storage, Simple JavaScript Database, etc.

TaffyDB is pretty simple to use. It seems feature-rich: under 10K, CRUD interface (Create, Read, Update, Delete), sorting, advanced queries, etc.

Code is pretty easy to write too. Pretty cool, check it out.


Saturday, March 15, 2008

Leadership is influence ...

Stephen Covey writes about Gandhiji's leadership, excellent paragraph on leadership that I read recently:
"People think that leadership is a position. It isn’t. Leadership is influence. The key to influence is what we’re talking about. You can have influence without position. So don’t be so dependent upon position or formal authority, but use your moral authority, what you know is right."
Read more here.


Monday, March 10, 2008

Internet Explorer 8: My Top 10 list

IE8 has a bunch of really nice features. I read a few reviews (here and here) and made my ‘Top 10’ list, read on …

  1. AJAX Back Navigation enables users to navigate back and forth without leaving the AJAX application, and can be used to navigate within a page without performing a traditional full navigation. This allows websites to trigger an update to browser components like the address bar by setting the window.location.hash value, firing an event to alert components in the page and even creating an entry in the travel log.
  2. DOM Storage is a simple-to-use method for storing and retrieving strings of key/value pair data. Data can be stored per tab instance for a session or persisted to the local machine. This allows pages to cache text on the machine which reduces the effect of network latencies by providing faster access to pre-cached data. Several innovative uses are possible. For example, use this in combination with the new network connectivity event to allow a page to cache data if it detects that the computer is offline.
  3. Six connections per host instead of two for broadband scenarios, plus a scriptable property, allow for improved performance through parallelized downloads in Internet Explorer 8. In addition, this increases functionality by ensuring a request to a host is not blocked if two connections already exist. Websites can optimize their downloads based on the scriptable property.
  4. WebSlices - WebSlices is a new feature for websites to connect to their users by subscribing to content directly within a webpage. WebSlices behave just like feeds where clients can subscribe to get updates and notify the user of changes. A WebSlice is a portion within a webpage that is treated like a subscribe-able item, just like a feed. To enable a WebSlice on your website, annotate your webpage with class names for the title, description, and other subscribe-able properties.
  5. Offline Events - This is an easy way of detecting connectivity within the confines of JavaScript. With it we can write graceful offline Ajax applications. Firefox 3 and IE 8 appear to be the only browsers to support this feature.
  6. Cross-domain Request (XDR) - XDomainRequest, is the easiest way to make anonymous requests to third-party sites that support XDR and opt in to making their data available across domains.
  7. Cross-document Messaging (XDM) APIs allow communication between documents from different domains through IFrames in a way that is easy, secure and standardized.
  8. Facebook Integration: Microsoft capitalized on their partnership with the popular social networking site, Facebook, to allow IE8 users the ability to get status updates from Facebook right from their browser toolbar.
  9. eBay Integration: Like Facebook, this feature also uses IE8's new technology, called "WebSlices", which introduces a new way to get updates from other sites via the browser itself, without having to visit the web site.
  10. Firebug for Internet Explorer - We finally have a heavily-Firebug-inspired tool inside Internet Explorer. To quote Joe Hewitt (creator of Firebug): "I couldn't be happier that Microsoft completely copied Firebug for IE8." I have to agree - a tool like this has been a long time coming and it's greatly appreciated. Only the Internet Explorer team would've ever been the ones to build this tool - there's simply too much information here that's unavailable to typical IE extensions.
  11. Browser mode toggling - At first glance this feature makes the most sense for seeing if your IE 7 page will work ok in IE 8. In actuality, however, this will end up being very useful for developing a standards-compliant page (in IE 8, FF, Safari, Opera) and then toggling to see what the result is like in IE 7. This is so much better than the IE 6 to IE 7 jump where you have to keep your browser in a virtual machine in order for it to run side-by-side (according to Microsoft, at least - even though there were standalone solutions).

Read more at the Microsoft IE8 readiness site.


Sunday, March 09, 2008

CAPTCHA is Dead, Long Live CAPTCHA!

Interesting post on Coding Horror. Three of the most well-known CAPTCHAs are now broken - Google, Hotmail and Yahoo!

Wisdom comes from Gunter Ollmann, who notes:

CAPTCHAs were a good idea, but frankly, in today's profit-motivated attack environment they have largely become irrelevant as a protection technology. Yes, the CAPTCHAs can be made stronger, but they are already too advanced for a large percentage of Internet users. Personally, I don't think it’s really worth strengthening the algorithms used to create more complex CAPTCHAs – instead, just deploy them as a small "speed-bump" to stop the script-kiddies and their unsophisticated automated attack tools. CAPTCHAs aren't the right tool for stopping today's commercially minded attackers.

Read more here.


Wednesday, February 20, 2008

Skype All-Hands: Works really well

I did a "Skype All Hands" this morning. Surprisingly, it worked much better than a face-to-face all-hands or a teleconference all-hands. To be specific, the problem I mostly face is that people don't talk; they don't ask questions during such all-hands meetings. In a face-to-face all-hands meeting, it takes a while before the first guy asks a question, and then the second guy, and many questions come towards the end of the meeting. A Skype all-hands, on the other hand, turned out to be very interactive: people asked many questions and really participated in the meeting. It seems like engineers like typing much more than talking. Well, I love this. As an added advantage, you already have the meeting minutes (cut-and-paste from the IM log), and you can do this across the oceans.


Tuesday, February 19, 2008

MOSSO is good - but where is my SSH and how much memory do you support?


TechCrunch reported - "Hosting provider Rackspace is offering a new cloud computing service through its subsidiary Mosso. The service competes with Amazon’s Elastic Compute Cloud (EC2), although it doesn’t require any load balancing or other administration. It also competes with Joyent and Media Temple’s Grid Service. Pricing starts at $100 a month for - 50 GB of storage, 500 GB of bandwidth for transferring data and 3 million HTTP requests. From there additional capacity per month costs: $0.50/GB of storage, $0.25/GB of bandwidth and $0.10/1,000 HTTP requests."
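The pricing quoted above translates into a small Python calculator. It assumes overage is billed linearly on whatever exceeds the included amounts, with no rounding up to whole units; the report does not say how Mosso actually meters usage, and the function name is mine.

```python
BASE_PRICE = 100.00              # dollars per month
INCLUDED_STORAGE_GB = 50
INCLUDED_BANDWIDTH_GB = 500
INCLUDED_REQUESTS = 3_000_000

STORAGE_PER_GB = 0.50            # dollars per extra GB stored
BANDWIDTH_PER_GB = 0.25          # dollars per extra GB transferred
REQUESTS_PER_1000 = 0.10         # dollars per extra 1,000 HTTP requests


def monthly_cost(storage_gb, bandwidth_gb, http_requests):
    """Estimated monthly bill in dollars for the given usage."""
    extra_storage = max(0, storage_gb - INCLUDED_STORAGE_GB)
    extra_bandwidth = max(0, bandwidth_gb - INCLUDED_BANDWIDTH_GB)
    extra_requests = max(0, http_requests - INCLUDED_REQUESTS)
    return (
        BASE_PRICE
        + extra_storage * STORAGE_PER_GB
        + extra_bandwidth * BANDWIDTH_PER_GB
        + extra_requests / 1000 * REQUESTS_PER_1000
    )


print(f"Within the base plan:  ${monthly_cost(50, 500, 3_000_000):,.2f}")
print(f"Double the base usage: ${monthly_cost(100, 1000, 6_000_000):,.2f}")
```

Note how quickly the HTTP request charge dominates: every extra million requests adds $100.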

All this is good, but where is my ssh? Dude, how will I install my custom built software? How will I manage my Apache expire headers, how will I implement my mod_rewrite rules?

Also, it's not clear how much memory the $100 gets me.


Thursday, February 07, 2008

3 Steps to Adrenaline High

Here are my 3 steps of getting ‘Adrenaline High’ at 8AM; what are yours?
  1. Pump Iron: 44lb on wrist curls, 105lb on pec-dec, gives me a high that no other drink can

  2. Music: Linkin Park at 88 decibels, so I can’t hear anything else – gives me a high at 9:15AM

  3. Keeping schedule to the minute: My morning schedule runs at a granularity of 1 minute; between 7:55 and 8:09AM there must be at least a dozen things getting done. Getting things done ‘right’ at the 1-minute granularity gives me a high (I think it has a name, it’s called ‘urgency syndrome’)


Tuesday, January 29, 2008

PubMatic Enables Ad Optimization Across Every Ad Network

Palo Alto, Calif. - (January 28, 2008) - PubMatic (www.pubmatic.com), the first and largest ad optimization platform for Web publishers worldwide, today announced the ability to optimize online ads across any and every ad network. Now Web publishers using PubMatic can eliminate the headache of testing and deciding which ad network and layout will maximize their revenues, because PubMatic does it for them.

Currently in beta, PubMatic serves more than 2,000 publishers and more ad networks than any other ad inventory optimization platform.

"PubMatic immediately doubled our ad revenues by recommending the optimal ad network for each and every visit to WinCustomize.com," said Michael Crassweller, Web Site Manager, StarDock. "Since Wincustomize.com serves up nearly 4 million ads per day, PubMatic's ad network optimization has made a big difference to our bottom line."

The PubMatic public beta is open to all Web publishers, regardless of geography or company size. Signing up is simple and free: publishers can visit www.pubmatic.com/signup to get started in minutes.


Saturday, January 26, 2008

5 Attributes of Highly Effective Programmers

Very nice article on Top 5 Attributes of Highly Effective Programmers:

What attributes can contribute to a highly successful software developer versus the ordinary run-of-the-mill kind?
Humility
Once you start assuming you’re the expert and final word on something, you’ve stopped growing, stopped learning, and stopped progressing. Pride can make you obsolete faster than you can say “Java”.
"The competent programmer is fully aware of the strictly limited size of his own skull; therefore he approaches the programming task in full humility, and among other things he avoids clever tricks like the plague." - Dijkstra
Love of Learning
Good judgement comes from experience, and experience comes from bad judgement. - Fred Brooks
Obviously, some balance has to be struck here. You cannot learn everything–it simply isn’t possible. Our profession is becoming increasingly specialized because there is simply too much out there. I also think that in some respects, you need to love learning just for the sake of learning.
Detail-orientedness
The steps of changing a piece of software could be:
  1. Thoroughly understand what the code is doing and why
  2. Look for any and all dependencies and interactions with this code
  3. Have a well-thought-out mental picture of how it fits together.
  4. Examine the consequences of changing the feature.
  5. Update all related code that needs to (and repeat this cycle for those components)
  6. Update auxiliary pieces that might depend on this code (build system, installer, tests, documentation, etc.)
  7. Test and repeat.
Adaptability
For me, the first step in changing my mind set is to not get frustrated every time things change (“But you specifically said we were NOT going to implement the feature to work this way!”).
Passion
Ok, that’s maybe a bit of exaggeration, but by listing the counterpoints, it’s easier to see symptoms of someone who does have passion:
  • Thinks and breathes technology
  • Reads blogs about programming
  • Reads books about programming
  • Writes a blog about programming
  • Has personal projects
  • These personal projects are more important than the boring stuff at work
  • Keeps up with latest technologies for their interests
  • Pushes for implementation of the latest technologies (not blindly, of course)
  • Goes deep in technical problems.
  • Not content with merely coding to spec.
  • Needs an outlet of creativity, whether it be professional (software design) or personal (music, model building, LEGO building, art, etc.)
  • Thinks of the world in terms of Star Trek
Just kidding on the last one…
…(maybe)

Read full article here.


Friday, January 18, 2008

8 hacks for finding Startup office space

I started looking for a new office for Komli Engineering in Pune, India about a week back. Here I describe my journey and the final selection.

Here are some of the key points when finding an office of about 2000 sqft in Pune in Aundh/Baner area:
  1. Rates have gone up like crazy – average rate is Rs. 50/sqft., unfurnished
  2. Most office spaces have only 2 restrooms, which is too few for a 2000 sqft space. So most spaces can pretty much be rejected on that ground
  3. There are a large number of residential properties that people are converting into commercial properties for offices and showrooms, and charging Rs. 50 per-sqft!
  4. The problem with these residential-turned-commercial properties is that – a) families are living in the same building, b) kids are playing in open spaces and c) parking is mostly an issue.
  5. There are independent bungalows available at very cheap rates, about 1/3 the rental cost. These places are great – they are peaceful, have lots of space, lots of parking and so on. BUT you would probably never get broadband in those places
  6. Your office space must be located not more than 3 minutes from 5 places that must sell wada-pav, hot samosas, cut-chai, tandoori chicken and Pizza Hut – else you are doomed, because most employees in a startup are not married, and they need to eat (when they are not writing code)
  7. The other most important things when you are renting an office space are – 1) the place is good, 2) broadband is feasible and 3) parking space is available. “Broadband” is the most unexpected thing to have to check. You can find the best place at the least cost, but no broadband – that will totally throw you off. The second most difficult thing to find is parking space for 4 cars
  8. I looked at some of the coolest places, such as a nice place next to McDonalds in Aundh and a really cool office with an all-glass façade – but not good enough for Komli!

I finally decided on a really nice place above “Kobe Sizzlers” in Aundh. Awesome place, lots of space, central location, 2 balconies and lots of eateries around. And the best part is – you can get Sizzlers on Demand.

Wanna join us? Check out our open positions at http://www.komli.com/careers/ .


Saturday, January 12, 2008

How to recognise a good programmer?

Daniel has written a really great article about how to recognize good programmers. Here is a summary of the traits (I have marked the ones that I truly relate with, and added my comments):

Positive indicators:

  • Passionate about technology
  • Programs as a hobby
  • Will talk your ear off on a technical subject if encouraged [Mukul: This is an absolute necessity to identify a "hacker". They are so passionate about what they do that they will talk about it for more time than you want them to. I love this feature.]
  • Significant (and often numerous) personal side-projects over the years
  • Learns new technologies on his/her own
  • Opinionated about which technologies are better for various usages [Mukul: I have seen some of the technical guys getting upset when you tell them to use a certain technology or tell them to do things in a certain way - this is good. They should be opinionated.]
  • Very uncomfortable about the idea of working with a technology he doesn’t believe to be “right”
  • Clearly smart, can have great conversations on a variety of topics
  • Started programming long before university/work
  • Has some hidden “icebergs”, large personal projects under the CV radar
  • Knowledge of a large variety of unrelated technologies (may not be on CV)

Negative indicators:

  • Programming is a day job
  • Don’t really want to “talk shop”, even when encouraged to
  • Learns new technologies in company-sponsored courses
  • Happy to work with whatever technology you’ve picked, “all technologies are good”
  • Doesn’t seem too smart
  • Started programming at university
  • All programming experience is on the CV
  • Focused mainly on one or two technology stacks (e.g. everything to do with developing a java application), with no experience outside of it
Read the full article here.


Sunday, December 30, 2007

Thrudb: Better Storage?

I recently read about thrudb, and I must say I am very impressed with the lucidity with which Jake Luciani describes the problem and the solution. Here is an excerpt:

"Data on the web is often fluid and loosely structured and it is becoming increasingly difficult to fit this data into a fixed database schema which is amended over time. A simple example of this is tagging. The many-to-many relationship of tags is difficult to query efficiently using tables and SQL, such that ad-hoc solutions are required.
Also, web data is often "mashed up" and viewed together (e.g. Facebook profile) or viewed spatially (e.g. Google maps + event data).
In order to provide this new kind of data flexibility the web is moving towards a document-oriented data model, where records aren’t grouped by their structure but by their attributes.
There are also standard data-oriented issues like indexing, caching, replication and backups, which are left for "later" but are never easy to implement when it’s time to do it. There are a number of great of open source solutions to these problems, but they require proper integration and configuration. These components end up being learned over time and learned by trial and error.
Thrudb, therefore, is an attempt to simplify the modern web data layer and provide the features and tools most web-developers need. These features can be easily configured or turned off."

Looks very cool. I am going to try this out as soon as I get hold of my developer box tomorrow morning.

Thrudb talks about the following features:
• Client libraries for most languages
• Multi-master replication
• Incremental backups and redo logging
• Multiple storage backends (S3 included)
• Built for horizontal scalability
• Simple and powerful search api (Lucene)
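
To make the tagging example from the excerpt concrete, here is a minimal Python sketch that stores the same posts and tags in two ways: a relational layout with a join table, and a document-style layout where each record carries its own tags. This is not Thrudb's API. The table names, field names and sample data are all made up for illustration, and a real document store would index the tags rather than scan every record.

```python
import sqlite3

# Relational layout: a join table (post_tags) expresses the many-to-many
# relationship between posts and tags. All names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE post_tags (post_id INTEGER, tag_id INTEGER);
    INSERT INTO posts VALUES (1, 'Thrudb: Better Storage?');
    INSERT INTO posts VALUES (2, 'AOL buying Quigo');
    INSERT INTO tags VALUES (1, 'storage');
    INSERT INTO tags VALUES (2, 'web');
    INSERT INTO tags VALUES (3, 'advertising');
    INSERT INTO post_tags VALUES (1, 1);
    INSERT INTO post_tags VALUES (1, 2);
    INSERT INTO post_tags VALUES (2, 2);
    INSERT INTO post_tags VALUES (2, 3);
""")


def titles_by_tag_sql(tag):
    """Find post titles for a tag using two joins across three tables."""
    rows = conn.execute(
        """SELECT p.title FROM posts p
           JOIN post_tags pt ON pt.post_id = p.id
           JOIN tags t ON t.id = pt.tag_id
           WHERE t.name = ?
           ORDER BY p.id""",
        (tag,),
    )
    return [title for (title,) in rows]


# Document-style layout: each record is self-contained and lists its tags.
documents = [
    {"id": 1, "title": "Thrudb: Better Storage?", "tags": ["storage", "web"]},
    {"id": 2, "title": "AOL buying Quigo", "tags": ["web", "advertising"]},
]


def titles_by_tag_docs(tag):
    """Find post titles for a tag by looking at each document's attributes."""
    return [doc["title"] for doc in documents if tag in doc["tags"]]


print(titles_by_tag_sql("web"))
print(titles_by_tag_docs("web"))
```

Both functions return the same titles; the difference is that the relational version needs three tables and two joins, while the document version keeps the tags next to the record they describe.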


Wednesday, December 19, 2007

My Laptop Search: From Lenovo to DELL

My laptop search started 2 weeks ago. I had some very simple requirements; however, I found it incredibly tough to meet them. I am talking specifically about buying a laptop in India, and in Pune in particular.

My simple requirements were that the laptop should:
  • Have an Intel Core 2 Duo based mother-board
  • Have at least a 667MHz FSB
  • Have at least 2MB L2 cache
  • Have a 14” screen
  • Be light weight
  • I assumed that all laptops would have at least 80GB HDD and 1GB RAM

So, my search began with Lenovo. I looked at the X and Y series. Surprisingly, Lenovo uses some really old CPUs: most of their mother-boards were based on a 533MHz FSB, and the Intel Core 2 Duo mother-boards with a 667MHz FSB cost more than Rs. 45K, which is a lot. I looked around at all available models (in India) and couldn’t find a model that would cost me less than Rs. 50K (Lenovo 3000 Y300/Y410/Y500).

Then I looked at Toshiba. They had some really good mother-boards, for example the Satellite M200-A411D. However, the laptop looked and felt more like my 5-year-old son’s laptop. It lacked ruggedness, and the keyboard and port design lacked proper ergonomic thinking. Specifically, the keyboard is white in color, which means that it will get dirty in a week, and I won’t be able to fix that.

I looked at the Sony Vaio VGN-CR23G series. Really cool models. Nice colors, nice mother-board, nice keyboard, almost everything was nice. Except for the weight – it weighs 2.5 kilos, which is more than I thought it should be. The cost was also higher, at Rs. 54,990.

Then I saw the DELL VOSTRO 1400 Notebook. I got an Intel Core 2 Duo Processor (T7250), 2.0GHz, 2MB Cache, 800 MHz FSB, 1GB RAM, 120GB HDD, 9 cell battery, and Vista Home. That was the end of my search. I found my laptop. I bought it locally instead of ordering it online or over the phone. It turns out to be a pretty good laptop, and I am happy with it. DELL has got much better since I started using DELLs back when I was at VERITAS in the 1999-2002 timeframe. Good job DELL.



Looking back, I think Lenovo is way behind in its CPUs. I love the rugged design of Lenovo, but they should get newer CPUs, a faster FSB and more L2 cache.


Tuesday, November 27, 2007

My dream machine - Nikon D40X DSLR

My Dream Machine, finally:


I purchased this from jjmehta.com on 25th November 2007.

Camera information display and settings


Some highlights of the D40X:

  • The most compact Nikon digital SLR ever, featuring intuitive controls and ergonomics designed for everyone
  • 10.2 effective megapixels
  • Fast continuous shooting mode enables up to 100 JPEG images (FINE L-size or smaller) at 3 frames per second
  • High-precision digital image processing algorithms for natural-looking pictures with faithful color and tone reproduction
  • Automatic and manual control over ISO sensitivity from ISO 100 to 1600, as well as HI 1 (manual only)
  • Quick 0.18 second power-up to respond to every photographic opportunity
  • A large 2.5-inch LCD monitor with viewing angle of approx. 170 degrees in all directions
  • Long-life rechargeable lithium-ion battery that allows up to 520 images* on a single charge (*CIPA standards)


Wednesday, November 14, 2007

algoGod update: Extending the submission date to December 14th 2007

November 14 2007: Here is an update on the algoGod contest.

First, we are extending the submission date to December 14th 2007.

We started the contest 4 weeks ago, and got an overwhelming response with 299 contestants registering for the event and actively solving the problem. Many contestants requested that the submission date be extended, so we have extended the last date.

Also, we would like to inform you that we are hiring machine learning experts who can work at Komli on building a world-class ad-optimization engine. If you are interested, check out our web page at http://www.komli.com/careers under ‘Machine Learning Expert’.

We have introduced a FAQ for algoGod on the following web page: http://www.komli.com/algogod/faq.php

If you have any questions, please don’t hesitate to send them to algogod@komli.com.


Wednesday, November 07, 2007

AOL buying Quigo

  • AOL is buying Quigo, a contextual ad network, for a reported $340 million [techcrunch.com]
  • Quigo will be the fourth ad company AOL has acquired in 2007. Earlier in the year, AOL acquired Third Screen Media (mobile advertising), Adtech AG, an ad serving platform based in Frankfurt, Germany, and Tacoda, the behavioral targeting company. All of them roll into Platform-A, AOL’s recently announced ad platform division [paidcontent.org]
  • Quigo has over 500 publisher relationships and about 3,000 advertisers. [readwriteweb.com]
  • Quigo has raised $45 million since opening its doors in 2000 [paidcontent.org]


Tuesday, October 16, 2007

algoGod: Be Crowned the World's Greatest Algorithms Expert!

Komli today launched the “algoGod contest” for machine learning, math, genetics, and algorithm experts.

http://www.komli.com/algogod/

Contest winner to receive Rs. 2,00,000

Start date: October 15th 2007.
Entries must be submitted on or before November 14th 2007.
Results will be declared on December 31st 2007.

Have you ever wondered if you are the best algorithms expert on the planet? Have you ever thought, "I know I can beat everyone, just let me prove it?" Well, Komli's algoGod contest is for you. It's your chance to show the world how smart you really are!

The algoGod contest seeks to crown one expert as the 'Algorithms God'. How are we going to do this? Well, the proof is in the pudding! We want every contestant to solve a common problem, and whoever is best will receive the algoGod prize!

A little more background:

Komli lives in the world of online advertising, and online advertising is rife with opportunities for complex algorithms based on cutting-edge topics such as machine learning, data mining, graph theory, etc. Online advertising is growing at a very fast pace, and the number of variables affecting the performance of an online ad has been growing at an even faster pace. Komli is devising methods for maximizing the yield of online advertising using advanced statistical machine learning methods over large-scale systems. This is a very interesting and complex algorithm problem.

Komli is currently using a set of algorithms for maximizing the yield of online ads, collectively called 'Yin-Yang'. There are a lot of interesting alternative approaches to Yin-Yang that have yet to be tried. Komli is interested in determining if any of these alternative approaches can beat Yin-Yang by making better predictions.

Komli will provide participants with anonymous ad impression data and a prediction accuracy bar that is 50% better than what Yin-Yang can do on the same training data set. Participants' solutions will be judged by 'Time complexity' and 'Space complexity' criteria. The participant whose solution works best will receive Rs. 2,00,000, bragging rights and an opportunity to work with Komli. Of course, participants have to share their method and code with Komli. Eager participants can sign up for the contest by filling in the form on the left. Also, please send any questions to algogod@komli.com.


Thursday, October 04, 2007

PubMatic Engineering: Slide show

Sunday, September 30, 2007

PubMatic selected by TechCrunch as a Top 40 Startup in the World

PubMatic, a product of Komli, was selected by TechCrunch as a Top 40 Startup in the World. Nearly 750 startups from around the world applied for this honor, and PubMatic was lucky enough to be selected! This was announced at the TechCrunch40 conference in San Francisco, CA, a conference built to showcase these 40 top startups.

We are hiring! If you dream in Java, think in PHP, and talk in <xml> over IM, you should talk to us.

In addition, as part of our presentation at the conference, we announced that PubMatic has been released into a global beta available for all publishers around the world! During our alpha over 500 publishers from around the world have been using PubMatic and seeing some amazing results. See news about PubMatic here.

Online advertising is growing at a very fast pace, and the number of variables affecting the performance of an online ad has been growing at an even faster pace. Komli is devising methods for maximizing the yield of online advertising using advanced algorithms running over large-scale systems. We are also developing a decision support system for data analytics, analysis of real-time data such as user behavior and web analytics, server scalability to support 100,000,000 requests per day (to start with), and much more cool stuff.
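
To put the 100,000,000 requests per day figure in perspective, here is a small back-of-the-envelope calculation in Python. It assumes the traffic is spread evenly over the day to get an average rate; the peak factor of 3 is purely a hypothetical number for illustration, not something we measured.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rps(requests_per_day):
    """Average requests per second if traffic were perfectly even."""
    return requests_per_day / SECONDS_PER_DAY


def peak_rps(requests_per_day, peak_factor):
    """Rough peak estimate: the average rate scaled by an assumed factor."""
    return average_rps(requests_per_day) * peak_factor


daily_requests = 100_000_000
assumed_peak_factor = 3  # hypothetical: busiest period is 3x the average

print(f"average: {average_rps(daily_requests):.0f} requests/second")
print(f"assumed peak: {peak_rps(daily_requests, assumed_peak_factor):.0f} requests/second")
```

So the target works out to an average of roughly 1,157 requests every second, and real traffic is never perfectly even, which is why the peak estimate matters more for capacity planning than the average.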


The last time I posted about Komli, we had just moved into our new office. We were still building the product. Since then a lot has changed: we wrote a bunch of code, did a beta, were selected as a Top 40 startup in the world, grew our team to 8 people, and have been having a lot of fun.


The beta release was amazing: we had close to 400 customers using PubMatic, while a small team of very enthusiastic world-class programmers wrote code and managed escalations at the same time.



While we hacked code in Java, PHP, AJAX and C 12 hours a day, and listened to rock and the latest Bollywood tunes of Bhool Bhulaiyaa, we continued to have a sense of humor. This is a sketch that one of us drew on the whiteboard, while he was designing a new DB schema for user authentication.

And, did I mention, we never miss a chance to have fun ...





Simple Tooltip

http://codeeazy.com/mktooltip/

I wanted a tooltip implementation for my web page, and was looking around. I looked at a number of open-source options (see 40 tooltip scripts here and 20 here) and one commercial library. I liked the functionality; however, each of them was 7, 8, 10 or 25 KB in size, which I thought was too much for just a simple tooltip. So, I thought, how about I write a tooltip library myself?

Here is an implementation, check it out at http://codeeazy.com/mktooltip/. I implemented this in 818 bytes of code, and about an hour of coding. So one would think there must be something wrong. Well, it works. And it is cool!