Mukul Kumar's Blog

Web Scalability & Performance: Real Life Lessons

The following is a presentation that I gave at TechWeekend in Pune on 5th September. About sixty hard-core technical geeks were present at the sessions. Feel free to share.
You can reach me on Twitter @mukulneetika.

Labels: cache, database, load balancer, master, memcached, mysql, performance, pubmatic, pune, scalability, scale, slave, techweekend, test, web

TaffyDB - A JavaScript DB worth trying out

I recently read about TaffyDB and tried it out today. It seems like a handy tool, and I would like to use it. TaffyDB is a JavaScript database which, in my opinion, can be used for offline data processing. For example, I would like to cache a large report on the browser side and present different views of it by querying TaffyDB, without making server-side calls.

It seems that previous attempts have been made at a JavaScript database; a few examples are JavaScript SQL Database with Permanent Storage, Simple JavaScript Database, etc.

TaffyDB is pretty simple to use. It seems feature-rich: under 10K, a CRUD interface (Create, Read, Update, Delete), sorting, advanced queries, etc.

Code is pretty easy to write too. Pretty cool, check it out.
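
To make the report use case concrete, here is a minimal sketch in plain JavaScript. It deliberately does not use TaffyDB's API. The sample `report` rows and the helpers `where`, `orderBy` and `sumBy` are hypothetical names made up for illustration, and the sketch only shows the kind of client-side querying that such a library would take care of.

```javascript
// Hypothetical report rows, fetched from the server once and kept in memory.
const report = [
  { site: "news.example", country: "IN", impressions: 1200, revenue: 30.5 },
  { site: "blog.example", country: "US", impressions: 800, revenue: 42.0 },
  { site: "news.example", country: "US", impressions: 500, revenue: 18.25 },
  { site: "shop.example", country: "IN", impressions: 300, revenue: 9.0 },
];

// Return the rows whose fields equal every key/value pair in `criteria`.
function where(rows, criteria) {
  return rows.filter((row) =>
    Object.entries(criteria).every(([key, value]) => row[key] === value)
  );
}

// Return a sorted copy so the cached report itself is never reordered.
function orderBy(rows, field, descending = false) {
  const sorted = [...rows].sort((a, b) =>
    a[field] < b[field] ? -1 : a[field] > b[field] ? 1 : 0
  );
  return descending ? sorted.reverse() : sorted;
}

// Sum a numeric field per distinct value of `groupField`.
function sumBy(rows, groupField, valueField) {
  const totals = {};
  for (const row of rows) {
    totals[row[groupField]] = (totals[row[groupField]] || 0) + row[valueField];
  }
  return totals;
}

// Three views of the same cached data, none of which touches the server.
const usRows = where(report, { country: "US" });
const topByRevenue = orderBy(report, "revenue", true);
const impressionsBySite = sumBy(report, "site", "impressions");

console.log(usRows.length, topByRevenue[0].site, impressionsBySite);
```

A library like TaffyDB would replace the hand-written helpers. The point is that every view is computed from data already held in the browser.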

Labels: browser, caching, client, database, Javascript, taffydb

Thrudb: Better Storage?

I recently read about thrudb, and I must say I am very impressed with the lucidity with which Jake Luciani describes the problem and the solution. Here is an excerpt:

"Data on the web is often fluid and loosely structured and it is becoming increasingly difficult to fit this data into a fixed database schema which is amended over time. A simple example of this is tagging. The many-to-many relationship of tags is difficult to query efficiently using tables and SQL, such that ad-hoc solutions are required.
Also, web data is often "mashed up" and viewed together (e.g. Facebook profile) or viewed spatially (e.g. Google maps + event data).
In order to provide this new kind of data flexibility the web is moving towards a document-oriented data model, where records aren’t grouped by their structure but by their attributes.
There are also standard data-oriented issues like indexing, caching, replication and backups, which are left for "later" but are never easy to implement when it’s time to do it. There are a number of great of open source solutions to these problems, but they require proper integration and configuration. These components end up being learned over time and learned by trial and error.
Thrudb, therefore, is an attempt to simplify the modern web data layer and provide the features and tools most web-developers need. These features can be easily configured or turned off."

Looks very cool. I am going to try this out as soon as I get hold of my developer box tomorrow morning.

Thrudb talks about the following features:
• Client libraries for most languages
• Multi-master replication
• Incremental backups and redo logging
• Multiple storage backends (S3 included)
• Built for horizontal scalability
• Simple and powerful search api (Lucene)
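
The tagging example from the excerpt can be sketched in a few lines of plain JavaScript. This is not Thrudb's API; the `docs` records and the helpers `buildTagIndex` and `idsWithAllTags` are hypothetical, and the sketch only assumes that each document carries its own list of tags, so that records are found by attribute rather than through a join table.

```javascript
// Hypothetical documents: each record carries its own tags.
const docs = [
  { id: 1, title: "Scaling MySQL", tags: ["database", "scale"] },
  { id: 2, title: "Memcached basics", tags: ["cache", "scale"] },
  { id: 3, title: "SQLite in Firefox", tags: ["database", "browser"] },
];

// Build an inverted index: tag -> set of ids of documents carrying that tag.
function buildTagIndex(documents) {
  const index = new Map();
  for (const doc of documents) {
    for (const tag of doc.tags) {
      if (!index.has(tag)) index.set(tag, new Set());
      index.get(tag).add(doc.id);
    }
  }
  return index;
}

// Return the ids of documents that carry every tag in `tags`.
function idsWithAllTags(index, tags) {
  if (tags.length === 0) return [];
  const sets = tags.map((tag) => index.get(tag) || new Set());
  const [first, ...rest] = sets;
  return [...first].filter((id) => rest.every((set) => set.has(id)));
}

const tagIndex = buildTagIndex(docs);
console.log(idsWithAllTags(tagIndex, ["database", "scale"]));
```

Adding a new tag to a document needs no schema change here, which is the flexibility the excerpt is arguing for.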

Labels: amazon, aws, data, database, rdbms, s3, thrudb

RDBMS has come to the browser

Ajaxian reports:

Firefox 3 is to support SQLite for offline storage. The new alpha release tells us this and a lot more (below).

The world of the RDBMS has come to the browser, and has jumped from server to client in the Web platform.

I think this is a pretty interesting innovation. Suddenly we will have a lot more agile storage space on the client side, and we can do some complex relational storage there. I wonder if cookies will undergo a major transformation (such as the limit on cookie size). I wonder if we will see nice JavaScript APIs to access the RDBMS on the client side (or did I miss it; is it already there?). I wonder if browsers will collocate some of this data, and whether we may see something like 'single instance storage' on the client side. I think all of this was possible even without the RDBMS; however, a database on the client side makes us think about the various possibilities that existed on the server side.

Labels: browser, database, firefox, gears, google, rdbms, relational, sqlite

Top 10 Largest Databases in the World

Here is a very interesting list, "Top 10 Largest Databases in the World". A few of the entries:

8. Amazon

By the Numbers
* 59 million active customers
* More than 42 terabytes of data

7. YouTube

By the Numbers
* 100 million videos watched per day
* 65,000 videos added each day
* 60% of all videos watched online
* At least 45 terabytes of videos

5. Sprint

By the Numbers
* 2.85 trillion database rows.
* 365 million call detail records processed per day
* At peak, 70,000 call detail record insertions per second

4. Google

Although there is not much known about the true size of Google's database (Google keeps their information locked away in a vault that would put Fort Knox to shame), there is much known about the amount of and types of information Google collects.

By the Numbers
* 91 million searches per day
* accounts for 50% of all internet searches
* Virtual profiles of countless number of users

3. AT&T

By the Numbers
* 323 terabytes of information
* 1.9 trillion phone call records

1. World Data Centre for Climate

By the Numbers
* 220 terabytes of web data
* 6 petabytes of additional data

Read the full report at:
http://www.businessintelligencelowdown.com/2007/02/top_10_largest_.html

Labels: database, scale, size

Sunday, September 06, 2009

Web Scalability & Performance: Real Life Lessons

Wednesday, March 19, 2008

TaffyDB - A JavaScript DB worth trying out

Sunday, December 30, 2007

Thrudb: Better Storage?

Tuesday, June 19, 2007

RDBMS has come to the browser

Friday, February 16, 2007

Top 10 Largest Databases in the World
