announce: mod_jsmin

Posted by Ian Holsman Wed, 06 Aug 2008 03:51:00 GMT

It’s been a long time coming. This has been a side project since February. Thanks to David for nudging me to finally finish it.

 

mod_jsmin is an Apache module (filter + handler) that can minify your JavaScript on the fly. In my tests it shrinks a file by 30% on average.

The idea is to concatenate the JavaScript into a single file, and then minify it. Obviously if you can do this before you put the file on the server it will be better, but in large places life is never that organized.
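
To make the concat-then-minify idea concrete, here is a rough Python sketch. It is not mod_jsmin's code and not the real jsmin algorithm: `crude_minify` and `concat_and_minify` are made-up names, and the minifier only drops block comments, full-line `//` comments, blank lines and indentation. It will mangle JavaScript that has comment markers inside string literals.

```python
import re


def crude_minify(js):
    """Very rough minifier (an illustration only, NOT the jsmin algorithm).

    Drops /* ... */ comments, full-line // comments, blank lines and
    leading/trailing whitespace. Unsafe if comment markers appear in strings.
    """
    js = re.sub(r"/\*.*?\*/", "", js, flags=re.DOTALL)
    kept = []
    for line in js.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("//"):
            continue
        kept.append(stripped)
    return "\n".join(kept)


def concat_and_minify(sources):
    """Concatenate several scripts into one, then minify the result."""
    combined = "\n".join(sources)
    return crude_minify(combined)


if __name__ == "__main__":
    first = "/* header */\nfunction add(a, b) {\n    // sum\n    return a + b;\n}\n"
    second = "var x = add(1, 2);\n\n"
    result = concat_and_minify([first, second])
    print(len(first) + len(second), "bytes in,", len(result), "bytes out")
```

Serving one combined file also saves the browser a request per script, which is half the point.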

 

4 comments

when you’re down and miserable

Posted by Ian Holsman Tue, 29 Jul 2008 03:24:00 GMT

call your local real estate agent for an instant pickup..

 

sure.. your property is worth more than when you bought it 6 months ago from me..

yours is in a ‘special’ niche that people still want.. all the others have gone down 10%. don’t worry about the 3 properties for sale down the street, they bear no resemblance to yours

 

yeah.. right

no comments

cuil. a brief look

Posted by Ian Holsman Mon, 28 Jul 2008 17:55:00 GMT

So Cuil opened up their doors.

 

My initial impression is that they are doing the basic search quite well, but are pretty basic when it comes to categorization, related searches and typos. For example, ‘madona’ doesn’t even mention that you might be interested in Madonna the entertainer, and there is no link to recent news about her.

 

Their search results seem to be the same as Google’s for some queries, and very sparse for others, so I’m guessing they need a bit more work there.

 

Categorization also fails for something like ‘PLSI’, an abbreviation that means 3 different things (a law thing for Native Americans, a learning style, and a clustering algorithm).

 

So for now I’d say Cuil is interesting, but needs to beef up its IR/NLP capabilities before it gains parity with the other engines like Yahoo, AOL, Gigablast, and of course Google.

no comments

Looking for an IT channel/sales manager role in Melbourne

Posted by Ian Holsman Mon, 14 Jul 2008 04:36:00 GMT

for a friend (honest)

If your company is currently on the lookout for an IT channel/sales person with years of experience, ping me on ian at holsman.net and I’ll introduce you.

no comments

when it rains it pours - what’s new in monitoring and metrics

Posted by Ian Holsman Tue, 01 Jul 2008 17:59:00 GMT

So on the RRD mailing list there is a discussion on how to write an RRD server/accelerator to help speed up RRD. It is a great tool, but when you abuse it and try to capture hundreds of thousands of metrics it kinda uses a bit too much disk I/O (read: swamps the system).

So imagine my surprise when I noticed that Orbitz has recently open sourced their monitoring framework:

  • ERMA: the monitoring API
  • Graphite: a graphing component on top of it
  • Whisper: a fixed size db that stores the info

and imagine my surprise when I found out it was written in Django, my favorite framework.

and now I find out Theo Schlossnagle has just released reconnoiter (reconnoiter project home)

now.. to find a couple of hours in the day to actually get into them.

1 comment

the next thing on the gay marriage "slippery slope"

Posted by Ian Holsman Wed, 25 Jun 2008 06:17:00 GMT

I heard an interesting thing on TV just then. Spokespeople for polygamy are pushing for it to be legalized, just as gay marriage is.

They have a point. The major objection to gay marriage has always been that the Bible says “no”. Since that is not being considered for gay marriage.. why not just change the relationship number from 1:1 to 0-N:0-N and be done with it?

I don’t think it would be any stranger than the relationships I saw at school growing up, with all the divorces and re-marriages going on. For me it’s about having a supportive home, not about the number (or type) of adults in the relationship.

4 comments

large distributed systems

Posted by Ian Holsman Tue, 24 Jun 2008 19:30:00 GMT

So I have recently been paying a lot of attention to systems with huge amounts of data in them.

Be it Relegence, which deals with lots of incoming news stories and figures out what they are about in real time, or the data layer that deals with click streams and recommendation engines.

One of the interesting questions is how we make this data available to the publishing systems, as the data sizes mean we can’t fit the entire table onto a single machine.

So in my research I have seen a handful of ways to do horizontal partitioning (or is it vertical.. I always get confused with the names).

  • Consistent hashing. Easy to understand, easy to implement, until you need to add a group of machines.
  • A central database holding the location of the records. An oldie but a goodie. You can easily add machines, and reconfigure the distribution to ‘move’ records across machines to compensate. But it has a central directory, which means you need to worry about scaling the directory (a smaller problem to be sure).
  • A Distributed Hash Table approach like the one discussed at onscale.
  • CRUSH, a pseudo-random distribution thing similar to consistent hashing, that can handle adding new machines and load issues.
  • Amazon’s Dynamo, which has “a gossip based distributed failure detection and membership protocol.” Looking deeper, it is some mixture of consistent hashing and a DHT/Chord.
  • Hypertable/HBase, which leave the partitioning (and replication) to the distributed file store they sit on (Hadoop). That isn’t a bad idea as a lot of work has gone into that, but I still have my doubts on how it will handle an OLTP load (and I should just bite the bullet and run a performance test to remove my doubts).
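
For the first option, a minimal consistent hashing ring might look something like the Python sketch below. It is an illustration and not any of the systems above: the `HashRing` class, the node names and the `vnodes` setting (virtual points per node) are all my own assumptions. Each node gets a bunch of points on a ring, and a key belongs to the first point at or after its hash.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent hashing ring with virtual nodes (hypothetical sketch)."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # points placed on the ring per node
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            point = _hash("%s#%d" % (node, i))
            self._owner[point] = node
            bisect.insort(self._points, point)

    def node_for(self, key):
        if not self._points:
            raise ValueError("ring is empty")
        # First ring point clockwise of the key's hash, wrapping at the end.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    keys = ["user:%d" % i for i in range(1000)]
    before = {k: ring.node_for(k) for k in keys}
    ring.add_node("node-d")
    moved = [k for k in keys if ring.node_for(k) != before[k]]
    print("keys that moved after adding node-d:", len(moved))
```

The nice property is that adding a node only pulls keys onto the new node; nothing shuffles between the old ones. The sketch does nothing about actually copying the data over, which is where it stops being easy.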

you also need to deal with replication/load balancing issues. Specifically you need to handle the case of failure (and failure of racks / data centers).

from what I can see you have 2 different choices to make.

  • Full Consistency vs Eventual Consistency
  • What level to replicate at

Most people (including engineers, to be honest) can’t seem to get their head around this eventual consistency thing. They are used to an update changing the value there and then.
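
A toy Python sketch of what trips people up. `EventuallyConsistentStore` is a made-up class with one primary and one replica held in memory, so it is a picture of the behaviour and not a real replication protocol: a write lands on the primary straight away, but a read from the replica keeps returning the old value until replication catches up.

```python
class EventuallyConsistentStore:
    """Toy primary/replica pair with delayed replication (hypothetical)."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self._pending = []  # writes not yet applied to the replica

    def write(self, key, value):
        # The primary sees the write immediately; the replica does not.
        self.primary[key] = value
        self._pending.append((key, value))

    def read_primary(self, key):
        return self.primary.get(key)

    def read_replica(self, key):
        # May be stale until replicate() has run.
        return self.replica.get(key)

    def replicate(self):
        # Apply queued writes in order, then clear the queue.
        for key, value in self._pending:
            self.replica[key] = value
        self._pending.clear()


if __name__ == "__main__":
    store = EventuallyConsistentStore()
    store.write("headline", "v1")
    print("replica before replication:", store.read_replica("headline"))
    store.replicate()
    print("replica after replication:", store.read_replica("headline"))
```

With full consistency the write would not return until the replica had it too, which is exactly the latency you are trading away.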

Replication, on the other hand, is usually handled by letting MySQL do it. Personally I prefer a finer-grained approach where you can have some records with more replicas than others, to cater for some things being more equal than others (Britney Spears gets more hits than Johnny Cash, for example), and dynamic loads where the system automagically adds more replicas based on activity.

Whatever replication we choose, it has to be battle-hardened.. you don’t want to call up the CEO and explain why 10% of his data has just disappeared into the ether, or why you will be down for 6 hours recovering from a tape.
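
One way to picture the per-record replica idea is a function that turns recent hit counts into a replica count. This is only a sketch in Python: `replica_count` and every threshold in it (`base`, `hits_per_replica`, `max_replicas`) are numbers I made up for illustration, not something measured.

```python
import math


def replica_count(hits_per_minute, base=2, hits_per_replica=1000, max_replicas=10):
    """Pick a replica count for one record from its recent traffic.

    All parameters are hypothetical: every record keeps at least `base`
    copies for safety, gains one copy per `hits_per_replica` hits, and is
    capped at `max_replicas`.
    """
    wanted = math.ceil(hits_per_minute / hits_per_replica)
    return min(max_replicas, max(base, wanted))


if __name__ == "__main__":
    for name, hits in [("quiet record", 50), ("popular record", 4500)]:
        print(name, "->", replica_count(hits), "replicas")
```

Run it periodically against fresh hit counts and the hot records pick up extra copies on their own, while the quiet ones stay at the safety minimum.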

the other choice is what the data store should be.

  • a relational DB like MySQL.
  • a key/value store, with limited semantics on how to retrieve/store the information

Personally I like the key/value approach as it makes life simpler.. but I know a lot of people who like MySQL.

So.. this is the main thing on my mind at the moment. Your thoughts are welcome, naturally.

4 comments

Hanging out in New York

Posted by Ian Holsman Sun, 08 Jun 2008 13:42:00 GMT

After boiling on the tarmac in Dulles for an hour I finally managed to land in New York.
Luckily for me the Puerto Rico parade was on, and I had a chance to exercise my FlipVideo gadget.
It was a stinker of a day up here, but I was amazed at the turnout.


Warning: the video is LOUD (so was the street).
And yes.. I think I need to hold the camera higher (I got a lot of people’s asses at the start).

1 comment

you’re a silly sausage

Posted by Ian Holsman Tue, 03 Jun 2008 10:28:00 GMT

I was having a nice regular day.. just tying up loose ends for my flight tomorrow and just doing regular stuff.

until I re-read my flight confirmation. It was then I noticed that the flight actually left in 2 hours instead of tomorrow.

Quickly called a taxi, shoved some stuff in my bags (sadly most of it was in the washing machine in prep for the flight tomorrow) and waited outside.

I got to the airport to find an empty counter, 30 minutes before departure, after cursing to myself all the way on the 1-hour taxi ride there.

Luckily I made it (thank you fog), only to have my 3-year-old call me a ‘silly sausage’. I was thinking far worse things.

1 comment

Alive and Kicking

Posted by Ian Holsman Sun, 18 May 2008 17:49:00 GMT

Just returned from a great week in Tel Aviv (sadly, my bags are still in De Gaulle), where a ‘small’ group of 11 people and I took over the Relegence office in Tel Aviv. So if you think I’ve been quiet last week, that’s why.

One of the most memorable parts was seeing an Irishman have a drink in an Irish pub, which is unheard of. I think we have a photo to prove it. It helped that it was right across the street from the hotel, and it closed at 2am, when the others closed at 11pm.

On the business side, I think having these multi-office meetings is great, as you hardly get to see the people on the other end of the mailing list/conf call. Both sides now have a greater appreciation of what the other can do, and a couple of interesting uses of Relegence’s technologies were dreamed up that wouldn’t have been possible without having the experts from both sides in the same room.

 

Some of the interesting things that we heard this week:

Yahoo’s Geo-location ID (WOEID) – http://developer.yahoo.com/geo/ is an amazing thing that we will be looking at using if we can.

AOL’s money and finance channel (http://finance.aol.com/quotes/time-warner-inc/twx/nys) is now ranked #1 by comScore. Thanks to Relegence, its related news widget gave it HUGE return visits and page views!

no comments
