Archives For jetpack

>JetCrawl

2010/01/16 — Leave a comment

>In an effort to provide realistic data in places.sqlite, I wrote a data generator in Python which inserts many records into places, and this was a good thing.

The data is entriely made up from random strings, and you end up with urls like this: http://ffhjhfj.uwtgbz.wsc

The same for tags, etc…

In order to make things more realistic and “testable” inside xpcshell, I created a crawler using Jetpack and standard Firefox XPCOM components. I have a feeling that QA might be interested in this as, so I posted to the Jetpack Gallery

It is not configurable from the outside, but I have plans for that. I am planning on making JetCrawl surf every night for a set period to increase my collection of data work with.

I need this data in crafting a new Places Query API that is fast and well tested against a rather large collection of bookmarks and history. The urls that Jetcrawl use are taken from the Alexa Top 100, so it is a common set to boot.

Automated tests will work better with this data since it is hitting predictable urls and it is the actual places apis creating the data in the first place.

If you want to try it out, please use a new profile. You can stop it by closing the tab. There is no UI as of yet.

Advertisements