JetCrawl

2010/01/16

In an effort to provide realistic data in places.sqlite, I wrote a data generator in Python that inserts many records into Places, and this was a good thing.

The data is made up entirely of random strings, so you end up with URLs like this: http://ffhjhfj.uwtgbz.wsc

The same goes for tags and the rest of the data.
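To give an idea of what such a generator looks like, here is a minimal sketch. It is not the original script: the `demo_places` table and its columns are a hypothetical stand-in that is far simpler than the real Places schema, and the function names are made up for illustration.

```python
import random
import sqlite3
import string


def random_label(rng, min_len=3, max_len=8):
    """Return a random lowercase string such as 'ffhjhfj'."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def random_url(rng):
    """Build a made-up URL of the form http://<label>.<label>.<3 letters>."""
    return "http://{}.{}.{}".format(
        random_label(rng), random_label(rng), random_label(rng, 3, 3)
    )


def fill_places(conn, count, seed=None):
    """Insert `count` random rows into a simplified, hypothetical table."""
    rng = random.Random(seed)
    # Hypothetical table for illustration only, not the real Places schema.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS demo_places ("
        "id INTEGER PRIMARY KEY, url TEXT, title TEXT)"
    )
    rows = [(random_url(rng), random_label(rng)) for _ in range(count)]
    conn.executemany("INSERT INTO demo_places (url, title) VALUES (?, ?)", rows)
    conn.commit()
    return rows


if __name__ == "__main__":
    # Use an in-memory database so no real profile is touched.
    conn = sqlite3.connect(":memory:")
    for url, title in fill_places(conn, 5, seed=1):
        print(url, title)
```

Passing a seed makes a run repeatable, but the URLs are still nonsense, which is exactly the limitation described here.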

In order to make things more realistic and “testable” inside xpcshell, I created a crawler using Jetpack and standard Firefox XPCOM components. I have a feeling that QA might be interested in this as well, so I posted it to the Jetpack Gallery.

It is not configurable from the outside yet, but I have plans for that. I am planning on making JetCrawl surf every night for a set period to increase my collection of data to work with.

I need this data to craft a new Places Query API that is fast and well tested against a rather large collection of bookmarks and history. The URLs that JetCrawl uses are taken from the Alexa Top 100, so it is a common set to boot.

Automated tests will work better with this data, since the crawler hits predictable URLs and the actual Places APIs create the data in the first place.

If you want to try it out, please use a new profile. You can stop it by closing the tab. There is no UI as of yet.
