
Try our Google Mapplet, and speed up your own

I just finished putting together a Google Mapplet that helps you discover things to do while you are using Google Maps.

Mapplets are a new feature that lets you add Zvents event search and all sorts of other information to Google Maps itself, combining them in any way you want. Google is fixing a few remaining bugs in their mapplet code, so right now mapplets are available only on a developer preview page. We’ll have an official announcement when Google releases Mapplets on the main Google Maps page.

In the meantime, you are welcome to try out our Zvents mapplet, if you don’t mind the odd bug or two. You can install the mapplet by clicking the Add it to Maps button on this page, then clicking Discover things to do – Zvents under the Mapplets tab on the Google Maps page. I think it’s a lot of fun, and I hope you do too. We’d like to hear your feedback!

The rest of this post is for my fellow Google Mapplet developers. If you’ve worked with both the regular Google Maps API and the Mapplet API, you know that there’s one big difference between the two: many of the Maps API functions are replaced with special asynchronous versions. Because of the way mapplets work, it isn’t possible for an API function to return a value directly to its caller. Instead, you must provide a callback function which receives the value. For example, this Maps API code:

var center = map.getCenter();
// do something with the map center here

must be replaced with:

map.getCenterAsync( function( center ) {
    // do something with the map center here
});

This type of asynchronous callback should be familiar to any AJAX developer.

The problem starts when you need several pieces of information at once. To find events on the part of the map you are looking at, our mapplet uses the map boundaries and center along with the map size in pixels. My first attempt at the code looked like this:

map.getSizeAsync( function( size ) {
    map.getBoundsAsync( function( bounds ) {
        map.getCenterAsync( function( center ) {
            // Run a search using size, bounds, and center
        });
    });
});

That works fine, but it is rather slow. The code first asks for the pixel size and waits for it, then asks for the geographical boundaries and waits for them, then asks for the center and waits for it. That’s a lot of waiting around! And it gets that much worse if you need even more information from the map.

It would speed things up if there were a way to combine all of those asynchronous requests into one. What if the mapplet API worked like this:

GAsync( map, 'getSize', 'getBounds', 'getCenter',
    function( size, bounds, center ) {
        // Run a search using size, bounds, and center
    });

This GAsync() function would gather up all three pieces of map information and return them in a single callback. The code is simpler and easier to read too.

Interestingly enough, it turns out to be possible to implement GAsync() as a small layer on top of the existing Mapplet API functions and gain these benefits. I did just that, and it really improved the responsiveness of our mapplet.

When I told the Google Mapplet team about this, they liked the idea too, and they will include GAsync() in the next version of the mapplet API.

If you’re already working on a mapplet, you don’t have to wait. You can use GAsync() right now by including its source code directly in your mapplet. It’s a small function that you can paste into your code, making your mapplet code easier to write and faster too.

Full details and the GAsync() source code are in my blog post.


If you enjoyed this post, make sure you subscribe to my RSS feed!

Comments

Happy Birthday, Zvents!

Zvents is two today. Congratulations to the team at our fast-growing company, and looking forward to three!


Open Source Search: Zvents Marries Heritrix with Hadoop

The rise of significant open-source search technology projects is one of the key reasons that startups like Zvents can compete with the major portals. Doug Cutting began this trend with the release of Lucene, and the public release of the Google MapReduce paper has now led to Hadoop, an open-source implementation of Google’s GFS and MapReduce system led by Doug and the folks over at Yahoo.

The engineering team at the Internet Archive has built a web crawler as an open-source project called Heritrix. Heritrix provides a full set of features for running an Internet crawl. The current state of the art for startups doing high-volume web crawling is to combine Heritrix with Hadoop. The only problem is that the two aren’t really on speaking terms.

What are you going to do with all the Heritrix-crawled data once you’ve pulled it down? To do large-scale data mining on the crawled content, you must go through the inelegant and non-scalable process of writing to local disk and then copying into HDFS (Hadoop Distributed File System). The Zvents engineering staff has developed an extension to Heritrix that allows it to crawl directly into HDFS, speeding up the process and making it much more reliable. A source code and binary distribution of this extension can be found at http://www.zvents.com/labs

We’re pleased to contribute this component to the community, and look forward to giving back more useful pieces of our effort to build the best local events search engine.



Refueling the Rocket: Zvents Raises $7 million in funding

Zvents launched at the Web 2.0 conference 13 months ago, and we’ve had an amazing year. We’ve built out big pieces of our product, and signed up our first major media partners. At every step, I’ve been proud of our team for making great things happen on limited resources.

As of today, our resources are no longer quite so limited. We’re very pleased that VantagePoint Partners, and our new board member David Carlick, are joining NetService Ventures and Red Rock in buying in to our vision and our potential. With this funding, we’ll be taking Zvents nationwide, and bringing in more great talent to join our growing team. The first of those new team members also made the official press release – Gordon Rios has joined us as CTO, coming from Yahoo! Search. Gordon will be leading the technical team that will implement our vision for the next generation of local search – truly helping people find out what is going on in their neighborhood and their city.

It’s all happening at Zvents – new partners and customers, new metro areas, new users, new employees, and whole new problems to solve. Little things that make our lives easier are also happening, like moving from our 1-bedroom apartment (for 12 people!) to an actual office.

It’s been a great year, and we’re all looking forward to what comes next.

-Ethan



Rails Plugin: Extended Fragment Cache

The extended_fragment_cache plugin provides content interpolation and an in-process memory cache for fragment caching. It also integrates the features of Yan Pritzker’s memcache_fragments plugin since they both operate on the same methods.

Installation

1. This plugin requires that the memcache-client gem is installed.

   # gem install memcache-client

2. Install the plugin OR the gem

   $ script/plugin install svn://rubyforge.org/var/svn/zventstools/projects/extended_fragment_cache
   - OR -
   # gem install extended_fragment_cache

In-Process Memory Cache for Fragment Caching

Fragment caching has a slight inefficiency that requires two lookups to the fragment cache store to render a single cached fragment. The two cache lookups are:

1. The read_fragment method invoked in a controller to determine if a fragment has already been cached. e.g.,

     unless read_fragment("/x/y/z")
      ...
     end

2. The cache helper method invoked in a view that renders the fragment. e.g.,

     <% cache("/x/y/z") do %>
       ...
     <% end %>

This plugin adds an in-process cache that saves the value retrieved from the fragment cache store. The in-process cache has two benefits:

1. It cuts in half the number of read requests sent to the fragment cache store. This can result in a considerable saving for sites that make heavy use of memcached.

2. Retrieving the fragment from the in-process cache is faster than going to the fragment cache store. On a typical dev box, the savings are relatively small, but they would be noticeable in a standard production environment using memcached (where the fragment cache is often running on another machine).

Peter Zaitsev has a great post comparing the latencies of different cache types on the MySQL Performance blog: http://www.mysqlperformanceblog.com/2006/08/09/cache-performance-comparison/

The plugin automatically installs a before_filter on the ApplicationController that flushes the in-process memory cache at the start of every request.
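The idea can be sketched in plain Ruby. This is a minimal illustration, not the plugin's actual code: CountingStore and InProcessFragmentCache are hypothetical names, and the store here is an ordinary hash standing in for memcached.

```ruby
# Hypothetical stand-in for a remote fragment store such as memcached.
# It counts reads so the saving is visible.
class CountingStore
  attr_reader :reads

  def initialize(data)
    @data = data
    @reads = 0
  end

  def read(key)
    @reads += 1
    @data[key]
  end
end

# Illustrative per-request layer in front of the store.
class InProcessFragmentCache
  def initialize(store)
    @store = store
    @local = {}
  end

  # Return the fragment, asking the backing store at most once per key
  # between flushes.
  def read(key)
    return @local[key] if @local.key?(key)
    @local[key] = @store.read(key)
  end

  # Meant to be called at the start of every request, so fragments never
  # leak from one request into the next.
  def flush
    @local.clear
  end
end

store = CountingStore.new("/x/y/z" => "<p>cached</p>")
cache = InProcessFragmentCache.new(store)

cache.read("/x/y/z")  # controller-side check: goes to the store
cache.read("/x/y/z")  # view-side render: served from process memory
puts store.reads      # number of reads that actually reached the store
```

Flushing per request keeps the sketch simple: the in-process copy only has to stay valid for the lifetime of one request, so no expiry logic is needed.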

Content Interpolation for Fragment Caching

Many modern websites mix a lot of static and dynamic content. The more dynamic content you have in your site, the harder it becomes to implement caching. In an effort to scale, you’ve implemented fragment caching all over the place. Fragment caching can be difficult if your static content is interleaved with your dynamic content. Your views become littered with cache calls, which not only hurts performance (multiple calls to the cache backend) but also makes them harder to read. Content interpolation allows you to substitute dynamic content into a cached fragment.

Take this example view:

<% cache("/first_part") do %>
  This content is very expensive to generate, so let's fragment cache it.<br/>
<% end %>
<%= Time.now %><br/>
<% cache("/second_part") do %>
  This content is also very expensive to generate.<br/>
<% end %>

We can replace it with:

<% cache("/only_part", {}, {"__TIME_GOES_HERE__" => Time.now}) do %>
  This content is very expensive to generate, so let's fragment cache it.<br/>
  __TIME_GOES_HERE__<br/>
  This content is also very expensive to generate.<br/>
<% end %>

The latter is easier to read and induces less load on the cache backend.

We use content interpolation at Zvents to speed up our JSON methods. Converting objects to JSON representation is notoriously slow. Unfortunately, in our application, each JSON request must return some unique data. This makes caching tedious because 99% of the content returned is static for a given object, but there’s a little bit of dynamic data that must be sent back in each response. Using content interpolation, we cache the object in JSON format and substitute the dynamic values in the view.
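Here is a rough sketch of the substitution step on its own, outside of Rails. The interpolate helper, the __VIEWER__ placeholder and the sample JSON are all made up for illustration; the plugin does the equivalent work inside the cache helper.

```ruby
# Hypothetical helper: substitute per-request values into a fragment that
# has already been pulled from the cache. Not the plugin's API.
def interpolate(fragment, substitutions)
  substitutions.inject(fragment) do |text, (placeholder, value)|
    # Block form of gsub, so backslashes in the value are not treated as
    # backreferences.
    text.gsub(placeholder) { value.to_s }
  end
end

# Sample data invented for this example: the expensive-to-build JSON is
# cached once, with a placeholder where the dynamic value belongs.
cached_json = '{"id": 42, "name": "Jazz Night", "viewer": "__VIEWER__"}'

puts interpolate(cached_json, "__VIEWER__" => "tyler")
```

Note that this sketch inserts values as-is. When the fragment is JSON, a dynamic value containing quotes or backslashes would need to be escaped before substitution.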

This plugin integrates Yan Pritzker’s extension that allows content to be cached with an expiry time (from the memcache_fragments plugin) since they both operate on the same method. This allows you to do things like:

<% cache("/only_part", {:expire => 15.minutes}) do %>
  This content is very expensive to generate, so let's fragment cache it.
<% end %>

Bugs, Code and Contributing

There’s a RubyForge project set up at:

http://rubyforge.org/projects/zventstools/

Anonymous SVN access:

$ svn checkout svn://rubyforge.org/var/svn/zventstools

-Tyler



Rails Plugin: Association Collection Tools

Any time you use an ORM you need to know that you are often sacrificing performance for convenience and developer efficiency. In general, this is a good thing. I agree with the theory espoused by DHH that developer productivity is often more valuable than machine performance. At least, I certainly agree with it in the early stages of development. Once you get to a certain scale, however, there are cases where you’ll need to write your own code that bypasses the ORM in the name of performance. Here’s a Rails plugin that does an end-run around the ORM for some basic operations in the form of these three methods:

  1. fast_copy
    A method called fast_copy is added to has_and_belongs_to_many association collections that makes the process of cloning HABTM associations MUCH more efficient. Simply replace person1.items = person2.items with person1.items.fast_copy(person2) and your database, network and RAM will thank you. See below for more details.
  2. fast_add
    fast_add operates just like fast_copy, but instead of replacing the existing objects in the association, it appends new objects to the association.
  3. ids
    A method called ids is added to has_many and has_and_belongs_to_many association collections. It returns the list of object ids in the association collection without unnecessarily instantiating the objects.

Installation

  1. Install the plugin OR the gem
       $ script/plugin install svn://rubyforge.org/var/svn/zventstools/projects/association_collection_tools
       - OR -
       # gem install association_collection_tools

HABTM Fast Copy

Copies a HABTM association collection from one object to another without instantiating a bunch of ActiveRecord objects. This is faster than the standard assignment operation since:

  1. It eliminates the massive number of SQL calls used in a standard HABTM copy by changing it from an O(n) operation to O(1), where n is the number of objects in the association collection.
  2. It transfers only object IDs to and from the database instead of all object attributes, resulting in less work for the database, less data transferred and less memory used in Ruby.
  3. It doesn’t instantiate ActiveRecord objects in memory.

A normal HABTM copy (e.g., person2.items = person1.items) results in the following SQL calls.

SELECT * FROM items INNER JOIN items_people ON items.id = items_people.item_id WHERE (items_people.person_id = 1 )
SELECT * FROM items INNER JOIN items_people ON items.id = items_people.item_id WHERE (items_people.person_id = 2 )
DELETE FROM items_people WHERE person_id = 2 AND item_id IN (4)
INSERT INTO items_people (`item_id`, `person_id`) VALUES (1, 2)
INSERT INTO items_people (`item_id`, `person_id`) VALUES (2, 2)
INSERT INTO items_people (`item_id`, `person_id`) VALUES (3, 2)

Notice that:

  • items AR objects are instantiated unnecessarily (especially since person2.items are about to be deleted)
  • 1 SQL call is issued for each object (item) in the association collection (items_people)

whereas person.items.fast_copy will result in the following SQL calls, greatly reducing the impact on the database and on Ruby memory utilization.

DELETE FROM items_people WHERE person_id = 2
SELECT item_id FROM items_people WHERE person_id = 1
REPLACE INTO items_people (person_id,item_id) VALUES (2,3),(2,2),(2,1)
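To make the O(1) claim concrete, here is a small sketch of how the write side could be assembled once the source ids have been fetched by the SELECT. fast_copy_sql is a hypothetical helper written for this post, not the plugin's internals; the table and column names are the ones from the example above.

```ruby
# Hypothetical helper: build the write statements for a fast copy.
# Ids are coerced with Integer() so only integers are ever inlined.
def fast_copy_sql(target_id, item_ids)
  target = Integer(target_id)
  statements = ["DELETE FROM items_people WHERE person_id = #{target}"]

  rows = item_ids.map { |item_id| "(#{target},#{Integer(item_id)})" }
  unless rows.empty?
    # One multi-row statement, however many ids there are.
    statements << "REPLACE INTO items_people (person_id,item_id) VALUES #{rows.join(',')}"
  end

  statements
end

puts fast_copy_sql(2, [3, 2, 1])
```

The number of statements stays at two whether the collection holds three ids or three thousand; only the length of the VALUES list grows.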

Here are some benchmarks:

when n = 10 and 26 objects in e2.groups:

Benchmark.bm do |x|
 x.report { for i in 1..n; e1.groups.clear;e1.groups = e2.groups;end }
 x.report { for i in 1..n; e1.groups.clear;e1.groups.fast_copy(e2);end }
end

    user     system      total        real
1.140000   0.040000   1.180000 (  1.832122)
0.020000   0.010000   0.030000 (  0.125368)

when n = 100 and 26 objects in e2.groups:

     user     system      total        real
11.140000   0.360000  11.500000 ( 18.171410)
 0.140000   0.010000   0.150000 (  2.368200)
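For a quick read on what those tables mean, this snippet divides the "real" (wall-clock) columns. The numbers are copied from the two tables above; nothing else is assumed.

```ruby
# "real" (wall-clock) seconds copied from the benchmark tables above.
runs = {
  10  => { assignment: 1.832122,  fast_copy: 0.125368 },
  100 => { assignment: 18.171410, fast_copy: 2.368200 }
}

runs.each do |n, t|
  speedup = t[:assignment] / t[:fast_copy]
  puts format("n = %d: fast_copy is %.1fx faster (wall clock)", n, speedup)
end
```

That works out to roughly 15x at n = 10 and roughly 8x at n = 100 in wall-clock terms.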

This method also supports HABTM join tables with additional attributes. Simply pass in an attribute hash as the second argument and it will add the attributes to the records it creates in the join table.

e.g., person1.items.fast_copy(person2, {:created_at => Time.now})

REALITY CHECK: The HABTM docs refer to collection_singular_ids=ids which implies identical functionality, but I can’t find mention of this method in anything other than the documentation. Maybe this actually already exists and I’m just blind, but from the looks of http://dev.rubyonrails.org/ticket/2917, it appears that it is a documentation bug.

HABTM Fast Add

fast_add operates just like fast_copy, but instead of replacing the existing objects in the association, it appends new objects to the association.

e.g., person1.items.fast_add([1,2], {:user_id => user_id})

HABTM and has_many ids

Return the list of IDs in this association collection without unnecessarily instantiating a bunch of Active Record objects. What good is the id of an object without the object itself? If you think about it for a while, you’re bound to come up with many uses, especially if you write a lot of SQL by hand. For instance, the fast_copy command documented above uses this method to return an id list without instantiating AR objects. The potential savings are enormous when you’re dealing with hundreds or thousands of objects at a time.

Bugs, Code and Contributing

There’s a RubyForge project set up at:

http://rubyforge.org/projects/zventstools/

Anonymous SVN access:

$ svn checkout svn://rubyforge.org/var/svn/zventstools

-Tyler



Rails Plugin: MemCacheClient Extensions

The memcache-client_extensions plugin adds three new commands to the memcache client API:

1. get_multi : retrieve more than 1 key in parallel

2. stats : retrieve server performance and utilization statistics

3. flush_all : empty all information stored in memcached

Installation

1. This plugin requires that the memcache-client gem is installed.

   # gem install memcache-client

2. Install the plugin or the gem

   $ script/plugin install svn://rubyforge.org/var/svn/zventstools/projects/memcache-client_extensions
   - OR -
   # gem install memcache-client_extensions

get_multi

Retrieve multiple values from memcached in parallel, if possible. The memcached protocol supports the ability to retrieve multiple keys in a single request. Pass in an array of keys to this method and it will:

  a. map the key to the appropriate memcached server
  b. send a single request to each server that has one or more key values

Returns a hash of values.
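The key-to-server grouping can be sketched like this. It is an illustration only: server_for and group_keys_by_server are hypothetical helpers, the server names are made up, and CRC32 modulo the server count is an assumption for the sketch rather than a statement of how memcache-client picks servers.

```ruby
require 'zlib'

# Pick a server for a key. CRC32 modulo the server count is an assumption
# made for this sketch; the real client may hash keys differently.
def server_for(key, servers)
  servers[Zlib.crc32(key) % servers.size]
end

# Group keys by server so each server receives one multi-key request.
def group_keys_by_server(keys, servers)
  keys.group_by { |key| server_for(key, servers) }
end

servers = ["cache1:11211", "cache2:11211"]  # made-up server list
requests = group_keys_by_server(%w[a b c d e], servers)

requests.each do |server, keys|
  # The memcached text protocol accepts several keys in one get command.
  puts "#{server}: get #{keys.join(' ')}"
end
```

However the keys happen to be distributed, the number of round trips is bounded by the number of servers rather than the number of keys.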

>> CACHE["a"] = 1
=> 1
>> CACHE["b"] = 2
=> 2
>> CACHE.get_multi(["a","b"])
=> {"a"=>1, "b"=>2}

Here’s a benchmark showing the speedup:

CACHE["a"] = 1
CACHE["b"] = 2
CACHE["c"] = 3
CACHE["d"] = 4
keys = ["a","b","c","d","e"]
Benchmark.bm do |x|
  x.report { for i in 1..1000; keys.each{|k| CACHE.get(k);} end }
  x.report { for i in 1..1000; CACHE.get_multi(keys); end }
end

returns:

     user     system      total        real
 0.180000   0.130000   0.310000 (  0.459418)
 0.200000   0.030000   0.230000 (  0.269632)

There’s a fair amount of non-DRY between get_multi and get (and threadsafe_cache_get/multi_threadsafe_cache_get and cache_get/multi_cache_get for that matter) but I think it’s worth it since the extra overhead to handle multiple return values is unneeded for a single-key get (which is by far the most common case).

stats

The stats method returns statistics for each memcached server. An explanation of the statistics can be found in the memcached docs.

Example:

>> CACHE.stats
=> {"localhost:11211"=>{"pid"=>"20188", "bytes"=>"4718", "connection_structures"=>"4", "time"=>
"1162278121", "pointer_size"=>"32", "limit_maxbytes"=>"67108864", "version"=>"1.2.0", "cmd_get"=>
"14532", "cmd_set"=>"32", "bytes_written"=>"432583", "uptime"=>"1557", "curr_items"=>"4",
"curr_connections"=>"3", "total_connections"=>"19", "get_misses"=>"0", "rusage_user"=>
"0.119981", "rusage_system"=>"0.313952", "total_items"=>"32", "get_hits"=>"14532", "bytes_read"
=>"190619"}}

flush_all

The flush_all method empties all cache namespaces on all memcached servers. This method is very useful for testing your code with memcached since you normally want to reset the cache to a known (empty) state at the beginning of each test.

Bugs, Code and Contributing

There’s a RubyForge project set up at:

http://rubyforge.org/projects/zventstools/

Anonymous SVN access:

$ svn checkout svn://rubyforge.org/var/svn/zventstools

-Tyler



Updates to the REST API

We’ve been gradually adding more functionality to our web services interface. A few changes have accumulated over the last 4-5 weeks, so it’s about time for an overview of what’s new.

Category methods: You may have noticed that we added event categories when we rolled out version 2.0 of Zvents. Events can belong to one or more categories which can be used to find events using search. The following methods let you access and use event categories through the REST interface:

  • /event_categories
  • /categories
  • /search
  • /search_for_venues
  • /search_within_group

Search Information: There are many optional parameters when you search on Zvents either through the front-end or through the REST API. These optional parameters all assume default values in the absence of an explicit value. For instance, if you don’t supply a value in the where field, the search is run against all future events. Until now, you would have to write some timestamping code in your client application to figure out the exact time range of the search.

To make your life easier, searches through the REST API now return a lot of relevant information in the search_info construct.

Repeating Events: We’ve also made it a lot easier for you to handle repeating events within search results. In the past, it was difficult (and in some cases impossible) to identify events within the same repeat pattern. We’ve added three features to address this issue:

  • series_count: This field tells you the number of events within a repeating series that match a search.
  • parent_id: This field identifies the ID of the parent event of the repeat series. All events with the same parent_id belong to the same series.
  • rec flag: Use the rec flag on a search to list the id and times of all events within the repeating series that match the search. You probably want to turn the trim flag on if you use this option, otherwise you’ll get redundant data back. The rec flag just returns the data in a format that’s easier to digest.

User methods: All user methods can now be invoked by user name. In the past, you had to supply a user id. It was always relatively easy to look up a user id, but this change saves you a little bit of time and effort.

As always, documentation can be found at http://www.zvents.com/rest


Happy hacking.
-Tyler



Zvents 2.0 Launches!

Yup, we’re no longer in delta.

What’s new and improved about this version of Zvents?

  • Improved search relevance—when you’ve got thousands of events in a metro area, relevance becomes a key to happy users. Our results are materially better than before, and we’re committed to continuous improvement for relevance.
  • Category navigation – refine or browse to events by category, reducing time and effort to find what you’re looking for.
  • Embeddable calendars, now in version 2, can be styled with CSS and are vastly improved
  • Much faster, lighter code – less than 1/2 the size – means faster response times.
  • Improved navigation and look and feel

You’ll also notice that we’ve got fewer features, not more. We’re running against the “features are king” silliness of Web 2.0 here, but we think that great design means focusing on what matters most.

Coming in the next 30 days – major customer announcements, plus bonus fun stuff!

-Tyler



Ethan’s Interview at Under the Radar

We recently appeared at IBDN’s Under the Radar conference. It was a great experience for us—the most fun was being up on stage with CalendarHub, Mosuki, and Skobee, all of whom said they want to take events and advertising from Zvents. I was interviewed in a podcast by Cathy Brooks of the GuideWire Group, speaking about both Zvents and Web 2.0 in general.

-Shane

