Rails url_for is slow — some data comparison

There has been some discussion, though not a lot in print, of the fact that the Rails helper method url_for (which is also the basis for link_to and other frequently-used helpers) can be slow. This slowness stems from the process of generating routes, which takes longer as your routes.rb fills up.

As the Zvents site and traffic grow, we’re always looking for opportunities to improve performance. Our routes.rb has filled up a bit (nearly 40 entries), so I decided to do a little benchmarking to see if url_for was slowing us down, and if so, how much.

Here’s what I found.

I started with a page that has approximately 200 links generated with link_to (http://www.zvents.com/movies/theaters, though the number of links depends on your location). This is a lot of links, but that made it easier to run my tests. I wrapped each link_to call with a timing block, summed up the time, then took the average. Also, these tests were run on my development machine (MacBook Pro) in a Linux VM, under fairly heavy load.

Here’s the difference between calling link_to with a hash (causing a call to url_for) and calling it with a string (no call to url_for), averaged over roughly 1,000 measurements:

link_to WITH url_for: 6.8 ms

link_to WITHOUT url_for: 2.0 ms

That’s a big difference: almost 5 ms per link, or 3.4 times as long. The cumulative effect on a page like the one tested, with nearly 200 links, is almost one second of additional render time (roughly 4.8 ms × 200 links ≈ 960 ms).
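The measurement approach described above (wrap each call in a timing block, sum the times, take the average) can be sketched in plain Ruby. This is only an illustration, not the code used for the numbers in this post: average_ms and extra_page_ms are hypothetical helper names, and the two lambdas are stand-ins for the hash and string code paths rather than real Rails calls.

```ruby
require "benchmark"

# Time each call to the block separately, sum the times, and return
# the mean per call in milliseconds.
def average_ms(runs)
  total_seconds = 0.0
  runs.times { total_seconds += Benchmark.realtime { yield } }
  (total_seconds / runs) * 1000.0
end

# Extra render time for a whole page, in milliseconds, given the two
# per-link averages and the number of links on the page.
def extra_page_ms(with_ms, without_ms, link_count)
  (with_ms - without_ms) * link_count
end

# Hypothetical stand-ins for the two code paths (assumptions, not Rails APIs).
hash_style   = -> { { controller: "movies", action: "theaters" }.map { |k, v| "#{k}=#{v}" }.join("&") }
string_style = -> { "/movies/theaters" }

puts format("hash-style:   %.4f ms", average_ms(1_000, &hash_style))
puts format("string-style: %.4f ms", average_ms(1_000, &string_style))

# Using the averages reported in this post: (6.8 - 2.0) ms * 200 links.
puts format("extra per page: %.0f ms", extra_page_ms(6.8, 2.0, 200))
```

In a real app the lambdas would be replaced by the actual link_to calls, and the absolute numbers will depend heavily on the machine and environment.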

Obviously, the results will be very different on different machines, in different environments. I’d love to do a more general, scientific analysis, when I have some time. But for now, it’s just a little more information for you to use in trying to streamline your Rails apps.


General Performance and Scalability Rails Ruby

If you enjoyed this post, make sure you subscribe to my RSS feed!


Hypertable Podcast

Zvents software architect Doug Judd discussed a short code example from the soon-to-be-released open-source distributed data storage system, Hypertable, with LinuxWorld. In the podcast (Download: mp3 / iTunes) he explains the basics and high performance of Hypertable, which is designed to support applications requiring maximum performance, scalability, and reliability.

Since Hypertable is starting from the ground up, the podcast also covers the system’s goals, strengths, and limitations, with some insight into the state of the art in project infrastructure, including git for revision control, cmake for builds, and Google Code for issue tracking.

Listen now: mp3 / iTunes


High performance computing hypertable linuxworld open source podcast

Federated Local Search

Zvents continues to grow at exponential rates: we have signed The McClatchy Company, which brings local search and things to do to over 50 additional local sites such as Triangle.com, and we are indexing events, businesses, and performers as quickly as possible. Today’s big news, though, was our release of a unique, blended local search technology that combines all facets of local content into one search results page: business listings, movies, restaurants, events, sales, performers/celebrities (in town), sports, etc.

This is a tremendous advance for the local search industry, local businesses, and users, moving the space from business listing lookups to a single experience for discovering things to do and finding everything local. naturalsearchblog’s Chris Silver Smith has a nice comparison to Google’s Universal Search (for blended web results); Andrew Shotland, Website Magazine’s Pete Prestipino, and Greg Sterling also shared initial thoughts worth a read.

Federated local search is easily accessible throughout Zvents; the prominent search box allows users to query ‘what’ and ‘when’ (either or both) to discover things to do. IP mapping predefines user location making it easy to find things to do near them, while the clear, personalized City/State location or Advanced Search options make it easy to search another city. We’ve even blended our listings submission so you can quickly add a venue, event, or performer; a must do for local businesses who want to reach our millions of unique users through over 200 major partners.

We’re looking forward to many more enhancements to the local space throughout 2008; stay tuned!

 

 


blended federated google local local search Search serp universal universal search

The Google File System (and How It Can Be Improved)

Over the past several years, Google has built and deployed large scale distributed software infrastructure for the purpose of storing and processing the massive amount of data that they collect. Three of these systems are fundamental and underpin their entire business. They are: The Google File System, Map-Reduce, and Bigtable. They’ve described these systems in detail in three separate papers (see http://research.google.com/pubs/papers.html).

The GFS was designed to reliably store a massive amount of data (i.e. petabytes) in such a way that allows it to be efficiently processed. To achieve this goal in the most cost effective way possible, the GFS was designed to be run on large clusters of commodity hardware (i.e. the commodity PC and gigabit ethernet). The following list highlights some of the important features of GFS:

  • Global namespace. GFS provides a global namespace in the sense that any machine or process can connect to it and see the whole thing. It also runs entirely in user space and doesn’t require system administrator privileges to connect to it. NFS, on the other hand, requires you to mount many “islands” of filesystem into your own local filesystem requiring system administrator privileges to do so.
  • High Availability. Given the unreliable nature of commodity hardware, GFS has been designed to constantly check for and react to machine failures. High data availability is achieved through inter-machine replication. GFS breaks each file into chunks (default is 64MB) and each chunk is replicated across some number of machines (default is 3). Unlike RAID, it can withstand all types of machine failures (e.g. power supply, memory, network ports) as opposed to just disk failures. Chunks are also replicated inter-rack to protect against correlated failures caused by failures in a network switch or power circuit.
  • Efficient processing. By splitting data into 64MB chunks and distributing the chunks across a large cluster of machines, the data can be processed efficiently in parallel. This is achieved by pushing processing jobs to the machines that store the chunks and running the computation at the storage nodes in parallel. This has enormous benefit since the computation requires almost no network I/O to read the data that it is processing (see Map-reduce).
  • Atomic record append. Traditional Unix filesystems support the ability to open a file in O_APPEND mode, which instructs the system to append all writes to the end of the file. Unfortunately, most implementations have a race condition where concurrent appends can conflict with one another causing one to overwrite the other. GFS supports true atomic append where the client specifies only the data to be appended and the system will append it at least once, atomically, returning the offset of where it was written, back to the client. This operation is used heavily at Google and allows files to be used as producer/consumer queues with multiple producers.
  • Snapshot. A file or directory tree can be copied almost instantaneously with the snapshot operation. To achieve this fast copy, the system copies only the metadata and marks all of the underlying chunks copy-on-write. Only if a chunk gets modified will a new copy get created with the modification; otherwise the chunks are shared between the original files and the snapshotted files.

There are several aspects of the GFS design that could be improved given the context of Google’s usage of GFS today and the other distributed systems that have been built up around it.

The first improvement would be to get rid of the random write and support only record append. The GFS paper repeatedly emphasizes that the write workload Google typically sees consists of large streaming writes. Small random writes are almost non-existent. Since the paper was published, Google has developed a database system called Bigtable that sits on top of GFS. It is designed to efficiently handle small random updates by caching them in memory and periodically spilling data sequentially to the GFS. The number of applications that require small random updates, but where Bigtable is inappropriate, is effectively zero. By eliminating this feature, a considerable amount of complexity gets dropped from the system. Reduction of complexity in a software system generally leads to better quality and maintainability.

The next place where I see room for improvement is the fixed (64MB) chunk size. Most applications operate on fixed or variable sized records. If the file system is made aware of these record boundaries, then it could vary the size of its chunks to accommodate whole records only. With a fixed 64MB chunk size the last record in a chunk usually gets truncated, causing the application to read data from multiple locations to assemble the whole record. This isn’t such a big deal in many applications, but can cause scaling problems in the Map-reduce system. Map-reduce achieves much of its efficiency by pushing computation out to where the data is physically stored and running it locally. If, for every chunk, the last record in the chunk is truncated, then the system must fetch the other part of the record to process it, which often involves communicating with another node on the network. At scale and under load, every additional network round-trip negatively impacts performance. If the file system takes into account application record boundaries, then it can make more intelligent placement decisions and therefore reduce overall network load.
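To make the boundary problem concrete, here is a small Ruby sketch. It is a toy model under stated assumptions, not GFS code: records are packed back to back, all sizes are positive integers in arbitrary units, and the names straddling_records and record_aligned_chunks are hypothetical. The first function counts records that cross a fixed chunk boundary; the second shows one possible record-aware packing that starts a new chunk whenever the next record would not fit.

```ruby
# Count records that cross a fixed chunk boundary when records are
# packed back to back. Assumes every record size is positive.
def straddling_records(record_sizes, chunk_size)
  offset = 0
  record_sizes.count do |size|
    first_chunk = offset / chunk_size
    last_chunk  = (offset + size - 1) / chunk_size
    offset += size
    last_chunk != first_chunk
  end
end

# Record-aware packing: start a new chunk whenever the next record
# would not fit. A record larger than max_chunk_size gets a chunk of
# its own (which then exceeds the limit).
def record_aligned_chunks(record_sizes, max_chunk_size)
  chunks = [[]]
  used = 0
  record_sizes.each do |size|
    if used + size > max_chunk_size && !chunks.last.empty?
      chunks << []
      used = 0
    end
    chunks.last << size
    used += size
  end
  chunks
end

sizes = [40, 40, 40]
puts straddling_records(sizes, 64)
p record_aligned_chunks(sizes, 64)
```

Every straddling record in the first model is a record whose remainder may have to be fetched from another node; in the second model no record is ever split, at the cost of chunks that are not uniformly sized.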

The last place in GFS where I feel there could be an improvement is in its data consistency model. Under certain conditions where there are concurrent modifications at the same location, records can get duplicated, truncated, and/or padding (e.g. dead space) can get written into the file. GFS leaves it up to the application to write self-validating and self-identifying records to guard against these situations. This just does not seem right. It places a big onus on the application. I suspect that if GFS moved to a record-append-only model, it would be much easier to provide a fully consistent data model.

There are several efforts underway to build open source implementations of this distributed computing infrastructure. Zvents has been building an implementation of a Bigtable-like distributed database called Hypertable. Look for an announcement coming on this soon.

- Doug Judd


High performance computing


Add Zvents to Pageflakes

Pageflakes fans can bring Zvents to their startpage with a great flake from Wendy in our community of developers. The Zvents Event Search flake supports our what/when/where search for things to do and venues, with results sorted by date or relevance. My search for concerts in San Francisco this weekend turned up O.A.R. (a personal favorite for “That Was a Crazy Game of Poker”) at The Warfield and the jazzy folk pop of The Samples at Cafe-du-Nord.

A search in venues for “baseball” returns the obvious AT&T Park, home of the 2007 MLB All-Star Game, as well as the San Francisco Exploratorium. Why, you might ask? We’re discovering all sorts of things for you to do.

Add the Zvents Event Search flake to your startpage and start discovering things to do. By the way, the Zvents Developer API provides programmatic access to objects stored in Zvents, including events, venues, groups, and tags. Using the API is as simple as:

  1. Log in to your Zvents account and generate an API key
  2. Read the documentation
  3. Get hacking

If you have any suggestions for improvements to the APIs or have created an application, let us know. (Thanks, Wendy!)



Try our Google Mapplet, and speed up your own

I just finished putting together a Google Mapplet that helps you discover things to do while you are using Google Maps.

Mapplets are a new feature that lets you add Zvents event search and all sorts of other information to Google Maps itself, combining them in any way you want. Google is fixing a few remaining bugs in their mapplet code, so right now mapplets are available only on a developer preview page. We’ll have an official announcement when Google releases Mapplets on the main Google Maps page.

In the meantime, you are welcome to try out our Zvents mapplet, if you don’t mind the odd bug or two. You can install the mapplet by clicking the Add it to Maps button on this page, then click Discover things to do – Zvents under the Mapplets tab on the Google Maps page. I think it’s a lot of fun, and I hope you do too. We’d like to hear your feedback!

The rest of this post is for my fellow Google Mapplet developers. If you’ve worked with both the regular Google Maps API and the Mapplet API, you know that there’s one big difference between the two: many of the Maps API functions are replaced with special asynchronous versions. Because of the way mapplets work, it isn’t possible for an API function to return a value directly to its caller. Instead, you must provide a callback function which receives the value. For example, this Maps API code:

var center = map.getCenter();
// do something with the map center here

must be replaced with:

map.getCenterAsync( function( center ) {
    // do something with the map center here
});

This type of asynchronous callback should be familiar to any AJAX developer.

The problem starts when you need several pieces of information at once. To find events on the part of the map you are looking at, our mapplet uses the map boundaries and center along with the map size in pixels. My first attempt at the code looked like this:

map.getSizeAsync( function( size ) {
    map.getBoundsAsync( function( bounds ) {
        map.getCenterAsync( function( center ) {
            // Run a search using size, bounds, and center
        });
    });
});

That works fine, but it is rather slow. The code first asks for the pixel size and waits for it, then asks for the geographical boundaries and waits for them, then asks for the center and waits for it. That’s a lot of waiting around! And it gets that much worse if you need even more information from the map.

It would speed things up if there were a way to combine all of those asynchronous requests into one. What if the mapplet API worked like this:

GAsync( map, 'getSize', 'getBounds', 'getCenter',
    function( size, bounds, center ) {
        // Run a search using size, bounds, and center
    });

This GAsync() function would gather up all three pieces of map information and return them in a single callback. The code is simpler and easier to read too.

Interestingly enough, it turns out to be possible to implement GAsync() as a small layer on top of the existing Mapplet API functions and gain these benefits. I did just that, and it really improved the responsiveness of our mapplet.

When I told the Google Mapplet team about this, they liked the idea too, and they will include GAsync() in the next version of the mapplet API.

If you’re already working on a mapplet, you don’t have to wait. You can use GAsync() right now by including its source code directly in your mapplet. It’s a small function that you can paste into your code, making your mapplet code easier to write and faster too.

Full details and the GAsync() source code are in my blog post.



Happy Birthday, Zvents!

Zvents is two today. Congratulations to the team at our fast-growing company, and looking forward to three!




Open Source Search: Zvents Marries Heritrix with Hadoop

The rise of significant open-source search technology projects is one of the key reasons that startups like Zvents can compete with the major portals. Doug Cutting began this trend with the release of Lucene, and the public release of the Google MapReduce paper has now led to Hadoop, an open source implementation of Google’s GFS and Map-Reduce system led by Doug and the folks over at Yahoo.

The engineering team at the Internet Archive has built a web crawler as an open-source project called Heritrix. Heritrix provides a full set of features for running an Internet crawl. Current startup state of the art for high-volume web crawling is to combine Heritrix with Hadoop… the only problem being that the two aren’t really on speaking terms.

What are you going to do with all the Heritrix-crawled data once you’ve pulled it down? To do large-scale data mining on the crawled content, you must go through the inelegant and non-scalable process of writing to local disk and then copying into HDFS (Hadoop Distributed FileSystem). The Zvents engineering staff has developed an extension to Heritrix that allows it to crawl directly into HDFS, speeding up the process and making it much more reliable. A source code and binary distribution of this extension can be found at http://www.zvents.com/labs

We’re pleased to contribute this component to the community, and look forward to giving back more useful pieces of our effort to build the best local events search engine.



Refueling the Rocket: Zvents Raises $7 million in funding

Zvents launched at the Web 2.0 conference 13 months ago, and we’ve had an amazing year. We’ve built out big pieces of our product, and signed up our first major media partners. At every step, I’ve been proud of our team for making great things happen on limited resources.

As of today, our resources are no longer quite so limited. We’re very pleased that VantagePoint Partners, and our new board member David Carlick, are joining NetService Ventures and Red Rock in buying in to our vision and our potential. With this funding, we’ll be taking Zvents nationwide, and bringing in more great talent to join our growing team. The first of those new team members also made the official press release – Gordon Rios has joined us as CTO, coming from Yahoo! Search. Gordon will be leading the technical team that will implement our vision for the next generation of local search – truly helping people find out what is going on in their neighborhood and their city.

It’s all happening at Zvents – new partners and customers, new metro areas, new users, new employees, and whole new problems to solve. Little things that make our lives easier are also happening, like moving from our 1-bedroom apartment (for 12 people!) to an actual office.

It’s been a great year, and we’re all looking forward to what comes next.

-Ethan


fund raising funding General investors zvents

Rails Plugin: Extended Fragment Cache

The extended_fragment_cache plugin provides content interpolation and an in-process memory cache for fragment caching. It also integrates the features of Yan Pritzker’s memcache_fragments plugin since they both operate on the same methods.

Installation

1. This plugin requires that the memcache-client gem is installed.

   # gem install memcache-client

2. Install the plugin OR the gem

   $ script/plugin install svn://rubyforge.org/var/svn/zventstools/projects/extended_fragment_cache
   - OR -
   # gem install extended_fragment_cache

In-Process Memory Cache for Fragment Caching

Fragment caching has a slight inefficiency that requires two lookups to the fragment cache store to render a single cached fragment. The two cache lookups are:

1. The read_fragment method invoked in a controller to determine if a fragment has already been cached. e.g.,

     unless read_fragment("/x/y/z")
      ...
     end

2. The cache helper method invoked in a view that renders the fragment. e.g.,

     <% cache("/x/y/z") do %>
       ...
     <% end %>

This plugin adds an in-process cache that saves the value retrieved from the fragment cache store. The in-process cache has two benefits:

1. It cuts in half the number of read requests sent to the fragment cache store. This can result in a considerable saving for sites that make heavy use of memcached.

2. Retrieving the fragment from the in-process cache is faster than going to the fragment cache store. On a typical dev box the savings are relatively small, but they would be noticeable in a standard production environment using memcached (where the fragment cache is often running on another machine).

Peter Zaitsev has a great post comparing the latencies of different cache types on the MySQL Performance blog: http://www.mysqlperformanceblog.com/2006/08/09/cache-performance-comparison/

The plugin automatically installs a before_filter on the ApplicationController that flushes the in-process memory cache at the start of every request.
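The idea can be sketched in plain Ruby, outside Rails. This is a minimal illustration and not the plugin’s actual source: CountingStore is a hypothetical stand-in for a remote store such as memcached, and InProcessFragmentCache is an assumed name for a per-request wrapper around it.

```ruby
# Hypothetical stand-in for a remote fragment store; counts reads so
# the saving is visible.
class CountingStore
  attr_reader :reads

  def initialize(data = {})
    @data = data
    @reads = 0
  end

  def read(key)
    @reads += 1
    @data[key]
  end

  def write(key, value)
    @data[key] = value
  end
end

# Remembers every value read or written during a request so that a
# second lookup of the same key never reaches the store.
class InProcessFragmentCache
  def initialize(store)
    @store = store
    @local = {}
  end

  def read(key)
    return @local[key] if @local.key?(key)
    @local[key] = @store.read(key)
  end

  def write(key, value)
    @local[key] = value
    @store.write(key, value)
  end

  # Call at the start of every request so stale values are not reused.
  def flush
    @local.clear
  end
end

store = CountingStore.new({ "/x/y/z" => "<p>cached fragment</p>" })
cache = InProcessFragmentCache.new(store)
cache.read("/x/y/z") # controller-side check
cache.read("/x/y/z") # view-side render
puts store.reads
```

Flushing per request keeps the in-process copy from outliving an expiry in the shared store, which is why the flush is tied to the start of each request.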

Content Interpolation for Fragment Caching

Many modern websites mix a lot of static and dynamic content. The more dynamic content you have in your site, the harder it becomes to implement caching. In an effort to scale, you’ve implemented fragment caching all over the place. Fragment caching can be difficult if your static content is interleaved with your dynamic content. Your views become littered with cache calls, which not only hurts performance (multiple calls to the cache backend) but also makes them harder to read. Content interpolation allows you to substitute dynamic content into a cached fragment.

Take this example view:

<% cache("/first_part") do %>
  This content is very expensive to generate, so let's fragment cache it.<br/>
<% end %>
<%= Time.now %><br/>
<% cache("/second_part") do %>
  This content is also very expensive to generate.<br/>
<% end %>

We can replace it with:

<% cache("/only_part", {}, {"__TIME_GOES_HERE__" => Time.now}) do %>
  This content is very expensive to generate, so let's fragment cache it.<br/>
  __TIME_GOES_HERE__<br/>
  This content is also very expensive to generate.<br/>
<% end %>

The latter is easier to read and induces less load on the cache backend.

We use content interpolation at Zvents to speed up our JSON methods. Converting objects to JSON representation is notoriously slow. Unfortunately, in our application, each JSON request must return some unique data. This makes caching tedious because 99% of the content returned is static for a given object, but there’s a little bit of dynamic data that must be sent back in each response. Using content interpolation, we cache the object in JSON format and substitute the dynamic values in the view.
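The substitution step itself is simple. Here is a minimal sketch in plain Ruby, assuming the cached fragment is a string and the substitutions are a hash of placeholder to value; the interpolate helper and the sample JSON keys are hypothetical and are not taken from the plugin’s source.

```ruby
require "json"

# Replace every occurrence of each placeholder in a cached fragment
# with its current value. The block form of gsub inserts the value
# literally, so backslashes in the value are not treated as escapes.
def interpolate(fragment, substitutions)
  substitutions.reduce(fragment) do |text, (placeholder, value)|
    text.gsub(placeholder) { value.to_s }
  end
end

# A cached JSON fragment with one dynamic slot (illustrative data only).
cached_json = JSON.generate({ "name" => "example venue", "generated_at" => "__TIME_GOES_HERE__" })

puts interpolate(cached_json, { "__TIME_GOES_HERE__" => Time.now })
```

The expensive part (building the JSON) happens once and is cached; only the cheap string substitution runs on each request. Placeholders need to be strings that cannot occur in the static content.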

This plugin integrates Yan Pritzker’s extension that allows content to be cached with an expiry time (from the memcache_fragments plugin) since they both operate on the same method. This allows you to do things like:

<% cache("/only_part", {:expire => 15.minutes}) do %>
  This content is very expensive to generate, so let's fragment cache it.
<% end %>

Bugs, Code and Contributing

There’s a RubyForge project set up at:

http://rubyforge.org/projects/zventstools/

Anonymous SVN access:

$ svn checkout svn://rubyforge.org/var/svn/zventstools

-Tyler

