Akxl Labs C# ASP.NET Articles and Tutorials Akxl Labs Web Apps and Tools for Your Website

Creating a Tag Cloud - A Better Approach

Tagged as Code: ASP.NET, C#, Web
Presents a graphical explanation of an algorithm to create a Tag Cloud, with an example .NET User Control in ASP.NET C# that you can use on your website.

Posted March 21, 2007    Viewed 4928 times

Summary

I previously presented an algorithm for creating a tag cloud that used logarithmic curve fitting to set font sizes based on the number of occurrences of a tag. However, that approach is only appropriate for tags with a roughly normal distribution. On smaller sites, the common scenario is a few very popular tags and many tags that are rarely used. So, a more exact method of fitting font sizes is needed.

This article presents the algorithm for a more comprehensive method, and provides source code for a .NET User Control that can be used as an example, or as a ready-to-go tag cloud builder for your website.


Setting Font Sizes

Let's say that we want a tag cloud whose tags range in font size from 9px to 13px. That gives us five font sizes (9, 10, 11, 12 and 13px), centered on an average size of 11px.

So, we'll create buckets for each font size:

Bucket 0: font 9px
Bucket 1: font 10px
Bucket 2: font 11px
Bucket 3: font 12px
Bucket 4: font 13px

Now, we need to know which tags go in each bucket. So, let's look at the number of occurrences of each tag on the website. The more occurrences there are of a tag, the larger the font size.

For this example, let's say that the least popular tag has only 2 occurrences on the site and the most popular tag has 21, so we have a range of 19 occurrences (21 - 2) between the least and most popular tags. The minimum number of occurrences of any tag is therefore 2.

The minimum number of occurrences a tag must have to be in a particular bucket is then:

(bucket num. × (range + 1) ÷ num. of buckets) + minimum occurrences of any tag

The maximum number of occurrences a tag can have and still be in this particular bucket is:

((range + 1) ÷ num. of buckets) + minimum occurrences for bucket - 1

So, in this case, our computations for the buckets look like this:

Range + 1: 21 - 2 + 1 = 20
Num. of Buckets: 5
Minimum occurrences of any Tag: 2

Bucket Num.   Min. Occurrences to Be in Bucket   Max. Occurrences to Be in Bucket
0             (0 × 20 ÷ 5) + 2 = 2               20 ÷ 5 + 2 - 1 = 5
1             (1 × 20 ÷ 5) + 2 = 6               20 ÷ 5 + 6 - 1 = 9
2             (2 × 20 ÷ 5) + 2 = 10              20 ÷ 5 + 10 - 1 = 13
3             (3 × 20 ÷ 5) + 2 = 14              20 ÷ 5 + 14 - 1 = 17
4             (4 × 20 ÷ 5) + 2 = 18              20 ÷ 5 + 18 - 1 = 21
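
To double-check the arithmetic, the two formulas can be sketched in a few lines of code. This is only an illustration in Python, not the C# User Control from this article, and the function name bucket_bounds is made up for the example. It also assumes the range divides evenly by the number of buckets, as it does here; how to round when it does not is left open.

```python
def bucket_bounds(min_count, max_count, num_buckets):
    """Return one (low, high) occurrence range per bucket.

    Hypothetical helper: assumes (max_count - min_count + 1) divides
    evenly by num_buckets, as in the worked example.
    """
    span = max_count - min_count + 1       # "range + 1" in the text
    width = span / num_buckets             # occurrences covered by each bucket
    bounds = []
    for bucket in range(num_buckets):
        low = bucket * width + min_count   # minimum occurrences for this bucket
        high = width + low - 1             # maximum occurrences for this bucket
        bounds.append((low, high))
    return bounds


# Least popular tag: 2 occurrences, most popular: 21, five font sizes.
for bucket, (low, high) in enumerate(bucket_bounds(2, 21, 5)):
    print(bucket, low, high)
```

Running it with 2, 21 and 5 should give the same five ranges as the table, from 2 to 5 for bucket 0 up to 18 to 21 for bucket 4.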

Now, we can come up with a complete definition for our buckets. A tag's number of occurrences must fall within the range listed for a bucket for the tag to go in that bucket. Once we place a tag in a bucket, we know its font size.

Bucket 0: font 9px, occurrences 2 to 5
Bucket 1: font 10px, occurrences 6 to 9
Bucket 2: font 11px, occurrences 10 to 13
Bucket 3: font 12px, occurrences 14 to 17
Bucket 4: font 13px, occurrences 18 to 21
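
Going the other way, from a tag's occurrence count to its font size, amounts to inverting the minimum-occurrences formula. Here is a minimal sketch of that lookup, again in Python rather than the C# of the User Control. The names bucket_for and font_size_px and the sample tag counts are invented for the example, and the integer division is my assumption about rounding.

```python
def bucket_for(count, min_count, max_count, num_buckets):
    """Return the bucket number whose occurrence range contains count."""
    span = max_count - min_count + 1                  # "range + 1" in the text
    # Largest bucket whose minimum is <= count (assumed floor rounding).
    return (count - min_count) * num_buckets // span


def font_size_px(count, min_count, max_count, smallest_px=9, largest_px=13):
    """Map an occurrence count to a font size, one pixel per bucket."""
    num_buckets = largest_px - smallest_px + 1        # 9px..13px gives 5 buckets
    return smallest_px + bucket_for(count, min_count, max_count, num_buckets)


# Hypothetical tag counts, chosen to match the 2-to-21 example.
tags = {"asp.net": 21, "c#": 14, "web": 6, "css": 2}
low, high = min(tags.values()), max(tags.values())
for name, count in tags.items():
    print(name, font_size_px(count, low, high), "px")
```

With these sample counts, the 21-occurrence tag should land in bucket 4 (13px) and the 2-occurrence tag in bucket 0 (9px), which agrees with the bucket definitions above.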


.NET User Control Example Code

You can download and view my example code below. The code is a .NET User Control, which you should place in the App_Code directory of your website to run it.


The TagCloud class is responsible for creating the tag cloud, by implementing the algorithm above. The TagCloudData class represents a single tag and its number of occurrences on the website. An array of TagCloudData is the input to the TagCloud class.

The CloudHasher class is used to assign a bucket to a TagCloudData object. It overrides the GetHashCode(object) method derived from object, so that GetHashCode returns the bucket number that the TagCloudData object belongs to. This way, it is possible to build more complex data structures with TagCloudData objects, where the structure of the data will indicate the layout of the cloud. This is useful in reducing the work involved in creating a tag cloud if gathering the data to create the cloud is complex. A structure like this is not covered in the example, but the .NET SDK Documentation provides several good examples of how to use an overridden GetHashCode method to create useful data structures from templates.
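
To give a rough feel for the idea of letting the bucket number act as a key, here is a small sketch that groups tags by bucket in a dictionary. It is a Python analogy only, not the CloudHasher class or any .NET API; bucket_for, group_by_bucket and the sample tags are hypothetical names and data made up for the illustration.

```python
from collections import defaultdict


def bucket_for(count, min_count, max_count, num_buckets):
    """Return the bucket number for an occurrence count (assumed floor rounding)."""
    span = max_count - min_count + 1
    return (count - min_count) * num_buckets // span


def group_by_bucket(tag_counts, num_buckets):
    """Group tag names under the bucket number their count falls into."""
    low, high = min(tag_counts.values()), max(tag_counts.values())
    groups = defaultdict(list)
    for tag, count in tag_counts.items():
        groups[bucket_for(count, low, high, num_buckets)].append(tag)
    return dict(groups)


# Hypothetical data: the keys of the result are bucket numbers.
tags = {"asp.net": 21, "c#": 14, "web": 6, "css": 2, "xml": 3}
print(group_by_bucket(tags, 5))
```

Once the tags are grouped like this, every tag under the same key shares a font size, so the layout can be read straight off the structure.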

Comments & Feedback


Anonymous says:
Friday, July 27, 2007 @ 1:59 AM
http://www.akxl.net/labs/articles/11/default.aspx
-jitendra vagh
Anonymous says:
Friday, July 27, 2007 @ 2:00 AM
Thanx a lot. quite exact what we were looking for.
Artem says:
Saturday, April 19, 2008 @ 12:12 PM
But, how does it use with DataSource????
Adam says:
Tuesday, April 22, 2008 @ 6:02 PM
Artem, the DataSource property just takes any collection of TagCloud objects - so long as the collection implements IEnumerable. So an array of TagCloud objects (arrays implement IEnumerable), a List or ArrayList (both List and ArrayList implement IEnumerable), a hash table, or anything else is acceptable, as long as the collection contains TagCloud objects.

This is not a data-bound control. I chose not to use direct data binding because tag clouds are usually created by complicated database queries (after all they represent the distribution of an entire data set) and they change infrequently (or at least they are usually updated infrequently). Essentially, I felt that the common case was that a tag cloud would be generated out of a cache rather than out of live data binding more often than not. The array of TagCloud objects you create can just be dumped easily into the cache.

In the case of my own tag cloud, it is on a 15 minute cache so that it doesn't get queried thousands of times per day while it is only actually updated once per week at most.

Hope that helps.
This work is licensed under a Creative Commons Attribution 3.0 United States License.
Please link to this article in your source code comments if you use this content.
