Presents a graphical explanation of an algorithm to create a Tag Cloud, with an example .NET User Control in ASP.NET C# that you can use on your website.
I previously presented an algorithm for creating a tag cloud that used logarithmic curve fitting to set font sizes
based on the number of occurrences of a tag. However, that approach is only appropriate for tags with a roughly
normal distribution. For smaller sites, the common scenario is to have a few very popular tags and many tags that
are rarely used, so a more exact method of fitting font sizes is needed.
This article presents the algorithm for a more comprehensive method, and provides source code for a .NET User Control that
can be used as an example, or as a ready-to-go tag cloud builder for your website.
Let's say that we want a tag cloud whose tags range in font size from 9px to 13px. That gives us five font sizes
(a spread of 4px), centered on an average size of 11px.
So, we'll create a bucket for each font size:
| Bucket | Font size |
| --- | --- |
| 0 | 9px |
| 1 | 10px |
| 2 | 11px |
| 3 | 12px |
| 4 | 13px |
Now, we need to know which tags go in each bucket. So, let's look at the number of occurrences of each tag on the website.
The more occurrences there are of a tag, the larger the font size.
For this example, let's say that the least popular tag has only 2 occurrences on the site and the most popular tag has 21 occurrences, so
the range between the least and most popular tags is 21 - 2 = 19 occurrences, and the minimum occurrences of any tag is 2.
The minimum number of occurrences a tag must have to be in a particular bucket is then:
(bucket num. × (range + 1) ÷ num. of buckets) + minimum occurrences of any tag
The maximum number of occurrences a tag can have and still be in this particular bucket is:
((range + 1) ÷ num. of buckets) + minimum occurrences for bucket - 1
So, in this case, our computations for the buckets look like this:
Range + 1: 21 - 2 + 1 = 20
Num. of Buckets: 5
Minimum occurrences of any Tag: 2
| Bucket Num. | Min. Occurrences to be in Bucket | Max. Occurrences to be in Bucket |
| --- | --- | --- |
| 0 | (0 × 20 ÷ 5) + 2 = 2 | 20 ÷ 5 + 2 - 1 = 5 |
| 1 | (1 × 20 ÷ 5) + 2 = 6 | 20 ÷ 5 + 6 - 1 = 9 |
| 2 | (2 × 20 ÷ 5) + 2 = 10 | 20 ÷ 5 + 10 - 1 = 13 |
| 3 | (3 × 20 ÷ 5) + 2 = 14 | 20 ÷ 5 + 14 - 1 = 17 |
| 4 | (4 × 20 ÷ 5) + 2 = 18 | 20 ÷ 5 + 18 - 1 = 21 |
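The two formulas above can be sketched in C# as follows. This is only an illustration, not the code from the downloadable control: the names `BucketRange` and `BucketBounds.Compute` are hypothetical, and the sketch assumes integer division throughout.

```csharp
using System;

// Hypothetical helper type: the inclusive occurrence range for one bucket.
public struct BucketRange
{
    public readonly int Min;
    public readonly int Max;

    public BucketRange(int min, int max)
    {
        Min = min;
        Max = max;
    }
}

public static class BucketBounds
{
    // minCount / maxCount: occurrences of the least and most popular tags.
    // bucketCount: number of font sizes (5 in the example).
    public static BucketRange[] Compute(int minCount, int maxCount, int bucketCount)
    {
        if (bucketCount < 1)
            throw new ArgumentOutOfRangeException("bucketCount");
        if (maxCount < minCount)
            throw new ArgumentException("maxCount must be >= minCount");

        int span = maxCount - minCount + 1; // "range + 1" in the text
        BucketRange[] bounds = new BucketRange[bucketCount];

        for (int b = 0; b < bucketCount; b++)
        {
            // (bucket num. x (range + 1) / num. of buckets) + minimum occurrences of any tag
            int lo = b * span / bucketCount + minCount;
            // ((range + 1) / num. of buckets) + minimum occurrences for bucket - 1
            int hi = span / bucketCount + lo - 1;
            bounds[b] = new BucketRange(lo, hi);
        }
        return bounds;
    }
}
```

With the example values, `BucketBounds.Compute(2, 21, 5)` reproduces the rows of the table (2 to 5, 6 to 9, and so on). One caveat under the integer-division assumption: when range + 1 is not evenly divisible by the number of buckets, the maximum formula can leave a count between two buckets. For instance, with range + 1 = 22 and 5 buckets, bucket 2 ends at 13 while bucket 3 starts at 15.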
Now, we can come up with a complete definition for our buckets. A tag's number of occurrences must fall within the
Occurrences range shown for a bucket for the tag to go in that bucket. Once we place a tag in a bucket, we know its font size.
| Bucket | Font size | Occurrences |
| --- | --- | --- |
| 0 | 9px | 2 to 5 |
| 1 | 10px | 6 to 9 |
| 2 | 11px | 10 to 13 |
| 3 | 12px | 14 to 17 |
| 4 | 13px | 18 to 21 |
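As a rough sketch of how a tag could be placed in a bucket and given a font size, the following C# picks the highest bucket whose minimum the tag's count reaches. The class and method names (`TagSizer`, `BucketFor`, `FontSizePx`) are hypothetical and are not taken from the downloadable control. The sketch assumes integer division and one bucket per pixel of font size.

```csharp
using System;

public static class TagSizer
{
    // Returns the bucket number for a tag with the given occurrence count.
    public static int BucketFor(int count, int minCount, int maxCount, int bucketCount)
    {
        if (bucketCount < 1)
            throw new ArgumentOutOfRangeException("bucketCount");
        if (count < minCount || count > maxCount)
            throw new ArgumentOutOfRangeException("count");

        int span = maxCount - minCount + 1; // "range + 1" in the text

        // Walk from the largest bucket down and stop at the first bucket
        // whose minimum occurrences the tag meets.
        for (int b = bucketCount - 1; b > 0; b--)
        {
            int lo = b * span / bucketCount + minCount;
            if (count >= lo)
                return b;
        }
        return 0;
    }

    // Maps an occurrence count to a font size, assuming one bucket per pixel
    // between minFontPx and maxFontPx inclusive.
    public static int FontSizePx(int count, int minCount, int maxCount,
                                 int minFontPx, int maxFontPx)
    {
        int bucketCount = maxFontPx - minFontPx + 1;
        return minFontPx + BucketFor(count, minCount, maxCount, bucketCount);
    }
}
```

For the example site, `TagSizer.FontSizePx(10, 2, 21, 9, 13)` gives 11, since 10 occurrences falls in bucket 2. Searching on the bucket minimums alone means every count between the least and most popular tag lands in some bucket, even when range + 1 does not divide evenly.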
You can download and view my example code below. The code is a .NET User Control, which you should place in the App_Code
directory of your website to run it.
The TagCloud class is responsible for creating the tag cloud, by implementing the algorithm above. The TagCloudData class
represents a single tag, and its number of occurrences on the website. An array of TagCloudData is the input to the
TagCloud class.
The CloudHasher class is used to assign a bucket to a TagCloudData object. It provides a GetHashCode(object) method
that returns the bucket number the TagCloudData object belongs to. This way, it is possible to build
more complex data structures with TagCloudData objects, where the structure of the data indicates the layout of the cloud. This is
useful in reducing the work involved in creating a tag cloud when gathering the data for the cloud is complex. A structure like this
is not covered in the example, but the .NET SDK Documentation provides several good examples of how to use an overridden GetHashCode method
to create useful data structures from templates.
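To make the idea concrete, here is one possible shape such a structure could take. This is a sketch under stated assumptions, not the CloudHasher from the download: `TagCount`, `BucketComparer`, and `CloudGrouping` are hypothetical stand-ins, and the sketch uses the standard `IEqualityComparer<T>` interface with LINQ's `GroupBy` to collect tag names per bucket.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a tag and its number of occurrences.
public sealed class TagCount
{
    public string Name { get; private set; }
    public int Count { get; private set; }

    public TagCount(string name, int count)
    {
        Name = name;
        Count = count;
    }
}

// Hypothetical comparer: two tags are "equal" when they share a bucket,
// and the hash code is the bucket number itself.
public sealed class BucketComparer : IEqualityComparer<TagCount>
{
    private readonly int minCount, span, bucketCount;

    public BucketComparer(int minCount, int maxCount, int bucketCount)
    {
        if (bucketCount < 1)
            throw new ArgumentOutOfRangeException("bucketCount");
        if (maxCount < minCount)
            throw new ArgumentException("maxCount must be >= minCount");
        this.minCount = minCount;
        this.span = maxCount - minCount + 1; // "range + 1" in the text
        this.bucketCount = bucketCount;
    }

    public int GetHashCode(TagCount tag)
    {
        if (tag == null) throw new ArgumentNullException("tag");
        // Highest bucket whose minimum the count reaches; counts outside
        // the expected range are clamped to the first or last bucket.
        for (int b = bucketCount - 1; b > 0; b--)
            if (tag.Count >= b * span / bucketCount + minCount) return b;
        return 0;
    }

    public bool Equals(TagCount a, TagCount b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a == null || b == null) return false;
        return GetHashCode(a) == GetHashCode(b);
    }
}

public static class CloudGrouping
{
    // Groups tag names by bucket number using the comparer above.
    public static Dictionary<int, List<string>> ByBucket(
        IEnumerable<TagCount> tags, BucketComparer comparer)
    {
        return tags.GroupBy(t => t, comparer)
                   .ToDictionary(g => comparer.GetHashCode(g.Key),
                                 g => g.Select(t => t.Name).ToList());
    }
}
```

Because equality and hashing both depend only on the bucket, any collection that accepts a comparer will group tags by font size without a separate bucketing pass. The trade-off is that two different tags in the same bucket compare as equal, so this comparer is only suitable for grouping, not for looking up individual tags.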
This work is licensed under a Creative Commons Attribution 3.0 United States License.
Please link to this article in your source code comments if you use this content.