May 7th, 2011 — Links, Random
May 5th, 2011 — Links, Random
May 5th, 2011 — Tech
On a language website that I maintain I recently switched to a full-fledged phpBB forum from a previous solution that included a comment box on every page, provided by the online service Disqus.
phpBB is the classical choice for forum software: it is open source, based on PHP (as the name suggests), and well supported, with a lot of information scattered around the internet, many useful extensions and a bewildering number of parameters to configure.
Here are some of the pros/cons of phpBB that I noticed in comparison to the previous set-up I used:
+ More control over the content, the comments are now stored on my server
+ Better organization of topics, possibility to move messages from one thread to another, the accumulated threads serve as some sort of knowledge-base
+ More fine-grained user permissions and moderating options. While this is useful, I would be happier with some sane defaults, as it takes quite a lot of time to play around with all these settings, to little visible benefit to the users.
+ Better integration: I can customize phpBB to closely match the looks of my site, more so than with the widgets I used previously
+ Users can create user accounts within my site
+ Since phpBB has been around for ages, most people are comfortable using it
- A lot of time wasted setting up, configuring, customizing, upgrading, etc. the software
- A lot of time wasted fighting spam. While there are many half-solutions available, in this regard running your own forum is clearly inferior to a centralized service like Disqus
- The look and feel of the forum is slightly heavy-weight and outdated (not too web2.0-y). This is not a big problem, as it is familiar even to non-technical people.
- Higher threshold for commenting. Although I tried to keep registration requirements, rules and captchas to a minimum, the number of messages written is somewhat smaller than in the previous system. However, the posts are longer, usually more substantial and better organized, so this is actually a +.
All in all I am quite happy with phpBB, even though it is not nearly as minimalistic as I would prefer it to be. One thing, however, that I was still missing from the Disqus days is a recent posts widget. This would be embedded on a different page than the actual forum (for example on the front page) and it would show the most recent 5-10 posts that were written in the forum. As I found no obvious ready-made solution, I wrote my own, which was not very difficult, as the database structure of the forums is quite straightforward.
You can download it below, with some explanation on how to install it. Let me know if you find it useful, or if you make any improvements to it. I spent about as much time putting it together, as it took me to write this post, so it is really nothing fancy. It is released under the WTFPL license, which means that you can … well, you know.
You can see it in action on the http://www.nebulo.ro site (text in Romanian), as it shows recent comments for the forum on the same site.
Installation
Put the following line in the web page where you want the recent comments to appear:
<div id="recent-widget"></div>
And somewhere later in the same file:
<script type="text/javascript">
httpRequest("recent-widget.php", showrecent);

// Put the HTML returned by recent-widget.php into the placeholder div.
function showrecent(WIDGET){
  var d = document.getElementById('recent-widget');
  d.innerHTML = WIDGET;
}

// Fetch url asynchronously and pass the response text to callback.
function httpRequest(url, callback) {
  var httpObj = false;
  if (typeof XMLHttpRequest != 'undefined') {
    httpObj = new XMLHttpRequest();
  } else if (window.ActiveXObject) {
    // fallback for old versions of Internet Explorer
    try{
      httpObj = new ActiveXObject('Msxml2.XMLHTTP');
    } catch(e) {
      try{
        httpObj = new ActiveXObject('Microsoft.XMLHTTP');
      } catch(e) {}
    }
  }
  if (!httpObj) return;
  httpObj.onreadystatechange = function() {
    if (httpObj.readyState == 4) { // when request is complete
      callback(httpObj.responseText);
    }
  };
  httpObj.open('GET', url, true);
  httpObj.send(null);
}
</script>
This will load the recent-widget.php file that writes the actual widget and put the output into the HTML file. Upload this on your server, so that it is accessible from your main page. The only advantage to this javascript-loader is that you can easily display the same widget on many different HTML pages. Otherwise you can simply put the stuff from recent-widget.php into your own php files.
Here is the recent-widget.php
Before you upload it, you have to modify the parts that deal with access to the forum database and the links to the forum itself. You can also modify other things, like the actual layout, the colors used, the number of posts, etc. See the TODO comments in the file for more information. This assumes that you are running phpBB with MySQL, but if you run some other database, such as SQLite, it should still be easy to modify. Obviously, this whole widget is nothing too complicated, so even if you have never programmed in PHP or seen an SQL query before, tweaking it until it works is a decent way to learn.
February 25th, 2011 — Ideas, Random
This morning on HN: The Colour Clock (notice the .uk in the domain :).
An elegant clock with changing background color. The exact color is obtained by interpreting the hour, minute, second values as describing the Red, Green, Blue components of the color.
Link here.
The original was in flash, but Brian Collins quickly put together a version using HTML, CSS and JavaScript.
Here is his version:
http://brisy.info/colors/
My only tweak on this idea is to make the color-cycle continuous. When the seconds move from 59 to 00 there is a jump in the Blue value from maximum to minimum, which means a not-so-smooth transition in the background color.
To overcome this, I make the Blue value alternate between going up and coming down, depending on whether the previous value is odd or even. Similarly for the Green value (minutes), depending on the hour value. The Red (hour) value will go up or down according to the parity of the number of days since some remote date in the past. The javascript/jquery code achieving this can be examined (it is just a minor modification of Brian’s code). Now the clock still cycles through all colors but without sudden jumps (and now the numbers are harder to interpret directly as colors).
Here is the working version:
smooth cycling color clock.
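The direction-flipping rule described above can be sketched in a few lines of JavaScript. This is only an illustration, not the code behind the linked page: the function names zigzag, smoothComponents and toCss are made up for this example, UTC time and the Unix epoch are assumed as the clock and as the "remote date in the past", and the scaling to 0..255 is just one possible choice.

```javascript
// Reflect value (0..max) whenever the slower-moving "parent" unit is odd,
// so the sequence runs 0..max, max..0, 0..max, ... without jumping back to 0.
function zigzag(value, max, parent) {
  return parent % 2 === 0 ? value : max - value;
}

// Raw components: r in 0..23, g in 0..59, b in 0..59.
function smoothComponents(date) {
  var days = Math.floor(date.getTime() / 86400000); // whole days since the epoch (assumption)
  var h = date.getUTCHours();
  var m = date.getUTCMinutes();
  var s = date.getUTCSeconds();
  return {
    r: zigzag(h, 23, days), // hours flip direction every day
    g: zigzag(m, 59, h),    // minutes flip direction every hour
    b: zigzag(s, 59, m)     // seconds flip direction every minute
  };
}

// Scale the raw components to 0..255 and format them as a CSS color.
function toCss(date) {
  var c = smoothComponents(date);
  return 'rgb(' + Math.round(c.r * 255 / 23) + ', ' +
                  Math.round(c.g * 255 / 59) + ', ' +
                  Math.round(c.b * 255 / 59) + ')';
}
```

With this rule, from one second to the next at most one raw component changes, and only by a single step, which is what removes the jump at 59 to 00.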
February 15th, 2011 — Ideas
Another bunch of random ideas off my chest, for you to run with. If interested, check out the previous set also, or the web app ideas before them or these (mostly outdated or unrealistic) ideas … the rabbithole goes deeper. The usual disclaimers apply here as well.
1. Facebook downvote button
Obligatory disclaimer: I don’t have a Facebook account. But I did have one for a short and sad period of my life, so I have a rough idea what it is about. Unless things have changed dramatically since I deleted my account, the single missing feature most people want to have is the dislike button. Of course it goes against the feel-goody nature of the whole experience, so probably it will never be added.
On the other hand, many elements that are “likable” are also commentable in some way. So the idea is a browser plugin that adds the thumbs-down button within the browser (something like a GreaseMonkey script). When the user clicks it, it writes a comment with the text “I don’t like this”. Those who have the plugin installed will have these comments hidden and only see the dislike-count summary of them. Those without the plugin will see the comments, which make sense on their own, so the whole system degrades gracefully.
2. Software feature matrix
This one is a website built around the idea of unbiased comparison of products. It groups existing software or web apps into categories and lists a large set of agreed-upon features, ticking those that a software product implements, with extra information where it does not fully implement it, etc. Such tables have been written for many categories of software, for example this list of CMS software on wikipedia, but the hypothetical website would contain tables for every imaginable category, both Desktop software and web apps, with up-to-date price information and many other features. How to collect the data and how to ensure the information is unbiased, are nontrivial questions, but if done well, such a website could be quite useful.
3. Namespace explorer
There exist very good services that help us in coming up with domain names for a project. However, nowadays you might also want to ensure that you can find a matching account name on Twitter, Gmail, Facebook and who knows what other services. This application would suggest names that are available on all these services and would automate to some extent the creation of accounts. Some intuitive navigation interface or visualization of these namespaces would help in finding available names.
4. Paper airplane simulator
Software for modeling paper airplane folding and simulation of their aerodynamic behavior. Additionally, a system that comes up with new folding designs, evaluates and improves them using genetic algorithms or other optimization techniques.
Karl Sims’ evolved creatures meet Robert Lang’s origami work.
5. Juggling tricks optimizer
And you thought the previous idea was childish and useless. This one supposes a rich language for describing the mechanics of juggling n balls. Basic siteswap notation (although beautifully compact) doesn’t capture the full range of possible patterns. The hypothetical language could describe throws, catches, multiplexes, hand movements, the anatomically possible range and speed of movements and the physical plausibility of every ball trajectory, possibly including collisions.
Using this language we can describe any juggling pattern ever performed in the past or in the future. Now all we need is a pair of scoring functions. The first function tells how difficult a pattern is (described as a string in our language), based on the speed of movement, the resting time between throws/catches and various physiological considerations. The other function tells how interesting the pattern is. I am not even speculating on what such a function would look like. Interestingness is related to difficulty, but there are some easy tricks which are very nice (and vice versa), so there is something more happening here. Having these components in place (a big if, I know), we can start optimizing using our favorite genetic algorithm from the previous idea.
The system then invents the most interesting, hardest to perform (but possible!) juggling trick. Then it designs the most interesting yet easy pattern. Tweak the two functions, optimize again, invent, juggle, etc.
6. Car parking game
A simple game (mobile or desktop) with realistic mechanics but extremely simple graphics and control. Actually control would not necessarily be simple, but it would instead resemble the control of a (manual) gear shift car with the 3 pedals and all. The cars are basically rectangles, but the acceleration, steering, braking and collisions are as realistic as possible. Possible tasks include parking in some small place, side-parking, turning in 3 steps in a small place, etc.
Also suitable as a programming project for children.
7. Song illustrator
This plays on a misconception I had when I was 4 or 5. I was familiar with audio cassettes, from which I would listen to stories and songs for children that my parents would put on for me to hear. I had heard about video players before, and I vaguely understood that they are playing moving images (I was familiar with TV already), but I seriously misunderstood something: I thought that video players could play the same audio cassettes I was listening to (say, “The Little Prince”, being read by someone). As they played the audio cassettes, I thought they would render the moving images on the spot, creating vivid animations similar to those I imagined upon hearing the text (or more interesting). It never crossed my mind that in order to have a video version, a bunch of actors would have to go through the trouble of acting it all out in front of a camera. That is so much worse than the way I originally imagined it.
So this idea is a modest version of the original concept: take a song together with the lyrics, feed it into this program, which then creates a video for it automatically and synchronizes it to the music. As a first approximation it finds Flickr images corresponding to the words from the lyrics and makes a slideshow out of those. I know, reality is not so interesting sometimes.
8. Visa matrix
Back to more realistic waters, this one is a useful tool for world-travelers and world-citizens. If there are 200 countries, make a 200-by-200 matrix, where the column shows which country you are from, the row shows which country you want to travel to and the entry shows what document you need to travel there. For a summary view (for example as a printable poster), the entries can be color-coded: from green to red showing “no document needed”, “passport needed”, “visa needed”, “entry not allowed”, etc. In the online version the cells contain links to the embassy websites where more information is available and the visa application can be initiated. All the necessary information is publicly available already, it is just not collected (as far as I know) in one place like this.
9. Online store where you can negotiate
Extremely half-baked, I am thinking this out as I write it down. They say “everything is negotiable”. While that holds only in a limited way, say in a usual mall in North America, it is very true in a bazaar on the other side of the world. The price you get depends on how much you need it, how desperate the seller is to get rid of it, your and his/her negotiation skills and many other factors. While not all of these can be captured in an online store, certainly some subset of them could be. Currently the online stores give “offers” to you, which is too rigid. They might even give personalized offers, knowing your shopping history, personal data, etc.
In many cases, however, they might be perfectly willing to sell the product at a cheaper price if they knew for sure that I was not going to buy it for anything more than that. On the other hand they want me to pay the maximum amount that I can afford and that I am willing to pay. Probably there are legal restrictions on the exact ways to perform such price segmentation, but in theory there could be an online store where I can make a counteroffer: “oh no, I’m not buying that e-book reader for $129, I’ll pay $99 maximum, and that’s it. At most, I can spend another $20 to buy some books, but that’s all I can spend today. Take it or leave it.” Once an offer is made (from either side), it is binding. The online store would have my purchasing- as well as negotiation history, so it could decide whether it would take my offer or would rather make a counter-counter-offer and so on, ad-infinitum.
10. Random wiki image as wallpaper
Wikipedia has a nice “Random article” feature, that can easily take information-addicts on an hours-long semi-pointless article-hopping ride. Something slightly less addictive is the WikiMedia random file feature:
http://commons.wikimedia.org/wiki/Special:Random/File
The results are mostly images. You clicked it? Now click it again! A slideshow of these would make a nice, educational screensaver or periodically-updated wallpaper. There should be some check to filter out images that are too small, that are obviously uninteresting (such as portions of maps or flags of countries), and some context would be nice also, such as title, description, maybe titles of a few of the linking articles.
EDIT: I made this: http://lkozma.net/blog/random-wiki-image-wallpaper/
December 11th, 2010 — Juggling
Claude E. Shannon is best known for his 1948 paper “A Mathematical Theory of Communication” in which he created the field of Information Theory, but he had many other important contributions in diverse areas.
What is less well known is that he had a keen interest in gadgets and devices of all sorts and invented some quite humorous ones, like “The Ultimate Machine” (also known as “The Most Useless Machine Ever”):
Video
Shannon was also an accomplished juggler. He came up with the following elegant theorem, known as
Shannon’s Juggling Theorem
(F+D)H=(V+D)N
F is the time a ball spends in the air (Flight)
D is the time a ball spends in a hand (Dwell), or equivalently, the time a hand spends with a ball in it
V is the time a hand spends empty (Vacant)
N is the number of balls
H is the number of hands
The theorem can be derived by looking at a complete juggling cycle first from the perspective of the ball, then from the perspective of the hand, then equating the two times. This is an application of one of the most useful general tricks in combinatorics: double counting. You count/measure something in two different ways (in this case the juggling time), and use the fact that the two results have to be equal.
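The double-counting step can be written out explicitly. The sketch below assumes a uniform pattern (every ball and every hand shares the same F, D and V) and counts catches over an observation window of length T that is much longer than one cycle; T is a symbol introduced here and does not appear in the theorem itself.

```latex
% Each ball is caught once per ball cycle (F + D);
% each hand makes one catch per hand cycle (V + D).
\begin{align*}
\text{catches, counted by ball:} \quad & \frac{N\,T}{F + D} \\
\text{catches, counted by hand:} \quad & \frac{H\,T}{V + D} \\
\text{equating the two:} \quad & \frac{N\,T}{F + D} = \frac{H\,T}{V + D}
  \;\Longrightarrow\; (F + D)\,H = (V + D)\,N
\end{align*}
```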
We can read some obvious facts off the theorem, for example that if you throw the balls higher (increase F) while D, N and H stay the same, then V will also increase (your hands will be empty for longer). If you increase D at the expense of F and V, until they become zero (you keep holding the balls in your hands), N and H have to be equal (one ball in each hand). No surprises here, except to note that the theorem assumes that there is at most one ball in one hand at a time, so it does not apply to multiplex patterns in which several balls are simultaneously held in the same hand (we would need separate Ds for hand and ball to fix this, but the simplicity of the theorem would be lost).
What if you want to juggle more balls (increase N) but you cannot change F, V or D (you cannot juggle any faster or throw the balls any higher)? No problem, just increase the number of hands (H). One way to achieve that is by becoming more social.
Further links:
The Science of Juggling
A personal tribute to Claude Shannon
November 9th, 2010 — Music
August 16th, 2010 — Tech
“Sketching” data structures store a summary of a data set in situations where the whole data would be prohibitively costly to store (at least in a fast-access place like the memory as opposed to the hard disk). Variants of trees, hash tables, etc. are not sketching structures, they just facilitate access to the data, but they still store the data itself. However, the concept of hashing is closely related to most sketching ideas as we will see.
The main feature of sketching data structures is that they can answer certain questions about the data extremely efficiently, at the price of the occasional error. The best part is that the probability of an error can be quantified and the programmer can trade off the expected error rate with the amount of resources (storage, time) afforded. At the limit of this trade-off (when no error is allowed) these sketching structures collapse into traditional data structures.
Sketching data structures are somewhat counter-intuitive, but they can be useful in many real applications. I look at two such structures mostly for my own benefit: as I try to understand them, I write down my notes. Perhaps someone else will find them useful. Links to further information can be found at the end. Leave comments if you know of other sketching data structures that you found useful or if you have some favorite elegant and unusual data structure.
1. Bloom filter
Suppose we have to store a set of values (A) that come from a “universe” of possible values (U). Examples: IP addresses, words, names of people, etc. Then we need to check whether a new item x is a member of set A or not. For example, we might check if a word is spelled correctly by looking it up in the dictionary, or we can verify whether an IP address is banned by looking it up in our black list.
We could achieve this by storing the whole set A in our favorite data structure. Alternatively, we could just store a binary array, with one bit for each possible element in U. For example, to quickly check if a number is prime or not, we could precompute an array of bits for all numbers from 0 to the maximum value we need:
Prime = 001101010001010001010001...
To check whether a number is prime, we look at the corresponding bit and we are done. This is a dummy example, but it is already obvious that in most cases the range of possible values is too large to make this practical. The number of all possible strings of length 5, containing just letters from the English alphabet is 26^5 = 11,881,376 and in most real problems the universe U is much larger than that.
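As a small illustration of the one-bit-per-element idea, the array above could be precomputed with a sieve of Eratosthenes. This is a sketch only; the function name primeBits is made up for this example, and a plain array of 0/1 numbers stands in for a real packed bit array.

```javascript
// Return an array where entry i is 1 if i is prime and 0 otherwise,
// for all i from 0 to limit (sieve of Eratosthenes).
function primeBits(limit) {
  var bits = [];
  for (var i = 0; i <= limit; i++) {
    bits.push(i >= 2 ? 1 : 0); // 0 and 1 are not prime
  }
  for (var p = 2; p * p <= limit; p++) {
    if (bits[p]) {
      // cross out every multiple of p, starting from p * p
      for (var q = p * p; q <= limit; q += p) {
        bits[q] = 0;
      }
    }
  }
  return bits;
}

var Prime = primeBits(23);
// Membership is now a single array lookup:
var isSevenPrime = Prime[7] === 1; // true
```

The storage grows with the size of the universe (here, limit + 1 entries) and not with the number of primes stored, which is exactly the problem the next paragraphs address.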
The magic of the Bloom filter allows us to get away with much less storage at the price of an occasional mistake. This mistake can only be a false positive, the Bloom filter might say that x is in A when in fact it is not. On the other hand, when it says that x is not in A, this is always true, in other words false negatives are impossible. In some applications (like the spell-checker), this is acceptable if false positives are not too frequent. In other applications (like the IP blacklist), misses are more common and in the case of a hit, we can verify the answer by reading the actual data from the more costly storage. In this case the Bloom filter can act as an efficiency layer in front of a more costly storage structure. If false positives can be tolerated, the Bloom filter can be used by itself.
The way it works is really simple: we use a binary array of size n, as in the prime numbers example, that is initialized with 0s. In this case however, n is much smaller than the total number of elements in U. For each element to be added to A, we compute k different hash values (using k independent hash functions) with results between 1 and n. We set all these locations h1, h2, …, hk (the indexes returned by the hash functions) in the binary array to 1. To check if y is in A, we compute the hash values h1(y), …, hk(y) and check the corresponding locations in the array. If at least one of them is 0, the element is missing. If all fields are 1, we can say that the element is present with a certainty that depends on n (the size of the array), k (the number of hashes) and the number of elements inserted. Note that n and k can be set beforehand by the programmer.
The source of this uncertainty is that hash values can collide. This becomes more of a problem as the array is filling up. If the array were full, the answer to all queries would be a yes. In this simple variant, deleting an element is not possible: we cannot just set the corresponding fields to 0, as this might interfere with other elements that were stored. There are many variants of Bloom filters, some allowing deletion and some allowing the storage of a few bits of data as well. For these and for some rigorous analysis, as well as some implementation tricks, see the links below.
A quick dummy example is a name database. Suppose we want to store female names and reject male names. We use two hash functions that return a number from 1 to 10 for any string.
Initial configuration: 0000000000
Insert("Sally") : 0100000001
# h1("Sally") = 2, h2("Sally") = 10
Insert("Jane") : 1110000001
# h1("Jane") = 1, h2("Jane") = 3
Insert("Mary") : 1110100001
# h1("Mary") = 5, h2("Mary") = 2 [collision]
Query("Sally")
# bits 2 and 10 are set,
# return HIT
Query("John")
# h1("John") = 10 set, but h2("John") = 4 not set
# return MISS
Query("Bob")
# h1("Bob") = 5 set, h2("Bob") = 1 set
# return HIT (false positive)
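The same procedure can be sketched in JavaScript. This is a minimal illustration under a few assumptions that are not part of the description above: array positions run from 0 to n-1 instead of 1 to n, each bit is stored in a whole byte for readability, and the k indexes are derived from two simple string hashes (an FNV-1a-style hash and a djb2-style hash) combined as h1 + i*h2, rather than from k truly independent hash functions. All names are made up for this example.

```javascript
// FNV-1a-style 32-bit hash over the UTF-16 code units of a string.
function fnv1a(str) {
  var h = 0x811c9dc5;
  for (var i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// djb2-style 32-bit hash, used as the second base hash.
function djb2(str) {
  var h = 5381;
  for (var i = 0; i < str.length; i++) {
    h = (Math.imul(h, 33) + str.charCodeAt(i)) >>> 0;
  }
  return h;
}

function BloomFilter(n, k) {
  this.n = n;                    // size of the bit array
  this.k = k;                    // number of hash values per element
  this.bits = new Uint8Array(n); // initialized with 0s
}

// The k array positions for an item (assumed scheme: h1 + i * h2 mod n).
BloomFilter.prototype.positions = function (item) {
  var h1 = fnv1a(item), h2 = djb2(item), out = [];
  for (var i = 0; i < this.k; i++) {
    out.push((h1 + i * h2) % this.n);
  }
  return out;
};

BloomFilter.prototype.add = function (item) {
  var bits = this.bits;
  this.positions(item).forEach(function (p) { bits[p] = 1; });
};

// false means "definitely not in the set"; true means "probably in the set".
BloomFilter.prototype.mightContain = function (item) {
  var bits = this.bits;
  return this.positions(item).every(function (p) { return bits[p] === 1; });
};

var names = new BloomFilter(64, 3);
names.add("Sally");
names.mightContain("Sally"); // true: an inserted element is never missed
```

Whether a query for a name that was never inserted returns a false positive depends on the hash values and on how full the array is, so the sketch makes no promise about any particular name.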
Wikipedia: Bloom filter (and variants)
Description and Ruby implementation
Quick introduction and demo
JavaScript demo
Bloom filter – the math
The original paper (Bloom, 1970)
Bloom filter, detailed survey
Tech talk: Bloom filter and variants
Bloom filter in C# using literate programming
Bloom filter in Ruby
2. Count-Min sketch
The Count-Min (CM) sketch is less known than the Bloom filter, but it is somewhat similar (especially to the counting variants of the Bloom filter). The problem here is to store a numerical value associated with each element, say the number of occurrences of the element in a stream (for example when counting accesses from different IP addresses to a server). Surprisingly, this can be done using less space than the number of elements, with the trade-off that the result can be slightly off sometimes, but mostly on the small values. Again, the parameters of the data structure can be chosen such as to obtain a desired accuracy.
CM works as follows: we have k different hash functions and k different tables which are indexed by the outputs of these functions (note that the Bloom filter can be implemented in this way as well). The fields in the tables are now integer values. Initially we have all fields set to 0 (all unseen elements have count 0). When we increase the count of an element, we increment all the corresponding k fields in the different tables (given by the hash values of the element). If a decrease operation is allowed (which makes things more difficult), we similarly subtract a value from all k fields.
To obtain the count of an element, we take the minimum of the k fields that correspond to that element (as given by the hashes). This makes intuitive sense. Out of the k values, probably some have been incremented on other elements also (if there were collisions on the hash values). However, if not all k fields have been returned by the hash functions on other elements, the minimum will give the correct value. See illustration for an example on counting hits from IP addresses:

In this example the scenario could be that we want to notice if an IP address is responsible for a lot of traffic (to further investigate if there is a problem or some kind of attack). The CM structure allows us to do this without storing a record for each address. When we increment the fields corresponding to an address, simultaneously we check if the minimum is above some threshold and we do some costly operation if it is (which might be a false alert). On the other hand, the real count can never be larger than the reported number, so if the minimum is a small number, we don’t have to do anything (this holds for the presented simple variant that does not allow decreases). As the example shows, CM sketch is most useful for detecting “heavy hitters” in a stream.
It is interesting to note that if we take the CM data structure and make the counters such that they saturate at 1, we obtain the Bloom filter.
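A minimal JavaScript sketch of the simple (increment-only) variant described above follows. The names are made up for this example, and the k hash functions are approximated by one FNV-1a-style hash seeded with the table number, which is an assumption made for brevity and not a properly independent hash family.

```javascript
// FNV-1a-style 32-bit hash; a different seed per table stands in for
// k different hash functions (an assumption made for this sketch).
function seededHash(str, seed) {
  var h = (0x811c9dc5 ^ seed) >>> 0;
  for (var i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

function CountMinSketch(k, width) {
  this.k = k;         // number of tables / hash functions
  this.width = width; // number of counters per table
  this.tables = [];
  for (var r = 0; r < k; r++) {
    this.tables.push(new Uint32Array(width)); // all counts start at 0
  }
}

// Increment the item's field in each of the k tables, then return the
// new estimate so the caller can compare it against a threshold.
CountMinSketch.prototype.increment = function (item) {
  for (var r = 0; r < this.k; r++) {
    this.tables[r][seededHash(item, r) % this.width] += 1;
  }
  return this.estimate(item);
};

// The estimate is the minimum of the k fields; collisions can only
// inflate a field, so the estimate is never below the real count.
CountMinSketch.prototype.estimate = function (item) {
  var min = Infinity;
  for (var r = 0; r < this.k; r++) {
    min = Math.min(min, this.tables[r][seededHash(item, r) % this.width]);
  }
  return min;
};

var hits = new CountMinSketch(4, 256);
var THRESHOLD = 1000; // hypothetical alert level
if (hits.increment("192.0.2.17") > THRESHOLD) {
  // possibly a heavy hitter (or a false alert): run the costly check here
}
```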
For further study, analysis of the data structure and variants, proper choice of parameters, see the following links:
CM sketch page
Original paper (Cormode, Muthukrishnan, 2005)
Lecture slides (Helsinki Univ. of Tech)
Sketch library C++
C and Java implementations
What is your favorite counter-intuitive data structure or algorithm?
July 27th, 2010 — Ideas
Here are some more web app ideas. As I am not implementing any of them right now, I’m posting them in the hope that someone else will find them interesting and build something similar. They might not be useful in their current form, but maybe they can be further iterated and inspire someone to come up with something else. Let me know in the comments what you think, especially if any of these exist already.
You might find interesting the seven web app ideas I posted previously or the whole list of half-baked ideas I wrote up in the past, most of which are silly, of course.
1. Wikitravel itinerary
The way I travel might not be the most typical, but here is how I usually prepare before leaving: I download and print all the relevant articles from Wikitravel and Wikipedia. These include the place where I actually go, nearby sites worth visiting, places I will pass by on the train, generic articles about the region, country, etc. I then read all this while on the train or while waiting for a bus or a hitch-hiking opportunity or whenever there’s nothing better to do during the trip. For example, if I went to Chennai, I would download the articles on India, South Asia, South India, Railways in India, the one on Chennai, and the articles on interesting places nearby, such as Pondicherry.
To reduce the amount of paper I carry, I slightly edit and compress the articles before printing, removing some sections or pictures, text that is common between articles, etc. For the places where I stay longer I leave most of the information, for places I just pass through on the train I remove the restaurants and hostels sections, leaving just a short description of the place. If I travel with other people we often share some of this preparation, but it still takes a few hours of work that could be easily automated. In particular it takes some effort to hunt down all the relevant articles, because when you don’t know much about a place, you won’t know which places in its neighborhood might have articles of their own either.
By the way, Wikitravel is great, this method has given us so far more up-to-date, more interesting and more reliable reading material than most traditional travel guides. So the web app would do the following: I enter where I will go, possibly a few other places that I pass by and it fetches all the possibly relevant articles from Wikipedia and Wikitravel using geographical location of places and returns a small, ready to print document. It can also ask what languages I understand and look for articles written in those. Possibly other sources could be added, such as maps or train timetables, but this alone would be quite useful already. If not for printing personalized travel guides, it could work as a mobile app: download the articles on a mobile phone to allow offline browsing while on the road.
2. External sites analytics
There is a trick described here that uses the different styling of visited links to sniff whether the user has visited some other websites or not. It was suggested that this hack could be used to customize the social “share” buttons, in order to show only those services that the visitor has used in the past. The same technique could be used to find out more useful information about the visitor: which search engine they use, whether they read the NY Times or the WSJ, if they are a Gmail or Yahoo mail user, whether they use the competitor’s product, etc. The technique does not allow reading the full browsing history, it just gives yes/no answers for a predefined set of websites.
The user of this analytics service would choose which other websites they want to follow, and for each visitor they would get the list of websites from this set that the visitor has actually visited before. Combined with the traditional analytics data: referring site, browser version, length of visit, this could be valuable business information.
Some immediate disadvantages:
- the technique raises privacy and ethical issues. Although the information is leaked by the browser anyway, the leak is considered a bug, so exploiting it might reflect negatively on a website. Also, most users would not be aware that this type of information can be collected. On the other hand, many users are unaware of traditional analytics as well, which doesn’t deter websites from collecting it.
- the method is not 100% accurate, and it does not work if the visitor has deleted their browsing history
- it could potentially be embarrassing for a website to expose the list of other websites they care about (acknowledging competitors, important news sources, etc.)
3. Related page 404
Larger web sites, such as those of big corporations or universities, inevitably suffer from the deterioration of links. Pages often change addresses, and even though the material is still available, it cannot be found at the original address. Services like Google Cache come in handy sometimes, but they might not have indexed a correct version of the page or they might contain outdated content (often the page is available somewhere else with more up-to-date content, for example a university admissions page).
A quick solution would be to follow the “page missing” message with suggestions of pages that are similar to the page that used to be there. This could be achieved by a site search engine that stores the important keywords of every page it indexes and, when a page goes missing, searches within the domain for the keywords it stored for that page. This could be a feature of site search engines, either hosted on the server or managed remotely. This is similar to the second idea in this list, or this idea.
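The keyword lookup described above could be sketched roughly as follows. This is only a minimal illustration: the index contents, the paths and the function name `suggest_related` are all made up, and a real site search engine would use a proper ranking instead of a plain keyword overlap count.

```python
from collections import Counter

# Hypothetical keyword index: URL path -> keywords stored when the page was
# last crawled. Entries for pages that have since disappeared are kept on purpose.
KEYWORD_INDEX = {
    "/admissions/2010/apply.html": ["admissions", "apply", "deadline", "undergraduate"],
    "/admissions/apply": ["admissions", "apply", "deadline", "undergraduate", "international"],
    "/research/physics": ["physics", "research", "laboratory"],
    "/news/deadline-extended": ["deadline", "admissions", "news"],
}

# Hypothetical set of pages that currently exist on the site
LIVE_PAGES = {"/admissions/apply", "/research/physics", "/news/deadline-extended"}


def suggest_related(missing_path, index=KEYWORD_INDEX, live=LIVE_PAGES, limit=3):
    """Return live pages ranked by how many stored keywords they share with the missing page."""
    old_keywords = set(index.get(missing_path, []))
    if not old_keywords:
        return []  # the page was never indexed, so there is nothing to go on
    scores = Counter()
    for path in live:
        if path == missing_path:
            continue
        overlap = old_keywords & set(index.get(path, []))
        if overlap:
            scores[path] = len(overlap)
    # Highest overlap first; ties are broken alphabetically for a stable order
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [path for path, _ in ranked[:limit]]
```

The 404 handler would call `suggest_related` with the requested path and list the results under the “page missing” message.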
4. Stacked source widget
World map widgets that show where your website visitors are from used to be very popular. Also, there exist real-time widgets that show where the last 5 or so visitors are from (both geographically and also from which site they were referred). One combination I haven’t seen yet is a stacked graph visualization of traffic sources in the past week/month or year. This would show how total traffic numbers have changed, and also when certain sites were sending more traffic than others. It could be interesting for example to see how at some point your page was featured on reddit, later on slashdot, how traffic was picked up by other sources afterwards, etc. Of course, all these can be visualized by existing analytics packages, but I am thinking of a widget that would make this information available to the visitors of the site. The various sources would be clickable and the visitor could discover related interesting material by going to the sources. I am not sure there would be an incentive for site owners to use this, however, which is why this idea is half-baked :)
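The data behind such a stacked graph is simple to prepare. Here is a rough sketch, assuming the visits are available as (day, source) pairs; the function name `stacked_layers` and the input format are my own assumptions, and the actual drawing is left to whatever charting library the widget would use.

```python
from collections import defaultdict


def stacked_layers(visits, days, sources):
    """Compute the bands of a stacked graph.

    visits:  iterable of (day, source) pairs, one per visit
    days:    the days to plot, in order
    sources: the traffic sources, in stacking order (bottom first)

    Returns {source: [(bottom, top), ...]} with one (bottom, top) pair per day.
    """
    counts = defaultdict(int)
    for day, source in visits:
        counts[(day, source)] += 1

    layers = {source: [] for source in sources}
    for day in days:
        bottom = 0
        for source in sources:
            top = bottom + counts[(day, source)]
            layers[source].append((bottom, top))
            bottom = top  # the next source is stacked on top of this one
    return layers
```

The top of the last band on each day is the total traffic for that day, so the outline of the graph shows the overall trend while the bands show which source contributed when.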
5. Unicode obfuscator
This is a very simple service inspired by this post. I enter some plain English text and it returns a set of Unicode characters from other scripts, or special characters, that visually resemble the Latin letters (ignoring the actual pronunciation or history of the letters). The amount of distortion added to the text can be specified:
at level 1: hello -> hello
at level 2: hello -> ⱨěłļō
at level 3: hello -> ɧԐԼ˪◌
While I don’t have a specific use case in mind for this, it could be used wherever you want text to be human-readable but not searchable. Otherwise it could be just a fun and useless way to write on forums, not unlike metal umlauts in band names.
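A minimal sketch of the levels described above, assuming a hand-made look-alike table. The table below is hypothetical and contains only the characters from the “hello” examples; a real service would need a table covering the whole alphabet.

```python
from collections import defaultdict

# Hypothetical look-alike table, seeded only with the characters from the
# "hello" examples. Each letter maps to a list of candidate replacements.
LOOKALIKES = {
    2: {"h": ["ⱨ"], "e": ["ě"], "l": ["ł", "ļ"], "o": ["ō"]},
    3: {"h": ["ɧ"], "e": ["Ԑ"], "l": ["Լ", "˪"], "o": ["◌"]},
}


def obfuscate(text, level=1):
    """Replace letters with visually similar characters; level 1 leaves the text unchanged."""
    table = LOOKALIKES.get(level, {})
    seen = defaultdict(int)  # how many times each letter has occurred so far
    out = []
    for ch in text:
        key = ch.lower()
        candidates = table.get(key)
        if candidates:
            # Cycle through the candidates so that repeated letters vary
            out.append(candidates[seen[key] % len(candidates)])
            seen[key] += 1
        else:
            out.append(ch)  # no look-alike known, keep the original character
    return "".join(out)
```

Cycling through several candidates per letter is just one possible choice; picking candidates at random would make the output harder to reverse with a simple lookup.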
6. Comment whiteboard
This one is simple: instead of the usual text comment box after a blog post or article, have a whiteboard on which visitors can draw whatever they want. This would need just the minimal tools, such as picking colors, drawing, erasing, perhaps zooming. Visitors could draw at the same time and they could overwrite each other’s drawings at any time. The history of the whiteboard could be played back, so nothing would actually be lost. Obviously, most of the time nothing would happen, so the playback could be sped up significantly to include only the times before something was erased/overwritten.
I am trying really hard to think of a scenario where this would be useful. Articles on math, where formulas would be difficult to type in comments? Nah, there are ways to let users type LaTeX or just dumb down formulas to plain text. Blogs for children who can’t write yet? Hmm, maybe not. Perhaps this one really is useless. Although it could feel quite natural on touch-screen devices.
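As for the sped-up playback, one simple way to do it (simpler than picking out only the moments before an erase, and purely an assumption on my part) is to cap every idle gap between two drawing events at a fixed length. The function name `playback_schedule` and the `max_gap` parameter are made up for this sketch.

```python
def playback_schedule(timestamps, max_gap=2.0):
    """Map recorded event times (in seconds, ascending) to playback times.

    Any pause longer than max_gap seconds is shortened to max_gap, so long
    idle periods on the whiteboard are skipped during playback.
    """
    schedule = []
    previous = None
    clock = 0.0
    for t in timestamps:
        if previous is not None:
            clock += min(t - previous, max_gap)
        schedule.append(clock)
        previous = t
    return schedule
```

With this, a whiteboard that was touched twice a day for a month would still play back in about a minute.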
7. Rotor game
This is a game idea which I couldn’t actually make work. If anyone picked it up and made something out of it, I would be super happy. It is based on the rotor-router model where each node is in one of a number of states. Let’s suppose the board is a sheet of checkered paper and each cell can be in one of four states (indicated by an arrow pointing in one of four directions). A walker is passing through these cells, following the arrows, but after he passes a cell, the arrow in that cell will turn clockwise by 90 degrees.
This set-up leads to very interesting, seemingly chaotic behavior which is of interest to mathematicians and physicists, but I am curious whether some interesting board game could be designed from this premise. My first attempt was to have the user design the initial configuration of arrows such that they will guide a ball from start to finish. To make it more difficult, there can be several balls with intersecting paths, and the arrows have to be configured such that each ball gets to its destination. The problem is that this way the puzzles become either trivial or impossible to solve. If several balls can be moving at the same time, there are additional complications: can they pass through each other? What if they get to the same cell at the same time? etc.
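The walker rule is short enough to sketch in a few lines. This is a minimal sketch under my own assumptions: the board is finite, arrows are stored as numbers 0 to 3, and the walk ends when the walker steps off the board. The names `DIRECTIONS` and `walk` are made up.

```python
# Directions in clockwise order: up, right, down, left (row offset, column offset)
DIRECTIONS = [(-1, 0), (0, 1), (1, 0), (0, -1)]


def walk(grid, start, max_steps=1000):
    """Move a walker over a grid of arrow states (0-3, indices into DIRECTIONS).

    The walker follows the arrow of its current cell, and the arrow it just
    used turns clockwise by 90 degrees. The grid is modified in place.
    Returns the list of visited cells, ending when the walker leaves the
    board or max_steps is reached.
    """
    rows, cols = len(grid), len(grid[0])
    r, c = start
    path = [(r, c)]
    for _ in range(max_steps):
        dr, dc = DIRECTIONS[grid[r][c]]
        grid[r][c] = (grid[r][c] + 1) % 4  # rotate the arrow after the walker passes
        r, c = r + dr, c + dc
        if not (0 <= r < rows and 0 <= c < cols):
            break  # walked off the board
        path.append((r, c))
    return path
```

Even on a 2x2 board the walker can circle around a couple of times before escaping, because every pass changes the arrows it will meet on the next lap.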
I still hope some interesting game/puzzle could be designed from these rules, perhaps slightly modified (hexagonal field?), but I have given up searching for it for the time being. Well, this is not strictly a web app idea, but if the game turned out to be interesting, it could be implemented as one. Here is a page where I sketched the idea during some idle times.
8. Zooming GUI for web
Zooming GUIs are a beautiful concept and there are some uses where they feel quite natural in practice (exploring graph data, browsing photo collections, certain data visualizations, mindmap-like tools, calendars, etc.) but somehow they didn’t become as widespread as they could have. There are many websites (mostly Flash-based) built on zooming UI concepts, but I don’t know of a JavaScript-based toolkit that would allow quick prototyping of zooming web apps (using vector graphics, not bitmaps). I’m thinking of something similar to the Piccolo project for the desktop, but in JavaScript. With proper controls, this could be most usable on touch-screen devices, although I’m not familiar at all with the current best native systems on those devices.