Thursday, October 27, 2005

I spent the last few days making something

  I felt like setting myself a project the other day, so I decided to try to write a program for constructing random sentences according to a context-free grammar. I was inspired by those nearly-grammatical-but-not-quite-meaningful spam subject lines you sometimes get; whilst they tend to use a core phrase with some random words added in to confound spam filters, they reminded me of the sorts of phrases that context-free grammars (which I first encountered when reading Logic by Wilfrid Hodges) sometimes generate. Since generating sentences from a grammar is an algorithmic, iterative process, it seemed quite well suited to being done by a computer.

  I was most experienced in Java, so at first I worked with that. To start off with, I didn’t want to complicate things by introducing external files (plain text or databases), so I hard-coded the CFG. This was obviously not an ideal solution, but given the object-oriented nature of my approach, it was the simplest. I got things working to a reasonable degree, with only a little jiggery-pokery in order to balance the probabilities a bit.

  However, despite portability being one of the fundamental concepts behind Java, it’s not the sort of thing that most people want to mess around with. I quickly knocked up an applet front-end, but I could only get it working locally – as soon as I uploaded it, it displayed nothing at all. I also tried compiling it to a Windows .exe, but for some reason this refused to run on anyone else’s computer.

  In light of all of this, and because I really felt that it was time I learnt it, I decided to try again using PHP. I downloaded UniServer, which was a lot more co-operative than previous PHP software I’d tried, and took a quick look at a basic PHP tutorial. Things were largely familiar, so I didn’t really have a problem picking it up. The most alien things were regular expressions, which I had encountered before, but which I found to be a bit fiddly. I think it’s the compactness of them – which, indeed, is one of their greatest strengths – that put me off.
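
  For anyone curious what generating a sentence from a hard-coded CFG looks like, here is a minimal sketch. It is written in Python rather than the Java or PHP described in this post, and it is not the code behind the generator: the toy grammar and the names GRAMMAR, expand and random_sentence are all invented for illustration. Each symbol is replaced by one of its alternatives, chosen at random, until only plain words are left.

```python
import random

# Hypothetical toy grammar (an assumption, not the rule set from the post).
# Each non-terminal maps to a list of alternative expansions; any token that
# is not a key of the dictionary is treated as a terminal word.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], ["the", "ADJ", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "N": [["cupcake"], ["banana"], ["parser"]],
    "ADJ": [["random"], ["pointless"]],
    "V": [["eats"], ["generates"], ["sleeps"]],
}


def expand(symbol, grammar, rng):
    """Recursively expand a symbol into a list of terminal words."""
    if symbol not in grammar:
        return [symbol]  # terminal: emit the word itself
    production = rng.choice(grammar[symbol])  # pick one alternative uniformly
    words = []
    for part in production:
        words.extend(expand(part, grammar, rng))
    return words


def random_sentence(grammar, start="S", rng=None):
    """Build one random sentence, starting from the given start symbol."""
    if rng is None:
        rng = random.Random()
    return " ".join(expand(start, grammar, rng))


if __name__ == "__main__":
    print(random_sentence(GRAMMAR))
```

  This toy grammar has no rule that refers back to itself, so the expansion always stops; a grammar with recursive rules needs some kind of depth guard. Picking alternatives uniformly is also a simplification: balancing the probabilities would mean attaching weights to the alternatives.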

  I decided that since I was starting things afresh, I might as well start using an external file for storing the CFG. It actually turned out not to be too difficult after all; I found PHP’s string-handling capabilities to be very impressive and quite helpful for what I was doing (I don’t know whether PHP is actually better at handling strings, or whether it’s simply the online documentation – which I like very much – that made things easier). My main problem was constructing a mechanism to prevent the recursion from repeating indefinitely – one of the advantages of my Java implementation was that I could determine how the program should react to excessive depth on a class-by-class basis (with a class for each symbol). I knew what I wanted to do pretty early on, but numerous small errors on my part led to several hours of misery. But it’s been over for a few days now, so I think I can safely report my success.
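
  The two ideas in that paragraph, reading the rules from a text file and stopping runaway recursion, can be sketched roughly as follows. Again this is Python, not the PHP described in the post, and every detail is an assumption made for illustration: the "LHS -> alt | alt" file format, the function names, and the particular depth strategy (once the depth limit is reached, only the alternatives that finish soonest are considered). The grammar is also assumed to be productive, meaning every symbol can eventually be rewritten into plain words.

```python
import random

# Hypothetical rule file contents; the format is an assumption.
GRAMMAR_TEXT = """
S -> NP VP | S and S
NP -> the N | the ADJ N | NP that VP
VP -> V | V NP
N -> cupcake | banana
ADJ -> random | pointless
V -> sleeps | eats
"""


def parse_grammar(text):
    """Parse 'LHS -> alt | alt' lines into {symbol: [[token, ...], ...]}."""
    grammar = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        head, _, body = line.partition("->")
        alternatives = [alt.split() for alt in body.split("|")]
        grammar.setdefault(head.strip(), []).extend(alternatives)
    return grammar


def shortest_heights(grammar):
    """For each non-terminal, the height of its shallowest derivation tree."""
    heights = {}
    changed = True
    while changed:  # repeat until no height can be improved any further
        changed = False
        for symbol, alternatives in grammar.items():
            for alt in alternatives:
                parts = [p for p in alt if p in grammar]
                if all(p in heights for p in parts):
                    h = 1 + max((heights[p] for p in parts), default=0)
                    if h < heights.get(symbol, float("inf")):
                        heights[symbol] = h
                        changed = True
    return heights


def expand(symbol, grammar, heights, rng, depth=0, max_depth=6):
    """Expand a symbol; past max_depth, only take the quickest way out."""
    if symbol not in grammar:
        return [symbol]
    alternatives = grammar[symbol]
    if depth >= max_depth:
        # Assumes a productive grammar, so every non-terminal has a height.
        def alt_height(alt):
            return max((heights[p] for p in alt if p in grammar), default=0)

        best = min(alt_height(alt) for alt in alternatives)
        alternatives = [alt for alt in alternatives if alt_height(alt) == best]
    words = []
    for part in rng.choice(alternatives):
        words.extend(expand(part, grammar, heights, rng, depth + 1, max_depth))
    return words


if __name__ == "__main__":
    rules = parse_grammar(GRAMMAR_TEXT)
    print(" ".join(expand("S", rules, shortest_heights(rules), random.Random())))
```

  A single global limit like this is cruder than deciding the reaction symbol by symbol, but it does guarantee that every expansion ends: below the limit the depth grows by one per step, and past it each symbol is finished in as few further steps as the grammar allows.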

  Anyway, I’ve been talking at length about the creation process, which wasn’t very interesting, so I’ll cut to the proverbial chase and provide the links: the first version simply provides a random sentence each time, whilst the second allows you to alter the rule set used by the program. My intention is that people get so obsessed with fiddling with this pointless toy that humanity’s productivity is reduced to roughly zero. Why this should be my goal, I haven’t yet decided. I just like it. Any contributions anybody wants to make will be considered for incorporation, but so far most people haven’t bothered messing with the mechanics behind the thing.

  I started writing this three days ago, then got distracted and decided I’d finish it another day. When I came back to it, I couldn’t really work out what was left to say. I can’t be bothered to read it properly, so it might not make much sense. This troubles me, but not enough to actually do anything about it.

5 Comments:

James said...

Did I just get spammed by a diabetes site? That’s really bizarre.

28/10/05 15:28  
Anonymous said...

Well at least you are apparently "very unique", which surely doesn't make any sense. You either are unique or not, you can't be very unique.

30/10/05 11:45  
Anonymous said...

Hahahaha... diabetes...

Well, that was really long and although I can't say I read the whole thing or even understood most of it, I will say cupcake bananas. Why the hell don't they make more cupcake bananas?!

3/11/05 15:16  
Anonymous said...

Rather than being inspired by subject lines, I believe it was this conversation, where I first suggested creating a script like this. I know it's pedantic, but your story isn't exactly telling the truth with regard to your creativity ;-).

5/11/05 20:05  
James said...

Well the conversation was about subject lines, was it not? To quote your own words: ‘I wonder what script/programme might be capable of generating such wonderful subject lines.’ And if we’re going to get really pedantic, the first thing I had thought upon encountering context-free grammar was that such a thing is perfectly suited to computer implementation. So ner.

Unfortunately it would appear that my host has decided to delete all my PHP files suddenly and without warning; furthermore, it has also stopped letting me upload them. The bastards. If anyone knows of a good free PHP host, please let me know.

6/11/05 11:47  
