Wednesday, May 24, 2006

Baseball, language, and infinity

I’m a little late to baseball this season. I guess I’ve had other things on my mind. But everyone in the office is now talking baseball, so I’d better start.

Yesterday I mentioned to a couple of coworkers that there are an infinite number of possible baseball games. I was thinking about this because if you open any linguistics 101 textbook, one of the first claims you’ll come across is that the number of possible sentences in any language is infinite. For example, from p. 9 of Fromkin and Rodman’s An Introduction to Language, Fifth Edition:

Simple memorization of all the possible sentences in a language is impossible in principle. If for every sentence in the language a longer sentence can be formed, then there is no limit to the length of any sentence and therefore no limit to the number of sentences.

Because, unlike most games, baseball doesn’t have a clock, you can say the same thing about it. A half inning isn’t over until the defense records three outs, so as long as that condition is unmet, the inning goes on. Take the longest half inning ever played and change the last at bat so that the batter gets on base, and you’ve created a new half inning that is longer than the longest one. So there is no limit to the number of possible half innings, and therefore no limit to the number of possible games.
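
Here’s a minimal sketch of that in Python (the out probability is a number I made up; it doesn’t matter). The point is just that the loop has no built-in length limit, only the three-out end condition:

import random

def half_inning(out_probability=0.7):
    # 0.7 is an arbitrary made-up number; the point is only that the
    # loop has no length limit, just the three-out stopping condition.
    outs = 0
    plate_appearances = 0
    while outs < 3:
        plate_appearances += 1
        if random.random() < out_probability:
            outs += 1          # the batter is retired
        # otherwise the batter reaches base and the half inning keeps going
    return plate_appearances

print(half_inning())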

Of course, just like language, we can describe this infinite space in a finite way. The rules of baseball are finite.

But baseball and language differ in how they produce the infinite variety. In baseball there is an end state: batters keep coming up until three outs are reached. Language seems to get its infinite variety a different way, through recursion.

I love recursion. When I was a kid I used to run the following BASIC program on my TRS-80:

10 PRINT "Ed is great!"
20 GOTO 10

It was cool to see the screen fill up with blinking “Ed is great!”s. Language uses a slightly more sophisticated version of the same trick. In language, noun phrases can have noun phrases inside them: “The man with the hat”. Or sentences can have sentences inside them: “Bill said that Mary likes John.” And so on.
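
Here’s a rough sketch of that trick in Python (the tiny lexicon and the fifty-fifty embedding chance are just placeholders): the function that builds a noun phrase calls itself, so there’s no longest noun phrase it can build.

import random

def noun_phrase(depth=0):
    # Pick a simple noun phrase, then sometimes embed another noun phrase
    # inside it via a preposition. The grammar itself has no bound; the
    # depth cap is only there to keep a demo from blowing the call stack.
    np = random.choice(["the man", "the hat", "the dog"])
    if depth < 10 and random.random() < 0.5:
        np += " " + random.choice(["with", "near", "behind"]) + " " + noun_phrase(depth + 1)
    return np

print(noun_phrase())   # e.g. "the man with the hat near the dog"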

Imagine trying to model language like baseball: you have a finite lexicon that consists of words plus a sentence-ending symbol, say a period. The period is your end state, and you have an output constraint on sentences: when a period shows up, they end. From there you can simply string together words from the lexicon until you hit a period and then stop. This would give you all the possible sentences in a language. The catch is that it would also give you a bunch of impossible ones, so you’d need some other rules or conditions to rule out the bad ones. Those rules or conditions would still have to create recursive structures.

It might work though.
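
Here’s a rough sketch in Python of what I mean (the lexicon and the filter are obviously toy placeholders): string random words together until the end-state symbol appears, then hand the result to something that rules out the bad ones.

import random

LEXICON = ["Bill", "Mary", "likes", "said", "that", "the", "man", "with", "hat", "."]

def babble():
    # String words together until the end-state symbol (the period) shows up.
    words = []
    while True:
        word = random.choice(LEXICON)
        if word == ".":
            return " ".join(words) + "."
        words.append(word)

def looks_grammatical(sentence):
    # Stand-in for the "other rules or conditions" that would have to rule
    # out the impossible strings; this filter is the hard part.
    return len(sentence.split()) > 1

candidate = babble()
if looks_grammatical(candidate):
    print(candidate)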

2 comments:

Philadaddy said...

This reminds me of the bombastic Yankees radio announcer John Sterling. Whenever anything the least bit unusual happens in a game, he says something like, "just goes to show you, you can't predict baseball." Well no shit, that's why people watch it. What's the relationship between predictability and infinity?

Ed Keer said...

RE: predictability and infinity--They aren't really related, I think. Chance plays a role in baseball, but not in grammars--at least not generative grammars. And yet both can produce an infinite set out of finite components.

I think the language generators that spammers use in their emails to get past spam filters are a pretty good example. They generate random sentences based on the probability of a word appearing given the previous word. So you get something that is eerily language-like but isn't.
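
Something like this word-bigram trick, sketched in Python (the training sentence is just a stand-in for whatever text the spammers scrape): each next word is picked according to what followed the previous word in the training text.

import random
from collections import defaultdict

text = "the man with the hat said that the man likes the hat".split()

followers = defaultdict(list)
for prev, nxt in zip(text, text[1:]):
    followers[prev].append(nxt)      # remember every word that followed prev

def babble(start="the", length=10):
    words = [start]
    for _ in range(length):
        options = followers.get(words[-1])
        if not options:
            break
        words.append(random.choice(options))   # repetition in the list acts as frequency weighting
    return " ".join(words)

print(babble())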
