#4 - Stochastic weather
We leaf through Kaikkea sattuu again, the very interesting collection of essays on various practical manifestations of chance. Exercise #2 was about chance at work in the mutation-selection sequence driving the evolution of species. Here, it is about meteorology. Physics professor Heikki Järvinen explains how predicting the reliability of the weather forecast is just as important to useful weather prediction as the forecast itself - one cannot act on a report predicting rain without trusting it at all, just as excessive faith in a sunshine forecast can cause havoc on the high seas. Here, forecast errors are a matter of conditional probabilities. They follow a probability distribution conditional on the specific weather forecast value - storm, wind, or glassy sea. How can we jointly predict tomorrow's weather and the certainty of that very forecast?
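To make the idea of a distribution conditional on the forecast value concrete, here is a minimal sketch. The (forecast, observed) pairs are invented for illustration only and do not come from Järvinen's essay; the function name conditional_outcomes is likewise hypothetical. It simply estimates, by relative frequency, what was observed given what was forecast.

```python
from collections import Counter, defaultdict

# Hypothetical (forecast, observed) pairs, invented purely for illustration.
pairs = [
    ("storm", "storm"), ("storm", "wind"), ("storm", "storm"), ("storm", "storm"),
    ("wind", "wind"), ("wind", "glassy sea"), ("wind", "wind"), ("wind", "storm"),
    ("glassy sea", "glassy sea"), ("glassy sea", "glassy sea"),
    ("glassy sea", "wind"), ("glassy sea", "glassy sea"),
]

def conditional_outcomes(pairs):
    """Estimate P(observed | forecast) by relative frequency."""
    counts = defaultdict(Counter)
    for forecast, observed in pairs:
        counts[forecast][observed] += 1
    return {
        forecast: {obs: n / sum(c.values()) for obs, n in c.items()}
        for forecast, c in counts.items()
    }

table = conditional_outcomes(pairs)
for forecast, dist in table.items():
    # The probability attached to the forecast value itself is its reliability.
    print(forecast, "->", dist)
```

In this toy table, a storm forecast and a wind forecast do not deserve the same trust, which is the point: the error distribution depends on what was forecast.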
In the underworld, the jaws of high-performance computers grind equations representing the physics of atmospheric flows, coupled with models of water vapor mass ebbs, as well as various energy, mass, and momentum transfer phenomena, such as the passage of shortwave solar radiation through the atmosphere. Yet nature is continuous, computation discrete and finite. Reality is too complex and beyond reach. Models fail to capture all things happening in the smallest details. (Järvinen, 2015)
Ilmakehän simulointiin käytettävä malli on siis todellisen ilmakehän matemaattinen abstraktio, johon sisältyy virheitä ja puutteita. Nämä puutteet muodostavat sääennusteen epävarmuuden ensimmäisen lähteen.
The model used for simulating the atmosphere is therefore a mathematical abstraction of the real atmosphere, which contains errors and deficiencies. These deficiencies constitute the first source of uncertainty in weather forecasting.
Let’s move on to the dissection. Physics is interesting, but Finnish much more so.
Learning Finnish like a machine. Up front, the method deserves one clarification. Some regulars from previous episodes might be inclined to hasty conclusions. They might say: this isn't very original, all you are actually doing is rehashing « grammaire par l'exemple » (grammar by example).
First, that would not be much of an insult. Grammar inference certainly doesn’t rank the lowest among language learning tactics. It is an order of magnitude above the dry inculcation of grammar rules, which should be banned in schools - first of all because grammar does not exist.
Textbook grammar is the hallucination of outdated doctrinaires who, feeling called upon when the nations of the world crystallized between the late 18th and early 20th centuries, decided it was possible and patriotic to file their vernacular language into forty-two tables designed to systematically mold the good citizen. It is the exact counterpart of a taxonomy à la Carl von Linné, freezing, in lifeless numbered plates, a dull version of the plant world.
A language is spoken, written. It is a living phenomenon and, like the epic of species, a phenomenon in constant evolution. Mutation–selection is taking place right now, with every new line written, every syllable pronounced. There is therefore no grammar other than inferred grammar. From text to grammar. The reverse diktat is a crime against the vital force of language - and, besides, a smug satisfaction with a very simplistic model.
Won’t you concede that, in practice, a certain degree of harmonization facilitates transactions - that it speeds up business? We couldn’t agree more! This is why we are betting on learning through exposure to massive amounts of textual data. If this does not bring us into harmony with the language as it is actually practiced, we do not know what would.
And this is how machines learn. Have you ever seen a modern language model - post‑2015 - leaf through the thick volumes of a bound grammar book before generating anything reasonably correct? It practices, it scans, it picks out the most likely patterns, it comes up with the most likely next word. You won’t catch the machine scribbling in notebook margins: « Hey, interesting, this is the illative case here for the space into which we enter. And from now on, whenever I jump into the pool, I will use it again ». Yet it knows. It has seen so many examples that the correct form comes naturally.
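As a very crude stand-in for "coming up with the most likely next word", here is a sketch of a bigram counter. It is not how a modern neural language model works internally; it only illustrates prediction from observed frequencies, with no grammar table anywhere. The three-sentence corpus is invented, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Tiny invented corpus; a real model sees vastly more text.
corpus = [
    "we jump into the pool",
    "we dive into the pool",
    "we jump into the sea",
]

def train_bigrams(sentences):
    """Count, for every word, which words follow it and how often."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows

def most_likely_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    candidates = follows.get(word)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

follows = train_bigrams(corpus)
print(most_likely_next(follows, "the"))   # pool (seen twice, against sea once)
print(most_likely_next(follows, "into"))  # the
```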
But then you are being inconsistent; your practice is not well aligned with your grand plans. You are not really learning like machines.
Well, in fact we are - as much as we can. We are not machines. We don't browse thousands of tokens from Common Crawl per second, not even on our rest day after weight lifting at the gym. And we, arguably, have a special something extra: let's call it consciousness or maybe conceptual thinking. The machine has no box for the illative, it chews. A belly. We have less stomach, more mind. Even without meaning to, there comes a time when we notice ourselves noticing that things ending in -VVn (or -VhVn, -seen, or -siin) designate the space we enter. We coin a good name for it.
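The ending pattern we "notice ourselves noticing" can be written down as a rough regular expression. This is a sketch under assumptions: it reads -VVn as the same vowel doubled and -VhVn as the same vowel on both sides of h, and the name looks_illative is hypothetical. It is a surface heuristic, not a morphological analyzer.

```python
import re

VOWELS = "aeiouyäö"

# Heuristic pattern for the endings named above:
#   -VVn   (same vowel doubled, then n)
#   -VhVn  (same vowel on both sides of h)
#   -seen, -siin
ILLATIVE_LIKE = re.compile(
    rf"(?:([{VOWELS}])\1n|([{VOWELS}])h\2n|seen|siin)$"
)

def looks_illative(word):
    """True if the word merely ends like an illative; no guarantee it is one."""
    return ILLATIVE_LIKE.search(word.lower()) is not None

for word in ["simulointiin", "johon", "päiviin", "ilmakehän", "malli", "sääennusteen"]:
    print(word, looks_illative(word))
```

The last word is instructive: sääennusteen also ends in a doubled vowel plus n, yet in our sentence it is a genitive. The surface pattern over-generates, which is exactly why a noticed regularity remains a learning booster rather than a rule.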
But how do translation machines pick patterns? By matching. Source language, target language. Proposition by proposition, group of words by group of words. Finer and finer.
Our use of grammar is incidental, a learning booster for us humans, and not fundamental to the method. As already mentioned, the method can be practiced at different speeds and levels of depth. Today, speaking of weather, let's see what a quick, nearly grammar-free scan yields.
Ilmakehän simulointiin käytettävä malli on siis todellisen ilmakehän matemaattinen abstraktio, johon sisältyy virheitä ja puutteita. Nämä puutteet muodostavat sääennusteen epävarmuuden ensimmäisen lähteen.
The model used for simulating the atmosphere is therefore a mathematical abstraction of the real atmosphere, which contains errors and deficiencies. These deficiencies constitute the first source of uncertainty in weather forecasting.
Let’s start with the obvious: three propositions clearly delineated by punctuation. Arbitrarily, we begin with the relative clause in the first sentence.
… , johon sisältyy virheitä ja puutteita.
… , which contains errors and deficiencies.
The little word ja necessarily coordinates errors and deficiencies. The relative pronoun follows directly after the comma. And, either by elimination or because sisältyy betrays its origin in sisä-, inner, we have the verb contains.
… , johon sisältyy virheitä ja puutteita.
… , which contains errors and deficiencies.
… and deficiencies. These deficiencies constitute the first source of uncertainty in weather forecasting.
Repetition always works in our favor.
… ja puutteita. Nämä puutteet muodostavat sääennusteen epävarmuuden ensimmäisen lähteen.
We have no grammar today, but we still have a good memory. We had ensimmäisen virkavuoden for first year in office; ensimmäinen is first, so first source is probably ensimmäisen lähteen. The whole of the first source of uncertainty in weather forecasting must cluster in the neighborhood. We also remember what seems to be a regular way of unfolding noun complements: the whole of ensimmäisen virkavuoden viimeisiin päiviin for (to) the last days of the first year in office. One noun phrase featuring first the complement the first year in office, then the head noun päiviin. It would therefore hardly be surprising to find the first source of uncertainty in weather forecasting upside down. Let’s bet on
… sääennusteen epävarmuuden ensimmäisen lähteen.
… the first source of uncertainty in weather forecasting.
The particle epä- confirms the hypothesis. The Estonian equivalent is eba-, the particle of negation. epäselvä, unclear. epävarmuuden, uncertainty. Between sääennusteen and lähteen, it would be strange if sääennusteen were source, cut off from its epithet ensimmäisen by its own complement. This therefore confirms lähteen as source, leaving only sääennusteen for in weather forecasting.
In truth, identification is even easier when the sentence floats within its context. We are reading an essay on the stochasticity of weather forecasts, and the vocabulary revolves around that theme. We do not remain ignorant for long of the stems related to climate, atmosphere, flux, uncertainty, forecast, or weather. By the time our sentence appears, we have already come across, within four folio-sized pages: sään, säätä, sääennusteisiin, säiden, sääennusteen, säätilasta, sääennusteeksi, säätilakin, sääennustamisen, sääennusteiden twice, sääilmiöiden. (If we may interject a brief grammatical reflection, it seems that Finnish is a language rich in inflectional cases.) No doubt sää-ennusteen matches in weather forecasting.
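The same recognition can be sketched as a prefix count over the forms quoted in the paragraph above. The three prefixes are our own choice for illustration, not a claim about Finnish stems, and the function name count_by_prefix is hypothetical.

```python
from collections import Counter

# Forms quoted in the paragraph above, in order of appearance
# (sääennusteiden is listed twice because it occurred twice).
forms = [
    "sään", "säätä", "sääennusteisiin", "säiden", "sääennusteen",
    "säätilasta", "sääennusteeksi", "säätilakin", "sääennustamisen",
    "sääennusteiden", "sääennusteiden", "sääilmiöiden",
]

def count_by_prefix(words, prefixes):
    """Assign each word to the longest listed prefix it starts with."""
    ordered = sorted(prefixes, key=len, reverse=True)
    counts = Counter()
    for word in words:
        match = next((p for p in ordered if word.startswith(p)), None)
        counts[match] += 1
    return counts

counts = count_by_prefix(forms, ["sää", "sääennust", "säätila"])
print(counts)
```

One form, säiden, escapes the plain prefix sää and lands under None. A raw string match is a useful first pass, but the reader's eye still has to catch the stem when it changes shape.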
… sääennusteen epävarmuuden ensimmäisen lähteen.
… the first source of uncertainty in weather forecasting.
Completing the sentence is little more than an eye test.
Nämä puutteet muodostavat sääennusteen epävarmuuden ensimmäisen lähteen.
These deficiencies constitute the first source of uncertainty in weather forecasting.
We are left with a pronoun and a verb. We can conclude with near-zero uncertainty that nämä is these, muodostavat, constitute.
Ilmakehän simulointiin käytettävä malli on siis todellisen ilmakehän matemaattinen abstraktio, …
The model used for simulating the atmosphere is therefore a mathematical abstraction of the real atmosphere, …
The first proposition is all the clearer for being technical. A fairly universal tribute to Latin.
Ilmakehän simulointiin käytettävä malli on siis todellisen ilmakehän matemaattinen abstraktio, …
The model used for simulating the atmosphere is therefore a mathematical abstraction of the real atmosphere, …
And we expect complex noun phrases to be upside down. There are three of them. The last one translates
a mathematical abstraction of the real atmosphere
the first is the whole of
The model used for simulating the atmosphere
which can (improperly) unfold into
The model used for *the simulating of the atmosphere
thereby embedding a third:
(the) simulating (of) the atmosphere
Atmosphere, twice! It’s becoming a game of Where’s Waldo or Spot the difference. In
Ilmakehän simulointiin käytettävä malli on siis todellisen ilmakehän matemaattinen abstraktio, …
ilmakehän twice forms a noun complement, placed before its noun
Ilmakehän simulointiin …
(the) simulating (of) the atmosphere
… todellisen ilmakehän matemaattinen abstraktio, …
a mathematical abstraction of the real atmosphere
and todellisen is an epithet, real, to the second.
Unsurprisingly, not just simulating the atmosphere but the whole of The model used for simulating the atmosphere is in Finnish the other way round from English.
*(the) for simulating the atmosphere / used / model
German can also do
das zur Simulation der Atmosphäre / verwendete / Modell
We already have
Ilmakehän simulointiin käytettävä malli
*(the) for simulating the atmosphere / used / model
malli sounds a bit like model, no? Yes, malli is borrowed from Swedish mall, itself borrowed from Dutch mal, itself borrowed via Old French from Latin modulus, measure - also the ancestor, via modellus in Colloquial Latin, of the English model. Or we can simply follow the (reverse) word order: the head of the noun phrase in last position, the participle preceding.
Ilmakehän simulointiin käytettävä malli
*(the) for simulating the atmosphere / used / model
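The "upside down" reading can be mimicked mechanically: take the word-for-word gloss in Finnish order, chunk by chunk, and flip it. A minimal sketch follows. The first gloss is the one used above, without its bracketed article; the chunking of the second, including where "of" and "in" are attached, is our own assumption. The name flip_gloss is hypothetical.

```python
def flip_gloss(gloss, sep="/"):
    """Reverse the order of slash-separated chunks in a word-for-word gloss."""
    chunks = [chunk.strip() for chunk in gloss.split(sep)]
    return " ".join(reversed(chunks))

# Ilmakehän simulointiin / käytettävä / malli
print(flip_gloss("for simulating the atmosphere / used / model"))
# model used for simulating the atmosphere

# sääennusteen / epävarmuuden / ensimmäisen lähteen
print(flip_gloss("in weather forecasting / of uncertainty / first source"))
# first source of uncertainty in weather forecasting
```

The flip works at the level of chunks only: within ensimmäisen lähteen, the epithet still precedes its noun, as in English.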
Almost there.
Ilmakehän simulointiin käytettävä malli on siis todellisen ilmakehän matemaattinen abstraktio, …
The model used for simulating the atmosphere is therefore a mathematical abstraction of the real atmosphere, …
Well, we suggest, without grammar, that on is the verb to be in the third person singular of the present indicative. siis is therefore.
Grammar, like a forecasting model with respect to atmospheric physics and phenomena, is a rather simplistic model of the flexible diversity of language. We can in fact do without it quite easily for the exercise. Nonetheless, it is undeniably helpful. It speeds up identification through the recognition of forms. It is a fantastic ingredient in machine learning of language the human way.
References
Järvinen, H. (2015). Sään ennustamisen arpapeli. In I. Hetemäki, P. Raento, H. Sariola & T. Seppä (Eds.), Kaikkea sattuu (pp. 49–62). Gaudeamus.


