Wednesday, November 6, 2019

Language, Logic and Punctuation


Language, Learning and Logic


Language, Learning and Logic are the three critical aspects of intelligence, and language is in focus in the Turing Test as a way of determining whether an AI can be regarded as ‘thinking’ or ‘intelligent’.

Understanding how to write not just correctly but effectively involves considering how writing affects reading. This is more important than ever today – especially when you consider how widely you may be read. What you write today will in general be read not just by humans, but by a variety of AI programs – whether they are trying to index it or understand it, or just reading it out loud.

Spoken vs Written


One thing that is often missed in such discussions is that language ‘evolved’ in its spoken form as part of our culture, but the rules of grammar, spelling and punctuation were ‘invented’ in association with writing. When a Conversational Agent communicates with us in writing, that is already a long way from our natural modality – but it provided Turing with a natural way of isolating the contestants so that we couldn't use sight and sound to distinguish them.

Whether Embodied Conversational Agents, GPS guidance systems, or e-book readers, AI systems can be made to look and sound arbitrarily realistic – although there is still a fair bit of work to do on emotion, expression and prosody.

Ironically, this is also exactly where we have issues with writing.

Morphology, Grammar and Punctuation


In relation to writing, the origin of morphology, grammar and punctuation was to capture the contextual rules that underlie language in the larger sense that includes not just words and sentences, but emotion, expression and prosody. When we are speaking, our tone of voice, rate of speaking, facial expressions, etc. all contribute to conveying meaning that goes beyond the actual words and grammatical forms used. Punctuation was invented to provide the cues that allowed us to read back the intended emotion, expression and prosody (whether read out loud or subvocally).

Unfortunately, the invention of printing changed all of this. Printers were more concerned with aesthetics and practicality than the correct logical use of punctuation.  Some rules specifically relate to the days of typesetting with lead, and the ease of chipping off small punctuation marks at the end of a line of type.

But we will here consider the fundamental punctuation marks in relation to the fundamentals of prosody – where the first and most important consideration is how punctuation affects your breathing: that is, when you take a breath.

A Prosodic view of Punctuation


Comma (,) is, at heart, a breathing point. It can thus serve a number of functions including separating a list of short phrases or clauses that don't themselves have commas in them – which would be confusing.

Semicolon (;) gives us a second level of more major break/breath points that can also separate a list of phrases or clauses which can now have commas in them – there is no possibility of comma confusion now.

One minor detail here (the Oxford comma) is whether you should or shouldn't have a comma before the ‘and’ that marks the final term in a list. You can have wars about this if you want, but a pragmatic rule is simply “put one in if you want to tell readers they need to break/breathe here”.

Colon (:) is used when you announce a list, with the list itself separated by either commas or semicolons as appropriate – although the complexity threshold where you feel you need semicolons will be lower.  The longer the sentence the more likely you are to want to use semicolons.
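The comma/semicolon rule for lists is mechanical enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the function names `join_list` and `announce` are made up for this example, and always keeping the final break when semicolons are in use is my own simplification rather than a rule stated above.

```python
def join_list(items, oxford=False):
    """Join phrases into a list, choosing the separator by the rule above."""
    items = list(items)
    if not items:
        return ""
    if len(items) == 1:
        return items[0]
    # Promote to semicolons as soon as any item carries its own comma.
    sep = "; " if any("," in item for item in items) else ", "
    if len(items) == 2 and sep == ", ":
        return f"{items[0]} and {items[1]}"
    head = sep.join(items[:-1])
    # Assumption: with semicolons the final break is always kept; with
    # commas it is the writer's choice (the Oxford comma).
    if sep == "; " or oxford:
        return f"{head}{sep}and {items[-1]}"
    return f"{head} and {items[-1]}"


def announce(intro, items, oxford=False):
    """Announce a list with a colon, then separate it as appropriate."""
    return f"{intro}: {join_list(items, oxford)}."


print(announce("We need three things", ["tea", "milk", "sugar"]))
print(announce("We visited", ["Paris, France", "Rome, Italy", "Bonn, Germany"]))
```

The `oxford` flag stands in for the pragmatic rule: set it when you want the reader to break/breathe before the final ‘and’.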

Colon (:) can also be used to introduce direct quotation (in “double quotes” usually), although comma is more often used for this these days. Colon is the correct way of introducing an explanation or elaboration. Single quotes are often used for quotes within quotes, ‘scare quotes’ where you acknowledge that this usage is not common or needs to be taken with a grain of salt, or highlighting ‘interesting’ words that you are discussing while not actually quoting anyone saying them.

Semicolon (;) can also be used to separate parallel ideas where one doesn't explain or follow from the other (in which case colon is used). This is really what's happening when it is used in a list too.

Colon and semicolon often come at the end of a clause (typically a sentence nucleus that has subject, verb and object) where the sentence is not quite finished yet because something else needs to be said to complete or fill out the idea. It is a good idea to read such sentences and see if you can replace these clause-level forms of punctuation with sentence-terminating punctuation (which means the part after the colon has to have its own subject and predicate).

Sentential punctuation


The sentence-final punctuation point or stop takes three forms: full stop (.), exclamation mark (!), and question mark (?). Note that all three include a point as a stop mark at the baseline, the stop or point indicating a brief break or pause.

If you have multiple clauses in a question or an exclamation, it is good to review to see if they all have the same (interrogatory or exclamatory) character. Often only the first clause does, and so that is where the relevant mark should go – particularly where what follows goes on to suggest or discuss possible responses or provide parenthetical information. Other times you might need to write a series of questions, or a mix of questions, exclamations and statements.

The biggest issue relating to the use of question marks is the complex multi-clause sentence. In this case, more clarity can often (but not always) be achieved by breaking it up into a series of smaller sentences/questions.

The exclamation mark has come in for a lot of criticism – and, as often happens with language-related issues, well-meaning editors have thrown the baby out with the bathwater! If you overuse exclamation marks, then you are likely to feel you have to resort to multiple exclamation marks to make your point!! We will come back to this!!!

Parenthetical punctuation


The other main use for punctuation is for some sort of bracketing – this includes the use of matched single or double quotes (which should ‘curl’ like round brackets) as well as the round, square and curly brackets. Generally the bracket forms are not seen much in fiction, but there are particular conventions as to how they should be used in technical writing – which is largely beyond the scope of the present discussion, although the so-called Harvard convention is explained here.

When we put something in round brackets (like this) it indicates that nothing is lost (grammatically or contextually) when we leave it out, but that this is a helpful reminder or pointer, or an interesting aside – and it is up to the reader whether they take it or leave it. When it is integral to the story, including an elaboration of a point you are making, a pair of commas is used for ‘parenthesis’ – or potentially a pair of dashes. Commas and dashes used to mark parenthesis do not need to close if they are sentence final (so full stop and its variants close them).

The same rule applies in citations: you put the references to the literature in parentheses (Author, Date) when the citation is parenthetical in that it doesn't have a grammatical role to play in the sentence; Author (Date) introduces something that Author said or did, but only the date is parenthetical. In some conventions square brackets or even curly brackets are used, particularly when a numerical marker is used. Leaving out what is in brackets should not affect understanding of the sentence and who did the primary work or came up with the idea (even in the numerical/footnote/endnote usages). A footnote or endnote is simply a parenthetic explanation that is left out of the main text to avoid disrupting readability.

Avoiding parentheses (when you don't want people to ignore what you've said in parenthesis) can be tricky, and commas can be confusing. Commas (or even semicolons) are promoted to dashes when the parenthetical comments are relatively long or complex, allowing them to incorporate their own commas without confusion.

Furthermore, the dash is often interchangeable with the colon – indicating that the explanation bit is somewhat optional/parenthetic. This explanatory usage thus looks like dash parenthesis closed by clause or sentence level punctuation (usually a full stop). I have used quite a few of these auto-closed dashes – they are very convenient and often more readable than the alternatives (with a lighter and less pedantic feel than colon). But sometimes it would be possible to just make them a separate sentence.

Generally punctuation should go inside brackets or quotes if it belongs to the quoted sentence, and outside if it belongs to the enclosing sentence. Where both make sense, generally use the inner position – in particular, US printers and publishers (including the much deprecated Chicago Manual of Style) tend to place comma and full stop inside for aesthetic reasons, even when it goes against the logic of the sentence (The New York Times Manual of Style and Usage is quite logical and is recommended for the US market).

That dashed hyphen


There is a huge difference between a dash and a hyphen – and not just the differences in size.

A hyphen is morpheme-to-word or multiple-word punctuation that is used when, without the hyphens, normal grammar rules don't suffice to allow us to make sense of a phrase, but the phrase is not settled enough to be joined directly into a single word. Often these hyphenated words will be ‘collocations’ that have actually risen to an idiomatic, close to word-like status in current usage. Generally, when using such a concept as multiple words in their original sense, with standard grammar, hyphens aren't used – but when the same idiomatic or technical phrase is used as an adjective, a hyphen will be required. (Note the use of ‘multiple word’ with and without hyphen in this paragraph.)

A dash is word-to-sentence level punctuation and is used to intersperse or conclude with parenthetic clauses that are too important to be enclosed in parentheses (round brackets), or which need more separation than is afforded by a colon or semicolon. Since it occupies a word-like level (unlike hyphen) it is appropriate to use an en-dash (viz. the size of the letter 'N') surrounded by spaces – the printer's usage of an em-dash without spaces (based on the size of 'M') is to be deprecated as both ugly and misleading—horrid! Furthermore, if the em-dash is used it must be separated from neighbouring words by thin or hair breaking spaces (el-space being the size of 'l' and corresponding to unicode 2009; best is often unicode 200A hair space). This is important so that word-processing and typesetting/formatting systems can break and justify appropriately (otherwise words in two different segments of the sentence get treated as a single word). Note that some (modern) fonts make the dashes and spaces too big — best is an Old Style (or Antiqua) font where em/en-space/dash are matched to M/N (or m/n for a bigger visual contrast).
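For anyone cleaning up a manuscript, the two options described above can be sketched in Python. Treat this as a rough illustration rather than a finished tool: the function names are invented for the example, and the pattern will also put a space in front of an em-dash that opens a line of dialogue, which you may not want. The code points are the ones named above, and the standard `unicodedata` module is used only to confirm their official names.

```python
import re
import unicodedata

EN_DASH = "\u2013"     # width of 'N'
EM_DASH = "\u2014"     # width of 'M'
THIN_SPACE = "\u2009"
HAIR_SPACE = "\u200a"


def to_spaced_en_dash(text):
    """Replace each em-dash (spaced or not) with an en-dash between ordinary spaces."""
    return re.sub(rf"\s*{EM_DASH}\s*", f" {EN_DASH} ", text)


def pad_em_dash(text, space=HAIR_SPACE):
    """Keep em-dashes, but separate them from neighbouring words with a hair (or thin) space."""
    return re.sub(rf"\s*{EM_DASH}\s*", f"{space}{EM_DASH}{space}", text)


for ch in (EN_DASH, EM_DASH, THIN_SPACE, HAIR_SPACE):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Both functions swallow whatever whitespace already surrounds the dash, so running them twice leaves the text unchanged.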

There is one final very important punctuation mark… This is one that is very useful and can help resolve the awkwardness of trying to model real speech, and can often be used in place of a dash or a colon, and sometimes can even replace an exclamation mark. This is the ellipsis mark '…' – which is actually a single character in Unicode (U+2026). Many people erroneously use (em-)dash when an ellipsis is required.

Elision refers to the dropping out of some words or parts of a word. Note that apostrophe is used for just omitting part of a word, so that “She would have …” becomes “She'd've” (not “She'd of” – although that sounds similar it is not grammatically correct). With ellipsis the '…' has spaces around it except where part of a word is missing, in which case it touches the residual word. Where it represents a pause, and particularly where the last sound is held during thought, or for precise timing, there is again no space before the '…' ('exit expected in… 24 minutes').
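Since the ellipsis is a single Unicode character, typed-out runs of full stops can be tidied up mechanically. Here is a minimal Python sketch, assuming that any run of three or more full stops was meant as an ellipsis; the function name `fix_ellipsis` is just for illustration. It deliberately leaves the spacing alone, because whether the mark touches the preceding word depends on the meaning, as explained above.

```python
import re
import unicodedata

ELLIPSIS = "\u2026"


def fix_ellipsis(text):
    """Replace each run of three or more full stops with the single ellipsis character."""
    return re.sub(r"\.{3,}", ELLIPSIS, text)


print(fix_ellipsis("exit expected in... 24 minutes"))
print(len(ELLIPSIS), unicodedata.name(ELLIPSIS))
```

One caveat: an abbreviation followed directly by typed dots ('etc....') loses its own full stop, so check such cases by eye.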

Ellipsis is often used in quotes to indicate something has been omitted that was in the original text, but it is also used sentence-final for speech/thought that peters out prior to the end of the sentence. There are several possibilities here: One is that they got themselves into an awkward place and want to restart their sentence; another is that the implications are blindingly obvious; another is that the implications are obviously important but complex and uncertain.  In such cases, the person either can't finish the sentence or needn't finish the sentence.

Dash should not be used for any elliptic purpose, although this is a common error perpetrated and perpetuated by certain publishers and editors. It can be used for an interruption in direct speech, whether from an external source or the speaker's own thoughts. The normal rules of parenthetical dash apply in direct speech, except that quotes need to be closed and reopened if the parenthetical remark was not in the actual reported speech.

Note the difference between em-dash marking a break beyond the speaker's control and ellipsis allowing their thought to peter out as they reconsider or change their mind or leave out something that is already understood… [by both reader and writer]. Note that square brackets can be used for ellipsis where words are omitted from a formal citation, but replaced with other words to retain the grammar and meaning (e.g. a pronoun may be replaced by a name).

The question of exclamation mark


This leads us to a final comment on question mark and exclamation mark (or 'point' in US usage). These should be used immediately at the end of the actual question or exclamation, and followed by a new sentence (or follow-up question).

Exclamations include things like “Wow!” or “Hi!” as well as commands like “Shut up!” or “Follow me!” and it is incorrect not to use the exclamation mark after such exclamations and imperatives – the mark actually indicates a sharp raised tone of voice. Vocatives (naming someone by name or kind) as a call for attention also require the exclamation mark, e.g. “Fred! Where have you got to?” or “Boy! Go get your mother!”

Exclamations that start with a question word tend to deserve the mark too – particularly when they lack a verb and thus fail to qualify as a sentence (“What a beauty!”; “How about that!”). But best is to stick to the question mark for a rhetorical question: “Why didn't I think of that?”; “Why not?”

The more optional uses of exclamation are to mark surprise or unexpectedness, or something that would be an unfavourable outcome. Where you mainly want a pause, as room for thought, the ‘…’ may be appropriate. Where you want to race on, then a simple ‘.’ may be best – or, if the idea is logically connected, a colon.

One of the reasons people have grown to dislike exclamation marks (particularly in e-mails and other e-comments) is that they can often be interpreted as deprecating or indicating surprise at someone's ignorant views (‘sneer mark’); they can also be seen as self-adulation (‘laughing-at-your-own-joke mark’).

Where you have a choice between ellipsis, terminal punctuation, dash or colon, think about the length of pause you want: the '…' ellipsis most explicitly indicates a pause or longer break, while a colon may be more or less pronounced than a full stop. Interestingly, a dash (like parentheses and quotes) may actually bracket a rapid aside – one that is spoken faster and with a higher tone – with the normal rate and tone resuming to mark the end of the bracketed information (actually indicating content is less germane but still of interest). In the case of quotes, when quoting orally, left and right finger-wiggles may be used while speaking the quoted word or phrase (along with the higher-pitch intonation).

Sometimes people like to give rules of thumb about how often particular punctuation marks should appear, e.g. warning against a high frequency of exclamation marks (used instead of full stops), or a low frequency of full stops (because colon, semicolon, dash, etc. are used to make longer sentences). Indeed, measures of readability tend to penalize you for both longer words and longer sentences – both are easy to fix!

But don't be worried if that leads you to sentences that start with words like ‘and’, ‘but’ or ‘so’. ‘However’ is, however, good to start a paragraph, being more major than ‘but’, but can also be used for a less confrontational contrast – particularly in a parenthetical second position in the sentence.

For exclamation marks, if you have more than one a page (for a book with lots of dialogue and/or introspective thought or character point-of-view), then that's probably too much. Another warning sign might be having more exclamation marks than question marks.  But saying you shouldn’t use any, or at most one or two in a book, is overreacting – people are so used to reacting against the poor usages that they throw out the appropriate ones too.  But still, if it is not an exclamation to attract attention, or an imperative used to urge a course of action, or an ejaculation that is forced out of you by the circumstances… consider discarding it.

But all of these measures depend on both your genre and your audience and the style of writing you are using.
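If you want to check your own text against the warning signs above, a rough count is easy to script. The Python sketch below is an illustration only: `punctuation_profile` is a made-up name, sentences are split naively at any run of '.', '!', '?' or '…' (so abbreviations such as 'e.g.' are miscounted as sentence ends), and no per-page figure is attempted because a plain string has no pages.

```python
import re


def punctuation_profile(text):
    """Count sentence-final marks and estimate average sentence length in words."""
    exclamations = text.count("!")
    questions = text.count("?")
    # Treat any run of ., ! or ? (or an ellipsis) as one sentence boundary.
    sentences = [s for s in re.split(r"[.!?\u2026]+", text) if s.strip()]
    words = sum(len(s.split()) for s in sentences)
    return {
        "exclamations": exclamations,
        "questions": questions,
        "sentences": len(sentences),
        "words_per_sentence": words / len(sentences) if sentences else 0.0,
        # One of the warning signs mentioned above.
        "more_exclamations_than_questions": exclamations > questions,
    }


print(punctuation_profile("Wow! Where are you? I am here."))
```

As with the rules of thumb themselves, the numbers only mean something relative to your genre and audience.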

Starting sentences with a conjunction


One reason people might be inclined to have long sentences, marked with punctuation like dash, colon or semicolon, is because they've been told it is wrong to start a sentence with 'And' or 'But'. This is hogwash – and indeed in some languages there are special versions of 'and' used for starting sentences (or in second position in the sentence). It is particularly likely that you'll need to start a sentence with 'And' or 'But' when reporting natural speech. But if a sentence naturally ends, don't force other punctuation in there! And feel free to retain the 'And' or 'But' if that is what feels and sounds natural, particularly if you are in the point of view of some character.

Ending sentences with a preposition


Another common furphy concerns words that you shouldn't end a sentence with. The words concerned are called prepositions when they introduce a noun phrase ('with a preposition' -> 'you shouldn't end a sentence with a preposition'). But the same set of words do double duty as particles associated with verbs, either introducing an infinitive or participle ('to travel' or 'for traveling') or as part of a separable verb ('put up' -> 'he put the picture up'; 'put up with' -> 'a habit I will not put up with'). Winston Churchill is said to have (jokingly) miscorrected something like this when talking about things 'up with which I will not put'.

And as far as prepositional phrases go, you actually have to 'front' a noun in many circumstances (e.g. when it is a subject or an object). For example: 'The boy I gave a book to, the girl I gave some money. I couldn't see the boy I gave the book to. The boy I gave the book to wasn't there.'

Subjects and Objects

One final point that relates to common grammatical errors concerns the use of nominative and accusative case, viz. the difference between 'I' and 'me', or 'he' and 'him'. Unfortunately, teachers have often been unhelpful in saying "Don't say 'me and John', say 'John and I'", in that they haven't been clear that this only applies for subjects. And many modern editors and authors thus don't understand this. Furthermore, there is another aspect to this. It is politeness not grammar that insists you put yourself last, because it is impolite to put yourself first.

The simplest way of deciding what is correct is to leave out the complicating elements (e.g. the conjunction or the subtending clause). For example, in considering 'for [my brother and] me' we see that 'me' is correct, not 'I' – all prepositions take the accusative, that is the 'object' form (me/us, him/her/them). Similarly for objects of verbs: 'he saw [my brother and] me [running|run down the street]'.

This use of 'run' (as opposed to 'running') is an infinitive verb (versus a present participle in active form). A more common use of the infinitive involves the particle 'to'. E.g. 'he wanted [my brother and] me [to run to the shops]'. The accusative/object form is always used in front of an infinitive verb (which explains what is observed/desired of the object) – noting that the infinitival phrase can be omitted as illustrated in both the 'run' examples above.

One area that is complicated is the use of 'than' and 'as', which in modern usage can act as prepositions and take the accusative.  E.g. 'My sister is bigger than me but I am stronger than her.' However, traditionally they are used with clauses: 'My sister is bigger than I [am], but I am stronger than she [is].' And then it is possible to drop the predicate when it is obvious or redundant. So either case could be 'correct' depending on the viewpoint.

This is what's happening in the well-known joke:

– 'Can you jump higher than a house?'
– 'No, of course not!'
– 'I can: houses can't jump.'

Which we could change to

'Can you jump higher than me?' which is generally used to question whether 'You can jump higher than I [can]?'

The latter now sounds stilted, while the former technically retains the ambiguity reflected in the joke.

Prescriptive grammarians trained in the classical tradition will impose Greek or Latin grammar on English and thus insist you should say 'It is I!' rather than 'It's me!' 

But in modern English, 'is/am/are/be' tends to be treated like any other verb and follows the Subject Verb Object (SVO) order-based grammar of English. So 'It was me not him' is quite appropriate when the speaker is not a classical prescriptive grammarian, while 'It was I not he' should be reserved for the classical pedant.

In classical languages, where adjectives and nouns are inflected for case (not just pronouns as in modern English), rather than the case (subject/object) primarily being determined by word order, the copula 'to be' is treated as being commutative (reversible) like 'equals', with the words on both sides having to be in the same case (nominative in the normal finite case: 'I am good/I am he'; accusative in the infinitive case: 'He told me to be good/He told me just to be me [myself]'). In a reflexive context (both bits referring to the same person), you can fix the problem using '-self'.

We can do this with prepositions too, in a broader set of cases involving repeated reference to a person, e.g. 'on behalf of my wife and myself' sounds better than 'on behalf of my wife and me' because it retains the pattern and balances the two halves of the conjunctive ('and') phrase in terms of structure and syllables.


