A new way to display the spoken word in manuscripts, screenplays and speeches

Published on 11/08/2010 in folio

In 1996, while in my third year at Central St Martins college of design, I devised a basic system for displaying the spoken word typographically. Since then I’ve expanded it into a full notation system. I’d really value your feedback and ideas on ways to improve it, as well as any input from speechwriters, screen and stage writers, and radio producers on whether it would be of use to you.

As we listen to someone speak, subtle variations in intonation, volume, speed and rhythm contribute more to our understanding than words alone. Conventional typography communicates pure, refined content, stripped of most of the emotion. Unless we highlight a word with italics, any information about someone’s tone of voice must be added to the text as an annotation. Such a marked difference between speech and text means that we have one voice for speaking and another for writing. Dictography tries to bridge this divide.

In dictographic notation, conventions of typographic and musical notation are combined and augmented. The four basic properties of speech (pitch, volume, tone and speed) are divided into separate channels. These channels are then encoded according to a set of rules that use relative position, visual weight (boldness and condensed or expanded width), background and text colour, and word spacing.
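As a rough illustration of the channel idea, here is a minimal Python sketch. Everything specific in it is an assumption made for the example: the names `SpokenWord`, `encode` and `TONE_COLOURS`, the 0.0 to 1.0 ranges, the five-line stave, and above all the pairing of each channel with a visual property. The text above lists the channels and the visual properties but does not say which maps to which.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpokenWord:
    """One word with its four speech channels (ranges are assumptions)."""
    text: str
    pitch: float   # 0.0 = lowest, 1.0 = highest for this speaker
    volume: float  # 0.0 = whisper, 1.0 = shout
    speed: float   # 0.0 = very slow, 1.0 = very fast
    tone: str      # a label such as "neutral" or "warm"


# ASSUMPTION: a hypothetical palette; the real colour rules are not given.
TONE_COLOURS = {"neutral": "#000000", "warm": "#b03a2e", "cold": "#1f618d"}


def clamp(x: float) -> float:
    """Keep a channel value inside the 0.0 to 1.0 range."""
    return max(0.0, min(1.0, x))


def encode(word: SpokenWord, stave_lines: int = 5) -> dict:
    """Map each channel to one visual property (every pairing is assumed)."""
    return {
        "text": word.text,
        # pitch -> relative vertical position on the stave (0 = bottom line)
        "stave_position": round(clamp(word.pitch) * (stave_lines - 1)),
        # volume -> boldness, on the familiar 100 to 900 font-weight scale
        "font_weight": 100 + 100 * round(clamp(word.volume) * 8),
        # speed -> word spacing: faster speech gets tighter spacing
        "word_spacing_em": round(1.0 - 0.75 * clamp(word.speed), 2),
        # tone -> text colour, falling back to the neutral colour
        "colour": TONE_COLOURS.get(word.tone, TONE_COLOURS["neutral"]),
    }
```

Keeping the channels as separate fields means each one can be re-mapped to a different visual property without touching the others, which seems to be the point of dividing them in the first place.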

That basic framework is further augmented with symbols relating to individual vocal characteristics: a key signature gives information on overall tone, the speakers’ sex, nationality and accent, and the standard pitch of the voices (think bass, baritone, treble and soprano). The ends of phrases or sentences are marked with a large blue dot. Phrases needing exclamation or question marks add Spanish-style inverted marks before the phrase, giving readers advance warning. Any non-specific vocal sound, such as a tut or a click of the tongue, is indicated with an orange star. Likewise, a trembling voice is represented by a trill mark. Finally, en dashes are replaced with a discreet arrowhead because of the risk of conflicts with the stave lines.
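The punctuation rules above could be sketched in Python roughly as follows. This is only an approximation under stated assumptions: plain text cannot show colour, so `●` and `★` stand in for the blue dot and the orange star; the `›` glyph for the arrowhead and the `[sound]` placeholder are hypothetical; and the sketch assumes the blue dot takes the place of a closing full stop. The trill mark and key signature are left out.

```python
END_DOT = "●"      # stands in for the large blue dot (colour not shown)
ARROWHEAD = "›"    # hypothetical glyph for the discreet arrowhead
VOCAL_SOUND = "★"  # stands in for the orange star

# Spanish-style inverted marks, keyed by the closing mark they announce
INVERTED = {"?": "¿", "!": "¡"}


def mark_phrase(phrase: str) -> str:
    """Apply the symbol rules to one phrase of plain text."""
    phrase = phrase.strip()
    # En dashes could clash with stave lines, so swap in the arrowhead
    phrase = phrase.replace("–", ARROWHEAD)
    # ASSUMPTION: "[sound]" is a made-up placeholder for a tut or click
    phrase = phrase.replace("[sound]", VOCAL_SOUND)
    if phrase and phrase[-1] in INVERTED:
        # Warn the reader up front; the closing mark stays where it is
        phrase = INVERTED[phrase[-1]] + phrase
    elif phrase.endswith("."):
        # ASSUMPTION: the dot replaces the full stop rather than joining it
        phrase = phrase[:-1] + " " + END_DOT
    return phrase
```

For example, `mark_phrase("Are you coming?")` returns `¿Are you coming?`, and `mark_phrase("Well – maybe.")` returns `Well › maybe ●`.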

The inside front cover appears, at first glance, to be entirely abstract, but is in fact a rendering of the Archers episode used as a sample:

Analysing a 5-minute scene from BBC Radio 4’s The Archers took a huge amount of work: codifying each speaker’s pitch, speed, volume and emotional cues. It should be possible to automate this with a software tool, or perhaps one day even to convert recorded speech directly into dictographic notation. That would give a different look to Hansard!

This close-up view shows the level of detail I had to go into in order to produce the transcript:

Obviously, this system still has serious limitations. It cannot truly portray the vast subtlety of vocal dynamics, harmonics and the barrage of other variables inherent in anything as complex as human speech.

But by adding several more layers of data to the text stream, it can offer a richer reproduction than conventional text and annotations can, and I hope that might be of value to professionals for whom conveying meaning through speech matters.
