I was recently asked whether speech recognition is a suitable technology to integrate into emergency notification procedures. In extreme situations, controlling devices by speech can be faster than operating them by hand.
The definition of speech recognition, or automatic speech recognition (ASR), is very simple: it is the technology that converts voice to text. Today it is used mainly in the healthcare sector, more precisely for document creation. It is of course also used in the telecom world, in environments where only hands-free operation is possible (cars), and in IVR (interactive voice response) systems, where you call a company and retrieve specific information with your voice (voice banking, bus schedule requests, etc.).
Now, can speech recognition be a useful tool in the emergency communication world? The answer is clearly “yes”. Is it today a reliable technology for the high requirements of first response teams? The answer is “not sure”. Let’s analyse:
In my business, I have identified four application cases:
1) The emergency manager wants to trigger an alert procedure by voice.
2) The emergency manager records a message. The message is converted to text and dispatched to contacts or the population by SMS, email, RSS, IM or other channels.
3) A contact or destination group receives a voice alert (by text-to-speech, for example) and answers directly by voice. The answer is converted to text centrally and displayed to the emergency manager.
4) The emergency manager wants to manage the whole notification tool by voice, with a simple phone call.
Today our alert management solution AlarmTILT can manage points 1) and 3). Points 2) and 4) were tested, but we did not find any customer interested in implementing them. They can also have dangerous side effects in today's alert procedures.
For point 1), our customer (the emergency manager) calls a dedicated phone number. He can then "talk" and trigger an alert procedure that has been predefined in the system. He confirms by speaking a security code, and gets a confirmation that his order is running.
For the voice medium, AlarmTILT by default sends two-way text-to-speech voice notifications to mobile phones and landlines. The recipient acknowledges via DTMF (press "1" to confirm, press "2" or "3" for any other answer you predefined in your procedure), but can also acknowledge by speech recognition (say "yes" to confirm, etc.).
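To make the two acknowledgement paths concrete, here is a minimal sketch in Python of how a keypad digit and a recognised word could be mapped onto the same predefined answer. The answer names, the tables and the function normalise_reply are hypothetical and chosen for illustration only; they are not AlarmTILT's actual interface.

```python
from typing import Optional

# Hypothetical answer tables: the labels are illustrative, not AlarmTILT's.
DTMF_ANSWERS = {"1": "confirm", "2": "answer_2", "3": "answer_3"}
SPOKEN_ANSWERS = {"yes": "confirm"}


def normalise_reply(channel: str, raw: str) -> Optional[str]:
    """Map a raw reply from either channel to a predefined answer.

    Returns None when the reply is not one of the predefined answers,
    so the caller can decide to repeat the prompt.
    """
    token = raw.strip().lower()
    if channel == "dtmf":
        return DTMF_ANSWERS.get(token)
    if channel == "speech":
        return SPOKEN_ANSWERS.get(token)
    return None
```

Keeping both channels behind one function means the rest of the procedure only ever sees the predefined answers, whichever way the recipient replied.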
Voice recognition has a disadvantage when performance and rapid mass communication matter: it demands more "bandwidth" when sending mass alerts. If used, it will slow down the process of sending notifications to thousands of recipients.
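A rough back-of-the-envelope model shows why this matters at scale. It assumes, purely for illustration, that the extra load shows up as each call occupying a line for longer; the function campaign_seconds and every number used with it are hypothetical, not measurements from our platform.

```python
import math


def campaign_seconds(recipients: int, parallel_lines: int,
                     seconds_per_call: float) -> float:
    """Estimate how long a callout takes when calls go out in waves.

    Simplifying assumptions: every call lasts the same time and all
    lines are busy until the last wave.
    """
    if recipients < 0 or parallel_lines <= 0 or seconds_per_call < 0:
        raise ValueError("invalid campaign parameters")
    waves = math.ceil(recipients / parallel_lines)
    return waves * seconds_per_call


# Illustrative figures only: 5000 recipients on 100 parallel lines.
keypad = campaign_seconds(5000, 100, 30)   # assumed 30 s per keypad call
speech = campaign_seconds(5000, 100, 45)   # assumed 45 s per speech call
print(keypad, speech)
```

Under these assumed figures, the same callout takes half as long again with speech recognition as with keypad answers, which is the kind of slowdown that counts when thousands of people are waiting for an alert.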
Another problem arises when speech recognition is used in noisy environments. We had weird situations where the voice of a person standing next to the recipient of an alert triggered something "unexpected" during callouts. It also happened once that someone sneezing provoked a "you just confirmed with yes" in an emergency test situation. That is funny during a drill, but what about a real emergency? Confirmation messages like "you just said 'yes', is that correct?" are useful when you are trying to book a hotel room over the phone, but totally inefficient and even dangerous when you need information from a person in the field during an emergency.
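One common way to limit such false triggers, sketched here in Python, is to accept a spoken answer only when it matches a very small vocabulary and the recogniser reports high confidence, and otherwise to fall back to the keypad rather than asking a confirmation question. The Recognition type, the threshold value and the decide function are assumptions made for this sketch; they do not describe a particular recogniser or how AlarmTILT works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recognition:
    text: str          # what the recogniser thinks was said
    confidence: float  # 0.0 to 1.0, assumed to be supplied by the recogniser


ACCEPTED_WORDS = {"yes", "no"}
MIN_CONFIDENCE = 0.85  # illustrative threshold, not a recommended value


def decide(result: Recognition) -> str:
    """Return the accepted word, or ask the caller to use the keypad."""
    word = result.text.strip().lower()
    if word in ACCEPTED_WORDS and result.confidence >= MIN_CONFIDENCE:
        return word
    # Anything else (background voices, a sneeze, a low-confidence match)
    # is never treated as an answer.
    return "fallback_to_keypad"
```

The design choice is that an uncertain result costs one keypress instead of a wrong confirmation, which is the safer trade-off in an emergency callout.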
Six months ago we actually dropped speech recognition as the default answering method in our solution (the default is now "press 1 for yes, press 2 for no, press 9 to repeat"). Of course it can still be implemented and customised to exact needs.
AlarmTILT works as Software and Connectivity as a Service, with Service Level Agreement conditions adapted to crisis situation needs; hence administration of procedures, phonebooks, etc. happens online.
For point 4), in 2007 we considered adding voice control to AlarmTILT management, giving an emergency manager the opportunity to manage the whole system by voice via a phone call. Again, this has its "tricky" side effects in noisy environments. And when we asked our customer base whether they would be keen on using this technique, none showed real interest. Most of them use Excel import to update their contact lists in AlarmTILT; for them, voice recognition is probably too modern and unknown.
Conclusion: speech recognition is a very useful feature for specific, one-off situations and cautious use. Our experience shows that today turning text into voice is still more universal than error-prone voice to text. In emergency situations, voice recognition makes sense for any message, alert or notification involving a minimum of words.
