Easy Voice Training
The “Easy Voice Training” project is based on a unique
audio signal comparison technology developed by the
Israeli company Hineni Computer Self-Made Technology
Ltd. (HNNY) for use in the fields of
accent-independent speaker recognition and/or speech
Goal: To create tools that give visual feedback
on various vocal exercises, such as improving accent,
singing, or the pronunciation of particular phonemes
like R, L, S, and Sh, especially in cases where the
person's auditory feedback is impaired.
For example: persons lacking a musical ear (or musical
memory), persons with impaired hearing, persons who have
difficulty pronouncing foreign phonemes (e.g. the
differences between S/Sh or R/L are difficult for many
Chinese speakers to perceive), and children or adults
with dyslalia or other speech impairments. In addition,
it can be an aid for persons studying tonal or
pitch-accent languages (Chinese, Japanese), that is,
languages which rely on changes in tone
(pitch) for distinguishing different words or their
A set of applications (for PC and mobile devices) in the
form of learning games that display the results of
real-time voice analysis, allowing users to adjust their
voice as they speak and to memorize the correct position
and tension of the vocal cords, tongue, lips, etc. In
addition, a client-server network provides users with
individually adjusted sets of exercises, generated
automatically by the server from the user’s
achievements sent from the client device.
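The server-side step described above (turning reported achievements into an individual exercise set) could look roughly like the sketch below. It is only an illustration: the `Exercise` record, the `pick_exercises` function, the score range of 0 to 1, and the difficulty thresholds are all assumptions, not part of the HNNY system.

```python
from dataclasses import dataclass


@dataclass
class Exercise:
    name: str
    target: str      # phoneme or skill the exercise trains (hypothetical label)
    difficulty: int  # 1 = easy, 2 = medium, 3 = hard


def pick_exercises(scores, catalog, limit=3):
    """Choose up to `limit` exercises, weakest targets first.

    scores: dict mapping target -> latest score in [0, 1] sent by the client.
    """
    chosen = []
    for target in sorted(scores, key=scores.get):  # lowest score first
        score = scores[target]
        # Assumed thresholds: weaker results get easier exercises.
        level = 1 if score < 0.4 else 2 if score < 0.7 else 3
        match = next(
            (ex for ex in catalog
             if ex.target == target and ex.difficulty == level),
            None,
        )
        if match is not None:
            chosen.append(match)
        if len(chosen) == limit:
            break
    return chosen


catalog = [
    Exercise("SH in isolation", "SH", 1),
    Exercise("SH in words", "SH", 2),
    Exercise("L in words", "L", 2),
    Exercise("R in phrases", "R", 3),
]
scores = {"R": 0.8, "SH": 0.3, "L": 0.5}  # example data sent by a client
plan = pick_exercises(scores, catalog, limit=2)
print([ex.name for ex in plan])  # ['SH in isolation', 'L in words']
```

Ranking by the lowest score first is one simple policy; a real server might also weigh how recently each target was practiced.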
Method of operation: The incoming audio
signal (a phoneme, word, or phrase) is compared in
detail with a defined standard.
The results of this comparison are:
Finding matching parts of the sounds.
Evaluating the rate of similarity between
Evaluating the ratio of matching parts' duration
to whole stream duration.
Calculating the difference vector.
Matches with, and deviations from, the standard are
represented as a vector composed of the magnitude of the
deviation and its direction. Example: for a Chinese
speaker working on pronunciation of the SH sound, the
deviation may point towards the sounds S, Z or J.
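The comparison results described above can be illustrated with a toy sketch. This is not HNNY's algorithm, which the text does not disclose: it assumes the signal and the standard are already time-aligned lists of per-frame feature vectors, uses cosine similarity as a stand-in measure, and invents two-dimensional "prototype" positions for S and J purely for the example. The names `compare`, `describe_deviation`, and `threshold` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def compare(frames, reference, threshold=0.9):
    """Compare aligned frames with the standard.

    Returns (matching frame indices, mean similarity,
             matching-duration ratio, mean difference vector).
    """
    sims = [cosine(f, r) for f, r in zip(frames, reference)]
    matching = [i for i, s in enumerate(sims) if s >= threshold]
    similarity = sum(sims) / len(sims)
    match_ratio = len(matching) / len(sims)
    dim = len(reference[0])
    diff = [
        sum(f[d] - r[d] for f, r in zip(frames, reference)) / len(sims)
        for d in range(dim)
    ]
    return matching, similarity, match_ratio, diff


def describe_deviation(diff, reference_centroid, prototypes):
    """Return (magnitude, name of the prototype the deviation points toward)."""
    magnitude = math.sqrt(sum(x * x for x in diff))
    toward = max(
        prototypes,
        key=lambda name: cosine(
            diff,
            [p - c for p, c in zip(prototypes[name], reference_centroid)],
        ),
    )
    return magnitude, toward


# Toy data: the standard is a steady SH; the speaker drifts in the second half.
reference = [[0.0, 1.0]] * 4
spoken = [[0.0, 1.0], [0.0, 1.0], [0.6, 0.8], [0.6, 0.8]]
prototypes = {"S": [1.0, 0.0], "J": [-1.0, 0.0]}  # invented positions

matching, similarity, ratio, diff = compare(spoken, reference)
magnitude, toward = describe_deviation(diff, [0.0, 1.0], prototypes)
print(matching, ratio, toward)  # [0, 1] 0.5 S
```

In this toy case the first two frames match the standard, so half of the stream matches, and the averaged difference points towards the invented S prototype; a learning game could draw that as an arrow from the target sound.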
Differences from existing systems:
Virtually all comparable tools are based on the
“listen and repeat” method and are useless for people
with hearing impairments or difficulties.
No known tools offer visual feedback for learning tonal
languages. Singing tutor programs (programs focused on
producing the correct pitch and duration of notes)
provide little to no visual feedback.
The few (fewer than five) existing tools for improving
pronunciation (of English only) are based on speech
recognition techniques and are accent-dependent.
These tools are of no use to persons with hearing
impairment, as they do not evaluate the deviation from
the standard. They are also ineffective for Chinese or
Japanese speakers who do not perceive the differences
between S and SH or R and L.
Learning tonal languages —
Treatment of speech disorders.
Speech practice for persons with partial
or total hearing loss (the US National Association of
the Deaf counts at least 1 million members).
Improving musical hearing and memory.
Audience: Language courses of any kind,
regardless of the particular language, since the system
corrects pronunciation against an external standard and
can be easily adapted to any course and method.
Speech therapists. Amateur singers.
Organizations that monitor audio information exchange.
Dedicated networks for pronunciation training with
unlimited global access. (A poll conducted in 2003-2005
showed that over 1.5 million Chinese were willing to pay
5-10 USD per month for access to an unlimited English
pronunciation training resource.
The estimated cost of launching such a network is at
most 1 million USD, and monthly maintenance costs are
around 200 thousand USD, while the estimated monthly
income exceeds 7.5 million USD.)
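The income estimate above is consistent with taking the poll figures at face value. The short calculation below shows that reading; it assumes every one of the 1.5 million respondents actually subscribes, which is an optimistic simplification and not something the text states.

```python
subscribers = 1_500_000        # poll figure ("over 1.5 million")
fee_low, fee_high = 5, 10      # USD per month
launch_cost = 1_000_000        # USD, upper estimate from the text
monthly_maintenance = 200_000  # USD, estimate from the text

income_low = subscribers * fee_low    # lower bound of monthly income
income_high = subscribers * fee_high  # upper bound of monthly income
net_low = income_low - monthly_maintenance

print(income_low, income_high, net_low)  # 7500000 15000000 7300000
print(net_low > launch_cost)             # True
```

Under these assumptions the lower fee alone gives the 7.5 million USD quoted, and one month of net income would exceed the launch cost.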
State of the project: The algorithms and
their implementation in code have been tested. The code
can be ported to various platforms such as tablets,
Preparations are being made to create an SDK for
mobile platforms. The business model and functional model of a
full-scale (language-independent) training network are