PSYCHOPHYSIOLOGICAL STRESS ANALYSIS RESEARCH AND VALIDATION STUDIES
In March 1971, the first production models of the Voice Analyzer were produced and released. Thus the instrument was introduced that was to become the center of controversy, unequaled by anything short of nuclear power. The Voice Analyzer was developed to detect levels of significant emotional stress from human or animal voice utterances.
It was expected to be useful in psychological and psychiatric evaluations and in lie detection. The Voice Analyzer was immediately compared with the polygraph, which has been used in lie detection for years.
There are, of course, a number of similarities between the Voice Analyzer and the polygraph. Both are limited to measuring certain physiological manifestations of psychological stress. The polygraph is capable only of displaying relative stress levels and the Voice Analyzer absolute stress levels. In order to achieve accurate lie detection within these limitations, a means is required to differentiate between the stress caused by uttering a significant untruth and the stress from any other source. This distinction is accomplished by establishing control procedures to eliminate or identify non-deception related stresses. Thus, in a controlled test procedure, lie detection is based upon the stress changes displayed by the equipment. The Voice Analyzer controlled testing procedures have borrowed heavily from polygraph procedures, which have been evolving since the late 1920's.
Significant differences also exist between the Voice Analyzer and the polygraph since the Voice Analyzer was developed from technology nearly half a century more advanced than that from which the polygraph was developed. Very briefly, these differences are as follows:
The Voice Analyzer processes voice and, therefore, does not require attached sensors. This eliminates the stress caused by the unnatural and sometimes painful physical constraints of polygraph attachments. The use of voice as the source of stress responses has also created some of the controversy that surrounds the Voice Analyzer. Since the voice can be processed regardless of its final source (telephone, television, radio, etc.) it is indeed possible to do stress analysis without the subject being present and, in fact, without his knowledge. On the other hand, the lack of test controls in some of these situations limits the usefulness of these approaches in lie detection.
Responses evaluated by the Voice Analyzer are essentially instantaneous; those evaluated by the polygraph are derived from the end of a chain of body chemistry actions and reactions. A delay occurs in polygraph response to the stimulus, and an even greater delay takes place before chemistry returns to normal following a response. To offset this problem, questions must be spaced unnaturally. In addition, the polygraph will not tolerate multi-syllabic responses.
Restrictive laws have formally or informally accommodated the Voice Analyzer. In Florida, action in 1973 and 1974 resulted in state sponsored hearings. In spite of a bitter fight by the polygraph sector, the official report found that: "the Voice Analyzer is a voice polygraph; and is a reliable and credible instrument for measuring psychological stress in the hands of an adequately trained operator." While Florida law was never amended, the hearings resulted in an administrative decision that the law would not affect the use of the Voice Analyzer. Several other states have handled the problem with similar informality.
Two states did change their licensing restrictions. North Carolina created a separate profession of audio stress examiners and promulgated separate licensing requirements, including a reduced training period for the Voice Analyzer. The Arkansas legislature, under Act 342 of 1975, authorized the use of the Voice Analyzer for lie detection by law enforcement agencies. Other progress included the formation of professional societies inspired by the Voice Analyzer. The International Society of Stress Analysts was formed in 1973 to provide a broad professional base, and several state associations were formed as well. All of these accept all instruments for lie detection and can count a substantial number of polygraph examiners among their members.
It is somewhat hazardous to accept what "statistics say" and "studies demonstrate" without first examining the validity of a study's statistical basis, its parameters, and its techniques. What a study purports to "study" may not be what it studies at all. A study may contain unsubstantiated assumptions and uncontrolled variables that make the results unlikely to bear on the study's stated objectives. Usually, but not always, that study's conclusion will include the alert that, within the limits of "this experimental model", the results attained could be expected. Obviously, if the same things were done, the same results could be expected. But were these things done only because they made for a faster, simpler, less expensive study?
A simple example of both favorable and unfavorable studies concerning the same piece of equipment will help to illustrate this point. If we were to test fire a new model rifle, and a reasonable number of testers were able to place all shots in the bull's eye, we would probably conclude that the weapon was sufficiently accurate to make bull's eyes. If, however, other testers were unable to hit the target with the same weapon, we would suspect that something invalidated these unsuccessful shootings as a test of the weapon; perhaps the testers did not know how to fire a rifle with accuracy, or there was a bad lot of ammunition, or they were firing at dusk and could not properly see the target. If these variables were not controlled, these shootings would not be a valid test of the rifle's accuracy. We should note, then, that while the test was for the purpose of validating the accuracy of the weapon, the test itself was affected by these outside variables. Only if they are properly identified and controlled will the test be valid as a test of the rifle's accuracy.
When we speak of either the Voice Analyzer or the polygraph, we are speaking also of a system of several components, rather than just the instrument involved. In either case, the system must include stress in the examinee, properly working equipment that will convert the stress to a chart indication, a chart reading system that allows the stress to be recognized for what it is, and an examiner capable of applying the chart reading system. The Voice Analyzer must have audio tape recordings of reasonable quality, the polygraph must have ink in its pen reservoirs, and both instruments must have chart paper. When either instrument is used for lie detection, a valid controlled test procedure is required to identify those stresses caused by deception. If any one of these system components is eliminated or modified, the study as an evaluation of the Voice Analyzer or polygraph system is not valid (although it may be valid as an evaluation of the modification).
Frequently, studies performed by members of the academic community are based on an attempt to create an artificial model of the real world, rather than on the real world itself. For example, for a validation study of the polygraph or Voice Analysis, a number of students pretending they have committed a crime are tested in an attempt to detect deception concerning the pretended crime. This is sometimes an acceptable scientific approach, if a valid model can be accomplished. It has the obvious advantages of being quicker, cheaper, and more standardized as a source of study data than actual, real world lie detection examination. On the other hand, there is little assurance that the stresses produced by the pretenses are reasonably comparable to those produced by an actual jeopardy situation.
The Voice Analyzer was designed specifically for the levels of stress encountered in real world criminal, security, and clinical applications. It was never intended to and is not expected to accomplish stress measurement of game situations. In fact, if such low level stresses were allowed to provide significant responses, the Voice Analyzer's performance would be limited in the real world, as is generally considered to be the case with the galvanic skin response.
Many early validation studies of the Voice Analyzer were performed by polygraph examiners who wanted to assure themselves of its validity before switching from the polygraph to the Voice Analyzer. Since these studies were for their own purposes, few of them were documented formally. However, in a field survey of 39,000 examinations done for the Moorehead Committee Hearings in 1974, 5,045 cases were reported to involve simultaneous testing with both the Voice Analyzer and the polygraph. Of these, 5,037 produced correlative results. In other words, right or wrong, the Voice Analyzer and the polygraph agreed 99.8% of the time in REAL WORLD testing. As interesting as these figures are, they cannot be accepted definitively because of the lack of detailed documentation. They do, however, indicate the acceptance of the Voice Analyzer by professional polygraph examiners based on a large number of actual cases.
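The agreement figure quoted for the field survey can be checked with simple arithmetic. The following is a minimal sketch for illustration only; the function name agreement_rate is an assumption introduced here and does not come from the survey itself. Note that it measures only how often the two instruments agreed, not how often either was correct.

```python
def agreement_rate(agreed: int, total: int) -> float:
    """Return the percentage of paired examinations in which both instruments agreed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * agreed / total


# Figures reported in the field survey: 5,037 correlative results
# out of 5,045 simultaneous Voice Analyzer / polygraph examinations.
survey_rate = agreement_rate(5037, 5045)
print(f"{survey_rate:.1f}%")  # prints 99.8%
```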
Fortunately, there is no shortage of formal studies of the Voice Analyzer. Exclusive of articles, testimony, and sundry other reports, there are some 60 studies of various applications. Of these, 50 are clearly favorable to the Voice Analyzer and several are partially favorable. While it is obviously impossible to report on or analyze all of these studies in a paper of this length, several are worthy of consideration.
First, the two most quoted adverse studies, which most polygraph examiners are very quick to point out, are those performed by Kubis and Horvath. The former, in particular, is a study in flaws. Kubis, acting for Fordham University, proposed a contract with the U.S. Army to evaluate a developmental device known as the Voice Stress Analyzer. When the contract was let, the Army supplied what was then the only commercially available Voice Analyzer. According to the Kubis study report, the study group's knowledge of the Voice Analyzer was derived from an advertisement in a police supplier's catalog. None of the Fordham students who actually conducted the study had any training in any aspect of the Voice Analyzer. For this reason, Kubis arranged to have the Voice Analyzer charts read, through correspondence, by Gordon Barland, then a graduate student at the University of Utah. Barland was subsequently successful with the Voice Analyzer.
Detailed critiques of the Kubis report were prepared by the manufacturer of the Voice Analyzer, Sproston and Kajada Corporation, and will not be repeated here. However, a few points are in order, as follows:
The Kubis test used a game type model rather than a real world situation. Thus, at the outset, the study did not evaluate the Voice Analyzer for its design function.
Only 19% of what the contract required for the Voice Analyzer evaluation was actually accomplished, and this was forced in spite of bad tape recordings.
Barland, in his cover letter submission of his findings to Kubis, criticized the structure and conduct of the test. For example, Barland wrote, "...because of the BP (blood pressure) cuff discomfort, the test was sometimes interrupted in the middle of a question sequence. I was usually aware of this only when the E (polygraph examiner) made a comment prior to turning off the recorder. This would account for large response to irrelevant questions (on the first question when the testing was resumed) which could overshadow a genuine response on the following relevant question(s)..."
Further, "...the examiner often lost his or her place in the question sequence on the R-1 test; he or she did not attempt to speak in a monotone when asking the relevant questions; he or she almost never waited for aircraft noises to dissipate before asking the next question. Although it was also bad to interrupt the test when an airplane flew overhead, from the standpoint of Voice Analysis the failure to interrupt often meant that several questions could not be evaluated; if they were relevant questions, then the entire test had to be discarded." (The test site was apparently located at the end of an airport runway.)
The American Polygraph Association has heralded the Kubis study as proof that the Voice Analyzer does not work. However, it is interesting to note that in the portion of the study where the two polygraph examiners were required to decide from their polygraph charts whether the subject was deceptive or not, one examiner was correct 51% of the time and the other 61%, essentially the flip of a coin.
It is not completely clear why Horvath is cited in opposition to Voice Analysis. Horvath's study is also a game model and is a comparison between the Voice Analyzer and the galvanic skin response, the device that led the APA to promote laws requiring cardiograph and pneumograph traces. The fact that Horvath has been recognized by the APA (of which he is a member), fortuitously led to the one completely impartial face off between the favorable and ostensibly unfavorable studies concerning Voice Analysis.
On September 18, 1978, Frank E. Milano was convicted in Superior Court for Mecklenburg County, NC, of first degree rape and was sentenced to life imprisonment. The conviction was affirmed, with dissent, by the Supreme Court of North Carolina on July 12, 1979. Milano filed a habeas corpus petition in the U.S. District Court for the Western District of North Carolina on October 30, 1979, based upon the fact that unfavorable polygraph testimony was admitted in the original trial and testimony relating to a favorable Voice Analyzer examination was not allowed. At the hearing on December 17, 1979, Horvath appeared as an expert witness for the prosecution and Michael Kradz appeared for Milano. Most of the testimony consisted of Horvath's presentation of study data considered unfavorable to the Voice Analyzer, including the Kubis report and his own study, and the defense's presentation of favorable study data.
As a result of this testimony, the court found: "The Court is satisfied from the evidence that both polygraph and Voice Analysis provided substantially reliable methods of evaluating psychological stress, and that the Voice Analyzer is at least as reliable as the polygraph, and possibly more reliable." In addition, the Court ruled: "Based upon that evidence and upon a review of the lengthy trial record, I am of the opinion that petitioner was unconstitutionally denied a fair trial when the court admitted evidence of an unfavorable lie detector report but excluded evidence of a favorable lie detector report and that he should have a new trial. The other alleged errors in the case do not appear to be of constitutional stature and will not be further discussed."
From the standpoint of validating the Voice Analyzer as an effective instrument for lie detection, several outstanding studies exist using real world cases.
In 1972, Lt. Michael Kradz conducted a study for the Howard County, Maryland, Police Department in which 43 criminal suspects took lie detection examinations that were instrumented simultaneously with a polygraph and Voice Analyzer. The test of validity was provided by a comparison of the examiner's conclusions with the findings of corollary law enforcement investigative techniques. The latter information was available for all but 7 suspects, who were among those cleared of suspicion and for whom no evidence to the contrary was uncovered. The Voice Analyzer proved 100% accurate in the 36 examinations for which complete and concrete corroboration was, or later became, available. The polygraph, in the same examinations, produced two cases of "untestable subjects" and two cases of "inconclusive results." Comparison with the conclusions of a second independent examiner produced a reliability statistic of 100% for the Voice Analyzer and 93% for the polygraph. It was concluded that the Voice Analyzer is a valid instrument for use in lie detection applications.
In 1975 Dr. John Heisse conducted a blind study in which six Voice Analyzer examiners contributed Voice Analysis charts from 53 personally administered real life lie detection examinations. All 53 cases were supported by factual corroborations obtained after the subjects were tested. Each examiner employed a prescribed procedure of test formulation and prescribed chart criteria.
The Voice Analyzer charts (with no other information) were then distributed to the other examiners in the group, who were to act as "blind" evaluators. These evaluators used the same chart interpretation criteria as the original examiners to reach their conclusions. The results of the "blind" evaluations were then submitted to the study administrator. A comparison of the conclusions of the "blind" evaluators with those of the original examiners showed a compliance between examiners, evaluators, and known facts of 96.28%. Of particular interest is the fact that the investigative experience of the original examiners and the evaluators ranged from over 20 years to less than three months, their experience with the Voice Analyzer ranging from over four years to less than three months. Original training in the instrument, with one exception, ranged from three to five days.
In the latter part of 1975 and the beginning of 1976, a Florida polygraph service conducted 716 pre-employment examinations, 323 periodic examinations, and 9 specific issue examinations (2 homicides, 3 armed robberies, 1 rape, and 3 burglaries). An audio tape recording was made of each examination. The tape recordings were submitted for evaluation to a Voice Analyst who had no knowledge of the evaluations made of the polygraph results. The Voice Analyst matched the polygraph evaluations in 1,045 of the 1,048 examinations.
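The totals in this study can be cross-checked against the examination counts given above. The short sketch below is illustrative only; the dictionary keys and variable names are assumptions introduced here.

```python
# Examination counts as reported for the Florida polygraph service.
exam_counts = {
    "pre-employment": 716,
    "periodic": 323,
    "specific issue": 9,
}

total_exams = sum(exam_counts.values())    # should equal the 1,048 examinations cited
matched = 1045                             # evaluations on which the Voice Analyst matched the polygraph
disagreements = total_exams - matched
match_rate = 100.0 * matched / total_exams

print(total_exams, disagreements, f"{match_rate:.1f}%")  # prints 1048 3 99.7%
```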
In 1976 a polygraph examiner submitted 36 audio tape recordings of polygraph examinations to Investigator Marcia Forbus of the Sebastian County Sheriff's Office in Ft. Smith, Arkansas. She processed the tapes through the department's Voice Analyzer and matched the polygraph examiner's evaluations in every test.
In Columbus, Ohio, a study was conducted by Mr. Stanley Ostrowsky, a Voice Analyst, and Mr. L. Driscoll, a polygraph examiner. Of the 21,568 voice responses tape recorded by 5 different polygraph examiners, all but 219, which were excluded because of poor audio quality, were analyzed on a Voice Analyzer. At the end of phase IV of the study, the CPA firm retained for the study produced statistics showing that the Voice Analyzer agreed with the polygraph on 94.6% of the responses. In other words, when the polygraph instruments noted a deceptive response, the Voice Analyzer would agree 94.6% of the time. These statistics were presented to the Criminal Justice Department. The Criminal Justice Department Control Board has accepted the findings as valid and concluded that there is merit of credibility to the Voice Analyzer as a means of detection of deception.
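The size of the analyzed sample in this study follows from the figures given. The sketch below assumes that the 219 responses were the ones set aside for poor audio quality, which is one reading of the text; the variable names are introduced here for illustration.

```python
recorded_responses = 21568   # voice responses tape recorded by the 5 polygraph examiners
excluded_responses = 219     # assumed to be the responses set aside for poor audio quality

analyzed_responses = recorded_responses - excluded_responses
excluded_share = 100.0 * excluded_responses / recorded_responses

print(analyzed_responses)         # prints 21349
print(f"{excluded_share:.1f}%")   # prints 1.0%
```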
During the period April 1, 1978, to March 31, 1979, the Chanute, Kansas, Police Department acted as the executive agency for a federally funded one year field evaluation of the Voice Analyzer. 159 examinations were conducted during this period at the request of 24 separate law enforcement agencies in the nine county southeast Kansas area. As a result of the year long evaluation, the Chanute Police Department found that “this instrument (Voice Analyzer) is one of the best investigative tools available at any price. We base this judgment on the Voice Analyzer's versatility, ease of operation, simplicity, accuracy, and relatively short training period...” Leading to these conclusions was the fact that 70 investigators stated the crimes solved would not have been solved without the Voice Analyzer.
In 1979 Nachshon and Amsel conducted a Voice Analysis study using tapes made during polygraph examinations of criminal suspects by Israeli Police. Independent corroborations were not used in this case; rather, the polygraph findings were assumed to be true. Agreement between the Voice Analyzer and the polygraph occurred in 94% of the cases. The Nachshon and Amsel study was carried two steps further. In the first step, blind chart readings were employed. The polygraph examiners read their charts and the Voice Analysis examiner read his without being able to identify the subject or make use of global impressions available with the original polygraph calls. In the second step, the Voice Analyzer and polygraph charts were similarly dissected to remove the control test patterns of responses so calls could be made simply on stress determinations. In both cases the Voice Analyzer was superior to the polygraph in agreeing with the original polygraph examiner findings. As a result, Dr. Nachshon stated, “...I was convinced that the Voice Analyzer is as good as the polygraph instrument to detect lies...”
This study is particularly interesting in that Dr. Nachshon had headed a study in 1977 that failed to validate the Voice Analyzer. The question reasonably arises: If the Voice Analyzer was demonstrably successful in a 1979 study, why was it unsuccessful in 1977? An examination of the 1977 study report reveals the flaw. None of the three persons reading the charts knew how to read Voice Analysis data. The study report found that the three data readers agreed on their reading of the same data only 10% of the time. This means that 90% of the time at least one data reader had to be in error about what the data was telling him. In the 1979 study, Tuvya Amsel, the Voice Analyzer examiner, had received training in reading Voice Analysis data.
These studies again emphasize the system nature of the Voice Analyzer and the polygraph. If the data reading techniques employed are not valid, then the results of the study are likely to be invalid. Note: The Voice Analyzers used in this report were the Psychological Stress Evaluator (PSE, graph data) and the Mark II Voice Analyzer (PVSA, numerical data).
In May of 1976 Aviation, Space and Environmental Medicine published a study conducted by the Aeromedical Laboratory, Japan Air Self Defense Force, Tokyo, Japan. The article was entitled, “Method for Determining Pilot Stress through Analysis of Voice Communications.” The study was conducted by Isao Kuroda, Osamu Fujiwara, Noriko Okamura, and Narisuke Utsuki, with acknowledgment made to Dr. Robert F. Thomas, Lt. Col., USAF, for his consistent interest and collaboration in the research. By means of a sound spectrogram, the mean vibration space of a voice can be analyzed if the space between the vertical deflections of the vowel sounds is calculated in micrometers. The vibration space shift rate (VSSR) can be divided into three phases: normal, urgent, and emergency, each with three gauges of 0.5 S.D. apiece. The study concluded that the VSSR correlated well with the emotional status of the pilot and that the elevation of the pitch of the voice, which varies directly with the tension of the vocal cords, corresponded directly with the increased emotional tension of the pilot. In addition, under stress conditions, the muscle tension on the vocal cords and the wall of the resonating box increased, and the resulting pitch, fundamental frequency, and formants also increased. This study indicates that certain characteristics of the voice change during periods of emotional stress.
Psychophysiological Touch Screen Analyzer Validation Study
A study was conducted in 2004 to determine whether or not the Psychophysiological Touch Screen Analyzer (PTSA) is capable of capturing a psychophysiological response to a stimulus. The algorithms utilized in the PTSA were developed in the 1980's by Profiles for the Mark II Voice Analyzer. These algorithms were tested in a real world environment with 2000+ examinations.
In this study, a truth verification survey was administered by the PTSA to 15 different subjects. This survey was adapted for the PTSA from a psychological preconditioning questionnaire that had been developed for voice stress analysts and polygraph examiners in the 1980's.
The subject read the adapted questionnaire on the computer touch screen of the PTSA. The subject answered each of the 65 questions by touching the "yes" or "no" button at the bottom of the touch screen (the "yes" button was blue and the "no" button green). After the subject touched an answer button, the next question was automatically displayed. Only the last 26 questions of the touch screen survey were analyzed to determine emotional reaction patterns (the same procedure followed in voice stress analysis and polygraph).
Immediately after the touch screen survey was completed, a Psychophysiological Voice Stress Analysis (PVSA) examination was administered to each subject. The 26 questions asked were the same as the last 26 questions administered by the PTSA. The PVSA examination was prerecorded and played back through headphones placed over the subject's ears. The subject's "yes" and "no" responses to each question were digitally recorded by means of a boom microphone attached to the headphones. The microphone was placed approximately 1 inch to the right of the subject's mouth. After the PVSA examination was administered, each "yes" and "no" response was analyzed to determine emotional reaction patterns.
CONCLUSION: There was 100% correlation between the PTSA and PVSA concerning the relevant and control issues. This evidence proves, beyond a reasonable doubt, that when the subject touches the specialized computer touch screen in response to the stimulus displayed on the touch screen, the subject's psychophysiological response is captured.