
Guidelines for the Forensic Audio Recordist

by Wes Dooley

1. Scope

This white paper suggests concerns, guidelines, and practices for the collection of recorded audio evidence for forensic purposes.
The paper is a classification of guidelines intended to benefit any person engaged in the collection of audio evidence and to assist them in avoiding known errors whenever practicable.

This paper is directed to a general audience broader than the community of forensic audio analysts. These guidelines are designed to be consistent with Audio Engineering Society (AES) standards but this white paper is intended to stand alone.

While this paper does not specifically address video recording, many elements of this paper also apply to forensic video recording.

2. Normative references

Normative authority is restricted to applicable regional legal systems, for example, case law, statute, or special instructions determined to apply in specific cases by the finder of fact.

3. Definitions

3.1 Forensic Audio Recordist

FAR
A person performing the act of recording and storing audio forensic material for use in a legal case, investigating an ongoing activity or collecting audio documentation of interviews and other meetings. Specific examples of FARs include: law enforcement personnel; private investigators; persons serving in security capacities, governmental or private, recording potential evidence; corporate employees functioning in a human resources capacity; recordists of performances or interviews; any member of the general public seeking to document meetings of forensic significance; and corporations wanting to document business activities.

3.2 Evidentiary Transparency
An intangible property such that the finder of fact believes no material challenges to authenticity interfere with the examination of a recording's content.

3.3 Self-Documenting Procedure
The practice of embedding information in recordings - for example, recorded announcements or declarations as to the time of recording and people present.

3.4 Every Effort
A level of thoroughness and competency in collecting audio evidence such that even if absolute evidentiary transparency is lacking there is still a strong supposition that a recording is authentic.

3.5 Declaration
Embedded Declaration
Any forensically significant statement recorded on magnetic tape media with sufficient verbal clarity so that a secretarial transcription can reproduce it accurately as text.

3.6 Finder of Fact
The one who determines a fact averred by one party and denied by another, e.g. a judge, jury, or magistrate.

4. Actions to be taken by a FAR

A FAR should make every effort to collect audio evidence that can be recognized as authentic and admitted as evidence by a court or other finder of fact. The FAR should include relevant spoken information within the recording itself, such as date, time, location, and conversation participants, as well as any other pertinent information that will enhance the credibility, completeness, or comprehensiveness of the recording. Any written materials and / or metadata that describe the circumstances of the recording should also be included.

5. Recording equipment and media

Typical FAR recording systems can be categorized as:
  1. Mono or Stereo
  2. Having external or internal microphone(s)
  3. Stand-alone or able to connect to other equipment
  4. Analog
  5. Digital

Choices

1. Mono or Stereo?
Mono recordings are less intelligible than Stereo recordings, especially when there is an overlap of speaking voices, when one person is speaking more softly than another, and when there is excessive background noise or room reverberation.

2. External or Internal microphone(s)
Although internal microphones don't get lost, in analog machines they record the mechanical noises of the cassette mechanism. External microphones are often easier to place where they will pick up all voices evenly. Body recorder systems need to have the microphone elements taped to the clothing or otherwise secured so as to reduce movement noises.

3. Stand-alone or able to connect to other equipment
A recorder with a built-in microphone and speaker is convenient. However, the ability to connect to other equipment is an advantage, in that one has the option of using external accessories for maximizing recording and playback quality.

4. Analog
If authenticity or enhancement work might become necessary, analog recordings have several advantages over digital recordings. Advantages for proving authenticity include:
  1. An original recording can be reliably matched to its original recording machine.
  2. Stops, starts, pauses, over-recordings and erasures can be reliably identified.

A. Cassette
For enhancement, the analog advantage is the linear nature of the recording. Unless a voice-activated recorder (not recommended) is used, everything is recorded. No low-level acoustical material is muted, as is done by low bit-rate digital voice recorders. Linear recordings often allow a specialist to recover material that is buried beneath the noise floor of the recording.

B. Micro-Cassette
Although special stereo micro-cassette players with adjustable azimuth do a very good job of playback on difficult tapes, we recommend recording in the compact cassette format at the standard speed. When an analog recorder has several speeds, use the highest practical speed available.

5. Digital
Digital recording systems are getting better all the time, but a FAR must be aware of the compromises involved in using various formats of digital equipment. We will not examine dedicated law enforcement digital recorders other than to note:
  1. They can be very good, especially the non-lossy field units as opposed to 911 loggers.
  2. That maintaining a rigorous chain of evidence is imperative, as there is no physical master as there is with an analog recording.

A. Lossy Compression
What are the practical implications of making digital recordings that use lossy compression? Such compression schemes promise extremely long recording times. The cost of this additional recording time is that they use perceptual encoding schemes that pre-edit a recording. Like voice-operated (VOR or VOX) cassette recorders, they make automatic choices about which sounds are loud enough to be recorded. However, their encoding schemes can delete both time and frequency information to achieve efficient use of digital memory space.

A loud sound in one frequency band can cause softer sounds in a higher frequency band to go unrecorded if the lossy compression program calculates, as such programs often do, that the louder sounds in the lower band would mask your ability to hear the softer sounds in the higher band. Once such a decision is made, there is no way to get this lost material back. If understanding a quiet comment made by a woman while a man is haranguing her might become important, lossy compression can be your worst enemy.

B. Linear PCM or Lossless Compression
What are the advantages of making digital recordings using linear PCM or lossless compression modes?
  1. Knowing that a recording's intelligibility can be enhanced effectively if it becomes necessary, as you chose to record all sounds and not discard low-level sounds or sounds that would typically be masked by a louder sound.
  2. Knowing that the length of your recording is accurate. Some voice-oriented compression algorithms stop and start the recording like a Voice Operated Recorder (VOR).
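
The effect on recording length described in item 2 can be illustrated with a minimal sketch. It assumes a deliberately crude model of a voice-activated recorder: a fixed level threshold with no hold time or smoothing. The function name, the threshold, and the sample values are hypothetical and do not describe any particular product.

```python
def voice_activated_gate(samples, threshold):
    """Keep only samples whose absolute level reaches the threshold.

    Crude stand-in for a voice-activated recorder (an assumption of this
    sketch): real devices add hold times and smoothing, not modelled here.
    """
    return [s for s in samples if abs(s) >= threshold]


# Hypothetical signal: a loud voice, a quiet remark, then the loud voice again.
loud = [0.8, -0.7, 0.9, -0.8]
quiet = [0.05, -0.04, 0.06, -0.05]
signal = loud + quiet + loud

recorded = voice_activated_gate(signal, threshold=0.1)

# The quiet remark is gone and the recording is shorter than the event.
print(len(signal), len(recorded))
```

In this toy case the gated recording holds 8 of the 12 original samples, so neither the quiet remark nor the true elapsed time can be recovered from it.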

C. Having removable or fixed recording media
Some digital voice recording systems allow one to add a memory card to increase their recording time. This has the advantage of making longer, higher-resolution recordings possible. Removable cards can also hold recordings until their files can be downloaded and archived, which avoids losing the use of the voice recorder in the meantime.
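
To judge how much memory a linear PCM recording might need, the arithmetic is sample rate times bytes per sample times channels times duration. The sketch below is only an estimate: the function names are hypothetical, the 44.1 kHz, 16-bit stereo figures are the compact disc parameters chosen here as an example, and file header and file system overhead are ignored.

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Uncompressed linear PCM size in bytes (headers ignored)."""
    return sample_rate_hz * (bit_depth // 8) * channels * seconds


def pcm_minutes_on_card(card_bytes, sample_rate_hz, bit_depth, channels):
    """Approximate minutes of linear PCM that fit in card_bytes."""
    bytes_per_second = pcm_bytes(sample_rate_hz, bit_depth, channels, 1)
    return card_bytes / bytes_per_second / 60


# Assumed example settings: CD-style 44.1 kHz, 16-bit, stereo.
one_hour = pcm_bytes(44_100, 16, 2, 3600)
minutes_per_gb = pcm_minutes_on_card(1_000_000_000, 44_100, 16, 2)

print(one_hour)                # bytes needed for one hour
print(round(minutes_per_gb))   # whole minutes on a 1 GB (10**9 byte) card
```

Under these assumptions one hour of stereo takes about 635 MB, so a 1 GB card holds roughly an hour and a half of linear PCM.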

Earlier it was observed that maintaining a rigorous chain of evidence is imperative with digital recordings, as there is no physical master in the way we are used to thinking about analog recordings.
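
One commonly used way to support a chain of evidence for a digital file, offered here as an illustration and not as a practice stated in this paper, is to compute a cryptographic digest of the file as soon as it is archived and to note that digest in the written documentation. The sketch below uses Python's standard hashlib module; the function name and the stand-in data are hypothetical.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=65536):
    """Return the SHA-256 hex digest of a binary file-like object."""
    digest = hashlib.sha256()
    # Read in chunks so that long recordings need not fit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Stand-in for a recording file; in practice this would be open(path, "rb").
original = io.BytesIO(b"hypothetical recording bytes")
copy = io.BytesIO(b"hypothetical recording bytes")

# A bit-for-bit copy produces the same digest as the original.
print(sha256_of_stream(original) == sha256_of_stream(copy))
```

A matching digest shows only that two files are bit-for-bit identical; it does not by itself establish who made the recording or when.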

CD and mini-disc recorders occupy a middle ground as they produce a physical recording that can be removed and archived easily. The CD-R format has the advantage that it cannot be recorded over. As stereo formats, they also make the transcriber's job easier. The new Sony Mini-Disc (MD) HD recorders are useful as they can record in either linear PCM or lossless compression formats, they are small, and their media and battery packs support long recording times.

The many compromises involved in these examples of useful digital tools available to a FAR bring us to a very conservative suggestion. Digital recorders are highly useful; however, we suggest that a FAR should strongly consider having a standard speed stereo audio cassette recorder with external microphones as part of their standard recording package. Cassettes are a removable medium with authentication practices that are well documented and widely accepted. Recording quality can be quite high and the costs of adequate machines such as a Sony TCS-30D Pressman stereo compact cassette recorder are low. If authenticity questions might arise about a recording, using a standard compact cassette as one of your recording formats is very useful.

We strongly suggest that you consider making two recordings at the same time anytime the meeting you're recording is important, and would be hard to arrange again. For casual meetings this could be done by using multiple digital voice recorders. More formal meetings might be recorded in stereo in mini-disc and compact cassette formats while using external microphones and monitoring on headphones.

A FAR shall recognize that the location of their microphone(s) is one of the most critical choices they will make. Microphone positions that record more speech and less noise are easier to achieve with external microphone(s). We recommend using external microphones for both audio and video work, as mics built into a video camera often sound too far away from the sound source.

To summarize, micro-cassette and non-linear digital recording formats are less desirable because of their inherent resolution limits. Most digital audio recordings cannot be reliably authenticated, with the exception of certain dedicated digital recording equipment designed for law enforcement. Using a standard speed compact audio cassette for one of your recordings can help demonstrate that every effort has been made.

Recommended practices for compact cassette recordings include using leaderless virgin or bulk erased audio cassettes, using thicker tape, and when practicable, recording in stereo. Thicker tape stocks such as those used for C-60 cassettes are mechanically more robust than the very thin tape stock used in a C-120 cassette. Better quality AC powered tape recorder/players tend to handle tape more gently than less expensive recorder/players. Damage to thinner tapes occurs most often in less expensive machines, especially when they are stopped during fast-wind or when the direction of fast-wind or play is reversed.

Other recording equipment concerns and practices
Stereo compact cassette recordings made using Dolby noise reduction encoding are desirable but Dolby recording modes (type B or C) are rarely available on portable tape recorders. When noise reduction (NR) encoding has been employed, the type of NR used on the recording shall be clearly documented (see 7.5) or declared. Do not use Dolby NR for playback when it was not originally used while making the recording. Using Dolby NR decoding only on playback usually reduces intelligibility.

Setting recording levels to produce a satisfactory recording is an art informed by experience. Practice making some recordings before you tackle more important projects. Automatic Gain Control (AGC) can sometimes obscure and distort sounds, so we recommend recording in the manual mode where practical. Voice-activated recording modes in either digital or analog recorders are not recommended, as low level sounds will not always activate the record function.

As part of the statements recorded at the beginning and end of the forensic audio recording (see 7.1.3 and 7.1.4), declarations should be made as to the choice of equipment and media (as in 7.2.7).

A FAR shall handle and maintain recording equipment so as to ensure the equipment is functioning and is operated in accordance with the manufacturer's standards, realizing that the equipment used is an important link in the chain of evidence (see 7.2.7 and 7.5).

6. Evidentiary contexts and expectations

6.1 General
Different social settings have been court-determined to fall under differing legal standards pertaining to the admissibility of forensic audio recordings. Thus it is necessary that a FAR select and set up equipment, and make declarations by recorded speech (as in 7.2) so as to adhere to legal standards of admissibility in the jurisdiction where the recording is being made. Sometimes recordings made for one purpose are used in other settings not originally intended at the time of the recording. For example, a recording of an employment interview could be the subject of a subpoena in a trial. The following examples are not intended to state any particular jurisdiction's legal standards. Our intent is to illustrate the sort of differences that might make a recording admissible or inadmissible in a particular court.

6.2 Contexts with and without privacy

6.2.1 Introduction
One major dividing line can be between social contexts with and without a reasonable expectation of privacy. Here is an example of how such constraints might be structured:

6.2.2 With privacy

6.2.2.1 Social contexts with privacy
For social contexts with an expectation of privacy, a FAR shall seek regional legal authorities to define correct practice. For example, for telephone conversation recordings, some U.S. states require two-party consent and some require only one party to be aware whether a recording is being made. People recorded in environments with a reasonable expectation of privacy must give their informed, un-coerced consent to the recording.

6.2.2.2 Equipment handling and spoken declaration with privacy
Statements of personal identities (as in 7.2.3) and statements giving unmistakable verbal consent to being recorded shall be made (as in 7.2.8). Equipment is often set up in plain sight in contexts with an expectation of privacy. However, an unobtrusive equipment set-up might reduce reticence and encourage candor because people are often self-conscious in the presence of recording equipment. Microphone placement should be close to the main person or persons being recorded, or to those with the quietest voices.

6.2.3 Without privacy

6.2.3.1 Social contexts without privacy
Social contexts in which no expectation of privacy exists are as defined by courts and statutes for purposes of this document. The legal status of specific settings is not always obvious and should be subjected to adequate legal research before any recording is undertaken. A short list includes:
  • squad cars
  • interrogation rooms
  • jail or prison cells
  • public spaces
  • restaurants
  • showrooms
Adequate legal notice is used in some areas to create a presumption that continued presence waives the right of privacy. For example, theme parks or concert venues that encourage media coverage can require visitors to allow the use of their images as a condition of attendance.

6.2.3.2 Equipment handling and spoken declaration without privacy
Microphone placement should be close to the main person or persons being recorded, or to those with the quietest voices. Inconspicuous recording systems are sometimes desired in such venues. Cloth can be an ideal means of concealment. A recorder placed on a chair or on the front seat of a squad car can be hidden by a light sweater without appreciable loss of audio quality. Use of a separate microphone provides more flexibility of placement and ease of concealment. Also, such use reduces the risk of collecting noise generated by the machinery of the recording equipment.

A FAR shall always start and end each recording with a spoken declaration of time and circumstances (as in 7.2 generally, 7.2.4 and 7.2.6). In situations involving concealed recording devices, these declarations are unlikely to be made in the presence of the subject(s).

7. Self-documenting procedures and declarations

7.1 General

7.1.1 Purpose of self-documenting procedures
Self-documenting procedures should be used to ensure that the evidence speaks for itself. Well documented and transparent recordings result in straightforward authentication of the evidence and minimize the possibility that the recording will not be accepted as evidence in court.

7.1.2 Specified procedures to be followed separately and severally
In some situations, a single instance of procedural neglect can be sufficient cause for a finder of fact to exclude a recording from evidence. Recommended practices should be followed completely where possible, and as completely as possible when all elements cannot be included. A FAR shall make every effort to observe whatever self-documenting procedures reasonably promote evidentiary transparency.

Many of the self-documenting procedures detailed in this document are implied in the forensic audio authentication standard AES43-2000 (informative).

7.1.3 Recommendations of what to record at the beginning of a recording
The addition of pre-recording announcements helps produce evidentiary transparency. At the beginning of the recording, recommended information includes:
  • case identification number and name
  • time, date, and year
  • location
  • FAR name, organizations and ID if appropriate
  • participants expected on the recording
  • subject of the recording
  • make, model, and serial number of the recorder, transmitter, receiver, etc.
  • manufacturer and model of the audio recording media and batch number (if present)

7.1.4 Recommendations of what to record at the end of a recording
The addition of post-recording announcements helps produce evidentiary transparency. At the end of the recording, recommended information includes:
  • case identification number and name
  • time, date, and year
  • location
  • FAR name, organizations and ID if appropriate
  • participants on the recording
  • subject of the recording

7.1.5 Stops and restarts to be documented
The stopping and restarting of the recording or the changing of the recording medium shall be documented on the recording whenever possible. Situations where continuous and uninterrupted recordings are not possible or practical shall be documented with a voice announcement that the recording is being stopped and shall note the time (as in 7.2.4) and reason for the stop. When the recording is restarted, a brief announcement shall be made noting the time of the restart (as in 7.2.4) and a brief description of identifying information, suitable for a tag or label (as in 7.5), for example a name or case number.

7.1.6 When announcements on recorded evidence cannot be made
When a FAR cannot make a recorded announcement or declaration as to the stopping or restarting of a recording, detailed written notes (as in 7.5) shall be promptly made in order to document the recording's stop time, stop duration, restart time, reason for the stop and to note any changes of people present.
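
The written note described in 7.1.6 can be thought of as a small record with fixed fields. The sketch below is one hypothetical way to structure it so that the stop duration is derived from the two noted times; the class name, field names, and example values are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecordingBreak:
    """One documented interruption in a recording (hypothetical layout)."""
    stop_time: datetime
    restart_time: datetime
    reason: str
    changes_in_people_present: str = ""

    def __post_init__(self):
        # A restart noted before the stop indicates a clerical error.
        if self.restart_time < self.stop_time:
            raise ValueError("restart_time is earlier than stop_time")

    @property
    def stop_duration(self) -> timedelta:
        """Elapsed time between the stop and the restart."""
        return self.restart_time - self.stop_time


# Hypothetical example values.
note = RecordingBreak(
    stop_time=datetime(2005, 7, 9, 14, 5, 0),
    restart_time=datetime(2005, 7, 9, 14, 7, 30),
    reason="cassette changed at end of side A",
)
print(note.stop_duration)
```

Deriving the duration from the recorded times, instead of writing it down separately, leaves one fewer figure that could later be found inconsistent.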

7.1.7 Categories of acoustical sources
Recorded content can be grouped within four categories:
  • speech (as in 7.2)
  • performance(s) of artistic material including music or songs
  • environmental sounds (as in 7.3)
  • electro-mechanical noise (as in 7.4)

7.1.8 Encouraging consistency in these recommended practices.
The declarations or announcements mentioned above represent recommended practice. Depending on factors which can be considered by the finder of fact, any normative requirements in this document are at least informative and should be considered as part of training and for the planning of recordings. Adequate training is important because the routine habit of using self-documenting procedures should be cultivated as well as competent practices inherent to the operation of recording equipment such as microphone placement or gain sensitivity settings.

7.2 Speech

7.2.1 General speech issues
Documenting intelligible speech or conversation with a continuous recording is ordinarily the purpose of recorded audio evidence. Embedded declarations - stated on the recording itself - identifying the who, what, when, where and if appropriate, why of the recording are examples of self-documentation. Each continuously recorded section should be bracketed with announcements, if possible, such as those specified in 7.1.3 and 7.1.4. Wherever possible, these bracketing declarations should be part of one continuous recording section.

In some social contexts it will be more appropriate for a FAR to record separate declarative sections before and after. The ultimate disposition of the finder of fact is the only test method to apply to declarations made by a FAR. Errors in the collection of audio evidence may be avoided with the use of a checklist documenting steps and procedures for making a forensic audio recording.
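
As a minimal sketch of such a checklist, the opening announcements recommended in 7.1.3 can be kept as a list and compared with what has actually been declared. The names below are hypothetical, and the list simply restates the items from 7.1.3.

```python
# Items recommended in 7.1.3 for the beginning of a recording.
OPENING_ITEMS = [
    "case identification number and name",
    "time, date, and year",
    "location",
    "FAR name, organizations and ID if appropriate",
    "participants expected on the recording",
    "subject of the recording",
    "make, model, and serial number of the recorder, transmitter, receiver, etc.",
    "manufacturer and model of the audio recording media and batch number (if present)",
]


def missing_items(required, declared):
    """Return the required items not yet declared, in checklist order."""
    done = set(declared)
    return [item for item in required if item not in done]


# Hypothetical case: the FAR has announced only the first six items.
declared_so_far = OPENING_ITEMS[:6]
for item in missing_items(OPENING_ITEMS, declared_so_far):
    print("still to declare:", item)
```

The same function could be applied to the end-of-recording items in 7.1.4 by supplying a second list.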

7.2.2 Declaration of start, stop, pause and resume
A break in the continuity of an audio recording is the most common weakness found in recorded audio evidence. The concept of making every effort implies that there will be circumstances under which certain efforts are unable to be made, particularly when the need to start or stop a recording is sudden or unexpected. For the finder of fact to have confidence in the integrity of the recording any stop, start, or pause in a tape recording shall be documented (as in 7.1.5). As with the stop recording function, the pause function should not be used if it can be avoided.

7.2.3 Declaration of who is present
To avoid denials of identity in court it is important, during the documentary portions of the recording, to name or describe the parties present. Where possible this should be during continuous recording. Where this is not possible, separate declarations before and after each recorded conversation should be made. A FAR should use full names, including spellings, or, if names are not available, full descriptions of the participants. A FAR should also request that parties who are present confirm their identities or additionally identify themselves on the recording.

7.2.4 Declaration of when or time of day
Announcing the time of day does not have to be elaborate or frequently repeated, but it is recommended both when starting and stopping a recording.

It is also useful to inject time of day declarations in the event of identifiable background noises (as in 7.3) that can later be independently verified to have occurred at a specific time of day.

7.2.5 Declaration of where or location
Location has several variables besides street address or what floor of a building, or room in a building, is being used. Descriptions of the surroundings are recommended, especially if they possess unusual audible characteristics such as echo, background silence or unique exposure to background noises of particular kinds. Consideration should be given as to whether environmental sounds might puzzle listeners. Because every location has distinctive environmental sounds, all mysteries should be eliminated since enigma weakens evidence.

7.2.6 Declaration of what and why or general and detailed purposes
Brief, self-documenting explanations of the circumstances of the situation provide a sense of background and purpose for a recording. They also reassure listeners that the FAR's activities reflect both planning and training.

7.2.7 Declaration of how a recording is made
Each piece of recording equipment used contributes to the uniqueness of a particular recording. A FAR should declare the make, model number and serial number or other unique identifiers for the recorder, media and external microphone(s) (as in 5). The location of the equipment should be described.

Often the waveform "fingerprints" of an equipment configuration will allow a recording to be reliably authenticated, by matching the recording's artifacts or peculiarities to the equipment which introduced them. Anomalies that can be demonstrated to be specific to a particular recording system, operating in a specific location (as in 7.2.5), might need to be explained in a courtroom setting. For this reason, maintenance logs (as in 7.5) can be of critical importance. Although equipment needs to be maintained in reasonable working condition, a FAR shall avoid repairs that can compromise a piece of equipment's usefulness as evidence of the authenticity of a recording. For example, for cassette machines, head cleaning should be performed regularly and noted in a tape recorder's maintenance log, but more drastic changes to a machine's mechanical operation, such as head realignment, should be considered carefully. Mechanical repairs and servicing that can damage the evidentiary usefulness of equipment should be avoided.

7.2.8 Declaration of willingness to be recorded
Declaration of willingness to be recorded shall be made by all parties being recorded when in social contexts having a strong expectation of privacy. Recording declarations of willingness to be recorded by all parties present is always a recommended practice (see 7.1.8).

In social contexts in which the right to privacy is specifically waived, it is important to have subjects give their consent with clear verbal answers such as "yes" or by using a sentence. Just as on a witness stand, a grunt is not unequivocally either a yes or a no.

7.3 Environmental sounds
Environmental sounds are background noises and include unique events such as sirens or aircraft. Unique sounds that can be independently determined to have occurred at the time and location of the recording contribute to the authentication process. Audio evidence is generally not impaired by incidental noises provided that they do not interfere with intelligibility.

7.4 Electro-mechanical noise

7.4.1 Tape transition artifacts such as starts or stops
The mechanical and electrical transitions of the record and erase heads when starting, stopping and pausing a recording create identifiable artifacts on a tape. These artifacts should be documented and explained in declarations (as in 7.1.5 and 7.2.2). Electro-mechanical artifacts that are not consistent with a recording's stated history jeopardize its credibility as evidence. For example, if an accidental over-recording occurs, it should be documented in writing immediately (as in 7.5).

7.4.2 Electromagnetic interference (EMI)
Electromagnetic interference that impinges upon circuits and cables during a recording can result in the recording of hum or unwanted radio transmissions. Although EMI phenomena are undesirable because they distract from the intended subject of the recording, when they do occur they can provide additional evidence of authenticity if properly documented (see 7.5) or declared.

7.5 Ancillary written documentation and labeling
Recording media should be labeled clearly, and unequivocal statements of whether a recording is an original, a copy of an original, or a copy of a copy should be made in the space available. Distinctive identification of each recording should be made in several formats: primarily as embedded audio declarations, and secondarily on supporting written or printed labels or sheets of standard-size, easily photocopied paper, for example in the U.S. 8-1/2" x 11". The declarations enumerated above under 7.2 can be committed to writing in this manner. In situations where spoken requirements and recommendations are not followed, written documentation should be completed in a prompt and competent manner (see 7.1.6).

Maintenance logs shall be maintained and kept up to date for all recording equipment used for forensic purposes (see 7.2.7). Note also 7.6.3 Chain of custody considerations.

7.6 Physical media handling

7.6.1 Control over evidence in its physical form as recorded media
During the period while the audio evidence is being made and documented, a FAR shall control the actual physical media insofar as possible and shall be responsible for observing reasonable technical procedures for its preservation.

7.6.2 Protection from environmental hazards or accidental over-recording
Since exposure to extreme temperatures, liquids, strong magnetic fields or other environmental hazards can destroy recorded evidence, the FAR shall ensure that the physically recorded media or stand-alone voice recorders are stored in an appropriate environment so as to maximize the quality of the recorded audio content. See annex A for full titles of AES22-1997 (r2003) and AES38-2000: AES recommended practice for audio preservation and restoration.

Playback of evidence or potential-evidence tapes should be done on machines that are well maintained. This minimizes the possibility that the recorded medium might be damaged or compromised. Good maintenance for analog recorders and players would include cleaning the tape path, and demagnetizing the mechanism if practical.

For hard-to-reach tape paths such as a car stereo cassette or VHS VCR, playing a wet cleaning cassette is the best way to maintain a clean tape path. For most audio cassette systems, the best technique is to clean the tape path with a Q-tip and a suitable solvent. For more details on cleaning and demagnetizing a tape path, a web search for "tape path demagnetizing" is useful.

It is a recommended practice to remove the plastic record enable tab or tabs on an audio cassette immediately after a recording has been made. Such removal(s) protect against unintentional alteration of the recording by making it difficult to record over it accidentally during subsequent handling. Anytime such protective practices for recorded media can be followed, they should be followed. For more information on the managing of recorded audio materials intended for examination, refer to AES27-1996.

7.6.3 Chain of custody considerations
A FAR should recognize that once a finished recording is beyond the FAR's control it can be compromised. Therefore a FAR should follow the proper documentation, administration or logging in and out of the chain of custody for audio evidence. For answering possible challenges to the authenticity of the recorded evidence, it can be useful for the FAR to retain a copy of the original recording and documentation. When analog recordings are made that can be recorded on both sides, the tabs should be removed from both sides when both sides are recorded.

Annex A

(informative)

AES standards relevant to this document.

AES27-1996, AES recommended practice for forensic purposes
- Managing recorded audio materials intended for examination

AES43-2000, AES standard for forensic purposes
- Criteria for the authentication of analog audio tape recordings

AES22-1997 (r2003): AES recommended practice for audio preservation and restoration
- Storage and handling
- Storage of polyester-base magnetic tape

AES38-2000: AES standard for audio preservation and restoration
- Life expectancy of information stored in recordable compact disc systems
- Method for estimating, based on effects of temperature and relative humidity.


End of document. 2005-07-09 revision by Paul Pegas
Copyright 2007 by Wes Dooley
Comments to: stereoms@aol.com