This vignette is considered deprecated! Its content has been moved to the EMU-SDMS manual, where it has been expanded and updated. Specifically, see the chapters on the R package wrassp and on the wrassp implementation.
This document is meant as an introduction to the wrassp package. wrassp
is a wrapper for R around Michel Scheffers’s libassp (Advanced Speech
Signal Processor). The libassp library aims to provide functionality
for handling speech signal files in most common audio formats and for
performing analyses common in phonetic science/speech science. This
includes the calculation of formants, fundamental frequency, root mean
square, autocorrelation, a variety of spectral analyses, zero crossing
rate, filtering etc. This wrapper exposes a large subset of libassp’s
signal processing functions to R in a (hopefully) user-friendly manner.
Let’s get started by locating some example material distributed with the package.
# load the package
library(wrassp)
## Loading required package: tibble
# get the path to the data that comes with the package
wavPath = system.file('extdata', package='wrassp')
# now list the .wav files so we have some audio files to play with
wavFiles = list.files(wavPath, pattern=glob2rx('*.wav'), full.names=TRUE)
One of the aims of wrassp is to provide mechanisms to handle
speech-related files such as sound files and parametric data files.
wrassp therefore comes with a class called AsspDataObj which does just
that.
# load an audio file, e.g. the first one in the list above
au = read.AsspDataObj(wavFiles[1])
# show class
class(au)
## [1] "AsspDataObj"
# print a summary of the object
print(au)
## Assp Data Object of file /tmp/Rtmp3QfY0F/Rinst186cc6eed5e45/wrassp/extdata/lbo001.wav.
## Format: WAVE (binary)
## 19983 records at 16000 Hz
## Duration: 1.248938 s
## Number of tracks: 1
## audio (1 fields)
au is an object of the class AsspDataObj and, using print, we can get
some information about the object, such as its sampling rate, its
duration and what kind of data are stored in what form. Since the file
we loaded is audio only, the object contains exactly one track. And
since it’s a mono file, this track only has one field. We will later
encounter different types of data with more than one track and more
fields per track.
Here are some more ways of extracting attributes from the object, such as duration, sampling rate and the number of records:
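The calls behind the output that follows are not shown in this rendering. A minimal sketch of what they might look like, assuming the accessor dur.AsspDataObj alongside the rate.AsspDataObj and numRecs.AsspDataObj functions used elsewhere in this document, and base R's attributes() for the remaining meta data:

```r
library(wrassp)

# locate and load the first example file shipped with the package
wavPath = system.file('extdata', package = 'wrassp')
wavFiles = list.files(wavPath, pattern = glob2rx('*.wav'), full.names = TRUE)
au = read.AsspDataObj(wavFiles[1])

# extract duration (in seconds)
dur.AsspDataObj(au)
# extract sampling rate (in Hz)
rate.AsspDataObj(au)
# extract number of records/samples
numRecs.AsspDataObj(au)
# extract all attributes attached to the object
attributes(au)
```

For an audio file the duration is simply the number of records divided by the sampling rate.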
## [1] 1.248938
## [1] 16000
## [1] 19983
## $names
## [1] "audio"
##
## $trackFormats
## [1] "INT16"
##
## $sampleRate
## [1] 16000
##
## $filePath
## [1] "/tmp/Rtmp3QfY0F/Rinst186cc6eed5e45/wrassp/extdata/lbo001.wav"
##
## $origFreq
## [1] 0
##
## $startTime
## [1] 0
##
## $startRecord
## [1] 1
##
## $endRecord
## [1] 19983
##
## $class
## [1] "AsspDataObj"
##
## $fileInfo
## [1] 21 2
An important property of AsspDataObj is of course that it contains data
tracks, or at least one data track. As mentioned above, the currently
loaded object contains a single mono audio track. Accessing the data is
easy: AsspDataObj stores data in simple matrices, one matrix for each
track. Broadly speaking, AsspDataObj is nothing but a list of at least
one matrix. All of them have the same number of rows (number of records)
but each can have a different number of columns (number of fields). Each
track has a name and we can access the track using that name.
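The track names and the first samples shown next could be obtained along these lines. This is a sketch rather than the original chunk: it relies on tracks.AsspDataObj (used later in this document) and on the object behaving like a named list of matrices, as described above.

```r
library(wrassp)

# locate and load the first example file shipped with the package
wavPath = system.file('extdata', package = 'wrassp')
wavFiles = list.files(wavPath, pattern = glob2rx('*.wav'), full.names = TRUE)
au = read.AsspDataObj(wavFiles[1])

# extract track names
tracks.AsspDataObj(au)
# base R alternative, since the object is a named list of matrices
names(au)
# show the first few rows (records) of the audio track
head(au$audio)
```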
## [1] "audio"
## [1] "audio"
## [,1]
## [1,] 5
## [2,] -2
## [3,] 17
## [4,] -5
## [5,] -5
## [6,] -2
# and we can of course also plot these samples
# (only plot every 10th element to accelerate plotting)
plot(seq(0, numRecs.AsspDataObj(au) - 1, 10) / rate.AsspDataObj(au),
     au$audio[c(TRUE, rep(FALSE, 9))],
     type='l',
     xlab='time (s)',
     ylab='Audio samples')
Now, purely to give us something different from the original au object
to write to disc, let’s manipulate the audio data by simply multiplying
all the sample values by a factor of 0.5. The resulting AsspDataObj will
then be saved to a temporary directory provided by R.
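A minimal sketch of this step, assuming write.AsspDataObj takes the object followed by the target file path. The file name newau.wav is only an illustrative choice.

```r
library(wrassp)

# locate and load the first example file shipped with the package
wavPath = system.file('extdata', package = 'wrassp')
wavFiles = list.files(wavPath, pattern = glob2rx('*.wav'), full.names = TRUE)
au = read.AsspDataObj(wavFiles[1])

# halve the amplitude of every sample
au$audio = au$audio * 0.5

# write the modified object to R's temporary directory
# ('newau.wav' is a hypothetical file name chosen for this example)
outPath = file.path(tempdir(), 'newau.wav')
write.AsspDataObj(au, outPath)
```

Writing to tempdir() keeps the example files that ship with the package untouched.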
wrassp is of course capable of more than just the mere reading and
writing of specific signal file formats. We will now use wrassp to
calculate the formant values, their corresponding bandwidths, the
fundamental frequency contour and the RMS energy contour of the audio
file wavFiles[1].
# calculate formants and corresponding bandwidth values
fmBwVals = forest(wavFiles[1], toFile=FALSE)
# due to toFile=FALSE this returns an object of the type AsspDataObj and
# prevents the result being saved to disc as an SSFF file
class(fmBwVals)
## [1] "AsspDataObj"
# extract track names
# this time the object contains multiple tracks (formants + their bandwidths)
tracks.AsspDataObj(fmBwVals)
## [1] "fm" "bw"
# check the dimensions of the formant track (records x fields)
dim(fmBwVals$fm)
## [1] 250 4
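The fundamental frequency contour mentioned above can be computed in the same in-memory fashion. The following is a sketch, assuming ksvF0 (one of the functions listed in wrasspOutputInfos) accepts the same toFile argument as forest. The track name is looked up rather than hard-coded, and the time axis is built the same way as for the RMS plot later in this document.

```r
library(wrassp)

# locate the example files shipped with the package
wavPath = system.file('extdata', package = 'wrassp')
wavFiles = list.files(wavPath, pattern = glob2rx('*.wav'), full.names = TRUE)

# calculate the fundamental frequency contour in memory (no file is written)
f0vals = ksvF0(wavFiles[1], toFile = FALSE)

# look up the name of the first (and only expected) track
f0Track = tracks.AsspDataObj(f0vals)[1]

# time axis: record index / record rate, shifted by the start time of the first frame
f0Times = seq(0, numRecs.AsspDataObj(f0vals) - 1) / rate.AsspDataObj(f0vals) +
  attr(f0vals, 'startTime')

# plot the fundamental frequency contour
plot(f0Times,
     f0vals[[f0Track]][, 1],
     type = 'l',
     xlab = 'time (s)',
     ylab = 'F0 frequency (Hz)')
```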
Seeing as one might want to reuse some of the computed signals at a
later stage, wrassp allows the user to write the result out to file by
leaving the toFile parameter set to TRUE. This also allows users to
process more than one file at once.
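A sketch of such a batch call, assuming rmsana accepts a vector of file paths together with an outputDirectory argument. The results are directed to R's temporary directory so that the package's example data stay untouched.

```r
library(wrassp)

# locate the example files shipped with the package
wavPath = system.file('extdata', package = 'wrassp')
wavFiles = list.files(wavPath, pattern = glob2rx('*.wav'), full.names = TRUE)

# calculate the RMS energy contour for all example files;
# toFile is left at its default (TRUE), so one result file per input file is written
rmsana(wavFiles, outputDirectory = tempdir())
```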
##
## INFO: applying rmsana to 9 files
##
# list new files using wrasspOutputInfos$rmsana$ext (see below)
# (list.files() expects a regular expression, so anchor the extension explicitly)
rmsFilePaths = list.files(tempdir(),
                          pattern = paste0('\\.', wrasspOutputInfos$rmsana$ext, '$'),
                          full.names = TRUE)
# read first rms file
rmsvals = read.AsspDataObj(rmsFilePaths[1])
# plot the RMS energy contour
plot(seq(0, numRecs.AsspDataObj(rmsvals) - 1) / rate.AsspDataObj(rmsvals) +
       attr(rmsvals, 'startTime'),
     rmsvals$rms,
     type='l',
     xlab='time (s)',
     ylab='RMS energy (dB)')
wrasspOutputInfos stores meta information associated with the different
signal processing functions wrassp provides.
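Since wrasspOutputInfos is used above like a named list (wrasspOutputInfos$rmsana$ext), the function names it covers can presumably be listed with base R's names():

```r
library(wrassp)

# show the names of all functions described in wrasspOutputInfos
names(wrasspOutputInfos)
```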
## [1] "acfana" "afdiff" "affilter" "cepstrum" "cssSpectrum"
## [6] "dftSpectrum" "ksvF0" "mhsF0" "forest" "lpsSpectrum"
## [11] "rfcana" "rmsana" "zcrana"
This object can be useful to get additional information about a
specific wrassp function. It contains information about the default
file extension ($ext), the tracks produced ($tracks) and the output
file type ($outputType) of any given wrassp function.
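For example, the entry for forest can be inspected as a whole or field by field. This is a sketch using plain list access:

```r
library(wrassp)

# show all output information for the function forest
wrasspOutputInfos$forest

# or look up individual fields
wrasspOutputInfos$forest$ext         # default file extension
wrasspOutputInfos$forest$tracks      # names of the tracks produced
wrasspOutputInfos$forest$outputType  # output file type
```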
## $ext
## [1] "fms"
##
## $tracks
## [1] "fm" "bw"
##
## $outputType
## [1] "SSFF"
For a list of the available signal processing functions provided by
wrassp simply open the package documentation:
# open the wrassp package documentation
?wrassp
We hope this document gives you a rough idea of how to use the wrassp
package and what it is capable of. For more information about the
individual functions please consult the respective R documentation
(e.g. ?dftSpectrum).
To find questions that might have already been answered, or if you have an issue or a bug to report, please use our GitHub issue tracker.