function sig = paravoiced(sig) % This fuction computes various parameters for an input signal segment that % are useful for analyzing a voiced speech segment. The input and output of this % function is a data structure SIG such that: % % sig = paravoiced(sig) % % For INPUT the structure must have the following fields on input: % SIG.Y => Vector with sampled audio segment % SIG.FS => Sampling rate for sig.y in Hz % % The ouput adds/replaces the following data structure fields: % PITCH -> the pitch of the voiced speech % LPC -> the LPC of the given signal (9th order) % R -> the roots of the LPC filter corresponding to frequencies between % 300 and 3900 Hz % FF -> the angles corresponding to the roots R scale by sampling frequency % (Formant Frequencies) % FM -> The magnitudes of the LPC spectrum at the Formant Frequencies % PERSEG -> one period of the prediction error (extracted from the middle of % the input segment. % ER = rms prediction error return by the LPC algorithm % % Update by Kevin D. Donohue (donohue@engr.uky.edu) April 2009 % Formant Range lowformant = 300; % Hertz highformant = 3900; % Hertz % Segment out main voiced component based on assumptions as detailed below % Find the maximum point in the segment ASSUMING it is the location of the % voiced component of the speech. env = abs(hilbert(sig.y)); [gv, gp] = max(env); tax = [0:length(env)-1]'/sig.fs; % Find dispersion before maximum % Number of seconds before max value where voiced component starts secback = sum((tax(1:gp(1))-tax(gp(1))).*env(1:gp(1))) / sum(env(1:gp(1))); % Number of seconds after max where voiced component ends secforward = sum((tax(gp(1):end)-tax(gp(1))).*env(gp(1):end))/sum(env(gp(1):end)); stp = max([gp(1)+round(sig.fs*secback),1]); enp = min([length(sig.y), gp(1)+round(sig.fs*secforward)]); % Extract segment dominated by a stationary voiced component of the speech lseg = sig.y(stp:enp); % Plot it to ensure a reasonable segmentation has been made % Comment out next 8 lines for faster operation figure(2) plot(tax,sig.y,'r') hold on my =ylim; plot([tax(stp),tax(stp)],[my(1), my(2)],'k--'); plot([tax(enp),tax(enp)],[my(1), my(2)],'k--'); hold off [sig.lpc,sig.er] = lpc(lseg,9); % Compute LPC coefficent with model order 10 predy = filter(sig.lpc,1,lseg); % Compute prediction error with all zero filter sig.r = roots(sig.lpc); % Find poles of LPC reconstruction filter sf = sig.fs*angle(sig.r)/(2*pi); % Find angles corresponding to poles and scale by sampling frequency % Find those corresponding to complex conjugate poles nf = find(sf > lowformant & sf < highformant); sig.r = sig.r(nf); % Trim to get relevant/positive roots sf = sf(nf); % Corresponding frequenies sig.ff = sort(sf); % Order frequenices from smallest to largest % Compute spectra to determine pitch. pro.fs = sig.fs; % Samping frequency pro.analwin = 40e-3; % analysis window length pro.fp_low = 60; % Lowest Possible pitch pro.fp_high = 600; % Highest Possible Pitch pro.cliptype = 'clc'; % Softcliping pro.clpercent = [.05]; % Percent of max amplitude to center clip [pitch, detval, t] = ceppitchest(predy,pro); % Find pitch values with sufficiently high correlation/confidence rp = find(detval > .2); if ~isempty(rp) sig.pitch = median(pitch(rp)); % Take median value of all detected pitches % Extract section of predection error % equal to one period rounded off. lenp = fix(length(predy)/2); pp = ceil(sig.fs/sig.pitch); [mv, mp] = max(predy(lenp-pp:lenp+pp)); st = lenp-pp-1+mp(1); sig.perseg = predy(st:st+pp); else sig.pitch = 0; % No pitch found sig.perseg = 0; end % compute LPC spectrum at Fromant Frequencies to find magnitudes sig.fm = zeros(1,length(sig.ff)); % Initalized formant frequency vector % Loop through each formant frequency for kfm = 1:length(sig.ff) w = exp(1j*sig.ff(kfm)*2*pi/sig.fs); % Loop through each coefficient in LPC polynomial for klpc = 1:length(sig.lpc) sig.fm(kfm) = sig.fm(kfm) + sig.lpc(klpc)*w^(-klpc+1); end % take magnitude of AR transfer function sig.fm(kfm) = sig.er/abs(sig.fm(kfm)); end