Envelope and frequency-following responses (FFR_ENV and FFR_TFS) are scalp-recorded electrophysiological potentials that closely follow the periodicity of complex sounds such as speech. These signals have been established as important biomarkers in speech and learning disorders. However, despite important advances, it has remained challenging to map altered FFR_ENV and FFR_TFS to altered processing in specific brain regions. Here we explore the utility of a deconvolution approach based on the assumption that FFR_ENV and FFR_TFS reflect the linear superposition of responses that are triggered by the glottal pulse in each cycle of the fundamental frequency (F0_ENV and F0_TFS responses). We tested the deconvolution method by applying it to FFR_ENV and FFR_TFS recorded in rhesus monkeys in response to human speech and click trains with time-varying pitch patterns. Our analyses show that F0_ENV responses could be measured with high signal-to-noise ratio and featured several spectro-temporally and topographically distinct components that likely reflect the activation of brainstem (<5 ms; 200-1000 Hz), midbrain (5-15 ms; 100-250 Hz), and cortex (15-35 ms; ~90 Hz). In contrast, F0_TFS responses contained only one spectro-temporal component that likely reflected activity in the midbrain. In summary, our results support the notion that the latencies of F0 components map meaningfully onto successive processing stages. This opens the possibility that pathologically altered FFR_ENV or FFR_TFS may be linked to altered F0_ENV or F0_TFS and from there to specific processing stages and ultimately spatially targeted interventions.
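The linear-superposition assumption can be illustrated with a minimal simulation. The sketch below is not the authors' implementation: it assumes that the recorded signal equals a train of unit impulses (one per glottal pulse) convolved with a single fixed response, and it recovers that response by ordinary least squares. All function names, the sampling rate, the F0 range, and the damped-sinusoid response are hypothetical choices made only for illustration.

```python
import numpy as np


def deconvolve_pulse_response(signal, pulse_idx, n_lags):
    """Least-squares estimate of the response triggered by each pulse.

    Assumes signal[t] = sum_k kernel[k] * impulses[t - k] + noise.
    """
    n = len(signal)
    impulses = np.zeros(n)
    impulses[pulse_idx] = 1.0
    # Column k of the design matrix is the impulse train delayed by k samples.
    design = np.zeros((n, n_lags))
    for k in range(n_lags):
        design[k:, k] = impulses[: n - k]
    kernel, *_ = np.linalg.lstsq(design, signal, rcond=None)
    return kernel


def simulate_and_recover(noise_sd=0.2, seed=0):
    """Simulate a pulse-driven response with time-varying F0 and recover it."""
    rng = np.random.default_rng(seed)
    fs = 5000                               # sampling rate in Hz (assumed)
    n = 2 * fs                              # 2 s of simulated data
    f0 = np.linspace(80.0, 160.0, n)        # rising pitch contour (assumed)
    phase = np.cumsum(f0) / fs              # accumulated F0 cycles
    # One pulse at every sample where the cycle count increments.
    pulse_idx = np.flatnonzero(np.diff(np.floor(phase)) > 0) + 1

    # Hypothetical ground-truth response: a 35 ms damped 200 Hz oscillation.
    lags = np.arange(round(0.035 * fs)) / fs
    true_kernel = np.exp(-lags / 0.008) * np.sin(2 * np.pi * 200.0 * lags)

    impulses = np.zeros(n)
    impulses[pulse_idx] = 1.0
    clean = np.convolve(impulses, true_kernel)[:n]   # overlapping responses sum
    noisy = clean + noise_sd * rng.standard_normal(n)

    estimate = deconvolve_pulse_response(noisy, pulse_idx, len(true_kernel))
    return true_kernel, estimate


if __name__ == "__main__":
    true_kernel, estimate = simulate_and_recover()
    r = np.corrcoef(true_kernel, estimate)[0, 1]
    print(f"correlation between true and recovered response: {r:.3f}")
```

In this toy setting the time-varying pitch is what makes the problem well posed: with a perfectly constant F0, responses longer than one cycle would overlap in the same way on every cycle, and the least-squares problem would become nearly degenerate.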