Wide-Band Information Transmission at the Calyx of Held

We use a mathematical model of the calyx of Held to explore information transmission at this giant glutamatergic synapse. The significant depression of the postsynaptic response to repeated stimulation in vitro is a result of various activity-dependent processes in multiple timescales, which can be reproduced by multiexponential functions in this model. When the postsynaptic current is stimulated by Poisson-distributed spike trains, its amplitude varies considerably with the preceding interspike intervals. Here we quantify the information contained in the postsynaptic current amplitude about preceding interspike intervals and determine the impact of different pre- and postsynaptic factors on information transmission. The mutual information between presynaptic spike times and the amplitude of the postsynaptic response in general decreases as the mean stimulation rate increases, but remains high even at frequencies greater than 100 Hz, unlike at many neocortical synapses. The maintenance of information transmission is attributable largely to vesicle recycling rates at low frequencies of stimulation, shifting to vesicle release probability at high frequencies. Also, at higher frequencies, the synapse operates largely in a release-ready mode in which most release sites contain a release-ready vesicle and release probabilities are low.

Introduction
A distinct feature of chemical synapses is their ability to filter temporal signals on short timescales, from minutes to milliseconds (Thomson, 2000). Signals arrive in the form of a temporal sequence of action potentials (APs), and a postsynaptic response depends on the precise timing of the associated AP and the recent history of AP arrivals at the synapse. Responses to regular streams of APs typically either depress or facilitate in amplitude on successive APs (Zucker & Regehr, 2002). The filtering properties of synapses from a single neuron can depend on the postsynaptic target (Markram, Wang, & Tsodyks, 1998). Here we use information theory to quantify the frequency-dependent signal filtering characteristics of a mathematical model of the calyx of Held, a giant glutamatergic synapse in the medial nucleus of the trapezoid body (MNTB) of the mammalian auditory brain stem. The calyx of Held, unlike most central synapses, is physically large enough to enable simultaneous recording from the presynaptic terminal and the postsynaptic target (Forsythe, 1994; Borst, Helmchen, & Sakmann, 1995; von Gersdorff & Borst, 2002; Schneggenburger, Sakaba, & Neher, 2002; Schneggenburger & Forsythe, 2006). In addition, the calyx surrounds the cell body of the receiving MNTB neuron; thus, recordings of excitatory postsynaptic currents (EPSCs) and potentials (EPSPs) are not distorted by dendritic filtering. Voltage clamp recordings reveal that repetitive stimulation of the afferent fibers of a calyx causes a strong, frequency-dependent short-term depression in the magnitude of the excitatory postsynaptic current (Borst et al., 1995). This short-term plasticity (STP) has been attributed to a combination of various factors influencing the processes of vesicular exocytosis and endocytosis, and the sensitivity of the postsynaptic glutamate receptors (for a review, see Schneggenburger & Forsythe, 2006).
Mathematical models fit to such data require both fast components on a millisecond scale, describing vesicle depletion, release probability facilitation, activity-dependent recovery of the releasable vesicle pool, and postsynaptic neurotransmitter receptor desensitization (Weis, Schneggenburger, & Neher, 1999; Trommershäuser, Schneggenburger, Zippelius, & Neher, 2003; Wong, Graham, Billups, & Forsythe, 2003; Graham, Wong, & Forsythe, 2004; Hennig, Postlethwaite, Forsythe, & Graham, 2007), and slower components on a scale of seconds, describing vesicle recovery, metabotropic glutamate receptor activation, and calcium channel inactivation (Billups, Graham, Wong, & Forsythe, 2005; Hennig et al., 2007).
As STP has important implications for communication between central neurons (Zucker & Regehr, 2002), it is desirable to know how particular features of synaptic transmission affect this communication. Information theory provides tools for quantifying the ability of synapses to transfer information about presynaptic stimuli to the postsynaptic neuron (Shannon, 1948; Zador, 1998; Borst & Theunissen, 1999; Fuhrmann, Segev, Markram, & Tsodyks, 2002; London, Schreibman, Hausser, Matthew, & Segev, 2002). At neocortical synapses, the average EPSC amplitude carries information about presynaptic spike times (Fuhrmann et al., 2002). This information is carried by both depression in vesicle availability and facilitation in release probability and is maximal at a particular mean frequency of stimulation, dependent on the time courses of recovery from depression and facilitation (Fuhrmann et al., 2002).
Such information transmission may be particularly important at the calyx of Held. This giant synapse forms a component of circuitry involved in computing interaural level and timing differences (ILD and ITD, respectively) for the determination of sound source location (Trussell, 1999). The output of the calyx target cell, a principal neuron in the MNTB, is inhibitory and provides an inhibitory coding of sound-induced stimuli at the contralateral ear to targets in the lateral and medial superior olives (LSO and MSO) for comparison with ipsilateral (and contralateral in the case of the MSO) excitatory signals for ILD and ITD calculation. Fidelity of signal transmission and spike timing precision are hypothesized to be key factors for this synapse (Trussell, 1999). The very high safety factor at this synapse means that every presynaptic spike produces a finite (nonzero) postsynaptic current, which in turn may or may not result in a postsynaptic spike. Thus, individual EPSC amplitudes are crucial to the spiking output of the MNTB and its ability to follow presynaptic spike trains. Short-term plasticity results in variability in EPSC amplitudes that depends on relative presynaptic spike times. Here we quantify how much information about presynaptic spike times is carried by the EPSC amplitudes and the STP mechanisms that are responsible.
Previous work has developed a deterministic model of average EPSC amplitude that captures multiple timescales of depression and recovery in response to stimulation of the calyx of Held (Graham et al., 2004; Hennig et al., 2007). We extend this model to a stochastic version in which individual vesicles occupy release sites and are released and replenished probabilistically. We then use information theory to relate the amplitude of the postsynaptic response (EPSC) to the timing of afferent spikes and to measure how much information is transferred.
The results indicate that high information transfer is maintained over a frequency range from less than 1 Hz into the hundreds of hertz. Different synaptic mechanisms underpinning STP act as the principal information carriers over different frequency ranges. The information rate, which is a function of stimulation frequency, does not appear to saturate with increasing frequency but instead increases approximately linearly. EPSC amplitudes remain variable throughout the operating range of the calyx, thus influencing postsynaptic spike fidelity as a function of presynaptic interspike intervals.
The model also predicts that while initial depression following the onset of a stimulus stream is largely due to depletion of vesicles, the vesicle pools gradually recover, and steady-state depression is largely due to a reduction in vesicle release probability. The depressed steady state is the usual operating mode of this synapse due to high levels of spontaneous background activity in the presynaptic cell. This combination of high n (number of releasable vesicles) and low p (vesicle release probability) likely has metabolic consequences for the synapse (Hennig, Postlethwaite, Forsythe, & Graham, 2008).
The presynaptic compartment is assumed to contain an effectively infinite reserve vesicle pool and $N = 550$ small readily releasable vesicle pools (RRVPs), each of which initially contains the maximum number of $n_T = 5$ vesicles (Wu & Borst, 1999; Lange, de Roos, & Borst, 2003). The RRVPs, if they contain fewer than the maximum number of vesicles, are replenished from the reserve pool at a constant rate $r_p$. Upon arrival of a presynaptic action potential (AP) at time $t$, the replenishment of RRVPs is enhanced by a constant rate $r_e$ for a short time (assumed instantaneous with the AP). The rates of vesicle recycling, $r_p$ and $r_e$, together with the vesicle release probability $p_j(t)$, determine the probability that a new vesicle will enter or leave the RRVP in a small time interval $\Delta t$.
Physiological experiments show that different release sites behave similarly during intense stimulation (Lange et al., 2003). For simplicity, the model assumes a uniform release probability for all vesicles. This probability depends, according to a power law (Lou, Scheuss, & Schneggenburger, 2005), on the presynaptic calcium concentration that mediates release, $[\mathrm{Ca}^{2+}]_i$, the microdomain concentration around release sites, which is on the order of 10-25 µM (Schneggenburger & Neher, 2000; Schneggenburger & Forsythe, 2006). In the power law, $k$ is a scaling factor that relates $[\mathrm{Ca}^{2+}]_i$ to the release rate of readily releasable vesicles. Classically, the release rate is described as increasing with the fourth power of presynaptic calcium (Dodge & Rahamimoff, 1967). The amplitude of the activity-dependent calcium transient that mediates release, $[\mathrm{Ca}^{2+}]_i$, is variable due to inactivation and facilitation of calcium channels and their suppression due to activation of presynaptic mGluRs. Simplified calcium channel kinetics are described as having active $c_1(t)$, resting $c_2(t)$, inactivated $i(t)$, and blocked $b(t)$ states. This scheme is modeled by a set of ordinary differential equations (Hennig et al., 2007; equation 2.2). The variable $c_1(t)$ describes the evolution of the amplitude of calcium, $[\mathrm{Ca}^{2+}]_i \equiv C_0 c_1(t)$ ($C_0 = 10$, giving an initial transient amplitude of 10 µM for $c_1(0) = 1$, with the scaling factor, $k$, chosen to give an appropriate initial release probability). Calcium channel facilitation is modeled by increasing $c_1(t)$ by a constant amount $n_f$ after each presynaptic spike (at times $t_s$), which then decays with time constant $\tau_f$ to a base level $c_2(t)$. This variable base level, $c_2(t)$, accounts for the suppression of the calcium current by inactivation and mGluR activation (final three equations of equation 2.2).
The constants $n_i$ and $n_b$ define the frequency- and glutamate-dependent (for mGluR activation) rates into, and the time constants $\tau_i$ and $\tau_b$ define the recovery from, calcium current suppression by calcium channel inactivation and mGluR activation, respectively. Initially, $c_1(0) = c_2(0) = 1$, and at all times, $c_2(t)$ […]

During a computer simulation, the vesicle release probability is updated deterministically according to the above equations, but vesicle replenishment and release are calculated stochastically. A single simulation is equivalent to a single experimental trial at a real calyx. Monte Carlo simulations of the stochastic model are carried out as follows.
Resolution: Presynaptic stimulation is defined as a sequence of interspike intervals, $\Delta t$. For each interval, the change in release probability, $p_j(t)$, is calculated deterministically. At each spike time, the number of vesicles that may have arrived at an RRVP since the previous spike is determined stochastically, and each vesicle in an RRVP may release with probability $p_j(t)$.
Replenish: For each $\Delta t$, if the vesicle number in an RRVP, $n_j$, is less than its initial maximum number $n_T$, then each free release site, of which there are $n_T - n_j$, is tested for the arrival of a new vesicle from the reserve pool during $\Delta t$, with a probability set by the background replenishment rate $r_p$ and the enhanced replenishment rate $r_e$. The replenishment of the RRVP with a new vesicle is calculated by testing whether a uniform random number in the interval [0, 1] is less than or equal to this replenishment probability. If it is, then $n_j$ is incremented by one.
Release: A presynaptic action potential occurs at the end of each $\Delta t$. Each vesicle of the $n_j(t)$ in an RRVP is tested for release against the release probability $p_j(t)$. A release occurs if a uniform random number in the interval [0, 1] is less than or equal to $p_j(t)$, in which case $n_j$ is decremented by one. The total number of released vesicles, $T(t)$, is the sum of those released from the individual RRVPs.
Response: The normalized postsynaptic response (PSR), $0 \le R(t) \le 1$, is the relative EPSC amplitude, which depends on the normalized number of released vesicles, $T_N(t) = T(t)/(N \cdot n_T)$, and the amount of postsynaptic AMPAR desensitization, $D(t)$, where $N \cdot n_T$ is the maximum number of vesicles that could be available for release (the number of release sites times the size of each release site).
The postsynaptic AMPAR desensitization, $D(t)$, is modeled assuming a reversible transition into a desensitized state with an increment $n_d$ and recovery time $\tau_d$, averaged over all AMPAR pools. The model parameters, derived by fitting the model to experimental EPSC amplitudes (see section 3), are summarized in Table 1 (Hennig et al., 2007). The model is implemented in Matlab code, which is available from the authors on request and from ModelDB (http://senselab.med.yale.edu/modeldb/).
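The Replenish, Release, and Response steps can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' Matlab implementation. The function name `simulate_trial` is hypothetical. The per-spike refill probability `p_e` stands in for the enhanced rate $r_e$, the background refill probability is taken as $1 - e^{-r_p \Delta t}$, and the response is taken as $T_N (1 - D)$. Those three forms, and the default values of `p_e`, `n_d`, and `tau_d`, are assumptions, because the corresponding equations and Table 1 are not reproduced in this text. The release probability is passed in as an array rather than computed from the calcium equations.

```python
import numpy as np

def simulate_trial(isis, p_release, r_p=0.4, p_e=0.1, N=550, n_T=5,
                   n_d=0.0, tau_d=0.05, rng=None):
    """Run one stochastic trial and return the normalized response per spike.

    isis      : interspike interval (s) preceding each spike
    p_release : release probability per vesicle at each spike
    r_p       : background replenishment rate per free site (1/s)
    p_e       : ASSUMED per-spike probability of activity-dependent refill
    n_d, tau_d: ASSUMED desensitization increment and recovery time (s)
    """
    rng = np.random.default_rng() if rng is None else rng
    n = np.full(N, n_T)              # vesicles per RRVP; all pools start full
    D = 0.0                          # desensitized fraction of AMPARs
    R = np.empty(len(isis))
    for s, (dt, p) in enumerate(zip(isis, p_release)):
        # Replenish: each free site refills during dt (assumed exponential form).
        p_fill = 1.0 - np.exp(-r_p * dt)
        n = n + rng.binomial(n_T - n, p_fill)
        # Desensitization recovers during the interval (assumed exponential decay).
        D *= np.exp(-dt / tau_d)
        # Release: each vesicle in each RRVP is tested against p.
        released = rng.binomial(n, p)
        n = n - released
        T_N = released.sum() / (N * n_T)
        # Response: assumed form R = T_N * (1 - D).
        R[s] = T_N * (1.0 - D)
        D = min(1.0, D + n_d * T_N)
        # Activity-dependent replenishment, treated as instantaneous with the AP.
        n = n + rng.binomial(n_T - n, p_e)
    return R
```

Drawing a binomial count per pool is equivalent to testing each free site or vesicle against its own uniform random number, as the steps above describe, but it avoids an explicit inner loop.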

Information Theory.
We use a direct method (Zador, 1998) to measure the information content in the postsynaptic response $Y$ about an independent homogeneous Poisson spike train $X$. This involves computing the mutual information between pre- and postsynaptic activities to quantify the common information content in both. The sequence of independent interspike intervals (ISIs) of an input spike train, $X = \{x_1, x_2, \ldots, x_n\}$, represents a Poisson process conveying temporal information. The magnitudes of the model responses, $Y = \{y_1, y_2, \ldots, y_n\}$ (given by equation 2.4), to each presynaptic spike may contain information about the preceding interspike intervals. The magnitude of the response to the first input spike is taken as a reference since it has the largest amplitude for a depressing synapse.
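As an illustration of the input described above, a homogeneous Poisson spike train can be generated by drawing independent, exponentially distributed ISIs. The sketch below uses NumPy. The function name `poisson_isis` and its arguments are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def poisson_isis(rate_hz, n_spikes, rng=None):
    """Draw n_spikes independent ISIs (s) for a homogeneous Poisson process."""
    rng = np.random.default_rng() if rng is None else rng
    # ISIs of a Poisson process are exponential with mean 1 / rate.
    return rng.exponential(1.0 / rate_hz, size=n_spikes)

# Example: ISIs for a 100 Hz train; spike times are the cumulative sum.
isis = poisson_isis(100.0, 1000, np.random.default_rng(1))
spike_times = np.cumsum(isis)
```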
The first percentile of this amplitude is defined as the bin resolution, a precision that keeps the information finite (Zador, 1998; Fuhrmann et al., 2002). The remaining model responses are then discretized and assigned to the corresponding amplitude percentile bins. The probability distribution of the response, $P(Y)$, over a long time course can thus be estimated with the maximum likelihood direct estimation method. The total entropy, $H(Y)$ in Shannon's theory (Shannon, 1948), measures the variability of the postsynaptic response $Y$ to the ensemble of different inputs, without being constrained by input conditions,

$$H(Y) = -\sum_i p(y_i) \log_2 p(y_i),$$

where $p(y_i)$ is the probability of the model responses that fall in the $i$th percentile, with a value between $y_i$ and $y_{i+1}$. The conditional entropy, $H(Y|X)$, measures the reliability of the postsynaptic response $Y$ to repeated presentations of the same inputs,

$$H(Y|X) = -\sum_i p(y_i|X) \log_2 p(y_i|X),$$

where $p(y_i|X)$ is the conditional probability of the model responses that fall in the $i$th percentile, conditioned on the appearance of the presynaptic stimulation sequence $X$. The mutual information $I(X;Y)$, quantifying the common information between the presynaptic ISI sequence, $X$, and the amplitude of the PSR, $Y$, is then

$$I(X;Y) = H(Y) - H(Y|X).$$

In numerical experiments, the stochastic model was tested with different groups of Poisson spike trains with mean frequencies $f \in [0.1, 200]$ Hz. Each spike train contained an initial 24-second period followed by 1000 spikes, resulting in a total duration that depends on the stimulation frequency. The 24-second period is three times the longest time constant in the model and allows the model output to reach a stationary state. The model responses in this period are discarded from the analysis. Mutual information is then calculated from the subsequent 1000 spikes, so the number of data points collected is the same for all stimulus frequencies. Little variation was found if this was increased to 10,000 spikes, so 1000 spikes was chosen to minimize computation time while retaining accuracy.
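The binning and total-entropy estimate described above might be sketched as follows. The sketch assumes the bin width is 1% of the first response amplitude and that entropy is measured in bits (base-2 logarithm). The function names `response_entropy` and `mutual_information` are hypothetical.

```python
import numpy as np

def response_entropy(y, first_amplitude, n_bins=100):
    """Maximum likelihood estimate of H(Y) in bits.

    Responses are discretized with a bin width of first_amplitude / n_bins,
    that is, 1% of the reference amplitude when n_bins is 100.
    """
    width = first_amplitude / n_bins
    idx = np.floor(np.asarray(y, dtype=float) / width).astype(int)
    counts = np.bincount(idx)                 # amplitudes are non-negative
    p = counts[counts > 0] / counts.sum()     # empirical bin probabilities
    return float(-(p * np.log2(p)).sum())

def mutual_information(h_total, h_conditional):
    """I(X;Y) = H(Y) - H(Y|X), all in bits."""
    return h_total - h_conditional
```

Four responses spread evenly over four distinct bins give an entropy of 2 bits, while identical responses give zero.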
We repeated a particular Poisson spike train of mean rate $f$ as the input to the model 200 times. Because the model is stochastic, its PSR amplitudes differed on each repetition. We thus account for the reliability of a synapse conditioned on a specific input with an alternative calculation of the conditional entropy, where $n$ is the total number of input spikes in a spike train inducing the PSR, and $p(y_i)$ is the probability of all PSR magnitudes in 200 trials that are induced by the same input spike and fall in the $i$th percentile. This assumes that the variance of each spike response is independent of preceding spikes (see Figures 3a and 3b for examples of the PSR variance for different ISIs). The 200 trials were sufficient to achieve asymptotic values for the conditional entropies.
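One plausible reading of this alternative calculation is sketched below. For each input spike, the entropy of its response distribution across the repeated trials is computed, and these per-spike entropies are then averaged over the $n$ spikes. Averaging with weight $1/n$ is an assumption here, because the equation itself is not reproduced in this text. The function names are hypothetical.

```python
import numpy as np

def _entropy_bits(values, bin_width):
    """Entropy in bits of values discretized with the given bin width."""
    idx = np.floor(np.asarray(values, dtype=float) / bin_width).astype(int)
    counts = np.bincount(idx)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def conditional_entropy(trials, first_amplitude, n_bins=100):
    """Estimate H(Y|X) from repeated trials of one spike train.

    trials : array of shape (n_trials, n_spikes); column j holds the responses
             to input spike j across all trials.
    """
    trials = np.asarray(trials, dtype=float)
    bin_width = first_amplitude / n_bins
    per_spike = [_entropy_bits(trials[:, j], bin_width)
                 for j in range(trials.shape[1])]
    # ASSUMED: equal-weight average over the n input spikes.
    return float(np.mean(per_spike))
```

A fully deterministic synapse would return zero here, since every trial gives the same response to a given spike.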
In the presented results, a number of different model variants are tried. Exactly the same set of stimulus spike trains is used with each model variant so that any systematic bias remaining in our protocol is constant across the model variants. This allows reliable comparison of results between models. Variations in the results across different sets of spike trains for the same model are small.

Regular Stimulation.
The response of the stochastic calyx of Held model to presynaptic regular stimulation at 10, 20, 50, and 100 Hz, optimized against experimental data (Wong et al., 2003), is shown in Figure 1. Both model and experimental data are plotted normalized against the amplitude of the first response in a train. The model fits the mean experimental response as shown before with a deterministic model (Hennig et al., 2007), but now captures much of the variance in the response as well. We subsequently refer to this as the Full calyx of Held model.

Figure 1: The optimized fit of the stochastic calyx of Held model against experimental data of synaptic responses to regular presynaptic stimulation at different frequencies. The mean ± standard deviation (STD) of model (black) and experimental (gray) EPSCs, normalized against the first EPSC in the train, are plotted. Experimental data were recorded from 12 cells at room temperature in brain slice preparations from P10-P13 Lister-Hooded rats (Wong et al., 2003).

Model Variants.
Based on the complete stochastic calyx model, different versions can be derived by removing different combinations of components, a process analogous to the use of antagonists in a pharmacological test. In this study, four additional models are used to compare their information transmission against that of the Full model. The NoSlow model has no slow presynaptic kinetics (voltage-gated calcium channel inactivation and mGluR activation). The NoFac model has no facilitation of the $[\mathrm{Ca}^{2+}]_i$ transient, which would otherwise increase vesicular release probability. In the NoDes model, the postsynaptic AMPARs do not desensitize in the course of binding glutamate. The NoRepl model has no background vesicular replenishment. In each case, apart from adjustments to remove the specified component, all other model parameters are as for the Full model.

Response to Poisson Stimulation.
In Monte Carlo simulations, Poisson spike trains of 10 Hz and 100 Hz mean frequency were used to stimulate the Full and NoDes (no desensitization) calyx models and examine the variability of the postsynaptic response and the contribution of desensitization to the PSR amplitudes. Example raster plots of the PSR amplitudes from these models are given in Figure 2.
Both the raster and the associated amplitude histograms from 40 trials display a transient depression in the PSR at the start of the spike train, which is followed by a stationary state response. The raster plots show the spike-to-spike variation in PSR amplitude when the interspike interval varies. This is due to the presynaptic processes of depletion and recycling of releasable vesicles, which are mediated by the calcium ion concentration, and postsynaptic AMPAR desensitization. The variation of PSR amplitude can thus encode and transfer presynaptic spike timing information to the postsynaptic cell. During the initial depression, the raster and histogram plots show an enhanced PSR amplitude in response to both stimulation frequencies when AMPAR desensitization is not present (NoDes), compared to the complete model with desensitization (Full). However, the distinction between a model with and without desensitization becomes trivial in the stationary state, indicating that desensitization may contribute little to the information transmission here. This is due to the low release rates in the stationary state, such that the AMPARs corresponding to any particular RRVP can recover from desensitization before another release occurs at that RRVP. On average, the stationary state PSR amplitudes in response to 10 Hz stimulation are much larger than those to 100 Hz.

Response Amplitude Versus Presynaptic ISI.
To investigate if there is a clear functional relationship between the presynaptic ISI and the postsynaptic response amplitude, we applied test stimuli with known ISI relative to the last spike in a long conditioning stimulus train and measured the resulting PSR. The conditioning train (mean frequency of 10 or 100 Hz for 30 sec) first ensured the synapse was at stationary state. This protocol was repeated for a large number of test ISIs. The responses of three calyx of Held model variants as a function of the test ISIs are shown in Figures 3a  and 3b. Though there is variability in response, there is a clear functional relationship between the mean PSR and the presynaptic ISI that preceded it. Apart from very short ISIs, the PSR of all the models increases with the ISI of the presynaptic spike train. This is due to the adequate replenishment of the depleted RRVPs and recovery of release probability and desensitized AMPARs for long ISIs. For both the Full and NoSlow models, the PSR initially decreases in amplitude for increasing ISI, up to around 50 ms. This is due to the reduction in facilitation of release probability as the ISI increases to around 50 ms. The relationship between ISI and PSR amplitude is monotonic for the NoFac model, as it contains only processes that reduce the PSR and recover more with longer ISIs.
Binned amplitude histograms of steady-state PSRs resulting from Poisson 10 Hz stimulation are shown in Figure 3c. The PSRs of the five models are similar but show some variation in mean and variance. In each case there is a clear distribution of amplitudes, which arises from both the variation in ISIs in the stimulus and stochastic transmitter release. Amplitude distribution histograms for the Full model differ depending on the mean stimulation frequency (see Figure 3d). Mean and variance are reduced at higher frequencies, with a consequent reduction in the entropy of the PSR distribution. Hence, it is likely that the Full model is capable of transmitting less information at higher mean frequencies.
Functional variability in the PSR for different ISIs can be due to any of the components contributing to transmitter release and postsynaptic current generation. The essential determinants of the PSR are the number of vesicles available for release, their individual release probability, and the AMPAR desensitization state. Poisson 10 Hz stimulation episodes for these pre-and postsynaptic components are shown in Figure 4. All components show considerable variability, which is a function of ISI. All models have a similar average vesicle release probability except the NoSlow model, which has an enhanced calcium transient and hence a higher vesicle release probability (see Figure 4a). The AMPAR desensitization remains at a low level for all models, except when the random release rates are occasionally high (most often with the NoSlow model), and the AMPARs corresponding to any particular RRVP cannot recover from desensitization before another release occurs at that RRVP (see Figure 4b). Vesicle pool occupancy is at a lower average level in the NoRepl model, due to a lower pool replenishment rate, and in the NoSlow model, due to a higher average vesicle release rate (see Figure 4c). The average PSR does vary between models, with the NoSlow model being significantly higher and the NoRepl model being lower (see Figure 4d). These model differences are accentuated at higher frequencies. The stationary state mean and variance of these model components at mean stimulation frequencies of 10, 100, and 200 Hz for the five different calyx models are shown in Figure 5. In all models, the mean PSR decreases with increasing stimulation frequency but is always highest in the NoSlow model. Underpinning the frequency dependence of the PSR in all models except NoSlow, release probability decreases, but vesicle pool occupancy increases with increasing stimulation frequency. 
Despite low release probability at high frequencies, the PSR is still finite in these models due to the high vesicle occupancy. The NoSlow model always has a higher average release probability, as the slow calcium-dependent processes that suppress it are absent. This allows facilitation to increase the average release probability with frequency in this model. As a consequence of the increasing release rate, vesicle pool occupancy is significantly lower in this model and decreases with frequency. For all except the NoSlow model, desensitization remains at a very low level at the stationary state.
For all except the NoSlow model, the coefficient of variation (CV) of the release probability increases with frequency, whereas the CV of vesicle occupancy decreases (data not shown). This suggests that variations in release probability could be a major source of information about presynaptic ISIs at high stimulation frequency, whereas vesicle pool occupancy may be more important at lower frequencies. For the NoSlow model, the CV of both variables is reasonably constant, increasing slightly with frequency.

Information Transmission.
The mutual information (MI) between the steady-state PSR and presynaptic ISI, the information rate, and the information efficacy of the five model variants are shown in Figure 6. The information rate is the information encoded per time unit rather than per PSR. The upper bound of the information rate can be estimated as the product of presynaptic frequency and the information per PSR (Fuhrmann et al., 2002). The information efficacy is defined as the fraction of the informative component within the total entropy of the responses (Fuhrmann et al., 2002). It measures the information transmission efficiency of a synapse. A deterministic synapse model has the best possible information transmission, with an information efficacy of unity. A stochastic model, on the other hand, has both noisy and informative components in its signal channels and thus has an information efficacy less than one. Linear and logarithmic plots reveal the different performances of the five calyx models for information transmission in a broad stimulation frequency range from 0.1 up to 200 Hz, covering the range of spontaneous rates, and into the range of sound-induced stimulus rates experienced by this synapse. As described below, there are two distinct phases (above and below about 10 Hz) in the information transmission, as shown in Figures 6a and 6b, determined by the balance between processes that depress the postsynaptic response and processes that amplify, or facilitate, the response. At frequencies below about 10 Hz, depressing processes that result in a smaller postsynaptic response for shorter ISIs are the main information carriers, with background vesicle replenishment being by far the major component. Other depressing components are the slow processes of calcium channel inactivation and deactivation (mGluR activation) and postsynaptic AMPAR desensitization.
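The two derived measures defined above reduce to simple arithmetic on the MI and total entropy. The sketch below follows the definitions in the text: the rate bound is frequency times information per PSR, and efficacy is the informative fraction of the total entropy. The function names are hypothetical, and units of bits are assumed.

```python
def information_rate(mi_bits_per_psr, rate_hz):
    """Upper bound on information rate (bits/s): frequency times bits per PSR."""
    return rate_hz * mi_bits_per_psr

def information_efficacy(mi_bits_per_psr, total_entropy_bits):
    """Fraction of the total response entropy that is informative (0 to 1)."""
    if total_entropy_bits <= 0:
        raise ValueError("total entropy must be positive")
    return mi_bits_per_psr / total_entropy_bits
```

For a deterministic model the conditional entropy is zero, so the MI equals the total entropy and the efficacy is one.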
Slow processes with time constants of seconds, namely calcium channel inactivation and deactivation and background vesicle replenishment, will variously affect the PSR depending on the ISI, as they will not be fully recovered between presynaptic spikes. Background replenishment is particularly important as it provides a significant percentage of vesicles to the RRVPs. Its removal leads to a significant drop in MI below 20 Hz. Removal of no other process causes such a large change in MI.
With Poisson distributed spike trains, there are occasionally sufficiently small ISIs that the faster processes of facilitation and desensitization, with time constants in the tens of milliseconds, also introduce variation into the PSR and hence act as information carriers. Desensitization also acts to reduce the PSR for shorter ISIs, and thus its information capacity adds to that of vesicle replenishment and the slow processes. Its removal reduces information transmission a little at low frequencies. In contrast, facilitation carries information by increasing the PSR for shorter ISIs and so acts opposite to the depressive components. Its removal results in a slight increase in information transmission at low frequencies.
All models attain their peak information transmission at around 1 to 2 Hz mean stimulation frequency. This corresponds to the maximum variation in the fraction of vesicle recovery between ISIs, which has a rate of 0.4 vesicles per second. Long ISIs of greater than a few seconds will result in almost certain refilling of a release site, whereas very short ISIs will result in empty release sites, almost certainly not being refilled. Between these extremes, the length of ISI has a strong effect on the probability that an empty site will be refilled and the subsequent amplitude of the PSR.
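The dependence of refilling on ISI can be illustrated numerically. The sketch below assumes that the waiting time for background refilling of an empty site is exponential with rate 0.4 per second, so that the refill probability over an interval is $1 - e^{-r_p \Delta t}$. This functional form is an assumption, and the function name `refill_probability` is hypothetical.

```python
import math

def refill_probability(isi_s, r_p=0.4):
    """Probability that an empty site refills within isi_s seconds (assumed form)."""
    return 1.0 - math.exp(-r_p * isi_s)

# Compare a very short, an intermediate, and a long interval.
for isi in (0.01, 1.0, 10.0):
    print(f"ISI {isi:6.2f} s -> refill probability {refill_probability(isi):.3f}")
```

Under this assumption the refill probability is close to zero for a 10 ms interval, roughly one third for a 1 s interval, and close to one for a 10 s interval, matching the qualitative description above.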
The low information transmission in the low-frequency range when background vesicle replenishment is removed is not, as might be expected, due to a lack of vesicles available for release, as the RRVPs are also maintained by activity-dependent replenishment. However, this activity-dependent replenishment occurs on a per-spike basis and thus is independent of the presynaptic ISI. Spontaneous background replenishment, on the other hand, occurs at a constant rate and thus does carry some information about the ISI: the longer the ISI, the more likely that a vesicle will arrive at a vacant release site. Lack of background replenishment removes this ISI-dependent variability in RRVP size.
In a transition range, from around 10 to 20 Hz, information transmission switches from being carried by depressive processes to being carried by facilitation. This is evident since removal of facilitation increases information transmission at 10 Hz but decreases it at 20 Hz (see Figures 6a and 6b). Additionally, removal of desensitization and slow processes now increases information transmission, and removal of background replenishment no longer decreases information transmission at 20 Hz.
Thus, the second phase, at frequencies greater than 20 Hz, is dominated by fast facilitation of vesicle release. Facilitation maintains a finite and variable release probability, while activity-dependent vesicle recycling maintains finite-sized RRVPs, resulting in greater-than-zero amplitude PSRs that are sensitive to the presynaptic ISI, for all models except the NoFac model. Without slow components (NoSlow), information content actually increases again with frequency due to increasing facilitation. The slow components in the Full model suppress release probability, resulting in lower information transfer, but MI remains largely constant with increasing frequency. Release probability is very low in the NoFac model, resulting in negligible information transfer, even though the RRVPs are full. Desensitization is also sufficiently fast that it could act as an information carrier at high frequencies, but its magnitude is limited by the low release of neurotransmitter in this range. It would tend to counteract the information carried by facilitation. Since facilitation is dominant in this range, removal of desensitization results in a small increase in information transfer. Desensitization as an information carrier is explored in more detail below.
For all models, apart from NoFac, the information rate increases monotonically (near linearly) with frequency, reflecting the high information content maintained over the full frequency range (see Figures 6c and 6d). This is the result of the activity-dependent replenishment of the RRVPs that can keep vesicles available for release for every AP in a high-frequency train. This replenishment process does not saturate in these models, but in reality it must have a finite time course that may not keep up with still higher frequencies (Hennig et al., 2008). Facilitation is also modeled as being instantaneous and thus continues to increase with frequency until the release probability reaches one. This is not likely in reality. Saturation of facilitation would also cause the information rate to saturate rather than continue to increase.
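The near-linear growth of the information rate follows directly if the rate is taken to be the information per PSR multiplied by the mean stimulation frequency. The sketch below assumes that definition, which is not stated explicitly in this section, and uses a purely hypothetical constant MI value to show the scaling; it does not reproduce the values in Figure 6.

```python
def information_rate_bits_per_s(mi_bits_per_psr, mean_rate_hz):
    """Information rate, assuming one PSR per presynaptic spike.

    Assumption: rate (bits/s) = MI per PSR (bits) x mean stimulation rate (Hz).
    """
    return mi_bits_per_psr * mean_rate_hz


if __name__ == "__main__":
    hypothetical_mi = 0.5  # bits per PSR; illustrative value only, not a model result
    for rate_hz in (20, 50, 100):
        bits_per_s = information_rate_bits_per_s(hypothetical_mi, rate_hz)
        print(f"{rate_hz:3d} Hz -> {bits_per_s:.1f} bits/s")
```

If MI per PSR instead fell in inverse proportion to frequency, the information rate would stay flat, which is the saturation scenario described above.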
Information efficacy is finite, and its trend with frequency tends to track that of the mutual information, also reaching optimum information efficacy at 1 to 2 Hz (see Figures 6e and 6f). This reflects a relatively constant noise level due to stochastic vesicle release and recycling. Efficacy declines during the transition phase from 10 to 20 Hz, during which information transmission switches from depression to facilitation. Thereafter, it increases with frequency as release probability declines, and, consequently, stochastic noise due to variation in vesicle release becomes small. The finite efficacy across the frequency range (from 0.1 to just under 0.4) is the result of the large number of active zones and associated release sites at this synapse, minimizing noise due to stochastic vesicle release.

Desensitization versus Facilitation.
The information-carrying capability of desensitization at high frequencies (Graham, 2002) is revealed by the NoSlow model, in which there is more release at high frequencies, leading to more desensitization. Figures 7a and 7b show information transmission in the NoSlow model, either with or without facilitation and desensitization. Removal of both mechanisms leads to a rapid decline in MI at frequencies above 10 Hz. However, removal of one or other of facilitation and desensitization results in a large increase in MI at high frequencies. Both mechanisms are efficient information carriers but counteract each other in combination. Desensitization carries more information between about 1 and 10 Hz, as recovery from desensitization is slower than decay of facilitation in our standard model (time constant of 500 ms versus 250 ms).

Variation in Model Parameters.
The results shown so far are based on the average characteristics of the calyx of Held as determined by the fitting of the model to particular data sets. Fitting the model to data from individual calyces reveals some variation in parameter values, but with relatively tight distributions around the mean values used here (Hennig et al., 2008). Hence we would not expect large qualitative differences between calyces from the MI curves shown in Figure 6.
The combinations, magnitudes, and time courses of different STP mechanisms are likely synaptic pathway specific throughout the brain. Typical neocortical excitatory synapses exhibit optimum information transmission at higher frequencies, but they also apparently have faster recovery from depression and slower decay of facilitation (Fuhrmann et al., 2002). Using typical neocortical values for background vesicle recycling and decay of facilitation shifts the optimum transmission frequency to higher values at the calyx, as shown in Figures 7c and 7d. Increasing the background replenishment rate to 2 vesicles per second (time constant of recovery of 500 ms) shifts the MI curve to the right, with optimum MI at around 4 to 5 Hz. Slowing the decay time constant of facilitation to 500 ms results in a new large peak in MI at 20 Hz and a significant reduction in MI below 10 Hz, where the increased facilitation counteracts the information transmission due to depressive processes. This is consistent with the operation of neocortical synapses (Fuhrmann et al., 2002).
The most distinctive feature of the calyx is the large number of readily releasable vesicle pools (RRVPs) that are all driven by the same presynaptic action potential. Neocortical synapses may contain only one or a few RRVPs. Increasing the number of pools increases the amount of information transmitted by the synapse (Fuhrmann et al., 2002). A single RRVP provides fewer than 0.1 bits of information per PSR for a model neocortical synapse (Fuhrmann et al., 2002). Figures 7e and 7f show that the magnitude of MI is similarly dependent on the number of RRVPs at the calyx, reducing from a maximum of 1.5 bits per PSR with 550 pools to 0.45 bits per PSR with only 50 pools. However, the increase in MI with the number of RRVPs is sublinear; hence, the number of bits of information per RRVP falls as more pools are added.
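The sublinear scaling can be checked with simple arithmetic on the two values quoted above. The sketch below only divides the reported MI by the number of pools; the function name is hypothetical.

```python
def bits_per_pool(mi_bits_per_psr, n_pools):
    """Average information per PSR attributable to each RRVP."""
    return mi_bits_per_psr / n_pools


# MI values quoted in the text for 550 and 50 pools.
per_pool_550 = bits_per_pool(1.5, 550)
per_pool_50 = bits_per_pool(0.45, 50)

if __name__ == "__main__":
    print(f"550 pools: {per_pool_550:.4f} bits per PSR per pool")
    print(f" 50 pools: {per_pool_50:.4f} bits per PSR per pool")
```

An elevenfold increase in the number of pools raises MI by a factor of only about 3.3, so each pool in the larger synapse contributes roughly a third as much.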
In summary, information transmission over a wide frequency range may be carried by both depressive and facilitatory processes that lead to smaller or larger PSRs, respectively, for shorter ISIs. These processes tend to counteract one another, which, for our model of the calyx of Held, results in a transition phase where MI may decline as information switches from being carried by slower vesicle recycling to faster facilitation of release probability. The optimum frequency and magnitude of MI depend on the amplitude and time course of the individual information carriers.

Discussion
A stochastic mathematical model of the calyx of Held has been explored in terms of its ability to encode temporal information about the timing of presynaptic spikes. The information transmission performance is characterized by the total and conditional entropies of the stationary state postsynaptic response (PSR) amplitudes to an input spike train. The mutual information between the input and output signals can be inferred from the total and conditional entropies, which measure the variability and reliability of the signal transmission channel, respectively. We tested the information transmission characteristics of the calyx model over a physiological stimulation range covering spontaneous background firing frequencies as well as stimulus-induced frequencies (Kopp-Scheinpflug, Lippe, Dorrscheidt, & Rubsamen, 2002). The major information carriers at this synapse appear to be presynaptic mechanisms that manipulate vesicle recycling and release rates through alteration of calcium transients in response to presynaptic action potentials. They are able to maintain high information transmission across a broad frequency range.
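The relation between these quantities can be sketched as MI = H(R) - H(R|S), where H(R) is the total entropy of the response amplitudes and H(R|S) is the entropy remaining once the stimulus (here, a binned ISI) is known. The example below is a generic discrete estimate from a joint probability table. The binning, function names, and toy tables are assumptions for illustration and are not the estimation procedure used for the calyx model.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy in bits of a discrete distribution (zero entries skipped)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def mutual_information_bits(joint):
    """MI = H(R) - H(R|S) from a joint table joint[s][r] of stimulus and response bins."""
    total = sum(sum(row) for row in joint)
    joint = [[value / total for value in row] for row in joint]  # normalise to sum to 1

    # Total entropy: variability of the response across all stimuli.
    n_responses = len(joint[0])
    p_response = [sum(row[r] for row in joint) for r in range(n_responses)]
    h_total = entropy_bits(p_response)

    # Conditional entropy: residual response variability for a fixed stimulus.
    h_conditional = 0.0
    for row in joint:
        p_stimulus = sum(row)
        if p_stimulus > 0:
            h_conditional += p_stimulus * entropy_bits([v / p_stimulus for v in row])

    return h_total - h_conditional


if __name__ == "__main__":
    reliable = [[0.5, 0.0], [0.0, 0.5]]       # response fully determined by stimulus
    unrelated = [[0.25, 0.25], [0.25, 0.25]]  # response independent of stimulus
    print(mutual_information_bits(reliable))
    print(mutual_information_bits(unrelated))
```

A noiseless two-bin channel transmits one bit per response, whereas a response that is independent of the stimulus transmits none, however variable it is.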
Examining calyx model variants indicates that different synaptic components contribute most prominently to information transmission in different frequency ranges. Vesicle recycling is the dominant mechanism at low and intermediate frequencies. Above about 20 Hz, facilitation of vesicle release probability is the major information carrier. From 10 to 20 Hz, a transition takes place in which the information carrier switches from the depressive recycling mechanism to the facilitatory mechanism. These two mechanisms transmit information in opposite ways and so counteract each other, resulting in a local minimum in information transfer in this frequency range. Maximum information transfer occurs at around 1 to 2 Hz.
Our models also indicate possible operating modes for this synapse. In the high-frequency phase in particular, the Full and NoSlow models, though they have high information-carrying capabilities, are operating in essentially two different modes. The Full model maintains near full RRVPs and has a low, but fluctuating, vesicle release probability. The NoSlow model, however, operates in a high-release probability regime, with partly depleted RRVPs but with variation in both release probability and RRVP size contributing to information transfer. Higher release rates also lead to AMPAR desensitization becoming more significant and contributing to information transfer. These different operating modes must have metabolic consequences for biological synapses, as well as conferring different response properties to changes from the stationary state, such as might arise from the onset or offset of a sound in the environment. It is conceivable that it is metabolically easier to rapidly manipulate release probability to make use of already available vesicles than it is to recruit new vesicles to the RRVPs when required (Hennig et al., 2008).
The precise shape of the information transmission curve with frequency is determined by the time course of the STP mechanisms and their relative magnitudes in the synapse. Our calyx model contains a number of STP mechanisms whose time courses and magnitudes are optimized against a range of experimental data. These mechanisms cover a wide range of timescales (from milliseconds to seconds) and processes that facilitate or depress vesicle recycling and release and postsynaptic AMPAR responses.
Nonetheless, the currently available data do not uniquely determine the model parameters. Data from individual calyces from animals of a particular age group indicate that there is some variation between synapses (Hennig et al., 2008), but their characteristics are broadly similar. Initial EPSC amplitudes and the time course of depression transients reveal differences in initial release probabilities, but this does not necessarily transfer through to variations in stationary state responses, for which more data are required. Synapse characteristics do change with the developmental age of the animal, with AMPAR desensitization, in particular, decreasing with age (Taschenberger, Scheuss, & Neher, 2005).
Further STP mechanisms could be included in the model. Posttetanic potentiation and augmentation of release probability, due to residual calcium influencing vesicle recycling and release, are present at this synapse (Habets & Borst, 2005). These are facilitatory mechanisms with time courses of seconds to minutes. Spillover of glutamate could contribute to AMPAR desensitization and mGluR activation at neighboring release pools, increasing the depression following transmitter release. Release probability is also subject to modulation during MNTB network activity via presynaptic AMPA, GABAB, and glycine receptors (Schneggenburger & Forsythe, 2006). Facilitation and activity-dependent vesicle recycling could reasonably be expected to saturate at high frequencies. These extra mechanisms and details would affect the quantification of information transmission across the frequency range and could affect the optimum frequency, but should not destroy the wide-band characteristics of this synapse demonstrated by our current model.
In contrast to the calyx, typical neocortical synapses exhibit optimal information transmission from around 1 to 70 Hz, depending on the recovery rates of vesicle depletion and facilitation (Fuhrmann et al., 2002). Recovery from facilitation is apparently quicker, while background vesicle replenishment is slower, at the calyx than seems to be the case at neocortical synapses (Markram, 1997; Markram et al., 1998; Fuhrmann et al., 2002), resulting in the low optimum frequency range. Adjusting these rates to neocortical values in the calyx model increases the optimum frequency, as expected (see Figures 7c and 7d). In vivo, each calyx receives spontaneous activity typically at rates significantly greater than 1 Hz (Kopp-Scheinpflug et al., 2002). Thus, a calyx will be operating well above its optimum information transmission point. Nonetheless, EPSC amplitudes will still be continuously signaling variations in presynaptic ISI to the receiving MNTB neuron. It is debatable whether in vivo neocortical networks encode information more efficiently when processing activity at spontaneous rates or at high rates. Some experimental and analytical results suggest that neocortical synaptic connections have optimal information encoding for low, spontaneous input rates (Abeles, 1991; Arieli, Sterkin, Grinvald, & Aertsen, 1996; Fuhrmann et al., 2002). Other studies treat spontaneous activity as a noisy background that can favorably influence the encoding of higher-frequency signals (e.g., 100-300 Hz) by, for example, tonically activating synaptic conductances (Bernander, Douglas, Martin, & Koch, 1991; Rapp, Yarom, & Segev, 1992; Destexhe & Paré, 1999; Hô & Destexhe, 2000).
This giant synapse has been hypothesized to form an inverting relay, converting its excitatory input spike train into an equivalent, though not necessarily identical, spiking output from its receiving MNTB neuron (Sommer, Lingenhöhl, & Friauf, 1993; Oertel, 1999; Trussell, 1999; Schneggenburger & Forsythe, 2006). This output inhibits its targets via glycinergic synapses. Experiments in slice preparations show that the MNTB output can faithfully follow high-frequency input trains into the hundreds of Hertz (Wu & Kelly, 1993). Specialist membrane properties, such as low- and high-voltage-activated potassium channels (Brew & Forsythe, 1995; Kopp-Scheinpflug, Fuchs, Lippe, Tempel, & Rubsamen, 2003), prevent temporal summation of the calyx input and enable the cell to produce a single spike out for every spike into the calyx (Wu & Kelly, 1993). Also, the response of the calyx is directly onto the cell body; hence, variation in the PSR and the information it carries about presynaptic spike times is seen directly by the spike-generating mechanism of the MNTB neuron.
This synaptic configuration is in contrast to many neuronal types in which spatial and temporal integration of synaptic input is likely crucial in generating neuronal output. Pyramidal cells in neocortex and hippocampus receive thousands of inputs, which may summate as they propagate to the cell body. These synaptic signals also interact and are shaped by active processes in the dendrites, which could ameliorate or amplify the effects of STP at individual synapses. This has important implications for the putative effects of synaptic STP on cell output. If spatial integration and active processes are ignored, the average response amplitude of a purely depressing synapse decreases roughly in proportion to the inverse of the stimulation frequency, at high frequencies (Abbott, Varela, Sen, & Nelson, 1997; Tsodyks & Markram, 1997). This has the consequence that the temporally averaged postsynaptic current generated by the synaptic input becomes independent of frequency, resulting in a steady-state cell spiking output that does not code anything about the input stimulus frequency (though changes in rate are signaled by transient alterations in spiking output; Abbott et al., 1997; Tsodyks & Markram, 1997). Though it is not purely depressing, the steady-state EPSC amplitudes at the calyx do decrease with increasing stimulus frequency. However, here successive EPSCs do not significantly summate, even at high frequencies; consequently, the individual EPSC amplitudes encode the instantaneous input spike frequency. If these EPSCs are suprathreshold for the MNTB neuron, then the output spiking will faithfully follow the input frequency. However, the amplitude coding of frequency allows the possibility that further excitatory and inhibitory synaptic inputs, neuromodulatory inputs, and intrinsic membrane properties may act synergistically to filter particular input frequencies.
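The 1/f argument for a purely depressing synapse can be made concrete with a minimal sketch. It uses a generic deterministic depression model of the kind analyzed in the cited work, driven by a regular spike train: each spike releases a fixed fraction of the available resource, which then recovers exponentially. The parameter values (release fraction 0.5, recovery time constant 0.8 s) and function names are hypothetical, and this is not the calyx model.

```python
import math


def steady_state_amplitude(rate_hz, release_fraction=0.5, recovery_tau_s=0.8):
    """Steady-state response amplitude for regular stimulation at rate_hz.

    Generic depression model (illustrative parameters): each spike uses a
    fraction `release_fraction` of the available resource, which recovers
    toward 1 with time constant `recovery_tau_s` between spikes.
    """
    interval_s = 1.0 / rate_hz
    decay = math.exp(-interval_s / recovery_tau_s)
    # Fixed point of x_next = 1 - (1 - x * (1 - release_fraction)) * decay
    available = (1.0 - decay) / (1.0 - (1.0 - release_fraction) * decay)
    return release_fraction * available


def mean_drive(rate_hz, **params):
    """Time-averaged postsynaptic drive: amplitude per spike times spike rate."""
    return rate_hz * steady_state_amplitude(rate_hz, **params)


if __name__ == "__main__":
    for rate_hz in (1, 10, 50, 100, 200):
        amplitude = steady_state_amplitude(rate_hz)
        print(f"{rate_hz:4d} Hz: amplitude = {amplitude:.4f}, mean drive = {mean_drive(rate_hz):.3f}")
```

In this sketch the amplitude roughly halves each time the rate doubles at high frequencies, so the mean drive approaches the constant 1/recovery_tau_s and no longer reports the input rate. Individual amplitudes still vary with frequency, which is the property the calyx exploits when EPSCs do not summate.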
In vivo, the MNTB output in fact is not entirely faithful in following the calyx input, exhibiting significant failures and variation in spike onsets under high-frequency stimulation (Kopp-Scheinpflug et al., 2002). Among the differences, spontaneous and sound-evoked spiking rates in the MNTB neurons are lower than presynaptically, but phase locking to sound frequencies greater than 1 kHz is higher postsynaptically (Kopp-Scheinpflug et al., 2002). The spiking output is determined by the summation of all excitatory and inhibitory inputs and the intrinsic cellular properties (Smith, Joris, & Yin, 1998), which are subject to modulation in response to cellular activity (Song et al., 2005). It remains a challenge to elucidate exactly how the short-term modulation of synaptic input and neuronal properties combine to determine MNTB output. It is highly likely that this pathway through the MNTB acts as rather more than an "inverting relay" (Sommer et al., 1993; Oertel, 1999; Trussell, 1999; Kopp-Scheinpflug et al., 2002). An MNTB neuron and its synaptic inputs may be tuned to preferentially respond to particular features in its input from the calyx, such as sound onsets.