A 125-MHz CMOS Mixed-Signal Equalizer for Gigabit Ethernet on Copper Wire

Tai-Cheng Lee and Behzad Razavi
Electrical Engineering Department
University of California, Los Angeles

Abstract
A discrete-time mixed-signal linear equalizer designed for the analog front end of Gigabit Ethernet receivers performs cable equalization while relaxing the A/D converter complexity. Based on a coefficient-rotating FIR filter architecture, the circuit incorporates 8 taps that are adapted to the cable characteristics by means of an LMS algorithm. A distributed array of interleaved sampling circuits and a linear low-voltage multiplier topology allow both high speed and low power dissipation. Fabricated in a 0.25-μm digital CMOS technology, the equalizer operates at 125 MHz while dissipating 75 mW from a 2.5-V power supply.

I. INTRODUCTION
The analog front end of Gigabit Ethernet receivers can be realized as a high-speed analog-to-digital converter (ADC) with a resolution of 7 to 8 bits, leaving the tasks of echo cancellation and equalization entirely in the digital domain. Alternatively, the latter two functions can be partially performed in the analog domain, thereby reducing the complexity and power dissipation of both the ADC and the digital processor.

This paper describes a discrete-time linear equalizer that, along with the echo canceller reported in [1], processes the received signal in Gigabit Ethernet before digitization. The proposed equalizer architecture achieves a high speed through interleaving and coefficient rotation, proving useful in other applications as well. Using 8 taps adapted by an LMS algorithm, the circuit operates at 125 MHz in a 0.25-μm digital CMOS technology.

The next section of the paper deals with design considerations for channel equalization and introduces the equalizer architecture. Section III describes the design of the building blocks and Section IV presents the experimental results obtained for the prototype.

II. DESIGN CONSIDERATIONS
Figure 1 shows the typical characteristics of a 100-m CAT-5 twisted-pair cable in the time and frequency domains. The impulse response exhibits a long tail, thereby introducing substantial intersymbol interference (ISI), and the frequency response suffers from a loss of about 20 dB at 100 MHz.

The required performance of the architecture is determined by both the cable characteristics and the Gigabit Ethernet environment depicted in Fig. 2. Note that equalization in the analog domain offers the advantage of processing the unquantized signal, thus yielding a higher signal-to-noise ratio (SNR) for a given number of taps. A behavioral model written in C and including the sources of echo and crosstalk is used to compute (1) the minimum allowable number of taps in the equalizer and (2) the minimum tolerable accuracy of the coefficients controlling each tap. Simulations predict that 8 taps and a coefficient accuracy of 9 bits along with an ADC resolution of 4 to 5 bits provide acceptable performance.

![Fig. 1. Cable characteristics.](image)

![Fig. 2. Gigabit Ethernet receiver implementation with mixed-signal echo canceller and analog equalizer.](image)
[...] complexity of the switching array required in this topology grows quadratically with the number of taps, thereby raising severe signal routing and crosstalk issues.

A. Proposed Architecture

The equalizer design reported here is based on a coefficient-rotating architecture. To arrive at the overall system, first consider the simplified concept illustrated in Fig. 3 as an evolution in the time domain. The circuit employs an array of interleaved sample-and-hold amplifiers (SHAs) followed by multipliers. At $t = 0$, sample $x[0]$ is taken and multiplied by $c_0$. At $t = 1$ (one clock period later), the coefficients are shifted counterclockwise and sample $x[1]$ is taken and multiplied by $c_0$. Similarly, at $t = 2$, sample $x[2]$ is taken and multiplied by $c_0$. In other words, rather than shifting the analog data, the circuit simply shifts the coefficients (in the opposite direction), thus providing the same functionality as a conventional FIR filter.
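The equivalence between rotating the coefficients and shifting the data can be checked with a short behavioral sketch. The code below is an illustration written for this text, not the C model mentioned in Section II; the function names and the three-tap example values are assumptions. Each sample is written into one slot (one SHA) and stays there, while the coefficient applied to a slot is selected by the age of the sample it holds.

```python
def conventional_fir(x, c):
    """Reference FIR: y[n] = sum_k c[k] * x[n-k], with x[m] = 0 for m < 0."""
    n_taps = len(c)
    return [
        sum(c[k] * x[n - k] for k in range(n_taps) if n - k >= 0)
        for n in range(len(x))
    ]


def rotating_fir(x, c):
    """Same filter, but the samples stay in place and the coefficients rotate.

    Slot n % N is overwritten with x[n]; a slot holding the sample that is
    k clock periods old is weighted by c[k].
    """
    n_taps = len(c)
    slots = [0.0] * n_taps              # interleaved SHAs, initially empty
    y = []
    for n, sample in enumerate(x):
        slots[n % n_taps] = sample      # only one SHA samples per clock period
        acc = 0.0
        for j in range(n_taps):
            age = (n - j) % n_taps      # periods elapsed since slot j was written
            acc += c[age] * slots[j]    # newest sample always sees c[0]
        y.append(acc)
    return y


# Hypothetical test vector and coefficients (integers keep the sums exact).
x = [1, -2, 3, 0, 5, -1, 2, 4, -3, 1]
c = [3, -1, 2]
outputs_match = rotating_fir(x, c) == conventional_fir(x, c)
```

With these values `outputs_match` evaluates to true: the two loops compute the same sum, and only the bookkeeping of which coefficient meets which sample differs.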

The coefficient-rotating architecture offers two important advantages over conventional FIR implementations. First, owing to interleaving, it operates at a potentially higher speed. Second, by rotating only the digital coefficients, it maintains a high SNR in the analog signal path. The design nonetheless requires attention to the interleaved SHAs and their sampling clocks to ensure negligible corruption in the presence of mismatches.

Figure 4 shows the complete equalizer architecture. All analog signals are differential. Nine SHAs sample the input consecutively, presenting the held values to nine analog multipliers whose outputs are summed in the current domain. When each SHA enters the hold mode, its output is "masked" for one clock cycle by setting the corresponding coefficient to zero. This allows the SHA output to settle before it contributes to the equalizer output, reducing noise and distortion. The effective number of taps is therefore equal to eight.
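The effect of masking can be sketched in the same behavioral style. The code below is an illustrative model under stated assumptions, not the circuit itself: it uses one more slot than there are coefficients, forces the weight of the most recently written slot to zero, and applies $c_0$ to the sample that is one period old. Under these assumptions the output is that of an FIR filter with `len(c)` taps delayed by one clock period; the function name and this delay interpretation are ours.

```python
def masked_rotating_fir(x, c):
    """Coefficient-rotating FIR with one extra, masked slot.

    len(c) + 1 slots are used (nine slots for eight coefficients). The slot
    written in the current period gets a zero coefficient, so it has one full
    period to settle before it contributes to the output.
    """
    n_slots = len(c) + 1
    slots = [0.0] * n_slots
    y = []
    for n, sample in enumerate(x):
        slots[n % n_slots] = sample
        acc = 0.0
        for j in range(n_slots):
            age = (n - j) % n_slots     # 0 means "sampled in this period"
            if age == 0:
                continue                # masked: coefficient set to zero
            acc += c[age - 1] * slots[j]
        y.append(acc)
    return y
```

For an impulse at the input, the coefficients appear at the output in order, starting one period after the impulse, which is consistent with eight effective taps obtained from nine SHAs.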

The coefficients are defined by digital-to-analog converters (DACs) driven by the coefficient registers $c_0$-$c_8$. The up/down (U/D) counters both update the coefficients during the training period and rotate them during actual reception. The clock generator produces the sampling clock phases $\phi_0$-$\phi_8$ from the master clock $CK$.

The LMS machine in Fig. 4 is based on a sign-sign algorithm, requiring the polarity of the analog samples. Rather than following each SHA by a comparator, this design incorporates a dummy SHA driven by $CK$ and a single comparator to detect the polarity of the input. This technique both saves considerable power and isolates the sensitive analog samples from the kickback noise of the comparator.

The equalizer adapts by receiving a known training sequence through a CAT-5 cable. The LMS machine updates the U/D counters and hence the coefficients, forcing the average of the $Error$ signal to zero. The results are then stored in the coefficient registers.
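A sign-sign LMS update driven by up/down counters can be sketched as follows. This is a minimal behavioral illustration, not the LMS machine of Fig. 4: the three-tap channel `h`, the coefficient weight of 1/256 per count, the zero initial coefficients, and the random training sequence are all assumptions chosen for the example. Each counter moves by exactly one count per symbol, in the direction given by the product of the error polarity and the input-sample polarity.

```python
import random

N_TAPS = 8
CNT_MIN, CNT_MAX = -256, 255      # 9-bit up/down counter range
LSB = 1.0 / 256                   # assumed coefficient weight per count


def sign(v):
    """One-bit comparator decision: +1 for v >= 0, otherwise -1."""
    return 1 if v >= 0 else -1


def channel(symbols, h):
    """Hypothetical ISI channel: convolve the symbols with h."""
    return [sum(h[k] * symbols[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(symbols))]


def equalize(x, counters, n):
    """Equalizer output at time n for the coefficients held in the counters."""
    return sum(counters[k] * LSB * x[n - k]
               for k in range(N_TAPS) if n - k >= 0)


def train(x, desired, counters):
    """Sign-sign LMS: each counter steps by sign(error) * sign(input sample)."""
    for n in range(len(x)):
        err = desired[n] - equalize(x, counters, n)
        for k in range(N_TAPS):
            x_k = x[n - k] if n - k >= 0 else 0.0
            step = sign(err) * sign(x_k)
            counters[k] = max(CNT_MIN, min(CNT_MAX, counters[k] + step))
    return counters


def mse(x, desired, counters):
    """Mean squared error with the coefficients frozen."""
    errs = [desired[n] - equalize(x, counters, n) for n in range(len(x))]
    return sum(e * e for e in errs) / len(errs)


rng = random.Random(0)
h = [1.0, 0.5, 0.2]                                   # assumed channel, not CAT-5 data
training = [rng.choice((-1, 1)) for _ in range(5000)] # known training symbols
received = channel(training, h)

mse_before = mse(received, training, [0] * N_TAPS)
counters = train(received, training, [0] * N_TAPS)
mse_after = mse(received, training, counters)
```

Because only polarities enter the update, the hardware needs a comparator rather than a multiplier in the adaptation loop; the cost is a residual dither of about one count on each coefficient after convergence.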

III. BUILDING BLOCKS

The power dissipation of the equalizer is determined primarily by the SHAs, the multipliers, and the DACs. The speed-precision trade-offs make the design of these circuits particularly challenging.

A. SHA

Figure 5 illustrates the SHA topology. A differential sampling network drives a cascode differential pair. Proper choice of switch dimensions and capacitor values ensures less than 1% harmonic distortion due to charge injection. The differential pair is degenerated [4] and loaded by diode-connected devices to achieve a high linearity. The output masking technique described in Section II allows one clock cycle for settling, reducing the power consumed by the differential pair to only 0.6 mW from a 2.5-V supply.
The SHA output drives one port of a subsequent multiplier. Interestingly, capacitive paths in the multiplier couple the changes due to the rotation of coefficients to the output of the SHA (Fig. 6). Such coupling occurs in each clock cycle, corrupting the value sampled on the capacitors for eight cycles. The cascode devices used in the differential pair suppress this effect.

B. Multiplier

The multiplier must achieve sufficient linearity in both of its input ports, making it difficult to utilize a Gilbert cell with limited voltage headroom. The multiplier employed in this work is based on signal-dependent degeneration of differential pairs. Figure 7 illustrates the concept in simplified form. Operating in the triode region and controlled by the SHA output, transistors $M_5$ and $M_6$ degenerate differential pairs $M_1-M_2$ and $M_3-M_4$, respectively.

In order to further linearize the multiplier, the degeneration techniques of Figs. 5 and 7 are combined, yielding the circuit shown in Fig. 8. Now, the circuit exhibits a high linearity with respect to both of its input ports, a remarkable advantage over a Gilbert cell. Each multiplier consumes 1.5 mW from a 2.5-V supply.

C. Resistor Ladder DAC

The 9-bit DAC used to set the analog value of the coefficients must operate at the full rate while dissipating little power. Furthermore, since the equalizer requires nine such DACs, their complexity is of great concern. This work introduces a compact, low-power DAC that achieves fast settling. The technique is feasible here because the DAC needs to provide 9-bit monotonicity but not necessarily 9-bit integral linearity.

Depicted in Fig. 9, the DAC consists of a coarse section and a fine section. The former divides the full-scale voltage, $V_{FS}$, into 32 equal segments and selects two tap voltages, $V_P$ and $V_N$, according to the 1-of-n code $A_0-A_{31}$. The latter subdivides the difference between these voltages into 30 equal segments. Switches $S_0-S_{31}$ provide the intermediate voltages between the fine segments.

The fine resistor ladder slides up and down according to the MSBs and subdivides one segment of the coarse ladder. The sliding interpolation allows 9 bits of resolution with only 60 resistors and about 100 transistors. Also, the small number of MOS switches in the analog path provides fast settling.
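The monotonicity of a coarse/fine interpolating ladder can be illustrated with a simple numerical model. The sketch below is not the circuit of Fig. 9: it assumes an ideal, unloaded fine ladder, a split of 32 coarse segments and 16 fine steps (chosen only so that 32 × 16 = 512 codes, and possibly different from the segment counts of the actual design), and a hypothetical ±5% random resistor mismatch.

```python
import random


def ladder_taps(resistors, v_low, v_high):
    """Tap voltages of a series resistor string tied between v_low and v_high."""
    total = sum(resistors)
    taps, acc = [v_low], 0.0
    for r in resistors:
        acc += r
        taps.append(v_low + (v_high - v_low) * acc / total)
    return taps


def dac_output(code, coarse_r, fine_r, v_fs=1.0):
    """Two-step ladder DAC: the MSBs pick a coarse segment, the LSBs a fine tap."""
    msb, lsb = divmod(code, len(fine_r))
    coarse = ladder_taps(coarse_r, 0.0, v_fs)
    v_n, v_p = coarse[msb], coarse[msb + 1]   # ends of the selected segment
    fine = ladder_taps(fine_r, v_n, v_p)      # fine ladder "slides" onto it
    return fine[lsb]


rng = random.Random(1)
coarse_r = [1.0 + rng.uniform(-0.05, 0.05) for _ in range(32)]  # assumed mismatch
fine_r = [1.0 + rng.uniform(-0.05, 0.05) for _ in range(16)]

levels = [dac_output(code, coarse_r, fine_r) for code in range(512)]
is_monotonic = all(b > a for a, b in zip(levels, levels[1:]))
```

In this model every fine tap lies between the two selected coarse taps, so the transfer characteristic remains monotonic for any positive resistor values, even though the mismatch degrades the integral linearity.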

D. Adder and I/V Converter

In order to achieve fast settling, the output currents of the multipliers are summed at the sources of two common-gate devices (Fig. 10). A differential load consisting of the cross-coupled pair and the diode-connected devices is chosen to improve the linearity of current-to-voltage (I/V) conversion.
The I/V converter must drive both the output stage and the comparator that detects the sign for the LMS machine. Thus, two separate output branches are used to isolate the main output from the kickback noise of the comparator.

E. Clock Generator

The sampling phases $\phi_0-\phi_8$ in Fig. 4 must exhibit small mismatches and low jitter. While a delay-locked loop can produce these phases, supply and substrate noise may corrupt their timing significantly. Shown in Fig. 11, the multiphase clock generator is realized as a simple ring counter driven by the master clock. The circuit begins with a single ONE in the ring and rotates the ONE, producing a pulse on $\phi_0-\phi_8$ as the ONE is ANDed with a delayed version of the master clock, $\overline{CK} \cdot D$.
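The ring-counter principle can be sketched at the logic level as follows. This is a behavioral illustration only: the function name is ours, nine phases are assumed to match the nine SHAs of Fig. 4, and the gating waveform (low in the first half of each master-clock period, high in the second) is an assumed stand-in for the delayed clock of Fig. 11.

```python
def clock_phases(n_phases, n_cycles, gate=(0, 1)):
    """One-hot ring counter with gated outputs.

    Returns one row per half clock period; each row lists the logic level of
    every phase. `gate` gives the assumed gating level in the two halves of a
    master-clock period.
    """
    state = [1] + [0] * (n_phases - 1)        # single ONE circulating in the ring
    waveform = []
    for _ in range(n_cycles):
        for g in gate:
            waveform.append([bit & g for bit in state])  # ONE ANDed with the gate
        state = [state[-1]] + state[:-1]      # rotate by one stage per clock
    return waveform


wave = clock_phases(n_phases=9, n_cycles=18)
never_overlap = all(sum(row) <= 1 for row in wave)
pulses_per_phase = [sum(row[i] for row in wave) for i in range(9)]
```

Since all phases derive from the same gating edge and only the ring state selects which output pulses, the model produces non-overlapping pulses, with each phase recurring once every nine master-clock periods.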

IV. EXPERIMENTAL RESULTS

The mixed-signal equalizer has been fabricated in a digital 0.25-$\mu$m CMOS technology. Shown in Fig. 12 is a photograph of the die, whose active area measures 0.95 mm $\times$ 0.85 mm. The circuit has been tested in a chip-on-board assembly while running at 125 MHz from a 2.5-V power supply.

Figure 13 shows the measured impulse response of the equalizer on the top and the adaptation of one of the coefficients on the bottom. The equalizer takes approximately 16 $\mu$s to converge. Figure 14 depicts the equalizer output before and after adaptation for a 100-m CAT-5 twisted-pair cable. The eye diagram yields an estimated SNR of about 20 dB.

REFERENCES


