
The recent research work of my graduate students and me is summarized as follows:

1. High-level analysis and synthesis of analog and mixed-signal systems

First, in this project, I collaborated with others on the synthesis of reconfigurable Delta-Sigma modulators. We proposed a systematic methodology for designing optimized topologies for reconfigurable single-loop continuous-time Delta-Sigma modulators. The methodology has two steps. In the first step, the signal paths and parameters of the optimized reconfigurable topologies are found by solving a set of nonlinear programming problems. In the second step, the resulting topologies are post-optimized by fine-tuning their parameters to account for circuit-level non-idealities; this is done with a simulated annealing algorithm that estimates modulator performance by simulating detailed Simulink models of the circuits. We carried out a case study on the design of a three-mode reconfigurable Delta-Sigma modulator. The results show that a reconfigurable modulator designed with the proposed methodology is less complex than three traditional single-mode modulators. The estimated power saving of the optimized reconfigurable modulator is up to 20% compared with three single-mode designs. The reconfigurable topologies also require fewer reconfigurable cells than a recent state-of-the-art reconfigurable modulator design. In addition, the resulting topologies are more robust than other designs to circuit non-idealities such as circuit noise, integrator leakage, and nonlinearity.
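The post-optimization step can be pictured with a minimal simulated annealing sketch. This is only an illustration under stated assumptions: the cost function `toy_cost`, its target values, and all annealing settings are hypothetical stand-ins, whereas the actual flow estimates performance from detailed Simulink models.

```python
import math
import random


def simulated_annealing(cost, x0, step=0.05, t_start=1.0, t_end=1e-3,
                        cooling=0.95, moves_per_temp=50, seed=0):
    """Minimize cost(x) over a list of real parameters by simulated annealing."""
    rng = random.Random(seed)
    current = list(x0)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    temp = t_start
    while temp > t_end:
        for _ in range(moves_per_temp):
            # Perturb one randomly chosen parameter by a small relative amount.
            candidate = list(current)
            i = rng.randrange(len(candidate))
            candidate[i] *= 1.0 + rng.uniform(-step, step)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Always accept improvements; accept worse moves with a
            # probability that shrinks as the temperature drops.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = list(current), current_cost
        temp *= cooling  # geometric cooling schedule
    return best, best_cost


def toy_cost(coeffs):
    """Hypothetical stand-in for a simulation-based performance estimate."""
    target = [0.8, 1.5, 0.4]  # assumed values, for illustration only
    return sum((c - t) ** 2 for c, t in zip(coeffs, target))


start = [1.0, 1.0, 1.0]
tuned, tuned_cost = simulated_annealing(toy_cost, start)
print(tuned, tuned_cost)
```

In a real flow the call to `toy_cost` would be replaced by a (much slower) simulation run, which is why the number of moves per temperature and the cooling rate matter in practice.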

Also in this project, I worked on the statistical analysis of Delta-Sigma modulators. I proposed a symbolic method for fast system-level statistical analysis. Based on linear modeling of Delta-Sigma modulators, a systematic symbolic formulation of statistical performance variation is derived, so that capacitor variations are translated directly into performance variation. Because the symbolic formulation is derived from a generic modulator topology, it applies to any topology covered by that generic topology. The results show that the symbolic formulation provides fast and reasonably accurate estimates of statistical performance variation, especially for high-order modulators. The speed-up is up to a factor of 1,000 compared with traditional, time-consuming Monte Carlo simulations. Later, I also proposed a hierarchical approach for efficient circuit-level statistical analysis of continuous-time Delta-Sigma modulators.
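The contrast between a closed-form estimate and Monte Carlo sampling can be illustrated on a single coefficient. The sketch below is not the symbolic formulation described above; it is generic first-order variance propagation for a hypothetical integrator gain set by a capacitor ratio, with assumed capacitor values and an assumed 1% relative mismatch on each capacitor.

```python
import math
import random
import statistics


def gain_std_first_order(c_sample, c_integ, rel_sigma):
    """Closed-form std of the gain C_sample / C_integ.

    To first order, the relative gain error is the difference of two
    independent relative capacitor errors, so its variance is 2 * rel_sigma**2.
    """
    return (c_sample / c_integ) * math.sqrt(2.0) * rel_sigma


def gain_std_monte_carlo(c_sample, c_integ, rel_sigma, runs=20000, seed=0):
    """Estimate the same std by sampling both capacitors many times."""
    rng = random.Random(seed)
    gains = []
    for _ in range(runs):
        cs = c_sample * (1.0 + rng.gauss(0.0, rel_sigma))
        ci = c_integ * (1.0 + rng.gauss(0.0, rel_sigma))
        gains.append(cs / ci)
    return statistics.stdev(gains)


# Assumed example values: 1 pF sampling capacitor, 2 pF integrating capacitor.
analytic = gain_std_first_order(1e-12, 2e-12, 0.01)
sampled = gain_std_monte_carlo(1e-12, 2e-12, 0.01)
print(analytic, sampled)
```

The closed-form value needs one evaluation, while the sampled value needs thousands; the same trade-off, at the scale of a full modulator, is what motivates an analytical formulation.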

Building on these results, I then worked with two Master's students to synthesize Delta-Sigma modulators and analog filters for minimum statistical performance variation using mathematical programming techniques. Our results show that the synthesized modulators and filters outperform traditional design solutions. Later, we also proposed stochastically post-optimizing the synthesized designs to account for modeling noise, so that the resulting designs are more robust.

Further, in this project, I proposed a systematic design flow for optimal capacitance assignment in switched-capacitor biquad circuits. The design flow starts from a biquad template configuration defined on the basis of circuit regularity, then symbolically derives all the design constraints and cost functions, and finally formulates a nonlinear program that is solved for the optimal capacitance assignment. The main contribution of this work is that an optimal biquad configuration can be obtained automatically and accurately, whereas traditional approaches derive biquad configurations manually and analytically, which is inefficient. The design flow is also general across biquad circuits, and additional design constraints can be incorporated easily. Moreover, compared with previous works, we define optimal capacitance assignment in a broader sense that also considers circuit complexity and sensitivity. This makes it possible to trade off some performance metrics against others, depending on the application or the designer's concerns. The results demonstrate the flexibility and efficiency of the design flow in generating optimal biquad designs under various design tradeoffs, and these designs outperform traditional solutions.

2. Silicon implementation of a 4th-order Delta-Sigma modulator

In this project, I worked with a Master's student to custom-design a 4th-order Delta-Sigma modulator for WCDMA/UMTS in TSMC 0.25 μm CMOS technology. The main novelty of this work is that we explored the possibility of designing a low-power, low-complexity Delta-Sigma modulator by employing Gm-C integrators instead of traditional OpAmp-RC integrators. We analyzed and addressed various design challenges of Gm-C integrator based modulator design, such as the non-linearity of the first-stage Gm circuit and DAC feedback due to the high voltage swing of the first-stage integrator output. The designed modulator consumes only 3.5 mW with a reduced power supply of 1.8 V. It was manufactured by MOSIS with a die area of 0.13 mm².

The packaged chip is shown below:

We acknowledge the support of MOSIS (www.mosis.com) for manufacturing the design.

3. Reducing clock jitter effects in continuous-time Delta-Sigma modulators

In this funded research project, I worked with a Master's student to address clock jitter effects in continuous-time Delta-Sigma modulators. We proposed a simple and novel method that reduces clock jitter effects by using delay elements to generate a return-to-zero feedback with a fixed-width pulse for active feedback. The fixed-width pulse is not ideal, however; it is subject to random and deterministic jitter caused by internal device noise and external power supply noise. We have therefore recently investigated the width jitter of the pulse itself and found that it can be much smaller than that of the original clock. In this way, the pulse-width jitter associated with the clock in traditional designs can be significantly reduced. Simulation results show that, under the same noise conditions, the proposed method outperforms other reported methods in effectiveness of jitter noise reduction, ease of modulator synthesis, simplicity of circuit implementation, and power consumption overhead. We also found that the proposed method is quite tolerant of process-induced timing variations, which makes it practically useful.
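The reason a delay-defined pulse decouples the feedback charge from clock jitter can be seen in a small numerical sketch. Everything here is an assumption made for illustration: a rectangular feedback pulse, independent Gaussian edge jitter only (no deterministic jitter), and hypothetical values of 10 ps clock jitter, 1 ps delay-element jitter, a 10 ns clock period, and a 5 ns pulse width.

```python
import math
import random
import statistics


def charge_error_stds(sigma_clk, sigma_delay, period, pulse_width,
                      samples=20000, seed=1):
    """Std of the feedback charge error per cycle, for unit feedback current.

    clock-defined: the pulse is bounded by two consecutive jittered clock
    edges, so its width error is the difference of two edge jitters.
    delay-defined: the pulse starts on a clock edge and ends after a delay
    element, so its width error is the delay jitter alone. The shorter pulse
    needs a larger amplitude (period / pulse_width) to deliver the same
    charge, which scales its width error accordingly.
    """
    rng = random.Random(seed)
    amplitude_gain = period / pulse_width
    clock_defined, delay_defined = [], []
    prev_jitter = rng.gauss(0.0, sigma_clk)
    for _ in range(samples):
        next_jitter = rng.gauss(0.0, sigma_clk)
        clock_defined.append(next_jitter - prev_jitter)
        delay_defined.append(amplitude_gain * rng.gauss(0.0, sigma_delay))
        prev_jitter = next_jitter
    return statistics.pstdev(clock_defined), statistics.pstdev(delay_defined)


clock_std, delay_std = charge_error_stds(
    sigma_clk=10e-12, sigma_delay=1e-12, period=10e-9, pulse_width=5e-9)
print(clock_std, delay_std)
```

Under these assumptions the clock-defined error is about sqrt(2) times the clock jitter, while the delay-defined error depends only on the delay-element jitter and the amplitude scaling, so the benefit hinges on the delay element being quieter than the clock.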

4. Hardware implementation of speech recognition systems

In this project, we custom-designed a coprocessor for OPC (Output Probability Calculation), which is the most computation-intensive processing step in CHMM (Continuous Hidden Markov Model) based speech recognition algorithms. To speed up processing, we proposed a hardware architecture with multiple processing elements that allows OPC to be computed in parallel. The optimal number of processing elements is task dependent and was explored to achieve the best tradeoff among speech processing delay, energy consumption, and hardware resources. To save hardware resources and reduce power consumption, a polynomial addition based method is used to compute add-log instead of the traditional look-up table based method. The proposed coprocessor has been implemented and tested on both a Xilinx FPGA and IBM 0.13 μm CMOS technology. To implement an entire speech recognition system, a SAMSUNG S3C44b0X (containing an ARM7) is used as the micro-controller that executes the rest of the speech processing. Tested with a 358-state, 3-mixture, 27-feature, 800-word HMM, the S3C44b0X operates at 40 MHz and the coprocessor at 10 MHz to meet the real-time requirement, and the recognition accuracy is 95.2%. Experiments and analysis show that a speech recognition system based on the proposed coprocessor is especially suitable for mid-size vocabulary (100-1000 words) recognition tasks.
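The add-log operation combines two log-domain probabilities: log(e^a + e^b) = max(a, b) + log(1 + e^(-|a - b|)). The sketch below shows how the correction term can be approximated by a polynomial rather than a table. The specific choices are assumptions for illustration and not necessarily the coprocessor's method: a degree-5 polynomial interpolating at Chebyshev nodes on [0, 8], evaluated in nested (Horner-like) form, with the correction treated as zero beyond 8.

```python
import math

D_MAX = 8.0  # assumed cutoff: beyond this difference the correction is ~0
DEGREE = 5   # assumed polynomial degree


def correction_exact(d):
    """Exact correction term log(1 + exp(-d)) for d >= 0."""
    return math.log1p(math.exp(-d))


def chebyshev_nodes(lo, hi, count):
    """Chebyshev nodes of the first kind mapped onto [lo, hi]."""
    mid, half = (lo + hi) / 2.0, (hi - lo) / 2.0
    return [mid + half * math.cos((2 * k + 1) * math.pi / (2 * count))
            for k in range(count)]


def newton_coefficients(xs, ys):
    """Divided-difference coefficients of the interpolating polynomial."""
    coeffs = list(ys)
    for level in range(1, len(xs)):
        for i in range(len(xs) - 1, level - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - level])
    return coeffs


NODES = chebyshev_nodes(0.0, D_MAX, DEGREE + 1)
COEFFS = newton_coefficients(NODES, [correction_exact(x) for x in NODES])


def correction_poly(d):
    """Polynomial approximation of the correction, in nested Newton form."""
    if d >= D_MAX:
        return 0.0
    result = COEFFS[-1]
    for k in range(len(COEFFS) - 2, -1, -1):
        result = result * (d - NODES[k]) + COEFFS[k]
    return result


def log_add_exact(a, b):
    return max(a, b) + correction_exact(abs(a - b))


def log_add_poly(a, b):
    return max(a, b) + correction_poly(abs(a - b))


# Worst-case approximation error of the correction term on a fine grid.
max_error = max(abs(correction_poly(i * 0.01) - correction_exact(i * 0.01))
                for i in range(0, 1001))
print(max_error)
```

In hardware the coefficients would be fixed constants and the nested evaluation would map onto a short chain of multiply-add operations; the degree and cutoff trade accuracy against that cost.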

5. Hardware implementation of motion estimation algorithms

In this project, I worked with a Master's student on a low-power VLSI implementation of variable block size motion estimation (VBSME), which is used in the recent video coding standard H.264/AVC. VBSME accounts for 80% of the total computation load of H.264/AVC, so reducing the power consumption of VBSME is the most important step toward reducing the power of the overall video coding system. Our implementation employs a fast full-search block matching algorithm to reduce power consumption while preserving both the optimal solution and the throughput. This may be the first time that a fast full-search block matching algorithm has been explored to reduce power consumption in the context of VBSME, although such algorithms have already been used in fixed block size motion estimation (FBSME). The proposed design has been implemented and tested on a Xilinx FPGA. Experimental results show that, at a clock frequency of 180 MHz, the design meets the real-time requirement for HDTV 720p (1280×720) at 45 fps (frames per second). Compared with conventional VBSME designs that give optimal motion vector (MV) solutions, the proposed design reduces power consumption by 45%.
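One well-known way a full search can be made faster without losing the optimal motion vector is partial distortion elimination, where the SAD accumulation for a candidate stops as soon as it can no longer beat the best match so far. The sketch below illustrates that idea for a single fixed block size; it is an assumed example technique, not necessarily the algorithm used in the design, and the frame contents, block position, and search range are hypothetical.

```python
import random


def block_sad(cur, ref, bx, by, dx, dy, n, limit, counter):
    """Sum of absolute differences; gives up once the partial sum reaches limit."""
    total = 0
    for r in range(n):
        cur_row = cur[by + r]
        ref_row = ref[by + dy + r]
        for c in range(n):
            total += abs(cur_row[bx + c] - ref_row[bx + dx + c])
        counter[0] += 1  # processed rows, a rough proxy for work done
        if total >= limit:
            return None  # this candidate cannot beat the current best
    return total


def full_search(cur, ref, bx, by, n, search, early_exit):
    """Test every displacement in the window; optionally prune with early exit."""
    height, width = len(ref), len(ref[0])
    best_sad, best_mv = float("inf"), None
    counter = [0]
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + n <= height and
                      0 <= bx + dx and bx + dx + n <= width)
            if not inside:
                continue
            limit = best_sad if early_exit else float("inf")
            sad = block_sad(cur, ref, bx, by, dx, dy, n, limit, counter)
            if sad is not None and sad < best_sad:
                best_sad, best_mv = sad, (dx, dy)
    return best_mv, best_sad, counter[0]


# Synthetic frames: the current frame is the reference shifted by (2, 1).
rng = random.Random(0)
H = W = 32
ref = [[rng.randrange(256) for _ in range(W)] for _ in range(H)]
cur = [[ref[(y + 1) % H][(x + 2) % W] for x in range(W)] for y in range(H)]

exhaustive = full_search(cur, ref, 12, 12, 8, 4, early_exit=False)
pruned = full_search(cur, ref, 12, 12, 8, 4, early_exit=True)
print(exhaustive, pruned)
```

Both calls visit the same candidates in the same order, so they return the same motion vector and SAD; only the amount of arithmetic differs, which is where a power saving would come from.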

6. Design of a hardware-based video system for real-time vehicle detection

In this project, we are designing a video-based system for real-time vehicle detection. To allow real-time operation, we target a customized hardware implementation of the system instead of a traditional PC-based detection system. Because the system integrates diverse devices, the Video System Design Kit from Xilinx is currently used. The system is intended to facilitate vehicle merging at merging ramp entrances.

The diagram of the planned system is shown below:

[Figure: diagram of the planned system, Fig1a.eps]

7. Development of a traffic performance measurement system for roundabouts

Traffic data collection is essential for performance assessment, safety improvement, and road planning. While relatively mature data acquisition technologies are available for highways, automatic data collection at roundabouts presents unique challenges because of more complex traffic scenes, data specifications, and vehicle behavior. In this project, we propose a tracking-based automated traffic data collection system dedicated to roundabouts. The proposed system has four main processing steps: camera calibration, vehicle segmentation, vehicle tracking, and data mining. The resulting trajectory of each individual vehicle gives the position, size, shape, and speed of the vehicle at each time instant. The types of traffic data that can be measured with the proposed system include origin/destination volume, waiting time at each entrance approach, and the size of gaps used by merging vehicles. The overall traffic data collection process has been implemented in a regular PC environment, and a software tool has been developed. The total processing time for a 2-hour video is currently 4 hours. The extracted traffic data have been compared with manual measurements, and an accuracy of more than 90% has been achieved.
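As a rough illustration of the data mining step, the sketch below derives origin/destination counts and a mean speed from vehicle trajectories. The data layout and geometry are assumptions: each trajectory is a list of hypothetical (time, x, y) samples in calibrated ground coordinates with y pointing north, the roundabout has four legs, and the approach is assigned from the angle of a point around the roundabout center.

```python
import math
from collections import Counter

APPROACHES = ["E", "N", "W", "S"]  # assumed four-leg roundabout


def approach_of(x, y, center):
    """Assign a point to the nearest of four approaches by its angle."""
    angle = math.degrees(math.atan2(y - center[1], x - center[0])) % 360.0
    return APPROACHES[int(((angle + 45.0) % 360.0) // 90.0)]


def od_counts(trajectories, center):
    """Count vehicles per (origin approach, destination approach) pair."""
    counts = Counter()
    for traj in trajectories:
        _, x0, y0 = traj[0]
        _, x1, y1 = traj[-1]
        counts[(approach_of(x0, y0, center), approach_of(x1, y1, center))] += 1
    return counts


def mean_speed(traj):
    """Path length divided by elapsed time, in coordinate units per second."""
    length = sum(math.hypot(b[1] - a[1], b[2] - a[2])
                 for a, b in zip(traj, traj[1:]))
    return length / (traj[-1][0] - traj[0][0])


# Hypothetical trajectories as (time_s, x_m, y_m) samples.
trajectories = [
    [(0.0, 20.0, 0.0), (1.0, 10.0, 5.0), (2.0, 0.0, 20.0)],   # east to north
    [(0.0, 0.0, -20.0), (2.0, 0.0, 20.0)],                    # south to north
    [(5.0, 22.0, 1.0), (6.5, 8.0, 8.0), (8.0, -1.0, 21.0)],   # east to north
]
counts = od_counts(trajectories, center=(0.0, 0.0))
print(counts, mean_speed(trajectories[1]))
```

Waiting times and gap sizes would need additional inputs, such as entrance stop-line positions, that are not modeled in this sketch.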